We provide both monolingual and bilingual distributed word representations (i.e., embeddings) learned from Wikipedia (2013). We also aim to provide embeddings built from other types of data (e.g., news, subtitles, speech-to-text transcripts) using various approaches, released in Python shelve format.
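Because the embeddings are distributed as Python shelve files, they can be opened directly with the standard library. A minimal sketch follows; the filename `embeddings.db` and the key/value layout (word string mapped to a list of floats) are illustrative assumptions, not the actual distribution's schema — check the downloaded file for the real key format.

```python
import shelve

# Hypothetical file and key names, used only to make this demo self-contained.
# In the real files, keys are expected to be words and values their vectors.
with shelve.open("embeddings.db") as db:
    db["hello"] = [0.1, 0.2, 0.3]  # write step shown only for the demo

# Reading a vector back from the shelve file.
with shelve.open("embeddings.db") as db:
    vec = db.get("hello")          # returns None if the word is absent
    missing = db.get("no-such-word")
```

`shelve.open` behaves like a persistent dictionary, so lookups, `in` tests, and iteration over keys all work as they would on a normal `dict`.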


You can download the bilingual embeddings from the following links:


You can download the monolingual embeddings from the following links:


  • Gouws, Stephan, Yoshua Bengio, and Greg Corrado. "BilBOWA: Fast Bilingual Distributed Representations without Word Alignments." arXiv preprint arXiv:1410.2455 (2014).
  • Shazeer, Noam, Ryan Doherty, Colin Evans, and Chris Waterson. "Swivel: Improving Embeddings by Noticing What's Missing." arXiv preprint arXiv:1602.02215 (2016).
  • Le, Quoc V., and Tomas Mikolov. "Distributed Representations of Sentences and Documents." arXiv preprint arXiv:1405.4053 (2014).
  • Pennington, Jeffrey, Richard Socher, and Christopher D. Manning. "GloVe: Global Vectors for Word Representation." In EMNLP, vol. 14, pp. 1532-1543. 2014.


Aditya Mogadala

For any queries, please email: aditya DOT mogadala AT kit DOT edu

Copyright © 2016 Aditya Mogadala. All Rights Reserved.
