ISSN 2394-5125
 

Research Article 


A ROBUST UNSUPERVISED WORD BY WORD TRANSLATION FOR MORPHOLOGICAL RICH LANGUAGES USING DIFFERENT RETRIEVAL TECHNIQUES

Shweta Chauhan, Umesh Pant, Mustafa, Philemon Daniel.

Abstract
Word translation, or the incorporation of bilingual dictionaries, is an important capability that
affects many multilingual language processing tasks. In recent years, cross-lingual word embeddings
have received considerable attention. It has recently been shown that these embeddings can be
learned by aligning two disjoint monolingual vector spaces via linear transformations, using no
more supervision than a small bilingual dictionary.
In this paper, we generate the best-performing cross-lingual word embeddings with English as
the source language and Hindi, Punjabi, Telugu and Tamil as target languages, and vice versa. No
document-aligned or sentence-aligned corpus and no bilingual dictionary is used.
We follow the intralingual similarity distribution assumption: for the most common words, the
distribution graph is similar across a language pair and the embeddings are isometric.
Four retrieval methods (nearest neighbor, inverted nearest neighbor, inverted softmax, and
cross-lingual word scaling) are applied and compared on the bilingual embeddings of each language
pair, which are trained with fully unsupervised learning techniques. The bilingual word
embeddings are tested on generated English-Hindi, English-Punjabi, English-Telugu and
English-Tamil dictionaries.
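
To make the comparison of retrieval methods concrete, the following is a minimal sketch of the four retrieval rules applied to a matrix of cosine similarities between already-aligned source and target embeddings. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy similarity matrix, the inverse temperature beta = 10 and the neighbourhood size k = 2 are all hypothetical, and the fourth method is sketched as cross-domain similarity local scaling (CSLS) on the assumption that this is the scaling method the abstract refers to. The inverted nearest neighbor rule shown is one common formulation (rank of the source word among the neighbours of each target word, ties broken by similarity).

```python
import math

def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda j: values[j])

def nearest_neighbor(sim):
    """For each source word, pick the target with the highest similarity."""
    return [argmax(row) for row in sim]

def inverted_nearest_neighbor(sim):
    """Pick the target for which the source word ranks best among all sources."""
    n_src, n_tgt = len(sim), len(sim[0])
    # rank[i][j]: position of source i when target j sorts all sources (0 = closest)
    rank = [[0] * n_tgt for _ in range(n_src)]
    for j in range(n_tgt):
        order = sorted(range(n_src), key=lambda i: -sim[i][j])
        for position, i in enumerate(order):
            rank[i][j] = position
    # Lowest rank wins; ties are broken by the higher similarity.
    return [min(range(n_tgt), key=lambda j: (rank[i][j], -sim[i][j]))
            for i in range(n_src)]

def inverted_softmax(sim, beta=10.0):
    """Softmax normalised over source words for each target (beta is assumed)."""
    n_src, n_tgt = len(sim), len(sim[0])
    col_sum = [sum(math.exp(beta * sim[i][j]) for i in range(n_src))
               for j in range(n_tgt)]
    prob = [[math.exp(beta * sim[i][j]) / col_sum[j] for j in range(n_tgt)]
            for i in range(n_src)]
    return [argmax(row) for row in prob]

def csls(sim, k=2):
    """CSLS score: 2*cos(x, y) minus the mean similarity of each word's k nearest neighbours."""
    n_src, n_tgt = len(sim), len(sim[0])

    def mean_top_k(values):
        top = sorted(values, reverse=True)[:k]
        return sum(top) / len(top)

    r_src = [mean_top_k(row) for row in sim]
    r_tgt = [mean_top_k([sim[i][j] for i in range(n_src)]) for j in range(n_tgt)]
    scores = [[2 * sim[i][j] - r_src[i] - r_tgt[j] for j in range(n_tgt)]
              for i in range(n_src)]
    return [argmax(row) for row in scores]

# Hypothetical toy data: rows are source words, columns are target words.
# Target 0 is a "hub" that is close to every source word.
toy_sim = [
    [0.9, 0.5, 0.1],
    [0.8, 0.7, 0.2],
    [0.7, 0.2, 0.6],
]

print(nearest_neighbor(toy_sim))           # [0, 0, 0]
print(inverted_nearest_neighbor(toy_sim))  # [0, 1, 2]
print(inverted_softmax(toy_sim))           # [0, 1, 2]
print(csls(toy_sim))                       # [0, 1, 2]
```

In this toy case plain nearest neighbor maps every source word to the hub, while the other three rules penalise the hub and recover a one-to-one mapping. Real embedding spaces will not separate this cleanly, and the relative ranking of the methods is an empirical question.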

Keywords: Cross-Lingual Word Embedding, Retrieval Techniques, Unsupervised Word Translation, Word Embedding


 


How to Cite this Article

Shweta Chauhan, Umesh Pant, Mustafa, Philemon Daniel. A ROBUST UNSUPERVISED WORD BY WORD TRANSLATION FOR MORPHOLOGICAL RICH LANGUAGES USING DIFFERENT RETRIEVAL TECHNIQUES. Journal of Critical Reviews (JCR). 2020; 7(17): 2677-2684. doi:10.31838/jcr.07.17.333. http://www.jcreview.com/?mno=100579