India is a land of 122 languages and numerous dialects. The lack of adequate lexical resources for Indian languages is a widespread problem that hampers the development of NLP tools for these languages. Recent efforts such as the Indo WordNet project have contributed significantly to addressing the scarcity of lexicons, but their progress and coverage remain a matter of dispute. Bottlenecks such as cost, time, and the shortage of skilled lexicographers slow progress further. In this article, the authors propose a technique that automates the generation of lexical entries using a machine learning approach, which noticeably speeds up the generation of lexicons such as WordNet. The reluctance to adopt an automated approach is largely attributed to a lack of accuracy, the inability to capture the regional character of a language, incorrect back-translation, and similar issues. To overcome these issues, the authors use Wikipedia to validate the synsets.