This research presents the design, development and experimental evaluation of a longest-match based stemmer for Wolaita texts. The objective of this paper is to conflate the variants of Wolaita words into their stems with better accuracy, using a longest-match based approach. To compile the possible combinations of suffixes, a deep analysis of Wolaita word morphology was carried out. The C# programming language was used for data preprocessing and implementation. After preprocessing, 12,789 unique words were retained for the experiment. Out of these unique words, 1,200 were randomly selected beforehand and kept separate for testing. The developed stemmer was then tested using Paice's actual error counting method. On that test dataset, the output showed 91.84% accuracy against manually stemmed words. The result indicates that the rule-based longest-match approach is promising for stemming Wolaita texts.
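The longest-match idea described in the abstract above can be illustrated with a minimal sketch. It is written in Python for brevity rather than the C# used in the study, and it is not the authors' implementation. The suffix list, the `MIN_STEM_LEN` threshold, the function names and the test words are all hypothetical placeholders; they are not the suffix combinations compiled in the paper and are not claimed to be valid Wolaita morphology. The `accuracy` helper is a plain exact-match ratio against manually stemmed words, which is an assumption about how a score of this kind might be computed, not a reproduction of Paice's method.

```python
# Hypothetical suffix list, for illustration only.
# These strings are placeholders, not the paper's Wolaita suffix inventory.
SUFFIXES = ["idosona", "osona", "ppe", "ta", "na", "a"]

# Assumed minimum number of characters that must remain after stripping.
MIN_STEM_LEN = 2


def stem(word, suffixes=SUFFIXES, min_stem_len=MIN_STEM_LEN):
    """Strip the longest matching suffix, or return the word unchanged."""
    word = word.lower()
    # Try longer suffixes first so the longest match always wins.
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


def accuracy(pairs, stemmer=stem):
    """Fraction of (word, manual_stem) pairs the stemmer reproduces exactly."""
    correct = sum(1 for word, gold in pairs if stemmer(word) == gold)
    return correct / len(pairs)
```

With these placeholder suffixes, `stem("abcidosona")` strips `idosona` rather than the shorter `osona` or `a`, which is the behaviour that distinguishes longest-match stemming from first-match stripping. The minimum stem length guards against reducing very short words to nothing.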
This article reviews Natural Language Processing (NLP) and its challenges for the Omotic language groups. Many technological achievements are partially fuelled by recent developments in NLP. NLP is a component of artificial intelligence (AI) and enables companies to analyze their business data. However, many challenges hinder the effectiveness of NLP applications for the Omotic language groups (Ometo) of Ethiopia. These challenges are the irregularity of words, the difficulty of identifying stop words, compounding, and the languages' limited digital data resources. This study therefore opens the way for future researchers to further investigate NLP applications for these language groups.