Sindhi is now widely used on the internet for newspapers, literature, books, educational and official websites, social network communication, and teaching and learning. Despite advances in computer technology, users still face difficulties in writing Sindhi script. This study identifies the issues and challenges that arise in Romanized Sindhi text produced by Roman transliteration of Sindhi text (ST). The identified issues include noise, the writing style of the Romanized script, spacing in the Romanized script, characters that are not suitable in Romanized Sindhi, paragraph and row layout, character issues, punctuation, row breaks and font style. The study summarizes these issues and challenges and gives a detailed account of those that people face when chatting in Romanized Sindhi text.
Sindhi is a historical language used around the world, especially in the province of Sindh, Pakistan. It has its own script, which is written from right to left. The use of Sindhi on different platforms is increasing, especially for communication. Most people in Sindh province read, write and speak the language well, but they face problems in text communication on these platforms: computer and mobile phone users find it difficult to type text messages, tweets and comments in Sindhi script. Natural Language Processing (NLP) is one promising option for addressing these problems. As a practical solution, Romanized Sindhi text is used instead of Sindhi text. Romanized text is easier to write because Sindhi script requires a special keyboard, whereas Romanized text does not. This paper defines rules for writing Romanized Sindhi text that make the text easier to write and to understand. The Romanized Sindhi Rules (RSR) are simple, make the meaning of the text easy to understand, and support fast text communication. The study is also intended to support further research on Romanized Sindhi text using different approaches.
Sindhi is one of the most ancient languages in the world, with its own written and spoken forms. A review of the literature found that, although much research has been done on other languages, word-by-word labelling of Sindhi had not yet been done. In this study, word labelling was performed on 100 sentences of Romanized Sindhi text using an online Python tool. The dataset was collected from Sindhi newspapers, blogs and social media webpages. A rule-based model was applied to this dataset for Parts-of-Speech (POS) tagging of the Romanized Sindhi sentences. A total of 624 words of Romanized Sindhi text were tested and successfully tagged by the SindhiNLP tool: 482 words were tagged as nouns and pronouns, 92 as verbs and 50 as determiners.
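The reported tag counts can be checked against the stated total and expressed as shares of the tagged words. The snippet below is only a small arithmetic sketch: the dictionary keys and the `tag_shares` helper are illustrative names chosen here, not part of the SindhiNLP tool, and the numbers are the ones quoted in the abstract above.

```python
# Tag counts as quoted in the abstract above.
# The key names are illustrative labels, not the tool's actual tag set.
tag_counts = {
    "noun_or_pronoun": 482,
    "verb": 92,
    "third_category": 50,  # the remaining category reported in the abstract
}
reported_total = 624


def tag_shares(counts):
    """Return each tag's share of all tagged words, as a percentage."""
    total = sum(counts.values())
    return {tag: round(100 * n / total, 1) for tag, n in counts.items()}


total = sum(tag_counts.values())
shares = tag_shares(tag_counts)

# The three categories should account for every tagged word.
print("counts match reported total:", total == reported_total)
for tag, pct in shares.items():
    print(f"{tag}: {pct}%")
```

On these figures the three categories sum to the reported 624 words, with nouns and pronouns making up roughly three quarters of the tagged words.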