2022
DOI: 10.14569/ijacsa.2022.0130831

Word by Word Labelling of Romanized Sindhi Text by using Online Python Tool

Abstract: Sindhi is one of the most ancient languages in the world, with its own written and spoken forms. A review of the literature found that, although considerable research has been done on other languages, word-by-word labelling of Sindhi had not yet been attempted. In this study, word labelling was performed on 100 sentences of Romanized Sindhi text using an online Python tool. The dataset was collected from several sources, including Sindhi newspapers, blogs, and social media webpages. From th…

Cited by 3 publications (4 citation statements)
References 12 publications

“…Figures 3 and 4 summarize the settings and the training steps of the language models. Machine Learning classifiers deployed, trained, and tested in this work were Logistic regression [30], Support Vector Machine (SVM) [31], K-nearest neighbors (KNN) [32], Decision Tree [33], Stochastic Gradient Descent (SGD) [34], and Multinomial Naive Bayes [35]. In the ensemble learning category, several models were applied to do the same task such as Voting Classifiers [36], Random Forest [37], Bagging Meta-Estimator [38], AdaBoost [39], XGBoost [40], Gradient Boosting [41], and Light Gradient Boosting Machine (LightGBM) [42].…”
Section: The Proposed Approach (mentioning)
confidence: 99%
“…Sentiment analysis of RST has been done on the online Python tool for 100 sentences. But during, before, and after performing the task of sentiment analysis on RST, faced issues with the completion of this task [18,19]. While performing the task of sentiment analysis on RST, positive sentences were not identified by the tool (Python), but after the characters of the Romanized text were changed, and then the results came.…”
Section: Issues Of Sentiment Analysis Of Romanized Sindhi Text (mentioning)
confidence: 99%
“…Sindhi is a complex language with a rich morphology that allows for word borrowing and lending (9,10). It has a high rate of ambiguity due to similar patterns and vowel deletions.…”
Section: Sindhi Morphology (mentioning)
confidence: 99%
“…Although they are recognized as part of great languages, the variety of affixes in Sindhi makes it more complex. The significant variety in Sindhi's morphology caused by different prefix, suffix, and stem placements in words makes it difficult to computerize (10).…”
Section: Sindhi Morphology (mentioning)
confidence: 99%