2021
DOI: 10.48550/arxiv.2101.04899
Preprint

Experimental Evaluation of Deep Learning models for Marathi Text Classification

Atharva Kulkarni,
Meet Mandhane,
Manali Likhitkar
et al.

Abstract: The Marathi language is one of the prominent languages used in India. It is predominantly spoken by the people of Maharashtra. Over the past decade, the usage of the language on online platforms has increased tremendously. However, research on Natural Language Processing (NLP) approaches for Marathi text has not received much attention. Marathi is a morphologically rich language and uses a variant of the Devanagari script in the written form. This work aims to provide a comprehensive overview of available resourc…

Cited by 5 publications (6 citation statements)
References 13 publications
“…In this work, we provide a comparative view of different families of algorithms on a range of datasets. Similar comparison of deep learning approaches on different datasets and languages have been studied in [14,13,9,32,10,17].…”
Section: Introduction (mentioning)
confidence: 88%
“…We are using common deep learning text classification approaches for the task of Hate speech detection [19]. The models are used directly for binary classification tasks whereas a hierarchical approach is used for multi-labeled fine-grained classification.…”
Section: Model Architectures (mentioning)
confidence: 99%
“…For conducting baseline experiments on our dataset, hashtags, mentions, and special symbols were removed during preprocessing. We used some of the widely used text classification architectures for sentiment classification (Kulkarni et al, 2021; Kowsari et al, 2019; Kim, 2014; Sun et al, 2019). The text is tokenized as words or sub-words and passed to the algorithms mentioned • CNN: The initial embedding layer outputs word embeddings of size 300.…”
Section: Experimentations (mentioning)
confidence: 99%
“…Marathi is an Indian language spoken by around 83 million people and ranks as the third most spoken language in India. But surprisingly, there is no significant work or resource for the task of sentiment analysis in Marathi (Kulkarni et al, 2021). A sentiment analysis dataset curated by IIT-Bombay is available, but it has a very small size consisting of only 150 samples (Balamurali et al, 2012).…”
Section: Introduction (mentioning)
confidence: 99%