Proceedings of the Sixth Workshop on Noisy User-Generated Text (W-Nut 2020) 2020
DOI: 10.18653/v1/2020.wnut-1.24
|View full text |Cite
|
Sign up to set email alerts
|

Annotation Efficient Language Identification from Weak Labels

Abstract: India is home to several languages with more than 30m speakers. These languages exhibit significant presence on social media platforms. However, several of these widely-used languages are under-addressed by current Natural Language Processing (NLP) models and resources. User generated social media content in these languages is also typically authored in the Roman script as opposed to the traditional native script further contributing to resource scarcity. In this paper, we leverage a minimally supervised NLP t… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...

Citation Types

0
0
0

Year Published

2021
2021
2023
2023

Publication Types

Select...
3

Relationship

0
3

Authors

Journals

citations
Cited by 3 publications
references
References 23 publications
(20 reference statements)
0
0
0
Order By: Relevance