2023
DOI: 10.48550/arxiv.2301.09685
Preprint

Noisy Parallel Data Alignment

Abstract: An ongoing challenge in current natural language processing is how its major advancements tend to disproportionately favor resource-rich languages, leaving a significant number of under-resourced languages behind. Due to the lack of resources required to train and evaluate models, most modern language technologies are either nonexistent or unreliable to process endangered, local, and nonstandardized languages. Optical character recognition (OCR) is often used to convert endangered language documents into machi…

Cited by 0 publications
References 24 publications