2021
DOI: 10.48550/arxiv.2101.00204
Preprint

BanglaBERT: Language Model Pretraining and Benchmarks for Low-Resource Language Understanding Evaluation in Bangla

Cited by 7 publications (16 citation statements); references 0 publications.
“…Our experiment results can serve as benchmarks for future work. We hope the present study will encourage researchers to make use of said models for various tasks in Bangla NLP (as indicated in [28]), and serve as a stepping stone for future endeavors that will contribute to enriching BNLP research.…”
Section: Discussion (mentioning)
Confidence: 94%
“…The dataset has 1,313 parallel sentences, in which English sentences were collected from the Penn Treebank corpus. (7) Global Voices: The Global Voices corpus consists of the translations of spoken languages.…”
Section: Machine (mentioning)
Confidence: 99%
“…• We have built an annotation management system from scratch to annotate Bangla SA data. We have made both the annotation management system and the SentiGOLD dataset publicly available upon request. • To establish a benchmark, we have investigated different architectures and training methodologies on this dataset and achieved 0.62 macro F1 for 5 classes with BanglaBERT [4].…”
Section: Introduction (mentioning)
Confidence: 99%
“…We have made both the annotation management system and the SentiGOLD dataset publicly available upon request. • To establish a benchmark, we have investigated different architectures and training methodologies on this dataset and achieved 0.62 macro F1 for 5 classes with BanglaBERT [4]. • We employ cross-dataset testing to showcase the generalization capability of the proposed dataset.…”
Section: Introduction (mentioning)
Confidence: 99%
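The benchmark quoted above (0.62 macro F1 over 5 sentiment classes with BanglaBERT) corresponds to a standard fine-tune-and-evaluate loop. The sketch below shows one plausible version of that setup; the Hugging Face model ID csebuetnlp/banglabert, the toy example rows, and the training hyperparameters are illustrative assumptions, not the SentiGOLD authors' exact configuration (the dataset itself is stated to be available only upon request).

# Minimal sketch (assumptions flagged in comments): fine-tune a BanglaBERT
# checkpoint for 5-class sentiment classification and report macro F1,
# mirroring the kind of benchmark quoted in the excerpt above.
import numpy as np
from datasets import Dataset
from sklearn.metrics import f1_score
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_ID = "csebuetnlp/banglabert"   # assumed Hugging Face ID for BanglaBERT
NUM_LABELS = 5                       # five sentiment classes, as in the excerpt

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_ID, num_labels=NUM_LABELS)

def tokenize(batch):
    # Truncate Bangla sentences to a fixed length so they batch cleanly.
    return tokenizer(batch["text"], truncation=True, max_length=128)

# Placeholder rows; a real run would load the request-only SentiGOLD splits.
train_ds = Dataset.from_dict(
    {"text": ["চমৎকার সেবা পেয়েছি", "খুবই খারাপ অভিজ্ঞতা"], "label": [4, 0]}
).map(tokenize, batched=True)
eval_ds = train_ds  # illustrative only; use a held-out split in practice

def compute_metrics(eval_pred):
    # Macro F1 averages per-class F1 scores, so minority classes count equally.
    preds = np.argmax(eval_pred.predictions, axis=-1)
    return {"macro_f1": f1_score(eval_pred.label_ids, preds, average="macro")}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="banglabert-sentigold",
                           num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
trainer.train()
print(trainer.evaluate())   # includes eval_macro_f1 among the reported metrics

The cross-dataset testing mentioned in the excerpt would amount to swapping eval_ds for a split drawn from a different Bangla sentiment corpus while keeping the fine-tuned model fixed.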