2019
DOI: 10.1145/3325885

Towards Burmese (Myanmar) Morphological Analysis

Abstract: This article presents a comprehensive study on two primary tasks in Burmese (Myanmar) morphological analysis: tokenization and part-of-speech (POS) tagging. Twenty thousand Burmese sentences of newswire are annotated with two-layer tokenization and POS-tagging information, as one component of the Asian Language Treebank Project. The annotated corpus has been released under a CC BY-NC-SA license, and it is the largest open-access database of annotated Burmese when this manuscript was prepared in 2017. Detailed …

Citations: cited by 20 publications (4 citation statements)
References: 26 publications
“…After the voice has split, they do not convert it to text. For creating strong acoustic models for speech recognition, accurate phonetic transcriptions are required [68,69]. After increasing the expected voice, IVSE converts the voice into text to guarantee that the converted text matches the original voice's text.…”
Section: Reason For Choosing Lightgbm
Citation type: mentioning
confidence: 99%
“…Morphologically, it is analytic language without the inflection of morphemes. Syntactically, it is usually the head-final language that the functional morphemes follow content morphemes, and the verb always becomes at the end of a sentence [17]. The sentences are delimited by a sentence boundary marker, but phrases and words are rarely delimited with spaces.…”
Section: Myanmar Language
Citation type: mentioning
confidence: 99%
“…We exploited RNN pre-ordering approach with lexicalized (Lex-RNN) and unlexicalized (Unlex-RNN) features. It took about 17 [22], was used for comparison. Hyper-parameters are set is being as: the matching features is the maximum value of 10, the window size is 3, and the maximum waiting time is set to 30 minutes.…”
Section: Training the Rnn Pre-ordering Model
Citation type: mentioning
confidence: 99%
“…Once the speech is separated the voice is not converted into text. For building robust acoustic models for speech recognition [68,69], accurate phonetic transcriptions are important. VoSE after enhancing the predicted voice converts the speech to text to make sure that the converted text matches the original speech's text.…”
Section: Why Lightgbm?
Citation type: mentioning
confidence: 99%