Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, Volume 2: Short Pa 2014
DOI: 10.3115/v1/e14-4019

Chinese Native Language Identification

Abstract: We present the first application of Native Language Identification (NLI) to non-English data. Motivated by theories of language transfer, NLI is the task of identifying a writer's native language (L1) based on their writings in a second language (the L2). An NLI system was applied to Chinese learner texts using topic-independent syntactic models to assess their accuracy. We find that models using part-of-speech tags, context-free grammar production rules and function words are highly effective, achieving a maxim…

Cited by 32 publications (30 citation statements)
References: 7 publications
“…This would enable them to not just provide new evidence for previous findings, but to also perform semi-automated data-driven generation of new and viable hypotheses. This, in turn, can help reduce expert effort and involvement in the process, particularly as such studies expand to more corpora and emerging language like Chinese (Malmasi and Dras, 2014b) and Arabic (Malmasi and Dras, 2014a).…”
Section: Discussion (mentioning)
confidence: 99%
“…Analyzing the language produced by learners could provide insight into the limitations of learners' vocabulary. Learner corpora, widely used in the task of Native Language Identification (Malmasi and Dras, 2014; Malmasi and Dras, 2015b) could be useful here.…”
Section: Results (mentioning)
confidence: 99%
“…This can provide useful information in forensic linguistic tasks (Estival et al, 2007) or could be used in an educational setting to provide contrastive feedback to language learners. Most research has focused on identifying the native language of English language learners, though there have been some efforts recently to identify the native language of writing in other languages (Malmasi and Dras, 2014).…”
Section: Native Language Identification (mentioning)
confidence: 99%