Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2018
DOI: 10.18653/v1/p18-1131

Twitter Universal Dependency Parsing for African-American and Mainstream American English

Abstract: Due to the presence of both Twitter-specific conventions and non-standard and dialectal language, Twitter presents a significant parsing challenge to current dependency parsing tools. We broaden English dependency parsing to handle social media English, particularly social media African-American English (AAE), by developing and annotating a new dataset of 500 tweets, 250 of which are in AAE, within the Universal Dependencies 2.0 framework. We describe our standards for handling Twitter- and AAE-specific features…
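For readers unfamiliar with the annotation format, the sketch below shows what a Universal Dependencies 2.0 (CoNLL-U) record looks like and how it can be read in Python. The example sentence and its analysis are hypothetical and follow standard UD conventions (the copula attaching to a nominal predicate, an interjection attached as discourse); they are not drawn from the paper's dataset and do not reflect its Twitter- or AAE-specific annotation decisions.

# Minimal sketch: one tweet-like sentence in CoNLL-U format and a small
# reader for it. The sentence and analysis are illustrative only.
CONLLU_RECORD = """\
# text = that movie was fire lol
1\tthat\tthat\tDET\t_\t_\t2\tdet\t_\t_
2\tmovie\tmovie\tNOUN\t_\t_\t4\tnsubj\t_\t_
3\twas\tbe\tAUX\t_\t_\t4\tcop\t_\t_
4\tfire\tfire\tNOUN\t_\t_\t0\troot\t_\t_
5\tlol\tlol\tINTJ\t_\t_\t4\tdiscourse\t_\t_
"""

def read_conllu(block):
    # Yield (id, form, head, deprel) tuples from one CoNLL-U sentence block,
    # skipping comment and blank lines. HEAD is column 7, DEPREL column 8.
    for line in block.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        yield int(cols[0]), cols[1], int(cols[6]), cols[7]

for tok_id, form, head, deprel in read_conllu(CONLLU_RECORD):
    print(f"{tok_id}\t{form}\thead={head}\tdeprel={deprel}")

Running this prints each token with its head index and dependency relation, which is the information the paper's annotations provide for every tweet token.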

Cited by 27 publications (22 citation statements) | References 17 publications
“…An equally important consideration, in addition to whom the data describes, is who authored the data. For example, Blodgett et al. (2018) show that parsing systems trained on White Mainstream American English perform poorly on African American English (AAE). In a more general example, Wikipedia has become a popular data source for many NLP tasks.…”
Section: NLP Systems Encode Racial Bias (mentioning)
confidence: 99%
“…• Annotation schema: Returning to Blodgett et al. (2018), this work defines new parsing standards for formalisms common in AAE, demonstrating how parsing labels themselves were not designed for racialized language varieties. • Annotation instructions: Sap et al. (2019) show that annotators are less likely to label tweets using AAE as offensive if they are told the likely language varieties of the tweets.…”
Section: NLP Systems Encode Racial Bias (mentioning)
confidence: 99%
“…Table 2 expresses 22 metrics from the literature as instances of our generalized metrics from Section 3. The presented metrics span a number of NLP tasks, including text classification (Dixon et al., 2018; Garg et al., 2019; Borkan et al., 2019; Prabhakaran et al., 2019), relation extraction (Gaut et al., 2020), text generation (Huang et al., 2020a) and dependency parsing (Blodgett et al., 2018). We arrive at this list by reviewing 146 papers that study bias from the survey of Blodgett et al…”
Section: Classifying Existing Fairness Metrics (mentioning)
confidence: 99%
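To make the notion of a generalized group-comparison metric concrete, here is a minimal sketch that scores a parser separately on two language-variety groups and reports the largest gap. The labeled attachment score (LAS) definition is standard; the toy data, function names, and the specific gap formulation are illustrative assumptions, not the formulation used in the cited work.

def las(gold, pred):
    # Labeled attachment score: fraction of tokens whose predicted
    # (head, deprel) pair matches the gold annotation.
    correct = sum(g == p for g, p in zip(gold, pred))
    return correct / len(gold)

def group_gap(groups):
    # Score each group separately, then report the largest difference --
    # one simple instance of a group-comparison fairness metric.
    scores = {name: las(gold, pred) for name, (gold, pred) in groups.items()}
    values = list(scores.values())
    return scores, max(values) - min(values)

# Hypothetical evaluation data: per-token (head, deprel) pairs, gold vs.
# predicted, for tweets grouped by language variety.
groups = {
    "AAE": ([(2, "nsubj"), (0, "root")], [(2, "nsubj"), (1, "obj")]),
    "MAE": ([(2, "nsubj"), (0, "root")], [(2, "nsubj"), (0, "root")]),
}

scores, gap = group_gap(groups)
print(scores)       # per-group LAS
print("gap:", gap)  # 0.5 in this toy example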
“…They evaluated the performance of two parsers on this dataset and found that their performance lagged significantly in comparison to their performance on the Italian UD Treebank. A new dataset of 500 tweets within the framework of UD 2.0 was developed and annotated by [5], of which 250 tweets are in African American English. TWEEBANK V2 was developed by [13] by completely labelling TWEEBANK V1 according to UD 2.0, along with additionally sampled tweets, for a total of 3,550 tweets.…”
Section: Related Work (mentioning)
confidence: 99%