Proceedings of the Third Workshop on Natural Language Processing and Computational Social Science 2019
DOI: 10.18653/v1/w19-2103

Tweet Classification without the Tweet: An Empirical Examination of User versus Document Attributes

Abstract: NLP naturally puts a primary focus on leveraging document language, occasionally considering user attributes as supplemental. However, as we tackle more social scientific tasks, it is possible user attributes might be of primary importance and the document supplemental. Here, we systematically investigate the predictive power of user-level features alone versus document-level features for document-level tasks. We first show user attributes can sometimes carry more task-related information than the document itself…
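The comparison the abstract describes can be sketched concretely: fit one classifier that only sees the tweet text and another that only sees user-level attributes, then score both on the same held-out documents. The snippet below is a minimal illustration on synthetic data, not the authors' pipeline; the feature choices (TF-IDF for the document, a fixed-length user-attribute vector) and the logistic regression models are assumptions made for the example.

```python
# Sketch: document-only vs. user-only classifiers for a document-level task.
# All data here is synthetic; feature choices are illustrative assumptions.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# 200 synthetic tweets with a binary label; each tweet also comes with a
# 20-dimensional user-attribute vector (e.g., features from the author's history).
docs, user_feats, labels = [], [], []
for _ in range(200):
    y = int(rng.integers(0, 2))
    topic = "policy" if y else "weather"
    docs.append(f"talking about {topic} today")
    user_feats.append(rng.normal(loc=y, scale=1.0, size=20))  # user signal correlates with label
    labels.append(y)
user_feats = np.array(user_feats)
labels = np.array(labels)

idx_train, idx_test = train_test_split(np.arange(len(docs)), test_size=0.3, random_state=0)

# Document-only model: TF-IDF over the labeled tweet text.
vec = TfidfVectorizer()
X_doc_train = vec.fit_transform([docs[i] for i in idx_train])
X_doc_test = vec.transform([docs[i] for i in idx_test])
doc_clf = LogisticRegression(max_iter=1000).fit(X_doc_train, labels[idx_train])

# User-only model: user attributes alone, with no access to the labeled tweet.
usr_clf = LogisticRegression(max_iter=1000).fit(user_feats[idx_train], labels[idx_train])

print("document-only weighted F1:",
      f1_score(labels[idx_test], doc_clf.predict(X_doc_test), average="weighted"))
print("user-only weighted F1:",
      f1_score(labels[idx_test], usr_clf.predict(user_feats[idx_test]), average="weighted"))
```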

Cited by 18 publications (19 citation statements); references 32 publications.
“…We also include the top participant from the shared task, Zarrella and Marsh (2016), which uses a different F1 score as defined for the shared task, referred to here as SemEval F1. Lastly, we compare our results to the approach of Lynn et al. (2019), from whom we received the extended history dataset, which uses the labeled tweet and a list of accounts the author follows. However, they only report the weighted-F1 score for their best-performing model.…”
Section: Results (mentioning, confidence 99%)
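The two metrics contrasted in the statement above are easy to confuse. As a toy illustration (made-up gold and predicted labels, not data from either paper): the SemEval-2016 Task 6 stance metric averages the F1 of only the FAVOR and AGAINST classes, whereas weighted F1 averages the F1 of all classes weighted by their support, so the two can diverge noticeably when the NONE class is frequent.

```python
# Toy contrast of the SemEval-style stance F1 and weighted F1 (labels invented).
from sklearn.metrics import f1_score

LABELS = ["FAVOR", "AGAINST", "NONE"]
gold = ["FAVOR", "AGAINST", "NONE", "AGAINST", "FAVOR", "NONE", "AGAINST"]
pred = ["FAVOR", "AGAINST", "AGAINST", "AGAINST", "NONE", "NONE", "FAVOR"]

# Per-class F1, in the order given by LABELS.
per_class = f1_score(gold, pred, labels=LABELS, average=None)

# SemEval-style stance F1: mean of F1(FAVOR) and F1(AGAINST); NONE is not averaged in.
semeval_f1 = (per_class[0] + per_class[1]) / 2

# Weighted F1: every class contributes, weighted by its support in the gold labels.
weighted_f1 = f1_score(gold, pred, average="weighted")

print(dict(zip(LABELS, per_class.round(3))))
print(f"SemEval F1:  {semeval_f1:.3f}")
print(f"weighted F1: {weighted_f1:.3f}")
```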
“…In total we have 3,021 instances, with a split of 1,658 train, 418 dev, and 945 test across all targets. The original 2016 shared task had 4,100 instances; however, due to accounts or messages being deleted over time, we were unable to replicate the complete original dataset and instead used the smaller version available from Lynn et al. (2019)…”
Section: Appendix A (mentioning, confidence 99%)
“…Our work is aligned with a growing set of methods to embed language processing within the social and human contexts in which they are applied (Lynn et al., 2019). Most similar is the work on language generation or dialog agents (i.e.…”
Section: Related Work (mentioning, confidence 99%)
“…Such tasks present an interesting challenge for the NLP community to model the people behind the language rather than the language itself, and the social scientific community has begun to see success of such approaches as an alternative or supplement to standard psychological assessment techniques like questionnaires (Kern et al., 2016; Eichstaedt et al., 2018). Generally, such work is helping to embed NLP in a greater social and human context (Hovy and Spruit, 2016; Lynn et al., 2019).…”
Section: Introduction (mentioning, confidence 99%)