2022
DOI: 10.48550/arxiv.2202.03829
Preprint

TimeLMs: Diachronic Language Models from Twitter

Abstract: Despite its importance, the time variable has been largely neglected in the NLP and language model literature. In this paper, we present TimeLMs, a set of language models specialized on diachronic Twitter data. We show that a continual learning strategy contributes to enhancing Twitter-based language models' capacity to deal with future and out-of-distribution tweets, while making them competitive with standardized and more monolithic benchmarks. We also perform a number of qualitative analyses showing how the…

Cited by 25 publications (32 citation statements)
References 26 publications (34 reference statements)
“…Broadly, most of the observed semantic shift can be described as changes in the popularity of different word senses [46]. Although this suggests that contextual language models [32] would be well-suited for mitigating the effect of semantic shift in longitudinal analyses, emerging research suggests this is not necessarily true in the absence of additional tuning [33,61].…”
Section: Naïve (mentioning)
confidence: 99%
“…A lack of analyses of temporal robustness of these models belies the seriousness of the problem: language shifts over time - especially on social media [15,61] - and statistical classifiers degrade in the presence of distributional changes [28,50]. Three types of distributional change are of particular concern for classifiers applied over time: 1) new terminology is used to convey existing concepts; 2) existing terminology is used to convey new concepts; and 3) semantic relationships remain fixed, but the overall language distribution changes.…”
Section: Introduction (mentioning)
confidence: 99%
“…Several recent studies have explored and evaluated the generalization ability of language models to time (Röttger and Pierrehumbert, 2021; Lazaridou et al, 2021; Agarwal and Nenkova, 2021; Hofmann et al, 2021; Loureiro et al, 2022). To better handle continuously evolving web content, Hombaiah et al (2021) performed incremental training.…”
Section: Temporal Language Models (mentioning)
confidence: 99%
“…The "static" nature of existing LMs makes them unaware of time, and in particular unaware of language changes that occur over time. This prevents such models from adapting to time and generalizing temporally (Röttger and Pierrehumbert, 2021; Lazaridou et al, 2021; Hombaiah et al, 2021; Dhingra et al, 2022; Agarwal and Nenkova, 2021; Loureiro et al, 2022), abilities that were shown to be important for many tasks in NLP and Information Retrieval (Kanhabua and Anand, 2016; Rosin et al, 2017; Huang and Paul, 2019; Röttger and Pierrehumbert, 2021; Savov et al, 2021). Recently, to create time-aware models, the NLP community has started to use time as a feature in training and fine-tuning language models (Dhingra et al, 2022; Rosin et al, 2022).…”
Section: Introduction (mentioning)
confidence: 99%
“…While these works focus on understanding bias in film directly, we take a slightly different framing, examining how the bias in a film dataset can impact the biases of a language model. Loureiro et al (2022) examine concept drift and generalization on language models trained on Twitter data over time. Our work on longitudinal effects of film data is distinct in timescale (reflecting the much slower release rate of films relative to tweets) and in motivation; Loureiro et al (2022) consider the effects of the data's time period on model performance, while we examine the effects of the time period on model biases.…”
Section: Related Work (mentioning)
confidence: 99%