Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1403
Neural Temporality Adaptation for Document Classification: Diachronic Word Embeddings and Domain Adaptation Models

Abstract: Language usage can change across periods of time, but document classification models are usually trained and tested on corpora spanning multiple years without considering temporal variations. This paper describes two complementary ways to adapt classifiers to shifts across time. First, we show that diachronic word embeddings, which were originally developed to study language change, can also improve document classification, and we present a simple method for constructing this type of embedding. Second, we propose a …
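The excerpt cuts off before the paper's construction method, but as a rough illustration of what diachronic word embeddings involve, the sketch below trains one word2vec model per time slice and rotates each slice into the previous slice's vector space with orthogonal Procrustes alignment (the approach of Hamilton et al., 2016, not necessarily the method of this paper). The input `docs_by_year` and the helper names `train_slice` and `procrustes_align` are hypothetical.

```python
# Illustrative sketch only: per-period word2vec models aligned with
# orthogonal Procrustes so word vectors stay comparable across time.
import numpy as np
from gensim.models import Word2Vec

# Toy stand-in corpus: year -> list of tokenized documents.
docs_by_year = {
    2014: [["the", "stream", "was", "cold"]] * 50,
    2019: [["the", "stream", "started", "buffering"]] * 50,
}

def train_slice(tokenized_docs):
    """Train a word2vec model on the documents of one time slice."""
    return Word2Vec(sentences=tokenized_docs, vector_size=100,
                    window=5, min_count=5, workers=4, seed=0)

def procrustes_align(base, other):
    """Rotate `other`'s vectors into `base`'s space over the shared vocab."""
    shared = [w for w in base.wv.index_to_key if w in other.wv]
    A = np.stack([base.wv[w] for w in shared])   # target space
    B = np.stack([other.wv[w] for w in shared])  # space to be rotated
    u, _, vt = np.linalg.svd(B.T @ A)
    Q = u @ vt                                   # best orthogonal map B -> A
    other.wv.vectors = other.wv.vectors @ Q
    return other

models, prev = {}, None
for year, docs in sorted(docs_by_year.items()):
    m = train_slice(docs)
    if prev is not None:
        m = procrustes_align(prev, m)  # keep axes comparable over time
    models[year] = m
    prev = m
```

With aligned slices, a classifier can look up the embedding of a word in the time slice a document came from, rather than using a single embedding averaged over all years.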

Cited by 26 publications (26 citation statements) | References 24 publications
“…Temporal information has also been used to improve named entity disambiguation on a data set of historical documents (Agarwal et al., 2018). Finally, Huang and Paul (2019) present a model that uses diachronic word embeddings combined with a method inspired by domain adaptation to improve document classification.…”
Section: Related Work
confidence: 99%
“…Finally, our attempts at domain transfer are constrained. Namely, we do not invoke explicit domain adaptation methods (Peng and Dredze, 2017; Li et al., 2018; Huang and Paul, 2019). Moving forward, we plan to explore algorithmic strategies to mitigate the biases discovered in this study.…”
Section: Limitations and Future Work
confidence: 99%
“…For our experiments, we used the IMDB dataset (135,669 documents) [28], the Yelp-hotel dataset (34,961 documents) [29], the Yelp-rest dataset (178,239 documents) [29], and the Amazon dataset (83,159 documents) [29]. The IMDB dataset is a movie review dataset annotated with 10-scale polarities.…”
Section: Results
confidence: 99%
“…Table 2 lists data statistics of the four datasets. For a fair comparison with the previous models, we encoded review scores of the Yelp-hotel dataset, the Yelp-rest dataset, and the Amazon dataset into three discrete categories (score >3 as positive, =3 as neutral, and <3 as negative) according to Huang and Paul's experimental settings [29].…”
Section: Datasets and Experimental Settings
confidence: 99%
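For replication, the three-way encoding quoted above is a simple thresholding step. A minimal sketch (the function name `encode_polarity` is ours, not from either paper):

```python
def encode_polarity(score: float) -> str:
    """Map a 1-5 review score to the three classes described above:
    >3 positive, =3 neutral, <3 negative."""
    if score > 3:
        return "positive"
    if score == 3:
        return "neutral"
    return "negative"

# Sanity checks for the three cases.
assert encode_polarity(5) == "positive"
assert encode_polarity(3) == "neutral"
assert encode_polarity(1) == "negative"
```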