Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) 2018
DOI: 10.18653/v1/p18-2110

Examining Temporality in Document Classification

Abstract: Many corpora span broad periods of time. Language processing models trained during one time period may not work well in future time periods, and the best model may depend on specific times of year (e.g., people might describe hotels differently in reviews during the winter versus the summer). This study investigates how document classifiers trained on documents from certain time intervals perform on documents from other time intervals, considering both seasonal intervals (intervals that repeat across years, e.…
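The abstract describes a cross-interval evaluation: a classifier is trained on documents from one time interval and tested on documents from another. A minimal sketch of that setup is below, assuming a simple bag-of-words classifier and a document schema with text, label, and year fields; the field names and interval boundaries are illustrative, not the paper's actual data format.

```python
# Minimal sketch of cross-interval evaluation: train on one time interval,
# test on another. The 'text'/'label'/'year' schema and the chosen years are
# illustrative assumptions, not the paper's actual setup.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def cross_interval_accuracy(docs, train_years, test_years):
    """docs: iterable of dicts with 'text', 'label', and 'year' keys (assumed schema)."""
    train = [d for d in docs if d["year"] in train_years]
    test = [d for d in docs if d["year"] in test_years]

    vectorizer = TfidfVectorizer(min_df=2)
    X_train = vectorizer.fit_transform([d["text"] for d in train])
    X_test = vectorizer.transform([d["text"] for d in test])

    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_train, [d["label"] for d in train])
    preds = clf.predict(X_test)
    return accuracy_score([d["label"] for d in test], preds)

# Example: train on reviews from 2012-2013, evaluate on reviews from 2015-2016.
# acc = cross_interval_accuracy(docs, train_years={2012, 2013}, test_years={2015, 2016})
```

Seasonal intervals (the winter-versus-summer case mentioned in the abstract) can be evaluated the same way by filtering documents on month rather than year.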

Cited by 30 publications (41 citation statements)
References 11 publications
“…We retrieved available data sources from previous publications (Zhang et al., 2014; He and McAuley, 2016; Huang and Paul, 2018). Specifically, we use four different sources in English-Amazon (music reviews), Yelp (restaurant and hotel reviews), Twitter, and economic newspaper articles (Figure Eight Inc., 2015)-and one source in Chinese, Dianping (Meituan-Dianping, 2019).…”
Section: Data
confidence: 99%
“…Following Huang and Paul (2018), we group the corpora into several bins of temporal intervals; specifically, non-repeating time intervals spanning one or more years (Table 1). We encode each temporal domain into the discrete time labels 1, 2, ..., T.…”
Section: Data
confidence: 99%
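The binning described in the citation above can be sketched as a simple mapping from a document's timestamp to a discrete time-domain label in 1, ..., T. The bin boundaries below are illustrative assumptions, not the intervals actually used in the cited work.

```python
# Sketch of mapping document years to discrete time-domain labels 1..T.
# Bin boundaries are illustrative assumptions, not the cited paper's intervals.
import bisect

def assign_time_label(year, boundaries):
    """Return a label in 1..T, where T = len(boundaries) + 1.

    boundaries: sorted upper bounds of the first T-1 bins, e.g. [2010, 2013, 2016]
    yields labels 1 (<= 2010), 2 (2011-2013), 3 (2014-2016), 4 (>= 2017).
    """
    return bisect.bisect_left(boundaries, year) + 1

print([assign_time_label(y, [2010, 2013, 2016]) for y in (2008, 2012, 2020)])  # [1, 2, 4]
```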
“…Consistency regularization methods (e.g., self-ensembling) outperform adversarial methods on visual semi-supervised and domain adaptation tasks (Athiwaratkun et al., 2019), but have rarely been applied to textual data (Ko et al., 2019). Finally, Huang and Paul (2018) establish the feasibility of using domain adaptation to label documents from discrete time periods. Our work departs from previous work by proposing an adaptive, time-aware approach to consistency regularization provisioned with causal convolutional networks.…”
Section: Related Work
confidence: 99%