Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing 2015
DOI: 10.18653/v1/d15-1067

Online Sentence Novelty Scoring for Topical Document Streams

Abstract: The enormous amount of information on the Internet has raised the challenge of highlighting new information in the context of already viewed content. This type of intelligent interface can save users time and prevent frustration. Our goal is to scale up novelty detection to large web properties like Google News and Yahoo News. We present a set of lightweight features for online novelty scoring and fast nonlinear feature transformation methods. Our experimental results on the TREC 2004 shared task datasets show …

Cited by 5 publications (2 citation statements)
References 15 publications
“…Finally, a related but distinct topic is novelty detection (Soboroff and Harman, 2005; Lee, 2015; Ghosal et al, 2018), in which two sets of documents are provided, one that is assumed to be known, and one that may contain new content. The task is to identify novel content in the second set.…”
Section: Outlier Detection (citation type: mentioning)
confidence: 99%
“…Novelty detection is a well-studied problem in information retrieval literature and has widespread natural language processing (NLP) applications such as: text summarization (Allan, Gupta, and Khandelwal 2001; Bysani 2010), event detection from news or tracking development of news items (Karkali et al 2013), predicting impact of scholarly articles (Mishra and Torvik 2016), etc. However, we find that most of the investigations (Gamon 2006; Zhang and Tsai 2009; Lee 2015) and exercises/shared tasks (Soboroff 2004; Bentivogli et al 2011) till date are directed toward sentence-level novelty mining. But considering the present context and exponential growth of redundant documents across the web, we deem document-level novelty detection as a very well-timed problem.…”
Section: Introduction (citation type: mentioning)
confidence: 75%