Proceedings of the Ninth International Conference on Information and Knowledge Management 2000
DOI: 10.1145/354756.354843

First story detection in TDT is hard

Citations: cited by 119 publications (95 citation statements)
References: 6 publications
“…Allan et al [10] develop a framework for the evaluation of TDT tasks where missed detection rate is the percentage of documents which should have been categorised as novel (but were not) to the total amount of documents that indicate as new and false alarm rate is the ratio of documents that mistakenly identified as novel to the total number of documents categorized as new. A variation of ROC curves, detection error trade-off (DET) can be used to demonstrate the trade-off between miss probability and false alarms.…”
Section: Evaluation Metrics (mentioning)
confidence: 99%
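
The two error rates in the statement above can be made concrete with a small sketch. It uses the usual detection-theory normalisation, which is an assumption here and is worded differently from the quoted statement: misses are divided by the number of truly novel documents, and false alarms by the number of truly non-novel documents. The function names, the score convention (higher means more novel) and the threshold sweep are all illustrative and are not taken from the cited paper.

```python
def detection_error_rates(scores, is_novel, threshold):
    """Return (miss_rate, false_alarm_rate) for one decision threshold.

    scores: novelty scores, higher means "more likely a first story" (assumed convention).
    is_novel: ground-truth booleans, True for genuinely novel documents.
    """
    # A miss is a truly novel document scored below the threshold.
    misses = sum(1 for s, y in zip(scores, is_novel) if y and s < threshold)
    # A false alarm is a non-novel document scored at or above the threshold.
    false_alarms = sum(1 for s, y in zip(scores, is_novel) if not y and s >= threshold)
    n_targets = sum(1 for y in is_novel if y)
    n_nontargets = len(is_novel) - n_targets
    miss_rate = misses / n_targets if n_targets else 0.0
    fa_rate = false_alarms / n_nontargets if n_nontargets else 0.0
    return miss_rate, fa_rate


def det_points(scores, is_novel):
    """Use every observed score as a threshold; return (false_alarm, miss) pairs."""
    points = []
    for t in sorted(set(scores)):
        miss, fa = detection_error_rates(scores, is_novel, t)
        points.append((fa, miss))
    return points
```

Plotting the pairs returned by `det_points`, typically on normal-deviate axes, gives a DET-style curve showing how lowering the threshold trades misses for false alarms.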
“…In a tf.idf model, the frequency of a term in a document (tf) is weighted by the inverse document frequency (idf), the inverse of the number of documents containing a term. Researchers have tested a number of similarity measures in the link detection task, including weighted sum, language modeling and Kullback-Leibler divergence, and found that the cosine similarity produced the best results [18]. In addition, using different methods together improved the retrieval performance [8] [32].…”
Section: Related Work (mentioning)
confidence: 99%
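
As a rough illustration of the tf.idf weighting and cosine comparison described in the statement above, here is a minimal sketch in plain Python. It assumes documents are already tokenized into lists of terms and uses idf = log(N / df), a common variant; the quoted statement says only "inverse", so the logarithm is an assumption. The function names are hypothetical and do not come from the cited work.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build a sparse tf.idf vector (dict of term -> weight) per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    # Assumption: idf = log(N / df); terms found in every document get weight 0.
    idf = {term: math.log(n_docs / df) for term, df in doc_freq.items()}
    return [
        {term: tf * idf[term] for term, tf in Counter(doc).items()}
        for doc in docs
    ]


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either vector is all zeros."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

In a link detection setting, two stories would be judged to discuss the same topic when `cosine` of their vectors exceeds a tuned threshold; the threshold value is not specified in the quoted statement.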
“…A possible reason for that is that NED has no scope: it provides no intuition for what we should look for in a report; the only thing we know is what we should not look for: we should not retrieve anything we have seen before. Allan, Lavrenko and Jin [7] presented a formal argument showing that the New Event Detection problem cannot be solved using existing methods.…”
Section: Definition (mentioning)
confidence: 99%