Proceedings of the Second ACM International Conference on Web Search and Data Mining 2009
DOI: 10.1145/1498759.1498806

Query by document

Abstract: We are experiencing an unprecedented increase of content contributed by users in forums such as blogs, social networking sites and microblogging services. This abundance of content complements content on web sites and traditional media forums such as newspapers, news and financial streams, and so on. Given such a plethora of information, there is a pressing need to cross-reference information across textual services. For example, commonly we read a news item and we wonder if there are any blogs reporting related…

Cited by 82 publications
(72 citation statements)
References 22 publications
“…They first extract semantic keywords from the paragraph and then do the search in an annotated image database. In addition, querying by documents is studied in [24].…”
Section: B Cross Domain Search (mentioning)
confidence: 99%
“…We chose YTE over frequency based techniques since we did not want to be limited by counts from a 12000 post corpus for tf.idf calculations. Also, a recent work comparing YTE, tf.idf and mutual information based techniques for word and phrase identification concluded that YTE did better than tf.idf when identifying top k < 4 keywords in a document and all three were similar in characterizing document content for larger values of k [6].…”
Section: Abstract For Advertising (mentioning)
confidence: 99%
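The citation statement above compares tf.idf with other techniques for picking the top k keywords of a document. As an illustration only, here is a minimal sketch of top-k keyword selection by tf.idf. It assumes whitespace tokenization, relative term frequency, and an unsmoothed idf of log(N / df); the function name `top_k_tfidf` is hypothetical, and this is not the implementation used in the cited or citing work.

```python
import math
from collections import Counter


def top_k_tfidf(docs, doc_index, k=3):
    """Return the k highest-scoring tf.idf terms of docs[doc_index].

    Assumptions (not from the source): lowercase whitespace tokens,
    tf = count / document length, idf = log(N / document frequency).
    """
    tokenized = [doc.lower().split() for doc in docs]
    target = tokenized[doc_index]
    if not target:
        return []

    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    tf = Counter(target)

    scores = {
        term: (count / len(target)) * math.log(n_docs / df[term])
        for term, count in tf.items()
    }
    # Highest score first; ties broken alphabetically so output is stable.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


if __name__ == "__main__":
    corpus = [
        "the cat sat on the mat",
        "the dog sat on the log",
        "the bird flew over the mat",
    ]
    print(top_k_tfidf(corpus, 0, k=3))
```

With a corpus this small the idf estimates are coarse, which is the limitation the citing authors raise about computing tf.idf from a limited set of posts.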
“…As every keyword $k_i$ is added from $C_2$ to $C_1$, the change in Information Content of $C_1$ is measured as $IC(C_1, k_i)_{\delta} = IC(C_1, k_i) - IC(C_1)$ (6), where $IC(C_1, k_i)$ is the information content of $C_1$ after adding keyword $k_i$ from $C_2$. $IC(C_1, k_i)_{\delta}$ is positive when $k_i$ is strongly associated with words in $C_1$ and negative when $k_i$ is unrelated to words in $C_1$.…”
Section: Identifying Contextual Abstract (mentioning)
confidence: 99%
“…Relational data can be observed in many predictive modeling tasks, such as forecasting the winner in two-player computer games [1], predicting proteins that interact with other proteins in bioinformatics [2], retrieving documents that are similar to a target document in text mining [3], investigating the persons that are friends of each other on social network sites [4], etc. All these examples represent fields of application in which specific machine learning and data mining algorithms are successfully developed to infer relations from data; pairwise relations, to be more specific.…”
Section: Introduction (mentioning)
confidence: 99%