Proceedings of the Twelfth International Conference on World Wide Web - WWW '03 2003
DOI: 10.1145/775153.775155

Improving pseudo-relevance feedback in web information retrieval using web page segmentation

Abstract: In contrast to traditional document retrieval, a web page as a whole is not a good information unit to search because it often contains multiple topics and a lot of irrelevant information from the navigation, decoration, and interaction parts of the page. In this paper, we propose a VIsion-based Page Segmentation (VIPS) algorithm to detect the semantic content structure in a web page. Compared with a simple DOM-based segmentation method, our page segmentation scheme utilizes useful visual cues to obtain a better part…

Cited by 91 publications (97 citation statements)
References 11 publications (14 reference statements)
“…• Co-occurrence-based techniques employ terms highly co-occurring with the initial query retrieved from a corpus (e.g., entire documents [34] or lexical affinity relationships [5]), resulting in an increase of retrieval precision [25]. • Relevance feedback techniques analyse the documents retrieved from the initial query in order to extract related information, in a supervised [7] or unsupervised fashion [27,47]. • Brute-force techniques [17] recursively construct queries from an initial one by adding new terms from a repository of common words until the amount of results is below the maximum number of indexed resources.…”
Section: Query Expansion (mentioning)
confidence: 99%
“…For example, Yu et al. (2003) investigated a technique in which expansion terms are selected from vision-based segments of a Web page instead of the whole page. Generally speaking, entire Web pages often contain multiple topics and a lot of irrelevant information relating to navigation or decoration within a page.…”
Section: Related Work (mentioning)
confidence: 99%
“…Their results showed significant improvement in retrieval effectiveness by applying RF mechanisms. More recently, pseudo-relevance feedback in web information retrieval using Web page segmentation has also been introduced [7].…”
Section: Related Work (mentioning)
confidence: 99%
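
The citation statements above describe selecting query expansion terms from page segments rather than from whole pages. The following is a minimal Python sketch of that idea, not the authors' implementation: it assumes each page has already been split into text segments (the segmentation step itself, such as VIPS, is not shown), scores segments by simple query-term counts, and picks the most frequent non-query terms from the top-scoring segments. The names `tokenize`, `score`, `expand_query`, `STOPWORDS`, and `example_pages` are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "on"}


def tokenize(text):
    """Lowercase the text, split on non-alphanumerics, and drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def score(query_terms, segment_tokens):
    """Count occurrences of query terms in a segment (a stand-in for a real ranking function)."""
    counts = Counter(segment_tokens)
    return sum(counts[t] for t in query_terms)


def expand_query(query, pages, top_segments=2, num_terms=3):
    """Return the query terms plus expansion terms taken from the best-matching segments.

    `pages` is a list of pages, each given as a list of segment strings.
    """
    query_terms = tokenize(query)
    # Treat every segment of every page as its own retrieval unit.
    segments = [tokenize(seg) for page in pages for seg in page]
    ranked = sorted(segments, key=lambda s: score(query_terms, s), reverse=True)

    feedback = Counter()
    for seg in ranked[:top_segments]:
        if score(query_terms, seg) > 0:  # ignore segments with no query match
            feedback.update(t for t in seg if t not in query_terms)

    new_terms = [t for t, _ in feedback.most_common(num_terms)]
    return query_terms + new_terms


# Hypothetical input: two pages, each with a navigation segment and a content segment.
example_pages = [
    ["Home About Contact Login",
     "Jaguar sports car engine review: engine power and car handling"],
    ["Site map privacy terms",
     "The jaguar is a big cat; jaguar habitat includes rainforest"],
]

print(expand_query("jaguar car", example_pages, top_segments=1, num_terms=1))
```

Because navigation segments contain no query terms, they never contribute expansion terms in this sketch, which is the intuition behind using segments instead of whole pages as the feedback unit.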