Proceedings of the 21st ACM Conference on Hypertext and Hypermedia 2010
DOI: 10.1145/1810617.1810621

Is this a good title?

Abstract: Missing web pages, URIs that return the 404 "Page Not Found" error or the HTTP response code 200 but dereference unexpected content, are ubiquitous in today's browsing experience. We use Internet search engines to relocate such missing pages and provide means that help automate the rediscovery process. We propose querying web pages' titles against search engines. We investigate the retrieval performance of titles and compare them to lexical signatures which are derived from the pages' content. Since titles nat…
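
The two query types compared in the abstract can be illustrated with a minimal sketch. It assumes that a lexical signature is the top-k terms of a page ranked by TF-IDF, which is the usual definition but is not spelled out in the abstract above. The function names, the small in-memory `corpus_texts` used for document frequencies, and the smoothed IDF formula are all hypothetical choices made for illustration; they are not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lexical_signature(page_text, corpus_texts, k=5):
    """Return the k terms of page_text with the highest TF-IDF score.

    corpus_texts is a hypothetical stand-in for whatever source of
    document frequencies is actually available (assumption).
    """
    tf = Counter(tokenize(page_text))
    n_docs = len(corpus_texts)
    doc_tokens = [set(tokenize(t)) for t in corpus_texts]
    scores = {}
    for term, freq in tf.items():
        df = sum(1 for toks in doc_tokens if term in toks)
        # Smoothed IDF: terms found in every corpus document score 0.
        idf = math.log((n_docs + 1) / (df + 1))
        scores[term] = freq * idf
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]


def title_query(title):
    """Turn a page title into a query string by normalizing whitespace."""
    return " ".join(title.split())


if __name__ == "__main__":
    page = "hypertext hypertext conference missing pages the the the"
    corpus = ["the cat the", "the dog", "the conference"]
    print(lexical_signature(page, corpus, k=3))
    print(title_query("  Is this a   good title? "))
```

Either string would then be submitted to a search engine, and the rank of the missing page's URI in the result list would be recorded.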

Citations: cited by 12 publications (6 citation statements)
References: 36 publications (40 reference statements)
“…Note that the data in Table 3 on aggregated values meaning we merged the results for 5- and 7-term lexical signatures into one category and likewise for all tag based query lengths. We can see that titles outperform lexical signatures, supporting our earlier findings in [9,10]. Both methods perform better than tags in terms of URIs returned top ranked, mean nDCG and MAP even though tags leave slightly fewer URIs undiscovered than lexical signatures.…”
Section: Performance Compared To Content Based Queries (supporting)
confidence: 86%
“…Even though they are expensive to compute, similar to tags, they may provide an alternative if no copy of a missing page can be found in the web infrastructure. Further research in [9,10] has shown that titles of web pages are a very strong alternative to lexical signatures. The results also prove that we can increase the retrieval performance by applying both methods combined.…”
Section: Content and Link Based Methods To Rediscover Web Pages (mentioning)
confidence: 99%
“…In our "Just-in-time" preservation research we discovered new locations of web pages that are missing in the current web [19]. We investigated a variety of techniques, including using page titles [22], tags [21], and lexical signatures [23], all of which could be used as queries to search engines to find replacement copies of the missing web page.…”
Section: Related Work (mentioning)
confidence: 99%
“…In our "Just-in-time" preservation research we discovered new locations of web pages that are missing in the current web [19]. We investigated a variety of techniques, including using page titles [22], tags [21], and lexical signatures [23], all of which could be used as queries to search engines to find replacement copies of the missing web page.…”
Section: Related Work (mentioning)
confidence: 99%
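
The first citation statement above compares methods by URIs returned top ranked, mean nDCG, MAP, and URIs left undiscovered. The sketch below shows how those four figures could be computed for a rediscovery task. It assumes binary relevance with exactly one relevant result per query (the missing page's own URI), so that the ideal DCG is 1; the citing paper may define relevance differently. All function names and the sample data are hypothetical.

```python
import math


def rank_of(target_uri, results):
    """Return the 1-based rank of target_uri in results, or None if absent."""
    for i, uri in enumerate(results, start=1):
        if uri == target_uri:
            return i
    return None


def ndcg_single_target(rank):
    """nDCG with one relevant result: DCG = 1/log2(rank + 1), ideal DCG = 1."""
    return 0.0 if rank is None else 1.0 / math.log2(rank + 1)


def average_precision_single_target(rank):
    """Average precision with one relevant result reduces to 1/rank."""
    return 0.0 if rank is None else 1.0 / rank


def evaluate(queries):
    """Summarize a list of (target_uri, result_list) pairs."""
    ranks = [rank_of(target, results) for target, results in queries]
    n = len(ranks)
    if n == 0:
        return {"mean_ndcg": 0.0, "map": 0.0, "top_ranked": 0, "undiscovered": 0}
    return {
        "mean_ndcg": sum(ndcg_single_target(r) for r in ranks) / n,
        "map": sum(average_precision_single_target(r) for r in ranks) / n,
        "top_ranked": sum(1 for r in ranks if r == 1),
        "undiscovered": sum(1 for r in ranks if r is None),
    }


if __name__ == "__main__":
    # Hypothetical result lists for three missing pages.
    sample = [
        ("a", ["a", "b"]),       # found at rank 1
        ("b", ["x", "y", "b"]),  # found at rank 3
        ("c", ["x"]),            # not found
    ]
    print(evaluate(sample))
```

Under the single-target assumption, a URI found at rank 1 contributes 1.0 to both means, and an undiscovered URI contributes 0 to both.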