2014
DOI: 10.1007/s00799-014-0108-0

Moved but not gone: an evaluation of real-time methods for discovering replacement web pages

Abstract: Inaccessible Web pages and 404 "Page Not Found" responses are a common Web phenomenon and a detriment to the user's browsing experience. The rediscovery of missing Web pages is, therefore, a relevant research topic in the digital preservation as well as in the Information Retrieval realm. In this article, we bring these two areas together by analyzing four content- and link-based methods to rediscover missing Web pages. We investigate the retrieval performance of the methods individually as well as their combin…

citations
Cited by 17 publications
(11 citation statements)
references
References 62 publications
“…The correlations between the features are considered towards data retrieval. In [7], different methods of retrieval have been evaluated for their performance under standalone and combined way. Toward corpus retrieval of newspapers, an log based approach is presented in [8].…”
Section: Literature Review (mentioning)
confidence: 99%
“…In summary, anchor texts are related to real queries, and target documents' titles. In addition to this, anchor text is available not only for pages in the archive, but also for pages that have not been archived when there are pointers to them from pages in the Web archive [29,33,42].…”
Section: Query Set (mentioning)
confidence: 99%
“…Links and anchor texts can be used to locate missing webpages, of which the original URL is not accessible anymore. Klein and Nelson [13] computed lexical signatures of lost webpages, using the top n words of link anchors, and used these and other methods to retrieve alternative URLs for lost webpages. The use of the link structure and anchor texts to uncover and reconstruct target pages that were not archived was studied in [9], based on a depth-first crawl of manually selected websites.…”
Section: Related Work (mentioning)
confidence: 99%
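
The last citation statement describes building a lexical signature from the top n words of link anchors. A minimal sketch of that idea follows, assuming plain term-frequency ranking; the function name `anchor_signature`, the stopword list, and the sample anchors are hypothetical, and the cited work may weight terms differently.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = frozenset(
    {"the", "a", "an", "of", "and", "to", "in", "for", "on", "here", "click"}
)


def anchor_signature(anchor_texts, n=5):
    """Return the n most frequent non-stopword terms across anchor texts.

    Assumption: raw term frequency is used as the ranking criterion.
    Ties are broken alphabetically so the result is deterministic.
    """
    counts = Counter()
    for text in anchor_texts:
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            if token not in STOPWORDS:
                counts[token] += 1
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]


# Anchor texts of (made-up) links that pointed to a page that is now missing.
anchors = [
    "Hurricane tracking maps",
    "hurricane maps archive",
    "Click here for the hurricane maps",
]
signature = anchor_signature(anchors, n=3)
query = " ".join(signature)  # could be submitted to a search engine
```

The resulting terms can be joined into a search query to look for a replacement URL of the missing page.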