2014
DOI: 10.1007/s00799-014-0120-4

Evaluating sliding and sticky target policies by measuring temporal drift in acyclic walks through a web archive

Abstract: When a user views an archived page using the archive's user interface (UI), the user selects a datetime to view from a list. The archived web page, if available, is then displayed. From this display, the web archive UI attempts to simulate the web browsing experience by smoothly transitioning between archived pages. During this process, the target datetime changes with each link followed, drifting away from the datetime originally selected. When browsing sparsely archived pages, this nearly-silent drift can be …

Cited by 11 publications (14 citation statements)
References 23 publications
“…If the Memento is deemed relevant, it is added to the event collection, its outlinks are extracted and added to the priority queue. We note that most web archives rewrite outlinks in their Mementos to point back into the same archive rather than to the live web, even when the archive does not hold a Memento for the linked resource or only holds Mementos that are temporally distant from the desired time, which in our case is DTE [2]. We therefore add the original URI (URI-R) of the outlink, which can be obtained using features of the Memento protocol, to the priority queue rather than the rewritten URI-M of the outlink.…”
Section: Web Archive Crawls (mentioning)
confidence: 99%
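
The statement above says the original URI (URI-R) of an outlink can be recovered through features of the Memento protocol. One such feature is the `Link` response header that a memento returns, which carries a link with `rel="original"`. The following is a minimal sketch of reading that header, not the cited authors' crawler: the function name `original_uri` and the example URIs are hypothetical, and the parser handles only the common `<uri>; param="value"` layout.

```python
import re
from typing import Optional

# Matches one link-value: <uri> followed by zero or more ;name=value parameters.
# Quoted values may contain commas (e.g. datetime="Fri, 01 Jan 2010 ...").
_LINK_VALUE = re.compile(r'<([^>]*)>((?:\s*;\s*[^,;=]+=(?:"[^"]*"|[^,;]*))*)')
_REL = re.compile(r'\brel\s*=\s*"?([^";]*)"?')


def original_uri(link_header: str) -> Optional[str]:
    """Return the target of the first link whose rel includes 'original'."""
    for match in _LINK_VALUE.finditer(link_header):
        uri, params = match.group(1), match.group(2)
        rel = _REL.search(params)
        # rel may hold several space-separated relation types.
        if rel and "original" in rel.group(1).split():
            return uri
    return None  # header carried no rel="original" link


# Hypothetical Link header as a memento (URI-M) response might carry it.
header = (
    '<http://archive.example.org/web/20100101000000/http://example.com/page>; '
    'rel="memento"; datetime="Fri, 01 Jan 2010 00:00:00 GMT", '
    '<http://example.com/page>; rel="original"'
)
uri_r = original_uri(header)
```

A crawler following the approach in the quote would enqueue `uri_r` rather than the rewritten archive link it found in the page.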
“…When using a page-at-a-time archival service, the resulting memento contains embedded resources with the same archival datetime [1]. This section identifies our damage measurement of this page-at-a-time archiver and outlines the differences between Heritrix and WebCite.…”
Section: Measuring Webcite (mentioning)
confidence: 99%
“…The Internet Archive alone boasts 455 billion Web pages in its archive, which is far larger than can be evaluated through human methods. While Banos et al. constructed the CLEAR method to assign a predictive archivability score [6], a similar score for the actual performance of an archival tool does not exist outside of the simple metric of the percent of embedded resources archived.…”
Section: Introduction (mentioning)
confidence: 99%
“…As the entire process is done manually, the researcher will typically have to try many hyperlinks that are not available in the archive. The researcher also has to be aware of temporal drift while navigating the archive [1]. Temporal drift occurs because the linking and linked page were usually crawled at different times and therefore each navigation step moves the analyzed point in time.…”
Section: Introduction (mentioning)
confidence: 99%
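
The drift described in the statement above, and the sliding and sticky target policies named in the article title, can be illustrated with a toy walk. This is a sketch under stated assumptions rather than the article's method: it assumes the archive serves the capture closest to the current target datetime, that a sliding policy resets the target to each visited memento's datetime, and that a sticky policy keeps the originally selected datetime. The page names and capture dates are invented.

```python
from datetime import datetime


def closest(captures, target):
    """Pick the capture datetime nearest to the target (assumed archive behavior)."""
    return min(captures, key=lambda d: abs(d - target))


def walk(path, archive, start, policy):
    """Follow `path` and return the drift in days from `start` at each step."""
    target = start
    drifts = []
    for page in path:
        memento = closest(archive[page], target)
        drifts.append(abs(memento - start).days)
        if policy == "sliding":
            # Assumption: the target slides to the datetime of the memento just viewed.
            target = memento
        # Under "sticky" the target stays at the originally selected datetime.
    return drifts


# Invented capture datetimes for three sparsely archived pages.
archive = {
    "A": [datetime(2005, 6, 1)],
    "B": [datetime(2005, 1, 1), datetime(2005, 9, 1)],
    "C": [datetime(2005, 3, 1), datetime(2006, 3, 1)],
}
start = datetime(2005, 3, 1)
path = ["A", "B", "C"]

sliding = walk(path, archive, start, "sliding")
sticky = walk(path, archive, start, "sticky")
```

In this toy data the sliding walk ends a full year from the selected datetime, while the sticky walk returns to it on the last step, because each sliding step picks the capture nearest to the previous memento instead of the original target.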