2011
DOI: 10.1007/978-3-642-23088-2_29

Improving the Quality of Web Archives through the Importance of Changes

Abstract: Due to the growing importance of the Web, several archiving institutes (national libraries, Internet Archive, etc.) are harvesting sites to preserve (a part of) the Web for future generations. A major issue encountered by archivists is to preserve the quality of web archives. One way of assessing the quality of an archive is to quantify its completeness and the coherence of its page versions. Due to the large number of pages to be captured and the limitations of resources (storage space, bandwidth, e…

Cited by 5 publications
(2 citation statements)
References 16 publications
“…McCown and Nelson address coverage [17], but their research is limited to search engine caches. Ben Saad et al [5,4] address qualitative completeness through change detection to identify and archive important changes (rather than simply archiving every change). This research primarily addresses a priori completeness.…”
Section: Completeness (Coverage)
Citation type: mentioning
confidence: 99%
“…Baeza-Yates et al [5] compared different strategies based on the available information about the crawling cycle (no-information, partial information, or all the information). Ben Saad and Gançarski [6,7] focused on adapting new crawling strategies to increase the quality of the web archive for completeness and coherence.…”
Section: Related Work
Citation type: mentioning
confidence: 99%