Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries 2018
DOI: 10.1145/3197026.3197045

A Framework for Aggregating Private and Public Web Archives

Abstract: Personal and private Web archives are proliferating due to the increase in the tools to create them and the realization that Internet Archive and other public Web archives are unable to capture personalized (e.g., Facebook) and private (e.g., banking) Web pages. We introduce a framework to mitigate issues of aggregation in private, personal, and public Web archives without compromising potential sensitive information contained in private captures. We amend Memento syntax and semantics to allow TimeMap enrich…

Cited by 11 publications (7 citation statements)
References 24 publications
“…The process of HTTP requests as recursively applied through an aggregator subsequently querying additional sources resembles a graph structure, typically reduced to a tree in the conventional case (Section 4.2). As this work reiterates the potential for an aggregator querying an aggregator [16], the scenario arises of graph-style cycles (Figure 3) that must be mitigated. Additionally, we may encounter redundancies in this "chaining" process (Figure 5) where aggregators down the request chain are configured to query identical, previously queried archives with the same parameters.…”
Section: Abstractions From Other Domains
Citation type: mentioning
confidence: 82%
“…For MemGator, however, the set of endpoints is user-configurable, and thus this valid scenario may arise and has implications. The merits of "aggregator chaining" were discussed in the seminal work introducing the concept [16], but did not go into detail or highlight some problems that may occur. We reiterate and address these in Section 6.…”
Section: Aggregator Chaining (S2)
Citation type: mentioning
confidence: 99%
“…If users want to run their own aggregator, they can install MemGator, developed by Old Dominion University. Research by Kelly et al extends aggregators to query both public and private web archives (Kelly et al, 2018), allowing users to seamlessly transition between archives at different levels of access. Without the standardised interfaces offered by the Memento Protocol, such aggregators would need to apply an assortment of hacks specific to each web archive, like those mentioned in Section 1.1.…”
Section: Memento-compliant Infrastructure and Standardised Access
Citation type: mentioning
confidence: 99%
“…Hallak estimated recently that almost two thirds of the web traffic is not publicly archivable because it goes to sites that are behind session walls or paywalls, to which some social media sites are big contributors of [22]. Kelly et al developed a framework to archive the private web and integrate it with the public web to fill some of these cavities [23], [24]. These works identify Archival Voids as they show some biases in web archiving as well as quantify the small portion of the web many archives hold.…”
Section: Related Work
Citation type: mentioning
confidence: 99%