Web archiving initiatives around the world capture ephemeral web content to preserve our collective digital memory. In this paper, we describe initial experiences in providing an exploratory search interface to web archives for humanities scholars and social scientists. We describe our initial implementation and discuss our findings in terms of desiderata for such a system. It is clear that the standard organization of a search engine results page (SERP), consisting of an ordered list of hits, is inadequate to support the needs of scholars. Shneiderman's mantra for visual information seeking ("overview first, zoom and filter, then details-on-demand") provides a nice organizing principle for interface design, to which we propose an addendum: "Make everything transparent". We elaborate on this by highlighting the importance of the temporal dimension of web pages as well as issues surrounding metadata and veracity.
The Archives Unleashed project aims to improve scholarly access to web archives through a multi-pronged strategy involving tool creation, process modeling, and community building, all proceeding concurrently in mutually reinforcing efforts. As we near the end of our initially conceived three-year project, we report on our progress and share lessons learned along the way. The main contribution articulated in this paper is a process model that decomposes scholarly inquiries into four main activities: filter, extract, aggregate, and visualize. Based on the insight that these activities can be disaggregated across time, space, and tools, it is possible to generate "derivative products", using our Archives Unleashed Toolkit, that serve as useful starting points for scholarly inquiry. Scholars can download these products from the Archives Unleashed Cloud and manipulate them just like any other dataset, thus providing access to web archives without requiring any specialized knowledge. Over the past few years, our platform has processed over a thousand different collections from over two hundred users, totaling around 300 terabytes of web archives.
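The four activities named in the abstract above (filter, extract, aggregate, visualize) can be illustrated with a minimal sketch in plain Python. This is not the Archives Unleashed Toolkit's API: the record fields (`url`, `crawl_date`, `mime`), the function names, and the toy data are all hypothetical, chosen only to show how each stage hands a smaller, more structured product to the next.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical toy records standing in for captures in a web archive.
records = [
    {"url": "http://example.org/a", "crawl_date": "20150901", "mime": "text/html"},
    {"url": "http://example.org/b", "crawl_date": "20151002", "mime": "text/html"},
    {"url": "http://example.com/c", "crawl_date": "20151015", "mime": "text/html"},
    {"url": "http://example.org/logo.png", "crawl_date": "20151015", "mime": "image/png"},
    {"url": "http://example.net/d", "crawl_date": "20140310", "mime": "text/html"},
]


def filter_records(recs, year):
    """Filter: keep only HTML captures crawled in the given year."""
    return [
        r for r in recs
        if r["mime"] == "text/html" and r["crawl_date"].startswith(year)
    ]


def extract_domains(recs):
    """Extract: pull the host name out of each capture's URL."""
    return [urlparse(r["url"]).netloc for r in recs]


def aggregate(domains):
    """Aggregate: count captures per domain."""
    return Counter(domains)


def visualize(counts):
    """Visualize: render the counts as a plain-text bar chart."""
    return "\n".join(
        f"{domain:<15} {'#' * n} ({n})" for domain, n in counts.most_common()
    )


# Each stage's output is an ordinary Python value, so it could be saved
# and picked up later by a different person or tool.
counts = aggregate(extract_domains(filter_records(records, "2015")))
chart = visualize(counts)
print(chart)
```

Because every intermediate result is an ordinary list or counter, any stage's output could be written to disk and resumed elsewhere, which is the sense in which the activities can be separated across time, space, and tools.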
Background Snowboarding is a popular and risky winter sport. Snowboarders perform tricks on man-made features in terrain parks, which may introduce additional risk. Objective To determine snowboard terrain park feature-specific injury rates and risk factors. Design Case-control study with exposure estimation. Setting A terrain park at a resort in Alberta, Canada, used for recreational and competitive snowboarding. Participants Cases were snowboarders injured in the terrain park who presented to the ski patrol or local emergency department (ED) (n=334). Controls were non-injured snowboarders using the terrain park (n=1262). The number of snowboarder-runs in the terrain park was recorded. Participants were recruited for two winter seasons. Assessment of risk factors Cases were identified from resort patient care records (PCRs) and ED logs. The PCRs captured demographic and environmental risk factors and injury assessment. Injured snowboarders were telephoned to determine exposure (feature used), whether they were listening to music, and drug/alcohol use. Randomly selected controls were interviewed. Main outcome measurements Overall and feature-specific injury rates (per 1000 runs) were calculated. Cases and controls were compared for risk factor prevalence using multiple logistic regressions to estimate adjusted OR (aOR) and 95% CI. Results The overall injury rate was 0.75 injuries/1000 runs. Injury rates were highest on jumps (2.56/1000 runs), the halfpipe (2.56/1000 runs) and kickers (0.61/1000 runs). Compared with rails, the adjusted odds of injury were significantly higher on the halfpipe (aOR=9.6; 95% CI 4.8 to 19.3), jumps (aOR=4.3; 95% CI 2.7 to 6.8), mushroom (aOR=2.3; 95% CI 1.1 to 4.4) and kickers (aOR=2.0; 95% CI 1.3 to 3.1). The odds of severe injury (presenting to the ED) versus minor injury did not differ by feature. Conclusions The injury rates and odds of injury were highest on features that facilitate aerial manoeuvres. Resorts may consider marking all features to indicate difficulty and associated injury risk.
The tangible development of tools and platforms that meet demonstrated needs (i.e., better support for scholarly inquiry);
A better understanding of the processes by which scholars, curators, and others work with these materials, providing a reference workflow with which to evaluate future research tools;
The building of a community, in part supported by the continued use of datathon communication channels and standing infrastructure, as well as encouragement to attend follow-up events.
We've now run SEVEN datathons! (Exhausting but fun.) So why datathons?
Any preservation effort must begin with an assessment of what content to preserve, and web archiving is no different. There have historically been two answers to the question "what should we archive?" The Internet Archive's broad entire-web crawls have been supplemented by narrower domain- or topic-specific collections gathered by numerous libraries. We can characterize this as content selection and curation by "gatekeepers". In contrast, we have witnessed the emergence of another approach driven by "the masses": we can archive pages that are contained in social media streams such as Twitter. The interesting question, of course, is how these approaches differ. We provide an answer to this question in the context of a case study about the 2015 Canadian federal elections. Based on our analysis, we recommend a hybrid approach that combines an effort driven by social media and more traditional curatorial methods.