It is The National Archives' responsibility to collect and secure the future of the public record in all its forms and to make it as accessible as possible. The UK Government Web Archive (UKGWA) effectively preserves the open digital record. This article will explore the challenges encountered, and the Application Programming Interface (API)-based solutions developed, by The National Archives and the Internet Memory Foundation (IMF) in the completion of a pilot project to capture the record as it is published on the social media services Twitter and YouTube. An outline of the wider web archiving programme and its role within the management of the government web estate is provided. The legislative framework that guides web archiving at The National Archives is described, as it has necessarily influenced the policy decisions that shaped the solutions developed. A brief overview of some comparative approaches taken by other organizations and commercial services to capturing Twitter content is also presented as context to the policy and technical solutions arrived at by the authors. The National Archives has sought to develop the building blocks of a collection whose growth can be sustained over time. The publication of this part of the archive will be followed by further evaluation and improvements to the initial approach taken.
Although there is a great variety of web archiving projects around the world, there are not many that focus explicitly on websites of broadcasters. The reason is that funds are often lacking to do this, and that broadcaster websites are difficult to archive due to their dynamic and audiovisual content. The Netherlands Institute for Sound and Vision, with its collection of over 800,000 hours of audiovisual content, has been involved in a small-scale research project related to web archiving since 2008. When Sound and Vision was approached by Dutch public broadcaster NTR to archive four of its websites, it was decided to start a collaborative pilot project that focused both on learning more about archiving broadcaster websites and on developing a clean and modern public access interface. The main lesson learned from this pilot is that to archive highly dynamic and AV-heavy broadcaster websites it is vital to use supplementary capture tools and manual archiving of this ‘difficult’ content. Furthermore, since the focus of web archiving projects is usually not on a good-looking front-end, the wheel had to be partly re-invented by involving various stakeholders and determining the most important requirements. The first version of the web archive was evaluated by various prospective target users. This evaluation revealed that the participants indeed appreciated the look and speed of the web archive, and that users needed to be made more aware of the web archive's purpose and limitations. The work will be continued and scaled up by archiving more broadcaster websites, continuing the research on how best to capture and make accessible dynamic and AV content, and creating standard practices for making the web archive publicly available.