In the digital age, political science is faced with a shift of election campaigns and political discourse to digital or virtual arenas. Because the internet is a highly volatile medium and online content can become inaccessible after the campaign season, new challenges for research arise, as does the need to preserve online content. Moreover, the sheer volume of data researchers have to deal with has reached levels that strain traditional methods. This paper puts forth a web harvesting workflow with a strong focus on the granular extraction of unstructured information (publication dates) for automated analysis. As our approach is methodological, we point out the benefits that researchers in political science may draw from adapting our methodology. We demonstrate this by analysing an event-based web crawl of German parties participating in the campaign for the 2019 European Parliament election. We employ distant reading methods to generate topic models, which are subsequently evaluated by hermeneutic analysis of a subset of the data.