2021
DOI: 10.3390/info12090348

Impresso Inspect and Compare. Visual Comparison of Semantically Enriched Historical Newspaper Articles

Abstract: The automated enrichment of mass-digitised document collections using techniques such as text mining is becoming increasingly popular. Enriched collections offer new opportunities for interface design to allow data-driven and visualisation-based search, exploration and interpretation. Most such interfaces integrate close and distant reading and represent semantic, spatial, social or temporal relations, but often lack contrastive views. Inspect and Compare (I&C) contributes to the current state of the art i…

Cited by 6 publications (4 citation statements)
References 34 publications
“…At this stage of development, six evaluators found them "either difficult to read or [they] did not provide useful insights." In addition, recommendations for future development addressed the already foreseen integration of impresso's Inspect & Compare component (Düring et al, 2021) for side-by-side comparisons of article sets, higher speed for the creation of collections, API access to the data, and new filters based on a yet to be created taxonomy of text reuse types.…”
Section: Discussion Of Evaluation Results (mentioning)
confidence: 99%
“…Compared to standard OCR results, these models achieve good layout segmentation, but they lack the article-level information that is required to improve searchability in historical collections. Just as text data can be classified according to its characteristics, or content, illustrations can also be classified according to their context [58], location, and features such as color or shape [59], enabling the evaluation of visual content.…”
Section: Digitization and Extraction (mentioning)
confidence: 99%
“…This has been further emphasized by the global COVID-19 pandemic (Samaroudi et al, 2020; Sułkowski, 2020). In particular, the digitization of large-scale textual collections, such as historical newspapers, has sparked much interest from the digital humanities community (Allen, 2015; Düring et al, 2021; Oberbichler et al, 2022). Some of the remarkable initiatives for digitization of newspaper collections, their conservation in digital format and access provision using digital platforms has been undertaken by Google Newspaper Search (Chaudhury et al, 2009), Europeana (Pekárek and Willems, 2012; Willems and Atanassova, 2015), Bibliothèque nationale du Luxembourg (Zaagsma, 2019), KB Lab of Research department of the Koninklijke Bibliotheek, National Library of the Netherlands (Smits and Faber, 2018; Wevers and Lonij, 2017), Bibliothèque nationale de France, National Library of France (Moreux 2017), Library of Congress - Chronicling America (Lee et al, 2020), Biblioteca Digitale Italiana - BDI, an Italian digital library promoted by the Ministry for Cultural Heritage and Activities (Leombroni, 2004; Paoli, 2005), Australian Newspaper Digitization program (Holley, 2009), British Library (Hiltunen, 2021) and The National Library of Sweden (Nilsson, 2012).…”
Section: Introduction (mentioning)
confidence: 99%