Named entity processing over historical texts is increasingly used, owing to the large volume of documents and archives stored in digital libraries. However, because annotated historical resources are scarce, information extraction performance lags behind that achieved on contemporary texts. In this paper, we present the NewsEye resource, a multilingual dataset for named entity recognition and linking, enriched with stances towards named entities. The dataset comprises diachronic historical newspaper material published between 1850 and 1950 in French, German, Finnish, and Swedish. Such a historical resource is essential for developing and evaluating named entity processing systems. It also allows the performance of existing approaches on historical documents to be improved, which in turn enables adequate and efficient semantic indexing of historical documents in digital cultural heritage collections.
CCS CONCEPTS • Information systems → Information retrieval; Digital libraries and archives; • General and reference → Cross-computing tools and techniques.
Today, the filesystems of large companies are both huge and distributed across the world. They contain huge sets of metadata but are not optimized for analyzing them. In contrast, if metadata is stored in a database system and updated synchronously, it can be analyzed and processed far more easily. Even adding new attributes that the underlying filesystem does not natively support then becomes straightforward. Thus, synchronous metadata storage in a database system can help manage and administer huge filesystems efficiently, but it must not slow down the filesystem significantly. The aim of this paper is to describe possible solutions for synchronous metadata storage, inspect what such an integration of filesystem and database system might look like, and evaluate its performance.
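To make the idea of synchronous metadata storage concrete, the following is a minimal sketch, not the integration evaluated in the paper. It assumes an application-level wrapper around file writes and uses SQLite as a stand-in database. The table name `file_metadata`, the functions `open_metadata_db`, `write_file_synchronously`, and `total_size_by_tag`, and the extra attribute `owner_tag` are all hypothetical names introduced for illustration. The attribute `owner_tag` stands for a property the filesystem itself does not store.

```python
import os
import sqlite3


def open_metadata_db(db_path=":memory:"):
    """Open (or create) the metadata database; schema is an assumption."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_metadata ("
        " path TEXT PRIMARY KEY,"
        " size_bytes INTEGER NOT NULL,"
        " mtime REAL NOT NULL,"
        " owner_tag TEXT)"  # hypothetical attribute not known to the filesystem
    )
    return conn


def write_file_synchronously(conn, path, data, owner_tag=None):
    """Write the file, then record its metadata before returning.

    The call only returns once both the file and its metadata row are
    written, which is what "synchronous" means in this sketch.
    """
    with open(path, "wb") as f:
        f.write(data)
    st = os.stat(path)
    with conn:  # commits on success, rolls back if the insert raises
        conn.execute(
            "INSERT OR REPLACE INTO file_metadata"
            " (path, size_bytes, mtime, owner_tag) VALUES (?, ?, ?, ?)",
            (path, st.st_size, st.st_mtime, owner_tag),
        )


def total_size_by_tag(conn):
    """Example analysis query: total bytes stored per owner tag."""
    rows = conn.execute(
        "SELECT owner_tag, SUM(size_bytes) FROM file_metadata"
        " GROUP BY owner_tag"
    ).fetchall()
    return dict(rows)
```

With the metadata in a table, an aggregate such as the total size per tag becomes a single query rather than a traversal of the directory tree. The cost is the extra database write on every file operation, which is the overhead the abstract says must remain small.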