Abstract. RDF dataset quality assessment is currently performed primarily after data is published. However, there is no systematic way to incorporate its results into the dataset, nor the assessment itself into the publishing workflow. Adjustments are applied manually, but rarely. Moreover, the root cause of the violations, which often derives from the mappings that specify how the RDF dataset is generated, is not identified. We suggest an incremental, iterative and uniform validation workflow for RDF datasets stemming originally from (semi-)structured data (e.g., CSV, XML, JSON). In this work, we focus on assessing and improving their mappings. We incorporate (i) a test-driven approach for assessing the mappings instead of the RDF dataset itself, as mappings reflect how the dataset will be formed when generated; and (ii) semi-automatic mapping refinements based on the results of the quality assessment. The proposed workflow is applied to diverse cases, e.g., large, crowdsourced datasets such as DBpedia, and newly generated ones, such as iLastic. Our evaluation indicates the efficiency of our workflow, as it significantly improves the overall quality of an RDF dataset in the observed cases.
Abstract. Effective, collaborative integration of software and big data engineering for Web-scale systems is now a crucial technical and economic challenge. It requires new combined data and software engineering processes and tools. Semantic metadata standards and Linked Data principles provide a technical grounding for such integrated systems, given an appropriate model of the domain. In this paper we introduce the ALIGNED suite of ontologies, specifically designed to model the information exchange needs of combined software and data engineering. These ontologies are deployed in web-scale, data-intensive system development environments in both the commercial and academic domains. We exemplify the usage of the suite on a complex collaborative software and data engineering scenario from the legal information system domain.
Abstract. Linked Data datasets use interlinks to connect semantically similar resources across datasets. As datasets evolve, a resource's locator can change, which can cause interlinks that contain old resource locators to no longer dereference and become invalid. Validating interlinks, by validating the resource locators within them, when a dataset has changed is important to ensure interlinks work as intended. In this paper we introduce the SPARQL Usage for Mapping Maintenance and Reuse (SUMMR) methodology. SUMMR is an approach for Mapping Maintenance and Reuse (MMR) that provides query templates, based on standard SPARQL queries, for MMR activities. This paper describes SUMMR and two experiments: a lab-based evaluation of SUMMR's mapping maintenance query templates and a deployment of SUMMR in the DBpedia v.2015-10 release to detect invalid interlinks. The lab-based evaluation involved detecting interlinks that had become invalid due to changes in resource locators, and repairing the invalid interlinks. The results show that the SUMMR templates and approach can be used to effectively detect and repair invalid interlinks. SUMMR's query template for discovering invalid interlinks was applied to the DBpedia v.2015-10 release, which discovered 53,418 invalid interlinks in that release.
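The detect-and-repair cycle the abstract describes can be sketched as follows (a hedged illustration, not SUMMR's actual query templates; all URIs, the in-memory interlink list, and the old-to-new locator map are hypothetical): an interlink is treated as invalid when its target locator no longer exists in the evolved target dataset, and repair rewrites it using known locator changes.

```python
# Illustrative sketch of interlink maintenance: detect interlinks whose
# target locator no longer resolves in the evolved target dataset, then
# repair them from a map of old to new locators.

OLD_TO_NEW = {  # known locator changes in the target dataset (hypothetical)
    "http://target.org/resource/Old_Name": "http://target.org/resource/New_Name",
}

interlinks = [  # (source resource, owl:sameAs target) pairs (hypothetical)
    ("http://source.org/r/1", "http://target.org/resource/Old_Name"),
    ("http://source.org/r/2", "http://target.org/resource/Stable"),
]

live_targets = {  # locators that currently dereference in the target dataset
    "http://target.org/resource/New_Name",
    "http://target.org/resource/Stable",
}

def detect_invalid(links, live):
    """An interlink is invalid if its target locator no longer exists."""
    return [(s, t) for s, t in links if t not in live]

def repair(links, old_to_new):
    """Rewrite invalid targets using the known locator changes."""
    return [(s, old_to_new.get(t, t)) for s, t in links]

invalid = detect_invalid(interlinks, live_targets)
repaired = repair(interlinks, OLD_TO_NEW)
```

In a real deployment these two steps would be expressed as SPARQL queries over the datasets' endpoints rather than over in-memory lists.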
The constantly growing number of Linked Open Data (LOD) datasets creates a need for rich metadata descriptions that enable users to discover, understand and process the available data. This metadata is often created, maintained and stored in diverse data repositories featuring disparate data models that are often unable to provide the metadata necessary to automatically process the datasets described. This paper proposes DataID, a best practice for LOD dataset descriptions that utilizes RDF files hosted together with the datasets, under the same domain. We describe the data model, which is based on the widely used DCAT and VoID vocabularies, as well as supporting tools to create and publish DataIDs, and use cases that show the benefits of providing semantically rich metadata for complex datasets. As a proof of concept, we generated a DataID for the DBpedia dataset, which we present in this paper.
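To make the DCAT/VoID grounding concrete, here is a minimal sketch of a dataset description of the kind DataID builds on (the specific property choices and the example dataset URI are assumptions for illustration, not the exact DataID model): a small Turtle document that could be hosted next to the dataset it describes.

```python
# Minimal illustrative DCAT/VoID dataset description, emitted as Turtle
# so it can be hosted under the same domain as the dataset itself.

PREFIXES = """\
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix void: <http://rdfs.org/ns/void#> .
"""

def describe_dataset(uri, title, triple_count, download_url):
    """Return a tiny DCAT/VoID description of one dataset."""
    return PREFIXES + f"""
<{uri}> a dcat:Dataset, void:Dataset ;
    dct:title "{title}" ;
    void:triples {triple_count} ;
    dcat:distribution [ a dcat:Distribution ;
        dcat:downloadURL <{download_url}> ] .
"""

doc = describe_dataset(
    "http://example.org/dataset/demo",   # hypothetical dataset URI
    "Demo Dataset",
    1200,
    "http://example.org/dataset/demo.ttl",
)
```

Richer descriptions would add agents, licenses and versioning information on top of this core, which is where DataID extends the plain DCAT/VoID vocabulary.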