The increased number of data repositories has greatly increased the availability of open data. To enable broad discovery and access to research dataset, some data repositories have begun leveraging data discovery services from commercial search engines by embedding structured metadata markup in dataset web landing pages using vocabularies from Schema.org and extensions. This paper aims to examine metadata interoperability for supporting global data discovery. Specifically, the paper reports a survey on which metadata schema has been adopted by participating data repositories, and presents an analysis of crosswalks from fourteen research data schemas to Schema.org. The analysis indicates most descriptive metadata are interoperable among the schemas, the most inconsistent mapping is the rights metadata, and a large gap exists in the structural metadata and controlled vocabularies to specify various property values. The analysis and collated crosswalks can serve as a reference for data repositories when they develop crosswalks from their own schemasto Schema.org, and provide the research data community a benchmark of structured metadata implementation.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.