A prototype application for machine-readable literature is investigated. The program, called pyDataRecognition, serves as an example of a data-driven literature search, where the search query is an experimental data set provided by the user. The user uploads a powder pattern together with the radiation wavelength. The program compares the user data to a database of existing powder patterns associated with published papers and produces a ranking of the database entries ordered by similarity score. The program returns the digital object identifier (DOI) and full reference of the top-ranked papers together with a stack plot of the user data alongside the top five database entries. The paper describes the approach and explores successes and challenges.
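The workflow summarized above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pyDataRecognition implementation: the function names, the dictionary layout of the database entries, and the choice of the Pearson correlation coefficient as the similarity score are all hypothetical choices made for this sketch. The wavelength is used to convert the scattering angle 2θ into the momentum transfer Q = 4π sin θ / λ, so that patterns measured at different wavelengths can be compared on a common axis.

```python
import numpy as np


def two_theta_to_q(two_theta_deg, wavelength):
    """Convert 2-theta (degrees) to Q = 4*pi*sin(theta)/lambda.

    Q comes out in inverse units of `wavelength` (e.g. 1/angstrom).
    """
    theta = np.radians(np.asarray(two_theta_deg, dtype=float)) / 2.0
    return 4.0 * np.pi * np.sin(theta) / wavelength


def similarity(q_user, i_user, q_ref, i_ref, n_points=500):
    """Pearson correlation of two patterns over their shared Q range.

    Assumption: Pearson correlation stands in for the similarity score;
    the source text does not specify which score is used.
    """
    q_lo = max(q_user.min(), q_ref.min())
    q_hi = min(q_user.max(), q_ref.max())
    if q_hi <= q_lo:
        return -1.0  # no overlapping Q range: assign the lowest score
    # Resample both patterns onto one common, evenly spaced Q grid.
    grid = np.linspace(q_lo, q_hi, n_points)
    a = np.interp(grid, q_user, i_user)
    b = np.interp(grid, q_ref, i_ref)
    return float(np.corrcoef(a, b)[0, 1])


def rank_entries(q_user, i_user, database, top_n=5):
    """Return the top_n (score, doi) pairs, best match first.

    `database` is a hypothetical list of dicts with the keys
    "doi", "q" and "intensity".
    """
    scored = [
        (similarity(q_user, i_user, entry["q"], entry["intensity"]), entry["doi"])
        for entry in database
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]
```

In this sketch a user pattern is first passed through `two_theta_to_q` with its wavelength, then handed to `rank_entries` together with the database. Resampling onto a shared Q grid is what makes patterns collected with different radiation directly comparable.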
Introduction

The activity of communicating science, including paper writing, always includes a search of the literature to discover and acknowledge prior work (Garfield, 1996). Since the advent of the internet, this process has largely moved from manual, library-based searches to online searches using search engines (Butler, 2000). Literature search engines such as Google Scholar (Van Noorden, 2014) normally work by accepting text and metadata search queries, such as author names, keywords, journal name and year. In contrast, here we explore the concept of a data-seeded literature search, where we use a measured data set as the search query to retrieve data-relevant papers from the literature. We chose X-ray powder diffraction data for our test case.

X-ray powder diffraction is an important technique in materials science, where structural characterization is at the very center of the workflow because structure is inherently linked to material properties. The goal of the technique is to understand the arrangement of atoms in the material based on measurements of X-ray (or neutron or electron) diffrac-