Dimensions is a new scholarly search database that focuses on the broader set of use cases that academics now face. By including awarded grants, patents, and clinical trials alongside publication and Altmetric attention data, Dimensions goes beyond the standard publication-citation ecosystem to give the user a much fuller sense of the context of a piece of research. All entities in the graph may be linked to all other entities. Thus, a patent may be linked to a grant, if an appropriate reference is made. Books, book chapters, and conference proceedings are included in the publication index. All entities are treated as first-class objects and are mapped to a database of research institutions and a standard set of research classifications via machine-learning techniques. This article gives an overview of how the Dimensions dataset and user interface were constructed.
Until recently, comprehensive scientometrics data has been made available only in siloed, subscription-based tools that are inaccessible to researchers who lack institutional support and resources. As a result of limited data access, research evaluation practices have focused upon basic indicators that only take publications and their citation rates into account. This has blocked innovation on many fronts. Dimensions is a database that links and contextualizes different research information objects. It brings together data describing and linking awarded grants, clinical trials, patents, and policy documents, as well as altmetric information, alongside traditional publications and citations data. This article describes the approach that Digital Science is taking to support the scientometric community, together with the various Dimensions tools available to researchers who wish to use Dimensions data in their research at no cost.
Dimensions was built as a platform to allow stakeholders in the research community, including academic bibliometricians, to more easily create and understand the context of different types of research object through the linkages between these objects. Links between objects are created via persistent identifiers and machine learning techniques, while additional context is introduced via data enhancements such as per-object categorisations and person and institution disambiguation. While these features make analytical use cases accessible for end users, the COVID-19 crisis has highlighted a different need: analyzing trends in scholarship as they occur, or real-time bibliometrics. The combination of full-text search, daily data updates, a broad set of scholarly objects including pre-prints, and a wider set of data fields for analysis broadens the opportunities for a different style of analysis. A subset of these emerging capabilities is discussed, and three basic analyses are presented as illustrations of the potential for real-time bibliometrics.
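The identifier-based part of the linking described above can be illustrated with a minimal sketch. It assumes each record carries a list of the identifiers it references; the record layout, field names (`id`, `type`, `refs`), and identifier values are hypothetical and are not taken from the Dimensions data model. The machine-learning side of the linking is not covered here.

```python
from collections import defaultdict

# Hypothetical records: field names and identifiers are invented for illustration.
records = [
    {"id": "pub.1", "type": "publication", "refs": ["grant.7"]},
    {"id": "patent.3", "type": "patent", "refs": ["pub.1", "grant.7"]},
    {"id": "grant.7", "type": "grant", "refs": []},
    {"id": "trial.2", "type": "clinical_trial", "refs": ["pub.9"]},  # pub.9 is not indexed
]


def build_links(records):
    """Return an undirected adjacency map between records that reference each other."""
    known = {r["id"] for r in records}
    links = defaultdict(set)
    for r in records:
        for ref in r["refs"]:
            if ref in known:  # only identifiers that resolve to a record produce a link
                links[r["id"]].add(ref)
                links[ref].add(r["id"])
    return links


links = build_links(records)
print(sorted(links["grant.7"]))  # records linked to the grant, of any type
```

Because links are stored in both directions, a grant can be traversed to the patents and publications that cite it as easily as the reverse, which is one simple way to treat every object type as a first-class entity.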
With Dimensions, Digital Science offers the research community a new approach to research-related information, bringing together formerly siloed content types such as grants, patents, and clinical trials with publications and citations, and making them as openly available as possible (see app.dimensions.ai). Because of the different content types, (controversial) journal-based classifications were not an option, since they cannot be used to categorise grants and other content types. Digital Science therefore opted for a machine-learning categorisation approach based on the content of the documents and on well-established classification systems for which a training set was available. The implementation at launch was a first step and needs to be improved: although we observe a reliability comparable to manual coding for grants, it has some shortcomings, as observed by Bornmann (2018), mostly due to challenges with training set coverage. To overcome the shortcomings of the initial categorisation approach, we implemented an improvement process with the research community, and Lutz Bornmann’s analysis presented a great opportunity to provide more transparency and insight into the ongoing improvements.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.