While there exists an abundance of open biomedical data, the lack of high-quality metadata makes it challenging for others to find relevant datasets and to reuse them for another purpose. In particular, metadata are useful for understanding the nature and provenance of the data. A common approach to improving the quality of metadata relies on expensive human curation, which is time-consuming and prone to error. Towards improving the quality of metadata, we use scientific publications to automatically predict metadata key:value pairs. For prediction, we use a Convolutional Neural Network (CNN) and a Bidirectional Long Short-Term Memory network (BiLSTM). We focus our attention on the NCBI Disease Corpus, which is used for training the CNN and BiLSTM. We perform two kinds of experiments with these two architectures: (1) we predict disease names by their unique IDs in the MeSH ontology, and (2) we use the tree structure of the MeSH ontology to move up the hierarchy of these disease terms, which reduces the number of labels. We also apply various multi-label classification techniques in both experiments. We find that in both cases the CNN achieves the best results, predicting the superclasses for diseases with an accuracy of 83%.
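The second experiment relies on the fact that MeSH tree numbers encode the hierarchy as dot-separated segments, so a fine-grained disease label can be collapsed to a superclass by truncating its tree number. A minimal sketch of that label-reduction step (the example annotations and tree numbers are illustrative, not drawn from the NCBI Disease Corpus):

```python
# Illustrative sketch: collapsing MeSH disease labels to superclasses by
# truncating tree numbers. A tree number like "C04.557.386" sits under
# "C04" (Neoplasms), so dropping trailing segments moves the label up
# the hierarchy and shrinks the label set for classification.

def to_superclass(tree_number: str, depth: int = 1) -> str:
    """Keep only the first `depth` segments of a MeSH tree number."""
    return ".".join(tree_number.split(".")[:depth])

# Hypothetical (mention, tree number) annotations for illustration only.
annotations = [
    ("Hodgkin disease", "C04.557.386"),
    ("breast carcinoma", "C04.557.470"),
    ("type 2 diabetes", "C18.452.394"),
]

fine_labels = {t for _, t in annotations}                    # 3 labels
coarse_labels = {to_superclass(t) for _, t in annotations}   # 2 labels

print(sorted(coarse_labels))  # ['C04', 'C18']
```

Truncating to depth 1 maps every disease mention onto a top-level MeSH category, which is what makes the reduced-label experiment tractable.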
The scientific community has increased its efforts towards the application and assessment of the FAIR principles on Digital Objects (DOs) such as publications, datasets, and research software. Consequently, openly available automated FAIR assessment services, such as FAIR Enough, the FAIR Evaluator, and FAIRsFAIR's F-UJI, have been working towards standardization. Digital Competence Centers such as university libraries have been paramount in this process, facilitating a range of activities such as awareness campaigns, trainings, and systematic support. In practice, however, using the FAIR assessment tools is still an intricate process for the average researcher. It involves a steep learning curve and a series of manual steps requiring specific knowledge of the frameworks, disengaging some researchers in the process. We aim to use technology to close this gap and make this process more accessible by bringing the FAIR assessment to researchers' profiles. We will develop "The FAIR extension", an open-source, user-friendly web browser extension that allows researchers to perform FAIR assessments directly at the web source. Web browser extensions have been an accessible digital tool for libraries supporting scholarship (De Sarkar 2015). A notable example is the lightweight version of reference managers deployed as a browser service (Ferguson 2019). Moreover, extensions have been demonstrated to be a vehicle for open access, as with the Lean Library Browser Extension. The FAIR extension is a service that builds on top of community-accepted FAIR evaluator APIs; it does not intend to create yet another FAIR assessment framework from scratch. The objective of the FAIR Digital Objects Framework (FDOF) is for objects published in a digital environment to comply with a set of requirements, such as identifiability and the use of a rich metadata record (Santos 2021, Schultes and Wittenburg 2019).
The FAIR extension will connect via REST-like operations to individual FAIR metrics test endpoints, following Wilkinson et al. (2018) and Wilkinson et al. (2019), and ultimately display the FAIR metrics on the client side (Fig. 1). Users will thus obtain FAIR scores for articles, datasets, and other DOs in real time on a web source, such as a scholarly platform or DO repository, with the possibility of creating simple reports of the assessment. We acknowledge that the development of web-based tools carries some constraints regarding platform version releases, e.g. the Chromium Development Calendar. Nevertheless, we are optimistic about the potential use cases: a student who wants to use a DO (e.g. a software package) but does not know which to choose, for whom the FAIR extension will indicate which one is more FAIR and aid the decision-making process; a data steward recommending sources; a researcher who wants to display all FAIR metrics of her DOs on a research profile; and a PI who wants to evaluate an aggregated metric for a project. These use cases can be the means to bring the open-source community and FAIR DO interest groups to work together.
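Since the per-metric test endpoints each return a structured result, the client side can combine them into the aggregated metric the PI use case calls for. A minimal sketch of that aggregation step, assuming a hypothetical response shape (the "metric" and "score" field names are illustrative, not the actual schema of any FAIR evaluator API):

```python
# Hypothetical sketch of client-side aggregation of FAIR metric test
# results. Each dict stands for the parsed JSON of one metric test
# endpoint response; field names are assumptions for illustration.

def aggregate(results: list[dict]) -> float:
    """Return the fraction of metric tests passed (1.0 = pass, 0.0 = fail)."""
    if not results:
        return 0.0
    return sum(r["score"] for r in results) / len(results)

# Example: results collected from three individual metric test endpoints.
results = [
    {"metric": "F1.1 Identifier uniqueness", "score": 1.0},
    {"metric": "A1.1 Protocol is open",      "score": 1.0},
    {"metric": "R1.1 License is present",    "score": 0.0},
]

print(f"FAIR score: {aggregate(results):.0%}")  # FAIR score: 67%
```

In the extension, such a summary value could be rendered next to the DO on the page, while the per-metric breakdown feeds the simple reports mentioned above.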