A significant amount of information about drug-related safety issues such as adverse effects is published in medical case reports that can only be explored by human readers due to their unstructured nature. The work presented here aims at generating a systematically annotated corpus that can support the development and validation of methods for the automatic extraction of drug-related adverse effects from medical case reports. The documents are systematically double-annotated in several rounds to ensure consistent annotations. The annotated documents are finally harmonized to generate representative consensus annotations. To demonstrate an example use case, the corpus was employed to train and validate models for the classification of informative versus non-informative sentences. A Maximum Entropy classifier trained with simple features and evaluated by 10-fold cross-validation achieved an F₁ score of 0.70, indicating a potentially useful application of the corpus.
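As a rough illustration of the evaluation described in this abstract, the following sketch computes an F₁ score for the positive class and builds 10-fold cross-validation splits. It is not the authors' code: the function names `f1_score` and `k_fold_indices`, the 0/1 label encoding (1 = informative sentence) and the round-robin fold assignment are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def f1_score(gold: Sequence[int], pred: Sequence[int]) -> float:
    """F1 for the positive class (assumed encoding: 1 = informative sentence)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are both zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n_items: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_items) into k (train, test) index pairs, round-robin."""
    folds = [list(range(i, n_items, k)) for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [j for f_idx, fold in enumerate(folds) if f_idx != i for j in fold]
        splits.append((train, test))
    return splits


if __name__ == "__main__":
    # Toy labels only; these are not data from the corpus.
    print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
    print(len(k_fold_indices(20, 10)))
```

In a full evaluation, a classifier would be trained on each `train` split and scored on the matching `test` split, and the per-fold F₁ values would then be averaged.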
With this study, we provide a comprehensive reference dataset of detailed miRNA expression profiles from seven types of human peripheral blood cells (NK cells, B lymphocytes, cytotoxic T lymphocytes, T helper cells, monocytes, neutrophils and erythrocytes), serum, exosomes and whole blood. The peripheral blood cells from buffy coats were typed and sorted using FACS/MACS. The overall dataset was generated from 450 small RNA libraries using high-throughput sequencing. By employing a comprehensive bioinformatics and statistical analysis, we show that 3′ trimming modifications as well as the composition of 3′ added non-templated nucleotides are distributed in a lineage-specific manner: the closer the hematopoietic progenitors are, the higher their similarities in sequence variation of the 3′ end. Furthermore, we define the blood cell-specific miRNA and isomiR expression patterns and identify novel cell-type-specific miRNA candidates. The study provides the most comprehensive contribution to date towards a complete miRNA catalogue of human peripheral blood, which can be used as a reference for future studies. The dataset has been deposited in GEO and can also be explored interactively at http://134.245.63.235/ikmb-tools/bloodmiRs.
Motivation: Chemical compounds like small signal molecules or other biologically active chemical substances are an important entity class in life science publications and patents. Several representations and nomenclatures for chemicals exist, such as SMILES, InChI, IUPAC or trivial names. Only SMILES and InChI names allow a direct structure search, but in biomedical texts trivial names and IUPAC-like names are used more frequently. While trivial names can be found with a dictionary-based approach and thereby mapped to their corresponding structures, it is not possible to enumerate all IUPAC names. In this work, we present a new machine learning approach based on conditional random fields (CRF) to find mentions of IUPAC and IUPAC-like names in scientific text, as well as its evaluation and the conversion rate with available name-to-structure tools. Results: We present an IUPAC name recognizer with an F1 measure of 85.6% on a MEDLINE corpus. The evaluation of different CRF orders and offset conjunction orders demonstrates the importance of these parameters. An evaluation of hand-selected patent sections containing large enumerations and terms with mixed nomenclature shows good performance on these cases (F1 measure 81.5%). The remaining recognition problem is detecting the correct borders of the typically long terms, especially when they occur in parentheses or enumerations. We demonstrate the scalability of our implementation by providing results from a full MEDLINE run. Availability: We plan to publish the corpora, annotation guidelines as well as the conditional random field model as a UIMA component. Contact: roman.klinger@scai.fraunhofer.de
Pathway-centric approaches are widely used to interpret and contextualize -omics data. However, databases contain different representations of the same biological pathway, which may lead to different results of statistical enrichment analysis and predictive models in the context of precision medicine. We have performed an in-depth benchmarking of the impact of pathway database choice on statistical enrichment analysis and predictive modeling. We analyzed five cancer datasets using three major pathway databases and developed an approach to merge several databases into a single integrative one: MPath. Our results show that equivalent pathways from different databases yield disparate results in statistical enrichment analysis. Moreover, we observed a significant dataset-dependent impact on the performance of machine learning models on different prediction tasks. In some cases, MPath significantly improved prediction performance and also reduced the variance of prediction performances. Furthermore, MPath yielded more consistent and biologically plausible results in statistical enrichment analyses. In summary, this benchmarking study demonstrates that pathway database choice can influence the results of statistical enrichment analysis and predictive modeling. Therefore, we recommend the use of multiple pathway databases or integrative ones.
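To make concrete why the choice of pathway definition can change an enrichment result, here is a minimal sketch of an over-representation test based on the hypergeometric distribution. This is one common form of statistical enrichment analysis and is not necessarily the method used in the study. The function name `enrichment_p_value` and all numbers in the usage section are hypothetical.

```python
from math import comb


def enrichment_p_value(universe: int, pathway: int, selected: int, overlap: int) -> float:
    """Upper-tail hypergeometric probability P(X >= overlap).

    universe: number of genes in the background set
    pathway:  number of background genes annotated to the pathway
    selected: number of genes in the list of interest (e.g. differentially expressed)
    overlap:  number of selected genes that belong to the pathway
    """
    upper = min(pathway, selected)
    total = comb(universe, selected)
    tail = sum(
        comb(pathway, k) * comb(universe - pathway, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: the "same" pathway defined with 50 genes in one
    # database and 100 genes in another, with an identical observed overlap.
    p_small = enrichment_p_value(universe=1000, pathway=50, selected=100, overlap=10)
    p_large = enrichment_p_value(universe=1000, pathway=100, selected=100, overlap=10)
    print(p_small, p_large)
```

With the same gene list and the same overlap, the smaller pathway definition gives a lower p-value than the larger one, which is the kind of database-dependent disparity the abstract describes.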
Development of ADO as an open ADO is a first attempt to organize information related to Alzheimer's disease in a formalized, structured manner. We demonstrate that ADO is able to capture both established and scattered knowledge existing in scientific text.
An increasing interest in a healthy lifestyle raises questions about optimal body weight. Evidently, a clear distinction should be made between the standardised “normal” body weight and the individually optimal weight. To this end, the basic principle of personalised medicine, “one size does not fit all”, has to be applied. In this context, a “normal” but, for example, borderline body mass index might be optimal for one person but suboptimal for another, depending strongly on the individual genetic predisposition, geographic origin, cultural and nutritional habits and relevant lifestyle parameters, all of which are included in a comprehensive individual patient profile. Even if only slightly deviant, both overweight and underweight are acknowledged risk factors for a shifted metabolism which, if not optimised, may strongly contribute to the development and progression of severe pathologies. The development of innovative screening programmes is essential to promote population health through health risk assessment, individualised patient profiling and multi-parametric analysis, further used for cost-effective targeted prevention and treatments tailored to the person. The following healthcare areas are considered likely to benefit strongly from the measures proposed above: suboptimal health conditions, sports medicine, stress overload and associated complications, planned pregnancies, periodontal health and dentistry, sleep medicine, eye health and disorders, inflammatory disorders, healing and pain management, metabolic disorders, cardiovascular disease, cancers, psychiatric and neurologic disorders, stroke of known and unknown aetiology, and improved individual and population outcomes under pandemic conditions such as COVID-19. In the long term, a significantly improved healthcare economy is one of the benefits of the proposed paradigm shift from reactive to Predictive, Preventive and Personalised Medicine (PPPM/3PM).
A tight collaboration between all stakeholders, including the scientific community, healthcare givers, patient organisations, policy-makers and educators, is essential for the smooth implementation of 3PM concepts in daily practice.
Decades of costly failures in translating drug candidates from preclinical disease models to human therapeutic use warrant reconsideration of the priority placed on animal models in biomedical research. Following an international workshop attended by experts from academia, government institutions, research funding bodies, and the corporate and non-governmental organisation (NGO) sectors, in this consensus report, we analyse, as case studies, five disease areas with major unmet needs for new treatments. In view of the scientifically driven transition towards a human pathways-based paradigm in toxicology, a similar paradigm shift appears to be justified in biomedical research. There is a pressing need for an approach that strategically implements advanced, human biology-based models and tools to understand disease pathways at multiple biological scales. We present recommendations to help achieve this.
Heme is an iron ion-containing molecule found within hemoproteins such as hemoglobin and cytochromes that participates in diverse biological processes. Although excessive heme has been implicated in several diseases including malaria, sepsis, ischemia-reperfusion, and disseminated intravascular coagulation, little is known about its regulatory and signaling functions. Furthermore, the limited understanding of heme's role in regulatory and signaling functions is in part due to the lack of curated pathway resources for heme cell biology. Here, we present two resources aimed at exploiting this unexplored information to model heme biology. The first resource is a terminology covering heme-specific terms not yet included in standard controlled vocabularies. Using this terminology, we curated and modeled the second resource, a mechanistic knowledge graph representing the heme interactome based on a corpus of 46 scientific articles. Finally, we demonstrated the utility of these resources by investigating the role of heme in the Toll-like receptor signaling pathway. Our analysis proposed a series of crosstalk events that could explain the role of heme in activating the TLR4 signaling pathway. In summary, the presented work opens the door for the scientific community to explore the published knowledge on heme biology.