Background: Although studies report that more than 90% of pregnant women use digital sources to supplement their maternal healthcare, little is known about the kinds of information that women seek from their peers during pregnancy. To date, most research has used self-report measures to elucidate how and why women turn to digital sources during pregnancy. However, because these measures may differ from actual use of online health information, it is important to analyze the online content that pregnant women generate. Objective: To apply machine learning methods to online pregnancy forums to better understand how women seek information from a community of online peers during pregnancy. Methods: Data from seven WhatToExpect.com "birth club" forums (September 2018; January-June 2018) were scraped. Forum posts were collected for a one-year period, which included three trimesters and three months postpartum. Only initial posts from each thread were analyzed (n = 262,238). Automatic natural language processing (NLP) methods captured 50 discussed topics, which were annotated by two independent coders and grouped categorically. Results: The largest topic categories were maternal health (45%), baby-related topics (29%), and people/relationships (10%). While pain was a popular topic throughout pregnancy, individual topics that were dominant by trimester included miscarriage (first trimester), labor (third trimester), and baby sleeping routine (postpartum period).
Background: Opioid use disorder (OUD) is underdiagnosed in health system settings, limiting research on OUD using electronic health records (EHRs). Medical encounter notes can enrich structured EHR data with documented signs and symptoms of OUD and social risks and behaviors. To capture this information at scale, natural language processing (NLP) tools must be developed and evaluated. We developed and applied an annotation schema to deeply characterize OUD and related clinical, behavioral, and environmental factors, and automated the annotation schema using machine learning and deep learning-based approaches. Methods: Using the MIMIC-III Critical Care Database, we queried hospital discharge summaries of patients with International Classification of Diseases (ICD-9) OUD diagnostic codes. We developed an annotation schema to characterize problematic opioid use, identify individuals with potential OUD, and provide psychosocial context. Two annotators reviewed discharge summaries from 100 patients. We randomly sampled patients with their associated annotated sentences and divided them into training (66 patients; 2,127 annotated sentences) and testing (29 patients; 1,149 annotated sentences) sets. We used the training set to generate features, employing three NLP algorithms/knowledge sources. We trained and tested prediction models for classification with a traditional machine learner (logistic regression) and a deep learning approach (Autogluon based on ELECTRA's replaced token detection model). We applied a five-fold cross-validation approach to reduce bias in performance estimates. Results: The resulting annotation schema contained 32 classes. We achieved moderate inter-annotator agreement, with F1-scores across all classes increasing from 48 to 66%. Five classes had a sufficient number of annotations for automation; of these, we observed consistently high performance (F1-scores) across training and testing sets for drug screening (training: 91–96; testing: 91–94) and opioid type (training: 86–96; testing: 86–99). Performance dropped from training to testing sets for other drug use (training: 52–65; testing: 40–48), pain management (training: 72–78; testing: 61–78) and psychiatric (training: 73–80; testing: 72). Autogluon achieved the highest performance. Conclusion: This pilot study demonstrated that rich information regarding problematic opioid use can be manually identified by annotators. However, more training samples and features would improve our ability to reliably identify less common classes from clinical text, including text from outpatient settings.
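The two evaluation ideas named in this abstract, per-class F1-scores and five-fold cross-validation, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `f1_score` and `patient_folds`, the class labels, and the choice to keep all sentences from one patient in the same fold are assumptions made for illustration only.

```python
import random


def f1_score(gold, pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    tp = sum(1 for g, p in zip(gold, pred) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(gold, pred) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def patient_folds(patient_ids, k=5, seed=0):
    """Split sentence indices into k folds, grouped by patient.

    Assumption (not stated in the abstract): sentences from the same
    patient are kept in one fold so that no patient appears in both the
    training and validation portion of a split.
    """
    unique = sorted(set(patient_ids))
    random.Random(seed).shuffle(unique)
    fold_of = {pid: i % k for i, pid in enumerate(unique)}
    folds = [[] for _ in range(k)]
    for idx, pid in enumerate(patient_ids):
        folds[fold_of[pid]].append(idx)
    return folds


# Hypothetical labels for four annotated sentences.
gold = ["opioid_type", "other", "opioid_type", "other"]
pred = ["opioid_type", "opioid_type", "other", "other"]
print(f1_score(gold, pred, positive="opioid_type"))

# Hypothetical patient IDs, two sentences per patient.
ids = [f"p{n}" for n in range(10) for _ in range(2)]
for i, fold in enumerate(patient_folds(ids)):
    print(i, fold)
```

In each round of cross-validation, one fold would serve as the validation set and the remaining four as training data; the reported score would then be summarized across the five rounds.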
One core measure of healthcare quality set forth by the Institute of Medicine is whether care decisions match patient goals. High-quality “serious illness communication” about patient goals and prognosis is required to support patient-centered decision-making; however, current methods are not sensitive enough to measure the quality of this communication or to determine whether the care delivered matches patient priorities. Natural language processing (NLP) offers an efficient method for identifying and evaluating documented serious illness communication, which could serve as the basis for future quality metrics in oncology and other forms of serious illness. In this study, we trained NLP algorithms to identify and characterize serious illness communication with oncology patients.
Neurological complications worsen outcomes in COVID-19. To define the prevalence of neurological conditions among hospitalized patients with a positive SARS-CoV-2 reverse transcription polymerase chain reaction test in geographically diverse multinational populations during the early pandemic, we used electronic health records (EHR) from 338 participating hospitals across 6 countries and 3 continents (January–September 2020) for a cross-sectional analysis. We assessed the frequency of International Classification of Diseases codes for neurological conditions by country, healthcare system, time before and after admission for COVID-19, and COVID-19 severity. Among 35,177 hospitalized patients with SARS-CoV-2 infection, there was an increase in the proportion with disorders of consciousness (5.8%, 95% confidence interval [CI] 3.7–7.8%, pFDR < 0.001) and unspecified disorders of the brain (8.1%, 5.7–10.5%, pFDR < 0.001) when compared to the pre-admission proportion. During hospitalization, the relative risks of disorders of consciousness (22%, 19–25%), cerebrovascular diseases (24%, 13–35%), nontraumatic intracranial hemorrhage (34%, 20–50%), encephalitis and/or myelitis (37%, 17–60%) and myopathy (72%, 67–77%) were higher for patients with severe COVID-19 when compared to those who never experienced severe COVID-19. Leveraging a multinational network to capture standardized EHR data, we highlighted the increased prevalence of central and peripheral neurological phenotypes in patients hospitalized with COVID-19, particularly among those with severe disease.
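For readers unfamiliar with how a relative risk and its 95% confidence interval are obtained from counts, the following Python sketch shows one common textbook approach (a normal approximation on the log scale). The abstract does not state how its estimates were computed or pooled across sites, so this is not the study's method; the function name `relative_risk` and all counts below are hypothetical.

```python
import math


def relative_risk(events_exposed, n_exposed, events_control, n_control, z=1.96):
    """Relative risk with an approximate 95% CI on the log scale.

    Assumes all counts are positive; z=1.96 corresponds to a two-sided
    95% interval under the normal approximation.
    """
    risk_exposed = events_exposed / n_exposed
    risk_control = events_control / n_control
    rr = risk_exposed / risk_control
    # Standard error of ln(RR) for two independent binomial samples.
    se = math.sqrt(
        1 / events_exposed - 1 / n_exposed + 1 / events_control - 1 / n_control
    )
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper


# Hypothetical counts: 30 of 100 severe patients and 15 of 100
# never-severe patients have a given neurological code.
rr, lower, upper = relative_risk(30, 100, 15, 100)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An interval that excludes 1 would suggest that the condition is recorded more (or less) often in one group than in the other under these assumptions.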