Tetrazolium: an important test for physiological seed quality evaluation

Classification of the Disposition of Patients Hospitalized with COVID-19: Reading Discharge Summaries Using Natural Language Processing

Background: Medical notes are a rich source of patient data; however, the nature of unstructured text has largely precluded the use of these data for large retrospective analyses. Transforming clinical text into structured data can enable large-scale research studies with electronic health records (EHR) data. Natural language processing (NLP) can be used for text information retrieval, reducing the need for labor-intensive chart review. Here we present an application of NLP to large-scale analysis of medical records at 2 large hospitals for patients hospitalized with COVID-19.

Objective: Our study goal was to develop an NLP pipeline to classify the discharge disposition (home, inpatient rehabilitation, skilled nursing inpatient facility [SNIF], and death) of patients hospitalized with COVID-19 based on hospital discharge summary notes.

Methods: Text mining and feature engineering were applied to unstructured text from hospital discharge summaries. The study included patients with COVID-19 discharged from 2 hospitals in the Boston, Massachusetts area (Massachusetts General Hospital and Brigham and Women’s Hospital) between March 10, 2020, and June 30, 2020. The data were divided into a training set (70%) and hold-out test set (30%). Discharge summaries were represented as bags-of-words consisting of single words (unigrams), bigrams, and trigrams. The number of features was reduced during training by excluding n-grams that occurred in fewer than 10% of discharge summaries, and further reduced using least absolute shrinkage and selection operator (LASSO) regularization while training a multiclass logistic regression model. Model performance was evaluated using the hold-out test set.

Results: The study cohort included 1737 adult patients (median age 61 [SD 18] years; 55% men; 45% White and 16% Black; 14% nonsurvivors and 61% discharged home). The model selected 179 from a vocabulary of 1056 engineered features, consisting of combinations of unigrams, bigrams, and trigrams. The top features contributing most to the classification by the model (for each outcome) were the following: “appointments specialty,” “home health,” and “home care” (home); “intubate” and “ARDS” (inpatient rehabilitation); “service” (SNIF); “brief assessment” and “covid” (death). The model achieved a micro-average area under the receiver operating characteristic curve value of 0.98 (95% CI 0.97-0.98) and average precision of 0.81 (95% CI 0.75-0.84) in the testing set for prediction of discharge disposition.

Conclusions: A supervised learning–based NLP approach is able to classify the discharge disposition of patients hospitalized with COVID-19. This approach has the potential to accelerate and increase the scale of research on patients’ discharge disposition that is possible with EHR data.
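
The feature-engineering step described in the abstract (bags of unigrams, bigrams, and trigrams, with rare n-grams excluded by document frequency) can be sketched in plain Python. This is only an illustration under stated assumptions: the abstract does not say which tools the authors used, the function names `ngrams`, `build_vocabulary`, and `vectorize` are hypothetical, the four notes are invented toy text, and whitespace tokenization stands in for whatever text mining the study applied. The toy cutoff is 50% rather than the study's 10% because there are only four notes. The LASSO-regularized multiclass logistic regression that follows this step is not shown.

```python
from collections import Counter


def ngrams(tokens, max_n=3):
    """Return every 1- to max_n-gram in a token list, in order, with repeats."""
    grams = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def build_vocabulary(notes, min_doc_fraction=0.10, max_n=3):
    """Keep n-grams that occur in at least min_doc_fraction of the notes."""
    doc_freq = Counter()
    for note in notes:
        # set() so each note contributes at most 1 to an n-gram's document frequency
        doc_freq.update(set(ngrams(note.lower().split(), max_n)))
    threshold = min_doc_fraction * len(notes)
    return sorted(g for g, df in doc_freq.items() if df >= threshold)


def vectorize(note, vocabulary, max_n=3):
    """Count how often each vocabulary n-gram appears in one note."""
    counts = Counter(ngrams(note.lower().split(), max_n))
    return [counts[g] for g in vocabulary]


# Invented toy notes (not taken from the study data).
notes = [
    "discharged home with home health services",
    "discharged to rehabilitation after intubation",
    "discharged home with follow up appointments",
    "transferred to skilled nursing facility",
]

# Toy threshold of 50% so that some n-grams survive with only four notes.
vocab = build_vocabulary(notes, min_doc_fraction=0.5)
features = [vectorize(note, vocab) for note in notes]

print(vocab)
print(features[0])
```

Each row of `features` is one note's bag-of-n-grams count vector over the retained vocabulary. In a setup like the one the abstract describes, vectors of this kind would then be passed to a regularized classifier that shrinks the feature set further.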

Multiple domain protein diagnostic patterns

Adams

Das

Smith

1996

Protein Science

We have implemented an iterative algorithm for the identification of diagnostic patterns from sets of multiple-domain proteins, where domains need not be common to all the proteins in the defining set. Our algorithm was applied to sequences gathered using a variety of methods, including BLAST, common keywords, and common E.C. numbers. In all cases, useful diagnostic patterns were obtained, possessing both high sensitivity and specificity. The patterns were found to correlate in several cases with both functional and structural domains. Patterns generated from a large number of sequence families were analyzed for probable multiple-domain structure.

Classification of neurologic outcomes from medical notes using natural language processing

Fernandes

Valizadeh

Alabsi

et al. 2023

Expert Systems with Applications

IoT based Low-Cost Gas Leakage, Fire, and Temperature Detection System with Call Facilities

Debnath

Ahmed

Das

et al. 2020

Sudeshna Das

Biology's new Rosetta stone

Identifying nature's protein lego set

Detection of hard cuts and gradual transitions from video using fuzzy logic

Context-sensitive gender inference of named entities in text

Classification of the Disposition of Patients Hospitalized with COVID-19: Reading Discharge Summaries Using Natural Language Processing

Multiple domain protein diagnostic patterns

Classification of neurologic outcomes from medical notes using natural language processing

IoT based Low-Cost Gas Leakage, Fire, and Temperature Detection System with Call Facilities
