2023
DOI: 10.1038/s41598-023-35482-0

Constructing a disease database and using natural language processing to capture and standardize free text clinical information

Abstract: The ability to extract critical information about an infectious disease in a timely manner is critical for population health research. The lack of procedures for mining large amounts of health data is a major impediment. The goal of this research is to use natural language processing (NLP) to extract key information (clinical factors, social determinants of health) from free text. The proposed framework describes database construction, NLP modules for locating clinical and non-clinical (social determinants) in…

Cited by 9 publications (1 citation statement)
References 33 publications
“…To identify the appearance and duration of variants of concern, we used data on sequenced SARS-CoV-2 variants from the Global Initiative on Sharing All Influenza Data (GISAID), which is an effective and trusted online resource for sharing genetic, clinical, and epidemiological COVID-19 data [56-60]. We used Nextclade nomenclature [61] to collect clade designations from sequences and Pangolin nomenclature for lineage designations of SARS-CoV-2 [62, 63].…”
Section: Methods (mentioning)
confidence: 99%