2022
DOI: 10.3390/cancers14133063

Considerations for the Use of Machine Learning Extracted Real-World Data to Support Evidence Generation: A Research-Centric Evaluation Framework

Abstract: A vast amount of real-world data, such as pathology reports and clinical notes, is captured as unstructured text in electronic health records (EHRs). However, this information is both difficult and costly to extract through human abstraction, especially when scaling to large datasets is needed. Fortunately, Natural Language Processing (NLP) and Machine Learning (ML) techniques provide promising solutions for a variety of information extraction tasks such as identifying a group of patients who have a specific …

Cited by 12 publications (18 citation statements)
References 36 publications
“…Additionally, there is a need for model transparency and explainability such that model predictions can be trusted by stakeholders and therefore be more readily accepted [ 36 ]. Finally, proper model evaluation is needed to ensure that models are fair and generalizable, which requires an adequate volume of high-quality labeled test data that is not used during model training and validation [ 6 , 37 ].…”
Section: Discussion
confidence: 99%
“…These sentences are then transformed into a mathematical representation that the model can interpret. Individual models used in this study were evaluated with the research-centric evaluation framework developed by Estevez et al [ 6 ]. Each model’s performance was evaluated using a test set of over 3000 unique lung cancer patients.…”
Section: Methods
confidence: 99%
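The quoted passage describes transforming sentences into a mathematical representation a model can interpret. As an illustration only, a minimal bag-of-words count vector can play that role; the vocabulary-building scheme and example sentences below are hypothetical, not the study's actual pipeline.

```python
# Minimal sketch: turn sentences into fixed-length count vectors.
# The tokenization (lowercase whitespace split) and example data are
# illustrative assumptions, not the pipeline used in the cited study.

def build_vocab(sentences):
    """Assign each distinct token an index, in order of first appearance."""
    vocab = {}
    for sentence in sentences:
        for token in sentence.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def vectorize(sentence, vocab):
    """Map a sentence to a count vector over the vocabulary;
    out-of-vocabulary tokens are ignored."""
    vec = [0] * len(vocab)
    for token in sentence.lower().split():
        if token in vocab:
            vec[vocab[token]] += 1
    return vec

vocab = build_vocab(["stage iv lung cancer", "no evidence of cancer"])
print(vectorize("lung cancer cancer", vocab))  # counts for each vocab token
```

Real systems typically use learned embeddings rather than raw counts, but the shape of the step is the same: text in, numeric vector out.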
“…Measuring performance is a complex challenge because even a model with good overall performance might systematically underperform on a particular subcohort of interest, and because while conventional metrics apply to individual models, dozens of ML-extracted variables may be combined to answer a specific research question. We use a research-centric evaluation framework 34 to assess the quality of variables curated with ML. Evaluations include one or more of the following strategies: (1) overall performance assessment, (2) stratified performance assessment, (3) quantitative error analysis, and (4) replication analysis.…”
Section: Model Evaluation and Performance Assessment
confidence: 99%
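The first two strategies in the quoted passage — overall and stratified performance assessment — can be sketched as computing the same metrics once globally and once per subcohort, so that a subgroup with systematically worse performance is surfaced rather than averaged away. The metric choice (precision/recall) and all data below are illustrative assumptions, not the framework's actual specification.

```python
# Illustrative sketch of overall vs. stratified performance assessment.
# Labels, predictions, and subcohort names are hypothetical example data.

from collections import defaultdict

def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

def stratified_report(y_true, y_pred, groups):
    """Overall metrics plus the same metrics per subcohort, so that
    systematic underperformance on one subgroup is visible."""
    report = {"overall": precision_recall(y_true, y_pred)}
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    for g, (t, p) in buckets.items():
        report[g] = precision_recall(t, p)
    return report

report = stratified_report(
    y_true=[1, 1, 0, 0, 1, 0],
    y_pred=[1, 0, 0, 1, 1, 0],
    groups=["A", "A", "A", "B", "B", "B"],
)
print(report)  # overall metrics plus per-subcohort metrics
```

A model can look acceptable on the "overall" row while one subcohort's row is clearly worse, which is exactly the failure mode the quoted passage warns about.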