2020
DOI: 10.1038/s42256-020-00251-5

Li Yan et al. reply

Cited by 10 publications
(11 citation statements)
References 25 publications
“…“Internal” model performance on structurally similar, previously unseen data, gathered from the same source used for model training, can be contrasted with “external” model performance on new, previously unseen data from other sources. ML models perform worse in external cohorts due to several reasons such as different protocols, confounding variables, or heterogeneous populations (Cabitza et al, 2017 ; Zech et al, 2018 ; Martensson et al, 2020 ; Goncalves et al, 2021 ). Moreover, medical data can be biased by a variety of factors such as admission policies, hospital treatment protocols, country-specific guidelines, clinician discretion, healthcare economy, etc.…”
Section: Discussion (mentioning)
confidence: 99%
“…Therefore, there is now a strong need for new techniques and automated tools to be designed that can significantly help us prepare quality data. Data preparation can be more time-consuming than data mining, and can present challenges equal to, if not greater than, those of data mining [8]. In this section, we argue for the importance of data preparation in three respects: 1.…”
Section: The Importance of Data Preparation in Data Mining (unclassified)
“…Few papers address this small-data issue, or the resulting imbalance of class sizes, making it unlikely that their results will generalize to the wider community. For example, because of the prevalence of data from China, many researchers train on small datasets from China when the model is intended for European populations, and recent research suggests such models are ineffective in practice (6). Differences between the training data and the target population, including patient phenotypes and data acquisition procedures, can all affect a model's generalisability (6).…”
Section: Systematic Errors In the Literature (mentioning)
confidence: 99%
“…For example, because of the prevalence of data from China, many researchers train on small datasets from China when the model is intended for European populations, and recent research suggests such models are ineffective in practice (6). Differences between the training data and the target population, including patient phenotypes and data acquisition procedures, can all affect a model's generalisability (6). Training generalisable models from small amounts of labeled data are a common problem in medical imaging, and techniques such as transfer learning, self-or semisupervised learning, and parameter pruning can ameliorate this issue (7,8).…”
Section: Systematic Errors In the Literature (mentioning)
confidence: 99%