2021
DOI: 10.21203/rs.3.rs-943804/v1
Preprint

Biased Data, Biased AI: Deep Networks Predict the Acquisition Site of TCGA Images

Abstract: Deep learning models applied to healthcare applications including digital pathology have been increasing their scope and importance in recent years. Many of these models have been trained on The Cancer Genome Atlas (TCGA) atlas of digital images, or use it as a validation source. This study shows that there are tissue source site (tss) specific patterns of TCGA images that could be used to identify contributing institutions without any explicit training. Furthermore, it was observed that a model trained for ca…

Cited by 10 publications (17 citation statements)
References 19 publications
“…The experiments showed that the model gave a remarkable performance, and provide important assistance for health experts and diagnostic radiographers. The whole-slide images available in the Cancer Genome Atlas (TCGA) have been used by Dehkharghanian et al [15] for developing a deep-learning model for the automatic analysis of tumor slides where in order to recognize tumor in lung versus normal tissue. After testing and evaluating the deep-learning model by the training they found that artificial intelligence technology based on deep-learning models can help pathologists in cancer subtype detection or gene mutations in any cancer type with a save of costs and time.…”
Section: Introduction (mentioning)
confidence: 99%
“…The discrepancy between the source and target domain distributions is known as domain shift [7]. According to Yagi et al [8], the domain shift in histopathology usually occurs when WSIs are acquired at different trial sites due to differences in slide preparation, staining procedure, and scanner characteristics as well as biases in developing AI models [9]. Overall, correct diagnosis is a critical task in histopathology which can directly influence the treatment outcome.…”
Section: Introduction (mentioning)
confidence: 99%
“…That is possibly related to the non-optimal design of the DNNs structure which leading to extracted non-optimal set of features [23]. Consequently, it seems that DNNs have a tendency to learn irrelevant shortcuts patterns related to data acquisition sites, rather than the actual underlying morphological information [19].…”
Section: Introduction (mentioning)
confidence: 99%
“…In a recent study [19], the existence of bias in histopathology images of The Cancer Genome Atlas (TCGA) [20] was investigated. It was established that the deep features extracted from the images are able to accurately distinguish the WSIs based on their acquisition site.…”
Section: Introduction (mentioning)
confidence: 99%