Biased data, biased AI: deep networks predict the acquisition site of TCGA images

Dehkharghanian, Taher; Bidgoli, Azam Asilian; Riasatian, Abtin; Mazaheri, Pooria; Campbell, Clinton J. V.; Pantanowitz, Liron; Tizhoosh, Hamid R.; Rahnamayan, Shahryar

doi:10.1186/s13000-023-01355-3

Cited by 13 publications

(6 citation statements)

References 35 publications

“…Regardless of model/algorithmic goals, AI/ML must be evaluated using a diverse range of externally sourced data because deep models have a tendency to learn medically irrelevant shortcuts to achieve their medically relevant goals. For instance, it has been shown that models trained on The Cancer Genome Atlas (TCGA) WSIs for cancer subtype classification learned to distinguish hospitals and medical centers that provided WSIs [59]. Additionally, researchers usually evaluate their models on their own data from their own institution.…”

Section: Discussion (mentioning)

confidence: 99%

Applied machine learning in hematopathology

Dehkharghanian, Tizhoosh, et al. 2023. Int J Lab Hematology

An increasing number of machine learning applications are being developed and applied to digital pathology, including hematopathology. The goal of these modern computerized tools is often to support diagnostic workflows by extracting and summarizing information from multiple data sources, including digital images of human tissue. Hematopathology is inherently multimodal and can serve as an ideal case study for machine learning applications. However, hematopathology also poses unique challenges compared to other pathology subspecialities when applying machine learning approaches. By modeling the pathologist workflow and thinking process, machine learning algorithms may be designed to address practical and tangible problems in hematopathology. In this article, we discuss the current trends in machine learning in hematopathology. We review currently available machine learning enabled medical devices supporting hematopathology workflows. We then explore current machine learning research trends of the field with a focus on bone marrow cytology and histopathology, and how adoption of new machine learning tools may be enabled through the transition to digital pathology.

“…It is also worth noting that implementing a system that enables rapid and consistent imaging, correction, and virtual staining of tissue samples would significantly enhance stain uniformity/repeatability. This is particularly crucial considering the lab-based biases present in extensive and reputable databases, such as the digital image collection of The Cancer Genome Atlas (TCGA) (Dehkharghanian et al., 2023).…”

Section: Discussion (mentioning)

confidence: 99%

Digital staining facilitates biomedical microscopy

2023

Traditional staining of biological specimens for microscopic imaging entails time-consuming, laborious, and costly procedures, in addition to producing inconsistent labeling and causing irreversible sample damage. In recent years, computational “virtual” staining using deep learning techniques has evolved into a robust and comprehensive application for streamlining the staining process without typical histochemical staining-related drawbacks. Such virtual staining techniques can also be combined with neural networks designed to correct various microscopy aberrations, such as out-of-focus or motion blur artifacts, and improve upon diffraction-limited resolution. Here, we highlight how such methods lead to a host of new opportunities that can significantly improve both sample preparation and imaging in biomedical microscopy.

“…The widespread use of The Cancer Genome Atlas (TCGA) dataset, as seen in 42% of the studies included in our review, further underscores the importance of addressing dataset biases. Some models trained on TCGA have shown a tendency to recognize specific institutional patterns, which, although not medically relevant, could unintentionally affect model performance [88, 89]. Moreover, the lack of cross-validation among different cohorts, potential lab-induced tissue artifacts, and the biases from institutional patterns limit model generalizability and clinical application.…”

Section: Navigating the Future: Challenges and Improvements in BC CPATH (mentioning)

confidence: 99%

Artificial Intelligence in Digital Pathology for Bladder Cancer: Hype or Hope? A Systematic Review

Khoraminia, Fuster, Kanwal, et al. 2023. Cancers

Bladder cancer (BC) diagnosis and prediction of prognosis are hindered by subjective pathological evaluation, which may cause misdiagnosis and under-/over-treatment. Computational pathology (CPATH) can identify clinical outcome predictors, offering an objective approach to improve prognosis. However, a systematic review of CPATH in BC literature is lacking. Therefore, we present a comprehensive overview of studies that used CPATH in BC, analyzing 33 out of 2285 identified studies. Most studies analyzed regions of interest to distinguish normal versus tumor tissue and identify tumor grade/stage and tissue types (e.g., urothelium, stroma, and muscle). The cell’s nuclear area, shape irregularity, and roundness were the most promising markers to predict recurrence and survival based on selected regions of interest, with >80% accuracy. CPATH identified molecular subtypes by detecting features, e.g., papillary structures, hyperchromatic, and pleomorphic nuclei. Combining clinicopathological and image-derived features improved recurrence and survival prediction. However, due to the lack of outcome interpretability and independent test datasets, robustness and clinical applicability could not be ensured. The current literature demonstrates that CPATH holds the potential to improve BC diagnosis and prediction of prognosis. However, more robust, interpretable, accurate models and larger datasets—representative of clinical scenarios—are needed to address artificial intelligence’s reliability, robustness, and black box challenge.
