“…In this study, we extracted raw expression data from 30 datasets: 29 transcriptome datasets were obtained from GEO and one from TCGA; each dataset contains at least 10 samples. The following datasets were obtained from GEO: GSE102079 (Chiyonobu et al, 2018), GSE22405, GSE98383 (Diaz et al, 2018), GSE84402 (Wang et al, 2017), GSE64041 (Makowska et al, 2016), GSE69715 (Sekhar et al, 2018), GSE51401, GSE62232 (Schulze et al, 2015), GSE45267 (Chen et al, 2018a), GSE32879 (Oishi et al, 2012), GSE19665 (Deng et al, 2010), GSE107170 (Diaz et al, 2018), GSE76427 (Grinchuk et al, 2018), GSE39791 (Kim et al, 2014), GSE57957 (Mah et al, 2014), GSE87630 (Woo et al, 2017), GSE46408, GSE57555 (Murakami et al, 2015), GSE54236 (Villa et al, 2016; Zubiete-Franco et al, 2019), GSE65484 (Dong et al, 2015), GSE31370 (Seok et al, 2012), GSE84598, GSE89377, GSE29721 (Stefanska et al, 2011), GSE14323 (Mas et al, 2009), GSE25097 (Lamb et al, 2011; Tung et al, 2011; Wong et al, 2016), GSE14520 (Roessler et al, 2010; Zhao et al, 2015), GSE36376 (Lim et al, 2013), and GSE36076. All GEO datasets were obtained using the GEOquery package of Bioconductor in R 3.5.3.…”
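
As a quick sanity check on the accession list above, the following is a minimal Python sketch, not the authors' pipeline (the excerpt states that they used GEOquery in R). It confirms that the list holds 29 distinct GEO series and builds, for each one, the directory where GEO conventionally publishes series matrix files. The helper name `series_matrix_dir` is hypothetical, and the URL layout (last three digits of the accession replaced by `nnn`) is an assumption about GEO's public file server that should be verified before relying on it.

```python
import re

# The 29 GEO series accessions exactly as listed in the text.
GEO_SERIES = [
    "GSE102079", "GSE22405", "GSE98383", "GSE84402", "GSE64041",
    "GSE69715", "GSE51401", "GSE62232", "GSE45267", "GSE32879",
    "GSE19665", "GSE107170", "GSE76427", "GSE39791", "GSE57957",
    "GSE87630", "GSE46408", "GSE57555", "GSE54236", "GSE65484",
    "GSE31370", "GSE84598", "GSE89377", "GSE29721", "GSE14323",
    "GSE25097", "GSE14520", "GSE36376", "GSE36076",
]

# Assumed base URL of GEO's public series directory.
GEO_BASE = "https://ftp.ncbi.nlm.nih.gov/geo/series"


def series_matrix_dir(accession: str) -> str:
    """Return the assumed URL of the series matrix directory for a GSE accession.

    Hypothetical helper: it only builds a string and performs no download.
    """
    match = re.fullmatch(r"GSE(\d+)", accession)
    if match is None:
        raise ValueError(f"not a GEO series accession: {accession!r}")
    digits = match.group(1)
    # Assumed layout: series are grouped into folders named after the
    # accession with its last three digits replaced by 'nnn'.
    stub = "GSE" + digits[:-3] + "nnn"
    return f"{GEO_BASE}/{stub}/{accession}/matrix/"


if __name__ == "__main__":
    # The text reports 29 GEO datasets; check the list matches and has no repeats.
    assert len(GEO_SERIES) == len(set(GEO_SERIES)) == 29
    for accession in GEO_SERIES:
        print(accession, series_matrix_dir(accession))
```

The sketch deliberately stops at building URLs. Downloading and parsing the expression matrices, and applying the minimum of 10 samples per dataset, would need the actual files and is left to a tool such as GEOquery.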