“…The existing algorithms and databases have many pitfalls and limitations that hinder accurate prediction of transcription regulation: (1) the disease-associated SNPs in GWAS datasets (Bryzgalov et al, 2013; Li et al, 2013; Teng et al, 2012; Ward & Kellis, 2012) may not in fact be causal, since they may have been selected because of their linkage to the causal SNPs (Levo & Segal, 2014); (2) chromatin accessibility datasets lack many specific cell lines (Bryzgalov et al, 2013; Macintyre et al, 2010; Manke et al, 2010; Teng et al, 2012) or are limited in their use (Li et al, 2013; Ward & Kellis, 2012); (3) there are few resources integrating genetic variant databases and expression profile information (Yang et al, 2010; Holm, Melum, Franke, & Karlsen, 2010), though there is a recent effort to provide tissue-specific gene expression and genotype information (GTEx Consortium, 2015); and (4) the impact of multiple TFBSs of the same TF in the regulatory region (also called homotypic redundancy) is not quantified (Bryzgalov et al, 2013; Macintyre et al, 2010; Teng et al, 2012; Ward & Kellis, 2012), even though it is an important feature in the regulation of gene expression (Gotea et al, 2010; Spivakov et al, 2012). Homotypic redundancy matters because it has been suggested that the more TFBSs a single TF has in the same region, the less impact the perturbation of a single TFBS will have (Sharon et al, 2012; Smith et al, 2013).…”