Most breast cancer (BC) risk-associated variants (raSNPs) identified in genome-wide association studies (GWAS) are believed to cis-regulate the expression of genes. We hypothesise that cis-regulatory variants contributing to disease risk may affect miRNA genes and/or miRNA binding. To test this, we adapted two miRNA-binding prediction algorithms, TargetScan and miRanda, to perform allele-specific queries, and integrated differential allelic expression (DAE) and expression quantitative trait loci (eQTL) data, to query 150 genome-wide significant (P ≤ 5 × 10⁻⁸) raSNPs, plus proxies. We found that no raSNP mapped to a miRNA gene, suggesting that altered miRNA targeting is an unlikely mechanism involved in BC risk. In addition, 11.5% (6 out of 52) of the raSNPs located in 3'UTRs of putative miRNA target genes were predicted to alter miRNA::mRNA pair binding stability in five candidate target genes. Of these, we propose RNF115, at locus 1q21.1, as a strong novel target gene associated with BC risk, and reinforce the role of miRNA-mediated cis-regulation at locus 19p13.11. We believe that integrating allele-specific querying in miRNA-binding prediction with data supporting cis-regulation of expression improves the identification of candidate target genes in BC risk, as well as in other common cancers and complex diseases.
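To make the idea of an allele-specific query concrete, the following minimal Python sketch tests whether a canonical 7mer-m8 seed-match site overlapping a SNP is present for the reference and the alternative allele of a 3'UTR. It is a simplified illustration only and does not reproduce the TargetScan or miRanda scoring schemes: the function names, the toy UTR sequence and the SNP position are hypothetical, and the let-7a-5p sequence is used merely as a familiar example miRNA.

```python
# Simplified illustration of an allele-specific seed-match query.
# Assumption: only perfect 7mer-m8 seed matches are considered; real
# predictors (e.g. TargetScan, miRanda) use richer scoring models.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def revcomp_rna(seq):
    """Return the reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def seed_site(mirna):
    """Return the mRNA site (5'->3') pairing with miRNA nucleotides 2-8."""
    return revcomp_rna(mirna[1:8])


def allele_specific_sites(utr, snp_pos, ref, alt, mirna):
    """List start positions of seed-match sites overlapping the SNP,
    separately for the reference and the alternative allele."""
    if utr[snp_pos] != ref:
        raise ValueError("reference allele does not match the UTR sequence")
    site = seed_site(mirna)
    hits = {}
    for label, allele in (("ref", ref), ("alt", alt)):
        seq = utr[:snp_pos] + allele + utr[snp_pos + 1:]
        # Only windows that contain the SNP position can differ between alleles.
        first = max(0, snp_pos - len(site) + 1)
        last = min(snp_pos, len(seq) - len(site))
        hits[label] = [i for i in range(first, last + 1)
                       if seq[i:i + len(site)] == site]
    return hits


# Toy example (hypothetical UTR fragment and SNP position).
let7a = "UGAGGUAGUAGGUUGUAUAGUU"
toy_utr = "AAGCUACCUCAGGA"
print(allele_specific_sites(toy_utr, snp_pos=6, ref="C", alt="U", mirna=let7a))
# prints {'ref': [3], 'alt': []}
```

In this toy case the site is present only with the reference allele, so the alternative allele would be flagged as potentially disrupting binding. In the approach described above, such a presence/absence call would be replaced by allele-specific prediction scores and then weighed against DAE and eQTL evidence.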
Results
Some BC risk variants locate to the 3'UTR of PCGs, but none to miRNA genes
To evaluate the contribution to BC risk of genetic variation modulating miRNA::mRNA binding, we first assessed how many GWAS SNPs and their proxies were located in either miRNA genes or 3'UTRs of PCGs. We identified 2749 raSNPs, resulting from 150 BC GWAS-SNPs (Additional file 1: Table S1) and their proxies, of which almost one third (805 raSNPs) were solely annotated to "gene deserts" (585 raSNPs) or intergenic regions (220 raSNPs). The remaining 1944 raSNPs were located in either ncRNA genes or PCGs (Figure 1), in a total of 161 unique Ensembl gene IDs, corresponding to 129 HGNC (HUGO Gene Nomenclature Committee) symbols.
Next, we assessed how many raSNPs would change a miRNA gene sequence, thus affecting its biogenesis or target genes. Interestingly, none of the raSNPs mapped to miRNA genes, even after the LD threshold was lowered to r² ≥ 0.2 when defining proxy SNPs (results not shown). This suggests that altered miRNA biogenesis or altered seed region sequence are unlikely mechanisms associated with BC risk. However, 13 SNPs were annotated as downstream or upstream variants of miRNA genes (Additional file 1: Table S2), raising the possibility that they regulate the expression of the miRNA itself. We did not pursue this hypothesis further, owing to the unavailability of DAE or eQTL data for these particular miRNA genes.
The vast majority of the raSNPs located within PCGs were in non-coding regions (1881 out of 1915, 98%) (Figure 1), consistent with previous reports¹⁷. SNPs located in the 3'UTR of the mRNA sequence of PCGs could potentially modify, create or destroy miRNA binding sites, and we found 52 raSNPs (1.9% of ...