Background
Acute lymphoblastic leukemia (ALL) is the most common childhood cancer, and its early age of onset suggests that germline variants influence ALL risk. Although multiple genome-wide association (GWA) studies have identified variants that predispose children to ALL, it remains unclear whether genetic heterogeneity affects ALL susceptibility and how interactions within and among genes containing ALL-associated variants influence ALL risk.
Methods
Here we jointly analyze two published datasets of case-control GWA summary statistics along with germline data from ALL case-parent trios. We use the gene-level association method PEGASUS to identify genes with multiple variants associated with ALL. We then use PEGASUS gene scores as input to the network analysis algorithm HotNet2 to characterize the genomic architecture of ALL.
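The hand-off between the two steps can be sketched as follows. This is a minimal illustration only, assuming that gene-level p-values are converted to -log10 scores before network analysis; that transformation, the function name `heat_scores`, and the placeholder gene names and p-values are hypothetical and do not reproduce the PEGASUS or HotNet2 interfaces.

```python
import math


def heat_scores(gene_pvalues, floor=1e-300):
    """Map gene-level p-values to -log10 scores (larger = stronger association).

    Assumption for illustration: a -log10 transform is used as the per-gene
    score passed to the network step. `floor` guards against log10(0).
    """
    scores = {}
    for gene, p in gene_pvalues.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"p-value for {gene} is outside [0, 1]: {p}")
        scores[gene] = -math.log10(max(p, floor))
    return scores


# Placeholder inputs, not results from the study.
example_pvalues = {"GENE_A": 1e-8, "GENE_B": 0.05, "GENE_C": 1.0}
scores = heat_scores(example_pvalues)

# Rank genes from strongest to weakest association.
ranked = sorted(scores, key=scores.get, reverse=True)
```

Under this assumption, genes with smaller p-values receive larger scores, so the strongest gene-level associations contribute the most weight in the network step.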
Results
Using PEGASUS, we confirm associations previously observed at genes such as ARID5B, IKZF1, CDKN2A/2B, and PIP4K2A, and we identify novel candidate gene associations. Using HotNet2, we uncover two significant gene subnetworks that may underlie inherited ALL risk: a subnetwork involved in B-cell differentiation that contains the ALL-associated gene CEBPE, and a subnetwork of homeobox genes that includes MEIS1.
Conclusions
Gene and network analyses uncover loci associated with ALL, such as MEIS1, that are missed by GWA studies. Further, ALL-associated loci do not appear to interact directly with each other to influence ALL risk; instead, they appear to influence leukemogenesis through multiple, complex pathways.
Impact
We present a new pipeline for post-hoc analysis of association studies that yields new insight into the etiology of ALL and can be applied in future studies to shed light on the genomic underpinnings of cancer.