Background:
Advances in gene-hunting techniques have identified several genes associated with autism spectrum disorder (ASD). Cluster analysis combined with gene network studies has revealed many disrupted key pathways in ASD, although elucidating its genetic underpinnings remains challenging. This study aims to determine, through a novel data-driven approach, how networks of mutated genes impact the biological processes underlying autism.
Methods:
We analyzed the VariCarta dataset, which contains more than 200,000 genomic variant events collected from 13,069 people with ASD. First, we created a whole-genome sequencing subset and an exome sequencing subset. Then, within each subset, we compared patients pairwise to build “patient similarity matrices”. Hierarchical agglomerative clustering and heatmap visualization were applied to these matrices to identify clusters of patients in whom the same gene networks co-occur. The subsequent enrichment analysis (EA) highlighted the biological processes that might be impacted by the mutated genes of each subgroup.
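As an illustration only, the pairwise comparison and clustering steps could be sketched as follows. The Jaccard index as similarity measure, average linkage as merging criterion, the toy patient gene sets, and all function names are assumptions made for this sketch; the abstract does not specify which similarity measure or linkage criterion was used.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of mutated genes (assumed metric)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_matrix(patients):
    """Build a symmetric patient-by-patient similarity matrix."""
    ids = sorted(patients)
    n = len(ids)
    sim = [[1.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        sim[i][j] = sim[j][i] = jaccard(patients[ids[i]], patients[ids[j]])
    return ids, sim


def agglomerative(sim, n_clusters):
    """Average-linkage agglomerative clustering on a similarity matrix.

    Starts with one cluster per patient and repeatedly merges the two
    clusters with the highest mean pairwise similarity.
    """
    clusters = [[i] for i in range(len(sim))]

    def linkage(c1, c2):
        return sum(sim[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b is safe
        del clusters[b]
    return clusters


# Hypothetical toy data: patient ID -> set of mutated genes (not from VariCarta).
patients = {
    "P1": {"SHANK3", "NRXN1", "CHD8"},
    "P2": {"SHANK3", "NRXN1"},
    "P3": {"SCN2A", "SYNGAP1"},
    "P4": {"SCN2A", "SYNGAP1", "GRIN2B"},
    "P5": {"PTEN"},
}

ids, sim = similarity_matrix(patients)
clusters = agglomerative(sim, n_clusters=3)
labelled = sorted(sorted(ids[i] for i in c) for c in clusters)
print(labelled)
```

In this toy example, P5 shares no mutated gene with any other patient and therefore remains in a cluster of its own. At the scale of the real dataset, a dedicated library implementation of hierarchical clustering would be preferable to this quadratic pure-Python loop.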
Results:
In the whole-genome matrix, we identified three main genetic clusters of ASD patients, each characterized by a network of shared genetic variants. We isolated 11,609 genetic variants shared by at least two subjects in each cluster; 4,187 of these variants (36.1%) were common to the three clusters. Only 331 patients (2.5%) shared no or very few mutated genes with anyone else. The EA highlighted both common and cluster-specific biological processes related to the variants. Most of the common abnormal processes were involved in neuron projection guidance and morphogenesis, cell junctions, and synapse assembly. Exome sequencing alone was not effective in identifying ASD subgroups.
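The counting behind these figures can be sketched as below, under one possible reading of the text: a variant is "shared" within a cluster if at least two of its patients carry it, and "common" if it is shared in every cluster. The function name, the variant labels, and the toy clusters are hypothetical; only the four totals used in the final percentage check come from the abstract.

```python
from collections import Counter


def shared_variants(cluster, min_carriers=2):
    """Variants carried by at least `min_carriers` patients of one cluster."""
    counts = Counter(v for variants in cluster.values() for v in set(variants))
    return {v for v, n in counts.items() if n >= min_carriers}


# Hypothetical toy clusters: cluster name -> {patient ID -> set of variants}.
clusters = {
    "A": {"P1": {"v1", "v2", "v3"}, "P2": {"v1", "v2"}, "P3": {"v1", "v4"}},
    "B": {"P4": {"v1", "v5"}, "P5": {"v1", "v5", "v2"}},
    "C": {"P6": {"v1", "v6"}, "P7": {"v1", "v6"}},
}

shared = {name: shared_variants(members) for name, members in clusters.items()}
all_shared = set().union(*shared.values())    # shared within at least one cluster
common = set.intersection(*shared.values())   # shared within every cluster
toy_pct = 100 * len(common) / len(all_shared)

# Percentages recomputed from the totals reported in the abstract.
pct_common = round(100 * 4187 / 11609, 1)     # variants common to all clusters
pct_isolated = round(100 * 331 / 13069, 1)    # patients with little or no sharing
print(sorted(common), toy_pct, pct_common, pct_isolated)
```

The last two values reproduce the reported 36.1% and 2.5%, the latter taking the full 13,069 patients as denominator.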
Limitations:
Caution is warranted when interpreting our results, as we did not compare them with a control group and did not verify whether the identified subgroups were actually associated with different phenotypes. Future work will have to ascertain the strength and reproducibility of these results.
Conclusions:
Itemizing not just single mutated genes, but also the gene networks and specific biological processes that characterize different ASD subpopulations, might clarify which networks of genetic variants play a major role in the etiopathology of ASD. The proposed methodology may represent a novel approach to help disentangle the complexity of ASD and a tool to support more focused genotype-phenotype studies.