Grace Tiao scite author profile

Analysis of protein-coding genetic variation in 60,706 humans

Summary: Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. We describe the aggregation and analysis of high-quality exome (protein-coding region) sequence data for 60,706 individuals of diverse ethnicities generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation, identifying 3,230 genes with near-complete depletion of truncating variants, 72% of which have no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human “knockout” variants in protein-coding genes.

The mutational constraint spectrum quantified from variation in 141,456 humans

Karczewski¹,

Francioli²,

Tiao³

et al. 2020

Nature

6,378

5,152

The mutational constraint spectrum quantified from variation in 141,456 humans

Karczewski¹,

Francioli²,

Tiao³

et al. 2019

Preprint

1,238

1,541

Supplemental Figure 1 Method: All MS runs were compared and clustered using standard artMS (https://github.com/biodavidjm/artMS) procedures on observed feature intensities computed by MaxQuant. Supplemental Figure 1 shows all pairwise Pearson correlations between MS runs, clustered according to similar correlation patterns. Supplemental Figure 2 Method: See main text. Supplemental Figure 3 Method: PFAM domain enrichment analysis. The enrichment of individual PFAM domains (or PFAM clans) [1] was calculated with a hypergeometric test where success is defined as the number of domains, and the number of trials is the number of individual preys pulled down with each viral bait. The population values were the numbers of individual PFAM domains and clans in the human proteome. To make sure that the p-values that signify enrichment were meaningful, we only considered PFAM domains that have been pulled down at least three times with any SARS-CoV-2 protein, and which occur in the human proteome at least five times. In SI Figure 3 we show PFAM domains/clans with the lowest p-value for a given viral bait protein.

Pan-cancer analysis of whole genomes

Campbell¹,

Getz²,

Korbel³

et al. 2020

Nature

2,002

1,015

The pan-cancer analysis of whole genomes

The expansion of whole-genome sequencing studies from individual ICGC and TCGA working groups presented the opportunity to undertake a meta-analysis of genomic features across tumour types. To achieve this, the PCAWG Consortium was established. A Technical Working Group implemented the informatics analyses by aggregating the raw sequencing data from different working groups that studied individual tumour types, aligning the sequences to the human genome and delivering a set of high-quality somatic mutation calls for downstream analysis (Extended Data Fig. 1). Given the recent meta-analysis

Integrated Genomic Characterization of Pancreatic Ductal Adenocarcinoma

Raphael¹,

Hruban²,

Aguirre³

et al. 2017

Cancer Cell

1,344

951

SUMMARY: We performed integrated genomic, transcriptomic and proteomic profiling of 150 pancreatic ductal adenocarcinoma (PDAC) specimens, including samples with characteristic low neoplastic cellularity. Deep whole-exome sequencing revealed recurrent somatic mutations in KRAS, TP53, CDKN2A, SMAD4, RNF43, ARID1A, TGFβR2, GNAS, RREB1 and PBRM1. KRAS wild-type tumors harbored alterations in other oncogenic drivers, including GNAS, BRAF, CTNNB1 and additional RAS pathway genes. A subset of tumors harbored multiple KRAS mutations, with some showing evidence of biallelic mutations. Protein profiling identified a favorable prognosis subset with low epithelial-mesenchymal transition and high MTOR pathway scores. Associations of non-coding RNAs with tumor-specific mRNA subtypes were also identified. Our integrated multi-platform analysis reveals a complex molecular landscape of PDAC and provides a roadmap for precision medicine.

Author Correction: The mutational constraint spectrum quantified from variation in 141,456 humans

Karczewski¹,

Francioli²,

Tiao³

et al. 2021

Nature

483

726

A structural variation reference for medical and population genetics

Collins

Brand

Karczewski

et al. 2020

Nature

641

656

Structural variants (SVs) rearrange large segments of DNA [1] and can have profound consequences in evolution and human disease [2,3]. As national biobanks, disease-association studies, and clinical genetic testing have grown increasingly reliant on genome sequencing, population references such as the Genome Aggregation Database (gnomAD) [4] have become integral in the interpretation of single-nucleotide variants (SNVs) [5]. However, there are no reference maps of SVs from high-coverage genome sequencing comparable to those for SNVs. Here we present a reference of sequence-resolved SVs constructed from 14,891 genomes across diverse global populations (54% non-European) in gnomAD. We discovered a rich and complex landscape of 433,371 SVs, from which we estimate that SVs are responsible for 25–29% of all rare protein-truncating events per genome. We found strong correlations between natural selection against damaging SNVs and rare SVs that disrupt or duplicate protein-coding sequence, which suggests that genes that are highly intolerant to loss-of-function are also sensitive to increased dosage [6]. We also uncovered modest selection against noncoding SVs in cis-regulatory elements, although selection against protein-truncating SVs was stronger than all noncoding effects. Finally, we identified very large (over one megabase), rare SVs in 3.9% of samples, and estimate that 0.13% of individuals may carry an SV that meets the existing criteria for clinically important incidental findings [7]. This SV resource is freely distributed via the gnomAD browser [8] and will have broad utility in population genetics, disease-association studies, and diagnostic screening.

Comprehensive analysis of cancer-associated somatic mutations in class I HLA genes

et al. 2015

Detection of somatic mutations in HLA genes using whole-exome sequencing (WES) is hampered by the high polymorphism of the HLA loci, which prevents alignment of sequencing reads to the human reference genome. We describe a computational pipeline that enables accurate inference of germline alleles of class I HLA-A, -B and -C genes and subsequent detection of mutations in these genes using the inferred alleles as a reference. Analysis of WES data from 7,930 pairs of tumor and healthy tissue from the same patient revealed 298 non-silent HLA mutations in tumors from 266 patients. These 298 mutations are enriched for likely functional mutations, including putative loss-of-function events. Recurrence of mutations suggested that these ‘hotspot’ sites were positively selected. Cancers with recurrent somatic HLA mutations were associated with upregulation of signatures of cytolytic activity characteristic of tumor infiltration by effector lymphocytes, supporting immune evasion by altered HLA function as a contributory mechanism in cancer.
