Personalized medical care focuses on prediction of disease risk and response to medications. To build the risk models, access to both large-scale genomic resources and human genetic studies is required. The Taiwan Biobank (TWB) has generated high-coverage, whole-genome sequencing data from 1492 individuals and genome-wide SNP data from 103,106 individuals of Han Chinese ancestry using custom SNP arrays. Principal components analysis of the genotyping data showed that the full range of Han Chinese genetic variation was found in the cohort. The arrays also include thousands of known functional variants, allowing for simultaneous ascertainment of Mendelian disease-causing mutations and variants that affect drug metabolism. We found that 21.2% of the population are mutation carriers of autosomal recessive diseases, 3.1% have mutations in cancer-predisposing genes, and 87.3% carry variants that affect drug response. We highlight how TWB data provide insight into both population history and disease burden, while showing how widespread genetic testing can be used to improve clinical care.
Background: Data from the 1000 Genomes project is quite often used as a reference for human genomic analysis. However, its accuracy needs to be assessed to understand the quality of predictions made using this reference. We present here an assessment of the genotyping, phasing, and imputation accuracy data in the 1000 Genomes project. We compare the phased haplotype calls from the 1000 Genomes project to experimentally phased haplotypes for 28 of the same individuals sequenced using the 10X Genomics platform. Results: We observe that phasing and imputation for rare variants are unreliable, which likely reflects the limited sample size of the 1000 Genomes project data. Further, it appears that using a population-specific reference panel does not improve the accuracy of imputation over using the entire 1000 Genomes data set as a reference panel. We also note that the error rates and trends depend on the choice of definition of error, and hence any error reporting needs to take these definitions into account. Conclusions: The quality of the 1000 Genomes data needs to be considered while using this database for further studies. This work presents an analysis that can be used for these assessments. Electronic supplementary material: The online version of this article (10.1186/s12864-019-5957-x) contains supplementary material, which is available to authorized users.
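The remark that error rates depend on the definition of error can be illustrated with a minimal sketch. It compares two commonly used phasing-error measures, switch error and Hamming error, on toy data. The abstract does not say which definitions the authors used, so the function names, the 0/1 encoding of heterozygous sites, and the example haplotypes below are all assumptions made for illustration.

```python
# Toy comparison of two phasing-error definitions (illustrative only; not
# necessarily the definitions used in the study summarized above).
# Each list covers heterozygous sites only: the value is the allele (0 or 1)
# carried by "haplotype 1" at that site.

def switch_error_rate(truth, inferred):
    """Fraction of adjacent site pairs whose relative phase disagrees."""
    if len(truth) != len(inferred):
        raise ValueError("haplotypes must cover the same sites")
    if len(truth) < 2:
        return 0.0
    # agree[i] is True when the inferred haplotype matches the truth at site i.
    agree = [t == i for t, i in zip(truth, inferred)]
    # A switch occurs wherever agreement flips between neighbouring sites.
    switches = sum(1 for a, b in zip(agree, agree[1:]) if a != b)
    return switches / (len(agree) - 1)

def hamming_error_rate(truth, inferred):
    """Fraction of mismatched sites, minimised over the two haplotype labels."""
    if len(truth) != len(inferred):
        raise ValueError("haplotypes must cover the same sites")
    n = len(truth)
    if n == 0:
        return 0.0
    mismatches = sum(1 for t, i in zip(truth, inferred) if t != i)
    # Swapping the labels of the two haplotypes turns every mismatch into a match.
    return min(mismatches, n - mismatches) / n

truth = [0] * 10
single_switch = [0] * 5 + [1] * 5   # one long-range switch at the midpoint
point_flip = [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]  # one isolated wrongly phased site

for name, hap in [("single_switch", single_switch), ("point_flip", point_flip)]:
    print(name, switch_error_rate(truth, hap), hamming_error_rate(truth, hap))
```

In this toy case the single midpoint switch scores lower on switch error but much higher on Hamming error than the isolated flipped site, so the ranking of the two inferred haplotypes reverses depending on which definition is reported.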
Baboons (genus Papio) are broadly studied in the wild and in captivity. They are widely used as a nonhuman primate model for biomedical studies, and the Southwest National Primate Research Center (SNPRC) at Texas Biomedical Research Institute has maintained a large captive baboon colony for more than 50 yr. Unlike other model organisms, however, the genomic resources for baboons are severely lacking. This has hindered the progress of studies using baboons as a model for basic biology or human disease. Here, we describe a data set of 100 high-coverage whole-genome sequences obtained from the mixed colony of olive (P. anubis) and yellow (P. cynocephalus) baboons housed at the SNPRC. These data provide a comprehensive catalog of common genetic variation in baboons, as well as a fine-scale genetic map. We show how the data can be used to learn about ancestry and admixture and to correct errors in the colony records. Finally, we investigated the consequences of inbreeding within the SNPRC colony and found clear evidence for increased rates of infant mortality and increased homozygosity of putatively deleterious alleles in inbred individuals.
We analyze the role of solvation for enzymatic catalysis in two distinct, artificially designed Kemp Eliminases, KE07 and KE70, and mutated variants that were optimized by laboratory directed evolution. Using a spatially resolved analysis of hydration patterns, intermolecular vibrations, and local solvent entropies, we identify distinct classes of hydration water and follow their changes upon substrate binding and transition state formation for the designed KE07 and KE70 enzymes and their evolved variants. We observe that differences in hydration of the enzymatic systems are concentrated in the active site and undergo significant changes during substrate recruitment. For KE07, directed evolution reduces variations in the hydration of the polar catalytic center upon substrate binding, preserving strong protein-water interactions, while the evolved enzyme variant of KE70 features a more hydrophobic reaction center for which the expulsion of low-entropy water molecules upon substrate binding is substantially enhanced. While our analysis indicates a system-dependent role of solvation for the substrate binding process, we identify more subtle changes in solvation for the transition state formation, which are less affected by directed evolution.
We have used the AMOEBA model to simulate the THz spectra of two zwitterionic amino acids in aqueous solution and compared them with results for the same systems from ab initio molecular dynamics (AIMD) simulations. Overall, we find that the polarizable force field shows promising agreement with AIMD data for both glycine and valine in water. This includes the THz spectral assignments and the mode-specific spectral decomposition into intramolecular solute motions as well as distinct solute-water cross-correlation modes, some of which cannot be captured by non-polarizable force fields that rely on fixed partial charges. This bodes well for future studies that simulate and decompose the THz spectra of larger solutes such as proteins or polymers, for which AIMD studies are presently intractable. Furthermore, we believe that the current study on rather simple aqueous solutions offers a way to systematically investigate the importance of charge transfer, nuclear quantum effects, and the validity of computationally practical density functionals, all of which are needed to fully quantitatively capture complex dynamical motions in the condensed phase.