The relationships between the levels of transcripts and the levels of the proteins they encode have not been examined comprehensively in mammals, although previous work in plants and yeast suggests a surprisingly modest correlation. We have examined this issue using a genetic approach in which natural variations were used to perturb both transcript levels and protein levels among inbred strains of mice. We quantified over 5,000 peptides and over 22,000 transcripts in livers of 97 inbred and recombinant inbred strains and focused on the 7,185 most heritable transcripts and 486 most reliable proteins. The transcript levels were quantified by microarray analysis in three replicates, and the proteins were quantified by liquid chromatography–mass spectrometry using an O(18)-reference-based isotope labeling approach. We show that the levels of transcripts and proteins correlate significantly for only about half of the genes tested, with an average correlation of 0.27, and that the correlations varied depending on the cellular location and biological function of the gene. We examined technical and biological factors that could contribute to the modest correlation. For example, differential splicing clearly affects the analyses for certain genes, but, based on deep sequencing, it does not substantially contribute to the overall estimate of the correlation. We also employed genome-wide association analyses to map loci controlling both transcript and protein levels. Surprisingly, little overlap was observed between the protein- and transcript-mapped loci. We have typed numerous clinically relevant traits among the strains, including adiposity, lipoprotein levels, and tissue parameters. Using correlation analysis, we found that few clinical trait relationships are preserved between the protein and mRNA gene products and that the majority of such relationships are specific to either the protein levels or the transcript levels. Surprisingly, transcript levels were more strongly correlated with clinical traits than protein levels. In light of the widespread use of high-throughput technologies in both clinical and basic research, the results presented have practical as well as basic implications.
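As a rough illustration of the per-gene transcript-protein correlation summarized above, the following Python sketch correlates transcript and protein levels across strains for each gene and then averages the result. The gene names and values are hypothetical toy data, and the choice of Pearson correlation is an assumption made here for illustration; the abstract does not state which statistic was used.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx, my = mean(x), mean(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    denom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sum(a * b for a, b in zip(dx, dy)) / denom


def per_gene_correlations(transcripts, proteins):
    """Correlate transcript and protein levels across strains for each shared gene."""
    shared = sorted(set(transcripts) & set(proteins))
    return {gene: pearson(transcripts[gene], proteins[gene]) for gene in shared}


# Hypothetical toy data: one value per strain, same strain order in both tables.
transcripts = {"geneA": [1.0, 2.0, 3.0, 4.0], "geneB": [1.0, 2.0, 3.0, 4.0]}
proteins = {"geneA": [2.0, 4.0, 6.0, 8.0], "geneB": [4.0, 3.0, 2.0, 1.0]}

r_by_gene = per_gene_correlations(transcripts, proteins)
average_r = mean(list(r_by_gene.values()))
```

In this toy case one gene is perfectly correlated and the other perfectly anti-correlated, so the average is close to zero even though each gene shows a strong relationship. This is one reason an average correlation is usually read alongside the per-gene values.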
Next‐generation sequencing has aided characterization of genomic variation. While whole‐genome sequencing may capture all possible mutations, whole‐exome sequencing remains cost‐effective and captures most phenotype‐altering mutations. Initial strategies for exome enrichment utilized a hybridization‐based capture approach. Recently, amplicon‐based methods were designed to simplify preparation and utilize smaller DNA inputs. We evaluated two hybridization capture‐based and two amplicon‐based whole‐exome sequencing approaches, utilizing both Illumina and Ion Torrent sequencers, comparing on‐target alignment, uniformity, and variant calling. While the amplicon methods had higher on‐target rates, the hybridization capture‐based approaches demonstrated better uniformity. All methods identified many of the same single‐nucleotide variants, but each amplicon‐based method missed variants detected by the other three methods and reported additional variants discordant with all three other technologies. Many of these potential false positives or negatives appear to result from limited coverage, low variant frequency, vicinity to read starts/ends, or the need for platform‐specific variant calling algorithms. All methods demonstrated effective copy‐number variant calling when evaluated against a single‐nucleotide polymorphism array. This study illustrates some differences between whole‐exome sequencing approaches, highlights the need for selecting appropriate variant calling based on capture method, and will aid laboratories in selecting their preferred approach.
This article is available online at http://www.jlr.org.

... provided insight into the many different genetic factors that contribute to the disease (3-7). Using data acquired from thousands of patients and healthy controls, these studies have collectively identified 35 genetic loci associated with CAD (8, 9). Although 10 of the recently identified CAD risk loci work through known risk factors, such as lipids and blood pressure, this is not the case for the majority of loci (3), implying that key pathways leading to coronary atherosclerosis are yet to be discovered. Given their firm association with disease risk, novel CAD loci provide a solid foundation to unravel disease networks. As GWAS results do not provide functional information on the loci identified, additional studies are needed to determine the candidate genes and their role in disease. The immediate challenges associated with the validation of GWAS candidate genes include identifying the likely cell types in which the risk variants and genes function, determining which of the multiple candidate genes represented by each CAD locus contribute to disease, and defining the functions of poorly annotated candidate genes. Expression analyses of candidate genes under defined conditions in model organisms and their associations with risk variants in human samples provide a powerful way to address these issues and predict the causal candidate genes. It is likely that at least some of the novel genes (that is, those not affecting known risk factors, such as plasma lipids or blood pressure) are perturbing vessel wall or inflammatory cell functions. Endothelial cells (EC) play a critical role in the initiation and progression of atherosclerosis.

Abstract Recent genome-wide association studies (GWAS) have identified 35 loci that significantly associate with coronary artery disease (CAD) susceptibility. The majority of the genes represented in these loci have not previously been studied in the context of atherosclerosis. To characterize the roles of these candidate genes in the vessel wall, we determined their expression levels in endothelial, smooth muscle, and macrophage cells isolated from healthy, prelesioned, and lesioned mouse aortas. We also performed expression quantitative trait locus (eQTL) mapping of these genes in human endothelial cells under control and proatherogenic conditions. Of the 57 genes studied, 31 were differentially expressed in one or more cell types in disease state in mice, and the expression levels of 8 were significantly associated with the CAD SNPs in human cells, 7 of which were also differentially expressed in mice. By integrating human and mouse results, we predict that PPAP2B, GALNT4, MAPKAPK5, TCTN1, SRR, SNF8, and ICAM1 play a causal role in the susceptibility to atherosclerosis through a role in the vasculature. Additionally, we highlight the genetic complexity of a subset of CAD loci through the differential expression of multiple candidate genes per locus and the involvement of genes that lie outside linkage disequilib...
The genetics of messenger RNA (mRNA) expression has been extensively studied in humans and other organisms, but little is known about genetic factors contributing to microRNA (miRNA) expression. We examined natural variation of miRNA expression in adipose tissue in a population of 200 men who have been carefully characterized for metabolic syndrome (MetSyn) phenotypes as part of the Metabolic Syndrome in Men (METSIM) study. We genotyped the subjects using high-density single-nucleotide polymorphism microarrays and quantified the mRNA abundance using genome-wide expression arrays and miRNA abundance using next-generation sequencing. We reliably quantified 356 miRNA species that were expressed in human adipose tissue, a limited number of which made up most of the expressed miRNAs. We mapped the miRNA abundance as an expression quantitative trait and determined cis regulation of expression for nine of the miRNAs and of the processing of one miRNA (miR-28). The degree of genetic variation of miRNA expression was substantially less than that of mRNAs. For the majority of the miRNAs, genetic regulation of expression was independent of the expression of mRNA from which the miRNA is transcribed. We also showed that for 108 miRNAs, mapped reads displayed widespread variation from the canonical sequence. We found a total of 24 miRNAs to be significantly associated with MetSyn traits. We suggest a regulatory role for miR-204-5p which was predicted to inhibit acetyl coenzyme A carboxylase β, a key fatty acid oxidation enzyme that has been shown to play a role in regulating body fat and insulin resistance in adipose tissue.
Targeted, capture-based DNA sequencing is a cost-effective method to focus sequencing on a coding region or other customized region of the genome. There are multiple targeted sequencing methods available, but none has been systematically investigated and compared. We evaluated four commercially available custom-targeted DNA technologies for next-generation sequencing with respect to on-target sequencing, uniformity, and ability to detect single-nucleotide variations (SNVs) and copy number variations. The technologies that used sonication for DNA fragmentation displayed impressive uniformity of capture, whereas the others had shorter preparation times, but sacrificed uniformity. One of those technologies, which uses transposase for DNA fragmentation, has a drawback requiring sample pooling, and the last one, which uses restriction enzymes, has a limitation depending on restriction enzyme digest sites. Although all technologies displayed some level of concordance for calling SNVs, the technologies that require restriction enzymes or transposase missed several SNVs largely because of the lack of coverage. All technologies performed well for copy number variation calling when compared to single-nucleotide polymorphism arrays. These results enable laboratories to compare these methods to make informed decisions for their intended applications.