Background: Allele-specific gene expression (ASE), with the paternal allele more expressed than the maternal allele or vice versa, appears to be a common phenomenon in humans and mice. In other species the extent of ASE is unknown, and even in humans and mice there are several outstanding questions. These include: to what extent is ASE tissue specific; how often does the direction of allele expression imbalance reverse between tissues; how often is only one of the two alleles expressed; is there a genome-wide bias towards expression of the paternal or maternal allele; and finally, do genes that are nearby on a chromosome share the same direction of ASE? Here we use gene expression data (RNASeq) from 18 tissues from a single cow to investigate each of these questions in turn, and then validate some of these findings in two tissues from 20 cows. Results: Between 40 and 100 million sequence reads were generated per tissue across three replicate samples for each of the eighteen tissues from the single cow (the discovery dataset). A bovine gene expression atlas was created (the first from RNASeq data), and differentially expressed genes in each tissue were identified. To analyse ASE, we had access to unambiguously phased genotypes for all heterozygous variants in the cow’s whole genome sequence, where these variants were homozygous in the whole genome sequence of her sire, and as a result we were able to map reads to parental genomes, to determine SNP and genes showing ASE in each tissue. In total 25,251 heterozygous SNP within 7985 genes were tested for ASE in at least one tissue. ASE was pervasive: 89 % of genes tested had significant ASE in at least one tissue. This large proportion of genes displaying ASE was confirmed in the two tissues in a validation dataset. For individual tissues the proportion of genes showing significant ASE varied from as low as 8–16 % of those tested in thymus to as high as 71–82 % of those tested in lung.
There were a number of cases where the direction of allele expression imbalance reversed between tissues. For example, the gene SPTY2D1 showed almost complete paternal allele expression in kidney and thymus, and almost complete maternal allele expression in the brain caudal lobe and brain cerebellum. Monoallelic expression (MAE) was common, with 1349 of 4856 genes (28 %) tested with more than one heterozygous SNP showing MAE. Across all tissues, 54.17 % of all genes with ASE favoured the paternal allele. Genes that are closely linked on the chromosome were more likely to show higher expression of the same allele (paternal or maternal) than expected by chance. We identified several long runs of neighbouring genes that showed either paternal or maternal ASE; one example was five adjacent genes (GIMAP8, GIMAP7 copy 1, GIMAP4, GIMAP7 copy 2 and GIMAP5) that showed almost exclusive paternal expression in brain caudal lobe. Conclusions: Investigating the extent of ASE across 18 bovine tissues in one cow and two tissues in 20 cows demonstrated 1) ASE is pervasive in cattle, 2) the ASE is often MAE but ra...
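The abstract above does not state which statistical test was used to call ASE at a heterozygous SNP. As a minimal sketch, assuming an exact two-sided binomial test of paternal versus maternal read counts against a 50:50 expectation (a common choice, not necessarily the authors' method), the idea can be illustrated as follows. The function names and the read counts are hypothetical.

```python
from math import exp, lgamma, log


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n trials (requires 0 < p < 1).

    Computed in log space so that large read counts do not overflow.
    """
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(1.0 - p))


def ase_binomial_p(paternal_reads: int, maternal_reads: int,
                   p_null: float = 0.5) -> float:
    """Two-sided exact binomial p-value for allelic imbalance at one SNP.

    Assumption: under no ASE, each read comes from the paternal allele
    with probability p_null. The p-value sums the probabilities of all
    outcomes that are no more likely than the observed one.
    """
    n = paternal_reads + maternal_reads
    if n == 0:
        return 1.0  # no reads, no evidence either way
    probs = [binomial_pmf(k, n, p_null) for k in range(n + 1)]
    observed = probs[paternal_reads]
    # Small relative tolerance so symmetric outcomes are not lost to rounding.
    p_value = sum(pr for pr in probs if pr <= observed * (1.0 + 1e-7))
    return min(1.0, p_value)


def paternal_fraction(paternal_reads: int, maternal_reads: int) -> float:
    """Share of reads carrying the paternal allele (0.5 means balanced)."""
    return paternal_reads / (paternal_reads + maternal_reads)


# Hypothetical read counts for a single SNP in a single tissue.
print(paternal_fraction(45, 15), ase_binomial_p(45, 15))
```

With thousands of SNP tested per tissue, a real analysis would also need some form of multiple-testing correction, which this sketch leaves out.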
Background: Whole-genome sequence (WGS) data could contain information on genetic variants at or in high linkage disequilibrium with causative mutations that underlie the genetic variation of polygenic traits. Thus far, genomic prediction accuracy has shown limited increase when using such information in dairy cattle studies, in which one or few breeds with limited diversity predominate. The objective of our study was to evaluate the accuracy of genomic prediction in a multi-breed Australian sheep population of relatively less related target individuals, when using information on imputed WGS genotypes. Methods: Between 9626 and 26,657 animals with phenotypes were available for nine economically important sheep production traits and all had WGS imputed genotypes. About 30% of the data were used to discover predictive single nucleotide polymorphisms (SNPs) based on a genome-wide association study (GWAS) and the remaining data were used for training and validation of genomic prediction. Prediction accuracy using selected variants from imputed sequence data was compared to that using a standard array of 50k SNP genotypes, with both genomic best linear unbiased prediction (GBLUP) and Bayesian methods (BayesR/BayesRC). Accuracy of genomic prediction was evaluated in two independent populations that were each lowly related to the training set, one being purebred Merino and the other crossbred Border Leicester x Merino sheep. Results: A substantial improvement in prediction accuracy was observed when selected sequence variants were fitted alongside 50k genotypes as a separate variance component in GBLUP (2GBLUP) or in Bayesian analysis as a separate category of SNPs (BayesRC). From an average accuracy of 0.27 in both validation sets for the 50k array, the average absolute increase in accuracy across traits with 2GBLUP was 0.083 and 0.073 for purebred and crossbred animals, respectively, whereas with BayesRC it was 0.102 and 0.087.
The average gain in accuracy was smaller when selected sequence variants were treated in the same category as 50k SNPs. Very little improvement over 50k prediction was observed when using all WGS variants. Conclusions: Accuracy of genomic prediction in diverse sheep populations increased substantially by using variants selected from whole-genome sequence data based on an independent multi-breed GWAS, when compared to genomic prediction using standard 50k genotypes.
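To put the reported gains in context, the short calculation below adds each average absolute increase to the 0.27 baseline for the 50k array and expresses it as a relative gain. Only the five numbers come from the abstract; the variable and function names are illustrative, and because the figures are averages across traits, the implied accuracies are approximate.

```python
# Figures quoted in the abstract above.
BASELINE_50K = 0.27
ABSOLUTE_GAINS = {
    ("2GBLUP", "purebred"): 0.083,
    ("2GBLUP", "crossbred"): 0.073,
    ("BayesRC", "purebred"): 0.102,
    ("BayesRC", "crossbred"): 0.087,
}


def implied_accuracy(baseline: float, gain: float) -> float:
    """Average accuracy implied by a baseline plus an absolute gain."""
    return baseline + gain


def relative_gain_percent(baseline: float, gain: float) -> float:
    """Absolute gain expressed as a percentage of the baseline accuracy."""
    return 100.0 * gain / baseline


for (method, population), gain in ABSOLUTE_GAINS.items():
    accuracy = implied_accuracy(BASELINE_50K, gain)
    percent = relative_gain_percent(BASELINE_50K, gain)
    print(f"{method:8s} {population:9s} accuracy ~ {accuracy:.3f} (+{percent:.0f}%)")
```

On these averages, the gains correspond to roughly a quarter to a third or more of the baseline accuracy, which is why the abstract describes the improvement as substantial.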
Background: Mammalian phenotypes are shaped by numerous genome variants, many of which may regulate gene transcription or RNA splicing. To identify variants with regulatory functions in cattle, an important economic and model species, we used sequence variants to map a type of expression quantitative trait loci (expression QTLs) that are associated with variations in RNA splicing, i.e., sQTLs. To further the understanding of regulatory variants, sQTLs were compared with two other types of expression QTLs, 1) variants associated with variations in gene expression, i.e., geQTLs and 2) variants associated with variations in exon expression, i.e., eeQTLs, in different tissues. Results: Using whole genome and RNA sequence data from four tissues of over 200 cattle, sQTLs identified using exon inclusion ratios were verified by matching their effects on adjacent intron excision ratios. sQTLs contained the highest percentage of variants that are within the intronic region of genes and contained the lowest percentage of variants that are within intergenic regions, compared to eeQTLs and geQTLs. Many geQTLs and sQTLs are also detected as eeQTLs. Many expression QTLs, including sQTLs, were significant in all four tissues and had a similar effect in each tissue. To verify such expression QTL sharing between tissues, variants surrounding (±1 Mb) the exon or gene were used to build local genomic relationship matrices (LGRM) and estimate genetic correlations between tissues. For many exons, the splicing and expression level was determined by the same cis additive genetic variance in different tissues. Thus, an effective but simple-to-implement meta-analysis combining information from three tissues is introduced to increase power to detect and validate sQTLs. sQTLs and eeQTLs together were more enriched for variants associated with cattle complex traits, compared to geQTLs.
Several putative causal mutations were identified, including an sQTL at Chr6:87392580 within the 5th exon of kappa casein (CSN3) associated with milk production traits. Conclusions: Using novel analytical approaches, we report the first identification of numerous bovine sQTLs which are extensively shared between multiple tissue types. The significant overlaps between bovine sQTLs and complex traits QTL highlight the contribution of regulatory mutations to phenotypic variations. Electronic supplementary material: The online version of this article (10.1186/s12864-018-4902-8) contains supplementary material, which is available to authorized users.
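The abstract above describes building local genomic relationship matrices from variants within ±1 Mb of an exon or gene. The first step, selecting the variants that fall in that window, can be sketched as below. This is a minimal illustration, not the authors' pipeline: the function name, the tuple layout and all coordinates are made up for the example.

```python
from typing import List, Tuple

Variant = Tuple[str, int]  # (chromosome, 1-based position)


def variants_in_cis_window(variants: List[Variant], chrom: str,
                           start: int, end: int,
                           flank: int = 1_000_000) -> List[Variant]:
    """Return variants on `chrom` within `flank` bp of the feature [start, end].

    The default flank of 1 Mb matches the window size given in the abstract;
    both window edges are treated as inclusive (an assumption).
    """
    lower = max(1, start - flank)
    upper = end + flank
    return [v for v in variants if v[0] == chrom and lower <= v[1] <= upper]


# Toy variants around a hypothetical exon at 1:5,000,000-5,000,200.
toy_variants = [
    ("1", 3_999_999),   # just outside the window
    ("1", 4_000_000),   # on the lower edge, kept
    ("1", 5_000_100),   # inside the exon, kept
    ("1", 6_000_200),   # on the upper edge, kept
    ("1", 6_000_201),   # just outside the window
    ("2", 5_000_100),   # same position, different chromosome
]
print(variants_in_cis_window(toy_variants, "1", 5_000_000, 5_000_200))
```

The genotypes at the selected variants would then feed the relationship matrix for that exon or gene, a step not shown here.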
Background: The use of whole-genome sequence (WGS) data for genomic prediction and association studies is highly desirable because the causal mutations should be present in the data. The sequencing of 935 sheep from a range of breeds provides the opportunity to impute sheep genotyped with single nucleotide polymorphism (SNP) arrays to WGS. This study evaluated the accuracy of imputation from SNP genotypes to WGS using this reference population of 935 sequenced sheep. Results: The accuracy of imputation from the Ovine Infinium® HD BeadChip SNP (~ 500 k) to WGS was assessed for three target breeds: Merino, Poll Dorset and F1 Border Leicester × Merino. Imputation accuracy was highest for the Poll Dorset breed, although there were more Merino individuals in the sequenced reference population than Poll Dorset individuals. In addition, empirical imputation accuracies were higher (by up to 1.7%) when using larger multi-breed reference populations compared to using a smaller single-breed reference population. The mean accuracy of imputation across target breeds using the Minimac3 or the FImpute software was 0.94. The empirical imputation accuracy varied considerably across the genome; six chromosomes carried regions of one or more Mb with a mean imputation accuracy of < 0.7. Imputation accuracy in five variant annotation classes ranged from 0.87 (missense) up to 0.94 (intronic variants), where lower accuracy corresponded to higher proportions of rare alleles. The imputation quality statistic reported from Minimac3 (R2) had a clear positive relationship with the empirical imputation accuracy. Therefore, by first discarding imputed variants with an R2 below 0.4, the mean empirical accuracy across target breeds increased to 0.97.
Although accuracy of genomic prediction was less affected by filtering on R2 in a multi-breed population of sheep with imputed WGS, the genomic heritability clearly tended to be lower when using variants with an R2 ≤ 0.4. Conclusions: The mean imputation accuracy was high for all target breeds and was increased by combining smaller breed sets into a multi-breed reference. We found that the Minimac3 software imputation quality statistic (R2) was a useful indicator of empirical imputation accuracy, enabling removal of very poorly imputed variants before downstream analyses. Electronic supplementary material: The online version of this article (10.1186/s12711-018-0443-5) contains supplementary material, which is available to authorized users.
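The filtering step described above, dropping imputed variants whose Minimac3 R2 falls below 0.4 before averaging empirical accuracy, can be sketched in a few lines. The record layout, the field names and the toy values are hypothetical; only the 0.4 threshold comes from the abstract.

```python
from statistics import mean
from typing import Dict, List

VariantRecord = Dict[str, float]


def filter_by_r2(variants: List[VariantRecord],
                 threshold: float = 0.4) -> List[VariantRecord]:
    """Keep variants whose imputation quality statistic is at least `threshold`.

    Variants with R2 below the threshold are discarded, as in the abstract.
    """
    return [v for v in variants if v["r2"] >= threshold]


def mean_empirical_accuracy(variants: List[VariantRecord]) -> float:
    """Average per-variant empirical imputation accuracy."""
    return mean(v["accuracy"] for v in variants)


# Toy records: "r2" is the reported quality statistic, "accuracy" the
# empirical accuracy measured against known genotypes (values invented).
toy = [
    {"r2": 0.95, "accuracy": 0.98},
    {"r2": 0.80, "accuracy": 0.90},
    {"r2": 0.30, "accuracy": 0.50},
    {"r2": 0.10, "accuracy": 0.20},
]
kept = filter_by_r2(toy)
print(len(kept), mean_empirical_accuracy(toy), mean_empirical_accuracy(kept))
```

Because R2 is available without any validation genotypes, a filter like this can be applied to every imputed animal, which is the practical point the abstract makes.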
Methane (CH4) is a product of enteric fermentation in ruminants, and it represents around 17% of global CH4 emissions. There has been substantial effort from the livestock scientific community toward tools that can help reduce this percentage. One approach is to select for lower emitting animals. To achieve this, accurate genetic parameters and identification of the genomic basis of CH4 traits are required. Therefore, the objectives of this study were 1) to perform a genomewide association study to identify SNP associated with several CH4 traits in Angus beef cattle (1,020 animals) and validate them in a lactating Holstein population (population 1 [POP1]; 205 animals); 2) to validate significant SNP for DMI and weight at test (WT) from a second Holstein population, from a previous study (population 2 [POP2]; 903 animals), in an Angus population; and 3) to evaluate 2 different residual CH4 traits and determine if the genes associated with CH4 also control residual CH4 traits. Phenotypes calculated for the genotyped Angus population included CH4 production (MeP), CH4 yield (MeY), CH4 intensity (MI), DMI, and WT. The Holstein population (POP1) was multiparous, with phenotypes on CH4 traits (MeP, MeY, and MI) plus genotypes. Additionally, 2 CH4 traits, residual genetic CH4 (RGM) and residual phenotypic CH4 (RPM), were calculated by adjusting MeP for DMI and WT. Estimated heritabilities in the Angus population were 0.30, 0.19, and 0.15 for MeP, RGM, and RPM, respectively, and genetic correlations of MeP with DMI and WT were 0.83 and 0.80, respectively. Estimated heritabilities in Holstein POP1 were 0.23, 0.30, and 0.42 for MeP, MeY, and MI, respectively. Strong associations with MeP were found on chromosomes 4, 12, 14, 20, and 30 at P < 0.001, and those chromosomes also had significant SNP for DMI in Holstein POP1. In the Angus population, the number of significant SNP for MeP at P < 0.005 was 3,304, and approximately 630 of those SNP also were important for DMI and WT.
When a set (approximately 3,300) of significant SNP for DMI and WT in the Angus population was used to estimate genetic parameters for MeP and MeY in Holstein POP1, the genetic variance and, consequently, the heritability slightly increased, meaning that the genetic variation is largely captured by these SNP. Residual traits could be a good option to include in the breeding goal, as this would facilitate selection for lower emitting animals without compromising DMI and WT.
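The abstract says only that the residual traits were obtained by adjusting MeP for DMI and WT. One way to picture the phenotypic version is as the residual from an ordinary least-squares regression of MeP on DMI and WT. That reading is an assumption made for this sketch, as are the function name and the toy records; the study's actual models may differ.

```python
from typing import List


def residual_trait(mep: List[float], dmi: List[float],
                   wt: List[float]) -> List[float]:
    """Residuals of MeP after least-squares adjustment for DMI and WT.

    Assumed model: MeP = b0 + b1 * DMI + b2 * WT + residual. Centring every
    variable removes the intercept, leaving a 2 x 2 system for b1 and b2.
    """
    n = len(mep)
    mean_y, mean_1, mean_2 = sum(mep) / n, sum(dmi) / n, sum(wt) / n
    y = [v - mean_y for v in mep]
    x1 = [v - mean_1 for v in dmi]
    x2 = [v - mean_2 for v in wt]
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("DMI and WT are collinear or constant")
    b1 = (s1y * s22 - s12 * s2y) / det  # Cramer's rule
    b2 = (s11 * s2y - s12 * s1y) / det
    return [c - b1 * a - b2 * b for c, a, b in zip(y, x1, x2)]


# Toy records for five animals (all values invented for illustration).
dmi = [10.0, 12.0, 14.0, 11.0, 13.0]
wt = [400.0, 450.0, 420.0, 480.0, 430.0]
noise = [1.0, -1.0, 0.5, -0.5, 0.0]
mep = [10.0 + 2.0 * d + 0.5 * w + e for d, w, e in zip(dmi, wt, noise)]
print([round(r, 3) for r in residual_trait(mep, dmi, wt)])
```

By construction these residuals are uncorrelated with DMI and WT at the phenotypic level, which is the property that lets selection on a residual trait leave intake and weight unchanged.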
Background: Thousands of years of natural and artificial selection have resulted in indigenous cattle breeds that are well-adapted to the environmental challenges of their local habitat and are therefore considered valuable genetic resources. Understanding the genetic background of such adaptation processes can help us design effective breeding objectives to preserve local breeds and improve commercial cattle. To identify regions under putative selection, GGP HD 150 K single nucleotide polymorphism (SNP) arrays were used to genotype 106 individuals representing five Swedish breeds that are native to different regions, ranging from areas with a subarctic cold climate in the north and mountainous west to those with a continental climate in the more densely populated southern regions. Results: Five statistics were incorporated within a framework known as de-correlated composite of multiple signals (DCMS) to detect signatures of selection. The obtained p-values were adjusted for multiple testing (FDR < 5%), and significant genomic regions were identified. Annotation of genes in these regions revealed various verified and novel candidate genes that are associated with a diverse range of traits, including, e.g., high altitude adaptation and response to hypoxia (DCAF8,
Residual feed intake (RFI) is a measure of the efficiency of animals in feed utilization. The accuracies of GEBV for RFI could be improved by increasing the size of the reference population. Combining RFI records of different breeds is a way to do that. The aims of this study were to 1) develop a method for calculating GEBV in a multibreed population and 2) improve the accuracies of GEBV by using SNP associated with RFI. An alternative method for calculating accuracies of GEBV using genomic BLUP (GBLUP) equations is also described and compared to cross-validation tests. The dataset included RFI records and 606,096 SNP genotypes for 5,614 Bos taurus animals including 842 Holstein heifers and 2,009 Australian and 2,763 Canadian beef cattle. A range of models were tested for combining genotype and phenotype information from different breeds and the best model included an overall effect of each SNP, an effect of each SNP specific to a breed, and a small residual polygenic effect defined by the pedigree. In this model, the Holsteins and some Angus cattle were combined into 1 "breed class" because they were the only cattle measured for RFI at an early age (6-9 mo of age) and were fed a similar diet. The average empirical accuracy (0.31), estimated by calculating the correlation between GEBV and actual phenotypes divided by the square root of estimated heritability in 5-fold cross-validation tests, was close to that expected using the GBLUP equations (0.34). The average empirical and expected accuracies were 0.30 and 0.31, respectively, when the GEBV were estimated for each breed separately. Therefore, the across-breed reference population increased the accuracy of GEBV slightly, although the gain was greater for breeds with a smaller number of individuals in the reference population (0.08 in Murray Grey and 0.11 in Hereford for empirical accuracy).
In a second approach, SNP that were significantly (P < 0.001) associated with RFI in the beef cattle genomewide association studies were used to create an auxiliary genomic relationship matrix for estimating GEBV in Holstein heifers. The empirical (and expected) accuracy of GEBV within Holsteins increased from 0.33 (0.35) to 0.39 (0.36) and improved even more to 0.43 (0.50) when using a multibreed reference population. Therefore, a multibreed reference population is a useful resource to find SNP with a greater than average association with RFI in 1 breed and use them to estimate GEBV in another breed.
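The empirical accuracy used in this abstract is defined as the correlation between GEBV and actual phenotypes divided by the square root of the estimated heritability. A minimal sketch of that calculation follows; the function names and the toy values are illustrative, and the cross-validation folds that would supply the GEBV are not shown.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def empirical_accuracy(gebv: Sequence[float], phenotypes: Sequence[float],
                       heritability: float) -> float:
    """Correlation of GEBV with phenotypes, scaled by sqrt(heritability).

    Dividing by sqrt(h2) accounts for phenotypes being a noisy measure of
    the true breeding value, so the result estimates accuracy against it.
    """
    return pearson(gebv, phenotypes) / sqrt(heritability)


# Toy validation fold (invented values, for illustration only).
gebv = [0.2, -0.1, 0.4, 0.0, -0.3]
phenotypes = [1.1, 0.4, 0.6, -0.2, -0.9]
print(empirical_accuracy(gebv, phenotypes, heritability=0.36))
```

Because the scaling divides by a number below 1, a modest raw correlation can translate into a noticeably higher accuracy; with a small validation set and low heritability the estimate can even exceed 1, so it is best read as an average over folds.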
Background: The mutations changing the expression level of a gene, or expression quantitative trait loci (eQTL), can be identified by testing the association between genetic variants and gene expression in multiple individuals (eQTL mapping), or by comparing the expression of the alleles in a heterozygous individual (allele specific expression or ASE analysis). The aims of the study were to find and compare ASE and local eQTL in 4 bovine RNA-sequencing (RNA-Seq) datasets, validate them in an independent ASE study and investigate if they are associated with complex trait variation. Results: We present a novel method for distinguishing between ASE driven by polymorphisms in cis and parent of origin effects. We found that single nucleotide polymorphisms (SNPs) driving ASE are also often local eQTL and therefore presumably cis eQTL. These SNPs often, but not always, affect gene expression in multiple tissues and, when they do, the allele increasing expression is usually the same. However, there were systematic differences between ASE and local eQTL and between tissues and breeds. We also found that SNPs significantly associated with gene expression (p < 0.001) were likely to influence some complex traits (p < 0.001), which means that some mutations influence variation in complex traits by changing the expression level of genes. Conclusion: We conclude that ASE detects phenomena that overlap with local eQTL, but there are also systematic differences between the SNPs discovered by the two methods. Some mutations influencing complex traits are actually eQTL and can be discovered using RNA-Seq, including eQTL in the genes CAST, CAPN1, LCORL and LEPROTL1. Electronic supplementary material: The online version of this article (10.1186/s12864-018-5181-0) contains supplementary material, which is available to authorized users.