Identifying regulatory elements and revealing their role in gene expression regulation remains a central goal of plant genome research. We exploited the detailed genomic sequencing information of a large number of Arabidopsis (Arabidopsis thaliana) accessions to characterize known and to identify novel cis-regulatory elements in gene promoter regions of Arabidopsis by relying on conservation as the hallmark signal of functional relevance. Based on the genomic layout and the obtained density profiles of single-nucleotide polymorphisms (SNPs) in sequence regions upstream of transcription start sites, the average length of promoter regions in Arabidopsis could be established at 500 bp. Genes associated with high degrees of variability of their respective upstream regions are preferentially involved in environmental response and signaling processes, while low levels of promoter SNP density are common among housekeeping genes. Known cis-elements were found to exhibit a lower SNP density than sequence regions not associated with known motifs. For 15 known cis-element motifs, strong positional preferences relative to the transcription start site were detected based on their promoter SNP density profiles. Five novel candidate cis-element motifs were identified as consensus motifs of 17 sequence hexamers exhibiting increased sequence conservation combined with evidence of positional preferences, annotation information, and functional relevance for inducing correlated gene expression. Our study demonstrates that the currently available resolution of SNP data offers novel ways for the identification of functional genomic elements and the characterization of gene promoter sequences.
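The SNP density profile described in the abstract above can be illustrated with a minimal sketch. It assumes SNP positions are already expressed as distances (in bp) upstream of each gene's transcription start site; the function name, the 50 bp bin size, and the toy data are illustrative assumptions, not the authors' pipeline.

```python
from collections import Counter


def snp_density_profile(snp_offsets_per_gene, promoter_length=500, bin_size=50):
    """Return the mean number of SNPs per gene in each upstream-distance bin.

    snp_offsets_per_gene: list of lists; each inner list holds the distances
    (bp upstream of the transcription start site) at which one gene carries
    a SNP. Bin 0 is the bin closest to the TSS.
    """
    n_bins = promoter_length // bin_size
    counts = Counter()
    for offsets in snp_offsets_per_gene:
        for distance in offsets:
            # Ignore SNPs outside the promoter window under study.
            if 0 <= distance < n_bins * bin_size:
                counts[distance // bin_size] += 1
    n_genes = len(snp_offsets_per_gene)
    if n_genes == 0:
        return [0.0] * n_bins
    return [counts[b] / n_genes for b in range(n_bins)]


# Toy input (hypothetical): three genes with a handful of upstream SNPs.
toy = [[10, 60, 480], [20, 70], [30]]
profile = snp_density_profile(toy, promoter_length=500, bin_size=50)
print(profile)
```

Averaging per gene keeps profiles comparable between gene sets of different sizes, for example housekeeping genes versus environmental-response genes.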
The imputation from lower density SNP chip genotypes to whole-genome sequence level is an established approach to generate high density genotypes for many individuals. Imputation accuracy is dependent on many factors and for small cattle populations such as the endangered German Black Pied cattle (DSN), determining the optimal imputation strategy is especially challenging since only a low number of high density genotypes is available. In this paper, the accuracy of imputation was explored with regard to (1) phasing of the target population and the reference panel for imputation, (2) comparison of a 1-step imputation approach, where 50 k genotypes are directly imputed to sequence level, to a 2-step imputation approach that used an intermediate step imputing first to 700 k and subsequently to sequence level, (3) the software tools Beagle and Minimac, and (4) the size and composition of the reference panel for imputation. Analyses were performed for 30 DSN and 30 Holstein Friesian cattle available from the 1000 Bull Genomes Project. Imputation accuracy was assessed using a leave-one-out cross validation procedure. We observed that phasing of the target populations and the reference panels affects the imputation accuracy significantly. Minimac reached higher accuracy when imputing using small reference panels, while Beagle performed better with larger reference panels. In contrast to previous research, we found that when a low number of animals is available at the intermediate imputation step, the 1-step imputation approach yielded higher imputation accuracy compared to a 2-step imputation. Overall, the size of the reference panel for imputation is the most important factor leading to higher imputation accuracy, although using a larger reference panel consisting of a related but different breed (Holstein Friesian) significantly reduced imputation accuracy.
Our findings provide specific recommendations for populations with a limited number of high density genotyped or sequenced animals available, such as DSN. The overall recommendations when imputing a small population are to (1) use a large reference panel of the same breed, (2) use a large reference panel consisting of diverse breeds, or (3) when a large reference panel is not available, use a smaller same-breed reference panel without including a different related breed.
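The leave-one-out cross validation used to assess imputation accuracy can be sketched as follows. This is a simplified illustration under stated assumptions: genotypes are coded 0/1/2, accuracy is measured as genotype concordance, and `most_common_genotype_imputer` is a hypothetical stand-in that ignores the target animal's low-density genotypes. In the study the imputation itself is done by Beagle or Minimac, which this sketch does not call.

```python
def concordance(true_genotypes, imputed_genotypes):
    """Fraction of markers where the imputed genotype equals the true one."""
    matches = sum(t == i for t, i in zip(true_genotypes, imputed_genotypes))
    return matches / len(true_genotypes)


def most_common_genotype_imputer(reference_panel):
    """Hypothetical stand-in imputer: per marker, the modal panel genotype.

    Ties are broken in favour of the smallest genotype code.
    """
    n_markers = len(reference_panel[0])
    imputed = []
    for m in range(n_markers):
        column = [animal[m] for animal in reference_panel]
        imputed.append(max(sorted(set(column)), key=column.count))
    return imputed


def leave_one_out_accuracy(genotypes, impute):
    """Hold out each animal in turn, impute it from the rest, average accuracy."""
    scores = []
    for i, target in enumerate(genotypes):
        reference = genotypes[:i] + genotypes[i + 1:]
        scores.append(concordance(target, impute(reference)))
    return sum(scores) / len(scores)


# Toy panel (hypothetical): four animals, four markers, genotypes coded 0/1/2.
panel = [
    [0, 1, 2, 0],
    [0, 1, 2, 1],
    [0, 1, 1, 0],
    [0, 2, 2, 0],
]
accuracy = leave_one_out_accuracy(panel, most_common_genotype_imputer)
print(accuracy)
```

Because each animal is removed from the reference before it is imputed, no animal contributes to its own prediction, which matters most when the panel is as small as 30 animals.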
Background: Gastrointestinal nematodes (GIN), liver flukes (Fasciola hepatica) and bovine lungworms (Dictyocaulus viviparus) are the most important parasitic agents in pastured dairy cattle. Endoparasite infections are associated with reduced milk production and detrimental impacts on female fertility, contributing to economic losses in affected farms. In quantitative-genetic studies, the heritabilities for GIN and F. hepatica were moderate, encouraging studies on genomic scales. Genome-wide association studies (GWAS) based on dense single nucleotide polymorphism (SNP) marker panels allow exploration of the underlying genomic architecture of complex disease traits. The current GWAS combined the identification of potential candidate genes with pathway analyses to obtain deeper insights into bovine immune response and the mechanisms of resistance against endoparasite infections. Results: A 2-step approach was applied to infer genome-wide associations in an endangered dual-purpose cattle subpopulation [Deutsches Schwarzbuntes Niederungsrind (DSN)] with a limited number of phenotypic records. First, endoparasite traits from a population of 1166 Black and White dairy cows [including Holstein Friesian (HF) and DSN] naturally infected with GIN, F. hepatica and D. viviparus were precorrected for fixed effects using linear mixed models. Afterwards, the precorrected phenotypes were the dependent traits (rFEC-GIN, rFEC-FH, and rFLC-DV) in GWAS based on 423,654 SNPs from 148 DSN cows. We identified 44 SNPs above the genome-wide significance threshold (p_Bonf = 4.47 × 10⁻⁷), and 145 associations surpassed the chromosome-wide significance threshold (range: 7.47 × 10⁻⁶ on BTA 1 to 2.18 × 10⁻⁵ on BTA 28). The associated SNPs identified were annotated to 23 candidate genes. The DAVID analysis inferred four pathways as being related to immune response mechanisms or involved in host-parasite interactions.
SNP effect correlations considering specific chromosome segments indicate that breeding for resistance to GIN or F. hepatica as measured by fecal egg counts is genetically associated with a higher risk for udder infections. Conclusions: We detected a large number of loci with small to moderate effects for endoparasite resistance. The potential candidate genes regulating resistance identified were pathogen-specific. Genetic antagonistic associations between disease resistance and productivity were limited to specific chromosome segments. The 2-step approach was a valid approach to infer genetic mechanisms in an endangered breed with a limited number of phenotypic records.
German Black Pied cattle (DSN) is an endangered population of about 2,550 dual-purpose cattle in Germany. Because DSN has a milk yield about 2,500 kg lower than that of the predominant dairy breed Holstein, its preservation is supported by the German government and the EU. The identification of the genomic loci affecting milk production in DSN can provide a basis for selection decisions for genetic improvement of DSN in order to increase market chances through the improvement of milk yield. A genome-wide association analysis of 30 milk traits was conducted in different lactation periods and numbers. Association using multiple linear regression models in R was performed on 1,490 DSN cattle genotyped with the BovineSNP50 SNP chip. A total of 41 significant and 20 suggestive SNPs affecting milk production traits in DSN were identified, as well as 15 additional SNPs for protein content which are less reliable due to high inflation. The most significant effects on milk yield in DSN were detected on chromosomes 1, 6, and 20. The region on chromosome 6 was located near the casein gene cluster and the corresponding haplotype overlapped the CSN3 gene (casein kappa). Associations for fat and protein yield and content were also detected. High correlation between traits of the same lactation period or number led to some SNPs being significant for multiple investigated traits. Half of all identified SNPs have been reported previously in other studies. Fifteen SNPs were associated with the same traits in other breeds. The other associated SNPs have been reported previously for traits such as exterior, health, meat and carcass, production, and reproduction traits. No association with milk production traits could be detected for DGAT1 or other known milk genes, despite the close relationship between DSN and Holstein.
The results of this study confirmed that many SNPs identified in other breeds as associated with milk traits also affect milk traits in dual-purpose DSN cattle and can be used for further genetic analysis to identify genes and causal variants that affect milk production in DSN cattle.
Casein proteins were repeatedly examined for protein polymorphisms and frequencies in diverse cattle breeds. The occurrence of casein variants in Holstein Friesian, the leading dairy breed worldwide, is well known. The frequencies of different casein variants in Holstein are likely affected by selection for high milk yield. Compared to Holstein, only little is known about casein variants and their frequencies in German Black Pied cattle (“Deutsches Schwarzbuntes Niederungsrind,” DSN). The DSN population was a main genetic contributor to the current high-yielding Holstein population. The goal of this study was to investigate casein (protein) variants and casein haplotypes in DSN based on the DNA sequence level and to compare these with data from Holstein and other breeds. In the investigated DSN population, we found no variation in the alpha-casein genes CSN1S1 and CSN1S2 and detected only the CSN1S1*B and CSN1S2*A protein variants. For CSN2 and CSN3 genes, non-synonymous single nucleotide polymorphisms leading to three different β and κ protein variants were found, respectively. For β-casein, protein variants A1, A2, and I were detected, with CSN2*A1 (82.7%) showing the highest frequency. For κ-casein, protein variants A, B, and E were detected in DSN, with the highest frequency of CSN3*A (83.3%). Accordingly, the casein protein haplotype CSN1S1*B-CSN2*A1-CSN1S2*A-CSN3*A (order of genes on BTA6) is the most frequent haplotype in DSN cattle.
Summary: The aim of this study was to detect selection signatures considering cows from the German Holstein (GH) and the local dual‐purpose black and white (DSN) population, as well as from generated sub‐populations. The 4654 GH and 261 DSN cows were genotyped with the BovineSNP50 Genotyping BeadChip. The geographical herd location was used as an environmental descriptor to create the East‐DSN and West‐DSN sub‐populations. In addition, two further sub‐populations of GH cows were generated, using the extreme values for solutions of residual effects of cows for the claw disorder dermatitis digitalis. These groups represented the most susceptible and most resistant cows. We used cross‐population extended haplotype homozygosity methodology (XP‐EHH) to identify the most recent selection signatures. Furthermore, we calculated Wright’s fixation index (FST). Chromosomal segments for the top 0.1 percentile of negative or positive XP‐EHH scores were studied in detail. For gene annotations, we used the Ensembl database and we considered a window of 250 kbp downstream and upstream of each core SNP corresponding to peaks of XP‐EHH. In addition, functional interactions among potential candidate genes were inferred via gene network analyses. The most outstanding XP‐EHH score was on chromosome 12 (at 77.34 Mb) for DSN and on chromosome 20 (at 36.29–38.42 Mb) for GH. Selection signature locations harbored QTL for several economically important milk and meat quality traits, reflecting the different breeding goals for GH and DSN. The average FST value between GH and DSN was quite low (0.068), indicating shared founders. For group stratifications according to cow health, several identified potential candidate genes influence disease resistance, especially to dermatitis digitalis.
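Wright's fixation index for a single biallelic SNP can be illustrated with a short sketch. The abstract does not say which FST estimator was used, so the code below assumes the basic heterozygosity formulation, FST = (HT - HS) / HT, with two equally weighted populations; it is not, for example, the Weir and Cockerham estimator. The function name and the allele frequencies are illustrative.

```python
def fst_biallelic(p1, p2):
    """Wright's FST for one biallelic SNP in two equally weighted populations.

    p1, p2: frequency of the same allele in population 1 and population 2.
    Uses FST = (HT - HS) / HT, where HT is the expected heterozygosity of
    the pooled population and HS the mean within-population heterozygosity.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        # The SNP is monomorphic in both populations: no differentiation.
        return 0.0
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies, not values from the study.
print(fst_biallelic(0.5, 0.5))  # identical frequencies give 0.0
print(fst_biallelic(0.0, 1.0))  # alternately fixed alleles give 1.0
print(fst_biallelic(0.4, 0.6))  # mild differentiation
```

A genome-wide average such as the 0.068 reported between GH and DSN would be obtained by combining per-SNP values across all markers; how that averaging was done is not stated in the abstract.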
The dual-purpose German Black Pied Cattle (DSN) has become an endangered breed of approximately 2,550 registered cows in Germany. The breed is genetically related to Holstein-Friesian cattle because the old DSN breed contributed to the selection of the modern Holstein dairy cow. In dairy farms, breeders aim to improve animal health and well-being by reducing the number of mastitis cases, which would also reduce milk losses and treatment costs. On the genomic level, no markers associated with clinical mastitis have been reported in DSN. Therefore, we performed a genome-wide association study on 1,062 DSN cows using a univariate linear mixed model that included a relatedness matrix to correct for population stratification. Although the statistical power was limited by the small population size, 3 markers were significantly associated, and 2 additional markers showed a suggestive association with clinical mastitis. Those markers accounted for 1 to 3% of the variance of clinical mastitis in the examined DSN population. One marker was found in the intragenic region of NEURL1 on BTA26, and the other 4 markers in intergenic regions on BTA3, BTA6, and BTA9. Further analyses identified 23 positional candidate genes. Among them is BMPR1B, which has been previously associated with clinical mastitis in other dairy cattle breeds. The markers presented here can be used for selection for mastitis-resistant animals in the endangered DSN population, and can broadly contribute to a better understanding of mastitis determinants in dairy cattle breeds.
Post-translational modifications (PTMs) represent an important regulatory layer influencing the structure and function of proteins. With broader availability of experimental information on the occurrences of different PTM types, the investigation of a potential "crosstalk" between different PTM types and of combinatorial effects has moved into the research focus. Hypothesizing that relevant interferences between different PTM types and sites may become apparent when investigating their mutual physical distances, we performed a systematic survey of pairwise homo- and heterotypic distances of seven frequent PTM types considering their sequence and spatial distances in resolved protein structures. We found that actual PTM site distance distributions differ from random distributions, with most PTM type pairs exhibiting larger than expected distances, with the exception of homotypic phosphorylation site distances and distances between phosphorylation and ubiquitination sites, which were found to be closer than expected by chance. Random reference distributions considering canonical acceptor amino acid residues only were found to be shifted to larger distances compared to distances between any amino acid residue type, indicating an underlying tendency of PTM-amenable residue types to be further apart than randomly expected. Distance distributions based on sequence separations were found largely consistent with their spatial counterparts, suggesting a primary role of sequence-based pairwise PTM-location encoding rather than folding-mediated effects. Our analysis provides a systematic and comprehensive overview of the characteristics of pairwise PTM site distances on proteins and reveals that, predominantly, PTM sites tend to avoid close proximity with the potential implication that an independent attachment or removal of PTMs remains possible. Proteins 2016; 85:78-92. © 2016 Wiley Periodicals, Inc.
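The comparison of observed PTM site distances with a random reference can be sketched for the sequence-separation case. This is a minimal illustration under assumptions: distances are residue-index differences, and the random reference draws the same number of sites from the canonical acceptor residues of the same sequence. The function names, the toy sequence, and the site positions are hypothetical, and the spatial (3D) distances used in the study are not covered here.

```python
import random
from itertools import combinations, product


def pairwise_distances(sites_a, sites_b=None):
    """Sequence separations |i - j| between PTM sites (1-based positions).

    With only sites_a given, distances are homotypic (all pairs within one
    PTM type); with sites_b given, they are heterotypic (all cross pairs).
    """
    if sites_b is None:
        pairs = combinations(sorted(sites_a), 2)
    else:
        pairs = product(sites_a, sites_b)
    return [abs(i - j) for i, j in pairs]


def random_reference(sequence, acceptor_residues, n_sites, n_draws=1000, seed=0):
    """Homotypic distances for sites drawn at random from acceptor residues."""
    rng = random.Random(seed)
    candidates = [pos for pos, aa in enumerate(sequence, start=1)
                  if aa in acceptor_residues]
    n_sites = min(n_sites, len(candidates))
    distances = []
    for _ in range(n_draws):
        distances.extend(pairwise_distances(rng.sample(candidates, n_sites)))
    return distances


# Toy protein and phosphorylation sites (hypothetical values).
sequence = "MSTAYSKSTPLYS"
observed = pairwise_distances([2, 6, 8])
reference = random_reference(sequence, "STY", n_sites=3)
print(sum(observed) / len(observed), sum(reference) / len(reference))
```

Restricting the random draws to acceptor residues corresponds to the abstract's distinction between reference distributions based on canonical acceptor residues only and those based on any residue type.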