We describe a new computer program, SnpEff, for rapidly categorizing the effects of variants in genome sequences. Once a genome is sequenced, SnpEff annotates variants based on their genomic locations and predicts coding effects. Annotated genomic locations include intronic, untranslated region, upstream, downstream, splice site, or intergenic regions. Coding effects such as synonymous or non-synonymous amino acid replacement, start codon gains or losses, stop codon gains or losses, or frame shifts can be predicted. Here the use of SnpEff is illustrated by annotating ~356,660 candidate SNPs in ~117 Mb of unique sequence, representing a substitution rate of ~1/305 nucleotides, between the Drosophila melanogaster w(1118); iso-2; iso-3 strain and the reference y(1); cn(1) bw(1) sp(1) strain. We show that ~15,842 SNPs are synonymous and ~4,467 SNPs are non-synonymous (N/S ~0.28). The remaining SNPs are in other categories, such as stop codon gains (38 SNPs), stop codon losses (8 SNPs), and start codon gains (297 SNPs) in the 5'UTR. We found, as expected, that the SNP frequency is proportional to the recombination frequency (i.e., highest in the middle of chromosome arms). We also found that start-gain or stop-lost SNPs in Drosophila melanogaster often result in additions of N-terminal or C-terminal amino acids that are conserved in other Drosophila species. It appears that the 5' and 3' UTRs are reservoirs for genetic variation that changes the termini of proteins during evolution of the Drosophila genus. As genome sequencing is becoming inexpensive and routine, SnpEff enables rapid analyses of whole-genome sequencing data to be performed by an individual laboratory.
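To make the coding-effect categories above concrete, the following is a minimal Python sketch of how a single-codon substitution could be labeled as synonymous, non-synonymous, stop gained, or stop lost using the standard genetic code. It is an illustration only and is not SnpEff's implementation; the function names `classify_codon_change` and `ns_ratio` and the effect labels are assumptions made for this example. The sketch ignores start codons, splice sites, and frame shifts.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref_codon, alt_codon):
    """Label the effect of replacing ref_codon with alt_codon (hypothetical helper)."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop_gained"
    if ref_aa == "*":
        return "stop_lost"
    return "non_synonymous"


def ns_ratio(non_synonymous, synonymous):
    """Ratio of non-synonymous to synonymous SNP counts."""
    return non_synonymous / synonymous


# Counts reported in the abstract: ~4,467 non-synonymous, ~15,842 synonymous.
print(round(ns_ratio(4467, 15842), 2))  # 0.28
```

For example, `classify_codon_change("TGG", "TGA")` returns `"stop_gained"`, because tryptophan (TGG) is replaced by a stop codon.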
This paper describes a new program, SnpSift, for filtering differential DNA sequence variants between two or more experimental genomes after genotoxic chemical exposure. Here, we illustrate how SnpSift can be used to identify candidate phenotype-relevant variants, including single nucleotide polymorphisms, multiple nucleotide polymorphisms, insertions, and deletions (InDels), in mutant strains isolated from genome-wide chemical mutagenesis of Drosophila melanogaster. First, the genomes of two independently isolated mutant fly strains that are allelic for a novel recessive male-sterile locus generated by genotoxic chemical exposure were sequenced using the Illumina next-generation DNA sequencer to obtain 20- to 29-fold coverage of the euchromatic sequences. The sequencing reads were processed and variants were called using standard bioinformatic tools. Next, SnpEff was used to annotate all sequence variants and their potential mutational effects on associated genes. Then, SnpSift was used to filter and select differential variants that potentially disrupt a common gene in the two allelic mutant strains. The potential causative DNA lesions were partially validated by capillary sequencing of polymerase chain reaction-amplified DNA in the genetic interval as defined by meiotic mapping and by deletions that remove defined regions of the chromosome. Of the five candidate genes located in the genetic interval, the Pka-like gene CG12069 was found to carry a separate premature stop codon mutation in each of the two allelic mutants, whereas the other four candidate genes within the interval have wild-type sequences. The Pka-like gene is therefore a strong candidate gene for the male-sterile locus. These results demonstrate that combining SnpEff and SnpSift can expedite the identification of candidate phenotype-causative mutations in chemically mutagenized Drosophila strains. This technique can also be used to characterize the variety of mutations generated by genotoxic chemicals.
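The filtering idea described above (keep variants that differ between strains, then look for a gene disrupted in every allelic mutant) can be sketched in a few lines of Python. This is a hedged illustration, not SnpSift's actual logic or command syntax: the record layout, the `DISRUPTIVE` effect labels, the function `candidate_genes`, and the gene and strain names in the toy data are all hypothetical.

```python
from collections import defaultdict

# Effect labels treated as gene-disrupting in this sketch (an assumption).
DISRUPTIVE = {"stop_gained", "frame_shift", "start_lost", "splice_site"}


def candidate_genes(variants, strains, disruptive=DISRUPTIVE):
    """Return genes with a strain-specific disruptive variant in every strain."""
    all_strains = set(strains)

    # Record which strains carry each variant site and allele.
    carriers = defaultdict(set)
    for v in variants:
        carriers[(v["chrom"], v["pos"], v["alt"])].add(v["strain"])

    # Keep disruptive variants that are not shared by all strains, since a
    # shared variant is more likely background than an independent lesion.
    hit_strains = defaultdict(set)
    for v in variants:
        shared = carriers[(v["chrom"], v["pos"], v["alt"])] >= all_strains
        if not shared and v["effect"] in disruptive:
            hit_strains[v["gene"]].add(v["strain"])

    return sorted(g for g, hits in hit_strains.items() if hits >= all_strains)


# Toy annotated variants for two hypothetical mutant strains, m1 and m2.
example = [
    {"chrom": "2R", "pos": 100, "alt": "T", "gene": "geneA", "effect": "stop_gained", "strain": "m1"},
    {"chrom": "2R", "pos": 250, "alt": "A", "gene": "geneA", "effect": "stop_gained", "strain": "m2"},
    {"chrom": "2R", "pos": 500, "alt": "G", "gene": "geneB", "effect": "stop_gained", "strain": "m1"},
    {"chrom": "2R", "pos": 500, "alt": "G", "gene": "geneB", "effect": "stop_gained", "strain": "m2"},
    {"chrom": "2R", "pos": 700, "alt": "C", "gene": "geneC", "effect": "non_synonymous", "strain": "m1"},
    {"chrom": "2R", "pos": 810, "alt": "T", "gene": "geneC", "effect": "stop_gained", "strain": "m2"},
]

print(candidate_genes(example, ["m1", "m2"]))  # ['geneA']
```

In the toy data, only `geneA` carries a separate disruptive variant in each strain; `geneB` is excluded because both strains share the same variant, and `geneC` because only one strain carries a disruptive change.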
Background: Previous whole-genome shotgun bisulfite sequencing experiments showed that DNA cytosine methylation in the honey bee (Apis mellifera) is almost exclusively at CG dinucleotides in exons. However, the most commonly used method, bisulfite sequencing, cannot distinguish 5-methylcytosine from 5-hydroxymethylcytosine, an oxidized form of 5-methylcytosine whose formation is catalyzed by the TET family of dioxygenases. Furthermore, some analysis software programs under-represent non-CG DNA methylation and hydroxymethylation for a variety of reasons. Therefore, we used an unbiased analysis of bisulfite sequencing data combined with molecular and bioinformatics approaches to distinguish 5-methylcytosine from 5-hydroxymethylcytosine. By doing this, we have performed the first whole-genome analyses of DNA modifications at non-CG sites in honey bees and correlated these DNA modifications with gene expression and alternative mRNA splicing.
Results: We confirmed, using unbiased analyses of whole-genome shotgun bisulfite sequencing (BS-seq) data, with both new data and published data, the previous finding that CG DNA methylation is enriched in exons in honey bees. However, we also found evidence that cytosine methylation and hydroxymethylation at non-CG sites are enriched in introns. Using antibodies against 5-hydroxymethylcytosine, we confirmed that DNA hydroxymethylation at non-CG sites is enriched in introns. Additionally, using a new technique, Pvu-seq (which employs the enzyme PvuRts1I to digest DNA at 5-hydroxymethylcytosine sites followed by next-generation DNA sequencing), we further confirmed that hydroxymethylation is enriched in introns at non-CG sites.
Conclusions: Cytosine hydroxymethylation at non-CG sites might have more functional significance than previously appreciated, and in honey bees these modifications might be related to the regulation of alternative mRNA splicing by defining the locations of the introns.
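The distinction between CG and non-CG sites drawn above depends on the bases that follow each cytosine. As a minimal sketch, assuming the common CG / CHG / CHH convention (H = A, C, or T) and considering only the forward strand, a cytosine's sequence context could be classified as follows. The function name `cytosine_context` is hypothetical, and this is not the analysis pipeline used in the study.

```python
def cytosine_context(seq, i):
    """Classify the cytosine at 0-based index i of seq as CG, CHG, or CHH.

    Returns "unknown" when the downstream bases are missing or ambiguous.
    Only the forward strand is considered in this sketch.
    """
    seq = seq.upper()
    if seq[i] != "C":
        raise ValueError("position %d is not a cytosine" % i)

    downstream = seq[i + 1:i + 3]
    if any(base not in "ACGT" for base in downstream):
        return "unknown"
    if len(downstream) >= 1 and downstream[0] == "G":
        return "CG"
    if len(downstream) < 2:
        return "unknown"
    # The first downstream base is now known to be A, C, or T (i.e. H).
    return "CHG" if downstream[1] == "G" else "CHH"


# Tally the contexts of every cytosine in a short made-up sequence.
sequence = "ACGTCAGCTTC"
counts = {}
for pos, base in enumerate(sequence):
    if base == "C":
        ctx = cytosine_context(sequence, pos)
        counts[ctx] = counts.get(ctx, 0) + 1
print(counts)
```

A cytosine at the very end of a sequence is reported as `"unknown"` rather than guessed, since its context cannot be determined without the neighboring bases.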
Nutritional benefits of cultivated oat (Avena sativa L., 2n = 6x = 42, AACCDD) are well recognized; however, seed protein levels are modest and resources for genetic improvement are scarce. The wild tetraploid, A. magna Murphy et Terrell (syn. A. maroccana Gdgr., 2n = 4x = 28, CCDD), which contains approximately 31% seed protein, was hybridized with cultivated oat to produce a domesticated A. magna. Wild and cultivated accessions were crossed to generate a recombinant inbred line (RIL) population. Although these materials could be used to develop domesticated, high-protein oat, mapping and quantitative trait loci introgression are hindered by a near absence of genetic markers. Objectives of this study were to develop high-throughput, A. magna-specific markers; generate a genetic linkage map based on the A. magna RIL population; and map genes controlling oat domestication. A Diversity Arrays Technology (DArT) array derived from 10 A. magna genotypes was used to generate 2,688 genome-specific probes. These, with 12,672 additional oat clones, produced 2,349 polymorphic markers, including 498 (21.2%) from A. magna arrays and 1,851 (78.8%) from other Avena libraries. Linkage analysis included 974 DArT markers, 26 microsatellites, 13 SNPs, and 4 phenotypic markers, and resulted in a 14-linkage-group map. Marker-to-marker correlation coefficient analysis allowed classification of shared markers as unique or redundant, and putative linkage-group-to-genome anchoring. Results of this study provide for the first time a collection of high-throughput tetraploid oat markers and a comprehensive map of the genome, providing insights into the genome ancestry of oat and affording a resource for study of oat domestication, gene transfer, and comparative genomics.
The CDC42-regulated non-receptor tyrosine kinase ACK-2 has been associated with integrin signaling. In this report, the effect of ACK-2 on the modulation of cell spreading and motility was examined. HeLa cells expressing epitope-tagged wild-type ACK-2 showed a slower rate of spreading on fibronectin when compared with untransfected cells. An ACK-2 protein lacking its SH3 domain was still capable of modulating HeLa cell spreading, suggesting that its tyrosine kinase activity is sufficient to induce the observed phenotype. The ACK-2 effect on the rate of cell spreading did not involve inhibition of integrin-mediated activation of PI-3K signaling, since it did not alter membrane translocation of a GFP-PH-AKT domain (AKT pleckstrin homology domain) used as a reporter for PI-3K products induced by cell adhesion. The ACK-2 effect appears to be upstream from the adapter protein CrkII, since co-expression of CrkII and ACK-2 results in a neutralization of ACK-2-mediated effects on HeLa cell spreading. Similarly, co-expression of ACK-2 with p130Cas, which interacts with the adapter protein CrkII, also results in a partial reversion of the ACK-2 effects on cell spreading. CrkII-mediated reversal of the ACK-2-induced phenotype requires the activity of the small GTPase Rap1. Co-expression of ACK-2 and CrkII with a dominant negative form of Rap1 reverses the neutralization by CrkII, suggesting that CrkII-mediated activation of Rap1 is required. However, an active form of Rap1 is not sufficient to reverse the ACK-2 phenotype by itself. A role for Rac1 in ACK-2 effects was also established. An activated Rac1 protein neutralized the ACK-2-mediated inhibition of cell spreading. A direct measurement of cell motility by either a modified Boyden chamber or wounding assay demonstrates that ACK-2 overexpression increases the motility of the cells. These results suggest that ACK-2 modulates HeLa cell spreading upstream of pathways regulated by CrkII and that ACK-2 may regulate cell motility by controlling the activation of small GTPases such as Rap1 and Rac1.