Structural variation of the genome involves kilobase- to megabase-sized deletions, duplications, insertions, inversions, and complex combinations of rearrangements. We introduce high-throughput and massive paired-end mapping (PEM), a large-scale genome-sequencing method to identify structural variants (SVs) approximately 3 kilobases (kb) or larger that combines the rescue and capture of paired ends of 3-kb fragments, massive 454 sequencing, and a computational approach to map DNA reads onto a reference genome. PEM was used to map SVs in an African and in a putatively European individual and identified shared and divergent SVs relative to the reference genome. Overall, we fine-mapped more than 1000 SVs and documented that the number of SVs among humans is much larger than initially hypothesized; many of the SVs potentially affect gene function. The breakpoint junction sequences of more than 200 SVs were determined with a novel pooling strategy and computational analysis. Our analysis provided insights into the mechanisms of SV formation in humans.
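The core idea of paired-end mapping can be illustrated with a minimal sketch: when both ends of a roughly 3-kb fragment are placed on the reference genome, a mapped span that deviates strongly from the expected fragment length, or an unexpected relative orientation, hints at a structural variant. The sketch below is not the authors' pipeline. The names `PairedEnd` and `classify`, the `orientation_consistent` flag, and the 1-kb tolerance are hypothetical choices made for illustration only.

```python
from dataclasses import dataclass

EXPECTED_SPAN = 3000  # approximate fragment length in bp (the abstract states ~3 kb)
TOLERANCE = 1000      # hypothetical cutoff, not taken from the paper


@dataclass
class PairedEnd:
    chrom: str
    left_start: int               # reference coordinate of the leftmost mapped base
    right_end: int                # reference coordinate of the rightmost mapped base
    orientation_consistent: bool  # True if the two ends map in the expected relative orientation


def classify(pair: PairedEnd, expected: int = EXPECTED_SPAN, tol: int = TOLERANCE) -> str:
    """Label a mapped pair as concordant or as evidence for a candidate SV type."""
    if not pair.orientation_consistent:
        # Ends in an unexpected relative orientation suggest an inversion breakpoint.
        return "inversion"
    span = pair.right_end - pair.left_start
    if span > expected + tol:
        # Ends map farther apart on the reference than in the sample:
        # sequence present in the reference is missing from the sample.
        return "deletion"
    if span < expected - tol:
        # Ends map closer together on the reference than in the sample:
        # the sample carries extra sequence between them.
        return "insertion"
    return "concordant"
```

In practice a single discordant pair is weak evidence; a real analysis would presumably require several independent pairs supporting the same event before reporting it.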
Background: Worldwide, grapes and their derived products have a large market. The cultivated grape species Vitis vinifera has the potential to become a model for fruit tree genetics. Like many plant species, it is highly heterozygous, which is an additional challenge to modern whole genome shotgun sequencing. In this paper a high quality draft genome sequence of a cultivated clone of V. vinifera Pinot Noir is presented.
Principal Findings: We estimate the genome size of V. vinifera to be 504.6 Mb. Genomic sequences corresponding to 477.1 Mb were assembled in 2,093 metacontigs and 435.1 Mb were anchored to the 19 linkage groups (LGs). The number of predicted genes is 29,585, of which 96.1% were assigned to LGs. This assembly of the grape genome provides candidate genes implicated in traits relevant to grapevine cultivation, such as those influencing wine quality, via secondary metabolites, and those connected with the extreme susceptibility of grape to pathogens. Single nucleotide polymorphism (SNP) distribution was consistent with a diffuse haplotype structure across the genome. Of around 2,000,000 SNPs, 1,751,176 were mapped to chromosomes and one or more of them were identified in 86.7% of anchored genes. The relative age of grape duplicated genes was estimated, and this made it possible to reveal a relatively recent Vitis-specific large scale duplication event concerning at least 10 chromosomes (duplication not reported before).
Conclusions: Sanger shotgun sequencing and highly efficient sequencing by synthesis (SBS), together with dedicated assembly programs, resolved a complex heterozygous genome. A consensus sequence of the genome and a set of mapped marker loci were generated. Homologous chromosomes of Pinot Noir differ by 11.2% of their DNA (hemizygous DNA plus chromosomal gaps). SNP markers are offered as a tool with the potential of introducing a new era in the molecular breeding of grape.
Cancers arise by the gradual accumulation of mutations in multiple genes. We now use shotgun pyrosequencing to characterize RNA mutations and expression levels unique to malignant pleural mesotheliomas (MPMs) and not present in control tissues. On average, 266 Mb of cDNA were sequenced from each of four MPMs, from a control pulmonary adenocarcinoma (ADCA), and from normal lung tissue. Previously observed differences in MPM RNA expression levels were confirmed. Point mutations were identified by using criteria that require the presence of the mutation in at least four reads and in both cDNA strands and the absence of the mutation from sequence databases, normal adjacent tissues, and other controls. In the four MPMs, 15 nonsynonymous mutations were discovered: 7 were point mutations, 3 were deletions, 4 were exclusively expressed as a consequence of imputed epigenetic silencing, and 1 was putatively expressed as a consequence of RNA editing. Notably, each MPM had a different mutation profile, and no mutated gene was previously implicated in MPM. Of the seven point mutations, three were observed in at least one tumor from 49 other MPM patients. The mutations were in genes that could be causally related to cancer and included XRCC6, PDZK1IP1, ACTR1A, and AVEN.
Keywords: DNA sequencing | tumor mutations | lung cancer | bioinformatics | loss of heterozygosity
Because cancer arises as a consequence of multiple mutations, human cancer genomes are being sequenced to identify the mechanisms of tumorigenesis. Pilot sequencing studies include recent exon resequencing of tumors and cell lines that revealed somatic mutations in hundreds of genes not previously implicated in oncogenesis. These studies generally focused on a single class of mutations such as point mutations in coding regions of preselected candidate genes, and the results so far indicate that even within similar histological classes, tumors possess unique mutational profiles (1-3). However, there has rarely been an analysis of whether a mutated gene is actually expressed in the tumor cell nor has there been an attempt to use sequencing to identify other types of mutations such as chromosomal deletions or translocation (4, 5) or loss of heterozygosity related to epigenetic silencing (6, 7). Moreover, no unbiased deep sequencing analysis of all expressed genes in cancer tissues has been reported to date.
Malignant pleural mesothelioma (MPM) is an asbestos-related, rapidly fatal cancer. Its genetic basis is unknown but appears to involve multiple types of chromosomal abnormalities (5, 8-14). Central mechanisms underlying MPM are unclear, although MPM tumors evoke a strong inflammatory response thought to contribute to tumorigenesis (15). In addition, tumor cell survival promoted by TNF-α responsive antiapoptotic proteins such as Inhibitor of Apoptosis-1 (IAP-1) facilitates the resistance of MPM to most cytotoxic chemotherapeutic drugs (16). Expression profiling with microarrays has supported the general role of inflammation in MPM etiology and has provided...
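The point-mutation criteria stated in the MPM abstract above (at least four supporting reads, support on both cDNA strands, and absence from sequence databases and control tissues) can be written as a simple filter. This is a minimal sketch under stated assumptions: the record type `CandidateMutation`, its field names, and the function `passes_filter` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass

MIN_READS = 4  # the abstract requires the mutation in at least four reads


@dataclass
class CandidateMutation:
    gene: str
    forward_reads: int      # supporting reads on one cDNA strand
    reverse_reads: int      # supporting reads on the other cDNA strand
    in_databases: bool      # already listed in sequence databases
    in_controls: bool       # seen in normal adjacent tissue or other controls


def passes_filter(m: CandidateMutation, min_reads: int = MIN_READS) -> bool:
    """Return True if the candidate meets all of the stated criteria."""
    enough_reads = (m.forward_reads + m.reverse_reads) >= min_reads
    both_strands = m.forward_reads > 0 and m.reverse_reads > 0
    novel = not m.in_databases and not m.in_controls
    return enough_reads and both_strands and novel
```

Requiring support on both strands guards against strand-specific sequencing artifacts, and the database and control checks remove known polymorphisms and variants that are not tumor-specific.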
Abstract. Centrin, a 20-kD phosphoprotein with four calcium-binding EF-hands, is present in the centrosome/basal body apparatus of the green alga Chlamydomonas reinhardtii in three distinct locations: the nucleus-basal body connectors, the distal striated fibers, and the flagellar transition regions. In each location, centrin is found in fibrous structures that display calcium-mediated contraction. The mutant vfl2 has structural defects at all of these locations and is defective for basal body localization and/or segregation. We show that the vfl2 mutation is a G-to-A transition in the centrin structural gene which converts a glutamic acid to a lysine at position 101, the first amino acid of the E-helix of the protein's third EF-hand. This proves that centrin is required to construct the nucleus-basal body connectors, the distal striated fibers, and the flagellar transition regions, and it demonstrates the importance of amino acid 101 to normal centrin function. Based on immunofluorescence analysis using anti-centrin antibodies, it appears that vfl2 centrin is capable of binding to the basal body but is incapable of polymerizing into filamentous structures. Nineteen phenotypic revertants of vfl2 were isolated, and 10 of them, each of which had undergone further mutation at codon 101, were examined in detail. At the DNA level, 1 of the 10 was wild type, and the other 9 were pseudorevertants encoding centrins with the amino acids asparagine, threonine, methionine, or isoleucine at position 101. No ultrastructure defects were apparent in the revertants with asparagine or threonine at position 101, but in those with methionine or isoleucine at position 101, the distal striated fibers were found to be incomplete, indicating that different amino acid substitutions at position 101 can differentially affect the assembly of the three distinct centrin-containing fibrous structures associated with the Chlamydomonas centrosome.
The Eastern woodchuck (Marmota monax) is naturally infected with woodchuck hepatitis virus (WHV), a hepadnavirus closely related to the human hepatitis B virus (HBV). The woodchuck is used as an animal model for studying chronic hepatitis B (CHB) and HBV-associated hepatocellular carcinoma (HCC) in humans, but the lack of sequence information has hitherto precluded functional genomics analysis. To address this major limitation of the model, we report here the sequencing, assembly and annotation of the woodchuck transcriptome, together with the generation of custom woodchuck microarrays. Using this new platform, we characterized the transcriptional response to persistent WHV infection and WHV-induced HCC. This revealed that chronic WHV infection, like HBV, is associated with (i) a limited intrahepatic type I interferon response, (ii) intrahepatic induction of markers associated with T cell exhaustion, (iii) elevated levels of suppressor of cytokine signaling 3 (SOCS3) in the liver, and (iv) intrahepatic accumulation of neutrophils. Underscoring the translational value of the woodchuck model, this study also determined that WHV-induced HCC shares molecular characteristics with a subtype of human HCC with poor prognosis. Conclusion: Our data establish the translational value of the woodchuck model and provide new insights into immune pathways which may play a role either in the persistence of HBV infection or the sequelae of CHB.
We have provided a high-resolution snapshot of intrapatient viral variation before and after treatment with maraviroc, and detected preexisting CXCR4-using variants present at an extremely low frequency. The evolutionary analysis demonstrates the extent of diversity present at a single time point within an infected individual and the rapid effect of drug pressure on the structure of a viral population.
Large-scale parallel pyrosequencing produces unprecedented quantities of sequence data. However, when the data are generated from viral populations, current mapping software is inadequate for dealing with the high levels of variation present, resulting in the potential for biased data loss. In order to apply the 454 Life Sciences' pyrosequencing system to the study of viral populations, we have developed software for the processing of highly variable sequence data. Here we demonstrate our software by analyzing two temporally sampled HIV-1 intra-patient datasets from a clinical study of maraviroc. This drug binds the CCR5 coreceptor, thus preventing HIV-1 infection of the cell. The objective is to determine viral tropism (CCR5 versus CXCR4 usage) and track the evolution of minority CXCR4-using variants that may limit the response to a maraviroc-containing treatment regimen. Five time points (two prior to treatment) were available from each patient. We first quantify the effects of divergence on initial read k-mer mapping and demonstrate the importance of utilizing population-specific template sequences in relation to the analysis of next-generation sequence data. Then, in conjunction with coreceptor prediction algorithms that infer HIV tropism, our software was used to quantify the viral population structure pre- and post-treatment. In both cases, low frequency CXCR4-using variants (2.5–15%) were detected prior to treatment. Following phylogenetic inference, these variants were observed to exist as distinct lineages that were maintained through time. Our analysis thus confirms the role of pre-existing CXCR4-using virus in the emergence of maraviroc-insensitive HIV. The software will have utility for the study of intra-host viral diversity and evolution of other fast evolving viruses, and is available from http://www.bioinf.manchester.ac.uk/segminator/.
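The effect of divergence on initial k-mer mapping, mentioned in the abstract above, can be illustrated with a toy calculation: a single mismatch destroys every k-mer that overlaps it, so a read that differs from the template loses a disproportionate share of exact k-mer hits. The sketch below is not the Segminator implementation. The function names, the toy sequences, and the choice of k = 8 are assumptions made for illustration.

```python
def kmer_set(seq: str, k: int) -> set:
    """All distinct k-mers that occur in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_hit_fraction(read: str, template: str, k: int) -> float:
    """Fraction of the read's k-mers that occur exactly in the template."""
    read_kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
    if not read_kmers:
        return 0.0  # read shorter than k: nothing to match
    index = kmer_set(template, k)
    hits = sum(1 for kmer in read_kmers if kmer in index)
    return hits / len(read_kmers)


# Toy data (hypothetical sequences, not from the study).
template = "ACGTTGCAAGCTTAGGCCATCGATCGGATTACAGGCT"
exact_read = template[5:25]                                # identical to the template
variant_read = exact_read[:10] + "T" + exact_read[11:]     # one substitution (G -> T)

exact_fraction = kmer_hit_fraction(exact_read, template, k=8)
variant_fraction = kmer_hit_fraction(variant_read, template, k=8)
```

With a read of length 20 and k = 8, one substitution in the middle of the read removes up to 8 of its 13 k-mers, which suggests why a template close to the sampled population matters for highly variable viral data.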
Tuberous sclerosis complex (TSC) is an often severe neurocutaneous syndrome. Cortical tubers are the predominant neuropathological finding in TSC, and their number and location have been shown to correlate roughly with the severity of neurologic features in TSC. Past studies have shown that genomic deletion events in TSC1 or TSC2 are very rare in tubers, and suggested the potential involvement of the MAPK pathway in their pathogenesis. We used deep sequencing to assess all coding exons of TSC1 and TSC2, and the activating mutation hot spots within KRAS, in 46 tubers from TSC patients. Germline heterozygous mutations were identified in 81% of tubers. The same secondary mutation in TSC2 was identified in 6 tuber samples from one individual. Further study showed that this second-hit mutation was widely distributed in the cortex from one cerebral hemisphere of this individual at frequencies up to 10%. No other secondary mutations were found in the other 40 tubers analyzed. These data indicate that small second-hit mutations in any of these three genes are very rare in TSC tubers. However, in one TSC individual, a second-hit TSC2 point mutation occurred early during brain development, and likely contributed to tuber formation.