The systematic comparison of genomic sequences from different organisms represents a central focus of contemporary genome analysis. Comparative analyses of vertebrate sequences can identify coding and conserved non-coding regions, including regulatory elements, and provide insight into the forces that have rendered modern-day genomes. As a complement to whole-genome sequencing efforts, we are sequencing and comparing targeted genomic regions in multiple, evolutionarily diverse vertebrates. Here we report the generation and analysis of over 12 megabases (Mb) of sequence from 12 species, all derived from the genomic region orthologous to a segment of about 1.8 Mb on human chromosome 7 containing ten genes, including the gene mutated in cystic fibrosis. These sequences show conservation reflecting both functional constraints and the neutral mutational events that shaped this genomic region. In particular, we identify substantial numbers of conserved non-coding segments beyond those previously identified experimentally, most of which are not detectable by pair-wise sequence comparisons alone. Analysis of transposable element insertions highlights the variation in genome dynamics among these species and confirms the placement of rodents as a sister group to the primates.
The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5'-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project. Currently, more than 11,000 human and 10,000 mouse genes are represented in MGC by at least one clone with a full ORF. The random selection approach is now reaching a saturation point, and a transition to protocols targeted at the missing transcripts is required to complete the mouse and human collections. Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although some cDNAs likely carry missense variants as a consequence of experimental artifacts, such as PCR, cloning, or reverse transcriptase errors. Recently, a rat cDNA component was added to the project, and ongoing frog (Xenopus) and zebrafish (Danio) cDNA projects were expanded to take advantage of the high-throughput MGC pipeline.
Sequencing of full-insert clones from full-length cDNA libraries from both Xenopus laevis and Xenopus tropicalis has been ongoing as part of the Xenopus Gene Collection Initiative. Here we present 10,967 full ORF verified cDNA clones (8049 from X. laevis and 2918 from X. tropicalis) as a community resource. Because the genome of X. laevis, but not X. tropicalis, has undergone allotetraploidization, comparison of coding sequences from these two clawed (pipid) frogs provides a unique angle for exploring the molecular evolution of duplicate genes. Within our clone set, we have identified 445 gene trios, each comprised of an allotetraploidization-derived X. laevis gene pair and their shared X. tropicalis ortholog. Pairwise dN/dS comparisons within trios show strong evidence for purifying selection acting on all three members. However, dN/dS ratios between X. laevis gene pairs are elevated relative to their X. tropicalis ortholog. This difference is highly significant and indicates an overall relaxation of selective pressures on duplicated gene pairs. We have found that the paralogs that have been lost since the tetraploidization event are enriched for several molecular functions, but have found no such enrichment in the extant paralogs. Approximately 14% of the paralogous pairs analyzed here also show differential expression indicative of subfunctionalization.
Background: Oculocutaneous albinism (OCA) is an autosomal recessive disorder. A significant portion of OCA patients have been found to carry a single pathogenic variant in either the TYR or the OCA2 gene. Diagnostic sequencing of the TYR and OCA2 genes is routinely used for molecular diagnosis of OCA subtypes. To study the possibility that genomic abnormalities involving single or multiple exons may account for a portion of the missing second pathogenic variants, we retrospectively analyzed the TYR gene by long-range PCR and analyzed the target 2.7 kb deletion in the OCA2 gene, which spans exon 7, in OCA patients with a single pathogenic variant in the target genes. Results: In the 108 patients analyzed, we found that one patient was heterozygous for the 2.7 kb OCA2 gene deletion, and this patient was positive for one pathogenic variant and one possibly pathogenic variant [c.1103C>T (p.Ala368Val) + c.913C>T (p.R305W)]. Further analysis of maternal DNA, and of two additional OCA DNA samples homozygous for the 2.7 kb deletion, revealed that the phenotypically normal mother is heterozygous for the 2.7 kb deletion and homozygous for p.R305W. The two previously reported patients homozygous for the 2.7 kb deletion are also homozygous for p.R305W. Conclusions: Among the reported pathogenic variants, the pathogenicity of p.R305W has been discussed intensively in the literature. Our results indicate that p.R305W is unlikely to be a pathogenic variant. The possibility of linkage disequilibrium between p.R305W and the 2.7 kb deletion in the OCA2 gene is also suggested. Electronic supplementary material: The online version of this article (doi:10.1186/s13578-017-0149-3) contains supplementary material, which is available to authorized users.