Dynamic refolding of IFN-γ mRNA enables it to function as PKR activator and translation template

Sequence similarity between a translated nucleotide sequence and a known biological protein can provide strong evidence for the presence of a homologous coding region, even between distantly related genes. The computer program BLASTX performs conceptual translation of a nucleotide query sequence followed by a protein database search in one programmatic step. We characterized the sensitivity of BLASTX recognition to the presence of substitution, insertion and deletion errors in the query sequence and to sequence divergence. Reading frames were reliably identified in the presence of 1% query errors, a rate that is typical for primary sequence data. BLASTX is appropriate for use in moderate and large scale sequencing projects at the earliest opportunity, when the data are most prone to containing errors.

Rapid gene mapping in Caenorhabditis elegans using a high density polymorphism map

Wicks et al. 2001

Single nucleotide polymorphisms (SNPs) are valuable genetic markers of human disease. They also comprise the highest potential density marker set available for mapping experimentally derived mutations in model organisms such as Caenorhabditis elegans. To facilitate the positional cloning of mutations we have identified polymorphisms in CB4856, an isolate from a Hawaiian island that shows a uniformly high density of polymorphisms compared with the reference Bristol N2 strain. Based on 5.4 Mbp of aligned sequences, we predicted 6,222 polymorphisms. Furthermore, 3,457 of these markers modify restriction enzyme recognition sites ('snip-SNPs') and are therefore easily detected as RFLPs. Of these, 493 were experimentally confirmed by restriction digest to produce a snip-SNP map of the worm genome. A mapping strategy using snip-SNPs and bulked segregant analysis (BSA) is outlined. CB4856 is crossed into a mutant strain, and exclusion of CB4856 alleles of a subset of snip-SNPs in mutant progeny is assessed with BSA. The proximity of a linked marker to the mutation is estimated by the relative proportion of each form of the biallelic marker in populations of wildtype and mutant genomes. The usefulness of this approach is illustrated by the rapid mapping of the dyf-5 gene.
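The idea behind a snip-SNP, a polymorphism that creates or destroys a restriction enzyme recognition site, can be sketched as follows. This is an illustration only, not the authors' pipeline: the function names, the three-enzyme table and the example sequence are assumptions chosen for the sketch. The recognition sites listed are palindromic, so scanning one strand is sufficient.

```python
# Small illustrative enzyme table (assumption: a real screen would use a much
# larger set). All three recognition sites are palindromic.
ENZYMES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT", "BamHI": "GGATCC"}


def site_positions(seq, site):
    """Return the set of start indices at which `site` occurs in `seq`."""
    n = len(site)
    return {i for i in range(len(seq) - n + 1) if seq[i:i + n] == site}


def snip_snp_enzymes(ref, pos, alt_base, enzymes=ENZYMES):
    """Hypothetical helper: list enzymes whose cut sites differ between alleles.

    `ref` is the reference-allele sequence, `pos` the 0-based SNP position,
    and `alt_base` the base carried by the other allele.
    """
    alt = ref[:pos] + alt_base + ref[pos + 1:]
    return [
        name
        for name, site in enzymes.items()
        if site_positions(ref, site) != site_positions(alt, site)
    ]
```

With `ref = "ACGTGAATTCACGT"`, the call `snip_snp_enzymes(ref, 6, "G")` returns `["EcoRI"]` because the substitution destroys the GAATTC site, so the two alleles would give different fragment patterns after digestion. A substitution outside any recognition site, such as `snip_snp_enzymes(ref, 0, "T")`, returns an empty list.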

[27] Local alignment statistics

Altschul, Gish 1996

Issues in searching molecular sequence databases

et al. 1994

Sequence similarity search programs are versatile tools for the molecular biologist, frequently able to identify possible DNA coding regions and to provide clues to gene and protein structure and function. While much attention has been paid to the precise algorithms these programs employ and to their relative speeds, there is a constellation of associated issues that are equally important to realize the full potential of these methods. Here, we consider a number of these issues, including the choice of scoring systems, the statistical significance of alignments, the masking of uninformative or potentially confounding sequence regions, the nature and extent of sequence redundancy in the databases and network access to similarity search services.

A general approach to single-nucleotide polymorphism discovery

et al. 1999

Single-nucleotide polymorphisms (SNPs) are the most abundant form of human genetic variation and a resource for mapping complex genetic traits. The large volume of data produced by high-throughput sequencing projects is a rich and largely untapped source of SNPs (refs 2, 3, 4, 5). We present here a unified approach to the discovery of variations in genetic sequence data of arbitrary DNA sources. We propose to use the rapidly emerging genomic sequence as a template on which to layer often unmapped, fragmentary sequence data and to use base quality values to discern true allelic variations from sequencing errors. By taking advantage of the genomic sequence we are able to use simpler yet more accurate methods for sequence organization: fragment clustering, paralogue identification and multiple alignment. We analyse these sequences with a novel, Bayesian inference engine, POLYBAYES, to calculate the probability that a given site is polymorphic. Rigorous treatment of base quality permits completely automated evaluation of the full length of all sequences, without limitations on alignment depth. We demonstrate this approach by accurate SNP predictions in human ESTs aligned to finished and working-draft quality genomic sequences, a data set representative of the typical challenges of sequence-based SNP discovery.

Generation and analysis of 280,000 human expressed sequence tags.

et al. 1996

We report the generation of 319,311 single-pass sequencing reactions (known as expressed sequence tags, or ESTs) obtained from the 5' and 3' ends of 194,031 human cDNA clones. Our goal has been to obtain tag sequences from many different genes and to deposit these in the publicly accessible Data Base for Expressed Sequence Tags. Highly efficient automatic screening of the data allows deposition of the annotated sequences without delay. Sequences have been generated from 26 oligo(dT) primed directionally cloned libraries, of which 18 were normalized. The libraries were constructed using mRNA isolated from 17 different tissues representing three developmental states. Comparisons of a subset of our data with nonredundant human mRNA and protein data bases show that the ESTs represent many known sequences and contain many that are novel. Analysis of protein families using Hidden Markov Models confirms this observation and supports the contention that although normalization reduces significantly the relative abundance of redundant cDNA clones, it does not result in the complete removal of members of gene families.

Warren Gish
