Philip F. LoCascio scite author profile

Background: The quality of automated gene prediction in microbial organisms has improved steadily over the past decade, but there is still room for improvement. Increasing the number of correct identifications, both of genes and of the translation initiation sites for each gene, and reducing the overall number of false positives, are all desirable goals.

Results: With our years of experience in manually curating genomes for the Joint Genome Institute, we developed a new gene prediction algorithm called Prodigal (PROkaryotic DYnamic programming Gene-finding ALgorithm). With Prodigal, we focused specifically on the three goals of improved gene structure prediction, improved translation initiation site recognition, and reduced false positives. We compared the results of Prodigal to existing gene-finding methods to demonstrate that it met each of these objectives.

Conclusion: We built a fast, lightweight, open source gene prediction program called Prodigal (http://compbio.ornl.gov/prodigal/). Prodigal achieved good results compared to existing methods, and we believe it will be a valuable asset to automated microbial annotation pipelines.

The Genome of Black Cottonwood, Populus trichocarpa (Torr. & Gray)

Tuskan

DiFazio

Jansson

et al. 2006

Science

3,790

104

3,437

We report the draft genome of the black cottonwood tree, Populus trichocarpa. Integration of shotgun sequence assembly with genetic mapping enabled chromosome-scale reconstruction of the genome. More than 45,000 putative protein-coding genes were identified. Analysis of the assembled genome revealed a whole-genome duplication event; about 8000 pairs of duplicated genes from that event survived in the Populus genome. A second, older duplication event is indistinguishably coincident with the divergence of the Populus and Arabidopsis lineages. Nucleotide substitution, tandem gene duplication, and gross chromosomal rearrangement appear to proceed substantially more slowly in Populus than in Arabidopsis. Populus has more protein-coding genes than Arabidopsis, ranging on average from 1.4 to 1.6 putative Populus homologs for each Arabidopsis gene. However, the relative frequency of protein domains in the two genomes is similar. Overrepresented exceptions in Populus include genes associated with lignocellulosic wall biosynthesis, meristem development, disease resistance, and metabolite transport.

Gene and translation initiation site prediction in metagenomic sequences

et al. 2012

MicroRNAs Form Triplexes with Double Stranded DNA at Sequence-Specific Binding Sites; a Eukaryotic Mechanism via which microRNAs Could Directly Alter Gene Expression

et al. 2016

MicroRNAs are important regulators of gene expression, acting primarily by binding to sequence-specific locations on already transcribed messenger RNAs (mRNA) and typically down-regulating their stability or translation. Recent studies indicate that microRNAs may also play a role in up-regulating mRNA transcription levels, although a definitive mechanism has not been established. Double-helical DNA is capable of forming triple-helical structures through Hoogsteen and reverse Hoogsteen interactions in the major groove of the duplex, and we show physical evidence (i.e., NMR, FRET, SPR) that purine or pyrimidine-rich microRNAs of appropriate length and sequence form triple-helical structures with purine-rich sequences of duplex DNA, and identify microRNA sequences that favor triplex formation. We developed an algorithm (Trident) to search genome-wide for potential triplex-forming sites and show that several mammalian and non-mammalian genomes are enriched for strong microRNA triplex binding sites. We show that those genes containing sequences favoring microRNA triplex formation are markedly enriched (3.3 fold, p < 2.2 × 10^-16) for genes whose expression is positively correlated with expression of microRNAs targeting triplex binding sequences. This work has thus revealed a new mechanism by which microRNAs could interact with gene promoter regions to modify gene transcription.
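The abstract above says triplexes form on purine-rich stretches of duplex DNA, so a crude first pass over a sequence might simply flag windows that are mostly A or G. The sketch below illustrates that idea only; it is not Trident, whose scoring is not described here. The function name `purine_rich_windows`, the window length, and the purine-fraction threshold are all hypothetical choices made for illustration.

```python
def purine_rich_windows(seq, window=20, min_fraction=0.9):
    """Return (start, end, purine_fraction) for every window of length
    `window` whose fraction of purines (A or G) is at least `min_fraction`.

    Illustrative only: `window` and `min_fraction` are assumed values,
    not parameters taken from the Trident algorithm.
    """
    seq = seq.upper()
    purines = {"A", "G"}
    if window <= 0 or len(seq) < window:
        return []

    hits = []
    # Purine count in the first window; updated incrementally afterwards.
    count = sum(base in purines for base in seq[:window])
    for start in range(len(seq) - window + 1):
        if start > 0:
            # Slide by one base: drop the base that left, add the one that entered.
            count -= seq[start - 1] in purines
            count += seq[start + window - 1] in purines
        fraction = count / window
        if fraction >= min_fraction:
            hits.append((start, start + window, fraction))
    return hits


if __name__ == "__main__":
    example = "CTCT" + "AG" * 5 + "CTCT"
    for start, end, fraction in purine_rich_windows(example, window=10, min_fraction=1.0):
        print(start, end, example[start:end], fraction)
```

This scans only the strand it is given; a purine-rich tract on the complementary strand appears here as a pyrimidine-rich one, so a fuller scan would presumably also check the reverse complement.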

Background rareness-based iterative multiple sequence alignment algorithm for regulatory element detection

Narasimhan

LoCascio

Uberbacher

2003

A computational pipeline for protein structure prediction and analysis at genome scale

Shah

Passovets

Kim

et al. 2003

Application of PROSPECT in CASP4: Characterizing protein structures with new folds

Crawford

LoCascio

et al. 2001

Proteins

In the Fourth Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP4), we predicted all 43 targets using our threading application PROSPECT. PROSPECT is guaranteed to find an optimal alignment between a protein sequence and a structural fold for a general energy function with pairwise contact potential. For each prediction, it gives a reliability assessment based on a neural network approach. In addition, PROSPECT has been added to the Genomic Integrated Supercomputing Toolkit (GIST) and is deployed on terascale computing resources. Structural predictions in CASP4 fell into three categories: comparative modeling, fold recognition, and prediction for structures with new folds. In the fold recognition category, PROSPECT correctly identified 8 of a total of 22 and finished sixth in total score among 127 assessed groups. In the "new fold" category, it found important structural features for most targets, and its overall performance was among the best of all prediction methods. Our CASP4 performance demonstrates that PROSPECT is a powerful tool for quickly characterizing structures with new folds, and it may provide useful structural restraints for ab initio prediction methods.

The Locally Self-consistent Multiple Scattering code in a geographically distributed linked MPP environment

et al. 1998
