2004
DOI: 10.1023/b:plan.0000036376.11710.6f

Automated SNP Detection in Expressed Sequence Tags: Statistical Considerations and Application to Maritime Pine Sequences

Abstract: We developed an automated pipeline for the detection of single nucleotide polymorphisms (SNPs) in expressed sequence tag (EST) data sets, by combining three DNA sequence analysis programs: Phred, Phrap and PolyBayes. This application requires access to the individual electrophoregram traces. First, a reference set of 65 SNPs was obtained from the sequencing of 30 gametes in 13 maritime pine (Pinus pinaster Ait.) gene fragments (6671 bp), resulting in a frequency of 1 SNP every 102.6 bp. Second, parameters of t…
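
The SNP frequency quoted in the abstract follows directly from the two counts it gives (65 SNPs across 6671 bp). A minimal sketch of that arithmetic is shown here; the function name `snp_density` is a hypothetical helper introduced for illustration and is not part of the authors' pipeline.

```python
def snp_density(total_bp: int, snp_count: int) -> float:
    """Return the average number of base pairs per SNP.

    Hypothetical helper for illustration only; not taken from the
    Phred/Phrap/PolyBayes pipeline described in the abstract.
    """
    if snp_count <= 0:
        raise ValueError("snp_count must be positive")
    return total_bp / snp_count


# Figures taken from the abstract: 65 SNPs in 6671 bp of sequence.
bp_per_snp = snp_density(6671, 65)
print(f"1 SNP every {bp_per_snp:.1f} bp")  # prints "1 SNP every 102.6 bp"
```

The same helper can be applied to the densities quoted in the citation statements, provided both the sequence length and the SNP count are known.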

Cited by 67 publications
(58 citation statements)
References 28 publications
“…However, the sequences of PCR products obtained using 10 primers (see above) gave a lower density (1 SNP/520 bp), which is also understandable in view of the fact that only 7 out of 30 SNPs could be validated. The density of SNPs reported during the present study, however, falls within the range (1 SNP/21 bp to 1 SNP/8500 bp) of densities of SNPs reported in earlier studies on different tree and plant species (Bryan et al 1999; Schmid et al 2003; Blake et al 2004; Bundock & Henry 2004; Feltus et al 2004; Gupta & Rustgi 2004; Le Dantec et al 2004; Morales et al 2004; Russell et al 2004; Shen et al 2004; Yang et al 2004; Rostoks et al 2005), but is little above the density reported for bread wheat in an earlier study (1 SNP/540 bp; Somers et al 2003). We also noticed that in bread wheat the density of SNPs scored in EST databases (1 SNP/144.9 bp; 1 SNP/540 bp) was higher than that reported in sequences of genes of economic importance (1 SNP/1000 bp to 1 SNP/1700 bp; Bryan et al 1999; Mochida et al 2003; Somers et al 2003; Zhang et al 2003; Blake et al 2004).…”
Section: Density of SNPs; citation type: supporting
confidence: 84%
“…The overall sequences per locus were aligned, and polymorphic sites were automatically identified using an informatic pipeline described by Le Dantec et al (2004). Every polymorphic site was then manually verified with CodonCode Aligner v.1.5.1 (CodonCode Corporation, Dedham, MA, USA).…”
Section: SNP Detection; citation type: mentioning
confidence: 99%
“…For computational SNP discovery, two important points should be considered. First, the program should be able to distinguish allelic variation from sequence variation between paralogous sequences (Marth et al, 1999; Le Dantec et al, 2004; Batley et al, 2003). Secondly, the program should be able to recognize sequencing errors which are usually caused by poor quality sequences, especially for EST data (Picoult-Newberg et al, 1999; Garg et al, 1999; Batley et al, 2003; Matukumalli et al, 2006).…”
Section: SNP Marker Identification and Development; citation type: mentioning
confidence: 99%
“…Also, other factors such as alternative splicing, reverse transcription errors and RNA editing interfere with the predictions even after including sequence quality scores. But SNP discovery from EST sequences was successfully implemented for maize (Rafalski, 2002) and pine (Le Dantec et al, 2004) species by constructing a software data analysis pipeline. Thus, the selection of optimal tool for SNP identification and/or discovery basically depends on the nature of input sequences.…”
Section: SNP Marker Identification and Development; citation type: mentioning
confidence: 99%