2011
DOI: 10.1101/gr.115949.110

High sensitivity to aligner and high rate of false positives in the estimates of positive selection in the 12 Drosophila genomes

Abstract: We investigate the effect of aligner choice on inferences of positive selection using site-specific models of molecular evolution. We find that independently of the choice of aligner, the rate of false positives is unacceptably high. Our study is a whole-genome analysis of all protein-coding genes in 12 Drosophila genomes annotated in either all 12 species (~6690 genes) or in the six melanogaster group species. We compare six popular aligners: PRANK, T-Coffee, ClustalW, ProbCons, AMAP, and MUSCLE, and find tha…

Cited by 134 publications (154 citation statements)
References 61 publications

“…In total, 8.7% of MSA columns were removed before selection computations for Primates, versus 4.4% in Selectome 5 (GBLOCKS based pipeline); 12% of columns were removed for Glires, and 34% of columns for Euteleostomi, consistent with the expectation that more divergent sequences are more difficult to align reliably. More in detail, in Selectome 5, in Primates we identified 246 678 out of 1 149 639 sites (21%) as under positive selection, including long continuous stretches of ‘positively selected’ sites, which manual examination showed to be alignment or gene model errors [consistent with (10)]. In Selectome 6, filtering reduced the number of sites analyzed to 392 104, of which 61 119 are identified as under positive selection (16%); there are no more long stretches of sites, and manual inspection does not identify any obvious false positives.…”
Section: Changes In Database Content
Citation type: mentioning
confidence: 84%
“…The substitution rate upstream of FZF1 is characterized by an accelerated rate along the lineages leading to Saccharomyces cerevisiae and Saccharomyces paradoxus relative to that along the lineages leading to Saccharomyces mikatae and Saccharomyces bayanus (Figure 1). However, previous studies have shown that signals of selection are highly dependent on the alignment [40], [41]. To determine whether the evidence for rate heterogeneity upstream of FZF1 is dependent on the alignment used, we generated additional alignments using alternative alignment parameters and algorithms, and tested each for substitution rate heterogeneity.…”
Section: Results
Citation type: mentioning
confidence: 99%
“…As these alignments required only two species to overlap, they resulted in a larger number of alignments and longer alignments. The quality and method of sequence alignment have important impacts on inferences regarding rates of evolution [46]. During the course of this study, we tested multiple alignment pipelines, including the use of amino acid sequence-based approaches [47].…”
Section: Methods
Citation type: mentioning
confidence: 99%