2009
DOI: 10.1093/gbe/evp012

Estimates of Positive Darwinian Selection Are Inflated by Errors in Sequencing, Annotation, and Alignment

Abstract: Published estimates of the proportion of positively selected genes (PSGs) in human vary over three orders of magnitude. In mammals, estimates of the proportion of PSGs cover an even wider range of values. We used 2,980 orthologous protein-coding genes from human, chimpanzee, macaque, dog, cow, rat, and mouse as well as an established phylogenetic topology to infer the fraction of PSGs in all seven terminal branches. The inferred fraction of PSGs ranged from 0.9% in human through 17.5% in macaque to 23.3% in do…

Cited by 103 publications
(107 citation statements)
References 23 publications
“…This observation supports our hypothesis that energy metabolism played a crucial role in the attainment of flight by bats. A caveat of our genome-wide analyses is that both of the bat genomes (little brown bat and flying fox) used to identify nuclear-encoded OXPHOS and mitochondrial protein genes are only of draft quality (1.7× and 2.63×, respectively); thus, a smaller number of complete genes could be identified and analyzed, and required additional care to avoid false-positive predictions (36). To control for the quality of the gene data in the analyses of the nuclear-encoded OXPHOS genes and mitochondrial protein genes, we used two approaches: (i) measurement of the background rate of positive selection in the draft bat genomes, and (ii) resequencing of OXPHOS genes from several bat species and reanalysis.…”
Section: Results (mentioning)
confidence: 99%
“…Alignment quality is of major importance for ω estimation because errors can lead to the misidentification of synonymous sites as nonsynonymous sites [37]. To minimize the effect of alignment errors, two algorithms, SATé-II and DIALIGN-TX [38], were chosen to align each ortholog, as they have been reported to be robust in dealing with global and local alignments, respectively (Supplementary Note).…”
Section: Genome Evolution Analysis (mentioning)
confidence: 99%
“…We used Spearman's rank correlation to determine if the following characteristics of the sequence data were correlated with the P values from the LRTs: (i) average GC content at the third position; (ii) average overall GC content; (iii) transition/transversion ratio (kappa); and (iv) dN tree length. The gappiness of an alignment could introduce potential biases in our results (17,18), so we also looked for correlations between the P values from the LRTs and two metrics to assess coverage in our alignments: (i) gap percent (gapPCT), or the sum of the number of gaps in each sequence in an alignment divided by the sum of the total number of sites in all of the sequences in an alignment; and (ii) an alignment quality score (described in SI Text). Only a few of these characteristics of the data were significantly correlated (P < 0.05) with the P values of the LRTs, but all correlations were very weak (range of Spearman's rho = −0.1 to 0.06, for all tests; Dataset S2).…”
Section: Heterogeneous Patterns Of Molecular Evolution Among Bee Line (mentioning)
confidence: 99%