2010
DOI: 10.1186/1471-2105-11-462

Context dependent substitution biases vary within the human genome

Abstract: Background: Models of sequence evolution typically assume that different nucleotide positions evolve independently. This assumption is widely appreciated to be an over-simplification. The best known violations involve biases due to adjacent nucleotides. There have also been suggestions that biases exist at larger scales, however this possibility has not been systematically explored. Results: To address this we have developed a method which identifies over- and under-represented substitution patterns and assesses th…

Cited by 16 publications (15 citation statements)
References 43 publications (71 reference statements)
“…The aim of the machine learning procedure was to establish the optimal sequence length to minimise the error in the predicted rate constants (Figures S4 and S5 in Additional file 1). In agreement with prior evidence [9, 10, 44, 50], but now obtained for each individual i → j substitution type from Trek data, the optimal window was found to be 5-7-nt (both 5- and 7-nt resulting in comparable results for many substitution types) and was subsequently used as guidance for the direct mapping of the Trek rate constants from the L1 sequence onto any given human nuclear DNA sequence for the assignment.…”
Section: Results (supporting)
confidence: 75%
“…To this end, the GBM models here had a sole purpose of identifying the optimal range of influence for accounting the neighbouring nucleotides. The optimal range was found to be captured, on average, by a 5-7-nt long window (Figures S4 and S5 in Additional file 1) which is in an excellent agreement with the prior < 10 nt estimate [9–11, 44, 50]. We thus used the maximum 7-nt length to stratify the Trek data for the further model-free mapping on any provided sequence, including the whole human genome.…”
Section: Methods (mentioning)
confidence: 58%
“…AT pressure, while the number of flanking pyrimidines on a single strand is correlated with a mutational bias, or skew, toward pyrimidines. Nevarez et al (2010) introduced an empirical method to examine context beyond immediately adjacent bases, based on the relative abundance method for studying word frequency bias. The authors used this method, which identifies over and underrepresented substitution patterns, to investigate context bias in the human lineage after the divergence from chimpanzee.…”
Section: Empirical Research (mentioning)
confidence: 99%
“…The authors used this method, which identifies over and underrepresented substitution patterns, to investigate context bias in the human lineage after the divergence from chimpanzee. Nevarez et al (2010) found that nucleotides beyond the immediately adjacent ones are responsible for substantial context effects and that these biases vary across the genome.…”
Section: Empirical Research (mentioning)
confidence: 99%
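
The idea quoted above, flagging substitution patterns that are over- or under-represented relative to expectation, can be illustrated with a small sketch. This is not the method of Nevarez et al (2010), whose details are not given here. It is a simplified observed/expected ratio, assuming two pre-aligned, gap-free sequences (an inferred ancestral sequence and a derived one) and an expectation that ignores context entirely. The function name `context_substitution_ratios` and the `flank` parameter are hypothetical, chosen only for illustration.

```python
from collections import Counter


def context_substitution_ratios(ancestral, derived, flank=1):
    """Observed/expected ratio for each (context, new_base) substitution.

    Assumption: `ancestral` and `derived` are aligned, equal-length strings.
    The context is the ancestral base plus `flank` bases on each side.
    The expectation assumes the old->new rate is the same in every context.
    """
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned to equal length")

    context_sites = Counter()    # context -> number of sites with that context
    center_sites = Counter()     # ancestral base -> number of sites
    subs_by_context = Counter()  # (context, new_base) -> substitution count
    subs_overall = Counter()     # (old_base, new_base) -> substitution count

    for pos in range(flank, len(ancestral) - flank):
        context = ancestral[pos - flank: pos + flank + 1]
        old, new = ancestral[pos], derived[pos]
        # Skip sites with ambiguous or missing bases.
        if any(base not in "ACGT" for base in context) or new not in "ACGT":
            continue
        context_sites[context] += 1
        center_sites[old] += 1
        if old != new:
            subs_by_context[(context, new)] += 1
            subs_overall[(old, new)] += 1

    ratios = {}
    for (context, new), observed in subs_by_context.items():
        old = context[flank]
        # Context-free rate of old->new across all sites with this old base.
        overall_rate = subs_overall[(old, new)] / center_sites[old]
        expected = context_sites[context] * overall_rate
        ratios[(context, new)] = observed / expected
    return ratios


if __name__ == "__main__":
    # Toy data: both C->T changes occur in the ACG context, none in TCT.
    ratios = context_substitution_ratios("ACGTCTACGTCT", "ATGTCTATGTCT")
    for (context, new), ratio in sorted(ratios.items()):
        print(f"{context} -> {new}: observed/expected = {ratio:.2f}")
```

In this sketch a ratio above 1 suggests that a substitution is over-represented in that context, and a ratio below 1 that it is under-represented. Raising `flank` to 2 or 3 gives the 5-nt and 7-nt windows mentioned in the citation statements, although real data would also need a significance assessment, which is omitted here.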