2018
DOI: 10.1101/332478
Preprint

The structure of the genetic code as an optimal graph clustering problem

Abstract: The standard genetic code (SGC) is the set of rules by which genetic information is translated into proteins, from codons, i.e. triplets of nucleotides, to amino acids. The questions about the origin and the main factor responsible for the present structure of sequences generated by single nucleotide substitutions. We described the genetic code as a partition of an undirected and unweighted graph, which makes the model general and universal. Using this approach, we showed that the structure of the g…

Cited by 7 publications (18 citation statements)
References 44 publications (34 reference statements)
“…The properties of the genetic code can be also tested using techniques borrowed from graph theory [59,60]. The analysis of the SGC as a partition of an undirected and unweighted graph showed that the majority of codon blocks are optimal in terms of the conductance measure, which is the ratio of non-synonymous substitutions between the codons in this group to all possible single nucleotide substitutions affecting these codons [60]. Therefore, this parameter can be interpreted as a measure of robustness against the potential changes in protein-coding sequences generated by point mutations.…”
mentioning
confidence: 99%
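The conductance measure described in the statement above can be sketched in a few lines of Python. This is a minimal illustration, not the cited authors' implementation: the names `neighbors` and `conductance` are hypothetical, and the denominator is assumed to count all nine single nucleotide substitutions of every codon in the group (so a substitution between two codons of the same group is counted once from each side).

```python
NUCLEOTIDES = "ATGC"


def neighbors(codon):
    """Yield the nine codons that differ from `codon` at exactly one position."""
    for pos, old in enumerate(codon):
        for new in NUCLEOTIDES:
            if new != old:
                yield codon[:pos] + new + codon[pos + 1:]


def conductance(group):
    """Fraction of single nucleotide substitutions that leave the codon group.

    Assumption: the denominator is 9 * len(group), i.e. every possible
    substitution of every codon in the group.
    """
    group = set(group)
    if not group:
        raise ValueError("codon group must not be empty")
    leaving = sum(1 for codon in group for n in neighbors(codon) if n not in group)
    return leaving / (9 * len(group))


# A four-codon block that differs only in the third position:
# each codon has 3 neighbors inside the block and 6 outside, so 24 / 36.
print(conductance({"GGA", "GGT", "GGG", "GGC"}))
```

Under this reading, a lower value means that fewer point mutations change the group a codon belongs to, which is why the statement interprets the measure as robustness.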
“…This approach conforms with the adaptation hypothesis postulating that the standard genetic code evolved to minimize harmful consequences of mutations or mistranslations of coded proteins [Woese, 1965, Sonneborn, 1965, Epstein, 1966, Goldberg and Wittes, 1966]. The SGC turned out to be quite well optimized in this respect when compared with a sample of randomly generated codes [Haig and Hurst, 1991, Freeland and Hurst, 1998a, Freeland and Hurst, 1998b, Freeland et al, 2000, Gilis et al, 2001] but the application of optimization algorithms revealed that the SGC is not perfectly optimized in this respect and more robust codes can be found [Błażej et al, 2018a, Błażej et al, 2016, Massey, 2008, Novozhilov et al, 2007, Santos et al, 2011, Santos and Monteagudo, 2017, Wnetrzak et al, 2018, Błażej et al, 2018b, Błażej et al, 2019b]. The minimization of mutation errors is important from biological point of view, because it protects organism against losing genetic information.…”
Section: Discussion
mentioning
confidence: 99%
“…We start our investigation by applying a similar approach to that presented by [Błażej et al, 2018a], in which the standard genetic code is described as a graph G(V_C, E_C), where V_C is the set of vertices (nodes), whereas E_C is the set of edges. V_C corresponds to the set of 64 canonical codons using four natural nucleotides {A, T, G, C}, while the edges are induced by all possible single nucleotide substitutions between the codons.…”
Section: Methods
mentioning
confidence: 99%
“…We start our investigation by applying a similar approach to that presented by [34], in which the SGC is described as a graph G(V_0, E_0), where V_0 is the set of vertices (nodes), whereas E_0 is the set of edges. V_0 corresponds to the set of 64 canonical codons using four natural nucleotides {A, T, G, C}, while the edges are induced by all possible single nucleotide substitutions between the codons.…”
Section: The Extension Of the Standard Genetic Code
mentioning
confidence: 99%
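The codon graph described in the two statements above (64 vertices, with an edge for every single nucleotide substitution) can be built directly from that description. The following is a rough sketch under that reading only; the function name `build_codon_graph` and the edge representation as unordered pairs are illustrative choices, not taken from the cited papers.

```python
from itertools import product

NUCLEOTIDES = "ATGC"


def build_codon_graph():
    """Return (codons, edges) for the undirected, unweighted codon graph.

    Vertices are the 64 triplets over {A, T, G, C}; two codons are joined
    by an edge when they differ at exactly one position.
    """
    codons = ["".join(triplet) for triplet in product(NUCLEOTIDES, repeat=3)]
    edges = set()
    for codon in codons:
        for pos in range(3):
            for nuc in NUCLEOTIDES:
                if nuc != codon[pos]:
                    other = codon[:pos] + nuc + codon[pos + 1:]
                    # frozenset makes the pair unordered, so each edge is stored once
                    edges.add(frozenset((codon, other)))
    return codons, edges


codons, edges = build_codon_graph()
print(len(codons), len(edges))  # 64 vertices; 64 * 9 / 2 = 288 edges
```

Every codon has three positions with three alternative nucleotides each, so the graph is 9-regular, and a genetic code corresponds to a partition of these 64 vertices into groups.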