The structure of the genetic code as an optimal graph clustering problem

Błażej, Paweł; Kowalski, D. R.; Wnętrzak, Małgorzata; Aloqalaa, D. A.; Mackiewicz, Dorota

doi:10.1101/332478

Cited by 7 publications (18 citation statements)

References 44 publications (34 reference statements)

“…The properties of the genetic code can be also tested using techniques borrowed from graph theory [59,60]. The analysis of the SGC as a partition of an undirected and unweighted graph showed that the majority of codon blocks are optimal in terms of the conductance measure, which is the ratio of non-synonymous substitutions between the codons in this group to all possible single nucleotide substitutions affecting these codons [60]. Therefore, this parameter can be interpreted as a measure of robustness against the potential changes in protein-coding sequences generated by point mutations.…”

mentioning

confidence: 99%
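The conductance described in the statement above can be illustrated with a short sketch. It assumes the usual graph-theoretic definition (edges leaving the block divided by the sum of vertex degrees in the block), which may differ in detail from the formulation in the cited paper. The helper names `neighbours` and `conductance` are illustrative and do not come from the source.

```python
from fractions import Fraction

BASES = "ACGT"


def neighbours(codon, bases=BASES):
    """Return all codons that differ from `codon` by one nucleotide substitution."""
    return [
        codon[:i] + b + codon[i + 1:]
        for i in range(3)
        for b in bases
        if b != codon[i]
    ]


def conductance(block, bases=BASES):
    """Conductance of a codon block (assumed definition: cut edges / volume).

    cut    = substitutions that lead out of the block (non-synonymous changes)
    volume = all single-nucleotide substitutions affecting codons in the block
    """
    block = set(block)
    cut = sum(1 for c in block for n in neighbours(c, bases) if n not in block)
    volume = sum(len(neighbours(c, bases)) for c in block)
    return Fraction(cut, volume)


# Four-codon block GCN (alanine in the SGC) and two-codon block GAT/GAC (aspartate).
alanine = ["GCA", "GCC", "GCG", "GCT"]
aspartate = ["GAT", "GAC"]
print(conductance(alanine))    # 2/3
print(conductance(aspartate))  # 8/9
```

Under this definition a lower value means that a smaller share of point mutations changes the encoded label, so the four-codon block (2/3) is more robust than the two-codon block (8/9).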

The influence of different types of translational inaccuracies on the genetic code structure

et al. 2019

Self Cite

Background: The standard genetic code is a recipe for assigning unambiguously 21 labels, i.e. amino acids and stop translation signal, to 64 codons. However, at early stages of the translational machinery development, the codons did not have to be read unambiguously and the early genetic codes could have contained some ambiguous assignments of codons to amino acids. Therefore, the goal of this work was to obtain the genetic code structures which could have evolved assuming different types of inaccuracy of the translational machinery starting from unambiguous assignments of codons to amino acids.

Results: We developed a theoretical model assuming that the level of uncertainty of codon assignments can gradually decrease during the simulations. Since it is postulated that the standard code has evolved to be robust against point mutations and mistranslations, we developed three simulation scenarios assuming that such errors can influence one, two or three codon positions. The simulated codes were selected using the evolutionary algorithm methodology to decrease coding ambiguity and increase their robustness against mistranslation.

Conclusions: The results indicate that the typical codon block structure of the genetic code could have evolved to decrease the ambiguity of amino acid to codon assignments and to increase the fidelity of reading the genetic information. However, the robustness to errors was not the decisive factor that influenced the genetic code evolution because it is possible to find theoretical codes that minimize the reading errors better than the standard genetic code.

“…This approach conforms with the adaptation hypothesis postulating that the standard genetic code evolved to minimize harmful consequences of mutations or mistranslations of coded proteins [Woese, 1965, Sonneborn, 1965, Epstein, 1966, Goldberg and Wittes, 1966]. The SGC turned out to be quite well optimized in this respect when compared with a sample of randomly generated codes [Haig and Hurst, 1991, Freeland and Hurst, 1998a, Freeland and Hurst, 1998b, Freeland et al, 2000, Gilis et al, 2001] but the application of optimization algorithms revealed that the SGC is not perfectly optimized in this respect and more robust codes can be found [Błażej et al, 2018a, Błażej et al, 2016, Massey, 2008, Novozhilov et al, 2007, Santos et al, 2011, Santos and Monteagudo, 2017, Wnetrzak et al, 2018, Błażej et al, 2018b, Błażej et al, 2019b]. The minimization of mutation errors is important from biological point of view, because it protects organism against losing genetic information.…”

Section: Discussion

mentioning

confidence: 99%

“…We start our investigation by applying a similar approach to that presented by [Błażej et al, 2018a], in which the standard genetic code is described as a graph G(V_C, E_C), where V_C is the set of vertices (nodes), whereas E_C is the set of edges. V_C corresponds to the set of 64 canonical codons using four natural nucleotides {A, T, G, C}, while the edges are induced by all possible single nucleotide substitutions between the codons.…”

Section: Methods

mentioning

confidence: 99%
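The graph construction quoted above can be sketched as follows. This is a minimal illustration and not the authors' code; the function name `codon_graph` and the choice to store each edge as a `frozenset` are assumptions made for the example.

```python
from itertools import product


def codon_graph(bases="ATGC"):
    """Build the undirected, unweighted codon graph.

    Vertices are all three-letter words over `bases`; two codons are joined
    by an edge when they differ at exactly one position.
    """
    vertices = ["".join(p) for p in product(bases, repeat=3)]
    edges = set()
    for codon in vertices:
        for i in range(3):
            for b in bases:
                if b != codon[i]:
                    mutant = codon[:i] + b + codon[i + 1:]
                    # frozenset makes (u, v) and (v, u) the same undirected edge
                    edges.add(frozenset((codon, mutant)))
    return vertices, edges


vertices, edges = codon_graph()
print(len(vertices))  # 64
print(len(edges))     # 288
```

Each codon has 3 positions with 3 alternative nucleotides each, so every vertex has degree 9 and the graph has 64 × 9 / 2 = 288 edges.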

Basic principles of the genetic code extension

Błażej, Wnętrzak, Mackiewicz et al. 2019

Preprint

Self Cite

Compounds including non-canonical amino acids or other artificially designed molecules can find a lot of applications in medicine, industry and biotechnology. They can be produced thanks to the modification or extension of the standard genetic code (SGC). Such peptides or proteins including the non-canonical amino acids can be constantly delivered in a stable way by organisms with the customized genetic code. Among several methods of engineering the code, using non-canonical base pairs is especially promising, because it enables generating many new codons, which can be used to encode any new amino acid. Since even one pair of new bases can extend the SGC up to 216 codons generated by six-letter nucleotide alphabet, the extension of the SGC can be achieved in many ways. Here, we proposed a stepwise procedure of the SGC extension with one pair of non-canonical bases to minimize the consequences of point mutations. We reported relationships between codons in the framework of graph theory. All 216 codons were represented as nodes of the graph, whereas its edges were induced by all possible single nucleotide mutations occurring between codons. Therefore, every set of canonical and newly added codons induces a specific subgraph. We characterized the properties of the induced subgraphs generated by selected sets of codons. Thanks to that, we were able to describe a procedure for incremental addition of the set of meaningful codons up to the full coding system consisting of three pairs of bases. The procedure of gradual extension of the SGC makes the whole system robust to changing genetic information due to mutations and is compatible with the views assuming that codons and amino acids were added successively to the primordial SGC, which evolved to minimize harmful consequences of mutations or mistranslations of encoded proteins.

“…We start our investigation by applying a similar approach to that presented by [34], in which the SGC is described as a graph G(V_0, E_0), where V_0 is the set of vertices (nodes), whereas E_0 is the set of edges. V_0 corresponds to the set of 64 canonical codons using four natural nucleotides {A, T, G, C}, while the edges are induced by all possible single nucleotide substitutions between the codons.…”

Section: The Extension Of the Standard Genetic Code

mentioning

confidence: 99%

Basic principles of the genetic code extension

et al. 2020

Self Cite

Compounds including non-canonical amino acids (ncAAs) or other artificially designed molecules can find a lot of applications in medicine, industry and biotechnology. They can be produced thanks to the modification or extension of the standard genetic code (SGC). Such peptides or proteins including the ncAAs can be constantly delivered in a stable way by organisms with the customized genetic code. Among several methods of engineering the code, using non-canonical base pairs is especially promising, because it enables generating many new codons, which can be used to encode any new amino acid. Since even one pair of new bases can extend the SGC up to 216 codons generated by a six-letter nucleotide alphabet, the extension of the SGC can be achieved in many ways. Here, we proposed a stepwise procedure of the SGC extension with one pair of non-canonical bases to minimize the consequences of point mutations. We reported relationships between codons in the framework of graph theory. All 216 codons were represented as nodes of the graph, whereas its edges were induced by all possible single nucleotide mutations occurring between codons. Therefore, every set of canonical and newly added codons induces a specific subgraph. We characterized the properties of the induced subgraphs generated by selected sets of codons. Thanks to that, we were able to describe a procedure for incremental addition of the set of meaningful codons up to the full coding system consisting of three pairs of bases. The procedure of gradual extension of the SGC makes the whole system robust to changing genetic information due to mutations and is compatible with the views assuming that codons and amino acids were added successively to the primordial SGC, which evolved minimizing harmful consequences of mutations or mistranslations of encoded proteins.
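The 216-codon graph and its induced subgraphs, as described in the abstract above, can be sketched in a few lines. The letters `X` and `Y` are hypothetical labels for the single non-canonical base pair, and the helper names `differ_by_one` and `induced_edges` are illustrative; none of them come from the source.

```python
from itertools import combinations, product

CANONICAL = "ACGT"
EXTENDED = CANONICAL + "XY"  # X, Y: hypothetical non-canonical base pair


def differ_by_one(a, b):
    """True when codons a and b differ at exactly one position."""
    return sum(x != y for x, y in zip(a, b)) == 1


def induced_edges(codons):
    """Edges of the subgraph induced by the given set of codons."""
    return [(a, b) for a, b in combinations(sorted(codons), 2) if differ_by_one(a, b)]


all_codons = ["".join(p) for p in product(EXTENDED, repeat=3)]
canonical_codons = [c for c in all_codons if set(c) <= set(CANONICAL)]
canonical_set = set(canonical_codons)

full_edges = induced_edges(all_codons)
canonical_edges = induced_edges(canonical_codons)
# Edges joining a canonical codon to a codon containing X or Y
crossing_edges = [(a, b) for a, b in full_edges
                  if (a in canonical_set) != (b in canonical_set)]

print(len(all_codons), len(full_edges))              # 216 1620
print(len(canonical_codons), len(canonical_edges))   # 64 288
print(len(crossing_edges))                           # 384
```

In the six-letter graph every codon has 3 × 5 = 15 neighbours, which gives 216 × 15 / 2 = 1620 edges. The 384 crossing edges are the point mutations that turn a canonical codon into a newly added one, which is the quantity a stepwise extension procedure would need to track.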
