The purpose of this thread is to discuss recent findings on the optimality of the genetic code.
Selected articles in the OP include the following:
- Early Fixation of an Optimal Genetic Code
- Evolution of the genetic code: partial optimization of a random code for robustness to translation error in a rugged fitness landscape
- The genetic code is nearly optimal for allowing additional information within protein-coding sequences
- An extension of the coevolution theory of the origin of the genetic code
- Can the genetic code be mathematically described?
- On the Hypercube Structure of the Genetic Code
- Topological structure of the triplet genetic code
- A Neutral Origin for Error Minimization in the Genetic Code
- Does codon bias have an evolutionary origin?
- A chemical toolkit for proteins — an expanded genetic code
Article 1
Thus, to begin, in the first article the researchers determined that no better code was found among a million biosynthetically restricted codes. From the section "The Best of All Possible Codes?":
"When the error value of the standard code is compared with the lowest error value of any code found in an extensive search of parameter space, results are somewhat more variable. Estimates based on PAM data for the restricted set of codes indicate that the canonical code achieves between 96% and 100% optimization relative to the best possible code configuration (fig. 2c). If our definition of biosynthetic restrictions are a good approximation of the possible variation from which the canonical code emerged, then it appears at or very close to a global optimum for error minimization: the best of all possible codes."
This conclusion might be misleading, though, as the paper states that the tested codes came from a biosynthetically restricted set based on the current hypothesis of how the genetic code evolved from pre-biotic scenarios. Without that restriction, other, more optimized codes are possible.
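To make the idea of an "error value" concrete, here is a minimal sketch of how such a comparison can be set up. It is not the paper's method: the paper uses PAM data and a biosynthetically restricted set of codes, whereas this sketch assumes Kyte-Doolittle hydropathy as a stand-in similarity measure and shuffles amino acids freely among the synonymous codon blocks of the standard code. The names `error_value`, `shuffled_code` and `fraction_better` are hypothetical.

```python
import random
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons in the order
# TTT, TTC, TTA, TTG, TCT, ... GGG; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]
STANDARD = dict(zip(CODONS, AMINO))

# ASSUMPTION: Kyte-Doolittle hydropathy replaces the paper's PAM-based measure.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def error_value(code):
    """Mean squared hydropathy change over all single-base substitutions
    that turn one sense codon into another sense codon."""
    total, count = 0.0, 0
    for codon, aa in code.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                other = code[codon[:pos] + base + codon[pos + 1:]]
                if other == "*":
                    continue  # substitutions to stop codons are ignored here
                total += (HYDROPATHY[aa] - HYDROPATHY[other]) ** 2
                count += 1
    return total / count


def shuffled_code(rng):
    """Random code that keeps the codon blocks and stop codons of the
    standard code but permutes which amino acid each block encodes."""
    acids = sorted(set(AMINO) - {"*"})
    permuted = acids[:]
    rng.shuffle(permuted)
    mapping = dict(zip(acids, permuted))
    mapping["*"] = "*"
    return {codon: mapping[aa] for codon, aa in STANDARD.items()}


def fraction_better(n_codes=1000, seed=0):
    """Fraction of shuffled codes with a lower error value than the standard code."""
    rng = random.Random(seed)
    standard = error_value(STANDARD)
    better = sum(error_value(shuffled_code(rng)) < standard for _ in range(n_codes))
    return better / n_codes
```

Calling `fraction_better()` gives a rough estimate of how often a freely shuffled code beats the standard one under these assumptions. The result depends heavily on the similarity measure and on which alternative codes are allowed, which is exactly the caveat raised about the restricted set.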
The next article (nr 2) shows that:
"Thus, the standard genetic code appears to be a point on an evolutionary trajectory from a random point (code) about half the way to the summit of the local peak. The fitness landscape of code evolution appears to be extremely rugged, containing numerous peaks with a broad distribution of heights, and the standard code is relatively unremarkable, being located on the slope of a moderate-height peak."
This shows that, in an analysis that includes all possible codes (not only biosynthetically restricted ones), the genetic code is only partially optimal with regard to error minimization. It should be noted, though, that this analysis covered only a subset of the code's possible optimality features.
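The picture of a code sitting partway up a local peak can be illustrated with a simple greedy hill climb. This is only a sketch under assumptions of my own, not the procedure used in the article: the cost is a mean squared Kyte-Doolittle hydropathy change (a stand-in measure), a move swaps the amino acids assigned to two codon blocks of the standard code, and the names `cost` and `climb` are hypothetical.

```python
import random
from itertools import combinations, product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1); '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]
STANDARD = dict(zip(CODONS, AMINO))
ACIDS = sorted(set(AMINO) - {"*"})

# ASSUMPTION: Kyte-Doolittle hydropathy as the amino acid similarity measure.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def cost(assignment):
    """Mean squared hydropathy change over single-base substitutions.
    `assignment` maps each codon block of the standard code (named by its
    standard amino acid) to the amino acid placed in that block."""
    total, count = 0.0, 0
    for codon, block in STANDARD.items():
        if block == "*":
            continue
        for pos in range(3):
            for base in BASES:
                other = STANDARD[codon[:pos] + base + codon[pos + 1:]]
                if base == codon[pos] or other == "*":
                    continue
                diff = HYDROPATHY[assignment[block]] - HYDROPATHY[assignment[other]]
                total += diff ** 2
                count += 1
    return total / count


def climb(assignment):
    """Greedy descent in cost: apply the best improving swap of two blocks
    until no swap improves. Returns (local optimum, its cost, steps taken)."""
    current = dict(assignment)
    current_cost = cost(current)
    steps = 0
    while True:
        best_cost, best_pair = current_cost, None
        for a, b in combinations(ACIDS, 2):
            current[a], current[b] = current[b], current[a]  # try the swap
            c = cost(current)
            current[a], current[b] = current[b], current[a]  # undo it
            if c < best_cost - 1e-12:
                best_cost, best_pair = c, (a, b)
        if best_pair is None:
            return current, current_cost, steps
        a, b = best_pair
        current[a], current[b] = current[b], current[a]
        current_cost = best_cost
        steps += 1


if __name__ == "__main__":
    rng = random.Random(0)
    shuffled = ACIDS[:]
    rng.shuffle(shuffled)
    start = dict(zip(ACIDS, shuffled))       # a random starting code
    standard = {a: a for a in ACIDS}         # the standard code itself
    _, peak_cost, steps = climb(start)
    print("random start:", cost(start), "-> local peak:", peak_cost, "in", steps, "swaps")
    print("standard code:", cost(standard), "-> local peak:", climb(standard)[1])
```

Running the climb from several random starts and from the standard code gives a feel for how many distinct local optima exist and how far the standard code sits from the nearest one under this particular cost.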
From article 3
The analysis above did not include other nearly optimal features of the genetic code, including:
A) The actual code is far better than other possible codes at minimizing the number of amino acids incorporated until translation is interrupted after a frameshift error.
B) The code is highly optimal for encoding arbitrary additional information, i.e., information other than the amino acid sequence in protein-coding sequences.
Thus, there are two more features for which the code is close to optimal. What is interesting about them is that they may facilitate evolution, i.e. the code is primed for the future by being near-optimal for incorporating additional information later.
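Feature A can be made tangible with a small helper that counts how many codons are read in a shifted frame before a stop codon appears. This is a minimal sketch, assuming a DNA-alphabet coding sequence and the three standard stop codons; the function name `codons_until_stop` is hypothetical and is not taken from the article.

```python
STOPS = {"TAA", "TAG", "TGA"}  # standard stop codons, DNA alphabet


def codons_until_stop(seq, offset):
    """Number of sense codons read starting at `offset` before the first
    stop codon. Returns None if the sequence ends without a stop."""
    count = 0
    for i in range(offset, len(seq) - 2, 3):
        if seq[i:i + 3] in STOPS:
            return count
        count += 1
    return None


# In frame: ATG GCT TAA -> two amino acids, then a stop.
print(codons_until_stop("ATGGCTTAAGGC", 0))   # 2
# GCT GAT AAC has no in-frame stop ...
print(codons_until_stop("GCTGATAAC", 0))      # None
# ... but shifting the frame by two bases hits the hidden stop TGA at once.
print(codons_until_stop("GCTGATAAC", 2))      # 0
```

Averaging this count over many coding sequences and both shifted frames, and repeating the exercise for alternative codes, is one way to compare how quickly different codes abort translation after a frameshift.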
In article nr. 4, the coevolution theory of the origin of the genetic code is discussed. The theory suggests that the genetic code is an imprint of the biosynthetic (biosynthetically restricted) relationships between amino acids.
A few interesting observations can be made:
Firstly, from the article:
"As will become clear in the following, I maintain that these amino acid-pre-tRNAs came directly from the biosynthetic pathways of the first six amino acids evolving along the biosynthetic pathways of energetic metabolism and that they were the first amino acids to be codified on these still evolving mRNAs."
It should be noted that other exotic amino acids are also used by a few other codes (derived from the original). For example, selenocysteine is encoded in some bacteria, archaea and eukaryotes (vertebrates included), and pyrrolysine in some methanogenic archaea and a few bacteria. Archaea, however, seem to be the most primitive organisms, thus these encoded amino acids must have been fixed early on.
Thus an interesting question can be applied to an "evolving" code as posited in the above quote:
Are these "still evolving" mRNAs still evolving? Or did the code hit an inevitable global optimum?
Secondly, from the article:
"While Wong [9] highlighted the precursor-product relationships between amino acids and their crucial role in defining the organisation of the genetic code, Miseta [10] clearly identified that the non-amino acid molecules that were precursors of amino acids might have been able to play an important role in organising the genetic code. Miseta [10] suggested the idea of an intimate relationship between molecules, the intermediates of glucose degradation, as precursors of precursor amino acids, and the organisation of the genetic code. This observation is also analysed by Taylor and Coates [11] who showed the relationship between the glycolytic pathway, the citric acid cycle, the biosyntheses of amino acids and the genetic code (Fig. 1) and, in particular, they point out that (i) all the amino acids that are members of a biosynthetic family tend to have codons with the same first base (Fig. 1) and (ii) that the five amino acids codified by GNN codons are found in four biosynthetic pathways close to or at the beginning of the pathway head (Fig. 1)[11]. More recently, Davis [12,13] has provided evidence that tRNAs descending from a common ancestor were adaptors of amino acids synthesised by a common precursor and he also discusses the biosynthetic families of amino acids, suggesting their importance in genetic code origin."
Is it correct to assume that, in the presence of the precursors of the standard genetic code (e.g. intermediates of glucose degradation and the citric acid cycle), the intimate relationship between these molecules resulted in the inevitable organization of the genetic code (a global optimum of the system)?
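The first-base pattern mentioned in the quote is easy to inspect directly from the standard code table. The sketch below only groups amino acids by the first base of their codons; it does not encode the biosynthetic families themselves, which would have to come from the cited papers. The function name `acids_by_first_base` is hypothetical, and DNA letters are used (T in place of U).

```python
from collections import defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons in the order
# TTT, TTC, TTA, TTG, TCT, ... GGG; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]


def acids_by_first_base():
    """Map each first codon base to the sorted amino acids it can start."""
    groups = defaultdict(set)
    for codon, aa in zip(CODONS, AMINO):
        if aa != "*":
            groups[codon[0]].add(aa)
    return {base: sorted(acids) for base, acids in groups.items()}


for base, acids in acids_by_first_base().items():
    print(base + "NN:", " ".join(acids))
# The GNN row prints: A D E G V
```

The GNN row contains exactly five amino acids (Ala, Asp, Glu, Gly, Val), matching point (ii) in the quote. Whether the grouping reflects biosynthetic ancestry is the claim under discussion, not something this table lookup can settle.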