We have investigated the origin of genes, the genetic code, proteins, and life using six indices (hydropathy; alpha-helix, beta-sheet, and beta-turn formabilities; acidic amino acid content; and basic amino acid content) necessary for the formation of appropriate three-dimensional structures of globular proteins. From the analysis of microbial genes, we have concluded that newly born genes are products of nonstop frames (NSF) on antisense strands of microbial GC-rich genes [GC-NSF(a)] and of SNS repeating sequences [(SNS)n] similar to the GC-NSF(a) (S denotes G or C, and N denotes any of the four bases). We have also proposed that the universal genetic code presently used by most organisms on the earth could be derived from a GNC-SNS primitive genetic code. We have further presented the [GADV]-protein world hypothesis of the origin of life, as well as a hypothesis of protein production suggesting that proteins were originally produced by random peptide formation from amino acids restricted to specific amino acid compositions, termed the GNC-, SNS-, and GC-NSF(a)-0th order structures of proteins. The [GADV]-protein world hypothesis is primarily derived from the GNC primitive genetic code hypothesis. It is also expected that basic properties of extant genes and proteins could be revealed by considerations based on the scenario with four stages.
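The notion of a nonstop frame on the antisense strand can be illustrated with a minimal Python sketch. It assumes that a nonstop frame simply means a reading frame containing no stop codon along the whole sequence; that reading, the function names, and the example sequences are illustrative assumptions, not the authors' actual analysis pipeline.

```python
# Minimal sketch (assumption: a "nonstop frame" is a reading frame
# with no stop codon over the full length of the sequence).
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the antisense strand, written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def nonstop_frames(seq):
    """Return the frame offsets (0, 1, 2) of seq that contain no stop codon."""
    seq = seq.upper()
    frames = []
    for offset in range(3):
        # Walk the frame codon by codon, ignoring an incomplete trailing codon.
        codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
        if not any(codon in STOP_CODONS for codon in codons):
            frames.append(offset)
    return frames


def antisense_nonstop_frames(sense_seq):
    """Nonstop frames found on the antisense strand of a sense-strand sequence."""
    return nonstop_frames(reverse_complement(sense_seq))


# Hypothetical GC-rich toy sequence: every antisense frame is stop-free.
gc_rich = "GGCGCCGACGTC"
print(reverse_complement(gc_rich), antisense_nonstop_frames(gc_rich))

# Hypothetical AT-rich toy sequence: antisense frame 0 reads TAA TAA.
print(antisense_nonstop_frames("TTATTA"))
```

Because all three stop codons (TAA, TAG, TGA) are rich in A and T, a GC-rich sequence is less likely to contain them in any frame, which is consistent with the abstract's focus on GC-rich genes.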
We have previously proposed an SNS hypothesis on the origin of the genetic code (Ikehara and Yoshida 1998). The hypothesis predicts that the universal genetic code originated from the SNS code, composed of 16 codons and 10 amino acids (S denotes G or C, and N denotes any of the four bases). However, it must have been very difficult to create the SNS code at one stroke in the beginning. Therefore, we searched for a code simpler than the SNS code that could still encode water-soluble globular proteins with appropriate three-dimensional structures at a high probability, using four conditions for globular protein formation (hydropathy, alpha-helix, beta-sheet, and beta-turn formations). Four amino acids (Gly [G], Ala [A], Asp [D], and Val [V]) encoded by the GNC code satisfied the four structural conditions well, but the other codes in the rows and columns of the universal genetic code table did not, except for the GNG code, a slightly modified form of the GNC code. Three three-amino acid systems ([D], Leu, and Tyr; [D], Tyr, and Met; Glu, Pro, and Ile) also satisfied the above four conditions. However, some amino acids in the three systems are far more complex than those encoded by the GNC code. In addition, the amino acids in the three-amino acid systems are scattered in the universal genetic code table. Thus, we concluded that the universal genetic code originated not from a three-amino acid system but from a four-amino acid system, the GNC code encoding [GADV]-proteins, as the most primitive genetic code.
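The codon counts quoted above can be checked against the standard genetic code with a short Python sketch. The table below is the standard code written in the conventional U, C, A, G ordering with one-letter amino acid symbols; the helper names are illustrative assumptions and not part of the original work.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
# ("*" marks a stop codon).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Degenerate symbols used in the text: S = G or C, N = any of the four bases.
SYMBOLS = {"S": "GC", "N": "UCAG", "G": "G", "C": "C", "A": "A", "U": "U"}


def codons_matching(pattern):
    """Expand a three-letter pattern such as 'SNS' or 'GNC' into its codons."""
    return ["".join(c) for c in product(*(SYMBOLS[s] for s in pattern))]


def encoded_amino_acids(pattern):
    """Set of one-letter amino acids encoded by the codons of a pattern."""
    return {CODON_TABLE[codon] for codon in codons_matching(pattern)}


sns_codons = codons_matching("SNS")
print(len(sns_codons), sorted(encoded_amino_acids("SNS")))  # 16 codons, 10 amino acids
print(sorted(encoded_amino_acids("GNC")))  # Gly, Ala, Asp, Val as G, A, D, V
print(sorted(encoded_amino_acids("GNG")))  # Asp is replaced by Glu (E)
```

Under the standard code, the SNS pattern expands to 16 codons covering 10 amino acids, and the GNC pattern yields exactly Gly, Ala, Asp, and Val, matching the figures given in the abstract.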
Based on the fact that RNA has not only a genetic function but also a catalytic function, the RNA world theory on the origin of life was first proposed about 20 years ago. The theory assumes that RNA was amplified by self-replication, increasing RNA diversity on the primitive earth. Since then, the theory has been widely accepted as the most likely explanation for the emergence of life. In contrast, during studies on the origins of genes and the genetic code, we reached another hypothesis, the [GADV]-protein world hypothesis, which is based on pseudo-replication of [GADV]-proteins, where [G], [A], [D], and [V] refer to Gly, Ala, Asp, and Val, respectively. In this review, possible steps to the emergence of life are discussed from the standpoint of the [GADV]-protein world hypothesis, in comparison with the RNA world theory. It is also shown that [GADV]-peptides, which were produced by repeated dry-heating cycles and by solid phase peptide synthesis, have catalytic activities, hydrolyzing peptide bonds in a natural protein, bovine serum albumin. These experimental results support the [GADV]-protein world hypothesis for the origin of life.
We have previously postulated a novel hypothesis for the origin of life, assuming that life on the earth originated from the "[GADV]-protein world", not from the "RNA world" (see Ikehara's review, 2002). The [GADV]-protein world consists of peptides and proteins with random sequences of four amino acids (glycine [G], alanine [A], aspartic acid [D], and valine [V]), which accumulated by pseudo-replication of the [GADV]-proteins. To obtain evidence for the hypothesis, we produced [GADV]-peptides by repeated heat-drying of the amino acids for 30 cycles ([GADV]-P(30)) and examined whether the peptides have catalytic activities. The results showed that [GADV]-P(30) can hydrolyze several kinds of chemical bonds in molecules such as umbelliferyl-beta-D-galactoside, glycine-p-nitroanilide, and bovine serum albumin. This suggests that [GADV]-P(30) could play an important role in the accumulation of [GADV]-proteins through pseudo-replication, leading to the emergence of life. We further show that [GADV]-octapeptides with random sequences, containing no cyclic compounds such as diketopiperazines, have catalytic activity, hydrolyzing peptide bonds in a natural protein, bovine serum albumin. The catalytic activity of the octapeptides was much higher than that of the [GADV]-P(30) produced through repeated heat-drying treatments. These results also support the [GADV]-protein world hypothesis of the origin of life (see Ikehara's review, 2002). Possible steps for the emergence of life on the primitive earth are presented.