J. theor. Biol. (1975) 50, 13-23

Some Theoretical Aspects of the Problem of Life Origin

D. S. CHERNAVSKII
P. N. Lebedev Physical Institute

AND

N. M. CHERNAVSKAYA
Moscow State University, Moscow, U.S.S.R.

(Received 14 March 1974)

The problem of the origin of the primary self-reproducing protein-nucleotide complex is considered. A hypothesis on the origin is proposed which differs from the hypothesis of the random origin of the specific protein. According to the hypothesis given here the type of synthesis determines the form of the protein molecule, which uniquely determines its function. Coexistence of primary organisms with different codes is considered. It is shown that their mutual interactions are antagonistic and therefore a single version of the code among the many equivalent codes is selected and survives. The divergent stage of evolution during which many different organisms with the same code evolve is discussed.

1. Introduction

Theoretical aspects of the problem of the origin of life have been discussed by Quastler (1964), Eigen (1971) and Kuhn (1972). Three questions will be considered here. (1) Formation of the protein-nucleotide complex in which the protein promotes the replication of a polynucleotide and the polynucleotide itself promotes the synthesis of the same protein (formation of the simplest hypercycle in the sense of Eigen, 1971). (2) Formation of a united code. (3) The appearance of various proteins with different functions on the basis of the united code. Solution of the first problem is trivial under the assumption that the formation of the protein-nucleotide complex was random. The probability of such an event is extremely small, however, and thus this hypothesis contradicts the principle of evolutional continuity suggested by Lehninger (1970). According to this principle, each successive step of evolution was the result of previous development and its probability was close to unity. We shall discuss a possible mechanism for the gradual formation of the protein-nucleotide complex which is consistent with the principle of continuity, on the one hand, and with modern physico-chemical ideas about the process, on the other.

The second question has already been discussed by Quastler (1964) and Eigen (1971). According to Quastler, the version of the code which appeared before the others was the one realized. Eigen suggested the idea of selection of codes: first several different versions of the code were formed, then only the one best version was selected and survived. In such a selection, however, there may be a few best versions, and several equal versions should then have remained. We shall consider the process of selection of one version out of many equal ones.

The third question becomes urgent in connection with the first two. The point is that the process of selection of a united code is convergent. Under idealized conditions (in a mathematical model) it leads to a situation in which only one type of organism remains, with one type of protein and one nucleotide sequence of limited length. In this situation a transition to the divergent stage and to the formation of many proteins requires additional ideas. We shall treat this question on the basis of the hypothesis of the association of identical objects followed by their mutation.

2. Formation of Protein-Nucleotide Complex

In all the schemes of the origin of protein-nucleotide complexes capable of autoreproduction, a hypothesis of the random formation of a specifically active protein is introduced at one stage or another. This hypothesis is the most vulnerable point. In modern enzymes the primary sequence is rigidly fixed and defines the tertiary structure and function of the protein. On the other hand, polypeptides synthesized at random can display enzymic (especially hydrolytic and non-specific) activity.
The problem of the origin of life touches upon quite a different function: the protein produced should possess replicase but not hydrolytic activity. The supposition of random formation of protein with specific activity seems improbable. We shall consider the mechanism of synthesis of the “primary replicase” which does not require a hypothesis for accidental appearance of specific activity and which corresponds to the principle of evolutional continuity (see Chernavskaya & Chernavskii, 1974). We shall proceed from the following. (1) The primary replicase was formed before the translation mechanism and therefore we shall not assume that the amino acid sequence was “coded” by a nucleotide sequence or their groups.


(2) By the time of “primary replicase” formation both free nucleotides and spontaneously formed polynucleotide molecules [either double-helix DNA or compactly packed molecules like modern tRNA (Kuhn, 1972)] were already present in the “prebiological milieu” [more exactly, in its proteinoid microspheres (Oparin, 1968; Fox, 1969, 1972, 1973)]. All necessary conditions for spontaneous synthesis of polypeptides, i.e. the necessary amino acids and a condensing agent (Oparin, 1968; Fox, 1969) or merely activated amino acids, also existed.

(3) The presence of a polynucleotide macromolecule influenced the spontaneous synthesis of polypeptides in the same manner as the presence of a heterogeneous catalyst influences polymer synthesis. This means that the formation of at least a part of the polypeptides consisted of two stages. First the amino acids were adsorbed on the surface of a polynucleotide molecule, covering it completely (or almost completely), and then condensation took place, i.e. the formation of peptide bonds between neighbouring amino acids (when, of course, their carboxyl and amino groups were close enough). In such a synthesis the polypeptides produced acquired the form of a cover not accidentally but owing to the type of synthesis. Note that with such a type of synthesis the nucleotide sequence in double-helix DNA, or the monomer sequence in a polynucleotide complex in the Kuhn scheme, is not important, since the tertiary structure and form of polynucleotide complexes do not, in the first approximation, depend on sequence. Similar protein covers could be formed on polynucleotides with different sequences but of the same form.

(4) Proteins in the form of a cover had in most cases a replicase activity. This hypothesis is based on the statement, well known in enzymology, that the function of an enzyme macromolecule depends considerably on its form. Let us illustrate this statement. Figure 1 presents schematically a double-helix DNA and its protein cover.
Let us consider two versions.


(a) The protein cover is not completely complementary to the polynucleotide (there exists a dynamical correspondence) and it contains dynamical tensions. They promote the change in form of the protein to that shown in Fig. 1(a). The van der Waals forces between the internal surface of the protein cover and the external polynucleotide surface favour breakage of internal hydrogen bonds. The vacated sites are filled by free nucleotides, which causes movement of the protein cover to the left. In such a process the protein cover works as a replicase. This function is determined firstly by the form, secondly by the effect of dynamical correspondence (or by the presence of mechanical tensions and the conformational transitions induced by them) and thirdly by the presence of the van der Waals forces between the protein and the polynucleotide. The mechanism of replication presented above does not contradict modern ideas on enzymic catalysis. It is based on the dynamical correspondence principle (Koshland, 1964) and takes into account the role of mechanical tensions in the course of catalytic action (Hurgin, Chernavskii & Shnoll, 1967a,b).

(b) The protein cover is completely complementary and mechanical tensions are absent. Under stable conditions such a cover would inhibit rather than promote replication. If the conditions are changed, however (temperature, pH or the salt composition of the medium), tensions can appear in the protein cover and it will promote replication. In this case a considerable role can be played by the periodic change of external conditions discussed by Kuhn (1972).

The hypothesis concerning the mechanism of synthesis of the primary protein-nucleotide complex can, in principle, be tested experimentally. For this, one should compare the results of spontaneous polypeptide synthesis in vitro in the presence and in the absence of polynucleotide macromolecules. An important question concerning the conditions of such an experiment (temperature, pH, composition of salts, presence of organic low molecular weight compounds, etc.) requires special discussion.

3. Primary Translation Mechanism

The synthesis of the initial replicase presented above provides complementary autoreproduction of polynucleotides. It does not guarantee that the daughter polynucleotides will synthesize on themselves a protein cover identical to the maternal one. To fix the properties in posterity it is necessary to have a mechanism of translation. Its main function is to establish the correspondence between an amino acid (or an oligopeptide) and a group of nucleotides. The main role in translation is played by the adapters, substances which can on the one hand “recognize” a region of the polynucleotide sequence and on the other hand “recognize” a corresponding amino acid (or oligopeptide).† In the modern biosphere the role of adapters is played by tRNA and by the aminoacyl-adenylate synthetase proteins. The initial adapters were evidently much simpler, but they too must have possessed two determinant centres.

One can imagine several (at least two) schemes of initial adapter synthesis satisfying the principle of evolutional continuity and making no use of the hypothesis of an accidental appearance of specific adapters (Chernavskaya & Chernavskii, 1974). We shall not discuss which of the schemes is more realistic. It is important to emphasize that in all the existing (and in all possible) schemes the correspondence between amino acids and nucleotide groups is not predetermined. Let us consider this in more detail.

Let an object be formed which is autoreproducing and capable of protein-cover synthesis. The sequence of amino acids (or oligopeptides) in the protein replicase is designated by $a, b, c, d, \ldots, n$. The sequence of nucleotide groups (codons) in the polynucleotide of the object is designated by $\alpha, \beta, \gamma, \delta, \ldots, \nu$. Let the object be capable of inheriting the ability to synthesize its own cover (replicase). This means that in this object (or in a set of similar objects) the following correspondence is established: $a \leftrightarrow \alpha$; $b \leftrightarrow \beta$, etc. The correspondence is provided by a set of adapters with the structure $(a, \alpha)$; $(b, \beta)$, etc. (the left index here designates the determinant centre for the amino acid and the right one designates the anticodon). Let us assume that in some other place an analogous object was formed which has the same protein cover (whose sequence is $a, b, c, d, \ldots, n$) but another nucleotide sequence, designated by $\beta, \alpha, \delta, \ldots$. In this object the adapters have the structures $(a, \beta)$; $(b, \alpha)$; $(c, \delta)$; etc. Both types will have an equal right to exist and equal rates of autoreproduction. The number of equal versions of the code is high: of order $\nu!$.
If $\nu = 20$, then $\nu! = 20! \sim 10^{18}$. Of all possible versions only one survived and developed, i.e. that observed in the modern biosphere. In what way could one out of many equal versions have survived?

† The primary translation mechanism apparently differed from the modern one (Crick, 1973; Orgel, 1968). The codon could contain not three but a smaller number of nucleotides. A single amino acid might not have been coded, whereas a peptide containing several amino acids could have been.

To answer this question we shall discuss the problem of code degeneration. The code existing in the modern biosphere is degenerate, i.e. several different triplets of nucleotides code for the same amino acid. This indicates the presence of adapters of the type $(a, \alpha)$ and $(a, \beta)$. The inverse degeneration, that is the presence of adapters of the type $(a, \alpha)$ and $(b, \alpha)$, is excluded, and not only because the number of nucleotide triplets (64) exceeds the number of amino acids. The presence of such adapters would lead to ambiguity in the translation of the information recorded in DNA. An inversely degenerate adapter introduced into an organism (if it were artificially synthesized) would disrupt the process of translation and lead to errors in the protein sequence, i.e. it is a poison for any organism. This statement is also valid for primary objects‡ and suggests two conclusions.

(1) The requirement that inverse degeneration be absent imposes a limitation on the length of the primary polynucleotide. In fact, if the coding polynucleotide contains more than 64 triplets, at least one of them will occur twice. In this case there is a high probability that adapters containing the same anticodon will turn out to be complementary to different amino acids (or oligopeptides), i.e. inverse degeneration takes place. This leads to errors in the synthesis of the protein cover in daughter objects. It can be shown (Chernavskaya & Chernavskii, 1974) that in a random nucleotide sequence the probability of the absence of inverse degeneration is not small only if the length of the sequence is less than $m_0 = \tfrac{1}{2} \cdot 3 \cdot 64 = 96$.

(2) When two objects with similar protein covers and reproduction rates but with different nucleotide sequences meet, both objects are poisoned and become incapable of reproduction. In fact, the sets of adapters in such objects are different: the same amino acids correspond, generally speaking, to different anticodons. When the objects meet, their adapters mix and inverse degeneration takes place, which leads to errors in the reproduction of both objects. Thus, the interaction of such objects is antagonistic. This does not happen when identical objects originating from a common ancestor meet.

4. Selection of One Code out of Many Equal Codes

Let us consider a situation where there are objects with similar protein covers§ coded by different nucleotide sequences. The number of objects of the $i$th kind is designated by $X_i$. The number of different kinds of objects (i.e. different sequences) will be $N$. The model describing the development of the system with time can be written:

$$\frac{dX_i}{dt} = \alpha X_i - \gamma \sum_{j \neq i} X_i X_j. \tag{1}$$

Here $\alpha$ is an effective reproduction coefficient (the difference between the reproduction and mortality coefficients). The second term describes the death of the “objects” as a result of antagonistic interaction when they meet. The quantities $\alpha$ and $\gamma$ are taken to be the same for all types (they carry no indices $i$ and $j$), which expresses the equality of the different types of “objects”.

‡ The primary object here and throughout the paper is understood as a protein-nucleotide complex surrounded by a set of adapters.

§ It is therefore supposed that selection of the best version of a protein cover has already taken place, for example according to the scheme described by Eigen (1971).

Consider the simplest version of the model, $N = 2$, designating $X_1 = X$, $X_2 = Y$ and going over to the dimensionless variables

$$t' = \alpha t, \qquad x = \frac{\gamma}{\alpha} X, \qquad y = \frac{\gamma}{\alpha} Y.$$

System (1) may then be written in the form

$$\frac{dx}{dt'} = x - xy, \qquad \frac{dy}{dt'} = y - xy. \tag{2}$$
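The passage from system (1) to the dimensionless form (2) is stated without intermediate steps. A brief check, assuming only the definitions $t' = \alpha t$, $x = (\gamma/\alpha)X$ and $y = (\gamma/\alpha)Y$ and the case $N = 2$ of system (1), may be written as follows for the first equation (the second follows by exchanging the roles of $X$ and $Y$):

```latex
\begin{align*}
\frac{dx}{dt'} &= \frac{\gamma}{\alpha}\cdot\frac{1}{\alpha}\,\frac{dX}{dt}
                = \frac{\gamma}{\alpha^{2}}\left(\alpha X - \gamma X Y\right) \\
               &= \frac{\gamma}{\alpha}X
                  - \left(\frac{\gamma}{\alpha}X\right)\left(\frac{\gamma}{\alpha}Y\right)
                = x - xy .
\end{align*}
```

Both parameters are thus absorbed into the units of time and of population size, so that for positive $\alpha$ and $\gamma$ the qualitative behaviour of the two-type model does not depend on their particular values.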

The phase picture of equations (2) is shown in Fig. 2. There are two stationary states: the point $x = y = 0$ and the point (1), $x = y = 1$. Both are unstable, the point (1) being a saddle. A separatrix passes through them; owing to the symmetry it coincides with the bisector $x = y$. Thus, the system is a trigger. The stable stationary points are removed to infinity; there are two of them: $x \to \infty$, $y \to 0$ and $x \to 0$, $y \to \infty$. The meaning of this result is clear: since in this simplest model the development of the populations is limited neither by food supply nor by crowding (the “tightness effect”), the growth of the number of individuals is unlimited.

FIG. 2. The phase-plane picture of the system (2).
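The trigger behaviour described above can be illustrated numerically. The following is a minimal sketch, not part of the original paper: it integrates the dimensionless system (2) with a classical fourth-order Runge-Kutta scheme. The function names `rhs`, `rk4_step` and `simulate`, the step size and the initial values are illustrative assumptions.

```python
# Illustrative sketch (not from the paper): numerical integration of the
# dimensionless system (2), dx/dt' = x - x*y, dy/dt' = y - x*y.
# Function names, step size and initial values are assumptions.


def rhs(x, y):
    """Right-hand sides of system (2)."""
    return x - x * y, y - x * y


def rk4_step(x, y, dt):
    """Advance (x, y) by one classical fourth-order Runge-Kutta step."""
    k1x, k1y = rhs(x, y)
    k2x, k2y = rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
    k3x, k3y = rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
    k4x, k4y = rhs(x + dt * k3x, y + dt * k3y)
    x_new = x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
    y_new = y + dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
    return x_new, y_new


def simulate(x0, y0, t_end=10.0, dt=1e-3):
    """Integrate system (2) from (x0, y0) up to dimensionless time t_end."""
    x, y = x0, y0
    for _ in range(int(round(t_end / dt))):
        x, y = rk4_step(x, y, dt)
    return x, y


if __name__ == "__main__":
    # Start close to the saddle point (1, 1): on either side of the
    # bisector x = y, and exactly on it.
    for x0, y0 in [(1.01, 0.99), (0.99, 1.01), (1.0, 1.0)]:
        x, y = simulate(x0, y0)
        print(f"start ({x0}, {y0}) -> x = {x:.3e}, y = {y:.3e}")
```

Starting slightly on either side of the bisector, such a run should show one variable growing while the other decays towards zero, whereas a start exactly at the saddle point stays there. Which type wins is decided by the sign of the initial deviation alone.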


It would also be of interest to investigate the influence of the parameters on the intensity of selection. In this case it is easily done, since equations (2) admit an analytical solution. The trajectories of equations (2) obey the equation

$$\frac{dy}{dx} = \frac{y(1 - x)}{x(1 - y)}, \tag{3}$$

which has the solution

$$\ln\frac{y}{x} = y - x + C. \tag{4}$$

If the initial state lies on the separatrix, the integration constant $C = 0$. Consider the behaviour of the trajectory when the selection is proceeding rather intensely (i.e. beginning at a time $t_0$ at which $x \gg y$). Then equation (4) can be written in the form $y = x\,e^{-x}$. Substituting into equations (2) and solving we find: $x = x_0 e^{t'}$; $y = y_0 \exp[-x_0(e^{t'} - 1)]$. Going back to dimensional variables we find

$$Y = Y_0 \exp\left[-X_0\,\frac{\gamma}{\alpha}\left(e^{\alpha t} - 1\right)\right] \tag{5}$$

(here $X_0$ and $Y_0$ are already dimensional initial values). From this it is seen that the selection is intense in the model under consideration; the rate of decrease of $Y$ may be regarded as its measure. As $\gamma \to 0$ the intensity of the selection tends to zero. This is natural, since at $\gamma = 0$ the equations decouple and each population develops independently. From equation (5) it is also seen that the parameter $\alpha$ (the excess of reproduction over death) is much more important in selection than the parameter $\gamma$.

The properties of equations (1) at $N > 2$ are analogous to those considered above: there are $N + 1$ stationary states. Two are symmetric: in the first state all $X_i = 0$, in the second all $X_i$ are the same and equal to $(N - 1)^{-1}\alpha/\gamma$. Both these states are unstable. The remaining $N - 1$ states correspond to an infinite value of one of the variables and to zero values of the rest. It can be seen that there are no solutions corresponding to an unlimited growth of two or more variables $X_i$. Thus the model describes a complete mutual exclusion of different objects.

5. The Origin of Information and the Role of Instability

The whole process from the primary soup up to the formation of “objects” and a united code is often connected with the origin of biological information and a decrease of entropy. Let us consider this in more detail.


To characterize a set of objects of different types we may introduce a quantity of the entropy type

$$S = k \ln \Gamma, \tag{6}$$

where $k$ is the Boltzmann constant and $\Gamma$ is the statistical weight, i.e. the number of different ways to realize the given state. This number is

$$\Gamma = \frac{n!}{n_1!\, n_2! \cdots n_N!}. \tag{7}$$

Here $n_i$ is the number of objects of the $i$th type and $n = \sum_i n_i$ is the total number of objects. Using Stirling's formula, equation (6) can be written in the form

$$S = -nk \sum_{j=1}^{N} a_j \ln a_j, \tag{8}$$

where $a_j = n_j/n$ is the relative fraction of objects of the $j$th type. The relative entropy (per object) is

$$\sigma = \frac{S}{n} = -k \sum_{j} a_j \ln a_j. \tag{9}$$

The entropy thus introduced characterizes the degree of disorder of the state only with respect to a few selected degrees of freedom, i.e. those in which the objects under consideration differ. If the objects are identical in all the remaining degrees of freedom (except those under consideration), the entropy (8) enters the total entropy additively; otherwise the entropy connected with the code makes no sense [as do expressions (6) and (7)]. The entropy (8) is a very small part of the total entropy of the system, which includes oscillatory degrees of freedom, etc. (Blumenfeld, 1964).

The specific entropy (9) decreases in the course of code selection and at the end of the convergent stage it vanishes.¶ Suppose at the beginning there were $N$ different types of objects and the specific entropy $\sigma$ had its maximum value (at a given $N$), $\sigma = k \ln N$ (this value is reached if the quantities $a_j$ are equal). At the end of the process all the $a_j$ are equal to zero except one, which is equal to unity. The specific entropy $\sigma$ is equal to zero in this case. The entropy decrease is accompanied by an increase of information, and thus we face the origination of information, which is possible under two conditions: accident and the ability to remember an accidental event (Quastler, 1964; Blumenfeld, 1964).

¶ The decrease of the entropy (9) does not contradict the second law of thermodynamics, since it is compensated for by the entropy increase of the other degrees of freedom. It is important to emphasize that such a process is possible only in a thermodynamically nonequilibrium open system, which was represented by the primary soup.

A very important property of the model considered above is the instability of the symmetric state. Small random deviations produce a great effect. Any of the final states is stable, so that an accidental choice is remembered. The same role is played by the instability of the symmetric state in other processes accompanied by accumulation of information. An analogous situation occurs in the dynamical description of differentiation processes in ontogenesis (Grigorov, Polakova & Chernavskii, 1967). It turns out that any dynamical system capable of increasing information in the course of its development should go through unstable states.

Let us discuss the character of the information appearing by the end of the convergent phase. As usual, biological information is understood as information about the structure and properties of proteins written down in DNA. It is supposed here that one and the same code can carry information about several different proteins performing different functions. In our case there is only one sort of protein and at this stage there are no other sorts. Drawing an analogy with linguistics, one can say that in the modern biosphere many different words and word combinations are written down on the basis of a single alphabet, the information being of more importance than the alphabet in which it is written. Up to the moment under discussion only one word had been written. The meaning of this word (if a word is not meaningless for lack of other words) was “life”. On the example of this word an alphabet common to every living thing was chosen. Thus, biological information in the true sense of the word did not appear. Only the necessary basis for its notation, a united alphabet, was created.

6. Divergent Evolution

In the framework of the scheme developed here the appearance of different proteins and nucleotide sequences (obeying the united code) is possible only after the united code is formed. It could happen when the necessity appeared to perform several different functions.
This moment came when the substrates (nucleotides and amino acids) created in the pre-biological period were exhausted and it was necessary to accelerate their synthesis from the precursor. We cannot believe that new proteins with new functions as well as polynucleotides which coded them appeared at random; this would contradict the principle of evolutional continuity. Besides, in a random synthesis of protein-nucleotide complexes with different functions, all the abovementioned difficulties due to inverse degeneration would appear. We must assume that new proteins (and new nucleotide sequences) arose by mutation from those already existing. In order that new objects do not lose old, vitally important functions we must assume that association of identical objects took place before mutation. The hypothesis of an important role of association is not new and has already been discussed in many papers.


Let us emphasize that in the model in question association does not merely speed up evolution but is a necessary factor of it. Recall that the optimal length of the primary coding polynucleotide ($m_0 = 96$) is not very large; in the modern biosphere the structural genes coding for one protein sequence are much longer (for example, a gene coding for 200 amino acids has 600 nucleotides). It is natural to suppose that such long sequences also arose by the association of primary polynucleotides followed by mutation of their parts. This can be tested experimentally. In fact, in this case a long sequence should retain the memory of its origin in the form of correlations in the arrangement of nucleotides. It should be expected that each sequence can be divided into approximately equal sections. To verify this on nucleotide sequences is difficult, since few sequences have been determined as yet. However, it can be verified on determined protein sequences (for there are more of them); they should be expected to possess similar properties. There are indications of the existence of such correlations, with a correlation length of about 14 amino acid residues. This does not contradict our scheme, since the length of the corresponding region of the polynucleotide, $3 \cdot 14 = 42$, is smaller than $m_0$. No detailed and convincing correlation analysis of protein sequences has yet been carried out. Such an analysis would be of great interest.

We would like to thank U. I. Hurgin and L. A. Blumenfeld for fruitful discussions and remarks.

REFERENCES

BLUMENFELD, L. A. (1964). In On the Nature of Life, p. 221. Moscow: Nauka.
CHERNAVSKAYA, N. M. & CHERNAVSKII, D. S. (1973). Theor. exp. Biophys. 6, 3.
CRICK, F. H. C. (1971). J. molec. Biol. 38, 381.
EIGEN, M. (1971). Naturwissenschaften 58, 465.
FOX, S. (1969). Naturwissenschaften 56, 1.
FOX, S. (1972). Molecular Evolution and the Origin of Life. San Francisco: Freeman.
FOX, S. (1973). Naturwissenschaften 60, 8.
GRIGOROV, L. N., POLAKOVA, M. S. & CHERNAVSKII, D. S. (1967). Molek. Biol. 1, 3.
HURGIN, U. I., CHERNAVSKII, D. S. & SHNOLL, I. E. (1967a). Molek. Biol. 1, 415.
HURGIN, U. I., CHERNAVSKII, D. S. & SHNOLL, I. E. (1967b). In Oscillatory Processes in Biological and Chemical Systems. Moscow: Nauka.
KOSHLAND, D. (1964). Horizons in Biochemistry (M. Kasha & B. Pullman, eds). New York: Academic Press.
KUHN, H. (1972). Angew. Chem. 84, 838.
LEHNINGER, A. L. (1970). Biochemistry. New York: Worth.
OPARIN, A. I. (1968). Life, its Nature, the Origin and the Development. Moscow: Nauka.
ORGEL, L. E. (1968). J. molec. Biol. 38, 381.
QUASTLER, H. (1964). The Emergence of Biological Organization. New Haven and London: Yale University Press.
