Proc. Natl. Acad. Sci. USA Vol. 76, No. 5, pp. 2288-2292, May 1979

Biophysics

Complexity transmission during replication (intramolecular aperiodicity/entropy measure/cause-effect relationship)

BRIAN K. DAVIS

Long Island Research Institute, Health Sciences Center 10T, State University of New York, Stony Brook, New York 11794

Communicated by J. L. Oncley, February 23, 1979

ABSTRACT The transmission of complexity during DNA replication has been investigated to clarify the significance of this molecular property in a deterministic process. Complexity was equated with the amount of randomness within an ordered molecular structure and measured by the entropy of a posteriori probabilities for discrete (monomer sequences, atomic bonds) and continuous (torsion angle sequences) structural parameters in polynucleotides, proteins, and ligand molecules. A theoretical analysis revealed that sequence complexity decreases during transmission from DNA to protein. It was also found that sequence complexity limits the attainable complexity in the folding of a polypeptide chain and that a protein cannot interact with a ligand moiety of higher complexity. The analysis indicated, furthermore, that in any deterministic molecular process a cause possesses more complexity than its effect. This outcome broadly complies with Curie's symmetry principle. Results from an analysis of an extensive set of experimental data are presented; they corroborate these findings. It is suggested, therefore, that complexity governs the direction of order-order molecular transformations. Two biological implications are (i) replication of DNA in a stepwise, repetitive manner by a polymerase appears to be a necessary consequence of structural constraints imposed by complexity, and (ii) during evolution, increases in complexity had to involve a nondeterministic mechanism. This latter requirement apparently applied also to development of the first replicating system on earth.

Measure of complexity

Heretofore, many investigations of renaturation kinetics have used the length of nonrepetitious sequences in DNA as an index of complexity. Because these studies link biological complexity with randomness in biopolymer sequences, they can be regarded as antecedents to the present work. However, it was necessary for this investigation to have a new complexity measure, which would apply to different species of molecules and to both discrete and continuous structural elements (monomers, torsion angles, atomic bonds). An intuitively satisfactory measure would of course depend on the length of a unique segment and, in addition, on the variety of elements it contains. Use of a generalized entropy is an obvious possibility. This measure of complexity ($\mathcal{C}$) can be defined as follows:

$$\mathcal{C} = \sum_i \mathcal{C}_i = -K \sum_i n_i \log p_i = \sum_i n_i C; \qquad p_i = n_i \Big/ \sum_i n_i. \qquad [1]$$

$\mathcal{C}_i$ and $C$ refer, respectively, to total complexity for i-type associations (for example, i-type element pairs) and to the average complexity per association. $n_i$ denotes the number of i-type element associations, and the summations include all i. K is a normalization constant, which depends on the logarithmic base. With logarithms of base 2, complexity is measured in binary units (bits) if K = 1. In this system, a minimally complex structure having only two distinct elements contains just one bit of complexity per element.

The $p_i$ are a posteriori probabilities. Thus, Eq. 1 extends the domain of entropy from that of thermodynamics and information theory; these fields, as is well known, involve a priori probability or estimations of it (6-8). For example, if each nucleotide in the natural set {A, G, T, C} is equiprobable a priori, occurrence of any equally long sequences, such as A-A-A-A and C-T-G-A, is also equiprobable a priori. They represent, therefore, identical amounts of information (H); here, $H = \log_2 4 = 2$ bits per monomer. However, these sequences are distinguished by the amount of complexity they contain; C(A-A-A-A) = 0 and C(C-T-G-A) = 2 bits per monomer.

The variety of element associations depends exponentially on span length when structural elements have equal a posteriori probability. With a logarithmic measure, however, complexity is a function of only the total number and variety of elements, and it is independent of the choice of span length:

$$\mathcal{C} = (N \log m^{s})/s = N \log m$$

(conditions: N/s [illegible] m^s; s [illegible])
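The measure in Eq. 1 and the span-length argument can be checked numerically. The following is a minimal sketch, not the author's procedure: the function names `complexity` and `spans` are hypothetical, base-2 logarithms with K = 1 are assumed so that results are in bits, and an "association" is taken to be a consecutive, non-overlapping span of s elements. Reading N as the total number of elements, m as the number of element types, and s as the span length is likewise an assumption drawn from the surrounding sentence.

```python
import math
from collections import Counter


def complexity(associations, K=1.0):
    """Total and average complexity of a collection, after Eq. 1.

    Uses base-2 logarithms, so K = 1 gives a result in bits.
    """
    counts = Counter(associations)          # n_i for each i-type association
    n_total = sum(counts.values())          # sum of n_i over all i
    if n_total == 0:
        raise ValueError("at least one association is required")
    total = 0.0
    for n_i in counts.values():
        p_i = n_i / n_total                 # a posteriori probability
        total += -K * n_i * math.log2(p_i)  # contribution of type i
    return total, total / n_total


def spans(sequence, s):
    """Split a sequence into consecutive, non-overlapping spans of length s."""
    return [sequence[i:i + s] for i in range(0, len(sequence), s)]


# The two tetranucleotides compared in the text.
print(complexity("AAAA"))
print(complexity("CTGA"))

# A 32-monomer sequence in which every dinucleotide occurs once (s = 2)
# and every monomer occurs eight times (s = 1).
seq = "".join(a + b for a in "AGTC" for b in "AGTC")
print(complexity(spans(seq, 1))[0], complexity(spans(seq, 2))[0])
```

Under these assumptions the sketch gives 0 bits per monomer for A-A-A-A and 2 bits per monomer for C-T-G-A, as stated in the text. For the 32-monomer sequence, the total is 64 bits whether it is counted by monomers or by dinucleotides, which agrees with $N \log_2 m$ for N = 32 and m = 4.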
