ANALYTICAL BIOCHEMISTRY 79, 110-118 (1977)

Resolution of Components in Sedimentation Equilibrium Concentration Distributions¹

KIRK C. AUNE²,³ AND MICHAEL F. ROHDE

Marrs McLean Department of Biochemistry, Baylor College of Medicine, Houston, Texas 77030

Received June 7, 1976; accepted December 8, 1976

A procedure is discussed whereby the concentration distribution at sedimentation equilibrium may be resolved into the contributing redistributed components in the system. The procedure is shown to deal best with heterogeneous systems, but can also lend itself to the analysis of a system where the molecular weights of interacting proteins are quite similar by imposing composition constraints. In those cases where interaction occurs between components of the system, a calculation of the equilibrium constant which describes the association can be made. Moreover, the fitting error is related to the fitting parameters in such a manner as to yield the estimated error of the equilibrium constant.

A system containing macromolecules which has been redistributed in a centrifugal field by an analytical ultracentrifuge reveals considerable information about its chemical content. The resultant concentration distribution contains information regarding the molecular size of the components within, as well as the extent of molecular interaction between components of the system. Considerable effort has been devoted to the extraction of information from sedimentation equilibrium experiments (1-6). Most of these approaches have been quite successful for the characterization of homogeneous associations. Recently, this laboratory has presented data regarding the interaction between isolated proteins from the 30S ribosomal subunit (7,8). After trying many different approaches, including curve fitting of point-weight average molecular weight distributions in terms of an association constant and known molecular weights and a one-fell-swoop least-squares fit of all concentration displacement data in terms of meniscus concentration and known molecular weights, the technique presented in this paper was adopted. The disadvantages of the error found in point-weight average molecular weights are particularly serious for heterogeneous systems, since there is an additional degree of freedom. The disadvantage of the single fit obtained by the least-squares technique is that the low level error inherent in the data frequently allows for a fit which has no physical significance (i.e., negative concentrations). The search technique to be discussed allows for error minimization while maintaining physical relevance.

¹ This work was supported in part by the National Institutes of Health (GM 22244).
² Recipient of an NIH Career Development Award (K04 GM 00071).
³ To whom correspondence should be addressed.

Copyright © 1977 by Academic Press, Inc. All rights of reproduction in any form reserved.
ISSN 0003-2697

METHODS

The calculations discussed here were performed using a Hewlett-Packard 9810A programmable calculator equipped with 2036 programming steps, 111 storage registers, and a cassette storage device. A program was developed using the numerical methods of Hooke and Jeeves (9). Modifications were required to accommodate the idiosyncrasies of the Hewlett-Packard language and the particular nature of the problem to be dealt with here.

It is important to consider the nature of the data that are analyzed. Sedimentation equilibrium experiments provide, in the case of an ideal solute, a concentration distribution which is, theoretically, exponential according to the physical properties of that solute. The concentration distribution is generally measured utilizing the differential refraction property observed with Rayleigh optics. This method does not necessarily provide an absolute concentration because it is a differential measurement. Hence, the convention used here employs the symbol $y$ to denote the fringe displacement as measured relative to the meniscus; the symbol $f$, to denote the absolute fringe displacement (directly proportional to the concentration of the solute through a determinable constant, $k$); and the symbol $f(a)$, to denote the absolute fringe displacement at the radial position of the meniscus ($r = r_a$). The data available are, therefore, sets of $y$ values for sets of radial positions. The $y$ values are determined for a particular experiment as discussed previously (7). Generally, 20-50 values of $y$ are collected, starting from displacements of about 50 μm (micrometers). These values along with the experimental parameters are saved as a single-run data set available for further analysis.

If the set of data were derived from an ideally behaving homogeneous solute, the data would then be described by Eq. [1]:

$$y(r) = f(a)\cdot\{\exp[\sigma_i(r^2 - r_a^2)/2] - 1\}, \qquad [1]$$

$$\sigma_i = M_i(1 - \bar{v}_i\rho)\omega^2/RT, \qquad [2]$$

with $M_i$ and $\bar{v}_i$ the molecular weight and partial specific volume of solute $i$, and $\rho$, $\omega$, $R$, and $T$ the density, angular velocity, gas constant, and temperature, respectively. If $N$ species are present, the total fringe displacement observed at radial position $r$ is:

$$y(r) = \sum_{i=1}^{N} f_i(a)\cdot\{\exp[\sigma_i(r^2 - r_a^2)/2] - 1\}. \qquad [3]$$
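The model of Eqs. [2] and [3] can be sketched in a few lines of Python. This is only an illustration: the function names, the use of NumPy, and the choice of cgs units (R in erg/(mol K), radii in cm, ω in rad/s) are assumptions and are not taken from the original calculator program.

```python
import numpy as np

R_GAS = 8.314e7  # gas constant in erg/(mol K); cgs units are an assumption here


def sigma(M, vbar, rho, omega, T):
    """Eq. [2]: sigma_i = M_i (1 - vbar_i * rho) * omega^2 / (R T), in cm^-2."""
    return M * (1.0 - vbar * rho) * omega**2 / (R_GAS * T)


def fringe_displacement(r, r_a, f_a, sig):
    """Eq. [3]: y(r) = sum_i f_i(a) * (exp[sigma_i (r^2 - r_a^2) / 2] - 1)."""
    r = np.asarray(r, dtype=float)
    f_a = np.asarray(f_a, dtype=float)
    sig = np.asarray(sig, dtype=float)
    # One column per species: the exponential "basis" evaluated at every radius.
    basis = np.exp(0.5 * np.outer(r**2 - r_a**2, sig)) - 1.0
    return basis @ f_a
```

Because the displacement is linear in the coefficients $f_i(a)$ once the $\sigma_i$ are fixed, the same basis matrix can be reused for every trial set of coefficients during a fit.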


Therefore, it is the desired goal to take experimental data and extract information regarding composition by requiring conformity with Eq. [3]. That function is performed by the iteration scheme. The search scheme employed here establishes the best set of $f_i(a)$ values which describes the data. The molecular weight $M_i$ and, hence, $\sigma_i$ are established from separate experiments on homogeneous systems. Consequently, the fitting procedure is one of establishing the linear coefficients of a series of exponentials. A three-species system will require three coefficients for fit if the three species have nonidentical molecular weights.

The fit proceeds with guess values for the meniscus concentrations. A computation of "goodness of fit" is performed; one or two meniscus concentration parameters are changed by a specified change parameter; the "goodness of fit" is computed again and compared with the previous configuration. If there is improvement in fit at the end of a series of changes considering all parameters, the parameters assume the new values. The change parameter corresponding to the parameter changed is increased, and the cycle is repeated. When no improvement is achieved upon completion of a series of changes (both positive and negative and/or paired) for the parameters, the change parameter is reduced by a specified amount. The meniscus concentration parameters are constrained to values equal to or greater than zero.

A measure of "goodness of fit" employed is one where the deviation of fit, $\delta$, from the data defined by Eq. [4]:

$$\delta(r) = y(r) - \sum_{i=1}^{N} f_i(a)\cdot\{\exp[\sigma_i(r^2 - r_a^2)/2] - 1\} \qquad [4]$$

is computed at each radial position. The absolute value of $\delta$ is then summed over all radial positions to define the average residual $\bar{R}$ as

$$\bar{R} = \sum_{j=1}^{n} |\delta(r_j)| / (n - N - 1), \qquad [5]$$

where $n$ is the number of data points. The numerical value of $\bar{R}$ should correspond to the random error in the data collected in the case of a "perfect" fit. The definition of a residual is an arbitrary one, for the sum of the squares of the deviations could be minimized as an alternative.

The method requires guess values for initialization as well as change values to be applied to the guess values in order to search for a fit. Convergence on a set of values considered to be the parameters which best describe the distribution is achieved by a decrementing factor which reduces the change parameters when no better fit can be obtained. Finally, the cycling stops when the change parameters are reduced to some fraction of the fitting parameters. This fraction is called the limit change.
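The search loop described above can be sketched as follows. This is a simplified exploratory search in the spirit of Hooke and Jeeves, not the authors' calculator program: it changes one parameter at a time (no paired moves), and the names `pattern_search`, `growth`, and `floor` are hypothetical. The residual follows Eq. [5], the non-negativity constraint is enforced by clipping, and the stopping rule compares each change parameter with `limit` times its fitting parameter.

```python
import numpy as np


def mean_residual(params, basis, y):
    """Eq. [5]: sum of |delta(r_j)| divided by (n - N - 1)."""
    n, n_species = basis.shape
    return np.abs(y - basis @ params).sum() / (n - n_species - 1)


def pattern_search(basis, y, guess, steps, decrement=0.29, limit=0.01,
                   growth=1.5, floor=1e-6, max_cycles=10000):
    """Exploratory direct search over non-negative meniscus coefficients.

    basis[j, i] = exp[sigma_i (r_j^2 - r_a^2) / 2] - 1, as in Eq. [3].
    growth and floor are illustrative choices, not values from the paper.
    """
    p = np.array(guess, dtype=float)
    steps = np.array(steps, dtype=float)
    best = mean_residual(p, basis, y)
    for _ in range(max_cycles):
        improved = False
        for i in range(p.size):
            for sign in (1.0, -1.0):
                trial = p.copy()
                # Constrain the coefficient to be zero or greater.
                trial[i] = max(0.0, p[i] + sign * steps[i])
                resid = mean_residual(trial, basis, y)
                if resid < best:
                    p, best, improved = trial, resid, True
                    steps[i] *= growth  # enlarge the change that helped
                    break
        if not improved:
            steps *= decrement  # no better fit found: reduce all changes
            # Stop once every change is a small fraction of its parameter.
            if np.all(steps <= limit * np.maximum(p, floor)):
                break
    return p, best
```

With several closely spaced exponentials the residual surface has a long, shallow valley, so a single run of such a search need not land on a unique answer; restarting from different guesses is a reasonable precaution.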


The values to be used for the guess values, change parameters, decrementing factor, and limit change will be discussed in connection with some treated data in the Results and Discussion section. This is because these quantities can have a considerable effect on proper convergence.

RESULTS AND DISCUSSION

The curve-fitting procedure outlined provides a set of values, $f_i(a)$, one for each component $i$ in the system, in the event that the molecular weights of the components are dissimilar. These coefficients can be used to characterize the particular system.

Homogeneous Association

The self-association of macromolecules as in the scheme,

$$A + A \rightleftharpoons A_2, \quad A + A_2 \rightleftharpoons A_3, \ \text{etc.}, \qquad [6]$$

is tractable with the procedure discussed here. If the system is an indefinite association where a single equilibrium constant can describe the complete association, other methods of fit would be more desirable, because the parameters derived from the search fit are independent parameters, and one parameter is required for each species. For dimerization, trimerization, and possibly tetramerization, the fitting procedure would yield values of $f(a)$ for each component. At sedimentation equilibrium, chemical equilibrium is also achieved. Hence, the distribution of components at the meniscus is at chemical equilibrium. The equilibrium constants of association are readily obtainable as in the case of dimerization:

$$K_1 = \frac{M}{2}\cdot\frac{f_{A_2}(a)}{k\cdot[f_A(a)]^2}; \qquad [7]$$

and trimerization:

$$K_2 = \frac{2M}{3}\cdot\frac{f_{A_3}(a)}{k\cdot f_{A_2}(a)\cdot f_A(a)}, \qquad [8]$$

where $k$ is the optical constant relating fringe displacement to $C$, the concentration in grams per liter.

$$C = k\cdot f \qquad [9]$$

Heterogeneous Association

A simple heterogeneous association is schematically represented by

$$A + B \rightleftharpoons C, \qquad [10]$$

where molecules A and B are dissimilar. The extraction of coefficients leads to the equilibrium constant:

$$K = (M_A\cdot M_B/M_C)\cdot\{f_C(a)/[k\cdot f_A(a)\cdot f_B(a)]\}. \qquad [11]$$
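As a minimal sketch, the conversion from fitted meniscus coefficients to association constants (Eqs. [7] and [11]) might look like the following. The function names are hypothetical, and $M_C = M_A + M_B$ is assumed for the 1:1 complex. The example values in the usage note are illustrative: $M_A$ = 20,000 and $M_B$ = 40,000 are assumed molecular weights, chosen because they reproduce a constant of $10^4$ M$^{-1}$ from meniscus concentrations of 1.0, 0.75, and 0.5625 g/liter; the paper does not state the molecular weights it used.

```python
def k_dimerization(f_A, f_A2, M, k):
    """Eq. [7]: K1 = (M / 2) * f_A2(a) / (k * f_A(a)^2), in liters per mole.

    M is the monomer molecular weight; k converts fringes to grams per liter.
    """
    return (M / 2.0) * f_A2 / (k * f_A**2)


def k_heterogeneous(f_A, f_B, f_C, M_A, M_B, k):
    """Eq. [11] for A + B <-> C, assuming M_C = M_A + M_B."""
    M_C = M_A + M_B
    return (M_A * M_B / M_C) * f_C / (k * f_A * f_B)
```

For example, `k_heterogeneous(1.0, 0.75, 0.5625, 20000.0, 40000.0, 1.0)` evaluates to approximately 1.0e4 when the coefficients are already expressed in grams per liter (k = 1).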


It is immediately evident that a problem develops as $M_A$ approaches $M_B$. This occurs because the ability to assign a unique concentration to component A or B at the meniscus is impaired. If $M_A$ equals $M_B$, it is clear that only the sum of the concentrations of the monomeric species can be obtained as a coefficient, and, therefore, the equilibrium constant becomes indeterminate on those data alone. If it is known what fraction of the mass, $X_A$, in the system is comprised by component A, then the equilibrium constant becomes:

$$K = (M_A/2k)\cdot f_C(a)/\{[f_A(a) + f_B(a)]^2\cdot X_A(1 - X_A)\}. \qquad [12]$$
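One way to see how Eq. [12] follows from Eq. [11] is sketched below. It assumes, as the text implies, that $M_A = M_B$ (so that $M_C = 2M_A$) and that the single monomer coefficient obtained from the fit, $f_A(a) + f_B(a)$, is divided between A and B in the known mass proportions $X_A$ and $1 - X_A$. That apportioning step is an interpretation, not something spelled out in the source.

```latex
\begin{aligned}
f_A(a) &= X_A\,[f_A(a) + f_B(a)], \qquad
f_B(a) = (1 - X_A)\,[f_A(a) + f_B(a)], \\[4pt]
K &= \frac{M_A M_B}{M_C}\cdot\frac{f_C(a)}{k\, f_A(a)\, f_B(a)}
   = \frac{M_A^2}{2M_A}\cdot
     \frac{f_C(a)}{k\, X_A (1 - X_A)\,[f_A(a) + f_B(a)]^2} \\[4pt]
  &= \frac{M_A}{2k}\cdot
     \frac{f_C(a)}{[f_A(a) + f_B(a)]^2\, X_A (1 - X_A)} .
\end{aligned}
```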

Equation [11] assumes that the molecular weights of A and B are significantly different so that the coefficients $f_A(a)$ and $f_B(a)$ are uniquely determined. Equation [12] assumes that the molecular weights are identical and that the mass fractions are known. At some ratio of $M_A/M_B$ approaching 1.0, the ability to use Eq. [11] breaks down, and the ability to use Eq. [12] improves. The domain between the two equations is obscured because of the inability of the fitting procedure to take the data (with or without error) and establish a significance between the two exponential factors. Naturally, the distinction is better realized with data possessing a low level of error, but, at a certain point, the number of significant digits carried in the calculation then becomes important. The size of that domain can be explored by analyzing the results obtained with simulated data.

A series of data were generated for an A + B ⇌ C equilibrium scheme, where the molecular weight of A was held constant and that of B was varied such that $M_A/M_B$ ratios from 0.5 to 2.0 could be obtained. The meniscus concentration of A and the association constant were held constant with values of 1.0 g/liter and $10^4$ M$^{-1}$, respectively. (Note that the magnitude of the concentration is irrelevant here since ideal behavior of solutes is assumed.) In order to obtain a proper evaluation of the method, the synthetic data were generated with an imposed random error. The random error utilized was Gaussian about each concentration in the data set and possessed a standard deviation of ±0.2 g/liter. This error is approximately 1% of the total concentration in the system and would be proportional to and typical of real interferometric data. The data are shown in Table 1.

TABLE 1
COMPUTED COMPOSITION AND EQUILIBRIUM CONSTANTSᵃ

M_A/M_B   C_A           C_B           C_C           R̄ᶠ     C_A^T/C_B^T   K (M⁻¹ × 10⁻⁴)ᵇ   K (M⁻¹ × 10⁻⁴)ᶜ
0.5       0.78 (1.0)    0.84 (0.75)   0.55 (0.56)   0.15   0.41 (0.45)   1.13 ± 0.07        1.31
0.7       1.08 (1.0)    0.85 (0.85)   0.72 (0.72)   0.21   0.70 (0.69)   0.92 ± 0.07        0.94
0.8       0.96 (1.0)    0.90 (0.90)   0.81 (0.81)   0.20   0.79 (0.80)   1.05 ± 0.08        1.05
0.9       0.84 (1.0)    1.22 (0.95)   0.88 (0.90)   0.15   0.78 (0.90)   0.91 ± 0.06        0.88
1.0       -ᵈ (1.0)      -ᵈ (1.0)      0.99 (1.00)   0.13   -ᵉ (1.0)      -ᵉ                 0.90
1.111     0.95 (1.0)    1.09 (1.0?)   1.12 (1.11)   0.26   1.07 (1.10)   1.02 ± 0.13        1.02
1.25      1.24 (1.0)    0.94 (1.13)   1.24 (1.27)   0.17   1.39 (1.22)   0.94 ± 0.08        0.94
1.429     1.29 (1.0)    0.92 (1.21)   1.44 (1.47)   0.14   1.60 (1.35)   1.00 ± 0.10        0.99
2.0       1.18 (1.0)    1.31 (1.50)   2.21 (2.25)   0.12   1.89 (1.73)   0.95 ± 0.09        1.02

ᵃ Quantities in parentheses are theoretical (K = 1 × 10⁴ M⁻¹); concentrations in grams/liter.
ᵇ Using Eq. [11] and computed composition.
ᶜ Using Eq. [12] and known composition.
ᵈ Not uniquely defined. C_A + C_B = 2.10 (two-species fit).
ᵉ Not uniquely defined.
ᶠ Imposed error was normal (σ = ±0.2), thus R̄ = 0.14.

A procedure for obtaining the data was followed whereby it was assumed that nothing was known about the quantities of the species in the system other than that there were three species with specified molecular weights. The initial value taken for the meniscus concentration of each species was 0.0. The change parameters, that is, the variations taken for each meniscus concentration, were established in the following manner. The displacement at the base of the cell, $y_b$, and the mean molecular weight in the cell are combined to estimate $f(a)$ by the equation:

$$f(a) = y_b\cdot\exp\{[-\bar{M}(1 - \bar{v}\rho)\omega^2/2RT]\cdot(r_b^2 - r_a^2)\}, \qquad [13]$$

where $\bar{M} = \sum_{i=1}^{N} M_i/N$. This very approximate value is then used to establish a set of change parameters for the meniscus concentrations, $\Delta C_i$, of each of the species, $i$, in the system. A reasonable guess for the change parameter for component 1 (the smallest component) is taken to be 10% of the approximate value of $f(a)$. It can be shown that an equal change to the total concentration of the system can be effected using the change parameters for the meniscus concentration of the other species in the system with the relation:

$$\Delta C_i(a) = \Delta C_1(a)\cdot(M_i/M_1)\cdot[\exp(2H_1) - 1]/[\exp(2H_i) - 1], \qquad [14]$$

where

$$H_i = M_i(1 - \bar{v}\rho)\omega^2(r_b^2 - r_a^2)/4RT. \qquad [15]$$
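The starting change parameters of Eqs. [13] to [15] can be sketched as below. The function names and the cgs unit choice are assumptions, a single partial specific volume is used for all species as Eq. [15] suggests, and the molecular weights are expected in ascending order so that the first entry is component 1.

```python
import numpy as np

R_GAS = 8.314e7  # gas constant in erg/(mol K); cgs units are an assumption here


def h_factor(M, vbar, rho, omega, T, r_a, r_b):
    """Eq. [15]: H_i = M_i (1 - vbar*rho) omega^2 (r_b^2 - r_a^2) / (4 R T)."""
    return M * (1.0 - vbar * rho) * omega**2 * (r_b**2 - r_a**2) / (4.0 * R_GAS * T)


def initial_change_parameters(y_b, M, vbar, rho, omega, T, r_a, r_b, fraction=0.10):
    """Return the rough f(a) of Eq. [13] and the change parameters of Eq. [14]."""
    M = np.asarray(M, dtype=float)  # ascending; M[0] is the smallest component
    H = h_factor(M, vbar, rho, omega, T, r_a, r_b)
    H_mean = h_factor(M.mean(), vbar, rho, omega, T, r_a, r_b)
    # Eq. [13]: the exponent -Mbar(1 - vbar*rho) w^2 (r_b^2 - r_a^2) / 2RT equals -2*H_mean.
    f_a_est = y_b * np.exp(-2.0 * H_mean)
    dC1 = fraction * f_a_est  # 10% of the rough meniscus value for component 1
    # Eq. [14]: scale so each species changes the cell total by the same amount.
    dC = dC1 * (M / M[0]) * np.expm1(2.0 * H[0]) / np.expm1(2.0 * H)
    return f_a_est, dC
```

Because larger species are concentrated toward the cell base, their meniscus change parameters come out smaller than that of component 1 for the same effect on the total concentration.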

The search is then commenced as outlined earlier, with a decrement factor of 0.29 (in order to generate pseudo random decrements) and a limit change of 0.01. This means that all parameters will have been decremented four times. When this preliminary search stops, the established values for the concentrations are taken as starting guesses, and the original change parameters are reapplied. The limit change is then set at 0.001 (0.1% change sensitivity), the decrement factor is set at 0.79 (enlarges the search surface), and the search begins again. The result obtained after the outlined sequence of operations does not necessarily uniquely define the best set to fit the data. Normally, other searches are utilized starting from above and below the previously obtained sets with slightly varied change parameters. A final set is obtained when several attempts provide essentially the same set of values within a reasonable variation.

The sum of the set of meniscus concentrations obtained by the aforementioned method should agree with other means of measure. A technique outlined elsewhere (1) provides one method for estimating the total amount of material at the meniscus. In those instances where the initial composition is known, greater confidence in the final set of fitting parameters can be achieved. In the case of A and B forming C, the initial concentration of A can be obtained by integrating over the contents of the cell,

$$f_A^{\circ} = \int_{a}^{b} f_A(a)\cdot\exp[\sigma_A(r^2 - r_a^2)/2]\,dr^2/(r_b^2 - r_a^2), \qquad [16]$$

to obtain that amount found as free component A and,

$$f_A^{\circ}\big|_C = [M_A/(M_A + M_B)]\cdot f_C^{\circ}, \qquad [17]$$

as the amount of component A associated with component C. Thus, after integration, one obtains:

$$f_A^{\circ} = \{f_A(a)\cdot[\exp(2H_A) - 1]/2H_A\} + [M_A/(M_A + M_B)]\cdot f_C(a)\cdot\{[\exp(2H_C) - 1]/2H_C\}. \qquad [18]$$

Similarly, the initial concentration of B is

$$f_B^{\circ} = \{f_B(a)\cdot[\exp(2H_B) - 1]/2H_B\} + [M_B/(M_A + M_B)]\cdot f_C(a)\cdot\{[\exp(2H_C) - 1]/2H_C\}. \qquad [19]$$

Thus, if the ratio, $f_A^{\circ}/f_B^{\circ}$, computed from parameters $f_A(a)$, $f_B(a)$, $f_C(a)$ in Eqs. [18] and [19] agrees with the known initial concentration ratio, a satisfactory set of parameters is established.

The data of Table 1, therefore, illustrate that uniqueness deteriorates with a three-species fit as the ratio of molecular weights approaches 1.0. The last column in the table shows that, if the composition is known, the equilibrium constant is still reliably obtained with the use of Eq. [12]. Another feature illustrated in Table 1 is that the largest molecular weight species (largest exponential term) is most accurately determined. This is due to the domination of the fit by the large exponential term. Hence, in some cases, the equilibrium constant may have less precision than desired, but a significant association will not necessarily escape detection.

Precision of Equilibrium Constants

The procedures discussed lead to a determination of the meniscus concentrations of the species in the system. These quantities are then combined to yield an equilibrium constant. The remaining questions are:


What is the error on the determined meniscus concentrations? and How does that error cascade to the equilibrium constant?

The curve-fitting procedure yields an error parameter, $\bar{R}$, which has a value as defined in Eq. [5]. This uncertainty would relate to the ability to define the total concentration in the cell. The total concentration in the cell is obtained by numerical integration and has increased precision as the number of points increases. However, the concentration at any point is the result of a difference determination relative to that near the meniscus. Hence, the accuracy of the concentration determination remains dependent on the inherent error at a single position, which is on the order of $\bar{R}$. The total concentration, $f_T$, is distributed among the different species in the cell. The uncertainty in the total concentration of any one species $i$ in the cell must be

$$\delta f_i^{\circ} = \bar{R}, \qquad [20]$$

because of the inability to uniquely proportion the total error among the contributing species. Now, the variation of the total concentration of the ith species, $\delta f_i^{\circ}$, varies directly with the variation of the meniscus concentration of the ith species, $\delta f_i(a)$. Thus,

$$\delta f_i^{\circ} = [f_i^{\circ}/f_i(a)]\cdot\delta f_i(a); \qquad [21]$$

hence, by combining Eq. [20] and [21],

$$\delta f_i(a) = [f_i(a)/f_i^{\circ}]\cdot\bar{R} \qquad [22]$$

is obtained. Therefore, it is seen that the error in the determination of the meniscus concentration of the ith component is proportional to the error of fit and to the absolute value of the parameter, and inversely proportional to the total amount of that species present. The error in the equilibrium constant is then computed by propagating these errors through the expression for $K$ (Eq. [23]), which reduces to

$$\delta K = \bar{R}\cdot K\cdot\Big[\sum_{i=1}^{N}(1/f_i^{\circ})^2\Big]^{1/2}. \qquad [24]$$
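A plausible route to Eq. [24] is the usual propagation of independent errors applied to Eq. [11], sketched here for the three species A, B, and C. The assumption of uncorrelated errors in the three meniscus coefficients is introduced for this illustration and is not stated in the text. Since $K$ is proportional to $f_C(a)/[f_A(a)\,f_B(a)]$, each partial derivative has magnitude $K/f_i(a)$, and Eq. [22] supplies $\delta f_i(a)/f_i(a) = \bar{R}/f_i^{\circ}$.

```latex
\begin{aligned}
(\delta K)^2
  &= \sum_{i \in \{A,B,C\}}
     \left(\frac{\partial K}{\partial f_i(a)}\right)^{2} [\delta f_i(a)]^{2}
   = K^{2} \sum_{i} \left(\frac{\delta f_i(a)}{f_i(a)}\right)^{2}
   = K^{2}\,\bar{R}^{2} \sum_{i} \left(\frac{1}{f_i^{\circ}}\right)^{2}, \\[4pt]
\delta K
  &= \bar{R}\cdot K\cdot
     \Big[\sum_{i} (1/f_i^{\circ})^{2}\Big]^{1/2}.
\end{aligned}
```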

The error in the equilibrium constant is seen to greatly increase as the quantity of material implicated in the equilibrium is decreased. The effect of the error can now be considered. Since the imposed error had a standard deviation of ±0.2, the total meniscus concentration that was ultimately obtained through the fitting process should be within that error. An inspection of Table 1 reveals that the total meniscus concentration for each $M_A$:$M_B$ ratio was computed to within about 0.14 g/liter of the theoretical value in the worst case and to within about 0.06 g/liter on the average.
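The error estimates of Eqs. [22] and [24] can be evaluated numerically as in the sketch below. It takes the cell total of each species to be its meniscus coefficient multiplied by the factor $[\exp(2H_i) - 1]/2H_i$ that appears in Eq. [18]; treating $f_i^{\circ}$ this way for every species, including the complex, is an interpretation made for this example, and the function names are hypothetical.

```python
import numpy as np


def cell_average_factor(H):
    """[exp(2H) - 1] / (2H): ratio of cell-average to meniscus concentration."""
    H = np.asarray(H, dtype=float)
    return np.expm1(2.0 * H) / (2.0 * H)


def equilibrium_constant_error(f_a, H, K, R_bar):
    """Return the meniscus errors of Eq. [22] and the error in K of Eq. [24].

    f_a   : fitted meniscus coefficients f_i(a), one per species
    H     : the H_i of Eq. [15] for the same species
    R_bar : average residual of the fit, Eq. [5]
    """
    f_a = np.asarray(f_a, dtype=float)
    f0 = f_a * cell_average_factor(H)   # assumed cell total of each species
    df_a = f_a / f0 * R_bar             # Eq. [22]
    dK = R_bar * K * np.sqrt(np.sum((1.0 / f0) ** 2))  # Eq. [24]
    return df_a, dK
```

The square-root term grows quickly when any one species is present in small amount, which is the behavior described in the paragraph above.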


The error for the meniscus concentration of each species as predicted by Eq. [22] is that which is shown in Table 1. The error expected for the equilibrium constant as computed by Eq. [24] is also illustrated in Table 1. The summation term on the right-hand side of Eq. [24] is relatively constant for the different data sets. Hence, the expected error, $\delta K$, for the data sets presented in Table 1 that possess a random error of ±0.2 g/liter would be ±0.065 × $10^4$ M$^{-1}$. This is consistent with results illustrated in Table 1.

Heterogeneous Nonassociation

A heterogeneous system of noninteracting molecules should be reasonably well characterized, at least for N = 2 or 3. The composition ratios should be independent of loading concentration and revolutions per minute. These latter observations are, of course, utilized to demonstrate the absence of reversible mass-action association within a system. Again, the resolution will depend upon the molecular weight ratio in much the same manner as for the interacting systems.

ACKNOWLEDGMENT

The program development was greatly facilitated by having a FORTRAN basic search program kindly supplied to us by Michael Lee of the Department of Physics, Northwestern University.

REFERENCES

1. Kar, E. G., and Aune, K. C. (1974) Anal. Biochem. 62, 297.
2. Chernyak, V. Ya., and Magretova, N. N. (1975) Biochem. Biophys. Res. Commun. 65, 990.
3. Teller, D. C., Horbett, T. A., Richards, E. G., and Schachman, H. K. (1969) Ann. N. Y. Acad. Sci. 164, 66.
4. Adams, E. T., Jr. (1969) Ann. N. Y. Acad. Sci. 164, 226.
5. Roark, D. E., and Yphantis, D. A. (1969) Ann. N. Y. Acad. Sci. 164, 245.
6. Haschemeyer, R. H., and Bowers, W. F. (1970) Biochemistry 9, 435.
7. Rohde, M. F., O'Brien, S., Cooper, S., and Aune, K. C. (1975) Biochemistry 14, 1079.
8. Rohde, M. F., and Aune, K. C. (1975) Biochemistry 14, 4344.
9. Hooke, R., and Jeeves, T. A. (1961) J. Ass. Comput. Mach. 8, 2.
