J. Mol. Biol. (1976) 102, 221-235

Similarity of Three-dimensional Structure Between the Immunoglobulin Domain and the Copper, Zinc Superoxide Dismutase Subunit

JANE S. RICHARDSON, DAVID C. RICHARDSON, KENNETH A. THOMAS

Department of Anatomy and Department of Biochemistry
Duke University, Durham, N.C. 27710, U.S.A.

ENID W. SILVERTON AND DAVID R. DAVIES

Laboratory of Molecular Biology, N.I.A.M.D.D.
National Institutes of Health, Bethesda, Md 20014, U.S.A.

(Received 11 August 1975, and in revised form 8 December 1975)

A striking similarity in three-dimensional structure has been observed in two functionally unrelated, sequentially non-homologous proteins: the immunoglobulin domain and the copper, zinc superoxide dismutase subunit. The immunoglobulin molecule contains several structurally similar domains composed of antiparallel β strands forming a bilayer structure or flattened cylinder. The same topological folding pattern of the antiparallel β strands into a bilayer structure of the same overall shape is found in superoxide dismutase, and external loops occur in places equivalent to hypervariable region loops. Quantitative comparisons of the various structures have been made and are discussed in detail.

1. Introduction

In the past few years many examples have been noted of significant similarities in basic three-dimensional folding pattern among the proteins whose conformations have been determined by X-ray crystallography. So far, these similarities in three-dimensional structure have fallen into three general categories. (1) Similarity of tertiary structure among protein families which share clear amino acid sequence homologies, such as the globins (Hendrickson & Love, 1971), the cytochromes c (Timkovich & Dickerson, 1973), and the trypsin-like serine proteases (Shotton & Watson, 1970). (2) Similarity of folding pattern (with or without clear sequence homology) for structural domains within a single protein, such as are found in the immunoglobulins (Schiffer et al., 1973; Poljak et al., 1973) and in the carp muscle calcium-binding protein (Kretsinger, 1972). (3) Similarity of three-dimensional folding pattern for functionally related domains in separate proteins which differ widely in the remainder of their backbone conformations, such as the nucleotide-binding domain found in lactate dehydrogenase, liver alcohol dehydrogenase, flavodoxin, and adenylate kinase (Rossmann et al., 1974; Schulz & Schirmer, 1974).


The current paper describes a fourth such type of similarity, a close resemblance of folding pattern between two functionally unrelated proteins with no evident sequence homology: the immunoglobulin structural domain and the Cu,Zn superoxide dismutase subunit.

2. Description and Comparison of the Two Structures

FIG. 1. Stereo drawings of the α-carbon backbone for (a) an immunoglobulin variable domain (VH of McPC603) and (b) a copper, zinc superoxide dismutase subunit. Both are viewed from the same direction as the schematic drawings in Fig. 4. The hypervariable residues are indicated by solid circles in (a).

An IgG immunoglobulin molecule contains 12 domains (4 in the 2 light chains and 8 in the 2 heavy chains), all about the same size and with an equivalent internal disulfide bridge. The Fab fragment, the largest portion of an immunoglobulin for which a detailed three-dimensional structure is known (Poljak et al., 1974; Segal et al., 1974), contains four such domains (VL, VH, CL, and CH1); these have now been shown to be similar in conformation as well as sequence. This similarity extends across species, human and mouse structures being closely related (Padlan & Davies, 1975). Figure 1(a) is a stereo drawing of the α-carbon backbone of an immunoglobulin variable domain. Figure 2(a) is a diagram of its topology. The basic folding pattern common to all the immunoglobulin domains contains seven strands of β structure (strands A through G in Fig. 2(a)), all antiparallel except for the first and last strands. In the variable domains (VL and VH) there are in addition three hypervariable regions, between β strands B and C, C and D, and F and G. These

FIG. 2. Topology diagrams of the β structures of (a) an immunoglobulin variable domain and (b) a copper, zinc superoxide dismutase subunit. Each horizontal line represents a strand of the β-sheet, labeled as described in the text. The cylinders are shown as though opened between the N and C terminal strands, laid flat on the page, and viewed from the outside. The dotted sections in (a) are the hypervariable regions; the topology of a constant domain would be the same, but with the dotted sections left out.


hypervariable regions extend as loops to form the antigen-binding site, and they vary greatly in length and amino acid sequence from one immunoglobulin molecule to another. The constant domains (CL and CH1) are approximately the same total length as variable domains (about 110 residues) but are missing the long loop between strands C and D; each β strand is longer and the entire constant domain is somewhat longer and flatter than a variable domain.

The seven β strands of each domain are arranged in two layers which curve around to form a flattened cylinder. Strands A,B,E and D are in one layer and strands C,F and G in the other. For variable domains the curve on the β sheet is fairly smoothly continuous, with three or four parallel-type hydrogen bonds on the side between strands A and G, but strands C and D are rather widely separated. For constant domains the layers are quite distinct. In almost all variable domains the hypervariable region loop between strands C and D forms two more extended chains which approximately continue the C,F,G layer of β sheet, forming a nine-stranded cylinder. In each domain a disulfide bridge crosses the cylinder between strands B and F.

Further description of immunoglobulin three-dimensional structures is available in the literature, including the antigen-binding sites, binding of small antigens, domain interactions, α-carbon co-ordinates, hydrogen-bonding schemes, positions of invariant residues, etc. (Schiffer et al., 1973; Poljak et al., 1974; Epp et al., 1974; Segal et al., 1974; Davies et al., 1975a,b).

The superoxide dismutases are enzymes which scavenge the highly reactive superoxide radical (O2·−) by dismuting it to O2 and H2O2, a protection that seems to be essential for any organism capable of metabolizing oxygen (Fridovich, 1974). The form of this enzyme found in the cytoplasm of eukaryotes has two identical subunits each with about 150 amino acid residues, one zinc, and one catalytic copper.
The principal feature of the three-dimensional structure of bovine Cu2+, Zn2+ superoxide dismutase (Richardson et al., 1975a,b) is a somewhat flattened cylinder (or "barrel") consisting of eight strands of β sheet, all with antiparallel-type hydrogen bonding between neighbouring strands. Figure 1(b) is a stereo drawing of the α-carbon backbone of the superoxide dismutase subunit and Figure 2(b) is a diagram showing the topology of the β structure. If the first β strand in the sequence is labeled N, then the remaining strands, A through G, have the same topological connectivity and occupy the same positions on the cylinder as the corresponding strands in an immunoglobulin domain. The superoxide dismutase cylinder can also be thought of as two layers of β sheet, with β strands N,A,B and E in one layer and D,C,F and G in the other. The superoxide dismutase β sheet is somewhat more smoothly cylindrical than that in the immunoglobulin domains (especially the constant domains). It also has a relatively wide separation next to strand D, but in this case between strands D and E. The D strand is within hydrogen-bonding distance of E next to the D,E bend, but most of its length lies next to C.

The similarity in shape and arrangement of the various β cylinders can be seen in Figure 3, which shows end-on and side views of just the β strands for a copper, zinc superoxide dismutase subunit, a variable immunoglobulin domain, and a constant immunoglobulin domain. The cylinders are all somewhat flattened in the direction perpendicular to the two "layers"; the minor and major cross-sectional axes (measured between main chains) and the length of the cylinders are about 12 Å × 16 Å × 28 Å for superoxide dismutase, about 10 Å × 16 Å × 28 Å for variable domains, and about 10 Å × 16 Å × 33 Å for constant domains. All of these cylinders have the


FIG. 3. End-on and side views of the β cylinder for (a) copper, zinc superoxide dismutase, (b) VH from McPC603, and (c) CH1 from McPC603. For the variable domain, β strands contributed by hypervariable regions are shown with open lines.

usual direction of β-sheet twist (right-handed if defined along the chains, left-handed if defined perpendicular to the chains). An imaginary spiral path going once around the cylinder and remaining locally perpendicular to the chain direction would have its beginning and end offset from each other by four or five residues. This represents much less twist than, for instance, in the two six-stranded chymotrypsin β cylinders, where the offset would be eight or nine residues.

In the superoxide dismutase structure, bends CD and FG extend out from the cylinder as long loops which help form the copper and zinc binding sites (the metals are only 6 Å apart). These bends are in the same places as two of the three bends which form hypervariable-region (antigen-binding) loops in immunoglobulin variable domains. In Figure 4 the correspondence of both β structure and loops between a superoxide dismutase subunit and an immunoglobulin VH domain is shown by means


FIG. 4. Comparison of step-by-step build-up of backbone configurations for superoxide dismutase (down the left side) and an immunoglobulin variable domain (down the right side). New backbone added at each step is shown by heavy arrows.


of a sequence of schematic drawings in which the two structures are built up side-by-side. Superoxide dismutase does not have a disulfide bridge across the β cylinder, and the residues in those positions are not cysteines (the single disulfide in superoxide dismutase is between β strand G and one of the long external loops). Also the subunit contact within the superoxide dismutase dimer (Richardson et al., 1975b) does not have the same geometry as either of the two types of domain contact found in Fab structures.

3. Superposition of α-carbon Co-ordinates

Within the Fab fragment there is very close conformational (and also sequence) homology between the two variable domains VL and VH, and between the two constant domains CL and CH1. Variable and constant domains are almost certainly related to each other also, but much more distantly. For the purposes of the current study, the superoxide dismutase co-ordinates were compared to VH, VL, CL and CH1 co-ordinates, and for reference, comparisons were made between VH and CL, VH and VL, and between VL and CH1.

As a quantitative measure of the similarity between these structures, their α-carbon co-ordinates were superimposed and the square of the distances between equivalent α-carbons was refined to an optimal least-squares fit (Rao & Rossmann, 1973). The superoxide dismutase co-ordinates were measured from the 3 Å resolution electron density map of the bovine erythrocyte enzyme (Richardson et al., 1975a,b) and the immunoglobulin co-ordinates were from the 3.1 Å resolution map of the mouse myeloma protein McPC603 Fab fragment (Segal et al., 1974). Initial equivalencing of residues was achieved by aligning the corresponding β strands, taking into account which residues were internal and which external, and aligning the known or presumed hydrogen bonds. A relative shift along the cylinder axis by two residues in either direction allowed fewer residues to be superimposed. The initial transformation matrix (defined by 3 Eulerian angles and 3 translational parameters) which would approximately superimpose the two structures was then refined by least-squares. α-Carbons which were more than 4 Å apart were omitted from the following refinement cycle; this had the effect of including only the "framework" of β structure and a few of the most equivalent loops. Where needed, the assignment of equivalent residues was changed. Cycles of refinement and equivalencing were repeated until there were no further changes in the equivalencing.
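The bookkeeping in the refinement cycle described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the two co-ordinate sets are taken to be already in a common frame, the rotation and translation refinement itself (Rao & Rossmann, 1973) is not reproduced, and the function names and sample co-ordinates are hypothetical rather than taken from the paper.

```python
import math

CUTOFF = 4.0  # Å; pairs farther apart than this are left out of the next cycle


def separation(a, b):
    """Distance in Å between two α-carbon positions given as (x, y, z) tuples."""
    return math.dist(a, b)


def rms_separation(pairs):
    """Root-mean-square separation over a list of equivalenced α-carbon pairs."""
    return math.sqrt(sum(separation(a, b) ** 2 for a, b in pairs) / len(pairs))


def prune_pairs(pairs, cutoff=CUTOFF):
    """Keep only the pairs whose separation does not exceed the cutoff."""
    return [(a, b) for a, b in pairs if separation(a, b) <= cutoff]


def sorted_separations(pairs):
    """Separations sorted in increasing order, as plotted in a sorted distribution."""
    return sorted(separation(a, b) for a, b in pairs)


# Hypothetical, already-superimposed co-ordinates (not data from the paper).
pairs = [
    ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0)),    # 1 Å apart
    ((5.0, 0.0, 0.0), (5.0, 2.0, 0.0)),    # 2 Å apart
    ((10.0, 0.0, 0.0), (10.0, 0.0, 5.0)),  # 5 Å apart: dropped by the cutoff
]
kept = prune_pairs(pairs)
print(len(kept), round(rms_separation(kept), 3))
```

In a full procedure the pruning would alternate with re-fitting the transformation until the list of equivalenced pairs stops changing.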
Table 1 lists the α-carbon pairs which were equivalenced for each of the comparisons and gives the overall root-mean-square separation distances obtained. Figure 5 shows the complete (sorted) distribution of separation distances for each of these comparisons. Also illustrated is a similarly obtained distribution for the comparison of the McPC603 VL with VL REI, an analogous domain but in an independently determined structure from a different species (Epp et al., 1974); that overall r.m.s. distance was 1.18 Å for 80 equivalenced α-carbon pairs. VH and VL are the most similar of the non-identical domains, giving an r.m.s. distance of 1.72 Å for 77 α-carbon pairs. Comparisons of variable with constant domains show somewhat greater differences, even though the hypervariable loops are not included. All four comparisons with superoxide dismutase show greater differences than any of the intra-immunoglobulin comparisons, but both variable and constant domains are nearly as similar to superoxide dismutase as they are to each other.



FIG. 5. Distributions of α-carbon pair separation distances for the structure superpositions described in Table 1; for each structure comparison the distances are sorted so that they increase monotonically from left to right. All immunoglobulin domains are from McPC603 except for VL REI, which is also a κ-type light chain. SOD refers to the bovine copper, zinc superoxide dismutase. Horizontal axis: α-carbon pair number (sorted by increasing separation distance). Curve labels legible in the plot: SOD-CH1, SOD-VH, SOD-CL, VL-VL REI.

4. Probability Analysis of the Topological Similarity

In order to make a rough estimate of the likelihood that the degree of similarity shown by the immunoglobulin and superoxide dismutase structures occurred by chance, one may calculate the number of possible topologies for a tertiary structure of this general type, as was done by Schulz & Schirmer (1974) for the nucleotide-binding domains. For this purpose we will consider the structure as a complete cylinder, although (except for VL of IgG RHE (B. C. Wang, C. S. Yoo and M. Sax, personal communication)) immunoglobulin domains are always missing the hydrogen bonds on one side of strand D. We will allow either parallel or antiparallel β sheet (after all, the immunoglobulin case itself has two parallel strands). It is arguable


whether or not to allow connections which "cross" (using a helix or some other structure) from one end of the cylinder to the other; seven "cross" connections occur in the triose phosphate isomerase β cylinder (Banner et al., 1975), but they are undoubtedly somewhat disfavored in a structure of only 100 to 150 residues. We will try the calculation both with and without "cross" connections, but in any case will not permit a connecting chain to go down the inside of the cylinder.

For a cylinder with $n$ strands of β structure (parallel or antiparallel) there are $(n-1)!$ possible topologies if "cross" connections are not permitted and $2^{n-1} \times (n-1)!$ possible topologies if they are permitted. If $n = 8$ this gives $7! = 5040$ possibilities ($2^{7} \times 7! = 645{,}120$ possibilities under the less conservative assumption), or a probability of 1/5040 that two such structures will match by chance. The probability of such a structure matching the immunoglobulin domain structure is a factor of two greater than that, since deletion to make seven rather than eight strands could be on either the N or C terminal side. The overall probability of matching must be multiplied by another factor of 1/7, because there are at least $7 \times 6 = 42$ ways of placing the external loops of superoxide dismutase relative to the topology of the cylinder, and only $3 \times 2 = 6$ of these ways match the placement of immunoglobulin hypervariable region loops. From this analysis, the probability of the topological similarity between these two molecules is $1/7! \times 2 \times 1/7 = 1/17{,}640$ (or 1/2,257,920 if "cross" connections are permitted).

There are two very serious problems with the above sort of analysis. The first is that, as we have seen, a fairly minor change in our definition of what constitutes "the same general type of structure" changed the result by more than two orders of magnitude.
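The arithmetic of this estimate can be checked with a short Python sketch. It simply re-evaluates the counts quoted above with exact fractions; the function names are illustrative and not part of the original analysis.

```python
from fractions import Fraction
from math import factorial


def topologies(n, allow_cross=False):
    """Number of topologies for an n-stranded cylinder under the counting above."""
    count = factorial(n - 1)
    if allow_cross:
        # Each of the n - 1 connections may additionally be a "cross" connection.
        count *= 2 ** (n - 1)
    return count


def match_probability(n=8, allow_cross=False):
    """Chance-match probability following the three factors given in the text."""
    p = Fraction(1, topologies(n, allow_cross))
    p *= 2                # strand deletion may be on the N- or C-terminal side
    p *= Fraction(6, 42)  # 3 x 2 matching loop placements out of 7 x 6
    return p


print(topologies(8), topologies(8, allow_cross=True))
print(match_probability(), match_probability(allow_cross=True))
```

The ratio between the two results is $2^{7} = 128$, which is the "more than two orders of magnitude" sensitivity to the definition of permitted connections.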
The second, related problem is that we have assumed that all local connectivity types are equally probable; although there is no way to calculate the relative probabilities a priori, a simple count of what patterns occur in the known protein structures shows an extremely strong preference for simple connection between nearest-neighbor strands. This is presumably a result of the fact (Wetlaufer, 1973; Ptitsyn & Rashin, 1975) that the statistics of interactions during the folding process very strongly favor interaction of secondary structure pairs which are adjacent in the sequence.

In order to greatly alleviate both of the above problems, we will perform a topological analysis which builds in empirical estimates of the relative probabilities of the various local connectivity types. Table 2 shows the observed occurrence frequencies of the different strand connectivities in β structures of at least four strands, classified according to the separation of the two connected strands in the β sheet and according to whether the connection stays at one end of the sheet (types ±n) or crosses to the opposite end of the sheet (types ±nX, or "cross" connections). For ±n types the two strands connected are antiparallel to one another and for ±nX they are parallel; however, except for n = 1 that does not directly correlate with whether they each are parallel or antiparallel to their nearest neighbors in the sheet. Type X connections possess a handedness; however, right and left-handed cases are not listed separately in the table because out of 66 total observed cases of type X connections 64 are right-handed and only two left-handed (a ±1X in subtilisin and one in hexokinase). In the following analysis we will simply assume that only right-handed "cross" connections are permissible.

Referring to the topology in Figure 2(b), let us proceed through the superoxide dismutase cylinder one strand at a time in sequence, using the empirical occurrence


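TABLE 2

A summary of how often each local connectivity type has been observed in 18 proteins with β-sheet structures of at least four strands

Columns in the original table: Antiparallel†, Parallel‡, Mixed§, Total occurrences, Smoothed relative occurrence frequencies.

Connectivity   Total         Smoothed relative
type           occurrences   occurrence frequency
±1             52            52
±1X            20            20
±2              8             9
±2X            10             9
±3              8             9
±3X             5             5
±4              2             2
±4X             3             2
±5              2             2
±5X                           1
±6                            1
±6X                           1
±7                            1
±7X                           1
±8                            1
±8X             1             1

Entries surviving from the Antiparallel†, Parallel‡ and Mixed§ columns, whose row positions cannot be recovered: 29, 5, 8, 4, 4; 23, 15, 4, 4; 4; 2, 2; 2, 1; 3; and one further single occurrence (1) belonging to either ±7X or ±8.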

No structures are included which have identical topologies or which differ in only one strand. The first 3 columns give separate tabulations for β-sheets with all antiparallel†, all parallel‡, and with mixed§ hydrogen bonding. The connectivity types are defined in the text. Note that one reason for small numbers near the bottom of the table is that only a few of the β-sheets are large enough to make those connectivity types possible; however, all sheets included allow for differences at least up to ±3.
† Concanavalin A (Reeke et al., 1975), superoxide dismutase (Richardson et al., 1975a), chymotrypsin (Birktoft & Blow, 1972), papain (Drenth et al., 1971b), rubredoxin (Watenpaugh et al., 1973), and T4 phage lysozyme (Matthews & Remington, 1974).
‡ Subtilisin (Drenth et al., 1971a), lactate dehydrogenase (Rossmann et al., 1974), and triose phosphate isomerase (Banner et al., 1975).
§ Carbonic anhydrase (Kannan et al., 1971), hexokinase (Fletterick et al., 1975), carboxypeptidase A (Lipscomb et al., 1968), thermolysin (Colman et al., 1972), prealbumin (Blake et al., 1974), glyceraldehyde 3-phosphate dehydrogenase (Buehner et al., 1974), staphylococcal nuclease (Arnone et al., 1971), cytochrome b5 (Mathews et al., 1971), and thioredoxin (Holmgren et al., 1975).

frequencies to evaluate among the possibilities available at each step the relative probability of the actual connection made. The probability at each step is defined as the smoothed relative occurrence frequency for the type of connection actually made, divided by the sum of the smoothed relative occurrence frequencies for all permissible connections still available at that step. At the first connection (which is actually of type ±1 out of the possibilities ±1, ±1X, ±2, ±2X, ..., omitting left-handed X types) the probability is $52 \div (52 + 20 + 9 + 9 + 9 + 5 + 2 + 2 + 9 + 9 + 52) = 52/178$. At the next step (also ±1) the probability is $52 \div (52 + 20 + 9 + 9 + 9 + 5 + 2 + 2 + 9 + 9) = 52/126$, since one of the nearest-neighbor chains has already been used. Continuing in this manner, one calculates 52/178 × 52/126 × 9/88 × 52/151>
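The first two step probabilities quoted above can be reproduced with a minimal Python sketch. The frequencies are the smoothed values of Table 2 for separations up to 4, and the lists of available connections follow the two sums given in the text; the string labels and the function name are a hypothetical encoding chosen for illustration, and the later steps are not attempted because they depend on the strand order in Figure 2(b).

```python
from fractions import Fraction

# Smoothed relative occurrence frequencies from Table 2 (separations up to 4).
SMOOTHED = {"1": 52, "1X": 20, "2": 9, "2X": 9, "3": 9, "3X": 5, "4": 2, "4X": 2}


def step_probability(actual, available):
    """Frequency of the connection made, divided by the sum over available ones."""
    total = sum(SMOOTHED[t.lstrip("+-")] for t in available)
    return Fraction(SMOOTHED[actual.lstrip("+-")], total)


# First connection on an 8-stranded cylinder: both directions for the non-cross
# types, right-handed X types only, and a single strand at separation 4.
first = ["+1", "1X", "+2", "2X", "+3", "3X", "4", "4X", "-3", "-2", "-1"]
# Second connection: one nearest-neighbour strand has already been used.
second = first[:-1]

p1 = step_probability("+1", first)
p2 = step_probability("+1", second)
print(p1, p2)
```

Note that `Fraction` reduces its result, so 52/178 prints as 26/89 and 52/126 as 26/63.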
