Structure and Function of Glycosylation

ing period (irrespective of seniority) and the requirement to give meticulous attention to experimental technique and presentation of results. For example, in the days of the Lang-Levy pipettes, it was unthinkable to start experimental work for one's project without first calibrating a new, personal, set of these. This was a multipurpose exercise, not least to ensure good pipetting and weighing skills, and to set the scene for the precision work to follow. Then, there were (and still are) the Saturday 08.00 h-11.00 h seminars that everyone simply had to attend, and be prepared to discuss three recent articles. The enthusiasm and joy in sharing new data with Kabat is an unforgettable experience. Elvin Kabat remains extremely active in teaching, research and writing. During the last 16 years he has been spending two days a week at the National Institutes of Health in Bethesda where he has an office and compiles the atlas, 'Sequences of Proteins of Immunological Interest', of which he has recently completed the fifth edition and hopes to be producing the sixth. The rest of the week is spent in his laboratory at Columbia University in New York where he is pursuing active research, attracting grant support and supervising the work of graduate students. One of his most recent studies has been a convergence of the fields of immunology and carbohydrate biology: it is focused on the influence of glycosylation of the antibody variable region on antigen binding. The key findings of this topic, which Dr Kabat addressed at this Colloquium, are summarized below.

V-region glycosylation and antigen binding in monoclonal anti-α(1→6)dextrans

A. Wright,* S. Wallick,† M. H. Tao,† E. A. Kabat† and S. L. Morrison*
*Molecular Biology Institute, U.C.L.A., Los Angeles, CA 90024, U.S.A. and †Department of Microbiology, Columbia University College of Physicians and Surgeons, New York, NY 10032, U.S.A.

Carbohydrate addition sites Asn-X-Ser/Thr are essential but not sufficient for glycosylation on the Asn. Since carbohydrate is present on the CH1 region of mouse IgG, its replacement with human IgG4 constant region permits study of variable region glycosylation. Anti-α(1→6)dextrans arising as somatic cell mutants or obtained by site-specific mutagenesis and having potential addition sites may or may not be glycosylated depending on their position in CDR2. Addition of carbohydrate to hybridoma 14.6b.1 increases binding for dextran 10-50-fold. Creation of carbohydrate addition sites in CDR2 at Asn-54 and Asn-60 results in their glycosylation; Lys-62 → Thr-62 decreases affinity for antigen, whereas a glycosylation site created by Asn → Thr-60 increased affinity 3-fold and glycosylation at amino acid 54 inhibited binding. A carbohydrate binding site in CDR2 in an antidansyl antibody was not glycosylated. All carbohydrates added to the glycosylation sites bound to concanavalin A, indicating functionality at the surface of CDR2 of the expressed antibodies. J.4 and J.5 reduced or abolished binding compared with J.2 idiotypic; specificity was not affected.

Studies of oligosaccharide and glycoprotein conformation

Elizabeth F. Hounsell, Michael J. Davies and David V. Renouf
Glycoconjugates Section, Clinical Research Centre, Watford Road, Harrow HA1 3UJ, U.K.

An underlying aspect of our understanding of the role that glycosylation has to play in the structure and function of glycoproteins is the necessity to visualize oligosaccharide conformations and identify determinants of activity for biological and immunological recognition. The carbohydrate moieties can contribute a significant proportion of the hydrodynamic mass of glycoproteins and have a diverse array of oligosaccharide sequences for specific interactions. As described in detail in this article, we can now model complete glycoproteins by a procedure which includes data from: (a) X-ray crystallographic and n.m.r. studies of proteins, polypeptides and oligosaccharide-protein core regions; (b) molecular dynamics approaches to flexible backbone regions of oligosaccharide chains; and (c) structural and conformational analysis of globular oligosaccharide determinants at the non-reducing end of chains.

Abbreviations used: HOHAHA, homonuclear Hartmann-Hahn spectroscopy; n.O.e., nuclear Overhauser enhancement; ROESY, rotating-frame n.O.e. spectroscopy.

1992

Biochemical Society Transactions


First, the structures of oligosaccharide cores, backbones and peripheral regions will be discussed. At the structurally least complex end of the scale of protein glycosylation are the monosaccharide substitutions, such as the single N-acetylglucosamine residue (GlcNAc) linked through the hydroxyl group of Ser/Thr (O-linked) of cytoplasmic proteins (reviewed in the contribution by G. Hart in this Colloquium proceedings, pp. 264-260). The possibility of multiple GlcNAc additions on Ser/Thr residues in close proximity on the protein, the possibility of joint recognition of carbohydrate-protein conformations, the possibility of reciprocal phosphorylation/glycosylation and the location of glucosaminylation on important nuclear membrane (pp. 264-260) and cytoskeletal [1] proteins suggest a specific structure/function role in cell regulation.

O-linked chains classically have a GalNAcα-Ser/Thr linkage and the chains are extended by the additions of ±NeuAcα2-3Galβ1-3 and ±NeuAcα2-6 to the GalNAc. These are often found clustered on adjacent Ser or Thr residues and form stereochemically relatively rigid carbohydrate-protein determinants which, for example, can be recognized as joint antigenic epitopes (e.g. blood group M and N antigens [2] and the oncodevelopmental antigen of fibronectin [3]) or provide conformational rigidity to glycoprotein receptors (e.g. receptor for low-density lipoprotein [4], nerve growth factor [5] and interleukin-2 (IL-2) [6]). In addition to the Galβ1-3GalNAcα1-Ser/Thr core region, an array of different substitutions of O-linked GalNAc are now known [7]. These cores are often carriers of quite long backbone regions of repeating (→3/6Galβ1→3/4GlcNAcβ1→)n sequences. Additional terminal substitution with sialic acid, fucose, GalNAcα, Galα or GalNAcα is common. These backbone and peripheral sequences are the same as those found on the N-linked chain cores linked through GlcNAcβ1-2, 4 or 6 to the (Manα)s of

Manα1→6(Manα1→3)Manβ1→4GlcNAcβ1→4GlcNAcβ1-Asn

The backbone and peripheral sequences on both types of core have been characterized not only at a structural level but also initially as differentiation antigens and now as molecules with functions as address codes (see contribution of T. Feizi, pp. 274-270 this volume). Because oligosaccharides can be built up with branching (i.e. more than two hydroxyl groups around the ring may be linked),

Volume 20

which imposes stereochemical controls on the degrees of conformational freedom, sequences at the tri- to heptasaccharide level can form distinct determinants independent of the remainder of the chain and the underlying protein. Thus n.m.r. analysis of the structures of interest in immunological and biological recognition can be performed on free oligosaccharides representing these terminal sequences, which can be obtained from different sources [8]. The conformational information obtained can then be used to design therapeutic analogues or to build up a picture of completely glycosylated proteins as outlined below.

N.m.r. analysis of oligosaccharides, glycopeptides and polypeptides

Although some information on glycoprotein structure can be obtained from X-ray crystallographic studies of that glycoprotein, or of homologous proteins for which X-ray crystallographic data are available [9], in general glycoproteins are proving hard to crystallize. The heterogeneity and relative flexibility of oligosaccharides precludes their visualization by X-ray, and the databank of largely non-glycosylated proteins providing the algorithms for protein structure prediction may not be applicable. Data from n.m.r. analysis are therefore essential for conformational studies. Information from n.m.r. analysis is being obtained from isolated oligosaccharides and glycopeptides together with data from chemically synthesized glycopeptides and peptides using one-dimensional, two-dimensional and three-dimensional techniques. It has been shown by extensive analysis of ¹H-n.m.r. data of oligosaccharides reported in the literature [10] that the chemical shifts of sequences in different oligosaccharides can be compared to within ±0.03 p.p.m. to give unique identification of oligosaccharides. A databank has been set up which can be interrogated to within ±0.03 p.p.m. to provide computer-assisted interpretation of n.m.r. spectra. Chemical-shift differences outside this error margin are indicative of conformational effects caused by the close proximity through space of different topographical features. This information can therefore be included in computer-graphics molecular modelling of oligosaccharides. Table 1 shows the data entry for sialylated and fucosylated oligosaccharide motifs including structures discussed in the subsequent chapters of this Colloquium as determinants of molecular recognition. The following examples show the type of conformational information which can be obtained.


Fucosylation produces a large 'knock-on effect', causing differences in chemical shifts of substituents on non-adjacent monosaccharides, thus necessitating often quite long sequences to be included in the database given to ±0.03 p.p.m. (Table 1). This shows that fucose residues often interact through space, giving rise to globular-type epitopes. A detailed n.m.r. analysis of the conformation of fucosylated oligosaccharides having the blood group H type-1, Leᵃ, Leˣ, Leᵇ and the 3-fucosyllactose/3-fucosyllactitol sequence has been discussed previously [8, 11]. The blood group A sequence linked to GalNAcol has now been extensively studied as a component of bovine submaxillary mucin sialylated at C6 [11a]. The chemical shifts for NeuAcα2-6 and NeuGcα2-6 linked to GalNAcol are largely independent of the substituent on C3, e.g. monosaccharide-to-tetrasaccharide linked through GlcNAcβ1-3GalNAcol [11a], suggesting that the chains on C6 and C3 do not interact significantly through the C3-C5 face of sialic acid. The chemical shifts for sialic acid linked to Gal and GlcNAc are largely independent of the adjacent monosaccharide along the chain. For example, the chemical shifts shown for NeuAc linked 2→6 to GlcNAc and 2→3 to Gal are within ±0.03 p.p.m. in the sequences NeuAcα2→6GlcNAcβ1→3Galβ1→, NeuAcα2→3Galβ1→3GlcNAc, NeuAcα2→3Galβ1→3[Fucα1→4]GlcNAcβ1→3Galβ1→ [12] and NeuAcα2→3Galβ1→3[NeuAcα2→6]GlcNAcβ1→3Galβ1→, suggesting that neither the sialic acid-fucose nor the sialic acid-sialic acid residues interact through space. Similarly the chemical shifts of Fucα1→4 in the above sequence are within ±0.03 p.p.m. of the same fucose in the non-sialylated Leᵃ structure, necessitating only one data entry (Table 1). However, the chemical shifts of NeuAc linked 2→3 to Gal having a GalNAcβ linked 1→4 are significantly changed, suggesting an anisotropic effect of the close proximity of these two residues and necessitating a new entry in the databank.

Complementary information on the relative proximity of particular atoms through space is obtained by n.m.r. nuclear Overhauser enhancement (n.O.e.) studies carried out either by two two-dimensional homonuclear Hartmann-Hahn spectroscopy (HOHAHA) and rotating-frame n.O.e. spectroscopy (ROESY) experiments or a three-dimensional HOHAHA/ROESY experiment (E. F. Hounsell, C. J. Rauer, T. A. Frenkiel & J. Feeney, unpublished work). The qualitative information obtained about nearest neighbours through space is incorporated into the computer-graphics molecular modelling.
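The databank interrogation described in this section can be illustrated with a minimal sketch. It assumes a hypothetical in-memory dictionary rather than the authors' actual databank, holds a handful of fucose entries copied from Table 1 (three reporter-group shifts per sequence, in the column order of the table), and treats a sequence as a match only when every shift agrees to within ±0.03 p.p.m. The names DATABANK and match_sequences are illustrative assumptions, not part of any published software.

```python
# Hypothetical mini-databank: fucose reporter-group shifts (p.p.m.) taken
# from Table 1, stored in the same column order as the table.
TOLERANCE_PPM = 0.03  # cross-oligosaccharide tolerance quoted in the text

DATABANK = {
    "Fucα1→2Galβ1→4": (5.31, 4.25, 1.23),
    "Fucα1→2Galβ1→3": (5.19, 4.28, 1.23),
    "Fucα1→3[Galβ1→4]GlcNAcβ1→": (5.11, 4.86, 1.17),
    "Fucα1→4[Galβ1→3]GlcNAcβ1→": (5.02, 4.88, 1.20),
    "Fucα1→6[GlcNAcβ1→4]GlcNAc": (4.89, 4.10, 1.21),
}


def match_sequences(observed, databank=DATABANK, tol=TOLERANCE_PPM):
    """Return the sequences whose every reporter shift is within tol of observed."""
    hits = []
    for sequence, reference in databank.items():
        if len(reference) != len(observed):
            continue
        # A tiny epsilon guards against floating-point noise at the boundary.
        if all(abs(o - r) <= tol + 1e-9 for o, r in zip(observed, reference)):
            hits.append(sequence)
    return hits


if __name__ == "__main__":
    print(match_sequences((5.30, 4.26, 1.22)))
    print(match_sequences((5.60, 4.50, 1.40)))
```

An empty result corresponds to the situation described in the text, in which shifts fall outside the error margin and either a conformational effect or a new databank entry has to be considered.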


Setting up of commercial computer graphics packages for oligosaccharide and glycopeptide modelling

Commercially available software packages (for historical/developmental reasons) are primarily set up for protein manipulation. This is useful in part for glycoprotein studies, but the modelling of the oligosaccharide moieties is as yet poorly supported. To take full advantage of the functionality of these programs, ways of incorporating relevant carbohydrate modelling features are being investigated. Examples described here have been derived using a Silicon Graphics Personal Iris 25GT Workstation, with Biosym Molecular Graphics Package installed (Insight/Discover). This software offers no predefined sugar structures in its library (N.B. other software with structures included may need checking for inaccuracies). Basic structures can be built from a standard icon library of common hydrocarbons and groups (e.g. by starting from a cyclohexane ring, substituting a ring carbon for an oxygen, and replacing relevant hydrogen atoms with hydroxyl groups), but ring numbering, geometry, potential atom types and charges require modification. Structures can be edited on-screen for many of these features and the product can be stored for reuse either as User Icons, or archived structures.

Monosaccharides

Sugar ring structures are affected by forces resulting from the presence of oxygen atoms in the pyranose ring and glycosidic linkage and therefore suitable parameters to accommodate these effects are required for oligosaccharide and glycopeptide modelling. Biosym's CVFF forcefield gives automatic parameter assignments to molecules but has no carbohydrate-specific parameters. To best model both oligosaccharide and amino acid sequences we have chosen the AMBER forcefield, which is provided in a format suitable for use with Discover. However, the automated features are not supported for it and therefore either on-screen modification of parameters is required, or alternatively the structure text files can be edited. Potential atom types for AMBER, developed for unsubstituted pyranose sugars [13] including charges [14], can be incorporated into the Biosym system. Additional groups (e.g. acetamido deoxy, carboxyl) are derived from AMBER forcefield parameters relating to proteins and nucleic acid structure libraries. Monosaccharides are constructed from crystal structure dimensions and either fully minimized


Table 1

Chemical shifts for structural reporter groups of fucose and sialic acid allowing for computer-assisted interpretation of ¹H-n.m.r. data and for distinguishing commonly occurring fucosylated and sialylated sequences

Only a limited number of proton signals are included which are the 'structural reporter groups' easily assigned in the spectrum. Analysis of all the signals allows for additional conformational information to be obtained.

Fucose reporter groups (±0.03 p.p.m.)ᵃ

Fucose sequence                              H1      H5      CH3
Fucα1→2Galβ1→4                               5.31    4.25    1.23
Fucα1→2Galβ1→3                               5.19    4.28    1.23
Fucα1→2[Galα1→3]Galβ1→                       5.35    4.42    1.22
Fucα1→2[GalNAcα1→3]Galβ1→3GlcNAcβ1→          5.26    4.33    1.25
Fucα1→2[GalNAcα1→3]Galβ1→4GlcNAcβ1→          5.35    4.32    1.25
Fucα1→2[GalNAcα1→3]Galβ1→3GalNAcol           5.38    4.33    1.23
Fucα1→3[Galβ1→4]GlcNAcβ1→                    5.11    4.86    1.17
Fucα1→4[Galβ1→3]GlcNAcβ1→                    5.02    4.88    1.20
Fucα1→3[Galβ1→4]Glc                          5.40    4.83    1.16
Fucα1→3[Galβ1→4]Glcol                        5.08    4.27    1.22
Fucα1→2Galβ1→4[Fucα1→3]Glcol                 5.42    4.20    1.23
Fucα1→3[Fucα1→2Galβ1→4]Glcol                 5.07    4.20    1.23
Fucα1→2Galβ1→3[Fucα1→4]GlcNAcβ1→             5.15    4.34    1.26
Fucα1→4[Fucα1→2Galβ1→3]GlcNAcβ1→             5.03    4.13    1.27
Fucα1→6[GlcNAcβ1→4]GlcNAc or GlcNAcol        4.89    4.10    1.21

Sialic acid reporter groups

Sialic acid sequence                 H3ax (±0.03 p.p.m.)ᵃ   H3eq (±0.03 p.p.m.)   NAc/NGc (±0.005 p.p.m.)
NeuAcα2→6GalNAcol                    1.60                   2.72                  2.032
NeuGcα2→6GalNAcol                    1.71                   2.75                  4.123
NeuAcα2→6GlcNAcβ1→                   1.70                   2.74                  2.031
NeuAcα2→6Galβ1→ or GalNAcβ1→         1.71                   2.67                  2.030
NeuGcα2→6Galβ1→                      1.73                   2.68                  4.118
NeuAcα2→3Galβ1→                      1.80                   2.76                  2.031
NeuGcα2→3Galβ1→                      1.82                   2.79                  4.122
NeuAcα2→3[GalNAcβ1→4]Galβ1→          1.93                   2.66                  2.031

ᵃp.p.m. at 295 K in ²H₂O pH 7.0 with reference to acetone at 2.225 p.p.m. from 4,4-dimethyl-4-silapentane-1-sulphonate. For the same oligosaccharide analysed the accuracy for all chemical shifts is ±0.005 p.p.m., but when comparing sequences from different oligosaccharides the value ±0.03 p.p.m. allows for distinct identity while restricting the number of lines in the database.

using the appropriate forcefield or manipulated to align groups for most suitable linkage orientations.

Disaccharides

Atom potential types and charges need reassigning after linkage. This may be done automatically if the default CVFF forcefield is being used, but for AMBER, editing is necessary. In many cases, glycosidic torsion angles need modifying to remove unfavourable atomic interactions before minimization and other procedures. φH (Hx-Cx-O-Cx′), ψH (Cx-O-Cx′-Hx′) = 0°, 0° is a reasonable starting angle, or alternatively a known minimum energy angle (e.g. from HSEA calculations [15]) can be used. It would be preferable to set the glycosidic τ angle (116.9°) implicitly, but in Insight this is controlled by the forcefield alone. Structures created from defined glycosidic torsion angles need minimizing to relieve interactions across the linkage. The two torsions, φ and ψ, involved in the glycosidic linkage are both potentially flexible by rotation, each by 360°. Some


regions of this rotational space are ruled out by steric interactions. These in turn, however, can give rise to multiple minima of different energy potentials, separated by energy barriers. One or more of these minimum energy wells may be accessible and preferred by the molecule under given conditions (temperature, environment, etc.). The three-dimensional space around the glycosidic torsions is explored using the following strategy. A structure is created for every combination of φ and ψ angle from -180° to +180°. A force is applied to maintain the torsion angle and the rest of the structure is minimized to a root mean square value less than 0.1 kcal/Å. The energies of the resulting structures are then pooled for all angle combinations to produce an energy contour plot over φ/ψ space. If intervals of the order of 30° are used, the minimum energy structure can be further refined by minimization or molecular dynamics, after the forcing term has been removed. Basic energy contour plots can be obtained in a few hours for a disaccharide simulation in vacuo. The normal solution environment can be simulated by changing the dielectric constant ε from 1.0 (vacuum) to 80 (water), or by use of a distance-dependent dielectric (ε increasing with increasing distance between atoms), to mimic the screening effect of solvent. Inclusion of water molecules explicitly is also possible as discussed below.
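The grid strategy described above can be sketched in a few lines. This is only an illustration under stated assumptions: the restrained minimization performed by the modelling package at each grid point is replaced here by direct evaluation of a hypothetical single-well function, toy_energy, which is not a forcefield and has an arbitrarily chosen minimum. The function names and the 30° step are illustrative.

```python
import math


def toy_energy(phi_deg, psi_deg, phi0=60.0, psi0=-30.0):
    """Hypothetical smooth surface (arbitrary units) with one well at (phi0, psi0).

    Stands in for the energy a restrained minimization would return; it is
    NOT a forcefield and the well position is an arbitrary assumption.
    """
    return (2.0
            - math.cos(math.radians(phi_deg - phi0))
            - math.cos(math.radians(psi_deg - psi0)))


def scan_phi_psi(energy, step_deg=30):
    """Evaluate energy at every (phi, psi) combination from -180 to +180 degrees."""
    angles = range(-180, 181, step_deg)
    return {(phi, psi): energy(phi, psi) for phi in angles for psi in angles}


def lowest_point(grid):
    """Return the (phi, psi) grid point with the lowest pooled energy."""
    return min(grid, key=grid.get)


if __name__ == "__main__":
    grid = scan_phi_psi(toy_energy)
    print(len(grid), lowest_point(grid))
```

The dictionary returned by scan_phi_psi holds the pooled energies from which a contour plot could be drawn, and the point returned by lowest_point would be the starting structure for further refinement once the forcing term is removed.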

Molecular dynamics studies of oligosaccharides, polypeptides and glycoproteins

Within the molecular dynamics software Newton's equations of motion are used to investigate the movement of molecules under constant pressure and definable temperatures over very short time scales (ps). Water can be explicitly included as the solvent for the simulation by defining periodic boundary conditions (this is comparable with a crystal environment, except that adjacent molecules have no contact with each other and are therefore free to move). A box is defined from the maximum extents of the molecule in each of the three dimensions, increasing these by 2 Å on each side of the molecule, and positioning the molecule centrally within it. The solvent space is then filled with water molecules and subjected to minimization to relax the system.

Objectives

This allows a number of studies to be undertaken. Some of the main objectives are as follows.

(1) To simulate molecular movement at a constant temperature to observe conformational changes over a given time scale.
(2) To search for minimum energy configurations, e.g. by simulated annealing, as an alternative to the φ/ψ energy maps described above.
(3) To investigate linkages with more than two rotational bonds (e.g. 1→6) and molecules larger than disaccharides.
(4) The use of the molecular models for comparison with experimental data, and their refinement by incorporating additional information (e.g. n.m.r. n.O.e. distance constraints).
(5) The study of glycopeptide structures within a single forcefield system.
(6) The studies of polypeptides for which no secondary structure is strongly predicted from the commonly available graphics algorithms [9].
(7) The energy minimization and simulated annealing of glycoproteins after building them from modular protein, glycopeptide and oligosaccharide components.

1. King, I. A. & Hounsell, E. F. (1990) J. Biol. Chem. 264, 14022-14028
2. Welsh, E. J., Thom, D., Morris, E. R. & Rees, D. A. (1985) Biopolymers 24, 2301-2332
3. Matsuura, H., Green, T. & Hakomori, S. (1989) J. Biol. Chem. 264, 10472-10476
4. Yamamoto, T., Davis, C. G., Brown, M. S., Schneider, W. J., Casey, M. L., Goldstein, J. L. & Russell, D. W. (1984) Cell (Cambridge, Mass.) 39, 27-38
5. Johnson, D., Lanahan, A., Buck, C. R., Sehgal, A., Morgan, C., Mercer, E., Bothwell, M. & Chao, M. (1986) Cell (Cambridge, Mass.) 47, 545-554
6. Nikaido, T., Shimizu, A., Ishida, N., Sabe, H., Teshigawara, K., Meade, M., Uchiyama, T., Yodoi, J. & Honjo, T. (1984) Nature (London) 311, 631-635
7. Hounsell, E. F., Lawson, A. M., Stoll, M. S., Kane, D. P., Cashmore, G. C., Carruthers, R. A., Feeney, J. & Feizi, T. (1989) Eur. J. Biochem. 186, 597-610
8. Hounsell, E. F. (1987) Chem. Soc. Rev. 16, 161-185
9. Hounsell, E. F., Renouf, D. V., Liney, D., Dalgleish, A. & Habeshaw, J. (1991) Mol. Aspects Med. 12, 283-290
10. Hounsell, E. F. & Wright, D. J. (1990) Carbohydr. Res. 205, 19-20
11. Hounsell, E. F., Jones, N. J., Gooi, H. C., Feizi, T., Donald, A. S. R. & Feeney, J. (1988) Carbohydr. Res. 178, 67-78
11a. Chai, W., Hounsell, E. F., Cashmore, G. C., Rosankiewicz, J. R., Feeney, J. & Lawson, A. M. (1992) Eur. J. Biochem. in the press
12. Breg, J., Romijn, D., Vliegenthart, J. F. G., Strecker, G. & Montreuil, J. (1988) Carbohydr. Res. 183, 19-34
13. Homans, S. (1990) Biochemistry 29, 9110-9118
14. Ha, S. N., Giamona, A., Field, M. & Brady, J. W. (1988) Carbohydr. Res. 180, 207-221
15. Lemieux, R. U. & Bock, K. (1983) Arch. Biochem. Biophys. 221, 125-134

Received 18 December 1991
