© 1991 Oxford University Press

Nucleic Acids Research, Vol. 19, No. 24 6771-6779

Solution conformation of an oligonucleotide containing a G.G mismatch determined by nuclear magnetic resonance and molecular mechanics

J.A.H.Cognet, J.Gabarro-Arpa, M.Le Bret, G.A.van der Marel1, J.H.van Boom1 and G.V.Fazakerley2*

Laboratoire de Physico-chimie Macromoléculaire, Institut Gustave Roussy, F-94800 Villejuif, France, 1Gorlaeus Laboratoria, University of Leiden, 2300A-Leiden, The Netherlands and 2Service de Biochimie et de Génétique Moléculaire, Bât. 142, Département de Biologie Cellulaire et Moléculaire, Centre d'Etudes Nucléaires de Saclay, 91191 Gif-sur-Yvette Cedex, France

Received October 7, 1991; Revised and Accepted November 27, 1991

* To whom correspondence should be addressed

ABSTRACT

We have determined by two-dimensional nuclear magnetic resonance studies and molecular mechanics calculations the three dimensional solution structure of the non-selfcomplementary oligonucleotide d(GAGGAGGCACG).d(CGTGCGTCCTC), in which the central base pair is G.G. This is the first structural determination of a G.G mismatch in an oligonucleotide. Two dimensional nuclear magnetic resonance spectra show that the bases of the mismatched pair are stacked into the helix and that the helix adopts a classical B-DNA form. Spectra of the exchangeable protons show that the two guanosines are base paired via their imino protons. For the non-exchangeable protons and for some of the exchangeable protons nuclear Overhauser enhancement build up curves at short mixing times have been measured. These give 84 proton-proton distances which are sensitive to the helix conformation. One of the guanosines adopts a normal anti conformation while the other is syn or close to syn. All non-terminal sugars are C2' endo. These data sets were incorporated into the refinement of the oligonucleotide structure by molecular mechanics calculations. The G.G mismatch shows a symmetrical base pairing structure. Although the mismatch is very bulky, many of its features are close to those of normal B-DNA. The mismatch induces a small lateral shift in the helix axis and the sum of the helical twist above and below the mismatch is close to that of B-DNA.

INTRODUCTION

DNA base pair mismatches can occur in vivo as a consequence of (a) replication errors, (b) heteroduplex formation in the course of genetic recombination between homologous, but not identical sequences and (c) deamination of 5-methyl cytosine giving rise to a G.T mismatch (1). Replication errors are corrected by the localized excision of the newly synthesized strand and resynthesis using the parental strand as a template (1-4). The repair of all possible mismatches between two strands has been tested in heteroduplex DNAs of phage λ transfecting E. coli (2). Transition mismatches (G.T and A.C) (2,3,5) are well repaired. Three of the transversion mismatches (G.A, C.C and C.T) are not well repaired (2,5,6). Furthermore, repair for G.A and C.T mismatches depends upon the base sequence around the mismatch site (7). We previously reported (8) that a G.A mismatch which is in a highly A.T rich sequence and is unrepaired in vitro, as tested with the E. coli mismatch repair system, deviates strongly from a B-DNA structure and forms a double loop around the mismatch site. The same mismatch, but in a G.C rich sequence, formed an intrahelical wobble pair. To investigate further the relationship between the recognition and/or repair of DNA mismatches and DNA conformation, we are studying the conformation of oligonucleotides containing mismatches corresponding to position 45 in the cI gene of phage λ. At this position the G.G mismatch is well repaired whereas the C.C mismatch is poorly repaired (8). Here we present NMR data and molecular mechanics studies on the G.G mismatch.

MATERIALS AND METHODS

The undecanucleotides were synthesized by a classical phosphotriester method (9,10). The pair of oligonucleotides was heated to 80°C followed by slow cooling to form the duplex,

5'd(G1 A2 G3 G4 A5 G6 G7 C8 A9 C10 G11)
3'd(C22 T21 C20 C19 T18 G17 C16 G15 T14 G13 C12)

The phosphate backbone is numbered correspondingly, G1P1A2P2...

NMR Spectra. The duplex, at 4 mM strand concentration, was dissolved in 10 mM phosphate buffer, pH 7.4 (unless otherwise stated), 150 mM NaCl and 0.2 mM EDTA. Chemical shifts were measured relative to the internal reference tetramethylammonium chloride, 3.18 ppm. NMR spectra were recorded on either WM500 or AM600 spectrometers. NOESY spectra were recorded with mixing times of 50, 60, 70, 80 and 400 ms in 2H2O and 250 ms in H2O, in the phase sensitive mode (11). Typically 256 t1 increments were recorded. After zero filling the data were multiplied by a 5-15 degree shifted sine bell function in both dimensions except for the short mixing time experiments. For these the data were multiplied by a sine bell shifted by π/2 prior to Fourier transformation. Distance determination from the initial NOE build up curves was done as described previously (12). This method takes into account the observed differences in effective correlation time for different classes of interproton vectors. This phenomenon has recently been treated theoretically (13 and references therein). For both 1-dimensional and 2-dimensional spectra in H2O the observation pulse was replaced by a jump and return sequence (14) with the pulse maximum placed at 15 ppm. 1-dimensional NOE build up curves for certain interactions involving well resolved imino proton resonances were measured. Presaturation times of 0.1, 0.2, 0.3, 0.4 and 0.5 seconds were used. The TOCSY spectrum was recorded in the phase sensitive mode (15) with a 25 ms mixing time.

Modelling and Molecular Mechanics Calculations. Our modelling method was guided by two main concerns. Firstly, to find what specific geometrical changes are required to transform a classical B-DNA structure (16) into one or several molecular models that satisfy the following two criteria: 1) a qualitative fit to the NMR data concerning the observed base pairing and sugar puckering and a quantitative fit with the NMR distance measurements. 2) to be structurally stable in a mathematical sense, that is, to retain the principal structural elements even after application of a small strain or deformation. Here, we checked that the original hydrogen bonding of the central G.G base pair was preserved upon a change of the helical twist above and below this base pair.

Structural stability implies here deep and wide potential wells for the parameters of deformation. These two criteria define whether the model is satisfactory or not. Secondly, to describe simply and reproducibly the nature of these geometrical changes. Consequently our modelling strategy consisted of making many different molecular models from classical B-DNA. The preliminary molecular models were as close as possible to classical B-DNA and included only the changes necessary to constitute a given pairing scheme. These models were then energy refined according to different protocols: directly without any constraint, with sugar pucker constraints or with hydrogen bond constraints followed by minimization without constraints. These models were constructed, displayed and analysed with the programs MORCAD (Molecular Object oRiented Computer Design) (17) and OCL (Object Command Language) (18) on a Silicon Graphics Iris 4D70GT. In particular these models were analysed for their torsion angles, wedge parameters (helical twist, propeller twist, buckle etc.), mechanical energies, hydrogen bonds and their fit to the NMR data. Energy minimizations were carried out using the program AMBER (19-21) on a VAX 8600 computer. The parameters used were as described elsewhere (20,21). All hydrogen atoms are treated explicitly. To simulate the screening effect of the solvent, a gas phase potential was employed where the dielectric constant Dij is proportional to the distance dij separating a pair of atoms: Dij = C·dij (22,23). C was taken as 4 Å⁻¹. All atom pairs were included in the calculations of non-bonding interactions. In this work, the minimizations were run with the 1-4 interatomic interactions divided by two in agreement with other workers (23). Energy refinements were terminated when the root mean square of the energy gradient was less than 0.1 kcal/Å. In the early stages of the molecular modelling we attempted to energy refine our models with NMR distance constraints.
Proton-proton distances, dij, were forced to their NMR determined distances, rij, by addition of E = Σ k (rij - dij)² to the energy function of AMBER; the penalty constant, k, was set to 500 kcal/(mol·Å²). We have previously applied this method with success to a nonanucleotide duplex containing an abasic site (12). However, here, this method failed. After relaxation, that is after energy refinements without constraints, it yielded structures with a poor fit to the NMR data and structurally unstable models. Thus the molecular modelling of this system was based solely on building different starting models without forcing them with NMR distance constraints. Constraints were used only to maintain temporarily a base pairing scheme by reinforcing hydrogen bonding by addition of an energy term as described above or similarly to force a particular sugar pucker.

Figure 1. Part of the 400 millisecond NOESY spectrum. The region corresponds to interactions between base protons and the H1'/H5 protons. Cross peaks marked with an X correspond to CH6-CH5 cross peaks. a) the chain of connectivities for the first strand and b) for the second strand.
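The distance restraint term described above is a plain harmonic penalty summed over the restrained proton pairs. A minimal sketch in Python, assuming atoms are stored in a dictionary of Cartesian coordinates in Å; the function name, the atom labels and the data layout are illustrative assumptions and are not taken from AMBER:

```python
import math

# Penalty constant quoted in the text, kcal/(mol·Å²).
K_DISTANCE = 500.0


def distance_restraint_energy(coords, restraints, k=K_DISTANCE):
    """Return E = sum of k * (r_nmr - d_model)**2 over restrained proton pairs.

    coords     : dict mapping a (hypothetical) atom label to an (x, y, z) tuple in Å
    restraints : iterable of (label_i, label_j, r_nmr) with r_nmr in Å
    """
    total = 0.0
    for label_i, label_j, r_nmr in restraints:
        # d_model is the current interproton distance in the model.
        d_model = math.dist(coords[label_i], coords[label_j])
        total += k * (r_nmr - d_model) ** 2
    return total


if __name__ == "__main__":
    # Hypothetical two-proton example: the model distance is 3.0 Å,
    # the NMR target is 2.8 Å.
    coords = {"G17:H8": (0.0, 0.0, 0.0), "G17:H1'": (3.0, 0.0, 0.0)}
    restraints = [("G17:H8", "G17:H1'", 2.8)]
    print(distance_restraint_energy(coords, restraints))
```

With k = 500 kcal/(mol·Å²), a single pair that is 0.2 Å away from its target already costs about 20 kcal/mol, which gives a feeling for how strongly such a term can drive the structure during refinement.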

In order to change the sugar ring puckers it was only necessary to force the angle δ (C5'-C4'-C3'-O3') to δ0 (δ0 = 82° for the C3' endo and δ0 = 144° for the C2' endo conformation) (22). The torsion angle δ was forced to δ0 by addition of E = Σ k (δ - δ0)² to the energy function of AMBER. The constant k was set to 900 kcal/(mol·rad²).

Comparison of the NMR data and model analysis. The value F = Σ (rij - dij)²/dij, where rij is the distance obtained by NMR and dij the distance in a proposed structure, is used to give an estimate of the overall agreement between the computed structure and the NMR distance measurements.
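The sugar pucker restraint and the fit criterion F defined above are both simple sums and can be sketched as follows. This is an illustrative sketch only: the function names are assumptions, angles are passed in degrees and converted to radians because k is quoted per rad², and the percentage difference follows the (model-NMR)/model convention given in the caption of Table 1.

```python
import math

K_TORSION = 900.0        # kcal/(mol·rad²), constant quoted in the text
DELTA_C2_ENDO = 144.0    # target δ in degrees for C2' endo
DELTA_C3_ENDO = 82.0     # target δ in degrees for C3' endo


def torsion_restraint_energy(deltas_deg, target_deg=DELTA_C2_ENDO, k=K_TORSION):
    """E = sum of k * (δ - δ0)**2, with the angular difference taken in radians."""
    total = 0.0
    for delta in deltas_deg:
        # Wrap the difference into [-180, 180) so that angles close to
        # the ±180° boundary are compared correctly.
        diff = (delta - target_deg + 180.0) % 360.0 - 180.0
        total += k * math.radians(diff) ** 2
    return total


def fit_criterion(pairs):
    """F = sum of (r - d)**2 / d for (r_nmr, d_model) pairs, distances in Å."""
    return sum((r_nmr - d_model) ** 2 / d_model for r_nmr, d_model in pairs)


def percent_difference(r_nmr, d_model):
    """Percentage difference (model - NMR) / model."""
    return 100.0 * (d_model - r_nmr) / d_model


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(torsion_restraint_energy([139.0, 146.0, 141.0]))
    print(fit_criterion([(2.8, 3.1), (3.5, 3.4)]))
    print(percent_difference(2.8, 3.1))
```

Because each term of F is divided by the model distance, a given absolute error weighs more heavily on short distances than on long ones.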


RESULTS


NOESY spectra in 2H2O: assignment of non-exchangeable protons

In order to assign the base and sugar proton resonances of the duplex we first recorded a NOESY spectrum at 15°C with a 400 millisecond mixing time. The region of this spectrum corresponding to interactions between the base H6/H8 protons and the H1'/H5 protons is shown in Figures 1a,b. For the four base pairs at each end of the sequence the observed connectivities are those of a typical B-DNA. In Figure 1a the G7H8 proton shows an interresidue cross peak at 5.36 ppm which must be with the G6H1' proton. At lower contour plot levels a weak intraresidue cross peak is observed for G6H8-H1'. For the other strand, Figure 1b, a chain break is observed. We are unable to detect a cross peak between G17H8 and C16H1'. Interbase cross peaks are labelled A-E in Figures 1a,b. All the H2 resonances can be assigned from their cross peaks with H1' protons, peaks F-K. The C20H6-H1' intraresidue cross peak appears to be weak. This is an artefact caused by the strong negative wings on the terminal C12H6-H5 cross peak resulting from the sine bell filtering. In Figure 2 all inter and intraresidue cross peaks are observed for the first strand, showing that G6 lies within the helix and without any major conformational change relative to that of Watson-Crick geometry. For the second strand the intraresidue cross peak of G17H8 with the coincident H2'/H2" protons is very weak and interresidue interactions with the C16H2'/H2" protons are absent. Interbase cross peaks are labelled A and B. Significantly, no G17H8-T18CH3 interaction is observed. The interactions observed in Figures 1 and 2 suggest that the major structural change occurs for G17 rather than G6. Nevertheless, from the observed T18H6-G17H2'/H2" cross peak G17 must lie within the helix. In the 1-dimensional spectrum we observe that the H6/H8 protons of the bases on either side of the mismatch, C16 and T18, give resonances ca. 5 Hz broader than base protons well removed from the mismatch site. As measured from the NOESY cross peaks the H8 resonances of G6 and G17 are even broader. It is clear that the presence of this purine-purine mismatch produces a degeneracy around the mismatch site and also in it. Qualitatively we observe that the two G residues are not conformationally equivalent. We might thus expect another species in which the conformations of G6 and G17 are reversed, but we observe no evidence for this. The broadening may be due to an equilibrium for the mismatched pair involving only minor conformational changes. A NOESY spectrum recorded at 25°C shows that the base proton resonances of G6 and G17 become sharper as the temperature is raised but a number of cross peaks become much weaker.

Figure 2. Region of the 400 millisecond NOESY spectrum showing interactions between base protons and the H2'/H2"/CH3 protons. Intraresidue NOEs are linked by solid lines.

Table 1. Interproton distances (Å) determined by NOE measurements and molecular mechanics calculations. The first entry is from the NMR data and the second is the percentage difference between the model distance and the NMR data, (model-NMR)/model. The overall NMR fit criterion for these 84 distances: F = 1.9. The upper and lower bounds for the NMR derived distances are taken as ±15% of the tabulated values. Columns: H6/H8 of base i to H1' intra (i), H1' inter (i-1), H2' intra (i), H2" inter (i-1), CH3 inter (i+1) and H5 inter (i+1), for residues G1 to G11 and G13 to C22. Imino proton entries involve G4(H1), G6(H1), G7(H1), G17(H1), T18(H3), A5(H2) and C19(HN4).
a Average <1/r³>^(-1/3) for the coincident H2' and H2" protons.
b Average <1/r⁶>^(-1/6) for the methyl protons.
The strong preference for one species, or a close family of species, may be determined by the fact that G6 lies between two purines while G17 lies between two pyrimidines. It appears that no long range deformation is present and that globally the helix adopts a B form structure as particularly indicated by the normal interbase cross peaks. To obtain the relative assignment of the H2' and H2" resonances and to probe quantitatively the structure of the duplex we have recorded NOESY spectra at short mixing times. The H3' and H4' resonances were assigned from the TOCSY spectrum. From examination of the H6/H8-sugar proton NOEs we find that for all non-terminal residues the predominant sugar pucker is C2'-endo (25). We have measured the NOE build up rates and determined interproton distances as previously described (12). These are given in Table 1. The derived distances are given for comparison with the model structure. From these distances upper and lower bounds corresponding to ±15% can be set. These generate a distance range of ca. 0.7 Å for short distances and ca. 1.2 Å for long distances. The spectral resolution for a duplex with 22 residues is relatively good and only a few distances could not be determined because of cross peak overlap. For the central part of the duplex the only pair missing is that for the C16H6-H1' intra and interresidue interactions. We observe that the NOE build up rate for the intraresidue interaction G17H8-H1' is ca. 5 times greater than for the average of all other interactions of the same type outside the central region of the duplex. This indicates that the sugar base orientation is not a typical anti one. We calculate a distance of 2.8 Å for this interaction. For a typical syn orientation we would expect a build up rate ca. 10 times faster than the average, corresponding to an interproton distance of 2.5 Å. A distance of 2.8 Å corresponds to a high anti conformation although the difference between this and a syn conformation is minor and within the limits of experimental error. This further indicates that the exchange broadening observed is due to minor conformational changes as we would not expect a syn/anti equilibrium (reversing the conformations of G6 and G17) to be fast at 15°C. A similar series of spectra has been recorded on the parent sequence containing a central G6.C17 base pair (not shown). Except for the central three base pairs the magnitudes of the observed interactions are very similar to those described above, showing that the local structure is very similar.

Figure 3. a) 1-dimensional spectrum of the duplex in 90% H2O at pH 5.5 and 1°C. b) difference spectrum after presaturation for 0.5 seconds of the resonance at 10.50 ppm.

Figure 4. Part of the NOESY spectrum of the duplex in 90% H2O recorded at 1°C with a mixing time of 250 milliseconds. Lower, region for imino-imino interactions and upper, imino-amino, CH5, AH2 interactions. Pairs of amino cross peaks are connected by solid lines.
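The distances quoted in this section for the G17 H8-H1' interaction are consistent with the usual r⁻⁶ dependence of the initial NOE build up rate. The sketch below is a simplified illustration, not the procedure of reference (12), which also corrects for differences in effective correlation time. The reference distance of 3.7 Å for an average anti H8-H1' pair is an assumed value chosen for illustration and is not stated in the text.

```python
# Assumed reference distance (Å) for an average anti H8-H1' pair.
# This value is an illustrative assumption, not taken from the paper.
R_REF_ANTI = 3.7


def distance_from_rate_ratio(rate_ratio, r_ref=R_REF_ANTI):
    """Distance for a pair whose build up rate is rate_ratio times the reference.

    Uses rate proportional to r**-6, so r = r_ref * rate_ratio**(-1/6).
    """
    return r_ref * rate_ratio ** (-1.0 / 6.0)


def distance_bounds(r, tolerance=0.15):
    """Lower and upper bounds at ±tolerance (±15% in the text)."""
    return r * (1.0 - tolerance), r * (1.0 + tolerance)


if __name__ == "__main__":
    for ratio in (5, 10):
        print(ratio, round(distance_from_rate_ratio(ratio), 2))
    for r in (2.3, 4.0):
        low, high = distance_bounds(r)
        print(r, round(high - low, 2))
```

Under these assumptions a five-fold faster build up gives about 2.8 Å and a ten-fold faster build up about 2.5 Å, and the ±15% bounds span about 0.7 Å at 2.3 Å and 1.2 Å at 4.0 Å, in line with the ranges quoted above. The sixth-root dependence also shows why a two-fold uncertainty in rate changes the distance by only about 12%.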

NOESY spectra in H2O: assignment of exchangeable protons

We have recorded 1-dimensional spectra at 1°C as a function of pH from 7.5 to 5.5. Lowering the pH resulted in a narrowing of some of the resonances of the imino protons, indicating slower exchange with solvent, but no evidence for a structural change of the duplex is observed. The imino and aromatic proton regions of the 1-dimensional spectrum recorded at 1°C and pH 5.5 are shown in Figure 3. We observe the 3 A.T imino protons in the region 13.5-14 ppm, the G.C imino protons between 12.5 and 13 ppm and two high field shifted resonances at 10.50 and 11.01 ppm which must arise from the mismatched base pair. Although lowering the pH reduced the linewidths of some of the imino resonances the effect upon the two high field shifted resonances was much less pronounced. The linewidths of the A.T and G.C imino protons are ca. 25 Hz whereas that at 11.01 ppm is 46 Hz and that at 10.50 ppm is 42 Hz. These two high field resonances broaden rapidly on increasing the temperature. Clearly the two high field shifted protons are more exposed to exchange with solvent than those of the normal base pairs. However, as they are observed at pH 7.5, though broader, this indicates that the bases are intra rather than extra-helical. The NOESY spectrum recorded with a mixing time of 250 milliseconds is shown in Figure 4. The chain of imino-imino connectivities can be followed from G1 to G13 although G15 and G7 are virtually coincident. The chain passes through one of the two high field imino resonances. Cross peaks are observed with the resonance at 10.50 ppm but are very weak. As stated above this resonance is very broad and relaxes fast. In order to probe the environment of this proton we have recorded a 1-dimensional difference spectrum for which the proton was presaturated for 0.5 seconds and this is shown in Figure 3. We observe NOEs with the neighbouring imino protons, T18 and G7, and by spin diffusion also to G15.
A significantly larger NOE is observed between the two high field imino protons indicating their close proximity in space. In the aromatic region we observe a rather weak NOE with the A5H2 and NOEs with amino protons. NOE build up curves have been measured by 1-dimensional difference spectra for the three well resolved imino protons, G6, G17 and T18, with presaturation times of 0.1, 0.2, 0.3, 0.4 and 0.5 seconds. The derived distances are shown in Table 1. As these spectra were recorded at low temperature and with presaturation times which are relatively long the results could be influenced by spin diffusion. The inter-proton distances will be less accurate than those obtained between non-exchangeable protons. The interactions observed are nevertheless indicative of close proximity especially as the presence of an efficient spin diffusion pathway would be visible in the difference spectra. The chain of connectivities observed for the imino-imino interactions is confirmed in the region corresponding to imino-amino/H2 interactions. The two high field imino resonances show few NOEs in this region. One of them, that at 11.01 ppm, shows a strong cross peak with the A5H2 resonance. This proton must be on the minor groove side relative to that at 10.50 ppm. Both show cross peaks with resonances at 5.35 and 6.63 ppm, which are broad exchangeable proton resonances. In the 1-dimensional spectrum the resonance at 6.63 ppm is sufficiently well resolved at 12°C to be integrated and it corresponds to two protons. As all the C amino protons have been assigned and no C amino proton is found at this chemical shift this resonance must be assigned to a G amino group which is relatively free to rotate about the C-N bond.

Table 2.

Pairing scheme     F (fit criterion for 56 distances    Distances violating the NMR measurements by
(cf. Figure 5)     in the central heptamer)             20-30%     30-50%     50-60%
A                  2-3                                  5          0          0
B                  6-7                                  3          5          1
C                  9-10                                 8          0          3
D                  7-9                                  6          5          2

Figure 5. The four possible structures for the G.G base pair which have two hydrogen bonds and do not involve a charged residue. The mismatches are viewed from the top of the oligonucleotide and G6 (right) remains in an anti conformation and a position close to that of B-DNA while G17 (left) adopts the major changes. Sugar conformations are restrained to C2'-endo. These pairing structures are shown as observed in an oligonucleotide to show the relative displacement associated with each type of pairing. The arrows point from the helical axis.

Figure 6. Computed total energy (A) of the oligonucleotide containing the mismatch G6.G17 (+) and of the corresponding oligonucleotide containing a central G6.C17 base pair (*) in kcal/mol and (B) NMR distance fit, F, for the 84 distances measured by NMR, computed as a function of the helical twist (in degrees) between the base pairs A5.T18 and G7.C16. Counterions were not included in the calculations.

When the amino group of a G residue is involved in hydrogen bonding, rotation (exchange of one proton in the hydrogen bond for the other) is in the intermediate exchange regime and the resonances are generally not observed (26). When this group is not involved in hydrogen bonding, as in wobble structures (27,28), rotation is faster and the amino group gives rise to a broad resonance for the two protons. We can assign the resonance at 6.63 ppm to the amino group of the G for which the imino proton is at 10.50 ppm. The cross peak with the imino proton at 11.01 ppm arises from spin diffusion. For the amino group of the other G residue of the mismatched pair, as we cannot integrate the resonance at 5.35 ppm, a priori two possibilities exist. Either the resonances at 11.01 and 5.35 ppm are the two protons of the G amino group or that at 11.01 ppm is the imino proton and that at 5.35 ppm the two protons of the amino group. There are several arguments in favour of the latter attribution. Firstly, we would not expect in this mismatch structure that the hydrogen bonding of the amino group would be stronger than in a G.C pair, giving rise to slow rotation on an NMR time scale. Secondly, we would not expect that hydrogen bonding would shift the hydrogen bonded amino proton to 11 ppm, as in G.C pairs they are at 7-8 ppm. Thirdly, we would expect to see the non-hydrogen bonded imino proton in the spectrum, which we do not. In order to verify that the imino proton is not broadened beyond detection we have recorded 1-dimensional spectra down to pH 4 and we do not observe any new resonance in the spectrum. We conclude that the resonance at 5.35 ppm is that of the amino group. Only the T18 imino proton shows inter base pair interactions with the mismatched pair. A weak cross peak is observed to the amino group at 5.35 ppm. This imino proton shows only a weak cross peak with the non-hydrogen bonded A5 amino proton. The chemical shift of the hydrogen bonded proton was found by examination of the amino-amino proton region (not shown). A cross peak is observed between the G11 imino proton at 13.0 ppm and a resonance at 6.7 ppm. The latter may correspond to the G11 amino protons, which, through fraying at the end of the helix, may rotate more rapidly than for non-terminal G.C base pairs (29). All the other intra and inter residue cross peaks observed are typical for B-DNA and are not indicative of any major structural change. Figure 5 shows the only four possible structures which involve two hydrogen bonds and no charged residue. For the mispair the extrahelical structure is excluded as we have shown that the two G residues are stacked into the helix. Qualitatively structure A fits our data. Hydrogen bonding takes place between the G imino protons and the carbonyl groups. This explains the observed chemical shift for the G imino protons, similar to that observed in wobble structures. Both amino groups are free and could rotate although their environment must be quite different to account for the difference in chemical shift. To construct such a structure in a double helix requires that one of the G residues is syn and the other anti, qualitatively in agreement with what we have observed. Structure B also satisfies the requirement for one syn and one anti G but involves hydrogen bonding via one of the amino groups and the presence of a free imino proton. Structure C is similar to structure B in its hydrogen bonding pattern, but has both residues anti. Finally, structure D is held together only by hydrogen bonds of the amino groups. Below we will examine in more detail the fit of these structures to the NMR data but, taking structure A as the most likely one, we can assign the two high field imino protons. From the NOE observed to the A5H2 the imino proton at 11.01 ppm corresponds to the base in an anti conformation, which we know is G6.

Model building and molecular mechanics calculations

In this work we search for the closest structure to the classical B-DNA model (16) (because the NMR data show that globally the duplex adopts a B-DNA form) that contains one of the a priori possible four types of G.G pairing which involve two hydrogen bonds and no charged residue (30,31). The space of parameter variations, or the set of conformations, is the set of deformed B-DNA structures that fit the NMR data and that are structurally stable. We have first explored the simplest deformations which allow the relative positioning of parts of the molecule through translation and rotation in Cartesian space. Once a satisfactory model was obtained, different values for torsion angles were examined and the helical twist above and below the G.G pair was varied. The starting point was a classical B-DNA d(GAGGAGGCACG).d(CGTGCCTCCTC) structure (16). The central C on the second strand was replaced by a guanine. We then moved this residue to one of the four possible types of hydrogen bonding. We proceeded in this way as the NMR data indicate that the guanine on the first strand is very close to a normal B-DNA structure: its position in the helix is not greatly modified relative to a Watson-Crick pair. For the four types of G.G pairing shown in Figure 5, G6 remains in an anti conformation as close as possible to a Watson-Crick conformation while G17 adopts the conformation required to give the hydrogen bonding structure and be fitted into a double helix. The principal features of these models are A, symmetric pairing with G17 syn, B, asymmetric pairing with G17 syn, C, asymmetric pairing with G17 anti and D, symmetric pairing with G17 syn. All sugars were restrained to a C2' endo conformation. These pairing schemes are shown as observed in a relatively fixed oligonucleotide; they show the relative displacement associated with each type of pairing. After energy refinement of these structures, for which the hydrogen bonding scheme is kept fixed, the molecular models are characterized by the NMR data fit criteria as shown in Table 2. Pairing schemes C and D can be readily ruled out. Their fit to the NMR data is very poor. Model B is very easy to construct: upon turning over G17 from anti conformation to syn the sugar moiety stays very close to its normal position in B-DNA (within 1.5 Å if G17 alone is moved). However, the model shows a poor fit with the NMR data and it is qualitatively not in agreement with the hydrogen bonding scheme observed by NMR. Model A fits the NMR data best and was chosen as the starting point for further model building. However, at this stage the model is unstable: the G.G pairing tends to dissociate upon energy refinement. The reasons why this model is unstable are explained by the operations that must be performed to construct the model. Pairing scheme A is symmetric and is very wide as judged by the C1'-C1' distance (ca. 13 Å); furthermore G17 is syn. These features make it difficult to construct a G.G pair that fits well inside a B-DNA helix, i.e. that is well stacked, that minimally distorts the sugar phosphate chains and that remains hydrogen bonded. Upon turning over G17 from the normal anti conformation into the syn conformation of model A, G17 and its sugar moiety are moved away from the minor groove towards the major groove. This transformation implies that G17 is moved towards phosphate P16, compressing the sugar phosphate chain, but also stretching it between G17 and P17. This situation could be improved at the most local level by rotating the G.G pair inside the helix. However, during the course of this rotation G6 has to be moved towards phosphate P5, compressing the sugar phosphate chain and also stretching it between G6 and P6.

Figure 7. Stereoscopic view of a family of representative structures obtained by unrestrained energy minimization of a model structure involving 84 distance constraints. This B-DNA was deformed by the geometric procedures described in the text. It is viewed from the minor groove to show the width of the helix at the G.G pair.

Figure 8. Stereoscopic view of one of the duplexes shown in Figure 7 seen from the top of the oligonucleotide showing the stacking of T18.A5 on G17.G6 (upper) and of G17.G6 on C16.G7 (lower).

Table 3. Torsion angles for the central part of the chain of the oligomer, structure A.

Residue      α       β       γ       δ       ε       ζ       χ
A5          -65    -175     59     139    -160    -154    -107
G6          -70     159     62     146     176     -92    -110
G7          -66     168     72     141    -172    -133    -110
C16         -72     174     65     141     174     173    -110
G17         -78     166     60     150    -177     -70      72
T18         -71     168     66     147     174    -101     -96
average for all other non tabulated nucleotides:
            -70     177     61     133     180    -102    -119

Angles are defined: P-α-O5'-β-C5'-γ-C4'-δ-C3'-ε-O3'-ζ-P

6778 Nucleic Acids Research, Vol. 19, No. 24 All of these features make model A difficult to construct. Because we have exhausted all reasonable local transformations, translations or rotation of the G.G pair within the helix, further deformations on a more global level must be introduced to create a more stable base pairing. This situation is improved by rotating the top and the bottom half of the helix to increase the helical twist above (+80) and below (+80) the G.G pair. No further modification of the helical twist improved the situation as the sugar phosphate chains G6P6 and G'7P'7 are very extended. The simplest transformation that reduces sugar phosphate chain extension or compression is to translate the top part of the oligonucleotide, above the G.G, sideways by 1 A towards G17 in the plane perpendicular to the axis of the helix. Similarly, translating the bottom part of the oligonucleotide below the G.G by 1 A towards G6 in the plane perpendicular to the helix axis. These modifications yield a mechanically more stable structure as well as a better fit to the NMR data. Simple force fitting of the NMR data at any stage of model building never yielded, after relaxation, either a good fit or a stable structure. Two reasons might be invoked for this: either that the number of distance measurements is too low or that global modifications beyond the range of the minimizer are required to accomodate the G.G pairing. In order to test whether the model obtained at this stage is structurally stable, and to probe for the helical twist around the G.G pair we have systematically varied it. Refinements were carried out under constraints for the sugar puckers by forcing the angle 6 to 1440 to hold the sugar pucker in a C2' endo conformation (24) and by forcing the hydrogen bonds of the three central base pairs to standard values. The results are shown in Figure 6. 
Starting from a model characterized by a total energy E = -176 kcal/mol and F = 2.44, we find models that fit the NMR data better, have a lower total energy and are more stable upon energy refinement without constraints. Data in Figure 6 were fitted to a parabola which has a minimum at ca. 80° for the energy fit and 85° for the F fit. Comparison with the control oligonucleotide containing a G.C base pair, 79° for the A5-T'8 to G7-C'6 twist, shows that the total helical twist is little changed by replacement of G.C by G.G. Note that the uncertainty in the helical twist is similar to the variations observed in the B-DNA dodecamer (32), in the global twist in short DNA (33) and in a previously studied oligonucleotide (25). A family of structures is shown in Figure 7: they were obtained by energy minimization without restraint of a classical B-DNA, solely deformed by the geometrical procedures described above. The structures were selected among those generated by the change of helical twist because of their good energy and fit to the NMR data and because their torsion angle values are the least changed with respect to classical B-DNA. For the best model the fit to the NMR data is given in Table 1 and the description in terms of torsion angles is given in Table 3. None of these torsion angles has been experimentally measured; they are solely indicative of the modifications required to introduce the G.G mismatch into the helix and are, in any case, subject to atomic fluctuations (34) within the family of acceptable models. It is remarkable that very few torsion angles are required to deviate from classical B-DNA to accommodate the G.G mismatch: r of A5 and C'6 are trans instead of gauche - and r of G7 is close to trans. The most striking change in wedge parameters is the small helical twist for the base pairs A5-T'8/G6-G'7, 32°, and the very large helical twist for the base pairs G6-G'7/G7-C'6, 51°.
This is clearly reflected in the stacking of the central bases as shown in Figure 8.
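The parabola fit used to locate the optimal helical twist can be sketched as an ordinary least-squares fit of a quadratic, whose minimum lies at the vertex. The Python sketch below is an illustration only: the function names are hypothetical, and the data are synthetic values generated from an exact parabola with a known minimum, not the points of Figure 6.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_parabola(xs, ys):
    """Least-squares fit of y = a*u**2 + b*u + c with u = x - mean(x).

    Centring x improves the conditioning of the normal equations.
    Returns (a, b, c, x0) where x0 is the mean of xs.
    """
    n = len(xs)
    x0 = sum(xs) / n
    u = [x - x0 for x in xs]
    s1 = sum(u)
    s2 = sum(v ** 2 for v in u)
    s3 = sum(v ** 3 for v in u)
    s4 = sum(v ** 4 for v in u)
    t0 = sum(ys)
    t1 = sum(v * y for v, y in zip(u, ys))
    t2 = sum(v * v * y for v, y in zip(u, ys))
    m = [[s4, s3, s2], [s3, s2, s1], [s2, s1, n]]
    rhs = [t2, t1, t0]
    d = det3(m)
    coeffs = []
    for col in range(3):
        # Cramer's rule: replace one column by the right-hand side.
        mc = [row[:] for row in m]
        for i in range(3):
            mc[i][col] = rhs[i]
        coeffs.append(det3(mc) / d)
    a, b, c = coeffs
    return a, b, c, x0

def parabola_minimum(xs, ys):
    """Return the x position of the minimum of the fitted parabola."""
    a, b, c, x0 = fit_parabola(xs, ys)
    if a <= 0:
        raise ValueError("fitted parabola has no minimum")
    return x0 - b / (2 * a)

# Synthetic example (NOT data from the paper): exact parabola with minimum at 80.
twists = [70.0, 75.0, 80.0, 85.0, 90.0]
scores = [0.05 * (t - 80.0) ** 2 + 1.0 for t in twists]
best_twist = parabola_minimum(twists, scores)
```

With the values quoted in the text, the two individual twists of the best model sum to 32° + 51° = 83°, which lies between the minima of the energy fit (ca. 80°) and the F fit (85°) and close to the 79° of the G.C control.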

DISCUSSION The NMR spectra and the fit of the distance constraints show that the predominant species present in solution corresponds to model A in Figure 5. The fit is good, although not as good as in our previous studies (12,25,35). The origin of this might lie in the dynamic character of the mismatch. We observe in the NMR spectra that the base resonances around the mismatch are broadened and, in the two-dimensional spectra, that those of G6 and G'7 are even more broadened. This could be explained by an equilibrium within a family of conformations closely related to structure A. In this structure the hydrogen bonding might not be uniquely defined, as suggested by the configurations of the hydrogen bond donors and acceptors in model A and also by our calculations: partial hydrogen bonding might occur between G6 NH2 and G'7 O6 or between G6 O6 and G'7 NH2. This may explain the dynamic character of the central part of the duplex. The relatively small lateral displacements of the G6 and G'7 bases resulting from such an equilibrium cannot be detected from the measured proton-proton distances, as the differences are within the limits of experimental error. The type of mismatched structure adopted here could arise not only from favourable stacking interactions and good hydrogen bonding but also from the possibility of partial hydrogen bonding, although we have no direct evidence of this. Observation of pairing scheme A instead of B was surprising in the light of the deformations introduced by this very wide and symmetric pairing scheme. This suggests that other factors, such as solvent interactions or local molecular motion as suggested above, are important in determining the final structure. Structures A and B belong to two different classes of models for which total energy comparison is not valid. However, detailed energy comparison within a particular class, as was carried out for model A (Figure 6), is meaningful.
The hydrogen bonding scheme corresponding to structure B has been observed in gels of guanosine and its derivatives which form quadruple helices (36-39). The base pairing scheme C has been observed in the crystal structures of both guanine and guanosine (40). That of structure D has been observed in the crystal structure of guanine hydrochloride monohydrate (41,42). Structure A has not been, to our knowledge, observed for a guanine derivative, though this type of hydrogen bonding has been found for uric acid (43). This structure, G(syn).G(anti), is one of the most exotic mismatches. The base pairing results in a C1'-C1' distance of ca. 13 Å, 0.5 Å larger than that observed for the G(anti).A(anti) mismatch (44) and 2.5 Å larger than that for normal Watson-Crick base pairs. The G(anti).A(anti) structure, which has also been observed in solution (45), resembles structure A in certain respects. However, the G.A mismatch can adopt another structure, G(anti).A(syn) (46,47), in which the C1'-C1' distance is 10.5 Å, close to that for standard Watson-Crick pairs, and which resembles structure B. Despite these features, our G.G mismatch is strikingly close to an Arnott B-DNA in respect of its torsion angles, as very few are significantly changed. This was also the case for the structure of the G(anti).A(anti) mismatch (44). We observe that the sum of the helical twists above and below the mismatch is also remarkably close to the situation with a normal G.C pair. Our modelling procedure is as conservative as possible. It implies that we have studied essentially the most basic geometrical deformations of B-DNA required to take into account the NMR data and required to yield a structurally stable molecular model. On the other hand, it does not provide information as to whether the duplex is kinked at the mismatch site, as the distances which can be measured by NMR are not very sensitive to this kind of deformation. Our modelling method appears to be well suited to the study of small oligonucleotides with deformations. In this way, we have explored many different conformations by translating or rotating parts of an initial B-DNA, followed by a study of various torsion angles and of the helical twist about the G.G pair. It provides a clear picture of the geometrical changes induced by the mismatched pair. We are currently studying the same sequence, but with a C.C mismatch at the centre. We hope that by comparing such systems we will gain insight into the factors which govern mismatch recognition and repair.

ACKNOWLEDGEMENTS We greatly appreciate the contributions of Miro Radman. We are most grateful to Prof. Jean Chavaudra and to Mr Michel Le Minor for access to the VAX computers. M.L.B. is a recipient of grants from l'Association pour la Recherche sur le Cancer, from la Ligue Nationale contre le Cancer and from the Université Pierre et Marie Curie. J.A.H.C., J.G.-A. and M.L.B. thank Prof. P.A.Kollman for providing us with AMBER 3.0.

REFERENCES
1. Radman, M. & Wagner, R. (1984). Curr. Top. Microbiol. Immun. 108, 23-28.
2. Wagner, R., Dohet, C., Jones, M., Doutriaux, M.-P. & Radman, M. (1984). Cold Spring Harbor Symp. Quant. Biol. 49, 611-615.
3. Lu, A.L., Clark, S. & Modrich, P. (1983). Proc. Natl. Acad. Sci. USA 80, 4639-4643.
4. Lu, A.L., Welsh, K., Clark, S., Su, S.S. & Modrich, P. (1984). Cold Spring Harbor Symp. Quant. Biol. 49, 589-596.
5. Dohet, C., Wagner, R. & Radman, M. (1985). Proc. Natl. Acad. Sci. USA 82, 503-505.
6. Kramer, B., Kramer, W. & Fritz, H.-J. (1984). Cell 38, 879-887.
7. Jones, M., Wagner, R. & Radman, M. (1987). Genetics 115, 605-610.
8. Fazakerley, G.V., Quignard, E., Woisard, A., Guschlbauer, W., van der Marel, G.A., van Boom, J.H., Jones, M. & Radman, M. (1986). EMBO J. 5, 3697-3703.
9. van der Marel, G.A., van Boeckel, C.A.A., Wille, G. & van Boom, J.H. (1981). Tetrahedron Lett. 22, 3887-3890.
10. Marugg, J.E., Tromp, M., Jhurani, P., Hoyng, C.F., van der Marel, G.A. & van Boom, J.H. (1984). Tetrahedron 40, 73-78.
11. Bodenhausen, G., Kogler, H. & Ernst, R.R. (1984). J. Magn. Reson. 58, 370-388.
12. Cuniasse, Ph., Sowers, L.C., Eritja, R., Kaplan, B., Goodman, M.F., Cognet, J.A.H., Le Bret, M., Guschlbauer, W. & Fazakerley, G.V. (1987). Nucleic Acids Res. 15, 8003-8022.
13. Withka, J.M., Swaminathan, S. & Bolton, P.H. (1990). J. Magn. Reson. 89, 386-390.
14. Plateau, P. & Gueron, M. (1982). J. Am. Chem. Soc. 104, 7310-7311.
15. Davis, D.G. & Bax, A. (1985). J. Am. Chem. Soc. 107, 2820-2821.
16. Arnott, S., Campbell-Smith, P. & Chandrasekharan, P. (1976). CRC Handbook Biochem. 2, 411-414.
17. Le Bret, M., Gabarro-Arpa, J., Gilbert, J.Ch. & Lemarechal, Cl. (1991). J. Chim. Phys. Phys-Chim. Biol. 88, in press.
18. Gabarro-Arpa, J., Cognet, J.A.H. & Le Bret, M. J. Mol. Graphics, in press.
19. Weiner, P. & Kollman, P.A. (1981). J. Comp. Chem. 2, 287-303.
20. Weiner, S.J., Kollman, P.A., Nguyen, D.A. & Case, D.A. (1986). J. Comp. Chem. 7, 230-252.
21. Singh, U.C., Weiner, P.K., Caldwell, J.W. & Kollman, P.A. (1986). AMBER 3.0, University of California, San Francisco.
22. Gelin, B. & Karplus, M. (1981). Proc. Natl. Acad. Sci. USA 72, 2002-2006.
23. Weiner, S.J., Kollman, P.A., Case, D.A., Singh, U.C., Ghio, C., Alagona, G., Profeta, S.Jr. & Weiner, P. (1984). J. Am. Chem. Soc. 106, 765-784.
24. Dickerson, R.E., Kopka, M. & Pjura, P. (1985). Biological Macromolecules and Assemblies. Vol. 2. Nucleic Acids and Interactive Proteins. Jurnak, F.A. & McPherson, A. eds., John Wiley & Sons, New York.
25. Cuniasse, Ph., Sowers, L.C., Eritja, R., Kaplan, B., Goodman, M.F., Cognet, J.A.H., Le Bret, M., Guschlbauer, W. & Fazakerley, G.V. (1989). Biochemistry 28, 2018-2026.
26. Fazakerley, G.V., van der Marel, G.A., van Boom, J.H. & Guschlbauer, W. (1984). Nucl. Acids Res. 12, 8269-8279.
27. Sowers, L.C., Eritja, R., Kaplan, B., Goodman, M.F. & Fazakerley, G.V. (1988). J. Biol. Chem. 263, 14794-14801.
28. Sowers, L.C., Goodman, M.F., Eritja, R., Kaplan, B. & Fazakerley, G.V. (1989). J. Mol. Biol. 205, 437-447.
29. Carbonnaux, C., Fazakerley, G.V. & Sowers, L.C. (1990). Nucleic Acids Res. 18, 4075-4081.
30. Donohue, J. (1956). Proc. Natl. Acad. Sci. USA 42, 60-65.
31. Saenger, W. (1984). Principles of Nucleic Acid Structure, Springer Verlag, New York.
32. Fratini, A.V., Kopka, M.L. & Drew, H.R. (1982). J. Biol. Chem. 257, 14686-14707.
33. Shore, D. & Baldwin, R.L. (1983). J. Mol. Biol. 170, 957-982.
34. McCammon, J.A. & Harvey, S.A. (1987). Dynamics of Proteins and Nucleic Acids, Cambridge University Press.
35. Cognet, J.A.H., Gabarro-Arpa, J., Cuniasse, Ph., Fazakerley, G.V. & Le Bret, M. (1990). J. Biomol. Struct. Dynam. 7, 1095-1115.
36. Gellert, M., Lipsett, M.N. & Davies, D.R. (1962). Proc. Natl. Acad. Sci. USA 48, 2013-2018.
37. Tougard, P., Chantot, J.-F. & Guschlbauer, W. (1973). Biochim. Biophys. Acta 308, 9-16.
38. Sasisekharan, V., Zimmerman, S.B. & Davies, D.R. (1975). J. Mol. Biol. 92, 171-179.
39. Voet, D. & Rich, A. (1970). Prog. Nucl. Acid Res. 10, 183-265.
40. Bugg, C.E., Thewalt, U.T. & Marsh, R.E. (1968). Biochem. Biophys. Res. Commun. 33, 436-440.
41. Sobell, H.M. (1966). J. Mol. Biol. 18, 1-7.
42. Mazza, F., Sobell, H.M. & Kartha, G. (1969). J. Mol. Biol. 43, 407-422.
43. Ringertz, H. (1966). Acta Cryst. 20, 397-403.
44. Prive, G.G., Heineman, U., Chandrasegaran, S., Kan, L.S., Kopka, M.L. & Dickerson, R.E. (1987). Science 238, 498-504.
45. Patel, D.J., Kozlowski, S.A., Ikuta, S. & Itakura, K. (1984). Biochemistry 23, 3207-3217.
46. Brown, T., Hunter, W.N., Kneale, G. & Kennard, O. (1986). Proc. Natl. Acad. Sci. USA 83, 2402-2406.
47. Hunter, W.N., Brown, T. & Kennard, O. (1986). J. Biomol. Struct. Dynam. 4, 173-191.
