J. Mol. Biol. (1990) 215, 411-428

Solution Conformation of Purine-pyrimidine DNA Octamers using Nuclear Magnetic Resonance, Restrained Molecular Dynamics and NOE-based Refinement

James D. Baleja1, Markus W. Germann2, Johan H. van de Sande2 and Brian D. Sykes1

1Department of Biochemistry and Medical Research Council of Canada Group in Protein Structure and Function, University of Alberta, Edmonton, Alberta T6G 2H7, Canada

2Department of Medical Biochemistry, University of Calgary, Calgary, Alberta T2N 4N1, Canada

(Received 11 December 1989; accepted 29 May 1990)

The solution structures of two alternating purine-pyrimidine octamers, [d(G-T-A-C-G-T-A-C)]2 and the reverse sequence [d(C-A-T-G-C-A-T-G)]2, are investigated by using nuclear magnetic resonance spectroscopy and restrained molecular dynamics calculations. Chemical shift assignments are obtained for non-exchangeable protons by a combination of two-dimensional correlation and nuclear Overhauser enhancement (NOE) spectroscopy experiments. Distances between protons are estimated by extrapolating distances derived from time-dependent NOE measurements to zero mixing time. Approximate dihedral angles are determined within the deoxyribose ring from coupling constants observed in one and two-dimensional spectra. Sets of distance and dihedral determinations for each of the duplexes form the basis for structure determination. Molecular dynamics is then used to generate structures that satisfy the experimental restraints incorporated as effective potentials into the total energy. Separate runs start from classical A and B-form DNA and converge to essentially identical structures. To circumvent the problems of spin diffusion and differential motion associated with distance measurements within molecules, models are improved by NOE-based refinement in which observed NOE intensities are compared to those calculated using a full matrix analysis procedure. The refined structures generally have the global features of B-type DNA. Some, but not all, variations in dihedral angles and in the spatial relationships of adjacent base-pairs are observed to be in synchrony with the alternating purine-pyrimidine sequence.

1. Introduction

Although DNA displays considerable structural diversity, deoxyoligonucleotides can be classified into one of several distinct conformational families. The relative stability of possible conformations depends upon both base sequence and external factors such as the composition of the solution medium and the presence of superhelical torsional forces. Nuclear magnetic resonance (n.m.r.†) experiments (Cohen, 1987; Gronenborn & Clore, 1985; Patel et al., 1982), crystallographic determinations (Wang et al., 1979; Dickerson & Drew, 1981), and enzymatic studies (McLean & Wells, 1988; Naylor et al., 1986) of the right-handed A and B-type helical forms, and the left-handed Z form, exemplify the heterogeneity within each major structural class. Z conformations are found in sequences with a regular alternation of purine and pyrimidine residues, which are mostly guanine and cytosine, although there have been some exceptions (Feigon et al., 1985; McLean & Wells, 1988). Alternating adenine-thymine tracts, however, adopt right-handed conformations in solution (Lomonossoff et al., 1981; Suzuki et al., 1986). Intermediary sequences of 50% G+C content are of interest because they appear in genomes, such as in the anti-Z-DNA antibody binding regions of SV40 viral DNA (Hagen et al., 1985; Nordheim & Rich, 1983), and because they exhibit conformational polymorphism (Z-DNA tract formation, cruciform extrusion) as a response to different conditions of superhelical stress when cloned into supercoiled plasmids (McLean & Wells, 1988; Naylor et al., 1988). To understand the predisposition of these sequences to promote the B-Z transition, a study of the structural details of [d(G-T-A-C-G-T-A-C)]2 and [d(C-A-T-G-C-A-T-G)]2 as linear duplexes, without the effects of superhelical stress, was undertaken.

† Abbreviations used: n.m.r., nuclear magnetic resonance; NOE, nuclear Overhauser effect; J, n.m.r. spin-spin coupling constant; NOEs, NOESY cross-peak intensities; r.m.s., root-mean-square; COSY, 2-dimensional correlation spectroscopy; NOESY, 2-dimensional NOE spectroscopy; MD, molecular dynamics; CATG, [d(C-A-T-G-C-A-T-G)]2; GTAC, [d(G-T-A-C-G-T-A-C)]2; t1, acquisition time for first dimension of 2D n.m.r. experiment; t2, acquisition time for second dimension of 2D n.m.r. experiment.

© 1990 Academic Press Limited

Two-dimensional n.m.r. techniques yield data that can provide high-resolution molecular structures in solution. In particular, the nuclear Overhauser effect (NOE) results from the proximity of protons in space and can be used to determine their separation (Noggle & Schirmer, 1971; Havel & Wüthrich, 1985; Patel et al., 1987). These data can be supplemented with measurements of proton-proton coupling constants (J), which provide quantitative estimates of torsion angles (Rinkel & Altona, 1987; Chary et al., 1988). Sets of interproton and dihedral measurements are then used as a basis for structure determination by incorporating these restraints as effective potentials into the total energy function of the system in restrained molecular dynamics simulations (Behling et al., 1987; Kaptein et al., 1985). The empirical energy function ensures that structures that satisfy the experimental restraints still have approximately correct local stereochemistry and non-bonded interactions (Nilsson et al., 1986; Nilsson & Karplus, 1986). The distance between two spins is often estimated from NOE data by assuming inverse proportionality to the sixth root of the NOE cross-peak intensity. Distances so derived are only approximate, since the cross-peak intensity due to direct cross-relaxation between spins i and j is modified by additional cross-relaxation with any spin k, especially if spin k exists such that $r_{ik} < r_{ij}$ or $r_{jk} < r_{ij}$.
However, NOE cross-peak intensities may be predicted from the structures produced by dynamics and compared directly to the observed intensities, eliminating the approximate distance calculation (Keepers & James, 1984; Suzuki et al., 1986; Gupta et al., 1988; Boelens et al., 1988). The structures resulting from restrained molecular dynamics calculations can be refined in an iterative manner so as to minimize the difference between the two sets of NOEs (Baleja et al., 1990a).

In this paper, we investigate the solution structures of two self-complementary alternating purine-pyrimidine DNA oligonucleotides, [d(G-T-A-C-G-T-A-C)]2 and [d(C-A-T-G-C-A-T-G)]2. After sequential assignment of all the non-exchangeable protons, approximate interproton distances are obtained by extrapolating distances derived from time-dependent NOE measurements to zero mixing time (Baleja et al., 1990a). Estimates of coupling constants from one and two-dimensional spectra enable us to limit the conformational space for all glycosidic torsion angles. For each oligonucleotide, restrained molecular dynamics simulations are started from A and from B-form DNA (atomic root-mean-square (r.m.s.) difference of 4.3 Å (1 Å = 0.1 nm)) and converge to structures that satisfy the experimental restraints. For [d(G-T-A-C-G-T-A-C)]2 and [d(C-A-T-G-C-A-T-G)]2, the atomic r.m.s. difference between the averaged dynamics structures (0.66 and 0.65 Å, respectively) is comparable to the r.m.s. fluctuations of the atoms about their average positions. The resulting structures are refined by comparing observed NOE intensities to the NOE intensities calculated using a full matrix analysis procedure (Macura & Ernst, 1980; Keepers & James, 1984) and minimizing the difference between the two sets of NOEs (Baleja et al., 1990a). Final structures have NOE R factors (Baleja & Sykes, 1988; Gupta et al., 1988) of 0.19 for [d(G-T-A-C-G-T-A-C)]2 and 0.23 for [d(C-A-T-G-C-A-T-G)]2, each of which represents a high quality of fit to the experimental data. Conformational parameters of the structures are analyzed with respect to base-sequence dependence and chain termination effects. In general, they reflect B-type DNA features, qualitatively in agreement with previous 1H n.m.r. studies of related sequences (Lown et al., 1984; Nilges et al., 1987; Nilsson et al., 1986; Searle et al., 1988; Stevens et al., 1988).

2. Experimental Procedures

(a) Sample preparation

Deoxyoligonucleotides, d(G-T-A-C-G-T-A-C) and d(C-A-T-G-C-A-T-G), were prepared on an Applied BioSystems model 380A DNA synthesizer using phosphoramidite chemistry (Beaucage & Caruthers, 1981). After deblocking and detritylation, synthesis products were purified by anion-exchange chromatography at pH 13.0 on NACS-20 (Bethesda Research Laboratories). Analysis of samples of the purified oligonucleotides, which were 5'-end-labeled with [γ-32P]ATP and phage T4 polynucleotide kinase (Chaconas & van de Sande, 1980; Germann et al., 1985), gave a single band on denaturing 20% (w/v) polyacrylamide gels for each product. The purified oligonucleotides were de-salted by G25 chromatography and lyophilized. By heating each sample in 1.3 ml of 10 mM-sodium phosphate buffer (pH 7.0), 0.1 M-NaCl, 25 μM-EDTA to 85°C and allowing the solutions to cool to room temperature over several hours, strands annealed to form the duplexes:

    GTAC    (5'-3')  (G1-T2-A3-C4-G5-T6-A7-C8)
            (3'-5')  (C8-A7-T6-G5-C4-A3-T2-G1)

and

    CATG    (5'-3')  (C1-A2-T3-G4-C5-A6-T7-G8)
            (3'-5')  (G8-T7-A6-C5-G4-T3-A2-C1),

where GTAC and CATG are abbreviations for [d(G-T-A-C-G-T-A-C)]2 and [d(C-A-T-G-C-A-T-G)]2. Solutions were passed over the Na+ form of Chelex 100 to remove paramagnetic metal ions prior to lyophilization. Samples were dissolved and re-lyophilized 3 times in increasing grades of 2H2O, and finally taken up in 0.65 ml of 99.997% (v/v) 2H2O. Final buffer concentrations for both samples were 0.2 M-NaCl, 10 mM-NaH2PO4, 10 mM-Na2HPO4 (pH 7.3, direct meter reading), 50 μM-EDTA. Duplex concentrations were 0.5 and 0.6 mM for CATG and GTAC, respectively.

(b) Nuclear magnetic resonance spectroscopy

All n.m.r. spectra were obtained on a Varian XL 400 n.m.r. spectrometer with an operational frequency of 400 MHz for protons. Experiments were taken with 2000 data points along t2 and with 256 t1 increments. The spectral width employed was approx. 3400 Hz, with a relaxation delay time of 2.0 s. Streaking along t1 was reduced by multiplying the first time-domain point by a factor optimized near 0.5 (Otting et al., 1986). Absolute value COSY spectra (Nagayama et al., 1980) were recorded at 26°C. The appropriate phase cycling was used for quadrature detection and to eliminate axial peaks and, in the case of NOESY spectra, single and multiple quantum coherences. Phase-sensitive NOESY spectra (States et al., 1982) were collected at 20°C, which was optimal for spectral resolution, and was a compromise between broader lines at lower temperatures and duplex fraying at higher temperatures. Average mixing times of 100, 200, 400 and 500 ms were used. A random delay of between -10 and +10 ms was incorporated to suppress zero quantum coherences. Spectra with the longer mixing times were mainly used for resonance assignment. Although spin diffusion was apparent, it did not interfere with the sequential assignment procedure. Low sample concentrations precluded the use of shorter mixing times.

Prior to 2-dimensional Fourier transformation, the data were zero-filled to 2048 points along the t1 dimension. COSY and NOESY data were weighted by both $\exp(t/RE)$ and $\exp(-t^{2}/AF^{2})$ functions in each dimension. Values of RE (to effect resolution enhancement) and AF (an apodization function to suppress truncation artifacts) were chosen so that for COSY spectra, the data were nearly nulled at the first and last time points (t), and were maximized at a point that corresponded to an acquisition time of about 1/(2J), where J is an average coupling constant (approx. 8 Hz) best suited to observe most correlations. For phase-sensitive NOESY spectra, RE values were chosen to give slight resolution enhancement, and an AF value was chosen small enough to avoid truncation effects. Such a procedure reduces tailing about the diagonal with little change in cross-peak intensity. Final symmetrized 2-dimensional spectra were 1000 by 1000 data points, representing a resolution of approximately 3.4 Hz per point. NOESY intensities were quantified by determining the volume integral of each cross-peak. Nominally empty areas perpendicularly adjacent to each cross-peak were examined for baseline correction.
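The combined weighting described above, exp(t/RE) multiplied by exp(-t^2/AF^2), has its maximum at t = AF^2/(2RE), which is how a choice of RE and AF can place the maximum near 1/(2J). The following Python sketch illustrates that relationship only. The function names and the value of AF are assumptions made for illustration; the RE and AF values actually used for the spectra are not given in the text.

```python
import math

def window(t, re, af):
    """Weight exp(t/RE) * exp(-t^2/AF^2) for a time-domain point at time t (s)."""
    return math.exp(t / re - (t / af) ** 2)

def peak_time(re, af):
    """Time of the window maximum, from d/dt (t/RE - t^2/AF^2) = 0."""
    return af ** 2 / (2.0 * re)

def re_for_peak(t_peak, af):
    """RE that places the window maximum at t_peak for a given AF."""
    return af ** 2 / (2.0 * t_peak)

J = 8.0                    # average coupling constant in Hz (value from the text)
T_PEAK = 1.0 / (2.0 * J)   # desired maximum at 1/(2J) = 0.0625 s
AF = 0.08                  # s; hypothetical value chosen only for this sketch
RE = re_for_peak(T_PEAK, AF)
```

With these illustrative numbers, `peak_time(RE, AF)` returns the requested 1/(2J), and the weight at that time exceeds the weight at both earlier and later times.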

(c) Restrained molecular dynamics calculations

Energy minimization and molecular dynamics (MD) calculations were carried out with the GROMOS program (de Vlieg et al., 1986) and force field, which consisted of the usual terms for bonds, bond angles, sinusoidal dihedral torsions, non-bonded interactions (van der Waals' and electrostatic), and harmonic terms to maintain proper planar or tetrahedral geometries, and to which 2 extra terms representing distance and dihedral restraints were added. The distance restraint square-well potential, $E_{DIS}$, was given by:

$$E_{DIS} = \begin{cases} 0.5\,C_{DIS}\,(r_{ij} - r_{ij}^{U})^{2} & \text{if } r_{ij} > r_{ij}^{U} \\ 0.5\,C_{DIS}\,(r_{ij}^{L} - r_{ij})^{2} & \text{if } r_{ij} < r_{ij}^{L}, \end{cases} \qquad (1)$$

where $r_{ij}^{U}$ and $r_{ij}^{L}$ were the upper and lower estimates of the distance between protons i and j, respectively, $r_{ij}$ was the calculated distance, and $C_{DIS}$ was a force constant. The effective dihedral angle restraint, $E_{DLR}$, was represented by:

$$E_{DLR} = \begin{cases} 0.5\,C_{DLR}\,(\phi_{k} - \phi_{k}^{U})^{2} & \text{if } \phi_{k} > \phi_{k}^{U} \\ 0.5\,C_{DLR}\,(\phi_{k}^{L} - \phi_{k})^{2} & \text{if } \phi_{k} < \phi_{k}^{L}, \end{cases} \qquad (2)$$

where $\phi_{k}^{U}$ and $\phi_{k}^{L}$ were upper and lower allowed limits of the torsion angle, $\phi_{k}$ was the calculated angle, and $C_{DLR}$ was the force constant.

Several small alterations to the GROMOS force field were made in order to be more consistent with the nucleic acid force field of the CHARMM molecular mechanics programs (Nilsson & Karplus, 1986); these have been tested on a DNA decamer (Baleja et al., 1990b). The normal van der Waals' radius on united methylene carbon atoms was reduced from 2.22 to 2.10 Å to avoid steric clashes between C-2' and one of the oxygen atoms on the 3' phosphate. Methine carbon atoms were given van der Waals' radii of 2.05 Å. Corresponding 1-4 van der Waals' interactions were left unaltered. The effect of solvent was approximated for structural determination by a 1/εr screening function, where r was the separation of the charged groups in Å (Brooks et al., 1983) and the dielectric constant, ε, was equal to 4 (Weiner et al., 1984). The net charge on each phosphate group was reduced to 0.32e (Tidor et al., 1983; Nilsson et al., 1986).

Starting models were first subjected to 200 steps of steepest descents energy minimization. During the first 10 ps of each MD simulation, values of the distance restraint force constant were increased from 500 to 10,000 kJ mol⁻¹ nm⁻² and the dihedral restraint force constant from 5 to 50 kJ mol⁻¹ rad⁻². Velocities were re-initialized (taken from a Maxwellian distribution at 300 K) with every increase in the force constants (approx. every 1.5 ps). Newton's equations of motion were integrated with a time step of 2 fs, with all bond lengths kept rigid by the SHAKE algorithm (Ryckaert et al., 1977). The molecule was weakly coupled to a temperature bath with a reference temperature of 300 K using a coupling constant of 0.1 ps (Berendsen et al., 1984). A cut-off radius of 10 Å was applied for non-bonded interactions and the non-bonded pair list was updated every 20 fs.
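The two restraint terms of eqns (1) and (2) share one functional form: a half-harmonic penalty outside the allowed interval. The Python sketch below illustrates that form and is not the GROMOS implementation. It assumes the penalty is zero inside the bounds, as the term "square well" implies, and it ignores the periodicity of torsion angles; the function names are hypothetical.

```python
def square_well(value, lower, upper, force_constant):
    """Half-harmonic penalty outside [lower, upper]; assumed zero inside the well."""
    if value > upper:
        return 0.5 * force_constant * (value - upper) ** 2
    if value < lower:
        return 0.5 * force_constant * (lower - value) ** 2
    return 0.0

def restraint_energy(distances, distance_bounds, torsions, torsion_bounds,
                     c_dis, c_dlr):
    """Sum of the distance (eqn 1) and dihedral (eqn 2) restraint energies.

    distances in nm with c_dis in kJ mol^-1 nm^-2; torsions in rad with
    c_dlr in kJ mol^-1 rad^-2. Bounds are (lower, upper) tuples.
    """
    e_dis = sum(square_well(r, lo, up, c_dis)
                for r, (lo, up) in zip(distances, distance_bounds))
    e_dlr = sum(square_well(p, lo, up, c_dlr)
                for p, (lo, up) in zip(torsions, torsion_bounds))
    return e_dis + e_dlr
```

For example, a calculated distance of 0.35 nm against an upper bound of 0.30 nm with the final force constant of 10,000 kJ mol⁻¹ nm⁻² contributes 0.5 × 10,000 × 0.05² = 12.5 kJ mol⁻¹.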
Molecular dynamics runs with the highest values of $C_{DIS}$ and $C_{DLR}$ were continued to 20 ps in total, and co-ordinates were averaged over the last 5 ps. Averaged molecular dynamics structures were then subjected to 200 steps of energy minimization to correct distortions in the structure caused by the averaging procedure.

We performed a total of 4 molecular dynamics runs for structural determinations with experimental distance and dihedral restraints. The experimental data sets for each DNA octamer were based on observations from their 2-dimensional n.m.r. spectra (see below). For CATG, one MD run was started with a model in a classical A-type DNA conformation (Arnott & Hukins, 1972) and a second MD run used a starting model of CATG in an average B-DNA configuration. Two other MD runs were performed for GTAC, also with A and B-DNA starting models.

(d) NOE-based structure refinement

Molecular dynamics calculations result in structures that satisfy the 2 sets of experimental restraints: the approximate distance set and the dihedral angle set. Distance restraints based on the NOE intensities are most often inaccurate because of spin diffusion effects. In addition, the distance restraints for Watson-Crick base-pairing force some degree of planarity on the 2 bases of a base-pair, and may affect certain conformational parameters, such as propeller twist. After non-exchangeable protons were added in geometrically reasonable positions, structures were subjected to further refinement based directly on NOE intensities. The force field included a description of non-exchangeable protons that were given no van der Waals' radii, as the carbon atoms to which they were attached retained the united atom approach for the non-bonded interaction calculation. Terms for bond length (1.08 Å), bond angles and improper dihedral angles were added to maintain proper stereochemistry. The effective potential for distance restraints was replaced by an $E_{NOE}$ potential for NOE restraints (Baleja & Sykes, 1988; Baleja et al., 1990a):

$$E_{NOE} = 0.5\,C_{NOE} \sum (NOE_{obs} - NOE_{calc})^{2}, \qquad (3)$$

with the forces due to this pseudo-energy potential calculated as:

$$F_{x_i} = -\frac{\partial E_{NOE}}{\partial x_i} = C_{NOE} \sum (NOE_{obs} - NOE_{calc})\,\frac{\partial NOE_{calc}}{\partial x_i}. \qquad (4)$$

Derivatives of the NOE with respect to a change in each Cartesian co-ordinate of proton i were evaluated numerically by changing the co-ordinate slightly (by 0.01 Å) and re-calculating the NOE. Evaluation of the function was repeated for each proton in the molecule. Forces from the NOE potential were evaluated at the 1st step and were subsequently evaluated every 10 steps during the molecular mechanics run. The NOE "energy" gradients included both direct forces, where 2 protons, i and j, were pushed closer together if the calculated $NOE_{ij}$ was too weak, or stretched apart if too strong, and indirect forces that resulted from the effect of moving proton i on $NOE_{jk}$ (Baleja et al., 1990a).

The pseudo-energy forces were calculated for off-diagonal NOE cross-peak intensities at 200 and 500 ms, which was a compromise between lower signal-to-noise at shorter mixing times and greater spin diffusion effects at longer times. The simultaneous fitting of all NOE intensities for even a single, long mixing time (such as 500 ms) requires inclusion of spin diffusion effects. Although the inclusion of NOE data sets with all shorter mixing times would improve accuracy by providing more data points, these would have a smaller contribution from spin diffusion effects, and overall would not appreciably affect the final generated structures. The intensities of diagonal peaks were not measured, since they are resolved only for aromatic base and 1' protons and, if included, they dominate the relaxation matrix without giving specific information on the structure (Borgias & James, 1988). Associated errors in the diagonal intensities result in a poorer match between observed and calculated off-diagonal intensities.

NOE cross-peak intensities between non-exchangeable protons were calculated from:

$$A(\tau_m) = Z \exp(-\lambda \tau_m)\, Z^{-1} A(0), \qquad (5)$$

where Z was the matrix of eigenvectors of the relaxation matrix R, λ was the diagonal matrix of eigenvalues, and $\tau_m$ was the mixing time (Bodenhausen & Ernst, 1982). R was the cross-relaxation matrix assuming homonuclear dipolar relaxation for a macromolecule tumbling isotropically in solution (Solomon, 1955; Abragam, 1961), and had diagonal elements:

$$R_{ii} = Q \sum_{j \neq i} \frac{1}{r_{ij}^{6}} \left( J_0(\omega) + 3J_1(\omega) + 6J_2(\omega) \right), \qquad (6)$$

and off-diagonal elements:

$$R_{ij} = \frac{Q}{r_{ij}^{6}} \left( 6J_2(\omega) - J_0(\omega) \right), \qquad (7)$$

where Q was equal to $0.1\gamma^{4}\hbar^{2}$, and the spectral densities, $J_n(\omega)$, had the form:

$$J_n(\omega) = \frac{\tau_{ij}}{1 + (n\omega_0\tau_{ij})^{2}}, \qquad (8)$$

with $\tau_{ij}$ the effective correlation time of the interproton vector between i and j, and $\omega_0$ the Larmor frequency. We incorporated correlation time adjustment factors during refinement to take into account differential motion:

$$\tau_{ij} = \tau_c\, S_i S_j. \qquad (9)$$

The product, $S_i S_j$, is related to the order parameter, $S^{2}$, of Lipari & Szabo (1982), and can vary between 0 and 1. A value of 1 indicates that the correlation time of the interproton vector is the same as the overall tumbling time of the macromolecule; a value of zero indicates complete motional freedom. Internal motions in DNA (Hogan & Jardetzky, 1980; Keepers & James, 1982; Clore & Gronenborn, 1984) are indicated here with correlation time adjustment factors of less than unity, which effectively reduce correlation times for more mobile protons. Values of 0.65, 0.85 and 0.9 were used empirically to reflect the increased motion of all thymine methyl groups, sugar 2' and 2" methylene protons, and the 5' and 3' terminal residues, respectively, with an overall correlation time of 3.0 ns. These correlation time adjustment factors result in the lowest NOE residual, or R factor, for any right-handed DNA structure (Baleja et al., 1990a).

The NOE R factor is also used to monitor the fit of all observed NOE intensities to those calculated from a given structure (Lefèvre et al., 1987; Gupta et al., 1988; Baleja & Sykes, 1988):

$$R = \frac{\sum \left| NOE_{obs} - NOE_{calc} \right|}{\sum NOE_{obs}}, \qquad (10)$$

where the summation runs over the observed NOE cross-peaks at mixing times of 200 and 500 ms.

Structure refinement began with 100 steps of energy minimization using a dihedral force constant of 50 kJ mol⁻¹ rad⁻² and an NOE force constant of 1000 kJ mol⁻¹ (ΔNOE)⁻². A further 100 steps of minimization with $C_{NOE}$ set to 2000 kJ mol⁻¹ (ΔNOE)⁻² completed the refinement procedure. (Larger NOE force constants cause unacceptable energies or distortions in the proton stereochemistry.) Helical parameters of the final structures were analyzed with the programs AHELIX, BROLL and CYLIN (Fratini et al., 1982; Dickerson et al., 1985a,b).
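The full matrix calculation of eqns (5) to (9) and the R factor of eqn (10) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' program: the matrix exponential exp(-R τm) is evaluated by a scaled Taylor series instead of the eigen-decomposition of eqn (5) (the two are mathematically equivalent), A(0) is taken as the identity matrix, and the numerical value of Q (0.1γ⁴ħ² for protons, converted to Å⁶ s⁻² in SI form) was computed for this sketch and is not quoted from the paper. All function names are hypothetical.

```python
import math

# Q = 0.1 * gamma^4 * hbar^2 for protons in Å^6 s^-2 (assumed value, see lead-in).
Q = 5.6965e10

def spectral_density(n, omega0, tau):
    """J_n = tau / (1 + (n * omega0 * tau)^2), eqn (8)."""
    return tau / (1.0 + (n * omega0 * tau) ** 2)

def relaxation_matrix(coords, omega0, tau_c, factors=None):
    """Relaxation matrix R from proton coordinates in Å, eqns (6), (7), (9)."""
    n = len(coords)
    s = factors if factors is not None else [1.0] * n
    rmat = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            q_r6 = Q / math.dist(coords[i], coords[j]) ** 6
            tau = tau_c * s[i] * s[j]  # correlation time adjustment, eqn (9)
            j0, j1, j2 = (spectral_density(k, omega0, tau) for k in (0, 1, 2))
            rmat[i][j] = q_r6 * (6.0 * j2 - j0)
            rmat[i][i] += q_r6 * (j0 + 3.0 * j1 + 6.0 * j2)
    return rmat

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def expm(m, terms=20):
    """Matrix exponential by Taylor series with scaling and squaring."""
    n = len(m)
    norm = max(sum(abs(x) for x in row) for row in m)
    squarings = max(0, math.ceil(math.log2(norm))) if norm > 0.0 else 0
    scaled = [[x / 2.0 ** squarings for x in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

def noe_intensities(rmat, tau_m):
    """A(tau_m) = exp(-R * tau_m) with A(0) taken as the identity matrix."""
    return expm([[-x * tau_m for x in row] for row in rmat])

def r_factor(observed, calculated):
    """NOE R factor of eqn (10)."""
    return sum(abs(o - c) for o, c in zip(observed, calculated)) / sum(observed)
```

For three collinear protons spaced 2.5 Å apart, with a 3.0 ns correlation time at 400 MHz, the calculated cross-peak between the outer pair is dominated by transfer through the middle proton, which is the spin diffusion effect the refinement is designed to account for.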

3. Results

(a) Resonance assignment

A prerequisite for the determination of a solution structure by NOE measurements is the assignment of resonances to specific protons of the macromolecule. Resonances in nucleic acids can be identified by a combination of COSY and NOESY experiments, which provide, respectively, through-bond and through-space connectivities between individual nuclei (Aue et al., 1976; Nagayama et al., 1980; Jeener et al., 1979). COSY cross-peaks are observed for a pair of protons through their n.m.r. scalar coupling constant. The protons are generally two bonds apart, or three bonds apart with a favorable intervening dihedral angle. NOESY cross-peaks are observed between spatially proximate protons (generally with interproton distances of less than 5 Å).

Figure 1. The 400 MHz 1H n.m.r. NOESY spectra of the aromatic base and sugar 1' protons of DNA: (a) [d(G-T-A-C-G-T-A-C)]2; (b) [d(C-A-T-G-C-A-T-G)]2. The mixing time for the experiment is 500 ms. Cross-peaks occur between the resonance frequencies for which corresponding protons are spatially proximate. Asterisks mark the cytosine H-5 ↔ H-6 NOE cross-peaks. Cross-peaks between adenine H-2 protons and 1' protons are circled. For the 1st nucleotide of each duplex, the intra-residue base ↔ 1' NOE cross-peak is indicated by an arrow. Residue numbers are given for H-6/H-8 protons and 1' protons on the 1-dimensional spectra.

Figure 1 shows the well-resolved region of the NOESY spectrum indicating possible close approaches of base H-2, H-6 and H-8 protons (7 to 8.5 p.p.m.) to cytosine H-5 and sugar ring 1' protons (5.3 to 6.3 p.p.m.). Adenine H-8 and guanine H-8 base protons generally resonate between 8 and 8.5 p.p.m., and between 7.5 and 8 p.p.m., respectively (Shindo et al., 1988). Pyrimidine H-6 protons are between 7 and 7.7 p.p.m. COSY spectra of this spectral region each show only two strong peaks, which correspond to the approximately 8 Hz three-bond coupling between cytosine base H-5 and H-6 protons. The short 2.5 Å separation of this proton pair results in a strong NOE cross-peak at these resonance positions.
The absence of strong cross-peaks (with an intensity similar to the cytosine H-5 ↔ H-6 pair) between guanine H-8 and sugar 1' protons indicates that there is no syn guanosine conformation, as is observed in left-handed DNA helices. NOE cross-peaks are observed from cytosine H-5 to the base H-8 protons preceding (5') in sequence (adenine for [d(G-T-A-C-G-T-A-C)]2, guanine for [d(C-A-T-G-C-A-T-G)]2), revealing that both duplexes are right-handed (Cohen, 1987). Moreover, circular dichroism spectra (not shown) indicate right-handed helicity. Therefore, protocols developed for assignments of well-resolved, non-exchangeable protons in right-handed DNA apply (Feigon et al., 1982; Scheek et al., 1983; Hare et al., 1983; Chazin et al., 1986). Here, we review this assignment procedure by illustration with the sequence [d(C-A-T-G-C-A-T-G)]2, and update the assignment methods to include backbone 4', 5' and 5" protons.

The thymidine H-6 protons can be identified by a four-bond coupling constant showing as COSY cross-peaks to their respective methyl resonances (near 1.3 p.p.m., not shown). The remaining adenine H-2 resonances are attributable in one-dimensional spectra to the peaks with narrower linewidths and much longer spin-lattice relaxation times (T1) than the other base protons, which is a consequence of the position of the H-2 protons nearer the middle of the DNA helix and their resulting isolation from most other protons. Differential line shapes for protons at either 5' or 3' ends of the DNA duplex reflect the dynamic behavior of these residues. In both helices, terminal 1' sugar protons are sharper than the others, indicating faster motion on a nanosecond time-scale. In [d(G-T-A-C-G-T-A-C)]2, one of the cytosine H-6 base protons appears to be in slow chemical exchange (t1/2 ≥ 0.13 s) with an alternate form. The H-5 ↔ H-6 NOE intensity and linewidth for this cytosine are similar to those for the other cytosine residue. The dynamics of this terminal residue are not investigated further, but must be kept in mind when interpreting structures that represent this nucleic acid.

Figure 2. Resonance assignment in nucleic acids. The stereo-diagram represents the first 2 nucleotides of the [d(C-A-T-G-C-A-T-G)]2 duplex DNA. Spectral assignment most often begins with base H-6/H-8 and sugar 1' protons since they have the greatest spectral dispersion. In right-handed helices, the 1' proton is within 4 Å of the base proton of the same residue, and the base proton of the 5' residue (broken line). A connectivity for a proton pair is represented by a cross-peak in the NOESY spectrum at the resonance frequencies of the 2 protons involved (Fig. 1).

Having attributed base protons to their residue type and knowing which sugar protons are likely at terminal positions, a sequence-specific sequential assignment may now begin. For assignments in the base and 1' proton spectral region, we take advantage of the fact that the protons of a DNA duplex in 2H2O form a more or less equidistant linear array up and down each strand. Furthermore, in right-handed helices, base protons are near only two 1' sugar protons: their own and that of the preceding (5') residue, but not the 1' of the succeeding (3') residue (Fig. 2). In [d(C-A-T-G-C-A-T-G)]2, of the two cytosine H-6 protons, one has a cross-peak to its own H-5 proton, and to only one H-1' proton. This represents the terminal cytosine H-6 and its H-1', since the 5' nucleotide is absent. The sharper H-1' supports the assignment. The H-1' has the cross-peak to the C1 H-6, but also to an adenine H-8, which is, therefore, the second residue. In turn, A2 H-8 has a cross-peak to its own intra-residue 1' proton, as well as the C1 1' proton already noted. This connectivity pattern continues to the 3'-terminal G8 residue and is also observed throughout [d(G-T-A-C-G-T-A-C)]2. A similar sequential procedure (Fig. 2) exists between base H-8/H-6 and H-2' resonances and between base and the 2" protons, confirming assignments.
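The base to 1' connectivity walk described above can be expressed as a short procedure. The Python sketch below is an illustration only: it assumes an idealized, complete and unambiguous list of base ↔ 1' cross-peaks, in which each base proton shows peaks only to its own 1' proton and to that of the preceding (5') residue. The function name and the anonymous peak labels in the usage note are hypothetical and do not correspond to measured resonances.

```python
def sequential_walk(cross_peaks):
    """Order (base, 1') resonance pairs from the 5' to the 3' end of a strand.

    cross_peaks is an iterable of (base_label, h1_label) pairs, one per
    observed base <-> 1' NOE cross-peak.
    """
    base_to_h1, h1_to_base = {}, {}
    for base, h1 in cross_peaks:
        base_to_h1.setdefault(base, set()).add(h1)
        h1_to_base.setdefault(h1, set()).add(base)

    # The 5'-terminal base proton has a cross-peak to only one 1' proton.
    starts = [b for b, partners in base_to_h1.items() if len(partners) == 1]
    if len(starts) != 1:
        raise ValueError("cannot identify a unique 5'-terminal base proton")

    path, used_bases, used_h1 = [], set(), set()
    base = starts[0]
    while True:
        own = base_to_h1[base] - used_h1      # the 1' proton not yet assigned
        if len(own) != 1:
            raise ValueError("ambiguous intra-residue 1' proton for " + base)
        h1 = own.pop()
        path.append((base, h1))
        used_bases.add(base)
        used_h1.add(h1)
        following = h1_to_base[h1] - used_bases  # base proton of the 3' neighbor
        if not following:
            return path                          # reached the 3' terminus
        if len(following) != 1:
            raise ValueError("ambiguous sequential connectivity from " + h1)
        base = following.pop()
```

Given the peaks `[("q", "x"), ("p", "w"), ("q", "w"), ("r", "x"), ("r", "y"), ("s", "y"), ("s", "z")]` for a hypothetical four-residue strand, the walk starts at `p`, the only base proton with a single 1' partner, and returns the pairs in the order p/w, q/x, r/y, s/z. Real spectra contain overlap and missing peaks, so a practical procedure would need the cross-checks against 2' and 2" connectivities mentioned in the text.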
Adenine H-2 base proton assignments are made by the observation of weak cross-peaks to intra-residue 1' and succeeding 1' protons. Assignments into the ring can be made by examination of COSY spectra linking the assigned 1' protons to the 2' and 2" protons three bonds distant (Fig. 3). For all sugar ring conformations, the 1' proton is always closer, and has a larger NOE, to the 2" than to the 2' proton, thereby assigning the protons on the 2' carbon stereospecifically. Excepting 3' termini, for any given residue the 2' proton is upfield from the 2" proton (Hare et al., 1983). 3' Protons are assigned by COSY 2' ↔ 3' cross-peaks. The absence of strong 2" ↔ 3' cross-peaks (except for terminal residues) indicates that J2"3' is very small, and has implications for the sugar ring conformation (see below).

Assignment of the 4' and 5', 5" protons is illustrated for GTAC and CATG in Figure 4. In right-handed DNA, these protons are nearest, and NOE intensities are calculated to be strongest, to the intra-residue base H-6/H-8 protons. Assignments are more conventionally made (Gronenborn & Clore, 1985) by examining COSY and NOESY spectral regions between the 3' and 4', 5' and 5" protons. COSY cross-peaks between 3' and 4' protons are repeated in the NOESY spectra, but with additional intra-residue 3'-5' and 3'-5" correlations. The spectrum shown in Figure 4 is more useful, as the effect of the residual 2HHO signal and limited spectral dispersion is removed from one frequency axis. 5' and 5" protons were not assigned stereospecifically because of the limited dispersion of chemical shifts in the 4', 5', 5" COSY and NOESY spectral regions. NOE intensities involving some 5' and 5" protons would also be modified by strong coupling effects (Kay et al., 1986) not taken into account in this study.

Non-exchangeable proton assignments for GTAC and CATG are presented in Table 1. Several trends in chemical shifts reflect the alternating nature of the purine-pyrimidine sequences, and will be useful in future assignments of other related sequences. As has been noted (Shindo et al., 1988), base H-8 protons of purines resonate at lower field than H-6 protons of pyrimidines. For alternating purine-pyrimidine sequences, generally all proton resonances, except the 5' and 5" protons, are at lower field for a purine nucleotide. This is a consequence of the chemical nature of the larger purine ring, and the conformational preference of six-membered pyrimidine rings for a more negative χ torsion angle about the C-1'-N bond. Base H-6/H-8 protons nearer the 5' end of the duplex are at higher frequency than those of the same type of residue four residues down the chain.
The trend decreases further into tile chain, and eventually reverses. Tim chemical shifts do not vary within 0"Ol p.p.m. between 5 and 30~ (except for Cs H-6 of [d(G-T-AC-G-T-A-C)]2), indicating only limited fraying or

Figure 3. Assignment of 2', 2" and 3' protons in 400 MHz COSY spectra of the DNA octamers: (a) [d(G-T-A-C-G-T-A-C)]2; (b) [d(C-A-T-G-C-A-T-G)]2. Cross-peaks in these spectra occur between protons 3 bonds apart, which exist in a conformation such that there is a sizeable scalar coupling constant between them. In the top part of each panel, 1' protons have cross-peaks to 2' and 2" protons. There are corresponding cross-peaks for each residue between 2' and 3' protons (lower half).

reduced base stacking at the ends of the helices. The trends instead reflect chain termination effects, with the chemical shift being sensitive to both the chemical nature and the conformational change induced by the lack of any additional nucleotides (Grütter et al., 1988). The terminal base-pairs are most affected, but some distortion may extend further into the duplex.

(b) Distance determination

Distances between protons are obtained using a distance extrapolation procedure (Baleja et al., 1990a). Distances are derived at each mixing time, plotted against the mixing time, and extrapolated

Figure 4. Base H-6/H-8 NOE connectivities to backbone 3', 4', 5' and 5" protons: (a) [d(G-T-A-C-G-T-A-C)]2; (b) [d(C-A-T-G-C-A-T-G)]2. The 500 ms NOESY contour plots show close approaches of pyrimidine H-6 or purine H-8 protons to intra-residue 3', 4', 5' and 5" protons.

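back to zero mixing time as a first-order correction for spin diffusion effects:

$$ r_{ij} = \lim_{\tau_m \to 0} r_{ij}(\tau_m) = \lim_{\tau_m \to 0} r_{\mathrm{ref}} \left[ \frac{\mathrm{NOE}_{\mathrm{ref}}(\tau_m)}{\mathrm{NOE}_{ij}(\tau_m)} \right]^{1/6} \qquad (11) $$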
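Equation (11) can be illustrated with a short numerical sketch. The Python below is only an illustration under stated assumptions: the function names and the sample intensities are hypothetical, and the straight-line least-squares fit is an assumed form for the extrapolation, since the text specifies only that apparent distances are plotted against mixing time and extrapolated to zero.

```python
# Sketch of the distance extrapolation in equation (11).
# Function names and intensities are hypothetical; the linear fit is an assumption.


def apparent_distance(noe_ij, noe_ref, r_ref):
    """Apparent distance r_ij(tau_m) = r_ref * (NOE_ref / NOE_ij)**(1/6)."""
    return r_ref * (noe_ref / noe_ij) ** (1.0 / 6.0)


def extrapolate_to_zero(mixing_times, distances):
    """Fit a least-squares line r = a + b*tau_m and return a (the value at tau_m = 0)."""
    n = len(mixing_times)
    if n < 2 or n != len(distances):
        raise ValueError("need at least two (tau_m, distance) pairs")
    mean_t = sum(mixing_times) / n
    mean_r = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in mixing_times)
    if sxx == 0.0:
        raise ValueError("mixing times must not all be equal")
    sxy = sum((t - mean_t) * (r - mean_r) for t, r in zip(mixing_times, distances))
    slope = sxy / sxx
    return mean_r - slope * mean_t


R_REF = 2.46  # cytosine H-5 to H-6 reference distance in angstroms (from the text)
mixing_times = [100.0, 200.0, 300.0, 500.0]  # ms
noe_ref = [0.040, 0.075, 0.100, 0.130]  # reference pair, made-up intensities
noe_ij = [0.010, 0.021, 0.031, 0.048]   # pair of interest, made-up intensities

distances = [apparent_distance(x, ref, R_REF) for x, ref in zip(noe_ij, noe_ref)]
r_zero = extrapolate_to_zero(mixing_times, distances)
print(f"extrapolated distance at zero mixing time: {r_zero:.2f} A")
```

Because each apparent distance is a ratio of two intensities from the same spectrum, a change in overall gain between spectra cancels before the extrapolation is made.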

This method yields the same quality of distances as that obtained by fitting the NOE buildup curve with a polynomial in τm and determining the slope at τm = 0, but offers advantages in being self-correcting for changes in instrument gain between mixing time experiments and allowing a more direct visualization of spin diffusion effects in derived distances (Baleja et al., 1990a). For base and 1' sugar protons, the cytosine H-5-H-6 reference distance of 2.46 Å is used. Remaining distances between non-exchangeable protons are determined using the average NOEref × rref^6 products for the C-2' methylene 2'-2" proton pair (1.76 Å) and the 1'-2" pair (2.3(±0.1) Å for all sugar puckers). The upper and lower bound on each distance is estimated from the distance extrapolation curve and is generally ±5% of the distance (Baleja, 1990). For all distances greater than 3.5 Å and involving both an aromatic base and either of the C-2' methylene protons, upper bounds are increased by 0.2 Å to account for systematic distance under-determination for this arrangement of protons (Baleja et al., 1990a). Distances involving exchangeable protons were not included, since the corresponding NOE is modified by loss of magnetization due to exchange with H2O, even with measurement at lower temperatures. The alternating purine-pyrimidine DNA octamers are self-complementary so that restraints are entered in symmetry-related pairs. Experimental restraints† are summarized in Table 2.

† Full listings of experimental restraints used in structural determination and co-ordinates of final refined structures have been deposited in the Brookhaven Protein Data Bank (Chemistry Department, Brookhaven National Laboratories, Upton, Long Island, NY 11973).

Table 1
Proton chemical shift assignments for [d(G-T-A-C-G-T-A-C)]2 and [d(C-A-T-G-C-A-T-G)]2

                          Chemical shift (p.p.m.)†
Residue   H-6/H-8   H-2/H-5/CH3   1'      2'      2"      3'      4'      5'      5"
G1        7.91      -             5.98    2.68    2.77    4.80    4.22    3.78    3.78
T2        7.44      1.40          5.80    2.26    2.58    4.93    4.26    4.14    4.03
A3        8.31      7.52          6.25    2.73    2.93    5.05    4.45    4.09    4.19
C4        7.25      5.27          5.57    2.03    2.36    4.79    4.18    4.23    4.10
G5        7.82      -             5.92    2.58    2.76    4.94    4.32    4.12    4.07
T6        7.22      1.49          5.71    2.08    2.42    4.85    4.17    4.11    4.22
A7        8.26      7.57          6.25    2.67    2.86    5.00    4.39    4.15    4.09
C8        7.35      5.35          6.05    2.16    2.08    4.47    4.04    4.49    4.25

C1        7.72      5.98          5.72    2.00    2.43    4.72    4.06    3.72    3.72
A2        8.42      7.82          6.32    2.78    2.95    5.03    4.42    4.03    4.14
T3        7.16      1.42          5.77    2.03    2.42    4.87    4.19    4.27    4.08
G4        7.82      -             5.87    2.62    2.68    4.97    4.36    4.15    4.08
C5        7.37      5.36          5.63    2.07    2.38    4.84    4.17    4.13    4.10
A6        8.29      7.68          6.23    2.64    2.87    5.00    4.38    4.16    4.06
T7        7.12      1.46          5.82    1.85    2.28    4.82    4.11    4.24    4.08
G8        7.87      -             6.13    2.58    2.36    4.65    4.16    4.04    4.07

† Chemical shifts are given relative to 2,2-dimethyl-2-silapentane-5-sulfonate at 20°C. 5' and 5" protons are not assigned stereospecifically.

(c) Glycosidic dihedral angles

The geometry of the five-membered sugar ring of DNA can be described by five torsion angles ν0 to ν4. Because of ring closure, the values of ν0 to ν4 are interrelated:

$$ \nu_n = \nu_{\max} \cos[P + 144^{\circ}(n - 2)], \quad n = 0, \ldots, 4 \qquad (12) $$

where νmax is the maximum amplitude of the sugar ring pucker, and P is the sugar pseudorotation angle

(Altona & Sundaralingam, 1972). The magnitude of three-bond coupling constants between protons is dependent on the intervening dihedral angle, and therefore reflects the pseudorotation angle that specifies the conformation of the sugar ring (Hosur et al., 1986). The absence or presence of cross-peaks in magnitude COSY spectra gives qualitative information on sugar ring conformation. For example, except for 3' termini, 2"-3' correlations are absent in the COSY spectra (Fig. 3). This indicates that the sugar pseudorotation angle is between 100 and 250° for all non-(3')terminal sugar rings (Hosur et al., 1986). The observation of 3'-4' correlations (not shown) narrows the allowed range of the sugar pseudorotation angle to between 105 and 175°. Following established procedures (Chary et al., 1988; Hosur et al., 1988; Rinkel & Altona, 1987), the pseudorotation angle can be further restricted with individual J1'2' and J1'2" coupling constants measured from one-dimensional spectra (Fig. 1). Allowed ranges of glycosidic dihedral angles are obtained for each nucleotide from the coupling constant data assuming a νmax of 40° (Rinkel & Altona, 1987) and by using equation (12) (Baleja et al., 1990b).

(d) Right-handed DNA helix restraints

To preserve the right-handed character of the DNA during the molecular dynamics calculations, it was sometimes necessary to constrain backbone dihedral angles to be in a broad allowed region of torsional angle space (Gronenborn & Clore, 1989). The allowed angles (α, -90 to -30°; β, < -145° and > 135°; ε, < -60° and > 140°; ζ, < -45° and > 150°) are derived from a table of conformation angles found in the different DNA types (Suzuki et al., 1986) and from considering individual variations in right-handed helices from X-ray crystallographic

Table 2
Numbers and types of experimental restraints used in structural determinations

                    Numbers of experimental restraints
              NOE intensities        Distances (Å)                  Dihedral angles (°)
Structure†    Mix 200‡   Mix 500‡    NOE derived§   Watson-Crick‖   Glycosidic   Right-handed helical restraints
GTAC          258        280         244            48              80           56
CATG          220        278         208            48              80           56

† GTAC and CATG refer to [d(G-T-A-C-G-T-A-C)]2 and [d(C-A-T-G-C-A-T-G)]2, respectively.
‡ Mix 200 and Mix 500 are the mixing times of the NOESY experiment, in milliseconds. These intensities are used for NOE-based refinement.
§ NOE derived distances are obtained from experimentally observed NOE intensities with mixing times of 100 to 500 ms.
‖ Watson-Crick base-pairing restraints.
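The ring-closure relation of equation (12) is simple to evaluate numerically. The sketch below is a minimal illustration rather than the authors' program: the function name is hypothetical, and the example value of P (162°, in the C-2'-endo region usually associated with B-DNA) is an assumption chosen for illustration, while νmax = 40° is the amplitude assumed in the text.

```python
import math


def sugar_torsions(p_deg, nu_max_deg=40.0):
    """Return [nu_0, ..., nu_4] in degrees from nu_n = nu_max * cos(P + 144*(n - 2))."""
    return [
        nu_max_deg * math.cos(math.radians(p_deg + 144.0 * (n - 2)))
        for n in range(5)
    ]


# Hypothetical example: P = 162 degrees (an assumed value, not taken from the paper).
torsions = sugar_torsions(162.0)
for n, nu in enumerate(torsions):
    print(f"nu_{n} = {nu:6.1f} deg")
```

In this relation the five torsion angles sum to zero for any P, so a pair of values (P, νmax) fixes all five ring torsions at once.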

studies (Dickerson et al., 1985a,b). These right-handed helix restraints would cause no violations for any of the average A, B, alternating-B, C, D or wrinkled D DNA forms, nor for any individual angles found in the best studied single-crystal X-ray structures of B-DNA (Dickerson & Drew, 1981; Privé et al., 1987), except for two β angles (of residues G3 and C12) in the B-DNA form of a phosphorothioate analog of DNA (Cruse et al., 1986).

During molecular dynamics runs, base-pairs were kept Watson-Crick hydrogen-bonded by distance restraints between bases. These were, for all base-pairs: r(C-1'·C-1') across the base-pair = 10.87(±0.2) Å; for A·T base-pairs: r(A N-6·T O-4) = 2.8(±0.1) Å, r(A HN·T O-4) = 1.7(±0.1) Å, r(A N-1·T H-3) = 1.7(±0.1) Å, r(A N-1·T N-3) = 2.8(±0.1) Å; and for G·C pairs: r(G O-6·C HN) = 1.6(±0.1) Å, r(G O-6·C N-4) = 2.7(±0.1) Å, r(G N-1·C N-3) = 2.8(±0.1) Å, r(G H-1·C N-3) = 1.7(±0.1) Å, r(G N-2·C O-2) = 2.8(±0.1) Å, r(G HN·C O-2) = 1.7(±0.1) Å (Arnott & Hukins, 1972).

(e) Molecular dynamics calculations and NOE-based refinement

The two A and B-DNA starting structures for each MD run have an r.m.s. deviation of 4.3 Å (Table 3). NOE R factors of initial A-DNA models are 0.66 and 0.69, and for initial B-DNA models are 0.39 and 0.41, respectively, for GTAC and CATG. For each duplex, the application of molecular dynamics including the experimental distance restraints results in convergence to structures with an atomic r.m.s. deviation of 0.65 Å and 0.66 Å for GTAC and CATG, respectively. This is comparable to the r.m.s. fluctuations over the last five picoseconds of the molecular dynamics run (0.50 and 0.62 Å, respectively). The MD structures are shown in Figure 5(a) and (b). The structures drawn in bold are results starting from A-DNA models; the others have initial B-DNA models. The two structures for each duplex appear to be, notwithstanding motional dynamics, essentially identical. It is therefore appropriate to average over the last five picoseconds of both runs starting from A and from B-DNA (i.e. averaging over 10 ps of MD for each DNA octamer duplex). The r.m.s. fluctuations increase to only 0.55 and 0.65 Å for GTAC and CATG, respectively.

The averaged structures are each subjected to refinement directly based on the NOE intensities, and not the derived, approximate distances. NOE R factors on the MD structures are 0.24 and 0.26, respectively, for GTAC and CATG. Energy minimization reduces the NOE residual to 0.19 and 0.23, respectively. The larger R factor for CATG is likely not to be due to a poorer structural determination per se, but reflects the lower concentration of this duplex, with concomitantly lower signal-to-noise ratio in NOESY spectra, and increased error in the NOE intensities. We have shown (Baleja et al., 1990a) that an R factor of less than 0.20 indicates that the structure is in agreement with the NOE data within a reasonable experimental error. The average residual difference between observed and calculated NOE intensity per nucleotide and per nucleotide step is shown in Figure 6. The profile is rather uniform, showing that all parts of the DNA duplexes are equally in agreement with the NOE

Table 3
Atomic r.m.s. differences (Å) between alternating pyrimidine-purine DNA structures

          A-DNA    B-DNA    MD-A     MD-B     Rave
A-DNA     -        4.3      3.9      3.4      3.7
B-DNA     4.3      -        1.13     1.5      1.3
MD-A      4.3      1.2      -        0.65     0.33
MD-B      3.9      1.4      0.66     -        0.34
Rave      4.1      1.2      0.34     0.34     -

Upper right: overall r.m.s. difference (Å) for [d(G-T-A-C-G-T-A-C)]2. Lower left: overall r.m.s. difference (Å) for [d(C-A-T-G-C-A-T-G)]2.

A-DNA and B-DNA are starting structures with regular A and B geometries, respectively. MD-A and MD-B are average restrained molecular dynamics structures. NOE-based refinement results in the Rave structures. Numbers above the diagonal indicate the r.m.s. atomic deviation between 2 structures for [d(G-T-A-C-G-T-A-C)]2. Numbers below the diagonal show the comparison for [d(C-A-T-G-C-A-T-G)]2.

Figure 6. Residual differences between observed and calculated NOE intensities. The average difference between the observed NOE intensity and that calculated from the final refined structures for [d(G-T-A-C-G-T-A-C)]2 and [d(C-A-T-G-C-A-T-G)]2 is shown for intranucleotide NOEs of each residue (plotted against residue number), and for internucleotide NOEs of each base step (plotted against base step). Observed NOEs are first scaled to the calculated intensities by multiplication with a factor (Σ NOEcalc/Σ NOEobs). NOEs are expressed approximately as a fraction of the main diagonal peak, with the diagonal peak intensity taken as equal to 1.0. The intensity of a typical NOE (for example, the sugar ring 1'-2', corresponding to a distance of about 3 Å) is 0.039 for a 200 ms NOE experiment and 0.058 for 500 ms. The average NOE intensity over all cross-peaks at 200 and at 500 ms is 0.045.
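The scaling and averaging described in the Figure 6 caption can be sketched as follows. This is an illustrative outline only: the function names and sample intensities are hypothetical, and taking the absolute value of each difference before averaging is an assumption, since the caption does not say how signed differences are combined.

```python
def scale_observed(noe_obs, noe_calc):
    """Scale observed NOEs by the factor sum(NOE_calc) / sum(NOE_obs)."""
    total_obs = sum(noe_obs)
    if total_obs == 0:
        raise ValueError("observed intensities sum to zero")
    factor = sum(noe_calc) / total_obs
    return [factor * value for value in noe_obs]


def mean_abs_residual(noe_obs, noe_calc):
    """Average |scaled observed - calculated| over a set of cross-peaks.

    Using the absolute difference is an assumption made for this sketch.
    """
    if len(noe_obs) != len(noe_calc) or not noe_obs:
        raise ValueError("need two non-empty lists of equal length")
    scaled = scale_observed(noe_obs, noe_calc)
    return sum(abs(o - c) for o, c in zip(scaled, noe_calc)) / len(noe_calc)


# Made-up intensities for the cross-peaks of one residue (not data from the paper).
observed = [0.035, 0.060, 0.041]
calculated = [0.039, 0.058, 0.045]
print(f"mean residual: {mean_abs_residual(observed, calculated):.4f}")
```

In this sketch the scale factor is computed from the lists passed in, so for a per-residue profile the scaling would normally be applied once over the full data set and the averaging then carried out over each residue or base step.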

data. Cross-peak intensities from two 500 millisecond mixing time NOESY spectra have been measured for GTAC. The NOE differences between the two observed data sets yield the same profile (with an average difference of 0.0065), showing that the differences in Figure 6 mainly arise from statistical measurement errors. However, the amount of NOE data is not uniform. There are about 11 intranucleotide NOEs per mixing time, but only about five internucleotide NOEs (which range from 3 to 9 NOEs per mixing time per base step). It may be expected that DNA conformational parameters such as the sugar ring conformation and the disposition of the aromatic base to the sugar ring will be well defined, but the relationship of adjacent nucleotides (and more distant relationships) relies more on the approximate force field used in the molecular mechanics calculations (Gronenborn & Clore, 1989). There could be other DNA structures consistent with the NOE data, although these would not be likely to be energetically reasonable, at least in terms of the force field used.

The refinement procedure results in little change from the average MD structure (less than 0.1 Å r.m.s. deviation), which is consistent with structures that are energetically near a global minimum. The distances were sufficiently numerous and precise to generate accurate structures during the molecular dynamics portion of structure determination. Long NOE-restrained molecular dynamics simulations, which would sample much more conformational space than NOE-restrained energy minimization, could not be undertaken because of the computer time required (approx. 25 times greater than distance restrained MD, although analytical derivative methods could reduce this requirement by a factor between 2 and 3; Baleja et al., 1990a; Yip & Case, 1989). The final structures are energetically reasonable, showing better energies than the average MD structure (Table 4).

The refined structures are shown in Figure 5(c) and (d). Helical parameters are obtained by fitting the best overall helix, excluding the terminal base-pairs. GTAC has an average helix rotation of 34.8°, or 10.4 residues per turn, and a mean residue-to-residue rise of 3.27 Å. CATG has an average helix rotation of 34.4°, or 10.5 residues per turn, with a mean rise of 3.34 Å. Both structures have atomic r.m.s. deviations from classical B-DNA of approximately 1.25 Å. Convergence from the widely different starting structures indicates that the conformational space has been well sampled and the final refined structures provide reasonable representations of the structure of alternating purine-pyrimidine sequences in solution (Nilges et al., 1987; Baleja et al., 1990b). Despite the fact that NOEs give the spatial relationships between protons only a few ångström units apart, and coupling constants indicate torsion angles for protons separated by three bonds, the structures shown in Figure 5 have global features similar to known oligonucleotide structures determined crystallographically. The short-range nature of the n.m.r. information suggests that local conformational parameters should be better determined, which are therefore discussed in some detail.
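The atomic r.m.s. deviations quoted above and in Table 3 compare pairs of coordinate sets. As a rough sketch of that arithmetic, assuming the two structures list the same atoms in the same order and have already been superimposed (the superposition step and the choice of atoms are not described in this section, and the function name and toy coordinates are hypothetical):

```python
import math


def rms_deviation(coords_a, coords_b):
    """Root-mean-square distance between matched atoms of two coordinate sets.

    Assumes both lists hold (x, y, z) tuples for the same atoms in the same
    order, and that the structures were superimposed beforehand.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates in angstroms (not taken from the deposited structures).
structure_a = [(0.0, 0.0, 0.0), (3.4, 0.0, 0.0), (6.8, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 1.0), (3.4, 0.0, 1.0), (6.8, 0.0, 1.0)]
print(f"r.m.s. deviation: {rms_deviation(structure_a, structure_b):.2f} A")
```

Without a prior least-squares superposition, a value computed this way would also include any rigid-body offset between the two coordinate frames.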

4. Discussion

(a) Structural details of alternating purine-pyrimidine DNA octamers

Structural parameters typical for describing nucleic acid structure (Dickerson et al., 1985a,b) are shown in Figure 7. The top four parameters, helix twist, roll, δ1-2, and propeller twist, represent conformational features shown to be most base-sequence-dependent (Dickerson, 1983), and a set of rules has been proposed to explain their behavior (Calladine, 1982) based on a steric clash model. In nucleic acids, a small propeller twist enables adjacent bases along one strand to overlap, and stack more efficiently. Along one strand, bases rotate in the same sense with no interference. However, purines extend beyond the center of the base-pair and, because of the anti-parallel nature of DNA, bases of the opposite strand have their propeller rotation in the opposite direction with respect to the first. This results in steric clashes occurring
