JVI Accepts, published online ahead of print on 15 January 2014 J. Virol. doi:10.1128/JVI.03483-13 Copyright © 2014, American Society for Microbiology. All Rights Reserved.
Common mechanism for RNA encapsidation by negative strand RNA viruses
Todd J Green, Robert Cox, Jun Tsao, Michael Rowse, Shihong Qiu, Ming Luo#
Department of Microbiology
University of Alabama at Birmingham
Birmingham, Alabama, 35294, USA
(Running title: NSV RNA encapsidation)
# Send correspondence to:
1720 2nd Avenue South
Birmingham, Alabama, 35294, USA
Phone: 1-205-934-4259
Fax: 1-205-934-0480
Email: [email protected]
Abstract
The nucleocapsid of a negative strand RNA virus is assembled with a single nucleocapsid protein and the viral genomic RNA. The nucleocapsid protein polymerizes along the length of the single strand genomic RNA (vRNA) or its complementary RNA (cRNA). This process of encapsidation occurs concomitantly with genomic replication. Structural comparisons of several nucleocapsid-like particles show that the mechanism of RNA encapsidation in negative strand RNA viruses has many common features. Fundamentally, there is a unifying mechanism to keep the capsid protein protomer monomeric prior to encapsidation of viral RNA. In the nucleocapsid, there is a cavity between two globular domains of the nucleocapsid protein where the viral RNA is sequestered. The viral RNA must be transiently released from the nucleocapsid in order to reveal the template RNA sequence for transcription/replication. There are cross-molecular interactions among the protein subunits linearly along the nucleocapsid to stabilize its structure. Empty capsids can form in the absence of RNA. The common characteristics of RNA encapsidation not only delineate the evolutionary relationship of negative strand RNA viruses, but also provide insights into their mechanism of replication.
Importance
What separates negative strand RNA viruses (NSVs) from the rest of the virosphere is that the nucleocapsid of NSVs serves as the template for viral RNA synthesis. Their viral RNA-dependent RNA polymerase (vRdRp) can induce local conformational changes in the nucleocapsid to temporarily release the RNA genome so that vRdRp can use it as the template for RNA synthesis during both transcription and replication. After RNA synthesis at the local region is completed, vRdRp proceeds downstream and the RNA genome is restored in the nucleocapsid. We found that the nucleocapsid assembly of all NSVs shares three essential elements: a monomeric capsid protein protomer, parallel orientation of subunits in the linear nucleocapsid, and a 5H+3H motif that forms a proper cavity for sequestration of the RNA. This observation also suggests that all NSVs evolved from a common ancestor that had this unique nucleocapsid.
Introduction
All viruses contain a protein capsid that encapsidates the genomic polynucleotide. The capsid is assembled with multiple copies of one or a few types of protein subunits following a certain symmetry. The most commonly studied symmetry is icosahedral, which leads to spherical or prolate virus particles (1-3). By contrast, helical symmetry is used for assembly of filamentous capsids (4). The basic fold of the capsid protein subunit is found to be the same for a large number of virus families, even though they may not be related to each other on a genomic basis. The quintessential example is the β-barrel fold, which was first found in small plant RNA viruses and has now been discovered in at least 15 different viral families (5). The architecture of the nucleocapsid is closely tied to the mechanism of replication because the assembly of the capsid packages the viral genome.
For negative strand RNA viruses (NSVs), eight families have been recognized by the International Committee on Taxonomy of Viruses (ICTV). The unique feature that distinguishes NSVs from the rest of the virosphere is that the nucleocapsid, instead of the naked genome, is used as the template for viral RNA synthesis. There is little doubt that the assembly of the NSV nucleocapsid is related to the unique mechanism of its viral RNA synthesis. In every NSV, the nucleocapsid is packaged inside a lipid envelope. The appearance of the nucleocapsid inside the envelope differs from virus to virus. In rhabdoviruses, the nucleocapsid adopts a characteristic bullet shape (6). In paramyxoviruses, the nucleocapsid is filamentous or herringbone-like (7). In orthomyxoviruses, the nucleocapsid has a double helical structure (8, 9). When the nucleocapsids are released from the virion, they all have the appearance of a coil (10). The genomic RNA encapsidated in the nucleocapsid is protected from nucleases to various degrees depending on the structure. This protective structure renders the RNA not readily accessible to the viral polymerase. Thus the viral polymerase must gain access to the encapsidated RNA in order to carry out viral RNA synthesis. Since this is a common mechanism of all NSVs, it is likely that the nucleocapsid of NSVs bears characteristic elements shared by all NSV families. Recognition of these elements helps in defining essential viral functions and revealing the underlying mechanism for NSV replication.
Through systematic analysis of the known structures of the NSV nucleocapsid, we identified a common mechanism for genomic RNA encapsidation during replication. This unifying mechanism suggests a common origin of NSV families and presents a clear picture of the functions of the nucleocapsid protein.
Materials and Methods
The coordinates with PDB codes 3PU4 (VSVN), 2WJ8 (RSVN), 4H5P (RVFVN) and 4BHH (LACVN) were retrieved from the PDB (11-14). The superposition was carried out with Fr-TM-align (15). The results are summarized in Table 1.
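Fr-TM-align, like other members of the TM-align family, ranks a superposition by its TM-score. The following is a minimal sketch of that score for one fixed set of aligned residue pairs, assuming the standard length-dependent distance scale d0 = 1.24 (L - 15)^(1/3) - 1.8 Å. The function names and the short-chain cutoff are assumptions made for illustration; the fragment seeding and alignment search performed by Fr-TM-align are not reproduced, and the values in Table 1 were not computed with this code.

```python
def tm_d0(target_length):
    """Distance scale d0 (in Å) used to normalize the TM-score."""
    if target_length <= 21:
        # Assumption of this sketch: very short chains are rejected rather
        # than handled with a special-case d0.
        raise ValueError("target_length must be greater than 21")
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score(distances, target_length):
    """TM-score for one fixed alignment.

    distances: Cα-Cα distances (Å) of the aligned residue pairs after
        superposition.
    target_length: number of residues in the protein used for normalization.
    """
    d0 = tm_d0(target_length)
    # Each aligned pair contributes a value between 0 and 1; close pairs
    # contribute nearly 1, distant pairs contribute nearly 0.
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length
```

Because the sum is divided by the full length of the reference protein, unaligned residues lower the score, which is why only a rigid core such as a shared helical motif can yield a high score between distantly related proteins.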
The NA320-24-2P protein complex was produced in E. coli as reported previously (16). Small angle X-ray scattering (SAXS) data on VSV NA320-24-2P samples were collected on the SIBYLS beamline at the Advanced Light Source (Figure 1A). Scattering curves were processed with PRIMUS (17). Protein samples ranging from 0.75 to 4.3 mg/mL showed no sign of aggregation by Guinier plot analysis (Figure 1B). Rg, Dmax and Porod volumes were calculated with the ATSAS package (Table 2) (18). Bead models (ten in total) were generated from the scattering data with DAMMIN (Figure 2) (19) and an averaged bead model was calculated with DAMAVER (20). The average χ² value for the bead models was 0.986. EOM (RANCH and GAJOE) was used to build a multi-domain model of the NA320-24-2P complex against the SAXS data (18, 21). Structures of the VSV P N-terminus (PNT, amino acids 6-34) (22), the dimeric P oligomerization domain (POD) (23), the PCTD/N protein complex with bound PNT (24) and an additional unbound PCTD were used as rigid bodies. The domain structures were derived from the structures with PDB codes 2FQM, 3HHW, and 3PMK. No partial restraints were imposed on the individual rigid bodies. Fits of the bead and multi-domain models to the experimental data are shown in Figure 1E. Inspection of the fitted curves showed a dip in the calculated curve at low q values, suggesting that a larger organized species was in solution. Two copies of the EOM model were fit to the experimental curve with SASREF (25). This fit matched the curve closely, yielding a χ value of 1.581 (12.865 for the single-copy fit). The resulting models were superimposed onto the bead models from DAMMIN with SUPCOMB (19, 26). The fit is shown in Figure 2A, B.
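The Guinier analysis mentioned above rests on the low-angle approximation ln I(q) = ln I(0) - (Rg²/3) q², so a straight line in a plot of ln I against q² gives Rg from its slope, and curvature at the lowest angles is the usual sign of aggregation. The sketch below illustrates this with an ordinary least-squares fit. It is not the PRIMUS procedure, the function names are hypothetical, and the synthetic curve (Rg = 30 Å, I(0) = 100) is an invented example, not data from this study.

```python
import math


def guinier_fit(q, intensity):
    """Fit ln I = ln I0 - (Rg^2 / 3) q^2 and return (Rg, I0)."""
    x = [qi * qi for qi in q]
    y = [math.log(ii) for ii in intensity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        # A non-negative slope has no physical Rg in the Guinier picture.
        raise ValueError("Guinier slope must be negative")
    return math.sqrt(-3.0 * slope), math.exp(intercept)


def synthetic_guinier_curve(rg, i0, q):
    """Ideal Guinier-region intensities for a hypothetical particle."""
    return [i0 * math.exp(-((qi * rg) ** 2) / 3.0) for qi in q]


# Hypothetical example: q in 1/Å, restricted so that q * Rg stays below 1.3.
q_values = [0.005 + 0.001 * i for i in range(36)]
curve = synthetic_guinier_curve(30.0, 100.0, q_values)
rg_fit, i0_fit = guinier_fit(q_values, curve)
```

In practice the fit would be restricted to the lowest-angle points of a measured curve, and the residuals would be inspected at each protein concentration.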
Results and Discussion
Capsid protomer
The atomic structure of nucleocapsid-like particles (NLPs) has been reported for three virus families of NSVs: Rhabdoviridae, Paramyxoviridae (genus Pneumovirus), and Bunyaviridae (genera Phlebovirus and Orthobunyavirus) (11-14, 27-31). A comparison of representative structures from each genus was performed. Since the structures in the same genus are highly homologous, vesicular stomatitis virus (VSV) (11) was selected to represent rhabdoviruses; respiratory syncytial virus (RSV) (12), pneumoviruses; Rift Valley fever virus (RVFV) (13), phleboviruses; and La Crosse virus (LACV) (14), orthobunyaviruses. In each of these structures, the encapsidated RNA is sequestered in a protein cavity. The capsid protein (N, also known as nucleocapsid protein or nucleoprotein) is first synthesized as a monomeric protein (named N0, the capsid protein protomer) before being incorporated in the nucleocapsid. N0 remains monomeric through a number of different ways. However, the fundamental requirement to support viral replication is to prevent N0 from oligomerizing before encapsidation of viral RNA. The historic description that N0 is prevented from RNA binding has been shown to be incorrect (16, 32). N0 is not an RNA binding protein but rather a capsid protein that assembles a capsid to accommodate any RNA sequence. Any reported in vitro RNA binding measurements mainly reflect nonspecific electrostatic interactions between the negative charges of RNA phosphate groups and the positively charged residues in N0. These nonspecific interactions are no different in nature from those between RNA and any other positively charged protein, for instance, the matrix protein of influenza virus.
For rhabdoviruses, the N0 form is kept monomeric by forming a complex with the phosphoprotein (P). To support viral replication continuously, the N and P proteins must be expressed in a 1:1 molar ratio (33, 34). An N0-P complex, which contains one N subunit and a dimer of P, was isolated from insect cell expression of rabies virus (RABV) N and P proteins (35). The same N0-P complex was also isolated for a mutant of the VSV N protein when the mutant N protein was coexpressed with P in E. coli (16). The mutations in the VSV N protein changed a stretch of five residues to Ala (NA320-24) and prevented formation of NLPs. Analytical ultracentrifugation experiments showed that this complex, like the insect-derived complex, has a 1:2 (N to P) stoichiometry. This complex was studied by small angle X-ray scattering (SAXS) techniques (Figure 1 and Table 2). The shape of the complex was determined by the ab initio method in DAMMIN and is shown in Figure 2 (19). Independently, previously determined crystal structures of N and domains of P were modeled against the scattering curves with EOM (18, 21). In this method, the linkers between P protein domains were also modeled by an ab initio approach, yielding a complete model of the N0-P2 protein complex. The complete model is presented in Figure 3. The model suggests that the dimeric P is associated with N0 through multiple sites of interaction. The observation that different parts from each monomer of the P dimer contribute to N binding helps to explain studies in which the functions of an N-terminally truncated P protein mutant can be restored when a C-terminally truncated P protein mutant is provided in trans (36).
The P protein has a modular structure with a flexible N-terminal region, a structured oligomerization domain in the middle and a compact C-terminal domain (PCTD). Both the N- and C-terminal regions of P bind to the N protein. In one study, a complex was generated by binding the N-terminal 60 residues of VSV P to the ring structure of an N mutant in which the N-terminal 21 residues were deleted (22). The N-terminus of P interacts with the back of the C-terminal half (C-lobe) of N. An α-helix corresponding to residues 17 to 31 of P was shown to occupy the RNA cavity in the structure. Subsequent studies by Chen et al. (37) using a P3A mutant (triple mutations of Ser or Thr to Ala) showed that phosphorylation of P at positions Ser60, Thr62 and Ser64 is required to prevent N0 from encapsidating cellular RNA, which leads to a dead-end product of N and diminishes viral replication. In the SAXS model, residues Ser60, Thr62 and Ser64 were notably positioned to form charge interactions with Lys398, Arg399, Lys414 and Lys417 of N (named here the “PO binding site”). The crystal structure of the VSV NLP in complex with PCTD showed that P binds to the C-lobe of N, primarily the C-terminal extended loop and α-helix 13 (24). Combining the results of SAXS and crystallography suggests that P forms a stable complex with the monomeric N subunit involving interactions with the C-terminal loop, a positively charged patch at the PO binding site, the RNA cavity, and the back of the C-lobe of N. The oligomerization of N is prevented by binding of P in that the binding site for the N-terminus of P overlaps with that for the N-terminus of N in the nucleocapsid. The region of P bound at the PO binding site can block the side-by-side contact between N subunits found in the nucleocapsid. Mutation studies showed that loss of any of these interaction sites could diminish N oligomerization, and thus RNA encapsidation (16). Upon assembly of the nucleocapsid, the N0-bound P is removed. N begins to oligomerize and encapsidate viral RNA. The P protein is then recycled to bind another N0 subunit.
For RSV, there is no structural report of an N-P complex. However, it was shown that the N-terminal 119 residues precede the oligomerization domain of P in primary sequence (38). The N-terminus of RSVP is required for P binding to N0 (39). The N-terminal region of RSVP was predicted to be intrinsically disordered, but residues 15-25 are predicted to form an α-helix. A homologous structure was recently reported for a complex between the N-terminus of P and N of Nipah virus (40). The N-terminus of P binds the C-terminal domain of N, where it may exclude binding of the C-terminus of a neighboring N subunit to this surface to prevent N oligomerization. RSVP could bind to a similar site in RSVN, or to another site of N that is required for binding the N-terminus of a neighboring N subunit. The exact binding site for the N-terminus of RSVP on RSVN remains to be determined.
For RVFV, the structure of the N protein has been solved in two forms, as an apo-structure without RNA and as an NLP. The first form of RVFVN has its N-terminal helix associated with its own RNA encapsidation cavity (41). Residues 20 to 26 form a helix (α2). Within this helix, the hydrophobic sidechain of Trp24 fits in the pocket where the bases of the RNA are sequestered in the nucleocapsid, as observed in the RVFV NLP. In the oligomeric structure of the RVFV NLP, the N-terminus of N is moved out of the RNA cavity to interact with the neighboring N subunit (13). These observations suggest that the RVFVN protein is kept RNA-free by sequestering its own N-terminus. During assembly, the N-terminus of RVFVN rearranges to support capsid formation and RNA encapsidation. In another study, the structures of RVFV empty capsids were also solved (42). The empty capsids maintain the architecture of capsids with encapsidated RNA. The reason for formation of empty capsids in this case is not understood. It is possible that the N protein is not stable when over-expressed in the bacterial milieu, or that an unknown protein, either viral or cellular, is necessary to help keep N from self-assembling. Lastly, this could be an artifact resulting from producing the protein with an N-terminal thioredoxin fusion.
For LACVN, several structures have been produced. The N-terminal 17 residues of N are highly flexible and could be in a fold-back conformation in the N monomer, based on structures of the N protein complex after RNA was removed (14). The fold-back conformation is similar to the sequestering of the N-terminus in N of phleboviruses, but to a lesser extent. In the LACV NLP, the N-terminus of N assumes a conformation that extends to its neighboring N subunit. The extended N-terminus interacts with the bases and backbone of the encapsidated RNA, as well as with amino acids of its neighboring N subunit. The C-terminus (residues 218-235) is extended from the core of N and shows some conformational flexibility, though not as much as the N-terminus. The elements required for keeping the orthobunyavirus N in a monomeric form could be similar to those for N of phleboviruses.
The essential requirement for N to be competent in encapsidating viral RNA is to remain monomeric. Different strategies may be used to achieve this, including occupying the sites required for oligomerization by P binding, as in the case of rhabdoviruses and pneumoviruses, or sequestering the N-terminus by N itself, as in the case of bunyaviruses. In rhabdoviruses, the proximity of P binding to the RNA cavity is purely coincidental and serves only the formation of a stable N0-P complex. This complex is essential to prevent premature N oligomerization rather than to prevent RNA binding. This concept is supported by the results of Chen et al., who showed that a phosphorylation-deficient triple mutant of P (P3A), which is mutated outside of the cavity binding region, could not prevent encapsidation of cellular RNA by VSVN (37).
Architecture of the nucleocapsid
The N protein oligomerizes to encapsidate the single strand viral RNA. The N subunits are associated with each other in a parallel orientation. A linear nucleocapsid is assembled with the single strand viral RNA sequestered in the center. In rhabdoviruses, the nucleocapsid is a random coil when it is isolated from the virion (43). The nucleocapsid is packaged into a superhelical structure in the bullet shaped virion (6). The super symmetry is imposed on the nucleocapsid by the matrix protein, which has a 1:1 interaction with the N protein in the virion. The matrix protein subunits have direct contacts among themselves to form the 2D helical mesh under the viral membrane envelope. The same helical symmetry is adopted by the nucleocapsid when packaged in the virion. The fact that the single strand viral RNA is completely encapsidated prior to being packaged into the virion indicates that the true symmetry of the nucleocapsid is linear. In most NSVs, the nucleocapsid appears to be a random coil (10). The nucleocapsid in members of the Paramyxovirinae contains a number of helical segments, but the helical symmetry is not strict in terms of pitch and rotation of the N subunits (44). The Sendai virus nucleocapsid exists in at least four different helical states (45). Since the repeating unit in the nucleocapsid is linear along the encapsidated RNA and the helical symmetry is not required for RNA encapsidation, the exact symmetry of the nucleocapsid of members of the Paramyxovirinae must, strictly speaking, be considered linear. In influenza virus (IFV), the linear nucleocapsid is twisted into a rough double helix, with interactions between the two associated strings (8, 9). The helical superstructure is a way to condense the nucleocapsid for packaging in the virion, but is not essential for viral RNA encapsidation.
The forces that stabilize the nucleocapsid involve extensive cross-molecular interactions among the N subunits. There are side-by-side interactions between N subunits aligned in parallel. These interactions are critical to capsid formation. The N protein can no longer assemble the nucleocapsid if these interactions are disrupted by mutation (16). The extent of the side-by-side interactions differs from virus to virus, ranging from below 200 Å² for LACVN up to ~2200 Å² for VSV. There is also a disparity in the amount of buried surface between adjacent C-terminal domains versus N-terminal domains of each viral N (11). The contact areas between neighboring domains in different NLPs are listed in Table 3. The contact areas must have plasticity because these calculated contact areas are derived from an artificial ring structure. In the authentic linear nucleocapsid, these contact areas are likely to be different. The difference between the two halves of the N protein may have some functional implications because the N protein needs to undergo a conformational change to reveal the sequestered template RNA during viral RNA synthesis. It is likely that the domain that has the lesser contact area with the neighboring domains is the one opened to reveal the RNA. As shown in Table 3, the N-terminal domain of VSVN has a lesser contact area and is likely to be open during viral RNA synthesis, while the C-terminal domain remains associated to maintain the integrity of the nucleocapsid. Rearrangement of the N-terminal domain to open the cavity is further supported by the greater association between the C-terminal domains due to additional interactions, as noted below.
In addition to the side-by-side contacts, the most obvious interactions between the N subunits are from the structural elements extended from the core of the N subunit. The core of N and the nucleotides associated with the core act as one structural unit. The number of nucleotides accommodated in the core is listed in Table 3. The registration of each N subunit may be defined by choosing one subunit as the origin (position 0). If the access to the RNA cavity faces away from the reader, -1 refers to the subunit on the left of the origin subunit (5’ end of the encapsidated RNA), and +1 to the subunit on the right (3’ end of the encapsidated RNA). In rhabdoviruses, the N-terminal arm (21 residues) extends away from the core to interact with the -1 N-terminal domain (N-lobe), whereas an extended loop in the C-lobe interacts with the +1 C-lobe. The N-terminal arm also interacts with the extended loop in the -2 C-lobe. In RSVN, the N-terminal arm (28 residues) interacts with both N- and C-terminal domains of the -1 subunit, whereas the extended C-terminal arm, rather than a loop in this case, interacts with the C-terminal domain of the +1 N subunit. In RVFVN, only the N-terminal arm (32 residues) interacts with the -1 N-terminal domain. In LACVN, the N-terminal arm (17 residues) interacts with the -1 N-terminal domain involving only the first 8 residues. The remaining residues interact with the encapsidated RNA. The C-terminal arm of LACVN (18 residues) contains an α-helix that interacts with the C-terminal domain of subunit +1. Residues that are involved in capsid formation are illustrated in Figure 4.
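The registration scheme described above can be written down as a small lookup table. The sketch below encodes only the contacts stated in the preceding paragraph and, for a subunit in a linear chain, lists which neighbors its extended elements reach. The names NEIGHBOR_CONTACTS and contacts_for are hypothetical, and numbering subunits from the 5’ end so that the index increases toward the 3’ end is an assumption of the sketch chosen to match the -1/+1 convention.

```python
# Each entry: (element of the origin subunit, offset of the partner subunit,
# region contacted on the partner). Negative offsets point toward the 5' end
# of the encapsidated RNA, positive offsets toward the 3' end.
NEIGHBOR_CONTACTS = {
    "rhabdovirus": [
        ("N-terminal arm", -1, "N-lobe"),
        ("N-terminal arm", -2, "C-lobe extended loop"),
        ("C-lobe extended loop", +1, "C-lobe"),
    ],
    "RSV": [
        ("N-terminal arm", -1, "N- and C-terminal domains"),
        ("C-terminal arm", +1, "C-terminal domain"),
    ],
    "RVFV": [
        ("N-terminal arm", -1, "N-terminal domain"),
    ],
    "LACV": [
        ("N-terminal arm (first 8 residues)", -1, "N-terminal domain"),
        ("C-terminal arm", +1, "C-terminal domain"),
    ],
}


def contacts_for(virus, index, n_subunits):
    """List (element, partner index, partner region) for one subunit.

    index: 0-based position of the subunit in a linear chain of
        n_subunits. Contacts that would fall off either end of the
        chain are omitted.
    """
    result = []
    for element, offset, region in NEIGHBOR_CONTACTS[virus]:
        partner = index + offset
        if 0 <= partner < n_subunits:
            result.append((element, partner, region))
    return result
```

For example, contacts_for("rhabdovirus", 0, 5) returns only the C-lobe loop contact with subunit 1, because the subunit at the 5’ end of the chain has no -1 or -2 neighbor for its N-terminal arm to reach.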
In IFVN, there is also an extended loop in the C-terminal region that interacts with the C-terminal domain in the +1 N subunit. It is not clear if the N-terminal region is involved in the interactions with the neighboring subunits in the nucleocapsid. An atomic model of the nucleocapsid or an NLP is needed to fully address this. It is also not clear how IFVN is maintained in a monomeric form prior to RNA encapsidation. There is no report of any viral protein that is involved in this function. In the reported structure of IFVN, the N-terminus of N could be in a sequestered conformation. If this is the case, IFVN may employ a method similar to that used by RVFVN and LACVN to remain monomeric. The extended loop in the C-terminal domain folds back on the surface of its own monomer, as shown by the structure of an obligate monomeric mutant of IFVN (46). This folded conformation of the C-terminal extended loop should also contribute to maintaining a monomeric mode of IFVN prior to RNA encapsidation.
The surface contact and domain-swap interactions between neighboring N subunits are essential for the assembly of NSV nucleocapsids. Similar interactions are also common in the nucleocapsid of other viruses, such as picornaviruses and adenovirus (47, 48). The only unique feature in the NSV nucleocapsid is that these interactions are linear along the encapsidated single strand viral RNA.
Sequestered RNA in the cavity
The nucleocapsid of NSVs needs to be capable of encapsidating all possible RNA sequences. The mechanism for specific encapsidation of viral RNA may be that viral RNA encapsidation is concomitant with viral replication, likely at the site of the viral RNA replication complex (24). At this point, the monomeric N protein assembles the nucleocapsid simultaneously with encapsidation of the genome. No consensus interactions with the bases of the encapsidated RNA were identified among all the reported NLP structures, including the cases in which homogeneous sequences were encapsidated in the NLP (49). Interactions of the backbone phosphate groups with positively charged sidechains were found for a fraction of the sequestered phosphate groups, but there is no conserved pattern among NLPs of different viruses. This holds true even for the closely related NLPs of VSV and RABV (32) and suggests that the positively charged residues in the RNA cavity are not the key factor responsible for RNA encapsidation.
What is unique about the viral RNA in the NSV nucleocapsid is that the sequestered bases are stacked to form a motif similar to one-half of the A-form double helix of RNA (Figure 5A). In rhabdoviruses, the bases of nucleotides 1-4 are stacked and face the solvent side of the RNA cavity, whereas bases of nucleotides 5, 7 and 8 are stacked and face the interior of the N subunit. Similarly in RSV, bases of nucleotides 2-4 are stacked and face the solvent side of the RNA cavity, whereas bases of nucleotides 5-7 are stacked and face the interior of the N subunit. There is an additional “linker” nucleotide between the N subunits in rhabdoviruses and RSV. This nucleotide is still sequestered from solvent when N subunits oligomerize in the nucleocapsid, but it may allow some flexibility between the N subunits. The base of this linker nucleotide can be stacked with the other bases, or an aromatic sidechain will take its place to stack with the other bases when the linker nucleotide is in a transitional conformation (11, 12). In the RVFV NLP, four bases are sequestered by the core domains, each facing the interior of the N subunit. Among these four bases, the two in the center are stacked. The bases of the three “linker” nucleotides are also in a stacked conformation and protected by the capsid structure. In the LACV NLP, seven bases near the 5’ end (nucleotides -11 and 1-6) of the RNA strand are stacked and face the entrance to the RNA cavity. The aromatic sidechain of tyrosine-177 is intercalated between the bases of nucleotides 7 and 8, and the base of nucleotide 8 is further stacked with the base of nucleotide 9. The bases in this triple stacking face the RNA cavity. Lastly, the linker nucleotide 10 is sequestered in the RNA cavity, but its base is not stacked. The unique stacking arrangements observed in NSVs allow for maximized packaging of the RNA in the capsid.
The base stacking is only stabilized when the viral RNA is in the cavity of the nucleocapsid. The N subunits can form a stable empty capsid without RNA (16). The encapsidated RNA is a resident in the nucleocapsid, perhaps also contributing to the overall stability of the nucleocapsid. The shape and the electrostatic environment of the cavity define how base stacking occurs. The stability of the base-stacking motifs in the RNA cavity is dependent on the RNA sequence, with poly(rA) being the most stable and poly(rU) being the least stable (49). Specific sequences in the sequestered RNA genome regulate many viral functions, such as transcriptional initiation and termination. In terms of transcriptional termination, there is a highly conserved U7 tract at the end of each coding region in rhabdoviruses. The structure in this region is the least stable in the nucleocapsid. It is conceivable that such instability could promote dissociation of the viral transcription complex. Mutations in the N protein have been shown to alter the level of mRNA synthesis. In this case, genomic mutations subsequently occurred, resulting in an extension of the U tract to U8 in rescued VSV (50). U-rich sequences are also found in the intergenic regions of some, but not all, other NSV genomes.
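Locating U tracts of this kind in a genomic RNA sequence is a simple pattern search. The sketch below is only an illustration of that step: the function name find_u_tracts is hypothetical, the default minimum length of 7 follows the U7 tract mentioned above, and the example sequence in the usage note is invented rather than taken from any viral genome.

```python
import re


def find_u_tracts(rna, min_length=7):
    """Return (start, length) for every run of U at least min_length long.

    rna: RNA sequence written 5' to 3' with the letters A, C, G and U.
    start: 0-based position of the first U in the run.
    """
    # The quantifier is greedy, so each match covers a maximal run of U.
    pattern = re.compile("U{%d,}" % min_length)
    return [(m.start(), m.end() - m.start())
            for m in pattern.finditer(rna.upper())]
```

Calling find_u_tracts("ACGUUUUUUUAGC") on this invented sequence reports one tract starting at position 3 with length 7; an extended U8 tract would be reported with length 8 at the same start.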
The RNA cavity is formed by secondary structural elements from the N- and C-domains in the N protein. A highly conserved (5H+3H) motif consisting of 5 helices from the N-domain and 3 helices from the C-domain has been identified to constitute the RNA cavity (51). When the structures of VSVN and RSVN are superimposed using the Fr-TM-align method (15), the (5H+3H) motif from each N protein (α4-α8 in the N-domain, and α9-α11 in the C-domain of the VSVN protein (11); αN3, αN5-αN8, and αC1, η2, and αC3 of the RSVN protein (12)) could be superimposed as a rigid body (Figure 5B, 5C and Table 1). η indicates a 3₁₀ helix. In addition, two helices following the (5H+3H) motif in the C-domain could also be superimposed as part of the core in these two N proteins (Figure 5B, 5C). In the case of RVFVN, only the first four helices (α3, α5-α7 in RVFVN (13)) in the (5H+3H) motif may be aligned with those in VSVN or RSVN (Figure 5B, 5C and Table 1). The 5th helix of the motif is present but moved away from the RNA cavity, inducing reorientation of the following three helices in the (5H+3H) motif, as well as helices following the motif. As a result, this RNA cavity is almost completely collapsed compared to that in VSVN or RSVN. This small RNA cavity could only accommodate the bases of four nucleotides, while excluding the ribose-phosphate backbone. LACVN (14) has the same topology as RVFVN (Figure 5B, 5C and Table 1). The first four helices of LACVN in the (5H+3H) motif (α2, η1, α3, and α4) correspond to those in RVFVN. However, there are two noticeable differences between the two structures. One is that the linker between the first and second helices is changed from a helix (α4) to a β-hairpin (β1-4). The other is that the third helix in RVFVN (α4) is much shorter than that in LACVN (α3). The three helices in the C-terminal domain of the (5H+3H) motif are the same in RVFVN and LACVN in terms of topology, as are three additional helices following the motif. In addition, LACVN has an extra helix at the C-terminus (α11). This helix is involved in interactions with the neighboring N subunit on the right side (+1), as described above.
Summary
The nucleocapsid of NSVs is assembled by the universal assembly principle of all viral
356
nucleocapsids. Oligomerization of the protein subunits forms the capsid that encapsidates the
357
viral polynucleotide. Extensive interactions cross protein subunits, including domain-swaps.
358
These interactions are involved in the stabilization of the nucleocapsid. What is unique about
359
NSVs is that their nucleocapsid has a linear symmetry. The viral RNA is a string in the center of
360
the nucleocapsid ribbon. The nucleocapsid assembly of all NSVs shares three essential elements:
361
a monomeric capsid protein protomer, parallel orientation of subunits in the nucleocapsid, and 17
362
the 5H+3H motif that forms a proper cavity for sequestration of the RNA. During viral RNA
363
synthesis, the nucleocapsid does not disassemble to completely release the viral RNA template.
364
Instead, conformational changes must occur in protein domains to temporarily release the RNA
365
template on a local rather than global basis. The viral polymerase complex is a likely candidate
366
to induce this conformational change in order to gain access to the sequestered bases, some of
367
which face the interior of the N subunits. Once revealed, the RNA template is available for
368
transcription and genomic replication. After the viral polymerase complex passes, the RNA is
369
tidily repositioned in the encapsidation cavity and the integrity of the nucleocapsid is restored.
This new mechanism of RNA encapsidation allows the N protein to play a role in viral replication and transcription. Mutations in the N protein may increase or decrease the level of viral replication (52, 53). More interestingly, some of these mutations in the N protein did not change the level of viral replication, but changed the level of transcription. This suggests that the structure of the nucleocapsid itself has a role in viral transcription that is independent of replication. For viral replication, the viral polymerase complex needs to load at the 3’ end of the genome and proceed through the nucleocapsid. For viral transcription, on the other hand, the viral polymerase complex needs to recognize the promoter and the transcription termination sequences, in addition to loading and processivity. There may be a unique structural feature associated with the regions where the promoter or the transcription termination sequence is located in the nucleocapsid to allow such recognition.
Acknowledgement
We thank Dr. Mark Walter for assistance in collecting SAXS data. SAXS experiments were conducted at the Advanced Light Source (ALS), a national user facility operated by Lawrence Berkeley National Laboratory on behalf of the Department of Energy, Office of Basic Energy Sciences, through the Integrated Diffraction Analysis Technologies (IDAT) program, supported by DOE Office of Biological and Environmental Research. Additional support comes from the National Institutes of Health project MINOS (R01GM105404). This work is supported in part by NIH grants 1R01AI10630 to ML and 1R56AI01087 to TG.
References
1. Prasad BV, Schmid MF. 2012. Principles of virus structural organization. Advances in experimental medicine and biology 726:17-47.
2. Poranen MM, Bamford DH. 2012. Assembly of large icosahedral double-stranded RNA viruses. Advances in experimental medicine and biology 726:379-402.
3. Ganser-Pornillos BK, Yeager M, Pornillos O. 2012. Assembly and architecture of HIV. Advances in experimental medicine and biology 726:441-465.
4. Stubbs G, Kendall A. 2012. Helical viruses. Advances in experimental medicine and biology 726:631-658.
5. Abrescia NG, Bamford DH, Grimes JM, Stuart DI. 2012. Structure unifies the viral universe. Annual review of biochemistry 81:795-822.
6. Ge P, Tsao J, Schein S, Green TJ, Luo M, Zhou ZH. 2010. Cryo-EM model of the bullet-shaped vesicular stomatitis virus. Science 327:689-693.
7. Loney C, Mottet-Osman G, Roux L, Bhella D. 2009. Paramyxovirus ultrastructure and genome packaging: cryo-electron tomography of sendai virus. Journal of virology 83:8191-8197.
8. Arranz R, Coloma R, Chichon FJ, Conesa JJ, Carrascosa JL, Valpuesta JM, Ortin J, Martin-Benito J. 2012. The structure of native influenza virion ribonucleoproteins. Science 338:1634-1637.
9. Moeller A, Kirchdoerfer RN, Potter CS, Carragher B, Wilson IA. 2012. Organization of the influenza virus replication machinery. Science 338:1631-1634.
10. Ruigrok RW, Crepin T, Kolakofsky D. 2011. Nucleoproteins and nucleocapsids of negative-strand RNA viruses. Current opinion in microbiology 14:504-510.
11. Green TJ, Zhang X, Wertz GW, Luo M. 2006. Structure of the vesicular stomatitis virus nucleoprotein-RNA complex. Science 313:357-360.
12. Tawar RG, Duquerroy S, Vonrhein C, Varela PF, Damier-Piolle L, Castagne N, MacLellan K, Bedouelle H, Bricogne G, Bhella D, Eleouet JF, Rey FA. 2009. Crystal structure of a nucleocapsid-like nucleoprotein-RNA complex of respiratory syncytial virus. Science 326:1279-1283.
13. Raymond DD, Piper ME, Gerrard SR, Skiniotis G, Smith JL. 2012. Phleboviruses encapsidate their genomes by sequestering RNA bases. Proc Natl Acad Sci U S A 109:19208-19213.
14. Reguera J, Malet H, Weber F, Cusack S. 2013. Structural basis for encapsidation of genomic RNA by La Crosse Orthobunyavirus nucleoprotein. Proc Natl Acad Sci U S A 110:7246-7251.
15. Pandit SB, Skolnick J. 2008. Fr-TM-align: a new protein structural alignment method based on fragment alignments and the TM-score. BMC bioinformatics 9:531.
16. Zhang X, Green TJ, Tsao J, Qiu S, Luo M. 2008. Role of intermolecular interactions of vesicular stomatitis virus nucleoprotein in RNA encapsidation. J Virol 82:674-682.
17. Konarev PV, Volkov VV, Sokolova AV, Koch MHJ, Svergun DI. 2003. PRIMUS - a Windows-PC based system for small-angle scattering data analysis. J. Appl. Cryst. 36:1277-1282.
18. Petoukhov MV, Franke D, Shkumatov AV, Tria G, Kikhney AG, Gajda M, Gorba C, Mertens HDT, Konarev PV, Svergun DI. 2012. New developments in the ATSAS program package for small-angle scattering data analysis. J. Appl. Cryst. 45:342-350.
19. Svergun DI. 1999. Restoring low resolution structure of biological macromolecules from solution scattering using simulated annealing. Biophys J 76:2879-2886.
20. Volkov VV, Svergun DI. 2003. Uniqueness of ab initio shape determination in small-angle scattering. J. Appl. Cryst. 36:860-864.
21. Bernado P, Mylonas E, Petoukhov MV, Blackledge M, Svergun DI. 2007. Structural characterization of flexible proteins using small-angle X-ray scattering. Journal of the American Chemical Society 129:5656-5664.
22. Leyrat C, Yabukarski F, Tarbouriech N, Ribeiro EA, Jr., Jensen MR, Blackledge M, Ruigrok RW, Jamin M. 2011. Structure of the vesicular stomatitis virus N0-P complex. PLoS Pathog 7:e1002248.
23. Ding H, Green TJ, Lu S, Luo M. 2006. Crystal structure of the oligomerization domain of the phosphoprotein of vesicular stomatitis virus. J Virol 80:2808-2814.
24. Green TJ, Luo M. 2009. Structure of the vesicular stomatitis virus nucleocapsid in complex with the nucleocapsid-binding domain of the small polymerase cofactor, P. Proc Natl Acad Sci U S A 106:11713-11718.
25. Petoukhov MV, Svergun DI. 2005. Global rigid body modeling of macromolecular complexes against small-angle scattering data. Biophys J 89:1237-1250.
26. Kozin M, Svergun DI. 2001. Automated matching of high- and low-resolution structural models. J. Appl. Cryst. 34:33-41.
27. Albertini AA, Wernimont AK, Muziol T, Ravelli RB, Clapier CR, Schoehn G, Weissenhorn W, Ruigrok RW. 2006. Crystal structure of the rabies virus nucleoprotein-RNA complex. Science 313:360-363.
28. Li B, Wang Q, Pan X, Fernandez de Castro I, Sun Y, Guo Y, Tao X, Risco C, Sui SF, Lou Z. 2013. Bunyamwera virus possesses a distinct nucleocapsid protein to facilitate genome encapsidation. Proc Natl Acad Sci U S A 110:9048-9053.
29. Jiao L, Ouyang S, Liang M, Niu F, Shaw N, Wu W, Ding W, Jin C, Peng Y, Zhu Y, Zhang F, Wang T, Li C, Zuo X, Luan CH, Li D, Liu ZJ. 2013. Structure of severe fever with thrombocytopenia syndrome virus nucleocapsid protein in complex with suramin reveals therapeutic potential. J Virol 87:6829-6839.
30. Ariza A, Tanner SJ, Walter CT, Dent KC, Shepherd DA, Wu W, Matthews SV, Hiscox JA, Green TJ, Luo M, Elliott RM, Fooks AR, Ashcroft AE, Stonehouse NJ, Ranson NA, Barr JN, Edwards TA. 2013. Nucleocapsid protein structures from orthobunyaviruses reveal insight into ribonucleoprotein architecture and RNA polymerization. Nucleic Acids Res 41:5912-5926.
31. Dong H, Li P, Bottcher B, Elliott RM, Dong C. 2013. Crystal structure of Schmallenberg orthobunyavirus nucleoprotein-RNA complex reveals a novel RNA sequestration mechanism. RNA 19:1129-1136.
32. Luo M, Green TJ, Zhang X, Tsao J, Qiu S. 2007. Conserved characteristics of the rhabdovirus nucleoprotein. Virus Res 129:246-251.
33. Peluso RW, Moyer SA. 1988. Viral proteins required for the in vitro replication of vesicular stomatitis virus defective interfering particle genome RNA. Virology 162:369-376.
34. Howard M, Wertz G. 1989. Vesicular stomatitis virus RNA replication: a role for the NS protein. J Gen Virol 70:2683-2694.
35. Mavrakis M, Iseni F, Mazza C, Schoehn G, Ebel C, Gentzel M, Franz T, Ruigrok RW. 2003. Isolation and characterisation of the rabies virus N°-P complex produced in insect cells. Virology 305:406-414.
36. Chen M, Ogino T, Banerjee AK. 2007. Interaction of vesicular stomatitis virus P and N proteins: identification of two overlapping domains at the N terminus of P that are involved in N0-P complex formation and encapsidation of viral genome RNA. J Virol 81:13478-13485.
37. Chen L, Zhang S, Banerjee AK, Chen M. 2013. N-terminal phosphorylation of phosphoprotein of vesicular stomatitis virus is required for preventing nucleoprotein from binding to cellular RNAs and for functional template formation. J Virol 87:3177-3186.
38. Llorente MT, Garcia-Barreno B, Calero M, Camafeita E, Lopez JA, Longhi S, Ferron F, Varela PF, Melero JA. 2006. Structural analysis of the human respiratory syncytial virus phosphoprotein: characterization of an alpha-helical domain involved in oligomerization. J Gen Virol 87:159-169.
39. Mallipeddi SK, Lupiani B, Samal SK. 1996. Mapping the domains on the phosphoprotein of bovine respiratory syncytial virus required for N-P interaction using a two-hybrid system. J Gen Virol 77 (Pt 5):1019-1023.
40. Yabukarski F, Leyrat C, Tarbouriech N, Jensen MR, Blackledge M, Ruigrok R, Jamin M. 2013. Crystal structure of the N0-P complex of Nipah virus and of VSV provide new insights into the encapsidation mechanism. Proceedings of XV International Conference on Negative Strand RNA Viruses:84.
41. Raymond DD, Piper ME, Gerrard SR, Smith JL. 2010. Structure of the Rift Valley fever virus nucleocapsid protein reveals another architecture for RNA encapsidation. Proc Natl Acad Sci U S A 107:11769-11774.
42. Ferron F, Li Z, Danek EI, Luo D, Wong Y, Coutard B, Lantez V, Charrel R, Canard B, Walz T, Lescar J. 2011. The hexamer structure of Rift Valley fever virus nucleoprotein suggests a mechanism for its assembly into ribonucleoprotein complexes. PLoS Pathog 7:e1002030.
43. Blocquel D, Bourhis J-M, Éléouët J-F, Gerlier D, Habchi J, Jamin M, Longhi S, Yabukarski F. 2012. Transcription et réplication des Mononegavirales : une machine moléculaire originale. Virologie 16:3.
44. Bhella D, Ralph A, Yeo RP. 2004. Conformational flexibility in recombinant measles virus nucleocapsids visualised by cryo-negative stain electron microscopy and real-space helical reconstruction. J Mol Biol 340:319-331.
45. Egelman EH, Wu SS, Amrein M, Portner A, Murti G. 1989. The Sendai virus nucleocapsid exists in at least four different helical states. J Virol 63:2233-2243.
46. Chenavas S, Estrozi LF, Slama-Schwok A, Delmas B, Di Primo C, Baudin F, Li X, Crepin T, Ruigrok RW. 2013. Monomeric nucleoprotein of influenza A virus. PLoS Pathog 9:e1003275.
47. Rossmann MG, Arnold E, Erickson JW, Frankenberger EA, Griffith JP, Hecht HJ, Johnson JE, Kamer G, Luo M, Mosser AG, et al. 1985. Structure of a human common cold virus and functional relationship to other picornaviruses. Nature 317:145-153.
48. Liu H, Jin L, Koh SB, Atanasov I, Schein S, Wu L, Zhou ZH. 2010. Atomic structure of human adenovirus by cryo-EM reveals interactions among protein networks. Science 329:1038-1043.
49. Green TJ, Rowse M, Tsao J, Kang J, Ge P, Zhou ZH, Luo M. 2011. Access to RNA encapsidated in the nucleocapsid of vesicular stomatitis virus. J Virol 85:2714-2722.
50. Harouaka D, Wertz GW. 2012. Second-site mutations selected in transcriptional regulatory sequences compensate for engineered mutations in the vesicular stomatitis virus nucleocapsid protein. J Virol 86:11266-11275.
51. Luo M, Green TJ, Zhang X, Tsao J, Qiu S. 2007. Structural comparisons of the nucleoprotein from three negative strand RNA virus families. Virol J 4:72-78.
52. Nayak D, Panda D, Das SC, Luo M, Pattnaik AK. 2009. Single-amino-acid alterations in a highly conserved central region of vesicular stomatitis virus N protein differentially affect the viral nucleocapsid template functions. J Virol 83:5525-5534.
53. Harouaka D, Wertz GW. 2009. Mutations in the C-terminal loop of the nucleocapsid protein affect vesicular stomatitis virus RNA replication and transcription differentially. J Virol 83:11429-11439.
54. Svergun DI. 1992. Determination of the regularization parameter in indirect-transform methods using perceptual criteria. J. Appl. Cryst. 25:495-503.
55. Volkov VV, Svergun DI. 2003. Uniqueness of ab initio shape determination in small-angle scattering. Journal of Applied Crystallography 36:860-864.
56. Kozin MB, Svergun DI. 2001. Automated matching of high- and low-resolution structural models. Journal of Applied Crystallography 34:33-41.
57. PyMol. The PyMOL Molecular Graphics System, Version 1.3, Schrödinger, LLC.
Figure legends
Figure 1. Small-angle X-ray scattering (SAXS) analysis of the N0–P complex from VSV. A) The experimental SAXS profile for three concentrations of the protein complex, 0.75 (black), 2.4 (blue) and 4.3 (red) mg/mL. Scattering curves were scaled with PRIMUS (17). The concentrations and associated coloring schemes are used in (A)–(D). B) A Guinier plot of the experimental SAXS profile with the fit shown. Plots are shown on a relative scale. C) A Kratky plot. D) The pair-distribution function as calculated in GNOM (54).
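The Guinier plot in panel B rests on the standard low-angle approximation ln I(q) = ln I(0) - (Rg^2/3) q^2, so a straight-line fit of ln I against q^2 yields the radius of gyration. The following is a minimal sketch of that fit in plain Python; it is an illustration only and is not the AUTORG procedure used for the values reported in this paper. The function name `guinier_fit`, the q·Rg cutoff of 1.3, and the synthetic data are assumptions introduced here.

```python
import math

def guinier_fit(q, intensity, qrg_max=1.3, max_iter=50):
    """Estimate (Rg, I0) from ln I = ln I0 - (Rg^2 / 3) * q^2.

    q must be sorted in ascending order. The fit window is shrunk
    iteratively until q_max * Rg <= qrg_max (an assumed, commonly
    quoted validity limit for globular particles).
    """
    n = len(q)
    rg = i0 = None
    for _ in range(max_iter):
        x = [qi * qi for qi in q[:n]]
        y = [math.log(ii) for ii in intensity[:n]]
        mx, my = sum(x) / n, sum(y) / n
        sxx = sum((xi - mx) ** 2 for xi in x)
        sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
        slope = sxy / sxx                      # equals -Rg^2 / 3
        if slope >= 0:
            raise ValueError("non-negative Guinier slope; no valid Rg")
        rg = math.sqrt(-3.0 * slope)
        i0 = math.exp(my - slope * mx)
        # Number of leading points that satisfy the q * Rg limit.
        keep = max(3, sum(1 for qi in q if qi * rg <= qrg_max))
        if keep == n:
            break
        n = keep
    return rg, i0

# Synthetic (not experimental) curve generated from an assumed Rg.
q_demo = [0.004 + 0.001 * i for i in range(40)]            # 1/Å
i_demo = [math.exp(-(qi * 60.0) ** 2 / 3.0) for qi in q_demo]
rg_demo, i0_demo = guinier_fit(q_demo, i_demo)
print(f"Rg = {rg_demo:.2f} Å, I(0) = {i0_demo:.3f}")
```

On real data the linearity of the Guinier region should be inspected before trusting the result, since aggregation or interparticle interference distorts the lowest-angle points.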
Figure 2. Ab initio shape modeling of the VSV N0-P complex. A) DAMMIN (19) was used to produce 10 bead models. These 10 models were used to produce an averaged model with DAMAVER (55). The averaged model (white spheres) is shown with the aligned “dimer” of the N0-P complex (red and blue). The dimer was determined with SASREF (25) and superimposed with SUPCOMB (56). 90° rotations relating each orientation are noted. B) A representative single bead model from DAMMIN is shown in beads of cyan. The orientations are the same as in (A). The protein model is not adjusted to fit the single DAMMIN bead model. C) The plot of a single DAMMIN model against the experimental scattering curve. D) Fit of the N0-P model shown in Figure 3 to the experimental SAXS curve. E) SASREF (25) was used to rigid body fit two copies of the complex against the SAXS data. The fit of the SASREF derived dimer is shown.
Figure 3. Maintaining a monomeric N. A complete multi-domain model of the N0-P complex from VSV as determined with EOM is presented. The N protein is shown in red, while the monomers of the P protein dimer are colored yellow and green. Previously determined domains are shown in cartoon representation, while ab initio generated loops are shown as C-α ribbons. The three orange spheres represent positions of Ser60, Thr62 and Ser64 (sites of phosphorylation) of the P protein. These residues are in proximity to the positively charged patch formed by residues Lys398, Arg399, Lys414 and Lys417 of the N protein (colored dark blue on the surface). This region is circled and denoted the “PO binding site”. All cartoon drawings in this and following figures were prepared with PyMol (57).
Figure 4. Oligomerization of the N subunits in the nucleocapsid. Surface representations of the nucleocapsid proteins of (A) VSV, (B) RSV, (C) RVFV and (D) LACV are shown for three subunits. Each subunit has been radially spaced to expose the surfaces that contribute to subunit interactions. The monomers are colored red, green and yellow while residues that contact the adjacent protomers are shaded in black.
Figure 5. Comparisons of the RNA cavity. (A) Ribbon drawings that show the N subunit with encapsidated RNA for VSV, RSV, RVFV and LACV. The rainbow color code of the polypeptide is blue for the N-terminus to red for the C-terminus. The encapsidated RNA is shown again below each N subunit as stick and ribbon models. The nucleotides are numbered from 5’ to 3’. The number 1’ for RVFV means that this nucleotide is the equivalent of nucleotide 1 in the next N subunit in the NLP. (B) Superposition of the N proteins. The Cα tracing of the aligned coordinates is shown. In the first three panels, the Cα tracing for VSVN is shown in gray. The superimposed Cα tracing is colored in rainbow from N-terminus (blue) to C-terminus (red). In the fourth panel, the Cα tracing for LACVN is shown in sky blue and that for RVFVN in rainbow. Only the aligned portions are shown (see Table 1). (C) Topological cartoons representing the helices in the N protein core. Each circle represents a helix that is labeled according to its secondary structure assignment in the references (11-14).
Table 1. Superposition of the N proteins

Comparison                        Residues aligned   RMSD (Å)   TM-score
RSVN(2-375) vs VSVN(2-422)        292 (421)          4.97       0.52
RVFVN(49-126) vs VSVN(131-206)    53 (76)            3.13       0.44
LACVN(35-124) vs VSVN(131-206)    50 (76)            4.08       0.35
LACVN(35-124) vs RVFVN(49-126)    55 (78)            3.88       0.40

The number in parentheses is the length of the reference structure.

Table 2. Radius of gyration (Rg), maximum size (Dmax) and Porod volume (Vp) as calculated from the SAXS curves for the VSV N0-P complex.

Concentration (mg/mL)   Rg (Å)   Vp (Å³)        Dmax (Å)
0.75                    60.94    5.82 × 10^5    209
2.4                     65.07    6.27 × 10^5    227.5
4.3                     66       6.52 × 10^5    229

Rg was calculated with AUTORG, Dmax with DATGNOM and Vp with DATPOROD, all of which are from the ATSAS package (18).

Table 3. Area of contacts between the N proteins and number of nucleotides bound in the N protein

Virus   Surface Area/Protomer   Buried Interface Complex (Å²)   Buried Interface Core N-lobe / C-lobe (Å²)   Buried Interface Arms/Loops (Å²)   # of nucleotides in the core
VSV     23093.9                 5648.0                          552.4 / 1572.4                               3523.2                             8
RSV     20134.4                 5117.6                          997.1 / 248.0                                3872.5                             6
RVFV    13884.2                 3576.6                          344.6*                                       3232.0                             4
LACV    13775.4                 1982.7                          165.0 / 63.9                                 1753.8                             10

* A single core value is listed for RVFV.
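
The buried-interface values in Table 3 can be cross-checked with simple arithmetic. The sketch below assumes that the complex interface is the sum of the core (N-lobe plus C-lobe) and arms/loops contributions; the text does not state this, but the tabulated numbers are consistent with it. The dictionary layout and function names are hypothetical, and the single core value listed for RVFV is used as given.

```python
# Buried interface areas (Å^2) transcribed from Table 3.
TABLE3 = {
    "VSV":  {"complex": 5648.0, "core": [552.4, 1572.4], "arms": 3523.2},
    "RSV":  {"complex": 5117.6, "core": [997.1, 248.0],  "arms": 3872.5},
    "RVFV": {"complex": 3576.6, "core": [344.6],         "arms": 3232.0},
    "LACV": {"complex": 1982.7, "core": [165.0, 63.9],   "arms": 1753.8},
}

def residual(entry):
    """Complex interface minus (core + arms/loops); about 0 if additive."""
    return entry["complex"] - sum(entry["core"]) - entry["arms"]

def arms_fraction(entry):
    """Share of the buried interface contributed by arms and loops."""
    return entry["arms"] / entry["complex"]

for virus, entry in TABLE3.items():
    print(f"{virus}: residual = {residual(entry):+.2f} Å^2, "
          f"arms/loops share = {arms_fraction(entry):.1%}")
```

Under that assumption, the arms and loops account for the majority of the buried interface in all four nucleocapsids, which agrees with the emphasis on cross-molecular interactions in the Summary.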