RNA BIOLOGY 2017, VOL. 14, NO. 1, 36–44 http://dx.doi.org/10.1080/15476286.2016.1239688

RESEARCH PAPER

Archaeal RNA ligase from Thermococcus kodakarensis for template dependent ligation

Lei Zhang and Anubhav Tripathi

Center for Biomedical Engineering, School of Engineering, Brown University, Providence, RI, USA

ABSTRACT

ARTICLE HISTORY

Nick-sealing RNA ligases play a significant biological role in host defense and cellular repair, and have become an important molecular tool in biomedical engineering. Because RNA readily forms secondary structures, RNA modifying enzymes with elevated optimum temperatures are highly desirable. Currently characterized double stranded RNA ligases, such as bacteriophage T4 RNA ligase 2, possess good template dependency but are not active at elevated temperatures. The few characterized RNA ligases from thermophiles exhibit high template independency. Here we synthesize and characterize KOD RNA ligase (KOD1Rnl), a thermostable and template dependent RNA ligase from the archaeon Thermococcus kodakarensis. We show that a 13-fold reduction in template independent ligation can be achieved with the addition of a single stranded DNase, such as RecJ. We also elucidate the effects of blood proteins on the activity of KOD1Rnl. Template dependent and thermostable RNA ligases, such as KOD RNA ligase, can be utilized in RNA detection, modification and sequencing.

Received 19 July 2016 Revised 14 September 2016 Accepted 16 September 2016

Introduction

RNA ligases are a family of enzymes that catalyze the joining of adjacent terminal 3′-hydroxyl and 5′-phosphate groups.1 They are involved in the editing and repair of RNA and are therefore essential proteins for biological processes. Similar to other nucleotidyltransferases, catalysis by RNA ligases involves the formation of an enzyme-NMP intermediate to provide a thermodynamic gradient.2 Their ability to covalently join 2 adjacent RNA or DNA strands has been explored extensively in vitro for biomedical research. RNA ligases are used in the profiling of target sequences, the quantification of different RNA species, and the detection of particular RNA mutations.3-5 Based on their optimal substrate, RNA ligases are categorized into 2 groups. Single stranded RNA ligases are capable of joining 2 RNA or DNA strands without a hybridizing template.5 Double stranded RNA ligases preferentially join the nick in an RNA duplex.6 While both groups are important RNA modifying enzymes, template dependent RNA ligases are less well studied. RNA ligases of mesophilic origin make up the bulk of the family members that have been characterized.2 Although mesophilic ligases are typically very efficient at nick sealing, there are applications where RNA modification at elevated temperatures is desired.7 Because most RNA contains hairpin loops at physiological temperature, the hybridization efficiency of molecular probes can be negatively impacted. For efficient hybridization and ligation of targeting probes, thermostable RNA ligases are preferred. To date, only 2 RNA ligases from thermophiles have been studied.8,9 Both are single stranded ligases that are efficient at strand joining.

CONTACT Anubhav Tripathi [email protected] Supplemental data for this article can be accessed on the publisher’s website. © 2017 Taylor & Francis Group, LLC

KEYWORDS

Double-stranded ligase; KOD RNA ligase; RNA sequencing; template dependent ligation; thermostable RNA ligase

Chlorella virus DNA ligase, T4 DNA ligase, and T4 RNA ligase 2 are all monomeric proteins and have been used in RNA splint ligation.10-12 The former 2 are classic DNA ligases that display a greater preference for DNA templated nick sealing than for RNA templated ligation.11,13,14 T4 RNA ligase 2 belongs to the Rnl2 family of ligases. By searching for conserved motifs identified in Rnl2 proteins,6 we arrived at a candidate RNA ligase from the archaeon Thermococcus kodakarensis.15 A DNA ligase from this thermophile was previously isolated and characterized.16 Thermococcus kodakarensis RNA ligase, KOD1Rnl, is a 380 amino acid residue protein encoded by the TK1545 open reading frame. Here, we report that KOD1Rnl is a thermostable and template dependent RNA ligase. We demonstrate that KOD1Rnl greatly prefers the last 2 nucleotides at the 3′-hydroxyl terminus of a nicked RNA duplex to be ribonucleotides. This preference is also found in T4 RNA ligase 2.6 When ligation reactions are performed in the presence of a single stranded DNase, such as RecJ,17 we observe an increase in the template dependent ligation efficiency and a reduction in the template independent ligation efficiency. When a pre-treatment step with thermo-labile proteases is performed, KOD1Rnl is able to overcome the inhibitory effects of human serum and serum proteins. We also demonstrate that KOD1Rnl can be used to detect Ebola RNA transcripts.

Results

Synthesis and purification of KOD1Rnl

Cell-free protein synthesis (CFPS) produces analytical quantities of protein rapidly and enables the high throughput screening of candidate proteins.18-21 Using the PURE CFPS system by Shimizu et al.,22 we synthesized KOD1Rnl from both plasmid and linearized gene templates. While reduced protein yield is often reported for linearized templates due to mRNA degradation,23 we observed no difference in yield between syntheses from plasmid and from linearized template. Previous reports indicate that the rate of protein synthesis in the PURE system drops off after the first hour of incubation.22 This is thought to be caused by the accumulation of inorganic byproducts and the reduction in the concentration of free magnesium. The addition of free magnesium to the synthesis was previously reported to increase protein synthesis yield.24,25 By adding 20 mM of magnesium chloride to the reactions every hour, we aimed to prolong the reaction. Using this modified batch synthesis method, we observed a yield of around 200 µg/mL for both template types. Thermostable proteins are easily purified from mesophilic hosts or expression systems by heat precipitation.26 As shown by Bioanalyzer gel electrophoresis (Fig. 1a), the isolated protein is free of major contaminant bands. While the protein has a calculated molecular weight of 44 kDa, it has an apparent molecular weight of 41 kDa by electrophoresis (Fig. 1b).

Structure prediction of KOD1Rnl

Using circular dichroism (CD) spectroscopy, we examined the secondary structure and thermal stability of the KOD1Rnl protein. As shown in Fig. 1c, KOD1Rnl has the strongest ellipticity at around 210 nm. The characteristic shape of the KOD1Rnl CD spectrum suggests that the protein is composed mainly of β sheets.27,28 Unlike NMR or X-ray crystallography, CD spectroscopy does not provide detailed structural information about proteins, but it can provide insight into protein folding.28 Fig. 1d illustrates the ellipticity of KOD1Rnl at 210 nm during a thermal melt. As indicated by the sigmoidal shape of the curve, the protein has a true melting point. The point of inflection of the thermodynamic fit of the melting curve indicates that KOD1Rnl melts at 59 °C. After the protein is brought back to room temperature, an almost identical CD spectrum is obtained (Fig. 1c). This demonstrates that KOD1Rnl is capable of completely refolding to its native state following heat denaturation. The predicted 3D structure of KOD1Rnl was produced by homology and ab initio based folding (Fig. 1e).

The Rnl2 family of RNA ligases shares 5 conserved motifs (I, III, IIIa, IV and V) that are analogous to those of ATP-dependent DNA ligases.6 However, unlike DNA ligases, the nucleotidyl transferase motifs found in RNA ligases are highly variable. Using the conserved motifs previously outlined by Ho et al.,6 we identified the TK1545 ORF as a possible gene for an Rnl2-like ligase. By aligning the protein sequence of KOD1Rnl against that of T4Rnl2, we highlighted the 5 motifs in KOD1Rnl (Fig. 2a). Conserved or semi-conserved amino acids within each motif are displayed in bold. As can be seen in Fig. 2a, only motif IIIa is fully conserved, and motif V has only 2 of its 5 amino acids conserved. Fig. 2b illustrates the predicted 3D structures of the 5 motifs found in KOD1Rnl. Interestingly, all 5 structural motifs contain β sheets.

Template dependent ligation

Ligation of adjacently annealed oligonucleotide probes has been elegantly used to detect DNA and RNA targets.29 Due to the heat resilience of DNA, the annealing of ligation probes can be aided by temperature cycling and gradients. RNA molecules, on the other hand, are extremely prone to secondary structure formation.7 The resulting hairpin loops are very stable in most reaction buffers at room temperature. The presence of these loops reduces the hybridization efficiency of ligation probes, as shown in Fig. 3a. KOD1Rnl is a thermostable protein that can be used to perform RNA templated ligation at elevated temperature (Fig. 3a). Ligases are classified as single-stranded or double-stranded based on their abilities to perform template independent or template dependent ligations, respectively. To quantitate small concentrations of nucleic acids, the concentration of template independent ligation products must be minimized.30 The time based velocity profiles of KOD1 RNA ligase, T4 DNA ligase, T4 RNA ligase 2, and Mth RNA ligase are shown in Fig. S1. The template independent reactions were performed without any templates, and the template dependent reactions were performed in a slight excess of templates.

Figure 1. KOD1Rnl purification and predicted structure. (A) Bioanalyzer protein electrophoresis gel plot of crude and purified KOD1Rnl. (B) Protein electropherogram of crude and purified KOD1Rnl. (C) Circular dichroism spectra of KOD1Rnl before and after thermal denaturation. (D) The CD thermal denaturation curve of KOD1Rnl at 210 nm. (E) Predicted 3D structure of KOD1Rnl from both homology and ab initio based protein folding.

Figure 2. Alignment and structure of KOD1Rnl motifs. (A) Protein sequence of KOD1Rnl aligned against that of T4Rnl2, showing the conserved motifs (I-V) of Rnl2-like ligases. The five motifs are highlighted in yellow. Fully or semi-conserved amino acids within the motifs are bolded. (B) The predicted structures of the 5 motifs in KOD1Rnl. The β sheets are displayed in red, and the random coils are displayed in yellow. The amino acid residues are labeled.

Figure 3. RNA detection using KOD1Rnl. (A) Illustration of RNA detection using thermostable double-stranded RNA ligase. (B) The effects of the terminal nucleotide on ligation probes, and the addition of RecJ, on both the template dependent and template independent ligation efficiencies. Reactions with or without RecJ were left for an hour at room temperature. (C) The effects of the time of incubation at 37 °C in the presence of RecJ, on the template dependent and template independent ligation efficiencies (not attempted). The ligation probes used terminate in 2′-O-methyl RNA analogs on the 3′ end. (D) The effects of different mismatched ligation probe terminal nucleotides on ligation efficiency. (E) Detection of different concentrations of Ebola RNA. (F) Illustration of the ligation probe hybridization regions on the Ebola RNA transcripts, and the corresponding primer binding regions on the joined ligation probes. Error bars denote 1 s.d.


The velocity ratio of ligases, $N_{RNA}$, is defined as the initial unit time velocity of RNA template dependent ligation, $V_{0,d}/[E]_0$, over that of template independent ligation, $V_{0,i}/[E]_0$. The normalized velocity ratio, $n_{RNA}$, is the velocity ratio of a given ligase normalized to that of KOD1Rnl. This ratio is used to evaluate the template dependency of KOD1Rnl. T4Rnl2 is 1.7 times more template dependent than KOD1Rnl, while both T4Rnl2 and KOD1Rnl are more template dependent than Mth RNA ligase (MthRnl). This suggests that KOD1Rnl is a double-stranded RNA ligase, unlike previously characterized thermophilic RNA ligases. T4 DNA ligase has a normalized velocity ratio of 200, much higher than that of any of the RNA ligases (Supplemental Table S1). Previously,30 T4 DNA ligase was reported to have a higher ratio of template dependent to template independent ligation yield than observed in this study. In addition to sequence and methodology differences, this could be because that group performed DNA templated reactions, rather than the RNA templated ligation detailed here.

Modeling of substrate preference of ligase

For any template dependent ligation reaction, both template dependent and independent pathways will proceed.30 Eq. 1 illustrates the equilibria of the template dependent and independent ligation reactions, where $E$, $ES_i$, $ES_d$, $P_i$, and $P_d$ are the enzyme, the enzyme and independent substrate complex, the enzyme and dependent substrate complex, the independent ligation product, and the dependent ligation product, respectively. For most hybridization reactions, the annealing probes are present at a much higher concentration than the target. When the ligation probes are hybridized onto the RNA template, the formed complex becomes the substrate for the subsequent template dependent ligation ($S_d$). The excess ligation probes become the substrate for the template independent ligation ($S_i$). The mass balance equations are defined in the supplemental materials.
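As a rough illustration of how probe excess sets the two substrate pools, the sketch below splits a probe set into template dependent and template independent substrate. It assumes complete, one-to-one hybridization of probes onto the available template, which the paper does not state. The function name and the 0.1 nM target concentration are hypothetical; the 10 nM probe concentration follows the ligation conditions given in the Methods.

```python
def substrate_pools(probe_nm, template_nm):
    """Split ligation probes into dependent (S_d) and independent (S_i) pools.

    Assumption (not from the paper): every template molecule is fully
    occupied by probes, so the dependent pool equals the limiting species.
    """
    if probe_nm < 0 or template_nm < 0:
        raise ValueError("concentrations must be non-negative")
    s_d0 = min(probe_nm, template_nm)  # hybridized probe-template complex
    s_i0 = probe_nm - s_d0             # excess, unhybridized probes
    return s_d0, s_i0


# Hypothetical example: 10 nM probes against 0.1 nM target RNA.
s_d0, s_i0 = substrate_pools(10.0, 0.1)
ratio = s_d0 / s_i0  # initial substrate ratio [S_d]0 / [S_i]0
```

With probes in large excess the ratio is small, which is the situation in which template independent ligation matters most.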
The initial and maximum velocities for template dependent and independent ligation reactions are represented by $V_{0,d}$, $V_{0,i}$, $V_{max,d}$ and $V_{max,i}$, respectively. The ratio of the dimensionless template dependent reaction velocity to the dimensionless template independent reaction velocity is defined in terms of the ratio of the concentrations of the template dependent substrate to that of the template independent substrate (Eq. 2). This relationship exists in several regimes depending on the values of $\alpha$ and $\beta$, which are defined in Eq. 3 and Eq. 4, respectively.

$$E + P_i + S_d \xleftarrow{k_3} ES_i + S_d \underset{k_1}{\overset{k_{-1}}{\rightleftharpoons}} S_i + E + S_d \underset{k_{-2}}{\overset{k_2}{\rightleftharpoons}} ES_d + S_i \xrightarrow{k_4} E + P_d + S_i \tag{1}$$

$$\frac{V_{0,d} \cdot V_{max,i}}{V_{0,i} \cdot V_{max,d}} = \frac{[S_d]_0 - ([ES_d] + [P_d])}{[S_i]_0 - ([ES_i] + [P_i])} = \frac{\alpha(\beta - 1)\dfrac{[S_d]_0}{[S_i]_0}}{\alpha\beta - \dfrac{[S_d]_0}{[S_i]_0}} \tag{2}$$

$$\alpha = \frac{[ES_d] + [P_d]}{[ES_i] + [P_i]} \tag{3}$$

$$\beta = \frac{[S_d]_0}{[ES_d] + [P_d]} \tag{4}$$

Fig. 4 illustrates the effects of the different ranges of $\alpha$ and $\beta$ on the relationship. As $\alpha$ or $\beta$ becomes very large, the relationship between $\frac{V_{0,d} \cdot V_{max,i}}{V_{0,i} \cdot V_{max,d}}$ and $\frac{[S_d]_0}{[S_i]_0}$ becomes linear. Under such regimes, the degree of template dependent ligation occurring can be controlled linearly by adjusting the concentrations of the substrates. Similarly, the degree of template independent ligation occurring can be minimized by eliminating the unhybridized ligation probes.

Figure 4. The relationship between the ratio of the dimensionless velocities and the ratio of the initial substrate concentrations. The ratio of the dimensionless template dependent velocity to the dimensionless template independent ligation velocity is plotted against the ratio of the concentrations of the initial dependent substrate to that of the initial independent substrate. The relationships are evaluated for different regimes of $\alpha$ and $\beta$. The hashed lines denote when the ratio of the dimensionless velocity is directly proportional to the ratio of initial substrate concentrations.

Enzymatic elimination of template independent ligation

To detect trace concentrations of RNA targets, we devised an enzymatic approach to reduce the concentration of un-annealed ligation probes. RecJ is a single-stranded DNase. After the ligation probes are allowed to hybridize onto the RNA templates, the reaction mixture is incubated with RecJ to degrade all un-annealed probes (Fig. 3a). As can be seen in Fig. 3b–c, incubation with RecJ significantly reduces the rate of template independent reactions. After an incubation time of 40 minutes with RecJ, a 13-fold reduction in template independent ligation is observed for ligation probes terminating with 2′-O-methyl RNA analogs. A slight increase in template dependent ligation is also noted in the presence of RecJ.
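The regimes of Eq. 2, and the effect of removing unhybridized probes (which lowers the independent substrate pool and so raises the substrate ratio), can be explored numerically. The minimal sketch below evaluates the right-hand side of Eq. 2, written as α(β − 1)r / (αβ − r) with r = [S_d]0/[S_i]0. The function name and all numerical values are illustrative assumptions chosen only to show the limiting behaviour; none are taken from the paper.

```python
def dimensionless_velocity_ratio(r, alpha, beta):
    """Right-hand side of Eq. 2: alpha*(beta - 1)*r / (alpha*beta - r).

    r     : initial substrate ratio [S_d]0 / [S_i]0
    alpha : ([ES_d] + [P_d]) / ([ES_i] + [P_i])   (Eq. 3)
    beta  : [S_d]0 / ([ES_d] + [P_d]), so beta >= 1 (Eq. 4)
    """
    if alpha <= 0 or beta < 1 or r < 0:
        raise ValueError("require alpha > 0, beta >= 1 and r >= 0")
    denominator = alpha * beta - r
    if denominator <= 0:
        raise ValueError("alpha*beta must exceed r for a physical result")
    return alpha * (beta - 1.0) * r / denominator


r = 0.01  # illustrative substrate ratio, probes in large excess

# Very large beta: the ratio approaches r itself (slope 1).
large_beta = dimensionless_velocity_ratio(r, alpha=1.0, beta=1e9)

# Very large alpha: the ratio approaches r * (beta - 1) / beta.
large_alpha = dimensionless_velocity_ratio(r, alpha=1e9, beta=2.0)

# Intermediate regime: the dependence on r is no longer linear.
intermediate = dimensionless_velocity_ratio(0.5, alpha=1.0, beta=2.0)
```

In both limiting cases the output scales in direct proportion to r, which corresponds to the proportional regime marked by the hashed lines in Fig. 4.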

Single nucleotide specificity and detection of Ebola RNA

As shown in Fig. 3d, ligation reactions were performed on 4 synthetic Ebola RNA transcripts, each containing a variant of the nucleotide of interest. For each RNA transcript, a reduced ligation efficiency is observed when the set of ligation probes used does not have a corresponding nucleotide that can pair at the mutation site. The mutation site is designed to be flanked by the 3′ terminal nucleotide of the ligation probes (Fig. 3f).


For KOD1Rnl, this reduced efficiency varies from 1.8 × 10⁻⁴ to 1.5 × 10⁻³, 3.9 × 10⁻⁴ to 4.7 × 10⁻³, 7.7 × 10⁻⁴ to 2.6 × 10⁻², and 1.4 × 10⁻³ to 4.9 × 10⁻³ for RNA templates with U, A, G, and C nucleotides at the target site, respectively. The single nucleotide specificity of KOD1Rnl compares favorably to that of existing ligases.31 We performed RNA templated nick joining reactions using KOD1Rnl for different concentrations of Ebola RNA transcript. Fig. 3e illustrates the real-time PCR curves of the nick joined ligation probes. The quantitative range for Ebola RNA at concentrations from 10⁹ to 10² copies per reaction is demonstrated in Fig. S2. The alignment of the ligation probes (LP1-3) and the primers is shown in Fig. 3f.

Ligase activity

The in vitro conditions for the optimal activity of enzymes can differ from their physiological conditions. As shown in Fig. S3a–c, KOD1Rnl is most active at 55 °C and pH 7.5, and in the presence of 60 mM sodium chloride. The optimum temperature of 55 °C is just below the determined melting temperature of KOD1Rnl. The preference for a low monovalent ion concentration contrasts with the high salt growth conditions of the archaeon.32 To identify the cofactors required by the protein, the 4 ribonucleoside and 4 nucleoside triphosphates were added to ligation reactions. Ligation product was observed only for the reaction containing ATP (data not shown). As shown in Fig. S3d, 1-Thio-ATP, an ATP analog, is also accepted as a cofactor by KOD1Rnl. By varying the 3′ penultimate and ultimate nucleotides of ligation probes, we discovered that KOD1Rnl prefers probes ending with 2′-O-methyl RNA analogs over those ending with DNA. While nick joining reactions still proceed when probes terminate with DNA, they are 9.2 times slower than when probes terminate in RNA analogs (Fig. 3b). This effect is also seen with T4Rnl2.6 As illustrated in Fig. S3e–f, both magnesium and manganese can be used as the divalent ion for ligation reactions. As shown in Fig. S3f, when 10 mM each of manganese and magnesium chloride are present, a higher ligation efficiency is observed than when 4 mM manganese chloride is present. This indicates that while manganese is accepted, magnesium is the preferred divalent ion. Calcium is the most prevalent divalent ion in human blood.33 In the presence of calcium ions, KOD1Rnl demonstrates reduced ligation efficiency (Fig. S3g).

Effect of blood proteins on ligase activity

Blood is a complex matrix of proteins, cells, and aqueous fluid. Fig. 5a illustrates the major proteins present in human serum by electrophoresis. Albumin, globulins and lipoproteins are all present at detectable concentrations. Most molecular assays are inhibited by the presence of high concentrations of these proteins. Therefore, to detect circulating pathogenic or carcinogenic agents in human serum, the respective biomarkers must first be isolated and purified. For the detection of Ebola RNA or cellular mRNA, the target nucleic acid has to be extracted from the complex matrix for downstream assays. We evaluated the efficacy of KOD1Rnl in performing ligation based RNA detection in the presence of human serum (Fig. 5b). At 5% concentration of human serum, only 4% of ligase activity remains. Albumin, the most abundant protein in serum, is capable of partial refolding and remains soluble even after heat shock treatment (Fig. 5a). In the presence of 5% heat treated human serum, 5% of KOD RNA ligase activity remains. This could be due to the residual albumin present. To identify the inhibitory agent in human serum, the serum proteins were degraded using heat-labile proteases from S. griseus.34 With protease treated serum, a significant percentage of ligase activity (91%) is retained. This demonstrates that the inhibitory effects of human serum on ligase activity come from the serum proteins rather than the salts present. For further confirmation, a dialysis membrane with a 10 kDa molecular weight cutoff was used to separate the serum proteins from the aqueous fluid. Upon reconstitution in water to their original concentrations, both the protein and protein-free fractions of serum were spiked into ligation reactions. The protein fraction shows a strong inhibitory effect on ligase activity, while a much higher residual activity (76%) is seen for ligation reactions spiked with the protein-free fraction. Because a calcium chelator was used in all reactions, it is possible that any adverse effects from calcium ions are mitigated. Similar to the addition of human serum, the addition of purified plasma albumin or γ-globulins causes a significant reduction in ligation efficiency, and these inhibitory effects are alleviated upon protease treatment. While heat treatment of human albumin produced no alleviation of inhibitory effects relative to untreated albumin, heat treatment is sufficient to mitigate all effects from γ-globulins.
This shows that for immunoglobulins, heat denaturation and the formation of insoluble bodies are not easily reversible. Interestingly, increased ligase activity over baseline is observed when protease treated albumin and γ-globulins are added to ligation reactions. To determine the causative protein for ligation inhibition, both plasma derived and recombinant forms of human albumin, along with γ-globulins and bovine serum albumin, were spiked into ligation reactions at different concentrations (Fig. 5c). Human serum contains endogenous nucleases. Plasma derived human albumin is purified by chromatographic methods, and could retain trace amounts of nucleases.35 This putative nuclease contaminant should not be present in the recombinant form of human albumin. As shown in Fig. 5c, recombinant human albumin reduces ligase activity by a smaller margin than plasma derived albumin. This difference in inhibitory effect is more prominent at 0.02 mg/mL than at higher concentrations of proteins. The results suggest that both endogenous nuclease and albumin play a role in inhibiting KOD1Rnl. The pre-treatment of human serum with heat-labile proteases degrades both nucleases and serum proteins. As can be seen in Fig. 5d, 15% of KOD1Rnl activity remains even in the presence of 30% protease treated serum. For all concentrations of bovine serum albumin tested, increased ligase activity is observed (Fig. 5c).


Figure 5. The effects of serum proteins on KOD1Rnl. (A) Bioanalyzer protein electropherogram of unprocessed human serum, and the soluble fraction of human serum after heat treatment. (B) Template dependent ligation efficiency of KOD1Rnl in the presence of different treated human serum, serum albumin and gamma-globulin. (C) Ligation efficiency of KOD1Rnl in the presence of different concentrations of plasma-derived and recombinant human serum albumin, human gamma-globulin, and bovine serum albumin. (D) The effects of the addition of different concentrations of S. griseus protease treated human serum on ligation efficiency. Error bars denote 1 s.d.

Discussion

Melting temperature of KOD1Rnl

Thermococcus kodakarensis is capable of growing at 60-100 °C.32 Thus, we expected KOD1Rnl to have a melting point within this growth temperature range. However, a melting point of around 59 °C is observed by CD spectroscopy. To enable a “transparent” baseline for CD spectroscopy, low absorbance phosphate buffers are used to reconstitute the proteins. Due to the low salt nature of these buffers, proteins suspended in them can exhibit a different folding configuration and thermal stability than under their physiological conditions. This could explain why a lower than expected melting point is observed. Since the E. coli derived PURE system is used to produce this archaeal protein, certain glycosylation required for protein stability could also be missing. Alternatively, RNA repair, the biological role fulfilled by Rnl2-like ligases, may be active only near the lowest growth temperature of KOD1.

Ligase template dependency

Single stranded and double stranded ligases are defined by their abilities to catalyze template independent and dependent ligations. For the purposes of detecting or sequencing RNA, the minimization and, if possible, the elimination of any template independent ligation reactions is a prerequisite. KOD1Rnl is shown to have comparable template dependency to T4 Rnl2, and better template dependency than MthRnl. The stringency of substrate preference is an intrinsic property of each ligase. In addition to relying on the intrinsic template dependency of the protein, we can systematically reduce template promiscuity by eliminating the template independent ligation substrates. Using RecJ, a thermo-labile and single stranded DNase, we have shown significant reductions in the yield of template independent ligation reactions, while preserving the yield of template dependent ligation reactions.
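The template-dependency comparison rests on the velocity ratio and its normalization to KOD1Rnl described in the Results. A minimal sketch of that bookkeeping is given here. The function names are hypothetical, and the raw velocity ratios are placeholder numbers chosen only so that the normalized values reproduce the reported 1.7 (T4Rnl2) and 200 (T4 DNA ligase); they are not measured values.

```python
def velocity_ratio(v0_dependent, v0_independent,
                   enzyme_dependent, enzyme_independent):
    """Per-enzyme initial velocity of dependent over independent ligation."""
    return (v0_dependent / enzyme_dependent) / (v0_independent / enzyme_independent)


def normalize_to_reference(ratios, reference="KOD1Rnl"):
    """Express each ligase's velocity ratio relative to the reference ligase."""
    reference_value = ratios[reference]
    return {name: value / reference_value for name, value in ratios.items()}


# Placeholder raw ratios (illustrative only, not data from the paper).
raw_ratios = {"KOD1Rnl": 10.0, "T4Rnl2": 17.0, "T4 DNA ligase": 2000.0}
normalized = normalize_to_reference(raw_ratios)
```

Normalizing to a single reference ligase keeps the comparison independent of the absolute units used for the velocities.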


Ligase specificity

The sequence-indiscriminate nature of double-stranded ligases is used to detect various genetic point mutations. In such reactions, ligation probes are designed with their nicks at the nucleotide of interest. When the target variant of the nucleotide is present, the ligation probes are perfectly hybridized onto the template and the ligation reaction proceeds at the optimum rate. However, if any variant other than the target is present, the structure of the hybridization site is disrupted, and the ligation reaction proceeds at a reduced rate. The ability of any ligase to distinguish between single nucleotide polymorphisms (SNPs) is an intrinsic property of the protein. Previously,31 we reported that certain chemical and length modifications to the ligation probes, and reaction buffer optimizations, increase the specificity of ligases. While these optimizations are important to increase the reliability of ligases in reporting the presence of different point mutations, it is ideal to have an enzyme with a high baseline specificity. KOD1Rnl demonstrates good baseline specificity for all 4 RNA bases. With additional ligation probe optimizations, this specificity can be further increased. The usefulness of highly specific RNA ligases extends beyond mutation detection. When ligation based modification of RNA is required as part of a workflow, such as in RNA sequencing, reduced background ligation will significantly increase the fidelity of the final sequencing reads.

RNA modifying enzymes are becoming an indispensable part of the toolkit of the biomedical sciences, and their importance in understanding basic biological processes is already evident. RNA ligases, especially, are an important class of proteins that allows novel manipulations of RNA. In this study, we demonstrate the identification, purification, and preliminary characterization of a thermostable and template dependent RNA ligase, KOD1Rnl. KOD RNA ligase is shown to possess good mismatch specificity and tolerance to blood protein contaminants, in addition to demonstrated thermostability and template dependency. These traits give KOD1Rnl utility in a wide range of applications, from RNA detection and modification to sequencing.

Materials and methods

Expression and purification of KOD1Rnl

Double-stranded gBlocks oligonucleotide fragments covering the TK1545 ORF were purchased from Integrated DNA Technologies (Coralville, IA, USA). Codon optimization was manually applied using the codon usage table for E. coli. Oligonucleotide primers (IDT, Coralville, IA, USA) containing T7 promoter and terminator sequences were used to amplify the gBlock fragment. The PCR was performed using Q5 DNA polymerase (New England Biolabs, Ipswich, MA, USA) for 15 cycles to minimize unintended mutations. The synthesized TK1545 gene was prepared and cloned into a pUC57-mini plasmid vector by Genscript (Piscataway, NJ, USA). The purified PCR product and plasmid were added to PURExpress (New England Biolabs, Ipswich, MA, USA) protein synthesis reactions for coupled in vitro transcription and translation. Protein synthesis reactions were run for 4 hours at 37 °C, with the addition of 20 mM magnesium chloride every hour after the second hour. Following the completion of protein synthesis, the reactions were diluted 1-fold in a buffer containing 2 mM DTT, 50 mM sodium chloride, and 50 mM Tris-HCl at pH 7.5. A cocktail of heat labile nucleases was added to the protein synthesis reactions and incubated overnight at room temperature. The cocktail consisted of Cryonase (Clontech Laboratories, Mountain View, CA, USA), RecJ and RNase I (New England Biolabs, Ipswich, MA, USA) at concentrations of 2 units/mL, 1 unit/mL, and 1 unit/mL, respectively. The reactions were brought to 90 °C for 20 minutes to heat inactivate the nucleases and denature the E. coli proteins. After 20 minutes of centrifugation at 14,000 × g, the heat-stable supernatant fractions were removed and placed into 3K MWCO Amicon Ultra-0.5 filters (EMD Millipore, Billerica, MA, USA) for diafiltration. The protein was diafiltered 6 times for 10 minutes each. After each wash, the retentate was diluted to 500 µL with a buffer containing 50 mM sodium chloride, 1 mM DTT, and 20 mM Tris-HCl at pH 7.5. Upon desalting and nucleotide removal, the purified ligase was concentrated to 400 ng/mL, and stored in the wash buffer with the addition of 30% glycerol. Protein concentration and purity were determined by Bioanalyzer protein electrophoresis from Agilent Technologies (Santa Clara, CA, USA).

Protein structure prediction and motif identification

Protein homology detection and sequence alignment were performed using the HHPred tool.36 Subsequent homology-based and ab initio protein folding were performed using the Phyre2 protein folding server.37 The generated protein structure was imported into ChemBio3D (PerkinElmer, Waltham, MA, USA). The protein sequence of KOD1Rnl was aligned against the T4Rnl2 sequence. Conserved motifs from Rnl2-like RNA ligases were used to identify motifs in KOD1Rnl. The identified motifs were modeled in ChemBio3D with computed electron density to provide secondary structure information.

Circular dichroism spectrum of KOD1Rnl

Purified protein was washed with a buffer containing 20 mM of potassium phosphate at pH 7.6 in a 10K MWCO Amicon Ultra filter. The chloride and DTT-free protein solution was normalized to a concentration of 1 mM. The circular dichroism (CD) spectrum of the protein was acquired using a 10 mm quartz cuvette in a Jasco JS-815 spectropolarimeter at the Brown University Proteomics Shared Facility. The spectrum was acquired at a pitch of 0.5 nm and a scanning speed of 50 nm/s. Thermal denaturation of the protein was performed at 210 nm from 20 °C to 95 °C at a ramp rate of 5 °C/minute. CD spectra were acquired in triplicate before and after the thermal denaturation of the protein. The melting temperature of the protein was determined by the point of inflection of the thermodynamic fit of the melt curve.
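To illustrate what locating the point of inflection of a melt curve involves, the sketch below builds a synthetic two-state sigmoidal signal and finds the temperature of steepest change by finite differences. This is a simple numerical stand-in, not the thermodynamic fit used in the study; the two-state model, its parameters (baselines, transition width) and the function names are all assumptions, with only the 59 °C midpoint and the 20 to 95 °C range taken from the text.

```python
import math


def two_state_signal(temp_c, tm_c, width_c, folded, unfolded):
    """Sigmoidal two-state melt signal (hypothetical model and parameters)."""
    fraction_unfolded = 1.0 / (1.0 + math.exp((tm_c - temp_c) / width_c))
    return folded + (unfolded - folded) * fraction_unfolded


def inflection_temperature(temps, signal):
    """Return the temperature where the central-difference slope is steepest."""
    best_temp, best_slope = None, -1.0
    for i in range(1, len(temps) - 1):
        slope = abs((signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1]))
        if slope > best_slope:
            best_temp, best_slope = temps[i], slope
    return best_temp


# Synthetic melt from 20 to 95 degrees C in 1 degree steps, midpoint at 59.
temps = [float(t) for t in range(20, 96)]
signal = [two_state_signal(t, 59.0, 2.5, -20.0, -5.0) for t in temps]
tm_estimate = inflection_temperature(temps, signal)
```

On real, noisy data a fitted model is usually more robust than a raw derivative, which is one reason a thermodynamic fit is preferred.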


Transcription and purification of Ebola sequences

Double stranded gBlock fragments encoding a portion of the Ebola genome were purchased from Integrated DNA Technologies (Coralville, IA, USA). All four variants of a single nucleotide polymorphism (SNP) were introduced into the template using PCR. Primers containing T7 promoter and T7 terminator sequences were used to amplify the 4 templates. Wild type Taq DNA polymerase (New England Biolabs, Ipswich, MA, USA) was used for PCR to avoid proof-reading during mutation generation. Overnight transcription was performed using the TranscriptAid T7 transcription kit (Thermo Fisher Scientific, Waltham, MA, USA). Heat-labile double stranded DNase (Thermo Fisher Scientific, Waltham, MA, USA) was added to the finished reaction and incubated for 30 minutes at 37 °C. Following 10 minutes of heat inactivation at 55 °C, the RNA transcripts were purified by spin column. The concentration of RNA transcripts was determined using a UV-vis NanoDrop spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA) and by the Bioanalyzer Small RNA gel electrophoresis system (Agilent Technologies, Santa Clara, CA, USA).
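Expressing a measured transcript concentration as copies per reaction requires a mass-to-copy conversion, which the text does not spell out. The sketch below shows one common way to do it, assuming the widely used approximation for single-stranded RNA molar mass (length × 320.5 + 159 g/mol). The transcript length of 1000 nt and the function names are hypothetical and are not taken from the paper.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def ssrna_molar_mass(length_nt):
    """Approximate molar mass of single-stranded RNA in g/mol."""
    return length_nt * 320.5 + 159.0


def copies_from_mass(mass_ng, length_nt):
    """Convert an RNA mass in nanograms to a number of transcript copies."""
    moles = mass_ng * 1e-9 / ssrna_molar_mass(length_nt)
    return moles * AVOGADRO


# Hypothetical example: 1 ng of a 1000 nt transcript.
copies = copies_from_mass(1.0, 1000)

# Ten-fold dilution series spanning the reported 10^9 to 10^2 copies range.
dilution_series = [10 ** exponent for exponent in range(9, 1, -1)]
```

The approximation ignores sequence composition, so it is suitable for planning dilutions rather than for exact quantitation.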

Ligation reactions and quantitative PCR All template dependent ligation reactions were performed using the synthesized Ebola RNA transcripts as templates. Ligation yields were determined using the relative threshold values from the q-PCR amplification of ligated probes. Oligonucleotide ligation probes were purchased from Integrated DNA Technologies (Coralville, IA, USA), and Eurofins Operon (Louisville, KY, USA). The sequences of all probes and primers are detailed in Table S2. Three ligation probes (LP1-3) were designed to hybridize adjacently and successively onto the ligation template. PCR primers were designed to overlap successive ligation probes. The ligation efficiency was derived by taking the square root of the relative ligation yield obtained from q-PCR. T4 DNA ligase, T4 RNA Ligase 2 and Mth RNA Ligase were purchased from New England Biolabs (Ipswich, MA, USA). KOD FX DNA polymerase was purchased from EMD Millipore (Billerica, MA, USA). Ligation reactions using KOD1Rnl were performed in a buffer containing 60 mM sodium chloride, 10 mM magnesium chloride, 2.5 mM DTT, 20 mg/mL BSA, 0.8 mM ATP, and 50 mM Tris-HCl at pH 7.5. Ligation reactions using other ligases were performed in their respective manufacturer-supplied buffers. Ligation probes were present at 10 nM. Ligation reactions were performed for 40 minutes, except for time-course experiments, where reactions were performed for increasing incubation times. One tenth of the volume of each ligation reaction was used in the subsequent quantitative PCR. Human serum was purchased from Golden West Biologicals (Temecula, CA, USA). Plasma-derived human serum albumin and human serum gamma-globulins, recombinant human serum albumin, and S. griseus proteases were purchased from Sigma-Aldrich (Natick, MA, USA). Dimethyl BAPTA was purchased from Biotium (Hayward, CA, USA). Serum proteins were spiked into ligation reactions for evaluation of their inhibitory effects.
Heat denaturation of human serum was performed at 100 °C for 10 minutes. The soluble fraction of the heat-treated human serum was isolated by centrifugation in a 10K MWCO Amicon Ultra spin filter. S. griseus protease digestion of human serum and serum proteins was performed with 1 mg/mL of proteases in the ligation buffer for 20 minutes at 37 °C. Heat inactivation of the proteases was performed at 60 °C for 5 minutes in the presence of dimethyl BAPTA.
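The ligation efficiency calculation described in the ligation methods above can be sketched as follows. This is a hedged illustration, not the authors' analysis script. It assumes that the relative yield is obtained from the difference in q-PCR threshold cycles with ideal doubling per cycle, which the text does not state, and it reads the square root as reflecting the two ligation junctions needed to join three adjacent probes, which is one plausible interpretation. The function names and threshold cycle values are hypothetical.

```python
import math


def relative_yield(ct_sample, ct_reference, amplification_factor=2.0):
    """Relative ligation yield from q-PCR threshold cycles.

    Assumes each PCR cycle multiplies the product by `amplification_factor`
    (2.0 corresponds to ideal doubling, an assumption not stated in the text).
    """
    return amplification_factor ** (ct_reference - ct_sample)


def ligation_efficiency(rel_yield):
    """Per-junction ligation efficiency as the square root of the relative yield."""
    if rel_yield < 0:
        raise ValueError("relative yield must be non-negative")
    return math.sqrt(rel_yield)


# Hypothetical threshold cycles, not measured values from this study.
ct_reference = 18.0
ct_sample = 20.0

yield_rel = relative_yield(ct_sample, ct_reference)  # 2 ** -2 = 0.25
efficiency = ligation_efficiency(yield_rel)          # sqrt(0.25) = 0.5

print(f"relative yield = {yield_rel}, efficiency = {efficiency}")
```

Under these assumptions, a sample that crosses the threshold two cycles later than the reference has a relative yield of 0.25 and a per-junction efficiency of 0.5.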

Disclosure of potential conflicts of interest The authors declare no competing financial interest.

Acknowledgments L.Z. is supported by a Graduate Fellowship from Brown University. This work was funded by a Seed Award for Translational Research from Brown University. We thank Dr. Michael Clarkson at the Proteomics Shared Facility for providing training on the CD spectrometer. This research is based in part upon work performed using the Rhode Island NSF/EPSCoR Proteomics Share Resource Facility, which is supported in part by the National Science Foundation EPSCoR Grant No. 1004057, National Institutes of Health Grant No. 1S10RR020923, a Rhode Island Science and Technology Advisory Council grant, and the Division of Biology and Medicine, Brown University.

Author contributions L.Z. designed and performed the experiments, and wrote the manuscript. A.T. supervised the project, edited the manuscript and provided technical advice.

