PROTEIN STRUCTURE REPORT

Structural features of Cas2 from Thermococcus onnurineus in CRISPR-cas system type IV

Tae-Yang Jung,1,2 Kwang-Hyun Park,1 Yan An,1 Alexy Schulga,3 Sergey Deyev,3 Jong-Hyun Jung,5 and Eui-Jeon Woo1,4*

1 Disease Target Structure Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon 305-333, South Korea

2 Department of Biological Sciences, KAIST Institute for the Biocentury, Korea Advanced Institute of Science and Technology, Daejeon 305-701, South Korea

3 Molecular Immunology Laboratory, Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Moscow 117997, Russia

4 Department of Analytical Bioscience, University of Science and Technology, Daejeon 305-333, South Korea

5 Research Division for Biotechnology, Korea Atomic Energy Research Institute, Jeongeup 580-185, South Korea

Received 17 May 2016; Accepted 5 July 2016; Published online 12 July 2016 (proteinscience.org). DOI: 10.1002/pro.2981

Abstract: CRISPR-Cas systems are RNA-based prokaryotic immune systems that defend against exogenous genetic elements such as plasmids and viruses. Cas1 and Cas2 are highly conserved components that play an essential part in the adaptation stage of all CRISPR-Cas systems. Characterization of the CRISPR-Cas genes in Thermococcus onnurineus reveals that the Cas2 gene is associated with a putative type IV system that lacks Cas1 or its homologous genes. Here, we present a crystal structure of T. onnurineus Cas2 (Ton_Cas2) that exhibits a deep and wide cleft at an interface lined with positively charged residues (Arg16, Lys18, Lys19, Arg22, and Arg23). The prominent DNA-recognizing loops of Cas2 from E. coli (Eco_Cas2) are absent in Ton_Cas2, and the region around the putative nucleotide-binding site differs significantly in shape and electrostatic potential distribution. Furthermore, Ton_Cas2 lacks the C-terminal hairpin motif that is responsible for Cas1 binding in Eco_Cas2. These structural features could be a unique signature and may indicate an altered functional mechanism for Cas2 in the adaptation stage of type IV CRISPR-Cas systems.

Keywords: CRISPR-Cas system; Thermococcus onnurineus; Cas2; protein structure; type IV system

T.-Y. Jung and K.-H. Park contributed equally to this work.

Grant sponsor: International Research & Development Program of the National Research Foundation of Korea (NRF); Grant sponsor: Ministry of Science, ICT and Future Planning (MSIP) of Korea; Grant number: NRF-2014K1A3A1A49069194/NRF-2015R1A2A2A03006970; Grant sponsor: KRIBB research initiative program and by the Ministry of Education and Science of Russian Federation; Grant number: RFMEFI61315X0033.

*Correspondence to: Eui-jeon Woo, Disease Target Structure Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon 305-333, Korea. E-mail: [email protected]


PROTEIN SCIENCE 2016 VOL 25:1890—1897

© 2016 The Protein Society. Published by Wiley-Blackwell.

Figure 1. The CRISPR-Cas system of T. onnurineus. A: Six CRISPR loci identified within the genome of T. onnurineus NA1, based on the CRISPR database (http://crispr.u-psud.fr/). B: Gene organization of the cas loci of the type III and putative type IV systems. White arrows indicate ORFs present in the T. onnurineus genome, and CRISPR loci 1, 3, and 4 are displayed as gray boxes. Red arrows represent the cas2 gene, and blue arrows represent the putative effector complex genes specific to type IV.

Introduction

The CRISPR-Cas (Clustered Regularly Interspaced Short Palindromic Repeats/CRISPR Associated gene) system has been widely studied as an antiviral immune system of bacteria and archaea.1 The CRISPR system protects the host organism from foreign genetic material via a three-stage process of adaptation, processing, and interference. Cas proteins are thought to provide, either directly or through various modifications, the cleavage and binding activities that facilitate RNA or DNA biogenesis and cleavage.2,3 CRISPR-Cas systems are typically encoded by a single predicted operon that encompasses the Cas1 and Cas2 genes together with the genes for various RNA biogenesis and interference functions. Cas1 and Cas2 function as a complex in vivo and in vitro, and the Cas1-Cas2 complex exhibits DNA acquisition activity during the CRISPR-Cas adaptation stage.4,5 The current nomenclature classifies the CRISPR-Cas systems of bacteria and archaea into five types (I-V) and 16 subtypes, based on phylogenetic and functional studies.6 Previous studies on Cas2 proteins report various activities and substrate specificities depending on the source organism. For example, Cas2 from Sulfolobus solfataricus (Sso_Cas2), which belongs to both type I-A and type III-B, is characterized as a metal-dependent, single-strand-specific


endoribonuclease with a preference for U-rich ssRNA.7 Cas2 from Bacillus halodurans (Bha_Cas2), which belongs to both type I-C and type III-B, shows endonuclease activity toward double-stranded DNA substrates.8 Cas2 from D. vulgaris (Dvu_Cas2) is known to exhibit neither nuclease nor ribonuclease activity.9 Contrary to expectation, recent structural studies have shown that the enzymatic activity of Cas2 is not required for spacer acquisition, suggesting that Cas2 serves simply as a structural platform that binds Cas1.4,10 The structure of the Cas1-Cas2-DNA complex from the type I-E system of E. coli indicates that the main function of Cas2 may be to form a scaffold between two dimeric Cas1 proteins and to assist DNA substrate binding, with the dsDNA substrate placed on one side of the Cas2 dimeric surface. Despite numerous biochemical characterizations and structural analyses, the complete functional mechanism of Cas2 remains elusive. To date, five structures of Cas2 from different organisms and CRISPR-Cas systems have been deposited in the Protein Data Bank (Sso_Cas2:2I8E, Dvu_Cas2:3OQ2, Tth_Cas2:1ZPW, Pf_Cas2:2I0X, and Bha_Cas2:4ES1).7–9,11 In this study, we report the association of Cas2 with the type IV CRISPR-Cas system in Thermococcus onnurineus NA1 and analyze the structure of the protein, which is the first report of Cas2 in a type IV CRISPR-Cas system to date.


Table I. Data Collection and Structure Solution Parameters

| Parameter | Value |
| --- | --- |
| Crystal type | Overlapped plate |
| Unit cell parameters (Å) | a = 29.52, b = 32.80, c = 83.16 |
| Resolution (Å) | 50.00–1.7 |
| Space group | P2(1)2(1)2 |
| Completeness (%) | 99.18 (99.82)a |
| Rsym (%)b | 4.62 (9.64) |
| I/σ(I) | 27.64 (17.23) |
| No. of refined atoms: protein/water | 12,582/66 |
| Rfactor/Rfree (%)c | 21.50/24.07 |
| r.m.s.d. bond length (Å) | 0.006 |
| r.m.s.d. bond angle | 0.814 |
| Ramachandran plot: most favored region | 97.4% |
| Ramachandran plot: additionally allowed region | 2.6% |
| Ramachandran plot: outlier region | 0% |

a Numbers in parentheses are statistics from the highest resolution shell.

b $R_{\mathrm{sym}} = \sum |I_{\mathrm{obs}} - I_{\mathrm{avg}}| / \sum I_{\mathrm{obs}}$, where $I_{\mathrm{obs}}$ is the observed individual reflection, and $I_{\mathrm{avg}}$ is the average over symmetry equivalents.

c $R_{\mathrm{factor}} = \sum \big| |F_0| - |F_c| \big| / \sum |F_0|$, where $|F_0|$ and $|F_c|$ are the observed and calculated structure factor amplitudes, respectively. Rfree was calculated using 5% of the data.
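The Rsym and Rfactor definitions in the Table I footnotes can be illustrated with a minimal Python sketch. The function names `r_sym` and `r_factor` and the toy numbers are hypothetical and are not taken from the paper; real processing and refinement programs also apply scaling and weighting that are omitted here. Under these assumptions, Rfree would be the same `r_factor` calculation restricted to the held-out 5% of reflections.

```python
from typing import Sequence


def r_sym(groups: Sequence[Sequence[float]]) -> float:
    """R_sym = sum|I_obs - I_avg| / sum(I_obs).

    Each inner sequence holds the observed intensities of one set of
    symmetry-equivalent reflections (hypothetical input layout).
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in groups:
        # I_avg is the mean over the symmetry equivalents of this reflection.
        i_avg = sum(intensities) / len(intensities)
        numerator += sum(abs(i_obs - i_avg) for i_obs in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def r_factor(f_obs: Sequence[float], f_calc: Sequence[float]) -> float:
    """R_factor = sum||F_0| - |F_c|| / sum|F_0| over paired amplitudes."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


if __name__ == "__main__":
    # Toy data only, chosen so the arithmetic is easy to follow by hand.
    toy_groups = [[100.0, 110.0], [50.0, 50.0]]
    print(f"R_sym    = {100 * r_sym(toy_groups):.2f}%")
    print(f"R_factor = {100 * r_factor([10.0, 20.0], [9.0, 22.0]):.2f}%")
```

Both statistics are reported as percentages in Table I, so the fractional values returned by the functions are multiplied by 100 in the usage lines.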

Results

Characterization of CRISPR-Cas system in T. onnurineus NA1

The hyperthermophilic archaeon T. onnurineus NA1 has six CRISPR loci in its circular genome (1,847,607 nt), based on the CRISPR database.12 The CRISPR loci in T. onnurineus contain repeat sequences of 28-30 nt and spacers of 7-43 nt [Fig. 1(A)]. The six repeats share similar nucleotide sequences and lengths and are predicted to form a palindromic secondary structure, with an identical repeat sequence among loci 1, 5, and 6 and marginal differences among loci 2, 3, and 4. The repeat sequences are highly conserved in the first four nucleotides (GTTT) and in the terminal region (GXAAX) in all six loci. When ORFs near the CRISPR loci were analyzed, 16 Cas proteins were identified in two major cas gene cassettes: the subtype III-A system and the putative type IV system. The subtype III-A cassette was located between loci 3 and 4, and its individual cas genes and their arrangement are similar to those of P. yayanosii, a recently discovered thermophilic archaeon13,14 [Fig. 1(B)]. The type IV system, previously described as a csf module, is known to form a minimal effector complex with four subunits.6 The putative type IV system in T. onnurineus was located downstream of locus 1, where the signature proteins Csf1 and Csf2 were identified, but the other two subunits could not be assigned, possibly due to the low sequence identity to known cas genes (
