New Technology

CRISPR Primer Designer: Design primers for knockout & chromosome imaging CRISPR-Cas system

Running Title: A Tool for Designing CRISPR Primer

Meng Yan, Shi-Rong Zhou,* and Hong-Wei Xue*

National Key Laboratory of Plant Molecular Genetics, Shanghai Institute of Plant Physiology & Ecology, Chinese Academy of Sciences, Shanghai, 200032, China.

* Correspondence: [email protected] and [email protected]

Edited by: Peking University, China

This article has been accepted for publication and undergone full peer review but has not been through the copyediting, typesetting, pagination and proofreading process, which may lead to differences between this version and the Version of Record. Please cite this article as doi: [10.1111/jipb.12295]

This article is protected by copyright. All rights reserved. Received: August 10, 2014; Accepted: October 8, 2014

Abstract

The clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) system enables biologists to edit genomes precisely and provides a powerful tool for perturbing endogenous gene regulation and for modulating epigenetic markers and genome architecture. However, concerns have been raised about the specificity of the system, especially when it is used to knock out a gene. Previous design tools were mostly web-based or ran as command-line programs, and none of them ran locally with a user-friendly interface. In addition, with the development of CRISPR-derived systems such as chromosome imaging, no tool was available to help end users generate specific spacers. Here we present CRISPR Primer Designer, which helps researchers design primers for CRISPR applications. The program has a user-friendly interface, analyzes BLAST results using multiple parameters, scores each candidate spacer, and generates the primers for a chosen plasmid. In addition, CRISPR Primer Designer runs locally, can be used to search for spacer clusters, and exports primers for CRISPR-Cas-based chromosome imaging.

Keywords: CRISPR; primer; software; knockout; chromosome imaging

INTRODUCTION

Precise genome manipulation has been one of the greatest challenges for biologists studying the functions of specific genes. The recently developed clustered regularly interspaced short palindromic repeats (CRISPR)-Cas (CRISPR-associated) system has emerged as a powerful and easy-to-implement tool to meet this demand (Belhaj et al. 2013; Feng et al. 2013; Gaj et al. 2013; Gao and Zhao 2014; Gratz et al. 2013; Hwang et al. 2013; Mali et al. 2013; Miao et al. 2013; Li et al. 2013; Nekrasov et al. 2013; Shan et al. 2013). The CRISPR-Cas knockout system needs only a single guide RNA (sgRNA) that guides Cas9 to a target fragment followed by a protospacer-adjacent motif (PAM, 5’-NGG-3’) (Jinek et al. 2012). Chromosome imaging is another CRISPR-Cas-based system, used to inspect gene locations on the chromosome in vivo; it provides a new imaging method in addition to FISH and DNA-binding proteins (Chen et al. 2013). It also allows researchers to track the movement of genes during chromosome reorganization and to observe telomere dynamics during telomere elongation or disruption. Other derived systems using the engineered dCas9 protein have been used to perturb endogenous gene expression (Gilbert et al. 2013; Qi et al. 2013), which makes the CRISPR-Cas system even more powerful for biologists.

Given these versatile functions, the targeting specificity of the CRISPR-Cas system poses a challenge for its application. The Cas9-sgRNA complex can cause off-target cleavage to varying degrees, depending on the number and position of mismatches between the spacer and the target DNA (Fu et al. 2013; Hsu et al. 2013). Choosing specific targeting sequences greatly improves the specificity of the CRISPR-Cas system and minimizes the risk of off-target events. Previous off-target searching tools were mostly web-based, or lacked a user-friendly local interface for adjusting the search parameters easily (Bae et al. 2014; Lei et al. 2014; Xiao et al. 2014). Here we report new software, CRISPR Primer Designer, that helps users of the CRISPR-Cas system choose specific spacer sequences for targeting within a background genome, including Arabidopsis thaliana, Oryza sativa, human, mouse, and Drosophila melanogaster. By generating a list of potential spacers for short-input BLAST and analyzing the BLAST results automatically, the program produces the preferred specific spacers. In addition, CRISPR Primer Designer can output an array of spacers for the CRISPR chromosome imaging system.

RESULTS

CRISPR Primer Designer Software

CRISPR Primer Designer was developed to design primers for using the CRISPR-Cas system to knock out specific genes and to image a specific chromosome fragment in vivo. The program is written in C++/CLI and runs on Windows with Microsoft .NET Framework 3.5 or higher installed. It has a user-friendly interface (Figure 1A) and alerts users when an error occurs. CRISPR Primer Designer offers two modes: in Knockout Mode, it reads BLAST result files, analyzes the results and generates the best spacer sequence for CRISPR-Cas-based knockout; in Chromosome Imaging Mode, it generates a list of spacer sequences for CRISPR-Cas-based chromosome imaging. Each functional section displays its own parameters with built-in defaults, which users can adjust to tune the searching and filtering processes for optimal output.

Knockout Mode. With default search parameters, CRISPR Primer Designer generated a list of 14 spacers and PAMs for the A. thaliana gene BRI1; the spacers were exported and submitted to NCBI BLAST (Supplement Information 2) against the A. thaliana genome (Figure 1A). The result file was downloaded and loaded into the program for analysis. The filter parameters were left at their defaults: seed number 7, rejected range 12, buffer zone 5, and buffer threshold 1 (Figure 1B). The default filter settings are based on two previously published experimental results (Hsu et al. 2013; Qi et al. 2013).

The seed region consists of the bases immediately adjacent to the “NGG”. We chose 7 bases as the default because the published results show that a single mismatch within these 7 bases causes the efficiency of CRISPR to drop dramatically. The filter can be made more restrictive by setting the seed number to 5. The bases next to the seed region form the buffer zone. Its default length is 5 bases, from the 8th to the 12th base counted from the PAM, because mismatches in this area cause a major loss of efficiency. However, cleavage may still occur even with a mismatch in this area, causing an off-target event, so we add 1 point to the score for each such case. Two mismatches in this region result in an almost complete loss of cutting efficiency; the default buffer threshold is therefore 1, which means that if more than 1 mismatch appears in the buffer zone, cleavage is assumed not to occur. The default value for the reject zone is 12 (very restrictive), which means that a match with no mismatches in the first 12 bases is likely to cause off-target cleavage. If this is too restrictive for your genes, you can raise this value from 12 up to 18.

The program scored these 14 spacers, which can be sorted by score by clicking the “Score” title in the score board. The filtered BLAST blocks for each spacer can be viewed by double-clicking that spacer's row in the spacer score board. In addition, all the filtered blocks can be exported to a .txt file to save the details. We suggest that users open the filter box to check whether the subject sequence lies in the target gene. The plasmid we selected was psgR-Cas9-At (Mao et al. 2013); primers for this plasmid using the selected spacer were shown in the bottom text box.

Compared with other programs for off-target searching (Bae et al. 2014; Lei et al. 2014; Xiao et al. 2014), CRISPR Primer Designer allows users to set the search range for PAMs and the GC content of the spacers. These parameters are useful for choosing a cleavage site near the beginning of the target gene to generate the most severe mutation, or for editing a specific region of the target gene, such as a specific domain or a unique structure. Other programs do not provide parameters for filtering the results, and their scoring systems are based on an individual score for each nucleotide match in the genome, whereas CRISPR Primer Designer uses short-input BLAST to search for matches, counts the number of possible cleavage sites as the score, and rejects a spacer if a match falls within the reject zone parameter. With 4 simplified parameters for filtering the BLAST results, it is much easier for users to control the scoring system and the output. In addition, by using NCBI for BLAST, CRISPR Primer Designer reduces the workload on local machines and avoids the complexity of downloading the whole background genome, and its user-friendly local interface is designed for most users to operate (Xiao et al. 2014).
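The filter rules described for Knockout Mode can be illustrated with a minimal Python sketch. It is not the program's actual implementation (CRISPR Primer Designer is written in C++/CLI). The names FilterParams, classify_hit and score_spacer are hypothetical; each hit is assumed to be an ungapped subject sequence of the same length as the spacer, written 5’ to 3’ so that its last base is adjacent to the PAM; and the caller is assumed to have already removed the intended on-target match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterParams:
    """Default values as stated in the text; the field names are illustrative."""
    seed: int = 7              # bases next to the PAM
    reject_range: int = 12     # PAM-proximal bases checked for a perfect match
    buffer_zone: int = 5       # bases next to the seed (8th to 12th by default)
    buffer_threshold: int = 1  # mismatches tolerated inside the buffer zone


def mismatch_distances(spacer: str, subject: str) -> list[int]:
    """Return the distance of each mismatch from the PAM (1 = base next to the PAM)."""
    if len(spacer) != len(subject):
        raise ValueError("this sketch only handles ungapped, equal-length hits")
    n = len(spacer)
    return [n - i for i, (a, b) in enumerate(zip(spacer, subject)) if a != b]


def classify_hit(spacer: str, subject: str, p: FilterParams = FilterParams()) -> str:
    """Classify one off-target candidate as 'ignored', 'reject' or 'scored'."""
    mm = mismatch_distances(spacer, subject)
    if any(d <= p.seed for d in mm):
        return "ignored"   # a seed mismatch: cleavage is assumed not to happen
    if not any(d <= p.reject_range for d in mm):
        return "reject"    # perfect match across the reject range: likely off-target
    in_buffer = sum(p.seed < d <= p.seed + p.buffer_zone for d in mm)
    if in_buffer > p.buffer_threshold:
        return "ignored"   # too many buffer-zone mismatches: cleavage is assumed lost
    return "scored"        # cleavage may still happen: counts 1 point


def score_spacer(spacer, subjects, p=FilterParams()):
    """Return (rejected, score); a lower score means fewer possible off-target sites."""
    labels = [classify_hit(spacer, s, p) for s in subjects]
    return "reject" in labels, labels.count("scored")
```

Under these assumptions and the default values, a hit whose only mismatch is the 10th base from the PAM adds 1 point to the spacer's score, whereas a hit that matches perfectly across the 12 PAM-proximal bases causes the spacer to be rejected.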

Chromosome Imaging Mode. The number of PAMs was set to 36 for intron 1 of the MUC4 gene, with spacer length ranging from 17 to 28 bases, GC content from 40% to 60%, and a minimum distance of 30 bp between adjacent PAMs. CRISPR Primer Designer produced an array of 36 spacers and showed the location and distribution of the selected spacers (Figure 2).

CRISPR Primer Designer is the first tool that helps chromosome imaging users find an array of spacers; it also displays the spacer array graphically and can export primers for the vector pSLQ1371 (Chen et al. 2013).

DISCUSSION

The CRISPR system has been applied in many organisms, including the model plants A. thaliana and Oryza sativa. Compared with the published CRISPR off-target searching tools Cas-OFFinder (Bae et al. 2014), CRISPR-P (Lei et al. 2014) and CasOT (Xiao et al. 2014), CRISPR Primer Designer has several distinctive features: (1) it has a user-friendly local interface and runs on the Windows platform, which is important for biologists, as many of them are not proficient with command-line programs; (2) it generates a list of potential spacers for short-input BLAST and analyzes the BLAST results automatically, using multiple parameters and scoring each spacer found; (3) it directly generates the primers for a chosen plasmid, and because most published plasmids are built into the program, this function avoids potential errors during primer design and saves time; (4) it can also be used to search for spacer clusters and to export primers for CRISPR-Cas-based chromosome imaging, a function not provided by any other program.

CRISPR-P (Lei et al. 2014) is a recently published plant sgRNA design tool for the CRISPR-Cas system. We compared the results generated by CRISPR Primer Designer and CRISPR-P; the candidate spacers were largely consistent, indicating that short-input BLAST can be used to search for off-targets. CRISPR-P generated more alignment results, and most of the subject sequences had multiple mismatches, including some sequences unlikely to be cleaved. CRISPR Primer Designer filtered out many more genomic matches using default parameters, giving users a much clearer basis for choosing spacers.

METHODS

Implementation for Knockout Mode. In Knockout Mode, the user only has to input the coding sequence (CDS) of the target gene to search for PAMs and their adjacent spacers. We tested the program with the CDSs of different genes from different species: A. thaliana gene BRI1, Oryza sativa gene LAZY1, Danio rerio gene FH1, mouse gene TET1, and human gene EMX1 (see Supplement Information 1 for the manual; BLAST parameters for human, mouse, Drosophila melanogaster and Danio rerio differ from those for A. thaliana and Oryza sativa). The program generates a list of 20-bp spacers for the target gene by searching for PAMs (5’-NGG-3’) on both strands, and can optionally require a G at the first position of the 5’ end for the T7 promoter. The search parameters are: Searching Range (the fragment of the gene in which to find spacers), the GC content of the spacers, and whether a G is needed at the first position. The user then exports the candidate spacers and uploads them to NCBI for BLAST (Supplement Information 2) against the background genome. The BLAST program adjusts its parameters for short input sequences, so the results can be used for analysis. After downloading the result file, the user can load the results into the program for analysis.

There are 4 parameters for the result filter: seed number, rejected range, buffer zone, and buffer threshold (Figure 1B). The purpose of the filter is to discard BLAST matches that will not be targeted and to record matches that might be targeted for scoring. The seed is the stretch of nucleotides immediately before the PAM; if a single mismatch appears in this region, CRISPR-Cas cleavage will not occur. The rejected zone also starts immediately before the PAM; if all the nucleotides in this region match, cleavage might occur there, so the spacer for this PAM is rejected from the next step. The buffer zone is the region before the seed, which is used for scoring and comparison. Some mismatches may occur in the buffer zone, and the number of mismatched nucleotides allowed is called the buffer threshold. The program scores every possible spacer and chooses the best one. A few plasmids are preset (Cong et al. 2013; Hwang et al. 2013; Mao et al. 2013; Miao et al. 2013) (Supplement Information 4).
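The spacer search step of Knockout Mode (NGG PAMs on both strands, 20-bp spacers, a GC content window and an optional 5’ G) can be sketched in Python as follows. This is an illustrative sketch only, not the program's code: the function names, the tuple layout of the results and the default GC window of 0 to 100% are assumptions made for the example, and the Searching Range parameter is left out for brevity.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string (other characters are kept)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def gc_percent(seq: str) -> float:
    """GC content of a sequence as a percentage."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def find_spacers(cds, spacer_len=20, gc_min=0.0, gc_max=100.0, require_5prime_g=False):
    """List (strand, start, spacer, pam) for every NGG PAM preceded by a full-length spacer.

    `start` is the 0-based spacer start on the strand that was scanned, so
    minus-strand coordinates refer to the reverse complement of the input.
    """
    cds = cds.upper()
    found = []
    for strand, seq in (("+", cds), ("-", reverse_complement(cds))):
        # i is the index of the N in NGG; the spacer is the spacer_len bases before it
        for i in range(spacer_len, len(seq) - 2):
            if seq[i + 1:i + 3] != "GG":
                continue
            spacer = seq[i - spacer_len:i]
            if require_5prime_g and not spacer.startswith("G"):
                continue
            if not gc_min <= gc_percent(spacer) <= gc_max:
                continue
            found.append((strand, i - spacer_len, spacer, seq[i:i + 3]))
    return found
```

For example, `find_spacers(cds, gc_min=40, gc_max=60, require_5prime_g=True)` would keep only spacers that start with G and have 40% to 60% GC content; the resulting spacers are what would be exported for BLAST.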
If the plasmid to be used is not listed, a new plasmid can be created, and primers for this plasmid will be shown in the primer box.

Implementation for Chromosome Imaging Mode. Chromosome Imaging Mode is simpler to use than Knockout Mode. To test the program, we used the non-repetitive sequence of intron 1 of the MUC4 gene (Chen et al. 2013). The PAM pattern used for searching is 5’-GNXNGG-3’ (Chen et al. 2013); the search parameters include the desired number of PAMs, the length of the spacers, the GC content of the spacers, and the minimum distance between adjacent PAMs (the PAMs should be scattered along the given fragment because dCas9 occupies some space on the DNA strand).
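The minimum-distance constraint between adjacent PAMs can be illustrated with a short Python sketch. The text does not state how the program chooses among candidate PAMs, so the greedy left-to-right selection below is an assumption, and the function name select_spaced_pams is hypothetical.

```python
def select_spaced_pams(pam_positions, wanted, min_distance):
    """Pick up to `wanted` positions, left to right, at least `min_distance` apart."""
    if wanted <= 0:
        return []
    chosen = []
    for pos in sorted(pam_positions):
        # keep a PAM only if it is far enough from the previously kept one
        if not chosen or pos - chosen[-1] >= min_distance:
            chosen.append(pos)
            if len(chosen) == wanted:
                break
    return chosen
```

With the settings used for MUC4 in this work, the call would be `select_spaced_pams(positions, wanted=36, min_distance=30)`, where `positions` holds the candidate PAM coordinates that already passed the spacer length and GC content checks.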

The program generates a list of PAMs and their spacers. When one or more PAMs are selected, their location and distribution are shown graphically. When the full-length checkbox is selected, the graph is drawn over the whole input sequence; otherwise, it is drawn from the first nucleotide to the last PAM found in the list. Parameters can be adjusted to obtain the best result, and primers for the vector pSLQ1371 (Chen et al. 2013) can be exported. Alternatively, one can export just the PAMs and spacers and then adapt them for new plasmids.

Availability and requirements

Project name: CRISPR Primer Designer
Project home page: http://www.plantsignal.cn
Operating system(s): Windows® XP or higher
Programming language: C++/CLI
Other requirements: Microsoft .NET Framework 3.5 or higher. It can be downloaded from the Microsoft website at http://www.microsoft.com/net/downloads; choose a version suitable for your operating system. Reminder: the newest version, 4.5.X, cannot be installed on Windows XP. For help with installing the .NET Framework, go to http://www.microsoft.com/net/downloads and choose the corresponding topics to view the help documents.
License: free to noncommercial users
Any restrictions to use by non-academics: Contact Meng Yan ([email protected]) or any of the other authors listed.

ACKNOWLEDGMENTS

The work was supported by the Chinese Academy of Sciences. We thank Zheng-Yan Feng from the Shanghai Stress Biology Center, CAS, for providing the information on the specificity of the CRISPR-Cas system and the CRISPR protocol in the help system.

REFERENCES

Bae S, Park J, Kim J-S (2014) Cas-OFFinder: A fast and versatile algorithm that searches for potential off-target sites of Cas9 RNA-guided endonucleases. Bioinformatics 30: 1473-1475
Belhaj K, Chaparro-Garcia A, Kamoun S, Nekrasov V (2013) Plant genome editing made easy: Targeted mutagenesis in model and crop plants using the CRISPR/Cas system. Plant Methods 9: 39
Chen BH, Gilbert LA, Cimini BA, Schnitzbauer J, Zhang W, Li GW, Park J, Blackburn EH, Weissman JS, Qi LS, Huang B (2013) Dynamic imaging of genomic loci in living human cells by an optimized CRISPR/Cas System. Cell 155: 1479-1491
Cong L, Ran FA, Cox D, Lin SL, Barretto R, Habib N, Hsu PD, Wu XB, Jiang WY, Marraffini LA, Zhang F (2013) Multiplex genome engineering using CRISPR/Cas Systems. Science 339: 819-823
Feng ZY, Zhang BT, Ding WN, Liu XD, Yang DL, Wei PL, Cao FQ, Zhu SH, Zhang F, Mao YF, Zhu JK (2013) Efficient genome editing in plants using a CRISPR/Cas system. Cell Res 23: 1229-1232
Fu YF, Foden JA, Khayter C, Maeder ML, Reyon D, Joung JK, Sander JD (2013) High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells. Nat Biotechnol 31: 822-826
Gaj T, Gersbach CA, Barbas CF (2013) ZFN, TALEN, and CRISPR/Cas-based methods for genome engineering. Trends Biotechnol 31: 397-405
Gao YB, Zhao YD (2014) Self-processing of ribozyme-flanked RNAs into guide RNAs in vitro and in vivo for CRISPR-mediated genome editing. J Integr Plant Biol 56: 343-349
Gilbert LA, Larson MH, Morsut L, Liu ZR, Brar GA, Torres SE, Stern-Ginossar N, Brandman O, Whitehead EH, Doudna JA, Lim WA, Weissman JS, Qi LS (2013) CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes. Cell 154: 442-451
Gratz SJ, Cummings AM, Nguyen JN, Hamm DC, Donohue LK, Harrison MM, Wildonger J, O'Connor-Giles KM (2013) Genome engineering of Drosophila with the CRISPR RNA-guided Cas9 nuclease. Genetics 194: 1029-1035
Hsu PD, Scott DA, Weinstein JA, Ran FA, Konermann S, Agarwala V, Li YQ, Fine EJ, Wu XB, Shalem O, Cradick TJ, Marraffini LA, Bao G, Zhang F (2013) DNA targeting specificity of RNA-guided Cas9 nucleases. Nat Biotechnol 31: 827-832
Hwang WY, Fu YF, Reyon D, Maeder ML, Tsai SQ, Sander JD, Peterson RT, Yeh JRJ, Joung JK (2013) Efficient genome editing in zebrafish using a CRISPR-Cas system. Nat Biotechnol 31: 227-229
Jinek M, Chylinski K, Fonfara I, Hauer M, Doudna JA, Charpentier E (2012) A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337: 816-821
Lei Y, Lu L, Liu HY, Li S, Xing F, Chen LL (2014) CRISPR-P: A web tool for synthetic single-guide RNA design of CRISPR-system in plants. Mol Plant online doi: 10.1093/mp/ssu044
Li JF, Norville JE, Aach J, McCormack M, Zhang DD, Bush J, Church GM, Sheen J (2013) Multiplex and homologous recombination-mediated genome editing in Arabidopsis and Nicotiana benthamiana using guide RNA and Cas9. Nat Biotechnol 31: 688-691
Mali P, Esvelt KM, Church GM (2013) Cas9 as a versatile tool for engineering biology. Nat Methods 10: 957-963
Mao Y, Zhang H, Xu N, Zhang B, Gou F, Zhu JK (2013) Application of the CRISPR-Cas system for efficient genome engineering in plants. Mol Plant 6: 2008-2011
Miao J, Guo DS, Zhang JZ, Huang QP, Qin GJ, Zhang X, Wan JM, Gu HY, Qu LJ (2013) Targeted mutagenesis in rice using CRISPR-Cas system. Cell Res 23: 1233-1236
Nekrasov V, Staskawicz B, Weigel D, Jones JDG, Kamoun S (2013) Targeted mutagenesis in the model plant Nicotiana benthamiana using Cas9 RNA-guided endonuclease. Nat Biotechnol 31: 691-693
Qi LS, Larson MH, Gilbert LA, Doudna JA, Weissman JS, Arkin AP, Lim WA (2013) Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell 152: 1173-1183
Shan QW, Wang YP, Li J, Zhang Y, Chen KL, Liang Z, Zhang K, Liu JX, Xi JJ, Qiu JL, Gao CX (2013) Targeted genome modification of crop plants using a CRISPR-Cas system. Nat Biotechnol 31: 686-688
Xiao A, Cheng ZC, Kong L, Zhu ZY, Lin S, Gao G, Zhang B (2014) CasOT: A genome-wide Cas9/gRNA off-target searching tool. Bioinformatics 30: 1180-1182

Figure legends

Figure 1. CRISPR Primer Designer in Knockout Mode and parameter diagram for the filter
(A) The input sequence is the CDS of BRI1 (A. thaliana); 14 PAMs and spacers were generated. The selected plasmid was psgR-Cas9-At, and PAM1 was selected to construct the primers. (B) Parameters for the filter algorithm are seed length, buffer zone length, reject zone length, and buffer threshold number. The working range of each parameter is shown.

Figure 2. CRISPR Primer Designer in Chromosome Imaging Mode
36 PAMs were generated for chromosome imaging of the MUC4 gene by inputting the non-repetitive sequence of intron 1 of the gene.

Figure 1

Figure 2
