Articles

Genome-wide recessive genetic screening in mammalian cells with a lentiviral CRISPR-guide RNA library


© 2014 Nature America, Inc. All rights reserved.

Hiroko Koike-Yusa1,2, Yilong Li1,2, E-Pien Tan1, Martin Del Castillo Velasco-Herrera1 & Kosuke Yusa1

Identification of genes influencing a phenotype of interest is frequently achieved through genetic screening by RNA interference (RNAi) or knockouts. However, RNAi may only achieve partial depletion of gene activity, and knockout-based screens are difficult in diploid mammalian cells. Here we took advantage of the efficiency and high throughput of genome editing based on type II, clustered, regularly interspaced, short palindromic repeats (CRISPR)–CRISPR-associated (Cas) systems to introduce genome-wide targeted mutations in mouse embryonic stem cells (ESCs). We designed 87,897 guide RNAs (gRNAs) targeting 19,150 mouse protein-coding genes and used a lentiviral vector to express these gRNAs in ESCs that constitutively express Cas9. Screening the resulting ESC mutant libraries for resistance to either Clostridium septicum alpha-toxin or 6-thioguanine identified 27 known and 4 previously unknown genes implicated in these phenotypes. Our results demonstrate the potential for efficient loss-of-function screening using the CRISPR-Cas9 system.

Genome-wide loss-of-function screening is a powerful hypothesis-free approach to discover genes and pathways that underlie biological processes. The approach has been successful in organisms such as yeast1 and Caenorhabditis elegans2, but screening the effects of recessive mutations is difficult in mammalian cells because of the diploid nature of their genomes. In mouse ESCs, homozygous mutants can be obtained by two rounds of gene targeting using standard genetic modification techniques, but this process is time consuming and has insufficient throughput to generate genome-wide homozygous mutant libraries.
RNAi is a powerful tool for genome-wide screening because a single short interfering RNA or a single short hairpin RNA (shRNA)-expressing vector can inactivate gene function in a sequence-specific manner irrespective of the copy number of the target gene and of cell type3–5. However, RNAi often has off-target effects and suppression of gene expression by RNAi is frequently insufficient to allow observation of knockout phenotypes. In this regard, introducing mutations into mammalian haploid cells has advantages over RNAi, as demonstrated by haploid screens that have uncovered host factors required for virus infection and toxin cytotoxicity6–8. However, in mammalian cells the haploid state can be stably maintained in only a few cell types6,9–12, which restricts the application of haploid screening.

Genome editing technologies now provide efficient means to introduce mutations into mammalian cells. Although zinc-finger nucleases and transactivator-like effector nucleases have been successfully used to modify the genomes of various organisms including mammals13,14, they have not been used with a high level of multiplexing or at sufficient throughput to generate genome-wide knockout libraries. Recent studies have shown that genome editing systems based on the CRISPR-Cas system derived from Streptococcus pyogenes can introduce double-strand breaks (DSBs) into cultured mammalian cells15–17. The system consists of two components, namely a Cas9 endonuclease and gRNAs. The target specificity of the system relies on a 20-nt variable region at the 5ʹ end of the gRNAs and thus rapid, multiplexed construction of gRNA expression vectors targeting various genomic sites is possible. Furthermore, the high efficiency with which DSBs are induced by the CRISPR-Cas9 system allows bi-allelic mutation at multiple loci simultaneously18.

In this study, we applied the CRISPR-Cas9 technology to achieve targeted genome-wide mutagenesis. We constructed a mouse genome-wide lentiviral CRISPR gRNA library and used it to generate genome-wide mutant mouse ESC libraries that we used for two recessive screens, leading to the identification of previously unknown host factors that modulate toxin susceptibility.

RESULTS

Cleavage with constitutive expression of Cas9 and gRNA

We first examined whether constitutive expression of Cas9 and gRNA from single-copy transgenes is sufficient for DSB induction. After ESCs are transduced with a genome-wide lentiviral gRNA library, multiple viruses might integrate, but each cell is likely to carry only one copy of a provirus expressing a particular gRNA. The performance of gRNAs expressed from a single-copy transgene is therefore crucial for successful use of genome-wide lentiviral libraries. To measure the performance

1Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK. 2These authors contributed equally to this work. Correspondence should be addressed to K.Y. ([email protected]).

Received 12 November; accepted 17 December; published online 23 December 2013; doi:10.1038/nbt.2800

nature biotechnology volume 32 NUMBER 3 MARCH 2014


Figure 1 Stable expression of Cas9 and gRNA from single-copy transgenes can induce site-specific DSBs. (a) Schematic of the piggyBac vector carrying human EF1α promoter-driven puromycin-resistant gene (Puro) and humanized Cas9 (hCas9). The two coding sequences are fused with the T2A self-cleaving peptide. PB, piggyBac repeats; bpA, bovine polyadenylation signal sequence. (b) Flow cytometry analyses of GPI-anchored protein expression using FLAER. The parental wild-type ESCs (JM8) and the Cas9-expressing clones were transfected with the indicated plasmid DNAs. Data are from three independent transfections and are shown as mean ± s.d. (n = 3). Variances of two populations were tested by F-test and found to be equal. Student’s t-test was performed between co-transfected JM8 and each cell line. *P < 0.05, **P < 0.01. gRNA2, gRNA targeting site 2 of the Piga gene. (c) Flow cytometry analyses of GFP expression. To measure transfection efficiency for the experiments shown in b, we transfected each line separately with a GFP-expressing plasmid. Data are shown as mean ± s.d. (n = 3). Comparing JM8 and each of the stable cell lines, no statistical significance was detected by Student’s t-test. (d) Schematic of the piggyBac vector carrying the gRNA expression cassette and a neomycin-resistant gene cassette. U6, human U6 promoter; T7, U6 terminator; PGK, mouse Pgk1 promoter. (e) Flow cytometry analysis of ESC lines transgenic for both Cas9 and the gRNA expression cassette. Two Cas9-expressing ESC lines, JM8-Cas9#5 and #8, were used. The analysis was performed 3 d after the colonies were picked. Analysis was performed in duplicate. Note that there is contamination of wild-type feeder cells, which accounts for a maximum of 5% FLAER-positive cells. Data are shown as mean ± s.d. (n = 2).


of constitutively expressed components, we targeted genes involved in the glycosylphosphatidylinositol (GPI)-anchor synthesis pathway (Supplementary Fig. 1a). Mutant cells lack GPI-anchored protein expression on the cell surface and can be readily detected by flow cytometry19,20. In addition, alpha-toxin from C. septicum uses GPI-anchored proteins as a cellular receptor, allowing mutant cells to be selected by toxin treatment21.

We generated ESC lines that constitutively express Cas9 from a single-copy piggyBac transposon (Fig. 1a and Supplementary Fig. 2) and then transfected these ESC lines with a vector expressing a gRNA (site 2 of the X chromosome–linked Piga gene; Supplementary Fig. 3). All Cas9-expressing ESC lines showed higher knockout frequencies than the control ESCs transfected with the same vectors, despite similar transfection efficiencies (Fig. 1b,c). Overexpression of Cas9 did not have any effect on the knockout frequencies in the Cas9-expressing ESCs (Fig. 1b). These results indicate that the expression levels of Cas9 in these stable clones are sufficient to induce DSBs.

We next introduced the gRNA expression cassette into two Cas9-expressing ESC lines by single-copy piggyBac transposition (Fig. 1d and Supplementary Fig. 4). Out of 20 stable transfectants analyzed, 18 colonies almost completely lacked GPI-anchored protein expression (Fig. 1e). As 90% of the colonies examined carried clonal indels (Supplementary Fig. 5), it is likely that the cleavage had a clonal origin (Supplementary Results and Supplementary Fig. 6). Off-target cleavage analyses at 275 potential off-target sites in these stable clones revealed that only two loci were genuine off-target sites and that the number of off-target cleavages in the stable clones was the same as that observed in transiently transfected cells (Supplementary Results, Supplementary Figs. 7 and 8, and Supplementary Tables 1 and 2). Taken together, our results indicate that constitutive expression of Cas9 and gRNA from single-copy transgenes is sufficient to induce CRISPR-Cas–mediated DSBs.

Lentiviral delivery of gRNA expression cassettes

We next examined whether lentiviral vectors can be used to deliver gRNA expression cassettes into target cells. We generated a lentiviral vector carrying both a gRNA expression cassette and an enrichment


cassette, which consists of the puromycin-resistant gene and a blue fluorescent protein (BFP) (Fig. 2a). JM8-Cas9#5 ESCs were individually transduced with lentivirus expressing one of the Piga site 1–3 gRNAs (Supplementary Fig. 3) and analyzed for GPI-anchored protein expression. Cells expressing site 2 gRNA had the highest fraction of fluorescently labeled aerolysin (FLAER)-negative cells, whereas site 1 and 3 gRNAs produced fewer FLAER-negative cells (Fig. 2b,c). This is likely due to lower transcription from the U6 promoter of gRNAs that lack a G nucleotide at the first position (+1) of the gRNA (Fig. 2a). Recent studies have shown that mismatches at the 5ʹ end of gRNAs have no effect on Cas9-mediated cleavage efficiency22,23, so we replaced the first nucleotide with a G for sites 1 and 3, and the number of FLAER-negative cells significantly increased (P < 0.01; Fig. 2b,c). Having a G nucleotide at the first position of the gRNA is therefore beneficial when using a U6 promoter, so we used N19+NGG sites for the gRNA design for lentiviral libraries (see below).

In these experiments, a fraction of cells is negative for both FLAER and BFP (Fig. 2b). Because these cells are FLAER-negative, the gRNA must have been expressed and Piga must have been inactivated. These FLAER-negative cells subsequently became BFP-negative due to proviral silencing24 (Supplementary Fig. 9), resulting in double-negative cells.

We also tested lentiviral expression of gRNAs targeting four mismatch repair (MMR) genes. Cells defective in these MMR genes or Hprt are insensitive to a purine analog, 6-thioguanine (6TG) (Supplementary Fig. 1b). We obtained 6TG-resistant cells by expressing gRNAs targeting the MMR genes (Fig. 2d). These results indicate that the lentiviral expression of gRNAs in Cas9-expressing cells can induce DSBs.
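The +1 G substitution described in this section can be illustrated with a minimal Python sketch. It is not the authors' design pipeline (that is described in the Online Methods). The function names `force_g_at_plus1` and `find_n19_ngg_guides` are hypothetical, the input is assumed to contain only A, C, G and T, and the guide is assumed to be the 20 nt lying between the flanking AACACC and GTTTTA shown in Figure 2a.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def force_g_at_plus1(guide20):
    """Replace the first nucleotide (+1 of the U6 transcript) of a 20-nt guide with G."""
    if len(guide20) != 20:
        raise ValueError("expected a 20-nt guide sequence")
    return "G" + guide20[1:].upper()


def find_n19_ngg_guides(seq):
    """List (strand, start, guide) for every N19+NGG site on both strands.

    `start` is the 0-based index of the first of the 19 nt on the scanned
    strand; `guide` is G followed by those 19 nt (an assumed convention).
    """
    seq = seq.upper()
    guides = []
    for strand, scanned in (("+", seq), ("-", reverse_complement(seq))):
        # Zero-width lookahead so that overlapping sites are all reported.
        for match in re.finditer(r"(?=([ACGT]{19})[ACGT]GG)", scanned):
            guides.append((strand, match.start(), "G" + match.group(1)))
    return guides
```

Under these assumptions, `force_g_at_plus1("CGGATTTGCTGATGTCAGCT")` returns `GGGATTTGCTGATGTCAGCT`, which corresponds to the altered Piga site 3 sequence in Figure 2a. A real design step would also filter for off-target matches and exon position, which this sketch does not attempt.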
Validation of gRNA design

We designed a genome-wide collection of gRNAs (Supplementary Data 1) and chose two gRNAs for each of the 26 genes in the GPI-anchor biosynthesis pathway to test their performance. We transfected JM8-Cas9#5 ESCs with each gRNA expression vector separately and carried out deep-sequencing analyses of the cut sites; flow cytometry analyses; and treatment with alpha-toxin to identify resistant phenotypes.

Deep sequencing analysis revealed that out of the 52 gRNAs analyzed, 50 were able to induce DSBs with an average indel frequency of 12.7 ± 6.7% (Fig. 3a), and gRNAs targeting 17 genes gave rise to a GPI-anchor-synthesis-deficient phenotype (Fig. 3b,c and Supplementary Fig. 10). In most cases, the fraction of FLAER-negative cells observed (Fig. 3b) was comparable to the cutting frequency (Fig. 3a); however, Pigh sites 1 and 2 showed a marked difference. We found that DSBs at Pigh site 2 frequently yielded an in-frame deletion (12 bp), whereas site 1 of this gene was repaired predominantly with a 2-bp deletion (Supplementary Figs. 11 and 12 and Supplementary Table 3). The gene product from the PighΔ12 allele was functional (Supplementary Fig. 13), indicating that the frequency of in-frame deletions will affect gene inactivation efficiency.

We analyzed indel sizes at the 50 sites and found that small (≤9 bp) and large (up to 45 bp) in-frame indels had average frequencies of 22.3 ± 15.9% and 31.8 ± 16.7%, respectively (Fig. 3d,e). We also found that most cut sites had at least one prominent deletion associated with 2- to 4-bp microhomologies (Supplementary Figs. 11 and 12 and Supplementary Table 3). These repair patterns were reproducibly observed (Supplementary Fig. 14), indicating that alternative nonhomologous end joining (NHEJ) is operating in ESCs, as reported previously25. The majority of repairs of CRISPR-Cas9–mediated DSBs involved deletions (83.8 ± 8.0%), although insertions were also observed (Supplementary Fig. 15). We concluded that most of our designed gRNAs are functional and that, importantly, bi-allelic targeting is achievable, supporting the use of these gRNAs for screening recessive genes.
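The in-frame versus frameshift distinction above comes down to whether the indel length is a multiple of three. The following Python sketch shows one way such a tally could be made from a list of per-read indel lengths. It is only an illustration: the function name, the signed-length input format (negative for deletions, positive for insertions) and the bin labels are assumptions, with the size bins taken from the ≤9 bp and 12–45 bp ranges used in Figure 3d,e.

```python
from collections import Counter

CATEGORIES = ("frameshift", "in_frame_small", "in_frame_large", "in_frame_other")


def summarize_in_frame_indels(indel_lengths):
    """Return the fraction of indel reads in each category.

    indel_lengths: iterable of non-zero integers, one per read carrying an
    indel (negative = deletion, positive = insertion).
    """
    counts = Counter()
    for length in indel_lengths:
        size = abs(length)
        if size == 0:
            raise ValueError("reads without an indel must be excluded first")
        if size % 3 != 0:
            counts["frameshift"] += 1      # reading frame is disrupted
        elif size <= 9:
            counts["in_frame_small"] += 1  # 3, 6 or 9 bp
        elif size <= 45:
            counts["in_frame_large"] += 1  # 12 to 45 bp
        else:
            counts["in_frame_other"] += 1  # larger than the bins above
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no indel reads supplied")
    return {name: counts[name] / total for name in CATEGORIES}
```

As with the figure, the denominator is the number of reads with any indel, not the total read count. A site dominated by a 12-bp deletion, as reported for Pigh site 2, would show a high `in_frame_large` fraction under this scheme.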

[Figure 2 graphic. (a) Vector map with elements labeled CMV, RU5, U6, guide sequence, gRNA scaffold, T, PGK, Puro-2A-BFP and ΔU3RU5. gRNA with the endogenous Piga site 3 sequence: AACACCCGGATTTGCTGATGTCAGCTGTTTTA; gRNA with the altered Piga site 3 sequence: AACACCGGGATTTGCTGATGTCAGCTGTTTTA. (b) Flow cytometry plots of FLAER versus BFP for sites 1–3, empty and mock, with the endogenous or the altered +1 sequence. (c) FLAER-negative (%) for endogenous +1 and altered +1 gRNAs at sites 1–3, empty and mock. (d) 6TG-treated and non-treated wells for Mlh1, Msh2, Msh6, Pms2, empty and mock.]

Figure 2 Lentiviral delivery of gRNA expression cassettes. (a) Schematic of the self-inactivating lentiviral vector that expresses gRNA. The Pgk1 promoter-driven puro-2A-BFP cassette was inserted downstream of the gRNA expression cassette. The sequences shown are the gRNAs targeting site 3 of the Piga gene with the endogenous sequence (top) or the altered sequence (bottom). Note that the +1 position of the U6 transcript has been changed to a G nucleotide in the altered sequence. CMV, cytomegalovirus promoter; RU5, 5ʹ long terminal repeat lacking the U3 region; U6, human U6 promoter; T7, U6 terminator; PGK, mouse Pgk1 promoter; BFP, blue fluorescent protein; 2A, self-cleavage peptide; puro, puromycin-resistant gene; ΔU3RU5, enhancer-deleted 3ʹ LTR. (b) Inactivation of the Piga gene by lentiviral delivery of the gRNA expression cassette. Cas9-expressing mouse ESCs were infected with lentivirus expressing the gRNA targeting the indicated sites of the Piga gene with the endogenous (top) or altered (bottom) +1 sequences. Transduced cells were analyzed by flow cytometry 6 d after infection. Empty, an empty lentivirus that does not have a gRNA sequence; mock, mock infection. (c) A summary of the flow cytometry analysis shown in b. Data are shown as mean ± s.d. (n = 2). Student’s t-test was performed. *P < 0.01. (d) Methylene blue staining of Cas9-expressing ESCs transduced with lentivirus expressing gRNAs against the indicated genes. The transduced cells were subjected to 6TG (2 mM) selection.


Construction of a mouse genome-wide lentiviral gRNA library

Figure 4a shows an experimental scheme of the gRNA library–based genetic screening. We designed up to five gRNAs for protein-coding genes in the mouse genome (Online Methods) and identified 87,897 gRNAs covering 94.3% of genes with at least two gRNAs per gene (Fig. 4b and Supplementary Data 2). These gRNAs were cloned into the lentiviral vector shown in Figure 2a. To validate the quality of the library, we sequenced 139 randomly isolated clones by capillary sequencing and found that 121 clones (87.1%) had the correct gRNA sequences (Supplementary Table 4). The library was deep sequenced at a depth of 503×, and 87,802 (99.892%) gRNAs were found to be present (Fig. 4c). Although a small fraction of gRNAs were under- or over-represented, 82% of gRNAs fell within a tenfold difference in frequency (Fig. 4c).

Next, we generated a lentivirus pool and performed two independent infections of Cas9-expressing ESCs, resulting in two mutant ESC libraries. We deep sequenced all the gRNAs in the ESC libraries at a depth of 487× and 527× and found that 97.10% and 98.08% of the gRNAs were present in the ESC libraries, respectively (Fig. 4d). A number of gRNAs were under-represented in the ESC libraries compared to the lentiviral plasmid library (Fig. 4e). There are two possible explanations for this: individual gRNA sequences can affect lentiviral packaging adversely, leading to a reduced viral titer; or the gRNAs inactivate essential genes and thus are depleted in the ESC populations. Gene ontology (GO) analyses on depleted genes showed that GO terms associated with essential biological processes were overrepresented (Supplementary Table 5 and Supplementary Fig. 16). The genes included pluripotency genes, such as Nanog and Pou5f1, and DNA repair genes, such as Rad51 and Brca1, which are all known to be essential for ESC proliferation26–29 (Fig. 4f). By contrast, lineage-specification genes such

[Figure 3 graphic. (a) Reads with indels (%) for the site 1 and site 2 gRNAs of each gene. (b) FLAER-negative (%) for the site 1 and site 2 gRNAs of each gene and the control. (c) Alpha-toxin-treated and non-treated wells for site 1 and site 2 of each gene and wild type (wt). (d,e) In-frame indels (%) at site 1 (d) and site 2 (e), binned as ≤ ±9 bp and ±12–45 bp. Genes shown: Dpm1, Dpm2, Dpm3, Gpaa1, Mpdu1, Pgap2, Piga, Pigb, Pigc, Pigf, Pigg, Pigh, Pigk, Pigl, Pigm, Pign, Pigo, Pigp, Pigq, Pigs, Pigt, Pigu, Pigv, Pigw, Pigx and Pigy.]

Figure 3 Analyses of 52 gRNAs targeting 26 genes involved in the GPI-anchor biosynthesis pathway. (a) The percentage of reads with indels analyzed by deep sequencing for each of the genes and sites tested. gRNAs against two sites were used for each gene. **, no indel was detected. (b) Flow cytometry analysis of cells transfected with the indicated gRNA-expression vectors. pBluescript was used as a control vector. *, not significant when compared to the control by Student’s t-test. All others were significant (P < 0.001). (c) Alpha-toxin treatment of transfected cells. The cells were treated with 1.0 nM alpha-toxin for 48 h and stained with methylene blue after an additional 2 d in culture. (d,e) Percentage of in-frame indels at site 1 (d) and at site 2 (e) for each gene. The y axis shows the number of reads with in-frame indels at the indicated size divided by the total number of reads with all indels at site 1 (d) and at site 2 (e). **, no indel was detected. Data are shown as mean ± s.d. (n = 2–4).

as T, Pax6, Nkx2-5 and Cdx2, appeared at the same frequency in both the ESC libraries and the lentiviral plasmid libraries (Fig. 4f). These results support the second explanation and indicate that the gRNAs are functional.

Screens using the genome-wide lentiviral gRNA library

Using these two mutant ESC libraries, we conducted two recessive screens to identify genes that modulate susceptibility to alpha-toxin and 6TG. In alpha-toxin– and 6TG-resistant cells, we identified 654 and 276 genes, respectively, that were targeted by at least one gRNA (Supplementary Table 6). Figure 5a,b shows those genes targeted by two or more gRNAs, together with known genes involved in the GPI-anchor biosynthesis pathway and the MMR pathway. Out of the 17 genes that when targeted generated toxin-resistant phenotypes in the smaller GPI-anchor biosynthesis pathway screen (Fig. 3), 16 genes were identified in this genome-wide screen (Fig. 5a). The one gene that could not be identified, Pigv, did not have a gRNA designed in the genome-wide library owing to the splice variants predicted for this gene. Thus all known essential components of this pathway have been identified. In addition, there are 13 genes identified in the screen whose association with alpha-toxin resistance had not been reported previously. Of these, seven genes had two independent gRNA hits and six genes had the same gRNA hit in both ESC libraries (Fig. 5a).

Analysis of the genes targeted in the 6TG-resistant cells revealed all known factors, that is, four major known MMR genes and Hprt (Fig. 5b). In addition, six previously unknown genes were identified with two gRNA hits each. Taken together, we have not only identified all of the previously known genes for each screen but have also found genes whose association with the biological agents tested has not been described before.

Genetic validation of candidate genes

We tested whether previously unknown genes obtained from the two screens could be validated, by choosing genes that have at least two independent gRNA hits and are expressed in ESCs. We constructed four or five gRNA expression vectors for each gene and tested these gRNAs individually to see whether they could give rise to resistant cells.
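The candidate selection rule used here (at least two independent gRNA hits, and expression in ESCs) can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the analysis code used for the screens: the function name, the `(gene, gRNA ID)` input format and the optional `expressed_genes` set are all hypothetical.

```python
from collections import defaultdict


def call_candidate_genes(hits, min_grnas=2, expressed_genes=None):
    """Return {gene: number of distinct gRNA hits} for genes passing the cut-off.

    hits: iterable of (gene, grna_id) pairs pooled from the resistant cells of
        all mutant libraries; a gRNA recovered more than once is counted once.
    min_grnas: minimum number of different gRNAs required per gene.
    expressed_genes: optional set of genes expressed in the cell type used;
        when given, genes outside the set are dropped.
    """
    grnas_by_gene = defaultdict(set)
    for gene, grna_id in hits:
        grnas_by_gene[gene].add(grna_id)

    candidates = {}
    for gene, grnas in grnas_by_gene.items():
        if len(grnas) < min_grnas:
            continue
        if expressed_genes is not None and gene not in expressed_genes:
            continue
        candidates[gene] = len(grnas)
    return candidates
```

Because repeated recoveries of one gRNA collapse to a single hit, a gene hit by the same gRNA in both ESC libraries would not pass `min_grnas=2` in this sketch. Raising `min_grnas` to 3 gives a more stringent cut-off.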

For alpha-toxin, none of the gRNAs gave rise to resistant cells at 1.0 nM alpha-toxin at a level similar to that of the gRNA targeting Piga. However, cells transfected with gRNAs targeting four genes (B4galt7, 1700016K19Rik, Cstf3 and Ext2) were resistant at lower toxin concentrations (0.50–0.75 nM) (Fig. 5c,d). Cells transfected with gRNAs for Lypla1 and Trpc2 displayed sensitivity to the toxin similar to that of the wild-type cells and are therefore likely to be false positives (Fig. 5c). In spite of the increased resistance to the toxin, all transfected cells showed normal GPI-anchored protein expression, as revealed by flow cytometry analysis (Supplementary Fig. 17), suggesting that these genes modulate susceptibility to alpha-toxin through a mechanism not directly affecting the GPI-anchor biosynthesis pathway. Different gRNAs targeting the same genes displayed similar phenotypes, suggesting that the phenotype is a direct consequence of gene inactivation. Nevertheless, to rule out the possibility of phenotypes resulting from off-target cleavages, we used cDNA complementation for two of the mutants (B4galt7 and Ext2). In contrast to cells transfected with the control vector, cells transfected with the relevant cDNA expression vector reverted to wild-type sensitivity levels (Fig. 5e). Overexpression of these genes in wild-type ESCs did not change the susceptibility to 0.50 nM alpha-toxin (Fig. 5f). These results indicate that the increased resistance was due to the inactivation of the relevant genes by on-target cleavage. For 6TG, none of the gRNAs targeting the three candidate genes gave rise to resistant cells, suggesting that they might be false positives (Supplementary Fig. 18); however, only one concentration of 6TG was tested.

Comparison of gRNA- and shRNA-based approaches

Many RNAi screens have successfully identified novel genes and pathways that are involved in given phenotypes30–32. To compare the CRISPR-Cas9 screening strategy to RNAi, we cloned validated shRNA sequences for Piga, Pigx, Msh6, Mlh1, Msh2 and Pms2 into our lentiviral vector and compared the gene inactivation efficiencies of gRNAs and shRNAs by observing whether knockout phenotypes were produced under alpha-toxin and 6TG treatment. All the gRNAs tested generated resistant cells (Supplementary Fig. 19), whereas only one shRNA targeting Mlh1 gave rise to a similar proportion of resistant cells. These results indicate that the CRISPR-Cas9–based approach has considerable advantages over RNAi, especially for biological agents with strong killing effects such as toxin.

DISCUSSION

Important to the success of genome-wide gRNA-based genetic screening is the performance (that is, cutting efficiency) of each gRNA. We showed that 50 out of 52 gRNAs tested were functional, albeit with variable cutting frequencies. A recent report by Ran et al. showed that all of the 86 gRNAs they tested were functional in HEK293T cells33.

[Figure 4 graphic. (a) Workflow: lentiviral gRNA library, Cas9-expressing target cells, mutant library, phenotypic screening, enriched mutants, next-generation sequencing of enriched gRNAs, validation of candidate genes. (b) Number of genes versus number of gRNAs/gene. (c,d) Read counts versus gRNA ranked by counts. (e) Normalized read counts (ESC library 1) versus read counts (lentiviral library). (f) Fold change by gene and ESC library (1 and 2) for Nanog, Pou5f1, Rad51, Brca1, T, Pax6, Nkx2-5 and Cdx2.]

Figure 4 Generation of a mouse genome-wide lentiviral gRNA library. (a) Schematic of genetic screening with genome-wide lentiviral gRNA libraries. (b) The number of gRNAs designed per gene in the genome-wide library. (c,d) Deep-sequencing analyses of the gRNAs in the lentiviral plasmid DNA library (c) and ESC library 1 (d). (e) Scatter plots comparing gRNA frequencies in the original lentiviral plasmid DNA and in the ESC library 1. gRNA counts in the ESC libraries have been normalized against the gRNA counts from the lentiviral library. (f) Fold changes of read counts between the lentiviral plasmid DNA library and the ESC libraries. The same gRNAs in the two ESC libraries are linked with lines. Mann-Whitney U test was performed by comparing gRNAs of each gene from each library with all gRNAs in the corresponding ESC library.
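The comparison between the plasmid library and an ESC library in panels e and f rests on normalized read counts and per-gRNA fold changes. The Python sketch below shows one plausible way to compute them. The normalization used here (scaling ESC counts to the plasmid library's total read count) is an assumption, since the caption does not spell out the procedure, and the function names and gRNA identifiers are hypothetical.

```python
def fold_changes(plasmid_counts, esc_counts):
    """Return {grna_id: fold change} of an ESC library relative to the plasmid library.

    Both arguments map gRNA IDs to raw read counts. ESC counts are scaled so
    that both libraries have the same total (assumed normalization). gRNAs
    with zero plasmid reads are skipped because the ratio is undefined.
    """
    plasmid_total = sum(plasmid_counts.values())
    esc_total = sum(esc_counts.values())
    if plasmid_total == 0 or esc_total == 0:
        raise ValueError("both libraries need at least one read")
    scale = plasmid_total / esc_total

    changes = {}
    for grna_id, plasmid_reads in plasmid_counts.items():
        if plasmid_reads == 0:
            continue
        changes[grna_id] = esc_counts.get(grna_id, 0) * scale / plasmid_reads
    return changes


def fraction_detected(counts, library_size):
    """Fraction of the designed gRNAs seen with at least one read."""
    detected = sum(1 for reads in counts.values() if reads > 0)
    return detected / library_size
```

In this scheme a fold change near 1 means that the gRNA kept its share of the library, while values well below 1 mark depleted gRNAs, as reported for those targeting Nanog, Pou5f1, Rad51 and Brca1. `fraction_detected` corresponds to coverage figures such as the share of designed gRNAs found in a sequenced library.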

[Figure 5 graphic. (a,b) No. of gRNAs (0–6) per gene, shaded as no hit, hit in one of the two ESC libraries, or hit in both ESC libraries. (a) GPI pathway genes (Dpm1, Dpm2, Dpm3, Gpaa1, Mpdu1, Pgap2, Piga, Pigb, Pigc, Pigf, Pigg, Pigh, Pigk, Pigl, Pigm, Pign, Pigo, Pigp, Pigq, Pigs, Pigt, Pigu, Pigv, Pigw, Pigx, Pigy) followed by *1700016K19Rik, *B4galt7, *Cstf3, *Ext2, *Lypla1, Olfr1206, *Trpc2, Cmtm7, Gm5108, Ifna2, Olfr866, Slc25a25 and Tas2r138. (b) Hprt and MMR genes (Mlh1, Msh2, Msh6, Pms2) followed by *GM15293, *Letmd1, Olfr815, *Prkg1 and Tmem8c. (c,d) Wells treated with alpha-toxin at 0–1.00 nM for gRNAs targeting Piga, B4galt7, Cstf3, Ext2, 1700016K19Rik, Lypla1 and Trpc2, and for wild-type (Wt) cells. (e,f) Wells non-treated or treated with 0.50 nM alpha-toxin after transfection with control or cDNA vectors.]

Figure 5 Genetic screens using the genome-wide gRNA library and genetic validation assays of the novel candidate genes. (a,b) Genes with multiple gRNA hits in cells resistant to alpha-toxin (a) and 6TG (b). All known genes involved in the GPI-anchor synthesis pathway are shown in a. *, genes chosen for further validation. MMR, mismatch repair. (c) Validation analysis of the candidate genes identified in the alpha-toxin–resistance screen. Cas9-expressing cells were transfected separately with gRNAs targeting the indicated genes. Different gRNAs were used in each well. Six days after transfection, the cells were treated with alpha-toxin at the indicated concentration. The cells were stained with methylene blue 4 d later. (d) Increased resistance to alpha-toxin by inactivation of B4galt7 and Ext2. Cells that survived at 0.50 nM alpha-toxin were treated with alpha-toxin again at the indicated concentration. (e) cDNA complementation assay. Upon expression of the relevant cDNA, the mutant cells reverted to the wild-type sensitivity to the toxin. (f) Overexpression of the cDNAs does not change the sensitivity of the cells to 0.50 nM alpha-toxin.

Thus, a high success rate in designing functional gRNAs seems to be consistent across various cell types.

Our deep sequencing analyses also revealed frequent microhomology-mediated repair. In some cases, such as site 1 of Pigk and site 2 of Pigh, this repair mechanism produced a higher frequency of in-frame deletions than expected, leading to a reduction in the frequency of generating null mutations. It might be possible to predict deletion patterns based on the target site sequences and design gRNAs that have higher chances of generating out-of-frame deletions.

Recent studies have shown evidence of off-target cleavages using the CRISPR-Cas9 system in mammalian cells22,23,34,35. However, the frequencies vary among the studies, and higher GC content of the gRNAs might be associated with higher frequencies22,23,35. Site 2 of the Piga gene, which we tested, has 55% GC content, and out of 275 potential off-target sites analyzed, including sites with bulge-type mismatches and the NAG PAM, we found evidence of only two off-target cleavages. It is clear that off-target cleavages can occur through imperfect hybridization between the gRNA and the genomic DNA. Nevertheless, well-designed gRNAs, with no off-target sites in exons, are suitable for use as genetic tools. More comprehensive analyses of off-target cleavages are necessary to discern the true extent of these effects.

We have shown that the mismatch at the 5ʹ end of a gRNA that introduces a G at this position increases the cutting frequencies of lentivirally expressed gRNAs, presumably through enabling more efficient transcription from the U6 promoter. The design of gRNAs therefore need not be restricted to sites with GN19NGG; sites with N19NGG can be used as CRISPR target sites. This new design greatly increases the repertoire of gRNAs available for use.

We conducted two genetic screens using the genome-wide gRNA libraries and identified known components that modulate susceptibility to the biological agents tested and also previously unknown genes. Given our evidence that most gRNAs lead to cleavages and evidence that the CRISPR-Cas9 system can produce null mutations, gRNA-based genetic screening is expected to produce gene hits at a high signal-to-noise ratio. As it is unlikely that two gRNAs targeting the same gene have similar off-target effects on the phenotype of interest, we used a cut-off of a minimum of two different gRNA hits per gene to select candidate genes for validation. Candidate genes can be further narrowed down by focusing on genes that are expressed in the cell type used. If we assume that candidates not expressed in ESCs are false positives, we obtained three and five false positives in the alpha-toxin and 6TG screens, respectively. If a more stringent cut-off of a minimum of three different gRNA hits is used, there are no false positives.

It is noteworthy that ~95% of genes identified had a single gRNA hit, and we did not analyze these further. There are several possible reasons for the presence of this group of genes: ‘true’ genes were hit only once; the gRNAs in this group induced DSBs in a different ‘true’ gene through an off-target effect; cells with the phenotype have multiple lentivirus integrations, and the gRNAs in this group were ‘passenger’ gRNAs in cells that also had another ‘driver’ gRNA; or the gRNAs in this group were derived from either PCR contamination or cells that had survived under selection without having a resistance phenotype. Because construction of gRNAs can be multiplexed, validation of the limited number of candidates is readily conducted by CRISPR-Cas9–mediated knockout. Alternatively, secondary libraries, which include all initial hit genes with an increased number of gRNAs per gene, could be constructed and used to further narrow down the candidates.

Genome-wide lentiviral gRNA libraries as tools of genome-wide mutagenesis have several advantages over existing mutagenesis methods. In particular, creating null mutations by the CRISPR-Cas9 system could overcome one of the major limitations of RNAi, that is, incomplete suppression of gene expression.
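The hit-calling rule described above (a gene becomes a candidate when at least two different gRNAs targeting it are recovered, or three under the more stringent cut-off) can be illustrated with a minimal Python sketch. The function name, the input layout of (gene, gRNA) pairs and the toy gene labels are assumptions made for illustration; they are not taken from the analysis pipeline used in this study.

```python
from collections import defaultdict


def candidate_genes(grna_hits, min_grnas=2):
    """Return {gene: number of distinct gRNAs} for genes that reach the cut-off.

    grna_hits is an iterable of (gene, grna_id) pairs recovered from
    surviving cells; repeated observations of one gRNA count once.
    """
    grnas_per_gene = defaultdict(set)
    for gene, grna_id in grna_hits:
        grnas_per_gene[gene].add(grna_id)
    return {
        gene: len(ids)
        for gene, ids in grnas_per_gene.items()
        if len(ids) >= min_grnas
    }


# Hypothetical toy input, not data from the screens.
toy_hits = [
    ("GeneA", "GeneA_g1"), ("GeneA", "GeneA_g3"), ("GeneA", "GeneA_g3"),
    ("GeneB", "GeneB_g2"), ("GeneB", "GeneB_g4"), ("GeneB", "GeneB_g5"),
    ("GeneC", "GeneC_g1"),
]

two_hit_candidates = candidate_genes(toy_hits)                 # default cut-off of two
three_hit_candidates = candidate_genes(toy_hits, min_grnas=3)  # stringent cut-off
```

Counting distinct gRNAs rather than reads reflects the reasoning in the text: independent gRNAs against the same gene are unlikely to share an off-target effect, whereas read depth for a single gRNA says nothing about specificity.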
Furthermore, the finding that gRNAs targeting genes involved in pluripotency maintenance and essential biological processes were depleted from the ESC libraries suggests that population-based analyses of gRNA-mediated mutant libraries might be possible. A lentiviral genome-wide gRNA library will have wide applicability and represents a promising platform for functional genomics.

METHODS
Methods and any associated references are available in the online version of the paper.

Accession codes. Sequence data have been deposited at the European Nucleotide Archive under accession number ERP003292.

Note: Any Supplementary Information and Source Data files are available in the online version of the paper.

ACKNOWLEDGMENTS
We thank A. Bradley for comments on the manuscript, B. Ng and W. Cheng for the flow cytometry analyses, and the Sanger Institute DNA pipeline for the sequence analyses. We also thank J. Takeda and T. Kinoshita for providing us alpha-toxin and the cDNA expression vectors, respectively. Y.L. is supported by the Wellcome Trust PhD program. M.D.C.V.-H. is supported by the Cancer Research UK and Wellcome Trust PhD program. This work was supported by Wellcome Trust (WT077187). The mouse CRISPR library is available through Addgene. The plasmid DNAs are available at the Wellcome Trust Sanger Institute Archives (http://www.sanger.ac.uk/technology/clonerequests/).

AUTHOR CONTRIBUTIONS
K.Y. conceived the research and wrote the manuscript with comments from all authors. H.K.-Y., E.-P.T. and K.Y. performed the experiments. Y.L. and M.D.C.V.-H. performed the bioinformatics analyses.

COMPETING FINANCIAL INTERESTS
The authors declare competing financial interests: details are available in the online version of the paper.

nature biotechnology volume 32 NUMBER 3 MARCH 2014

Reprints and permissions information is available online at http://www.nature.com/reprints/index.html.

1. Forsburg, S.L. The art and design of genetic screens: yeast. Nat. Rev. Genet. 2, 659–668 (2001).
2. Jorgensen, E.M. & Mango, S.E. The art and design of genetic screens: Caenorhabditis elegans. Nat. Rev. Genet. 3, 356–369 (2002).
3. Boutros, M. & Ahringer, J. The art and design of genetic screens: RNA interference. Nat. Rev. Genet. 9, 554–566 (2008).
4. Moffat, J. et al. A lentiviral RNAi library for human and mouse genes applied to an arrayed viral high-content screen. Cell 124, 1283–1298 (2006).
5. Iorns, E., Lord, C.J., Turner, N. & Ashworth, A. Utilizing RNA interference to enhance cancer drug discovery. Nat. Rev. Drug Discov. 6, 556–568 (2007).
6. Carette, J.E. et al. Haploid genetic screens in human cells identify host factors used by pathogens. Science 326, 1231–1235 (2009).
7. Carette, J.E. et al. Global gene disruption in human cells to assign genes to phenotypes by deep sequencing. Nat. Biotechnol. 29, 542–546 (2011).
8. Carette, J.E. et al. Ebola virus entry requires the cholesterol transporter Niemann-Pick C1. Nature 477, 340–343 (2011).
9. Leeb, M. & Wutz, A. Derivation of haploid embryonic stem cells from mouse embryos. Nature 479, 131–134 (2011).
10. Yang, H. et al. Generation of genetically modified mice by oocyte injection of androgenetic haploid embryonic stem cells. Cell 149, 605–617 (2012).
11. Elling, U. et al. Forward and reverse genetics through derivation of haploid mouse embryonic stem cells. Cell Stem Cell 9, 563–574 (2011).
12. Yang, H. et al. Generation of haploid embryonic stem cells from Macaca fascicularis monkey parthenotes. Cell Res. 23, 1187–1200 (2013).
13. Urnov, F.D., Rebar, E.J., Holmes, M.C., Zhang, H.S. & Gregory, P.D. Genome editing with engineered zinc finger nucleases. Nat. Rev. Genet. 11, 636–646 (2010).
14. Joung, J.K. & Sander, J.D. TALENs: a widely applicable technology for targeted genome editing. Nat. Rev. Mol. Cell Biol. 14, 49–55 (2013).
15. Cho, S.W., Kim, S., Kim, J.M. & Kim, J.S. Targeted genome engineering in human cells with the Cas9 RNA-guided endonuclease. Nat. Biotechnol. 31, 230–232 (2013).
16. Cong, L. et al. Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819–823 (2013).
17. Mali, P. et al. RNA-guided human genome engineering via Cas9. Science 339, 823–826 (2013).
18. Wang, H. et al. One-step generation of mice carrying mutations in multiple genes by CRISPR/Cas-mediated genome engineering. Cell 153, 910–918 (2013).
19. Takeda, J. et al. Deficiency of the GPI anchor caused by a somatic mutation of the PIG-A gene in paroxysmal nocturnal hemoglobinuria. Cell 73, 703–711 (1993).
20. Kinoshita, T., Fujita, M. & Maeda, Y. Biosynthesis, remodelling and functions of mammalian GPI-anchored proteins: recent progress. J. Biochem. 144, 287–294 (2008).
21. Gordon, V.M. et al. Clostridium septicum alpha toxin uses glycosylphosphatidylinositol-anchored protein receptors. J. Biol. Chem. 274, 27274–27280 (1999).
22. Fu, Y. et al. High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells. Nat. Biotechnol. 31, 822–826 (2013).
23. Hsu, P.D. et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nat. Biotechnol. 31, 827–832 (2013).
24. Ellis, J. Silencing and variegation of gammaretrovirus and lentivirus vectors. Hum. Gene Ther. 16, 1241–1246 (2005).
25. Bennardo, N., Cheng, A., Huang, N. & Stark, J.M. Alternative-NHEJ is a mechanistically distinct pathway of mammalian chromosome break repair. PLoS Genet. 4, e1000110 (2008).
26. Nichols, J. et al. Formation of pluripotent stem cells in the mammalian embryo depends on the POU transcription factor Oct4. Cell 95, 379–391 (1998).
27. Mitsui, K. et al. The homeoprotein Nanog is required for maintenance of pluripotency in mouse epiblast and ES cells. Cell 113, 631–642 (2003).
28. Hakem, R. et al. The tumor suppressor gene Brca1 is required for embryonic cellular proliferation in the mouse. Cell 85, 1009–1023 (1996).
29. Tsuzuki, T. et al. Targeted disruption of the Rad51 gene leads to lethality in embryonic mice. Proc. Natl. Acad. Sci. USA 93, 6236–6240 (1996).
30. Luo, J. et al. A genome-wide RNAi screen identifies multiple synthetic lethal interactions with the Ras oncogene. Cell 137, 835–848 (2009).
31. Agaisse, H. et al. Genome-wide RNAi screen for host factors required for intracellular bacterial infection. Science 309, 1248–1251 (2005).
32. Karlas, A. et al. Genome-wide RNAi screen identifies human host factors crucial for influenza virus replication. Nature 463, 818–822 (2010).
33. Ran, F.A. et al. Double nicking by RNA-guided CRISPR Cas9 for enhanced genome editing specificity. Cell 154, 1380–1389 (2013).
34. Mali, P. et al. CAS9 transcriptional activators for target specificity screening and paired nickases for cooperative genome engineering. Nat. Biotechnol. 31, 833–838 (2013).
35. Pattanayak, V. et al. High-throughput profiling of off-target DNA cleavage reveals RNA-programmed Cas9 nuclease specificity. Nat. Biotechnol. 31, 839–843 (2013).


ONLINE METHODS

Plasmid construction. Primer sequences are listed in Supplementary Table 7. All PCR-generated fragments were verified by Sanger sequencing. All plasmids are available upon request at the Wellcome Trust Sanger Institute Archives (http://www.sanger.ac.uk/technology/clonerequests/). The humanized Cas9 expression vector17 was obtained from Addgene (41815). We modified the vector as follows. First, the AgeI-BstZ17I region in the vector was replaced with a PCR-generated fragment containing the bovine growth hormone polyadenylation signal sequence (bpA). Second, the MluI-NcoI region containing the CMV promoter was replaced with the human EF1a promoter, resulting in pEF1a-Cas9. pPB-LR5.1-EF1a-puro2ACas9 was constructed as follows. First, we removed the NotI and the AscI sites from pPB-LR5 (ref. 36) by cloning the MluI-XbaI fragment containing the piggyBac transposon into the MluI-XbaI site of PCR-generated pBluescript, resulting in pPB-LR5.1. Second, the CAG promoter (the NheI-ClaI fragment of pPB-CAG.EBNXN (ref. 37)) and bpA (the PCR-generated ClaI-XhoI fragment) were cloned into the NheI-SalI site of pPB-LR5.1, resulting in pPB-LR5.1-CAG. The PCR-generated MfeI-PacI fragment containing puroT2A-GFP was then cloned into the EcoRI-PacI site of pPB-LR5.1-CAG, resulting in pPB-LR5.1-CAGpuro2AGFP. Separately, the fragment containing the human EF1a promoter was PCR-generated using a BAC clone, RP11-159L14, as a template and cloned into the pPB vector together with the GFP fragment, resulting in pPB-EF1a-GFP. The NheI-AscI fragment containing the hEF1a promoter was excised from pPB-EF1a-GFP and cloned into the NheI-AscI site of pPB-LR5.1-CAGpuro2AGFP, resulting in pPB-LR5.1-EF1a-puro2AGFP. Finally, the NcoI-NotI fragment containing Cas9 was cloned into the NcoI-NotI site of pPB-LR5.1-EF1a-puro2AGFP, resulting in pPB-LR5.1-EF1a-puro2ACas9.
The gRNA cloning vector, pU6-gRNA(BbsI), was constructed by cloning a gBlock fragment (IDT) containing the human U6 promoter, the gRNA cloning site (Supplementary Fig. 3) and the gRNA scaffold, into the XhoI-BamHI site of pBluescriptII. The lentiviral gRNA expression vector, pKLV-U6gRNA(BbsI)-PGKpuro2ABFP, was constructed as follows. First, a new lentiviral backbone vector, pKLV, was constructed. A vector containing the multicloning site, SpeI-ApaI-MluI-XhoI-AscI-BamHI-NotI-KpnI-EagI-PacI, was generated by PCR using pBluescript as a template, resulting in pBS-MCS-KLV. The modified 3ʹ LTR followed by bpA was synthesized (GeneArt) and cloned into the KpnI-PacI site of pBS-MCS-KLV, resulting in pBS-3LTRbpA. The SpeI-ApaI fragment containing the CMV promoter, the 5ʹ R/U5 region and the packaging signal sequence was excised from FUW-OSKM (ref. 38; Addgene, 20328) and cloned into pBS-3LTRbpA, resulting in pKLV. Second, the PGK-puro2ABFP cassette was constructed as follows. Fragments containing PGK-puro2A and 2ABFP were PCR-generated. The BbsI site within the BFP coding sequence39 was mutated during this PCR process. The full-length coding sequence of puro-2A-BFP was generated by fusing PGK-puro2A and 2ABFP in the second PCR reaction and then cloned into the BamHI-NotI site of pBluescript, resulting in pPGK-puro2ABFP. Finally, the XhoI-BamHI fragment from pU6-gRNA(BbsI) and the BamHI-NotI fragment of pPGK-puro2ABFP were cloned into the XhoI-NotI site of pKLV, resulting in pKLV-U6gRNA(BbsI)-PGKpuro2ABFP. Individual gRNA expression vectors were constructed as follows. Top and bottom 26-nt oligonucleotides (Supplementary Table 8) were mixed at 10 µM each in 10 mM Tris-HCl (pH 8.0) and 5 mM MgCl2 in a total volume of 100 µl. The mixture was incubated at 95 °C for 5 min and cooled to room temperature. The duplex oligonucleotides were then cloned into the BbsI site of pU6-gRNA(BbsI) or pKLV-U6gRNA(BbsI)-PGKpuro2ABFP. cDNA expression vectors were constructed as follows.
cDNAs for B4galt7 and Ext2 were PCR-amplified using cDNAs from JM8 mouse ESCs as a PCR template, digested with SalI/BsrGI (B4galt7) or SalI/EcoRI (Ext2), and cloned into the SalI/BsrGI or SalI/EcoRI site of pPB-EF1a-GFP, respectively. The coding sequence of PighΔ12 was PCR-amplified using two primer pairs (U1 and L1; U2 and L2) and cloned into the SalI/EcoRI site of pPB-EF1a-GFP using Gibson Assembly Master Mix (NEB). A full-length Pigh coding sequence was also PCR-amplified with a primer pair (U1 and L2) and cloned into the SalI/EcoRI site of pPB-EF1a-GFP. cDNA expression vectors for the GPI-anchor biosynthesis genes20 were kindly provided by T. Kinoshita. shRNA expression vectors were constructed as follows. Top and bottom oligonucleotides (Supplementary Table 9) were mixed at 3 µM each in 10 mM Tris-HCl (pH 8.0) and 5 mM MgCl2 in a total volume of 50 µl. The mixture was incubated first at 95 °C for 5 min, then at 70 °C for 10 min. Subsequently, the mixture was cooled to room temperature. The duplex oligonucleotides were then cloned into the BbsI site of pKLV-U6gRNA(BbsI)-PGKpuro2ABFP. Apart from the position of central PPE and the addition of gRNA scaffold, the configuration of our vector is essentially the same as that of the RNAi consortium vector, pLKO.1 (ref. 4). As the shRNA expression unit has the U6 terminator sequence, the gRNA scaffold should not interfere with shRNA transcription.

Genome-wide mouse gRNA design. A BED file containing the exonic coordinates of all protein coding genes (ENSEMBL release 71) on the mouse reference genome GRCm38 was obtained. Overlapping coordinates were merged using BEDtools (ref. 40). The sequences of each genomic interval in the BED file, with an additional 20 nucleotides on both sides of the intervals, were retrieved and used to identify all sequences comprising 5ʹ-GN20GG-3ʹ. To avoid off-target cleavages, only gRNAs that matched stringent conditions were chosen: from position 8, the 5ʹ-N14GG-3ʹ of each gRNA only had a single match to the mouse genome. A total of 325,638 sites was identified (Supplementary Data 1).

Genome-wide mouse gRNA design for a lentiviral library. In order to generate a genome-wide lentiviral gRNA library, we designed an additional set of genome-wide gRNAs. A BED file containing the consensus coding sequence (CCDS) on the mouse reference genome GRCm38 (CCDS released on 08/14/2012) was obtained. When genes have multiple CCDS transcripts, we only chose the overlapping regions. The sequences of each genomic interval were retrieved and used to identify all sequences comprising 5ʹ-N19NGG-3ʹ. Filtering was performed as follows: First, sites with more than 1 perfect hit in any of the Ensembl exons were removed.
Second, off-target sites of each candidate gRNA were examined with the following two options: (i) N12NGG without any mismatches and (ii) N20NGG with up to three mismatches. Third, gRNAs that are positioned at least 100 bp away from the translation initiation site and in the first half of coding sequences were collected. Finally, up to 5 gRNAs were chosen for each gene, prioritizing gRNAs with fewer predicted off-target sites. A total of 87,897 gRNA sequences was chosen (Supplementary Data 2).

Genome-wide mouse gRNA lentiviral library construction. A 79-mer oligo pool was purchased from CustomArray Inc. The oligo sequences are 5ʹ-GCAGATGGCTCTTTGTCCTAGACATCGAAGACAACACCGN19GTTTTACAGTCTTCTCGTCGC-3ʹ, where N19 indicates each of the 87,897 gRNA sequences. The single-stranded oligos were converted to double-stranded DNA by PCR using Q5 Hot Start High-Fidelity 2× Master Mix (NEB) with 32 fmol of the oligo as template and primers (79mer-U1 and –L1) using the following conditions: 98 °C for 10 s, 10 cycles of 98 °C for 10 s, 64 °C for 15 s and 72 °C for 15 s, and the final extension, 72 °C for 2 min. The PCR products were purified with the Nucleotide Removal kit (Qiagen), digested with BbsI and separated by PAGE. The 26-bp fragment was excised from the PAGE gel. Ligation was performed using 1.5 ng of the 26-bp fragment and 20 ng of BbsI-digested pKLV-U6gRNA(BbsI)-PGKpuro2ABFP with T4 DNA ligase (NEB). Two ligation reactions were combined, column-purified using MinElute PCR purification kit (Qiagen) and eluted into 10 µl water. Three electroporations were carried out using 1 µl of the purified ligation product and 25 µl of NEB10-beta electrocompetent cells (NEB) per reaction according to the manufacturer’s instruction. The electroporated cells were combined and plated onto sixty 15-cm LB + ampicillin agar plates. Resulting bacteria from 30 plates (170× library complexity) were scraped off and combined. The plasmid DNA was purified with Plasmid Maxi kit (Qiagen).
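The site search and the final selection steps of the lentiviral library design described above (enumerating 5ʹ-N19NGG-3ʹ sites, keeping gRNAs at least 100 bp from the translation initiation site and within the first half of the coding sequence, then taking up to five per gene with the fewest predicted off-target sites) can be sketched in Python as follows. This is an illustrative sketch only: the function names, the tuple layout and the tie-break by position are assumptions, and the off-target counts are taken as given rather than computed.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq, protospacer_len=19):
    """List (start, strand, protospacer, pam) for every N19-NGG site.

    `start` is the leftmost forward-strand coordinate (0-based) of the
    site including its PAM. A lookahead is used so overlapping sites are
    all reported.
    """
    seq = seq.upper()
    site_len = protospacer_len + 3
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    sites = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            start = match.start()
            if strand == "-":
                # Convert the reverse-strand offset to forward coordinates.
                start = len(seq) - (match.start() + site_len)
            sites.append((start, strand, match.group(1), match.group(2)))
    return sites


def select_grnas(candidates, cds_length, max_per_gene=5, min_offset=100):
    """Pick up to `max_per_gene` gRNAs for one gene.

    candidates: list of (cds_position, n_predicted_offtargets, sequence).
    Ties in off-target count are broken by position (an assumption).
    """
    eligible = [c for c in candidates if min_offset <= c[0] < cds_length / 2]
    return sorted(eligible, key=lambda c: (c[1], c[0]))[:max_per_gene]
```

For example, `find_sites("A" * 19 + "TGG")` reports a single plus-strand site starting at position 0, and the same sequence reverse-complemented reports the corresponding minus-strand site.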
Cell culture and transfection. Male mouse ESCs (JM8; ref. 41) were cultured on mitomycin C–treated MEFs in M15L: Knockout DMEM (Invitrogen) supplemented with 15% FBS (PAA), 1% GlutaMax (Invitrogen), 1% nonessential amino acids (Invitrogen), 0.1 mM 2-mercaptoethanol and 1,000 U ml^-1 leukemia inhibitory factor (LIF; Millipore). 293FT cells (Invitrogen) were cultured in DMEM containing 10% FBS and 1% GlutaMax. All cell lines used in this study are negative for mycoplasma contamination. Transient reverse transfection of ESCs was carried out using Lipofectamine LTX (Invitrogen) according to the manufacturer’s instruction. Briefly, 100 ng of plasmid DNA and 0.1 µl of the PLUS reagent were mixed into 10 µl OPTI-MEM (Invitrogen) and incubated for 5 min at room temperature. 0.3 µl of the LTX reagent was diluted into 10 µl of OPTI-MEM and combined with the DNA:PLUS mixture. This was incubated for 30 min at room temperature. Subsequently, 15,000 ESCs suspended in 80 µl OPTI-MEM were mixed with 20 µl of the DNA:PLUS:LTX mixture and plated onto a well of a 96-well plate containing feeder cells. These cells were incubated for 1 h at 37 °C. The transfection mixture was then removed and 150 µl of ESC medium was added. The transfected cells were cultured for 6–7 d before relevant functional analysis. Single-copy transgenesis using the piggyBac transposon system was carried out as described previously42. Briefly, the piggyBac transposase expression vector, pCMV-mPBase (5 µg) and a transposon vector (100 ng) were electroporated into 1 × 10^6 ESCs at 230 V and 500 µF using Gene Pulser II (Bio-Rad) and plated onto a 10-cm dish. Two days later, drug selection was initiated. The resulting colonies were picked and further expanded.

doi:10.1038/nbt.2800


Flow cytometry. Fluorophore (Alexa488)-labeled aerolysin (FLAER) was purchased (VH bio) and used for cell staining at 25 nM in 1% BSA in PBS for 20 min at room temperature. The stained cells were analyzed on the LSRII or LSRFortessa instrument (BD). Data were subsequently analyzed using FlowJo.

Alpha-toxin treatment. ESCs were dissociated into single-cell suspension and plated onto a gelatin-coated plate at a density of 9 × 10^4 cells cm^-2 in a volume of 220 µl cm^-2 with the indicated concentrations of alpha-toxin. The cells were cultured for 48 h and then the medium was replaced with fresh M15L medium daily until staining with methylene blue or harvesting for downstream analysis.

6-thioguanine treatment. ESCs were dissociated into single-cell suspension and plated onto pSNL feeder plates at a density of 5 × 10^6 cells per 10-cm dish for the MMR screening or 2.5 × 10^4 cells per well of a 12-well plate for comparison of gene inactivation efficiencies between gRNA and shRNA. On the following day, the medium was replaced with a selective medium containing 2 µM 6-thioguanine (Sigma). Two µM is the lowest concentration that can be used with no background (see ref. 43). The selection was continued for 5 d and the cells were cultured for an additional 5 d without 6-TG.

cDNA complementation assay. Mutant cells (1 × 10^6 cells) were transfected with a mixture of cDNA expression vector (2.25 µg) and pPB-EF1a-GFP (0.25 µg) using Lipofectamine LTX. As a negative control, pBluescriptII was used. Two days after transfection, GFP-positive cells were sorted using MoFlo XDP (Beckman). Immediately after cell sorting, 5 × 10^4 cells were treated at the indicated concentration of alpha-toxin in a 96-well plate for 48 h. The cells were further cultured in M15L medium until staining.

Lentivirus production and transduction.
3 µg of a lentiviral vector, 9 µg of ViraPower Lentiviral Packaging Mix (Invitrogen) and 12 µl of the PLUS reagent were added to 3 ml of OPTI-MEM and incubated for 5 min at room temperature. 36 µl of the LTX reagent was then added to this mixture and further incubated for 30 min at room temperature. The transfection complex was added to 80% confluent 293FT cells in a 10-cm dish and incubated for 3 h. The medium was replaced with fresh medium 24 h after transfection. Viral supernatant was harvested 48 h after transfection and stored at –80 °C. Transduction of ESCs was performed in suspension as follows: 15,000 ESCs and diluted virus were mixed in 100 µl of the ESC medium containing 8 µg ml^-1 polybrene (Millipore), incubated for 30 min at 37 °C in a well of a round-bottomed 96-well plate, plated onto a well of a feeder-containing 96-well plate and cultured until functional analyses. Transduction volumes were scaled up according to the areas of the culture plates if necessary.

Generation of genome-wide ESC mutant libraries and screening. 1.0 × 10^7 ESCs (JM8-Cas9#5) were infected with the genome-wide gRNA lentiviral library at an MOI of 0.3. Two independent infections were conducted, thus producing two independent ESC libraries. Three days after infection, 2.0 × 10^6 BFP-positive cells were sorted for each of the libraries and cultured for an additional 4 d. For each of the 2 ESC libraries, 6 × 10^6 or 10 × 10^6 mutant ESCs were treated with alpha-toxin (1.0 nM) for 48 h or 6TG (2 µM) for 5 d, respectively, and further cultured for an additional 5 d. Surviving cells were pooled per library, and genomic DNA was extracted and used for PCR templates.

Off-target site prediction and cleavage analysis. Twenty-nucleotide guide sequences were mapped to the mouse reference genome (GRCm38) using BWA aln with the following option: -n 5 -o 0 -l 20 -N (ref. 44). Subsequently, the mapped positions that were followed by the PAM sequences (NGG or NAG) were extracted as potential off-target sites. Potential off-target sites with bulge structures were identified by mapping the 20-bp gRNA sequences to the mouse genome using BWA aln with the following option: -n 3 -o 1 -k 3 -N. All potential off-target sites with a maximum of five mismatches (with or without bulges) for site 2 gRNA of the Piga gene are listed in Supplementary Data 3–5. For off-target cleavage analysis, we excluded sites for which specific primers could not be designed because of the presence of repetitive elements. We designed PCR primers using Batch Primer3 and selected 95 off-target sites for each of the NGG and NAG PAMs and 41 and 44 off-target sites with bulges + NGG and bulges + NAG, respectively. Each locus was individually amplified using genomic DNA derived from 20 doubly transgenic ESC lines and transiently transfected ESCs (day 4) with Phusion High-Fidelity polymerase in GC buffer (Thermo Scientific). PCR products were pooled and purified using QIAquick PCR Purification Kit (Qiagen). Five hundred nanograms of the purified PCR products were ligated with Illumina adaptors45 using NEBNext DNA Library Prep Master Mix (NEB) according to the manufacturer’s protocols. The adaptor-ligated products (7% of the input material) were used for PCR enrichment45 with KAPA HiFi HotStart ReadyMix with the following PCR conditions: 98 °C for 30 s, 7 cycles of 98 °C for 10 s, 66 °C for 15 s and 72 °C for 20 s, and the final extension, 72 °C for 5 min. The PCR products were purified with Agencourt AMPure XP beads (Beckman) in a PCR-product-to-bead ratio of 1:0.7. The purified libraries were quantified and sequenced on Illumina MiSeq by 250-bp paired-end sequencing. Each read was mapped to a custom reference sequence using BWA-SW (ref. 44).
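The off-target prediction step described above (finding genomic positions that match the 20-nt guide with a limited number of mismatches and are followed by an NGG or NAG PAM) can be illustrated with a small brute-force Python sketch. This is not the BWA-based procedure used in the study: it is a naive scan suitable only for short test sequences, it ignores bulges, and the function names and return layout are assumptions made for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def hamming(a, b):
    """Number of mismatching positions between equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def offtarget_sites(guide, genome, max_mismatches=5, pam_suffixes=("GG", "AG")):
    """List (strand, offset, mismatches, pam) for candidate sites.

    A site is a window within `max_mismatches` of `guide` that is
    immediately followed by a 3-nt PAM ending in one of `pam_suffixes`
    (NGG or NAG by default). `offset` is 0-based on the scanned strand,
    so minus-strand offsets refer to the reverse-complemented sequence.
    """
    guide = guide.upper()
    n = len(guide)
    hits = []
    for strand, seq in (("+", genome.upper()),
                        ("-", reverse_complement(genome.upper()))):
        # The last valid window must leave room for the 3-nt PAM.
        for i in range(len(seq) - n - 2):
            pam = seq[i + n:i + n + 3]
            if pam[1:] not in pam_suffixes:
                continue
            mismatches = hamming(guide, seq[i:i + n])
            if mismatches <= max_mismatches:
                hits.append((strand, i, mismatches, pam))
    return hits
```

Checking the PAM before counting mismatches keeps the scan cheap, since most windows are discarded on the PAM test alone. A genome-scale search needs an indexed aligner, which is why the study maps the guides with BWA and filters on the PAM afterwards.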
Reads containing indels overlapping the ± 20-bp region of the predicted cut sites were considered to be the outcome of NHEJ. The cut frequency was calculated by dividing the number of reads with indels by the total number of reads mapped.

Illumina sequencing of gRNAs in the genome-wide library and the enriched mutants. For sequencing of all gRNAs in the genome-wide library, the region containing the gRNA was amplified using primers (gLibrary-HiSeq_50bp-SEU1 and –L1) with Q5 Hot Start High-Fidelity 2× Master Mix. We conducted ten independent PCR reactions using 15 ng of the whole-genome lentiviral plasmid library per reaction and 72 independent PCR reactions using 1 µg of the mouse ESC library per reaction for each of the two ESC libraries. These correspond to 1.7 × 10^10 molecules of the plasmid DNA and 1.1 × 10^7 ESCs in total, respectively. For sequencing of gRNAs in the enriched mutants, the region containing the gRNA was amplified using 1 µg of genomic DNA (1.5 × 10^5 cells) and primers (gLibrary-MiSeq_150bp-PE-U1 and –L1) with Q5 Hot Start High-Fidelity 2× Master Mix. The PCR products were pooled in each group and purified using QIAquick PCR Purification Kit. Two hundred picograms of the purified PCR products was used for PCR enrichment45 with KAPA HiFi HotStart ReadyMix with the following conditions: 98 °C for 30 s, 12 cycles of 98 °C for 10 s, 66 °C for 15 s and 72 °C for 20 s, and the final extension, 72 °C for 5 min. The PCR products were purified with Agencourt AMPure XP beads in a PCR-product-to-bead ratio of 1:0.7. The purified libraries were quantified and sequenced on Illumina HiSeq2500 by 50-bp single-end sequencing (for the entire libraries) or on Illumina MiSeq by 150-bp paired-end sequencing (for the enriched mutants). gRNA sequences were extracted by removing constant regions from each read and these were used to count the number of reads of each gRNA in the library.

Gene ontology analysis.
We computed an average depletion rate for each gRNA and chose genes with at least three gRNAs with an average depletion rate larger than tenfold for gene ontology analysis. Gene ontology analyses were performed using DAVID with default parameters as described previously46 (http://david.abcc.ncifcrf.gov/).

36. Cadinanos, J. & Bradley, A. Generation of an inducible and optimized piggyBac transposon system. Nucleic Acids Res. 35, e87 (2007).
37. Yusa, K., Rad, R., Takeda, J. & Bradley, A. Generation of transgene-free induced pluripotent mouse stem cells by the piggyBac transposon. Nat. Methods 6, 363–369 (2009).
38. Carey, B.W. et al. Reprogramming of murine and human somatic cells using a single polycistronic vector. Proc. Natl. Acad. Sci. USA 106, 157–162 (2009).
39. Subach, O.M. et al. Conversion of red fluorescent protein into a bright blue probe. Chem. Biol. 15, 1116–1124 (2008).
40. Quinlan, A.R. & Hall, I.M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
41. Pettitt, S.J. et al. Agouti C57BL/6N embryonic stem cells for mouse genetic resources. Nat. Methods 6, 493–495 (2009).
42. Wang, W., Bradley, A. & Huang, Y. A piggyBac transposon-based genome-wide library of insertionally mutated Blm-deficient murine ES cells. Genome Res. 19, 667–673 (2009).
43. Abuin, A., Zhang, H. & Bradley, A. Genetic analysis of mouse embryonic stem cells bearing Msh3 and Msh2 single and compound mutations. Mol. Cell. Biol. 20, 149–157 (2000).
44. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
45. Quail, M.A. et al. Optimal enzymes for amplifying sequencing libraries. Nat. Methods 9, 10–11 (2012).
46. Huang, da W. et al. DAVID gene ID conversion tool. Bioinformation 2, 428–430 (2008).
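As an illustration of the gene selection rule given under "Gene ontology analysis" in the Online Methods (genes with at least three gRNAs whose average depletion exceeds tenfold), the following Python sketch computes a fold depletion per gRNA from read counts and averages it across libraries. The normalization to total reads, the pseudocount and all names are assumptions made for this sketch; the text does not specify how the depletion rate was computed.

```python
from collections import Counter


def fold_depletion(plasmid_counts, cell_counts, pseudocount=1.0):
    """Per-gRNA ratio of plasmid read fraction to cell-library read fraction.

    The pseudocount (an assumption) avoids division by zero for gRNAs
    that are absent from the cell library.
    """
    plasmid_total = sum(plasmid_counts.values())
    cell_total = sum(cell_counts.values())
    return {
        grna: ((count + pseudocount) / plasmid_total)
        / ((cell_counts.get(grna, 0) + pseudocount) / cell_total)
        for grna, count in plasmid_counts.items()
    }


def depleted_genes(grna_to_gene, plasmid_counts, library_counts,
                   min_grnas=3, min_fold=10.0):
    """Genes with at least `min_grnas` gRNAs depleted more than `min_fold`.

    library_counts is a list of read-count dicts, one per cell library;
    the depletion of each gRNA is averaged across these libraries.
    """
    per_library = [fold_depletion(plasmid_counts, c) for c in library_counts]
    depleted_per_gene = Counter()
    for grna in plasmid_counts:
        average = sum(d[grna] for d in per_library) / len(per_library)
        if average > min_fold:
            depleted_per_gene[grna_to_gene[grna]] += 1
    return sorted(g for g, n in depleted_per_gene.items() if n >= min_grnas)


# Hypothetical toy counts, not data from the study.
toy_map = {"A1": "GeneA", "A2": "GeneA", "A3": "GeneA",
           "B1": "GeneB", "B2": "GeneB", "B3": "GeneB"}
toy_plasmid = {grna: 100 for grna in toy_map}
toy_library = {"A1": 0, "A2": 0, "A3": 0, "B1": 200, "B2": 200, "B3": 200}
toy_result = depleted_genes(toy_map, toy_plasmid, [toy_library, toy_library])
```

Requiring several independently depleted gRNAs per gene mirrors the multi-gRNA cut-off used for the positive-selection screens, so a single noisy gRNA cannot nominate a gene on its own.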
