Author's Accepted Manuscript

Metabolic engineering of Escherichia coli using CRISPR-Cas9 meditated genome editing Yifan Li, Zhenquan Lin, Can Huang, Yan Zhang, Zhiwen Wang, Ya-jie Tang, Tao Chen, Xueming Zhao

www.elsevier.com/locate/ymben

PII: S1096-7176(15)00075-0
DOI: http://dx.doi.org/10.1016/j.ymben.2015.06.006
Reference: YMBEN1012

To appear in: Metabolic Engineering

Received date: 5 March 2015
Revised date: 19 June 2015
Accepted date: 22 June 2015

Cite this article as: Yifan Li, Zhenquan Lin, Can Huang, Yan Zhang, Zhiwen Wang, Ya-jie Tang, Tao Chen, Xueming Zhao, Metabolic engineering of Escherichia coli using CRISPR-Cas9 meditated genome editing, Metabolic Engineering, http://dx.doi.org/10.1016/j.ymben.2015.06.006

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting galley proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Metabolic engineering of Escherichia coli using CRISPR-Cas9 meditated genome editing

Yifan Li (a,b,1), Zhenquan Lin (a,b,1), Can Huang (a,b), Yan Zhang (a,b), Zhiwen Wang (a,b), Ya-jie Tang (c), Tao Chen (a,b,*), Xueming Zhao (a,b)

a Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, People’s Republic of China

b SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, People’s Republic of China

c Key Laboratory of Fermentation Engineering (Ministry of Education), Hubei University of Technology, Wuhan 430068, People’s Republic of China

1 These authors contributed equally to this work: Yifan Li, Zhenquan Lin

Corresponding author: Tao Chen. Department of Biochemical Engineering, School of Chemical Engineering & Technology, Tianjin University, Nankai District, 92 Weijin Road, Tianjin 300072, China. Tel/Fax: +86-22-27406770; E-mail: [email protected]

Abstract

Engineering cellular metabolism for improved production of valuable chemicals requires extensive modulation of the bacterial genome to explore complex genetic spaces. Here, we report the development of a CRISPR-Cas9 based method for iterative genome editing and metabolic engineering of Escherichia coli. This system enables us to introduce various types of genomic modifications with near 100% editing efficiency and to introduce up to three mutations simultaneously. We also found that cells with an intact mismatch repair system had a reduced chance of escaping CRISPR-mediated cleavage and yielded increased editing efficiency. To demonstrate its potential, we used our method to integrate the β-carotene synthetic pathway into the genome and to optimize the methylerythritol-phosphate (MEP) pathway and central metabolic pathways for β-carotene overproduction. We collectively tested 33 genomic modifications and constructed more than 100 genetic variants to combinatorially explore the metabolic landscape. Our best producer contained 15 targeted mutations and produced 2.0 g/L β-carotene in fed-batch fermentation.

Keywords

CRISPR/Cas9, genome editing, combinatorial metabolic engineering, β-carotene

Abbreviations

CRISPR, clustered regularly interspaced short palindromic repeat; PAM, protospacer adjacent motif; gRNA, guide RNA; DSB, double strand break; ORF, open reading frame; MMR, mismatch repair; MEP, methylerythritol-phosphate; ssDNA, single-strand DNA; dsDNA, double-strand DNA; MAGE, multiplex automated genome engineering; TRMR, trackable multiplex recombineering; PEP, phosphoenolpyruvate; G3P, glyceraldehyde-3-phosphate.

1. Introduction

Metabolic engineering for the production of chemicals, fuels, and medicines typically requires extensive modulation of cellular metabolism for enhanced productivity (Lee et al., 2012; Lynch and Gill, 2011; Santos and Stephanopoulos, 2008). Genome engineering to introduce various types of genetic modifications, including gene deletion, overexpression, and precise regulation, is essential to improve pathway efficiency and product yield (Esvelt and Wang, 2013; Wang et al., 2009; Woodruff and Gill, 2011). In addition, heterologous genes often need to be transferred into the production host to implement functional pathways. For this purpose, genome integration of the pathway genes has often proven superior to plasmid-borne overexpression in terms of increased stability and decreased metabolic burden (Ajikumar et al., 2010; Tyo et al., 2009).

Recombination-mediated genetic engineering, known as recombineering, uses phage-derived proteins to efficiently recombine donor DNA in the host and has come into widespread use for engineering bacterial genomes (Sharan et al., 2009). Double-strand DNA (dsDNA) mediated recombineering for the introduction of gene deletions and insertions usually requires selectable markers to identify correct mutants (Datsenko and Wanner, 2000; Sharan et al., 2009). The additional step of eliminating the markers from the chromosome before subsequent modifications significantly increases the effort required for iterative genome engineering. Recombineering using single-strand DNA (ssDNA) to introduce mutations yields much higher efficiency than dsDNA and has evolved into multiplex genome engineering techniques such as MAGE (Wang et al., 2009) and TRMR (Warner et al., 2010). These techniques enable several loci to be targeted or modified simultaneously and greatly enhance our ability to engineer complex phenotypes (Raman et al., 2014; Sandoval et al., 2012). However, they have limited ability to introduce sequences longer than 20 bp without selectable markers and often require high-throughput screening of a large population of cells to find the desired phenotypes.

The bacterial CRISPR-Cas9 genome editing technology has been used in diverse organisms including bacteria and yeast (Bao et al., 2014; Cobb et al., 2014; DiCarlo et al., 2013; Jiang et al., 2013; Jinek et al., 2012). A trans-activating crRNA (tracrRNA):crRNA duplex (or a chimeric guide RNA (gRNA)) directs the Cas9 protein to cleave a target DNA sequence that carries the required protospacer adjacent motif (PAM). CRISPR-Cas9 mediated genome cutting kills non-edited cells, circumventing the need for selectable markers to select mutants. This technology has also been applied to metabolic pathway engineering in Saccharomyces cerevisiae, in which all combinations of 5 gene disruptions were searched for improved mevalonate production (Jakociunas et al., 2015). In E. coli, CRISPR-Cas9 has been used in combination with recombineering to introduce point mutations and codon replacements (Jiang et al., 2013; Pines et al., 2014). A very recent report described the use of the CRISPR-Cas9 system to realize a variety of precise genome modifications in E. coli (Jiang et al., 2015). By combining the gRNA expression cassette and the donor DNA into a single vector, the authors were able to achieve high editing efficiency and to disrupt three target genes simultaneously. However, co-transformation of double-strand donor DNA with the gRNA expression plasmid yielded relatively low efficiency.

Here, we describe a CRISPR-Cas9 based method for iterative genome editing and metabolic engineering of E. coli. We performed detailed optimization of this system and achieved near 100% efficiency for introducing gene deletions, insertions, and replacements by co-transforming a gRNA expression plasmid with dsDNA as the editing template. The effect of the endogenous mismatch repair (MMR) system on CRISPR-mediated genome editing was also evaluated. Finally, we used this genome editing method to integrate the β-carotene biosynthetic pathway into the genome and to conduct combinatorial modulation of the MEP pathway and central metabolic pathways in search of an improved β-carotene producer.

2. Materials and methods

2.1 Strains and culture conditions

The EcPHE strain, which has the genotype MG1655 ∆bioA::λ-Red, was used to characterize CRISPR-Cas9 mediated editing in cells with an intact MMR system. The EcMutS strain, which is MG1655 ∆mutS ∆bioA::λ-Red, was used as the MMR-inactivated strain for testing CRISPR-Cas9 mediated editing. The EcKan strain, which is MG1655 ∆bioA::λ-Red-kan, was used as the parental strain for the construction of β-carotene overproducing strains. Plasmid pCas9cur, which was modified from pCas9 (Jiang et al., 2013), was introduced into the above-mentioned strains to express the Cas9 protein and to cure the gRNA plasmid in most cases. Plasmid pREDCas9, which combines the IPTG-inducible λ-Red recombineering system from pTKRED (Kuhlman and Cox, 2010), the Cas9 expression system, and the plasmid curing system in a single vector, was also functional for CRISPR-Cas9 mediated editing in strains without a chromosomally integrated λ-Red system (Supplementary Table S5). Detailed methods for strain and plasmid construction can be found in the Supplementary Information.

LB medium was used for cell growth in all cases unless otherwise noted. Ampicillin, kanamycin, chloramphenicol, and spectinomycin were added at concentrations of 100 µg/ml, 10 µg/ml, 30 µg/ml, and 50 µg/ml, respectively. IPTG and X-gal for blue/white selection were added at concentrations of 0.1 mM and 40 µg/ml, respectively.

2.2 Constructing gRNA plasmid and donor DNA

To construct a gRNA plasmid, a set of primers was used to PCR amplify the pGRB backbone. The 20 bp spacer sequence specific for each target was synthesized as part of the primers. The PCR product was then self-ligated using Golden Gate Assembly to obtain the desired gRNA plasmid (Engler et al., 2008). We developed a method that uses a single Golden Gate Assembly reaction to construct a gRNA plasmid expressing two or three gRNAs simultaneously. The detailed design and procedure can be found in Supplementary Fig. S8.

Oligos used as donor DNA were all 89 bp in length and contained four phosphorothioate linkages at the 5’ terminus (Wang et al., 2009). Donor dsDNA usually had a 300-500 bp homologous arm on each side unless otherwise noted. To construct donor dsDNA, the two homologous arms and the sequence to be inserted were amplified separately and then joined by fusion PCR. Gel purification of the PCR products prior to electroporation is necessary. All primers, including those used as donor ssDNA, were ordered from Genewiz. All primers and spacers used in this study can be found in Supplementary Table S6.
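As an illustration of the spacer selection step, the following minimal sketch lists candidate 20 bp spacers that sit immediately upstream of a PAM. It assumes the NGG PAM of Streptococcus pyogenes Cas9, scans the forward strand only, and performs no off-target check; the function name `find_spacers` and the toy sequence are hypothetical and are not taken from the manuscript or its supplementary material.

```python
import re


def find_spacers(seq: str, spacer_len: int = 20) -> list[tuple[int, str, str]]:
    """Return (start, spacer, pam) for each forward-strand site followed by NGG.

    Assumption: the PAM is NGG (S. pyogenes Cas9). The reverse strand is not scanned.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    # Toy sequence for illustration only: a 20 nt spacer, an AGG PAM, and two extra bases.
    toy = "GATTACAGATTACAGATTACAGGTT"
    for start, spacer, pam in find_spacers(toy):
        print(start, spacer, pam)
```

In practice the reverse complement would be scanned as well, and each candidate would be checked for uniqueness in the genome before being written into a primer.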

2.3 Iterative genome editing procedure

Electrocompetent cells were generated as previously described (Wang et al., 2009). In brief, a single colony or a 100-fold diluted overnight culture was inoculated into 3 ml LB medium (or 100 ml LB for large-scale preparation) and grown at 32 °C to an OD of 0.5. The cultures were shifted to 42 °C for 15 min to induce λ Red and were then quick-chilled in an ice-water slurry for 10 min. The cells were then washed twice with cold sterile ddH2O in test tubes. One milliliter of cells was finally concentrated 20-fold into a 50 µl volume for each reaction. Unless otherwise noted, 100 ng donor dsDNA (or 1 µM ssDNA) and 100 ng gRNA plasmid were added to each electroporation reaction. A Bio-Rad MicroPulser was used for electroporation (0.1 cm cuvette, 1.80 kV). After electroporation, cells were immediately added to 3 ml LB and recovered for 3 h prior to plating.

For plasmid curing, correct colonies were inoculated into LB containing 0.2% L-arabinose and cultivated for 6-8 hours or overnight. To save time in iterative engineering, we usually inoculated colonies for plasmid curing before testing for the mutation, because our system usually yielded high editing efficiency. The cultures were then streaked after plasmid curing, and the colonies were tested for ampicillin sensitivity. Likewise, because of the high curing efficiency, we usually inoculated colonies for the next round of editing before testing ampicillin sensitivity.

2.4 Determining editing efficiency and recombination frequency

When the lacZ gene was targeted, the editing efficiency was determined by counting the blue and white colonies on IPTG and X-gal plates. When deleting lacZ or introducing stop codons into it, sequencing was performed to confirm that the lacZ gene was inactivated by the designed mutations rather than by random, spontaneous mutations. When the kan gene was integrated, the editing efficiency was calculated as the fraction of cells exhibiting kanamycin resistance, determined by spotting colonies on kanamycin plates. When the target gene could not be used as a reporter, mutants were identified by MASC-PCR (Carr et al., 2012; Li et al., 2013).

When determining the recombination frequency with DSB, we used the number of colony forming units (cfu) resulting from transformation of a non-targeting gRNA plasmid as an approximation of the number of cells receiving the targeting gRNA plasmid, assuming that two plasmids of identical size enter the cells with similar efficiency. Therefore, the recombination frequency with DSB was estimated as the ratio of the cfu resulting from transformation of the targeting gRNA plasmid to the cfu resulting from transformation of an identical amount of the non-targeting gRNA plasmid.
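The two quantities defined in this section can be written down as a short sketch. It assumes that white colonies on IPTG/X-gal plates correspond to edited (lacZ-inactivated) cells; the function names and the colony counts used in the example are hypothetical and do not come from the study.

```python
def editing_efficiency(white: int, blue: int) -> float:
    """Fraction of colonies carrying the lacZ-inactivating edit.

    Assumption: white colonies are edited, blue colonies are unedited.
    """
    total = white + blue
    if total == 0:
        raise ValueError("no colonies were counted")
    return white / total


def recombination_frequency_with_dsb(cfu_targeting: float, cfu_nontargeting: float) -> float:
    """Ratio of cfu obtained with the targeting gRNA plasmid to cfu obtained
    with the same amount of non-targeting gRNA plasmid."""
    if cfu_nontargeting <= 0:
        raise ValueError("cfu for the non-targeting control must be positive")
    return cfu_targeting / cfu_nontargeting


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(f"editing efficiency: {editing_efficiency(white=98, blue=2):.1%}")
    print(f"recombination frequency: {recombination_frequency_with_dsb(75, 1000):.1%}")
```

The second function relies on the assumption stated above, namely that the targeting and non-targeting plasmids transform with similar efficiency.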

2.5 Shake-flask fermentation

To test β-carotene production, engineered E. coli strains were cultured in LB medium with appropriate antibiotics at 28.5 °C until they reached stationary phase. Cells were then diluted 100-fold and transferred to flasks containing 100 ml 2×YT medium (16 g Bacto Tryptone, 10 g yeast extract and 5 g NaCl per liter) with or without glucose. Initially, strains constructed during MEP pathway optimization were cultivated in 2×YT medium for 24 h to characterize β-carotene production. To make the effects of the modifications more evident in subsequent experiments, the remaining strains were fermented in 2×YT with 10 g/L glucose for 48 h.

2.6 Fed-batch fermentation

The modified minimal medium used for seed preparation and fermentation contained (per liter): 10 g glucose, 10 g yeast extract, 1.7 g citric acid, 11.2 g KH2PO4, 6 g (NH4)2HPO4, 3.44 g MgSO4•7H2O and 1 mL trace metal solution. The trace metal solution contained (per liter): 2.5 g CoCl2•6H2O, 15 g MnCl2•4H2O, 1.5 g CuCl2•2H2O, 2.5 g H3BO3, 2.5 g Na2MoO4•2H2O, 2.5 g Zn(CH3COO)2•2H2O, 12.5 g Fe(III) citrate. ZF237T was cultured in 5 ml LB medium until it reached mid-exponential phase and was then transferred into 100 ml modified minimal medium flask cultures at a starting OD600 of 0.1. The flask cultures were incubated at 30 °C and 220 rpm for 24 h to prepare seed cultures. The seed was then transferred to a 5 L fermenter (Shanghai Bailun Bio Co Ltd) containing 2.5 L medium at an initial OD600 of 0.05. Fermentations were performed at 29 °C with the pH automatically controlled at 7.0 using a 25% solution of ammonium hydroxide. Dissolved oxygen was maintained at ≥ 30% by adjusting the agitation speed. The feed solution, containing (per liter) 650 g glucose, 25 g Tryptone, 50 g yeast extract and 17.2 g MgSO4•7H2O, was fed to the fermenter at a rate of 15 mL/h.
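For orientation, the feed composition and feed rate given above translate into hourly delivery rates for each component. The sketch below is plain arithmetic on the stated values and assumes the feed rate stays constant at 15 mL/h; the variable names are illustrative.

```python
# Feed rate and feed composition as stated in Section 2.6.
FEED_RATE_L_PER_H = 15 / 1000  # 15 mL/h expressed in L/h

FEED_G_PER_L = {
    "glucose": 650.0,
    "Tryptone": 25.0,
    "yeast extract": 50.0,
    "MgSO4•7H2O": 17.2,
}

# Mass of each component delivered per hour (g/h) = concentration (g/L) x feed rate (L/h).
delivery_g_per_h = {name: conc * FEED_RATE_L_PER_H for name, conc in FEED_G_PER_L.items()}

if __name__ == "__main__":
    for name, rate in delivery_g_per_h.items():
        print(f"{name}: {rate:.3f} g/h")
```

Under this assumption, glucose is delivered at 650 g/L × 0.015 L/h = 9.75 g/h.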

2.7 Analytical methods

Intracellular β-carotene concentration was quantified by measuring the absorption of acetone-extracted β-carotene at 453 nm as previously described (Zhao et al., 2013). The cell cultures were harvested, suspended in 1 ml acetone, and incubated at 55 °C for 15 min in the dark with intermittent vortexing. After centrifugation of the samples, the β-carotene content of the supernatant was quantified from its absorbance at 453 nm, and concentrations were calculated from a standard curve (Cat. No. C4582, Sigma, USA) using an ultraviolet spectrophotometer (Beijing Puxi Universal Co Ltd). β-carotene concentration in fed-batch fermentation was determined by HPLC analysis. Cell growth was monitored by measuring the OD600 with an ultraviolet spectrophotometer (Beijing Puxi Universal Co Ltd). Glucose in the fermentation broth was determined using an SBA-40C biosensor analyzer (Institute of Microbiology, Shandong, China).
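The standard-curve calculation can be sketched as an ordinary least-squares line through absorbance readings of known standards, which is then inverted for unknown samples. The manuscript does not describe how its curve was fitted, so the linear fit, the function names, and the standard values below are all assumptions made for illustration.

```python
def fit_line(x: list[float], y: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_absorbance(a453: float, slope: float, intercept: float,
                                  dilution: float = 1.0) -> float:
    """Invert the standard curve and correct for any sample dilution."""
    return (a453 - intercept) / slope * dilution


if __name__ == "__main__":
    # Hypothetical standards: concentration (µg/ml) versus absorbance at 453 nm.
    standards_conc = [0.0, 2.0, 4.0, 6.0]
    standards_a453 = [0.0, 0.5, 1.0, 1.5]
    slope, intercept = fit_line(standards_conc, standards_a453)
    print(concentration_from_absorbance(0.75, slope, intercept))
```

Readings outside the range covered by the standards would normally be diluted and measured again rather than extrapolated.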

3. Results

3.1 CRISPR-Cas9 mediated iterative genome editing in E. coli

Our CRISPR-Cas9 based genome editing system is composed of five elements: a constitutive Cas9 expression cassette, a gRNA expression plasmid, the λ Red recombineering system, donor template DNA, and an inducible plasmid curing system for eliminating the gRNA plasmid from the cells (Fig. 1a). Specifically, the Cas9 protein was expressed from plasmid pCas9cur, which was modified from pCas9 (Jiang et al., 2013) and contained a p15A replication origin and a cat gene. The targeting gRNA was expressed from the constitutive promoter J23100 on a plasmid containing a ColE1 replication origin and a bla gene (Qi et al., 2013). Donor DNA, which can be ssDNA or dsDNA, was designed to generate gene deletions, insertions, or replacements while altering either the PAM or the protospacer sequence so that mutant cells escape CRISPR-induced cell death (Fig. 1a). To construct the plasmid curing system, we used the arabinose-inducible promoter PBAD to express a gRNA targeting the bla gene on the gRNA plasmid. Upon induction, the bla-targeting gRNA was expressed, directing Cas9 to cleave the gRNA plasmid and resulting in plasmid elimination (Supplementary Fig. S1).

Each cycle of editing started with co-transformation of donor DNA and gRNA plasmid into cells expressing Cas9 and the recombineering proteins (Fig. 1b). In theory, cells receiving the gRNA plasmid are killed by CRISPR-mediated digestion unless they acquire mutations at the CRISPR target sequence (Jiang et al., 2013). Thus, plating cells on medium containing chloramphenicol and ampicillin allowed the selection of cells containing the desired modification. Correct mutants were incubated in medium containing arabinose for plasmid curing and were then streaked on plates without arabinose. Colonies were analyzed for ampicillin sensitivity to confirm the loss of the gRNA plasmid and were grown to mid-log phase to prepare electrocompetent cells for the next round of editing. Each cycle of editing could be finished in two days (Fig. 1b).

3.2 Characterization and optimization of CRISPR-Cas9 mediated genome editing

We first systematically evaluated the ability of CRISPR-Cas9 mediated genome editing to introduce various types of modifications including codon replacements, gene deletions and insertions (Fig. 2a, b and Supplementary Fig. S2). The lacZ gene was used as the reporter, so that the editing efficiency could be determined from the number of white and blue colonies on IPTG/X-gal plates. As shown in Fig. 2a, our system achieved near 100% editing efficiency for codon replacements and gene deletions with both ssDNA and dsDNA as the editing template. However, the efficiency with ssDNA as the editing template decreased dramatically as the length of the deleted sequence increased. In contrast, the efficiency with dsDNA remained above 90% even when deleting a sequence as long as 12 kb. Using dsDNA as the editing template, we were able to insert sequences of up to 2 kb with more than 90% efficiency (Fig. 2b). However, longer insertions significantly decreased editing efficiency. We were able to integrate an 8 kb sequence into the genome, albeit with relatively low efficiency.

We then explored the ability of our CRISPR-Cas9 based system to introduce several mutations simultaneously. We targeted three sites, lacZ, galK, and ldhA, which are scattered across the chromosome so that the co-selection mechanism could be avoided (Carr et al., 2012) (Fig. 2c). We designed donor ssDNA to introduce codon replacements in these target genes. A single plasmid expressing multiple gRNAs targeting two or three sites was co-transformed with the corresponding donor ssDNA. The results showed that two mutations could be introduced simultaneously with 83% editing efficiency (Fig. 2d). When three sites were targeted, however, the fraction of cells containing all desired mutations dropped to 23%, suggesting a limitation for combinatorial modification.

We also quantitatively evaluated the effect of the CRISPR-generated double strand break (DSB) on recombination and found that the DSB stimulated recombination with both ssDNA and dsDNA as the editing template (Fig. 2e). Without CRISPR-mediated cleavage, using ssDNA to introduce point mutations and using dsDNA to introduce gene insertions yielded recombination frequencies of 10% and 1.1 × 10⁻⁴, respectively, which were at a similar level to previous reports (Sharan et al., 2009). The CRISPR-generated DSB increased the recombination frequency to near 100% for ssDNA recombineering, indicating that almost every cell receiving the gRNA plasmid recombined with the donor ssDNA and survived CRISPR-mediated cutting. Using dsDNA as the template yielded a 7.5% recombination frequency with DSB, which was about 1000-fold higher than that without DSB.
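The size of the DSB effect can be checked with simple arithmetic on the frequencies quoted above. In this sketch, "near 100%" is treated as 1.0, which is an assumption and gives an upper bound for the ssDNA case; the function name is illustrative.

```python
def fold_change(with_dsb: float, without_dsb: float) -> float:
    """Ratio of recombination frequency with DSB to that without DSB."""
    if without_dsb <= 0:
        raise ValueError("frequency without DSB must be positive")
    return with_dsb / without_dsb


# Frequencies as fractions, taken from Section 3.2.
ssdna_fold = fold_change(with_dsb=1.0, without_dsb=0.10)      # "near 100%" treated as 1.0
dsdna_fold = fold_change(with_dsb=0.075, without_dsb=1.1e-4)  # 7.5% versus 1.1 x 10^-4

if __name__ == "__main__":
    print(f"ssDNA: about {ssdna_fold:.0f}-fold")
    print(f"dsDNA: about {dsdna_fold:.0f}-fold")
```

With these rounded inputs the dsDNA ratio comes out at roughly 680, which is the same order of magnitude as the approximately 1000-fold increase stated in the text.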

3.3 The effect of MMR system on CRISPR-Cas9 mediated genome editing

Even when no donor DNA is added to the transformation, cells can survive CRISPR-mediated cutting through spontaneous mutations in the Cas9/gRNA targeting system or in the target sequence (protospacer and PAM) (Jiang et al., 2013; Qi et al., 2013). This background of ‘escapers’ increases the chance of selecting unwanted clones and decreases editing efficiency. The MMR system is usually removed (by knocking out the mutS gene) to optimize recombineering efficiency (Costantino and Court, 2003; Wang et al., 2009). However, removal of the MMR system was previously reported to increase the replication mutation rate by two orders of magnitude (Isaacs et al., 2011) and may thus increase the level of background ‘escapers’ and decrease editing efficiency.

To investigate the effect of inactivating MMR on the level of background ‘escapers’, we transformed gRNA plasmid into both mutS+ and mutS- strains without adding donor DNA. Four different target sites were tested to rule out site-specific effects. We found that in all cases, mutS+ cells yielded 10-fold fewer ‘escapers’ than mutS- cells (Fig. 3a). This decrease in the number of ‘escapers’ increased editing efficiency in almost every case tested (Supplementary Fig. S2), and the improvement was significant when the efficiency was relatively low. For example, inserting a 1 kb sequence using dsDNA with 50 bp homologous arms yielded 46% editing efficiency in mutS+ cells but only 8% in mutS- cells (Fig. 3b).

We also evaluated the effect of the MMR system on correcting point mutations introduced by the CRISPR-mediated genome editing method. We introduced a 1 bp A:C mismatch mutation and a 9 bp mutation into both mutS+ and mutS- strains. The 1 bp A:C mismatch was reported to be efficiently corrected by the MMR system and to yield low recombineering efficiency in mutS+ strains (Sawitzke et al., 2011; Wang et al., 2011). The 9 bp mutation, on the other hand, could evade MMR correction and yield high recombineering efficiency in both mutS+ and mutS- strains (Sawitzke et al., 2011; Wang et al., 2011). We found that a point mutation introduced by ssDNA-mediated CRISPR editing could be efficiently corrected by MMR, as seen from the lower editing efficiency for the 1 bp point mutation than for the 9 bp mutation (Fig. 3c). In contrast, using dsDNA with 1 kb homologous arms to introduce the 1 bp point mutation yielded editing efficiency similar to that of the 9 bp replacement, suggesting that a point mutation introduced by dsDNA with long homologous arms had an increased chance of evading MMR correction (Fig. 3c). One previous report argues that λ exonuclease entirely degrades one strand and that recombineering proceeds through a fully single-stranded intermediate (Mosberg et al., 2010). Based on these findings, a possible explanation for the increased chance of evasion is that dsDNA with long homologous arms is less likely to be entirely degraded before recombining at the CRISPR-generated DSB. Previous studies have reported the use of consecutive mismatch mutations and chemically modified bases to evade MMR correction (Sawitzke et al., 2011; Wang et al., 2011). Our finding provides a new strategy for this purpose.

Overall, using cells with an intact MMR system for CRISPR-Cas9 mediated genome editing has the advantage of fewer background ‘escapers’ and increased editing efficiency. The disadvantage, correction of the intended point mutations, can be avoided by using dsDNA with long homologous arms. On the basis of these results, we used mutS+ strains in all of the following experiments.

3.4 Engineering β-carotene synthetic pathway and MEP pathway

Our CRISPR-Cas9 mediated genome editing procedure significantly enhanced our ability to iteratively introduce well-designed genomic manipulations in E. coli. To illustrate its potential application in metabolic engineering, we used our system exclusively to modulate the E. coli genome for overproduction of β-carotene. This red-orange pigment belongs to the isoprenoid family and has many applications in the pharmaceutical and nutraceutical industries (Zhao et al., 2013). Isoprenoids are derived from isopentenyl pyrophosphate (IPP) and dimethylallyl pyrophosphate (DMAPP), which are synthesized through the MEP pathway in E. coli (Fig. 4) (Ajikumar et al., 2010; Zhao et al., 2013).

As an initial step, we fused the first three genes of the β-carotene synthetic pathway from Pantoea agglomerans (crtE, crtB, and crtI) (Alper et al., 2005) into an operon driven by the strong constitutive promoter J23119 and integrated the operon into the chromosome at the ldhA locus (Supplementary Fig. S3 and Supplementary Table S1). Correct mutants exhibited a red color, indicating the synthesis of lycopene (Alper et al., 2005). Then crtY, the last gene of the pathway, was also integrated at the ldhA locus, resulting in strain ZF02 (Supplementary Fig. S3 and Supplementary Table S1). ZF02 exhibited an orange color and produced 7.5 mg/L β-carotene in 2×YT medium.

To increase the supply of IPP and DMAPP, we took an iterative overexpression approach to improve MEP pathway flux. In each engineering cycle, several genes in the pathway were overexpressed by inserting a rationally designed 5’-UTR sequence containing the strong constitutive promoter J23119 and the RBS B0034 in front of the coding sequence (Supplementary Fig. S4). An exogenous gene, gps, coding for GGPP synthase from Archaeoglobus fulgidus (Wang et al., 1999), was also tested for overexpression by integrating it into the chromosome (Supplementary Fig. S3 and Supplementary Table S1). Engineered clones were characterized for β-carotene production, and one or two clones with an improved phenotype were selected for the next cycle of optimization (Fig. 5a). We collectively tested 33 different strains, including one (ZF53) with all 9 genes/operons of the MEP pathway overexpressed (Fig. 5a). Among them, the best producer, ZF43, had six overexpressed genes and produced 21.4 mg/L β-carotene in 2×YT medium (79.2 mg/L in 2×YT medium with 10 g/L glucose), which was nearly 3-fold higher than that of ZF02.

3.5 Modulating central metabolism for improved β-carotene production

In the MEP pathway optimized strains, we speculated that the supply of pyruvate and glyceraldehyde-3-phosphate (G3P), the precursors of the MEP pathway, limited β-carotene production. To increase the intracellular concentration of these two metabolites, we modulated central carbon metabolism by engineering the glucose transport system (De Anda et al., 2006), overexpressing tpiA (Choi et al., 2010), edd/eda (Liu et al., 2013), talA/tktB (Song et al., 2006), ppsA (Farmer and Liao, 2001), and pckA (Farmer and Liao, 2001), and knocking out gdhA (Alper et al., 2005) and pykF (Toya et al., 2010). Using the galactose permease system in place of the phosphoenolpyruvate:carbohydrate phosphotransferase system (PTS) to transport glucose has been reported to increase the concentration of phosphoenolpyruvate (PEP) and G3P (Escalante et al., 2010; Zhang et al., 2013). The other targets (except the gdhA deletion) are directly related to G3P, PEP, or pyruvate metabolism. Several modifications gave a noticeable improvement in β-carotene production (Fig. 5b). Among them, combined engineering of galP and ptsHIcrr (GP) yielded the highest β-carotene titer, which was more than 1.7-fold higher than that of ZF43.

We then combinatorially introduced these genetic modifications into the central metabolic network, resulting in 33 distinct genotypes (including those with a single modification) (Fig. 5c and Supplementary Table S2). We observed both cooperative and non-cooperative effects of combined modulation. Combined introduction of GP and the pykF deletion, the two modifications that yielded the highest titers when introduced independently, resulted in decreased β-carotene production and a significantly reduced growth rate. In contrast, tpiA and edd/eda overexpression in the ZF43-GP background further improved β-carotene production and resulted in the highest titer in this group of producers (Fig. 5c).

3.6 Combinatorial multiplexing of genetic modifications to search for the best producer

We then integrated an additional copy of the genes of the MEP pathway and the β-carotene synthetic pathway into the genome of ZF43-GP to further enhance the expression level of downstream genes (Supplementary Fig. S5). Among the 12 target genes, a second copy of gps, crtE, dxs, and ispA increased β-carotene production, and these genes were selected for subsequent combinatorial optimization (Supplementary Fig. S6).

Considering the complexity and nonlinearity of engineering cellular metabolism, we finally conducted combinatorial multiplexing of the selected positive targets in central metabolic pathways and downstream pathways to search for the best producer (Ajikumar et al., 2010; Alper et al., 2005; Blazeck et al., 2014; Xu et al., 2013). Specifically, ZF43-GP and the three best producers resulting from central metabolism optimization (ZF43-GP-tpiA, ZF43-GP-edd, ZF43-GP-tpiA-edd) were selected for further testing (Fig. 5c, d). These four genotypes were multiplexed with five downstream pathway overexpression targets (or combinations of targets): gps, gps-dxs, gps-crtE, gps-dxs-crtE, and gps-crtE-ispA. Collectively, this combinatorial multiplexing resulted in 20 distinct genotypes (Fig. 5d). Combined overexpression of gps-crtE-ispA in the ZF43-GP-tpiA background resulted in the best producer among these 20 strains, ZF237T (Supplementary Table S3), and increased β-carotene production to 212 mg/L in 2×YT medium with 10 g/L glucose, which represented a 2.8-fold improvement over ZF43 (Fig. 5d). Fed-batch fermentation of ZF237T in a 5 L bioreactor reached a maximum titer of 2.0 g/L β-carotene (Fig. 5e).
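The design space described above is the Cartesian product of the four background genotypes and the five downstream overexpression sets. A minimal sketch of that enumeration follows; the "background + targets" label format is a hypothetical convention used only for this example and is not the strain nomenclature of the study.

```python
from itertools import product

# Background genotypes and downstream overexpression sets named in Section 3.6.
backgrounds = ["ZF43-GP", "ZF43-GP-tpiA", "ZF43-GP-edd", "ZF43-GP-tpiA-edd"]
downstream_sets = ["gps", "gps-dxs", "gps-crtE", "gps-dxs-crtE", "gps-crtE-ispA"]

# Every background is paired with every downstream set: 4 x 5 combinations.
genotypes = [f"{bg} + {ds}" for bg, ds in product(backgrounds, downstream_sets)]

if __name__ == "__main__":
    print(len(genotypes))
    for label in genotypes:
        print(label)
```

The list has 4 × 5 = 20 entries, matching the 20 distinct genotypes reported, and includes the combination of gps-crtE-ispA with the ZF43-GP-tpiA background that corresponds to the best producer.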

4. Discussion

4.1 CRISPR-Cas9 mediated genome editing

In this report, we developed a CRISPR-Cas9 based system for genome editing of E. coli. This system significantly reduced the cycle time and effort needed to iteratively introduce seamless modifications into the bacterial chromosome. Traditional dsDNA mediated recombineering yields a relatively low recombination frequency (~10⁻⁴). Using ssDNA to introduce short mismatches generates a much higher recombination frequency. However, the frequency drops substantially for larger replacements and inserts (>20 bp) (Wang and Church, 2011). Thus, it is usually necessary to introduce markers into the genome to select correct recombinants and then to excise the markers before subsequent genome editing. Three strategies are commonly used for marker excision. The first is to use site-specific recombination to “pop out” the marker, which has the disadvantage of leaving a scar sequence around the introduced modification (Datsenko and Wanner, 2000). The second is to flank a counter-selectable marker with the donor DNA and to use an additional recombination step to evict the marker. This method can introduce seamless modifications but is usually time and labor consuming (Sharan et al., 2009). The third is to use I-SceI mediated genome cutting to stimulate recombination and select marker-evicted strains, but it requires that the I-SceI recognition site be pre-introduced into the genome (Kuhlman and Cox, 2010). The CRISPR-Cas9 system can target any desired genomic locus and induce a DSB that kills unedited cells, so mutants can be selected without introducing markers into the genome. This feature makes the method simple and efficient. The editing system we have developed is also highly reliable, as demonstrated by testing 33 genomic modifications (Supplementary Table S4) and sequentially introducing 15 modifications into a single strain (Supplementary Table S3).

In E. coli, Jiang et al. (2013) were the first to report CRISPR-Cas9 mediated genome editing. The authors used ssDNA as the editing template and achieved 65% efficiency for introducing a codon replacement. Another study used dsDNA as the editing template and introduced a codon replacement in nearly 100% of the cell population (Pines et al., 2014). During preparation of this manuscript, Jiang et al. (2015) reported the development of a CRISPR-Cas9 based system to introduce a variety of precise genome modifications with high efficiency and to modify up to three targets simultaneously. Rather than co-transforming a gRNA expression plasmid and donor DNA, the authors combined the gRNA expression cassette and donor DNA into a single vector. However, cloning such a combined plasmid increases the time and effort needed to introduce the desired modifications, especially when targeting multiple sites simultaneously. Jiang et al. (2015) also tested the co-transformation of donor dsDNA and a separate gRNA expression vector, but achieved relatively low efficiency (69% for deletion and 28% for gene insertion). Through systematic characterization and optimization of our system, we achieved much higher editing efficiency using dsDNA as the editing template. Near 100% efficiency was obtained for deleting sequences as long as 12 kb and for inserting sequences no longer than 2 kb. Previous reports have indicated that traditional recombineering is often unable to insert sequences longer than 3 kb (Kuhlman and Cox, 2010). However, we achieved 59%, 35%, and 14% efficiency for inserting 3 kb, 5 kb, and 8 kb sequences, respectively. This increased efficiency is possibly attributable to the stimulation of recombination by the DSB (Fig. 2e) (Kuhlman and Cox, 2010).

We were also able to introduce three mutations simultaneously. The difference between our strategy and that of Jiang et al. (2015) is that we used three separate ssDNAs as editing templates rather than a single plasmid containing all donor DNAs and gRNA expression cassettes. This avoided the time and labor consuming construction of a complex hybrid plasmid. We used a single Golden Gate Assembly reaction to construct plasmids expressing multiple gRNAs, which is also much simpler and more efficient.

Multiplex recombineering techniques such as MAGE and TRMR can be more powerful when combined with CRISPR (Esvelt and Wang, 2013). To avoid MMR correction and achieve optimal recombineering frequency, these techniques usually use MMR inactivated host cells. However, we found that MMR inactivated cells generated more escapers and yielded lower editing efficiency. In addition, previous studies have reported a much higher mutation rate in MMR inactivated cells (Isaacs et al., 2011). This can be detrimental when developing overproducing microbes because such strains are less stable and more likely to degenerate during growth. Thus, we argue that inactivating the MMR system is not a good choice for metabolic engineering using CRISPR based genome editing.
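The insertion efficiencies reported in this section (59%, 35%, and 14% for 3 kb, 5 kb, and 8 kb inserts) can be read as a screening workload. The sketch below is an illustration, not part of the authors' protocol. It assumes that each colony is an independent draw whose probability of carrying the correct edit equals the reported editing efficiency, and it computes the smallest number of colonies to screen so that at least one correct mutant is found with a chosen confidence (0.95 here, an arbitrary choice). The function name colonies_to_screen is hypothetical.

```python
import math


def colonies_to_screen(efficiency, confidence=0.95):
    """Smallest n such that 1 - (1 - efficiency)**n >= confidence.

    Assumes independent colonies, each correct with probability `efficiency`.
    """
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - efficiency))


# Insertion efficiencies reported in the text, keyed by insert length.
reported = {"3 kb": 0.59, "5 kb": 0.35, "8 kb": 0.14}

for size, eff in reported.items():
    print(size, colonies_to_screen(eff))
```

Under these assumptions, about 4, 7, and 20 colonies would need to be screened for the 3 kb, 5 kb, and 8 kb insertions, respectively.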

4.2 Engineering MEP pathway for β-carotene overproduction

In the MEP pathway, DXS and IDI are commonly thought to catalyze the two major rate-limiting steps and have been overexpressed to enhance isoprenoid production (Ajikumar et al., 2010; Zhao et al., 2013). A previous report tested the overexpression of MEP pathway genes by replacing the native promoters with the T5 promoter (Yuan et al., 2006). However, that report did not perform extensive combinatorial overexpression to search for the best producer. We used an iterative optimization strategy to increase pathway flux, which enabled us to debottleneck the pathway sequentially. We found that overexpression of ispA, ispH, and ispE could further increase pathway flux in a strain already overexpressing dxs and idi. In addition, we noted that the effect of a modification differed with genetic context. For example, overexpression of ispH alone had only a very slight effect on production but resulted in a 43% improvement when introduced into the gps-idi-dxs overexpression background. Overexpression of the ispD/ispF operon decreased isoprenoid production in our study, which contradicts a previous report (Yuan et al., 2006). A possible explanation is that the expression level of this operon in our study was too high and thus generated metabolites that decreased cellular fitness. Using libraries of RBS sequences to fine-tune the expression of these two genes may yield an optimized expression level for β-carotene overproduction.


4.3 Engineering central metabolic pathways for β-carotene overproduction

We identified several gene targets in central metabolic pathways whose modification improved β-carotene production. Among them, the modulation of the glucose transport system (Zhang et al., 2013), the overexpression of the edd/eda operon (Liu et al., 2013) and the tpiA gene (Choi et al., 2010), and the deletion of gdhA (Alper et al., 2005) were previously reported targets. On the other hand, the overexpression of the talA/tktB operon and the deletion of the pykF gene were novel targets in this study. The deletion of pykF was previously reported to increase the intracellular concentration of PEP (Al Zaid Siddiquee et al., 2004), which correlates directly with lycopene production (Zhang et al., 2013). talA and tktB are two important genes in the nonoxidative branch of the PP pathway and are directly involved in the metabolism of G3P (Song et al., 2006). We speculate that overexpression of the talA/tktB operon may have enhanced β-carotene production by increasing the PP pathway flux and improving the availability of G3P and NADPH.

Although some of the modulated genes were previously reported as targets for independent modification, no effort had been made to investigate the effect of combined modulation. Thus, we introduced these modifications combinatorially to search for improved phenotypes. The results revealed the complexity of the metabolic landscape. The modulation of the glucose transport system and the deletion of pykF each significantly increased β-carotene production when introduced independently. However, combined modulation of these two targets significantly reduced the cellular growth rate and decreased β-carotene production. Both targets were reported to decrease the intracellular concentration of pyruvate (Al Zaid Siddiquee et al., 2004; Zhang et al., 2013). Thus, a possible explanation for the decreased growth and β-carotene production is a shortage of pyruvate, an important building block for biomass synthesis and one of the precursors of the MEP pathway. It is also worth noting that we used glucose as the sole carbon source for the production of β-carotene. In contrast, most studies on isoprenoid production used glycerol as the carbon source because glycerol gave higher productivity than glucose for these compounds.

4.4 Combinatorial metabolic engineering

Due to the complexity and nonlinearity of cellular metabolism, it is usually difficult to specify an optimal synthetic design a priori. Thus, combinatorial metabolic engineering is required to explore the complex genetic space in search of the best phenotype (Ajikumar et al., 2010; Alper et al., 2005; Blazeck et al., 2014; Du et al., 2012; Santos and Stephanopoulos, 2008; Zhao et al., 2014). At the final stage of our metabolic engineering efforts, we performed combinatorial multiplexing of the selected positive targets and found effects that could not have been predicted. For example, ZF43-GP-edd outperformed ZF43-GP-tpiA for β-carotene overproduction. However, when downstream genes (gps, crtE, and ispA) were further overexpressed, ZF43-GP-tpiA yielded a higher β-carotene titer than ZF43-GP-edd. These results highlight the importance of using combinatorial optimization to search for the best phenotype. In fed-batch fermentation, our best producer ZF237T produced 2.0 g/L β-carotene, which is similar to the titer reported by Zhao et al. (2013) but still lower than those reported by Nam et al. (2013) and Yang and Guo (2014). Besides combinatorially introducing well-designed modifications, using our CRISPR-Cas9 based method to fine-tune the expression level of key genes may further increase the β-carotene titer (Coussement et al., 2014; Li et al., 2013; Wang et al., 2009). In addition, we anticipate that integrating the exogenous mevalonate (MVA) pathway genes into the genome would further enhance the supply of IPP and DMAPP for improved β-carotene production (Yang and Guo, 2014).

5. Conclusion

In this study, we reported the development of a CRISPR-Cas9 mediated method for efficient and iterative genome engineering of E. coli and demonstrated its potential in a real metabolic engineering application. We performed detailed characterization and optimization of our method, achieved 100% editing efficiency for various types of modifications, and introduced three mutations simultaneously. One cycle of editing can be finished in two days. We also investigated the effect of the MMR system on CRISPR based genome editing. We applied this method to engineer E. coli for enhanced production of β-carotene. The β-carotene biosynthetic pathway, the MEP pathway, and central metabolic pathways were systematically optimized. The best producer contained 15 genomic modifications and produced 2.0 g/L β-carotene in fed-batch fermentation. Our work offers not only an efficient genome editing method but also novel strategies and insights for engineering cellular metabolism for enhanced isoprenoid production.

References

Ajikumar, P. K., Xiao, W. H., Tyo, K. E., Wang, Y., Simeon, F., Leonard, E., Mucha, O., Phon, T. H., Pfeifer, B., Stephanopoulos, G., 2010. Isoprenoid pathway optimization for Taxol precursor overproduction in Escherichia coli. Science. 330, 70-4.
Al Zaid Siddiquee, K., Arauzo-Bravo, M. J., Shimizu, K., 2004. Metabolic flux analysis of pykF gene knockout Escherichia coli based on 13C-labeling experiments together with measurements of enzyme activities and intracellular metabolite concentrations. Appl Microbiol Biotechnol. 63, 407-17.
Alper, H., Miyaoku, K., Stephanopoulos, G., 2005. Construction of lycopene-overproducing E. coli strains by combining systematic and combinatorial gene knockout targets. Nat Biotechnol. 23, 612-6.
Bao, Z., Xiao, H., Liang, J., Zhang, L., Xiong, X., Sun, N., Si, T., Zhao, H., 2014. Homology-Integrated CRISPR-Cas (HI-CRISPR) System for One-Step Multigene Disruption in Saccharomyces cerevisiae. ACS Synth Biol.
Blazeck, J., Hill, A., Liu, L., Knight, R., Miller, J., Pan, A., Otoupal, P., Alper, H. S., 2014. Harnessing Yarrowia lipolytica lipogenesis to create a platform for lipid and biofuel production. Nat Commun. 5, 3131.
Carr, P. A., Wang, H. H., Sterling, B., Isaacs, F. J., Lajoie, M. J., Xu, G., Church, G. M., Jacobson, J. M., 2012. Enhanced multiplex genome engineering through co-operative oligonucleotide co-selection. Nucleic Acids Res. 40, e132.
Choi, H. S., Lee, S. Y., Kim, T. Y., Woo, H. M., 2010. In silico identification of gene amplification targets for improvement of lycopene production. Appl Environ Microbiol. 76, 3097-105.
Cobb, R. E., Wang, Y., Zhao, H., 2014. High-Efficiency Multiplex Genome Editing of Streptomyces Species Using an Engineered CRISPR/Cas System. ACS Synth Biol.
Costantino, N., Court, D. L., 2003. Enhanced levels of lambda Red-mediated recombinants in mismatch repair mutants. Proc Natl Acad Sci U S A. 100, 15748-53.
Coussement, P., Maertens, J., Beauprez, J., Van Bellegem, W., De Mey, M., 2014. One step DNA assembly for combinatorial metabolic engineering. Metab Eng. 23, 70-7.
Datsenko, K. A., Wanner, B. L., 2000. One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products. Proc Natl Acad Sci U S A. 97, 6640-5.
De Anda, R., Lara, A. R., Hernandez, V., Hernandez-Montalvo, V., Gosset, G., Bolivar, F., Ramirez, O. T., 2006. Replacement of the glucose phosphotransferase transport system by galactose permease reduces acetate accumulation and improves process performance of Escherichia coli for recombinant protein production without impairment of growth rate. Metab Eng. 8, 281-90.
DiCarlo, J. E., Norville, J. E., Mali, P., Rios, X., Aach, J., Church, G. M., 2013. Genome engineering in Saccharomyces cerevisiae using CRISPR-Cas systems. Nucleic Acids Res. 41, 4336-43.
Du, J., Yuan, Y., Si, T., Lian, J., Zhao, H., 2012. Customized optimization of metabolic pathways by combinatorial transcriptional engineering. Nucleic Acids Res. 40, e142.
Engler, C., Kandzia, R., Marillonnet, S., 2008. A one pot, one step, precision cloning method with high throughput capability. PLoS One. 3, e3647.
Escalante, A., Calderon, R., Valdivia, A., de Anda, R., Hernandez, G., Ramirez, O. T., Gosset, G., Bolivar, F., 2010. Metabolic engineering for the production of shikimic acid in an evolved Escherichia coli strain lacking the phosphoenolpyruvate: carbohydrate phosphotransferase system. Microb Cell Fact. 9, 21.
Esvelt, K. M., Wang, H. H., 2013. Genome-scale engineering for systems and synthetic biology. Mol Syst Biol. 9, 641.
Farmer, W. R., Liao, J. C., 2001. Precursor balancing for metabolic engineering of lycopene production in Escherichia coli. Biotechnol Prog. 17, 57-61.
Isaacs, F. J., Carr, P. A., Wang, H. H., Lajoie, M. J., Sterling, B., Kraal, L., Tolonen, A. C., Gianoulis, T. A., Goodman, D. B., Reppas, N. B., Emig, C. J., Bang, D., Hwang, S. J., Jewett, M. C., Jacobson, J. M., Church, G. M., 2011. Precise manipulation of chromosomes in vivo enables genome-wide codon replacement. Science. 333, 348-53.
Jakociunas, T., Bonde, I., Herrgard, M., Harrison, S. J., Kristensen, M., Pedersen, L. E., Jensen, M. K., Keasling, J. D., 2015. Multiplex metabolic pathway engineering using CRISPR/Cas9 in Saccharomyces cerevisiae. Metab Eng.
Jiang, W., Bikard, D., Cox, D., Zhang, F., Marraffini, L. A., 2013. RNA-guided editing of bacterial genomes using CRISPR-Cas systems. Nat Biotechnol. 31, 233-9.
Jiang, Y., Chen, B., Duan, C., Sun, B., Yang, J., Yang, S., 2015. Multigene editing in the Escherichia coli genome using the CRISPR-Cas9 system. Appl Environ Microbiol.
Jinek, M., Chylinski, K., Fonfara, I., Hauer, M., Doudna, J. A., Charpentier, E., 2012. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science. 337, 816-21.
Kuhlman, T. E., Cox, E. C., 2010. Site-specific chromosomal integration of large synthetic constructs. Nucleic Acids Res. 38, e92.
Lee, J. W., Na, D., Park, J. M., Lee, J., Choi, S., Lee, S. Y., 2012. Systems metabolic engineering of microorganisms for natural and non-natural chemicals. Nat Chem Biol. 8, 536-46.
Li, Y., Gu, Q., Lin, Z., Wang, Z., Chen, T., Zhao, X., 2013. Multiplex iterative plasmid engineering for combinatorial optimization of metabolic pathways and diversification of protein coding sequences. ACS Synth Biol. 2, 651-61.
Liu, H., Sun, Y., Ramos, K. R., Nisola, G. M., Valdehuesa, K. N., Lee, W. K., Park, S. J., Chung, W. J., 2013. Combination of Entner-Doudoroff pathway with MEP increases isoprene production in engineered Escherichia coli. PLoS One. 8, e83290.
Lynch, S. A., Gill, R. T., 2011. Synthetic biology: New strategies for directing design. Metab Eng.
Mosberg, J. A., Lajoie, M. J., Church, G. M., 2010. Lambda red recombineering in Escherichia coli occurs through a fully single-stranded intermediate. Genetics. 186, 791-9.
Nam, H. K., Choi, J. G., Lee, J. H., Kim, S. W., Oh, D. K., 2013. Increase in the production of beta-carotene in recombinant Escherichia coli cultured in a chemically defined medium supplemented with amino acids. Biotechnol Lett. 35, 265-71.
Pines, G., Pines, A., Garst, A. D., Zeitoun, R. I., Lynch, S. A., Gill, R. T., 2014. Codon Compression Algorithms for Saturation Mutagenesis. ACS Synth Biol.
Qi, L. S., Larson, M. H., Gilbert, L. A., Doudna, J. A., Weissman, J. S., Arkin, A. P., Lim, W. A., 2013. Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell. 152, 1173-83.
Raman, S., Rogers, J. K., Taylor, N. D., Church, G. M., 2014. Evolution-guided optimization of biosynthetic pathways. Proc Natl Acad Sci U S A. 111, 17803-8.
Sandoval, N. R., Kim, J. Y., Glebes, T. Y., Reeder, P. J., Aucoin, H. R., Warner, J. R., Gill, R. T., 2012. Strategy for directing combinatorial genome engineering in Escherichia coli. Proc Natl Acad Sci U S A. 109, 10540-5.
Santos, C. N., Stephanopoulos, G., 2008. Combinatorial engineering of microbes for optimizing cellular phenotype. Curr Opin Chem Biol. 12, 168-76.
Sawitzke, J. A., Costantino, N., Li, X. T., Thomason, L. C., Bubunenko, M., Court, C., Court, D. L., 2011. Probing cellular processes with oligo-mediated recombination and using the knowledge gained to optimize recombineering. J Mol Biol. 407, 45-59.
Sharan, S. K., Thomason, L. C., Kuznetsov, S. G., Court, D. L., 2009. Recombineering: a homologous recombination-based method of genetic engineering. Nat Protoc. 4, 206-223.
Song, B. G., Kim, T. K., Jung, Y. M., Lee, Y. H., 2006. Modulation of talA gene in pentose phosphate pathway for overproduction of poly-beta-hydroxybutyrate in transformant Escherichia coli harboring phbCAB operon. J Biosci Bioeng. 102, 237-40.
Toya, Y., Ishii, N., Nakahigashi, K., Hirasawa, T., Soga, T., Tomita, M., Shimizu, K., 2010. 13C-metabolic flux analysis for batch culture of Escherichia coli and its Pyk and Pgi gene knockout mutants based on mass isotopomer distribution of intracellular metabolites. Biotechnol Prog. 26, 975-92.
Tyo, K. E. J., Ajikumar, P. K., Stephanopoulos, G., 2009. Stabilized gene duplication enables long-term selection-free heterologous pathway expression. Nat Biotechnol. 27, 760-U115.
Wang, C. W., Oh, M. K., Liao, J. C., 1999. Engineered isoprenoid pathway enhances astaxanthin production in Escherichia coli. Biotechnol Bioeng. 62, 235-41.
Wang, H. H., Church, G. M., 2011. Multiplexed genome engineering and genotyping methods applications for synthetic biology and metabolic engineering. Methods Enzymol. 498, 409-26.
Wang, H. H., Isaacs, F. J., Carr, P. A., Sun, Z. Z., Xu, G., Forest, C. R., Church, G. M., 2009. Programming cells by multiplex genome engineering and accelerated evolution. Nature. 460, 894-8.
Wang, H. H., Xu, G., Vonner, A. J., Church, G., 2011. Modified bases enable high-efficiency oligonucleotide-mediated allelic replacement via mismatch repair evasion. Nucleic Acids Res. 39, 7336-47.
Warner, J. R., Reeder, P. J., Karimpour-Fard, A., Woodruff, L. B. A., Gill, R. T., 2010. Rapid profiling of a microbial genome using mixtures of barcoded oligonucleotides. Nat Biotechnol. 28, 856-U138.
Woodruff, L. B., Gill, R. T., 2011. Engineering genomes in multiplex. Curr Opin Biotechnol.
Xu, P., Gu, Q., Wang, W., Wong, L., Bower, A. G., Collins, C. H., Koffas, M. A., 2013. Modular optimization of multi-gene pathways for fatty acids production in E. coli. Nat Commun. 4, 1409.
Yang, J., Guo, L., 2014. Biosynthesis of beta-carotene in engineered E. coli using the MEP and MVA pathways. Microb Cell Fact. 13, 160.
Yuan, L. Z., Rouviere, P. E., Larossa, R. A., Suh, W., 2006. Chromosomal promoter replacement of the isoprenoid pathway for enhancing carotenoid production in E. coli. Metab Eng. 8, 79-90.
Zhang, C., Chen, X., Zou, R., Zhou, K., Stephanopoulos, G., Too, H. P., 2013. Combining genotype improvement and statistical media optimization for isoprenoid production in E. coli. PLoS One. 8, e75164.
Zhao, J., Li, Q., Sun, T., Zhu, X., Xu, H., Tang, J., Zhang, X., Ma, Y., 2013. Engineering central metabolic modules of Escherichia coli for improving beta-carotene production. Metab Eng. 17, 42-50.
Zhao, S., Jones, J. A., Lachance, D. M., Bhan, N., Khalidi, O., Venkataraman, S., Wang, Z., Koffas, M. A., 2014. Improvement of catechin production in Escherichia coli through combinatorial metabolic engineering. Metab Eng. 28C, 43-53.

Acknowledgements

We thank George Church for providing EcNR1 cells, Luciano Marraffini for providing the pCas9 plasmid, Thomas E. Kuhlman for providing the pTKRED plasmid, and Francis Cunningham for providing the pACLYC plasmid. This work was supported by the National 973 Project [2011CBA00804, 2012CB725203]; the National Natural Science Foundation of China [NSFC-21176182, NSFC-21206112, NSFC-21390201]; and the National High-tech R&D Program of China [2012AA02A702, 2012AA022103].

Appendix A. Supporting information

Supplementary data associated with this article can be found in the online version at

Figure legends

Figure 1 The CRISPR-Cas9 based system for iterative genome editing. (a) Components of the CRISPR-Cas9 based system. Donor DNA and gRNA plasmid were co-transformed into cells expressing Cas9 and recombineering proteins. Gene deletions, insertions, or replacements were introduced, allowing cells to escape CRISPR mediated cutting by abolishing the protospacer or the PAM sequence. When induced by arabinose, a gRNA targeting the bla gene is expressed to eliminate the gRNA plasmid from the cells. (b) Step-by-step diagram of iterative genome editing. The time required for each step is shown in red.

Figure 2 Characterization and optimization of CRISPR-Cas9 mediated genome editing. (a) Editing efficiency for introducing codon replacements and sequence deletions using either ssDNA or dsDNA with 500 bp homologous arms. To test codon replacement efficiency, three tandem stop codons were introduced into the lacZ coding region. For deletion, a 1 kb region of the lacZ gene was deleted. Error bars, mean ± s.d. (b) Editing efficiency for gene insertions using dsDNA with 500 bp homologous arms as the editing template. A deleted sequence in the lacZ region was restored. Error bars, mean ± s.d. (c) Relative positions on the chromosome of the targets selected for simultaneous modification. (d) Editing efficiency for simultaneous modification of several targets. Two tandem codon replacements were introduced at each site. Error bars, mean ± s.d. (e) The effect of DSB on recombination frequency. A non-targeting gRNA plasmid was introduced as a control to determine the recombination frequency without DSB. For ssDNA mediated editing, three tandem stop codons were introduced into the lacZ coding region, and the recombination frequency without DSB was calculated as the fraction of white colonies on the plates. For dsDNA, a 1 kb kan gene was inserted into the lacZ region, and the recombination frequency without DSB was determined as the fraction of kanamycin resistant colonies in the cell population. With DSB, the recombination frequency for both ssDNA and dsDNA mediated editing was estimated as the ratio of the transformation efficiency of the functional gRNA to that of the non-targeting gRNA. Error bars, mean ± s.d.
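The two recombination-frequency estimates described for panel (e) are simple ratios, and the arithmetic can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis script: the function names are hypothetical, the replicate numbers are invented for the example and are not data from the paper, and the spread is taken as the sample standard deviation.

```python
from statistics import mean, stdev


def colony_fraction(edited, total):
    """Frequency without DSB: edited colonies (white or KanR) over all colonies."""
    if total <= 0:
        raise ValueError("total must be positive")
    return edited / total


def dsb_frequency(eff_targeting, eff_nontargeting):
    """Frequency with DSB: transformation efficiency of the functional gRNA
    divided by that of the non-targeting gRNA (same units for both)."""
    if eff_nontargeting <= 0:
        raise ValueError("non-targeting efficiency must be positive")
    return eff_targeting / eff_nontargeting


# Hypothetical triplicates, for illustration only (not data from the paper).
white_colonies = [4, 6, 5]
all_colonies = [20000, 25000, 22000]
targeting_cfu_per_ug = [1.2e4, 0.9e4, 1.1e4]
nontargeting_cfu_per_ug = [1.0e6, 1.1e6, 0.9e6]

no_dsb = [colony_fraction(w, n) for w, n in zip(white_colonies, all_colonies)]
with_dsb = [
    dsb_frequency(t, c)
    for t, c in zip(targeting_cfu_per_ug, nontargeting_cfu_per_ug)
]

for label, values in (("no DSB", no_dsb), ("DSB", with_dsb)):
    print(f"{label}: {mean(values):.2e} ± {stdev(values):.2e}")
```

Because the two estimates use different denominators (colonies on a plate versus transformants obtained with a non-targeting control), they are best compared on a log scale, as in the figure.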

Figure 3 The effect of the MMR system on CRISPR mediated editing. (a) The number of background escapers in cells with an intact or inactivated MMR system. In each 50 µl electroporation reaction, 100 ng of gRNA plasmid was introduced into either mutS+ or mutS- strains without donor DNA. Non-targeting gRNA plasmid was transformed as a control. Error bars, mean ± s.d. (b) Editing efficiency for generating a 1 kb deletion with 89 bp ssDNA (Group A) and a 1 kb insertion with dsDNA carrying 50 bp homologous arms (Group B) in either mutS+ or mutS- strains. Error bars, mean ± s.d. (c) The effect of the MMR system on correcting point mutations introduced by CRISPR mediated genome editing. One A:C mismatch mutation (1 bp replacement) or three tandem codon replacements (9 bp replacement) were introduced into lacZ by either ssDNA or dsDNA mediated editing. Error bars, mean ± s.d.

Figure 4 Schematic diagram of central metabolic pathways, MEP pathway, and β-carotene biosynthetic pathway in E. coli. Genes tested for overexpression are depicted in green and those tested for deletion are shown in red. G6P, glucose-6-phosphate; F6P, fructose-6-phosphate; FBP, fructose 1,6-bisphosphate; DHAP, dihydroxyacetone phosphate; G3P, glyceraldehyde 3-phosphate; 3PG, 3-phosphoglycerate; PEP, phosphoenolpyruvate; Pyr, pyruvate; 6PG, 6-phosphogluconate; X5P, xylulose-5-phosphate; R5P, ribose-5-phosphate; E4P, erythrose-4-phosphate; S7P, sedoheptulose-7-phosphate; KDPG, 2-keto-3-deoxy-6-phosphogluconate; 2-KG, α-oxoglutarate; OAA, oxaloacetate; Glu, glutamic acid; DXP, 1-deoxy-D-xylulose-5-phosphate; MEP, 2C-methyl-D-erythritol-4-phosphate; CDP-ME, 4-diphosphocytidyl-2C-methyl-D-erythritol; CDP-ME-2P, 4-diphosphocytidyl-2C-methyl-D-erythritol-2-phosphate; MEC, 2C-methyl-D-erythritol 2,4-cyclodiphosphate; HMBDP, 1-hydroxy-2-methyl-2-(E)-butenyl-4-diphosphate; IPP, isopentenyl diphosphate; DPP, dimethylallyl diphosphate; FPP, farnesyl diphosphate; GGPP, geranylgeranyl diphosphate.

Figure 5 Metabolic engineering for enhanced production of β-carotene. (a) Effect of overexpressing MEP pathway genes on β-carotene production. The parental genotypes used for testing gene overexpression effects are presented below the colored lines. The best producer (ZF43) and the strain with all 9 genes (operons) overexpressed (ZF53) are indicated. Fermentations were performed in 2×YT medium for 24 h. Error bars, mean ± s.d. (b) Effect of independently and combinatorially overexpressing or deleting central metabolic pathway genes on β-carotene production. The combination of ptsHIcrr deletion and galP overexpression is abbreviated as GP. Overexpression of edd/eda and of talA/tktB are abbreviated as edd and talA, respectively. Some bottom-ranked clones are not shown. Fermentations were performed in 2×YT medium with 10 g/L glucose for 48 h. Error bars, mean ± s.d. (c) Effect of combinatorial multiplexing of genetic targets to search for the strain yielding the highest β-carotene titer. Fermentations were performed in 2×YT medium with 10 g/L glucose. (d) Time course of β-carotene accumulation and growth of strain ZF237T in fed-batch fermentation.


Highlights

1. Generated near 100% editing efficiency using dsDNA as editing template.
2. One cycle of genomic editing required only two days.
3. Strains with functional MMR system yielded increased editing efficiency.
4. Combinatorially optimized MEP pathway and central metabolic pathways.
5. Best strain produced 2.0 g/L β-carotene using glucose as the sole carbon source.


Figure 1 (image not reproduced). Panel a labels: J23100, gRNA, PBAD, gRNA-bla, bla, cat, cas9, Red αβγ, ColE1, p15A, donor DNA, protospacer, PAM; edit types: insertion, deletion, replacement. Panel b, one editing cycle (2 days per cycle): co-transform donor DNA and gRNA plasmid (0.5 h); plate cells for selecting gRNA transformants (overnight); analyze mutants by PCR or phenotypic screening (2 h); inoculate correct mutants for plasmid curing (6~8 h); streak cells for isolating single colonies (overnight); grow single colony to mid-log phase and make electrocompetent (6~8 h).

Figure 2 (image not reproduced). Panel a: editing efficiency (%) versus deletion length (bp) for ssDNA and dsDNA. Panel b: editing efficiency (%) versus insertion length (bp). Panel c: chromosome map marking lacZ, galK, ldhA, oriC, and the terminus. Panel d: fraction of cells with all targeted mutations for lacZ galK and for lacZ galK ldhA. Panel e: recombination frequency (log scale) with and without DSB for ssDNA and dsDNA.

Figure 3 (image not reproduced). Panel a: colony number (log scale) by targeting site (control, lacZ, adhE, pflB, lldD) for mutS+ and mutS- strains. Panel b: editing efficiency (%) for Group A and Group B in mutS+ and mutS- strains. Panel c: editing efficiency (%) for 1 bp and 9 bp replacements using ssDNA or dsDNA with homologous arm lengths of 100 bp, 250 bp, 500 bp, and 1 kb.

Figure 4 (image not reproduced). Pathway map with three blocks titled central metabolic pathways, MEP pathway, and β-carotene synthetic pathway. Gene labels: ptsHIcrr, galP, edd, eda, talA, tktB, tpiA, ppsA, pykF, pckA, gdhA, dxs, dxr, ispD, ispE, ispF, ispG, ispH, idi, ispA, gps, crtE, crtB, crtI, crtY. Metabolite labels follow the abbreviations in the Figure 4 legend, plus glucose, phytoene, lycopene, and β-carotene.

Figure 5 (image not reproduced). Panel a: β-carotene titer (mg/L) for MEP pathway gene overexpression in successive parental backgrounds (ZF02, gps, gps-idi, gps-idi-dxs, gps-idi-ispA, gps-idi-ispA-ispH, gps-idi-ispA-ispH-dxs); ZF43 and ZF53 are marked. Panel b: β-carotene titer (mg/L) under independent and combined modulation. Panel c: β-carotene titer (mg/L) for the backgrounds GP, GP-tpiA, GP-edd, and GP-tpiA-edd combined with gps, gps-dxs, gps-crtE, gps-dxs-crtE, and gps-crtE-ispA. Panel d: β-carotene titer (mg/L) and OD600 versus time (h).
