Accepted Article

Received Date : 02-Apr-2014 Revised Date : 21-May-2014 Accepted Date : 27-May-2014 Article type

: MiniReview

Editor

: Zhao, Huimin

2

Submitted to FEMS Yeast Research, 5/21/2014

Recent Advances in DNA Assembly Technologies Ran Chao1, Yongbo Yuan1, Huimin Zhao1, 2*

1

Department of Chemical and Biomolecular Engineering, Institute for Genomic Biology

Departments of Chemistry, Biochemistry, and Bioengineering, University of Illinois at UrbanaChampaign, Urbana, IL 61801

* To whom correspondence should be addressed. Phone: (217) 333-2631. Fax: (217) 333-5052.

E-mail: [email protected]. Running title: DNA assembly technologies This article has been accepted for publication and undergone full peer review but has not been through the copyediting, typesetting, pagination and proofreading process, which may lead to differences between this version and the Version of Record. Please cite this article as doi: 10.1111/1567-1364.12171 This article is protected by copyright. All rights reserved.

Accepted Article

Abstract DNA assembly is one of the most important foundational technologies for synthetic biology and metabolic engineering. Since the development of the restriction digestion and ligation method in the early 1970s, a significant amount of effort has been devoted to developing better DNA assembly methods with higher efficiency, fidelity, and modularity, as well as simpler and faster protocols. This review will not only summarize the key DNA assembly methods and their recent applications, but also highlight the innovations in assembly schemes and the challenges in automating the DNA assembly methods.

Keywords: Metabolic engineering; Synthetic biology; Pathway construction; Pathway

optimization; Genome synthesis; Automation

Introduction In the endeavors of metabolic engineering and synthetic biology, methods for putting genetic

parts together into replicable and expressible DNA molecules are critical to rapid prototyping of the metabolic pathways or genetic circuits of interest. Although DNA can be chemically synthesized to a certain length, construction of larger fragments still relies on enzymatic assembly methods (Kosuri & Church, 2014). The first developed DNA assembly method is the restriction digestion and ligation method (Cohen, et al., 1973), which led to the biotechnology revolution. Though popular, the limitations of this 40-year-old method restrict our ability of

This article is protected by copyright. All rights reserved.

Accepted Article

synthesizing complex DNA molecules. Increasingly complicated DNA construct design involving multiple genes and intergenic components requires higher efficiency and fidelity in DNA assembly that is beyond the capability of the traditional cloning methods based on restriction digestion and ligation (Cobb, et al., 2013). Moreover, the large size of these DNA constructs makes the selection of unique restriction sites extremely difficult. Even if restriction sites could be selected for a specific construct, they would most likely not be applicable for a different construct, which cripples the modularity of DNA assembly, a hallmark of synthetic biology. Actually, modularity is highly desired in various research projects. For example, combinatorial pathway library construction and screening has been demonstrated as an effective approach for pathway optimization, in which various genetic parts of similar functions are mixed and matched to search for particular combinations that improve the metabolic flux or other traits (Kim, et al., 2013, Xu, et al., 2013, Yuan, et al., 2013). Similarly, in genetic circuit design, expedited prototyping requires assembly of many characterized pre-fabricated parts (Canton, et al., 2008, Purnick & Weiss, 2009). Therefore, broadly applicable and highly efficient DNA assembly methods are desirable. Furthermore, to survey a greater range of combinations or designs, DNA assembly will most likely be performed in large scales via automation. Highthroughput DNA assembly requires robust and standardized protocols, which necessitates improvements in assembly methods for even higher efficiency, fidelity, and modularity.

A number of new DNA assembly methods have been developed in the past decade. Some of them were built upon the traditional restriction digestion and ligation method, such as the Golden Gate method (Engler, et al., 2008), while others harness different mechanisms, such as Gibson assembly (Weber, et al., 2011) and DNA Assembler (Shao & Zhao, 2009). By changing reaction

This article is protected by copyright. All rights reserved.

Accepted Article

protocols and linker regions between DNA parts to be assembled, these new methods have achieved improved efficiency, fidelity, and modularity, which have simplified both design and bench-side operation. Based on these methods, a number of innovations in assembly scheme and the resulting toolkits have been reported (Shetty, et al., 2008, Xu, et al., 2012, Casini, et al., 2014), which opened doors to a wide variety of applications. One major application with great industrial and scientific significance is the construction and engineering of metabolic pathways for production of chemicals and fuels. This review takes a glance at the state-of-the-art DNA assembly technologies and focuses on the applications of DNA assembly methods for pathway construction and engineering.

Key DNA Assembly Methods Based on the different assembly mechanisms employed, the various recently developed key DNA assembly methods can be divided into four groups, including restriction enzyme based methods, in vivo and in vitro sequence homology based methods, and bridging oligo based

methods (Fig. 1).

Restriction Enzymes Based Methods Restriction digestion and ligation using type II restriction enzymes and DNA ligase has been exploited as the standard cloning technique for about 40 years in molecular biology. A few improvements have been made based on the original method. The BioBrickTM standard was the first DNA assembly strategy that allows the sequential assembly of standard biological parts. It

This article is protected by copyright. All rights reserved.

Accepted Article

employs iterative cycles of restriction digestion and ligation reactions to assemble small DNA parts into a large DNA construct (Smolke, 2009, Sarrion-Perdigones, et al., 2011). Each DNA part is flanked by EcoRI and XbaI restriction sites in the upstream end and by SpeI and PstI restriction sites in the downstream end. XbaI and SpeI are isocaudamers that generate two

compatible sticky ends. Once ligated, the newly generated scar sequence (ACTAGA) between the two DNA parts is different from both original sites, and therefore cannot be cut in subsequent digestions with either XbaI or SpeI. On the other end of the inserted fragment, EcoRI is restored while a new XbaI site is introduced. Hence, the insertion can be repeated. The original BioBrickTM design had two extra nucleotides besides the natural 6-nucleotide scar which

hampers its application in protein fusion as it creates frame shifts and premature stop codons. Improved revisions of this method have been developed to address this issue by starting with two standards specifically designed to assist assembly of fusion proteins (Phillips & Silver, 2006, Grünberg, et al., 2009). A smaller 6-nucleotide scar sequence (ACTAGA) encoding threoninearginine is generated in each conjunction, which eliminates open reading frame shifts and stop codons. More recently, a modified standard called BglBrick (Anderson, et al., 2010) was reported. It addresses some key problems associated with the original BioBrickTM standards. The

DNA parts are flanked by restriction sites of more efficient and methylation insensitive enzymes: BglII and BamHI. The scar sequence (GGATCT) encodes glycine-serine so that it is also suitable for protein fusion applications.

The Golden Gate method (Engler, et al., 2008, Engler, et al., 2009) relies on type IIs restriction enzymes, which are able to cleave DNA outside of their recognition site and produce an overhang of 4 arbitrary nucleotides (if the most popular BsaI is used). When designed properly,

This article is protected by copyright. All rights reserved.

Accepted Article

two digested fragments can be ligated to generate a product lacking the original restriction sites. This method also allows restriction digestion and ligation cycling in a one-pot reaction at 37°C and 16°C, which can greatly increase the efficiency, driving the reaction to completion. Notwithstanding, it is not suitable to assemble long DNA constructs due to the lack of unique restriction enzymes. However, an improvement has been made on BasI-based Golden Gate method, which uses SapI with a rarer cut site than BasI (Whitman, et al., 2013). Another method termed MASTER (methylation-assisted tailorable ends rational) uses endonuclease MspJI which specifically recognizes methylated 4-bp sites, mCNNR (R=A or G), and generates a 4-bp

arbitrary overhang like type IIs endonucleases (Chen, et al., 2013). Since it avoids cuts on corresponding type IIs sites within the fragments as in the Golden Gate method, the MASTER method is more suitable for assembling large DNA constructs. However, it requires expensive methylated primers and PCR amplification of parts which may introduce errors for long parts.

Although restriction enzyme based methods are able to assemble multiple DNA parts into relatively large constructs, all DNA parts are required to be free of the restriction sites used in the assembly. Even though such sites can be removed by site-directed mutagenesis prior to assembly, it would require extra effort and cost to do so. Furthermore, restriction enzyme based methods rely on annealing of short sticky ends that may have limited affinity and specificity when assembling multiple DNA parts in one pot.

This article is protected by copyright. All rights reserved.

Accepted Article

Sequence Homology Based Methods: in vitro Sequence homology based methods usually utilize longer arbitrary overlapping regions between parts, which prevents the same issues as with restriction enzyme methods. Both in vitro and in vivo sequence homology based methods have been developed. Overlap extension polymerase chain reaction (OE-PCR) enables scarless assembly of DNA parts (Horton, et al., 1989). Basic DNA parts are first amplified in separate PCRs with homologous ends between them. With the corresponding homologous regions, these DNA parts can anneal to each other and be extended by DNA polymerase in the second round of PCR to yield spliced DNA molecules. The resulting larger DNA fragments can then be inserted into plasmids using the restriction digestion and ligation method. In order to improve the efficiency of annealing, special GC-rich overlap sequences are designed to mediate reliable fusion PCR (Cha-aim, et al., 2009). Recently, one method named circular polymerase extension cloning (CPEC) (Quan & Tian, 2009) was developed allowing assembly of multiple inserts with any vector in a one-step OE-PCR reaction. With extended homologous regions, the fused DNA molecules circularize with a nick in each strand. After transformation, host Escherichia coli fixes the nicks to form intact vectors. Furthermore, it was shown that non-linearized vectors can be used as templates for extension followed by DpnI digestion to eliminate the original plasmids (Bryksin & Matsumura, 2010). In another sequence homology based in vitro DNA assembly method called SLIC (sequence and ligation-independent cloning), recombination intermediates generated in vitro are transformed into cells. Endogenous DNA repair machinery was utilized to finish the repair and generate recombinant DNA molecules (Li & Elledge, 2007). The 3’ ends of the linearized vector and the

This article is protected by copyright. All rights reserved.

Accepted Article

overlap regions of the inserts are chewed back by T4 DNA polymerase in the absence of dNTPs and left as single-stranded. Subsequently, the RecA protein and ATP are used to promote recombination before being used to transform E. coli. The gaps are also fixed by host E. coli in vivo. A follow-up method named SLiCE (Seamless Ligation Cloning Extract) (Zhang, et al., 2012) uses inexpensive E. coli cell extracts to drive homology-mediate DNA assembly, which significantly reduces the cost. A disadvantage of this strategy is that the length of single-strand overlaps is not very controllable in the chew-back reaction. A modified method is called uracilspecific excision reagent cloning (USER) (Smith, et al., 1993, Nour-Eldin, et al., 2010). A single deoxyuridine residue is included in each primer in the PCR. The PCR products and the vector backbone are then excised by uracil DNA glycosylase and DNA glycosylase-lyase Endo VIII to produce 3’ overhangs. The deoxyuridine residues introduced in this method, however, significantly increase the cost of primers. Similar to SLIC, the Gibson assembly method (Weber, et al., 2011) utilizes T5 exonuclease to chew back the 5’ ends to generate single-stranded complementary overhangs which are joined together covalently by Phusion DNA polymerase and Taq DNA ligase. In a one-step isothermal in vitro reaction at 50 °C, the fragments can be assembled into a single circular DNA molecule. Similar to USER, Wang et al. developed a method termed nicking endonucleases for LIC (NELIC) (Wang, et al., 2013), in which nicking endonucleases (NEases) are exploited to generate overhangs of controlled lengths, although the NEase recognition sites are left as a scar. Notably, both Gibson assembly and NE-LIC feature an in vitro ligation reaction which was found to increase the efficiency of DNA assembly. Colloms and co-workers reported a method named SIRA (serine integrase recombinational assembly) (Colloms, et al., 2013). SIRA takes advantage of the recombination machinery of

This article is protected by copyright. All rights reserved.

Accepted Article

ϕC31 integrase from phage. ϕC31 cuts at attP and attB sites and rejoin the exchanged half sites to form new attL and attR sites in vitro. By the addition of phage recombination directionality protein (RDF), the reaction can be reversed which allows removal of recombined fragments. This unique feature of SIRA allows future editing of the DNA constructs. In addition to these methods, two well-known commercial kits are available for plasmid construction. One kit is In-FusionTM (Clontech), which is a proprietary version of the “chew back and anneal” method to directionally clone one or more fragments into any vector. PCR-generated or vector-linearized DNA fragments with a 15 bp sequence overlap at their ends can be assembled in a fast and efficient way. Another kit is GatewayTM (Life Technologies), which

utilizes a specific recombinase to mediate the recombination of specific overlap sequences. This method allows fast and efficient cloning of single genes into flexible vectors but is not suitable for assembly of large DNAs due to the remaining same sequences at each junction.

Sequence Homology Based Methods: in vivo Homologous recombination occurs naturally in Saccharomyces cerevisiae with high efficiency and fidelity, which was exploited by Gibson et al. to construct the Mycoplasma genitalium

genome by assembling 25 DNA fragments (Gibson, Benders, Axelrod, et al., 2008) and by Shao et al. to construct a large pathway by assembling multiple fragments (Shao, et al., 2009) nearly at the same time. In the method developed by Shao et al., or so-called DNA Assembler, all DNA parts to be assembled can be obtained either from PCR amplification or restriction digestion with homologous arms between neighboring parts in the pathway. All the linear DNA parts are

This article is protected by copyright. All rights reserved.

Accepted Article

directly transformed into S. cerevisiae. Circular plasmids are then constructed by its endogenous homologous recombination machinery. With a similar mechanism, DNA assembly was also performed in other organisms such as Bacillus subtilis (Yonemura, et al., 2007, Itaya, et al., 2008) and certain plants (Zhu, et al., 2008, Farre, et al., 2012). Recently a promising direct DNA cloning method in E. coli was reported by Fu et al. (Zhang, et al., 2000, Fu, et al., 2012). Based on the discovery that the full-length Rac prophage proteins RecE and RecT facilitate highly efficient homologous recombination between two linear DNA molecules in E. coli, digested or sheared DNA was transformed together with PCR-amplified vector containing terminal homology arms into the host which over-expressed full-length RecET promoting formation of an intact vector. By using this strategy, ten megasynthetase gene clusters from 10-52 kb were cloned into expression vectors.

Bridging Oligo Based Methods Different from sequence homology based methods, Kok and coworkers developed a DNA assembly method based on ligase cycling reaction (LCR) (Kok, et al., 2014). Single stranded bridging oligos are designed to be complementary to two ends of neighboring DNA parts. Based on a one-step DNA assembly method named chain reaction cloning (CRC) (Pachuk, et al., 2000), the assembly conditions were systematically optimized. DNA parts to be assembled via LCR were amplified using 5’-phosphorylated primers. During the reaction, the double stranded DNA parts are denatured at a high temperature, and then the upper (or lower) strands from the two parts to be adjoined anneal with the bridging oligo at a lower temperature and are subsequently joined scarlessly by thermostable ligase via a phosphodiester bond. In following cycles, the

This article is protected by copyright. All rights reserved.

Accepted Article

ligated strand serves as the template to assemble the complementary strands. By running multiple thermal cycles, linear DNA parts can be assembled into circular DNA constructs and then transformed into E. coli competent cells for amplification. By comparing this method with other available scarless and sequence-independent DNA assembly methods, the authors found that LCR method with optimized conditions had similar fidelity with yeast homologous recombination when assembling up to 12 DNA parts, whereas CPEC and Gibson isothermal assembly had lower fidelity under the same conditions.

Innovations in DNA Assembly Schemes Besides the reaction mechanisms, the scheme by which DNA parts are put together greatly affects the quality and simplicity of assembly. Building multi-fragment pathways is usually problematic regardless of the method. In sequence homology based methods, unexpected homology between fragments and non-homologous end joining (NHEJ) will yield mis-assembled products. Since usually the selection after assembly only determines if the vector is present and replicable, there is no guarantee that each fragment is correctly assembled. It is common to find fragments omitted or swapped. Wingler and Cornish proposed an iterative integration scheme (Fig. 2) to ensure the incorporation of each fragment (Wingler & Cornish, 2011). The fragments are inserted into two types of serial donor plasmids: Type A with Homing Endonuclease 1, Recognition Site 2, and Marker 1; and Type B with Homing Endonuclease 2, Recognition Site 1, and Marker 2. Type A and B alternate in that series, in which the plasmids have homologous regions that can be linked head-to-tail. When the host is cured with the first donor (Type A), recognition site 1 on the host genome is cut, which facilitates integration of a part of the donor

This article is protected by copyright. All rights reserved.

Accepted Article

containing the first fragment, Marker 1, and Recognition Site 2 and replaces Marker 2 and Recognition Site 1 on the genome. Then the host is selected against Marker 1. Similarly, when the host is cured with the second donor (Type B), the host is changed back to Marker 2 and selected accordingly. The subsequent fragments are integrated in the same way. This work innovatively employed alternating markers to apply selective pressure for each fragment. However, the increased fidelity comes at a cost of time. Sequential assembly is obviously less appealing over one-step assembly of multiple fragments if the required level of fidelity and efficiency could be reached. However, highly accurate and efficient one-step assembly is not easy to accomplish and requires additional engineering. In the in vivo homologous recombination based assembly methods, NHEJ of the linearized vectors contributes to most of the false positives. One solution could be to introduce a counter-selective marker at the cloning site (Anderson & Haj-Ahmad, 2003). In order to survive, the host must have the counter selective marker replaced by the designated inserts. To further minimize this problem, separating the essential elements on the vector, selection marker and episome has been proposed (Kuijpers, et al., 2013). As the host organism needs at least 2 NHEJs to incorporate both elements, the probability of false positives was greatly reduced. With this new scheme, 9 fragments were assembled by 60-bp overlapping regions with a correct yield of 95%. Most of the DNA assembly methods rely on overlapping linker sequences, no matter short or long, shared between adjacent fragments. The linkers in some of the methods can be customized, which leaves freedom for further optimization. Liang and co-workers have optimized the 4-bp linkers for synthesizing customized recombinant transcription activator-like effectors (TALEs) with the Golden Gate method (Liang, et al., 2013). TALEs are proteins that can target DNA sequences with specifically ordered central repeat domains (CRDs). Each CRD recognizes one

This article is protected by copyright. All rights reserved.

Accepted Article

nucleotide. Linkers with higher efficiency were selected by experiment. With these selected linkers, 13 fragments or DNA sequences coding for up to 31 CRDs were ligated together with a nuclease domain in one step. Without picking single colonies, 96% of the assembled TALE nucleases were found to be functional. Nonfunctional ones were sequenced and found to be correctly assembled too. With the improved efficiency and fidelity brought by the optimized linkers, colony picking could be avoided, which made this process much more automationfriendly compared to a closely related approach (Kim, et al., 2013). Building genetic circuits usually involves multiple design-built-test cycles, in which rapid prototyping is desired. Sequence homology based methods such as the Gibson assembly method allow assembly of multiple standardized genetic parts in one step. However, as the DNA molecules coding for these circuits are likely to be highly repetitive, sequence homology based methods will suffer from low efficiency and mis-assembly. Thus synthetic linkers that are unique for each fragment should be used instead of the endogenous sequences to ensure the complementary single stranded ends of the fragments to anneal as designed. Moreover, the use of predefined sets of synthetic linkers modularizes the sequence homology based assembly. Since there is full freedom in designing the synthetic linkers, more parameters can be taken into consideration to further improve the efficiency and fidelity. Due to the large number of possible candidates, 4n, where is the number of base pairs in the linkers, computational approaches are

usually preferred. A few algorithms have been implemented to generate the optimal orthogonal set of synthetic linker sequences. Most of these algorithms screens a large number of random oligo nucleotide sequences of a certain length. Although different thresholds are used, the screens commonly applied include: 1) orthogonality among the oligo sequences; 2) homology against host genomes; 3) melting temperature; 4) GC content and distribution; and 5) likelihood

This article is protected by copyright. All rights reserved.

Accepted Article

of forming hairpins and primer dimers. Using computationally designed linkers, it was possible to assemble over 30 parts with above 10% fidelity (Guye, et al., 2013). Another work reached ~85% fidelity when assembling 5 fragments with ~80% identity (Torella, et al., 2013). R2oDNA Designer is an on-line computational tool for designing the optimized linker set for Gibson assembly or similar methods (Casini, et al., 2014). It was shown that designed linkers yielded an improved assembly efficiency compared to the conventional method (Casini, et al., 2014). On the other hand, the synthetic biology community keeps making great progress in standardization of assembly schemes with characterized genetic parts. Owing to the modularity of BioBrickTM standard and the contributions of the community, thousands of standard parts are

available in the growing public registry (Partregistry, 2014), which have greatly facilitated rapid construction of gene circuits and beyond (Smolke, 2009, McLennan, 2012, Partregistry, 2014). Xu and co-workers developed ePathBrick, i.e., a number of BiobrickTM vectors carrying characterized regulatory signal elements, which allows combinatorial assembly of promoters, ribosome binding sites, and terminators for fine tuning of gene expressions in pathways (Xu, et al., 2012). ePathBrick vectors contain 4 isocaudomers that are compatible with the BioBrickTM standard, adding a fine-tuning toolkit to the growing BioBrickTM part collection. Tool kits based

on Golden Gate method, such as MoClo (Weber, et al., 2011) and GoldenBraid (SarrionPerdigones, et al., 2011) have also been developed for modular assembly of large pathways. Due to the nature of Golden Gate assembly, these kits allow rapid construction of multiple parts in one pot. With these standardized parts, researchers will be able to test their designs by quickly assembling without “remachining” them to fit each other.

This article is protected by copyright. All rights reserved.

Accepted Article

Successful Applications Although most of the above-mentioned DNA assembly methods were developed in the past few years, there are already many successful applications. For example, one application of the DNA assembly methods is to discover novel natural products. Activation of cryptic or silent gene clusters encoding biosynthesis of natural products is usually difficult but is an important way to discover novel natural products. Recently, Shao and coworkers developed a novel plug-and-play strategy for natural product discovery in which a set of constitutive promoters, which were proven functional in the target expression host, were assembled upstream of each pathway gene to refactor silent biosynthetic pathways (Shao, et al., 2013). With this strategy, a silent spectinabilin pathway from Streptomyces orinoci was successfully activated. Similarly, a cryptic polycyclic tetramate macrolactams (PTMs) biosynthetic gene cluster from Streptomyces griseus was successfully activated and three new PTMs were discovered (Luo, et al., 2013). Another application of the DNA assembly methods is to design and characterize gene circuits, which can not only help researchers gain better understanding of intra-cellular (Elowitz & Leibler, 2000, Atkinson, et al., 2003) and inter-cellular (Bulter, et al., 2004, Chen & Weiss, 2005,

Wang, et al., 2008) regulatory machineries but also have a wide range of potential applications (Wall, et al., 2004, Purnick & Weiss, 2009). The rapid development of gene circuits should be attributed to the consistent effort in standardization of genetic parts and devices (Canton, et al., 2008, Shetty, et al., 2008). As discussed earlier, BioBrickTM parts can be readily assembled like Legos for prototyping of gene circuits, however, the sequential nature of the BioBrick standard would be time consuming if multiple parts are assembled. The Gibson assembly method allows one-pot assembly of multiple pieces of DNA regardless of restriction site availability. Using

This article is protected by copyright. All rights reserved.

Accepted Article

synthetic linkers, high-throughput isothermal assembly of complex gene circuits of up to 33 parts was made possible (Guye, et al., 2013, Torella, et al., 2013). A third application of the DNA assembly methods is to synthesize genomes. The J. Craig Venter Institute synthesized a 583 kb Mycoplasma genitalium genome by a combination of in vitro

enzymatic and in vivo homologous recombination based methods (Gibson, et al., 2008). In the early stage, in vitro recombination method was utilized to assemble 25 DNA cassettes with an average length of 24 kb to eight 72 kb assemblies and subsequently assembled into four 144 kb assemblies. It was found that the efficiency of in vitro procedure declined as the assemblies became larger and the half-genome in size of 290 kb each was unable to be assembled. Therefore the in vivo S. cerevisiae recombination method was exploited to complete the final whole genome assembly. In the meantime, it was also found possible to directly assemble 25 DNA cassettes from the earliest stages into a complete genome in a single step by in vivo recombination in S. cerevisiae (Gibson, et al., 2008). Two years later, a Mycoplasma mycoides cell controlled by the chemically synthesized genome was created and exhibited the in silico designed phenotype (Gibson, et al., 2010). The 16.3 kb mouse mitochondrial genome was also assembled via in vitro isothermal recombination method from 600 DNA pieces with 60 bp overlaps (Gibson, et al., 2010). Very recently, a full designer S. cerevisiae chromosome was synthesized (Beauverd, et al., 2014). These successes are excellent demonstrations of the power of modern DNA assembly methods. Another very important application of the DNA assembly methods is pathway optimization for metabolic engineering. DNA Assembler is an outstanding pathway optimization tool, especially in S. cerevisiae. Du and co-workers developed a simple and efficient strategy for multi-gene pathway optimization strategy termed customized optimization of metabolic pathways by

This article is protected by copyright. All rights reserved.

Accepted Article

combinatorial transcriptional engineering (COMPACTER) (Du, et al., 2012, Yuan, et al., 2013). In this strategy, a library of promoters with varying strengths were used for each of the structural genes in either the xylose utilizing pathway or the cellobiose utilizing pathway and assembled together with the corresponding structural genes and terminators to generate a library of xylose or cellobiose utilizing pathways. Using a cell growth based high throughput screening strategy, a heterologous xylose utilizing pathway with high efficiency and a heterologous cellobiose utilizing pathway with the highest reported efficiency for both laboratory and industrial S. cerevisiae strains were isolated. The heterologous cellobiose utilizing pathway was further optimized through directed evolution (Yuan & Zhao, 2013). Similar to COMPACTER, errorprone PCR generated promoter mutants were assembled with corresponding genes by DNA Assembler to create the first round pathway library. Another pathway library was created based on the best pathway mutant from the first round and screened again. This iterative process was continued until no further improvement was observed. After 3 rounds of screening, the final industrial S. cerevisiae strain showed 6-fold higher cellobiose consumption rate and ethanol

productivity. Also using DNA Assembler, Eriksen and coworkers improved the cellobiose pathway by simultaneous directed evolution of cellodextrin transporter and β-glucosidase (Eriksen, et al., 2013). Kim and co-workers developed a combinatorial pathway engineering approach to rapidly create/screen a highly efficient xylose utilizing pathway in S. cerevisiae (Kim, et al., 2013). A total of 20 xylose reductase homologues, 22 xylitol dehydrogenase homologues and 19 xylulose homologues were cloned from different organisms. Using DNA assembler, the homologue libraries were first inserted into 3 different helper plasmids containing 3 pairs of promoters and terminators accordingly to generate expression cassettes. These cassettes were then PCR amplified and assembled by DNA Assembler to create the xylose

This article is protected by copyright. All rights reserved.

Accepted Article

pathway library. The optimized pathway was then identified by rapid growth-based screening (Fig. 3). As in gene circuit design, characterized pre-fabricated DNA parts can also accelerate the pathway optimization process. ePathBrick is a pathway fine-tuning toolkit compatible with BioBrickTM standard, which was recently used to optimize the fatty acid biosynthetic pathway in E. coli (Xu, et al., 2013). The pathway was first engineered by gene knock-out and overexpressions of a number of fatty acid biosynthesis related genes. Then the entire pathway was divided to 3 modules: glycolysis (GLY), acetyl-CoA activation (ACA), and fatty acid biosynthesis (FAS). GLY, ACA, and FAS modules were assembled to 5 ePathBrick vectors with different copy numbers and promoter strengths. Eventually, the engineered strain achieved the highest fatty acid productivity ever reported. Also following BioBrickTM standard, a 7 gene

carotenoid biosynthetic pathway was fine-tuned by spanning the expression space with computationally designed ribosome binding sites (RBS) of various strengths (Salis, et al., 2009, Zelcbuch, et al., 2013). The RBS libraries were randomly assembled with the genes in the BioBrickTM iterative cloning process. After screening, an astaxanthin yield of 4-fold higher than

previously reported was achieved.

Conclusions and Future Prospects Over the past decade, DNA assembly technologies have made great advances. Most researchers of the synthetic biology and metabolic engineering community have adopted and benefited from these advances (Kahl & Endy, 2013). Nevertheless, there are still many challenges to overcome. The assembly quality still depends on the fragment sequences while the efficiency and fidelity

This article is protected by copyright. All rights reserved.

Accepted Article

still need to be improved. High failure rate makes DNA assembly time and labor consuming. There is also room for the modularity to improve as well. Ideally, we should be able to “bolt on” genetic parts with standard “fasteners” and “adaptors” without too much concern about compatibility. Automated design will undoubtedly have broader applications (Ellis, et al., 2011, MacDonald, et al., 2011). Besides simple computer aided design tools for drawing the maps for one or several constructs, more high-throughput design tools will soon become available due to the needs in combinatorial assembly and multiplex genome engineering. Moreover, these high-throughput tools should be able to generate primer sequences according to the assembly methods and schemes as well as to interface with robotic liquid handlers in order to set up PCR and assembly reaction systems automatically. J5, a software package developed by the Joint BioEnergy Institute can help mass-process thousands of assemblies and generate the primer sequences for a number of commonly used assembly methods (Hillson, et al., 2012). J5 can also generate the control files for automated liquid handers to set up PCR reactions. By the same group, a crossplatform language has been proposed to simplify the programming of liquid handlers (Linshiz, et al., 2013). Since it is difficult for a generalized algorithm to fulfill all research purposes, specialized tools are also necessary. Liang and co-workers developed a high-throughput TALE synthesis platform (Liang, et al., 2013). The computational tool in this work generated liquid handler control files based on the target binding sequences for setting up assembly reactions automatically. With the development of laboratory automation devices, there is a potential to extend the level of automation in the assembly process. Most up-to-date robotic liquid handlers are able to carry out relatively accurate pipetting and transportation of labware (Kong, et al., 2012). Some of the more

This article is protected by copyright. All rights reserved.

Accepted Article

complex platforms integrate temperature control units, shakers, and detection devices on board. Thus cherry-picking genetic parts, adding reagents, incubation, or even thermocycling could be readily automated, while steps after assembly reactions still have barriers to be fully automated. In this type of bio-fabrication line, any procedure that requires large amount of human intervention would be a throughput bottleneck. With the increasing needs, full automation of the assembly process may become the next game-changing technology. In the post assembly steps, colony picking especially, even if automated by colony picker and petri-dish handlers, would still significantly lower the throughput of the whole process. The best scenario is when the assembly has enough fidelity that colony picking could be avoided. Although some DNA constructs can be verified by functional assays, lack of low cost DNA analyzing devices still remains a technical hurdle. Application of microfluidics in molecular biology has made promising advances which may turn out to be the enabling technologies for high-throughput DNA assembly confirmation (Mueller, et al., 2000, Dorfman, et al., 2013). Taking advantage of high throughput capillary electrophoresis, a cost-effective method for verification of DNA constructs has been reported (Dharmadi, et al., 2014). With the decreasing cost of DNA sequencing (Shendure & Ji, 2008, Clarke, et al., 2009), it may also become practical to directly sequence all assembled DNA molecules in the future. Further in the future, de novo synthesis of large DNA fragments may also be more affordable (Goldberg, 2013). De novo synthesis can help prefabricate DNA fragments with necessary linkers or mutations that make the downstream assembly easier. We envision that DNA assembly will soon become a cost-effective service provided by specialized corporations that handle orders on fully automated bio-assembly lines. By this way, synthetic biologists and metabolic engineers can focus more on designing and testing their ideas instead of the problematic assembly process of DNA molecules.

This article is protected by copyright. All rights reserved.

Accepted Article

Acknowledgements We thank the National Institutes of Health (GM077596), the National Academies Keck Futures Initiative on Synthetic Biology, Department of Defense Advanced Research Program Agency, Institute for Genomic Biology at the University of Illinois at Urbana-Champaign, and the Energy Biosciences Institute for financial support in our development and application of DNA assembly technologies. In addition, we state that there is no conflict of interest.

Authors’ contribution R.C. and Y.Y. contributed equally to this publication.

References Anderson JC, Dueber JE, Leguia M, Wu GC, Goler JA, Arkin AP & Keasling JD (2010) BglBricks: A flexible standard for biological part assembly. J Biol Eng 4: 1. Anderson PR & Haj-Ahmad Y (2003) Counter-selection facilitated plasmid construction by homologous recombination in Saccharomyces cerevisiae. BioTechniques 35: 692-694, 696, 698. Atkinson MR, Savageau MA, Myers JT & Ninfa AJ (2003) Development of genetic circuitry exhibiting toggle switch or oscillatory behavior in Escherichia coli. Cell 113: 597-607. Beauverd Y, Roosnek E, Tirefort Y, et al. (2014) Validation of the disease risk index for outcome of patients undergoing allogeneic hematopoietic stem cell transplantation after T cell-depletion. Biol Blood Marrow Transplant. doi:10.1016/j.bbmt.2014.04.023 Bryksin AV & Matsumura I (2010) Overlap extension PCR cloning: a simple and reliable way to create recombinant plasmids. BioTechniques 48: 463-465. Bulter T, Lee SG, Woirl WWC, Fung E, Connor MR & Liao JC (2004) Design of artificial cellcell communication using gene and metabolic networks. Proc Natl Acad Sci USA 101: 2299-2304. Canton B, Labno A & Endy D (2008) Refinement and standardization of synthetic biological parts and devices. Nat Biotechnol 26: 787-793. Casini A, Christodoulou G, Freemont PS, Baldwin GS, Ellis T & MacDonald JT (2014) R2oDNA Designer: computational design of biologically neutral synthetic DNA sequences. ACS Synth Biol DOI: 10.1021/sb4001323.

This article is protected by copyright. All rights reserved.

Accepted Article

Casini A, MacDonald JT, De Jonghe J, Christodoulou G, Freemont PS, Baldwin GS & Ellis T (2014) One-pot DNA construction for synthetic biology: the Modular Overlap-Directed Assembly with Linkers (MODAL) strategy. Nucleic Acids Res 42: e7. Cha-aim K, Fukunaga T, Hoshida H & Akada R (2009) Reliable fusion PCR mediated by GCrich overlap sequences. Gene 434: 43-49. Chen MT & Weiss R (2005) Artificial cell-cell communication in yeast Saccharomyces cerevisiae using signaling elements from Arabidopsis thaliana. Nat Biotechnol 23: 15511555. Chen WH, Qin ZJ, Wang J & Zhao GP (2013) The MASTER (methylation-assisted tailorable ends rational) ligation method for seamless DNA assembly. Nucleic Acids Res. 41: e93. Clarke J, Wu HC, Jayasinghe L, Patel A, Reid S & Bayley H (2009) Continuous base identification for single-molecule nanopore DNA sequencing. Nat Nanotechnol 4: 265-270. Cobb RE, Ning JC & Zhao H (2013) DNA assembly techniques for next-generation combinatorial biosynthesis of natural products. J Ind Microbiol Biotechnol. 41: 469-477 Cohen SN, Chang AC, Boyer HW & Helling RB (1973) Construction of biologically functional bacterial plasmids in vitro. Proc Natl Acad Sci USA 70: 3240-3244. Dharmadi Y, Patel K, Shapland E, Hollis D, Slaby T, Klinkner N, Dean J & Chandran SS (2014) High-throughput, cost-effective verification of structural DNA assembly. Nucleic Acids Res 42: e22. Dorfman KD, King SB, Olson DW, Thomas JD & Tree DR (2013) Beyond gel electrophoresis: microfluidic separations, fluorescence burst analysis, and DNA stretching. Chem Rev 113: 2584-2667. Du J, Yuan Y, Si T, Lian J & Zhao H (2012) Customized optimization of metabolic pathways by combinatorial transcriptional engineering. Nucleic Acids Res 40: e142. Ellis T, Adie T & Baldwin GS (2011) DNA assembly for synthetic biology: from parts to pathways and beyond. Integr Biol (Camb) 3: 109-118. Elowitz MB & Leibler S (2000) A synthetic oscillatory network of transcriptional regulators. Nature 403: 335-338. Engler C, Kandzia R & Marillonnet S (2008) A one pot, one step, precision cloning method with high throughput capability. PLoS One 3: e3647. Engler C, Gruetzner R, Kandzia R & Marillonnet S (2009) Golden gate shuffling: a one-pot DNA shuffling method based on type IIs restriction enzymes. PLoS One 4: e5553. Eriksen DT, Hsieh PC, Lynn P & Zhao H (2013) Directed evolution of a cellobiose utilization pathway in Saccharomyces cerevisiae by simultaneously engineering multiple proteins. Microb Cell Fact 12: 61. Farre G, Naqvi S, Sanahuja G, et al. (2012) Combinatorial genetic transformation of cereals and the creation of metabolic libraries for the carotenoid pathway. Methods Mol Biol 847: 419435. Fu J, Bian X, Hu S, et al. (2012) Full-length RecE enhances linear-linear homologous recombination and facilitates direct cloning for bioprospecting. Nat Biotechnol 30: 440-446. Gibson DG, Smith HO, Hutchison CA, Venter JC & Merryman C (2010) Chemical synthesis of the mouse mitochondrial genome. Nat Methods 7: 901-905. Gibson DG, Benders GA, Axelrod KC, Zaveri J, Algire MA, Moodie M, Montague MG, Venter JC, Smith HO & Hutchison CA, 3rd (2008) One-step assembly in yeast of 25 overlapping DNA fragments to form a complete synthetic Mycoplasma genitalium genome. Proc Natl Acad Sci USA 105: 20404-20409.

This article is protected by copyright. All rights reserved.

Accepted Article

Gibson DG, Benders GA, Andrews-Pfannkoch C, et al. (2008) Complete chemical synthesis, assembly, and cloning of a Mycoplasma genitalium genome. Science 319: 1215-1220. Gibson DG, Glass JI, Lartigue C, et al. (2010) Creation of a bacterial cell controlled by a chemically synthesized genome. Science 329: 52-56. Goldberg M (2013) BioFab: Applying Moore's law to DNA synthesis. Ind Biotechnol 9: 10-12. Grünberg R, Arndt K & Müller K (2009) Fusion Protein (Freiburg) Biobrick assembly standard. http://hdl.handle.net/1721.1/45140. Guye P, Li Y, Wroblewska L, Duportet X & Weiss R (2013) Rapid, modular and reliable construction of complex mammalian gene circuits. Nucleic Acids Res 41: e156. Hillson NJ, Rosengarten RD & Keasling JD (2012) j5 DNA assembly design automation software. ACS Synth Biol 1: 14-21. Horton RM, Hunt HD, Ho SN, Pullen JK & Pease LR (1989) Engineering hybrid genes without the use of restriction enzymes: gene splicing by overlap extension. Gene 77: 61-68. Itaya M, Fujita K, Kuroki A & Tsuge K (2008) Bottom-up genome assembly using the Bacillus subtilis genome vector. Nat Meth 5: 41-43. Kahl LJ, Endy D (2013) A survey of enabling technologies in synthetic biology. J Biol Eng 7: 13. Kim B, Du J, Eriksen DT & Zhao HM (2013) Combinatorial design of a highly efficient xyloseutilizing pathway in saccharomyces cerevisiae for the production of cellulosic biofuels. Appl Environ Microbiol 79: 931-941. Kim Y, Kweon J, Kim A, et al. (2013) A library of TAL effector nucleases spanning the human genome. Nat Biotechnol 31: 251-258. Kok Sd, Stanton LH, Slaby T, et al. (2014) Rapid and reliable DNA assembly via ligase cycling reaction. ACS Synth Biol 3: 97-106. Kong F, Yuan L, Zheng YF & Chen W (2012) Automatic liquid handling for life science: a critical review of the current state of the art. J Lab Autom 17: 169-185. Kosuri S, Church GM (2014) Large-scale de novo DNA synthesis: technologies and applications. Nat Meth 11: 499-507 Kuijpers NG, Solis-Escalante D, Bosman L, van den Broek M, Pronk JT, Daran JM & DaranLapujade P (2013) A versatile, efficient strategy for assembly of multi-fragment expression vectors in Saccharomyces cerevisiae using 60 bp synthetic recombination sequences. Microb Cell Fact 12: 47. Li MZ & Elledge SJ (2007) Harnessing homologous recombination in vitro to generate recombinant DNA via SLIC. Nat Meth 4: 251-256. Liang J, Chao R, Abil Z, Bao Z & Zhao H (2013) FairyTALE: a high-throughput TAL effector synthesis platform. ACS Synth Biol 3: 67-73. Linshiz G, Stawski N, Poust S, Bi C, Keasling JD & Hillson NJ (2013) PaR-PaR laboratory automation platform. ACS Synth Biol 2: 216-222. Luo Y, Huang H, Liang J, Wang M, Lu L, Shao Z, Cobb RE & Zhao H (2013) Activation and characterization of a cryptic polycyclic tetramate macrolactam biosynthetic gene cluster. Nat Commun 4: 2894. MacDonald JT, Barnes C, Kitney RI, Freemont PS & Stan GB (2011) Computational design approaches and tools for synthetic biology. Integr Biol (Camb) 3: 97-108. McLennan A (2012) Building with BioBricks: constructing a commons for synthetic biology research. Edward Elgar, Cheltenham.

This article is protected by copyright. All rights reserved.

Accepted Article

Mueller O, Hahnenberger K, Dittmann M, Yee H, Dubrow R, Nagle R & Ilsley D (2000) A microfluidic system for high-speed reproducible DNA sizing and quantitation. Electrophoresis 21: 128-134. Nour-Eldin HH, Geu-Flores F & Halkier BA (2010) USER cloning and USER fusion: the ideal cloning techniques for small and big laboratories. Methods Mol Biol 643: 185-200. Pachuk CJ, Samuel M, Zurawski JA, Snyder L, Phillips P & Satishchandran C (2000) Chain reaction cloning: a one-step method for directional ligation of multiple DNA fragments. Gene 243: 19-25. Partregistry (2014) Registry of Standard Biological Parts. http://www.webcitation.org/6O8Ha2b2B. Phillips I & Silver P (2006) A new biobrick assembly strategy designed for facile protein engineering. http://hdl.handle.net/1721.1/32535. Purnick PEM & Weiss R (2009) The second wave of synthetic biology: from modules to systems. Nat Rev Mol Cell Bio 10: 410-422. Quan J & Tian J (2009) Circular polymerase extension cloning of complex gene libraries and pathways. PLoS One 4: e6441. Salis HM, Mirsky EA & Voigt CA (2009) Automated design of synthetic ribosome binding sites to control protein expression. Nat Biotechnol 27: 946-950. Sarrion-Perdigones A, Falconi EE, Zandalinas SI, Juarez P, Fernandez-del-Carmen A, Granell A & Orzaez D (2011) GoldenBraid: an iterative cloning system for standardized assembly of reusable genetic modules. PLoS One 6: e21622. Shao Z & Zhao H (2009) DNA assembler, an in vivo genetic method for rapid construction of biochemical pathways. Nucleic Acids Res 37: e16. Shao Z, Rao G, Li C, Abil Z, Luo Y & Zhao H (2013) Refactoring the silent spectinabilin gene cluster using a plug-and-play scaffold. ACS Synth Biol 2: 662-669. Shao ZY, Zhao H & Zhao HM (2009) DNA assembler, an in vivo genetic method for rapid construction of biochemical pathways. Nucleic Acids Res 37, e16. Shendure J & Ji H (2008) Next-generation DNA sequencing. Nat. Biotechnol. 26: 1135-1145. Shetty RP, Endy D & Knight TF, Jr. (2008) Engineering bioBrick vectors from biobrick parts. J Biol Eng 2: 5. Smith C, Day PJ & Walker MR (1993) Generation of cohesive ends on PCR products by UDGmediated excision of dU, and application for cloning into restriction digest-linearized vectors. PCR Methods Appl 2: 328-332. Smolke CD (2009) Building outside of the box: iGEM and the BioBricks Foundation. Nat Biotechnol. 27: 1099-1102. Torella JP, Boehm CR, Lienert F, Chen JH, Way JC & Silver PA (2013) Rapid construction of insulated genetic circuits via synthetic sequence-guided isothermal assembly. Nucleic Acids Res 42: 681-689. Colloms SD, Merrick CA, Olorunniji FJ, Stark WM, Smith MCM, Osbourn A, Keasling KJ & Rosser SJ (2013) Rapid metabolic pathway assembly and modification using serine integrase site-specific recombination. Nucleic Acids Res 42: e23 Wall ME, Hlavacek WS & Savageau MA (2004) Design of gene circuits: Lessons from bacteria. Nat Rev Genet 5: 34-42. Wang RY, Shi ZY, Guo YY, Chen JC & Chen GQ (2013) DNA fragments assembly based on nicking enzyme system. PLoS One 8: e57943.

This article is protected by copyright. All rights reserved.

Accepted Article

Wang WD, Chen ZT, Kang BG & Li R (2008) Construction of an artificial intercellular communication network using the nitric oxide signaling elements in mammalian cells. Exp. Cell Res 314: 699-706. Weber E, Engler C, Gruetzner R, Werner S & Marillonnet S (2011) A modular cloning system for standardized assembly of multigene constructs. PLoS One 6: e16765. Whitman L, Core M, Ness J, Theotdorous E, Gustafsson C & Minshull J (2013) Advertorial: Rapid, scarless cloning of gene fragments using the Electra Vector System™. Genet Eng Biotechnol 33. Wingler LM & Cornish VW (2011) Reiterative recombination for the in vivo assembly of libraries of multigene pathways. Proc Natl Acad Sci USA 108: 15135-15140. Xu P, Vansiri A, Bhan N & Koffas MA (2012) ePathBrick: a synthetic biology platform for engineering metabolic pathways in E. coli. ACS Synth Biol 1: 256-266. Xu P, Gu Q, Wang W, Wong L, Bower AG, Collins CH & Koffas MA (2013) Modular optimization of multi-gene pathways for fatty acids production in E. coli. Nat Commun 4: 1409. Yonemura I, Nakada K, Sato A, Hayashi J, Fujita K, Kaneko S & Itaya M (2007) Direct cloning of full-length mouse mitochondrial DNA using a Bacillus subtilis genome vector. Gene 391: 171-177. Yuan Y & Zhao H (2013) Directed evolution of a highly efficient cellobiose utilizing pathway in an industrial Saccharomyces cerevisiae strain. Biotechnol Bioeng 110: 2874-2881. Yuan Y, Du J & Zhao H (2013) Customized optimization of metabolic pathways by combinatorial transcriptional engineering. Methods Mol Biol 985: 177-209. Zelcbuch L, Antonovsky N, Bar-Even A, et al. (2013) Spanning high-dimensional expression space using ribosome-binding site combinatorics. Nucleic Acids Res 41, e98. Zhang Y, Werling U & Edelmann W (2012) SLiCE: a novel bacterial cell extract-based DNA cloning method. Nucleic Acids Res 40: e55. Zhang Y, Muyrers JP, Testa G & Stewart AF (2000) DNA cloning by homologous recombination in Escherichia coli. Nat Biotechnol 18: 1314-1317. Zhu C, Naqvi S, Breitenbach J, Sandmann G, Christou P & Capell T (2008) Combinatorial genetic transformation generates a library of metabolic phenotypes for the carotenoid pathway in maize. Proc Natl Acad Sci USA 105: 18232-18237.

This article is protected by copyright. All rights reserved.

Accepted Article

Fig. 1. Key DNA assembly methods.

This article is protected by copyright. All rights reserved.

Accepted Article

Fig. 2. Improved in vivo sequence homology based DNA assembly methods in S. cerevisiae. (a) In Wingler and coworkers’ method (Wingler & Cornish, 2011), Type A and Type B are alternated in the series of donor plasmids harboring the fragments to be assembled. The homing endonuclease coded by the donor plasmid specifically cuts the integration site on the genome to promote integration of the next fragment. The markers are reused in turns to guarantee that the corresponding fragments are successfully appended. Mod represents modulo operation. “H. Arm” represents homologous arms. (b) In Kuijpers and coworkers’ method (Kuijpers, et al.,

2013), the yeast replicon and marker fragments on the vector backbone are separated to reduce the likelihood of generating false positives.

This article is protected by copyright. All rights reserved.

Accepted Article

Fig. 3. Schematic of three applications based on in vivo DNA assembler method. (a) Promoter libraries are generated by error-prone PCR and assembled in front of each gene in the pathway. (b) For each enzyme in the pathway, homologous genes from various microorganisms are cloned and flanked by the arms homologous to their neighboring promoters and terminators. All expression cassettes are then assembled using the DNA assembler method. (c) All enzymes in the pathway are mutated via error-prone PCR to generate mutant libraries and are flanked by the arms homologous to their neighboring promoters and terminators. These expression cassettes are then assembled using the DNA assembler method.

This article is protected by copyright. All rights reserved.

Recent advances in DNA assembly technologies.

DNA assembly is one of the most important foundational technologies for synthetic biology and metabolic engineering. Since the development of the rest...
381KB Sizes 2 Downloads 3 Views