de novo computational enzyme design.


Alexandre Zanghellini

Recent advances in systems and synthetic biology as well as metabolic engineering are poised to transform industrial biotechnology by allowing us to design cell factories for the sustainable production of valuable fuels and chemicals. To deliver on their promises, such cell factories, as much as their brick-and-mortar counterparts, will require appropriate catalysts, especially for classes of reactions that are not known to be catalyzed by enzymes in natural organisms. A recently developed methodology, de novo computational enzyme design, can be used to create enzymes catalyzing novel reactions. Here we review the different classes of chemical reactions for which active protein catalysts have been designed as well as the results of detailed biochemical and structural characterization studies. We also discuss how combining de novo computational enzyme design with more traditional protein engineering techniques can alleviate the shortcomings of state-of-the-art computational design techniques and create novel enzymes with catalytic proficiencies on par with natural enzymes.

Addresses
Arzeda Corp., 2722 Eastlake Avenue E., Suite 150, Seattle, WA 98102, United States

Corresponding author: Zanghellini, Alexandre ([email protected])

Current Opinion in Biotechnology 2014, 29:132–138
This review comes from a themed issue on Cell and pathway engineering
Edited by Tina Lütke-Eversloh and Keith EJ Tyo

0958-1669/$ – see front matter, © 2014 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.copbio.2014.03.002

Introduction

The field of cell and pathway engineering holds great promise for the production of renewable chemicals and biofuels, such as second-generation biofuels that do not compete with food supplies. Powered by metabolic engineering, spectacular feats of cell and pathway engineering have been achieved at industrial scale by companies such as Amyris [1], Gevo and Genomatica [2]. However, such applications have relied on pathways employing natural enzymes, sometimes from heterologous hosts and at most exhibiting activity against a new substrate [2]. Novel cell factories producing industrial chemicals and materials will require designing and engineering novel pathways producing chemicals that are ‘xenobiotics’ (such as butadiene). A variety of advanced computational tools have been developed for pathway prospection [3,4,5]. Complemented with our ability to design novel enzymes with new function, they open the way for de novo engineering of metabolic pathways and cellular function [6]. Traditional protein engineering techniques can be used to re-engineer existing enzymes for thermostability or substrate specificity switches but have not to date been used to design new enzymatic function de novo. Recently, computational protein design has been successfully applied to that endeavor. Here we review the most recent developments in the de novo computational design of new enzymes, with a special emphasis on what has been learned since the initial proofs of concept.

De novo design of new function

Fueled by simultaneous advances in molecular mechanics force fields and increases in available computing power, computational protein design (CPD) techniques have gained momentum since the first successful protein core redesigns in the mid-1990s [7]. Given the coordinates of a particular protein backbone, CPD techniques automate the task of redesigning protein sequences, altering their structure, and imparting new function to them. Contrary to traditional rational design, where the number of mutations is limited, CPD software is able to generate libraries of novel sequences in which a significant proportion (sometimes in excess of 50%) of the wild-type (WT) protein sequence has been altered while maintaining a stable and expressed protein. The application of CPD to the problem of enzyme design is however recent and can be divided into two subclasses: the de novo design of new function, and the redesign of existing enzymes. In the de novo problem, one needs to simultaneously introduce both a substrate binding pocket and the necessary catalytic machinery into protein scaffolds devoid of the desired catalytic activity. Typically, de novo enzyme design follows one or all of the steps depicted in Figure 1. The earliest attempts at de novo design made use of the program Dezymer, developed by Hellinga and Richards, to design a His3-Fe-O2 metal active site that was placed into thioredoxin in order to obtain a superoxide dismutase-like enzyme [8,9]. Shortly after, Bolon and Mayo introduced a single-histidine active site to confer ester hydrolysis activity on the same thioredoxin scaffold [10]. The catalytic activity of the designed enzyme was modest (kcat/kuncat = 180) when compared to 4-methyl imidazole alone, natural esterases or catalytic antibodies for this reaction, despite the fact that the ester substrate is highly activated and the activation energy for the reaction is low.
Figure 1. Flowchart representation of the steps for the computational design of a novel enzymatic activity. Computational steps are in light blue. Experimental steps are in light green. Optional transitions in the process are marked with a broken line, and optional steps are marked with a light gray box. Note that one of the advantages of computational enzyme methods is that a high-throughput assay is not strictly necessary for enzyme design. However, if protein engineering techniques are employed in addition to computational design, a medium to high-throughput screen or selection is a requirement for practical success.
[Flowchart box labels: Previous X-ray structures; First principles; Build active site model; Quantum mechanical calculations; Place active site in scaffold (“Matching”); Design active site pocket (“Design”); Obtain crystal structures; Fixed-backbone redesign; Flexible-backbone redesign; Filter/rank active sites; Assemble/order gene, clone, express, purify and assay; Computationally guided “smart” libraries; Develop HT screen/selection; Construct random libraries (random PCR/DNA shuffling); Obtain crystal structures (as needed); Screen/select for improved variants; De novo enzyme.]

In addition, the algorithms used did not provide a general solution for catalytic sites that consist of more than one residue, while most natural enzymes have complex active sites incorporating multiple functional residues. To alleviate these limitations, the most recent algorithm developments in de novo enzyme design have focused on the efficient placement of complex active sites with several catalytic side chains in large libraries of protein scaffolds. Lassila et al. [11] developed a two-tiered approach for transition-state placement, whereas Zanghellini and Jiang et al. have developed a much more efficient strategy based on 6D hashing called RosettaMatch [12,13]. In this approach, all catalytic side chains, including the transition state(s), are placed independently at every putative pocket position to ensure that any additional catalytic side chain increases the search time linearly and not exponentially. The location and orientation of the transition state and the corresponding side chain are recorded in a hash table. At the end, quick look-ups in the table are performed to find transition state positions that overlap for each catalytic side chain in order to construct the novel active site. The initial de novo enzyme design successes (see below) have triggered additional work in fast algorithm development. Newer active site placement algorithms have been developed, such as ScaffoldSelection [14], PRODA-MATCH [15], Zhu and Lai’s vector matching technique [16] and OptGraft [17]. Although most studies benchmark and compare the accuracy and computational efficiency of their algorithms to RosettaMatch, the most recent algorithms have not yet been used for actual de novo design and experimental validation (Table 1).
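The hash-based placement idea described above can be illustrated with a minimal Python sketch. It is not RosettaMatch code: the function names (`bin_pose`, `build_table`, `find_matches`), the bin widths, and the toy placements are all assumptions made for illustration, and each placement is assumed to have already been reduced to a scaffold position plus the six-dimensional pose (three translations, three angles) of the transition state that it implies. Each catalytic constraint fills its own table independently, so an extra constraint costs one more pass rather than multiplying the search; candidate active sites are the bins populated by every constraint.

```python
import math
from collections import defaultdict
from itertools import product


def bin_pose(pose, xyz_width=1.0, angle_width=15.0):
    """Discretize a 6D transition-state pose (x, y, z, three angles in degrees).

    The bin widths are illustrative assumptions, not published parameters.
    """
    x, y, z, a, b, c = pose
    return (
        math.floor(x / xyz_width),
        math.floor(y / xyz_width),
        math.floor(z / xyz_width),
        math.floor(a / angle_width),
        math.floor(b / angle_width),
        math.floor(c / angle_width),
    )


def build_table(placements):
    """Hash every (scaffold_position, ts_pose) placement of ONE catalytic constraint."""
    table = defaultdict(list)
    for position, pose in placements:
        table[bin_pose(pose)].append(position)
    return table


def find_matches(tables):
    """Return (bin, positions) for bins that every constraint can reach.

    A match uses one scaffold position per constraint, all of them distinct.
    """
    shared_bins = set(tables[0]).intersection(*tables[1:])
    matches = []
    for key in sorted(shared_bins):
        for combo in product(*(table[key] for table in tables)):
            if len(set(combo)) == len(combo):
                matches.append((key, combo))
    return matches


# Toy data (hypothetical): two catalytic constraints, e.g. a lysine and an
# H-bond donor, each sampled at a few scaffold positions.
lys_placements = [
    (12, (1.2, 3.4, 5.1, 10.0, 95.0, 200.0)),
    (47, (8.0, 2.0, 1.0, 30.0, 60.0, 90.0)),
]
hbond_placements = [
    (88, (1.7, 3.9, 5.6, 14.0, 91.0, 205.0)),
    (12, (1.1, 3.2, 5.3, 12.0, 100.0, 198.0)),  # same position as Lys 12: rejected
    (12, (8.3, 2.4, 1.1, 31.0, 61.0, 92.0)),
]

tables = [build_table(lys_placements), build_table(hbond_placements)]
matches = find_matches(tables)
for key, positions in matches:
    print(f"bin {key}: scaffold positions {positions}")
```

A plain grid like this misses poses that fall just on either side of a bin boundary; a production implementation would need some way of handling neighbouring bins, which this sketch deliberately leaves out.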

Table 1. Summary of the successful de novo designs of enzyme activity and (when available) improvements obtained after optimization by computational design, screening of mutant libraries, or a combination of both.

| De novo enzyme activity | Active site introduced | Template scaffold | kcat/KM (M⁻¹ s⁻¹) | Improved kcat/KM (M⁻¹ s⁻¹) | Reference |
|---|---|---|---|---|---|
| Superoxide dismutase-like enzyme | His3-Fe-O2 | Thioredoxin | 6.4 | N/A | Benson et al. [40] |
| Esterase of p-nitrophenylacetate | His | Thioredoxin | 2.7 | N/A | Bolon et al. [10] |
| Esterase of p-nitrophenylacetate | His | Thioredoxin | 0.3 | N/A | Suarez et al. [41] |
| O2-dependent phenyl oxidase | (HisGluGlu-Fe)2 binding site | De novo 4-helix bundle protein | 92,400 | N/A | Kaplan et al. [42] |
| (retro-)Aldolase | Lys and H-bond acceptor/donor | Multiple templates (>13) | 0.02–0.74 | 55; 34 | Jiang and Althoff et al. [18]; Althoff, Wang and Jiang et al. [20]; Wang, Althoff and Jiang et al. [22] |
| Esterase of phenyl p-nitrophenyl acetate | | | 405 | | Richter and Blomberg et al. [28] |
| Kemp eliminase | Glu, Trp, Ser or His-Asp, Phe, Ser | Multiple templates; most active design in indole-3-glycerolphosphate synthase | 6–163 | 6 × 10⁵ | Röthlisberger, Khersonsky, Wollacott et al. [19]; Khersonsky et al. [35]; Khersonsky et al. [36]; Khersonsky et al. [37] |
| Diels-Alderase (a) | Gln, Tyr | Diisopropylfluorophosphatase | 0.06 (s⁻¹ M⁻¹ M⁻¹) | 87.3 (s⁻¹ M⁻¹ M⁻¹) | Siegel and Zanghellini et al. [26]; Eiben and Siegel et al. [27] |
| Kemp eliminase | Glu (native) | Calmodulin | 5.8 | N/A | Korendovych et al. [23] |
| Kemp eliminase | Asp, Trp, Ser | Xylanase | 425 | 2 × 10⁵ | Privett et al. [25]; Blomberg et al. [38] |

(a) A bimolecular reaction.

De novo designed aldolases

Advanced active site placement and design techniques have led to a wealth of successful de novo enzyme designs for chemical reactions for which no natural counterparts exist. In 2008, Baker and coworkers reported the first computationally designed aldolases [18] and Kemp eliminases [19]. In the case of the retroaldolase, 72 designs were made by placing a variety of active sites, predicted by QM modeling to catalyze the retroaldol cleavage of the substrate 4-hydroxy-4-(6-methoxy-2-naphthyl)-2-butanone to acetone and naphthyl-aldehyde, into a library of robust protein scaffolds. The genes for all designs were assembled and cloned into pET expression vectors, and the designed proteins were expressed, purified and assayed using a fluorescence assay. Thirty-two designs showed detectable activity, whereas the original scaffold failed to catalyze the reaction. Interestingly, whereas natural Type I aldolase enzymes occur only in TIM-barrel scaffolds, with the Schiff-base lysine found at two conserved positions in the β-barrel, RosettaMatch yielded 11 distinct positions for Schiff-base lysines, and the most active design was in a Jelly Roll fold, not a TIM-barrel fold.

In a subsequent report, Althoff, Wang, Jiang et al. [20] further designed 33 novel retro-aldolases in 13 protein scaffolds, significantly increasing their success rate based on what was learned in the original study. This result demonstrates that a good understanding of the key catalytic groups necessary for catalytic rate enhancement, as well as a trial-and-error process, are key to increasing the hit rate of enzyme design. Although the novel enzymes show robust activity that is abrogated upon mutation of the key catalytic residues, the observed catalytic activities are modest, with kcat/KM lower than 10 M⁻¹ s⁻¹, orders of magnitude from naturally evolved aldolase catalytic efficiencies. Althoff, Wang, Jiang et al. then screened a small library of point mutations at each of the positions in the active site (except the introduced catalytic lysine residue) for 4 of the designs and recombined the best variants. This led to an 88-fold improvement in kcat/KM. The mutations found tend to be larger hydrophobic residues that pack better with the substrate in the active site. This sheds some light on the limits of CPD (at least as implemented in the design software RosettaDesign [21]), as these favorable residues were not picked up at the design stage, most probably because of the side-chain rotamer approximation and the hard Lennard-Jones potential used. Subsequently, the authors further constructed random libraries using error-prone PCR and DNA shuffling. After 10 rounds of screening, an additional 100-fold improvement in kcat/KM was obtained. Although such experimental optimization resulted in an overall 1000-fold improvement in catalytic rate for the computationally designed aldolases, these enzymes are still significantly slower than any natural enzyme and even the best of catalytic antibodies.

In an attempt to investigate the mechanistic limitations of the designed enzymes, Wang and Althoff et al. carried out a detailed crystallographic study of the designs [22]. High-resolution apo crystal structures were previously obtained for the most active designs and showed very close agreement with the computationally designed models. Co-crystal structures with two small molecules, a mechanism-based inhibitor and a substrate analog, were obtained for the improved variant RA34.6 and compared to the original apo RA34 structure, revealing that the key Schiff-base lysine in the active site (covalently bound in the diketone mechanism-based inhibitor structures) was better packed in RA34.6. Perhaps more interestingly, most of the improving mutations resulted in tighter hydrophobic packing against the large naphthyl group of the substrate. The naphthyl group of the co-crystallized substrate analog binds in a similar but rotated position relative to the design model, suggesting that atomic-level control of substrate binding in the catalytically productive orientation is still a challenge for computational design methodologies.

Initial and recent Kemp eliminase de novo designs

The Kemp eliminases represent a second class of successful de novo computational designs. The first proof-of-concept was reported by Baker and coworkers [19]. The same methodology as for the retroaldolase (RosettaMatch and RosettaDesign) was used to produce 59 designs in 17 different scaffolds using two different catalytic groups for proton abstraction from the substrate, 5-nitrobenzisoxazole: either an Asp or Glu residue, or a His-Asp dyad, as the catalytic base. Eight designs showed detectable activity that was, as expected, abrogated or severely reduced upon knock-out mutations of the key catalytic groups. As in the case of the retroaldolase designs, the catalytic efficiencies of the active sequences, before any further optimization, were modest, with the highest kcat/KM a little under 10² M⁻¹ s⁻¹.

The question of whether the modest initial activities of both these retroaldolase and Kemp eliminase enzymes were due to the particular methodology used has been answered through two recent reports by Mayo and coworkers and DeGrado and coworkers. Korendovych et al. computationally designed a switchable Kemp eliminase activity into calmodulin [23]: upon calcium binding, calmodulin opens up a hydrophobic pocket that is large enough to accommodate 5-nitrobenzisoxazole, the substrate used for all published Kemp elimination designs. Exploiting this native allosteric motion, a catalytically active glutamate residue was introduced at the bottom of the hydrophobic pocket, which resulted in a modest but robust catalytic activity (5.8 M⁻¹ s⁻¹). Experiments confirmed that the introduced glutamate is only accessible to the substrate in the open conformation, and the enzyme can therefore be turned on or off by the addition or depletion of calcium.

Using a similar approach to Baker and coworkers but a different computational design program (an efficient implementation of the FASTER algorithm [24] using the ORBIT force field), Mayo and coworkers were also able to design active Kemp eliminases for the same 5-nitrobenzisoxazole substrate, with catalytic rates marginally higher than those obtained by the Baker team (4.5 × 10² M⁻¹ s⁻¹ for design HG-2) [25]. It is to be noted here that the initial designs obtained with the active site placement from Lassila et al. [11] in the same scaffold as one of Baker’s original Kemp eliminases did not demonstrate any catalytic activity and needed to be further redesigned to yield the first hits. Understanding how to select active versus inactive designs computationally (and therefore increase the hit rate) is probably one of the most important problems posed by the initial proofs of concept, and is beginning to be answered through hybrid protein design/molecular dynamics simulation approaches (see below).

Recent successes for other reactions: Diels-Alderases, Morita–Baylis–Hillman and cysteine esterases

Success was also obtained for other unrelated reactions. Siegel and Zanghellini et al. [26] reported the first computational design of a bimolecular enzyme: a Diels-Alderase that catalyzes the cycloaddition of N,N-dimethylacrylamide (the dienophile) to 4-carboxybenzyl trans-1,3-butadiene-1-carbamate (the diene). The specific constraints of designing for a bimolecular pericyclic reaction (simultaneously binding two substrates in an exact relative conformation), as well as the large and relatively hydrophobic nature of the complex/product (requiring the carving of a larger, more hydrophobic cleft in a protein scaffold), might explain the relatively lower hit rate (2 active designs out of 84) obtained by the authors, compared to the Kemp and retro-aldol catalyzing enzymes designed using the same methodology. As in other cases, improved catalytic parameters for the best enzyme were obtained with limited site-directed mutagenesis in the active site. More recently, further improvements on the best design from the initial work by Siegel and Zanghellini et al. were obtained using a mixed library screening/crowdsourcing design strategy [27], although it is too early to assess how generally applicable this methodology may be.

Finally, two recent reports describe the use of the RosettaMatch methodology for designing a de novo esterase [28] and a Morita–Baylis–Hillman catalyzing protein [29]. Although highly efficient esterases abound in nature, the designed esterase uses a slightly different active site than most natural esterases, with a Cys-His dyad acting as a nucleophile and an oxyanion hole including backbone amides. The designed esterase, which exhibits a rather limited catalytic efficiency, has not yet been optimized using classical protein engineering techniques. Interestingly, however, Richter et al. designed their active site so that it can accommodate a tyrosyl ester that, upon enzymatic cleavage, would afford the amino acid tyrosine, complementing a tyrosine-deficient strain and allowing for an easy activity-based selection system. The result of such a large-scale selection experiment, coupled with crystallographic characterization of the improved variants, may highlight the evolutionary path to highly active novel esterases from the starting design and provide invaluable details when compared to highly efficient naturally occurring esterases.
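The hit rates quoted for the three design campaigns (32 of 72 retro-aldolases, 8 of 59 Kemp eliminases, 2 of 84 Diels-Alderases) can be put side by side with a few lines of Python. The counts are taken from the text; the dictionary layout and the `hit_rate` helper are illustrative choices only, and "active" simply means detectable activity in the assay used by each study, so the three numbers are not strictly comparable.

```python
# (active designs, total designs tested), as quoted in the text
campaigns = {
    "retro-aldolase [18]": (32, 72),
    "Kemp eliminase [19]": (8, 59),
    "Diels-Alderase [26]": (2, 84),
}


def hit_rate(active, designed):
    """Fraction of experimentally tested designs with detectable activity."""
    if designed <= 0:
        raise ValueError("number of tested designs must be positive")
    return active / designed


rates = {name: hit_rate(active, designed)
         for name, (active, designed) in campaigns.items()}

# List the campaigns from highest to lowest hit rate.
for name, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {rate:.1%}")
```

On these counts the hit rate falls from roughly 44% for the retro-aldolases to about 14% for the Kemp eliminases and a little over 2% for the Diels-Alderase.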



Understanding the impact of protein dynamics on the de novo design of enzymes

For efficiency reasons, computational design methodologies do not normally take into account protein dynamics at the backbone or side-chain level, nor the fact that natural evolution performs multi-objective optimization: for example, sequences are selected to fold into the desired fold and not into alternative states, while being solubly expressed and catalytically active. The initial successes reviewed above prompted several studies on how CPD and molecular dynamics (MD) simulation can be combined to help detect and overcome some of the mechanistic limitations of the designs. In a seminal study [30], Kiss et al. developed a new methodology based on short MD simulations of the designed proteins and were able to recapitulate which of the Kemp designs by Baker and coworkers were active. As was already discovered by analyzing the apo and holo crystal structures of the computationally designed retroaldolases, one of the key outcomes of the work of Kiss et al. is the importance of atomic-level accuracy of active-site pre-organization. For designs to be active, one needs to simultaneously solve three challenges. First, one needs to ensure proper active site desolvation, that is, to make sure that no unnecessary water networks are present in the active site. Second, and somewhat related, is the need for precise positioning of the catalytic residues, in particular making sure that the catalytic residues cannot access alternative conformations during the course of the simulations. Finally, it is crucial that first-shell residues delineate a tight binding pocket that positions the substrate in the catalytically active binding mode at the expense of other non-productive binding modes. Similar conclusions were derived from a separate independent study on some of the retroaldolase designs [31]. It is noteworthy that the methodology developed by Kiss et al. was successfully applied to several blind tests and in particular was able to provide directions as to which mutations would fix the original non-working HG-1 design and improve upon the active HG-2 to yield the most active sequence, HG-3, as reported by Privett et al. [25].

Integrating CPD and MD simulations is therefore likely to dramatically improve de novo computational enzyme design success rates, but poses its own engineering problems, as it is not computationally feasible to perform nanosecond MD simulations on all the designs before selection for experimental validation. Currently MD can only be used as a ‘post-design’ filter for a limited number of designs (tens). One avenue of research is to use multistate design strategies that have been developed for other CPD uses [32,33]. When using, for instance, ensembles of scaffolds obtained from constrained MD simulations on the initial scaffolds, it may be possible to design sequences and active site pre-organizations that are robust to scaffold backbone relaxation dynamics.

Synergetic applications of computational protein design and protein engineering

As noted earlier in this review, in all published studies the initial catalytic rate of the computationally designed enzymes has been modest. Comparisons can be made to catalytic antibodies: both de novo computational enzyme design and catalytic antibodies elicited using transition state analogs as haptens rely on the same assumption that one needs to bind tightly and stabilize the highest-energy transition state along the reaction path. The fact that both de novo computational design and catalytic antibodies achieve rates of the same order of magnitude suggests that much more needs to be taken into consideration to reach levels of activity comparable to natural enzymes. Even if the idea that most natural enzymes exhibit extremely high catalytic proficiencies may need to be taken with a grain of salt [34], efficient natural enzymes such as triosephosphate isomerase (TIM) routinely achieve kcat/KM between 10⁸ and 10⁹ M⁻¹ s⁻¹, and other examples of diffusion-limited enzymes abound in biochemistry textbooks. The analogy with catalytic antibodies however ends here: by construction, de novo computationally designed enzymes are found in arbitrary protein scaffolds, including highly stable and expressed scaffolds that can easily be subjected to protein engineering techniques such as directed evolution, whereas antibodies are notoriously hard to optimize with protein engineering.

Following the initial proofs of concept of de novo retroaldolases and Kemp eliminases, there has been a significant amount of work dedicated to applying protein engineering techniques to improve the novel enzymes designed computationally. As mentioned above, a combination of saturation mutagenesis and random PCR libraries was able to improve 1000-fold upon the original retroaldolase designs [20,22]. Larger random-PCR libraries of mutants were screened for three of the eight active Kemp eliminases designed by Baker and coworkers: KE07 [35], KE70 [36] and KE59 [37]. KE07 catalytic parameters were optimized 2600-fold from the original to reach a kcat/KM of 2.6 × 10³ M⁻¹ s⁻¹ with 7 rounds of random mutagenesis and screening. KE70 was optimized through a combination of computational redesign and 9 rounds of random mutagenesis and screening, increasing a mere 400-fold over the original computationally designed enzyme but reaching a remarkable kcat/KM in excess of 5 × 10⁴ M⁻¹ s⁻¹. Finally, 16 rounds of random mutagenesis yielded a KE59 variant with a kcat/KM of 0.6 × 10⁶ M⁻¹ s⁻¹ on a related but even less activated substrate than that used for the design and experimental characterization of KE59. Perhaps more noteworthy is the fact that Tawfik and coworkers first had to increase the thermodynamic stability of the initial KE59 scaffold by spiking consensus mutations into the random libraries in order to be able to accumulate functionally beneficial mutations during the course of evolution. This could serve as a basis for developing systematic protocols for improving computationally designed enzymes. Interestingly, Hilvert and coworkers [38] obtained nearly identical catalytic parameters (kcat/KM = 0.2 × 10⁶ M⁻¹ s⁻¹) after 17 rounds of optimization (random PCR and DNA shuffling) of the Kemp eliminase designed by Privett et al., which was found in the same scaffold as KE59 but used a different active site placement [25].

Taken together, these results offer very exciting perspectives for enzyme design at both the practical and theoretical levels. First and foremost, de novo computational design combined with a moderate amount of traditional protein engineering can deliver entirely new enzymes for reactions not catalyzed in Nature, with catalytic parameters of the same order as those of natural enzymes, and certainly sufficient for most cell and pathway design applications. Second, crystallographic and kinetic studies of the evolved enzymes [11,22,35,38] highlight mechanisms by which computationally designed enzymes may be improved; they will therefore most likely extend our understanding of enzyme catalysis considerably and consequently improve the design process itself. Finally, it is apparent that one cannot predict how easy the enzyme optimization task will be, nor which scaffold and initial enzyme will fare better in the long run. Therefore, from a practical and industrial point of view, the future of enzyme design most likely resides in a ‘shotgun’ approach in which as many starting points as possible are designed and systematically optimized in parallel. While such an approach is proving sufficient to deliver novel enzymes, a recent comparison study by Kipnis and Baker hypothesizes that it is also most likely necessary [39]: the screening of random or semi-random libraries is not an adequate tool to obtain new activity in the first place.
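To put the fold improvements quoted above on a free-energy scale, one can use the standard transition-state-theory relation between a ratio of rate constants and a difference in apparent activation free energy, ΔΔG‡ = RT ln(fold). The short Python sketch below applies it to the figures given in the text. The temperature of 25 °C and the helper name `ddg_from_fold` are assumptions made for this illustration, and reading a change in kcat/KM as an apparent transition-state stabilization is the usual textbook simplification, not an analysis taken from the cited studies.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_from_fold(fold, temperature_k=298.15):
    """Apparent activation free-energy gain (kcal/mol) for a fold rate increase.

    Uses ddG = R * T * ln(fold); 298.15 K (25 C) is an assumed temperature.
    """
    if fold <= 0:
        raise ValueError("fold improvement must be positive")
    return R_KCAL * temperature_k * math.log(fold)


# Fold improvements in kcat/KM as quoted in the text.
fold_improvements = {
    "retro-aldolases [20,22]": 1000,
    "KE07 [35]": 2600,
    "KE70 [36]": 400,
}

for name, fold in fold_improvements.items():
    print(f"{name}: {fold}-fold, about {ddg_from_fold(fold):.1f} kcal/mol")
```

On this scale a 1000-fold improvement corresponds to roughly 4 kcal/mol, which gives a sense of how small the energetic corrections found by directed evolution are compared with typical force-field uncertainties.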

Conclusion

Computational de novo design of enzymatic function has now been accomplished by multiple academic groups for mechanistically diverse chemical reactions. These successes have in turn spurred a growing body of both basic and applied work aimed at understanding the limitations of our knowledge of enzyme catalysis and at improving upon the enzymes obtained by computational design. In particular, it is rather striking that the combination of computational design and protein engineering techniques can now yield enzymes that can conceivably be used to enable new metabolic pathways and the production of industrial chemicals. Initial proof-of-concept work carried out at Arzeda (Zanghellini et al., unpublished; Otte et al., manuscript in preparation) shows that it is possible to combine computationally designed enzymes into novel pathways to enable the production of new chemicals. However, the long-term success of the field of de novo enzyme and metabolic pathway design will depend on the ability of the community to incorporate what is learned from the multiple mechanistic and structural characterization studies into the design methodologies, in order to increase the hit rate and catalytic rate of the designs.

Acknowledgements

Alexandre Zanghellini would like to thank Rudesh Toofanny for helpful discussions and comments while preparing the manuscript. Alexandre Zanghellini and Arzeda Corporation received funding from NSF through the Small Business Innovation Research (SBIR) grant program.

References and recommended reading

Papers of particular interest, published within the period of review, have been highlighted as:
• of special interest
•• of outstanding interest

1. Ro D-K, Paradise EM, Ouellet M, Fisher KJ, Newman KL, Ndungu JM, Ho KA, Eachus RA, Ham TS, Kirby J et al.: Production of the antimalarial drug precursor artemisinic acid in engineered yeast. Nature 2006.

2. Yim H, Haselbeck R, Niu W, Pujol-Baxley C, Burgard A, Boldt J, Khandurina J, Trawick JD, Osterhout RE, Stephen R et al.: Metabolic engineering of Escherichia coli for direct production of 1,4-butanediol. Nat Chem Biol 2011, 7:445-452.

3. Hatzimanikatis V, Li C, Ionita JA, Henry CS, Jankowski MD, Broadbelt LJ: Exploring the diversity of complex metabolic networks. Bioinformatics 2005, 21:1603-1609.

4. Pharkya P, Burgard AP, Maranas CD: OptStrain: a computational framework for redesign of microbial production systems. Genome Res 2004, 14:2367-2376.

5. Carbonell P, Planson A-G, Fichera D, Faulon J-L: A retrosynthetic biology approach to metabolic pathway design for therapeutic production. BMC Syst Biol 2011, 5:122.

6. Prather KLJ, Martin CH: De novo biosynthetic pathways: rational design of microbial chemical factories. Curr Opin Biotechnol 2008, 19:468-474.

7. Dahiyat BI, Mayo SL: De novo protein design: fully automated sequence selection. Science 1997, 278:82-87.

8. Hellinga HW, Richards FM: Construction of new ligand binding sites in proteins of known structure. I. Computer-aided modeling of sites with pre-defined geometry. J Mol Biol 1991, 222:763-785.

9. Hellinga HW, Caradonna JP, Richards FM: Construction of new ligand binding sites in proteins of known structure. II. Grafting of a buried transition metal binding site into Escherichia coli thioredoxin. J Mol Biol 1991, 222:787-803.

10. Bolon DN, Mayo SL: Enzyme-like proteins by computational design. Proc Natl Acad Sci U S A 2001, 98:14274-14279. 11. Lassila JK, Privett HK, Allen BD, Mayo SL: Combinatorial methods for small-molecule placement in computational enzyme design. Proc Natl Acad Sci U S A 2006, 103:1671016715. 12. Zanghellini A, Jiang L, Wollacott AM, Cheng G, Meiler J, Althoff EA, Ro¨thlisberger D, Baker D: New algorithms and an in silico benchmark for computational enzyme design. Protein Sci Publ Protein Soc 2006, 15:2785-2794. 13. Richter F, Leaver-Fay A, Khare SD, Bjelic S, Baker D: De novo enzyme design using Rosetta3. PLoS ONE 2011, 6:e19230. 14. Malisi C, Kohlbacher O, Ho¨cker B: Automated scaffold selection for enzyme design. Proteins 2009, 77:74-83. 15. Lei Y, Luo W, Zhu Y: A matching algorithm for catalytic residue site selection in computational enzyme design. Protein Sci 2011, 20:1566-1575. 16. Liu S, Liu S, Zhu X, Liang H, Cao A, Chang Z, Lai L: Nonnatural protein–protein interaction-pair design by key residues grafting. Proc Natl Acad Sci U S A 2007, 104:5330-5335. 17. Fazelinia H, Cirino PC, Maranas CD: OptGraft: a computational procedure for transferring a binding site onto an existing protein scaffold. Protein Sci Publ Protein Soc 2009, 18:180-195. Current Opinion in Biotechnology 2014, 29:132–138


18. Jiang L, Althoff EA, Clemente FR, Doyle L, Röthlisberger D, Zanghellini A, Gallaher JL, Betker JL, Tanaka F, Barbas CF 3rd et al.: De novo computational design of retro-aldol enzymes. Science 2008, 319:1387-1391.
19. Röthlisberger D, Khersonsky O, Wollacott AM, Jiang L, DeChancie J, Betker J, Gallaher JL, Althoff EA, Zanghellini A, Dym O et al.: Kemp elimination catalysts by computational enzyme design. Nature 2008, 453:190-195.
20. Althoff EA, Wang L, Jiang L, Giger L, Lassila JK, Wang Z, Smith M, Hari S, Kast P, Herschlag D et al.: Robust design and optimization of retroaldol enzymes. Protein Sci Publ Protein Soc 2012, 21:717-726.
21. Leaver-Fay A, Tyka M, Lewis SM, Lange OF, Thompson J, Jacak R, Kaufman K, Renfrew PD, Smith CA, Sheffler W et al.: ROSETTA3: an object-oriented software suite for the simulation and design of macromolecules. Methods Enzymol 2011, 487:545-574.
22. Wang L, Althoff EA, Bolduc J, Jiang L, Moody J, Lassila JK, Giger L, Hilvert D, Stoddard B, Baker D: Structural analyses of covalent enzyme–substrate analog complexes reveal strengths and limitations of de novo enzyme design. J Mol Biol 2012, 415:615-625.
23. Korendovych IV, Kulp DW, Wu Y, Cheng H, Roder H, DeGrado WF: Design of a switchable eliminase. Proc Natl Acad Sci U S A 2011, 108:6823-6827.
24. Allen BD, Mayo SL: Dramatic performance enhancements for the FASTER optimization algorithm. J Comput Chem 2006, 27:1071-1075.
25. Privett HK, Kiss G, Lee TM, Blomberg R, Chica RA, Thomas LM, Hilvert D, Houk KN, Mayo SL: Iterative approach to computational enzyme design. Proc Natl Acad Sci U S A 2012, 109:3790-3795.
The authors use a combination of computational design, molecular dynamics simulations and crystallographic validation to recover a nonworking design and improve its activity. Although Kemp eliminases were obtained previously by computational design, the approach here is informative as to potential problems and limitations of a particular design strategy.
26. Siegel JB, Zanghellini A, Lovick HM, Kiss G, Lambert AR, St Clair JL, Gallaher JL, Hilvert D, Gelb MH, Stoddard BL et al.: Computational design of an enzyme catalyst for a stereoselective bimolecular Diels-Alder reaction. Science 2010, 329:309-313.
27. Eiben CB, Siegel JB, Bale JB, Cooper S, Khatib F, Shen BW, Players F, Stoddard BL, Popovic Z, Baker D: Increased Diels-Alderase activity through backbone remodeling guided by Foldit players. Nat Biotechnol 2012, 30:190-192.
28. Richter F, Blomberg R, Khare SD, Kiss G, Kuzin AP, Smith AJT, Gallaher J, Pianowski Z, Helgeson RC, Grjasnow A et al.: Computational design of catalytic dyads and oxyanion holes for ester hydrolysis. J Am Chem Soc 2012, 134:16197-16206.
29. Bjelic S, Nivon LG, Celebi-Olcum N, Kiss G, Rosewall CF, Lovick HM, Ingalls EL, Gallaher JL, Seetharaman J, Lew S et al.: Computational design of enone-binding proteins with catalytic activity for the Morita–Baylis–Hillman reaction. ACS Chem Biol 2013, 8:749-757.
30. Kiss G, Röthlisberger D, Baker D, Houk KN: Evaluation and ranking of enzyme designs. Protein Sci Publ Protein Soc 2010, 19:1760-1773.


31. Ruscio JZ, Kohn JE, Ball KA, Head-Gordon T: The influence of protein dynamics on the success of computational enzyme design. J Am Chem Soc 2009, 131:14111-14115.
32. Allen BD, Mayo SL: An efficient algorithm for multistate protein design based on FASTER. J Comput Chem 2010, 31:904-916.
33. Leaver-Fay A, Jacak R, Stranges PB, Kuhlman B: A generic program for multistate protein design. PLoS ONE 2011, 6:e20937.
34. Bar-Even A, Noor E, Savir Y, Liebermeister W, Davidi D, Tawfik DS, Milo R: The moderately efficient enzyme: evolutionary and physicochemical trends shaping enzyme parameters. Biochemistry (Mosc) 2011, 50:4402-4410.
35. Khersonsky O, Röthlisberger D, Dym O, Albeck S, Jackson CJ, Baker D, Tawfik DS: Evolutionary optimization of computationally designed enzymes: Kemp eliminases of the KE07 series. J Mol Biol 2010, 396:1025-1042.
36. Khersonsky O, Röthlisberger D, Wollacott AM, Murphy P, Dym O, Albeck S, Kiss G, Houk KN, Baker D, Tawfik DS: Optimization of the in-silico-designed Kemp eliminase KE70 by computational design and directed evolution. J Mol Biol 2011, 407:391-412.
37. Khersonsky O, Kiss G, Röthlisberger D, Dym O, Albeck S, Houk KN, Baker D, Tawfik DS: Bridging the gaps in design methodologies by evolutionary optimization of the stability and proficiency of designed Kemp eliminase KE59. Proc Natl Acad Sci U S A 2012, 109:10358-10363.
This study uses 16 rounds of error-prone PCR library screening and spiked consensus mutations to improve one of the computationally designed Kemp eliminases. The highest level of activity for any computationally designed enzyme is achieved (kcat/KM > 10^5 M^-1 s^-1) on a less activated substrate, demonstrating the power of de novo design and protein engineering. X-ray crystallography also sheds light on the molecular mechanisms behind the improvements.
38. Blomberg R, Kries H, Pinkas DM, Mittl PRE, Grütter MG, Privett HK, Mayo SL, Hilvert D: Precision is essential for efficient catalysis in an evolved Kemp eliminase. Nature 2013, 503:418-421.
This study demonstrates the improvement of a de novo Kemp eliminase with library screening (error-prone PCR and gene shuffling). X-ray crystallography demonstrates the need for precise catalytic residue placement and for selecting the productive substrate binding mode over nonproductive competing ones.
39. Kipnis Y, Baker D: Comparison of designed and randomly generated catalysts for simple chemical reactions. Protein Sci 2012, 21:1388-1395.
The authors perform a simple comparison experiment, building semi-random libraries of mutants approximating the amino-acid distribution of working Kemp and retro-aldol designs in one of the scaffolds in which both Kemp and retro-aldol activity was computationally designed previously. The results show that whereas some activity can be obtained from the semi-random libraries, the hit rate and catalytic activity obtained are significantly lower than those of computational design, quantifying the need for computational protein design techniques.
40. Benson DE, Haddy AE, Hellinga HW: Converting a maltose receptor into a nascent binuclear copper oxygenase by computational design. Biochemistry (Mosc) 2002, 41:3262-3269.
41. Suarez M, Tortosa P, Garcia-Mira MM, Rodríguez-Larrea D, Godoy-Ruiz R, Ibarra-Molero B, Sanchez-Ruiz JM, Jaramillo A: Using multi-objective computational design to extend protein promiscuity. Biophys Chem 2010, 147:13-19.
42. Kaplan J, DeGrado WF: De novo design of catalytic proteins. Proc Natl Acad Sci U S A 2004, 101:11566-11570.
