REVIEW

Dynamics and Constraints of Enzyme Evolution MIRIAM KALTENBACH AND NOBUHIKO TOKURIKI* Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada

ABSTRACT

J. Exp. Zool. (Mol. Dev. Evol.) 322B:468–487, 2014

The wealth of distinct enzymatic functions found in nature is impressive and the on‐going evolutionary divergence of enzymatic functions continues to generate new and efficient catalysts, which can be seen through the recent emergence of enzymes able to degrade xenobiotics. However, recreating such processes in the laboratory has been met with only moderate success. What are the factors that lead to suboptimal research outputs? In this review, we discuss constraints on enzyme evolution, which can restrict evolutionary trajectories and lead to evolutionary dead‐ends. We highlight recent studies that have used experimental evolution to mimic different aspects of enzymatic adaptation under simple, controlled settings to shed light on evolutionary dynamics and constraints. A better understanding of these constraints will lead to the development of more efficient strategies for directed evolution and enzyme engineering. J. Exp. Zool. (Mol. Dev. Evol.) 322B:468–487, 2014. © 2014 Wiley Periodicals, Inc. How to cite this article: Kaltenbach M, Tokuriki N. 2014. Dynamics and constraints of enzyme evolution. J. Exp. Zool. (Mol. Dev. Evol.) 322B:468–487.

The great variety and complexity of enzymatic functions found in nature will never cease to be astounding. Even within a single enzyme superfamily (a collection of enzymes that have the same fold and active site features and are thus likely to share a common ancestor), the number of distinct chemical reactions carried out and substrates accepted is impressive (Armstrong, 2000; Seibert and Raushel, 2005; Glasner et al., 2006; Bebrone, 2007; Furnham et al., 2012; Gerlt et al., 2012). For example, the >25,000 members of the amidohydrolase superfamily, which share the common (β/α)8 barrel fold and a mono‐ or binuclear metal in the active site, are reported to catalyze >40 unique reactions (Seibert and Raushel, 2005; Gerlt et al., 2011). The functional expansion in this and other superfamilies is still ongoing; many enzymes have recently evolved for the degradation of xenobiotic compounds (pesticides, explosives) (Ramos et al., 2005; Afriat et al., 2006; Copley, 2009; Wackett, 2009) and antibiotics (Bebrone, 2007; Lozovsky et al., 2009). The remarkable functional and sequence divergence that is found in nature has led to a consensus in the protein science community that enzymes are highly evolvable molecules, which can tolerate drastic sequence changes and rapidly adapt to new functions. However, in the last decade there has been a growing realization that enzyme evolution is not limitless, but instead highly restricted. There have been many advances in enzyme engineering and design, but enzyme engineering efforts, and in

particular directed evolution, are still challenging and often lead to evolutionary dead‐ends. Unfortunately, negative results generally remain unpublished; however, many, if not most, directed evolution experiments come to an early halt after only a minor improvement or no improvement at all (Dalby, 2011; Gumulya and Reetz, 2011). Only a handful of works successfully generated enzymes with catalytic efficiencies and specificities comparable to those found in natural enzymes (Fasan et al., 2008; Afriat‐Jurnou et al., 2012; Tokuriki et al., 2012; Meier et al., 2013). How does the seemingly limitless diversity in nature evolve? Why is nature far more successful at creating new and efficient enzymes? In this regard, many efforts have been devoted to unraveling the dynamics of enzyme evolution to identify and better understand the factors that restrict and burden evolution, and to ultimately find a remedy for them. In particular, experimental evolution has been extensively employed to

 Correspondence to: Nobuhiko Tokuriki, Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada. E‐mail: [email protected] Received 20 August 2013; Accepted 6 January 2014 DOI: 10.1002/jez.b.22562 Published online 13 February 2014 in Wiley Online Library (wileyonlinelibrary.com).


examine various aspects of molecular evolution in a controlled setting (all parameters are tunable, including mutation rate and type, selection pressure, and the environment) (Dean and Thornton, 2007; Peisajovich and Tawfik, 2007; Romero and Arnold, 2009). While experimental evolution is artificial and can never perfectly mirror natural evolution, it allows the detailed study of stepwise evolutionary trajectories in response to a simple selection pressure. Thus, experimental evolution has successfully filled a missing link in the study of evolution and provided new insights into evolutionary dynamics, as well as novel methodologies to engineer and evolve proteins efficiently in the laboratory (Harms and Thornton, 2013). In this review, we describe lessons learned from experimental evolution and summarize recent advances in our understanding of evolutionary dynamics, with particular focus on evolutionary constraints. We begin by briefly introducing a general model of enzyme evolution and use the concept of a fitness landscape to depict transitions between enzymatic functions. We discuss a selection of experimental works that identified factors restricting enzyme evolution and highlight examples where such constraints were overcome. These studies combine to advance our understanding of enzyme evolutionary dynamics, which will aid our ability to successfully engineer novel enzymes in the laboratory.

General Model of Enzyme Evolution

How does a new enzymatic function evolve? In 1970, Maynard Smith published a seminal paper that stated, “If evolution by natural selection is to occur, functional proteins must form a continuous network which can be traversed by unit mutational steps without passing through non‐functional intermediates” (Smith, '70). In other words, the functional expansion of an enzyme superfamily must occur gradually and smoothly by accumulating adaptive mutations one step at a time and forming a continuous network in sequence space. Therefore, in order to form a continuous network, a foundation for divergence of the new function must pre‐exist in the superfamily. The existence of secondary, low‐level enzymatic functions and their role in enzyme evolution was first recognized by Jensen ('76). Jensen stated that in ancestral, primitive enzymes, “Broad substrate specificity provides a kind of biochemical leakiness […], which is exploitable for gain of function” (Jensen, '76). Indeed, recent studies of the biochemical and biophysical properties of resurrected plausible ancestral genes have supplied experimental support for this theory (Bridgham et al., 2009; Afriat‐Jurnou et al., 2012; Huang et al., 2012; Voordeckers et al., 2012; Bar‐Rogovsky et al., 2013). Over the last decade, it has been shown that many modern enzymes are also promiscuous (Matsumura and Ellington, 2001; Babtie et al., 2010; Khersonsky and Tawfik, 2010). Most, if not all, enzymes can recognize alternative compounds that are related to the native substrate (i.e. differ only in a substituent) and subject

them to the same chemical transformation (substrate ambiguity or substrate promiscuity). Furthermore, many extant enzymes catalyze distinct chemical reactions (involving different chemical bonds and/or different transition state geometries) other than the one for which they have evolved (catalytic promiscuity). In some cases, enzyme promiscuity is evidence of the evolutionary connection between members of a superfamily, for example, if the native activity of one enzyme is carried out promiscuously in related homologous enzymes (Roodveldt et al., 2005; Afriat et al., 2006; Vick and Gerlt, 2007; van Loo et al., 2010). In addition, many enzymes can catalyze additional reactions not related to their ancestry or homologues and this serendipitous promiscuity provides further potential starting points for novel functional adaptation. Soo et al. (2011) investigated the capacity of Escherichia coli to respond to a large variety of antibiotics and other toxins (237 different compounds). They found that overexpression of different E. coli proteins conferred higher resistance to >35% of the toxins. Interestingly, many toxins were inactivated by not just one, but multiple proteins. Some of these proteins were identified as promiscuous enzymes, whereas others were efflux pumps, regulatory elements, or proteins involved in stress response. In combination with point mutations, overexpression of promiscuous enzymes can also compensate for knockouts of essential metabolic enzymes in amino acid and nucleotide biosynthesis (McLoughlin and Copley, 2008; Patrick and Matsumura, 2008), either by directly replacing the missing enzyme or by rerouting the entire biosynthesis into an alternative pathway (Kim et al., 2010). Enzyme promiscuity therefore provides a reservoir of candidates for evolutionary tinkering (Jacob, '77). At a certain point of evolution, if a promiscuous function becomes advantageous, it can then be further increased.
The improvement of the new function may occur gradually and smoothly via the step‐wise accumulation of mutations (substitutions, insertions, deletions, elongations, and truncations). However, in some cases, recombination or gene fusion can suddenly alter sequence and function. Eventually, gene duplication occurs, and each copy may diverge, specialize, and become a new member of its enzyme superfamily (Kondrashov and Koonin, 2004; Soskine and Tawfik, 2010; Conant and Wolfe, 2008). During the adaptive process, as well as through genetic drift, the newly evolved enzyme may acquire novel promiscuous activities previously non‐existent in its functional reservoir, and in this way, provide new starting points for further divergence. The repeating cycles of promiscuous activity recruitment and adaptation combine to generate a “continuous network” of sequence and functional divergence in enzyme evolution (Smith, '70).

Fitness Landscapes to Depict Adaptive Processes

The concept of a fitness landscape (adaptive landscape) is useful to visualize adaptive evolutionary processes (Wright, '32; Carneiro and Hartl, 2010). In sequence space, each protein sequence represents one point, and sequences that differ by a single substitution are located next to each other. The sequence space is multidimensional and accordingly complex; for instance, a protein containing 100 amino acid residues occupies 100 dimensions. However, sequence space is frequently compressed into one or two dimensions for simplification and ease of visualization. Strictly speaking, “fitness” refers to the reproduction or survival rate of an organism in a certain environment, but any phenotypic property of the protein such as enzymatic activity, stability or cell viability related to protein function (e.g., antibiotic resistance) can be used to construct a fitness landscape. The landscape might be deserted over large areas, as the majority of sequences do not code for a folded, active enzyme. Sequence clusters coding for active enzymes exhibit higher fitness and are depicted as a peak. There can be multiple distinct peaks in sequence space corresponding to enzymes that show high levels of the same activity but differ significantly in sequence and even structural fold. This can be a consequence of extensive divergence from a common ancestor or, alternatively, a consequence of convergent evolution, that is, evolution toward the same function starting from unrelated ancestral enzymes. Continuous fitness changes by mutations can be understood as moving across the fitness landscape, and adaptive evolution can be described as climbing a peak on this landscape (Fig. 1). Multiple catalytic activities can be depicted as several overlaid fitness landscapes in sequence space, each of which possesses

Figure 1. Fitness landscapes. Adaptive evolution can be seen as climbing a fitness peak in sequence space by the step‐wise accumulation of mutations. Such a peak might have a smooth, “Mt. Fuji”‐type shape (A) or it may be rugged and contain multiple peaks (B; photograph courtesy of Florian Baier). In the latter case, interactions between mutations severely restrict evolutionary trajectories.


peaks of higher fitness and valleys. In certain regions, peaks from different fitness landscapes may overlap, which represents catalytic promiscuity or multi‐functionality (Fig. 2). Evolutionarily related catalytic activities, that is, reactions catalyzed by an enzyme superfamily, generate a mountain range of overlapping peaks, in which adaptation can occur through a “continuous network” of functional sequences. While depictions of the fitness landscape are inevitably oversimplified and not able to capture the actual situation in detail, they are useful to illustrate overall evolutionary processes intuitively. We will use the fitness landscape concept throughout this review to explain constraints in enzyme evolution.

Constraints in Enzyme Evolution

Constraints in enzyme evolution can be understood by two major features of the fitness landscape. The first constraint relates to the shape of the fitness landscape, which is associated with the mutational tolerance of a protein as well as the number of evolutionary trajectories available (Fig. 1). If the fitness landscape is smooth and single‐peaked (a “Mt. Fuji”‐type shape), evolution is not constrained and all possible routes from a given coordinate to the peak are viable (Fig. 1A). However, if the fitness landscape is rugged and riddled with valleys that separate secondary, suboptimal peaks, evolution will be strongly constrained as valleys, or regions of lower fitness between peaks, cannot be crossed by adaptive evolution (Fig. 1B). For example, a single point mutation can completely abolish enzymatic function by knocking out a critical catalytic residue or the ability to fold correctly. This observation, and the fact that many directed evolution experiments experience evolutionary dead‐ends, suggest that fitness landscapes are rugged to some extent.
We discuss recent studies that have characterized the ruggedness of local landscapes in natural and experimental evolution, and the molecular underpinnings leading to this restriction of evolutionary trajectories in “Mutational Epistasis Causes Ruggedness of the Fitness Landscape”. The second constraint is associated with the overlap and connectivity between fitness landscapes belonging to different catalytic activities (Fig. 2). Generally, evolution cannot bridge fitness peaks that do not overlap (Fig. 2A) and the extent of the peak overlap dictates the degree of trade‐off between different catalytic activities (Fig. 2B–D). We highlight recent works that have advanced the understanding of these constraints and how they shape evolutionary dynamics, and ways to overcome them in “Connectivity Between Functions on a Multidimensional Fitness Landscape”. It should be noted that this review exclusively focuses on the dynamics of enzyme evolution (and more generally protein evolution), and strategies to enhance our ability to engineer and design novel enzymes, although very similar phenomena are observed at higher levels of biological systems (reviewed by Elena and Lenski, 2003; Phillips, 2008; Kogenaru et al., 2009; Koonin


Figure 2. Connectivity between fitness landscapes. Generally, fitness peaks of different activities do not overlap (i.e., an enzyme with Activity A does not have Activity B) and thus, no evolutionary starting point is given (A). When peaks do overlap (i.e., the enzyme with Activity A catalyzes Activity B promiscuously), however, adaptive evolution may occur through a “continuous network” of functional sequences (B). Depending on the shape of the peaks and the extent of their overlap, trade‐offs between native (Activity A) and promiscuous (Activity B) function exist. Trade‐offs may be strong (B, dashed line in D) or weak (C, continuous line in D). Weak trade‐offs channel evolution through a generalist regime where the enzyme catalyzes both reactions with high efficiency.

and Wolf, 2010; de Visser et al., 2011; Kawecki et al., 2012; Loewe, 2012).

MUTATIONAL EPISTASIS CAUSES RUGGEDNESS OF THE FITNESS LANDSCAPE

The interaction between individual mutations means their effect is dependent on the genetic background in which they occur, causing ruggedness of the fitness landscape. The term “epistasis” was originally introduced to define interactions between genes, but has also been applied to the interaction between mutations in a single gene. Epistasis can be formally described as the deviation between the combined effect of a mutation A and a mutation B in the double mutant AB from the value obtained by simply summing up the effect of the single mutants. Epistasis is classified based on the degree and sign of this deviation (see also Box I). No epistasis (additive): AB = A + B, the effect of the double mutation is equal to the sum of the individual mutations (Fig. 3A). Magnitude epistasis: AB > A + B, the effect of the double mutation is larger than the sum of the individual mutations (synergistic epistasis) or AB < A + B, the effect of the double mutation is smaller than the sum of the individual mutations (antagonistic epistasis, Fig. 3B). Sign epistasis: A > 0 and/or B > 0, AB < 0, the effect of the double mutation is inverted compared to one of the individual mutations (Fig. 3C) or both (reciprocal sign epistasis, Fig. 3D). In adaptive evolution, if the effect of mutations is either additive or displays mild magnitude epistasis (synergistic or antagonistic epistasis), all combinations of advantageous mutations lead uphill to a fitness peak regardless of their order of occurrence, although some trajectories may be more likely than others. On the other hand, strong magnitude epistasis or sign epistasis are likely to

create a rugged landscape where evolutionary trajectories are restricted because some mutations are only favorable in the presence or absence of others. On such a landscape, the order and combination of mutations becomes crucial (Weinreich et al., 2005; Poelwijk et al., 2007; Dawid et al., 2010; de Visser et al., 2011; Kvitek and Sherlock, 2011). In this way, evolution becomes contingent on historical events; depending on which mutation initially occurs, a different pathway and eventually, a different outcome, may result (Weinreich et al., 2005; Weinreich et al., 2006; Gumulya and Reetz, 2011; Papp et al., 2011; Salverda et al., 2011; Dickinson et al., 2013). Early studies on epistasis demonstrated that mutational effects at the protein level are largely additive or slightly antagonistic (Kuliopulos et al., '90; Wells, '90; Dill, '97; Qasim et al., 2003). Therefore, the prevailing view of the fitness landscape was a relatively smooth, single‐peaked, “Mt. Fuji‐type” mountain, although some ruggedness was also recognized (Fig. 1A) (Kauffman and Levin, '87; Aita and Husimi, '98; Tracewell and Arnold, 2009; Carneiro and Hartl, 2010). However, in the last decade, various experiments have amended the picture of empirical fitness landscapes to become much more rugged, and the role of epistasis in enzyme evolution has been debated extensively. A milestone toward this change was a study published by Weinreich et al. (2006). The authors constructed all combinations of five naturally occurring mutations, which jointly improve resistance of TEM‐1 β‐lactamase against the third‐generation cephalosporin β‐lactam cefotaxime by a factor of 100,000 (Weinreich et al., 2006). Surprisingly, only 18 out of the 120 possible trajectories were able to reach the peak in antibiotic resistance in a gradual up‐hill manner. Subsequent to this research, further experimental evidence for the severe restriction of enzyme evolution has been obtained, as discussed below.
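The trajectory count in the Weinreich et al. (2006) study can be made concrete with a small sketch. Five mutations can be acquired in 5! = 120 different orders, and a trajectory counts as accessible only if every single step increases fitness. The Python code below is an illustration under stated assumptions, not the authors' analysis: the function name `count_accessible_paths` and both toy landscapes are hypothetical, the fitness values are invented, and neutral steps are treated as inaccessible.

```python
from itertools import permutations, product


def count_accessible_paths(fitness, n_sites):
    """Count mutation orders in which every single step increases fitness.

    fitness: dict mapping a genotype (tuple of 0/1, one entry per site)
             to a fitness value. All 2**n_sites genotypes must be present.
    """
    accessible = 0
    for order in permutations(range(n_sites)):
        genotype = [0] * n_sites
        current = fitness[tuple(genotype)]
        for site in order:
            genotype[site] = 1
            following = fitness[tuple(genotype)]
            if following <= current:   # neutral or downhill step: path rejected
                break
            current = following
        else:
            accessible += 1            # loop finished without a rejected step
    return accessible


N_SITES = 5
GENOTYPES = list(product((0, 1), repeat=N_SITES))

# Hypothetical landscape 1: every mutation adds one unit of fitness.
additive = {g: sum(g) for g in GENOTYPES}

# Hypothetical landscape 2: mutation 0 is deleterious unless mutation 1
# is already present (sign epistasis between sites 0 and 1).
sign_epistatic = {
    g: sum(g) - (2 if g[0] == 1 and g[1] == 0 else 0) for g in GENOTYPES
}

print(count_accessible_paths(additive, N_SITES))
print(count_accessible_paths(sign_epistatic, N_SITES))
```

On the additive toy landscape all 120 orders are accessible, whereas a single sign‐epistatic pair already excludes half of them (60 remain), because every order in which mutation 0 precedes mutation 1 contains a downhill step.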


Box I Epistasis terminology

Multiple, sometimes overlapping definitions of epistasis exist in the literature, which emphasize different aspects of the phenomenon. To avoid confusion, we summarize the definitions used in this review below.

Basic definition of epistasis:

No epistasis (additive): AB = A + B, the effect of the double mutation is equal to the sum of the individual mutations (Fig. 3A).

Magnitude epistasis can be divided into synergistic and antagonistic epistasis. Synergistic epistasis: AB > A + B, the effect of the double mutation is larger than the sum of the individual mutations. Antagonistic epistasis: AB < A + B, the effect of the double mutation is smaller than the sum of the individual mutations (Fig. 3B).

Sign epistasis: A > 0 and/or B > 0, AB < 0, the effect of the double mutation is inverted compared to one of the individual mutations (Fig. 3C). A special case is reciprocal sign epistasis, where both mutations invert their effect (Fig. 3D).

Additional definitions include positive and negative epistasis. Positive epistasis generally refers to the second mutation having no or a positive effect relative to the single mutant, and negative epistasis means the double mutant exhibits decreased fitness relative to the single mutant. Both phenomena can be caused by magnitude or sign epistasis according to the above definition. To avoid confusion, we do not use these terms in our review.

Other definitions:

Hierarchical epistasis describes deviations from additivity not in the function or stability of the protein itself, but on higher levels of organization. In the cell, the protein is embedded in a complex network, and mutations with additive effects on its function/stability can translate into non‐linear effects on soluble protein expression, flux through the immediate metabolic pathway, output of the whole metabolic network, and ultimately organismal fitness (see Fig. 4).

Intrinsic epistasis: In this review, we introduce the term intrinsic epistasis to emphasize cases where epistasis occurs directly on the lowest level of the fitness hierarchy. Mutations can have non‐additive effects on protein function or stability by direct (formation of H‐bonds, salt bridges, non‐polar contacts) or indirect (long‐range interaction networks, dynamics) interaction in the protein structure.

Stability‐mediated epistasis refers to epistasis on the level of function in the cell (A) caused by changes in protein stability, in particular to the loss of protein function caused by destabilization of the protein. It is a commonly observed type of hierarchical epistasis because protein stability constitutes a universal constraint in evolution.

Diminishing returns epistasis is observed during the adaptive process of enzyme evolution and engineering; initial mutations are often able to dramatically enhance the function under selection, while later mutations lead to only minor improvements. It is caused by antagonistic epistasis between adaptive mutations.

Permissive mutation: A permissive mutation allows another mutation to fixate because of epistasis between the two. The interaction may be synergistic (enhancement of a positive effect), antagonistic (attenuation of a negative effect) or sign epistatic (the negative effect becomes positive). Often, permissive mutations are stabilizing mutations (see stability‐mediated epistasis).

Compensatory mutation: A compensatory mutation has a positive effect only after another mutation has occurred. Like permissive mutations, compensatory mutations are often stabilizing and play a role in restoring (compensating) protein stability after the accumulation of function‐altering, destabilizing mutations (see stability‐mediated epistasis).

Restrictive mutation: A restrictive mutation prevents other mutations from fixating once it occurs. This can be caused by antagonistic (reduction of the positive effect), synergistic (enhancement of a slightly negative effect), or sign epistasis (the positive effect becomes negative).
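The basic definitions in Box I can be turned into a small decision procedure. The Python sketch below is one possible reading, not a standard implementation: the function `classify_epistasis`, its argument names, and the tolerance are hypothetical. It assumes fitness is measured on a scale where independent effects add (for example, a log scale), uses the genotype labels of Fig. 3 (ab = wild type, Ab and aB = single mutants, AB = double mutant), detects sign epistasis as a mutation whose effect changes sign between the two backgrounds, and applies the synergistic/antagonistic labels as defined for beneficial mutations.

```python
def classify_epistasis(w_ab, w_Ab, w_aB, w_AB, tol=1e-9):
    """Classify the interaction between two mutations from four fitness values.

    w_ab: wild type, w_Ab and w_aB: single mutants, w_AB: double mutant.
    Values are assumed to lie on an additive scale (e.g., log fitness).
    """
    # Effect of each mutation in the absence and presence of the other one.
    a_alone, a_with_b = w_Ab - w_ab, w_AB - w_aB
    b_alone, b_with_a = w_aB - w_ab, w_AB - w_Ab

    a_flips = a_alone * a_with_b < 0   # mutation A changes sign with background
    b_flips = b_alone * b_with_a < 0   # mutation B changes sign with background
    if a_flips and b_flips:
        return "reciprocal sign"
    if a_flips or b_flips:
        return "sign"

    # Magnitude epistasis: deviation of the double mutant from additivity.
    deviation = (w_AB - w_ab) - (a_alone + b_alone)
    if abs(deviation) <= tol:
        return "none (additive)"
    return "synergistic" if deviation > 0 else "antagonistic"


# Invented fitness values chosen to mirror the four panels of Fig. 3.
print(classify_epistasis(0.0, 1.0, 1.0, 2.0))   # additive case
print(classify_epistasis(0.0, 1.0, 1.0, 1.5))   # double mutant below the sum
print(classify_epistasis(0.0, 1.0, 2.0, 1.5))   # AB above Ab but below aB
print(classify_epistasis(0.0, 1.0, 1.0, 0.5))   # AB below both single mutants
```

The sign check runs before the magnitude check because a double mutant can deviate from additivity in either direction while still lying above both single mutants; only a reversal of a mutation's effect makes the order of mutations matter for accessibility.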

Hierarchical Epistasis

One of the major reasons for epistasis is that the intrinsic phenotypic properties of an enzyme, such as its function (kcat/KM) and stability (ΔG), are usually not proportional to organismal fitness, or other phenotypic properties. An enzyme is one component of a complex cellular network. It is part of a larger collection of enzymes that catalyze sequential reactions and together constitute a metabolic pathway. Often, enzymes and proteins may interact directly or indirectly and function cooperatively. The environment can also significantly affect the interactions between networks and pathways. Therefore, the contribution of a single enzyme to overall fitness is complicated, and even if the effect of mutations on the basic physicochemical

properties is additive, they can show epistasis on a higher level of organization (hierarchical epistasis, Fig. 4A and Box I) (Sanjuan and Elena, 2006; Martin et al., 2007; Aylor and Zeng, 2008; Perfeito et al., 2011; Walkiewicz et al., 2012). Indeed, studies of whole viruses, bacteria etc. have revealed more epistasis among mutations than studies looking at the protein level (Bonhoeffer et al., 2004; Michalakis and Roze, 2004; Segre et al., 2005; Martin et al., 2007; Kryazhimskiy et al., 2011; Breen et al., 2012; Kachanovsky et al., 2012; Kouyos et al., 2012; Flynn et al., 2013). However even in a simplified model of enzyme evolution, in which enzymatic activity in the cell is directly related to fitness (as is the case for essential metabolic enzymes or antibiotic resistance markers), the relationship between activity and fitness is


inherently non‐linear (see next section), and hierarchical epistasis is pronounced. In the following, we will describe several factors that cause hierarchical epistasis (Fig. 3).

Non‐Linear Relationship between Enzymatic Activity and Flux. The relationship between enzyme activity and the fitness of an organism (W) can be described in a simplified saturation model, in which the intracellular level of enzymatic activity (A) is directly related to flux (F) or the overall rate of the enzyme‐catalyzed reaction in the cell. Fmax is defined as the maximum attainable flux and KA is the amount of enzyme at which the flux reaches its half‐maximum value (Fig. 4B) (Hartl et al., '85).

Figure 3. Basic definition of epistasis. Mutations do not interact, and their combined effect in the double mutant AB equals the sum of their individual effects in the single mutants Ab and aB (A). Antagonistic epistasis (a type of magnitude epistasis): The effect is strongest in the double mutant AB, but less than expected from the single mutants Ab and aB (B). Sign epistasis: The effect of the double mutant AB is increased relative to Ab, which in turn is larger than ab (C). However, AB is reduced relative to the aB mutant. Reciprocal sign epistasis: AB is reduced relative to both Ab and aB (D).

$$W \propto F = \frac{A \cdot F_{\max}}{K_A + A}$$
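A short numerical sketch can illustrate how this saturation model turns additive changes in activity into antagonistic epistasis in flux. The code reads the model as F = A·Fmax/(KA + A); the function names, the parameter values (Fmax = KA = 1), and the two activity regimes are arbitrary assumptions chosen for illustration only.

```python
def flux(activity, f_max=1.0, k_a=1.0):
    """Saturation model: flux as a function of intracellular enzyme activity."""
    return activity * f_max / (k_a + activity)


def relative_flux_epistasis(a0, delta, f_max=1.0, k_a=1.0):
    """Deviation from additivity in flux for two mutations that each add
    `delta` to the activity A, as a fraction of the additive expectation."""
    base = flux(a0, f_max, k_a)
    single = flux(a0 + delta, f_max, k_a) - base       # effect of one mutation
    double = flux(a0 + 2 * delta, f_max, k_a) - base   # effect of both together
    expected = 2 * single                              # additive expectation
    return (double - expected) / expected


low = relative_flux_epistasis(a0=0.001, delta=0.001)   # A much lower than K_A
high = relative_flux_epistasis(a0=1.0, delta=1.0)      # A comparable to K_A
print(f"A << K_A: {low:.4f}")
print(f"A ~ K_A:  {high:.4f}")
```

Both values are negative because the curve is concave, so the double mutant always gains less flux than the sum of the single mutants. The deviation is negligible while the enzyme is a bottleneck and becomes substantial as the activity approaches KA.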

Depending on the level of A, mutations with additive effects on A can result in antagonistic epistasis in F (see also “Diminishing Returns Epistasis”) (Hartl et al., '85; Dykhuizen et al., '87; Dean et al., '88). When A is much lower than KA, the enzyme constitutes a bottleneck for metabolic flux, and changes in F are directly proportional to A. However, when the enzyme is sufficiently active (i.e., immediately transforms the substrate provided by the enzyme preceding it in the metabolic pathway), increasing its activity does not lead to further increases in flux and therefore fitness. Lunzer

Figure 4. Hierarchical organization of phenotypic properties leads to epistasis. (A) “Fitness” can be seen on different levels of cellular organization, ranging from the basic biophysical properties of the enzyme (ΔG, kcat/KM) to whole organism fitness. (B) The non‐linear relationship between flux (F) and enzymatic activity in the cell A can be described by a saturation model. (C) The relationship between the number of mutations and ΔG. (D) The sigmoidal relationship between the number of mutations (and therefore ΔG) and [E] according to the thermodynamic stability model. (E) The relationship between the number of mutations and kcat/KM. (F) The non‐linear relationship between the number of mutations and the level of enzymatic activity in the cell (A) results from the hierarchical organization of these properties.


et al. (2005) systematically characterized the fitness landscape of six amino acids controlling coenzyme (NADH and NADPH) use in isopropylmalate dehydrogenase (IMDH). They showed that each amino acid additively contributed to changes in catalytic activity with both NADH and NADPH. However, the resulting overall fitness (the relative growth rate of E. coli expressing one of the variants compared to cells containing wild‐type IMDH) responded in a non‐linear fashion and could be described by a hyperbolic function, demonstrating hierarchical epistasis (Lunzer et al., 2005).

Stability‐Mediated Epistasis. A mutation may influence both enzyme function and enzyme stability, which leads to hierarchical epistasis even on this basic level of organization, that is, within the protein itself. The level of enzymatic activity in the cell (A) is proportional to the intrinsic catalytic parameters (kcat/KM or kcat) and the amount of active enzyme ([E]):

$$A = (k_{cat}/K_M) \times [E]$$

The enzyme concentration [E] in turn is dictated by protein stability. In a simplified model, thermodynamic stability (ΔG) correlates to functional and soluble expression in the cell. Here, we will focus on the relationship between [E] and ΔG, and assume that there is a certain stability threshold beyond which protein solubility in the cell decreases (ΔGt):

$$[E] \propto \frac{1}{1 + e^{(\Delta G - \Delta G_t)/RT}}$$

Note that this model is oversimplified, and kinetic stability, which is related to the protein's folding pathway, folding speed, degradation rate, and aggregation tendency, is more relevant for soluble expression in the cell (Sanchez‐Ruiz, 2010; Socha and Tokuriki, 2013). However, thermodynamic stability has been shown to correlate with the soluble expression of some proteins, and the sigmoidal relationship between ΔG and [E] has been validated (Bershtein et al., 2006; Mayer et al., 2007; Papp et al., 2011). Because of this sigmoidal relationship, a certain change in ΔG may have varying effects on functional expression [E].
Depending on the background, as long as the protein's stability remains above the threshold (ΔG < ΔGt) the protein will continue to fold and function, that is, [E] is robust to mutations in this regime. However, once the threshold is crossed, further destabilizing mutations will lead to a drastic decrease in [E] (synergistic epistasis, Fig. 4C) (Bloom et al., 2005; Bershtein et al., 2006; Tokuriki and Tawfik, 2009b). We can simulate a very simple case of enzyme evolution making the following assumptions (Fig. 3): (i) the enzyme possesses a 5 kcal/mol margin from the threshold ($\Delta G = \Delta G_t - 5$), (ii) each mutation increases function two‐fold ($(k_{cat}/K_M)_{i+1} = 2\,(k_{cat}/K_M)_i$), and destabilizes the protein by +1 kcal/mol ($\Delta G_{i+1} = \Delta G_i + 1$), and (iii) the effect of mutations is additive for both stability and function. As mutations accumulate, ΔG increases linearly (Fig. 4C), but [E] decreases in a sigmoidal fashion; while the first few mutations do

not affect [E], once ΔG crosses the threshold (ΔGt) subsequent mutations dramatically reduce functional expression (Fig. 4D). Similarly, enzyme function (kcat/KM or kcat) improves exponentially as mutations accumulate (Fig. 4E). However, enzymatic activity in the cell (A) exhibits a convex shape: Initially, A increases, reflecting the improvement in kcat/KM. This improvement slows down as ΔG approaches the threshold (ΔGt), and then turns into a decrease for ΔG > ΔGt, because the decrease in [E] is more significant than the increase in kcat/KM (Fig. 4F). Therefore, the response of the activity level (A) to mutations is highly dependent on the stability of the starting point, and ranges from no epistasis to antagonistic and ultimately sign epistasis (Fig. 4F). This demonstrates how the non‐linear relationship between different properties can cause epistasis on higher levels of fitness ([E] and A), despite the fact that the underlying variables (ΔG, kcat/KM) respond additively to mutations. Note that this very simplified model does not contain various parameters such as gene expression and other environmental changes. However, these factors generally result in even more pronounced epistasis. Stability‐mediated epistasis seems to be one of the major constraints in natural and laboratory enzyme evolution (see also Box I). Most proteins possess only marginal stability (5 to 15 kcal/mol), therefore even a single point mutation can undermine stability and lead to loss of [E] and A (DePristo et al., 2005). Indeed, a systematic study of mutational effects showed that most mutations, particularly those that confer higher catalytic activity, are destabilizing (Tokuriki et al., 2007; Tokuriki et al., 2008). If the destabilizing effect of function‐altering mutations is not compensated, further mutations will be purged and the enzyme will experience an evolutionary dead‐end.
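The simple simulation described above can be reproduced in a few lines. The Python sketch below follows the three stated assumptions (a 5 kcal/mol margin, a two‐fold gain in kcat/KM and +1 kcal/mol of destabilization per mutation, additive effects). Everything else is an assumption made for illustration: the function name `simulate`, the value RT ≈ 0.593 kcal/mol (roughly 298 K), the starting kcat/KM of 1 in arbitrary units, and the normalization of [E] to a maximum of 1.

```python
import math

RT = 0.593  # kcal/mol at about 298 K (assumed; the temperature is not specified)


def simulate(n_mutations=10, margin=5.0, ddg=1.0, fold_gain=2.0, rt=RT):
    """Return one row (n, dG - dGt, [E], kcat/KM, A) per number of mutations."""
    rows = []
    for n in range(n_mutations + 1):
        dg_rel = -margin + ddg * n                       # dG relative to dGt, linear in n
        enzyme = 1.0 / (1.0 + math.exp(dg_rel / rt))     # sigmoidal soluble fraction [E]
        kcat_km = fold_gain ** n                         # exponential gain in function
        activity = kcat_km * enzyme                      # A = (kcat/KM) x [E]
        rows.append((n, dg_rel, enzyme, kcat_km, activity))
    return rows


rows = simulate()
for n, dg_rel, enzyme, kcat_km, activity in rows:
    print(f"{n:2d}  dG-dGt={dg_rel:+5.1f}  [E]={enzyme:6.4f}  "
          f"kcat/KM={kcat_km:7.1f}  A={activity:7.3f}")

# Number of mutations at which the activity in the cell is highest.
best = max(rows, key=lambda row: row[4])[0]
print("A peaks after", best, "mutations")
```

Under these assumptions A rises for the first five mutations, peaks when ΔG reaches ΔGt, and declines afterwards, even though ΔG and kcat/KM each respond additively (on a linear and a logarithmic scale, respectively). Changing `margin` shifts the peak, which is one way to see why the stability of the starting point matters.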
Therefore, stabilizing mutations (global suppressors) that alleviate the destabilizing effect of other mutations play a significant role in enzyme evolution (for more extensive discussions of protein stability and evolution, see the reviews by DePristo et al., 2005; Bloom and Glassman, 2009; Tokuriki and Tawfik, 2009b; Socha and Tokuriki, 2013). For example, in the restricted evolutionary trajectories of TEM‐1 β‐lactamase, M182T, one of the five mutations that led to the 100,000‐fold increase in antibiotic resistance, had no effect on function but exerted an important stabilizing effect (2.7 kcal/mol), which allowed the accumulation of other functional mutations without compromising soluble expression (Wang et al., 2002; Bloom et al., 2005; Bershtein et al., 2008; Brown et al., 2010a). An analysis comparing a nucleoprotein from two influenza virus strains isolated 39 years apart led to the identification of 39 mutational steps that occurred during the evolution of this protein. Systematic characterization of these mutations revealed that three mutations were highly deleterious when introduced individually into the parent protein (Bloom et al., 2010; Gong et al., 2013). However, their destabilizing effect was alleviated during the evolutionary trajectory by stabilizing mutations that provided a permissive background to accommodate them.
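The simple stability‐threshold model outlined earlier in this section can be sketched numerically. The following is a minimal illustration and not the calculation behind Fig. 4. It assumes a two‐state folding equilibrium in which the fraction of functional enzyme is 1/(1 + exp((ΔG − ΔGt)/RT)) with RT ≈ 0.593 kcal/mol; this functional form, the value of RT, and all names in the code are assumptions introduced here. The 5 kcal/mol margin, the +1 kcal/mol destabilization per mutation, and the two‐fold gain in kcat/KM per mutation are the values stated in the text.

```python
import math

RT = 0.593  # kcal/mol near 298 K (assumed; not given in the text)


def simulate(n_mutations=10, margin=5.0, ddg=1.0, fold_gain=2.0):
    """Return one row per mutational step: (i, dG - dGt, [E], kcat/KM, A).

    [E] is the assumed two-state fraction of functional enzyme,
    kcat/KM is relative to the starting enzyme, and A = [E] * kcat/KM.
    """
    rows = []
    for i in range(n_mutations + 1):
        dg_rel = -margin + ddg * i                    # dG - dGt, additive in i
        expr = 1.0 / (1.0 + math.exp(dg_rel / RT))    # sigmoidal in dG
        kcat_km = fold_gain ** i                      # exponential in i
        rows.append((i, dg_rel, expr, kcat_km, expr * kcat_km))
    return rows


if __name__ == "__main__":
    print(" i  dG-dGt     [E]   kcat/KM        A")
    for i, dg_rel, expr, kcat_km, activity in simulate():
        print(f"{i:2d}  {dg_rel:6.1f}  {expr:6.3f}  {kcat_km:8.0f}  {activity:7.2f}")
```

Under these assumptions, A rises over the first mutations, peaks when ΔG reaches ΔGt, and then falls, because each further mutation costs more in [E] than it gains in kcat/KM, even though ΔG and kcat/KM themselves respond additively.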

In experimental evolution and enzyme engineering, strategies to overcome stability constraints are essential to avoid evolutionary dead‐ends. Two strategies, incorporating stabilizing/compensatory mutations and buffering the effect of destabilizing mutations by chaperones, have been demonstrated as effective solutions (for a recent overview, see Socha and Tokuriki, 2013). Various methods have been developed to predict and experimentally screen stabilizing mutations and can be used to maintain protein stability and evolvability (Lehmann et al., 2000; Roodveldt et al., 2005; Bommarius et al., 2006; Mayer et al., 2007; Papp et al., 2011; Traxlmayr and Obinger, 2012; Socha and Tokuriki, 2013). Moreover, combining stabilizing/compensatory mutations with chaperone buffering can further enhance the evolvability of enzymes and enable sustainable enzyme engineering. We recently reported the experimental evolution of a phosphotriesterase (PTE) towards increased arylesterase activity. Over eighteen rounds of evolution, the overexpression of GroEL/ES chaperones was either switched on or off to control whether function‐altering mutations (in the presence of chaperones) or stabilizing compensatory mutations (in the absence of chaperones) were under selection (Tokuriki et al., 2012; Wyganowski et al., 2013). Overall, the accumulation of 12 function‐altering mutations and 6 compensatory mutations produced a significant increase in arylesterase activity toward 2‐naphthyl hexanoate (>10^4‐fold) (Tokuriki et al., 2012). In another study, the combination of stabilizing/compensatory mutations and chaperone buffering enabled switching the substrate specificity of a DNA methyltransferase (Rockah‐Shmuel and Tawfik, 2012). It is clear that without stability modulation, the evolution of these enzymes would have been greatly hindered and a complete switch of their functions could not have been accomplished.
Epistasis Intrinsic to Physicochemical Properties

Mutations, especially when occurring in close proximity in protein tertiary structure, can show strong epistasis in the intrinsic phenotypes of the fitness hierarchy (kcat/KM and/or ΔG, see also Box I). In adaptive evolution, new‐function mutations tend to cluster around the active site and often interact directly or indirectly, leading to epistasis on the functional level. Recently, several studies revealed that such intrinsic epistasis strongly constrains evolutionary trajectories.

Diminishing Returns. Diminishing returns are a universal phenomenon for any kind of optimization process, from economics to adaptive evolution; the closer the process is to its optimum, the slower and less cost‐effective optimization becomes (Stebbins, '44; Hartl et al., '85). Diminishing returns have been observed on different organizational levels of biological systems (e.g., metabolic pathways, bacterial evolution, the evolution of antibiotic resistance, enzymatic activities) (Arjan et al., '99; Miralles et al., 2000; MacLean et al., 2010; Tokuriki et al., 2012; Flynn et al., 2013). In some cases, the diminishing returns were

caused by antagonistic epistasis (Chou et al., 2011; Khan et al., 2011; Tokuriki et al., 2012); therefore, antagonistic epistasis has also been called “diminishing returns epistasis” (see also Box I). For example, during the laboratory evolution from PTE to arylesterase, strong diminishing returns were observed (Tokuriki et al., 2012). The early steps involved marked improvements: arylesterase activity increased 1,100‐fold via four substitutions in the first four rounds. Over the subsequent five rounds (until Round 9), the activity increased by another 7‐fold via five substitutions. In the last rounds (R10–R18), nine more substitutions only gave a 4.4‐fold higher rate. The molecular basis for the diminishing returns is that later mutations reinforce the effect of the initial mutations. The first mutation, H254R, adopted two rotamers in the crystal structure; one rotamer sterically blocks entrance of the new substrate into the active site, whereas the other is ideally positioned to stabilize the transition state of the reaction. In the next round, the mutation D233E occurred close to H254R and reinforced the positioning of Arg254 in its active conformation (Fig. 5A). Subsequent mutations further fine‐tuned the conformational shift and in the final, evolved variant, only one Arg254 conformer was observed.

Restrictive and Permissive Mutations. Strong magnitude epistasis and sign epistasis restrict the order in which mutations can occur. On the one hand, restrictive mutations can prevent other mutations from becoming fixed by negating their beneficial effect (antagonistic and sign epistasis). On the other hand, permissive mutations, which allow other mutations to arise, may have to be fixed prior to the appearance of the other mutations (synergistic and sign epistasis). Often, permissive mutations are global suppressors that increase protein stability and expression (see “Stability‐Mediated Epistasis” and Box I).
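The diminishing returns reported above for the PTE‐to‐arylesterase trajectory can be made concrete with a little arithmetic. The sketch below uses only the fold increases and substitution counts quoted in the text; the geometric‐mean gain per substitution is a summary statistic introduced here for illustration, and it assumes, purely for simplicity, that all substitutions within a phase contribute equally. The function and variable names are hypothetical.

```python
# (phase label, number of substitutions, fold increase in arylesterase activity)
# Values as quoted in the text (Tokuriki et al., 2012).
PHASES = [("R1-R4", 4, 1100.0), ("R5-R9", 5, 7.0), ("R10-R18", 9, 4.4)]


def per_substitution_gain(fold, n_subs):
    """Geometric-mean fold improvement per substitution (illustrative metric)."""
    return fold ** (1.0 / n_subs)


def summarize(phases=PHASES):
    """Return (label, mean gain per substitution, cumulative fold) per phase."""
    cumulative = 1.0
    summary = []
    for label, n_subs, fold in phases:
        cumulative *= fold
        summary.append((label, per_substitution_gain(fold, n_subs), cumulative))
    return summary


if __name__ == "__main__":
    for label, gain, cumulative in summarize():
        print(f"{label:8s} {gain:5.2f}-fold per substitution, "
              f"{cumulative:9.0f}-fold cumulative")
```

The mean gain per substitution drops from roughly 5.8‐fold in the first phase to about 1.5‐fold and then 1.2‐fold, while the cumulative product of the three phases is consistent with the >10^4‐fold overall improvement cited for this trajectory.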
In the examples discussed below, it has been shown that permissive mutations may also generate a local environment in which mutations that have no positive effect on their own exhibit a beneficial effect. Evolutionary trajectories that depend on such permissive and restrictive mutations are highly constrained in terms of the order of mutations (DePristo et al., 2007; Lozovsky et al., 2009; Brown et al., 2010b; Novais et al., 2010; Costanzo et al., 2011; de Visser et al., 2011; Toprak et al., 2011). Thornton et al. studied the evolution of vertebrate glucocorticoid hormone receptors using an ancestral resurrection approach (Ortlund et al., 2007). They identified seven mutations responsible for the evolution of specificity during the transition from the older ancestral receptor AncGR1, which recognizes aldosterone, DOC (deoxycorticosterone), and cortisol, to AncGR2, which is specific to cortisol. Combinatorial mutational analysis revealed that epistasis restricted the evolutionary trajectory, and that the accumulation of the mutations must have happened in a certain order. When introduced into the AncGR1 background, the mutation L111Q only marginally affected specificity for either hormone, while S106P dramatically reduced activation by all


Figure 5. Epistasis intrinsic to physicochemical properties. In the evolution from PTE to arylesterase, H254R initially adopts two rotamers: a bent, productive conformation involved in substrate binding (arylester analogue shown in yellow), and a nonproductive conformation. The subsequent introduction of D233E stabilizes the active conformer (Tokuriki et al., 2012) (A). In the evolution of hormone receptor specificity from AncGR1 (grey) to AncGR2 (blue), L111Q forms an H‐bond with the C17‐hydroxyl group of cortisol (shown in yellow), but only in the presence of the permissive mutation S106P, which repositions the helix containing residue 111 (black arrows, Ortlund et al., 2007). Adapted from Harms and Thornton (2013) (B). In the transition from AtzA to TriA, S331C serves as the initial, permissive mutation for N328D and F84L, which in turn permit E125D (Noor et al., 2012). Homology model of AtzA courtesy of Colin Jackson; atrazine shown in yellow (C). Parallel evolution of TEM‐1 β‐lactamase revealed sign epistasis between G238S and R164S or A237T, preventing their simultaneous incorporation (Salverda et al., 2011). Cefotaxime shown in yellow (D).

ligands. However, by combining these mutations, receptor preference was switched from aldosterone and DOC to cortisol. Structural analysis revealed the molecular basis for the strong epistasis: L111Q has the potential to introduce a new hydrogen bond specific to the C17‐hydroxyl group of cortisol, which is not present in the other hormones. However, this bond can only be established in the presence of Pro106; otherwise steric clashes occur (Fig. 5B). Furthermore, in the process of reducing specificity against aldosterone and deoxycorticosterone, two non‐functional, neutral mutations (N26T and Q105L) are required prior to the accumulation of three specificity‐switching mutations (L29M, F98I, and S212D). These neutral mutations appear to play a

permissive role by locally stabilizing the hormone‐binding site, thereby allowing the subsequent accumulation of functional mutations. The reversibility of the observed evolutionary pathway was also investigated (Bridgham et al., 2009). Intriguingly, reverting in AncGR2 the same subset of key mutations that had generated specificity during the forward evolution resulted in inactivation of the receptor rather than a return to the AncGR1 phenotype. It appears that five restrictive but functionally neutral mutations (H84Q, T91C, A107Y, G114Q, and L197M) block the reversion. Because the reversion of these neutral mutations cannot be selected for, these mutations serve as an epistatic ratchet, which makes reversion of the trajectory an extremely unlikely event. Although these

restrictive mutations do not directly interact with the specificity‐switching mutations, they seem to influence local conformational changes of the hormone‐binding site. The recent evolution of herbicide‐degrading enzymes, atrazine dechlorinase (AtzA) and its close relative melamine deaminase (TriA), in two different Pseudomonas species, provides insights into how epistasis shapes the transition between enzymatic functions. The two orthologues differ by only nine amino acid substitutions but possess distinct catalytic activities. AtzA is an efficient atrazine dechlorinase with no measurable deaminase activity, and TriA is an efficient deaminase with only a low level of promiscuous dechlorinase activity (Noor et al., 2012). Noor et al. reconstructed plausible evolutionary trajectories for the functional transition between AtzA and TriA. Overall, the deduced evolutionary pathway connecting the two enzymes exhibited a gradual and smooth transition with strong diminishing returns; however, the accumulation of mutations must follow a certain order. When introduced into AtzA, S331C increased melamine deaminase activity 18‐fold but none of the other eight mutations showed any detectable improvement. S331C played a permissive role as two of the initially neutral mutations, N328D and F84L, became advantageous after S331C, whereas the remaining five mutations were still neutral or even deleterious (Fig. 5C). The accumulation of these three mutations (F84L, N328D, S331C) then permitted another mutation, E125D, and from this point the remaining four mutations became beneficial. The authors also examined the reverse trajectory (evolution from TriA to AtzA), and showed that the order of mutations differs from simply reversing the forward process. Specifically, L84F was now independent of C331S, whereas D328N had to occur after C331S; otherwise it completely inactivated the enzyme (Noor et al., 2012).
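To illustrate how strongly permissive relationships restrict the order of mutations, the AtzA‐to‐TriA dependencies described above can be written as precedence rules and the accessible orders enumerated. This is a deliberately coarse sketch: it treats "permits" as a strict prerequisite, considers only the four named forward mutations, and ignores the graded fitness effects measured by Noor et al. (2012). The rule table and function names are assumptions introduced here.

```python
from itertools import permutations

# Hypothetical strict-precedence reading of the text:
# S331C permits N328D and F84L; all three together permit E125D.
REQUIRES = {
    "S331C": set(),
    "N328D": {"S331C"},
    "F84L": {"S331C"},
    "E125D": {"S331C", "N328D", "F84L"},
}


def is_accessible(order, requires=REQUIRES):
    """True if every mutation appears only after all of its prerequisites."""
    acquired = set()
    for mutation in order:
        if not requires[mutation] <= acquired:
            return False
        acquired.add(mutation)
    return True


def accessible_orders(requires=REQUIRES):
    """All orderings of the mutations that respect the precedence rules."""
    return [o for o in permutations(requires) if is_accessible(o, requires)]


if __name__ == "__main__":
    orders = accessible_orders()
    print(f"{len(orders)} of 24 possible orders are accessible:")
    for order in orders:
        print("  " + " -> ".join(order))
```

Under this strict reading only 2 of the 24 possible orders of the four mutations remain accessible, which conveys why trajectories that depend on permissive mutations are described as highly constrained.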
Epistasis between Initial Mutations Leads to Different Trajectories. Epistasis, especially reciprocal sign epistasis, may lead to alternative fitness peaks in the same function. Mutations occurring early in the evolutionary trajectory can significantly alter subsequent dynamics by opening up certain mutational pathways while closing others. Parallel evolutionary experiments, in which several lineages of experimental evolution are performed simultaneously, have revealed evolution to be contingent on such historical substitutions because different outcomes could be observed in different lines (Lobkovsky and Koonin, 2012). Salverda et al. (2011) performed 12 lines of parallel directed evolution of TEM‐1 β‐lactamase to confer higher resistance to cefotaxime. In the majority of cases, three key mutations (E104K, M182T, and G238S) known for their role in cefotaxime resistance occurred. However, two evolutionary trajectories deviated from this pattern, and accumulated the mutations R164S or A237T, but not G238S. Sign epistasis between G238S and either R164S or A237T prohibits their simultaneous accumulation (Fig. 5D). Interestingly, the trajectories originating from R164S and A237T

led to lower adaptive peaks, involving more and other substitutions than those observed in the typical pathway, although they were evolved under identical conditions. Dickinson et al. (2013) performed experimental evolution of T7 RNA polymerase toward recognition of a new promoter using their phage‐assisted continuous evolution system (PACE). The initial population of T7 RNA polymerase variants was divided into several lines, and evolved for either T3 (which differs from the T7 promoter in 6 out of 23 bases) or SP6 (11 substitutions relative to T7) promoter recognition. After several days of continuous evolution (100 generations) to adapt to these intermediate promoters, the populations were evolved for a common “final” promoter sequence (11, 7, or 3 substitutions away from the T7, T3, and SP6 promoter, respectively). Interestingly, some populations from the intermediate SP6 evolution adapted significantly better than other SP6 populations, and better than all of the T3 populations. Mutational analysis revealed strong antagonistic and sign epistasis among mutations occurring in the different trajectories, which prevented their simultaneous accumulation, and led to different evolutionary outcomes. Both parallel evolution experiments demonstrate that evolutionary events can be historically contingent on the order in which mutations accumulate, because favorable mutations can be mutually exclusive. However, these experiments also indicate that there is a relatively small number of evolutionary trajectories available. In both cases, only a limited number of solutions were observed, and key mutations were enriched in the different lines. A similar observation was made for the natural and experimental evolution of antibiotic resistance genes.
Experimental evolution of β‐lactamases and DHFR for resistance against new antibiotics consistently identified mutations that also appeared in clinical isolates (Schild et al., '91; Hall, 2002, 2004; Lozovsky et al., 2009; Novais et al., 2010; Salverda et al., 2010; Schenk et al., 2012). Therefore, in some enzymes with only a few distinct trajectories available, enzyme evolution might be highly reproducible and predictable.
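The categories of epistasis used throughout this section (none, magnitude, sign, and reciprocal sign) can be distinguished from four measurements: the fitness of the wild type, of each single mutant, and of the double mutant. The sketch below is one simple way to do this on a log scale, where additive effects correspond to no epistasis. The function name, the tolerance, and the numerical inputs in the example are hypothetical and are not data from the studies discussed; the sketch also does not separate antagonistic from synergistic magnitude epistasis.

```python
def classify_epistasis(f_wt, f_a, f_b, f_ab, tol=1e-9):
    """Classify pairwise epistasis from log-scale fitness values.

    Returns "none", "magnitude", "sign", or "reciprocal sign".
    """
    # Effect of each mutation in the wild-type and in the other mutant's background.
    a_in_wt, a_in_b = f_a - f_wt, f_ab - f_b
    b_in_wt, b_in_a = f_b - f_wt, f_ab - f_a

    # Sign epistasis: a mutation's effect changes sign with the background.
    a_flips = a_in_wt * a_in_b < 0
    b_flips = b_in_wt * b_in_a < 0

    if a_flips and b_flips:
        return "reciprocal sign"
    if a_flips or b_flips:
        return "sign"
    # Deviation of the double mutant from the additive expectation.
    if abs(f_ab - f_a - f_b + f_wt) > tol:
        return "magnitude"
    return "none"


if __name__ == "__main__":
    # Hypothetical log-fitness values: (wild type, mutant A, mutant B, double mutant)
    examples = [(0.0, 1.0, 1.0, 2.0), (0.0, 1.0, 1.0, 1.5),
                (0.0, 1.0, -1.0, 2.0), (0.0, 1.0, 1.0, 0.5)]
    for values in examples:
        print(values, "->", classify_epistasis(*values))
```

In the last hypothetical case, each mutation is beneficial alone but deleterious in the presence of the other, which is the situation that makes two favorable mutations mutually exclusive and can separate trajectories onto different fitness peaks.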

CONNECTIVITY BETWEEN FUNCTIONS ON A MULTI‐DIMENSIONAL FITNESS LANDSCAPE

The second major evolutionary constraint relates to the connectivity between different landscapes. The overlap between functional regions from different landscapes provides a springboard for the evolution of enzymatic activity. However, adaptive mutations that increase the new function may negatively affect the original, native function (Fig. 6). Such functional trade‐offs may cause severe evolutionary constraints; if the reduction in native activity exceeds the level necessary for organismal fitness and survival, it cannot be tolerated. Enzyme promiscuity and functional trade‐offs have been the subject of many studies and reviews (see Babtie et al., 2010; Khersonsky and Tawfik, 2010; Soskine and Tawfik, 2010, and references therein). Therefore, we have limited this section to the discussion of several recent works


Figure 6. Substrate walking. Nearby starting points can be accessed indirectly by first selecting for an intermediate substrate (A). The different related activities can be seen as a mountain range, which can be crossed in a stepwise fashion as new starting points become available, through a “continuous network” (B). As part of the sitagliptin synthesis process, the active site pocket of a transaminase was sequentially widened as the substrate was enlarged in two steps (C, Savile et al., 2010). In the evolution of a propane monooxygenase, on the other hand, the substrate was sequentially shrunk (D, Fasan et al., 2008). The evolution of DNA methyltransferases proceeded through progressive recognition of longer sequences (E, Rockah‐Shmuel and Tawfik, 2012).

and emphasize remaining challenges and open questions, namely the dynamics during the later stages of adaptation. We also highlight examples where a large evolutionary distance was overcome in the laboratory. A functional transition between two fitness peaks with little or no overlap (no promiscuous activity in the starting point) is a great challenge in enzyme engineering. However, a handful of successful examples have been reported, and will be discussed in “Bridging Distance between Distinct Enzymatic Functions”.

Trade‐Offs between New and Original Function

In order to catalyze a reaction efficiently, an active site must be exquisitely organized to physically accommodate the substrate and stabilize the transition state. This active site, which is highly optimized for a particular reaction, may be capable of performing alternative reactions but not ideally suited for them. Therefore, promiscuous activities are generally multiple orders of magnitude lower than the native reaction. Many natural enzymes are highly specific and it is likely that trade‐offs exist between different enzymatic functions (adaptation for one activity comes at the cost of another). Interestingly, numerous enzyme engineering studies have observed that climbing to the peak of a promiscuous activity often leads to a relatively small reduction of the native activity,

and engineering efforts often result in the creation of highly promiscuous or broad‐specificity enzymes (Aharoni et al., 2005; Khersonsky et al., 2006; Hult and Berglund, 2007; Bloom and Arnold, 2009). This indicates that evolution can follow a trajectory with only weak trade‐offs, and that multiple fitness peaks may extensively overlap (Fig. 2C and D). It has been discussed that this overlap “buys time” and enables a significant progression of adaptation prior to gene duplication (Kondrashov and Koonin, 2004; Khersonsky and Tawfik, 2010; Soskine and Tawfik, 2010). If, on the other hand, any improvement in promiscuous activity leads to a severe reduction of the native activity, that is, the overlap between fitness peaks is marginal (Fig. 2B and D), adaptation is difficult without the alleviation of these functional constraints by gene duplication (Newcomb et al., '97; McLoughlin and Copley, 2008).

Asymmetric Trade‐Offs and Specialization

Most directed evolution work has demonstrated that the initial stages of adaptation lead to highly promiscuous enzymes. However, the later stages of divergence, that is, the process of re‐specialization, have not been extensively investigated. Recently, several long‐term evolutionary experiments have begun addressing some of the remaining open questions: If trade‐offs

between new and original function are initially weak, do they remain so as adaptation proceeds further or will they eventually become stronger to drive specialization? Fasan et al. reported the experimental evolution of a propane monooxygenase from a P450 enzyme naturally active toward C12–C20 fatty acids by selecting for higher catalytic activity for propane (Fasan et al., 2008). Although a high level of activity towards longer alkyl chains remained after the first two rounds of evolution (31% for palmitate, 61% for laurate), a subsequent six rounds of evolution, which increased propane hydroxylation 9 × 10^3‐fold, reduced the original activity levels to 0.05% for palmitate, and no activity was detectable for laurate. The evolution from PTE to arylesterase also demonstrated that specialization can be achieved exclusively through selection for the new function (Tokuriki et al., 2012). A total switch in specificity of >10^9‐fold was observed, even though the native activity was never selected against. The trade‐off along the course of evolution was asymmetric; while initially weak, it became stronger as evolution progressed. In the first six rounds, arylester hydrolysis increased 1,600‐fold and paraoxon hydrolysis was reduced by a factor of 28. However, in the following 12 rounds, arylester hydrolysis increased only by another factor of 20, but paraoxonase activity was reduced 1,400‐fold. Similar dynamics during the transition between two functions have been observed in other directed evolution experiments (Afriat‐Jurnou et al., 2012; Meier et al., 2013). In those studies, specialization was achieved exclusively by selection of the new function, which indicates that very high catalytic efficiencies for both functions are mutually exclusive.
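The asymmetry of the PTE trade‐off can be checked with the fold changes quoted above. The sketch below multiplies the gains in the new function and the losses in the native function across the two phases, and compares the phases using the ratio log(loss)/log(gain). That ratio is an illustrative measure of trade‐off strength introduced here, not a quantity reported in the source, and the names in the code are hypothetical.

```python
import math

# phase -> (fold gain in arylesterase activity, fold loss in paraoxonase activity)
# Values as quoted in the text (Tokuriki et al., 2012).
PHASES = {"rounds 1-6": (1600.0, 28.0), "rounds 7-18": (20.0, 1400.0)}


def specificity_switch(phases=PHASES):
    """Return (total gain, total loss, overall specificity switch)."""
    gain = loss = 1.0
    for phase_gain, phase_loss in phases.values():
        gain *= phase_gain
        loss *= phase_loss
    return gain, loss, gain * loss


def trade_off_strength(phase_gain, phase_loss):
    """Orders of magnitude lost in the native activity per order gained."""
    return math.log10(phase_loss) / math.log10(phase_gain)


if __name__ == "__main__":
    gain, loss, switch = specificity_switch()
    print(f"new function: {gain:.0f}-fold up; native: {loss:.0f}-fold down; "
          f"switch: {switch:.2e}-fold")
    for label, (g, l) in PHASES.items():
        print(f"{label}: trade-off strength {trade_off_strength(g, l):.2f}")
```

The product of the phase‐wise changes (32,000‐fold gain and 39,200‐fold loss) gives a specificity switch of about 1.25 × 10^9, in line with the >10^9‐fold figure, and by this measure the trade‐off is roughly five times stronger in the later rounds than in the first six.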
In the case of PTE, the strong trade‐offs in the later rounds of evolution were associated with strong diminishing returns in the new function (Tokuriki et al., 2012, see also “Diminishing Returns”). Additionally, in all cases, the original function was still observed as a low‐level, promiscuous activity. Therefore, in nature, specialization based solely on trade‐offs is likely to be slow or may never occur, particularly for enzymes with a small or transitory contribution to overall organismal fitness (the diminishing returns make the advantage of improved variants increasingly small and difficult to select). In all likelihood, in the majority of cases, there may be no selection pressure to completely remove the original activity and other promiscuous activities as they might not significantly influence or contribute to metabolic flux and organismal fitness. Indeed, several recent studies suggest that many enzymes exhibit broad catalytic specificity and/or promiscuity, particularly enzymes in secondary metabolism (Bar‐Even et al., 2011; Nam et al., 2012; Weng et al., 2012; Weng, 2013). Secondary metabolism is by definition non‐essential (reactions only relevant in certain environments or at certain times, like the production of toxins by plants to fend off predators or the detoxification of harmful compounds) and evolved more recently than primary metabolism (reactions directly contributing to organismal growth and reproduction, like carbohydrate metabolism and amino acid biosynthesis). Therefore, enzymes in secondary metabolism might experience weaker selection pressure for high efficiency and specificity, and hold greater potential to acquire new functions (Bar‐Even et al., 2011; Nam et al., 2012; Weng et al., 2012). However, highly efficient enzymes from primary metabolism also display promiscuous activities and can be recruited as evolutionary starting points (Iffland et al., 2001; Gould and Tawfik, 2005; McLoughlin and Copley, 2008; Patrick and Matsumura, 2008; Kim et al., 2010; Soo et al., 2011). It is becoming clear that in addition to the well‐established metabolic network, an “underground metabolism” exists, which is orchestrated by promiscuous enzymes and plays a significant role in the physiological response to environmental change and in evolution (D'Ari and Casadesus, '98; Soo et al., 2011). Overall, complete specialization is difficult but may occur in some cases through direct negative selection (if the residual activity is toxic to the organism) or neutral drift (if the residual activity is not harmful). Specialization through negative selection against an undesired activity in the laboratory has also been reported (Varadarajan et al., 2005; Collins et al., 2006; Doyon et al., 2006; Melancon and Schultz, 2009).

Gene Duplication and Horizontal Gene Transfer to Alleviate Trade‐Off Constraints

For a complete functional transition, gene duplication followed by divergence is generally required. While many theoretical models exist that describe the processes that affect the fate of duplicated genes (Kondrashov and Koonin, 2004; Taylor and Raes, 2004; Innan and Kondrashov, 2010), the experimental capture of such dynamics is a great challenge. Recently, however, Nasvall et al. (2012) took on this challenge in a laboratory setting. S. enterica contains two homologous enzymes, HisA and TrpF, which carry out essential steps in the biosynthesis of histidine (His) and tryptophan (Trp), respectively.
Culturing of a TrpF knockout strain on media lacking both Trp and His led to a HisA mutant capable of contributing to Trp biosynthesis, albeit with reduced native activity. When this bifunctional HisA gene was placed in a genetic context where duplications and higher‐order amplification occur frequently, bacterial growth rates increased considerably within 500 generations of experimental evolution. Indeed, this increased fitness was caused by amplification of the HisA gene, which led to up to a 20‐fold increase in HisA dosage in some lineages. Moreover, functional point mutations, which occurred increasingly after the initial duplication phase, led to different evolutionary outcomes within the 3,000 generations monitored. Some lineages experienced specialization and possessed two distinct enzymes involved in either His or Trp biosynthesis (in some cases, specialization was incomplete within the time of the experiment, and one of the two enzymes still catalyzed the alternative reaction). In other lineages, a single, improved bifunctional enzyme became responsible for carrying out both tasks.

This experiment demonstrated how divergence of enzymatic function can occur gradually via gene duplication and specialization under positive selection, and that different evolutionary outcomes are viable. Intriguingly, this functional transition from a specialist HisA to a bifunctional enzyme back to a specialist enzyme has been observed in nature. It is known that members of Actinomycetes lack the gene for TrpF and instead possess a bifunctional enzyme called PriA, which is responsible for both His and Trp biosynthesis. PriA evolved from HisA to compensate for the loss of TrpF (Barona‐Gomez and Hodgson, 2003). Recently, Noda‐Garcia and Barona‐Gomez (2013) described that in some members of Corynebacterium, acquisition of a whole‐pathway tryptophan operon by horizontal gene transfer (HGT) has led to a division of labor between the newly acquired TrpF and the bifunctional PriA. Selective loss or mutation of the original Trp biosynthesis genes occurred, and functional mutations led to specialization of PriA back into HisA (neoHisA) (Noda‐Garcia and Barona‐Gomez, 2013). This work also indicates that specialization may be under positive selection to abolish regulatory cross‐talk between the now functionally and spatially separated Trp and His biosynthesis pathways, and does not occur simply due to the increased likelihood of deleterious mutations in redundant genes (Noda‐Garcia and Barona‐Gomez, 2013).

Bridging Distance between Distinct Enzymatic Functions

How can large evolutionary distances be overcome? So far, we have discussed cases where an activity was already present promiscuously in the enzyme and could be further enhanced by mutations and selection. However, in some cases, especially in enzyme engineering, the parent enzyme does not have an inherent basal level of the activity in question.
Neutral drift might be able to move the sequence to a region where the new fitness peak overlaps with the original one and in this way access the new activity (Amitai et al., 2007; Bloom et al., 2007; Bershtein and Tawfik, 2008; Wagner, 2008; Hayden et al., 2011). Alternatively, adaptive processes may open up a starting point, such as evolution for an intermediary function that bridges the two peaks (substrate walking). In yet other cases, events such as rearrangements or multiple simultaneous mutations may enable the “jump” from one peak to the other and lead to drastic structural and functional innovations. While such “hopeful monsters” (Gould, '77) are expected to be extremely rare in nature, they may be implemented in the lab if enough information on the sequence‐structure‐function relationship is available.

Substrate Walking to Connect Distant Fitness Peaks

One way to overcome a greater distance on the fitness landscape is to break down the challenge into several smaller challenges: an intermediary substrate, which resembles both the native and the ultimately desired substrate, can be used. Increases in activity towards the intermediate substrate may then introduce the desired

activity (Fig. 6A). This approach was originally called “in vitro co‐evolution” by Chen and Zhao (2005) and later dubbed “substrate walking” (Savile et al., 2010). An outstanding example of this approach was observed in the experimental evolution of a transaminase for the commercial synthesis of the antidiabetic drug sitagliptin (Fig. 6C, Savile et al., 2010). Because the active site of the starting enzyme was too small to transaminate the sitagliptin precursor molecule, Savile et al. first challenged the enzyme with a smaller intermediate substrate and identified a single mutation that improved activity toward this functional stepping stone. Subsequently, the active site was widened further by other mutations, which, in combination, introduced transaminase activity for the desired substrate. Importantly, it was only after activity for the intermediate substrate had been enhanced that successful selections for the actual substrate could be performed. Further rounds of evolution resulted in an enzyme that exhibited a 10^4‐fold increase in transaminase activity. The substrate walking approach was also successfully employed in the evolution of propane monooxygenase mentioned above (Fig. 6D, Fasan et al., 2008). Because the initial enzyme's substrate preference was directed towards longer alkyl chains (C12–C20), an initial round of screening for octane (C8) hydroxylation was performed, which first introduced the activity towards propane (C3). Likewise, the above‐mentioned evolution of the DNA methyltransferase M.HaeIII was performed stepwise (Fig. 6E, Rockah‐Shmuel and Tawfik, 2012). Although direct selections failed to yield any improvements, intermittently selecting for different promiscuous recognition sequences (e.g., M.AvaII) gave access to target sequence methylation (e.g., M.BamHI).
The authors found that the substrate walking was based on progressive expansion (or shrinkage, in other cases) of the recognition site: while the parent M.HaeIII recognizes the 4 bp sequence GGCC, M.AvaII methylates the 5 bp sequence GG(A/T)CC. Increased activity toward this sequence resulted in improved recognition of the desired 6 bp M.BamHI recognition site GGATCC, which could then be further improved by selection. In summary, several related functions may form a mountain range of overlapping fitness peaks that can be traversed by adaptive evolution in a continuous fashion. While the above examples are concerned with substrate specificity, it is likely that the same co‐evolutionary processes exist for distinct catalytic activities. Members of an enzyme superfamily may constitute functional clusters in which some peaks overlap substantially, while others are more separated, yielding a rich reservoir for the evolution of new enzymes specializing in related functions. The examples also show substrate walking to be an attractive engineering strategy when intermediate distances in sequence (and structure) must be overcome.

Simultaneous Introduction of Multiple Sequence Elements

If large evolutionary distances need to be bridged and distant peaks are to be connected, more drastic changes in sequence and

DYNAMICS AND CONSTRAINTS OF ENZYME EVOLUTION structure might be necessary. In nature, homologous DNA recombination is an important evolutionary mechanism that leads to a sudden increase in diversity compared to the stepwise accumulation of point mutations. Recombination of closely related homologues is routinely used in experimental evolution to generate diversity, recombine favorable, previously selected mutations, and remove deleterious mutations (Zhao et al., '98; Carbone and Arnold, 2007). Likewise, several techniques have been developed that mimic recombination between genes at points of low sequence identity (Carbone and Arnold, 2007) or the fusion of completely unrelated genes (Ostermeier et al., '99; Sieber et al., 2001). Furthermore, processes such as circular permutations (Mehta et al., 2012), elongation (Matsuura et al., '99) or truncation (Ostermeier et al., '99) of sequence elements at the protein termini, or insertions/deletions of bases throughout the gene (Jones, 2005; Afriat‐Jurnou et al., 2012; Kipnis et al., 2012) have all found their counterparts in the lab. An elegant example is the generation of a chimera from human and rat GST, which was more active toward the selection substrate than the parent gene from either organism (Griswold et al., 2005). Interestingly, the chimera serendipitously exhibited activity for a substrate not recognized by either of the parents, demonstrating how larger sequence rearrangements can lead to functional innovation. Recent advances in rational and computational design have enabled more complex enzyme restructuring. Park et al. (2006) reported the conversion of human glyoxalase II (GlyII) with no detectable b‐lactamase activity into a metallo‐b‐ lactamase (MBL). Comparison of the two protein structures led to the identification of regions important for the activity switch, which were then modified. 
The C‐terminal domain was deleted from the GlyII scaffold; partially randomized functional loops originating from the natural metallo‐β‐lactamase IMP‐1 were inserted, and metal‐binding residues were changed to those found in the IMP‐1 active site (Fig. 7). Random point mutations completed this process, which yielded an MBL conveying cefotaxime resistance, but with no traces of the original glyoxalase activity. Notably, once the initial level of cefotaxime resistance was established, it became accessible to further optimization by directed evolution (Sun et al., 2013). It is now possible to design enzyme active sites de novo (Bolon and Mayo, 2001; Kiss et al., 2013). Generally, the designed catalytic residues are introduced into existing protein scaffolds with no initial activity. A recent example is that of a designed retro‐aldolase with a rate enhancement of 15,000 (kcat 0.1 s⁻¹, kcat/KM 0.2 M⁻¹ s⁻¹, Giger et al., 2013). The initial design could be improved by a factor of 4,400 (kcat ca. 200 s⁻¹, kcat/KM ca. 500 M⁻¹ s⁻¹) by a combination of iterative cassette mutagenesis, random mutations, and DNA shuffling. The potential of computational design to enable large jumps in sequence space is also illustrated by the fact that even reactions not assigned to any natural enzyme, such as the Diels–Alder reaction (Siegel et al., 2010) and the Kemp elimination (Rothlisberger et al., 2008), have been implemented successfully. Nevertheless, successfully carrying out large‐scale protein restructuring in the lab remains a great challenge. To date, only a few examples have been reported, but these studies pave the way for the development of more successful design strategies.

Perspectives

Experimental enzyme evolution enables us to generate evolutionary trajectories in the laboratory, and analyze the molecular fossil record making up these trajectories. Over recent years, in combination with enzymology, bioinformatics, and ancestral reconstruction techniques, experimental evolution has led to

Figure 7. “Jumping” onto a new fitness peak by rational design. Glyoxalase II (GlyII) has no detectable β‐lactamase activity. Through exchange of multiple active site loops for sequences resembling natural β‐lactamases, deletion of the C‐terminal domain, and point mutations, β‐lactamase activity was introduced, while glyoxalase activity was completely abolished (Park et al., 2006).

advances in the description of evolutionary dynamics that have complemented traditional evolutionary biology. Unraveling how, why, and to what extent enzyme evolution is restricted is key for understanding and predicting natural evolution as well as enhancing our ability to engineer and generate novel enzymes. Epistasis is increasingly recognized as a major constraint in evolution, which restricts accessible trajectories and can lead to different evolutionary outcomes. Nevertheless, many open questions remain to be addressed. How rugged is a real fitness landscape? In other words, how frequent and how pronounced is epistasis? To date, most systematic analyses focus on only one protein, on only a few mutations, and on only one particular trajectory (Weinreich et al., 2006; Ortlund et al., 2007; Lozovsky et al., 2009; Novais et al., 2010; Noor et al., 2012). A larger scale, systematic mutational and computational analysis of proteins with different folds and functions, orthologous proteins, or paralogs in different cellular contexts would be highly valuable to uncover the prevalence of epistasis. Moreover, phenotype/genotype mapping of double and higher order mutational effects would help us to understand how mutations within a trajectory, but also from different, incompatible trajectories interact (Lunzer et al., 2010; Ackermann and Beyer, 2012). It is crucial to gain a better understanding of the molecular mechanisms leading to intrinsic epistasis, which so far has been achieved in only a few studies (Harms and Thornton, 2013). In particular, the interaction of mutations that epistatically influence enzyme activity, but are not in close proximity in the protein structure, should be explored. It is likely that conformational dynamics are a key element in such long‐distance interactions (Tomatis et al., 2008; Tokuriki and Tawfik, 2009a).
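
As an illustration of what mapping double‐mutant effects involves, the following is a minimal sketch of how pairwise epistasis could be quantified from a double‐mutant cycle. It assumes that a fitness or activity value is available for the wild type, both single mutants, and the double mutant, and it uses a log‐additive null model, which is one common convention rather than the method of any study cited here. The function names, the tolerance `tol`, and the example values are hypothetical.

```python
import math


def pairwise_epistasis(w_wt, w_a, w_b, w_ab):
    """Log-scale deviation of the double mutant from the additive expectation.

    A value of zero means the two mutations combine independently
    (multiplicatively on the raw scale); any other value indicates epistasis.
    """
    return (math.log(w_ab / w_wt)
            - math.log(w_a / w_wt)
            - math.log(w_b / w_wt))


def classify_epistasis(w_wt, w_a, w_b, w_ab, tol=1e-9):
    """Classify a double-mutant cycle as none, magnitude, sign, or reciprocal sign."""
    eps = pairwise_epistasis(w_wt, w_a, w_b, w_ab)
    if abs(eps) <= tol:
        return "none"
    # A mutation shows sign epistasis if its effect changes direction
    # depending on whether the other mutation is already present.
    a_flips = (w_a - w_wt) * (w_ab - w_b) < 0
    b_flips = (w_b - w_wt) * (w_ab - w_a) < 0
    if a_flips and b_flips:
        return "reciprocal sign"
    if a_flips or b_flips:
        return "sign"
    return "magnitude"


if __name__ == "__main__":
    # Hypothetical relative activities: (wild type, mutant A, mutant B, double mutant)
    cycles = [
        (1.0, 2.0, 4.0, 8.0),
        (1.0, 2.0, 3.0, 4.0),
        (1.0, 0.5, 2.0, 4.0),
        (1.0, 0.5, 0.5, 2.0),
    ]
    for cycle in cycles:
        print(cycle, classify_epistasis(*cycle))
```

In this sketch, sign and reciprocal sign epistasis are the cases that can block a mutational step and make a landscape rugged, whereas magnitude epistasis only changes the size of a step that remains beneficial or deleterious.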
Experimental parallel evolution is an extremely powerful tool to examine evolutionary dynamics because it allows the possible trajectories to be examined statistically (Lobkovsky and Koonin, 2012). Parallel enzyme evolution experiments can be performed in different environments and conditions (e.g., strength of the selection pressure, mutation rate or population size) and complement whole‐organism parallel evolution. The different trajectories that are obtained in such experiments also provide excellent material to explore the ruggedness of the fitness landscape (Esvelt et al., 2011; Salverda et al., 2011; Nasvall et al., 2012). These experiments also provide valuable training sets to predict the evolution of enzymes that threaten global health by responding to xenobiotic compounds such as antibiotics (Kogenaru et al., 2009). Generally, a discussion of epistasis would benefit from a clear differentiation between hierarchical epistasis and intrinsic epistasis. Our understanding of the molecular and evolutionary basis for functional innovation is developing. We are learning when and how evolutionary starting points become accessible, as well as about the trade‐offs that shape and constrain the transition between two enzymatic functions. Building on this knowledge will help us recognize the factors that lead to increased evolvability. Evidently, some proteins are better starting points than others, but which sequences exactly? Did ancestral enzymes with broad specificity have a higher evolutionary potential than extant enzymes? Are some folds, or some subsets of the protein structure, such as loops, more evolvable than others (Dellus‐Gur et al., 2013)? Furthermore, when no starting points are available, how can they be found? Neutral drift and substrate walking, that is, the iterative adaptation towards new, related substrates, are the likely scenarios for the evolution of diversity within enzyme superfamilies. Which other processes lead to functional innovation, and how? To date, experimental evolution has almost exclusively focused on point mutations. Recreating and analyzing other diversifying mechanisms such as insertions/deletions will expand our understanding of evolutionary dynamics. Moreover, gene duplication and specialization, which are difficult to study in natural systems, as selective pressures are often ambiguous, can be studied by experimental evolution. The vast majority of enzymes evolved in the lab are hydrolases, while other enzyme classes such as isomerases, ligases, etc. are severely underrepresented. In addition, hydrolase activity is usually shown through easily detectable, but not necessarily physiologically relevant, substrates. One of the remaining challenges for the field is to move away from oversimplified models. This includes studying enzymes as part of more realistic and complex levels of the fitness hierarchy, that is, working in concert with other proteins within metabolic and regulatory networks. Connecting enzyme evolutionary dynamics to organismal fitness and genome evolution is one of the ultimate challenges in the field. Which technological developments are necessary to overcome the limitations imposed by the constraints on experimental evolution?
Directed evolution via multiple parallel trajectories and/or multiple starting points might increase the chance of accessing higher fitness peaks. More challenging systems with no starting point are increasingly implemented. Perhaps the most notable advancement in this regard was the advent of computational design: Predicting the stability effect of mutations by computational algorithms is becoming more reliable, and can be very useful to identify stabilizing mutations (Cabrita et al., 2007). Moreover, it is now possible to create functional innovation at will, “from scratch.” Nevertheless, improvements in this area are still necessary to increase design efficiency. Furthermore, the computational prediction of epistatic interactions between mutations will help overcome evolutionary dead‐ends because it offers a means to identify and then “unlock” permissive/restrictive mutations (McLaughlin et al., 2012). Insertions, deletions, elongations and truncations constitute a largely untapped reservoir of diversifying elements. Such changes, both random and designed, may lead to more drastic functional changes and in combination with point mutations access more distant starting points and generate new and efficient enzymes.
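
The idea that several starting points can reach higher fitness peaks than a single one can be illustrated with a toy model. The sketch below is not drawn from any cited experiment: it assumes a hypothetical three‐site landscape with invented fitness values, and a greedy adaptive walk that always accepts the fittest single point mutation. All names (`FITNESS`, `neighbors`, `greedy_walk`, `peaks_reached`) are illustrative.

```python
# Hypothetical fitness values for all genotypes of a three-site, two-state
# landscape (0 = wild-type residue, 1 = mutated residue). The values are
# invented so that the landscape has one local and one global peak.
FITNESS = {
    (0, 0, 0): 1.0,
    (1, 0, 0): 1.5,  # local peak: every further single mutation is deleterious
    (0, 1, 0): 0.8,
    (0, 0, 1): 1.2,
    (1, 1, 0): 1.1,
    (1, 0, 1): 1.3,
    (0, 1, 1): 2.0,
    (1, 1, 1): 3.0,  # global peak
}


def neighbors(genotype):
    """Return all genotypes that differ from `genotype` by one point mutation."""
    return [genotype[:i] + (1 - genotype[i],) + genotype[i + 1:]
            for i in range(len(genotype))]


def greedy_walk(start, fitness):
    """Take the fittest single-mutant step until no neighbor is fitter."""
    path = [start]
    current = start
    while True:
        best = max(neighbors(current), key=fitness.__getitem__)
        if fitness[best] <= fitness[current]:
            return path  # current genotype is a (local or global) peak
        current = best
        path.append(current)


def peaks_reached(starts, fitness):
    """Map each starting genotype to the peak its greedy walk ends on."""
    return {start: greedy_walk(start, fitness)[-1] for start in starts}


if __name__ == "__main__":
    for start, peak in peaks_reached([(0, 0, 0), (0, 0, 1), (0, 1, 0)], FITNESS).items():
        print(start, "->", peak, "fitness", FITNESS[peak])
```

In this toy landscape the walk from the wild type stops on the local peak, while walks from two of its neighbors reach the global peak. Real landscapes are far larger and selection is stochastic, so the sketch only shows why the choice of starting point can matter.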


ACKNOWLEDGMENTS

We thank Janine Copp, Lianet Noda‐Garcia, and Balint Kintses for useful comments on the manuscript. This work was supported by the Natural Sciences and Engineering Research Council of Canada, the Canadian Institutes of Health Research (CIHR), and the Human Frontier Science Program. N.T. is a CIHR new investigator and a Michael Smith Foundation for Health Research (MSFHR) career investigator.

LITERATURE CITED

Ackermann M, Beyer A. 2012. Systematic detection of epistatic interactions based on allele pair frequencies. PLoS Genet 8:e1002463.
Afriat L, Roodveldt C, Manco G, Tawfik DS. 2006. The latent promiscuity of newly identified microbial lactonases is linked to a recently diverged phosphotriesterase. Biochemistry 45:13677–13686.
Afriat‐Jurnou L, Jackson CJ, Tawfik DS. 2012. Reconstructing a missing link in the evolution of a recently diverged phosphotriesterase by active‐site loop remodeling. Biochemistry 51:6047–6055.
Aharoni A, Gaidukov L, Khersonsky O, et al. 2005. The ‘evolvability’ of promiscuous protein functions. Nat Genet 37:73–76.
Aita T, Husimi Y. 1998. Adaptive walks by the fittest among finite random mutants on a Mt. Fuji‐type Fitness Landscape. J Theor Biol 193:383–405.
Amitai G, Gupta RD, Tawfik DS. 2007. Latent evolutionary potentials under the neutral mutational drift of an enzyme. HFSP J 1:67–78.
Arjan JA, Visser M, Zeyl CW, et al. 1999. Diminishing returns from mutation supply rate in asexual populations. Science 283:404–406.
Armstrong RN. 2000. Mechanistic diversity in a metalloenzyme superfamily. Biochemistry 39:13625–13632.
Aylor DL, Zeng ZB. 2008. From classical genetics to quantitative genetics to systems biology: modeling epistasis. PLoS Genet 4:e1000029.
Babtie A, Tokuriki N, Hollfelder F. 2010. What makes an enzyme promiscuous? Curr Opin Chem Biol 14:200–207.
Bar‐Even A, Noor E, Savir Y, et al. 2011. The moderately efficient enzyme: evolutionary and physicochemical trends shaping enzyme parameters. Biochemistry 50:4402–4410.
Bar‐Rogovsky H, Hugenmatter A, Tawfik DS. 2013. The evolutionary origins of detoxifying enzymes: the mammalian serum paraoxonases (PONs) relate to bacterial homoserine lactonases. J Biol Chem 288:23914–23927.
Barona‐Gomez F, Hodgson DA. 2003. Occurrence of a putative ancient‐like isomerase involved in histidine and tryptophan biosynthesis. EMBO Rep 4:296–300.
Bebrone C. 2007. Metallo‐beta‐lactamases (classification, activity, genetic organization, structure, zinc coordination) and their superfamily. Biochem Pharmacol 74:1686–1701.
Bershtein S, Tawfik DS. 2008. Ohno's model revisited: measuring the frequency of potentially adaptive mutations under various mutational drifts. Mol Biol Evol 25:2311–2318.

Bershtein S, Segal M, Bekerman R, Tokuriki N, Tawfik DS. 2006. Robustness‐epistasis link shapes the fitness landscape of a randomly drifting protein. Nature 444:929–932.
Bershtein S, Goldin K, Tawfik DS. 2008. Intense neutral drifts yield robust and evolvable consensus proteins. J Mol Biol 379:1029–1044.
Bloom JD, Arnold FH. 2009. In the light of directed evolution: pathways of adaptive protein evolution. Proc Natl Acad Sci USA 106(Suppl 1):9995–10000.
Bloom JD, Glassman MJ. 2009. Inferring stabilizing mutations from protein phylogenies: application to influenza hemagglutinin. PLoS Comput Biol 5:e1000349.
Bloom JD, Silberg JJ, Wilke CO, et al. 2005. Thermodynamic prediction of protein neutrality. Proc Natl Acad Sci USA 102:606–611.
Bloom JD, Romero PA, Lu Z, Arnold FH. 2007. Neutral genetic drift can alter promiscuous protein functions, potentially aiding functional evolution. Biol Direct 2:17.
Bloom JD, Gong LI, Baltimore D. 2010. Permissive secondary mutations enable the evolution of influenza oseltamivir resistance. Science 328:1272–1275.
Bolon DN, Mayo SL. 2001. Enzyme‐like proteins by computational design. Proc Natl Acad Sci USA 98:14274–14279.
Bommarius AS, Broering JM, Chaparro‐Riggers JF, Polizzi KM. 2006. High‐throughput screening for enhanced protein stability. Curr Opin Biotechnol 17:606–610.
Bonhoeffer S, Chappey C, Parkin NT, Whitcomb JM, Petropoulos CJ. 2004. Evidence for positive epistasis in HIV‐1. Science 306:1547–1550.
Breen MS, Kemena C, Vlasov PK, Notredame C, Kondrashov FA. 2012. Epistasis as the primary factor in molecular evolution. Nature 490:535–538.
Bridgham JT, Ortlund EA, Thornton JW. 2009. An epistatic ratchet constrains the direction of glucocorticoid receptor evolution. Nature 461:515–519.
Brown NG, Pennington JM, Huang W, Ayvaz T, Palzkill T. 2010a. Multiple global suppressors of protein stability defects facilitate the evolution of extended‐spectrum TEM beta‐lactamases. J Mol Biol 404:832–846.
Brown KM, Costanzo MS, Xu W, et al. 2010b. Compensatory mutations restore fitness during the evolution of dihydrofolate reductase. Mol Biol Evol 27:2682–2690.
Cabrita LD, Gilis D, Robertson AL, et al. 2007. Enhancing the stability and solubility of TEV protease using in silico design. Protein Sci 16:2360–2367.
Carbone MN, Arnold FH. 2007. Engineering by homologous recombination: exploring sequence and function within a conserved fold. Curr Opin Struct Biol 17:454–459.
Carneiro M, Hartl DL. 2010. Colloquium papers: adaptive landscapes and protein evolution. Proc Natl Acad Sci USA 107(Suppl 1):1747–1751.
Chen Z, Zhao H. 2005. Rapid creation of a novel protein function by in vitro coevolution. J Mol Biol 348:1273–1282.

Chou HH, Chiu HC, Delaney NF, Segre D, Marx CJ. 2011. Diminishing returns epistasis among beneficial mutations decelerates adaptation. Science 332:1190–1192.
Collins CH, Leadbetter JR, Arnold FH. 2006. Dual selection enhances the signaling specificity of a variant of the quorum‐sensing transcriptional activator LuxR. Nat Biotechnol 24:708–712.
Conant GC, Wolfe KH. 2008. Turning a hobby into a job: how duplicated genes find new functions. Nat Rev Genet 9:938–950.
Copley SD. 2009. Evolution of efficient pathways for degradation of anthropogenic chemicals. Nat Chem Biol 5:559–566.
Costanzo MS, Brown KM, Hartl DL. 2011. Fitness trade‐offs in the evolution of dihydrofolate reductase and drug resistance in Plasmodium falciparum. PLoS ONE 6:e19636.
Dalby PA. 2011. Strategy and success for the directed evolution of enzymes. Curr Opin Struct Biol 21:473–480.
D'Ari R, Casadesus J. 1998. Underground metabolism. Bioessays 20:181–186.
Dawid A, Kiviet DJ, Kogenaru M, de Vos M, Tans SJ. 2010. Multiple peaks and reciprocal sign epistasis in an empirically determined genotype‐phenotype landscape. Chaos 20:026105.
de Visser JA, Cooper TF, Elena SF. 2011. The causes of epistasis. Proc Biol Sci 278:3617–3624.
Dean AM, Thornton JW. 2007. Mechanistic approaches to the study of evolution: the functional synthesis. Nat Rev Genet 8:675–688.
Dean AM, Dykhuizen DE, Hartl DL. 1988. Fitness effects of amino acid replacements in the beta‐galactosidase of Escherichia coli. Mol Biol Evol 5:469–485.
Dellus‐Gur E, Toth‐Petroczy A, Elias M, Tawfik DS. 2013. What makes a protein fold amenable to functional innovation? Fold polarity and stability trade‐offs. J Mol Biol 425:2609–2621.
DePristo MA, Weinreich DM, Hartl DL. 2005. Missense meanderings in sequence space: a biophysical view of protein evolution. Nat Rev Genet 6:678–687.
DePristo MA, Hartl DL, Weinreich DM. 2007. Mutational reversions during adaptive protein evolution. Mol Biol Evol 24:1608–1610.
Dickinson BC, Leconte AM, Allen B, Esvelt KM, Liu DR. 2013. Experimental interrogation of the path dependence and stochasticity of protein evolution using phage‐assisted continuous evolution. Proc Natl Acad Sci USA 110:9007–9012.
Dill KA. 1997. Additivity principles in biochemistry. J Biol Chem 272:701–704.
Doyon JB, Pattanayak V, Meyer CB, Liu DR. 2006. Directed evolution and substrate specificity profile of homing endonuclease I‐SceI. J Am Chem Soc 128:2477–2484.
Dykhuizen DE, Dean AM, Hartl DL. 1987. Metabolic flux and fitness. Genetics 115:25–31.
Elena SF, Lenski RE. 2003. Evolution experiments with microorganisms: the dynamics and genetic bases of adaptation. Nat Rev Genet 4:457–469.
Esvelt KM, Carlson JC, Liu DR. 2011. A system for the continuous directed evolution of biomolecules. Nature 472:499–503.


Fasan R, Meharenna YT, Snow CD, Poulos TL, Arnold FH. 2008. Evolutionary history of a specialized p450 propane monooxygenase. J Mol Biol 383:1069–1080.
Flynn KM, Cooper TF, Moore FB, Cooper VS. 2013. The environment affects epistatic interactions to alter the topology of an empirical fitness landscape. PLoS Genet 9:e1003426.
Furnham N, Sillitoe I, Holliday GL, et al. 2012. Exploring the evolution of novel enzyme functions within structurally defined protein superfamilies. PLoS Comput Biol 8:e1002403.
Gerlt JA, Allen KN, Almo SC, et al. 2011. The enzyme function initiative. Biochemistry 50:9950–9962.
Gerlt JA, Babbitt PC, Jacobson MP, Almo SC. 2012. Divergent evolution in enolase superfamily: strategies for assigning functions. J Biol Chem 287:29–34.
Giger L, Caner S, Obexer R, et al. 2013. Evolution of a designed retro‐aldolase leads to complete active site remodeling. Nat Chem Biol 9:494–498.
Glasner ME, Gerlt JA, Babbitt PC. 2006. Evolution of enzyme superfamilies. Curr Opin Chem Biol 10:492–497.
Gong LI, Suchard MA, Bloom JD. 2013. Stability‐mediated epistasis constrains the evolution of an influenza protein. Elife 2:e00631.
Gould SJ. 1977. The return of hopeful monsters. Nat Hist 86:30.
Gould SM, Tawfik DS. 2005. Directed evolution of the promiscuous esterase activity of carbonic anhydrase II. Biochemistry 44:5444–5452.
Griswold KE, Kawarasaki Y, Ghoneim N, et al. 2005. Evolution of highly active enzymes by homology‐independent recombination. Proc Natl Acad Sci USA 102:10082–10087.
Gumulya Y, Reetz MT. 2011. Enhancing the thermal robustness of an enzyme by directed evolution: least favorable starting points and inferior mutants can map superior evolutionary pathways. Chembiochem 12:2502–2510.
Hall BG. 2002. Predicting evolution by in vitro evolution requires determining evolutionary pathways. Antimicrob Agents Chemother 46:3035–3038.
Hall BG. 2004. Predicting the evolution of antibiotic resistance genes. Nat Rev Microbiol 2:430–435.
Harms MJ, Thornton JW. 2013. Evolutionary biochemistry: revealing the historical and physical causes of protein properties. Nat Rev Genet 14:559–571.
Hartl DL, Dykhuizen DE, Dean AM. 1985. Limits of adaptation: the evolution of selective neutrality. Genetics 111:655–674.
Hayden EJ, Ferrada E, Wagner A. 2011. Cryptic genetic variation promotes rapid evolutionary adaptation in an RNA enzyme. Nature 474:92–95.
Huang R, Hippauf F, Rohrbeck D, et al. 2012. Enzyme functional evolution through improved catalysis of ancestrally nonpreferred substrates. Proc Natl Acad Sci USA 109:2966–2971.
Hult K, Berglund P. 2007. Enzyme promiscuity: mechanism and applications. Trends Biotechnol 25:231–238.

Iffland A, Gendreizig S, Tafelmeyer P, Johnsson K. 2001. Changing the substrate specificity of cytochrome c peroxidase using directed evolution. Biochem Biophys Res Commun 286:126–132.
Innan H, Kondrashov F. 2010. The evolution of gene duplications: classifying and distinguishing between models. Nat Rev Genet 11:97–108.
Jacob F. 1977. Evolution and tinkering. Science 196:1161–1166.
Jensen RA. 1976. Enzyme recruitment in evolution of new function. Annu Rev Microbiol 30:409–425.
Jones DD. 2005. Triplet nucleotide removal at random positions in a target gene: the tolerance of TEM‐1 beta‐lactamase to an amino acid deletion. Nucleic Acids Res 33:e80.
Kachanovsky DE, Filler S, Isaacson T, Hirschberg J. 2012. Epistasis in tomato color mutations involves regulation of phytoene synthase 1 expression by cis‐carotenoids. Proc Natl Acad Sci USA 109:19021–19026.
Kauffman S, Levin S. 1987. Towards a general theory of adaptive walks on rugged landscapes. J Theor Biol 128:11–45.
Kawecki TJ, Lenski RE, Ebert D, et al. 2012. Experimental evolution. Trends Ecol Evol 27:547–560.
Khan AI, Dinh DM, Schneider D, Lenski RE, Cooper TF. 2011. Negative epistasis between beneficial mutations in an evolving bacterial population. Science 332:1193–1196.
Khersonsky O, Tawfik DS. 2010. Enzyme promiscuity: a mechanistic and evolutionary perspective. Annu Rev Biochem 79:471–505.
Khersonsky O, Roodveldt C, Tawfik DS. 2006. Enzyme promiscuity: evolutionary and mechanistic aspects. Curr Opin Chem Biol 10:498–508.
Kim J, Kershner JP, Novikov Y, Shoemaker RK, Copley SD. 2010. Three serendipitous pathways in E. coli can bypass a block in pyridoxal‐5′‐phosphate synthesis. Mol Syst Biol 6:436.
Kipnis Y, Dellus‐Gur E, Tawfik DS. 2012. TRINS: a method for gene modification by randomized tandem repeat insertions. Protein Eng Des Sel 25:437–444.
Kiss G, Celebi‐Olcum N, Moretti R, Baker D, Houk KN. 2013. Computational enzyme design. Angew Chem Int Ed Engl 52:5700–5725.
Kogenaru M, de Vos MG, Tans SJ. 2009. Revealing evolutionary pathways by fitness landscape reconstruction. Crit Rev Biochem Mol Biol 44:169–174.
Kondrashov FA, Koonin EV. 2004. A common framework for understanding the origin of genetic dominance and evolutionary fates of gene duplications. Trends Genet 20:287–290.
Koonin EV, Wolf YI. 2010. Constraints and plasticity in genome and molecular‐phenome evolution. Nat Rev Genet 11:487–498.
Kouyos RD, Leventhal GE, Hinkley T, et al. 2012. Exploring the complexity of the HIV‐1 fitness landscape. PLoS Genet 8:e1002551.
Kryazhimskiy S, Dushoff J, Bazykin GA, Plotkin JB. 2011. Prevalence of epistasis in the evolution of influenza A surface proteins. PLoS Genet 7:e1001301.

Kuliopulos A, Talalay P, Mildvan AS. 1990. Combined effects of two mutations of catalytic residues on the ketosteroid isomerase reaction. Biochemistry 29:10271–10280.
Kvitek DJ, Sherlock G. 2011. Reciprocal sign epistasis between frequently experimentally evolved adaptive mutations causes a rugged fitness landscape. PLoS Genet 7:e1002056.
Lehmann M, Pasamontes L, Lassen SF, Wyss M. 2000. The consensus concept for thermostability engineering of proteins. Biochim Biophys Acta 1543:408–415.
Lobkovsky AE, Koonin EV. 2012. Replaying the tape of life: quantification of the predictability of evolution. Front Genet 3:246.
Loewe L. 2012. How evolutionary systems biology will help understand adaptive landscapes and distributions of mutational effects. Adv Exp Med Biol 751:399–410.
Lozovsky ER, Chookajorn T, Brown KM, et al. 2009. Stepwise acquisition of pyrimethamine resistance in the malaria parasite. Proc Natl Acad Sci USA 106:12025–12030.
Lunzer M, Miller SP, Felsheim R, Dean AM. 2005. The biochemical architecture of an ancient adaptive landscape. Science 310:499–501.
Lunzer M, Golding GB, Dean AM. 2010. Pervasive cryptic epistasis in molecular evolution. PLoS Genet 6:e1001162.
MacLean RC, Perron GG, Gardner A. 2010. Diminishing returns from beneficial mutations and pervasive epistasis shape the fitness landscape for rifampicin resistance in Pseudomonas aeruginosa. Genetics 186:1345–1354.
Martin G, Elena SF, Lenormand T. 2007. Distributions of epistasis in microbes fit predictions from a fitness landscape model. Nat Genet 39:555–560.
Matsumura I, Ellington AD. 2001. In vitro evolution of beta‐glucuronidase into a beta‐galactosidase proceeds through non‐specific intermediates. J Mol Biol 305:331–339.
Matsuura T, Miyai K, Trakulnaleamsai S, et al. 1999. Evolutionary molecular engineering by random elongation mutagenesis. Nat Biotechnol 17:58–61.
Mayer S, Rudiger S, Ang HC, Joerger AC, Fersht AR. 2007. Correlation of levels of folded recombinant p53 in Escherichia coli with thermodynamic stability in vitro. J Mol Biol 372:268–276.
McLaughlin RN Jr, Poelwijk FJ, Raman A, Gosal WS, Ranganathan R. 2012. The spatial architecture of protein function and adaptation. Nature 491:138–142.
McLoughlin SY, Copley SD. 2008. A compromise required by gene sharing enables survival: implications for evolution of new enzyme activities. Proc Natl Acad Sci USA 105:13497–13502.
Mehta MM, Liu S, Silberg JJ. 2012. A transposase strategy for creating libraries of circularly permuted proteins. Nucleic Acids Res 40:e71.
Meier MM, Rajendran C, Malisi C, et al. 2013. Molecular engineering of organophosphate hydrolysis activity from a weak promiscuous lactonase template. J Am Chem Soc 135:11670–11677.
Melancon CE III, Schultz PG. 2009. One plasmid selection system for the rapid evolution of aminoacyl‐tRNA synthetases. Bioorg Med Chem Lett 19:3845–3847.

Michalakis Y, Roze D. 2004. Evolution. Epistasis in RNA viruses. Science 306:1492–1493.
Miralles R, Moya A, Elena SF. 2000. Diminishing returns of population size in the rate of RNA virus adaptation. J Virol 74:3566–3571.
Nam H, Lewis NE, Lerman JA, et al. 2012. Network context and selection in the evolution to enzyme specificity. Science 337:1101–1104.
Nasvall J, Sun L, Roth JR, Andersson DI. 2012. Real‐time evolution of new genes by innovation, amplification, and divergence. Science 338:384–387.
Newcomb RD, Campbell PM, Ollis DL, et al. 1997. A single amino acid substitution converts a carboxylesterase to an organophosphorus hydrolase and confers insecticide resistance on a blowfly. Proc Natl Acad Sci USA 94:7464–7468.
Noda‐Garcia L, Barona‐Gomez F. 2013. Enzyme evolution beyond gene duplication: a model for incorporating horizontal gene transfer. Mob Genet Elements 3:e26439.
Noor S, Taylor MC, Russell RJ, et al. 2012. Intramolecular epistasis and the evolution of a new enzymatic function. PLoS ONE 7:e39822.
Novais A, Comas I, Baquero F, et al. 2010. Evolutionary trajectories of beta‐lactamase CTX‐M‐1 cluster enzymes: predicting antibiotic resistance. PLoS Pathog 6:e1000735.
Ortlund EA, Bridgham JT, Redinbo MR, Thornton JW. 2007. Crystal structure of an ancient protein: evolution by conformational epistasis. Science 317:1544–1548.
Ostermeier M, Nixon AE, Benkovic SJ. 1999. Incremental truncation as a strategy in the engineering of novel biocatalysts. Bioorg Med Chem 7:2139–2144.
Papp B, Notebaart RA, Pal C. 2011. Systems‐biology approaches for predicting genomic evolution. Nat Rev Genet 12:591–602.
Park HS, Nam SH, Lee JK, et al. 2006. Design and evolution of new catalytic activity with an existing protein scaffold. Science 311:535–538.
Patrick WM, Matsumura I. 2008. A study in molecular contingency: glutamine phosphoribosylpyrophosphate amidotransferase is a promiscuous and evolvable phosphoribosylanthranilate isomerase. J Mol Biol 377:323–336.
Peisajovich SG, Tawfik DS. 2007. Protein engineers turned evolutionists. Nat Methods 4:991–994.
Perfeito L, Ghozzi S, Berg J, Schnetz K, Lassig M. 2011. Nonlinear fitness landscape of a molecular pathway. PLoS Genet 7:e1002160.
Phillips PC. 2008. Epistasis – the essential role of gene interactions in the structure and evolution of genetic systems. Nat Rev Genet 9:855–867.
Poelwijk FJ, Kiviet DJ, Weinreich DM, Tans SJ. 2007. Empirical fitness landscapes reveal accessible evolutionary paths. Nature 445:383–386.
Qasim MA, Lu W, Lu SM, et al. 2003. Testing of the additivity‐based protein sequence to reactivity algorithm. Biochemistry 42:6460–6466.


Ramos JL, Gonzalez‐Perez MM, Caballero A, van Dillewijn P. 2005. Bioremediation of polynitrated aromatic compounds: plants and microbes put up a fight. Curr Opin Biotechnol 16:275–281.
Rockah‐Shmuel L, Tawfik DS. 2012. Evolutionary transitions to new DNA methyltransferases through target site expansion and shrinkage. Nucleic Acids Res 40:11627–11637.
Romero PA, Arnold FH. 2009. Exploring protein fitness landscapes by directed evolution. Nat Rev Mol Cell Biol 10:866–876.
Roodveldt C, Aharoni A, Tawfik DS. 2005. Directed evolution of proteins for heterologous expression and stability. Curr Opin Struct Biol 15:50–56.
Rothlisberger D, Khersonsky O, Wollacott AM, et al. 2008. Kemp elimination catalysts by computational enzyme design. Nature 453:190–195.
Salverda ML, De Visser JA, Barlow M. 2010. Natural evolution of TEM‐1 beta‐lactamase: experimental reconstruction and clinical relevance. FEMS Microbiol Rev 34:1015–1036.
Salverda ML, Dellus E, Gorter FA, et al. 2011. Initial mutations direct alternative pathways of protein evolution. PLoS Genet 7:e1001321.
Sanchez‐Ruiz JM. 2010. Protein kinetic stability. Biophys Chem 148:1–15.
Sanjuan R, Elena SF. 2006. Epistasis correlates to genomic complexity. Proc Natl Acad Sci USA 103:14402–14405.
Savile CK, Janey JM, Mundorff EC, et al. 2010. Biocatalytic asymmetric synthesis of chiral amines from ketones applied to sitagliptin manufacture. Science 329:305–309.
Schenk MF, Szendro IG, Krug J, de Visser JA. 2012. Quantifying the adaptive potential of an antibiotic resistance enzyme. PLoS Genet 8:e1002783.
Schild SE, Buskirk SJ, Frick LM, Cupps RE. 1991. Radiotherapy for large symptomatic hemangiomas. Int J Radiat Oncol Biol Phys 21:729–735.
Segre D, Deluna A, Church GM, Kishony R. 2005. Modular epistasis in yeast metabolism. Nat Genet 37:77–83.
Seibert CM, Raushel FM. 2005. Structural and catalytic diversity within the amidohydrolase superfamily. Biochemistry 44:6383–6391.
Sieber V, Martinez CA, Arnold FH. 2001. Libraries of hybrid proteins from distantly related sequences. Nat Biotechnol 19:456–460.
Siegel JB, Zanghellini A, Lovick HM, et al. 2010. Computational design of an enzyme catalyst for a stereoselective bimolecular Diels–Alder reaction. Science 329:309–313.
Smith JM. 1970. Natural selection and the concept of a protein space. Nature 225:563–564.
Socha RD, Tokuriki N. 2013. Modulating protein stability – directed evolution strategies for improved protein function. FEBS J 280:5582–5595.
Soo VW, Hanson‐Manful P, Patrick WM. 2011. Artificial gene amplification reveals an abundance of promiscuous resistance determinants in Escherichia coli. Proc Natl Acad Sci USA 108:1484–1489.

Soskine M, Tawfik DS. 2010. Mutational effects and the evolution of new protein functions. Nat Rev Genet 11:572–582.
Stebbins J. 1944. The law of diminishing returns. Science 99:267–271.
Sun S, Zhang W, Mannervik B, Andersson DI. 2013. Evolution of broad spectrum beta‐lactam resistance in an engineered metallo‐beta‐lactamase. J Biol Chem 288:2314–2324.
Taylor JS, Raes J. 2004. Duplication and divergence: the evolution of new genes and old ideas. Annu Rev Genet 38:615–643.
Tokuriki N, Tawfik DS. 2009a. Protein dynamism and evolvability. Science 324:203–207.
Tokuriki N, Tawfik DS. 2009b. Stability effects of mutations and protein evolvability. Curr Opin Struct Biol 19:596–604.
Tokuriki N, Stricher F, Schymkowitz J, Serrano L, Tawfik DS. 2007. The stability effects of protein mutations appear to be universally distributed. J Mol Biol 369:1318–1332.
Tokuriki N, Stricher F, Serrano L, Tawfik DS. 2008. How protein stability and new functions trade off. PLoS Comput Biol 4:e1000002.
Tokuriki N, Jackson CJ, Afriat‐Jurnou L, et al. 2012. Diminishing returns and tradeoffs constrain the laboratory optimization of an enzyme. Nat Commun 3:1257.
Tomatis PE, Fabiane SM, Simona F, et al. 2008. Adaptive protein evolution grants organismal fitness by improving catalysis and flexibility. Proc Natl Acad Sci USA 105:20605–20610.
Toprak E, Veres A, Michel JB, et al. 2011. Evolutionary paths to antibiotic resistance under dynamically sustained drug selection. Nat Genet 44:101–105.
Tracewell CA, Arnold FH. 2009. Directed enzyme evolution: climbing fitness peaks one amino acid at a time. Curr Opin Chem Biol 13:3–9.
Traxlmayr MW, Obinger C. 2012. Directed evolution of proteins for increased stability and expression using yeast display. Arch Biochem Biophys 526:174–180.
van Loo B, Jonas S, Babtie AC, et al. 2010. An efficient, multiply promiscuous hydrolase in the alkaline phosphatase superfamily. Proc Natl Acad Sci USA 107:2740–2745.
Varadarajan N, Gam J, Olsen MJ, Georgiou G, Iverson BL. 2005. Engineering of protease variants exhibiting high catalytic activity and exquisite substrate selectivity. Proc Natl Acad Sci USA 102:6855–6860.

Vick JE, Gerlt JA. 2007. Evolutionary potential of (beta/alpha)8‐barrels: stepwise evolution of a “new” reaction in the enolase superfamily. Biochemistry 46:14589–14597.
Voordeckers K, Brown CA, Vanneste K, et al. 2012. Reconstruction of ancestral metabolic enzymes reveals molecular mechanisms underlying evolutionary innovation through gene duplication. PLoS Biol 10:e1001446.
Wackett LP. 2009. Questioning our perceptions about evolution of biodegradative enzymes. Curr Opin Microbiol 12:244–251.
Wagner A. 2008. Neutralism and selectionism: a network‐based reconciliation. Nat Rev Genet 9:965–974.
Walkiewicz K, Benitez Cardenas AS, Sun C, et al. 2012. Small changes in enzyme function can lead to surprisingly large fitness effects during adaptive evolution of antibiotic resistance. Proc Natl Acad Sci USA 109:21408–21413.
Wang X, Minasov G, Shoichet BK. 2002. Evolution of an antibiotic resistance enzyme constrained by stability and activity trade‐offs. J Mol Biol 320:85–95.
Weinreich DM, Watson RA, Chao L. 2005. Perspective: sign epistasis and genetic constraint on evolutionary trajectories. Evolution 59:1165–1174.
Weinreich DM, Delaney NF, Depristo MA, Hartl DL. 2006. Darwinian evolution can follow only very few mutational paths to fitter proteins. Science 312:111–114.
Wells JA. 1990. Additivity of mutational effects in proteins. Biochemistry 29:8509–8517.
Weng JK. 2013. The evolutionary paths towards complexity: a metabolic perspective. New Phytol 201:1141–1149.
Weng JK, Philippe RN, Noel JP. 2012. The rise of chemodiversity in plants. Science 336:1667–1670.
Wright S. 1932. The roles of mutation, inbreeding, crossbreeding and selection in evolution. Proceedings of The Sixth International Congress of Genetics 1:356–366.
Wyganowski KT, Kaltenbach M, Tokuriki N. 2013. GroEL/ES buffering and compensatory mutations promote protein evolution by stabilizing folding intermediates. J Mol Biol 425:3403–3414.
Zhao H, Giver L, Shao Z, Affholter JA, Arnold FH. 1998. Molecular evolution by staggered extension process (StEP) in vitro recombination. Nat Biotechnol 16:258–261.

