CHEMBIOCHEM MINIREVIEWS DOI: 10.1002/cbic.201402159

Biological Applications of Expanded Genetic Codes

Xiang Li[b] and Chang C. Liu*[a]

Substantial efforts in the past decade have resulted in the systematic expansion of genetic codes, allowing for the direct ribosomal incorporation of ~100 unnatural amino acids into bacteria, yeast, mammalian cells, and animals. Here, we illustrate the versatility of expanded genetic codes in biology and bioengineering, focusing on the application of expanded genetic codes to problems in protein, cell, synthetic, and experimental evolutionary biology. As the expanded genetic code field continues to develop, its place as a foundational technology in the whole of biological sciences will solidify.

Introduction

With few exceptions, nature uses a 20 amino acid genetic code for protein synthesis. While the question “why 20?” remains largely unanswered and perhaps unanswerable,[1] the related question “is it possible to go beyond 20?” has been successfully addressed with the synthetic expansion of genetic codes for the direct translational incorporation of custom unnatural amino acids (uAAs).[2] To date, ~100 uAAs, representing physicochemical properties not found in the natural 20 amino acid repertoire, have been added to the genetic codes of bacteria,[3] yeast,[4] mammalian cells,[5] and animals.[6]

The synthetic addition of uAAs to the genetic code follows a systematic method, reviewed in detail elsewhere,[2] that has become streamlined and highly accessible in the last few years. First, one hijacks a “blank” codon (codonBL), usually the amber stop codon, UAG, to specify the desired uAA; second, one adopts an engineered orthogonal aminoacyl-tRNA synthetase (aaRS)/tRNA pair that decodes the codonBL and does not cross-react with host aaRS/tRNA pairs; and third, one evolves the orthogonal aaRS/tRNA pair, by using a well-established positive/negative selection system, toward specificity for the uAA of interest. After this process, the resulting aaRS/tRNA pair achieves codonBL-templated incorporation of the desired uAA in a manner identical to the genetic specification of natural amino acids, thus seamlessly expanding the genetic code of the host organism. This ability to incorporate custom uAAs into proteins in vivo represents an unprecedented level of chemical and synthetic control over biology’s workhorse macromolecule. Expanded genetic codes, therefore, facilitate a versatile array of applications in the biological sciences.

In this review, we discuss four application areas for expanded genetic codes in biology. First among these is protein biology, where the primary challenge is often the site-specific introduction of new chemistry, ranging from biophysical probes to biologically important posttranslational modifications, into proteins. Though chemical modification, solid-phase peptide synthesis, native chemical ligation, and expressed protein ligation have made significant strides toward the production of specialized proteins, expanded genetic codes yield site-specifically modified proteins simply through recombinant expression. They therefore offer a degree of efficiency and ease unavailable in other methods. Second among application areas is cell biology. Because expanded genetic codes access new protein chemistries in vivo, they can modify and probe cellular processes under native conditions, with minimal disruption to the living cell. Third, we argue that expanded genetic codes act both as motivation and as an enabling tool for synthetic biology. This is because uAA incorporation, especially the simultaneous addition of multiple uAAs in response to multiple codonsBL, has challenged the synthetic biology community to dramatically redesign organisms through systematic ribosome engineering,[7] genome-wide codon replacement,[8] and large-scale modification of codon usage.[9] At the same time, expanded genetic codes have been repurposed for gene regulation by uAA inputs in a way that can achieve forward engineering of custom gene regulatory functions.[10] Finally, we discuss the promising area of evolution with uAAs, where new chemistries enter into the processes of mutation, amplification, and functional selection.[11] This range of applications outlines the far-reaching implications and utility that expanded genetic codes have for the whole of biology.
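The three-step logic described above (a blank codon, an orthogonal aaRS/tRNA pair, and specificity for one uAA) can be illustrated with a toy translation routine. This is only a minimal sketch under simplifying assumptions: the function name `translate` and the parameter `suppressor_uaa` are hypothetical, the orthogonal pair is reduced to a single optional label, and competition with release factors, incorporation efficiency, and aaRS specificity are ignored.

```python
from itertools import product

# Standard genetic code (RNA codons), built in the conventional U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

AMBER = "UAG"  # the "blank" codon most often reassigned to a uAA


def translate(mrna, suppressor_uaa=None):
    """Translate an mRNA from its first AUG.

    If suppressor_uaa is given (hypothetical stand-in for an orthogonal
    aaRS/tRNA pair charged with that uAA), UAG is decoded as the uAA;
    otherwise UAG terminates translation as in the natural code.
    """
    start = mrna.find("AUG")
    if start < 0:
        return []
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == AMBER and suppressor_uaa is not None:
            peptide.append(suppressor_uaa)  # blank codon read as the uAA
            continue
        residue = STANDARD_CODE[codon]
        if residue == "*":  # UAA, UGA, or unsuppressed UAG
            break
        peptide.append(residue)
    return peptide


if __name__ == "__main__":
    message = "AUGGCUUAGAAAUAA"  # Met-Ala-(UAG)-Lys-stop
    print(translate(message))                         # natural code: truncated at UAG
    print(translate(message, suppressor_uaa="pAcF"))  # expanded code: full length
```

In this sketch the same message yields a truncated product in the natural code and a full-length, uAA-containing product in the expanded code, which is also the basis of the read-through selections used to evolve aaRS specificity.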

[a] Prof. Dr. C. C. Liu, Departments of Biomedical Engineering and Chemistry, University of California at Irvine, 3105 Natural Sciences II, Irvine, CA 92697 (USA), E-mail: [email protected]
[b] X. Li, Department of Biomedical Engineering, University of California at Irvine, 3120 Natural Sciences II, Irvine, CA 92697 (USA)

© 2014 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim

Protein Biology

The modern study of proteins relies on two fundamental technologies: mutagenesis and recombinant expression. The former allows researchers to change, through simple manipulations of the corresponding genetic material, amino acid sequences of natural proteins; the latter allows for their efficient production and purification. However powerful these technologies are, the degree to which they facilitate the study and modification of protein structure and function is limited by the chemistries available in the natural 20 amino acids. With expanded genetic codes, this limitation is removed.

Therapeutic proteins through bioconjugation

One clear area that realizes the resulting potential of expanded genetic codes is the production of therapeutic proteins through bioorthogonal conjugation. Here, the incorporation of uAAs with reactive functional groups becomes the critical first step in the custom site-specific derivatization of proteins for therapeutic purposes. For example, Cho et al. reported the recombinant expression of human growth hormone (hGH) containing a site-specifically incorporated para-acetylphenylalanine (pAcF), which served as a chemical handle for conjugation to poly(ethylene glycol) (PEG).[12] The resulting homogeneously mono-PEGylated hGH demonstrated favorable pharmacodynamics and is being developed clinically. Several other ongoing efforts in this area might achieve similar therapeutic success. These include the conjugation of small molecules onto antibodies to recruit T cells to cancer markers,[13] the attachment of charged polymers onto antibodies to promote siRNA delivery into target cells,[14] and the site-specific conjugation of oligonucleotides, facilitating the generation of antibody dimers and multimers.[15] In all of these cases, conjugation is facilitated by a site-specifically incorporated bioorthogonal uAA and therefore yields homogeneous therapeutic proteins, rather than mixtures that are less potent, more sensitive to production conditions, and more difficult to bring through regulatory approval.

Addressing post-translationally modified proteins

Another area that highlights the potential of mutagenesis and recombinant expression with expanded genetic codes is the study of posttranslationally modified proteins.
Posttranslational modifications (PTMs) impart an additional layer of control over protein function beyond expression, often playing central roles in binding, catalysis, signaling, and epigenetics. However, the traditional study of posttranslationally modified proteins faces major challenges, because the fundamental methods of mutagenesis and recombinant expression upon which protein biology relies do not easily address PTMs. These challenges are resolved by expanding genetic codes for the translational incorporation of uAAs corresponding to PTMs.

The first complete demonstration of this idea came through a number of studies on tyrosine sulfation. Sulfation is a ubiquitous PTM found in higher eukaryotes, whereby a sulfate group is added to tyrosine, functioning to strengthen protein–protein interactions involved in processes ranging from coagulation to chemokine signaling.[16] The study of sulfation, however, is limited by organismal constraints (sulfation does not occur in the common lab microbes Escherichia coli or Saccharomyces cerevisiae), homogeneity constraints (the enzymes responsible for sulfation are expressed only in certain parts of the cell), and sequence constraints (only specific amino acid motifs can be sulfated). Expanding the genetic code of E. coli for the direct incorporation of sulfotyrosine has removed these constraints.[17] For example, the leech protein sulfohirudin is a therapeutically important anticoagulant but is clinically used only in a lower-activity unsulfated recombinant form (desulfohirudin), because extracting sulfohirudin from leeches is highly inefficient and yields various isoforms. E. coli bacteria with their genetic codes expanded for sulfotyrosine incorporation allowed the production of sulfohirudin recombinantly, in yields viable for therapeutic development.[17] This recombinant sulfohirudin, which displays a tenfold enhancement in affinity for thrombin compared to desulfohirudin, was also used to solve the crystal structure of the sulfohirudin–thrombin complex, which shows the structural contribution of tyrosine sulfate to the activity of sulfohirudin.[18] In other instances, studies could be conducted on sulfated proteins that required mutagenesis beyond the sequence motifs normally necessary for sulfation.[11a, 19] Taken together, these experiments illuminate new paths for the study and application of sulfated proteins and, more generally, outline a set of principles for addressing PTMs by using expanded genetic codes.

These principles extend to PTMs beyond tyrosine sulfation, including the important class of lysine PTMs central to eukaryotic biology and histone-based epigenetics. For example, Neumann et al. expanded the genetic code of E. coli for the direct incorporation of acetyllysine, providing a recombinant expression system for site-specifically acetylated proteins.[20] This system has been used to reveal the role of Lys120 acetylation in modulating p53’s DNA-binding specificity,[21] to kinetically and structurally characterize the regulatory roles of cyclophilin A acetylation in immunosuppression and HIV isomerization,[22] to dissect the effects of histone H3 Lys56 acetylation in nucleosome organization,[23] and to understand the function of histone H4 Lys16 acetylation in gene silencing by interaction with Sir proteins.[24] Several efforts have also aimed to expand genetic codes for the site-specific incorporation of methyllysines by incorporating a protected methyllysine precursor that can be deprotected with light,[25] pH,[26] or transition metal chemistries.[27] In model studies, these expanded genetic codes have been used to test the effects of monomethylation or dimethylation of histone H3 Lys9 in interactions with a known partner, heterochromatin protein 1.[26, 28] Finally, expanded genetic codes have been developed to address ubiquitination, another important lysine modification, through the site-specific incorporation of 1,2-aminothiols that can be ligated to ubiquitin.[29] A variation of this approach that employs site-specifically protected lysines, introduced as uAAs, allowed for the synthesis of an atypical diubiquitin, elucidation of its structure, and the discovery that TRABID has preferential deubiquitinase activity on the atypical Lys29-linked ubiquitin.[30] This growing set of genetically encoded lysine modifications might contribute to the larger goal of understanding, and possibly controlling, epigenetic states.

A valuable feature of genetically encoded lysine modifications is that they predominantly rely on engineered variants of aaRS/tRNA pairs specific for pyrrolysine (PylRS/tRNAPyl). As the parent PylRS/tRNAPyl pairs[31] have been shown to be compatible with a wide range of hosts, uAAs incorporated through engineered variants automatically function in both E. coli and mammalian cells.[32] Therefore, expanded genetic codes for acetyllysine and variously caged methyllysines can be directly imported into mammalian cells, providing a platform for the study of lysine modification on proteins in vivo. Indeed, the use of expanded genetic codes for recombinant expression and mutagenesis with a growing list of PTMs, including the phosphoserine modification critical to kinase signaling,[33] has become a mainstay of protein biology.

Cell Biology

The development of genetically encoded green fluorescent proteins (GFPs) revolutionized biology by providing a way to image specific proteins non-invasively within living cells and tissues. Since then, a major thrust in cell biology has been the elaboration of new genetically encodable chemistries for observing, perturbing, and controlling proteins in vivo. Toward this end, expanded genetic codes that site-specifically incorporate uAAs acting as small, sensitive fluorophores, context-dependent IR, PET, or NMR probes, and photoactive switches for optogenetics have been developed.[2] This has led to a large suite of applications for uAAs in cell biology.

Figure 1. Light-activated control of biochemical pathways. By using an expanded genetic code, photocaged serines and photocaged lysines are incorporated into target proteins. Under standard conditions, these proteins remain biologically inactive. UV activation of photocaged proteins allows the controlled study of various biological pathways.

Light-activated crosslinking and control

Among the most widely applied uAAs in cell biology are ones containing benzophenone, azides, and diazirines.[34] These uAAs are incorporated site-specifically into proteins in order to trap, through UV-triggered covalent crosslinking, specific protein–protein interactions and protein conformational changes. Numerous examples of this use are available and include studies on chaperone interactions with substrates,[35] protein interactions that regulate the cell cycle,[36] nascent peptide recognition inside the ribosome tunnel,[37] spontaneous conformational changes in RNA polymerase,[38] and protein interactions at various intracellular and cell membrane locations in E. coli, yeast, and mammalian cells.[34f, 39] In many of these examples, crosslinking was triggered in vivo, thus favoring the isolation of native intra- or intermolecular interactions under their most biologically relevant conditions.

Photoactive uAAs can also be used to modulate protein function directly inside cells in real time (Figure 1). For example, Lemke et al. added 4,5-dimethoxy-2-nitrobenzylserine, a photocaged serine, to the genetic code of Saccharomyces cerevisiae and incorporated this uAA into various serine phosphorylation sites in the transcription factor Pho4.[40] Upon decaging through laser irradiation, serines were revealed, and the kinetics of their subsequent phosphorylation and resulting localization were studied. More recently, Gautier et al. encoded a photocaged lysine in mammalian cells and incorporated it into a nuclear localization signal to mask its endogenous activity.[41] Using temporally controlled photolysis of the masked nuclear localization tag, the authors were able to observe the kinetics of nuclear import both for a model system and for the tumor suppressor p53. The same caged lysine was applied to create photoactivatable kinases.[42] This was demonstrated in a number of experiments on the MAPK pathway, in which the active site lysine of the MAPK kinase MEK1 was replaced by the photocaged lysine, disabling MEK1’s catalytic activity. The resulting photoactivatable MEK1 was used to reveal several elements of the MAPK pathway, including the kinetics of phosphorylation-dependent ERK2 nuclear import. Other examples involving photoactive uAAs include the creation of a photoactivatable GFP,[43] site-specific photocleavage of protein backbones,[44] and light-activated transcription through a photoactivatable RNA polymerase.[45] In short, genetically encoded photoactive uAAs constitute an important set of optogenetic tools for the spatiotemporal control of cellular processes.

Fluorescent reporters

Complementary to photoreactive uAAs are ones that facilitate observation and imaging of cellular processes. Though this task has traditionally been the domain of GFP and its variants, genetically encoded fluorescent uAAs might offer several advantages over GFP. For example, fluorescent uAAs, owing to their small size, could be used to label proteins with minimal structural perturbation and might report on more intricate biochemical and biophysical events, including protein conformational and local environmental changes. In addition, a wide range of fluorescence properties might be accessed through uAAs, as there seems to be no hard limit on the types of small fluorophores that can be genetically encoded as uAAs.

Several groups have begun to capitalize on the opportunities afforded by genetically encoded fluorescent uAAs. For example, an environmentally sensitive fluorescent probe, 3-(6-acetylnaphthalen-2-ylamino)-2-aminopropanoic acid (Anap), has been added to the genetic codes of yeast and mammalian cells and used both to detect conformational changes that occur during ligand binding by QBP (glutamine binding protein) and to visualize the subcellular localization of proteins in living cells.[46] Others have added a coumarin-containing fluorescent uAA (CouAA) to the genetic code of E. coli and have used it to visualize contractile ring formation by the protein FtsZ during cytokinesis.[47] In the case of FtsZ, and likely several other cytoskeletal proteins, fusion to GFP impairs cellular function, but incorporation of the small CouAA into FtsZ did not disturb its wild-type function and allowed visualization of FtsZ subcellular localization. Yet another genetically encoded fluorescent uAA for studying yeast and mammalian cell biology is 2-amino-3-(5-(dimethylamino)naphthalene-1-sulfonamido)propanoic acid (DanAla).[48] In an impressive study, Shen et al. demonstrated the genetic incorporation of DanAla into proteins expressed in neural stem cells that could subsequently differentiate into neural progenies.[49] Specifically, DanAla was incorporated into a voltage-sensitive domain (VSD) taken from a voltage-dependent membrane lipid phosphatase that undergoes conformational changes upon membrane polarization in neurons. As the fluorescence of DanAla is highly sensitive to local polarity, DanAla acted as an optical reporter for conformational changes in VSD during membrane depolarization. The continued development of genetically encoded fluorescent amino acids for observing cellular processes in real time, especially at the resolution of conformational changes in single proteins, will lead to many new advances in cell biology.

Synthetic Biology

Synthetic biology applies the design principles of part modularity, orthogonality, independence, scalability, and portability to the composition of new biological functions.[50] Expanded genetic codes, which are achieved by engineering modular, orthogonal, and portable uAA-specific codonBL/aaRS/tRNA sets that act independently in the context of natural codon/aaRS/tRNA sets, rely on similar principles. Therefore, there has been a dynamic exchange of ideas between the two fields, in which synthetic biology principles have been applied to expanded genetic codes and vice versa.

A perfect example of the former is represented by efforts to simultaneously encode multiple uAAs. This requires multiple independent codonsBL and sets of mutually orthogonal aaRS/tRNA pairs and might also benefit from engineered ribosomes that accommodate new codonBL/tRNA recognition modes (Figure 2). Each of these areas has been addressed with scalable synthetic biology strategies. For example, a number of labs have pursued the use of triplet codonsBL beyond UAG, as well as four-base codonsBL, to specify multiple uAAs, even using genome-wide engineering or synthesis efforts to remove codon redundancies from the natural genetic code.[8, 9] Chin et al. have added to these efforts by developing orthogonal ribosomes that recognize mRNAs with artificial ribosome binding sites (RBSs),[51] subsequently engineering one of these ribosomes to better accommodate four-base codonsBL and their cognate tRNAs.[7b] This allowed the efficient expression of proteins containing two different uAAs, incorporated by two mutually orthogonal aaRS/tRNA pairs, one of which recognizes a four-base codonBL.

Figure 2. Synthetic expanded genetic code systems. Evolved orthogonal ribosomes with high specificity for tRNAs that recognize three- or four-base codonsBL increase the efficiency of incorporating multiple uAAs.[7] A wide range of mutually orthogonal aaRS/tRNA pairs have been engineered to incorporate specific uAAs at amber (box) and quadruplet (dashed box) codons.[53, 54] Genome-wide removal of the amber stop codon and the proposed removal of 13 redundant natural codons (e.g., circled), along with their respective tRNAs, will provide multiple codonsBL for highly efficient uAA incorporation.[8, 9]

It might also be possible to borrow sense codons in specific contexts for uAA incorporation. For example, Bröcker et al. showed that the UGA codon, which specifies selenocysteine in a wide variety of organisms when present in a particular mRNA stem–loop structure, can be reassigned to 58 of the 64 possible codons.[52] Their work suggests the feasibility of reassigning sense codons for uAAs beyond selenocysteine.

In addition to new codonsBL, the translational incorporation of multiple uAAs requires the establishment of mutually orthogonal aaRS/tRNA part sets, members of which can then be independently engineered to recognize distinct uAAs. This has been achieved either through finding disparate aaRS/tRNA pairs that naturally display mutual orthogonality[53] or through more general methods for engineering mutually orthogonal combinations of aaRS/tRNA pairs from parent pairs.[54] Taken together, these efforts represent both a significant example and a goal of applied synthetic biology, as they involve the creation of orthogonal part sets and their assembly into custom genetic codes that might eventually lead to the specification of synthetic polypeptides containing multiple or lone uAAs in living systems.

The reverse direction, namely the application of expanded genetic codes to synthetic biology, has also been productive. For example, a major goal in synthetic biology is the predictable engineering of genetic switches that respond to user-defined inputs; these should facilitate the construction of custom cellular behaviors and allow synthetic and systems biologists to probe the principles of regulatory systems.[55] Towards this end, Liu, Arkin et al. proposed the application of expanded genetic codes as a regulatory framework. This is based on the idea that codonsBL can be viewed as highly modular sensors for uAA inputs that the ribosome predictably integrates into the translation of a gene. Particular arrangements of codonsBL form different instructions for translational elongation that achieve the expression of output genes as a function of uAA inputs. To implement this framework, Liu, Arkin et al. have engineered leader peptide elements, which are bacterial mechanisms that couple translation of a short upstream open reading frame to transcription of regulated genes, into dedicated genetic switches that use codonsBL to specify custom regulatory functions.[10] These switches have been programmed into numerous uAA-induced ON and OFF behaviors, multi-input logic gates, and most recently, into higher-order regulatory functional forms with tunable sigmoidality and other complex properties (C. C. Liu, M. S. Samoilov, A. P. Arkin, unpublished results). We believe that the success of these experiments will provide motivation to further explore the interface between expanded genetic codes and synthetic biology, both of which are rooted in a shared set of fundamental engineering principles.
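The view of codonsBL as modular sensors for uAA inputs can be made concrete with a Boolean toy model. The sketch below is an illustration only and does not reproduce the published switch designs: the names `reads_through` and `switch_output`, the `invert` flag standing in for the polarity of the leader peptide coupling, the example leader, and the assignment of UAG and the four-base codon AGGA to two uAAs are all assumptions made for the example.

```python
from typing import Iterable, Mapping, Set

# Hypothetical assignment of two blank codons to two distinct uAAs.
BLANK_CODONS = {"UAG": "uAA1", "AGGA": "uAA2"}

# Hypothetical leader peptide ORF, already split into codons
# (triplets plus one four-base codon).
LEADER = ["AUG", "UAG", "GCU", "AGGA", "AAA"]


def reads_through(leader_codons: Iterable[str],
                  blank_codon_to_uaa: Mapping[str, str],
                  supplied_uaas: Set[str]) -> bool:
    """Return True if the ribosome can decode every blank codon in the leader.

    A blank codon is decodable only when its cognate uAA has been supplied;
    ordinary sense codons are assumed always decodable.
    """
    for codon in leader_codons:
        required_uaa = blank_codon_to_uaa.get(codon)
        if required_uaa is not None and required_uaa not in supplied_uaas:
            return False  # elongation halts at the undecoded blank codon
    return True


def switch_output(leader_codons: Iterable[str],
                  blank_codon_to_uaa: Mapping[str, str],
                  supplied_uaas: Set[str],
                  invert: bool = False) -> bool:
    """Map leader read-through onto expression of the regulated gene.

    invert=False models an ON switch (output requires read-through);
    invert=True models an OFF switch (read-through shuts the output off).
    """
    completed = reads_through(leader_codons, blank_codon_to_uaa, supplied_uaas)
    return (not completed) if invert else completed


if __name__ == "__main__":
    # Truth table for a two-input gate built from two different blank codons.
    for inputs in (set(), {"uAA1"}, {"uAA2"}, {"uAA1", "uAA2"}):
        print(sorted(inputs), switch_output(LEADER, BLANK_CODONS, inputs))
```

Placing two different codonsBL in one leader gives AND-like behavior in this model, while the `invert` flag turns the same arrangement into an OFF switch; graded responses such as tunable sigmoidality would need a quantitative model rather than Booleans.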

Laboratory Evolution

The previous sections discuss the rational addition of uAAs into proteins and biological systems. An alternative and tantalizing possibility for genetically encoded uAAs is their integration into evolution experiments, wherein the processes of mutation, amplification, and selection determine how uAAs might contribute to new functions beyond those that can be engineered rationally (Figure 3). Several steps have been taken in this direction. For example, Liu, Schultz et al. have developed a robust phage-based protein evolution system in which displayed protein libraries are generated in an expanded genetic code E. coli host that encodes 21 amino acids.[11a] This system was used to demonstrate that expanded genetic codes can confer a selective advantage in protein evolution. In the process, several proteins whose functions depend on the novel chemistries of uAAs were evolved, including anti-gp120 antibodies that utilize sulfotyrosine for binding,[11a, 56] antibodies against a model sugar that use a boronate moiety to interact with diols,[57] zinc fingers that rely on iron instead of zinc for structure,[58] and metal-binding peptides that use uAAs with metal-chelating side chains.[59] Ongoing studies include incorporating “chemical warheads” in the evolution of protease inhibitors, sulfotyrosine in the evolution of protein–protein interfaces, and metal-chelating uAAs in the evolution of new catalytic centers.

One may go beyond these directed evolution experiments and simply ask, in an open-ended manner, whether uAAs can contribute to adaptation. Hammerling et al. studied exactly this idea by passaging lines of T7 bacteriophage on an E. coli host with an expanded genetic code that incorporates 3-iodotyrosine (IodoY) at the amber stop codon.[11b] After numerous serial transfers of the lytic phage, the frequency of mutants specifying IodoY in certain genes reached high levels. In one case, the IodoY mutation was beneficial: the Tyr39-to-IodoY mutation in T7 type II holin increased phage production over the wild-type version of the gene. The finding that phage adapt to expanded genetic code E. coli hosts by functionally using a uAA in their proteins suggests that uAAs might have many adaptive functions yet to be found. This, and the larger implication that expanded genetic codes might be evolutionarily advantageous, are exciting avenues for experimental evolutionary biology.

Limitations and Alternative Methods

Expanded genetic codes are, in theory, the ideal strategy for adding new chemistries into proteins, as they achieve precise genetic specification over the location of any number of uAAs. However, in practice, expanded genetic codes are limited by uAA incorporation efficiencies. This is especially problematic when multiple uAAs need to be incorporated into a single protein. Although these inefficiency issues are quickly being addressed,[7a, 8b, 60, 61] an alternative method called selective pressure incorporation (SPI) can achieve highly efficient uAA incorporation in cases where proteome-wide replacement of a natural amino acid is tolerable. Since the early success of complete replacement of methionine by selenomethionine in E. coli,[62] SPI has been used to incorporate a large range of uAAs into proteins. This is accomplished by generating a strain that is auxotrophic for a given natural amino acid, denying the strain that natural amino acid, and instead feeding it a uAA analogue. The uAA is then incorporated by the aaRS/tRNA pair for the natural amino acid counterpart at the corresponding sense codons.[63] Although the large-scale perturbation of the entire proteome resulting from this method often means that one can only sustain uAA incorporation for a short period of time, several applications are still possible, and indeed, some capitalize on proteome-wide incorporation. For example, the incorporation of natural amino acid analogues containing heavy atoms, such as telluromethionine[64–66] and β-selenolo[3,2-b]pyrrolylalanine,[67] during protein overexpression in E. coli strains that are auxotrophic for methionine and tryptophan, respectively, can facilitate anomalous dispersion experiments to simplify X-ray crystallography. Another example is the purification and identification of 195 newly synthesized proteins in human embryonic kidney (HEK293) cells by bio-orthogonal labeling of uAAs incorporated proteome-wide, following the introduction of the uAA and removal of the corresponding natural amino acid.[68] Finally, SPI, like expanded genetic codes, can also be used to incorporate multiple different uAAs. For example, a polyauxotrophic strain was used to incorporate three different uAAs into a protein.[69] Ultimately, however, we expect that expanded genetic codes will be favored, especially for in vivo applications, due to the genetic precision of uAA incorporation that expanded genetic codes achieve.

Figure 3. Laboratory evolution with expanded genetic codes. Proteins selected from M13 phage display libraries exhibit unique biological activities and functions resulting from uAAs.[11a, 56–59] Likewise, the T7 bacteriophage has been adapted to expanded genetic codes after many generations of experimental evolution.[11b]
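The practical difference between SPI and an expanded genetic code is one of scope: SPI swaps every occurrence of one natural amino acid across the whole proteome, whereas a codonBL places a uAA at one chosen position in one protein. The following minimal sketch contrasts the two under stated assumptions: the two-protein toy proteome, the function names, and the chosen position are hypothetical, and sequences are treated as simple lists of residue labels.

```python
from typing import Dict, List

# Hypothetical two-protein "proteome" in one-letter code.
PROTEOME = {"target": "MKTAYMW", "bystander": "MGMLS"}


def spi_incorporate(proteome: Dict[str, str],
                    natural: str,
                    analogue: str) -> Dict[str, List[str]]:
    """Residue-specific replacement: every `natural` residue in every protein
    becomes `analogue`, as in an auxotroph fed the analogue instead."""
    return {
        name: [analogue if residue == natural else residue for residue in seq]
        for name, seq in proteome.items()
    }


def site_specific_incorporate(sequence: str, position: int, uaa: str) -> List[str]:
    """Site-specific replacement: a single 0-based position, standing in for a
    blank codon placed at that site, carries the uAA."""
    residues = list(sequence)
    residues[position] = uaa
    return residues


def count_label(residues: List[str], label: str) -> int:
    """Count how many residues carry the given label."""
    return sum(1 for residue in residues if residue == label)


if __name__ == "__main__":
    labelled = spi_incorporate(PROTEOME, natural="M", analogue="SeMet")
    total = sum(count_label(seq, "SeMet") for seq in labelled.values())
    print("SPI substitutions across proteome:", total)

    single = site_specific_incorporate(PROTEOME["target"], 3, "pAcF")
    print("Site-specific substitutions:", count_label(single, "pAcF"))
```

In the toy proteome, SPI labels the bystander protein as heavily as the target, which is useful for applications such as heavy-atom phasing or proteome-wide labeling but is the source of the proteome-wide perturbation noted above.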

Outlook

Through uAA protein mutagenesis, uAA probes of cellular function, the redesign and repurposing of genetic codes, and the incorporation of new chemistries into directed evolution platforms, the relatively young field of expanded genetic codes has already had an enormous impact on the control, design, and understanding of biology. Continued work in these areas and their extension into more organisms, including multicellular animals,[6] will solidify the position of expanded genetic codes as a foundational technology for the whole of the biological sciences.

Acknowledgements We would like to thank Prof. Dr. Peter Schultz for helpful comments and the University of California at Irvine for financial support. Keywords: cell biology · directed evolution · genetic code · protein engineering · synthetic biology [1] a) A. L. Weber, S. L. Miller, J. Mol. Evol. 1981, 17, 273 – 284; b) Y. Lu, S. Freeland, Adv. Genome Biol. 2006, 7, 102; c) K. Vetsigian, C. Woese, N. Goldenfeld, Proc. Natl. Acad. Sci. USA 2006, 103, 10696 – 10701. [2] a) C. C. Liu, P. G. Schultz, Annu. Rev. Biochem. 2010, 79, 413 – 444; b) R. Furter, Protein Sci. 1998, 7, 419 – 426. [3] L. Wang, A. Brock, B. Herberich, P. G. Schultz, Science 2001, 292, 498 – 500. [4] J. W. Chin, T. A. Cropp, J. C. Anderson, M. Mukherji, Z. Zhang, P. G. Schultz, Science 2003, 301, 964 – 967. [5] a) K. Sakamoto, A. Hayashi, A. Sakamoto, D. Kiga, H. Nakayama, A. Soma, T. Kobayashi, M. Kitabatake, K. Takio, K. Saito, M. Shirouzu, I. Hirao, S. Yokoyama, Nucleic Acids Res. 2002, 30, 4692 – 4699; b) W. Liu, A. Brock, S. Chen, S. Chen, P. G. Schultz, Nat. Methods 2007, 4, 239 – 244. [6] a) S. Greiss, J. W. Chin, J. Am. Chem. Soc. 2011, 133, 14196 – 14199; b) A. R. Parrish, X. She, Z. Xiang, I. Coin, Z. Shen, S. P. Briggs, A. Dillin, L. Wang, ACS Chem. Biol. 2012, 7, 1292 – 1302; c) A. Bianco, F. M. Townsley, S. Greiss, K. Lang, J. Chin, Nat. Chem. Biol. 2012, 8, 748 – 750.

 2014 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim

[7] a) K. Wang, H. Neumann, S. Y. Peak-Chew, J. W. Chin, Nat. Biotechnol. 2007, 25, 770 – 777; b) H. Neumann, K. Wang, L. Davis, M. Garcia-Alai, J. W. Chin, Nature 2010, 464, 441 – 444.
[8] a) F. J. Isaacs, P. A. Carr, H. H. Wang, M. J. Lajoie, B. Sterling, L. Kraal, A. C. Tolonen, T. A. Gianoulis, D. B. Goodman, N. B. Reppas, C. J. Emig, D. Bang, S. J. Hwang, M. C. Jewett, J. M. Jacobson, G. M. Church, Science 2011, 333, 348 – 353; b) M. J. Lajoie, A. J. Rovner, D. B. Goodman, H.-R. Aerni, A. D. Haimovich, G. Kuznetsov, J. A. Mercer, H. H. Wang, P. A. Carr, J. A. Mosberg, N. Rohland, P. G. Schultz, J. M. Jacobson, J. Rinehart, G. M. Church, F. J. Isaacs, Science 2013, 342, 357 – 360.
[9] M. J. Lajoie, S. Kosuri, J. A. Mosberg, C. J. Gregg, D. Zhang, G. M. Church, Science 2013, 342, 361 – 363.
[10] C. C. Liu, L. Qi, C. Yanofsky, A. P. Arkin, Nat. Biotechnol. 2011, 29, 164 – 168.
[11] a) C. C. Liu, A. V. Mack, M. Tsao, J. H. Mills, H. Lee, H. Choe, M. Farzan, P. G. Schultz, V. V. Smider, Proc. Natl. Acad. Sci. USA 2008, 105, 17688 – 17693; b) M. J. Hammerling, J. W. Ellefson, D. R. Boutz, E. M. Marcotte, A. D. Ellington, J. E. Barrick, Nat. Chem. Biol. 2014, 10, 178 – 180.
[12] H. Cho, T. Daniel, Y. J. Buechler, D. C. Litzinger, S. Maio, A.-M. H. Putnam, V. S. Kraynov, B.-C. Sim, S. Bussell, T. Javahishvili, S. Kaphle, G. Viramontes, M. Ong, S. Chu, B. GC, R. Lieu, N. Knudsen, P. Castiglioni, T. C. Norman, D. W. Axelrod, A. R. Hoffman, P. G. Schultz, R. D. DiMarchi, B. E. Kimmel, Proc. Natl. Acad. Sci. USA 2011, 108, 9060 – 9065.
[13] C. H. Kim, J. Y. Axup, B. R. Lawson, H. Yun, V. Tardif, S. H. Choi, Q. Zhou, A. Dubrovska, S. L. Biroc, R. Marsden, J. Pinstaff, V. V. Smider, P. G. Schultz, Proc. Natl. Acad. Sci. USA 2013, 110, 17796 – 17801.
[14] H. Lu, D. Wang, S. A. Kazane, T. Javahishvili, F. Tian, F. Song, A. Sellers, B. Barnett, P. G. Schultz, J. Am. Chem. Soc. 2013, 135, 13885 – 13891.
[15] B. M. Hutchins, S. A. Kazane, K. Staflin, J. S. Forsyth, B. Felding-Habermann, P. G. Schultz, V. V. Smider, J. Mol. Biol. 2011, 406, 595 – 603.
[16] Y. Kanan, M. R. Al-Ubaidi, JSM Biotechnol. Bioeng. 2013, 1, 1003.
[17] C. C. Liu, P. G. Schultz, Nat. Biotechnol. 2006, 24, 1436 – 1440.
[18] C. C. Liu, E. Brustad, W. Liu, P. G. Schultz, J. Am. Chem. Soc. 2007, 129, 10648 – 10649.
[19] C. C. Liu, H. Choe, M. Farzan, V. V. Smider, P. G. Schultz, Biochemistry 2009, 48, 8891 – 8898.
[20] H. Neumann, S. Y. Peak-Chew, J. W. Chin, Nat. Chem. Biol. 2008, 4, 232 – 234.
[21] E. Arbely, E. Natan, T. Brandt, M. D. Allen, D. B. Veprintsev, C. V. Robinson, J. W. Chin, A. C. Joerger, A. R. Fersht, Proc. Natl. Acad. Sci. USA 2011, 108, 8251 – 8256.
[22] M. Lammers, H. Neumann, J. W. Chin, L. C. James, Nat. Chem. Biol. 2010, 6, 331 – 337.
[23] H. Neumann, S. M. Hancock, R. Buning, A. Routh, L. Chapman, J. Somers, T. Owen-Hughes, J. van Noort, D. Rhodes, J. W. Chin, Mol. Cell 2009, 36, 153 – 163.
[24] M. Oppikofer, S. Kueng, F. Martino, S. Soeroes, S. M. Hancock, J. W. Chin, W. Fischle, S. M. Gasser, EMBO J. 2011, 30, 2610 – 2621.
[25] a) D. Groff, P. R. Chen, F. B. Peters, P. G. Schultz, ChemBioChem 2010, 11, 1066 – 1068; b) Y.-S. Wang, B. Wu, Z. Wang, Y. Huang, W. Wan, W. K. Russell, P.-J. Pai, Y. N. Moe, D. H. Russell, W. R. Liu, Mol. Biosyst. 2010, 6, 1557 – 1560.
[26] D. P. Nguyen, M. M. Garcia-Alai, P. B. Kapadnis, H. Neumann, J. W. Chin, J. Am. Chem. Soc. 2009, 131, 14194 – 14195.
[27] H. Ai, J. W. Lee, P. G. Schultz, Chem. Commun. 2010, 46, 5506 – 5508.
[28] D. P. Nguyen, M. M. Garcia-Alai, S. Virdee, J. W. Chin, Chem. Biol. 2010, 17, 1072 – 1076.
[29] S. Virdee, P. B. Kapadnis, T. Elliott, K. Lang, J. Madrzak, D. P. Nguyen, L. Riechmann, J. W. Chin, J. Am. Chem. Soc. 2011, 133, 10708 – 10711.
[30] S. Virdee, Y. Ye, D. P. Nguyen, D. Komander, J. W. Chin, Nat. Chem. Biol. 2010, 6, 750 – 757.
[31] a) G. Srinivasan, C. M. James, J. A. Krzycki, Science 2002, 296, 1459 – 1462; b) K. Blight, R. C. Larue, A. Mahapatra, D. G. Longstaff, E. Chang, G. Zhao, P. T. Kang, K. B. Green-Church, M. K. Chan, J. A. Krzycki, Nature 2004, 431, 333 – 335.
[32] P. R. Chen, D. Groff, J. Guo, W. Ou, S. Cellitti, B. H. Geierstanger, P. G. Schultz, Angew. Chem. Int. Ed. 2009, 48, 4052 – 4055.
[33] H.-S. Park, M. J. Hohn, T. Umehara, L.-T. Guo, E. M. Osborne, J. Benner, C. J. Noren, J. Rinehart, D. Söll, Science 2011, 333, 1151 – 1154.
[34] a) J. W. Chin, A. B. Martin, D. S. King, L. Wang, P. G. Schultz, Proc. Natl. Acad. Sci. USA 2002, 99, 11020 – 11024; b) N. Hino, Y. Okazaki, T. Kobayashi, A. Hayashi, K. Sakamoto, S. Yokoyama, Nat. Methods 2005, 2, 201 – 206; c) S. M. Hancock, R. Uprety, A. Deiters, J. W. Chin, J. Am. Chem. Soc. 2010, 132, 14819 – 1482; d) C. Chou, R. Uprety, L. Davis, J. W. Chin, A. Deiters, Chem. Sci. 2011, 2, 480 – 483; e) H. Ai, W. Shen, A. Sagi, P. R. Chen, P. G. Schultz, ChemBioChem 2011, 12, 1854 – 1857; f) M. Zhang, S. Lin, X. Song, J. Liu, Y. Fu, X. Ge, X. Fu, Z. Chang, P. R. Chen, Nat. Chem. Biol. 2011, 7, 671 – 677.
[35] C. Schlieker, J. Weibezahn, H. Patzelt, P. Tessarz, C. Strub, K. Zeth, A. Erbse, J. Schneider-Mergener, J. W. Chin, P. G. Schultz, B. Bukau, A. Mogk, Nat. Struct. Mol. Biol. 2004, 11, 607 – 615.
[36] Y. Kimata, M. Trickey, D. Izawa, J. Gannon, M. Yamamoto, H. Yamano, Dev. Cell 2008, 14, 446 – 454.
[37] M.-N. Yap, H. D. Bernstein, Mol. Cell 2009, 34, 201 – 211.
[38] S. Tagami, S. Sekine, T. Kumarevel, N. Hino, Y. Murayama, S. Kamegamori, M. Yamamoto, K. Sakamoto, S. Yokoyama, Nature 2010, 468, 978 – 982.
[39] a) J. W. Chin, EMBO J. 2011, 30, 2312 – 2324; b) I. Coin, V. Katritch, T. Sun, Z. Xiang, F. Y. Siu, M. Beyermann, R. C. Stevens, L. Wang, Cell 2013, 155, 1258 – 1269.
[40] E. A. Lemke, D. Summerer, B. H. Geierstanger, S. M. Brittain, P. G. Schultz, Nat. Chem. Biol. 2007, 3, 769 – 772.
[41] A. Gautier, D. P. Nguyen, H. Lusic, W. An, A. Deiters, J. W. Chin, J. Am. Chem. Soc. 2010, 132, 4086 – 4088.
[42] A. Gautier, A. Deiters, J. W. Chin, J. Am. Chem. Soc. 2011, 133, 2124 – 2127.
[43] D. Groff, F. Wang, S. Jockusch, N. J. Turro, P. G. Schultz, Angew. Chem. Int. Ed. 2010, 49, 7677 – 7679; Angew. Chem. 2010, 122, 7843 – 7845.
[44] F. B. Peters, A. Brock, J. Wang, P. G. Schultz, Chem. Biol. 2009, 16, 148 – 152.
[45] J. Hemphill, C. Chou, J. W. Chin, A. Deiters, J. Am. Chem. Soc. 2013, 135, 13433 – 13439.
[46] a) H. S. Lee, J. Guo, E. A. Lemke, R. D. Dimla, P. G. Schultz, J. Am. Chem. Soc. 2009, 131, 12921 – 12923; b) A. Chatterjee, J. Guo, H. S. Lee, P. G. Schultz, J. Am. Chem. Soc. 2013, 135, 12540 – 12543.
[47] a) J. Wang, J. Xie, P. G. Schultz, J. Am. Chem. Soc. 2006, 128, 8738 – 8739; b) G. Charbon, E. Brustad, K. A. Scott, J. Wang, A. Lobner-Olesen, P. G. Schultz, C. Jacobs-Wagner, E. Chapman, ChemBioChem 2011, 12, 1818 – 1821.
[48] D. Summerer, S. Chen, N. Wu, A. Deiters, J. W. Chin, P. G. Schultz, Proc. Natl. Acad. Sci. USA 2006, 103, 9785 – 9789.
[49] B. Shen, Z. Xiang, B. Miller, G. Louie, W. Wang, J. P. Noel, F. H. Gage, L. Wang, Stem Cells 2011, 29, 1231 – 1240.
[50] J. B. Lucks, L. Qi, W. R. Whitaker, A. P. Arkin, Curr. Opin. Microbiol. 2008, 11, 567 – 573.
[51] O. Rackham, J. W. Chin, Nat. Chem. Biol. 2005, 1, 159 – 166.
[52] M. J. Bröcker, J. M. L. Ho, G. M. Church, D. Söll, P. O’Donoghue, Angew. Chem. Int. Ed. 2014, 53, 319 – 323; Angew. Chem. 2014, 126, 325 – 330.
[53] a) J. C. Anderson, N. Wu, S. W. Santoro, V. Lakshman, D. S. King, P. G. Schultz, Proc. Natl. Acad. Sci. USA 2004, 101, 7566 – 7571; b) W. Wan, Y. Huang, Z. Wang, W. K. Russell, P.-J. Pai, D. H. Russell, W. Liu, Angew. Chem. Int. Ed. 2010, 49, 3211 – 3214; Angew. Chem. 2010, 122, 3279 – 3282.
[54] a) H. Neumann, A. L. Slusarczyk, J. W. Chin, J. Am. Chem. Soc. 2010, 132, 2142 – 2144; b) A. Chatterjee, H. Xiao, P. G. Schultz, Proc. Natl. Acad. Sci. USA 2012, 109, 14841 – 14846.
[55] T. K. Lu, A. S. Khalil, J. J. Collins, Nat. Biotechnol. 2009, 27, 1139 – 1150.
[56] C. C. Liu, H. Choe, M. Farzan, V. V. Smider, P. G. Schultz, Biochemistry 2009, 48, 8891 – 8898.
[57] C. C. Liu, A. V. Mack, E. Brustad, J. H. Mills, D. Groff, V. V. Smider, P. G. Schultz, J. Am. Chem. Soc. 2009, 131, 9616 – 9617.
[58] M. Kang, K. Light, H.-W. Ai, W. Shen, C. H. Kim, P. R. Chen, H. S. Lee, E. I. Solomon, P. G. Schultz, ChemBioChem 2014, 15, 822 – 825.
[59] J. W. Day, C. H. Kim, V. V. Smider, P. G. Schultz, Bioorg. Med. Chem. Lett. 2013, 23, 2598 – 2600.
[60] D. B. F. Johnson, J. Xu, Z. Shen, J. K. Takimoto, M. D. Schultz, R. J. Schmitz, Z. Xiang, J. R. Ecker, S. P. Briggs, L. Wang, Nat. Chem. Biol. 2011, 7, 779 – 786.
[61] A. Chatterjee, S. B. Sun, J. L. Furman, H. Xiao, P. G. Schultz, Biochemistry 2013, 52, 1828 – 1837.
[62] D. B. Cowie, G. N. Cohen, Biochim. Biophys. Acta 1957, 26, 252 – 261.
[63] A. J. Link, M. L. Mock, D. A. Tirrell, Curr. Opin. Biotechnol. 2003, 14, 603 – 609.
[64] N. Budisa, B. Steipe, P. Demange, C. Eckerskorn, J. Kellermann, R. Huber, Eur. J. Biochem. 1995, 230, 788 – 796.
[65] J. O. Boles, K. Lewinski, M. Kunkle, J. D. Odom, B. Dunlap, L. Lebioda, M. Hatada, Nat. Struct. Biol. 1994, 1, 283 – 284.
[66] N. Budisa, W. Karnbrock, S. Steinbacher, A. Humm, L. Prade, T. Neuefeind, L. Moroder, R. Huber, J. Mol. Biol. 1997, 270, 616 – 623.
[67] J. H. Bae, S. Alefelder, J. T. Kaiser, R. Friedrich, L. Moroder, R. Huber, N. Budisa, J. Mol. Biol. 2001, 309, 925 – 936.
[68] D. C. Dieterich, A. J. Link, J. Graumann, D. A. Tirrell, E. M. Schuman, Proc. Natl. Acad. Sci. USA 2006, 103, 9482 – 9487.
[69] S. Lepthien, L. Merkel, N. Budisa, Angew. Chem. Int. Ed. 2010, 49, 5446 – 5450; Angew. Chem. 2010, 122, 5576 – 5581.

Received: April 9, 2014


MINIREVIEWS

X. Li, C. C. Liu*

Biological Applications of Expanded Genetic Codes

Why stop at 20? Expanded genetic codes allow the direct addition of new chemistries into proteins in living organisms. We discuss the applications and possibilities of expanded genetic codes in protein, cell, synthetic, and evolutionary biology.
