Protein engineering and the use of molecular modeling and simulation: the case of heterodimeric Fc engineering.

Methods xxx (2013) xxx–xxx

Contents lists available at ScienceDirect

Methods journal homepage: www.elsevier.com/locate/ymeth

Protein engineering and the use of molecular modeling and simulation: The case of heterodimeric Fc engineering Thomas Spreter Von Kreudenstein, Paula I. Lario, Surjit B. Dixit ⇑ Zymeworks Inc., 540-1385 West 8th Avenue, Vancouver, British Columbia V6H 3V9, Canada

a r t i c l e

i n f o

Article history: Available online xxxx Keywords: Bispecific antibody Heterodimeric antibody Fc engineering Antibody engineering Computational protein engineering

a b s t r a c t Computational and structure guided methods can make significant contributions to the development of solutions for difficult protein engineering problems, including the optimization of next generation of engineered antibodies. In this paper, we describe a contemporary industrial antibody engineering program, based on hypothesis-driven in silico protein optimization method. The foundational concepts and methods of computational protein engineering are discussed, and an example of a computational modeling and structure-guided protein engineering workflow is provided for the design of best-in-class heterodimeric Fc with high purity and favorable biophysical properties. We present the engineering rationale as well as structural and functional characterization data on these engineered designs. Ó 2013 Elsevier Inc. All rights reserved.

1. Introduction Bispecific antibodies, a family of engineered antibody derivatives that concurrently recognize two different target antigens of interest, builds upon the natural functionally relevant characteristics of antibody therapeutics, particularly, its highly selective and strong antigen binding. The fact that there are over 50 molecular formats that have been engineered for the creation of bispecific molecules attests to the excitement and potential therapeutic development opportunities offered by these designs [1]. Among these formats, the asymmetric class of antibodies comprising of an heterodimeric Fc (Fig. 1) that retain the natural IgG-like structure is particularly interesting since therapeutic designs based on this format can make use of the intrinsic functionalities conferred by the Fc domain of the antibody, such as effector activity and improved serum half-life characteristics relative to other protein species. Further, the format lends itself to combine well-studied canonical monospecific antibodies (preclinical and/or clinical) to create novel bispecific molecules capable of novel mechanisms of action. In this article we present a structure- and computational modeling-guided protein engineering approach to achieve a heterodimeric Fc that is the foundation for the formation of Abbreviations: RMSD, root mean squared deviation; RMSF, root mean squared fluctuation; RDF, radial distribution function; SASA, solvent exposed surface area; CCSD, close contact surface density (Drtot); KBP, knowledge based potential; LJ, Lennard-Jones contributions; elec, electrostatic contributions; Hphobic, hydrophobic; MD, molecular dynamics; Tm, thermal melting temperature; h-bond, hydrogen bond; vdW, van der Waals interaction; KiH, knob in hole; CI, charge inversion. ⇑ Corresponding author. E-mail address: [email protected] (S.B. Dixit).

asymmetric antibodies. While not addressed here, we have employed a similar engineering approach to achieve selective pairing of two unique light chains to the two heavy chains to construct bispecific antibody molecule.

2. Background 2.1. Heterodimeric Fc engineering The intrinsic tendency of the Fc portion of the antibody molecule to dimerize has been exploited in the creation of one of the earliest bispecific antibodies using a hybrid hybridoma of two cells that can produce two different antibodies [2]. Theoretically, the amount of the bispecific antibody of interest that can be obtained using non-selective pairing of two different heavy chain will be limited to a maximum of 50%. This yield is significantly lower when the molecule comprises of two unique light chains, with the ultimate product profile depending on the nature and composition of the individual chains in the system and their propensities for expression and pairing with each other. In such a mixture of antibody products, the single correctly-associated bispecific molecule has to be recovered, usually employing a cumbersome purification effort and at potentially low yields because of the loss in the form of other byproducts. The only bispecific antibody currently approved for sale to patients with malignant ascites, Catumaxomab, is a rat/mouse chimeric antibody derived from hybrid hybridoma (quadroma) and makes use of a distinct pH dependent elution characteristics of rat and mouse Fc chains during Protein-A purification [3]. Notably, the quadroma technology also benefits

1046-2023/$ - see front matter Ó 2013 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.ymeth.2013.10.016

Please cite this article in press as: T. Spreter Von Kreudenstein et al., Methods (2013), http://dx.doi.org/10.1016/j.ymeth.2013.10.016

2

T. Spreter Von Kreudenstein et al. / Methods xxx (2013) xxx–xxx

Fig. 1. Schematic graphical representations of the heterodimeric IgG1.

from the preferential pairing of rat and mouse heavy-light chains to retain the correct light chain pairing in the bispecific product. Building upon such a purification strategy while addressing the potential immunologic risk of using such non-human antibody frameworks, a few groups have introduced unique mutations in the two different heavy chains of the human IgG framework such that the mutation permits isolation of the heterodimer from the homodimers in a chromatographic purification step [4]. An alternate approach relying on rational structure-guided engineering of the Fc chains could drive selective pairing of preferred heavy chains. Co-expression [5] or alternately a post production arm exchange technique [6,7] can reduce the formation of unwanted Fc homodimeric chain pairs. Prominent among these structure guided Fc engineering efforts is the Knobs-into-Holes (KiH) strategy employed by Presta, Carter and coworkers [88] at the CH3 interface. A ‘‘knob’’ was introduced on one chain by mutating a smaller residue with a larger one (T366W), while a ‘‘hole’’ to accommodate the ‘‘knob’’ was created in the complementary surface of the other chain with larger to smaller residue mutations (T366S, L368A and Y407V). Introducing the knob and hole on the complementary chains disfavors self-pairing that form homodimers and biases the system towards preferentially forming the heterodimeric species. However, this design resulted in significant stability loss (15 °C) of the CH3 domain relative to the wild-type Fc thereby spawning subsequent attempts to optimize the stability of these heterodimeric Fc domains by phage display [8]. The reported CH3 domain melting temperature in the KiH heterodimeric Fc design following optimization with phage display is 72 °C and improves to 78 °C when a disulphide bond is also introduced at the interface [8,9]. In contrast to the steric complementarity approach in the KiH designs, Gunasekaran and coworkers [10] employed a structure-guided electrostatic charge inversion (CI) design strategy to achieve selective heterodimerization of the Fc. Based on a sequence and structure-guided engineering approach Davis and coworkers [11] have designed a strand exchange engineered domain (SEED) CH3 which comprises of alternating segments of human IgA and IgG CH3 sequences yielding preferentially associating complementary heterodimers. Following the trend for the KiH design, heterodimeric CH3 domains derived from the CI [10] and SEEDBody [12] approaches resulted in a thermal stability of 68 °C. All these specificity-engineering approaches have favored heterodimer formation by destabilizing the natural homodimer interface. Given the complex nature of folding and dimerization characteristics intrinsic to the CH3 domains in the Fc, [13–15] there appears to be a tendency for the thermal stability of these engineered heterodimeric Fc to drop significantly relative to the wild-type Fc (82 °C). There are potential advantages to developing therapeutics with designs that have addressed such fundamental biophysical stability loss

[16,17]. The key protein engineering challenges in the design of asymmetrical Fc domains is to maintain the parental homodimerlike Fc CH3 interface affinity (a correlate of specificity) while also retaining the parental stability. The high intrinsic affinity between the Fc chains, reported to be in the pM range, [18] and high stability of the natural IgG1 CH3 domains has made concurrently achieving the high specificity (purity) and stability in heterodimeric Fc design elusive to date. 2.2. Molecular modeling and simulation To address this apparent gap, we have employed a rational protein engineering approach to design heterodimeric Fc domains with high chain-pairing specificity while retaining native antibody-like thermal stability. In contrast to solely using negative design approaches focused on driving specificity, to achieve a heterodimeric Fc as exemplified by the KiH and CI constructs, we have also incorporated positive design approaches [19]. In positive design, amino acid modifications are introduced to maximize favorable interactions within or between proteins and stabilize the desired specific protein–protein pair interactions. In the case of Fc heterodimers, when combined with negative designs, the positive designs should aim to only drive favorable heterodimeric interactions without mitigating the interactions that disfavor homodimer formation. It is understood that positive design strategies by themselves can rarely achieve greater than 90% specificity, [20,21] so the key is to concurrently introduce both types of mutations to maximize specificity and stability. Computational modeling and simulation in this context could aid in appreciating the nature of individual mutations being introduced and in interpreting their role towards forming positive or negative designs in the engineered Fc heterodimers and their corresponding homodimers. Here we shall provide a brief overview of the state of the art of the field of computational modeling and simulation of protein structure and how we employ these methods as tools for heterodimeric Fc engineering. The field of theoretical structural modeling of proteins got its start in the early works of Linus Pauling, Francis Crick, G.N. Ramachandran and many others aimed at understanding the sequence–structure relations in proteins [22–24]. Interestingly, the models developed by Crick to describe the tight packing in the structure of coiled coils was also referred to as Knobs into Holes packing [23]. The appreciation that protein function is tied to its structure, based on the developments in X-ray crystallography and enzymology, highlighted the significance of structure–function relations in proteins [25]. With the development of the induced fit and conformational selection models for explaining protein–ligand binding beyond the simpler lock and key fit par-



adigm, protein science moved towards recognizing protein dynamics as another critical attribute of proteins and their activities [26]. There are no straight-forward experimental approaches to glean information on protein dynamics at the molecular level, albeit NMR spectroscopy provides insight with it own limitations [27]. Beyond the notion of ‘‘see to believe’’, this chain of sequence–structure–dynamics–function relation that is intrinsic to the protein modality has motivated the development of theoretical frameworks for modeling and simulation of protein structure and motion as a means to understand functional aspects of the system that depend on dynamics. Further, the advances in computational infrastructure and high performance parallel-processing technologies have sustained these structurebased protein simulation and characterization efforts as a means to further understand the systems biophysical properties [28,29]. In the next few paragraphs, we provide a brief overview of the current status and challenges facing protein engineering using computational protein structure modeling and simulation and our approach towards its application.

2.3. Sampling protein conformations Realistic and accurate modeling of the conformational dynamics of proteins and their mutants, their activities and interactions with other proteins and molecular partners (including solvent), is an extremely challenging multi-disciplinary effort. A variety of deterministic, stochastic, as well as combined deterministic and stochastic algorithms to optimize protein structural models and to sample the various degrees of conformational freedom of these systems are in development [30,31]. Although de novo protein folding and structure prediction starting from the primary sequence of the protein using molecular simulation has been the holy grail, [32] there is an opportunity to apply conformational sampling approaches to model and generate practical insight into a variety of subtle to large scale protein motions of biological interest. A number of protein conformational sampling methods addressing different aspects of local geometrical features such as topology of amino acid side chains, [33–35] backbone geometry and bond orientation changes, [36–38] loop geometry descriptions, [39,40] reorganization of secondary structural elements, [41] optimization of domain-domain contacts, [21,42–44] and protein–protein and protein–solvent characterization [45,46] have been developed with potential application in protein engineering. Effective protein modeling and simulation calls for an ‘‘artful’’ application of the appropriate sampling approaches to understand the relevant conformational dynamics, which may occur over the pico-second to milli-second time scale in the real system. The local geometry sampling applications have to be coupled with a balanced appreciation of the potential impact of the approximations inherent in the approaches employed. Rightfully so, the use of molecular dynamics simulation has been labeled an art and this designation may be broadly applied to molecular simulations in general [47]. In the context of Fc engineering for the design of heterodimers starting from the natural homodimers, one of the primary goals is to retain the native tertiary structure of the Fc. This can be addressed using algorithms that effectively sample amino acid side chain packing including local backbone geometry changes in protein interiors and at protein domain interfaces, assuming the protein inter-domain configuration is retained upon mutation. There is a complex link between the protein structure, nature of mutations and its impact on stability, as a result of changes to the local flexibilities and altered folding propensity [48]. Simulations provide valuable tools to evaluate such questions and to dissect and develop insight with regard to the conformational aspects of the mutation.

3

2.4. Energy functions for protein modeling The other critical factor that defines the reliability of the protein simulation is the nature of energy functions that models the interactions between particles within the protein and its environment. Typically, modeling and simulation of protein states that does not involve a covalent bond alteration employs approximate energy functions that compute pair-wise atomic interactions or alternately use more coarse descriptions limited to the individual functional groups or amino acid residues. The most prominent energy functions and its corresponding parameter sets, such as those used in the AMBER modeling suite, [49] comprise of physics based valence terms for topological features such as bonds, bond angles and torsional angle and pair-wise non-bonded interaction terms comprising of electrostatics modeled by the Coulomb equation and the ubiquitous but weak dispersion-repulsion (van der Waals) interactions modeled by the Lennard-Jones equation [50–52]. The effective parameters in these molecular mechanics terms are derived from ab initio quantum mechanical calculations as well as by calibration to target data derived from molecular analysis such as spectroscopy, crystallography and solvation energies of small molecules that function as amino acid analogs. One of the major approximations in these fixed point charge based energy functions is that they do not explicitly address charge polarization effects although these transitory properties are critical elements of molecular interactions and binding, for instance contacts such as cationp and p–p systems. Polarizable force fields for proteins are in early stages of development [53]. The impact of polarization of water adds to the complexity and extensive effort has been put into the development of parameters to explicitly treat water molecules as a solvent medium for the protein [54]. Implicit water models based on the Poisson–Boltzmann equations that capture the effect of solvent as a dielectric continuum, or the modified Generalized Born equation which models the solvation energy of charged particles, are being developed as an alternate to explicit molecular description of water molecules [55–57]. These implicit solvent electrostatics models have to be combined with functions that capture the non-polar component of protein hydration such as the effect of cavitation and solute/solvent properties that contribute to the hydrophobic effect [58,59]. 2.5. Scoring & ranking of mutations Accurately scoring and ranking of individual designs relative to each other or to the parent species based on simulation is one of the most challenging aspects of the computer-aided approach. The stability of a folded state of a protein is typically about 5–10 kcal/mol more favorable relative to a reference unfolded state while the strength of a well formed interaction such as a hydrogen bond could be as much as 2–9 kcal/mol [25]. As a result even small inaccuracies in the description of these critical interactions could make simulation results quantitatively incorrect. Rigorous free energy and potential of mean force simulation methods that predict the relative thermodynamic free energy differences between states of the system in events such as protein ligand binding have been developed, but the inherent approximations in the energy functions and the computational cost of these simulations make regular quantitative estimation of free energy changes extremely challenging [60–62]. Furthermore, the challenge in defining a reference state corresponding to the unfolded state of a protein makes use of free energy simulation methods in order to determine the free energy of stability and folding problematic [63]. The molecular mechanics Poisson Boltzmann/generalized Born surface area (MM-PB/GB-SA) approach attempts to independently estimate the effects of all the important physico-chemical components of the system, such as internal energies, solvation energies and trac-


4


table entropic terms that could contribute to the free energy differences between two states of a system [64–66]. In spite of all the inherent approximations in such a simplified approach, this molecular state based free energy component analysis method has found favor among the simulation community, because of the opportunity it provides to inquire about different physico-chemical components as well as the qualitative insight it provides on alternate design proposals. The compensatory nature of the various physico-chemical components used in such an analysis as well as appreciating the significance of the terms relative to each other can introduce complex interpretation challenges. Effort has been put into developing scaling factors that help improve correlation of the estimated energies to empirical data in similar multicomponent energy functions [67]. Entropic contribution of the conformational degrees of freedom is another critical component of protein activity but is one of the most challenging features to model accurately and also difficult to evaluate from the experiments [68]. Covariance characteristics of the protein structure and its dynamics has often been employed as a means to derive approximate insight into the entropic aspects of the protein conformation [69,70]. Information gleaned from an analysis of the database of highresolution structures derived by X-ray crystallography or NMR spectroscopy is the most reliable empirical data for evaluating protein model quality. Analysis of residue or atomic pairing probabilities derived from the database of high-resolution 3-dimensional protein structures have been used as a means to calibrate and develop statistical potentials (also known as knowledge based potentials (KBP)) relevant to proteins [71,72]. Such potentials inherently capture the effect of all the molecular factors leading to the folded state of the protein but the development and use of these potentials is challenged by lack of information on the nature of reference unfolded states of the protein as well as the limited number of high resolution protein structures in the database. 2.6. The hypothesis driven engineering approach In spite of the challenges and limitations, available computer guided protein modeling and simulation methods provide valuable qualitative insights into the nature of protein structure and conformational dynamics along the relational chain of protein sequence-structure-dynamics-function. However, the reliability for predictive and quantitative application of the method without reasonable training with empirical data on the system of interest remains a challenge. In this context we have employed a hypothesis driven approach to protein engineering: Based on structural and other experimental information on the system of interest and the protein engineering problem at hand, rational structure guided hypothesis towards an engineering solution is developed and this is subsequently evaluated and refined with the help of computational molecular modeling and simulation. It is conceptually difficult to judge the potential impact of different types of mutations and designs based solely on the structural information of the parent protein and it is in this context that the computational chemistry based modeling and simulation become useful. Furthermore, specific combinations of computational tools and approaches can be uniquely applied to evaluate a particular hypothesis. At the same time, given the challenges faced by computational modeling approaches, the ranked output from computer simulations is often not accurate enough to effectively drive the screening effort in the laboratory. It is beneficial to incorporate the human element and expertise to query and interpret the in silico output. Under the guidance of this ‘‘two expert’’ approach, one being the computational modeling and simulation activity and the other being the more qualitative structural biology expertise, rationally derived focused libraries of mutation are selected for experimental verification. Every engineering effort is unique in itself since subtle local differences in sequence, geometry

and dynamics of the protein region could significantly impact how the mutations behave. The computational data along with the experimental observations guide further modification of the hypothesis as well as enriching the quality of the model in use for subsequent iteration of optimization. In this article, we present details of such a hypothesis driven approach to structure and computational modeling-guided optimization to effectively achieve asymmetrical heterodimeric Fc chain pairing with very high purity while retaining wild type-like stability. We present the engineering rationale as well as structural and functional characterization data of these engineered designs. 3. Methods 3.1. Structure preparation A number of IgG1 Fc structures are available in the PDB. Comparison of these structures provides information on potential regions of natural flexibility and structural variations in the Fc. In preparing the structure for simulations, the structure was protonated corresponding to pH 7.0 using propKa [73] with the N and Cterminus of the Fc chains capped with acetate and methyl amide groups to reduce simulation artifacts due to the terminal charges. The missing atoms and groups in the carbohydrate structure at the N297 position were rebuilt to the standard G0F configuration. All crystal waters in the PDB file were retained. The alternate structural configurations in the high-resolution crystal structure were considered and analyzed depending on the interest in the region, else the conformation with the highest occupancy was incorporated into the model. Mutant structures for variants of interest were generated using the protein packing method described below. All prepared crystal structures were optimized in a short energy minimization run comprising of 200 steps of steepest descent and 50 steps of conjugate gradient using an in-house platform (ZymeCAD™) prior to any other analysis. 3.2. Molecular dynamics Molecular dynamics (MD) simulations track the dynamic trajectory of a molecule resulting from motions arising out of interactions and transient forces acting between all the atomic entities in the system, in this case the atoms constituting the Fc and its surrounding water molecules. MD simulations were performed on the parent wild type structure as well as certain mutant variants to gain insight into the dynamic tendencies of various residues and their local secondary structural regions. The prepared structure was placed in a cubic waterbox extending to 15 Å on all sides and with sufficient counterions to neutralize the net charge on the system. Following a short energy minimization, MD was performed using a 1.5 fs timestep for simulations lengths in the range of 20–50 ns for each system. All bonds to hydrogen atoms in the protein and the geometry of water molecules were constrained to remain rigid in the course of the simulation [74]. The Amber parm99SB forcefield [50] was employed for the protein along with the TIP3P water model [75]. A 12 Å energy cutoff was employed along with a generalized reaction field for long-range electrostatic correction [76]. Equilibration was performed using the Leapfrog Verlet integrator in an NPT ensemble with weak coupling to an external pressure and temperature bath [77] in the ZymeCAD™ MD simulation engine. Production runs were carried out on a GPU architecture using the OpenMM simulation library [78] in the NVT ensemble using the Langevin integrator at 300 K [79]. Visualization was performed using Pymol [80] or VMD [81] and all post-facto analysis of MD trajectories employed software developed in-house.



3.3. Protein packing In systems and engineering designs where the introduction of a particular mutation is not expected to impact the folding and tertiary structure of the mutant relative to the parent protein, protein packing methods are employed to assess the local complementarity of amino acid side chains at the site of mutation with particular focus on buried residues and those at a protein–protein interface. In this context, rotamer-based amino acid side chain geometry sampling is performed to derive structures of new variants based on the parent wildtype Fc structure. Such an approach is often coupled with an analysis of the propensity for introduced mutations to impact to the local secondary structure of the parent protein. We employed the fine-grained rotamer library developed by Xiang and Honig along with the cyclic search and packing method to optimize side chain packing [82]. The optimization is based on the use of parm99SB forcefield [50] with a Generalized Born treatment of solvation effects [57,83]. The procedure consists of selecting a limited number of residues proximal to the site of mutation (comprising the optimization zone) and iteratively selecting the lowest energy rotamer for each of the positions within the zone until convergence is achieved. This procedure is repeated multiple times for each optimization run starting with a random rotamer for each residue in the optimization region in order to evaluate the nature of the minima as well as to generate an ensemble of alternate structural solutions. Following this optimization, we employed a Monte Carlo based sampling of backrub moves [36] and side chain torsional angle in the optimization zone to further refine the local geometry. 3.4. Model analysis Following structural optimization by MD or protein packing methods, a number of metrics that can be insightful in the analysis of the hypothesis are evaluated post facto for the individual or ensemble of structures obtained. This includes detailed conformational analysis looking at properties such as RMSD, RMSF, RDF and radius of gyration [84]. Other structural properties evaluated in-

clude features like solvent accessibility [85,86] and propensities of observed geometries. Individual residues are also characterized for the number of close contacts they make and the geometrical aspects of these contacts. We also compute energy properties of individual residues and groups, solvation energies, character of hydrogen bonds formed in the region of interest, van der Waals scores for clash detection and packing complementarity as well as scores based on the use of knowledge-based statistical potentials. Each of these parameters can be decomposed differently to evaluate their impact on either or both stability and affinity features of the molecule. A selection of metrics used to evaluate binding affinity and stability are summarized in Table 1. An important additional characteristic of natural protein–protein interfaces that is lost in evaluating single residue-residue interactions in a pairwise manner is the connectivity of each residue across and within the interface. This is a critical component in identifying key residue interactions and hotspot positions that stabilize the interface. To address this we use an interaction identification, visualization, and contact network analysis tool developed in-house. The tool allows calculation and visualization of the contacts and connectivity of each residue based on the 1st/2nd shell residue interaction. By shell we mean the proximity of a particular residue to the contact interface of interest. A lower shell number indicates the residues are closer to the interface. An example of the output of the visualization component of this custom tool is shown in Fig. 2. 3.5. Heterodimer Fc rational engineering strategy To address the challenges in Fc heterodimer engineering we employed a two-stage approach that specifically combines negative and positive design strategies to achieve the desired high specificity and WT-like affinity and stability (Fig. 3A). In the initial design phase the core interface positions were computationally screened using different negative design strategies, including steric, electrostatic and hydrophobic approaches and the variants predicted to have high heterodimer specificity were selected for in vitro verification. These variants were transiently expressed in

Table 1 Features typically evaluated for wild type and modeled mutant protein structures. Term

Description

DHelec

The electrostatic contribution to the difference between the enthalpy of folding of the mutant and the corresponding wild type protein. This term includes the effects of solvent polarization as modeled by the generalized Born approximation. The enthalpy of the unfolded state is modeled as the sum of the electrostatic enthalpies of the individual residues that comprise the protein. In this approximation, the contribution to the enthalpy of the unfolded state arising from a residue of type r is computed from a Boltzmann weighted average energy of a tripeptide with sequence (ALA, r, ALA), where the conformational states are defined by all Dunbrack side-chain rotamers of residue type r The electrostatic contribution to the binding affinity difference between a mutant and the corresponding wild-type protein. The effects of solvent polarization are captured through the use of the generalized Born approximation The Lennard-Jones contribution to the difference between the enthalpy of folding of the mutant and the corresponding wild type protein The Lennard-Jones contribution to the binding affinity difference between a mutant protein and the corresponding wild-type The difference in energy between the folded and unfolded states of a protein, estimated using an atomistic distance–dependent knowledge based potential (KBP) with a polymer reference state The difference in binding affinity between a mutant and the corresponding wild-type protein, as estimated by the polymer reference state KBP The difference in the buried surface area created upon complex formation, taken between the mutant and the wild type Defined like DAbind, except where the surface area is only aggregated for the carbon atoms

Dhelec DHLJ DhLJ DVKBP DvKBP DAbind

DAcarbon bind DLcarbon fold

The change in solvent exposed surface area (SASA) [85,86] upon folding of the protein. Only the solvent exposed surface area ascribed to carbon atoms is included in the computation

Dlcarbon fold Dlsys DrR

The difference in DLcarbon between a mutant and the wild type fold The difference in total SASA between a mutant and the wild type For a residue R, the ratio of the number of close atom–atom contacts with residues in the environment, divided by the residue surface area occluded by the environment, multiplied by an exponential damping to reduce the impact by the singularity in the limit of zero occluded surface area The difference in DrR between a mutant protein and the corresponding wild-type protein (CCSD) The number of residues with a conformation that does not match any of the rotamers in the library of Dunbrack and Karplus [97]. The residues considered when computing this metric are those that are structurally refined during post-mutation repacking The number of residues with a conformation that is an approximate match to one of the romaters in the library of Dunbrack and Karplus. The value of this metric equals the number of residues that have a rotameric configuration that is between two and three standard deviations away from a rotamer in the library. The residues considered when computing this metric are those that are structurally refined during post-mutation repacking

Drtot Ne N

5


6


Fig. 2. Integrated interface analysis by residue contacts and interaction network analysis. The upper panel shows a 2-dimensional graph representation of the interaction network in part of the CH3CH3 interface. Each node in the graph represents a residue, and the lines connecting the nodes represent interactions between residues. These interactions include e.g. vdW interactions, hydrogen bonds, cation-p and p–p interactions. The different calculated interactions can be directly mapped onto the structure for further inspection. The lower panels illustrate for example the residue connectivity for first and second shell residues mapped to the Fc structure. Visualizing the detailed favorable and unfavorable interactions and the pattern of inter- and intra-chain residue connectivity provides a critical aspect in both the WT analysis and the post processing of variants at the design stage.

CHO, purified by Protein-A affinity chromatography and analyzed for heterodimer purity and thermal stability as described previously [17,87]. A total of 16 variants based on four core designs were experimentally characterized in the initial design phase. From this initial set of negatively-designed Fc heterodimers, two variants were selected for further optimization in the second design phase to drive both stability and specificity using positive design approach. Each design was evaluated using computational modeling assisted structure–function analysis procedure described below to postulate a hypothesis for the observed lower-thanWT thermal stability and subsequently recommend mutations to address the structural issues. Based on this analysis, a limited new set of designs was selected for further evaluation.

3.6. Computational structure–function analysis and screening workflow The procedure for computational screening and ranking of heterodimeric Fc variants is depicted in Fig. 3B. In the initial phase, the structure and interface of the WT Fc is analyzed in detail to gain an understanding of the structure–function relationships and the potential contribution of each of the interface residues to affinity and stability. This analysis entails multiple structure and sequence alignments in conjunction with protein packing, MD simulations and comprehensive analysis of inter- and intra-chain residue contacts and interface interactions network, described above. The analysis provides an understanding of critical Fc interface charac-

teristics and hotspots, such as interaction strength, solvent accessibility of core interface residues, residue mobility at the interface and potential bifurcation of polar and electrostatic contacts. The information from the above analyses is compiled in an easily accessible summary, as exemplified in Table 2. This detailed understanding of the underlying structure–function relationships provides the basis for subsequent examination of candidate positions at the Fc interface for introducing steric, charge, or hydrophobic-based negative designs in the 1st design iteration. In the 2nd design iteration, the introduction of additional mutations were guided by similar understanding of the initial variants combined with available experimental data leading to the identification of specific positions and interactions that were contributing to the reduced stability. The hypothesis driven design is dependent on local differences in sequence, geometry and dynamics of the targeted interface regions and hence the procedure is aided by the available information on the position. The details of both the 1st and 2nd design iterations for a pure and stable heterodimeric Fc are outlined below.

3.7. 1st design iteration–in silico ranking and lead selection for heterodimer specificity The focus of the 1st design iteration is on predicting and introducing negative design mutations that drive heterodimer specificity. The computational design targeted the positions identified as candidates for introduction of mutations and as a result allowed



7

Fig. 3A. Iterative rational protein engineering strategy for heterodimer Fc design.

focused in silico screening of about 100–1000 variants per design hypothesis. In the initial filtering of the modeled designs, the difference in the scoring metrics for homodimer versus heterodimer was used as a crude measure of specificity. The evaluated metrics listed in Table 1 were compared to the corresponding property of literature mutants such as the KiH and CI designs. The DHLJ, DhLJ metrics was mainly used to estimate large steric clashes designed to disfavor homodimer formation while DAbind, DAcarbon and Dlsys, bind carbon Dlfold were used to estimate the loss of surface complementarity (aka poor packing) in the unfavored ‘hole-hole’ homodimer interface. The surface area analysis was also accompanied by evaluation of ‘‘cavities’’ at the engineered interface using numerical and visual techniques. For electrostatic complementarity designs, the differences in the DHelec, Dhelec metrics between the heterodimer and homodimer were evaluated. In addition to this initial in silico filtering for specificity, all designs were re-evaluated to consider the protein dynamics of the positions targeted for mutagenesis, especially unfavorable bulky side chains could potentially be accommodated as a result of backbone structural rearrangements and result in lower than predicted specificity. Post processing of the variants further involved analysis of computed energies, structural inspection of the computed models and making comparisons to the engineered and the natural WT interface. 3.8. 2nd design iteration–in silico ranking and lead selection for heterodimer stability Fig. 3B. Computational structure function analysis and screening strategy. The illustrated workflow was employed in the 1st and 2nd iteration of the heterodimer Fc design. A different ranking and lead selection procedure was used for the initial negative design and the optimization stages. Detailed examples of the different usage of the ranking and lead selection process are described in the main text. The in silico mutagenesis and packing workflow is described in the methods.

The focus of the 2nd design iteration was the evaluation and selection of lead heterodimeric Fc variants for improved stability and interface affinity. Here again the available information on the potential role of individual residue positions is used to guide selection of positions for computational screening. Following packing


8


Table 2 Example of WT interface analysis summary table. The interface analysis by the residue contacts and network calculation, as described in Fig. 2, is compiled for each position of the interface. In addition, multiple sequence alignment and 1X in silico screening provide an understanding of the ‘mutability’ of each position. In the 1X computational screening, the interface residues are mutated to all other amino acids as a single mutation and analyzed using the metrics described in Table 1. An example of four representative interface residues is shown below. Residue

Residue contacts and interaction network analysis

1X screening–summary of mutability and potential affinity mutations

Multiple sequence alignment

K392

Fluctuating inter-chain h-bond to main chain of residues 398, 399, 400, partially buried, contributes by vdW and h-bond, but can also impact the A–B chain distance and solvent exposure Buried, at C2 axis, intra-chain h-bond, potential importance for packing/folding

Potentially stabilizing hphobic mutations V, L

R, I , L, V, G

Most mutations destabilizing or neutral All mutations are destabilizing

W

All mutations are destabililizing

T, H, F

T394 F405 Y407

Hotspot, buried, high intra and inter-chain 1st and 2nd shell connectivity, cation-p inter-chain interaction with K409, intra-chain p–p interaction with Y407 Hotspot and hydrophobic core of interface, high intra and inter-chain 1st and 2nd shell connectivity, strong inter-chain h-bond with T366 and p–p interaction with Y407 and F405

optimization of candidate designs, the predicted models are filtered and a subset selected by applying cutoffs on the computed interaction energies. The purpose of this initial filter is to only remove variants that exhibit gross steric and electrostatic clashes or unfavorable contact. Approximately 5–10% of the screened variants are removed at this step. In the next step, a combination of ranking by in silico metrics, detailed residue contacts analysis and grouping of similar design hypothesis, and scoring of related designs is used to select the final variants. As exemplified in Supplementary Fig. 1A, the predicted models get ranked on the basis of selections of terms listed in Table 1. For each of the scoring metrics the top scoring 20–40 variants are combined and grouped into structurally distinct categories as exemplified in Figs. 7 and 8. This allows the correlation of the predicted structural details to the different in silico scores and facilitates ranking within each of the design hypothesis groups. Further ranking and lead selection within each design hypothesis group was performed by a qualitative visual inspection of favorable and unfavorable residue contacts (see also Supplementary Fig. 1B for example). 4. Results To design heterodimeric Fc variants with improved purity and stability, we employed an iterative process of hypothesis driven computational design and evaluation followed by experimental screening to select the most successful variants. The results of the iterative rational design and the computational modeling workflow employed (Fig. 3) is described in detail below. 4.1. 1st design iteration: specificity engineering using negative design hypothesis In the 1st iteration of the design workflow, different design hypotheses were independently evaluated for each of the postulated key residue and interaction pairs at the CH3–CH3 interface. Fig. 4 describes the different negative design principles evaluated in the screening process to disfavor the natural homodimer interactions: (1) Steric complementarity: Engineering the Fc interface for steric complementarity relies on introducing mutations that create undesirable steric clashes and low surface complementarity in each of the Fc homodimers, respectively. This approach was used in the generation of the KiH heterodimer where the knobknob and hole-hole homodimers were disfavored [88]. (2) Charge complementarity: The concept of optimizing charge complementarity is based on introducing point mutations that create charge repulsion in one or both of the undesirable homodimers while promoting favorable electrostatic complementarity in the heterodimer. In one case, this can be achieved by inverting the charges

A

of an existing WT salt bridge, as was done in the CI heterodimers [10]. Furthermore, a new stabilizing salt bridge in the heterodimer complex can also be introduced and can be independently used or combined with other altered salt bridges. (3) Hydrophobic design: To further expand on the concept of electrostatic designs, new stabilizing hydrophobic interactions replacing WT electrostatic interactions can be created in the heterodimer, which results in burial of charges and destabilization of the disfavored homodimers. This can be coupled with the design of new stabilizing salt bridges for better specificity. Within the framework of these optimization scenarios we developed and tested four different design hypothesis presented in Fig. 5. Numerous combinations of mutations were screened computationally at the selected interface positions, leading to selection of 16 different variants being tested in vitro. We present here in some detail the rationale behind these designs. The Steric-1 designs aimed to find new complementary steric contacts to replace the existing core contacts at the Fc interface, along the lines of the KiH design in the literature. In one of these designs, two hotspot positions (Y407 and F405) (see Table 2) in the WT Fc CH3 interface were targeted, by introducing A_Y407V_F405A mutations together with B_T394 W to achieve steric complementary (for clarity, mutations on chain-A and chain-B of the Fc are denoted as ‘A_’ and ‘B_’, respectively). The mutations and corresponding structural details are illustrated in Fig. 5A. The in silico scores predict a large potential clash observed as an unfavorable LJ score (DhLJ) (relative to the parent WT CH3 domain interface) for the B:B T394W homodimer and a unfavorable low surface complementarity for the A:A Y407V_F405A homodimer, observed as increased solvent accessibility in the homodimer model structures (Table 3). In line with these observations, in vitro results of this initial variant did not form any A:A homodimer, potentially as a result of significant destabilization, while the B:B homodimer was observed resulting in a heterodimer purity of 85% as observed in SDS–PAGE and LC/MS analysis (data not shown). Targeting the Y407 and F405 positions will likely lead to significant destabilization of the mutant homodimer interface given the significance of the wild type residues in the WT CH3 domain interface. On the other hand, T394 is located at the rim of the interface core, which is a region with higher mobility. As a result, the bulky residues engineered at this position to introduce steric clashes in the corresponding homodimer can potentially be accommodated, yielding a lower than predicted specificity. Visual analysis suggested that strong interactions across the Y407:T366 residue pair were able to form in the chain B homodimer species, potentially contributing to the homodimer formation. In a subsequent iteration, this was addressed by introducing mutation at the B_T366 position (data shown in Table 3 and Table 4) to selectively destabilize the B chain homodimer without significantly



9

Fig. 4. Negative design concepts evaluated in the first iteration of heterodimer design. Illustration of the different negative design approaches based on steric complementarity, electrostatic complementarity and hydrophobic design approaches. The WT interactions are depicted in black and changes to achieve heterodimerization are shown in red. In the initial design phase all interface position were independently evaluated for each of the above design approaches by visual inspection and targeted in silico mutagenesis and structure modeling. While a large number of designs where screened in silico in this manner, only 4 different designs and 16 variants were selected for experimental evaluation.

impacting the heterodimer as indicated by the LJ score of the packed models. This mutation in the Steric-1 variants effectively improved the heterodimer specificity of this design from 85% to 100%. The Electrostatic-1 design illustrated in Fig. 5C is based on the introduction and subsequent inversion of a new stabilizing salt bridge between residues 368 on chain-A and 357 on chain-B resulting in a variant comprising of the A_E357R_L368R __B_L368D_K370E mutations. The homodimers are expected to be disfavored because of electrostatic repulsion of the negativelycharged residues in the B:B homodimer, and the positively-charged residues in the A:A homodimer. This anticipated electrostatic repulsion is observed in the in silico modeling with a large heterodimer vs. A:A homodimer difference shown in Dhelec scores in comparison to the controls (Table 3). The in silico modeling of the heterodimer vs. B:B homodimer did not show as large of an electrostatic repulsion but since the design targeted the buried hydrophobic L368 hotspot position of the WT Fc, it is anticipated that the mutant homodimer and potentially the heterodimer as well would be disfavored. Taking the results together, this variant was selected for in vitro testing and it was confirmed that the resulting Fc domain showed a heterodimer purity of >90% although the in vitro thermal melting temperature decrease to 65 °C. The examination of the MD trajectory of this design suggested that solvent

molecules are able to penetrate and therefore destabilize the hydrophobic core following the mutation of the L368 position. Two additional design hypotheses, (Steric-2 design and Hydrophobic design) were also selected for experimental verification. Briefly, the Steric-2 designs, illustrated in Fig. 5B, are based on a combination of steric complementarity and hydrophobic designs targeting the core WT hotspots Y407 (A_Y407A) and K409 (B_K409F) positions to prevent homodimer formation, while at the same time providing optimal geometry for packing and hydrophobic interactions in the heterodimer. The hydrophobic designs, illustrated in Fig. 5D, are based on a combination of hydrophobic and steric complementarity designs targeting positions at the rim of the interface instead of the hydrophobic core of the WT homodimer. These include replacing a WT salt bridge A_K370:B_E357_S364 with stabilizing hydrophobic contacts (A_K370I:B_E357W_S364F) while penalizing the unpaired charged and hydrophobic groups in the homodimers. This is coupled with a steric complementarity design targeting the WT Y349 (A_Y349A:B_E357W) to further prevent homodimer formation. The experimental results for the hydrophobic design showed high thermal stability, but low heterodimer specificity. This result confirmed the earlier hypothesis (Steric-1 design) that targeting positions at the rim of the interface, while retaining the WT core hotspot interactions, will likely lead to good stability, but lower specificity.


10


Fig. 5. Examples of first iteration lead design selected for experimental testing. The selected first iteration designs and experimental results are summarized in the table. The representative structural models for the different designs are illustrated in A–D. The chain A of the heterodimer is shown in gray, chain B in blue and the mutated residues are highlighted in orange. (A) Illustration of Steric-1 design. The steric complementarity design Steric-1 targets the hotspots Y407 and F405 and prevents homodimers formation by significant unfavorable vdW interactions in the T394W_F405_Y407 homodimer and low surface complementarity in the F405A_Y407V homodimer. (B) Illustration of Steric-2 design. The Steric-2 design uses a combination of steric complementarity and hydrophobic design targeting the core WT hotspots Y407 and K409 to prevent homodimer formation, while at the same time providing good geometry and hydrophobic packing in the preferred heterodimer. (C) Illustration of Electrostatic-1 design. The electrostatic-1 design is based on a predicted new stabilizing salt bridge of L368R that is introduced in the first heterodimer chains, and the inversion of this newly created stabilizing salt bridge on the second heterodimer chain to create a negative design electrostatic repulsion in the mis-paired homodimers. The image on the right represent the identical heterodimer, rotated by 180° at the CH3 domain pairs 2-fold symmetry axis. (D) Illustration of the hydrophobic design. The hydrophobic design is based on a combination of hydrophobic and steric complementarity design targeting positions at the rim of the interface instead of the hydrophobic core of the WT homodimer. The hydrophobic design replaces a WT salt bridge (K370:E357_S364) with stabilizing hydrophobic contacts, creating penalizing desolvation in the mis-paired homodimer. This is coupled with a steric complementarity design targeting the WT Y349 to further prevent homodimer formation.


11


Table 3 In silico filtering of initial negative design variants. In the initial filtering of different targeted designs the preferred heterodimer and the disfavored homodimers were predicted using the in silico workflow. To rank the different designs based on heterodimer specificity, the differential of the metrics DH (LJ), Dh (LJ), DA (binding), Dl (sys), Dl (carbon fold), DH (elec) and Dh (elec) of heterodimer vs. homodimer were used as coarse measure of specificity. As illustrated, the DH (LJ) and Dh (LJ) metrics were mainly used to estimate gross clashes designed to disfavor homodimer formation in steric complementarity design ‘knob’–‘knob’ homodimers (homodimer KK), while the differential in DA (binding) and Dl (sys), Dl (carbon fold) was evaluated to filter the ‘hole’–‘hole’ homodimers (homodimer HH), disfavored because of poor surface complementarity in steric complementarity designs. For electrostatic complementarity designs, the differential of the DH (elec) and Dh (elec) metrics between the heterodimer and homodimer was evaluated (homodimer / and homodimer +/+). The literature variants Knob-into-Holes (KiH control) and charge-inversion (CI control) were used as control benchmarks to estimate the specificity. KiH control

Variant

Steric complementarity design Heterodimer A_T366S_L368A_Y407V__B_T366W Homodimer KK B_T366W__B_T366W Homodimer HH A_T366S_L368A_Y407V__A_T366S_L368A_Y407V Steric-1 Heterodimer

Homodimer Homodimer Homodimer Homodimer

KK KK KK HH

CI control

A_F405A_Y407V__B_T366I_T394W A_F405A_Y407V__B_T366L_T394W A_F405A_Y407V__B_T366T_T394W B_T366I_T394W__B_T366I_T394W B_T366L_T394W__B_T366L_T394W B_T366T_T394W__B_T366T_T394W A_F405A_Y407V__A_F405A_Y407V Variant

DVKBP

Homodimer +/+ Homodimer / Homodimer / Homodimer / Homodimer /

A_E357R_L368R__B_L351D_L368D_K370Q A_E357R_L368R__B_L351D_L368D_K370E A_E357R_L368R__B_L351L_L368D_K370Q A_E357R_L368R__B_L351L_L368D_K370E A_E357R_L368R__A_E357R_L368R B_L351D_L368D_K370Q__B_L351D_L368D_K370Q B_L351D_L368D_K370E__B_L351D_L368D_K370E B_L351L_L368D_K370Q__B_L351L_L368D_K370Q B_L351L_L368D_K370E__B_L351L_L368D_K370E

DHLJ

DhLJ

Dlcarbon fold

Dlsys

DAbind

Dhelec

DHelec

73.9 230.9 467.0

84.2 59.6 334.5

12.4 24.4 31.1

20.9 81.3 17.8

9.4 259.2 426.4

33.7 24.6 229.8

21.4 20.7 195.0

3.1 3.6 12.3

3.6 21.6 9.0

181.2 17.1 46.2 108.3 224.0 272.1 349.0

124.2 116.8 26.1 694.9 799.1 1009.7 332.0

11.8 10.1 15.4 154.1 210.5 110.9 37.2

33.8 28.2 36.7 274.1 442.6 227.4 52.4

32.0 1.1 42.3 379.7 365.9 233.3 424.3

12.7 30.1 26.0 38.0 17.1 16.7 195.8

9.4 10.0 0.2 102.3 90.6 120.2 171.0

2.4 0.6 0.2 8.6 0.5 1.6 4.9

8.8 2.0 3.2 47.8 35.7 30.9 17.8

DvKBP

DHLJ

DhLJ

carbon

Dlsys

Dhelec

DHelec

DVKBP

Electrostatic complementarity design Heterodimer A_K392D_K409D__B_D356K_D399K Homodimer +/+ B_D356K_D399K__B_D356K_D399K Homodimer / A_K392D_K409D__A_K392D_K409D Electrostatic-1 Heterodimer

DvKBP

Dlfold

DAbind

60.1 29.9 87.6

73.2 107.0 55.2

11.2 9.3 4.1

29.3 15.4 37.3

60.4 124.4 232.5

123.7 229.7 105.4

30.9 69.1 179.1

19.2 88.1 57.3

26.8 169.7 53.7

752.1 627.0 457.4 460.8 302.5 772.2 909.5 547.2 752.6

206.6 142.7 124.5 123.1 190.1 453.2 428.2 185.2 172.3

13.1 3.3 6.6 15.1 14.2 12.2 7.1 3.7 2.4

36.1 55.7 49.4 38.9 28.7 62.3 67.4 49.9 49.8

249.1 207.4 161.8 131.6 14.3 428.9 385.4 246.1 230.7

66.7 42.6 28.2 36.4 62.8 125.0 73.2 101.7 103.9

4.5 62.4 72.8 66.5 106.6 79.1 52.9 58.6 57.8

48.2 62.0 31.4 55.5 59.1 14.0 1.9 4.2 2.4

35.2 40.9 35.1 34.6 108.0 46.9 70.7 24.2 39.3

Following in vitro verification, the lead designs were further analyzed in detail for their potential of be optimized to achieve 100% specificity and WT-IgG1 like stability in the 2nd design iteration. The correspondence between the experimental observations and the modeling data was one of the factors in this analysis. Based on these considerations the Steric-1 and Steric-2 designs were deemed to be the lead candidates for further optimization. The Steric-1 heterodimer will be used to illustrate the 2nd iteration design process that led to the identification of the variant A_T350V_L351Y_F405A_Y407V__B_T350V_T366L_K392L_T394W which met the objective of very high purity and WT IgG1 Fc like thermal stability as summarized in Table 4. The Steric-2 design was also successfully pursued in a similar manner, albeit in comparison to the Steric-1 variants, have similar heterodimer stability, but slightly lower heterodimer specificity. 4.2. 2nd design iteration: stability and specificity engineering using positive design hypotheses The key step in the 2nd design iteration of the engineering workflow was the detailed analysis (similar to what was described above for the WT analysis) of the lead variants from the 1st iteration cycle. The primary in silico goal was to understand the structural rationale behind the decreased stability for these variants and to subsequently use targeted positive designs to address the stability shortcomings. It was also anticipated that heterodimers with improved stability will also result in improved specificity. 4.3. Structure–function analysis of Steric-1 variant The detailed analysis of the Steric-1 heterodimer (A_F405A_ Y407V__B_T394 W) showed that important interface interactions

pertaining to core packing and surface complementarity are lost with respect to the wild-type homodimer and these include the inter-chain cation-p and van-der-Waals interaction between A_F405 and B_K409, the strong hydrogen bond between A_Y407 and B_T366 and the p -p and van-der-Waals interactions between A_Y407 and B_F405_Y407. These changes likely led to the decreased stability observed in vitro. In addition, MD analysis showed that the Steric-1 mutations resulted in significantly larger fluctuation (calculated on the basis of RMSF) at the L368 hotspot (see Fig. 6A) critical in stabilizing the interface and furthermore showed a large conformational difference in the loop region comprising of residues D399_S400_D401 and a loss of a strong WT interchain interaction between K409 and D399 (see Fig. 6B). Consequently, optimization of this Steric-1 variant was focused on improving the core packing interactions at positions 366-368407 and stabilizing the 399-401 loop into a closed, WT-like conformation. Different design hypothesis were formulated to optimize the identified regions, as summarized in Table 5. Each of the design hypothesis was independently evaluated and screened using targeted in silico prediction, ranking and lead selection. The employed design workflow is described in detail below for the examples highlighted in bold in Table 5. 4.4. In silico modeling and characterization of the mutations A_L351Y__B_T366L Based on insight derived from structural and computational model analyses, a potential way to optimize the core interactions of the heterodimer involved improving the hydrophobic surface complementarity and packing encompassing the positions A_L368_Y407V and B_T366. To achieve this, all residues coupled


12


Table 4 Optimization of Steric-1 heterodimer variants to achieve WT-IgG1 stability

a

Chain-A

Chain-B

Purity (%)a

Stability (Tm) (°C)a

F405A_Y407V F405A_Y407V L351Y_F405A_Y407V T350V_L351Y_F405A_Y407V T350V_L351Y_F405A_Y407V

T366L_T394W T366L_K392M_T394W T366L_K392M_T394W T350V_T366L_K392M_T394W T350V_T366L_K392L_T394W

>95 >95 >95 >95 >95

70.5 73.5 78.0 79.5 81.5

Purity based on LC/MS characterization and thermal stability based on DSC analyses [17,87].

Fig. 6A. Detailed interface analysis of first iteration lead design. The 20 ns MD trajectory of the WT Fc homodimer and the first iteration lead design after experimental evaluation was subjected to detailed residue contacts and interaction network analysis to understand the structural reason for the significantly lower than WT thermal stability. As illustrated in the overlay of four representative frames of the MD trajectory, the residue L368 displays in the heterodimer variant a high degree of sidechain flexibility. The residue L368 is part of the hydrophobic core of the interface and was identified as hotspot in the WT homodimer. The analysis of the MD trajectory of the heterodimer suggests a non-optimal packing of the hydrophobic core at L368, which is a likely reason for the low thermal stability of the heterodimer. The panel below shows the RMSF for the sidechain (CG,CD1 and CD2) of L368 calculated over the 20 ns trajectory. Subsequently, similar analysis on the 2nd and 3rd iteration designs was performed to confirm the initial hypothesis.

to the L368 hotspot position, which had been identified previously to exhibit high structural fluctuations and poor packing contacts in the initial Steric-1 heterodimer (Fig. 6A), were inspected for the potential to improve packing. The immediate candidates were residues A_L368_Y407V:B_T366 themselves, but the choices for alteration of these positions were limited because of their importance for heterodimer specificity, for example, A_Y407V is critical to disfavor the A:A homodimer. A_L368 is positioned adjacent to A_Y407V and increasing the side chain to larger hydrophobic residues would improve packing but was likely to also stabilize the A:A homodimer with the added hydrophobic contacts. B_T366 is a potential position to target for mutagenesis since, as reported in the section on 1st iteration design of Steric 1, we identified that a larger residue in place of the threonine will likely improve the heterodimer specificity. In addition, B_L351 was identified as being able to potentially position the dynamic L368 side chain into a

fixed conformation (rather than multiple conformations as shown in Fig. 6A) if a large hydrophobic residue is introduced in place of the leucine. Following this rationale, targeted in silico structure prediction and ranking was pursued to concertedly screen the positions A_L351x_L368y_Y407V__B_T366z (x, y, z represent the screened mutations and the triple mutation combinations and permutations thereof). The initial filtering and ranking of these design variants was based on the in silico models and computed scores (see Supplementary Figs. 1A and 1B). The top ranked variants from this filtering step are grouped into structurally distinct categories which allows better correlation of the structural details to the different in silico scores and facilitates ranking within each of the structural groups. The post processing and rationale of selecting the final candidates, based on a combination of in silico scores, structural correlation and detailed inspection of favorable and unfavorable residue contacts, is illustrated in more detail in



13

Fig. 6B. Detailed interface analysis of first iteration lead design. The 20 ns MD trajectory of the WT Fc homodimer and the first iteration lead after experimental evaluation was subjected to detailed residue contacts and interaction network analysis to understand the structural reason for the significantly lower than WT thermal stability. The overlay of the initial heterodimer starting model and a representative frame of the trajectory shows, that over the course of the MD trajectory the important WT salt bridge interaction K409-D399 is lost in the designed heterodimer, thus weakening the interface and exposing the hydrophobic core residues to solvent. Calculation of the RDF of the K409-D399 interaction over the trajectory confirmed the observation: the diagram depicts the distribution of the observed N–H O distances of K409 D399 over the 20 ns simulation. As a consequence, targeting this region in the 2nd iteration of design was evaluated as being critical to retain the WT stability. Subsequently, further analysis on the 2nd and 3rd iteration designs was performed and is shown above as comparison to the initial hypothesis. This shows that the K392M mutation stabilizes the loop conformation over the course of the 20 ns trajectory.

Fig. 7. Category A groups all the top ranked structurally similar models with large aromatic sidechains at position A_T366, which have the potential of forming a strong p–p interaction with B_L351Y/F, whereas Category B groups top ranked variants with aliphatic sidechains at A_T366. Within each group the top one or two candidates are selected based on the in silico metric and the detailed inspection of favorable and unfavorable residue contacts. The results of the further close inspection and the rational for the selected lead candidates for experimental testing are summarized in Fig. 7. Some of the selected variants were analyzed by MD simulation to test the initial hypothesis that lead to the identification of L368 and the targeted designs (Fig. 6A). The comparison of the RMSF/Bfactors of the different variants shows a high fluctuation for the initial heterodimer (A_F405A_Y407V__B_T394W) and also for variants targeting different parts of the structure (A_ F405A_Y40 7V__B_K392V_T394W; A_ F405A_Y4 07V__B_K392M_T394W) while the two variants that contain the predicted stabilizing A_L351Y:B_T366L mutations show a lower levels of RMSF, comparable to WT-IgG1. Some of the selected variants were analyzed by MD simulation to test the initial hypothesis of the destabilized B_K409:A_D399 salt bride in the initial Steric-1 variant. As shown in Fig. 6B, the

comparison of the RDF of the salt bridge interaction for the initial heterodimer (A_F405A_Y407V__B_T394W) and WT-IgG1 to the two lead variants that contain the predicted stabilizing B_K392M mutation confirms the hypothesis. Two other variants, one targeting a different region of the interface (A_ L351Y_F405A_Y407V__B_T366L_T394W) and one that exhibit reduced stabilization of the salt bridge (A_ F405A_Y407V__B_K3 92V_T394 W), show significant experimental de-stabilization, corroborating the initial hypothesis and targeted design strategy. 4.5. In silico modeling and characterization of the mutation K392M Based on in silico analyses, a potential way to stabilize the 399-401 loop was recognized to involve targeting the K392 position of the heterodimer. In the WT CH3–CH3 interface the hotspot B_K409 stabilizes the conformation of the 399-401 loop by an electrostatic interaction with A_D399 and a cation-p and van-der-Waals interaction with A_F405. The cation-p and vander-Waals interactions with A_F405 are lost in the heterodimer due to the mutations of A_F405A. To regain the favorable packing interactions in the heterodimer, residues in close proximity are evaluated by visual inspection, and the positions B_K392 and A_F405A in close contact with the A_D399:B_K409 salt


14


Table 5 Summary table of targeted positive design approaches for improving the stability of 1st iteration Steric-1 lead variant. The design hypothesis (1) and (2) are based on the analysis described in Fig. 6A and 6B. The design hypothesis (3) targets independent positive design mutations that can potentially provide additive stabilization effects. Design rationale

Design hypothesis

Targeted designs for stability

Experimentally tested variants*

(1) Stabilization of 399400 loop conformation

Stabilization by improved core packing

a. Additional hydrophobic interactions at positions 392-405 b. Additional electrostatic interactions at exposed loop (evaluated positions: 390, 400, 411) c. Replacement of destabilized salt bridge 409-399 by hydrophobic interactions d. Replacement of destabilized salt bridge 409-399 by improved electrostatic interactions a. Optimization of core packing at positions 351-366-368 b. Optimization of exposed electrostatic rim at positions 364, 370 c. Replacement of exposed electrostatic rim by hydrophobic interactions at positions 364, 370 a. Additional 1X/2X design independent stabilization mutations

4 variants, 1 successful 5 variants, no significant improvement 4 variants, no improvement 5 variants, mostly destabilized

Stabilization by introducing solvent exposed electrostatic interactions at loop Stabilization by introducing additional hydrophobic contacts Stabilization by improved partially buried salt-bridge or electrostatic interactions

*

(2) Stabilization of core packing at residues 366-368

Stabilization by improved core packing

(3) Stabilization by independent optimization

Stabilization by additive effect of design independent 1X/2X mutations (e.g. additional polar, long range electrostatic, hydrophobic, etc. interactions) Stability improvement by stabilization of CH3 fold

Stabilization by improved electrostatic interactions protecting the hydrophobic core Stabilization by additional hydrophobic contacts

b. Second shell residue optimization

4 variants, 1 successful 5 variants, mostly destabilized 5 variants, mostly destabilized 6 variants, 2 successful 2 variants, 1 successful

The successful designs that are described in detail in the main text are highlighted in bold.

Fig. 7. In silico ranking and selection of lead variants – example L351Y_T366L. The top ranked variants predicted by the in silico workflow as described in Supplementary Fig. 1 are further ranked using a structure guided post processing. The in silico hits are grouped into structurally distinct categories which allows better correlation of the structural details to the different in silico scores and facilitates ranking within each of the structural groups: Category A groups structurally similar models with large aromatic sidechains at position A_T366, which have the potential of forming a strong p–p interaction with B_L351Y,F, whereas Category B combines top ranked models with smaller aliphatic sidechains at A_T366. Within the two groups, the models are further ranked by the different in silico metrics (see Supplementary Fig. 1A) and close inspection of the favorable and unfavorable inter- and intra-chain interactions as also illustrated in Fig. 2 and Supplementary Fig. 1B. The results of the further close inspection and the rational for the selected lead candidates for experimental testing are summarized.



15

Fig. 8. In silico ranking and selection of lead variants – example K392M. To achieve stabilization of the weakened A_D399:B_K409 salt bridge interaction in the initial Steric-1 design (see Fig. 6B), in one approach, the positions in close proximity to the salt bridge, B_K392 and A_F405, were screened by the in silico workflow for improved hydrophobic packing (positions A_F405 and B_K392 were mutated to all other hydrophobic residues). The predicted variants were filtered and ranked in the same manner as described in the previous example (Supplementary Fig. 1 and Fig. 7). The top ranked models from each in silico scoring metrics were combined and grouped into distinct categories, based on structural rational. As illustrated above, the top ranked variants can be represented in four distinct groups (A–D): Category A represents structures that contain the mutation A_F405M and different B_K392x mutations, the characteristic of this design group being the sidechain conformation of A_F405M, providing a potentially strong interchain interaction with B_K409. Category B shows structures that contain a large mutation at B_K392 e.g. F, W, Y, L, with the predicted sidechain conformations of all variants being solvent exposed, pointing into solution. Category C and D represent the mutation B_K392M, with the difference that in C the predicted methionine sidechain conformation packs favorably towards the B_K409:A_D399 salt bride, whereas in D the methionine sidechain conformation is predicted to face the solvent. Within the two groups, the models are further ranked by the different in silico metrics and close inspection of the favorable and unfavorable inter- and intra-chain interactions. The results of the further close inspection and the rationale for the selected lead candidates for experimental testing are summarized.

bridge were selected and screened as hydrophobic double mutation changes (A_F405x__B_K392y) in the context of the Steric-1 variant. Since the mutation A_F405A was identified as critical for heterodimer specificity in the 1st design iteration, F405x was only mutated to smaller sidechains and not large hydrophobic amino acids. The in silico screening and ranking was done similar to the previous example. Specifically, as illustrated in detail in Fig. 8, the top ranked variants can be represented in four distinct categories (A–D): Category A represents structures that contain the mutation A_F405M and different B_K392x mutations, the characteristic of this design group being the sidechain conformation of A_F405M, providing a potentially strong interchain interaction with B_K409 and B_T394W. Category B shows structures that contain a large mutation at B_K392 e.g. F, W, Y, L, with the predicted sidechain conformations of all variants being solvent exposed, pointing into solution. Category C and D represent the mutation B_K392M, Category C packed the predicted methionine sidechain conformation favorably towards the A_D399__B_K409 salt bride, whereas in Category D the methionine sidechain conformation is predicted to face the solvent, likely because of a larger mutation at A_F405x. Within each category the top one or two candidates are selected on the basis of in silico scores and detailed inspection of favorable and unfavorable residue contacts. The results of the further close inspection and rationale of selected lead candidates for experimental testing are summarized in the Fig. 8.

4.6. In silico modeling and characterization of the mutation T350V The employed engineering approach to improve the heterodimer stability was not limited to introducing mutations that increase complementarity across the two chains. We also evaluated mutations of amino acids that are not directly contacting the complementary chain as a means to improve the stability of the Fc heterodimer. Specifically, we computationally screened all second shell interface residues using the intra and inter-domain KBP metrics, after applying a coarse filter to remove large structural clashes (see Fig. 9). The top ranked variants were also evaluated in light of the detailed WT analyses and sequence comparisons described above. The second shell residue T350 was ranked highly in the scoring and showed in addition a high connectivity in the residue contact analysis. In addition, sequence comparison highlighted the presence of V350 and L350 in other Ig homologs. Based on this analysis the mutation T350V was selected, and the experimental results showed improved stability of the heterodimeric CH3 domain by 2 °C (Tm). Interestingly, earlier work based on Ala scanning mutagenesis and structural modeling had indicated that this T350 position was not critical for stabilizing the CH3 domain in Fc [89]. In addition to the described targeted hypothesis driven design, we also evaluated the potential for independent stabilizing mutations (Table 5). In one approach, 1 (single mutation) and 2X (double mutation combinations) computational screening using an in silico workflow was employed to identify stabilizing muta-


16


Fig. 9. In silico prediction and ranking of second shell mutations – example T350V. All second shell positions that are not directly contacting the complementary chain of the CH3 interface were computationally screened by in silico mutagenesis to all other residues. The predicted variants were ranked on the in silico metrics DVKBP and DvKBP and the top ranked variants were further inspected and evaluated with respect to the detailed WT analysis and sequence comparison (Table 2). The results are summarized in the table above. The second shell position T350 was both highly ranked in the in silico scoring and showed a high connectivity in the residue contact analysis. The positions and residues in bold were chosen for experimental evaluation. The experimental results for the T350V mutation showed improved stability of the CH3 domain by >2 °C (Tm).

tions. Out of the six selected designs, two were found to improve the thermal stability by 1–2 °C (Tm). Despite the improvement, the identified mutations were not used in the final successful design because of their solvent exposed nature. 4.7. 3rd design iteration The results of the 2nd design iteration for optimization of the Steric-1 design are summarized in Table 4, showing that 5 out of the 40 variants tested, displayed improvement in thermal stability. After identification of the successful design hypothesis of the 2nd iteration, the 3rd design iteration was focused on re-evaluation of similar variants within the same design hypothesis that were not selected in the previous iteration. For example, as a potential improvement to the experimentally tested B_K392M, the mutation B_K392L, which was predicted by the in silico workflow to be solvent exposed and was thus not selected in the 2nd design iteration (Fig. 8), was re-evaluated and selected for experimental validation in the 3rd iteration. As illustrated in Table 4 and Table 5, the mutation B_K392L gave an additional improvement in thermal stability by 2 °C. Subsequent engineering was based on the assumption that certain targeted regions and design rationale (1–3 in Table 4) were independent and could be combined. In the 3rd design iteration the successful mutations of the different independent designs were combined, yielding the lead Steric-1 heterodimer variant (A_T350V_L351Y_F405A_Y407V__B_T350V_T366L_K392L_T394W) with the observed thermal stability of >80 °C. The lead designs of this iteration demonstrated a heterodimer specificity of greater than 98%, the other 2% being half-antibodies of the two chains but no detectable amount of homodimer contaminants across a range of relative expression levels of chain A and B [17]. 5. Discussion Engineered antibodies have enormous potential for treating a wide variety of diseases. Traditionally, experimental library screening has been used as the primary technique in antibody

engineering, with a dominant focus on antibody-antigen interaction. The use of library screening techniques has been more limited in the design of protein–protein interfaces for altered specificity, because of the limited sequence space that can be explored and the requirement of a specific protocol to screen for both the affinity of the desired and the undesired complexes [43,90]. Specific optimizations to combine library screening and computational methods to provide an enriched library with a higher probability of success have been pursued. In the KiH heterodimer design, for example, experimental library screening was used in the second iteration to optimize the initial heterodimer for stability. The approach was only partially successful, since it improved the Tm of the initial variant by 4°C to 69.4 °C [8]. Computational protein engineering has been used successfully in protein engineering for altered specificity and protein design and they are poised to make significant contributions to the development of protein therapeutics [43,91]. Despite the success of the computational methods, recent protein design studies and community wide assessments in protein-interface engineering and design highlighted the significant limitations of in silico structure prediction and scoring methods and the associated overall low predictability, particularly in the case of difficult problems of practical utility requiring higher order mutations [92–94]. To overcome these limitations, in the context of the heterodimeric Fc design we have developed an iterative problem-centric approach of rational structure based design and focused computational screening. In this iterative approach, based on the detailed understanding of the system in the context of the protein engineering problem, rational structure guided hypothesis towards an engineering solution are developed, and subsequently evaluated and refined using computational methods. Specifically, in the first iteration of the heterodimer Fc design, this was achieved by a hypothesis driven design and in the second design iteration, by a focused approach to improve stability of the lead heterodimer variants. This approach identified non-obvious interface regions and candidate positions that proved to be critical in re-engineering the WT IgG1 like stability. In the subsequent in silico prediction and ranking we have used a number of different metrics, including physics and knowledge based potentials to evaluate both stability and



affinity attributes of the system. The in silico scoring is complemented with an integrated post-processing procedure involving detailed structural analysis and evaluation of favorable and unfavorable interactions and residue connectivity, using a variety of in-house developed tools. Similarly, detailed structural analysis and inspections are frequently recommended in the post processing and ranking procedures in the small molecule structure based virtual screening efforts [95]. For example, the screening and ranking of the A_F405x__ B_K392y variant that identified one of the critical stabilization mutations (described in Fig. 8), highlights both the strength and limitations of a computational approach. Specifically, with the current force fields and energy functions used for modeling, many of the top scoring variants (Category A and B in Fig. 8) will likely present the introduced mutation in a solvent exposed conformation. As a result of the large contact surface with the exposed loop of the complementary chain, these variants are likely to present favorable in silico scores although, in reality, because of the solvent exposure the mutations are not likely to significantly stabilize the heterodimer. The example illustrates the importance of considering the local structural context in the interpretation of a modeled protein mutation optimization. Rigorous coupling of in silico modeling with protein structural biology oriented analysis is necessary to recognize such issues. This example also illustrates the importance of developing an awareness of the limitations and potential errors that may be introduced in the use of modeling technologies [96]. In order to ensure the effective use of computational tools, the methods have to be carefully validated for use at each stage and region of the protein structure. Particularly, the uncertainties and limitations of a computational method are thoroughly explored prior to applying the method and coupled with detailed empirical evaluation and understanding of the system of interest and region where it is to be applied. In many instances, a protein-engineering project requires the development of new, task specific novel computational workflows calling upon unique aspects of the algorithms and theoretical techniques employed. In this context, the development of these tools is most effectively done through a close collaboration between protein engineers who are familiar with the specific needs of the engineering task, and computational and theoretical scientists who bring expertise in molecular biophysics and software development and application. In conclusion, we have engineered a novel heterodimeric Fc that retains native IgG1 like purity and biophysical properties in order to develop a best-in-class solution for the design of bispecific multifunctional therapeutic antibody molecules. The novel protein engineering method employed here relies on combining human expertise for rationalization and hypothesis generation towards a protein engineering problem with state-of-the-art computational protein modeling and simulation to drive focused optimization in systems that are hard to address using traditional approaches. Acknowledgments We acknowledge Dr. David Poon for kindly reading over the manuscript and providing valuable advice. We thank Drs. Anders Ohrn and Greg Lakatos for contributing to the preparation of the manuscript and for valuable discussions on application of molecular simulation techniques. Appendix A. Supplementary data Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.ymeth.2013. 10.016.

17

References [1] R. Kontermann, mAbs 4 (2012) 182–197. [2] C. Milstein, A.C. Cuello, Nature 305 (1983) 537–540. [3] H. Lindhofer, R. Mocikat, B. Steipe, S. Thierfelder, J. Immunol. 155 (1995) 219– 225. [4] C. Klein, C. Sustmann, M. Thomas, K. Stubenrauch, R. Croasdale, J. Schanzer, U. Brinkmann, H. Kettenberger, J.T. Regula, W. Schaefer, mAbs 4 (2012). [5] A.M. Merchant, Z. Zhu, J.Q. Yuan, A. Goddard, C.W. Adams, L.G. Presta, P. Carter, Nat. Biotechnol. 16 (1998) 677–681. [6] M.J. Gramer, E.T. van den Bremer, M.D. van Kampen, A. Kundu, P. Kopfmann, E. Etter, D. Stinehelfer, J. Long, T. Lannom, E.H. Noordergraaf, J. Gerritsen, A.F. Labrijn, J. Schuurman, P.H. van Berkel, P.W. Parren, mAbs 5 (2013). [7] P. Strop, W.H. Ho, L.M. Boustany, Y.N. Abdiche, K.C. Lindquist, S.E. Farias, M. Rickert, C.T. Appah, E. Pascua, T. Radcliffe, J. Sutton, J. Chaparro-Riggers, W. Chen, M.G. Casas, S.M. Chin, O.K. Wong, S.H. Liu, G. Vergara, D. Shelton, A. Rajpal, J. Pons, J. Mol. Biol. 420 (2012) 204– 219. [8] S. Atwell, J.B. Ridgway, J.A. Wells, P. Carter, J. Mol. Biol. 270 (1997) 26–35. [9] P. Carter, J. Immunol. Methods 248 (2001) 7–15. [10] K. Gunasekaran, M. Pentony, M. Shen, L. Garrett, C. Forte, A. Woodward, S.B. Ng, T. Born, M. Retter, K. Manchulenko, H. Sweet, I.N. Foltz, M. Wittekind, W. Yan, J. Biol. Chem. 285 (2010) 19637–19646. [11] J.H. Davis, C. Aperlo, Y. Li, E. Kurosawa, Y. Lan, K.M. Lo, J.S. Huston, Protein Eng. Des. Sel. 23 (2010) 195–202. [12] M. Muda, A.W. Gross, J.P. Dawson, C. He, E. Kurosawa, R. Schweickhardt, M. Dugas, M. Soloviev, A. Bernhardt, D. Fischer, J.S. Wesolowski, C. Kelton, B. Neuteboom, B. Hock, Protein Eng. Des. Sel. 24 (2011) 447–454. [13] M.J. Thies, J. Mayer, J.G. Augustine, C.A. Frederick, H. Lilie, J. Buchner, J. Mol. Biol. 293 (1999) 67–79. [14] M.J. Thies, F. Talamo, M. Mayer, S. Bell, M. Ruoppolo, G. Marino, J. Buchner, J. Mol. Biol. 319 (2002) 1267–1277. [15] A. McAuley, J. Jacob, C.G. Kolvenbach, K. Westland, H.J. Lee, S.R. Brych, D. Rehder, G.R. Kleemann, D.N. Brems, M. Matsumura, Protein Sci. 17 (2008) 95– 106. [16] S.J. Demarest, S.M. Glaser, Curr. Opin. Drug Discov. Devel. 11 (2008) 675–687. [17] T.S. Von Kreudenstein, E. Escobar-Carbrera, P.I. Lario, I. D’Angelo, K. Brault, J. Kelly, Y. Durocher, J. Baardsnes, R.J. Woods, M.H. Xie, P.A. Girod, M.D. Suits, M.J. Boulanger, D.K. Poon, G.Y. Ng, S.B. Dixit, mAbs 5 (2013) 646–654. [18] R.J. Rose, A.F. Labrijn, E.T. van den Bremer, S. Loverix, I. Lasters, P.H. van Berkel, J.G. van de Winkel, J. Schuurman, P.W. Parren, A.J. Heck, Structure 19 (2011) 1274–1282. [19] D.N. Bolon, R.A. Grant, T.A. Baker, R.T. Sauer, Proc. Natl. Acad. Sci. USA 102 (2005) 12724–12729. [20] J.J. Havranek, P.B. Harbury, Nat. Struct. Biol. 10 (2003) 45–52. [21] P.S. Huang, J.J. Love, S.L. Mayo, Protein Sci. 16 (2007) 2770–2774. [22] L. Pauling, R.B. Corey, Proc. Natl. Acad. Sci. USA 37 (1951) 729–740. [23] F.H. Crick, Nature 170 (1952) 882–883. [24] G.N. Ramachandran, C. Ramakrishnan, V. Sasisekharan, J. Mol. Biol. 7 (1963) 95–99. [25] A. Fersht, Structure and Mechanism in Protein Science. A Guide to Enzyme Catalysis and Protein Folding, Macmillan, 1999. [26] D.E. Koshland Jr., K.E. Neet, Annu. Rev. Biochem. 37 (1968) 359–410. [27] S.R. Tzeng, C.G. Kalodimos, Curr. Opin. Struct. Biol. 21 (2011) 62–67. [28] R.O. Dror, R.M. Dirks, J.P. Grossman, H. Xu, D.E. Shaw, Annu. Rev. Biophys. 41 (2012) 429–452. [29] T.J. Lane, D. Shukla, K.A. Beauchamp, V.S. Pande, Curr. Opin. Struct. Biol. 23 (2013) 58–65. [30] T. Schlick, F1000 Biol. Rep. 1 (2009) 48. [31] T. Schlick, F1000 Biol. Rep. 1 (2009) 51. [32] K. Lindorff-Larsen, S. Piana, R.O. Dror, D.E. Shaw, Science 334 (2011) 517–520. [33] J. Desmet, J. Spriet, I. Lasters, Proteins 48 (2002) 31–43. [34] D.B. Gordon, G.K. Hom, S.L. Mayo, N.A. Pierce, J. Comput. Chem. 24 (2003) 232– 243. [35] M.V. Shapovalov, R.L. Dunbrack Jr., Structure 19 (2011) 844–858. [36] I.W. Davis, W.B. Arendall 3rd, D.C. Richardson, J.S. Richardson, Structure 14 (2006) 265–274. [37] M. Cahill, S. Cahill, K. Cahill, Biophys. J. 82 (2002) 2665–2670. [38] D.J. Mandell, T. Kortemme, Curr. Opin. Biotechnol. 20 (2009) 420–428. [39] E.A. Coutsias, C. Seok, M.P. Jacobson, K.A. Dill, J. Comput. Chem. 25 (2004) 510– 528. [40] A.A. Canutescu, R.L. Dunbrack Jr., Protein Sci. 12 (2003) 963–972. [41] X. Li, M.P. Jacobson, R.A. Friesner, Proteins 55 (2004) 368–382. [42] S.M. Lippow, B. Tidor, Curr. Opin. Biotechnol. 18 (2007) 305–311. [43] D.J. Mandell, T. Kortemme, Nat. Chem. Biol. 5 (2009) 797–807. [44] E. Karaca, A.S. Melquiond, S.J. de Vries, P.L. Kastritis, A.M. Bonvin, Mol. Cell. Proteomics 9 (2010) 1784–1794. [45] M. Petukhov, D. Cregut, C.M. Soares, L. Serrano, Protein Sci. 8 (1999) 1982– 1989. [46] L. Jiang, B. Kuhlman, T. Kortemme, D. Baker, Proteins 58 (2005) 893–904. [47] D.C. Rapaport, The Art of Molecular Dynamics Simulation, University Press, Cambridge, 2004. [48] N. Tokuriki, F. Stricher, J. Schymkowitz, L. Serrano, D.S. Tawfik, J. Mol. Biol. 369 (2007) 1318–1332.


18


[49] D.A. Case, T.E. Cheatham 3rd, T. Darden, H. Gohlke, R. Luo, K.M. Merz Jr., A. Onufriev, C. Simmerling, B. Wang, R.J. Woods, J. Comput. Chem. 26 (2005) 1668–1688. [50] V. Hornak, R. Abel, A. Okur, B. Strockbine, A. Roitberg, C. Simmerling, Proteins 65 (2006) 712–725. [51] X. Zhu, P.E. Lopes, A.D. Mackerell Jr., Wiley Interdiscip. Rev. Comput. Mol. Sci. 2 (2012) 167–185. [52] W.D. Cornell, P. Cieplak, C.I. Bayly, I.R. Gould, K.M. Merz, D.M. Ferguson, D.C. Spellmeyer, T. Fox, J.W. Caldwell, P.A. Kollman, J. Am. Chem. Soc. 117 (1995) 5179–5197. [53] C.M. Baker, E. Darian, A.D. MacKerell Jr, Innovations in Biomolecular Modeling and Simulations, 1, The Royal Society of, Chemistry, 2012, pp. 23–50. [54] W.L. Jorgensen, J. Tirado-Rives, Proc. Natl. Acad. Sci. USA 102 (2005) 6665– 6670. [55] N.A. Baker, Methods Enzymol. 383 (2004) 94–118. [56] A. Onufriev, D. Bashford, D.A. Case, Proteins 55 (2004) 383–394. [57] N. Pokala, T.M. Handel, Protein Sci. 13 (2004) 925–936. [58] R.M. Levy, L.Y. Zhang, E. Gallicchio, A.K. Felts, J. Am. Chem. Soc. 125 (2003) 9523–9530. [59] J. Chen, C.L. Brooks 3rd, Phys. Chem. Chem. Phys. 10 (2008) 471–481. [60] J. Wereszczynski, J.A. McCammon, Q. Rev. Biophys. 45 (2012) 1–25. [61] C.D. Christ, A.E. Mark, W.F. van Gunsteren, J. Comput. Chem. 31 (2010) 1569– 1582. [62] B.Æ. Roux, Comput. Phys. Commun. 91 (1995) 275–282. [63] C. Chipot, A. Mark, V. Pande, T. Simonson, in: C. Chipot, A. Pohorille (Eds.), Free Energy Calculations, Springer, Berlin Heidelberg, 2007, pp. 463–501. [64] M.K. Gilson, H.X. Zhou, Annu. Rev. Biophys. Biomol. Struct. 36 (2007) 21–42. [65] H. Gohlke, D.A. Case, J. Comput. Chem. 25 (2004) 238–250. [66] P. Kalra, A. Das, B. Jayaram, Appl. Biochem. Biotechnol. 96 (2001) 93–108. [67] A. Leaver-Fay, M.J. O’Meara, M. Tyka, R. Jacak, Y. Song, E.H. Kellogg, J. Thompson, I.W. Davis, R.A. Pache, S. Lyskov, J.J. Gray, T. Kortemme, J.S. Richardson, J.J. Havranek, J. Snoeyink, D. Baker, B. Kuhlman, Methods Enzymol. 523 (2013) 109–143. [68] A.A. Polyansky, R. Zubac, B. Zagrovic, Methods Mol. Biol. 819 (2012) 327–353. [69] H. Schafer, L.J. Smith, A.E. Mark, W.F. van Gunsteren, Proteins 46 (2002) 215– 224. [70] H.L. Woodcock, W. Zheng, A. Ghysels, Y. Shao, J. Kong, B.R. Brooks, J. Chem. Phys. 129 (2008) 214109. [71] S. Vajda, M. Sippl, J. Novotny, Curr. Opin. Struct. Biol. 7 (1997) 222–228. [72] J. Zhang, Y. Zhang, PLoS One 5 (2010) e15386. [73] H. Li, A.D. Robertson, J.H. Jensen, Proteins 61 (2005) 704–721. [74] J.-P. Ryckaert, G. Ciccotti, H.J.C. Berendsen, J. Comput. Phys. 23 (1977) 327– 341. [75] W.L. Jorgensen, J. Chandrasekhar, J.D. Madura, R.W. Impey, M.L. Klein, J. Chem. Phys. 79 (1983) 926–935. [76] I.G. Tironi, R. Sperb, P.E. Smith, W.F. van Gunsteren, J. Chem. Phys. 102 (1995) 5451–5459. [77] H.J.C. Berendsen, J.P.M. Postma, W.F. van Gunsteren, A. DiNola, J.R. Haak, J. Chem. Phys. 81 (1984) 3684–3690. [78] P. Eastman, M.S. Friedrichs, J.D. Chodera, R.J. Radmer, C.M. Bruns, J.P. Ku, K.A. Beauchamp, T.J. Lane, L.P. Wang, D. Shukla, T. Tye, M. Houston, T. Stich, C. Klein, M.R. Shirts, V.S. Pande, J. Chem. Theory Comput. 9 (2013) 461–469.

[79] J.A. Izaguirre, C.R. Sweet, V.S. Pande, Biocomputing 2010, World Scientific, 2009, pp. 240–251. [80] Schrodinger LLC., 2010. [81] W. Humphrey, A. Dalke, K. Schulten, J. Mol. Graph. 14 (33–38) (1996) 27–38. [82] Z. Xiang, B. Honig, J. Mol. Biol. 311 (2001) 421–430. [83] G.D. Hawkins, C.J. Cramer, D.G. Truhlar, J. Phys. Chem. 100 (1996) 19824– 19839. [84] A.R. Leach, Molecular Modelling : Principles and Applications, Prentice Hall, Harlow, England, New York, 2001. [85] H. Edelsbrunner, P. Koehl, Proc. Natl. Acad. Sci. USA 100 (2003) 2203–2208. [86] R. Bryant, H. Edelsbrunner, P. Koehl, M. Levitt, Discrete Comput. Geom. 32 (2004) 293–308. [87] R.J. Woods, M.H. Xie, T.S. Von Kreudenstein, G.Y. Ng, S.B. Dixit, mAbs 5 (2013) 711–722. [88] J.B. Ridgway, L.G. Presta, P. Carter, Protein Eng. 9 (1996) 617–621. [89] W. Dall’Acqua, A.L. Simon, M.G. Mulkerrin, P. Carter, Biochemistry 37 (1998) 9266–9273. [90] T.S. Chen, A.E. Keating, Protein Sci. 21 (2012) 949–963. [91] G.L. Gilliland, J. Luo, O. Vafa, J.C. Almagro, Methods Mol. Biol. 841 (2012) 321– 349. [92] S.J. Fleishman, T.A. Whitehead, D.C. Ekiert, C. Dreyfus, J.E. Corn, E.M. Strauch, I.A. Wilson, D. Baker, Science 332 (2011) 816–821. [93] S.J. Fleishman, T.A. Whitehead, E.M. Strauch, J.E. Corn, S. Qin, H.X. Zhou, J.C. Mitchell, O.N. Demerdash, M. Takeda-Shitaka, G. Terashi, I.H. Moal, X. Li, P.A. Bates, M. Zacharias, H. Park, J.S. Ko, H. Lee, C. Seok, T. Bourquard, J. Bernauer, A. Poupon, J. Aze, S. Soner, S.K. Ovali, P. Ozbek, N.B. Tal, T. Haliloglu, H. Hwang, T. Vreven, B.G. Pierce, Z. Weng, L. Perez-Cano, C. Pons, J. Fernandez-Recio, F. Jiang, F. Yang, X. Gong, L. Cao, X. Xu, B. Liu, P. Wang, C. Li, C. Wang, C.H. Robert, M. Guharoy, S. Liu, Y. Huang, L. Li, D. Guo, Y. Chen, Y. Xiao, N. London, Z. Itzhaki, O. Schueler-Furman, Y. Inbar, V. Potapov, M. Cohen, G. Schreiber, Y. Tsuchiya, E. Kanamori, D.M. Standley, H. Nakamura, K. Kinoshita, C.M. Driggers, R.G. Hall, J.L. Morgan, V.L. Hsu, J. Zhan, Y. Yang, Y. Zhou, P.L. Kastritis, A.M. Bonvin, W. Zhang, C.J. Camacho, K.P. Kilambi, A. Sircar, J.J. Gray, M. Ohue, N. Uchikoga, Y. Matsuzaki, T. Ishida, Y. Akiyama, R. Khashan, S. Bush, D. Fouches, A. Tropsha, J. Esquivel-Rodriguez, D. Kihara, P.B. Stranges, R. Jacak, B. Kuhlman, S.Y. Huang, X. Zou, S.J. Wodak, J. Janin, D. Baker, J. Mol. Biol. 414 (2011) 289–302. [94] R. Moretti, S.J. Fleishman, R. Agius, M. Torchala, P.A. Bates, P.L. Kastritis, J.P. Rodrigues, M. Trellet, A.M. Bonvin, M. Cui, M. Rooman, D. Gillis, Y. Dehouck, I. Moal, M. Romero-Durana, L. Perez-Cano, C. Pallara, B. Jimenez, J. FernandezRecio, S. Flores, M. Pacella, K. Praneeth, J.J. Gray, P. Popov, S. Grudinin, J. Esquivel-Rodriguez, D. Kihara, N. Zhao, D. Korkin, X. Zhu, O.N. Demerdash, J.C. Mitchell, E. Kanamori, Y. Tsuchiya, H. Nakamura, H. Lee, H. Park, C. Seok, J. Sarmiento, S. Liang, S. Teraguchi, D.M. Standley, H. Shimoyama, G. Terashi, M. Takeda-Shitaka, M. Iwadate, H. Umeyama, D. Beglov, D.R. Hall, D. Kozakov, S. Vajda, B.G. Pierce, H. Hwang, T. Vreven, Z. Weng, Y. Huang, H. Li, X. Yang, X. Ji, S. Liu, Y. Xiao, M. Zacharias, S. Qin, H.X. Zhou, S.Y. Huang, X. Zou, S. Velankar, J. Janin, S.J. Wodak, D. Baker, Proteins (2013). [95] C. Bissantz, B. Kuhn, M. Stahl, J. Med. Chem. 53 (2010) 5061–5084. [96] T. Stouch, J. Comput. Aided Mol. Des. 26 (2012) 125–134. [97] R.L. Dunbrack Jr., M. Karplus, J. Mol. Biol. 230 (1993) 543–574.


Protein engineering for molecular electronics.

New molecular biology methods for protein engineering.

Editorial: Biochemical and molecular engineering.

Fc Engineering for Developing Therapeutic Bispecific Antibodies and Novel Scaffolds.

Protein engineering: Beating the odds.

Protein engineering.

Protein engineering.

Protein engineering.

Protein engineering.

Protein engineering of antibodies.

Protein engineering of Rubisco.

Protein engineering of amylases.

Protein engineering.

Protein engineering: The power of four.

Protein engineering of cellulases.

Protein engineering.

Engineering of Immunoglobulin Fc Heterodimers Using Yeast Surface-Displayed Combinatorial Fc Library Screening.

Engineering therapeutic protein disaggregases.

Protein engineering for thermostability.

Protein engineering in haemoglobin.

Synchrotron use for materials science and engineering-the next decade.

Protein disulfide engineering.

Plant food protein engineering.

Cell biology of sarcomeric protein engineering: disease modeling and therapeutic potential.