Protein Targets for Structure-Based Drug Design

Malcolm D. Walkinshaw
Preclinical Research, Sandoz Pharma AG, CH-4002 Basel, Switzerland

Medicinal Research Reviews, Vol. 12, No. 4, 317-372 (1992) CCC 0198-6325/92/040317-56$04.00 © 1992 John Wiley & Sons, Inc.

Scope of the Review
I. Immunology
   A. Overview
   B. Histocompatibility Molecules
      1. X-Ray Structure of HLA-A2
      2. X-Ray Structure of HLA-Aw68
   E. Immunoglobulins
   F. Cytokines
[...]tors (and GF Receptors)
III. [...]
   E. Tumor Necrosis Factor (TNF)
IV. Blood and Circulation
   A. Blood Clotting
      3. Prothrombin
   B. Serpins
   C. Plasminogen Activators
V. [...]
   B. Picornaviruses
      1. Viral Cell Attachment
      2. The Common Cold: Human Rhinovirus
      3. Mengo Virus
      4. Polio Virus
   C. AIDS
      1. Reverse Transcriptase
      2. HIV Aspartate Proteases (PR)
VI. Inflammation and Respiratory Diseases
   A. Phospholipase A2 (PLA2)
   B. Emphysema
VII. [...]
   A. Antibiotics
      [...]tidase
      2. Class A β Lactamase of Bacillus licheniformis
      3. Class A β Lactamase from Staphylococcus aureus
      4. Class C β Lactamase from Citrobacter freundii
   Triose Phosphate Isomerase (TIM)
VIII. [...]
   A. Cystic Fibrosis (and Multidrug Resistance)
   B. Sickle Cell Anemia
IX. Other Targets Present and Future
   A. Aldose Reductase
   B. Carbonic Anhydrase (CA)
   C. Collagenase
   D. Hydroxymethylglutaryl Coenzyme A Reductase (HMG CoAR)
   E. Phosphorylation (Kinases and Phosphorylases)
   F. DNA Recognition
   G. Transmembrane Receptors
X. Overview and Outlook
Acknowledgment
References

SCOPE OF THE REVIEW

The aim of this review is to discuss those proteins for which the known 3D structure may be relevant to drug design. Most of the information available to date comes from protein x-ray crystallography, though a growing number of studies are based on molecular modeling and NMR spectroscopy. The review concentrates on major progress reported in the literature from 1985 through the first half of 1991. Protein structures and related drug design studies have been loosely classified here into a number of major therapeutic areas: immunology, inflammation, cardiovascular disorders, cancer, and viral and bacterial infection. Table I lists those proteins discussed in this review for which 3D structural information is available.

TABLE I
Proteins of Known 3D Structure Relevant to Structure-Based Drug Design(a)

Protein                                   Section   Disease
Human leucocyte antigen (HLA)             I.B       autoimmune diseases
CD4 antigens                              I.D       AIDS, immune diseases
Immunoglobulins                           I.E       immune diseases, cancer
Interleukin-1                             I.F.1     inflammation
Interleukin-2                             I.F.2     immune diseases
Interleukin-8                             I.F.3     inflammation
Cyclophilin                               I.G       immunosuppression
Macrophilin (FKBP)                        I.G       immunosuppression
GM-CSF                                    I.F.4     cancer
Interferon                                I.F.5     cancer, viral infection
Insulin                                   II.A      diabetes
Growth hormones                           II.B      endocrine imbalance
Ras-p21                                   III.A     cancer
Dihydrofolate reductase (DHFR)            III.B.1   cancer, bacterial infection
Thymidylate synthase (TS)                 III.C     cancer
Purine nucleoside phosphorylase (PNP)     III.D     cancer, immunosuppression
Tumour necrosis factor (TNF)              III.E     cancer
Thrombin                                  IV.A.1    thrombosis
Hirudin                                   IV.A.2    thrombosis
Prothrombin                               IV.A.3    thrombosis
Factor XIII                               IV.A.3    thrombosis
Antithrombin III                          IV.B      thrombosis
α1-Antitrypsin                            IV.B      emphysema
Plasminogen activators                    IV.C      thrombosis
Renin                                     IV.D.1    blood pressure
Haemagglutinin                            V.A.1     influenza
Neuraminidase                             V.D.2     influenza
Rhinovirus                                V.B.2     common cold
Mengovirus                                V.B.3
Polio virus                               V.B.5     poliomyelitis
Reverse transcriptase                     V.E.1     AIDS
HIV aspartate protease                    V.E.2     AIDS
Phospholipase A2                          VI.A      inflammation
Elastase                                  VI.B      emphysema
β-Lactamase                               VII.A     infection
Triose phosphate isomerase (TIM)          VII.B     trypanosome infection
Enterotoxin                               VII.C     cholera
Haemoglobin                               VIII.B    sickle cell anemia
Aldose reductase                          IX.A      glaucoma
Carbonic anhydrase                        IX.B      glaucoma
HMG-CoA reductase                         IX.D
Collagenase                               IX.E
Superoxide dismutase                      IX.F
Cytochrome P450                           IX.G

Company (entries in printed order; the row assignment is not recoverable from the source): SmithKline Beecham; Protein Design Labs; Immunopharmaceutics; Ciba-Geigy, Otsuka, Upjohn, Roche; Amgen; Dainippon, Sandoz; Agouron, Sandoz; Agouron, Sandoz; Schering, Kyowa Hakko Kogyo; Genentech; Novo, Hoechst; Monsanto, Lilly; Genentech; Roche, Agouron; Agouron; BioCryst; BASF; Ciba-Geigy, Roche; Mitsubishi, Abbott; Ciba-Geigy; Wellcome, Agouron; MSD, Abbott; Upjohn, Sandoz, Abbott; Roche; MSD, ICI; Roche; Roche; Hoechst; MSD, Abbott, Upjohn; Winthrop; Wellcome; Biostructure; MSD; Hoechst; Roche.

(a) Specific references are given in the relevant section of the article. Companies are mentioned only if they have been associated with a structure-related publication.

Malcolm D. Walkinshaw obtained his Ph.D. in 1976 from Edinburgh University on the physical chemistry of sugars. This was followed by postdoctoral studies at Purdue University with S. Arnott on x-ray fiber diffraction of polysaccharides and with W. Saenger in Goettingen studying protein x-ray crystallography. After returning for five years to a permanent fellowship in the Chemistry Department at Edinburgh University he took a position in industry in 1985. He is now head of the Sandoz Drug Design Group, which consists of integrated teams specializing in biochemistry, x-ray crystallography, NMR spectroscopy, and molecular modeling.

INTRODUCTION

There are two ways that a protein structure can be used in drug design. In the first, the protein is the target for a given drug, and its 3D structure can be used as a template for designing new inhibitors, agonists, or antagonists; the design of enzyme inhibitors is an example. In the second, the protein is itself the drug (the so-called pharmaproteins). Examples here are smaller hormone proteins such as insulin and calcitonin. The hope in studying these structures is that smaller, more potent, possibly even nonpeptide analogues can be designed.

The reductionist belief that it is possible to design drugs given the 3D structure of the protein target was originally labeled "rational drug design." A prerequisite of this approach, however, is an understanding of the biochemical cause of a disease in such detail that specific regulatory proteins can be identified as possible drug targets. There are now a number of good examples in which the results of protein structural studies provide a clear insight into the role a protein or drug plays in a disease process. Some of those discussed in this review include the binding of antiviral drugs,¹ the presentation of antigen peptides,² and structural or functional changes caused by protein mutations in a number of genetic "molecular diseases."³

It is still not possible to provide clear examples in which the "structure-based design" approach has led to a clinically useful drug. Two commonly quoted examples are the design of the ACE inhibitor captopril and the design of antisickling agents based on the haemoglobin structure.⁴ More recently, various groups have been successful in using 3D enzyme structures to design potent and specific inhibitors of renin,⁵ dihydrofolate reductase,⁶ thrombin,⁷ thymidylate synthase,⁸ and elastase,⁹ which could provide useful leads for new drugs.
One of the most promising applications of this "rational" approach has been the design of HIV-protease inhibitors,¹⁰ though results from clinical trials on the effectiveness of these compounds against HIV infection are not yet available. The design of biologically active molecules with a specific activity is only the first step in the production of a clinically useful drug, and problems of bioavailability, toxicity, side effects, and production costs still need to be overcome. Despite this caveat, almost every major pharmaceutical company has invested strongly in the structure-based approach.

The rational or structure-based approach to drug design depends on a realistic 3D picture of the target protein. At present three kinds of picture are available: a static picture of an energy-minimum conformation based on protein x-ray crystal structures; a more flexible picture of various conformers, calculated from nuclear magnetic resonance (NMR) spectroscopic data; and a description obtained from molecular modeling calculations. Each of these techniques provides a different but complementary picture. The rapid growth in all three areas over the last five years has been the result of both theoretical and technological advances.

Protein crystallography can provide structures of molecules of any size (so long as suitable crystals can be grown), and transmembrane proteins and whole virus particles with molecular masses of over 6 million daltons (Da) have already been solved at atomic resolution. The major advance has been the development of area detectors, which can reduce the time required to collect x-ray data to days rather than weeks or months.

With NMR, the size limit of the proteins studied has increased to around 20,000 Da, and the last five years have seen some fifty 3D structure determinations of such small proteins. Apart from fine detail, the structures of proteins determined by NMR and by x-ray methods have been found to be very similar. The main advance in NMR techniques has been the use of 2D, 3D, and 4D spectra to permit better interpretation of the data.
The experimental x-ray and NMR structures provide a good starting point for molecular modeling and drug design studies, and an experimental database of some 500 protein structures is available. Theoretical advances in molecular dynamics have allowed a realistic picture of proteins in motion surrounded by solvent. This is a computationally intensive procedure and relies heavily on access to ever more powerful supercomputers. The technique also provides a way of overcoming the problem of getting stuck in a local minimum-energy conformation and is being used successfully in the refinement of x-ray and NMR protein structures. The biggest impact on structure-based drug design, however, has been made by advances in molecular biology, which have allowed the production of the large quantities of cloned proteins and protein mutants required for both NMR and x-ray studies.

I. IMMUNOLOGY

A. Overview

The complex biochemistry of the immune system is gradually being understood at a molecular level, and 3D structures of a number of the proteins involved have been elucidated (Fig. 1). Controlling the immune response may provide possible therapies for autoimmune diseases and organ transplant rejection. A major recent advance has been the structure determination of the human leucocyte antigen (HLA). This plays a key role in the cell-mediated immune responses governed by T lymphocyte cells. Antigen that invades the body is digested by macrophage cells into short oligopeptides, which are then bound to and presented by HLA (or other so-called major histocompatibility complex proteins). This complex can then be recognized by T lymphocytes using a variety of specific membrane-bound adhesion molecules, including the T cell receptor and the CD4 or CD8 recognition proteins. It may now be possible to use the information about the 3D structures of these molecules to design competitive peptides that bind to these cell surface molecules in the hope of mediating the recognition process.

Figure 1. Some interactions and messages involved in the immune response showing those proteins for which 3D structural information is available. Human leucocyte antigen (HLA) presents the peptide antigen fragment (P). T cell receptor (TCR) recognizes HLA with bound peptide. CD4 or CD8 molecules are also involved in the recognition. Macrophilin (MP) and cyclophilin (CP) are targets for immunosuppressive drugs which prevent T cell expression of IL-2.

The humoral antibody response is controlled by B lymphocytes, which respond to antigen by secreting antibodies (ABs). Antisera can be used in the treatment of microbial infections and for neutralizing toxins; rodent mABs have been used to target tumors. Foreign immunoglobulins can, however, elicit an immune response, and ideally human antibodies should be used. The cloning methods of molecular biology have allowed the construction of mosaic or humanized antibodies in which the hypervariable loops from a mouse AB replace those in a human protein¹¹ with retention of antigen recognition. Structural studies of a number of Fab fragments, both alone and complexed with peptide antigen, have been published. Modeling studies based on these structures indicate that it is possible to predict the conformation of the hypervariable loop regions accurately.¹² This approach has already been used in the design of therapeutically relevant humanized antibodies.¹³ It is likely that modeling the 3D structures of mosaic or humanized antibodies will become important in the design of many different therapeutic antibodies.

Another related area that could be useful in drug design is the study of antiidiotypic antibodies. It has been suggested that the structure adopted by the hypervariable loops of the antiidiotypic AB could provide a starting point for the design of peptidomimetic drugs.¹⁴

Structures of a number of the signaling interleukin molecules (IL-1, IL-2, and IL-8) have also been determined. It seems that quite large surfaces of these small proteins are involved in binding to their respective receptors, which makes the design of small-molecule agonists or antagonists a difficult task. Other molecules important in modulating the cell-mediated immune response are the intracellular receptors for the immunosuppressive drugs cyclosporin and FK506. Structural studies of both cyclophilin, the receptor for cyclosporin, and macrophilin, the receptor for FK506, are underway in a number of laboratories¹⁵,¹⁶ in the hope that they will provide a template for the design of new classes of these drugs.

B. Histocompatibility Molecules

Human leukocyte antigens (HLA), also known as class I histocompatibility antigens, are membrane glycoproteins found on the surface of nearly all cells (Fig. 1). Cytotoxic T cells recognize and bind to class I MHC-encoded proteins, while helper T cells bind to class II MHC-encoded proteins. Individuals express only a few different types of HLA molecule, which have the ability to interact with a wide variety of foreign antigens. X-ray structures of two class I histocompatibility antigens have been determined. These two structures provide a structural basis for allelic specificity in foreign antigen binding and for the resulting non- or hyper-responsiveness to certain immunological challenges. It may be possible to design a high-affinity ligand to block recognition of selected MHC molecules by T cell clones that are responsible for tissue damage. Indeed, there is a high correlation between the presence in individuals of certain subtypes of HLA and diseases such as rheumatoid arthritis, diabetes mellitus, and multiple sclerosis. The disease ankylosing spondylitis has been connected to a covalent modification of Cys-67, which lies on the edge of a binding pocket in HLA-B27; this suggests the design of a covalent blocking agent as a possible drug therapy.¹⁷

1. X-Ray Structure of HLA-A2²

The complete MHC molecule is composed of a heavy chain (M = 44 kDa), which spans the membrane, and a light chain (M = 12 kDa). This crystal structure, solved at 2.7 Å, is of the soluble four-domain fragment consisting of a total of 367 amino acids, from which the transmembrane anchor was removed by papain digestion.¹⁸ Higher-resolution structures (to 2.5 Å) are in press. The membrane-proximal end of the protein contains two domains with immunoglobulin folds, and the region distal from the membrane, composed of the two domains α-1 and α-2, provides a platform of eight antiparallel β strands with two antiparallel α helices lying along the top (Fig. 2). The groove between the helices (25 Å long and 10 Å wide) provides the binding site for the processed foreign antigens and is actually filled with extra electron density in this crystal structure. The site would accommodate an α-helical peptide of about 20 residues or an extended chain of about eight residues. Most of the polymorphic amino acids are clustered on top of this groove.¹⁹ Residues on top of the helices are thought to make direct contact with the T cell receptor.

Figure 2. The α-1 and α-2 domains of the HLA-A structure. Polymorphic residues in HLA-Aw68 and HLA-A2 are shaded and lie in the putative peptide binding groove. Taken from Ref. 2.

2. X-Ray Structure of HLA-Aw68¹⁷

The crystal structure of HLA-Aw68 (refined to 2.6 Å) is isomorphous with that of HLA-A2, and the two polymorphs have very similar structures. Of a total of 13 amino acid substitutions, 11 occur at polymorphic residues in the antigen-binding cleft. The only important difference is in the shape of the peptide-binding groove introduced by the polymorphic amino acids. In Aw68 there is a negatively charged pocket extending under the α helix of the α-1 domain, and the peptide-binding groove is again filled with uninterpreted electron density of a putative antigen.

3. Class II Histocompatibility Molecules

No x-ray structures of class II molecules are yet available. A binding site has been modeled²⁰ based on the known class I structures and on an analysis of conserved and polymorphic residues of 26 class I and 54 class II amino acid sequences. Class I and II molecules are expected to have a similar shape, and the model of the peptide-binding site provides a framework for discussing a number of experimental results on peptide binding.²¹,²²

C. T Cell Receptor (TCR)

The TCR is a heterodimer with two chains (α and β) and, like Fab fragments, it has variable (Vα, Vβ) and constant (Cα, Cβ) domains. An analysis of the known x-ray structures of immunoglobulins, along with TCR sequence information, has been used to develop a structural model for TCRs.²³ Six regions corresponding to the hypervariable loops in Fab have been located, and the limited sequence variability of two of these loops implies an involvement in recognition of MHC proteins.

D. CD4 Molecule

CD4 is found on the surface of T cells and is involved in the binding of class II MHC molecules. The N-terminal domain of CD4 contains the binding region for the envelope gp120 protein of the AIDS virus. A knowledge of the region on CD4 which interacts with gp120 may provide the basis for the design of therapeutic molecules that inhibit the entry of HIV into the cell.

Mature CD4 consists of 433 amino acids, of which the first 370 are extracellular. This extracellular part is composed of four domains, each having sequence similarity with Ig domains. The x-ray structure of the two N-terminal domains (residues 1 to 183) has been determined at 2.3-Å resolution.²⁴,²⁵ The first domain (residues 1-98) is composed of nine β strands folded into a β sandwich. This is very similar to the VL (variable-light) domains of Fab fragments (see next section), and 72 Cα atoms from the sheet region superpose with an rms deviation of 1.2 Å. The second domain has only 75 residues and retains some structural similarity to the CH (constant-heavy) domain of Fab New. The two domains pack closely against each other, giving the molecule a rodlike shape.

The 19 or so residues important in gp120 binding have been identified using site-directed mutagenesis.²⁶ These can be mapped onto the 3D structure (Fig. 3); most are found on one loop (residues 38 to 59), indicating a region of interaction on gp120 at least 25 Å long and 12 Å wide. Before the x-ray structures became available, molecular modeling studies of the two extracellular domains of human CD4 antigen were published²⁷,[illegible] based on the sequence homology of both domains with the variable light chain of selected immunoglobulin x-ray structures. The published figures indicate that these models did indeed provide a useful picture of the predicted gp120 binding site.
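The rms figure quoted above for the 72 superposed Cα atoms is the kind of number obtained by optimally rotating and translating one set of matched atoms onto another and then taking the root-mean-square distance. The following is a minimal sketch of that calculation, not the procedure used in the cited work (which is not described here). It assumes the atom pairing is already known, uses Horn's quaternion formulation, and finds the required eigenvalue by simple power iteration so that only the Python standard library is needed; the function names are hypothetical.

```python
import math


def _centered(points):
    """Return the points translated so that their centroid is at the origin."""
    n = len(points)
    centroid = [sum(p[k] for p in points) / n for k in range(3)]
    return [[p[k] - centroid[k] for k in range(3)] for p in points]


def superposed_rmsd(mobile, target, iterations=500):
    """Minimum RMSD of two matched point lists after rigid-body superposition.

    Horn's quaternion method: the optimal rotation corresponds to the largest
    eigenvalue of a symmetric 4x4 matrix built from the coordinate correlations.
    The eigenvalue is found by power iteration (an illustrative choice).
    """
    if len(mobile) != len(target) or not mobile:
        raise ValueError("need two non-empty coordinate lists of equal length")
    a, b = _centered(mobile), _centered(target)
    n = len(a)
    ga = sum(x * x for p in a for x in p)  # summed squared norms of set a
    gb = sum(x * x for p in b for x in p)  # summed squared norms of set b
    if ga + gb == 0.0:
        return 0.0
    # Correlation matrix s[i][j] = sum over atoms of a_i * b_j.
    s = [[sum(p[i] * q[j] for p, q in zip(a, b)) for j in range(3)]
         for i in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    key = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    # Shifting by (ga + gb) / 2 makes every eigenvalue non-negative, so the
    # power iteration converges to the largest eigenvalue of the key matrix.
    shift = 0.5 * (ga + gb)
    v = [1.0, 0.5, 0.25, 0.125]
    for _ in range(iterations):
        w = [sum(key[i][j] * v[j] for j in range(4)) + shift * v[i]
             for i in range(4)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    vv = sum(x * x for x in v)
    lam = sum(v[i] * key[i][j] * v[j] for i in range(4) for j in range(4)) / vv
    return math.sqrt(max(0.0, (ga + gb - 2.0 * lam) / n))
```

In practice the two lists would hold the Cα coordinates of the equivalent residues from the two domains. A production calculation would normally use a linear algebra library rather than power iteration, which converges slowly when two superpositions are almost equally good.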

E. Immunoglobulins

Antibody molecules (ABs) are composed of heavy and light chains, each folded into a series of domains. Each domain (called an immunoglobulin fold) is composed of about 110 residues, which form two layers of antiparallel β pleated sheet (also called a β sandwich or a β barrel). The antibody-antigen interaction is restricted to six "hypervariable loops" (also called complementarity-determining regions, or CDRs) at the tips of the VL and VH domains. Most crystallographic work has concentrated on the structure of Fab fragments, which has been reviewed.²⁸ There are now nearly 20 Fab structures published, with many more crystallized and under study. The available conserved 3D domain structures make Fabs appealing targets for x-ray determination using the technique of molecular replacement.²⁹ Major conformational differences between different Fab structures are the result of the variable "elbow angle" between the variable and constant domains, which can vary between 130° and 180°.
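The elbow angle is usually taken as the angle between two axes, the pseudo-twofold axis relating VL to VH and the one relating CL to CH1. As a rough illustration only, the sketch below computes that angle once the two axis directions are known. It assumes, as a labeled convention of this example, that both axes are given as vectors pointing away from the elbow (one toward the antigen-binding end, the other toward the C-terminal end), so that a fully extended Fab gives 180°. Determining the axes from coordinates is not shown, and the function names are hypothetical.

```python
import math


def angle_between(u, v):
    """Angle in degrees between two non-zero 3D direction vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("axis vectors must be non-zero")
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))


def elbow_angle(variable_axis, constant_axis):
    """Elbow angle under the assumed convention that both axes point away
    from the elbow, so collinear opposite axes give 180 degrees."""
    return angle_between(variable_axis, constant_axis)


# A straight Fab: the two axes point in exactly opposite directions.
straight = elbow_angle((0.0, 0.0, 1.0), (0.0, 0.0, -1.0))
```

With this convention a bent Fab gives a value below 180°, matching the range quoted in the text.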


Figure 3. The two N-terminal domains of CD4 showing those residues affecting the interaction with gp120. Taken from Ref. 24.

1. Fab Complexes

A number of Fab fragments have been cocrystallized with peptides or proteins. The structures provide a clear picture of antibody-antigen recognition, which in every case is found to involve, almost exclusively, some or all of the CDR loops of the VL and VH domains. There are three examples of Fab complexes with peptides: the antiangiotensin mAB 131 with an 8-residue peptide³⁰; mAB 17/9 (IgG2a, κ) with a 9-amino-acid peptide from influenza virus hemagglutinin³¹; and the Fab from the mAB B13I2, both native and cocrystallized with the 19-amino-acid antigen corresponding to the C helix of myohemerythrin.³² A complex between the immunosuppressant cyclic undecapeptide cyclosporine and the Fab from mAB R45-45-11 (IgG1, κ) has been solved and partly refined.³³

Four complexes between Fabs and proteins have been studied at high resolution. Three are between egg white lysozyme and three different antilysozyme antibodies which bind to different epitopes: D1.3,³⁴ HyHEL-5,³⁵ and HyHEL-10³⁶; the fourth is between the neuraminidase from influenza virus and the Fab NC41.³⁷ These four complexes have been reviewed,³⁸ and in all cases the contacts with the antigen involve at least 16 amino acid residues drawn from all six of the CDR loops, though some framework residues are also involved. Recognition and binding are accomplished by a complementary close-fitting surface between Fab and antigen epitope which extends over about 750 Å². Water is almost completely excluded from the tight-fitting interface. Hydrogen bonds play a key role in binding, and typically at least 10 specific Fab-antigen hydrogen bonds are formed. Salt bridges have also been observed. The amount of conformational change in the lysozyme molecule on binding the Fab is small, but certain rotations of side chains may be significant.

A protein-engineered Fv (VL and VH) complex with the D1.3 lysozyme epitope has also been examined at high resolution³⁹ and was found to make very similar contacts to those in the Fab D1.3 complex. A complex of Fab with the HIV coat protein p24 has also been crystallized.⁴⁰ Although it was suggested that the structural information about this interaction may prove useful for developing a vaccine against HIV, the main objective was to use the Fab to enable crystallization of the coat protein without aggregation.

There is one published structure of an idiotope-antiidiotope complex,⁴¹ between antilysozyme Fab D1.3 and an antiidiotope Fab. This structure shows that the two Fabs interact largely through their hypervariable loops. Idiotopic mimicry of the original lysozyme antigen, however, is not observed.

2. Modeling Antibody Structure

The design of engineered antibodies depends on understanding the relationship between the sequence of the immunoglobulin and the shape of the hypervariable antigen binding site. A comparative study of the known Fab structures has shown that the limited number of conformations (canonical forms) of the six hypervariable loops are determined by the interactions of a few conserved residues.¹² The rules governing hypervariable loop shape can be very specific. For example, the major factor governing the shape of H2 (the second loop of the heavy chain) has been found to be the size of the residue at site 71.⁴² These observations have been used quite successfully to predict the structures of various antibodies.⁴³,⁴⁴ Such simple rules should also be of general help in selecting those human framework sequences suitable for combining with antigen binding loops from other species in the synthesis of therapeutic antibodies. A number of molecular-mechanics studies of hypervariable loop conformation have also been published.[refs. illegible]

One clinical application currently being tested is the use of a designed anti-Tac mAB to inhibit proliferation of T cells by blocking IL-2 binding.¹³ A "humanized" antibody was designed which comprised the murine CDRs of anti-Tac mAB and human framework and constant regions. The best human amino acid sequence was selected to maximize homology with the anti-Tac sequence and was examined using 3D modeling.

F. Cytokines

Cytokines are a family of signaling proteins which mediate immunologic and inflammatory responses to infection or tissue damage.⁴⁶ Over 30 IFN (interferon), IL (interleukin), and CSF (colony-stimulating factor) proteins have been cloned, and a number show promising therapeutic effects, particularly in cancer treatment. Homology studies and theoretical structure prediction studies suggest that a number, including GM-CSF and IL-5, may fold to 4-α-helical bundles.⁴⁷ The four high-resolution cytokine structures solved to date all have very different folding patterns (IL-1 is a β barrel, IL-2 is a 4-α-helix bundle, IL-8 has an α/β structure, and TNF has the "jelly roll" structure), which indicates that the related biological roles are not reflected in their overall structures.

GM-CSF (granulocyte-macrophage colony-stimulating factor) is 127 amino acids long and acts as a hematopoietic growth factor. It also stimulates antibody-dependent cytotoxic killing of tumor cells and is currently in clinical trials for patients undergoing radiation or chemotherapy. Crystals of human recombinant GM-CSF which diffract to 2.4 Å and are suitable for x-ray analysis have been grown.⁴⁸,⁴⁹ Recombinant bovine and human immune interferons, which exhibit antiviral, antiproliferative, and immunoregulatory effects, have also been crystallized.⁵⁰

1. Interleukin-1

IL-1 is found in almost all nucleated cell types and, among other activities, induces thymocyte and lymphocyte proliferation. IL-1 plays an important role in inflammation. It induces the proliferation of fibroblasts and the production of prostaglandins and collagenase, and it stimulates bone resorption. It induces fever and promotes cartilage degradation; it has been found in the synovial fluid of patients with arthritis and may play a role in promoting chronic inflammation in joints. There are two types of molecule, IL-1β and IL-1α, and each is expressed as a precursor (M = 31 kDa) which is cleaved to give a C-terminal fragment of some 17 kDa. Among mammals, IL-1α sequences are between 60% and 70% identical to one another but show less than 30% identity to IL-1β proteins. Despite such differences, both α and β bind to the same receptor protein and appear to have similar (if not identical) biological activities.⁵¹

IL-1β is 159 amino acids long and has been cloned, expressed, and crystallized by a number of groups.⁵²⁻⁵⁴ The 3-Å structure⁵⁵ and refined 2-Å structure have been published.⁵⁶,⁵⁷ The protein has no α helices but consists of 12 antiparallel β strands which have the same topology as the Kunitz-type trypsin inhibitor. There is one cis proline (Pro91). The molecule has an approximate 3-fold symmetry.

IL-1α is 153 residues in length, and its structure has been solved by x-ray crystallography to 2.7 Å.⁵¹ The structure is composed of 14 β strands and a 3₁₀ helix. The core of the structure is a capped β barrel which shows 3-fold symmetry and a topology similar to that of IL-1β. The authors suggest that this "IL-1 fold" will be recognized as a general tertiary structure for protein domains and speculate that ECGF (endothelial cell growth factor) and the DE-3/tissue plasminogen activator inhibitors, which show high homology with STI (soyabean trypsin inhibitor), will have the same topology.
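The identity percentages quoted above for the IL-1α and IL-1β families come from comparing aligned sequences position by position. A minimal sketch of that count follows. It assumes the two sequences have already been aligned with gap characters, and it takes the denominator to be the number of columns in which neither sequence has a gap; other conventions exist, and the source does not say which was used. The function name and the toy sequences are hypothetical and are not IL-1 data.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy aligned fragments (made up for illustration): 5 of 6 columns match.
toy_identity = percent_identity("ACDEFG", "ACDQFG")
```

Because the result depends on both the alignment and the choice of denominator, figures such as "less than 30% identity" are best read as approximate.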
There are 20 residues conserved among the eight IL-1α and β sequences from man, cow, mouse, and rabbit. Of these, only two are exposed. Even though IL-1β and IL-1α bind to the same receptor, the lack of an invariant surface implies a different binding mode. Numerous site-directed mutants have also been used to try to locate a “binding surface,” but without any reported conclusion. Analysis of the amino acid sequence of the IL-1 receptor (IL-1R) (as discussed in Ref. 57) shows the presence of three immunoglobulinlike domains that are thought to contain the binding site for IL-1.

2. Interleukin-2 (IL-2) (and Receptor)

IL-2 is produced by activated T lymphocytes and stimulates proliferation of IL-2-dependent T cells. IL-2 binds to a specific high-affinity membrane-bound receptor (IL-2R) on the surface of target cells. IL-2R consists of two distinct proteins of 75 kDa and 55 kDa. A single-crystal complex has been made of IL-2 and a soluble form of the p55 component of the IL-2R.58 This complex diffracts to 3.5 Å but has not yet been solved. A high-resolution picture of such a receptor-ligand complex could provide an exciting strategy for the design of agonist and antagonist drugs.

The crystallization and 3-Å resolution structure of IL-2 have been published.59,60 Isomorphous crystals of the mutant Cys125Ala have also been studied. IL-2 is 133 residues long with six helical segments, giving an overall helical content of 65% and no β structure. Residues 33 to 56 form a helix (B + B′) which is bent in the middle by Pro47. Antibodies to peptides which crossreact with IL-2 have been used to map regions likely to be important in receptor binding. These results indicate that residues 8 to 27 and 33 to 54 are involved. Deletion mutants also show the importance of the N-terminal region. A working hypothesis has been presented in which helices B, C, D, and F form a structural scaffold, and A, B′ (B), and E form the receptor binding sites60 (Fig. 4).

3. Interleukin-8 (IL-8)

Interleukin-8, also known as “neutrophil activation factor,” is released from various cell types (monocytes, fibroblasts, endothelial cells) in response to an inflammatory stimulus and acts by attracting neutrophils and T cells (see Ref. 61 and references therein). Protein sequence comparison shows IL-8 is a member of a superfamily of proteins including “monocyte chemoattractant protein,” “platelet factor 4,” “macrophage inflammatory protein,” and “gamma-interferon-induced protein,” which are involved in cell-specific chemotaxis and inflammation. Homology within this protein family is between 20% and 35% (80% homology with conservative substitutions) and all have two conserved cysteine bridges. The 3D structure derived from NMR61 shows that the 72-residue-long monomers form intimate dimers with two 24-Å-long helices separated by about 14 Å and lying on top of a 6-stranded antiparallel β pleated sheet platform. The general architecture is very similar to the α1/α2 domains of the HLA-A2 structure discussed above. This suggests that the two helices may form the binding site for the cellular receptor, with specific recognition being accomplished by the polar residues of the amphiphilic helix (particularly 59Q, 60R, 63E, 64K, 67K). The distinct distribution of charged residues at the surface of the two symmetry-related helices forms a template for the design of potential IL-8 inhibitors which would act by binding to the groove between the two helices, thereby preventing interaction of IL-8 with its cellular receptor. The 3D structure of IL-8 provides a landmark in the development of NMR and x-ray techniques, as this is the first example where a structure from NMR was used to solve a previously unknown x-ray crystal structure using the rotation function.62 Preliminary crystallization data have also been published.63,64

Figure 4. Diagram showing the overall architecture of IL-2 and its possible interaction with the IL-2 Receptor. Taken from Ref. 60.

G. Immunophilins

The immunosuppressive drugs cyclosporin A (CsA or Sandimmune) and FK506 act as inhibitors of T cell activation (Fig. 5). Sandimmune is the currently favored drug used to prevent graft rejection in organ and bone marrow transplantation surgery. Immunophilins are a family of proteins which bind these immunosuppressive drugs. The biology and chemistry of immunophilins and their ligands have recently been reviewed.65

1. Cyclophilin

Cyclophilin (M = 17.7 kDa, with 165 amino acids) is the predominant cyclosporin-binding protein in T cells. It has been shown that this protein catalyzes the interconversion of the cis and trans isomers of the peptidylprolyl amide bonds of peptide and protein substrates. It seems that suppression of rotamase activity alone is insufficient to explain the biological activity, as drug concentrations necessary to inhibit T cell activation would not saturate the abundant rotamases. The current theory of action for both cyclophilin and FKBP is that the complex of receptor with drug provides the required inhibitory signal by binding to an as yet uncharacterized third protein.65,69 The x-ray structure of human recombinant cyclophilin complexed with a tetrapeptide substrate has been solved to 2.8 Å70 and shows a folding architecture similar to the retinol-binding and fatty-acid-binding proteins.71-73 The crystal structure and NMR structure (in chloroform) of the undecapeptide were found to be very similar.74 The conformation of cyclosporin bound to cyclophilin has also been analyzed by NMR and has been found to be radically different.75 In both the chloroform and the single-crystal conformations the molecule maximizes intramolecular hydrogen bonds, while in the cyclophilin-bound conformation the ligand turns inside out to present the available carbonyl oxygen atoms and amide protons to maximize intermolecular hydrogen bonding.

Figure 5. Structures of the immunosuppressant drugs cyclosporin A and FK506. (a) Chemical formula of the cyclic undecapeptide cyclosporin A which binds specifically to its cytosolic receptor cyclophilin. (b) Chemical formula of FK506 which binds specifically to its immunophilin receptor FK-binding protein (macrophilin).

2. FKBP (Macrophilin)

FK506 is a macrolide which has inhibitory properties similar to cyclosporin but binds to a different cytosolic protein, known as FK-binding protein or macrophilin. Human recombinant FKBP has a chain length of 107 amino acids (M = 11.8 kDa) and has no significant homology to cyclophilin. The FK506 class of drugs does not bind to cyclophilin, and CsA does not bind to FKBP. These facts fit with the recently determined x-ray and NMR structures of human recombinant FKBP,77-79 which has a very different architecture from cyclophilin and consists of a five-stranded antiparallel β sheet which wraps around a short helix. Despite these structural differences, FKBP also enigmatically shows a cis-trans rotamase activity. The mechanism for this has been partially explained by the crystal complex of FK506 with FKBP, which shows a carbonyl binding pocket involving aromatic residues which may stabilize a “twisted amide” conformation. As with CsA, the bound conformation of FK506 is profoundly different from that determined in the crystal structure of FK506 alone. The main difference is that in the unbound FK506 the amide bond is cis, while in the complex the bond is trans (Fig. 5). The difference between the conformation of free and bound ligand in both the FKBP and cyclophilin complexes highlights a major problem in predicting active conformations of drugs. Clearly, the only safe way to proceed is to use structural models that are as biologically relevant as possible; in the longer term this means studying drugs bound to their natural receptors.

II. ENDOCRINOLOGY

Polypeptide hormones are manufactured and stored in specialized endocrine cells.80 The hormone message is frequently conveyed to the target cell by a surface receptor for the hormone. Insulin, somatostatin, glucagon, and pancreatic polypeptide are peptide hormones found in the human pancreas, and 3D structures are available for a number of them. It seems that most small peptides of between 15 and 40 residues are predominantly helical. For example, the x-ray structure of the 36-amino-acid avian pancreatic polypeptide has been refined at 1-Å resolution80 and is composed of an α-helical stretch (residues 14-32) and a polyprolinelike helix lying antiparallel to it. Likewise, pancreatic glucagon is 29 residues long and also adopts a predominantly helical conformation in the crystal.81 This hormone activates the gluconeogenic pathway, resulting in raised blood glucose levels. The 32-residue-long hormone salmon calcitonin is therapeutically useful in the treatment of osteoporosis, where it acts by inhibiting bone resorption. NMR studies in an aqueous solution of trifluoroethanol also show that the peptide is more than 45% helical.

A. Insulin

The insulin monomer consists of two chains of 21 and 30 residues linked by two disulfide bridges. Various crystal forms of insulin have been refined and suggest that the insulin molecule is deformable,83 though, unlike the shorter hormone peptides, insulin is likely to retain the same overall conformation on binding to the receptor. Insulin is used as an injected therapeutic agent for the treatment of diabetes. One problem with the treatment is that in neutral solutions insulin assembles into hexamers coordinating zinc ions, which may limit the rate of absorption.

Single site-directed mutants of insulin have been designed to reduce the principal intermolecular interactions found in the crystal structure of insulin.84 These mutants were found to be monomeric at pharmaceutical concentrations and are absorbed two to three times faster after subcutaneous injection. Insulinlike growth factors (IGF) have also been modeled on the basis of their homology with insulin,80 which includes identical disulfide positions.

B. Growth Hormones (Somatotropins)

Purification and crystallization have been described for hGH.85 This is a 191-amino-acid protein that is synthesized in the anterior pituitary and plays a key role in somatic growth. Preliminary crystal data have also been published for bovine prolactin, bovine growth hormone,86 and porcine growth hormone.87

C. Growth Factors (and GF Receptors)

Epidermal growth factor (EGF) and transforming growth factor α (TGF-α) bind to and activate the cell surface EGF receptor, induce receptor clustering, and stimulate an intracellular tyrosine kinase.88 They play an important role in controlling cell growth, oncogenesis, and wound healing (see references in Ref. 89). Growth factor agonists and antagonists, based on the 3D structure of the functionally relevant amino acids, could be of considerable medical value. No crystal structures have yet been published; however, 3D structures of human EGF,90 murine EGF,89 and TGF-α88 have been solved using NMR. Residues 1-48 of the full 53-amino-acid-long hEGF show an antiparallel β sheet structure90 which has a similar overall topology to mEGF (Fig. 6). In hEGF, residues 1-32 and 33-48 form two distinct domains. The conserved “functionally important” residues appear to be 13, 15, 16, 41, 43, and 47, which suggests that the large β sheet does not contribute to the recognition site but acts as a scaffold.

The EGF receptor is closely related to the rat growth factor receptor which is coded by the neu gene.91 Binding of EGF to EGFr induces receptor dimerization. A single mutation, Val664Glu, converts the neu gene into an oncogene and also leads to receptor aggregation. A molecular modeling study91 provides an explanation for the enhanced aggregation in terms of interhelical hydrogen bonds. It was suggested that this enhancement of tyrosine kinase activity could be reduced by peptides mimicking the neu transmembrane region, providing a novel therapeutic strategy for some types of cancer cells.

Figure 6. The shape of epidermal growth factor (EGF) as determined by NMR. Taken from Ref. 90.

III. CANCER

A. Oncogenes

Most cancers are thought to be initiated by DNA mutation in a particular cell, which results in oncogene activation or tumor suppressor gene loss. The 3D structures of some of these oncogene products are beginning to shed some light on possible mechanisms which lead to uncontrolled cell proliferation and provide new targets for drug therapy.

In the human genome there are three distinct cellular Ras genes (c-H-ras, c-K-ras, N-ras). Ras, also known as p21, is the product of the Harvey (Ha) Ras oncogene and is a 21-kDa membrane-bound protein that binds GTP and has intrinsic GTPase activity. There are similarities between Ras and the 39-kDa α subunit of other G proteins.92 Ras seems to act as a “molecular switch” in the early steps of the signal transduction pathway that is associated with cell growth and differentiation. The conformational changes from the GDP-bound (off state) to the GTP-bound (on state) may transmit the cellular growth signal. A possible target for the activated GTP-bound form of Ras is GAP (GTPase activating protein), and site-directed mutagenesis studies have defined a possible effector region (residues 32-40). Oligopeptides can effectively compete against Ras binding to GAP.94 Specific mutations of Ras, which impair its GTPase activity and stabilize the protein in the active GTP-bound form, have been found to occur in a number of neoplasias, including human colorectal tumors and myeloid leukemias.

The crystal structure of Ha-Ras p21 (residues 1-166) complexed with the slowly hydrolyzing GTP analogue GppNp provides a picture of Ras in its active GTP-bound conformation.95 Mutations at positions 12, 13, and 61 have been found in a high percentage of tumors. The 3D structure shows these residues are important in nucleotide binding. A very-high-resolution refinement of the complex to 1.35 Å suggests a catalytic mechanism involving Gln61 and Glu63.96 The conformational changes involved in the hydrolysis of GTP by H-Ras p21 have been studied by time-resolved Laue diffraction using a synchrotron source.97 The results show that there are large conformational changes in the GAP binding region (residues 32 to 36) caused by a change in coordination of the active site magnesium ion.

Crystal structures of complexes of H-Ras (residues 1-171) with GDP and GTP analogues show large conformational differences at two exposed loops (residues 30-38 and 60-68) which define the active (GTP-bound) and inactive (GDP-bound) states.93 A complex of GDP with the transforming oncogene Gly12Val has also been examined.93 The added hydrophobic bulk of the valine side chain is likely to interfere with binding of GTP, leading to a decrease in GTPase activity. A further series of structures of five oncogenic mutants also shows that the molecular shapes of the cellular and mutant molecules are almost identical.98

B. Cancer Therapy

The current approach to anticancer therapy is to target and kill the cancerous cells using cytotoxic agents which inhibit DNA replication or block vital biosynthetic pathways. An example here is the use of dihydrofolate reductase (DHFR) or thymidylate synthase (TS) inhibitors to prevent the synthesis of amino acids and purines99 (Fig. 7). Another possible enzyme target is purine nucleoside phosphorylase. Structural studies are also underway on a number of proteins such as tumor necrosis factor, ricin, and neocarzinostatin, which have different cytotoxic mechanisms.

Figure 7. The thymidylate synthesis cycle.

1. Dihydrofolate Reductase (DHFR)

Dihydrofolate reductase catalyzes the NADPH-dependent reduction of dihydrofolate (H2FA) to tetrahydrofolate (H4FA).100 DHFR inhibitors have a number of important clinical indications, and methotrexate (MTX), the first antineoplastic drug, was introduced as therapy for acute leukemia as long ago as 1948.99 DHFR is one of the most intensively studied proteins for 3D drug design. Over 20 x-ray structures of DHFR from E. coli,100-102 L. casei,100 chicken,103 mouse,104 and man,105,106 with and without cofactor (NADP), substrate (DHF), or inhibitor, have been refined at high resolution (Fig. 8).

Human DHFR is 185 amino acids long. A number of x-ray structures have been published: human DHFR.folate, human DHFR.MTX,105 human DHFR.TMP,105 and human DHFR.5-deazafolate. There is high sequence homology (75%-95%) among vertebrate DHFR sequences, but only about 30% homology to bacterial DHFRs. All structures show very similar architecture, with an eight-stranded β sheet with seven parallel strands; five helices pack against this core. Folate binds in a deep hydrophobic cleft and in a similar position to that of MTX in the DHFR.MTX binary complex, but, in all known structures, the pteridine ring of folate has rotated 180° compared to that of MTX (Fig. 9). The geometry of TMP binding closely resembles that for MTX binding, with

[Figure: chemical structure of (I) methotrexate (MTX).]
