PROTEINS Structure, Function, and Genetics 13:352-363 (1992)
Amino Acid Substitution Analysis of E. coli Thymidylate Synthase: The Study of a Highly Conserved Region at the N-Terminus
Choll Wan Kim, Mark Leo Michaels, and Jeffrey H. Miller
Molecular Biology Institute and the Department of Microbiology and Molecular Genetics, University of California, Los Angeles, Los Angeles, California 90024
ABSTRACT Amino acid substitution analysis within a highly conserved region of Escherichia coli thymidylate synthase (TS), using suppression of amber mutations by tRNA suppressors, has yielded a bank of 124 new mutationally altered TS proteins. These mutant proteins have been used to study the structure-function relationship of the Escherichia coli TS protein at the N-terminus corresponding to residues 20 through 35. This region contains a block of amino acids whose sequence has been well conserved among other known TS proteins from various organisms. Positions 20 through 25 contain a surface loop structure and positions 26 through 35 encompass a β-strand. We find that residues surrounding a β-bulge structure within the β-strand are particularly sensitive to amino acid substitution, suggesting that this structure is maintained by a highly ordered packing arrangement. Three residues in the surface loop that are present at the base of the substrate binding pocket are also sensitive to amino acid substitution. The remainder of the conserved sites, including those at the dimer interface, are tolerant to most, if not all, of the substitutions tested. © 1992 Wiley-Liss, Inc.
Key words: tRNA suppressors, evolutionary conservation, protein structure-function
INTRODUCTION
Thymidylate synthase (TS; 5,10-methylenetetrahydrofolate:dUMP C-methyltransferase, EC 2.1.1.45) is an enzyme required in the sole de novo pathway for thymidine synthesis. In Escherichia coli, TS is a dimer with a combined molecular weight of approximately 65 kilodaltons. It is responsible for the conversion of 2'-deoxyuridylate (dUMP) to 2'-deoxythymidylate (dTMP) using the cofactor 5,10-methylenetetrahydrofolate as the single carbon donor as well as the reductant.1 Thymidylate synthase is interesting for a number of reasons. First, it serves an important role in thymidine synthesis, and hence cell division.
Understanding the precise mechanism of catalysis will be useful for therapeutic drug design. Second, thymidylate synthase isolated from sources as varied as bacteriophage, bacteria, yeast, and vertebrates shows that TS is one of the most highly conserved enzymes known to date.2-14 The high degree of evolutionary conservation makes this enzyme useful for amino acid sequence conservation studies. Matthews et al.15-17 and Montfort et al.18 have recently determined the crystal structure of E. coli thymidylate synthase complexed with 5-fluoro-2'-deoxyuridylate (FdUMP), a substrate analog, and 10-propargyl-5,8-dideazafolate (CH2-H4PteGlu), a cofactor analog. The 3-D structure of this enzyme:substrate:cofactor ternary complex shows the binding pocket to be comprised of residues from both monomers, indicating that the enzyme can function only as a dimer. Thymidylate synthase contains a novel β-sheet structure, with a right-handed twist rather than the left-handed twist more commonly found in other proteins, which appears to be important for protein-protein interactions between the two monomers. In addition, this β-sheet contains a large bend (termed a β-bulge) that is crucial for proper alignment of residues in the binding pocket.15-19 Based on previous biochemical data and the 3-D structure of the ternary complex, approximately 30 residues in the binding pocket are involved in hydrophobic contacts and/or H-bonding with the substrate and the cofactor. As might be expected, these residues are highly conserved in all thymidylate synthases known. Amino acid substitution analysis at many of these sites, in conjunction with the crystal structure, has provided a detailed picture of the active site of the E. coli enzyme.20-22 Recent comparison of highly refined structures of E. coli TS and L. casei TS has provided insight into the effect of amino acid substitution on structural perturbations in the TS protein.23 Perry et al.23 show that differences in amino acid sequences of the E. coli and L. casei TS enzymes can be correlated to
Received July 1, 1991; revision accepted August 15, 1991. Address reprint requests to Jeffrey H. Miller, Molecular Biology Institute, University of California at Los Angeles, 405 Hilgard Avenue, Los Angeles, CA 90024.
STUDIES OF E. COLI THYMIDYLATE SYNTHASE
nearby changes in protein structure, where the disturbance in position of an atom depends on the distance from the site of difference in amino acid sequence. Amino acid sequence variation during evolution would thereby be accommodated by compensatory changes in protein structure. Amino acid sequence comparisons between various TS enzymes reveal that many regions in the protein have been well conserved. Seven highly conserved blocks of 6 or more amino acids are present. These blocks of conserved residues are not strictly confined to the binding pocket, nor do they constitute boundaries of known secondary structure. To probe the nature of these sites, we have used 13 different tRNA amber suppressors to rapidly introduce 12 or 13 specific amino acid substitutions at amber nonsense codons (5'-TAG-3') in the E. coli TS enzyme. Amino acid substitutions at 10 new sites in a highly conserved region of the E. coli TS protein, corresponding to residues 20-35, have been analyzed by this method. These sites span both a surface loop and a β-strand. Our results, combined with previous amino acid substitution data,22 identify residues that may be important for substrate binding and maintenance of the β-bulge structure. We also observe that many sites are tolerant to amino acid substitutions even though they have been highly conserved during evolution, and that substitutions are tolerated roughly equally in the surface loop structure and the β-strand.
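The suppression strategy described above can be sketched in a few lines of Python. This is an illustration only, not the authors' procedure: the coding sequence and protein are made-up toy strings rather than the thyA sequence, the function names `introduce_amber` and `suppressed_variants` are hypothetical, and the set of 13 inserted amino acids is an assumption collected from the residues named elsewhere in this paper.

```python
# Toy sketch of amber-codon mutagenesis followed by tRNA suppression.
# Assumptions: the sequences are invented, positions are 1-based codon numbers,
# and SUPPRESSOR_AMINO_ACIDS is inferred from residues named in the text.

AMBER = "TAG"

# One-letter codes: Ala, Arg, Cys, Gln, Glu, Gly, His, Leu, Lys, Phe, Pro, Ser, Tyr
SUPPRESSOR_AMINO_ACIDS = "ARCQEGHLKFPSY"


def introduce_amber(cds: str, codon_number: int) -> str:
    """Return the coding sequence with one codon replaced by TAG."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    n_codons = len(cds) // 3
    if not 1 <= codon_number <= n_codons:
        raise ValueError("codon_number is outside the coding sequence")
    start = 3 * (codon_number - 1)
    return cds[:start] + AMBER + cds[start + 3:]


def suppressed_variants(protein: str, position: int) -> dict:
    """Map a label such as 'Q3S' to the protein made when a suppressor tRNA
    inserts that amino acid at the amber site."""
    if not 1 <= position <= len(protein):
        raise ValueError("position is outside the protein")
    wild_type = protein[position - 1]
    variants = {}
    for amino_acid in SUPPRESSOR_AMINO_ACIDS:
        label = f"{wild_type}{position}{amino_acid}"
        variants[label] = protein[:position - 1] + amino_acid + protein[position:]
    return variants


if __name__ == "__main__":
    toy_cds = "ATGAAACAGTATGGC"   # invented: Met-Lys-Gln-Tyr-Gly
    toy_protein = "MKQYG"
    print(introduce_amber(toy_cds, 3))
    for label, sequence in suppressed_variants(toy_protein, 3).items():
        print(label, sequence)
```

One amber mutant therefore yields a whole panel of substitutions simply by moving the same plasmid into different suppressor strains. When the suppressor inserts the original residue, the wild-type protein is regenerated, which corresponds to the "(wt)" entries in Table I.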
MATERIALS AND METHODS
Bacterial Strains and Media
E. coli strain XAC is F- Δ(gpt-lac)5, gyrA, rpoB, argEam, ara. XAC25 is a derivative of XAC constructed by deleting the entire thyA gene and replacing it with a kanamycin resistance marker. This was accomplished with a gene replacement technique that uses a recombinant M13 intermediate.24 The chromosomal suppressor strains Su2-89, Su3, Su5RF, and Su6 have been described. These suppressors were mutagenized with the frameshift mutagen ICR 191 and Thy- colonies were selected on trimethoprim. A recA deletion was then introduced to create SuT2-89, SuT3, SuT5RF, and SuT6, which are thyA recA::Tn10. Strain CJ236 is dut-1, ung-1, thi-1, relA-1, pCJ105, and strain MV1190 is Δ(lac-proAB), thi, supE, (srl-recA)306::Tn10(tetr) [F': traD36, proAB, lacIqZΔM15]; both were obtained from Bio-Rad Laboratories. All media were supplemented as necessary with proline (100 μg/ml), methionine (50 μg/ml), ampicillin (100 μg/ml), nalidixic acid (30 μg/ml), chloramphenicol (30 μg/ml), tetracycline (15 μg/ml), or thymidine (100 μg/ml).

Bacteriophage and Plasmids
Construction of the vector, pGFIB-1, and of the plasmid-based suppressors (pGFIB-Ala, etc.) has been published. The wild-type thyA gene cloned into pACYC184,35 designated pATAH,36 was a gift of M. Belfort. The f1 origin of replication was cloned into the BamHI site of pACYC184 to confer phagemid properties upon the plasmid. The final plasmid construct is designated pECTSf1. M13mp10cat was a gift from K. Moore (University of California, Davis). pUC4-KAPA was purchased from Pharmacia.

Site-Directed Mutagenesis
Synthetic oligonucleotides were purified by electrophoresis through an acrylamide gel, eluted, ethanol precipitated, and resuspended in 10 mM Tris-HCl pH 7.5, 1 mM EDTA. Site-directed mutagenesis of pECTSf1 was carried out as previously described.37,38 Mutants were verified by sequencing and were designated pECTSf1 followed by the single letter amino acid code and its corresponding numerical position relative to the thyA sequence.2

Growth Tests
The plasmids with thyA amber mutations were transformed into the suppressor strains and selected on M9 glucose containing thymidine, chloramphenicol, and proline when transforming into chromosomal suppressor strains (SuT2-89, etc.), or on the same plates supplemented with ampicillin when transforming into XAC25 strains containing the pGFIB suppressors. Freshly purified colonies were streaked onto two sets of minimal M9 glucose plates. One set contained all the appropriate supplements to maintain the strain and plasmids as described above, and the other set contained all the supplements except thymidine. Plates were incubated at 30°C, 37°C, and 39.5°C for 48 hr, and colony growth was compared between plates with and without thymidine (see Figs. 2, 3).

Extract Preparation
Cultures of thymidylate synthase variants were grown in 50 ml minimal selective media at 37°C with aeration as described above. Log-phase cells from a 50 ml culture (approximately 35 OD units total) were harvested and the cell pellet was frozen at -70°C. The extract was prepared by resuspending the frozen cell pellet in 2 ml 50 mM Tris pH 7.5, 20 mM β-mercaptoethanol, 1 mM phenylmethylsulfonyl fluoride (PMSF), and was kept on ice throughout the remaining steps. Cells were lysed by sonication on a Fisher Sonic Dismembrator Model 300 on maximum output for two 30-sec bursts. The cell debris was removed by a 20-min spin at 27,000 x g (15,000 rpm) at 4°C. The supernatant was taken and glycerol was added to a final concentration of 5%. This constituted the crude extract used for in vitro activity assays. The protein content of the extracts was measured with the Bio-Rad Protein Assay Kit using bovine serum albumin as a standard.
C.W. KIM ET AL.
TABLE I. Specific Activities of TS Variants

Wild-type amino      Amino acid       Specific         Percent of
acid/position*       substituted†     activity§        wild type‖
Wild-type plasmid    none             50 (±23)         100
Asp20                Cys20            2.6 (±0.8)       5.1
                     Ser20            2.3 (±0.8)       4.5
Gly23                Ser23            40 (±22)         80
                     Gln23            68 (±38)         134
                     Tyr23            44 (±22)         88
                     Leu23            44 (±13)         87
                     His23            16 (±5)          32
                     Glu23            12 (±6)          23
Gly25                Ser25            19 (±9)          38
                     Gln25            42 (±6)          83
                     Tyr25            30 (±9)          60
                     Leu25            24 (±5)          48
                     His25            11 (±6)          23
Thr26                Ser26            53 (±30)         106
                     Gln26            12 (±7)          26
                     Tyr26            0.41 (±0.58)     0.82
                     Leu26            8.0 (±3.2)       16
                     His26            3.1 (±0.7)       6.1
Leu27                Cys27            73 (±33)         144
                     Ser28            22 (±6)          44
                     His27            68 (±26)         134
                     Phe27            45 (±16)         90
                     Pro27            14 (±3)          28
Ser28                Cys28            46 (±11)         91
                     Ala28            40 (±21)         79
                     Gly28            29 (±13)         57
                     Gln28            13 (±3)          26
Ile29                Ser29            37 (±23)         73
                     Gln29            39 (±20)         77
                     Tyr29            30 (±15)         60
                     Leu29            31 (±15)         62
                     His29            18 (±9)          36
Phe30                Ala30            0.16 (±0.20)     0.32
                     Phe30(wt)        23 (±5)          46
                     Ser30            0.08 (±0.05)     0.16
                     Tyr30            43 (±23)         85
                     Leu30            62 (±16)         122
Gln33                Ala33            26 (±5)          52
                     Phe33            36 (±28)         72
                     Ser33            50 (±9)          99
                     Tyr33            49 (±33)         97
                     Leu33            22 (±8)          49
                     Gln33(wt)        49 (±7)          98
Arg35                Ser35            22 (±6)          43
                     Phe35            15 (±5)          31
                     His35            34 (±8)          68
                     Arg35(wt)        22 (±6)          44

Growth phenotype‡ column as extracted (35 marks survive for 48 rows, so they cannot be assigned to individual rows): + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + - + + +
*Site of amber suppression and the original wild-type amino acid at that position in E. coli thymidylate synthase.
†Amino acid inserted at the amber codon at that site.
‡Growth phenotype as described in Figure 2.
§Specific activity of suppressed variants in units per mg of TS enzyme, where the concentration of thymidylate synthase in the extract is determined by quantitative Western blotting as described in Materials and Methods. Values in parentheses represent the combined standard deviation obtained by tritium-release assays and densitometry analyses. One unit is defined as the release of 1 μmol of tritiated water in one hour under the conditions described.
‖Percent of wild-type activity determined from an extract prepared from E. coli containing the wild-type thyA gene on the pACYC184 vector.
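The relation between the last two columns of Table I can be illustrated with a short Python sketch. The function names are hypothetical, and the wild-type reference of 50 units per mg is taken from the "none" row. Small differences from the tabulated percentages are expected if the published values were computed from unrounded activities, which is an assumption here.

```python
# Sketch: relate a specific activity from Table I to the wild-type value.
# Assumption: the wild-type reference is the "none" row (50 units per mg TS).

WILD_TYPE_SPECIFIC_ACTIVITY = 50.0


def percent_of_wild_type(specific_activity, wild_type=WILD_TYPE_SPECIFIC_ACTIVITY):
    """Express a specific activity as a percentage of the wild-type value."""
    if wild_type <= 0:
        raise ValueError("wild-type activity must be positive")
    return 100.0 * specific_activity / wild_type


def fold_reduction(specific_activity, wild_type=WILD_TYPE_SPECIFIC_ACTIVITY):
    """Return how many-fold lower than wild type a variant is."""
    if specific_activity <= 0:
        raise ValueError("specific activity must be positive")
    return wild_type / specific_activity


if __name__ == "__main__":
    # Rounded activities copied from Table I for three variants.
    for name, activity in [("Ser23", 40.0), ("Cys20", 2.6), ("Tyr26", 0.41)]:
        print(name, percent_of_wild_type(activity), fold_reduction(activity))
```

With the rounded value for Cys20 (2.6 units per mg), the ratio to wild type is a little over 19, in line with the approximately 20-fold reduction quoted for the Asp-20 variants.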
Fig. 1. MLM092K is a recombinant M13 clone that contains a kanamycin resistance marker (black region) cloned in between chromosomal DNA that flanks the 5' or 3' end of the thyA gene (stippled boxes). The entire thyA gene has been deleted from this clone. Under specific conditions,24 the phage can be forced to recombine with the E. coli chromosome (thin black lines) at regions of homology at the 5' or 3' flanking regions to create an MLM092K lysogen. Subsequent excision of the phage from the chromosome can lead to gene replacement. In the above example the initial recombination to produce the lysogen occurred in the 3' flanking region and phage excision occurred through the homology at the 5' flanking regions. This technique has been used to create a thyA deletion in XAC25.
Measurement of Thymidylate Synthase Activity
Activity of thymidylate synthase enzyme in the crude extract was measured by a modification of the tritium release assay described by Roberts.39 In a standard assay, 0.2-4 μg total protein from a crude lysate was added to 50 mM Tris-HCl pH 8.0, 10 mM DTT, 0.1 mM EDTA, 25 mM MgCl2, 15 mM formaldehyde, 0.4 mM tetrahydrofolate (Fluka Biochemicals), and 5 μM [5-3H]-dUMP (specific activity = 226 dpm/pmol dUMP) in a 100 μl volume. Varying amounts of extract were tested to avoid saturating levels of thymidylate synthase activity. Reactions were carried out at 25°C for 15 min and were stopped by the addition of 100 μl 15% charcoal suspension (Sigma). The charcoal was pelleted by a 5-min spin at room temperature. The supernatant was passed through a Spartan 3 nylon filter (5 μm pore size, Schleicher and Schuell), and 100 μl of the flowthrough was analyzed in 5 ml Ecolume scintillation cocktail (ICN Radiochemicals). Extract prepared from E. coli containing the wild-type thyA gene on the pACYC184 vector was used as the wild-type standard.
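As a rough illustration of how counts from this assay translate into the units reported in Table I, the following Python sketch converts scintillation counts into tritium released per hour and then into units per mg of TS. It is not the authors' calculation. The function and parameter names are hypothetical, the unit is assumed to be 1 μmol of tritium released per hour, the 100 μl counted is assumed to be half of the roughly 200 μl present after charcoal addition, and the amount of TS in the assay (`ts_ng`) is assumed to come from the quantitative Western blot.

```python
# Sketch: convert scintillation counts from the tritium release assay into
# units per mg of TS. All names and default values below are hypothetical
# except the specific radioactivity and the reaction time, which are quoted
# from the assay description.

SPECIFIC_RADIOACTIVITY_DPM_PER_PMOL = 226.0
REACTION_MINUTES = 15.0
# Assumption: 100 ul of reaction plus 100 ul of charcoal suspension gives
# about 200 ul, so the 100 ul that is counted holds half of the released tritium.
FRACTION_COUNTED = 0.5


def pmol_released_per_hour(counted_dpm, background_dpm=0.0):
    """Tritium released in the whole reaction, scaled to one hour."""
    net_dpm = counted_dpm - background_dpm
    if net_dpm < 0:
        raise ValueError("background exceeds the counted signal")
    pmol_in_reaction = (
        net_dpm / SPECIFIC_RADIOACTIVITY_DPM_PER_PMOL / FRACTION_COUNTED
    )
    return pmol_in_reaction * 60.0 / REACTION_MINUTES


def units_per_mg_ts(counted_dpm, ts_ng, background_dpm=0.0):
    """Specific activity in umol per hour per mg TS.

    pmol per hour divided by ng of TS is numerically equal to
    umol per hour divided by mg of TS, so no further scaling is needed.
    """
    if ts_ng <= 0:
        raise ValueError("amount of TS must be positive")
    return pmol_released_per_hour(counted_dpm, background_dpm) / ts_ng


if __name__ == "__main__":
    # Invented example: 2260 net dpm counted, 2 ng TS in the assay.
    print(units_per_mg_ts(2260.0, ts_ng=2.0))
```

Testing several extract amounts, as described above, keeps the measured release in the range where it scales with the amount of enzyme, which a conversion of this kind takes for granted.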
Quantitative Western Analysis of Crude Extracts
The concentration of thymidylate synthase enzyme in the crude extract was determined by quantitative Western blots. Proteins were separated by SDS polyacrylamide gel electrophoresis and transferred to nitrocellulose filters using the SemiPhor semi-dry blotting apparatus (Hoefer Scientific Instruments). Filters were incubated with anti-TS antisera (kindly provided by Diagnostics Products Corporation) and developed using goat anti-rabbit 125I-IgG (NEN). Filters were exposed to X-ray film to obtain autoradiographs of the blots. The autoradiographs were quantitated by measuring transmittance on a Bio-Rad 620 Video Densitometer. To standardize densitometry transmittance values with respect to the amount of thymidylate synthase protein, varying concentrations of purified thymidylate synthase (kindly provided by Kate Welsh, Agouron Pharmaceuticals Corp.) were included in every Western blot. For specific activity values shown in Table I, one unit is defined as the release of 1 μmol of tritium per mg of thymidylate synthase protein in 1 hr under the conditions described above.

Determination of Evolutionary Conservation and Amino Acid Substitutional Tolerance Values
The Structure-Genetic (S-G) Matrix weighting scheme, as modified by Feng et al.,40 was used to quantitate amino acid similarity for the purpose of analyzing amino acid sequence conservation and amino acid substitutional tolerance. Figure 4 shows an alignment of TS enzymes from 13 different sources. Figure 5 shows a comparison of evolutionary conservation and amino acid substitutional tolerance. An analogous comparison done without taking into account the identity of each amino acid (unweighted scheme) yielded similar results (data not shown).

Fig. 3. Temperature-sensitive mutants of thymidylate synthase. Purified colonies were analyzed as in Figure 2 at the following temperatures: 30°C, 37°C, and 39.5°C. Temperature-sensitive sites are indicated by the three-letter amino acid code representing the wild-type amino acid at each position followed by the position number followed by the single letter code of the amino acid substituted at that site. [Figure panel not recoverable; legible row labels include Ser28 (C, F, R) and Ile29 (R).]

RESULTS
Creation of a thyA Deletion Strain
Using oligonucleotide mutagenesis, the thyA gene from -2 nucleotides to 10 nucleotides beyond the stop codon was removed from an M13 clone and an EcoRI site was introduced at the deletion juncture. A HindIII fragment containing 212 nucleotides of 5' and 144 nucleotides of 3' flanking chromosomal DNA was subcloned into a recombinant M13 vector, M13mp10cat, whose EcoRI site had been previously destroyed. The kanamycin resistance marker from pUC4-KAPA was cloned into the EcoRI site at the deletion juncture to create MLM092K (Fig. 1). The chromosomal thyA gene in strain XAC was replaced with the kanamycin-resistant deletion mutant from MLM092K via a recombination technique developed by Blum et al.,24 resulting in a kanamycin-resistant thyA deletion strain, XAC25. When P1 lysates of XAC25 were used to transduce other strains, 100% of the kanamycin-resistant transductants were Thy-. It is worthy of note that the complete deletion of the structural portion of thyA from the E. coli chromosome has been previously attempted
Fig. 4. Amino acid sequence alignment of thymidylate synthases from various sources. Sequences of 13 different thymidylate synthases from various sources2-14 were aligned using the multiple sequence alignment program package from Doolittle and Feng.54 The numbering system is for the E. coli protein.2 Blocks of sequence with high conservation have been highlighted with a box. The region spanning positions 20 through 35 is shaded. Symbols above the sequences mark β-sheets and α-helices.
Fig. 2. Suppression patterns of E. coli thymidylate synthase. Purified colonies of the 30 thyA amber mutants transformed into 13 amber suppressor strains were streaked onto media with or without thymidine and growth of individual colonies was compared at 37°C. Filled box = 50-100% size of colonies on plates containing thymidine; darkly shaded box = 10-50%; lightly shaded box < 10%; white box = no individual colonies observed. Sites 20, 22, 23, 24, 25, 26, 27, 28, 29, and 31 are from this study. The remainder of the sites are taken from Michaels et al.22 The sites are indicated by the three-letter amino acid code representing the wild-type amino acid at each position followed by the position number. Amino acids substituted are grouped as nonpolar, polar, basic, and acidic according to Lehninger.53
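The growth categories in the Fig. 2 legend amount to a simple binning rule, sketched below in Python. The function name is hypothetical. Because the legend lists 50% in two ranges and does not say how 10% is handled, assigning exactly 50% to the filled box and exactly 10% to the darkly shaded box is an assumption made here.

```python
# Sketch of the Fig. 2 legend as a binning rule.
# Assumptions: a relative size of exactly 50% counts as "filled" and exactly
# 10% counts as "darkly shaded"; the legend does not settle these boundaries.


def growth_category(relative_colony_size_percent, colonies_observed=True):
    """Map colony size (percent of size on thymidine plates) to a legend box."""
    if not colonies_observed:
        return "white box"
    if relative_colony_size_percent < 0:
        raise ValueError("relative colony size cannot be negative")
    if relative_colony_size_percent >= 50:
        return "filled box"
    if relative_colony_size_percent >= 10:
        return "darkly shaded box"
    return "lightly shaded box"


if __name__ == "__main__":
    for size in (80, 30, 5):
        print(size, growth_category(size))
    print("no colonies", growth_category(0, colonies_observed=False))
```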
[Fig. 4 alignment panel: sequence rows labeled E. coli, L. bacillus, B. subtilis, T4, Phi-3-T, S. cerevisiae, C. albicans, P. carinii, Varicella-Zoster Virus, Herpes-Simplex Virus, Herpes-Virus ateles, Mouse, and Human; the aligned residues are not recoverable from this extraction.]
unsuccessfully,36,41 though recently a partial deletion has been obtained.42
In Vivo Growth Assays
To date, E. coli tRNA suppressors have been used to introduce a total of 245 amino acid substitutions in E. coli TS (Fig. 2). In this investigation, 10 new sites between residues 20 and 35 have been analyzed. This region of the TS protein represents a block of amino acids that is highly conserved among 13 known TS sequences (Fig. 4). The majority of the residues in this region are not in the vicinity of the substrate/cofactor binding pocket and therefore do not seem to play a direct role in catalysis (Fig. 6). Oligonucleotide-directed mutagenesis was used to introduce amber (5'-TAG-3') mutations at each of the 10 new sites within this region. Suppressor tRNAs, capable of inserting a specific amino acid in response to an amber termination codon, were then used to introduce up to 13 different amino acid substitutions at these sites. A bank of 124 new mutationally altered TS proteins was created in this manner. Amino acid substitution mutants were analyzed in vivo for their ability to grow as single colonies in the absence of exogenous thymidine at three different temperatures. The results of this study, as well as those of a previous study,22 are summarized in Figure 2. Temperature-sensitive mutations found at positions 28, 29, 30, and 31 are shown in Figure 3. Substitutions at these sites, which are centered over the first β-bulge structure, show that the β-bulge structure may be particularly sensitive to higher temperatures. No other temperature-sensitive mutants were found.

In Vitro Biochemical Activities
The activities of 47 substitution mutants were tested in vitro by the tritium release assay using crude lysates.39 The tritium release assay measures enzyme activity by the release of a tritiated hydrogen atom at the 5-position of the dUMP ring. This position is concomitantly methylated to yield the dTMP product. To determine enzyme activity, tritium released into the aqueous media was separated from dUMP-bound tritium by adsorption of [3H]dUMP to activated charcoal, and the non-adsorbed 3H2O was counted by scintillation. Quantitative Western blotting was used to determine the concentration of thymidylate synthase in the crude lysate in order to calculate specific activities. Specific activity values given in Table I are accurate to within a factor of approximately two. Mutants displaying the Thy+ phenotype in in vivo growth assays had specific activities ranging from 4.5% to 144% of wild type, with over half of the Thy+ mutants having at least 50% of the wild-type activity. Thy- mutants had specific activities less than 1% of wild type (Table I).
Analysis of Amino Acid Substitutional Tolerance The substitution patterns at 14 sites, between positions 20 and 35, can be divided into three general classes: high substitutional tolerance, intermediate substitutional tolerance, and low substitutional tolerance. Sites with high substitutional tolerance could be substituted with at least 9 of the 13 amino acids tested. Sites with intermediate substitutional tolerance could be substituted with 5 to 7 different amino acids. Sites with low substitutional tolerance could be substituted by, at most, 3 different amino acids. The substitution patterns for each site are shown in Figure 2 and summarized below.
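The three classes defined above can be written as a small decision rule. The Python sketch below is illustrative only, and the function name is hypothetical. The stated cut-offs leave counts of 4 and 8 unassigned, so the sketch returns "unclassified" for them rather than guessing which class was intended.

```python
# Sketch of the three tolerance classes using the cut-offs stated in the text:
# high = at least 9 tolerated, intermediate = 5 to 7, low = at most 3.
# Counts of 4 or 8 are not covered by the stated definitions.


def tolerance_class(n_tolerated, n_tested=13):
    """Classify a site by the number of tolerated (Thy+) substitutions."""
    if not 0 <= n_tolerated <= n_tested:
        raise ValueError("n_tolerated must lie between 0 and n_tested")
    if n_tolerated >= 9:
        return "high"
    if 5 <= n_tolerated <= 7:
        return "intermediate"
    if n_tolerated <= 3:
        return "low"
    return "unclassified"


if __name__ == "__main__":
    for count in (13, 6, 1, 8):
        print(count, tolerance_class(count))
```

A site that accepts a single substitution falls into the low class under this rule, and a site where none of the tested substitutions is tolerated does as well.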
High substitutional tolerance
At sites Gly-23, Gly-25, Thr-26, Leu-27, and at the previously studied sites Gln-33 and Arg-35,22 most or all of the substitutions that were studied allowed the cell to grow in the absence of thymidine. In vitro biochemical assays further showed that in many cases mutants had levels of specific activity comparable to wild type (Table I). At Gly-23 and Gly-25, polar amino acids of various sizes such as Ser, Gln, and Tyr had wild-type levels of specific activity. Substitution with charged amino acids such as His and Glu retained significant levels of specific activity, though they were reduced approximately 4-fold compared to wild type. At Thr-26, substitutions with heterologous amino acids such as Gln, Leu, and His caused about 4- to 10-fold reductions in specific activities compared to wild type. Substitution with Ser did not affect the specific activity of the enzyme. Various substitutions at Leu-27, Gln-33, and Arg-35 did not significantly affect the specific activities of the mutant enzymes. The high degree of substitutional tolerance at Arg-35 was surprising since the crystal structure shows Arg-35 to be involved in hydrogen bonding with the backbone carbonyls of Phe-30 and Gly-31.15 Apparently, this interaction is not crucial for enzyme function and hence may not be important for the β-bulge structure found at Phe-30.

Intermediate substitutional tolerance
Thr-22, Thr-24, Ser-28, and Ile-29 had intermediate levels of substitutional tolerance. At these sites some substitutions were tolerated while others were not. The patterns of amino acid substitutional tolerance at these sites shed light on the possible roles of these residues in the structure and function of the TS enzyme. At Thr-22, allowable amino acid substitutions were limited to polar residues, including substitution with the large Tyr residue. Nonpolar residues such as Gly, Ala, Phe, and Leu as well as charged
Panel A. Values of Amino Acid Conservation and Substitutional Intolerance
Fig. 5. Analysis of evolutionary sequence conservation and amino acid substitutional intolerance. A shows the values of amino acid sequence conservation (C), in shaded bars, for each of the 30 sites shown in Figure 2. Values range from 0 to 1, where 0 represents the least amino acid sequence conservation and 1 represents the greatest amino acid sequence conservation. Hatched bars show the values of amino acid substitutional intolerance (S) for the same sites. Values range from 0 to 1, where 0 represents the least substitutional intolerance and 1 represents the greatest substitutional intolerance. The value for amino acid conservation (C) was calculated using a weighted scale based on the Structure-Genetic (S-G) Matrix, which takes into account the genetic and structural similarities of amino acids.40 The following formula was used to obtain values normalized to a scale of 0 to 1:
C = (C' - x)/(1 - x)
where C = value of amino acid sequence conservation normalized to a scale from 0 to 1. C' = average raw score of amino acid substitution obtained using S-G matrix values. x = average raw score of amino acid substitution by all 20 amino acids obtained using S-G matrix values. This represents the lowest possible value of amino acid sequence conservation. Therefore, the value [1 - x] represents the range of possible values. The value for amino acid substitutional intolerance (S) is calculated in an analogous manner by counting those substitutions that are Thy+ and omitting Thy- substitutions. The following formula was used:
S = (S' - x)/(1 - x)
where S = value of amino acid substitutional intolerance normalized to a scale from 0 to 1. S' = average raw score of amino acid substitutions that retained the Thy+ phenotype using S-G matrix values. x = same as above. In both cases the E. coli sequence is used as the template. B shows a difference plot (ΔC-S) where values for amino acid substitutional intolerance (S) were subtracted from values of amino acid sequence conservation (C).
Fig. 6. Ribbon representation of E. coli thymidylate synthase dimer showing the dimer interface. The N-terminus and C-terminus are denoted as N and C, respectively. Residues 20 through 35 for one of the monomers are numbered according to Belfort et al.2
residues such as His, Arg, and Glu were not tolerated. The crystal structure of the enzyme indicates that the backbone carbonyl of Thr-22 makes an indirect H-bond with Ile-264 via a water molecule. Substitutions at Thr-24 appear to be insensitive to the size and polarity of amino acids, accepting substitutions by nonpolar as well as polar amino acids of various sizes. Thr-24 can be substituted with Ala and Phe as well as with the charged residues His, Arg, and Glu and the polar residues Cys and Ser. The differences in substitution patterns between Thr-24 and Thr-22 suggest that their roles in the protein are different. At Ser-28, substitutions with large residues such as Phe, Leu, and Tyr were not tolerated. Ser-28 was also intolerant to substitutions with the positively charged residues His, Lys, and Arg. Substitutions with polar residues such as Cys and Gln as well as the negatively charged Glu were tolerated. Ile-29 tolerated substitutions by amino acids that were large and nonpolar such as Phe and Leu, as well as by polar amino acids. Substitutions with smaller residues such as Gly and Ala as well as with charged residues such as Lys, Arg, and Glu rendered the enzyme inactive. A His substitution at position 29 had levels of specific activity comparable to wild type (Table I), suggesting that His may be uncharged at this position. This may be facilitated by the pK value of the His side chain (pK = 6.0), which is near neutral.
Low substitutional tolerance
At Asp-20, Cys and Ser substitutions were tolerated. Asp-20 could also tolerate substitution with Glu, a larger but electrochemically similar residue. A Gln substitution, which is analogous to a Glu substitution in terms of size but not in terms of charge, was not tolerated. Substitutions with Ala and Gly were also not tolerated. In vitro biochemical assays of the Cys and Ser substitutions at Asp-20 show that the mutants are actually reduced in specific activity by approximately 20-fold, which indicates that Asp-20 plays a specific role in protein function. This may be attributed to its close proximity to Arg-21, which plays a crucial role in substrate binding.15,22 The previously studied Arg-21 could not tolerate any of the substitutions tested.22 As determined by the 3-D structure, the long, positively charged side chain of Arg-21 reaches into the base of the substrate binding pocket to stabilize the negatively charged phosphate moiety of FdUMP.15 The structurally similar Lys residue could not substitute for Arg at this position. This is because the terminal nitrogen groups of Arg-21 also make H-bonds with the carboxyl oxygens of Ile-264 at the C-terminus of the TS protein, as well as indirect H-bonds with the folate cofactor via a water molecule. The interaction with Ile-264 is important for stabilizing the conformational changes associated with formation of the ternary complex. This interaction is possible only because of the multiple nitrogens in the Arg residue. The Lys residue has only one nitrogen group and is therefore incapable of forming the multiple H-bonds required at this position. These results correspond well with substitutions at the corresponding Arg-44 in the mouse TS enzyme43 and Arg-23 in the Lactobacillus casei TS enzyme.44
Gly-31 and the previously studied Phe-30 are important in maintaining the β-bulge structure. Phe-30 can be substituted only with the structurally similar, but polar, Tyr residue or with the large nonpolar Leu residue. Gly-31 can be substituted only with Ser. Intolerance to substitutions by amino acids of different size, together with the temperature-sensitive mutants found at these positions, suggests that the β-bulge structure is highly sensitive to side chain packing.
DISCUSSION
Using E. coli tRNA suppressors, a large number of amino acid substitutions have been introduced at a highly conserved region near the N-terminus of E. coli thymidylate synthase. The block of conserved residues between positions 20 and 35 encompasses regions of various structure and function. Residues 20 through 24 are in a surface loop and residues 25 through 35 are in a β-strand, which contains a β-bulge structure centered over residues 30 and 31. Residues 30 through 35, within the second half of the β-strand, are at the dimer interface where two TS monomers associate.
Using a strain in which the entire thyA gene was deleted in order to eliminate any background thymidylate synthase activity, mutant thymidylate synthase proteins were analyzed by in vivo growth assays for their ability to complement the Thy- phenotype of the thyA deletion strain. A subset of these mutants was further analyzed by in vitro tritium release assays to determine their specific activities.
Within the surface loop structure, at least three residues are important for enzyme function. Asp-20, Arg-21, and Thr-22, present at the base of the substrate binding pocket, had limited tolerance to amino acid substitutions. This correlates well with the 3-D structure of the ternary complex, which shows Arg-21 making direct contact with the phosphate moiety of dUMP as well as the C-terminal carboxyl oxygens of Ile-264. Substitution patterns at positions 20 and 22 suggest that this interaction is stabilized by a polar environment provided, in part, by Asp-20 and Thr-22 in the wild-type enzyme.
Within the β-strand, sites near the β-bulge structure showed the least tolerance to amino acid substitution. The patterns of substitution at sites 28, 29, 30, and 31 indicate that these sites are highly specific for their wild-type amino acid. Furthermore, some variants at these sites were temperature-sensitive.
These results suggest that the β-bulge structure is stabilized by a highly ordered packing arrangement of amino acids.
Pro substitutions were well tolerated at various positions in E. coli TS. Pro residues are usually absent from the interiors of α-helices and β-strands in other known proteins due to their disruptive effects on these structures (45). The β-strand spanning sites 25 through 35 is not preferentially affected by Pro substitutions, indicating that disruptions in the β-strand do not significantly affect enzyme function.
Amino acid substitutional tolerance has been correlated with solvent accessibility, as first shown by Perutz et al. (46). A recent example is in λ repressor, where 17 consecutive residues at the dimer interface were analyzed by cassette mutagenesis (47). Bowie et al. (47) show that the majority of sites that are intolerant to amino acid substitution are also inaccessible to the solvent and that most of the highly exposed residues are tolerant to substitutions with a wide range of chemically different side chains, including hydrophilic and hydrophobic residues.
The fractional solvent accessibility of each residue in E. coli thymidylate synthase was determined as described by Bowie et al. (48). Side chains are considered solvent inaccessible if the fractional accessibilities are 10% or less. Residues with fractional accessibilities greater than 10% are considered solvent accessible. A comparison of amino acid substitutional tolerance with solvent accessibility for each site of E. coli TS shown in Figure 2 reveals that most residues with relatively low substitutional tolerance are also solvent inaccessible, and that most sites with high substitutional tolerance are solvent accessible (data not shown). The exceptions are Arg-21 and Asp-110, which are solvent accessible but intolerant to amino acid substitution. This most likely reflects the fact that Arg-21 and Asp-110 are involved in specific H-bonds important for enzyme function (15). Therefore, amino acid substitutional tolerance is a good indicator of solvent accessibility, and vice versa. However, substitutional tolerance may be better than solvent accessibility in identifying solvent accessible residues that are involved in important polar interactions.
In terms of amino acid sequence, amino acid substitutional tolerance is a poor indicator of evolutionary conservation.
Some highly conserved sites in E. coli thymidylate synthase were tolerant to substitution with heterologous amino acids. These findings were unexpected based on the assumption that sites with high evolutionary conservation serve important functions in the enzyme and hence would be substitutionally intolerant. A comparison of evolutionary conservation with amino acid substitutional intolerance shows that sites 23, 25, 26, 27, 33, 35, 81, and 127 have high values of evolutionary conservation (C) compared to values of amino acid substitutional intolerance (S) (Fig. 5). Tritium-release assays further show that various amino acid substitutions at positions 23, 25, 27, 33, and 35 do not have significant effects on the specific activity of the enzyme in vitro (Table 1). Thus single amino acid substitutions are differentially tolerated among highly conserved sites within the E. coli enzyme. While some evolutionarily conserved sites may be highly specific for their wild-type amino acid, other sites that are equally conserved may be replaced with heterologous amino acids with no apparent effect on enzyme function. Similar observations have been made in human α-interferon (49), yeast cytochrome c (50), human tumor necrosis factor (51), and chicken nerve growth factor (52). These results illustrate the complexity of evolutionary conservation and provide impetus for future studies to better understand the relationship between protein structure-function and amino acid sequence conservation.
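The solvent-accessibility comparison described earlier in this Discussion (side chains with fractional accessibility of 10% or less counted as inaccessible, then checked against substitutional tolerance) can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the `Site` class, the function names, the boolean tolerance flag, and all numeric values are hypothetical placeholders rather than data from this paper.

```python
from dataclasses import dataclass

# Cutoff taken from the text: fractional accessibility of 10% or less
# is treated as solvent inaccessible.
ACCESSIBILITY_CUTOFF = 0.10


@dataclass(frozen=True)
class Site:
    name: str
    fractional_accessibility: float  # 0.0 to 1.0
    tolerant: bool  # hypothetical flag: True if most tested substitutions were tolerated


def is_solvent_accessible(site: Site) -> bool:
    """Accessible only if strictly above the 10% cutoff."""
    return site.fractional_accessibility > ACCESSIBILITY_CUTOFF


def find_exceptions(sites: list[Site]) -> list[str]:
    """Names of sites where accessibility and tolerance disagree,
    e.g. accessible but intolerant sites."""
    return [s.name for s in sites if is_solvent_accessible(s) != s.tolerant]


# Placeholder values for illustration only; NOT measurements from this study.
sites = [
    Site("site_A", 0.05, False),  # buried and intolerant: agrees
    Site("site_B", 0.40, True),   # exposed and tolerant: agrees
    Site("site_C", 0.35, False),  # exposed but intolerant: an exception
    Site("site_D", 0.10, False),  # exactly at the cutoff counts as inaccessible
]

print(find_exceptions(sites))
```

In this toy input only `site_C` is flagged, which mirrors the kind of exception the text reports for solvent accessible residues that are nonetheless intolerant to substitution.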
ACKNOWLEDGMENTS
We thank Jean Lee and Lucie Pham for providing expert technical assistance, and Peter Markiewicz for assistance with sequence alignment programs. We also thank David Matthews of the Agouron Pharmaceutical Corporation for sharing crystal data for the E. coli TS ternary complex and Jim Bowie for assistance with solvent accessibility determinations. Brett Lovejoy provided the 3-D representation of the TS dimer. Caine Wong and Mary Anne Schofield provided useful comments on the manuscript. This work was supported in part by U.S. Public Health Service National Research Award GM-07104 to C.W.K. and a grant from the National Institutes of Health (GM 43827-01A1) to J.H.M.
REFERENCES
1. Santi, D. V., Danenberg, P. V. In: "Folates and Pterines," Vol. 1. Blakely, R. L. and Benkovic, eds. New York: John Wiley & Sons, 1984:345-398.
2. Belfort, M., Maley, G., Pederson-Lane, J., Maley, F. Primary structure of the Escherichia coli thyA gene and its thymidylate synthase gene product. Proc. Natl. Acad. Sci. USA 80:4914-4918, 1983.
3. Maley, G., Bellisario, R., Guarino, D., Maley, F. The primary structure of Lactobacillus casei thymidylate synthetase. J. Biol. Chem. 254:1301-1304, 1979.
4. Iwakura, M., Kawata, M., Tsuda, K., Tanaka, T. Nucleotide sequence of the thymidylate synthase B and dihydrofolate reductase genes contained in one Bacillus subtilis operon. Gene 64:9-20, 1988.
5. Kenny, E., Atkinson, T., Hartley, E. Nucleotide sequence of the thymidylate synthetase gene (thyP3) from the Bacillus subtilis phage φ3T. Gene 34:335-342, 1985.
6. Chu, F., Maley, G., Maley, F., Belfort, M. Intervening sequence in the thymidylate synthase gene of bacteriophage T4. Proc. Natl. Acad. Sci. USA 81:3049-3053, 1984.
7. Taylor, G., Lagosky, P., Storms, R., Haynes, R. Molecular characterization of the cell-cycle regulated thymidylate synthase gene of Saccharomyces cerevisiae. J. Biol. Chem. 262:5298-5307, 1987.
8. Edman, U., Edman, J. C., Lundgren, B., Santi, D. V. Isolation and expression of the Pneumocystis carinii thymidylate synthase gene. Proc. Natl. Acad. Sci. USA 86:6503-6507, 1989.
9. Singer, S. C., Richards, C. A., Ferone, R., Benedict, D., Ray, P. Cloning, purification, and properties of Candida albicans thymidylate synthase. J. Bacteriology 171:1372-1378, 1989.
10. Richter, J., Puchtler, I., Fleckenstein, B. Thymidylate synthase gene of Herpesvirus ateles. J. Virol. 62:3530-3535, 1988.
11. Honess, R. W., Bodemer, W., Cameron, K. R., Niller, H. H., Fleckenstein, B., Randall, R. E. The A + T-rich genome of Herpesvirus saimiri contains a highly conserved gene for thymidylate synthase. Proc. Natl. Acad. Sci. USA 83:3604-3608, 1986.
12. Thompson, R., Honess, R. W., Taylor, L., Morran, J., Davidson, A. J. Varicella-Zoster virus specifies a thymidylate synthetase. J. Gen. Virol. 68:1449-1455, 1987.
13. Perryman, S. M., Rossana, C., Deng, T., Vanin, E. F., Johnson, L. F. Sequence of a cDNA for mouse thymidylate synthase reveals striking similarity with the prokaryotic enzyme. Mol. Biol. Evol. 3:313-321, 1986.
14. Takeishi, K., Kaneda, S., Ayusawa, D., Shimizu, K., Gotoh, O., Seno, T. Nucleotide sequence of a functional cDNA for human thymidylate synthase. Nucleic Acids Res. 13:2035-2043, 1985.
15. Matthews, D. A., Appelt, K., Oatley, S. J., Xuong, N. H. Crystal structure of Escherichia coli thymidylate synthase containing bound 5-fluoro-2'-deoxyuridylate and 10-propargyl-5,8-dideazafolate. J. Mol. Biol. 214:923-936, 1990.
16. Matthews, D. A., Villafranca, J. E., Janson, C. A., Smith, W. W., Welsh, K., Freer, S. Stereochemical mechanism of action for thymidylate synthase based on the X-ray structure of the covalent inhibitory ternary complex with 5-fluoro-2'-deoxyuridylate and 5,10-methylenetetrahydrofolate. J. Mol. Biol. 214:937-948, 1990.
17. Matthews, D. A., Appelt, K., Oatley, S. J. Stacked beta-bulges in thymidylate synthase account for a novel right-handed rotation between opposing beta-sheets. J. Mol. Biol. 205:449-454, 1989.
18. Montfort, W. R., Perry, K. M., Fauman, E. B., Finer-Moore, J. S., Maley, G. F., Hardy, L., Maley, F., Stroud, R. M. Structure, multiple site binding, and segmental accommodation in thymidylate synthase on binding dUMP and an anti-folate. Biochemistry 29:6964-6977, 1990.
19. Hardy, L. W., Finer-Moore, J. S., Montfort, W. R., Jones, M. O., Santi, D. V., Stroud, R. M. Atomic structure of thymidylate synthase: Target for rational drug design. Science 235:448-455, 1987.
20. Dev, I. K., Yates, B. B., Leong, J., Dallas, W. S. Functional role of cysteine-146 in Escherichia coli thymidylate synthase. Proc. Natl. Acad. Sci. USA 85:1472-1476, 1988.
21. Dev, I. K., Yates, B. B., Atashi, J., Dallas, W. S. Catalytic role of histidine-147 in Escherichia coli thymidylate synthase. J. Biol. Chem. 264:19132-19137, 1989.
22. Michaels, M. L., Kim, C. W., Matthews, D. A., Miller, J. H. Escherichia coli thymidylate synthase: Amino acid substitutions by suppression of amber nonsense mutations. Proc. Natl. Acad. Sci. USA 87:3957-3961, 1990.
23. Perry, K. M., Fauman, E. B., Finer-Moore, J. S., Montfort, W. R., Maley, G. F., Maley, F., Stroud, R. M. Plastic adaptation toward mutations in proteins: Structural comparison of thymidylate synthase. Proteins: Structure, Function and Genetics 8:315-333, 1990.
24. Blum, P., Holzschu, D., Kwan, H.-S., Riggs, D., Artz, S. Gene replacement and retrieval with recombinant M13mp bacteriophages. J. Bacteriol. 171538436, 1989.
25. Miller, J. H. "Experiments in Molecular Genetics." Cold Spring Harbor, NY: Cold Spring Harbor Lab, 1972.
26. Miller, J. H., Albertini, A. M. Effects of surrounding sequence on the suppression of nonsense codons. J. Mol. Biol. 164:59-71, 1983.
27. Bradley, D., Park, J. V., Soll, L. tRNA2Gln Su+2 mutants that increase amber suppression. J. Bacteriology 145:704-712, 1981.
28. Ryden, S. M., Isaksson, L. A. A temperature-sensitive mutant of Escherichia coli that shows enhanced misreading of UAG/A and increased efficiency for some tRNA nonsense suppressors. Mol. Gen. Genet. 193:38-45, 1984.
29. Masson, J.-M., Miller, J. H. Expression of synthetic suppressor tRNA genes under the control of a synthetic promoter. Gene 47:179-183, 1986.
30. Normanly, J., Masson, J.-M., Kleina, L. G., Abelson, J., Miller, J. H. Construction of two Escherichia coli amber suppressor genes: tRNA-Phe and tRNA-Cys. Proc. Natl. Acad. Sci. USA 83:6548-6552, 1986.
31. Normanly, J., Ogden, R. C., Abelson, J. Changing the identity of a transfer RNA. Nature 321:213-219, 1986.
32. McClain, W. H., Foss, K. Changing the acceptor identity of a transfer RNA by altering nucleotides in a "variable pocket." Science 241:1804-1807, 1988.
33. Kleina, L. G., Masson, J.-M., Normanly, J., Abelson, J., Miller, J. H. Construction of Escherichia coli amber suppressor tRNA genes II. Synthesis of additional tRNA genes and improvement of suppressor efficiency. J. Mol. Biol. 213:705-717, 1990.
34. Normanly, J., Kleina, L. G., Masson, J.-M., Abelson, J., Miller, J. H. Construction of Escherichia coli amber suppressor tRNA genes III. Determination of tRNA specificity. J. Mol. Biol. 213:719-726, 1990.
35. Chang, A. Y. C., Cohen, S. N. Construction and characterization of amplifiable multicopy DNA cloning vehicles derived from the P15A cryptic miniplasmid. J. Bacteriology 148:1141-1156, 1978.
36. Belfort, M., Pederson-Lane, J. Genetic system for analyzing Escherichia coli thymidylate synthase. J. Bacteriology 160:371-378, 1984.
37. Kunkel, T. A. Rapid and efficient site-specific mutagenesis without phenotypic selection. Proc. Natl. Acad. Sci. USA 82:488-492, 1985.
38. Kunkel, T. A., Roberts, J. D., Zakour, R. A. Rapid and efficient site-specific mutagenesis without phenotypic selection. Methods in Enzymol. 154:367-382, 1987.
39. Roberts, D. An isotopic assay for thymidylate synthetase. Biochemistry 5:3546-3551, 1966.
40. Feng, D. F., Johnson, M. S., Doolittle, R. F. Aligning amino acid sequences: Comparison of commonly used methods. J. Mol. Evol. 21:112-125, 1985.
41. Chung, S.-T., Greenberg, G. R. Loss of an essential function of Escherichia coli by deletion in the thyA region. J. Bacteriology 116:1145-1149, 1973.
42. Bell-Pederson, D., Galloway Salvo, J. L., Belfort, M. A transcription terminator in the thymidylate synthase (thyA) structural gene of Escherichia coli and construction of a viable thyA::Kmr deletion. J. Bacteriology 173:1193-1200, 1991.
43. Zhang, H., Cisneros, R. J., Deng, W., Johnson, L. F., Dunlap, R. B. Site-directed mutagenesis of mouse thymidylate synthase: Alteration of Arg44 to Val44 in a conserved loop guarding the active site has striking effects on catalysis and nucleotide binding. Bioch. Biophys. Res. Comm. 167:869-875, 1990.
44. Climie, S., Ruiz-Perez, L., Gonzalez-Pacanowska, D., Prapunwattana, P., Cho, S.-W., Stroud, R., Santi, D. V. Saturation site-directed mutagenesis of thymidylate synthase. J. Biol. Chem. 265:18776-18779, 1990.
45. Richardson, J. S., Richardson, D. C. Amino acid preferences for specific locations at the ends of α helices. Science 240:1648-1652, 1988.
46. Perutz, M. F., Kendrew, J. C., Watson, H. C. Structure and function of haemoglobin. II. Some relations between polypeptide chain configuration and amino acid sequence. J. Mol. Biol. 13:669-678, 1965.
47. Bowie, J. U., Reidhaar-Olson, J. F., Lim, W. A., Sauer, R. T. Deciphering the message in protein sequences: Tolerance to amino acid substitutions. Science 247:1306-1310, 1990.
48. Bowie, J. U., Luthy, R., Eisenberg, D. A method to identify protein sequences that fold into a known three-dimensional structure. Science 253:164-170, 1991.
49. Valenzuela, D., Weber, H., Weissmann, C. Is sequence conservation in interferons due to selection for functional proteins? Nature 313:698-700, 1985.
50. Hampsey, D. M., Das, G., Sherman, F. Yeast iso-1-cytochrome c: Genetic analysis of structural requirements. FEBS Letters 231:275-283, 1988.
51. Van Ostade, X., Tavernier, J., Fiers, W. Two conserved tryptophan residues of tumor necrosis factor and lymphotoxin are not involved in the biological activity. FEBS Letters 238:347-352, 1988.
52. Ibanez, C. F., Hallbook, F., Ebendal, T., Persson, H. Structure-function studies of nerve growth factor: Functional importance of highly conserved amino acid residues. EMBO 9:1477-1483, 1990.
53. Lehninger, A. L. "Biochemistry." New York: Worth, 1975.
54. Doolittle, R. F., Feng, D.-F. Nearest neighbor procedure for relating progressively aligned amino acid sequences. Methods in Enzymology 183:659-669, 1990.