J. Mol. Evol. 10, 167--182 (1977)

Journal of Molecular Evolution © by Springer-Verlag 1977

Molecular Evolution of Snake Venom Toxins

T.H. Hseu 1*, E.D. Jou 2**, C. Wang 2***, and C.C. Yang 1

1 Institute of Molecular Biology and 2 Department of Physics, National Tsing Hua University, Hsinchu, Taiwan, Republic of China

Summary. Phylogenetic trees were constructed for 62 venom toxins of snakes of the Proteroglyphae suborder using the matrix method. The tree obtained with the Minimum Spanning Tree-Cluster Analysis technique had the lowest "percent deviation" (8.55). The taxonomic relationships of these toxins agree very well with zoological opinion. However, the appearance of the tree did not directly provide a plausible evolutionary model for the toxins. A model was derived from nodal ancestral sequence calculations, comparisons between intra- and intergeneric rates of amino acid change, and generally held ideas about protein evolution. According to the model, the short neurotoxin is the ancient form of snake venom toxins. The courses of evolution leading to the present intraspecific homologous toxins are explained by gene duplication and allelomorphism.

Key words: Molecular evolution - Phylogenetics - Numerical taxonomy - Snake venom toxins - Elapidae - Hydrophiidae

In the past decade intensive studies have been made of the snake venom toxins. To date over 62 amino acid sequences have been determined from venoms of Proteroglyphae (Fig. 1). They are all small basic polypeptides with different pharmacological activities, namely neurotoxins, cardiotoxins, and so-called angusticeps-type toxins of unknown pharmacological activity. All the neurotoxins whose amino acid sequences have been determined are classified into two groups. Although the two types of neurotoxins are pharmacologically and structurally related (the short neurotoxins consist of 60-62 amino acid residues with four disulfide bonds; the long neurotoxins have 71-74 amino acid residues with five disulfide bridges), their immunochemical properties are completely different (Botes, 1972). Homologous cardiotoxins (60-61 amino acid residues with four disulfide bridges) are serologically distinct from both types of neurotoxins (Viljoen and Botes, 1973).
In the absence of paleontological records, knowledge of the classification, origin, and subsequent evolution of the venomous snakes has developed through years of effort by morphologists and, though broadly understood, has not yet attained unequivocal

* To whom correspondence should be addressed
** Present address: Department of Computer Sciences, University of Texas, Austin, Texas, USA
*** Present address: Biological Division, California Institute of Technology, Pasadena, California, USA


agreement. The difficulties arise from the limitations of the comparative method and the high degree of specialization of these snakes (Johnson, 1956). However, where morphological changes accompanying speciation have been few, molecular evidence is useful in helping to unravel the relationships of closely related species. One of the more informative molecular approaches is the construction of phylogenetic trees from amino acid sequences. Studies on the phylogenetic relationships of snake venom toxins have been attempted previously. Based on trees from 11 (Strydom, 1972a), 16 (Strydom, 1973) and 43 (Strydom, 1974) sequences, Strydom has subsequently altered his initial conclusion that a cardiotoxin-like structure was the ancient form of the snake toxins to his recent

[Figure 1: table of the sequenced toxins, with columns Genus, Species, Common name, Origin, ID no., and Toxins; each toxin entry carries a code such as (62-4) or (71-5), matching the residue and disulfide-bond counts quoted in the text. Family Elapidae: Naja (N. naja atra, N. nigricollis (N. mossambica pallida)*, N. mossambica mossambica, N. haje haje (N. haje annulifera)*, N. melanoleuca, N. nivea, N. naja siamensis, N. naja naja, N. naja, N. naja oxiana), Ophiophagus (O. hannah), Hemachatus (H. haemachatus), Dendroaspis (D. polylepis polylepis, D. jamesonii kaimosae, D. viridis, D. angusticeps), Bungarus (B. multicinctus). Family Hydrophiidae: Laticauda (L. semifasciata, L. laticaudata, L. colubrina), Enhydrina (E. schistosa), Hydrophis (H. cyanocinctus), Pelamis (P. platurus).]

Fig. 1. Toxins isolated from snake venoms of the Proteroglyphae suborder. Only those with amino acid sequences determined are listed. References to sequence data are given in Figure 2. A (*) indicates Broadley's proposed reclassification of Naja species.


recognition of two mutually exclusive alternative evolutionary pathways in the course of snake toxin evolution. Chang (1972) suggested that the complicated appearance of the phylogenetic tree constructed from 19 sequences reflects imperfection rather than anomaly.

The difficulties encountered in phylogenetic studies on snake venom toxins differ from many of those encountered by investigators of cytochrome c and hemoglobins. It is known that multiple toxin varieties are present in the venom of a single individual snake, and only some of them have been sequenced. Moreover, the number of toxin varieties may change from species to species. Furthermore, sequence alignment for all these toxins (Fig. 2) requires the assumption of a large number of deletions or insertions. The mutation distances, calculated in terms of the minimum number of mutations required (Fitch, 1966), may not provide a sufficiently close picture of the evolutionary history of the gene loci encoding the toxins. In addition, the study of toxin protein evolution is an attempt to examine the phylogenetic relationships of a set of homologous proteins that all lie below the suborder level. All of the above complexities indicate that snake venom toxins are rather unique proteins for the study of protein evolution. The present study attempted to understand the implications of these distinct characteristics for toxin phylogeny and to provide some insights into the origin and evolution of the toxin molecules and the species containing them.

Phylogenetic Tree Construction

The matrix method (Fitch and Margoliash, 1967; Hartigan, 1973; Moore et al., 1973) was chosen primarily for its general applicability and conceptual simplicity. It has been shown (Peacock and Boulter, 1975), using a computer simulation, that as dissimilarity among sequences became greater the results of the method were slightly more accurate than those obtained by the ancestral sequence method (Dayhoff and Eck, 1966). Three approaches were used. All were based on a distance matrix calculated from the Minimum Number of Mutations Required (MNMR), in terms of nucleotide changes, to convert the codon for one amino acid into that for another (Fitch, 1966). The MNMR values for amino acid pairs were taken from Table 1 of Fitch and Margoliash (1967), except that the values for pairs involving isoleucine with glutamine, glutamic acid, and lysine are one less than in that table. Deletions, which produce the gaps represented by dashes (-), were treated as if they were a 21st amino acid. Their MNMR in pairing with any of the 20 amino acids was assigned a value of 1, on the assumption of simultaneous appearance or disappearance of all three nucleotides encoding an amino acid.

The first approach was the method of Fitch and Margoliash (1967) (FM). A slightly different algorithm was employed (Jou, 1975); however, an extensive search for a best tree was not attempted. The second approach is what we call the Prelimset Ancestral Sequence (PAS) procedure. The tree-building algorithm was the same as that used in the FM procedure. The principal difference is in the calculation of averaged values for those elements of the distance matrix affected when two subsets are joined into a nodal


[Figure 2: alignment chart of the 62 toxin sequences listed in Figure 1, one row per toxin, giving the ID number, species, and toxin name followed by the aligned one-letter sequence, with gaps shown as dashes (-).]

Fig. 2. Amino acid sequence alignment chart for snake venom toxins. The one-letter IUPAC-IUB codes for amino acid residues are used. The number in front of a toxin name is the identification (ID) number of the toxin as given in Figure 1. The sequence data shown are from: 1 (Yang et al., 1969); 2 (Botes and Strydom, 1969); 2a (Botes et al., 1971); 3 (Eaker and Porath, 1967); 4, 5 (Strydom and Botes, 1971); 6 (Arnberg et al., 1974; Grishin et al., 1973); 7 (Botes, 1972); 8 (Botes, 1971); 9 (Strydom, 1972); 10 (Strydom, 1973); 11 (Banks et al., 1974); 12 (Fryklund et al., 1972); 13 (Sato, 1974); 13a (Liu et al., 1973); 14 (Wang et al., 1976); 15 (Liu and Blackwell, 1974); 16, 17 (Sato and Tamiya, 1971); 18 (Tamiya and Abe, 1972); 19 (Sato and Tamiya, 1971); 19a, 20 (Sato, 1974); 21 (Karlsson et al., 1972); 22, 23, 24 (Karlsson, 1974); 25 (Nakai et al., 1971); 26 (Ohta and Hayashi, 1973); 27 (Hayashi, 1974); 28 (Botes, 1972); 29 (Botes, 1971); 30 (Strydom, 1973); 31, 32 (Strydom, 1972); 33 (Banks et al., 1974); 34 (Grishin et al., 1974); 35, 36 (Joubert, 1973); 37 (Mebs et al., 1972); 38 (Shipolini et al., 1974); 39 (Maeda and Tamiya, 1974); 40 (Viljoen and Botes, 1973); 41 (Viljoen and Botes, 1974); 42 (Shipolini and Banks, 1974); 43 (Botes, 1974); 44, 45, 46 (Louw, 1974a); 47 (Nakai and Lee, 1970); 48 (Takechi and Hayashi, 1972); 49, 50 (Takechi et al., 1973); 51 (Hayashi et al., 1971); 52 (Louw, 1974b); 53 (Hayashi et al., 1975); 54 (Botes, 1974); 55 (Carlsson and Joubert, 1974); 56 (Weise et al., 1973); 57 (Fryklund and Eaker, 1973); 58, 59 (Carlsson, 1974)


subset. Each time a new nodal subset was formed, a prelimset nodal ancestral sequence was reconstructed from its two immediate descendant subsets according to the rules set by Fitch (1971).

The third approach is the Minimum Spanning Tree-Cluster Analysis (MSTCA) procedure. The method has been described and discussed in detail by Gower and Ross (1969) and Zahn (1971). We used the algorithm of Dijkstra (1959) to find a minimum spanning tree from a known distance matrix. Clustering was then performed by successively breaking the linkages, starting from the one with the longest linking path length.

Among the three procedures employed in the present study, both the FM and MSTCA algorithms were easily implemented for computer calculation. The PAS procedure required considerable core storage and computer time.
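
The distance calculation and the MSTCA steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program used in the study: the MNMR values are derived here from the standard genetic code rather than copied from the published table (so individual entries may differ from it), the tree is grown by a simple nearest-vertex rule as a stand-in for the Dijkstra (1959) procedure, and the names `mnmr`, `distance_matrix`, `minimum_spanning_tree`, `clusters`, the parameter `n_cuts`, and the four toy sequences are all hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in TCAG order ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {}
for triplet, aa in zip(product(BASES, repeat=3), AMINO):
    CODONS.setdefault(aa, []).append("".join(triplet))

GAP = "-"  # treated as a 21st residue, one mutation away from every amino acid


def mnmr(a, b):
    """Minimum number of nucleotide changes needed to turn residue a into b."""
    if a == b:
        return 0
    if GAP in (a, b):
        return 1
    return min(
        sum(x != y for x, y in zip(c1, c2))
        for c1 in CODONS[a]
        for c2 in CODONS[b]
    )


def distance_matrix(aligned):
    """Pairwise mutation distances between equal-length aligned sequences."""
    return [[sum(mnmr(a, b) for a, b in zip(s, t)) for t in aligned] for s in aligned]


def minimum_spanning_tree(dist):
    """Grow the tree from vertex 0, always adding the closest outside vertex."""
    best = {j: (dist[0][j], 0) for j in range(1, len(dist))}
    edges = []
    while best:
        j = min(best, key=lambda k: best[k][0])
        length, i = best.pop(j)
        edges.append((length, i, j))
        for k in best:  # update the cheapest link of each remaining vertex
            if dist[j][k] < best[k][0]:
                best[k] = (dist[j][k], j)
    return edges


def clusters(n, edges, n_cuts):
    """Remove the n_cuts longest tree links and return the connected groups."""
    kept = sorted(edges)[: len(edges) - n_cuts]
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for _, i, j in kept:
        parent[find(i)] = find(j)
    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())


toy = ["MICK", "MICR", "LKC-", "LKCN"]  # hypothetical, not taken from Fig. 2
dist = distance_matrix(toy)
tree = minimum_spanning_tree(dist)
print(tree)
print(clusters(len(toy), tree, n_cuts=1))
```

How many links to break before stopping is not specified in the text, so `n_cuts` is left as a free parameter in this sketch.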

Statistical Evaluation of a Tree

The statistical evaluation of a tree and the calculation of internodal 'path' lengths were made by least-squares fitting. With N source subsets (the present-day amino acid sequences), the number of paths is 2N-3 (in an unrooted tree). The number of relations obtained by setting the sum of path lengths connecting two source subsets equal to their input distance is N(N-1)/2. The least-squares calculations then correspond to finding 2N-3 parameters from N(N-1)/2 observations. The function minimized was

$$\sum_{i<j} w_{ij}\,(D_{ij} - D'_{ij})^2,$$
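
As a rough illustration of this fitting step, the sketch below solves the weighted normal equations for the smallest non-trivial case, an unrooted tree with N = 4 source subsets, which has 2N-3 = 5 paths and N(N-1)/2 = 6 pairwise relations. It is a minimal sketch under assumptions rather than the authors' program: the weights default to 1 because their definition is not given in this part of the text, and the function names, the quartet topology, and the branch lengths are hypothetical.

```python
def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def fit_branch_lengths(paths, observed, weights=None):
    """Weighted least-squares path lengths.

    paths[(i, j)] is the set of branch indices on the path joining leaves i and j;
    observed[(i, j)] is the input distance for that pair.
    """
    pairs = sorted(observed)
    n_branches = 1 + max(b for branches in paths.values() for b in branches)
    w = weights or {p: 1.0 for p in pairs}  # assumption: unit weights
    normal = [[0.0] * n_branches for _ in range(n_branches)]
    rhs = [0.0] * n_branches
    for p in pairs:
        for a in paths[p]:
            rhs[a] += w[p] * observed[p]
            for b in paths[p]:
                normal[a][b] += w[p]
    return solve(normal, rhs)


# Hypothetical quartet ((0,1),(2,3)): branches 0-3 are terminal, 4 is internal.
paths = {
    (0, 1): {0, 1},
    (0, 2): {0, 4, 2},
    (0, 3): {0, 4, 3},
    (1, 2): {1, 4, 2},
    (1, 3): {1, 4, 3},
    (2, 3): {2, 3},
}
true_lengths = [1.0, 2.0, 3.0, 4.0, 5.0]
observed = {p: sum(true_lengths[b] for b in bs) for p, bs in paths.items()}
fitted = fit_branch_lengths(paths, observed)
print(fitted)
```

Because the toy distances are exactly additive on the assumed tree, the fit should return the branch lengths used to generate them; with real mutation distances the residuals would generally be non-zero.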
