MOLECULAR PHYLOGENETICS AND EVOLUTION
Vol. 1, No. 3, September, pp. 179-192, 1992

Evolution of the Salmonid Mitochondrial Control Region

ANDREW M. SHEDLOCK,† JAY D. PARKER,* DAVID A. CRISPIN,* THEODORE W. PIETSCH,† AND GLENNA C. BURMER*,†

*Department of Pathology, School of Medicine, and †School of Fisheries, College of Ocean and Fishery Sciences, University of Washington, Seattle, Washington 98195

Received February 18, 1992; revised July 28, 1992

To explore the evolutionary nature of the salmonid mitochondrial DNA (mtDNA) control region (D-loop) and its utility for inferring phylogenies, the entire region was sequenced from all eight species of anadromous Pacific salmon, genus Oncorhynchus; the Atlantic salmon, Salmo salar; and the Arctic grayling, Thymallus arcticus. A comparison of aligned sequences demonstrates that the generally conserved sequence elements that have been previously reported for other vertebrates are maintained in these primitive teleost fishes. Results reveal a significantly nonrandom distribution of nucleotide substitutions, insertions, and deletions that suggests that portions of the salmonid D-loop may be under differential selective constraints and that most of the control region of these fishes may evolve at a rate similar to that of the remainder of their mtDNA genomes. Maximum likelihood and Fitch parsimony analyses of 9 kb of aligned salmonid sequence data give evolutionary trees of identical topology. These results are consistent with previous molecular studies of a limited number of salmonid taxa and with more comprehensive, classical analyses of salmonid evolution. Predictions from these data, based on a molecular clock assumption for the mtDNA control region, are also consistent with fossil evidence that suggests that species of Oncorhynchus could be as old as the Middle Pliocene and would have thus given rise to the extant Pacific salmon prior to about 5 or 6 million years ago. © 1992 Academic Press, Inc.

INTRODUCTION

The Salmonidae, one of 15 families of the euteleost order Salmoniformes, is a monophyletic assemblage of 10 genera and approximately 68 species of freshwater and anadromous fishes (whitefishes, ciscos, grayling, trouts, and salmon) distributed throughout the northern hemisphere (Nelson, 1984). Despite considerable attention from systematists, largely because of the tremendous commercial and recreational value of these fishes, salmonid phylogeny is poorly understood, and biological diversity in this family is probably significantly greater than currently recognized (Behnke, 1972). In recent years, considerable attention has focused on the taxonomic limits and phylogenetic inter- and intrarelationships of Salmo Linnaeus and Oncorhynchus Suckley, the primary result of which has been near universal acceptance of Oncorhynchus as the proper name for the Pacific trouts, while Salmo is reserved as the genus of trouts (including the Atlantic salmon) native to Europe, western Asia, and the Atlantic Basin (for a detailed review, see Smith and Stearly, 1989). Numerous hypotheses of phylogenetic relationship among the subtaxa of these genera have been proposed, deduced from morphological, cytogenetic, molecular, physiological, distributional, and ecological data (for numerous references to these studies, see Utter et al., 1973; Thomas et al., 1986), but a clear understanding is still unavailable. Direct use of mitochondrial DNA (mtDNA) to refine our knowledge of species relationships within Oncorhynchus has been largely limited to restriction enzyme (RFLP) analyses presented for six taxa (Wilson et al., 1985; Thomas et al., 1986) and to nucleotide sequence analysis of a HindIII fragment obtained from this same RFLP work (Thomas and Beckenbach, 1989; Beckenbach et al., 1990). Conclusions from these studies draw attention to the systematic importance of the mitochondrial control region (displacement loop or D-loop) as a major source of sequence variation between these closely related fishes and as an area that holds great practical value for the potential development of strain-specific probes for population studies and stock identification.

1 Present address: Department of Genetics, University of Georgia, Athens, GA 30602.
2 To whom correspondence should be addressed at University of Washington Medical Center, Department of Pathology, RC-72, Seattle, WA 98195. Fax: (206) 548-4928. Email: [email protected].
1055-7903/92 $5.00 Copyright © 1992 by Academic Press, Inc. All rights of reproduction in any form reserved.

As the major noncoding segment of the vertebrate mitochondrial genome, the mitochondrial control region has a unique functional role and contains sites for initiation of both heavy-strand (H-strand) DNA replication and promoter sequences for RNA transcription (Clayton, 1982, 1984; Saccone et al., 1991). The DNA sequence of the control region contains a number of unique primary and postulated secondary structural features, including a central conserved GC-rich domain flanked by regions of higher primary sequence diversity and adenine content, a high base substitution rate, the common presence of deletions and insertions, clusters of short repeat sequences, and regions that are capable of assuming hairpin-loop secondary structures common to promoters and initiation or termination recognition sites (Walberg and Clayton, 1981; Brown et al., 1986; Saccone et al., 1987; Foran et al., 1988). Despite this critical functional role, however, comparative evolutionary studies of vertebrates have shown that the primary DNA sequence of the vertebrate control region exhibits substantial variation in different organisms and may evolve at a rapid rate compared to nuclear genes and to the remainder of the mitochondrial genome. The most extensive comparative studies, conducted on mammals (Brown et al., 1986; Saccone et al., 1987; Foran et al., 1988), have demonstrated that the high rate of sequence divergence, in the form of base substitutions, insertions, deletions, and repeat sequence motifs, is localized largely in the variable flanking regions abutting the central conserved sequence block.

This report describes the organization, primary structure, and mutational patterns of the salmonid mitochondrial control region and demonstrates its utility for inferring phylogenies. The present study represents the first comprehensive, comparative examination of complete D-loop sequence structure in fishes and is the only phylogenetic study to simultaneously analyze all anadromous Pacific salmon using mtDNA information. In addition, it is the first mtDNA examination of Oncorhynchus masou.
This study further refines the patterns of molecular evolution within salmonid fishes described by Thomas and Beckenbach (1989) by presenting a comparative analysis of noncoding control region information and by independently testing their earlier DNA sequence-based phylogenetic hypothesis. We have amplified the region flanked by the cytochrome b and 12S ribosomal RNA genes of Salmo salar by “targeted gene-walking” polymerase chain reaction (Parker et al., 1991) and directly sequenced amplification products derived from this region. Primers flanking the phenylalanine transfer RNA (tRNAPhe) and proline transfer RNA (tRNAPro) of salmonids were then synthesized using the S. salar sequence, and the control region was subsequently amplified and sequenced from eight species of Pacific salmon, genus Oncorhynchus, and the Arctic grayling, Thymallus arcticus. These sequences were compared to those previously reported for mammals (Brown et al., 1986; Saccone et al., 1987, 1991; Foran et al., 1988); an amphibian, Xenopus laevis (Wong et al., 1983; Roe et al., 1985; Dunon-Bluteau and Brun, 1987); and two fishes, the Atlantic cod, Gadus morhua (Johansen et al., 1990), and the white sturgeon, Acipenser transmontanus (Buroker et al., 1990). Binary comparisons of aligned sequences were conducted in order to evaluate the distribution of sequence variability and the evolutionary substitution patterns exhibited. Phylogenetic relationships of S. salar and the species of Oncorhynchus were then determined using both maximum likelihood and parsimony analyses of a 1-kb fragment of the D-loop. Based on a molecular clock assumption for the mammalian mtDNA control region (for a discussion of the mtDNA molecular clock, see Moritz et al., 1987), divergence time estimates were calculated and compared to those suggested by the fossil record for Pacific salmon (Cavender and Miller, 1972).

METHODS

Sample Tissue Sources and DNA Extraction Procedure

The specimens were obtained from the following sources: Alaska: T. arcticus (1 specimen), O. nerka (2), O. keta (2), O. mykiss (1); St. Petersburg, Russian Republic: S. salar (1); Washington State: O. kisutch (2), O. tschawytscha (1), O. mykiss (1), O. clarki (2); Oregon: O. tschawytscha (1); British Columbia: O. gorbuscha (2); Hokkaido, Japan: O. masou (2). Total cellular DNA was extracted from fresh or cryopreserved tissue samples by standard methods (Davis et al., 1986). Samples of mitochondrial DNA, isolated by ethidium bromide/CsCl gradient ultracentrifugation, were provided by G. Winans and L. Park.

“Targeted Gene-Walking” PCR Procedure

A two-stage gene-walking experiment was used to amplify the control region and flanking sequences of S. salar. Two targeted primers and four arbitrary “walking” primers were initially used to amplify unknown sequences contiguous to either the tRNAPro or 12S rRNA genes. A previously published primer hybridizing to a conserved region within the tRNAPro gene (Kocher et al., 1989) was chosen for amplification of the left domain of the control region, and a second primer was designed from a conserved block within the 12S rRNA gene to amplify the right domain (Fig. 1; Parker et al., 1991). Products of these gene-walking PCRs were electrophoresed, excised from the gel, reamplified, and directly sequenced as described below. A second set of “targeted” primers [FD1(+) and FD1(-)] was then synthesized from the sequence derived from the initial walking products for the amplification and sequencing of the remainder of the control region. The S. salar sequence was then used to synthesize a series of targeted control-region-specific primers (Fig. 1, primer pairs a-f) from which the mtDNAs of the remaining species were amplified and sequenced.

PCR and Direct Sequencing Conditions

Total cellular DNA (50 ng) or purified mtDNA (50 pg) was used as a template in 25 µl of buffer containing 80 mM Tris-HCl (pH 9.0), 20 mM ammonium sulfate, 10 mM MgCl2, 1 mM dithiothreitol, 50 µM each dATP, dCTP, dGTP, and dTTP, 1 ng/µl (2.5 pmol/25 µl reaction) of the targeted oligonucleotide primer, 2 ng/µl (5 pmol/25 µl reaction) of the walking primer, and 0.4 U of Thermus aquaticus DNA polymerase (Perkin-Elmer Cetus, Norwalk, CT). The mixture was overlaid with one drop of mineral oil and submitted to 30 cycles of PCR: an initial three cycles of denaturation at 98°C for 30 s, annealing at 45°C for 20 s, and extension at 72°C for 60 s, followed by 27 cycles at 98°C for 5 s, 50°C for 5 s, and 72°C for 20 s. Products of the PCR were electrophoresed on a 1% (w/v) agarose gel in the presence of ethidium bromide to visualize the amplified fragment. When successful gene-walking products were observed, the bands were excised from the gel and reamplified as described below, and an aliquot of product bands was directly sequenced by standard methods using a kinased amplification primer and a Sequenase kit per manufacturer’s recommendations as described previously (Parker and Burmer, 1991; Parker et al., 1991). The exact sequencing methods used are described in the Step-By-Step Protocols for DNA Sequencing and Sequenase Version 2.0, 5th ed. (United States Biochemical Corp., Cleveland, OH) under the section entitled “Sequencing Reaction Protocols.”

Amplification and Sequencing of the Control Region of Remaining Species

A two-step PCR protocol was used to amplify and sequence DNA from species other than S. salar. Template DNA was initially amplified as described above using control-region-specific primers, the products were electrophoresed in the presence of ethidium bromide, and the DNA fragment was excised from the gel to purify the mtDNA product from contaminating primers and extraneous amplification products.
The band was eluted by incubating the gel slice with 50 µl of water in a 1.7-ml microfuge tube and shaking in a rotary shaker (Lab-Line Orbit Environ-shaker, Lab-Line Instruments, Inc., Melrose Park, IL) for 3 h. A 2-µl aliquot of the eluted DNA was used directly as a template in a 50-µl secondary asymmetric PCR (Gyllensten and Erlich, 1988) using an excess of either the coding or the complementary strand primer (1 ng/µl or 5 pmol/50 µl) in the presence of 0.05 ng/µl of limiting primer. All asymmetric reactions underwent 30 cycles of PCR at 95°C for 1 s, 50°C for 1 s, and 72°C for 15 s, and products of the reaction were directly sequenced.

Oligonucleotide Primers and Sequences

The oligonucleotide primers were synthesized within our laboratory on a Cruachem synthesizer using the phosphoramidite method (Cruachem Corp., Sterling, VA). Primer sequences are as follows.

Targeted primers:
  t-pro    5'-CCC AAA GCT AAG ATT CTA AA-3'
  12S1(-)  5'-AAA GTC AGG ACC AAG CCT T-3'
  FD1(+)   5'-CTC CAA CTA ACA CGG GCT C-3'
  FD1(-)   5'-GGT TAG CTA GAT ATA AC-3'

Walking primers:
  W1       5'-ACG TGG CCA CGT AGG CCA AAA AAA AAA AAA-3'
  W2       5'-TTT AAG CTT CTA GAA TTC CCC CCC CCC C-3'
  W3       5'-TCA CAA CAC GAG CTG ACG-3'
  W4       5'-AGG ATT NGA TAC CCN AGT AGT-3'

Control region primer pairs designed from the S. salar sequence:
  a(+)     t-pro
  a(-)     5'-GTG CTG ATG TAT GAG GG-3'
  b(+)     5'-AAT ATC GCA TGT GAG TAG TAC-3'
  b(-)     5'-TGA TGC AAA TAG TTG GTG GG-3'
  c(+)     FD1(+)
  c(-)     5'-ATT GCA TTA CAT TCG GC-3'
  d(+)     5'-CCT TAA GAA ACC ACC C-3'
  d(-)     5'-GAT ACA TGG GTT CTC TGG-3'
  e(+)     5'-TTT TTC CTT TCA GCT TGC-3'
  e(-)     5'-TAG TCG GTG CCG AAT GC-3'
  f(+)     5'-TTC CTG TCA AAC CCC TAA ACC AGG-3'
  f(-)     5'-CCA TCT TAA CAG CTT CAG-3'

Sequence Comparison and Structural Analysis

Sequences were edited and compared using the Wisconsin Genetics Computer Group (GCG) Sequence Analysis Software Package, version 7.0 (Devereux et al., 1984). The final alignment was made using the clustering methods (Feng et al., 1987; Higgins and Sharp, 1989) employed by the GCG program PILEUP under gap weight/length penalties of 0.1. Because a rigorous optimal alignment of sequences using this clustering algorithm is intractable, PILEUP may not produce the optimal multiple sequence alignment. In order to account for potential ambiguities in the alignment, an array of gap weight/length parameters was tested in conjunction with visual alignments of conserved sequence blocks and two large insertions. The resultant alignment presented here has a reduced number of inserted gaps even though relaxed gap penalties were employed, suggesting that a nearly optimal multiple sequence alignment of these sequence data has been achieved. Analysis of secondary structures was based on the method of Zuker and Steigler (1981) employed by the FOLD program in the GCG software package.

Phylogenetic Analysis

Both cladistic and maximum likelihood methods were used to infer phylogenetic relationships and to calculate divergence time estimates from the nucleotide sequence data presented. The region immediately downstream of the tRNAPhe gene (approximately 100 sites) proved too variable to be useful for the hierarchical questions addressed in the present study (for a discussion of mutational saturation, see Moritz et al., 1987; Hillis and Moritz, 1990). Consequently, only the first kilobase of control region nucleotide sequence information for each species was used for phylogenetic analyses. T. arcticus was also omitted from the analyses as a second outgroup taxon because of its high degree of sequence divergence from species of Oncorhynchus.

Cladistic analyses were performed using the software program PAUP, version 3.0 (Swofford, 1989). Fitch parsimony (Fitch, 1971) was employed in branch-and-bound algorithmic searches for optimal trees (Hendy and Penny, 1982). S. salar was defined as the outgroup taxon of the eight species of Oncorhynchus. Three separate analyses were conducted with (1) transversions (V) and transitions (S) unweighted, (2) V weighted two times S, and (3) V weighted five times S. This range of S/V ratios encompasses those observed from binary comparisons of the D-loop sequences presented here. Gaps in the alignment were treated as fifth character states in addition to the four nucleotide states. The two large insertions found in the O. gorbuscha sequence and described under Results were encoded as single-base changes in order to avoid algorithmic treatment of these as multiple evolutionary events. A consistency index (CI) (Kluge and Farris, 1969) was calculated for the most parsimonious tree of the unweighted S/V character set. In addition, 1000 heuristic bootstrap replicates (Felsenstein, 1985; Efron, 1987; Hedges, 1992) were conducted using PAUP to give a 50% majority rule consensus tree. Informative characters in the cladistic analyses were defined as unordered characters having at least two states each occurring in more than one taxon in the aligned data set.

To evaluate relative and absolute divergence time estimates between taxa and to independently test cladogram topology, maximum likelihood calculations (Felsenstein, 1981) were conducted using the programs DNAML and DNAMLK contained in the phylogenetic inference package PHYLIP, version 3.3 (Felsenstein, 1990). For likelihood analyses, empirical base frequencies were used in the calculations, global rearrangements were conducted, and, as with the PAUP analysis, V to S weights were set at 1, 2, and 5.
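
The weighting schemes described here rest on classifying each difference between two aligned sequences as a transition, a transversion, or a gap. The following Python sketch illustrates that bookkeeping for a single binary comparison; it is not the PAUP or PHYLIP implementation. The function names `compare_pair` and `weighted_steps`, the toy sequences, and the choice to charge each gap position a cost of 1 are assumptions made for illustration.

```python
# Illustrative sketch only; names and toy data are hypothetical.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
GAP = "-"


def compare_pair(seq_a, seq_b):
    """Count transitions, transversions, and gap positions between two
    aligned sequences. Assumes the alphabet is A, C, G, T, and '-'."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    counts = {"transitions": 0, "transversions": 0, "gaps": 0}
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b:
            continue  # identical site, including gap opposite gap
        if a == GAP or b == GAP:
            counts["gaps"] += 1  # gap treated as a fifth character state
        elif {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            counts["transitions"] += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            counts["transversions"] += 1  # purine<->pyrimidine
    return counts


def weighted_steps(counts, v_weight=1):
    """Total cost with transversions weighted v_weight times transitions.
    Assumption: each gap position costs 1 regardless of v_weight."""
    return counts["transitions"] + v_weight * counts["transversions"] + counts["gaps"]


if __name__ == "__main__":
    toy_a = "ACGTTAGC-A"  # hypothetical aligned fragments, not salmonid data
    toy_b = "ATGCTGGATA"
    result = compare_pair(toy_a, toy_b)
    print(result)
    for w in (1, 2, 5):
        print(f"V weighted {w}x S: {weighted_steps(result, w)} steps")
```

With the toy pair shown, the comparison yields 3 transitions, 1 transversion, and 1 gap position, so the total cost is 5, 6, or 9 when transversions are weighted 1, 2, or 5 times transitions.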
DNAMLK, which assumes an evolutionary clock, sets branch lengths equal from the root to tips such that the amount of evolutionary change in each taxon is directly proportional to elapsed time. Results from this program were converted to absolute divergence times, to within the nearest 0.1 million years before present (Mybp), using the substitution rate of 0.75% per million years. Because rates of mtDNA evolution have not been established for fishes, the rate used here is based on results of interspecific control region sequence comparisons studied in mammals (Brown et al., 1986; Foran et al., 1988; Hoelzel et al., 1991).
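
The conversion from a clock-constrained divergence estimate to elapsed time is simple arithmetic. The sketch below, in Python, assumes that the estimate is expressed as percent sequence divergence in the same units as the 0.75% per million years rate, so that time is divergence divided by rate. The function name `divergence_time_my` and the input values are hypothetical and are not results from this study.

```python
# Illustrative sketch only; inputs are hypothetical, not values from this study.
MAMMALIAN_RATE_PCT_PER_MY = 0.75  # rate quoted in the text, percent per million years


def divergence_time_my(percent_divergence, rate_pct_per_my=MAMMALIAN_RATE_PCT_PER_MY):
    """Convert percent sequence divergence to elapsed time in million years,
    rounded to the nearest 0.1 My.

    Assumption: divergence and rate are expressed in the same units.
    """
    if rate_pct_per_my <= 0:
        raise ValueError("rate must be positive")
    if percent_divergence < 0:
        raise ValueError("divergence cannot be negative")
    return round(percent_divergence / rate_pct_per_my, 1)


if __name__ == "__main__":
    for pct in (1.5, 3.0, 4.5):
        print(f"{pct:.1f}% divergence -> {divergence_time_my(pct):.1f} My")
```

Because the rate is borrowed from mammals, any time obtained this way inherits that assumption; halving or doubling the rate doubles or halves every estimate.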


RESULTS

Amplification and Sequencing

Initial attempts at direct amplification of the control region from fish samples using previously published vertebrate primers were unsuccessful because of the high degree of sequence variability within this region. We therefore utilized the targeted gene-walking PCR strategy to amplify the control region of S. salar mtDNA (Fig. 1; Parker et al., 1991). Phylogenetically conserved sequences lying outside the control region were targeted as starting sites for the gene-walking PCR procedure. Sequence-specific primers were then designed from the S. salar sequence for the subsequent amplification and analysis of the control region from the remaining species.

Sequence Profiles

The aligned sequences for the control region of the salmonids examined in this study are presented in Fig. 2. The region flanked by tRNAPro and tRNAPhe is approximately 1 kb in length, with the exception of that of O. gorbuscha, which is 36 nucleotides longer due to the presence of two insertions characterized by near direct repeats: one 25-bp insert beginning at nucleotide position 2 immediately adjacent to tRNAPro in the left domain, with the sequence CGGCGATATTTCGCATTTCTAAATG, and a second 11-bp insert adjacent to CSB-2 beginning at nucleotide position 805, with the sequence TCCCGGCTTCC. The overall organization of the salmonid control region resembles those previously published for vertebrates, with a central conserved domain lower in adenine content flanked by two regions of generally higher sequence variability (Fig. 3).

FIG. 1. Diagram of location of primers used in “targeted gene-walking PCR” strategy and specific amplification of control region of salmonids. Primers t-pro and 12S rRNA were used as “targeted” primers in the initial PCR reactions with “walking” primers W1 and W2 to amplify a 600- and a 300-bp fragment, respectively, of S. salar. A second set of “targeted” primers, FD1(-) and FD1(+), was then designed from this sequence with “walking” primers W3 and W4 to amplify fragments of 3000 and 500 bp, respectively. Sequencing of these four fragments provided information for synthesizing primer pairs a-f, which were used in the amplification of remaining salmonids. Primer sequences are listed under Methods.

An analysis of the base composition reveals that the region with the lowest adenine content is located within the central domain corresponding to nucleotides 300-600. A second block with low adenine content is also seen within the right domain, corresponding to the location of several conserved sequence blocks. An analysis of the frequency of nucleotide sequence alterations within 100-bp stretches in each species reveals discrete sites of conserved sequence at nucleotides 300-600 and, within Oncorhynchus, at particular stretches corresponding to the locations of functional elements. In all of the species examined, the stretch with the highest sequence variability was observed immediately adjacent to tRNAPhe. Of 36 possible binary comparisons of sequences used to infer phylogenies, this area showed over three times the average variability found in the remainder of the control region.

The total numbers of transitions, transversions, insertions, and deletions for all possible binary comparisons of the first kilobase of aligned sequences are summarized in Table 1. The range of transition to transversion ratios is from 1.0 (between O. keta and O. mykiss) to 3.11 (between O. tschawytscha and O. clarki), with an average S/V for all comparisons of 1.67. The S/V ratio versus the percentages of sequence difference between binary comparisons of nine salmonid taxa is plotted in Fig. 4. Combined deletions and insertions range from 11 (between O. keta and O. nerka) to 58 (between O. masou and S. salar and between O. gorbuscha and S. salar). The average of combined insertions and deletions is 28. Nearly all insertions and deletions in the aligned comparisons were restricted to single base changes. With the exception of the two large, nearly direct repeats found in O. gorbuscha described above, insertions or deletions of more than 3 bp were not present in binary comparisons of data used for phylogenetic analyses.

Structure of the Right Domain

In the vertebrate sequences analyzed to date, the right domain contains a number of important functional elements, including the origin of heavy-strand replication (OH), sites of initiation of transcription for the heavy and light strands (HSP and LSP), and three sequence blocks located in the vicinity of OH that are conserved phylogenetically (CSB-1, 2, and 3). In the salmonids examined, immediately upstream from the central domain lies a conserved tract of 23-25 pyrimidines; similar sequences have been demonstrated in both G. morhua and X. laevis and are putative binding sites for the mtSSB protein, which is involved in the regulation of DNA replication (Mignotte et al., 1985; Johansen et al., 1990). In vertebrates, further upstream from the pyrimidine tract are the three evolutionarily conserved sequence blocks that function in the initiation of D-loop DNA replication.
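
A pyrimidine tract of this kind can be located by scanning a sequence for its longest uninterrupted run of C and T. The Python sketch below shows one way to do so; the function name `longest_pyrimidine_tract` and the toy sequence are hypothetical, and the sequence is not taken from the salmonid alignment.

```python
# Illustrative sketch only; the sequence below is a toy, not salmonid data.
import re


def longest_pyrimidine_tract(sequence):
    """Return (start, tract) for the longest uninterrupted run of C/T.

    start is a 0-based index; ties keep the first run found.
    Returns (-1, "") when the sequence contains no pyrimidines.
    """
    best = None
    for match in re.finditer(r"[CT]+", sequence.upper()):
        if best is None or len(match.group()) > len(best.group()):
            best = match
    if best is None:
        return (-1, "")
    return (best.start(), best.group())


if __name__ == "__main__":
    toy = "AGGA" + "CT" * 12 + "GATAT"  # 24-nt pyrimidine run flanked by purine-rich ends
    start, tract = longest_pyrimidine_tract(toy)
    print(f"tract of {len(tract)} nt starting at index {start}: {tract}")
```

A scan like this only reports position and length; whether a given run corresponds to the conserved element would still have to be judged against the alignment.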


In the region of CSB-1, mammals, X. laevis, and A. transmontanus contain the conserved sequence block GACATA. This sequence is missing in G. morhua and also in the salmonids. The salmonid sequences that are most similar to CSB-1 are located in a stretch of approximately 60 nucleotides within which are three highly conserved pentanucleotides (GATAT) and a block of six nucleotides at positions 724-729 with the sequence CPuCATA. Further upstream are the phylogenetically conserved sequences for CSB-2 and CSB-3 (Fig. 2, nucleotides 815-828 and 855-870), which have been demonstrated in other vertebrates to serve as recognition sites for endonucleases (RNase MRP) involved in the generation of primer RNAs required for the initiation of DNA synthesis (Walberg and Clayton, 1981; Brown et al., 1986; Cairns and Bogenhagen, 1986; Chang et al., 1987; Bennet and Clayton, 1990). The salmonid CSB-2 and CSB-3 sequences are highly similar to those previously reported in mammals and in X. laevis. The O. gorbuscha sequence contains a direct repeat between CSB-1 and CSB-2 with the sequence CCCGGCTTC that is absent in the remaining salmonids examined. Tandem repeats located at this same relative position have been described in rabbits (Oryctolagus cuniculus cuniculus), and although they are implicated in the generation of intra- and interindividual heteroplasmy within this species, they are of unknown function (Mignotte et al., 1990).

In vertebrates, the region adjacent to tRNAPhe contains promoter elements for both H- and L-strand transcription. Previous reports have demonstrated that although this region contains an unstable sequence profile (Chang and Clayton, 1984; Bogenhagen et al., 1986), promoter selection is dependent upon a sequence-specific transcription factor mtTF, which has been identified in humans and implicated in Xenopus (Bogenhagen and Yoza, 1986; Fisher and Clayton, 1988). The promoters and mtTFs have both been shown to possess bidirectional activity, and each mtTF recognizes a species-specific “core sequence” (Chang et al., 1987; Fisher et al., 1987). The physical proximity, bidirectional nature, and sequence similarity between the promoters suggest that they arose by duplication of a single, bidirectional promoter (Chang et al., 1987; Fisher et al., 1987). The salmonid sequences contain two tandem repeats located between the tRNAPhe gene and CSB-3, with the general sequence PuTATACATTAATPuAACTTTT (nucleotides 907-928 and 998-1019), separated by approximately 68 bases. These sequences are reliably maintained only within the genus Oncorhynchus and exhibit more divergence in the more distantly related S. salar and T. arcticus (Fig. 2). A central sequence within these blocks (ACATTAATPu) is similar to the X. laevis promoter sequence ACPuTTATA (Bogenha

FIG. 2. Aligned control region sequences of the salmonids examined in this study (O. tschawytscha, O. kisutch, O. keta, O. nerka, O. gorbuscha, O. masou, O. mykiss, O. clarki, S. salar, and T. arcticus). [Alignment not recoverable from the source text.]
