J Mol Evol (1991) 33:83-91

Journal of Molecular Evolution © Springer-Verlag New York Inc. 1991

The Main Regulatory Region of Mammalian Mitochondrial DNA: Structure-Function Model and Evolutionary Pattern

Cecilia Saccone,1 Graziano Pesole,1 and Elisabetta Sbisà2

1 Dipartimento di Biochimica e Biologia Molecolare, Università di Bari, Italy
2 Centro di Studio sui Mitocondri e Metabolismo Energetico, CNR, Bari, Italy

Summary.

The evolution of the main regulatory region (D-loop) of the mammalian mitochondrial genome was analyzed by comparing the sequences of eight mammalian species: human, common chimpanzee, pygmy chimpanzee, dolphin, cow, rat, mouse, and rabbit. The best alignment of the sequences was obtained by optimization of the sequence similarities common to all these species. The two peripheral left and right D-loop domains, which contain the main regulatory elements so far discovered, evolved rapidly in a species-specific manner, generating heterogeneity in both length and base composition. They are prone to the insertion and deletion of elements and to the generation of short repeats by replication slippage. However, the preservation of some sequence blocks and similar cloverleaf-like structures in these regions indicates a basic similarity in the regulatory mechanisms of the mitochondrial genome in all mammalian species. We found, particularly in the right domain, significant similarities to the telomeric sequences of the mitochondrial (mt) and nuclear DNA of Tetrahymena thermophila. These sequences may be interpreted as relics of telomeres present in ancestral linear forms of mtDNA or may simply represent efficient templates of RNA primase-like enzymes. Due to their peculiar evolution, the two peripheral domains cannot be used to estimate in a quantitative way the genetic distances between mammalian species. On the other hand, the central domain, highly conserved during evolution, behaves as a good molecular clock. Reliable estimates of the times of divergence between closely and distantly related species were obtained from the central domain using a Markov model and assuming nonhomogeneous evolution of nucleotide sites.

Offprint requests to: C. Saccone

Key words: Mammalian mitochondrial DNA - Origin of replication - Mitochondrial DNA evolution - Stationary Markov model - Phylogenetic tree - Telomeres - D-loop - Regulatory region
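The Summary states that divergence times were estimated from the conserved central domain with a Markov model, treating that domain as a molecular clock. The sketch below illustrates the general idea only. It substitutes the much simpler Jukes-Cantor correction for the stationary Markov model named in the key words, it ignores the nonhomogeneous evolution of sites that the authors account for, and the function names and the calibration values in the usage note are hypothetical.

```python
import math


def jc69_distance(seq_a, seq_b):
    """Jukes-Cantor distance between two aligned sequences.

    Assumption: this is a simplified stand-in, not the stationary
    Markov model used in the paper. Columns with a gap or an
    ambiguous symbol in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguous site: not comparable
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = mismatches / compared  # observed proportion of differing sites
    if p >= 0.75:
        raise ValueError("sequences too divergent for the correction")
    # Correct the observed proportion for multiple hits at the same site.
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def divergence_time(distance, calib_distance, calib_time):
    """Scale a distance to time under a strict molecular clock.

    calib_distance and calib_time describe a reference species pair
    whose divergence time is taken as known (hypothetical inputs).
    """
    return calib_time * distance / calib_distance
```

For example, if a reference pair with distance 0.1 is assumed to have diverged 5 time units ago, `divergence_time(0.2, 0.1, 5.0)` scales a distance of 0.2 to about 10 of the same units. The scaling is only meaningful for a region that evolves at a roughly constant rate, which is why the Summary restricts it to the central domain.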

Introduction

The presence of only one major noncoding segment in the mitochondrial genome is a feature common to all metazoa. In vertebrates this region, spanning between the Phe- and Pro-tRNA genes, is called the D-loop-containing region because of the three-stranded displacement (D) loop structure created by the nascent heavy (H) strand at the level of the H-strand replication origin (OH). It also contains promoters for the transcription of both the heavy strand (HSP) and the light strand (LSP). This region is the target site for numerous proteins and enzymes, such as DNA and RNA polymerases and transcription and regulatory factors, and is thus subjected to various evolutionary pressures. Because all these proteins are coded for by nuclear DNA, the study of the D-loop-containing region is also extremely important for shedding light on the processes inherent in nucleus-mitochondrion coevolution.

In order to gain deeper insight into the evolutionary dynamics of the noncoding region of the mammalian mitochondrial genome, we undertook a detailed investigation of its evolution at the molecular level. In previous papers we have identified several well-preserved features in the evolution of


[Figure: alignment of the D-loop-containing region sequences of the eight species, with rows labeled PYG, COM, MAN, DOL, COW, RAT, MUS, and RAB and position numbering along the top. The aligned sequences and match lines are not recoverable from the extracted text.]
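The alignment above marks matching positions between neighboring rows and, according to the Summary, reveals preserved sequence blocks among otherwise variable regions. As a minimal sketch of how such comparisons can be computed from aligned rows, the code below reports pairwise percent identity and the column ranges that are identical in every row. The function names and the short sequences in the usage note are hypothetical and are not taken from the figure.

```python
def percent_identity(row_a, row_b, gap_chars="-."):
    """Percent of identical bases over columns where neither row has a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("rows must have the same aligned length")
    compared = 0
    matches = 0
    for a, b in zip(row_a.upper(), row_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def conserved_blocks(rows, min_len=5, gap_chars="-."):
    """Return half-open (start, end) column ranges identical in all rows.

    Only runs of at least min_len gap-free, fully conserved columns
    are reported.
    """
    length = len(rows[0])
    if any(len(r) != length for r in rows):
        raise ValueError("rows must have the same aligned length")
    gaps = set(gap_chars)
    blocks = []
    start = None  # first column of the conserved run currently open
    for col in range(length):
        column = {r[col].upper() for r in rows}
        is_conserved = len(column) == 1 and not (column & gaps)
        if is_conserved and start is None:
            start = col
        elif not is_conserved and start is not None:
            if col - start >= min_len:
                blocks.append((start, col))
            start = None
    # Close a run that extends to the last column.
    if start is not None and length - start >= min_len:
        blocks.append((start, length))
    return blocks
```

With three made-up rows `"TTACCATCCTCCGTGAAA"`, `"TTACCATGCCGCGTGAAA"`, and `"CTACCATCCTCCGTGAAA"`, `conserved_blocks` returns `[(1, 7), (11, 18)]`, and the first two rows are identical at 15 of their 18 columns. Real alignments would also need a policy for ambiguous bases, which this sketch treats as ordinary symbols.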
