Nucleotide sequence of the cytochrome oxidase subunit I gene from soybean mitochondria.

Plant Molecular Biology 7:377-384 (1986) © Martinus Nijhoff Publishers, Dordrecht - Printed in the Netherlands


Elizabeth A. Grabau
Howard Hughes Medical Institute, University of Utah, Salt Lake City, UT 84132, U.S.A.

Keywords: cytochrome c oxidase, mitochondrial DNA, soybean, dideoxy sequencing

Summary

The cytochrome oxidase subunit I gene was isolated from a soybean mitochondrial library and subcloned into M13 for DNA sequencing. The sequences of the gene and flanking regions are presented and compared to the corresponding gene from maize. There is approximately 94% sequence homology between the soybean (dicot) and maize (monocot) coding sequences at the nucleotide level. The soybean sequence exists as a single copy in the mitochondrial genome and contains an open reading frame that could encode a polypeptide of 527 amino acids. There is very little sequence homology between the soybean and maize sequences upstream from the coding regions and none is detected downstream. Even the 3' ends of the COI coding regions differ considerably between soybean and maize. There are many amino acid differences at the carboxy terminus and the predicted soybean polypeptide contains one fewer amino acid than the maize sequence. Northern analysis of the soybean mitochondrial RNA suggests that this region is actively transcribed and yields two major transcripts.

Introduction

Cytochrome c oxidase is one of the enzyme complexes of the inner mitochondrial membrane and is composed of at least seven subunits (30). Subunits I, II and III are encoded by the mitochondrial genomes of many organisms and at least I and II are encoded by mitochondrial genes in plants (17, 13, 3, 18, 20, 15). The total number of plant mitochondrial genes that have been studied to date is still fairly meager and transcription in plant mitochondria has not yet been well characterized. Transcription patterns detected by hybridization with probes from cloned genes range from very simple, with a single transcript as in the Oenothera COII gene (15), to quite complex, with many different RNA species, as for a number of other genes (17, 11, 12). At least some RNA processing occurs in plant mitochondria since an intron has been found in the cytochrome oxidase subunit II gene from monocots (13, 3, 18). The 5' ends of several plant mitochondrial transcripts have been mapped (17, 20, 16) but the sites at which transcription initiation occurs remain undefined.

Here I report the cloning and sequencing of the cytochrome oxidase subunit I gene and its flanking sequence from the soybean mitochondrial genome. Alignment of this gene with the sequence reported for maize mitochondria (17) provides both an evolutionary comparison and the opportunity to search for possible common regulatory sequences.

Materials and methods

Isolation of mitochondrial DNA and RNA

Mitochondria were isolated from 6-day-old dark-grown soybean hypocotyls (Glycine max, cv. Williams) as previously described (21). RNA was extracted from mitochondria that had been purified by several cycles of differential centrifugation and sucrose gradient purification. DNA was extracted from sucrose gradient purified mitochondria and subjected to CsCl-EtBr density purification (27).

Cloning and sequencing of the cytochrome oxidase subunit I gene

A soybean mitochondrial cosmid library (14) was screened for clones containing the soybean COI gene using a 4.7 kb XhoI fragment (17) from the maize mitochondrial genome as a probe. The cosmid clone containing the maize COI gene was generously provided by Dr Christiane Fauron. Fragments were isolated from agarose gels and nick-translated for use as probes (22). Restriction enzyme analyses of the soybean cosmids were performed to identify fragments small enough for sequencing. Two BamHI fragments of 0.95 kb and 2.4 kb were found to contain sequences homologous to the maize COI gene and were subcloned into M13 mp18. Sequencing was performed by the method of Sanger (23) using oligonucleotide primers synthesized on an Applied Biosystems DNA Synthesizer. The oligonucleotides were twenty nucleotides in length and were isolated from 20% acrylamide gels prior to use as sequencing primers.

Electrophoresis and hybridization

Restriction enzymes were used according to the manufacturers' instructions (Bethesda Research Laboratories and Boehringer Mannheim). Digested DNA was fractionated in 0.8% agarose gels, transferred to nitrocellulose and probed by the method of Southern (26). RNA was fractionated by electrophoresis in 1% agarose-formaldehyde gels and subjected to Northern analysis (29).
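The hydropathy profile reported in the Results (Fig. 4) was calculated according to Kyte and Doolittle (19) using a 6 amino acid window. A minimal Python sketch of such a sliding-window calculation is given here for orientation. The scale values are the published Kyte-Doolittle values; the function names, the toy peptide, the convention of indexing each window by its first residue, and the use of 0 as the cutoff are assumptions made for illustration and are not taken from the original analysis.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=6):
    """Return the mean hydropathy of every window of `window` residues.

    Element i of the result covers residues i .. i + window - 1
    (an indexing convention assumed here, not stated in the paper).
    """
    protein = protein.upper()
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the protein length")
    values = [KYTE_DOOLITTLE[aa] for aa in protein]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def hydrophobic_stretches(profile, threshold=0.0):
    """Return (start, end) index pairs, end exclusive, of runs above threshold."""
    stretches = []
    start = None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            stretches.append((start, i))
            start = None
    if start is not None:
        stretches.append((start, len(profile)))
    return stretches


# Hypothetical toy peptide: six leucines followed by six aspartates.
toy_profile = hydropathy_profile("LLLLLLDDDDDD", window=6)
toy_stretches = hydrophobic_stretches(toy_profile)
```

Applied to a full-length polypeptide, the runs returned by `hydrophobic_stretches` would be candidate hydrophobic segments; whether they correspond to the twelve transmembrane regions of Fig. 4 depends on the actual sequence and on the plotting conventions used there.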

Results

Twenty cosmid clones from a soybean mitochondrial library (14) were identified by hybridization to a maize COI probe. The cosmid DNA from six of those clones was purified, digested with several restriction enzymes and subjected to Southern analysis to determine the location of the corresponding soybean gene. All six clones contained two BamHI fragments of approximately 0.95 kb and 2.4 kb which hybridized to the maize 4.7 kb XhoI fragment as seen in Fig. 1A. The two fragments were subcloned into M13 mp18 for sequencing and for use as hybridization probes. Figures 1B and 1C show Southern analyses of mitochondrial DNA digested with several restriction enzymes and probed with M13 clones containing either the 0.95 kb or the 2.4 kb fragment. In each case only a single band is seen, indicating that the COI gene is present as a single copy in the soybean mitochondrial genome.

The sequencing strategy and restriction map of the cloned DNA containing the cytochrome oxidase subunit I gene are shown in Fig. 2. The sequence was obtained by dideoxy sequencing using synthetic oligonucleotides as primers and is presented in Fig. 3. The sequence extends from the first BamHI site to the XhoI site (see Fig. 2) and is compared to the maize COI sequence (17). We have sequenced approximately 1.1 kb beyond the XhoI site but have not included the sequence here since it does not contain any homology to maize, any long open reading frames or any obvious tRNA genes. (The sequence is available upon request.) It must be noted that since the two BamHI fragments do not overlap, it remains a formal, however unlikely, possibility that these two fragments might not be adjacent in either the mitochondrial DNA or cosmid DNA.

The soybean sequence contains an open reading frame of 1581 nucleotides that shows 93.9% nucleotide homology to the coding region of the maize sequence. Like the COI gene in maize, the soybean COI gene does not contain an intron. The 3' ends of the two genes differ considerably in sequence and the open reading frames differ in length by 3 nucleotides (the maize open reading frame is 1584 nucleotides). The differences between the soybean and maize sequences are indicated by the presence of the appropriate nucleotide from the maize sequence below the soybean sequence in Fig. 3. The homology to maize breaks down in the flanking region. Immediately 5' to the AUG start there are 8 additional nucleotides in the soybean sequence as compared to maize. The insertion in soybean (or deletion in maize) is located in a region that is found to be A-rich in both maize and yeast mitochondrial COI genes (17, 4). There are 11 A's out of 13 residues immediately preceding the initiation codon in maize, and in yeast 13 of 15 residues are A's at the corresponding location. The A-rich


Fig. 1. Hybridization of soybean mitochondrial DNA and cloned mitochondrial DNA to COI gene probes. A) Six cosmids of soybean mtDNA were digested with BamHI and probed with a maize COI probe (4.7 kb XhoI fragment). Two fragments, 0.95 kb and 2.4 kb, hybridized to the maize probe from each cosmid. B and C) Soybean mtDNA was digested individually with 4 enzymes as shown and probed with the soybean subclones of 0.95 kb (B) and 2.4 kb (C) BamHI fragments. Size markers are HpaI digested T7 DNA.

region in maize is preceded by a sequence 5'-GGTTTTCA-3' which has been predicted to act as a ribosome binding site in plant mitochondrial RNA (17, 9, 16). In soybean the sequence of the 'insertion' just preceding the AUG start is

Fig. 2. Restriction map and sequencing strategy of cloned soybean mitochondrial DNA containing the COI gene. This region was subcloned on two BamHI fragments of 0.95 kb and 2.4 kb, indicated here by the BamHI sites. Probes used for Northern analyses (see Fig. 4) were the 0.95 kb BamHI fragment, the 1.3 kb BamHI-XhoI fragment and the 1.1 kb XhoI-BamHI fragment. (Sites marked on the map: PvuI, BamHI, PstI, XhoI and PvuII; scale bar, 200 bp.)

5'-TCCATTTT-3'. Due to this insertion immediately upstream from the initiation codon, the percentage of A residues is much smaller than in maize at the same position (only 9 of 21 nucleotides). There is an A-rich region displaced further upstream from the initiation codon in soybean (13 of 21 nucleotides are A's). There is also a lack of conservation between wheat and maize sequences in the region proposed to be a ribosome binding site (2). The absence of the proposed ribosome binding site from the soybean 5' flanking region results in a mismatch of 6 base pairs with maize before another stretch of perfect homology of 18 nucleotides is encountered. Beyond this point homology with maize breaks down completely. The only other notable feature upstream from the COI gene in soybean is the presence of three copies of a 7 base pair repeat, 5'-CCCCTCT-3', indicated in boxes 5' to the coding sequence in Fig. 3. Small repeats have been observed in the vicinity of other genes in association

Fig. 3. DNA sequence of the soybean gene for cytochrome oxidase subunit I. The soybean COI gene and flanking sequence are presented in comparison to the maize sequence. Nucleotide differences are noted below the soybean sequence and the amino acid differences are indicated by arrows. The corresponding amino acids from the maize sequence are shown below the arrows. A dash indicates the absence of a nucleotide in the corresponding sequence. Upstream repeats are boxed. The respective termination codons from both the soybean and maize sequences are boxed at the 3' end of the coding region. The putative ribosome binding site from maize is indicated by the solid line. Additional sequence of the maize 3' flanking sequence has been reported (17) but we have not included it here because it shows no homology to the soybean sequence.

with deletions or insertions relative to a homologous sequence (18, 20, 3). In the 3' flanking region of the soybean COII gene a similar repeat, 5'-CCTCT-3', is found associated with an insertion in the soybean sequence relative to pea (Grabau, manuscript in preparation). Whether these repeats in the upstream flanking region of the soybean COI gene are involved in a similar kind of insertion or deletion will await sequence comparison to other genes containing more extensive upstream homology to the soybean gene.

The open reading frame in the soybean COI gene could code for a protein with a predicted molecular weight of 57,544. Amino acid differences between maize and soybean are indicated by arrows in Fig. 3. The difference in length between soybean and maize COI polypeptides is only one amino acid but the sequences at the carboxy termini are quite different. Amino acid sequence homology between soybean and maize is approximately 94.2%, indicating that many of the nucleotide changes are silent substitutions. Fifteen of 31 amino acid differences between the two sequences lie within the last 54 amino acids. This is in agreement with the lack of conservation in this region seen among COI genes of maize and various other organisms including Neurospora crassa, Saccharomyces cerevisiae, Homo sapiens and Drosophila melanogaster (17, 7, 4, 1, 10). In common with the other known COI genes (24), there are 12 transmembrane segments predicted by the hydropathic profile (19) of the soybean COI protein as shown in Fig. 4. The positions of the amino acid differences are noted by arrows and are present both in the transmembrane regions and hydrophilic regions.

Of the 99 nucleotide substitutions in the COI gene between soybean and maize, 56 are transitions (C↔T, A↔G). Forty-one nucleotide substitutions are involved in amino acid replacements and 27 of those substitutions are transitions (67%). This skewed distribution is somewhat more striking if the carboxy terminal changes are omitted. Because of the marked lack of conservation at the 3' end of the gene, only 16 of 31 amino acid replacements are seen in the first 473 amino acids of the COI coding sequence. Fifteen of the 16 amino acid replacements in this region result from transitions. A bias towards transitions, particularly C↔T, is also seen when comparing the coding sequences of rat mitochondrial DNA (6). Another difference at the 3' end of the COI gene is the termination codon, which is TAA in soybean and TAG in maize. Plant mitochondrial codon usage has been found to differ from all other mitochondrial systems. All three termination codons are utilized (13, 17, 25) and CGG probably codes for tryptophan rather than arginine. One of the two locations of CGG in the soybean COI sequence is in common with maize and the other differs from maize but is located in a conserved region of the polypeptide that contains a tryptophan in the other sequences. This lends further support to the conclusion that CGG codes for tryptophan in plant mitochondria.

To determine whether the COI gene is transcribed in soybean, Northern analysis of mitochondrial RNA (mtRNA) was performed using fragments from the BamHI clones as probes (see Fig. 2). Figure 5 shows the Northern blot of mtRNA using the three probes from the cloned region. There are two major transcripts of approximately 2000 and 1700 nucleotides, but several less intense bands are also seen when probed with either the 0.95 kb BamHI fragment or the 1.3 kb BamHI-XhoI fragment. We have failed to detect hybridization to the major transcripts with the 1.1 kb XhoI-BamHI fragment and conclude that the 3' end of the soybean COI transcript must not extend into this region. In contrast, the 5' ends of the maize COI transcripts have been mapped to within 200 base pairs upstream of the COI gene, indicating that the majority of the long noncoding region of the large transcripts is located downstream from the coding region. There are no additional long open reading frames within 1.6 kb downstream from the COI gene in soybean.

Fig. 4. Predicted hydropathy profile for the soybean COI polypeptide. The hydropathy values were calculated according to Kyte and Doolittle (19) using a 6 amino acid window. The twelve transmembrane regions (hydropathy > 0), labeled I to XII, are shown against amino acid number. The amino acid differences from maize are indicated by the arrows. (The last amino acid difference is not shown because it is the soybean translational stop codon.)

Fig. 5. Hybridization of fractionated soybean mitochondrial RNA with probes from the COI clones. Soybean mitochondrial RNA Northern blots were probed with three probes: the 0.95 kb BamHI fragment (lanes 1 and 2), the 1.3 kb BamHI-XhoI fragment (lanes 3 and 4), and the 1.1 kb XhoI-BamHI fragment (lanes 5 and 6). The RNA was loaded on the gel at two different dilutions (1x and 5x respectively). RNA size markers were purchased from Bethesda Research Laboratories and probed with nick-translated T7 DNA.

Discussion

The COI gene in soybean mitochondria exists as a single copy, as do the flanking regions as far as -167 upstream and approximately 1600 nucleotides downstream. COI is also present as a single copy in normal maize mitochondria but is repeated in the mitochondrial genome of cytoplasmic male sterile cmsS plants (17). The soybean gene exhibits approximately 94% homology to the maize COI sequence at the nucleotide level. This is in agreement with the slow rate of nucleotide divergence seen in plants when compared to other mitochondria (28, 8, 14). It is particularly striking when compared to animal mitochondria, where the rate of nucleotide divergence for mtDNA is 5 to 10 fold greater than the rate of divergence of the animal nuclear genome (5).

Sequence comparisons of the soybean and maize 5' flanking regions showed no obvious common features. The A-rich region immediately upstream from the initiation codon in other genes (4, 17) is not seen at the same position in soybean but is displaced upstream by the presence of an 'insertion' in soybean relative to maize. There is no homology to a putative plant mitochondrial ribosome binding site (17, 9, 16) in the soybean sequence. The homology with maize breaks down at position -43 in the soybean sequence and none of the possible promoter-like sequences found in maize preceding the 5' end of the COI transcript (17) have been found in the region sequenced in soybean (to -167). This suggests that we have not yet recognized the signals for mitochondrial transcription or that the transcripts may be very large and subsequently processed.

There are several transcripts hybridizing to the cloned soybean COI sequence although the signal is strongest to two transcripts of about 2000 and 1700 nucleotides. The 5' ends of the maize COI transcripts (2300 and 2400 nucleotides) have been mapped to within 200 nucleotides upstream from the translation start and this indicates that the transcripts must contain long downstream non-coding sequences. In soybean no transcription from a 3' flanking fragment can be detected. Therefore it seems that the transcription patterns are quite different from maize. The striking similarities observed in plant mitochondria for the coding sequences of the major gene products do not extend to the organization and transcription of those genes.
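The nucleotide homology figure discussed above, and the transition counts given in the Results, are summary statistics of a pairwise alignment. A minimal Python sketch of how such counts could be tallied follows. It assumes two pre-aligned sequences of equal length with '-' marking gaps; the function names and the toy sequences are hypothetical and are not taken from Fig. 3, and the sketch is not necessarily how the original comparison was carried out.

```python
# A transition exchanges two purines or two pyrimidines;
# a transversion exchanges a purine for a pyrimidine or vice versa.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(base1, base2):
    """Return 'transition' or 'transversion' for two different bases."""
    if base1 == base2:
        raise ValueError("bases are identical; not a substitution")
    pair = {base1, base2}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def compare_aligned(seq1, seq2, gap="-"):
    """Count identities, transitions and transversions over gap-free columns."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    counts = {"identical": 0, "transition": 0, "transversion": 0}
    for base1, base2 in zip(seq1.upper(), seq2.upper()):
        if base1 == gap or base2 == gap:
            continue  # columns with a gap are skipped, an assumption of this sketch
        if base1 == base2:
            counts["identical"] += 1
        else:
            counts[classify_substitution(base1, base2)] += 1
    return counts


def percent_identity(counts):
    """Percentage of compared (gap-free) columns that are identical."""
    compared = sum(counts.values())
    return 100.0 * counts["identical"] / compared


# Hypothetical 15-nucleotide toy alignment with one A/G and one G/T difference.
toy_counts = compare_aligned("ATGACAAATCCGGTC", "ATGACGAATCCTGTC")
toy_identity = percent_identity(toy_counts)
```

For the counts reported in the text (56 transitions among 99 substitutions), the transition fraction is 56/99, or about 57%.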

Acknowledgements

This work was carried out in the laboratory of Dr Ray Gesteland. I would like to thank Ray for his continuing enthusiastic support and for critical reading of the manuscript. I wish to thank Marie Havlik for excellent technical assistance. I would also like to thank Christiane Fauron for the cmsT cosmid clone used as a COI probe and for reading this manuscript.

References

1. Anderson S, Bankier AT, Barrell BG, de Bruijn MHL, Coulson AR, Drouin J, Eperon IC, Nierlich DP, Roe BA, Sanger F, Schreier PH, Smith AJH, Staden R, Young IG: Sequence and organization of the human mitochondrial genome. Nature 290:457-465, 1981.
2. Boer PH, McIntosh JE, Gray MW, Bonen L: The wheat mitochondrial gene for apocytochrome b: absence of a prokaryotic ribosome binding site. Nucleic Acids Res 13:2281-2292, 1985.
3. Bonen L, Boer PH, Gray MW: The wheat cytochrome oxidase subunit II gene has an intron insert and three radical amino acid changes relative to maize. EMBO J 3:2531-2536, 1984.
4. Bonitz SG, Coruzzi G, Thalenfeld BE, Tzagoloff A, Macino G: Assembly of the mitochondrial membrane system. Structure and nucleotide sequence of the gene coding for subunit 1 of yeast cytochrome oxidase. J Biol Chem 255:11927-11941, 1980.
5. Brown WM, Prager EM, Wang A, Wilson AC: Mitochondrial DNA sequences of primates: tempo and mode of evolution. J Mol Evol 18:225-239, 1982.
6. Brown GG, Simpson MV: Novel features of animal mtDNA evolution as shown by sequences of two rat cytochrome oxidase subunit II genes. Proc Natl Acad Sci USA 79:3246-3250, 1982.
7. Burger G, Scriven C, Machleidt W, Werner S: Subunit 1 of cytochrome oxidase from Neurospora crassa: nucleotide sequence of the coding gene and partial amino acid sequence of the protein. EMBO J 1:1385-1391, 1982.
8. Chao S, Sederoff R, Levings CS III: Nucleotide sequence and evolution of the 18S ribosomal RNA gene in maize mitochondria. Nucleic Acids Res 12:6629-6644, 1984.
9. Dawson AJ, Jones VP, Leaver CJ: The apocytochrome b gene in maize mitochondria does not contain introns and is preceded by a potential ribosome binding site. EMBO J 3:2107-2113, 1984.
10. de Bruijn MHL: Drosophila melanogaster mitochondrial DNA, a novel organization and genetic code. Nature 304:234-241, 1983.
11. Dewey RE, Levings CS III, Timothy DH: Nucleotide sequence of ATPase subunit 6 gene of maize mitochondria. Plant Physiol 79:914-919, 1985.
12. Dewey RE, Schuster AM, Levings CS III, Timothy DH: Nucleotide sequence of Fo-ATPase proteolipid (subunit 9) gene of maize mitochondria. Proc Natl Acad Sci USA 82:1015-1019, 1985.
13. Fox TD, Leaver CJ: The Zea mays mitochondrial gene coding cytochrome oxidase subunit II has an intervening sequence and does not contain TGA codons. Cell 26:315-323, 1981.
14. Grabau EA: Nucleotide sequence of the soybean mitochondrial 18S rRNA gene: evidence for a slow rate of divergence in the plant mitochondrial genome. Plant Mol Biol 5:119-124, 1985.
15. Hiesel R, Brennicke A: Cytochrome oxidase subunit II gene in mitochondria of Oenothera has no intron. EMBO J 2:2173-2178, 1983.
16. Hiesel R, Brennicke A: Overlapping reading frames in Oenothera mitochondria. FEBS Lett 193:164-168, 1985.
17. Isaac PG, Jones VP, Leaver CJ: The maize cytochrome c oxidase subunit I gene: sequence, expression and rearrangement in cytoplasmic male sterile plants. EMBO J 4:1617-1623, 1985.
18. Kao T-h, Moon E, Wu R: Cytochrome oxidase subunit II gene of rice has an insertion sequence within the intron. Nucleic Acids Res 12:7305-7315, 1984.
19. Kyte J, Doolittle RF: A simple method for displaying the hydropathic character of a protein. J Mol Biol 157:105-132, 1982.
20. Moon E, Kao T-h, Wu R: Pea cytochrome oxidase subunit II gene has no intron and generates two mRNA transcripts with different 5'-termini. Nucleic Acids Res 13:3195-3212, 1985.
21. Morgens PH, Grabau EA, Gesteland RF: A novel soybean mitochondrial transcript resulting from a DNA rearrangement involving the 5S rRNA gene. Nucleic Acids Res 12:5665-5684, 1984.
22. Rigby PWJ, Dieckmann M, Rhodes C, Berg P: Labeling deoxyribonucleic acid to high specific activity in vitro by nick translation with DNA polymerase I. J Mol Biol 113:237-251, 1977.
23. Sanger F, Nicklen S, Coulson AR: DNA sequencing with chain terminating inhibitors. Proc Natl Acad Sci USA 74:5463-5467, 1977.
24. Saraste M, Wikstrom M: On the location of prosthetic groups in cytochrome aa3 and bc1. In: Quagliariello E, Palmieri F (eds) Structure and Function of Membrane Proteins. Elsevier, Amsterdam, 1983, pp 139-144.
25. Schuster W, Brennicke A: TGA-termination codon in the apocytochrome b gene from Oenothera mitochondria. Curr Genetics 9:157-163, 1985.
26. Southern EM: Detection of specific sequences among DNA fragments separated by gel electrophoresis. J Mol Biol 98:503-517, 1975.
27. Sparks RB Jr, Dale RMK: Characterization of 3H-labeled supercoiled mitochondrial DNA from tobacco suspension culture cells. Molec gen Genet 180:351-355, 1980.
28. Spencer DF, Schnare MN, Gray MW: Pronounced structural similarities between the small subunit ribosomal RNA genes of wheat mitochondria and Escherichia coli. Proc Natl Acad Sci USA 81:493-497, 1984.
29. Stern DB, Newton KJ: Isolation of intact plant mitochondrial RNA using aurintricarboxylic acid. Plant Molecular Biol Reporter 2:8-15, 1984.
30. Tzagoloff A: Cytochrome oxidase. Model of a membrane enzyme. In: Siekevitz P (ed) Mitochondria. Plenum Press, New York, 1982, pp 111-130.

Received 6 May 1986; in revised form 8 July 1986; accepted 16 July 1986.
