Plant Molecular Biology 7:377-384 (1986) © Martinus Nijhoff Publishers, Dordrecht - Printed in the Netherlands
Nucleotide sequence of the cytochrome oxidase subunit I gene from soybean mitochondria
Elizabeth A. Grabau
Howard Hughes Medical Institute, University of Utah, Salt Lake City, UT 84132, U.S.A.
Keywords: cytochrome c oxidase, mitochondrial DNA, soybean, dideoxy sequencing
Summary
The cytochrome oxidase subunit I (COI) gene was isolated from a soybean mitochondrial library and subcloned into M13 for DNA sequencing. The sequences of the gene and flanking regions are presented and compared to the corresponding gene from maize. There is approximately 94% sequence homology between the soybean (dicot) and maize (monocot) coding sequences at the nucleotide level. The soybean sequence exists as a single copy in the mitochondrial genome and contains an open reading frame that could encode a polypeptide of 527 amino acids. There is very little sequence homology between the soybean and maize sequences upstream from the coding regions and none is detected downstream. Even the 3' ends of the COI coding regions differ considerably between soybean and maize. There are many amino acid differences at the carboxy terminus, and the predicted soybean polypeptide contains one fewer amino acid than the maize polypeptide. Northern analysis of soybean mitochondrial RNA suggests that this region is actively transcribed and yields two major transcripts.
Introduction
Cytochrome c oxidase is one of the enzyme complexes of the inner mitochondrial membrane and is composed of at least seven subunits (30). Subunits I, II and III are encoded by the mitochondrial genomes of many organisms, and at least I and II are encoded by mitochondrial genes in plants (17, 13, 3, 18, 20, 15). The number of plant mitochondrial genes studied to date is still small, and transcription in plant mitochondria has not yet been well characterized. Transcription patterns detected by hybridization with probes from cloned genes range from very simple, with a single transcript as for the Oenothera COII gene (15), to quite complex, with many different RNA species as for a number of other genes (17, 11, 12). At least some RNA processing occurs in plant mitochondria, since an intron has been found in the cytochrome oxidase subunit II gene from monocots (13, 3, 18). The 5' ends of several plant mitochondrial transcripts have been mapped (17, 20, 16) but the sites at which transcription initiation occurs remain undefined. Here I report the cloning and sequencing of the cytochrome oxidase subunit I gene and its flanking sequence from the soybean mitochondrial genome. Alignment of this gene with the sequence reported for maize mitochondria (17) provides both an evolutionary comparison and the opportunity to search for possible common regulatory sequences.
Materials and methods
Isolation of mitochondrial DNA and RNA
Mitochondria were isolated from 6-day-old dark-grown soybean hypocotyls (Glycine max, cv. Williams) as previously described (21). RNA was extracted from mitochondria that had been purified by several cycles of differential centrifugation and sucrose gradient purification. DNA was extracted from sucrose gradient purified mitochondria and subjected to CsCl-EtBr density gradient purification (27).
Cloning and sequencing of the cytochrome oxidase subunit I gene
A soybean mitochondrial cosmid library (14) was screened for clones containing the soybean COI gene using a 4.7 kb XhoI fragment (17) from the maize mitochondrial genome as a probe. The cosmid clone containing the maize COI gene was generously provided by Dr Christiane Fauron. Fragments were isolated from agarose gels and nick-translated for use as probes (22). Restriction enzyme analyses of the soybean cosmids were performed to identify fragments small enough for sequencing. Two BamHI fragments of 0.95 kb and 2.4 kb were found to contain sequences homologous to the maize COI gene and were subcloned into M13 mp18. Sequencing was performed by the method of Sanger (23) using oligonucleotide primers synthesized on an Applied Biosystems DNA Synthesizer. The oligonucleotides were twenty nucleotides in length and were isolated from 20% acrylamide gels prior to use as sequencing primers.
Electrophoresis and hybridization
Restriction enzymes were used according to the manufacturers' instructions (Bethesda Research Laboratories and Boehringer Mannheim). Digested DNA was fractionated in 0.8% agarose gels, transferred to nitrocellulose and probed by the method of Southern (26). RNA was fractionated by electrophoresis in 1% agarose-formaldehyde gels and subjected to Northern analysis (29).
Results
Twenty cosmid clones from a soybean mitochondrial library (14) were identified by hybridization to a maize COI probe. The cosmid DNA from six of those clones was purified, digested with several restriction enzymes and subjected to Southern analysis to determine the location of the corresponding soybean gene. All six clones contained two BamHI fragments of approximately 0.95 kb and 2.4 kb which hybridized to the maize 4.7 kb XhoI fragment, as seen in Fig. 1A. The two fragments were subcloned into M13 mp18 for sequencing and for use as hybridization probes. Figures 1B and 1C show Southern analyses of mitochondrial DNA digested with several restriction enzymes and probed with M13 clones containing either the 0.95 kb or the 2.4 kb fragment. In each case only a single band is seen, indicating that the COI gene is present as a single copy in the soybean mitochondrial genome.

The sequencing strategy and restriction map of the cloned DNA containing the cytochrome oxidase subunit I gene are shown in Fig. 2. The sequence was obtained by dideoxy sequencing using synthetic oligonucleotides as primers and is presented in Fig. 3. The sequence extends from the first BamHI site to the XhoI site (see Fig. 2) and is compared to the maize COI sequence (17). We have sequenced approximately 1.1 kb beyond the XhoI site but have not included the sequence here since it does not contain any homology to maize, any long open reading frames or any obvious tRNA genes. (The sequence is available upon request.) It must be noted that since the two BamHI fragments do not overlap, it remains a formal, however unlikely, possibility that these two fragments might not be adjacent in either the mitochondrial DNA or the cosmid DNA.

The soybean sequence contains an open reading frame of 1581 nucleotides that shows 93.9% nucleotide homology to the coding region of the maize sequence. Like the COI gene in maize, the soybean COI gene does not contain an intron. The 3' ends of the two genes differ considerably in sequence and the open reading frames differ in length by 3 nucleotides (the maize open reading frame is 1584 nucleotides). The differences between the soybean and maize sequences are indicated by the presence of the appropriate nucleotide from the maize sequence below the soybean sequence in Fig. 3. The homology to maize breaks down in the flanking region.
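A percentage such as the nucleotide homology quoted above is, in essence, the fraction of aligned positions at which two sequences carry the same base. The following is a minimal sketch of that calculation, assuming the two sequences have already been aligned to equal length with `-` marking gaps. The function name `percent_identity` and the toy fragments are illustrative only and are not taken from the soybean or maize sequences; how gaps were treated in the published figure is not stated, so skipping gapped columns is an assumption.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are excluded from the count
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, NOT the real soybean or maize sequences).
toy_a = "ATGACAAATCTG"
toy_b = "ATGACTAATC-G"
print(round(percent_identity(toy_a, toy_b), 1))
```

Excluding gapped columns keeps insertions and deletions from being counted as substitutions; a different convention would give a slightly different percentage.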
Immediately 5' to the AUG start there are 8 additional nucleotides in the soybean sequence as compared to maize. The insertion in soybean (or deletion in maize) is located in a region that is A-rich in both the maize and yeast mitochondrial COI genes (17, 4). Eleven of the 13 residues immediately preceding the initiation codon in maize are A's, and in yeast 13 of 15 residues are A's at the corresponding location. The A-rich
Fig. 1. Hybridization of soybean mitochondrial DNA and cloned mitochondrial DNA to COI gene probes. A) Six cosmids of soybean mtDNA were digested with BamHI and probed with a maize COI probe (4.7 kb XhoI fragment). Two fragments, 0.95 kb and 2.4 kb, hybridized to the maize probe from each cosmid. B and C) Soybean mtDNA was digested individually with 4 enzymes as shown and probed with the soybean subclones of the 0.95 kb (B) and 2.4 kb (C) BamHI fragments. Size markers are HpaI digested T7 DNA.
region in maize is preceded by a sequence 5'-GGTTTTCA-3' which has been predicted to act as a ribosome binding site in plant mitochondrial RNA (17, 9, 16). In soybean the sequence of the 'insertion' just preceding the AUG start is
[Fig. 2: restriction map showing PvuI, BamHI, PstI, XhoI and PvuII sites and the COI coding region; scale bar 200 bp.]
Fig. 2. Restriction map and sequencing strategy of cloned soybean mitochondrial DNA containing the COI gene. This region was subcloned on two BamHI fragments of 0.95 kb and 2.4 kb, indicated here by the BamHI sites. Probes used for Northern analyses (see Fig. 5) were the 0.95 kb BamHI fragment, the 1.3 kb BamHI-XhoI fragment and the 1.1 kb XhoI-BamHI fragment.
5'-TCCATTTT-3'. Due to this insertion immediately upstream from the initiation codon, the percentage of A residues is much smaller than in maize at the same position (only 9 of 21 nucleotides). There is an A-rich region displaced further upstream from the initiation codon in soybean (13 of 21 nucleotides are A's). There is also a lack of conservation between wheat and maize sequences in the region proposed to be a ribosome binding site (2). The absence of the proposed ribosome binding site from the soybean 5' flanking region results in a mismatch of 6 base pairs with maize before another stretch of perfect homology of 18 nucleotides is encountered. Beyond this point homology with maize breaks down completely. The only other notable feature upstream from the COI gene in soybean is the presence of three copies of a 7 base pair repeat, 5'-CCCCTCT-3', indicated in boxes 5' to the coding sequence in Fig. 3. Small repeats have been observed in the vicinity of other genes in association
[Fig. 3: nucleotide sequence of the soybean COI gene and flanking regions (numbered from -167), with the deduced amino acid sequence and the maize differences shown beneath.]
Fig. 3. DNA sequence of the soybean gene for cytochrome oxidase subunit I. The soybean COI gene and flanking sequence are presented in comparison to the maize sequence. Nucleotide differences are noted below the soybean sequence and the amino acid differences are indicated by arrows. The corresponding amino acids from the maize sequence are shown below the arrows. A dash indicates the absence of a nucleotide in the corresponding sequence. Upstream repeats are boxed. The respective termination codons from both the soybean and maize sequences are boxed at the 3' end of the coding region. The putative ribosome binding site from maize is indicated by the solid line. Additional sequence of the maize 3' flanking region has been reported (17) but we have not included it here because it shows no homology to the soybean sequence.
Of the 99 nucleotide substitutions in the COI gene between soybean and maize, 56 are transitions (C↔T, A↔G). Forty-one nucleotide substitutions are involved in amino acid replacements and 27 of those substitutions are transitions (67%). This skewed distribution is somewhat more striking if the carboxy terminal changes are omitted. Because of the marked lack of conservation at the 3' end of the gene, only 16 of 31 amino acid replacements are seen in the first 473 amino acids of the COI coding sequence. Fifteen of the 16 amino acid replacements in this region result from transitions. A bias towards transitions, particularly C↔T, is also seen when comparing the coding sequences of rat mitochondrial DNA (6).

Another difference at the 3' end of the COI gene is the termination codon, which is TAA in soybean and TAG in maize. Plant mitochondrial codon usage has been found to differ from all other mitochondrial systems. All three termination codons are utilized (13, 17, 25) and CGG probably codes for tryptophan rather than arginine. One of the two locations of CGG in the soybean COI sequence is in common with maize; the other differs from maize but is located in a conserved region of the polypeptide that contains a tryptophan in the other sequences. This lends further support to the conclusion that CGG codes for tryptophan in plant mitochondria.

To determine whether the COI gene is transcribed in soybean, Northern analysis of mitochondrial RNA (mtRNA) was performed using fragments from the BamHI clones as probes (see Fig. 2). Figure 5 shows the Northern blot of mtRNA using the three probes from the cloned re-
with deletions or insertions relative to a homologous sequence (18, 20, 3). In the 3' flanking region of the soybean COII gene a similar repeat, 5'-CCTCT-3', is found associated with an insertion in the soybean sequence relative to pea (Grabau, manuscript in preparation). Whether these repeats in the upstream flanking region of the soybean COI gene are involved in a similar kind of insertion or deletion will await sequence comparison to other genes containing more extensive upstream homology to the soybean gene.

The open reading frame in the soybean COI gene could code for a protein with a predicted molecular weight of 57,544. Amino acid differences between maize and soybean are indicated by arrows in Fig. 3. The difference in length between the soybean and maize COI polypeptides is only one amino acid but the sequences at the carboxy termini are quite different. Amino acid sequence homology between soybean and maize is approximately 94.2%, indicating that many of the nucleotide changes are silent substitutions. Fifteen of the 31 amino acid differences between the two sequences lie within the last 54 amino acids. This is in agreement with the lack of conservation in this region seen among COI genes of maize and various other organisms including Neurospora crassa, Saccharomyces cerevisiae, Homo sapiens and Drosophila melanogaster (17, 7, 4, 1, 10). In common with the other known COI genes (24), there are 12 transmembrane segments predicted by the hydropathic profile (19) of the soybean COI protein, as shown in Fig. 4. The positions of the amino acid differences are noted by arrows and are present both in the transmembrane regions and in the hydrophilic regions.
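The substitution counts reported for the soybean and maize coding regions (99 substitutions, 56 of them transitions) depend on classifying each aligned mismatch as a transition (purine to purine or pyrimidine to pyrimidine) or a transversion (purine to pyrimidine or the reverse). The sketch below shows one way to do that for a pair of pre-aligned sequences. The function name `classify_substitutions` and the toy input are hypothetical, and columns containing gaps or ambiguous bases are simply skipped, which is an assumption rather than the procedure used for the published counts.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
NUCLEOTIDES = PURINES | PYRIMIDINES


def classify_substitutions(seq_a: str, seq_b: str) -> dict:
    """Count transitions and transversions between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    counts = {"transition": 0, "transversion": 0}
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in NUCLEOTIDES or b not in NUCLEOTIDES:
            continue  # identical, gapped, or ambiguous column: not a substitution
        pair = {a, b}
        same_class = pair <= PURINES or pair <= PYRIMIDINES
        counts["transition" if same_class else "transversion"] += 1
    return counts


# Toy aligned fragments (hypothetical, NOT the real sequences):
# A/G and G/A are transitions, A/T is a transversion.
print(classify_substitutions("ACGTAC", "GCATTC"))
```

With the counts given in the text, 56 transitions out of 99 substitutions corresponds to roughly 57% transitions over the whole coding region.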
[Fig. 4: hydropathy plot; predicted transmembrane segments labelled I to XII; horizontal axis: amino acid number.]
Fig. 4. Predicted hydropathy profile for the soybean COI polypeptide. The hydropathy values were calculated according to Kyte and Doolittle (19) using a 6 amino acid window. The twelve transmembrane regions (hydropathy > 0) are shown. The amino acid differences from maize are indicated by the arrows. (The last amino acid difference is not shown because it is the soybean translational stop codon.)
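A hydropathy profile of the kind shown in Fig. 4 is a sliding-window average of per-residue hydropathy values. The following is a minimal sketch using the Kyte and Doolittle scale (19) and a 6-residue window, as in the figure legend. The function name `hydropathy_profile` and the toy peptide are illustrative assumptions; the sketch does not reproduce how the published plot was smoothed or how window positions were assigned to residue numbers.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 6) -> list:
    """Mean Kyte-Doolittle hydropathy for every full window along the protein."""
    protein = protein.upper()
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the protein length")
    values = [KYTE_DOOLITTLE[aa] for aa in protein]  # KeyError on non-standard residues
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Toy peptide (hypothetical): a hydrophobic stretch followed by a charged one.
profile = hydropathy_profile("LLLLLLDDDDDD", window=6)
print([round(v, 2) for v in profile])
```

Stretches where the averaged value stays above zero would be candidates for membrane-spanning segments under the criterion stated in the legend; a longer window gives a smoother profile at the cost of blurring segment boundaries.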
the coding region. There are no additional long open reading frames within 1.6 kb downstream from the COI gene in soybean.
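A statement that a region contains no long open reading frames can be checked by scanning all six reading frames for ATG-to-stop stretches above a chosen length. The sketch below is one simple way to do this and is not the procedure used in this work. The function names, the `min_codons` threshold and the toy sequence are hypothetical. All three stop codons (TAA, TAG, TGA) are treated as terminators, in line with the observation that plant mitochondria use all three.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of an unambiguous DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def orfs_in_strand(dna: str, min_codons: int):
    """Yield (start, end) for ATG...stop ORFs with at least min_codons sense codons."""
    dna = dna.upper()
    for frame in range(3):
        start = None
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first in-frame ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    yield start, pos + 3  # end index includes the stop codon
                start = None


def find_orfs(dna: str, min_codons: int = 100) -> list:
    """ORFs on both strands; '-' coordinates refer to the reverse complement."""
    hits = [("+", s, e) for s, e in orfs_in_strand(dna, min_codons)]
    hits += [("-", s, e) for s, e in orfs_in_strand(reverse_complement(dna), min_codons)]
    return hits


# Toy sequence (hypothetical): one short ORF, ATG AAA TTT GGG TAA, on the + strand.
print(find_orfs("CCATGAAATTTGGGTAACC", min_codons=3))
```

An empty result for a suitably large `min_codons` would correspond to the absence of long open reading frames; candidate frames left open at the end of the sequence are not reported by this sketch.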
Discussion
Fig. 5. Hybridization of fractionated soybean mitochondrial RNA with probes from the COI clones. Soybean mitochondrial RNA Northern blots were probed with three probes: the 0.95 kb BamHI fragment (lanes 1 and 2), the 1.3 kb BamHI-XhoI fragment (lanes 3 and 4), and the 1.1 kb XhoI-BamHI fragment (lanes 5 and 6). The RNA was loaded on the gel at two different dilutions (1x and 5x respectively). RNA size markers were purchased from Bethesda Research Laboratories and probed with nick-translated T7 DNA.
gion. There are two major transcripts of approximately 2000 and 1700 nucleotides, but several less intense bands are also seen when the blot is probed with either the 0.95 kb BamHI fragment or the 1.3 kb BamHI-XhoI fragment. We have failed to detect hybridization to the major transcripts with the 1.1 kb XhoI-BamHI fragment and conclude that the 3' end of the soybean COI transcript must not extend into this region. In contrast, the 5' ends of the maize COI transcripts have been mapped to within 200 base pairs upstream of the COI gene, indicating that the majority of the long noncoding region of the large transcripts is located downstream from
The COI gene in soybean mitochondria exists as a single copy, as do the flanking regions as far as -167 upstream and approximately 1600 nucleotides downstream. COI is also present as a single copy in normal maize mitochondria but is repeated in the mitochondrial genome of cytoplasmic male sterile cmsS plants (17). The soybean gene exhibits approximately 94% homology to the maize COI sequence at the nucleotide level. This is in agreement with the slow rate of nucleotide divergence seen in plants when compared to other mitochondria (28, 8, 14). It is particularly striking when compared to animal mitochondria, where the rate of nucleotide divergence for mtDNA is 5 to 10 fold greater than the rate of divergence of the animal nuclear genome (5). Sequence comparisons of the soybean and maize 5' flanking regions showed no obvious common features. The A-rich region immediately upstream from the initiation codon in other genes (4, 17) is not seen at the same position in soybean but is displaced upstream by the presence of an 'insertion' in soybean relative to maize. There is no homology to a putative plant mitochondrial ribosome binding site (17, 9, 16) in the soybean sequence. The homology with maize breaks down at position -43 in the soybean sequence, and none of the possible promoter-like sequences found in maize preceding the 5' end of the COI transcript (17) have been found in the region sequenced in soybean (to -167). This suggests that we have not yet recognized the signals for mitochondrial transcription or that the transcripts may be very large and subsequently processed. There are several transcripts hybridizing to the cloned soybean COI sequence, although the signal is strongest to two transcripts of about 2000 and 1700 nucleotides.
The 5' ends of the maize COI transcripts (2300 and 2400 nucleotides) have been mapped to within 200 nucleotides upstream from the translation start and this indicates that the transcripts must contain long downstream non-coding sequences. In soybean no transcription from a 3' flanking fragment can be detected. Therefore it seems that the transcription patterns are quite different from maize. The striking similarities observed in plant mitochondria for the coding sequences of the major gene products do not extend to the organization and transcription of those genes.
Acknowledgements
This work was carried out in the laboratory of Dr Ray Gesteland. I would like to thank Ray for his continuing enthusiastic support and for critical reading of the manuscript. I wish to thank Marie Havlik for excellent technical assistance. I would also like to thank Christiane Fauron for the cmsT cosmid clone used as a COI probe and for reading this manuscript.
References
1. Anderson S, Bankier AT, Barrell BG, de Bruijn MHL, Coulson AR, Drouin J, Eperon IC, Nierlich DP, Roe BA, Sanger F, Schreier PH, Smith AJH, Staden R, Young IG: Sequence and organization of the human mitochondrial genome. Nature 290:457-465, 1981.
2. Boer PH, McIntosh JE, Gray MW, Bonen L: The wheat mitochondrial gene for apocytochrome b: absence of a prokaryotic ribosome binding site. Nucleic Acids Res 13:2281-2292, 1985.
3. Bonen L, Boer PH, Gray MW: The wheat cytochrome oxidase subunit II gene has an intron insert and three radical amino acid changes relative to maize. EMBO J 3:2531-2536, 1984.
4. Bonitz SG, Coruzzi G, Thalenfeld BE, Tzagoloff A, Macino G: Assembly of the mitochondrial membrane system. Structure and nucleotide sequence of the gene coding for subunit 1 of yeast cytochrome oxidase. J Biol Chem 255:11927-11941, 1980.
5. Brown WM, Prager EM, Wang A, Wilson AC: Mitochondrial DNA sequences of primates: tempo and mode of evolution. J Mol Evol 18:225-239, 1982.
6. Brown GG, Simpson MV: Novel features of animal mtDNA evolution as shown by sequences of two rat cytochrome oxidase subunit II genes. Proc Natl Acad Sci USA 79:3246-3250, 1982.
7. Burger G, Scriven C, Machliedt W, Werner S: Subunit 1 of cytochrome oxidase from Neurospora crassa: nucleotide sequence of the coding gene and partial amino acid sequence of the protein. EMBO J 1:1385-1391, 1982.
8. Chao S, Sederoff R, Levings CS III: Nucleotide sequence and evolution of the 18S ribosomal RNA gene in maize mitochondria. Nucleic Acids Res 12:6629-6644, 1984.
9. Dawson AJ, Jones VP, Leaver CJ: The apocytochrome b gene in maize mitochondria does not contain introns and is preceded by a potential ribosome binding site. EMBO J 3:2107-2113, 1984.
10. de Bruijn MHL: Drosophila melanogaster mitochondrial DNA, a novel organization and genetic code. Nature 304:234-241, 1983.
11. Dewey RE, Levings CS III, Timothy DH: Nucleotide sequence of ATPase subunit 6 gene of maize mitochondria. Plant Physiol 79:914-919, 1985.
12. Dewey RE, Schuster AM, Levings CS III, Timothy DH: Nucleotide sequence of Fo-ATPase proteolipid (subunit 9) gene of maize mitochondria. Proc Natl Acad Sci USA 82:1015-1019, 1985.
13. Fox TD, Leaver CJ: The Zea mays mitochondrial gene coding cytochrome oxidase subunit II has an intervening sequence and does not contain TGA codons. Cell 26:315-323, 1981.
14. Grabau EA: Nucleotide sequence of the soybean mitochondrial 18S rRNA gene: evidence for a slow rate of divergence in the plant mitochondrial genome. Plant Mol Biol 5:119-124, 1985.
15. Hiesel R, Brennicke A: Cytochrome oxidase subunit II gene in mitochondria of Oenothera has no intron. EMBO J 2:2173-2178, 1983.
16. Hiesel R, Brennicke A: Overlapping reading frames in Oenothera mitochondria. FEBS Lett 193:164-168, 1985.
17. Isaac PG, Jones VP, Leaver CJ: The maize cytochrome c oxidase subunit I gene: sequence, expression and rearrangement in cytoplasmic male sterile plants. EMBO J 4:1617-1623, 1985.
18. Kao T-h, Moon E, Wu R: Cytochrome oxidase subunit II gene of rice has an insertion sequence within the intron. Nucleic Acids Res 12:7305-7315, 1984.
19. Kyte J, Doolittle RF: A simple method for displaying the hydropathic character of a protein. J Mol Biol 157:105-132, 1982.
20. Moon E, Kao T-h, Wu R: Pea cytochrome oxidase subunit II gene has no intron and generates two mRNA transcripts with different 5'-termini. Nucleic Acids Res 13:3195-3212, 1985.
21. Morgens PH, Grabau EA, Gesteland RF: A novel soybean mitochondrial transcript resulting from a DNA rearrangement involving the 5S rRNA gene. Nucleic Acids Res 12:5665-5684, 1984.
22. Rigby PWJ, Dieckmann M, Rhodes C, Berg P: Labeling deoxyribonucleic acid to high specific activity in vitro by nick translation with DNA polymerase I. J Mol Biol 113:237-251, 1977.
23. Sanger F, Nicklen S, Coulson AR: DNA sequencing with chain terminating inhibitors. Proc Natl Acad Sci USA 74:5463-5467, 1977.
24. Saraste M, Wikstrom M: On the location of prosthetic groups in cytochrome aa3 and bc1. In: Quagliariello E, Palmieri F (eds) Structure and Function of Membrane Proteins. Elsevier, Amsterdam, 1983, pp 139-144.
25. Schuster W, Brennicke A: TGA-termination codon in the apocytochrome b gene from Oenothera mitochondria. Curr Genetics 9:157-163, 1985.
26. Southern EM: Detection of specific sequences among DNA fragments separated by gel electrophoresis. J Mol Biol 98:503-517, 1975.
27. Sparks RB Jr, Dale RMK: Characterization of 3H-labeled supercoiled mitochondrial DNA from tobacco suspension culture cells. Molec gen Genet 180:351-355, 1980.
28. Spencer DF, Schnare MN, Gray MW: Pronounced structural similarities between the small subunit ribosomal RNA genes of wheat mitochondria and Escherichia coli. Proc Natl Acad Sci USA 81:493-497, 1984.
29. Stern DB, Newton KJ: Isolation of intact plant mitochondrial RNA using aurintricarboxylic acid. Plant Molecular Biol Reporter 2:8-15, 1984.
30. Tzagoloff A: Cytochrome oxidase. Model of a membrane enzyme. In: Siekevitz P (ed) Mitochondria. Plenum Press, New York, 1982, pp 111-130.

Received 6 May 1986; in revised form 8 July 1986; accepted 16 July 1986.