Plant Molecular Biology 7: 3-10, 1986 © Martinus Nijhoff Publishers, Dordrecht - Printed in the Netherlands

Sequence of two genes in pea chloroplast DNA coding for 84 and 82 kD polypeptides of the photosystem I complex

J. Lehmbeck,* O. F. Rasmussen,1 G. B. Bookjans,2 B. R. Jepsen, B. M. Stummann & K. W. Henningsen

Department of Genetics, The Royal Veterinary and Agricultural University, Bülowsvej 13, 1870 Copenhagen V, Denmark
1 Present address: Gensplejsningsgruppen, The Technical University of Denmark, building 227, 2800 Lyngby, Denmark
2 Present address: Albert Chandler Medical Center, Department of Biochemistry, University of Kentucky, Lexington, KY, USA

Keywords: chlorophyll a-binding protein, chloroplast DNA, nucleotide sequence, photosystem I gene

Summary

The genes encoding the two P700 chlorophyll a-apoproteins of the photosystem I complex were localized on the pea (Pisum sativum) chloroplast genome. The nucleotide sequence of the genes and the flanking regions has been determined. The genes are separated by 25 bp and are probably cotranscribed. The 5' terminal gene (psaA1) codes for a 761-residue protein (MW 84.1 kD) and the 3' terminal gene (psaA2) for a 734-residue protein (MW 82.4 kD). Both proteins are highly hydrophobic and contain eleven putative membrane-spanning domains. The homology to the corresponding polypeptides from maize is 89% and 95% for psaA1 and psaA2, respectively. A putative promoter has been identified for the psaA1 gene, and potential ribosome binding sites are present before both genes.

* To whom reprint requests should be sent

Introduction

Photosystem I (PSI) particles have been isolated from the chloroplasts of several species of higher plants. The protein complement of these particles varies from 7 to 16 polypeptides (16, 17, 33, 34). The PSI complex is composed of a core complex (CPI) and a light harvesting complex. The current model for the CPI complex proposes that two similar or identical polypeptides of 60-70 kD bind one PSI reaction center and about 40 molecules of light harvesting chlorophyll a (3, 8, 19, 32). The PSI reaction center polypeptide(s) has been designated P700 chlorophyll a-apoprotein. There is conflicting evidence regarding the number and molecular weights of the polypeptide(s) that comprise the CPI complex. Several workers have concluded that the CPI complex is composed of a single 60-70 kD polypeptide (6, 14, 25). However, two closely spaced protein bands have repeatedly been observed upon denaturation of the chlorophyll protein complex (2, 6, 18, 28, 30). For spinach and barley these two components have similar amino acid compositions (13, 34) and produce similar peptide fragments upon protease digestion (33, 34). The P700 chlorophyll a-apoprotein(s) is synthesized within the chloroplast (25, 34). The localization of the genes for two apoproteins on the chloroplast genome has recently been reported for spinach and maize (9, 34). The sequence of the maize genes has been published (9). In maize the two genes, termed pslA1 and pslA2 in (9) and psaA1 and psaA2 in the present work, code for 45% homologous polypeptides of 83.2 and 82.5 kD, respectively. The two genes are separated by 25 bp and are probably cotranscribed. Immunochemical data have shown that the gene psaA1 codes for a polypeptide of the CPI complex (9). The extensive homology between polypeptides psaA1 and psaA2 suggests that both may be components of CPI (9).

We present here the sequence of a 5.0 kb EcoRI fragment of pea chloroplast DNA containing the photosystem I genes psaA1 and psaA2. The genes and the corresponding polypeptides are highly homologous to those from maize.

Materials and methods

α-32P-dATP, α-35S-dATP and M13 sequencing primer were obtained from Amersham International. Dideoxynucleotides were from Boehringer (Mannheim). All enzymes were from these two companies. Chloroplast DNA was isolated from pea and spinach as previously described (4). DNA was digested with restriction enzymes according to the manufacturer's specifications and fractionated by gel electrophoresis. Isolation of pea and spinach chloroplast DNA fragments from agarose gels and DNA sequencing by the M13 dideoxynucleotide method (27) were performed as previously described (24). Appropriate fragments were cloned in pUC12 and in M13mp8 and mp9, using E. coli JM83 and JM103, respectively. Nick-translation of spinach chloroplast DNA probes and hybridization of the probes to nitrocellulose filters carrying restriction fragments of pea chloroplast DNA were carried out essentially as previously described (24).

Results

Localization of the genes encoding the P700 chlorophyll a-apoproteins

The genes encoding the P700 chlorophyll a-apoproteins of the photosystem I complex were localized on the pea chloroplast genome by hybridization with the corresponding genes localized on the spinach chloroplast genome (1, 34). In spinach 92% of the region covered by the genes is contained in the two fragments BamHI-15a and -20. The fragment BamHI-15a contains the promoter-proximal gene psaA1 and the 5' end of the distal gene psaA2. The BamHI-20 fragment contains the 3' end of the distal gene psaA2. These two BamHI fragments were labelled with 32P and hybridized to pea chloroplast DNA digested with the restriction endonucleases EcoRI, PstI and XhoI (data not shown). The fragment BamHI-15a hybridized to EcoRI 5.0 kb, PstI 5.7 kb and 12.2 kb, and XhoI 3.9 kb fragments. The fragment BamHI-20 hybridized to EcoRI 5.0 kb, PstI 12.2 kb and XhoI 8.2 kb fragments. Accordingly, the genes are contained in an EcoRI 5.0 kb fragment, and transcription starts in the PstI 5.7 kb fragment and proceeds into the adjacent PstI 12.2 kb fragment (22). The psaA1 gene starts 4.1 kb 5' to the psbD gene (24) and is transcribed in the opposite direction. Fig. 1 shows the location and the transcription direction of the P700 chlorophyll a-apoprotein genes psaA1 and psaA2 on the pea chloroplast genome.

[Fig. 1 map label: Pea chloroplast DNA (120.2 kb)]

Fig. 1. (A) Map of pea chloroplast DNA showing the location of rRNA and protein genes. The PstI sites (22) are indicated by asterisks. Arrows indicate direction of transcription. The genes for 16S and 23S rRNA (22), 4 subunits of ATP synthase (atpA, B, E and H (10)), cytochrome f (petA (35)), a 15 kD polypeptide of the cytochrome b-f complex (petD (23)), the large subunit of ribulose-bisphosphate carboxylase (rbcL (20)), the 32 kD protein (psbA (21)), the D2 protein (psbD (24)), the 44 kD protein (psbC, Bookjans et al., unpublished), the 51 kD protein (psbB, Lehmbeck et al., unpublished) and the two PSI polypeptides (psaA) are shown. (B) The genes psaA1 and psaA2 and their orientation relative to the psbD gene (24). Arrows indicate direction of transcription. P, E and X indicate sites for PstI, EcoRI and XhoI, respectively.

Sequence analysis of the genes for the P700 chlorophyll a-apoproteins

The method of Sanger (27) was used to determine the DNA sequence of the 5.0 kb EcoRI fragment of the pea chloroplast genome. The fragment was sequenced in both directions and all restriction sites were overlapped. The nucleotide sequence is shown in Fig. 2. Analysis of the sequence revealed, as expected, two large open reading frames on the same DNA strand. The gene located upstream, psaA1, contains information for a 761 amino acid polypeptide with a predicted molecular weight of 84.1 kD. The second gene, psaA2, codes for a 734 amino acid polypeptide with a molecular weight of 82.4 kD. The deduced amino acid sequences corresponding to the genes are shown over the nucleotide sequence in Fig. 2. The percentage of hydrophobic amino acids in the psaA1 and psaA2 polypeptides is 50.5% and 54.8%, respectively. The amino acid sequences of the two polypeptides are 41.7% homologous. Comparison of the pea sequence with the sequence of the genes for psaA1 and psaA2 in the maize chloroplast genome (9) clearly establishes the identity of the two open reading frames as the genes for the psaA1 and psaA2 polypeptides. The pea psaA1 and psaA2 sequences are 86% and 89% homologous to the corresponding genes of maize. The homology between the pea and maize sequences upstream from the start codon of psaA1 is 98% for the first 46 bp and 60% for the next 168 bp. Before this region there is no significant homology between the two sequences. The 25 bp region between the two genes is completely conserved between maize and pea. Downstream from the psaA2 gene the homology is 68% for the first 22 bp and thereafter there is no significant homology. In Fig. 3A and 3B the amino acid sequences of the psaA1 and psaA2 polypeptides of pea are compared with those of maize. The homology is 89% and 95% for the psaA1 and psaA2 polypeptides, respectively.
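The homology percentages above are counts of matching positions in a pairwise alignment. A minimal sketch of that calculation follows, assuming the two sequences are already aligned to equal length with `-` marking gaps and that the denominator is the full alignment length; the paper does not state which denominator was used, and the toy fragments are made up for illustration rather than taken from Fig. 3.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns where both sequences carry the same residue.

    Assumption: the denominator is the alignment length, gap columns included.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical aligned fragments, for illustration only.
toy_pea = "MIIRSPEPKV"
toy_maize = "MIIRSSEP-V"
print(f"{percent_identity(toy_pea, toy_maize):.1f}% identical")
```

The same function works for nucleotide alignments, since it only compares characters column by column.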
The sequence GGAGG at position -6 to -10 from the initiation codon in psaA1 and at nucleotide -15 to -19 from the initiation codon in psaA2 may function as a ribosome binding site (31) by pairing with the sequence CCUCC at the 3' end of plastid 16S rRNA (29). A putative promoter set can be identified 5' to the gene psaA1. The sequence CATAAT, at position 260 in Fig. 2, is similar to the -10 consensus sequence TATAAT of chloroplast DNA (7). Sixteen base pairs 5' to this, the sequence TTGAGC is located, which is similar to the -35 consensus sequence TTGANA (7). A putative promoter set is also located about 170 bp upstream from the start codon for psaA2 (around position 2590 in Fig. 2) in the coding region of the preceding gene psaA1. It has the -10 like sequence CATAAT and the -35 like sequence TTGAAT. From position 125 to 161 in Fig. 2 an inverted repeat is found that can form a highly stable hairpin structure (ΔG° = -78.7 kJ/mol according to (5)).
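The two sequence arguments in this paragraph can be checked mechanically: GGAGG in the mRNA pairs antiparallel with CCUCC, and each putative promoter element differs from its consensus at a single position. The sketch below is only an illustration; the function names are hypothetical, and `N` in a consensus is assumed to match any base.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def pairing_partner_rna(dna_site: str) -> str:
    """Return the RNA strand (5' to 3') that pairs antiparallel with the mRNA copy of dna_site."""
    mrna = dna_site.upper().replace("T", "U")
    return "".join(RNA_PAIR[base] for base in reversed(mrna))


def consensus_matches(site: str, consensus: str) -> int:
    """Count positions agreeing with the consensus; 'N' in the consensus matches any base."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have equal length")
    return sum(1 for s, c in zip(site.upper(), consensus.upper()) if c == "N" or s == c)


# Ribosome binding site versus the 3' end of plastid 16S rRNA.
print(pairing_partner_rna("GGAGG"))

# Putative promoter elements versus the chloroplast consensus sequences quoted in the text.
for site, consensus in (("CATAAT", "TATAAT"), ("TTGAGC", "TTGANA"), ("TTGAAT", "TTGANA")):
    print(site, consensus, f"{consensus_matches(site, consensus)}/{len(site)}")
```

Counting matches this way says nothing about promoter strength; it only makes the word "similar" in the text concrete.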

Discussion

The nucleotide sequence of the 5.0 kb EcoRI fragment of pea chloroplast DNA revealed two large uninterrupted open reading frames (ORFs) (Fig. 2). The two ORFs were identified as the genes for apoproteins of the P700 chlorophyll a-protein complex (CPI) by hybridization with the corresponding genes of spinach (1, 34) and by comparison to the sequence of the maize genes (9). The upstream and downstream genes, termed psaA1 and psaA2, code for proteins of 761 and 734 amino acid residues with molecular weights of 84.1 and 82.4 kD, respectively. These sizes deviate significantly from the apparent molecular weight of 60-70 kD for the CPI apoproteins estimated by electrophoretic mobility in SDS-polyacrylamide gels (9, 16, 34). There is no evidence that synthesis of the proteins involves a precursor protein. It is unlikely that translation starts at an internal ATG codon in either of the ORFs, because only the first Met codon in both genes is preceded by the sequence GGAGG, which is considered a strong ribosome binding site (31). A similar inconsistency between the size of a protein estimated from its electrophoretic mobility and that calculated from its predicted amino acid sequence has been noted for other hydrophobic membrane proteins (1, 9, 11, 15, 24). Hydrophobic polypeptides are known to bind more SDS than hydrophilic ones and this may cause underestimation of the molecular weight for the former.
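The gap between the predicted sizes and the 60-70 kD gel estimates can be seen with a back-of-the-envelope calculation. The sketch below assumes an average residue mass of 110 Da, a common rule of thumb that is not a value from this paper; the reported molecular weights come from the deduced sequences themselves, so the figures agree only approximately.

```python
# Rule-of-thumb assumption, not taken from the paper.
AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_mass_kd(n_residues: int) -> float:
    """Rough protein mass in kD from the residue count alone."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


for name, n_residues, reported_kd in (("psaA1", 761, 84.1), ("psaA2", 734, 82.4)):
    estimate = approx_mass_kd(n_residues)
    print(f"{name}: {n_residues} residues, roughly {estimate:.1f} kD (reported {reported_kd} kD)")
```

Even this crude estimate places both polypeptides above 80 kD, well outside the 60-70 kD range obtained from SDS-polyacrylamide gels.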

[Fig. 2 sequence listing: nucleotides 1-4999 of the EcoRI fragment, with the deduced psaA1 and psaA2 amino acid sequences printed above the nucleotide sequence.]

Fig. 2. Nucleotide sequence of the 5.0 kb EcoRI fragment from the pea chloroplast genome. The amino acid sequences of the psaA1 and psaA2 polypeptides are given in one letter code. Sets of possible -35 and -10 promoter sequences and putative ribosome binding sites (rbs) are indicated for both genes.

The amino acid compositions of both polypeptides, psaA1 and psaA2, show a preponderance of hydrophobic residues (52% and 54%, respectively), as expected for intrinsic membrane proteins. A hydropathy plot for both proteins shows that the hydrophobic residues are clustered into eleven major regions large enough to traverse the membrane in the form of α-helices containing 19 to 25 amino acids each (cf. Fig. 3A and 3B). The overall patterns for the two proteins are very similar. The same patterns are found for the corresponding maize polypeptides (9). The amino acid sequences of the psaA1 and psaA2 polypeptides from pea are 41.7% homologous (not shown). In maize the two polypeptides are 45% homologous (9). As suggested in (9), this pair of genes may have arisen by gene duplication.
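A hydropathy plot of the kind referred to above is a sliding-window average of per-residue hydropathy values. The sketch below uses the Kyte-Doolittle scale with a 19-residue window and a threshold of 1.6; these are assumptions chosen for illustration and not necessarily the procedure used for Fig. 3, whose caption cites (12) for the membrane-spanning regions. The test sequence is hypothetical.

```python
# Kyte-Doolittle hydropathy values (an assumed scale, see the note above).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19) -> list[float]:
    """Mean hydropathy of every window of the given length along the sequence."""
    values = [KYTE_DOOLITTLE[residue] for residue in seq.upper()]
    if window <= 0 or len(values) < window:
        return []
    total = sum(values[:window])
    profile = [total / window]
    for i in range(window, len(values)):
        # Slide the window one residue: add the new value, drop the oldest.
        total += values[i] - values[i - window]
        profile.append(total / window)
    return profile


def hydrophobic_window_starts(seq: str, window: int = 19, threshold: float = 1.6) -> list[int]:
    """0-based start positions of windows whose mean hydropathy reaches the threshold."""
    return [i for i, v in enumerate(hydropathy_profile(seq, window)) if v >= threshold]


# Hypothetical sequence: charged flanks around a 20-residue leucine stretch.
toy = "DDKKEE" + "L" * 20 + "RRKKDD"
print(hydrophobic_window_starts(toy))
```

Runs of consecutive start positions above the threshold mark candidate membrane-spanning segments; the window length of 19 matches the lower end of the helix lengths quoted in the text.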

Comparison of the pea and maize psaA1 proteins shows that they differ in 84 out of the 761 amino acids (89% homology) (Fig. 3A). Half of these differences are located in five deletions/insertions between the genes. All five deletions/insertions in the polypeptides and 32 of the remaining 42 amino acid substitutions are located outside the putative transmembrane segments. Comparison of the psaA2 polypeptides of pea and maize reveals differences in 38 out of the 734 amino acids (95% homology) (Fig. 3B). One of the changes is a deletion of one residue in the pea polypeptide. These changes in the psaA1 and psaA2 polypeptides could indicate regions of the proteins which are structurally and functionally unimportant or reflect coevolution of these partner proteins. It is notable that nearly all the His residues are conserved between the two species. There are 43 and 38 His residues in the pea psaA1 and psaA2 polypeptides, respectively. In the maize psaA1 and psaA2 polypeptides there are 42 and 39 His residues (9). His residues have been suggested to participate in the noncovalent binding of reaction center and antenna chlorophyll a (11). As pointed out by Fish et al. (9), the sequence Asp-Pro-Thr-Thr-Arg is present both in the P680 chlorophyll a-apoprotein of PSII, for which the gene has been sequenced in spinach (15), and in the maize psaA1 and psaA2 polypeptides. It was suggested that this sequence might be associated with the P680 and P700 reaction center chlorophylls (9). This sequence is conserved in the pea psaA1 and psaA2 polypeptides at positions 425-429 and 15-19, respectively (cf. Fig. 3A and 3B). In psaA1 this sequence is located in a fully conserved 21 residue region between two putative membrane spanning regions (see Fig. 3A). It is noteworthy that ten out of the 21 residues are charged (five Asp and five Arg).

Fig. 3. Comparison of the amino acid sequences of the psaA1 (A) and psaA2 (B) polypeptides from pea to the corresponding polypeptides from maize. The putative membrane spanning regions (12) are underlined.

The CPI complex appears to contain two polypeptides with molecular weights of 60-70 kD (3, 33, 34) which are largely homologous in their primary structure (3, 33). It is therefore tempting to assume that the CPI complex may contain one copy of each of the polypeptides psaA1 and psaA2, as suggested by Fish et al. (9). If the His residues serve to mediate the noncovalent binding of chlorophylls, then this complex would be able to bind ca. 40 chlorophyll molecules, which is in agreement with results obtained by Mullet et al. (16). They found that the PSI core complex (CPI complex) contains 40-45 chlorophyll molecules.

Prokaryotic-like promoter sequences are located in the 5' flanking regions of both the psaA1 and psaA2 gene (Fig. 2), but the short distance between the genes suggests that they are cotranscribed. In maize and spinach a 4.9 and a 6 kb transcript, respectively, which could carry information for both proteins, has been mapped to this region (26, 34).
However, in maize there are several smaller transcripts which also map to this region. They could arise either through the presence of different transcription initiation sites or by processing of a primary transcript. The high homology of the psaA genes and their flanking regions from pea and maize indicates that regulation of the genes probably occurs by similar mechanisms in these two species.
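The pea-maize homology figures quoted in the Discussion follow from the difference counts given there (84 of 761 positions for psaA1, 38 of 734 for psaA2). A short worked check, assuming homology is simply the fraction of unchanged positions over the pea polypeptide length:

```python
def homology_from_differences(n_positions: int, n_differences: int) -> float:
    """Percentage of positions that are unchanged (assumed definition of homology)."""
    if n_positions <= 0 or not 0 <= n_differences <= n_positions:
        raise ValueError("counts out of range")
    return 100.0 * (n_positions - n_differences) / n_positions


for name, length, differences in (("psaA1", 761, 84), ("psaA2", 734, 38)):
    value = homology_from_differences(length, differences)
    print(f"{name}: {value:.1f}% (rounds to {round(value)}%)")
```

Both values round to the percentages stated in the text, 89% and 95%.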

Acknowledgements

This work was supported by the Danish Agricultural and Veterinary Research Council, the Carlsberg Foundation and the Foundation of S. and I. Hansen.

References

1. Alt J, Morris J, Westhoff P, Herrmann G: Nucleotide sequence of the clustered genes for the 44 kd chlorophyll a apoprotein and the '32 kd'-like protein of the photosystem II reaction center in the spinach plastid chromosome. Curr Genet 8:597-606, 1984.
2. Anderson JM, Levine RP: The relationship between chlorophyll-protein complexes and membrane polypeptides. Biochim Biophys Acta 357:118-126, 1974.
3. Bengis C, Nelson N: Subunit structure of the chloroplast PSI reaction center. J Mol Biol 252:4564-4569, 1977.
4. Bookjans B, Stummann BM, Henningsen KW: Preparation of chloroplast DNA from pea plastids isolated in a medium of high ionic strength. Analyt Biochem 141:244-247, 1984.
5. Cantor CR, Schimmel PR: Biophysical Chemistry, part III, W. H. Freeman and Company, San Francisco, USA, 1980.
6. Chua N-H, Matlin K, Bennoun P: A chlorophyll-protein complex lacking in photosystem I mutants of Chlamydomonas reinhardtii. J Cell Biol 67:361-377, 1975.
7. Crouse EJ, Bohnert HJ, Schmitt JM: Chloroplast DNA synthesis, In: Ellis RJ (ed) Chloroplast biogenesis, Seminar series of the Society for Experimental Biology, vol. 21, Univ. Press, Cambridge, 1984, pp 83-136.
8. Dietrich WE, Thornber JP: The P700 chlorophyll a-protein of a bleached green alga. Biochim Biophys Acta 245:482-493, 1971.
9. Fish LE, Kück U, Bogorad L: Two partially homologous adjacent light-inducible maize chloroplast genes encoding polypeptides of the P700 chlorophyll a-protein complex of photosystem I. J Biol Chem 3:1413-1421, 1985.
10. Huttly AK, Gray JC: Localisation of genes for four ATP synthase subunits in pea chloroplast DNA. Mol Gen Genet 194:402-409, 1984.
11. Holschuh K, Bottomley W, Whitfeld PR: Structure of the spinach chloroplast genes for the D2 and 44 kd reaction-center protein of photosystem II and for tRNA-Ser(UGA). Nucleic Acids Res 12:8819-8834, 1984.
12. Hopp TP, Woods KR: Prediction of protein antigenic determinants from amino acid sequence. Proc Natl Acad Sci USA 78:3824-3828.
13. Lagoutte B, Setif P, Duranton J: Structure and molecular organization of the photosynthetic apparatus. In: Photosynthesis III, Akoyunoglou G (ed) Balaban International Science Services, Philadelphia, Pennsylvania, pp 237-243, 1981.
14. Machold O: On the molecular nature of chloroplast thylakoid membranes. Biochim Biophys Acta 281:103-122, 1975.
15. Morris J, Herrmann G: Nucleotide sequence of the P680 chlorophyll a apoprotein of the photosystem II reaction center from spinach. Nucleic Acids Res 12:2837-2850, 1984.
16. Mullet JE, Burke JJ, Arntzen CJ: Chlorophyll proteins of photosystem I. Plant Physiol 65:814-822, 1980.
17. Mullet JE, Grossman AR, Chua N-H: Synthesis and assembly of the polypeptide subunits of photosystem I. Cold Spring Harbor Symp Quant Biol 46:979-984, 1982.
18. Nechushtai R, Nelson N: Purification properties and biogenesis of Chlamydomonas reinhardii photosystem I reaction center. J Biol Chem 256:11624-11628, 1981.
19. Ogawa T, Obata F, Shibata K: Two pigment proteins in spinach chloroplasts. Biochim Biophys Acta 112:223-234, 1966.
20. Oishi KK, Tewari KK: Characterization of the gene and mRNA of the large subunit of ribulose-1,5-bisphosphate carboxylase in pea plants. Mol Cell Biol 3:587-595, 1983.
21. Oishi KK, Shapiro DR, Tewari KK: Sequence organization of a pea chloroplast DNA gene for a 34500-dalton protein. Mol Cell Biol 4:2556-1984.
22. Palmer JD, Thompson WF: Rearrangements in the chloroplast genomes of mung bean and pea. Proc Natl Acad Sci USA 78:5533-5537, 1981.
23. Phillips AL, Gray JC: Location and nucleotide sequence of the gene for the 15.2 kDa polypeptide of the cytochrome bf complex from pea chloroplasts. Mol Gen Genet 194:477-484, 1984.
24. Rasmussen OF, Bookjans GB, Stummann BM, Henningsen KW: Localization and nucleotide sequence of the gene for the membrane polypeptide D2 from pea chloroplast DNA. Plant Mol Biol 3:191-199, 1984.
25. Remy R, Hoarau J, Leclerc JC: Electrophoretic and spectrophotometric studies of chlorophyll protein complexes. Photochem Photobiol 25:151-158, 1977.
26. Rodermel SR, Bogorad L: Maize plastid photogenes: Mapping and photoregulation of transcript levels during light-induced development. J Cell Biol 100:463-476, 1985.
27. Sanger F, Nicklen S, Coulson AR: DNA sequencing using chain-terminating inhibitors. Proc Natl Acad Sci USA 74:5463-5467, 1977.
28. Satoh K: Properties of light-harvesting chlorophyll a/b-protein, and P700 chlorophyll a-protein of spinach chloroplast by isoelectric focusing. Plant Cell Physiol 20:499-512, 1979.
29. Schwarz Z, Kössel H: The primary structure of 16S rRNA from Zea mays chloroplast is homologous to E. coli 16S rRNA. Nature 283:739-742, 1980.
30. Setif P, Acker S, Lagoutte B, Duranton J: Contribution to the structural characterization of eukaryotic PSI reaction centre. Photosyn Res 1:17-27, 1980.
31. Shine J, Dalgarno L: The 3' terminal sequence of Escherichia coli 16S ribosomal RNA: Complementarity to nonsense triplets and ribosome binding sites. Proc Natl Acad Sci USA 71:1342-1346, 1974.
32. Shiozawa JA, Alberte RS, Thornber JP: The P700 chlorophyll a-protein. Arch Biochem Biophys 165:388-397, 1974.
33. Vierling E, Alberte RS: P700 chlorophyll a-protein. Plant Physiol 72:625-633, 1983.
34. Westhoff P, Alt J, Nelson N, Bottomley W, Bünemann H, Herrmann RG: Genes and transcripts for the P700 chlorophyll a apoprotein and subunit 2 of the photosystem I reaction center complex from spinach membranes. Plant Mol Biol 2:95-107, 1983.
35. Willey DL, Auffret AD, Gray JC: Structure and topology of cytochrome f in pea chloroplast membranes. Cell 36:555-562, 1984.

Received 11 November 1985; in revised form 26 February 1986; accepted 18 March 1986.
