Vol. 174, No. 12
JOURNAL OF BACTERIOLOGY, June 1992, p. 3993-3999
0021-9193/92/123993-07$02.00/0 Copyright © 1992, American Society for Microbiology
Cloning, Sequencing, Mapping, and Transcriptional Analysis of the groESL Operon from Bacillus subtilis ANGELIKA SCHMIDT,' MICHAEL SCHIESSWOHL,1 UWE VOLKER,2 MICHAEL HECKER,2 AND WOLFGANG SCHUMANN"* Lehrstuhl fiir Genetik, Universitat Bayreuth, W-8580 Bayreuth, ' and Institut ifir Mikrobiologie, Universitat Greifswald, 0-2200 Greifswald,2 Germany Received 21 January 1992/Accepted 15 April 1992
Using a gene probe of the Escherichia coli groEL gene, a 1.8-kb HindIII fragment of chromosomal DNA of BaciUus subtilis was cloned. Upstream sequences were isolated as a 3-kb PstI fragment. Sequencing of 2,525 bp revealed two open reading frames in the ordergroES groEL. Alignment of the GroES and GroEL proteins with those of eight other eubacteria revealed 50 to 65% and 72 to 84% sequence similarity, respectively. Primer extension studies revealed one potential transcription start site preceding the groESL operon (S) which was activated upon temperature upshift. Northern (RNA) analysis led to the detection of two mRNA species of 2.2 and 1.5 kb. RNA dot blot experiments revealed an at least 10-fold increase in the amount of specific mRNA from 0 to 5 min postinduction, remaining at this high level for 10 min and then decreasing. A 9-bp inverted repeat within the 5' leader region of the mRNA might be involved in regulation of the beat shock response. By using PBS1 transduction, the groESL operon was mapped at about 3420. A group of molecular chaperonins include evolutionarily conserved proteins encoded by heat-induced genes (9, 13, 16). The prototype bacterial chaperonin is the product of Eschenchia coli groEL. Mutant studies have implicated groEL together with the cotranscribed groES in diverse processes, such as phage morphogenesis, protein folding, mRNA turnover, stress protection, oligomer assembly, and general protease activity (reviewed in references 13 and 41). The GroEL protein is a tetradecameric protein consisting of two stacked rings with seven subunits (Mr, about 60,000) in each ring. Related proteins in prokaryotes and eukaryotes with high N-terminal amino acid identity are termed chaperonin 60 proteins (Cpn60). The GroES protein is a heptameric protein (Mr, about 10,000), which forms a ring in vivo and binds to Cpn60 in the presence of Mg2+-ATP. The related group of proteins found in prokaryotes and mitochondria are designated CpnlO proteins. This unusual morphology was also described for a GroEL particle from Bacillus subtilis (5) that shared at least one antigenic determinant with the GroEL polypeptide from E. coli (1, 32). Electron microscopy analyses showed the GroEL-like chaperonin from B. subtilis as a cylindrical body with seven well-defined globules arranged almost parallel to the longitudinal axis of the particle (4). Recently, GroEL was identified on two-dimensional gels (21, 40). To study the function of the GroEL chaperonin in processes such as in vivo assembly, folding of proteins, and secretion of exoenzymes in B. subtilis and to study regulation of this heat shock gene under diverse stress conditions, we have initiated a program to clone and to characterize the genes encoding the GroEL and GroES homologs. In this paper we report the nucleotide sequence of 2.525-kb DNA which contains two heat shock genes equivalent to the E. coli groESL operon. Furthermore, we analyzed the transcription of this operon and mapped the groESL genes on the B. subtilis chromosome.
MATERUILS AND METHODS Bacterial strains and plasmids. The bacterial strains and plasmids used are listed in Table 1. B. subtilis and E. coli strains were routinely grown at 37°C in Luria-Bertani medium. For the induction of the heat shock regulon, bacteria were shifted from 37 to 48°C. Concentrations of antibiotics, when used, were as follows: ampicillin, 100 ,ug/ml; chloramphenicol, 5 ±g/ml; and kanamycin, 25 ,ug/ml. Enzymes and chemicals. Restriction enzymes, T4 DNA ligase, Klenow fragment, and calf intestinal alkaline phosphatase were obtained from Bethesda Research Laboratories. Avian myeloblastosis virus reverse transcriptase was from Stratagene. DNA probes were labeled either with [ot-32P]dATP (specific activity, 3,000 Ci/mmol; Amersham Corp.) or with digoxigenin--dUTP (Boehringer GmbH, Mannheim, Germany) by the random priming technique of Feinberg and Vogelstein (11, 42). Synthetic oligodeoxynucleotides were purchased from Appligene (Heidelberg, Germany).
DNA isolation and cloning of the groE locus. B. subtilis MB11 DNA was digested with HindIll. DNA fragments of about 2 kb were eluted from a 0.8% agarose gel, ligated into HindIII-digested pBR322, and transformed into E. coli RR1. Plasmid screening indicated that 80% of the transformants contained plasmids with an insert of about 2 kb in length. A total of 143 transformants were analyzed by probing Southern blots of HindIll-digested recombinant plasmids with the digoxigenin--dUTP-labeled 1.2-kb BstEII fragment of E. coli groEL gene recovered from pOF39 (10). This screening led to the detection of two positive plasmids (pASG58 and pASG129). Sequencing of pASG58 carrying an about 1.8-kb insert revealed that the HindIll fragment did not contain the complete groEL gene. The remaining part was isolated by cloning chromosomal 3-kb PstI fragments into pACYC177. Positive clones were identified by using the digoxigenin-dUTP-labeled 1-kb HindIII-PstI fragment of pASG58 containing an internal part of groEL. Two of 346 clones analyzed (pASG145 and pASG241) carried the desired PstI insertion. Determination of DNA sequence. Most restriction fragments to be sequenced were ligated into the appropriate sites 3993
SCHMIDT ET AL.
J. BACTE RIOL.
TABLE 1. Bacterial strains and plasmids Strain or plasmid
E. coli RR1
ara-14 galK2 lacY] mlt-i proA2 rpsL20 supE44 xyl-5
or Source reference
rB mB B. subtilis MB11 AGS1
Plasmids pBR322 pJH101 pOF39
hisH2 met B10 lys-3 trpC2 MB11 carrying chromosomal insertion of pAGS02 MB11 carrying chromosomal insertion of pMS01
Apr Tcr Apr Tcr Cmr pBR325 carrying groESL of E. coli pWD3 Apr Cmr promoterless amyL pACYC177 Apr KMr pASG58, pAGS129 pBR322 containing 1.8-kb HindIII of 'groEL of B. subtilis pASG145, pASG241 pACYC177 containing 3-kb PstI of groESL' of B. subtilis pASGO2 pJH101 containing 1.0-kb HindIII-EcoRI fragment of pASG58 pMSO1 pWD3 containing internal 1.4kb fragment of groEL of B. subtilis
28 This work This work
3 12 10 38 6 This work
This work This work
of M13 vectors. DNA sequencing was performed by the dideoxy-chain termination method (30) with [a-355]dATP (specific activity, 650 Ci/mmol; Amersham Corp.) and the Sequenase kit (version 2.0) of U.S. Biochemical Corp. Universal primer was used for M13 sequencing, and oligonucleotides ON1 (5'-GTTGTACCGTCACCGGC-3') and ON2 (5'-GCAGAATCCGGCAACAC-3') were used for double-stranded DNA sequencing (15) with plasmid DNA as the template. Both strands of the groESL operon were sequenced. Analyses of transcription. Isolation of total RNA, primer extension, Northern (RNA) analyses, and analyses of mRNA synthesis by dot blotting were performed as described previously (37). Primer extension was carried out with ON1, ON2, and ON3 (5'-AGTGCATCGACACCGCG3'). Mapping of the groE locus. Mapping of the groE locus was performed by PBS1 transduction (18). First, a chloramphenicol acetyltransferase (cat) gene was integrated at the site of groEL. This was accomplished by ligating the 858-bp HindIII-EcoRI fragment of pAGS58 (see Fig. 1) into pJH101 (pAGS02). This recombinant integration vector was transformed into B. subtilis MB11, selecting for transformants on plates containing chloramphenicol and screening chromosomal DNA from integrants by Southern blotting for the correct arrangement at the groE locus (data not shown). One of these integrants (ASG1) was used for genetic mapping. A lysate of PBS1 grown on ASG1 was used to infect the set of reference strains of Dedonder et al. (8) and selecting for Cmr transductants. DNA and protein sequence analyses. Sequence analyses were performed by using the PC/GENE package (IntelliGe-
netics) for IBM personal computers and HUSAR (Heidelberg Unix Sequence Analysis Resources).
Nucleotide sequence accession number. Sequence data in this work have been assigned GenBank data base accession no. M84965.
RESULTS Molecular cloning of the groEL locus. Southern blot experiments revealed that an internal part of the E. coli groEL gene hybridized to a 2-kb HindlIl fragment and a 3-kb PstI fragment of B. subtilis chromosomal DNA (data not shown). We cloned the groEL homolog from chromosomal DNA of B. subtilis as described in Materials and Methods. Screening of individual recombinant plasmids revealed two that gave clear hybridization signals (pASG58 and pASG129). Both plasmids carried a 1.8-kb HindlIl fragment in the same orientation. Sequencing of pASG58 revealed the presence of an open reading frame (ORF), the translation product of which was highly homologous to the GroEL protein of E. coli. Furthermore, it became apparent that this ORF was incomplete, as it was missing the 5' end of groEL. Isolation of this part was accomplished by cloning a 3-kb PstI fragment which gave a signal in the chromosomal DNA blot (data not shown). Two recombinant plasmids (pASG145 and pASG241), both carrying the 3-kb PstI fragment in the same orientation, were recovered (Fig. 1). Sequencing and features of the groE locus. By using M13 subclones and synthetic oligonucleotides, the sequence of the groE locus was determined. A total of 2,525 bp was sequenced, revealing the presence of two contiguous ORFs (Fig. 1 and 2). ORF1 is 282 nucleotides long, beginning with a UUG start codon at position 168 and ending with a UAA stop codon at position 450. Alternative start codons GUG, UUG, and AUU are frequently used in prokaryote genes; all are single-base changes from AUG (26). ORF1 encodes a peptide of 94 amino acids with a predicted molecular mass of 10,158 Da and a calculated pl of 4.56. This polypeptide exhibits significant sequence identity with GroES of E. coli and other bacterial species (see below). Therefore, we designated ORF1 groES. ORF2 is 1,632 nucleotides long, beginning with an AUG start codon at position 499 and ending with two UAA stop codons at position 2131. The start codon of ORF2 is 46 nucleotides downstream from the stop codon of ORF1. ORF2 encodes a polypeptide of 544 amino acids with a predicted molecular mass of 57,424 Da, a calculated pl of 4.49, and three tandem repeats of the Gly-Gly-Met motif at its C terminus. This motif has been found in GroEL proteins of various species, but its biological function remains unclear. As already mentioned, this ORF exhibits a significant sequence identity with groEL of E. coli and other bacterial species (see also below) and was designated as groEL. The GroEL protein was blotted from a two-dimensional gel onto a polyvinylidene difluoride membrane, and its N-terminal amino acid sequence was determined with an Applied Biosystems protein sequencer (A373). The sequence AKEIKFSEEARRAMLRGVDALADAVKVTLG agrees perfectly with the predicted amino acid sequence
Sequence similarities between GroES and GroEL and proteins of other bacterial species. Alignment of the deduced amino acid sequences with the GroES and GroEL sequences of other bacterial species clearly shows a high degree of
B. SUBTILIS groESL OPERON
VOL. 174, 1992
pASG58. pASG129 pASG 145. pASG241 pASGO2
FIG. 1. Restriction map and plasmids of the groE locus. Inserts in the various plasmids described in the text identified genes are indicated with arrows. B, BsmI; E, EcoRI; H, HindIll; P, PstI; Pv, PvuII.
similarity, indicating that the 10.1- and 57.4-kDa protein subunits are indeed equivalents of GroES and GroEL, respectively (Table 2). The overall similarity between GroES of B. subtilis and the others ranges from 50.2% (Chlamydia
trachomatis) to 65.1% (Streptomyces albus). In contrast, comparative amino acid analyses of the 57.4kDa GroEL protein of B. subtilis revealed a higher sequence similarity to the GroEL proteins of other bacterial species (Table 2). Overall, the amino acid similarity between B. subtilis GroEL and the others ranges from 71.9% (Mycobacterium tuberculosis) to 83.9% (Clostndium acetobutylicum). There is a remarkable similarity between GroES and GroEL, respectively, of Legionella micdadei and Coxiella bumetii (Table 2). One striking feature of the GroEL sequence is a carboxyterminal Gly-Gly-Met repeat. This repeat structure, although of variable length, is found in most bacterial GroEL sequences. The biological significance of the repeat is not yet known, but its widespread occurrence suggests that this region is functionally important. Features of the noncoding regions. Probable ribosomebinding sites, marked with asterisks in Fig. 2, are 4 and 6 nucleotides upstream from the start codons of ORFi and ORF2, respectively. These sites consist of 10 (ORF1) and 8 (ORF2) nucleotides complementary to the 3' end of B. subtilis 16S rRNA (22). Primer extension studies were used to identify potential transcription start sites for the two ORFs. Total RNA was prepared from bacterial cells grown at 37°C or at different times after a heat shock to 48°C. Synthetic oligonucleotides ON1, ON2, and ON3 derived from the coding region of groESL were used as primers. The analyses revealed one potential transcription start point located 67 nucleotides upstream of the translation initiation codon of groES (designated S; Fig. 3A and 2). This position corresponds to a T residue. Upon longer exosure of the sequencing gels, we observed an additional weak band immediately below the strong band (not shown). This weak band corresponds to a G residue (Fig. 2). The amount of transcripts initiated at S increased with the temperature shift from 37 to 48°C. There is a dramatic increase between 0 and 5 min, followed by a decrease 5 min after heat induction and remaining constant for at least 15 min (Fig. 3A). The transcriptional start point S is preceded by a sequence which resembles the consensus sequence of promoters recognized by the vegetative sigma factor of B. subtilis, sig-
shown. Locations of
ma-43 (Fig. 2). Therefore, heat-sensitive transcription from S might also be initiated by sigma-43. Furthermore, we found a palindromic structure downstream of S consisting of a 9-bp repeat separated by a 9-bp spacer (Fig. 2, nucleotides 106 to 132). This palindromic structure could be involved in the temperature-sensitive regulation of transcription (see Discussion). No stem-loop structure resembling a prokaryotic rho-independent transcription terminator could be found downstream of the groESL operon. Synthesis of mRNA. The in vivo transcripts of the groE locus were detected by dot blot and Northern analyses with total RNA prepared either from exponentially growing cells of B. subtilis or at different times after the bacteria had been transferred to 48°C. The dot blot experiments revealed a rapid and at least 10-fold increase in the mRNA level within 5 min after temperature upshift (Fig. 3B). The amount of mRNA declined between 15 and 30 min postinduction. At 30 min after temperature upshift, there was still an about threefold-higher level of mRNA as compared with that of noninduced cells. The Northern analysis was performed with the radioactively labeled 1.4-kb internal PvuII fragment of groESL as a probe (Fig. 1). This fragment hybridized to two mRNA species with molecular sizes of 2.2 and 1.5 kb (Fig. 3C). The 2.2-kb RNA could comprise the whole operon, whereas the 1.5-kb RNA could also be initiated at S and extend into groEL. To prove that groES and groEL form one operon, hybridization was repeated with ON1, ON2, or ON3. In all three cases, these oligonucleotides hybridized to the 2.2- and 1.5-kb mRNAs (31). The synthesis of both mRNA species was strongly induced following temperature upshift, thereby confirming the results of the dot blot analysis. Location of groEL on the B. subtilis chromosome. The position of groEL on the chromosome was determined by phage PBS1-mediated transduction with the chloramphenicol resistance gene in the groEL::cat mutant (strain ASG1) as a selectable marker. The groEL::cat fusion was 75% cotransduced with the auxotrophic mutation purA and 17% cotransduced with the sacS marker. These results indicate that the order of genes is sacS groESL purA, placing groEL at about 3420 on the B. subtilis genetic map (27). DISCUSSION We used the high degree of homology among the groEL genes to isolate the groEL homolog from B. subtilis. The