Proc. Natl. Acad. Sci. USA Vol. 89, pp. 2125-2129, March 1992 Microbiology

A developmentally regulated chlamydial gene with apparent homology to eukaryotic histone H1 (Chlamydia trachomatis)

EVE PERARA*, DON GANEM, AND JOANNE N. ENGEL† Departments of Microbiology and Immunology and Medicine, University of California Medical Center, San Francisco, CA 94143-0502

Communicated by Bernard D. Davis, October 28, 1991

H1 proteins.§ A preliminary account of this work was presented at the 1990 International Chlamydia Symposium (Harrison Hot Springs, Vancouver, Canada) and is summarized in its proceedings (3).

ABSTRACT We have developed a method for the isolation of genes whose expression is developmentally regulated from the murine strain of Chlamydia trachomatis. Here we describe the identification of two developmental stage-specific genes, one of which is predicted to encode a 26-kDa lysine- and alanine-rich protein that appears to be homologous to several eukaryotic histone H1 proteins. A substantial proportion of this homology relates to its distinctive amino acid composition. No sequence homology was observed between this protein and other bacterial "histone-like" chromosomal proteins, but homology does exist with two other recently described prokaryotic proteins. The protein is expressed late in chlamydial development, during the transition from reticulate bodies to elementary bodies. The basic nature of the protein predicts that it could bind DNA, and Southwestern blotting experiments confirm this finding. These properties are consistent with a role either in the regulation of late gene expression or in the compaction of the chlamydial genome.

MATERIALS AND METHODS

Nucleic Acid Preparation and Analysis. Chlamydial DNA from the mouse pneumonitis strain of C. trachomatis (MoPn) or the lymphogranuloma venereum strain (serovar L2) of C. trachomatis was prepared as described (4). Total RNA was prepared from infected cells (at the indicated times following MoPn infection) as described (5).

Preparation and Screening of Chlamydial DNA Libraries. For preparation of a plasmid-based library, chlamydial DNA was digested with EcoRI and cloned into a pGEM7Zf vector (Promega) previously cleaved with EcoRI and dephosphorylated at the 5' termini. Individual insert-bearing clones were selected randomly and DNA was prepared from them by the minilysate method (6). Probes were made by nick-translation of individual miniprep DNAs and then used to probe Northern blots containing chlamydial RNA extracted from cells at various times postinfection or following ampicillin treatment or exposure to a brief heat shock (5). To minimize detection of contaminating host cell DNAs (4) in the screening process, DNA was prepared from chlamydia grown in the human cell line HeLa, while RNA was isolated from infected mouse L cells. The λ-based library was prepared in λgtWES as described (4).

DNA Sequencing. The dideoxy chain-termination method of DNA sequencing (7) was carried out on double- or single-stranded DNA prepared from fragment-containing plasmid or phagemid vectors, pGEM7Zf, or pBluescript KS(+) or SK(+) (Stratagene). Sequencing reactions were primed with oligonucleotides (Promega) complementary to the SP6 and T7 promoter sequences of pGEM7Zf that flank the inserted DNA. Reactions were carried out using the Sequenase kit (IBI) following the manufacturer's instructions for sequencing using the dideoxy G and the dideoxy I reagents provided.

Immunoblotting. SDS/polyacrylamide gel electrophoresis and Western blotting were carried out as described (5).
The anti-Hc1 antiserum, a rabbit polyclonal antiserum raised to the purified 18-kDa protein from the L2 serovar of C. trachomatis (8), and the anti-KARP antibody [where KARP indicates lysine (K)- and alanine (A)-rich protein], a mouse monoclonal antibody raised to purified 32-kDa protein from the L2 serovar of C. trachomatis (generously provided by Ted Hackstadt, Rocky

Chlamydia trachomatis is an obligate bacterial pathogen that displays an interesting developmental cycle involving two morphologically distinct forms. The spore-like, metabolically inactive elementary body (EB) is the extracellular infectious form. EBs attach to and enter eukaryotic cells; once within a host-derived cytoplasmic vacuole, they differentiate into metabolically active reticulate bodies (RBs). After replicating within this vacuole, the RBs redifferentiate into EBs, thus completing the developmental cycle (1). This life cycle has been well described morphologically, and distinctive ultrastructural changes have been observed during the transition between the two developmental forms (1). In addition to significant differences in size and membrane permeability between EBs and RBs, chromatin organization varies markedly between the two forms. Notably, the DNA genome of RBs resembles that of other bacteria, with diffuse fine fibrils extending throughout the cell, whereas mature EBs have a discrete, condensed electron-dense nucleoid that appears to be unique among prokaryotes (2).

These changes and others that characterize the chlamydial developmental cycle are temporally regulated. We are interested in identifying and characterizing genes that are involved in these developmental transitions. As a first step toward this goal, we have developed a straightforward though laborious method for the identification of chlamydial genes whose expression is regulated either temporally or by environmental stimuli. We report here the cloning of two genes that are expressed specifically late in chlamydial development, during the transition from RBs to EBs. Surprisingly, characterization of one of these genes reveals it to encode a small, basic protein that shows apparent homology to eukaryotic histone

Abbreviations: NBRF, National Biomedical Research Foundation; EB, elementary body; RB, reticulate body; hpi, hours postinfection; ORF, open reading frame; MoPn, mouse pneumonitis strain of C. trachomatis; KARP, lysine (K)- and alanine (A)-rich protein. *Present address: Department of Biology, San Francisco State University, San Francisco, CA 94132. †To whom reprint requests should be addressed. §The DNA sequences of the proteins discussed in this paper have been deposited in the GenBank data base (accession no. M86605).

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.


Microbiology: Perara et al.

Proc. Natl. Acad. Sci. USA 89 (1992)

Mountain Laboratories, Hamilton, MT), were used at a 1:200 dilution. The immunoblots were probed with a second antibody [goat anti-mouse or goat anti-rabbit IgG conjugated to alkaline phosphatase, respectively (Promega)]. The Southwestern blots were carried out according to Wagar and Stephens (9) except that 10^5 cpm of probe made by nick-translating either total MoPn DNA or Escherichia coli DNA was incubated with the Western blots at room temperature for 1-2 hr.


RESULTS

Identification of Developmentally Regulated Genes. To identify genes whose expression is stage-specific, we initially screened bacteriophage λ libraries containing genomic chlamydial DNA for their ability to anneal to [32P]cDNA prepared from RNA isolated from chlamydia-infected cells at early and late times following infection. In this differential screening approach, we looked first for clones that annealed preferentially to late but not to early cDNA probes. Of >10,000 clones screened, we obtained no clones of differentially regulated chlamydial genes, only clones arising from chlamydial rRNA and contaminating host RNA. Accordingly, we turned to a brute-force approach, simplistic and inelegant but made possible by the low complexity [≈1000 kilobases (kb)] of the chlamydial genome. In this screening process, we constructed a plasmid library of EcoRI-cleaved genomic DNA from the MoPn strain of C. trachomatis. DNA prepared from minipreps of individual recombinant plasmids was radiolabeled and used to probe Northern blots containing RNA purified from MoPn-infected cells at 16 or 22 hours postinfection (hpi) or at 22 hpi from cells treated with ampicillin (100 µg/ml). Ampicillin treatment is known to prevent the conversion of RBs to EBs late in the developmental cycle without affecting DNA replication (10); thus such treatment should yield RNA enriched in early or middle stage-specific transcripts and lacking mRNAs expressed only late in the developmental cycle. DNA and RNA were prepared from infected cells of different species to minimize cross-annealing to contaminating host clones. Eighty individual clones were screened, and nine hybridized specifically to chlamydial RNA on Northern blots; none hybridized to host RNA or DNA. Of these nine, we identified two clones that hybridized to RNAs that were expressed only late in the developmental cycle and whose expression was specifically inhibited by treatment with ampicillin.
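The decision rule applied to each clone in this screen can be summarized in a short sketch. This is an illustration only: the record type `NorthernSignal`, the function `classify`, and the reduction of each Northern lane to a present/absent call are assumptions made here for clarity, not part of the original procedure, in which bands were judged by eye on autoradiographs.

```python
from dataclasses import dataclass


@dataclass
class NorthernSignal:
    """Hypothetical present/absent band calls for one plasmid clone."""
    hpi16: bool      # band with RNA from cells at 16 hpi
    hpi22: bool      # band with RNA from cells at 22 hpi
    hpi22_amp: bool  # band with RNA from ampicillin-treated cells at 22 hpi


def classify(signal: NorthernSignal) -> str:
    """Sort a clone into the categories used in the screen (assumed rule)."""
    if not (signal.hpi16 or signal.hpi22 or signal.hpi22_amp):
        return "no chlamydial transcript detected"
    # Late stage-specific: seen only at 22 hpi and lost with ampicillin.
    if signal.hpi22 and not signal.hpi16 and not signal.hpi22_amp:
        return "late stage-specific"
    return "other"


if __name__ == "__main__":
    print(classify(NorthernSignal(hpi16=False, hpi22=True, hpi22_amp=False)))
```

Under this rule a clone is scored as late stage-specific only when its transcript is absent at 16 hpi, present at 22 hpi, and absent again when ampicillin blocks the RB-to-EB conversion.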
These genes are of special interest as they are likely to encode products that are expressed during the transition from RBs to EBs or products that are EB-specific. Below we describe their further characterization.

Cloned Late Stage-Specific Chlamydial Genes. One of these genes encoded a small (≈250 bases), abundant RNA. The DNA sequence of this clone revealed only one open reading frame (ORF) of 43 amino acids. Comparison of the nucleic acid sequence and the derived amino acid sequence with the corresponding National Biomedical Research Foundation (NBRF) data bases revealed no homology to known genes or proteins. At present we know nothing of the function of this RNA or its putative protein product in the chlamydial life cycle and have not considered it further in this work. The second clone (pG83) isolated in this way hybridized to an ≈650-base-pair (bp) RNA; the temporal pattern of expression of this RNA is shown in Fig. 1A. At early (8 hpi, lane 2) and middle cycle (12 hpi, lane 3) little or no expression is detected, but significant levels of RNA are seen late in development (at 18 and 21 hpi; lanes 4 and 5, respectively). Ampicillin treatment inhibits expression (lane 7). Brief exposure to a heat shock of 45°C does not alter its expression (lane 6). These studies indicate that it is specifically expressed late in the developmental cycle during the transition


FIG. 1. (A) Northern blot analysis of temporal expression pattern of pG83 transcript. Approximately equal amounts of total chlamydial RNA (as determined by ethidium bromide staining) prepared from uninfected cells (lane 1) and from chlamydia-infected cells at 8 hpi (lane 2), 12 hpi (lane 3), 18 hpi (lane 4), 21 hpi (lane 5), 21 hpi after exposure to 45°C for 10 min (HS; lane 6), and 21 hpi with ampicillin treatment (AMP; 10 µg/ml) (lane 7) were electrophoresed through a 1% agarose/formaldehyde gel (6), transferred to Hybond paper, and hybridized to 10^7 cpm of a nick-translated probe made from the plasmid pG83. (B) Restriction map of pG83. A more detailed map of the 1.7-kb Sac I/Nco I fragment is illustrated below the map. Potential ORFs are indicated by the shaded boxes. The region of hybridization to the 650-nucleotide RNA is illustrated. This extent was determined by Northern blot analysis with subclones derived by restriction endonuclease digestion of the pG83 clone (data not shown). The sequencing strategy is indicated by arrows.

from RB to EB. Further RNA blot analysis with probes derived from subcloned restriction fragments demonstrated that the transcribed region could be localized to one strand of a 1.7-kb Sac I-Nco I fragment (Fig. 1B). Probes A and B hybridized well to the 650-bp mRNA, whereas probe C hybridized weakly (Fig. 1B and data not shown).

pG83 Encodes a Basic Protein with Apparent Homology to Eukaryotic Histone H1. DNA sequencing from the Nco I site to near the Sac I site reveals two large ORFs (Fig. 1B and data not shown). Beginning directly at the Nco I site is ORF1, containing 124 codons. Comparison to the protein data base using the FASTA program (11) demonstrates that ORF1 displays strong homology to the C-terminal 120 amino acids of bacterial aminopeptidase A [43% identity in 88-amino acid overlap (12)] and to bovine leucine aminopeptidase [46.1% identity in 102-amino acid overlap (13); data not shown]. ORF1 appears, then, to represent the C terminus of the coding region for the chlamydial homologue of aminopeptidase A. One hundred forty-six nucleotides beyond the termination codon for ORF1 is a second coding region (ORF2) that encodes a highly repetitive, alanine- and lysine-rich 207-amino acid residue polypeptide (Fig. 2). A pentapeptide motif consisting of three small aliphatic residues, usually valine or

FIG. 2. Comparison of the predicted amino acid sequence of ORF2 of the MoPn strain of C. trachomatis (C. trach 83) with histone H1 from sea urchin. Amino acid identity is indicated by a colon, and conserved amino acids are indicated by periods; amino acids are denoted with the one-letter amino acid code. The pentad repeats in ORF2 are shown in alternating boldface and underlined type. The 26-amino acid repeat is indicated by italics. Analysis was performed by using the FASTA program (11).

alanine, followed by two basic residues, lysine or arginine, is repeated 26 times in the first 150 amino acids. This motif is contained within a 23-amino acid sequence (VRKVAAKKTVARKTVAKECAVAARK) repeated three times in the same region. The amino acid composition of this polypeptide is quite distinctive, consisting of 34% basic amino acid residues and 24% alanine residues. It contains no acidic residues and has a predicted isoelectric point of 13.13. Surprisingly, comparison of this sequence with the protein data base revealed that it shares amino acid homology with the C-terminal regions of histone H1 subtypes from several different eukaryotic species. Homologies ranged from 25% identity across 184 amino acids (with fruit fly histone H1) to 45% identity in a 108-amino acid overlap (with sea urchin histone H1). The optimal alignment is depicted in Fig. 2. Significant homologies were also seen to histone H1 from chicken, rainbow trout, mouse, fruit fly, and other species, but no homology was evident with known bacterial histone-like proteins (e.g., HU). Two bacterial proteins were found that share significant sequence homology with ORF2; both of these are also related to eukaryotic histone H1 proteins. Pseudomonas aeruginosa AlgR3 (14), also known as AlgP (15), is a 304-amino acid regulator of alginate synthesis that has 35.3% amino acid identity in a 136-amino acid overlap with ORF2. A 2310-residue portion of the 421-amino acid TolA protein from E. coli also shares 32.5% sequence identity in a 154-amino acid overlap with ORF2. The significance of this will be considered in Discussion. The observed 650-bp mRNA identified by Northern analysis is the presumed mRNA for the ORF2 product. This assignment is supported by the size of ORF2, by the fact that a 400-bp ORF2-specific probe hybridizes to this RNA (Fig. 1B, fragment B), and by RNase protection experiments (data not shown).
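The identity figures quoted above (for example, 45% identity in a 108-amino acid overlap) are ratios of identical aligned positions to overlap length. The following minimal sketch shows that calculation for a pair of already-aligned sequences. The function name `percent_identity`, the toy sequences, and the choice to exclude gap columns from the overlap are assumptions made for illustration; the FASTA program's own bookkeeping for gaps may differ, and this is not its implementation.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> tuple[float, int]:
    """Return (percent identity, overlap length) for two aligned sequences.

    Columns in which either sequence carries a gap are skipped (an
    assumption of this sketch, not necessarily the FASTA convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    overlap = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        overlap += 1
        if a == b:
            identical += 1
    if overlap == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / overlap, overlap


if __name__ == "__main__":
    # Toy lysine/alanine-rich strings, invented for this example only.
    identity, overlap = percent_identity("AKKAAK-PAAKK", "AKRAAKTPA-KK")
    print(f"{identity:.1f}% identity in a {overlap}-amino acid overlap")
```

Because lysine- and alanine-rich sequences share many residues by composition alone, a percentage of this kind is best read together with the overlap length and a significance estimate.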
We were unable to identify a specific transcript for ORF1 on Northern analysis; most likely, this protein is encoded by a nonabundant mRNA. Putative rho-independent


transcription termination signals are identifiable following each ORF.

ORF2 Encodes a 26-kDa Protein. The gene product of ORF2 was overexpressed in E. coli under the control of a T7 polymerase promoter (16). [3H]Lysine labeling of such cells revealed a polypeptide of ≈26 kDa that was synthesized in cells containing the pGEM7Zf plasmid into which ORF2 had been inserted under the control of the T7 promoter but not in cells containing the same plasmid bearing ORF2 in the opposite orientation or cells bearing the pGEM7Zf plasmid without any DNA insert (data not shown). The observed molecular mass is about 3 kDa larger than that predicted from the DNA sequence; the slow migration of this polypeptide on SDS/polyacrylamide gels is most likely due to the abundance of highly basic amino acids in this protein (data not shown).

ORF2 Protein Expression Parallels Its mRNA Expression. By metabolically labeling chlamydia-infected cells with either [35S]methionine or [3H]lysine, we were unable to detect the existence of a 26-kDa protein whose synthesis was inhibited by ampicillin. Independent of this work, Hackstadt raised a monoclonal antibody to a 32-kDa protein purified from the L2 serovar of C. trachomatis that bound to HeLa cell membrane proteins and to 125I-labeled heparin (8, 17, 18). N-terminal sequencing of the purified protein revealed that 14 of the 15 N-terminal amino acids are identical with the protein sequence encoded by the MoPn ORF2 gene (18); we thus presume that ORF2 encodes the MoPn homolog of the 32-kDa HeLa cell binding protein. To substantiate this hypothesis, we probed Western blots of lysates made late in the MoPn life cycle with the anti-32-kDa monoclonal antibody (kindly provided by T. Hackstadt). This experiment revealed the presence of a 26-kDa protein (Fig. 3B, lane 5) whose synthesis was absent in uninfected cells (lane 4) and in cells infected with chlamydia in the presence of ampicillin (lane 6).
Furthermore, as shown in Fig. 3C, the protein is only made late in the intracellular life cycle, when RBs redifferentiate back into EBs. Thus, the expression of the 26-kDa protein parallels the expression of its mRNA, suggesting that expression of ORF2 is regulated at the level of transcription.

ORF2 Encodes a DNA Binding Protein. The extremely basic character of the protein and the known DNA binding activities of its prokaryotic and eukaryotic homologs strongly suggest that the protein encoded by ORF2 can bind DNA. Accordingly, we directly assayed whether the 26-kDa protein could bind to DNA. Lysates from infected cells grown in the presence or absence of ampicillin were electrophoresed on SDS/polyacrylamide gels, electrophoretically transferred to nitrocellulose paper, and probed with nick-translated chlamydial genomic DNA. A 26-kDa protein is specifically bound by the double-stranded DNA probe (Fig. 3A, lane 2); this DNA binding activity is absent from extracts of cells infected with chlamydia in the presence of ampicillin (lane 3) and from uninfected HeLa cells (lane 1). The filter was then probed with the antibody to the L2 32-kDa protein; the 26-kDa DNA binding protein exactly comigrates with the polypeptide recognized by the monoclonal antibody (Fig. 3B, lane 5). Similar results were obtained when a Western blot was probed with nick-translated E. coli DNA (data not shown).

DISCUSSION

In this paper we have described a method to identify chlamydial genes that are expressed preferentially late in the chlamydial life cycle. This protocol has allowed us to identify an interesting gene whose product is likely to be involved in the conversion of RBs to EBs. Its mRNA and protein product are present at very low levels during the early stages of the life cycle. They increase markedly between 12 and 16 hpi, the time at which RBs begin to redifferentiate back into EBs. The developmentally regulated transcription of this gene and the


FIG. 3. KARP is a DNA binding protein. Extracts from uninfected HeLa cells (lanes 1 and 4), HeLa cells infected with MoPn for 18 hpi (lanes 2 and 5), or HeLa cells infected with MoPn for 18 hpi in the presence of ampicillin (100 µg/ml, lanes 3 and 6) were electrophoresed on SDS/polyacrylamide gels (19), electrophoretically transferred to nitrocellulose paper, and probed with 10^5 cpm of nick-translated probe made to total chlamydial DNA (9) (A) or with the monoclonal antibody to the 26-kDa protein from the L2 strain of C. trachomatis (B). Sizes are indicated in kDa. (C) Developmental Western blot of KARP protein synthesis. Lysates from chlamydia-infected cells in the absence (upper panel) or presence (lower panel) of ampicillin (100 µg/ml) at various times after infection were immunoblotted and probed with the monoclonal antibody to the 26-kDa protein. Lane 1, uninfected HeLa cells; lane 2, 8 hpi; lane 3, 12 hpi; lane 4, 18 hpi; lane 5, 24 hpi.

subsequent induction of this protein can be blocked by ampicillin, a drug that is thought to block (either directly or indirectly) the conversion of RBs to EBs (10). This protein displays several striking features. The predicted amino acid sequence of the encoded protein is very basic (34%: 71 residues of 207 are lysine or arginine) and alanine-rich (24%: 50 residues of 207); we therefore refer to this protein as C. trachomatis KARP. The predicted polypeptide contains no acidic residues (glutamate and aspartate) nor any aromatic side chains (tyrosine, tryptophan, or phenylalanine). The N-terminal two-thirds of this sequence is highly repetitive, consisting of 26 repeats of a pentapeptide motif composed of three residues with small aliphatic side chains (alanine, proline, glycine, serine, threonine, or valine) followed by two basic residues (arginine or lysine). Programs predicting protein secondary structure indicate that the long string of pentad repeats has a high potential for forming a kinked helical structure that would fit in the major groove of DNA, with the lysine residues providing a positive charge that could interact with the phosphate backbone of the DNA (F. Cohen, personal communication). By Southwestern blot experiments combined with immunoblot analysis, we have shown that the denatured KARP binds to double-stranded DNA. This DNA binding is most likely to be sequence nonspecific in nature.
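The composition figures and the pentapeptide motif described above lend themselves to a brief worked example. The sketch below recomputes the stated percentages (71 of 207 and 50 of 207 residues) and counts non-overlapping matches to the motif "three small residues followed by two basic residues" with a regular expression. The names `fraction` and `count_pentads` are hypothetical, and the 25-letter test string is simply the repeat as quoted in Results, used here as toy input; it contains a glutamate (E) although the protein is described as lacking acidic residues, so it should not be treated as an authoritative sequence.

```python
import re

SMALL = "APGSTV"  # residues with small side chains, as listed in the text
BASIC = "KR"      # lysine or arginine

# Three small residues followed by two basic residues.
PENTAD = re.compile(f"[{SMALL}]{{3}}[{BASIC}]{{2}}")


def fraction(seq: str, residues: str) -> float:
    """Fraction of positions in seq occupied by any of the given residues."""
    return sum(seq.count(r) for r in residues) / len(seq)


def count_pentads(seq: str) -> int:
    """Number of non-overlapping pentad motifs, scanning left to right."""
    return len(PENTAD.findall(seq))


if __name__ == "__main__":
    # Percentages stated for the full 207-residue polypeptide.
    print(round(100 * 71 / 207), round(100 * 50 / 207))  # 34 24

    fragment = "VRKVAAKKTVARKTVAKECAVAARK"  # toy input, quoted from Results
    print(count_pentads(fragment))           # 3
    print(f"{fraction(fragment, BASIC):.2f} basic, {fraction(fragment, 'A'):.2f} alanine")
```

Non-overlapping matching is a design choice of this sketch; a sliding-window count could give a different total on sequences where candidate pentads overlap.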

Surprisingly, a search of the current protein data base using the FASTA algorithm (11) revealed detectable primary amino acid sequence homology of KARP to several eukaryotic histone H1 proteins. When KARP was compared to the data base using the BLAST algorithm (20), the observed regions of homology to these eukaryotic histone H1s were demonstrated to be significant; it was estimated that the observed homology would occur by random chance with a probability of
