DNA AND CELL BIOLOGY Volume 9, Number 8, 1990 Mary Ann Liebert, Inc., Publishers Pp. 545-552

Nucleotide Sequences and Novel Structural Features of Human and Chinese Hamster hsp60 (Chaperonin) Gene Families

THOMAS J. VENNER, BHAG SINGH, and RADHEY S. GUPTA

ABSTRACT

A number of clones that specifically hybridize to the human hsp60 cDNA (chaperonin protein; GroEL homolog) were isolated from human and Chinese hamster ovary cell genomic libraries. DNA sequence analysis shows that one of these clones, pGem-10, is completely homologous to the human hsp60 cDNA (in both coding and noncoding regions) with no intervening sequences. The other human clones analyzed were all nonfunctional pseudogenes containing numerous small additions, deletions, and base substitutions, but no introns. On the basis of sequence data, six different hsp60 pseudogenes were identified in human cells. In addition, we also cloned and completely sequenced a genomic clone from CHO cells. This clone, which was also a pseudogene, contained a small 87-nucleotide intron near the 3' end. Southern blot analysis of human, mouse, and Chinese hamster DNA, digested with unique restriction enzymes (no sites in cDNA), indicates the presence of about 8-12 genes for hsp60 in the vertebrate genomes. The sequence data, however, suggest that most of these genes, except one (per haploid genome), are likely to be nonfunctional pseudogenes.

Department of Biochemistry, McMaster University, Hamilton, Ontario, Canada L8N 3Z5.

INTRODUCTION

It has long been assumed that all of the information necessary for proper folding of proteins and their assembly into oligomeric complexes is contained within the primary sequence of the polypeptide(s) and that no catalyst or other accessory proteins are involved in this process (see Anfinsen, 1973). However, this basic tenet of biochemistry is now being seriously challenged by the discovery of a class of proteins referred to as "chaperonins," which have been shown to be involved in the proper folding and assembly of a number of different proteins in various systems (Hemmingsen et al., 1988; Cheng et al., 1989; Goloubinoff et al., 1989; Rothman, 1989; Ellis, 1990). Members of this family of proteins include the GroEL and GroES proteins of Escherichia coli, a protein referred to as the rubisco subunit binding protein of plant chloroplast, and a protein present in eukaryotic cell mitochondria (referred to as hsp60 in yeast or P1 in mammalian cells) which is required for import into mitochondria and proper folding of subunits of a number of mitochondrial enzyme complexes (Hemmingsen et al., 1988; McMullin and Hallberg, 1988; Roy, 1989; Georgopoulos and Ang, 1990). In both prokaryotic and eukaryotic systems, the synthesis of the above proteins is induced in response to stresses such as heat shock, which provides evidence that these are members of the heat shock family of proteins; hence, they are commonly referred to as hsp60 (Lindquist and Craig, 1988; McMullin and Hallberg, 1988; Shinnick et al., 1988). Furthermore, extensive sequence homology between these chaperonin proteins and the 65-kD major antigenic protein of mycobacteria and other pathogenic bacteria has sparked the interest of immunologists, since there is considerable evidence indicating that an autoimmune response to the 65-kD antigen plays an important role in the onset of rheumatoid arthritis as well as in insulin-dependent diabetes (see Young and Elliott, 1989 and Young, 1990 for reviews).

We have recently reported the cloning and sequencing of chaperonin (or hsp60) cDNA from human and Chinese hamster ovary (CHO) cells (Jindal et al., 1989; Picketts et al., 1989). The mammalian proteins were found to exhibit extensive sequence similarity (~40-55% identical residues plus an additional 25-30% conservative replacements) throughout their lengths to the related proteins from yeast, prokaryotes, and plant chloroplasts, indicating that these proteins comprise one of the most highly conserved groups of proteins known. To gain information regarding the number of gene copies as well as the structure of the gene(s) for the chaperonin protein in mammalian cells, we have screened human and CHO genomic libraries with specific cDNA probes. Results of our studies provide evidence regarding the presence of multiple copies (about 8-12) of the hsp60 gene in the mammalian genome. Complete nucleotide sequences (for the coding region) of five human genes and one Chinese hamster (CH) gene have been determined. These data show that all except one of these genes are nonfunctional pseudogenes containing numerous changes (viz. base substitutions, additions, and deletions). Interestingly, the functional gene for hsp60 as well as the various pseudogenes for it in human cells contain no introns.

MATERIALS AND METHODS

Libraries and cDNA probes

Human (Hu) libraries of partially Eco RI-digested genomic DNA cloned in λ Charon 4A (ATCC 37385; Lawn et al., 1978) and partial Sau 3A-digested genomic DNA (12-20 kb size) in the replacement vector EMBL3 were purchased from American Type Culture Collection (MD) and Clonetech Laboratories (CA), respectively. For preparation of the CHO cell genomic library, high-molecular-weight CHO cell DNA was partially digested with Sau 3A, and 12- to 20-kb size fragments were subcloned into EMBL3 arms and packaged using a commercial packaging kit (Promega Corp.). High-molecular-weight genomic DNA from human (diploid fibroblast strain HSC172), mouse (3T3), and CHO cells was prepared by standard procedures (see Maniatis et al., 1982; Ausubel et al., 1989).

Screening of libraries

Between 2-5 x 10^5 pfu from each library were plated on Escherichia coli strain LE 392 (~1 x 10" per 10-cm dish). Eco RI digestion of the full-length human hsp60 (i.e., P1) cDNA clone XC5 resulted in two fragments of 757 bp (5' end) and 1,485 bp (3' end), which were used as probes to screen phage libraries (Jindal et al., 1989). Duplicate filters from each dish were hybridized to the 5' and 3' end probes. Phage plaques hybridizing to 5' and 3' end probes were purified through successive rounds of screening, and phage DNA was isolated by standard procedures (see Ausubel et al., 1989).

Nucleotide sequence determination

The DNA from positive phage clones was digested with either Eco RI or Bam HI to release the inserts, which were subcloned into the plasmid pGem-7Zf(+) (Promega). Subclones were examined by Southern blot analysis using both the 5' and 3' end probes to verify the hsp60-related nature of the inserts. DNA from the recombinant plasmids was isolated and sequenced by the dideoxy chain-termination method using a Sequenase kit (United States Biologicals, Inc.), employing synthetic oligonucleotide primers covering the entire length of human and Chinese hamster hsp60 cDNA in both orientations. The primers were synthesized at the central facility of the Institute of Molecular Biology and Biotechnology of McMaster University (Hamilton, Canada).

RESULTS

To obtain an estimate of the number of gene copies of hsp60 genes in mammalian cells, the high-molecular-weight DNA from human, mouse, and CHO cells was digested to completion with Hind III, for which no sites are observed in either human or CH hsp60 cDNA (Jindal et al., 1989; Picketts et al., 1989). Southern analysis of the blot using the 1.4-kb human hsp60 3' end probe under stringent conditions revealed the presence of about 10-12 bands in human and mouse DNA, and slightly fewer bands in CH DNA (Fig. 1). A number of bands that hybridized to the 1.4-kb cDNA probe were also observed in Kpn I-digested DNA. However, most of the fragments in this latter case appear to be of high molecular weight and these were not properly resolved from each other. Because there are no sites for these restriction enzymes in human or CH hsp60 cDNA, the number of bands hybridizing to the cDNA probe provides a rough indication of the number of copies of this particular gene that are present in these cell types (Cleveland et al., 1980).

To characterize some of these genes, human and CH genomic DNA libraries were screened using human hsp60 cDNA probes. Screening of the human genomic library ATCC 37385 led to the isolation of one clone that contains an insert of approximately 17 kb. Sequencing of subclones containing fragments from this clone revealed that it (pGem-10) contained the entire coding region of the Hu hsp60 protein. This particular human P1 gene contained no introns, and its nucleotide sequence exactly matched the human hsp60 cDNA sequence in both the translated and untranslated regions (see Fig. 2). In view of its complete identity with the Hu hsp60 cDNA, this gene apparently corresponds to a functional hsp60 gene. However, this clone contained only 24 bp upstream from the hsp60 translation start site (ATG) and, therefore, provided no information regarding the regulatory sequences responsible for transcriptional control of this gene.

In contrast to this library, screening of the human EMBL3 library with Hu hsp60 cDNA probes resulted in the isolation of 12 positive clones. The inserts from these were subcloned in the plasmid vector pGem-7, and the clones containing hsp60-related sequences were sequenced using synthetic primers based on the Hu hsp60 sequence.
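The copy-number estimate from the Southern blot rests on the probe containing no recognition site for the enzyme used, so that each hybridizing genomic fragment shows up as a single band. A minimal Python sketch of that check follows. The recognition sequences for Hind III (AAGCTT) and Kpn I (GGTACC) are standard; the function names and the two toy sequences are hypothetical illustrations and are not the actual hsp60 probe.

```python
# Recognition sequences for the two enzymes used in the Southern analysis.
# Both sites are palindromic, so scanning one strand is sufficient.
RECOGNITION_SITES = {"HindIII": "AAGCTT", "KpnI": "GGTACC"}


def find_sites(sequence, site):
    """Return 0-based start positions of every occurrence of `site` in `sequence`."""
    sequence = sequence.upper()
    site = site.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        # Step by one base so overlapping occurrences are not missed.
        start = sequence.find(site, start + 1)
    return positions


def enzymes_without_sites(sequence, enzymes=RECOGNITION_SITES):
    """Names of enzymes that do not cut `sequence` (suitable for band counting)."""
    return sorted(
        name for name, site in enzymes.items() if not find_sites(sequence, site)
    )


# Hypothetical toy sequences, for illustration only.
uncut_probe = "ATGCTTCGGTTACCCACAGTCTTTCGCCAGATGAGACC"
cut_probe = "ATGCAAGCTTGGTTACC"

print(enzymes_without_sites(uncut_probe))  # both enzymes leave this sequence intact
print(find_sites(cut_probe, RECOGNITION_SITES["HindIII"]))  # one Hind III site
```

An enzyme that does cut within the probed region would split a single gene copy into two hybridizing fragments and inflate the band count, which is why only enzymes returned by `enzymes_without_sites` would be suitable under this reasoning.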
The sequence data on these clones indicated that all 12 constituted pseudogenes that contained various in-frame deletions, additions, and base substitutions resulting in no large open reading frames within them. Based on the sequence data, the clones have been classified into six groups of distinct pseudogenes. Four of these pseudogenes (viz. PS2, PS4, PS5, and PS11) have been completely sequenced across the coding region of Hu hsp60, whereas only partial sequence information is available on the other two clones (viz. PS3 and PS6). A comparison of the sequences of these pseudogenes, as well as the putative functional gene (pGem-10), with the Hu hsp60 cDNA is shown in Fig. 2. As can be seen, the different pseudogenes contained various additions, deletions, and base substitutions, some of which were unique for a given pseudogene, whereas others were shared among some of the other members. The pseudogene PS3 showed an insertion of 23 T residues about 20 nucleotides downstream from the initiation codon. The DNA sequences of the four pseudogenes that have been completely sequenced were highly homologous to the cDNA sequence, with sequence homologies in the translated region of 86.1, 87.4, 89.7, and 90.2%, respectively. Interestingly, similar to the functional gene (pGem-10), no introns were found in any of these pseudogenes.

In addition to the human genomic clones, screening of the CHO genomic library identified two positive clones, one of which, GC-1, hybridized to both the 5' and 3' end probes. The complete nucleotide sequence of this gene was determined and compared with the CH hsp60 cDNA (Fig. 3). From a comparison of the two sequences it is evident that clone GC-1 also corresponds to a pseudogene, as it contains several in-frame additions, deletions, and base substitutions leading to premature chain termination in all reading frames. The sequence of this pseudogene was also highly homologous in the coding region and a part of the 3' noncoding sequence, but differed in the 5' upstream region as well as in the distal part of the 3' noncoding sequence. In the 3' region of this pseudogene, a string of 19 As interspersed with a few other bases is observed. However, it is unclear whether it represents a poly(A) tail. Interestingly, in contrast to the human pseudogenes, this pseudogene is found to contain an 87-nucleotide intron, which is flanked at the 5' and 3' ends by the consensus splice sequences (Padgett et al., 1986).

FIG. 1. Southern blot analysis of human, mouse, and Chinese hamster genomic DNA. About 10 μg of genomic DNA from each of these species was digested to completion with either Hind III or Kpn I and then blotted onto Gene Plus nylon membrane. The blot was hybridized to a 32P-labeled, 1.4-kb 3' end fragment of human hsp60 cDNA. The hybridization and washes were performed at 65°C under stringent conditions as described in Ref. 2. A. Hind III digested. B. Kpn I digested. Lanes 1, human DNA; lanes 2, mouse DNA; lanes 3, CHO cell DNA.

DISCUSSION

Data presented in this paper provide information regarding the structure as well as the number of gene copies of hsp60 (chaperonin) protein in mammalian cells. Southern blot analysis of genomic DNA digested with unique restriction enzymes (with no sites in corresponding cDNAs) indicated the presence of about 10-12 genes in human and mouse cells and a slightly lower number (about 6-7) in CH cells. The cloning and sequencing of a large number of the human hsp60 genes provides evidence that all except one or two of these genes are pseudogenes whose transcription and translation would not lead to any functional product. In human cells, at least six different pseudogenes which differ from each other in sequence have been identified. In CH cells, one gene that has been cloned and sequenced also turned out to be a pseudogene. The inference that mammalian cells may contain only two functional copies of this gene (one per haploid chromosome) is also supported by earlier genetic studies with CHO cell mutants where a single mutation converts 50% of the protein into an electrophoretically variant form (Gupta et al., 1982). The most plausible explanation for this observation is that the diploid CHO cells contain two functional copies of this gene, and of these, one is altered in the mutant cells.

One of the human genes that has been cloned and sequenced matches exactly with the human hsp60 cDNA sequence, both in the coding and noncoding regions, providing evidence that it constitutes the functional gene. One very interesting feature of this gene is that, unlike most other genes from vertebrate cells, it does not contain intervening sequences. Some of the other genes that lack introns include those for the major heat-inducible form of hsp70 from human and chicken, histones, and interferon-α genes (Hentschel and Birnsteil, 1980; Nagata et al., 1980; Hunt and Morimoto, 1985). The significance of the lack of introns in a given gene is at present unclear. It is of interest in this regard that although hsp60 is a nuclear gene, the protein product is localized in mitochondria, which are presumed to have evolved from prokaryotes via endosymbiosis (Schwartz and Dayhoff, 1978; Woese, 1987; Gupta et al., 1989). Since the genes from prokaryotic organisms lack intervening sequences, the absence of introns in the human hsp60 sequence might be related to its prokaryotic origin. Similar to hsp70, hsp60 protein is induced in response to heat shock (Lindquist and Craig, 1988; McMullin and Hallberg, 1988). Since experiments in Drosophila indicate inhibition of RNA splicing at higher temperatures (Yost and Lindquist, 1988), the absence of introns in the genes for heat shock proteins may account for selective expression of such genes under these conditions.

FIG. 2. Comparison of the nucleotide sequence of human hsp60 cDNA with that of the seven human genomic clones. The nucleotides have been numbered assuming the first nucleotide of the initiation codon to be 1. Identical residues are denoted by a dash (-). The gaps in the cDNA sequence and other sequences correspond to additions in one or more of the clones. The deletions in clones are indicated by asterisks (*). The sequence of clone pGem-10 matches exactly with the cDNA sequence. For clones PS3 and PS6, only partial sequence information is available.

FIG. 3. Nucleotide sequence of CH hsp60 cDNA and of the genomic clone GC1. The sequences have been numbered assuming the A of the initiation codon to be 1. The gaps in the cDNA sequences correspond to additions in the genomic clone. The deletions in the genomic clone are indicated by asterisks (*). Some of the in-frame termination codons are boxed. A stretch of As near the 3' end in the genomic clone is underlined. The position of the 87-bp intron in the genomic clone is marked by arrowheads.
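The alignment notation described in the legends to Figs. 2 and 3 (a dash for a residue identical to the cDNA, an asterisk for a deletion in the clone, a gap in the cDNA row where a clone carries an addition) lends itself to a simple percent-identity calculation. The Python sketch below is only an illustration under stated assumptions: the paper does not say how the reported homology values were computed, so the scoring rule here (every compared column counts, and substitutions, deletions, and additions all count as mismatches) is an assumption, as are the function name and the toy rows.

```python
GAP = " "        # gap in a row: an addition present in some clone
IDENTICAL = "-"  # clone residue identical to the cDNA at this column
# Any other character in the clone row (a base letter or "*") is a difference.


def percent_identity(cdna_row, clone_row):
    """Percent of compared alignment columns at which the clone matches the cDNA.

    Columns that are gaps in both rows (an addition belonging to a different
    clone) are skipped. All other columns are compared, and only a dash
    opposite a real cDNA base is scored as identical.
    """
    if len(cdna_row) != len(clone_row):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    identical = 0
    for cdna_base, mark in zip(cdna_row, clone_row):
        if cdna_base == GAP and mark == GAP:
            continue
        compared += 1
        if cdna_base != GAP and mark == IDENTICAL:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared


# Hypothetical toy rows: one substitution (A), two deleted bases (**),
# and one added base (T) opposite a gap in the cDNA row.
cdna = "ATGCTT CGG"
clone = "--A-**T---"
print(percent_identity(cdna, clone))
```

A different convention, for example ignoring gapped columns entirely, would give somewhat different percentages, so values from this sketch should not be expected to reproduce the figures quoted in the Results.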

Lastly, large pseudogene families have previously been reported for a number of different genes including rat tubulin, human actin, mouse and human immunoglobulin variable regions, and many others (see Firtel, 1981; Cowan and Dudley, 1983; Weiner et al., 1986; Wilde, 1986). Data presented here show that a large number of pseudogenes are also present for the hsp60 protein in mammalian cells. The DNA sequence data on the four human pseudogenes that have been completely sequenced show that they contain many common changes, indicating their evolution from a common ancestor. Pseudogenes in the genome most probably result from either the accumulation of deleterious mutations following gene duplication, or from the insertion of retrotranscribed copies of a cytoplasmic mRNA molecule (processed pseudogene) which subsequently accumulate deleterious genetic changes (Vanin, 1985; Wilde, 1986). The pseudogenes described here are likely to be of the latter class, given their abundance and features, and could have been derived from a common ancestral processed pseudogene by gene duplication and mutation.

ACKNOWLEDGMENTS

The work was supported by a research grant from the Medical Research Council of Canada. We thank Mr. B. Eng for providing technical assistance and Mrs. B. Sweet for secretarial help.

REFERENCES

ANFINSEN, C.B. (1973). Principles that govern the folding of protein chains. Science 181, 223-230.

AUSUBEL, F.M., BRENT, R., KINGSTON, R.E., MOORE, D., SEIDMAN, J.G., SMITH, J.A., and STRUHL, K. (1989). Current Protocols in Molecular Biology. Sarah Green/Wiley Interscience, New York.

CHENG, M.Y., HARTL, F.-U., MARTIN, J., KALOUSEK, F., NEUPERT, W., HALLBERG, E.M., HALLBERG, R.L., and HORWICH, A.L. (1989). Mitochondrial heat shock protein hsp60 is essential for assembly of proteins imported into yeast mitochondria. Nature 337, 620-625.

CLEVELAND, D.W., LOPATA, M.A., MACDONALD, R.J., COWAN, N.J., RUTTER, W.J., and KIRSCHNER, M.W. (1980). Number and evolutionary conservation of α- and β-tubulin and cytoplasmic β- and γ-actin genes using specific cloned cDNA probes. Cell 20, 95-105.

COWAN, N.J., and DUDLEY, L. (1983). Tubulin isotypes and the multigene tubulin families. Int. Rev. Cytology 85, 147-173.

ELLIS, R.J. (1990). The molecular chaperone concept. Seminars Cell Biol. 1, 1-9.

FIRTEL, R.A. (1981). Multigene families encoding actin and tubulin. Cell 24, 6-7.

GEORGOPOULOS, C., and ANG, D. (1990). The Escherichia coli groE chaperonins. Seminars Cell Biol. 1, 19-25.

GOLOUBINOFF, P., CHRISTELLER, J.T., GATENBY, A.A., and LORIMER, G.H. (1989). Reconstitution of active dimeric ribulose bisphosphate carboxylase from an unfolded state depends on two chaperonin proteins and Mg-ATP. Nature 342, 884-888.

GUPTA, R.S., HO, T.K.W., MOFFAT, M.R.K., and GUPTA, R. (1982). Podophyllotoxin-resistant mutants of Chinese hamster ovary cells. J. Biol. Chem. 257, 1071-1078.

GUPTA, R.S., PICKETTS, D.J., and AHMAD, S. (1989). A novel ubiquitous protein "chaperonin" supports the endosymbiotic origin of mitochondrion and plant chloroplast. Biochem. Biophys. Res. Commun. 163, 780-787.

HEMMINGSEN, S.M., WOOLFORD, C., VAN DER VIES, S.M., TILLY, K., DENNIS, D.T., GEORGOPOULOS, C.P., HENDRIX, R.W., and ELLIS, R.J. (1988). Homologous plant and bacterial proteins chaperone oligomeric protein assembly. Nature 333, 330-334.

HENTSCHEL, C.C., and BIRNSTEIL, M.L. (1980). The organization and expression of histone gene families. Cell 25, 301-313.

HUNT, C., and MORIMOTO, R.I. (1985). Conserved features of eukaryotic hsp70 genes revealed by comparison with the nucleotide sequence of human hsp70. Proc. Natl. Acad. Sci. USA 82, 6455-6459.

JINDAL, S.K., DUDANI, A.K., SINGH, B., HARLEY, B.C., and GUPTA, R.S. (1989). Primary structure of a human mitochondrial protein homologous to the bacterial and plant chaperonins and to the 65 kilodalton mycobacterial antigen. Mol. Cell. Biol. 9, 2279-2283.

LAWN, R.M., FRITSCH, E.F., PARKER, R.C., BLAKE, G., and MANIATIS, T. (1978). The isolation and characterization of linked δ- and β-globin genes from a cloned library of human DNA. Cell 15, 1157-1174.

LINDQUIST, S., and CRAIG, E.A. (1988). The heat shock proteins. Annu. Rev. Genet. 22, 631-677.

MANIATIS, T., FRITSCH, E.F., and SAMBROOK, J. (1982). Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, New York.

McMULLIN, T.W., and HALLBERG, R.L. (1988). A highly evolutionarily conserved mitochondrial protein is structurally related to the protein encoded by the Escherichia coli groEL gene. Mol. Cell. Biol. 8, 371-380.

NAGATA, S., MANTEI, N., and WEISSMAN, C. (1980). The structure of one of the eight or more distinct chromosomal genes for human interferon-α. Nature 287, 401-408.

PADGETT, R.A., GRABOWSKI, P.J., KONARSKA, M.M., SEILER, S., and SHARP, P.A. (1986). Splicing of messenger RNA precursors. Annu. Rev. Biochem. 55, 1119-1150.

PICKETTS, D.J., MAYANIL, C.S.K., and GUPTA, R.S. (1989). Molecular cloning of a Chinese hamster mitochondrial protein related to the "chaperonin" family of bacterial and plant proteins. J. Biol. Chem. 264, 12001-12008.

ROTHMAN, J.E. (1989). Polypeptide chain binding proteins: Catalysts of protein folding and related processes in cells. Cell 59, 591-601.

ROY, H. (1989). Rubisco assembly: A model system for studying the mechanism of chaperonin action. The Plant Cell 1, 1035-1042.

SCHWARTZ, R.M., and DAYHOFF, M.O. (1978). Origins of prokaryotes, eukaryotes, mitochondria and chloroplasts. A perspective is derived from protein and nucleic acid sequence data. Science 199, 395-403.

SHINNICK, T.M., VODKIN, M.M., and WILLIAMS, J.C. (1988). The mycobacterium tuberculosis 65 kilodalton antigen is a heat shock protein which corresponds to common antigen and to the Escherichia coli GroEL protein. Infect. Immun. 56, 446-451.

VANIN, E.F. (1985). Processed pseudogenes: Characteristics and evolution. Annu. Rev. Genet. 19, 253-272.

VAN EDEN, W., HOGERVORST, E.J.M., VAN DER ZEE, R., VAN EMBDEN, J.D.A., HENSEN, E.J., and COHEN, I.R. (1989). The mycobacterial 65 kD heat-shock protein and autoimmune arthritis. Rheum. Int. 9, 187-191.

WILDE, C.D. (1986). Pseudogenes. CRC Crit. Rev. Biochem. 19, 323-352.

WEINER, A.M., DEININGER, P.L., and EFSTRATIADIS, A. (1986). Nonviral retroposons: Genes, pseudogenes and transposable elements generated by the reverse flow of genetic information. Annu. Rev. Biochem. 55, 631-666.

WOESE, C.R. (1987). Bacterial evolution. Microbiol. Rev. 51, 221-271.

YOST, H.J., and LINDQUIST, S. (1988). Translation of unspliced transcripts after heat shock. Science 242, 1544-1548.

YOUNG, D.B. (1990). Chaperonins and the immune response. Seminars Cell Biol. 1, 27-35.

YOUNG, R.A., and ELLIOTT, T.J. (1989). Stress proteins, infection and immune surveillance. Cell 59, 5-8.

Address reprint requests to: Dr. Radhey S. Gupta, Department of Biochemistry, McMaster University, Hamilton, Ontario, Canada L8N 3Z5

Received for publication June 6, 1990.
