Gene, 95 (1990) 203-213 Elsevier


GENE 03722

The mouse keratin 19-encoding gene: sequence, structure and chromosomal assignment

(Intermediate filaments; gene family; λ phage vector; recombinant DNA; nucleotide sequence; orthologous genes; sequence comparisons; regulatory signals; restriction-fragment length polymorphism; interspecies backcross)

Marc Lussier a,*, Marie Filion a, John G. Compton b, Joseph H. Nadeau a, Line Lapointe a and André Royal a

a Institut du Cancer de Montréal, Québec H2L 4M1 (Canada); and b The Jackson Laboratory, Bar Harbor, ME 04609 (U.S.A.) Tel. (207)288-3371

Received by D.T. Denhardt: 16 March 1990
Revised: 31 May 1990
Accepted: 30 June 1990

SUMMARY

Keratin 19 (K19) is synthesized mainly in embryonic and adult simple epithelia, but has also been found in stratified epithelia. K19 is the smallest known keratin and is remarkable in that, contrary to all other keratins, it does not have a designated partner for the formation of filaments, implying that regulation of its expression is different from that of other keratin-encoding genes. As a first step in elucidating the mechanisms by which the K19 gene is regulated in relatively undifferentiated embryonic and in terminally differentiated adult tissues, a series of overlapping clones containing the complete mouse K19 gene was isolated from a mouse genomic library and characterized. The nucleotide (nt) sequence extends over 5119 nt and includes six exons. A region of 303 nt upstream from the transcription start point (tsp) was also sequenced. Comparison with the human and bovine K19 genes revealed the existence of homologies in both the coding and noncoding regions. The putative promoter region of the mouse K19 gene is highly homologous to the corresponding sequences of the human and bovine K19 genes. It contains an ATA box, a CAAT box and two potential Sp1-binding sites. Significant homologies were also found between the sequences of the introns of the mouse, human and bovine genes; this was particularly evident in introns 2, 3, 4 and 5. Intron 1, which showed the greatest degree of divergence, was found to contain many repetitive elements. Finally, it is shown that the mouse K19 gene cosegregates with the type-I keratin-encoding gene locus (Krt-1) on chromosome 11.

INTRODUCTION

Keratin 19 (K19) is the smallest known intermediate filament (IF) protein. Like all IF proteins, K19 has a highly conserved central α-helical domain (Bader et al., 1986; Eckert, 1988; Lussier et al., 1989; Stasiak et al., 1989) essential for filament formation (Steinert and Roop, 1988) but is distinguished from other keratins in having a very short 13-aa-long C-terminal domain. Although it is a type-I keratin, K19 is unusual in that it does not have an assigned type-II partner (Sun et al., 1985). Keratins are obligate heteropolymers and require a member of each class for filament formation. In vitro, any type-I subunit can associate with any type-II subunit but there are indications that not all associations are equally probable in vivo (Hatzfeld and Franke, 1985; Eichner et al., 1986). In this respect, it has been postulated that the principal function of K19 was to counterbalance an overabundance of type-II keratins (Eckert, 1988; Stasiak et al., 1989).

Correspondence to: Dr. A. Royal, Institut du Cancer de Montréal, 1560 est, rue Sherbrooke, Montréal, Québec H2L 4M1 (Canada) Tel. (514) 876-7078; Fax (514) 876-5476.
* Present address: Dept. of Biology, McGill University, 1205 Dr. Penfield, Montréal, Québec H3A 1B1 (Canada).
Abbreviations: aa, amino acid(s); Alu-type sequence, genomic repetitive elements of the Alu-type; bp, base pair(s); cDNA, DNA complementary to RNA; cM, centiMorgan; DTT, dithiothreitol; IF, intermediate filament; Krt-1, gene locus encoding mouse type I keratins; Krt-2, gene locus encoding mouse type II keratins; K7-K19, keratins 7-19; K7-K19, genes encoding K7-K19; kb, kilobase(s) or 1000 bp; nt, nucleotide(s); oligo, oligodeoxyribonucleotide; PCR, polymerase chain reaction; SDS, sodium dodecyl sulfate; SSC, 0.15 M NaCl/0.015 M Na3·citrate pH 7.6; tsp, transcription start point; u, unit(s).
0378-1119/90/$03.50 © 1990 Elsevier Science Publishers B.V. (Biomedical Division)

The K19 gene has a unique pattern of expression. During development, its induction is not coordinated with the induction of a specific type-II subunit. Keratins are the first IF proteins to be expressed in the developing mouse embryo (Jackson et al., 1980; 1981) where their induction coincides with differentiation of cells giving rise to the trophectoderm, the extraembryonic endoderm and the embryonic ectoderm (Hogan et al., 1986). In these tissues, IF are initially composed of the type-I keratin 18 (K18) and of the type-II keratin 8 (K8; Jackson et al., 1980; 1981; Oshima et al., 1983). In fetal tissues, K19 mRNA is detectable by day 9.5 (Lussier et al., 1989) and is probably induced earlier since a 40-kDa cytoskeletal protein likely to be K19 was detected in the embryonic ectoderm of day 6-8 embryos (Jackson et al., 1981). In extraembryonic tissues, K19 mRNA is already very abundant by day 8.5 (Lussier et al., 1989) but it is not known when its induction occurs. In the trophectoderm of blastocyst-stage embryos, K8 and K18 were found to be expressed while K19 was not (Jackson et al., 1980). Therefore, in derivatives of the trophectoderm at least, the K19 gene appears to be induced after K8 and K18. In adult tissues, the pattern of K19 expression is not consistent with its association with a specific type-II partner (Sun et al., 1985). In simple and stratified epithelia which express K19, it is never the sole type-I keratin. The type-II

Fig. 1. Structural organization of the mouse K19 gene. The mouse K19 gene was isolated from a genomic library constructed in phage vector λEMBL4 (Frischauf et al., 1983) using DNA extracted from a secondary culture of C3H/HeNCrlBR (Charles River) embryonic fibroblasts. 8 × 10^5 plaques were screened with two K19-specific probes called 5' and 3' (see below map B). Screening was done essentially as described by Lussier et al. (1989). Hybridization was for 18 h at 42°C in 50% formamide/4 × SSC/0.1 M Na·phosphate pH 7/0.5% SDS/0.1% non-fat dried milk/200 µg per ml of heterologous DNA. Washes were done in 0.1 × SSC at 50°C. Positive clones were purified through three rounds of screening and restriction mapped, followed by Southern hybridization with probes derived from the cDNA, and by partial restriction endonuclease digestion followed by hybridization with oligos complementary to λ cohesive ends (Rackwitz et al., 1984). (A) Five overlapping λ genomic clones comprising different parts of the K19 gene. Only the genomic inserts are represented and wavy lines represent genomic DNA outside the K19 gene region. (B) Probes: All probes were derived from a K19 cDNA (Lussier et al., 1989). The 3' probe is a StyI restriction fragment comprising sequences corresponding to the last 3 aa residues of the tail domain plus the entire 3'-noncoding region minus one nt. The C12 probe is a PstI restriction fragment of 446 nt which encompasses most of the central domain. The C1 probe is a 125-nt fragment that also corresponds to sequences of the central domain. The 5' probe is a 377-nt fragment containing the 5' end of the mRNA and about 25 nt of 5'-flanking sequence. (C) Schematic representation and map of the mouse K19 gene. The six exons are represented by blackened areas and the five introns are represented by open areas. The 5'-flanking region is denoted by a shaded box and the region of intron 1 containing repetitive sequences is represented by a hatched area. S, SmaI; K, KpnI; EV, EcoRV; B, BglII; H2, HincII; E, EcoRI; P, Pul; Sc, ScaI; H, HindIII; X, XbaI. (D) Structure of the K19 mRNA. The length of the RNA is approximately 1.4 kb without the poly(A) tail (Lussier et al., 1989) and the positions of the six exons are carried over from map C. The start and stop codons are indicated. (E) Structure of the K19 protein. Boxes represent non-α-helical regions. Thick lines represent α-helical regions. Domains are: N, N-terminal; C1A, coil 1A; S1, spacer 1; C1B, coil 1B; S2, spacer 2; C2, coil 2; C, C-terminal. Portions of the protein and their corresponding exons are carried over from maps C and D.

subunit which is most often observed in the same tissues as K19 is K8 (Quinlan et al., 1985). However, in some instances, for example in the esophageal epithelium, the K19 protein is more abundant than the K8 protein, indicating that K19 forms filaments with other type-II subunits (Bosch et al., 1988). In other types of cells, for example in cultured epidermal keratinocytes where the K8 and K19 genes are not normally expressed, K19, but not K8, is inducible by retinoids (Kim et al., 1984; Gilfix and Eckert, 1985; Kopan et al., 1987). Thus, the expression of K19 is not necessarily linked to the expression of K8. The pattern of expression of K19 does not appear to be related to epithelial structural organization, contrary to the observation made with most other keratins (O'Guin et al.,

Fig. 2. Nucleotide sequence of the mouse K19 gene and its 5'-flanking region. Selected restriction fragments from genomic clones λ6.2 and λ8A (see Fig. 1) were subcloned in the pBluescript phagemid vector and sequences were determined by the dideoxy chain-termination procedure (Sanger et al., 1977) using, as primers, oligos corresponding to the T3 and T7 RNA polymerase promoter sequences as well as K19 sequences as they became available. Sequencing reactions were carried out using Sequenase (USB Corporation) with [35S]dATP (Amersham Canada). Gel reading and sequence assembly were done using the Gel Mate 1000 Sonic Digitizing System and the Microgenie Sequence Analysis Program (Beckman Canada). The sequence was determined from both strands. The CAAT box, the ATA box, the tsp (denoted by a dot), the start codon, the stop codon and the polyadenylation signal are all indicated. The deduced aa sequence is indicated by the one-letter code below the coding sections of the exons. The GenBank accession number is M36120.

1987). Rather, K19 expression is frequently associated with the less differentiated cells in epithelial tissues (Stasiak et al., 1989), a cell population potentially more susceptible to neoplastic transformation. The available evidence thus suggests that K19 gene regulation in embryonic and adult tissues will be found to be a complex process related to the differentiated state of the cells. An understanding of the mechanisms controlling the specific pattern of expression of K19 requires the availability of genomic clones. The aim of the present study was the isolation and characterization of the mouse gene encoding K19, and its comparison with other mammalian K19 sequences. Highly conserved sequences were found in the noncoding regions of the genes, suggesting that they are potentially important for the regulation of their expression. In addition, the chromosomal localization of the mouse K19 gene was examined genetically, and linkage to the type-I keratin locus, Krt-1, on chromosome 11 has been demonstrated.

RESULTS AND DISCUSSION

(a) Isolation and characterization of genomic clones comprising the murine K19 gene

As previously shown (Lussier et al., 1989), the mouse genome contains only one copy of the K19 gene. It was cloned using two K19-specific probes, called 5' and 3' (see Fig. 1). Two different recombinants were obtained with the 3' probe (clones λ5.2 and λ6.2; see Fig. 1) and three with the 5' probe (clones λ1A, λ8A and λ10A; see Fig. 1). To map the recombinants with respect to the K19 gene, Southern blots of restriction endonuclease digests of each phage were analyzed with probes (5', C1, C12 and 3'; see Fig. 1) corresponding to different parts of the K19 mRNA (the blots are not shown). Fig. 1 summarizes the results of that analysis. The inserts of clones λ5.2 and λ6.2 overlap with the 3' end of the gene, ending in exon 5 and in exon 4, respectively, whereas the inserts in clones λ1A and λ10A overlap with the 5' end of the gene and finish near the junction of exon 1 and intron 1, and in exon 3, respectively. Phage λ8A contains the entire K19 gene.

(b) Sequence and structural features of the mouse K19 gene

The mouse K19 gene spans 5119 nt and includes six exons. The precise exon-intron junctions were identified by comparing the genomic sequence with the cDNA sequence we have recently determined (Lussier et al., 1989). The complete sequence of the K19 gene, along with 303 nt of 5'-flanking sequence, is presented in Fig. 2. The exon sequences, along with a short stretch of 5'-flanking sequence obtained from the subclones, were identical to our cDNA sequence except for five differences. In the coding region, a discrepancy was noted at nt 3803 in exon 2, where an AGA codon in the genomic clone was determined as CGA in the cDNA. Both codons correspond to Arg. The cDNA was cloned using RNA obtained from outbred mice (CD-1) while the gene was cloned from C3H DNA (see Fig. 1 legend). Also note that an AGA codon was found at that position in another mouse K19 cDNA sequence determined by Ichinose et al. (1989). The four other discrepancies lie outside the transcribed region of the K19 gene, in a 25-nt stretch just upstream from the tsp. The cDNA sequence we have recently published (Lussier et al., 1989) was in part determined from an amplified genomic DNA fragment that extended from nt 278 in the 5'-flanking sequence to nt 654 in exon 1. The inconsistencies noted are in the 5' oligo sequence, which was not actually derived from the mouse gene.

The overall structure of the mouse K19 gene (Fig. 1) is identical to that of the K19 genes of other species and similar to that of other type-I keratin genes: exon 1 includes the sequences corresponding to the 5'-untranslated region of the mRNA, the N-terminal domain, coil 1A and 11 aa of coil 1B of the α-helical region of the protein; exons 2 and 3 comprise sequences corresponding to part of coil 1B; exon 4 contains the sequences corresponding to the end of coil 1B, spacer 2 and the beginning of coil 2; exon 5 encodes part of coil 2; and exon 6 includes sequences corresponding to the remainder of coil 2, the 13 aa of the C-terminus and the 3'-untranslated region of the mRNA. The K19 genes have five introns, the lowest number found in the type-I keratin gene family. The position of the five introns (Fig. 1) is conserved in the mouse, bovine and human K19 genes and reflects keratin type-I organization (Steinert and Roop, 1988). The first, second and third introns interrupt the sequence at positions corresponding to aa 11, 39 and 91 in coil 1B, while the fourth and fifth introns interrupt the sequence at positions corresponding to aa 29 and 71 in coil 2 (Fig. 1). The sizes of the exons and introns of the three K19 genes are indicated in Table I. All the exon-intron junctions of the mouse K19 gene display the minimal consensus sequences GT and AG at the 5' and 3' boundaries of every intron (Fig. 2). However, the majority of these junctions do not conform to the weak consensus sequences AG:GU(A/G)AGU...intron...(U/C)nN(C/T)AG:G (Breathnach and Chambon, 1981; Padgett et al., 1986).

TABLE I
Length (in bp) of the exons and the introns of the mouse, bovine and human K19 genes

K19 gene    Mouse a    Bovine b    Human b
Exon 1      500        476         (490) c
Exon 2      83         83          83
Exon 3      156        156         156
Exon 4      161        161         161
Exon 5      125        125         125
Exon 6      352        382         379
Intron 1    2923       1675        2571
Intron 2    217        223         193
Intron 3    366        305         302
Intron 4    116        111         110
Intron 5    111        131         144

a Values are from our present results.
b Values are from Bader et al. (1988).
c The size of the human exon 1 was estimated from the published sequence.
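The lengths listed in Table I can be tallied with a few lines of Python. This is only an illustrative sketch and not part of the original analysis; the dictionary layout and the function name `summarize` are assumptions made here, and the numbers are simply the table entries.

```python
# Exon and intron lengths (bp) copied from Table I.
# The human exon 1 value (490) is the estimate given in the table footnote.
TABLE_I = {
    "mouse": {
        "exons": [500, 83, 156, 161, 125, 352],
        "introns": [2923, 217, 366, 116, 111],
    },
    "bovine": {
        "exons": [476, 83, 156, 161, 125, 382],
        "introns": [1675, 223, 305, 111, 131],
    },
    "human": {
        "exons": [490, 83, 156, 161, 125, 379],
        "introns": [2571, 193, 302, 110, 144],
    },
}


def summarize(lengths):
    """Return total exon length, total intron length and the intron 1 share."""
    exon_total = sum(lengths["exons"])
    intron_total = sum(lengths["introns"])
    return {
        "exon_total": exon_total,
        "intron_total": intron_total,
        "intron1_share": lengths["introns"][0] / intron_total,
    }


SUMMARY = {species: summarize(v) for species, v in TABLE_I.items()}

for species, s in SUMMARY.items():
    print(species, s["exon_total"], s["intron_total"], round(s["intron1_share"], 2))
```

With these values the mouse exons add up to 1377 nt, in line with the mRNA length of approximately 1.4 kb quoted in the legend to Fig. 1, and the four internal exons have the same length in the three species.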

(c) Transcriptional start point (tsp)

To determine the exact location of the tsp for the mouse K19 gene, primer extension experiments were carried out. The test RNA was obtained from mouse lungs, which contain high levels of K19 mRNA (unpublished observations). It was hybridized with an excess of radiolabeled Ext-1 primer situated 16 nt downstream from the ATG start codon (Fig. 2). The annealed primer was then extended by reverse transcription to produce a cDNA complementary to the RNA template. The product obtained is 111 nt long. It was run alongside a set of sequencing reactions using the same primer and single-stranded DNA from a subclone covering the cap site region. The results (Fig. 3) indicate that the mouse K19 gene tsp is at nt 304, a thymidine residue (underlined) in the sequence 5'-CTTG_TCACTCCTC-3' (see Fig. 2). No such extension product was found when poly(A)+ RNA was not included in the reaction. This result indicates that the 5'-untranslated region of the mouse K19 messenger RNA is 71 nt long.
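The arithmetic behind this assignment can be written out as a short Python sketch. It is an illustration only: the function name is hypothetical, and it assumes that the 5' end of the primer pairs with nt 414 (the downstream end of the region given in the legend to Fig. 3) and that the tsp counts as the first nt of the 5'-untranslated region.

```python
def tsp_from_primer_extension(primer_5prime_nt, product_length):
    """Gene coordinate of the last nt copied by reverse transcriptase.

    primer_5prime_nt: coordinate (coding-strand numbering) paired with the
        5' end of the primer, i.e. the most downstream nt of the product.
    product_length: length in nt of the extended product, primer included.
    """
    return primer_5prime_nt - product_length + 1


# Values quoted in the text: primer 5' end at nt 414, product of 111 nt.
tsp = tsp_from_primer_extension(414, 111)

# A 5'-untranslated region of 71 nt starting at the tsp places the first
# nt of the start codon 71 positions further downstream (derived here,
# not stated in the text).
utr_length = 71
start_codon_first_nt = tsp + utr_length

print(tsp, start_codon_first_nt)
```

Under these assumptions the calculation returns nt 304 for the tsp, in agreement with the position read from the sequencing ladder.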

Fig. 3. Determination of the tsp of the mouse keratin 19 gene. Primer extension experiments were performed using the 21-mer oligo Ext-1 (5'-AAGACATAGCTGAGGTCTGGC-3') which is complementary to nt 393-414 in the DNA sequence and is situated 16 nt downstream from the ATG start codon (see Fig. 2). The Ext-1 oligo was end-labeled with T4 polynucleotide kinase and [γ-32P]ATP (3000 Ci/mmol, Amersham Canada), and purified by a Sephadex G-10 spun column. The equivalent of 3 × 10^[illegible] cpm was heated at 85°C in the presence of 10 µg of yeast tRNA or 2 µg of mouse lung poly(A)+ RNA. The preparation was rapidly precipitated from ethanol and the pellet resuspended in 20 µl of 80% formamide/0.4 M NaCl/50 mM PIPES (pH 6.5)/1 mM EDTA/RNasin (Promega Biotec, Madison, WI at 1 µl) and allowed to anneal for 16 h at 32°C. Following ethanol precipitation, the pellet was washed twice with 70% ethanol and the extension reaction was carried out in a final volume of 50 µl containing 50 mM Tris·HCl (pH 8.3)/50 mM KCl/8 mM MgCl2/1 mM DTT/1 mM of each of the four deoxyribonucleotide triphosphates/50 units of RNasin/50 units of AMV reverse transcriptase (Life Sciences) for 75 min at 42°C. The reaction mixtures were phenol extracted and labeled products were precipitated from ethanol, denatured and run on a sequencing gel (6% polyacrylamide/7 M urea) alongside a set of sequencing reactions (lanes G, A, T and C) done on a single-stranded DNA from a subclone covering the cap site region, using the same primer. The product of the primer extension reaction (PE; obtained with mouse lung RNA) is denoted by a black arrowhead. The arrow on the left indicates the tsp and the direction of transcription. The numbering of the coding strand corresponds to the sequence of Fig. 2.

(d) The mouse K19 gene: interspecies homologies and possible regulatory elements in 5'-flanking sequences

The mouse K19 sequence was examined for known regulatory motifs characterized in many other genes as well as for sequences conserved at identical positions in the 5'-flanking sequences of the bovine and human K19 genes. Comparison of the sequences of the K19 genes immediately upstream from the tsp revealed a high degree of homology among the three species (Fig. 4). The mouse and bovine genes were found to be more than 78% homologous in a region extending over about 200 nt upstream from the mouse tsp. This region is 76% homologous with the human sequence, excluding a stretch of 20 nt present only in the mouse and bovine genes. The best homology found in that

Fig. 4. Comparison of the nt sequence of the 5'-flanking regions of the mouse (MOU), bovine (BOV) and human (HUM) K19 genes. Sequences were aligned with gaps to maximize their homologies. Only the mouse sequence is entirely represented. Asterisks indicate solely a match between the bovine and/or human sequence and the mouse sequence. Only the mouse sequence is numbered according to nt positions defined in Fig. 2. The position of the mouse tsp is indicated by a blackened circle (see also Fig. 3). Known regulatory motifs are also indicated.
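Percentages of the kind quoted for the alignments of Fig. 4 can be computed from a gapped pairwise alignment as in the following Python sketch. It is an assumption-laden illustration: the text does not state how gaps were treated, so the function (a hypothetical name) offers both conventions, and the two short sequences are invented for the example and are not K19 sequences.

```python
def percent_identity(aligned_a, aligned_b, count_gaps=False):
    """Percent of identical positions between two gapped, aligned sequences.

    Gaps are written as '-'. Columns where both sequences have a gap are
    ignored. Columns where only one sequence has a gap are skipped unless
    count_gaps is True, in which case they count as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        if (a == "-" or b == "-") and not count_gaps:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Invented example alignment (not taken from the K19 genes).
seq_a = "ACGT-ACGTA"
seq_b = "ACGTTACCTA"
print(round(percent_identity(seq_a, seq_b), 1))
print(round(percent_identity(seq_a, seq_b, count_gaps=True), 1))
```

Excluding gapped columns, as is done in the text for the 20-nt stretch absent from the human sequence, raises the reported percentage relative to counting every gap as a mismatch.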

region was between the human and bovine genes, which are 80% homologous. Alignment of the three sequences shows two short stretches of 9 and 5 nt, respectively, which are not found in the mouse gene. Examination of the 5'-flanking sequence of the mouse gene revealed the presence of many putative transcriptional control elements. However, only four of these were conserved in the three species at approximately the same position with respect to the tsp. The three K19 genes have an ATA box located, in the mouse gene, 20 nt upstream from the tsp and, in the case of the bovine gene, 24 nt upstream from the tsp. This suggests that the sequence is utilized as a functional TATA box. It is identical to the ATA box of the rabbit and human β-globin genes (Hardison et al., 1979; Lawn et al., 1980), which is required for accurate and efficient in vivo and in vitro transcription (Charnay et al., 1985; Dierks et al., 1983). Another element present in the K19 genes of the three species is a CAAT box located, in the mouse, 89 nt upstream from the ATA box. In the human and bovine genes it is located 105 nt upstream from the ATA box. A CAAT box has also been shown to be required for efficient transcription of the β-globin-encoding genes (Charnay et al., 1985; Dierks et al., 1983). In the three species, the ATA and CAAT boxes were found at consensus distances from the tsp. The mouse K19 gene also contains, in the 303 nt upstream from the tsp, three Sp1-binding sites (Kadonaga et al., 1986). Only two of these sites are conserved in the human or bovine genes (Fig. 4). Finally, examination of the aligned sequences also reveals the presence of other short but perfectly conserved

Fig. 5. Comparison of the nt sequences of introns 2, 3, 4 and 5 of the mouse, bovine and human K19 genes. Sequences are shown with gaps to maximize their homology. For other details see legend to Fig. 4.

sequences in the putative K19 promoter (see Fig. 4). The potential role of these sequences in the regulation of K19 transcription is presently being evaluated. The K8, K18 and K19 genes are often coexpressed and might therefore share some cis-acting regulatory elements. However, comparison of their 5'-flanking sequences revealed no significant homologies. The K8 (Vasseur et al., 1985) and K18 (Ichinose et al., 1988; Kulesh and Oshima, 1988) genes do not have CAAT boxes and possess TATA boxes instead of the ATA boxes found in the K19 genes. Sp1-binding sites were found in the human K18 gene putative promoter (Kulesh and Oshima, 1988) and in the 5'-flanking sequences of the human K7 gene (Glass and Fuchs, 1988), another keratin expressed in some simple epithelia, but not in the mouse K8 (Vasseur et al., 1985) and K18 (Ichinose et al., 1988) genes. The results of these comparisons are consistent with numerous observations indicating that co-expressed type-I and type-II keratin genes are independently regulated (Schermer et al., 1986; Roop et al., 1987; Ouellet et al., 1990, a and b).
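A search for short regulatory motifs of the kind discussed above can be sketched in Python as follows. This is a hedged illustration, not the procedure used in the paper: the core sequences GGGCGG (GC box recognized by Sp1) and CCAAT are commonly cited cores chosen here as assumptions, the function names are hypothetical, and the test sequence is invented rather than taken from any K19 gene.

```python
import re

# Assumed core motifs; the paper does not list the exact patterns it used.
MOTIFS = {
    "Sp1 (GC box)": "GGGCGG",
    "CAAT box": "CCAAT",
}


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motifs(seq, motifs=MOTIFS):
    """Return (name, strand, 1-based start) for every motif occurrence.

    Both strands are searched; '-' hits are reported at the coordinate of
    their leftmost nt on the given strand. A zero-width lookahead is used
    so that overlapping occurrences are all reported.
    """
    seq = seq.upper()
    hits = []
    for name, core in motifs.items():
        for strand, pattern in (("+", core), ("-", revcomp(core))):
            for match in re.finditer(f"(?={pattern})", seq):
                hits.append((name, strand, match.start() + 1))
    return sorted(hits, key=lambda hit: hit[2])


# Invented example sequence (not a K19 promoter).
example = "TTCCGCCCAGGCCAATCTGGGCGGAT"
for hit in find_motifs(example):
    print(hit)
```

Applying such a search to aligned promoters and keeping only the hits that fall at corresponding positions in all species mirrors the conservation criterion used in the text.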

(e) The mouse K19 gene: interspecies homologies in intron sequences

In the few keratin genes where it has been examined, there are remarkable interspecies homologies in intron sequences (Bader et al., 1988; Rieger and Franke, 1988). It has been postulated that those regions might be of importance in the regulation of the genes, especially since conservation of intron sequences in orthologous genes is not a common phenomenon (Yang et al., 1984; Rieger and Franke, 1988). Comparison of the mouse K19 introns 2, 3, 4 and 5 with the corresponding sequences of the bovine and human genes (Fig. 5) shows that, in spite of evidence of divergence, they have conserved greater than 75% homology over a substantial portion of their length. However, intron 1 has a different size in the three species and is the most divergent region of the gene. Intron 1 is almost 3 kb long, making the mouse K19 gene one of the largest type-I keratin genes. The difference in size between species is due to the presence of repetitive sequences in varying numbers. In the three genes, intron 1 is composed of three regions: a central section containing repetitive sequences bordered on each side by unique sequences. The structure of the repetitive segment is the most complex in the mouse intron 1, with two complete B1 sequences, two complete B2 sequences, two incomplete B1 sequences and simple repeats not conserved in the other species (see Table II). The human intron 1 has at least two Alu-type repetitive elements spanning almost 1 kb and the bovine intron 1 has one 287-nt-long bovine Alu-type sequence element (Bader et al., 1988).

TABLE II
Distribution of the repetitive sequences in the first intron of the mouse K19 gene

Position a (nt)   Type b      Description c
1400-1428         B1          Very short and in the sense orientation. 90% homologous with the B1 consensus sequence.
1533-1572         (CAGA)10    The CAGA sequence is repeated ten times.
2128-2182         B1          Incomplete and in the sense orientation. 64% homologous with the B1 consensus sequence. Not flanked by direct repeats.
2257-2382         B1          Complete and in the sense orientation. 82% homologous with the B1 consensus sequence. Flanked by direct repeats CAAACCAGAG at nt 2244 and 2416. Has box A and box B of the PolIII promoter.
2430-2479         (CCAC)10    The CCAC sequence is repeated ten times.
2782-2982         B2          Complete and in the antisense orientation. 62% homologous with the B2 consensus sequence. Flanked by direct repeats GGGATAA at nt 2768 and 2984.
3004-3051         (CCTT)12    The CCTT sequence is repeated twelve times.
3057-3110         T-rich      T-rich region (80%).
3111-3213         B1          Complete and in the antisense orientation. 88% homologous to the B1 consensus sequence. No direct repeats. Has only box B of the PolIII promoter.
3243-3414         B2          Complete and in the antisense orientation. 93% homologous to the B2 consensus sequence. Flanked by direct repeats GGATAAATGATTCTT at nt 3216 and 3416. Has box A and box B of the PolIII promoter.

a Intron 1 extends from nt 804 to 3727 (see Fig. 2).
b B1 and B2 are Alu-type families of short interspersed repetitive sequences found in the mouse genome (Krayev et al., 1982). (CAGA)10, (CCAC)10 and (CCTT)12 represent simple repeats.
c The reported B1 and B2 consensus sequences are from Krayev et al. (1982) and Kalb et al. (1983). The B1 and B2 repetitive elements are usually flanked by short direct repeats and include a sequence highly homologous to the RNA polymerase III promoter. This promoter is composed of internal control regions which consist essentially of two domains: box A and box B (Geiduschek and Tocchini-Valentini, 1988).

The two stretches of unique sequences present in the three genes on either side of the repetitive sequences show evidence of evolutionary conservation (Fig. 6). Alignment of the first 597 nt of the intron (nt 803-1400 in the mouse gene; see Fig. 2) has revealed the presence of five short regions where the homology between the mouse and the other two sequences ranges from 62 to 81%. At the end of the intron, the last 283 nt (nt 3444-3727 in the mouse gene; see Fig. 2) have 68 and 70% homology, respectively, with the corresponding human and bovine sequences. Finally, when considered together, the five human and bovine K19 gene introns are more homologous to each other than to the mouse introns. This was also noticed by Rieger et al. (1988) in their analysis of the K10 gene introns. The conservation of intron sequences suggests that they have an important function, possibly in the regulation of gene expression. Intron sequences were examined for motifs known to be involved in transcriptional regulation.
Many were found, but none were conserved at the same position in the three genes. The three introns 1 had at least one Sp1-binding site in their unique sequences (located at nt 853 in the mouse K19 gene; see Fig. 2). In addition, as mentioned above, the three introns 1 contain Alu-type sequences with consensus internal control regions (box A and box B; Geiduschek and Tocchini-Valentini, 1988). These sequences could participate in K19 gene regulation, since they can act as positive enhancers when positioned near RNA polymerase II promoters, as shown for the haptoglobin gene (Oliviero and Monaci, 1988).

(f) Presence of Alu-type sequences in intron 1 of keratin genes: implications for the evolution of the type-I gene family

Alu-type sequences have been found associated with many mammalian genes, including the keratin type-I gene family: they have been identified in the human K10, K14, K15, K18 and K19 genes (Marchuk et al., 1985; Bader et al., 1988; Rieger and Franke, 1988; Kulesh and Oshima, 1989); in the mouse K18 (Ichinose et al., 1988) and K19 genes; and finally in the bovine K19 gene (Bader et al., 1988). The repetitive sequences have different locations in these genes except in the K15 and K19 genes; these two genes contain at least one Alu-type sequence in their intron 1. The conservation of these sequences in the three K19 genes indicates that the acquisition of the Alu-type sequence occurred in an ancestral keratin gene present in a precursor of the mouse, bovine and human species. Furthermore, the presence of the repetitive sequences in K19 and K15 (Bader et al., 1988), which are adjacent in the human (Bader et al., 1988) and mouse (Filion et al., in preparation) genomes, suggests the possibility that K19 and K15 were derived from the same ancestral gene. This is at variance with the data of Blumenberg (1988), which suggested that K15 and K19 were in two different type-I subfamilies. However, those conclusions were drawn from comparisons of the 2B subdomains of keratins, and it is likely that aa-coding sequences do not evolve like intron sequences. Finally, many type-I keratin genes do not have an Alu-type sequence in their first intron, suggesting that there were at least two type-I keratin genes when the K19 ancestor acquired its Alu-type insert. To substantiate this hypothesis, it will be interesting to determine whether K15 genes in other species, as well as other keratin genes close to the K15 and the K19 genes, also contain Alu-type sequences in their first intron.
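As a rough illustration of how much of mouse intron 1 is accounted for by the repetitive elements discussed above, the coordinates listed in Table II can simply be summed. The sketch below assumes that the table positions are inclusive nt coordinates and reads the partly illegible entry for the T-rich region as 3057-3110; the resulting fraction is derived arithmetic for illustration, not a value reported by the authors.

```python
# Coordinates transcribed from Table II (assumed inclusive, in nt).
INTRON_1 = (804, 3727)
REPEATS = [
    (1400, 1428, "B1"),
    (1533, 1572, "(CAGA)10"),
    (2128, 2182, "B1"),
    (2257, 2382, "B1"),
    (2430, 2479, "(CCAC)10"),
    (2782, 2982, "B2"),
    (3004, 3051, "(CCTT)12"),
    (3057, 3110, "T-rich"),   # end coordinate is an assumed reading
    (3111, 3213, "B1"),
    (3243, 3414, "B2"),
]


def repeat_coverage(intron, repeats):
    """Return (nt covered by listed elements, intron length in nt)."""
    start, end = intron
    covered = set()
    for rep_start, rep_end, _name in repeats:
        # Using a set of positions avoids double counting if entries overlap.
        covered.update(range(rep_start, rep_end + 1))
    covered = {pos for pos in covered if start <= pos <= end}
    return len(covered), end - start + 1


covered, total = repeat_coverage(INTRON_1, REPEATS)
print(f"{covered} of {total} nt ({100 * covered / total:.1f}%) lie in listed elements")
```

Under these assumptions the listed elements account for roughly 30% of the intron; the direct repeats flanking the B1 and B2 elements are not counted.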

Fig. 6. Schematic representation of the mouse K19 gene intron 1. Regions (hatched), their positions (in nt) and percentage of homology with corresponding K19 introns 1 of other species are depicted. The region containing repetitive sequences is denoted by dashed lines and is of variable length in the three species.
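The percentage homology values quoted for the conserved intron regions can be thought of as the share of identical positions in an alignment. The following is a minimal sketch of that calculation; the function name, the toy sequences and the decision to ignore gap columns are assumptions made for illustration, since the text does not state how gaps were treated or which alignment procedure was used.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical bases over alignment columns without gaps.

    Both inputs must already be aligned (equal length, '-' marks a gap).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # gap columns are skipped (an assumption of this sketch)
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy sequences, not taken from the K19 gene.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
print(round(percent_identity("ACG-TACG", "ACGATACC"), 2))
```

Counting gap columns as mismatches instead would lower the values, so figures computed this way are only comparable when the same convention is used throughout.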

(g) The mouse K19 gene maps to the Krt-1 locus on chromosome 11

The genomic organization of the keratin gene family is of considerable interest from an evolutionary perspective and also in relation to coexpression of type-I and type-II keratin genes. In the mouse, meiotic mapping has assigned a locus of type-I keratin genes, Krt-1, to chromosome 11 (Compton et al., 1988; Nadeau et al., 1989). The Krt-1 locus was found to be tightly linked to the mutation 'rex' (Re) and to the homeo box gene complex Hox-2 (Joyner et al., 1985) by mapping restriction fragment variants in mouse interspecies backcrosses using a mouse K10 gene-specific probe and a human K14 gene nonspecific probe (Nadeau et al., 1989). To test whether the mouse K19 gene is also a part of Krt-1, we determined the segregation of K19 gene variants in a subset of progeny DNA from a (C57BL/6J-TrJ Re × M. spretus [Spain])F1 × C57BL/6J interspecies backcross which had also been typed for segregation of the dominant phenotypic mutations Re and TrJ ('trembler'), for the K10 gene (marking the Krt-1 locus) and for the Hox-2.1 gene. Probe 5' (see Fig. 1) detected 1.0-kb and 1.5-kb variant bands in TaqI-digested C57BL/6J Re TrJ and M. spretus DNAs, respectively (experimental results not shown). In the 29 progeny tested, the K19 gene cosegregated with Krt-1 and the Re gene, including the single recombinant between these genes and Hox-2. These data definitively map the K19 gene to chromosome 11 within 10 cM of Re and Hox-2 (95% confidence), and strongly support assignment of the K19 gene to the type-I keratin gene locus, Krt-1.

It has become apparent that the mammalian keratin gene family is organized into a few loci rather than being dispersed in the genome. The Krt-2 locus on the distal third of mouse chromosome 15 near the mutation 'caracul' (Ca) includes the type-II keratin genes K1, K4, K5, K6, K8 (J.G.C. and J.H.N., in preparation) and two type-II hair keratin genes (J.G.C. and A. Bertolino, in preparation). The Krt-1 locus of type-I genes on mouse chromosome 11 at Re has now been shown by linkage studies to contain the K19 gene (this report), the K10 gene (Nadeau et al., 1989), and the K14 and K16 genes (J.G.C., unpublished). Physical linkage between K19 and another type-I gene on a cosmid clone (see section f) places a fifth type-I gene, identified as K15 (M.F., M.L. and A.R., in preparation), into Krt-1. The Krt-1 locus is within a large chromosomal segment on mouse chromosome 11 and human chromosome 17 that has been conserved since divergence of the human and mouse lineages (Nadeau, 1989). The human locus analogous to Krt-1 containing functional keratin genes has been located to 17q11-13, where the type-I keratin genes K13, K14, K15 and K16 (Romano et al., 1988a,b; Rosenberg et al., 1988) have been mapped, and assignment of K19 is supported by physical linkage between K19 and K15 in genomic clones (Bader et al., 1988).

The fact that many type-I and type-II keratin genes are segregated and linked in only two gene clusters which have been maintained in different lineages suggests that evolution of the keratin gene family must have proceeded within constraints favoring physical linkage of the genes. One possibility is a mechanism of keratin gene duplication and evolution which intrinsically requires close linkage between genes. Determining whether non-functional pseudogenes also tend to be located in Krt-1 and Krt-2 would address this question. We speculate that the complex control of keratin gene regulation may include a role for keratin 'domain' elements in developmental and tissue-specific gene activation and repression, as has been demonstrated for the human β-globin locus (Grosveld et al., 1987). Transgenic expression of individual keratin gene sequences should reveal whether proximal and gene DNA sequences account for all features of keratin gene regulation. Incomplete regulation of such transgenes would be consistent with the existence of distal control elements and is predicted if the keratin multigene 'domains' do indeed have a regulatory function.
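The order of magnitude of the 10 cM limit quoted in this section can be checked with a simple calculation. The text does not describe how the confidence limit was obtained, so the sketch below is only one standard approach, under stated assumptions: it takes the loci that showed no recombinants with K19 among the 29 progeny (Krt-1 and Re), asks for the largest recombination fraction r that would still yield zero recombinants with probability 5%, i.e. it solves (1 - r)^n = 0.05, and treats 1% recombination as 1 cM, which is reasonable only for short distances. The function name is hypothetical.

```python
def max_recombination_fraction(n_progeny: int, confidence: float = 0.95) -> float:
    """Upper confidence limit on the recombination fraction when zero
    recombinants are observed among n_progeny backcross animals.

    Solves (1 - r) ** n_progeny = 1 - confidence for r.
    """
    if n_progeny <= 0:
        raise ValueError("n_progeny must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / n_progeny)


upper = max_recombination_fraction(29)
print(f"upper 95% limit: about {100 * upper:.1f} cM")  # about 9.8 cM
```

With 29 progeny and no recombinants, the bound is just under 10 cM. Tightening it would require typing more backcross progeny, because the limit shrinks only roughly in proportion to 1/n.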

(h) Conclusions

The results presented here provide a more comprehensive characterization of the overall structure and organization of the keratin gene family and suggest that flanking sequences, intragenic sequences and keratin 'domain' elements may all be important for the regulation of K19 gene expression. Functional studies of these potential regulatory sequences are in progress and will certainly lead to a better understanding of the underlying principles governing the specificity of expression of the K19 gene and of other keratin genes during development and in adult tissues.

ACKNOWLEDGEMENTS

We would like to thank Carmen Lampron for helpful suggestions. The photographic work of Louis Lussier and Roger Duclos is acknowledged. This research was supported by a grant from the NCI of Canada to A.R. J.G.C. wishes to acknowledge The Jackson Laboratory for its initial support of these studies, through NIH CA 34196, RR 05545 and ACS grant IN 155. This work was also supported by NIH grant HG00189 to J.H.N. M.L. is the recipient of studentships from La Société de Recherches sur le Cancer and from the Université de Montréal. The NIH is not responsible for the contents of this publication, nor do the contents necessarily represent the official views of that agency.

REFERENCES

Bader, B.L., Magin, T.M., Hatzfeld, M. and Franke, W.W.: Amino acid sequence and gene organization of cytokeratin no. 19, an exceptional tail-less intermediate filament protein. EMBO J. 5 (1986) 1865-1875.
Bader, B.L., Jahn, L. and Franke, W.W.: Low level of cytokeratins 8, 18 and 19 in vascular smooth muscle cells of human umbilical cord and in cultured cells derived therefrom, with an analysis of the chromosomal locus containing the cytokeratin 19 gene. Eur. J. Cell Biol. 47 (1988) 300-319.
Blumenberg, M.: Concerted duplications in the two keratin gene families. J. Mol. Evol. (1988) 203-211.
Bosch, F.X., Leube, R.E., Achtstatter, T., Moll, R. and Franke, W.W.: Expression of simple epithelial type cytokeratins in stratified epithelia as detected by immunolocalization and hybridization in situ. J. Cell Biol. 106 (1988) 1635-1648.
Breathnach, R. and Chambon, P.: Organization and expression of eucaryotic split genes coding for proteins. Annu. Rev. Biochem. 50 (1981) 343-383.
Charnay, P., Mellon, P. and Maniatis, T.: Linker scanning mutagenesis of the 5'-flanking region of the mouse β-major-globin gene: Sequence requirements for transcription in erythroid and nonerythroid cells. Mol. Cell. Biol. 5 (1985) 1498-1511.
Compton, J.G., Phillips, S.H. and Nadeau, J.H.: Keratin genes proximity to mutant loci on Chr. 11 and Chr. 15 that affect the epidermis, and the linkage to homeobox homologs gene. Mouse News Lett. 80 (1988) 165.
Dierks, P., Van Ooyen, A., Cochran, M.D., Dobkin, C., Reiser, J. and Weissmann, C.: Three regions upstream from the cap site are required for efficient and accurate transcription of the rabbit β-globin gene in mouse 3T6 cells. Cell 32 (1983) 695-706.
Eckert, R.L.: Sequence of the human 40-kDa keratin reveals an unusual structure with very high sequence identity to the corresponding bovine keratin. Proc. Natl. Acad. Sci. USA 85 (1988) 1114-1118.
Eichner, R., Sun, T.-T. and Aebi, U.: The role of keratin subfamilies and keratin pairs in the formation of human epidermal intermediate filaments. J. Cell Biol. 102 (1986) 1767-1777.
Frischauf, A.-M., Lehrach, H., Poustka, A. and Murray, N.: λ replacement vectors carrying polylinker sequences. J. Mol. Biol. 170 (1983) 827-842.
Geiduschek, E.P. and Tocchini-Valentini, G.P.: Transcription by RNA polymerase III. Annu. Rev. Biochem. 57 (1988) 873-914.
Gilfix, B.M. and Eckert, R.L.: Coordinate control by vitamin A of keratin gene expression in human keratinocytes. J. Biol. Chem. 260 (1985) 14026-14029.
Glass, C. and Fuchs, E.: Isolation, sequence, and differential expression of a human K7 gene in simple epithelial cells. J. Cell Biol. 107 (1988) 1337-1350.
Grosveld, F., Blom van Assendelft, G., Greaves, D.R. and Kollias, G.: Position-independent, high-level expression of the human β-globin gene in transgenic mice. Cell 51 (1987) 975-985.
Hardison, R.C., Butler III, E.T., Lacy, E., Maniatis, T., Rosenthal, N. and Efstratiadis, A.: The structure and transcription of four linked rabbit β-like globin genes. Cell 18 (1979) 1285-1297.
Hatzfeld, M. and Franke, W.W.: Pair formation and promiscuity of cytokeratins: formation in vitro of heterotypic complexes and intermediate-sized filaments by homologous and heterologous recombinations of purified polypeptides. J. Cell Biol. 101 (1985) 1826-1841.
Hogan, B., Constantini, F. and Lacy, E.: Manipulating the Mouse Embryo. A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1986, pp. 17-78.
Ichinose, Y., Morita, T., Zhang, F., Srimahasongcram, S., Tondella, M.L.C., Matsumoto, M., Nozaki, M. and Matsushiro, A.: Nucleotide sequence and structure of the mouse cytokeratin endoB gene. Gene 70 (1988) 85-95.
Ichinose, Y., Hashido, K., Miyamoto, H., Nagata, T., Nozaki, M., Morita, T. and Matsushiro, A.: Molecular cloning and characterization of cDNA encoding mouse cytokeratin No. 19. Gene 80 (1989) 315-323.
Jackson, B.W., Grund, C., Schmid, E., Burki, K., Franke, W.W. and Illmensee, K.: Formation of cytoskeletal elements during mouse embryogenesis. Intermediate filaments of the cytokeratin type and desmosomes in preimplantation embryos. Differentiation 17 (1980) 161-179.
Jackson, B.W., Grund, C., Winter, S., Franke, W.W. and Illmensee, K.: Formation of cytoskeletal elements during mouse embryogenesis, II. Epithelial differentiation and intermediate-sized filaments in early postimplantation embryos. Differentiation 20 (1981) 203-216.
Joyner, A.L., Lebo, R.V., Khan, Y.W., Cox, D.R. and Martin, G.R.: Comparative chromosome mapping of a conserved homeo box region in mouse and human. Nature 314 (1985) 173-175.
Kadonaga, J.T., Jones, K.A. and Tjian, R.: Promoter-specific activation of RNA polymerase II transcription by Sp1. Trends Biochem. Sci. 11 (1986) 20-23.
Kalb, V.F., Glasser, S., King, D. and Lingrel, J.B.: A cluster of repetitive elements within a 700 base pair region in the mouse genome. Nucleic Acids Res. 11 (1983) 2177-2184.
Krayev, A.S., Markusheva, T.V., Kramerov, D.A., Ryskov, A.P., Skryabin, K.G., Bayer, A.A. and Georgiev, G.P.: Ubiquitous transposon-like repeats B1 and B2 of the mouse genome: B2 sequencing. Nucleic Acids Res. 10 (1982) 7461-7475.
Kim, K.H., Schwartz, F. and Fuchs, E.: Differences in keratin synthesis between normal epithelial cells and squamous cell carcinomas are mediated by vitamin A. Proc. Natl. Acad. Sci. USA 81 (1984) 4280-4284.
Kopan, R., Traska, G. and Fuchs, E.: Retinoids as important regulators of terminal differentiation: examining keratin expression in individual epidermal cells at various stages of keratinization. J. Cell Biol. 105 (1987) 427-440.
Kulesh, D.A. and Oshima, R.G.: Complete structure of the gene for human keratin 18. Genomics 4 (1989) 339-347.
Lawn, R.M., Efstratiadis, A., O'Connell, C. and Maniatis, T.: The nucleotide sequence of the human β-globin gene. Cell 21 (1980) 647-651.
Lussier, M., Ouellet, T., Lampron, C., Lapointe, L. and Royal, A.: Mouse keratin 19: complete amino acid sequence and gene expression during development. Gene 85 (1989) 435-444.
Marchuk, D., McCrohon, S. and Fuchs, E.: Complete sequence of a gene encoding a human type I keratin: sequences homologous to enhancer elements in the regulatory region of the gene. Proc. Natl. Acad. Sci. USA 82 (1985) 1609-1613.
Nadeau, J.H., Berger, F.G., Cox, D.R., Crosby, J.L., Davisson, M.T., Ferrara, D., Fuchs, E., Hart, C., Honihan, L., Lalley, P.A., Langley, S.H., Martin, G.R., Nichols, L., Phillips, S.J., Roderick, T.H., Roop, D.R., Ruddle, F.H., Skow, L.C. and Compton, J.G.: A family of type I keratin genes and the homeo box-2 gene complex are closely linked to the rex locus on mouse chromosome 11. Genomics 5 (1989) 454-462.
Nadeau, J.H.: Maps of linkage and synteny homologies between mouse and man. Trends Genet. 5 (1989) 82-86.
O'Guin, W.M., Galvin, S., Schermer, A. and Sun, T.-T.: Patterns of keratin expression define distinct pathways of epithelial development and differentiation. Curr. Topics Develop. Biol. 22 (1987) 97-125.
Oliviero, S. and Monaci, P.: RNA polymerase III promoter elements enhance transcription of RNA polymerase II genes. Nucleic Acids Res. 16 (1988) 1285-1293.
Oshima, R.G., Howe, W.E., Klier, F.G., Adamson, E.D. and Shevinsky, L.H.: Intermediate filament protein synthesis in preimplantation murine embryos. Dev. Biol. 99 (1983) 447-455.
Ouellet, T., Lampron, C., Lussier, M., Lapointe, L. and Royal, A.: Differential regulation of keratin 8 and 18 messenger RNAs in differentiating F9 cells. Biochim. Biophys. Acta 1048 (1990a) 194-201.
Ouellet, T., Lussier, M., Babai, F., Lapointe, L. and Royal, A.: Differential expression of the epidermal K1 and K10 keratin genes during mouse embryo development. Biochem. Cell Biol. 68 (1990b) 448-453.
Padgett, R.A., Grabowski, P.J., Konarska, M.M., Seiler, S. and Sharp, P.A.: Splicing of messenger RNA precursors. Annu. Rev. Biochem. 55 (1986) 1119-1150.
Quinlan, R.A., Schiller, D.L., Hatzfeld, M., Achtstatter, T., Moll, R., Jorcano, J.L., Magin, T.M. and Franke, W.W.: Patterns of expression and organization of cytokeratin intermediate filaments. In Wang, E., Fischman, D., Liem, R.K.H. and Sun, T.-T. (Eds.), Intermediate Filaments. Ann. N.Y. Acad. Sci. 455 (1985) 282-306.
Rackwitz, H.-R., Zehetner, G., Frischauf, A.-M. and Lehrach, H.: Rapid restriction mapping of DNA cloned in lambda phage vectors. Gene 30 (1984) 195-200.
Rieger, M. and Franke, W.W.: Identification of an orthologous mammalian cytokeratin gene. High degree of intron sequence conservation during evolution of human cytokeratin 10. J. Mol. Biol. 204 (1988) 841-856.
Romano, V., Bosco, P., Costa, G., Leube, R.E., Franke, W.W., Rocchi, M. and Romeo, G.: Chromosomal assignment of cytokeratin genes. Cytogenet. Cell Genet. 46 (1988a) 683.
Romano, V., Bosco, P., Rocchi, M., Costa, G., Leube, R.E., Franke, W.W. and Romeo, G.: Chromosomal assignments of human type I and type II cytokeratin genes to different chromosomes. Cytogenet. Cell Genet. 48 (1988b) 148-151.
Roop, D.R., Huitfeldt, H., Kilkenny, A. and Yuspa, S.H.: Regulated expression of differentiation-associated keratins in cultured dermal cells detected by monospecific antibodies to unique peptides of mouse epidermal keratins. Differentiation 35 (1987) 143-150.
Rosenberg, M., RayChaudhury, A., Shows, T.B., LeBeau, M.M. and Fuchs, E.: A group of type I keratin genes on human chromosome 17: characterization and expression. Mol. Cell. Biol. 8 (1988) 722-736.
Sanger, F., Nicklen, S. and Coulson, A.R.: DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74 (1977) 5463-5467.
Schermer, A., Galvin, S. and Sun, T.-T.: Differentiation-related expression of a major 64K corneal keratin in vivo and in culture suggests limbal location of corneal epithelial stem cells. J. Cell Biol. 103 (1986) 49-62.
Stasiak, P.C., Purkis, P.E., Leigh, I.M. and Lane, E.B.: Keratin 19: predicted amino acid sequence and broad tissue distribution suggest it evolved from keratinocyte keratins. J. Invest. Dermatol. 92 (1989) 707-716.
Steinert, P.M. and Roop, D.R.: Molecular and cellular biology of intermediate filaments. Annu. Rev. Biochem. 57 (1988) 593-625.
Sun, T.-T., Tseng, S.C.G., Huang, A.J.-W., Cooper, D., Schermer, A., Lynch, M.H., Weiss, R. and Eichner, R.: Monoclonal antibody studies of mammalian epithelial keratins: a review. In Wang, E., Fischman, D., Liem, R.K.H. and Sun, T.-T. (Eds.), Intermediate Filaments. Ann. N.Y. Acad. Sci. 455 (1985) 307-329.
Vasseur, M., Duprey, P., Brûlet, P. and Jacob, F.: One gene and one pseudogene for the cytokeratin endoA. Proc. Natl. Acad. Sci. USA 82 (1985) 1155-1159.
Yang, K.Y., Masters, J.N. and Attardi, G.: Human dihydrofolate reductase gene organization: extensive conservation of the G + C-rich 5' non-coding sequence and strong intron size divergence from homologous mammalian genes. J. Mol. Biol. 176 (1984) 169-187.
