GENOMICS 14, 875-882 (1992)

Conserved Regulatory Elements in the Promoter Region of the N-CAM Gene

GREGGORY COLWELL, BO LI, DOUGLAS FORREST, AND ROBERT BRACKENBURY

Department of Anatomy and Cell Biology, University of Cincinnati Medical Center, Cincinnati, Ohio 45267-0521

Received June 26, 1992

Genomic clones containing 5'-flanking sequences, the first exon, and the entire first intron from the chicken N-CAM gene were characterized by restriction mapping and DNA sequencing. A >600-bp segment that includes the first exon is very G+C-rich and contains a large proportion of CpG dinucleotides, suggesting that it represents a CpG island. SP-1 and AP-1 consensus elements are present, but no TATA- or CCAAT-like elements were found within 300 bp upstream of the first exon. Comparison of the chicken promoter region sequence with similar regions of the human, rat, and mouse N-CAM genes revealed that some potential regulatory elements including a "purine box" seen in mouse and rat N-CAM genes, one of two homeodomain binding regions seen in mammalian N-CAM genes, and several potential SP-1 sites are not conserved within this region. In contrast, high CpG content, a homeodomain binding sequence, an SP-1 element, an octamer element, and an AP-1 element are conserved in all four genes. The first intron of the chicken gene is 38 kb, substantially smaller than the corresponding intron from mammalian N-CAM genes. Together with previous studies, this work completes the cloning of the chicken N-CAM gene, which contains at least 26 exons distributed over 85 kb.

© 1992 Academic Press, Inc.

Sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under Accession No. Z12128.

0888-7543/92 $5.00 Copyright © 1992 by Academic Press, Inc. All rights of reproduction in any form reserved.

INTRODUCTION

The neural cell adhesion molecule (N-CAM) consists of several heavily glycosylated polypeptides that are derived from alternatively spliced transcripts of a single gene (Owens et al., 1987; Murray et al., 1986; Goridis and Wille, 1988). N-CAM mediates cell-cell aggregation by homophilic (i.e., N-CAM to N-CAM) binding (Rutishauser et al., 1982) and is believed to play key morphogenetic roles during embryogenesis (Edelman, 1986). In adult animals, N-CAM is expressed on neurons, striated muscle, and testis, but is absent from most epithelia and mesenchymal tissue (Thiery et al., 1982; Crossin et al., 1985). During development, the molecule is dynamically expressed by a variety of cell types: a large increase in N-CAM expression is an early consequence of neural induction, and N-CAM is transiently expressed in metanephric tubules, in somites, and in placodes (Thiery et al., 1982; Crossin et al., 1985). In addition, N-CAM expression is altered in a variety of tumor cells (Roth et al., 1988; Garin-Chesa et al., 1991; Van-Camp et al., 1990) and after transformation by N-myc (Akeson and Bernards, 1990) or Rous sarcoma virus (Greenberg et al., 1984; Brackenbury et al., 1984) in vitro.

Despite the presumptive significance of changes in N-CAM expression during embryogenesis or tumor development, little is known about the mechanisms that control transcription of the N-CAM gene. Chicken (Murray et al., 1984; Hemperly et al., 1986; Cunningham et al., 1987), human (Barton et al., 1988; Hemperly et al., 1990), mouse (Barthels et al., 1987), rat (Small et al., 1987), and frog (Kintner and Melton, 1987) homologs of N-CAM cDNAs have been cloned, and 5'-flanking sequences of the human (Barton et al., 1990), rat (Chen et al., 1990), and mouse (Hirsch et al., 1990) genes have been analyzed. In mammalian species, all neural transcripts appear to be derived from a single promoter which lacks TATA and CCAAT boxes, but contains AP-1 and SP-1 consensus sequences, and other elements that vary among species. The cell type-specific regulation of N-CAM may not be derived solely from 5'-flanking sequences, but analysis of potential control elements within the gene has been hampered by the large, indeed unknown, size of the first intron.

To gain insight into the regulatory elements controlling N-CAM expression, we isolated and characterized genomic clones encompassing the 5'-flanking region, the first exon, and all of the first intron of the chicken N-CAM gene and compared the structure and sequence of the 5' portion of this gene to those of previously characterized mammalian N-CAM genes (Barton et al., 1990; Chen et al., 1990; Hirsch et al., 1990). This comparison revealed several potential regulatory elements that are conserved among these evolutionarily distant organisms, implying functional significance.

MATERIALS AND METHODS

General molecular biological methods. Isolation of DNA, agarose gel electrophoresis, and restriction enzyme analyses of DNA were all carried out using standard methods (Sambrook et al., 1989). Oligonucleotides were synthesized on an ABI Model 391 PCR-Mate synthesizer.

Libraries and cosmid clone isolation. Two cosmid libraries were used to assemble a map of the 5' portion of the N-CAM gene. Each cosmid library was constructed using partially digested, size-selected chicken genomic DNA ligated into the BamHI site of pWE15 (Stratagene). The first library (obtained from Warren Gallin, University of Alberta) was constructed from Sau3A partially digested DNA from a male White Leghorn chicken, was propagated in Escherichia coli NM544, and was used in the isolation of cosE3 and cosG2C. The second library (purchased from Stratagene) was constructed from MboI partially digested DNA from a Cornish White Rock cockerel, was propagated in E. coli NM544, and was used in the isolation of cosIDA, cosDAA, and cosIAA.

For screening, cosmid libraries were plated at a density of 50,000 CFU/plate on 150-mm Petri dishes containing LB-amp (Luria broth with 100 µg/ml ampicillin) agar. The plates were incubated for 8 h at 37°C. The colonies were lifted onto a nitrocellulose membrane which was then placed colony side up onto a new LB-amp plate. The colonies were allowed to grow an additional 15-17 h at room temperature. A duplicate of the original colony lift was generated by placing a second membrane, prewet in 2× SSPE (1× SSPE = 0.18 M NaCl, 10 mM NaPO4, pH 7.4, 1 mM EDTA), onto the colony side of the first membrane. Colonies were transferred by placing the membrane sandwich between two pieces of Whatman 3 MM filter paper and pressing between two glass plates. The filter sandwich was sprinkled with 2× SSPE, wrapped in foil, autoclaved for 5 min, and vacuum baked for 1 h at 80°C. Cell debris was removed by shaking in 2× SSPE, 1% SDS for 2 h. The membranes were prehybridized at 42°C in prehybridization solution (5× SSPE, 5× Denhardt's reagent, 0.1% SDS, 50% formamide, 100 µg/ml sheared, denatured salmon testis DNA).
Cosmid restriction fragments or cDNA inserts were used as probes after purification from agarose gels using the glass bead method (Vogelstein and Gillespie, 1979). The random-primer labeling method (Feinberg and Vogelstein, 1983) was used to generate probes with specific activities of approximately 2 × 10^9 cpm/µg. Labeled probes were added to the prehybridization solution at a final concentration of 2 × 10^6 cpm/ml. Membranes were incubated at 42°C for 17 h, then washed (final stringency, 0.1× SSPE, 0.1% SDS, at 65°C) and exposed to Kodak XAR-5 film with an intensifying screen at -70°C for 17 h. Duplicate positives were purified through two additional rounds of screening. The cosmid clone containing exon 0, cosG2C, was isolated as positive to a pEC265 cDNA insert probe and negative to an insert probe from pEC208, which contains sequences from the 3' half of the N-CAM gene. Cosmid clone cosDAA, which linked cosG2C and cosE3, was isolated as positive to both cosG2C EcoRI fragment 2.5E and cosE3 EcoRI fragment 1.9E (see Fig. 1). The cosmid clone cosIDA, which contains exon 0 and 5'-flanking regions, was positive to cosG2C EcoRI fragment 2.5E but negative to cosE3 EcoRI fragment 1.9E.

Cosmid mapping. Cosmid clones were mapped using a modification of the partial restriction digestion mapping technique (Rackwitz et al., 1985). Cosmid DNA, purified over a cesium chloride gradient, was digested by serially diluting EcoRI and BamHI restriction enzymes (Gibco/BRL) over a range of 2.2-0.1 units/µg cosmid. When mapping cosmids for BamHI sites, a complete NotI digest was used to dissociate the insert from the vector. The use of 1:2 serial dilutions resulted in five digests, each digest set up to restrict 1.5 µg of cosmid. The tubes were incubated at 37°C for 15 min; then the reaction was stopped by the addition of 2.5 µl loading dye containing 10 mM EDTA.
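As a rough check on the digest series just described, the following sketch computes a 1:2 serial dilution starting from 2.2 units/µg. It assumes that each step exactly halves the enzyme-to-DNA ratio; the function name and the constant are illustrative and are not part of the original protocol. Under that assumption, five steps end near the stated lower bound of about 0.1 unit/µg.

```python
def serial_dilution(start, factor, steps):
    """Return enzyme-to-DNA ratios (units/µg) for a serial dilution series."""
    return [start / factor**i for i in range(steps)]


DNA_PER_DIGEST_UG = 1.5  # µg of cosmid per digest, as stated in the text

# Assumption: exact twofold steps starting at 2.2 units/µg.
series = serial_dilution(start=2.2, factor=2, steps=5)

for ratio in series:
    units = ratio * DNA_PER_DIGEST_UG
    print(f"{ratio:.4f} units/µg -> {units:.3f} units of enzyme per digest")
```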
Half of each digest was then electrophoresed on a 0.8% vertical agarose gel (Hoefer: 25 cm long, comb: 3 mm thick × 8 mm wide) at 2 V/cm for 17 h. The second half of the digest was then loaded onto the same gel and electrophoresed for an additional 8 h at 3 V/cm. Gels were then blotted onto nitrocellulose membranes (Schleicher & Schuell) and probed consecutively with T3 (24-mer) and T7 (25-mer) oligonucleotide probes. Probes were labeled and detected via chemiluminescence as specified in the ECL oligonucleotide labeling and detection system purchased from Amersham. The sizes of each of the partial digestion products were determined relative to standards run in parallel and the successive size differences were used to construct the map. Restriction maps were confirmed by analyzing cross-hybridization and double-digestion results.

DNA sequencing. Regions to be sequenced were subcloned into M13 for single-stranded sequencing or Bluescript SK+ for double-stranded sequencing. The dideoxy chain termination method (Sanger et al., 1977) in conjunction with Sequenase (U.S. Biochemical) and [35S]dATP (DuPont) was used for both single-stranded and denatured, double-stranded templates. Oligonucleotide primers were synthesized to extend sequences obtained from a single template. Templates that resulted in compressed sequence were additionally sequenced with the dITP nucleotide mixes supplied with the Sequenase kit. Sequence data were assembled using the DNASIS software (Hitachi) and analyzed using DNASIS and MacVector 3.5 (IBI-Kodak). The N-CAM sequence was compared to sequences in the GenBank database (Bilofsky and Burks, 1988).
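The map-construction step of the cosmid mapping procedure (sizing end-labeled partial digestion products and taking successive size differences) can be sketched as follows. The product sizes used here are hypothetical illustration values, not measurements from this study, and the function name is likewise an assumption.

```python
def map_from_partials(partial_sizes_kb):
    """Derive a restriction map from end-labeled partial digestion products.

    Each product runs from the labeled end of the insert to one restriction
    site, so the sorted sizes are the site positions, and the differences
    between successive sizes are the individual restriction fragments.
    """
    positions = sorted(partial_sizes_kb)
    previous = [0.0] + positions[:-1]
    fragments = [round(b - a, 2) for a, b in zip(previous, positions)]
    return positions, fragments


# Hypothetical ladder of partial products (kb) detected with one end probe.
positions, fragments = map_from_partials([7.2, 4.1, 12.2, 5.3])
print("site positions from labeled end (kb):", positions)
print("fragments in map order (kb):", fragments)
```

Probing the same blot from the opposite end should, in principle, give a second ladder whose differences list the same fragments in reverse order, which offers one way to cross-check a map built this way.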

RESULTS

Isolation of Genomic Clones Containing the 5' Portion of the N-CAM Gene

Genomic cosmid clones containing sequences from the 5' portion of the chicken N-CAM gene were obtained by screening with a cDNA clone, pEC265. This probe contains exons 0, 1, 2, and 3 and part of exon 4 (Owens et al., 1987; Cunningham et al., 1987). One cosmid clone, termed E3, that specifically hybridized to pEC265 was found to contain sequences corresponding to exons 1, 2, and 3. A second clone, G2C, proved to contain sequences representing exon 0. The inserts of both cosmids were mapped by partial digestion with BamHI or EcoRI followed by hybridization with end-specific probes representing the T3 and T7 promoter sequences. This analysis revealed that E3 and G2C each contained two unrelated inserts and two vector sequences. The inserts corresponding to the N-CAM gene are shown in Fig. 1.

To obtain clones containing the remainder of the first intron, the distalmost EcoRI restriction fragments of cosmids E3 and G2C were used in another round of library screening. Fifteen independent positive clones were obtained, and four of these reacted with both probes, suggesting that they bridged the gap between G2C and E3. A map of the BamHI and EcoRI sites obtained by partial digestion analysis confirmed that this was, indeed, the case (see Fig. 1). The map was verified by subcloning each of the EcoRI and BamHI fragments and confirming the location of restriction sites for the other enzyme within the subclone. The map order was also confirmed by verifying the cross-hybridization of BamHI fragments onto the expected EcoRI fragments and vice versa.

Only one BamHI site showed a polymorphism. Two of the cosmid clones (IAA and DAA) lacked a BamHI site that was present in IDA and other cosmids, and thus contained a single BamHI fragment of 7.1 kb instead of two fragments of 3.1 and 4.0 kb. Hybridization analysis showed that the 7.1-kb BamHI fragment bound to both the 3.1- and 4.0-kb BamHI fragments. Size measurements from the partial digestion analysis and accurate size determinations of the subcloned fragments indicated that the distance from exon 0 to exon 1 is 38 kb (Fig. 1).

[Figure 1 image not reproduced: EcoRI and BamHI restriction maps of cosmids G2C, IDA, DAA, IAA, and E3, with exon 0 and exons 1-7 marked (scale bar, 10 kb), and an enlarged map of the sequenced region around exon 0 showing the XhoI and NotI sites (scale bar, 0.1 kb).]

FIG. 1. Cosmid clones representing the 5' half of the chicken N-CAM gene. The figure shows cosmids G2C and E3, which were isolated as reactive to the cDNA clone pEC265. EcoRI fragments at the end of these cosmids (thickened line) were used to probe libraries, yielding cosmids IDA, IAA, DAA, and others. Partial restriction analysis of these cosmids, using the T3 and T7 promoter sequences as probes, produced the EcoRI and BamHI restriction maps shown. The sizes of the EcoRI (E) and BamHI (B) end fragments are listed below the line representing each cosmid. The 7.1-kb BamHI fragment indicated by the asterisk was found in some cosmid clones; all other clones contained an additional BamHI site, resulting in two fragments of 3.1 and 4.0 kb. The locations of exon 0 and exons 1-7 (filled boxes) are shown at the same scale. The region surrounding exon 0 that was subcloned and sequenced is shown at the bottom of the figure at an enlarged scale.

Isolation and Sequence Analysis of the Chicken N-CAM Promoter Region

To locate sequences corresponding to the promoter region and first exon, the cDNA clone pEC265 was used to probe fragments of the appropriate cosmid, G2C. pEC265 reacted with a 4.0-kb NotI fragment which was then subcloned into pBluescript. A single pEC265-reactive 1.2-kb XhoI subfragment was subcloned into M13 in both orientations. As shown in Fig. 2, the sequences of exon 0, 1.0 kb of 5'-flanking sequence, and 100 bp of intron 1 were determined by analysis of portions of these clones. Sequencing was carried out using universal and reverse primers, synthetic primers complementary to internal sequence, directed subcloning of large fragments, and shotgun subcloning and sequencing of RsaI, HaeIII, and AluI fragments. Part of this region had very high G+C content, necessitating the use of 7-deaza-dGTP and dITP labeling mixtures and terminal deoxynucleotidyl transferase extension of reactions. The consensus sequence shown was determined from multiple readings of sequence runs from both strands.

This analysis revealed the location of the 5'-most sequence previously found in cDNA clones pEC254 and pEC265 (Cunningham et al., 1987). This sequence was found in the genomic clone as a single continuous region, indicating that it comprises a single exon, as in mammalian N-CAM genes (Barton et al., 1990; Chen et al., 1990; Hirsch et al., 1990). This exon encodes the translation start site and signal peptide of all forms of chicken N-CAM (Cunningham et al., 1987). The genomic sequence of this exon, termed exon 0, generally agreed with the sequence previously determined from cDNA clones (Cunningham et al., 1987). Within the coding portion of the exon, the sequence we determined agrees completely with that previously published (Cunningham et al., 1987). In the 5'-untranslated portion of the exon, several differences (noted in the legend) were seen that most likely represent genuine polymorphisms. The 5' end of exon 0 as shown is derived from previously determined cDNA sequence (Cunningham et al., 1987). A splice donor sequence, CAG/GTACCG, that varies from the consensus MAG/GTRAGT (Mount, 1982) was located at the 3' end of exon 0.

-1229  TTGTGCCCCT CTGCTCTGCC CCTGGAAGGC CCCACCTGCA GTGCTGTGCC CAGCCTGGGG CCCCCAGGAC
-1159  AGGAGGGATG CGGAGCTGCT GGAGCGGGTC CAGAGGAGGG CACGAGGATG CTCAGAGGGC TGCAGCACCT
-1089  CTGCTGTGAA GACAGGCTGA GGGAGCTGGC TTCTCGCCTA CAAGAGGGGA GGCGTTGGTT AGATGTGAGG
-1019  GCGGTGAGGC GCTGGCACTG CTGCCCAGAG AGCTGTGAGT GCCCCATCCC TGCAGGTGCT CAAGGCCAGG
-949   TGGGATGGGG CCCTGGGCAG CCTGAGCTGC TGGTTGACAC CTTGATGGTC TTTAAGGTCC TCTCCAACAT
-879   GAGCCATTCT ATGATTCTGT TTATCGCTCT CTACAACTCC CTGAAAGGAG GTTGTGCTGA GGTGGGGGTT
-809   GCATCCCCTG TGCCTCTTCT GCCAGGTAAC AGTGAGAGGA CAATGGGGAA [XhoI] TGGCCTCGAG CTGCACCAGG
-739   TGAGGTTCAG ATTGAATCTT AGGAAAAACT TCTTCCCCAG AAGAGCGGTC ATTGGCACAG GGTGCCCAGG
-669   GGGTGGAGGA GTCACCGTCC CTGGAAGTGC TCAGGTGATG TGGAGGTGTG GCACTGAGGT ACACCGTCAG
-599   TGGGCAATGG TGGTGGTGGG TGGACGGCTG CACTGGGTTA TCTTAGAGGT TTTTTTCCAA [Homeo II] [TATA Box] CCTGAATAAT
-529   TCCACGATTC TATTTGTTGC CCGAACAGTT TTTCCTCATC AGTGTTCTGT TTAAGTGTCA CGAGGAGTCC GAGCAACTGA ACCGTGTTAC CCCTACGAAG CACACCCGCA GCGCTCCGCT
-459   TTACAGCAGA ACTGTCTGCT CACACAGCAT [Octamer] TAATGATGGA AATGGTCATT
-389   TCTGCACGTT TTGACAGCGG TCAGCTCCGC GCATTTCCCC
-319   CCGCGCGCGT TTTGCGGACG GCCGCCAGCG CCACCGCGCG GCCTATTTGA [Sp1] GGCGGGCGCG GGCGGAAGGA
-249   CACGGCGGTG CCGCGCTCAG AGAGCGGCGC GCAGGGATCC GCGCGGTGCT [AP1] GCGCGCTGTC AGTCACGGCG
-179   GCGGCGGCGC GGAGGCGGCC GGGGCGCAGC [Sp1] GGGCGGAGCG GGGACACGCG CGGACGGAGC GACGGCGCGA ACGGAACGGC
-109   AGGGGACGCG GGGCAAAAAC GGGAAGATCC ATCCGGAGGG CCGCGTGGTG [Sp1] CTCCGCCCGC [aAlaAlaLeu] CGCCGCGCTG [ProTrpThrL] CCCTGGACGT [Sp1] GGC~CSGGGG
-39    TATTTCGGAG CGGCCCCTCT CGCCCCCCCG [MetLeuProAl] CCGGCTGCGA TGCTGCCCGC
32     [euPhePheLe] TGTTTTTTCT [uGlyAlaAla] CGGAGCCGCA GGTACCGGCG GGGCGATGCC [Sp1] GGCTGCGGGC GGGGGGCTGC
102    GAGCCGGGCC GGGGGTCGCG GGAGCGCAGA [Sp1] CAATGGGCGG [NotI] CGGGGTGCGG CCGC

FIG. 2. Sequence of the 5' region of the chicken N-CAM gene. The figure shows approximately 1.0 kb of 5'-flanking sequence, exon 0 (underlined), and approximately 100 bp of the first intron of the chicken N-CAM gene. Numbering is relative to the translation initiation codon for the N-CAM signal sequence (shown in three-letter amino acid code above the nucleotide sequence). Some potential regulatory sequences are indicated. The XhoI and NotI sites (underlined) correspond to those shown in Fig. 1. The sequence from -1166 to -859 (dotted underline) is related to the chicken CR1 repetitive element. Within the exon, some differences were noted between this sequence and a previously determined cDNA sequence (Cunningham et al., 1987): 1031, 1032: cg vs cgg; 1044-1047: acgg vs ag; 1058, 1059: gc vs gcc; 1100-1101: cg vs cggagcg; 1103, 1104: ac vs agc; 1109, 1110: cg vs cgg; 1111, 1112: ac vs agc.

Gross features of this region included a middle repetitive element and unusual base composition. A segment of about 300 nucleotides that is highly homologous to the chicken repetitive element CR1 (Scott et al., 1987; Silva and Burch, 1989) begins approximately 1160 bp upstream of the translation initiation codon. From nucleotide -1166 to nucleotide -859 (see Fig. 2), the N-CAM sequence showed 78% identity to the CR1 consensus sequence (Scott et al., 1987). Nucleotides -874 to -859 (CATTCTATGATTCTGT) match the imperfect octameric direct repeat consensus sequence (Silva and Burch, 1989) believed to be involved in CR1 transposon insertion. In addition, a region that extended from -450 bp, through exon 0 and into the first intron, was very G+C-rich and contained a large proportion of CpG dinucleotides. Figure 3 shows a plot of the ratio between the observed frequency and expected frequency of CpG and GpC dinucleotides over 100-bp intervals of the sequence shown in Fig. 2.
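The observed/expected dinucleotide ratio referred to above can be computed along the following lines. The text does not state the formula used; this sketch assumes the common definition in which the expected count of a dinucleotide XY in a window of length N is (number of X × number of Y) / N. The function names are illustrative, and the two test strings are rows -1229 and -319 of the Fig. 2 sequence.

```python
def obs_exp_ratio(seq, dinuc="CG"):
    """Observed/expected ratio of one dinucleotide within a sequence.

    Assumed definition: expected = count(X) * count(Y) / length.
    """
    seq = seq.upper()
    n = len(seq)
    first, second = dinuc
    expected = seq.count(first) * seq.count(second) / n if n else 0.0
    observed = sum(1 for i in range(n - 1) if seq[i:i + 2] == dinuc)
    return observed / expected if expected else 0.0


def windowed_ratios(seq, window=100, dinuc="CG"):
    """Ratios over consecutive non-overlapping windows: [(start, ratio), ...]."""
    return [
        (start, obs_exp_ratio(seq[start:start + window], dinuc))
        for start in range(0, len(seq) - window + 1, window)
    ]


# Row -1229 (inside the CR1-related region) and row -319 of Fig. 2.
upstream = ("TTGTGCCCCT CTGCTCTGCC CCTGGAAGGC CCCACCTGCA "
            "GTGCTGTGCC CAGCCTGGGG CCCCCAGGAC").replace(" ", "")
island = ("CCGCGCGCGT TTTGCGGACG GCCGCCAGCG CCACCGCGCG "
          "GCCTATTTGA GGCGGGCGCG GGCGGAAGGA").replace(" ", "")

for name, row in (("row -1229", upstream), ("row -319", island)):
    cpg = obs_exp_ratio(row, "CG")
    gpc = obs_exp_ratio(row, "GC")
    print(f"{name}: CpG obs/exp = {cpg:.2f}, GpC obs/exp = {gpc:.2f}")
```

For a full-length sequence, `windowed_ratios(sequence, window=100)` would give one value per 100-bp interval, which is the form of data a plot such as Fig. 3 would need.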
As expected, this ratio fluctuates around 1.0 for GpC dinucleotides, but shows a different pattern for CpG dinucleotides: the ratio is low ... (AATAATTC) was located at -535, approximately 320 bp upstream of the beginning of exon 0. Several potential SP-1 sites were found: at ~55 bp upstream of the transcription start site, within the first exon, and within the first intron, just 3' to the first exon. A potential AP-1 recognition sequence was located within exon 0, very close to the apparent transcription start site. An octamer-related sequence (ATGGAAAT) (Wirth et al., 1987) was found at -424 and a potential homeobox recognition element (
