DNA AND CELL BIOLOGY Volume 10, Number 10, 1991 Mary Ann Liebert, Inc., Publishers Pp. 723-734

Molecular Cloning and Primary Structure of the Chicken Transforming Growth Factor-β2 Gene

DAVID W. BURT and IAN R. PATON

ABSTRACT

The chicken transforming growth factor-β2 (TGF-β2) gene and its flanking regions were cloned and characterized. The gene contains 7 exons and 6 introns spanning about 70 kb. Primer extension analysis identified one major and two minor starts of transcription. A comparison of the 5'-flanking regions of the human and chicken TGF-β2 genes revealed limited sequence homology around the start of transcription, including conserved TATA-box, CRE, and AP-2 sequence motifs. A species comparison of the 5' untranslated region did not reveal any sequence homology beyond the coding region. In contrast, the 3' untranslated region was highly conserved, suggesting that this region may play an important role in the expression of the TGF-β2 gene.

INTRODUCTION

The transforming growth factor-β (TGF-β) polypeptides are members of a large superfamily of growth factors that regulate the proliferation and differentiation of a great variety of cell types (Roberts and Sporn, 1990). The different family members can be classified into four groups (Lee, 1990; Roberts and Sporn, 1990) based on functional and structural criteria: TGF-β (TGF-β1 to TGF-β5), Inhibin (Inhibin α, βA, βB), MIS (MIS), and BMP (BMP2-7, Vg1, DPP, GDF1). Chromosomal locations have been assigned to several members of the TGF-β superfamily (Barton et al, 1988; Dickinson et al, 1990; Tabas et al, 1991) and indicate that the TGF-β superfamily has become widely dispersed during its evolution.

Five different genes for TGF-β (TGF-β1 to TGF-β5) have been described in vertebrate species (Roberts and Sporn, 1990). The structures of the human TGF-β1 and TGF-β3 genes have been described (Derynck et al, 1987, 1988) and, although the overall size of these genes is unknown, a preliminary report suggests that the TGF-β1 gene is at least 100 kb in length (Roberts and Sporn, 1990). The TGF-β1 and TGF-β3 genes both contain 7 exons and 6 introns at homologous positions.

Clones for chicken, human, ape, mouse, and Xenopus TGF-β2 cDNAs have been isolated (Hanks et al, 1988; Madisen et al, 1988; Miller et al, 1989; Jakowlew et al, 1990; Rebbert et al, 1990). DNA sequence analyses of these clones indicate that TGF-β2 is synthesized as a large precursor polypeptide, with the carboxyl terminus being cleaved to yield the mature 112-amino-acid monomer. Two forms of primate TGF-β2 have been described (Webb et al, 1988) that differ in length by 29 amino acids. The function, if any, of these additional 29 amino acids is unknown, and it remains to be seen whether they are also present in other vertebrate species. The data suggest that these different forms arise from alternative splicing, as three major TGF-β2 mRNA species (4,100, 5,100, and 6,500 nucleotides) were detected in monkey cells (Webb et al, 1988), with the 5,100-nucleotide mRNA coding for the larger TGF-β2 isoform due to the insertion of an additional exon between exons I and II. The 5,100- and 6,500-nucleotide mRNAs also differ in their 3' untranslated regions and use alternative polyadenylation sites. Multiple TGF-β2 transcripts have also been described in the mouse and chicken (Miller et al, 1989; Jakowlew et al, 1990), which show tissue and developmental stage-specific expression.

In this report, we describe for the first time the genomic organization and nucleotide sequence of the chicken TGF-β2 gene. A number of highly conserved sequence motifs are evident in the 5'- and 3'-flanking regions and are candidates for cis-acting control elements.

Department of Cellular and Molecular Biology, AFRC Institute of Animal Physiology and Genetics Research, Edinburgh Research Station, Roslin, Midlothian EH25 9PS, UK.


MATERIALS AND METHODS

Cloning the chicken TGF-β2 gene

A White Leghorn chicken genomic library of Sau 3A fragments cloned into phage EMBL3 was screened with suitable DNA fragments. A simian TGF-β2 cDNA clone, pcD-GIG2, kindly provided by Hanks et al (1988), was used as a source of several DNA fragments. Screening was performed using DNA fragments labeled with 32P to a specific activity of 5 x 10^8 cpm/µg by the random primer method. Recombinant plaques were transferred and hybridized by standard procedures (Sambrook et al, 1989). Filters were washed in 0.5 x SSC, 0.1% NaDodSO4 at 37°C. Phage DNAs were purified according to standard methods. Detailed restriction maps of genomic clones were obtained using the partial-mapping protocol of Rackwitz et al (1984), used in combination with analyses of the sizes of single and double restriction digest products. Clones were further characterized by Southern blotting and DNA sequencing.

Blotting and hybridization

After electrophoresis, gels were treated as described (Rigaud et al, 1987), transferred to Zetaprobe (Biorad) membranes under vacuum (LKB VacuGene), and hybridized as recommended by the manufacturer (Biorad).

Synthetic oligonucleotides

Synthetic oligonucleotides were synthesized by Oswel DNA services (Edinburgh University) and purified by HPLC. Primers for PCR were: Primer A, TCCATCTACAACAGCACCAGGGACTTGCTCCAGGAGAAG, positions 404-442 in a simian TGF-β2 cDNA (Hanks et al, 1988); and Primer B, CGAGACCCGCGCCTTTGAGTT, complementary to positions 851-871 in a chicken TGF-β2 cDNA (Jakowlew et al, 1990). Primer C, GCCGCCGCCGCCGCCGCCCCGACGGCCTCC, was used for primer extension.

Isolation of a chicken TGF-β2 cDNA specific for the 5' end

Total RNA was isolated from cultured chicken embryo fibroblasts (CEF) using the standard guanidine thiocyanate extraction procedure (Sambrook et al, 1989). First-strand CEF cDNA was synthesized using Moloney murine leukemia virus (Mo-MLV) reverse transcriptase (BRL) as described (Rappolee et al, 1989). The polymerase chain reaction (PCR) was carried out according to the GeneAmp DNA amplification reagent kit instructions (Perkin-Elmer Cetus, Norwalk, CT) using Primer A and Primer B (see above). The reaction was cycled through 94°C (1 min), 50°C (2 min), and 72°C (3 min) for 30 cycles in a thermal cycler. The expected 304-bp product was digested with Pst I to produce two DNA fragments of 270 bp and 34 bp. The 270-bp product was isolated from a 5% polyacrylamide gel and sequenced. Comparison with the simian TGF-β2 cDNA sequence (Hanks et al, 1988) verified that it contained exon I and II coding sequences.

DNA sequence analysis

Sequencing templates were prepared by cloning DNA fragments, generated either by sonication or from specific restriction digests of plasmid subclones, into M13 cloning vectors (Bankier et al, 1987). Single-stranded DNA templates were prepared from 1-ml cultures of NM522 cells infected with single M13 recombinants (Sambrook et al, 1989). We used Sequenase and Taquence systems to determine DNA sequences, as directed by the manufacturer (US Biochemicals). The Taquence system allowed us to sequence through very GC-rich DNA sequences.

Computer analyses

The DB system (Staden, 1982) was used for sequence assembly. The University of Wisconsin Genetic Computing Group programs (Devereux et al, 1984) were used for general sequence analysis. Multiple sequence alignments were made using the CLUSTAL program (Higgins and Sharp, 1989).

Primer extension analysis

Primer C was 5' end-labeled with [γ-32P]ATP using T4 polynucleotide kinase, annealed to 20 µg of CEF total RNA, and then extended using AMV reverse transcriptase, as described (Sambrook et al, 1989). We included 7-deaza-2'-deoxyguanosine instead of dGTP in the extension reaction to weaken secondary structures in the GC-rich TGF-β2 mRNA, since these proved a block to primer extension in our early experiments (unpublished observations). A set of dideoxynucleotide DNA sequencing reactions, using the same labeled primer and a complementary M13 template containing a chicken TGF-β2 gene fragment, was used as size markers. The products of the extension reaction were analyzed on a 6% polyacrylamide/8 M urea sequencing gel. A 35S-labeled Msp I digest of Bluescribe DNA was used as additional size markers.
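The primer details quoted above lend themselves to a quick consistency check. The following is a minimal sketch, not part of the original methods: it confirms that each primer's length matches its stated coordinate span, derives the sense-strand target of Primer C by reverse complementation, and reports its GC content. The function names are illustrative, the coordinates for Primer C (+162/+191) are taken from the Results, and the space inside Primer A in the source text is assumed to be an extraction artifact.

```python
# Primer sequences and coordinate spans as quoted in the text.
# Assumption: the space inside Primer A in the source is an extraction artifact.
PRIMERS = {
    "A": ("TCCATCTACAACAGCACCAGGGACTTGCTCCAGGAGAAG", 404, 442),
    "B": ("CGAGACCCGCGCCTTTGAGTT", 851, 871),
    "C": ("GCCGCCGCCGCCGCCGCCCCGACGGCCTCC", 162, 191),
}

# Sizes quoted for the RT-PCR product and its Pst I digestion products.
PCR_PRODUCT_BP = 304
PST1_FRAGMENTS_BP = (270, 34)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def span_length(start, end):
    """Number of positions in an inclusive coordinate range."""
    return end - start + 1


def lengths_match(primers):
    """Check that each primer is as long as its stated coordinate span."""
    return {
        name: len(seq) == span_length(start, end)
        for name, (seq, start, end) in primers.items()
    }


def gc_fraction(seq):
    """Fraction of G and C residues in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    print("Length matches span:", lengths_match(PRIMERS))
    primer_c = PRIMERS["C"][0]
    print("Primer C target (sense strand):", reverse_complement(primer_c))
    print("Primer C GC fraction: %.2f" % gc_fraction(primer_c))
    print("Pst I fragments sum to product:",
          sum(PST1_FRAGMENTS_BP) == PCR_PRODUCT_BP)
```

The very high GC fraction of Primer C is consistent with the GC-rich template that motivated the use of 7-deaza-2'-deoxyguanosine in the extension reaction.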

RESULTS

Isolation of the chicken TGF-β2 gene

We isolated phage containing the chicken TGF-β2 gene from a White Leghorn chicken genomic library. Using a 281-bp Sau 3A DNA probe isolated from a simian TGF-β2 cDNA clone (positions 1184-1464; Hanks et al, 1988), chicken genomic clones coding for TGF-β2 were identified (Fig. 1). These clones were subjected to detailed restriction mapping and Southern blotting with the entire simian TGF-β2 cDNA. Homologous DNA fragments were sequenced and shown to code for exons III-VII. The remainder of the simian cDNA could not be used to screen the genomic library directly to identify genomic clones coding for exons I and II due to a high level of nonspecific binding. To overcome this problem, a DNA fragment was isolated from the 5' end of the most 5' genomic clone (Clone λTGF-l.F) and used to isolate more 5' genomic clones. This was repeated until exon II was identified by Southern blotting and DNA sequencing. To identify genomic clones coding for exon I, a chicken TGF-β2 cDNA spanning exons I and II was isolated by RT-PCR (see Materials and Methods) and was used to probe the chicken genomic library.

FIG. 1. Structural map of the chicken TGF-β2 gene derived from overlapping genomic clones. A. Restriction map of the chicken TGF-β2 gene; restriction sites shown are: H, Hind III; G, Bgl II; K, Kpn I; S, Sst I; X, Xba I. B. Diagram of the structure of the TGF-β2 gene. (Solid bars) coding regions; (open bars) untranslated regions; (solid lines) introns. A region conserved in the 3'-flanking sequences of mouse and chicken TGF-β2 genes is shaded. The direction of transcription is indicated as shown (5' to 3'). C. Alignment of genomic λ phage clones.

To establish physical linkage between the genomic clones on the 5' side of the physical map with those on the 3' side, we needed to identify single DNA fragments in chicken genomic DNA which would hybridize with probes derived from each side. Southern blots of restriction digests of chicken genomic DNA were hybridized with probes derived from genomic clones, λTGF-6.8 and λTGF-4.2, representing each side (Fig. 1). The endonucleases Sst I, Xba I, Bgl II, and Kpn I gave single, detectable DNA fragments (Fig. 2) which formed stable hybrids with both probes. The smallest restriction fragment detected was a 10-kb Sst I DNA, and this raised the possibility that these clones were within 10 kb of each other. A further genomic clone, λTGF-7.2, was isolated that hybridized to both these probes (Fig. 3). Since these probes detected single DNA fragments in the chicken genomic digests (Fig. 2), the clone λTGF-7.2 must link the genomic clones λTGF-6.8 and λTGF-4.2. Physical evidence for linkage from restriction mapping was limited in the region of overlap (Fig. 1). The sizes of specific DNA fragments observed in genomic blots (Fig. 2), however, were consistent with those deduced from the restriction map derived from cloned DNAs (Fig. 1).
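The linkage argument rests on a simple property of restriction maps: if one fragment hybridizes to both probes, no cut site for that enzyme lies between them, so the probes are no further apart than the fragment length. The sketch below illustrates that reasoning only. The site coordinates and probe positions are hypothetical values chosen for illustration (they are not taken from Fig. 1), probes are simplified to single points, and the function names are not from the paper.

```python
from bisect import bisect_right


def fragment_containing(sites, position):
    """Return the (left, right) cut sites flanking a position.

    `sites` must be sorted map coordinates (kb) for one enzyme.
    Returns None when the position lies outside the mapped sites.
    """
    index = bisect_right(sites, position)
    if index == 0 or index == len(sites):
        return None
    return sites[index - 1], sites[index]


def shared_fragment(sites, probe_5, probe_3):
    """Return the fragment carrying both probes, or None if they are separated."""
    fragment_5 = fragment_containing(sites, probe_5)
    fragment_3 = fragment_containing(sites, probe_3)
    if fragment_5 is not None and fragment_5 == fragment_3:
        return fragment_5
    return None


# Hypothetical coordinates in kb, for illustration only.
HYPOTHETICAL_SST1_SITES = [0.0, 12.0, 22.0, 40.0]
HYPOTHETICAL_HIND3_SITES = [0.0, 15.0, 30.0]
PROBE_5_KB = 13.5
PROBE_3_KB = 20.0

if __name__ == "__main__":
    for enzyme, sites in (("Sst I", HYPOTHETICAL_SST1_SITES),
                          ("Hind III", HYPOTHETICAL_HIND3_SITES)):
        fragment = shared_fragment(sites, PROBE_5_KB, PROBE_3_KB)
        if fragment is None:
            print(enzyme, ": probes lie on different fragments")
        else:
            size = fragment[1] - fragment[0]
            print(enzyme, ": shared fragment of", size,
                  "kb, so the probes are at most", size, "kb apart")
```

In this toy map the two probes share one 10-kb fragment for the first enzyme but are split by a site for the second, which mirrors how a single shared band bounds the distance between the 5' and 3' clones.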

FIG. 2. Linkage analysis of exon I and II regions by Southern hybridization. 5' and 3' probes were isolated from phages λTGF-6.8 and λTGF-4.2, respectively. Single digests of chicken genomic DNA with: H, Hind III; S, Sst I; X, Xba I; G, Bgl II; K, Kpn I. Filters were washed at high stringency: 0.1 x SSC, 65°C.

FIG. 3. Southern blot analysis of phage λTGF-7.2 DNA. The same 5' and 3' probes used in the genomic blot (Fig. 2) were used. Sal I (L) double restriction enzyme digests with: H, Hind III; S, Sst I; X, Xba I; G, Bgl II; K, Kpn I.

Sequence analysis of the chicken TGF-β2 gene

Figure 4 gives the nucleotide sequence of 16,256 bp of the 80-kb cloned region; the deduced protein sequence is shown above the exon sequences. The size and position of introns were deduced from the DNA sequence and the position of restriction sites on the restriction map (Fig. 1). The chicken TGF-β2 gene is organized into 7 exons and 6 introns spanning a 70-kb region. The boundary sequences of the intron-exon junctions conform to the GT...AG rule (Breathnach and Chambon, 1981).

Identification of TGF-β2 transcription start sites

A primer extension assay was used to determine the transcription initiation site for the chicken TGF-β2 gene. A synthetic primer complementary to the sequence +162/+191 (Fig. 4) was used to prime the synthesis of cDNA from 20 µg of chicken embryo fibroblast (CEF) total RNA (Fig. 5). A major transcription start was mapped to position -15/+1 (site I), relative to the homologous start in the human TGF-β2 promoter. We also detected weaker starts at positions -47/-40 (site II) and -214/-209 (site III).

Similarity of genome organization of TGF-β1, -β2, and -β3 genes

Table 1 compares the sizes of exons for the human TGF-β1 and TGF-β3 genes with those found in the chicken TGF-β2 gene. There is conservation of exon size and phase. The intron sizes are known only for the chicken TGF-β2 gene; however, a preliminary report does suggest that the human TGF-β1 gene is similar in size (Roberts and Sporn, 1990). The splice sites used in the human and chicken TGF-β genes are conserved, but there is some uncertainty in the position of the first, due to poor sequence alignment (Fig. 6). There is no sequence homology between the 5'- or 3'-flanking regions of the TGF-β1, -β2, or -β3 genes (unpublished observations).

Potential chicken TGF-β2 gene regulatory regions

The 5' untranslated region is long, approximately 1 kb, and very GC-rich, as is common for most other growth factor genes (Kozak, 1987). A comparison of the chicken TGF-β2 exon I sequence with the homologous region from the cDNAs of other species shows a high degree of sequence conservation only in the coding region and none in the 5' untranslated region (unpublished observations). Comparison of chicken and human TGF-β2 promoter sequences (Noma et al, 1991) revealed homology limited to the region surrounding the transcription start site (Fig. 7). The conserved sequence contains putative TATA-box, CRE, and AP-2 sequence motifs. Homologies to other cis-acting signals are found in the chicken 5'-flanking region, but these are not conserved in the human TGF-β2 gene (Fig. 4).

Unlike the 5' untranslated region, the 3' untranslated region of the TGF-β2 gene is highly conserved (Fig. 8). A comparison of mouse and chicken TGF-β2 genes reveals 1,300 bp of sequence homology throughout the 3' untranslated region. It remains to be determined if any nontranscribed, 3'-flanking regions are also conserved; at present, data are available only for the chicken gene. The overall homology between chicken and mammalian TGF-β2 mRNA in the 3' untranslated region is 66%. There are five identical segments, 10-14 bp in length, shared by chicken, human, and mouse sequences. The published mouse cDNA is longer than that described for human and reveals an additional 21-bp sequence identity in the 3' end of the mouse and chicken homology. These short blocks of sequence identity may represent signal motifs that play an important role in the expression of the TGF-β2 gene. The presence of a potential binding site for the positive transcriptional activator, NF-κB (Lenardo et al, 1989), in the third identity block supports this hypothesis.

On the basis of previously published data (Jakowlew et al, 1990), the TGF-β2 gene encodes three mRNAs, 3,900, 4,300, and 8,000 nucleotides in length. The sum of the coding and 5'-untranslated regions is approximately 2,300 nucleotides and, assuming an approximate 200-bp poly(A) tail, this predicts 3' untranslated regions of 1,400, 1,800, and 5,500 nucleotides for these three mRNAs. Polyadenylation signals are therefore predicted to map at positions 10,700, 11,100, and 14,800 in the chicken TGF-β2 gene. An ATTAAA at 10,743 is near the predicted 10,700 site. Several sites are consistent with the 8,000-nucleotide mRNA, at 14,613, 14,723, and 14,729; the site at 14,613 is followed by TG-rich sequences, often associated with active polyadenylation signals (Wickens, 1990). The conserved 3' untranslated region also codes for other potential polyadenylation signals; two are conserved in either human or mouse TGF-β2 genes (Fig. 8). The potential sites in the chicken are followed by T- and TG-rich sequences, often found downstream of functional polyadenylation signals (Wickens, 1990).
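The arithmetic behind the predicted polyadenylation sites can be written out explicitly. This is a minimal sketch under stated assumptions rather than the authors' procedure: the mRNA lengths, the 2,300-nucleotide coding plus 5'-untranslated length, the 200-nucleotide poly(A) tail, and the candidate signal positions are quoted from the text, whereas the 3' untranslated region start at position 9,300 is inferred from the quoted predictions (each predicted position minus its 3' untranslated length), and the 200-nucleotide matching window is an arbitrary illustrative choice.

```python
# Values quoted in the text.
MRNA_LENGTHS = (3900, 4300, 8000)
CODING_PLUS_5UTR = 2300
POLY_A_TAIL = 200
CANDIDATE_SIGNALS = (10743, 14613, 14723, 14729)

# Assumptions made for this sketch (not stated in the text).
UTR3_START = 9300   # inferred: predicted position minus 3' UTR length
WINDOW = 200        # arbitrary tolerance for calling a signal "near"


def predicted_utr3_lengths(mrna_lengths=MRNA_LENGTHS):
    """3' UTR length = mRNA length - (coding + 5' UTR) - poly(A) tail."""
    return [length - CODING_PLUS_5UTR - POLY_A_TAIL for length in mrna_lengths]


def predicted_polya_positions(mrna_lengths=MRNA_LENGTHS):
    """Gene coordinates at which each mRNA is expected to end."""
    return [UTR3_START + utr for utr in predicted_utr3_lengths(mrna_lengths)]


def candidates_near(position, candidates=CANDIDATE_SIGNALS, window=WINDOW):
    """Candidate signal positions within `window` nucleotides of a prediction."""
    return [site for site in candidates if abs(site - position) <= window]


if __name__ == "__main__":
    for mrna, position in zip(MRNA_LENGTHS, predicted_polya_positions()):
        print(mrna, "nt mRNA: predicted end near", position,
              "candidate signals:", candidates_near(position))
```

With these assumptions, the 3,900-nucleotide mRNA pairs with the ATTAAA at 10,743 and the 8,000-nucleotide mRNA with the three sites around 14,600 to 14,730; none of the positions listed in the text falls within the window for the 4,300-nucleotide mRNA.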


-915 -795 -675

-555 -435 -315 -195 -75

45 165

265 405 525 645 765 865

gacx:tctctag*ggtaacagtttcctactggagx;aaagagtggagtgcttttacagagttattaaggttggaaaagatgactgtgatcatttagtagtccaacccttgagggctgtgaag ccccccaggcaagtgatgggcagtgcagcacgggcagcaaccacgcggcgtggggccataggttcagtgcaaggcatttctccatggtttcagctgcagctttaggaggaaatcactcaa tagcctttttcttcggaagagtttgctttacggtacaaggctgagggggtgtgcaaggtcatttctgtagggcagagcggggcccgaggtcagagctcggcctttgcctgcacacggctc cgggctcaggctgcgatgtgcgctgcggccgaaagaggcagtggaaaaactgaattgcagacgcagcctgagaatgcggggcttgagagcttctgtcaagtgcagtgaggagagcgggag gggagagcccggagctgagcctgctttcccctggaattgccccaaacgtgcgtggaagtgaaaagtaagagtggggcagtgcaaagcccgatctgctgctaccggaggtcctcgtcctgc QCf JIJ CCCCTGCGTACTTGCACGGAACCATATCCCGœGCTTAATTGTAGCGAGGTTGCCGTTTGACCCAATTACCCTAATAAGAGGGœGAGGGGCGCTGCAOCCCarCCTTTTTGAAAGAGGAA GC CBC AAAGAGATGCGCCGGGGCTGCCCCTCTCTCCCCCAGGCTGCGGCCGATCCCTGCCGGGCTCGGAGGGGAATAAATAGGCGGGCGGCGGCGGAGCCCTCCGCTCTCTGCCCGCAGCCCCAC H

GC

TATA

ÇAf^CACTTCGCGGCGAGGCGAGCAC«J3ÎSSG£GSGGAGCCCTTATAAA^^

AP2

I

-

-

GGCCGGGGGAGGCGGGGGGCGCCGGGCGGAGCGCTAGGGCCAGCCCGCGGGGGAGGCCGGCCGGCAGAGCGGAGCTGCGGCCAGCCCCGGGCGGCGTCCGCGCCTGCGATCCGCGGGAGG OC CCGTCGGGGCGGCGGCGGCGGCGGCCTCGCTGAGTCGCGGCGGAAGGAGGAGGAAGAGGAGGGAGCTGGAAAAAGCCCCGCAGCAGCGCGCAGCGGCCCGGCCGGGGGTGGGTCGGTGCA -

TGCGGCGCGGGGCGCACGCAGCAGCCTGCGGGACGCGGCGGGCGGACGAGCAGCAGCAGCCGCCGCCGCAGCTCCGCCGCTCAGCGCCGGGGCCGGGCGCCGCCGCGGGGCCGGGGCCGC GCGCCGCCGGGCGCACACGTGCGGCCTCTCCCGCCGCCGCCGCCGCGGGGCTGAGCGCCGGCACCGCGGGGCTCGGCTCGGCTCGGCGCGGCTCGGCACGGCCTTCTGCCATCACTTCTT CCCTCCTTTTTTTTCCTTTTTTTTTTTTTTTTTTCTTTTGGGTTTTCCACCCTGCGCCGCCAGCTCGCTGCGAACAGACTCTTTTCTATTTACTTGATTTGCGAGTTTATCTATTTGCTT CGCTGGGTTTTTTTCCTCCÇCCCCTCACCCCCATCTTTTTTCTTTTTTCTTTTTTTTTTCTTTTTTTTTCCTTCCTTTTTTTTTTTCTTTTTTCTTTTTTTTAACTGCCGCTCTCTCCAC CCCCGACTCTCACTAGGCTGCTTTTTTATTTTTTTGTATCACTATTGGTATTTTTTCCACTGGCGACTCCCGTGCGTTTGTGGATTGCAAGCCGTCCTTGCAGTCCGGTTGTTCACTGCA TTTTGTCTGCAGATCCTCCCTTTGTTTGCACACATCCCCCTGTTATTTTTTAATATTAACCCCCCACCCCCCCACCCCCACCGTTGTCGTTCGTGGTTAAAGTCACTCTTTTTTCGGGGG

1005

MetHlsCysTyrLeuLeuSerValPheLeuThrLeuAspLeuAlaAlaValAlaLeuSerLeuSerThrCysSerThrLeuAspMetAspGlnPheMe GCGCTGTATTTAAGCACTTAAqATGCACTGCTATCTCCTGAGCGTGTTCCTCACCCTGGATCTGGCCGCCGTGGCTCTCAGCCTGTCTACCTGCAGCACCCTCGACATGGATCAGTTCAT tArgLysArglleGluAlalleArgGlyGlnlleLeuSerLysLeuLyaLeuThrSerProProAspGluTyrProGluProGluGluValProProGluVallleSerlleTyrAsnSe

1125

GCGCAAGAGGATCGAGGCGATCCGGGGGCAGATCCTGAGCAAGCTGAAGCTCACCAGCCCCCCGGACGAGTACCCCGAGCCCGAGGAGGTGCCCCCGGAGGTCATCTCCATCTACAACAG

1245

CACCAGGGACCTGCTGCAGGAGAAAGCCAACCACAGAGCTGCCACTTGCGAGAGGGAGCGGAGCGACGAGGAGTACTACGCCAAAGAAGTTTACAAAATAGACATGCAGCCTTTTTACCC

1365

CGAAApTAAGTGCTCCGCGGACGTGATTTCGCTTGGCGGTGCCCACCAGCCCCGTACATCACAACCCCTCCCTCTTTGCTCCTTTTGCTCTCAGCATCCCATCTGGTTGCTTATAGGGCT

1465

TCCTTTAGTCGGGGGGAATGGATATGTTGAGTGGGGGGGACTCGGTTGCCCCCCAGCTTCACGGCGTTCCGGATCCCTGCTGAGGTCACCCTATGTGGTCCCCAGGGGTGCCCGCGCACG

1605

CGCCCCGCTGCCCTGGGTGCAGCCTCTGAGAGGAGCCAACATCTTTCCGCTGCCGTTCTTTCCCTTGGAGTGTCTGGGTCGGGCAAGTTTCCATCTCTCCAGAGCTGAAAAGAAAAGGAG

1725

GTCTTAAAAAGAAAAAAAAAAAAACCAAACAAACCCACAACAAAACCCACAGCATTCCTCCTGGGTTCCTTCGCCCCTCCAGCACTCTGTCCCATGGGTTGGTGCTCAGATTGCTCGGCT

1845

TCCCCCTGTGTCCCTCCCGCCGCCCGCTCGGTGCATGAGCTCTTTCAGGTAGAGGAAAAGTTTTCCTTCTGGGTAGCCTGGTTGCTTGGAACCTGTCTCAGCCCGTCTGCCTTCTCGCTG

1965

CTGTTAACCACTTTCTGGCGATCGGTGCCACATTTTGTGCAGTGGAAGGGTACAAACGCACTGTGCCGACCGGATCCACTGAGCGCGACTAGGAATGGCAACAACGCGCCTTTTTGTTTG

2085

GGTTGTGAGATTTAAGTGTATAGGGAGAAAAAAAGGGAAAAAAAAAAAAAAAAAAAAAGAGAAAGGGAAAAACACCGCACTCCCTTAAAAGAATCTCAATCATCATCGTGCCGAGGAAAG

rThrArgAapLeuLeuGlnGluLysAlaAsnHisArgAlaAlaThrCysGluArgGluArgSerAspGluGluTyrTyrAlaLyaGluValTyrLyalleAapMetGlnProPheTyrPr oGluAl

2205

TGAAATACCAGAAGCAGAAAGTAACTCCTCCGCTCGCGTACCCAGAGCCGCATCCCCAGTATTCCGCCTCGCTCCCTTCCCTCTGTCTCCCCTCCCACCACTGGACGCCCCATCTCCTCA

2325

CCGGCTGGGTCGGCAAAGCACACTCCTCTGGCTCACACAGATAATGCAGCCATCTTGGTAACTTTATCCTAGAGTTTCTAATTGCGTTTTCTCTTATTTATTTTTAGTTGAGCGAGTTGT

2445

AGCTTTGAGGCTTGCTCTGTCTTGCAGTGTATCGTCCCAGGATGGCAGCCAAGGCATCCGAGCCCCGCGCTGGGCGAGCTGCCTCCCTTTGTTGTTAAATGACAGCATCAGGTACGGGAC

2565

ACCCTGGCAGTCAGAGCCGTTTTAGTATTGTAATTATGGCACCATCCCTAATAGCCACCATAAATACACTCCACCCTTGTTACCCAGAGGAAGAAATACTTTTTTTTCTTTTTCTTTCTT

2665

TTTTTGTAAGGAAAAAAATATATATATATATACATTTGGAGCCAAGAGGCTATGAGCAAAGAGGTTCAGTGTAATTTGCAGCGATAGCACCTAGGTCGGTGTTCAGGTCAGTGGGGTTTT

2805

GCTGGGTTTTGTTGAAAGCCAGACGCTTCTGTTTTGGGACCTCCTGGGTCGTGTCCTGAGTGGAGCAGTGGGCACCAATGGGCAGGGCTAGAATGGCTGAGAGCCAAATGAGGGACGGTC

2925

TCTGGAGGTGAAGTGCCTCAGTCCTGACCATGCTCCATGTGTCCCCTGCTGTGAGGATACTCCCAAAGACCAACGGTCCAGAAAGCGCAGTTACCTGCACTGCGCCTCTTGGGTGGTGTG

3045

GCAAAAGAGGATCTAGTTATAAAGCCTTGTCGGAGGCCAGGTTTGCACTGAGGGATCAATCCAGGCTTTGGACGAAGAGAGGAGATTGCAGGTACCTACCAACAGAGTGCATCACCAAGT

NF1

3165

TTGGGATCAGAGGGCCCCGCGGCAGCAGATCCCGTCCTGCTGTGCCGGTCGCCGCGAACATCCCAGTAAGTGGTTAACTGCTTTTCCACTGCTCTCGAG.37.0 kb.T

3285

3645

CTGCTGAAAATCTTCCCCCCCCACCCCCCAGCCTCCACAATACAATTTTCATCTTCTGTTTCTTTGTACATTTCAAAATGTAAGAGGAAAGGAAACCTTTCCCTCAGCTTGAATGGAGAT TTTATCAGGGCCTTTGAGTCATTTTGACTCATCCCTGAATTTGGCTCACCGTGCCCACTCTGCAATTTGAAGAAAATCTTGCCAAACAACAGGGTGTGGAGAATGTCACACATACAGAGG ATGTTCAGAGATTTCTGCACATGGTTTTCTGTTTTTGCTTTTGGCCAGAGTTAGGTGGAACAGCCTCAGATGATCACTGAACCTTCAGGGTAATCCGAATTGTGCAAAAGTACTTTGCTG CTTAACCTCAGTCAGGGTTTACAGCTGCAAAGGAGGAAAGGATTGTTTTGAAACCTAAACTGATACTTCTGATATCATTACTGAGGACTTGATTCTGTTTCTCTGTTGTTTTTTTTTCCT

3765

GATTTCCCTGCAACGTATCATGGGCAGTTGGTTACATTTATGACTTCTACTGTTTGCAAATGGTCAATATTCATGTATTTCTATTCGTCTACTATCTGCTTGCTTTTTTCTTCCTTTTCT

3885 4125

TCTTTCTCTTCTATTCCTGCCCCTGTTATAATGAGAATGAATATTTTATTCCAGTGAAATCCTTAATAACTAAATAGAAAACAAAACCTGAAGCTGGCTAATTTATATGCCATGCCCTGG GAGAAAATGGATGTTAGAGAACATACAGATCAAAGACTTCCTGTTACTTGTCTTTTTAGTATATCTCTTGTGAATACCTTACATGTCTGAAGGCATTTGCCAGCTACAAACTGTTAGGAT TTTTTTCCAATTTGTAAGCATCCAGGACTTTCCACCACTTCGTCAAAAGAGTGGAGTTAAATGCATGCCTTGAAGTAGAGATGGAAATTATTGTTAATCTCTATTAATCTGCTATTCCTG

4245

TAAICAAACACATGTGGAGTTGGTTTTTTAGGAGTTAAATTGTTAATGATACTAAATTGGCCATTGGAAACTGAATCATGCTACAGATATTTCCCTTAGTGCCTTCTGGCTGTCTTGTAT

3405

3525

4005

JsnAlal eProProSerTyrTyrSerLeuTyrPheArglleValArgPheAap ATGCCATCCCACCAAGCTATTACAGCCTTTACTTCAGAATTGTTAGATTTGAC

ValSerAlaMetGluLysAsnAlaSerAsnLeuvalLysAlaGluPheArgValPheArgLeuGlnAanSerLysAlaArgValSerGluGlnArglleGluLeuTyrGlr

4965

GTCTCGGCGATGGAGAAAAATGCGTCCAACTTGGTGAAGGCTGAGTTCAGGGTCTTCCGCCTGCAGAACTCAAAGGCGCGGGTCTCGGAGCAGCGGATAGAGCTGTACCACGTAGGGCTC CAGCCGTTACTGCTGTCTCTCAAAAGTAGCTAATTTCCTGTGTCTCTCCTGGAAGCAGCCTCATTCAAAGACTTATGAAGCTGAAATACGCACACTTTTCTAGAAAGCATGCCTTAGGAC AP3 TCCAGCATTCAGCTTGACTTCTGTATGGTGGCCCAGGGCCTGGAGCAAGGCAGTGGAAAATTCAGGGTTTGTGGAAAGGGGAAGAAAATGATCTAAAGCCTTTTTTCTTTCTTTTTTTTT TTCTTTTTCTTTTTTTTTTTTTTTTTACTCTGGAGTGTCATTAAGACTTATCTCTCTTTTGTTTTCCAATTGTACTTTCCTGTTCAGCTAGTGACATCAGATGATTTATGAACTCGATCA GATGATGTTGTCACTGAAGAAGCAAATTCAGTCTTGCATAGATACCATCGCAAACCTAAAAAACTTCCTCTGTGTGTAAACAAACTCAAGCAGAGGCTCCTCTAGCTAGATAGTGAGTCA

5085

GCTTGCACACATGCTAAATCCATTAAACTCAACTGCAAGGATTCTGTGCGTTTGTTTGTGGAAATACTGGCTAGAGCATATTTCGTGTAAAGCATAGTTCATCTTGCTGAGAAGCTCAAC

5205

AGCAAAAAGGGATTTGACTCCCTGCAGTTATTTACTTGTTCAGCTGAAAACCTGCTAGGTAGAAAGCTATCTTGTAAAAAAAACTATACCAAACCCCTCTTTAGCTCTTTTTTTGCTACC

5325

GTTATTTGCTATATCTGAGCCCTGCTGTAAGTATGCGGCACCACAGAGCCTGGGTCAGATTCTGTGGGTGCAAATTTGAGTTAGTCATCACCCTTACTTTCAAAGTTCTTATTTTTCTGC

5445

CTGATTTGCTTATGCACAAACCAAGCTGGGAGTAGGTAAAGCAGAATTTGTGCTCGTGTTTCTTGAAGTGACGGTATAAGCCAGACTGAGAGACTGAGGGCAGGAAATGATGCACGGGGG

5565

TTTAAACATGATTTAA.14.5 kb.TTGAGCGTTATATTCTCGCTCTTTATGCTGGTGGTCCACTTTACTGCAAATCTGCTAAATATTAAACTTGTAGACAATCATATT

4485 4605

4725 4645

ualLeuLysSerLysGluLeuSerSerProGlyGlnArgTyrlleAspSerLysValValLysThrArgAlaGluGlyGluTrpLeuSerPheAspValThrGlu OCTAi

TGTTATTTCTTGCAGpTTCTGAAATCCAAAGAATTATCATCACCAGGACAGCGTTACATTGACAGCAAAGTGGTTAAAACAAGAGCTGAAGGAGAATGGTTGTCTTTTGATGTCACTGAG

AlaValHiaGluTrpLeuHiaHisArgA] GCTGTACATGAATGGCTCCATCACAGAGGTGACAAGCAGTGTTTTGTCTTCACTCCTGCCCCAAGTCGTCATCTATTAACTCAGCAGTATTTCAGGCTCATTCCACTCGTAAAAACCTAT

KpArgAsnLeuGlyPheLyalleSerLeuHiaCysProCysCysThrPheValProSerAsnAsnTyrllelleProAsnLyaSerGluGluProG

tccacttactttcatgttttgcagIacaggaaccttggatttaagataagcttacattgtccatgctgcacctttgtgccttccaataattatatcatcccaaataaaagtgaggagcctg luAlaArgPheAlad

aagcaagatttgcagctaacaactgaaaaatatatatataattttttactgctgtgtaattgactgggggggggggggggg.1.9 kb.tgtggctcagctttcccag

6165

TAAATGTGCTCAGTTGACGCCATCGTTGCCCAGGTTCTGACAGCATTTTTTTAAAGAGATGGTTTATTTAATTTGTGAAGAAAATAGCAACATAATGTTTATCACCAAAATGTATTTTTT

LLylleAspAapTyrThrTyrSerSerGlyAspValLysAlaLeuLysSerAanArgLysLyaTyrSerGlyLyaThrProHiaLeuLeuLeuMetLeuLeuProSerTyrArgLeu GAAGGTATTGATGACTACACATATTCCAGTGGTGATGTGAAAGCTTTAAAGTCCAATAGGAAAAAATACAGTGGGAAGACCCCACATCTTCTGCTAATGTTGTTACCCTCCTACAGACTT

GluSerGlnGlnProSerArgArgLyaLysArgAlaLeuAspAlaAlaTyrCysPheAd 6405

GAGTCGCAACAG^CCAGTCGGCGGAAGAAGCGTGCTCTAGATGCTGCCTATTGTTTTAqGTAACTATGCATCTATATTTAGTATACTTTACATAACTGTGCCAGCTTCTGGATTGTAGTG

6525

CAGTTGTTGTCATGTAAGCAGACATTTCTTTATGGTAGGCGTTTTACGGAGTGTGGTCCTTGTTCACAGACCCACAGTGTCCATACCCAAAGGTATCTGCCCAGTGCGCTCCACTCCCAT

6645

GGAGACAGTGCCTGAAGAGGAGGAAGGTTAGGACCTTGGAATGCAGGAACTGAGTTGCACCTGGGGTGATTCGCAAGCCTATCAGCTTCATAGTTATAGCCAAATTATCACCTCCACTGT

6765

GTTGAGAGGTGACTCAGTAAGACCTGGAGAAAAATATTTCAGTGATCTTCTATTAGCTTTCTATAAGCAAGTTTTGCCGTGTCTAAGCATCTCATTTCATTTACAGTATAAGCCACAACA

6885

GATGTGATTAGGTTGTGAATTTTTGGGACTCTAACTTGTTTTTAAACCATCATTCAGGGAGCAAACAACTGGATTATTTTTATTCTCTCTGTTTCTTTGTTACGGCCAATGCATTTGTCT

7 005

GTATTTTATTTTCCAd ¡AATGTGCAGGATAATTGCTGCCTGCGTCCACTTTATATTGACTTCAAGAGGGATCTTGGCTGGAAATGGATTCATGAACCTAAAGGATATCATGCCAATTTCT

7125

GTGCAGGAGCCTGCCCATATTTATGGAGCTCAGATACTCAGCACAGCAGflGTAAGTAAAGATTTTGGTGCAGAGAGTAGCCTTAAGTTGCGCCAGAGGAGATTCAGGGTGGATATGAGGA

7245

AAAGTTTCTTCTTGGAAAGAGAGGTGAAGTATTGGCACAGGCTGCCCAGGAAGTTGATGGAGTCACTGTCACTGGAGGTGTTTAAGAAAAGGGTAGATGTGGCACTGAGGGACATGGGAT

7365

AGCTTCTATTTTGTAACTGCAGGGTGAAATATGTTTCTAGGACTGTCCTTGGTATGAGAGTCTAAAGCTTCCATTATGTCTCAGTAATAAAAAATATGACCTAAGAGGACTTGCATGTGG

7485

CCTCAGACTAGAGTTGAAGCTGAAGTCATGTATTGATCCAGAAATATATTTTTTTAATATGGTCATCAATACAATGAATGAATGTATCTAATACTCTTTGTGACCATGAGCATGATACAG

7 605

CTTTGGAAGGGACCAAAGTTGAAAAGGGACCAGTCAGTAAAGAACATGAGTACAAAATGGCAATGAGTTCCCTCAGCCAAGGCACCTAGGGAAACACAAGCAACATGATTACGGGGCTGA

tAsnValGlnAspAanCysCysLeuArgProLeuTyrlleAspPheLysArgAapLeuGlyTrpLysTrpIleHisGluProLysGlyTyrHisAlaAsnPheC

ysAlaGlyAlaCysProTyrLeuTrpSerSerAspThrGlnHiaSerArd

TSTl

7 725

GTCTTACCATCCAATTGTCACTTACTTTATGAATTGCTCTTTTCATATTGTTAGTGACCTGCTACCATGCTGAATTTCATGATGATATAGGGATAGATGTACTTCTTCAAGAGAGGCATT

7845

CATTCTTGTACATTGAAAATTTATTTGAGCAAGATTGACCTTGTGTCTGGGATGCTTATGAACAGAGAGAGGGTTGGGATACAATCTTGTAGAGATTGGATGGGTTATTCTAGCCAAGTG

7965

CCTTAGAGTGAGTGATATTCTGCTGTGAAGTGCTTATTTTTCTTAACTGTCAATAAGGATGTCTAAACTGTAAATGCATGAGGCAGGCATTTGCACTGCAGTTGTCATAAAGAAAGAATT

8085

TCACCCTCAGGTCCATACTAAGGATGATGTCTCCTTTTTTACAAACCTTCTTTAAAATATCAGATCAATGTCAGCCTCTTGTTCAAGAGAGAGATTGCTTTCCTTTTGGTCTGTGAAGGC

8205

ACAAGGTACAGACCTAAGTGTAGGTATCTTTTGTAACTGACAGTGGCTGTGGAATTGTAGTATCAGCAGTGGCTGTCAGCTGACAGTGGTAAATATTTTTTAATAGAACCTATGTTTCCA

8325

TGCTTGTGTGGAGGAGTAGTAGTGTACACTGTTTGGAGCAGCACTTGTTTTTAAAGGAAAGAGGAAGAATAACTGTCATGGTCACAAATATGTGGTTTGGATAGGTACCCAAATAGGCAG

8445

AGGTTGCTTAATAAAGTAGTTAGTGTTGCTGTCCTGAACAGGTCTTCTGTCTCTAAAAGCCGTAAATGTTGTGTTGAAGCTCTAGGAAAAGGTCATTTAGTGTTTAGCTAACACATTAAT

8565

GGCATCCTCGGAAATACCTGTGAATCATCAGTGGGTAACCCTACATACTTCCCTTGCAGAGAACCATGCATATGAATCCAATGTGACATAAAGTACCAGTGCTGCTGGACCATCATCAGC

8 685

CACAGCTGATCCTCAGTTGGACTACTGAGGAAGTAGTCATGGTTCTTGAAGGGTGGCATTAGTGGATTTACAGAAAAGTATCCAGGGATGAGAAGCAAAGCAGTACTCATGTACTGAAGG

8805

ACTCTCCTCTCCCTTCGGTGCTTTTGTAAAACCAAAATCCTATAACAGTAAATGACAGTCTGAAATTGCAGATAAAAGTCATCCAGTTTCTTCCTTCCCTATCAGTCAGCATGTGTGCTG

8 925

CTTTGAAGACTGGCGCTTTCAAGTCAGGCTGCAGGCACATCCATGTTTTAAGATTTAAGGCTGTGGGGATAAAATATTAGGAAGCTCAGACGGTTTGTGACACCAGGTTGAACTGGAGAA

9045

GCATGCAAAAATATTCAAATCCAGGCATCTTCTGATTTCACTGGGCTTTTTCCCAGGCACTGGAGGAGGAAGTGATGCAGAAAGGAATTTGCTTTCCAGAGTCAAGCAGGCAGAGGAAAT

AP4

p/alLeuSerLeuTyrAsnThrlleAsnProGluAlaSerAlaSerProCysCysValSerGlnAspLeuGluProLeuThrlleLe CCATATTAATCTCTTATTTCCTTTCTCCTTCTAGpTACTCAGCTTGTATAACACCATAAATCCAGAAGCTTCTGCCTCTCCGTGCTGCGTGTCCCAGGATTTAGAGCCCCTCACCATCCT

uTyrTyrlleGlyLysThrProLysIleGluGlnLeuSerAsnMetlleValLyaSerCyaLysCysSed***

CTACTACATTG^CAAAACACCCAAAATCGAACAGCTGTCCAACATGATTGTAAAGTCTTGCAAATGCAGqTAAAGGATACCCCAGAAACAGCGTGACGTGTGTGTATCAAAAAAAAJJAGT 9405

AAAJUU^TAAAAATAATAACCAAGGAAAAAAAGAAAGAGAGAGCGAGGAGGAAAGAAAGCAJU^CGAAAACCCAGG

9525

97 65

AAGCGGTACTAGTCTGAACTGTTTGAAAGTTTGTTTTTGTCTTTTTGTTTTTAAACTGGCATCTGAAGCAAAACATTGAAGGCCTTTATCTTACATTTCACCTACACATAGTGTGAGATA NFKB GACAAGAAGCAAAGTTAAAGCAAAAAAAAAAGAAACCTTTTAAAAAATAAACACTGGAAGAAATTTGTTAGTGTTACTATGTGAAAGGAAAAAAACAGGAAAATCCCATGAAGTGGAGTT GASNE AT GCTGTATCTGTCCCGTGCCTTACTTGATCTCTCTGTATTGTTATGCAATAGGCATCCTACCCATTCCTCTTAGTTGTAGAGTTAACAGTGCATTATTTATTTGTGTGTAAAAACTATCAA

9645

9885

ATGAACATTTCCATATCGCCATTGGAAAAGGAAAACCAGCAAATGTGGACAAAATGGAGACCAAATAAGCTGCCAGAAACACATAGAAAGCCTAAAGGAACGCAAGCTCAAAAGTGTCAG

10005

AAAGTAATGGTT AGTTTAGTTGGATTTAAATAGAAATCAGTT ATTACACT ATTGAGAAACTCTGCCTTTAAAATACCTTATTTTTC ATGCCAGCTGCCTGGAGAGGCTTCTTGT AAGGTC

10125

tcaaacccttgttttaaatactgaataacttactiaataaaggacactctgtttcaqtctcaaagacaagtctataagatttttttttttttttttactgtaaatgatttaaatgtcagtt

[FIG. 8, cited in the text: alignment of human (Hu), mouse (Mu), and chicken (Ck) TGF-β2 sequences covering the end of the coding region and the 3' untranslated region. The aligned rows and conservation marks did not survive extraction and are not reproduced here.]

CTT-ACCACATTTTTAATATffCTGTAATÄATG-GTTATGATTTAGATTGAACTTAAAT

ATTGCCCffTGGAAAAffAAAA

-CAGGTGT—ATAAAGTG

-CAGGTGT—ATAAA-TC ATTACCCfTGGAAAAfcAAAATCAAATGAACATTTCCATATCGCCAfrTGGAAAAEGAAAACCAGCAAATGTGGACAAAATG

FIG. 8. Species comparison of TGF-02 3' untranslated regions. The comparisons were made using CLUSTAL (Higgins and Sharp, 1989). Dashes were introduced for maximal alignment. The coding portion of exon 7 is outlined. Blocks of sequence identity greater than 7 bp in length are boxed when present in all species. Highly conserved identity blocks are highlighted when greater than 14 bp in at least two species or greater than 10 bp in all three species. Potential poly(A) signals are underlined (Wickens, 1990). Ck, chicken; Mu, mouse (Miller et al, 1989); Hu, human (Madisen et al, 1988).
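The boxing criteria in the legend can be illustrated with a minimal sketch. It assumes the aligned sequences are available as equal-length strings with dashes for gaps. The function name `identity_blocks` and the toy sequences are hypothetical and are not taken from the figure. This is not the procedure used to prepare the figure, only one way to locate runs of identical, ungapped columns longer than a chosen threshold.

```python
from typing import List, Tuple


def identity_blocks(aligned: List[str], min_len: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges in which every aligned
    sequence carries the same non-gap base for more than min_len columns."""
    if not aligned or len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    width = len(aligned[0])
    blocks: List[Tuple[int, int]] = []
    start = None
    # Iterate one column past the end so a block touching the end is closed.
    for col in range(width + 1):
        same = (
            col < width
            and aligned[0][col] != "-"
            and all(s[col].upper() == aligned[0][col].upper() for s in aligned)
        )
        if same and start is None:
            start = col
        elif not same and start is not None:
            if col - start > min_len:  # "greater than min_len bp"
                blocks.append((start, col))
            start = None
    return blocks


# Toy aligned sequences (hypothetical, for illustration only).
hu = "ACGTACGTACTT-GA"
mu = "ACGTACGTACAT-GA"
ck = "ACGTACGTACTTCGA"

all_three = identity_blocks([hu, mu, ck], min_len=7)
hu_ck_only = identity_blocks([hu, ck], min_len=7)
```

With a threshold of 7, the same function applied to all three sequences gives the blocks that would be boxed, and applied to pairs of sequences with a threshold of 14 it gives candidates for the "at least two species" highlighting rule.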

ACKNOWLEDGMENTS

We wish to thank Rosemary Armour for the African green monkey cDNA, Helen Sang for the chicken genomic library, E. Armstrong, R.K. Field, and N. Russell for graphic work, and J.M. Boswell and A. Law for their helpful comments.

The nucleotide sequence data reported will appear in the EMBL, GenBank, and DDBJ Nucleotide Sequence Databases under the accession numbers X58071, X59080, X59081, and X59082.

REFERENCES

ATWATER, J.A., WISDOM, R., and VERMA, I.M. (1990). Regulated mRNA stability. Annu. Rev. Genet. 24, 519-541.

BANKIER, A.T., WESTON, K.M., and BARRELL, B.G. (1987). Random cloning and sequencing by the M13/dideoxynucleotide chain termination method. Methods Enzymol. 155, 51-93.

BARTON, D.E., FOELLMER, B.E., DU, J., TAMM, J., DERYNCK, R., and FRANCKE, U. (1988). Chromosomal mapping of genes for transforming growth factors β1 and β3 in man and mouse: Dispersion of TGF-β gene family. Oncogene Res. 3, 323-331.

BASCOM, C.C., WOLFSHOHL, J.R., COFFEY, JR., R.J., MADISEN, L., WEBB, N.R., PURCHIO, A.R., DERYNCK, R., and MOSES, H.L. (1989). Complex regulation of transforming growth factor β1, β2 and β3 mRNA expression in mouse fibroblasts and keratinocytes by transforming growth factors β1 and β2. Mol. Cell. Biol. 9, 5508-5515.

BISHOP, J.F., RINAUDO, M.S., RITTER, J.K., CHANG, A.C.-Y., CONANT, K., and GEHLERT, D.R. (1990). A putative AP-2 binding site in the 5' flanking region of the mouse POMC gene. FEBS Letters 264, 125-129.

BLACKMAN, R.K., SANICOLA, M., RAFTERY, L.A., GILLEVET, T., and GELBART, W.M. (1991). An extensive 3' cis-regulatory region directs the imaginal disk expression of decapentaplegic, a member of the TGF-β family in Drosophila. Development 111, 657-665.

BREATHNACH, R., and CHAMBON, P. (1981). Organisation and expression of eukaryotic split genes coding for proteins. Annu. Rev. Biochem. 50, 349-383.

CRAIK, C.S., SPRANG, S., FLETTERICK, R., and RUTTER, W.J. (1982). Intron-exon splice junctions map at protein surfaces. Nature 299, 180-182.

DERYNCK, R., RHEE, L., CHEN, E.Y., and TILBURG, A.V. (1987). Intron-exon structure of the human transforming growth factor-β precursor gene. Nucleic Acids Res. 15, 3188-3189.

DERYNCK, R., LINDQUIST, P.B., LEE, A., WEN, D., TAMM, J., GRAYCAR, J.L., RHEE, L., MASON, A.J., MILLER, D.A., COFFEY, R.J., MOSES, H.L., and CHEN, E.Y. (1988). A new type of transforming growth factor-β, TGF-β3. EMBO J. 7, 3737-3743.

DEVEREUX, J., HAEBERLI, P., and SMITHIES, O. (1984). A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Res. 12, 387-395.

DICKINSON, M.E., KOBRIN, M.S., SILAN, C.M., KINGSLEY, D.M., JUSTICE, M.J., MILLER, D.A., CECI, J.D., LOCK, L.F., LEE, A., BUCHBERG, A.M., SIRACUSA, L.D., LYONS, K.M., DERYNCK, R., HOGAN, B.L.M., COPELAND, N.G., and JENKINS, N.A. (1990). Chromosomal localization of seven members of the murine TGF-β superfamily suggests close linkage to several morphogenetic mutant loci. Genomics 6, 505-520.

DONOGHUE, M., ERNST, H., WENTWORTH, B., NADAL-GINARD, B., and ROSENTHAL, N. (1988). A muscle-specific enhancer is located at the 3' end of the myosin light-chain 1/3 gene locus. Genes & Dev. 2, 1779-1790.

DYNAN, W.S., and TJIAN, R. (1985). Control of eukaryotic messenger RNA synthesis by sequence-specific DNA-binding proteins. Nature 316, 774-778.

EMERSON, B.M., NICKOL, J.M., JACKSON, P.D., and FELSENFELD, G. (1987). Analysis of the tissue-specific enhancer at the 3' end of the chicken adult β-globin gene. Proc. Natl. Acad. Sci. USA 84, 4786-4790.

HANKS, S.K., ARMOUR, R., BALDWIN, J.H., MALDONADO, F., SPIESS, J., and HOLLEY, R.W. (1988). Amino acid sequence of the BSC-1 cell growth inhibitor (polyergin) deduced from the nucleotide sequence of the cDNA. Proc. Natl. Acad. Sci. USA 85, 79-82.

HE, X., GERRERO, R., SIMMONS, D.M., PARK, R.E., LIN, C.R., SWANSON, L.W., and ROSENFELD, M.G. (1991). Tst-1, a member of the POU domain gene family, binds the promoter of the gene encoding the cell surface adhesion molecule Po. Mol. Cell. Biol. 11, 1739-1744.

HIGGINS, D.G., and SHARP, P.M. (1989). Fast and sensitive multiple sequence alignments on a microcomputer. Comput. Applic. Biosci. 5, 151-153.

IMAGAWA, M., CHIU, R., and KARIN, M. (1987). Transcription factor AP-2 mediates induction by two different signal-transduction pathways: Protein kinase C and cAMP. Cell 51, 251-260.

JAKOWLEW, S.B., DILLARD, P.J., SPORN, M.B., and ROBERTS, A.B. (1990). Complementary deoxyribonucleic acid cloning of an mRNA encoding transforming growth factor-β2 from chicken embryo chondrocytes. Growth Factors 2, 123-133.

KAGEYAMA, R., and PASTAN, I. (1989). Molecular cloning and characterization of a human DNA binding factor that represses transcription. Cell 59, 815-825.

KOZAK, M. (1987). An analysis of 5'-noncoding sequences from 699 vertebrate messenger RNAs. Nucleic Acids Res. 15, 8125-8148.

LEE, S.-J. (1990). Identification of a novel member (GDF-1) of the transforming growth factor-β superfamily. Mol. Endocrinol. 4, 1034-1040.

LENARDO, M.J., and BALTIMORE, D. (1989). NF-κB: A pleiotropic mediator of inducible and tissue-specific gene control. Cell 58, 227-229.

MADISEN, L., WEBB, N.R., ROSE, T.M., MARQUARDT, H., IKEDA, T., TWARDZIK, D., SEYEDIN, S., and PURCHIO, A.F. (1988). Transforming growth factor-β2: cDNA cloning and sequence analysis. DNA 7, 1-8.

MERMOD, N., WILLIAMS, T.J., and TJIAN, R. (1988). Enhancer binding factors AP-4 and AP-1 act in concert to activate SV40 late transcription in vitro. Nature 332, 557-561.

MILLER, D.A., LEE, A., PELTON, R.W., CHEN, E.Y., MOSES, H.L., and DERYNCK, R. (1989). Murine transforming growth factor-β2 cDNA sequence and expression in adult tissues and embryos. Mol. Endocrinol. 3, 1108-1114.

MIYATA, T., YASUNAGA, T., and NISHIDA, T. (1980). Nucleotide sequence divergence and functional constraint in mRNA evolution. Proc. Natl. Acad. Sci. USA 77, 7328-7332.

MONTMINY, M.R., SEVARINO, K.A., WAGNER, J.A., MANDEL, G., and GOODMAN, R.H. (1986). Identification of a cyclic AMP-responsive element within the rat somatostatin gene. Proc. Natl. Acad. Sci. USA 83, 6682-6686.

NOMA, T., GLICK, A.B., GEISER, A.G., O'REILLY, M.A., MILLER, J., ROBERTS, A.B., and SPORN, M.B. (1991). Molecular cloning and structure of the human transforming growth factor-β2 gene promoter. Growth Factors (in press).

RACKWITZ, H.-R., ZEHETNER, G., FRISCHAUF, A.-M., and LEHRACH, H. (1984). Rapid restriction mapping of DNA cloned in lambda phage vectors. Gene 30, 195-200.

RAPPOLEE, D.A., WANG, A., MARK, D., and WERB, Z. (1989). Novel method for studying mRNA phenotypes in single or small numbers of cells. J. Cell. Biochem. 39, 1-11.

REBBERT, M.L., BHATIA-DEY, N., and DAWID, I.B. (1990). The sequence of TGF-β2 from Xenopus laevis. Nucleic Acids Res. 18, 2185.

RIGAUD, G., GRANGE, T., and PICTET, R. (1987). The use of NaOH as transfer solution of DNA onto nylon membrane decreases the hybridisation efficiency. Nucleic Acids Res. 15, 857.

ROBERTS, A.B., and SPORN, M.B. (1990). The transforming growth factor-βs. In Peptide Growth Factors and Their Receptors, Handbook of Experimental Pharmacology. M.B. Sporn and A.B. Roberts, eds. (Springer Verlag, Heidelberg) vol. 95, pp. 419-472.

SAMBROOK, J., FRITSCH, E.F., and MANIATIS, T. (1989). Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Laboratory, Cold Spring Harbor, NY).

SHARP, P. (1981). Speculations on RNA splicing. Cell 23, 643-646.

SHAW, G., and KAMEN, R. (1986). A conserved AU sequence from the 3' untranslated region of GM-CSF mRNA mediates selective RNA degradation. Cell 46, 659-667.

STADEN, R. (1982). Automation of the computer handling of the gel reading data produced by the shotgun method of DNA sequencing. Nucleic Acids Res. 10, 4731-4751.

TABAS, J.A., ZASLOFF, M., WASMUTH, J.J., EMANUEL, B.S., ALTHERR, M.R., McPHERSON, J.D., WOZNEY, J.M., and KAPLAN, F.S. (1991). Bone morphogenetic protein: Chromosomal localization of human genes for BMP1, BMP2A and BMP3. Genomics 9, 283-289.

WANG, T.C., and BRAND, S.J. (1990). Islet cell-specific regulatory domain in the gastrin promoter contains adjacent positive and negative DNA elements. J. Biol. Chem. 265, 8908-8914.

WEBB, N.R., MADISEN, L., ROSE, T.M., and PURCHIO, A.F. (1988). Structural and sequence analysis of TGF-β2 cDNA clones predicts two different precursor proteins produced by alternative mRNA splicing. DNA 7, 493-497.

WICKENS, M. (1990). How the messenger got its tail: Addition of poly(A) in the nucleus. Trends Biochem. Sci. 15, 277-281.

WINGENDER, E. (1988). Compilation of transcription regulating proteins. Nucleic Acids Res. 16, 1879-1902.

YAFFE, D., NUDEL, U., MAYER, Y., and NEUMAN, S. (1985). Highly conserved sequences in the 3' untranslated region of mRNAs coding for homologous proteins in distantly related species. Nucleic Acids Res. 13, 3723-3737.

Address reprint requests to:
Dr. David W. Burt
Department of Cellular and Molecular Biology
AFRC Institute of Animal Physiology and Genetics Research
Edinburgh Research Station
Roslin, Midlothian EH25 9PS, UK

Received for publication August 26, 1991.
