GENOMICS 11, 981-990 (1991)

Genomic Cloning, Complete Nucleotide Sequence, and Structure of the Human Gene Encoding the Major Intrinsic Protein (MIP) of the Lens

M. MICHELE PISANO¹ AND ANA B. CHEPELINSKY²

Laboratory of Molecular and Developmental Biology, National Eye Institute, National Institutes of Health, Bethesda, Maryland 20892

Received May 21, 1991; revised July 20, 1991

Major intrinsic protein (MIP, also called MP26) is the predominant fiber cell membrane protein of the ocular lens. MIP has been suggested to play a role in cell-cell communication in the lens. Its expression is tissue-specific and developmentally regulated. We have isolated and characterized the human gene encoding MIP and report here its genomic structure and entire nucleotide sequence. The gene is 3.6 kb, contains four exons separated by introns ranging in size from 0.4 to 1.6 kb, and is present in single copy per haploid human genome. Primer extension of human lens RNA indicates that transcription of the gene initiates from a single site 26 nt downstream from the TATA box. Three complete Alu repetitive elements are found in tandem in the 5’-flanking region of the gene, and a single complete Alu sequence is present in the third intron. The interspecies comparisons of the MIP gene coding sequence and homologies to other members of a putative transmembrane channel protein superfamily are also discussed. © 1991 Academic Press, Inc.

INTRODUCTION

The vertebrate ocular lens constitutes an excellent paradigm of developmentally regulated and tissue-specific gene expression (see McAvoy, 1980; Wistow and Piatigorsky, 1988, for reviews). The lens, which is derived from the embryonic surface ectoderm, is composed of an anterior layer of epithelial cells which continually differentiate into fiber cells at the equatorial epithelium (McAvoy, 1980). Since the differentiation of epithelial cells into fiber cells entails extensive cell elongation with a resultant 1000-fold increase in the elaboration of new plasma membrane by elongating cells, proper membrane biosynthesis and physiology are of utmost importance in maintaining the transparent state of the lens (Alcala and Maisel, 1985). The crystallins, or major water-soluble proteins of the lens, account for 80-90% of the dry weight of the tissue (Wistow and Piatigorsky, 1988), while the remaining 10-20% of lens dry weight is composed of various cytoskeletal and membrane proteins, collectively referred to as the water-insoluble fraction (Alcala and Maisel, 1985). Major intrinsic protein (MIP, also called MP26),³ the principal constituent of this fraction, constitutes up to 80% of total lens membrane protein (Broekhuyse et al., 1976). While significant inroads have been made in delineating the molecular basis of crystallin diversity and differential expression (see Wistow and Piatigorsky, 1988; Piatigorsky and Zelenka, 1991, for reviews), far fewer insights regarding the molecular biological basis of lens membrane protein structure and function have been gained. Thus, we have initiated a molecular characterization of the MIP gene, which encodes the predominant intrinsic protein of the lens fiber cell membrane.

MIP has a molecular mass of 28,200 Da based on its deduced amino acid sequence (Gorin et al., 1984) and appears to be conserved throughout evolution as judged by immunological cross-reactivity and peptide mapping (Bouman and Broekhuyse, 1981; Takemoto et al., 1981). This integral membrane protein not only appears to be lens-specific, but is restricted to lens fiber cells (Paul and Goodenough, 1983; Watanabe et al., 1989; Yancey et al., 1988). Both MIP mRNA and protein are first detectable in the primary lens fibers at the early lens vesicle stage and are subsequently found in secondary lens fibers as they differentiate from epithelial cells (Yancey et al., 1988; Watanabe et al., 1989). Hence,

¹ Present address: Department of Anatomy, Jefferson Medical College, 1020 Locust Street, Philadelphia, PA 19107.
² To whom correspondence should be addressed at National Institutes of Health, Building 6, Room 211, Bethesda, MD 20892.

³ Abbreviations used: MIP, major intrinsic protein; nt, nucleotide(s).

0888-7543/91 $3.00
Copyright © 1991 by Academic Press, Inc.
All rights of reproduction in any form reserved.


[Figure 1 graphic. (A) Clone HMIPλ1G-1 with SalI, BglII, and NdeI sites. (B) Probes A (bp -31 to +201), B (bp 282-602), C (bp 603-820), and D (bp 1-35), with hybridizing fragments marked “+”. (C) Plasmid subclones, including pHMIP16, pHMIP2, pHMIP4, and pHMIP5.]

FIG. 1. Restriction endonuclease cleavage map of genomic clone HMIPλ1G-1 and relative localization of human MIP gene sequences. (A) Schematic representation of genomic clone HMIPλ1G-1 showing restriction enzyme sites used in mapping and subcloning the gene. Wavy lines represent λ EMBL-3 sequences and the solid bar above the map indicates the relative location of the human MIP gene. (B) Summarized results from Southern blot hybridization of clone HMIPλ1G-1 to probes A-C (derived by restriction digest from the bovine cDNA) and to probe D (an oligodeoxyribonucleotide corresponding to the initial 35 bases of the bovine MIP coding sequence). The relative nucleotide locations of these fragments in the bovine cDNA are noted in parentheses, with +1 being the translation initiation site. Hybridization to specific restriction fragments of the human genomic clone is noted by “+” beneath the respective fragment. (C) Plasmid subclones of HMIPλ1G-1 in pBluescript II.

MIP serves as an excellent marker of lens fiber cell differentiation.

The lens, which lacks vasculature, innervation, and cellular connective tissue, maintains an extensive network of intercellular transmembrane pathways. Numerous electrophysiological and dye passage studies indicate that lens fiber cells are thoroughly coupled (Mathias and Rae, 1989). MIP, reconstituted into planar lipid bilayers and membrane vesicles, exhibits channel-forming activity (Gooden et al., 1985; Girsch and Peracchia, 1985; Ehring et al., 1990). Additional evidence demonstrates that antibodies to MIP block channel activity in reconstituted membrane systems (Gooden et al., 1985) and in cultured lens cells (Johnson et al., 1988). Such results support the notion that MIP may play an integral role in lens cell-cell communication. Indeed, recent evidence indicates that MIP belongs to a growing superfamily of putative transmembrane channel proteins (Rao et al., 1990; Yamamoto et al., 1990; Pao et al., 1991; Wistow et al., 1991).

To gain further insight regarding the function of this protein, which constitutes the preponderant component of the lens fiber cell membrane, and to identify the elements underlying the tissue-specific and developmentally regulated expression of MIP, we have cloned the gene encoding this protein. Here we report the isolation and characterization of the human MIP gene, representing the first description of the genomic structure of a lens membrane protein gene.

MATERIALS AND METHODS

Isolation and Identification of Genomic Clones for the Human MIP Gene

A human leukocyte genomic library of MboI random fragments inserted into the BamHI site in λ EMBL-3 phage was screened (Clontech, Inc., Palo Alto, CA). Approximately 5 × 10⁶ recombinant phage plaques (corresponding to approximately two haploid genomes) were screened with a bovine MIP cDNA probe containing the entire protein-coding sequence. The hybridization probe was a 780-bp NdeI-HindIII fragment isolated from the bovine MIP cDNA clone, MP4 (Gorin et al., 1984). Screening was performed using the probe radiolabeled with [α-³²P]dCTP (3000 Ci/mmol; Amersham Corp., Arlington Heights, IL) by random oligonucleotide priming (Boehringer-Mannheim Biochemicals, Indianapolis, IN) to a specific activity of approximately 3 × 10⁸ cpm/μg. Hybridization-positive phage clones were purified by successive plaque hybridization. Recombinant plaques were transferred to nitrocellulose and the filters hybridized and washed according to standard methods (Maniatis et al., 1982). Phage DNAs were purified and the positive clones characterized by restriction enzyme mapping. Different DNA fragments of the human MIP gene were subcloned into Bluescript II plasmid (Stratagene, La Jolla, CA) and plasmid DNA was prepared by alkaline lysis followed by two series of cesium chloride/ethidium bromide equilibrium centrifugation (Maniatis et al., 1982).

Oligodeoxyribonucleotide Synthesis

Oligodeoxyribonucleotides used for sequencing, Southern hybridizations, and primer extensions were synthesized on an Applied Biosystems 380A DNA synthesizer and purified by passage over a Sephadex G-25 column (5 Prime-3 Prime, Paoli, PA) according to the manufacturer’s directions.

DNA Sequencing

A 5.0-kb WI-NdeI fragment (HMIP5.0) and a 1.5-kb NdeI fragment (HMIP1.5) of the human MIP gene were separately multimerized, sheared by sonication, and cloned into the SmaI site of M13mp10. The resulting clones were sequenced by the dideoxynucleotide chain-termination method (Sanger et al., 1977) using T7 DNA polymerase (Tabor and Richardson, 1987). The nucleotide sequence was established by sequencing the fragments in M13 with universal primers. Specific oligodeoxyribonucleotides based on the human MIP gene sequence were utilized as primers to sequence the junction between HMIP5.0 and HMIP1.5 and to complete several single-stranded regions.

[Figure 2 graphic. (A) Autoradiograph with size standards of 23.1, 9.4, 6.6, and 4.4 kb. (B) Restriction map of the human MIP gene showing BglII, PstI, EcoRI, HindIII, and BamHI sites and the 1.9-kb probe.]

FIG. 2. Southern hybridization of human genomic DNA. (A) Human genomic DNA was digested with the indicated restriction enzymes. The blotted filter was hybridized with a 1.9-kb BglII fragment of the human MIP gene containing coding and 5’-flanking sequences (see Fig. 2B for relative location of probe) and washed as described under Materials and Methods. HindIII-digested λ DNA was electrophoresed in parallel with genomic DNA samples. The locations of these size standards are indicated at the left. (B) Restriction endonuclease cleavage map of the human MIP gene illustrating sites for enzymes used in Southern analysis of the gene. The bar above the map indicates the relative location of the 1.9-kb BglII fragment used as a hybridization probe. The transcription initiation site is denoted as +1.

Southern Analysis

Recombinant plasmid DNA (2.0 μg) or genomic DNA prepared from human placental tissue (8.0 μg) (Clontech, Inc.) was digested with individual or multiple restriction enzymes as indicated, fractionated through a 0.7% agarose gel in 0.04 M Tris-acetate/0.001 M ethylenediaminetetraacetic acid, and transferred to nitrocellulose according to standard methods (Southern, 1975) in 10X SSC (1X SSC: 0.15 M sodium chloride, 0.015 M sodium citrate, pH 7.0). Filters were prehybridized for 4 h at 42°C in 5X SSC, 0.5% sodium dodecyl sulfate (SDS), 5X Denhardt’s solution (1X Denhardt’s: 0.02% each of bovine serum albumin, polyvinylpyrrolidone, and Ficoll), 100 μg salmon sperm DNA per milliliter, and 50% formamide. Hybridizations were performed for 18 h at 42°C in the same buffer with the addition of 1.0-2.0 × 10⁷ cpm of [α-³²P]dCTP-labeled probe prepared by random oligodeoxyribonucleotide priming (Boehringer-Mannheim Biochemicals). Following hybridization, the filters were washed at room temperature twice in 2X SSC, 0.1% SDS for 15 min each; twice in 1X SSC, 0.1% SDS for 15 min each; and twice at 68°C in 0.1X SSC, 0.1% SDS for 15 min each, unless otherwise stated. For autoradiography, washed filters were exposed to Kodak XAR film with an intensifying screen at -70°C.
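The transfer, hybridization, and wash buffers above are all stated as multiples of 1X SSC. As a minimal sketch, the molarities at each strength can be computed as follows; the 20X stock used in the dilution helper is an assumption for illustration and is not stated in the text.

```python
# Molar composition of 1X SSC as defined in the Southern Analysis section.
ONE_X_SSC = {"sodium chloride": 0.15, "sodium citrate": 0.015}  # mol/L


def ssc_molarity(strength):
    """Return the molarity of each SSC component at the given strength (e.g. 0.1 for 0.1X)."""
    return {solute: round(molar * strength, 6) for solute, molar in ONE_X_SSC.items()}


def stock_volume_ml(target_strength, final_volume_ml, stock_strength=20.0):
    """Volume of stock needed for a dilution; the 20X stock strength is a hypothetical default."""
    return target_strength / stock_strength * final_volume_ml


if __name__ == "__main__":
    for strength in (10, 5, 2, 1, 0.1):
        print(f"{strength}X SSC:", ssc_molarity(strength))
    print("mL of stock for 500 mL of 2X SSC:", stock_volume_ml(2, 500))
```

The final high-temperature wash in 0.1X SSC is the lowest salt step, which is what makes it the most stringent of the three washes.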

RNA Isolation

Total cellular RNA was isolated from human lenses by the acid guanidinium thiocyanate-phenol-chloroform extraction method (Chomczynski and Sacchi, 1987). Normal human lenses were generously provided by Dr. J. Horwitz (Jules Stein Eye Institute, Los Angeles, CA) and stored at -70°C until processed.

[Figure 3 graphic. Dot matrix plot: horizontal axis, human MIP genomic sequence (0 to 6000 bp); vertical axis, bovine MIP cDNA; diagonals A through D mark the regions of homology.]

FIG. 3. Dot matrix comparison of the bovine MIP cDNA and human MIP gene sequences. The human MIP gene and its flanking sequences are shown on the horizontal axis. The bovine MIP cDNA is shown on the vertical axis. Four regions of homology (A through D) are noted between 2800 and 6400 bp, delineating the four exons of the human MIP gene.

Primer Extension Analysis

Oligo 3729 (5’ GACATAGAAGAGGGTGGC 3’) and oligo 3730 (5’ CAGGAGCCCAGCGCAGTGAGGAC 3’) were 5’-end-labeled with [γ-³²P]ATP (7000 Ci/mmol; ICN Pharmaceuticals, Inc., Irvine, CA) and T4 polynucleotide kinase (Pharmacia, Piscataway, NJ) to a specific activity of approximately 5 × 10⁷ cpm/μg. Oligo 3729 is complementary to the coding sequences corresponding to nucleotides 99 to 116 (encoding amino acids 19 to 24) and oligo 3730 is complementary to the coding sequences corresponding to nucleotides 131 to 153 (encoding amino acids 30 to 36), both in exon 1 of the human MIP gene. Primer extensions were performed essentially as described (Chepelinsky et al., 1987) using approximately 9.0 μg of total RNA, 1.0 × 10⁶ cpm of primer, and 85 units of avian myeloblastosis virus reverse transcriptase (Seikagaku America, Inc., Rockville, MD). Primer-extended products were analyzed on a 10% polyacrylamide-8 M urea sequencing gel with ³²P-end-labeled MspI-digested pBR322 DNA fragments and a sequencing ladder as size markers.

Materials

Unless otherwise noted, restriction and modifying enzymes were purchased from New England Biolabs, Inc. (Beverly, MA); Boehringer-Mannheim Biochemicals (Indianapolis, IN); or Stratagene (La Jolla, CA). All other chemicals and reagents were of the highest grade available.

Software

Alignment and analysis of the nucleotide and protein sequences were performed using the NIH Molecular Biology User Group PC-Tools Distribution Package (MBUG) software for the PC, the Integrated Database and Extended Analysis System for Nucleic Acids and Protein (IDEAS), and the Sequence Analysis Software Package of the Genetics Computer Group (GCG) for the VAX computer.
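The two primer sequences given under Primer Extension Analysis can be checked against the amino acids they are stated to cover. The sketch below reverse-complements each oligo to recover the sense strand and translates it with the standard genetic code. The N-terminal protein sequence is the first 36 residues of human MIP as read from Fig. 6, and the reading frame chosen for oligo 3730 is an assumption made so that whole codons line up; this is an illustrative check, not the authors’ procedure.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Primer sequences (5' to 3') as given in the Methods.
OLIGO_3729 = "GACATAGAAGAGGGTGGC"
OLIGO_3730 = "CAGGAGCCCAGCGCAGTGAGGAC"

# First 36 residues of human MIP as read from the Fig. 6 alignment.
HUMAN_MIP_N_TERM = "MWELRSASFWRAIFAEFFATLFYVFFGLGSSLRWAP"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq, frame=0):
    """Translate complete codons of seq starting at the given 0-based frame offset."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


if __name__ == "__main__":
    sense_3729 = reverse_complement(OLIGO_3729)
    sense_3730 = reverse_complement(OLIGO_3730)
    print(sense_3729, translate(sense_3729))           # residues 19-24
    print(sense_3730, translate(sense_3730, frame=1))  # residues 30-36
```

Under these assumptions the 18-mer spans six complete codons, while the 23-mer spans seven codons with one extra base on each side.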

RESULTS AND DISCUSSION

Isolation and Characterization of a Clone for the Human MIP Gene

To clone the human MIP gene, approximately 500,000 recombinant phage plaques from a human genomic library in λ EMBL-3 were screened using a bovine cDNA fragment containing nearly the entire protein-coding sequence of the MIP gene (from nucleotide 38 encoding amino acid 13 to nucleotide 820, 26 nucleotides 3’ to the translation termination codon) (Gorin et al., 1984). Three positive recombinant clones were identified and carried through to tertiary screening. All three clones were of the same overall length. Two of the recombinant clones (HMIPλ1G-1 and HMIPλ1G-3) contained 16-kb SalI inserts that appeared to be identical by restriction enzyme mapping. The restriction fragment map of genomic clone HMIPλ1G-1 is diagrammed in Fig. 1A. The insert from the third recombinant clone could not be isolated from the phage vector, possibly due to loss or disruption of one of the SalI sites. The 16-kb SalI insert from positive clone HMIPλ1G-1 and various restriction fragments of the insert were subcloned into pBluescript II (Stratagene) (Fig. 1C).

Localization of MIP Sequences within the Genomic Clone

As a means of localizing MIP sequences within the positive genomic clone, pHMIP16 was digested with the restriction enzymes noted in Fig. 1A and the DNA fragments were subjected to Southern analysis using three probes derived from the bovine MIP cDNA and an oligodeoxyribonucleotide. Probe A was a 232-bp PpuMI/HincII fragment of the bovine cDNA which encompassed 30 bp of the 5’-untranslated region and the coding sequence for the first 66 amino acids of the protein. Probe B was a 321-bp PflMI DNA fragment encoding amino acids 91-197 of the bovine MIP. Probe C was a 218-bp PflMI/HindIII fragment of the bovine cDNA containing the coding sequence for amino acids 197 to 263 and 29 bp of the 3’-untranslated region. Probe D, an oligodeoxyribonucleotide, corresponded to the initial 35 bases of the bovine MIP coding sequence. Results from the Southern analysis of the genomic clone insert, using these probes, are summarized in Fig. 1B. Collectively, these results indicated that the entire coding region of the human MIP gene was contained within the 1.9-kb BglII, 1.2-kb BglII/NdeI, and 1.5-kb NdeI fragments of the genomic clone. Subsequent sequence analysis demonstrated that the human MIP gene spanned 3.6 kb, as illustrated by the solid bar over clone HMIPλ1G-1 in Fig. 1A.

[Figure 4 graphic: annotated nucleotide sequence listing of the human MIP gene, printed 80 nt per line with positions numbered in the margin through 3633.]

FIG. 4. The complete nucleotide sequence of the human MIP gene. The nucleotide sequence of the human MIP gene was determined as indicated under Materials and Methods. The transcription initiation site was determined by primer extension of human lens RNA (see Fig. 5) and is noted as position +1. Numbers correspond to positions relative to the transcription initiation site. The four exons of the gene are indicated by reverse images (white letters on a black background). The translation initiation codon (ATG) is denoted by asterisks (***) beginning at position 45, as is the translation termination codon (TAG) beginning at position 3375. Putative lariat branch points (CEAC) in each of the introns are boxed and shaded gray. An Alu repetitive element, located in the third intron, is underlined. A TATA box, present 26 bp upstream from the transcription initiation site, is delineated by a box.

[Figure 5 graphic: primer extension gel with MspI-digested pBR322 size markers (217 to 110 bases) and the extended products of 152 and 115 bases marked.]

FIG. 5. Localization of the human MIP gene transcription initiation site. The 5’ end of the gene was mapped by primer extension of human lens RNA. Oligos 3729 and 3730 were 5’-end-labeled, hybridized to 9.0 μg of RNA from 1-year-old normal human lenses, and extended as described under Materials and Methods. Extended products were separated on a 10% polyacrylamide-8 M urea sequencing gel. Primer-extended products of 115 bases (lane 3729) and 152 bases (lane 3730), indicated by arrows, were obtained with the respective oligonucleotide. Dideoxy sequencing reactions (lanes T and G) and MspI-digested pBR322 DNA fragments (lane M) were simultaneously run as markers. The sizes of the pBR322 fragments are indicated to the right.

MIP Sequences within the Human Genome

Southern analysis of total human genomic DNA was performed to determine the copy number of the MIP gene in the human genome. Autoradiographic results from this analysis as well as a restriction map of the human MIP gene are shown in Figs. 2A and 2B, respectively. Human genomic DNA was digested with various enzymes, fractionated in an agarose gel, transferred to nitrocellulose, and hybridized with the 1.9-kb BglII fragment of pHMIP16 which encompassed the 5’ end of the MIP gene as shown in Fig. 1B. Based on prior restriction mapping of the genomic clone and subsequent computer analysis for restriction enzyme sites in the human MIP nucleotide sequence, it was anticipated that a single HindIII, BamHI, and BglII fragment and two EcoRI fragments would hybridize to the 1.9-kb BglII probe. These results are evident in Fig. 2A. PstI-digested DNA was expected to produce four hybridization-positive bands of 1389, 730, and 82 bp and one larger than 2200 bp. The PstI fragments that hybridized to the probe were approximately 4400, 1400, and 700 bp. The expected 82-bp PstI fragment probably was not retained in the 0.7% agarose gel during the separation due to its small size. No additional hybridization-positive bands were detected on the Southern blot, even under low-stringency hybridization conditions (data not shown). The five resultant genomic DNA hybridization patterns evident in Fig. 2A indicate the presence of a single MIP gene copy per haploid human genome. This finding is consistent with the mapping of MIP to the long arm of human chromosome 12 (Sparkes et al., 1986).
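The expected fragment sizes quoted above came from a computer search for restriction sites in the determined sequence. A minimal sketch of that kind of in-silico digest is shown below for PstI, which recognizes CTGCAG and cuts after the A. The toy sequence and the size cutoff are hypothetical and stand in for the real gene sequence and for the resolution limit of the gel; neither value comes from the paper.

```python
def digest(seq, site, cut_offset):
    """Return fragment lengths from cutting a linear sequence at every occurrence of site.

    cut_offset is the number of bases from the start of the site to the cut position.
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


def retained(fragments, min_size):
    """Fragments at or above a size cutoff (a stand-in for what a gel would retain)."""
    return [length for length in fragments if length >= min_size]


if __name__ == "__main__":
    # Hypothetical 49-nt sequence carrying two PstI sites.
    toy = "A" * 10 + "CTGCAG" + "T" * 20 + "CTGCAG" + "G" * 7
    fragments = digest(toy, "CTGCAG", cut_offset=5)  # PstI: CTGCA^G
    print(fragments)
    print(retained(fragments, min_size=10))
```

The same reasoning explains the missing 82-bp band: a fragment can be predicted from the sequence yet fall below what the gel retains.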

Nucleotide Sequence and Structural Organization of the Human MIP Gene

The complete nucleotide sequence of the human MIP gene and approximately 3.0 kb of the 5’-flanking region was determined by dideoxynucleotide sequencing of single-stranded DNA from recombinant M13 clones, as indicated under Materials and Methods. A total of 133 overlapping M13 templates assembled HMIP5.0 and 31 overlapping templates assembled HMIP1.5. Using this strategy, the human MIP gene was sequenced in its entirety on both strands.

[Figure 6 graphic: amino acid alignment of human, bovine, rat, and chicken MIP, numbered in blocks of 50 residues.]

FIG. 6. Alignment of the predicted amino acid sequences of human, bovine, rat, and chicken MIP. The deduced amino acid sequence of human MIP is aligned with the deduced sequences of MIP from bovine (Ref. (11)) and partial sequences from rat (Ref. (32)) and chicken (Ref. (14)). Amino acid identities are noted by a dashed line. Bovine amino acid 14 obtained by peptide sequencing is a phenylalanine (Ref. (39)). Conservative sequence changes due to polarity and steric considerations are indicated by an asterisk above the amino acid. A potential glycosylation site beginning at amino acid 197 is indicated by a solid box, while a putative calmodulin binding site beginning at amino acid 225 (Ref. (25)) is demarcated by a dashed box. Potential phosphorylation sites (Refs. (8, 15)) at positions 229, 231, 235, and 245 are denoted by arrows.

To determine the overall structural organization and intron/exon boundaries of the gene, the entire sequence obtained for the human MIP gene (coding and flanking sequences) was compared to the full-length bovine cDNA sequence using a computer-generated dot matrix analysis (Maizel and Lenk, 1981). Results of this dot matrix analysis (Fig. 3) demonstrated four significant regions of homology (A, B, C, and D) between nucleotides 2800 and 6400, indicating the presence of four exons in the gene. This finding was confirmed by computer-assisted alignment of the nucleotide sequences of the human MIP gene and bovine MIP cDNA using the NUCALN program. The exon-intron junctions were also verified by scanning the nucleotide sequences for a consensus 5’-donor splice signal ((C/A)AG/GT(A/G)AGT) and 3’-acceptor splice signal ((T/C)nNCAG/G) (Mount, 1982). The sequences flanking the exon-intron junctions in the human MIP gene conform well, but are not identical, to the donor and acceptor splice consensus sequences. However, the consensus intron border dinucleotides, GT and AG (Breathnach et al., 1978), were found at the 5’ and 3’ borders, respectively, of all three introns in the human MIP gene. Lariat signals having the consensus sequence YNYURAY are typically located 20-50 nucleotides upstream from the acceptor signal in the intron (see Smith et al., 1989). Signals having either the sequence ctgag or ctcag were found in each of the three introns, 21 to 28 nucleotides upstream from the exon-intron junction (Fig. 4). The four exons of the gene are 404, 165, 81, and 369 bp, respectively. The three introns are 498, 438, and 1605 bp.
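The junction checks described above (GT and AG border dinucleotides, plus a lariat signal 20-50 nucleotides upstream of the acceptor) can be sketched as a simple scan. The regular expression below encodes the YNYURAY consensus on the DNA strand, with U written as T. The intron used in the example is a hypothetical toy sequence, and measuring distance from the start of the motif to the 3’ end of the intron is an assumption of this sketch.

```python
import re

# YNYURAY on DNA: Y = C/T, N = any base, U -> T, R = A/G.
# The lookahead allows overlapping matches to be reported.
BRANCH_CONSENSUS = re.compile(r"(?=([CT][ACGT][CT]T[AG]A[CT]))")


def check_intron(intron, window=(20, 50)):
    """Report border dinucleotides and branch-signal distances for one intron sequence.

    Distances are counted from the first base of the motif to the 3' end of the intron.
    """
    seq = intron.upper()
    nearest, farthest = window
    distances = [
        len(seq) - match.start(1)
        for match in BRANCH_CONSENSUS.finditer(seq)
        if nearest <= len(seq) - match.start(1) <= farthest
    ]
    return {
        "donor_gt": seq.startswith("GT"),
        "acceptor_ag": seq.endswith("AG"),
        "branch_distances": distances,
    }


if __name__ == "__main__":
    # Hypothetical 68-nt intron: GT donor, one CTCTGAC branch signal, AG acceptor.
    toy_intron = "GTAAGT" + "C" * 30 + "CTCTGAC" + "T" * 22 + "CAG"
    print(check_intron(toy_intron))
```

A scan of this kind only flags candidates; as the text notes, real junctions conform to the consensus without matching it exactly.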

Transcription Start Site of the Human MIP Gene

Primer extension analysis of RNA isolated from 1-year-old normal human lenses was performed to delineate the 5’ end of the gene and to establish the precise transcription initiation site(s). Two oligodeoxyribonucleotides, a 23-mer (oligo 3730) complementary to the mRNA encoding amino acids 30 to 36 and an 18-mer (oligo 3729) complementary to the mRNA encoding amino acids 19 to 24, were extended with reverse transcriptase and the reverse-transcribed products analyzed as detailed under Materials and Methods. Oligo 3730 produced a single primer-extended product of 152 bases and oligo 3729 produced a single product of 115 bases (see Fig. 5), indicating a single site of transcriptional initiation in the human MIP gene. Based on these results, the transcription initiation site of the human MIP gene (designated +1 in Fig. 4) was mapped to a single site 26 bp downstream from the TATA box.

3’-Untranslated Sequence of the Human MIP Gene

The 3’ end of the human MIP gene was determined by comparing the nucleotide sequence of the 3’-untranslated region of the gene with the bovine cDNA sequence. The translation stop codon of the human MIP gene, TAG, is located at nucleotide 3375 (Fig. 4). The 3’-untranslated regions of the human gene and bovine cDNA are approximately 90% identical from the translation termination codon to the polyadenylation site that had been delineated for the bovine MIP cDNA (Gorin et al., 1984). Based on the excellent sequence homology between the 3’-untranslated regions of the human and bovine MIP genes, exon 4 of the human MIP gene was determined to be 369 bp, the last 186 bp of which are untranslated. Analysis of the 3’ end of the human gene sequence indicated that the eukaryotic polyadenylation consensus signal AATAAA, usually located 10-30 bp 5’ to the polyadenylation site (Proudfoot and Brownlee, 1976), is absent from the human MIP gene. Although the AATAAA sequence has been found to be highly conserved, natural variations of this sequence that are still active in specifying the poly(A) addition site have been found in many genes (see Leff et al., 1986, for review). Additional sequences have been suggested to play a role in mRNA cleavage and polyadenylation (McLaughlan et al., 1985; Leff et al., 1986; Renan, 1987). The hexanucleotide AAGAAA, located at nucleotide 3449 in the human MIP gene, is found in an identical position in the bovine MIP cDNA (Gorin et al., 1984). Analysis of the partial rat MIP cDNA sequence (Shiels et al., 1988) suggests that this polyadenylation signal may be utilized. The apparent divergence in the 3’ end sequence of the MIP cDNA from rat, bovine, and chicken (Kodama et al., 1990) suggests that alternative polyadenylation may be involved in processing of the MIP gene.

Human MIP Gene Coding Sequence

Based on delineation of the 5’ and 3’ ends of the human MIP gene as detailed above, the gene was determined to be 3560 bp. Alignment of the MIP gene sequence with the bovine cDNA sequence using the NUCALN program demonstrated 81 and 90% sequence identity for the 5’- and 3’-untranslated regions of the genes, respectively. Exons 1 through 4 of the human gene were found to be 90, 95, 93, and 89% identical to the coding sequence of the bovine cDNA. Minimal overall sequence divergence between the coding sequences of the two genes indicates a high degree of evolutionary conservation.
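The reported exon and intron sizes can be cross-checked against the other coordinates given in the text. The short calculation below lays the exons out from position +1 and confirms that the stated sizes are mutually consistent with a 3560-bp gene, a 263-codon coding region, and a stop codon beginning at position 3375. It assumes that the 186 untranslated bp of exon 4 are counted from the first base of the stop codon, an interpretation that is not spelled out in the text.

```python
# Sizes and positions quoted in the text (bp).
EXONS = [404, 165, 81, 369]
INTRONS = [498, 438, 1605]
ATG_START = 45            # first base of the initiation codon
EXON4_UNTRANSLATED = 186  # assumed to include the stop codon itself


def exon_coordinates(exons, introns):
    """Return 1-based (start, end) pairs for each exon, counting from the transcription start."""
    coords, position = [], 1
    for index, length in enumerate(exons):
        coords.append((position, position + length - 1))
        position += length
        if index < len(introns):
            position += introns[index]
    return coords


coords = exon_coordinates(EXONS, INTRONS)
gene_length = coords[-1][1]
coding_nt = (
    (EXONS[0] - (ATG_START - 1))  # exon 1 minus the 5'-untranslated leader
    + EXONS[1]
    + EXONS[2]
    + (EXONS[3] - EXON4_UNTRANSLATED)
)
stop_codon_start = gene_length - EXON4_UNTRANSLATED + 1

if __name__ == "__main__":
    print("exon coordinates:", coords)
    print("gene length:", gene_length)
    print("coding nucleotides:", coding_nt, "=", coding_nt // 3, "codons")
    print("stop codon begins at:", stop_codon_start)
```

Under that assumption, exon 4 begins at position 3192 and contributes 183 coding nucleotides before the TAG codon.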


A comparison of the deduced amino acid sequences from the human MIP gene and bovine (Gorin et al., 1984), partial rat (Shiels et al., 1988), and partial chicken (Kodama et al., 1990) cDNAs is presented in Fig. 6. The human MIP gene encodes a 263-amino-acid protein that bears 92% overall sequence identity to the bovine protein. The human and bovine proteins possess even greater homology (98%) if one considers conservative amino acid changes based on polarity of the residue. Comparison of the derived rat and chicken amino acid sequences with the human MIP primary amino acid sequence, accounting for sequence identities and conservative changes based on polarity, demonstrated 96 and 89% sequence homology, respectively, in the regions available. Such comparisons of the primary amino acid sequences of the human, bovine, rat, and chicken MIP suggest that the protein has been highly conserved throughout evolution, a finding that has previously been suggested based on the immunological cross-reactivity and peptide mapping of MIP in various species (Bouman and Broekhuyse, 1981; Takemoto et al., 1981).

Based on computer alignment of the amino acid coding sequences of several recently cloned membrane genes and cDNAs, an expanding superfamily of putative transmembrane channel proteins has been identified. Included in this superfamily are MIP, the Drosophila big brain protein (bib), soybean nodulin 26 protein (nod26), the Escherichia coli glycerol facilitator protein (glpF), the root-specific proteins TobRB7 (from tobacco) and AtRB7 (from Arabidopsis), and a soybean tonoplast protein (see Rao et al., 1990; Yamamoto et al., 1990; Pao et al., 1991). It has been noted recently that the 28-kDa erythrocyte transmembrane protein may also belong to this superfamily (Smith and Agre, 1991).
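The identity and homology percentages quoted for the MIP proteins distinguish exact matches from conservative changes based on residue polarity. A minimal sketch of that distinction is given below. The polarity grouping is one common textbook classification and is an assumption here, since the text does not list the classes used; the second sequence is a hypothetical variant, not the bovine protein.

```python
# Assumed polarity classes; the paper does not specify its grouping.
_CLASSES = {
    "nonpolar": "AVLIMFWPG",
    "polar": "STCYNQ",
    "basic": "KRH",
    "acidic": "DE",
}
POLARITY_CLASS = {res: name for name, group in _CLASSES.items() for res in group}


def identity_and_similarity(seq_a, seq_b):
    """Return (percent identity, percent identity-or-conservative) for two aligned sequences.

    Columns containing a gap character '-' in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    identical = sum(a == b for a, b in pairs)
    similar = sum(
        a == b or (a in POLARITY_CLASS and POLARITY_CLASS[a] == POLARITY_CLASS.get(b))
        for a, b in pairs
    )
    return 100.0 * identical / len(pairs), 100.0 * similar / len(pairs)


if __name__ == "__main__":
    human_start = "MWELRSASFW"   # first ten residues of human MIP (Fig. 6)
    hypothetical = "MWDLKSATFW"  # made-up variant with three conservative changes
    print(identity_and_similarity(human_start, hypothetical))
```

Counting conservative substitutions as matches always gives a figure at least as high as strict identity, which is why the homology values in the text exceed the identity values.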
The amino acid sequences of MIP, bib, nod26, glpF, TobRB7, and AtRB7 bear no marked homology to known transport proteins, yet each appears to play some intrinsic role in intercellular transport or communication within its respective environmental locale. Analyses of the amino acid sequences of MIP, bib, nod26, and glpF have delineated the presence of a twofold repeat in the primary structure of these proteins (Pao et al., 1991; Wistow et al., 1991). The first repeat corresponds to the sequences encoded by a single exon (exon 1) of the human MIP gene; the second repeat encompasses exons 2 through 4 (Wistow et al., 1991). This finding suggests that members of this superfamily may have evolved by gene duplication of a single structural motif, perhaps representing an ancestral monomer capable of forming higher-order multimeric structures.

Human MIP Gene Noncoding Sequences

Comparison of the human MIP gene sequence to those nucleotide sequences in GenBank indicated the presence of multiple Alu repetitive sequences in and around the human MIP gene. Alu repeats, the most abundant family of interspersed repetitive DNA in the human genome, typically found in intergenic regions and introns, share a 300-bp conserved consensus sequence consisting of two imperfect, directly repeated monomeric units that are separated by an adenine-rich spacer (see Schmid and Jelinek, 1982, for review). Their expression may regulate various aspects of cell proliferation, differentiation, and transformation (Howard and Sakamoto, 1990). Three complete Alu sequences are found in tandem in the 5’-flanking region of the human MIP gene, and a single complete Alu sequence is found in the third intron. The Alu repeats in the human MIP gene and its 5’-flanking sequence are from 77 to 87% identical to the Alu consensus sequence. The three Alu repeats upstream of the MIP gene are classic Alu sequences in that they are all flanked by inverted direct repeats and terminate with a stretch of poly(A), whereas the repeat present in the third intron of the gene, while 84% identical to the Alu repeat consensus sequence, is not flanked by direct repeats and terminates with an imperfect region of poly(A)/poly(T).

In summary, the present report delineates the genomic structure and complete nucleotide sequence of the human gene encoding the major intrinsic protein of the ocular lens, the first such report on the gene structure of a lens membrane protein. The isolation of the coding and noncoding sequences of the MIP gene has allowed us to initiate analyses of the cis-elements and trans-acting factors governing the tissue-specific and developmental profile of the protein.

ACKNOWLEDGMENTS

We gratefully acknowledge Dr. Michael B. Gorin for providing the bovine MIP cDNA clone, Dr. Joseph Horwitz for the provision of human lenses, and Dr. Abdul Ally of Biotechnica International, Inc., for assistance in dideoxynucleotide sequencing of the gene. We also acknowledge Marvin Shapiro of the NIH Division of Computer Research and Technology, Dr. David Landsman of the National Library of Medicine’s Center for Biotechnology Information for assistance with computer analyses, as well as the National Cancer Institute for allocation of computer time and staff support, and in particular Mark Gunnell at the Advanced Scientific Computing Laboratory of the Frederick Cancer Research Facility. Special thanks are expressed to Drs. John Klement, Douglas Lee, Graeme Wistow, and Joram Piatigorsky for their insightful comments and thoughtful evaluation of the manuscript and to Ms. Gabriela Tobal for assistance in verifying the gene sequence.

REFERENCES

1. ALCALA, J., AND MAISEL, H. (1985). Biochemistry of lens plasma membranes and cytoskeleton. In “The Ocular Lens: Structure, Function, and Pathology” (H. Maisel, Ed.), pp. 169-222, Dekker, New York.

2. BOUMAN, A. A., AND BROEKHUYSE, R. M. (1981). Lens Membranes XIV. Comparative study of immunological characteristics of the fiber membrane polypeptides from calf, pig, sheep and chicken lenses. Exp. Eye Res. 33: 299-308.

3. BREATHNACH, R., BENOIST, C., O’HARE, K., GANNON, F., AND CHAMBON, P. (1978). Ovalbumin gene: Evidence for a leader sequence in mRNA and DNA sequences at the exon-intron boundaries. Proc. Natl. Acad. Sci. USA 75: 4853-4857.

4. BROEKHUYSE, R. M., KUHLMAN, E. D., AND STOLS, A. L. (1976). Lens membranes II. Isolation and characterization of the main intrinsic polypeptide (MIP) of bovine lens fiber membranes. Exp. Eye Res. 23: 365-371.

5. CHEPELINSKY, A. B., SOMMER, B., AND PIATIGORSKY, J. (1987). Interaction between two different regulatory elements activates the murine αA-crystallin gene promoter in explanted lens epithelia. Mol. Cell. Biol. 7: 1807-1814.

6. CHOMCZYNSKI, P., AND SACCHI, N. (1987). Single step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Anal. Biochem. 162: 156-159.

7. EHRING, G. R., ZAMPIGHI, G. A., HORWITZ, J., BOK, D., AND HALL, J. E. (1990). Properties of channels reconstituted from the major intrinsic protein of lens fiber membrane. J. Gen. Physiol. 96: 631-664.

8. GARLAND, D., AND RUSSELL, P. (1985). Phosphorylation of lens fiber cell membrane proteins. Proc. Natl. Acad. Sci. USA 82: 653-657.

9. GIRSCH, S. J., AND PERACCHIA, C. (1985). Lens cell-to-cell channel protein. I. Self-assembly into liposomes and permeability regulation by calmodulin. J. Membr. Biol. 83: 217-225.

10. GOODEN, M., RINTOUL, D., TAKEHANA, M., AND TAKEMOTO, L. (1985). Major intrinsic polypeptide (MIP26K) from lens membrane: Reconstitution into vesicles and inhibition of channel forming activity by peptide antiserum. Biochem. Biophys. Res. Commun. 128: 993-999.

11. GORIN, M. B., YANCEY, S. B., CLINE, J., REVEL, J. P., AND HORWITZ, J. (1984). The major intrinsic protein (MIP) of the bovine lens fiber membrane: Characterization and structure based on cDNA cloning. Cell 39: 49-59.

12. HOWARD, B. H., AND SAKAMOTO, K. (1990). Alu interspersed repeats: Selfish DNA or a functional gene family? New Biologist 2: 759-770.

13. JOHNSON, R. G., KLUKAS, K. A., TZE-HONG, L., AND SPRAY, D. C. (1988). Antibodies to MP28 are localized to lens junctions, alter intercellular permeability and demonstrate increased expression during development. In “Gap Junctions” (E. L. Hertzberg and R. G. Johnson, Eds.), pp. 81-98, A. R. Liss, New York.

14. KODAMA, R., AGATA, N., MOCHII, M., AND EGUCHI, G. (1990). Partial amino acid sequence of the major intrinsic protein (MIP) of the chicken lens deduced from the nucleotide sequence of a cDNA clone. Exp. Eye Res. 50: 737-741.

15. LAMPE, P. D., AND JOHNSON, R. G. (1990). Amino acid sequence of in vivo phosphorylation sites in the main intrinsic protein (MIP) of lens membranes. Eur. J. Biochem. 194: 541-547.

16. LEFF, S. E., ROSENFELD, M. G., AND EVANS, R. M. (1986). Complex transcriptional units: Diversity in gene expression by alternative RNA processing. Annu. Rev. Biochem. 55: 1091-1117.

17. MAIZEL, J. V., AND LENK, R. P. (1981). Enhanced graphic matrix analysis of nucleic acid and protein sequences. Proc. Natl. Acad. Sci. USA 78: 7665-7669.

18. MANIATIS, T., FRITSCH, E. F., AND SAMBROOK, J. (1982). “Molecular Cloning: A Laboratory Manual,” Cold Spring Harbor Laboratory, Cold Spring Harbor, NY.

19. MATHIAS, R. T., AND RAE, J. L. (1989). Cell to cell communication in lens. In “Cell Interactions and Gap Junctions” (N. Sperelakis and W. C. Cole, Eds.), Vol. I, pp. 29-50, CRC Press, Boca Raton, FL.

20. MCAVOY, J. W. (1980). Induction of the eye lens. Differentiation 17: 137-149.

21. MCLAUGHLAN, J., GAFFNEY, D., WHITTON, J. L., AND CLEMENTS, J. B. (1985). The consensus sequence YGTGTTYY located downstream from the AATAAA signal is required for efficient formation of mRNA 3’ termini. Nucleic Acids Res. 13: 1347-1368.

22. MOUNT, S. M. (1982). A catalogue of splice junction sequences. Nucleic Acids Res. 10: 459-472.

23. PAO, G. M., WU, L-F., JOHNSON, K. D., HOFTE, H., CHRISPEELS, M. J., SWEET, G., SANDAL, N. N., AND SAIER, M. H. (1991). Evolution of the MIP family of integral membrane transport proteins. Mol. Microbiol. 5: 33-37.

24. PAUL, D. L., AND GOODENOUGH, D. A. (1983). Preparation, characterization, and localization of antisera against bovine MP26, an integral protein from lens fiber plasma membrane. J. Cell Biol. 96: 625-632.

25. PERACCHIA, C. (1989). Control of gap junction permeability and calmodulin-like proteins. In “Cell Interactions and Gap Junctions” (N. Sperelakis and W. C. Cole, Eds.), Vol. I, pp. 125-142, CRC Press, Boca Raton, FL.

26. PIATIGORSKY, J., AND ZELENKA, P. (1991). Transcriptional regulation of crystallin genes: Cis elements, trans-factors and signal transduction system in the lens. In “Advances in Developmental Biochemistry” (P. Wassarman, Ed.), Vol. I, pp. 211-256.

27. PROUDFOOT, N. J., AND BROWNLEE, G. G. (1976). 3’ Non-coding region sequences in eukaryotic messenger RNA. Nature 263: 211-214.

28. RAO, Y., JAN, L. Y., AND JAN, Y. N. (1990). Similarity of the product of the Drosophila neurogenic gene big brain to transmembrane channel proteins. Nature 345: 163-167.

29. RENAN, M. J. (1987). Conserved 12-bp element downstream from mRNA polyadenylation sites. Gene 60: 245-254.

30. SANGER, F., NICKLEN, S., AND COULSON, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74: 5463-5467.

31. SCHMID, C. W., AND JELINEK, W. R. (1982). The Alu family of dispersed repetitive sequences. Science 216: 1065-1070.

32. SHIELS, A., KENT, N. A., MCHALE, M., AND BANGHAM, J. A. (1988). Homology of MIP26 to Nod26. Nucleic Acids Res. 16: 9348.

33. SMITH, B. L., AND AGRE, P. (1991). Erythrocyte Mr 28,000 transmembrane protein exists as a multisubunit oligomer similar to channel proteins. J. Biol. Chem. 266: 6407-6415.

34. SMITH, C. W. J., PATTON, J. G., AND NADAL-GINARD, B. (1989). Alternative splicing in the control of gene expression. Annu. Rev. Genet. 23: 527-577.

35. SOUTHERN, E. M. (1975). Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98: 503-517.

36. SPARKES, R. S., MOHANDAS, T., HEINZMANN, C., GORIN, M. B., HORWITZ, J., LAW, M. L., JONES, C. A., AND BATEMAN, J. B. (1986). The gene for the major intrinsic protein (MIP) of the ocular lens is assigned to human chromosome 12cen-q14. Invest. Ophthalmol. Visual Sci. 27: 1351-1354.

37. TABOR, S., AND RICHARDSON, C. C. (1987). DNA sequence analysis with modified bacteriophage T7 DNA polymerase. Proc. Natl. Acad. Sci. USA 84: 4767-4771.

38. TAKEMOTO, L. J., HANSEN, J. S., AND HORWITZ, J. (1981). Interspecies conservation of the main intrinsic polypeptide (MIP) of the lens membrane. Comp. Biochem. Physiol. B 68: 101-106.

39. TAKEMOTO, L. J., HANSEN, J. S., NICHOLSON, B. J., HUNKAPILLER, M., REVEL, J-P., AND HORWITZ, J. (1983). Major intrinsic polypeptide of lens membrane. Biochemical and immunological characterization of the major cyanogen bromide fragment. Biochim. Biophys. Acta 731: 267-274.

40. WATANABE, M., KOBAYASHI, H., RUTISHAUSER, U., KATAR, M., ALCALA, J., AND MAISEL, H. (1989). NCAM in the differentiation of embryonic lens tissue. Dev. Biol. 135: 414-423.

41. WISTOW, G., AND PIATIGORSKY, J. (1988). The lens crystallins: Evolution and expression of proteins for a highly specialized tissue. Annu. Rev. Biochem. 57: 479-504.

42. WISTOW, G., PISANO, M. M., AND CHEPELINSKY, A. B. (1991). Tandem sequence repeats in transmembrane channel proteins. Trends Biochem. Sci. 16: 170-171.

43. YAMAMOTO, Y. T., CHENG, C-L., AND CONKLING, M. A. (1990). Root-specific genes from tobacco and Arabidopsis homologous to an evolutionarily conserved gene family of membrane channel proteins. Nucleic Acids Res. 18: 7449.

44. YANCEY, S. B., KOH, K., CHUNG, J., AND REVEL, J. P. (1988). Expression of the gene for main intrinsic polypeptide (MIP): Separate spatial distributions of MIP and β-crystallin gene transcripts in rat lens development. J. Cell Biol. 106: 705-714.
