5894 Nucleic Acids Research, Vol. 18, No. 19
Nucleotide sequence of the arrestin-like 49 Kd protein gene of Drosophila miranda
Rajesh Krishnan and Ranjan Ganguly*
Department of Zoology and Cell, Molecular and Developmental Biology Program, University of Tennessee, Knoxville, TN 37996, USA
Submitted August 14, 1990
EMBL accession no. X54084
We report the genomic nucleotide sequence of the D. miranda homologue of gene 507, which encodes a 49 Kd photoreceptor cell-specific protein in D. melanogaster (1, 2). This protein undergoes light-activated phosphorylation and shares 42% amino acid identity with vertebrate arrestin (1), which is involved in the vertebrate visual phototransduction pathway (3). Gene 507 is distinct from the gene mapped at locus 36D of D. melanogaster, whose product shows 40% amino acid identity with vertebrate arrestin (4, 5). D. melanogaster and D. miranda diverged 45 million years ago and, in the course of this divergence, gene 507, which is autosomal (66D) in D. melanogaster (6), has become X-linked (12A) in D. miranda (2). Consequently, this gene has been subjected to additional selection pressures brought on by its new X-linked and dosage-compensated status in D. miranda (2). Viewed against this background, the coding regions of gene 507 in D. miranda and D. melanogaster are remarkably conserved, showing 87% homology at the nucleic acid level and 98% similarity at the amino acid level. None of the amino acid changes occur in domains thought to be similar in function to the vertebrate homologue (1). Six amino acid substitutions are in exon I and one is in exon II. Shown below is the genomic sequence of gene 507 in D. miranda. The translational start and stop codons, identified by similarity to the D. melanogaster sequence, are located at nucleotides 139 and 1469, respectively, and are underlined. Also underlined are a TATA-like element and a putative polyadenylation signal at positions 17 and 1657, respectively. We have not determined the exact transcriptional start site. Two arrowheads show the positions of the boundaries of the two introns, whose sequences are underlined.
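Percent identity values and 1-based feature positions of the kind quoted above can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not the Beckman Microgenie procedure used for this work: the function names `percent_identity` and `feature_at` are hypothetical, the sequences are toy strings rather than the gene 507 sequence, and the two inputs to `percent_identity` are assumed to be already aligned with `-` marking gaps.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned, gap-free columns at which two sequences agree.

    Assumes `a` and `b` are already aligned to equal length, with '-' for gaps.
    Columns containing a gap in either sequence are skipped (one common
    convention; other definitions of identity exist).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


def feature_at(seq: str, start: int, length: int) -> str:
    """Return `length` residues beginning at 1-based position `start`.

    Published coordinates count from 1, whereas Python slices count from 0,
    so the start position is shifted by one.
    """
    if start < 1:
        raise ValueError("positions are 1-based")
    return seq[start - 1 : start - 1 + length]


# Toy data only (hypothetical, not the reported sequence).
toy_a = "ATGCATGC"
toy_b = "ATGCATGA"
identity = percent_identity(toy_a, toy_b)   # 7 of 8 columns match

toy_gene = "GGGGATGCCC"
codon = feature_at(toy_gene, 5, 3)          # residues 5 to 7 of the toy string
```

With a full-length sequence in hand, a call such as `feature_at(seq, 139, 3)` would return the three nucleotides beginning at the reported start codon position, which gives a quick check that the numbering convention matches the one used in the text.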
ACKNOWLEDGEMENTS
We thank Dr D.Brian, Dr P.Sethna and Dr S.Abraham for use of the Beckman Microgenie program. This work was supported by the University of Tennessee Start-up Research Fund and a Faculty Research Award to R.G.
REFERENCES
1. Yamada,T. et al. (1990) Science 248, 483-486.
2. Krishnan,R., Swanson,K. and Ganguly,R. (1990) Chromosoma, in press.
3. Kuhn,H. and Wilden,U. (1987) J. Recept. Res. 7, 283-298.
4. Hyde,D. et al. (1990) Proc. Natl. Acad. Sci. USA 87, 1008-1012.
5. Smith,D., Shieh,B. and Zuker,C. (1990) Proc. Natl. Acad. Sci. USA 87, 1003-1007.
6. Levy,L.S., Ganguly,R., Ganguly,N. and Manning,J.E. (1982) Dev. Biol. 94, 451-464.
[Figure: genomic nucleotide sequence of gene 507 of D. miranda (EMBL accession no. X54084), with positions marked every 10 nucleotides from 10 to 1710.]
* To whom correspondence should be addressed