GENOMICS 10, 390-399 (1991)

Serial Alu Sequence Transposition Interrupting a Human B Creatine Kinase Pseudogene

TONY S. MA,*,1 JONAH IFEGWU,* LAURA WATTS,* MICHAEL J. SICILIANO,† ROBERT ROBERTS,* AND M. BENJAMIN PERRYMAN*

*Department of Medicine, Molecular Cardiology Unit, Baylor College of Medicine, One Baylor Plaza, Room 506C, Houston, Texas 77030; and †Department of Molecular Genetics, M.D. Anderson Cancer Center, Houston, Texas 77030

Received October 2, 1990; revised December 5, 1990

We have isolated, sequenced, and characterized a single-copy B creatine kinase pseudogene. The chromosomal assignment of this gene is 16p13 and a unique sequence probe from this locus detects EcoRI restriction fragment length polymorphisms of 7.8 and 5.4 kb. In 26 unrelated individuals, the frequencies for the 7.8- and 5.4-kb B creatine kinase pseudogene alleles were calculated to be 17.3 and 82.7%, respectively. The B creatine kinase pseudogene is interrupted by a 904-bp DNA insertion composed of three Alu repeat sequences in tandem flanked by an 18-bp direct repeat, derived from the pseudogene sequence. Nucleotide sequence analysis of the Alu elements suggests that the Alu sequences were incorporated into this locus in three separate integration events. Several complex clustered Alu repeat sequences without defined integration borders have been previously identified at different genomic loci. This is the first evidence that complex tandem Alu elements can integrate in an apparently serial manner in the human genome and supports the contention that Alu repeats integrate nonrandomly into the human genome. © 1991 Academic Press, Inc.

INTRODUCTION

Creatine kinase (CK; EC 2.7.3.2) catalyzes the reversible storage of ATP as creatine phosphate and its regeneration to maintain a high ATP/ADP ratio. In mammalian cells the cytoplasmic isoenzymes are dimers of M and B subunits, and all three isoforms, MM, MB, and BB creatine kinase, can be detected in a tissue-specific distribution (Watts, 1973). In addition to the cytoplasmic isoenzymes, at least two mitochondrial isoforms are present in mammalian tissues, the ubiquitous and striated muscle isoforms (Grace et al., 1983; Haas et al., 1989; Haas and Strauss, 1990; Hossle et al., 1988). We have isolated and sequenced M (CKM) and B (CKB) creatine kinase cDNAs from human tissue (Perryman et al., 1986; Villarreal-Levy et al., 1987), and two other laboratories have also published cDNA sequences for the human B creatine kinase isoform (Kaye et al., 1987; Mariman et al., 1987). Using isoform-specific probes, we have demonstrated that both the CKM and CKB genes exist in single copy in the human genome (Villarreal-Levy et al., 1987). DNA probes derived from these cDNA clones, however, detect an additional creatine kinase gene in the human genome. Similar observations have been reported by Kaye et al. (1987). We have cloned and characterized this gene and determined it to be a CKB pseudogene. A unique feature of the organization of the pseudogene is an insertion of three tandem Alu repetitive elements into the coding region. Alu repetitive elements have been demonstrated with high-resolution in situ hybridization techniques to be nonrandomly distributed in the human genome (Korenberg and Rykowski, 1988). Nucleotide sequence analysis of the Alu repetitive elements in the CKB pseudogene suggests one mode of integration whereby Alu repetitive elements are nonrandomly placed into the human genome.

1 To whom correspondence and reprint requests should be addressed.

0888-7543/91 $3.00 Copyright © 1991 by Academic Press, Inc. All rights of reproduction in any form reserved.

METHODS

Creatine Kinase-Specific Probes

A 182-bp HaeIII-HaeIII restriction fragment of the human CKB (positions 1177-1358) and a 135-bp HaeIII-HaeIII restriction fragment of the human CKM 3'-untranslated region (positions 1280-1414) were subcloned into pUC9 and designated pHCKB3UT and pHCKM3UT, respectively. Probes made from these subcloned fragments were highly specific and hybridized only with the appropriate CKM or CKB cDNA. A 134-bp HaeIII-HaeIII restriction


fragment of the human CKB cDNA (positions 657-790) was similarly subcloned and used as a coding region nonspecific probe (designated pHCKB-134) that cross-hybridized with CKM and CKB sequences under nonstringent conditions. A 619-bp SmaI-PstI restriction fragment of the 5.4-kb B creatine kinase pseudogene (positions 804-1422 of the sequence as reported below) was used as a specific probe for this gene and designated pHCK4-618.

Genomic DNA Purification and Southern Filter Hybridization Analysis

High-molecular-weight genomic DNA was isolated by the Proteinase K-Sarkosyl method (Maniatis et al., 1982). Complete restriction endonuclease digestion was achieved using 5 units of enzyme/µg DNA and overnight incubation. The digests were electrophoresed on 0.8% agarose gels and transferred to nylon membranes (Zetaprobe, Bio-Rad) using the alkaline transfer method of Reed and Mann (1985). Hybridization conditions were 1.5× SSPE, 1% SDS, 0.5% milk powder, 500 µg/ml salmon sperm DNA, 10% (w/v) dextran sulfate, at 68°C (or 55°C for cross-species hybridization), overnight with 6-8 × 10⁶ cpm/ml of ³²P-labeled probe. Stringent washing conditions were as follows: 2× SSC, 1% SDS for 15 min twice at room temperature, followed by 0.1× SSC, 0.1% SDS for 60-90 min at 65°C. Reduced stringency wash conditions included a final wash at 1× SSC, 0.1% SDS for 60 min at 55°C, or as specified under Results. The nylon was exposed to X-ray film with two intensifying screens (DuPont) at -70°C.

Probe Labeling

DNA fragments were labeled by the random-primer labeling technique (Feinberg and Vogelstein, 1984) to a specific activity of 10’ cpm/µg. The probe was purified by NenSorb (DuPont) column chromatography before use in hybridization experiments. Oligonucleotides were end-labeled using T4 polynucleotide kinase (Maniatis et al., 1982) to a specific activity of 10’ cpm/µg. The probe was purified by NenSorb chromatography as above before hybridization.

Enriched Genomic Libraries of Individual Creatine Kinase Genes

Four hundred micrograms of human placenta DNA was digested to completion with EcoRI restriction endonuclease. The digests were electrophoresed on 0.8% agarose gels at 2 V/cm for 36 h. Regions of the gel corresponding to the size of DNA fragments hybridizing to creatine kinase-specific probes were recovered from the gel by electroelution and separately inserted into appropriate cloning vectors [EMBL4 or


Lambda-Zap (Stratagene)] to make a sequence-specific enriched genomic library using the GigapackGold (Stratagene) in vitro packaging system.

Screening, Subcloning, and Sequencing of the B-like Creatine Kinase Gene

The creatine kinase-enriched genomic libraries were screened with a CKB cDNA probe containing 538 bp of the coding region and all 204 bp of the 3'-untranslated region. Lambda clones were amplified and harvested using Lambda Sorb (Promega). The insert was released by EcoRI digestion and subcloned into plasmid vectors using standard protocols (Maniatis et al., 1982). Plasmid DNA sequencing was performed by the double-strand dideoxy chain terminating method using the T7 Sequenase Kit (Pharmacia).

Human Chromosomal Localization

Chromosomal localization was accomplished essentially as described by Stallings et al. (1988). The somatic cell hybrid clone panels consisted of 17 clones and a HeLa cell control. The blot was hybridized with a 618-bp SmaI-PstI fragment of the 5.4-kb B-like gene clone under stringent conditions for chromosomal localization.

Polymerase Chain Reaction (PCR) Amplification of the Tandem Alu Repetitive Element

Two 30-mer oligonucleotide primers with 5' added restriction sites were constructed from unique sequences from the CKB pseudogene flanking the Alu insertional element: 5'-ccgaattcatcaggctgccccacctgggca-3' (sense); 5'-ccaagcttatctgcaccagcaccacctctg-3' (antisense). The PCR mixture in a total volume of 20 µl contained 0.2 µg human genomic DNA; 1 µM sense and antisense primers; 50 mM Tris, pH 8.8; 10 mM (NH4)2SO4; 5 mM MgCl2; 14 mM mercaptoethanol; 5 µM EDTA; 100 mg/ml BSA; 1 unit Taq DNA polymerase; 1 mM dNTP. The PCR protocol was as follows: 96°C, 1 min; 68°C, 7 min. The amplification was performed for 30 cycles. The PCR products were separated on a 1% agarose gel and transferred to a Zetaprobe membrane. The blot was hybridized with 2 × 10⁶ cpm/ml ³²P-end-labeled Alu-specific DNA probe (Nelson et al., 1989), in 5× SSC, 5% SDS, 5× Denhardt's solution with 100 µg/ml of sheared denatured salmon sperm DNA at 42°C. The membranes were washed at room temperature in 1× SSC for 1 h, followed by an additional wash at 42°C for 30 min.
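As an illustrative check on the primer description above, the following Python sketch confirms that each printed primer is a 30-mer and locates a restriction site near its 5' end. The recognition sequences for EcoRI (GAATTC) and HindIII (AAGCTT) are standard. Identifying them as the "added restriction sites" is an inference from the primer sequences, not a statement in the text, and the helper names are hypothetical.

```python
# Primer sequences copied from the Methods text (lowercase as printed).
SENSE = "ccgaattcatcaggctgccccacctgggca"
ANTISENSE = "ccaagcttatctgcaccagcaccacctctg"

# Standard recognition sequences; that these are the sites the authors
# added is an assumption made for this sketch.
SITES = {"EcoRI": "gaattc", "HindIII": "aagctt"}


def describe_primer(primer):
    """Return (length, {enzyme: 0-based offset}) for each site found."""
    primer = primer.lower()
    found = {name: primer.find(site)
             for name, site in SITES.items() if site in primer}
    return len(primer), found


def reverse_complement(seq):
    """Reverse-complement a DNA string containing only a, c, g, t."""
    return seq.lower().translate(str.maketrans("acgt", "tgca"))[::-1]


sense_info = describe_primer(SENSE)
antisense_info = describe_primer(ANTISENSE)
print("sense:", sense_info)
print("antisense:", antisense_info)
```

Under this reading, the first two bases of each primer are a short 5' clamp, the next six are the restriction site, and the remaining 22 bases match the pseudogene sequence flanking the insertion.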

RESULTS

Human Creatine Kinase Gene Family

Using isoform-specific probes, derived from unique 3'-untranslated sequences of the corresponding cDNAs, we have demonstrated that both CKM and CKB are single-copy genes in the human genome (Villarreal-Levy et al., 1987). When a nonspecific creatine kinase coding region probe was used to probe Southern blots of human genomic DNA, however, three groups of DNA sequences were identified in the human genome (Fig. 1A). EcoRI restriction endonuclease digests human genomic DNA to produce distinctive DNA fragments representing each member of the cytosolic creatine kinase gene family; the CKM gene resides on a 23-kb fragment, the CKB gene is associated with an EcoRI restriction polymorphism of 16.5 and 12 kb, and a third distinct sequence, again showing EcoRI restriction polymorphism, is identified at fragment sizes of 7.8 and 5.4 kb (Fig. 1). By Southern blot analysis these B-like genomic sequences share considerable sequence similarity with the CKB cDNA and hybridize with the CKB-specific probe under nonstringent conditions. The HindIII, XbaI, and BglII restriction digests show only a single band for CKM and for CKB, as well as for the B-like creatine kinase gene. These results indicate that in addition to the single-copy CKM and CKB genes, there exists another single-copy B-like creatine kinase DNA sequence in the human genome.

FIG. 1. Human creatine kinase gene family. (A) Southern analysis of human genomic DNA digested with EcoRI, HindIII, XbaI, and BglII using a 135-bp B creatine kinase coding region probe shows cross-hybridization with M, B, and B-like creatine kinase sequences in the human genome. For the EcoRI digest (first lane), the CKM gene resides on the 23-kb fragment, and the CKB gene resides on fragments with an EcoRI restriction polymorphism (RFLP) of 16.5 and 12 kb. The boxed signals represent the B-like creatine kinase gene with RFLP of 7.8 and 5.4 kb. (B) Same blot hybridized with 3'UT CKB-specific probe under reduced stringency shows only hybridization signals from the B and the B-like locus.

Fig. 1 panel labels: (A) Probe: coding CKB; stringency: low. (B) Probe: CKB 3'UT; stringency: low. Size markers include 2.3 and 2.0 kb.

EcoRI Restriction Fragment Length Polymorphisms with B and B-like Creatine Kinase Gene

The CKB gene has an EcoRI restriction fragment length polymorphism (RFLP) of 16.5 and 12 kb and the B-like creatine kinase gene, similarly, has an EcoRI RFLP of 7.8 and 5.4 kb. With DNA isolated from 26 unrelated individuals, the frequencies of the 16.5- and 12-kb B creatine kinase alleles were calculated to be 38.5 and 61.5%, respectively. The frequencies for the 7.8- and 5.4-kb B-like creatine kinase alleles were calculated to be 17.3 and 82.7%, respectively.
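The allele frequencies above can be illustrated with a short Python sketch. Twenty-six diploid individuals carry 52 chromosomes. The allele counts used here (20/32 and 9/43) are not reported in the text; they are hypothetical values chosen because they reproduce the published percentages.

```python
def allele_frequencies(allele_counts):
    """Convert allele counts to percentages, rounded to one decimal place."""
    total = sum(allele_counts.values())
    return {allele: round(100.0 * n / total, 1)
            for allele, n in allele_counts.items()}


N_INDIVIDUALS = 26
N_CHROMOSOMES = 2 * N_INDIVIDUALS  # diploid: two alleles per individual

# Hypothetical counts (assumption): chosen to sum to 52 chromosomes and
# to match the frequencies reported in the text.
ckb_counts = {"16.5 kb": 20, "12 kb": 32}
b_like_counts = {"7.8 kb": 9, "5.4 kb": 43}

ckb_freq = allele_frequencies(ckb_counts)
b_like_freq = allele_frequencies(b_like_counts)
print(ckb_freq)
print(b_like_freq)
```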

B-like Creatine Kinase Gene Isolation and Characterization

B and B-like creatine kinase gene sequences could be separated by agarose gel electrophoresis, allowing the construction of a specific sequence-enriched genomic library. We screened 600,000 recombinant clones constructed in the EMBL4 phage vector containing inserts of 12 to 18 kb and obtained 4 positive clones, representing the human CKB genomic sequences. Studies of these clones are not reported in the present communication. Similarly, there were 3 positive clones from 600,000 recombinant clones constructed in the Lambda-Zap vector with inserts in the range 4-6 kb and 33 positive clones from 600,000 recombinant clones in the range 6-9 kb. These represent the alleles of the B-like creatine kinase genomic sequences. There were also 12 weakly hybridizing clones from the fragments in the range 6-9 kb, which were not further characterized. One of the B-like creatine kinase clones, containing a 5.4-kb insert and representing the smaller allele of the B-like gene, was subcloned into pGEM3Z (Promega) for detailed analysis and designated pHCK4. A 618-bp SmaI-PstI restriction fragment of the B-like gene was used to verify the isolation of the correct genomic fragment and for use as a B-like creatine kinase gene-specific probe for Northern blot analysis. This fragment hybridizes to the same B-like creatine kinase gene sequences identified in the original EcoRI restriction digests of human DNA, and under stringent conditions does not cross-hybridize with the CKB gene sequences (results not shown).

Fig. 2 labels: scale in kb (0-5); Poly-A; direct-repeat sequences AAGAGTCTGTGGCT and AAGAGACTGTGACT.

FIG. 2. B creatine kinase pseudogene structure. A 5.4-kb DNA fragment corresponding to the 5.4-kb signal on the EcoRI restriction digest was cloned into Lambda-Zap vector. Selected restriction sites on the insert are shown. The structure of the B-like creatine kinase gene was inferred by comparison to the CKB cDNA sequence. The corresponding sequences are represented by open rectangles. Filled arrows represent Alu repetitive elements. The locations of ATG, AATAAA, and poly(A) sites are designated, as are the flanking direct repeats of the inserted Alu complex.

Chromosomal Localization of the B-like Creatine Kinase Gene

Using a human-hamster hybrid cell line chromosomal panel, we determined that the B-like creatine kinase gene is localized to chromosome 16p13. Among the members of the human-hamster hybrid clone panel, the lowest level of discordancy of the human band that hybridized with the probe to any of the isozyme or molecular chromosomal markers studied was 6%. The marker with that low level of discordance was PGP, which has been mapped to the short arm of human chromosome 16, 16p13.3 (Hyland et al., 1989). The level of discordance with the q-arm marker used in the study, DIA4 at 16q12-q22 (Lavinha et al., 19&Q, was equal to the next lowest level of discordancy of any of the other markers, which ranged from 24 to 59%. These data confirm the provisional assignment of the gene to chromosome 16 and suggest its p-arm regional localization.
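The discordancy figures quoted above can be illustrated with a minimal Python sketch. It assumes that discordancy is the fraction of hybrid clones in which the probe signal and a chromosomal marker disagree (one present, the other absent). The 17-clone panel below is hypothetical and is constructed only so that one clone disagrees, which gives roughly the 6% reported for PGP.

```python
def discordancy_percent(probe_present, marker_present):
    """Percentage of clones where probe and marker presence disagree."""
    if len(probe_present) != len(marker_present):
        raise ValueError("both lists must cover the same clones")
    discordant = sum(p != m for p, m in zip(probe_present, marker_present))
    return 100.0 * discordant / len(probe_present)


# Hypothetical 17-clone panel (assumption): True means the human band or
# marker was detected in that clone. Only clone index 5 disagrees.
probe = [True] * 6 + [False] * 11
marker = [True] * 5 + [False] * 12

pgp_like = discordancy_percent(probe, marker)
print(f"{pgp_like:.1f}% discordant")
```

With 17 clones, each discordant clone contributes about 5.9 percentage points, so the reported range of 24 to 59% for the other markers would correspond to roughly 4 to 10 discordant clones under this assumption.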

B-like Creatine Kinase Gene Structure

Figure 2 shows the locations of selected restriction enzyme sites in the B-like creatine kinase gene clone and the location of Alu repeat sequences. Of the 5400-bp total in the insert, the DNA sequence between positions 2420 and 5400 was completely determined (Fig. 3). This region of the clone contains the entire pseudogene sequence. For clarity in this communication, the DNA sequence at position 2421 of the 5400-bp insert, corresponding to the SmaI site, is designated position 1. The organization of the gene is determined by comparison to the B creatine kinase cDNA sequence (Villarreal-Levy et al., 1987).
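The renumbering described above amounts to a fixed offset between insert coordinates and pseudogene coordinates. The Python sketch below shows the conversion and the inclusive length of a coordinate range; the function names are hypothetical, and the offset of 2420 follows from the statement that insert position 2421 is designated position 1.

```python
# Insert position 2421 is designated pseudogene position 1,
# so the two numbering systems differ by a constant 2420.
INSERT_OFFSET = 2420


def insert_to_pseudogene(position):
    """Map a position on the 5400-bp insert to pseudogene numbering."""
    return position - INSERT_OFFSET


def pseudogene_to_insert(position):
    """Map a pseudogene position back to the 5400-bp insert numbering."""
    return position + INSERT_OFFSET


def inclusive_length(start, end):
    """Number of bases from start to end, counting both endpoints."""
    return end - start + 1


print(insert_to_pseudogene(2421))    # start of the renumbered sequence
print(insert_to_pseudogene(5400))    # last sequenced position
print(inclusive_length(804, 1422))   # SmaI-PstI probe fragment
```

On this numbering the last base of the insert falls at position 2980, which agrees with the 2980 bp of sequence described in the text.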


The B-like creatine kinase gene has an ATG initiation codon at a position equivalent to that of the B creatine kinase cDNA. It is characterized by what initially seemed to be a two-exon structure, in addition to multiple small deletions and insertions. The overall length of this gene product, if functional and counted from the ATG codon, is estimated to be 1296 bp, which is 78 bp shorter than the B creatine kinase transcript. The termination codon location of this gene, when matched to that of the CKB cDNA, corresponds to the position of a small deletion. Identical localization of the polyadenylation signal and poly(A) tail are present. In the 2980 bp sequenced from the 5.4-kb genomic fragment, there are four Alu repetitive elements, one of which is located 5' to the pseudogene. The B-like creatine kinase gene itself is interrupted by 904 bp of DNA representing three Alu repetitive elements in tandem in the same orientation. The insertion of the 904-bp Alu repetitive elements is bordered by an 18-bp direct repeat. The overall sequence homology between this gene and the CKB cDNA, starting at the initiation codon ATG and ending before the poly(A) tail, excluding one of the duplicated 18-bp direct repeats and the Alu insertion complex, is 77.6%. The homology is adversely affected by the presence of multiple small deletions and less so by insertions of 1-3 bases.
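To illustrate how the two copies of the 18-bp direct repeat can be compared, the following Python sketch lists the positions at which two equal-length sequences differ. The two repeat sequences are transcribed from the annotations accompanying Fig. 3 (positions 1482-1499 and 2403-2420); the transcription, in particular of the second copy, is an assumption of this sketch rather than a verified reading of the published figure.

```python
def mismatch_positions(seq_a, seq_b):
    """Return 1-based positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# Direct repeats flanking the Alu insertion, as transcribed from the
# figure annotations (assumption: transcription is correct).
REPEAT_5P = "AAGAGGCTGTGGCTTCAG"  # positions 1482-1499
REPEAT_3P = "AAGAGACTGTGACTTCAG"  # positions 2403-2420

differences = mismatch_positions(REPEAT_5P, REPEAT_3P)
identity = 100.0 * (len(REPEAT_5P) - len(differences)) / len(REPEAT_5P)
print(differences, f"{identity:.1f}% identical")
```

The same position-by-position comparison, applied to an alignment of the pseudogene and the CKB cDNA with gaps handled explicitly, is the kind of calculation that underlies an overall homology figure such as the 77.6% quoted above.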

Creatine Kinase Gene Families in Other Species

To determine whether the CKB gene has undergone a similar duplication event in other species, we compared the restriction patterns of genomic DNA prepared from several different species, including human, dog, mouse, and rat. Figure 4 shows that B-like creatine kinase genes are present in all species examined. Dog DNA shows unexpected complexities, suggesting multiple duplication events, the number of which cannot be determined by Southern blot analysis.

Alu Tandem Repeats in B Creatine Kinase Pseudogene Are Not Polymorphic and Are Homozygous in the Human Population

To verify that the Alu tandem repeats in the CKB pseudogene are not cloning artifacts and to examine possible polymorphisms of this structure in the population, the Alu tandem repeats were amplified by PCR using unique sequence primers flanking the insertion. Figure 5 shows that the tandem repeats are present in five unrelated individuals and in an additional six unrelated individuals that were tested (results not shown). No polymorphism was apparent by gel electrophoresis analysis. This indicates that the Alu repetitive elements contained in the B creatine kinase pseudogene are homozygous in the population.

DISCUSSION

We have identified a single-copy B creatine kinase pseudogene in the human genome. This is demonstrated by the presence of multiple small deletions, insertions, and point mutations in its nucleotide sequence. The initiation codon ATG is present but the location of the corresponding termination codon is displaced by a short deletion. The canonical polyadenylation signal AATAAA is represented and followed at the appropriate distance by the poly(A) tail indicative of a processed pseudogene. A 904-bp insertion present in the B-like creatine kinase pseudogene sequence has been determined to be three tandem Alu elements aligned in the same orientation, flanked by an 18-bp direct repeat corresponding to the bordering CKB pseudogene sequence. The Alu elements abut the direct repeat sequences at each end, suggesting that the direct repeats are target-generated and are created by the transposition of the complex Alu element. Alu repetitive elements, which belong to the SINEs family (short mobile elements; Singer, 1982), are thought to be retropseudogenes (Weiner et al., 1986) derived from RNA polymerase III transcripts of 7SL RNA or its descendants (Ullu and Tschudi, 1984). Previously, Alu sequences have been found to be inserted into other identifiable DNA sequences in primate genomes, such as the α-satellite repetitive sequences in African Green Monkey (Grimaldi and Singer, 1982), the 3'-untranslated region of a functional low-density lipoprotein receptor gene (Yamamoto et al., 1984), as well as pseudogenes (Liu and Chan, 1990; Zaborsvsky et al., 1984). In rodents, Kominami et al. (1983) showed that a mouse type 2 Alu sequence is present only in some mouse strains and not in others, suggesting the mobile nature of this element.

The fact that an Alu is present at the β-globin gene locus of gorilla DNA but is absent from the corresponding homologous position of the human and other primate DNA (Trabuchet et al., 1987) and the demonstration of a polymorphic Alu sequence at the C1 gene (Stoppa-Lyonnet et al., 1990) suggest that Alu repetitive elements may have remained mobile throughout primate evolution. Recently, direct demonstration of the mobility of an Alu element in a tissue culture system was reported (Lin et al., 1988) but later found to be due to contamination by a plasmid clone, Blur-8 (Lin et al., 1989). The present results, with no ambiguity of the target site and border, support the interpretation of the occurrence of a transposition event in the distant past. The presence of the complex Alu element suggests additional ways in which complexity of the genome is created. It appears that the B-like creatine kinase gene was the product of a reverse transcription of a processed CKB mRNA that was integrated into a chromosome (chromosome 16) different from the authentic CKB gene, which is located on chromosome 14. Subsequent to this event, an Alu transposition took place and interrupted the pseudogene sequences. The mechanism for Alu transposition is believed to be similar to that of the integration of a pseudogene, that is, the incorporation of a cDNA product of a transcribed gene sequence, in this case the 7SL RNA or its descendants. The mode of integration appears to be distinct from that of the well-characterized transposons and the retroviruses (Shih et al., 1988), and AT-rich sites are proposed to be selected preferentially for integration (Daniels and Deininger, 1985). There is a suggestion that secondary structures of DNA, such as those associated with DNase I hypersensitivity sites, may invite the integration of retroposons (Vijaya et al., 1986). From the same line of reasoning it has been suggested that homopurine-homopyrimidine sequences and triple helix structures may generate a preferential integration site (Liu and Chan, 1990). Recently, using high-resolution in situ DNA hybridization techniques, it has been demonstrated that Alu elements in the human genome are not randomly distributed, but are associated with the reverse bands, or R bands, of the metaphase chromosome (Korenberg and Rykowski, 1988). At the DNA sequence level, although clustering of the Alu repetitive elements has been noted at some genomic locations, tandem repeats have been rare. In this context, it is important to differentiate tandem Alu elements from clustered Alu elements. The latter are frequently found in intragenic regions but often lack the flanking direct repeats indicative of transposition versus integration via a possible recombinational event.

In reviewing the literature, we have found six reports of tandem Alu repeats: thymidine kinase gene Alu G and H and Alu K and L (Slagel et al., 1987), tubulin gene Alu E and F (Slagel et al., 1987), α-globin tandem Alu 3A and 3B (Hess et al., 1983), nucleoplasmin pseudogene Alu 3 and 4 (Liu and Chan, 1990), and prothrombin Alu 1 and 2 (Degen et al., 1983). The mechanism for the generation of tandem repeats has been addressed only to the extent that Alu repeats may have preferentially integrated into the poly(A) tail of another Alu sequence (Rogers, 1985). Recently, Alu repeats have been demonstrated to be divided into at least three subsets, including sets with a "conserved consensus" and a "divergent consensus" (Britten et al., 1988; Jurka and Smith, 1988; Willard et al., 1987). Alu repeats may be classified into subsets, which in turn may reflect evolutionary relationships between them.

Fig. 3 annotations (sequence positions 1-2980):

Direct repeat flanking the pseudogene:
AAAAAGT (582-588)
AAAAAGT (2800-2806)

Direct repeat flanking the Alu insertion:
AAGAGGCTGTGGCTTCAG (1482-1499)
AAGAGTCTGTGGCT (1798-1811)
AAGAGACTGTGACTTCAG (2403-2420)

Corresponding B creatine kinase cDNA sequence:
AA…GACTTCAG (…-997)

FIG. 3. DNA sequences of the B creatine kinase pseudogene. Of the 5400-bp insert, the DNA sequence between positions 2420 and 5400 was completely determined and contains the entire pseudogene sequence. For clarity in this communication, the DNA sequence at position 2421 of the 5400-bp insert, corresponding to the SmaI site, is designated position 1 (Fig. 2). The direct repeats flanking the pseudogene (outside direct repeats), the Alu insertion flanking repeats, the initiation codon ATG, and the polyadenylation signal AATAAA are underlined. Also shown is the corresponding B creatine kinase cDNA sequence for the 18-bp direct repeat.

The data in this study of the B creatine kinase pseudogene locus show that the repetitive elements represent three Alu repetitive sequences in tandem, in the same orientation. This could be a result of (1) cointegration of multiple Alu sequences, (2) independent insertion of three Alu repeats in the same region, or (3) duplication or unequal crossover of an inserted Alu repeat. The Alu insertions were analyzed with respect to conserved consensus sequences. Alu-1 and Alu-2, based on these analyses, belong to class II Alu repeats (Britten et al., 1988; Jurka and Smith, 1988; Willard et al., 1987), whereas Alu-3 represents a class III Alu, or a more recently evolved class (Table 1). Alu-4, which is 5' to the pseudogene, also belongs to Alu class II. It has, however, a small 5' and a 3' truncation. The possibility exists that this Alu was generated through a recombinational event rather than transposition. Alu-3, being of the later evolved class III Alu, indicates that this Alu repetitive element was integrated later and argues against a cointegration event as a mechanism of the generation of the complex. The retention of the full-size Alu element in each of the three individual Alus within the complex also argues against unequal crossover as the mode of generation of the complex. The preservation of the 18-bp direct repeat at the border of the Alu complex, and the presence of another incomplete 13-bp repeat preceding the Alu-2 element, suggest that the integrations were serial: Alu-1 preceded Alu-2, and Alu-2 preceded Alu-3. The incomplete 13-bp repeat of the second Alu repeat is particularly interesting, as its 5'-end is composed of two As from the preceding poly(A) tail of Alu-1, and its 3'-end is composed of 4 bp of the beginning of Alu-2. This suggests that a short stretch of sequence homology may have influenced the integration site of Alu-2. The same argument can be applied to Alu-1. The possibility that Alu-1 and Alu-2 were created by an unequal crossover at meiosis between the alleles of an original complex consisting of a single Alu element needs to be considered, particularly in view of the presence of the incomplete 13-bp direct repeat 5' to Alu-2. This model, however, would demand a close identity between Alu-1 and Alu-2 sequences, since they would have been generated by a duplication event mediated through misalignment of the Alu-1 alleles. We therefore performed pairwise sequence comparisons for the three individual Alu elements. Alu-1 and Alu-2 differ from each other by 20%, Alu-2 and Alu-3 differ from each other by 22%, whereas Alu-3 and Alu-1 differ from each other by 20%. These differences are comparable to that between any two random Alu sequences in the human genome (Kariya et al., 1987) and argue against a recombination event contributing to the generation of this complex.

We analyzed the six other reported Alu tandem insertions in this fashion. The results are summarized in Table 2. Note that only in the case of the α-globin cluster Alu 3A and 3B and the thymidine kinase locus Alu K and L are the tandem Alus made from two different classes. In the present finding, the difference in the class of the Alu repetitive element inserted into the B creatine kinase pseudogene and the presence of direct repeats as described suggest three separate serial integration events. This supports the theory that the poly(A) end of the Alu element is involved in the preferential integration of subsequent Alu elements (Rogers, 1985), although the rarity of finding Alu repeats in tandem

FIG. 4. B-like creatine kinase DNA sequences in different species. Genomic restriction digests of EcoRI, HindIII, XbaI, and BglII from human placenta DNA, dog, mouse, and rat liver DNA were separated on an 0.8% agarose gel and transferred to a nylon membrane. The blot was probed with a 742-bp CKB cDNA probe, which contains a portion of the coding region and the entire 3'UT region (see Methods). The signals represent B and B-like creatine kinase sequences. The 23-kb M gene in the human EcoRI digests hybridizes faintly under such conditions. B-like creatine kinase sequences produce weaker signals than the major CKB gene hybridization signal in each restriction digest. Separate hybridization experiments with a CKM probe demonstrated that these signals, with the possible exception of the rat sequences, do not represent cross-hybridization with the CKM sequences.

Fig. 4 labels: lanes 1-16; species labels human, rat, mouse, and dog; size markers 23, 9.4, 6.5, 4.3, 2.3, and 2.0 kb.

FIG. 5. Detection of tandem Alu repetitive elements in human B creatine kinase pseudogene using PCR. Alu tandem repeats were amplified by PCR using unique sequence primers flanking the insertion. The tandem repeats were identified using an Alu-specific probe (see Methods) and are present in all individuals tested (total equals 11). Lanes 1 through 5: unrelated individuals; Lane 6: clone pCK4. No polymorphism was apparent by gel electrophoresis analysis. This indicates that the Alu repetitive elements contained in the B creatine kinase pseudogene are homozygous in the population.

Fig. 5 labels: lanes 1-6; size markers 9416, 6557, 4361, 2322, 2027, 1353, 1078, 872, and 603 bp.

TABLE 1

Classification of Alu Repetitive Elements at the B Creatine Kinase Pseudogene Locus

Column headings: Diagnostic positions in Alu repeat; Class 1; Class 2; Class 3; Class 4; Alu repeat classified (% divergence): Alu-1 (4.3%), Alu-2 (8.6%), Alu-3 (5.3%)

93 98 218 196 199 132 152
C T G C
C T G C
C T G C
C
Cl C
T T T G G
C/T T T G A
L
64 + 1 64 + 2 76 86 162
(5 C
T C C G G A G
c--j t--j A T G
t-1 (-) A T G
C T G C
(5 G
(T, C
T T T G A
C T T A A
(-) (-) A A C

Deletion occurred compared to other classes; % divergence-compared

TABLE 2

Alu Repetitive Element Subfamilies in Tandem Alu Repeats

Column headings: Alu repeats; Class IV; Class III; Class II

Thymidine kinase Alu G and H
Thymidine kinase Alu K and L
Tubulin Alu E and F
α-Globin Alu 3A and 3B
Nucleoplasmin pseudogene Alu 3 and 4
Prothrombin Alu 1 and 2
B creatine kinase pseudogene Alu-1, -2, -3

G and H L
K
E and F 3B
3A 3 and 4 1 and 2 3
1 and 2

Kinase

C T G G T A C

Class Note. (-) positions.

397

PSEUDOGENE

2

Class

Ah-4

C T G A

2

to Alu consensus

Class (2) not counting

(7.9%) A T T C T A G T (3 G A

3

Class CpG

2

and diagnostic

indicates that there is very low specificity of this factor as the basis of preferential integration. In addition, the sequence of the direct repeat in this pseudogene locus suggests that localized sequence homology to the beginning of the AZu element may have contributed to the determination of the AZu integration site. Note that in all cases of AZu elements in tandem (i.e., with outside flanking repeat), the AZu elements are all in the same orientation. The age of the CKB pseudogene can be calculated by its nucleotide divergence from CKB cDNA sequences. The non-CpG nucleotide sequence drift rate of single-copy DNA has been estimated to be about 0.15% per million years (Britten et al., 1988). The nucleotide divergence of the pseudogene, compared to that of the B creatine kinase cDNA, and AZu insertional elements, compared to the consensus AZu sequence at the non-CpG and nondiagnostic positions (Britten et aZ., 1988), are as follows: 5’ to the AZu insertion, beginning at initiation codon ATG, total 872 nucleotides, 13.2%; AZu-1, 4.3%; AZu-2, 8.6%; AZu-3, 5.3%; 3’ to the insertion, ending at poly(A) tail, total 390 nucleotides, 14.3%. This suggests that the CKB pseudogene evolved perhaps 90 million years ago, well before the primate radiation. In our calculation for the nucleotide divergence for the CKB pseudogene from the CKB gene sequence, we have included nucleotide(s) insertions or deletions as individual mutational events. Thus, the estimation of 90 million years is perhaps an overestimation. The neutral drift rate of 0.15% per million year in higher primates is based on primary sequence comparisons for silent substitutions in coding sequences and from interspecies hy-

MA

bridization data (Britten, 1986) and there is evidence that Alu sequence drift occurs at a similar rate (Britten et al., 1988). The nucleotide sequence drift rate of the pseudogene, after insertion, likely behaves similarly to that of an integrated Ah element. After finding the presence of a CKB pseudogene in the human genome, we proceeded to examine the presence of CKB gene duplication in other species, and DNA from three species were examined. Using probes derived from the coding regions of human CKM and CKB cDNA, we were able to establish hybridization and washing conditions under which cross-hybridization between species was maintained, but cross-hybridization within the species and between the CKM and CKB sequences was minimized. The results support the view that the B creatine kinase-like gene occurs in the mouse and possibly also in the rat. In the dog, there are multiple B creatine kinase-like genes in the genome. The nature and significance of the B gene duplication in these species are not known. Certainly, there has been no evidence of multiple forms of CKB in these species. Since the CKB gene is active in embryonic cells, there is a propensity, as suggested for some other pseudogenes, to have a reverse-transcription product of a CKB message incorporated into the genome of a pleuripotent cell which predates the development of germ lines, leading to the maintenance of this transposed gene product in the species. Whether any of these sequences in these species represent the same pseudogene cannot be answered. Certainly, the rodents and the dog have different ALU-like repetitive elements and the first Ah sequence insertional event would be expected to have occurred in the primate lineage. In summary, the present findings support the contention that Ah repeats integrate nonrandomly into the human genome. 
Since it is known that the higher primates have very different copy numbers of Alu family repeats, as determined by the titration method (Hwu et al., 1986), and since deletion events at homologous locations have yet to be observed, it is conceivable that this locus in the human genome, and other recoverable structures in which multiple independent transposition events occurred, may function as useful molecular clocks through which the evolution of primate radiation and Alu repetitive sequences can be studied.
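
The age estimate given in the Discussion can be reproduced with a few lines of arithmetic. The sketch below is only an illustration under stated assumptions: it takes the two flanking-segment divergences and the 0.15% per million years drift rate quoted in the text, and it assumes a length-weighted mean of the two segments, which the text does not state explicitly. The function names are hypothetical.

```python
# Illustrative sketch only: function names are hypothetical, and the
# length-weighted averaging is an assumption not stated in the text.

DRIFT_RATE_PCT_PER_MYR = 0.15  # non-CpG drift rate quoted from Britten et al. (1988)


def age_myr(divergence_pct, rate=DRIFT_RATE_PCT_PER_MYR):
    """Convert percent divergence into elapsed time in million years."""
    return divergence_pct / rate


def weighted_divergence(segments):
    """Length-weighted mean divergence over (length_nt, divergence_pct) pairs."""
    total_length = sum(length for length, _ in segments)
    return sum(length * pct for length, pct in segments) / total_length


# Pseudogene segments flanking the Alu insertion: (nucleotides, % divergence).
flanks = [(872, 13.2), (390, 14.3)]

mean_pct = weighted_divergence(flanks)
pseudogene_age = age_myr(mean_pct)

print(f"weighted divergence: {mean_pct:.2f}%")
print(f"estimated age: {pseudogene_age:.0f} million years")
```

Under these assumptions the weighted divergence is about 13.5%, which corresponds to roughly 90 million years, in line with the figure quoted in the text. Because insertions and deletions were counted as individual mutational events, this value is best read as an upper bound.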

ACKNOWLEDGMENTS

This work was supported by a grant from the American Heart Association-Texas Affiliate, Inc., the American Heart Association Bugher Foundation Center for Molecular Biology in the Cardiovascular System, and the Muscular Dystrophy Association. We also thank S. L. Terry and S. A. Montemayor for manuscript preparation, and P. A. Brink for providing RFLP data and analysis.


REFERENCES

1. BRITTEN, R. J. (1986). Rates of DNA sequence evolution differ between taxonomic groups. Science 231: 1393-1398.
2. BRITTEN, R. J., BARON, W. F., STOUT, D. B., AND DAVIDSON, E. H. (1988). Sources and evolution of human Alu repeated sequences. Proc. Natl. Acad. Sci. USA 85: 4770-4774.
3. DANIELS, G. R., AND DEININGER, P. L. (1985). Integration site preferences of the Alu family and similar repetitive DNA sequences. Nucleic Acids Res. 13: 8939-8954.
4. DEGEN, S. J. F., MACGILLIVRAY, R. T. A., AND DAVIE, E. W. (1983). Characterization of the complementary deoxyribonucleic acid and gene coding for human prothrombin. Biochemistry 22: 2087-2097.
5. FEINBERG, A. P., AND VOGELSTEIN, B. (1984). A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity: Addendum. Anal. Biochem. 137: 266-267.
6. GRACE, A. M., PERRYMAN, M. B., AND ROBERTS, R. (1983). Purification and characterization of human mitochondrial creatine kinase. J. Biol. Chem. 258: 15346-15354.
7. GRIMALDI, G., AND SINGER, M. F. (1982). A monkey Alu sequence is flanked by 13-base-pair direct repeats of an interrupted alpha-satellite DNA sequence. Proc. Natl. Acad. Sci. USA 79: 1497-1502.
8. HAAS, R. C., KORENFELD, C., ZHANG, Z., PERRYMAN, M. B., ROMAN, D., AND STRAUSS, A. W. (1989). Isolation and characterization of the gene and cDNA encoding human mitochondrial creatine kinase. J. Biol. Chem. 264: 2890-2897.
9. HAAS, R. C., AND STRAUSS, A. W. (1990). Separate nuclear genes encode sarcomere-specific and ubiquitous human mitochondrial creatine kinase isozymes. J. Biol. Chem. 265: 6921-6927.
10. HESS, J. F., FOX, M., SCHMID, C., AND SHEN, C. K. J. (1983). Molecular evolution of the human adult α-globin-like gene region: Insertion and deletion of Alu family repeats and non-Alu DNA sequences. Proc. Natl. Acad. Sci. USA 80: 5970-5974.
11. HOSSLE, J., SCHLEGEL, J., WEGMANN, G., et al. (1988). Distinct tissue specific mitochondrial creatine kinases from chicken brain and striated muscle with a conserved CK framework. Biochem. Biophys. Res. Commun. 151: 408-416.
12. HWU, H. R., ROBERTS, J. W., DAVIDSON, E. H., AND BRITTEN, R. J. (1986). Insertions and/or deletions of many repeated DNA sequences in human and higher ape evolution. Proc. Natl. Acad. Sci. USA 83: 3875-3879.
13. HYLAND, V. J., SUTHERS, G. K., FRIEND, K. L., et al. (1989). Probe, VKBB, is located in the same intervals as the autosomal dominant adult polycystic kidney disease locus, PKD1. Hum. Genet. 84: 286-288.
14. JURKA, J., AND SMITH, T. (1988). A fundamental division in the Alu family of repeated sequences. Proc. Natl. Acad. Sci. USA 85: 4775-4778.
15. KARIYA, Y., KATO, K., HAYASHIZAKI, Y., HIMENO, S., TARUI, S., AND MATSUBARA, K. (1987). Revision of consensus sequence of human Alu repeats-a review. Gene 53: 1-10.
16. KAYE, F. J., MCBRIDE, O. W., BATTEY, J. F., GAZDAR, A. F., AND SAUSVILLE, E. A. (1987). Human creatine kinase-B complementary DNA-Nucleotide sequence, gene expression in lung cancer, and chromosomal assignment to two distinct loci. J. Clin. Invest. 79: 1412-1420.
17. KOMINAMI, R., MURAMATSU, M., AND MORIWAKI, K. (1983). A mouse type 2 Alu sequence (M2) is mobile in the genome. Nature 301: 87-89.
18. KORENBERG, J. R., AND RYKOWSKI, M. C. (1988). Human genome organization: Alu, Lines, and the molecular structure of metaphase chromosome bands. Cell 53: 391-400.
19. LAVINHA, J., MORRISON, N., GLASGOW, L., AND FERGUSON-SMITH, M. A. (1984). Further evidence for regional localization of human APRT and DIA4 on chromosome 16. Cytogenet. Cell Genet. 37: 517.
20. LIN, C. S., GOLDTHWAITE, D. A., AND SAMOLS, D. (1988). Identification of Alu transposition in human lung carcinoma cells. Cell 54: 153-159.
21. LIN, C. S., GOLDTHWAITE, D. A., AND SAMOLS, D. (1989). Identification of Alu transposition in human lung carcinoma cells: A correction. Cell 59: 153-159.
22. LIU, Q-R., AND CHAN, P. K. (1990). Identification of a long stretch of homopurine-homopyrimidine sequence in a cluster of retroposons in the human genome. J. Mol. Biol. 212: 453-459.
23. MANIATIS, T., FRITSCH, E. F., AND SAMBROOK, J. (1982). "Molecular Cloning: A Laboratory Manual," Cold Spring Harbor Laboratory, Cold Spring Harbor, New York.
24. MARIMAN, E. C. M., BROERS, C. A. M., CLAESEN, C. A. A., TESSER, G. I., AND WIERINGA, B. (1987). Structure and expression of the human creatine kinase B gene. Genomics 1: 126-137.
25. NELSON, D. L., LEDBETTER, S. A., CORBO, L., et al. (1989). Alu polymerase chain reaction: A method for rapid isolation of human-specific sequences from complex DNA sources. Proc. Natl. Acad. Sci. USA 86: 6686-6690.
26. PERRYMAN, M. B., KERNER, S. A., BOHLMEYER, T. J., AND ROBERTS, R. (1986). Isolation and sequence analysis of a full-length cDNA for human M creatine kinase. Biochem. Biophys. Res. Commun. 140: 981-989.
27. REED, K. C., AND MANN, D. A. (1985). Rapid transfer of DNA from agarose gels to nylon membranes. Nucleic Acids Res. 13: 7207-7221.
28. ROGERS, J. H. (1985). The origin and evolution of retroposons. Int. Rev. Cytol. 93: 187-279.
29. SHIH, C-C., STOYE, J. P., AND COFFIN, J. M. (1988). Highly preferred targets for retrovirus integration. Cell 53: 531-537.
30. SINGER, M. F. (1982). SINES and LINES: Highly repeated short and long interspersed sequences in mammalian genomes. Cell 28: 433-434.
31. SLAGEL, V., FLEMINGTON, E., TRAINA-DORGE, V., BRADSHAW, H., AND DEININGER, P. (1987). Clustering and subfamily relationships of the Alu family in the human genome. Mol. Biol. Evol. 4: 19-29.
32. STALLINGS, R. L., OLSON, E., STRAUSS, A. W., THOMPSON, L. H., BACHINSKI, L., AND SICILIANO, M. J. (1988). Human creatine kinase genes on chromosomes 15 and 19 and proximity of the gene for the muscle form to the genes for apolipoprotein C2 and excision repair. Am. J. Hum. Genet. 43: 144-151.
33. STOPPA-LYONNET, D., CARTER, P. E., MEO, T., AND TOSI, M. (1990). Clusters of intragenic Alu repeats predispose the human C1 inhibitor locus to deleterious rearrangements. Proc. Natl. Acad. Sci. USA 87: 1551-1555.
34. TRABUCHET, G., CHEBLOUNE, Y., SAVATIER, P., et al. (1987). Recent insertion of an Alu sequence in the beta-globin gene cluster of the Gorilla. J. Mol. Evol. 25: 288-291.
35. ULLU, E., AND TSCHUDI, C. (1984). Alu sequences are processed 7SL RNA genes. Nature 312: 171-172.
36. VIJAYA, S., STEFFEN, D. L., AND ROBINSON, H. L. (1986). Acceptor sites for retroviral integrations map near DNase I-hypersensitive sites in chromatin. J. Virol. 60: 683-692.
37. VILLARREAL-LEVY, G., MA, T. S., KERNER, S. A., ROBERTS, R., AND PERRYMAN, M. B. (1987). Human creatine kinase: Isolation and sequence analysis of cDNA clones for the B subunit, development of subunit specific probes and determination of gene copy number. Biochem. Biophys. Res. Commun. 144: 1116-1127.
38. WATTS, D. C. (1973). In "Creatine Kinase (Adenosine 5-Triphosphate-creatine Phosphotransferase)" (P. D. Boyer, Ed.), pp. 383-455, Academic Press, New York.
39. WEINER, A. M., DEININGER, P. L., AND EFSTRATIADIS, A. (1986). Nonviral retroposons: Genes, pseudogenes, and transposable elements generated by the reverse flow of genetic information. Annu. Rev. Biochem. 55: 631-661.
40. WILLARD, C., NGUYEN, H. T., AND SCHMID, C. W. (1987). Existence of at least three distinct Alu subfamilies. J. Mol. Evol. 26: 180-186.
41. YAMAMOTO, T., DAVIS, C. G., BROWN, M. S., et al. (1984). The human LDL receptor: A cysteine-rich protein with multiple Alu sequences in its mRNA. Cell 39: 27-38.
42. ZABAROVSKY, E. R., CHUMAKOV, I. M., PRASSOLOV, V. S., AND KISSELEV, L. L. (1984). The coding region of the human c-mos pseudogene contains Alu repeat insertions. Gene 30: 107-111.
