Update

Reverse Genetics and Cystic Fibrosis

Michael C. Iannuzzi and Francis S. Collins

Departments of Internal Medicine and Human Genetics, Howard Hughes Medical Institute, University of Michigan, Ann Arbor, Michigan

The protein responsible for cystic fibrosis has been identified using an approach called "reverse" genetics. This approach relies on the chromosomal map position to direct the search for a disease gene, several novel cloning strategies to isolate the gene, and the gene's sequence to define the abnormal protein. Reverse genetics, because it does not require prior knowledge of the protein's biochemical function, has wide utility and is being used to define the defects in many single-gene disorders. This update presents the reverse genetics approach and uses cystic fibrosis to illustrate the principles involved.

The standard biochemical approach to the molecular basis of inherited disorders has often been hampered by a lack of functional assays, a low abundance of abnormal gene products, the complexity of cellular and protein interactions, and even the inability to identify the affected cell type. An alternative approach that does not depend on cellular pathophysiology or biochemical data has recently been developed. This approach is based on locating (mapping) the disease gene in the human genome and using the map position to clone the gene, sequence it, and determine the abnormal gene product. This process has been called "reverse" genetics, because it is the reverse of the usual approach of isolating the protein and then using the protein to clone the gene (1). In collaboration with investigators at the Hospital for Sick Children in Toronto, we recently used reverse genetics to identify the abnormal protein responsible for cystic fibrosis (CF), the cystic fibrosis transmembrane regulator (CFTR) (2-4). Although biochemical and electrophysiologic investigations had previously detected a physiologic abnormality in CF (abnormal ion transport in the apical membrane of epithelial cells), the CF protein was not forthcoming. By mapping the gene to the long arm of chromosome 7 and then homing in on and cloning the CF gene, the responsible protein was identified. This article will review the principles and techniques used in reverse genetics and will focus on CF, although reverse genetics has also yielded the genes for chronic granulomatous disease, Duchenne muscular dystrophy, retinoblastoma, and Wilms' tumor and is being used for several other single-gene disorders (5-14).

(Received in original form January 23, 1990 and in revised form January 26, 1990)
Address correspondence to: Michael C. Iannuzzi, M.D., Department of Internal Medicine, Division of Pulmonary and Critical Care Medicine, University of Michigan, Ann Arbor, MI 48109.
Abbreviations: cystic fibrosis, CF; cystic fibrosis transmembrane regulator, CFTR; Duchenne muscular dystrophy, DMD; pulsed field gel electrophoresis, PFGE; restriction fragment length polymorphism, RFLP; yeast artificial chromosomes, YAC.
Am. J. Respir. Cell Mol. Biol. Vol. 2. pp. 309-316, 1990

Genetic Linkage Analysis

The first step is to establish the chromosomal position of the disease gene. This is not difficult for those diseases that are sex linked or have specific cytogenetic abnormalities visualized by karyotypic analysis. For example, in Duchenne muscular dystrophy (DMD), a number of patients with cytogenetic abnormalities were identified: several female patients with a chromosome translocation between one X chromosome and an autosome, and a boy with a deletion within the short arm of the X chromosome (15). These cytogenetically abnormal chromosomes helped to pinpoint the DMD gene to band Xp21. For those diseases without sex linkage or cytogenetic abnormalities, such as CF, the disease gene is mapped using genetic linkage analysis.

To understand gene mapping by linkage analysis, it is helpful to review a few terms and to recognize the magnitude of the task. The human genome contains the instructions, or genes, for making tens of thousands of different proteins. It consists of two sets of 23 chromosomes, and each set is 3 billion base pairs in length. The human karyotype is the display of the 46 chromosomes at mitosis, when each chromosome has its own unique pattern of bands. An allele is one of several alternate forms of a gene (e.g., alleles for eye color). A gene locus is the chromosomal address at which the gene for a particular trait resides and is occupied by any one of the alleles for that gene. The chromosomal address is reported as the chromosome number, whether it is on the long arm (q) or short arm (p), and the chromosomal band(s) to which it has been located; for example, the interleukin-2 gene locus is at 4q26-q28 (16).
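The chromosomal address convention just described (chromosome, arm, band or band range) can be illustrated with a small parser. This is only a sketch: the function name `parse_locus`, the returned field names, and the regular expression are assumptions made for illustration, and the pattern covers only simple addresses such as Xp21 or 4q26-q28, not the full cytogenetic nomenclature.

```python
import re

# Hypothetical pattern for simple band addresses such as "Xp21" or "4q26-q28".
# It does not attempt to cover forms like "7cen-q22".
LOCUS_RE = re.compile(
    r"^(?P<chrom>[1-9]|1[0-9]|2[0-2]|X|Y)"      # chromosome 1-22, X, or Y
    r"(?P<arm>[pq])(?P<start>\d+(?:\.\d+)?)"     # arm (p short, q long) and band
    r"(?:-(?P<end_arm>[pq])?(?P<end>\d+(?:\.\d+)?))?$"  # optional band range
)


def parse_locus(address):
    """Split a chromosomal address into chromosome, arm, and band range."""
    match = LOCUS_RE.match(address.strip())
    if match is None:
        raise ValueError(f"unsupported locus format: {address!r}")
    start = match["start"]
    return {
        "chromosome": match["chrom"],
        "arm": match["arm"],
        "start_band": start,
        # A single band is reported as a range of one band.
        "end_arm": match["end_arm"] or match["arm"],
        "end_band": match["end"] or start,
    }


if __name__ == "__main__":
    print(parse_locus("4q26-q28"))  # interleukin-2 locus quoted in the text
    print(parse_locus("Xp21"))      # DMD band quoted in the text
```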

AMERICAN JOURNAL OF RESPIRATORY CELL AND MOLECULAR BIOLOGY VOL. 2 1990


[Figure 1 diagram: DNA carrying three restriction sites compared with DNA carrying two restriction sites, each probed with a labeled DNA marker. Steps shown: digest with restriction endonuclease; separate fragments by gel electrophoresis; Southern blot DNA; hybridize with labeled DNA marker. The autoradiogram shows bands at 3 kb, 2 kb, and 1 kb.]

Figure 1. Restriction fragment length polymorphism (RFLP). A change in the DNA sequence that affects a restriction site is detected by a difference in size of restriction fragments.
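The fragment-size difference that Figure 1 depicts can be mimicked with a toy digest. The sketch below is an illustration under stated assumptions, not a laboratory protocol: the sequences are made-up strings tens of bases long rather than kilobases, the cut is placed at the start of each recognition site rather than at the enzyme's true cleavage position, and the helper names `fragments` and `detected_size` are hypothetical.

```python
def fragments(seq, site):
    """Cut seq at the start of every occurrence of site; return (start, end) pairs."""
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    # Drop empty fragments (for example, when the sequence begins with a site).
    return [(a, b) for a, b in zip(bounds, bounds[1:]) if b > a]


def detected_size(seq, site, probe_pos):
    """Length of the fragment containing probe_pos, i.e. the band the marker labels."""
    for start, end in fragments(seq, site):
        if start <= probe_pos < end:
            return end - start
    raise ValueError("probe position lies outside the sequence")


SITE = "CCGG"  # four-base recognition sequence used for illustration

# Toy allele with three sites (at positions 5, 15, and 35).
allele_1 = "A" * 5 + SITE + "A" * 6 + SITE + "A" * 16 + SITE + "A" * 5
# Same allele with a single base change that destroys the middle site.
allele_2 = allele_1[:15] + "CTGG" + allele_1[19:]

PROBE_POS = 10  # hypothetical position where the labeled marker hybridizes

if __name__ == "__main__":
    print(detected_size(allele_1, SITE, PROBE_POS))
    print(detected_size(allele_2, SITE, PROBE_POS))
```

Because the probe lies between the first two sites, losing the middle site merges two fragments into one longer fragment, which is the band shift seen on the autoradiogram.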

The size of a typical gene is about 10,000 base pairs, although genes may be as small as several hundred base pairs in length or as large as a few hundred thousand base pairs. Locating a gene in the human genome (3 billion base pairs in length) is about the same as finding a specific house on earth while standing on the moon a quarter of a million miles away. A task of this magnitude can be accomplished because of three factors: first, the availability of cloned "marker" DNA with assigned chromosomal location; second, the discovery of restriction endonucleases, which cleave DNA only after recognizing short specific DNA sequences; and third, the phenomenon of recombination.

More than 2,000 marker DNA segments spanning the human genome have been cloned and mapped (16). The chromosomal position of DNA markers is usually determined by in situ chromosomal hybridization, allowing visualization of the marker on a specific chromosome, or by Southern blot hybridization to a panel of DNAs prepared either from flow-sorted chromosomes or from somatic cell hybrids. A review of these mapping techniques is referenced (17).

A cloned DNA segment with assigned chromosomal position is useful as a marker for linkage analysis when it detects sequence variation (polymorphism) in the population. When the difference in sequence, which may be as small as a single base change, creates or eliminates a recognition site detected by a specific restriction endonuclease, the polymorphisms are called restriction fragment length polymorphisms (RFLPs). The detection of RFLPs begins with obtaining blood samples. DNA is extracted from the nuclei of white blood cells and digested with a restriction endonuclease. The resulting DNA fragments are separated according to size by agarose gel electrophoresis and transferred from the gel onto a nylon membrane by Southern blotting. DNA markers labeled with 32P are used to detect complementary sequences by hybridization to the DNA fragments on the nylon membrane. A change in the DNA sequence that affects a restriction site is detected on the autoradiogram as a difference in the size of the restriction fragments (Figure 1).

In addition to RFLPs detected by cloned DNA markers, linkage analysis depends on the fact that genes, as well as noncoding fragments of DNA generated by restriction endonucleases, tend to be inherited together (i.e., linked) when they are close together on the chromosome and inherited independently when they are far apart. This is due to recombination, which occurs during meiosis. Parental chromosomes are not transmitted intact; they are transmitted after homologous regions of a chromosome pair recombine, that is, cross over and exchange segments of equal length (Figure 2). Recombination does not alter the order of genes on the chromosome, although new combinations of alleles, not present in the parents, may be found in their progeny.

Linkage between two DNA markers or between a DNA marker and a disease locus, and the estimate of their recombination frequency, are determined by statistical analysis using computer programs (e.g., LIPED) (18). The statistical measure of how strongly two genetic loci are linked is given as a lod score (log of the odds in base 10). A lod score greater than 3 means that the odds are greater than 1,000:1 in favor of linkage between the two loci. A lod score below -2 indicates that the loci are unlinked, lying either well apart on the same chromosome or on different chromosomes.
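As a rough illustration of what a lod score measures, the sketch below computes the textbook two-point lod score for the simplest case, in which every meiosis can be scored directly as recombinant or nonrecombinant. That phase-known setting, the function name `lod_score`, and the example counts are assumptions for illustration; pedigree programs such as LIPED handle far more general situations, and this is not a description of how they work internally.

```python
import math


def lod_score(recombinants, meioses, theta):
    """log10 of the likelihood at recombination fraction theta over that at 0.5.

    Assumes phase-known meioses, so the likelihood is binomial in the
    number of recombinants.
    """
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")
    k, n = recombinants, meioses
    log_linked = k * math.log10(theta) + (n - k) * math.log10(1 - theta)
    log_unlinked = n * math.log10(0.5)  # free recombination between unlinked loci
    return log_linked - log_unlinked


if __name__ == "__main__":
    # Hypothetical counts: 1 recombinant in 20 meioses favors tight linkage,
    # while 10 recombinants in 20 argues against it at theta = 0.05.
    print(round(lod_score(1, 20, 0.05), 2))
    print(round(lod_score(10, 20, 0.05), 2))
```

In practice the score is evaluated over a range of theta values and the maximum is reported together with the theta at which it occurs.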


[Figure 2 diagram: two panels showing a pair of homologous chromosomes carrying alleles A/a and B/b before meiosis I and the recombinant chromosomes that result; one panel shows loci far apart, the other is labeled "Loci close together".]

Figure 2. Recombination occurs between homologous chromosomes in meiosis. On the left is an example where two loci are relatively far apart so that recombination is likely to occur between them. The order of the genes on the new chromosome is not altered but new combinations of alleles are present. On the right is an example where the loci are close together, in which case recombination between them will be a rarer event.

The recombination frequency of unlinked loci is 50%, as there is an even chance of being passed on together or apart. For linked loci, the percentage of recombination can be related to their physical distance apart on the chromosome. By definition, the genetic distance between two DNA segments that cross over in 1% of progeny is 1 centiMorgan (cM). Based on recombination studies, the genetic length of the entire human genome is 3,300 cM. Loci with a 1% recombination frequency or a genetic distance of 1 cM therefore lie 1/3,300 of the human genome apart; since the physical length of the genome is about 3 billion base pairs, 1 cM is about 1 million base pairs (1/3,300 x 3 billion). The physical distance derived from the genetic distance is a rough estimate, since recombination is not evenly spaced over the genome; there are hot spots for recombination, as well as regions in the genome where recombination is suppressed.

Linkage to CF was first reported with the polymorphic serum enzyme paraoxonase (PON) (19). Unfortunately, the PON gene had not been cloned and its chromosomal location was not known. Shortly thereafter, linkage was reported to the DNA marker DOCRI-917 located on the long arm of chromosome 7, and then to two other DNA markers from 7q21-31, the met gene and J3.11 (D7S8)¹ (20-22). DOCRI-917 was the most distant from CF, at a recombination distance of 15 cM. The met gene, which detects a polymorphism with the restriction enzyme MspI (recognition sequence CCGG), and J3.11, which detects a polymorphism with TaqI (TCGA), were each within 1 to 2 cM of the CF gene. A large collaborative study indicated that CF was somewhere between met and J3.11 (23). In addition to bracketing the CF region, these two tightly linked polymorphic markers were immediately useful for genetic diagnosis in affected families (24).

Genetic linkage maps, such as the one for cystic fibrosis (met-CF-D7S8), describe the order of genes and DNA markers based on their pattern of inheritance. The best resolution obtained with mapping by genetic linkage analysis is rarely better than 1 to 2 cM, or about 1 or 2 million base pairs; a finer resolution map is generally not obtained because the number of affected families is not infinite and recombination is unlikely to occur within the small chromosomal interval between a linked marker and the disease gene. For example, despite a 4-yr search by many investigators in several countries, only one certain recombinant between D7S8 and CF was found (25). While the distance between markers estimated by genetic analysis is useful, it is important to confirm the physical size of the target region before searching for the disease gene.
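The conversion between genetic and physical distance used above is simple arithmetic, sketched here with the genome-wide figures quoted in the text (3,300 cM and about 3 billion base pairs). The constant and function names are chosen for illustration, and the result is only a genome-wide average, since recombination is not evenly distributed.

```python
GENOME_LENGTH_BP = 3_000_000_000  # physical length quoted in the text
GENOME_LENGTH_CM = 3_300          # genetic length quoted in the text


def cm_to_bp(centimorgans):
    """Average physical distance (bp) corresponding to a genetic distance in cM."""
    return centimorgans * GENOME_LENGTH_BP / GENOME_LENGTH_CM


if __name__ == "__main__":
    for label, cm in [("1 cM", 1), ("2 cM", 2), ("15 cM (DOCRI-917 to CF)", 15)]:
        print(f"{label}: about {cm_to_bp(cm) / 1e6:.1f} million base pairs")
```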
¹ Once DNA segments with unknown function are found useful as markers, they are given names that follow genetic nomenclature (16). For example, J3.11 was named D7S8: "D" for DNA; "7" for the chromosomal assignment; "S" to indicate the complexity of the DNA segment detected by the probe ("S" is for a unique DNA segment; "Z" is for repetitive DNA segments found at a single chromosome site); and "8" is a number to give uniqueness to the concatenated symbols.

We will now discuss how, once the chromosomal location of a disease gene is known, a more precise physical measure is made.

Physical Mapping

The ultimate aim of DNA mapping is the nucleotide sequence; on the way to attaining the sequence is the physical map. One type of physical map, a restriction map, is obtained by breaking the DNA molecule at defined points and separating the fragments according to their size. The specific breakage at defined points is made possible by restriction endonucleases, which cleave DNA only at specific sequences, and the size of the restriction fragments generated is determined by agarose gel electrophoresis. The resolution of standard restriction maps, representing the linear sequence of sites at which particular restriction enzymes find their targets, is from a few hundred to several thousand base pairs.

Another type of physical map is the cytogenetic map of banded chromosomes. The best resolution obtained in cytogenetic mapping, which depends on light microscopy, is 5 to 10 million nucleotides. Until recently the middle range in physical mapping, between standard restriction site mapping (10 to 20 kilobases [kb]) and cytogenetic mapping (5 to 10 million nucleotides), contained a serious gap. There were two obstacles to bridging this gap: first, a lack of enzymes that cleave human DNA infrequently enough to produce large DNA fragments, and second, the inability to separate DNA fragments much larger than 50 kb using conventional gel electrophoresis. These two obstacles were overcome by (1) the discovery of "rare cutter" restriction enzymes that cleave DNA into fragments with average sizes ranging from 100,000 to 1 million nucleotides, and (2) the development of pulsed field gel electrophoresis (PFGE).

The recognition sequences for rare cutter restriction enzymes contain the dinucleotide CpG (cytosine-guanine). This dinucleotide is both underrepresented in the mammalian genome and frequently methylated, making it more resistant to endonuclease digestion. For example, NotI (which contains two CpG dinucleotides in its recognition sequence GCGGCCGC) cuts human DNA into fragments averaging about 500,000 bp in size.

PFGE, used to separate large DNA fragments, subjects DNA to pulses of nonuniform opposing fields (26-29). This is in contrast to conventional gel electrophoresis, which uses a constant unidirectional electric field. Separation of longer DNA fragments with PFGE probably results from conformational changes, an uncoiling and recoiling of DNA, that occur as the field direction switches. The shorter DNA fragments, by taking less time to orient to the changing electric fields, move down the gel more quickly than the larger DNA fragments. A variety of instruments for PFGE are available and differ primarily in the geometry of the electric fields used to separate the DNA (30-33). There are instruments with fields at a perpendicular angle, orthogonal field alternation gel electrophoresis (OFAGE); with fields in a hexagonal array at a 120° angle, contour-clamped homogeneous electric field gel electrophoresis (CHEF); and with fields at 180°, field inversion gel electrophoresis (FIGE). All of these PFGE instruments separate DNA from 50 kb to more than 9 megabase pairs (1 megabase = 1 million base pairs = 1 Mb). The critical parameters affecting size separation are voltage, buffer temperature, agarose concentration, and pulse time. Intact yeast chromosomes of known length are used as size markers.

Several groups have presented physical maps of the CF region using PFGE (34, 35). The physical distance from the met gene to the DNA marker D7S8 was found to be 1,300 to 1,800 kb, which correlated well with the 1 to 2 cM distance estimated by linkage analysis. In addition to sizing the CF region, these maps greatly aided cloning efforts by establishing the location of newly isolated clones. Figure 3 presents the physical map of the region spanning the met gene to D7S8.

To summarize, linkage analysis is used to locate the disease gene on a chromosome; one seeks to identify closely linked markers that flank or bracket the target region. Pulsed field mapping is then used to determine the actual physical size of the region and to facilitate cloning. The approaches used to reach the disease gene from this point vary. We will discuss four molecular cloning strategies.
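Why an enzyme such as NotI cuts so rarely can be sketched with a back-of-the-envelope model. The code below is a crude illustration, not a measured result: it assumes the four bases are equally frequent and independent, and it treats CpG depletion as a single adjustable factor `cpg_ratio` (observed over expected CpG frequency) applied once per CpG in the recognition sequence. Both the function name and the example ratio of 0.25 are hypothetical, and methylation is not modeled at all.

```python
def expected_spacing(site, cpg_ratio=1.0):
    """Expected distance (bp) between occurrences of site in random sequence.

    Assumes equal, independent base frequencies; each CpG in the site
    scales the site's probability by cpg_ratio (1.0 means no depletion).
    """
    if not 0 < cpg_ratio <= 1:
        raise ValueError("cpg_ratio must be in (0, 1]")
    probability = (0.25 ** len(site)) * (cpg_ratio ** site.count("CG"))
    return 1 / probability


if __name__ == "__main__":
    # Recognition sequences quoted in the text.
    for name, site in [("MspI", "CCGG"), ("TaqI", "TCGA"), ("NotI", "GCGGCCGC")]:
        print(name, round(expected_spacing(site)))
    # Illustrative (hypothetical) CpG depletion factor applied to NotI.
    print("NotI with cpg_ratio=0.25:", round(expected_spacing("GCGGCCGC", 0.25)))
```

Even without any CpG effect, an 8-base site is expected far less often than a 4-base site; the much larger average fragment size reported for NotI in the text is consistent with the additional influence of CpG underrepresentation and methylation.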

Molecular Cloning

One tactic is to screen phage and cosmid libraries with probes for the linked markers and then sequentially isolate overlapping clones. This process of proceeding along a stretch of the chromosome by isolating overlapping clones is called chromosome walking. Cosmids, the cloning vectors with the largest capacity, carry 40 to 45 kb of insert DNA, and yield about 20 kb of additional DNA with each new clone. Chromosome walking with cosmid or phage is tedious and unfortunately seldom allows one to proceed more than 200,000 to 300,000 bases in a given direction because of repetitive or unclonable DNA present throughout the genome.
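The scale of the problem can be made concrete with a short calculation. This is a sketch only, assuming the roughly 20 kb gained per cosmid step quoted above and the 1,300 to 1,800 kb met to D7S8 interval measured by PFGE; the function name `walking_steps` is hypothetical, and the count ignores failed steps and unclonable regions.

```python
import math


def walking_steps(distance_kb, gain_per_step_kb=20):
    """Minimum number of sequential walking steps needed to cover distance_kb."""
    if distance_kb < 0 or gain_per_step_kb <= 0:
        raise ValueError("distance must be >= 0 and gain per step must be > 0")
    return math.ceil(distance_kb / gain_per_step_kb)


if __name__ == "__main__":
    print(walking_steps(300))    # practical upper limit of a walk, per the text
    print(walking_steps(1300))   # lower estimate of the met to D7S8 interval
    print(walking_steps(1800))   # upper estimate of the met to D7S8 interval
```

Because each step depends on the clone isolated in the previous one, the steps cannot be run in parallel, which is why walking alone was impractical for an interval of this size.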

[Figure 3 diagram: long-range restriction map of the region from the met gene through CF to D7S8, with a 500 kb scale bar. The positions of individual clones and restriction sites are not recoverable from the extracted text.]

Figure 3. Pulsed field gel electrophoresis (PFGE) mapping of the CF region. The distance from the met oncogene to the DNA marker D7S8 is about 1.8 million base pairs. The open box indicates the segment cloned by walking and jumping (about 280 kb) and the horizontal arrow indicates the region covered by the CF transcript (about 250 kb). The vertical arrows indicate the location of isolated clones used for mapping. The symbols for each rare cutter restriction enzyme are A, Nae I; B, BssHII; F, Sfi I; L, Sal I; M, Mlu I; N, Not I; R, Nru I; and X, Xho I.


Another tactic is saturation cloning, performed by isolating and mapping a few hundred random clones from a chromosome-specific library to identify clones in a target region. Chromosome-specific libraries, prepared either from somatic cell hybrids containing human chromosomes or from chromosomes physically separated by flow cytometry, are available for all the chromosomes. Rommens and coworkers (36) reported that from 258 random clones derived from a chromosome 7 library, 53 mapped to the general region of CF (chromosome band 7q31-7q32); two of these were closer to CF by linkage analysis than the met gene. Interestingly, these two random clones, D7S122 and D7S340, were only 10 kb apart, but were useful because they narrowed the region to be searched; by pulsed field mapping they were 450 kb closer to CF than met. Yet the molecular distance to be traversed was still likely to be a few hundred kilobases and too great a stretch for convenient chromosome walking.

A third approach is to screen phage and cosmid libraries constructed from a subchromosomal piece of DNA containing the disease gene. This allows enrichment for DNA sequences from the target region. Three methods used to generate subchromosomal size pieces of DNA are reduced somatic cell hybrids, preparative pulsed field gel electrophoresis, and cloning in yeast artificial chromosomes. Scambler and coworkers (37) reported using chromosome-mediated gene transfer of human DNA into mouse cells to select a small piece of human chromosomal DNA containing both the met oncogene and the CF gene. The transforming properties of the met oncogene provided the means to select for these reduced somatic cell hybrids containing the DNA of interest. Out of several hybrid cell lines constructed, one line contained 1 Mb of human DNA from the target region on human chromosome 7 and was used to construct a cosmid library.
While chromosome-mediated gene transfer and other methods relying on somatic cell hybrids can be used to isolate human DNA fragments several megabases in length, disadvantages of this strategy include the requirement for a selection scheme, lack of control over the size of fragments transferred, and a high frequency of sequence rearrangements.

Selective cloning of DNA fragments from a PFGE gel can also be used. Michiels and colleagues (38) used PFGE to separate large fragments of DNA and reported identifying a 450 kb NotI fragment hybridizing to the met gene. A phage library constructed from DNA cut out of the gel provided a clone from one end of the NotI fragment and 300 kb 5' to the met gene. Unfortunately, CF was located to the 3' side of met (39) and so this clone was not useful. While "preparative" pulsed field gels aid cloning from a specific subchromosomal region, a disadvantage is the high background of clones from contaminating DNA comigrating with the target DNA during electrophoresis.

A final example of how subchromosomal fragments of DNA can be obtained is cloning in yeast artificial chromosomes (YAC). Plasmid vectors containing known yeast centromere and telomere sequences are used to maintain artificial chromosomes in yeast hosts (40, 41). YAC libraries made with these vectors yield clones with inserts that are hundreds of kilobases in size, which can be subcloned into phage or cosmid libraries. Technical difficulties in both making and screening YAC libraries have previously hindered their wide availability and general application, but these problems are being solved and YAC technology is emerging as the method of choice for cloning a large region (42).

Another gene cloning strategy that addresses the general problem of cloning over large distances is chromosome jumping. Chromosome jumping allows the isolation of DNA segments separated in the genome by distances up to several hundred thousand base pairs without isolating all the intervening DNA, as required in conventional chromosome walking. Chromosome jumping, as described by Collins and Weissman (43) and independently by Poustka and Lehrach (44), depends on circularizing very large DNA fragments, followed by cloning of the junction fragments of these circles, which bring together DNA sequences that were originally located a considerable distance apart (Figure 4). An additional advantage of chromosome jumping is that it avoids the problem of repetitive or unclonable DNA which prevents sequential walks.

[Figure 4 diagram: flow chart of chromosome jumping. Steps shown: very high molecular weight DNA; partial Mbo I digest; selection of desired size range by PFGE; low concentration ligation in the presence of a molar excess of supF; Eco RI digestion; ligation to phage arms, packaging, and plating on a sup- host; library of junction fragments; junction fragment clone.]

Figure 4. Principles of cloning by chromosome jumping. The heavy bar represents the starting probe, which in the final junction fragment clone is present along with another segment of DNA (open box) that was initially far away in the genome. The starting probe and the jump piece in the junction fragment are connected by a short (200 bp) marker gene.

Cloning the CF Gene

The strategy used to search for CF candidate gene sequences combined several molecular cloning techniques, including saturation cloning, chromosome jumping, and chromosome walking. Saturation cloning provided two markers, D7S122 and D7S340, which narrowed the interval to be searched, and chromosome jumping accelerated cloning DNA from the target region. Each jump clone traversed about 75 to 100 kb of DNA, and at the end point of each jump bidirectional walks were initiated to obtain DNA between the jumps. Recombinational analysis with RFLPs detected by the jump and walk clones excluded more than 300 kb of DNA in the interval between D7S8 and CF (45, 46), but between D7S122 and CF it was necessary to search through more than 280 kb of continuous DNA before the CF gene was identified, because individuals recombinant in this interval were unavailable.

How do you sort through such a large amount of DNA? In other words, how do you determine if an isolated clone is part of a gene? Since many genes show evolutionary conservation, one of the most useful signs is detecting cross-hybridizing sequences between species. Cross-hybridizing sequences can be detected by labeling intact phage or cosmid clones and hybridizing to Southern blots containing DNA from a variety of species such as mouse, bovine, and chicken (a so-called zoo blot). A second guidepost is the presence of CpG islands (47, 48). These islands are stretches of DNA, 500 to 2000 bp in length, which are rich in the nonmethylated dinucleotide CpG and contain clusters of sites for the rare cutter restriction enzymes. These islands often mark the 5' end of vertebrate genes. Additional tests include screening mRNA or cDNA libraries for evidence of transcription in affected tissues. In sorting DNA for genes, several criteria must be used because not all human genes are conserved, many genes are not marked by CpG islands, and cDNA clones and mRNA transcripts may be difficult to detect for genes with low levels of gene expression or small exons.

After several false starts, a clone which met these criteria was identified. It was conserved, contained a CpG-rich island, detected cDNA clones in sweat gland and tracheal cDNA libraries, and detected a 6.5 kb mRNA transcript in several tissues affected in CF (2-4). Overlapping cDNA clones were isolated and the entire gene sequenced. No gross rearrangements in this gene were seen in individuals with CF, nor was there any apparent difference in the size or abundance of the mRNA transcript between normal and CF tissues. This gene, which spans a 250 kb region in the genome, was identified as the CF gene because a specific mutation, a 3 bp deletion removing phenylalanine 508 from the coding region, was found in affected individuals but not in over 600 normal chromosomes (4, 49). The 3 bp mutation accounts for about 75% of individuals with CF, and it is likely that the remaining mutations will be defined within a short period.

Figure 5. Gene, transcript, and predicted CF protein identified by reverse genetics. The CF gene spans about 250 kb in the genome (top line). The mRNA transcript is 6,129 bp long and contains 5' and 3' untranslated (UT) regions. The complete sequence of the 1,480 amino acid protein is predicted from the DNA sequence. The folding of the protein and its location in the cell membrane are at present hypothetical. Triangles indicate the location of the 3 bp deletion in exon 10 representing the common CF mutation. This mutation leads to a loss of a single amino acid in the protein product. (From Principles of Medical Genetics by Thomas Gelehrter and Francis Collins. Permission for figure granted by Williams & Wilkins Co., Baltimore. Copyright 1990.)

Cystic Fibrosis Transmembrane Regulator

Based on the gene sequence, the CF protein or cystic fibrosis transmembrane regulator (CFTR) is 1,480 amino acids long with a molecular mass of 168,138 Da (3). The predicted CFTR protein (Figure 5) has two amino acid sequence domains resembling consensus nucleotide (ATP) binding folds. It also has two repeated motifs, each of which consists of a domain capable of spanning the membrane several times. The protein unexpectedly resembles the mammalian multidrug resistance P-glycoprotein. How CFTR relates to the observed chloride conductance abnormality and to the pathophysiology in CF is still unknown. CFTR may itself serve as an ion channel, but the CFTR protein does not resemble other known anion channel structures. It is entirely possible that CFTR has some other unique epithelial function, and the effect on chloride conductance is secondary. Many investigators, with the aggressive and enthusiastic support of the Cystic Fibrosis Foundation, are now working to define CFTR function and to develop new treatment strategies based on protein structure.

Reverse Genetics and the Human Genome Mapping Project

Through the National Institutes of Health and the Department of Energy, the U.S. government has embarked upon the Human Genome Project, which will have enormous influence on understanding and treating human disease. The objective of the Human Genome Project is to generate a complete map and sequence of the human genome. Assuming one character per nucleotide, the result will fill 13 sets of the Encyclopaedia Britannica (50). From a complete map, a set of markers spaced at equal intervals along the chromosomes may be used to search for genes.
After discovering linkage, the location of the disease gene may be further pinpointed by selecting additional markers from the defined region. A computer search of the DNA sequence of this interval may then reveal the gene of interest. If not, all cloned DNA from the smaller target region would be immediately available for transcript searching. The genome mapping project will eventually not only allow single-gene defects to be found more efficiently, but will also hasten the search for the genetic basis of multigene disorders and predisposition to common diseases, such as atherosclerosis and lung cancer.

Summary

The reverse genetics approach relies on linkage analysis, physical mapping, and molecular cloning. DNA markers inherited together with a disease suggest that the markers and the disease reside on the same chromosome. The frequency of recombination between markers and a disease locus reflects their genetic distance, and the actual distance is determined by physical mapping. A variety of powerful techniques now exist for searching an area of several hundred thousand base pairs for the desired gene, though the process is still laborious. By isolating the gene and determining its sequence, accurate carrier detection and prenatal diagnosis are made possible, and the composition of the protein and the biochemical defect may be identified.

Acknowledgments: Drs. Iannuzzi and Collins acknowledge support from the Cystic Fibrosis Foundation and Grant DK39690-01 from the National Institutes of Health. Dr. Collins is an Associate Investigator of the Howard Hughes Medical Institute.

References
1. Orkin, S. H. 1986. Reverse genetics and human disease. Cell 47:845-850.
2. Rommens, J. M., M. C. Iannuzzi, B.-S. Kerem et al. 1989. Identification of the cystic fibrosis gene: chromosome walking and jumping. Science 245:1059-1065.
3. Riordan, J. R., J. M. Rommens, B.-S. Kerem et al. 1989. Identification of the cystic fibrosis gene: cloning and characterization of complementary DNA. Science 245:1066-1073.
4. Kerem, B.-S., J. M. Rommens, J. A. Buchanan et al. 1989. Identification of the cystic fibrosis gene: genetic analysis. Science 245:1073-1080.
5. Orkin, S. H. 1989. Molecular genetics of chronic granulomatous disease. Annu. Rev. Immunol. 7:277-307.
6. Monaco, A. P., and L. M. Kunkel. 1988. Cloning of the Duchenne/Becker muscular dystrophy locus. Adv. Hum. Genet. 17:61-98.
7. Mandel, J. L. 1989. Dystrophin. The gene and its product. Nature 22:584-586.
8. Weinberg, R. A. 1988. Finding the anti-oncogene. Sci. Am. 259:44-51.
9. Gessler, M., K. O. Simola, and G. A. Bruns. 1989. Cloning of breakpoints of a chromosome translocation identifies the AN2 locus. Science 244:1575-1578.
10. Call, K. M., T. M. Glaser, C. Y. Ito et al. 1990. Description and characterization of a zinc finger polypeptide gene at the human chromosome 11 Wilms' tumor locus. Cell. In press.
11. Rose, E. A., T. Glaser, C. A. Jones et al. 1990. Complete physical map of the WAGR region of 11p13 localized a candidate Wilms' tumor gene. Cell. In press.
12. Fearon, E. R., K. R. Cho, J. M. Nigro et al. 1990. Identification of a chromosome 18q gene that is altered in colorectal cancers. Science 247:49-56.
13. Gusella, J. F. 1989. Location cloning strategy for characterizing genetic defects in Huntington's disease and Alzheimer's disease. FASEB J. 3:2036-2041.
14. Collins, F. S., P. O'Connell, B. A. J. Ponder, and B. R. Seizinger. 1989. Progress toward identifying the neurofibromatosis (NF1) gene. Trends Genet. 5:217-221.
15. Francke, U., H. D. Ochs, B. de Martinville et al. 1985. Minor Xp21 chromosome deletion in a male associated with expression of Duchenne muscular dystrophy, chronic granulomatous disease, retinitis pigmentosa, and McLeod syndrome. Am. J. Hum. Genet. 37:250-267.
16. Kidd, K. K., A. M. Bowcock, J. Schmidtke et al. 1989. Report of the DNA committee and catalogs of cloned and mapped genes and DNA polymorphisms. Cytogenet. Cell Genet. 622-947.
17. Gottesman, M. M., editor. 1987. Molecular Genetics of Mammalian Cells. Academic Press, Inc., San Diego.
18. Ott, J. 1985. Analysis of Human Linkage. Johns Hopkins University Press, Baltimore.
19. Eiberg, H., J. Mohr, K. Schmeigelow, L. S. Nielson, and R. Williamson. 1985. Linkage relationships of paraoxonase (PON) with other markers: indication of PON-cystic fibrosis synteny. Clin. Genet. 28:840-845.
20. Tsui, L.-C., M. Buchwald, D. Barker et al. 1985. Cystic fibrosis locus defined by a genetically linked polymorphic DNA marker. Science 230:1054-1057.
21. White, R., S. Woodward, M. Leppert et al. 1985. A closely linked marker for cystic fibrosis. Nature 318:382-384.
22. Wainwright, B. J., P. J. Scambler, J. Schmidtke et al. 1985. Localization of cystic fibrosis locus to human chromosome 7cen-q22. Nature 318:384-386.
23. Beaudet, A., A. Bowcock, M. Buchwald et al. 1986. Linkage of cystic fibrosis to two tightly linked DNA markers: joint report from a collaborative study. Am. J. Hum. Genet. 39:681-693.
24. Dean, M. 1988. Review: molecular and genetic analysis of cystic fibrosis. Genomics 3:93-99.
25. White, R., M. Leppert, P. O'Connell et al. 1986. Further linkage data on cystic fibrosis: the Utah study. Am. J. Hum. Genet. 39:694-698.
26. Schwartz, D. C., and C. R. Cantor. 1984. Separation of yeast chromosome-sized DNAs by pulsed field gradient gel electrophoresis. Cell 37:67-75.
27. Barlow, D. P., and H. Lehrach. 1987. Genetics by gel electrophoresis: the impact of pulsed field gel electrophoresis on mammalian genetics. Trends Genet. 3:167-171.
28. Smith, C. L., and C. R. Cantor. 1986. Approaches to physical mapping of the human genome. Cold Spring Harbor Symp. Quant. Biol. 51(Pt 1):115-122.


29. Smith, C. L., S. K. Lawrance, G. A. Gillespie, C. R. Cantor, S. M. Weissman, and F. S. Collins. 1987. Strategies for mapping and cloning macroregions of mammalian genomes. Methods Enzymol. 151:461-489.
30. Carle, G. F., and M. V. Olson. 1984. Separation of chromosomal DNA molecules from yeast by orthogonal field alternation gel electrophoresis. Nucleic Acids Res. 12:5647-5664.
31. Chu, G., D. Vollrath, and R. W. Davis. 1986. Separation of large DNA molecules by contour-clamped homogeneous electric fields. Science 234:1582-1585.
32. Carle, G. F., M. Frank, and M. V. Olson. 1986. Electrophoretic separations of large DNA molecules by periodic inversion of the electric field. Science 232:65-68.
33. Vollrath, D., and R. W. Davis. 1987. Resolution of DNA molecules greater than 5 megabases by contour clamped homogeneous electric fields. Nucleic Acids Res. 15:7865-7875.
34. Poustka, A.-M., H. Lehrach, R. Williamson, and G. Bates. 1988. A long range restriction map encompassing the cystic fibrosis locus and its closely linked genetic markers. Genomics 2:337-345.
35. Drumm, M. L., C. L. Smith, M. Dean, J. L. Cole, M. C. Iannuzzi, and F. S. Collins. 1988. Physical mapping of the cystic fibrosis region by pulsed-field gel electrophoresis. Genomics 2:346-354.
36. Rommens, J. M., S. Zengerling, J. Burns et al. 1988. Identification and regional localization of DNA markers on chromosome 7 for the cloning of the cystic fibrosis gene. Am. J. Hum. Genet. 43:4-13.
37. Scambler, P. J., H. Y. Law, R. Williamson, and C. S. Cooper. 1986. Chromosome mediated gene transfer of six DNA markers linked to the cystic fibrosis locus on human chromosome seven. Nucleic Acids Res. 14:7159-7174.
38. Michiels, F., M. Burmeister, and H. Lehrach. 1987. Derivation of clones close to met by preparative field inversion gel electrophoresis. Science 236:1305-1308.
39. Collins, F. S., M. L. Drumm, J. L. Cole, W. K. Lockwood, G. F. VandeWoude, and M. C. Iannuzzi. 1987. Construction of a general human chromosome jumping library with application to cystic fibrosis. Science 235:1046-1049.
40. Burke, D. T., G. F. Carle, and M. V. Olson. 1987. Cloning of large segments of exogenous DNA into yeast by means of artificial chromosome vectors. Science 236:806-812.
41. Cooke, H. 1987. Cloning in yeast: an appropriate scale for mammalian genomes. Trends Genet. 173-174.
42. Brownstein, B. H., G. A. Silverman, R. D. Little et al. 1989. Isolation of single-copy human genes from a library of yeast artificial chromosome clones. Science 244:1348-1351.
43. Collins, F. S., and S. M. Weissman. 1984. Directional cloning of DNA fragments at a large distance from an initial probe: a circularization method. Proc. Natl. Acad. Sci. USA 81:6812-6816.
44. Poustka, A., and H. Lehrach. 1986. Jumping libraries and linking libraries: the next generation of molecular tools in mammalian genetics. Trends Genet. 2:174-179.
45. Iannuzzi, M. C., M. Dean, M. L. Drumm et al. 1989. Isolation of additional polymorphic clones from the cystic fibrosis region, using chromosome jumping from D7S8. Am. J. Hum. Genet. 44:695-703.
46. Dean, M., M. L. Drumm, C. Stewart et al. 1990. Localization of the cystic fibrosis locus. Nucleic Acids Res. In press.
47. Lindsay, S., and A. P. Bird. 1987. Use of restriction enzymes to detect potential gene sequences in mammalian DNA. Nature 327:336-338.
48. Bird, A. P. 1986. CpG-rich islands and the function of DNA methylation. Nature 321:209-213.
49. Lemna, W. K., G. L. Feldman, B.-S. Kerem et al. 1990. Mutation analysis for heterozygote detection and prenatal diagnosis of cystic fibrosis. N. Engl. J. Med. 322:291-296.
50. McKusick, V. A. 1989. Mapping and sequencing the human genome. N. Engl. J. Med. 320:910-915.
