Downloaded from http://cshprotocols.cshlp.org/ at University of Illinois at Chicago Library on November 20, 2014 - Published by Cold Spring Harbor Laboratory Press

Transposon Insertional Mutagenesis Models of Cancer. Karen M. Mann, Nancy A. Jenkins, Neal G. Copeland, and Michael B. Mann. Cold Spring Harb Protoc; doi: 10.1101/pdb.top069849


© 2014 Cold Spring Harbor Laboratory Press

Topic Introduction

Transposon Insertional Mutagenesis Models of Cancer

Karen M. Mann, Nancy A. Jenkins, Neal G. Copeland, and Michael B. Mann¹

Cancer Research Program, The Methodist Hospital Research Institute, Houston, Texas 77030

Transposon-based insertional mutagenesis in the mouse provides a powerful approach for identifying new cancer genes. Transposon insertions in cancer genes are selected during tumor development because of their positive effect on tumor growth, and the transposon insertion sites in tumors thus serve as tags for identifying new cancer genes. Direct comparisons of transposon-mutated genes in mouse tumors with mutated genes in human tumors can lend insight into the genes and signaling pathways that drive tumorigenesis. This is critical for prioritizing genes for further study, for their efficacy as either biomarkers or drug targets. In this article, we introduce DNA transposon-based systems used for gene discovery in mice and discuss their application to identifying candidate cancer genes in light of recently published tumor studies.

TRANSPOSON-BASED INSERTIONAL MUTAGENESIS

Transposons are mobile genetic elements that exist within the genomes of plants, invertebrates, and vertebrates. These “jumping genes,” a term coined by Barbara McClintock, can be mobilized either by a “copy-and-paste” mechanism through an RNA intermediate (retrotransposons) or by a “cut-and-paste” mechanism whereby an enzyme acts directly on DNA to excise the transposon element from the genome. Transposons often cause mutations when they reinsert into the genome. Transposon-based insertional mutagenesis is a technology that uses a DNA-based transposable element to perturb genes that contribute to a phenotype of interest, in this case cancer.

Typically, the transposon is introduced into the mouse germline by standard transgenesis (see Introduction: Transgenic Mouse Models—A Seminal Breakthrough in Oncogene Research [Smith and Muller 2013]). Like all microinjected DNA, the transposon often concatenates before insertion at a single random site in the mouse genome. By screening multiple transposon transgenic founders, it has been possible to generate transgenic lines that carry anywhere from only a few to many hundreds of transposon copies, all linked together at a single site in the genome. A transposase expressed in trans recognizes the inverted repeats flanking either end of the transposon and excises the transposon from the concatemer. The transposon is then able to reintegrate anywhere in the genome. Mobilization of the transposon is a continual process and becomes visible when insertions are clonally selected during tumor development.

Transposons used in cancer screens typically contain a promoter followed by a splice donor sequence that is used to deregulate the expression of oncogenes, in addition to a bidirectional poly(A) site, along with two in-frame splice acceptors in either orientation to inactivate the expression of tumor-suppressor genes (Copeland and Jenkins 2010).
The transposon thus activates proto-oncogenes when inserted upstream or in an intron of the gene, in the same transcriptional orientation. Transposons inserted upstream of genes and in the opposite orientation can also disrupt transcription factor or enhancer binding sites and lead to deregulated gene expression. Likewise, truncation mutations in tumor-suppressor genes can occur when the transposon inserts within the coding region (primarily in introns) in either orientation. The ability of the transposon to induce both activating and inactivating mutations is what enables insertional mutagenesis to drive cancer in multiple and diverse organ systems.

¹Correspondence: [email protected]

Sleeping Beauty INSERTIONAL MUTAGENESIS

Sleeping Beauty (SB) is the most commonly used transposon for inducing cancer in mice. SB is derived from the salmonid-type Tc1/mariner elements (Ivics et al. 1997), which have lacked intrinsic transposition activity in vertebrate cells for more than 10 million years (Plasterk et al. 1999). SB has been optimized for high functionality in mammalian cells in a number of ways. The first active SB transposase for use in mammalian cells, SB10, was reconstructed from phylogenetic analysis of sequences from eight species of fish (Ivics et al. 1997). Geurts and colleagues further improved the transposase activity by optimizing the codon sequence for the mammalian system, thereby creating SB11. They also limited the size of the transposon to 2 kb because this allows for maximal transposition efficiency (Geurts et al. 2003). Multiple transposon lines with copy numbers ranging from 30 to 300 have been generated and shown to induce different phenotypes (Collier et al. 2005; Dupuy et al. 2005, 2009).

Various iterations of the SB transposase also exist, each differing in activity or means of induction or both (see Table 1). CAGGS-SB10 transposase exists as a transgene but has limited transposase activity (Dupuy et al. 2001; Collier et al. 2005, 2009). The CAGGS promoter itself drives ubiquitous transposase expression but has limited activity in the hematopoietic compartment (Rad et al. 2010; A Dupuy, unpubl.). Rosa26-SB11 is a constitutive knock-in allele of the SB11 transposase at the Rosa26 locus (Dupuy et al. 2005). The SB11 allele has more activity than SB10 (Collier et al. 2009) and, because it is a knock-in, is protected from transgene-induced epigenetic silencing. LSL-Rosa26-SB11 is an inducible allele that carries a floxed-stop cassette, which can be activated by Cre expression. This allele is a powerful tool that makes it possible to control transposition in space and time.
Recently, two additional inducible alleles of the transposase knocked in to the Rosa26 locus were reported. Rosa26-Lox66-SB-Lox71 contains a hyperactive transposase (HSB5) (Yant et al. 2007) targeted in the antisense orientation and flanked by mutant loxP sites. Upon Cre expression, the transposase allele is inverted and comes under the control of the CAGGS promoter. LSL-Rosa26-SB13 contains a modified SB10 transposase carrying two mutations that make it hyperactive (Carlson et al. 2005) and is expressed under the endogenous ubiquitous Rosa26 promoter (Perez-Mancera et al. 2012).

Sleeping Beauty transposons are primarily nonautonomous, meaning that transposase must be supplied in trans to induce SB transposition. This allows for temporal and spatial control over SB transposition. There is also little insertion site preference for SB transposons in unselected cells. In tumors, however, SB insertions are nonrandom because of genetic selection for mutations in cancer genes. The SB transposon requires only a TA dinucleotide for insertion, although ANNTANNT is the preferred target site over other TAs (Vigdal et al. 2002; Liu et al. 2005; Yant et al. 2005). Excision of the transposon occurs in a cut-and-paste manner, and a 5-bp footprint (CAGTA or CTGTA) is usually left behind in genomic DNA following mobilization (Copeland and Jenkins 2010). This 5-bp footprint can be mutagenic, particularly when located within an exon. Mobilization of the transposon concatemer can also induce small rearrangements or deletions surrounding the donor site, and in some cases these deletions can be detrimental to the host (Geurts et al. 2006). SB also prefers to reintegrate near the transposon donor site (Collier and Largaespada 2007). This phenomenon, known as “local hopping,” reduces the number of transposons available for reintegration elsewhere in the genome.
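The target-site preference described above (any TA dinucleotide, with ANNTANNT favored) can be illustrated with a short sequence scan. This is only a minimal sketch: the function names and the example sequence are hypothetical, and it treats the preferred motif as a strict pattern match, whereas the real preference is presumably a statistical tendency rather than an all-or-nothing rule.

```python
import re

# Assumption: the preferred SB target motif quoted in the text, ANNTANNT,
# where N is any base and the central TA is the insertion dinucleotide.
# A lookahead is used so that overlapping motifs are all reported.
PREFERRED_MOTIF = re.compile(r"(?=(A[ACGT]{2}TA[ACGT]{2}T))")


def ta_sites(seq):
    """Return 0-based positions of every TA dinucleotide in seq."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "TA"]


def preferred_ta_sites(seq):
    """Return positions of TA dinucleotides that sit inside an ANNTANNT motif."""
    seq = seq.upper()
    # The TA starts 3 bases into the 8-bp motif.
    return [m.start() + 3 for m in PREFERRED_MOTIF.finditer(seq)]


example = "GGACATATGTCCTAGG"  # made-up sequence for illustration
all_sites = ta_sites(example)
preferred = preferred_ta_sites(example)
```

In this toy sequence, both TA positions are candidate insertion sites, but only the one embedded in ACATATGT matches the preferred motif.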
Local hopping also complicates the computational algorithms used for identifying common insertion sites on the donor chromosome. In addition, once the transposase is activated, transposition is a continuous process and never stops. There are currently no SB transposase alleles that can be inactivated following their induction. This is a limitation for studies that aim to follow tumor induction based on a finite window of transposition.

TABLE 1. Sleeping Beauty transposon and transposase alleles

| Element | Function | Allele | Chr | Activity | Promoter | Potential to cause cancer | Reference |
|---|---|---|---|---|---|---|---|
| CAGGS-SB10 | SB transposase | Transgenic | ND | Constitutive | CAGGS | Low frequency of cancer induction; enhanced in presence of sensitizing mutation | Collier et al. 2005 |
| Rosa26-SB11 | SB transposase | Targeted | 6 | Constitutive | Rosa26 | Mobilizes SB transposons to induce cancer | Dupuy et al. 2005 |
| Rosa26-LSL-SB11 | SB transposase | Targeted | 6 | Inducible | Rosa26 | Mobilizes SB transposons to induce cancer | Starr et al. 2005 |
| Rosa26Lox66SBLox71 | SB transposase | Targeted | 6 | Inducible | CAGGS | Mobilizes SB transposons to induce cancer | March et al. 2011 |
| Rosa26-LSL-SB13 | SB transposase | Targeted | 6 | Inducible | Rosa26 | Mobilizes SB transposons to induce cancer | Perez-Mancera et al. 2012 |
| RosaPB | PB transposase | Targeted | 6 | Constitutive | Rosa26 | Mobilizes PiggyBac transposons to induce cancer | Rad et al. 2010 |
| T2/Onc (76) | SB transposon | Transgenic | 1 | Low-copy | MSCV | Cancer-causing | Collier et al. 2005 |
| T2/Onc2 (6113) | SB transposon | Transgenic | 1 | High-copy | MSCV | Cancer-causing | Dupuy et al. 2005 |
| T2/Onc2 (6070) | SB transposon | Transgenic | 4 | High-copy | MSCV | Cancer-causing | Dupuy et al. 2005 |
| T2/Onc3 (12740) | SB transposon | Transgenic | 9 | Low-copy | CAGGS | Cancer-causing | Dupuy et al. 2009 |
| T2/Onc3 (12775) | SB transposon | Transgenic | 12 | Low-copy | CAGGS | Cancer-causing | Dupuy et al. 2009 |
| GrOnc | SB transposon | Transgenic | ND | Low-copy | Gr1.4 LTR | Cancer-causing | Vassiliou et al. 2011 |
| ATP1 | PB transposon | Transgenic | ND | Various | CAGGS | Cancer-causing | Rad et al. 2010 |
| ATP2 | PB transposon | Transgenic | ND | Various | MSCV | Cancer-causing | Rad et al. 2010 |
| ATP3 | PB transposon | Transgenic | ND | Various | PGK | Cancer-causing | Rad et al. 2010 |

Several published alleles of Sleeping Beauty transposons and transposases have been introduced into mice through either transgenesis or gene-targeting approaches and have been shown to drive tumorigenesis.

APPLICATIONS OF SB MUTAGENESIS

Sleeping Beauty induces mutations in mouse embryonic stem cells (ESCs), germ cells, and somatic cells with different frequencies. SB transposition in ESCs also shows a high frequency of local hopping. Transposon excision is estimated to occur at a frequency of 3.5 × 10⁻⁶ events per cell per generation (Luo et al. 1998) and may differ depending on the transposon donor chromosome. In contrast, mutagenesis of the male germline yields an average of two mutagenic events per gamete (Dupuy et al. 2001). Importantly, SB mutagenesis occurs in somatic cells at a frequency high enough to induce cancer.

In 2005, Dupuy and colleagues published the first SB insertional mutagenesis screen designed to model cancer in mice in the absence of a sensitizing mutation. Using the ubiquitously expressed, constitutively active Rosa26-SB11 transposase allele, Dupuy et al. showed that the high-copy SB transposon T2/Onc2, containing the MSCV promoter (see Table 1), drives the development of B- and T-cell lymphomas (Dupuy et al. 2005). A significant limitation of the high-copy T2/Onc2 transposon concatemer used in these experiments was the high degree of lethality (50% of the expected progeny) in early embryos with active whole-body transposition. This lethality was not observed in a screen using a low-copy transposon concatemer (Collier et al. 2005), suggesting that copy number may play a role. One hypothesis for the lethality is that mobilization of the high-copy transposon concatemer caused double-strand breaks to accumulate throughout the genome at frequencies so high that they could not be repaired efficiently. Dupuy and colleagues later published an SB screen using a low-copy T2/Onc3 transposon that contained the CAGGS promoter. In combination with Rosa26-SB11, this transposon drives solid tumor formation in the absence of embryonic lethality, although with much longer tumor latency (Dupuy et al. 2009).
The difference in tumor profiles between T2/Onc2 and T2/Onc3 may be due to differences in transposon copy number, donor site, or even the promoter contained within the transposon.

PCR AMPLIFICATION AND SEQUENCING OF TRANSPOSON INSERTION SITES

One of the major advantages of transposon-based mutagenesis is the ability to use the transposon as a tag for identifying new cancer genes. This is typically done using ligation-mediated splinkerette polymerase chain reaction (PCR) to amplify the transposon/genomic DNA junctions (March et al. 2011), which are then sequenced and mapped to the mouse genome to identify unique (i.e., nonrepetitive) loci mutated by transposon insertions in tumor DNA (see Fig. 1). Tumor DNA may be isolated from either flash-frozen or formalin-fixed paraffin-embedded (FFPE) tissue. FFPE tissue may be isolated from either core punches from paraffin blocks or laser-capture microdissection of sectioned lesions from slides. One consideration with FFPE samples is that the embedding process and/or the fixative may shear genomic DNA. Whole-genome amplification of DNA isolated from fixed tissue is thus required to obtain enough DNA for downstream processing.

FIGURE 1. Flow diagram for identifying transposon insertions in tumors. Multiple steps are required to identify transposon insertion sites in tumors driven by insertional mutagenesis. See text for detailed description. FFPE, formalin-fixed paraffin-embedded.

Steps shown in the diagram:
1. Isolate tumor gDNA. Use Qiagen Gentra Puregene kits for DNA isolation from snap-frozen tumor, 0.6–1 mm cores from FFPE tissue, or laser-capture microdissection. Use the Qiagen Repli-G Midi kit for whole-genome amplification of DNA from FFPE tissue.
2. Fragment tumor gDNA. Generate fragments of DNA containing both transposon sequences and genomic DNA by restriction digestion (Uren et al. 2009) or shearing (sonication) (Koudijs et al. 2011).
3. Ligate adapter sequences. Provides template for amplification of unknown genomic DNA sequences (see Largaespada and Collier 2008 for adapter and primer sequences).
4. Amplify DNA junctions. Splinkerette PCR; enrich for products of transposon insertion sites. Use unique barcodes for each tumor sample (March et al. 2011).
5. Pool and purify barcoded PCR products.
6. Sequence and map. Roche 454 Titanium (March et al. 2011) or Illumina (Berguam-Vrieze et al. 2011); pre- and postprocessing of sequence reads.
7. CIS statistical analysis.

Two published methods exist for processing genomic DNA for high-throughput sequencing of transposon insertion sites. The first, originally described by Uren and colleagues (Uren et al. 2009), takes advantage of unique restriction sites present within the inverted repeats of the transposon that are also present in the flanking mouse genomic DNA. Restriction enzyme digestion generates chimeric products, varying in size, that contain both genomic DNA and transposon repeat sequences. These fragments are then PCR-amplified using multiplex PCR (see below) before massively parallel sequencing.

The second method relies on sonication to shear the genomic DNA. This “shear-splink” method, described by Koudijs and colleagues (Koudijs et al. 2011), offers a few advantages over restriction digestion. First, it does not rely on the availability of restriction sites in the mouse DNA flanking the transposon insertion site. This is an important consideration for regions of the genome where enzymatic digestion results in fragments that are thousands of bases in length. Such large fragments are not amenable to PCR amplification, and so sequencing does not capture transposon insertions that lie within these regions. Second, the size range of the sheared DNA products can be controlled by sonication for template sizes that are efficiently amplified by the multiplex PCR. This minimizes the amplification bias for small fragments and allows for greater sampling of transposon insertion sites in the tumor, including rare events, which can be used for semiquantitative analysis of tumor clonality by sequencing.

A ligation-based multiplex PCR reaction, or splinkerette PCR, is performed to amplify either restriction-digested or sheared DNA products for sequencing. The first round of PCR employs one primer sequence located near the end of the transposon and a second primer homologous to a ligated adapter, allowing for amplification of transposon/genomic DNA junctions. The transposon primer is specific to the end of the transposon containing the site for the restriction enzyme used to digest the genomic DNA, thereby giving directionality to the DNA junctions. A second, nested PCR is then performed using the primary PCR product as template. The primer homologous to the transposon contains a unique barcode sequence used to label all insertions in a single tumor, whereas the linker primer contains a sequence tag adapted for the sequencing platform (Copeland and Jenkins 2010). With hundreds of unique barcodes available, it is possible to combine the amplification products from multiple tumors before sequencing. High-throughput sequencing technology, such as Roche’s 454 platform or Illumina, can then generate thousands of sequencing reads from a single tumor or a collection of tumors.
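Because each tumor is labeled with its own barcode before pooling, the pooled reads have to be sorted back to their tumor of origin. The following is a minimal sketch of that demultiplexing step, not the pipeline used in the cited studies. The barcode sequences, tumor names, barcode length, and the assumption that the barcode sits at the very start of each read are all hypothetical, and only exact barcode matches are accepted.

```python
from collections import defaultdict

# Hypothetical values for illustration only; real barcodes come from the
# primer design actually used for the nested PCR.
BARCODES = {"ACGTACGTAC": "tumor_01", "TGCATGCATG": "tumor_02"}
BARCODE_LEN = 10


def demultiplex(reads):
    """Group reads by their leading barcode.

    Returns (reads per tumor with the barcode trimmed off, number of
    reads whose barcode was not recognized).
    """
    by_tumor = defaultdict(list)
    unassigned = 0
    for read in reads:
        tumor = BARCODES.get(read[:BARCODE_LEN].upper())
        if tumor is None:
            unassigned += 1  # exact matching only; no mismatch tolerance
        else:
            by_tumor[tumor].append(read[BARCODE_LEN:])
    return dict(by_tumor), unassigned


pooled = [
    "ACGTACGTACTTTT",
    "TGCATGCATGGGGG",
    "ACGTACGTACCCCC",
    "NNNNNNNNNNAAAA",
]
per_tumor, n_unassigned = demultiplex(pooled)
```

Exact matching keeps the sketch short; a practical implementation would likely tolerate sequencing errors in the barcode, which is a design choice outside what the text specifies.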
The resulting reads are then annotated for the presence of the transposon repeat, the barcode, the TA insertion dinucleotide, and mouse genomic DNA. Sequence tags containing all four elements are then mapped back to the mouse genome to identify the loci containing transposon insertions, and the insertions are then assigned to individual tumors based on the barcode.

Several statistical frameworks have been developed to identify sites in tumor genomes enriched for transposon insertions. These so-called common insertion sites (CISs) are regions in cancer genomes most likely to harbor cancer genes. The most frequently used methods to identify CISs are the Monte Carlo (MC) and Gaussian kernel convolution (GKC) methods. MC simulations define CISs by counting the number and distribution of mapped transposon insertions relative to the distribution of TA dinucleotides across the genome within finite window sizes (Starr et al. 2009; see also Copeland and Jenkins 2010 for review). The GKC algorithm, developed by de Ridder and colleagues, considers the number and clustering of insertions defined under a Gaussian kernel (de Ridder et al. 2006). A kernel is assigned to every uniquely mapped insertion in the data set, and the sum of the kernels at a given nucleotide position in the data set generates a “peak height.” Peak height increases with the number of data points within a kernel and becomes significant when the height exceeds the amplitude threshold defined by permutation analysis of the null distribution of peak heights (March et al. 2011). Multiple kernel widths can be used in the GKC analysis, and the smallest kernel that defines the CIS is usually reported (Mann et al. 2012; Perez-Mancera et al. 2012). Importantly, both MC and GKC are agnostic to the presence or absence of genes within the window or kernel. However, the majority of CISs are found within or near genes, which is not surprising given that many of these CISs contain mutated cancer genes that were selected during tumor formation because of their positive effect on tumor growth.

A third statistical method takes a gene-centric view to identifying CISs based on the probability of mutating a uniquely mappable TA dinucleotide in each RefSeq gene (Brett et al. 2011). Gene-centric CISs (g-CISs) are defined by chi-square analysis of the number of tumors containing insertions in each RefSeq gene. This analysis is particularly useful to identify CISs in both large genes (>1 Mb) and small genes (
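The kernel-convolution idea described above, in which one Gaussian kernel is placed on each insertion, the kernels are summed into a peak height, and that height is compared with a permutation-derived threshold, can be sketched as follows. This is a simplified illustration under stated assumptions, not the published GKC implementation: the function names and parameters are hypothetical, the kernel is left unnormalized so that a single insertion contributes a height of 1 at its own position, and the null model places insertions uniformly at random rather than accounting for the genomic distribution of TA sites.

```python
import math
import random


def kernel_density(insertions, positions, width):
    """Sum one unnormalized Gaussian kernel per insertion at each query position."""
    return [
        sum(math.exp(-0.5 * ((p - x) / width) ** 2) for x in insertions)
        for p in positions
    ]


def peak_threshold(n_insertions, region_len, positions, width,
                   n_perm=200, quantile=0.95, seed=0):
    """Estimate a peak-height cutoff from randomly re-placed insertions.

    Assumption: a uniform null over the region. Each permutation records
    the maximum peak height, and the chosen quantile of those maxima is
    returned as the threshold.
    """
    rng = random.Random(seed)
    maxima = []
    for _ in range(n_perm):
        shuffled = [rng.uniform(0, region_len) for _ in range(n_insertions)]
        maxima.append(max(kernel_density(shuffled, positions, width)))
    maxima.sort()
    return maxima[min(int(quantile * n_perm), n_perm - 1)]


def candidate_cis(insertions, region_len, step, width, **kwargs):
    """Return grid positions whose peak height exceeds the permutation threshold."""
    positions = list(range(0, region_len + 1, step))
    heights = kernel_density(insertions, positions, width)
    cutoff = peak_threshold(len(insertions), region_len, positions, width, **kwargs)
    return [p for p, h in zip(positions, heights) if h > cutoff]
```

Calling `candidate_cis` with several kernel widths and reporting the smallest width at which a region is called would mirror the multi-width reporting convention mentioned in the text.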
