Developmental Cell

Article The C. elegans SNAPc Component SNPC-4 Coats piRNA Domains and Is Globally Required for piRNA Abundance Dionna M. Kasper,1 Guilin Wang,1 Kathryn E. Gardner,1 Timothy G. Johnstone,1 and Valerie Reinke1,* 1Department of Genetics, Yale University School of Medicine, New Haven, CT 06520, USA *Correspondence: [email protected] http://dx.doi.org/10.1016/j.devcel.2014.09.015

SUMMARY

The Piwi/Piwi-interacting RNA (piRNA) pathway protects the germline from the activity of foreign sequences such as transposons. Remarkably, tens of thousands of piRNAs arise from a minimal number of discrete genomic regions. The extent to which clustering of these small RNA genes contributes to their coordinated expression remains unclear. We show that C. elegans SNPC-4, the Myb-like DNA-binding subunit of the small nuclear RNA activating protein complex, binds piRNA clusters in a germline-specific manner and is required for global piRNA expression. SNPC-4 localization is mutually dependent with localization of piRNA biogenesis factor PRDE-1. SNPC-4 exhibits an atypical widely distributed binding pattern that ‘‘coats’’ piRNA domains. Discrete peaks within the domains occur frequently at RNA-polymeraseIII-occupied transfer RNA (tRNA) genes, which have been implicated in chromatin organization. We suggest that SNPC-4 binding establishes a positive expression environment across piRNA domains, providing an explanation for the conserved clustering of individually transcribed piRNA genes.

INTRODUCTION Piwi-interacting RNAs (piRNAs) are a class of small RNAs (21–30 base pairs [bp]) that associate with Piwi, a member of the highly conserved Argonaute/Piwi protein family involved in diverse small-RNA-mediated regulatory pathways (Saxe and Lin, 2011). The Piwi/piRNA pathway is necessary for gametogenesis and to protect genome integrity in the germline by recognizing and silencing nonself elements, such as transposons (Luteijn and Ketting, 2013). piRNAs are key effectors in this pathway because they have the ability to target many types of transcripts due to their abundance and sequence diversity. Although tens of thousands of piRNAs exist, they originate from a discrete number of clustered genomic loci, and their expression is generally limited to the germline and associated cell types (Ishizu et al., 2012). The organization of other complex loci, such as the protocadherin gene cluster, the b-globin locus control region (LCR), and Hox gene clusters into specialized domains proves to be essential

for their appropriate expression patterns (Andrey et al., 2013; Chen and Maniatis, 2013; Deng et al., 2012). However, the extent to which arrangement into clusters fully contributes to piRNA gene regulation remains unclear. Based on association with the Piwi ortholog PRG-1, C. elegans has greater than 15,000 annotated piRNAs, known as 21URNAs. 21U-RNAs distribute roughly equally between two classes (Gu et al., 2012). Type I 21U-RNA genes are clustered into two discrete domains of 2.5 Mb (4.5–7 Mb) and 3.7 Mb (13.5– 17.2 Mb) on chromosome IV and primarily reside in introns and intergenic regions (Batista et al., 2008; Ruby et al., 2006). Type II 21U-RNAs arise from annotated coding gene promoters throughout the genome and are unclustered (Gu et al., 2012). Multiple lines of evidence indicate that each 21U-RNA is independently transcribed as a short RNA (approximately 26–70 bp in length) by RNA polymerase II (Pol II) and is processed into its mature, PRG-1-associated form through poorly characterized events (Batista et al., 2008; Billi et al., 2013; Cecere et al., 2012; Gu et al., 2012; Ruby et al., 2006). This process contrasts with mammals and Drosophila, in which a long precursor is processed into many piRNAs (Aravin et al., 2006; Brennecke et al., 2007; Lau et al., 2006). Factors implicated in the regulation of type I 21U-RNAs include the factor PRDE-1 (Weick et al., 2014) and members of the forkhead (FKH)-related transcription factor family, which can bind to 21U-RNA consensus motifs upstream of individual type I 21U-RNA genes in order to promote their autonomous expression (Cecere et al., 2012). Type II 21URNAs are presumably under the regulation of the transcriptional machinery at the promoters from which they arise (Gu et al., 2012). Although individual transcription should obviate the requirement for genomic clustering of type I 21U-RNA loci, their clustering is highly conserved between C. elegans and its close nematode relative C. briggsae (Batista et al., 2008; Ruby et al., 2006; Shi et al., 2013). Intriguingly, the clustered type I loci account for approximately 95% of the mature 21U-RNA population in C. elegans, whereas type II 21U-RNAs have minimal contribution (Gu et al., 2012), suggesting that organization of 21U-RNAs into discrete domains is advantageous for effectively promoting robust and appropriate spatial expression. However, the mechanisms involved in creating an environment that instructs germline-specific expression of abundantly expressed type I 21U-RNAs within these domains are unknown. Here, we use genome-wide analysis to demonstrate that the transcription factor SNPC-4, the ortholog of small nuclear RNA (snRNA) activating protein complex (SNAPc) subunit 4, binds

Developmental Cell 31, 145–158, October 27, 2014 ª2014 Elsevier Inc. 145

Developmental Cell SNPC-4 Coats and Regulates piRNA Domains

the two major 21U-RNA domains and regulates the global expression of mature 21U-RNAs in C. elegans. SNPC-4 binding at 21U-RNA regions is developmentally regulated and germline specific and correlates with SNPC-4 subnuclear foci visible only in undifferentiated germ cells. Moreover, the formation of SNPC-4 foci, as well as the enriched binding at 21U-RNA regions, depends on the activity of the 21U-RNA biogenesis factor PRDE-1. Conversely, SNPC-4 is required for PRDE-1 localization in germ nuclei. Given that SNPC-4 orthologs are sequence-specific transcription factors, the SNPC-4 binding profile within the 21U-RNA-rich genomic domains is unusual and striking: broad binding coats each domain, interspersed with stronger discrete peaks that do not necessarily correspond to individual 21U-RNA loci. At discrete sites, SNPC-4, in conjunction with RNA polymerase III (Pol III), binds transfer RNA (tRNA) genes with elevated frequency and strength within 21URNA domains. tRNA genes are known regulators of chromatin organization in yeast and humans (Donze, 2012), suggesting that these sites might contribute to defining 21U-RNA domains. Additionally, SNPC-4 contains Myb/SANT protein motifs, which are capable of influencing nucleosome positioning and histone modifications (Boyer et al., 2004). Together, the data presented here define a highly specialized function for SNPC-4 in establishing a unified domain of regulation to coordinate the robust expression of tens of thousands of individually transcribed 21U-RNA genes.

(mCherry:H2B) but not with a marker for perinuclear germ granules (PGL-1:RFP), which is consistent with the expectation that SNPC-4 associates with DNA (Figures 1B and 1C). SNPC-4:GFP foci were most apparent in the distal population of proliferating and pachytene germ cells and decreased substantially in intensity as proximal germ cells entered oogenesis. Typically, two foci were present in mitotic germ nuclei, whereas only a single, larger focus was detected in nuclei that had entered the pachytene stage of meiosis I (Figure 1A, zoom; Figure 2A). We hypothesized that the double foci in proliferating germ nuclei corresponded to a discrete genomic locus detectable on a pair of homologous chromosomes and that these foci coalesce as chromosomes align and undergo synapsis in meiotic nuclei. Therefore, we utilized mutants that selectively disrupt homologous pairing and synapsis of specific chromosomes during meiosis: him-8(e1489)-chromosome X; zim-1(tm1813)-chromosomes II and III; and zim-3(tm2303)-chromosomes I and IV (Phillips and Dernburg, 2006). After crossing these mutants to the SNPC4:GFP transgenic line, we quantified the number of SNPC4:GFP foci present in the mitotic, transition, and meiotic zones. A single focus was still detected in the pachytene region of him-8 and zim-1 mutants, as in wild-type, whereas in the absence of zim-3 activity, most meiotic nuclei contained two foci (p = 0.001; Figures 2A, 2B, and S2). zim-3 is required for pairing of chromosomes I and IV, suggesting that SNPC-4:GFP foci might be associated with a region on one of these two chromosomes.

RESULTS Localization and Expression of SNPC-4 Our interest in 21U-RNA regulation led us to further explore a striking germline localization pattern for a factor we investigated through the modENCODE project (Gerstein et al., 2010). This factor, initially called GEI-11, was renamed SNPC-4 to better reflect its orthology and function. SNPC-4 is the only C. elegans ortholog of mammalian SNAPC4, the DNA-binding subunit of SNAPc (Jawdekar and Henry, 2008; Wong et al., 1998) (Figure S1A and Table S1 available online). SNAPc binds to a consensus element called the proximal sequence element (PSE) and interacts with both Pol II and Pol III (Figure S1B) (Yoon et al., 1995). With Pol II, it promotes transcription of the U1, U2, U4, and U5 snRNAs involved in splicing. It also initiates transcription of type 3 Pol-III-regulated genes, such as U6 snRNAs, Y RNAs, and the scan RNA, RNAse P RNA, RNAse MRP RNA, vault RNA, and 7SK RNA (Canella et al., 2010). We investigated SNPC-4 localization and protein function using a transgenic line containing a fosmid that encompasses the endogenous snpc-4 genomic locus with an inserted green fluorescent protein (GFP)-3xFLAG tag at the carboxy terminus (Niu et al., 2011; Sarov et al., 2012; Zhong et al., 2010). This transgene rescued the lethal phenotype of snpc-4(tm4568) mutants, resulting in viable, fertile animals (Figure S1C). Consistent with its known function, SNPC-4:GFP was broadly expressed throughout all tissues, where it exhibited diffuse nuclear localization (Figures S1D–S1F). Notably, SNPC-4:GFP was also concentrated in prominent foci only within germline nuclei. These germline-specific SNPC-4 foci were present throughout development of both hermaphrodites and males (Figure 1A; Figure S1G). SNPC-4:GFP foci colocalized with a marker for chromatin

SNPC-4 Binds to the 21U-RNA Domains in Addition to Canonical SNAPc Targets To determine more precisely the location of SNPC-4 binding in the genome, we performed chromatin immunoprecipitation followed by deep sequencing (ChIP-seq) of SNPC-4:GFP at six stages of hermaphrodite development (embryos, larval stage [L]1, L2, L3, L4, and young adult) (Table S2). As expected on the basis of studies in mammals and Drosophila (James Faresse et al., 2012; Su et al., 1997), SNPC-4 was found at the promoters of many Pol-II- and Pol-III-transcribed snRNA genes (Figures 3A and 3B), as well as type 3 Pol-III-transcribed noncoding RNA (ncRNA) genes (Figures 3B and 3C). SNPC-4:GFP bound many small nucleolar RNAs (snoRNAs) and some tRNA genes as well, although tRNAs are not expected targets. Binding at the canonical ncRNA targets was generally consistent across developmental stages (Figure 3D). Additionally, SNPC-4:GFP was bound at specialized ncRNA genes whose products are involved in trans-splicing of multicistronic transcripts arising from operons (Blumenthal, 2005), including the smy- and SL2 splice leader genes, although not the SL1 genes. Most remarkably, SNPC-4:GFP exhibited developmentally regulated binding at two specific locations on chromosome IV that corresponded precisely to the genomic regions containing clustered 21U-RNAs (Figure 3E). The highest level of binding occurred in the L4 and young adult stages and was much lower in other developmental stages (Figures 3D–3F). In the young adult stage, 50% (532/1,073) of SNPC-4:GFP binding sites were on chromosome IV; of these, 79% (419/532) were within the two 21U-RNA domains. ncRNA genes were not enriched within the 21U-RNA domains relative to the rest of chromosome IV (Figures S3A and S3B); therefore, they cannot fully account for

146 Developmental Cell 31, 145–158, October 27, 2014 ª2014 Elsevier Inc.

Developmental Cell SNPC-4 Coats and Regulates piRNA Domains

A mitotic zone

transition zone

meiotic zone

B SNPC-4:GFP

mCherry:H2B

Merge

PGL-1:RFP

Merge

C SNPC-4:GFP

Figure 1. SNPC-4 Localizes to Germline-Specific Chromatin-Associated Foci (A) SNPC-4:GFP localization in one young adult gonad arm (outlined by a dotted line in the differential interference contrast image). Zoomed region shows SNPC4:GFP localization to subnuclear foci in addition to diffuse nuclear expression in undifferentiated germ nuclei. (B) Colocalization of SNPC-4:GFP (green) with the chromatin marker mCherry:H2B (red) in young adult germ nuclei proceeding from the transition zone to the meiotic zone (left to right). (C) Overlay of SNPC-4:GFP (green) with the germ granule marker PGL-1::RFP (red) in young adult meiotic germ nuclei. Scale bars, 10 mm.

the preferential binding of SNPC-4 in these regions. Moreover, concentrated binding was not seen on any other chromosomes (e.g., Figure S3C). These data demonstrate that SNPC-4:GFP strongly associates with 21U-RNA domains. Germline-Specific Binding of SNPC-4 at 21U-RNA Regions 21U-RNAs are expressed primarily, if not exclusively, in the germline, and enriched SNPC-4:GFP binding across the 21URNA-containing domains correlated with germline development. To determine directly whether SNPC-4 binding at the 21U-RNA regions occurs only in the germline, we performed tissue-specific ChIP-seq (Table S2). To examine somatic tissues, we crossed the SNPC-4:GFP transgenic strain to the glp-1(q224) mutant, which lacks virtually all germ cells at the restrictive temperature (Figure S4A). To query binding in the germline, we generated a separate transgenic line using the pie-1 promoter to enrich SNPC-4:GFP expression in the germline. Notably, pie-1-driven SNPC-4:GFP also concentrated in foci within

germ nuclei (Figure S4B). SNPC-4:GFP binding sites in the whole animal and germline data sets were preferentially located on chromosome IV, specifically in 21U-RNA regions (p % 1.8 3 1017), but not in the soma data set (p = 0.7) (Figures 4A and 4B; Figure S4C). Notably, in the soma, SNPC-4:GFP still occupied snRNA and other ncRNA target genes (e.g., Figure S4D). Consistent with the preferential expression and activity of 21URNAs in the germline, SNPC-4 peaks in the germline data set that were absent in the soma data set (germline-specific SNPC-4 binding events) also were primarily located within the 21U-RNA regions (p = 3.8 3 1027; Figure 4B). Based on these observations and the disrupted focal pattern in the zim-3 synapsis mutant (Figure 2A), we suggest that the SNPC-4:GFP foci seen specifically in germ nuclei likely correspond to 21U-RNA clusters. Several other germline-expressed transcription factors do not exhibit this binding pattern or concentrate in subnuclear foci (Kudron et al., 2013), indicating that this pattern is not simply a consequence of accessible chromatin in this tissue. We conclude that, in addition to its role as a broadly expressed

Developmental Cell 31, 145–158, October 27, 2014 ª2014 Elsevier Inc. 147

Developmental Cell SNPC-4 Coats and Regulates piRNA Domains

A

mitotic zone transition zone meiotic zone

average foci number / germ cell nucleus

2.5

2.0

***

****

1.5

1.0

0.5

SNPC-4:GFP

B

SNPC-4:GFP; him-8 (-)

SNPC-4:GFP; zim-1(-)

α−GFP transition zone

SNPC-4:GFP; zim-3(-)

DAPI

meiotic zone

SNPC-4:GFP; zim-3(-)

SNPC-4:GFP; him-8(-)

factor that regulates multiple classes of essential ncRNAs, SNPC-4 exhibits germline-specific binding at 21U-RNA regions. SNPC-4 exhibits two distinct forms of binding in the 21U-RNA domains that we have termed ‘‘discrete’’ and ‘‘distributed.’’ Discrete binding sites are statistically significant, clearly defined peaks typical of a sequence-specific transcription factor, whereas distributed binding is below significance but clearly above background levels. Discrete peaks occurred at an interval of about 28 kb, whereas 21U-RNA genes are typically spaced 200–300 bp apart (Batista et al., 2008), so that the vast majority of 21U-RNAs are distant from discrete SNPC-4 peaks. Moreover, these peaks did not preferentially associate with the subset of loci that produce the most abundant 21U-RNAs (e.g., Figure 4C), suggesting that 21U-RNA expression is not directly linked to discrete SNPC-4 binding events. Strikingly, however, the broad, distributed SNPC-4:GFP binding across both 21URNA clusters was much stronger in the whole animal and germline data sets than in the soma data set (Figure 4C). To better characterize the SNPC-4 binding pattern across individual 21U-RNA genes, we determined the mean signal intensity (tag density) of SNPC-4 occupancy spanning 1 kb upstream and downstream from the transcription start site of 21U-RNAs, as well as other classes of structural ncRNAs for comparison, within a given data set. SNPC-4 was evenly bound across 21U-RNA loci, with no preferential accumulation at any specific feature, whereas SNPC-4 binding at the other classes of structural RNA genes was strongly centered at their transcription start sites (Figures 4D and S4E). Moreover, detectable binding across

Figure 2. SNPC-4:GFP Focal Pattern Is Altered on Disruption of Pairing and Synapsis of the Chromosome Harboring 21U-RNA Clusters (A) Bars represent average foci number per germ nucleus (n = 50 germ nuclei per zone) from at least five dissected SNPC-4:GFP gonads per genotype in wild-type and chromosome-specific pairing mutants him-8 (chromosome X), zim-1 (chromosomes II and III), and zim-3 (chromosomes I and IV). Statistical analysis was performed by comparing wild-type versus mutant within each zone (***p % 0.001 and ****p % 0.0001, Fisher’s exact test). Error bars represent SEM. (B) Representative confocal z stack projection of dissected young adult gonads stained with a-GFP and DAPI from zim-3 and him-8 mutants, in which SNPC-4:GFP foci were quantified (see A and Figure S2). White line separates the transition (left) and meiotic (right) zones. White arrows indicate representative zim-3 mutant nuclei in the meiotic zone in which two SNPC-4:GFP foci persist. Scale bars, 10 mm.

21U-RNA genes was apparent in the whole animal and germline data sets but at baseline levels in the soma. To confirm that the distributed SNPC-4 binding pattern at 21U-RNAs genes was specific to the 21U-RNA clusters, we measured the signal intensity for whole animal, germline, and soma SNPC-4 data sets across chromosome IV, excluding intervals that correspond to discrete, significant peaks. The mean signal level of distributed binding within the 21U-RNA domains was >12-fold higher than baseline in the whole animal and germline data sets (p < 0.0001), whereas the soma data set is equivalent to baseline (p R 0.05) (Figure 4E). This distributed binding pattern was not detected on chromosome IV outside the 21U-RNA regions in any sample (p R 0.05). In summary, SNPC-4 exhibits specialized chromatin-binding properties that are specific to the 21U-RNA domains and to the germline. SNPC-4 Is Required for Normal Expression of Mature 21U-RNAs The germline-specific SNPC-4:GFP-binding properties at 21URNA loci suggested that SNPC-4 might regulate 21U-RNA expression in the germline. Investigation of this possibility was complicated by the fact that SNPC-4 is required for germ cell viability (Figure S5A), precluding investigation of 21U-RNA levels in its absence. To bypass this limitation, we treated L1 animals with snpc-4(RNAi) to achieve a partial reduction of snpc-4 expression in adults (typically 60%–70% reduction of messenger RNA (mRNA) levels; Figures S5B and S5C), which did not result in obvious morphological germ cell defects or a significant decrease in progeny production (Figure S5D). Levels of an individual 21U-RNA (21ur-1) were reduced in snpc-4(RNAi) animals compared to control animals by northern analysis (Figure S5E). Using these conditions, we performed high throughput sequencing of small RNA populations from control and

148 Developmental Cell 31, 145–158, October 27, 2014 ª2014 Elsevier Inc.

Developmental Cell SNPC-4 Coats and Regulates piRNA Domains

A

B

D

C 2 kb

4 kb

number of SNPC-4 target genes

2 kb

Emb L1 L2 L3 L4 YA input 21U-RNA tRNA snRNA ncRNA Chr III

21U-RNA miRNA snRNA snoRNA tRNA assorted ncRNA

220 180 140 100 60 20 Emb

(U6)

(U5)

(U5)

(U5)

(smy-1/2) (snoRNA)

(snoRNA) 5,080,000

9,480,000

Chr IV

L2

L3

L4

YA

(mrpr-1) 7,200,000

Chr II

E04F6.6

cutl-17

cah-1

L1

E

1 Mb

F

2 kb

Emb L1 L2 L3 L4 YA input 21U-RNA tRNA snRNA ncRNA genes Chr IV

Chr IV 2 Mb

4 Mb

6 Mb

8 Mb

10 Mb

12 Mb

14 Mb

16 Mb

15,430,000

15,440,000 Y73F8A.22

Figure 3. SNPC-4:GFP Preferentially Binds 21U-RNA Clusters (A–C) SNPC-4:GFP ChIP-seq binding profiles for combined replicates at known ncRNA targets for all the major developmental stages. Read counts for each ChIP-seq data set are normalized by the total number of reads. A representative input from the young adult data set is shown. Annotated gene tracks include 21URNAs (pink) and various other ncRNA classes (gray). The snRNA track combines snRNA, sls-, and smy- genes. One gene per image is annotated for reference. (D) Quantification of SNPC-4:GFP significant peak binding for various ncRNA gene classes across development. The number of SNPC-4-bound targets was normalized to the total number of significant binding sites in each ChIP-seq data set. Assorted ncRNA genes includes a variety of type 3 Pol III targets, as well as various single-copy or small family ncRNAs. (E) SNPC-4:GFP ChIP-seq binding profiles of combined replicates across chromosome IV for all the major developmental stages; tracks are as in (A). (F) An example of SNPC-4:GFP binding within the 3.7 Mb 21U-RNA cluster.

snpc-4(RNAi) young adults (Table S3). Reads mapping to type I 21U-RNA loci decreased 52% on snpc-4(RNAi) (p % 2.2 3 1016), whereas reads corresponding to microRNAs (miRNAs) (p = 1) or other types of small ncRNAs (p = 0.9) were unaffected, indicating that reduced 21U-RNA abundance is not simply a consequence of defects in germline development that broadly affect small RNA production (Figure 5A). The vast majority of type I 21U-RNAs with detectable expression in the control sample decreased in the snpc-4(RNAi) sample, demonstrating that this reduction affects 21U-RNA levels globally. Intriguingly, this effect was limited to clustered type I 21U-RNAs; unclustered type II RNAs were minimally affected, decreasing only 6% (p = 7.8 3 107; Figure 5A). We subsequently validated the small RNA deep-sequencing results by quantitative RT-PCR (qRT-

PCR) and showed that expression was significantly reduced for four out of five type I 21U-RNAs tested in animals treated with snpc-4(RNAi) compared to control (Figure S5F). Concentration of SNPC-4 at 21U-RNA Loci Depends on PRDE-1 To determine whether SNPC-4 might act early, late, or indirectly in the generation of 21U-RNAs, we tested whether it interacted with other known biogenesis pathway components, specifically PRG-1, UNC-130, and PRDE-1. PRG-1, the C. elegans ortholog of Piwi, is required for the presence of mature 21U-RNAs and presumably acts late in their biogenesis and/or function in germ granules (Batista et al., 2008; Das et al., 2008; Wang and Reinke, 2008). In prg-1(tm0872) mutants, SNPC-4:GFP foci

Developmental Cell 31, 145–158, October 27, 2014 ª2014 Elsevier Inc. 149

Developmental Cell SNPC-4 Coats and Regulates piRNA Domains

A

B

2.5

observed / expected SNPC-4:GFP binding sites

1 Mb

whole animal germ line soma input

2.0

****

****

whole animal germ line germline-specific soma

****

1.5 1.0 0.5

21U-RNA abundant 21U-RNA tRNA

inside outside 21U-RNA 21U-RNA clusters clusters Chr IV

snRNA ncRNA genes Chr IV 2 Mb

4 Mb

6 Mb

8 Mb

C

10 Mb

12 Mb

2 kb

14 Mb

16 Mb

E average SNPC-4:GFP tag density excluding significant peaks

25

whole animal germ line soma input 21U-RNA abundant 21U-RNA

whole animal germ line soma

**** 15

****

5 1

0.5

0

-0.5

outside 21U-RNA clusters

inside 21U-RNA clusters Chr IV

tRNA snRNA ncRNA

15,700,000

Chr IV genes

15,720,000

Y105C5A.26

D whole animal

germ line

soma

average SNPC-4:GFP tag density

400

600

21U-RNA miRNA snoRNA snRNA tRNA

200

300

400 200 100 200 100

0

0 -1kb

TSS

+1kb

0 -1kb

TS S

+1kb

-1kb

TSS

+1kb

Figure 4. Germline-Specific Binding of SNPC-4:GFP at 21U-RNA Clusters Consists of Discrete and Distributed Binding (A) Young adult ChIP-seq binding profiles for combined replicates across chromosome IV when SNPC-4:GFP is expressed broadly (whole animal, purple), specifically in the germline (navy), and specifically in the soma (green). Annotated gene tracks as in Figure 3A. Abundantly expressed 21U-RNA genes (pink) have >50 reads in the RNA sequencing data set in Figure 5A. (B) Distribution of significant discrete SNPC-4:GFP binding sites on chromosome IV relative to genomic length for all tissue-specific ChIP-seq data sets. Germline-specific SNPC-4 binding sites (teal) consist of germline SNPC-4 sites not present in the soma ChIP-seq data. Statistically significant deviations greater than the expected value of 1 (marked with a black line) are indicated by asterisks (****p % 0.0001, Pearson’s chi-square test). (C) Example of a SNPC-4:GFP ChIP-seq binding profile within the 3.7 Mb 21U-RNA cluster as described in (A). Arrows indicate significant, discrete peaks identified by the SPP algorithm present in the whole animal and germline data sets. Brackets denote regions where distributed binding occurs. (D) Distribution of mean tag density values (100 bp bins, input subtracted) for SNPC-4:GFP across 21U-RNA genes and other ncRNA gene classes for the tissuespecific ChIP-seq data sets. Tag density bins were filtered for positive values and were not normalized between data sets. (E) Average tag density excluding significant peaks was calculated across chromosome IV in order to quantify the extent of distributed binding inside and outside 21U-RNA clusters by SNPC-4:GFP. Tag density values represent 100 bp bins with input subtracted and normalized to average tag density for the entire data set. Statistically significant differences between inside and outside 21U-RNA clusters within each data set are marked with asterisks (****p % 0.0001, Welch’s t test).

150 Developmental Cell 31, 145–158, October 27, 2014 ª2014 Elsevier Inc.

Developmental Cell SNPC-4 Coats and Regulates piRNA Domains

cumulative fraction of small RNAs

A

Type I 21U-RNAs 1.0

1.0

0.8

transition zone

*p = 7.8E-7 0.94x

meiotic zone

0.8

0.6 0.6 SNPC-4:GFP; prg-1(-)

0.4 0.4 control RNAi snpc-4 (RNAi)

0.2

-5

0

5

control RNAi snpc-4 (RNAi)

0.2

10

-5

0

miRNAs cumulative fraction of small RNAs

B

Type II 21U-RNAs

*p ≤ 2.2E-16 0.48x

1.0

5

10

C

other 1.0

p=1 1.15x

0.8

0.8

0.6

0.6

0.4

0.4

0.2

transition zone

p = 0.07 1.08x

meiotic zone

SNPC-4:GFP; unc-130(-)

0.2 control RNAi snpc-4 (RNAi) -5

0

5

10

control RNAi snpc-4 (RNAi) -5

0

log2 reads per million

5

10

log2 reads per million

D transition zone

meiotic zone

SNPC-4:GFP

mCherry:PRDE-1

E

SNPC-4:GFP mitotic zone

meiotic zone

prde-1 (-) transition zone

wildtype

merge

F

mCherry:PRDE-1 mitotic zone

meiotic zone

snpc-4 (RNAi) meiotic zone

mitotic zone

transition zone

control RNAi

Figure 5. SNPC-4 Functions Early in Biogenesis to Promote Normal Levels of Mature 21U-RNAs (A) Global levels of type I and type II 21U-RNA, miRNA, and other small RNA reads relative to the total number of reads (reads per million) in young adult animals treated with snpc-4(RNAi) (black) compared to vector control RNAi (gray). The p values listed in the top left corner of each panel were determined by Wilcoxon signed-rank test. Fold expression differences between snpc-4(RNAi) relative to control RNAi are listed below the p values. (B and C) SNPC-4:GFP focal pattern in prg-1(tm0872) (B) and unc-130(ev505) (C) young adult germ nuclei. (D) Colocalization of SNPC-4:GFP (green) with mCherry:PRDE-1 (red) in young adult germ nuclei. (E) SNPC-4:GFP expression in prde-1(mj207) and wild-type prde-1(+) young adult gonads. (F) Image depicts a young adult gonad expressing mCherry:PRDE-1 that was strongly affected by snpc-4(RNAi) (top) compared to a representative gonad treated with control RNAi (bottom). We increased RNAi strength to achieve more complete loss of snpc-4 activity (e.g., Figure S5I), disrupting germ cell proliferation slightly. RNAi of several other genes that decrease germ cell proliferation did not disrupt mCherry:PRDE-1 foci (Figure S5J). Scale bars, 10 mm.

Developmental Cell 31, 145–158, October 27, 2014 ª2014 Elsevier Inc. 151

Developmental Cell SNPC-4 Coats and Regulates piRNA Domains

1 Mb

A wildtype prde-1(-) input 21U-RNA abundant 21U-RNA tRNA snRNA ncRNA genes Chr IV 2 Mb

4 Mb

6 Mb

8 Mb

10 Mb

B

12 Mb

14 Mb

C

4 kb

2 kb

wildtype

wildtype

prde-1(-)

prde-1(-)

input

input

21U-RNA

21U-RNA

abundant 21U-RNA

abundant 21U-RNA

tRNA

tRNA

snRNA

snRNA

ncRNA

ncRNA 15,680,000

15,640,000

Chr IV genes

16 Mb

Chr III genes

(U6)

4,410,000

4,420,000 pdcd-2

Y105C5A.8

****

2.0 1.5 **

1.0 0.5

Chr I

Chr II

Chr III

2.5 observed / expected SNPC-4:GFP binding sites

observed / expected SNPC-4:GFP binding sites

F

E wildtype prde-1(-)

Chr IV

Chr V

wildtype prde-1(-)

**** ****

2.0 1.5

****

1.0 0.5

Chr X

outside inside 21U-RNA 21U-RNA clusters clusters Chr IV

wildtype prde-1(-)

12.5 average SNPC-4:GFP tag density excluding significant peaks

D 2.5

(U6)

7.5

****

2.5 1.0

0.5

0

-0.5

inside 21U-RNA clusters

outside 21U-RNA clusters

Chr IV

(legend on next page)

152 Developmental Cell 31, 145–158, October 27, 2014 ª2014 Elsevier Inc.

Developmental Cell SNPC-4 Coats and Regulates piRNA Domains

were still present (Figure 5B), and SNPC-4:GFP binding still occurred at target sites residing in both 21U-RNA clusters, as well as at an snRNA gene (Figure S5G). These data suggest that SNPC-4 might act before PRG-1 in 21U-RNA production. Next, we examined whether SNPC-4 genetically interacts with UNC-130, one of several members of the C. elegans FKH family of transcription factors implicated in the regulation of clustered 21U-RNA loci. UNC-130 binds the 21U-RNA upstream regulatory sequence in vivo, and unc-130(ev505) mutants have a partial reduction in the levels of select 21U-RNAs (Cecere et al., 2012). SNPC-4:GFP foci were still present in unc-130 mutant germlines, indicating that UNC-130 is not essential for their formation (Figure 5C). Moreover, qRT-PCR analysis showed that disrupting snpc-4 activity still led to decreased 21U-RNA levels in unc-130 mutants (Figure S5H). These data suggest that SNPC-4 contributes to normal levels of 21U-RNA expression separately from UNC-130. In contrast to PRG-1 and UNC-130, we did uncover a clear interaction with the early biogenesis factor, PRDE-1. PRDE-1 is a germline-specific factor required for the production of both precursor and mature type I 21U-RNAs but not type II 21U-RNAs (Weick et al., 2014). Like SNPC-4, PRDE-1 is concentrated in subnuclear foci on chromosome IV that likely correspond to 21U-RNA domains (Weick et al., 2014). We found that SNPC4:GFP and mCherry:PRDE-1 foci colocalize and exhibit essentially identical patterns during germ cell differentiation (Figure 5D). Strikingly, SNPC-4:GFP foci were dependent on PRDE-1. In prde-1(mj207) mutants, SNPC-4:GFP expression became diffuse in the nucleus, no longer coalescing into prominent foci in any part of the germline (Figure 5E). Conversely, mCherry: PRDE-1 foci were similarly disrupted in affected snpc-4(RNAi) animals (Figures 5F and S5I). Taken together, these data indicate that SNPC-4 and PRDE-1 colocalize to germline-specific foci in a mutually dependent fashion. Thus, SNPC-4 specifically and profoundly affects a factor that functions during the initial steps of 21U-RNA biogenesis, and vice versa, indicating that SNPC-4 is also an early-acting factor. Both Discrete and Distributed SNPC-4 Binding in the 21U-RNA Region Requires PRDE-1 The interaction between SNPC-4 and PRDE-1 foci led us to examine whether PRDE-1 was required for SNPC-4:GFP binding activity. ChIP-seq analysis of SNPC-4:GFP in the prde-1 mutant background (Table S2) demonstrated that both discrete and distributed SNPC-4:GFP binding was greatly diminished in the 21U-RNA domains, while 93% (142/152) of SNAPc target loci such as snRNAs and other structural small RNAs were retained in prde-1 mutants genome-wide (Figures 6A–6C; Figure S6A).

Quantification of these data demonstrates that discrete SNPC4 binding events were no longer preferentially enriched on chromosome IV in the absence of prde-1 (Figure 6D), and within 21U-RNA domains, there were 1.6-fold fewer peaks than expected in prde-1 mutants compared to wild-type (p = 6.5 3 1052; Figure 6E). Additionally, distributed SNPC-4:GFP binding in the 21U-RNA domains was reduced to levels observed outside the domains when prde-1 activity was lost (Figure 6F). Although depletion of SNPC-4 binding within 21U-RNA domains was profound in prde-1 mutants, it was not absolute; 19% of discrete SNPC-4:GFP peaks remained in the absence of prde-1 (e.g., Figure 6B). Discrete SNPC-4:GFP peaks in prde-1 mutants were still significantly enriched inside compared to outside 21U-RNA domains, but binding strength was often diminished (Figures 6E and S6B). Moreover, retained sites are enriched for the canonical PSE binding sequence (E value = 5.9 3 10348), despite the fact that most (53/59) of these PSE-enriched sites do not correspond to structural small RNA genes. These observations suggest that at least a subset of SNPC-4 binding events might occur independently of PRDE-1 in these regions (Figure S6C). In summary, both classes of SNPC-4 binding in the 21U-RNA domains, discrete and distributed, are largely dependent on prde-1 activity. These results further link enriched SNPC-4 binding at 21U-RNA regions with the regulation of 21URNA biogenesis. Co-occupancy of SNPC-4 and Pol III within 21U-RNA Clusters Often Occurs at tRNA Genes The codependency between SNPC-4 and PRDE-1, an earlyacting factor in the 21U-RNA biogenesis pathway, strongly suggests that SNPC-4 also acts at a very early step in this process. SNAPc can interact with either Pol II or Pol III to activate its established ncRNA gene targets (Yoon et al., 1995). Therefore, we investigated the co-occupancy of Pol II and Pol III with SNPC-4 at 21U-RNA clusters to understand better how SNPC4 may regulate 21U-RNA expression. For this analysis, we utilized published Pol II (AMA-1) ChIP-seq data (Gerstein et al., 2010) and generated ChIP-seq data for the large subunit of Pol III (RPC-1) in young adult whole animals (Figure 7A; Table S2). When we quantified the frequency of discrete binding sites shared between the SNPC-4:GFP and polymerase data sets, we discovered that SNPC-4 occupied Pol-III-bound sites 9-fold more frequently than Pol-II-bound sites within 21U-RNA clusters (Figure 7B). By contrast, co-occupancy of SNPC-4 was roughly equivalent between polymerases in regions outside 21U-RNA domains on chromosome IV and on other autosomes (Figures 7B and S7A). Because convincing evidence indicates that Pol II promotes transcription of individual 21U-RNAs (Gu et al.,

Figure 6. SNPC-4 Binding at 21U-RNA Domains Is Diminished in prde-1 Mutants (A–C) Young adult, whole animal SNPC-4:GFP ChIP-seq binding profiles for combined replicates across chromosome IV (A), within the 3.7 Mb 21U-RNA cluster (B), and at U6 snRNA genes (C) in wild-type prde-1(+) (purple) and prde-1 mutant (gray) backgrounds. Annotated gene tracks are as in Figure 4A. Arrows indicate highly significant, discrete peaks (SPP enrichment signal value R 1,000) in wild-type, whereas brackets denote distributed binding. (D and E) Distribution of discrete SNPC-4:GFP binding sites across chromosomes (D) and on chromosome IV (E) relative to genomic length for wild-type and prde-1 mutant ChIP-seq data sets. Statistically significant deviations greater than the expected value of 1 are indicated by asterisks (**p % 0.01 and ****p % 0.0001, Pearson’s chi-square test). (F) SNPC-4:GFP distributed binding as calculated in Figure 4E for wild-type and prde-1 mutants. To reduce possible SNPC-4 overexpression from the SNPC4:GFP transgene (Figure S6D), ChIP-seq data sets associated with this figure (and Figure 7G) were generated using strains that lack the endogenous snpc-4 locus (YL487 and YL551).

Developmental Cell 31, 145–158, October 27, 2014 ª2014 Elsevier Inc. 153

Developmental Cell SNPC-4 Coats and Regulates piRNA Domains

1 Mb

B SNPC-4:GFP sites shared with Pol III relative to sites shared with Pol II

SNPC-4

A whole animal germ line soma Pol III Pol II input 21U-RNA abundant 21U-RNA tRNA snRNA ncRNA genes

10 9 8 7 6 5 4 3 2 1 inside outside 21U-RNA 21U-RNA clusters clusters

Chr IV

Chr IV 2 Mb

4 Mb

6 Mb

C

8 Mb

2kb

10 Mb

14 Mb

16 Mb

D 1.0 fraction of Pol III bound tRNA genes occupied by SNPC-4

whole animal SNPC-4

12 Mb

germ line soma Pol III Pol II input

21U-RNA abundant 21U-RNA tRNA snRNA ncRNA

0.9 0.8 0.7

***

0.6 0.5 0.4 0.3 0.2 0.1 Chr I

16,380,000

Chr II

Chr III

Chr IV

Chr V

Chr X

1600

** ****

16,400,000

Chr IV genes Y7A9D.1

E

F 0.9 0.8

*

0.7 0.6 0.5 0.4 0.3 0.2 0.1 inside outside 21U-RNA 21U-RNA clusters clusters Chr IV

G 300 average enrichment signal for SNPC-4 bound tRNA genes

average enrichment signal for GL specfifc SNPC-4 bound tRNA genes

fraction of Pol III bound tRNA genes occupied by SNPC-4

1.0

250

**** **** ****

200

***

*

****

150 100 50

Chr I

Chr II

Chr III

inside outside Chr V 21U-RNA 21U-RNA clusters clusters

Chr X

Chr IV

1400

wildtype prde-1(-)

1200 1000 800 600 400 200 inside 21U-RNA clusters

outside 21U-RNA clusters

Chr IV

Figure 7. SNPC-4 Colocalizes with Pol III within 21U-RNA Clusters and Often Binds at tRNA Genes (A) Young adult SNPC-4:GFP ChIP-seq binding profiles for combined replicates across chromosome IV and annotated ncRNA gene tracks as described in Figure 4A, as well as young adult, whole animal ChIP-seq binding profiles for combined replicates of Pol III (brown) and Pol II (yellow). (B) Ratio of significant SNPC-4:GFP sites shared with Pol III compared to sites shared with Pol II inside and outside 21U-RNA clusters on chromosome IV. (C) SNPC-4:GFP and RNA polymerase ChIP-seq binding profiles as described in (A) at tRNA loci located within the 3.7 Mb 21U-RNA cluster. One gene is annotated for reference. (D and E) Fraction of Pol-III-bound tRNAs occupied by SNPC-4 across chromosomes (D) and inside and outside 21U-RNA clusters on chromosome IV (E) in the whole animal. Statistical significance was determined by comparing observed versus expected number of occurrences (*p % 0.05 and ***p % 0.001, Pearson’s chi-square test). (legend continued on next page)

154 Developmental Cell 31, 145–158, October 27, 2014 ª2014 Elsevier Inc.

Developmental Cell SNPC-4 Coats and Regulates piRNA Domains

2012; Weick et al., 2014), we examined whether the association between SNPC-4 and Pol III might influence aspects of 21U-RNA production other than transcription initiation. We noted that many (but not all) discrete SNPC-4 sites cobound by Pol III are unexpectedly associated with tRNA genes (Figure 7C), which are not canonical targets of SNAPc (James Faresse et al., 2012). SNPC-4/Pol III co-occupancy at tRNA genes was greatest on chromosome IV compared to other chromosomes (p = 0.0002; Figure 7D) and twice as frequent inside 21U-RNA clusters compared to outside those clusters (p = 0.03; Figure 7E), even though overall Pol III binding at tRNA genes was randomly distributed genome-wide (Figures S7B and S7C). The average strength of germline-specific SNPC-4 binding at occupied tRNA genes was significantly greater inside 21U-RNA domains compared to other regions on chromosome IV and on other chromosomes (Figures 7F and S7D). Intriguingly, in prde-1 mutants, 28% (8/29) of tRNA genes were no longer bound by SNPC-4, and SNPC-4 binding strength at remaining tRNA genes was vastly reduced (Figures 6B and 7G), suggesting that PRDE-1 might also contribute to this activity. Finally, sequences enriched in the strongest SNPC-4 peaks included known tRNA gene regulatory elements (the A and B boxes), which were primarily concentrated on chromosome IV in the whole animal and germline data sets (E value % 7.1 3 10108), but not the somatic data set (Figures S7E–S7G). By contrast, the consensus SNAPc element, the PSE, was overrepresented in all data sets (E value % 7.0 3 10327). Altogether, these results demonstrate a germline-specific association of SNPC-4 with Pol III at tRNA genes, a noncanonical target of SNAPc. tRNA genes, often in conjunction with partially or fully assembled Pol III complexes, have been previously implicated in chromatin organization in yeast and in humans (Donze, 2012), suggesting that the presence of SNPC-4 at these loci might contribute to the spatial organization of 21U-RNA domains in the germline (see Discussion). DISCUSSION The piRNA clusters of C. elegans are remarkable because they invoke the coordinated, tissue-specific regulation of thousands of individual noncoding 21U-RNA genes over a large region that contains many intervening coding genes with diverse expression patterns. Here, we used genome-scale analyses to provide evidence that SNPC-4 promotes the abundant expression of 21U-RNAs. Specifically in the germline, SNPC-4 localizes to subnuclear foci and exhibits concentrated binding across the two 21U-RNA-rich domains on chromosome IV. To our knowledge, extended occupancy of such large, contiguous chromosomal regions has not been described for other sequence-specific transcription factors, except for components involved in X chromosome dosage compensation (Ercan et al., 2007; Pferdehirt et al., 2011). Given the tissue-specific SNPC-4 binding profile, the mutual dependency with PRDE-1, and the reduction of mature 21U-RNA levels on decreased SNPC-4 activity, the

simplest interpretation of our data is that SNPC-4 acts at an early step in type I 21U-RNA biogenesis, perhaps facilitating transcription and/or early processing events. Thus, a factor known to act at promoters to regulate transcription of diverse essential small ncRNA genes can adopt germline-specific properties to globally and uniformly regulate thousands of individually transcribed 21U-RNA genes across large domains. Strikingly, our data also provide a second instance in which a Myb-like factor functions in piRNA regulation, as the A-MYB protein binds and regulates mouse pachytene piRNA clusters (Li et al., 2013). Unlike SNPC-4, A-MYB does not display distributed binding and is likely acting primarily as a sequence-specific transcription factor. However, the repeated involvement of a Mybrelated protein in piRNA regulation, despite the many differences in piRNA cluster organization and mechanism of transcription between worm and mouse, suggests an ancient regulatory relationship. trans-Acting Factors Contributing to Germline-Specific SNPC-4 Function Constitutive binding of SNPC-4 at most target genes raises the question of how its developmentally regulated germline-specific function at the two 21U-RNA clusters is achieved. Our discovery that SNPC-4 depends on PRDE-1 for this binding pattern provides one likely mechanism for this specificity, as PRDE-1 is expressed only in the germline and has no obvious function beyond 21U-RNA regulation. Given the mutual dependency of each protein on the other for recruitment to the same germline-specific foci, it is possible that both proteins must be present for the spreading of SNPC-4 (and possibly PRDE-1) across the 21URNA domains. However, there are at least some SNPC-4 binding sites in the 21U-RNA domains that are PRDE-1 independent and that contain the known SNAPc binding element, the PSE. Therefore, it is possible that SNPC-4 uses a subset of these sites to initiate binding and the germline-specific presence of PRDE-1 permits spreading throughout the domain. Experiments investigating the PRDE-1 binding pattern in comparison to that of SNPC-4, and whether it changes on loss of SNPC-4 activity, will further enlighten our understanding of this important regulatory relationship. Other mechanisms that could contribute to the germline-specific properties of SNPC-4 may involve other transcription factors. C. elegans encodes eight proteins (SNPC-1.1–1.4 and SNPC-3.1–3.4; see Table S1) orthologous to the two mammalian SNAPc subunits that provide ancillary DNA contacts to support SNAPC4 binding (Li et al., 2004). One or more of these eight proteins might be expressed specifically in the germline and modulate SNAPc binding in 21U-RNA-rich regions. However, whether SNPC-4 is acting as part of a canonical SNAPc complex within the 21U-RNA domains, or has been co-opted to function via a completely different mechanism, is unknown. Additionally, currently unidentified factor(s) that bind to the 21U-RNA upstream regulatory sequence only in the germline might also facilitate SNPC-4 binding across the 21U-RNA domains in that tissue.

(F and G) Graphs depict average SNPC-4:GFP binding strength of significant peaks (enrichment signal as determined by the SPP algorithm) at SNPC-4:GFPoccupied tRNA genes for germline-specific (F) and for whole animal (G) targets in wild-type and prde-1 mutant backgrounds. Statistically significant differences between inside 21U-RNA clusters and other genomic regions are marked with asterisks (*p % 0.05, **p % 0.01, ***p % 0.001, and ****p % 0.0001, Student’s t test).

Developmental Cell 31, 145–158, October 27, 2014 ª2014 Elsevier Inc. 155

Developmental Cell SNPC-4 Coats and Regulates piRNA Domains

Differential Binding Patterns of SNPC-4 and Potential Chromatin Interactions SNPC-4 exhibits both discrete and distributed binding events within the 21U-RNA regions in the germline. However, neither form of binding is enriched at any feature of the 21U-RNA locus, including the upstream 21U-RNA motif, transcription start site, or 50 nucleotide. Moreover, 99% of 21U-RNA loci are more than 500 bp from discrete binding sites, and proximity of these sites does not correlate with 21U-RNA abundance. These observations, along with pervasive, low-level SNPC-4 binding that coats the 21U-RNA domains, suggest that SNPC-4 might facilitate the coordinated, germline-specific expression of 21U-RNAs by affecting the genomic organization of the entire domain rather than separately regulating individual 21U-RNA loci. One possible mechanism for promoting such an organization is suggested by the presence of multiple SANT domains within SNPC-4. SANT domains are commonly found in components of chromatin-modifying complexes, such as Ada2, which promotes histone acetylation, and SmRT, a corepressor that promotes deacetylation (Boyer et al., 2004). SANT domains can interact with histone tails and stabilize their conformation, potentially facilitating presentation of the tail to an associated enzymatic activity. Biochemical studies of SNAPc in other organisms indicate that it interacts with nucleosomes at snRNA promoters (Zhao et al., 2001). Thus, the SANT domains of SNPC-4 might influence the chromatin state in the 21U-RNA regions through modulating nucleosome organization and/or histone modifications. Intriguingly, data collected from young adult animals suggest that the 21U-RNA regions have decreased nucleosome density (Cecere et al., 2012). The ability to analyze multiple properties of chromatin, including histone modifications, in the germline separately from the soma will be necessary to directly explore the relationship between the chromatin state, SNPC-4, and 21U-RNA expression. Another mechanism by which SNPC-4 could contribute to the genomic organization of 21U-RNA domains is through the interaction with Pol III and tRNA genes in the 21U-RNA domains. This association was not expected a priori, because tRNA genes are regulated by Pol III through an internal A/B box motif and not via SNAPc (Canella et al., 2010). Intriguingly, Pol III and tRNA loci have been implicated in various large-scale gene regulatory processes, including enhancer blocking, insulator activity, and establishment of chromatin boundaries (Donze, 2012). A current hypothesis is that expression from tRNA loci disrupts the spread of heterochromatin through ‘‘transcriptional interference,’’ or by serving as entry sites for chromatin-remodeling activities. Additionally, proteins that influence chromatin loop formation, such as cohesin, are often found in association with components of Pol III at tRNA genes and additional non-tRNA sites (Moqtaderi et al., 2010). Therefore, the tissue-specific association of SNPC-4 with RNA Pol III at tRNA genes and other genomic sites might promote the arrangement of the 21U-RNA-rich chromosomal domains into a conformation or location that is permissive for 21U-RNA expression, specifically in the germline. For example, SNPC-4 may utilize Pol-III-occupied tRNA genes as specific chromatin entry sites to facilitate its dispersion across the 21U-RNA domain or as contact points to help establish particular chromatin loop structures to facilitate robust 21URNA expression. Alternatively, interaction with tRNA genes

might help to position 21U-RNA clusters at the nuclear periphery, as has been recently shown for the association of the nuclear pore protein NPP-13 with tRNA and snoRNA genes (Ikegami and Lieb, 2013). Thus, SNPC-4 foci might be found near nuclear pores via interaction with Pol III and tRNA genes, aiding in either permissive chromatin conformation, posttranscriptional processing events of 21U-RNAs, and/or transport of 21U-RNAs to germ granules. Conclusions The mechanisms that translate information from chromatinassociated regulatory factors and sequence elements to the epigenetic states and 3D genomic domains to govern gene expression are still mysterious. The unusual pattern and remarkable specificity of SNPC-4 binding at 21U-RNA domains suggest that the genomic clustering of these individually transcribed but coordinately regulated genes contributes to their robust and germline-specific expression. This system provides a unique and exciting opportunity to dissect how regulation of defined genomic domains is achieved. EXPERIMENTAL PROCEDURES Detailed methods are available in the Supplemental Experimental Procedures. C. elegans Strains Strains were maintained at 20 C using standard methods unless otherwise noted (Brenner, 1974). N2 Bristol was used as a wild-type strain. See the Supplemental Experimental Procedures for transgenic strain construction and all strains used in this study. SNPC-4:GFP Foci Analysis Young adult SNPC-4:GFP gonads were antibody stained with a 1:100 dilution of rabbit a-GFP antibody (BD BioSciences), 1:100 dilution of Alexa Fluor 488 goat a-rabbit immunoglobulin G (IgG), and 0.5 mg/ml DAPI as described in the Supplemental Experimental Procedures. Immunostained gonad images were subdivided based on nuclear morphology into mitotic, transition, and meiotic zones. The number of SNPC-4 foci was determined for ten nuclei in each zone for at least five animals per genotype. See Figure S2 for a representative example of foci quantification. ChIP-Seq Worm samples were crosslinked in 2% formaldehyde for 30 min. Sonicated extract containing 2 mg of protein was immunoprecipitated as described elsewhere (Zhong et al., 2010) with 7.5 mg of a-GFP antibody (gift from Anthony Hyman) (Zhong et al., 2010) or a-Pol III antibody (gift from Jason Lieb) (Ikegami and Lieb, 2013) for the majority of samples. For ChIP-seq data comparing wild-type and prde-1 mutant backgrounds, 4 mg of sonicated lysate was immunoprecipitated and prepared for HiSeq sequencing using Ovation Ultralow DR Multiplex System (NuGEN Technologies). Reads were aligned to C. elegans WS220 genome build, and significant peaks were identified with ChIP-seq processing pipeline (SPP) and irreproducible discovery rate, or IDR, algorithms (Landt et al., 2012). Target calling was performed separately for 21U-RNAs and other gene models (i.e., coding and ncRNAs). The gene in closest proximity to the peak maximum of a binding site was defined as a transcription factor candidate target. See the Supplemental Experimental Procedures for detailed protocols for ChIP-seq, as well as downstream analyses such as metagene and motif analysis. ChIP-qPCR Chromatin was immunoprecipitated as described earlier, except control samples were immunoprecipitated with 7.5 mg of goat a-IgG antibody (R&D Systems). See the Supplemental Experimental Procedures for quantitative PCR (qPCR) conditions.

156 Developmental Cell 31, 145–158, October 27, 2014 ª2014 Elsevier Inc.

Developmental Cell SNPC-4 Coats and Regulates piRNA Domains

RNA Isolation and Analysis L1 animals were fed bacteria expressing RNA interference (RNAi) clones from the Ahringer Library (Fraser et al., 2000) until collection at the young adult stage. RNAi-treated young adults were harvested in TRIzol reagent (Invitrogen). Total RNA was then treated with DNA-free DNaseI (Ambion). For small RNA sequencing, a total of 5 mg of DNase-treated RNA harvested from snpc-4(RNAi) and wild-type young adults was subjected to the Illumina Small RNA v1.5 Sample Preparation Kit, and 22–30 bp small RNAs were sequenced (Illumina). Trimmed reads were counted within WS235 transcript sets mapped to WS220 coordinates and normalized to reads per million. Read abundance for sequences present with at least one read in both conditions was analyzed in R, using a paired-sample, one-sided Wilcoxon signedrank test. For qRT-PCR, the SuperScript III First-Strand Synthesis System for RT-PCR (Invitrogen) was performed to generate complementary DNA. qPCR reactions were performed in triplicate for three biological replicates. Small RNA primers were designed as described elsewhere (Chen et al., 2005; Das et al., 2008). Relative expression levels were quantified using the DDCt method. See the Supplemental Experimental Procedures for detailed protocols.

ACCESSION NUMBERS ChIP-seq and RNA sequencing experiments are submitted to Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/) under accession number GSE53412. SUPPLEMENTAL INFORMATION Supplemental Information includes Supplemental Experimental Procedures, seven figures, and four tables and can be found with this article online at http://dx.doi.org/10.1016/j.devcel.2014.09.015. ACKNOWLEDGMENTS We thank the modENCODE consortium, especially Wei Niu, for help with ChIPseq; members from the University of Chicago Institute for Genomics and Systems Biology for ChIP-seq data processing; LaDeana Hillier for target calling analysis; Tyson Edwards for technical assistance; Jason Lieb and Anthony Hyman for antibodies; Lynn Cooley and Marc Hammarlund for equipment; and Joseph Culotti for the unc-130(ev505) strain. We especially thank Eric Miska and his lab for providing several prde-1 reagents. Some strains were provided by the Caenorhabditis Genetics Center (NIH P40 OD010440). This work was supported by grants from the March of Dimes (FY10–433) and National Human Genome Research Institute (U54HG007002 and U01HG004267). K.E.G. is supported by a grant from the National Institute of General Medical Sciences (F32GM101778). Received: January 6, 2014 Revised: August 25, 2014 Accepted: September 25, 2014 Published: October 27, 2014

Billi, A.C., Freeberg, M.A., Day, A.M., Chun, S.Y., Khivansara, V., and Kim, J.K. (2013). A conserved upstream motif orchestrates autonomous, germline-enriched expression of Caenorhabditis elegans piRNAs. PLoS Genet. 9, e1003392. Blumenthal, T. (2005). Trans-splicing and operons. WormBook, 1–9. Boyer, L.A., Latek, R.R., and Peterson, C.L. (2004). The SANT domain: a unique histone-tail-binding module? Nat. Rev. Mol. Cell Biol. 5, 158–163. Brennecke, J., Aravin, A.A., Stark, A., Dus, M., Kellis, M., Sachidanandam, R., and Hannon, G.J. (2007). Discrete small RNA-generating loci as master regulators of transposon activity in Drosophila. Cell 128, 1089–1103. Brenner, S. (1974). The genetics of Caenorhabditis elegans. Genetics 77, 71–94. Canella, D., Praz, V., Reina, J.H., Cousin, P., and Hernandez, N. (2010). Defining the RNA polymerase III transcriptome: Genome-wide localization of the RNA polymerase III transcription machinery in human cells. Genome Res. 20, 710–721. Cecere, G., Zheng, G.X., Mansisidor, A.R., Klymko, K.E., and Grishok, A. (2012). Promoters recognized by forkhead proteins exist for individual 21URNAs. Mol. Cell 47, 734–745. Chen, W.V., and Maniatis, T. (2013). Clustered protocadherins. Development 140, 3297–3302. Chen, C., Ridzon, D.A., Broomer, A.J., Zhou, Z., Lee, D.H., Nguyen, J.T., Barbisin, M., Xu, N.L., Mahuvakar, V.R., Andersen, M.R., et al. (2005). Realtime quantification of microRNAs by stem-loop RT-PCR. Nucleic Acids Res. 33, e179. Das, P.P., Bagijn, M.P., Goldstein, L.D., Woolford, J.R., Lehrbach, N.J., Sapetschnig, A., Buhecha, H.R., Gilchrist, M.J., Howe, K.L., Stark, R., et al. (2008). Piwi and piRNAs act upstream of an endogenous siRNA pathway to suppress Tc3 transposon mobility in the Caenorhabditis elegans germline. Mol. Cell 31, 79–90. Deng, W., Lee, J., Wang, H., Miller, J., Reik, A., Gregory, P.D., Dean, A., and Blobel, G.A. (2012). Controlling long-range genomic interactions at a native locus by targeted tethering of a looping factor. Cell 149, 1233–1244. Donze, D. (2012). Extra-transcriptional functions of RNA Polymerase III complexes: TFIIIC as a potential global chromatin bookmark. Gene 493, 169–175. Ercan, S., Giresi, P.G., Whittle, C.M., Zhang, X., Green, R.D., and Lieb, J.D. (2007). X chromosome repression by localization of the C. elegans dosage compensation machinery to sites of transcription initiation. Nat. Genet. 39, 403–408. Fraser, A.G., Kamath, R.S., Zipperlen, P., Martinez-Campos, M., Sohrmann, M., and Ahringer, J. (2000). Functional genomic analysis of C. elegans chromosome I by systematic RNA interference. Nature 408, 325–330. Gerstein, M.B., Lu, Z.J., Van Nostrand, E.L., Cheng, C., Arshinoff, B.I., Liu, T., Yip, K.Y., Robilotto, R., Rechtsteiner, A., Ikegami, K., et al.; modENCODE Consortium (2010). Integrative analysis of the Caenorhabditis elegans genome by the modENCODE project. Science 330, 1775–1787. Gu, W., Lee, H.C., Chaves, D., Youngman, E.M., Pazour, G.J., Conte, D., Jr., and Mello, C.C. (2012). CapSeq and CIP-TAP identify Pol II start sites and reveal capped small RNAs as C. elegans piRNA precursors. Cell 151, 1488– 1500.

REFERENCES Andrey, G., Montavon, T., Mascrez, B., Gonzalez, F., Noordermeer, D., Leleu, M., Trono, D., Spitz, F., and Duboule, D. (2013). A switch between topological domains underlies HoxD genes collinearity in mouse limbs. Science 340, 1234167. Aravin, A., Gaidatzis, D., Pfeffer, S., Lagos-Quintana, M., Landgraf, P., Iovino, N., Morris, P., Brownstein, M.J., Kuramochi-Miyagawa, S., Nakano, T., et al. (2006). A novel class of small RNAs bind to MILI protein in mouse testes. Nature 442, 203–207. Batista, P.J., Ruby, J.G., Claycomb, J.M., Chiang, R., Fahlgren, N., Kasschau, K.D., Chaves, D.A., Gu, W., Vasale, J.J., Duan, S., et al. (2008). PRG-1 and 21U-RNAs interact to form the piRNA complex required for fertility in C. elegans. Mol. Cell 31, 67–78.

Ikegami, K., and Lieb, J.D. (2013). Integral nuclear pore proteins bind to Pol IIItranscribed genes and are required for Pol III transcript processing in C. elegans. Mol. Cell 51, 840–849. Ishizu, H., Siomi, H., and Siomi, M.C. (2012). Biology of PIWI-interacting RNAs: new insights into biogenesis and function inside and outside of germlines. Genes Dev. 26, 2361–2373. James Faresse, N., Canella, D., Praz, V., Michaud, J., Romascano, D., and Hernandez, N. (2012). Genomic study of RNA polymerase II and III SNAPcbound promoters reveals a gene transcribed by both enzymes and a broad use of common activators. PLoS Genet. 8, e1003028. Jawdekar, G.W., and Henry, R.W. (2008). Transcriptional regulation of human small nuclear RNA genes. Biochim. Biophys. Acta 1779, 295–305.

Developmental Cell 31, 145–158, October 27, 2014 ª2014 Elsevier Inc. 157

Developmental Cell SNPC-4 Coats and Regulates piRNA Domains

Kudron, M., Niu, W., Lu, Z., Wang, G., Gerstein, M., Snyder, M., and Reinke, V. (2013). Tissue-specific direct targets of Caenorhabditis elegans Rb/E2F dictate distinct somatic and germline programs. Genome Biol. 14, R5.

Ruby, J.G., Jan, C., Player, C., Axtell, M.J., Lee, W., Nusbaum, C., Ge, H., and Bartel, D.P. (2006). Large-scale sequencing reveals 21U-RNAs and additional microRNAs and endogenous siRNAs in C. elegans. Cell 127, 1193–1207.

Landt, S.G., Marinov, G.K., Kundaje, A., Kheradpour, P., Pauli, F., Batzoglou, S., Bernstein, B.E., Bickel, P., Brown, J.B., Cayting, P., et al. (2012). ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Res. 22, 1813–1831.

Sarov, M., Murray, J.I., Schanze, K., Pozniakovski, A., Niu, W., Angermann, K., Hasse, S., Rupprecht, M., Vinis, E., Tinney, M., et al. (2012). A genome-scale resource for in vivo tag-based protein function exploration in C. elegans. Cell 150, 855–866.

Lau, N.C., Seto, A.G., Kim, J., Kuramochi-Miyagawa, S., Nakano, T., Bartel, D.P., and Kingston, R.E. (2006). Characterization of the piRNA complex from rat testes. Science 313, 363–367. Li, C., Harding, G.A., Parise, J., McNamara-Schroeder, K.J., and Stumph, W.E. (2004). Architectural arrangement of cloned proximal sequence element-binding protein subunits on Drosophila U1 and U6 snRNA gene promoters. Mol. Cell. Biol. 24, 1897–1906. Li, X.Z., Roy, C.K., Dong, X., Bolcun-Filas, E., Wang, J., Han, B.W., Xu, J., Moore, M.J., Schimenti, J.C., Weng, Z., and Zamore, P.D. (2013). An ancient transcription factor initiates the burst of piRNA production during early meiosis in mouse testes. Mol. Cell 50, 67–81. Luteijn, M.J., and Ketting, R.F. (2013). PIWI-interacting RNAs: from generation to transgenerational epigenetics. Nat. Rev. Genet. 14, 523–534. Moqtaderi, Z., Wang, J., Raha, D., White, R.J., Snyder, M., Weng, Z., and Struhl, K. (2010). Genomic binding profiles of functionally distinct RNA polymerase III transcription complexes in human cells. Nat. Struct. Mol. Biol. 17, 635–640. Niu, W., Lu, Z.J., Zhong, M., Sarov, M., Murray, J.I., Brdlik, C.M., Janette, J., Chen, C., Alves, P., Preston, E., et al. (2011). Diverse transcription factor binding features revealed by genome-wide ChIP-seq in C. elegans. Genome Res. 21, 245–254. Pferdehirt, R.R., Kruesi, W.S., and Meyer, B.J. (2011). An MLL/COMPASS subunit functions in the C. elegans dosage compensation complex to target X chromosomes for transcriptional regulation of gene expression. Genes Dev. 25, 499–515. Phillips, C.M., and Dernburg, A.F. (2006). A family of zinc-finger proteins is required for chromosome-specific pairing and synapsis during meiosis in C. elegans. Dev. Cell 11, 817–829.

Saxe, J.P., and Lin, H. (2011). Small noncoding RNAs in the germline. Cold Spring Harb. Perspect. Biol. 3, a002717. Shi, Z., Montgomery, T.A., Qi, Y., and Ruvkun, G. (2013). High-throughput sequencing reveals extraordinary fluidity of miRNA, piRNA, and siRNA pathways in nematodes. Genome Res. 23, 497–508. Su, Y., Song, Y., Wang, Y., Jessop, L., Zhan, L., and Stumph, W.E. (1997). Characterization of a Drosophila proximal-sequence-element-binding protein involved in transcription of small nuclear RNA genes. Eur. J. Biochem. 248, 231–237. Wang, G., and Reinke, V. (2008). A C. elegans Piwi, PRG-1, regulates 21URNAs during spermatogenesis. Curr. Biol. 18, 861–867. Weick, E.M., Sarkies, P., Silva, N., Chen, R.A., Moss, S.M., Cording, A.C., Ahringer, J., Martinez-Perez, E., and Miska, E.A. (2014). PRDE-1 is a nuclear factor essential for the biogenesis of Ruby motif-dependent piRNAs in C. elegans. Genes Dev. 28, 783–796. Wong, M.W., Henry, R.W., Ma, B., Kobayashi, R., Klages, N., Matthias, P., Strubin, M., and Hernandez, N. (1998). The large subunit of basal transcription factor SNAPc is a Myb domain protein that interacts with Oct-1. Mol. Cell. Biol. 18, 368–377. Yoon, J.B., Murphy, S., Bai, L., Wang, Z., and Roeder, R.G. (1995). Proximal sequence element-binding transcription factor (PTF) is a multisubunit complex required for transcription of both RNA polymerase II- and RNA polymerase IIIdependent small nuclear RNA genes. Mol. Cell. Biol. 15, 2019–2027. Zhao, X., Pendergrast, P.S., and Hernandez, N. (2001). A positioned nucleosome on the human U6 promoter allows recruitment of SNAPc by the Oct-1 POU domain. Mol. Cell 7, 539–549. Zhong, M., Niu, W., Lu, Z.J., Sarov, M., Murray, J.I., Janette, J., Raha, D., Sheaffer, K.L., Lam, H.Y., Preston, E., et al. (2010). Genome-wide identification of binding sites defines distinct functions for Caenorhabditis elegans PHA-4/ FOXA in development and environmental response. PLoS Genet. 6, e1000848.

158 Developmental Cell 31, 145–158, October 27, 2014 ª2014 Elsevier Inc.

The C. elegans SNAPc component SNPC-4 coats piRNA domains and is globally required for piRNA abundance.

The Piwi/Piwi-interacting RNA (piRNA) pathway protects the germline from the activity of foreign sequences such as transposons. Remarkably, tens of th...
4MB Sizes 1 Downloads 6 Views