Article

Bidirectional Transcription Arises from Two Distinct Hubs of Transcription Factor Binding and Active Chromatin

Authors Benjamin S. Scruggs, Daniel A. Gilchrist, ..., David C. Fargo, Karen Adelman

Correspondence [email protected]

In Brief Scruggs et al. precisely define divergent sense and upstream anti-sense transcription start sites in mouse cells and find that increased distance between start sites corresponds to larger regions of nucleosome depletion, more transcription factor occupancy, and higher gene activity. Notably, enhancer-like features mark upstream nucleosomes at divergent genes with highest expression.

Highlights

• Start-seq reveals variable spacing between coupled sense and anti-sense TSSs

• Bidirectional genes with distant anti-sense TSSs have enlarged NDRs

• Larger NDRs encompass more TF motifs, enabling greater TF binding and gene activity

• Anti-sense TSSs of inducible genes are enriched in PU.1, H3K4me1, H3K27ac, and p300

Scruggs et al., 2015, Molecular Cell 58, 1–12 June 18, 2015 ª2015 Elsevier Inc. http://dx.doi.org/10.1016/j.molcel.2015.04.006

Accession Numbers GSE62151

Please cite this article in press as: Scruggs et al., Bidirectional Transcription Arises from Two Distinct Hubs of Transcription Factor Binding and Active Chromatin, Molecular Cell (2015), http://dx.doi.org/10.1016/j.molcel.2015.04.006

Molecular Cell

Article

Bidirectional Transcription Arises from Two Distinct Hubs of Transcription Factor Binding and Active Chromatin

Benjamin S. Scruggs,1,3 Daniel A. Gilchrist,1,3,4 Sergei Nechaev,1,5 Ginger W. Muse,1 Adam Burkholder,2 David C. Fargo,2 and Karen Adelman1,*

1Epigenetics and Stem Cell Biology Laboratory
2Center for Integrative Bioinformatics
National Institute of Environmental Health Sciences, Research Triangle Park, NC 27709, USA
3Co-first author
4Present address: National Human Genome Research Institute, Bethesda, MD 20892, USA
5Present address: University of North Dakota School of Medicine, Grand Forks, ND 58202, USA
*Correspondence: [email protected]
http://dx.doi.org/10.1016/j.molcel.2015.04.006

SUMMARY

Anti-sense transcription originating upstream of mammalian protein-coding genes is a well-documented phenomenon, but remarkably little is known about the regulation or function of anti-sense promoters and the non-coding RNAs they generate. Here we define at nucleotide resolution the divergent transcription start sites (TSSs) near mouse mRNA genes. We find that coupled sense and anti-sense TSSs precisely define the boundaries of a nucleosome-depleted region (NDR) that is highly enriched in transcription factor (TF) motifs. Notably, as the distance between sense and anti-sense TSSs increases, so does the size of the NDR, the level of signal-dependent TF binding, and gene activation. We further discover a group of anti-sense TSSs in macrophages with an enhancer-like chromatin signature. Interestingly, this signature identifies divergent promoters that are activated during immune challenge. We propose that anti-sense promoters serve as platforms for TF binding and establishment of active chromatin to further regulate or enhance sense-strand mRNA expression.

INTRODUCTION

How cells achieve the appropriate responses to intrinsic and extrinsic signals remains a central question in biology. Changes in gene expression constitute a major facet of cellular responses to all types of stimuli, and one of the great accomplishments of molecular biology has been the elucidation of core promoter sequences around which the transcription machinery is organized (Butler and Kadonaga, 2002). Structural and biochemical studies have helped establish basic rules regarding spatial arrangements of DNA sequence elements, complexes such as the general transcription factors (TFs) and mediator that facilitate RNA Polymerase II (Pol II) binding, and the role of chromatin in defining the promoter environment (Conaway and Conaway, 2013; Roeder, 2005). However, advances in genomic technology have revealed that far more of the mammalian genome is transcribed than previously anticipated, including intergenic regions upstream of protein-coding genes and distal enhancers (Carninci et al., 2005; Cheng et al., 2005). Thus, there is much still to learn about the logic underlying the transcriptional repertoire of mammalian cells.

Notably, recent studies have revealed that Pol II transcribes in both sense and anti-sense directions near many mRNA genes (Core et al., 2008; Preker et al., 2008; Seila et al., 2008). At such bidirectional promoters, Pol II has been shown to initiate transcription and undergo promoter-proximal pausing at both the protein-coding, sense TSS and from within the upstream region in the anti-sense direction (Core et al., 2008; Duttke et al., 2015; Flynn et al., 2011). Such divergent transcription is particularly widespread at mammalian promoters, which typically lack key core promoter elements such as the TATA motif and are thought to be information-poor due to their richness in CpG dinucleotides. This enigmatic, dual promoter structure raises fundamental questions about the specificity, purpose, and regulation of such non-coding upstream anti-sense transcription.

Whereas transcription on the sense strand leads to processive elongation of mRNAs that can extend tens of kilobases (kb) in length, transcription in the anti-sense direction is much less processive and typically results in unstable non-coding RNA (ncRNA) species of limited length (<2 kb; Flynn et al., 2011; Ntini et al., 2013; Preker et al., 2008; Sigova et al., 2013). An elegant explanation for the differential processivity arose from studies of DNA sequences surrounding bidirectional promoters.
Sequences recognized by the U1 small nuclear ribonucleoprotein (U1 snRNP), which promotes efficient RNA splicing and processive Pol II elongation (Berg et al., 2012), are enriched downstream of sense TSSs. In contrast, poly-A sequences (PAS) that trigger termination of transcription and RNA poly-adenylation (Proudfoot, 2011) are much more frequent downstream of anti-sense TSSs (Almada et al., 2013; Andersen et al., 2012; Ntini et al., 2013).


Thus, a “U1-PAS axis” is proposed to drive the differences in Pol II processivity in the sense and anti-sense directions. Intriguingly, many distal enhancer regions also display bidirectional transcription initiation and generate short, unstable transcripts (De Santa et al., 2010; Kim et al., 2010), raising parallels between enhancers and anti-sense promoters near protein-coding genes (Core et al., 2014). Enhancers are sites of active chromatin modifications and high TF occupancy that are suggested to boost mRNA output by looping toward their target gene and increasing the local concentration of TFs and other transcription stimulatory complexes (Calo and Wysocka, 2013).

To address what function(s), if any, are served by the non-coding anti-sense transcription arising upstream of protein-coding genes, we precisely annotated both sense and anti-sense TSSs in murine macrophages. We found that bidirectional transcription is widespread and arises from two discrete, focused TSSs that are adjacent to distinct peaks of TF binding sites and TF occupancy. We present evidence that the distance between divergent TSSs impacts promoter structure: mRNA promoters with distant anti-sense TSSs have larger nucleosome-depleted regions (NDRs), more accessible TF motifs, elevated TF binding, and greater transcription activation. Further, we discovered a subset of anti-sense TSSs with characteristics of enhancers such as enrichment in histone H3 mono-methylated at K4 (H3K4me1), H3 acetylated at K27 (H3K27ac), and p300 occupancy. These TSSs are also highly bound by PU.1, a master regulator of the macrophage lineage implicated in the establishment of tissue-specific enhancers. Notably, divergent promoters bearing this signature around the anti-sense TSS are highly active and strongly inducible by immune challenge, suggesting that factors associated with anti-sense TSSs might modulate mRNA expression.
RESULTS

Start-Seq Precisely Identifies Transcription Start Sites

To elucidate the regulation and function of bidirectional transcription near protein-coding genes, we first obtained a high-resolution view of Pol II activity across the mouse genome using a recently described strategy for isolation of TSS-associated RNAs (Henriques et al., 2013; Nechaev et al., 2010). High-throughput sequencing of these nascent 25–65 nucleotide (nt) capped RNA species from the 5′-end (which we refer to as Start-seq) allows for precise definition of TSSs at nucleotide resolution. This method has previously enabled characterization of TSS-proximal sequence elements and promoter structure in Drosophila (Gilchrist et al., 2010; Nechaev et al., 2010), but such studies have not yet been carried out in mammals.

We sequenced start site-associated RNAs (or Start-RNAs) from murine bone marrow-derived macrophages that were unstimulated or challenged with bacterial lipopolysaccharide (LPS) to capture TSSs that were basally active as well as those induced by immune stimulation. Roughly half of RefSeq genes examined showed significant Start-RNA reads within a 2 kilobase (kb) window around their promoter, indicative of Pol II initiation and gene activity (Figure S1A; 12,228 of 23,747 RefSeq genes). In many cases, the peak of Start-RNA reads on the sense strand was in perfect agreement with RefSeq annotation (Figure 1A), but Start-RNA peaks could also be offset from the annotated TSS (Figure 1B). On a global scale, Start-RNA reads were diffusely distributed around RefSeq TSSs (Figure 1C, left panel; Figure 1D and Figure S1A), suggesting two possibilities: (1) the use of dispersed TSSs by Pol II, or (2) modest discrepancies between start site usage in macrophages and the current mouse genomic annotation.

To distinguish between these possibilities, the peaks of Start-RNA data were used to define “observed” TSSs in macrophages. Aligning and graphing Start-RNA reads with respect to these observed sense-strand TSSs revealed a tightening of read distribution (Figure 1C, right panel, and Figure 1D) and a clear increase in the information content present near observed sense TSSs as compared to RefSeq annotation (Figure 1E). Further, de novo motif discovery identified sequences resembling the Inr and TATA motif (Butler and Kadonaga, 2002) that were enriched around observed TSSs (Figure S1B). Indeed, motif analysis confirmed a greater occurrence of both Inr and TATA elements at observed versus RefSeq TSSs (Figure S1C). These data demonstrate the power of Start-seq for defining TSSs with high confidence and bringing promoter regulatory features into focus.

Interestingly, Start-RNAs on the anti-sense strand displayed a broad distribution when aligned against the observed sense TSSs (Figure 1F, left panel), indicating that anti-sense transcription does not initiate at a set distance from sense TSSs. However, anti-sense Start-RNAs frequently exhibited focused peaks at individual loci (Figures 1A and 1B). We thus used our Start-RNA datasets to define the position of anti-sense TSSs occurring within 1 kb upstream of the sense TSS. We were able to clearly define anti-sense TSSs at >75% of active promoters (9,219 of 12,228).
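As a rough illustration of defining an "observed" TSS from the peak of Start-RNA data, the following Python sketch picks the position with the most read 5′ ends near an annotated TSS. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `call_observed_tss`, the ±1,000 nt default window (chosen to mirror the 2 kb window mentioned above), and the tie-breaking rule are all hypothetical, and the read-count threshold used in the study to call a promoter active is omitted.

```python
from collections import Counter

def call_observed_tss(five_prime_positions, annotated_tss, window=1000):
    """Return the coordinate with the most read 5' ends near an annotated TSS.

    five_prime_positions: iterable of integer coordinates of Start-RNA 5' ends
    on one strand (hypothetical input format).
    Returns None when no read falls inside the window.
    """
    # Tally 5' ends that lie within +/- window of the annotated TSS.
    counts = Counter(
        pos for pos in five_prime_positions
        if abs(pos - annotated_tss) <= window
    )
    if not counts:
        return None
    best = max(counts.values())
    # Assumed tie-break: prefer the position closest to the annotation,
    # then the smaller coordinate.
    return min(
        (pos for pos, n in counts.items() if n == best),
        key=lambda pos: (abs(pos - annotated_tss), pos),
    )
```

For example, reads piling up 38 nt away from an annotated position would move the observed TSS to that peak, similar in spirit to the offset described for Figure 1B.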
Importantly, we note that TSSs with levels of anti-sense transcription that fall below the stringent threshold employed for defining an anti-sense TSS location are not necessarily devoid of anti-sense transcription activity. Alignment of anti-sense Start-RNAs around these anti-sense TSSs (Figure 1G, left panel) revealed generally focused transcription initiation. To probe the factors underlying this specificity, we examined DNA sequence in the region immediately surrounding anti-sense TSSs. This analysis revealed strong information content at the anti-sense TSSs (Figure 1H) and a frequency of Inr motif occurrence (27%) that was nearly identical to that observed at sense TSSs (28%; Figure S1C).

Together, these data indicate that Pol II initiates transcription specifically and divergently near the majority of active genes in macrophages. However, across different genes, the location of non-coding anti-sense initiation is set at a variable distance upstream of the coding sense TSS location. The median distance between sense and anti-sense TSSs is 176 base pairs (bp), but a considerable fraction of anti-sense TSSs are more distal, with many genes displaying sense/anti-sense distances of greater than 200 bp (Figure S1D).

Sense and Anti-sense TSSs Represent a Coupled Promoter Unit

Although paired sense and anti-sense TSSs flank the same regulatory information, it is not clear whether promoter sequences regulate bidirectional transcription in unison or if Pol II activity


Figure 1. High-Resolution Mapping of TSSs Reveals Focused Transcription Initiation in Sense and Anti-sense Directions

(A and B) Pol II ChIP-seq data (red) show peaks of Pol II at the sense (green arrow) and anti-sense (purple arrow) TSSs. Start-seq data are shown for the sense strand (green) and anti-sense strand (purple). RefSeq gene models are shown in black with TSS depicted by arrows. In (A), the RefSeq TSS for Socs6 (black arrow) is aligned exactly with the observed, sense-strand TSS identified by Start-seq data. In (B), the observed sense-strand TSS is slightly offset (38 nt) from RefSeq annotation for the Btg2 gene.
(C) Distribution of Start-seq data on the sense strand, aligned around RefSeq-annotated TSSs (left, black arrow) or the peak of sense-strand Start-RNA data in this region (right, green arrow), which we term the observed sense TSSs. Color bar at bottom indicates read depth. n = 12,228 genes defined as active in mouse macrophages.
(D) Average distribution of Start-seq reads at single-nucleotide resolution in a 21 nt window centered on either RefSeq (black) or sense TSSs (green). Data are for genes shown in (C).
(E) Information content in a ±5 nt window around RefSeq and sense TSSs determined by WebLogo. Position 0 indicates the RefSeq (top) or sense TSS (bottom).
(F) Sense- and anti-sense-strand (green and purple, respectively) Start-RNA reads centered around sense TSSs. Note focused initiation around sense TSSs (right) with dispersed anti-sense signal (left).
(G) Sense- and anti-sense-strand (green and purple, respectively) Start-RNA reads centered around anti-sense TSSs (purple arrows) reveal generally focused anti-sense initiation from the 9,219 promoters possessing sufficient Start-seq data for identification of sense and anti-sense TSSs.
(H) Information content around anti-sense TSSs as in (E).
See also Figure S1.

in each direction could show distinct, uncoupled regimes. To investigate this, we examined the transcriptional profile of bidirectional promoters in resting and activated macrophages. RNA Pol II ChIP-seq and Start-RNA-seq were performed on macrophages challenged with LPS for 0, 30, or 120 min (referred to hereafter as: untreated, 30’ LPS, and 120’ LPS). Protein-coding genes that were transcriptionally activated by LPS were identified by increases in Pol II ChIP-seq signal within the gene body at each time point as compared to untreated macrophages (Figure S2A). In agreement with earlier work (Amit et al., 2009; Bhatt et al., 2012; Escoubet-Lozach et al., 2011; Ramirez-Carrozzi et al., 2006), genes could be readily classified as “Early,” where Pol II levels increased across the gene by 30’ (Figure 2A), “Late,” where increased Pol II elongation was detected only at 120’ after LPS treatment (Figure 2B), or “Control,” where Pol II signal did not increase within the gene at either time point (Figure S2B). These gene classes were validated by analysis of published RNA-seq data, wherein chromatin-associated RNA species that had elongated to >200 nt in length were isolated over a time course of bacterial challenge in macrophages (Bhatt et al., 2012). Indeed, elongated transcripts were evident at Early genes at the 30’ time point, whereas Late genes underwent transcription following 120’ of immune challenge (Figure S2C).
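The Early/Late/Control logic described above can be sketched in Python as follows. This is an illustrative sketch only: the text does not state the fold-change cutoff or normalization used, so the 2-fold threshold (`min_fold`), the pseudocount, the dictionary keys, and the function name `classify_gene` are all assumptions introduced here.

```python
def classify_gene(body_signal, min_fold=2.0, pseudocount=1.0):
    """Classify a gene by gene-body Pol II ChIP-seq signal over an LPS time course.

    body_signal: dict with keys 'untreated', '30min', '120min' (hypothetical
    format) holding gene-body Pol II signal at each time point.
    min_fold and pseudocount are assumed values, not taken from the paper.
    """
    base = body_signal["untreated"] + pseudocount
    fold_30 = (body_signal["30min"] + pseudocount) / base
    fold_120 = (body_signal["120min"] + pseudocount) / base
    if fold_30 >= min_fold:
        return "Early"    # Pol II already increased across the gene by 30 min
    if fold_120 >= min_fold:
        return "Late"     # increase detected only at 120 min
    return "Control"      # no increase at either time point
```

A gene whose signal rises from 10 to 50 by 30 min would be called Early under these assumed settings, while one that stays near 10 at 30 min and reaches 60 at 120 min would be called Late.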

In general, the number of Start-RNAs from the sense TSS was considerably higher than at the coupled anti-sense TSS (e.g., Figures 2A and 2B). However, Early and Late genes often showed concomitant increases in sense and anti-sense Start-RNAs upon gene activation, consistent with earlier indications of coordinated transcription in the two directions (Sigova et al., 2013). For example, both sense and anti-sense Start-RNAs peaked after 30’ of LPS treatment at the Early gene Cxcl1 and diminished at 120’ as Pol II levels subsided (Figure 2A). Likewise, the Late gene Socs1 showed maximal levels of Start-RNAs in both directions after 120’ of LPS challenge.

To globally evaluate the degree of coordination between sense and anti-sense transcription, we first analyzed the levels of Start-RNAs from all active genes in resting macrophages. We calculated Pearson correlation coefficients between Start-RNA reads from coupled sense and anti-sense promoters, as compared to reads from randomly paired sense and anti-sense TSSs (see Supplemental Experimental Procedures). This analysis revealed stronger correspondence between Start-RNAs at coupled bidirectional TSSs than between TSSs paired at random (Figure 2C; bidirectional pairs are “right-shifted” toward higher correlation coefficients).
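The comparison of coupled versus randomly paired TSSs can be illustrated with a small Python sketch. The exact procedure is described in the Supplemental Experimental Procedures, which are not reproduced here, so the version below is an assumption-laden simplification: it computes one Pearson coefficient for the true sense/anti-sense pairing and a null distribution from shuffled pairings. The function names, the number of shuffles, and the fixed seed are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def coupled_vs_random(sense, antisense, n_shuffles=1000, seed=0):
    """Compare correlation of coupled TSS pairs with randomly re-paired TSSs.

    sense[i] and antisense[i] are Start-RNA read counts for the i-th
    bidirectional promoter. Returns the coupled coefficient and a list of
    coefficients obtained after shuffling the anti-sense values.
    """
    rng = random.Random(seed)
    coupled_r = pearson(sense, antisense)
    shuffled = list(antisense)
    random_rs = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break the pairing between sense and anti-sense
        random_rs.append(pearson(sense, shuffled))
    return coupled_r, random_rs
```

If coupled promoters are coordinately regulated, `coupled_r` should sit in the upper tail of `random_rs`, which is one way to read the "right-shifted" distribution described for Figure 2C.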



Figure 2. Gene Activation Leads to Coupled Bidirectional Transcription Initiation, but Pol II Only Elongates Effectively in the Sense Direction

(A and B) Pol II ChIP-seq (red), sense Start-RNA (green), and anti-sense Start-RNA (purple) data are shown at representative Early (A) and Late (B) response genes in untreated macrophages or following 30’ or 120’ LPS challenge. Gene models are shown in black with observed sense TSSs depicted by arrows.
(C) Distribution of Pearson correlation coefficients for Start-RNA levels at sense TSSs paired with adjacent anti-sense TSSs (orange) is significantly higher than when paired randomly with other anti-sense TSSs.
(D) Fold change in Start-RNA levels in the sense (S) or anti-sense (A-S) TSS region (TSS ± 100 bp) was calculated for Early, Late, and Control genes. Fold changes were calculated between untreated and 30’ LPS-treated macrophages (left), and 30’ and 120’ LPS-treated samples (right).
(E) Distribution of Pearson correlation coefficients for fold change in Start-RNA levels between untreated and 30’ LPS-treated macrophages. Bidirectional sense TSSs were paired as in (C).
(F) Fold change in elongated nascent RNA levels in the sense (S) or anti-sense (A-S) TSS region (TSS to +250 bp downstream) was calculated for Early, Late, and Control genes and shown as in (D). All box plots show 25th–75th percentiles and error bars depict 10th–90th percentiles.
See also Figure S2.

We next determined how broadly anti-sense transcription initiation increased along with sense transcription upon LPS challenge by calculating the fold change in Start-RNAs at Early, Late, and Control genes (Figure 2D). As expected, sense-strand Start-RNA levels increased at Early genes following 30’ LPS challenge (Figure 2D, left panel). Notably, anti-sense Start-RNA reads were similarly elevated. Start-RNA levels also increased equivalently in both directions at Late genes after 120’ LPS treatment (Figure 2D, right panel). Accordingly, the fold change in Start-RNAs following 30’ LPS challenge revealed a very strong correlation when evaluating coupled sense/anti-sense TSS pairs (Figure 2E). By comparison, there was little correspondence between fold changes in Start-RNA levels at randomly paired sense/anti-sense TSSs. Thus, activation of protein-coding genes stimulates coordinated transcription initiation and formation of short (25–65 nt) Start-RNAs at coupled sense and anti-sense TSSs.

To determine whether the burst in anti-sense transcription initiation upon LPS challenge led to generation of long ncRNAs, we evaluated data measuring longer nascent RNA species (Bhatt et al., 2012). As anticipated, there was good correspondence between levels of sense-strand Start-RNAs and elongated RNAs observed in resting macrophages (Figure S2D), and a clear increase in elongated RNA reads was seen on the sense strand for both Early and Late genes at the expected time points (Figure 2F). However, there was a strong bias toward elongated RNA signal in the sense direction, with significantly less long anti-sense RNA produced at any time point (Figure 2F). Thus, even under conditions where transcription initiation and short RNA synthesis are activated similarly in both directions, the processivity of Pol II in the anti-sense direction is markedly lower, with many transcripts failing to reach 200 nt in length.

Coupled TSSs Define a Nucleosome-Depleted Region of Variable Size

To obtain a clearer view of the regulatory landscape surrounding sense/anti-sense promoter pairs, we ranked promoters by increasing distance between sense and anti-sense TSSs (Figure 3A). Heat maps of data from resting macrophages showed that Pol II and TBP ChIP-seq signal tracked with sense/anti-sense TSS locations (Figure 3B), indicating that two distinct transcription complexes can be located at the divergent TSSs, as shown previously for yeast (Rhee and Pugh, 2012). MNase-seq experiments showed highly positioned nucleosomes located just downstream of Pol II in the sense direction, with regularly


spaced nucleosomes extending further into the gene (Figures 3B and 3C, right panel, black lines). Remarkably, precise definition of anti-sense TSSs revealed equally prominent positioned nucleosomes in the anti-sense direction (Figures 3B and 3C, left panel, black lines), in a pattern that mirrored the distribution of nucleosomes in the coding region. Thus, we find a symmetric distribution of nucleosomes on either side of the coupled TSS pair.

The area between the bidirectional TSSs was largely devoid of MNase-seq reads, with the divergent sense and anti-sense TSS pairs bracketing an accessible nucleosome-depleted region. To confirm this, we examined FAIRE-seq data, which provide an independent measure of relative DNA accessibility (Ostuni et al., 2013). FAIRE-seq signal from macrophages was in concordance with MNase-seq data (Figure S3A) and indicated that the greater the distance between sense and anti-sense TSSs, the larger the intervening region of accessible DNA.

Figure 3. Transcription Machinery and Nucleosomes Are Highly Organized around Both Sense and Anti-sense TSSs

(A) Start-RNA reads obtained from macrophages are shown in the anti-sense (purple) and sense (green) direction, both aligned with respect to sense TSSs. Genes are rank ordered by the distance between the sense and anti-sense TSSs. Data shown throughout this figure are for 8,730 Pol II-bound promoters with sufficient Start-seq reads for identification of sense and anti-sense TSSs in resting macrophages.
(B) Pol II and TBP ChIP-seq, MNase-seq, and CpG dinucleotide count are shown, centered on sense TSSs, with genes rank ordered as in (A).
(C) Average distribution of nucleosomes (from MNase-seq, black) and CpG dinucleotide count (red) centered on either the anti-sense (left) or sense (right) TSSs.
(D) CpG island distribution and MNase-seq are shown for bidirectional genes categorized as CpG+ (n = 6,464) or CpG− (n = 2,266), ranked by increasing distance between sense and anti-sense TSSs.
(E) Conservation score across placental mammals (phyloP) is shown in 2 kb windows centered on anti-sense or sense TSSs as shown in (C). Dashed line indicates average conservation score across the mouse genome.
(F) Number of poly-A sequences (PAS; black) or motifs recognized by U1 snRNP (U1; red) in 50-mer bins centered upon either the anti-sense TSS (left) or sense TSS (right).
See also Figure S3.

CpG islands can influence the formation of nucleosome-depleted promoter regions (Fenouil et al., 2012) and thus might be involved in establishing variably sized NDRs. Interestingly, in the sense direction, CpG content showed little relationship to TSS location or nucleosome positioning (Figures 3B and 3C, red lines, right panel). This result held true whether considering CpG density (Figures 3B and 3C) or CpG island designations (Figure S3A): high CpG content typically encompassed both the NDR and highly positioned downstream nucleosomes. Thus, the relationship between CpG content and nucleosome occupancy in the sense direction remains unclear. Notably, the upstream edge of CpG richness broadly coincided with anti-sense TSSs (Figure 3C, left panel), and the first positioned nucleosome in the anti-sense direction is found in a region of rapidly declining CpG frequency (Figure 3C, left panel).

To investigate a potential role for CpG content in creating NDRs of different sizes, we separated bidirectional promoters that intersected a CpG island (CpG+, 74%) from those that did not (CpG−). Comparison of the distances between sense/anti-sense TSSs at CpG+ and CpG− genes indicated a modest difference in the median width (179 bp CpG+ versus 161 bp CpG−; Figure S3B), suggesting that CpG


islands, or the factors that bind to them, may influence the upstream reach of the NDR. However, CpG islands are not necessary to create a wide, defined region of nucleosome depletion upstream of the sense TSS. Heat maps of MNase-seq data, rank ordered by increasing distance between sense/anti-sense TSSs, showed similar nucleosome profiles for CpG+ and CpG− bidirectional genes (Figure 3D). At both gene groups, NDRs were considerably larger at genes with more distal anti-sense TSSs, and a well-positioned nucleosome was evident immediately adjacent to the anti-sense TSS. Moreover, CpG islands were not sufficient to generate an extended NDR: examination of the nucleosome profile at CpG+ promoters revealed weak agreement between the upstream edge of CpG islands and the NDR (Figures S3C and S3D). Thus, CpG content may facilitate nucleosome depletion around promoters, but the upstream boundary of the NDR at bidirectional genes corresponds to the position of anti-sense transcription initiation and not the CpG island.

Sequence Conservation within Bidirectional Promoters

To elucidate the role of DNA sequence in defining anti-sense TSS location, we calculated sequence conservation in the promoter region among placental mammals. As anticipated, sequences immediately adjacent to sense TSSs showed high conservation that extended downstream into the gene, presumably reflecting evolutionary pressure to maintain mRNA promoters and coding regions (Figure 3E, right panel). Examining conservation around anti-sense TSSs revealed high conservation scores between sense and anti-sense TSSs that dropped precipitously just past the anti-sense TSS (Figure 3E, left panel). Interestingly, this indicates that conservation is low across the region where an anti-sense transcript is encoded. Thus, strong sequence conservation extends upstream from the protein-coding gene to the anti-sense TSS, but does not reach into the ncRNA.
A prediction of DNA sequence-driven TSS selection is general conservation of sense/anti-sense TSS locations across cell types. To test this, we generated Start-seq data from mouse embryonic fibroblasts (MEFs) of the same genotype. Alignment of Start-seq data obtained from MEFs against the TSSs identified in macrophages showed strong agreement in positioning of both sense and anti-sense TSSs (Figure S3E), indicating that bidirectional start sites defined in one cell type were often used in cells of different origin.

To further define the specificity of anti-sense transcription initiation, we investigated the distribution of Start-RNA reads around sense and anti-sense TSSs. Surprisingly, we found similarly precise transcription initiation around sense TSSs of different classes (unidirectional versus bidirectional, CpG+ versus CpG−), with 40% of sense TSSs in each class showing strongly focused initiation (Figure S3F). Moreover, the level of focus at anti-sense TSSs was comparable to sense TSSs (33% of TSSs classified as focused), consistent with DNA sequence directing anti-sense transcription to initiate at a particular location (Duttke et al., 2015).

Specific sequence motifs showed striking asymmetry downstream of the sense and anti-sense TSSs. As described above, binding of the U1 snRNP promotes Pol II processivity and splicing of mRNA transcripts (Berg et al., 2012), and motifs for U1 association are known to be enriched downstream of

sense-strand TSSs. Our motif analysis concurs with these findings and reveals a strong, sharp peak of U1 motifs centered 55 nt downstream of sense TSSs (Figure 3F, red lines), coinciding precisely with the region where Pol II makes the transition to productive elongation. This is in clear contrast to the distribution of PAS, which are important for transcription termination and are depleted downstream of sense TSSs (Almada et al., 2013; Ntini et al., 2013). In the anti-sense direction, the opposite pattern is observed: there is a marked increase in PAS just downstream of anti-sense TSSs (Figure 3F, left panel) and a dearth of U1 motifs. We note that the enrichment of PAS motifs downstream of anti-sense TSSs is consistent with the very limited processivity of anti-sense transcription elongation described above when comparing Start-seq to elongated RNA-seq data (Figure 2D versus Figure 2F).

TF Binding Motifs Are Highly Enriched Near Both Sense and Anti-sense TSSs

To examine whether sequence conservation between bidirectional promoters could in part reflect TF binding motifs, we determined the frequency of core vertebrate motifs around this region. Heat maps of motif occurrence revealed enrichment in consensus motif frequency that encompassed the region between sense and anti-sense TSSs for both CpG+ and CpG− bidirectional promoters (Figures 4A and S4A, respectively). Notably, the region of focused TF motif enrichment was substantially larger at genes with distant anti-sense TSSs, and there was a striking overlap in the locations of TF motif enrichment and the region of nucleosome depletion. This suggests that the binding of TFs could contribute to establishment and/or maintenance of an enlarged NDR. Likewise, vertebrate motifs were enriched within the smaller region of nucleosome depletion observed upstream of unidirectional TSSs (Figures 4A and S4A).
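The motif-frequency profiles discussed in this section (PAS, U1, and TF motifs) amount to counting motif occurrences in fixed-width bins relative to a TSS. The Python sketch below illustrates that idea for bins downstream of a TSS only. It is a hedged example, not the authors' motif analysis: the function name and parameters are hypothetical, the default motif is the canonical AATAAA poly-A signal hexamer (the exact PAS and U1 definitions used in the study are not given in this text), and the input is assumed to be an uppercase sequence on the transcribed strand, read 5′ to 3′.

```python
def motif_counts_in_bins(sequence, tss_index, motif="AATAAA",
                         bin_size=50, n_bins=4):
    """Count exact motif matches in consecutive bins downstream of a TSS.

    sequence: uppercase DNA string on the transcribed strand (assumed input).
    tss_index: 0-based index of the TSS within `sequence`.
    Returns a list of length n_bins; element i counts matches whose first
    base lies in [tss_index + i*bin_size, tss_index + (i+1)*bin_size).
    """
    counts = [0] * n_bins
    start = sequence.find(motif, tss_index)
    while start != -1:
        bin_index = (start - tss_index) // bin_size
        if bin_index >= n_bins:
            break  # past the last bin of interest
        counts[bin_index] += 1
        # Advance by one base so overlapping matches are also counted.
        start = sequence.find(motif, start + 1)
    return counts
```

Summing such per-promoter counts over all sense TSSs, and separately over all anti-sense TSSs, would give a profile comparable in spirit to the 50-mer bins of Figure 3F.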
As expected, there was a peak of vertebrate TF motifs upstream of sense TSSs that was also evident when focusing on individual TFs, such as PU.1 (Figures 4B and 4C, right panel). PU.1 is a master regulator required for establishing macrophage identity (Scott et al., 1994) that is known to occupy thousands of enhancers and promoters in resting macrophages (Kaikkonen et al., 2013; Ostuni et al., 2013). Accordingly, PU.1 ChIP-seq signal showed enrichment at promoters (Figures 4A and 4D, right panel), including at anti-sense TSSs (Figure 4D, left panel). To determine whether focused PU.1 occupancy near anti-sense promoters was driven by consensus motifs, we aligned data for motif occurrence to anti-sense TSSs. We found that PU.1 motifs displayed a sharp peak in enrichment directly upstream of anti-sense TSSs (Figure 4B, left panel). Moreover, vertebrate motifs in general showed strong enrichment immediately adjacent to anti-sense promoters (Figure 4B, left panel). This suggests that TFs bound to these clustered motifs could help specify the location of anti-sense transcription initiation by directing recruitment of the transcription machinery.

Enhanced TF Binding at Bidirectional Promoters

We noted that whereas PU.1 motifs are distributed around unidirectional promoters, PU.1 ChIP signal is restricted to the region immediately adjacent to the TSSs (Figures 4C and 4D, right panels). By contrast, there was closer agreement between PU.1 motif localization and PU.1 ChIP-seq signal at bidirectional genes. This suggested that focused enrichment of TF motifs at sense and anti-sense TSSs might facilitate PU.1 binding by enabling collaboration between TFs. Indeed, PU.1 is known to access cognate motifs at enhancers through cooperation with partner TFs that work together to increase chromatin accessibility (Heinz et al., 2010).

To probe whether the clustered TF motifs observed near bidirectional TSSs might provide a binding advantage, we first determined the number of consensus PU.1 motifs present upstream of unidirectional and bidirectional promoters, separating the latter into those with the smallest and largest sense/anti-sense distances. Motif frequencies were counted in promoter-proximal regions (from the TSS to 176 bp upstream, with 176 bp being the most common sense/anti-sense distance detected, Figure S1D) and promoter-distal regions (from 1 kb to 177 bp upstream of sense TSSs). We found similar PU.1 motif occurrence for each group promoter-proximally, and unidirectional promoters displayed slightly more PU.1 motifs promoter-distally (Figure 4E). The number of consensus motifs in each window was then used to normalize PU.1 ChIP-seq signal from resting macrophages over each region to give ChIP signal intensity per motif. This analysis revealed higher PU.1 ChIP-seq signal per motif at bidirectional than unidirectional genes (Figures 4F, left and S4). This was particularly striking for TSS-distal PU.1 motifs located upstream of genes with the largest sense/anti-sense distances, where the anti-sense TSS and NDR extend into this region (Figure 4F, right). This implies that aspects of bidirectional promoters can facilitate TF binding to cognate motifs located at a distance from sense promoters and suggests that anti-sense TSSs serve as an additional "hub" for TF recruitment. We note that collaboration between TFs could enhance TF occupancy both directly through cooperative interactions and indirectly through the generation of an accessible chromatin domain (Heinz et al., 2010).

Figure 4. TF Motifs and Binding Are Focused within NDR between Bidirectional TSSs

(A) Heat maps depict MNase-seq, locations of a consolidated panel of vertebrate TF binding motifs, and PU.1 ChIP-seq in a 2 kb window centered on sense TSSs (green arrow). Active CpG+ genes are categorized as bidirectional (n = 6,464) or unidirectional (n = 1,585). (B) Composite distribution of a consolidated panel of 131 non-redundant vertebrate TF binding motifs (black) and PU.1 motifs (red) in 50-mer bins for bidirectional genes, centered on either the anti-sense (left) or sense (right) TSS. (C) Composite metagene distribution of PU.1 motifs centered on either the anti-sense (left) or sense (right) TSSs at all unidirectional (black) or bidirectional (orange) genes. (D) Average distribution of PU.1 ChIP-seq reads in 50-mer bins is shown in 2 kb windows centered on anti-sense or sense TSSs as shown in (B). (E) Number of PU.1 motifs at CpG+ genes that are unidirectional, or bidirectional with the smallest (blue) or largest (red) sense/anti-sense distances. Motifs were summed in promoter-proximal and promoter-distal windows as described in the text. n = 1,585 genes in each group. (F) PU.1 ChIP-seq reads within each window were summed and normalized to the number of PU.1 motifs present. See also Figure S4.

Variably Sized NDRs in Resting Macrophages Impact Signal-Dependent TF Recruitment and Gene Activation

The observation of variably sized NDRs upstream of bidirectional TSSs in resting macrophages suggested the intriguing possibility that larger NDRs could contain more accessible binding sites for stimulus-dependent TFs like NF-κB. Notably, NF-κB is unable to bind cognate motifs when they are occluded by nucleosomes, and thus a majority of rapid, signal-dependent NF-κB binding events occur within regions that were nucleosome deprived prior to immune challenge (Saccani et al., 2001). To probe this possibility, we first calculated the distance between sense and anti-sense TSSs for Early response, Late response, and Control genes in macrophages, using this distance as a proxy for the size of the NDR. Surprisingly, despite Early response genes having fewer CpG+ promoters than the other gene classes (Figure S5A), Early genes displayed significantly wider sense/anti-sense distances than Late or Control genes (Figure 5A; Early gene median width = 220 bp, Late median width = 184 bp). This was true for both CpG+ and CpG− Early genes, which had very similar median sense/anti-sense distances (Figures S5B and S5C). Heat maps of MNase-seq profiles around Early promoters verified nucleosome deprivation between bidirectional TSSs and enlarged NDRs at many Early genes (Figure 5B).

To investigate whether an extension of the NDR upstream at bidirectional promoters could expose underlying NF-κB motifs, we calculated the number of NF-κB motifs at Early genes using two different windows: the genomic median sense/anti-sense distance (TSS to −176 bp) or the experimentally observed sense/anti-sense distance at each Early gene. Strikingly, using the position of the anti-sense TSS as a measure of accessible promoter size revealed a 1.6-fold increase in NF-κB motifs detected upstream of Early genes (Figure S5D) versus the median promoter-proximal window.

To evaluate signal-dependent NF-κB binding as a function of NDR size, we stratified Early genes with the smallest sense/anti-sense distances (Figure 5B, denoted at right in blue) and those with the largest distances (denoted in red). Motif analysis indicated that Early genes with the most distant TSSs had considerably more NF-κB consensus elements detected within their NDRs than genes with the smallest sense/anti-sense distances and more genes with multiple NF-κB motifs (Figure S5E). We then examined ChIP-seq signals for NF-κB at each group of genes shortly after stimulation of macrophages (Kaikkonen et al., 2013) and compared the signal at Early genes transcribed unidirectionally. Both groups of bidirectional genes exhibited higher NF-κB ChIP-seq signals than unidirectional genes, and NF-κB signal spread farther upstream at genes with the largest sense/anti-sense distances (Figure 5C, red line).
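The two window definitions and the quartile stratification described in this section can be illustrated with a short sketch. It is written under stated assumptions and is not the authors' code: each gene is represented as a dictionary with a hypothetical `distance` field (sense/anti-sense TSS spacing in bp) and a hypothetical `motifs` field (motif offsets relative to the sense TSS, negative values upstream).

```python
def count_motifs_upstream(motif_offsets, window_bp):
    """Count motifs between the sense TSS (offset 0) and window_bp upstream."""
    return sum(1 for pos in motif_offsets if -window_bp <= pos <= 0)


def compare_windows(genes, fixed_window=176):
    """Total motif counts using a fixed window versus each gene's own
    observed sense/anti-sense distance as the window size."""
    fixed = sum(count_motifs_upstream(g["motifs"], fixed_window) for g in genes)
    observed = sum(count_motifs_upstream(g["motifs"], g["distance"]) for g in genes)
    return fixed, observed


def extreme_quartiles(genes):
    """Return the quartiles of genes with the smallest and largest distances."""
    ranked = sorted(genes, key=lambda g: g["distance"])
    q = len(ranked) // 4
    return ranked[:q], ranked[len(ranked) - q:]
```

The ratio `observed / fixed` from `compare_windows` corresponds to the kind of fold change reported in the text; the value obtained depends entirely on the input data.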
This is consistent with a larger region of accessible DNA in resting cells exposing more NF-κB motifs for rapid occupancy following LPS exposure (Figure S5E). To test whether the observed differences in NF-κB ChIP-seq signal among Early genes resulted in differences in gene output, we used elongated RNA-seq data from macrophages that were untreated or challenged with LPS (Bhatt et al., 2012). A strong, rapid increase in elongated RNAs was detected at Early genes with the largest sense/anti-sense distances, while those with closer anti-sense TSSs showed more modest increases in RNA production (Figure 5D). This result was significant at both CpG+ and CpG− promoters (Figure S5F). In agreement with NF-κB ChIP-seq data, genes with unidirectional TSSs underwent the lowest levels of activation upon immune challenge (Figure 5D). We conclude that the larger NDRs found in resting macrophages at genes with distant anti-sense TSSs can facilitate more recruitment of signal-dependent TFs like NF-κB, thereby enabling stronger gene activation.

Figure 5. Increased NF-κB Binding and Gene Output at Promoters with Distant Anti-sense TSSs

(A) Sense/anti-sense TSS distances for Early response, Late response, and Control gene groups. Median width of Early promoters (220 bp) is significantly greater than that of Late and Control promoters. Box plots show 25th–75th percentiles and error bars depict 10th–90th percentiles. (B) MNase-seq data shown at Early genes categorized as bidirectional or unidirectional (n = 113). Bidirectional genes are ranked by increasing distance between sense/anti-sense TSSs. Gene quartiles with smallest and largest sense and anti-sense distances are indicated at right (n = 236 for each). (C) Average NF-κB (left) ChIP-seq signal in stimulated macrophages. Data are shown for Early unidirectional genes as well as bidirectional genes with the smallest versus largest inter-TSS distances as defined in (B). (D) Elongated nascent RNA reads (Bhatt et al., 2012) are shown for Early response genes in untreated macrophages and macrophages challenged for 15 or 30 min with LPS. Data are shown for gene groups defined in (B). See also Figure S5.

A Subset of Anti-sense TSSs Exhibit Enhancer-like Chromatin Features

We next investigated whether genes with anti-sense transcription displayed altered epigenetic chromatin modifications upstream of the sense TSS. For example, chromatin features such as tri-methylation of histone H3 K4 (H3K4me3) have been linked to transcriptional activation at mRNA promoters, and mono-methylation of histone H3 K4 (H3K4me1) coincides with enhancer regions (Calo and Wysocka, 2013; Heintzman et al., 2009; Visel et al., 2009). We analyzed the distribution of these histone modifications around bidirectional as compared to unidirectional promoters using ChIP-seq from resting macrophages. As expected, H3K4me3 signal peaked in the sense direction near the first downstream nucleosomes and was present at similar, low levels upstream of bidirectional and unidirectional genes (Figure 6A). In contrast, H3K4me1 was enriched upstream of bidirectional TSSs relative to unidirectional TSSs (Figure 6B). Given the association of H3K4me1 with enhancer regions, we evaluated whether other characteristics of enhancers were present upstream of bidirectional TSSs. Indeed, ChIP-seq signal for two key markers of active enhancers, the histone acetyl-transferase p300 and H3 K27 acetylation (H3K27ac; Calo and Wysocka, 2013), were significantly higher upstream of genes with anti-sense transcription (Figure 6C).

Interestingly, not all bidirectional promoters shared these features equally: ranking genes by the level of H3K4me1 near anti-sense TSSs revealed a subset with particularly strong H3K4me1 signal (Figure 6D, top quartile of signal denoted in red). Genes in this group showed a full range of sense/anti-sense distances (Figure S6A) and H3K4me1 signal was independent of NDR size (Figure S6B), indicating that this histone mark was not reflecting levels of open chromatin. Notably, genes in the top quartile of H3K4me1 signal were enriched in other features of enhancers near their anti-sense TSSs, such as H3K27ac and p300 ChIP-seq signal, and displayed lower upstream H3K4me3 signal (Figure 6E). Moreover, bidirectional promoters in this top quartile of H3K4me1 signal produced elevated levels of sense-strand Start-RNAs and elongated RNAs (Figures 6F and S6A). Thus, we suggest that the presence of enhancer-like characteristics around the anti-sense TSS might contribute to transcription activity from the sense TSS partner.

Figure 6. Enhancer-like Chromatin Signature around Anti-sense TSSs

(A and B) Average distribution of H3K4me3 (A) and H3K4me1 (B) ChIP-seq signal for Pol II-bound unidirectional promoters (black; n = 2,240) and bidirectional promoters (orange; n = 8,730) in 50 bp bins centered on sense TSSs. (C) Shown are H3K27ac and p300 ChIP-seq reads in the region from the sense TSS to −1 kb at unidirectional (black) and bidirectional (orange) genes. Asterisks signify p < 0.0001. (D) H3K4me1 ChIP-seq signal at bidirectional promoters in a 2 kb window centered on anti-sense TSSs. Bidirectional promoters are ranked by decreasing H3K4me1 signal in the region from −1 kb to the observed anti-sense TSS and divided into the top quartile of H3K4me1 (red) and all other quartiles of H3K4me1 signal (green), as shown at right. (E) H3K27ac, p300, and H3K4me3 ChIP-seq reads in the region from the sense TSS to −1 kb at genes from the top quartile of upstream H3K4me1 signal (red) and all other quartiles (green), as shown in (D). Asterisks signify p < 0.0001. (F) Elongated nascent RNA (sense TSS to +250 bp) and Start-RNA reads (±100 bp around sense TSS) at H3K4me1 quartiles as shown in (E). Asterisks signify p < 0.0001. (G) Enriched GO biological processes among bidirectional promoters with highest levels of H3K4me1 (top quartile). All box plots show 25th–75th percentiles and error bars depict 10th–90th percentiles. See also Figure S6.
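The ranking into a top quartile versus all other quartiles, and the percentile summaries used for the box plots, can be sketched as follows. This is only an illustration under assumptions: the input is a hypothetical dictionary mapping gene names to summed upstream H3K4me1 signal, the percentile uses simple linear interpolation, and no p values are computed here.

```python
def percentile(values, p):
    """Linear-interpolation percentile (p in 0-100) of a non-empty sequence."""
    xs = sorted(values)
    k = (len(xs) - 1) * p / 100
    lo = int(k)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (k - lo)


def split_top_quartile(signal_by_gene):
    """Rank genes by decreasing signal; return (top quartile, all others)."""
    ranked = sorted(signal_by_gene, key=signal_by_gene.get, reverse=True)
    n_top = len(ranked) // 4
    return ranked[:n_top], ranked[n_top:]


def box_summary(values):
    """Box-plot summary: 10th, 25th, 50th, 75th, and 90th percentiles."""
    return {p: percentile(values, p) for p in (10, 25, 50, 75, 90)}
```

A second signal (for example p300 reads per gene) could then be summarized separately for the two groups with `box_summary`.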

To understand what kinds of genes displayed these features in macrophages, we performed Gene Ontology analysis on the most H3K4me1-enriched genes (top quartile of signal). This revealed a strong enrichment in genes involved in immune and defense responses (Figure 6G), including key regulators such as Toll-like receptors, interleukins, and key chemokines. In conclusion, anti-sense TSSs endowed with enhancer-like features in macrophages appear to be preferentially coupled to active genes with immune and inflammatory functions.

PU.1 Enrichment at Bidirectional TSSs of Immune-inducible Genes

To probe whether genes within the top quartile of H3K4me1 enrichment were indeed involved in immune responsiveness, we calculated the percentage of Early, Late, and Control genes contained within this group. This revealed more Early than Late genes with high levels of H3K4me1, but with both groups present at levels higher than Control genes (Figure 7A). We then analyzed a time course of nascent elongated RNA production following LPS treatment and found that genes in the top quartile of H3K4me1 signal exhibit much greater levels of activated transcription upon bacterial challenge as compared to genes in the lower three quartiles (Figures 7B and S7A), revealing a correlation between histone modifications around the anti-sense TSS and gene inducibility.

To determine how the H3K4me1 modification was specified at a subset of bidirectional promoters, we considered that deposition of H3K4me1 and other enhancer marks in macrophages often follows binding of the lineage factor PU.1 (Ghisletti et al., 2010; Heinz et al., 2010; Kaikkonen et al., 2013). Quantifying the distribution of PU.1 motifs around sense and anti-sense TSSs revealed that genes enriched in H3K4me1 had considerably more PU.1 motifs near the sense and anti-sense TSSs than genes with lower H3K4me1 signal (Figure 7C). Moreover, genes in the top quartile of H3K4me1 signal in macrophages displayed significantly elevated PU.1 ChIP-seq signal around both TSSs (Figures 7D and S7B). Thus, PU.1 association with bidirectional promoters is enriched at genes with enhancer-like chromatin features and increased activity.

Figure 7. PU.1 Enrichment at Bidirectional TSSs Marks Active and Inducible Genes

(A) Percentages of Early, Late, and Control genes that fall in the top quartile of H3K4me1 signal around anti-sense TSSs. (B) Average elongated RNA reads (sense TSS to +250 bp) for genes in the top quartile of H3K4me1 signal, versus all other bidirectional genes, across a 2 hr time course of LPS challenge. (C and D) Average distribution of PU.1 motifs (C) and PU.1 occupancy (D) in 50-mer bins for genes in the top quartile of H3K4me1 versus all other bidirectional genes, centered on either the anti-sense (left) or sense (right) TSS. (E) Model for transcriptional activity at bidirectional (top) versus unidirectional (bottom) promoters. At bidirectional promoters, a lineage-specifying factor such as PU.1 (blue) collaborates with other TFs (yellow) to bind the focused region of TF motifs at coupled sense (green) and anti-sense (purple) TSSs. Through the activity of TFs and the transcription machinery, an NDR is established at both TSSs and spreads between the coupled promoters. This process renders additional transcription factor binding motifs accessible (orange boxes). Following stimulation, signal-dependent transcription factors like NF-κB (orange) preferentially bind motifs in the pre-established NDR enabling high transcriptional output. At unidirectional promoters, a limited NDR size and absence of a well-positioned upstream nucleosome are observed, such that promoter-distal TF motifs are often occluded by chromatin and unavailable for binding. See also Figure S7.

DISCUSSION

We propose that the predominant mammalian promoter unit should be defined to include sense and upstream anti-sense TSSs as well as the intervening DNA sequence. Bidirectional promoter architecture is widespread: we observe clear anti-sense TSSs at >75% of active genes in murine macrophages. Anti-sense transcription initiation is precise and guided by DNA sequence content including distinct core promoter-like elements, as observed in recent work (Duttke et al., 2015). Further, we find peaks of TF motif occurrence focused directly upstream of both sense and anti-sense TSSs (Figure 7E). High TF occupancy at bidirectional TSSs is envisioned to allow collaboration between TFs, to recruit the transcription machinery and define sites of transcription initiation. Once initiated, Pol II could elongate divergently, creating convergent negative supercoils between the promoters that further discourage nucleosome assembly (Core et al., 2008; Preker et al., 2008; Seila et al., 2008). Importantly, increased distance between sense and anti-sense TSSs is associated with a larger NDR in resting macrophages. This creates a broader region of accessible chromatin and exposes more signal-dependent TF motifs. Thus, identification of anti-sense TSS locations provides valuable information on promoter chromatin architecture and which promoter-distal TF motifs would be readily available for binding.

These results provide insights into the profile of transcriptional responses elicited by bacterial challenge of macrophages. It has long been appreciated that a single immune stimulus activates hundreds of genes to differing magnitudes and with variable kinetics; however, the mechanisms underlying this transcriptional diversity have remained unclear (Bhatt et al., 2012; Escoubet-Lozach et al., 2011; Rogatsky and Adelman, 2014). Our findings indicate that the magnitude of gene activation is connected to the distance between sense and anti-sense TSSs, since this impacts the number of accessible TF motifs. We find significantly greater sense/anti-sense TSS distances and larger NDRs among Early response genes than at Late genes (Figure 5A), supporting the idea that NF-κB motifs located in open chromatin enable fast binding and transcriptional responses.

Further, we suggest a role for PU.1 binding near anti-sense TSSs in promoting a unique chromatin environment at a subset of active macrophage genes (Figure 7A). PU.1 is critical for defining macrophage identity and is strictly required in mature cells for inducible gene expression (Eichbaum et al., 1994; Grove and Plumb, 1993). These roles of PU.1 have generally been attributed to its association with enhancers (Kaikkonen et al., 2013; Ostuni et al., 2013), and the majority of PU.1-bound regions are indeed located distal from mRNA TSSs. However, nearly 9,000 binding sites for PU.1 in macrophages are located near gene promoters (Ghisletti et al., 2010), indicating that PU.1 also plays a TSS-proximal role. In support of this idea, we find that a subset of anti-sense TSSs displayed enrichment in PU.1 binding sites and PU.1 ChIP-seq signal, along with accumulation of H3K4me1, H3K27ac, and p300. The combination of enrichment in PU.1, enhancer-like chromatin marks, and non-processive transcription elongation at anti-sense TSSs is reminiscent of recent studies on macrophage enhancers (Kaikkonen et al., 2013), suggesting the intriguing possibility that anti-sense TSSs might serve as proximal enhancers of mRNA expression.

In conclusion, we find that two previously unappreciated features of promoter architecture impact basal and stimulus-responsive transcription, namely, the distance between sense and anti-sense TSSs and the presence of enhancer-like chromatin features near the anti-sense TSS. We show that anti-sense TSSs organize local chromatin, extending the NDR and enabling the asymmetric acquisition of activating histone modifications. We suggest that bidirectional promoters generate an optimized environment for recruitment of TFs and the transcription machinery, both in resting macrophages and during immune challenge, shedding new light on the regulatory landscape surrounding mammalian protein-coding genes.

EXPERIMENTAL PROCEDURES

Cell Culture

Bone marrow-derived macrophages were prepared from 8- to 12-week-old C57BL/6 mice and maintained as described previously (Adelman et al., 2009). All animal experiments were approved by the NIEHS Institutional Animal Care and Use Committee and were performed according to NIH guidelines for the care and use of laboratory animals.

Genomic Datasets

Pol II ChIP-seq was performed as described in Muse et al., 2007. Start-RNAs were prepared from nuclear RNA as described in Nechaev et al., 2010. MNase-digested chromatin from macrophages and libraries were prepared as described in Gilchrist et al., 2010.
PU.1, H3K4me1, and H3K4me3 ChIP-seq and FAIRE-seq data from resting bone marrow-derived macrophages (Ostuni et al., 2013) were downloaded from the Gene Expression Omnibus (GEO: GSE38379). NF-κB (p65) ChIP-seq data from stimulated macrophages (Kaikkonen et al., 2013) were downloaded from GEO: GSE48759 and TBP ChIP-seq data (Escoubet-Lozach et al., 2011) were downloaded from GEO: GSE23622. Elongated nascent RNA-seq data from the chromatin fraction of bone marrow-derived macrophages (Bhatt et al., 2012) were downloaded from GEO: GSE32916.

Sense and Anti-sense TSS Identification

Observed sense TSSs were defined within 2,000 nt search windows centered on RefSeq-annotated TSSs, using the location to which the largest number of Start-RNA reads aligned, with further considerations noted in Supplemental Experimental Procedures. A preference toward minimal distance for re-annotation relative to RefSeq was employed. Anti-sense TSS search windows were defined within a 1,000 nt region upstream of each sense TSS, on the opposite strand, using the same procedure described for sense TSS loci with proximity to the sense TSS utilized to break ties between locations with equivalent Start-seq reads.

LPS Responsive Gene Classification

Genes were categorized as Early or Late response genes if they displayed a ≥2-fold increase in Pol II ChIP-seq gene body signal after a 30 min or 120 min LPS challenge, respectively. An equivalent number of control genes were selected at random from the population that showed no increase in Pol II levels at either time point.
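The TSS-calling rules described under Sense and Anti-sense TSS Identification can be sketched in a few lines. This is a simplified illustration, not the published implementation: it assumes per-strand dictionaries mapping genomic position to the number of Start-RNA 5' ends (a hypothetical input format), approximates the 2,000 nt window as ±1,000 nt around the annotated TSS, and omits the further considerations that the paper defers to its Supplemental Experimental Procedures.

```python
def call_tss(read_counts, center, start, end):
    """Return the position in [start, end] with the most 5' read ends.

    Ties are broken in favor of the position closest to `center`.
    Returns None when no reads fall in the window.
    """
    best = None
    for pos in range(start, end + 1):
        n = read_counts.get(pos, 0)
        if n == 0:
            continue
        key = (n, -abs(pos - center))  # more reads first, then proximity
        if best is None or key > best[0]:
            best = (key, pos)
    return None if best is None else best[1]


def call_sense_and_antisense(plus_counts, minus_counts, annotated_tss, strand="+"):
    """Call the observed sense TSS, then the upstream anti-sense TSS."""
    if strand == "+":
        sense_counts, anti_counts = plus_counts, minus_counts
    else:
        sense_counts, anti_counts = minus_counts, plus_counts
    sense = call_tss(sense_counts, annotated_tss,
                     annotated_tss - 1000, annotated_tss + 1000)
    if sense is None:
        return None, None
    # Upstream means lower coordinates on "+" and higher coordinates on "-".
    if strand == "+":
        lo, hi = sense - 1000, sense - 1
    else:
        lo, hi = sense + 1, sense + 1000
    return sense, call_tss(anti_counts, sense, lo, hi)
```

Using the sense TSS as `center` in the second call is what implements the stated tie-break by proximity to the sense TSS.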

Statistics Throughout the manuscript, box plots show 25th–75th percentiles and error bars depict 10th –90th percentiles. Mann-Whitney tests were used to determine p values, except where noted otherwise. ACCESSION NUMBERS The accession number for the sequencing data reported in this paper is GEO: GSE62151. SUPPLEMENTAL INFORMATION Supplemental Information includes Supplemental Experimental Procedures and seven figures and can be found with this article online at http://dx.doi. org/10.1016/j.molcel.2015.04.006. AUTHOR CONTRIBUTIONS D.A.G., B.S.S., and K.A. conceived of this project and wrote the manuscript. S.N. prepared Start-RNAs, G.W.M. performed ChIP-seq, and D.A.G. carried out remaining experiments. B.S.S., D.A.G., A.B., and D.C.F. performed data analysis. ACKNOWLEDGMENTS We thank Paul Wade, Torben Jensen, and Chris Burge for helpful comments on this project and manuscript. This research was supported by the Intramural Research Program of the NIH, National Institute of Environmental Health Sciences to K.A. (Z01 ES101987). Received: October 3, 2014 Revised: January 29, 2015 Accepted: April 1, 2015 Published: May 28, 2015 REFERENCES Adelman, K., Kennedy, M.A., Nechaev, S., Gilchrist, D.A., Muse, G.W., Chinenov, Y., and Rogatsky, I. (2009). Immediate mediators of the inflammatory response are poised for gene activation through RNA polymerase II stalling. Proc. Natl. Acad. Sci. USA 106, 18207–18212. Almada, A.E., Wu, X., Kriz, A.J., Burge, C.B., and Sharp, P.A. (2013). Promoter directionality is controlled by U1 snRNP and polyadenylation signals. Nature 499, 360–363. Amit, I., Garber, M., Chevrier, N., Leite, A.P., Donner, Y., Eisenhaure, T., Guttman, M., Grenier, J.K., Li, W., Zuk, O., et al. (2009). Unbiased reconstruction of a mammalian transcriptional network mediating pathogen responses. Science 326, 257–263. Andersen, P.K., Lykke-Andersen, S., and Jensen, T.H. (2012). Promoter-proximal polyadenylation sites reduce transcription activity. Genes Dev. 
26, 2169– 2179. Berg, M.G., Singh, L.N., Younis, I., Liu, Q., Pinto, A.M., Kaida, D., Zhang, Z., Cho, S., Sherrill-Mix, S., Wan, L., and Dreyfuss, G. (2012). U1 snRNP determines mRNA length and regulates isoform expression. Cell 150, 53–64. Bhatt, D.M., Pandya-Jones, A., Tong, A.J., Barozzi, I., Lissner, M.M., Natoli, G., Black, D.L., and Smale, S.T. (2012). Transcript dynamics of proinflammatory genes revealed by sequence analysis of subcellular RNA fractions. Cell 150, 279–290. Butler, J.E., and Kadonaga, J.T. (2002). The RNA polymerase II core promoter: a key component in the regulation of gene expression. Genes Dev. 16, 2583–2592. Calo, E., and Wysocka, J. (2013). Modification of enhancer chromatin: what, how, and why? Mol. Cell 49, 825–837. Carninci, P., Kasukawa, T., Katayama, S., Gough, J., Frith, M.C., Maeda, N., Oyama, R., Ravasi, T., Lenhard, B., Wells, C., et al.; FANTOM Consortium; RIKEN Genome Exploration Research Group and Genome Science Group

Molecular Cell 58, 1–12, June 18, 2015 ª2015 Elsevier Inc. 11

Please cite this article in press as: Scruggs et al., Bidirectional Transcription Arises from Two Distinct Hubs of Transcription Factor Binding and Active Chromatin, Molecular Cell (2015), http://dx.doi.org/10.1016/j.molcel.2015.04.006

(Genome Network Project Core Group) (2005). The transcriptional landscape of the mammalian genome. Science 309, 1559–1563.

provides an opportunity to target and integrate regulatory signals. Mol. Cell 52, 517–528.

Cheng, J., Kapranov, P., Drenkow, J., Dike, S., Brubaker, S., Patel, S., Long, J., Stern, D., Tammana, H., Helt, G., et al. (2005). Transcriptional maps of 10 human chromosomes at 5-nucleotide resolution. Science 308, 1149–1154.

Kaikkonen, M.U., Spann, N.J., Heinz, S., Romanoski, C.E., Allison, K.A., Stender, J.D., Chun, H.B., Tough, D.F., Prinjha, R.K., Benner, C., and Glass, C.K. (2013). Remodeling of the enhancer landscape during macrophage activation is coupled to enhancer transcription. Mol. Cell 51, 310–325.

Conaway, R.C., and Conaway, J.W. (2013). The Mediator complex and transcription elongation. Biochim. Biophys. Acta 1829, 69–75. Core, L.J., Waterfall, J.J., and Lis, J.T. (2008). Nascent RNA sequencing reveals widespread pausing and divergent initiation at human promoters. Science 322, 1845–1848. Core, L.J., Martins, A.L., Danko, C.G., Waters, C.T., Siepel, A., and Lis, J.T. (2014). Analysis of nascent RNA identifies a unified architecture of initiation regions at mammalian promoters and enhancers. Nat. Genet. 46, 1311–1320. De Santa, F., Barozzi, I., Mietton, F., Ghisletti, S., Polletti, S., Tusi, B.K., Muller, H., Ragoussis, J., Wei, C.L., and Natoli, G. (2010). A large fraction of extragenic RNA pol II transcription sites overlap enhancers. PLoS Biol. 8, e1000384. Duttke, S.H., Lacadie, S.A., Ibrahim, M.M., Glass, C.K., Corcoran, D.L., Benner, C., Heinz, S., Kadonaga, J.T., and Ohler, U. (2015). Human promoters are intrinsically directional. Mol. Cell 57, 674–684. Eichbaum, Q.G., Iyer, R., Raveh, D.P., Mathieu, C., and Ezekowitz, R.A. (1994). Restriction of interferon gamma responsiveness and basal expression of the myeloid human Fc gamma R1b gene is mediated by a functional PU.1 site and a transcription initiator consensus. J. Exp. Med. 179, 1985–1996.

Kim, T.K., Hemberg, M., Gray, J.M., Costa, A.M., Bear, D.M., Wu, J., Harmin, D.A., Laptewicz, M., Barbara-Haley, K., Kuersten, S., et al. (2010). Widespread transcription at neuronal activity-regulated enhancers. Nature 465, 182–187. Muse, G.W., Gilchrist, D.A., Nechaev, S., Shah, R., Parker, J.S., Grissom, S.F., Zeitlinger, J., and Adelman, K. (2007). RNA polymerase is poised for activation across the genome. Nat. Genet. 39, 1507–1511. Nechaev, S., Fargo, D.C., dos Santos, G., Liu, L., Gao, Y., and Adelman, K. (2010). Global analysis of short RNAs reveals widespread promoter-proximal stalling and arrest of Pol II in Drosophila. Science 327, 335–338. Ntini, E., Ja¨rvelin, A.I., Bornholdt, J., Chen, Y., Boyd, M., Jørgensen, M., Andersson, R., Hoof, I., Schein, A., Andersen, P.R., et al. (2013). Polyadenylation site-induced decay of upstream transcripts enforces promoter directionality. Nat. Struct. Mol. Biol. 20, 923–928. Ostuni, R., Piccolo, V., Barozzi, I., Polletti, S., Termanini, A., Bonifacio, S., Curina, A., Prosperini, E., Ghisletti, S., and Natoli, G. (2013). Latent enhancers activated by stimulation in differentiated cells. Cell 152, 157–171.

Escoubet-Lozach, L., Benner, C., Kaikkonen, M.U., Lozach, J., Heinz, S., Spann, N.J., Crotti, A., Stender, J., Ghisletti, S., Reichart, D., et al. (2011). Mechanisms establishing TLR4-responsive activation states of inflammatory response genes. PLoS Genet. 7, e1002401.

Preker, P., Nielsen, J., Kammler, S., Lykke-Andersen, S., Christensen, M.S., Mapendano, C.K., Schierup, M.H., and Jensen, T.H. (2008). RNA exosome depletion reveals transcription upstream of active human promoters. Science 322, 1851–1854.

Fenouil, R., Cauchy, P., Koch, F., Descostes, N., Cabeza, J.Z., Innocenti, C., Ferrier, P., Spicuglia, S., Gut, M., Gut, I., and Andrau, J.C. (2012). CpG islands and GC content dictate nucleosome depletion in a transcription-independent manner at mammalian promoters. Genome Res. 22, 2399–2408.

Proudfoot, N.J. (2011). Ending the message: poly(A) signals then and now. Genes Dev. 25, 1770–1782.

Flynn, R.A., Almada, A.E., Zamudio, J.R., and Sharp, P.A. (2011). Antisense RNA polymerase II divergent transcripts are P-TEFb dependent and substrates for the RNA exosome. Proc. Natl. Acad. Sci. USA 108, 10460–10465. Ghisletti, S., Barozzi, I., Mietton, F., Polletti, S., De Santa, F., Venturini, E., Gregory, L., Lonie, L., Chew, A., Wei, C.L., et al. (2010). Identification and characterization of enhancers controlling the inflammatory gene expression program in macrophages. Immunity 32, 317–328. Gilchrist, D.A., Dos Santos, G., Fargo, D.C., Xie, B., Gao, Y., Li, L., and Adelman, K. (2010). Pausing of RNA polymerase II disrupts DNA-specified nucleosome organization to enable precise gene regulation. Cell 143, 540–551. Grove, M., and Plumb, M. (1993). C/EBP, NF-kappa B, and c-Ets family members and transcriptional regulation of the cell-specific and inducible macrophage inflammatory protein 1 alpha immediate-early gene. Mol. Cell. Biol. 13, 5276–5289. Heintzman, N.D., Hon, G.C., Hawkins, R.D., Kheradpour, P., Stark, A., Harp, L.F., Ye, Z., Lee, L.K., Stuart, R.K., Ching, C.W., et al. (2009). Histone modifications at human enhancers reflect global cell-type-specific gene expression. Nature 459, 108–112. Heinz, S., Benner, C., Spann, N., Bertolino, E., Lin, Y.C., Laslo, P., Cheng, J.X., Murre, C., Singh, H., and Glass, C.K. (2010). Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589. Henriques, T., Gilchrist, D.A., Nechaev, S., Bern, M., Muse, G.W., Burkholder, A., Fargo, D.C., and Adelman, K. (2013). Stable pausing by RNA polymerase II


Ramirez-Carrozzi, V.R., Nazarian, A.A., Li, C.C., Gore, S.L., Sridharan, R., Imbalzano, A.N., and Smale, S.T. (2006). Selective and antagonistic functions of SWI/SNF and Mi-2beta nucleosome remodeling complexes during an inflammatory response. Genes Dev. 20, 282–296. Rhee, H.S., and Pugh, B.F. (2012). Genome-wide structure and organization of eukaryotic pre-initiation complexes. Nature 483, 295–301. Roeder, R.G. (2005). Transcriptional regulation and the role of diverse coactivators in animal cells. FEBS Lett. 579, 909–915. Rogatsky, I., and Adelman, K. (2014). Preparing the first responders: building the inflammatory transcriptome from the ground up. Mol. Cell 54, 245–254. Saccani, S., Pantano, S., and Natoli, G. (2001). Two waves of nuclear factor kappaB recruitment to target promoters. J. Exp. Med. 193, 1351–1359. Scott, E.W., Simon, M.C., Anastasi, J., and Singh, H. (1994). Requirement of transcription factor PU.1 in the development of multiple hematopoietic lineages. Science 265, 1573–1577. Seila, A.C., Calabrese, J.M., Levine, S.S., Yeo, G.W., Rahl, P.B., Flynn, R.A., Young, R.A., and Sharp, P.A. (2008). Divergent transcription from active promoters. Science 322, 1849–1851. Sigova, A.A., Mullen, A.C., Molinie, B., Gupta, S., Orlando, D.A., Guenther, M.G., Almada, A.E., Lin, C., Sharp, P.A., Giallourakis, C.C., and Young, R.A. (2013). Divergent transcription of long noncoding RNA/mRNA gene pairs in embryonic stem cells. Proc. Natl. Acad. Sci. USA 110, 2876–2881. Visel, A., Blow, M.J., Li, Z., Zhang, T., Akiyama, J.A., Holt, A., Plajzer-Frick, I., Shoukry, M., Wright, C., Chen, F., et al. (2009). ChIP-seq accurately predicts tissue-specific activity of enhancers. Nature 457, 854–858.
