Accepted Manuscript
Title: Transcriptional and Post-Transcriptional Regulation of Nucleotide Excision Repair Genes in Human Cells
Authors: Hailey B. Lefkofsky, Artur Veloso, Mats Ljungman
PII: S0027-5107(14)00204-8
DOI: http://dx.doi.org/doi:10.1016/j.mrfmmm.2014.11.008
Reference: MUT 11403
To appear in: Mutation Research
Received date: 24-8-2014
Revised date: 2-11-2014
Accepted date: 24-11-2014
Please cite this article as: H.B. Lefkofsky, A. Veloso, M. Ljungman, Transcriptional and Post-Transcriptional Regulation of Nucleotide Excision Repair Genes in Human Cells, Mutation Research/Fundamental and Molecular Mechanisms of Mutagenesis (2014), http://dx.doi.org/10.1016/j.mrfmmm.2014.11.008 This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Conflict of Interest Statement: None of the authors have any conflicts of interest regarding this manuscript to declare.
Transcriptional and Post-Transcriptional Regulation of Nucleotide Excision Repair Genes in Human Cells
Hailey B. Lefkofsky1, Artur Veloso1,2,3 and Mats Ljungman1,2,4,*
1 Translational Oncology Program, University of Michigan Medical School, Ann Arbor, Michigan.
2 Department of Radiation Oncology, University of Michigan Medical School, Ann Arbor, Michigan.
3 Bioinformatics Program and Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan.
4 Department of Environmental Health Sciences, School of Public Health, University of Michigan, Ann Arbor, Michigan.
*To whom reprint requests should be addressed at [email protected].
ABSTRACT
Nucleotide excision repair (NER) removes DNA helix-distorting lesions induced by UV light and various chemotherapeutic agents such as cisplatin. These lesions efficiently block the elongation of transcription and need to be rapidly removed by transcription-coupled NER (TC-NER) to avoid the induction of apoptosis. Twenty-nine genes have been classified as coding for proteins that participate in NER in human cells. Here we explored the transcriptional and post-transcriptional regulation of these NER genes across 13 human cell lines using Bru-seq and BruChase-seq, respectively. Many NER genes are relatively large and will therefore be easily inactivated by UV-induced transcription-blocking lesions. Furthermore, many of these genes produce transcripts that are rather unstable. Thus, these genes are expected to rapidly lose expression, leading to diminished NER function. One such gene is ERCC6, which codes for the CSB protein critical for TC-NER. Due to its large gene size and high RNA turnover rate, the ERCC6 gene may act as a dosimeter of DNA damage, so that at high levels of damage ERCC6 RNA levels would be diminished, leading to the loss of CSB expression, inhibition of TC-NER and the promotion of cell death.
1. Introduction
Ultraviolet (UV) light is a strong mutagen and has promoted natural selection among organisms throughout evolutionary time. Nucleotide excision repair (NER) evolved to safeguard the DNA from the deleterious effects of UV light and can be found in all species of life from bacteria and plants to mammals [1]. NER also protects cells from cyclopurines formed by endogenous reactive oxygen species (ROS), and cancer cells use NER to repair damage induced by certain chemotherapeutic agents such as cisplatin. A better understanding of the regulation of expression of NER genes could aid in predicting the sensitivity of cells to UV light and chemotherapeutic agents and may promote the development of new therapeutic regimens.
Global genomic NER (GG-NER) deals with lesions in the whole genome. The lesion recognition complex in GG-NER, consisting of XPC, RAD23A, RAD23B, CETN2, DDB1 and DDB2, recognizes DNA lesions in chromatin and recruits the core NER complex to sites of damage (Fig. 1) [1-3]. A specialized damage recognition step of NER has evolved to aid in the removal of bulky lesions that block transcription elongation [4]. This sub-pathway of NER is called transcription-coupled NER (TC-NER), in which the Cockayne's syndrome factors A (CSA) and B (CSB), XAB2 and UVSSA orchestrate the recruitment of the NER core complexes to sites of transcription-stalling lesions. Following damage recognition, which is the rate-limiting step, the pre-incision complex consisting of XPA and RPA verifies the presence of the lesion, followed by DNA unwinding by the TFIIH complex. The damaged strand is then incised by the incision enzymes XPG, ERCC1 and XPF, DNA polymerases re-synthesize the DNA in the excised gap, and DNA ligase 1 seals the newly synthesized strand to the existing strand (Fig. 1).
Mutations in core components of NER lead to the human disorders xeroderma pigmentosum and trichothiodystrophy [5], while defects in the factors responsible for TC-NER give rise to the Cockayne's and UV-sensitive syndromes [6]. Polymorphisms in NER genes have been linked to reduced repair capacity and cancer predisposition [7]. Furthermore, inactivating somatic mutations of the NER genes ERCC2, ERCC3, ERCC4, ERCC5, XPA, XPC and DDB2 promote cancer and therefore these genes are known as cancer predisposition genes [8]. Many studies have also found a correlation between the expression level of DNA repair genes in cancer cells and their sensitivity to cisplatin [7]. Interestingly, a low level of expression or a defect in the GG-NER factors XPC and DDB2 does not sensitize cells to cisplatin or UV light, while reduced expression or defects in the TC-NER factors CSA and CSB result in a marked sensitivity [9, 10].

The expression of NER genes has previously been analyzed in cell lines using total cellular RNA, which reports on the steady-state level of RNA but does not distinguish between the contributions of synthesis and turnover to RNA homeostasis. In this study, we used Bru-seq and BruChase-seq [11, 12] to specifically examine the rate of RNA synthesis and turnover of NER transcripts across 13 human cell lines. These techniques are based on the pulse-labeling of nascent RNA with bromouridine (Bru) followed by either immediate harvest (Bru-seq) or harvest after a 6-hour chase in uridine (BruChase-seq). The Bru-labeled RNA is then isolated using anti-BrdU antibodies conjugated to magnetic beads, converted into a cDNA library and deep sequenced. Surprisingly, our results show that many critical NER genes produce fairly unstable transcripts that would be expected to quickly vanish following induction of transcription-blocking lesions by UV light and bulky chemical adducts, the very adducts NER was designed to remove.
2. Materials and Methods
2.1. Cell lines. The cell lines used were HF1 (gift from Mary Davis, University of Michigan), HPNE, Panc1, MiaPaCa2, BxPC3, UM16, UM28, UM59 and HEK293 (gifts from Diane Simeone, University of Michigan), HeLa, GM12878 and K562 (ATCC) and LN428 (gift from Rob Sobol, University of Pittsburgh). The media used were MEM with 10% FBS and antibiotics (HF1), DMEM with 10% FBS and Pen/Strep (Panc1, MiaPaCa2, UM16, UM28, UM59, HEK293), RPMI1640 with 10% FBS and antibiotics (BxPC3), F12K with 10% FBS and Pen/Strep (HeLa), RPMI1640 with 15% FBS (GM12878), IMDM with 10% FBS (K562), and MEM-alpha with 10% FBS, antibiotics, gentamycin and puromycin (LN428). The HPNE cells were grown in 75% DMEM without glucose (Sigma D-5030) containing 2 mM L-glutamine (Sigma G7513), 1.5 g/L sodium bicarbonate (Sigma S 4019), 25% Medium M3 Base (Incell Corp, M300F-500) with the additives 10 ng/mL human recombinant EGF (BD Sciences 354052), 5.5 mM D-glucose (1 g/L) (Sigma G8644) and 750 ng/mL puromycin dihydrochloride (Invitrogen A11138-02).
2.2. Bru-seq and BruChase-seq. For detailed descriptions of these techniques, please see [11, 12]. In short, nascent RNA was labeled for 30 min at 37°C with 2 mM bromouridine in conditioned medium. For the BruChase-seq experiments, the bromouridine-containing medium was removed after the 30-min labeling, the plates were rinsed twice in PBS, and conditioned medium containing 20 mM uridine was added. The cells were then incubated for 6 hours at 37°C. At the completion of the labeling, with or without chase, the cells were lysed in TRIzol and the Bru-containing RNA was isolated using anti-BrdU antibodies conjugated to magnetic beads. The isolated RNA was converted into cDNA libraries using the Illumina TruSeq library kit, followed by deep sequencing to around 50 million single-end 50-nucleotide reads.
2.3. Analysis
Bru-seq and BruChase-seq gene expression was measured in transcripts per million reads (TPM) in order to compare gene expression across cell lines. The TPM of each gene can be thought of as the share of expression associated with that gene in relation to the total expression of the genome [13]. The TPM formula was slightly modified and implemented as:

\[ \mathrm{TPM} = \frac{\text{transcript read count} \times 10^{9}}{\text{transcript length} \times \sum \text{transcript densities}} \]

where the transcript density is defined as:

\[ \text{transcript density} = \frac{\text{transcript read count}}{\text{transcript length}} \]

The natural logarithms of the TPM values for the 29 NER genes were used in the analysis. When plotting Bru-seq and BruChase-seq data, genes with expression values equal to zero were represented in black. The stability of each transcript was calculated by dividing the exonic reads at 6 hours by the reads across the entire gene at 0 hours. Since it is impossible to calculate a stability value for a gene with no synthesis, we left such cases displayed as white boxes in Figure 4B.
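The TPM calculation described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function names, the dictionary layout and the toy read counts are hypothetical, and the scale factor of 10^9 is taken from the formula as written in the text (it is exposed as a parameter because the conventional TPM definition uses 10^6).

```python
import math

def tpm_values(read_counts, lengths, scale=1e9):
    """Return a TPM-like value per transcript.

    read_counts and lengths are dicts keyed by transcript name
    (hypothetical layout). Density = read count / transcript length;
    each density is divided by the sum of all densities and multiplied
    by `scale`.
    """
    densities = {name: read_counts[name] / lengths[name] for name in read_counts}
    total_density = sum(densities.values())
    if total_density == 0:
        raise ValueError("no reads in any transcript")
    return {name: scale * d / total_density for name, d in densities.items()}

def log_tpm(tpm):
    """Natural log of each TPM value; zero expression is kept as None."""
    return {name: (math.log(v) if v > 0 else None) for name, v in tpm.items()}

# Toy, made-up numbers purely for illustration.
example_counts = {"GENE_A": 100, "GENE_B": 300, "GENE_C": 0}
example_lengths = {"GENE_A": 1000, "GENE_B": 1000, "GENE_C": 2000}
example_tpm = tpm_values(example_counts, example_lengths)
example_log_tpm = log_tpm(example_tpm)
```

Because every value is normalized by the same sum of densities, the values within one sample always add up to the scale factor, which is what makes them comparable across cell lines. Genes with zero expression have no logarithm and are kept as `None` here, mirroring their separate treatment in the plots.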
3. Results
The total, steady-state level of RNA in cells is a reflection of the equilibrium between synthesis and degradation of RNA. We previously showed that different human genes appear to have their own unique ratios of RNA synthesis and degradation, suggesting that both of these processes are coordinately regulated by cells to obtain defined, homeostatic levels of RNA [12]. The particular settings of the ratios of synthesis and degradation of RNA from NER genes are not known. Since many of the adducts that NER has been designed to remove from DNA have the capacity to block the elongation of transcription, the ratio of synthesis and stability of a particular RNA may have a profound effect on the steady-state level of its expression following DNA insult. For example, a setting of high synthesis coupled with low stability would rapidly lead to the depletion of this RNA, while a gene with the strategy of low synthesis and high RNA stability would fare much better at times of exposure to transcription-blocking DNA damage. Here we present the signatures of RNA synthesis and stability of 29 NER genes using Bru-seq and BruChase-seq across 13 human cell lines.
3.1. Differences in NER gene regulation across cell lines
To obtain estimates of the relative stability of the 29 NER transcripts using BruChase-seq, we compared the number of sequencing reads from all of the exons of a particular gene after a 6-hour chase with the number of sequencing reads from the whole gene immediately after Bru-labeling. If a particular transcript is stable this ratio will be high, while for an unstable transcript this ratio will be low. We observed that the stability of some NER transcripts was differentially regulated in the different cell lines. For example, the XPA transcript was synthesized to similar levels in the pancreatic cancer cell lines BxPC3 and UM59, but this transcript was unstable in BxPC3 cells and stable in UM59 cells (compare the heights of the exonic peaks in red) (Fig. 2A). The XPC transcript was much more stable in human fibroblasts than in UM28 cells (Fig. 2B), and the ERCC6 transcript was synthesized at a lower level in GM12878 B-cells than in MiaPaCa2 pancreatic cancer cells while showing relatively low stability in both cell lines (Fig. 2C).
The RAD23A and RAD23B genes encode functionally redundant proteins that act as ubiquitin receptors and interact with the XPC protein to promote damage recognition for GG-NER [3]. Using Bru-seq we found that both RAD23A and RAD23B were highly transcribed across the cell lines. In the three primary pancreatic cancer cell lines UM16, UM28 and UM59, both genes were similarly transcribed while the relative stabilities of these transcripts showed large variations (Fig. 3). Interestingly, in the UM16 cells the RAD23A transcript was very stable while the RAD23B transcript was not. Conversely, in UM28 and UM59 cells, the RAD23A transcript was unstable while the RAD23B transcript was stable. Thus, each cell line arrived at a desired level of either RAD23A or RAD23B by post-transcriptional regulation involving RNA stability.
3.2. Relative rates of RNA synthesis of NER genes
To enable direct comparisons of RNA synthesis between samples, transcript intensities were measured in transcripts per million reads (TPM). The 29 NER genes were synthesized at a spectrum of different rates (Fig. 4A). These rates differed between genes and between different cell lines for the same gene. To explore whether this variability reflected biological differences between the cell lines or technical variation between experiments, we compared the relative RNA synthesis between two biological samples prepared on different occasions from the same cell line (K562) and found them to be highly correlated (r² = 0.952) (Supplemental Fig. 1A). Thus, technical variability did not appear to be the main cause of the variability seen in the rate of RNA synthesis of the NER genes across the cell lines. We next compared the synthesis rates of the 29 NER genes to the synthesis rates of all expressed genes in the same cell line (HF1) and found that NER genes are synthesized at a median rate that is comparable to the median rate of all expressed genes (Fig. 4B).
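A replicate comparison of this kind can be sketched as a squared Pearson correlation between two samples. The sketch below is an assumption-laden illustration rather than the authors' code: it assumes the correlation is computed on natural-log TPM values of genes expressed in both replicates, and the function names and toy values are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)

def replicate_r_squared(rep1_tpm, rep2_tpm):
    """r^2 between two replicates, using log TPM of genes expressed in both."""
    shared = sorted(g for g in rep1_tpm
                    if g in rep2_tpm and rep1_tpm[g] > 0 and rep2_tpm[g] > 0)
    xs = [math.log(rep1_tpm[g]) for g in shared]
    ys = [math.log(rep2_tpm[g]) for g in shared]
    return pearson_r(xs, ys) ** 2

# Made-up replicate values; rep_b is exactly twice rep_a.
rep_a = {"GENE_A": 1.0, "GENE_B": 10.0, "GENE_C": 100.0, "GENE_D": 0.0}
rep_b = {"GENE_A": 2.0, "GENE_B": 20.0, "GENE_C": 200.0, "GENE_D": 5.0}
example_r2 = replicate_r_squared(rep_a, rep_b)
```

On a log scale a constant fold difference between replicates becomes a constant offset, so replicates that differ only in sequencing depth still give an r² close to 1. Genes with zero signal in either replicate are dropped here because their logarithm is undefined.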
3.3. Relative stability of NER transcripts
The relative stabilities of transcripts were estimated as the ratio of the reads obtained from the exons of each gene in the 6-hour sample (BruChase-seq) to the reads throughout the corresponding gene in the 0-hour sample (Bru-seq). The relative stabilities of the 29 NER transcripts are shown in Figure 4C, with the correlation between the two K562 cell samples shown in Supplemental Figure 1B. In Figure 4E, the exonic reads from the different genes in the 6-hour samples are shown, with the correlation between the K562 cell samples shown in Supplemental Figure 1C. The densities of reads across the exons of genes after a 6-hour chase in uridine are approximations of the steady-state levels of RNA. Taken together, there were large differences in transcript stability between different genes and also across the 13 cell lines.
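The relative stability ratio described in this section can be expressed as a short Python sketch. The function name, the input layout and the toy values are hypothetical, and the sketch assumes both inputs are already normalized signals for the same genes.

```python
def relative_stability(exonic_6h, whole_gene_0h):
    """Ratio of 6-hour exonic signal to 0-hour whole-gene signal per gene.

    Genes with no synthesis signal at 0 hours have no defined stability
    and are returned as None.
    """
    stability = {}
    for gene, synthesis in whole_gene_0h.items():
        if synthesis <= 0:
            stability[gene] = None
        else:
            stability[gene] = exonic_6h.get(gene, 0.0) / synthesis
    return stability

# Made-up signals: equal synthesis, very different amounts left after the chase.
synthesis_0h = {"STABLE_GENE": 10.0, "UNSTABLE_GENE": 10.0, "SILENT_GENE": 0.0}
exonic_after_chase = {"STABLE_GENE": 8.0, "UNSTABLE_GENE": 1.0}
example_stability = relative_stability(exonic_after_chase, synthesis_0h)
```

Two genes with the same synthesis signal can end up with very different ratios, which is the situation described for XPA in BxPC3 and UM59 cells. A gene without synthesis signal gets no ratio at all, corresponding to the white boxes mentioned in the Analysis section.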
3.4. Gene and cell type-specific regulation of NER gene expression
We observed differential regulation of gene expression at the level of synthesis, stability and mature RNA content, both between genes and between cell types. For each variable, we used the gene's median across all cell types to calculate a gene regulation score (Fig. 5A-C). To understand the regulation of NER genes in individual cell lines, we calculated overall scores for the cell lines. For each variable, we calculated the percent difference of each gene's value in a given cell line from the median of that gene across all cell lines. This allowed us to calculate the overall distance of individual cell lines from the median signal observed across all cell lines (Fig. 5D-F).

Our analyses of the RNA synthesis and stability data showed that the NER genes displayed different patterns of gene regulation. For example, RAD23B was the most highly transcribed gene across the cell lines, with RPA3 being the lowest (Fig. 5A). Among the cell lines, the primary pancreatic cancer line UM28 and the normal pancreatic epithelial cell line HPNE showed the highest rates of transcription across the 13 cell lines, while the established pancreatic cancer cell lines Panc1, MiaPaCa2 and BxPC3, together with HEK293 cells, had the lowest rates of transcription (Fig. 5D). When we tabulated the median relative stability of the different NER transcripts we got a very different picture from that of the synthesis data. The most striking difference was seen for the RPA3 transcript, which had the overall lowest median rate of transcription (Fig. 5A) but had by far the highest median relative stability of all tested NER transcripts (Fig. 5B). Similarly, the BxPC3 cell line had the lowest median relative synthesis rate of the NER genes among the 13 cell lines (Fig. 5D), while the transcripts produced scored near the top for median relative stability (Fig. 5E). The transcripts generated by HeLa cells showed the overall lowest median relative stability. The exonic values of the 6-hour samples are the products of the RNA synthesis and RNA stability scores and represent the mature population of RNA resembling steady-state RNA. The NER transcript with the highest median relative abundance at 6 hours after labeling was RAD23B, with DDB1 a close second (Fig. 5C). The cell lines with the highest median relative levels of NER transcripts at 6 hours were the pancreatic cancer cell line Panc1 and the glioblastoma cell line LN428, while HeLa cells showed the lowest level (Fig. 5F).
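The gene and cell-line scores described in this section can be sketched as follows. This is a minimal illustration, not the authors' implementation: the nested-dictionary layout, the function names and the toy values are hypothetical, and summarizing each cell line by the median of its percent differences is an assumption based on the boxplot presentation in Fig. 5D-F.

```python
from statistics import median

def gene_medians(values):
    """Median of each gene across cell lines; values[gene][cell_line] -> number."""
    return {gene: median(per_line.values()) for gene, per_line in values.items()}

def percent_difference_from_median(values):
    """Percent difference of each gene in each cell line from that gene's median."""
    medians = gene_medians(values)
    diffs = {}
    for gene, per_line in values.items():
        m = medians[gene]
        if m == 0:
            continue  # percent difference is undefined for a zero median
        diffs[gene] = {line: 100.0 * (v - m) / m for line, v in per_line.items()}
    return diffs

def cell_line_scores(values):
    """Median percent difference per cell line across all genes (assumed summary)."""
    per_line = {}
    for gene_diffs in percent_difference_from_median(values).values():
        for line, pct in gene_diffs.items():
            per_line.setdefault(line, []).append(pct)
    return {line: median(pcts) for line, pcts in per_line.items()}

# Made-up synthesis values for two genes in three cell lines.
toy_values = {
    "GENE_1": {"LINE_A": 1.0, "LINE_B": 2.0, "LINE_C": 3.0},
    "GENE_2": {"LINE_A": 10.0, "LINE_B": 20.0, "LINE_C": 40.0},
}
toy_gene_scores = gene_medians(toy_values)
toy_line_scores = cell_line_scores(toy_values)
```

Expressing each value relative to the gene's own median puts highly and weakly expressed genes on a common scale before cell lines are compared. The same functions can be applied separately to the synthesis, stability and 6-hour exonic values.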
4. Discussion
Our study provides a comprehensive view of the complex regulation of synthesis and stability of transcripts of the 29 genes of the NER pathway across 13 cell lines. NER is thought to have evolved primarily to deal with DNA lesions induced by UV light, but this repair pathway is also needed to remove cyclopurine adducts induced by endogenously produced reactive oxygen species (ROS) [3]. Therefore all cells need NER in some capacity to deal with a set of lesions that could pose threats to transcription and replication. Indeed, in our survey of synthesis and stability of the NER genes across cell lines derived from both normal and cancerous tissues, no cell line was found to lack expression of the NER genes. However, there were large differences in the expression signatures of the NER genes among the cell lines. Moreover, the balance between synthesis and turnover of the different transcripts showed unique patterns. For example, the RPA1 gene generated very little nascent RNA, but this RNA showed very high stability, allowing this transcript to have a fairly high relative abundance at 6 hours after its synthesis. On the other hand, the ERCC6 gene, coding for the CSB protein involved in TC-NER, was synthesized at fairly high rates in many cell lines but its transcript was rather unstable. An interesting pattern of post-transcriptional regulation was also observed for the RAD23A and RAD23B transcripts in the cell lines UM16, UM28 and UM59 (Fig. 3). The products of these two genes are thought to be functionally equivalent, and we found that each cell line synthesized both transcripts at similar rates but retained one or the other through post-transcriptional regulation.
Our approach of exploring transcript synthesis and stability across cancer cell lines could potentially be used to predict susceptibility to particular chemotherapeutic agents. In this test case of examining the NER pathway, we calculated NER gene regulation scores for each cell line and ordered the cell lines according to their median expression (Fig. 5D-F). Functional repair and survival assays would be needed to discern whether the median expression levels of the NER genes in the different cell lines are predictive of the sensitivity to, for example, UV light or cisplatin. Alternatively, the “outlier” genes with very low expression may be used as predictors of treatment outcomes. A future refinement of this scoring system would be to assign each gene in a particular pathway a different “weight” according to whether it is expressed at rate-limiting levels and how important it is for the function of the pathway.

Transcription acts as a sensor of DNA damage and is linked to p53 induction via RPA and ATR signaling [14]. Pro-apoptotic genes induced by p53 are in general more compact in size than anti-apoptotic genes, a situation that ensures a dose-dependent induction of apoptosis as larger survival genes become preferentially blocked by DNA lesions [15]. Furthermore, apoptosis may be induced by transcription-blocking lesions as a result of diminished levels of critical gene products, such as MCL-1 [9, 16, 17]. Finally, proliferating cells that enter S-phase before resolving stalled RNA polymerase II complexes will have lethal encounters between the replication and transcription machineries [18]. Thus, it is critical that NER, and especially the TC-NER sub-pathway, remains operational following damage exposure to promote survival. To remain operational even after induction of transcription-blocking lesions, NER genes could plausibly be under selective pressure to reduce their sizes. However, when examining the gene sizes of NER genes we found that they are not generally smaller than the median size of human genes (Fig. 6A).
One of the key factors in TC-NER is the CSB protein, which promotes the recruitment of NER to sites of stalled RNA polymerase II complexes [3, 4]. The CSB protein is synthesized from the ERCC6 gene, which spans about 87 kb, a size about three times larger than the median size of all human protein-coding genes, placing it second in size out of the 29 NER genes (Fig. 6A&B). Due to its large size, it represents a rather large target for inactivation by UV light and other agents that are capable of inducing transcription-blocking lesions. Moreover, the ERCC6 transcript was found to be quite unstable across the cell lines, so the levels of ERCC6 RNA are expected to decline rapidly following a challenge with high doses of a transcription-blocking agent, such as UV light, eventually leading to diminished levels of CSB protein (Fig. 6C). Since CSB is critical for the recovery of RNA synthesis and survival following exposure to UV light or other agents inducing transcription-blocking lesions, a time-dependent decline in the level of CSB protein after high doses paints the cells into a corner by depleting the one factor designed to save them from these types of insults. The unstable nature of the ERCC6 transcript and its rather large gene size suggest to us that ERCC6/CSB may act as a dosimeter of DNA damage: above a certain threshold of DNA damage, the ERCC6 transcript is depleted to such a degree that it can no longer sustain the translation of the CSB protein, resulting in deficient TC-NER and promotion of cell death.
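The dosimeter argument combines two effects, gene size and transcript stability, and a toy calculation may help make the reasoning concrete. The sketch below is purely illustrative and is not part of the study: it assumes lesions are Poisson-distributed along the gene, that a single lesion blocks a transcribing polymerase, that no repair takes place, and that the transcript decays exponentially. The lesion density and the half-life are hypothetical parameters; only the approximate gene sizes (87 kb for ERCC6, 18 kb median) come from the text.

```python
import math

def fraction_unblocked(gene_kb, lesions_per_kb):
    """Probability that a gene of the given length carries no lesion (Poisson)."""
    return math.exp(-lesions_per_kb * gene_kb)

def relative_rna_level(hours, half_life_h, synthesis_fraction):
    """RNA level relative to the pre-damage steady state.

    Solves dR/dt = f*s - k*R starting from R0 = s/k, where f is the fraction
    of synthesis remaining after damage and k = ln(2) / half-life.
    """
    k = math.log(2) / half_life_h
    return synthesis_fraction + (1.0 - synthesis_fraction) * math.exp(-k * hours)

# Hypothetical lesion density; gene sizes as quoted in the text.
LESIONS_PER_KB = 0.02
ercc6_unblocked = fraction_unblocked(87.0, LESIONS_PER_KB)
median_gene_unblocked = fraction_unblocked(18.0, LESIONS_PER_KB)

# Hypothetical half-lives: an unstable versus a stable transcript, 6 h after damage.
unstable_level = relative_rna_level(6.0, half_life_h=1.0, synthesis_fraction=ercc6_unblocked)
stable_level = relative_rna_level(6.0, half_life_h=12.0, synthesis_fraction=ercc6_unblocked)
```

Under these assumptions a long gene loses a larger share of its synthesis at any given lesion density, and an unstable transcript approaches the new, lower level much faster than a stable one. Real cells repair lesions over time, so the sketch only captures the direction of the effect, not its magnitude.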
5. Acknowledgements
We would like to thank Michelle Paulsen for excellent technical assistance and the other members of the Ljungman lab for valuable input to this project. We are also grateful to Manhong Dai and Fan Meng for administration and maintenance of the University of Michigan Molecular and Behavioral Neuroscience Institute (MBNI) computing cluster and to the personnel at the University of Michigan Sequencing Core. This work has been supported by funds from The University of Michigan Pancreatic Cancer Center, The Translational Oncology Program, University of Michigan Bioinformatics Program, National Institute of Environmental Health Sciences (1R21ES020946) and National Human Genome Research Institute (1R01HG006786).
6. References
1. Friedberg E, Walker G, Siede W, Wood R, Schultz R and Ellenberger T (2006) DNA Repair and Mutagenesis. ASM Press, Washington, D.C.
2. Wood RD, Mitchell M, Sgouros J and Lindahl T (2001) Human DNA repair genes. Science 291:1284-1289.
3. Marteijn JA, Lans H, Vermeulen W and Hoeijmakers JH (2014) Understanding nucleotide excision repair and its roles in cancer and ageing. Nat Rev Mol Cell Biol 15:465-81.
4. Hanawalt PC and Spivak G (2008) Transcription-coupled DNA repair: two decades of progress and surprises. Nat Rev Mol Cell Biol 9:958-70.
5. Offman J, Jina N, Theron T, Pallas J, Hubank M and Lehmann A (2008) Transcriptional changes in trichothiodystrophy cells. DNA Repair (Amst) 7:1364-71.
6. Spivak G, Itoh T, Matsunaga T, Nikaido O, Hanawalt P and Yamaizumi M (2002) Ultraviolet-sensitive syndrome cells are defective in transcription-coupled repair of cyclobutane pyrimidine dimers. DNA Repair 1:629-643.
7. Bowden NA (2014) Nucleotide excision repair: why is it not used to predict response to platinum-based chemotherapy? Cancer Lett 346:163-71.
8. Rahman N (2014) Realizing the promise of cancer predisposition genes. Nature 505:302-8.
9. Ljungman M and Zhang F (1996) Blockage of RNA polymerase as a possible trigger for u.v. light-induced apoptosis. Oncogene 13:823-31.
10. McKay BC, Becerril C and Ljungman M (2001) P53 plays a protective role against UV- and cisplatin-induced apoptosis in transcription-coupled repair proficient fibroblasts. Oncogene 20:6805-6808.
11. Paulsen MT, Veloso A, Prasad J, Bedi K, Ljungman EA, Magnuson B, Wilson TE and Ljungman M (2014) Use of Bru-Seq and BruChase-Seq for genome-wide assessment of the synthesis and stability of RNA. Methods 67:45-54.
12. Paulsen MT, Veloso A, Prasad J, Bedi K, Ljungman EA, Tsan YC, Chang CW, Tarrier B, Washburn JG, Lyons R, Robinson DR, Kumar-Sinha C, Wilson TE and Ljungman M (2013) Coordinated regulation of synthesis and stability of RNA during the acute TNF-induced proinflammatory response. Proc Natl Acad Sci U S A 110:2240-5.
13. Wagner GP, Kin K and Lynch VJ (2012) Measurement of mRNA abundance using RNA-seq data: RPKM measure is inconsistent among samples. Theory Biosci 131:281-5.
14. Derheimer FA, O'Hagan HM, Krueger HM, Hanasoge S, Paulsen MT and Ljungman M (2007) RPA and ATR link transcriptional stress to p53. Proc Natl Acad Sci U S A 104:12778-83.
15. McKay BC, Stubbert LJ, Fowler CC, Smith JM, Cardamore RA and Spronck JC (2004) Regulation of ultraviolet light-induced gene expression by gene size. Proc Natl Acad Sci U S A 101:6582-6586.
16. Ljungman M and Lane DP (2004) Transcription - guarding the genome by sensing DNA damage. Nat Rev Cancer 4:727-37.
17. Nijhawan D, Fang M, Traer E, Zhong Q, Gao W, Du F and Wang X (2003) Elimination of Mcl-1 is required for the initiation of apoptosis following ultraviolet irradiation. Genes Dev 17:1475-86.
18. McKay B, Becerril C, Spronck J and Ljungman M (2002) Ultraviolet light-induced apoptosis is associated with S-phase in primary human fibroblasts. DNA Repair 1:811-820.
FIGURE LEGENDS
Figure 1. The 29 gene products of NER that were examined in this study and their roles in NER.
Figure 2. Differential regulation of relative synthesis and stability of key NER transcripts. (A) The relative synthesis of XPA RNA in the pancreatic cancer cell lines BxPC3 and UM59 (blue trace, left, expressed as reads per thousand base pairs per million reads, RPKM) is similar, but the stability of the XPA transcript differs dramatically between these two cell lines (red trace, right). (B) Transcription appears to diminish at the 3’-end of the XPC gene in UM28 cells while the XPC gene is transcribed evenly in the skin fibroblasts HF1 (left). The stability of the XPC transcript is much higher in the HF1 cells than in the UM28 cells (right). (C) Differential rates of RNA synthesis of the ERCC6 gene in MiaPaCa2 cells and GM12878 cells (left). Low stability of the ERCC6 transcript (ratio of 6-h exonic reads to full gene reads at 0 h) in both cell types (right).
Figure 3. Post-transcriptional regulation of RAD23A and RAD23B genes in pancreatic cancer cell lines. (A) The relative synthesis of RAD23A RNA is very similar across three pancreatic cancer cell lines, while the relative stability of the RAD23A transcript is high in UM16 cells but low in UM28 and UM59 cells. (B) The relative synthesis of RAD23B is similar across the three UM cell lines, while the stability of the RAD23B transcript is higher in UM28 and UM59 cells.
Figure 4. Expression of the 29 NER genes in 13 human cell lines. (A) Relative rates of RNA synthesis of NER genes as determined by Bru-seq. (B) Boxplots comparing RNA synthesis between NER genes and all other expressed genes in cell line HF1. (C) Relative RNA stability of NER transcripts as determined by BruChase-seq. (D) Boxplots comparing relative RNA stability between NER genes and all other expressed genes in cell line HF1. (E) Exonic reads of 6-hour old NER RNAs. (F) Boxplots comparing exonic reads of 6-hour old RNA between NER genes and all other expressed genes in cell line HF1. The boxplots show the distribution of the data, where the horizontal line inside the box indicates the median (50th percentile). The edges of the boxes indicate the 25th and 75th percentiles, and the distance between them is the interquartile range. Whiskers extend from the box edges to the most extreme data point that lies no more than 1.5 times the interquartile range from the box. Individual data points beyond the whiskers are indicated with circles.
Figure 5. Comparative analysis of transcription regulation across genes and cell lines. Relative transcript synthesis (A), stability (B), and exonic signal in 6-hour old RNA (C). Data are grouped by gene to indicate the genes with the highest and lowest values. Transcript percent difference from median values (calculated across cell lines) in synthesis (D), stability (E), and exonic signal in 6-hour old RNA (F). Data are grouped by cell line to indicate the cell lines with the greatest discrepancies from the median expression of each gene. Boxplots are designed as described in Fig. 4.
Figure 6. Genomic lengths of human NER genes. (A) The lengths of the 29 human NER genes expressed in kb. MNAT1 is the longest gene, followed by ERCC6, RPA3 and ERCC8. The median length of all human genes is 18 kb. (B) Model of how ERCC6/CSB may act as a dosimeter of DNA damage to ensure an appropriate level of cell death as a function of UV dose. At lower doses of UV light, cells are able to transcribe sufficient amounts of RNA from the ERCC6 gene to sustain CSB protein levels for proficient TC-NER. At higher levels of UV exposure, transcription-blocking lesions have a high likelihood of forming in the large ERCC6 gene, resulting in the inhibition of new synthesis of ERCC6 RNA. Since the ERCC6 transcript is very unstable, the levels of ERCC6 RNA are expected to diminish rapidly once transcription of the gene is blocked, resulting in a time-dependent loss of the CSB protein, loss of TC-NER and induction of cell death.
[Figure 1 image. Labels in the schematic: UV light; RNP II; TC-NER (CSA, CSB, XAB2, UVSSA); GG-NER (XPC, HR23A, HR23B, CETN2, DDB1, DDB2); TFIIH (XPB, XPD, GTF2H1, GTF2H2, GTF2H3, GTF2H4, TTDA, CDK7, CCNH, MNAT1); MMS19; pre-incision complex (XPA, RPA1, RPA2, RPA3); incision (ERCC1, XPF, XPG); ligation (LIG1).]
Figure 2. [Image not reproduced: genome browser tracks (hg19) with panels A, B and C showing Bru-seq and BruChase-seq signal (y-axis: RPKM) across XPA (chr9, 10 kb scale bar, neighboring gene NCBP1), XPC (chr3, 10 kb scale bar, neighboring genes TMEM43 and LSM3) and ERCC6 (chr10, 50 kb scale bar, with PGBD3 and ERCC6-PGBD3). Cell line labels as extracted: HF1, UM28, UM59, BxPC3, mia, GM.]
Figure 3. [Image not reproduced: genome browser tracks (hg19) with panels A and B showing Bru-seq and BruChase-seq signal (y-axis: RPKM) across RAD23A (chr19, 2 kb scale bar, neighboring gene GADD45GIP1) and RAD23B (chr9, 20 kb scale bar). Cell line labels: UM16, UM28, UM59.]
Figure 4. [Image not reproduced: panels A to F. Heatmaps titled Synthesis, Stability and Exonic signal after 6h, each with a color key and histogram (axes: Value, Count). Heatmap columns are the 29 NER genes: DDB1, DDB2, XPC, RAD23A, RAD23B, CETN2, ERCC8, ERCC6, XAB2, KIAA1530, ERCC3, ERCC2, CCNH, MMS19, MNAT1, CDK7, GTF2H1, GTF2H2, GTF2H3, GTF2H4, GTF2H5, RPA1, RPA2, RPA3, XPA, ERCC1, ERCC4, ERCC5, LIG1. Heatmap rows are the cell lines: HF1, HNPE, panc1, miaPaCa2, BxPC3, UM16, UM28, UM59, HeLa, LN428, GM12878, HEK293, K562 (1), K562 (2). Distribution plots for sample nf0h3a_3b compare NER genes with other genes; axes: Log Gene Expression at 0h (TPM), Gene Stability (log 6h TPM/log 0h TPM), Log Gene Expression at 6h (TPM).]
Figure 5. [Image not reproduced: boxplots, panels A to F. Panels A to C are grouped by NER gene with y-axes Relative synthesis (A), Relative stability (B) and Relative expression of 6h old RNA (C). Panels D to F are grouped by cell line with y-axes % difference in synthesis, % difference in stability and % difference in expression of 6h old RNA.]
Figure 6. [Image not reproduced. Panel A: bar plot of gene length (kb) for the NER genes, with a reference marked "all genes". Panel B: model diagram. Low UV: ERCC6 RNA, CSB, TC-NER, survival. High UV: ERCC6 (large gene size) blocked, unstable RNA, CSB lost, loss of TC-NER, cell death.]