The 6th base of DNA and beyond: Intermediates of a DNA demethylation pathway
Epigenetic modifications include changes to DNA that alter gene expression but do not change the DNA sequence. One of the most important and widely studied epigenetic modifications is DNA methylation. It involves the addition of a methyl group to position 5 of the cytosine base, generating 5-methylcytosine (5-mC). DNA methylation renders the chromatin more tightly closed and inaccessible to the transcriptional machinery, leading to silencing of gene expression. Its distribution has profound implications for development and aging, as well as for cancer and other diseases. Conversely, DNA demethylation involves the removal of the methyl group and is linked to transcriptional activation and gene expression.
DNA demethylation is important for mammalian development and differentiation. There is global DNA demethylation in the developing zygote and primordial germ cells of the embryo, whereas in some somatic cells there is locus-specific DNA demethylation (1, 2). DNA demethylation is vital during cellular differentiation. For example, in immune cells the differentiation of monocytes into macrophages and dendritic cells requires DNA demethylation (3). DNA demethylation can be carried out passively, by not methylating the newly-synthesized strand after DNA replication, or actively, by mechanisms not dependent on replication (1).
While DNA methylation has been under study for a few decades, active DNA demethylation has only begun to be unraveled in the past few years. This progress is due to the discovery of another methylated variant of cytosine, 5-hydroxymethylcytosine (5-hmC), and of the enzymes that catalyze its generation from 5-mC.
Conversion of 5-mC to 5-hmC is the first step in the DNA demethylation pathway. The discovery of 5-hmC led to the finding of 5-formylcytosine (5-fC) and 5-carboxylcytosine (5-caC), two other modified cytosines that are generated from 5-mC and are also intermediates in the DNA demethylation pathway. Prior to the discovery of 5-mC and the advent of epigenetics, no gene-regulating modifications of the four DNA bases (adenine, guanine, thymine, and cytosine) were known. Hence, 5-mC became known as the 5th base of DNA (4), and 5-hmC, 5-fC, and 5-caC are occasionally referred to as the 6th, 7th, and 8th bases, respectively (5, 6). They are produced from 5-mC by successive oxidation steps, leading to the complete demethylation of 5-mC. The DNA demethylation pathway proceeds sequentially from 5-mC → 5-hmC → 5-fC → 5-caC (Figure 1). Both 5-fC and 5-caC can then be converted back to unmethylated cytosine.
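The order of the pathway can be captured in a few lines of Python. This is only an illustrative sketch: the names `OXIDATION_STEP`, `REVERTIBLE`, and `oxidations_until_revertible` are hypothetical, and the model covers only the sequential route described above, treating each oxidation as a single step.

```python
# Sequential oxidation steps described in the text (one oxidation per step).
OXIDATION_STEP = {"5-mC": "5-hmC", "5-hmC": "5-fC", "5-fC": "5-caC"}

# Modified cytosines that the text says can be converted back to cytosine.
REVERTIBLE = {"5-fC", "5-caC"}


def oxidations_until_revertible(base):
    """Count the oxidation steps needed before `base` can revert to cytosine."""
    steps = 0
    while base not in REVERTIBLE:
        if base not in OXIDATION_STEP:
            raise ValueError(f"{base!r} is not part of the modelled pathway")
        base = OXIDATION_STEP[base]
        steps += 1
    return steps


print(oxidations_until_revertible("5-mC"))   # 2
print(oxidations_until_revertible("5-hmC"))  # 1
print(oxidations_until_revertible("5-caC"))  # 0
```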
Much is now known about 5-methylcytosine and its impact on gene expression. However, less is known about these products of DNA demethylation: are they merely intermediates in a DNA demethylation pathway, or do they have other functional roles as epigenetic marks in their own right? Here we will briefly discuss the possible roles of 5-hmC, 5-fC, and 5-caC, the enzymes that lead to their generation, and experimental methods to distinguish these from 5-mC.
5-hydroxymethylcytosine (5-hmC)
5-hydroxymethylcytosine (5-hmC) is the first intermediate in the DNA demethylation pathway. 5-mC is converted to 5-hmC by an oxidation reaction catalyzed by the TET family of enzymes. 5-hmC was initially discovered in bacteriophages in the 1950s and reported in mammalian tissues in 1972 (7, 8). However, it wasn’t until 2009 that its existence and functional significance became widely accepted.
Levels and distribution of 5-hmC vary among tissues. Its levels are highest in the brain, where ~1% of cytosines are 5-hmC (9-12). While 5-mC is associated with closed, repressed chromatin, 5-hmC is associated with DNA demethylation and open, active chromatin (10). In the brain, 5-hmC is enriched in intragenic (gene body) regions, and its levels correlate with gene expression levels (9). In embryonic stem cells, by contrast, 5-hmC is associated with promoters, enhancers, and other gene regulatory regions and regulates the expression of pluripotency-related genes (13, 14).
5-hmC levels are reduced in several tumor types, including lung, brain, breast, colon, prostate, melanoma, and others, as compared to normal tissues (15, 16). Therefore, its absence has been suggested as a possible marker for cancer diagnosis (15), and its loss is associated with malignant transformation in melanoma and other solid tumors (17, 18). It has also been designated as an epigenetic marker of DNA damage (19). Researchers have shown that it colocalizes with 53BP1 and gamma-H2AX after microirradiation-induced DNA damage, suggesting a possible role in DNA damage repair.
5-formylcytosine (5-fC) and 5-carboxylcytosine (5-caC)
Much less is known about the roles of 5-formylcytosine (5-fC) and 5-carboxylcytosine (5-caC). They are generated by sequential oxidations, from 5-hmC to 5-fC and then to 5-caC (5-mC → 5-hmC → 5-fC → 5-caC), or directly from 5-mC (5-mC → 5-fC; 5-mC → 5-caC) (Figure 1) (20). 5-fC is found in the promoters of highly transcribed genes in embryonic stem cells and is increased in poised enhancers and other gene regulatory regions (21, 22). Studies have also reported that 5-fC, unlike the other modified cytosines, changes the conformation of the DNA double helix, and that specific DNA-binding proteins can be recruited by 5-fC (23). Collectively, most studies suggest that 5-fC is important for the regulation of gene expression in embryonic stem cells.
5-caC is the last modified cytosine in the demethylation pathway, prior to excision and repair back to unmodified cytosine, and thus has been deemed a marker of active DNA demethylation. Similar to 5-fC, it is also present in the promoters of highly transcribed genes in embryonic stem cells (22). It is elevated in breast cancers and gliomas, suggesting the possibility of preferential oxidation of 5-mC to 5-fC and 5-caC, instead of to 5-hmC, in some cancers (24). However, this possibility and the clinical significance of its elevated levels remain to be elucidated.
The Ten-Eleven Translocation (TET) proteins catalyze the sequential oxidations of 5-mC to 5-hmC and then to 5-fC and 5-caC, and also the direct conversion of 5-mC to 5-fC and 5-caC (Figure 1) (20). The TET proteins comprise a family of three dioxygenases, TET1-3, that depend on iron(II) and 2-oxoglutarate as cofactors. They all share enzymatic functions in the demethylation pathway but are expressed at different levels among tissues, with TET2 preferentially expressed in hematopoietic cells. TET1-3 are mutated in some solid tumors, and TET2 in hematological cancers (25), suggesting tumor-suppressing roles for the TET enzymes. 5-fC and 5-caC can be converted back to cytosine by thymine DNA glycosylase (TDG) through base excision repair, rendering the cytosine completely unmethylated (5). TDG is also involved in the correction of G/T mismatches during DNA repair. Loss of TDG in mice is embryonic lethal and causes DNA demethylation defects (26).
Bisulfite-sequencing is the gold standard method for studying DNA methylation. It involves a process in which unmethylated cytosines are converted to uracils, whereas methylated cytosines are unsusceptible to treatment and remain as cytosines. This technique allows for discrimination of the methylation status of CpG sites during sequencing.
Similar to 5-mC, 5-hmC also remains unchanged as cytosine during the processing of samples for bisulfite-sequencing, and, consequently, the conventional bisulfite-sequencing method cannot distinguish between 5-mC and 5-hmC. However, 5-mC and 5-hmC have opposing roles in transcriptional regulation and gene expression, despite both being methylated cytosines. Thus it is important to be able to discriminate between these two methylated cytosines, and new methods have been developed for distinguishing between these two DNA bases.
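The limitation can be made concrete with a small simulation. This is a minimal sketch under simplifying assumptions: it models a single strand, assumes complete conversion, and uses hypothetical names (`RESISTANT`, `bisulfite_convert`) that are not taken from any library. Uracil is written as T because that is how it is read during sequencing.

```python
# Cytosine states that the text describes as unchanged by bisulfite treatment.
RESISTANT = {"5-mC", "5-hmC"}


def bisulfite_convert(seq, modifications=None):
    """Return the read expected after bisulfite treatment and sequencing.

    seq           -- DNA sequence of one strand, e.g. "ACGTCG"
    modifications -- dict mapping 0-based positions of cytosines to a state
                     label such as "5-mC" or "5-hmC" (unlisted Cs are unmodified)
    """
    modifications = modifications or {}
    seq = seq.upper()
    for pos in modifications:
        if seq[pos] != "C":
            raise ValueError(f"position {pos} is not a cytosine")
    read = []
    for pos, base in enumerate(seq):
        if base == "C" and modifications.get(pos) not in RESISTANT:
            read.append("T")  # converted to uracil, sequenced as thymine
        else:
            read.append(base)
    return "".join(read)


# Two templates that differ only in which cytosine carries 5-mC or 5-hmC.
print(bisulfite_convert("ACGTCGCC", {1: "5-mC", 4: "5-hmC"}))  # ACGTCGTT
print(bisulfite_convert("ACGTCGCC", {1: "5-hmC", 4: "5-mC"}))  # ACGTCGTT
```

Both templates give the same read, which is why a conventional bisulfite run on its own cannot tell 5-mC from 5-hmC.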
Two of the principal methods are oxidative bisulfite-sequencing (OxBS-seq) and TET-assisted bisulfite-sequencing (TAB-seq). In OxBS-seq, an additional oxidation step converts 5-hmC to 5-fC before bisulfite treatment (27). Unlike 5-hmC, 5-fC is converted to uracil during bisulfite conversion, so only 5-mC remains as cytosine. A conventional bisulfite-sequencing run can then be compared with the OxBS-seq run to account for the 5-hmC. In TAB-seq, 5-hmC is first protected by glycosylation by β-glucosyltransferase and becomes β-glucosyl-5-hydroxymethylcytosine (β-glu-5hmC). This renders it resistant to conversion into uracil during bisulfite treatment. A TET enzyme is then used to catalyze the conversion of 5-mC to 5-fC and 5-caC, which change to uracils during bisulfite conversion (13). OxBS-seq and TAB-seq both provide information at single-base resolution.
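The comparison of the two runs amounts to a subtraction at each cytosine site. The sketch below assumes that read counts per site are already available and that conversion is complete. The function names and the clamping of negative differences to zero are illustrative choices, not part of any published pipeline.

```python
def oxbs_estimate(bs_c, bs_total, oxbs_c, oxbs_total):
    """Estimate 5-mC and 5-hmC fractions at one cytosine site.

    bs_c / bs_total     -- reads calling C, and all reads, in the conventional run
    oxbs_c / oxbs_total -- the same counts in the OxBS-seq run
    """
    if bs_total <= 0 or oxbs_total <= 0:
        raise ValueError("each run needs at least one read at the site")
    bs_fraction = bs_c / bs_total        # 5-mC + 5-hmC both read as C
    oxbs_fraction = oxbs_c / oxbs_total  # only 5-mC still reads as C
    # Sampling noise can make the difference negative; clamp it to zero here.
    hmc = max(bs_fraction - oxbs_fraction, 0.0)
    return {"5-mC": oxbs_fraction, "5-hmC": hmc}


def tab_estimate(tab_c, tab_total):
    """Estimate the 5-hmC fraction at one site from a TAB-seq run.

    Only the protected 5-hmC reads as C, so no subtraction is needed.
    """
    if tab_total <= 0:
        raise ValueError("the run needs at least one read at the site")
    return tab_c / tab_total


# Hypothetical counts: 80/100 reads call C in BS-seq, 60/100 in OxBS-seq.
estimate = oxbs_estimate(80, 100, 60, 100)
print(f"5-mC: {estimate['5-mC']:.2f}, 5-hmC: {estimate['5-hmC']:.2f}")
```

With these counts the site is estimated at roughly 60% 5-mC and 20% 5-hmC. OxBS-seq needs two libraries and a subtraction, whereas TAB-seq reads 5-hmC directly from one library.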
Similar methods have been developed to measure 5-fC and 5-caC at single-base resolution, including 5-fC chemically assisted bisulfite-sequencing (fCAB-seq) (21) and methylase-assisted bisulfite-sequencing (MAB-seq) (22). In fCAB-seq, 5-fC is first protected by O-ethylhydroxylamine treatment prior to bisulfite treatment and sequencing. Comparison of the ethylhydroxylamine-treated and untreated bisulfite-sequencing runs can then account for 5-fC, as the unprotected 5-fC changes to uracil in the untreated run, while the ethylhydroxylamine-protected 5-fC remains as cytosine. MAB-seq maps 5-fC and 5-caC together. It first uses the CpG methyltransferase M.SssI to methylate unmodified CpG cytosines to 5-mC, rendering them, like the existing 5-mC and 5-hmC, resistant to bisulfite treatment. During bisulfite-sequencing these remain as cytosines, whereas 5-fC and 5-caC are changed to uracils, which allows 5-fC and 5-caC to be distinguished from the other cytosine states.
5-mC can also be distinguished from 5-hmC by affinity-based methods that enrich for 5-hmC. Examples include hMeDIP, which uses antibodies specific for 5-hmC (28), and pull-down assays using J-binding protein 1, which binds glycosylated 5-hmC (β-glu-5hmC) (29). Although they do not provide true base-pair resolution, these methods can be coupled with sequencing or microarrays to provide information on the position of 5-hmC in the genome.
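The readouts of the bisulfite-based methods described in this section can be summarized in a lookup table, which also shows how combining runs narrows down the state of a cytosine. This is a simplified sketch: it assumes complete conversion and protection, the MAB-seq entry applies only to cytosines in a CpG context, and the names `READ_AS_C`, `readout`, and `consistent_states` are hypothetical.

```python
STATES = ("C", "5-mC", "5-hmC", "5-fC", "5-caC")

# Cytosine states still read as C in each method, per the descriptions in the
# text. Every other state is converted to uracil and read as T.
READ_AS_C = {
    "BS-seq": {"5-mC", "5-hmC"},
    "OxBS-seq": {"5-mC"},                    # 5-hmC is oxidized to 5-fC first
    "TAB-seq": {"5-hmC"},                    # 5-hmC is protected by glucosylation
    "fCAB-seq": {"5-mC", "5-hmC", "5-fC"},   # 5-fC is chemically protected
    "MAB-seq": {"C", "5-mC", "5-hmC"},       # unmodified CpG Cs are methylated
}


def readout(method, state):
    """Return the base call ("C" or "T") expected for `state` in `method`."""
    return "C" if state in READ_AS_C[method] else "T"


def consistent_states(observations):
    """Return the states compatible with every observed method -> call pair."""
    return {
        state
        for state in STATES
        if all(readout(m, call_state := state) == call
               for m, call in observations.items())
    }


print(sorted(consistent_states({"BS-seq": "C", "OxBS-seq": "T"})))  # ['5-hmC']
print(sorted(consistent_states({"BS-seq": "T", "fCAB-seq": "C"})))  # ['5-fC']
print(sorted(consistent_states({"BS-seq": "T", "MAB-seq": "T"})))   # ['5-caC', '5-fC']
```

The last line reflects the point made above: MAB-seq reports 5-fC and 5-caC together and does not separate them.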
Antibodies for immunohistochemistry, IP, and other applications have also been developed, as well as commercially available ELISA-based methods that can quantify the global levels of these modified cytosines in cells and tissues (16, 30-33).
The interplay between DNA methylation and demethylation is important in development and differentiation and has an impact on disease. In cancer, for example, increased methylation in the promoters of tumor suppressor genes leads to silencing of their expression and is associated with cancer progression. Preventing methylation with DNA methyltransferase inhibitors is thus being investigated as a potential therapeutic strategy. On the other hand, TET proteins, the main enzymes in the DNA demethylation pathway, have tumor-inhibitory roles, and loss of TET proteins and 5-hmC is associated with malignant transformation and progression (17, 34). This highlights the importance of DNA demethylation and its impact on disease. Current DNA demethylation studies could lead to important discoveries with implications for development and cancer therapy.
References
1. Chen, Z.X. and A.D. Riggs, DNA methylation and demethylation in mammals. J Biol Chem, 2011. 286(21): p. 18347-53.
2. Hill, P.W., R. Amouroux, and P. Hajkova, DNA demethylation, Tet proteins and 5-hydroxymethylcytosine in epigenetic reprogramming: an emerging complex story. Genomics, 2014. 104(5): p. 324-33.
3. Vento-Tormo, R., et al., IL-4 orchestrates STAT6-mediated DNA demethylation leading to dendritic cell differentiation. Genome Biol, 2016. 17: p. 4.
4. Lister, R. and J.R. Ecker, Finding the fifth base: genome-wide sequencing of cytosine methylation. Genome Res, 2009. 19(6): p. 959-66.
5. Li, C.J., DNA demethylation pathways: recent insights. Genet Epigenet, 2013. 5: p. 43-9.
6. Song, C.X. and C. He, The hunt for 5-hydroxymethylcytosine: the sixth base. Epigenomics, 2011. 3(5): p. 521-3.
7. Hershey, A.D., J. Dixon, and M. Chase, Nucleic acid economy in bacteria infected with bacteriophage T2. I. Purine and pyrimidine composition. J Gen Physiol, 1953. 36(6): p. 777-89.
8. Penn, N.W., et al., The presence of 5-hydroxymethylcytosine in animal deoxyribonucleic acid. Biochem J, 1972. 126(4): p. 781-90.
9. Jin, S.G., et al., Genomic mapping of 5-hydroxymethylcytosine in the human brain. Nucleic Acids Res, 2011. 39(12): p. 5015-24.
10. Mellen, M., et al., MeCP2 binds to 5hmC enriched within active genes and accessible chromatin in the nervous system. Cell, 2012. 151(7): p. 1417-30.
11. Pfeifer, G.P., S. Kadam, and S.G. Jin, 5-hydroxymethylcytosine and its potential roles in development and cancer. Epigenetics Chromatin, 2013. 6(1): p. 10.
12. Hahn, M.A., P.E. Szabo, and G.P. Pfeifer, 5-Hydroxymethylcytosine: a stable or transient DNA modification? Genomics, 2014. 104(5): p. 314-23.
13. Yu, M., et al., Base-resolution analysis of 5-hydroxymethylcytosine in the mammalian genome. Cell, 2012. 149(6): p. 1368-80.
14. Ficz, G., et al., Dynamic regulation of 5-hydroxymethylcytosine in mouse ES cells and during differentiation. Nature, 2011. 473(7347): p. 398-402.
15. Jin, S.G., et al., 5-Hydroxymethylcytosine is strongly depleted in human cancers but its levels do not correlate with IDH1 mutations. Cancer Res, 2011. 71(24): p. 7360-5.
16. Haffner, M.C., et al., Global 5-hydroxymethylcytosine content is significantly reduced in tissue stem/progenitor cell compartments and in human cancers. Oncotarget, 2011. 2(8): p. 627-37.
17. Kudo, Y., et al., Loss of 5-hydroxymethylcytosine is accompanied with malignant cellular transformation. Cancer Sci, 2012. 103(4): p. 670-6.
18. Lian, C.G., et al., Loss of 5-hydroxymethylcytosine is an epigenetic hallmark of melanoma. Cell, 2012. 150(6): p. 1135-46.
19. Kafer, G.R., et al., 5-Hydroxymethylcytosine Marks Sites of DNA Damage and Promotes Genome Stability. Cell Rep, 2016. 14(6): p. 1283-92.
20. Ito, S., et al., Tet proteins can convert 5-methylcytosine to 5-formylcytosine and 5-carboxylcytosine. Science, 2011. 333(6047): p. 1300-3.
21. Song, C.X., et al., Genome-wide profiling of 5-formylcytosine reveals its roles in epigenetic priming. Cell, 2013. 153(3): p. 678-91.
22. Neri, F., et al., Single-Base Resolution Analysis of 5-Formyl and 5-Carboxyl Cytosine Reveals Promoter DNA Methylation Dynamics. Cell Rep, 2015.
23. Raiber, E.A., et al., 5-Formylcytosine alters the structure of the DNA double helix. Nat Struct Mol Biol, 2015. 22(1): p. 44-9.
24. Eleftheriou, M., et al., 5-Carboxylcytosine levels are elevated in human breast cancers and gliomas. Clin Epigenetics, 2015. 7: p. 88.
25. Huang, Y. and A. Rao, Connections between TET proteins and aberrant DNA modification in cancer. Trends Genet, 2014. 30(10): p. 464-74.
26. Cortellino, S., et al., Thymine DNA glycosylase is essential for active DNA demethylation by linked deamination-base excision repair. Cell, 2011. 146(1): p. 67-79.
27. Booth, M.J., et al., Quantitative sequencing of 5-methylcytosine and 5-hydroxymethylcytosine at single-base resolution. Science, 2012. 336(6083): p. 934-7.
28. Stroud, H., et al., 5-Hydroxymethylcytosine is associated with enhancers and gene bodies in human embryonic stem cells. Genome Biol, 2011. 12(6): p. R54.
29. Robertson, A.B., et al., Pull-down of 5-hydroxymethylcytosine DNA using JBP1-coated magnetic beads. Nat Protoc, 2012. 7(2): p. 340-50.
30. Yu, Z., Q. Kong, and B.C. Kone, Aldosterone reprograms promoter methylation to regulate alphaENaC transcription in the collecting duct. Am J Physiol Renal Physiol, 2013. 305(7): p. F1006-13.
31. Kang, K.A., et al., Epigenetic modification of Nrf2 in 5-fluorouracil-resistant colon cancer cells: involvement of TET-dependent DNA demethylation. Cell Death Dis, 2014. 5: p. e1183.
32. Dhliwayo, N., et al., Parp inhibition prevents ten-eleven translocase enzyme activation and hyperglycemia-induced DNA demethylation. Diabetes, 2014. 63(9): p. 3069-76.
33. Park, J.L., et al., Decrease of 5hmC in gastric cancers is associated with TET1 silencing due to with DNA methylation and bivalent histone marks at TET1 CpG island 3'-shore. Oncotarget, 2015. 6(35): p. 37647-62.
34. An, J., et al., Acute loss of TET function results in aggressive myeloid cancer in mice. Nat Commun, 2015. 6: p. 10071.