Identification of differentially methylated genes as diagnostic and prognostic biomarkers of breast cancer
World Journal of Surgical Oncology volume 19, Article number: 29 (2021)
Aberrant DNA methylation is significantly associated with breast cancer.
In this study, we aimed to identify novel methylation biomarkers with clinical value for breast cancer diagnosis and prognosis using a bioinformatics approach. First, differentially methylated DNA patterns were detected in breast cancer samples by comparing publicly available datasets (GSE72245 and GSE88883). Methylation levels of 7 selected methylation biomarkers were also estimated using the online tool UALCAN. Next, we evaluated the diagnostic value of these selected biomarkers in two independent cohorts, as well as in two mixed cohorts, through ROC curve analysis. Finally, the prognostic value of the selected methylation biomarkers in breast cancer was evaluated by Kaplan-Meier plot analysis.
A total of 23 significant differentially methylated sites, corresponding to 9 different genes, were identified in breast cancer datasets. Among the 9 identified genes, ADCY4, CPXM1, DNM3, GNG4, MAST1, mir129-2, PRDM14, and ZNF177 were hypermethylated. Importantly, the individual AUC of each selected methylated gene was greater than 0.9. The AUC for the combined signature of 7 genes (ADCY4, CPXM1, DNM3, GNG4, MAST1, PRDM14, ZNF177) was 0.9998 [95% CI 0.9994–1], and the AUC for the combined signature of 3 genes (MAST1, PRDM14, and ZNF177) was 0.9991 [95% CI 0.9976–1]. Results from additional validation analyses showed that MAST1, PRDM14, and ZNF177 had high sensitivity, specificity, and accuracy for breast cancer diagnosis. Lastly, patient survival analysis revealed that high expression of ADCY4, CPXM1, DNM3, PRDM14, PRKCB, and ZNF177 was significantly associated with better overall survival.
The methylation patterns of MAST1, PRDM14, and ZNF177 may represent new diagnostic biomarkers for breast cancer, while methylation of ADCY4, CPXM1, DNM3, PRDM14, PRKCB, and ZNF177 may hold prognostic potential for breast cancer.
Breast cancer is the most commonly diagnosed cancer and the leading cause of cancer-associated death among women worldwide. Early diagnosis and accurate prognostic assessment of breast cancer are crucial for timely targeted treatment. Accumulating evidence suggests that DNA methylation may play an important role in the development and progression of breast cancer [3,4,5].
DNA methylation consists of the addition of a methyl group to the carbon-5 position of cytosine within a cytosine-guanine (CpG) dinucleotide. This molecular process is critical for several important cellular mechanisms, including embryonic development, regulation of gene expression, X-chromosome inactivation, and genomic imprinting and stability. Aberrant hypo- and hypermethylation patterns of the DNA have been identified as critical players in tumorigenesis, promoting the expression or silencing of oncogenes and tumor suppressor genes, respectively [8,9,10]. Therefore, abnormal DNA methylation, acting as a cancer-related biomarker, could be helpful for early cancer detection and prognosis, as well as for predicting response to cancer treatment.
DNA methylation markers are not currently in use in clinical settings for breast cancer assessment. This is mostly due to a lack of evidence on their clinical value in large cohorts of breast cancer patients [4, 11,12,13]. Indeed, available data on the clinical potential of cancer-specific methylated markers rely on platforms with low genomic coverage, use small sample datasets, or lack appropriate healthy counterparts for comparison.
In the present study, we aimed to evaluate methylation changes specific to breast cancer that could be used as tools in the clinical setting for diagnostic and prognostic assessment of patients. To achieve this goal, we used different bioinformatics approaches to analyze several publicly available methylation datasets of samples collected from cancer patients and healthy counterparts.
Description of breast cancer and control samples
Breast cancer and control samples publicly available at the Gene Expression Omnibus database (GEO, https://www.ncbi.nlm.nih.gov/geo/) were used for the different bioinformatics analyses. Cancer samples were obtained from GSE72308, which includes three sets (GSE72245, N = 118; GSE72251, N = 119; GSE72254, N = 58), as well as from GSE141338 (N = 42), GSE100850 (N = 34), and GSE117439 (N = 52). DNA methylation data from normal tissue samples were used as controls and were obtained from the GSE88883 (N = 100), GSE74214 (N = 18), GSE141338 (N = 6), GSE100850 (N = 5), and GSE101961 (N = 121) datasets. Data from GSE41169 (N = 95) were used as a blood control dataset. Information on all samples is compiled and available in the supplementary information (Fig. 1; Table S1).
Differentially methylated analyses
Data from 118 breast cancer samples (GSE72245) and 50 normal samples (GSE88883) were analyzed with the R package ChAMP, according to a previously described protocol. Probes were removed when the detection p value was above 0.05, or when more than 1% of the dataset contained no information. Briefly, differential methylation analysis was performed at the probe level (lmFit from limma; adjusted p ≤ 1 × 10⁻³⁵; minimum delta beta value of 0.35) or at the region level (bumphunter from minfi; regions represented by at least two probes, L ≥ 2). To minimize the risk of false-positive detection in blood tests, probes methylated in leukocytes were excluded (GSE41169; maximum beta value allowed = 0.1). Differentially methylated probes were limited to those that overlapped differentially methylated regions, were separated by a maximum of 150 bp, and were not located in centromeres or telomeres. Lastly, methylation levels of differentially methylated genes in control and breast cancer samples were plotted using GraphPad Prism software.
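The probe-level filters described above can be illustrated with a minimal Python sketch. This is not the ChAMP/limma pipeline used in the study; it only applies the three stated cutoffs (adjusted p, delta beta, and leukocyte beta) to pre-computed summary values. The `Probe` record, its field names, the probe identifiers, and the toy values are all hypothetical, and the sketch assumes that the delta beta cutoff is applied to the absolute difference of group means and that the leukocyte cutoff is applied to a mean beta value, neither of which is specified in the text.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    # Hypothetical per-probe summary; field names are illustrative only.
    probe_id: str
    mean_beta_tumor: float
    mean_beta_normal: float
    adj_p: float
    mean_beta_leukocyte: float


def passes_filters(probe, min_delta_beta=0.35, max_adj_p=1e-35,
                   max_leukocyte_beta=0.1):
    """Apply the three cutoffs stated in the text to one probe."""
    # Assumption: delta beta is the absolute difference of group means.
    delta_beta = abs(probe.mean_beta_tumor - probe.mean_beta_normal)
    return (delta_beta >= min_delta_beta
            and probe.adj_p <= max_adj_p
            and probe.mean_beta_leukocyte <= max_leukocyte_beta)


# Toy values, not taken from any dataset.
probes = [
    Probe("probe_A", 0.80, 0.10, 1e-40, 0.05),  # passes every cutoff
    Probe("probe_B", 0.40, 0.10, 1e-40, 0.05),  # delta beta too small
    Probe("probe_C", 0.80, 0.10, 1e-20, 0.05),  # adjusted p too large
    Probe("probe_D", 0.80, 0.10, 1e-40, 0.30),  # methylated in leukocytes
]

selected = [p.probe_id for p in probes if passes_filters(p)]
print(selected)
```

The region-overlap, 150 bp, and centromere/telomere criteria are not modeled here because they require genomic coordinates.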
UALCAN database analysis
The UALCAN online tool (http://ualcan.path.uab.edu) is designed to provide easy access to publicly available cancer transcriptome data (TCGA and MET500 transcriptome sequencing), including 793 breast cancer samples and 97 normal samples. It was therefore used to perform a comprehensive analysis of promoter DNA methylation patterns in control and breast cancer samples. In this analysis, the beta value indicates the level of DNA methylation, ranging from 0 (unmethylated) to 1 (fully methylated), and different beta value cutoffs were considered to indicate hypomethylation [beta value 0.3–0.25] or hypermethylation [beta value 0.7–0.5]. Additionally, mRNA expression of the identified genes in breast cancer was analyzed using UALCAN.
Marker discovery analysis
Receiver operating characteristic (ROC) analyses were performed in GSE72251 and GSE88883 with the pROC package in R Bioconductor to establish thresholds, considering normal and adjacent mucosa as the positive outcome and cancer as the negative outcome; only loci showing a threshold below 0.35 were kept. ROC curves were generated, and the area under the curve (AUC) with the binomial exact confidence interval was calculated. For AUC values above 0.9, the differentially methylated gene was deemed able to distinguish between control and breast cancer with excellent specificity and sensitivity. The AUC for the combined epigenetic signature was assessed using a logistic regression model. Each threshold was used to stratify the two mixed cohorts, defining a positive predictive value and negative predictive value for discriminating normal adjacent from tumor tissue. The two mixed cohorts were as follows: mixed cohort 1 included breast cancer (GSE141338, GSE100850, and GSE117439) and control (GSE101961) samples, whereas mixed cohort 2 included breast cancer (GSE72254) and control (GSE74214, GSE141338, and GSE100850) samples.
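As a rough illustration of the quantities named above, the following Python sketch computes an AUC and the predictive values at a fixed beta threshold. It does not reproduce pROC or its confidence intervals. For simplicity it treats cancer as the positive class (the text uses the opposite labeling), assumes that higher beta values indicate cancer, and uses toy beta values and an arbitrary threshold of 0.32; all function names are hypothetical.

```python
def auc(cancer, normal):
    """AUC as the probability that a cancer sample has a higher beta
    value than a normal sample (ties count as one half)."""
    wins = 0.0
    for c in cancer:
        for n in normal:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cancer) * len(normal))


def predictive_values(cancer, normal, threshold):
    """PPV and NPV when samples with beta > threshold are called cancer."""
    tp = sum(1 for b in cancer if b > threshold)
    fn = len(cancer) - tp
    fp = sum(1 for b in normal if b > threshold)
    tn = len(normal) - fp
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# Toy beta values for a single locus, not taken from any dataset.
cancer_beta = [0.62, 0.48, 0.71, 0.30, 0.55]
normal_beta = [0.05, 0.12, 0.33, 0.08]

locus_auc = auc(cancer_beta, normal_beta)
ppv, npv = predictive_values(cancer_beta, normal_beta, threshold=0.32)
print(locus_auc, ppv, npv)
```

The pairwise definition of AUC is used here because it is short and easy to check by hand; it is equivalent to the area under the empirical ROC curve.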
The prognostic value of the selected DNA methylation-driven genes was evaluated with the Kaplan-Meier plotter (http://kmplot.com/analysis/) using data from the mRNA breast cancer database. The median of the gene expression levels was used as the threshold to separate cases with high or low gene expression. p < 0.05 was considered significant.
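The median split and survival estimate described above can be sketched in a few lines of Python. This is only an illustration of the Kaplan-Meier product-limit estimator, not the implementation behind the online tool, and it omits the significance test. The patient values are invented, the function names are hypothetical, and the sketch assumes that cases strictly above the median are labeled "high".

```python
from statistics import median


def split_by_median(expression):
    """Label each case 'high' or 'low' relative to the median expression."""
    cutoff = median(expression)
    # Assumption: values strictly above the median count as 'high'.
    return ["high" if x > cutoff else "low" for x in expression]


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each observed event time.
    events: 1 = death observed, 0 = censored."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all cases sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Invented toy cohort: expression, follow-up in months, event indicator.
expression = [2.0, 8.0, 3.0, 9.0, 1.0, 7.0]
months = [10, 40, 15, 50, 20, 30]
died = [1, 0, 1, 1, 1, 1]

groups = split_by_median(expression)
high = kaplan_meier([t for t, g in zip(months, groups) if g == "high"],
                    [e for e, g in zip(died, groups) if g == "high"])
low = kaplan_meier([t for t, g in zip(months, groups) if g == "low"],
                   [e for e, g in zip(died, groups) if g == "low"])
print(high)
print(low)
```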
Identification of differentially methylated genes
To evaluate the DNA methylation pattern in breast cancer, we started by comparing 118 breast cancer and 50 control samples. We identified 105,143 differentially methylated positions and 8764 regions in breast cancer cases compared to controls (Fig. 2). Next, we filtered these differentially methylated sites as described in the “Methods” section, allowing us to refine our findings to a total of 23 differentially methylated sites. Importantly, these sites were directly linked to the transcription of 9 genes, namely adenylyl cyclase 4 (ADCY4), carboxypeptidase X (CPXM1), dynamin 3 (DNM3), guanine nucleotide binding-protein gamma subunit 4 (GNG4), microtubule associated serine-threonine kinase 1 (MAST1), microRNA 129-2 (mir129-2), PR domain zinc finger protein 14 (PRDM14), protein kinase C beta (PRKCB), and zinc finger protein 177 (ZNF177) (Table S2; Fig. 3; Table 1). All genes, with the exception of PRKCB, had significantly higher levels of DNA methylation in breast cancer samples compared to controls.
To validate the correlation between DNA methylation levels and mRNA expression of the identified genes in breast cancer, we used the online tool UALCAN. As expected, methylation levels of ADCY4, CPXM1, DNM3, GNG4, MAST1, PRDM14, and ZNF177 were increased in breast cancer, all with p values lower than 0.001 (Fig. 4). Information on mir129-2 was not available in UALCAN, so we could not conduct this analysis for it. We then found that mRNA expression of ADCY4, CPXM1, GNG4, and ZNF177 was significantly decreased in breast cancer and that mRNA expression of MAST1 was significantly upregulated, whereas DNM3 mRNA expression did not differ between breast cancer samples and controls (Fig. 5).
Diagnostic potential of differentially methylated genes
ROC analyses were performed in GSE72251 and GSE88883 with the pROC package in R Bioconductor to establish thresholds, and only loci showing a threshold below 0.35 were kept. The AUC values of ADCY4 (0.9471), CPXM1 (0.9856), DNM3 (0.9506), GNG4 (0.9589), MAST1 (0.9950), PRDM14 (0.9883), and ZNF177 (0.9786) were all above 0.9 (Fig. 6). We then validated the diagnostic value of the combined logistic regression model in these two cohorts and found that the AUC for the combined signature of 7 genes (ADCY4, CPXM1, DNM3, GNG4, MAST1, PRDM14, ZNF177) was 0.9998 [95% CI 0.9994–1] (Fig. 7a) and the AUC for the combined signature of 3 genes (MAST1, PRDM14, and ZNF177) was 0.9991 [95% CI 0.9976–1] (Fig. 7b).
Next, each threshold was used to stratify the two mixed cohorts. Our results showed that the breast cancer specificity of each gene ranged from 50.41 to 98.35%, while the sensitivity ranged from 84.25 to 97.64%, and accuracy from 67.82 to 91.13% in mixed cohort 1 (Table 2). Particularly, the specificity, sensitivity, and accuracy of MAST1 were 81.82%, 97.64%, and 89.92%; those of PRDM14 were 97.52%, 84.25%, and 90.73%; and those of ZNF177 were 80.17%, 89.76%, and 85.08%, respectively (Table 2). Results obtained in mixed cohort 2 also followed the same trend, with specificity, sensitivity, and accuracy of MAST1 being 75.86%, 100%, and 91.95%; of PRDM14 being 89.66%, 86.21%, and 87.36%; and of ZNF177 being 89.66%, 93.10%, and 91.95%, respectively (Table 3).
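The relationship between the three reported metrics can be made concrete with a short Python sketch. The confusion-matrix counts below are not taken from the source (Tables 2 and 3 are not reproduced here); they are illustrative values chosen only because their ratios happen to match the percentages reported for MAST1 in mixed cohort 1. The function name is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts,
    with cancer as the positive class."""
    sensitivity = tp / (tp + fn)            # cancer samples called cancer
    specificity = tn / (tn + fp)            # controls called control
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Illustrative counts (an assumption, not data from the study).
sens, spec, acc = diagnostic_metrics(tp=124, fn=3, tn=99, fp=22)
print(f"sensitivity={sens:.2%} specificity={spec:.2%} accuracy={acc:.2%}")
```

Because accuracy is a count-weighted average of sensitivity and specificity, it always lies between the two, as it does for every gene reported above.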
Prognosis analyzed by K-M plotter
To further explore the clinical value of these biomarkers, we evaluated whether 6 of our differentially methylated genes—ADCY4, CPXM1, DNM3, PRDM14, PRKCB, and ZNF177—had any relation with overall survival of breast cancer patients. Hazard ratios of these 6 genes showed significant differences between the high-expression and the low-expression groups, with high expression of all genes being significantly associated with longer overall survival (Fig. 8).
Breast cancer has extremely high mortality worldwide, mostly due to late diagnosis. Cancer-specific DNA methylation patterns are correlated with gene silencing or activation in several types of cancers [19, 20]. Recent studies highlight that aberrant DNA methylation is significantly associated with breast cancer and demonstrated that DNA methylation analysis may help predict the outcome of patients with breast cancer [12, 21]. In this study, we identified differentially methylated genes and confirmed the diagnostic and prognostic value of 6 of these methylation-based biomarkers in breast cancer using a bioinformatics approach.
We first identified 23 differentially methylated CpG sites in breast cancer samples as compared to control counterparts. These 23 sites correspond to 9 genes, 8 of which (ADCY4, CPXM1, DNM3, GNG4, MAST1, mir129-2, PRDM14, and ZNF177) had significantly higher levels of DNA methylation in breast cancer. Similarly, methylation levels in the UALCAN analysis were significantly higher in patients with breast cancer for all of these genes except mir129-2, which could not be assessed. Further analysis revealed the potential of these differentially methylated genes to accurately predict the outcome of patients in training and validation datasets, suggesting that they could be used as biomarkers for breast cancer diagnosis. Additionally, the combination of 7 of these methylation markers significantly improved our ability to predict the outcome of breast cancer patients. Overall, we found that MAST1, PRDM14, and ZNF177 had high sensitivity, specificity, and accuracy for the diagnosis of breast cancer.
Growing evidence shows a strong relationship between epigenetic and genetic aberrations of MAST1, PRDM14, and ZNF177 in tumorigenesis. Previous studies reported that abnormal MAST1 expression is significantly associated with worse cancer prognosis [23, 24]. Oishi et al. found that aberrant promoter demethylation of MAST1 could be responsible for overexpression of this gene in malignant pheochromocytoma and paraganglioma. Other studies have shown that silencing of PRDM14 can suppress the tumorigenicity and metastatic potential of breast cancer cells, while methylation-mediated gene silencing of PRDM14 leads to apoptosis evasion in human papillomavirus-positive cancer cells. Several reports have also shown that methylation of ZNF177 is associated with different types of cancer, including gastric and endometrial cancers, as well as non-small cell lung carcinoma (NSCLC). ZNF177 is methylation-silenced in gastric cancer cell lines, whereas methylation of its promoter is a frequent epigenetic event in endometrial cancer. Indeed, ROC analysis of ZNF177 has demonstrated that it can identify endometrial carcinoma cases with a sensitivity, specificity, and accuracy of 92.3%, 94.4%, and 95.1%, respectively. Furthermore, hypermethylated CpG islands within ZNF177 were selected as candidate biomarkers for further validation in NSCLC. Nakakido et al. demonstrated that ZNF177 is overexpressed in breast cancer and plays a critical role in cancer cell proliferation. However, the role of MAST1, PRDM14, and ZNF177 in the diagnosis and prognosis of breast cancer remains unclear.
Our findings add a new layer of evidence to the epigenetic landscape of breast cancer, providing convincing clues that MAST1, PRDM14, and ZNF177 are differentially methylated in breast cancer and that they may serve as potential drivers and biomarkers for breast cancer. Furthermore, our study demonstrates that high expression of ADCY4, CPXM1, DNM3, PRDM14, PRKCB, and ZNF177 is significantly associated with longer patient survival. This finding supports the hypothesis that methylation-driven genes are likely to be associated with clinical outcomes in cancer and can be used as potential biomarkers for predicting the outcome of breast cancer patients.
In summary, we have identified and independently validated abnormal DNA methylation patterns in MAST1, PRDM14, and ZNF177 as potential biomarkers for breast cancer diagnosis. Moreover, we showed that the DNA methylation landscape of ADCY4, CPXM1, DNM3, PRDM14, PRKCB, and ZNF177 could provide accurate biomarkers for the prognosis of breast cancer. Overall, these findings provide a novel epigenetic predictive model that may help improve the diagnosis and prognosis of breast cancer.
Availability of data and materials
The datasets used during the present study are available from the corresponding author upon reasonable request.
Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2018;68:394–424.
Goldhirsch A, Winer EP, Coates AS, Gelber RD, Piccart-Gebhart M, Thürlimann B, Senn HJ. Personalizing the treatment of women with early breast cancer: highlights of the St Gallen International Expert Consensus on the Primary Therapy of Early Breast Cancer 2013. Ann Oncol. 2013;24:2206–23.
Tang Q, Cheng J, Cao X, Surowy H, Burwinkel B. Blood-based DNA methylation as biomarker for breast cancer: a systematic review. Clin Epigenetics. 2016;8:115.
Chen X, Zhang J, Dai X. DNA methylation profiles capturing breast cancer heterogeneity. BMC Genomics. 2019;20:823.
Downs BM, Mercado-Rodriguez C, Cimino-Mathews A, Chen C, Yuan JP, Van Den Berg E, Cope LM, Schmitt F, Tse GM, Ali SZ, et al. DNA methylation markers for breast cancer detection in the developing world. Clin Cancer Res. 2019;25:6357–67.
Koch A, Joosten SC, Feng Z, de Ruijter TC, Draht MX, Melotte V, Smits KM, Veeck J, Herman JG, Van Neste L, et al. Analysis of DNA methylation in cancer: location revisited. Nat Rev Clin Oncol. 2018;15:459–66.
Mahmood N, Rabbani SA. Targeting DNA hypomethylation in malignancy by epigenetic therapies. Adv Exp Med Biol. 2019;1164:179–96.
Klutstein M, Nejman D, Greenfield R, Cedar H. DNA methylation in cancer and aging. Cancer Res. 2016;76:3446–50.
Picardo F, Romanelli A, Muinelo-Romay L, Mazza T, Fusilli C, Parrella P, Barbazán J, Lopez-López R, Barbano R, De Robertis M, et al. Diagnostic and prognostic value of B4GALT1 hypermethylation and its clinical significance as a novel circulating cell-free DNA biomarker in colorectal cancer. Cancers (Basel). 2019;11:1598.
Dong S, Li W, Wang L, Hu J, Song Y, Zhang B, Ren X, Ji S, Li J, Xu P, et al. Histone-related genes are hypermethylated in lung cancer and hypermethylated HIST1H4F could serve as a pan-cancer biomarker. Cancer Res. 2019;79:6101–12.
Zhang S, Wang Y, Gu Y, Zhu J, Ci C, Guo Z, Chen C, Wei Y, Lv W, Liu H, et al. Specific breast cancer prognosis-subtype distinctions based on DNA methylation patterns. Mol Oncol. 2018;12:1047–60.
de Ruijter TC, van der Heide F, Smits KM, Aarts MJ, van Engeland M, Heijnen VCG. Prognostic DNA methylation markers for hormone receptor breast cancer: a systematic review. Breast Cancer Res. 2020;22:13.
Moss J, Zick A, Grinshpun A, Carmon E, Maoz M, Ochana BL, Abraham O, Arieli O, Germansky L, Meir K, et al. Circulating breast-derived DNA allows universal detection and monitoring of localized breast cancer. Ann Oncol. 2020;31:395–403.
Sangtani A, Wang C, Weaver A, Hoppman NL, Kerr SE, Abyzov A, Shridhar V, Staub J, Kocher JA, Voss JS, et al. Combining copy number, methylation markers, and mutations as a panel for endometrial cancer detection via intravaginal tampon collection. Gynecol Oncol. 2020;156:387–92.
Barault L, Amatu A, Siravegna G, Ponzetti A, Moran S, Cassingena A, Mussolin B, Falcomata C, Binder AM, Cristiano C, et al. Discovery of methylated circulating DNA biomarkers for comprehensive non-invasive monitoring of treatment response in metastatic colorectal cancer. Gut. 2018;67:1995–2005.
Chandrashekar DS, Bashel B, Balasubramanya SAH, Creighton CJ, Ponce-Rodriguez I, Chakravarthi B, Varambally S. UALCAN: a portal for facilitating tumor subgroup gene expression and survival analyses. Neoplasia. 2017;19:649–58.
Diaz-Lagares A, Mendez-Gonzalez J, Hervas D, Saigi M, Pajares MJ, Garcia D, Crujerias AB, Pio R, Montuenga LM, Zulueta J, et al. A novel epigenetic signature for early diagnosis in lung cancer. Clin Cancer Res. 2016;22:3361–71.
Lv L, Cao L, Hu G, Shen Q, Wu J. Methylation-driven genes identified as novel prognostic indicators for thyroid carcinoma. Front Genet. 2020;11:294.
de Almeida BP, Apolónio JD, Binnie A, Castelo-Branco P. Roadmap of DNA methylation in breast cancer identifies novel prognostic biomarkers. BMC Cancer. 2019;19(1):219.
Sun X, Tian Y, Wang J, Sun Z, Zhu Y. Genome-wide analysis reveals the association between alternative splicing and DNA methylation across human solid tumors. BMC Med Genet. 2020;13:4.
de Ruijter TC, Smits KM, Aarts MJ, van Hellemond IEG, Van Neste L, de Vries B, Peer PGM, Veeck J, van Engeland M, Tjan-Heijnen VCG. The trans-DATA study: aims and design of a translational breast cancer prognostic marker identification study. Diagn Progn Res. 2019;3:20.
Sybirna A, Tang WWC, Pierson Smela M, Dietmann S, Gruhn WH, Brosh R, Surani MA. A critical role of PRDM14 in human primordial germ cell fate revealed by inducible degrons. Nat Commun. 2020;11:1282.
Jin L, Chun J, Pan C, Li D, Lin R, Alesi GN, Wang X, Kang HB, Song L, Wang D, et al. MAST1 drives cisplatin resistance in human cancers by rewiring cRaf-independent MEK activation. Cancer Cell. 2018;34:315–30 e317.
Yu X, Sheng P, Sun J, Zhao X, Zhang J, Li Y, Zhang Y, Zhang W, Wang J, Liu K, et al. The circular RNA circMAST1 promotes hepatocellular carcinoma cell proliferation and migration by sponging miR-1299 and regulating CTNND1 expression. Cell Death Dis. 2020;11:340.
Oishi T, Iino K, Okawa Y, Kakizawa K, Matsunari S, Yamashita M, Taniguchi T, Maekawa M, Suda T, Oki Y. DNA methylation analysis in malignant pheochromocytoma and paraganglioma. J Clin Transl Endocrinol. 2017;7:12–20.
Taniguchi H, Imai K. Silencing PRDM14 via oligonucleotide therapeutics suppresses tumorigenicity and metastasis of breast cancer. Methods Mol Biol. 2019;1974:233–43.
Snellenberg S, Cillessen SA, Van Criekinge W, Bosch L, Meijer CJ, Snijders PJ, Steenbergen RD. Methylation-mediated repression of PRDM14 contributes to apoptosis evasion in HPV-positive cancers. Carcinogenesis. 2014;35:2611–8.
Chen YC, Tsao CM, Kuo CC, Yu MH, Lin YW, Yang CY, Li HJ, Yan MD, Wang TJ, Chou YC, Su HY. Quantitative DNA methylation analysis of selected genes in endometrial carcinogenesis. Taiwan J Obstet Gynecol. 2015;54:572–9.
Nakakido M, Tamura K, Chung S, Ueda K, Fujii R, Kiyotani K, Nakamura Y. Phosphatidylinositol glycan anchor biosynthesis, class X containing complex promotes cancer cell proliferation through suppression of EHD2 and ZIC1, putative tumor suppressors. Int J Oncol. 2016;49:868–76.
This study was supported in part by the Project of Application on Public Welfare Technology in Zhejiang Province (No. LGF18H160022), the Major Project for Medical and Health Science and Technology of Zhejiang Province (No. WKJ-ZJ-1716), and the General Project Funds from the Health Department of Zhejiang Province (No. 2016KYB026).
The authors declare no conflict of interest.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Mao, Xh., Ye, Q., Zhang, Gb. et al. Identification of differentially methylated genes as diagnostic and prognostic biomarkers of breast cancer. World J Surg Onc 19, 29 (2021). https://doi.org/10.1186/s12957-021-02124-6