Gene coexpression network approach to develop an immune prognostic model for pancreatic adenocarcinoma
World Journal of Surgical Oncology volume 19, Article number: 112 (2021)
Pancreatic adenocarcinoma (PAAD) is a nonimmunogenic tumor, and very little is known about the relationship between the host immune response and patient survival. We aimed to develop an immune prognostic model (IPM) and analyze its relevance to the tumor immune profiles of patients with PAAD.
We investigated differentially expressed genes between tumor and normal tissues in the TCGA PAAD cohort. Immune-related genes were screened from highly variably expressed genes with weighted gene correlation network analysis (WGCNA) to construct an IPM. Then, the influence of IPM on the PAAD immune profile was comprehensively analyzed.
A total of 4902 genes highly variably expressed among primary tumors were used to construct a weighted gene coexpression network. One hundred seventy-five hub genes in the immune-related module were used for machine learning. Then, we established an IPM with four core genes (FCGR2B, IL10RA, and HLA-DRA) to evaluate the prognosis. The risk score predicted by IPM was an independent prognostic factor and had a high predictive value for the prognosis of patients with PAAD. Moreover, we found that the patients in the low-risk group had higher cytolytic activity and lower innate anti-PD-1 resistance (IPRES) signatures than patients in the high-risk group.
Unlike the traditional methods that use immune-related genes listed in public databases to screen prognostic genes, we constructed an IPM through WGCNA to predict the prognosis of PAAD patients. In addition, an IPM prediction of low risk indicated enhanced immune activity and a decreased anti-PD-1 therapeutic response.
Pancreatic adenocarcinoma (PAAD) is historically one of the deadliest malignancies; the 5-year survival rate is less than 9% for patients with all stages combined . Furthermore, the incidence of PAAD is rapidly increasing worldwide and is predicted to become the second leading cause of cancer-related death in the future, behind only lung cancer . Treatment options for advanced PAAD are limited to gemcitabine-based chemotherapy and are hampered by ineffective clinical outcomes.
Immune surveillance is the first filter to identify and eliminate aberrant or malignant cells , and PAAD is regarded as a nonimmunogenic tumor . Hence, most tumor cells avoid recognition by host immune cells, promoting PAAD progression. Although several immune-related parameters, such as CD8+ T cell infiltration and PD-L1 expression, have been reported to predict the prognosis of patients with PAAD , immune prognostic models (IPMs) have not yet been developed for PAAD.
Weighted gene correlation network analysis (WGCNA) is a powerful approach in network modeling that is useful for identifying cluster modules of highly related genes, summarizing these clusters with module eigengenes (MEs) or intramodular hub genes, and relating modules to one another and to external sample traits . Currently, WGCNA has been widely used in various cancer-related studies to identify candidate biomarkers [7, 8].
In the present study, we first built the WGCNA network using primary tumors from 177 patients in The Cancer Genome Atlas (TCGA) database. The WGCNA network was constructed by highly variably expressed among tumors, and immune scores calculated using the “ESTIMATE” package  were used to identify immune-related modules. By selecting hub genes in these immune-related modules, this study developed an IPM that may identify low-risk patients. Importantly, our investigation indicated that low-risk patients predicted by the IPM present higher immune activity and might experience a greater benefit from immunotherapies than high-risk patients.
Materials and methods
The RNA high-throughput sequencing (RNA-HTSeq) read count data and clinical information from patients with PAAD (n=177) were obtained from the GDC Data Portal (https://portal.gdc.cancer.gov/). Normal tissue (n=171) expression was obtained from the GTEx portal (https://www.gtexportal.org/).
The following pre-processing method was used: (1) removing genes with counts ≤ 1 in all samples and (2) normalizing expression data by variance stabilizing transformation (VST). Highly variably expressed genes (HVGS) were analyzed in R (version: 3.6.3), and a cutoff of top quartile (>25%) are considered hypervariable.
The gene coexpression networks were constructed using the WGCNA package . A coexpression network was constructed for all differentially expressed genes, and the Pearson correlation coefficients were calculated between all genes. A β (soft-thresholding power) parameter was determined based on the scale-free topology of the network to reconstruct the network with strongly correlated genes and eliminate weakly correlated genes.
Evaluation of the levels of immune cell infiltration and tumor purity
The levels of immune cell infiltration (immune score) and tumor purity for each PAAD sample were calculated using the “estimateScore” function in the “ESTIMATE” package . This method is based on the single sample gene set enrichment analysis (ssGSEA) algorithm .
Construction of an immune-related prognostic model
The expression profiles of 2991 genes from patients in TCGA PAAD cohort with survival information were analyzed using a univariate Cox regression analysis. Ten variables with both a log-rank P≤0.2 and likelihood P≤0.2 in the univariate analysis were entered into the primary IPM. Next, the “step” function in the survival “package” with the option direction “both” compares improvements in the Akaike information criterion (AIC) achieved by systematically deleting each candidate variable and adding each candidate variable between the upper and lower bound regressor sets supplied from the current model and by deleting or adding the one variable with best improvement in the AIC (smallest AIC).
Analysis of the immune-related signature
Gene sets for cytolytic activity (granzyme-A, perforin-1)  and activated CD8 T cells  were used in previous studies. TCGA transcriptomic profiling data were obtained from the GDC Data Portal (https://portal.gdc.cancer.gov/). The expression of each target gene was normalized by TPM, and the immune signatures were measured as the geometric mean of gene expression reported as the log2 value of TPM+1.
The innate anti-PD-1 resistance (IPRES) signatures were calculated based on the average Z score across all gene sets associated with tumor metastasis, as described in a published study .
The association between the risk score and survival outcome
Kaplan-Meier survival curves were used to show differences in survival, and the log-rank test was used to evaluate the significance of differences in survival times. We performed survival analyses using the R programming function “survfit” in the “survival” package.
All statistical analyses were performed using R version 3.6.3 software (Institute for Statistics and Mathematics, Vienna, Austria; www.r-project.org). The Wilcoxon test was used to compare two continuous variables. Survival was analyzed by constructing a Kaplan-Meier survival plot, and the log-rank test calculated the P value. P < 0.05 was considered statistically significant.
Construction of the WGCNA
The coexpression network was constructed using the expression of 4902 HVGs in 177 PAAD samples. For building signed networks with WGCNA, we choose β=12 to compute gene co-expression similarity, resulting in a scale-free topology index (R2) of 0.93 (Fig. 1a). Eighteen coexpression modules were constructed and are shown in different colors (Fig. 1b). MEs were examined and the dynamic modules were merged into 11 modules with a threshold of 0.25 to increase the reliability of the module divisions (Fig. 1b).
Identification of module-trait relationships and hub genes
We further wanted to determine which module is more closely related to immune profiles after identifying the MEs. As described in a previous study, immune estimation approaches should be tested to determine whether they display a negative correlation with tumor purity . Hence, we calculated the immune score and tumor purity for each PAAD sample to screen immune-related modules (see the “Materials and methods” section). Figure 1c shows the module-trait heat map. The pink module exhibited the highest correlation with the immune score (R = 0.98, P < 0.001) and a significant negative correlation with tumor purity (R = −0.95, P < 0.001). Therefore, the pink module containing 640 genes was identified as the module with immune significance.
Hub genes are highly interconnected with nodes in a module and have been shown to be functionally significant. In the present study, hub genes were defined by module connectivity, as measured by the absolute value of Pearson’s correlation coefficient (module membership (MM) > 0.8), and by trait relationship, which was measured by the absolute value of Pearson’s correlation coefficient (gene significance (GS) > 0.8). Eventually, 175 highly connective genes in the pink module were selected as hub genes (Table S1). Moreover, GS and MM were highly correlated in the pink module (Fig. 1d), implying that hub genes of the pink module also tend to be highly correlated with immune profiles.
Functional enrichment analysis of hub genes in the pink module
Before developing an IPM for pancreatic cancer, we must verify whether the hub genes are related to immunity. Therefore, we performed an enrichment analysis of Gene Ontology (GO) terms associated with the hub genes in the pink module using the R programming function “enrichGO” in the “clusterProfiler” package . The top 20 clustering groups obtained in each root category (biological process (BP), cellular component (CC), and molecular function (MF)) in the GO enrichment analysis are shown separately in Fig. 2. Hub genes in the pink module were highly enriched in pathways closely related to regulation of lymphocyte activation (Fig. 2a), MHC protein complexes (Fig. 2b), and cytokine receptor activity (Fig. 2c). Taken together, we identified an immune-related gene set in PAAD based on WGCNA.
Construction of an IPM and evaluation of its predictive ability
The univariate Cox regression analysis revealed that 10 of these hub genes in the pink module were probably related to the overall survival (OS) of patients with PAAD (Table S2). A primary IPM was constructed by performing a multivariate Cox regression analysis of the expression levels of 10 immune-related genes. The optimized IPM was chosen by determining the maximum AIC in a stepwise regression analysis in both directions using the R package (see the “Materials and methods” section). Three core genes (FCGR2B, IL10RA, and HLA-DRA) were included in the optimized IPM, and the covariates IL10RA (hazard ratio (HR) 0.36; 95% CI 0.24−0.52, P < 0.001), FCGR2B (HR 1.46; 95% CI 1.03−2.06, P = 0.033), and HLA-DRA (HR 1.61; 95% CI 1.12−2.32, P = 0.01) were significant (Figure S1A). The risk score for each sample was calculated utilizing the regression coefficients derived from the multivariate Cox regression analysis to multiply the normalized expression level of the 3 core genes. As shown in Figure S1B-D, the expression levels of FCGR2B, IL10RA, and HLA-DRA in tumor samples (n=177) were significantly higher (q value < 0.001; log2FoldChange > 1) than in normal pancreatic tissue (n=171; from GTEx database).
As risk scores increased, the patient mortality risk increased and the OS decreased (Fig. 3a, b). The cut-off point (0.96) was set as the median risk score to assign patients into high- and low-risk groups. As shown in Fig. 3c, low-risk patients with PAAD (n=89) had a significantly longer OS than high-risk patients with PAAD (n=88) (log-rank test, P < 0.001). Furthermore, the risk score was independently significant and the most valuable prognostic factor in TCGA PAAD cohort (HR 1.6; 95% CI 1.33–2.0, P < 0.001; Fig. 3d).
A low risk score indicated increased immune activity and reduced features of the response to anti-PD-1 therapy
Since the prognostic model was constructed based on immune-related genes, we wanted to further explore whether patients with different risk scores have different immune profiles and subsequently differences in sensitivity to immunotherapies. The status of tumor-infiltrating lymphocytes (TILs), particularly CD8+ T cells, has been verified as the core determinant of immune checkpoint inhibitor (ICI) treatment efficacy [16, 17]. Low-risk patients with PAAD presented significantly greater numbers of activated CD8+ T cells than high-risk patients with PAAD (Wilcoxon test, P < 0.001; Fig. 4a). Additionally, cytolytic T cell activity (CYT value), which predicts the immunotherapy response [11, 18], was also higher in low-risk patients with PAAD than in high-risk patients with PAAD (Wilcoxon test, P < 0.001; Fig. 4b).
On the other hand, an attenuation of the biological processes that underlie IPRES signatures may improve the response of various cancer types to anti-PD-1 therapy, including PAAD . Interestingly, as shown in Fig. 4 c, d , multiple IPRES signatures, including the mesenchymal transition, angiogenesis, and hypoxia, were significantly downregulated in the low-risk group. Collectively, a low-risk status is associated with increased immune activity and the IPM prediction of a low-risk status may serve as an indicator of the response of PAAD to ICIs.
Previous studies have noted the strong relationship between the host immune response and survival of patients with cancer [19,20,21]. Researchers frequently focus on using well-established immune-related gene sets to screen prognostic genes. However, a number of fundamental questions and problems remain to be solved; one of these problems is that this method ignores differences in the cancer types. Compared with other cancers, PAAD has unique immunological conditions with a dense stromal environment and a highly immunosuppressive tumor microenvironment , implying the potential for specific transcriptomic features of pancreatic cancer. Therefore, researchers have not been able to easily determine whether the included immune-related gene set accurately represents the characteristics of PAAD. The WGCNA efficiently identifies modular immune-related genes in specific cancer types to address this problem. This technology focuses on the relationship between gene modules and disease traits instead of linking thousands of genes to the disease. Consequently, hidden key genes associated with a particular biological process in cancer can be discovered using WGCNA.
In the present study, we constructed a prognostic model for PAAD using four immune-related genes (FCGR2B, IL10RA, and HLA-DRA) identified by WGCNA and machine learning. To the best of our knowledge, the genes that constitute our IPM have not yet been reported to be associated with the prognosis of patients with PAAD. FCGR2B, an inhibitory Fcgamma receptor , has been proven to be a prognostic factor for patients with glioblastoma . The prognostic values of IL10RA and HLA-DRA have been considered even lower in the literature. However, in the multivariate Cox analysis, the covariates IL10RA and HLA-DRA remained significant in PAAD. Additionally, patients classified into the high-risk group had significantly worse outcomes, and the ROC analysis verified that the risk score had good prognostic accuracy. Based on these results, the IPM was successfully established to assess the prognosis of patients with PAAD.
In the era of immunotherapy, pancreatic cancer remains an incurable disease because of an immunosuppressive and anti-inflammatory environment that results in the escape of cancer cells from such therapies . Nevertheless, some advances have been achieved using therapy targeting PD-1 and its ligand PD-L1 [26, 27], suggesting that some patients with PAAD indeed benefit from ICIs. To date, the microsatellite instability (MSI) or mismatch repair deficiency (dMMR) status remains the only clear marker of a benefit from PD-1 blockade in patients with PAAD. Unfortunately, probably less than 1% of patients with PAAD have a high MSI (MSI-H) or dMMR , implying a pressing need to identify potential predictive therapeutic biomarkers that will help patients derive greater benefits from ICIs. Perhaps, the most unexpected finding in our study is that low-risk PAAD is associated with increased cytolytic cell infiltration, indicating an inflammatory phenotype. This type of PAAD suggests the presence of a pre-existing anticancer immune response , and the majority of patients with inflamed tumors will show an evident clinical response to anti-PD-1/PD-L1 monotherapy . However, the immune infiltrate of pancreatic cancer is a part of the tumor microenvironment. IPRES, a transcriptomic signature associated with increased neoplasm invasion and angiogenesis, is associated with resistance to PD-1 blockade . PAAD is a cancer type that contains IPRES-enriched transcriptomic subsets involved in the majority of tumors . Interestingly, based on our data, the IPRES signatures in low-risk patients with PAAD were significantly reduced compared to high-risk patients. Thus, low-risk patients may benefit from ICIs.
This study has several limitations. First, an external validation cohort is needed to verify the validity of the IPM prediction. Second, without data from an ICI-treated cohort, we were unable to evaluate whether patients with low-risk PAAD exhibit a better response to ICIs than high-risk patients. Therefore, the study findings should be interpreted with caution, and further studies are warranted.
In summary, we identified and validated an IPM based on 3 immune-related genes with independent prognostic significance for patients with PAAD. In addition, the present study proposed that the risk score may be an immunotherapeutic marker for PAAD.
Availability of data and materials
All data used in the manuscript were extracted from the public database indicated in the manuscript.
Immune prognostic model
Innate anti-PD-1 resistance
Weighted gene correlation network analysis
The Cancer Genome Atlas
Akaike information criterion
Receiver operating characteristic
Immune checkpoint inhibitor
Mismatch repair deficiency
Seufferlein T, Mayerle J. Pancreatic cancer in 2015: Precision medicine in pancreatic cancer--fact or fiction? Nat Rev Gastroenterol Hepatol. 2016;13(2):74–5. https://doi.org/10.1038/nrgastro.2015.215.
Ansari D, Tingstedt B, Andersson B, Holmquist F, Sturesson C, Williamsson C, et al. Pancreatic cancer: yesterday, today and tomorrow. Future Oncol. 2016;12(16):1929–46. https://doi.org/10.2217/fon-2016-0010.
Burnet FM. The concept of immunological surveillance. Prog Exp Tumor Res. 1970;13:1–27. https://doi.org/10.1159/000386035.
Lutz ER, Wu AA, Bigelow E, Sharma R, Mo G, Soares K, et al. Immunotherapy converts nonimmunogenic pancreatic tumors into immunogenic foci of immune regulation. Cancer Immunol Res. 2014;2(7):616–31. https://doi.org/10.1158/2326-6066.CIR-14-0027.
Hou YC, Chao YJ, Hsieh MH, et al. Low CD8(+) T cell infiltration and high PD-L1 expression are associated with level of CD44(+)/CD133(+) cancer stem cells and predict an unfavorable prognosis in pancreatic cancer. Cancers (Basel). 2019;11(4).
Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 2008;9(1):559. https://doi.org/10.1186/1471-2105-9-559.
Wang H, Liu J, Li J, Zang D, Wang X, Chen Y, et al. Identification of gene modules and hub genes in colon adenocarcinoma associated with pathological stage based on WGCNA analysis. Cancer Gene Ther. 2020;242:1–7. https://doi.org/10.1016/j.cancergen.2020.01.052.
Adhami M, MotieGhader H, Haghdoost AA, Afshar RM, Sadeghi B. Gene co-expression network approach for predicting prognostic microRNA biomarkers in different subtypes of breast cancer. Genomics. 2020;112(1):135–43. https://doi.org/10.1016/j.ygeno.2019.01.010.
Yoshihara K, Shahmoradgoli M, Martinez E, et al. Inferring tumour purity and stromal and immune cell admixture from expression data. Nat Commun. 2013;4(1):2612. https://doi.org/10.1038/ncomms3612.
Barbie DA, Tamayo P, Boehm JS, Kim SY, Moody SE, Dunn IF, et al. Systematic RNA interference reveals that oncogenic KRAS-driven cancers require TBK1. Nature. 2009;462(7269):108–12. https://doi.org/10.1038/nature08460.
Rooney MS, Shukla SA, Wu CJ, Getz G, Hacohen N. Molecular and genetic properties of tumors associated with local immune cytolytic activity. Cell. 2015;160(1-2):48–61. https://doi.org/10.1016/j.cell.2014.12.033.
Charoentong P, Finotello F, Angelova M, Mayer C, Efremova M, Rieder D, et al. Pan-cancer immunogenomic analyses reveal genotype-immunophenotype relationships and predictors of response to checkpoint blockade. Cell Rep. 2017;18(1):248–62. https://doi.org/10.1016/j.celrep.2016.12.019.
Hugo W, Zaretsky JM, Sun L, Song C, Moreno BH, Hu-Lieskovan S, et al. Genomic and transcriptomic features of response to anti-PD-1 therapy in metastatic melanoma. Cell. 2016;165(1):35–44. https://doi.org/10.1016/j.cell.2016.02.065.
Rosenthal R, Cadieux EL, Salgado R, et al. Neoantigen-directed immune escape in lung cancer evolution. Nature. 2019;567(7749):479–85. https://doi.org/10.1038/s41586-019-1032-7.
Yu G, Wang LG, Han Y, et al. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012;16(5):284–7. https://doi.org/10.1089/omi.2011.0118.
Zhou J, Wang G, Chen Y, Wang H, Hua Y, Cai Z. Immunogenic cell death in cancer therapy: Present and emerging inducers. J Cell Mol Med. 2019;23(8):4854–65. https://doi.org/10.1111/jcmm.14356.
Yamamoto Y, Iwahori K, Funaki S, Matsumoto M, Hirata M, Yoshida T, et al. Immunotherapeutic potential of CD4 and CD8 single-positive T cells in thymic epithelial tumors. Sci Rep. 2020;10(1):4064. https://doi.org/10.1038/s41598-020-61053-8.
Balli D, Rech AJ, Stanger BZ, Vonderheide RH. Immune cytolytic activity stratifies molecular subsets of human pancreatic cancer. Clin Cancer Res. 2017;23(12):3129–38. https://doi.org/10.1158/1078-0432.CCR-16-2128.
Hartmann WM, Rakerd B. Localization of sound in rooms. IV: The Franssen effect. J Acoust Soc Am. 1989;86(4):1366–73. https://doi.org/10.1121/1.398696.
Cai SW, Yang SZ, Gao J, Pan K, Chen JY, Wang YL, et al. Prognostic significance of mast cell count following curative resection for pancreatic ductal adenocarcinoma. Surgery. 2011;149(4):576–84. https://doi.org/10.1016/j.surg.2010.10.009.
Tahkola K, Leppanen J, Ahtiainen M, et al. Immune cell score in pancreatic cancer-comparison of hotspot and whole-section techniques. Virchows Arch. 2019;474(6):691–9. https://doi.org/10.1007/s00428-019-02549-1.
Jiang J, Zhou H, Ni C, Hu X, Mou Y, Huang D, et al. Immunotherapy in pancreatic cancer: New hope or mission impossible? Cancer Lett. 2019;445:57–64. https://doi.org/10.1016/j.canlet.2018.10.045.
Sharp PE, Martin-Ramirez J, Boross P, et al. Increased incidence of anti-GBM disease in Fcgamma receptor 2b deficient mice, but not mice with conditional deletion of Fcgr2b on either B cells or myeloid cells alone. Mol Immunol. 2012;50(1-2):49–56. https://doi.org/10.1016/j.molimm.2011.12.007.
Cheng W, Ren X, Zhang C, Cai J, Liu Y, Han S, et al. Bioinformatic profiling identifies an immune-related risk signature for glioblastoma. Neurology. 2016;86(24):2226–34. https://doi.org/10.1212/WNL.0000000000002770.
Sideras K, Braat H, Kwekkeboom J, van Eijck CH, Peppelenbosch MP, Sleijfer S, et al. Role of the immune system in pancreatic cancer progression and immune modulating treatment strategies. Cancer Treat Rev. 2014;40(4):513–22. https://doi.org/10.1016/j.ctrv.2013.11.005.
Brahmer JR, Tykodi SS, Chow LQ, et al. Safety and activity of anti-PD-L1 antibody in patients with advanced cancer. N Engl J Med. 2012;366(26):2455–65. https://doi.org/10.1056/NEJMoa1200694.
Herbst RS, Soria JC, Kowanetz M, Fine GD, Hamid O, Gordon MS, et al. Predictive correlates of response to the anti-PD-L1 antibody MPDL3280A in cancer patients. Nature. 2014;515(7528):563–7. https://doi.org/10.1038/nature14011.
Sahin IH, Askan G, Hu ZI, O’Reilly EM. Immunotherapy in pancreatic ductal adenocarcinoma: an emerging entity? Ann Oncol. 2017;28(12):2950–61. https://doi.org/10.1093/annonc/mdx503.
Chen DS, Mellman I. Elements of cancer immunity and the cancer-immune set point. Nature. 2017;541(7637):321–30. https://doi.org/10.1038/nature21349.
Rosenberg JE, Hoffman-Censits J, Powles T, van der Heijden MS, Balar AV, Necchi A, et al. Atezolizumab in patients with locally advanced and metastatic urothelial carcinoma who have progressed following treatment with platinum-based chemotherapy: a single-arm, multicentre, phase 2 trial. Lancet. 2016;387(10031):1909–20. https://doi.org/10.1016/S0140-6736(16)00561-4.
This study was supported by Shanghai special disease clinical “five innovation” transformation project (16CR3039A). The funders played no roles in the study design, decision to publish, or preparation of the manuscript.
Ethics approval and consent to participate
Consent for publication
No conflicts of interest exist in the submission of this manuscript, and the manuscript has been approved by all authors for publication.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
. Evaluation of predicted outcomes. Forest plot of the multivariate Cox regression proportional hazards regression analysis of OS in TCGA PAAD cohort (A). FCGR2B (B), IL10RA (C), and HLA-DRA (B) expression between tumor and normal tissue.
. Hub genes in the pink module.
. Ten genes with both a log-rank P ≤ 0.2 and likelihood P ≤ 0.2 in the univariate analysis.
About this article
Cite this article
Gu, X., Zhang, Q., Wu, X. et al. Gene coexpression network approach to develop an immune prognostic model for pancreatic adenocarcinoma. World J Surg Onc 19, 112 (2021). https://doi.org/10.1186/s12957-021-02201-w