- Open Access
Identification of key lncRNAs in colorectal cancer progression based on associated protein–protein interaction analysis
World Journal of Surgical Oncology volume 15, Article number: 153 (2017)
Colorectal cancer (CRC) is one of the most commonly diagnosed malignancies. The molecular mechanisms involved in the progression of CRC remain unclear. Accumulating evidence shows that long noncoding RNAs (lncRNAs) play key roles in tumorigenesis, cancer progression, and metastasis. Therefore, we aimed to explore the roles of lncRNAs in the progression of CRC.
In this study, we aimed to identify differentially expressed lncRNAs and messenger RNAs (mRNAs) in CRC by analyzing a previously published dataset, GSE64857. GO and KEGG pathway analyses were applied to gain insight into the functions of those lncRNAs and mRNAs in CRC.
In total, 46 lncRNAs were identified for the first time as differentially expressed between stage II and stage III CRC by microarray screening. GO and KEGG pathway analyses showed that differentially expressed lncRNAs were involved in regulating signal transduction, cell adhesion, cell differentiation, focal adhesion, and cell adhesion molecules.
We found three lncRNAs (LOC100129973, PGM5-AS1, and TTTY10) that were widely co-expressed with differentially expressed mRNAs. We also constructed lncRNA-associated PPI networks in CRC and found that these lncRNAs may be associated with CRC progression. Moreover, we found that high PGM5-AS1 expression levels were associated with worse overall survival in CRC. We believe that this study provides novel potential therapeutic and prognostic targets for CRC.
As the third most commonly diagnosed malignancy in most parts of the world, colorectal cancer (CRC) causes more than 600,000 deaths, approximately 8% of all cancer deaths. Owing to the lack of oncogenesis-associated molecular biomarkers, the overall survival of CRC patients has not improved markedly despite considerable progress in clinical treatment. Thus, to develop novel treatments for CRC, a clearer understanding of the molecular mechanisms underlying its development and progression is urgently needed, and new diagnostic and prognostic biomarkers must be identified.
Long noncoding RNAs (lncRNAs) are defined as endogenous cellular RNAs that are longer than 200 nucleotides and lack an obvious open reading frame (ORF). They were discovered relatively recently and make up 80% of noncoding RNAs [3, 4]. More than 8000 lncRNA genes were identified within 4 years, and the number of human lncRNAs is estimated to range from 10,000 to 20,000. Although accumulating evidence has linked lncRNAs to the progression of cancers, including CRC, the functions of most lncRNAs are still unknown. For example, lncRNA DANCR is a prognostic factor for both overall survival (OS) and disease-free survival (DFS) in CRC [6, 7]; upregulation of lncRNA FTX promotes growth, invasion, and migration in CRC cells [8, 9]; and the expression of lncRNA HOTAIR is associated with tumor invasion and radiosensitivity, suggesting a potential role in CRC diagnostics and therapeutics [10, 11].
In this study, we aimed to identify differentially expressed lncRNAs and messenger RNAs (mRNAs) in CRC by analyzing a previously published dataset, GSE64857. To provide novel information about the molecular mechanisms and functional roles of lncRNAs, we conducted protein–protein interaction analysis in CRC and found that several lncRNAs may be associated with the tumorigenesis of different CRC subtypes.
Microarray data and data preprocessing
Microarray data were downloaded from the Gene Expression Omnibus (GEO) database (www.ncbi.nlm.nih.gov/geo/) under accession number GSE64857. This dataset was generated in the study by Wang et al. In total, 81 samples were included, comprising 44 stage II and 37 stage III CRC samples. We used the arrayQuality package for quality control and the limma package to process the raw data in R. The data were normalized by quantile normalization. Genes with fold changes ≥2 and P values <0.05 were considered significantly differentially expressed.
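The two numerical steps in this paragraph, quantile normalization and the fold change/P value filter, can be illustrated with a minimal Python sketch. It is not the limma workflow used in the study. The function names, the toy input layout, and the reading of "fold change ≥2" as applying in either direction (ratio ≥2 or ≤0.5) are assumptions made for illustration, and ties are handled naively.

```python
from math import log2


def quantile_normalize(samples):
    """Quantile-normalize a list of arrays (one list of intensities per sample).

    Every sample ends up with the same value distribution: the mean of the
    k-th smallest values across all samples replaces each k-th smallest value.
    Ties are broken by position (a simplification).
    """
    n_genes = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [
        sum(s[rank] for s in sorted_samples) / len(samples)
        for rank in range(n_genes)
    ]
    normalized = []
    for sample in samples:
        order = sorted(range(n_genes), key=lambda i: sample[i])
        out = [0.0] * n_genes
        for rank, gene_index in enumerate(order):
            out[gene_index] = rank_means[rank]
        normalized.append(out)
    return normalized


def select_differential(results, fc_cutoff=2.0, p_cutoff=0.05):
    """Keep genes with |log2 fold change| >= log2(fc_cutoff) and P < p_cutoff.

    `results` maps a gene name to (fold_change, p_value), where fold_change is
    the stage III / stage II expression ratio (assumed layout, must be > 0).
    """
    selected = {}
    for gene, (fold_change, p_value) in results.items():
        if abs(log2(fold_change)) >= log2(fc_cutoff) and p_value < p_cutoff:
            selected[gene] = "up" if fold_change > 1 else "down"
    return selected


# Toy values only, not taken from GSE64857.
toy = {"geneA": (2.5, 0.01), "geneB": (0.4, 0.03), "geneC": (1.5, 0.001)}
print(select_differential(toy))
```

In practice the P values would come from a moderated test such as the one limma provides; here they are treated as given inputs.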
lncRNA classification pipeline
We applied a previously described pipeline to evaluate lncRNA expression in the microarray data. The following criteria were used to identify probe sets that uniquely represent lncRNAs on the Affymetrix array. For probe sets with RefSeq IDs, we retained those labeled “NR_” (NR indicates non-coding RNA in the RefSeq database). For probe sets with Ensembl gene IDs, we retained those annotated as “lncRNA”, “processed transcripts”, “non-coding”, or “misc_RNA” in Ensembl annotations. We then filtered out pseudogenes, rRNAs, microRNAs, tRNAs, snRNAs, and snoRNAs from the probe sets retained in the previous step. Finally, we obtained 2448 annotated lncRNA transcripts with corresponding Affymetrix probe IDs. lncRNAs with fold changes ≥2 and P values <0.05 were considered significantly differentially expressed.
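The filtering rules above can be sketched as a small Python predicate. The record layout (`refseq_id`, `ensembl_biotype`, `rna_class`) and the probe IDs are hypothetical and chosen only for illustration; the actual annotation files used by the pipeline are not described in the text.

```python
# Ensembl annotations that qualify a probe set as a candidate lncRNA.
KEEP_BIOTYPES = {"lncRNA", "processed transcripts", "non-coding", "misc_RNA"}

# RNA classes removed in the final filtering step.
EXCLUDED_CLASSES = {"pseudogene", "rRNA", "microRNA", "tRNA", "snRNA", "snoRNA"}


def is_lncrna_probe(probe):
    """Return True if a probe annotation record passes the lncRNA criteria.

    `probe` is a dict with the hypothetical keys 'refseq_id',
    'ensembl_biotype', and 'rna_class' (any of which may be missing).
    """
    refseq_id = probe.get("refseq_id") or ""
    biotype = probe.get("ensembl_biotype") or ""
    candidate = refseq_id.startswith("NR_") or biotype in KEEP_BIOTYPES
    return candidate and probe.get("rna_class") not in EXCLUDED_CLASSES


def lncrna_probe_ids(probes):
    """Return the IDs of all probe sets that pass the lncRNA criteria."""
    return [p["probe_id"] for p in probes if is_lncrna_probe(p)]


# Invented example records.
example_probes = [
    {"probe_id": "p1", "refseq_id": "NR_000001"},
    {"probe_id": "p2", "refseq_id": "NM_000002"},
    {"probe_id": "p3", "ensembl_biotype": "lncRNA"},
    {"probe_id": "p4", "refseq_id": "NR_000003", "rna_class": "snoRNA"},
]
print(lncrna_probe_ids(example_probes))
```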
Co-expression network construction and analysis
In this study, the Pearson correlation coefficient of each differentially expressed gene (DEG)-lncRNA pair was calculated from their expression values using the “cor” function in R with default parameters. Co-expressed DEG-lncRNA pairs with an absolute Pearson correlation coefficient ≥0.5 were selected, and the co-expression network was constructed using Cytoscape.
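As an illustration of this step, the following Python sketch computes Pearson correlations for every lncRNA-DEG pair and keeps those with |r| ≥ 0.5. It stands in for the R “cor” call rather than reproducing it; the function names and the dictionary input layout are assumptions, and expression vectors are assumed to have non-zero variance.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)  # fails if either vector is constant


def coexpression_edges(lncrna_expr, deg_expr, cutoff=0.5):
    """Return (lncRNA, DEG, r) for every pair with |r| >= cutoff.

    Both arguments map a name to its expression values across the same
    samples, in the same sample order.
    """
    edges = []
    for lnc_name, lnc_values in lncrna_expr.items():
        for gene_name, gene_values in deg_expr.items():
            r = pearson(lnc_values, gene_values)
            if abs(r) >= cutoff:
                edges.append((lnc_name, gene_name, r))
    return edges
```

The resulting edge list can be written out as a tab-separated file and imported into a network viewer such as Cytoscape.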
GO and KEGG pathway analyses
The Molecule Annotation System (MAS, http://bioinfo.capitalbio.com/mas3/) provided by CapitalBio was used to determine the biological roles of differentially expressed mRNAs. Gene functions were classified into three subgroups: BP (biological process), CC (cellular component), and MF (molecular function). The enriched GO terms are presented with their enrichment scores. KEGG pathway analysis was carried out to determine the involvement of differentially expressed mRNAs in different biological pathways. The recommended P value (hypergeometric P value) cutoff of 0.05 was used.
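For readers unfamiliar with the hypergeometric P value, the sketch below shows the standard over-representation calculation in Python. How MAS computes its scores internally is not described here, so this is the textbook form of the test rather than the tool's implementation; the function and parameter names are illustrative.

```python
from math import comb


def hypergeom_pvalue(overlap, list_size, term_size, background_size):
    """Probability of seeing at least `overlap` term genes in the gene list.

    overlap         -- genes in the list that carry the GO/KEGG term
    list_size       -- number of genes in the list (e.g. DEGs)
    term_size       -- background genes annotated with the term
    background_size -- total number of background genes
    """
    max_overlap = min(list_size, term_size)
    total = comb(background_size, list_size)
    favorable = sum(
        comb(term_size, i) * comb(background_size - term_size, list_size - i)
        for i in range(overlap, max_overlap + 1)
    )
    return favorable / total


def is_enriched(overlap, list_size, term_size, background_size, cutoff=0.05):
    """Apply the 0.05 cutoff mentioned in the text."""
    return hypergeom_pvalue(overlap, list_size, term_size, background_size) < cutoff
```

A term is called enriched when this probability falls below the cutoff; with many terms tested, a multiple-testing correction would normally be considered as well.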
Identification of lncRNA-associated PPI modules
The STRING online tool was used to retrieve the interactions among the proteins encoded by DEGs, with a combined score >0.4 as the cutoff criterion. The PPI network was visualized using Cytoscape.
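The cutoff step can be sketched in Python as a filter over an interaction table. The tuple layout `(protein_a, protein_b, combined_score)` and the 0 to 1 score scale are assumptions; if an export reports scores on a 0 to 1000 scale, the threshold would need to be rescaled accordingly.

```python
def filter_interactions(interactions, deg_proteins, min_score=0.4):
    """Keep undirected interactions between DEG-encoded proteins above the cutoff.

    interactions -- iterable of (protein_a, protein_b, combined_score)
    deg_proteins -- set of protein names encoded by DEGs
    Returns a sorted list of unique (a, b) pairs with a < b.
    """
    kept = set()
    for protein_a, protein_b, score in interactions:
        if protein_a == protein_b:
            continue  # ignore self-interactions
        if protein_a in deg_proteins and protein_b in deg_proteins and score > min_score:
            # Store each pair in a canonical order so A-B and B-A collapse.
            kept.add((min(protein_a, protein_b), max(protein_a, protein_b)))
    return sorted(kept)
```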
Numerical data are presented as mean ± standard deviation (SD) of at least three determinations. Statistical comparisons between groups of normalized data were performed using the t test or the Mann–Whitney U test, depending on the test conditions. P < 0.05 was considered statistically significant.
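As a rough illustration of the nonparametric comparison mentioned above, the following Python sketch computes the Mann–Whitney U statistic with a two-sided P value from the normal approximation. It omits tie and continuity corrections, so it is a simplification suitable only for illustration; the function name is hypothetical, and a statistics package should be preferred for real analyses.

```python
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Return (U, two-sided P) for two independent samples.

    U is the smaller of the two U statistics. The P value uses the normal
    approximation without tie or continuity correction (a simplification).
    """
    n1, n2 = len(x), len(y)
    # Count pairs where x wins; ties count as half a win.
    u1 = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    u = min(u1, n1 * n2 - u1)
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u, p_value
```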
Systematic analysis of the significantly differentially expressed mRNAs and lncRNAs between stage II and stage III CRC
To identify the significantly differentially expressed mRNAs and lncRNAs between stage II and stage III CRC, we used a publicly available gene expression dataset, GSE64857. We identified a total of 1472 DEGs (806 up- and 666 downregulated) and 46 differentially expressed lncRNAs (24 up- and 22 downregulated) in stage III compared with stage II CRC samples (see Additional file 1). The top ten up- and downregulated lncRNAs are listed in Table 1.
Co-expression network analysis
To predict the potential functions of the 24 up- and 22 downregulated lncRNAs, we first calculated the Pearson correlation coefficient of DEG-lncRNA pairs from their expression values. Co-expressed DEG-lncRNA pairs with an absolute Pearson correlation coefficient ≥0.5 were selected. As shown in Fig. 1, the network included 46 lncRNAs and 881 differentially expressed genes.
GO and KEGG analyses of differentially expressed lncRNAs
Based on co-expression networks, we performed GO and KEGG analyses for differentially expressed lncRNAs by using the set of co-expressed mRNAs (Fig. 2a, b).
According to the GO analysis, differentially expressed lncRNAs were enriched in signal transduction, cell adhesion, development, transcription, cell differentiation, and cell proliferation. KEGG pathway analysis revealed that differentially expressed lncRNAs mainly participated in regulating focal adhesion, cell adhesion molecules, calcium signaling pathway, and TGF-beta signaling pathway.
lncRNA co-expressed mRNAs were connected by PPIs
In this study, we found that three upregulated lncRNAs (LOC100129973, PGM5-AS1, and TTTY10) were widely co-expressed with DEGs. Among them, PGM5-AS1 was co-expressed with more than 275 DEGs, LOC100129973 with about 200 DEGs, and TTTY10 with about 350 DEGs in the GSE64857 data. Next, we analyzed the co-expressed mRNAs of these lncRNAs and examined whether the mRNAs were connected by PPIs.
Based on the information in the STRING database, we constructed a protein–protein interaction network for each lncRNA in CRC. The PGM5-AS1-related PPI network contained 72 nodes and 163 edges, and the hub nodes with the highest connectivity degree were ACTG2 (degree = 12), DMD (degree = 11), MYLK (degree = 11), and MYH11 (degree = 10) (Fig. 3a). The TTTY10-related PPI network contained 26 nodes and 37 edges, and the hub node with the highest connectivity degree was PIK3CD (degree = 7) (Fig. 3b). The LOC100129973-related PPI network contained 41 nodes and 78 edges, and the hub node with the highest connectivity degree was CD79A (degree = 11) (Fig. 3c).
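The node, edge, and degree figures reported above follow from simple counting over an undirected edge list. The Python sketch below shows that calculation on generic input; the function names are illustrative, and the edge list is assumed to contain each interaction once.

```python
from collections import Counter


def node_degrees(edges):
    """Count how many interactions each node takes part in."""
    degrees = Counter()
    for node_a, node_b in edges:
        degrees[node_a] += 1
        degrees[node_b] += 1
    return degrees


def network_summary(edges):
    """Return (number of nodes, number of edges, hubs sorted by degree).

    Hubs are listed as (node, degree) pairs, highest degree first, with
    ties broken alphabetically.
    """
    degrees = node_degrees(edges)
    ranked = sorted(degrees.items(), key=lambda item: (-item[1], item[0]))
    return len(degrees), len(edges), ranked
```

Taking the first few entries of the ranked list gives the hub nodes of a network.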
Exploring the molecular functions of PGM5-AS1, TTTY10, and LOC100129973
The molecular functions of LOC100129973, PGM5-AS1, and TTTY10 in CRC progression remain unknown. To further explore the molecular functions of LOC100129973, PGM5-AS1, and TTTY10, we performed GO analysis using their co-expressed genes. We found that PGM5-AS1 was associated with the regulation of transcription, signal transduction, cell adhesion, nervous system development, and muscle development (Fig. 4a, d). TTTY10 was associated with cell adhesion, regulation of transcription, signal transduction, development, and cell differentiation (Fig. 4b, e). LOC100129973 was associated with immune response, signal transduction, cell adhesion, regulation of transcription, and anti-apoptosis (Fig. 4c, f).
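The idea of assigning functions to an lncRNA through its co-expressed genes can be sketched in Python as a tally of GO terms over each lncRNA's partners. This is only a counting illustration, not an enrichment test; the function name, the edge layout, and the gene-to-term mapping are hypothetical.

```python
from collections import Counter, defaultdict


def partner_term_counts(edges, gene_terms):
    """Tally GO terms among the genes co-expressed with each lncRNA.

    edges      -- iterable of (lncRNA, gene) co-expression pairs
    gene_terms -- dict mapping a gene to its annotated GO terms
    Returns a dict mapping each lncRNA to a Counter of terms.
    """
    partners = defaultdict(set)
    for lncrna, gene in edges:
        partners[lncrna].add(gene)  # a set avoids double-counting a gene
    return {
        lncrna: Counter(
            term for gene in genes for term in gene_terms.get(gene, ())
        )
        for lncrna, genes in partners.items()
    }
```

Terms that occur often among an lncRNA's partners become candidate functions, which an enrichment test can then assess against the background.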
Alterations of PGM5-AS1 expression and prognosis in CRC
To evaluate the possible prognostic value of PGM5-AS1, TTTY10, and LOC100129973, we downloaded RNA-seq data from cBioPortal (http://www.cbioportal.org/). However, only PGM5-AS1 had expression levels with matching survival data available for analysis (LOC100129973 was not included in TCGA, and TTTY10 expression was too low). As shown in Fig. 5a, PGM5-AS1 was upregulated in stage III and IV CRC samples. Kaplan–Meier analysis showed that patients with high PGM5-AS1 expression levels had decreased overall survival compared with those with low PGM5-AS1 levels (p = 0.0097).
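For orientation, a Kaplan–Meier survival curve of the kind compared here can be computed with the short Python sketch below. It implements only the product-limit estimator for a single group; the function name and input layout are assumptions. The text does not state how patients were split into high and low expression groups, and the comparison between groups would require an additional test (such as the log-rank test) that is not shown.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function for one group.

    times  -- follow-up time of each patient
    events -- 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    records = sorted(zip(times, events))
    at_risk = len(records)
    survival = 1.0
    curve = []
    i = 0
    while i < len(records):
        current_time = records[i][0]
        same_time = [e for t, e in records if t == current_time]
        deaths = sum(same_time)
        if deaths:
            # Multiply by the fraction of at-risk patients surviving this time.
            survival *= 1 - deaths / at_risk
            curve.append((current_time, survival))
        at_risk -= len(same_time)  # deaths and censored cases leave the risk set
        i += len(same_time)
    return curve
```

Running the function once on the high-expression group and once on the low-expression group gives the two curves that a Kaplan–Meier plot displays.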
The molecular mechanisms involved in CRC progression remain unclear. Therefore, it is critically important to investigate the biological mechanisms of CRC. In the present study, we identified the significantly differentially expressed mRNAs and lncRNAs between stage II and stage III CRC using GSE64857. GO and KEGG pathway analyses showed that differentially expressed lncRNAs were involved in regulating CRC progression. Our analysis also revealed that the co-expressed mRNAs related to PGM5-AS1, TTTY10, and LOC100129973 could be connected by PPIs.
CRC is one of the deadliest malignancies owing to the lack of biomarkers for early diagnosis and of efficient therapeutic strategies. Recent studies have shown that lncRNAs play key roles in tumorigenesis, cancer progression, and metastasis. A growing number of reports have also demonstrated that lncRNA expression can be deregulated in human cancers, including CRC [15,16,17]. In the present study, we identified the significantly differentially expressed mRNAs and lncRNAs between stage II and stage III CRC using a publicly available gene expression dataset, GSE64857. From the microarray expression profiles, we identified a total of 1472 DEGs (806 up- and 666 downregulated) and 46 differentially expressed lncRNAs (24 up- and 22 downregulated) in stage III compared with stage II CRC samples.
One challenge in predicting the functions of lncRNAs is that lncRNAs cannot be catalogued directly by GO and KEGG pathway analyses. According to Guttman and Rinn, one approach to classifying the putative functions of ncRNAs is “guilt-by-association”. In previous reports, the combination of co-expression and GO analysis was widely used to predict lncRNA functions in triple-negative breast cancer and prostate cancer. To predict the functions of the differentially expressed lncRNAs, we first constructed co-expression networks and then performed GO and KEGG analyses for the differentially expressed lncRNAs, following this approach. According to the GO analysis, differentially expressed lncRNAs were enriched in signal transduction, cell adhesion, development, transcription, cell differentiation, and cell proliferation. KEGG pathway analysis revealed that differentially expressed lncRNAs mainly participated in regulating focal adhesion, cell adhesion molecules, calcium signaling pathway, and TGF-beta signaling pathway.
Recently, several reports have shown that altered expression of lncRNAs may be an important mechanism of CRC progression. A few lncRNAs, including DANCR, FTX [9, 15], and HOTAIR, were significantly associated with the progression of CRC [21, 22]. However, the molecular mechanisms and functional roles of lncRNAs in the transformation of CRC remain largely unknown. In this study, we identified three upregulated lncRNAs (LOC100129973, PGM5-AS1, and TTTY10) that were widely co-expressed with DEGs. lncRNA LOC100129973 was reported to suppress apoptosis in vascular endothelial cells by targeting miR-4707-5p and miR-4767. However, the molecular functions of LOC100129973, PGM5-AS1, and TTTY10 in CRC remain unclear. Here, to explore their molecular mechanisms, we analyzed the co-expressed mRNAs of these lncRNAs and examined whether the mRNAs were connected by PPIs. We found that PGM5-AS1 was associated with the regulation of transcription, signal transduction, and cell adhesion. Interestingly, we found that PGM5-AS1 may regulate cell adhesion by affecting PGM5. PGM5 is a phosphotransferase involved in the interconversion of glucose-1-phosphate and glucose-6-phosphate. In CRC, PGM5 was also reported as a potential protein marker of colorectal adenoma. TTTY10 was found to be involved in regulating cell adhesion, transcription, signal transduction, development, and cell differentiation by regulating FOXO1, SLIT1, and SLIT3. We observed that LOC100129973 was associated with immune response, signal transduction, cell adhesion, regulation of transcription, and anti-apoptosis, suggesting that LOC100129973 is involved in regulating CRC proliferation. To evaluate the possible prognostic value of PGM5-AS1, TTTY10, and LOC100129973, we analyzed the TCGA data and found that high PGM5-AS1 expression levels were associated with worse overall survival in CRC.
In conclusion, we identified differentially expressed lncRNAs between stage II and stage III CRC for the first time by microarray screening. In total, 46 lncRNAs were found to be dysregulated. GO and KEGG pathway analyses showed that differentially expressed lncRNAs were involved in regulating signal transduction, cell adhesion, cell differentiation, focal adhesion, and cell adhesion molecules. Three lncRNAs (LOC100129973, PGM5-AS1, and TTTY10) were found to be widely co-expressed with DEGs. We also constructed lncRNA-associated PPI networks in CRC. Of note, we observed that high PGM5-AS1 expression levels were associated with worse overall survival in CRC. We believe that this study provides novel potential therapeutic and prognostic targets for CRC.
DEG: Differentially expressed gene
GEO: Gene Expression Omnibus
ORF: Open reading frame
Peng W, Wang Z, Fan H. LncRNA NEAT1 impacts cell proliferation and apoptosis of colorectal cancer via regulation of Akt signaling. Pathol Oncol Res. 2016;23(3):651-56.
Lian Y, Ding J, Zhang Z, Shi Y, Zhu Y, Li J, Peng P, Wang J, Fan Y, De W, Wang K. The long noncoding RNA HOXA transcript at the distal tip promotes colorectal cancer growth partially via silencing of p21 expression. Tumour Biol. 2016;37:7431.
Zhao W, Song M, Zhang J, Kuerban M, Wang H. Combined identification of long non-coding RNA CCAT1 and HOTAIR in serum as an effective screening for colorectal carcinoma. Int J Clin Exp Pathol. 2015;8:14131.
Zhou J, Li X, Wu M, Lin C, Guo Y, Tian B. Knockdown of long noncoding RNA GHET1 inhibits cell proliferation and invasion of colorectal cancer. Oncol Res. 2016;23:303.
Han S, Liang Y, Li Y, Du W. Long noncoding RNA identification: comparing machine learning based tools for long noncoding transcripts discrimination. Biomed Res Int. 2016;2016:8496165.
Jia J, Li F, Tang XS, Xu S, Gao Y, Shi Q, Guo W, Wang X, He D, Guo P. Long noncoding RNA DANCR promotes invasion of prostate cancer through epigenetically silencing expression of TIMP2/3. Oncotarget. 2016;7:37868.
Guo XB, Hua Z, Li C, Peng LP, Wang JS, Wang B, Zhi QM. Biological significance of long non-coding RNA FTX expression in human colorectal cancer. Int J Clin Exp Med. 2015;8:15591.
Liu Y, Zhang M, Liang L, Li J, Chen YX. Over-expression of lncRNA DANCR is associated with advanced tumor progression and poor prognosis in patients with colorectal cancer. Int J Clin Exp Pathol. 2015;8:11480.
He X, Sun G, Guo F, Wang K, Gao Y, Feng Y, Song B, Li W, Li Y. Knockdown of long non-coding RNA FTX inhibits proliferation, migration, and invasion in renal cell carcinoma cells. Oncol Res. 2016;25(2):157–66. doi:10.3727/096504016X14719078133203.
Yang XD, Xu HT, Xu XH, Ru G, Liu W, Zhu JJ, Wu YY, Zhao K, Wu Y, Xing CG, Zhang SY, Cao JP, Li M. Knockdown of long non-coding RNA HOTAIR inhibits proliferation and invasiveness and improves radiosensitivity in colorectal cancer. Oncol Rep. 2016;35:479.
Yan Y, Han J, Li Z, Yang H, Sui Y, Wang M. Elevated RNA expression of long noncoding HOTAIR promotes cell proliferation and predicts a poor prognosis in patients with diffuse large B cell lymphoma. Mol Med Rep. 2016;13:5125.
Wang L, Shen X, Wang Z, Xiao X, Wei P, Wang Q, Ren F, Wang Y, Liu Z, Sheng W, Huang W, Zhou X, Du X. A molecular signature for the prediction of recurrence in colorectal cancer. Mol Cancer. 2015;14:22.
Zhang X, Sun S, Pu JK, Tsang AC, Lee D, Man VO, Lui WM, Wong ST, Leung GK. Long non-coding RNA expression profiles predict clinical phenotypes in glioma. Neurobiol Dis. 2012;48:1.
Shipp MA, Ross KN, Tamayo P, Weng AP, Kutok JL, Aguiar RC, Gaasenbeek M, Angelo M, Reich M, Pinkus GS, Ray TS, Koval MA, Last KW, Norton A, Lister TA, Mesirov J, Neuberg DS, Lander ES, Aster JC, Golub TR. Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning. Nat Med. 2002;8:68.
Peng W, Wu J, Feng J. LincRNA-p21 predicts favorable clinical outcome and impairs tumorigenesis in diffuse large B cell lymphoma patients treated with R-CHOP chemotherapy. Clin Exp Med. 2017;17:1.
Li CH, Chen Y. Targeting long non-coding RNAs in cancers: progress and prospects. Int J Biochem Cell Biol. 2013;45:1895.
Tang JY, Lee JC, Chang YT, Hou MF, Huang HW, Liaw CC, Chang HW. Long noncoding RNAs-related diseases, cancers, and drugs. ScientificWorldJournal. 2013;2013:943539.
Guttman M, Rinn JL. Modular regulatory principles of large non-coding RNAs. Nature. 2012;482:339.
Shen X, Xie B, Ma Z, Yu W, Wang W, Xu D, Yan X, Chen B, Yu L, Li J, Chen X, Ding K, Cao F. Identification of novel long non-coding RNAs in triple-negative breast cancer. Oncotarget. 2015;6:21730.
Zhang Y, Zhang P, Wan X, Su X, Kong Z, Zhai Q, Xiang X, Li L, Li Y. Downregulation of long non-coding RNA HCG11 predicts a poor prognosis in prostate cancer. Biomed Pharmacother. 2016;83:936.
Peng W, Fan H, Wu G, Wu J, Feng J. Upregulation of long noncoding RNA PEG10 associates with poor prognosis in diffuse large B cell lymphoma with facilitating tumorigenicity. Clin Exp Med. 2016;16:177.
Xi W, Song W. Prognostic value of lncRNA HOTAIR expression in patients with cancer: a meta-analysis. Zhong Nan Da Xue Xue Bao Yi Xue Ban. 2016;41:1352.
Lu W, Huang SY, Su L, Zhao BX, Miao JY. Long noncoding RNA LOC100129973 suppresses apoptosis by targeting miR-4707-5p and miR-4767 in vascular endothelial cells. Sci Rep. 2016;6:21620.
Uzozie AC, Selevsek N, Wahlander A, Nanni P, Grossmann J, Weber A, Buffoli F, Marra G. Targeted proteomics for multiplexed verification of markers of colorectal tumorigenesis. Mol Cell Proteomics. 2017;16:407.
Wu L, Li H, Jia CY, Cheng W, Yu M, Peng M, Zhu Y, Zhao Q, Dong YW, Shao K, Wu A, Wu XZ. MicroRNA-223 regulates FOXO1 expression and cell proliferation. FEBS Lett. 2012;586:1038.
Dickinson RE, Dallol A, Bieche I, Krex D, Morton D, Maher ER, Latif F. Epigenetic inactivation of SLIT3 and SLIT1 genes in human cancers. Br J Cancer. 2004;91:2071.
We acknowledge excellent technical and graphic design assistance from Dr. Lai Jiang.
Availability of data and materials
All the data of the case report are included in this manuscript.
Ethics approval and consent to participate
This article does not contain any studies with human participants performed by any of the authors.
The authors declare that they have no competing interests.
Consent for publication
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Zhu, H., Yu, J., Zhu, H. et al. Identification of key lncRNAs in colorectal cancer progression based on associated protein–protein interaction analysis. World J Surg Onc 15, 153 (2017). https://doi.org/10.1186/s12957-017-1211-7
- Long non-coding RNA
- Colorectal cancer
- Protein–protein interaction analysis
- Expression profiling