Identification of key gene modules for human osteosarcoma by co-expression analysis
© The Author(s). 2018
Received: 9 December 2017
Accepted: 3 April 2018
Published: 2 May 2018
Osteosarcoma is a type of bone cancer casting huge threat to the human health worldwide. Previously, gene expression analyses were performed to identify biomarkers for cancer; however, systemic co-expression analysis for osteosarcoma is still in need. The aim of this study was to construct a gene co-expression network that predicts clusters of candidate genes associated with the pathogenesis of osteosarcoma.
Here, we extracted the large scale of datasets from the GEO database. With systematical approaches, we identified the co-expression modules by using weighted gene co-expression network analysis (WGCNA) and investigated the functional enrichments of important modules at GO and KEGG terms.
First, seven co-expression modules, which contain different genes, were conducted for 2228 genes in the 22 human osteosarcoma samples. Then, correlation study showed that the hub genes between pairwise modules displayed great differences. Lastly, functional enrichments of the co-expression modules showed that the module 5 enriched in immune response, antigen processing, and presentation, which is in consistence with GO result. Therefore, we speculated that the module 5 may play a key role in the pathogenesis of osteosarcoma.
Here, we speculated that genes of the module 5 were the essential genes that were associated to human osteosarcoma. Together, our findings not only provided outline of co-expression gene modules for human osteosarcoma, but also promoted the understanding of these modules at functional aspects.
Osteosarcoma (OS), the most common primary bone malignancy, has an overall incidence of 0.2–3/100000 per year. In the age group of 15–19 years, osteosarcoma is even more common with an incidence of 0.8–11/100,000 per year globally [1, 2]. Despite its rarity, it was also reported as the third most common cancer in adolescence, occurring only less frequently than brain tumor and lymphomas in this age group. Usually, the incidence increases to a peak along with the pubertal growth spurt with gender bias (occurs earlier in females than in males). Besides, tall stature and high birth weight are also reported to be important risk factors . Although the introduction of effective chemotherapy has improved 3-year survival from 20% to 60–70%, no further improvements have been achieved in the last few decades . Therefore, better understanding of genetic etiology and pathology of OS may provide new possible treatment strategies for this tumor.
Several studies have reported that common genetic variations were preliminarily associated with the occurrence of osteosarcoma in some biological pathways, such as TGFBR1*6A, which is a common mutation of TGF-β receptor 1 and was reported to be associated with the distant metastasis of osteosarcoma . Recently, Savage et al. suggested that two loci in the GRM4 gene at 6p21.3 and in the gene desert at 2p25.2 may be involved in the mechanisms underlying susceptibility to osteosarcoma . However, only a handful of candidate genes are considered to be crucial in the pathogenesis of OS, and there is still a large part needed to be explored.
In some computational research, disease risk modules have been developed to provide significant measurement for cancer diagnosis and to develop novel treatment strategies [5, 7–10]. The weighted gene co-expression network analysis (WGCNA) is a powerful approach based on “guilt-by-association.” It is used to identify gene modules which are popularly applied as candidate biomarkers or therapeutic targets [11, 12]. As a systematical biology method, it was widely used in many complex diseases, such as breast cancer , schizophrenia [14, 15], and intracranial aneurysm . By using WGCNA, we are able to construct co-expression networks to detect the differentially correlated gene clusters and perform gene-specific analysis [17, 18].
In this study, WGCNA was constructed based on a dataset comprising 2228 genes from 22 human osteosarcoma samples. The correlation between each module and the biologic functions of genes detected in these modules are analyzed. These informative genes found in our study may be beneficial to clinical treatment of osteosarcoma.
Datasets for WGCNA related to osteosarcoma were obtained from the NCBI Gene Expression Omnibus (GEO) (http://www.ncbi.nlm.nih.gov/geo) with accessing number GSE12512. The combined dataset consists of 22 samples. We firstly mapped the array probes to their respective gene IDs by using the array annotations. Probes matching multiple genes were removed from the dataset, and then, we calculated the average expression values of genes measured by multiple probes. A proper threshold was settled based on the amount of genes filtered out.
Co-expression networks and modules
The influence of power value on the scale independence and mean connectivity were analyzed by using the function softConnectivity in WGCNA package. The “randomly selected genes” parameter was set as 5000; other parameters’ set was default. The power parameter was pre-calculated with the function pickSoftThreshold in WGCNA. In this function, an appropriate soft-thresholding power for network construction was provided by calculating the scale-free topology fit index of several powers. That is, if the scale-free topology fit index for the reference dataset exceeded 0.8 for low powers (< 30), then the topology of the network is scale-free without batch effects . Next, we summarized the expression values by using the function collapseRows in the R package. Cluster analysis was subsequently performed by flashClust . The interactions (correlations) of each module was analyzed and visualized by heat map.
Hub genes and the functional annotations
We performed a gene ontology (GO) enrichment analysis for top 5 modules with most genes by the Database for Annotation, Visualization, and Integrated Discovery (DAVID https://david.ncifcrf.gov/summary.jsp) . Functional enrichment analysis of the hub genes were carried out at GO terms and KEGG pathways (p < 0.05) [20, 21]. Before assigning enrichment score for each cluster to make interpretation of the results more straightforward, functional annotation clustering combines single category with a significant overlap in gene content.
Pre-processing of the osteosarcoma datasets
Identification of gene co-expression networks and modules
Correlation between each modules
Functional enrichment and clustering analysis
GO enrichment analysis in co-expression modules
Negative regulation of cellular protein metabolic process
Negative regulation of protein metabolic process
Negative regulation of protein ubiquitination
Anaphase-promoting complex-dependent proteasomal ubiquitin-dependent protein catabolic process
Generation of precursor metabolites and energy
Ribosomal large subunit biogenesis
ATP synthesis coupled proton transport
Energy-coupled proton transport, down electrochemical gradient
Antigen processing and presentation
Antigen processing and presentation of peptide or polysaccharide antigen via MHC class II
Antigen processing and presentation of peptide antigen
Antigen processing and presentation of exogenous peptide antigen
KEGG pathways in co-expression modules
Epithelial cell signaling in Helicobacter pylori infection
Vibrio cholerae infection
Type I diabetes mellitus
Antigen processing and presentation
The main objective for this study was to utilize a global approach to construct a gene co-expression network that predicts clusters of candidate genes involved in the pathogenesis of osteosarcoma. We hypothesized that tightly co-expressed gene modules with common functional annotation would be able to predict candidate gene sets that underlies a given biological process.
WGCNA is a relatively novel statistical approach based on gene correlations and has been used not only to construct gene networks and detect modules/sub-networks, but also to identify hub genes and select candidate genes as biomarkers . Usually, module detection in WGCNA needs a knowledge-independent process. However, selection of a threshold for culling the network to limit noise would probably rely on empirical judgment and functional annotation . Furthermore, WGCNA can only provide a set of hub genes instead of specific genes related to the background, such as osteosarcoma in this study. Therefore, further studies should be carried out to narrow down the gene targets. Such as RMT method, this lies in its ability to automatically localize the noise-to-signal threshold instead of using empirical judgment or annotations . Moreover, construction of mutant will also help to understand the role of one or more specific genes in the pathogenesis of osteosarcoma.
Endo Munoz et al. have reported that OS are characterized by an early deregulation of genes involved in antigen presentation and suggest that patient prognosis is determined early in tumor development and that enhancing antigen presentation may be clinically valuable in treating OS . Furthermore, several immune molecules, such as cytotoxic T cell lymphocyte antigen 4 (CTLA4) and CD40 (TNF receptor superfamily 5), have been targeted clinically in osteosarcoma. It was discovered that they can break the immune tolerance in tumor . Therefore, we suggested the genes in module 5 might play a key role in the pathogenesis of osteosarcoma and thereby provide potential targets for treating OS.
In summary, this research creatively applied transcriptional network analysis to identify co-expression module. In module 5, the highly enriched genes were involved in the antigen and immune process. According to their collective expression, they were speculated to be correlated with pathogenesis of osteosarcoma as well.
The discoveries in this study might be used to predict clusters of candidate genes associated with the pathogenesis of osteosarcoma. This might contribute to improve or optimize clinical diagnosis by using molecular techniques. However, the clinical specific efficiency of the identified module needs more experiments to clarify.
Availability of data and materials
All data and material were available in the GEO database.
JZ, QL, and JSL conceived this study; JZ, QL, and JSL performed the analysis; JZ, QL, and JSL prepared the manuscript. Both authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Bielack S, Carrle D, Casali PG, Group EGW. Osteosarcoma: ESMO clinical recommendations for diagnosis, treatment and follow-up. Ann Oncol. 2009;20(Suppl 4):137–9.PubMedGoogle Scholar
- Mirabello L, Troisi RJ, Savage SA. Osteosarcoma incidence and survival rates from 1973 to 2004: data from the surveillance, epidemiology, and end results program. Cancer. 2009;115:1531–43.View ArticlePubMedPubMed CentralGoogle Scholar
- Mirabello L, Yu K, Berndt SI, Burdett L, Wang Z, Chowdhury S, Teshome K, Uzoka A, Hutchinson A, Grotmol T, et al. A comprehensive candidate gene approach identifies genetic variation associated with osteosarcoma. BMC Cancer. 2011;11:209.View ArticlePubMedPubMed CentralGoogle Scholar
- van Oosterwijk JG, Anninga JK, Gelderblom H, Cleton-Jansen AM, Bovee JV. Update on targets and novel treatment options for high-grade osteosarcoma and chondrosarcoma. Hematol Oncol Clin North Am. 2013;27:1021–48.View ArticlePubMedGoogle Scholar
- Hu YS, Pan Y, Li WH, Zhang Y, Li J, Ma BA. Association between TGFBR1*6A and osteosarcoma: a Chinese case-control study. BMC Cancer. 2010;10:169.View ArticlePubMedPubMed CentralGoogle Scholar
- Savage SA, Mirabello L, Wang Z, Gastier-Foster JM, Gorlick R, Khanna C, Flanagan AM, Tirabosco R, Andrulis IL, Wunder JS, et al. Genome-wide association study identifies two susceptibility loci for osteosarcoma. Nat Genet. 2013;45:799–803.View ArticlePubMedPubMed CentralGoogle Scholar
- Chen Y, Yang Y, Liu S, Zhu S, Jiang H, Ding J. Association between interleukin 8 −251 A/T and +781 C/T polymorphisms and osteosarcoma risk in Chinese population: a case-control study. Tumour Biol. 2016;37:6191–6.View ArticlePubMedGoogle Scholar
- Wang J, Liu H, Liu X, Qi X. Effect of variation of FGF2 genotypes on the risk of osteosarcoma susceptibly: a case control study. Int J Clin Exp Med. 2015;8:6114–8.PubMedPubMed CentralGoogle Scholar
- Song WS, Jeon DG, Cho WH, Kong CB, Cho SH, Lee SY, Lee SY. Spontaneous necrosis and additional tumor necrosis induced by preoperative chemotherapy for osteosarcoma: a case-control study. J Orthop Sci. 2015;20:174–9.View ArticlePubMedGoogle Scholar
- Zhao Q, Wang C, Zhu J, Wang L, Dong S, Zhang G, Tian J. RNAi-mediated knockdown of cyclooxygenase2 inhibits the growth, invasion and migration of SaOS2 human osteosarcoma cells: a case control study. J Exp Clin Cancer Res. 2011;30:26.View ArticlePubMedPubMed CentralGoogle Scholar
- Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinf. 2008;9:559.View ArticleGoogle Scholar
- DiLeo MV, Strahan GD, den Bakker M, Hoekenga OA. Weighted correlation network analysis (WGCNA) applied to the tomato fruit metabolome. PLoS One. 2011;6:e26683.View ArticlePubMedPubMed CentralGoogle Scholar
- Clarke C, Madden SF, Doolan P, Aherne ST, Joyce H, O'Driscoll L, Gallagher WM, Hennessy BT, Moriarty M, Crown J, et al. Correlating transcriptional networks to breast cancer survival: a large-scale coexpression analysis. Carcinogenesis. 2013;34:2300–8.View ArticlePubMedGoogle Scholar
- Ren Y, Cui Y, Li X, Wang B, Na L, Shi J, Wang L, Qiu L, Zhang K, Liu G, Xu Y. A co-expression network analysis reveals lncRNA abnormalities in peripheral blood in early-onset schizophrenia. Prog Neuro-Psychopharmacol Biol Psychiatry. 2015;63:1–5.View ArticleGoogle Scholar
- de Jong S, Boks MP, Fuller TF, Strengman E, Janson E, de Kovel CG, Ori AP, Vi N, Mulder F, Blom JD, et al. A gene co-expression network in whole blood of schizophrenia patients is independent of antipsychotic-use and enriched for brain-expressed genes. PLoS One. 2012;7:e39498.View ArticlePubMedPubMed CentralGoogle Scholar
- Zheng X, Xue C, Luo G, Hu Y, Luo W, Sun X. Identification of crucial genes in intracranial aneurysm based on weighted gene coexpression network analysis. Cancer Gene Ther. 2015;22:238–45.View ArticlePubMedGoogle Scholar
- Wang YB, Jia N, Xu CM, Zhao L, Zhao Y, Wang X, Jia TH. Selecting key genes associated with osteosarcoma based on a differential expression network. Genet Mol Res. 2015;14:17708–17.View ArticlePubMedGoogle Scholar
- Bakhshi S, Gupta A, Sharma MC, Khan SA, Rastogi S. Her-2/neu, p-53, and their coexpression in osteosarcoma. J Pediatr Hematol Oncol. 2009;31:245–51.View ArticlePubMedGoogle Scholar
- Dennis G Jr, Sherman BT, Hosack DA, Yang J, Gao W, Lane HC, Lempicki RA. DAVID: Database for Annotation, Visualization, and Integrated Discovery. Genome Biol. 2003;4:P3.View ArticlePubMedGoogle Scholar
- Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000;25:25–9.View ArticlePubMedPubMed CentralGoogle Scholar
- Kanehisa M, Goto S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 2000;28:27–30.View ArticlePubMedPubMed CentralGoogle Scholar
- Ficklin SP, Luo F, Feltus FA. The association of multiple interacting genes with specific phenotypes in rice using gene coexpression networks. Plant Physiol. 2010;154:13–24.View ArticlePubMedPubMed CentralGoogle Scholar
- Endo-Munoz L, Cumming A, Rickwood D, Wilson D, Cueva C, Ng C, Strutton G, Cassady AI, Evdokiou A, Sommerville S, et al. Loss of osteoclasts contributes to development of osteosarcoma pulmonary metastases. Cancer Res. 2010;70:7063–72.View ArticlePubMedGoogle Scholar
- Paladini L, Fabris L, Bottai G, Raschioni C, Calin GA, Santarpia L. Targeting microRNAs as key modulators of tumor immune response. J Exp Clin Cancer Res. 2016;35:103.View ArticlePubMedPubMed CentralGoogle Scholar