Microarray data analysis to identify crucial genes regulated by CEBPB in human SNB19 glioma cells

Background: Glioma is one of the most common primary malignancies in the brain or spine. The transcription factor (TF) CCAAT/enhancer binding protein beta (CEBPB) is important for maintaining tumor-initiating capacity and invasion ability. To investigate the regulatory mechanism of CEBPB in glioma, the microarray dataset GSE47352 was analyzed.

Methods: GSE47352 was downloaded from Gene Expression Omnibus. It included three samples of SNB19 human glioma cells transduced with non-target control small hairpin RNA (shRNA) lentiviral vectors for 72 h (normal glioma cells) and three samples of SNB19 human glioma cells transduced with CEBPB shRNA lentiviral vectors for 72 h (CEBPB-silenced glioma cells). The differentially expressed genes (DEGs) were screened using the limma package and then annotated. Afterwards, the Database for Annotation, Visualization, and Integrated Discovery (DAVID) software was applied to perform enrichment analysis for the DEGs. Furthermore, the protein-protein interaction (PPI) network and the transcriptional regulatory network were constructed using Cytoscape software.

Results: A total of 529 DEGs were identified in the normal glioma cells compared with the CEBPB-silenced glioma cells, including 336 up-regulated and 193 down-regulated genes. The significantly enriched pathways included the chemokine signaling pathway (which involved CCL2), focal adhesion (which involved THBS1 and THBS2), the TGF-beta signaling pathway (which involved THBS1, THBS2, SMAD5, and SMAD6), and chronic myeloid leukemia (which involved TGFBR2 and CCND1). In the PPI network, CCND1 (degree = 29) and CCL2 (degree = 12) were hub nodes. Additionally, CEBPB and TCF12 might function in glioma by targeting other genes (CEBPB → TCF12, CEBPB → TGFBR2, and TCF12 → TGFBR2).

Conclusions: CEBPB might act in glioma by regulating CCL2, CCND1, THBS1, THBS2, SMAD5, SMAD6, TGFBR2, and TCF12.


Background
Glioma, one of the most common primary malignancies in the brain or spine, accounts for nearly 30 % of all brain and central nervous system tumors and 80 % of all malignant brain tumors [1,2]. Previous research has shown that the most important hallmarks of malignant glioma are invasion and angiogenesis [3]. So far, researchers have indicated that glioma can be induced by neurofibromatoses and tuberous sclerosis complex [4], electromagnetic radiation [5], and DNA repair genes (such as excision repair cross-complementing 1, ERCC1, and X-ray repair cross-complementing group 1, XRCC1) [6]. However, the exact molecular mechanisms of glioma are still unclear.
In the central nervous system, neoplastic transformation can convert neural cells into cells of a mesenchymal phenotype, which are able to invade and to promote angiogenesis [7,8]. Moreover, mesenchymal stem cell (MSC)-like properties may play a role in the tumorigenesis, invasion, and recurrence of primary glioblastoma tumors [8]. The transcription factor (TF) CCAAT/enhancer binding protein beta (CEBPB) is associated with the mesenchymal state of primary glioblastoma, and its expression in glioma is important for maintaining tumor-initiating capacity and invasion ability [9,10]. In addition, transforming growth factor beta 1/SMAD family member 3 (TGFB1/SMAD3) signaling plays a key role in extracellular matrix (ECM) production, which can lead to glioblastoma aggression [11,12]. It has been revealed that CEBPB can regulate the synthesis of ECM [13]. However, the regulation of TGFB1/SMAD3 by CEBPB in glioma has seldom been studied.
In this study, to gain a better understanding of the regulatory mechanisms of CEBPB and to investigate whether CEBPB could regulate the production of ECM via the TGFB1/SMAD3 signaling pathway in glioma, the microarray data deposited by Carro et al. were further analyzed with bioinformatics methods. Firstly, the differentially expressed genes (DEGs) between SNB19 human glioma cells transduced with non-target control small hairpin RNA (shRNA) lentiviral vectors for 72 h and SNB19 human glioma cells transduced with CEBPB shRNA lentiviral vectors for 72 h were identified and annotated. Subsequently, their potential functions were predicted by enrichment analysis. Finally, the protein-protein interaction (PPI) network and the transcriptional regulatory network were constructed to screen key genes.

Microarray dataset
The microarray dataset of GSE19114 [14] was downloaded from the Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) database, which was based on the platform of GPL6947 Illumina HumanHT-12 V3.0 expression beadchip. A total of 74 samples were included in the dataset, among which 3 samples of SNB19 human glioma cells transduced with non-target control shRNA lentiviral vectors for 72 h (normal glioma cells) and 3 samples of SNB19 human glioma cells transduced with CEBPB shRNA lentiviral vectors for 72 h (CEBPB-silenced glioma cells) were used to study the effect of CEBPB on glioma.

Data preprocessing and DEGs screening
The preprocessed microarray data, comprising 48803 probes, were obtained from GEO2R of the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov/geo/geo2r/). The linear models for microarray data (limma) package [15] was used to identify the DEGs between the normal glioma cells and the CEBPB-silenced glioma cells. The Benjamini-Hochberg (BH) method [16] was applied to adjust the raw p values into false discovery rates (FDR). FDR < 0.05 and |log2 fold change (FC)| > 1 were used as the cut-off criteria.
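The screening step can be illustrated with a minimal sketch, assuming that raw p values and log2 fold changes have already been produced by a limma-style model (the moderated statistics of limma are not reproduced here). The function names `benjamini_hochberg` and `select_degs`, the probe identifiers, and the toy numbers are hypothetical and serve only to show how the BH adjustment and the two cut-offs combine.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return BH-adjusted p values (FDR) in the original input order."""
    m = len(pvalues)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(probes, log2_fc, pvalues, fdr_cutoff=0.05, lfc_cutoff=1.0):
    """Keep probes with FDR < fdr_cutoff and |log2 FC| > lfc_cutoff."""
    fdr = benjamini_hochberg(pvalues)
    return [
        (probe, lfc, q)
        for probe, lfc, q in zip(probes, log2_fc, fdr)
        if q < fdr_cutoff and abs(lfc) > lfc_cutoff
    ]


# Hypothetical toy input (not taken from the dataset).
probes = ["probe_a", "probe_b", "probe_c", "probe_d"]
log2_fc = [2.1, -1.6, 0.4, 1.2]
pvalues = [0.001, 0.01, 0.02, 0.5]

degs = select_degs(probes, log2_fc, pvalues)
for probe, lfc, q in degs:
    direction = "up" if lfc > 0 else "down"
    print(f"{probe}\t{direction}\tlog2FC={lfc:.2f}\tFDR={q:.4f}")
```

In this toy input, probe_c passes the FDR threshold but not the fold-change threshold, and probe_d fails the FDR threshold, so only the first two probes are retained.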

Functional and pathway enrichment analysis
Gene Ontology (GO, http://www.geneontology.org/) annotations are of great importance for mining biological and functional significance from large datasets [17]. The Kyoto Encyclopedia of Genes and Genomes (KEGG, http://www.genome.ad.jp/kegg) database represents higher-order functions in terms of networks of interacting molecules [18]. The Database for Annotation, Visualization, and Integrated Discovery (DAVID) online tool [19] was employed to perform GO functional and KEGG pathway enrichment analyses for the DEGs. A p value < 0.05 was used as the cut-off criterion.
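The idea behind such an enrichment p value can be sketched with a plain upper-tail hypergeometric test. This is an illustrative simplification and not DAVID's own statistic (DAVID reports a modified Fisher exact p value); the function name `hypergeom_enrichment_p` and the gene counts below are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_degs, n_overlap):
    """Probability of seeing at least n_overlap pathway genes among n_degs
    genes drawn at random from n_background genes, of which n_pathway
    belong to the pathway (upper-tail hypergeometric test)."""
    total = comb(n_background, n_degs)
    upper = min(n_pathway, n_degs)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical toy counts: 20 background genes, 5 in the pathway,
# 4 DEGs in total, 3 of which fall in the pathway.
p_value = hypergeom_enrichment_p(20, 5, 4, 3)
print(f"p = {p_value:.4f}", "enriched" if p_value < 0.05 else "not enriched")
```

A pathway would be reported as significantly enriched when this p value falls below the 0.05 cut-off.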

DEGs annotation
The TSGene database (http://bioinfo.mc.vanderbilt.edu/TSGene/), which contains detailed annotations for each tumor suppressor gene (TSG), such as cancer mutations, gene expression, methylation sites, transcriptional regulation, and PPIs, was applied to identify the TSGs among the DEGs [20]. Additionally, the tumor-associated gene (TAG) database (http://www.binfo.ncku.edu.tw/TAG/), which provides information about commonly shared functional domains in well-characterized oncogenes and TSGs, was used to screen the TAGs among the DEGs [21]. Besides, the ENCODE project [22], a collection of data about the transcriptional regulatory network, was used to screen the TFs among the DEGs.

PPI network construction
The PPI pairs were searched using the Search Tool for the Retrieval of Interacting Genes (STRING, http://string-db.org/) online tool [23]. A required confidence (combined score) > 0.4 was used as the cut-off criterion. Then, the Cytoscape software [24] was used to visualize the PPI network. Furthermore, connectivity degree analysis was performed to search for the hub nodes of the PPI network. The degree of a node corresponds to the number of interactions involving it [25], and hub nodes are the nodes with the highest degrees.
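The degree analysis described above can be sketched as follows, assuming the PPI pairs are available as (protein A, protein B, combined score) tuples with scores on a 0 to 1 scale (STRING download files use a 0 to 1000 scale, so a threshold of 400 would be the equivalent there). The function name `build_degree_table`, the scores, and the placeholder genes GENE_X and GENE_Y are hypothetical; only the CCND1-THBS1 and THBS1-CCL2 pairs are interactions mentioned in this paper.

```python
from collections import Counter


def build_degree_table(pairs, min_score=0.4):
    """Count, for every node, the distinct interactions scoring above min_score."""
    degree = Counter()
    seen = set()
    for a, b, score in pairs:
        edge = frozenset((a, b))  # undirected: (a, b) equals (b, a)
        if score <= min_score or a == b or edge in seen:
            continue
        seen.add(edge)
        degree[a] += 1
        degree[b] += 1
    return degree


# Hypothetical scores; GENE_X and GENE_Y are placeholders, not real results.
pairs = [
    ("CCND1", "THBS1", 0.62),
    ("THBS1", "CCL2", 0.55),
    ("THBS1", "CCND1", 0.62),   # duplicate of the first pair, ignored
    ("CCND1", "GENE_X", 0.71),
    ("CCND1", "GENE_Y", 0.45),
    ("CCL2", "GENE_Y", 0.31),   # below the confidence cut-off, ignored
]

degree = build_degree_table(pairs)
for node, deg in degree.most_common():
    print(node, deg)
```

Sorting the nodes by degree, as `most_common()` does here, puts the hub candidates at the top of the list.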

Transcriptional regulatory network construction
The ENCODE project is a collection of data about the transcriptional regulatory network, which helps illuminate TF-binding sites, histone marks, chromatin accessibility, DNA methylation, RNA expression, RNA binding, and other cell-state indicators [22]. Based on the transcriptional regulation interactions derived from the ENCODE project, the regulatory network containing CEBPB and TGFB1/SMAD3 was constructed with Cytoscape software [24].

Identification of DEGs
According to the analysis of the microarray dataset, a total of 529 DEGs (including 336 up-regulated genes and 193 down-regulated genes) were identified in the normal glioma cells compared with the CEBPB-silenced glioma cells. Among them, the top ten significantly up-regulated genes (such as thrombospondin 1 (THBS1) and chemokine (C-C motif) ligand 2 (CCL2)) and down-regulated genes (such as cyclin D1 (CCND1)) are displayed in Table 1.

The annotation of DEGs
A total of 54 DEGs were screened as TAGs, including 33 up-regulated and 21 down-regulated genes. Among the 33 up-regulated genes, there were 22 TSGs (such as THBS1), 6 oncogenes, and 5 other genes (such as CCL2). Meanwhile, there were 13 TSGs, 4 oncogenes (such as CCND1), and 4 other genes among the 21 down-regulated genes. Additionally, 9 DEGs were screened as TFs, including 8 up-regulated genes and 1 down-regulated gene (Table 4).

Transcriptional regulatory network analysis
To further study the regulation of TGFB1/SMAD3 by CEBPB, the transcriptional regulation interactions related to TGFB1/SMAD3 and the members of the TGFB family were screened out from the ENCODE database, and the transcriptional regulatory network was visualized with Cytoscape software (Fig. 2). The transcriptional regulatory network showed that CEBPB could directly regulate SMAD3, transcription factor 12 (TCF12), transforming growth factor beta 2 (TGFB2), TGFBR2, and TGFBR3. Additionally, TCF12 targeted TGFB1, TGFBR1, TGFBR2, TGFBR3, and SMAD3.
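The regulatory relationships listed above can be represented as a small directed graph, which makes the direct and indirect routes from CEBPB to a given target explicit. The sketch below is illustrative only: the edges are those stated in this section, while the function name `simple_paths` and the dictionary layout are assumptions, not part of the original analysis.

```python
def simple_paths(network, source, target, path=None):
    """Return every cycle-free regulatory path from source to target."""
    path = (path or []) + [source]
    if source == target:
        return [path]
    found = []
    for nxt in network.get(source, []):
        if nxt not in path:  # never revisit a node on the current path
            found.extend(simple_paths(network, nxt, target, path))
    return found


# Directed edges (regulator -> targets) as reported in the text.
network = {
    "CEBPB": ["SMAD3", "TCF12", "TGFB2", "TGFBR2", "TGFBR3"],
    "TCF12": ["TGFB1", "TGFBR1", "TGFBR2", "TGFBR3", "SMAD3"],
}

for target in ("TGFB1", "SMAD3", "TGFBR2"):
    for route in simple_paths(network, "CEBPB", target):
        print(" -> ".join(route))
```

With these edges, TGFB1 is reachable from CEBPB only through TCF12, whereas SMAD3 and TGFBR2 are reachable both directly and through TCF12.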

Discussion
In this study, a total of 529 DEGs were obtained, including 336 up-regulated genes and 193 down-regulated genes. Enrichment analysis indicated that the up-regulated CCL2 was significantly enriched in the chemokine signaling pathway. Reports have found that chemokines expressed by stromal cells or endogenously produced in glioma cells may play key roles in tumor cell migration, invasion, proliferation, angiogenesis, and immune cell infiltration in the tumor mass [26]. The chemokine CCL2 can promote glioma aggressiveness by attracting T regulatory cells (which suppress the anti-tumor effector function of lymphocytes) and microglial cells (which can reduce anti-tumor functions and secrete pro-invasive metalloproteinases) [27,28]. Meanwhile, metalloproteinases can promote glioma invasion through the detachment of ECM [29]. Besides, the DEG annotation showed that CCL2 was screened out as a TAG. Therefore, we speculated that the increased expression of CCL2 could promote glioma aggressiveness through the chemokine signaling pathway.
In addition, some up-regulated genes (such as THBS1, THBS2, SMAD5, and SMAD6) were significantly enriched in the TGF-beta signaling pathway in our study. Recently, it has been reported that TGFB is a key factor in controlling migration, invasion, and angiogenesis in glioblastoma and induces profound immunosuppression [30]. Besides, THBS1 (a member of the thrombospondin family), which is referred to as a TGFB-activating protein, induces glioma invasion [31]. THBS1 is also a powerful antiangiogenic protein in glioblastoma [32]. These findings suggest that THBS1 might play a key role in regulating angiogenesis in glioma. As another member of the thrombospondin family, THBS2 may be a potential inhibitor of tumor growth and angiogenesis [33]. Moreover, it has been shown that THBS2 can function as an endogenous inhibitor of angiogenesis by directly affecting endothelial cell migration, proliferation, survival, and apoptosis [34]. In our study, we also found that THBS1 and THBS2 were significantly involved in the focal adhesion pathway. A previous study reported that focal adhesion can suppress the migration and metastasis of tumor cells [35].

Table 4 The identified transcription factors (TFs) and tumor-associated genes (TAGs) among the differentially expressed genes (DEGs). TSGs, tumor suppressor genes. Columns: DEGs; TF numbers; TFs; TAG numbers; TAGs (TSGs, oncogenes, others).

Therefore, we speculated that THBS1 and THBS2 could regulate angiogenesis and invasion in glioma via the TGF-beta signaling pathway and the focal adhesion pathway. Previous research has shown that SMAD6 is an inhibitor of TGFB signaling and blocks the phosphorylation of receptor-regulated SMADs (such as SMAD5) in the cytoplasm [36]. As a result, we assumed that SMAD5 and SMAD6 might affect glioma by regulating TGFB signaling. In the PPI network, THBS1 could interact with CCL2, indicating that THBS1 might, to some extent, play key roles in glioma through regulating CCL2. Consequently, THBS1, THBS2, SMAD5, and SMAD6 could be key factors involved in the CEBPB-silenced glioma. Moreover, CCND1, a member of the cyclin family, had the highest degree in the PPI network. Cyclins can modulate the tumor cell cycle through alterations in cyclin-dependent kinase activity [37]. Furthermore, researchers have discovered that overexpression of CCND1 can elevate the proliferation and invasion potential of human glioblastoma cells [38]. In the PPI network, we also found that CCND1 interacted with THBS1, suggesting that CCND1 could be involved in regulating the proliferation and invasion of glioma via interacting with THBS1.

Fig. 1 The protein-protein interaction (PPI) network for the differentially expressed genes (DEGs). The red circles represent the up-regulated genes. The green circles indicate the down-regulated genes.

TGFBR2 plays a key role in TGFB signal propagation via activating TGFBR1 and the phosphorylation of SMAD proteins [39]. Moreover, silencing of TGFBR2 can abolish TGFB-induced invasion and migratory responses of glioblastoma in vitro [40]. In our study, we also discovered that the up-regulated TCF12 could regulate TGFB1 and SMAD3, indicating that CEBPB might regulate TGFB1 and SMAD3 through TCF12. Previous studies have shown that TGFB1/SMAD3 can promote tumor cell migration, invasion, and metastasis by inducing epithelial-mesenchymal transition [41,42]. Furthermore, TCF12 has been found to suppress the expression of E-cadherin, which can lead to the metastasis of tumor cells [43]. Therefore, we assumed that CEBPB might regulate TGFBR2 and SMAD3 through the TGFB1/SMAD3 signaling pathway in glioma, and that CEBPB could also affect the metastasis of glioma by regulating TCF12. However, in our study, TGFB1 and SMAD3 were not significantly differentially expressed, which might be due to the relatively short duration of CEBPB silencing. In our further research, the regulation of TGFB1/SMAD3 by CEBPB will be studied with CEBPB silenced for a longer time.

Conclusions
We conducted a comprehensive bioinformatics analysis to identify genes which may be correlated with CEBPB-silenced glioma. A total of 529 DEGs were identified in the normal glioma cells compared with the CEBPB-silenced glioma cells. The identified DEGs, such as TCF12, TGFBR2, CCL2, THBS1, THBS2, SMAD5, SMAD6, and CCND1, might play important roles in the progression of glioma via the regulation of CEBPB. However, further research is still needed to unravel their mechanisms of action in glioma.

Fig. 2 The transcriptional regulatory network involving CEBPB and TGFB1/SMAD3. The red and green nodes represent the up-regulated and down-regulated genes, respectively. The blue nodes stand for non-differentially expressed genes. The arrows represent regulatory relationships.