Identification and interaction analysis of key genes and microRNAs in hepatocellular carcinoma by bioinformatics analysis

Background Hepatocellular carcinoma (HCC) is the most common liver malignancy worldwide. However, present studies of its multiple gene interaction and cellular pathways still could not explain the initiation and development of HCC perfectly. To find the key genes and miRNAs as well as their potential molecular mechanisms in HCC, microarray data GSE22058, GSE25097, and GSE57958 were analyzed. Methods The microarray datasets GSE22058, GSE25097, and GSE57958, including mRNA and miRNA profiles, were downloaded from the GEO database and were analyzed using GEO2R. Functional and pathway enrichment analyses were performed using the DAVID database, and the protein–protein interaction (PPI) network was constructed using the Cytoscape software. Finally, miRDB was applied to predict the targets of the differentially expressed miRNAs (DEMs). Results A total of 115 differentially expressed genes (DEGs) were found in HCC, including 52 up-regulated genes and 63 down-regulated genes. The gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses from DAVID showed that up-regulated genes were significantly enriched in chromosome segregation and cell division, while the down-regulated genes were mainly involved in complement activation, protein activation cascades, carboxylic acid metabolic processes, oxoacid metabolic processes, and the immune response. From the PPI network, the 18 nodes with the highest degree were screened as hub genes. Among them, ESR1 was found to have close interactions with FOXO1, CXCL12, and GNAO1. In addition, a total of 64 DEMs were identified, which included 58 up-regulated miRNAs and 6 down-regulated miRNAs. ESR1 was potentially targeted by five miRNAs, including hsa-mir-18a and hsa-mir-221. Conclusions The roles of DEMs like hsa-mir-221 in HCC through interactions with DEGs such as ESR1 and CXCL12 may provide new clues for the diagnosis and treatment of HCC patients. Electronic supplementary material The online version of this article (doi:10.1186/s12957-017-1127-2) contains supplementary material, which is available to authorized users.

Background HCC is the most common primary liver malignancy and is one of the leading causes of cancer-related deaths around the world. In Asia, there are more than 580,000 new cases expected every year [1]. As well as other carcinomas, gene aberrations, cellular context, and environmental influences are believed to be the reason of the occurrence, progression, and metastasis of HCC.
Although the study of the multiple genes and cellular pathways that take part in the initiation and development of HCC has been discussed for many years, the therapy for HCC accurately still remains scarcely. Accordingly, it is crucial to investigate the molecular mechanisms involved in the proliferation, apoptosis, and invasion of HCC for the improvement of diagnostic and therapeutic strategies.
In recent years, the microarray, a high-throughput platform for analysis of gene expression, has been extensively conducted as an efficient tool for the identification of general genetic alteration during tumorigenesis [2].
While studies of DEGs and DEMs of HCC have been performed in the past decades and some of their functions in different pathways, biological processes, or molecular functions have been reported, there remain questions about how the DEGs and microRNAs interact through molecular pathways because of limitations on the comparative analysis of the DEGs in independent studies. Currently, due to bioinformatics methods, we can finally deal with the data generated by microarray technology and find the interactions among DEGs and microRNAs, especially the pathways in the interaction network, to conclude their potential mechanisms in HCC.
In this study, we chose three gene expression profiles (GSE22058, GSE25097, and GSE57958), which were down loaded from the GEO database (https://www.ncbi.nlm.nih. gov/geo/), to obtain DEGs and DEMs between liver cancer tissues and normal tissue samples. Then, functional enrichment and network analyses were applied to identify the DEGs, which were combined with mRNA-microRNA interaction analysis, to describe the key genes and miRNAs as well as their potential molecular mechanisms in HCC.

Collection and inclusion criteria of studies
We searched the GEO database (https://www.ncbi.nlm. nih.gov/geo/) for publicly available studies from January 1, 2010 to October 30, 2016 using the following keywords: "hepatocellular carcinoma" (study keyword), "Homo sapiens" (organism), "Expression profiling by array" (study type), "70 to 3000" (sample count) and "tissue" (attribute name). After a systematic review, 13 GSE studies were retrieved. The inclusion criteria for studies were as follows: (1) samples diagnosed with HCC tissue samples and normal tissue samples, (2) gene expression profiling of mRNA, (3) sample count of each group are more than 35, and (4) sufficient information to perform the analysis. Then, three gene expression profiles (GSE22058, GSE25097, and GSE57958) were collected for analysis.

Microarray data
Three gene expression profiles (GSE22058, GSE25097, and GSE57958) were downloaded from the GEO database. The array data for GSE25097 included 268 HCC tissue samples and 243 normal tissue samples [3]. The array data for GSE57958 consisted of 39 HCC tissue samples and 39 normal tissue samples [4]. The array data for GSE22058 consisted of one mRNA expression profile (including 100 HCC tissue samples and 97 normal tissue samples) and a miRNA expression profile (including 96 HCC tissue samples and 96 normal tissue samples) [5].
Data processing GEO2R (http://www.ncbi.nlm.nih.gov/geo/geo2r/) is an interactive web tool for comparing two groups of data that can analyze any GEO series [6]. GEO2R was applied to screen differentially expressed mRNAs and miRNAs between HCC and normal tissue samples. The adjusted P values (adj. P) using Benjamini and Hochberg (BH) false discovery rate (FDR) method by default were applied to correct for the occurrence of false positive results. An adj. P < 0.05 and a |logFC| ≥ 1 were set as the cut-off criteria. Heat map of DEGs was generated using the online tool Morpheus (https://software.broadinstitute.org/ morpheus/).

Functional and pathway enrichment analysis
Gene ontology (GO) is a common method for annotating genes, gene products and sequences to underlying biological phenomena [7,8]; the Kyoto Encyclopedia of Genes and Genomes (KEGG) is an integrated database resource for biological interpretation of genome sequences and other high-throughput data [9]. Both analyses were available in the DAVID database (https://david.ncifcrf.gov/), which is a bioinformatics data resource composed of an integrated biology knowledge base and analysis tools to extract meaningful biological information from large quantities of genes and protein collections [10]. GO and KEGG analyses were performed using the DAVID database to identify DEGs. A P value <0.05 was set as the cut-off criterion.

PPI network construction and analysis of modules
The STRING database (http://string-db.org/) is an online software that aims to provide a critical assessment and integration of protein-protein interactions, including direct (physical) and indirect (functional) associations [11]. Cytoscape is a popular open-source software tool for the visual exploration of biomolecule interaction networks composed of protein, gene, and other types of interactions [12]. The DEGs were mapped to STRING to evaluate the PPI information and then visualized with Cytoscape. A combined score >0.15 was set as the cut-off criterion. To screen the hub genes, node degree ≥10 was set as the cut-off criterion. Then, the Molecular Complex Detection (MCODE) plug-in was used to screen modules of hub genes from the PPI network with degree cut-off = 10, haircut on, node score cut-off = 0.2, k-core = 2, and max. depth = 100. Moreover, the functional and pathway enrichment analyses of DEGs in each module were performed by DAVID. A P value <0.05 was set as the cut-off criterion.

Prediction of miRNA targets
The target genes of the DEMs from GSE22058 were predicted with miRDB (http://mirdb.org/miRDB/), which is an online database for predicting microRNA targets [13]. The target genes were aligned with the DEGs to obtain an intersection for further analysis.

Identification of DEGs
A total of 2021, 1097, and 409 DEGs were identified after the analyses of the GSE22058, GSE25097, and GSE57958 datasets, respectively (Additional files 1, 2, and 3). Among them, 116 genes were found in all three datasets (Fig. 1). Of these, 115 gene expressions were matched, including 52 up-regulated genes and 63 down-regulated genes in HCC tissue samples compared with normal liver tissue samples.

Functional and pathway enrichment analyses
To further understand the function and mechanism of the identified DEGs, functional and pathway enrichment analyses, including GO and KEGG, were performed using DAVID. The GO term enrichment analysis showed that in the biological processes-associated category, the upregulated genes were significantly enriched in chromosome segregation and cell division, while the down-regulated genes were mainly involved in complement activation, protein activation cascades, carboxylic acid metabolic processes, oxoacid metabolic processes, and the immune response (Table 1). In addition, cell component analysis showed that the up-regulated genes were enriched in the chromosome, cytosol, proteinaceous extracellular matrix, and basement membrane collagen trimer, and that the down-regulated genes were mainly found in the collagen trimer, blood microparticles, membrane attack complexes, membrane-bounded vesicles, and the extracellular region part (Table 1). Moreover, for molecular function, the up-regulated genes were enriched in histone deacetylase binding, and the down-regulated genes were enriched in steroid binding, aromatase activity, oxidoreductase activity, monooxygenase activity, and serine-type endopeptidase activity (Table 1). Furthermore, the KEGG pathway analysis showed that the up-regulated genes were significantly enriched in ECM-receptor interaction, while four pathways were overrepresented in the down-regulated genes: chemical carcinogenesis, steroid hormone biosynthesis, retinol metabolism, and drug metabolism-cytochrome P450 (Table 1).

PPI network construction and analysis of modules
Ninety-seven nodes and 319 edges were mapped in the PPI network of identified DEGs, including 43 up-regulated genes and 54 down-regulated genes (Fig. 2a). The 18 nodes with the higher degrees were screened as hub genes, including TOP2A, FOS, TK1, CDC20, ESR1, CCNB2, CXCL12, FOXO1, HMMR, VWF, ACSM3, COL4A1, ZIC2, RFC4, TXNRD1, GNAO1, CYP3A4, and RAP2A ( Table 2). Hub genes expression heat map in GSE25097 is shown in Fig. 2c. Among these, ESR1 was found to have close interactions with FOS, FOXO1, CXCL12, and GNAO1; CDC20 had interactions with CCNB2, CDCA5, CENPM, and GMNN (combined score > 0.90). A significant module including 16 nodes and 58 edges was obtained using MCODE (Fig. 2b). GO term enrichment analysis showed that in biological processes, the genes in this module were mainly associated with the cell cycle and responses to oxidation reactions and steroid hormones ( Table 3). The genes were significantly enriched in the nucleoplasm, cytosol, and chromosome by cell component analysis ( Table 3). The molecular function analysis showed that the genes were mainly involved in the binding of macromolecular complexes, chromatin, enzymes, RNA polymerases, and carbohydrate derivatives ( Table 3). The KEGG analysis showed that the genes were mainly enriched in ECM-receptor interactions (Table 3).

miRNA-DEG pairs
A total number of 64 DEMs between HCC tissue samples and normal tissue samples were identified after the analyses of the GSE22058 datasets (Additional file 4), including 58 up-regulated miRNAs and 6 downregulated miRNAs (Fig. 3). The miRDB database was used to predict target genes of the identified DEMs (Table 4). By comparing the target gene to the DEGs, we screened the genes with a consistent expression trend for further analysis. miRNA-221, one of the most significantly up-regulated miRNAs, was found to target ESR1, FOS, and CXCL12. In addition, miRNA-142-5p, one of the main down-regulated miRNAs, potentially targeted ANKRD29, IGF2BP3, and IGSF3. Moreover, we found that ESR1 was potentially targeted by five miRNAs, including hsa-mir-148b, hsa-mir-181b, hsamir-18a, hsa-mir-19a, and hsa-mir-221 (Table 2). In addition, FOXO1 was the potential target of hsa-mir-135b, hsa-mir-324, and hsa-mir-369.

Discussion
Although people have continuously studied HCC, the early diagnosis and treatment of HCC is still a large problem due to the lack of understanding of the molecular mechanisms that drive the occurrence and development of HCC. Therefore, in-depth research into the factors and mechanisms of HCC progression are necessary for HCC diagnosis and treatment. Due to well-developed microarray technology, it is easier to determine the general genetic alterations in the progression of diseases, which can allow for the identification of gene targets for diagnosis, therapy, and prognosis of tumors.
In our study, a total of 115 DEGs were screened, including 52 up-regulated genes and 63 down-regulated genes. The up-regulated genes were enriched in chromosome segregation and cell division, while the down-regulated Count: the number of enriched genes in each term If there were more than five terms enriched in this category, the top five terms were selected per the P value genes were mainly involved in complement activation, protein activation cascades, carboxylic acid metabolic processes, oxoacid metabolic processes, and the immune response. Moreover, by constructing the PPI, we identified high degree genes including ESR1, which was found to have close interactions with FOXO1, FOS, CXCL12, and GNAO1. ESR1 could encode the transcription factor that enhances the response to various stimuli such as estrogen and growth factors in different kinds of tissue types [14]. Some researchers noticed that it may play a potential tumor suppressor features in HCC with its novel associations among the hepatocyte-specific pathways [15]. Moreover, there are also some reports claiming that ESR1 could suppress the inflammatory process mediated by interleukin-6 and reduce hepatic injury and the compensatory proliferation of hepatocytes [16]. Recent research has shown that cancers such as non-small cell lung cancer (NSCLC) also have very lowexpression level of ESR1 [17]. FOXO, which has four members including FOXO1, is one of the 19 kinds of forkhead box transcription factors and is mainly express in mammals [18,19]. Reports showed that the overexpression of FOXO could inhibit the growth and size of tumors [18,20]. Some research on breast cancer also claimed that when FOXO proteins accumulate in the nucleus, they could suspend cell cycle progression and promote apoptosis [20,21]. Recent research on FOXO proteins provides a new viewpoint that they may possess antitumor properties in HCC, inducing Fig. 2 Protein-protein interaction (PPI) network and hub genes. a PPI network of differentially expressed genes (DEGs). b A significant module selected from the PPI network. Red nodes denote up-regulated genes, while green nodes denote down-regulated genes. Black border shows that the gene is a potential target for differentially expressed miRNAs (DEMs). The lines represent an interaction relationship between the nodes. c Hub genes expression heat map (11 up-regulated genes and 7 down-regulated genes) in GSE25097. Red: up-regulation; purple: down-regulation the expression of pro-apoptotic genes and interfering with signaling cascades, such as the Wnt/β-catenin, PI3K/AKT/ mTOR, or MAPK pathways that are commonly changed in HCC [21,22]. Meanwhile, FOS is reported as an oncogene in several kinds of cancers such as bladder cancer and HCC [23,24]. There is also a recent research that shows that FOS has a high expression in HCC cell lines [25]. Yet in this work, we found it as a low-expression hub gene through the Microarray data analysis. Therefore it may need a further test in the future experiments and is excluded from the work this time. However, the molecular mechanism of ESR1 for HCC and the relationship of ESR1 and FOXO1 were rarely explored. In our study, we found that ESR1 and FOXO1 both had low-expression levels and had an interaction, indicating a joint function in HCC.
Stromal-derived factor 1 alpha, also known as CXCL12, is a specific ligand for CXCR4 and CXCR7. These three proteins together drive the migration of progenitor cells in embryonic development. Some invasion-related or metastasis-related pathways, such as cytokine-cytokine receptor interaction and axon guidance, contain CXCL12 as an important role in the regulation of themselves [26]. Several studies have shown that CXCL12 has a lower expression level in HCC tissue than in normal liver tissue [27]. There is also a study showing that CXCL12 and CXR4 may play significant roles in the metastasis of HCC by promoting the migration of tumor cells [28]. GNAO1 is a member of the subunit family of Gα proteins. As a molecular switch that controls signal transduction, the deregulation of GNAO1 can facilitate oncogenesis [29]. Kan et al. found that the role of mutant GNAO1 in oncogenesis might be to act as a tumor suppressor gene [30]. Jia et al. combined an integrated CNA (chromosomal copy    If there were more than ten genes predicted by miRDB, only ten genes were listed in the table number alteration) analysis with gene expression data to demonstrate that GNAO1 may play a key role in the pathogenesis of HCC [31]. Meanwhile, some research has shown that GNAO1 also plays a significant role in breast cancer and hepatocellular carcinoma [32]. Above all, these results suggest that ESR1, FOXO1, CXCL12, and GNAO1 are involved in the pathogenesis of carcinoma by affecting cell division, complement activation, and protein activation cascades, which support our findings.
Increasing evidence has shown that the dysregulation of miRNAs is an important part of the pathogenesis of multiple cancer types, including HCC. In our study, we identified 64 DEMs, including 58 up-regulated and 6 down-regulated miRNAs in HCC. miR-221 is one of the most significantly up-regulated miRNAs and was found to target ESR1, FOS, and CXCL12. Additionally, miRNA-142-5p is the main down-regulated miRNA and potentially targets ANKRD29, IGF2BP3, and IGSF3. A recent report shows that miRNA-142-5p could regulate the expression of IGF2BP3, which is strongly associated with an advanced tumor stage and is a predictor of poor prognosis among patients with HCC, as with IGF2BP1 [33]. miR-221, which is one of the most frequently and consistently up-regulated microRNAs (miRNAs) in human cancer, has been suggested to act as a tumor promoter. A recent study presents that miR-221 overexpression could accelerate hepatocyte proliferation and contribute to liver tumorigenesis [34]. Additionally, miR-221 also had a higher expression level in NSCLC [35]. Further researches give a conclusion that through the stimulation from staphylococcal nuclease domain-containing 1 (SND1), miR-221 could result in the overexpression of angiogenic factors like angiogenin and CXCL16 [36]. By using parallel measurement, a change in CDKN1B, CDKN1C, paralemmin-2, and CXCL12 levels was suggested as consistent with increased miR-221 activity in the same group. Meanwhile, Yau et al. found that miR-221 and miR-18a levels were also significantly higher in colorectal carcinoma tissues compared with their respective adjacent normal tissues [37]. miR-18a and miR-19a, regulated by E2F-MYC signaling pathways, are reported playing a crucial role in inducing cell proliferation [38]. As we found that ESR1 was potentially targeted by hsa-mir-148b, hsa-mir-181b, hsa-mir-18a, hsa-mir-19a, and hsa-mir-221, it indicates that these miRNAs may play a key role in HCC by mediating ESR1.

Conclusions
In conclusion, our study tried to identify DEGs using comprehensive bioinformatics analyses and found potential biomarkers to predict the progression of diseases. After analysis, a total of 115 DEGs and 64 DEMs were screened including ESR1, FOXO1, CXCL12, GNAO1, and several miRNAs such as miR-221 and miR-142-5p. Significantly, the roles of hsa-mir-148b, hsa-mir-181b, hsa-mir-18a, hsa-mir-19a, and hsa-mir-221 in HCC through interactions with ESR1 and CXCL12 may provide new clues for the diagnosis and treatment of HCC patients. However, lacking of experimental verification is a limitation of this study. In the future, these predicted results obtained from bioinformatics analysis can be verified by further experimental researches, such as qRT-PCR and Western Blot.