Clinical value of miR-182-5p in lung squamous cell carcinoma: a study combining data from TCGA, GEO, and RT-qPCR validation

Background MiR-182-5p, a member of the miRNA family, can be detected in lung cancer and plays an important role in the disease. This study aimed to explore the clinical value of miR-182-5p in lung squamous cell carcinoma (LUSC) and to unveil the underlying molecular mechanism. Methods The clinical value of miR-182-5p in LUSC was investigated by collecting and analyzing data from The Cancer Genome Atlas (TCGA) database, the Gene Expression Omnibus (GEO) database, and real-time quantitative polymerase chain reaction (RT-qPCR). Twelve prediction platforms were used to predict the target genes of miR-182-5p. Protein-protein interaction (PPI) networks, gene ontology (GO), and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were used to explore the molecular mechanism of LUSC. Results MiR-182-5p expression was significantly higher in LUSC than in non-cancerous tissues, as evidenced by various approaches, including the TCGA database, GEO microarrays, RT-qPCR, and a comprehensive meta-analysis of 501 LUSC cases and 148 non-cancerous cases. Furthermore, a total of 81 potential target genes were chosen from the overlap of the predicted genes and the TCGA database. GO and KEGG analyses demonstrated that the target genes are involved in pathways related to biological processes. PPIs revealed the relationships between these genes, with EPAS1, PRKCE, NR3C1, and RHOB being located in the center of the PPI network. Conclusions MiR-182-5p upregulation greatly contributes to LUSC and may serve as a biomarker in LUSC.


Background
Lung cancer is a major cause of cancer-related death, and its rate continues to increase [1,2]. Non-small cell lung cancer (NSCLC) accounts for 80% of all lung cancer cases. Subgroups of NSCLC include lung adenocarcinoma, lung squamous cell carcinoma (LUSC), and lung large cell carcinoma. According to recent studies, most patients are diagnosed at an advanced stage, which leads to low cure rates. Only a minority of patients can be treated by surgery [1,3]. Some patients are treated with chemotherapy, radiation therapy, and targeted therapy, but the 5-year survival rate remains low and many patients are at risk of recurrence [2][3][4][5][6]. The cure rate of lung cancer can be improved by early diagnosis. At present, the diagnosis of lung cancer mainly depends on pathological biopsy. Therefore, it is important to explore the molecular mechanism of lung cancer and to improve its diagnosis and treatment [7].
MiR-182-5p, a member of the miRNA family, can be detected in many cancers, including lung cancer, in which its expression is upregulated [6,21]. Several studies indicated that miR-182-5p acts as an onco-miR to enhance tumor cell proliferation [21][22][23]. However, previous studies have focused on particular aspects of miR-182-5p in LUSC and thus lacked a comprehensive description. Previous articles often reported only the p values of statistical tests rather than the expression values of miR-182-5p, so the underlying data cannot be obtained. In this study, we analyzed 388 LUSC samples from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases to verify the clinical value of miR-182-5p in LUSC. Next, 23 clinical LUSC samples were used to further confirm the clinical value of miR-182-5p. PubMed, Wiley Online Library, EBSCO, Cochrane Central Register of Controlled Trials, Web of Science, Google Scholar, Ovid, EMBASE, and LILACS were also searched to obtain document sources. Furthermore, we used miRBase (http://www.mirbase.org/) to discern the target genes of miR-182-5p and investigated the enrichment pathways and target genes by KEGG pathway and GO enrichment analyses and protein-protein interaction (PPI) networks. Building on the previous literature, we combined more samples and used various methods to reduce the discrepancies among existing reports. We hope this study provides comprehensive information on the role of miR-182-5p in the occurrence and progression of LUSC.

Methods
MiR-182 expression in LUSC samples from TCGA database
TCGA database provides comprehensive cancer genomic datasets that researchers can search, download, and analyze. In this study, we searched TCGA database (https://cancergenome.nih.gov/) to examine miR-182 expression in LUSC tissues. We obtained the miRNA profiles of 338 LUSC tissues and 45 non-cancerous tissues together with the clinical information. Afterward, miR-182 expression was extracted from the miRNA profiles. The extracted data were normalized and processed by log2 transformation. Subsequently, statistical analyses were performed to evaluate the miR-182 expression in LUSC tissues and the correlation between miR-182 expression and relevant clinical data. Additionally, to further analyze the overall survival of LUSC patients, a Kaplan-Meier curve was constructed using the median miR-182 expression value as the cutoff.
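The preprocessing and survival steps described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the pseudocount used before the log2 transformation, and the rule that values above the median count as "high" are all assumptions made for the example.

```python
import math
from statistics import median


def log2_transform(values, pseudocount=1.0):
    """Return log2(x + pseudocount); the pseudocount is an assumption
    that keeps zero read counts finite."""
    return [math.log2(v + pseudocount) for v in values]


def split_by_median(expression):
    """Label each sample 'high' or 'low' relative to the median expression."""
    cutoff = median(expression)
    return ["high" if e > cutoff else "low" for e in expression]


def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  -- follow-up time per patient
    events -- 1 if death was observed, 0 if the patient was censored
    Returns a list of (time, survival probability) at each event time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve
```

In practice one curve would be estimated for the "high" group and one for the "low" group, and the two would be compared, for example with a log-rank test.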

MiR-182-5p expression in LUSC tissues from the GEO database
We mined the GEO database (http://www.ncbi.nlm.nih.gov/geo/) to obtain microarray profiles from LUSC samples using the following search terms: (cancer OR carcinoma OR adenocarcinoma OR tumour OR tumor OR malignanc* OR neoplas*) AND (lung OR pulmonary OR respiratory OR respiration OR aspiration OR bronchi OR bronchioles OR alveoli OR pneumocytes OR "air way"). The search results were then specified using the following filters: Series[Entry type], Homo sapiens [Organism]. The microarrays were selected according to the inclusion criteria as follows: miR-182 expression was examined in LUSC tissues and non-cancerous tissues. Microarrays were considered ineligible according to the following exclusion criteria: (1) microarrays did not meet the inclusion criteria; (2) the microarray profile did not include miR-182 expression; (3) the microarray only provided LUSC tissues without a control group; (4) an insufficient number of LUSC samples for analysis; and (5) microarrays used cell line samples. A total of seven datasets were obtained, namely, GSE16025, GSE25508, GSE29248, GSE47525, GSE19945, GSE51853, and GSE74190.

Clinical samples
In our study, 23 formalin-fixed, paraffin-embedded LUSC tissues and their adjacent normal tissues were collected from the Pathology Department of the First Affiliated Hospital of Guangxi Medical University between January 2012 and February 2014. All samples were pathologically confirmed as LUSC by two independent pathologists (Z.-y.L. and G.C.). The study was approved by the Ethics Committee of the First Affiliated Hospital of Guangxi Medical University, and the clinical parameters of the 23 patients are shown in Table 1.

RT-qPCR
To detect the expression of miR-182 in 23 pairs of samples, RT-qPCR was carried out on an Applied Biosystems PCR 7900 system. Total RNA was extracted and normalized as previously reported [24][25][26][27][28]. The expression levels of miR-182 were evaluated with a mirVana RT-qPCR miRNA Detection Kit (Ambion Inc., Austin, TX, USA). The combination of miR-103 and miR-191 was considered an endogenous control and served as a reference in our previous study [29]. TaqMan Micro-RNA Assays from Applied Biosystems were used in the PCR system, and the sequences were as follows

Literature
We searched the literature on miR-182-5p in LUSC in PubMed, Wiley Online Library, EBSCO, Cochrane Central Register of Controlled Trials, Web of Science, Google Scholar, Ovid, EMBASE, and LILACS up to 5 October 2017, using the following keywords: (cancer OR carcinoma OR adenocarcinoma OR tumour OR tumor OR malignanc* OR neoplas*) AND (Lung OR pulmonary OR respiratory OR respiration OR aspiration OR bronchi OR bronchioles OR alveoli OR pneumocytes OR "air way") AND (miR-182 OR miRNA-182 OR microRNA-182 OR miR182 OR miRNA182 OR microRNA182 OR "miR 182" OR "miRNA 182" OR "microRNA 182" OR miR-182-5p OR miRNA-182-5p OR microRNA-182-5p). Studies were included if they met the following criteria: (1) the expression of miR-182-5p in LUSC was detected in Homo sapiens samples, and (2) the miR-182-5p expression data could be extracted from the study.

Meta-analysis
A comprehensive meta-analysis was performed using Stata 14.0 software by combining the four sources (RT-qPCR data, TCGA data, GEO datasets, and the literature) reporting miR-182 expression in LUSC. Separate meta-analyses of the RT-qPCR data, TCGA data, and GEO datasets were also performed. Pooled data in the meta-analysis were assessed by the standardized mean difference (SMD) with a 95% confidence interval (CI). Heterogeneity among the eligible microarrays was evaluated by the chi-squared and I-squared tests. The effect model was then determined according to the heterogeneity. Specifically, a fixed effects model was used for the meta-analysis when the heterogeneity was low (I² ≤ 50% and p > 0.05), and a random effects model was selected if apparent heterogeneity existed (I² > 50% or p ≤ 0.05) [30]. A summary receiver operating characteristic (sROC) curve was constructed to describe the diagnostic ability of miR-182-5p in LUSC.
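The pooling logic described above can be illustrated with a short sketch. The study itself used Stata; the code below is only an assumed equivalent that computes Cohen's d as the SMD, applies inverse-variance weighting, and uses the DerSimonian-Laird estimator for the random effects model. The function names and the variance approximation for d are choices made for this example, and the heterogeneity p value is passed in rather than computed.

```python
import math


def cohens_d(m1, s1, n1, m2, s2, n2):
    """SMD (Cohen's d) of group 1 versus group 2 and its approximate variance."""
    pooled_sd = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / pooled_sd
    var = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    return d, var


def pool(effects, variances):
    """Pool per-study SMDs; returns fixed and random effects estimates,
    Cochran's Q, I-squared (%), tau-squared, and the random effects 95% CI."""
    w = [1 / v for v in variances]
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # DerSimonian-Laird
    w_re = [1 / (v + tau2) for v in variances]
    random = sum(wi * e for wi, e in zip(w_re, effects)) / sum(w_re)
    se_re = math.sqrt(1 / sum(w_re))
    return {
        "fixed": fixed,
        "random": random,
        "Q": q,
        "I2": i2,
        "tau2": tau2,
        "ci95": (random - 1.96 * se_re, random + 1.96 * se_re),
    }


def choose_model(i2, p_heterogeneity):
    """Model selection rule stated in the text."""
    return "fixed" if i2 <= 50 and p_heterogeneity > 0.05 else "random"
```

For example, `cohens_d(14.4295, 1.16110, 338, 12.2828, 0.64852, 45)` would turn the TCGA group summaries reported in the Results into one SMD and variance that can be passed to `pool` together with the other datasets.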

MiR-182-5p predicted target genes
MiR-182 target genes were predicted in silico with 12 databases (miRWalk, Microt4, miRanda, mirbridge, miRDB, miRMap, miRNAMap, Pictar2, PITA, RNAhybrid, Targetscan, and RNA22). Genes present in at least five databases were regarded as predicted target genes of miR-182. Two databases (Tarbase and miRTarbase) were employed to gather miR-182 target genes with "strong evidence." All miR-182 target genes verified by western blot, qPCR, or luciferase reporter assays were selected as validated genes. Moreover, we identified weakly expressed genes in LUSC from TCGA database. Finally, the target genes of miR-182 were obtained from the three analyses (predicted genes, validated genes, and genes from TCGA database) and were used for further gene pathway analysis, GO analysis, statistical analysis, and generating ROC curves. A correlation analysis between hub genes and miR-182 was also conducted. For all analyses described above, a p value < 0.05 was regarded as indicating a significant difference.
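The "at least five of 12 databases" rule described above amounts to a simple vote count. The sketch below assumes each database's predictions are available as a list of gene symbols; the function name, the input layout, and the toy gene names are hypothetical.

```python
from collections import Counter


def genes_in_at_least(predictions, min_hits=5):
    """Return genes predicted by at least `min_hits` databases.

    predictions -- dict mapping a database name to an iterable of gene symbols
    """
    # set() ensures a gene listed twice by one database is counted once
    counts = Counter(g for genes in predictions.values() for g in set(genes))
    return {g for g, n in counts.items() if n >= min_hits}


# Toy input with placeholder gene names (not real predictions)
toy = {
    "db1": ["G1", "G2"],
    "db2": ["G1"],
    "db3": ["G1", "G2", "G2"],
}
print(sorted(genes_in_at_least(toy, min_hits=2)))
```

With the real data, `predictions` would hold one entry per prediction platform and `min_hits` would stay at its default of five.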

Functional enrichment analysis via bioinformatics
Predicted target genes were subjected to GO analysis in the DAVID database [31]. The BINGO plugin of Cytoscape was applied to visualize the GO network. The PPI networks were constructed using STRING 10.0 [32]. We also mapped genes to the KEGG database to identify significant signaling pathways. A p value < 0.001 was regarded to show statistical significance.

Statistical analysis
All statistical analyses were conducted using GraphPad 5.0 software. Student's t test was used to detect a significant difference in the miR-182 expression between two groups, and one-way analysis of variance was used to compare the miR-182 level among three or more groups. Furthermore, ROC curves were constructed, and the area under the curve (AUC) was calculated to assess the diagnostic role of miR-182 in LUSC. The diagnostic efficacy for LUSC was graded as low, moderate, or high depending on the AUC: 0.5-0.7 (low), 0.7-0.9 (moderate), and 0.9-1.0 (high). A difference was considered statistically significant when p < 0.05.
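As an illustration of the AUC grading above, the AUC can be computed directly as the probability that a randomly chosen cancer sample has a higher expression value than a randomly chosen control. This rank-based sketch is an assumed stand-in for the GraphPad calculation, and the handling of the shared bin edges (0.7 and 0.9 assigned to the higher grade) is also an assumption, since the text does not specify it.

```python
def auc(cases, controls):
    """AUC as P(case > control), counting ties as one half."""
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def grade(auc_value):
    """Map an AUC to the efficacy grades used in the text."""
    if auc_value >= 0.9:
        return "high"
    if auc_value >= 0.7:
        return "moderate"
    if auc_value >= 0.5:
        return "low"
    return "below chance"
```

For instance, `grade(auc(tumor_values, normal_values))` would grade one dataset, where `tumor_values` and `normal_values` are hypothetical lists of log2 expression values.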

Clinical value of miR-182-5p
Expression of miR-182-5p in LUSC from TCGA database
A total of 338 LUSC cases and 45 adjacent non-cancer cases were collected from TCGA database (Table 2). The expression value of miR-182-5p in the LUSC group was 14.4295 ± 1.16110 and that in the non-cancer group was 12.2828 ± 0.64852. MiR-182-5p was clearly over-expressed in the LUSC group in comparison with the non-cancerous group (Fig. 1a). As shown in Fig. 1b, the ROC curve assessed the diagnostic ability of miR-182.
To verify this result, we matched the data of 45 patients from TCGA database (Fig. 1c, d). MiR-182 expression was higher in LUSC tissue than in adjacent normal tissues (14.0102 ± 1.17344 and 12.2828 ± 0.64852, respectively, p < 0.001). Kaplan-Meier curves (Fig. 1e) were constructed to analyze the prognostic value of miR-182-5p in LUSC patients. The curves display the median survival of LUSC patients with high miR-182-5p expression (63.73 months) and those with low miR-182-5p expression (47.43 months).
LUSC microarrays from the GEO database
GEO microarrays can be regarded as an auxiliary means to validate the expression of miR-182-5p in LUSC. A total of seven microarrays were selected from the GEO database, namely, GSE16025, GSE25508, GSE29248, GSE47525, GSE19945, GSE51853, and GSE74190 (Fig. 2). Four microarrays (GSE16025, GSE19945, GSE51853, and GSE74190) showed statistical significance, with the miR-182-5p expression level remarkably increased in LUSC tissues. The expression of miR-182-5p in the GEO microarrays is shown in Table 3. The meta-analysis results are shown in Fig. 3. The forest plot (Fig. 3a) included the miR-182-5p expression data from the seven microarrays. The pooled SMD of miR-182-5p was 1.54 (95% CI 0.74 to 2.34) by the random effects model. The I-squared value was 77.4%, and the p value was less than 0.001. Furthermore, the sensitivity analysis (Fig. 3b) indicated no significant difference among the microarrays. We also assessed the publication bias using a funnel plot (Fig. 3c). The p value from Begg's test was 1.000 and that from Egger's test was 0.939. Based on these results, we conclude that these microarrays had no significant publication bias. The sROC curve of the GEO microarrays is shown in Fig. 3d. The AUC was 0.97 (95% CI 0.95-0.98).

RT-qPCR analysis
We detected the clinical expression level of miR-182-5p by RT-qPCR in 23 LUSC and 23 non-cancerous lung tissues. MiR-182-5p expression in patients whose tumor size was greater than 3 cm was 8.55 ± 3.99, and the expression of [...] (Fig. 4a). In Fig. 4b, the ROC curve shows the diagnostic value of miR-182-5p with respect to tumor size.

Meta-analysis of TCGA, GEO, PCR, and literature analyses
We performed a comprehensive meta-analysis using data from TCGA database, GEO microarrays, and PCR. Regarding the literature, the data could not be extracted. A total of 501 LUSC cases and 148 non-cancerous cases were included. The random effects model was used in the meta-analysis because the I-squared value was 81.8%. This heterogeneity may be caused by differences in patients, sample processing methods, and statistical methods. The forest plot (Fig. 5a) included the miR-182-5p expression data from PCR, TCGA database, and GEO microarrays. The pooled SMD of miR-182-5p was 1.44 (95% CI 0.83 to 2.05) using the random effects model. The I-squared value was 81.8%, and the p value was less than 0.001. The sensitivity analysis (Fig. 5b) indicated no significant difference among studies. The funnel plot (Fig. 5c) showed a publication bias among these studies. The p value obtained from Begg's test was 0.754 and that from Egger's test was 0.678. The sROC curve is shown in Fig. 5d. The AUC was 0.95 (95% CI 0.93-0.97). In summary, these studies showed a mild publication bias.
Fig. 4 a MiR-182-5p expression in patients whose tumor size was greater than 3 cm and in patients whose tumor size was less than or equal to 3 cm. b The ROC curve was generated to assess the diagnostic ability of miR-182-5p in tumor size. The AUC was 0.933 (95% CI 0.8206 to 1.045, p = 0.002). The sensitivity was 85.71%, and the specificity was 87.50%

Molecular mechanism of miR-182-5p
Prediction of miR-182-5p target genes
The prediction of miR-182-5p target genes was performed using 12 gene prediction platforms. We chose the genes predicted by at least five platforms, which yielded 7757 genes. The number of verified target genes was 2105. We next downloaded 4648 genes with low expression in LUSC from TCGA database. Finally, we calculated the overlap of the three groups, and a total of 81 target genes were chosen. The screening process is displayed in Fig. 6.
[...] genes were largely involved in protein binding and SH3 domain binding. With respect to the KEGG pathway analysis, the results included nine items. Among these pathways, the Rap1 signaling pathway and platelet activation were important. We also show the GO network for the predicted target genes in Figs. 7, 8, and 9. One node represents one term. Yellow nodes indicate that the terms are more significant.

PPI network of target genes
We identified 31 proteins in the PPI network (Fig. 10), some of which were not associated with other proteins. The more connections a protein has with other proteins, the more important it is likely to be in LUSC. According to the PPI network, EPAS1, PRKCE, NR3C1, and RHOB are hub genes in LUSC.
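Ranking proteins by their number of connections, as described above, corresponds to computing node degree in the PPI network. The sketch below assumes the network is available as a list of undirected protein pairs (for example, exported from STRING); the function names and the toy node labels are hypothetical and do not represent real interactions.

```python
from collections import defaultdict


def degrees(edges):
    """Count distinct interaction partners per protein in an undirected network."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbours[a].add(b)
            neighbours[b].add(a)
    return {node: len(partners) for node, partners in neighbours.items()}


def top_hubs(edges, k=4):
    """Return the k proteins with the most partners (ties broken by name)."""
    deg = degrees(edges)
    return sorted(deg, key=lambda node: (-deg[node], node))[:k]


# Toy network with placeholder labels
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
print(top_hubs(toy_edges, k=2))
```

Degree is only one possible centrality measure; the text does not state which criterion was used beyond the number of connections.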

Clinical expression of hub genes
Among the 81 target genes, EPAS1, PRKCE, NR3C1, and RHOB were located in the center of the PPI network. There were more connections between these four genes, which may indicate that these genes contribute to LUSC. We chose four hub genes (EPAS1, PRKCE, NR3C1, and RHOB) to analyze their clinical expression in 502 LUSC and 49 non-cancerous cases from TCGA database. The expression of EPAS1, PRKCE, NR3C1, and RHOB was decreased in LUSC (Table 5). Figure 11a, c, e, g shows the expression of the four hub genes in LUSC and non-cancerous tissues. Figure 11b, d, f, h shows the ROC curves of the diagnostic ability of the four genes. The AUCs were 0.929 (95% CI 0.9023 to 0.9558, p < 0.001), 0.996 (95% CI 0.9929 to 0.9995, p < 0.001), 0.958 (95% CI 0.9404 to 0.9749, p < 0.001), and 0.929 (95% CI 0.9238 to 0.9774, p < 0.001), respectively. Correlations between the four hub genes and miR-182-5p expression are shown in Fig. 12. The expression of the four hub genes was significantly negatively related to miR-182-5p expression in LUSC.
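A negative relationship between a hub gene and miR-182-5p can be quantified with a correlation coefficient. The text does not state which coefficient was used, so the sketch below assumes Pearson's r; the function name and the toy values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Toy paired values: miRNA expression rises while gene expression falls
mirna = [1.0, 2.0, 3.0, 4.0]
gene = [8.0, 6.5, 5.0, 2.0]
print(pearson_r(mirna, gene) < 0)
```

A coefficient below zero, together with a significance test, would support the inverse relationship expected between a miRNA and its targets.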

Discussion
At present, LUSC is one of the most common cancers and is the chief cause of cancer deaths [1,40]. Misdiagnosis or metastasis can increase the mortality rate. Therefore, miRs including miR-182-5p are regarded as a new tool for diagnosing LUSC [41]. In our study, we gathered a large amount of data on miR-182-5p expression in LUSC from TCGA and GEO databases and analyzed data from 23 paired clinical LUSC tissues. Herein, a meta-analysis was performed to explore the clinical value of miR-182-5p in LUSC. There were 338 LUSC cases and 45 adjacent non-cancer cases in TCGA database. The data from TCGA database showed that the miR-182-5p expression in LUSC tissues was higher than in adjacent normal tissues, which indicated that miR-182-5p expression was associated with LUSC. We also included seven microarrays (GSE16025, GSE25508, GSE29248, GSE47525, GSE19945, GSE51853, and GSE74190) from the GEO database. With the exception of GSE47525, the microarrays showed an increasing trend in miR-182-5p expression in LUSC compared to non-cancerous tissues. Among them, four microarrays (GSE16025, GSE19945, GSE51853, and GSE74190) showed statistical significance. However, in GSE47525, the result was the opposite: miR-182-5p expression was lower in LUSC tissue than in non-cancerous tissue. This result may be caused by the small number of patient samples. According to RT-qPCR, miR-182-5p expression was correlated with tumor size. The expression of miR-182-5p tended to be higher when the tumor size was greater than 3 cm, suggesting that miR-182-5p expression increases as the tumor grows. This result indicates that miR-182-5p is important in the progression of LUSC and could indicate its deterioration. On this basis, miR-182-5p may serve as a biomarker to detect the occurrence and development of LUSC.
Fig. 7 GO biological process (GO-BP) network for the predicted target genes. Nodes represent GO items. Yellow nodes imply that the items are statistically significant (p < 0.01). White nodes imply that the items only take part in connecting items but are not statistically significant
The meta-analysis, which included data from TCGA database, the GEO database, RT-qPCR, and the literature, was the highlight of our study. It provided the most comprehensive data on miR-182-5p. The pooled SMD of miR-182-5p was 1.44 (95% CI 0.83 to 2.05) by the random effects model, which showed that the high miR-182-5p expression in LUSC was consistent with the literature [8,13,14,35,39]. Therefore, we conclude that miR-182-5p is markedly over-expressed in LUSC, consistent with the existing research. The results also showed an obvious relationship between miR-182-5p expression and LUSC.
We also predicted miR-182-5p target genes using 12 prediction platforms and performed a bioinformatics analysis by GO enrichment, KEGG pathway, and PPI network analyses. The GO enrichment and KEGG pathway analyses included 97 items. In GO-BP, the apoptotic process pathway included the target genes PRKCE, NR3C1, and RHOB. However, the role of the apoptotic process pathway in LUSC is still unclear. In GO-CC, the four hub genes were enriched in the cytosol and cytoplasm, but no study has examined the relationship between this pathway and LUSC. As for GO-MF, EPAS1, PRKCE, NR3C1, and RHOB were all involved in SH3 domain binding. Shim et al. found that SH3 domain-binding protein 1 could suppress the growth of LUSC [42]. This suggests that the progression of LUSC might be slowed through the SH3 domain binding pathway. Additionally, the KEGG pathway analysis revealed that PRKCE is involved in the MicroRNAs in cancer pathway, the cGMP-PKG signaling pathway, and the vascular smooth muscle contraction pathway. The function of these pathways in LUSC remains to be studied.
Fig. 9 GO molecular function (GO-MF) network for the predicted target genes. Nodes represent GO items. Yellow nodes imply that the items are statistically significant (p < 0.05). White nodes imply that the items only take part in connecting items but are not statistically significant
According to our bioinformatics analysis, four genes (EPAS1, PRKCE, NR3C1, and RHOB) were regarded as hub genes in LUSC. EPAS1, which is also known as hypoxia-inducible factor-2α (HIF-2α), belongs to the family of hypoxia-inducible factors (HIFs) [43]. In our study, the expression of EPAS1 was negatively correlated with the expression of miR-182-5p in LUSC. In LUSC, EPAS1 plays the role of a HIF [44]. According to recent studies, the high level of EPAS1 expression could lead to a poor prognosis by increasing the tumor size and angiogenesis [43,45,46]. These findings are consistent with the conclusions of our current study.
PRKCE, which consists of 32 exons, is a member of the protein kinase C (PKC) family and regulates the formation of protein kinase C epsilon type (PKCε) [47]. According to our statistical analysis, the high miR-182-5p expression in LUSC is accompanied by the low expression of PRKCE. As an enzyme, PKCε influences many cellular functions, such as growth, division, and transcription factor regulation [48][49][50]. Wang et al. [51] discovered that PKCε is oncogenic and associated with the occurrence of lung cancer. They also found that PRKCE increases PKCε expression in LUSC. NR3C1 is also known as GR or GCR and encodes a glucocorticoid receptor to participate in inflammation, cell proliferation, and differentiation [52]. NR3C1 plays an anti-inflammatory role in the development and metastasis of LUSC [53,54]. Therefore, NR3C1 is important for inhibiting tumor progression.
RHOB belongs to the Ras homolog gene family. RHOB plays a role in cell proliferation and survival [55]. RHOB also inhibits tumor growth. If RHOB is lacking, the tumor frequency increases [56]. A recent study found that the lack of RHOB often occurs in LUSC [57].
According to our study, the expression of RHOB is downregulated in LUSC, consistent with the report by Mazières et al. [56].
According to the present study, miR-182-5p is upregulated in LUSC and plays a pivotal role in the progression of LUSC. Our research indicates that miR-182-5p is involved in several biological processes; targeting them may help to inhibit LUSC progression and improve the cure rate, which offers a new approach to LUSC diagnosis and therapy at the molecular level.

Conclusion
Our study collected a large amount of data from TCGA, GEO, and RT-qPCR and verified the clinical value and diagnostic significance of the high miR-182-5p expression in LUSC. According to the target gene analysis, 81 genes were related to the molecular mechanism of miR-182-5p in LUSC. The results of the GO and KEGG pathway analyses may suggest new directions for treating LUSC at the molecular level.
Fig. 11 The expression of four hub genes was decreased in TCGA LUSC samples and ROC curve analysis. a The expression of EPAS1 in 502 LUSC and 49 non-cancerous lung tissues. b ROC curve was generated to assess the diagnostic ability of EPAS1 in 502 LUSC and 49 non-cancerous lung tissues. The AUC was 0.929 (95% CI 0.9023 to 0.9558, p < 0.001). c The expression of PRKCE in 502 LUSC and 49 non-cancerous lung tissues. d The ROC curve was generated to assess the diagnostic ability of PRKCE in 502 LUSC and 49 non-cancerous lung tissues. The AUC was 0.996 (95% CI 0.9929 to 0.9995, p < 0.001). e The expression of NR3C1 in 502 LUSC and 49 non-cancerous lung tissues. f The ROC curve was generated to assess the diagnostic ability of NR3C1 in 502 LUSC and 49 non-cancerous lung tissues. The AUC was 0.958 (95% CI 0.9404 to 0.9749, p < 0.001). g The expression of RHOB in 502 LUSC and 49 non-cancerous lung tissues. h The ROC curve was generated to assess the diagnostic ability of RHOB in 502 LUSC and 49 non-cancerous lung tissues. The AUC was 0.929 (95% CI 0.9238 to 0.9774, p < 0.001)
Fig. 12 Correlation analysis of the four hub genes decreased in 38 paired LUSC samples from TCGA database. a Correlation between miR-182 and EPAS1. b Correlation between miR-182 and PRKCE. c Correlation between miR-182 and NR3C1. d Correlation between miR-182 and RHOB