CTHRC1 is a prognosis-related biomarker correlated with immune infiltrates in colon adenocarcinoma

Background Colon adenocarcinoma (COAD) is one of the most common cancers worldwide. Collagen triple helix repeat containing 1 (CTHRC1) has been reported to be involved in cell invasion, angiogenesis, and the promotion of epithelial-mesenchymal transformation by mediating multiple signaling pathways. However, the role of CTHRC1 in COAD has not yet been determined. Methods Differentially expressed genes were evaluated using gene expression data from the Oncomine and TIMER databases. Correlations between CTHRC1 gene expression and clinicopathological factors were analyzed using gene expression data from the UALCAN database. Then, we searched the GEPIA database to evaluate the association of CTHRC1 gene expression with clinical outcomes. The cBioPortal database was used to analyze CTHRC1 genetic alterations. Subsequently, the TIMER website was used to assess the correlation of CTHRC1 with the tumor immune cell infiltration level. The TCGA dataset was used for gene set enrichment analysis (GSEA). Results CTHRC1 was highly expressed in COAD patients and was significantly related to poor prognosis. In addition, elevated expression of CTHRC1 was related to the clinical stage and pathological type of COAD. GSEA showed that CTHRC1 was enriched in Gα signaling, GPCR ligand binding, neutrophil degranulation, interleukin signaling, and tumor-associated pathways. In addition, CTHRC1 was significantly associated with the expression of multiple immune markers related to specific immune cells. Conclusion This study suggests that CTHRC1 expression is related to the prognosis and immune infiltration of COAD patients. Therefore, CTHRC1 may be a new candidate prognostic biomarker for determining immune infiltration levels and providing COAD prognoses.


Introduction
Colon adenocarcinoma (COAD) is a common malignant cancer and the second most common cause of cancer-related deaths worldwide [1]. The internal epithelial cells of colorectal tissues are derived from endodermal cells, and a multi-step process occurs from normal epithelium to adenomatous polyps to invasive colorectal cancer, which is achieved by inactivating tumor suppressor genes and activating oncogenes [2]. The treatment response and survival rate of patients with advanced COAD are still very low compared to the early stage of COAD, with 5-year survival rates dropping from 50 to 10% in more advanced cases. Surgical resection is the mainstay of radical treatment for locally advanced COAD, whereas no effective treatment strategy is available for metastatic tumors that cannot be surgically resected or that respond poorly to chemotherapy and radiotherapy [3]. Quantitative analyses of COAD indicate that stem cells are initially transformed into malignant tumor cells over a 10-year period, and then these tumor cells acquire the ability to metastasize over the next 5 years [4]. Early and accurate detection of COAD is important for disease control and prevention because approximately 30-40% of COAD patients relapse after treatment [5]. Despite therapeutic strategies aimed at improving the outcome of COAD patients, a significant proportion of patients receiving conventional treatment still experience relapse. Therefore, it is imperative to find new methods, especially molecular markers, to perform timely diagnosis and assessment of colorectal cancer, to identify new drug targets, and to better understand the basic molecular mechanisms of COAD so that effective treatments can be developed.
Many studies have reported that the abnormal expression of many genes in tumor tissues may increase the risk of cancer [6]. Therefore, the identification of changes in gene expression levels will provide new insights for studying the pathogenesis of COAD. The collagen triple helix repeat containing 1 (CTHRC1) gene is located on human chromosome 8q22.3 and was initially identified in balloon-injured and normal rat arteries [7]. Its overexpression in fibroblasts is related to an increase in cell migration, motility and invasion [8]. In adults, CTHRC1 is expressed only in the bone matrix and periosteum. CTHRC1 is also present in atherosclerotic plaques and mineralized bone matrix [8]. In other tissues, CTHRC1 is expressed at sites that significantly overlap with interstitial collagen and members of the transforming growth factor-β (TGF-β) family, particularly bone morphogenetic proteins (BMPs) [9]. In addition, the expression of CTHRC1 was positively correlated with lymph node metastasis, tumor stage, and disease prognosis. CTHRC1 is widely upregulated in several solid tumors, including melanoma, gastrointestinal cancer, breast cancer, thyroid cancer, liver cancer, and pancreatic cancer [10]. Recent studies have shown that CTHRC1 can trigger tumor metastasis by promoting epithelial-mesenchymal transformation, promoting cell invasion, and inducing angiogenesis through a variety of signaling pathways [11]. Although the high expression of CTHRC1 and its prognostic value in COAD patients have been highlighted by previous studies, the predictive accuracy of CTHRC1 as a biomarker can vary widely in heterogeneous diseases such as COAD because of inadequate sample sizes and the lack of stringent inclusion criteria.
Dysregulation of CTHRC1 expression in COAD and its relationship with clinical pathological features and prognosis have been partially reported. However, bioinformatics analysis has not been applied to explore the role of CTHRC1 in COAD. With the development of RNA-seq technology, genome sequencing has experienced huge changes. In this study, the Cancer Genome Atlas (TCGA) and Oncomine database were employed to investigate the expression and function of CTHRC1 in COAD. The relationship between CTHRC1 levels and clinical pathological parameters was analyzed, and CTHRC1 was identified as a potential biomarker of tumor progression for COAD. Furthermore, we analyzed mutations of CTHRC1 in COAD to determine its expression pattern, potential function, and prognostic value.

Oncomine database analysis
Oncomine (http://www.oncomine.org) is a gene chip-based database for gene expression analysis, coexpression analysis, enrichment analysis, and interaction network analysis in various cancers [13]. It contains 715 datasets and 86,733 samples. Student's t test was used to compare the transcription levels of CTHRC1 between clinical cancer specimens and normal controls. The thresholds were a p value of 0.05 and a fold change of 1.5.
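The thresholds above can be illustrated with a minimal Python sketch. It is not the Oncomine implementation: the function names, the assumption that inputs are log2-scaled expression values, and the numerical integration used for the p value are all illustrative choices made here.

```python
import math
from statistics import mean, variance


def two_sided_p(t, df, steps=20000):
    """Two-sided p value of a t statistic, obtained by Simpson integration
    of the t density from 0 to |t| (steps must be even)."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    upper = abs(t)
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)


def student_t_test(a, b):
    """Two-sample Student's t test with pooled variance: (t, df, p)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df, two_sided_p(t, df)


def is_overexpressed(tumor_log2, normal_log2, p_cut=0.05, fc_cut=1.5):
    """Apply the p value and fold-change thresholds to log2 expression
    values (the log2 scale is an assumption of this sketch)."""
    t, df, p = student_t_test(tumor_log2, normal_log2)
    fold_change = 2 ** (mean(tumor_log2) - mean(normal_log2))
    return {"t": t, "df": df, "p": p, "fold_change": fold_change,
            "significant": p < p_cut and fold_change >= fc_cut}


# Hypothetical values, for illustration only (not data from this study).
result = is_overexpressed([8.1, 8.4, 7.9, 8.6, 8.3], [6.0, 6.2, 5.9, 6.1, 6.3])
print(result["significant"], round(result["fold_change"], 2))
```

The integration loses precision for extremely small p values, which is acceptable for checking a 0.05 threshold but not for reporting exact tail probabilities.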

UALCAN database analysis
The UALCAN database (http://ualcan.path.uab.edu/) is available for online analysis of differential gene expression in cancer and normal tissue from TCGA and MET500 data [14]. This study used the UALCAN database to determine the correlation between CTHRC1 gene expression and various sub-groups of clinical characteristics. p < 0.05 was considered statistically significant.

GEPIA database analysis
GEPIA (http://gepia.cancer-pku.cn) is an interactive web server for gene expression analysis based on TCGA and GTEx data [15]. It includes 9736 tumor and 8587 normal samples. In the current study, survival analysis of CTHRC1 was performed using the TCGA-COAD dataset with a median cut-off. Kaplan-Meier (KM) plots are presented with hazard ratios (HRs), 95% confidence intervals (CIs), and log-rank p values.
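The median cut-off, Kaplan-Meier estimate, and log-rank comparison described above can be sketched in plain Python as follows. This is an illustrative sketch, not the GEPIA code: the function names are hypothetical, and assigning samples strictly above the median to the high group is an assumption.

```python
import math
from statistics import median


def split_by_median(expression):
    """Label each sample as high (True) when its expression exceeds the median."""
    cut = median(expression)
    return [x > cut for x in expression]


def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate as (time, survival) pairs at event times.
    events[i] is 1 for an observed event and 0 for a censored sample."""
    surv, curve = 1.0, []
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        surv *= 1 - deaths / at_risk
        curve.append((t, surv))
    return curve


def logrank_test(times, events, is_high):
    """Two-group log-rank test; returns (chi-square statistic, p value)."""
    observed = expected = var = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        risk_flags = [h for ti, h in zip(times, is_high) if ti >= t]
        n, n_high = len(risk_flags), sum(risk_flags)
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        d_high = sum(1 for ti, e, h in zip(times, events, is_high)
                     if ti == t and e and h)
        observed += d_high
        expected += d * n_high / n
        if n > 1:
            var += d * (n_high / n) * (1 - n_high / n) * (n - d) / (n - 1)
    if var == 0:
        return 0.0, 1.0
    chi2 = (observed - expected) ** 2 / var
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Hypothetical follow-up times (months), event flags and expression values.
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 1, 1, 1, 1]
groups = split_by_median([9.0, 8.5, 8.0, 3.0, 2.5, 2.0])
print(logrank_test(times, events, groups))
```

Hazard ratios and their confidence intervals would additionally require a Cox proportional hazards fit, which this sketch does not attempt.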

PrognoScan database analysis
The correlation between CTHRC1 expression and COAD survival was also analyzed using the PrognoScan database (http://www.abren.net/PrognoScan/) [16]. To select the datasets to be included in this study, the screening parameters were set as follows: "Cancer Type" as COAD and "Gene" as CTHRC1. The HR with 95% CIs was calculated. Results with a Cox p value < 0.05 were considered significant.

TISIDB database analysis
TISIDB (http://cis.hku.hk/TISIDB/index.php) is a web portal for tumor and immune system interactions that integrates multiple heterogeneous data types [17]. It contains genomic and transcriptomic data for 30 cancer types from TCGA, as well as RNA sequencing datasets from patient cohorts treated with immunotherapy. It was used to investigate correlations between CTHRC1 and different immune genes.

Tumor-infiltrating immune cell analysis
Tumor-infiltrating immune cell analysis (TIMER) (http://www.timer.cistrome.org/) is a website used to analyze the expression of various types of cancer-associated genes and tumor-infiltrating immune cells [18]. It integrates 10,897 samples across 32 types of cancer in the TCGA. The site provides estimates of immune infiltration abundance through multiple immune deconvolution methods and allows users to dynamically generate high-quality figures to comprehensively explore tumor immunological, clinical, and genomic characteristics. In this study, the correlations of CTHRC1 expression with immune infiltration in COAD were evaluated by purity-corrected partial Spearman's correlation values and their statistical significance.

cBioPortal database analysis
cBioPortal (http://www.cbioportal.org) is an online database used to explore cancer genomic data from multiple perspectives [19]. The gene mutation and survival data were derived from 640 COAD samples in the TCGA database in cBioPortal. The genomic profiles include mutations and putative copy number alterations identified by GISTIC (Genomic Identification of Significant Targets in Cancer). An OncoPrint was constructed to display all types of CTHRC1 alterations in COAD patients, including gene amplification, deletion, mRNA upregulation, and mRNA downregulation. In addition, the overall survival (OS) and disease-free survival (DFS) associated with CTHRC1 gene alterations were analyzed through the "Comparison/Survival" module in cBioPortal.

Gene-gene interaction and protein-protein interaction networks (GeneMANIA)
GeneMANIA (http://www.genemania.org) can be used to generate hypotheses about gene functions, analyze gene lists, and prioritize genes for functional analysis [20]. A protein-protein interaction network between CTHRC1 and its 50 neighboring genes was constructed using GeneMANIA. Then, these 50 genes were analyzed by gene ontology (GO). GO enrichment analysis predicts the functional role of the target genes based on three aspects: molecular function, biological process, and cellular component.
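GO enrichment of a gene list is commonly assessed with a hypergeometric (over-representation) test. The following Python sketch shows that calculation under stated assumptions: the function name, the gene universe size, and the counts in the usage lines are hypothetical, and the sketch is not claimed to reproduce the exact procedure used in this study.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, term_size, list_size, overlap):
    """Probability of seeing at least `overlap` term-annotated genes when
    `list_size` genes are drawn without replacement from a universe of
    `universe_size` genes, of which `term_size` carry the GO term."""
    upper = min(term_size, list_size)
    total = comb(universe_size, list_size)
    tail = sum(
        comb(term_size, k) * comb(universe_size - term_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 background genes, 90 annotated to a term,
# a 50-gene list, 8 of which carry the term.
p = hypergeom_enrichment_p(20000, 90, 50, 8)
print(p < 0.05)
```

In practice the p values across all tested GO terms would also be adjusted for multiple testing before terms are reported as enriched.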

Gene set enrichment analysis
Gene set enrichment analysis (GSEA) is a computational method that uses the entire gene expression matrix to determine whether a previously defined set of genes shows statistically significant and consistent differences between two phenotypes. In this study, clusterProfiler 3.11 was used to analyze the TCGA-COAD dataset to elucidate significant functional and signaling pathway differences between the high and low CTHRC1 groups. The CTHRC1 gene expression level was used as the phenotypic label, and each analysis carried out 1000 permutations of gene sets. H.all.v7.0.symbols.gmt from MSigDB Collections was selected as the reference gene set. The threshold for statistical significance was set as p < 0.05 and FDR < 0.25. Enriched gene sets were ranked by adjusted p values and normalized enrichment scores.
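The core of GSEA is a running-sum enrichment score computed over a ranked gene list, with significance assessed by permutation. The sketch below illustrates that idea in plain Python. It is a simplified illustration, not the clusterProfiler implementation: the function names and gene identifiers are hypothetical, and normalization of the score and FDR control are omitted.

```python
import random


def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Running-sum enrichment score. `ranked_genes` must be sorted by
    decreasing `scores`; the signed maximum deviation from zero is returned."""
    in_set = [g in gene_set for g in ranked_genes]
    hit_w = [abs(s) ** weight if hit else 0.0 for s, hit in zip(scores, in_set)]
    total_hit = sum(hit_w)
    n_miss = len(ranked_genes) - sum(in_set)
    if total_hit == 0 or n_miss == 0:
        return 0.0
    running = best = 0.0
    for w, hit in zip(hit_w, in_set):
        running += w / total_hit if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def permutation_p(ranked_genes, scores, gene_set, n_perm=1000, seed=0):
    """Empirical p value from random gene sets of the same size."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, scores, gene_set)
    size = len(set(gene_set) & set(ranked_genes))
    null = [
        enrichment_score(ranked_genes, scores, set(rng.sample(ranked_genes, size)))
        for _ in range(n_perm)
    ]
    same_sign = [es for es in null if (es >= 0) == (observed >= 0)]
    extreme = sum(1 for es in same_sign if abs(es) >= abs(observed))
    return observed, (extreme + 1) / (len(same_sign) + 1)


# Hypothetical ranking of six genes by a high-versus-low group statistic.
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
stats = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
print(permutation_p(genes, stats, {"g1", "g2"}))
```

A positive score indicates that the gene set is concentrated at the top of the ranking, which here would correspond to the high-expression group.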

Statistical analysis
CTHRC1 expression was analyzed via the Oncomine, TIMER, and UALCAN databases. The results from PrognoScan and GEPIA are displayed with HRs and log-rank or Cox p values. Spearman's correlation analysis was performed to evaluate the correlation of gene expression in TIMER. The GO and GSEA analyses were performed with the "clusterProfiler" R package. All R packages were run using R software version 3.6.3. p values < 0.05 were considered statistically significant.
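A purity-adjusted partial Spearman correlation, as used for the TIMER immune infiltration analysis, can be sketched by ranking each variable and applying the first-order partial correlation formula. The Python code below is an illustrative sketch under that assumption, not TIMER's own code, and the variable names and values in the usage lines are hypothetical.

```python
import math


def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def partial_spearman(x, y, z):
    """Spearman correlation between x and y after adjusting for z."""
    rxy, rxz, ryz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical per-sample values: gene expression, an immune infiltration
# estimate, and tumor purity as the adjustment variable.
expression = [1, 2, 3, 4, 5]
infiltration = [2, 1, 4, 3, 5]
purity = [5, 3, 1, 4, 2]
print(round(partial_spearman(expression, infiltration, purity), 4))
```

Adjusting for purity matters because a gene expressed mainly in stromal or immune cells will otherwise appear correlated with infiltration simply because both rise as tumor purity falls.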

Expression of CTHRC1 in different cancers
Oncomine and TIMER were used to analyze the expression of CTHRC1 mRNA in COAD and normal tissues. The results showed that CTHRC1 mRNA levels in various cancer tissues were significantly higher than those in normal tissues (Fig. 1A). In addition, CTHRC1 mRNA was significantly upregulated in all intestinal tumors (Fig. 1B). We further assessed these data through the TIMER database and found that CTHRC1 mRNA expression was significantly elevated in most tumors compared with normal tissues, especially in COAD (Fig. 1C). According to TCGA data, CTHRC1 expression was 6.222-fold higher in colorectal adenocarcinoma, 10.224-fold higher in mucinous colon adenocarcinoma, 4.803-fold higher in rectal adenocarcinoma, and 4.643-fold higher in cecum adenocarcinoma compared with normal tissues (Table 1). Then, we evaluated the correlation between the expression of CTHRC1 and the clinicopathological features (age, gender, race, clinical stage, histological subtype, and TP53 mutation status) of COAD through the online cancer OMICS database UALCAN. Across all subgroups defined by these features, CTHRC1 expression was significantly upregulated in COAD patients compared to the corresponding normal controls (Fig. 2). In COAD patients, CTHRC1 expression levels were positively associated with clinical stage and histological subtype. The expression of CTHRC1 in stage 3 was higher than in stage 1 (p < 0.05). The expression of CTHRC1 in mucinous adenocarcinoma was higher than in adenocarcinoma (p < 0.05).

High expression of CTHRC1 was related to poor prognosis in COAD patients
We further explored the prognostic value of CTHRC1 for the survival of patients with COAD by using Kaplan-Meier Plotter. The Kaplan-Meier curve and log-rank test analyses revealed that increased CTHRC1 mRNA levels were significantly correlated with overall survival (OS) and disease-free survival (DFS) (Fig. 3A, B). COAD patients with high CTHRC1 expression had a poor prognosis (OS, HR = 1.8, p = 0.018; DFS, HR = 1.8, p = 0.015). Moreover, the PrognoScan database showed that overexpression of CTHRC1 was significantly related to poor OS, DFS, and disease-specific survival (DSS) (Table 2).

Genetic alterations of CTHRC1 in COAD
Gene alterations in CTHRC1 were found to occur in 6% of the 220 colorectal adenocarcinoma cases in the data obtained from the OncoPrint schematic of cBioPortal. The mutation rate of COAD was 4.56%. Meanwhile, missense mutations and amplification were the main alteration types in mucinous adenocarcinoma of the colon and rectum, rectal adenocarcinoma and colon adenocarcinoma (Fig. 4A, B). Figure 4C summarizes the details of all mutations: CTHRC1 has one truncation mutation and four missense mutations. In addition, we assessed the association of CTHRC1 gene alterations with survival in patients with colorectal adenocarcinoma. However, OS and DFS were not associated with alterations of the CTHRC1 gene in colorectal adenocarcinoma patients. We then constructed the protein-protein interaction network for CTHRC1 and the 50 altered neighboring genes by using GeneMANIA. The results showed that ELF4, PGF, IKBIP, SERPINH1, ESAM, COL15A1, THBS2, COL5A2, COL10A1, ADAM12, OLFML2B, GJA4, CD248, INHBA, PLPP4, HECW2, P4HA3, COL3A1, COL1A2, FLT1, COL6A3, VCAN, TIE1, LUM, COL4A1, FAP, MYH9, ADAMTS2, COL1A1, MXRA5, and SULF1 genes were closely related to CTHRC1 alterations (Fig. 5A). The functions of CTHRC1 and the genes significantly associated with CTHRC1 alterations were predicted by GO analysis, which covers three aspects: biological process (BP), cellular component (CC), and molecular function (MF). The results showed that morphogenesis, collagen trimer, endoplasmic reticulum lumen, and other terms were significantly associated with the CTHRC1 alterations (Fig. 5B) (Table 3).

GSEA identifies signaling pathways associated with CTHRC1
We acquired the CTHRC1 expression dataset from the TCGA-COAD data (480 patients with COAD and 41 normal tissues) for GSEA to identify differentially activated signaling pathways in COAD. Gene sets associated with Gα signaling, GPCR ligand binding, neutrophil degranulation, interleukin signaling, and tumor-associated pathways showed varying degrees of enrichment in the CTHRC1 high-expression phenotype (Fig. 6) (Table 4).

Relationship between CTHRC1 expression and immune checkpoint genes
With increased understanding of immune checkpoint function in humans, immune checkpoint inhibitors have made great progress in cancer treatment. We therefore assessed whether the expression of CTHRC1 was associated with immune checkpoint genes, using the TISIDB database to study the relationship between CTHRC1 expression and immunosuppressive genes. The results showed that CTHRC1 was associated with ADORA2A, TIGIT, TGFBR1, TGFB1, PDCD1LG2, CD96, PDCD1, LAG3, KDR, IL10, IDO1, BTLA, HAVCR2, CTLA4, CSF1R, CD274, and CD244 (Fig. 7).

Relationship between CTHRC1 expression and immune infiltration
In patients with cancer, the level of immune cells is closely related to the proliferation and development of cancer cells. We therefore used the TIMER database to explore the relationship between CTHRC1 expression and the degree of immune cell infiltration. CTHRC1 expression was positively correlated with the infiltration of CD4+ T cells, CD8+ T cells, macrophages, neutrophils, and dendritic cells, but not B cells (Fig. 8). To further explore the relationship between CTHRC1 expression in COAD and the level of immune cell infiltration, we examined the correlation between CTHRC1 expression and various immune infiltration-associated biomarkers. Our results showed that CTHRC1 expression was significantly correlated with most markers of B cells, T cells, CD8+ T cells, macrophages, neutrophils, dendritic cells, and NK cells (Table 5).

Discussion
The incidence of COAD has increased dramatically in recent years, and the heterogeneity of colorectal cancer makes it difficult to determine which patients require further treatment after surgical resection and which have a poor prognosis. Diagnostic and prognostic tools for the early detection and prediction of patient survival are limited. Many studies have focused on this question, and there have been many advances in uncovering the underlying mechanisms by which cancer occurs. Nevertheless, finding reliable diagnostic markers for COAD is still an important research focus. Our results showed that CTHRC1 expression in COAD tissues was higher than that in normal tissues, and the difference was statistically significant. This finding is consistent with previous studies, which also found that CTHRC1 protein is highly expressed in various types of cancer, such as lung, stomach, cervical, and breast cancers. Our analysis further confirmed this finding because we found that CTHRC1 was significantly overexpressed in most of the tumors in the TCGA data. These results suggest that CTHRC1 may be a diagnostic marker of multiple cancers. In addition, we found that CTHRC1 was associated with the clinical staging and histological subtypes of COAD. Previous studies have reported that CTHRC1 promotes tumor progression by affecting specific pathways in different cancer types. For example, CTHRC1 is elevated in cervical cancer and promotes metastasis via the Wnt/PCP pathway [21]. In non-small cell lung cancer (NSCLC), by contrast, CTHRC1 regulates aggressiveness through the GSK-3β/β-catenin pathway [22]. Recent studies have found that CTHRC1 is an intrinsic marker of COAD metastasis and have further revealed that CTHRC1 promotes COAD liver metastasis by remodeling infiltrating macrophages through TGF-β signaling [23]. This suggests that CTHRC1 expression may be a convenient diagnostic biomarker for a variety of tumors, including COAD.
In addition, CTHRC1 is highly expressed in COAD and associated with poor prognosis. This result is consistent with previous findings that high CTHRC1 expression is associated with poor COAD survival [24]. Gene mutations have important implications, including altering genetic content, disrupting genes, and causing phenotypic differences. We found that the main type of CTHRC1 change was mRNA upregulation and that copy number variation was the most common alteration type, which may be the reason for the high expression of CTHRC1 in COAD. Changes in the chromosomal structure of the CTHRC1 gene can lead to abnormal expression and dysfunction. In the current study, gene alterations in CTHRC1 were found in 6% of colorectal adenocarcinomas. The mutation rate of CTHRC1 in COAD was 4.65%, and the alteration frequency of CTHRC1 was highest in mucinous adenocarcinoma of the colon and rectum, reaching 13.04%. These results suggest that a high CTHRC1 mutation rate may contribute to the carcinogenesis of COAD. According to the GO analysis, the functional network of CTHRC1 in COAD is involved in morphogenesis, endoplasmic reticulum lumen, extracellular matrix structural components, and collagen trimers, suggesting that CTHRC1 plays a biological role in morphogenesis and tumorigenesis. These findings suggest that CTHRC1 may regulate transcription to influence cell biological function.
In addition, it is important to understand how changes in proteins that regulate normal transcription are involved in cancers. It has been reported that genomic instability may lead to the transformation of normal cells into a carcinogenic state, and that protein kinases and their related signaling pathways help stabilize and repair genomic DNA [25]. To further investigate the role of CTHRC1 in COAD, we used TCGA data for GSEA. The results showed that high expression of CTHRC1 was enriched in various pathways and key biological functions related to the occurrence of tumors, such as Gα signaling, GPCR ligand binding, neutrophil degranulation, interleukin signaling, and tumor-associated pathways. The classical mechanism of Gα signaling is to inhibit the cAMP-dependent pathway by inhibiting adenylate cyclase. The reduced production of cAMP from ATP results in decreased cAMP-dependent protein kinase activity, and mutations in Gα subunits lead to specific cancers [26]. Additionally, many cancer cells abnormally express GPCRs, including lung, prostate, colon, pancreatic, and mesenchymal cancer cells, and these receptors stimulate cell proliferation, migration, invasion, and angiogenesis [27]. Among members of the GPCR family, the gonadotropin-releasing hormone (GnRH) receptor has been reported to be overexpressed in various tumors such as melanoma, prostate and endometrial cancer, leiomyoma, breast cancer, choriocarcinoma, epithelial ovarian tumor, and stromal ovarian tumor [28]. Therefore, the GnRH receptor is a reliable target for the clinical treatment of cancer [29]. The molecular basis of neutrophil degranulation is not fully understood, although the SNARE and Rab proteins seem to play a central role [30]. Neutrophil components stored in cytoplasmic granules are involved in tumor metastasis and angiogenesis and may be considered biomarkers for tumor prognosis [31].
The tumor microenvironment (TME) is an increasingly popular topic because it may influence tumor progression and recurrence. Studies have shown that immune cells in the TME have protumor or antitumor activity [32]. They are considered to be important determinants of clinical outcome and immunotherapy response. We observed that CTHRC1 expression was closely related to COAD immune infiltration. For example, CTHRC1 expression was significantly positively correlated with multiple immune cells and immune cell markers. This suggests that CTHRC1 may reflect not only the prognosis of the disease but also the immune status of the body. Taken together, these results suggest that CTHRC1 may regulate immune cell recruitment and activation in COAD. The present study improves our understanding of the relationship between CTHRC1 and COAD, although some limitations still exist. First, all the data analyzed in this study were retrieved from online databases, and further in vitro and in vivo experiments are required to validate our findings. Second, most of the analyses in the present study were based on mRNA levels of CTHRC1. A deeper analysis based on protein levels would make the data more convincing.

Conclusion
This study systematically analyzed the relationship of CTHRC1 expression with COAD. The results showed that the expression of CTHRC1 is upregulated in COAD, and high CTHRC1 expression was correlated with clinical progression. In addition, Gα signaling, GPCR, neutrophil degranulation, and interleukin signal transduction may be the key pathways regulated by CTHRC1. The expression of CTHRC1 is closely related to the infiltration of various immune cells in COAD. Therefore, the biological function of CTHRC1 may play a vital role in the diagnosis and treatment of COAD.