Incidence trends and survival prediction of urothelial cancer of the bladder: a population-based study

Background The aim of this study is to determine the incidence trends of urothelial cancer of the bladder (UCB) and to develop a nomogram for predicting the cancer-specific survival (CSS) of postsurgery UCB at a population-based level based on the SEER database. Methods The age-adjusted incidence of UCB diagnosed from 1975 to 2016 was extracted, its annual percentage change was calculated, and joinpoint regression analysis was performed. A nomogram was constructed for predicting the CSS in individual cases based on independent predictors. The predictive performance of the nomogram was evaluated using the concordance index (C-index), net reclassification index (NRI), integrated discrimination improvement (IDI), a calibration plot, and the receiver operating characteristics (ROC) curve. Results The incidence of UCB first increased and then decreased from 1975 to 2016, but the overall incidence was higher at the end of that period than at the beginning. The age at diagnosis, ethnic group, insurance status, marital status, differentiation grade, AJCC stage, regional lymph nodes removed status, chemotherapy status, and tumor size were independent prognostic factors for postsurgery UCB. The nomogram constructed based on these independent factors performed well, with a C-index of 0.823 and a close fit to the calibration curve. Its prediction ability for CSS of postsurgery UCB is better than that of the existing AJCC system, with NRI and IDI values greater than 0 and ROC curves exhibiting good performance for 3, 5, and 8 years of follow-up. Conclusions The nomogram constructed in this study might be suitable for clinical use in improving the predictive accuracy of the long-term survival for postsurgery UCB.


Introduction
Urothelial cancer of the bladder (UCB) is the most common pathological type of bladder cancer, and its incidence is especially high in Western countries [1,2]. The incidence of this cancer is closely related to tobacco consumption and exposure to occupational carcinogens [3,4]. However, the incidence of UCB may have changed over the past few decades due to industrial developments, the implementation of policies for controlling tobacco, and progress in disease diagnosis and treatment [5,6]. There have been few analyses of the incidence of UCB despite many studies researching the incidence trends of bladder cancer [7].
UCB is most frequently diagnosed in males and people older than 55 years [7]. Surgical resection is the mainstay treatment for UCB, but many patients, especially those presenting with muscle invasion, have poor outcomes despite receiving surgery and systemic treatment [8].
Given this situation, this study analyzed trends in the incidence of UCB and, using data obtained from the Surveillance, Epidemiology, and End Results (SEER) database [17], established a nomogram based on a Cox proportional-hazards regression analysis of the prognostic factors for predicting the survival of UCB after surgery.

Data collection and definition
The data were extracted retrospectively from the SEER database and downloaded using SEER*Stat software (version 8.3.6, National Cancer Institute). To identify UCB patients, we searched the database using the tumor-site ICD-9 codes (C67.0-C67.9) and ICD-O-3 code (8130/3). To analyze the trends in the incidence of UCB, the age-adjusted incidence rate of UCB diagnosed from 1975 to 2016 was calculated.
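The idea behind an age-adjusted rate can be illustrated with a minimal sketch of direct standardization, in which each age-specific crude rate is weighted by the share of a standard population in that age group. The age groups, counts, standard weights, and the function name `age_adjusted_rate` are hypothetical illustration values; they are not SEER data and not necessarily the exact procedure applied by SEER*Stat.

```python
def age_adjusted_rate(cases, person_years, std_weights, per=100_000):
    """Directly standardized rate per `per` persons.

    cases[i] and person_years[i] are the counts for age group i;
    std_weights[i] is that group's share of the standard population.
    """
    if not (len(cases) == len(person_years) == len(std_weights)):
        raise ValueError("all inputs must have one entry per age group")
    total_weight = sum(std_weights)
    rate = 0.0
    for c, py, w in zip(cases, person_years, std_weights):
        # Age-specific crude rate, weighted by the standard population share.
        rate += (c / py) * (w / total_weight)
    return rate * per


# Hypothetical three-group example (not SEER data).
cases = [10, 120, 400]
person_years = [1_000_000, 800_000, 400_000]
std_weights = [0.5, 0.3, 0.2]
example_rate = age_adjusted_rate(cases, person_years, std_weights)
```

Because the weights come from a fixed standard population, rates from different calendar years remain comparable even when the age structure of the underlying population shifts.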
To establish a nomogram for analyzing survival, the following variables for UCB were extracted from the SEER database: age at diagnosis, sex, ethnic group, primary site, grade, metastasis stage, derived AJCC stage, regional lymph nodes removed, radiation status, chemotherapy status, insurance status, marital status, tumor size, survival time, and cancer-specific death status. We only included patients who received surgery, who were identified by the "Surgery performed" record for the item "Reason no cancer-directed surgery." Other exclusion criteria were (1) only autopsy findings being available, (2) diagnosis based on direct visualization without microscopic confirmation, (3) the tumor not being the first primary malignancy, and (4) incomplete information for the above-listed variables.

Statistical analyses
The data for the age-adjusted incidence rate of UCB from 1975 to 2016 were used to calculate the annual percentage change (APC) in the incidence using the weighted least-squares method. Joinpoint regression analysis (version 4.7.0, Joinpoint, IMS, Calverton, MD, USA) was performed to delineate trends in the incidence of UCB from 1975 to 2016. Considering the large difference in the incidence between males and females, the APC analysis and the joinpoint regression analysis were performed with stratification by sex.
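As a minimal sketch of how an APC is commonly derived, the rate is assumed to change log-linearly over time, ln(rate) = a + b × year, so that APC = (e^b − 1) × 100. The function name, the optional weights, and the example rates below are assumptions for illustration. The sketch fits a single segment only; the Joinpoint software additionally searches for the years at which the slope changes.

```python
import math


def annual_percent_change(years, rates, weights=None):
    """APC from a (weighted) least-squares fit of ln(rate) on year."""
    if weights is None:
        weights = [1.0] * len(years)  # unweighted fit by default
    log_rates = [math.log(r) for r in rates]
    sw = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, years)) / sw
    y_bar = sum(w * y for w, y in zip(weights, log_rates)) / sw
    sxy = sum(w * (x - x_bar) * (y - y_bar)
              for w, x, y in zip(weights, years, log_rates))
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, years))
    slope = sxy / sxx
    return (math.exp(slope) - 1.0) * 100.0


# Hypothetical rates that grow by exactly 2% per year.
years = list(range(1975, 1986))
rates = [4.0 * 1.02 ** (yr - 1975) for yr in years]
apc = annual_percent_change(years, rates)
```

A positive APC indicates a rising segment and a negative APC a falling one, which is how the segment-wise values reported in the Results should be read.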
All of the patients included in the cancer-specific survival (CSS) analysis were randomly divided into a training cohort and a validation cohort at a ratio of 7:3. We first used the data in the training cohort to find independent prognostic factors and construct a nomogram, and then applied the resulting model to the validation cohort to evaluate the discrimination, calibration, and clinical effectiveness of the prediction model.
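A random 7:3 division of this kind can be sketched as follows. The function name, the fixed seed, and the use of integer patient identifiers are assumptions for illustration; the study does not report how the randomization was implemented.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=2024):
    """Shuffle the identifiers reproducibly and cut them at the given fraction."""
    rng = random.Random(seed)  # local generator, so the split is repeatable
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical cohort of 1000 patients identified by index.
training, validation = split_cohort(range(1000))
```

Fixing the seed makes the split reproducible, which matters because the baseline comparison between the two cohorts depends on the particular random draw.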
Differences in the distribution of categorical variables between the training cohort and validation cohort were estimated using the chi-square test. Differences in age between the two cohorts were assessed using Student's t test, and differences in survival time were assessed using the log-rank test. Statistical analyses to identify risk factors were performed by applying the backward stepwise selection method of multivariable Cox regression to the training cohort. A nomogram was then established based on the identified risk factors.
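The way a nomogram turns a fitted Cox model into a point scale can be sketched as follows. Each variable's contribution to the linear predictor is rescaled so that the variable with the widest effect spans 100 points, and the survival probability follows from S(t | x) = S0(t)^exp(lp). The coefficients, variable names, axis ranges, and baseline survival below are hypothetical illustration values, not the values fitted in this study, and all coefficients are assumed to be positive for simplicity.

```python
import math

# Hypothetical log hazard ratios (all positive for simplicity); these are
# illustration values, not the coefficients fitted in this study.
COEFS = {"age_decades": 0.30, "advanced_stage": 1.20, "unmarried": 0.20}
# Assumed (low, high) range of each variable on its nomogram axis.
RANGES = {"age_decades": (4.0, 9.0),
          "advanced_stage": (0.0, 1.0),
          "unmarried": (0.0, 1.0)}


def linear_predictor(patient):
    """Sum of coefficient times covariate value."""
    return sum(COEFS[name] * patient[name] for name in COEFS)


def nomogram_points(patient):
    """Rescale each contribution so the widest effect spans 0 to 100 points."""
    max_effect = max(COEFS[n] * (hi - lo) for n, (lo, hi) in RANGES.items())
    return {n: 100.0 * COEFS[n] * (patient[n] - RANGES[n][0]) / max_effect
            for n in COEFS}


def survival_probability(patient, reference, baseline_survival):
    """S(t | x) = S0(t) ** exp(lp(x) - lp(reference))."""
    shift = linear_predictor(patient) - linear_predictor(reference)
    return baseline_survival ** math.exp(shift)


# Reference patient sits at the low end of every axis (zero points).
reference = {"age_decades": 4.0, "advanced_stage": 0.0, "unmarried": 0.0}
patient = {"age_decades": 6.5, "advanced_stage": 1.0, "unmarried": 0.0}
points = nomogram_points(patient)
total_points = sum(points.values())
css_5y = survival_probability(patient, reference, baseline_survival=0.95)
```

In this sketch the total number of points is a linear rescaling of the Cox linear predictor, so a higher total always corresponds to a lower predicted survival probability.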
The discrimination of the nomogram was evaluated using the concordance index (C-index) calculated by Harrell's C statistic, the net reclassification index (NRI), and the integrated discrimination improvement (IDI). The C-index describes how well the ordering of the risks predicted by the model agrees with the ordering of the observed outcomes. This index ranges from 0.5 (no discrimination) to 1 (perfect discrimination), with a value of ≥ 0.70 indicating that the discrimination of the prediction model is acceptable. Values of NRI and IDI of > 0 (compared with the traditional AJCC staging system) indicate that the prediction ability of the nomogram is better than that of the AJCC staging system, while negative values would indicate that it is inferior.
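Harrell's C statistic can be illustrated with a minimal sketch for right-censored data. A pair of patients is comparable when the one with the shorter follow-up time experienced the event, and it is concordant when that patient also has the higher predicted risk. The function name and the four-patient example are hypothetical; tied survival times are simply treated as not comparable here, which is a simplification.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs ordered correctly by risk.

    events[i] is 1 for a cancer-specific death and 0 for censoring.
    Tied risk scores count as half a concordant pair.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Subject i must fail before subject j's last observed time.
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical example: the third patient is censored at 6 years.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.7, 0.8, 0.1]
c_index = harrell_c_index(times, events, risks)
```

In this example five pairs are comparable and four are ordered correctly, so the statistic is 4/5.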
The calibration of the nomogram was evaluated using a calibration plot, on which the abscissa shows the predicted values for different groups and the ordinate shows the actual probabilities. The value points for different groups are connected by line segments to form a calibration line. A calibration curve that is closer to the standard line of y = x indicates a smaller error between the model's prediction and the actual situation, and hence a better calibration capability of the model. The clinical effectiveness of the nomogram was evaluated using the receiver operating characteristics (ROC) curve.
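The construction of the points on a calibration plot can be sketched as follows: patients are sorted by predicted probability, divided into equally sized groups, and the mean prediction in each group is paired with the observed proportion. The function name and data are hypothetical, and the sketch assumes a fully observed binary outcome at a fixed horizon; with censored survival data the observed value for each group would typically be a Kaplan-Meier estimate instead.

```python
def calibration_points(predicted, observed, n_groups=5):
    """Return (mean predicted, observed fraction) for each group of patients."""
    pairs = sorted(zip(predicted, observed))  # sort by predicted probability
    n = len(pairs)
    points = []
    for g in range(n_groups):
        group = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not group:
            continue  # fewer patients than groups
        mean_pred = sum(p for p, _ in group) / len(group)
        obs_frac = sum(o for _, o in group) / len(group)
        points.append((mean_pred, obs_frac))
    return points


# Hypothetical predictions and outcomes (1 = survived to the horizon).
predicted = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
observed = [0, 0, 1, 0, 1, 0, 1, 1]
cal = calibration_points(predicted, observed, n_groups=2)
```

Here both groups fall on the line y = x, which is what a well-calibrated model would look like on the plot.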
Statistical analyses were performed using the R software (version 3.5.1; https://www.r-project.org/). Statistical significance was defined as a two-sided probability value of < 0.05.
Among female UCB patients, the age-adjusted incidence rate increased slightly from 3.8 per 100,000 persons in 1975 to 5.3 per 100,000 persons in 2016. Only one join point was identified (Fig. 1), with the incidence rate showing a slowly increasing trend from 1975 to 1996 (APC = 2.0%, 95% CI = 1.7-2.4%, P < 0.0001), followed by a slowly decreasing trend from 1997 to 2016 (APC = −0.8%, 95% CI = −1.2% to −0.5%, P < 0.0001).
In terms of treatment modalities, the regional lymph nodes were removed in 2215 (19.24%) patients, 3814 (33.13%) had received chemotherapy, and 718 (6.24%) had received radiation. There were no significant differences between the training and validation cohorts in sex, ethnic group, tumor size, marital status, insurance status, differentiation grade, metastasis stage, AJCC stage, tumor location, regional lymph node removal status, chemotherapy status, or radiation status (P > 0.05). The patients in the validation cohort were slightly older than those in the training cohort (P = 0.04). The log-rank test showed that the survival time did not differ significantly between the training and validation cohorts (P = 0.3).

Independent prognostic factors and construction of the nomogram
Multivariable Cox regression with the backward stepwise selection method revealed that the statistically significant factors affecting postsurgery UCB survival in the training cohort were the age at diagnosis, ethnic group, insurance status, marital status, differentiation grade, AJCC stage, regional lymph nodes removed status, chemotherapy status, and tumor size (Table 2). These independent prognostic factors were used to construct a prognostic nomogram for predicting the 3-, 5-, and 8-year CSS of postsurgical patients with UCB (Fig. 2). The nomogram shows that the age at diagnosis and the AJCC stage were the strongest factors influencing the prognosis.
The nomogram had a C-index of 0.823, and its NRI and IDI values relative to the AJCC staging system were greater than 0. These performance indicators demonstrate that the nomogram showed better discrimination than the AJCC staging system.
The calibration plots showed excellent consistency between the observed and nomogram-predicted probabilities in the training and validation cohorts (Fig. 3). The ROC curve of the predictive model showed good clinical effectiveness in both the training cohort (Fig. 4A), with areas under the ROC curve (AUCs) for 3, 5, and 8 years of follow-up of 0.831, 0.808, and 0.789, respectively, and the validation cohort (Fig. 4B), with corresponding AUCs of 0.811, 0.798, and 0.789.
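The meaning of an AUC at a given follow-up time can be illustrated with a minimal sketch: it is the probability that a randomly chosen patient who died of the cancer by that time was assigned a higher risk than a randomly chosen patient who did not. The function name and data are hypothetical, and the sketch assumes that the status at the horizon is known for every patient; time-dependent ROC methods for survival data additionally handle censoring.

```python
def auc(risk_scores, died_by_horizon):
    """AUC as the fraction of (case, control) pairs ranked correctly."""
    cases = [s for s, d in zip(risk_scores, died_by_horizon) if d == 1]
    controls = [s for s, d in zip(risk_scores, died_by_horizon) if d == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5  # ties count as half
    return wins / (len(cases) * len(controls))


# Hypothetical risks and 5-year cancer-specific death status.
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
status = [1, 0, 1, 0, 0]
auc_5y = auc(scores, status)
```

In this example five of the six case-control pairs are ranked correctly, giving an AUC of 5/6.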

Discussion
This study analyzed the incidence trends of UCB and established a survival prediction model for postsurgery UCB based on data in the SEER database. From 1975 to 2016, the overall incidence rate showed an upward trend, despite a slight decrease from the beginning of the twenty-first century. The overall upward trend in the incidence of UCB over the past 40 years is consistent with the results of many studies, although the types of pathologies investigated have varied [18][19][20][21][22]. This increase is mostly attributable to progress in the development of diagnostic tools, especially ultrasonography, computed tomography, and magnetic resonance imaging [23]. Another possible reason is the global trend of population aging, since this cancer is more common in the elderly.
The joinpoint regression also found that the incidence of UCB was not always rising: in men and in the general population it went through a rapid rise, a slow rise, and then a decline. We speculate that the downward trend may be related to the control of tobacco consumption. Tobacco smoking is the main risk factor for bladder cancer [24]. A report from the Centers for Disease Control and Prevention showed that the smoking rate has decreased markedly in American adults over the past few decades, from 42.4% in 1965 to 16.8% in 2014 [18]. However, there is a long latency between tobacco exposure and bladder cancer diagnosis [25], which may explain why the downward trend only began to appear around 2000. Another issue is that the incidence of UCB in females began to decline in earlier years. We suspect that the possible explanation is that women had a lower bladder cancer incidence because of potential biologic factors, and the decrease in tobacco consumption exerted a more significant impact on them.
Our study found that the prognosis is worse for postsurgery UCB patients who are single, separated, divorced, or widowed than it is for married patients. We speculate that this could be due to the mental status of UCB patients affecting their survival. It has been shown that single patients with bladder cancer are more likely to have a posttreatment psychiatric diagnosis than are married patients, and that the prognosis of bladder cancer is worse in patients with a psychiatric diagnosis [26]. Other analyses of the prognosis of bladder cancer using data from the SEER database have also found that the marital status can affect the prognosis of the disease [27,28].
We further found that the prognosis is worse in patients without insurance than in those receiving medical insurance/medical assistance. This is somewhat consistent with the findings of Sung et al. [29] based on California Cancer Registry data that the survival time for bladder cancer is worse for uninsured patients and those with an unknown insurance status than it is for those with managed care, although there was no significant difference in the CSS. That study also found that among all insurance categories, the prognosis was worst for Medicaid insurance in the USA. We speculate that the main reason is that Medicaid is aimed at low-income people, who are less likely to receive treatment within 12 weeks of a diagnosis [30]. Sung et al. [29] also found that Medicaid patients had more advanced-stage, higher-grade tumors compared with patients covered by Medicare or managed care, and so their prognosis may be worse. This has been confirmed in other previous research [31]. In our study, we did not subdivide the patients into different types of insurance, instead only dividing them into insured and uninsured/unknown, which may be the main reason for the difference in the research results. Regardless, the type of and accessibility to medical insurance may affect the survival rate of bladder cancer, possibly due to differences in basic living conditions (e.g., income and living environment), disease prevention, and the treatment of people covered by different types of medical insurance.
Other independent prognostic factors for postsurgery UCB identified in this study were the age at diagnosis, black ethnic group, lower differentiation grade, lower AJCC stage, no regional lymph nodes removed, not receiving chemotherapy, and larger tumor, which are traditional prognostic factors for bladder cancer that have been reported previously [32][33][34]. Based on these factors and the aforementioned marital status and insurance status, we established a nomogram for the individualized prognosis of postsurgery UCB, and found that the AJCC stage and the age had the greatest impact on individualized prognoses. This was not surprising. The AJCC stage itself reflects the severity of the tumor to a large extent. Elderly patients, on the other hand, usually have reduced physiological function coupled with other underlying diseases, resulting in significantly increased perioperative mortality and postoperative complications. Additionally, the risk of recurrence increases with age, and the prognosis of older patients is poor [35]. However, the contribution of other variables to the model cannot be ignored. We calculated the NRI and IDI of the established model using "Age + AJCC stage" as the control model and found that the NRI values for 3, 5, and 8 years of follow-up were 0.23, 0.2, and 0.17, respectively, in the training cohort, and 0.19, 0.12, and 0.12 in the validation cohort; the corresponding IDI values were 0.03, 0.03, and 0.03 in the training cohort, and 0.02, 0.02, and 0.03 in the validation cohort (all P < 0.001). These results indicate that variables other than AJCC stage and age also made a positive contribution to the prediction of prognosis.
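What NRI and IDI measure when a new model is compared with a control model can be illustrated with a minimal sketch. It uses the category-free (continuous) NRI and the standard IDI definition for a binary outcome at a fixed horizon; which NRI variant was used in this study, and how censoring was handled, is not stated, so this is an assumption. The function name and the four-patient data are hypothetical.

```python
def nri_idi(p_old, p_new, event):
    """Category-free NRI and IDI of a new model against a control model.

    p_old and p_new are predicted event probabilities from the control and
    new models; event[i] is 1 if the event occurred by the horizon, else 0.
    """
    events = [(o, n) for o, n, e in zip(p_old, p_new, event) if e == 1]
    nonevents = [(o, n) for o, n, e in zip(p_old, p_new, event) if e == 0]
    if not events or not nonevents:
        raise ValueError("need at least one event and one non-event")

    def net_up(pairs):
        # Proportion moved up minus proportion moved down by the new model.
        up = sum(1 for o, n in pairs if n > o)
        down = sum(1 for o, n in pairs if n < o)
        return (up - down) / len(pairs)

    def mean_gain(pairs):
        # Average change in predicted probability.
        return sum(n - o for o, n in pairs) / len(pairs)

    nri = net_up(events) - net_up(nonevents)
    idi = mean_gain(events) - mean_gain(nonevents)
    return nri, idi


# Hypothetical predictions: the new model moves every patient the right way.
p_old = [0.5, 0.5, 0.5, 0.5]
p_new = [0.7, 0.6, 0.4, 0.3]
event = [1, 1, 0, 0]
nri, idi = nri_idi(p_old, p_new, event)
```

Both quantities are positive when the new model raises predicted risk for patients who have the event and lowers it for those who do not, which is the sense in which values greater than 0 favor the new model.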
The nomogram developed in this study is the first one reported for postsurgery UCB. Zhang et al. [36] established a nomogram for the individualized prognosis of bladder cancer based on data in the SEER database. The variables in that model include the age at diagnosis, ethnic group, sex, and TNM stage. That model also indicated that age and the T stage have the greatest impact on the prognosis, which is essentially consistent with our model; the main differences are that we used AJCC staging, which is also based on the TNM stage, and we targeted postsurgery UCB. Our nomogram might be superior since we take into account the clinical treatment received by the patients and a broader range of demographic information. In addition, the nomogram that we have established exhibits good discrimination, calibration, and clinical effectiveness, and a better prognostic ability for postsurgery UCB than the currently used AJCC staging system. This easy-to-use nomogram can help doctors to estimate the likelihood that a patient will survive at a certain point in time.
Several limitations of this study should be considered. Firstly, the data used in the validation cohort also came from the SEER database, and so the nomogram still needs to be validated using data from another database or using clinical prospective data. Secondly, some important clinical factors were not collected, such as the smoking status after diagnosis, parameters of social status (e.g., socioeconomic status or level of education), condition of the underlying disease, comorbidities, and biochemical indicators such as the C-reactive protein level. The data available are also subject to the limitations of the SEER database. Finally, for patients with bladder cancer to have a good prognosis, preventing relapse is also an important indicator for the clinical treatment of the disease [37,38], but we did not analyze the risk of recurrence in patients.

Conclusions
In conclusion, this study has revealed the incidence trends of UCB and constructed a nomogram for predicting the long-term survival of individual postsurgery UCB patients based on a population cohort. The nomogram showed good predictive performance, and may serve as an effective and convenient evaluation tool for helping surgeons to perform personalized survival predictions and mortality risk identification in postsurgery UCB patients.

2019SF-140). The funders had no role in the study design, collection, analysis, interpretation, or writing of the manuscript.

Availability of data and materials
The data that support the findings of this study are available on request from the corresponding author.