Evaluation of the POSSUM, P-POSSUM and E-PASS scores in the surgical treatment of hilar cholangiocarcinoma

Background The Physiological and Operative Severity Score for the enUmeration of Mortality and morbidity (POSSUM) model, its Portsmouth (P-POSSUM) modification and the Estimation of Physiologic Ability and Surgical Stress (E-PASS) are three surgical risk scoring systems used extensively to predict postoperative morbidity and mortality in general surgery. The aim was to undertake the first study of the predictive value of these models in patients undergoing surgical treatment of hilar cholangiocarcinoma. Methods A retrospective analysis was performed on data collected prospectively over a 10-year interval from January 2003 to December 2012. The morbidity and mortality risks were calculated using the POSSUM, P-POSSUM and E-PASS equations. Results One hundred patients underwent surgical treatment of hilar cholangiocarcinoma. Complications were seen in 52 of 100 patients (52.0%). There were 10 postoperative in-hospital deaths (10.0%). Of 31 preoperative and intraoperative variables studied, operative type (P < 0.001), preoperative serum albumin (P = 0.003) and aspartate aminotransferase (P = 0.029) were independently associated with postoperative complications on multivariate analysis. Intraoperative blood loss (P = 0.015), Bismuth-Corlette classification (P = 0.033) and preoperative hemoglobin (P = 0.041) were independently associated with in-hospital death on multivariate analysis. The POSSUM system predicted morbidity risk effectively, with no significant lack of fit (P = 0.488) and an area under the ROC curve (AUC) of 0.843. POSSUM, P-POSSUM and E-PASS scores showed no significant lack of fit in calculating the mortality risk (P >0.05) and all yielded an AUC value exceeding 0.8. POSSUM was significantly more accurate in predicting morbidity after major and major plus operations (observed/expected (O:E) ratio 0.98 and AUC 0.901) than after minor and moderate operations (O:E ratio 1.13 and AUC 0.759).
Conclusions POSSUM, P-POSSUM and E-PASS scores effectively predict morbidity and mortality in surgical treatment of hilar cholangiocarcinoma. However, improvements are still needed in the future because none of these scoring systems yielded an AUC value exceeding 0.9 for operations with all different levels of severity. Only POSSUM had more accuracy in predicting postoperative morbidity after operations with higher severity. Trial registration This study was undertaken after obtaining approval from the ethics committee of School of Medicine, Shanghai Jiao Tong University with a trial registration number of http://09411960800.


Background
In the last 20 years, surgical treatment of hilar cholangiocarcinoma has evolved mainly because of an enhanced appreciation of tumor characteristics and improvements in preoperative imaging [1]. However, because it requires complex biliary and hepatic resections, it is still considered one of the most challenging procedures faced by hepatobiliary surgeons. Postoperative morbidity and mortality rates remain high (14 to 76% and 0 to 19%, respectively), even in reports from high-volume centers [1][2][3][4][5]. Accurate prediction of outcomes after a high-risk procedure such as surgical treatment of hilar cholangiocarcinoma could aid early detection of postoperative complications, allow improved treatment planning and increase the precision of individual prognosis.
Many surgical risk scoring systems have been devised, but the Physiological and Operative Severity Score for the enUmeration of Mortality and morbidity (POSSUM) model by Copeland et al. [6] was recommended as the most appropriate for general surgery [7,8]. This model, which uses scores for twelve physiological and six operative variables, was developed to predict postoperative in-hospital mortality and morbidity. However, POSSUM was later reported to overpredict postoperative mortality, particularly in patients at low risk. This led to a revision: the Portsmouth modification (P-POSSUM) by Whiteley et al. [9]. Another surgical risk scoring system that has been validated in hepatobiliary surgery worldwide is the Estimation of Physiologic Ability and Surgical Stress (E-PASS) [10][11][12]. This system comprises a preoperative risk score (PRS), a surgical stress score (SSS) and a comprehensive risk score (CRS) that is calculated from both the PRS and SSS.
Surgical risk scoring systems were initially designed for large populations with different pathologies. Later, however, they were also applied to patients with a single diagnosis or one type of operation. It is unclear whether these scoring systems are useful for predicting morbidity and mortality in high-risk surgery such as surgical treatment of hilar cholangiocarcinoma. Hellmann et al. [13] conducted the first and only study to evaluate POSSUM for the surgical treatment of cholangiocarcinoma, and found that the model overestimated postoperative morbidity and mortality. However, owing to the heterogeneity of its selection design, that study considered only a limited and unclear number of operations for hilar cholangiocarcinoma over a 13-year period. The aim of the present study was to evaluate the POSSUM and E-PASS scores in surgery for hilar cholangiocarcinoma, a very high-risk context in which, to our knowledge, they have never been applied.

Patients
Data collected between January 2003 and December 2012 were analyzed retrospectively from a prospectively maintained database. Consecutive patients treated surgically in this center following a diagnosis of hilar cholangiocarcinoma were studied, and only patients with histologically confirmed cholangiocarcinoma were included. Patients who underwent liver transplantation were not included in this study because the factors associated with morbidity or mortality differed significantly from those for other types of operation.
To evaluate changes in treatment strategy over time, the 10-year study interval was divided into three periods. Operative technique and postoperative care developed over the 10 years, and the last period (2009 to 2012) was characterized by more extensive resections. The operative approach in the last four years is described below [14,15]. Generally, in patients with Bismuth-Corlette classification type I, we performed bile duct resection only. In patients with types II and III, an (extended) right or left hepatectomy was performed. Patients with type IV underwent a right or left trisectionectomy. We routinely dissected the lymph nodes surrounding the hepatoduodenal ligament, behind the pancreatic head and around the common hepatic artery. Biliary continuity was restored by Roux-en-Y hepaticojejunostomy with an isoperistaltic 70 cm limb of jejunum. When the surgeon found the hilar cholangiocarcinoma to be unresectable, patients underwent exploratory laparotomy without curative intent, with or without palliative biliodigestive anastomosis.
All preoperative, intraoperative and postoperative patient data (31 preoperative and intraoperative variables) were collected and entered into a computer database prospectively. The POSSUM and E-PASS scores were calculated, and the multivariate analysis was performed, retrospectively from the collected data and medical records according to defined criteria. The morbidity risk was calculated using the POSSUM equation. The mortality risk was calculated using the POSSUM, P-POSSUM and E-PASS equations, respectively. Complications were evaluated based on the original POSSUM [6] and E-PASS [16] definitions and graded according to the Clavien complication scheme [17]. In-hospital mortality was recorded for each patient.
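The risk equations themselves are not reproduced in the text. As a minimal sketch, the following assumes the logistic equations as they are commonly quoted from the original POSSUM and P-POSSUM publications [6,9]; the coefficients are an assumption and should be checked against those sources before use. The function names are illustrative, and the E-PASS equations are deliberately omitted.

```python
import math


def logistic(z: float) -> float:
    """Convert a log-odds value into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


# Coefficients below are the commonly quoted published values (assumed here,
# not taken from this article). PS = physiological score, OS = operative
# severity score.
def possum_morbidity(physiological_score: float, operative_score: float) -> float:
    return logistic(-5.91 + 0.16 * physiological_score + 0.19 * operative_score)


def possum_mortality(physiological_score: float, operative_score: float) -> float:
    return logistic(-7.04 + 0.13 * physiological_score + 0.16 * operative_score)


def p_possum_mortality(physiological_score: float, operative_score: float) -> float:
    return logistic(-9.065 + 0.1692 * physiological_score + 0.1550 * operative_score)


if __name__ == "__main__":
    ps, os_score = 20, 20  # hypothetical patient, for illustration only
    print(f"POSSUM morbidity risk:   {possum_morbidity(ps, os_score):.3f}")
    print(f"POSSUM mortality risk:   {possum_mortality(ps, os_score):.3f}")
    print(f"P-POSSUM mortality risk: {p_possum_mortality(ps, os_score):.3f}")
```

Under these assumed coefficients, P-POSSUM returns a lower mortality risk than POSSUM for the same scores, which is consistent with its purpose of correcting the overprediction of mortality by POSSUM.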

Statistical analysis
Clinical parameters were compared among the three time periods using the χ2 goodness-of-fit test. Multivariate analysis was performed using the logistic regression method. To test goodness of fit, frequency tables were constructed with 10 risk bands and compared using the χ2 test according to the method of Hosmer and Lemeshow [18]. A good model is indicated by a high P value. For each model, patients were ordered from the lowest to the highest predicted postoperative morbidity or mortality risk and divided into 10 risk bands, each containing the same number of subjects. Expected and observed complications or deaths were quantified in each band.
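The calibration procedure described above can be sketched as follows. This is an illustration under stated assumptions rather than the SPSS procedure actually used: the function names are hypothetical, the degrees of freedom are taken as the number of bands minus two (the conventional choice), and the closed-form χ2 tail probability applies only to an even number of degrees of freedom, which holds for 10 bands.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable (even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def hosmer_lemeshow(outcomes, risks, bands=10):
    """Return (statistic, P value) for equal-sized risk bands.

    outcomes: 0/1 per patient; risks: predicted probability strictly in (0, 1).
    """
    n = len(outcomes)
    if n != len(risks) or n < bands:
        raise ValueError("need matching inputs with at least one patient per band")
    # Order patients from lowest to highest predicted risk.
    ordered = sorted(zip(risks, outcomes), key=lambda pair: pair[0])
    statistic = 0.0
    for g in range(bands):
        band = ordered[g * n // bands:(g + 1) * n // bands]
        size = len(band)
        expected = sum(r for r, _ in band)
        observed = sum(y for _, y in band)
        # Contribution of events plus contribution of non-events.
        statistic += (observed - expected) ** 2 / expected
        statistic += (observed - expected) ** 2 / (size - expected)
    return statistic, chi2_sf_even_df(statistic, bands - 2)


if __name__ == "__main__":
    # Synthetic, perfectly calibrated data: 50% risk, 50% events in every band.
    y = [0, 1] * 50
    p = [0.5] * 100
    print(hosmer_lemeshow(y, p))
```

A high P value from this test means that observed and expected event counts do not differ significantly across bands, that is, there is no significant lack of fit.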
The discriminatory power of each model was assessed by calculating the area under the receiver-operating characteristic curve (AUC). Values ranging from 0.7 to 0.9 represent reasonable discrimination. Values exceeding 0.9 represent good discrimination. The differences in morbidity and mortality rates between the risk bands were analyzed using the χ2 goodness-of-fit test. Categorical variables were compared between groups using the χ2 test with Yates's correction for continuity [11]. Statistical calculations were carried out with SPSS computer software (SPSS, Chicago, Illinois, United States). A value of P <0.05 was considered statistically significant.
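For illustration, the AUC can be computed directly from its rank interpretation: it is the probability that a randomly chosen patient with the event received a higher predicted risk than a randomly chosen patient without it, with ties counted as one half. The sketch below uses hypothetical function names and synthetic data; the interpretation bands are those stated above, and values below 0.7 are not classified in the text.

```python
def auc(outcomes, risks):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form)."""
    with_event = [r for r, y in zip(risks, outcomes) if y == 1]
    without_event = [r for r, y in zip(risks, outcomes) if y == 0]
    if not with_event or not without_event:
        raise ValueError("both outcome classes must be present")
    wins = 0.0
    for a in with_event:
        for b in without_event:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(with_event) * len(without_event))


def describe_discrimination(value: float) -> str:
    """Apply the interpretation bands used in this study."""
    if value > 0.9:
        return "good"
    if value >= 0.7:
        return "reasonable"
    return "below the bands defined in the text"


if __name__ == "__main__":
    y = [0, 0, 1, 1]               # synthetic outcomes
    p = [0.10, 0.40, 0.35, 0.80]   # synthetic predicted risks
    value = auc(y, p)
    print(value, describe_discrimination(value))
```

The pairwise form is quadratic in the number of patients, which is acceptable for a cohort of this size; a rank-based implementation would be preferable for large datasets.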

Parameters and outcome
The 100 consecutive patients who underwent surgical treatment of hilar cholangiocarcinoma during the study period were included in the present study (Table 1). Four of the patients had hepatolithiasis and one had liver cirrhosis. Thirteen of the patients had diabetes, five had hypertension and six had coronary disease. The preoperative serum concentration of total bilirubin was greater than 18 μmol/L in 96 of the patients. The preoperative hemoglobin was less than 10 g/dL in 13 of the patients. Ten of the patients had Child's grade A, 85 had Child's grade B and 5 had Child's grade C liver status. The various types of operation performed are shown in Table 2. From the second period (2006 to 2008) onward, our center adopted new aggressive approaches for patients with hilar cholangiocarcinoma, which resulted in an increase in the R0 resection rate from 21% in the first period to 45% in the last period (Table 3). Postoperative complications were seen in 52 of 100 patients (52.0%), with some patients having more than one complication. There were 10 in-hospital deaths (10.0%). Postoperative morbidity and in-hospital mortality did not differ among the three periods (P >0.05).

POSSUM, P-POSSUM and E-PASS scores
The calibration power of POSSUM, P-POSSUM and E-PASS was analyzed using the Hosmer-Lemeshow test after patients were divided into 10 risk bands (Table 5 and Table 6) [18]. Statistically significant differences in the postoperative morbidity or in-hospital mortality rate were detected between the risk bands of all three models using the χ2 goodness-of-fit test. When predicted morbidity by POSSUM score was compared with observed morbidity, an overall O:E (observed/expected) ratio of 1.00 was found (Table 7). This model showed no significant lack of fit (P = 0.488) and yielded an AUC of 0.843 (Figure 1). The POSSUM, P-POSSUM and E-PASS scores showed no significant lack of fit in calculating the mortality risk (P >0.05). P-POSSUM and E-PASS performed well and gave an O:E ratio of 1.00, while POSSUM gave an O:E ratio of 1.11 (Table 7). All scoring systems yielded an AUC value exceeding 0.8, and none of them showed a higher AUC value in predicting in-hospital mortality than the others (P >0.05, Table 7 and Figure 2).
To evaluate the effect of different surgical procedures on the predictive value of each model, all cases were divided into four groups according to operative severity. Definitions of operative severity are shown in Table 2. POSSUM was significantly more accurate in predicting morbidity after major and major plus operations (O:E ratio 0.98 and AUC 0.901) than after minor and moderate operations (O:E ratio 1.13 and AUC 0.759, P <0.05). However, no additional value was found for the POSSUM, P-POSSUM and E-PASS scores in predicting in-hospital mortality after major and major plus operations (O:E ratio 1.33, 1.14, 1.33 and AUC 0.803, 0.796, 0.852) compared with minor and moderate operations (O:E ratio 1.00, 1.00, 0.00 and AUC 0.986, 0.973, 0.833, P >0.05).
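As a brief illustration of the O:E ratios reported above, the expected number of events in a group is the sum of the individual predicted risks, and the ratio divides the observed event count by that sum. The sketch below uses a hypothetical function name and synthetic values, not the study data.

```python
def oe_ratio(outcomes, predicted_risks):
    """Observed events divided by expected events (sum of predicted risks)."""
    if len(outcomes) != len(predicted_risks):
        raise ValueError("one predicted risk is needed per patient")
    expected = sum(predicted_risks)
    if expected <= 0:
        raise ValueError("expected event count must be positive")
    return sum(outcomes) / expected


if __name__ == "__main__":
    # Synthetic group: 2 observed events, 2.0 expected events.
    print(oe_ratio([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5]))
```

A ratio above 1 indicates that the model underpredicts events in that group, a ratio below 1 indicates overprediction, and a ratio of 0.00 arises when no events were observed in the group.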

Discussion
Postoperative complications and death depend on three major factors: the quality of the surgical team, the patient's physiological status and the degree of surgical stress [16]. Where the quality of a surgical team in one hospital has remained stable for a certain period, surgical risk scoring systems can be applied to assess the risk of complications and death by quantifying the patient's physiological status and the surgical stress applied. Several surgical groups have used POSSUM [17], P-POSSUM [18] and E-PASS [12,19] successfully to perform comparative audit in hepato-biliary-pancreatic surgery. Therefore, these scoring systems were chosen for the present study. This is the first time that surgical risk scoring systems have been applied to a population consisting specifically of patients undergoing surgery for hilar cholangiocarcinoma. With a postoperative morbidity rate of 52.0% and an in-hospital mortality rate of 10.0%, our institution lies within the accepted range of complications after surgical treatment of hilar cholangiocarcinoma [1]. When the χ2 test was used to compare actual morbidity and mortality rates with estimated ones, there was no significant lack of fit (P >0.05), indicating that the POSSUM and E-PASS scoring systems accurately estimate the outcomes. They also yielded AUC values exceeding 0.7, suggesting their utility in predicting morbidity and mortality after surgery for hilar cholangiocarcinoma. However, improvements are still needed in the future because none of these scoring systems yielded an AUC value exceeding 0.9 for operations with all different levels of severity. A previous meta-analysis [20] and some reports [10] found that the POSSUM and E-PASS scoring systems failed to offer a significant predictive value for morbidity and mortality after hepatobiliary surgery. The main reason for the different findings may be that surgery for hilar cholangiocarcinoma is a more complex and severe operation than other hepatobiliary procedures.
It has a higher operative severity score in POSSUM and a higher surgical stress score in E-PASS and therefore results in a higher risk prediction. Because the potential for morbidity and mortality is greater after this operation, surgical risk scoring systems would be expected to demonstrate a more accurate predictive value. We therefore evaluated the corresponding results when only the patients who underwent major and major plus operations were included. POSSUM was indeed more accurate in predicting postoperative morbidity after major and major plus operations. Similar findings have been observed in other studies, where POSSUM had a significantly more accurate predictive value for higher acuity procedures, such as pancreaticoduodenectomy, than for other pancreatic surgeries [17,21]. However, no additional value was found for the POSSUM, P-POSSUM and E-PASS scores in predicting in-hospital mortality after major and major plus operations. Firstly, operative type was not independently associated with in-hospital death on multivariate analysis in our study; it therefore remains unclear whether the type of operation influences the validity of the scores. Secondly, some independent factors for morbidity and mortality, such as operation type, intraoperative blood loss and preoperative hemoglobin, are scored in POSSUM, P-POSSUM and E-PASS. Multivariate predictors for hepatobiliary surgery may differ from those in the POSSUM and E-PASS scoring systems [22]. Based on the findings of our multivariate analysis, preoperative serum albumin, aspartate aminotransferase and the Bismuth-Corlette classification are independent factors associated with postoperative morbidity or in-hospital mortality but are included in neither the E-PASS nor the POSSUM system.
Therefore, to improve the AUC value of these surgical scoring systems for hilar cholangiocarcinoma in the future, these factors might be added as new parameters in revised models. Among the three surgical risk scoring systems employed in the present study, none showed a higher AUC value in predicting in-hospital mortality than the others. The advantage of the E-PASS scoring system is the relative ease with which data are acquired: the E-PASS score needs nine variables, compared with the 18 different variables required by the POSSUM or P-POSSUM score [23]. Furthermore, POSSUM was generated only for surgical auditing and not for surgical decision making. In contrast, E-PASS has a potential role not only in surgical auditing but also in surgical decision making, both between and within individual practices [24]. In our institution, we have developed preoperative management scenarios in our pancreatobiliary surgical practice. For example, patients with a high comprehensive risk score (CRS) are given additional preoperative interventions, such as enteric tube feedings, hyperalimentation, antibiotics and biliary stenting, to improve preoperative parameters. This is often indicated, particularly when patients present with malignant obstructive jaundice, comorbid cardiac or respiratory illness, diabetes or malnutrition.
There are some limitations to the present study. Firstly, since hilar cholangiocarcinoma is an uncommon neoplasm, only a relatively small group (100 patients) was available for analysis over a long period of time (2003 to 2012), and the mortality rate corresponds to only ten patients. Operative technique and postoperative care developed during these 10 years, and treatment strategies are still evolving. Secondly, in constructing the E-PASS model, postoperative complications were included only when medical or interventional treatment had been carried out, and mild complications were not regarded as equivalent to severe ones [16]. POSSUM and P-POSSUM, however, use a different definition and analysis of complications [6], which may affect the comparison of the predictive value of the scoring systems in our study.