Research Article | Volume 48, Issue 1, P38-44, July 2014

Comparison of Three Contemporary Risk Scores for Mortality Following Elective Abdominal Aortic Aneurysm Repair

  • S.W. Grant
    Correspondence
    Corresponding author. S.W. Grant, University Hospital of South Manchester, Academic Surgery Unit, Education and Research Centre, Southmoor Road, Manchester M23 9LT, UK.
    Affiliations
    The University of Manchester, Manchester Academic Health Science Centre, UHSM, Academic Surgery Unit, Education and Research Centre, Manchester, UK

    University College London, National Institute for Cardiovascular Outcomes Research, Institute of Cardiovascular Science, London, UK
  • G.L. Hickey
    Affiliations
    University College London, National Institute for Cardiovascular Outcomes Research, Institute of Cardiovascular Science, London, UK

    The University of Manchester, Manchester Academic Health Science Centre, Centre for Health Informatics, Manchester, UK
  • E.D. Carlson
    Affiliations
    The University of Manchester, Manchester Academic Health Science Centre, UHSM, Academic Surgery Unit, Education and Research Centre, Manchester, UK
  • C.N. McCollum
    Affiliations
    The University of Manchester, Manchester Academic Health Science Centre, UHSM, Academic Surgery Unit, Education and Research Centre, Manchester, UK
Open Access | Published: April 01, 2014 | DOI: https://doi.org/10.1016/j.ejvs.2014.03.040

      Objective/background

      A number of contemporary risk prediction models for mortality following elective abdominal aortic aneurysm (AAA) repair have been developed. Before a model is used either in clinical practice or to risk-adjust surgical outcome data it is important that its performance is assessed in external validation studies.

      Methods

      The British Aneurysm Repair (BAR) score, Medicare, and Vascular Governance North West (VGNW) models were validated using an independent, prospectively collected sample of multicentre clinical audit data. Data on 1,124 consecutive patients undergoing elective AAA repair at 17 hospitals in the north-west of England and Wales between April 2011 and March 2013 were analysed. The outcome measure was in-hospital mortality. Model calibration (observed to expected ratio with chi-square test, calibration plots, calibration intercept and slope) and discrimination (area under receiver operating characteristic curve [AUC]) were assessed in the overall cohort and procedural subgroups.

      Results

      The mean age of the population was 74.4 years (SD 7.7); 193 (17.2%) patients were women and the majority of patients (759, 67.5%) underwent endovascular aneurysm repair. All three models demonstrated good calibration in the overall cohort and procedural subgroups. Overall discrimination was excellent for the BAR score (AUC 0.83, 95% confidence interval [CI] 0.76–0.89), and acceptable for the Medicare and VGNW models, with AUCs of 0.78 (95% CI 0.70–0.86) and 0.75 (95% CI 0.65–0.84) respectively. Only the BAR score demonstrated good discrimination in procedural subgroups.

      Conclusion

      All three models demonstrated good calibration and discrimination for the prediction of in-hospital mortality following elective AAA repair and are potentially useful. The BAR score has a number of advantages, which include being developed on the most contemporaneous data, excellent overall discrimination, and good performance in procedural subgroups. Regular model validations and recalibration will be essential.

      Keywords

      There are currently no widely used risk prediction models in elective abdominal aortic aneurysm (AAA) repair. This study validates three risk prediction models for in-hospital mortality. Overall, all models demonstrated good performance and are potentially useful. The British Aneurysm Repair (BAR) score, which was developed using the UK National Vascular Database, has a number of advantages over the other models, including being developed on the most contemporaneous data, excellent discriminatory ability overall, and good performance in procedural subgroups. Validated risk prediction models such as the BAR score can facilitate clinical decision making, improve informed consent, and risk-adjust elective AAA repair outcome data.

      Introduction

      Risk prediction models are an important aspect of contemporary surgical practice. They can be used to provide information on the risks of surgery, to guide clinical decision-making, and to risk-adjust clinical outcome data. Accurate risk prediction models are particularly important for elective abdominal aortic aneurysm (AAA) repair. First, because most patients with AAA are asymptomatic, with an increasing number of patients identified through screening programmes, accurate estimates of procedural risk are important for optimising clinical decision-making. Second, named individual surgeon outcomes following elective AAA repair have recently been published in the UK.
      • Waton S.
      • Johal A.
      • Groene O.
      • Cromwell D.
      • Mitchell D.
      • Loftus I.
      National Vascular Registry 2013 report on surgical outcomes consultant-level statistics.
      Risk-adjusting published surgical outcome data is essential if fair comparisons between surgeons and hospitals are to be made, and inappropriate risk-averse clinical decisions avoided.
      A number of risk prediction models have previously been developed for AAA repair.
      • Patterson B.O.
      • Holt P.J.E.
      • Hinchliffe R.
      • Loftus I.M.
      • Thompson M.M.
      Predicting risk in elective abdominal aortic aneurysm repair: a systematic review of current evidence.
      However, unlike in cardiac surgery, where risk prediction models have been widely accepted and utilised in clinical practice,
      • Roques F.
      • Michel P.
      • Goldstone A.R.
      • Nashef S.A.M.
      The logistic EuroSCORE.
      risk models have never been widely used in AAA repair. Reasons for this lack of adoption include doubts about the accuracy and applicability of models to contemporary practice, lack of rigorous model validation, the inability to easily perform model calculations, and uncertainty over how such models might be used in clinical practice.
      To address the lack of suitable risk models, a number of models have recently been developed, including the British Aneurysm Repair (BAR) score, and the Medicare and Vascular Governance Northwest (VGNW) models.
      • Grant S.W.
      • Hickey G.L.
      • Grayson A.D.
      • Mitchell D.C.
      • McCollum C.N.
      National risk prediction model for elective abdominal aortic aneurysm repair.
      • Giles K.A.
      • Schermerhorn M.L.
      • O'Malley A.J.
      • Cotterill P.
      • Jhaveri A.
      • Pomposelli F.B.
      • et al.
      Risk prediction for perioperative mortality of endovascular vs open repair of abdominal aortic aneurysms using the Medicare population.
      • Grant S.W.
      • Grayson A.D.
      • Purkayastha D.
      • Wilson S.D.
      • McCollum C.
      on behalf of the participants in the Vascular Governance North West P
      Logistic risk model for mortality following elective abdominal aortic aneurysm repair.
      All three models have been developed specifically for elective AAA repair and have demonstrated potential accuracy in validation studies performed to date.
      • Grant S.W.
      • Grayson A.D.
      • Mitchell D.C.
      • McCollum C.N.
      Evaluation of five risk prediction models for elective abdominal aortic aneurysm repair using the UK National Vascular Database.
      • Choke E.
      • Lee K.
      • McCarthy M.
      • Nasim A.
      • Naylor A.R.
      • Bown M.
      • et al.
      Risk models for mortality following elective open and endovascular abdominal aortic aneurysm repair: a single institution experience.
      • van Beek S.C.
      • Blankensteijn J.D.
      • Balm R.
      Validation of three models predicting in-hospital death in patients with an abdominal aortic aneurysm eligible for both endovascular and open repair.
      Surgical risk models can lose accuracy over time as practices change and outcomes improve.
      • Hickey G.L.
      • Grant S.W.
      • Murphy G.J.
      • Bhabra M.
      • Pagano D.
      • McAllister K.
      • et al.
      Dynamic trends in cardiac surgery: why the logistic EuroSCORE is no longer suitable for contemporary cardiac surgery and implications for future risk models.
      Therefore, if a model is to be recommended for use in clinical practice, it is important that it has been validated and found to be accurate for contemporary practice. The objective of this study was to perform a contemporary prospective validation of the Medicare, VGNW, and BAR risk models.

      Materials and Methods

      This was a prospective, multicentre study conducted on behalf of VGNW, a peer-led clinical governance programme that audits the results of vascular surgeons across the north-west of England and Wales. Data were collected on consecutive elective AAA repairs performed between April 2011 and March 2013. As only pseudonymous, nonidentifiable data were used for this study, ethical approval was not required. Elective procedures were defined by the timing of surgery rather than the mode of admission. Data collected included patient demographics, comorbidities, preoperative medications, preoperative investigations, procedural details and clinical outcomes.
      Patient characteristics were defined as follows: ischaemic heart disease included a history of angina, previous myocardial infarction or previous coronary revascularisation; respiratory disease was defined as the presence of chronic respiratory disease with dyspnoea on exertion or at rest; diabetes included any patient receiving treatment for diabetes (diet controlled, noninsulin- and insulin-dependent). Medications were recorded if they were being taken by the patient on admission for surgery. Abnormal electrocardiogram was recorded if there was evidence of any of the following: atrial fibrillation or any other abnormal rhythm; >5 ectopic beats/minute; Q or ST/T wave changes. Abnormal laboratory investigations were defined as serum sodium <135 or >145 mmol/L; serum potassium <3.5 or >5.5 mmol/L; urea >7.5 mmol/L; white cell count <3.0 × 10⁹/L or >11.0 × 10⁹/L; and haemoglobin <11 g/dL for women and <13 g/dL for men. Activity and outcome data for the VGNW programme are validated against Hospital Episode Statistics data by the VGNW team, with risk factor data also validated where possible.
      Data were cleaned by removing duplicate records, correcting transcriptional discrepancies, and resolving any clinical or temporal conflicts. Missing data were imputed with the sample median for continuous or ordinal variables, and the mode for dichotomous variables. Thoraco-abdominal or isolated iliac aneurysm repairs were excluded from the analysis. The primary outcome measure for the study was in-hospital mortality, defined as death due to any cause during admission for elective AAA repair. Following risk factor imputation, the Medicare, VGNW, and BAR scores were calculated for each record. Risk factors included in each model are shown in Table 1.
      Table 1. Risk factors included in the British Aneurysm Repair (BAR) score, Medicare model, and Vascular Governance Northwest (VGNW) model.
      BAR score | Medicare model | VGNW model
      Open repair | Open repair | Open repair
      Age (continuous) | Age (grouped) | Age (continuous)
      Female sex | Female | Female
      Creatinine >120 μmol/L | Chronic renal disease | Creatinine (continuous)
      Cardiac disease | End-stage renal disease | Diabetes
      Abnormal ECG | Cardiac failure | Antiplatelet medication
      Previous aortic surgery/stent | Vascular disease | Respiratory disease
      Abnormal white cell count |  |
      Abnormal sodium |  |
      AAA diameter (cm) |  |
      ASA grade (I–IV) |  |
      Note. ECG = electrocardiogram; AAA = abdominal aortic aneurysm; ASA = American Society of Anesthesiologists.
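      The imputation rule described in the methods (sample median for continuous or ordinal variables, mode for dichotomous variables) can be sketched as follows. This is a minimal illustration in Python rather than the authors' R code; the function name `impute`, the `kind` labels, and the toy values are assumptions made for the example and are not study data.

```python
from statistics import median, mode

def impute(values, kind):
    """Fill missing entries (None) with the sample median or mode.

    kind: "continuous" or "ordinal" -> median; "dichotomous" -> mode.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a variable with no observed values")
    if kind in ("continuous", "ordinal"):
        fill = median(observed)
    elif kind == "dichotomous":
        fill = mode(observed)
    else:
        raise ValueError(f"unknown variable kind: {kind}")
    return [fill if v is None else v for v in values]

# Hypothetical toy records for illustration only.
aaa_diameter_cm = [5.6, None, 6.3, 7.1]
diabetes = [0, 0, None, 1, 0]

print(impute(aaa_diameter_cm, "continuous"))
print(impute(diabetes, "dichotomous"))
```

      With an even number of observed ordinal values, `median` can return a value halfway between two categories; `statistics.median_low` would be one alternative in that case, although the paper does not say how this was handled.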
      Model performance was assessed using measures of calibration and discrimination in the overall cohort and separately in both procedural and gender subgroups. Discrimination was evaluated by determining the receiver operating characteristic (ROC) curve, which is summarized by the area under the curve (AUC) alongside an approximate 95% confidence interval (CI). An AUC of ≥0.9 is defined as “outstanding”, an AUC of 0.8–0.9 is considered “excellent”, an AUC of 0.7–0.8 represents “acceptable” discrimination, and an AUC of ≤0.5 represents no discrimination.
      • Hosmer D.W.
      • Lemeshow S.
      Applied logistic regression.
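      As an illustration of the discrimination measure, the AUC can be computed as the proportion of (death, survivor) pairs in which the death was assigned the higher predicted risk, with ties counted as one half. The sketch below is a plain Python version of that definition, not the authors' R code; the function names are hypothetical, the handling of values exactly on a threshold is an assumption, and the "unacceptable" label for values between 0.5 and 0.7 follows the wording used later in the results rather than a stated threshold. The confidence interval is not computed here.

```python
def auc(outcomes, predictions):
    """Area under the ROC curve via pairwise comparison.

    outcomes: 1 for in-hospital death, 0 for survival.
    predictions: model-predicted probability of death.
    """
    deaths = [p for y, p in zip(outcomes, predictions) if y == 1]
    survivors = [p for y, p in zip(outcomes, predictions) if y == 0]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    score = 0.0
    for d in deaths:
        for s in survivors:
            if d > s:
                score += 1.0   # concordant pair
            elif d == s:
                score += 0.5   # tied pair
    return score / (len(deaths) * len(survivors))

def describe_auc(value):
    """Map an AUC to the descriptive labels used in the text."""
    if value >= 0.9:
        return "outstanding"
    if value >= 0.8:
        return "excellent"
    if value >= 0.7:
        return "acceptable"
    if value > 0.5:
        return "unacceptable"
    return "no discrimination"

# Hypothetical toy data: two survivors and two deaths.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
print(describe_auc(0.83))
```

      The pairwise loop is quadratic in cohort size, which is adequate for a cohort of this size; a rank-based formulation would be preferable for much larger datasets.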
      In the overall cohort and procedural subgroups, model calibration has been summarised by calculating the observed to expected (O:E) ratio and an exact 95% CI. If, on average, the model is well calibrated, then the O:E ratio should be close to 1. If the O:E is above or below 1 this indicates under- and over-prediction respectively. Model calibration has been further assessed in the overall cohort by dividing the cohort into low (bottom 50%), medium (middle 25%), and high-risk (top 25%) groups based on the model's predicted mortality. Model performance statistics were calculated for each risk group. Calibration plots for each model based on these groups were produced showing the mean predicted probability of outcome against the observed proportion of outcomes. Approximate 95% CIs for the observed mortality proportions in each group are shown as error bars in the figures.
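      A minimal sketch of the O:E calculation is given below. The paper states that an exact 95% CI was used but does not name the method; the example assumes an exact Poisson (Garwood) interval for the observed number of deaths, divided by the expected number. All function names are hypothetical, and the expected count of 31.4 deaths is an assumed value for illustration only.

```python
import math

def poisson_cdf(n, mu):
    """P(X <= n) for X ~ Poisson(mu); 0.0 when n < 0."""
    if n < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total

def _bisect(f, lo, hi, iters=200):
    """Root of a monotone function f on [lo, hi] by bisection."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_poisson_ci(observed, alpha=0.05):
    """Exact (Garwood) confidence interval for a Poisson count."""
    if observed == 0:
        lower = 0.0
    else:
        # Lower limit solves P(X >= observed | mu) = alpha / 2.
        lower = _bisect(
            lambda mu: 1 - poisson_cdf(observed - 1, mu) - alpha / 2,
            0.0, float(observed))
    # Upper limit solves P(X <= observed | mu) = alpha / 2.
    hi = observed + 10.0
    while poisson_cdf(observed, hi) > alpha / 2:
        hi *= 2
    upper = _bisect(
        lambda mu: poisson_cdf(observed, mu) - alpha / 2,
        float(observed), hi)
    return lower, upper

def oe_ratio(observed, expected, alpha=0.05):
    """Observed to expected ratio with an exact interval (assumed Poisson)."""
    lower, upper = exact_poisson_ci(observed, alpha)
    return observed / expected, lower / expected, upper / expected

# 32 observed deaths against an assumed expected count of 31.4.
ratio, low, high = oe_ratio(32, 31.4)
print(f"O:E {ratio:.2f} (95% CI {low:.2f} to {high:.2f})")
```

      An interval that includes 1 is compatible with good average calibration; an exact binomial interval would be another reasonable choice and gives similar limits when the event rate is low.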
      The calibration intercepts and slope parameters were also calculated for each model in the overall cohort. The calibration intercept and slope are calculated by fitting a logistic regression model with the dependent variable set as the observed outcome and the independent variable set as the log-odds (“logit”) transformed model prediction. If the model was perfectly calibrated the intercept and slope would equal 0 and 1 respectively. The intercept is a measure of overall model calibration, which is whether predictions agree on average with observed probabilities. A chi-square test (on 2 degrees of freedom) for “unreliability” (null hypothesis: intercept = 0 and slope = 1) was also performed.
      • Harrell F.E.
      Modeling strategies with applications to linear models, logistic regression and survival analysis.
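      The calibration intercept, slope, and unreliability test described above can be sketched as follows. This is an illustrative Python version, not the authors' R code: it fits the two-parameter logistic regression by Newton-Raphson and, as an assumption, takes the unreliability statistic to be the likelihood ratio between the fitted line and the perfectly calibrated line (intercept 0, slope 1), referred to a chi-square distribution on 2 degrees of freedom, whose upper tail probability is exp(-x/2). The function names and toy data are hypothetical.

```python
import math

def _logit(p):
    return math.log(p / (1 - p))

def _log_lik(y, x, a, b):
    """Bernoulli log-likelihood with linear predictor a + b * x."""
    total = 0.0
    for yi, xi in zip(y, x):
        eta = a + b * xi
        total += yi * eta - math.log1p(math.exp(eta))
    return total

def calibration_fit(outcomes, predictions, max_iter=50, tol=1e-12):
    """Return (intercept, slope, unreliability p-value)."""
    x = [_logit(p) for p in predictions]
    a, b = 0.0, 1.0  # start from perfect calibration
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for yi, xi in zip(outcomes, x):
            mu = 1 / (1 + math.exp(-(a + b * xi)))
            w = mu * (1 - mu)
            g0 += yi - mu              # score for the intercept
            g1 += (yi - mu) * xi       # score for the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) + abs(db) < tol:
            break
    # Likelihood ratio against intercept = 0, slope = 1 (2 degrees of freedom).
    lr = 2 * (_log_lik(outcomes, x, a, b) - _log_lik(outcomes, x, 0.0, 1.0))
    p_value = math.exp(-max(lr, 0.0) / 2)
    return a, b, p_value

# Hypothetical toy data: three risk strata of 10 patients each,
# with predictions that are too extreme relative to the outcomes.
outcomes = [1] * 2 + [0] * 8 + [1] * 5 + [0] * 5 + [1] * 8 + [0] * 2
predictions = [1 / 17] * 10 + [0.5] * 10 + [16 / 17] * 10
intercept, slope, p = calibration_fit(outcomes, predictions)
print(intercept, slope, p)
```

      In this toy example the predictions are too extreme, so the fitted slope falls below 1; a slope above 1 would indicate predictions that are not extreme enough.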
      All p-values <.05 were considered significant. All statistical analyses were performed using R version 3.0.2.
      • R Core Team
      R: a language and environment for statistical computing.

      Results

      Patient characteristics

      Data were available for 1,142 procedures performed by 60 surgeons at 17 different hospitals. Sixteen procedures were excluded as they related to either thoracic or isolated iliac aneurysm repairs, with two further records excluded as there was evidence that the procedure was nonelective. This resulted in a final cohort for analysis of 1,124 elective AAA repairs. The mean age of the population was 74.4 years (SD 7.7); 193 (17.2%) patients were women. The majority of patients (759, 67.5%) underwent endovascular aneurysm repair (EVAR), and most patients (1,037, 92.3%) were asymptomatic. Additional patient characteristics are given in Table 2.
      Table 2. Characteristics of the study population.
      Risk factor | Frequency (%) | Missing data (%)
      Age (years) [a] | 74.4 (7.7) | 0.2
      Female sex | 193 (17.2) | 0.0
      AAA diameter (cm) [a] | 6.3 (1.2) | 6.0
      Previous aortic surgery/stent | 67 (6.0) | 1.3
      AAA symptoms | 87 (7.7) | 6.1
      Ischaemic heart disease | 357 (31.8) | 8.5
      Previous myocardial infarction | 163 (14.5) | 8.5
      Cardiac failure | 27 (2.4) | 9.2
      Respiratory disease | 199 (17.7) | 14.5
      Diabetes | 162 (14.4) | 3.2
      Antiplatelet medication | 716 (63.7) | 0.5
      Antihypertensive medication | 371 (33.0) | 0.5
      Statin therapy | 778 (69.2) | 0.4
      Smoking status |  | 12.2
       Ex-smoker | 298 (26.5) |
       Current smoker | 246 (21.9) |
      Abnormal ECG | 359 (31.9) | 5.7
      Abnormal sodium | 105 (9.3) | 11.0
      Abnormal potassium | 36 (3.2) | 10.4
      Abnormal urea | 308 (27.4) | 10.4
      Creatinine >120 μmol/L | 186 (16.5) | 10.1
      Creatinine >200 μmol/L | 21 (1.9) | 10.1
      Abnormal WCC | 82 (7.3) | 9.0
      Abnormal haemoglobin | 296 (26.3) | 9.0
      ASA grade |  | 11.4
       I | 56 (5.0) |
       II | 434 (38.6) |
       III | 604 (53.7) |
       IV | 30 (2.7) |
      Open repair | 365 (32.5) | 0.0
      Note. AAA = abdominal aortic aneurysm; ECG = electrocardiogram; WCC = white cell count; ASA = American Society of Anesthesiologists.
      [a] Continuous data displayed as mean (SD).

      Overall model calibration

      There were 32 in-hospital deaths in the study cohort, giving an overall mortality of 2.8%. The Medicare predicted mortality was 2.8%, which was not significantly different from the observed mortality, giving an O:E ratio of 1.02 (95% CI 0.70–1.44, p = .858). The VGNW predicted mortality was 3.2%, giving an O:E ratio of 0.89 (95% CI 0.61–1.25, p = .560), and the BAR score predicted mortality was 2.4%, giving an O:E ratio of 1.19 (95% CI 0.81–1.68, p = .334). Calibration plots for the three models are shown in Fig. 1. The calibration intercepts and slopes also demonstrated good calibration for all three models, with each unreliability statistic being non-significant (Medicare p = .332, VGNW p = .756, BAR p = .581). Model performance in the low-, medium-, and high-risk groups, defined based on the probabilities calculated by each model, is shown in Tables 3, 4, and 5.
      Figure 1. Calibration plots for low-, medium-, and high-risk groups for the British Aneurysm Repair (BAR), Medicare, and Vascular Governance Northwest (VGNW) risk models. The black dashed line is the line of equality that represents perfect calibration. Vertical lines represent approximate 95% binomial confidence intervals of the observed mortality proportion.
      Table 3. Calibration of the Medicare model for low-, medium-, and high-risk groups; groups were defined according to the Medicare model predicted mortality.
       | Low risk | Medium risk | High risk
      Predicted mortality range, % | 0 < x ≤ 2.084 | 2.084 < x ≤ 3.230 | 3.230 < x ≤ 32.519
      Number of patients, n | 611 | 247 | 266
      EVAR, n | 534 | 120 | 105
      Open AAA repair, n | 77 | 127 | 161
      Observed mortality rate, % | 1.1 | 2.4 | 7.1
      Expected mortality rate, % | 1.3 | 2.8 | 6.1
      O:E ratio (95% CI) | 0.85 (0.34–1.76) | 0.88 (0.32–1.91) | 1.16 (0.70–1.82)
      p | 0.861 | 1.000 | 0.457
      Note. EVAR = endovascular aneurysm repair; AAA = abdominal aortic aneurysm; O:E = observed to expected; CI = confidence interval.
      Table 4. Calibration of the Vascular Governance Northwest (VGNW) score for low-, medium-, and high-risk groups; groups were defined according to the VGNW score predicted mortality.
       | Low risk | Medium risk | High risk
      Predicted mortality range, % | 0 < x ≤ 2.089 | 2.089 < x ≤ 3.817 | 3.817 < x ≤ 62.330
      Number of patients, n | 562 | 281 | 281
      EVAR, n | 487 | 184 | 88
      Open repair, n | 75 | 97 | 193
      Observed mortality rate, % | 1.1 | 1.8 | 7.5
      Expected mortality rate, % | 1.2 | 2.8 | 7.6
      O:E ratio (95% CI) | 0.88 (0.32–1.92) | 0.63 (0.20–1.46) | 0.99 (0.61–1.51)
      p | 1.000 | 0.375 | 1.000
      Note. EVAR = endovascular aneurysm repair; O:E = observed to expected; CI = confidence interval.
      Table 5. Calibration of the British Aneurysm Repair (BAR) score for low-, medium-, and high-risk groups; groups were defined according to the BAR score predicted mortality.
       | Low risk | Medium risk | High risk
      Predicted mortality range, % | 0 < x ≤ 1.245 | 1.245 < x ≤ 2.540 | 2.540 < x ≤ 44.335
      Number of patients, n | 562 | 281 | 281
      EVAR, n | 496 | 196 | 67
      Open repair, n | 66 | 85 | 214
      Observed mortality rate, % | 0.4 | 1.8 | 8.9
      Expected mortality rate, % | 0.7 | 1.8 | 6.4
      O:E ratio (95% CI) | 0.52 (0.06–1.88) | 1.00 (0.32–2.33) | 1.38 (0.89–2.04)
      p | 0.603 | 1.000 | 0.124
      Note. EVAR = endovascular aneurysm repair; O:E = observed to expected; CI = confidence interval.
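      The risk-group summaries in the three calibration tables could be produced along the following lines. This is a hedged sketch in Python, not the authors' R code. It assumes the groups are cut at the median and upper quartile of each model's predicted mortality, with values equal to a cut point assigned to the lower group, in line with the "x ≤" notation of the predicted mortality ranges. The function name, the choice of the inclusive quantile method, and the toy data are assumptions.

```python
from statistics import quantiles

def risk_group_summary(outcomes, predictions):
    """Observed vs expected mortality in low, medium and high-risk groups."""
    # Median and upper quartile of predicted risk split the cohort into
    # bottom 50%, middle 25% and top 25% (subject to tied predictions).
    _, median_cut, upper_cut = quantiles(predictions, n=4, method="inclusive")
    groups = {"low": [], "medium": [], "high": []}
    for y, p in zip(outcomes, predictions):
        if p <= median_cut:
            groups["low"].append((y, p))
        elif p <= upper_cut:
            groups["medium"].append((y, p))
        else:
            groups["high"].append((y, p))
    summary = {}
    for name, rows in groups.items():
        n = len(rows)
        deaths = sum(y for y, _ in rows)
        expected = sum(p for _, p in rows)  # expected number of deaths
        summary[name] = {
            "n": n,
            "observed_rate": deaths / n if n else None,
            "expected_rate": expected / n if n else None,
            "oe_ratio": deaths / expected if expected > 0 else None,
        }
    return summary

# Hypothetical toy cohort of eight patients, not study data.
toy_outcomes = [0, 0, 0, 0, 0, 1, 0, 1]
toy_predictions = [0.01, 0.01, 0.01, 0.01, 0.02, 0.02, 0.05, 0.10]
for group, stats in risk_group_summary(toy_outcomes, toy_predictions).items():
    print(group, stats)
```

      Because tied predictions all fall on the same side of a cut point, a model with few distinct predicted values can yield groups that are not exactly 50%, 25%, and 25% of the cohort; this may explain why the Medicare groups are not of equal size to those of the other two models, although the paper does not say so.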

      Overall model discrimination

      The BAR score demonstrated excellent discrimination in the overall cohort, with an AUC of 0.83 (95% CI 0.76–0.89). The discriminative ability of the Medicare and VGNW models was acceptable, with AUCs of 0.78 (95% CI 0.70–0.86) and 0.75 (95% CI 0.65–0.84) respectively. The ROC curves for the models in the overall cohort are shown in Fig. 2.
      Figure 2. Receiver operating characteristic curves for the British Aneurysm Repair (BAR), Medicare, and Vascular Governance Northwest (VGNW) risk models in the overall cohort. The grey line represents the line of equality.

      Model performance in separate open AAA repair and EVAR groups

      The in-hospital mortality rates for open AAA repair and EVAR were 6.8% and 0.9% respectively. The predicted mortality in the open AAA repair group was 4.4% (O:E 1.56 [95% CI 1.01–2.30], p = .033), 5.4% (O:E 1.26 [95% CI 0.81–1.86], p = .260), and 4.9% (O:E 1.41 [95% CI 0.91–2.08], p = .095) for the Medicare, VGNW, and BAR scores respectively. In the EVAR group, the predicted mortality was 2.0% (O:E 0.46 [95% CI 0.18–0.94], p = .029), 2.1% (O:E 0.43 [95% CI 0.17–0.89], p = .017), and 1.2% (O:E 0.76 [95% CI 0.31–1.57], p = .619) for the Medicare, VGNW, and BAR scores respectively.
      In the open AAA repair subgroup, only the BAR score demonstrated acceptable discrimination, with an AUC of 0.70 (95% CI 0.61–0.78). Both the Medicare and VGNW models demonstrated unacceptable discrimination, with AUCs of 0.68 (95% CI 0.58–0.78) and 0.64 (95% CI 0.53–0.75) respectively. In the EVAR subgroup, again only the BAR score demonstrated acceptable discrimination, with an AUC of 0.75 (95% CI 0.55–0.95). Both the Medicare and VGNW models demonstrated unacceptable discrimination, with AUCs of 0.66 (95% CI 0.47–0.85) and 0.56 (95% CI 0.31–0.81) respectively.

      Model performance by sex

      The in-hospital mortality for men was 2.8% and for women was 3.1%. The predicted mortality for men was 2.4% (O:E 1.14 [95% CI 0.75–1.67], p = .462), 2.6% (O:E 1.08 [95% CI 0.71–1.59], p = .682), and 1.9% (O:E 1.47 [95% CI 0.96–2.16], p = .055) for the Medicare, VGNW, and BAR scores respectively. For women, the predicted mortality was 4.5% (O:E 0.70 [95% CI 0.26–1.52], p = .494), 6.2% (O:E 0.50 [95% CI 0.18–1.09], p = .109), and 4.8% (O:E 0.64 [95% CI 0.24–1.40], p = .408) for the Medicare, VGNW, and BAR scores respectively. For men, the BAR score demonstrated excellent discriminatory ability, with an AUC of 0.85 (95% CI 0.78–0.92). Both the Medicare and VGNW models demonstrated acceptable discrimination, with AUCs of 0.78 (95% CI 0.69–0.86) and 0.76 (95% CI 0.65–0.87) respectively. For women, the Medicare model demonstrated excellent discriminatory ability, with an AUC of 0.88 (95% CI 0.76–0.99). The BAR score and VGNW model both demonstrated acceptable discrimination, with AUCs of 0.79 (95% CI 0.67–0.92) and 0.76 (95% CI 0.53–0.99) respectively.

      Discussion

      We evaluated three contemporary risk prediction models for in-hospital mortality following elective AAA repair, and demonstrated that all three models are potentially useful. The BAR score, developed using data from the UK National Vascular Database, demonstrated excellent discriminatory ability overall and retained acceptable discriminatory ability in separate open AAA repair and EVAR cohorts. Although the Medicare and VGNW models demonstrated acceptable discrimination overall for elective AAA repair, discrimination was inadequate for open AAA repair and EVAR separately. All three models demonstrated good calibration in the overall cohort. However, only the BAR score demonstrated acceptable calibration in both procedural subgroups.
      As with any registry study, data quality and completeness are inevitable limitations; however, no variable used was missing in >15% of cases. All required variables were available as defined for the calculation of the BAR score and VGNW model; however, for the Medicare model, some risk factors had to be inferred from related data as the risk factors as defined were not collected by the VGNW programme for the study period. A history of vascular disease was presumed for patients identified as taking preoperative antiplatelet medication. This assumption is likely to have led to an overestimation in risk for the Medicare model, as only ∼31% of patients had vascular disease in the Medicare cohort compared with the assumed proportion in this study of 61.7%.
      • Giles K.A.
      • Schermerhorn M.L.
      • O'Malley A.J.
      • Cotterill P.
      • Jhaveri A.
      • Pomposelli F.B.
      • et al.
      Risk prediction for perioperative mortality of endovascular vs open repair of abdominal aortic aneurysms using the Medicare population.
      Chronic renal disease was assumed for any patient with a creatinine >120 μmol/L and end-stage renal disease assumed for any patient with a creatinine >200 μmol/L. This assumption led to similar proportions of patients being classified as having renal disease in this study as in the cohort used for development of the Medicare model.
      This study was based on prospectively collected, validated clinical audit data from multiple centres throughout the north-west of England and Wales. A comprehensive assessment of each model's performance with regard to discrimination, calibration, and clinical validity has been performed. Good model discrimination is important, as this tends to remain stable over time, and poor discriminatory performance is difficult to correct.
      • Hickey G.L.
      • Grant S.W.
      • Murphy G.J.
      • Bhabra M.
      • Pagano D.
      • McAllister K.
      • et al.
      Dynamic trends in cardiac surgery: why the logistic EuroSCORE is no longer suitable for contemporary cardiac surgery and implications for future risk models.
      • Nashef S.A.M.
      • Roques F.
      • Sharples L.D.
      • Nilsson J.
      • Smith C.
      • Goldstone A.R.
      • et al.
      EuroSCORE II.
      Adequate calibration of a model is, perhaps, more essential for clinical decision-making and risk adjustment of surgical outcome data.
      • Grant S.W.
      • Grayson A.D.
      • Jackson M.
      • Au J.
      • Fabri B.M.
      • Grotte G.
      • et al.
      Does the choice of risk-adjustment model influence the outcome of surgeon-specific mortality analysis? A retrospective analysis of 14 637 patients under 31 surgeons.
      Unlike poor discrimination, if a model is poorly calibrated this can potentially be addressed using a variety of methods.
      • Moons K.G.M.
      • Kengne A.P.
      • Grobbee D.E.
      • Royston P.
      • Vergouwe Y.
      • Altman D.G.
      • et al.
      Risk prediction models: II. External validation, model updating, and impact assessment.
      Although good calibration was demonstrated overall, both the Medicare and VGNW models demonstrated over-prediction of risk in patients undergoing EVAR, and the Medicare model also under-predicted risk for open AAA repair.
      A trend towards under-prediction of risk in high-risk patients was demonstrated by the Medicare model and BAR score, while all three models tended to under-predict risk in men and over-predict risk in women. However, a limitation of this study is that the relatively small numbers of outcomes, particularly for the subgroup analyses, may have led to substantial differences in model performance not being detected.
      • Vergouwe Y.
      • Steyerberg E.W.
      • Eijkemans M.J.C.
      • Habbema J.D.F.
      Substantial effective sample sizes were required for external validation studies of predictive logistic regression models.
      It is necessary that these models be tested on larger datasets in order to confirm their performance in specific subgroups. The number of outcomes available also meant it was not possible to perform an overall Hosmer–Lemeshow test.
      • Hosmer D.W.
      • Lemeshow S.
      Applied logistic regression.
      Clinical validity of the models is suggested by the risk factors that are included in all three models, including open repair, age, female sex, and renal disease. Cardiac disease is included in both the BAR score and Medicare model, and vascular disease (identified in the VGNW model by use of antiplatelet medication) is present in the Medicare and VGNW models. Diabetes and respiratory disease are exclusive to the VGNW model, while previous aortic surgery or stent, abnormal white cell count, abnormal sodium, AAA diameter (cm), and American Society of Anesthesiologists grade (I–IV) are exclusive to the BAR score. Potential reasons for differences in risk factors between the models include the size of the dataset available for model development, the availability of each risk factor, differences in the time interval of data collection, difference in the populations studied between development datasets and differences in risk factor definitions.
      In addition to differences in the risk factors included in the models, there are also differences in the outcomes the models were designed to predict. The BAR score was developed to predict in-hospital mortality, whereas the VGNW model was designed to predict 30-day mortality. The Medicare model was developed to predict either in-hospital mortality or 30-day mortality. These differences in outcome definitions are unlikely to affect model performance significantly, as in this study no deaths occurred after discharge but within 30 days of the procedure. Clearly, perioperative mortality is an important outcome for patients, surgeons, and healthcare providers, and is also the outcome currently used for publication of elective AAA repair outcome data. However, other important outcomes, such as re-intervention, long-term survival, and aneurysm-related mortality, are key indicators of quality and may be used in the publication of surgical outcome data going forward.
      Both the VGNW and Medicare models have previously been validated using multicentre and single-centre data and found to perform acceptably with regard to discrimination and calibration.
      • Grant S.W.
      • Grayson A.D.
      • Mitchell D.C.
      • McCollum C.N.
      Evaluation of five risk prediction models for elective abdominal aortic aneurysm repair using the UK National Vascular Database.
      • Choke E.
      • Lee K.
      • McCarthy M.
      • Nasim A.
      • Naylor A.R.
      • Bown M.
      • et al.
      Risk models for mortality following elective open and endovascular abdominal aortic aneurysm repair: a single institution experience.
      • van Beek S.C.
      • Blankensteijn J.D.
      • Balm R.
      Validation of three models predicting in-hospital death in patients with an abdominal aortic aneurysm eligible for both endovascular and open repair.
      This study represents the second successful external validation of the BAR score, with the previous validation performed using data from a randomised clinical trial.
      • van Beek S.C.
      • Blankensteijn J.D.
      • Balm R.
      Validation of three models predicting in-hospital death in patients with an abdominal aortic aneurysm eligible for both endovascular and open repair.
      All the models studied have been developed within the last 5 years; however, the development datasets cover significantly different time periods, which may account for differences in model performance. The datasets used to develop the VGNW model and Medicare model are relatively out of date with respect to modern vascular surgical practice represented in this cohort. With regard to the type of data used for model development, the BAR score and VGNW model were built using clinical datasets, and the Medicare model was developed using administrative data. It has been suggested that using clinical datasets for risk model development is more appropriate than using administrative datasets (Shahian et al., 2011; Siregar et al., 2013).
      Unlike some older models for AAA repair (Samy et al., 1994; Tang et al., 2007; Prytherch et al., 2001), all three models were based on datasets that included exclusively AAA repairs (both open repairs and EVAR). Although the BAR score demonstrates acceptable performance for separate EVAR and open cohorts, it is possible that models designed specifically for either open repair or EVAR may have improved performance for these procedures separately. However, a combined model approach is potentially more suitable for clinical practice as it allows the easy calculation of the risk of both procedures and is more appropriate for risk-adjustment purposes.
      Although all three models validated here are potentially useful, in our opinion the BAR score is the most appropriate for informing clinical decision-making and risk-adjusting surgical outcome data in elective AAA repair. Although, overall, a statistically significant superior performance of the BAR score has not been demonstrated, it was the only model to retain acceptable discrimination and calibration in both EVAR and open repair subgroups. The BAR score is also based on the most contemporary data, and can be easily accessed and calculated in <30 seconds using either a website (www.britishaneurysmrepairscore.com) or an App (via the Apple App Store or Google Play).
      When using a model in clinical decision-making it is important that the clinician should (i) understand that the risk prediction model does not predict the outcome for an individual patient, but provides an estimate of the risk for a population of patients with similar characteristics undergoing the same procedure; (ii) understand any potential limitations of the model; (iii) know how their own performance may influence the prediction; and (iv) adjust the prediction based on important risk factors not captured in the model. As risk prediction models tend to lose calibration over time (Hickey, Grant, Murphy et al., 2013), it is vital that models are validated every 2–3 years and recalibrated. This approach allows new risk factors to be incorporated into updated models to potentially further improve model performance. Dynamic modelling is an alternative approach that could be adopted to address calibration, but would require improved data collection and informatics infrastructure (Hickey, Grant, Caiado et al., 2013).
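      To make the idea of periodic recalibration concrete, the following is a minimal sketch of one common and simple updating strategy, intercept-only recalibration (sometimes called "calibration-in-the-large" updating). It is offered as an illustration under stated assumptions and is not the method used in this study, which does not specify a recalibration procedure. The function names, the bisection solver, and the ten-patient dataset are all hypothetical; the predicted risks and outcomes are synthetic and have no clinical meaning.

```python
import math


def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def expit(x):
    """Inverse of logit: maps log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def observed_expected_ratio(outcomes, predicted):
    """Observed deaths divided by the sum of predicted risks."""
    return sum(outcomes) / sum(predicted)


def intercept_shift(outcomes, predicted, tol=1e-10):
    """Find the constant added to every patient's log-odds so that the
    expected number of deaths equals the observed number.

    Assumes at least one death and at least one survivor, and that all
    predicted risks lie strictly between 0 and 1. Solved by bisection,
    because the expected count increases monotonically with the shift.
    """
    observed = sum(outcomes)
    lo, hi = -10.0, 10.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        expected = sum(expit(logit(p) + mid) for p in predicted)
        if expected < observed:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def recalibrate(predicted, shift):
    """Apply the intercept shift to each predicted risk."""
    return [expit(logit(p) + shift) for p in predicted]


# Synthetic example (hypothetical values): the model expects 2 deaths
# in total, but only 1 is observed, so it over-predicts risk.
predicted = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.10, 0.10]
outcomes = [0, 0, 0, 0, 0, 0, 0, 1, 0, 0]

shift = intercept_shift(outcomes, predicted)
updated = recalibrate(predicted, shift)

print("O:E before:", observed_expected_ratio(outcomes, predicted))
print("Intercept shift (log-odds):", shift)
print("O:E after:", observed_expected_ratio(outcomes, updated))
```

      Shifting only the intercept leaves the ranking of patients, and therefore discrimination (AUC), unchanged. A fuller update would also re-estimate the calibration slope or the individual coefficients, which requires a larger validation sample.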

      Funding

      This project was partly funded by the National Institute for Health Research Health Technology Assessment (NIHR HTA) programme (project no. 09/91/39) and will be published in full in Health Technology Assessment. The views and opinions expressed herein are those of the authors and do not necessarily reflect those of the HTA programme, NIHR, National Health Service, or Department of Health. The Vascular Governance Northwest programme is funded by the Supra District Audit Fund, via Bury Primary Care Trust.

      Conflict of Interest

      The authors declare no financial conflicts of interest. S.W.G. and C.N.M. were part of the team that developed the Vascular Governance Northwest and British Aneurysm Repair (BAR) scores. G.L.H. was part of the team that developed the BAR score.

      Acknowledgements

      We would like to acknowledge the VGNW team, Megan Jones and Anne Worthington for their assistance with data collection. The surgeons who contributed data to this study are A. Blair, R. Chandrasekar, C. Chan, and L. Williams (Arrowe Park Hospital); S. Dimitri, M. Hamish, and P. Edwards (Countess of Chester); O. Klimach (Glan Clwyd Hospital); J. Smyth, F. Serracino-Inglott, and D. Murray (Manchester Royal Infirmary); J. Mosley and M. Jameel (Royal Albert Edward Infirmary); S. Hardy, R. Salaman, H. Al-Khaffaf, and A. Rahi (Royal Blackburn Infirmary); G. Ferguson and M. Onwudike (Royal Bolton Hospital); J. Abraham, P. Wilson, M. Bukhari, J. Calvey, and M. Tomlinson (Royal Lancaster Infirmary); J. Brennan, V. Rao, R. Fisher, J. Joseph, J. Naik, F. Torella, and J. Smout (Royal Liverpool University Hospital); T. Oshodi, M. Hadfield, R. Ibrahim, G. Williams, M. Madan, N. Allaf, R. Shabazi, and V. Perricone (Royal Oldham Hospital); G. Thomson, A. Egun, and S. Drinkwater (Royal Preston Hospital); D. Jones and F. Mason (Southport District General Hospital); C. Pratap and L. Wolowczyk (Tameside District General Hospital); C. McCollum, M. Baguneid, M. Welch, J. Ghosh, and S. Richardson (University Hospital of South Manchester); P. Moody, T. Nicholas, P. Wake, D. Olojugba, and N. Teo (Warrington District General Hospital); U. Kirkpatrick and A. da Silva (Wrexham Maelor Hospital); V. Perricone and H. Osman (Blackpool Victoria Hospital).

      References

        • Waton S.
        • Johal A.
        • Groene O.
        • Cromwell D.
        • Mitchell D.
        • Loftus I.
        National Vascular Registry 2013 report on surgical outcomes consultant-level statistics.
        National Vascular Registry, London, 2013
        • Patterson B.O.
        • Holt P.J.E.
        • Hinchliffe R.
        • Loftus I.M.
        • Thompson M.M.
        Predicting risk in elective abdominal aortic aneurysm repair: a systematic review of current evidence.
        Eur J Vasc Endovasc Surg. 2008; 36: 637-645
        • Roques F.
        • Michel P.
        • Goldstone A.R.
        • Nashef S.A.M.
        The logistic EuroSCORE.
        Eur Heart J. 2003; 24: 881-882
        • Grant S.W.
        • Hickey G.L.
        • Grayson A.D.
        • Mitchell D.C.
        • McCollum C.N.
        National risk prediction model for elective abdominal aortic aneurysm repair.
        Br J Surg. 2013; 100: 645-653
        • Giles K.A.
        • Schermerhorn M.L.
        • O'Malley A.J.
        • Cotterill P.
        • Jhaveri A.
        • Pomposelli F.B.
        • et al.
        Risk prediction for perioperative mortality of endovascular vs open repair of abdominal aortic aneurysms using the Medicare population.
        J Vasc Surg. 2009; 50: 256-262
        • Grant S.W.
        • Grayson A.D.
        • Purkayastha D.
        • Wilson S.D.
        • McCollum C.
        • on behalf of the participants in the Vascular Governance North West P
        Logistic risk model for mortality following elective abdominal aortic aneurysm repair.
        Br J Surg. 2011; 98: 652-658
        • Grant S.W.
        • Grayson A.D.
        • Mitchell D.C.
        • McCollum C.N.
        Evaluation of five risk prediction models for elective abdominal aortic aneurysm repair using the UK National Vascular Database.
        Br J Surg. 2012; 99: 673-679
        • Choke E.
        • Lee K.
        • McCarthy M.
        • Nasim A.
        • Naylor A.R.
        • Bown M.
        • et al.
        Risk models for mortality following elective open and endovascular abdominal aortic aneurysm repair: a single institution experience.
        Eur J Vasc Endovasc Surg. 2012; 44: 549-554
        • van Beek S.C.
        • Blankensteijn J.D.
        • Balm R.
        Validation of three models predicting in-hospital death in patients with an abdominal aortic aneurysm eligible for both endovascular and open repair.
        J Vasc Surg. 2013; 58: 1452-1457.e1
        • Hickey G.L.
        • Grant S.W.
        • Murphy G.J.
        • Bhabra M.
        • Pagano D.
        • McAllister K.
        • et al.
        Dynamic trends in cardiac surgery: why the logistic EuroSCORE is no longer suitable for contemporary cardiac surgery and implications for future risk models.
        Eur J Cardiothorac Surg. 2013; 43: 1146-1152
        • Hosmer D.W.
        • Lemeshow S.
        Applied logistic regression.
        Wiley, New York, 2000
        • Harrell F.E.
        Modeling strategies with applications to linear models, logistic regression and survival analysis.
        Springer-Verlag, New York, 2001
        • R Core Team
        R: a language and environment for statistical computing.
        R Foundation for Statistical Computing, Vienna, 2013
        • Nashef S.A.M.
        • Roques F.
        • Sharples L.D.
        • Nilsson J.
        • Smith C.
        • Goldstone A.R.
        • et al.
        EuroSCORE II.
        Eur J Cardiothorac Surg. 2012; 41: 734-745
        • Grant S.W.
        • Grayson A.D.
        • Jackson M.
        • Au J.
        • Fabri B.M.
        • Grotte G.
        • et al.
        Does the choice of risk-adjustment model influence the outcome of surgeon-specific mortality analysis? A retrospective analysis of 14 637 patients under 31 surgeons.
        Heart. 2008; 94: 1044-1049
        • Moons K.G.M.
        • Kengne A.P.
        • Grobbee D.E.
        • Royston P.
        • Vergouwe Y.
        • Altman D.G.
        • et al.
        Risk prediction models: II. External validation, model updating, and impact assessment.
        Heart. 2012; 98: 691-698
        • Vergouwe Y.
        • Steyerberg E.W.
        • Eijkemans M.J.C.
        • Habbema J.D.F.
        Substantial effective sample sizes were required for external validation studies of predictive logistic regression models.
        J Clin Epidemiol. 2005; 58: 475-483
        • Shahian D.M.
        • Edwards F.H.
        • Jacobs J.P.
        • Prager R.L.
        • Normand S.L.T.
        • Shewan C.M.
        • et al.
        Public reporting of cardiac surgery performance: part 2 – implementation.
        Ann Thorac Surg. 2011; 92: S12-S23
        • Siregar S.
        • Pouw M.E.
        • Moons K.G.M.
        • Versteegh M.I.M.
        • Bots M.L.
        • van der Graaf Y.
        • et al.
        The Dutch Hospital Standardised Mortality Ratio (HSMR) method and cardiac surgery: benchmarking in a national cohort using hospital administration data versus a clinical database.
        Heart. 2013; 100: 702-710
        • Samy A.K.
        • Murray G.
        • MacBain G.
        Glasgow aneurysm score.
        Cardiovasc Surg. 1994; 2: 41-44
        • Tang T.
        • Walsh S.R.
        • Prytherch D.R.
        • Lees T.
        • Varty K.
        • Boyle J.R.
        VBHOM, a data economic model for predicting the outcome after open abdominal aortic aneurysm surgery.
        Br J Surg. 2007; 94: 717-721
        • Prytherch D.R.
        • Ridler B.M.F.
        • Beard J.D.
        • Earnshaw J.J.
        A model for national outcome audit in vascular surgery.
        Eur J Vasc Endovasc Surg. 2001; 21: 477-483
        • Hickey G.L.
        • Grant S.W.
        • Caiado C.
        • Kendall S.
        • Dunning J.
        • Poullis M.
        • et al.
        Dynamic prediction modeling approaches for cardiac surgery.
        Circ Cardiovasc Qual Outcomes. 2013; 6: 649-658
