Personalised Outcomes Forecasts of Supervised Exercise Therapy in Intermittent Claudication: An Application of Neighbours Based Prediction Methods with Routinely Collected Clinical Data

Objective: Personalised outcomes forecasts provide insight into individual prognosis, which has the potential to improve and personalise care. The aim was to develop and evaluate personalised outcomes forecasts for functional claudication distance over six months of supervised exercise therapy for patients with intermittent claudication. Methods: Data of 5 940 patients were eligible for analysis. Neighbours based predictions were generated via an adaptation of predictive mean matching. Data from the nearest 223 matches (aka neighbours) for an index patient were modelled via a Generalised Additive Model for Location Scale and Shape (GAMLSS). The realised outcome measures were then evaluated against the GAMLSS model, and the average bias, coverage, and precision were calculated. Model calibration was analysed via within sample and out of sample analyses. Results: Neighbours based predictions demonstrated small average bias (-0.04 standard deviations; ideal = 0) and accurate average coverage (48.7% of realised data within the 50% prediction interval; ideal = 50%). Moreover, neighbours based predictions improved prediction precision by 24%, compared with estimates derived from the whole sample. Both within sample and out of sample testing showed predictions to be well calibrated. Conclusion: Neighbours based prediction is a method for generating accurate personalised outcomes forecasts for patients with intermittent claudication undertaking supervised exercise therapy. Future work should examine the influence of personalised outcomes forecasts on clinical decisions and patient outcomes.


INTRODUCTION
Intermittent claudication (IC) is the most common symptom of peripheral arterial disease, caused by atherosclerotic narrowing in the lower extremity arteries. 1,2 Patients with IC typically experience discomfort and pain in the legs and buttocks during exercise, which rapidly disappears after a brief rest. The recommended first choice therapy for patients with IC is supervised exercise therapy. 1,2 Although supervised exercise therapy is known to be effective in relieving symptoms, results vary greatly between patients. 3,4 Several patient related factors have been associated with the outcome of supervised exercise therapy, including patient reported function and baseline walking distance. 5–7 Gaining greater insights into individual prognosis may improve patient centred care and optimise treatment results by enabling patients and clinicians to better anticipate the course of exercise therapy. Visualising the prognosis may improve exercise adherence via behavioural science principles such as social norming. Additionally, an individual patient's prognosis can be used to benchmark progress in therapy, thus supporting personalisation of an exercise programme or other treatment decisions such as discharge from therapy. 8–11 However, prognostic work in this patient population has demonstrated limitations to date. Previous regression analyses have exhibited poor external validity, poor prediction accuracy, and limited potential for application in daily practice. 5,7 An alternative approach to prognostic modelling is to use a semi-parametric, "neighbours based" prediction methodology. 8,12,13 The central idea is to create individual prognostic profiles using historical outcomes data of patients similar to an index patient (aka the index patient's neighbours). The realised outcomes data of these similar patients, selected from a large database, are then used to generate the prediction. 8,12
This approach has potential advantages over commonly used parametric prediction approaches (e.g., mixed effects models); in particular, it enables flexible and realistic estimates, and the display of historical data may improve salience in practice. 14 This article aims to describe the development and evaluation of personalised outcomes forecasts for functional claudication distance over six months of supervised exercise therapy for patients with IC, using a neighbours based prediction method. It was hypothesised that the outcomes forecasts would demonstrate small average bias (< 0.1 standard deviations, on average), with improved precision over prognostic estimates derived from the full sample. Additionally, it was hypothesised that forecasts would be well calibrated via both within sample and out of sample analyses.

METHODS

Study design
This retrospective cohort study used data from the Chronic CareNet Quality system. 15 Chronic CareNet is a clinical network responsible for the delivery of standardised supervised exercise therapy for all patients with IC in The Netherlands. The Quality system database receives data from the National Register for Physical Therapy, an initiative by the Royal Dutch Society for Physical Therapy. 16 The pseudo-anonymised and non-identifiable data used fall outside the remit of the Medical Ethics Committee according to Dutch law. Patients and therapists provided informed consent to use their data for research purposes at initial collection. This study was reported according to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) guidelines. 17

Data source
Data were gathered in routine clinical practice, extracted from electronic health records of physical therapist practices aligned with Chronic CareNet in The Netherlands. Standardised exercise training and testing is assured through training of all therapists affiliated with Chronic CareNet. Variables included in the database were patient characteristics (e.g., age, sex, body mass index [BMI]), treatment processes (e.g., treatment duration, number of treatment sessions, achievement of treatment goal), patient reported outcome measures (such as quality of life and activity scores), and walking distances. 15 Patients' measurements were performed and documented by physical therapists every three months, according to the guideline recommendations. 18 For development and evaluation of the prediction model, data were extracted based on a therapy start date between 2015 and 2019. To limit data errors, patient records were removed from the database when they contained biologically implausible measurements or lacked either a baseline or at least one follow up measurement for functional claudication distance. The database was split temporally (based on date of evaluation) into a training (75%) and test (25%) dataset. The training dataset was used to tune the procedures for neighbours based predictions and examine model performance, and the test set was used to examine out of sample calibration.
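As a rough illustration of the temporal split described above, the following Python sketch orders records by evaluation date and assigns the earliest 75% to the training set. The analyses in this study were run in R; the function name, field names, and toy records below are hypothetical, and the text does not state how ties or repeated visits were handled, so this is one plausible reading rather than the procedure actually used.

```python
from datetime import date


def temporal_split(records, train_fraction=0.75):
    """Split records chronologically: the earliest share goes to training."""
    # Sort by evaluation date so that the test set is strictly "later" data.
    ordered = sorted(records, key=lambda r: r["evaluation_date"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Hypothetical toy records (not study data).
records = [
    {"patient_id": 1, "evaluation_date": date(2018, 5, 1)},
    {"patient_id": 2, "evaluation_date": date(2015, 3, 12)},
    {"patient_id": 3, "evaluation_date": date(2019, 1, 20)},
    {"patient_id": 4, "evaluation_date": date(2016, 9, 30)},
]

train, test = temporal_split(records)
print([r["patient_id"] for r in train])  # [2, 4, 1]
print([r["patient_id"] for r in test])   # [3]
```

Applied to 5 940 patients, a 75% cut of this kind gives 4 455 training records, which matches the training set size reported in the Limitations section.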

Outcome variable
Personalised outcomes forecasts were developed for functional claudication distance, defined as the distance walked when a patient would elect to stop walking because of IC induced pain. 5 Functional claudication distance was used as the outcome measurement since it is a reliable and valid measurement for determining functional capacity 19 and because it is the primary outcome measure recommended in the Dutch treatment guideline. 18 Functional claudication distance was measured by physical therapists as part of daily clinical practice using a standardised treadmill test (i.e., the Gardner-Skinner protocol), 20 with a speed of 3.2 km/h and an incline starting at 0% and increasing by 2% every two minutes.
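To make the treadmill protocol concrete, the short Python sketch below converts walking time into distance covered and current incline, using only the parameters stated above (constant 3.2 km/h, incline starting at 0% and rising by 2% every two minutes). The function and constant names are hypothetical, and the sketch assumes uninterrupted walking at constant speed.

```python
SPEED_KMH = 3.2          # treadmill speed stated in the protocol description
STAGE_MINUTES = 2.0      # the incline changes every two minutes
INCLINE_STEP_PCT = 2.0   # incline increase per stage, starting from 0%


def treadmill_state(minutes_walked):
    """Return (distance in metres, incline in %) after a given walking time."""
    if minutes_walked < 0:
        raise ValueError("minutes_walked must be non-negative")
    metres_per_minute = SPEED_KMH * 1000.0 / 60.0
    distance_m = metres_per_minute * minutes_walked
    # Number of completed two-minute stages determines the current incline.
    completed_stages = int(minutes_walked // STAGE_MINUTES)
    incline_pct = INCLINE_STEP_PCT * completed_stages
    return distance_m, incline_pct


distance, incline = treadmill_state(5.0)
print(round(distance), incline)  # 267 4.0
```

Under these assumptions, a patient who stops after five minutes has walked roughly 267 metres and was on the 4% stage.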

Matching characteristics
The neighbours based prediction approach uses patient characteristics to select neighbours (aka matches) from the existing database. Variables available for use as potential matching characteristics included (1) age, (2) BMI, (3) functional claudication distance at baseline, (4) maximum walking distance at baseline, (5) motivation score measured as phase of behaviour change, (6) pack years of smoking, (7) quality of life measured using the Vascular Quality of Life Questionnaire-6, 21 and (8) walking impairment measured using the Walking Impairment Questionnaire. 22 A more detailed description of these variables is available in the Supplementary material. Of these potential matching characteristics, a subset was selected for use in neighbours based predictions via procedures described in the following sections. The final set of matching characteristics was selected using backwards selection, which optimised the Akaike Information Criterion (i.e., the stepAIC function from the MASS package applied to lm models, R version 3.5.3).

Statistical analysis
All analyses were conducted using R version 3.5.3 (R Foundation). The steps to generate a neighbours based prediction by predictive mean matching are described in the following sections and are summarised in Table 1.
Model development: selection of matches by predictive mean matching. Because the source dataset contained functional claudication distance measurements at irregularly spaced time intervals, a functional claudication distance measurement was estimated for each patient at 180 days following the initial assessment, using a linear mixed effects model via the Brokenstick package (R statistical computing). 12,23 This timepoint was chosen since clinical follow up commonly occurs six months after the initiation of therapy, and prognostic estimates over this timeframe are therefore likely to have value for clinical decision making. The 180 day functional claudication distance estimate was used as the distal anchor for selecting matches by an adaptation of predictive mean matching. Multiple linear regression models were estimated with the 180 day functional claudication distance measurement (Brokenstick estimate) as the outcome variable and potential matching characteristics as explanatory variables. Of the available potential matching characteristics, only variables that contributed significantly (p < .050) to the prediction of 180 day functional claudication distance were retained for subsequent steps.
The predicted values from the linear model were the metric upon which the matches (aka neighbours) were selected. Briefly, an index patient's matching characteristics would be entered into the multiple linear regression model, and a predicted value would be obtained. The patient records in the database with similar predicted values would be extracted as the neighbours for use in subsequent steps. In preliminary analyses, the number of matches did not substantially influence the performance of the neighbours based prediction approach when less than 30% (~1 400 patients) of the dataset was used for matching (Supplementary material). However, when greater numbers of patients were used as matches, the average precision became substantially worse (i.e., greater uncertainty in prediction). Therefore, any given patient was matched to the nearest 5% of patients (matches = 223).
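The matching step described above can be sketched in Python as follows: a linear predictor is evaluated for the index patient, and the database records with the closest predicted values are returned as neighbours. This is a minimal sketch rather than the authors' R implementation. The intercept, the reduced set of predictors, the toy database, and all function names are hypothetical; only the two slopes (0.93 for baseline functional claudication distance and 15.5 for motivation) are taken from the Results.

```python
def linear_prediction(intercept, coefficients, characteristics):
    """Predicted 180 day distance from a fitted linear model."""
    return intercept + sum(
        coefficients[name] * characteristics[name] for name in coefficients
    )


def nearest_matches(index_prediction, database_predictions, n_matches):
    """Ids of the n_matches records closest to the index patient's prediction."""
    ranked = sorted(
        database_predictions.items(),
        key=lambda item: abs(item[1] - index_prediction),
    )
    return [patient_id for patient_id, _ in ranked[:n_matches]]


# Nearest 5% of a 4 455 patient training set.
N_MATCHES = round(0.05 * 4455)
print(N_MATCHES)  # 223

# Illustrative model: the intercept is an assumption, not a reported value.
intercept = 100.0
coefficients = {"baseline_fcd": 0.93, "motivation": 15.5}
index_patient = {"baseline_fcd": 400.0, "motivation": 2.0}
index_prediction = linear_prediction(intercept, coefficients, index_patient)

# Hypothetical predicted values (metres) for six database patients.
database_predictions = {
    "A": 410.0, "B": 655.0, "C": 520.0, "D": 980.0, "E": 495.0, "F": 300.0,
}
matches = nearest_matches(index_prediction, database_predictions, n_matches=3)
print(matches)  # ['E', 'C', 'A']
```

Matching on a single predicted value, rather than on each characteristic separately, means that characteristics with larger coefficients carry more weight in determining who counts as a neighbour.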
Flexible modelling of outcome data. For each patient in the training data, the realised functional claudication distance measurements from the patient's matches were used to fit a Generalised Additive Model for Location Scale and Shape (GAMLSS). 24 The GAMLSS approach was chosen for its flexibility in modelling the median (location), variance (scale) and skewness (shape) as smooth functions of time (i.e., time since initial evaluation). In particular, since functional claudication distance measurements were positively skewed, a modelling framework was chosen that accommodated changes in skewness over time. Cubic splines were fitted to each of the parameters; three degrees of freedom (df) were used for the location parameter and one df was used for each of the scale and shape parameters. Since the degrees of freedom could not be independently optimised for each patient in the training set, this approach was taken to limit the potential for overfitting. 25 This same modelling approach was also used on the full training set to create a prognostic estimate that included the full sample.
Model evaluation. The training dataset was used to improve the performance of the prediction methodology based on three metrics: (1) bias, (2) coverage, and (3) precision. These metrics were chosen to gain insight into multiple relevant aspects of prediction performance. Bias was operationalised as the average difference (on a z scale) between patients' predicted functional claudication distance measurements and the observed functional claudication distance measurements in the first six months following patients' evaluation appointments. By this approach, an average bias of zero would be ideal and deviations from zero would indicate systematic bias in the prediction approach. Coverage was operationalised as the percentage of observations within the 50% prediction interval (ideal = 50%). Deviations from the expected coverage would indicate limitations in modelling uncertainty. Precision was operationalised as the average width of the 50% prediction interval (narrower is better). These metrics were calculated by a leave one out cross validation approach, 26 wherein GAMLSS models were fit to existing data from the 223 closest matches to each of the patients in the training dataset. The realised data from each index patient were compared with the GAMLSS estimate to calculate bias and coverage, and the precision of the GAMLSS model was averaged over the first 180 days of supervised exercise therapy.
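The three metrics defined above can be illustrated with a small Python sketch. It assumes that, for each held-out observation, the fitted model supplies the 25th and 75th percentiles of the forecast (the bounds of the 50% prediction interval) and a z score for the observation. The field names, the sign convention of the z score, and the toy values are assumptions for illustration and are not study data.

```python
from statistics import mean


def evaluate_forecasts(cases):
    """Return (bias, coverage in %, precision) over held-out observations."""
    # Bias: mean z score of the observations under their forecasts (ideal 0).
    bias = mean([c["z"] for c in cases])
    # Coverage: share of observations inside the 50% interval (ideal 50%).
    inside = [c["q25"] <= c["observed"] <= c["q75"] for c in cases]
    coverage = 100.0 * sum(inside) / len(cases)
    # Precision: mean width of the 50% interval (narrower is better).
    precision = mean([c["q75"] - c["q25"] for c in cases])
    return bias, coverage, precision


# Hypothetical held-out observations (distances in metres).
cases = [
    {"observed": 500.0, "q25": 400.0, "q75": 700.0, "z": 0.1},
    {"observed": 300.0, "q25": 350.0, "q75": 650.0, "z": -0.9},
    {"observed": 820.0, "q25": 600.0, "q75": 900.0, "z": 0.4},
    {"observed": 1000.0, "q25": 500.0, "q75": 840.0, "z": 1.2},
]

bias, coverage, precision = evaluate_forecasts(cases)
print(round(bias, 2), coverage, precision)
```

In a leave one out setting, each case would come from a model fitted to the matches of one index patient with that patient's own data held out.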

RESULTS

Descriptive statistics
The final dataset for analysis contained 17 926 functional claudication distance measurements of 5 940 patients (Fig. 1). In total, 20 073 patient cases were excluded from the analysis, most commonly because of missing data in BMI, pack years, and functional claudication distance. Patient characteristics from training and test sets are shown in Table 2. Baseline functional claudication distance was significantly different between the training and test sets, but there were no significant differences in other variables.

Model development: selection of matches and number of matches
The following characteristics demonstrated a statistically significant relationship with the Brokenstick estimate of 180 day functional claudication distance: .001), motivation (b = 15.5, p = .002) and baseline functional claudication distance (b = 0.93; p < .001) (Fig. 2). Baseline functional claudication distance was the most important matching characteristic, carrying the most weight in predictive mean matching with a standardised beta coefficient of 0.54 standard deviation units (Fig. 2). Due to the high correlation between functional and maximum walking distance at baseline, maximum walking distance was left out of the final model. The predicted values from this multivariable linear regression were used as the matching metric and ranged from 220 metres to 2 522 metres (Fig. 2).

Model evaluation and calibration
With this approach, the average bias was found to be -0.04 standard deviations, the average coverage (proportion of realised observations within the 50% prediction interval) was found to be 48.7%, and the average precision (the average width of the 50% prediction interval) was found to be 313 metres. For comparison, the average precision of the GAMLSS model that included all patients in the training set (i.e., the full sample prognostic estimate) was 412 metres. Thus, the neighbours based prediction approach amounted to a 24% improvement in precision relative to a prognostic estimate derived from the full sample (Fig. 3). The predictions appeared well calibrated; the observed values fell within the standard error of the median of predicted values across all deciles, according to both within sample and out of sample analyses (Fig. 4).
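The 24% figure follows directly from the two interval widths reported above, and the decile comparison can be approximated by grouping patients on their predicted value. The Python sketch below shows both steps under stated assumptions: the function names are hypothetical, the decile grouping is one plausible reading of the calibration analysis, and the standard error of the median is not computed here.

```python
from statistics import median


def relative_improvement(reference_width, new_width):
    """Fractional reduction in interval width relative to a reference."""
    return (reference_width - new_width) / reference_width


def decile_medians(pairs):
    """Median predicted and observed value within each decile of prediction.

    pairs is a list of (predicted, observed) tuples; deciles are formed by
    sorting on the predicted value and cutting into ten near-equal groups.
    """
    ordered = sorted(pairs)
    n = len(ordered)
    groups = [ordered[i * n // 10:(i + 1) * n // 10] for i in range(10)]
    return [
        (median([p for p, _ in g]), median([o for _, o in g]))
        for g in groups
        if g
    ]


# Widths of the 50% prediction interval reported in the text (metres).
print(round(100 * relative_improvement(412, 313)))  # 24

# Hypothetical (predicted, observed) pairs, not study data.
toy_pairs = [(10 * i, 10 * i + 5) for i in range(1, 21)]
for predicted_median, observed_median in decile_medians(toy_pairs):
    print(predicted_median, observed_median)
```

A well calibrated forecast would show observed medians close to predicted medians in every decile, not only on average.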

DISCUSSION
Neighbours based predictions were developed to forecast functional claudication distance for patients with IC over the course of six months of supervised exercise therapy. This prediction approach used historical data of selected matches (aka neighbours) to estimate the functional claudication distance for a new patient, over the course of supervised exercise therapy. Results of the prediction performance were in accordance with the hypotheses; within sample testing indicated small average bias, accurate average coverage and improved average precision of the individual patient predictions relative to prognostic estimates derived from the full sample. This is the first use of a neighbours based prediction method in this patient population. Several features of this prediction approach may ultimately promote its usefulness in clinical practice. First, the small average bias (-0.04 standard deviations) suggests the predictions are accurate on average, with no evidence of a systematic over or under estimation. Second, coverage was calculated to be 48.7%, meaning 48.7% of realised observations fell within the 50% prediction interval. This suggests the approach accurately models uncertainty in functional claudication distance, which is potentially important for clinical interpretation. If a patient is performing better or worse than expected, it is important to be able to interpret the magnitude of the deviation (i.e., the probability of an observed deviation from the predicted value) as this is an indicator of the degree to which a measurement should be interpreted as good (if it is better than predicted) or bad (if it is worse than predicted). Finally, neighbours based predictions were 24% more precise, on average, compared with the prediction model derived from the full sample. This suggests the potential for the precision of the neighbours based predictions to confer clinical utility.
For example, with this level of precision, the predictions are distinct between individuals with good vs. poor prognosis (Fig. 3).
There are at least two major areas where neighbours based predictions might be useful in clinical practice: (1) setting patient expectations and promoting adherence to exercise therapy, and (2) monitoring progress in therapy to detect treatment success and failure. Neighbours based predictions may be particularly useful for helping patients and clinicians understand prognosis; the use of historical clinical data enables ensemble visualisation (i.e., displaying a group or cluster of data points), which intuitively conveys the prognosis and uncertainty in prognosis. This also creates an opportunity to leverage behavioural science principles such as social norming; by comparing a patient to his or her peers, the patient may be motivated to adhere to the exercise programme to match or exceed others' performance. Additionally, the neighbours based prediction could be used as a template against which to benchmark progress in exercise therapy. If a patient is underperforming expectations, this could stimulate therapists to modify the exercise programme or refer the patient for consultation with another provider/discipline (e.g., vascular surgery).
Previous studies have used regression analyses to examine changes in walking distance over the course of supervised exercise therapy. 5–7,27 These studies have found that factors such as baseline walking distance, BMI, age, sex, and comorbidity status are significantly associated with walking distance outcome following supervised exercise therapy. The present results largely align with these previous findings. Of all available matching characteristics, baseline walking distance was the most influential in determining matches (aka neighbours), carrying roughly five times the weight of the next most influential factor: patient age.
Smoking history (measured in pack years), BMI, motivation level, and sex, although statistically significantly associated, were less influential. Although many important clinical factors (e.g., comorbidity status) were not measured in this study, 5 a person's baseline walking function may also indirectly capture many important health or functional prognostic factors. 6 Previous regression analyses in this patient population have reported high levels of uncertainty in predictions. Farah et al. reported that less than one third of the predicted walking distance values were within 25% of the realised outcome measurements. 7 Kruidenier et al. reported that between 25% and 34% of patients' realised walking distance outcomes were within a predefined target range of 325–400 metres. 5 Direct comparison of these previous findings with the present results is difficult due to the different methodologies used; however, there is evidence that the neighbours based approach may yield improved precision. Briefly, the 50% prediction interval of the neighbours based approach was 313 metres wide (on average), and 49% of the realised measurements fell within this interval. This appears to be an improvement on the results of Kruidenier et al., wherein a lesser proportion of the realised data fell within a larger target range. 5 One of the attributes of the neighbours based approach is its flexibility; both the prognostic trajectory and prediction interval are allowed to vary substantially across individual patients. This may enable improved precision over previously tested approaches.

Limitations
The main limitation of this analysis was the use of clinically collected data. On the one hand, no eligibility criteria were applied to study participants; thus, clinically collected data may be more generalisable to routine practice. On the other hand, because therapists collected data in the context of routine practice, this contributed to missing data. Additionally, challenges arise when creating and implementing a national data registry like the Chronic CareNet Quality system, including the wide variety of electronic health records from which to extract data. 15 Therefore, many patients were excluded from the database due to incomplete follow up measurement or no follow up measurement at all. A valid reason for lacking follow up measurements might be early termination of supervised exercise therapy or lack of compliance with therapy. This could have caused bias in the prediction approach. For example, if patients who are lost to follow up tend to have worse clinical outcomes, the predictions would systematically overestimate functional claudication distance. Therefore, prospective testing should be performed to investigate for the presence and extent of any bias in predictions. Nevertheless, the analysis relied upon a relatively large dataset (n = 4 455), and the temporal validation suggested the predictions performed well in out of sample testing. Finally, the dataset lacked several variables that might be expected to influence patients' prognosis, such as location of stenosis, comorbidity status, and details of the supervised exercise therapy (e.g., adherence, intensity). As mentioned, it is likely that many health factors that affect physical function are captured by the initial walking measurement. Differences in training programmes have potential influence on the outcome of supervised exercise therapy but tend to be very difficult to capture as structured data. 28 Moreover, uniformity in exercise programmes might be expected in the source data, as all participating physical therapists are aligned with Chronic CareNet and are educated in the general recommendations stated in the Royal Dutch Society for Physiotherapy guidelines for treatment of peripheral arterial disease. 18

Future directions
Two major areas of future work are foreseen: (1) refining prediction performance and comparing the neighbours based approach to other prediction approaches, and (2) examining the influence of predictions on clinical decisions and treatment outcomes for patients with IC. Specifically, future research might attempt a direct comparison of the neighbours based methodology with other prediction approaches, to further probe the strengths and limitations.
Additionally, the neighbours based approach could be extended in future work through the inclusion of additional matching characteristics or with adaptations to the approach (e.g., varying the numbers of matches across individuals). Ultimately, research should focus on translating this or other prediction methodologies to the point of care, to explore the effect of real time prognostic estimates on clinical decision making and patient outcomes.

Conclusion
In this study a neighbours based prediction approach was developed and tested to estimate functional claudication distance for patients with intermittent claudication undertaking a supervised exercise therapy programme. The neighbours based prediction approach enabled improved precision over previously described approaches in this patient population. Ultimately, this prediction methodology may inform the clinical use of personalised outcomes forecasts, which have the potential to support patient engagement and clinical decision making to ultimately improve patient centred care.