This Edutorial illustrates the relevance of statistical method reporting and testing of model assumptions using the example of the Cox proportional hazards model, a frequently used statistical method to compare time to event data among treatment groups.
In medical research, time to event analyses are of paramount interest. Their main advantage is that they not only compare a binary outcome, but also account for the length of time until the event occurs or the subject leaves the study. The latter applies when no event occurs during follow up; such subjects are censored at the last known time point. Time to event analyses assess the risk of an event occurring at any given time during the study (i.e., the hazard rate) for each treatment arm. The hazard rates of two groups can then be compared by calculating the hazard ratio (HR). The HR represents the instantaneous risk of an event in one group relative to the other at any time during follow up.
Cox proportional hazards models are very attractive since they allow adjustment for potential confounders. However, as with any other statistical method, model assumptions must be checked, and the results obtained are only valid if those assumptions are met. One distinctive concept of Cox models is the proportional hazards assumption (PHA, Fig. 1). It requires that the hazards of the compared study groups be proportional over time, i.e., that the HR be constant. In other words: the risk of an event occurring at any time during the study period in one group is a constant multiple of the risk in the comparison group. Only then is the HR a valid estimator of the overall treatment effect. Figure 2A shows the replicated results of a recently published study.
The HR of 0.66 (95% confidence interval 0.43 – 1.00) implies that the risk of an event was one third smaller for the patients in the treatment group at any time throughout the entire follow up compared with the control group.
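The arithmetic behind this interpretation can be sketched in a few lines of Python. The sketch assumes that the reported interval is a Wald interval, i.e. symmetric on the log scale; this is the usual choice for Cox models but is not stated in the text. The function name `summarise_hazard_ratio` is hypothetical, and because the confidence limits are rounded, the standard error and p value recovered here are approximations only.

```python
import math


def summarise_hazard_ratio(hr, ci_low, ci_high, z_crit=1.96):
    """Derive summary quantities from a reported HR and its 95% CI.

    Assumption: the CI is a Wald interval, symmetric around log(HR).
    """
    log_hr = math.log(hr)
    # On the log scale the interval width equals 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_hr / se
    # Two-sided p value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return {
        "relative_reduction": 1 - hr,  # reduction in hazard relative to control
        "log_hr": log_hr,
        "se": se,
        "z": z,
        "p": p,
    }


# Values reported in the text: HR 0.66 (95% CI 0.43 - 1.00).
summary = summarise_hazard_ratio(0.66, 0.43, 1.00)
print(f"relative hazard reduction: {summary['relative_reduction']:.0%}")
print(f"approximate SE of log(HR): {summary['se']:.3f}")
print(f"approximate two-sided p value: {summary['p']:.3f}")
```

Note that the "one third smaller" reading holds at every time point only if the PHA is met, which is the subject of the following paragraphs.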
The PHA can be assessed informally by visual inspection of the Kaplan–Meier estimator. Disproportional courses of the survival curves can indicate changes in the HR over time (Fig. 1). Yet, if the study groups are small, disproportional curves might not indicate a violation of the PHA, and if the study groups are very large, non-proportional hazards might not be detectable by visual inspection alone. In Figure 2A, the survival curves were similar at the beginning of the study. After approximately 100 days, the hazard was higher in the control group, whereas after 180 days, the curves seem to be parallel again, indicating similar hazard rates in both groups. However, testing of the PHA was not reported in this study.
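For readers who want to see what the Kaplan–Meier estimator computes before inspecting the curves, a minimal sketch in plain Python follows. The function name `kaplan_meier` and the two small data sets are hypothetical and serve only as illustration; they are not taken from the study in Figure 2A.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up duration of each subject
    events : 1 if the event occurred at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect all subjects whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Events and censored subjects both leave the risk set.
        n_at_risk -= removed
    return curve


# Hypothetical follow-up times (days) and event indicators.
treatment_curve = kaplan_meier([5, 8, 12, 12, 20], [1, 0, 1, 1, 0])
control_curve = kaplan_meier([3, 6, 6, 10, 14], [1, 1, 1, 0, 1])

for label, curve in (("treatment", treatment_curve), ("control", control_curve)):
    print(label, [(t, round(s, 3)) for t, s in curve])
```

Plotting both step functions on one axis gives the kind of display that is inspected for diverging, converging, or crossing curves.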
The statistical concept for testing the PHA is similar to the testing of residuals in simple linear regression: the differences between the observed time to event for each study participant and the corresponding estimated time to event from the Cox model are plotted against time and assessed for slope (Fig. 2B). Since the effect of each variable is assumed to be constant over time, the size of the residuals is expected to be constant over time as well. Figure 2B shows the analysis of the residuals for the treatment group variable of the example study in Figure 2A. The residuals were in fact not constant over time: the blue smoothed line is not constant in slope, and the average slope is negative (p = .016 for non-zero slope).
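The idea of checking residuals for a trend over time can be illustrated with a small, self-contained sketch. It assumes that Schoenfeld residuals are used, which is the common choice for this check; the text does not name the residual type. For a single covariate, the Schoenfeld residual at an event time is the covariate value of the subject with the event minus the weighted average covariate value among all subjects still at risk. All function names and the data are hypothetical, and the sketch reports only the slope, not a formal test or p value.

```python
import math


def risk_set_moments(times, x, beta, t):
    """Weighted mean and variance of x among subjects at risk at time t."""
    s0 = s1 = s2 = 0.0
    for tj, xj in zip(times, x):
        if tj >= t:
            w = math.exp(beta * xj)
            s0 += w
            s1 += w * xj
            s2 += w * xj * xj
    mean = s1 / s0
    return mean, s2 / s0 - mean * mean


def fit_cox_single(times, events, x, iterations=50, tol=1e-10):
    """Newton-Raphson estimate of the log HR for one covariate (Breslow ties)."""
    beta = 0.0
    for _ in range(iterations):
        score = info = 0.0
        for ti, di, xi in zip(times, events, x):
            if di:
                mean, var = risk_set_moments(times, x, beta, ti)
                score += xi - mean
                info += var
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta


def schoenfeld_residuals(times, events, x, beta):
    """Return sorted (event time, residual) pairs, one per observed event."""
    return sorted(
        (ti, xi - risk_set_moments(times, x, beta, ti)[0])
        for ti, di, xi in zip(times, events, x)
        if di
    )


def slope(points):
    """Ordinary least squares slope of residual against event time."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_r = sum(r for _, r in points) / n
    sxy = sum((t - mean_t) * (r - mean_r) for t, r in points)
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    return sxy / sxx


# Hypothetical data: x = 1 treatment, x = 0 control. Controls have their
# events early, treated subjects later, so the hazards are not proportional.
times = [1, 2, 3, 9, 12, 14, 15, 16, 5, 6, 7, 8, 10, 11, 13, 16]
events = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0]
x = [0] * 8 + [1] * 8

beta = fit_cox_single(times, events, x)
residuals = schoenfeld_residuals(times, events, x, beta)
print(f"estimated HR: {math.exp(beta):.2f}")
print(f"slope of residuals over time: {slope(residuals):.3f}")
```

Under proportional hazards the residuals scatter around zero with no trend. A clearly non-zero slope, as in these constructed data, suggests that the effect of the covariate changes during follow up.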
If non-proportional hazards are present, reporting of the overall HR may be misleading. Additionally, the statistical tests to assess an overall difference between the compared groups lose power.
Throughout the medical literature, reporting of the statistical methods is often only rudimentary.
Due to non-proportional hazards during the 12.7 year follow up period, the investigators decided to split the follow up into four time periods, each of which met the PHA. The underlying reason for non-proportionality was the varying hazard for aneurysm related mortality. There was a short term benefit in the EVAR group with a lower aneurysm related mortality during the first six months (HR 0.47, 95% CI 0.23 – 0.93), similar hazard rates from then on to eight years, but a higher aneurysm related mortality in the EVAR group thereafter (HR 5.82, 95% CI 1.64 – 20.65). In this situation, the overall HR does not sufficiently summarise the data, and splitting the follow up time is one possible solution to address the issue of non-proportional hazards. This trial also demonstrates the importance of adequate follow up times. If the follow up had been stopped after eight years, the full pattern of the treatment effect would not have been detected.
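Splitting the follow up is mostly a data preparation step: each participant's record is cut into one row per time period, and a separate HR is then estimated within each period. The sketch below shows only this splitting step. The function name `split_follow_up` is hypothetical. Of the cut points used, 0.5 and 8 years come from the text; the cut point at 4 years is a placeholder, since the text does not state all boundaries used in the trial.

```python
def split_follow_up(time, event, cut_points):
    """Split one subject's follow-up into (start, stop, event, period) rows.

    time       : total follow-up time of the subject
    event      : 1 if follow-up ended with the event, 0 if censored
    cut_points : period boundaries, in the same unit as time
    """
    rows = []
    start = 0.0
    bounds = sorted(cut_points) + [float("inf")]
    for period, bound in enumerate(bounds):
        if time <= start:
            break  # follow-up ended before this period began
        stop = min(time, bound)
        # The event is attributed only to the period in which follow-up ends.
        rows.append((start, stop, int(bool(event) and time <= bound), period))
        start = bound
    return rows


# Cut points in years; 4 is a hypothetical placeholder (see lead-in).
cuts = [0.5, 4, 8]

late_event = split_follow_up(9.5, 1, cuts)   # event in the last period
early_censor = split_follow_up(3, 0, cuts)   # censored in the second period

for row in late_event:
    print(row)
```

Each resulting row contributes to the risk set of its own period only, so a participant who dies after nine and a half years counts as event free in the first three periods and as an event in the fourth.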
There are multiple other options at one’s disposal if the PHA is violated. Cox models can be extended by time varying covariates. This can be helpful if, for example, a substantial proportion of patients stops smoking during follow up. The analysis can also be stratified by the violating variable if such a variable can be identified. An example would be stratification by a cardiovascular risk factor. Alternatively, non-parametric methods are available that make no assumption about the underlying distribution of the data and allow treatment effects to be non-proportional, such as the restricted mean survival time (RMST) or Kaplan–Meier estimates with inverse probability weighting.
The RMST was similar in both groups and the ratio between the two RMSTs was 0.91 (95% CI 0.81 – 1.02), indicating a non-significant difference (p = .11). Interpretation of the RMST ratio is straightforward: the average event free follow up time in the control group was 91% of the event free follow up time in the treatment group. The RMST is especially useful if almost all patients experience the event and one is therefore interested in the event free follow up time rather than in the comparison of rare events.
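Computationally, the RMST is the area under the Kaplan–Meier curve from time zero up to a chosen truncation time. The following sketch illustrates this. The function name `rmst`, the truncation time `tau`, and the data are all hypothetical, and the sketch produces a point estimate only, without a confidence interval.

```python
def rmst(times, events, tau):
    """Area under the Kaplan-Meier step function between 0 and tau.

    If the last subject is censored before tau, the curve is simply
    extended at its last value, which is a simplification.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival, last_t, area = 1.0, 0.0, 0.0
    i = 0
    while i < len(data) and data[i][0] <= tau:
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Add the rectangle under the curve since the previous event time.
            area += survival * (t - last_t)
            survival *= 1 - deaths / n_at_risk
            last_t = t
        n_at_risk -= removed
    # Final rectangle from the last event time up to tau.
    return area + survival * (tau - last_t)


# Hypothetical follow-up times (days) and event indicators.
tau = 15
rmst_treatment = rmst([5, 8, 12, 12, 20], [1, 0, 1, 1, 0], tau)
rmst_control = rmst([3, 6, 6, 10, 14], [1, 1, 1, 0, 1], tau)

print(f"RMST treatment: {rmst_treatment:.1f} days")
print(f"RMST control:   {rmst_control:.1f} days")
print(f"RMST ratio (control / treatment): {rmst_control / rmst_treatment:.2f}")
```

The ratio is formed as control over treatment to match the direction used in the text. The choice of `tau` matters: it should lie within the period for which both groups still have subjects under observation.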
Apart from these statistical obstacles that can impair time to event analysis, researchers must also address the concept of non-informative censoring. Censoring of participants must not be causally related to the treatment provided in their study group. An example of informative censoring would be dropouts in one study group due to side effects and subsequent discontinuation of the assigned treatment or follow up visits. Therefore, researchers should aim for maximum follow up information on all study participants and report completeness of follow up using standardised measures, e.g., the Follow up Index.
In conclusion, model assumptions should be tested and reported in sufficient detail to ensure valid study results and allow critical appraisal of the analysis. As shown in this Edutorial, non-proportional hazards in Cox models can have a dramatic impact and even lead to a change in the direction of the outcome. The following steps are proposed as part of standardised statistical reporting:
Declare all conducted statistical tests and identify the underlying model assumptions.
Verify all model assumptions and report the results concisely.
Adapt the analysis if assumptions are not met.
As a further action to enhance reporting quality, statistical consulting should be part of the peer review process, a concept that has already been implemented successfully in the European Journal of Vascular and Endovascular Surgery.
A randomized trial of vonapanitase (PATENCY-1) to promote radiocephalic fistula patency and use for hemodialysis.