Several studies have examined inter-observer variability in measurements for standard EVAR, but little is known about measurements for complex aortic aneurysm.
Two independent observers reviewed all preoperative CT scans of 268 patients in a French trial of fenestrated and/or branched aortic stent-grafts (f/b-EVAR). Those data were compared with those obtained (1) by investigators (extent of aneurysm, target vessel stenosis, and aortic diameters), and (2) from manufacturers (proximal landing zone, device diameter, and target vessel position). We assessed the reproducibility using kappa statistics for qualitative data and both Bland–Altman plot and Passing–Bablok regression analysis for quantitative data.
Reproducibility was moderate to almost perfect for all factors. However, a few critical discrepancies were found, such as target vessel clock position (≥45 minutes) and location (≥5 mm), level of proximal landing zone, and diameters of the endograft.
This is the first large-scale analysis focused on inter-observer variability in sizing for f/b-EVAR. The measurement data showed good agreement, but there were some critical discrepancies between observers that may affect clinical results.
The current study is the first large-scale analysis focused on inter-observer variability in sizing fenestrated and/or branched aortic stent-grafts. Agreement between the core laboratory and each rater was moderate to almost perfect in all cases; however, there were some significant discrepancies that may affect clinical results. These discrepancies should be taken into account in sizing fenestrated and/or branched stent-grafts.
Complex endovascular repair, such as fenestrated and/or branched endovascular aortic repair (f/b-EVAR), is a recent development.
Whereas standardized devices are designed to be suitable for a certain average anatomy, custom-made devices require accurate preoperative sizing of stent-grafts for technical success. The design of a custom-made device is based on an individual CT scan provided by a surgeon. Device planning requires experience in imaging and 3D reconstruction using a workstation to make all the necessary measurements. In the majority of cases, this sizing is performed by specialists in a centralized planning facility of the manufacturer.
Inter-observer variability in various exams is well known. Some authors have reported inter-observer variability in measurements of abdominal aortic aneurysm (AAA).
However, at present, little is known about discrepancies in the sizing of complex aortic stent-grafts between different specialists and between clinicians and manufacturers. This study investigates the variability between experienced endovascular surgeons and investigator or manufacturer measurements in measuring and sizing for endovascular aneurysm repair using fenestrated and/or branched stent-grafts.
WINDOWS study CT scans
The WINDOWS study is a multicenter, prospective, single-arm trial of f/b-EVAR for complex aortic aneurysms, either abdominal (juxta-, para-, and suprarenal AAA) or thoracoabdominal (TAAA), in centers selected according to their expertise in this technique and their compliance with the recommendations of the French Health Authority (HAS: Haute Autorité de Santé). All patients had preoperative CT scans, and patient inclusion was validated by both the inclusion criteria committee of the WINDOWS study and the planning center of the manufacturer. Between September 2009 and October 2012, 268 patients were included in the trial (registered at clinicaltrials.gov as NCT01168037; http://www.clinicaltrials.gov/ct2/show/NCT01168037).
In this study, all the preoperative CT scans were collected and reviewed by the core laboratory. Planning center data of the manufacturer were collected, as well as data provided by investigators. The quality of the retrieved scans varied widely in terms of slice thickness (1 mm to 5 mm) and scanning interval after contrast injection, and thus not all scans were optimal for sizing fenestrated and/or branched endografts.
Two independent observers performed image analysis as the core laboratory. A three-dimensional imaging workstation (TeraRecon Inc., Santa Rosa, CA, USA) was used to generate multiple three-dimensional reconstructions of volumetric data sets from the preoperative CT scans. Both observers were well-trained and experienced vascular surgeons. A third observer, an experienced interventional radiologist, made the final core laboratory decision in case of discrepancy in categorization between the two observers. For quantitative data, the mean of the two observers' values was taken as the core laboratory value.
Extent of aneurysm was classified according to reporting standards for thoracic endovascular aortic repair (TEVAR) and ACC/AHA guidelines.
As for paravisceral aneurysm, juxtarenal aneurysms arise distal to the renal arteries but in very close proximity to them; pararenal aneurysms involve the origin of one or both renal arteries; suprarenal aneurysms encompass the visceral aortic segment containing the superior mesenteric and celiac arteries.
Any stenosis (>70%) of the visceral branches (celiac axis [CA], superior mesenteric artery [SMA], right renal artery [RRA], left renal artery [LRA]) was identified. Stenosis was determined from the ratio between the diameter of the narrowest segment of the imaged artery (a) and the diameter of a normal segment of the artery proximal to the stenosis or distal to any poststenotic dilation (b): percentage of stenosis = (b − a)/b × 100.
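The stenosis formula above can be written out as a short calculation. The following is a minimal sketch in Python; the function names and the example diameters are illustrative assumptions and do not come from the study.

```python
def percent_stenosis(narrowest_mm: float, reference_mm: float) -> float:
    """Percentage of stenosis = (b - a) / b * 100.

    a = narrowest_mm: diameter of the narrowest segment of the artery.
    b = reference_mm: diameter of a normal reference segment.
    """
    if reference_mm <= 0:
        raise ValueError("reference diameter must be positive")
    if narrowest_mm < 0:
        raise ValueError("narrowest diameter cannot be negative")
    return (reference_mm - narrowest_mm) / reference_mm * 100.0


def exceeds_threshold(narrowest_mm: float, reference_mm: float,
                      threshold_percent: float = 70.0) -> bool:
    """True when the stenosis is strictly greater than the threshold (>70% in the text)."""
    return percent_stenosis(narrowest_mm, reference_mm) > threshold_percent


# Hypothetical example: a 6.0 mm reference segment narrowed to 1.5 mm.
print(percent_stenosis(1.5, 6.0))
print(exceeds_threshold(1.5, 6.0))
```

Because the result depends on which segment is chosen as the reference (b), two observers can obtain different percentages from the same scan, which is one plausible source of the variability examined here.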
A semi-automated centerline was generated using the above-mentioned workstation. The centerline was assessed with multiplanar reconstruction views perpendicular to the centerline of flow, and then manually edited if necessary. Aortic diameters at each level of visceral branches (CA, SMA, RRA, and LRA), thoracic and infrarenal aortic diameter were measured in perpendicular planes to the centerline.
Visceral artery orientation was measured relative to a line extending anteriorly from the centerline of the aorta. Clockwise deviation was assigned a positive value, and counterclockwise deviation a negative value. The average of the angles estimated by the two observers was defined as the core laboratory angle. All angles were then converted to clock positions for analysis, with 0° taken as 12 o'clock, because some of the target vessel orientation data obtained from the manufacturer were given only as clock positions.
For measuring longitudinal vessel separation, a stretch view was used. The distance between the center of each target vessel ostium and the low margin of the CA ostium was measured. When information on the CA could not be obtained, the low margin of the SMA ostium was used as the reference point instead.
The proximal aorta was considered to be suitable as a landing zone when the length of healthy aorta was ≥15 mm. Aneurysms were sub-divided into zones according to where it was thought an adequate proximal seal could be achieved in relation to the visceral arteries. Zone 0 was a seal below the lowest preserved renal artery, Zone 1 was between renal arteries at different levels, Zone 2 was above the renal arteries but below the SMA, Zone 3 was above the SMA but below the CA, and Zone 4 was above the CA (Dr. K. Ivancev, personal communication, June 2013).
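The conversion from signed angles to clock positions can be illustrated as follows. This is a sketch under stated assumptions: a 12-hour dial is assumed, so that 360° corresponds to 720 minutes (1° = 2 minutes, and a 45-minute discrepancy corresponds to 22.5°). The function names are hypothetical, and any rounding of clock positions applied in the study is not reproduced.

```python
def angle_to_clock_minutes(angle_deg: float) -> float:
    """Convert a signed angle to minutes past 12 o'clock on a 12-hour dial.

    0 degrees = 12 o'clock; clockwise angles are positive, counterclockwise
    negative. Assumes 360 degrees = 12 hours = 720 minutes.
    """
    return (angle_deg % 360.0) * 2.0


def format_clock(angle_deg: float) -> str:
    """Render the angle as an h:mm clock position, rounded to the nearest minute."""
    total = round(angle_to_clock_minutes(angle_deg)) % 720
    hours, minutes = divmod(total, 60)
    return f"{12 if hours == 0 else hours}:{minutes:02d}"


def clock_discrepancy_minutes(angle_a_deg: float, angle_b_deg: float) -> float:
    """Smallest difference between two orientations, in clock minutes (0 to 360)."""
    diff = abs(angle_to_clock_minutes(angle_a_deg) - angle_to_clock_minutes(angle_b_deg))
    return min(diff, 720.0 - diff)


# Hypothetical readings of the same vessel by two raters.
print(format_clock(-45.0))
print(clock_discrepancy_minutes(-10.0, 15.0) >= 45.0)
```

Taking the smaller of the two arcs avoids a spurious large discrepancy when two readings fall on either side of 12 o'clock.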
The proximal device diameter was determined according to the aortic diameter in the proximal seal zone and in agreement with the instructions for use of the manufacturer.
Data on extent of aneurysm, stenosis of visceral branches, and aortic diameter were obtained from each center, where they were estimated or measured using the center's own methods in daily practice. Orientation of visceral arteries, distance from the low margin of the CA (or SMA), and proximal device diameter were obtained from the manufacturer. The proximal seal zone proposed by the manufacturer was obtained from the manufacturer's planning sheet. (A circumferential seal was expected at the level of a fenestration but not at the level of a scallop, which means that the proximal landing zone was considered distal to the scallop if the device incorporated a scallop.)
All statistical analyses were performed using SAS version 9.3 (SAS Institute, Cary, NC, USA) or R statistical software, version 3.0.0 (A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria). Quantitative and qualitative variables were analyzed separately by several methods. For quantitative variables, agreement between the core laboratory and raters was assessed by plotting the difference between each reading and the reference with the limits of agreement (±two standard deviations around the mean difference) as described by Bland and Altman.
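The limits of agreement described above can be computed directly from paired measurements. The following is a minimal sketch in Python rather than the SAS or R code used in the study; the function name and the example values are assumptions made for illustration.

```python
from statistics import mean, stdev


def bland_altman_limits(readings, reference, n_sd=2.0):
    """Return (mean difference, lower limit, upper limit).

    Differences are taken as reading minus reference. The limits lie
    n_sd sample standard deviations either side of the mean difference.
    """
    if len(readings) != len(reference):
        raise ValueError("readings and reference must have the same length")
    if len(readings) < 2:
        raise ValueError("at least two paired measurements are required")
    diffs = [r - ref for r, ref in zip(readings, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - n_sd * sd, bias + n_sd * sd


# Hypothetical aortic diameters in mm: one rater against the reference values.
rater = [28.0, 31.0, 26.0, 33.0]
core_lab = [27.0, 32.0, 25.0, 34.0]
bias, lower, upper = bland_altman_limits(rater, core_lab)
print(bias, lower, upper)
```

In a Bland–Altman plot, each difference is plotted against the mean of the two measurements, with horizontal lines at the mean difference and at the two limits.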
Agreement for quantitative variables was also assessed by Passing–Bablok regression analysis.
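Passing–Bablok regression, which this study applied to the quantitative variables, estimates the slope as a shifted median of all pairwise slopes and the intercept as the median of the resulting residual offsets. The sketch below is a simplified illustration of that point estimate only; it omits the confidence intervals and the linearity test of the full procedure, it is not the implementation used in the study, and the function name is an assumption.

```python
import math
from statistics import median


def passing_bablok(x, y):
    """Return (slope, intercept) point estimates for paired measurements x, y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two points")
    slopes = []
    n = len(x)
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx, dy = x[j] - x[i], y[j] - y[i]
            if dx == 0 and dy == 0:
                continue  # identical points carry no slope information
            if dx == 0:
                slope = math.inf if dy > 0 else -math.inf
            else:
                slope = dy / dx
            if slope == -1:
                continue  # slopes of exactly -1 are discarded
            slopes.append(slope)
    if not slopes:
        raise ValueError("no usable pairwise slopes")
    slopes.sort()
    n_slopes = len(slopes)
    shift = sum(1 for s in slopes if s < -1)  # offset applied to the median index
    upper_index = n_slopes // 2 + shift
    if upper_index >= n_slopes:
        raise ValueError("too many slopes below -1; the method assumes positive correlation")
    if n_slopes % 2 == 1:
        slope = slopes[upper_index]
    else:
        slope = 0.5 * (slopes[upper_index - 1] + slopes[upper_index])
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


# Hypothetical check: identical measurements give slope 1 and intercept 0.
values = [22.0, 25.0, 27.0, 30.0, 34.0]
print(passing_bablok(values, values))
```

A slope close to 1 together with an intercept close to 0 indicates that two raters measure on the same scale without a constant offset.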
For qualitative variables, reproducibility was assessed using the weighted kappa statistics (quadratic weighting was employed). Applying generally accepted definitions, kappa values ≤0 indicate no agreement, 0 to 0.2 slight agreement, 0.2 to 0.4 fair agreement, 0.4 to 0.6 moderate agreement, 0.6 to 0.8 substantial agreement, and 0.8 to 1.0 almost perfect agreement.
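A quadratically weighted kappa for two raters can be sketched as follows. This is an illustrative Python version, not the code used in the study. Categories are assumed to be coded as integers 0 to k − 1 in their natural order. Because the bands quoted above share their end points, the labeling function assumes that each band includes its lower bound, which matches the way a value of 0.80 is described as almost perfect in the results.

```python
def quadratic_weighted_kappa(rater_a, rater_b, n_categories):
    """Cohen's kappa with quadratic disagreement weights (i - j)^2 / (k - 1)^2."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be paired and non-empty")
    if n_categories < 2:
        raise ValueError("at least two categories are required")
    k, n = n_categories, len(rater_a)
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row_totals[i] * col_totals[j] / n
    if weighted_expected == 0:
        # Happens, for example, when every rating falls in a single category.
        raise ValueError("kappa is undefined: no expected disagreement")
    return 1.0 - weighted_observed / weighted_expected


def interpret_kappa(kappa):
    """Map a kappa value to the agreement labels quoted in the text."""
    if kappa <= 0:
        return "no agreement"
    for lower_bound, label in [(0.8, "almost perfect"), (0.6, "substantial"),
                               (0.4, "moderate"), (0.2, "fair")]:
        if kappa >= lower_bound:
            return label
    return "slight"


# Hypothetical landing-zone ratings (Zone 0 to Zone 4) from two raters.
print(quadratic_weighted_kappa([0, 1, 2, 3, 4], [0, 1, 2, 3, 4], 5))
```

The undefined case corresponds to the situation in which all ratings fall into one category, as when no stenosis is recorded for a vessel.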
The core laboratory classified all 268 patients according to the extent of aneurysm: 136 juxtarenal, 48 pararenal, 16 suprarenal, 26 type IV thoracoabdominal (TAAA), 24 type III TAAA, 16 type II TAAA, and 2 type I TAAA. Inter-observer reproducibility was almost perfect in all comparisons (kappa value = 0.91 with investigator, 0.99 with Observer 1, and 0.82 with Observer 2). There were some discrepancies, however, that might lead to a difference in proximal landing zone (Fig. 1).
Visceral artery stenosis
Core laboratory indicated stenosis >70% in 11.9% (31/261) of CA, 0% (0/261) of SMA, 3.4% (9/261) of RRA, and 3.9% (10/256) of LRA. Inter-observer reproducibility showed moderate to almost perfect results (kappa values with Investigator, Observer 1, and Observer 2, respectively: CA, 0.56, 0.88, and 0.69; RRA, 0.41, 0.79, and 0.65; LRA, 0.54, 1.00, and 1.00). For SMA, the kappa value could not be calculated because of the absence of stenosis.
Aortic diameter at various levels (thoracic, CA, SMA, the lowest renal artery, and infrarenal) showed excellent agreements between core laboratory and each rater by Passing–Bablok regression analysis. Slopes and intercepts (given as slope, intercept) were as follows:
Thoracic level: 1.01, 1.09 (Investigator); 1.00, 0.00 (Observer 1); 1.00, 0.00 (Observer 2).
CA level: 0.92, 3.37 (Investigator); 1.00, 0.00 (Observer 1); 1.00, 0.00 (Observer 2).
SMA level: 0.96, 1.92 (Investigator); 1.00, 0.00 (Observer 1); 1.00, 0.00 (Observer 2).
Lowest renal artery level: 0.92, 3.27 (Investigator); 1.00, 0.00 (Observer 1); 1.00, 0.00 (Observer 2).
Infrarenal level: 0.98, 1.23 (Investigator); 1.00, 0.88 (Observer 1); 1.00, –0.50 (Observer 2).
Visceral artery orientation
As for CA clock position, agreements with core laboratory were all almost perfect (kappa value: 0.80 (Manufacturer), 0.92 (Observer 1), and 0.94 (Observer 2)). Cases that had discrepancy ≥45 minutes compared with core laboratory were 2.3% (3/128) (Manufacturer), 0 (Observer 1), and 0 (Observer 2).
As for SMA clock position, agreements with core laboratory were all almost perfect (kappa value: 0.81 (Manufacturer), 0.94 (Observer 1), and 0.93 (Observer 2)). The cases that had discrepancy ≥45 minutes compared with core laboratory were 1.5% (3/199) (Manufacturer), 0 (Observer 1), and 0 (Observer 2).
As for RRA clock position, agreements with core laboratory were substantial to almost perfect (kappa value: 0.64 (Manufacturer), 0.93 (Observer 1), and 0.92 (Observer 2)). The cases that had discrepancy ≥45 minutes compared with core laboratory were 3.6% (7/195) (Manufacturer), 1.0% (2/195) (Observer 1), and 1.5% (3/195) (Observer 2).
As for LRA clock position, agreements with core laboratory were all almost perfect (kappa value: 0.84 (Manufacturer), 0.95 (Observer 1), and 0.95 (Observer 2)). The cases that had discrepancy ≥45 minutes compared with core laboratory were 6.2% (12/193) (Manufacturer), 0.5% (1/193) (Observer 1), and 0.5% (1/193) (Observer 2) (Fig. S1).
Visceral artery distance
Both Bland–Altman plots and Passing–Bablok regression showed good reproducibility for the distances between each visceral artery and the low margin of the CA ostium. As for SMA, slopes and intercepts were 0.86, 0.47 (Manufacturer), 1.01, –2.36 (Observer 1), and 1.01, 1.96 (Observer 2). A difference ≥5 mm compared with core laboratory was found in 16.4% (21/128) of cases (Manufacturer). As for RRA, slopes and intercepts were 0.95, –0.48 (Manufacturer), 0.97, –0.31 (Observer 1), and 1.05, –0.02 (Observer 2). A difference ≥5 mm compared with core laboratory was found in 13.9% (27/194) of cases (Manufacturer). As for LRA, slopes and intercepts were 0.94, –0.19 (Manufacturer), 0.95, 0.44 (Observer 1), and 1.06, –0.75 (Observer 2). A difference ≥5 mm compared with core laboratory was found in 13.0% (25/192) of cases (Manufacturer) (Fig. 2, Fig. S2).
Proximal landing zone
The proximal landing zones (PLZ) proposed by each rater are compared with those of the core laboratory in Fig. 3. Agreement was almost perfect in all cases (kappa value = 0.82 with Manufacturer, 0.95 with Observer 1, and 0.80 with Observer 2). As with extent of aneurysm, however, there were some discrepancies. In the first half of this trial, cases in which the core laboratory proposed a more proximal PLZ than the manufacturer predominated, whereas in the last half, cases in which it proposed a more distal PLZ predominated. These differences were not statistically significant.
Proximal device diameter
Proximal device diameters proposed by the core laboratory and by the raters or manufacturer are shown in Fig. 4. Each agreement was good to almost perfect (kappa value = 0.83 with Manufacturer, 0.94 with Observer 1, and 0.93 with Observer 2). Cases with a discrepancy of ≥2 sizes in device diameter, compared with the device diameter proposed by the core laboratory, were 23.5% (46/196) (Manufacturer), 4.1% (8/196) (Observer 1), and 3.6% (7/196) (Observer 2).
This study is the first large-scale report focused on inter-observer variability of preoperative measurement and sizing for f/b-EVAR by reviewing CT scans of the French multicenter trial.
Several methods are available for preoperative sizing, but centerline analysis is generally used, particularly for complex endovascular aortic surgery. Some authors emphasize the usefulness and accuracy of centerline analysis, stating that (1) it is associated with a decreased number of iliac limb extensions, and (2) distance calculations provide accurate length selection of the stent-graft in the majority of cases.
However, the calculation of an aortic centerline of flow is always semi-automatic. Although the workstation calculates the center of the aortic lumen in the targeted area, operators have to assess whether the line follows the proper path, and modify it if necessary. The operator must draw the centerline manually when contrast enhancement is insufficient for automatic detection of arterial flow. In the case of an angulated aorta, the operator must adjust the centerline according to the predicted path that the main body will follow. These steps require judgment and care on the part of the operator, and can cause inter-observer variability (Fig. 5). However, little is known about inter-observer variability in measurements for complex aortic aneurysms that require visceral branch preservation.
In this study, agreement between the core laboratory and each rater was good for all variables. However, there were some cases with critical discrepancies when 45 minutes was taken as the threshold for vessel orientation and 5 mm as the threshold for vessel distance. Although we do not think that those discrepancies directly affect the success rate of target vessel revascularization or the patency rate of target vessels, they might at least have led to technical difficulty, with a subsequent increase in the duration of intervention. In the WINDOWS study, multivariate analysis showed that duration of intervention was a factor that affected 30-day and in-hospital mortality (and morbidity) after f/b-EVAR (under publication). As for the proximal landing zone (PLZ), in some cases the core laboratory proposed a more proximal part of the aorta for the PLZ than the manufacturer, and in others a more distal part. This is a significant issue. The rate of type I endoleak can be lowered by implanting an endograft with a longer landing zone. Conversely, the occurrence of spinal cord ischemia can be reduced if a shorter endograft is implanted. In the WINDOWS trial, the incidence of spinal cord ischemia was relatively high (4.1%) (under publication), and this is one of the most important issues to address in order to improve clinical results. Interestingly, as the manufacturer gained experience with more cases of f/b-EVAR, they might have planned device implantation more proximally to ensure exclusion of the aneurysm. We might be able to treat some cases with shorter devices and reduce morbidity. Intraoperative data and long-term results are awaited to investigate the correlation between discrepancies in image analysis and clinical results.
The quality of the retrieved scans varied widely in terms of slice thickness (1 mm to 5 mm) and scanning interval after contrast injection, and thus not all scans were optimal for sizing fenestrated and/or branched endografts. This variability in quality may influence the accuracy of measurement. Because the three-dimensional software interpolates between slices, it is likely that poor scan quality accounts for some of the disagreement in the setting of landmarks.
We could not obtain information about exactly how the manufacturer created the centerline or performed measurement and sizing. Although the same workstation was used, there may have been some differences in method between the core laboratory and the manufacturer, which could lead to discrepancies.
In this study, a limited amount of data was analyzed, and other measurements, such as treatment length, the level of the distal landing zone, target vessel diameter, and landing length, were not compared between raters. Those data are also important factors and should be evaluated in a subsequent study.
We obtained manufacturer sizing from their planning sheet, but there were some cases in which we could not obtain these, and, in addition, some of the acquired sheets did not include all the contents. Thus, in those cases, we substituted data from the device request form as the data of sizing. As the data of branched device request forms may be different from original sizing, those data were excluded from analysis in this study.
We used measurements and assessment of the core laboratory as a reference. Although the third observer provided the final decision as a core laboratory in case of discrepancy in categorization between two observers, as for quantitative values, the mean of two observers was defined as the data of core laboratory. Multi-observer analysis is recommended for proper preoperative planning.
This is the first large-scale analysis focused on inter-observer variability of sizing for f/b-EVAR. The measurement data showed good agreement, but there were some critical discrepancies between observers that may affect clinical results. These discrepancies should be taken into account in sizing fenestrated and/or branched stent-grafts. Intraoperative data and long-term results of those patients should be investigated to assess the role of proper planning and the results of f/b-EVAR.
Conflict of Interest
A grant obtained from the French Ministry of Health (STIC: French acronym for Support to evaluation of Costly Innovative Techniques) covered the cost of WINDOWS trial. The sponsor had no role in study design.
Appendix A. Supplementary material
The following are the Supplementary material related to this article:
Comparison of visceral artery orientation between the core laboratory and each rater. (A) Celiac axis (CA). (B) Superior mesenteric artery (SMA). (C) Right renal artery (RRA). (D) Left renal artery (LRA). Gray box indicates the same orientation between the two, and black box indicates ≥45 minutes discrepancy.