ORIGINAL RESEARCH

Interobserver Variability of Selective Region-of-Interest Measurement Protocols for Quantitative Diffusion Weighted Imaging in Soft Tissue Masses: Comparison with Whole Tumor Volume Measurements

Shivani Ahlawat, MD,1* Paras Khandheria, MD,1 Filippo Del Grande, MD,1,2 John Morelli, MD,3 Ty K. Subhawong, MD,4 Shadpour Demehri, MD,1 and Laura M. Fayad, MD1

Background: To assess the interobserver reliability of three selective region-of-interest (ROI) measurement protocols for apparent diffusion coefficient (ADC) quantifications in soft tissue masses (STMs) compared with whole tumor volume (WTV) ADC measurements.
Methods: Institutional review board approval was obtained and informed consent was waived. Three observers independently measured minimum and mean ADCs of 73 benign and malignant musculoskeletal STMs using three selective methods (single-slice [SS], predefined three slices [PD], observer-based [OB]) and WTV measurements at 3.0 Tesla. Minimum and mean ADC values derived from each method were compared with WTV measurements, and inter-reader variation was assessed using the intraclass correlation coefficient (ICC). The time required for each method of ADC measurement was recorded.
Results: For the SS, PD, OB, and WTV methods, minimum ADC values (× 10⁻³ mm²/s) were 0.97, 0.78, 0.73, and 0.67, respectively, and mean ADC values (× 10⁻³ mm²/s) were 1.49, 1.49, 1.51, and 1.49, respectively. Interobserver agreement was good to excellent for the minimum and mean ADC values for the three readers using the SS, PD, OB, and WTV (ICC range 0.78–0.90). The SS, PD and OB methods required the least amount of measurement time (14 ± 5, 40 ± 17, and 38 ± 15 s, respectively) while the reference WTV method required the longest measurement time (111 ± 54 s) (P < 0.01).
Conclusion: While all selective and WTV measurements offer good to excellent interobserver agreement, the selective OB method of ADC measurement results in the closest values to WTV measurements and requires significantly less measurement time than that required for the WTV method. J. MAGN. RESON. IMAGING 2016;43:446–454.

View this article online at wileyonlinelibrary.com. DOI: 10.1002/jmri.24994
Received May 1, 2015, Accepted for publication Jun 23, 2015.
*Address reprint requests to: S.A., The Russell H. Morgan Department of Radiology & Radiological Science, The Johns Hopkins Medical Institutions, 600 North Wolfe Street, Baltimore, MD 21287. E-mail: [email protected]
From the 1The Johns Hopkins Medical Institutions, The Russell H. Morgan Department of Radiology & Radiological Science, Baltimore, Maryland, USA; 2Department of Radiology, Regional Hospital, Lugano, Switzerland; 3Tulsa Radiology Associates, Tulsa, Oklahoma, USA; and 4Department of Radiology (R-109), University of Miami, Miami, Florida, USA
© 2015 Wiley Periodicals, Inc.

Ahlawat et al.: Interobserver Variability of ROI for DWI

The technique of quantitative diffusion weighted imaging (DWI) with apparent diffusion coefficient (ADC) mapping has emerged as a potential method for the assessment of soft tissue masses (STMs). Recent work exploring the role of DWI and ADC mapping in the assessment of STMs includes the differentiation of benign and malignant peripheral STMs,1–7 the detection of recurrent tumors in the surgical bed,8 and the assessment of treatment response in STMs.9–12 The degree of cellularity of STMs affects the degree of restricted diffusion, as indicated by the ADC values within the lesions. For example, using minimum ADC values, DWI accurately differentiates between benign and malignant peripheral nerve sheath tumors.5 Additionally, both minimum and mean ADC values of STMs reportedly provide noncontrast MR imaging metrics to accurately differentiate between cysts and solid masses.6 However, due to the potentially heterogeneous composition of some STMs, particularly soft tissue sarcomas with hemorrhage, necrosis and fibrosis, the determination of ADC values may be inaccurate depending on the region of interest (ROI) selected for measurement. ADC analysis can also be more challenging in high grade soft tissue sarcomas compared with low grade tumors, again due to their more heterogeneous appearance by MR imaging.13

There are a few reports regarding the influence of manual ROI size and positioning on interobserver variability of visceral tumor ADC measurements.14–17 Meanwhile, robust manual ADC measurement of the entire tumor volume using manual placement of ROIs over all MR images containing a tumor is time-consuming. Hence, several software tools have been developed for semi-automated segmentation and volumetric analysis of ADC maps for oncologic imaging applications.18–22 Although these novel tools are potentially more accurate for providing whole tumor volume (WTV) measurements, they are typically not widely available in routine clinical practice, and are still time-consuming. The evaluation and determination of optimal manual approaches for quantitative analysis of ADC values in terms of accuracy and reproducibility is important for proper use of ADC map data.
Prior investigations on quantitative ADC measurements in STMs for characterization, detection of tumor recurrence, and assessment of treatment response have relied on a manual approach.1–12 Thus, the hypothesis for our study was that single-slice or three-slice manual ROI measurements of ADC values could offer similar information to that obtained by WTV analysis of ADC maps and require less measurement time. The purpose of this study was to determine the interobserver agreement and accuracy of three selective ROI measurement protocols for ADC quantifications in STMs compared with manual WTV analysis of ADC measurements, and to define the fastest, most accurate method of ADC measurement (with WTV analysis taken as the reference standard).

Materials and Methods

Overview

This retrospective study was HIPAA compliant and approved by the institutional review board. Informed consent was waived. The MR imaging of 73 subjects with STMs was reviewed by three independent observers. Minimum and mean ADC value measurements were recorded using three selective ROI measurement protocols by the three observers and compared with WTV measurements of lesional ADC values, considered the reference standard. Agreement between the different measurement protocols and the reference standard was assessed, along with interobserver variability and time required for manually making measurements by each method.

February 2016

Subject Population

Between March 2011 and March 2014 (36 months), consecutive patients with soft tissue abnormalities were retrieved from a database of patients seen in the Orthopedic Oncology clinic who had undergone MR imaging at our institution at 3 Tesla (T). The inclusion criteria were a preoperative STM and 3T MRI with diagnostic quality DWI and ADC mapping acquired with three b values (50, 400, 800 s/mm²). The exclusion criteria were prior tumor resection, an STM too small for accurate ADC measurement (less than 1 cm), and absent DWI, nondiagnostic quality DWI, or DWI performed with different technical parameters.

MRI Acquisition Protocol

MRI was performed at 3T (Magnetom Trio or Verio, Siemens Medical Solutions, Malvern, PA) using a flexible phased-array body-matrix coil. The anatomic sequences were T1-weighted (repetition time [TR]/echo time [TE], 790/15; section thickness, 5 mm; axial and sagittal planes), fat-suppressed (FS) T2-weighted (TR/TE, 3600/70; section thickness, 5 mm; axial plane), and unenhanced and contrast agent-enhanced three-dimensional T1-weighted FS sequences (volumetric interpolated breath-hold examination [VIBE], isotropic resolution; TR/TE, 4.6–6.35/1.4–1.52; flip angle, 9.5°; section thickness, 1 mm; coronal plane with axial and sagittal reconstructions; 0.1 mmol/kg gadolinium-based contrast agent). In addition to the anatomical sequences, axial diffusion-weighted imaging (DWI) was performed using a spin-echo, single-shot echo-planar imaging sequence (TR/TE, 7700/80; field of view, 180–250 mm²; matrix size, 256 × 256 pixels; in-plane resolution, 0.7–1 mm²; section thickness, 5 mm; b values of 50, 400, and 800 s/mm²), and ADC maps including the entire STM were generated before contrast administration. ADC values were calculated from the three b values (50, 400, and 800 s/mm²) on the ADC map automatically generated by the scanner.
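As a point of reference, ADC maps are commonly based on the monoexponential model S(b) = S0 · exp(-b · ADC). The scanner's map-generation algorithm is not described in this article, so the following is only a minimal sketch that assumes a log-linear least-squares fit over the three b values; the function name `fit_adc` and the signal intensities are hypothetical.

```python
import math

def fit_adc(b_values, signals):
    """Fit ln(S) against b by least squares; the ADC is the negated slope.

    Assumes the monoexponential model S(b) = S0 * exp(-b * ADC).
    With b in s/mm^2 the result is in mm^2/s.
    """
    n = len(b_values)
    log_s = [math.log(s) for s in signals]
    mean_b = sum(b_values) / n
    mean_log = sum(log_s) / n
    sxx = sum((b - mean_b) ** 2 for b in b_values)
    sxy = sum((b - mean_b) * (y - mean_log) for b, y in zip(b_values, log_s))
    return -sxy / sxx

# Hypothetical single-voxel signals simulated with ADC = 1.5e-3 mm^2/s
b_values = [50, 400, 800]            # s/mm^2, as in the acquisition protocol
true_adc = 1.5e-3
signals = [1000.0 * math.exp(-b * true_adc) for b in b_values]

adc = fit_adc(b_values, signals)     # recovers the simulated ADC
```

Multiplying the fitted value by 1000 expresses it in the × 10⁻³ mm²/s units used throughout this article.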

Readers and Procedures

Three independent observers (two musculoskeletal radiology fellows and one musculoskeletal radiologist with 11 years' experience) measured and reported the mean and minimum ADC value of each tumor using the ADC measurement protocols described below. Before data acquisition, the observers were trained by evaluating three lesions, which were not included in this study. Observers had access to all anatomic images for reference when evaluating the ADC maps, and for defining the margin of the tumor. One observer measured the maximal tumor diameter in the transverse, anteroposterior, and craniocaudal planes on the postcontrast VIBE sequence, and an average diameter of each STM was determined.

ADC Measurement Protocols

For each measurement protocol below, the minimum and mean ADC measurements were recorded by each observer independently.

Journal of Magnetic Resonance Imaging

FIGURE 1: Measurement strategies for the various methods of ADC measurement and ROI placement in a spherical STM with internal dark gray oval representing regions of lowest ADC value and white oval representing regions with highest ADC values. a: The WTV ROI method is shown with sequential contiguous ROIs drawn through the entire neoplasm. b: The SS ROI method encompasses the largest tumor area. c: The three slice PD method has three slices through the cranial most slice, largest tumor area and caudal most slice. d: The three slice OB ROI method includes selection of slices with lowest ADC value (encompassing internal dark gray oval), highest ADC value (encompassing internal white oval) and the largest tumor area.

For each measurement, the ROI was placed such that it was entirely within the margin of the tumor (encompassing approximately 90–95% of the tumor on the measured axial slice). Figure 1 shows an overall schematic of how the ADC values were measured for the single slice, three slice and WTV methods. In addition, for each protocol, the time required to make the ADC measurements was recorded in the first 30 cases. The measurement protocols were as follows:

1. Single slice (SS) ROI: An ROI was placed entirely within the tumor at the slice containing the largest area of the tumor. This slice was determined by each reader independently (Fig. 2).

FIGURE 2: Single slice ROI method. A 61-year-old woman with subcutaneous grade 3 spindle epithelioid sarcoma in the right leg (arrows) visualized on the anatomical postcontrast T1 fat suppressed image (left, TR/TE 6.35 ms/1.52 ms) and the functional ADC map (right, TR/TE 7700 ms/80 ms). The single slice ROI method required visual inspection of the neoplasm on the ADC map, with selection by the observer of the slice containing the largest area of the tumor (line through image [left] localizes the ADC map in the mass) and measurement of the minimum and mean ADC value.


FIGURE 3: The three slice predefined ROI method. A 14-year-old girl with intermuscular grade 3 angiosarcoma in the right proximal arm (arrow) visualized on the anatomical imaging comprised of coronal postcontrast T1 fat suppressed image (left, TR/TE 6.35 ms/1.52 ms) showing localizer lines through the regions of interest where the ADC measurements were made on the functional imaging (b–d). The three slice predefined ROI method required placement of three ROIs on the ADC map (TR/TE 7700 ms/80 ms). Upper right: The first ROI was placed at the most cranial slice through the mass on the ADC map. Center right: The second ROI was placed through the slice containing the largest area of the tumor (similar to single slice ROI). Lower right: The third ROI was placed at the most caudal slice (if different from the two prior slices).

2. Sampling of the tumor using three ROIs:
2a. Predefined sampling (PD): Three ROIs were placed, two at the most cranial and caudal slices and the third ROI (if different from the two prior slices) at the slice containing the largest area of the tumor (similar to the SS ROI). The minimum measurement within all ROIs was reported as the minimum ADC, and the average of all measurements (not weighted for ROI area) as the mean ADC for this method (Fig. 3).
2b. Observer-based sampling (OB): Three ROIs were placed, the first ROI at the slice where the observer visually detected the lowest ADC value on the ADC map and the second ROI at the slice where the observer visually detected the highest ADC value. The third ROI (if different from the two prior slices) was placed at the slice containing the largest area of the tumor (Fig. 4). The minimum measurement within all ROIs was reported as the minimum ADC, and the average of all measurements (not weighted for ROI area) as the mean ADC for this method.
3. WTV ADC measurements (reference standard): ROIs were placed at all slices that contained the tumor and the minimum and mean ADC measurements from each slice were recorded and averaged to get WTV minimum and average ADC values. Averages were not weighted for ROI area.

Volume 43, No. 2

FIGURE 4: The three slice observer-defined ROI method. A 75-year-old man with intramuscular right gluteal high grade undifferentiated pleomorphic sarcoma (arrow) visualized on the anatomical imaging comprised of coronal postcontrast T1 fat suppressed image (left, TR/TE 6.35 ms/1.52 ms) showing localizer lines through the regions of interest where the ADC measurements were made on the functional imaging (b–d). The observer-defined three slice ROI method required placement of three ROIs on the ADC map (TR/TE 7700 ms/80 ms). Upper right: The first ROI at the slice where the observer visually detected the lowest ADC value. Center right: The second ROI at the slice where the observer visually detected the highest ADC value. Lower right: The third ROI (if different from the two prior slices) was placed at the slice containing the largest area of the tumor, analogous to the single slice method.

The segmentation was performed in the following order: first WTV, then OB, followed by PD and lastly the SS method. The observers were instructed to perform segmentations at separate sessions to reduce bias and progressive familiarity with the ADC map of each STM.
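The slice-selection rules of the four protocols can be sketched as follows. This is an illustration, not the software used in the study: the per-slice dictionaries, function names, and numbers are hypothetical; the observer's visual choice of the lowest and highest ADC slices in the OB method is approximated by the per-slice minimum and maximum; and the WTV minimum is taken as the lowest per-slice minimum, which is one possible reading of the protocol text.

```python
from statistics import mean

def unique(indices):
    """Drop repeated slice indices while keeping their order."""
    return list(dict.fromkeys(indices))

def select_slices(slices, method):
    """Pick the slices (ordered cranial to caudal) measured by each protocol."""
    n = len(slices)
    largest = max(range(n), key=lambda i: slices[i]["area"])
    if method == "SS":            # single slice with the largest tumor area
        chosen = [largest]
    elif method == "PD":          # most cranial, most caudal, largest area
        chosen = unique([0, n - 1, largest])
    elif method == "OB":          # stand-in for the observer's visual choice
        lowest = min(range(n), key=lambda i: slices[i]["min"])
        highest = max(range(n), key=lambda i: slices[i]["max"])
        chosen = unique([lowest, highest, largest])
    elif method == "WTV":         # every slice containing tumor
        chosen = list(range(n))
    else:
        raise ValueError(f"unknown method: {method}")
    return [slices[i] for i in chosen]

def summarize(rois):
    """Lowest per-slice minimum and unweighted average of per-slice means."""
    return min(r["min"] for r in rois), mean(r["mean"] for r in rois)

# Hypothetical per-slice ROI statistics (ADC in 10^-3 mm^2/s, area in cm^2)
slices = [
    {"area": 2.0, "min": 0.9, "mean": 1.4, "max": 1.9},
    {"area": 5.0, "min": 0.6, "mean": 1.3, "max": 2.0},
    {"area": 8.0, "min": 1.0, "mean": 1.5, "max": 2.1},
    {"area": 6.0, "min": 0.8, "mean": 1.7, "max": 2.6},
    {"area": 3.0, "min": 1.1, "mean": 1.6, "max": 2.2},
]

results = {m: summarize(select_slices(slices, m)) for m in ("SS", "PD", "OB", "WTV")}
```

In this made-up lesion the OB selection reaches the same minimum as WTV because the lowest-ADC slice is sampled by design, whereas the SS and PD selections can miss it.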

Statistical Analysis

Descriptive statistics were reported. The interobserver variability for ADC measurements from the three readers for each measurement protocol was analyzed by calculating the intraclass correlation coefficient (ICC: 0.00–0.20, poor correlation; 0.21–0.40, fair; 0.41–0.60, moderate; 0.61–0.80, good; and 0.81–1.00, excellent). The individual intraclass correlation for absolute agreement was calculated using a two-way random-effects model, treating raters as random. For further analyses, minimum and mean ADCs were averaged between the three observers and compared with the WTV measurements (again using ICC) to determine how well each measurement protocol agreed with WTV measurements. The time required for each method of ADC measurement was compared. STATA SE version 13.1 (StataCorp, College Station, TX) was used for statistical analysis.
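For readers who want to reproduce this kind of analysis, a single-measure ICC for absolute agreement under a two-way random-effects model can be computed from the mean squares of a lesions-by-observers table. The sketch below assumes the standard formula often labeled ICC(2,1); it is not the Stata routine used in the study, and the function name and ADC readings are hypothetical.

```python
def icc_absolute_agreement(ratings):
    """Single-measure ICC, two-way random effects, absolute agreement.

    ratings: list of n lesions, each a list of k observer measurements.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between lesions
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between observers
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical minimum ADC readings (10^-3 mm^2/s): 4 lesions x 3 observers
readings = [
    [0.62, 0.66, 0.60],
    [1.10, 1.05, 1.12],
    [0.85, 0.90, 0.88],
    [1.45, 1.40, 1.50],
]
icc = icc_absolute_agreement(readings)
```

Because the observer term (msc) appears in the denominator, a systematic offset between readers lowers this coefficient, which is what distinguishes absolute agreement from consistency.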

Results

Of 98 consecutive cases imaged at our institution, lesions were excluded due to lack of diagnostic quality DWI (n = 1), lack of the DWI acquisition (n = 23) and small size (n = 1), leaving 73 cases included in the final analysis. For the study population, the mean age of the subjects was 47.1 years (range, 4–86 years). Of these, 53% were female (n = 39) and 47% were male (n = 34). Of 73 STMs, there were 59 benign lesions, including peripheral nerve sheath tumors (n = 21), lipomas (n = 6), ganglion cysts (n = 5), hematomas (n = 2), ganglioneuromas (n = 2), well-defined venous malformations (n = 2), hibernomas (n = 2), fibrosis (n = 2), fibromatosis (n = 1), calcific bursitis (n = 1), collagenous fibroma (n = 1), desmoid (n = 1), elastofibroma (n = 1), foreign body reaction (n = 1), giant cell tumor of soft parts (n = 1), giant verruca vulgaris (n = 1), glomangioma (n = 1), hemophiliac pseudotumor (n = 1), lymphatic malformation (n = 1), fat necrosis (n = 1), myxoma (n = 1), nodular synovitis (n = 1), particle disease (n = 1), seroma (n = 1), and synovial chondromatosis (n = 1). There were 14 malignant tumors (in various stages of treatment), comprised of alveolar rhabdomyosarcoma (n = 1), Ewing's sarcoma (n = 1), synovial sarcoma (n = 1), metastatic melanoma (n = 1), epithelioid sarcoma (n = 1), leiomyosarcoma (n = 1), malignant peripheral nerve sheath tumor (n = 2), myxofibrosarcoma (n = 1), high grade undifferentiated pleomorphic sarcoma (n = 1), lymphoma (n = 1), grade 1 myxoid liposarcoma (n = 1), grade 3 spindle cell epithelioid sarcoma (n = 1), and angiosarcoma (n = 1). The 73 STMs investigated ranged in average size from 1 cm to 16.5 cm, with mean size 4.95 cm.

All measurement methods resulted in good to excellent interobserver agreement (Table 1). The lowest agreement was noted for the PD method for measuring the minimum ADC (ICC = 0.78 [0.70–0.85]), while the highest agreement was noted for the SS method for measuring the mean ADC value (ICC = 0.90 [0.85–0.93]). Figures 5 and 6 show modified Bland-Altman plots of interobserver agreement of minimum and mean ADC measurements using the SS, PD, OB, and WTV methods. Tables 2 and 3 summarize the minimum and mean ADC values averaged across three readers for each ROI method (SS, PD, and OB) compared with WTV measurements.

TABLE 1. Intraclass Correlation for Individual, Minimum and Mean ADC Measurement for Each Method With 95% Confidence Intervals

Method                           Minimum ADC        Mean ADC
Single slice                     0.79 [0.71-0.86]   0.90 [0.85-0.93]
Three slice predefined           0.78 [0.70-0.85]   0.88 [0.83-0.92]
Three slice – observer-defined   0.80 [0.72-0.86]   0.88 [0.83-0.92]
WTV                              0.82 [0.75-0.88]   0.87 [0.82-0.91]

FIGURE 5: Modified Bland-Altman plots of interobserver agreement of minimum ADC measurements using the SS, PD, OB, and WTV methods. The average of the individual minimum ADC measurements for the three observers is plotted on the horizontal axis. The differences of individual measurements from that average are plotted on the vertical axis. A 0 represents the mean absolute difference in ADC between the observers. Red lines represent the limits of agreement (1.96 times the standard deviation) below and above the mean absolute difference.

As expected, the minimum ADC value obtained by the reference standard WTV method was lower than that of each of the other methods (0.68 ± 0.59 × 10⁻³ mm²/s), although the OB method provided the closest minimum ADC values (0.73 ± 0.61 × 10⁻³ mm²/s). The difference in measurement between each selective method and the WTV method was statistically significant for all methods of obtaining the minimum ADC value, although such statistical differences are probably not clinically significant. Regarding the mean ADC values (× 10⁻³ mm²/s), the SS, PD, and WTV methods each yielded 1.49 and the OB method yielded 1.51. Of the three selective methods for mean ADC measurement, only the difference between the OB method (1.51 ± 0.62 × 10⁻³ mm²/s) and the WTV method (1.49 ± 0.63 × 10⁻³ mm²/s) approached statistical significance (P = 0.05).

Table 4 shows the average time in seconds required by each observer to make the various ADC measurements using the separate protocols. As expected, the SS method required the least amount of measurement time (14 ± 5 s), while the reference WTV method required the longest measurement time (111 ± 54 s) (P < 0.01). Of interest, the third observer, who had the most clinical experience, tended to have the shortest time required for each method, suggesting that quantitative DWI experience may affect efficiency by reducing segmentation time.
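The construction of the modified Bland-Altman plots in Figures 5 and 6 can be sketched as follows, based only on the description in the figure captions. The function name and the readings are hypothetical, and the use of the sample standard deviation is an assumption because the captions do not specify the estimator.

```python
from statistics import mean, stdev

def modified_bland_altman(ratings):
    """Return plot points and limits of agreement for several observers.

    ratings: list of lesions, each a list of observer measurements.
    Each point is (lesion average, individual value minus that average).
    """
    points = []
    for lesion in ratings:
        avg = mean(lesion)
        points.extend((avg, value - avg) for value in lesion)
    diffs = [d for _, d in points]
    # Deviations from each lesion's own average are centered on zero,
    # so the limits are symmetric: +/- 1.96 standard deviations.
    limit = 1.96 * stdev(diffs)
    return points, (-limit, limit)

# Hypothetical mean ADC readings (10^-3 mm^2/s): 3 lesions x 3 observers
readings = [
    [1.48, 1.52, 1.50],
    [0.95, 1.00, 0.90],
    [2.10, 2.05, 2.15],
]
points, (lower, upper) = modified_bland_altman(readings)
```

Points falling outside the returned limits would correspond to individual readings that deviate unusually far from the three-observer average for that lesion.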

FIGURE 6: Modified Bland-Altman plots of interobserver agreement of mean ADC measurements using the SS, PD, OB, and WTV methods. The average of the individual mean ADC measurements for the three observers is plotted on the horizontal axis. The differences of individual measurements from that average are plotted on the vertical axis. A 0 represents the mean absolute difference in ADC between the observers. Red lines represent the limits of agreement (1.96 times the standard deviation) below and above the mean absolute difference.

TABLE 2. Comparison of Minimum ADC (× 10⁻³ mm²/s) Values Averaged Across Three Readers for Each Method (Single Slice, Three Slice-Predefined, and Three-Slice Observer Defined)ᵃ

Method of ROI measurement        Minimum ADC    P-Value
Single slice                     0.97 ± 0.67    < 0.001
Three slice predefined           0.78 ± 0.61    < 0.001
Three slice – observer defined   0.73 ± 0.61    < 0.001
WTV                              0.68 ± 0.59

ᵃ A paired t-test was then performed to compare these averages among the different methods, with WTV ROI used as an arbitrary reference standard.

Discussion

For the analysis of musculoskeletal STMs, methods of automated or semi-automated ADC quantification have not been in routine clinical use and are not widely available. Hence, for the clinical evaluation of these neoplasms, ADC quantification is typically performed manually. We assessed the interobserver reliability of three different manual measurement protocols for determining the mean and minimum ADC values compared with WTV ADC measurements, and showed that a fast selective observer-based method provides similar results to WTV analysis with high interobserver agreement, and a significant decrease in measurement time.

There are prior investigations that have assessed the interobserver reliability of ADC quantification with different methods in visceral neoplasms (14–17). Nogueira et al assessed the reliability of measuring the mean ADC values in 39 breast lesions using small and large manually placed ROIs, and concluded that ROI selection influenced the mean ADC values, although for both ROIs, there was excellent interobserver measurement agreement.14 The authors proposed that the small ROI tended to include the most solid or "cellular or viable" portion of the lesion (and therefore lower ADC values), while the larger ROI included more of the lesion and led to an overall higher mean ADC due to the inclusion of adjacent cystic or necrotic regions.14 Conclusions from this study provide an understanding of how mean ADC values are influenced by selection of the ROI (as minimum ADC values were not investigated). Although not previously reported, the use of the median
rather than the mean ADC could be advantageous, as the mean reflects the inclusion of cystic/necrotic regions in addition to cellular areas, whereas the median value will reflect the most common value (either cystic/necrotic or cellular). In our study, ROIs for all selective methods included the whole lesion on axial views (a large ROI on each slice), suggesting a reason for the high agreement between mean ADC values of the selective methods and the WTV method. In another study on 69 patients with endometrial carcinoma, Inoue et al investigated the influence of four manual ROI methods on both the minimum and mean ADC values, and again noted excellent interobserver agreement, similar to our findings.15 This study noted statistically significant differences in the minimum and mean ADC values obtained by the different ROI methods, and hypothesized that the shape of the ROI influences the values (fixed shape compared with a freehand ROI),15 although WTV ADC values were not available for comparison in this study.

Other studies have compared WTV measurements with manual methods. Kwee et al compared a semiautomated volumetric WTV method of mean ADC measurement to a 2-slice manual selective method in 11 patients with esophageal tumors and showed good agreement (ICC of 0.73) between the manual and semiautomated methods, but an overall higher interobserver agreement (ICC of 0.96) for the mean ADC measurement by the semi-automated WTV method, indicating that inclusion of the entire tumor volume led to a decrease in interobserver variation.17 Similarly, Lambregts et al compared manual WTV measurements to two selective ROI methods in 46 patients with locally advanced rectal cancer for measurement of the mean ADC values and showed that the WTV method (ICC of 0.91) was the most reproducible of the three methods (compared with ICCs of 0.53 and 0.6 for selective ROI methods in that study).16 In that study, tumor boundaries for manual WTV measurements were determined by using high b-value (b1000) diffusion images rather than the anatomic T1, T2, and contrast-enhanced images used in our study (perhaps accounting for erroneous tumor borders), and the selective ROI methods included only a single slice and small samples, suggesting that the positioning and size of the small ROIs influenced the measurement of the mean tumor ADC values. Again, in our study, ROIs for the selective methods were large and included the entire lesion on the axial views of each slice, likely accounting for the higher inter-reader reliability observed in our study between our selective ROI methods and the WTV measurement method.

TABLE 3. Comparison of Mean ADC Value (× 10⁻³ mm²/s) Averaged Across Three Readers for Each Method (Single Slice, Three Slice-Predefined, and Three-Slice Observer Defined)

Method of ROI measurement        Mean ADC       P-Value
Single slice                     1.49 ± 0.65    0.88
Three slice - predefined         1.49 ± 0.62    0.45
Three slice – observer defined   1.51 ± 0.62    0.05
WTV                              1.49 ± 0.63

A paired t-test was then performed to compare these averages among the different methods, with WTV ROI used as an arbitrary reference standard.
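The paired t-test noted under Tables 2 and 3 compares, lesion by lesion, the reader-averaged ADC from a selective method with the WTV value. A minimal sketch of the test statistic is shown below; the function name and the ADC values are hypothetical, and the P value would additionally require the t distribution with the returned degrees of freedom (for example via a statistics package).

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(method_values, reference_values):
    """Return (t, degrees of freedom) for paired measurements.

    t = mean(d) / (sd(d) / sqrt(n)), where d are the per-lesion differences.
    """
    diffs = [m - r for m, r in zip(method_values, reference_values)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1

# Hypothetical reader-averaged minimum ADCs (10^-3 mm^2/s) for five lesions
ss_min = [0.95, 1.20, 0.70, 1.05, 0.88]    # single-slice method
wtv_min = [0.70, 0.95, 0.52, 0.80, 0.60]   # whole tumor volume reference

t_stat, dof = paired_t_statistic(ss_min, wtv_min)
```

Because each lesion serves as its own control, even a small but consistent offset between methods yields a large t statistic, which is consistent with differences that are statistically significant yet small in absolute terms.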

Unlike most prior investigations, the observers in our study recorded both the mean and minimum ADC values of each mass using the various methods. Of note, there were statistically significant differences in the minimum ADC value measurements between each selective ROI method and the WTV method in our study. The minimum ADC was highest in the SS method, for which readers recorded ADC values from one slice (the largest ROI within the lesion); as more slices were sampled in other methods, data from increasing volumes of ROIs were included, and approached the lowest minimum ADC value (as determined by the WTV method). In addition, unlike the OB method, neither the SS nor the PD method required the observers to select the region of interest with the lowest signal on the ADC map, and hence, these two methods do not provide an accurate representation of the lowest ADC value, and therefore the most "cellular" region within a neoplasm. The OB method approaches the WTV method, suggesting the highest accuracy for measuring minimum ADC values. Similarly, when measuring the mean ADC values, there was a borderline statistically significant difference between the OB and WTV methods (P = 0.05) even though the mean values averaged across three readers for the two methods were nearly the same (1.51 versus 1.49 × 10⁻³ mm²/s). This statistical result may be attributable to the large number of paired observations comprising the dataset; the magnitude of the difference between these measurement methodologies renders them essentially clinically equivalent. Specifically, with respect to STM characterization by ADC values, the differences noted within the literature are more substantial than the incremental differences observed between the selective and WTV methods.
For example, with respect to peripheral nerve sheath tumors, a more substantial difference in minimum ADC values is noted between benign and malignant tumors (0.5 ± 0.3 × 10⁻³ mm²/s versus 1.1 ± 0.3 × 10⁻³ mm²/s; P < 0.0001).5 Similarly, for solid STMs, minimum and mean ADC values were significantly lower for malignant versus benign tumors (minimum ADC: 0.5 ± 0.4 × 10⁻³ mm²/s versus 0.8 ± 0.3 × 10⁻³ mm²/s, respectively, P = 0.04; mean ADC: 1.2 ± 0.4 × 10⁻³ mm²/s versus 1.6 ± 0.1 × 10⁻³ mm²/s, respectively, P = 0.004).6 Similarly, larger differences between mean ADC values were noted between chronic hematomas and malignant STMs (1.6 ± 0.1 × 10⁻³ mm²/s versus 0.9 ± 0.1 × 10⁻³ mm²/s, respectively).3

TABLE 4. Average Time (in Seconds) Required for Each Method of ADC Measurement for Each Observer

             Single slice ROI   Predefined sampling using three ROIs   Observer based sampling using three ROIs   WTV
Observer 1   19 (± 9)           47 (± 14)                              49 (± 14)                                  115 (± 70)
Observer 2   16 (± 5)           52 (± 30)                              44 (± 29)                                  163 (± 111)
Observer 3   9 (± 4)            21 (± 6)                               21 (± 7)                                   54 (± 31)
Mean time    14 (± 5)           40 (± 17)                              38 (± 15)                                  111 (± 54)


Hence, the statistical significance detected between ADC measurement methods is likely of negligible clinical significance.

Overall, while the literature in visceral neoplasms suggests moderate to excellent agreement between readers and between various manual ROI methods similar to our conclusions, no studies have recorded the time required to produce results with each measurement method. The segmentation was performed in the following order: WTV, followed by OB, then PD, and lastly the SS method. The observation that the PD method required more time than the OB method is potentially explained by the fact that the OB method may be more intuitive. The OB method requires the reader to select the slices with the visually apparent lowest ADC value, highest ADC value and largest tumor area. In the PD method, the reader must correlate the ADC map with a coronal or sagittal sequence to determine the most cranial and caudal slices, excluding regions of perilesional edema and adjacent soft tissues, potentially increasing the time required for segmentation.

While mean ADC values are used to detect, stage, and follow up patients with visceral tumors, recent studies have shown that minimum ADC values can detect differences in cellularity within heterogeneous tumors that are not apparent when the mean ADC is used.23 In musculoskeletal STM evaluation particularly, the minimum ADC values may be more important than the mean ADC values for some applications, such as in differentiating between benign and malignant peripheral nerve sheath tumors.5 Our results showed that, despite the dependence of measurements on the subjective slice selection by observers who were instructed to qualitatively identify the slice with the lowest ADC, there was high interobserver agreement for ADC measurements for these selective methods.
Hence, during the interpretation of DWI obtained for the evaluation of STMs, workflow efficiency can be improved by using observer-based slice selection instead of WTV ADC analysis, which makes the integration of ADC quantification into routine STM evaluation by MR imaging clinically feasible.

This study had limitations. First, we performed manual rather than automated or semiautomated WTV measurements as the reference standard, although automated or semiautomated WTV measurements of STMs have not been validated. Second, for selective ROI protocols, we determined the accuracy of measurements based on the number of slices (single-slice versus three-slice methods) and their selection (predefined versus observer-based), not ROI size. However, a change in ROI size is unlikely to impact measurement time and interpretation workflow, although the size of the ROI can potentially influence absolute ADC values. Moreover, we did not weight measurements based on ROI size: in calculating the mean ADC for WTV, the mean ADC value derived from a large ROI in the middle of the tumor was given the same weight as the mean ADC value derived from a smaller ROI at the tumor margin. While theoretically less accurate, our approach was computationally more straightforward and probably had little impact on the final analysis. To limit variability in ROI size and shape, the observers were instructed to use the largest round or oval ROI located entirely within the tumor. Third, WTV measurements were used as the gold standard. No comparison was made with the pathological diagnosis or histological assessment of cellularity; hence, there is no pathologic correlation for the ADC values measured. However, the purpose of this study was to identify a measurement strategy rather than to assess the added value of ADC mapping to diagnosis or to the assessment of treatment response. Finally, many soft tissue sarcomas are highly heterogeneous, and it is unclear how well the single-slice measurement method would perform in a large population of highly heterogeneous masses; although we included soft tissue sarcomas in our population, they were less numerous than benign lesions.

In conclusion, a simplified observer-based manual method of ADC quantification in the evaluation of STMs is comparable to WTV measurements of the minimum and mean ADCs and requires significantly less analysis time. It is important to identify an optimal manual quantification strategy at this time, as automated methods of ADC measurement are not widely available for clinical use. Nevertheless, future investigations should also include studies of manual or automated WTV ADC measurements, as they may provide additional value for assessing tumor heterogeneity (potentially of particular importance in large heterogeneous soft tissue sarcomas) that is not available with single-slice methods.
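The ROI-weighting limitation noted above (each slice's mean ADC counted equally, regardless of ROI size) can be illustrated with a minimal sketch. This is not the analysis code used in the study; the `SliceROI` type, the function names, and the ADC and area values are hypothetical and chosen only to show how an unweighted average of per-slice means can differ from an area-weighted one.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SliceROI:
    """One ROI drawn on one slice (hypothetical illustrative values)."""
    mean_adc: float   # mean ADC within the ROI, in units of 10^-3 mm^2/s
    area_mm2: float   # ROI area in mm^2


def unweighted_mean_adc(rois: List[SliceROI]) -> float:
    """Average the per-slice means, giving every slice the same weight."""
    return sum(r.mean_adc for r in rois) / len(rois)


def area_weighted_mean_adc(rois: List[SliceROI]) -> float:
    """Average the per-slice means, weighting each slice by its ROI area."""
    total_area = sum(r.area_mm2 for r in rois)
    return sum(r.mean_adc * r.area_mm2 for r in rois) / total_area


# Hypothetical tumor: a small marginal ROI, a large central ROI,
# and a mid-sized ROI with a higher mean ADC.
rois = [
    SliceROI(mean_adc=1.0, area_mm2=100.0),
    SliceROI(mean_adc=1.5, area_mm2=400.0),
    SliceROI(mean_adc=2.0, area_mm2=300.0),
]

print(unweighted_mean_adc(rois))     # 1.5
print(area_weighted_mean_adc(rois))  # 1.625
```

When all ROIs have the same area, the two averages coincide; the more the ROI areas and per-slice means vary across the tumor, the more the two estimates can diverge.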

References

1. Nagata S, Nishimura H, Uchida M, et al. Diffusion-weighted imaging of soft tissue tumors: usefulness of the apparent diffusion coefficient for differential diagnosis. Radiat Med 2008;26:287–295.

2. Namimoto T, Yamashita Y, Awai K, et al. Combined use of T2-weighted and diffusion-weighted 3-T MR imaging for differentiating uterine sarcomas from benign leiomyomas. Eur Radiol 2009;19:2756–2764.

3. Oka K, Yakushiji T, Sato H, et al. Ability of diffusion-weighted imaging for the differential diagnosis between chronic expanding hematomas and malignant soft tissue tumors. J Magn Reson Imaging 2008;28:1195–1200.

4. Van Rijswijk CS, Kunz P, Hogendoorn PC, Taminiau AH, Doornbos J, Bloem JL. Diffusion-weighted MRI in the characterization of soft-tissue tumors. J Magn Reson Imaging 2002;15:302–307.

5. Demehri S, Belzberg A, Blakeley J, Fayad LM. Conventional and functional MR imaging of peripheral nerve sheath tumors: initial experience. AJNR Am J Neuroradiol 2014;35:1615–1620.

6. Subhawong TK, Durand DJ, Thawait GK, Jacobs MA, Fayad LM. Characterization of soft tissue masses: can quantitative diffusion weighted imaging reliably distinguish cysts from solid masses? Skeletal Radiol 2013;42:1583–1592.

7. Tan SL, Rahmat K, Rozalli FI, et al. Differentiation between benign and malignant breast lesions using quantitative diffusion-weighted sequence on 3 T MRI. Clin Radiol 2014;69:63–71.

8. Del Grande F, Subhawong T, Weber K, Aro M, Mugera C, Fayad LM. Detection of soft-tissue sarcoma recurrence: added value of functional MR imaging techniques at 3.0 T. Radiology 2014;271:499–511.

9. Barabasch A, Kraemer NA, Ciritsis A, et al. Diagnostic accuracy of diffusion-weighted magnetic resonance imaging versus positron emission tomography/computed tomography for early response assessment of liver metastases to Y90-radioembolization. Invest Radiol 2015;50:409–415.

10. Mayerhoefer ME, Karanikas G, Kletter K, et al. Evaluation of diffusion-weighted magnetic resonance imaging for follow-up and treatment response assessment of lymphoma: results of an 18F-FDG-PET/CT-controlled prospective study in 64 patients. Clin Cancer Res 2015;21:2606–2513.

11. Kokabi N, Camacho JC, Xing M, Edalat F, Mittal PK, Kim HS. Immediate post-doxorubicin drug-eluting beads chemoembolization MR apparent diffusion coefficient quantification predicts response in unresectable hepatocellular carcinoma: a pilot study. J Magn Reson Imaging 2015. doi: 10.1002/jmri.24845. [Epub ahead of print].

12. Lee MS, Kim MD, Jung DC, et al. Apparent diffusion coefficient of uterine leiomyoma as a predictor of the potential response to uterine artery embolization. J Vasc Interv Radiol 2013;24:1361–1365.

13. Zhao F, Ahlawat S, Farahani SJ, et al. Can MR imaging be used to predict tumor grade in soft-tissue sarcoma? Radiology 2014;272:192–201.

14. Nogueira L, Brandão S, Matos E, et al. Region of interest demarcation for quantification of the apparent diffusion coefficient in breast lesions and its interobserver variability. Diagn Interv Radiol 2015;21:123–127.

15. Inoue C, Fujii S, Kaneda S, et al. Apparent diffusion coefficient (ADC) measurement in endometrial carcinoma: effect of region of interest methods on ADC values. J Magn Reson Imaging 2014;40:157–161.

16. Lambregts DM, Beets GL, Maas M, et al. Tumour ADC measurements in rectal cancer: effect of ROI methods on ADC values and interobserver variability. Eur Radiol 2011;21:2567–2574.

17. Kwee RM, Dik AK, Sosef MN, et al. Interobserver reproducibility of diffusion-weighted MRI in monitoring tumor response to neoadjuvant therapy in esophageal cancer. PLoS One 2014;9:e92211.

18. Vouche M, Salem R, Lewandowski RJ, Miller FH. Can volumetric ADC measurement help predict response to Y90 radioembolization in HCC? Abdom Imaging 2014. [Epub ahead of print].

19. Fathi Kazerooni A, Mohseni M, Rezaei S, Bakhshandehpour G, Saligheh Rad H. Multi-parametric (ADC/PWI/T2-w) image fusion approach for accurate semi-automatic segmentation of tumorous regions in glioblastoma multiforme. MAGMA 2015;28:13–22.

20. Li Z, Bonekamp S, Halappa VG, et al. Islet cell liver metastases: assessment of volumetric early response with functional MR imaging after transarterial chemoembolization. Radiology 2012;264:97–109.

21. Halappa VG, Bonekamp S, Corona-Villalobos CP, et al. Intrahepatic cholangiocarcinoma treated with local-regional therapy: quantitative volumetric apparent diffusion coefficient maps for assessment of tumor response. Radiology 2012;264:285–294.

22. Gowdra Halappa V, Corona-Villalobos CP, Bonekamp S, et al. Neuroendocrine liver metastasis treated by using intraarterial therapy: volumetric functional imaging biomarkers of early tumor response and survival. Radiology 2013;266:502–513.

23. Malayeri AA, El Khouli RH, Zaheer A, et al. Principles and applications of diffusion-weighted imaging in cancer detection, staging, and treatment follow-up. Radiographics 2011;31:1773–1791.

