REVIEWS

Software for enhanced video capsule endoscopy: challenges for essential progress

Dimitris K. Iakovidis and Anastasios Koulaouzidis

Abstract | Video capsule endoscopy (VCE) has revolutionized the diagnostic work-up in the field of small bowel diseases. Furthermore, VCE has the potential to become the leading screening technique for the entire gastrointestinal tract. Computational methods that can be implemented in software can enhance the diagnostic yield of VCE both in terms of efficiency and diagnostic accuracy. Since the appearance of the first capsule endoscope in clinical practice in 2001, information technology (IT) research groups have proposed a variety of such methods, including algorithms for detecting haemorrhage and lesions, reducing the reviewing time, localizing the capsule or lesion, assessing intestinal motility, enhancing the video quality and managing the data. Even though research is prolific (as measured by publication activity), the progress made during the past 5 years can only be considered as marginal with respect to clinically significant outcomes. One thing is clear: parallel pathways of medical and IT scientists exist, each publishing in their own area, but where do these research pathways meet? Could the proposed IT plans have any clinical effect and do clinicians really understand the limitations of VCE software? In this Review, we present an in-depth critical analysis that aims to inspire and align the agendas of the two scientific groups.

Iakovidis, D. K. & Koulaouzidis, A. Nat. Rev. Gastroenterol. Hepatol. advance online publication 17 February 2015; doi:10.1038/nrgastro.2015.13

Introduction

Department of Computer Engineering, Technological Educational Institute of Central Greece, 3rd Km Old National Road Lamia-Athens, Lamia PC 35 100, Greece (D.K.I.). The Royal Infirmary of Edinburgh, Endoscopy Unit, 51 Little France Crescent, Old Dalkeith Road, Edinburgh EH16 4SA, UK (A.K.). Correspondence to: A.K. tassos.koulaouzidis@nhslothian.scot.nhs.uk

Competing interests
A.K. has received research support from Given Imaging and SynMed UK, lecture honoraria from Dr Falk Pharma UK, and travel support from Abbott, Dr Falk Pharma UK, Almirall, and MSD. D.K.I. declares no competing interests.

Video capsule endoscopy (VCE) remains the first-line diagnostic tool for screening of small bowel diseases.1 Despite sustained technical advancements,2,3 VCE sequences are still manually reviewed, a time-consuming task that typically lasts 45–90 min and requires the undivided concentration of the reviewer.4 This process is prone to diagnostic errors as a result of the natural limitation of human capabilities.5 Studies from the past few years indicate that, despite its high diagnostic yield, the true sensitivity of VCE is difficult to determine owing to the lack of an adequate gold standard.1,2,6 Furthermore, the accuracy of VCE is heavily dependent on accurate interpretation, which is influenced by the reviewer's experience.7,8 Historically, the VCE miss rates for vascular lesions, ulcers and neoplasms were 5.9%, 0.5% and 18.9%, respectively.9 Computational methods, which are integral to the reviewing software of the VCE devices, could contribute to the reduction of both the time required by the reviewers and errors in human interpretation. These computational methods first appeared in the early 2000s in the form of automatic polyp detection using conventional video endoscopy.10,11 Since then, several other approaches for the detection and/or characterization of abnormalities have been proposed by

computer scientists and engineers (henceforth referred to as information technology [IT] scientists) to support medical decision-making.12,13 Here, ‘abnormality’ refers not only to polyps but also to other pathologies, such as ulcers and haemorrhage. These methods are based on processing and/or analysis of videos from VCE procedures.14 Processing involves transformation of the video signals to enhance relevant information or suppress irrelevant information, for example, to enhance the outlines of lesions or to suppress noise. Analysis involves automatic detection of relationships between sets of pixels or video frames, such as pixels belonging to a lesion, or recognition of image contents (that is, lesion recognition). Relevant methods include calculation of numerical features15 (feature extraction) that encode information patterns contained within images or video segments, such as the colour of an image region, and pattern classification. Algorithms for classifying patterns (classifiers) can be supervised, in the sense that they can adapt their parameters (and thus learn) to classify patterns.16 These algorithms are based on information provided by training examples that are acquired from gold-standard images, that is, images manually annotated by VCE experts.17 This process enables pattern recognition by machines, which can be used to identify lesion patterns. Popular machine-learning algorithms for supervised pattern recognition include neural networks and support vector machines.16 This Review focuses on computational methods that can be implemented in software to enhance VCE procedures in terms of time efficiency and diagnostic

NATURE REVIEWS | GASTROENTEROLOGY & HEPATOLOGY © 2015 Macmillan Publishers Limited. All rights reserved

ADVANCE ONLINE PUBLICATION  |  1

Key points
■ Computational software can enhance the diagnostic yield of video capsule endoscopy (VCE), both in terms of efficiency and accuracy
■ Despite increasing activity in information technology (IT) research worldwide, the translation of this information to clinical practice has been limited
■ The development of intelligent software systems requires close collaboration between medical and IT scientists at a laboratory level
■ Public sharing of anonymized and annotated VCE image and video data is essential

performance. Its aim is not technical; however, technical issues are discussed qualitatively to promote further, essential, communication between medical and IT experts. Moreover, this Review comprehensively addresses and expands on topics and directions that have already been identified13,18 and analyses the publication activity, state-of-the-art methods and results from the latest studies to identify research challenges and directions for further essential progress in VCE. The state-of-the-art methods reviewed herein are organized into seven categories: automated video analysis for the detection of haemorrhage and lesions; computer-aided reduction of the time required by VCE reviewers to visually examine the endoscopic videos; video-based capsule localization, which can replace or enhance the conventional approaches based on external sensors; computer-aided intestinal motility assessment; video processing for enhancement of the visual quality of the VCE videos; efficient management of the VCE data produced by health-care providers; and availability of these data for both educational and software development purposes.
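The supervised pattern-recognition pipeline outlined in the introduction (feature extraction from expert-annotated gold-standard images, followed by classifier training) can be illustrated with a minimal sketch. The nearest-centroid classifier and the synthetic "annotated" patches below are illustrative stand-ins, not any published VCE method:

```python
import numpy as np

def colour_histogram(patch, bins=8):
    """Normalized per-channel colour histogram of an RGB patch (feature extraction)."""
    feats = []
    for c in range(3):
        h, _ = np.histogram(patch[..., c], bins=bins, range=(0, 256))
        feats.append(h)
    f = np.concatenate(feats).astype(float)
    return f / f.sum()

class NearestCentroidClassifier:
    """Minimal supervised classifier: learns one centroid per class from
    annotated training examples, then labels new patches by proximity."""
    def fit(self, X, y):
        self.labels_ = np.unique(y)
        self.centroids_ = np.array([X[y == l].mean(axis=0) for l in self.labels_])
        return self
    def predict(self, X):
        d = np.linalg.norm(X[:, None, :] - self.centroids_[None, :, :], axis=2)
        return self.labels_[d.argmin(axis=1)]

# Synthetic "annotated" data: reddish (lesion-like) vs pale (normal-like) patches
rng = np.random.default_rng(0)
red = rng.integers(150, 256, size=(20, 16, 16, 3)); red[..., 1:] //= 3
pale = rng.integers(100, 200, size=(20, 16, 16, 3))
X = np.array([colour_histogram(p) for p in np.concatenate([red, pale])])
y = np.array([1] * 20 + [0] * 20)  # 1 = abnormal, 0 = normal

clf = NearestCentroidClassifier().fit(X, y)
test = np.array([colour_histogram(red[0]), colour_histogram(pale[0])])
print(clf.predict(test))  # → [1 0]
```

In practice, published methods use far richer features and stronger classifiers (for example, support vector machines or neural networks), but the fit/predict structure is the same.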

State-of-the-art methods
Haemorrhage detection
The importance of detecting haemorrhage stems from the fact that it can be associated with various pathologies. VCE has had a major effect on the way patients with occult and, in particular, obscure haemorrhage are managed.19 A commercially available software tool, the suspected blood indicator (SBI), has been produced by Given Imaging Ltd for the automatic detection of blood in VCE video frames. However, several studies have criticized this tool for its suboptimal performance, even for patients with active intestinal bleeding,20 as well as for having low sensitivity (true-positive rate) and specificity (true-negative rate), in the ranges of 21.5–41.8% and ~37–59%, respectively.21,22 As blood has a distinct red hue, methods for automatic detection of haemorrhage have mainly been based on the colour features of the VCE images. Although colour is a property of a single pixel, it can also be considered as a property of a wider region within an image. The colour content of a VCE image or image region is usually described by a normalized colour histogram. A pioneering study in this field23 was based on histograms estimated from whole VCE images. Current methods to detect haemorrhage mainly include variations of this approach with respect to colour representation and the size and shape of the image regions, for example, so-called

pyramid histograms extracted from subimages iteratively sampled at various resolutions.24 The latest approaches consider regions of arbitrary shape that are automatically selected by image segmentation algorithms, such as region growing (the algorithm begins from an initial set of scattered pixels, then iteratively selects neighbouring pixels to eventually form contiguous regions within the image)25 or super-pixels (the algorithm groups pixels into perceptually meaningful image regions, which can be used to replace the rigid structure of the pixel grid),26 for segmentation of the blood regions in the VCE images. In addition, promising results have been obtained by methods that evaluate colour at the pixel level rather than at the region level. One of the earliest approaches was found to outperform the SBI, with 92% sensitivity and 98% specificity,27 whereas one of the latest approaches achieved very high results (94.8% sensitivity and 96.1% specificity) by incorporating the concept of patient-adaptivity (that is, the capability of the algorithm to adapt its parameters to the different colour-balance conditions of VCE images obtained from different patients).28 Texture is also considered a discriminative feature of haemorrhage, mainly in conjunction with the detection of a lesion. Unlike colour, texture can only be considered as a property of an image region; it describes the spatial arrangement of the pixel values within that region. Texture features are typically estimated from the intensity of the image.29 Colour–texture histograms have been exploited for the detection of haemorrhage and nonbleeding lesions that are detectable by their more reddish and pinkish appearance compared with the surrounding tissue.30 An important concept in this approach is that it initially detects pixels that could belong to bleeding regions, and then proceeds to the histogram-based evaluation of the respective image regions to refine the assessment.
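Pixel-level colour analysis of the kind described above can be sketched as follows; the red-dominance thresholds are illustrative assumptions, not values taken from the cited studies:

```python
import numpy as np

def bleeding_candidate_mask(rgb, red_ratio=1.8, min_red=120):
    """Flag pixels whose red channel dominates green and blue:
    a crude pixel-level cue for blood; thresholds are illustrative."""
    r = rgb[..., 0].astype(float)
    g = rgb[..., 1].astype(float)
    b = rgb[..., 2].astype(float)
    dominance = r / (np.maximum(g, b) + 1e-6)
    return (dominance > red_ratio) & (r > min_red)

frame = np.full((4, 4, 3), 90, dtype=np.uint8)  # dull mucosa-like background
frame[1:3, 1:3] = (200, 40, 30)                 # small bright-red region
mask = bleeding_candidate_mask(frame)
print(int(mask.sum()))  # → 4 flagged pixels
```

Real systems refine such candidate masks with region-level (for example, histogram-based) evaluation, exactly as described for the colour–texture approach above.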
Another important aspect is that the evaluation of the method based on the colour–texture histogram was performed with a fairly large dataset of 84 full-length videos, which demonstrated a notable improvement over the SBI in terms of the number of bleeding, lesion and false positive events or frames that were detected.30 In another study, a set of optimal colour–texture features for detection of haemorrhage and ulcers was selected by applying an automatic feature selection scheme.31 Of a total of 5,859 features considered, only three were sufficient to achieve high levels of haemorrhage detection. However, the reference dataset of 613 images included only 38 images of bleeding, which limits the statistical reliability of these results. Methods based on the estimation of colour and/or texture features from whole images or subimages will probably have reduced sensitivity for the detection of small quantities of blood. For enhanced sensitivity, haemorrhage-detection methods that are based on colour at the pixel level have been proposed.32,33 A more recent method exploits information on colour to detect salient pixels (Figure 1a).17 The colour components of the salient pixels, for example, their hue and saturation, are used, along with the ranges of the respective values of their neighbouring pixels, to detect several abnormalities, including haemorrhage. The average reported accuracy

for the detection of haemorrhage using this method was 83.5% in terms of the area under the receiver operating characteristic curve (AUROC).34 A comparative study between colour and texture features indicated that the colour features investigated can be more discriminative for detection of multiple bleeding spots.35 A summary of state-of-the-art methods for detecting haemorrhage is provided in Table 1.

Figure 1 | Results of state-of-the-art software-based methods for video capsule endoscopy. Blue crosses indicate automatically detected salient pixels and red circles indicate which of these pixels have been assessed as abnormal by the system (true positive detections). a | Haemorrhage detection.17 Permission obtained from Elsevier © Iakovidis, D. K. & Koulaouzidis, A. Gastrointest. Endosc. 80, 877–883 (2014). b | Small-ulcer detection.17 Permission obtained from Elsevier © Iakovidis, D. K. & Koulaouzidis, A. Gastrointest. Endosc. 80, 877–883 (2014). c | Panoramic visualization of 10 frames simultaneously to reduce the review time.82 Permission obtained from IEEE © Iakovidis, D. K. et al. 2013 IEEE 13th International Conference on Bioinformatics and Bioengineering doi:10.1109/BIBE.2013.6701598. d | 3D visualization for image enhancement.123 Permission obtained from Elsevier © Rondonotti, E. et al. Gastrointest. Endosc. 80, 642–651 (2014). e | Visualization of the motility of a whole small intestine with adaptive cut longitudinal view.113 Motility events can be identified from the patterns formed by the dark areas of this visualization, for example, the spiky pattern at the beginning indicates contractions, whereas smoother patterns indicate less motility. Each white stripe marks 10 min of the video duration. Permission obtained from Elsevier © Drozdzal, M. Comput. Med. Imaging Graph. 37, 72–80 (2013).
Lesion detection
The automatic detection of abnormalities can help reduce the number of false negative diagnoses and, indirectly, could contribute to the reduction in the time it takes to review VCE videos. The consequent reduction in morbidity and health-care costs could result in a noteworthy socioeconomic effect. The main challenge in the development of methods to automatically detect lesions is to identify and mathematically model the image features that differentiate lesions from normal mucosa (and from intestinal content). Although essentially similar to the problem of detecting haemorrhage, discussed in the previous section, the diversity of the lesions makes automatic lesion detection a more challenging task. Several automatic methods to detect lesions have been proposed (Table 2). Most of them deal with the detection of polyps, tumours, ulcers and Crohn's disease lesions. A few of the methods deal with the detection of coeliac disease, lymphangiectasias and hookworms. Only a handful of the methods deal with the detection of more than a single type of abnormality.17,31,36 In most studies, colour and texture were considered as discriminative features for lesion detection. The rationale behind this approach was based on previous medical studies37,38 and/or encouraging results presented in technical studies on lesion detection in the context of conventional video endoscopy.10,39,40 Colour has been used as the sole discriminative property of lesions. Methods to estimate the colour features of VCE videos are in principle similar to those proposed for detecting haemorrhage. For the detection of certain types of lesions, colour intensity without any chromatic information has been considered.
For instance, a method for the detection of lymphangiectasias based on their characteristic bright appearance in VCE images has been proposed.41 The variation of colour intensity has been used for the detection of coeliac disease.42 The approach based on colour saliency, discussed in the previous section on detecting haemorrhage,17 detects several other types of abnormalities (Table 2, Figure 1b). The best performance was obtained for angioectasias of intermediate bleeding potential (P1)43 (AUROC 97.5%) and nodular lymphangiectasias (AUROC 96.3%). Overall, the average accuracy is 94.0% (AUROC 89.2%), without excluding intestinal content from the evaluation.17 Texture has been used either as the sole feature or jointly with colour (colour–texture). In the context of lesion detection, the texture features that are estimated from the VCE images include second or higher order statistical measures,31,44,45 measures estimated from fractal-based image modelling46 or image transformations, including Fourier,47 wavelet,48–50 curvelet45 and contourlet51 transforms, which reveal discriminative lesion characteristics that might not otherwise be evident. In the detection of polyps and ulcers, an approach based on concatenation of colour


Table 1 | State-of-the-art methods for detecting haemorrhage in VCE

| Study | Number of images in largest dataset (patients) | Features | Best average accuracy (%) | Best average sensitivity/specificity (%) |
|---|---|---|---|---|
| Jung et al. (2009)28 | 4,800 (12) | Colour | NA | 94.8/96.1 |
| Lv et al. (2011)24 | 560 | Colour | 97.9 | NA |
| Pan et al. (2011)32 | 14,630 (150) | Colour | NA | 93.1/85.8 |
| Alotaibi et al. (2013)35 | 691 | Colour | 81.2 (AUROC) | NA |
| Figueiredo et al. (2013)33 | 4,000 (10) | Colour | 92.7 | 92.9/>90.0 |
| Sainju et al. (2014)25 | 1,500 (3) | Colour | 98.0 | 99.4/99.3 |
| Fu et al. (2014)26 | 5,000 | Colour | 95.0 | 99.0/94.0 |
| Szczypinski et al. (2014)31* | 613 (50) | Colour and texture | NA | 100.0/99.0 |
| Iakovidis & Koulaouzidis (2014)17‡ | 1,370 (252) | Colour | 83.5 (AUROC) | NA |

Results from the majority of the studies were quantified in terms of accuracy and/or sensitivity and specificity. The AUROC was adopted in only two studies17,35 instead of the conventional accuracy, which was not available. *Deals also with ulcer detection. ‡Deals also with the detection of several lesions. A missing number of images or patients implies unavailability of the respective data. Abbreviations: AUROC, area under receiver operating characteristic; NA, not available; VCE, video capsule endoscopy.

and texture features has been proposed to obtain a fused colour–texture representation of VCE images, which provides more robust lesion detection than the use of just colour or texture features alone.36 Higher sensitivity and specificity rates have been reported for ulcer detection by using a more sophisticated feature fusion technique based on the combination of classifiers.52 In another study, polyp detection was based on salient pixels that were identified only by their colour intensity.53 An approach based on both colour saliency and image texture has been proposed for the detection of ulcers.54 In this case, colour is used to detect salient pixels, and then statistical texture features are separately extracted from their neighbourhoods. The detection of Crohn’s disease lesions has also been based on the separate estimation of colour and texture features,55 without exploiting saliency.56 Few studies have considered shape as an additional cue for lesion discrimination. Shape features considered for polyp detection have been estimated from both 2D36,57,58 and 3D image structures.44 The results of these studies indicate that information on shape can enhance polyp detection based on colour or texture, with the latest methods achieving up to 91.0% sensitivity and 95.2% specificity (Table 2).44
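The second-order statistical texture measures mentioned above are commonly derived from a grey-level co-occurrence matrix (GLCM). The following numpy-only sketch is illustrative and does not reproduce any cited implementation:

```python
import numpy as np

def glcm_features(gray, levels=8, dx=1, dy=0):
    """Second-order statistical texture features from a grey-level
    co-occurrence matrix (horizontal neighbour offset by default)."""
    q = (gray.astype(float) / 256 * levels).astype(int)  # quantize to `levels` bins
    glcm = np.zeros((levels, levels))
    h, w = q.shape
    for yy in range(h - dy):
        for xx in range(w - dx):
            glcm[q[yy, xx], q[yy + dy, xx + dx]] += 1
    glcm /= glcm.sum()
    i, j = np.indices(glcm.shape)
    contrast = ((i - j) ** 2 * glcm).sum()            # high for abrupt local changes
    homogeneity = (glcm / (1 + np.abs(i - j))).sum()  # high for uniform regions
    return contrast, homogeneity

smooth = np.full((32, 32), 100, dtype=np.uint8)       # uniform region
rough = (np.indices((32, 32)).sum(axis=0) % 2) * 255  # checkerboard pattern
c_s, _ = glcm_features(smooth)
c_r, _ = glcm_features(rough)
print(c_s < c_r)  # → True: contrast separates smooth from rough texture
```

Such scalar texture features can then be concatenated with colour features to form the fused colour–texture representations described above.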

Reduction in review time
In the previous sections, detection of abnormalities has been identified as an indirect approach to reduce the review time. Other approaches can be classified into two categories: data mining and visualization approaches (Table 3).

Data mining
Data mining aims to discover implicit information from large collections of data. A pioneering data-mining method for reducing the review time aims to detect the most representative frames in a VCE video.59,60 Instead of detecting specific abnormalities, it detects frames with content that deviates from that of most of the frames in a segment of video. Evaluation of this method revealed

a capacity for up to 85% frame reduction without loss of abnormality detection.59,60 Subsequently, less computationally complex approaches were proposed that resulted in a frame reduction of >90%.61,62 However, evaluation of similar approaches on larger datasets indicates that the accuracy for detection of the most representative frames is rather low (66%).63 The time needed to review the video can also be reduced by removing frames depicting only intestinal content (such as debris and fluids) that are not informative.64–67 These frames can be detected by pattern recognition methods (similar to those used for abnormality detection). Although their accuracy can be high (99.3% on 61,427 video frames),67 results in terms of overall frame reduction have not been reported. Another approach to reducing the number of VCE frames that need to be assessed is removal of redundant video frames. When the capsule endoscope moves slowly, many redundant images are recorded. Hence, consecutive frames with similar colour and texture can be removed.68 Removal of redundant frames has also been based on estimations of the capsule endoscope motion.69,70 Motion is estimated by comparing consecutive VCE frames with respect to the position of salient pixels. These pixels are automatically detected on the basis of their colour intensity and are tracked from frame to frame using their surrounding textural content.71 In instances of nonsignificant motion, the frames are characterized as redundant and are discarded.
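Redundancy removal by comparing the colour content of consecutive frames can be sketched with a histogram-intersection test; the similarity threshold below is an illustrative assumption, not a value from the cited studies:

```python
import numpy as np

def colour_hist(frame, bins=8):
    """Normalized 3D colour histogram of an RGB frame."""
    h, _ = np.histogramdd(frame.reshape(-1, 3), bins=(bins,) * 3,
                          range=((0, 256),) * 3)
    h = h.ravel()
    return h / h.sum()

def drop_redundant(frames, threshold=0.9):
    """Keep a frame only if its colour histogram differs enough from the
    last kept frame (histogram intersection below `threshold`)."""
    kept = [frames[0]]
    for f in frames[1:]:
        similarity = np.minimum(colour_hist(kept[-1]), colour_hist(f)).sum()
        if similarity < threshold:   # sufficiently different: keep it
            kept.append(f)
    return kept

rng = np.random.default_rng(1)
still = rng.integers(0, 256, size=(8, 8, 3))        # capsule not moving
moved = np.full((8, 8, 3), 200, dtype=np.uint8)     # abrupt content change
video = [still, still.copy(), still.copy(), moved]
print(len(drop_redundant(video)))  # → 2: duplicates of the still frame removed
```

The motion-based methods described above replace this global colour test with tracking of salient pixels, but the keep/discard logic is analogous.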
A representative motion-based method was evaluated on 100-frame VCE clips, which resulted in a statistically significant compression ratio of 68%.69 Another method was evaluated using 30,000 video frames obtained from three patients and resulted in a 52.3% reduction of near-duplicate frames.70 The removal of redundant video frames, combined with the removal of noninformative frames, achieved an overall reduction of 65% in a dataset of six VCE video sequences.72 Other studies have addressed the problem of reducing the review time indirectly, using automatic detection of the transition points between the different parts of


Table 2 | State-of-the-art lesion detection methods for VCE

| Study | Lesions | Number of images in largest dataset (patients) | Features | Best average accuracy (%) | Best average sensitivity/specificity (%) |
|---|---|---|---|---|---|
| Cui et al. (2010)41 | Lymphangiectasias | 7,218 (18) | Colour intensity | 90.0 | NA |
| Ciaccio et al. (2010)42 | Coeliac disease | 21,000 (21) | Colour intensity, time | NA | 80.0/96.0 |
| Kumar et al. (2013)56 | Crohn's disease | 533 (47) | Colour, texture | 92.0 | NA |
| Hwang (2011)53 | Polyps | 120 | Colour intensity, texture | 80.0 | 90.0/70.0 |
| Karargyris & Bourbakis (2011)36* | Polyps | 100 | Colour, texture, shape | NA | 96.2/70.2 |
| Li & Meng (2011)48 | Polyps | 1,200 (10) | Colour–texture | 91.6 | NA |
| Romain et al. (2013)44 | Polyps | 1,500 | Texture, shape | NA | 91.0/95.2 |
| David et al. (2013)57 | Polyps | 30,540 | Colour, shape | NA | 80.0/65.0 |
| Mamonov et al. (2014)58 | Polyps | 18,900 (5) | Colour, shape | NA | 81.0/90.0 |
| Li et al. (2011)50 | Tumours | 1,200 (10) | Colour–texture | 90.5 | 92.3/88.7 |
| Li & Meng (2012)49 | Tumours | 1,200 | Colour–texture | 92.4 | NA |
| Li & Meng (2012)45 | Tumours | 1,200 | Colour–texture | 83.5 | 84.7/82.3 |
| Chen et al. (2013)47 | Tumours | 75 | Texture | 100 | NA |
| Charisis et al. (2012)46 | Ulcers | 174 (6) | Colour–texture | 95.4 | 90.9/89.4 |
| Yu et al. (2012)52 | Ulcers | 344 (60) | Colour–texture | 89.5 | 99.2/80.0 |
| Chen & Lee (2012)54 | Ulcers | 272 | Colour, texture | 92.6 | 83.3/98.8 |
| Karargyris & Bourbakis (2012)36* | Ulcers | 100 | Colour, texture, shape | NA | 75.0/73.3 |
| Szczypinski et al. (2014)31‡ | Ulcers | 613 (50) | Colour–texture | NA | 83.0/94.0 |
| Chen et al. (2013)51 | Hookworms | 1,700 (10) | Colour, texture | 88.7 | 84.5/93.0 |
| Iakovidis & Koulaouzidis (2014)17‡ | Several lesions§ | 1,370 (252) | Colour | 94.0; 89.2 (AUROC) | 95.4/82.9 |

*Deals with both polyp and ulcer detection. ‡Deals also with haemorrhage detection. §Polyps, chylous cysts, nodular lymphangiectasias, villous oedemas, stenoses, ulcers, aphthae, and three types of angioectasias characterized by different degrees of bleeding potential (P0–P2).43 A missing number of images or patients implies unavailability of the respective data. Abbreviations: AUROC, area under receiver operating characteristic; NA, not available; VCE, video capsule endoscopy.

the gastrointestinal tract, for example, the oesophagogastric junction, the pylorus and the ileocaecal valve.73,74 On average, the time reduction was estimated at 15 min, which is approximately equal to the average time required for manual detection of the transition points.

Visualization
Automatically constructed dictionaries representing associations between textural elements (called textons) of colour VCE images and respective classes of video contents have been proposed for automatic detection of uninformative video frames and the construction of a visualization to detect clinically relevant segments of the video.75 Although this method is still at an early stage, the preliminary results indicate a capacity for a considerable overall reduction of uninformative frames. Commercial software76 provides a simple but effective visualization that enables simultaneous evaluation of multiple consecutive frames, with the potential for a proportional reduction of the reading times. However, this potential is limited by human perception capabilities.77 Furthermore, the validity of some quick visualization approaches has been questioned,78,79 as informative VCE frames could be skipped. Frame skipping could be overcome with adaptive control of the video display.80 In this approach, a VCE

video can be played back at a high frame rate in stable, smooth frame sequences to save time. Another method is based on epitomes that are constructed from a set of consecutive frames.81 Panoramic visualization has also proved efficient.82 Panoramic images are automatically generated from multiple consecutive video frames by stitching consecutive VCE frames together at their matching points.83 By repeating this process for consecutive clusters of frames, a new video composed of fewer, panoramic, frames (Figure 1c) is produced. The resulting video offers a broader field of view and includes a considerably smaller number of frames than the original video. The overall frame reduction achieved is 85.6% (Table 3);82 therefore, a reduction of the reviewing time of this order is feasible. The effect of such a reduction on the reviewer's performance (such as in abnormality detection) is still open for investigation.
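The automatic detection of transition points discussed earlier in this section can be illustrated by smoothing noisy per-frame classifier outputs and locating the first sustained change; the labels and window size below are illustrative assumptions, not data from the cited studies:

```python
import numpy as np

def locate_transition(frame_labels, window=5):
    """Estimate the frame index where the capsule passes from one organ to
    the next (e.g. through the pylorus) by majority-smoothing noisy per-frame
    classifier outputs (0 = stomach, 1 = small bowel) and finding the first
    sustained switch."""
    labels = np.asarray(frame_labels, dtype=float)
    kernel = np.ones(window) / window
    smoothed = np.convolve(labels, kernel, mode="same")
    switched = np.flatnonzero(smoothed > 0.5)
    return int(switched[0]) if switched.size else None

# Noisy classifier output: true transition at frame 10, with spurious flips
labels = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1]
print(locate_transition(labels))  # → 10
```

Isolated misclassifications (such as the flip at frame 2) are suppressed by the smoothing window, which is why the estimated transition lands at the first sustained run of small-bowel labels.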

Capsule localization
Accurate knowledge of the position of the capsule endoscope (and thus the position of an abnormality) in the gastrointestinal tract is fundamental for several reasons, including localization of lesions for further follow-up examinations and/or surgical intervention;84 determination of the insertion route for device-assisted enteroscopes to avoid repeated attempts on invasive


Table 3 | State-of-the-art methods for reducing the VCE video review time

| Study | Method | Number of images in largest dataset (patients) | Features | Best average results: metric | Value |
|---|---|---|---|---|---|
| Iakovidis et al. (2010)60 | Most representative frames detection | 14,000 (8) | Colour intensity | Overall frame reduction (%) | 85.0 |
| Zhao & Meng (2011)61 | Most representative frames detection | 2,000 (10) | Colour, texture, shape | Compression ratio (%) | 96.4 |
| Yuan & Meng (2013)62 | Most representative frames detection | 1,200 (12) | Colour, texture, shape | Compression ratio (%) | 91.3 |
| Ismail et al. (2013)63 | Most representative frames detection | 96,963 | Colour, texture | Accuracy (%) | 66.0 |
| Fan et al. (2011)64 | Uninformative frame removal | 500 | Colour | Sensitivity/specificity (%) | 76.4/87.5 |
| HajiMaghsoudi et al. (2012)65 | Uninformative frame removal | 400 (52) | Colour | Accuracy (%); sensitivity/specificity (%) | 93.7; 95.1/92.7 |
| Segui et al. (2012)66 | Uninformative frame removal | 100,000 (50) | Colour, texture | Accuracy (%); sensitivity/specificity (%) | 91.6; 80.1/93.1 |
| Sun et al. (2012)67 | Uninformative frame removal | 61,427 | Colour, texture | Accuracy (%); sensitivity/specificity (%) | 99.3; 99.6/99.6 |
| Fu et al. (2012)68 | Redundancy reduction | 200 (3) | Colour, texture | Precision/recall (%) | 84.0/100.0 |
| Chen et al. (2012)72 | Redundancy reduction | 164,000 (6) | Colour, texture | Overall frame reduction (%) | 65.0 |
| Liu et al. (2013)69 | Redundancy reduction | >100 | Colour intensity, texture, motion | Compression ratio (%) | 68.0 |
| Lee et al. (2013)70 | Redundancy reduction | 30,000 (3) | Colour intensity, texture, motion | Reduction of near-duplicate frames (%) | 52.3 |
| Gallo & Granata (2010)75 | Uninformative frame removal, visualization | (10) | Colour, texture | Overall frame reduction (%) | 70 |
| Chu et al. (2010)81 | Epitomized visualization | 1,500 (7) | Colour | Overall frame reduction (%) | >90.0 |
| Iakovidis et al. (2013)82 | Panoramic visualization | (30) | Colour intensity, texture, motion | Overall frame reduction (%) | 85.6 |

A missing number of images or patients implies unavailability of the respective data. Abbreviation: VCE, video capsule endoscopy.

endoscopy;85 potential targeted drug delivery;86 and automated capsule endoscope navigation in future capsule endoscopies.87 Results obtained from relevant studies are summarized in Table 4. Commercially available methods for the localization of a capsule endoscope are typically based on external radiofrequency sensor arrays that receive the signals transmitted from the capsule. Following radiological validation in 27 healthy volunteers, the average 3D spatial localization error of one of the latest capsule endoscopes was found to be 13.26 cm³ (2.00 ± 1.64 cm in the x axis, 2.64 ± 2.39 cm in the y axis and 2.51 ± 1.83 cm in the z axis).88 3D localization is a newly added feature in commercially available VCE software. Previous studies have reported average position errors within the range of 3.7–11.4 cm.3,84 Other capsule localization methods have been proposed;84 however, most of them are based on external equipment such as magnetic sensors and MRI devices. Software methods for localization of capsule endoscopes that are independent of any external equipment are based on video analysis techniques. Currently, such techniques include topographic video segmentation and motion estimation approaches (Figure 2). Some of them have the potential to provide very accurate localization89 that is comparable to that achieved by the methods

exploiting external equipment; however, further improvements in accuracy are still required.

Topographic video segmentation
Video segmentation is the process of dividing a sequence of video frames into consecutive segments with coherent content. In this context, topographic video segmentation methods aim to divide the video into a number of consecutive segments corresponding to different parts of the gastrointestinal tract. Pioneering methods for localizing the capsule endoscope were based on supervised machine-learning algorithms that can recognize the different parts of the gastrointestinal tract and the transitions between them (such as the oesophagogastric junction, pylorus and ileocaecal valve)73,74,90 using features of colour, texture and motion. The results reported using the most comprehensive methods are considered to be references for comparisons (Table 4).73,74 A novel colour model of the gastrointestinal tract has been proposed to discriminate between the different parts of the gastrointestinal tract based solely on their colour.91 To avoid the time overhead introduced by the decompression of VCE video sequences, a colour-based method for topographic segmentation of compressed VCE video sequences has been proposed.92 Unsupervised topographic

6  |  ADVANCE ONLINE PUBLICATION

www.nature.com/nrgastro © 2015 Macmillan Publishers Limited. All rights reserved

Table 4 | State-of-the-art methods for software-based capsule localization

Study | Method | Number of images in largest dataset (patients) | Features | Best results: metric | Best results: value(s)
Mackiewicz et al. (2008)73 | Video segmentation | (76) | Colour, texture, motion | Median error for oesophagogastric junction; pylorus; ileocaecal valve (frames) | 8; 91; 285
Cunha et al. (2008)74 | Video segmentation | (60) | Colour, texture | Median error for oesophagogastric junction; pylorus; ileocaecal valve (frames) | 2; 287; 1,057
Vu et al. (2010)91 | Video segmentation | (50) | Colour | Median error for pylorus; ileocaecal valve (frames) | 105; 319
Marques et al. (2011)92 | Video segmentation | 500 | Colour | Overall accuracy (%) | 85.2
Shen et al. (2012)93 | Video segmentation | (10) | Colour, texture | Accuracy for oesophagogastric junction; pylorus; ileocaecal valve (%) | 99.9; 98.3; 94.7
Zhou et al. (2013)94 | Video segmentation | (3) | Colour, texture | Precision/recall for pylorus; ileocaecal valve (%) | 91.4/88.5; 90.4/97.3
Bao et al. (2012)100 | Motion estimation | Simulation | Colour intensity, texture, motion | Average rotation error (°) | 1.2
Bao & Pahlavan (2013)103 | Motion estimation | 25,600 (1) | Colour, texture | Average accuracy (%) | 92.7
Bao et al. (2013)102 | Motion estimation | Emulation | Colour intensity, texture, motion | Average displacement error (cm); rotation error (°); tilt error (°) | 0.04; 1.8; 3.0
Spyrou & Iakovidis (2014)101 | Motion estimation | Simulation | Colour intensity, texture, motion | Average relative displacement error; rotation error (°) | 0.002; 1.0

Missing number of images or patients implies unavailability of the respective data.

VCE segmentation can be achieved using colour and texture features extracted from salient pixels.93 A supervised approach that is less complex than the unsupervised approach has been reported to achieve higher accuracy, but the dataset on which it was evaluated was small, so these results must be interpreted with caution.94 Overall, topographic approaches for VCE segmentation tend to be more accurate in the localization of the oesophagogastric junction and less accurate for the pylorus (Table 4).

Motion estimation
Algorithms for estimating camera motion can also be used for the localization of the capsule endoscope within the gastrointestinal tract using visual features (without external sensor arrays).69,70 This approach is used in visual odometry.89,95 Its advantage over wheel odometry96,97 is that it is not affected by wheel slip or other adverse conditions and it can provide more accurate trajectory estimates.89 This technique can also be used for estimating the velocity of the capsule endoscope.98 To enhance the outcomes of the currently employed sensor-based capsule localization, software localization of the capsule endoscope based on visual features was first addressed by a method capable of estimating only the rotation of the capsule endoscope.99,100 The feasibility of software-based localization of the capsule endoscope has been investigated with respect to both rotation and displacement by two studies that used two different approaches for estimating motion.101,102 These methods were evaluated by emulation and simulation experiments

because of the lack of gold standard rotation and displacement measurements. An emulation testbed was created by bending, twisting and painting a plastic tube to resemble the small intestine.102 The simulation testbed was based on periodically acquired images from publicly available video clips that were artificially rotated and scaled to simulate capsule endoscope rotation, forward and backward motion.101 The advantage of the emulation is that it enables measurements to be taken directly in centimetres, whereas the simulation enables only relative measurements. However, the fact that the emulation testbed was custom-made prohibits the reproducibility of the experiments. Another motion estimation method was proposed to roughly infer the orientation of the capsule endoscope by automatically recognizing whether it is facing the tunnel or the surface of the gastrointestinal tract.103 Before these types of methods can be used in clinical practice,101 several issues, such as the presence of intestinal content and the motility of the gastrointestinal tract, need to be addressed.
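The simulation-style evaluation described above can be illustrated with a minimal sketch: estimating the in-plane rotation of the capsule between two frames by circularly aligning intensity profiles sampled around the frame centre. This is a deliberately simple toy, not the published methods;101,102 the function names and the profile-correlation approach are assumptions made for illustration only.

```python
import numpy as np

def angular_profile(frame, radius=20, n_angles=360):
    """Sample pixel intensities on a circle around the frame centre."""
    cy, cx = frame.shape[0] / 2.0, frame.shape[1] / 2.0
    theta = np.deg2rad(np.arange(n_angles))
    ys = np.clip((cy + radius * np.sin(theta)).astype(int), 0, frame.shape[0] - 1)
    xs = np.clip((cx + radius * np.cos(theta)).astype(int), 0, frame.shape[1] - 1)
    return frame[ys, xs]

def estimate_rotation(frame_a, frame_b, radius=20):
    """Estimate the in-plane rotation (in degrees) between two frames as the
    circular shift that best correlates their angular intensity profiles."""
    pa = angular_profile(frame_a, radius)
    pb = angular_profile(frame_b, radius)
    scores = [float(np.dot(pa, np.roll(pb, -s))) for s in range(len(pa))]
    return int(np.argmax(scores))
```

In a simulation experiment of the kind cited above, a frame would be synthetically rotated by a known angle and the estimate compared against that ground truth.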

Intestinal motility assessment
Disturbances in motility and transit are common in functional gastrointestinal disorders.104 As a result of the heterogeneous and overlapping nature of symptoms, it is often difficult to identify which gastrointestinal region is mainly affected, so a comprehensive evaluation of gastrointestinal transit can be useful for planning appropriate management strategies. The assessment of presumed disorders of small bowel motility is a major clinical


Figure 2 | Capsule localization methods implemented in software. a | Topographic segmentation methods can automatically identify the different parts of the gastrointestinal tract (oesophagus, stomach, small intestine and colon, separated by the oesophagogastric junction, pylorus and ileocaecal valve). b | Methods to estimate motion can approximate the displacement and the rotation of the capsule within a part of the gastrointestinal tract by automatically identifying corresponding pixels between consecutive video frames. For example, pixel A(n) corresponds to A(n + 1), which can be tracked across a whole frame sequence. These two kinds of methods can complement each other, in the sense that topographic segmentation can provide a rough localization of the capsule, which can be refined by motion estimation.

challenge. The current gold standard is small bowel manometry, but this examination is a complex, invasive procedure and requires expertise that is only available in a few referral centres worldwide.105 Furthermore, whole gut and regional transit time can be measured with different techniques, including scintigraphy, breath tests and radio-opaque markers. These techniques are limited by their availability, exposure to radiation and lack of standardization between centres.106 The relationship between capsule transit and intestinal motility has been investigated in several studies.106,107 For example, using conventional VCE, the transit time from the duodenum to the caecum in 34 healthy volunteers was estimated to be 206 ± 11 min (with the capsule reaching the caecum in all participants), whereas the capsule reached the caecum of only five of 19 patients with manometric criteria of dysmotility, with a transit time of 248 ± 38 min.107 A study published in 2014 reported the results of a wireless motility capsule106 that has pressure, pH and temperature sensors with dedicated software. This capsule was validated for the assessment of gastrointestinal transit, demonstrating normal transit-time ranges of 2–5 h for gastric emptying, 2–6 h for small bowel transit and 10–59 h for colonic transit, giving a whole-gut transit time of 14–70 h. Observable motility events include phasic and tonic contractions.108 The former are characterized by a sudden closing of the intestinal lumen followed by a subsequent opening, whereas the latter are events in which the intestinal lumen remains completely closed for an undetermined period of time. Both the frequency

and the type of contractions are altered in several small intestinal diseases.109 Automatic inspection, detection and interpretation of motility events can be based on a two-step video analysis process that involves detection of VCE video frames containing possible contractions on the basis of differences in the mean colour intensity between consecutive frames, followed by isolation of a subset of actual contractions using supervised texture-based pattern recognition.110 The method was later improved by a cascade of three classifiers for separate detection, and then rejection, of frames containing turbid material, intestinal wall and lumen views.111 Another approach was based on the automatic recognition of the wrinkling of the intestinal folds during a contraction.112 The wrinkle patterns are represented by a set of novel features that are characteristic of the centrality of the intestinal folds and their colour. Motility inspection can be facilitated by a VCE video visualization method (Figure 1e), called the adaptive cut longitudinal view.113 The longitudinal view can be obtained by 'cutting' a line of pixels across a VCE video frame and repeating this action for a sequence of frames. This method maximizes the probability of cutting the frames through the lumen to preserve motility information and provides a compact display and fast inspection of motility for VCE.
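The first step of the two-step contraction-detection process described above can be sketched as a simple intensity-difference screen over consecutive frames; the threshold value here is an arbitrary assumption, and the published method110 then applies texture-based pattern recognition to the surviving candidates.

```python
import numpy as np

def contraction_candidates(frames, threshold=10.0):
    """Flag frames whose mean intensity differs abruptly from that of the
    previous frame, as candidate contraction events (first screening step)."""
    means = np.array([float(np.mean(f)) for f in frames])
    jumps = np.abs(np.diff(means))
    # +1 because diff[i] compares frame i with frame i + 1
    return (np.flatnonzero(jumps > threshold) + 1).tolist()
```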

Enhancement of video quality
The hardware limitations of commercially available VCE platforms result in deterioration of the quality of the output video.3 Several image and video processing algorithms (not tailored to a specific VCE platform) have been proposed to ameliorate these limitations.114–121 Insufficient mucosal illumination results in images with dark regions whose content cannot be adequately distinguished. Contrast enhancement should not be uniform and should avoid a consequent amplification of noise (for example, originating from the image sensor and the wireless transmission system of the capsule endoscope). To this end, anisotropic114 and homomorphic115 image filtering techniques that depend on manually adjusted parameters have been proposed. Considering the importance of colour for the assessment of abnormalities,37 contrast enhancement should also preserve the colour tones of VCE images.116,117 Suppression of noise has been addressed by applying general filtering methods.118,119 Blurring can be caused either by abrupt motion of the capsule endoscope or by intestinal content interfering with the camera. An image-restoration algorithm was proposed to deblur the images.119 The limited frame-rate of the capsule endoscope video sequences could be artificially increased by the application of an algorithm that introduces more frames, with content that is predicted by interpolation.120 The resolution of each video frame could also be artificially increased by the application of super-resolution algorithms.121 These algorithms usually combine the redundant information contained in consecutive video frames to produce new frames with increased resolution. The performance of the algorithms is expected to improve if the video frame-rate is increased, as the


Table 5 | Publicly available databases with gastrointestinal endoscopy data

Database | Data quality | Annotations
CapsuleEndoscopy.org (Given Imaging)150 | Full resolution, lossy compression | Free text
KID151 | Full resolution, near-lossless compression | Graphic, free text, semantic
Capview database152 | Low resolution, lossy compression | Free text, semantic
Gastrolab153 | Low and full resolution, lossy compression | Free text
Atlas of World Endoscopy Organization (WEO)154 | Full resolution, lossy compression | Free text
El Salvador Atlas of Gastrointestinal Endoscopy155 | Full resolution, lossy compression | Free text
Atlas of Gastrointestinal Endoscopy156 | Low resolution, lossy compression | Free text, semantic

The first three databases are dedicated to VCE, whereas the rest are more general.

consecutive frames will be more similar to each other; therefore, they will carry more redundant information than the frames of the current, low frame-rate capsule endoscopy video. Panoramic and 3D visualization techniques have been used to enhance the reviewers' perception of the gastrointestinal tract during a review session.122–125 The process for constructing a panoramic image involves motion estimation using techniques that are similar to the ones used for reducing the review time.122 To improve the viewing of the digestive tract, a 3D visualization technique has been proposed that leads to a more qualitative and efficient examination than the use of previous methods (Figure 1d).123,124 This technique is based on the principle that the shape of a visible object in a scene can be retrieved from its shading. A feasibility study has yielded promising results with respect to the applicability of this method in clinical practice.125
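As a toy illustration of the interpolation idea for increasing the frame-rate, intermediate frames can be synthesized by blending temporal neighbours; the published approaches use motion-compensated prediction rather than this naive linear blend, so the sketch below is an assumption-laden simplification.

```python
import numpy as np

def upsample_frame_rate(frames, factor=2):
    """Insert (factor - 1) linearly blended frames between each pair of
    consecutive frames; naive temporal upsampling without motion compensation."""
    out = []
    for a, b in zip(frames[:-1], frames[1:]):
        out.append(a)
        for k in range(1, factor):
            w = k / factor
            out.append((1.0 - w) * a + w * b)  # weighted blend of neighbours
    out.append(frames[-1])
    return out
```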

Data management and availability
Every VCE procedure results in large quantities of data, and managing such data can be extremely time-consuming. As enhancements of capsule endoscope systems are continually being released, the volume of data will continue to increase. Thus, novel techniques are necessary for efficient management of VCE data. The efficient management of the accumulating amount of data can be facilitated by cloud computing platforms126 and big data technologies that are capable of coping with very large datasets.127 Cloud platforms enable ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources such as networks, servers, storage, applications and services.128 Algorithms to preserve the quality of compressed videos exploit particular properties of the VCE videos, such as their specific colour range,129 which could further enhance the efficiency of storage and communication. In addition, methods to summarize VCE videos could be considered, such as a video summarization-based tele-endoscopy service that has been proposed for efficient management of VCE data.130 Sharing data from VCE is currently difficult as most VCE reading software only supports specialized storage formats that are specific to the vendor of the software. Thus, individuals who use different software are often unable to share their data easily. Most vendors provide options to export files in standard image and video

formats such as JPEG and MPEG. However, the exported data usually have a reduced resolution, and these formats include compression algorithms that discard information (that is, lossy algorithms), which can cause considerable deterioration in the quality of the data. This deterioration can introduce ambiguities in the diagnostic process and obstacles for automated image and/or video analysis, for example, dealing with compression artefacts. Vendors should take these considerations seriously and collaborate on the adoption of a common, standard VCE media format that is capable of accommodating not only videos but also structured annotations, such as the standards proposed for other medical imaging modalities.131 The requirements of both the clinical and IT communities need to be taken into account. The annotation tools provided with the VCE software of different vendors are manual and have similar functionalities. The manual annotation of large volumes of VCE images and videos is time-consuming. Nonetheless, such large-scale data are necessary for the training and sufficient evaluation of intelligent software systems, such as systems to detect abnormalities. Labelling of video frames, accurate delineation of the boundaries of each visible structure of interest (for example, abnormalities) and labelling of these structures is necessary and can be achieved with versatile annotation tools.17,132 Application-specific annotation tools have also been proposed for interactive annotation of VCE video sequences.133 Public sharing of anonymized and annotated VCE image and video data can contribute not only to VCE training but also to the development of reference datasets for the evaluation of computational methods for VCE video analysis, such as methods to detect abnormalities. Currently, such public data resources are limited (Table 5).
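To make the notion of structured annotations concrete, a record combining a graphic annotation (a lesion boundary) with a semantic one (a terminology label) might look like the following; every field name here is invented for illustration and is not part of any existing VCE standard.

```python
import json

# Hypothetical structured annotation record; all field names are
# illustrative assumptions, not an existing VCE media format.
annotation = {
    "frame_index": 1842,
    "reviewer": "anonymised-01",
    "graphic": {  # delineation of the lesion boundary, in pixel coordinates
        "type": "polygon",
        "points": [[120, 88], [135, 90], [138, 104], [122, 101]],
    },
    "semantic": {  # label drawn from a standard terminology
        "term": "angiectasia",
        "confidence": "high",
    },
}

# Serializing to JSON yields a vendor-neutral, machine-readable record.
record = json.dumps(annotation)
```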
The retrieval of VCE video sequences or images from VCE databases can be based on their semantic tags, their textual annotations, or even directly on their content. For instance, an example image can be used as a query to retrieve images or videos with pathologies that visually resemble the content of the query image (Figure 3). This approach is known as content-based image or video retrieval,134,135 and only a limited number of studies deal with its application in the domain of gastrointestinal endoscopy.136–138 Such systems are based on video analysis; therefore, efficient VCE content-based image or video retrieval could be facilitated by compressed-domain-analysis methods.92
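The query-by-example idea can be sketched with a global colour-histogram descriptor and a simple distance ranking. This is a deliberately minimal stand-in for the content-based retrieval systems cited above; the descriptor, distance and function names are assumptions for illustration.

```python
import numpy as np

def colour_histogram(img, bins=8):
    """Concatenated per-channel intensity histogram, normalized to unit sum."""
    h = np.concatenate([
        np.histogram(img[..., c], bins=bins, range=(0, 256))[0]
        for c in range(img.shape[-1])
    ]).astype(float)
    return h / h.sum()

def retrieve(query, database, top_k=3):
    """Return indices of database images ranked by L1 histogram distance
    to the query image (smallest distance first)."""
    q = colour_histogram(query)
    dists = [float(np.abs(q - colour_histogram(img)).sum()) for img in database]
    return list(np.argsort(dists)[:top_k])
```

In practice such a descriptor would be precomputed and indexed for every stored image, so that only the query image needs to be analysed at retrieval time.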



Although large datasets are desirable, their annotation remains a barrier, as it requires a considerable amount of effort from expert VCE reviewers (usually more than one reviewer per dataset to enable assessment of their interobserver agreement). However, the restricted availability of annotated datasets is a limiting factor for essential progress in developing methods to detect abnormalities as well as in the context of the rest of the research areas identified in this Review.


Figure 3 | Example of the content-based image or video retrieval concept for video capsule endoscopy. a | The user queries the system with an image of a lesion. b–d | The system retrieves patient cases with similar images from the hospital's database. By consulting the diagnoses reported by other physicians for the retrieved cases, a more reliable evaluation of the image could be obtained.

Challenges for essential progress
Progress
The development of VCE139 triggered a wave of scientific publications both in the medical and the broader IT domain (Figure 4a). The research areas that have received the most interest have also changed over time (Figure 4b). Automatic haemorrhage detection in VCE was the first issue that caught the interest of IT scientists, whereas lesion detection, video review-time reduction and quality enhancement have been the focus of the past ~10 years. An increasing trend is observed for methods related to VCE data management, which can be justified by the wider adoption of VCE, triggering a consequent data increase, as well as by the need for efficient image and video annotation methods to support the collection of information for intelligent VCE software systems. Publication activity on methods to detect abnormalities has undergone a steep increase in the past 5 years (>50%); however, relevant progress in the clinic has been limited. The dataset sizes used for evaluation before 2010 were between 50 and 87,258 images, obtained from small cohorts of 5–172 patients.12 Between 2010 and 2014, the respective numbers were 75–30,540 images and 3–252 patients (Tables 1 and 2). These tables also show that most current methods still deal with the detection of only a few types of abnormalities.17 Figure 5 summarizes the results reported in Tables 1 and 2 in comparison with those reported between 2000 and 2010.

Technical challenges
Clinical experience needs to be used more than it is currently to inspire novel computational methods. Most of the current methods are new uses of pre-existing approaches from other fields. For example, generic methods for extracting image features are usually applied for detecting lesions, without even being modified to consider the particularities of the clinical problem under investigation, such as the diversity of the lesion manifestations and the environment in which they can be found. Realistic experimentation for clinically important results should involve multiple full-length videos. As these videos are so large, the computational resources of an average researcher are usually insufficient for such a large-scale experiment. This limitation is one of the reasons that, currently, most studies are performed using only subsets of VCE data. Therefore, computational efficiency should be considered as a requirement in dealing with large-scale data analysis and experimentation. The demand for clinically relevant results also requires that methods for analysing images and videos should become robust in the presence of intestinal content, which is usually discarded from the datasets of current studies. Technical challenges to the development of new software for VCE include the development of methods with increased sensitivity to abnormalities of a very small size (only a few pixels) as well as methods that can detect a broad spectrum of pathologies. Furthermore, methods to reduce the review time should preserve all the clinically relevant information to eliminate the chances of missing abnormalities.
Many studies on this topic suggest that a large number of video frames are discarded as uninformative, redundant or not representative.60–67 In such cases, the experimentation phase should demonstrate zero loss of abnormalities60 and the VCE software should not totally discard the possibly irrelevant frames; instead, it should provide a visualization that enables these frames to be inspected more quickly than relevant frames. The requirements for computational efficiency in such software are increased as a result of the real-time constraints posed by clinical workflows and the reviewers' reaction times.4,5,8,17 The process of developing clinically viable VCE localization software is challenging. Algorithms for estimating motion that are used for tracking the capsule endoscope should become more robust so that they can withstand the motility and deformability of the gastrointestinal tract. Although current methods have reported low error rates in the estimation of small displacements and rotations of the capsule,101 novel approaches should be devised for the reduction or correction of the error that is accumulated


Figure 4 | Scientific publication activity on video capsule endoscopy, based on analytics from 4,870 articles in the Scopus database. a | The yearly number of publications (2000–2014) on video capsule endoscopy in the subject areas of medicine, computer science and engineering (labelled 'IT'). The subset of the respective IT articles that deal with computational methods implemented in software is labelled 'Software'; dashed lines represent the respective polynomial trend lines. b | Percentage of publications per research area on video capsule endoscopy software (haemorrhage detection, lesion detection, review-time reduction, capsule localization, intestinal motility assessment, video quality enhancement and data management) over the intervals 2000–2004, 2005–2009 and 2010–2014. Abbreviation: IT, information technology.

as the capsule travels increasing distances along the gastrointestinal tract. Techniques for topographic video segmentation can provide the anatomic landmarks from which the motion estimation algorithms can be initialized, and thus reduce the accumulation of errors. Furthermore, the techniques could be extended or enhanced by automatic recognition of other anatomical landmarks within the gastrointestinal tract to improve localization.140,141 Software for localizing the capsule endoscope could be enhanced by the development of 3D models of the

gastrointestinal tract, for example, following a method that has been proposed for 3D reconstruction of the bladder.142 Challenges also arise with respect to the development of cloud platforms that enable big data analytics. Cloud platforms provide virtually unlimited storage that is accessible from anywhere in the world through shared computational resources; therefore, they are particularly suitable for the development of large repositories of VCE data with annotations from physicians all over the world. Challenges for the analysis of big data on VCE include the development of video-analysis algorithms that are highly scalable, enable parallel execution on multiple computers and provide time-efficient image or video indexing and retrieval. Effective use of big data analytics also requires large-scale image or video annotation. This requirement poses several challenges with respect to annotation efficiency, such as semiautomatic graphic and semantic annotation of VCE images. Such a tool would automatically pre-annotate VCE images and videos, so that physicians would only have to quickly correct the annotations suggested by the software, instead of annotating all the images from scratch. The development of this tool could also be facilitated by the development of a collaborative, but quality-controlled, crowdsourcing annotation platform. Considering that annotations are not only data but also knowledge provided by experts, large-scale management of knowledge is also an important perspective. For example, a challenge would be the standardization of how knowledge is represented, so that various VCE software tools can share the knowledge stored in a common cloud repository. Quality-preserving and efficient image and video compression is another direction of research from which both manual and automated VCE video analysis will benefit.
Currently, lossy compression algorithms, such as MPEG, are preferred because of the increased compression rates that can be achieved for storage; however, the use of improved lossless compression algorithms in the future could result in reduced error rates for both manual and computer-aided reviewing processes. Other challenges for research in VCE software will continue to arise as new kinds of capsule endoscopes are developed. These challenges include the development of intuitive, user-friendly interfaces for accurate steering of remote-control capsules, such as magnetically guided capsule endoscopes,143 and intelligent analysis of multiple signals from multi-sensor capsules that enable enhanced detection and characterization of disorders and abnormalities. Such capsules include motility capsules106 that could be enhanced with cameras in the future, and future prototypes that will combine multispectral and 3D imaging sensors for detection and localization of abnormalities.144
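The landmark-initialization idea raised earlier, using topographic segmentation to reset motion-estimation drift, can be sketched as one-dimensional dead reckoning with periodic corrections. This is purely illustrative; the landmark map, units and function names are assumptions, not a published method.

```python
def integrate_displacements(steps, landmarks=None):
    """Accumulate per-frame displacement estimates into a position track,
    resetting to a known position whenever a frame coincides with a
    recognized anatomical landmark (mapping frame index -> position in cm)."""
    landmarks = landmarks or {}
    position, track = 0.0, []
    for i, step in enumerate(steps):
        if i in landmarks:
            position = landmarks[i]  # landmark correction cancels accumulated drift
        else:
            position += step
        track.append(position)
    return track
```

The point of the sketch is that any per-step estimation error grows without bound under pure integration, but is bounded by the distance between consecutive landmarks when resets are available.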

Clinical challenges
Perhaps one of the most important challenges for the future is the creation of robust gold standard datasets for training and evaluating intelligent VCE software: for example, image and video datasets with annotations from experts that indicate where the lesions are located (graphic annotations) and what kind of lesions they are using a


Figure 5 | Comparison of results from studies detecting abnormalities that are discussed in this Review with the results reported between 2000 and 2010.12 The vertical bars indicate overall ranges of accuracy, sensitivity and specificity obtained from various methods for the detection of haemorrhage, Crohn's disease, polyps, tumours and ulcers. Single points indicate no variance. Missing bars indicate unavailable data due to inconsistent reporting of results across studies.

standard terminology (semantic annotations). The limited availability of such data on the internet (Table 5), which is mainly attributable to ethical and legal considerations,145 is a major obstacle to the progress of research on computational methods for the automation of VCE processes. Considering that the process of annotating images and videos is time-consuming, the discovery of cost-efficient methods to motivate clinicians in this task is also a challenge. For instance, if time-efficient software tools become available, the annotation process could be introduced into everyday clinical practice. The clinical relevance of the studies that address software systems for supporting medical decision-making is largely determined by the contribution of clinicians to the selection of representative datasets and the evaluation of the obtained results. Studies that compare the performance of clinicians with and without the use of these systems56 can provide essential information about their clinical applicability. As it is usually difficult to engage a sufficient number of reviewers for such an evaluation task, the inclusion of results on publicly shared databases could encourage clinicians to become involved. Other clinical challenges include the construction of realistic models of the intestine and systematic in vivo experimentation in animals and humans. The development of intelligent systems that are capable of accurately recognizing lesions, localizing the abnormality and, potentially, transmitting interactive data is highly desirable. These intelligent systems should enable actuation and therapeutic capabilities, eventually leading to the next generation of capsule robots.146

To our knowledge, the majority of medical practitioners are not familiar with the full potential of commercially available VCE reviewing software. Another clinical misconception relates to the perceived accuracy of localization and virtual chromoendoscopy software. With regards to the accuracy of localization, a 2D or 3D visualization of the capsule endoscope trajectory is obtained by means of triangulation of the radiofrequency signal strength at the same antennas that receive the image data. However, its accuracy and clinical use have been questioned on several occasions;3,84 therefore, an improved software localization technique is necessary to enable progress. Of note, hardly any studies have been carried out on the relevant tools provided with VCE reading software. With regards to chromoendoscopy, VCE data are still limited and discordant.4,147,148 Virtual chromoendoscopy in VCE might have a role in improving the definition of lesions that had previously been diagnosed using white light, but its effect on detection and outcome is still unclear. In other words, virtual chromoendoscopy applied to VCE enables the endoscopist to see and feel 'better', but not necessarily to see 'more'.149

Conclusions

Since its first appearance, VCE has fascinated researchers from both the medical and the IT domain, leading to the investigation of novel clinical and technical pathways. Despite an ever-increasing number of publications, the actual progress with respect to VCE software can only be considered marginal. Essential progress can be achieved by sharing data and knowledge, together with close collaboration between medical and IT scientists. Such collaboration should involve cycles of constructive discussions and systematic knowledge transfer from medical to IT scientists and vice versa, as well as engagement in common scientific efforts and publications that address both the IT and the clinical aspects involved. State-of-the-art studies promise clinically viable intelligent software systems that are capable of reducing diagnostic errors and effort, localizing abnormalities and assessing intestinal motility, at the same time as providing enhanced video quality for improved content perception. In light of the growing volume of VCE data, cloud and big data technologies will soon become necessary for efficient data management.

Review criteria
Scopus was the main database searched for articles published between 2000 and August 2014, including 'capsule endoscopy' as a search term. Technical articles were refined by limiting searches to the subject areas of 'computer science' and 'engineering'. Publication activity analysis is based on the online tools provided by Scopus. Articles not indexed in Scopus were searched in PubMed and Google Scholar. References cited in review articles were also searched and relevant websites were accessed. Considering the substantial volume of novel contributions over the past 5 years, relevant (full) articles from international conference proceedings, with algorithm validation details, were included.

12  |  ADVANCE ONLINE PUBLICATION

www.nature.com/nrgastro © 2015 Macmillan Publishers Limited. All rights reserved

References

1. Wang, A. et al. Wireless capsule endoscopy. Gastrointest. Endosc. 78, 805–815 (2013).
2. Fisher, L. R. & Hasler, W. L. New vision in video capsule endoscopy: current status and future directions. Nat. Rev. Gastroenterol. Hepatol. 9, 392–405 (2012).
3. Ciuti, G., Menciassi, A. & Dario, P. Capsule endoscopy: from current achievements to open challenges. IEEE Rev. Biomed. Eng. 4, 59–72 (2011).
4. Koulaouzidis, A., Rondonotti, E. & Karargyris, A. Small-bowel capsule endoscopy: a ten-point contemporary review. World J. Gastroenterol. 19, 3726–3746 (2013).
5. Lo, S. K. How should we do capsule reading? Tech. Gastrointest. Endosc. 8, 146–148 (2006).
6. Eliakim, R. & Magro, F. Imaging techniques in IBD and their role in follow-up and surveillance. Nat. Rev. Gastroenterol. Hepatol. 11, 722–736 (2014).
7. Zheng, Y., Hawkins, L., Wolff, J., Goloubeva, O. & Goldberg, E. Detection of lesions during capsule endoscopy: physician performance is disappointing. Am. J. Gastroenterol. 107, 554–560 (2012).
8. Rondonotti, E. et al. Can we improve the detection rate and interobserver agreement in capsule endoscopy? Dig. Liver Dis. 44, 1006–1011 (2012).
9. Lewis, B., Eisen, G. & Friedman, S. A pooled analysis to evaluate results of capsule endoscopy trials. Endoscopy 39, 303–308 (2005).
10. Karkanis, S. A., Iakovidis, D. K., Maroulis, D. E., Magoulas, G. D. & Theofanous, N. Tumor recognition in endoscopic video images using artificial neural network architectures. In Proc. 26th Euromicro Conference Vol. 2, 423–429 (2000).
11. Karkanis, S. A., Iakovidis, D. K., Maroulis, D. E., Karras, D. A. & Tzivras, M. Computer-aided tumor detection in endoscopic video using color wavelet features. IEEE Trans. Inf. Technol. Biomed. 7, 141–152 (2003).
12. Liedlgruber, M. & Uhl, A. Computer-aided decision support systems for endoscopy in the gastrointestinal tract: a review. IEEE Rev. Biomed. Eng. 4, 73–88 (2011).
13. Fisher, M. & Mackiewicz, M. in Color Medical Image Analysis Vol. 6 (eds Celebi, M. E. & Schaefer, G.) 129–144 (Springer, 2013).
14. Bovik, A. C. Handbook of Image and Video Processing (Academic Press, 2010).
15. Nixon, M. S. & Aguado, A. S. Feature Extraction and Image Processing for Computer Vision (Academic Press, 2012).
16. Theodoridis, S. & Koutroumbas, K. Pattern Recognition (Academic Press, 2008).
17. Iakovidis, D. K. & Koulaouzidis, A. Automatic lesion detection in capsule endoscopy based on color saliency: closer to an essential adjunct for reviewing software. Gastrointest. Endosc. 80, 877–883 (2014).
18. Iakovidis, D. K. Software engineering applications in gastroenterology. Global J. Gastroenterol. Hepatol. 2, 11–18 (2014).
19. Rockey, D. C. Occult and obscure gastrointestinal bleeding: causes and clinical management. Nat. Rev. Gastroenterol. Hepatol. 7, 265–279 (2010).
20. Buscaglia, J. M. et al. Performance characteristics of the suspected blood indicator feature in capsule endoscopy according to indication for study. Clin. Gastroenterol. Hepatol. 6, 298–301 (2008).
21. Park, S. C. et al. Sensitivity of the suspected blood indicator: an experimental study. World J. Gastroenterol. 18, 4169–4174 (2012).
22. D’Halluin, P. N. et al. Does the “Suspected Blood Indicator” improve the detection of bleeding lesions by capsule endoscopy? Gastrointest. Endosc. 61, 243–249 (2005).
23. Boulougoura, M., Wadge, E., Kodogiannis, V. & Chowdrey, H. S. Intelligent systems for computer-assisted clinical endoscopic image analysis. In Proc. 2nd IASTED International Conference on Biomedical Engineering 405–408 (2004).
24. Lv, G., Yan, G. & Wang, Z. Bleeding detection in wireless capsule endoscopy images based on color invariants and spatial pyramids using support vector machines. In Proc. Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) 6643–6646 (2011).
25. Sainju, S., Bui, F. M. & Wahid, K. A. Automated bleeding detection in capsule endoscopy videos using statistical features and region growing. J. Med. Syst. 38, 25 (2014).
26. Fu, Y., Zhang, W., Mandal, M. & Meng, M. Q. Computer-aided bleeding detection in WCE video. IEEE J. Biomed. Health Inform. 18, 636–642 (2014).
27. Hwang, S., Oh, J., Cox, J., Tang, S. J. & Tibbals, H. F. Blood detection in wireless capsule endoscopy using expectation maximization clustering. In Proc. SPIE: Medical Imaging 61441P (2006).
28. Jung, Y. S. et al. Automatic patient-adaptive bleeding detection in a capsule endoscopy. In Proc. SPIE: Medical Imaging 72603T (2009).
29. Mäenpää, T. & Pietikäinen, M. Classification with color and texture: jointly or separately? Pattern Recognit. 37, 1629–1640 (2004).
30. Mackiewicz, M. W., Fisher, M. & Jamieson, C. Bleeding detection in wireless capsule endoscopy using adaptive colour histogram model and support vector classification. In Proc. SPIE: Medical Imaging 69140R (2008).
31. Szczypinski, P., Klepaczko, A., Pazurek, M. & Daniel, P. Texture and color based image segmentation and pathology detection in capsule endoscopy videos. Comput. Methods Programs Biomed. 113, 396–411 (2014).
32. Pan, G., Yan, G., Qiu, X. & Cui, J. Bleeding detection in wireless capsule endoscopy based on probabilistic neural network. J. Med. Syst. 35, 1477–1484 (2011).
33. Figueiredo, I. N., Kumar, S., Leal, C. & Figueiredo, P. N. Computer-assisted bleeding detection in wireless capsule endoscopy images. Comput. Methods Biomech. Biomed. Eng. Imaging Vis. 1, 198–210 (2013).
34. Fawcett, T. An introduction to ROC analysis. Pattern Recognit. Lett. 27, 861–874 (2006).
35. Alotaibi, S., Qasim, S., Bchir, O. & Ismail, M. M. Empirical comparison of visual descriptors for multiple bleeding spots recognition in wireless capsule endoscopy video. Computer Analysis of Images and Patterns 8048, 402–407 (2013).
36. Karargyris, A. & Bourbakis, N. Detection of small bowel polyps and ulcers in wireless capsule endoscopy videos. IEEE Trans. Biomed. Eng. 58, 2777–2786 (2011).
37. Tanaka, M. et al. A new instrument for measurement of gastrointestinal mucosal color. Dig. Endosc. 8, 139–146 (1996).
38. Kudo, S. et al. Colonoscopic diagnosis and management of nonpolypoid early colorectal cancer. World J. Surg. 24, 1081–1090 (2000).
39. Maroulis, D. E., Iakovidis, D. K., Karkanis, S. A. & Karras, D. A. CoLD: a versatile detection system for colorectal lesions in endoscopy videoframes. Comput. Methods Programs Biomed. 70, 151–166 (2003).


40. Häfner, M. et al. Computer-assisted pit-pattern classification in different wavelet domains for supporting dignity assessment of colonic polyps. Pattern Recognit. 42, 1180–1191 (2009).
41. Cui, L. et al. Detection of lymphangiectasia disease from wireless capsule endoscopy images with adaptive threshold. In Proc. 8th World Congress on Intelligent Control and Automation 3088–3093 (2010).
42. Ciaccio, E. J., Tennyson, C. A., Bhagat, G., Lewis, S. K. & Green, P. H. Classification of videocapsule endoscopy image patterns: comparative analysis between patients with celiac disease and normal individuals. Biomed. Eng. Online 9, 44 (2010).
43. Saurin, J. C. et al. Diagnostic value of endoscopic capsule in patients with obscure digestive bleeding: blinded comparison with video push-enteroscopy. Endoscopy 35, 576–584 (2003).
44. Romain, O. et al. Towards a multimodal wireless video capsule for detection of colonic polyps as prevention of colorectal cancer. In Proc. 13th International Conference on Bioinformatics and Bioengineering 1–6 (2013).
45. Li, B. P. & Meng, M. Q. Comparison of several texture features for tumor detection in CE images. J. Med. Syst. 36, 2463–2469 (2012).
46. Charisis, V. S., Hadjileontiadis, L. J., Liatsos, C. N., Mavrogiannis, C. C. & Sergiadis, G. D. Capsule endoscopy image analysis using texture information from various colour models. Comput. Methods Programs Biomed. 107, 61–74 (2012).
47. Chen, G., Bui, T. D., Krzyzak, A. & Krishnan, S. Small bowel image classification based on Fourier–Zernike moment features and canonical discriminant analysis. Pattern Recognit. Image Analysis 23, 211–216 (2013).
48. Li, B. & Meng, M. Q. Automatic polyp detection for wireless capsule endoscopy images. Expert Syst. Appl. 39, 10952–10958 (2012).
49. Li, B. & Meng, M. Q. Tumor recognition in wireless capsule endoscopy images using textural features and SVM-based feature selection. IEEE Trans. Inf. Technol. Biomed. 16, 323–329 (2012).
50. Li, B., Meng, M. Q. & Lau, J. Y. Computer-aided small bowel tumor detection for capsule endoscopy. Artif. Intell. Med. 52, 11–16 (2011).
51. Chen, H., Chen, J., Peng, Q., Sun, G. & Gan, T. Automatic hookworm image detection for wireless capsule endoscopy using hybrid color gradient and contourlet transform. In Proc. 6th International Conference on Biomedical Engineering and Informatics 116–120 (2013).
52. Yu, L., Yuen, P. C. & Lai, J. Ulcer detection in wireless capsule endoscopy images. In Proc. 21st International Conference on Pattern Recognition 45–48 (2012).
53. Hwang, S. Bag-of-visual-words approach to abnormal image detection in wireless capsule endoscopy videos. Advances in Visual Computing 6939, 320–327 (2011).
54. Chen, Y. & Lee, J. Ulcer detection in wireless capsule endoscopy video. In Proc. 20th ACM International Conference on Multimedia 1181–1184 (2012).
55. Sikora, T. The MPEG-7 visual standard for content description – an overview. IEEE Trans. Circuits Syst. Video Technol. 11, 696–702 (2001).
56. Kumar, R. et al. Assessment of Crohn’s disease lesions in wireless capsule endoscopy images. IEEE Trans. Biomed. Eng. 59, 355–362 (2012).
57. David, E., Boia, R., Malaescu, A. & Carnu, M. Automatic colon polyp detection in endoscopic capsule images. In Proc. International Symposium on Signals, Circuits and Systems 1–4 (2013).


58. Mamonov, A. V., Figueiredo, I. N., Figueiredo, P. N. & Tsai, Y. H. Automated polyp detection in colon capsule endoscopy. IEEE Trans. Med. Imaging 33, 1488–1502 (2014).
59. Iakovidis, D., Tsevas, S., Maroulis, D. & Polydorou, A. Unsupervised summarisation of capsule endoscopy video. In Proc. 4th International IEEE Conference Vol. 1, 3–15 (2008).
60. Iakovidis, D. K., Tsevas, S. & Polydorou, A. Reduction of capsule endoscopy reading times by unsupervised image mining. Comput. Med. Imaging Graph. 34, 471–478 (2010).
61. Zhao, Q. & Meng, M. H. A strategy to abstract WCE video clips based on LDA. In Proc. IEEE International Conference on Robotics and Automation 4145–4150 (2011).
62. Yuan, Y. & Meng, M. Q. Hierarchical key frames extraction for WCE video. In Proc. IEEE International Conference on Mechatronics and Automation 225–229 (2013).
63. Ismail, M., Bchir, O. & Emam, A. Z. Endoscopy video summarization based on unsupervised learning and feature discrimination. IEEE Xplore [online], http://dx.doi.org/10.1109/VCIP.2013.6706410 (2013).
64. Fan, Y., Meng, M. H. & Li, B. A novel method for informative frame selection in wireless capsule endoscopy video. In Proc. Annual International Conference of the IEEE Engineering in Medicine and Biology Society 4864–4867 (2011).
65. HajiMaghsoudi, O., Talebpour, A., Soltanian-Zadeh, H. & Soleimani, H. A. Automatic informative tissue’s discriminators in WCE. In Proc. IEEE International Conference on Imaging Systems and Techniques 18–23 (2012).
66. Segui, S. et al. Categorization and segmentation of intestinal content frames for wireless capsule endoscopy. IEEE Trans. Inf. Technol. Biomed. 16, 1341–1352 (2012).
67. Sun, Z., Li, B., Zhou, R., Zheng, H. & Meng, M. Q. Removal of non-informative frames for wireless capsule endoscopy video segmentation. In Proc. IEEE International Conference on Automation and Logistics 294–299 (2012).
68. Fu, Y. et al. Key-frame selection in WCE video based on shot detection. In Proc. 10th World Congress on Intelligent Control and Automation 5030–5034 (2012).
69. Liu, H. et al. Wireless capsule endoscopy video reduction based on camera motion estimation. J. Digit. Imaging 26, 287–301 (2013).
70. Lee, H. G., Choi, M. K., Shin, B. S. & Lee, S. C. Reducing redundancy in wireless capsule endoscopy videos. Comput. Biol. Med. 43, 670–682 (2013).
71. Bay, H., Ess, A., Tuytelaars, T. & Van Gool, L. Speeded-up robust features (SURF). Comput. Vis. Image Underst. 110, 346–359 (2008).
72. Chen, Y., Lan, Y. & Ren, H. Trimming the wireless capsule endoscopic video by removing redundant frames. In Proc. 8th International Conference on Wireless Communications, Networking and Mobile Computing 1–4 (2012).
73. Mackiewicz, M., Berens, J. & Fisher, M. Wireless capsule endoscopy color video segmentation. IEEE Trans. Med. Imaging 27, 1769–1781 (2008).
74. Cunha, J. S., Coimbra, M., Campos, P. & Soares, J. M. Automated topographic segmentation and transit time estimation in endoscopic capsule exams. IEEE Trans. Med. Imaging 27, 19–27 (2008).
75. Gallo, G. & Granata, E. WCE video segmentation using textons. Proc. SPIE http://dx.doi.org/10.1117/12.840690.
76. Given Imaging. Wireless capsule endoscopy software [online], http://www.givenimaging.com/en-int/Innovative-Solutions/Capsule-Endoscopy/Software/Pages/default.aspx (2014).
77. Koulaouzidis, A., Iakovidis, D. K., Karargyris, A. & Plevris, J. N. Optimizing lesion detection in small-bowel capsule endoscopy: from present problems to future solutions. Expert Rev. Gastroenterol. Hepatol. 9, 217–235 (2015).
78. Günther, U., Daum, S., Zeitz, M. & Bojarski, C. Capsule endoscopy: comparison of two different reading modes. Int. J. Colorectal Dis. 27, 521–525 (2012).
79. Koulaouzidis, A., Smirnidis, A., Douglas, S. & Plevris, J. N. QuickView in small-bowel capsule endoscopy is useful in certain clinical settings, but QuickView with Blue Mode is of no additional benefit. Eur. J. Gastroenterol. Hepatol. 24, 1099–1104 (2012).
80. Vu, H. et al. Controlling the display of capsule endoscopy video for diagnostic assistance. IEICE Trans. Inf. Syst. 92, 512–528 (2009).
81. Chu, X. et al. Epitomized summarization of wireless capsule endoscopic videos for efficient visualization. Med. Image Comput. Comput. Assist. Interv. 13, 522–529 (2010).
82. Iakovidis, D. K., Spyrou, E. & Diamantis, D. Efficient homography-based video visualization for wireless capsule endoscopy. In Proc. 13th International Conference on Bioinformatics and Bioengineering 1–4 (2013).
83. Szeliski, R. Image alignment and stitching: a tutorial. Foundations and Trends in Computer Graphics and Vision 2, 1–104 (2006).
84. Than, T. D., Alici, G., Zhou, H. & Li, W. A review of localization systems for robotic endoscopic capsules. IEEE Trans. Biomed. Eng. 59, 2387–2399 (2012).
85. Li, X., Chen, H., Dai, J., Gao, Y. & Ge, Z. Predictive role of capsule endoscopy on the insertion route of double-balloon enteroscopy. Endoscopy 41, 762–766 (2009).
86. Pedersen, P. B., Bar-Shalom, D., Baldursdottir, S., Vilmann, P. & Müllertz, A. Feasibility of capsule endoscopy for direct imaging of drug delivery systems in the fasted upper-gastrointestinal tract. Pharm. Res. 31, 1–10 (2014).
87. van der Stap, N., van der Heijden, F. & Broeders, I. A. Towards automated visual flexible endoscope navigation. Surg. Endosc. 27, 3539–3547 (2013).
88. Marya, N., Karellas, A., Foley, A., Roychowdhury, A. & Cave, D. Computerized 3-dimensional localization of a video capsule in the abdominal cavity: validation by digital radiography. Gastrointest. Endosc. 79, 669–674 (2014).
89. Scaramuzza, D. & Fraundorfer, F. Visual odometry [tutorial]. IEEE Robot. Autom. Mag. 18, 80–92 (2011).
90. Berens, J., Mackiewicz, M. & Bell, D. Stomach, intestine, and colon tissue discriminators for wireless capsule endoscopy images. Proc. SPIE 5747, Medical Imaging: Image Processing http://dx.doi.org/10.1117/12.594799.
91. Vu, H. et al. Color analysis for segmenting digestive organs in VCE. In Proc. 20th International Conference on Pattern Recognition (ICPR) 2468–2471 (2010).
92. Marques, N., Dias, E., Cunha, J. & Coimbra, M. Compressed domain topographic classification for capsule endoscopy. In Proc. Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) 6631–6634 (2011).
93. Shen, Y., Guturu, P. & Buckles, B. P. Wireless capsule endoscopy video segmentation using an unsupervised learning approach based on probabilistic latent semantic analysis with scale invariant features. IEEE Trans. Inf. Technol. Biomed. 16, 98–105 (2012).


94. Zhou, R., Li, B., Zhu, H. & Meng, M. Q. A novel method for capsule endoscopy video automatic segmentation. In Proc. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 3096–3101 (2013).
95. Nistér, D., Naroditsky, O. & Bergen, J. Visual odometry. In Proc. IEEE Computer Society Conference 1–652 (2004).
96. Karargyris, A. & Koulaouzidis, A. Capsule-odometer: a concept to improve accurate lesion localisation. World J. Gastroenterol. 19, 5943 (2013).
97. Karargyris, A. & Koulaouzidis, A. OdoCapsule: next generation wireless capsule endoscopy with accurate localization and video stabilization. IEEE Trans. Biomed. Eng. http://dx.doi.org/10.1109/TBME.2014.2352493.
98. Szczypinski, P. M., Sriram, R. D., Sriram, P. V. & Reddy, D. N. A model of deformable rings for interpretation of wireless capsule endoscopic videos. Med. Image Anal. 13, 312–324 (2009).
99. Liu, L., Hu, C., Cai, W. & Meng, M. H. Capsule endoscope localization based on computer vision technique. In Proc. Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) 3711–3714 (2009).
100. Bao, G., Ye, Y., Khan, U., Zheng, X. & Pahlavan, K. Modeling of the movement of the endoscopy capsule inside GI tract based on the captured endoscopic images. In Proc. IEEE International Conference on Modeling, Simulation and Visualization Methods (MSV) Vol. 12 (2012).
101. Spyrou, E. & Iakovidis, D. K. Video-based measurements for wireless capsule endoscope tracking. Meas. Sci. Technol. 25, 015002 (2014).
102. Bao, G., Mi, L. & Pahlavan, K. Emulation on motion tracking of endoscopic capsule inside small intestine. In Proc. 14th International Conference on Bioinformatics and Computational Biology, Las Vegas (2013).
103. Bao, G. & Pahlavan, K. Motion estimation of the endoscopy capsule using region-based kernel SVM classifier. In Proc. IEEE International Conference on Electro/Information Technology (EIT) 1–5 (2013).
104. Talley, N. J. Decade in review—FGIDs: ‘functional’ gastrointestinal disorders—a paradigm shift. Nat. Rev. Gastroenterol. Hepatol. 11, 649–650 (2014).
105. Rodriguez, L. & Nurko, S. in Clinical Management of Intestinal Failure (eds Duggan, C. P., Gura, K. M. & Jaksic, T.) 31 (2011).
106. Lee, Y. Y., Erdogan, A. & Rao, S. S. How to assess regional and whole gut transit time with wireless motility capsule. J. Neurogastroenterol. Motil. 20, 265–270 (2014).
107. Malagelada, C. et al. New insight into intestinal motor function via noninvasive endoluminal image analysis. Gastroenterology 135, 1155–1162 (2008).
108. Kellow, J. E. et al. Principles of applied neurogastroenterology: physiology/motility–sensation. Gut 45 (Suppl. 2), II17–II24 (1999).
109. Hansen, M. Small intestinal manometry. Physiol. Res. 51, 541–556 (2002).
110. Spyridonos, P., Vilariño, F., Vitria, J. & Radeva, P. in Advanced Concepts for Intelligent Vision Systems (eds Blanc-Talon, J., Philips, W., Popescu, D. & Scheunders, P.) 531–537 (Springer, 2005).
111. Vilarino, F. et al. Intestinal motility assessment with video capsule endoscopy: automatic annotation of phasic intestinal contractions. IEEE Trans. Med. Imaging 29, 246–259 (2010).
112. Segui, S. et al. Detection of wrinkle frames in endoluminal videos using betweenness centrality measures for images. IEEE J. Biomed. Health Inform. 18, 1831–1838 (2014).


113. Drozdzal, M. et al. Adaptable image cuts for motility inspection using WCE. Comput. Med. Imaging Graph. 37, 72–80 (2013).
114. Li, B. & Meng, M. Q. Wireless capsule endoscopy images enhancement via adaptive contrast diffusion. J. Vis. Commun. Image Represent. 23, 222–228 (2012).
115. Ramaraj, M., Raghavan, S. & Khan, W. A. Homomorphic filtering techniques for WCE image enhancement. In Proc. IEEE International Conference on Computational Intelligence and Computing Research (ICCIC) 1–5 (2013).
116. Vu, H. et al. in Abdominal Imaging: Computational and Clinical Applications (eds Yoshida, H., Sakas, G. & Linguraru, M. G.) 35–43 (Springer, 2012).
117. Okuhata, H., Nakamura, H., Hara, S., Tsutsui, H. & Onoye, T. Application of the real-time Retinex image enhancement for endoscopic images. In Proc. 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) 3407–3410 (2013).
118. Gopi, V. P. & Palanisamy, P. Capsule endoscopic image denoising based on double density dual tree complex wavelet transform. Int. J. Imag. Robot. 9, 48–60 (2013).
119. Liu, H., Lu, W. S. & Meng, M. H. De-blurring wireless capsule endoscopy images by total variation minimization. In Proc. IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PacRim) 102–106 (2011).
120. Karargyris, A. & Bourbakis, N. An elastic video interpolation methodology for wireless capsule endoscopy videos. In Proc. IEEE International Conference on BioInformatics and BioEngineering (BIBE) 38–43 (2010).
121. Häfner, M., Liedlgruber, M. & Uhl, A. POCS-based super-resolution for HD endoscopy video frames. In Proc. Computer-Based Medical Systems 185–190 (2013).
122. Spyrou, E., Diamantis, D. & Iakovidis, D. K. Panoramic visual summaries for efficient reading of capsule endoscopy videos. In Proc. 8th International Workshop on Semantic and Social Media Adaptation and Personalization (SMAP) 41–46 (2013).
123. Rondonotti, E. et al. Utility of 3-dimensional image reconstruction in the diagnosis of small-bowel masses in capsule endoscopy (with video). Gastrointest. Endosc. 80, 642–651 (2014).
124. Karargyris, A. & Bourbakis, N. Three-dimensional reconstruction of the digestive wall in capsule endoscopy videos using elastic video interpolation. IEEE Trans. Med. Imaging 30, 957–971 (2011).
125. Koulaouzidis, A. et al. Three-dimensional representation software as image enhancement tool in small-bowel capsule endoscopy: a feasibility study. Dig. Liver Dis. 45, 909–914 (2013).

126. d’Orazio, L. et al. Multimodal and multimedia image analysis and collaborative networking for digestive endoscopy. IRBM 35, 88–93 (2014).
127. Genta, R. M. & Sonnenberg, A. Big data in gastroenterology research. Nat. Rev. Gastroenterol. Hepatol. 11, 386–390 (2014).
128. Mell, P. & Grance, T. The NIST definition of cloud computing. The ACM Digital Library [online], http://dl.acm.org/citation.cfm?id=2206223 (2010).
129. Khan, T. & Wahid, K. Low-complexity colour-space for capsule endoscopy image compression. Electronics Letters 47, 1217–1218 (2011).
130. Mehmood, I., Sajjad, M. & Baik, S. W. Video summarization based tele-endoscopy: a service to efficiently manage visual data generated during wireless capsule endoscopy procedure. J. Med. Syst. 38, 1–9 (2014).
131. Torres, J. S., Damian Segrelles Quilis, J., Espert, I. B. & García, V. H. Improving knowledge management through the support of image examination and data annotation using DICOM structured reporting. J. Biomed. Inform. 45, 1066–1074 (2012).
132. Iakovidis, D., Goudas, T., Smailis, C. & Maglogiannis, I. Ratsnake: a versatile image annotation tool with application to computer-aided diagnosis. ScientificWorldJournal http://dx.doi.org/10.1155/2014/286856.
133. Drozdzal, M. et al. in Pattern Recognition and Image Analysis (eds Vitrià, J., Sanches, J. M. & Hernández, M.) 143–150 (Springer, 2011).
134. Müller, H. & Deserno, T. M. in Biomedical Image Processing (ed. Deserno, T. M.) 471–494 (Springer, 2011).
135. Hu, W., Xie, N., Li, L., Zeng, X. & Maybank, S. A survey on visual content-based video indexing and retrieval. IEEE Trans. Syst. Man Cybern. 41, 797–819 (2011).
136. Garaiman, D. D. & Saftoiu, A. A comparative study for methods of content search in multimedia databases with endoscopic images. Current Health Sci. J. 37, 86–88 (2011).
137. André, B., Vercauteren, T. & Ayache, N. in Medical Content-Based Retrieval for Clinical Decision Support (eds Müller, H., Greenspan, H. & Syeda-Mahmood, T.) 12–23 (Springer, 2012).
138. Wu, X. W., Yang, Y. B. & Yu, W. Y. Content-based medical image retrieval system for color endoscopic images. Advanced Mat. Res. 798, 1022–1025 (2013).
139. Iddan, G., Meron, G., Glukhovsky, A. & Swain, P. Wireless capsule endoscopy. Nature 405, 417 (2000).
140. Compton, C. C. et al. AJCC Cancer Staging Atlas 287–295 (Springer, 2012).
141. Carrion, A. F., Hindi, M., Molina, E. & Barkin, J. S. Ileal lines: a marker of the ileocecal valve on wireless capsule endoscopy. Gastrointest. Endosc. 79, 871–872 (2014).
142. Soper, T. D., Porter, M. P. & Seibel, E. J. Surface mosaics of the bladder reconstructed from endoscopic video for automated surveillance. IEEE Trans. Biomed. Eng. 59, 1670–1680 (2012).
143. Rey, J. F. et al. Blinded nonrandomized comparative study of gastric examination with a magnetically guided capsule endoscope and standard videoendoscope. Gastrointest. Endosc. 75, 373–381 (2012).
144. Iakovidis, D. K. et al. Towards intelligent capsules for robust wireless endoscopic imaging of the gut. In Proc. IEEE-IST Conference 95–100 (2014).
145. Hripcsak, G. et al. Health data use, stewardship, and governance: ongoing gaps and challenges: a report from AMIA’s Health Policy 2012 Meeting. J. Am. Med. Inform. Assoc. 21, 204–211 (2014).
146. Sliker, L. J. & Ciuti, G. Flexible and capsule endoscopy for screening, diagnosis and treatment. Expert Rev. Med. Devices 11, 649–666 (2014).
147. Aihara, H., Ikeda, K. & Tajiri, H. Image-enhanced capsule endoscopy based on the diagnosis of vascularity when using a new type of capsule. Gastrointest. Endosc. 73, 1274–1279 (2011).
148. Ryu, C. B., Song, J. Y., Lee, M. S. & Shim, C. S. Does capsule endoscopy with Alice improves visibility of small bowel lesions? Gastrointest. Endosc. 77 (Suppl.), AB466 (2013).
149. Spada, C., Hassan, C. & Costamagna, G. Virtual chromoendoscopy: will it play a role in capsule endoscopy? Dig. Liver Dis. 43, 927–928 (2011).
150. Given Imaging. Capsule endoscopy [online], http://www.capsuleendoscopy.org (2014).
151. Koulaouzidis, A. & Iakovidis, D. K. KID, a capsule endoscopy database for medical decision support [online], http://is-innovation.eu/kid (2014).
152. University of Aveiro. Capview [online], http://www.capview.org (2010).
153. Gastrolab [online], http://www.gastrolab.net (2014).
154. World Endoscopy Organization. WEO Clinical Endoscopy Atlas [online], http://www.endoatlas.org (2014).
155. El Salvador atlas of gastrointestinal endoscopy [online], http://www.gastrointestinalatlas.com (2014).
156. Atlas of gastroenterological endoscopy [online], http://www.endoskopischer-atlas.de (2014).

Author contributions

D.K.I. researched data for the article, contributed to discussion of the content, wrote the article and reviewed/edited the manuscript before submission. A.K. contributed to discussion of the content, wrote the article and reviewed/edited the manuscript before submission.

