Elibol et al.

Vol. 31, No. 4 / April 2014 / J. Opt. Soc. Am. A

773

Graph theory approach for match reduction in image mosaicing

Armagan Elibol,1 Nuno Gracias,2 Rafael Garcia,2 and Jinwhan Kim1,*

1 Division of Ocean Systems Engineering, Korea Advanced Institute of Science and Technology, Yuseong, Daejeon 305-701, South Korea
2 Computer Vision and Robotics Group, University of Girona, Campus Montilivi, Edifici P4, 17071 Girona, Spain
*Corresponding author: [email protected]

Received August 5, 2013; revised January 12, 2014; accepted February 14, 2014; posted February 14, 2014 (Doc. ID 194174); published March 19, 2014

One of the crucial steps in image mosaicing is global alignment, which requires finding the best image registration parameters by employing nonlinear minimization methods over correspondences between overlapping image pairs for a dataset. Based on graph theory, we propose a simple but efficient method to reduce the number of overlapping image pairs without any noticeable effect on the final mosaic quality. This reduction significantly lowers the computational cost of the image mosaicing process. The proposed method can be applied in a topology estimation process to reduce the number of image matching attempts. The method has been validated through experiments on challenging underwater image sequences obtained during sea trials with different unmanned underwater vehicles. © 2014 Optical Society of America

OCIS codes: (110.4153) Motion estimation and optical flow; (110.4155) Multiframe image processing; (150.0150) Machine vision.
http://dx.doi.org/10.1364/JOSAA.31.000773

1. INTRODUCTION

Over the past two decades, important advances in robotics and optical sensors have made it possible to obtain optical information from places that humans cannot easily reach such as the moon, Mars, and the deep ocean. This optical information is mainly used for creating maps, which are accurate static representations of spaces and are mainly used for navigation, exploration, or localization. Optical imaging provides higher resolution than acoustic imaging [1]. However, optical imaging in scattering media (e.g., underwater) presents several additional challenges because of the severe light absorption, illumination effects, noise, and lack of image contrast [2–4]. For the vast majority of applications, these limitations prevent the area of interest from being captured in a single image. This increases the need for methods that combine several overlapping images into a single image that gives an overall perspective of the area of interest. These methods are referred to as image mosaicing, and the final images are referred to as mosaics, which are among the most valuable sources of information in various applications such as geological surveys [5,6], archaeological surveys [7], ecological studies [8,9], environmental damage assessments [10], and the detection of temporal changes in the environment [11,12]. Therefore, creating optical maps establishes permanent visual records of areas of interest, which are essential inputs for different scientific communities.

The short-range nature of image acquisition causes inaccuracies in image registration [13]. These result in misalignments when images are mapped onto a common frame (usually referred to as the mosaic or the global frame). To deal with this problem, it is necessary to know the spatial relationship between images. These relationships are referred to as the

image topology and exist in the form of information about the robot trajectory and which image pairs overlap (whether consecutive in time or not). If the overlapping images are identified, then global alignment methods can be used to obtain a globally coherent mosaic image. Global alignment is related to the problem of estimating the image registration parameters that best comply with the constraints introduced by all overlapping image pairs. Global alignment is commonly dealt with through nonlinear minimization of an error term defined from the image correspondences detected on overlapping image pairs. This implies a high computational cost [14] for large-area surveys, which may contain hundreds to many thousands of images [6]. Thus, it would be beneficial to reduce the total number of overlapping image pairs used in the minimization of the cost function without disturbing the final mosaic quality. This would allow us to obtain mosaics at reduced computational cost.

In this paper, we propose a simple method that is based on graph theory and can be used in two different contexts. The first application is for reducing the total number of overlapping image pairs before employing the global alignment method. In this case, we assume that all overlapping image pairs have been identified a priori through pairwise image matching, either using some navigational sensor information of the vehicle (e.g., Doppler velocity log, ultrashort baseline) or using an all-against-all strategy. The second application is related to the topology estimation process itself. Topology estimation is the process by which overlapping image pairs are identified to obtain an accurate estimate of the robot trajectory. One of the main problems is to determine whether it is necessary to continue the search for new overlapping image pairs or, equivalently, to determine whether the trajectory


estimate will improve (i.e., become more accurate) if more overlapping image pairs are used. Our proposal in this paper aims to provide an answer to this question, and we present experimental results for challenging real underwater image sequences. To the best of our knowledge, there is no prior work in the literature to reduce the number of overlapping image pairs for estimating the topology in image mosaicing. The proposed method can be applied as an additional step that can be easily integrated into the existing topology estimation frameworks. By reducing the time devoted to image matching attempts, the method can improve the efficiency of any of the existing state-of-the-art methods for image mosaicing, reducing their computational cost and, thus, the time required for global alignment. This work focuses on showing the effectiveness and advantages of this step in two contexts, namely as a postprocessing step and as an integrated step within the well-known topology estimation framework based on bundle adjustment (BA). Experimental results are presented for both contexts.

Topology estimation methods are reviewed in Section 2. Next, Section 3 is devoted to details of the proposed method, and Section 4 illustrates and discusses some experimental results. Finally, Section 5 presents our conclusions.

2. RELATED WORK ON FEATURE-BASED IMAGE MOSAICING

Feature-based image mosaicing (FIM) relies on finding consistent corresponding points among image pairs and combining them into the mosaic reference frame. This can be divided into two main steps: pairwise alignment and global alignment. Whereas pairwise alignment is used to find the registration parameters between two overlapping images, global alignment is used to search for registration parameters that align images on a common frame, also known as the global frame, in order to have an overall view of the surveyed area. The quality constraints on image mosaics are very strict, especially for mapping purposes, as the mosaic may be used for global navigation [15], localization of areas of interest [6], or detection of temporal changes [11,12]. Hence, highly accurate image registration methods are needed. Even if highly accurate image registration methods are used, time-sequential registration over long image sequences is not enough to ensure reliable maps. When the trajectory of the camera revisits an area that has been imaged before, closed-loop trajectories appear. Commonly, mosaicing methods require automatic inference of the path topology in order to detect nonsequential overlapping images, which provide crucial input to global alignment methods for obtaining globally coherent mosaics. Global alignment methods compensate for the cumulative errors made in pairwise registration.

Several global alignment approaches have been presented over the past two decades [16–18]. Sawhney et al. [16] proposed a complete solution for image mosaicing in which the topology is iteratively estimated. The spatial consistency is improved by identifying and registering images not consecutive in time. The researchers used a graph representation where nodes represent images and edges represent overlapping areas between nodes. They constructed a graph by taking into account image positions with respect to the chosen global frame. 
This requires initial estimates of the image positions, which are obtained from the assumption that a time-consecutive pair of images has an


overlap. The initial estimation is performed by accumulating pairwise homographies. Potentially overlapping image pairs are generated by computing the normalized distances between image positions. This method performs well for an image sequence that is composed of relatively few images, or whose camera motion is very slight, as the initial estimate is likely to suffer less error accumulation, but such an image sequence is not possible when mapping a large environment. In the latter case, time-consecutive images may not have an overlap and, even if overlap exists, the initial estimation may suffer greatly from drift, resulting in errors while generating and filtering the possible overlapping image pairs. In terms of edge quality in the graph, Ila et al. [19] proposed a method that retains the most informative edges between robot poses by using mutual information within a context of simultaneous localization and mapping to maintain the sparsity of the system. Informative edges are defined as edges with higher mutual information, which provide a prediction of the uncertainty reduction of the system. This reduction is greater when closing a (bigger) loop. In batch mosaicing, however, all possible image pairs are considered as potential matches, which is a problem different from that of comparing the most recent image to all previous images as the robot moves. Some recent studies have used image-to-mosaic registration [20–22] to discard some images and select key frames [23–25] with the aim of real-time mosaicing (known as online mosaicing). These approaches are well suited to create panoramic mosaics from a video sequence in cases where the images have large overlaps, especially cases where camera translation is not allowed. If an error occurs in the online mosaicing while mapping one image onto the mosaic, future image-to-mosaic registrations are more likely to fail. However, large overlaps between images cannot be ensured in our application domain. 
Indeed, large-area surveys (usually taking from tens of hours to several days) tend to have minimal image overlap because the robot often navigates at relatively high speed to cover the maximum area within the allocated time. The amount of overlap is directly related to the vehicle dynamics (i.e., velocity, altitude, and so forth).

3. MATCH REDUCTION BASED ON GRAPH THEORY

Graph theory has been widely used in different scientific disciplines including biology (e.g., the spread of infectious diseases or the roles of proteins and genes) [26], electrical engineering (e.g., electrical circuits) [27], operational research (e.g., network-flow problems, transportation theory, scheduling, or game theory) [27], and several others [28–30]. A graph $G = (V, E)$ is composed of vertices (nodes) and the corresponding edges (links) between the vertices. Specifically, $V$ is the set of vertices and $E \subset V \times V$ is the set of edges. The total number of vertices $n = |V|$ defines the order of the graph, and the total number of edges $m = |E|$ defines the size of the graph [31]. A graph can be classified as directed or undirected, depending on whether the edges are ordered pairs or not. In many applications, a (generally positive) numerical value, called the weight or cost, can be assigned to each edge; graphs with weights are called weighted graphs. The shortest path between two vertices (or nodes) in a weighted graph is that along which the sum of the weights (costs) of the


Fig. 1. Basic steps of topology estimation process [15].

constituent edges is minimal. The density of the graph is defined as $D = \frac{2 \cdot m}{n \cdot (n-1)}$, which is the ratio between the number of existing edges and the total number of possible edges. A graph is generally called sparse if $D$ is close to zero.

We have modeled the topology of an image registration and mosaicing problem as a directed graph where images are vertices, and each successfully matched image pair represents an edge between nodes in the graph. As there will be only one edge between a pair of nodes, the graph should be called a simple graph. Furthermore, for each edge of the graph (i.e., for each successfully registered image pair), a weight can be computed as the variance of the error between correspondences once those are mapped to the same coordinate frame. Appendix A summarizes the formulations.

Our proposal is motivated by the desire to find and retain the edges that close a bigger loop while bounding a greater error. For each edge in the graph, we compute the shortest alternative path [32] between its nodes. Thus, we obtain detailed information (e.g., the total cost and the number of edges) for the shortest alternative paths. We denote by $f(a, b)$ the cost of the shortest alternative path between nodes $a$ and $b$, where $(a, b) \in E \wedge b > a$, and denote by $g(a, b)$ the total number of edges visited on the alternative path. As the ranges of $f$ and $g$ can vary widely, the values are rescaled to lie within the interval $[0, 1]$ and combined as $h(a, b) = w_1 \cdot f(a, b) + w_2 \cdot g(a, b)$, where $w_1 = w_2 = 1/2$ can be set by default. Then, we compute basic statistical measures over all $h$ values:

$$
\begin{aligned}
\mathrm{min} &= \min_{(a,b) \in E} h(a, b), \qquad \mathrm{max} = \max_{(a,b) \in E} h(a, b), \\
\mu &= \frac{1}{m} \sum_{(a,b) \in E} h(a, b), \qquad \sigma = \left( \frac{1}{m} \sum_{(a,b) \in E} \bigl( h(a, b) - \mu \bigr)^2 \right)^{1/2}.
\end{aligned}
\tag{1}
$$

If the cost of the shortest alternative path from node $a$ to node $b$ becomes very high and the path becomes very long, this is a strong indication that the edge between $a$ and $b$ is closing a (bigger) loop. Therefore, the algorithm keeps edges whose $h$ values are greater than a chosen threshold

$$
t_\alpha = \mu - \alpha \cdot \sigma, \tag{2}
$$

where $\alpha$ is an adjustable parameter and lies within the interval $\frac{\mu - \mathrm{max}}{\sigma} \le \alpha \le \frac{\mu - \mathrm{min}}{\sigma}$. The parameter $\alpha$ has a direct effect on the total number of edges used in the global alignment process. To select the appropriate value of $\alpha$, we observe the ratio between the number of edges whose $h$ cost is higher than $\mu$ and the total number of edges in the graph excluding the edges that have no alternative path. Since the topology is modeled with a directed graph, there will be some edges that are the only path from one node to another (e.g., time-consecutive images). These edges play an important role in keeping the whole trajectory together (if it is fully connected). Therefore, such edges are retained and taken into account during the global alignment process. The ratio is simply defined as follows:

$$
r = \frac{|A|}{m - (n - 1)}, \tag{3}
$$

where $|A|$ denotes the cardinality of the set $A = \{(a, b) \mid h(a, b) > \mu \wedge (a, b) \in E\}$, while $|E| = m$ is the total number of edges in the graph and $n - 1$ is the number of edges that have no alternative path under the assumption that the graph is fully connected. The motivation for using this ratio is that if the ratio is close to unity, most of the edges have a high alternative cost and high probability of closing a (big) loop, so there is little need to reduce the threshold by increasing $\alpha$. Conversely, if the ratio is close to zero, there is a greater need to reduce the threshold by increasing $\alpha$ in order not to disturb the final quality of the mosaic. As our application domain is mainly for providing high-resolution maps to various scientific users, the final quality of the mosaic is very important. Hence, we choose $\alpha$ in such a way that approximately 40%–50% of the edges are retained by taking into account the ratio defined in Eq. (3).

Obtaining the topology graph and the trajectory is referred to as topology estimation in the literature [15,16,33]. Generally,

Fig. 2. Example of discarded image.


Fig. 3. Sample images from the datasets used in the experiments.

Table 1. Summary of the Obtained Results with and without Proposed Reduction Algorithm

| Dataset | Method | Total Number of Edges | Total Number of Correspondences | Average Error (in pixels) | Standard Deviation (in pixels) | Total Time^a (in seconds) |
|---|---|---|---|---|---|---|
| Dataset 1 | Reduced | 2739 | 443,791 | 6.15 | 2.65 | 568.7 |
| Dataset 1 | Original | 5412 | 930,898 | 5.80 | 2.54 | 906.6 |
| Dataset 2 | Reduced | 2419 | 391,603 | 5.85 | 2.48 | 523.3 |
| Dataset 2 | Original | 3895 | 628,859 | 5.58 | 2.41 | 785.5 |
| Dataset 3 | Reduced | 1663 | 207,284 | 6.48 | 2.96 | 273.2 |
| Dataset 3 | Original | 3225 | 360,262 | 6.08 | 2.69 | 462.4 |
| Dataset 4 | Reduced | 1752 | 817,602 | 3.21 | 1.25 | 4933.2 |
| Dataset 4 | Original | 3906 | 1,699,784 | 2.67 | 1.00 | 15,847.8 |

^a The current implementation is not optimized. The execution times are included to provide an estimate of the time saving between the reduced datasets and the original datasets.

topology estimation frameworks proceed iteratively by refining the trajectory estimation, generating a list of potentially overlapping image pairs, and attempting to match a certain number (or all) of the image pairs from the list generated (see Fig. 1). At this point, we have integrated our proposed method into the topology estimation framework of [33]. Once the list of potentially overlapping image pairs has been generated, for each pair in the list we determine whether an alternative path exists with the edges currently in the graph. If there is an alternative path with a higher cost or there is no alternative path, we keep the pair in the list. Thus, the algorithm significantly reduces the total number of potentially overlapping pairs in the list. The topology estimation continues with the newly reduced list and stops when there are no new unattempted pairs in the list.
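The reduction step described in this section can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`alternative_path`, `rescale`, `reduce_edges`) and the toy graph are hypothetical, the alternative path is searched with Dijkstra's algorithm along directed edges from the lower to the higher image index, and the statistics of Eq. (1) are taken only over edges that have an alternative path. These choices are assumptions made for illustration.

```python
import heapq
from statistics import mean, pstdev


def alternative_path(adj, a, b):
    """Dijkstra from a to b that skips the direct edge (a, b).

    Returns (cost, number_of_edges) of the shortest alternative path,
    or None when no alternative path exists.
    """
    best = {a: 0.0}
    heap = [(0.0, 0, a)]
    while heap:
        cost, hops, u = heapq.heappop(heap)
        if u == b:
            return cost, hops
        if cost > best.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, ()):
            if u == a and v == b:
                continue  # ignore the edge being evaluated
            new_cost = cost + w
            if new_cost < best.get(v, float("inf")):
                best[v] = new_cost
                heapq.heappush(heap, (new_cost, hops + 1, v))
    return None


def rescale(values):
    """Min-max rescaling to [0, 1]; constant input maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def reduce_edges(edges, alpha=0.0, w1=0.5, w2=0.5):
    """edges: dict {(a, b): weight} with a < b. Returns the set of kept edges."""
    adj = {}
    for (a, b), w in edges.items():
        adj.setdefault(a, []).append((b, w))
    kept, scored = set(), []
    for (a, b) in edges:
        alt = alternative_path(adj, a, b)
        if alt is None:
            kept.add((a, b))  # only path between a and b: always retained
        else:
            scored.append(((a, b), alt))
    if not scored:
        return kept
    f = rescale([alt[0] for _, alt in scored])  # alternative path cost
    g = rescale([alt[1] for _, alt in scored])  # alternative path length
    h = [w1 * fi + w2 * gi for fi, gi in zip(f, g)]
    threshold = mean(h) - alpha * pstdev(h)     # t_alpha = mu - alpha * sigma
    kept.update(edge for (edge, _), hi in zip(scored, h) if hi > threshold)
    return kept


# Toy topology (assumed data): a chain 0-1-2-3-4-5 of time-consecutive
# pairs, two short-range overlaps, and one loop-closing edge (0, 5).
example_edges = {
    (0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (3, 4): 1.0, (4, 5): 1.0,
    (0, 2): 1.0, (3, 5): 1.0, (0, 5): 1.0,
}
kept_edges = reduce_edges(example_edges)
```

In this toy graph the time-consecutive edges have no alternative path and are always retained, whereas the short-range overlaps compete with the loop-closing edge through their $h$ values. Raising `alpha` lowers the threshold and retains more edges.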

4. EXPERIMENTAL RESULTS

The proposed approach has been tested on four different real datasets with two different scenarios. The first dataset was gathered by the ICTINEU AUV [34] underwater robot during sea experiments near Colera on the Mediterranean coast of Spain. The acquired dataset is composed of 430 low-resolution images (384 × 288 pixels) and covers approximately 400 m². Features were detected and matched between images by using the scale-invariant feature transform (SIFT) [35]. Then, the random sample consensus (RANSAC) [36] was applied for outlier rejection and to estimate the registration parameters between images. The total number of successfully registered overlapping image pairs is 5412. The second dataset was obtained using a Flea digital camera carried by a Phantom XTL unmanned underwater vehicle during a survey of a patch of reef located in the Florida Reef Tract [37]. This is composed of 1136 images with a resolution of 512 × 384 pixels and covers an area of approximately 220 m². The total number of overlapping image pairs is 3895. The third dataset was obtained during a survey over a coral reef patch [38] and is composed of 486

images with a resolution of 512 × 384 pixels covering approximately 80 m². The total number of overlapping pairs is 3225. There are five time-consecutive images for which the image registration method failed to find consistent correspondences because of the high navigation speed during the survey. These images violate the common assumption of overlap between time-consecutive images, which most of the existing methods invoke to obtain the topology. The fourth dataset was acquired by the Girona 500 AUV [39] during tests in the pool of the Underwater Robotics Center at the University of Girona. During the experiment, the floor of the pool was covered with a large poster, simulating an underwater environment. The dataset contains 286 images of 384 × 288 pixels covering an area of approximately 30 m². The total number of overlapping image pairs is 3906. In order to make quantitative comparison among the mosaics obtained by the methods tested, we registered individual images with respect to the image of the poster and used the resulting mosaic as the ground truth. Although 277 images were registered successfully, 9 images taken close to the borders of the poster had insufficient overlap for successful registration and were discarded (see Fig. 2 for an example image). Sample images from the four datasets are shown in Fig. 3.

Table 2. Comparison of Visual Quality Using UQI with Respect to Mosaics Obtained with All-Against-All Strategy

| Dataset | UQI, First-On-Top | UQI, Last-On-Top |
|---|---|---|
| Dataset 1 | 0.9415 | 0.9387 |
| Dataset 2 | 0.9620 | 0.9654 |
| Dataset 3 | 0.9420 | 0.9429 |
| Dataset 4^a | 0.7748 | 0.7724 |

^a Except for the last dataset. The visual comparison for the last dataset was done with the resulting mosaic of image-to-poster registration.

We assume that the optical axis of the down-looking camera is perpendicular to the scene, which can be accepted as almost planar taking into account the constant altitude of the camera and the limited three-dimensional (3D) relief of the scene compared with the altitude. The images were corrected for radial distortion. As a global alignment method, we use BA [40] to minimize the symmetric transfer error [41]. The cost function is defined as follows:

$$
\varepsilon = \frac{1}{s} \sum_{k} \sum_{t} \sum_{j=1}^{n} \left( \left\| {}^{k}x_j - {}^{1}H_k^{-1} \cdot {}^{1}H_t \cdot {}^{t}x_j \right\|_2 + \left\| {}^{t}x_j - {}^{1}H_t^{-1} \cdot {}^{1}H_k \cdot {}^{k}x_j \right\|_2 \right), \tag{4}
$$

where $k$ and $t$ are the indices of a pair of images that were successfully matched, $n$ is the number of correspondences between those overlapping images, $s$ is the total number of

Fig. 4. Mosaics of dataset 4 rendered with last-on-top strategy. (a) Resulting mosaic with all-against-all strategy. (b) Resulting mosaic with reduction. (c) Pixel-wise difference between mosaics.


correspondences for all successfully matched image pairs, and $({}^{1}H_k, {}^{1}H_t)$ are the homographies to transform the points of images $k$ and $t$, respectively, to the global frame. MATLAB was used to implement the large-scale nonlinear least squares minimization and the parameter estimation. A basic requirement for the minimization algorithm is computation of the Jacobian matrix containing the derivatives of all residuals with respect to all parameters. In our implementation, the Jacobian matrix is computed using analytical expressions that we derived symbolically. The initial value for minimization is obtained using the linear approach proposed in [42] over a reduced set of correspondences (e.g., five for each overlapping image pair). This initial value generally provides a good approximation that entails a low number of iterations by the minimization algorithm. Our comparison criterion is based on the average symmetric transfer error over all correspondences found by all-against-all image matching. The main reason to choose this error term is that it does not depend on the global frame selected. If the image transformation parameters are represented in any other coordinate frame, the reprojection error will remain the same. Therefore, the first image frame is selected as the global coordinate frame. Table 1 summarizes the results that were obtained with and without a reduction in the number of edges. The third column lists the total number of image pairs, and the fourth column lists the total number of correspondences that were used to minimize the error metric defined in Eq. (4). The fifth column lists the average reprojection error computed over all correspondences, while the sixth column lists the standard deviation of the error. The last column lists the computational time required by the minimization process as measured using the cputime function of MATLAB (in seconds). 
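To make the error term of Eq. (4) concrete, the following Python sketch evaluates the average symmetric transfer error for a given set of absolute homographies. It is an illustrative sketch rather than the MATLAB implementation used in the experiments: the helper names (`mat_mul`, `mat_inv`, `transfer`, `symmetric_transfer_error`) and the data layout (dictionaries keyed by image index and image pair) are assumptions.

```python
import math


def mat_mul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def mat_inv(M):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = M
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[v / det for v in row] for row in adj]


def transfer(H, p):
    """Apply homography H to the 2D point p and dehomogenize."""
    x, y = p
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def symmetric_transfer_error(H, matches):
    """Average symmetric transfer error in the spirit of Eq. (4).

    H: dict {image index: 3x3 homography to the global frame}.
    matches: dict {(k, t): [((xk, yk), (xt, yt)), ...]}.
    """
    total, s = 0.0, 0
    for (k, t), pairs in matches.items():
        H_kt = mat_mul(mat_inv(H[k]), H[t])  # maps points of image t into image k
        H_tk = mat_mul(mat_inv(H[t]), H[k])  # maps points of image k into image t
        for pk, pt in pairs:
            total += math.dist(pk, transfer(H_kt, pt))
            total += math.dist(pt, transfer(H_tk, pk))
            s += 1
    return total / s


# Assumed toy data: image 0 defines the global frame, and image 1 is
# shifted by 10 pixels along x; the second correspondence is 3 pixels off.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
shift_x = [[1.0, 0.0, 10.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
example_error = symmetric_transfer_error(
    {0: identity, 1: shift_x},
    {(0, 1): [((15.0, 5.0), (5.0, 5.0)), ((15.0, 5.0), (5.0, 8.0))]},
)
```

Because both terms compare points within an image frame after mapping through the product of one homography and the inverse of the other, the value does not change if all homographies are re-expressed in a different global frame, which is the property the comparison criterion relies on.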
All tests were performed on a desktop computer with a 3.4 GHz Intel Core i7™ processor, 8 GB of RAM, running a 64-bit operating system, and MATLAB R2011b. It can be seen in the table that the global alignment using the reduced number of overlapping pairs provides almost the same trajectory accuracy as that using all overlapping pairs, but with a significant time saving. The numbers given in the third column include the number of edges that have no alternative path. If these edges are excluded, it can be concluded that the same trajectory accuracy is achieved using approximately 50% fewer successfully matched overlapping pairs, which results in a great time saving. The percentage can be changed by using a different value

for the adjustment parameter α. In order to make a visual comparison of the mosaics obtained, we used the universal image quality index (UQI) [43], which is a well-established measure of the similarity between two images. The UQI compares the tested image to the reference image (free of distortion and noise) by taking into account three different factors: correlation loss, luminance, and contrast distortion. The method gives an overall quality index of the tested image, with a value in the range $[-1, 1]$. We used the final mosaic obtained with an all-against-all strategy as the ground truth, and we measured the UQI with respect to that comparison baseline. Before measuring, we rendered two mosaics for each method by using first-on-top and last-on-top strategies. No image blending methods were applied, since that might have visually reduced the artifacts of some alignment errors, which could have affected the result of the qualitative visual comparison. The small differences in the size of the mosaics have been adjusted so as to have the same resolution as that obtained using the all-against-all strategy. Results are summarized in Table 2. The mosaics obtained with the reduced number of overlapping pairs are very close in UQI (over 0.75) to those obtained with all overlapping image pairs. Although the last dataset has the lowest similarity, the mosaics still are visually similar, as can be seen in Figs. 4(a), 4(b), and 4(c). The main computational cost of the proposed method comes from the need to compute the alternative shortest path for each edge, which can be done very efficiently [44,45], especially for sparse graphs such as those in our application domain. However, as mentioned above, minimizing the error given by Eq. (4) requires nonlinear optimization methods, and frequently either the Gauss–Newton or the Levenberg–Marquardt method is used for solving the nonlinear least squares problem. 
All these methods involve computing the Jacobian matrix and solving the (augmented) normal equations repeatedly, which imposes a very high computational cost; for instance, solving the normal equations has cubic complexity over the total number of unknowns [46]. Although there are some improvements available for sparsely constructed problems [41], the computation can still be very expensive for large problems [14]. Additionally, the memory requirement is directly proportional to the total number of edges and the total number of correspondences used. Reducing the total number of edges lowers both the memory requirements and the computational cost. Therefore, the

Table 3. Summary of Results Obtained Using Proposed Method during the Topology Estimation Process

| Dataset | Strategy | Successful Pairs | Unsuccessful Pairs | Avg. Error in Pixels | Std. Deviation in Pixels |
|---|---|---|---|---|---|
| Dataset 1 | With Reduction | 1522 | 74 | 6.14 | 3.50 |
| Dataset 1 | Without Reduction | 5386 | 1961 | 6.07 | 2.59 |
| Dataset 1 | All-against-all | 5412 | 86,823 | 6.07 | 2.53 |
| Dataset 2 | With Reduction | 1860 | 6236 | 5.94 | 3.08 |
| Dataset 2 | Without Reduction | 3886 | 39,014 | 5.58 | 2.42 |
| Dataset 2 | All-against-all | 3895 | 640,785 | 5.58 | 2.41 |
| Dataset 3 | With Reduction | 1541 | 2365 | 6.44 | 3.08 |
| Dataset 3 | Without Reduction | 3216 | 7852 | 6.08 | 2.70 |
| Dataset 3 | All-against-all | 3225 | 114,630 | 6.08 | 2.69 |
| Dataset 4 | With Reduction | 881 | 47 | 2.67 | 1.09 |
| Dataset 4 | Without Reduction | 3834 | 1140 | 2.36 | 0.92 |
| Dataset 4 | All-against-all | 3906 | 36,849 | 2.93 | 1.06 |


Fig. 5. Overlapping image pairs and final trajectory of the first dataset.

reduction plays a very important role in the ability to obtain mosaics successfully from large datasets. Table 3 presents the results that were obtained while the topology estimation framework of [33] was employed with and without reduction to obtain the topology. The second column corresponds to the method tested, and the third column lists the total number of successfully matched image pairs. SIFT [35] is used for detection and matching. RANSAC [36] is used for outlier rejection. An image pair is considered successfully matched if it has a minimum of 20 inliers. The fourth column lists the total number of image pairs that were not successfully matched. The last two columns correspond to the average symmetric transfer error and the standard deviation, which are calculated from the resulting set of homographies for each tested strategy but over the correspondences obtained with the all-against-all strategy. Table 3 indicates that our proposed method can be used successfully while obtaining a topology graph by state-of-the-art topology estimation methods. It was able to obtain results very similar to those of its counterparts, but it has two main advantages. First, it drastically reduced the total number of matching attempts (successful and unsuccessful pairs). Second, it provided a tremendous time saving not only in the global alignment process but also in the image matching

attempts. Figure 5 shows the final trajectory and the overlapping image pairs for the first dataset. Table 4 presents the results of visual comparisons between mosaics that were obtained by the topology estimation with and without reduction. The mosaics obtained with reduction are compared to the reference mosaics obtained with the standard topology estimation framework (without reduction). From the results, we observe that the mosaics are very similar, with a UQI value over 0.85. Figures 6(a) and 6(b) show the mosaics of the second dataset that were obtained in topology estimation with

Table 4. Visual Quality Comparison Using UQI with Respect to Mosaics Obtained with the All-Against-All Strategy

| Dataset | UQI, First-On-Top | UQI, Last-On-Top |
|---|---|---|
| Dataset 1 | 0.9415 | 0.9387 |
| Dataset 2 | 0.8795 | 0.8858 |
| Dataset 3 | 0.9411 | 0.9385 |
| Dataset 4^a | 0.8581 | 0.8513 |

^a The visual comparison for the last dataset was done with the mosaic resulting from image-to-poster registration.


Fig. 6. Mosaics of the second dataset rendered with first-on-top strategy. (a) Resulting mosaic of the topology estimation framework. (b) Resulting mosaic of the topology estimation framework with reduction. (c) Difference between two mosaics.

and without the proposed algorithm. Figure 6(c) shows the pixel-wise difference between these mosaics.
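The UQI values reported in Tables 2 and 4 combine correlation, luminance, and contrast into one number. As a rough illustration of how such a value arises, the sketch below computes the index of [43] over a single global window using its standard closed form, $Q = 4\sigma_{xy}\bar{x}\bar{y} / \bigl((\sigma_x^2 + \sigma_y^2)(\bar{x}^2 + \bar{y}^2)\bigr)$. The index is usually evaluated over local sliding windows and averaged, so this single-window version is a simplification, and the function name `uqi` and the flat pixel lists are assumptions for illustration.

```python
def uqi(reference, tested):
    """Universal image quality index over one global window.

    reference, tested: equal-length sequences of pixel intensities
    (e.g., flattened grayscale images).
    """
    if len(reference) != len(tested) or len(reference) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    n = len(reference)
    mean_x = sum(reference) / n
    mean_y = sum(tested) / n
    var_x = sum((v - mean_x) ** 2 for v in reference) / (n - 1)
    var_y = sum((v - mean_y) ** 2 for v in tested) / (n - 1)
    cov_xy = sum((a - mean_x) * (b - mean_y)
                 for a, b in zip(reference, tested)) / (n - 1)
    denominator = (var_x + var_y) * (mean_x ** 2 + mean_y ** 2)
    if denominator == 0:
        raise ValueError("index undefined for constant or all-zero inputs")
    return 4 * cov_xy * mean_x * mean_y / denominator


# Assumed toy intensities: an identical copy scores 1, and any change in
# structure, mean brightness, or contrast lowers the score.
reference_pixels = [10.0, 20.0, 30.0, 40.0]
same_score = uqi(reference_pixels, reference_pixels)
altered_score = uqi(reference_pixels, [12.0, 18.0, 33.0, 41.0])
```

A pixel-wise difference image such as Fig. 6(c) shows where two mosaics disagree, whereas an index of this kind summarizes the agreement in a single value.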

5. CONCLUSIONS

Large-area image mosaicing methods have lately been of great benefit to various scientific disciplines. To obtain a high-quality (well-aligned) mosaic, global alignment is essential. Global alignment is usually accomplished using nonlinear minimization, which imposes a high computational cost,

especially for large datasets. The performance of the global alignment is directly dependent on the identification of the overlapping pairs and thus on the topology of the surveyed area. In this paper, we have presented a simple but efficient method for reducing the total number of overlapping image pairs. This method is for use in the global alignment process and is designed to obtain photomosaics with reduced computational requirements but without compromising the quality of the final mosaic. Moreover, the method can easily be integrated into the topology estimation process. Graph

Elibol et al.

Vol. 31, No. 4 / April 2014 / J. Opt. Soc. Am. A

theoretical algorithms are used to determine which image pairs should be retained. Experiments on challenging underwater datasets have been reported and have proved the efficiency of the proposed approach.

APPENDIX A: EDGE COST COMPUTATION

Let $I_k$ and $I_t$ denote two images that have an overlap. In special circumstances, such as when the 3D relief of the scene is much smaller than the distance from the camera to the scene, it can be assumed that the scene is planar. Under this assumption, the image mapping that relates $I_t$ and $I_k$ can be described by the planar transformation $p_t = H p_k$, where $H$ is a homography matrix, $p_t$ denotes the image coordinates of the projection of a 3D point $P$ onto the image $t$, and $p_k$ is the projection of the same 3D point onto the image $k$. Then $p_k = (x_k, y_k, 1)^T$ and $p_t = (x_t, y_t, 1)^T$ are known as correspondences and can be expressed in homogeneous coordinates. The motion between images is usually computed by minimizing a selected error metric defined over the correspondences [41]. In the present work, the estimation of the motion parameters is formulated as follows:

\[
\epsilon = \sum_{i=1}^{n} r_i, \qquad
\underbrace{R}_{\text{residues vector}} =
\begin{bmatrix}
r_1 = \| p_{t_1} - H \cdot p_{k_1} \|_2 \\
\vdots \\
r_n = \| p_{t_n} - H \cdot p_{k_n} \|_2
\end{bmatrix},
\tag{A1}
\]

where

\[
H =
\begin{bmatrix}
s \cdot \cos\theta & -s \cdot \sin\theta & t_x \\
s \cdot \sin\theta & s \cdot \cos\theta & t_y \\
0 & 0 & 1
\end{bmatrix}
\]

is a four degrees of freedom (1D scaling, 1D rotation, and 2D translations) planar transformation. We compute the variance of the residues that are computed from Eq. (A1) and use this as the edge cost for the topology graph.
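The edge cost described in this appendix can be sketched in a few lines of code. The example below is only an illustration under stated assumptions: it fits the four degrees of freedom transformation with a closed-form linear least-squares solution (minimizing squared residues) rather than the nonlinear minimization of Eq. (A1), it uses the population variance because the text does not say which variance estimator is meant, and the function names `fit_similarity` and `edge_cost` are hypothetical.

```python
import math
from statistics import pvariance


def fit_similarity(pts_k, pts_t):
    """Least-squares fit of H = [[a, -b, tx], [b, a, ty], [0, 0, 1]],
    with a = s*cos(theta) and b = s*sin(theta), mapping pts_k onto pts_t.

    Assumption: squared residues are minimized in closed form.
    Requires at least two distinct points in pts_k.
    """
    n = len(pts_k)
    mkx = sum(x for x, _ in pts_k) / n
    mky = sum(y for _, y in pts_k) / n
    mtx = sum(x for x, _ in pts_t) / n
    mty = sum(y for _, y in pts_t) / n
    num_a = num_b = den = 0.0
    for (xk, yk), (xt, yt) in zip(pts_k, pts_t):
        # Work with centered coordinates so that translation drops out.
        xk, yk, xt, yt = xk - mkx, yk - mky, xt - mtx, yt - mty
        num_a += xk * xt + yk * yt
        num_b += xk * yt - yk * xt
        den += xk * xk + yk * yk
    a, b = num_a / den, num_b / den
    # Translation maps the centroid of pts_k onto the centroid of pts_t.
    tx = mtx - (a * mkx - b * mky)
    ty = mty - (b * mkx + a * mky)
    return a, b, tx, ty


def edge_cost(pts_k, pts_t):
    """Variance of the residues r_i = ||p_t_i - H p_k_i||_2 (edge cost)."""
    a, b, tx, ty = fit_similarity(pts_k, pts_t)
    residues = [
        math.hypot(xt - (a * xk - b * yk + tx), yt - (b * xk + a * yk + ty))
        for (xk, yk), (xt, yt) in zip(pts_k, pts_t)
    ]
    # Assumption: population variance; the text does not specify the estimator.
    return pvariance(residues)
```

With correspondences that follow the transformation exactly, every residue is zero and so is the edge cost. Correspondences that deviate unevenly from the fitted transformation yield a larger variance and therefore a more expensive edge in the topology graph.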

ACKNOWLEDGMENTS

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (2012R1A1A1015307). It was also partially funded through European Union projects FP7-ICT-2011-288704 (MORPH) and FP7-312762 (EUROFLEETS2) as well as under grant CTM2010-15216 (MuMap) from the Spanish Ministry of Science and Innovation (MCINN).

REFERENCES

1. A. Fusiello and V. Murino, “Augmented scene modeling and visualization by optical and acoustic sensor integration,” IEEE Trans. Vis. Comput. Graph. 10, 625–636 (2004).
2. W. S. Pegau, D. Gray, and J. R. V. Zaneveld, “Absorption and attenuation of visible and near-infrared light in water: dependence on temperature and salinity,” Appl. Opt. 36, 6035–6046 (1997).
3. H. Loisel and D. Stramski, “Estimation of the inherent optical properties of natural waters from irradiance attenuation coefficient and reflectance in the presence of Raman scattering,” Appl. Opt. 39, 3001–3011 (2000).
4. J. Jaffe, K. Moore, J. McLean, and M. Strand, “Underwater optical imaging: status and prospects,” Oceanography 14, 64–75 (2001).
5. Z. Zhu, E. Riseman, A. Hanson, and H. Schultz, “An efficient method for geo-referenced video mosaicing for environmental monitoring,” Machine Vis. Appl. 16, 203–216 (2005).
6. J. Escartin, R. Garcia, O. Delaunoy, J. Ferrer, N. Gracias, A. Elibol, X. Cufi, L. Neumann, D. J. Fornari, S. E. Humphris, and J. Renard, “Globally aligned photomosaic of the Lucky Strike hydrothermal vent field (Mid-Atlantic Ridge, 37°18.5′N): release of georeferenced data, mosaic construction, and viewing software,” Geochem. Geophys. Geosyst. 9, Q12009 (2008).
7. B. Bingham, B. Foley, H. Singh, R. Camilli, K. Delaporta, R. Eustice, A. Mallios, D. Mindell, C. Roman, and D. Sakellariou, “Robotic tools for deep water archaeology: surveying an ancient shipwreck with an autonomous underwater vehicle,” J. Field Robot. 27, 702–717 (2010).
8. K. Jerosch, A. Lüdtke, M. Schlüter, and G. Ioannidis, “Automatic content-based analysis of georeferenced image data: detection of beggiatoa mats in seafloor video mosaics from the Hakon Mosby mud volcano,” Comput. Geosci. 33, 202–218 (2007).
9. O. Pizarro, S. B. Williams, M. V. Jakuba, M. Johnson-Roberson, I. Mahon, M. Bryson, D. Steinberg, A. Friedman, D. Dansereau, N. Nourani-Vatani, D. Bongiorno, M. Bewley, A. Bender, N. Ashan, and B. Douillard, “Benthic monitoring with robotic platforms—the experience of Australia,” in IEEE International Underwater Technology Symposium (UT) (2013), pp. 1–10.
10. A. Gleason, D. Lirman, D. Williams, N. Gracias, B. Gintert, H. Madjidi, R. Reid, G. Boynton, S. Negahdaripour, M. Miller, and P. Kramer, “Documenting hurricane impacts on coral reefs using two-dimensional video-mosaic technology,” Marine Ecol. 28, 254–258 (2007).
11. O. Delaunoy, N. Gracias, and R. Garcia, “Towards detecting changes in underwater image sequences,” in OCEANS 2008 MTS/IEEE Techno-Ocean, Kobe, Japan (2008), pp. 1–8.
12. M. Bryson, M. Johnson-Roberson, O. Pizarro, and S. Williams, “Automated registration for multi-year robotic surveys of marine benthic habitats,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Tokyo, Japan (2013).
13. R. Prados, R. Garcia, J. Escartin, and L. Neumann, “Challenges of close-range underwater optical mapping,” in MTS/IEEE OCEANS Conference (2011), pp. 1–10.
14. J. Ferrer, A. Elibol, O. Delaunoy, N. Gracias, and R. Garcia, “Large-area photo-mosaics using global alignment and navigation data,” in MTS/IEEE OCEANS Conference, Vancouver, Canada (2007), pp. 1–9.
15. N. Gracias, S. Zwaan, A. Bernardino, and J. Santos-Victor, “Mosaic based navigation for autonomous underwater vehicles,” IEEE J. Ocean. Eng. 28, 609–624 (2003).
16. H. Sawhney, S. Hsu, and R. Kumar, “Robust video mosaicing through topology inference and local to global alignment,” in European Conference on Computer Vision, Freiburg, Germany (1998), Vol. II, pp. 103–119.
17. D. Gledhill, G. Tian, D. Taylor, and D. Clarke, “Panoramic imaging: a review,” Comput. Graph. 27, 435–445 (2003).
18. R. Szeliski, “Image alignment and stitching: a tutorial,” Found. Trends Comput. Graph. Vis. 2, 1–104 (2006).
19. V. Ila, J. M. Porta, and J. Andrade-Cetto, “Information-based compact pose SLAM,” IEEE Trans. Robot. 26, 78–93 (2010).
20. A. Cervantes and E. Y. Kang, “Progressive multi-image registration based on feature tracking,” in International Conference on Image Processing, Computer Vision, & Pattern Recognition, Las Vegas (2006), Vol. 2, pp. 633–639.
21. H. Bulow and A. Birk, “Fast and robust photomapping with an unmanned aerial vehicle UAV,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (2009), pp. 3368–3373.
22. F. Ferreira, G. Veruggio, M. Caccia, and G. Bruzzone, “Real-time optical SLAM-based mosaicking for unmanned underwater vehicles,” Intel. Serv. Robotics 5, 55–71 (2012).
23. D. Steedly, C. Pal, and R. Szeliski, “Efficiently registering video into panoramic mosaics,” in Tenth IEEE International Conference on Computer Vision (ICCV) (2005), Vol. 2, pp. 1300–1307.
24. M. Fadaeieslam, M. Fathy, and M. Soryani, “Key frames selection into panoramic mosaics,” in 7th International Conference on Information, Communications and Signal Processing (ICICS) (2009), pp. 1–5.
25. Y. Li, C. Wu, S. Yu, and C. Tsuhan, “Motion-focusing key frame extraction and video summarization for lane surveillance system,” in 16th IEEE International Conference on Image Processing (ICIP) (2009), pp. 4329–4332.
26. O. Mason and M. Verwoerd, “Graph theory and networks in biology,” IET Syst. Biol. 1, 89–119 (2007).
27. N. Deo, Graph Theory with Applications to Engineering and Computer Science (PHI Learning, 2004).
28. A. Balaban, Chemical Applications of Graph Theory (Academic, 1976).
29. F. Roberts, Graph Theory and Its Applications to Problems of Society (SIAM, 1978), Vol. 29.
30. J. Gross and J. Yellen, Graph Theory and Its Applications (Chapman & Hall/CRC Press, 2005).
31. S. Skiena, Implementing Discrete Mathematics: Combinatorics and Graph Theory with Mathematica (Addison-Wesley, 1990).
32. B. Cherkassky, A. Goldberg, and T. Radzik, “Shortest paths algorithms: theory and experimental evaluation,” Math. Program. 73, 129–174 (1996).
33. A. Elibol, N. Gracias, R. Garcia, A. Gleason, B. Gintert, D. Lirman, and P. R. Reid, “Efficient autonomous image mosaicing with applications to coral reef monitoring,” in IROS 2011 Workshop on Robotics for Environmental Monitoring, San Francisco, CA (2011).
34. D. Ribas, N. Palomeras, P. Ridao, M. Carreras, and E. Hernandez, “Ictineu AUV wins the first SAUC-E competition,” in IEEE International Conference on Robotics and Automation, Rome, Italy (2007).
35. D. Lowe, “Distinctive image features from scale-invariant keypoints,” Int. J. Comput. Vis. 60, 91–110 (2004).
36. M. A. Fischler and R. C. Bolles, “Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography,” Commun. ACM 24, 381–395 (1981).
37. D. Lirman, N. Gracias, B. Gintert, A. Gleason, R. P. Reid, S. Negahdaripour, and P. Kramer, “Development and application of a video-mosaic survey technology to document the status of coral reef communities,” Environ. Monit. Assess. 125, 59–73 (2007).
38. N. Gracias and S. Negahdaripour, “Underwater mosaic creation using video sequences from different altitudes,” in MTS/IEEE OCEANS Conference, Washington, DC (2005), pp. 1234–1239.
39. D. Ribas, N. Palomeras, P. Ridao, M. Carreras, and A. Mallios, “Girona 500 AUV: from survey to intervention,” IEEE/ASME Trans. Mechatron. 17, 46–53 (2012).
40. B. Triggs, P. McLauchlan, R. Hartley, and A. Fitzgibbon, “Bundle adjustment: a modern synthesis,” in Vision Algorithms: Theory and Practice, B. Triggs, A. Zisserman, and R. Szeliski, eds. (Springer-Verlag, 2000), pp. 298–372.
41. R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, 2nd ed. (Cambridge University, 2004).
42. N. Gracias, J. P. Costeira, and J. S. Victor, “Linear global mosaics for underwater surveying,” in 5th IFAC Symposium on Intelligent Autonomous Vehicles, Lisbon, Portugal (2004), Vol. I.
43. Z. Wang and A. Bovik, “A universal image quality index,” IEEE Signal Process. Lett. 9, 81–84 (2002).
44. A. Goldberg, “Point-to-point shortest path algorithms with preprocessing,” in SOFSEM 2007: Theory and Practice of Computer Science (2007), pp. 88–102.
45. M. Fredman and R. Tarjan, “Fibonacci heaps and their uses in improved network optimization algorithms,” J. ACM 34, 596–615 (1987).
46. J. Nocedal and S. Wright, Numerical Optimization (Springer, 1999).
