Geometry-Based Distributed Spatial Skyline Queries in Wireless Sensor Networks.

sensors Article

Geometry-Based Distributed Spatial Skyline Queries in Wireless Sensor Networks Yan Wang 1,2 , Baoyan Song 1, *, Junlu Wang 1 , Li Zhang 1 and Ling Wang 1 1 2

*

School of Information, Liaoning University, Shenyang 110036, China; [email protected] (Y.W.); [email protected] (J.W.); [email protected] (L.Z.); [email protected] (L.W.) School of Information Science and Engineering, Northeastern University, Shenyang 110819, China Correspondence: [email protected]; Tel.: +86-24-62202346

Academic Editors: Yunchuan Sun, Antonio Jara and Shengling Wang Received: 22 January 2016; Accepted: 21 March 2016; Published: 29 March 2016

Abstract: Algorithms for skyline querying based on wireless sensor networks (WSNs) have been widely used in the field of environmental monitoring. Because of the multi-dimensional nature of the problem of monitoring spatial position, traditional skyline query strategies cause enormous computational costs and energy consumption. To ensure the efficient use of sensor energy, a geometry-based distributed spatial query strategy (GDSSky) is proposed in this paper. Firstly, the paper presents a geometry-based region partition strategy. It uses the skyline area reduction method based on the convex hull vertices, to quickly query the spatial skyline data related to a specific query area, and proposes a regional partition strategy based on the triangulation method, to implement distributed queries in each sub-region and reduce the comparison times between nodes. Secondly, a sub-region clustering strategy is designed to group the data inside into clusters for parallel queries that can save time. Finally, the paper presents a distributed query strategy based on the data node tree to traverse all adjacent sensors’ monitoring locations. It conducts spatial skyline queries for spatial skyline data that have been obtained and not found respectively, so as to realize the parallel queries. A large number of simulation results shows that GDSSky can quickly return the places which are nearer to query locations and have larger pollution capacity, and significantly reduce the WSN energy consumption. Keywords: wireless sensor network; environmental monitoring; distributed spatial skyline query; convex hull; cutting node

1. Introduction Nowadays applications using the sensor network monitoring strategy are being more and more widely used, such as in forest fire monitoring systems, CitySee, real time CO2 monitoring systems, real time shortest path for drivers [1], a complex embedded system-CPS [2], graph similarity issue [3], digital library services [4], data security [5], the development of the Internet of Things [6,7], the development of the Web of Things [8], Semantic Link Network (SLN) [9,10], different bird species’ behavior research, air quality inspection [11] and so on. Big data analytics for these applications has become more and more essential [12,13]. In terms of the air quality inspection situation, in European and American countries, monitoring stations are mostly set near big cities, and reflect the average air quality level of a whole region. In China, the current information people get is from placing in cities several monitoring sub-stations, which locations must be set in places which are widely apart and not disturbed by people, rather than in places near to the pollution sources. This deployment can also reflect the average air quality level of a city. The above traditional methods can’t monitor the pollution around places with dense populations, and they are always set up to monitor a wide range and will be limited in flexibility when an arbitrarily designated area is queried. Therefore, it is very Sensors 2016, 16, 454; doi:10.3390/s16040454

www.mdpi.com/journal/sensors

Sensors 2016, 16, 454

2 of 22

necessary to apply the spatial skyline query method in wireless sensor networks used for air quality monitoring [14–16]. As traditional monitoring strategies can only monitor the average situation of a large area and have limited flexibility for any small range, this paper proposes a skyline query method for sensor networks. Considering that environmental monitoring involves more spatial attributes, and often incurs a lot of computational cost for general skyline queries, we propose the geometry-based distributed spatial skyline query method in wireless sensor networks (GDSSky). The strategy can quickly find the locations which are near to the query places and have higher pollution potential, and it also can reduce the energy cost. The paper is an extension of our previous work Geometry-Based Spatial Skyline Query in Wireless Sensor Networks [17]. It adds a more detailed description, new experiments and a new distributed parallel strategy and query strategy. The major contributions of this paper are as follows: ‚

‚

‚

‚

We design the cut of skyline region based on convex hull vertices method, which can cut out a lot of the non-skyline region by the rectangular B strategy and reduce the comparison times between nodes and improve the efficiency. We propose a distributed query method based on the data node tree concept to traverse all the neighbor sensor monitoring regions, and enter them into the queue according to the distance by the monotone function, so it can implement the execution in parallel. We design a clustering strategy for the parallel execution of general skyline queries on the non-spatial attributes of the spatial skyline, and conduct the spatial skyline query on the remaining spatial skyline points which are still not found by the tree method at the same time, so we can implement a distributed execution between different sub-regions. We propose the concepts of cuts among sub-regions and cuts in sub-regions for the collection. This can reduce the data transmissions in the network and reduce the sensor energy cost.

The rest of the paper is organized as follows: Section 2 discusses the related work. Section 3 describes geometry-based spatial skyline queries. Section 4 describes the geometry-based distributed spatial skyline query algorithm. Section 5 discusses the experiments and the performance analysis. Finally, Section 6 concludes this paper with the future work. 2. Related Work 2.1. The Non-Spatial Skyline Query Method For the skyline query in the sensor network, the most common method is the cut method based on a filter [18–23]. Y Kwon et al. [18] set down the initial filter from the root node and updates each node constantly; the strongest cut filter is produced in the leaf node and returned with the query results. In [19], a multidimensional filter is acquired by calculation to improve the cut efficiency of the overall wireless communication between sensor nodes. In another approach [20] a local or global filter is set up at the sensor nodes to restrain unnecessary data transmissions, however, the global filter has some limitations in a large-scale sensor network, as these methods will result in large amounts of transmission consumption in the data recovery process, and cost a lot of storage and transmission costs when calculating the filter. In [21], the authors use a global and local optimization strategy to implement skyline queries for more dimensions, and it associates the sub-space skyline with the parent spatial skyline, but it must merge the sub-space skyline queries into the existing expanded parent spatial skyline query. Undoubtedly this has a computational cost. G Wang et al. [22] continuous fragmented skylines over distributed streams which constitute a distributed skyline query method based on streams are presented. A complex skyline monitoring function on the distributed fragmented objects, and a model of the fragments of the object are proposed. B Chen et al. [23] face the problem of curse of dimensionality and proposes to acquire truly controlled and characteristic results to select the most interesting objects from among all the skyline objects.

Sensors 2016, 16, 454

3 of 22

Sensors 2016, 16, 454 3 of 21 Therefore, in these kinds of strategies, the dominant compare counts and computational cost will increase. The tuple is too long, so the transmission and energy consumption between data Therefore, in these kinds of strategies, the dominant compare counts and computational cost will increase. data The nodes may be obtained from severely polluted areas but far from locations will The increase. tuple is too long, so the transmission and energy consumption between datacrowded will with people. However, these nodes are not the geographic locations people really care about. increase. The data nodes may be obtained from severely polluted areas but far from locations

crowded with people. However, these nodes are not the geographic locations people really care

2.2. Spatial about.Skyline Query Method

Unlike no-spatial skyline queries, the spatial skyline query strategy includes two concepts: the 2.2. Spatial Skyline Query Method spatial dominance relation and the spatial skyline. Unlike no-spatial skyline queries, the spatial skyline query strategy includes two concepts: the

‚

spatial dominance relation and the spatial skyline. The spatial dominance relation 

The spatial dominance relation

A data node set P = {p1, p2,..., pn} representing the sensor nodes includes some points in A data nodeAmong set P = {p1, p2,..., representing the sensor includes some points in 2- data 2-dimensional space. them, the pn} function of the sensor nodenodes is sensing data, transforming dimensional space. function the sensor is sensing data,astransforming dataareas and participating in the Among skylinethem, query.the The queryof node set Q =node {q1,...,qn} shown the residential and participating in the skyline query. The query node set Q = {q1,...,qn} shown as the residential in Figure 1 includes some points in 2-dimensional space. D(pi, qj) is the distance in 2-dimensional areas in Figure 1 includes some points in 2-dimensional space. D(pi, qj) is the distance in 2space. The node dominance relation is shown in Figure 1. Among them, rectangle B is the minimum dimensional space. The node dominance relation is shown in Figure 1. Among them, rectangle B is rectangle boundary MBR. In the query process, B updates constantly, B_ {new} = BXMBR (SR (p, Q)). the minimum rectangle boundary MBR. In the query process, B updates constantly, B_ {new} = The interior B contains all the candidate nodes. Nodes B nodes. are theNodes dominated ones, so the there is B∩MBRof(SR (p, Q)). The interior of B contains all the outside candidate outside B are no judgment of dominance relation to those. dominated ones, so there is no judgment of dominance relation to those. Dominator region of p p' p''' B

Bisector line of pp'

p q1

p'' q0

C(q1,p)

Dominance region of p

Figure Thedominance dominance relation relation between Figure 1. 1. The betweennodes. nodes.

As shown in the figure, given a 2-dimensional query node set Q = {q1, ..., qn} and two sensor

As shown figure, given arelation 2-dimensional query node set Q = {q1, ..., qn} and two sensor nodes p, p’,in thethe space dominance is defined as follows: nodes p, p’, the space dominance relation is defined as follows: Definition 1: A point p spatially dominates a point p’ with respect to Q if and only if D (p, qi) ≤ D

 Q.respect to Q if and only if D (p, qi) ď D Definition A point p spatially point p’ qj with (p’, qi) for 1: every qi  Q, and D (p, dominates qj) < D(p’, qj)a for some (p’, qi) forThe every qi Pskyline Q, and D (p, qj) < D(p’, qj) for some qj P Q. spatial ‚

By theskyline comparison between the above dominance relations, we can find the data nodes that are The spatial part of the spatial skyline. Given the data node set P = {p1, p2, ..., pn} and the query node set Q =

By theqn}, comparison between the above dominance relations, we can find the data nodes that {q1,..., the definition is as follows: are part ofDefinition the spatial skyline. Given the data node set P = {p1, p2, ..., pn} and the query node set 2: A point p  P is a spatial skyline point with respect to Q if and only if p is not Q = {q1,..., qn}, the definition is as follows: spatially dominated by any other point of P. The node p is a spatial skyline with respect to Q,



denoted as2:pA point S(Q). This get the formalpoint description of Equation shown as iffollows: Definition p P Pway is awe spatial skyline with respect to Q (1) if and only p is not spatially P. The   S(Q)of     dominated by any other ppoint node p is a spatial skyline with respect to Q, denoted as p’ P, p’ p and p’ S(Q), qi Q, (1) p P S(Q). This way we get the formal description of Equation (1) shown as follows:  so that D(p, qi)

D(p’, qi)

In [24–28], the authors describe theP existing query These methods are p P S(Q) ô @p’ P, p’ ‰ pspatial and p’skyline R S(Q), Dqi methods. P Q, (1) always applied in skyline queries with respect toqi) theďquery so that D(p, D(p’,nodes. qi) In [24,25], the authors propose an algorithm for transmitting all the other data node information to one node in which the dominance relations with existing skyline the nodes is judged. Thisskyline methodquery will affect the correctness of the are In [24–28], thethe authors describe existing spatial methods. These methods

always applied in skyline queries with respect to the query nodes. In [24,25], the authors propose an algorithm for transmitting all the other data node information to one node in which the dominance

Sensors 2016, 16, 454

4 of 22

relations with the existing skyline nodes is judged. This method will affect the correctness of the overall Sensors 2016, 454 networks experience problems, and bring about high transmission costs4and of 21 low result once the16,local efficiency. S Yoon et al. [26] propose dividing the query space into several triangle regions by using a overall result once the local networks experience problems, and bring about high transmission costs triangulation method and conducting the skyline query in each sub-region. However, the arbitrary and low efficiency. S Yoon et al. [26] propose dividing the query space into several triangle regions triangle area must to wait for the local skyline result from the clockwise neighbor triangle area and by using a triangulation method and conducting the skyline query in each sub-region. However, the cannot query in parallel, so it has low efficiency. In [27], there is a range-based skyline query which arbitrary triangle area must to wait for the local skyline result from the clockwise neighbor triangle usesarea the and I-SKY and query N-SKY methods tolow consider the spatial attributes of the cannot inindex parallel, so it has efficiency. In [27], and therenon-spatial is a range-based skyline objects at the same time. Son et al. N-SKY [28] solve the methods problem to of computing top-kand Manhattan spatial query which uses theW I-SKY and index consider thethe spatial non-spatial skyline queryofby theatmonotonic function. It al. computes skyline points inthe near linear attributes theusing objects the same time. W Son et [28] solvethe thetop-k problem of computing top-k time.Manhattan The abovespatial spatialskyline skyline query can only findfunction. the regions which have query bymethods using the monotonic It computes the more top-kinfluence skyline to points in near linear Theposition, above spatial skyline query can onlypollution find the regions which the query region on thetime. spatial but can't find themethods environmental overview of the have more influence to the query region on the spatial position, but can't find the environmental query region. pollution overview of the query region.

3. Geometry-Based Spatial Skyline Queries 3. Geometry-Based Spatial Skyline Queries

In order to reduce the time for comparing between data, a region partition strategy based on order to reduce timenodes for comparing data,locations a region is partition strategy based on The geometryInaccording to the the sensor deployedbetween at different proposed in this paper. geometry according to the sensor nodes deployed at different locations is proposed in this paper. attributes detected by the sensors include the spatial attributes (i.e., the distance to each query node) The attributes detected by the sensors include the spatial attributes (i.e., the distance to each query and non-spatial attributes (such as PM2.5 or SO2 concentration in the air), thus, the paper presents a node) and non-spatial attributes (such as PM2.5 or SO2 concentration in the air), thus, the paper geometry-based spatial skyline query method to quickly query a part of the geographical positions presents a geometry-based spatial skyline query method to quickly query a part of the geographical withpositions greater with domination ability in spatial calculate the maximum query boundary greater domination ability distance, in spatial then distance, then calculate the maximum query formed by query nodes. It proposes a method of cutting the skyline area based on the convex boundary formed by query nodes. It proposes a method of cutting the skyline area based on thehull vertices, which quickly findcan these data find nodethese sets near crowded places in the spatial convex hull can vertices, which quickly data to node sets near to crowded placesattribute. in the spatial attribute.

3.1. Regional Division Based on Sensor Deployment 3.1. Regional Division Based on Sensor Deployment

As shown in Figure 2, users do not prefer to select one attribute and judge synthetically according As shown in Figure 2, users do not to nodes select which one attribute and to judge synthetically to the multiple attribute values. To query theprefer sensor are closer the residential areas according to the multiple attribute values. To query the sensor nodes which are closer to the and where the air pollution level is higher are the skylines we are looking for, so this paper firstly residential areas for andcutting where of theskyline air pollution is on higher are the are looking for, so set proposes a method regionlevel based convex hullskylines verticeswe to find the data node thistopaper firstlynodes. proposes method for of return skylinethe region based on convex hullabove vertices to closer the query Thea method cancutting quickly spatial locations of the problem find the data node set closer to the query nodes. The method can quickly return the spatial locations which are near to all the crowded regions in the special query range, that is, the nodes having more of the above problem which are near to all the crowded regions in the special query range, that is, the crowd influence. Then, we further propose a distributed query method based on the data node tree nodes having more crowd influence. Then, we further propose a distributed query method based on to directly query the non-spatial attributes, such as the values of CO2 , SO2 , PM2.5 to get the places the data node tree to directly query the non-spatial attributes, such as the values of CO2, SO2, PM2.5 to which polluted. Abovepolluted. all, we can obtain source pollution overview which is near to getare theseriously places which are seriously Above all, awe can obtain a source pollution overview the crowed the method of geometry-based spatial skyline queries wireless which isplaces near toby the crowed places by the method ofdistributed geometry-based distributed spatial in skyline sensor networks we propose. queries in wireless sensor networks we propose.

Figure 2. The overview diagram of the city residential area. Figure 2. The overview diagram of the city residential area.

Sensors 2016, 16, 454

5 of 22

Sensors 2016, 16, 454

5 of 21 5 of 21

2016, 16, 454 3.1.1.Sensors Voronoi Diagram

3.1.1. Voronoi Diagram This Voronoi paper uses the Voronoi diagram method to divide the spatial region. This method is 3.1.1. Diagram This paper uses the Voronoi diagram method to divide the spatial region. This method is an an efficient geometric partitioning method. Amongtoeach sub-region we can judge the dominance This geometric paper usespartitioning the Voronoi method. diagram Among method each divide the spatial Thisthe method is an efficient sub-region we region. can judge dominance relationships between them. As shown in Figure 3, the Voronoi diagram of set P inthe two-dimensional efficient geometric partitioning method. Among each sub-region we can judge dominance relationships between them. As shown in Figure 3, the Voronoi diagram of set P in two-dimensional space divides the space into several regions. For3,allthe the points x (where x isPain2-dimensional point) relationships them. shown in Figure Voronoi two-dimensional space dividesbetween the space intoAs several regions. For all the points diagram x (whereofx set is a 2-dimensional point) contained in the corresponding region to p (p P P) V (p), we can get the formal description given space divides the corresponding space into several regions. For allVthe x (where is a 2-dimensional point)  P) contained in the region to p (p (p),points we can get the xformal description given in in  Equation (2) as contained infollows: thefollows: corresponding region to p (p P) V (p), we can get the formal description given in Equation (2) as Equation (2) as follows: @ p1 P P and p1 ‰ P ñ Dpx, pq ď Dpx, p1 q (2) (2)  p’  P and p’  P  D(x, p)  D(x, p’) (2)  p’  P and p’  P  D(x, p)  D(x, p’)

Figure 3. Voronoi diagram. Figure 3. Voronoi diagram.

Figure 3. Voronoi diagram.

3.1.2. Delaunay Graph

3.1.2.3.1.2. Delaunay Graph Delaunay Graph

The Delaunay graph is the Voronoi diagram of a dual graph, used to traverse the adjacent TheThe Delaunay graph is the diagram of acorresponding dual graph, used to traverse adjacent Voronoi Delaunay graph is Voronoi the diagram of a dual graph, used to traverse the Voronoi cell. Figure 4 shows the Voronoi Delaunay graph to Figure 3. In a the graph G adjacent (V, E), V Figure shows Delaunay graph We corresponding to any Figure 3.points InGa (V, graph V the cell. Voronoi Figure shows the4Delaunay to V Figure 3. In a two graph V (V, refers to refers to4cell. the vertices, E refersthe tograph the setcorresponding of edges. set = P, for p,E), p’ G of V, ifE),and refers vertices, E refers to the setin ofthe edges. set Vtwo = P,of for anyp,two points p, p’ of V,edge if vertices, toVoronoi the set of edges. set V Voronoi = P, We for diagram any points p’ of and only if and p is only Eiftorefers p the is the neighbor ofWe p’ P, there will beV,a if connection of the only if p is the Voronoi neighbor of p’ in the Voronoi diagram of P, there will be a connection edge of the two points in G. We say graph G is P’s Delaunay graph [29]. Any Delaunay graph of a point set is in Voronoi neighbor of p’ in the Voronoi diagram of P, there will be a connection edge of the two points the two points in G. We say graph G is P’s Delaunay graph [29]. Any Delaunay graph of a point set is a planar graph, it is connected. G. We say graph G and is P’s Delaunay graph [29]. Any Delaunay graph of a point set is a planar graph, a planar graph, and it is connected.

and it is connected.

Figure 4. Delaunay graph. Figure4. 4. Delaunay Delaunay graph. Figure graph.

3.1.3. The Region Division Method The Region Division Method 3.1.3.3.1.3. TheFigure Region Method 5 Division is the local map, which contains eight data nodes in the local query region. For Figureas5aisnode the local map, which contains eight nodes inconnect the local For example, and its Voronoi celleight V (p1), wedata respectively thequery pointregion. p1 with its Figure 5 is the localp1 map, which contains data nodes in the local query region. For example, example, as a node p1graph, and its Voronoi cellp3 V and (p1),p4, wep5respectively connect thefive point p1 we with its adjacent nodes in the which are p2, and p6. For the above lines, draw as a node p1 and its Voronoi cell V (p1), we respectively connect the point p1 with its adjacent nodes in adjacent nodes in the graph, whichbisectors, are p2, p3 p4, p5 aand p6. For the above five lines, we draw the corresponding perpendicular to and constitute polygon made up of these perpendicular the graph, which are p2, p3 and p4,bisectors, p5 and p6. For the above five lines, we draw the corresponding the corresponding perpendicular to constitute a polygon of these perpendicular bisectors (that is, the bold line polygon in the figure). This polygonmade is theup Voronoi cell V (p1) of the perpendicular bisectors, to constitute a polygon madeThis uppolygon of these perpendicular bisectors bisectors is, the bold line polygon in by thethe figure). therepresent Voronoi cell V geographic (p1) of(that the is, node p1, (that and any location is represented p1, that is to say P1iscan all of the bold line polygon in the figure). This polygon is the Voronoi cell V (p1) of the node p1, and node p1, and any location is represented by the p1, that is to say P1 can represent all of geographic any

location is represented by the p1, that is to say P1 can represent all of geographic information in the V

Sensors 2016, 16, 454 Sensors2016, 2016,16, 16,454 454 Sensors

6 of 21 66ofof2221

information informationin in the the VV (p1). (p1). In In this this way way data data nodes nodes that thatusers users want want to to query queryand andare are close close to to all allquery query nodes, by dominance relationship of data like p1 isis also (p1). Incan this way data nodes users want to query and are close tonodes all query can beThis judged by nodes, can be be judged judged by the thethat dominance relationship of these these data nodes likenodes, p1 to to gain. gain. This also an behind why we use the skyline query based on geometry in the dominancefactor relationship these like p1 to gain. Thismethod is also an important behind an important important factor behindof why wedata usenodes the spatial spatial skyline query method based on the thefactor geometry in this this can quickly and cut data nodes and query this paper. paper. With this method method we canmethod quicklybased and accurately accurately cut dominated dominated data With nodes and query why we useWith the spatial skylinewe query on the geometry in this paper. this method those nodes are the points close to nodes. As in figure, those required dataaccurately nodesthat thatcut are thequery querydata points close toall allquery query nodes. Asshown shown inthe thethat figure, we canrequired quickly data and dominated nodes and query those required data nodes are V of point p3 the same formation process (p1) similarly, all data V (p3) (p3) of the the point p3 has hasall the same formation process as VVfigure, (p1) of ofVp3, p3, similarly, all other other data nodes the query points close to query nodes. As shown in as the (p3) of the point p3 has thenodes same in can own Voronoi way, so will form all nodes’ Voronoi inthis thisfigure figure canconstruct construct their own Voronoi cells this way, sothis this willfigure formcan alldata data nodes’ Voronoi formation process as V (p1)their of p3, similarly, allcells otherthis data nodes in this construct their own diagram in whole diagramcells inthe the whole query region. Voronoi this way,query so thisregion. will form all data nodes’ Voronoi diagram in the whole query region.

Figure Figure Regional division based on Figure5.5.Regional Regionaldivision divisionbased basedon onthe theVoronoi Voronoidiagram. diagram.

Figure Figure 666 is is the the abstract abstract graph graph of of Figure Figure555 after after regional regional division division based based on onthe the Voronoi Voronoi diagram. diagram. Figure is the abstract graph of Figure after regional division based on the Voronoi diagram. The solid dots represent the sensors in the figure, we call the data nodes, and hollow points The solid dots represent the sensors in the figure, we call the data nodes, and hollow points The solid dots represent the sensors in the figure, we call the data nodes, and hollow points represent represent areas, called the nodes. Through regional division strategy on represent residential residential areas, callednodes. the query query nodes. Through the regional division strategy based on residential areas, called the query Through the regionalthe division strategy based on thebased Voronoi the Voronoi diagram, we can quickly judge the positional relationship between a Voronoi cell and the Voronoi diagram, we can quickly judge the positional relationship between a Voronoi cell and diagram, we can quickly judge the positional relationship between a Voronoi cell and the query point, the point, and to the spatial thequery query point, and thus toobtain obtain the spatialskyline skylinefast. fast. and thus to obtain thethus spatial skyline fast.

Figure Figure6.6.The Theabstract abstractfigure figureafter afterquery querydivision. division.

3.2. 3.2.Cutting CuttingofofSkyline SkylineRegions RegionsBased Basedon onConvex ConvexHull HullVertices Vertices In In order order to toimprove improvethe thespatial spatialskyline skyline query query process process and and accelerate accelerate the the cutting cutting process process of of the the non-spatial skyline, we propose a reduction method which based on the convex hull vertices of the non-spatial skyline, skyline, we propose a reduction method which based on the convex hull vertices of the contour contour region region in in this this paper. paper. ItIt uses uses the the geometry geometry knowledge, knowledge, by by judging judging whether whether the the data data nodes nodes and and the the convex convex hull hullvertices vertices form formthe thelargest largestlocation locationrelationship relationshipbetween between the theconvex convexpolygons, polygons, to to complete complete partial partial data data queries queries of of the the spatial spatial skyline, skyline, and and thus thus query query the the partial partial spatial spatial skyline skyline data data which whichmeets meetsthe thedemand demandof ofusers. users. 3.2.1. 3.2.1.Convex ConvexHull Hull onon the left side of Figure 6 is 6denoted as Q as and extract all the all hollow points, The query node set the left side of and we Thequery querynode nodeset set on the left side of Figure Figure 6 isis denoted denoted as Q Qwe and we extract extract all the the hollow hollow as shown in Figure 7. The convex hull (denoted as CH) of set Q in 2-dimensional space is the points, the points,as asshown shownin inFigure Figure7. 7.The Theconvex convexhull hull(denoted (denotedas asCH) CH)of ofset setQ Qin in2-dimensional 2-dimensionalspace spaceisissole the smallest convex polyhedron (it is a convex polygon when in the 2-dimensional space), the convex sole the sole smallest smallest convex convex polyhedron polyhedron (it (it isis aa convex convex polygon polygon when when in in the the 2-dimensional 2-dimensional space), space),hull the contains all points in all Q. Each point the non-convex points does not affect thenot shape of the CH(Q). The convex contains in point points does affect of convexhull hull contains allpoints points inQ. Q.ofEach Each pointof ofthe thenon-convex non-convex points does not affect theshape shape of

Sensors 2016, 16, 454 Sensors 2016, 16, 454

7 of 21 7 of 22

CH(Q). The symbols used are as follows: the query node set is Q, the query node is denoted as q, the data node set is as P,the thequery data node set is denoted p. If node a dataispoint is the nodenode fromset a symbols used aredenoted as follows: is Q, theas query denoted as closest q, the data convex hullas query then is it is denoted [22], and if closest a data point inside the convex is denoted P, thevertex, data node denoted asas p.pIf=aNN(q) data point is the node is from a convex hull hull, then it isthen denoted as p  CH(Q), the convex hull ifquery Q is denoted as CHv(Q), the query vertex, it is denoted as p = NN(q) [22], and a datavertex point in is inside the convex hull, then it perpendicular line the linehull segment is donated as l  (pp’). So if pthe  Pperpendicular and p  S(Q), is denoted as pbisector P CH(Q), theofconvex querypp’ vertex in Q is denoted as CHv(Q), then S(Q) does notline depend on q Q and q  CHv(Q), weSo can greatly improve the bisector line of the segment pp’ is donated as l K (pp’). if pfind P P that and pit Pcan S(Q), then S(Q) does not efficiency of the algorithm when it is not necessary to calculate the distance to the non-vertex query depend on q P Q and q R CHv(Q), we can find that it can greatly improve the efficiency of the algorithm pointsitinisthe query. when notskyline necessary to calculate the distance to the non-vertex query points in the skyline query.

Figure 7. Convex hull.

3.2.2. Cut Method for Skyline Region In order to query the partial skyline data quickly, we propose a cutting strategy based on the contour region of convex hull vertices in this paper. It can cut out data nodes which are outside the rectangular frame.For Fordetermining determiningwhether whether data node is within the (Q), CH this (Q), paper this paper a rectangular frame. thethe data node is within the CH uses auses point point location query algorithm, which can easily locate a data pointiswhether insideconvex the query location query algorithm, which can easily locate a data point whether inside theisquery hull convex hull or not. reduce thetime comparison data, proposesstrategy an ordering of or not. To reduce theTocomparison betweentime data,between it proposes anitordering of thestrategy monotone the monotone this paper, to reduce the energyofconsumption of sensor nodes. function in thisfunction paper, toinreduce the energy consumption sensor nodes.  ‚

The point point location location query query The

In this phase of the skyline query, in order to determine whether data node and its Voronoi cell In this phase of the skyline query, in order to determine whether data node and its Voronoi cell are located in the interior of the convex hull, we need a point location query. The point location are located in the interior of the convex hull, we need a point location query. The point location query is known as a region divided into multiple sub-regions and a query point q appointed by the query is known as a region divided into multiple sub-regions and a query point q appointed by the coordinates, where we need to find the subdomain in which the point q is located. The required coordinates, where we need to find the subdomain in which the point q is located. The required output output of a point location query usually stays in the sub-region partition containing a given query of a point location query usually stays in the sub-region partition containing a given query and the the and the the serial number of the unit. Such as in d dimensional space, what is called the point serial number of the unit. Such as in d dimensional space, what is called the point positioning for a positioning for a convex polyhedron, there are only two possible search results: one is the query convex polyhedron, there are only two possible search results: one is the query point located inside a point located inside a polyhedron, the other is located outside the polyhedron. Therefore, the polyhedron, the other is located outside the polyhedron. Therefore, the algorithm can be used to easily algorithm can be used to easily locate a data point whether it is within the query convex hull or not. locate a data point whether it is within the query convex hull or not.  Cutting method for the skyline region ‚ Cutting method for the skyline region Each data node’s Voronoi neighbors are known, in other words, the Delaunay graph adjacency list of thedata point of set P is stored in the table. Using the point location queries Each node’s Voronoi neighbors areneighbor known, innodes other words, the Delaunay graph adjacency list mechanism it set willP be located has crossed corresponding datalocation at the queries convex mechanism hull of the of the point of is stored in in theorneighbor nodesthe table. Using the point Voronoi node is the outline the node. it will be unit located in that or has crossed theof corresponding data at the convex hull of the Voronoi unit node Figure 8 shows thatnode. the algorithm begins with V(p) including some query node to traverse the that is the outline of the dual Figure diagram of the Voronoi diagram which called theincluding Delaunaysome graphquery [30]. It can to determine 8 shows that the algorithm beginsiswith V(p) node traverse the spatial skyline quickly located in CH(Q) or intersecting with CH(Q) according to Theorem 1 and dual diagram of the Voronoi diagram which is called the Delaunay graph [30]. It can determine the Theorem 3, andquickly issue the current skylineordata to those. with The algorithm runs recursively until the spatial skyline located in CH(Q) intersecting CH(Q) according to Theorem 1 and initial query node. The conducted inthose. a centralized way inruns the recursively monitor center. the Theorem 3, and issue theprocess current is skyline data to The algorithm until Then, the initial monitor center calculates rectanglein B abycentralized the existing SCH, andmonitor then cuts the data nodes outside query node. The process isthe conducted way in the center. Then, the monitor B, lastly, all the accessed data B are in the SCH, data access table Visited, and the center transmits center calculates the rectangle bystored the existing and then cuts the data nodes outside B, lastly,the all necessary data to the sensor nodes in the tableand SCHthe forcenter further querying. the accessed data arecorresponding stored in the data access table Visited, transmits the necessary data to the corresponding sensor nodes in the table SCH for further querying.

Sensors 2016, 16, 454 Sensors 2016, 16, 454

8 of 22 8 of 21

Sensors 2016, 16, 454

8 of 21

Figure 8. Cut based on on the the convex convex hull hull vertices. vertices. Figure 8. Cut of of skyline skyline region region based Figure 8. Cut of skyline region based on the convex hull vertices.

Algorithm 1: Cutting of the skyline region based on the convex hull vertices Algorithm 1:Cutting Cutting theskyline skyline region basedon onthe theconvex convexhull hullvertices vertices Input: data 1: node set P,ofof query node set Q based Algorithm the region Output: the skyline set S Input: Input:data datanode nodeset setP,P,query querynode nodeset setQQ 1: initialize the skyline data table SCH and data access table Visited Output: the skyline set S Output: the skyline set S 2: calculates the CH(Q) by the monitor center 1:1:initialize initializethe theskyline skylinedata datatable tableSCH SCHand anddata dataaccess accesstable tableVisited Visited 3: begin with V(p) including some query node q(q2CHv(Q)) and put p into table SCH 2:2:for calculates the by center calculates theCH(Q) CH(Q) bythe themonitor monitor center 4: traverse Delaunay graph along each edge of CH(Q) 3:3:do for begin V(p) some query node q(q2CHv(Q)) for beginwith with V(p)including including some query node q(q2CHv(Q))and andput putppinto intotable tableSCH SCH 5: if V(p’)(p’2P) is adjacent to V(p) and intersecting with CH(Q) or in CH(Q) 4:4:do dotraverse traverseDelaunay Delaunaygraph graphalong alongeach eachedge edgeofofCH(Q) CH(Q) 6: by Theorem 1 and Theorem 3, insert node p with into SCH and table Visited. Issues the 5:5:then ififV(p’)(p’2P) isisadjacent totoV(p) intersecting CH(Q) ororin V(p’)(p’2P) adjacent V(p)and and intersecting withtable CH(Q) inCH(Q) CH(Q) current skyline data to those 6:6:then thenby byTheorem Theorem1 1and andTheorem Theorem3,3,insert insertnode nodeppinto intotable tableSCH SCHand andtable tableVisited. Visited.Issues Issuesthe the 7: else if V(p’) is adjacent to V(p) and not intersecting with CH(Q) or not in CH(Q) current skyline data to those current skyline data to those 8: node p’ intoto table Visited 7:7:then else isisadjacent and elseifinsert ifV(p’) V(p’) adjacent toV(p) V(p) andnot notintersecting intersectingwith withCH(Q) CH(Q)orornot notininCH(Q) CH(Q) 9: back to the initial data node p 8:8:then theninsert insertnode nodep’p’into intotable tableVisited Visited 10: end to for initial data node p 9:9:back back tothe the initial data node p 10: 10:end endfor for Figure 9 shows the query results for this phase. The big solid points represent the phase which can query part ofthe skyline in for the this figure, however, points on the of the Figurethe shows query data results phase. The big bigsmall solidsolid points represent theoutside phase which which Figure 99 shows the query results for this phase. The solid points represent the phase convex hull still have some data not found in the skyline data, so the skyline data query method is can query query the the part part of of skyline skyline data data in in the the figure, figure, however, however, small small solid solid points points on on the the outside outside of of the the can put forward in the next section of this article. convex hull hull still stillhave havesome somedata datanot notfound found skyline data, skyline query method is convex inin thethe skyline data, so so thethe skyline datadata query method is put put forward in the next section of this article. forward in the next section of this article.

Figure 9. Cutting result for the skyline region.



Figure 9. Cutting result for the skyline region.

Monotonic function sorting

Monotonic function sorting In order tofunction avoid large comparison times between data and reduce the computing cost, this ‚ Monotonic sorting paperInproposes a monotonic function [31] strategy by data sorting data the in table SCH. Thus we order to avoid large comparison times between andthe reduce computing cost, this In order to avoid large comparison times between data and reduce the computing cost, this paper maintain a list L in which all the skyline [31] data strategy distancedbyattributes are kept. The datawe in paper proposes a monotonic function sorting of thetable dataSCH in table SCH. Thus proposes asorted monotonic function [31] strategy by sorting athe data innode tableand SCH. Thus we maintain a listin L list L are by the sum of the distance between skyline each convex hull vertex maintain a list L in which all the skyline data distanced attributes of table SCH are kept. The data in in which allorder. the skyline data distanced attributes ofinto tableL,SCH kept. Thetodata in list L are sorted by ascending a new node is inserted we are only conduct dominated list L are sorted byWhen the sum of the distance between a skyline nodeneed and each convexthe hull vertex in the sum of the distance between a skyline node and each convex hull vertex in ascending order. When judgement assessment with the node whose distance attribute is less than that of the new one ascending order. When a new node is inserted into L, we only need to conduct the dominated abecause new node is inserted into L, we only need to conduct the dominated judgement assessment with the if the distance attribute of the newly inserted node is less than that of some node in L, it judgement assessment with the node whose distance attribute is less than that of the new one node whose distance attribute is less than that of the new one because if the distance attribute of the means is distance at least a attribute lesser distance that of the nodes following new becausethere if the of the than newly inserted node is less thanthe that of one. someThus, nodethe in new L, it node is not dominated by the following nodes and it is not necessary to judge the dominated relation means there is at least a lesser distance than that of the nodes following the new one. Thus, the new node is not dominated by the following nodes and it is not necessary to judge the dominated relation

Sensors 2016, 16, 454

9 of 22

newly inserted node is less than that of some node in L, it means there is at least a lesser distance than that of the nodes following the new one. Thus, the new node is not dominated by the following nodes Sensors 2016, 16, 454 9 of 21 and it is not necessary to judge the dominated relation between the other nodes. Via the monotonic function strategy, wenodes. can reduce nodes’ function comparison times lower computing cost and the between the other Via thethe monotonic strategy, wewith can reduce the nodes’ comparison energy consumption of sensor nodes, meanwhile, we can improve the accuracy and efficiency times with lower computing cost and the energy consumption of sensor nodes, meanwhile, we can of theimprove query. the accuracy and efficiency of the query. As As shown in Figure 10, we in ascending order into into list Llist all L theallnodes whosewhose sum ofsum distance shown in Figure 10, insert we insert in ascending order the nodes of to each convex hullconvex verticeshull is calculated. We can judge if node p1ifisnode located node p2 via this distance to each vertices is calculated. We can judge p1 isbefore located before node p2list. via we thiscan list.only Thus, we candominated only conduct dominatedjudgements relationshipbetween judgements between node p6 with Thus, conduct relationship node p6 with all the skyline all the skyline it in L. Ifdominated node p6 is not dominated by the it, nodes it, then we L, nodes before it innodes list L.before If node p6list is not by the nodes before thenbefore we insert it into insert it into L, then it is the new skyline data. Accordingly, by the monotonic function, we can avoid then it is the new skyline data. Accordingly, by the monotonic function, we can avoid the dominated the dominated relationship judgement p6following and the nodes following p6. This further reduces relationship judgement between p6 and between the nodes p6. This further reduces the comparison thefor comparison times for data dominance. times data dominance. d(pi，qi) pi p6

q1 p1

p5

p3 q3 p4

p2 q2

q1

q2

q3

p1

1.1

3.7

p2

2.7

2.8 1.0

p3 p4

3.0

4.5

0.9

4.1

2.7

1.8

p5

0.8

3.5

4.8

p6

1.1

2.8

3.9

4.2

Figure10. 10.The Thegeography geography information information in Figure insensor sensornode. node.

4. Geometry-BasedDistributed Distributed Skyline Skyline Query 4. Geometry-Based Query In order to find the sensor nodes are closer to neighborhoods andlarger have air larger air In order to find the sensor nodes whichwhich are closer to neighborhoods and have pollution pollution indexes, which are the skyline data we are looking for, research about the remaining indexes, which are the skyline data we are looking for, research about the remaining spatial skyline spatial skyline queries, parallel queries [32] and distributed queries of general skyline queries of queries, parallel queries [32] and distributed queries of general skyline queries of non-spatial attributes non-spatial attributes in spatial skyline data has been conducted on the basis of the above study. in spatial skyline data has been conducted on the basis of the above study. The regional division The regional division strategy based on triangulation [33] and sub-region clustering strategy [34] strategy based on triangulation [33] and sub-region clustering strategy [34] are proposed to provide are proposed to provide a basis for spatial skyline queries based on data node trees [35] and parallel a basis for of spatial skyline queries based on data node trees [35] and parallel queries of non-spatial queries non-spatial skyline data.

skyline data.

4.1. Regional Division

4.1. Regional Division

In order to realize the accuracy and completeness of the query, this paper proposes a regional In order to realize theon accuracy and completeness of the query, this paper proposes regional division strategy based the triangulation method. In Section 3, after querying partialaspatial division strategy based on the triangulation method. In Section 3, after querying partial spatial skyline data via the method of cutting the skyline area based on the convex hull vertices, thereskyline are data viasome the method of cutting theoutside skyline area based onvertices, the convex hull vertices, there region are stillinto some still spatial skyline nodes the convex hull so we divide the spatial spatial skyline nodes outside the convex hull queries vertices,onsothe weremaining divide the spatial region into several sub-regions, then conduct distributed spatial skyline data, as several well as generalthen queries of these data, so as to realize distributed queries sub-regions, sub-regions, conduct distributed queries on the remaining spatialbetween skylinedifferent data, as well as general and parallel in as sub-regions. The method queries can greatly improve the accuracy and efficiency of queries of thesequeries data, so to realize distributed between different sub-regions, and parallel the queries. queries in sub-regions. The method can greatly improve the accuracy and efficiency of the queries. 4.1.1. Regional Division Basedon onTriangulation Triangulation Method Method 4.1.1. Regional Division Based In order find the nodeswhich whichstill stillexist exist outside outside the thethe method of of In order to to find the nodes the convex convexhull, hull,we wepropose propose method distributed queries basedon onthe thedata datanode node tree tree [22] [22] to skyline. Aiming thethe distributed queries based to find findthe theremaining remainingspatial spatial skyline. Aiming at each closest node, the algorithm builds a data node tree by region and a local queue so that it at each closest node, the algorithm builds a data node tree by region and a local queue so thatcan it can traverse all the neighbor nodes which unvisited and intersecting with the sub-regions[30]. traverse all the neighbor nodes which areare unvisited and in in or or intersecting with BB ofof the sub-regions [30]. Finally, the algorithm judges the dominance relationship between nodes by the distributed Finally, the algorithm judges the dominance relationship between nodes by the distributed query to query to get all the spatial skyline with the distributed query. get all the spatial skyline with the distributed query. The monitor center computes the central location R of the CH(Q), and then divides the region into several sub-regions shown in Figure 11 by the triangulation [36] method. We define the data

Sensors 2016, 16, 454

10 of 22

The monitor Sensors 2016, 16, 454 454 center computes the central location R of the CH(Q), and then divides the region 10 into of 21 21 Sensors 2016, 16, 10 of several sub-regions shown in Figure 11 by the triangulation [36] method. We define the data nodes nodes to closest each of thehull convex hullasvertices as the closest nodes. Accordingly, find node each closest each to of the convex vertices the closest nodes. Accordingly, we find eachwe closest belonging to belonging each triangle area.triangle Here, we define thewe closest is the special node which closest node to each area. Here, definenode the closest node issensor the special sensor node which possesses highstronger energy and strongerability. computing ability. possesses high energy and computing

Figure 11. 11. Triangulation Triangulation diagram. diagram. Figure

Clustering in in the the Sub-Region 4.1.2. Clustering Sub-Region 4.1.2. query in in parallel, parallel, this this paper paper has has designed designed a clustering strategy, In order order to to make make the the skyline skyline query In regarding the closest point as the cluster head node, and the partial spatial skyline obtained obtained from from the the regarding the closest point as the cluster head node, and the partial spatial skyline cutting strategy of the skyline region in each triangle region is classified as one cluster called major cutting strategy of the skyline region in each triangle region is classified as one cluster called major cluster,that thatcan canreturn return spatial skyline STof out the convex by the distributed query cluster, allall thethe spatial skyline ST out theofconvex hull byhull the distributed query method method based on the data node tree, and then divide the ST data into an assisted cluster which is based on the data node tree, and then divide the ST data into an assisted cluster which is shown in shown12. in Figure 12. Figure

Figure 12. 12. Clustering Clustering in in the the sub-region. sub-region. Figure

To improve the efficiency of the query, this paper has designed a clustering strategy which To improve the efficiency of the query, this paper has designed a clustering strategy which divides divides the triangle region nodes obtained by the triangulation method into a major cluster and an the triangle region nodes obtained by the triangulation method into a major cluster and an assisted assisted cluster so as to realize the parallel queries. According to the choice of the camera position in cluster so as to realize the parallel queries. According to the choice of the camera position in the the triangulation method, this paper regarded the closest point as the cluster head node, and the triangulation method, this paper regarded the closest point as the cluster head node, and the partial partial spatial skyline which was obtained by the cutting strategy of the skyline region in each spatial skyline which was obtained by the cutting strategy of the skyline region in each triangle region triangle region is classified as one cluster called major cluster, which can return all the spatial skyline is classified as one cluster called major cluster, which can return all the spatial skyline ST out of the ST out of the convex hull by the distributed query method based on the data node tree, and then convex hull by the distributed query method based on the data node tree, and then divide the ST data divide the ST data into assisted clusters, which is shown in Figure 12. The major cluster and assisted into assisted clusters, which is shown in Figure 12. The major cluster and assisted clusters share a clusters share a common cluster head node in each sub-region. As shown in Figure 12, each little common cluster head node in each sub-region. As shown in Figure 12, each little region drawn with a region drawn with a thin line indicates a major cluster and vice versa. Data from different sub-regions thin line indicates a major cluster and vice versa. Data from different sub-regions is transmitted via the is transmitted via the respective cluster head nodes. respective cluster head nodes. 4.2. Distributed Regional Queries 4.2. Distributed Regional Queries This paper proposes the strategy of distributed regional queries based on the division of the This paper proposes the strategy of distributed regional queries based on the division of the query region. region. We We can can query query in in parallel parallel in in each each sub-region. sub-region. Using Using the the spatial spatial skyline skyline query query method method query based on data node trees we can conduct queries in the remaining spatial skyline, meanwhile, it also conducts non-spatial skyline queries on the non-spatial attributes of the spatial skyline, and distributed queries among each sub-region. This paper also proposes a method which can combine the results of each sub-regional skyline query. Firstly, we delete all the dominated results in the

Sensors 2016, 16, 454

11 of 22

based on data node trees we can conduct queries in the remaining spatial skyline, meanwhile, it also conducts non-spatial skyline queries on the non-spatial attributes of the spatial skyline, and distributed queries among each sub-region. This paper also proposes a method which can combine the results of each2016, sub-regional skyline query. Firstly, we delete all the dominated results in the sub-region Sensors 16, 454 11 of 21 via the cutting method between sub-regions. Then we continue to delete some dominated query results within sub-region the cut method.sub-regions. Via the combination the queryto results, can sub-region viathethe cutting via method between Then weofcontinue deletewesome further cut the dominated nodes, the via accuracy the queries andcombination reduce the of cost data dominated query results within theimprove sub-region the cutofmethod. Via the theofquery communication. This paper presents the strategy of distributed regional which quickly results, we can further cut the dominated nodes, improve the accuracy of queries the queries andcan reduce the return multi-dimensional attribute skyline results interestoftodistributed the user. regional queries which cost ofthe data communication. This paper presents the of strategy can quickly return the multi-dimensional attribute skyline results of interest to the user. 4.2.1. Queries in Parallel within Sub-Regions

4.2.1.InQueries within Sectionin 3, Parallel after cutting theSub-Regions skyline area based on the convex hull vertices, we assign the partial spatial skyline data to the major cluster so asarea to conduct non-spatial attributes. In Section 3, after cutting the skyline based skyline on the queries convex on hullthevertices, we assign the However there are some spatial skyline data outsides the convex hull. In order to find these nodes, we partial spatial skyline data to the major cluster so as to conduct skyline queries on the non-spatial propose a spatial skyline query basedskyline on the data tree tothe guarantee the completeness and attributes. However there are method some spatial datanode outsides convex hull. In order to find accuracy of the query results, and thus attribute these results to the assisted cluster. The sub-region these nodes, we propose a spatial skyline query method based on the data node tree to guarantee the parallel query method this paper proposes is to conduct skyline queriesthese on non-spatial in completeness and accuracy of the query results, and thus attribute results to attributes the assisted the major cluster and to conduct skyline queries on the remaining spatial skyline data. The use of cluster. The sub-region parallel query method this paper proposes is to conduct skyline queries on the clustering parallel queries can improve efficiency the algorithm. Meanwhile this non-spatial attributes in the major cluster the andexecution to conduct skyline of queries on the remaining spatial method conducts skyline queriesparallel on the queries non-attribute data on the the execution basis of theefficiency spatial skyline skyline data. The general use of the clustering can improve of the data. It furtherly reduce the time needed for comparing data and reducing the energy consumption of algorithm. Meanwhile this method conducts general skyline queries on the non-attribute data on the sensor nodes. basis of the spatial skyline data. It furtherly reduce the time needed for comparing data and reducing the energy consumption of sensor nodes. (1) Spatial skyline queries based on a data node tree (1) Spatial skyline queries based on a data node tree In order to find the nodes which still exist outside the convex hull, we propose the method of In order to find the nodes which still exist outside the convex hull, we propose the method of distributed queries based on the data node tree [37] to find the remaining spatial skyline. Aiming at distributed queries based on the data node tree [37] to find the remaining spatial skyline. Aiming at each closest node, the algorithm builds a data node tree by region and a local queue so that it can each closest node, the algorithm builds a data node tree by region and a local queue so that it can traverse all the neighbor nodes which are unvisited and in or intersecting with B of the sub-regions [38]. traverse all the neighbor nodes which are unvisited and in or intersecting with B of the sub-regions Finally, the algorithm judges the dominance relationship between nodes by distributed queries to get [38]. Finally, the algorithm judges the dominance relationship between nodes by distributed queries all the spatial skyline data with these distributed queries. to get all the spatial skyline data with these distributed queries. ‚

The The generation generation of of the thedata datanode nodetree tree

Each data table SCH from the monitoring center and the Each closest closestpoint pointreceives receivesa skyline a skyline data table SCH from the monitoring center anddata thenode data visited table Visited and regards the twothe tables the local tables, and maintains a local data node data tree node visited table Visited and regards twoastables as the local tables, and maintains a local Tnode whose is the closest and other nodes the tree correspond to all the data nodes the treeroot T whose root ispoint the closest point andofother nodes of the tree correspond to allout theofdata Visited table in or intersecting with B, as shown with in Figure nodes out of and the Visited table and in or intersecting B, as13. shown in Figure 13.

node tree tree T. T. Figure 13. The structure diagram of the data node

This paper maintains a local queue L located at each node who possesses neighbors, which is used for the storage of tuple, mindist() is a monotonic function, mindist(p, CHv(Q)) represents the sum of the distances from a data point p to CHv(Q). The specific algorithm is shown as generation data node tree [34] algorithm. The time complexity of the algorithm is O(n). 

skyline queries based on data node trees

Sensors 2016, 16, 454

12 of 22

This paper maintains a local queue L located at each node who possesses neighbors, which is used for the storage of tuple, mindist() is a monotonic function, mindist(p, CHv(Q)) represents the sum of the distances from a data point p to CHv(Q). The specific algorithm is shown as generation data node tree [34] algorithm. The time complexity of the algorithm is O(n). skyline queries based on data node trees

‚

At this stage, the algorithm regards the root node of tree T as the parent node, and starts from it to traverse this tree. If the parent node has a local queue L, then the neighbor nodes will be traversed by it, and the dominance relations will be judged with the local skyline data SCH. In this process T will be traversed in depth until the leaf nodes, and then it recursively returns to the next data node of the queue pointer pointing in the upper layer. The process is repeated. The specific algorithm is shown as Algorithm 2. The time complexity of the algorithm is O(n). Algorithm 2: Skyline query based on the data node tree algorithm Input: table SCH, table Visited, data node tree T, local queue L, query node set Q Output: the spatial skyline data ST based on T 1: regard the root node of data node tree T as the father node 2: for each of the father node do 3: if the father node has the local queue L 4: then the pointer point to the first record of the father node local queue L, determine dominance relation of the data point p corresponding to the record with the data node in the local table SCH 5: if P is dominated by the node in table SCH 6: then regard p as the new father node, then conduct step 2 7: if p is not dominated by the node in table SCH and do not dominate any node in SCH 8: then insert p to ST, and regard p as new father node, then conduct step 2 9: if p is not dominated by the node in table SCH and dominate the node in table SCH 9: if p is not dominated by the node in table SCH and dominate the node in table SCH 10: then insert p to ST, delete the dominated node in SCH, and regard p as new father node, then conduct step 2 11: if the father node do not has the local queue L 12: then return to the local queue of the father node of p, move the pointer to the next one and regard it as the new father node, then conduct step 2 13: end for (2)

Non-spatial skyline queries

The algorithm queries the spatial skyline in an assisted cluster, meanwhile, it also conducts the general skyline queries on the non-spatial attributes of the data in the major cluster to get the skyline data Smaj. After the query, the algorithm transmits all the skyline data in the major cluster to the cluster head node. After receiving the remaining spatial skyline found in the assisted cluster, the cluster head node makes the dominance relation judgment between the non-spatial attributes of the spatial skyline in the assisted cluster and the existing skyline data Smaj in the major cluster. Finally, we can find all the skyline data S of the sub-region in the cluster head node of the sub-region. At the same time, each triangle area conducts the above distributed process as shown in Figure 14.

to the cluster head node. After receiving the remaining spatial skyline found in the assisted cluster, the cluster head node makes the dominance relation judgment between the non-spatial attributes of the spatial skyline in the assisted cluster and the existing skyline data Smaj in the major cluster. Finally, we can find all the skyline data S of the sub-region in the cluster head node of the sub-region. At the same time, each triangle area conducts the above distributed process as shown in Sensors 2016, 16, 454 13 of 22 Figure 14.

Figure 14. 14. Skyline data data collection collection inside inside the the sub-region. sub-region. Figure

The specific algorithm of the above process is shown as Algorithm 3. The time cost of the The specific algorithm of the above process is shown as Algorithm 3. The time cost of the algorithm algorithm is related with the number of nodes in the assisted cluster and the number of general is related We Sensors 2016,with 16, 454the number of nodes in the assisted cluster and the number of general skylines. of 21 skylines. We suppose P as the number of nodes, S as the number of general skylines, the13 time suppose P as the number of nodes, S as the number of general skylines, the time complexity of the complexity of the algorithm is O(|P|(|S|log|CH(Q)| + log|P|)). After the query algorithm, as algorithm O(|P|(|S|log|CH(Q)| + log|P|)). Afterblack the query as shown Figure query 15 for shown in is Figure 15 for the query results, the solid pointsalgorithm, in the figure is theinskyline the query results, the solid black points in the figure is the skyline query results in each area. results in each area.

Figure each area. area. Figure 15. 15. The The skyline skyline query query results results in in each

Algorithm 3: Syline parallel query algorithm Algorithm 3: Syline parallel query algorithm Input: table SCH, table Visited, the data node tree T, the local queue L, the query node set Q, the Input: table SCH, table Visited, major cluster, the assisted clusterthe data node tree T, the local queue L, the query node set Q, the major cluster, the assisted Output: the skyline data in cluster sub-region Slocal Output: the skyline data in sub-region Slocal 1: divide the skyline in SCH into the major cluster, 1: divide the skyline in SCH into the major cluster, 2: conduct the distributed query algorithm based on the data node tree algorithm conduct distributed algorithm based the datathem nodetotree 3:2:divide thethe data in ST into query the assisted cluster andon transmit the algorithm cluster head node 3: divide the data in ST into the assisted cluster and transmit them to the cluster head 4: at the same time, conduct the general skyline query on the non-spatial attribute of thenode data in 4: major at the same time, conduct the general the cluster to get the skyline Smaj skyline query on the non-spatial attribute of the data in the cluster to get the skyline 5:major transmit all the skyline data inSmaj the major cluster to the cluster head node 5: transmit all the skyline datathe in the major cluster to the cluster between head node 6: the cluster head node makes dominance relation judgment the non-spatial 6: the cluster head node makesin the dominance relation judgment between the non-spatial attributes attributes of the spatial skyline the assisted cluster with the existing skyline data Smaj in the of the spatial skyline in the assisted cluster with the existing skyline data Smaj in the major cluster major cluster thenthe thecluster clusterhead headnode noderegards regardsthe thedata datathat thatare arenot notdominated dominatedby by others others as as the the local local skyline 7:7:then data ofdata the sub-region Slocal Slocal skyline of the sub-region 4.2.2. Query Result Merging in the Inter-Region This paper proposes a skyline query recovery mechanism which can effectively improve the cut efficiency of the skyline queriesa while reducing the data computation in the skyline queries and reducing traffic in the network network of the the skyline skyline query query results results and and saving saving the the network network energy. energy. This paper proposes the method of cutting among sub-regions and cutting inside sub-regions to conduct domination on on the skyline data in eachinregion quickly cutquickly the non-skyline domination relation relationjudgments judgments the skyline data each which regioncan which can cut the data, and reduce data transmission across the network. non-skyline data,the andvolume reduceof the volume of data transmission across the network.

(1) Cuts among sub-regions Each cluster head node establishes a local cluster head node table Sdata when its triangle area finishes calculating all the skyline data, the table includes the tuples of the whole skyline data which is shown in Table 1. Table 1. Format of the data tuples.

Sensors 2016, 16, 454

(1)

14 of 22

Cuts among sub-regions

Each cluster head node establishes a local cluster head node table Sdata when its triangle area finishes calculating all the skyline data, the table includes the tuples of the whole skyline data which is shown in Table 1. Table 1. Format of the data tuples. Node ID

Attribute 1 Value

Attribute 2 Value

Maximum Value

Minimum Value

Multiplication of Each Attribute

ID

Attribute 1

Attribute 2

max

min

D-space

The cluster head node computes maximum and minimum values of each tuple, respectively, and computes the minimum value of the max MIN-max and the minimum value of min MIN-min of the whole tuples in the cluster head node table Sdata, and is supposed to get the interested attribute values Sensors 2016, 16, 454 14 of 21 which are the smaller the better. If the MIN-max of the cluster head node is smaller than the MIN-min of thethe other cluster of head thehead sub-regions of the are cut totally, as shown in totally, Figure 16, than MIN-min thenode, other then cluster node, then thelatter sub-regions of the latter are cut as where ˆ represents the sub-regions which are cut. shown in Figure 16, where × represents the sub-regions which are cut.

Figure 16. 16. Cuts Cuts among among sub-regions. sub-regions. Figure

(2) Cuts inside sub-regions (2) Cuts inside sub-regions In order to improve the skyline query efficiency and reduce the number of data comparisons In to improve thetransmitted, skyline query efficiency and reduce of dataattribute comparisons and and theorder quantity of data this paper sets a valuethe ofnumber the D-space which is the quantity of data transmitted, this paper sets a value of the D-space attribute which is calculated calculated by the multiplication of each attribute in the tuple to represent the dominance ability ofbya the multiplication of each attribute in thetotuple toarepresent the dominance ability of a tuple ability [39]. The tuple [39]. The calculated value is used select tuple possessing the biggest dominance as calculated value is used to select a tuple possessing the biggest dominance ability as the cut tuple. As the cut tuple. As the data inside each sub-region does not dominate each other, we only need to the data each sub-region doesbetween not dominate each we only need to judge the dominance judge theinside dominance relationships the data in other, different sub-regions. As shown in Table 2 relationships between the data in different sub-regions. shown in 2 and 3 for and Table 3, for two cluster head node Sdata tables, weAs calculate theTables D-space value oftwo eachcluster tuple head node Sdata tables, we calculate the D-space value of each tuple value in the table, and if the value in the table, and if the D-space value of any tuple in one cluster head node table is greater D-space value of any tuple in one cluster head node table is greater than the D-space value of any tuple than the D-space value of any tuple in the other cluster head node data table, then we need to in the other cluster head noderelationship data table, then we need to further judge the dominance relationship further judge the dominance between tuples. For example, If d2’ < d1’ < d2 < d1, then between tuples. For example, If d2’ < d1’ < d2 < d1, then we need to judge the dominance relationship we need to judge the dominance relationship between S2’ with S2 and S1, and still need to judge the between S2’relationship with S2 and between S1, and still thetodominance S1’ with S1 dominance S1’ need with to S1judge and S2 avoid any relationship non-skyline between transmission in the and S2 to avoid any non-skyline transmission in the network, and then cut the dominated tuples, and network, and then cut the dominated tuples, and the nodes which are not dominated will be the nodes which not dominated will be transmitted to the center. Asofshown in Figure 17, transmitted to theare monitor center. As shown in Figure 17, thismonitor is the final result the skyline query. this thesolid final point result represents of the skyline query. The big solid point represents sensorareas nodesand which arehigher close The isbig sensor nodes which are close to residential have to residential areas and have higher pollution ability. By this query, we can quickly return the results pollution ability. By this query, we can quickly return the results which meet the demand of users. which meet the demand of users.

Figure 17. Final results of the skyline query.

we need to judge the dominance relationship between S2’ with S2 and S1, and still need to judge the dominance relationship between S1’ with S1 and S2 to avoid any non-skyline transmission in the network, and then cut the dominated tuples, and the nodes which are not dominated will be transmitted to the monitor center. As shown in Figure 17, this is the final result of the skyline query. The big solid point represents sensor nodes which are close to residential areas and have 15 higher Sensors 2016, 16, 454 of 22 pollution ability. By this query, we can quickly return the results which meet the demand of users.

Figure 17. 17. Final Final results results of of the the skyline skyline query. query. Figure Table2.2. Data Data table table S1 S1of ofcluster clusternodes. nodes. Table

ID ID Attribute 1 1 Attribute Attribute Attribute22 S1 S1 S2 S2

v1 v1 v2 v2

v11 v11 v22 v22

Max Max m1 m1 m2 m2

Min D-Space D-Space Min m11 m11 m22 m22

d1 d1 d2 d2

Table Table3.3. Data Data table table S1’ S1’ of ofcluster clusternodes. nodes.

ID ID Attribute Attribute11 S1’ S1’ v1’ v1’ v2’ S2’ S2’ v2’

Attribute Max MinMin D-Space Attribute 22 Max D-Space v11’ m1’ m11’m11’ d1’ d1’ v11’ m1’ v22’ m2’ v22’ m2’ m22’m22’ d2’ d2’

4.3. The Geometry-Based Distributed Skyline Query Algorithm The algorithm conducts the general skyline queries on the non-spatial attributes of the data in the major cluster to get the skyline Smaj, transmits all the skyline data in the major cluster to the cluster head nodes, the cluster head nodes perform the dominance relation judgment between the non-spatial attributes of the spatial skyline in assisted clusters with existing skyline data Smaj in the major cluster, and the cluster head nodes regard the data that are not dominated by others as the local skyline data of the sub-region. The collection process can cut all the local skyline of the dominated sub-region with the cuts among sub-regions, then cut all dominated local skyline by the cuts inside sub-regions to get the final skyline data and transmit them to the monitor center. The specific algorithm is shown as Algorithm 4. Algorithm 4: Geometry-based distributed spatial skyline query algorithm Input: data node set P, query node set Q Output: the skyline set S 1: divide the spatial region by Voronoi diagram 2: conduct cut of skyline region based on convex hull vertices, acquire part of the spatial skyline data SCH, and transmit them to each nodes, and divide them into the major cluster 3: divide the space region into several triangle sub-region by the triangulation 4: for each closest node, conduct generation data node tree algorithm, conduct the general skyline query on the data in major cluster, and return the result to the cluster head node 5: conduct distributed query algorithm. 6: conduct skyline parallel query algorithm 7: conduct the process of cut among sub-region, use the threshold to cut the dominated sub-region 8: conduct the process of cut inside sub-region, transmit the nodes which are not dominated to the monitor center The time cost of the division of space region using the Voronoi diagram is O(nlogn + 2n ´ 5), then the cost of the triangulation on the division area is O(nlogn), and the cost of the execution of the

Sensors 2016, 16, 454

16 of 22

skyline query is associated with the number of the data node P in the triangle area, and the time cost is O(|P|(|S|log|CH(Q)| + log|P|)), so the whole time complexity is O(nlogn + n). 5. Experimental Performance Analysis This paper proposes the method of geometry-based distributed spatial skyline query, it solves the issue whereby existing methods cannot adapt to provide efficient specific queries on the spatial distance of any region. The algorithm can reduce communication overhead in the network and improve the network life of monitoring systems by using geometry-based distributed spatial skyline queries. This paper discusses six aspects concerning the number of dominance comparisons between nodes, the number of query nodes with dominance comparisons, the sensor energy consumption, the response time of the query nodes, the execution efficiency percentage and the collection efficiency of the skyline. This paper compares them with the methods used in wireless sensor networks. At present, spatial and non-spatial skyline methods can be applied to query specific environmental monitoring regions. Non-spatial skyline queries also shows high response time and node recovery time efficiency. Therefore, we designed some corresponding comparative experiments for them. Voronoi-based Spatial Skyline (VS2) [19] and Enhanced Spatial Skyline (ES) [20] are the spatial skyline query methods, the Top-down filtering method of FIST (TF) [40] and Energy-Efficient Evaluation of Multiple Skyline Queries over a Wireless Sensor Network (EMSE) [41] are the non-spatial skyline query methods. We simulate a data set consisting of 1000 uniform distributed random locations in three-dimensional space. The experiment uses synthetic data as a standard test data set [32–34,42–46]. The value in spatial attributes of the nodes is in the range of [0, 1]. The hardware parameters used in the experiment are shown in Table 4. Table 4. Hardware parameters. Hardware

Parameter

CPU Memory Disk

Pentium 4 (3.2 GHz) 4 GB 1 TB

The parameters used in the experiment are shown in Table 5. Table 5. Experimental Parameters. Parameter

Setting

Dimensionality Dataset cardinality Distribution of data points The number of points in a query

3 50, 100, 200, 500, 1000 Independent 5, 10, 15, 20, 40

5.1. Number of Data Nodes with Dominance Comparisons As shown in Figure 18, all the VS2 skylines will undoubtedly increase the dominance comparisons. The monotone function used in GDSSky sorts the space distance, which only needs to compare the dominance relation with the partial skyline, while the second stage of ES starts from a point to recursively traverse the neighbor nodes and judges dominance relations with all existing skyline nodes. The TF strategy sets local or global filters in the sensor node, and the filter computation brings a lot of overhead, though the data detected by the sensor node is not beyond the range of super-cube, however, it cannot judge whether the node is the skyline or not because of the change of the detected data of the other nodes, however, it needs to acquire the detected data of this node, which will result in the more domination relationship comparisons. The EMSE algorithm uses the global and local two processes, allocates a signature for each tuple in the process of local optimization to indicate the query where

dominance comparisons (M)

skyline nodes. The TF strategy sets local or global filters in the sensor node, and the filter computation brings a lot of overhead, though the data detected by the sensor node is not beyond the range of super-cube, however, it cannot judge whether the node is the skyline or not because of the change of the detected data of the other nodes, however, it needs to acquire the detected data of this node, which will result in the more domination relationship comparisons. The EMSE algorithm Sensors 2016, 16, 454 17 of 22 uses the global and local two processes, allocates a signature for each tuple in the process of local optimization to indicate the query where the tuple belongs to, all tuples can only participate in a query, when the data the network dominance relation thedata tupleinisthe likely to bechanges, falsely the tuple belongs to, allintuples can onlychanges, participate in a query, when the network judged. However, the tuple paper uses to thebepoint queries mechanism without dominance dominance relation the is likely falselylocation judged. However, the paper uses the point location comparisons between nodesdominance to get most local skyline data. nodes The rectangular canskyline reducedata. the queries mechanism without comparisons between to get mostBlocal dominance comparisons between nodes at the later stage.between The more the number of sensor nodes is, The rectangular B can reduce the dominance comparisons nodes at the later stage. The more the obvious the effect. the more number of sensor nodes is, the more obvious the effect. VS2 ES TF EMSE GDSSky

200 150 100 50 0

50

100

200

500

1000

number of data nodes

Figure Figure 18. 18. Number Number of of data data nodes nodes and and dominance dominance comparisons. comparisons.

5.2. 5.2. Number Number of of Query Query Nodes Nodes with with Dominance Dominance Comparisons Comparisons Figure Figure 19 19 shows shows the the relationship relationship between between the the number number of of query query nodes nodes with with the the dominance dominance comparisons. Figures 18 and 19 are similar, with the increase of the number of query nodes, the comparisons. Figures 18 and 19 are similar, with the increase of the number of query nodes, the number number nodes dominating comparisons increasessteadily, relatively steadily, because thespatial-based above three of nodesof dominating comparisons increases relatively because all the aboveall three spatial-based algorithms—VS2, ES, GDSSky—use CH(Q) instead of Q, and the size of CH(Q) grows algorithms—VS2, the Sensors 2016, 16, 454 ES, GDSSky—use CH(Q) instead of Q, and the size of CH(Q) grows slower than 17 of 21 slower than size algorithms of Q. In thebased two algorithms based on theEMSE—all non-spatial—TF, EMSE—all sensor size of Q. In the two on the non-spatial—TF, sensor nodes are required nodes are required to participate in must the query, that each sensors including multi-dimensional attributes be involved in the dimension judgmentmulti-dimensional of all domination relations, to participate in the spatial query, that is, each dimension of allis,sensors including spatial which willmust no doubt greatly increase the dominance relation comparisons. Thewill VS2no algorithm starts attributes be involved in the judgment of domination relations, which doubt greatly from a data node to judge the relationship otheralgorithm nodes. This also increases the comparisons increase the dominance relation comparisons.with The VS2 starts from a data node to judge the between thewith nodes. Thenodes. GDSSky we have proposed inbetween this paper, firstly, The findsGDSSky all the relationship other Thisalgorithm also increases the comparisons the nodes. interestingwe spatial byin the spatial method, and cutsspatial off amounts algorithm have skyline proposed this paper,skyline firstly,query finds all the interesting skylineofbynon-skyline the spatial data, then it uses the and non-spatial attributes the spatialdata, skyline conduct the general skyline skyline query method, cuts off amounts of of non-skyline then to it uses the non-spatial attributes queries among the sub-regions and inside the sub-regions to cut respectively, and also reduces of the spatial skyline to conduct the general skyline queries among the sub-regions and inside the number of dominance comparisons. sub-regions to cut respectively, and also reduces the number of dominance comparisons.

dominance comparisons (M)

120

VS2 ES

100

TF

80

EMSE

60

GDSSky

40 20 0 5

10

15

20

40

number of query nodes

Figure 19. 19. Number nodes with with dominance dominance comparisons. comparisons. Figure Number of of query query nodes

5.3. Sensor Energy Consumption Consumption 5.3. Sensor Energy As shown shown in in Figure Figure 20, 20, the the energy energy consumption consumption in in GDSSky GDSSky is is less less than than that that of of the the other other four four As methods, because becausethethe algorithm transmits information to one and the calculates the methods, VS2VS2 algorithm transmits information to one node andnode calculates dominance dominance relation in the overall recursion query phase and the ES algorithm in the second phase relation in the overall recursion query phase and the ES algorithm in the second phase transmits to transmits to theinformation same nodetoinformation to determine dominance relation, and energy. both consume the same node determine the dominancethe relation, and both consume All the energy. All the sensor nodes and all the attributes of the algorithms of TF and EMSE are required to sensor nodes and all the attributes of the algorithms of TF and EMSE are required to participate in the participate in the filter and the dominance relation judgment, and the multi-dimensional spatial attributes produces an energy consumption which increases as the number of data nodes increases. Our strategy proposes a method to acquire the skyline on the spatial attributes so that it can cut a large part of the non-skyline, which can reduce the energy cost in the network. This paper cuts part of the non-spatial skyline first, then based on it we conduct a general skyline query to reduce the

As shown in Figure 20, the energy consumption in GDSSky is less than that of the other four methods, because the VS2 algorithm transmits information to one node and calculates the dominance relation in the overall recursion query phase and the ES algorithm in the second phase transmits to the same node information to determine the dominance relation, and both consume Sensors 2016, 16, 454 18 of 22 energy. All the sensor nodes and all the attributes of the algorithms of TF and EMSE are required to participate in the filter and the dominance relation judgment, and the multi-dimensional spatial filter and the dominance relation judgment, and the increases multi-dimensional spatialofattributes produces an attributes produces an energy consumption which as the number data nodes increases. energy consumption which increases as the number of data nodes increases. Our strategy proposes Our strategy proposes a method to acquire the skyline on the spatial attributes so that it can cut aa method to of acquire the skyline on the spatial attributes so thatcost it can large part of the non-skyline, large part the non-skyline, which can reduce the energy in cut the anetwork. This paper cuts part which reduce the energy costthen in the network. paper cuts part of skyline the non-spatial of the can non-spatial skyline first, based on it This we conduct a general query toskyline reducefirst, the then based on it we conduct a general skyline query to reduce the dominance comparisons between dominance comparisons between nodes and the data transmission, which reduces the energy nodes and the data transmission, which reduces the energy consumption. consumption. energy consumption (J)

4000

VS2

3500

ES

3000

TF

2500

EMSE

2000

GDSSky

1500 1000 500 0 50

100

150

200

400

600


Figure 20. 20. Sensor Sensor energy energy consumption. consumption. Figure

5.4. Effect of the Number of Query Nodes on the Response Time 5.4. Effect of the Number of Query Nodes on the Response Time Figure 21 shows the effect of the number of query nodes on the response time. The GDSSky Figure 21 shows the effect of the number of query nodes on the response time. The GDSSky outperform the VS2 algorithm, TF algorithm and EMSE algorithm. Because they all reduce the outperform the Sensors 2016, 16,the 454 VS2 algorithm, TF algorithm and EMSE algorithm. Because they all reduce 18 of 21 dominance relation judgment between nodes which are in or intersecting with CH(Q), they can dominance relation judgment between nodes which are in or intersecting with CH(Q), they can reduce reduce lots manner, of the dominance comparisons between nodes. our algorithm is executed in a distributed compared with the ES algorithm hasalgorithm a As shorter response time. method lots of the dominance comparisons between nodes. Asisour is executed in aOur distributed uses thecompared skyline parallel query strategy in the asub-region to query the Our skyline in the major cluster. manner, with the ES algorithm is has shorter response time. method uses the skyline Meanwhile, queriesin the skyline data, which can improve the responseit time. In parallel queryitstrategy theremaining sub-regionspatial to query the skyline in the major cluster. Meanwhile, queries theremaining TF algorithm, theskyline base station can determine the the skyline resulttime. afterInobtaining relevant detected the spatial data, which can improve response the TF algorithm, the base data ofcan the determine sensor nodes, so its query time is relevant long. In detected the EMSEdata strategy, thenodes, increase station the skyline resultresponse after obtaining of the with sensor so of query queryresponse nodes, astime theisdistances of each node to the each query of nodes be as calculated, this its long. In the EMSEsensor strategy, with increase querymust nodes, the distances greatly increases thetoresponse time. of each sensor node each query nodes must be calculated, this greatly increases the response time.

response time （s）

30

VS2 ES

25

TF

20

EMSE

15

GDSSky

10 5 0 5

10

15

20

40

number of query nodes

Figure 21. 21. Relation Relation between between number number of of query query nodes nodesand andresponse responsetime. time. Figure

5.5. Execution Efficiency Percentage 5.5. Execution Efficiency Percentage Figure 22 shows that our method proposes the GDSSky method that acquires the remaining Figure 22 shows that our method proposes the GDSSky method that acquires the remaining skyline at sub-stations after receiving the partial skyline, so it can obtain all spatial skylines earlier skyline at sub-stations after receiving the partial skyline, so it can obtain all spatial skylines earlier than than the ES algorithm. The VS2 algorithm executes from one node recursively, so it gets a linear the ES algorithm. The VS2 algorithm executes from one node recursively, so it gets a linear internal internal skyline, and the arrival time of the external skyline is relatively slow. Meanwhile the TF skyline, and the arrival time of the external skyline is relatively slow. Meanwhile the TF and EMSE and EMSE algorithms calculate the skyline of all interested attributes together, due to the fact the algorithms calculate the skyline of all interested attributes together, due to the fact the dominance dominance relationship comparisons are too many, so the data needs to be continuously transmitted within the network, so the skyline data is obtained relatively slowly. 1.2

e

1

VS2 ES TF EMSE

5.5. Execution Efficiency Percentage Figure 22 shows that our method proposes the GDSSky method that acquires the remaining skyline at sub-stations after receiving the partial skyline, so it can obtain all spatial skylines earlier than the ES algorithm. The VS2 algorithm executes from one node recursively, so it gets a linear Sensors 2016, 16, 454 and the arrival time of the external skyline is relatively slow. Meanwhile the 19 of TF 22 internal skyline, and EMSE algorithms calculate the skyline of all interested attributes together, due to the fact the dominance comparisons relationship are comparisons arethetoo the data needstransmitted to be continuously relationship too many, so datamany, needs so to be continuously within the transmitted within the network, so the skyline data is obtained relatively slowly. network, so the skyline data is obtained relatively slowly. 1.2

VS2 ES TF EMSE GDSSky

skyline arrival percentage

1 0.8 0.6 0.4 0.2 0 2

2.5

3

3.5

4

4.5

5

5.5

6

skyline arrival time（S）

Figure 22. Execution percentage. Figure 22. Execution efficiency efficiency percentage.

5.6. Skyline Efficiency 5.6. Skyline Collection Collection Efficiency Figure 23 23 shows shows the the effect effect of of the the number number of of sensor sensor nodes nodes on on the the skyline skyline collection collection time, time, which which Figure grows with with the the increase increase of of the the number number of of sensor sensor nodes. nodes. However, However, the the skyline skyline collection collection time time of of our our grows method is the least, because our method will cut more non-spatial skyline with the increasing data method is the least, because our method will cut more non-spatial skyline with the increasing data nodes by by the the spatial spatial skyline skyline query query method. method. Based it, the the skyline skyline query query time time will will be be short. short. Our Our nodes Based on on it, method conducts conductsthe the queries collection the time, samequerying time, querying theonskyline on the method queries andand collection at theatsame the skyline the non-spatial non-spatial attributes of the spatial skyline at the same time. Our method uses a clustering strategy attributes Sensors 2016,of 16,the 454 spatial skyline at the same time. Our method uses a clustering strategy to return 19 ofall 21 to return all to thethedata back to node the cluster node so it improves the efficiency. the the data back cluster head so thathead it improves thethat efficiency. At the collection stage, At it uses collection stage,the itcuts uses the cuts sub-regions tocut cutthe the amount non-skyline quickly, and thenamong utilizes inside the sub-region to accurately dominated nodes by judging the the cuts sub-regions toamong cut the the amount of non-skyline quickly, and of then utilizes cuts inside dominance relation through cut the the value of the D-space last transmits them to the the sub-region to accurately dominated nodes threshold, by judgingand the at dominance relation through monitor while thethreshold, VS2 algorithm the dominance relationships one by one,the then it the valuecenter, of the D-space and at judges last transmits them to the monitor center, while VS2 uploads data to the center so the collection efficiency low. Both VS2 strategies algorithm judges the monitor dominance relationships one by one, then itisuploads dataES toand the monitor center completely query and collect theBoth spatial attributes, and then conduct the general skyline so the collection efficiency is low. ES and VS2 strategies completely query and collect thequeries. spatial These strategies have low efficiency. While all the attributes the TF and EMSE algorithmsWhile take attributes, and then conduct the general skyline queries. Theseofstrategies have low efficiency. parttheinattributes the dominance judgment, the take growing number of comparisons increases the all of the TFrelation and EMSE algorithms part in the dominance relation judgment, recovery number time. of comparisons increases the recovery time. growing VS2 ES TF EMSE GDSSky

collection time (s)

6 5 4 3 2 1 0 50

100

200

500

1000


Figure of the the skyline. skyline. Figure 23. 23. Collection Collection efficiency efficiency of

6. Conclusions 6. Conclusions To address address environmental environmental problems, problems, this this paper paper proposes proposes aa geometry-based geometry-based distributed distributed skyline skyline To query strategy strategy based based on on the the characteristics characteristics of of spatial spatial data data and and aiming aiming to to solve solve the the deficiencies deficiencies of of the the query existing skyline query strategies. We design a clustering strategy for the parallel execution of general skyline queries on the non-spatial attributes of the spatial skyline, and conduct the spatial skyline queries on the remaining spatial skyline data which are still not found by the method of the tree at the same time, so it will be conducted in parallel. This paper proposes a distributed execution between different sub-regions. Finally we propose the use of cuts among sub-regions and

Sensors 2016, 16, 454

20 of 22

existing skyline query strategies. We design a clustering strategy for the parallel execution of general skyline queries on the non-spatial attributes of the spatial skyline, and conduct the spatial skyline queries on the remaining spatial skyline data which are still not found by the method of the tree at the same time, so it will be conducted in parallel. This paper proposes a distributed execution between different sub-regions. Finally we propose the use of cuts among sub-regions and cuts in sub-regions for the collection. To solve the problem that the existing spatial and non-spatial skyline query strategies have large energy consumption and poor real-time performance in querying the sensor network, this paper used the concept of the geometry-based spatial skyline, which is based on the query problem of environmental pollution monitoring systems, to reduce the network communication consumption, and increase the network lifetime of the monitoring system. Acknowledgments: This work is supported by the National Natural Science Foundation of China (No. 61472169, 61472072), the National Key Basic Pre-Research Program of China (No. 2014CB360509) and the Foundation of Science Public Welfare of Liaoning Province in China (No. 2015003003). Author Contributions: Yan Wang, Baoyan Song and Li Zhang conceived the thesis research ideas, designed the experiments, and analyzed the data. Yan Wang and Li Zhang performed the experiments. Junlu Wang and Ling Wang provided suggestions and critical insights during the development of this paper. Valuable comments on a first draft were received from Journal. Baoyan Song and Junlu Wang provided valuable comments on a later draft and on the index. Yan Wang, Li Zhang and Ling Wang wrote the manuscript. Conflicts of Interest: The authors declare no conflict of interest.

References 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11.

12. 13. 14.

Sun, Y.; Yu, X.; Bie, R.; Song, H. Discovering Time-dependent Shortest Path on Traffic Graph for Drivers towards Green Driving. J. Netw. Comput. Appl. 2016. [CrossRef] Sun, Y.; Lu, C.; Bie, R.; Zhang, J. Semantic Relation Computing Theory and Its Application. J. Netw. Comput. Appl. 2016. [CrossRef] Sun, Y.; Zhang, J.; Bie, R. Measuring Semantic-based Structural Similarity in Multi-relational Networks. IJDWM 2016. [CrossRef] Zhang, J.; Sun, Y.; Zhu, J.; Qiao, X. A Synergetic Mechanism for Digital Library Service in Mobile and Cloud Computing Environment. Pers. Ubiquitous Comput. 2014. [CrossRef] Sun, Y.; Zhang, J.; Xiong, Y.; Zhu, G. Data Security and Privacy in Cloud Computing. Int. J. Distrib. Sens. Netw. 2014. [CrossRef] Sun, Y.; Jara, A.J. An extensible and active semantic model of information organizing for the Internet of Things. Pers. Ubiquitous. Comput. 2014, 18, 1821–1833. [CrossRef] Guo, J.; Zhang, H.; Sun, Y.; Bie, R. Square-root unscented Kalman filtering-based localization and tracking in the Internet of Things. Pers. Ubiquitous Comput. 2014, 18, 1824–1829. [CrossRef] Sun, Y.; Yan, H.; Lu, C.; Bie, R.; Zhou, Z. Constructing the web of events from raw data in the Web of Things. Mob. Inf. Syst. 2014. [CrossRef] Zhuge, H.; Sun, Y. Schema Theory for Semantic Link Network. Future Gener. Comput. Syst. 2010. [CrossRef] Sun, Y.; Bie, R.; Yu, X.; Wang, S. Semantic Link Networks: Theory, Applications, and Future Trends. J. Internet Technol. 2013, 14, 365–378. D’Arienzo, M.; Iacono, M.; Marrone, S.; Nardone, R. Estimation of the energy consumption of mobile sensors in WSN environmental monitoring applications. In Proceedings of the 2013 27th International Conference on Advanced Information Networking and Applications Workshops (WAINA), Barcelona, Spain, 25–28 March 2013; pp. 1588–1593. Sun, Y.; Song, H.; Jara, A.J.; Bie, R. Internet of Things and Big Data Analytics for Smart and Connected Communities. IEEE Access. 2016. [CrossRef] Sun, Y.; Yan, H.; Zhang, J.; Xia, Y.; Wang, S.; Bie, R.; Tian, Y. Organizing and Querying the Big Sensing Data with Event-Linked Network in the Internet of Things. Int. J. Distrib. Sens. Netw. 2014, 4, 610–617. Chen, B.; Liang, W. Progressive skyline query processing in wireless sensor networks. In Proceedings of the 5th International Conference on Mobile Ad-hoc and Sensor Networks, Fujian, China, 14–16 December 2009; pp. 17–24.

Sensors 2016, 16, 454

15. 16.

17.

18. 19.

20. 21. 22. 23. 24. 25. 26. 27. 28. 29. 30. 31. 32. 33. 34. 35. 36. 37.

38.

21 of 22

Yin, B.; Lin, Y.; Yu, J.; Liu, P. Skyline monitoring in wireless sensor networks. IEICE Trans. Commun. 2013. [CrossRef] Navarro, M.; Davis, T.W.; Liang, Y.; Liang, X. ASWP: A long-term WSN deployment for environmental monitoring. In Proceedings of the 12th International Conference on Information Processing in Sensor Networks, Philadelphia, PA, USA, 8–11 April 2013. Li, Z.; Wang, Y.; Song, B.; Li, X.; Hao, X. Geometry-based spatial skyline query in wireless sensor network. In Proceedings of the 2014 11th Web Information System and Application Conference, Tianjin, China, 12–14 September 2014. Kwon, Y.; Choi, J.H.; Chung, Y.D.; Lee, S. In-network processing for skyline queries in sensor networks. Ieice Trans. Commun. 2007. [CrossRef] Roh, Y.J.; Song, I.; Jeon, J.H.; Woo, K.G.; Kim, M.H. Energy-efficient two-dimensional skyline query processing in wireless sensor networks. In Proceedings of the 2013 IEEE Consumer Communications and Networking Conference (CCNC), Las Vegas, NV, USA, 11–14 January 2013; pp. 294–301. Xin, J.C.; Wang, G.R. Continuous Skyline nodes query processing over wireless sensor networks. Chin. J. Comput. 2012, 35, 2415–2430. [CrossRef] Xin, J.; Wang, G.; Chen, L.; Oria, V. Energy-Efficient Evaluation of Multiple Skyline Queries over a Wireless Sensor Network; Springer: Berlin, Heidelberg, Germany, 2009; pp. 247–262. Wang, G.; Xin, J.; Chen, L.; Liu, Y. Energy-efficient reverse skyline query processing over wireless sensor networks. IEEE Trans. Knowl. Data Eng. 2012, 24, 1259–1275. [CrossRef] Chen, B.; Liang, W.; Yu, J.X. Energy-efficient skyline query optimization in wireless sensor networks. Wirel. Netw. 2012, 18, 985–1004. [CrossRef] Sharifzadeh, M.; Shahabi, C. The spatial skyline queries. In Proceedings of the 32nd international conference on Very large data bases, Seoul, Korea, 12–15 September 2006. Son, W.; Lee, M.W.; Ahn, H.K.; Hwang, S. Spatial Skyline Queries: An Efficient Geometric Algorithm; Advances in Spatial and Temporal Databases; Springer: Berlin, Heidelberg, Germany, 2009; pp. 247–264. Yoon, S.; Shahabi, C. Distributed Spatial Skyline Query Processing in Wireless Sensor Networks. In Proceedings of the IPSN, San Francisco, CA, USA, 13–16 April 2009. Lin, X.; Xu, J.; Hu, H. Range-based skyline queries in mobile environments. Knowl. Data Eng. 2013, 25, 835–849. [CrossRef] Son, W.; Stehn, F.; Knauer, C.; Ahn, H. Top- k manhattan spatial skyline queries. In Algorithms and Computation; Springer International Publishing: Cham, Switzerland, 2014; pp. 22–33. De Berg, M.; van Kreveld, M.; Overmars, M.; Overmars, M. Computational Geometry: Algorithms and Applications; Springer-Verlag: Berlin, Heidelberg, Germany, 2000. Huang, Y.K.; Chang, C.H.; Lee, C. Continuous distance-based skyline queries in road networks. Inf. Syst. 2012, 37, 611–633. [CrossRef] Bartolini, I.; Ciaccia, P.; Patella, M. The skyline of a probabilistic relation. Knowl. Data Eng. 2013, 25, 1656–1669. [CrossRef] Wei, W.; Yang, X.L.; Zhou, B.; Feng, J.; Shen, P.Y. Combined energy minimization for image reconstruction from few views. Math. Probl. Eng. 2012. [CrossRef] Wei, W.; Qiang, Y.; Zhang, J. A bijection between lattice-valued filters and lattice-valued congruences in residuated lattices. Math. Probl. Eng. 2013, 32, 1437–1450. Wang, Y.; Wei, W.; Deng, Q.; Liu, W.; Song, H. An energy-efficient skyline query for massively multidimensional sensing data. Sensors 2016. [CrossRef] [PubMed] Ahmed, K.; Nafi, N.S.; Gregory, M.A. Enhanced distributed dynamic skyline query for wireless sensor networks. J. Sens. Actuator Netw. 2016. [CrossRef] An, P.T.; Hai, N.N.; Hoai, T.V. Advances in Computer Science and Its Applications; Springer: Berlin, Heidelberg, Germany, 2014. Beckmann, N.; Kriegel, H.P.; Schneider, R.; Seeger, B. The R*-Tree: An efficient and robust access method for points and rectangles. In Proceedings of the 1990 ACM SIGMOD International Conference on Management of Data, Atlantic City, NJ, USA, 23–25 May 1990. Hu, L.; Ku, W.; Bakiras, S.; Shahabi, C. Spatial query integrity with voronoi neighbors. Knowl. Data Eng. 2013, 25, 863–876. [CrossRef]

Sensors 2016, 16, 454

39. 40. 41. 42. 43. 44.

45. 46.

22 of 22

Guo, L.; Li, J.; Li, G. Spatio-temporal query processing method in wireless sensor networks. J. Softw. 2006, 17, 794–805. [CrossRef] Mao, X.; Miao, X.; He, Y.; Li, X.Y.; Liu, Y. CitySee: Urban CO2 monitoring with sensors. In Proceedings of the 2012 Proceedings IEEE INFOCOM, Orlando, FL, USA, 25–30 March 2012; pp. 1611–1619. Wang, G. Analysis and study of the deicing and anti-icing for catenary. J. Railw. Eng. Soc. 2009, 8, 93–95. Li, J.; Wang, B.; Wang, G.; Huang, S. A survey of query processing techniques over uncertain mobile objects. J. Front. Comput. Sci. Technol. 2013, 7, 1057–1072. Jin, X.; Zhang, Q.; Zeng, P.; Kong, F.; Xiao, Y. Collision-free multichannel superframe scheduling for IEEE 802.15.4 cluster-tree networks. Int. J. Sens. Netw. 2014, 15, 246–258. [CrossRef] Peng, Y.; Yu, Y.; Guo, L.; Jiang, D.; Gai, Q. An efficient joint channel assignment and QoS routing protocol for IEEE 802.11 multi-radio multi-channel wireless mesh networks. J. Netw. Comput. Appl. 2013, 36, 843–857. [CrossRef] Wei, W.; Yong, Q. Information potential fields navigation in wireless Ad-Hoc sensor networks. Sensors 2011, 11, 4794–4807. [CrossRef] [PubMed] Wei, W.; Xu, Q.; Wang, L.; Hei, X.H.; Shen, P.Y. GI/Geom/1 queue based on communication model for mesh networks. Int. J. Commun. Syst. 2014, 27, 3013–3029. [CrossRef] © 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons by Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).

Distributed optimal power and rate control in wireless sensor networks.

Distributed efficient similarity search mechanism in wireless sensor networks.

Distributed Signal Processing for Wireless EEG Sensor Networks.

Random and directed walk-based top-(k) queries in wireless sensor networks.

Efficient Data Collection in Widely Distributed Wireless Sensor Networks with Time Window and Precedence Constraints.

A Secure Scheme for Distributed Consensus Estimation against Data Falsification in Heterogeneous Wireless Sensor Networks.

Surveying multidisciplinary aspects in real-time distributed coding for Wireless Sensor Networks.

Self Calibrated Wireless Distributed Environmental Sensory Networks.

A Novel Energy-Aware Distributed Clustering Algorithm for Heterogeneous Wireless Sensor Networks in the Mobile Environment.

An efficient distributed algorithm for constructing spanning trees in wireless sensor networks.

Distributed RSS-based localization in wireless sensor networks based on second-order cone programming.

Distributed Information Compression for Target Tracking in Cluster-Based Wireless Sensor Networks.

Reliability of wireless sensor networks.

Distributed clone detection in static wireless sensor networks: random walk with network division.

Distributed least-squares estimation of a remote chemical source via convex combination in wireless sensor networks.

A Comparison of Alternative Distributed Dynamic Cluster Formation Techniques for Industrial Wireless Sensor Networks.

DCPVP: distributed clustering protocol using voting and priority for wireless sensor networks.

Fuzzy-Logic Based Distributed Energy-Efficient Clustering Algorithm for Wireless Sensor Networks.

An uncertainty-based distributed fault detection mechanism for wireless sensor networks.

A Distributed Learning Method for ℓ 1 -Regularized Kernel Machine over Wireless Sensor Networks.

Availability issues in wireless visual sensor networks.

On the feasibility of measuring urban air pollution by wireless distributed sensor networks.

Secure Skyline Queries on Cloud Platform.

Optimizing Retransmission Threshold in Wireless Sensor Networks.