Temporal Data-Driven Sleep Scheduling and Spatial Data-Driven Anomaly Detection for Clustered Wireless Sensor Networks.

sensors Article

Gang Li 1, Bin He 1,*, Hongwei Huang 2 and Limin Tang 1

1 School of Electronics and Information Engineering, Tongji University, Shanghai 201804, China; [email protected] (G.L.); [email protected] (L.T.)
2 Department of Geotechnical Engineering, Tongji University, Shanghai 200092, China; [email protected]
* Correspondence: [email protected]; Tel.: +86-186-2111-6388

Academic Editors: Massimo Poncino and Ka Lok Man
Received: 18 July 2016; Accepted: 22 September 2016; Published: 28 September 2016

Abstract: The spatial–temporal correlation is an important feature of sensor data in wireless sensor networks (WSNs). Most of the existing works based on the spatial–temporal correlation can be divided into two parts: redundancy reduction and anomaly detection. These two parts are pursued separately in existing works. In this work, the combination of temporal data-driven sleep scheduling (TDSS) and spatial data-driven anomaly detection is proposed, where TDSS can reduce data redundancy. The TDSS model is inspired by transmission control protocol (TCP) congestion control. Based on long and linear cluster structure in the tunnel monitoring system, cooperative TDSS and spatial data-driven anomaly detection are then proposed. To realize synchronous acquisition in the same ring for analyzing the situation of every ring, TDSS is implemented in a cooperative way in the cluster. To keep the precision of sensor data, spatial data-driven anomaly detection based on the spatial correlation and Kriging method is realized to generate an anomaly indicator. The experiment results show that cooperative TDSS can realize non-uniform sensing effectively to reduce the energy consumption. In addition, spatial data-driven anomaly detection is quite significant for maintaining and improving the precision of sensor data.

Keywords: spatial–temporal correlation; data-driven sleep scheduling; data-driven anomaly detection; WSN

Sensors 2016, 16, 1601; doi:10.3390/s16101601; www.mdpi.com/journal/sensors

1. Introduction

Monitoring systems based on wireless sensor networks (WSNs) play an important role in sensor network applications, such as civil structure monitoring [1], environment monitoring, habitat monitoring and military warfare. A large number of sensor nodes are deployed to monitor the object, and their sensor data are transferred to the sink node through single-hop or multi-hop communication in WSNs [2]. Since these sensor nodes are in similar environments, sensor data from nearby nodes have a high spatial correlation. In addition, the data of a single sensor node have a high temporal correlation when physical phenomena of the object change slowly. Accordingly, sensor data of such monitoring systems have a high spatial–temporal correlation.

Tunnel structural health monitoring is vital to the operation safety of subway systems. The tunnel monitoring system based on WSNs is gradually replacing manual monitoring because of the harsh environment in the tunnel. Compared with cable-based monitoring systems, the tunnel monitoring system based on WSNs can reduce costs and the complexity of sensor deployment.

A fundamental challenge of monitoring systems based on WSNs lies in the energy constraint of battery-powered sensor nodes, which limits the network lifetime. Moreover, this work is based on the monitoring system of the metro tunnel, which is difficult to enter, so the labor cost of replacing batteries is quite high. Low energy consumption is therefore extremely important in the tunnel monitoring system.

A large number of scholars have confirmed that exploiting the spatial–temporal correlation of sensor data is helpful for reducing energy consumption. The energy consumption of sensor nodes is mainly caused by wireless transmission, data sensing and node operation. To reduce the energy consumption caused by wireless transmission, data aggregation based on the spatial–temporal correlation has been proposed [3]. This kind of data aggregation uses the spatial–temporal correlation to reduce the amount of transmitted data. To reduce the energy consumption caused by data sensing, sparse sensing based on the spatial–temporal correlation can reduce the number of collected samples [4]. To reduce the energy consumption caused by node operation, the spatial–temporal correlation can be used to realize the sleep scheduling of nodes [5]. In addition, sleep scheduling also reduces the number of collected samples and the amount of transmitted data, so it is a significant method for reducing the energy consumption of sensor nodes. Accordingly, this work focuses on data-driven sleep scheduling to save energy.

In this work, the tunnel monitoring system is used in the shield tunnel, where segments are generally organized in the form of rings. Sensor nodes are therefore arranged in rings. Besides the aforementioned fundamental challenge of the energy constraint, the tunnel monitoring system has two core requirements for data: (1) synchronous acquisition in the same ring should be realized for analyzing the situation of every ring; and (2) high precision of all sensor data should be ensured for analyzing and monitoring the situation of the tunnel structure.
In consideration of the second core requirement for data, the spatial correlation is used in a proposed anomaly detection, which is termed spatial data-driven anomaly detection. This anomaly detection method can determine whether the node is working properly, which is a necessary precondition for keeping the high precision of all sensor data. On the other hand, temporal correlation is used in the proposed sleep scheduling, which is termed temporal data-driven sleep scheduling. This sleep scheduling works in different nodes in a cooperative way according to the first core requirement for data.

For the tunnel monitoring system, a network topology with a long and linear cluster structure is also built [6]. Clustering, which can reduce the energy consumption, has been widely used in WSNs [7]. In clustered WSNs, some sensor nodes become cluster heads to collect data from their respective cluster members, and then forward these data to the sink node.

Based on the above, this work presents temporal data-driven sleep scheduling and spatial data-driven anomaly detection for clustered WSNs. The main contributions and novelties of this work are as follows: (1) the combination of temporal data-driven sleep scheduling and spatial data-driven anomaly detection is proposed. The proposed approach is data-driven, and thus it can be used in any clustered WSN; (2) the proposed cooperative temporal data-driven sleep scheduling (TDSS) not only can realize non-uniform sensing to reduce the energy consumption, but also can realize synchronous acquisition in the same ring for analyzing the situation of every ring; and (3) the proposed spatial data-driven anomaly detection, which generates an anomaly indicator ξ to decide the priority level of node maintenance, can maintain and improve the precision of sensor data effectively.

The rest of this work is organized as follows. Section 2 reviews related work. In Section 3, the temporal data-driven sleep scheduling model is proposed. Section 4 describes a cluster-based cooperation mechanism. Performance of the proposed methods is evaluated in Section 5. Section 6 concludes this work.

2. Related Work

In most of the existing works, the spatial–temporal correlation is used to reduce data redundancy. Data redundancy can be reduced in several ways, including data aggregation, sparse sensing, sleep scheduling and so on. Data aggregation techniques using soft computing methods for WSNs mainly have four types: neural-based data aggregation, fuzzy-based data aggregation, genetic-based data aggregation and swarm-based data aggregation [8]. Most of these data aggregation
techniques are based on cluster or tree. Alsheikh et al. [9] presented a data compression algorithm with error bound guarantee, where a three-layer neural network was used to extract the feature of data. Their algorithm exploits the spatial–temporal correlation in the raw data to generate a lower-dimensional representation so that the amount of data transmitted by the cluster head is reduced. Since their algorithm is performed in the cluster head, the amount of data transmitted from cluster members to the cluster head is not changed. To further reduce the amount of such data, Arunraja et al. [10] deployed a dual prediction based reporting between cluster members and their cluster head, which is based on the temporal correlation of sensor data. After a few rounds of data transmission, this dual prediction in the cluster member and the cluster head starts to predict the next data from previous data synchronously. If the prediction is in line with the actual data, the cluster member will not transmit the actual data to the cluster head. In addition, the cluster head reduces the spatial data redundancy after receiving data from its cluster members. Although the energy spent for the transmission is further optimized by reducing the amount of data transmitted from cluster members to the cluster head, the energy consumption of sensing is still ignored. Chen et al. [4] realized a distributed adaptive sparse sensing based on compressive sensing, which reduced the amount of collected samples and recovered the missing data by using the spatial–temporal correlation. By exploiting spatial and temporal (spatiotemporal) correlations among sensor readings simultaneously, Gong et al. [11] proposed a spatiotemporal compressive network coding. This scheme compressed and recovered sensor data in a more energy-efficient manner, which was validated in their simulation. 
The aforementioned methods based on the spatial–temporal correlation can reduce the energy consumption by optimizing data transmission and data sensing, which need to keep sensor nodes active. Sleep scheduling is therefore considered to further reduce the energy consumption. Ma et al. [12] proposed an optimal sleep scheduling scheme based on balanced energy consumption, where sensor nodes had three states including active, sleep and pre-sleep. Simulation results showed that their sleep scheduling improved the sleep ratio and extended the network lifetime. They focused on balanced energy consumption, which was realized by considering the residual energy and the distance among nodes. Aimed at keeping high network coverage through a minimum number of sensor devices in active state, Al-Kahtani et al. [5] presented an efficient cluster-based sleep scheduling. In their work, the sensing range of the node is larger than the distance between nodes. Thus, some nodes can remain in sleep mode if monitoring areas nearby are covered. These works are based on the existing clustering routing. Yang et al. [13] proposed sleeping multipath routing that could trade off network lifetime and reliability by dynamically activating an optimal number of paths and putting the rest of the sensors to sleep. An adaptive sensor sleeping solution was also proposed, and it further prolonged the network lifetime by putting sensors to sleep at both the routing and the Media Access Control (MAC) layers. Liu et al. [14] presented a joint routing-and-sleep-scheduling scheme by incorporating the design of routing and sleep scheduling into one optimization framework. An asynchronous sleep scheduling mechanism in different sensor nodes was described. The problem about optimal joint routing and sleep scheduling was solved through an iterative geometric programming algorithm. According to existing research results, sleep scheduling is an efficient way to reduce the energy consumption in WSNs. 
Some scholars have exploited the spatial–temporal correlation of sensor data to detect the anomaly. Titouna et al. [15] proposed a two-level outlier detection approach using Bayes classifiers, where the first level was implemented locally inside the sensor nodes and the second level was carried out in the cluster head or gateway. Their approach used these two levels based on the spatial–temporal correlation to ensure a high accuracy and reduce the false alarm. To detect faulty readings with greater accuracy, Nisha et al. [16] realized anomaly detection by using a proactive correlated fuzzy system. Their correlation acts included the fuzzy-temporal act, the fuzzy-attribute act, and the fuzzy-spatial act. Their experimental results proved that their method outperformed the existing work in anomaly detection accuracy and false alarm. Xiang et al. [17] focused on the spatial correlation to realize event detection in the three-dimensional space. The Neyman–Pearson principle and collaborative detectors were utilized to improve the accuracy of detecting the occurrence of the event. According to existing research results, anomaly detection based on the spatial–temporal correlation is feasible and valuable.

3. Temporal Data-Driven Sleep Scheduling Model

The TDSS model is inspired by transmission control protocol (TCP) congestion control. To ensure the stability and efficiency of data transmission in the network, TCP congestion control prevents the occurrence of a situation in which network performance is affected or even paralyzed because some data packets are not processed promptly. The core idea of TCP congestion control is to maximize network throughput without congestion by adjusting the congestion window. The change in congestion window size of slow start and the congestion avoidance algorithm is shown in Figure 1. In the case of sleep scheduling of sensor nodes, sleep scheduling is just like a kind of “congestion control” of time.

[Figure 1: plot of congestion window size (0 to 25) against rounds of transmission (0 to 20); the curve is annotated “increase exponentially” and then “increase step by step”.]

Figure 1. The change in congestion window size of slow start and the congestion avoidance algorithm.
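The two growth phases of Figure 1 can be illustrated with a minimal sketch. It follows the textbook behaviour of TCP slow start (the window doubles every round) and congestion avoidance (the window grows by one segment per round). The function name `congestion_window_sizes` and the threshold `ssthresh=16` are assumptions chosen for illustration; they are not values read from Figure 1, and loss handling is left out.

```python
def congestion_window_sizes(rounds, ssthresh=16, initial_cwnd=1):
    """Return the congestion window size used in each transmission round.

    ssthresh and initial_cwnd are illustrative assumptions, not values
    taken from the paper. Packet loss and timeouts are not modelled.
    """
    sizes = []
    cwnd = initial_cwnd
    for _ in range(rounds):
        sizes.append(cwnd)
        if cwnd < ssthresh:
            # Slow start: "increase exponentially", capped at the threshold.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: "increase step by step".
            cwnd += 1
    return sizes


if __name__ == "__main__":
    print(congestion_window_sizes(10))
```

TDSS maps the same shape onto time: the sleep time plays the role of the window, TSmax plays the role of the point where doubling stops, and the step T plays the role of the one-segment increment.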

To describe the TDSS model, the following parameters are defined, namely, the maximum sleep time TSmax, the minimum sleep time TSmin, the sleep time step T, the counter variable N (set to 0 by default), and the threshold τ. TSmax and TSmin should be integer multiples of T. In this section, the TDSS model is based on a single sensor node. The main operation processes of the TDSS model are as follows: (1) the first sleep time TS1 is set to TSmin; (2) the absolute value of the difference between two consecutive sensor data is recorded as ∆V. After the first sleep time TS1, if ∆V is less than τ, the second sleep time TS2 will be equal to TS1 × 2; (3) after the second sleep time TS2, if ∆V is less than τ, the third sleep time TS3 will be equal to TS2 × 2, and so on; (4) after the ith sleep time TSi, if ∆V < τ and 2 × TSi > TSmax, the (i + 1)th sleep time TSi+1 will be equal to TSi + T; (5) after the (j − 1)th sleep time TSj−1, if ∆V is greater than or equal to τ, the jth sleep time TSj will be equal to TSj−1 − T, and N = N + 1; (6) after TSj, if ∆V is still greater than or equal to τ, the (j + 1)th sleep time TSj+1 will be equal to TSj/2, and N = N + 1. Otherwise, if ∆V is less than τ, TSj+1 will be equal to TSj + T, and N = 0; (7) if N is greater than 3, the sleep time TS will be set to TSmin.

TDSS Model: Temporal Data-Driven Sleep Scheduling Model
Input: Sensor data of a single sensor node, TSmax, TSmin, T, N and τ.
Output: The sleep time TS.
  Initialize TSmax, TSmin, T, N = 0 and τ.
  Collect the first sensor data, enter the sleeping state with TS = TSmin, and then collect the second sensor data.
  while (true)
    while (∆V < τ)
      TS = TS × 2, N = 0
      if TS > TSmax, then
        TS = TS/2 + T
      if TS > TSmax, then
        TS = TSmax
      Enter the sleeping state, and then collect sensor data after TS
    while (∆V ≥ τ)
      N = N + 1
      if N ≤ 1, then
        TS = TS − T
      else if N ≤ 3, then
        TS = TS/2
      else if N > 3, then
        TS = TSmin
      if TS < TSmin, then
        TS = TSmin
      Enter the sleeping state, and then collect the sensor data after TS
  end

For slowly changing data such as inclination data, temperature data, the splaying amount of joints, and leakage data in the tunnel, the TDSS model can keep the sleep time at a relatively stable value once the sleep time approaches an appropriate value. The stage of achieving the stable sleep time (TSstb) is termed the stable process in this work. The criterion for entering the stable process is that four consecutive TS vary between X and X + T, where X is a constant. In addition, X and X + T are recorded as TSstb. This kind of data-driven sleep scheduling model is not confined by the types of sensors.

To describe the change in the sleep time of TDSS, four cases are selected as the inputs of TDSS, as shown in Figure 2. In Figure 2a, the stage before the 11th data acquisition, in which TS approaches the stable sleep time, is termed the stability approximation process in this work. Since TDSS is inspired by TCP congestion control, the stability approximation process is similar to the trend of the change in congestion window size shown in Figure 1. In Figure 2a, a sinusoidal signal τsin(ωt) is selected as the input of TDSS. At the beginning of TDSS, ∆V is always less than τ, and 2 × TS ≤ TSmax. Thus, TS increases exponentially. If ∆V < τ and 2 × TS > TSmax, TS will increase step by step. When ∆V is close to τ, TDSS enters the stable process. In Figure 2b, where the input is a constant value, ∆V is always less than τ, and thus TS will remain at TSmax after the two stages “increase exponentially” and “increase step by step”. In Figure 2c, the inclination of the metro tunnel without metro operation changes very little, and the change is caused by the surrounding disturbance and sensor errors. Accordingly, although TDSS enters the stable process temporarily, TSstb can be changed according to the variation of sensor data. In Figure 2d, TS decreases rapidly after the 11th data acquisition because metro train vibration can cause instantaneous change in the inclination angle. A small TS can preserve more data with the key features of the instantaneous change in the inclination angle. After the metro train leaves, TS increases to reduce the redundancy of sensor data.

As stated above, TDSS can enter the stable process when the input of TDSS is constant or changing regularly for a while. The aim of TDSS is to reduce the temporal redundancy of sensor data through appropriate sleep time rather than to achieve the stable process. To reduce the redundancy of actual sensor data, TDSS achieves appropriate TS for actual sensor data instead of the true value, which cannot be obtained. Since actual sensor data have certain randomness caused by random events and errors, TDSS in a practical application does not necessarily tend to stabilize, which is exactly a characteristic of the data-driven mechanism. TDSS can achieve appropriate TS according to the variation of actual sensor data to reduce data redundancy, which will be verified in Section 5.
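The update rule of the TDSS listing can be sketched in a few lines of Python. This is an illustrative reading of the listing, not the authors' implementation. The names `next_sleep_time`, `run_tdss` and `in_stable_process` are hypothetical, the readings are passed in as a pre-recorded list (on a real node the sampling instants would themselves depend on TS), and the stable-process test interprets "four consecutive TS vary between X and X + T" as a spread of at most T over the last four values.

```python
def next_sleep_time(ts, delta_v, n, ts_min, ts_max, step, tau):
    """Apply one TDSS update and return (new_ts, new_n)."""
    if delta_v < tau:
        n = 0
        if ts * 2 <= ts_max:
            ts = ts * 2                  # "increase exponentially"
        else:
            ts = min(ts + step, ts_max)  # "increase step by step", capped at TSmax
    else:
        n += 1
        if n <= 1:
            ts = ts - step               # first change: back off by one step
        elif n <= 3:
            ts = ts / 2                  # repeated change: halve (may leave the T grid)
        else:
            ts = ts_min                  # N > 3: fall back to the minimum
        ts = max(ts, ts_min)
    return ts, n


def run_tdss(samples, ts_min, ts_max, step, tau):
    """Return the sleep time chosen before each acquisition after the first."""
    ts, n = ts_min, 0
    sleep_times = [ts]
    for previous, current in zip(samples, samples[1:]):
        delta_v = abs(current - previous)
        ts, n = next_sleep_time(ts, delta_v, n, ts_min, ts_max, step, tau)
        sleep_times.append(ts)
    return sleep_times


def in_stable_process(sleep_times, step):
    """Assumed criterion: the last four TS values differ by at most one step."""
    last = sleep_times[-4:]
    return len(last) == 4 and max(last) - min(last) <= step


if __name__ == "__main__":
    # Constant input, as in Figure 2b; the parameter values are illustrative only.
    print(run_tdss([5.0] * 8, ts_min=1, ts_max=10, step=1, tau=0.5))
```

With a constant input the sketch doubles the sleep time, then steps it up, then holds it at TSmax, which is the behaviour described for Figure 2b.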

[Figure 2: four panels (a) to (d), each plotting sleep time (from TSmin to TSmax) against the number of data acquisition (0 to 20). Annotations include “increase exponentially”, “increase step by step” and “stable process” at TSstb; one panel is annotated with “stable process 1” and “stable process 2” at TSstb 1 and TSstb 2.]

Figure 2. The change in the sleep time of temporal data-driven sleep scheduling (TDSS) with different inputs. (a) sinusoidal signal τsin(ωt) as the input of TDSS; (b) constant value as the input of TDSS; (c) inclination of metro tunnel without metro operation; and (d) inclination of metro tunnel with metro operation.

4. Cluster-Based Cooperation Mechanism

4.1. Long and Linear Cluster Structure

For the tunnel monitoring system, a network topology with long and linear cluster structure is built. Sensor nodes in the same ring form a cluster, and then a long and linear cluster structure is formed in the long and linear tunnel. As shown in Figure 3, a cluster head in every ring collects data from its cluster members. If the gateway is in the communication range of the cluster head, the cluster head will transmit data to the gateway directly; otherwise, the cluster head will transmit data to the gateway through the inter-cluster relay.

Figure 3. The tunnel monitoring system based on the long and linear cluster structure.

The tunnel monitoring system has two core requirements for data, which are described in Section 1. The first core requirement for data is considered in Section 4.2, and the second core requirement for data is considered in Section 4.3. The long and linear cluster structure is the basis of Sections 4.2 and 4.3. Its brief description is therefore given. The main steps in building the long and linear cluster structure are as follows: (1) the round mechanism, which has two stages including the stage of establishing clusters and the stage of stable work, is used in the clustering routing algorithm; (2) the first step of establishing clusters in every round is to select the cluster head. Sensor nodes in the same ring become the cluster head in turn, so that the energy consumption of nodes is balanced; (3) the second step of establishing clusters is to form a cluster. Except for the cluster head, other sensor nodes in the same ring become cluster members of this cluster head. Then, a cluster is formed; (4) the third step of establishing clusters is to realize multi-hop among clusters by a greedy algorithm [6]; and (5) sensor nodes enter the stage of stable work with TDSS.

4.2. Cooperative TDSS in the Cluster

The proposed TDSS model in Section 3 is based on a single sensor node. According to the first core requirement for data, cooperative TDSS in the cluster is therefore presented to keep the same sleep time in the same ring.

Since time synchronization error among clusters can be tolerated in the tunnel monitoring system, sensor nodes are set to the same time before installation, which is a simple way to roughly align time among clusters. Sensor data in the same cluster require a higher time synchronization precision. Time synchronization in the cluster is easier to realize than time synchronization among clusters. Accordingly, reference broadcast synchronization [18], which is commonly used, is used to realize time synchronization in the cluster.

According to the TDSS model, every sensor node has calculated the sleep time TS, which is not the final sleep time. Cooperative TDSS in the cluster further processes these sleep times to generate a unified sleep time in the cluster. The ith cluster, which has seven sensor nodes i1, i2, i3, i4, i5, i6 and i7, is analyzed as an example. Firstly, TS(ix) denotes the sleep time of these sensor nodes, which is obtained from the TDSS model. The minimum TS(ix), which is recorded as min(TS(ix)), is selected by the sorting method. Secondly, the unified sleep time in the ith cluster, which is recorded as TS(i), can be calculated as shown in Equation (1):

$$TS(i) = \min\left(TS(i_x)\right) + \left\lfloor \frac{1}{m}\sum_{j=1}^{m}\frac{TS(i_j) - \min\left(TS(i_x)\right)}{T} \right\rfloor \times T, \qquad (1)$$

where m denotes the number of sensor nodes in the cluster. In the case of the ith cluster, m is equal to 7. Every sensor node in the ith cluster uses the same sleep time TS(i). Through the same sleep time and time synchronization in the cluster, synchronous acquisition in the same ring can be realized.

4.3. Spatial Data-Driven Anomaly Detection in the Cluster

Besides the temporal correlation of sensor data, which is the basis of cooperative TDSS in the cluster, the spatial correlation of sensor data should be considered. Different from most existing works, where the spatial correlation is used to further compress data, the spatial correlation is used in cluster-based anomaly detection in this work. Spatial compression can cause some loss of precision, which is against the second core requirement for data, whereas spatial data-driven anomaly detection can determine whether the node is working properly, which is a necessary precondition for keeping the high precision of all sensor data. Accordingly, the combination of temporal data-driven sleep scheduling and spatial data-driven anomaly detection can meet the core requirements for data of the tunnel monitoring system.

Spatial data-driven anomaly detection is implemented in the cluster head. The anomalies of sensor nodes mainly include “sensor error”, “abnormal data” and “disconnected node”. Although high precision sensors are useful for high precision data acquisition, sensors start to drift slowly as the network ages, which is a kind of systematic error. “Sensor error”, including the systematic error and the random error, is inevitable. “Abnormal data” can be caused by an unstable node or network. In addition, “disconnected node” means that a node is neither the head nor a member of the cluster corresponding to the ring where this node is located. These three anomalies have different features in the data: small deviation data in “sensor error”, big deviation data in “abnormal data” and no data in “disconnected node”.
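As a brief aside on the cooperative scheme of Section 4.2, the unified sleep time of Equation (1) can be sketched as follows. The sketch assumes that the brackets in Equation (1) denote the floor function, so that the result stays on the grid of multiples of T above the cluster minimum. The function name `unified_sleep_time` and the sample values are hypothetical.

```python
import math


def unified_sleep_time(node_sleep_times, step):
    """Equation (1): cluster minimum plus the floored mean excess, in units of T.

    node_sleep_times holds TS(i_1) ... TS(i_m) from the single-node TDSS model;
    step is the sleep time step T. Reading the brackets as a floor is an assumption.
    """
    m = len(node_sleep_times)
    lowest = min(node_sleep_times)
    total_excess = sum(ts - lowest for ts in node_sleep_times)
    # Mean excess over the minimum, expressed in whole steps of size T.
    whole_steps = math.floor(total_excess / (m * step))
    return lowest + whole_steps * step


if __name__ == "__main__":
    # Seven nodes in one ring; the values are illustrative only.
    print(unified_sleep_time([4, 4, 6, 8, 8, 10, 16], step=2))
```

Because the mean excess is rounded down, a single node with a long sleep time cannot pull the whole ring far above the shortest sleep time requested in the cluster.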
To calculate the deviation data, which is the difference between the sensor data and the true data, Kriging is used to estimate the true data. Kriging, a spatial interpolation method, can interpolate the sensor value at an unobserved location from observations at nearby locations. Compared with other spatial interpolation methods, Kriging has two main advantages for spatial data-driven anomaly detection in the tunnel monitoring system: (1) Kriging takes full account of the spatial correlation of sensor data to achieve high precision of spatial interpolation; (2) Kriging is applicable to regions where sampling data has both randomness and structural properties, and spatial data-driven anomaly detection in the tunnel monitoring system operates in exactly this kind of region. Given the inclination values IV = [iv1, iv2, ..., ivm] of sensor nodes X = [x1, x2, ..., xm], Kriging can estimate the inclination angle iv* at another location through a linear combination of iv1, iv2, ..., ivm, as shown in Equation (2):

$$iv^{*} = \alpha^{*} + \sum_{k=1}^{m} \lambda_k \left(iv_k - \alpha_k\right), \qquad (2)$$

where αk denotes the initial installation angle of sensor node xk, α* is the initial installation angle at the other location, and λk is the weight factor. Every inclination sensor node is initialized after installation, and its first sensor reading is used as its initial installation angle. The calculation of iv* therefore depends on the selection of λk. The weight factors λk are chosen to satisfy the unbiased condition given by Equation (3) and to minimize the Kriging variance given by Equation (4):

$$E\left[iv^{*} - iv\right] = \sum_{k=1}^{m} \lambda_k \left(iv_k - \alpha_k\right) - \left(iv - \alpha^{*}\right) = 0, \qquad (3)$$

$$\min\left[(\delta^{*})^{2}\right] = \mathrm{Var}\left(iv^{*} - iv\right) = \sum_{k=1}^{m}\sum_{j=1}^{m} \lambda_k \lambda_j\, c\left(iv_k, iv_j\right) + \mathrm{Var}(iv) - 2\sum_{k=1}^{m} \lambda_k\, c\left(iv_k, iv\right), \qquad (4)$$

where c(·, ·) denotes the covariance between two inclination values.
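As an illustration of how the weight factors in Equations (2)–(4) might be obtained, the following is a minimal sketch rather than the authors' implementation. It assumes that the unbiased condition is enforced in the usual ordinary-Kriging way (weights summing to one, handled with a Lagrange multiplier), that node locations can be reduced to a one-dimensional coordinate, and that the covariance follows an exponential model. The function names, the covariance model and its parameters (`sill`, `corr_range`) are all hypothetical.

```python
import math

def exp_cov(h, sill=1.0, corr_range=5.0):
    """Assumed covariance model c(h) = sill * exp(-|h| / corr_range)."""
    return sill * math.exp(-abs(h) / corr_range)

def solve(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def kriging_estimate(pos, iv, alpha, pos_star, alpha_star, cov=exp_cov):
    """Estimate iv* at pos_star in the form of Eq. (2).

    The weights minimize the variance of Eq. (4) subject to the assumed
    sum-to-one constraint; the last unknown is the Lagrange multiplier.
    """
    m = len(pos)
    a = [[cov(pos[k] - pos[j]) for j in range(m)] + [1.0] for k in range(m)]
    a.append([1.0] * m + [0.0])
    b = [cov(pos[k] - pos_star) for k in range(m)] + [1.0]
    lam = solve(a, b)[:m]
    est = alpha_star + sum(l * (v - a0) for l, v, a0 in zip(lam, iv, alpha))
    return est, lam

# Hypothetical values: six neighbours, each drifted by 0.02 from its
# installation angle, used to estimate the node at position 0.
pos = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
alpha = [-4.70] * 6
iv = [a0 + 0.02 for a0 in alpha]
est, lam = kriging_estimate(pos, iv, alpha, 0.0, -4.71)
print(est, lam)
```

Under these assumptions the weights depend only on the node geometry and the covariance model, so they could be computed once per cluster and reused at every sampling time.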

The ith cluster, which has seven sensor nodes i1, i2, i3, i4, i5, i6 and i7, is again taken as an example. The proposed anomaly detection is cluster-based: all sensor nodes in a cluster are checked in turn. When sensor node i1 is being checked, the other nodes in the cluster are taken as the input of the Kriging method, namely X = [x1, x2, ..., xm] = [i2, i3, i4, i5, i6, i7]. The true data of node i1 is estimated by Equations (2)–(4) and recorded as iv1*. The deviation data of node i1 is then ∆iv1 = |iv1 − iv1*|, where iv1 denotes the inclination value reported by node i1.

Based on the deviation data, a simple anomaly classification and an anomaly indicator are presented. If the cluster head does not receive sensor data from a node, the node is in the state “disconnected node”. If the deviation data of a node is greater than the threshold value β, the node is in the state “abnormal data”. Otherwise, the node is in the state “sensor error”. The threshold value β can be set according to the application scenario of each kind of sensor node. For the inclination sensor nodes in the tunnel monitoring system, β can be set to 1. Besides the sensor node itself, external factors such as artificial disturbance or major disasters can also cause “abnormal data”. Whatever the cause, “abnormal data” increases the anomaly indicator ξ and thereby draws attention. Every sensor node has an anomaly indicator ξ, which is updated whenever ∆iv1 is updated: ξ = (ξ + ∆iv1)/2. The anomaly indicator ξ is divided into three levels: green, yellow and red. These levels are used to decide the priority of node maintenance. Data-driven anomaly detection based on the spatial correlation and the Kriging method, as an important supplement to cooperative TDSS, gives the tunnel monitoring system more practical value.
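The classification rule and the indicator update described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, a missing reading is represented by `None` as an assumption, and the level boundaries (0.2 and 0.5) are the ones reported for this system in the performance evaluation.

```python
from typing import Optional

BETA = 1.0  # threshold suggested in the text for inclination sensor nodes

def classify(sensor_value: Optional[float], estimate: float,
             beta: float = BETA) -> str:
    """Classify a node from its reading and its Kriging estimate."""
    if sensor_value is None:  # assumption: None means no data was received
        return "disconnected node"
    deviation = abs(sensor_value - estimate)
    return "abnormal data" if deviation > beta else "sensor error"

def update_indicator(xi: float, deviation: float) -> float:
    """Indicator update from the text: xi = (xi + deviation) / 2."""
    return (xi + deviation) / 2.0

def level(xi: float) -> str:
    """Map the indicator to the maintenance priority level."""
    if xi < 0.2:
        return "green"
    if xi < 0.5:
        return "yellow"
    return "red"

# Hypothetical trace: one large deviation followed by a normal reading.
xi = update_indicator(0.0, 1.2)
print(xi, level(xi))
xi = update_indicator(xi, 0.0)
print(xi, level(xi))
```

Because each update averages the previous indicator with the newest deviation, a single isolated anomaly decays by half at every later normal sampling, whereas a persistent anomaly keeps the indicator high.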
Such anomaly detection also has two effects on the tunnel monitoring system: (1) the spatial correlation of sensor data is used to generate the anomaly indicator ξ, which helps maintain and improve the precision of sensor data through node maintenance; and (2) data-driven anomaly detection keeps cooperative TDSS valid, because high precision of sensor data is a necessary precondition for cooperative TDSS to work effectively.

To better illustrate the relationship between cooperative TDSS and spatial data-driven anomaly detection, the data flow diagram of the proposed methods is given. Since the proposed methods are used in clustered WSNs, Figure 4 shows the data flow diagram of the ith cluster as an example. Colored squares are used to represent data. The rows of matrices S1, S2, S3, S4, S5 and S6 represent the space dimension (different sensor nodes), and the columns represent the time dimension (different collecting times). The relationship between cooperative TDSS and spatial data-driven anomaly detection has three parts: (1) cooperative TDSS ensures data synchronization for spatial data-driven anomaly detection. Cooperative TDSS realizes synchronous acquisition in the same ring, which is a necessary precondition for spatial data-driven anomaly detection; (2) cooperative TDSS reduces data redundancy to make spatial data-driven anomaly detection efficient. Compared with S3, which is uncompressed, S4 processed by cooperative TDSS preserves the data with the key features and so reduces data redundancy. Anomalies are reflected in these data with the key features, so reducing unnecessary input data improves the efficiency of spatial data-driven anomaly detection; and (3) spatial data-driven anomaly detection maintains and improves the precision of sensor data to keep cooperative TDSS valid. Accordingly, cooperative TDSS and spatial data-driven anomaly detection depend on each other to improve overall performance.

Figure 4. The data flow diagram of the ith cluster.

5. Performance Evaluation

This section presents the performance evaluation of temporal data-driven sleep scheduling and spatial data-driven anomaly detection through experiments and analysis.

5.1. Experiment Platform

The proposed methods, including temporal data-driven sleep scheduling and spatial data-driven anomaly detection, are verified in an independently developed tunnel monitoring system in Shanghai, China, as shown in Figure 5. The sink node connects to the sensor nodes through ZigBee (ZigBee Alliance, Davis, CA, USA), and it connects to the server through 3G. This tunnel monitoring system, which is deployed from Tiantong Road Station to International Cruise Terminal Station on the Shanghai Metro #12 Line, has three kinds of sensor nodes: the inclination sensor, the water leakage sensor and the joint splaying sensor. Since the proposed methods use a data-driven scheme, they can be applied to all kinds of sensor nodes. For the sake of clarity, experiments and analysis are based on the inclination sensor in this work.

Figure 5. Tunnel monitoring system in Shanghai, China.

The inclination angle of the segment in the tunnel changes slowly. The influence of the surrounding disturbance on the inclination sensor can be studied by analyzing inclination data in a short-term span. Inclination data changes very little over a long-term span if no incident occurs. According to the experimental data, the variation scope of the axial inclination angle in a short-term span (several hours) with metro operation is less than 0.2°, because metro train vibration causes instantaneous change in the inclination angle. In addition, the variation scope of the axial inclination angle over a long-term span (several months) without metro operation is less than 0.02°, because change in the inclination of the segment is extremely slow. Reducing the redundancy of data by sleep scheduling is therefore significant. To verify the proposed methods in different time spans, experimental analysis in a short-term span and experimental analysis over a long-term span are given.

5.2. Experimental Analysis in a Short-Term Span

In metro tunnels, metro train vibration has a pronounced influence on inclination data, especially in a short-term span. Two experiment scenarios are therefore set in this section: Scenario A, a metro tunnel with metro operation (05:30–24:00), and Scenario B, a metro tunnel without metro operation (0:00–5:30). In these two scenarios, non-uniform sensing with cooperative TDSS under different α (0.01°, 0.05°) is performed, where the Default Acquisition Cycle (DRC) is set to 1 s. DRC is kept small so as to capture the changes in inclination data caused by metro train vibration.

Non-uniform sensing with cooperative TDSS in Scenario A is shown in Figure 6. Inclination data changes abruptly at about 1100 DRC and 2000 DRC because metro train vibration causes instantaneous change in the inclination angle. The instantaneous change in inclination data at about 1600 DRC is likely caused by a metro train nearby. Metro train vibration thus has a pronounced influence on inclination data in a short-term span. Monitoring the normal inclination angle of the segment in the tunnel is the real goal of such tunnel monitoring systems. Non-uniform sensing with cooperative TDSS can reduce the number of samplings by discarding unnecessary data. Provided that the actual change in the inclination angle of the segment can still be restored effectively from less data, the energy consumption of sensor nodes can also be reduced significantly by non-uniform sensing with cooperative TDSS. It can also be seen that the frequency of non-uniform sensing decreases as α increases. Non-uniform sensing with α = 0.01° shows that even instantaneous change in inclination data can be sensed if α is small enough. Data sensed with α = 0.01° are closer to the actual data than data sensed with α = 0.05°, but more energy and more storage space are needed because more data are sensed. A suitable non-uniform sensing can therefore be realized by adjusting α.

Figure 6. Non-uniform sensing with cooperative TDSS in Scenario A in a short-term span. (Axes: the inclination angle (°) versus the number of raw data acquisitions. Series: raw sensor data, linear fitting, non-uniform sensing with α = 0.01°, non-uniform sensing with α = 0.05°.)

Figure 7 shows non-uniform sensing with cooperative TDSS in Scenario B. Without metro train vibration, inclination data is mainly affected by environmental noise. The variation scope of inclination data in Scenario B is smaller than that in Scenario A. Compared with Figure 6, the frequency of non-uniform sensing in Figure 7 is lower.


Figure 7. Non-uniform sensing with cooperative TDSS in Scenario B in a short-term span. (Series: raw sensor data, linear fitting, non-uniform sensing with α = 0.01°, non-uniform sensing with α = 0.05°.)

Consequently, cooperative TDSS can realize non-uniform sensing in a short-term span effectively. According to the temporal correlation, such non-uniform sensing based on cooperative TDSS can reduce the redundancy of data by discarding unnecessary data. In addition, the actual change in the inclination angle of the segment can be restored effectively from less data. Moreover, the energy consumption of sensor nodes can be reduced significantly so that the life cycle of sensor nodes can be extended.

Spatial data-driven anomaly detection is then verified. Figure 8 shows the anomaly indicator ξ of four sensor nodes in a short-term span. Node 276₃, which is a sensor node at the segment with the ring number 276, is working normally. The anomaly indicator ξ of node 224₂ suddenly grows larger and then decreases by about 50%, which means that the anomaly “abnormal data” occurs once in node 224₂, at the 13th sampling time. The anomaly indicator ξ of node 284₅ suddenly grows larger and then varies between 0.4 and 1 in an abnormal way, which means that the anomaly “abnormal data” occurs continuously in node 284₅ after the fourth sampling time. The anomaly indicator ξ of node 321₄ suddenly grows larger and then remains high at about 4.7, which is close to the value of the inclination data. Thus, node 321₄ is in the state of the anomaly “disconnected node” after the 11th sampling time. In a short-term span, “abnormal data” and “disconnected node” are the main anomalies.

In this tunnel monitoring system, the anomaly indicator ξ is divided into three levels: green (0 ≤ ξ < 0.2), yellow (0.2 ≤ ξ < 0.5) and red (ξ ≥ 0.5).

To illustrate that the combination of temporal data-driven sleep scheduling and spatial data-driven anomaly detection can further improve system performance, four kinds of combinations are compared in node 284₅, where the anomaly “abnormal data” occurs. Combination 1 denotes the combination of cooperative TDSS and spatial data-driven anomaly detection; Combination 2 denotes the combination of data acquisition with large cycle 10T (no TDSS) and spatial data-driven anomaly detection; Combination 3 denotes the combination of data acquisition with small cycle T (no TDSS) and spatial data-driven anomaly detection; Combination 4 denotes the combination of cooperative TDSS and spatial data-driven anomaly detection without node maintenance. As shown in Figure 9, compared with Combinations 2 and 3, Combination 1 has a smaller anomaly indicator ξ when no anomaly occurs (time between T and 18T), and a greater fluctuating range when an anomaly occurs (time between 26T and 46T). This means that cooperative TDSS can improve the accuracy and sensitivity of spatial data-driven anomaly detection, which is realized by synchronous acquisition in the same ring. The sleep time of Combination 1 decreases as its anomaly indicator ξ increases, while Combinations 2 and 3 have constant sleep time, which is not related to the anomaly indicator ξ. Compared with Combination 3, Combination 1 can detect the anomaly well while using longer sleep time when no anomaly occurs, which reduces the number of meaningless anomaly detections. Such meaningless anomaly detection not only increases the energy consumption, but also wastes the limited computing capacity of the WSN. Although Combination 2 has the largest sleep time, its anomaly detection performs poorly when an anomaly occurs. Accordingly, cooperative TDSS can make spatial data-driven anomaly detection efficient by reducing data redundancy through non-uniform sensing. In a short-term span, Combinations 1 and 4 have the same ability in anomaly detection because node maintenance requires enough time to take effect.

Figure 8. Anomaly indicator ξ of four sensor nodes in a short-term span. (Horizontal axis: sampling times. Series: node 276₃, node 224₂, node 284₅, node 321₄.)

Figure 9. Anomaly indicator ξ of node 284₅ using methods with different combinations in a short-term span. (Horizontal axis: time (T). Series: Combinations 1 to 4.)

5.3. Experimental Analysis over a Long-Term Span

In the long-term monitoring of the inclination of segments in the tunnel, DRC is set to one hour. Since sensor data is discrete, Gaussian fitting is used to approximate the real changing trend of the inclination angle. As shown in Figure 10, the curve of Gaussian fitting approximates the real changing trend of the inclination angle effectively, and its expression is:

$$Y = -2.739 \times \exp\left(-\left(\frac{x - 1894}{1.978 \times 10^{4}}\right)^{2}\right). \qquad (5)$$

The variance DSC of sensor data to the curve of Gaussian fitting is defined as follows:

$$D_{SC} = \frac{1}{n}\sum_{i=1}^{n}\left(X_i - \hat{X}_i\right)^{2}, \qquad (6)$$

where X_i denotes the value of sensor data, and X̂_i denotes the value of the curve of Gaussian fitting. In Figure 10, DSC is equal to 0.00644, which is small. It is therefore reasonable to use Gaussian fitting to approximate the real changing trend of the inclination angle. Comparing the line of α = 0.003° with the line of α = 0.005° shows that non-uniform sensing with smaller α makes more data acquisitions and is more sensitive to the real inclination angle. However, it also consumes more energy. If α is further reduced, the redundancy of sensor data will increase. The lines of α = 0.003° and α = 0.005° both achieve a reasonable non-uniform sensing. DSC003 and DSC005 can be calculated by Equation (6): DSC003 = 4.3843 × 10⁻⁶ and DSC005 = 2.9141 × 10⁻⁶, where DSCxxx (each x is a digit) denotes the variance of sensor data sensed under α = 0.xxx° to the curve of Gaussian fitting.
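Equations (5) and (6) can be evaluated directly, as in the short sketch below. The coefficients of the fitted curve are those reported in Equation (5); the function names and the sample data are hypothetical and serve only to show the calculation.

```python
import math

def gaussian_fit(x: float) -> float:
    """Fitted curve of Eq. (5), with x the index of the data acquisition."""
    return -2.739 * math.exp(-((x - 1894.0) / 1.978e4) ** 2)

def d_sc(samples) -> float:
    """Eq. (6): mean squared deviation of (x, value) samples from the curve."""
    return sum((value - gaussian_fit(x)) ** 2 for x, value in samples) / len(samples)

# Hypothetical samples that sit 0.002 degrees above the fitted curve.
samples = [(x, gaussian_fit(x) + 0.002) for x in range(0, 1500, 300)]
print(d_sc(samples))
```

With a constant offset of 0.002° the result is about 4 × 10⁻⁶, the same order of magnitude as the DSC003 and DSC005 values reported above.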


Figure 10. Non-uniform sensing with cooperative TDSS over a long-term span. (Horizontal axis: the number of data acquisitions. Series: raw sensor data, Gaussian fitting, non-uniform sensing with α = 0.003°, non-uniform sensing with α = 0.005°. Annotation: the last sensing when α = 0.005°.)

Two methods of calculating ∆V, the absolute value of the difference between two sensor data, are compared: (1) ∆V is the absolute value of the difference between the sensor data from the last two adjacent data acquisitions; and (2) ∆V is the absolute value of the difference between the current sensor data and the last sensor data whose ∆V was greater than or equal to α. As shown in Figure 11, where α = 0.003°, both methods achieve a reasonable non-uniform sensing. DSC with method 1 and DSC with method 2 are equal to 3.7100 × 10⁻⁶ and 3.5245 × 10⁻⁶, respectively.
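The difference between the two definitions of ∆V can be sketched as follows. This is a simplified illustration under stated assumptions, not the TDSS algorithm itself: it is applied to an already acquired series and ignores the sleep-time adaptation, the first acquisition is assumed to be always kept, and the function name and data are hypothetical.

```python
def significant_acquisitions(values, alpha, method=1):
    """Return indices of acquisitions whose delta-V reaches alpha.

    method 1: compare with the previous adjacent acquisition.
    method 2: compare with the last acquisition whose delta-V reached alpha.
    """
    kept = [0]  # assumption: the first acquisition is always kept
    for i in range(1, len(values)):
        if method == 1:
            reference = values[i - 1]
        else:
            reference = values[kept[-1]]
        if abs(values[i] - reference) >= alpha:
            kept.append(i)
    return kept

# Hypothetical slow drift of 0.002 degrees per acquisition, alpha = 0.003.
drift = [0.0, 0.002, 0.004, 0.006, 0.008]
print(significant_acquisitions(drift, 0.003, method=1))
print(significant_acquisitions(drift, 0.003, method=2))
```

In this sketch, method 1 never reacts to a drift whose step-to-step change stays below α, whereas method 2 accumulates the change relative to the last significant value and so eventually registers it.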

Figure 11. Non-uniform sensing with cooperative TDSS under two methods of calculating ∆V. (Series: raw sensor data, Gaussian fitting, non-uniform sensing with method (1), non-uniform sensing with method (2).)

Consequently, cooperative TDSS can also realize non-uniform sensing effectively over a long-term span. Spatial data-driven anomaly detection over a long-term span is then verified. As shown in Figure 12, node 284₅, in which the anomaly “abnormal data” occurs according to the analysis in Figure 8, triggers the alarm level “red” after four months. Node 321₄, in which the anomaly “disconnected node” occurs according to the analysis in Figure 8, triggers the alarm level “red” after six months. These two nodes are repaired by engineers in a timely way, and thus their anomaly indicator ξ decreases after the repair. As time passes, sensor errors at different levels occur in the four nodes, mainly caused by sensor drift. It can be seen that “sensor error” is the main anomaly in nodes over a long-term span. After eight months, the anomaly indicator ξ of the nodes increases slowly, and ξ of node 224₂ is larger than that of the other three nodes. Node 224₂ triggers the alarm level “yellow” after 19 months, which will be considered in the node maintenance. Since anomaly indicator ξ is inversely proportional to the precision of sensor data, spatial data-driven anomaly detection can maintain and improve the precision of sensor data effectively.

Figure 12. Anomaly indicator ξ of four sensor nodes over a long-term span. (Curves for nodes 276₃, 224₂, 284₅ and 321₄; horizontal axis: time (month); vertical axis: anomaly indicator ξ.)


Four kinds of combinations are also compared in node 284₅ over a long-term span. As shown in Figure 13, anomaly indicator ξ of Combination 1 is smaller than that of Combinations 2 and 3 when anomaly indicator ξ is at the level “green”. This means that cooperative TDSS, through synchronous acquisition in the same ring, can improve the accuracy of spatial data-driven anomaly detection. Because the node is repaired by engineers in a timely way, anomaly indicator ξ of Combinations 1–3 decreases after the repair. Under Combination 4, a failure in the node would not be repaired through node maintenance; thus, anomaly indicator ξ of Combination 4 keeps changing irregularly. Such change in anomaly indicator ξ means that sensor data keep changing markedly even though no event occurs in the tunnel. Such change in sensor data inevitably leads to little sleep time, which increases unnecessary energy consumption. Since TDSS is a data-driven mechanism, the high precision of sensor data should be ensured. When a failure exists in the node for a long time, cooperative TDSS cannot achieve appropriate sleep time for the real condition in the tunnel. Accordingly, spatial data-driven anomaly detection can maintain and improve the precision of sensor data through node maintenance to keep cooperative TDSS valid.


Figure 13. Anomaly indicator ξ of node 284₅ using methods with different combinations over a long-term span. (Curves for Combinations 1–4; horizontal axis: time (month).)

6. Conclusions

In this work, we exploit the spatial–temporal correlation of sensor data to improve the performance of the tunnel monitoring system based on clustered WSN. The combination of temporal data-driven sleep scheduling and spatial data-driven anomaly detection is proposed. The proposed temporal data-driven sleep scheduling model is inspired by TCP congestion control. By exploiting the temporal correlation of sensor data, TDSS achieves appropriate sleep time to reduce the temporal redundancy of sensor data. Since the sleep time increases exponentially in the initial phase, such appropriate sleep time can be achieved quickly.

Based on the long and linear cluster structure built for the tunnel monitoring system, cooperative TDSS and spatial data-driven anomaly detection are proposed. Cooperative TDSS in the cluster not only realizes an efficient data-driven sleep scheduling based on the temporal correlation to reduce the energy consumption, but can also realize synchronous acquisition in the same ring for analyzing the situation of every ring. Different from most existing works where the spatial correlation is used to further compress data, this work uses the spatial correlation to achieve cluster-based anomaly detection. The anomalies of sensor nodes mainly include “sensor error”, “abnormal data” and “disconnected node”. Spatial data-driven anomaly detection in the cluster is realized by using the Kriging method based on the spatial correlation.

The performance of temporal data-driven sleep scheduling and spatial data-driven anomaly detection is evaluated in experiments on the Shanghai Metro #12 Line. The experimental results show that cooperative TDSS can realize non-uniform sensing effectively over both short and long-term spans. Such non-uniform sensing based on cooperative TDSS not only reduces the energy consumption of sensor nodes significantly but also restores the actual change in the inclination angle of the segment effectively with less data. It is also verified in the experiments that spatial data-driven anomaly detection, which generates an anomaly indicator ξ to decide the priority level of node maintenance, can maintain and improve the precision of sensor data effectively, especially over a long-term span.

Acknowledgments: This work is supported by the National Basic Research Program of China (973 Program: Grant No. 2011CB013803), CERNET (China Education and Research Network) Innovation Project (Grant No. NGII20151205), the National Natural Science Foundation of China (Grant No. 51275360), and the Shanghai Key Project of Basic Research (Grant No. 14JC1405900).

Author Contributions: Gang Li, Bin He and Hongwei Huang conceived and designed the experiments; Gang Li, Bin He and Limin Tang performed the experiments; Gang Li and Limin Tang analyzed the data; Bin He and Hongwei Huang contributed analysis tools; and Gang Li wrote the paper.

Conflicts of Interest: The authors declare no conflict of interest.

© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).
