sensors Article

Indoor-Outdoor Detection Using a Smart Phone Sensor Weiping Wang, Qiang Chang *, Qun Li, Zesen Shi and Wei Chen College of Information Systems and Management, National University of Defense Technology, Changsha 410073, China; [email protected] (W.W.); [email protected] (Q.L.); [email protected] (Z.S.); [email protected] (W.C.) * Correspondence: [email protected]; Tel.: +86-187-1108-3508 Academic Editor: Fan Ye Received: 27 June 2016; Accepted: 19 September 2016; Published: 22 September 2016

Abstract: In the era of mobile internet, Location Based Services (LBS) have developed dramatically. Seamless Indoor and Outdoor Navigation and Localization (SNAL) has attracted a lot of attention. No single positioning technology was capable of meeting the various positioning requirements in different environments. Selecting different positioning techniques for different environments is an alternative method. Detecting the users’ current environment is crucial for this technique. In this paper, we proposed to detect the indoor/outdoor environment automatically without high energy consumption. The basic idea was simple: we applied a machine learning algorithm to classify the neighboring Global System for Mobile (GSM) communication cellular base station’s signal strength in different environments, and identified the users’ current context by signal pattern recognition. We tested the algorithm in four different environments. The results showed that the proposed algorithm was capable of identifying open outdoors, semi-outdoors, light indoors and deep indoors environments with 100% accuracy using the signal strength of four nearby GSM stations. The required hardware and signal are widely available in our daily lives, implying its high compatibility and availability. Keywords: seamless positioning; indoor/outdoor detection; machine learning; GSM

1. Introduction In the era of mobile internet, we can connect to the internet any place and any time with portable devices. As a result, Location Based Services (LBS) have developed dramatically. The integration of traditional services and location information has brought about a great deal of innovative applications. All of these LBS applications have the same requirement: the user’s current position. The users may by in any place, such as in the open outdoors, crowded avenues, deep indoors and so on. The next generation of positioning systems must work well in a range of environments to meet the needs of a variety of LBS applications. Researchers have developed many positioning technologies for different environments. No single positioning technology is robust enough to perform well in all of these environments. However, there is an accurate enough positioning technique for any specific environment. For example, Global Navigation Satellite Systems (GNSS) performs well in open sky environments, while Shadow Matching [1] is enough for urban canyon environments, and Wi-Fi fingerprint positioning is suitable for indoor environments. This implies that we can integrate all these positioning techniques for Seamless indoor and outdoor Navigation and Localization (SNAL) applications [2]. However, the battery power is a limitation for any portable smart device. Turning on all of the required sensors is energy consuming. Letting the user manually choose different techniques according to different environments is not user friendly. Detecting the users’ current environments

Sensors 2016, 16, 1563; doi:10.3390/s16101563

www.mdpi.com/journal/sensors

Sensors 2016, 16, 1563

2 of 15

Sensors 2016, 16, 1563 Sensors 2016, 16, 1563 Sensors 2016, 16, 1563 Sensors 2016, 16, 1563

15 2of of 22of 15 2of 1515

automatically and efficiently with low battery consumption is crucial for SNAL. In this paper, we focus consumption crucial for SNAL. In this paper, we focus on the automatic indoor/outdoor detection consumption is crucial for SNAL. In this paper, we focus on the automatic indoor/outdoor detection consumption isisiscrucial for SNAL. In paper, we focus on the automatic indoor/outdoor detection consumption crucial for SNAL. Inthis this paper, we focus on the automatic indoor/outdoor detection on the automatic indoor/outdoor detection problem in SNAL. problem in SNAL. problem inSNAL. SNAL. problem inin SNAL. problem There are a lot of previous works focused on indoor/outdoor detection to provide essential There are of previous works focused on indoor/outdoor detection to provide essential There are alot lot of previous works focused on indoor/outdoor detection to provide essential There are aaalot of previous works focused on indoor/outdoor detection to provide essential There are lot of previous works focused on indoor/outdoor detection to provide essential information for upper layer applications [3]. They mainly workinintwo two typesofof environment: indoor information for upper layer applications [3]. They mainly work types environment: indoor information for upper layer applications [3]. They mainly work intwo two types ofenvironment: environment: indoor information for upper layer applications [3]. They mainly work inintwo types ofofenvironment: indoor information for upper layer applications [3]. They mainly work types indoor and outdoor environments [4], but for SNAL, these two environments are not enough. In this paper, and outdoor environments [4], but for SNAL, these two environments are not enough. andoutdoor outdoorenvironments environments[4], [4],but butfor forSNAL, SNAL,these thesetwo twoenvironments environmentsare arenot notenough. enough. and and outdoor environments [4], but for SNAL, these two environments are not enough. we focused on four types of on environments: open outdoors, semi-outdoors, light indoors and deep In this paper, we focused four types of environments: open outdoors, semi-outdoors, light this paper, we focused on four types environments: open outdoors, semi-outdoors, light In this paper, we focused onon four types of environments: open outdoors, semi-outdoors, light InIn this paper, we focused four types ofof environments: open outdoors, semi-outdoors, light indoors. These environments are shown in Table 1. indoors and deep indoors. These environments are shown in Table indoors and deep indoors. These environments are shown in Table indoors and deep indoors. These environments are shown in 1.1.1.1. indoors and deep indoors. These environments are shown inTable Table Table 1. Four indoor/outdoor environments. Table Four indoor/outdoor environments. Table 1. Four indoor/outdoor environments. Table 1.1.1.Four indoor/outdoor environments. Table Four indoor/outdoor environments. Environment Environment Environment Environment Environment

Open Outdoors Open Outdoors Open Outdoors Open Outdoors Open Outdoors

Semi-Outdoors Semi-Outdoors Semi-Outdoors Semi-Outdoors Semi-Outdoors

Definition Definition Definition Definition Definition

Outside building Outside abuilding building Outside building aaabuilding Outside

Near building Near abuilding building Near aaabuilding building Near Near

Light Indoors Light Indoors Light Indoors Light Indoors Light Indoors In a room with aroom room with InInaInaroom with with with windows windows windows windows

Deep Indoors Deep Indoors Deep Indoors Deep Indoors Deep Indoors In a room without without InInIn aInaroom without without aaroom room without windows windows windows windows windows

Example Example Example Example Example

In the open outdoors environment, least four satellites are available for positioning because In the open outdoors environment, at least four satellites are available for positioning because In the open outdoors environment, atat least four satellites are available for positioning because In the open outdoors environment, atleast least four satellites are available for positioning because In the open outdoors environment, at four satellites are available for positioning because of of the open sky condition. Semi-outdoors represents aaGNSS-hostile GNSS-hostile outdoor environment such as of the open sky condition. Semi-outdoors represents a GNSS-hostile outdoor environment such of the open sky condition. Semi-outdoors represents a outdoor environment such as of the open sky condition. Semi-outdoors represents GNSS-hostile outdoor environment such asas the open sky condition. Semi-outdoors represents a GNSS-hostile outdoor environment such as an urban an urban canyon or area, where there are not enough satellites for positioning. Light an urban canyon or awooded wooded area, where there are not enough satellites for positioning. Light an urban or aaawooded area, where there are not enough satellites for positioning. Light an urban canyon or wooded area, where there are not enough satellites for positioning. Light canyon or acanyon wooded area, where there are not enough satellites for positioning. Light indoors is similar to indoors is similar to a semi-outdoors environment. Deep indoors environment refers to a place with indoors is similar to a semi-outdoors environment. Deep indoors environment refers to a place indoors is similar to a semi-outdoors environment. Deep indoors environment refers to a place with indoors is similar to a semi-outdoors environment. Deep indoors environment refers to a place with a semi-outdoors environment. Deep indoors environment refers to a place with no satellite in view. with no satellite in view. no satellite in view. no satellite in view. no satellite in view. In each of these four environments, there was was aa well-developed well-developedpositioning positioningalgorithm algorithm for In each of these four environments, there for each these four environments, there was awell-developed well-developed positioning algorithm for InInIn each ofofof these four environments, there was a awell-developed positioning algorithm for each these four environments, there was positioning algorithm for localization. In the theopen openoutdoors outdoors environment, no matter whether on water or highway, on a highway, localization. In environment, no matter whether on water or on we localization. the open outdoors environment, no matter whether on water on ahighway, highway, we localization. In the open outdoors environment, no matter whether on water or on aaahighway, we localization. InIn the open outdoors environment, no matter whether on water oror on we we can localize ourselves using GNSS [5]. In the semi-outdoors environment, urban canyons and can localize ourselves using GNSS [5]. In the semi-outdoors environment, urban canyons and can localize ourselves using GNSS [5]. the semi-outdoors environment, urban canyons and can localize ourselves using GNSS [5]. InInIn the semi-outdoors environment, urban canyons and can localize ourselves using GNSS [5]. the semi-outdoors environment, urban canyons and wooded areas are both capable of using GNSS shadow matching matchingfor forpositioning positioning[1]. [1].In In the light wooded areas are both capable of using GNSS shadow the light wooded areas are both capable using GNSS shadow matching for positioning [1]. the light wooded areas are both capable of using GNSS shadow matching for positioning [1]. InInIn the light wooded areas are both capable ofof using GNSS shadow matching for positioning [1]. the light indoors environment, cooperative positioning wassuitable suitable[6], [6],while whileinindeep deepindoors indoors environments, indoors environment, cooperative positioning was environments, indoors environment, cooperative positioning was suitable [6], while indeep deep indoors environments, indoors environment, cooperative positioning was suitable [6], while inindeep indoors environments, indoors environment, cooperative positioning was suitable [6], while indoors environments, a afingerprint positioning algorithm is the most commonly used technique [7,8]. fingerprint positioning algorithm commonly used technique [7,8]. afingerprint fingerprint positioning algorithm isthe the most commonly used technique [7,8]. a afingerprint positioning algorithm isisthe most commonly used technique [7,8]. positioning algorithm most commonly used technique [7,8]. The difference between open outdoors and semi-outdoors is the number of satellites in view. The difference between outdoors and semi-outdoors the number of satellites in view. The difference between open outdoors and semi-outdoors is the number satellites view. The difference between open outdoors and semi-outdoors isisisthe number of satellites in view. The difference between open outdoors and semi-outdoors the number ofof satellites inin view. In open outdoor environments, least four satellites are available, which means the user can be In open outdoor environments, at least four satellites are available, which means the user can be open outdoor environments, least four satellites are available, which means the user can InInIn open outdoor environments, atatat least four satellites are available, which means the user can be open outdoor environments, least four satellites are available, which means the user can bebe localized by GNSS, while in semi-outdoors environments, the number of visible satellites is not enough localized by GNSS, while in semi-outdoors environments, the number of visible satellites is not localized by GNSS, while semi-outdoors environments, the number visible satellites isnot not localized byby GNSS, while ininin semi-outdoors environments, the number ofofof visible satellites isisnot localized GNSS, while semi-outdoors environments, the number visible satellites enough for GNSS localization. The difference between light indoors and deep indoors is the for GNSS localization. The difference between light indoors and deep indoors is the visibility of enough for GNSS localization. The difference between light indoors and deep indoors is the enough for GNSS localization. The difference between light indoors and deep indoors is the enough for GNSS localization. The difference between light indoors and deep indoors is the visibility of navigation satellites. In light indoors environments, users may receive navigation navigation satellites. In light indoors environments, users may receive navigation signals from several visibility navigation satellites. light indoors environments, users may receive navigation visibility ofofof navigation satellites. InInIn light indoors environments, users may receive navigation visibility navigation satellites. light indoors environments, users may receive navigation signals from several satellites. As result, the users in light indoor environments can be localized satellites. Asseveral a result, the users in light indoor environments canenvironments be localized can using peer to peer signals from several satellites. As aresult, result, the users light indoor environments can be localized signals from satellites. As aaaresult, the users in light indoor be localized signals from several satellites. As the users inin light indoor environments can be localized using peer to peer cooperative positioning, but in deep indoor environments, no navigation cooperative positioning, but in deep indoor environments, no indoor navigation satellites areno available. In that using peer peer cooperative positioning, but deep indoor environments, no navigation using peer tototo peer cooperative positioning, but ininin deep environments, navigation using peer peer cooperative positioning, but deep indoor environments, no navigation satellites are available. In that case, cooperative positioning fails. case, cooperative positioning fails. satellites are available. In that case, cooperative positioning fails. satellites are available. In case, cooperative positioning fails. satellites are available. Inthat that case, cooperative positioning fails. Many works address the issue of detecting indoor/outdoor environments. Walter al. Many works address the issue of detecting indoor/outdoor environments. Walter et al. Many works address the issue detecting indoor/outdoor environments. Walter Many works address the issue of detecting indoor/outdoor environments. Walter etet al. Many works address the issue ofof detecting indoor/outdoor environments. Walter etet al.al. identified a set of novel environmental features that could be used for environment detection [9]. identified aset set novel environmental features that could be used for environment detection [9]. identified novel environmental features that could bebe used for environment detection [9]. identified novel environmental features that could used for environment detection [9]. identified a aset ofofof novel environmental features that could be used for environment detection [9]. These features included gravity, ambient light, magnetic fields, scents, road signs, temperature, These features included gravity, ambient light, magnetic fields, scents, road signs, temperature, These features included gravity, ambient light, magnetic fields, scents, signs, temperature, These features included gravity, ambient light, magnetic fields, scents, road signs, temperature, These features included gravity, ambient light, magnetic fields, scents, roadroad signs, temperature, terrain terrain height, road texture and so on. Some of the related works focused on environment sensing terrain height, road texture and so on. Some of the related works focused on environment sensing terrain height, road texture and so on. Some of the related works focused on environment sensing terrain height, road texture and so on. Some of the related works focused on environment sensing height, road texture and so on. Some of the related works focused on environment sensing using using mobile device’s sensors based on these features. The sensors used included Global Positioning using mobile device’s sensors based on these features. The sensors used included Global Positioning using mobile device’s sensors based on these features. The sensors used included Global Positioning using mobile device’s sensors based on these features. The sensors used included Global Positioning mobile device’s sensors based on these features. The sensors used included Global Positioning System ( GPS ) , accelerometers, gyroscopes, barometers, geomagnetic sensors, Wi-Fi cards and so on. System (GPS , accelerometers, gyroscopes, barometers, geomagnetic sensors, Wi-Fi cards and so on. System ((GPS), GPS ), )accelerometers, gyroscopes, barometers, geomagnetic sensors, Wi-Fi cards and so on. System (GPS ,)accelerometers, accelerometers, gyroscopes, barometers, geomagnetic sensors, Wi-Fi cards and so on. System gyroscopes, barometers, geomagnetic sensors, Wi-Fi cards and so In Groves’s work, environmental context detection using GNSS and Wi-Fi were examined [10]. The In Groves’s work, environmental context detection using GNSS and Wi-Fi were examined [10]. The In Groves’s work, environmental context detection using GNSS and Wi-Fi were examined [10]. The In Groves’s work, environmental context detection using GNSS and Wi-Fi were examined [10]. The on. In Groves’s work, environmental context detection using GNSS and Wi-Fi were examined [10]. results showed that GNSS C/No measurement can be used to distinguish indoor from outdoor results showed that GNSS C/No measurement can used distinguish indoor from outdoor results showed that GNSS C/No measurement can be used to distinguish indoor from outdoor results showed that GNSS C/No measurement can bebe used toto indoor from outdoor The results showed that GNSS C/No measurement can be used todistinguish distinguish indoor from outdoor environments and to distinguish different types of outdoor environment, such as urban and open. environments and to distinguish different types of outdoor environment, such as urban and open. environments and to distinguish different types of outdoor environment, such as urban and open. environments and to distinguish different types of outdoor environment, such as urban and open. environments and to distinguish different types of outdoor environment, such as urban and open. Wi-Fi measurements have been shown to be unreliable for distinguishing betweenindoor and Wi-Fi measurements have been shown to be unreliable for distinguishing betweenindoor and Wi-Fi measurements have been shown to be unreliable for distinguishing betweenindoor and Wi-Fi measurements have been shown to be unreliable for distinguishing betweenindoor and Wi-Fi measurements have been shown to be unreliable for distinguishing betweenindoor and outdoor outdoor environments, but good for distinguishing different outdoor types, such as residential and outdoor environments, but good for distinguishing different outdoor types, such residential and outdoor environments, but good for distinguishing different outdoor types, such as residential and outdoor environments, but good for distinguishing different outdoor types, such asas residential and environments, but Muralidharan good for distinguishing different outdoor and types, such as residential and business business districts. proposed to use barometer GPS to identify different floors in business districts. Muralidharan proposed to use abarometer barometer and GPS to identify different floors business districts. Muralidharan proposed to aaabarometer and GPS to different floors in business districts. Muralidharan proposed touse use and GPS toidentify identify different floors inin districts. Muralidharan proposed to use a barometer and GPS to identify different floors in indoor indoor environments [11]. Ravindranath et al. showed that GPS lock status can be used to indirectly indoor environments [11]. Ravindranath etal.al. showed that GPS lock status can used toindirectly indirectly indoor environments [11]. Ravindranath etetal. showed that GPS lock status can bebebe used totoindirectly indoor environments [11]. Ravindranath showed that GPS lock status can used environments [11].environment Ravindranath etThe al. showed that GPS lock status can beenergy used to indirectly infer infer the ambient [12]. GPS, however, consumed too much to be useful for infer the ambient environment [12]. The GPS, however, consumed too much energy to useful for infer the ambient environment [12]. The GPS, however, consumed too much energy to useful for infer the ambient environment [12]. The GPS, however, consumed too much energy tobe bebe useful for the ambient environment [12]. The GPS, however, consumed too much energy to be useful for many many applications. smart phone may run out of energy in about GPS many applications. smart phone may run out energy about ifthe the GPS isrunning running many applications. AAAA smart phone may run out of energy in about 666h6hhifhififthe GPS isisisrunning many applications. smart phone may run out ofof energy inin about the GPS running applications. A[13], smart phone may run out ofwas energy inavailable about 6 hin if the GPS isenvironments running continuously [13], continuously and furthermore, GPS only open sky [14]. As continuously [13], and furthermore, GPS was only available open sky environments [14]. As continuously [13], and furthermore, GPS was only available ininin open sky environments [14]. As aaa a continuously [13], and furthermore, GPS was only available open sky environments [14]. As

Sensors 2016, 16, 1563

3 of 15

and furthermore, GPS was only available in open sky environments [14]. As a result, researchers proposed to use the low energy consumption sensors available in smart phones for detecting the environment, such as accelerometers, gyroscopes, barometers and magnetic sensors. Mostafa et al. made use of accelerometers, gyroscopes, barometers, and magnetic sensors to detect the height changing modes of the user in indoor environments [15–17]. However, accelerometers are orientation Sensors 2016, 16, 1563 3 of 15 and position-dependent, a require a high sampling rate to achieve good accuracy. Researchers sought to use alternative sensors.proposed For example, IODetector indoors, result, researchers to use the low energy detected consumption sensors outdoors available in and smartsemi-outdoors phones for detecting the environment, such as accelerometers, gyroscopes, barometers and magnetic environments using cell signals, light and magnetic intensity [2]. Vanini proposed to use barometers to et al. made use of accelerometers, gyroscopes, barometers, and magnetic sensors to detect the sensors. changeMostafa of floors [18]. TempIO classified the indoor/outdoor environment by comparing detect the height changing modes of the user in indoor environments [15–17]. However, the temperature [19]. Wu et al. showed barometers were capable of sampling monitoring events [20]. accelerometers are orientation and that position-dependent, a require a high rate door to achieve Barometergood and accuracy. temperature sensors are not widely available current mobile phones. Radu et al. Researchers sought to use alternative sensors.inFor example, IODetector detected outdoors and semi-outdoors environments using cell signals, learning light and magnetic intensity presented indoors, a general method employing semi-supervised machine and used light[2]. intensity, Vanini proposed to use barometers to detect theintensity change of floors [18]. TempIOaclassified cellular signal strength, magnetic variance and sound [3]. They provided detectiontheaccuracy indoor/outdoor environment by comparing the temperature [19]. Wu et al. showed that barometers exceeding 90%, but their algorithm relied on several sensors, which is not energy efficient. Detecting were capable of monitoring door events [20]. Barometer and temperature sensors are not widely indoor/outdoor environments widely sensors with lowemploying energy consumption available in current mobileusing phones. Radu etavailable al. presented a general method semi-supervisedremains a challengemachine for researchers. learning and used light intensity, cellular signal strength, magnetic variance and sound intensity was [3]. They provided detection accuracyworks, exceeding but their algorithm relied on to use Our research motivated byathese pioneering but90%, we went further. We proposed several sensors, which is not energy efficient. Detecting indoor/outdoor environments using widely the GSM signal strength to detect four types of indoor/outdoor environments. Firstly, GSM is available available sensors with low energy consumption remains a challenge for researchers. on all GSM-based smart phones. Secondly, it consumes minimal energy in addition to standard Our research was motivated by these pioneering works, but we went further. We proposed to cell-phone use operation The basic to idea was simple: theindoor/outdoor propagationenvironments. of radio signals is affected the GSM[21]. signal strength detect four types of Firstly, GSM is by the available on all GSM-based smart phones. Secondly, it consumes minimal energy in addition to environment. Different environments result in different signal strength characteristics. By identifying standard cell-phone operation [21]. The basic idea was simple: the propagation of radio signals is the signal strength’s characteristics, we can determine the user’s environment. We have investigated affected by the environment. Different environments result in different signal strength a wide range of machine learning algorithms for classification, including Decision Tree (DT), Random characteristics. By identifying the signal strength’s characteristics, we can determine the user’s Forest (RF), Support Vector Machine (SVM), Nearest Neighbor (KNN), Logistic Regression (LR), environment. We have investigated a wideKrange of machine learning algorithms for classification, Naive Bayesian (NB) [22], and Network (NN). classifiers applied including Decision TreeNeural (DT), Random Forest (RF),The Support Vector were Machine (SVM),for K recognition. Nearest Neighbor results (KNN), showed Logistic Regression (LR), Naive Bayesian (NB) and Neural Network (NN). Experimental that the proposed algorithm was [22], capable of detecting open outdoors, The classifiers were applied for recognition. semi-outdoors, light indoors and deep indoors environments with 100% accuracy using four nearby Experimental results showed that the proposed algorithm was capable of detecting open GSM stations’ signal strength. The required hardware and signals are widely available in our daily outdoors, semi-outdoors, light indoors and deep indoors environments with 100% accuracy using lives, implying its high compatibility availability. four nearby GSM stations’ signaland strength. The required hardware and signals are widely available in our daily lives, implying its high compatibility and availability.

2. Methodology 2. Methodology

The proposed algorithm contains three processes: Data Input, Training, and Testing. An overview The proposed algorithm contains three processes: Data Input, Training, and Testing. An of the indoor/outdoor detection process is illustrated in Figure 1. overview of the indoor/outdoor detection process is illustrated in Figure 1. ANN Training Set Data Collection Training

RF ... DT

PreProcessing Classifier Data Input

Feature Extraction

Testing Set

Testing

Accuracy

Predicted Class

1. Framework of indoor/outdoor detection. This framework contains three processes: Data Figure 1. Figure Framework of indoor/outdoor detection. This framework contains three processes: Input, Training, and Testing. In the Data Input process, neighboring cellular base stations’ signal Data Input, Training, and Testing. In the Data Input process, neighboring cellular base stations’ strength is recorded, and features are extracted. The data are classified in the Training phase. The signal strength is recorded, features extracted. The data classified classifier is applied toand detect the user’sare current environment in the are Testing phase. in the Training phase. The classifier is applied to detect the user’s current environment in the Testing phase.

Sensors 2016, 16, 1563

4 of 15

2.1. Data Input In the Data Input process, three sub-processes are included: Data Collection, Pre-Processing, and Feature Extraction. 2.1.1. Data Collection In this process, GSM signal strength is recorded in different environments. According to the GSM standard, each smart phone records a vector of the detected power levels of the pilot signal from at most seven cellular base stations, one of which the smart phone is associated with [21]. For the Android Operating System, the received signal strength index (RSSI) for GSM is in asu, where 0 asu means −113 dBm and 31 asu means −51 dBm. So we have: rss = −113 + 2 ∗ rssi

(1)

The recorded signal strength is arranged from the strongest to the weakest. Assuming that the sampling interval is τ (s), the sampling frequency is 1/τ Hz. The data collected during T(s) is denoted as S, where:  S = {si |i = 0, 1, · · · , b T/τ c} , si = rssi,1 , rssi,2 , · · · , rssi,ni , rssi,m ≥ rssi,n ∀m > n

(2)

where ni is the number of visible stations at time i. 2.1.2. Data Pre-Processing The environment is changing continuously, therefore, we need to look at how a signal changes over a specific period of time to identify the environment. A window is the most basic step and is used by almost all researchers. At each time epoch, the signal involving the current measurement and the previous N − 1 epochs, where N is a chosen integer that represents the length of the window. Assuming the window length is ∆T(s), so we have N = b∆T/τ c, the measurement is arranged as: W = {wi |i = 0, 1, · · · , b T/ 4 T c} , where wt = {st−i |i = 0, 1, · · · , b∆T/τ c − 1}

(3)

Different window lengths contain different information used for detection. The wider the window is, the more descriptive the features extracted from it can be. While the shorter the window, the less information it contains for detection. 2.1.3. Feature Extraction In each window, we can extract several features to describe the character of the environment. These features include Mean, Standard Deviation, Maximum, Minimum and Range. (1)

Mean Mean is the most basic character of a signal. It is calculated by summing the values and dividing the numbers: mean(wt ) = ∑ wt / wt (4)

(2)

where |*| is the number getting operation. Mean is a measure of the middle value of a signal. Standard Deviation Standard Deviation is an indicator of how much a signal is dispersed around its mean. It is calculated as follows: r Std(wt ) = ∑ (si − mean(wt ))2 / wt (5)

(3)

Maximum and Minimum

Sensors 2016, 16, 1563

5 of 15

Maximum and Minimum values are the extreme values in the window: Max (wt ) = si ∈ wt , ∀s j ∈ wt , s j < si , Min(wt ) = si ∈ wt , ∀s j ∈ wt , s j > si (4)

(6)

Range Range is the difference between the Maximum and Minimum values: Range(wt ) = Max (wt ) − Min(wt )

(7)

2.2. Training In the Training Phase, we investigate a wide range of machine learning algorithms to classify the training data, including Decision Tree, Random Forest, Support Vector Machine, K Nearest Neighbor, Logistic Regression, Naive Bayesian, and Neural Network. The classifiers are applied for recognition. The training data can be raw data from the sensors, or the features in different window lengths. The best classifier will be selected for indoors/outdoors detection. 2.3. Testing In the testing phase, new samples are categorized using the classifiers created by the machine learning algorithms. Different algorithms perform differently. Researchers have proposed a lot of performance measures for multi-class classification. Confusion matrix was proposed to evaluate the performance of a classification system for both 2-class and multi-class samples. Table 2 is a template of confusion matrix for a 3-class classifier. Table 2. Template of a confusion matrix for a 3-class classifier. Class

Class 1

Class 2

Class 3

Class 1 Class 2 Class 3

n11 n21 n31

n12 n22 n32

n13 n23 n33

In Table 2, the rows represent the true classes of the tested samples, and the columns represent the predicted classes. nij is the number of test samples of a class i recognized as class j. Sometimes nij would also be the percentage of the classification result. The nearer the confusion matrix is to a diagonal matrix, the better the classification algorithm is. Based on the confusion matrix, there are several widely used performance measures. Here we just introduce in Table 3 the ones we will use in the experiments. Table 3. Template of a confusion matrix for a 3-class classifier. Definition True Positive (TP)

Formula

The number of samples of a class which have been correctly classified

TPi = nii

True Negative (TN)

The number of samples of other classes which has been correctly classified

False Positive (FP)

The number of samples not belongs to a class which has been incorrectly classified as belonging to it

FPi = ∑k6=i nki

The number of samples belonging to a class which have been incorrectly classified as belong to other class

FNi = ∑k6=i nik

False Negative (FN) Accuracy

The proportion of all samples which have been correctly classified

TNi = ∑ j6=i ∑k6=i n jk

Acc =

∑i nii TPi + TNi + FPi + FNi

Sensors 2016, 16, 1563

Sensors 2016, 16, 1563

6 of 15

Table 3. Cont.

6 of 15

Definition

Precision Sensitivity Precision Specificity Specificity

F-Measure F-Measure

Formula

TPi Prec TPi i  = Sens i TPTP + FPi i i i FP

The proportion of sample predicted to belong The proportion of samples which have been a class which is correct correctly to classified The predicted to belong The proportion proportionofofsample negative samples which to a class whichbeen is correct have correctly classified to be negative The proportion of negative samples which have been

thecorrectly weighted average precision and classified toof bethe negative sensitivity

the weighted average of the precision and sensitivity

TN ii = TPTP SpecPrec i + FPi i i FPi  TN i TNi FPi + TNi

2 1/=Sensi 21/ Preci F1

F1 

Speci =

1/Sensi +1/Preci

3. Experiments 3. Experiments ToTotest implemented an anAndroid Androidsmart smartphone phone application testthe thecontext contextsensing sensing algorithm, algorithm, we we implemented application capable nearbycellular cellular stations’ strength. The application is called capableofof capturing capturing nearby basebase stations’ signal signal strength. The application is called DrawRSS. DrawRSS. This application was testedMX3 on asmart Meizu MX3 smartZhuHai, phone China), (MeiZu,which ZhuHai, China), This application was tested on a Meizu phone (MeiZu, is running which is running Android OS(Google, VersionMountain 4.4 (KitKat) (Google, View, USA). A Android OS Version 4.4 (KitKat) View, CA, USA). Mountain A screen-shot of thisCA, application screen-shot ofFigure this application is shown in Figure 2. is shown in 2.

Figure 2. 2. Screenshot application,all allthe theneighboring neighboring cellular base Figure Screenshotofofthe theDrawRSS DrawRSSapplication. application. In In this this application, cellular base stations’ signal andrecorded recordedininthe thephone’s phone’s memory. stations’ signalstrengths strengthsare aredrawn drawn on on the the canvas canvas and memory.

The signal strength is recorded every 0.5 s, and then drawn oin the user interface, so the sampling rate is 2 Hz. The application records the number of cellular base stations and their signal strengths. The data are called training data. The testing environments in this experiment are shown in Figure 3.

Sensors 2016, 16, 1563

7 of 15

The signal strength is recorded every 0.5 s, and then drawn oin the user interface, so the sampling rate is 2 Hz. The application records the number of cellular base stations and their signal strengths. The data are called training data. The testing environments in this experiment are shown in7 Figure 3. Sensors 2016, 16, 1563 of 15

Sensors 2016, 16, 1563

7 of 15

(a) open outdoors

(a) open outdoors

(c) light indoors

(b) semi-outdoors

(b) semi-outdoors

(d) deep indoors

Figure 3. The four testing environments. These four environments are located on the same campus.

Figure 3. The four testing environments. These four environments are located on the same campus. The largest distancelight between them is less than 400 m. indoors (d) deep indoors The largest distance (c) between them is less than 400 m. Figure 3. The four testing environments. These four environments are located on the same campus. During the experiment, the volunteer stayed in the four environments mentioned above for The largest distance between them is less than 400 m. about 10 min. He could use phone in the same manner he naturally would. He kept moving During the experiment, the his volunteer stayed in the four environments mentioned above for about around in the environment. For example, in the light indoors environment, he walked in the room 10 min. He could use his phone in the same manner he naturally would. He kept moving around in During the experiment, the volunteer stayed in the four environments mentioned above for randomly. training data collected during these 10 min. In he thewalked deep indoors environment the environment. For in light environment, in room randomly. about 10The min. Heexample, could useare histhe phone inindoors the same manner he naturally would. Hethe kept moving (Figure 3d), volunteer moved one in side the corridor to theindoors other side severalintimes. around inthe the environment. For from example, theof light indoors environment, heenvironment walked the room The training data are collected during these 10 min. In the deep (Figure 3d), Figure 4 The is the number ofarecellular base stations in10different environments recorded by the randomly. training data collected during these min. In the deep indoors environment the volunteer moved from one side of the corridor to the other side several times. application. This onlymoved showsfrom the result beyond first 200 s. other side several times. (Figure 3d), thefigure volunteer one side of the the corridor to the

Figure 4 is the number of cellular base stations in different environments recorded by the Figure 4 is the number of cellular base stations in different environments recorded by the application. This figure onlyonly shows thethe result thefirst first200 200 application. This figure shows resultbeyond beyond the s. s.

Figure 4. The number of cellular base stations in different environments. Figure 4. The number of cellular base stations in different environments.

4. The of cellular stations different environments. In FigureFigure 4, we can see number that the number of base cellular base in stations is quite different from the open In to Figure we can see that the number ofopen cellular base stations is quite different from the stations open outdoors deep4,indoors environment. In the outdoors environment, we have seven outdoors towe deep indoors environment. In the open outdoors environment, have seven stations most of the4,time, while in that the semi-outdoors environment, westations often have sixwestations. In the indoors In Figure can see the number of cellular base is quite different from the open most of the time, while in the semi-outdoors environment, weFigure often have six stations. In thecurves indoorsfor environment, the number of stations varies from 1 to 7. 5 is the probability outdoors to deep indoors environment. In the open outdoors environment, we have seven stations environment, the number of stations varies from 1 to 7. Figure 5 is the probability curves for

Sensors 2016, 16, 1563

8 of 15

most of the time, while in the semi-outdoors environment, we often have six stations. In the indoors Sensors 2016, 16, 1563 8 of 15 environment, the number of stations varies from 1 to 7. Figure 5 is the probability curves for different environments. Every node (x, y) in Figure there is y%there probability of having at least different environments. Every node (x,5y)means in Figure 5 means is y% probability of having at x cellular least x cellular base stations. base stations. Sensors 2016, 16, 1563

8 of 15

different environments. Every node (x, y) in Figure 5 means there is y% probability of having at least x cellular base stations.

Figure 5. The probability curve of the number of GSM cellular base stations.

Figure 5. The probability curve of the number of GSM cellular base stations. From Figure 5, we can see that we have at least one station in the specified four environments. If we want cellular basethat stations, probability 98.02%, 100%,in 91.87%, and 83.1% four in theenvironments. open From Figure 5,six we can see we the have at leastisone station the specified outdoors, semi-outdoors, light indoorscurve and of deep indoors environments, separately. On average, the Figure 5. The probability the number of GSM cellular base stations. If we want six cellular base stations, the probability is 98.02%, 100%, 91.87%, and 83.1% in the open probability of having at least 1 to 7 base stations in any environment is 100%, 99.95%, 99.70%, outdoors, 98.02%, semi-outdoors, light indoors and deep indoors environments, separately. On average, and 79.17%, respectively. From96.18%, Figure 93.25%, 5, we can see that we have at least one station in the specified four environments. We outatthe samples less than sixiscellular stations and compare signal the probability offiltered having least 1 towith 7thebase stations in anybase environment is83.1% 100%, 99.95%, If we want six cellular base stations, probability 98.02%, 100%, 91.87%, and inthe the open 99.70%, strength93.25%, receivedand in the different environments in Figure 6. The widely used Log-Distance Path outdoors, semi-outdoors, light indoors and deep indoors environments, separately. On average, the 98.02%, 96.18%, 79.17%, respectively. Loss (LDPL) propagation model [23]stations tells us thatany the signal strengthisis 100%, affected mainly99.70%, by the of signal having at least with 1 to 7 less base environment 99.95%, We probability filtered out the samples than sixincellular base stations and compare the signal distance. As a 93.25%, result, we will have the same signal strength measures in different environments. In 98.02%, 96.18%, and 79.17%, respectively. strength received thecan different environments in ifFigure 6. The widely used Log-Distance Figure 6,inwe findsamples many examples, but take morestations stations’ strength into Path Loss We filtered out the with less than six we cellular base andsignal compare the signal consideration, and we look at[23] howtells the signal strength changes with time, we will find that different (LDPL) signal propagation model us that the signal strength is affected mainly by the distance. strength received in the different environments in Figure 6. The widely used Log-Distance Path environments show different patterns. From Figure 6, we can see that the signal strength received As a result, will signal havepropagation the same signal strength measures instrength different environments. In Figure 6, Losswe (LDPL) model [23] tells us that the signal is affected mainly by the in these four environments is quite different. These results imply that it is possible to identify distance. As examples, a result, we will have thetake samemore signal stations’ strength measures in different into environments. In we can find many but if we signal strength consideration, and different environments using the cellular base stations’ signal strength. 6, the we signal can findstrength many examples, but if we takewe more stations’ signal strengthenvironments into we look Figure at how changes with time, will find that different Before performing a classification using the raw data, we pre-process these data as mentioned consideration, and we look at how the signal strength changes with time, we will find that different above. The raw data are grouped according to different varies from 1strength to 20 s. Wereceived calculate in these show different patterns. From Figure 6, we can seewindows that the signal environments show different patterns. From Figure 6, we can see that the signal strength received the featuresisinquite each window, including Mean, Standard Deviation, Maximum, Minimum, and four environments different. These results imply that it is possible to identify in these four environments is quite different. These results imply that it is possible to identify different Range. environments using the cellular stations’ strength. different environments usingbase the cellular basesignal stations’ signal strength. Before performing a classification usingthe the raw wewe pre-process these these data asdata mentioned Before performing a classification using rawdata, data, pre-process as mentioned above. The raw data are grouped according to different windows varies from 1 to 20 s. We calculate above. The raw data are grouped according to different windows varies from 1 to 20 s. We calculate the features in each window, including Mean, Standard Deviation, Maximum, Minimum, and the features in each window, including Mean, Standard Deviation, Maximum, Minimum, and Range. Range.

(a) open outdoors

(b) semi-outdoors Figure 6. Cont.

(a) open outdoors

(b) semi-outdoors Figure 6. Cont.

Figure 6. Cont.

Sensors 2016, 16, 1563 Sensors 2016, 16, 1563

9 of 15 9 of 15

Sensors 2016, 16, 1563

9 of 15

(c) light indoors

(c) light indoors

(d) deep indoors

(d) deep indoors

Figure 6. Six cellular base stations’ signal strength received in different environments.

Figure6.6. Six Six cellular cellular base base stations’ stations’ signal Figure signal strength strength received received in in different different environments. environments.

4. Data Analysis

4. Data Analysis The input data, including measured data and the pre-processed data are both ready for 4. Data Analysis classification using the machine learning algorithms. In this section, we investigate the use of a The input data, including measured data and the pre-processed data are both ready for Thewide input data, including measured and the thetraining pre-processed dataDecision are both range of machine learning algorithmsdata to classify data, including Tree,ready for classification using the machine learning algorithms. In thisLogistic section, we investigate the use of a Random Forest, Support Vector Machine, K Nearest Neighbor, Regression, Naive Bayesian classification using the machine learning algorithms. In this section, we investigate the use of a wide wide rangeNeural of machine learning algorithms to classify the training data, including Decision Tree, Network. The classifiers are applied The classification will range of and machine learning algorithms to classify for therecognition. training data, including performance Decision Tree, Random RandombeForest, Support Machine, K Nearest Neighbor, Logistic Regression, Naive Bayesian compared betweenVector these Machine Learning algorithms. Forest, Support Vector Machine, K Nearest Neighbor, Logistic Regression, Naive Bayesian[24]. and Neural this paper,The learning and classification is conducted by the Orange data mining toolkit and NeuralIn Network. classifiers are applied for recognition. The classification performance will Network.Orange The classifiers are applied forlearning recognition. The classification performance willmodel be compared is an easy-to-use machine toolkit, which allows us to perform repeat be compared between these Machine Learning algorithms. using a wide range ofalgorithms. machine learning algorithms and employing standard performance betweentraining these Machine Learning In this paper, learning and classification is conducted by the Orange data mining toolkit [24]. testing techniques, such as cross-validation without programming. Figure 7 shows the work flowtoolkit of In this paper, learning and classification is conducted by the Orange data mining [24]. Orangethis is experiment. an easy-to-use machine learning toolkit, which allows us to perform repeat model Orange is an easy-to-use machine learning toolkit, which allows us to perform repeat model training using training using a wide range of machine learning algorithms and employing standard performance atesting wide range of machine algorithms without and employing standard performance techniques, techniques, such learning as cross-validation programming. Figure 7 showstesting the work flow of such as cross-validation without programming. Figure 7 shows the work flow of this experiment. this experiment.

Figure 7. Workflow of the data classification using Orange toolkit.

In Figure 7, seven classification algorithms are created by dragging the corresponding widgets to the canvas. The file widget reads data from disk. Firstly, we apply the raw data for classification. The data are collected every 0.5 s in the four environments for 10 min, respectively. Figure 8 shows the classification accuracy. In Figure 8, we can see that for most of the classification algorithms, the more stations used, the better the accuracy is. KNN performs the best among the seven algorithms, FigureTree 7. Workflow of the dataalgorithm. classification using Orange toolkit.the worst. We followed by Decision and Random Forest Neural Network performs Figure 7. Workflow of the data classification using Orange toolkit. just need four cells’ signal strength to get the best accuracy using KNN.

In Figure 7, seven classification algorithms are created by dragging the corresponding widgets In Figure 7, seven classification algorithms are created by dragging widgets to the canvas. The file widget reads data from disk. Firstly, we apply the the rawcorresponding data for classification. toThe thedata canvas. The file widget reads data from disk. Firstly, we apply the raw data for classification. are collected every 0.5 s in the four environments for 10 min, respectively. Figure 8 shows The are collected every In 0.5Figure s in the8,four environments 10 min, respectively. Figure 8 showsthe the the data classification accuracy. we can see that forfor most of the classification algorithms, classification accuracy. In Figure 8, we can see that for most of the classification algorithms, the more more stations used, the better the accuracy is. KNN performs the best among the seven algorithms, stations used, the betterTree the accuracy is. KNN performs the best among the seven algorithms, followed followed by Decision and Random Forest algorithm. Neural Network performs the worst. We by Decision Tree andsignal Random Forest Network performs the worst. We just need just need four cells’ strength to algorithm. get the bestNeural accuracy using KNN. four cells’ signal strength to get the best accuracy using KNN.

Sensors 2016, 16,16, 1563 Sensors2016, 2016,16, 1563 Sensors 1563

10 of 15 10of of15 15 10

Figure 8. Classification accuracy using raw data. Figure8. 8.Classification Classificationaccuracy accuracyusing usingraw rawdata. data. Figure

However, from Figure 5 we know thatthat we cannot always receive signalsignal from from at least However, from Figure we know that we cannot cannot always receive signal from atfour leaststations. four However, from Figure 55 we know we always receive at least four On average, the probability of receiving at least four cellular stations is 98.02% for all stations. On On average, average, the the probability probability of of receiving receiving at at least least four four cellular cellular stations stations is is 98.02% 98.02% for forthe allcases. the stations. all the However, we will surely have at least one station in any case. In the following part of this section, cases. However, we will surely have at least one station in any case. In the following part of this cases. However, we will surely have at least one station in any case. In the following part of this wesection, are going to one cellular station’s signal strength for classification. section, we are areusing goingjust to using using just one onebase cellular base station’s station’s signal strength strength for classification. classification. we going to just cellular base signal for As mentioned above, we can calculate the features for different window lengths and apply the AsAsmentioned features for different differentwindow windowlengths lengthsand and apply the mentionedabove, above,we wecan can calculate calculate the the features apply the features for classification.The Thewindow windowlength lengthisis isvaried variedfrom from111to to20 20sssand andwe wecalculate calculatethe thefeatures featuresin features for classification. features for classification. The window length varied from to 20 and we calculate the features in each each window length, including Mean, Standard Deviation, Maximum, Minimum, and Range. In each window length, including Mean, Standard Deviation, Maximum, Minimum, andand Range. In each in window length, including Mean, Standard Deviation, Maximum, Minimum, Range. In each window length, the number of instance is different. For example, there are 600 instances if the window length,length, the number of instance is different. For example, there are 600are instances if the window each window the number of instance is different. For example, there 600 instances if the window length is 119s.s.shows Figurethe shows the accuracy accuracy of classifying classifying the features. features. length is 1 length s. Figure accuracy of classifying the features. window is Figure 99 shows the of the

Figure9. 9.Classification Classificationaccuracy accuracy usingfeatures features varyingwith with windowlength. length. Figure Figure 9. Classification accuracy using using featuresvarying varying withwindow window length.

From Figure Figure 9, 9, we we can can see see that that in in most most of of the the cases, cases, the the longer longer the the window window length length is, is, the the better better From Figure 9, we canagain, see that in most of thethe cases, longerall the window lengthfollowed is, the better the theFrom accuracy gets. Once KNN performs bestthe among the algorithms, by the the the accuracy gets. Once again, KNN performs the best among all the algorithms, followed by accuracy gets. Once KNN performs the best amongRegression all the algorithms, followed by the Decision Decision Tree andagain, Random Forest algorithms. Logistic Regression performs the worst. worst. When the Decision Tree and Random Forest algorithms. Logistic performs the When the Tree and Random Forest algorithms. Logistic Regression performs the worst. When the window length window length is 8 s, we can correctly classify all four environments using the KNN algorithm. window length is 8 s, we can correctly classify all four environments using the KNN algorithm. is 8 s, we correctly classify all four different environments using we the KNN algorithm. Forcan testing this algorithm algorithm under different conditions, conditions, we experimented experimented during one one week week with with For testing this under during For testing this algorithm different conditions, we experimented during one week with five different walking tracesunder during the period period from 9:00 9:00 to 17:00 17:00 under under different weather five different walking traces during the from to different weather conditions. These traces are different from the environments where we collected the data to five different walking traces during the period from 9:00 to 17:00 under different weather conditions. conditions. These traces are different from the environments where we collected the data to These traces are different from the environments where we collected the data to generate the classifier.

Sensors 2016, Sensors Sensors 2016, 16, 16, 1563 1563

11 of of 15 15 11 11 of 15

generate generate the the classifier. classifier. These These walking walking traces traces contain contain nine nine open open outdoors outdoors segments, segments, 11 11 These walking segments, traces contain nine indoors open outdoors segments, 11deep semi-outdoors segments, 10 light semi-outdoors 10 light segments, and 11 indoors segments. Figure 10 semi-outdoors segments, 10 light indoors segments, and 11 deep indoors segments. Figure 10 indoors segments, and 11 deep indoors segments. Figure 10 shows one of the walking traces that we shows shows one one of of the the walking walking traces traces that that we we experimented experimented with. with. experimented with.

Figure 10. Testing Testing trace located located in the university campus. This trace contains contains two open open outdoors Figure Figure 10. 10. Testing trace trace located in in the the university university campus. campus. This This trace trace contains two two open outdoors outdoors segments, three semi-outdoors segments, twotwo lightlight indoors segments, and one deep indoors segment. segments, three semi-outdoors segments, indoors segments, and one segments, three semi-outdoors segments, two light indoors segments, and one deep deep indoors indoors The volunteer walks along the path in 10 min. segment. segment. The The volunteer volunteer walks walks along along the the path path in in 10 10 min. min.

In the experiments, the volunteer walks along his in same In the volunteer walks along this this pathpath whilewhile usingusing his phone in the same manner theexperiments, experiments,the the volunteer walks along this path while using his phone phone in the the same manner he naturally would. The true environment type is manually labeled. Each day, these traces he naturally would. The trueThe environment type is manually labeled. Each day,Each theseday, traces are tested manner he naturally would. true environment type is manually labeled. these traces are tested times in morning, noon respectively. three timesthree in the morning, and afternoon, respectively. are tested three times in the thenoon morning, noon and and afternoon, afternoon, respectively. We first use the four cellular base stations’ signal strength testing. instances less base stations’ signal strength for for testing. TheThe instances withwith less than We first firstuse usethe thefour fourcellular cellular base stations’ signal strength for testing. The instances with less than four are out. 11 the accuracy for classifiers. four stations are filtered out. Figure 11 shows the average accuracy for different classifiers. than four stations stations are filtered filtered out. Figure Figure 11 shows shows the average average accuracy for different different classifiers.

Figure 11. Classification accuracy using four stations’ signal strength with classifiers using four stations’ signal strength with different classifiers created Figure 11. 11.Classification Classificationaccuracy accuracy using four stations’ signal strength with different different classifiers created by different machine learning algorithms. by different machinemachine learninglearning algorithms. created by different algorithms.

From Figure 11, find that all the have better than 94%. KNN From Figure11, 11,wewe we find allalgorithms the algorithms algorithms have accuracies accuracies better than 94%. KNN From Figure find thatthat all the have accuracies better than 94%. KNN performs performs the best, with an average accuracy is 97.27%. Random Forest performs the worst, as performs the best, with an averageisaccuracy is 97.27%.Forest Random Forestthe performs theitsworst, as its its the best, with an average accuracy 97.27%. Random performs worst, as accuracy is accuracy accuracy is is 94.36%. 94.36%. The The confusion confusion matrix matrix for for KNN KNN algorithm algorithm is is shown shown in in Table Table 4. 4. The The value value aaijij in in this this confusion confusion matrix matrix is is not not the the number number of of the the sample, sample, but but the the percentage, percentage, given given by by the the following following expression: expression:

Sensors 2016, 16, 1563

12 of 15

94.36%. The confusion matrix for KNN algorithm is shown in Table 4. The value aij in this confusion matrix is not the number of the sample, but the percentage, given by the following expression: aij = nij /ni × 100%

(8)

Table 4. Confusion matrix using four stations’ signal strength with KNN classifier. Environment

Deep Indoors

Semi-Outdoors

Light Indoors

Open Outdoors

Deep Indoors Semi Outdoors Light Indoors Open Outdoors

97.1% 0 6.5% 0

0 98.6% 0 0

2.9% 0 93.5% 0

0 1.4% 0 100%

Table 4 shows that we can detect open outdoors environments correctly. There is a 2.9% possibility of identifying deep indoors as light indoors environments, and a 1.4% possibility of identifying semi-outdoors as open outdoors environments. A light indoors environment might be classified as deep indoors with a possibility of 6.5%. However, we can’t always have at least four cellular stations. Figure 9 show that when the window length is 8 s, we can correctly classify all four environments using the KNN algorithm. In Table 5, we apply the different classifiers to detect the environment using 8 s window length. From Table 5, we can see that Random Forest algorithm performs the best in all the measures. In the following comparison, we will apply the Random Forest algorithm for classification. Finally, we compare the proposed indoor/outdoor detection algorithm with the IODetector [2], Co-Training [3] and GPS based detection in terms of accuracy and energy consumption. Table 5. Performance measures using 8 s window length with different algorithms. Algorithm

Accuracy

Sensitivity

Specificity

F-Measure

Precision

SVM KNN DT NB LR NN RF

0.8095 0.8652 0.8861 0.8734 0.7994 0.8677 0.8943

0.8095 0.8652 0.8861 0.8734 0.7994 0.8677 0.8943

0.9365 0.9551 0.9621 0.9578 0.9331 0.9559 0.9647

0.8077 0.8638 0.8856 0.8719 0.7932 0.8658 0.8938

0.8087 0.8634 0.8901 0.8717 0.8051 0.8652 0.8933

The app DrawRSS is updated to collect other required signal strengths, including light intensity, cellular signal strength, magnetic variance, sound intensity, and visible GPS satellites. A screen shot of the application is given in Figure 12. We collect the required signal strength from the four environments for 10 min. The sampling rate is 2 Hz. The IODetector use the light intensity, cellular signal strength and magnetic variance to detect the indoor/outdoor environment. Co-Training uses the light intensity, cellular signal strength and sound intensity for detection. Our proposed algorithm uses the Random Forest algorithm to classify 8 s window length features from one cellular station’s signal strength. These four algorithms can detect different indoor/outdoor environments: co-Training and a GPS-based algorithm are proposed to detect indoor and outdoor environments, while IODetector is capable of detecting outdoors, semi-outdoors, and indoors; our proposed algorithm can detect four kind of different indoor/outdoor environment. In this experiment, all four of these algorithms only detect indoors and outdoors environments. Table 6 shows the results, confirming that the proposed algorithm performs the best among the four algorithms.

Sensors 2016, 16, 1563 Sensors 2016, 16, 1563

13 of 15 13 of 15

Figure 12. 12.Screen Screenshot shot of the updated application. The updated application light of the updated application. The updated application records records the light the intensity, intensity, cellular signal magnetic strength, variance, magnetic sound variance, sound and intensity, visible GPSin satellites in the cellular signal strength, intensity, visibleand GPS satellites the memory. memory. Table 6. Accuracy comparison among the proposed algorithm, IODetector, Co-Training and GPS Table 6. Accuracy comparison among the proposed algorithm, IODetector, Co-Training and GPS based detection. based detection. Random Forest IODetector Random Forest IODetector 61.2% Accuracy Accuracy 95.3% 95.3% 61.2%

Co-Training Co-Training 93.14% 93.14%

GPS

GPS 71.6% 71.6%

Different indoor/outdooralgorithms algorithms require different sensors. As a they result, they consume Different indoor/outdoor require different sensors. As a result, consume different different amounts of energy. According to [4], GPS consumes the most energy, by amounts of energy. According to [4], GPS consumes the most energy, followed by followed microphone, microphone, sensor, and magnetic GSMthe consumes the least energy. Compared with light sensor, light and magnetic sensor. GSMsensor. consumes least energy. Compared with the other the other indoor/outdoor detection algorithms, our proposed algorithm only needs the GSM sensor, indoor/outdoor detection algorithms, our proposed algorithm only needs the GSM sensor, which which consumes energy in to addition to cell-phone standard cell-phone operation, so the proposed consumes minimalminimal energy in addition standard operation, so the proposed algorithm is algorithm is the most energy efficient among the four detection algorithms. the most energy efficient among the four detection algorithms. 5. Discussion Discussionand andConclusions Conclusions Detecting the users’ current environment automatically automatically is crucial for SNAL. In this paper we GSM-based indoor/outdoor indoor/outdoor detection propose a GSM-based detection algorithm. algorithm. Some Some studies studies [25,26] [25,26] show that both the distance from the base station and the difference in a height of mobile phones influence the received signal strength. Our strength. We Our idea idea is is to to not just rely on the signal strength. We apply a machine learning algorithm to classify the neighboring GSM station’s signal in different environments, and identify the users’ users’current current context by signal recognition. test the in algorithm in four different context by signal recognition. We test We the algorithm four different environments. environments. The results show that the proposed algorithm is capable of identifying open The results show that the proposed algorithm is capable of identifying open outdoors, semi-outdoors, outdoors, semi-outdoors, light indoors and deep indoors environments with 100% accuracy using four nearby GSM stations’ signal strength. Large scale experiments have proved the efficiency of

Sensors 2016, 16, 1563

14 of 15

light indoors and deep indoors environments with 100% accuracy using four nearby GSM stations’ signal strength. Large scale experiments have proved the efficiency of the algorithm. The required hardware and signals are widely available in our daily lives, implying its high compatibility and availability. Future work will concentrate on real-time indoors/outdoors detection by introducing new low energy consumption signals that exist pervasively. The proposed approach was tested on only one phone. However, GSM RSSI may be different on different phones even for the same location. The device diversity problem is the one we must focus on in the future. Supplementary Materials: The supplementary materials are available online at http://www.mdpi.com/14248220/16/10/1563/s1. Acknowledgments: This study is supported by the Natural Science Foundation of China (NSFC) program under Grand No. 71031007, No. 71373282. Author Contributions: Li Qun conceived and designed the experiments; Wang Weiping performed the experiments; Shi Zesen and Chen Wei analyzed the data. Chang Qiang wrote the paper. Conflicts of Interest: The authors declare no conflict of interest. The founding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, and in the decision to publish the results.

References 1. 2. 3. 4.

5. 6. 7. 8.

9.

10.

11.

12.

13.

Groves, P.D. Shadow matching: A new GNSS positioning technique for urban canyons. J. Navig. 2011, 64, 417–430. [CrossRef] Cheng, J.; Yang, L.; Li, Y.; Zhang, W. Seamless outdoor/indoor navigation with WIFI/GPS aided low cost Inertial Navigation System. Phys. Commun. 2014, 13, 31–43. [CrossRef] Li, M.; Zhou, P.; Zheng, Y. IODetector: A Generic Service for Indoor/Outdoor Detection. ACM Trans. Sens. Netw. 2014, 11. [CrossRef] Dadu, V.; Katasikouli, P.; Sarkar, R.; Marina, M.K. A Semi-Supervised Learning Approach for Robust Indoor-Outdoor Detection with Smartphones. In Proceedings of the 12th ACM Conference on Embedded Network Sensor Systems, Memphis, TN, USA, 3–6 November 2014. Kaplan, E.D. Understanding GPS: Principles and Applications; Artech House: London, UK, 2005. Caceres, M.A.; Penna, F.; Wymeersch, H.; Garello, R. Hybrid cooperative positioning based on distributed belief propagation. IEEE J. Sel. Areas Commun. 2011, 29, 1948–1958. [CrossRef] Chen, W.; Wang, W.; Li, Q.; Chang, Q.; Hou, H. A Crowd-Sourcing Indoor Localization Algorithm via Optical Camera on a Smartphone Assisted by Wi-Fi Fingerprint RSSI. Sensors 2016, 16, 410. [CrossRef] [PubMed] Bahl, P.; Padmanabhan, V.N. RADAR: An in-building RF-based user location and tracking system. In Proceedings of Nineteenth Annual Joint Conference of the IEEE Computer and Communications Societies, Tel Aviv, Israel, 26–30 March 2000. Walter, D.J.; Groves, P.D.; Mason, R.J.; Harrison, J.; Woodward, J.; Wright, P. Novel Environmental Features for Robust Multisensor Navigation. In Proceedings of the 26th International Technical Meeting of the Satellite Division of the Institute of Navigation (ION GNSS 2013), Nashville, TN, USA, 16–20 September 2013. Groves, P.D.; Martin, H.F.S.; Voutsis, K.; Walter, D.J.; Wang, L. Context detection, categorization and connectivity for advanced adaptive integrated navigation. In Proceedings of the 26th International Technical Meeting of the Satellite Division of the Institute of Navigation (ION GNSS 2013), Nashville, TN, USA, 16–20 September 2013. Muralidharan, K.; Khan, A.J.; Misra, A.; Balan, R.K.; Agarwal, S. Barometric phone sensors: More hype than hope! In Proceedings of the 15th Workshop on Mobile Computing Systems and Applications, Santa Barbara, CA, USA, 26–27 February 2014. Ravindranath, L.; Newport, C.; Balakrishnan, H.; Madden, S. Improving wireless network performance using sensor hints. In Proceedings of the 8th USENIX Conference on Networked Systems Design and Implementation, Boston, MA, USA, 30 March–1 April 2011. Liu, J.; Priyantha, B.; Hart, T.; Ramos, H.S.; Loureiro, A.A.F.; Wang, Q. Energy efficient GPS sensing with cloud offloading. In Proceedings of the 10th ACM Conference on Embedded Network Sensor Systems, Toronto, ON, Canada, 6–9 November 2012; pp. 85–98.

Sensors 2016, 16, 1563

14.

15.

16. 17.

18.

19. 20.

21. 22. 23. 24. 25. 26.

15 of 15

Lin, K.; Kansal, A.; Lymberopoulos, D.; Zhao, F. Energy-accuracy trade-off for continuous mobile device location. In Proceedings of the 8th International Conference on Mobile Systems, Applications, and Services, San Francisco, CA, USA, 15–18 June 2010; pp. 285–298. Elhoushi, M.; Georgy, J.; Wahdan, A.; Korenberg, M.; Noureldin, A. Using portable device sensors to recognize height changing modes of motion. In Proceedings of the 2014 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), Montevideo, Uruguay, 12–15 May 2014; pp. 477–481. Elhoushi, M. Advanced Motion Mode Recognition for Portable Navigation. Ph.D. Thesis, Queen’s University, Kingston, ON, Canada, 2015. Elhoushi, M.; Georgy, J.; Korenberg, M.; Noureldin, A. Robust motion mode recognition for portable navigation independent on device usage. In Proceedings of the 2014 IEEE/ION Position, Location and Navigation Symposium (PLANS 2004), Monterey, CA, USA, 5–8 May 2014; pp. 158–163. Vanini, S.; Giordano, S. Adaptive context-agnostic floor transition detection on smart mobile devices. In Proceedings of the 2013 IEEE International Conference on Pervasive Computing and Communications Workshops (PERCOM Workshops), San Diego, CA, USA, 18–22 March 2013; pp. 2–7. Krumm, J.; Hariharan, R. TempIO: Inside/outside classification with temperature. In Proceedings of the 2nd Workshop on Man-Machine Symbiotic Systems, Kyoto, Japan, 23–24 November 2004. Wu, M.; Pathak, P.H.; Mohapatra, P. Monitoring building door events using barometer sensor in smartphones. In Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing, Osaka, Japan, 7–11 September 2015; pp. 319–323. Ergen, S.C.; Tetikol, H.S.; Kontik, M.; Sevlian, R.; Rajagopal, R.; Varaiya, P. RSSI-fingerprinting-based mobile phone localization with route constraints. IEEE Trans. Veh. Technol. 2014, 63, 423–428. [CrossRef] Chen, R.; Chu, T.; Liu, K.; Liu, J.; Chen, Y. Inferring Human Activity in Mobile Devices by Computing Multiple Contexts. Sensors 2015, 15, 21219–21238. [CrossRef] [PubMed] Rappaport, T.S. Wireless Communications: Principles and Practice; Prentice Hall PTR: Upper Saddle River, NJ, USA, 1996; Volume 2. Orange. Available online: http://www.ailab.si/orange/ (accessed on 3 March 2016). Stanivuk, V. Measurements of the GSM signal strength by mobile phone. In Proceedings of the 2012 20th Telecommunications Forum (TELFOR), Belgrade, Serbia, 20–22 November 2012; pp. 1784–1787. Kyritsi, P.; Cox, D.C. Propagation characteristics of horizontally and vertically polarized electric fields in an indoor environment: Simple model results. In Proceedings of the IEEE 54th Vehicular Technology Conference VTC Fall 2001, Atlantic City, NJ, USA, 7–11 October 2001; Volume 3, pp. 1422–1426. © 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).

Indoor-Outdoor Detection Using a Smart Phone Sensor.

In the era of mobile internet, Location Based Services (LBS) have developed dramatically. Seamless Indoor and Outdoor Navigation and Localization (SNA...
12MB Sizes 0 Downloads 12 Views