
Recovering Chaotic Properties From Small Data

Chenxi Shao, Fang Fang, Qingqing Liu, Tingting Wang, Binghong Wang, and Peifeng Yin

Abstract: Physical properties are essential for studying a chaotic system that generates discrete-time signals, but recovering the chaotic properties of a signal source from small data is difficult. Existing chaotic models handle this case poorly because most of them need big data to expose those properties. In this paper, geometric theory is used to address the problem. We build a smooth trajectory from the series to implicitly exhibit its chaotic properties with the series-nonuniform rational B-spline (S-NURBS) modeling method, which our team proposed for modeling slow-changing chaotic time series. For validation, we show how well the model recovers the properties from both the statistical and the chaotic aspects to confirm its effectiveness. Finally, a practical chaotic model is built to recover the chaotic properties contained in the Musa standard dataset, which is used in software reliability analysis, further supporting the credibility of the model for practical time series. The effectiveness of S-NURBS modeling leads us to believe that studying chaotic systems from a geometric perspective is a feasible and worthwhile research area. For this reason, we believe this work opens a new direction for chaotic system research.

Index Terms: Chaotic properties, S-NURBS, time series, validation.

I. Introduction

Chaotic properties are crucial for studying chaotic behavior, whether in theoretical analysis [1] or in determining the practical predictability [2] of a system. When dealing with a small chaotic sequence, most of the time we

Manuscript received March 28, 2013; revised December 1, 2013 and February 21, 2014; accepted February 28, 2014. This work was supported by the National Natural Science Foundation of China under Grant 61174144, Grant 60874065, Grant 10975126, and Grant 91024026. This paper was recommended by Associate Editor R. Lynch. C. Shao is with Computer Science and Technology College, University of Science and Technology of China, Hefei, Anhui 230026, China, and also with the Anhui Province Key Laboratory of Software in Computing and Communication, Hefei, Anhui 230027, China (e-mail: [email protected]). F. Fang is with Computer Science and Technology College, University of Science and Technology of China, Hefei, Anhui 230026, China (e-mail: [email protected]). Q. Liu was with Computer Science and Technology College, University of Science and Technology of China, Hefei, Anhui 230026, China. He is now with IFLYTEK, Hefei 230088, China (e-mail: [email protected]). T. Wang was with Computer Science and Technology College, University of Science and Technology of China, Hefei, Anhui 230026, China. She is now with ZTE Corporation, Nanjing 320100, China (e-mail: [email protected]). B. Wang is with the Institute of Theoretical Physics, Department of Modern Physics, University of Science and Technology of China, Hefei, Anhui 230026, China (e-mail: [email protected]). P. Yin is with the Department of Computer Science and Engineering, Pennsylvania State University, State College, PA 16801 USA (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TCYB.2014.2309989

need to develop specialized algorithms [3] to explore the chaotic properties of the signal source. Developing such algorithms is so challenging for interdisciplinary researchers that most of them turn to a more practical route: studying a model [4] built from the series. Since the 1990s, many models for chaotic time series have emerged, but almost all of them suffer large errors and are not usable in practice when recovering chaotic properties from small data. Below is a brief summary of several widely used methods and some extended applications based on them.

In 1987, Cremers and Hubler [5] proposed continuous-time polynomials to systematically approximate a partial differential equation by a linear combination of many polynomial functions. More recently, Horbelt et al. [6] used this method to study the physical properties of a CO2 laser, and Mangiarotti et al. [7] designed new algorithms to find polynomial formulations and to identify the coefficients. In 1992, Albano et al. [8] modeled chaotic time series with a neural network by training the weights between the input and hidden nodes against a given error in order to form the dynamical system. This method was applied by Kim and Chang [9] to study a chaotic chemical reaction system and by Molkov et al. [10] to determine the embedding dimension from noisy time series. In the same year, Smith [11] applied the radial basis function (RBF) to model chaotic time series by minimizing the error of a linear fit over all basis functions, with the nonlinearity captured by the basis functions themselves. This method was used by Pilgram et al. [12] in 2002 to model the dynamics of nonlinear time series, and by Hao et al. [13] in 2013, who proposed a novel online modeling algorithm for nonlinear and nonstationary systems using an RBF neural network with a fixed number of hidden nodes.
In 1994, Gouesbet and Letellier [14] designed a multivariable polynomial for global vector-field reconstruction, aimed at finding the relationship between the current state and its derivatives from a single time series. In 1998, Bagarinao et al. [15] introduced discrete-time polynomials for chaotic time series, which could capture nonlinearity better than the RBF model. With this approach, proper basis functions can be selected according to the properties of the modeling data, and the model is built from a linear combination of the basis functions. The last method is the rational model. It was first proposed analytically by Gouesbet [16] in 1991, but it was not put into practice until 2000, when Correa et al. [17] modeled time series derived from an electronic oscillator.

All the methods above use either an error minimization strategy or an acceptable error to end the modeling process, so the accurate series used for modeling cannot be resampled

© 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. 2168-2267 See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

IEEE TRANSACTIONS ON CYBERNETICS

from these models. This means that the models lose some information about the time series. In addition, there are other potential problems, such as the ability to explain nonlinearity, the selection of polynomials and parameters, numerical stability, and model runtime. Each of these shortcomings can cause uncertainty when studying the properties of the original system.

Furthermore, this is the era of big data, and there are many algorithms and mechanisms for dealing with big data. In this paper, small data refers to sparse data [18]. If the chaotic properties of a system can be recovered from small data, then the amount of valid data can be multiplied by building a model from the small data, which means that small data sets can be processed by methods that rely on big data handling. A new modeling method that recovers chaotic data from small data while capturing the chaotic properties of the system therefore has wide application.

We therefore put forward a new idea, studying the physical phenomenon from a geometric aspect: a smooth continuous trajectory that passes through all modeling points is reconstructed to recover the chaotic properties of the system. To achieve this goal, we develop a new modeling method called the series-nonuniform rational B-spline model, S-NURBS for short, by introducing the time parameter into NURBS geometric modeling. NURBS [19] has a unified mathematical expression for both free-form curves and standard curves, and can build the corresponding curve from known discrete data points with high precision. It also has many good mathematical properties and mature geometric techniques. By introducing the time parameter into NURBS, S-NURBS retains the advantages of NURBS and is better suited to time series analysis. The S-NURBS modeling method constructs a mathematical equation between the time parameter t and the data value.
So, we can get the data value at any time by giving the time value. This shows the evolution of the system more clearly and is convenient in practical applications.

In Section II, we introduce how to reconstruct a finite trajectory model from small data with the S-NURBS model. Section III presents the statistical and chaotic validation of the S-NURBS method by modeling five benchmark chaotic systems to confirm whether the model can recover the chaotic features of the original system. In Section IV, our method is applied to the Musa standard dataset to build a chaotic model for recovery and forecasting. Finally, the main conclusions about reconstruction, validation, and application are drawn in Section V.

II. S-NURBS Model

Our group [20] proposed introducing the time parameter into NURBS geometric modeling in 2011, but that was only a preliminary exploration. With further study, the optimized S-NURBS model is discussed extensively and comprehensively as follows. The direct control variable of the S-NURBS model is the time variable t, the indirect control variable is the parameter value u, and the final control variable is the data value of the time series points. Therefore, it needs two major

steps to achieve the conversion. The specific implementation steps are as follows.

A. Constructing k-Order NURBS Model

A k-order NURBS curve model is defined as

$$c(u) = \frac{\sum_{i=0}^{n+k-1} p_i w_i N_{i,k}(u)}{\sum_{i=0}^{n+k-1} w_i N_{i,k}(u)} \tag{1}$$

where $p_i$ is the control point, $w_i$ is the weight of the control point, $N_{i,k}(u)$ is the B-spline basis function, and u is a continuous value bounded by the knot vector. In order to construct the NURBS model from a given series $X = \{x_0, x_1, \ldots, x_n\}$, we must first calculate the parameters of the model: the knot vector U, the B-spline basis functions, the control points, and their weights. Each calculation is described as follows.

1) Parameterized Method for Knot Vector: Suppose we are given a time series $X = \{x_0, x_1, \ldots, x_n\}$. To interpolate these points with a k-order B-spline curve, we need to assign a parameter value $\bar{u}_j$ to each $x_j$ and select an appropriate knot vector $U = \{u_0, u_1, \ldots, u_m\}$. Four common methods for calculating the knot vector U have been reported in previous studies [19]. The uniform parametric method is suitable when all edges connecting adjacent data points have roughly equal length. The centripetal parametric method applies when the angles between neighboring edges are small. The accumulated chord length parametric method is regarded as the best, because the chord lengths reflect the distribution of the data points. The curvature parametric method instead uses arc length to reflect the distribution of the data points; it is appropriate when the chord length is far shorter than the arc length, but the computation is time-consuming. In practical applications, we should choose a method according to the distribution characteristics of the data points. After $\bar{u}_j$ is determined, we obtain the knot vector $U = \{u_0, u_1, \ldots, u_m\}$, where

$$u_0 = \cdots = u_k = 0, \quad u_{m-k} = \cdots = u_m = 1, \quad u_{j+k} = \bar{u}_j, \ j = 1, 2, \ldots, n-k, \quad m = n + 2k. \tag{2}$$
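The accumulated chord length parameterization and the clamped knot vector of (2) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names `chord_length_params` and `clamped_knot_vector` are hypothetical, and the sketch assumes that the interior knots are the parameters of all interior data points, which gives the m + 1 = n + 2k + 1 knots implied by m = n + 2k.

```python
import math

def chord_length_params(points):
    """Accumulated chord-length parameters in [0, 1], one per data point."""
    dists = [math.dist(a, b) for a, b in zip(points, points[1:])]
    total = sum(dists)
    if total == 0:
        raise ValueError("all data points coincide")
    params = [0.0]
    for d in dists:
        params.append(params[-1] + d / total)
    params[-1] = 1.0  # guard against rounding drift at the last point
    return params

def clamped_knot_vector(params, k):
    """k+1 zeros, the interior parameters, then k+1 ones (assumed layout)."""
    interior = params[1:-1]
    return [0.0] * (k + 1) + list(interior) + [1.0] * (k + 1)

# Four illustrative data points (n = 3) and a cubic curve (k = 3).
points = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0), (11.0, 10.0)]
params = chord_length_params(points)
knots = clamped_knot_vector(params, k=3)
```

Points that are far apart receive parameters that are far apart, which is why this choice tracks the distribution of the data.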
2) Calculating the B-Spline Basis Function: We usually calculate the B-spline basis function $N_{i,k}(u)$ with the de Boor-Cox recursive formula, which is easy to program:

$$N_{i,0}(u) = \begin{cases} 1 & \text{if } u_i \le u < u_{i+1} \\ 0 & \text{otherwise} \end{cases}$$

$$N_{i,k}(u) = \frac{u - u_i}{u_{i+k} - u_i} N_{i,k-1}(u) + \frac{u_{i+k+1} - u}{u_{i+k+1} - u_{i+1}} N_{i+1,k-1}(u). \tag{3}$$

3) Weight of Control Points: The weights of the control points must be obtained by inverse calculation from the weights of the data points; the relative size of a data point's weight represents its influence on the shape of the model. Generally, when all data points are equally important for generating the model, we can set each data point's weight to $w_j^* = 1$, $j = 0, 1, \ldots, n$.
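The de Boor-Cox recursion in (3) translates almost directly into code. The sketch below is illustrative only: the name `bspline_basis` is hypothetical, and it adopts the usual convention, not stated in the text, that a term with a zero denominator (repeated knots) contributes zero.

```python
def bspline_basis(i, k, u, knots):
    """Value of N_{i,k}(u) by the de Boor-Cox recursion; 0/0 terms count as 0."""
    if k == 0:
        # Half-open interval: the right end of the knot range is excluded.
        return 1.0 if knots[i] <= u < knots[i + 1] else 0.0
    left_den = knots[i + k] - knots[i]
    right_den = knots[i + k + 1] - knots[i + 1]
    left = 0.0
    right = 0.0
    if left_den > 0:
        left = (u - knots[i]) / left_den * bspline_basis(i, k - 1, u, knots)
    if right_den > 0:
        right = ((knots[i + k + 1] - u) / right_den
                 * bspline_basis(i + 1, k - 1, u, knots))
    return left + right

# Cubic basis on a clamped knot vector with one interior knot.
knots = [0.0, 0.0, 0.0, 0.0, 0.5, 1.0, 1.0, 1.0, 1.0]
k = 3
n_basis = len(knots) - k - 1
values = [bspline_basis(i, k, 0.25, knots) for i in range(n_basis)]
```

Inside the knot range the basis functions are nonnegative and sum to one, which is a convenient sanity check for any implementation.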


The relationship between the data points' weights and the corresponding control points' weights is as follows:

$$\begin{cases} w_j^* = \sum_{i=1}^{n+k-1} N_{i,k}(u_{j+k})\, w_i, & j = 0, 1, \ldots, n \\ w_i \ge 0, & i = 0, 1, \ldots, n+k-1. \end{cases} \tag{4}$$

Setting the control point weights that correspond to the data points directly to 1 is just one special solution of the equations above. We can also use optimization theory and algorithms, such as quadratic programming, to obtain an optimum solution for the control points' weights; the necessary instructions and the algorithm implementation are given in Appendix B. Modelers can choose the weight series manually to approximate the trajectory of the original system as closely as desired, as long as they know enough about the system. For example, the endpoints and the extreme value points have greater influence on the model shape; for the sake of simplicity, the weights of these data points are set to 1, and the weights of the other data points are selected at random in the interval [0.5, 1).

4) Calculating Control Points: Consider the control point series $P = \{p_0, p_1, \ldots, p_{n+k-1}\}$. According to the local support property of the B-spline basis function and the given boundary conditions, the problem of calculating the control points can be converted into solving the linear equations

$$x_{i-k} = \begin{bmatrix} N_{i-k,k}(u_i) & N_{i-k+1,k}(u_i) & \cdots & N_{i,k}(u_i) \end{bmatrix} \times \begin{bmatrix} p_{i-k} & p_{i-k+1} & \cdots & p_i \end{bmatrix}^{T}, \quad i = 1, \ldots, n+k-1. \tag{5}$$

Once the knot vector, the control points, and their weights have been determined, a NURBS curve is determined uniquely. That means a mapping from the parametric space to the trajectory space has been achieved. At this point, the NURBS model is complete.

B. Adding Time Parameter Into NURBS Expression

To make the NURBS modeling technique better suited to time series analysis and applications, we must add the time parameter t into it effectively.
The main work of this step is to build the function between the knot vector value u and the time t. Uniform mapping was the first method we considered, but the nonuniform variation of the data points with t leads to a large error. Therefore, we adopt a piecewise uniform mapping within each segment, given in (6), to reduce the error. The maximum error is the largest time step of the time series. The mapping relationship is as follows:

$$u = f(t) = \begin{cases} \dfrac{t - t_i}{t_{i+1} - t_i}(u_{i+1} - u_i) + u_i, & t \in (t_i, t_{i+1}) \\ u_i, & t = t_i. \end{cases} \tag{6}$$

It is worth noting that the piecewise uniform mapping presented above can be replaced by another nonlinear mapping, as long as that mapping reflects the intrinsic properties of the time series more realistically. If modelers are dissatisfied with the mapping method presented in this paper for a particular time series, they are free to use a more effective nonlinear time mapping method.
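The piecewise uniform mapping in (6) is a linear interpolation between consecutive (time, parameter) pairs. A minimal sketch follows; the function name `time_to_param` and the sample values are hypothetical, and the sample times are assumed to be strictly increasing.

```python
from bisect import bisect_right

def time_to_param(t, times, params):
    """Map time t to parameter u by linear interpolation inside each segment."""
    if not times[0] <= t <= times[-1]:
        raise ValueError("t lies outside the sampled time range")
    # Index of the segment [t_i, t_{i+1}] that contains t.
    i = min(bisect_right(times, t) - 1, len(times) - 2)
    frac = (t - times[i]) / (times[i + 1] - times[i])
    return params[i] + frac * (params[i + 1] - params[i])

# Nonuniformly sampled times and the parameter assigned to each sample.
times = [0.0, 1.0, 3.0, 4.0]
params = [0.0, 0.25, 0.5, 1.0]
u_mid = time_to_param(2.0, times, params)   # halfway through the second segment
u_node = time_to_param(3.0, times, params)  # exactly at a sample time
```

At a sample time the mapping returns the sample's own parameter, so the curve still passes through the original data point at that time.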


Finally, we replace the variable u with the function u = f(t), and get the S-NURBS model c(t) as

$$c(u) = c(f(t)) = \frac{\sum_{i=0}^{n+k-1} p_i w_i N_{i,k}(f(t))}{\sum_{i=0}^{n+k-1} w_i N_{i,k}(f(t))} \tag{7}$$
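Evaluating (7) amounts to computing the rational combination in (1) at u = f(t). The sketch below illustrates this for scalar control points; for a multidimensional series the same evaluation would be applied to each component. All names (`basis`, `nurbs_point`, `s_nurbs_value`) and the toy control points, weights, and time mapping are assumptions made for the example, not values from the paper.

```python
def basis(i, k, u, knots):
    """N_{i,k}(u) by the de Boor-Cox recursion; zero-denominator terms are 0."""
    if k == 0:
        return 1.0 if knots[i] <= u < knots[i + 1] else 0.0
    total = 0.0
    left_den = knots[i + k] - knots[i]
    right_den = knots[i + k + 1] - knots[i + 1]
    if left_den > 0:
        total += (u - knots[i]) / left_den * basis(i, k - 1, u, knots)
    if right_den > 0:
        total += (knots[i + k + 1] - u) / right_den * basis(i + 1, k - 1, u, knots)
    return total

def nurbs_point(u, ctrl, weights, knots, k):
    """Rational combination of scalar control points; valid for u in [0, 1)."""
    num = 0.0
    den = 0.0
    for i, (p, w) in enumerate(zip(ctrl, weights)):
        b = w * basis(i, k, u, knots)
        num += p * b
        den += b
    return num / den

def s_nurbs_value(t, f, ctrl, weights, knots, k):
    """Model value at time t: evaluate the curve at u = f(t)."""
    return nurbs_point(f(t), ctrl, weights, knots, k)

# Toy quadratic model (k = 2) with four control points.
ctrl = [0.0, 1.0, 3.0, 2.0]
weights = [1.0, 0.5, 0.8, 1.0]
knots = [0.0, 0.0, 0.0, 0.5, 1.0, 1.0, 1.0]
f = lambda t: t / 10.0  # hypothetical time mapping for t in [0, 10)
value = s_nurbs_value(5.0, f, ctrl, weights, knots, 2)
```

Because only the ratio of weights enters the formula, multiplying every weight by the same constant leaves the curve unchanged.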

C. Some Advantages of S-NURBS

By introducing the time parameter into NURBS, S-NURBS retains the properties of the NURBS model and makes the one-to-one mapping between the time value and the time series data value explicit. As one of the best-established methods in shape modeling, NURBS provides a mature platform as well as many proven mathematical properties. Most importantly, it can give an equivalent expression of the original trajectory if enough information is obtained, whereas other shape modeling algorithms such as local linear interpolation and Bezier curve modeling only approximate the original trajectory and cannot eliminate the recovery errors. For details, the reader can refer to our previous work [21], which compared the interpolation method based on S-NURBS with other interpolation methods (linear, parabolic, cubic, and quartic interpolation) and showed its advantages. In addition, a simple comparison between S-NURBS modeling and local linear interpolation modeling is presented in Appendix C.

III. Recovering Properties by S-NURBS Modeling

At this point, we have reconstructed a possible trajectory for the small data with the S-NURBS model. However, we only know that the small set of time series points and their corresponding times are correct in the model. If we want to recover the properties of the original system, we must make sure that the rest of the trajectory is also similar to the original one. This question must be settled before we use the model to recover a practical system, so it is the core issue of this paper; it amounts to validating the effectiveness of the model.

To confirm the recovering effect of the S-NURBS modeling method, we take both autonomous and nonautonomous continuous-time chaotic systems as study subjects, and validate the models built with small data generated by these systems from statistical, topological, dynamical, and geometrical aspects. For practical validation, we evaluate the performance by measuring the difference between series sampled from the original and the reconstructed trajectories, which also supports the calculation of statistical and chaotic invariants of the models. Five well-known benchmark systems (the Lorenz system, the Rössler system, Chua's system, the forced van der Pol oscillator, and the forced Duffing oscillator) are employed, and the generation of their chaotic time series is described in Appendix A. We then model an S-NURBS curve with a small part of each series, and sample the model with the same interval used in sampling the system to get a time series generated by our model.


In Section III-A, two classic statistical measures, the mean absolute error and Pearson's product-moment correlation coefficient, are used to validate the statistical performance of our model. In Section III-B, we choose two qualitative methods that are commonly used to indicate topological features, trajectory embeddings and the Poincaré section, to study whether our model is capable of explaining the topological properties of the system. In Section III-C, we calculate two chaotic invariants, the largest Lyapunov exponent and the correlation dimension, to study the chaotic properties of the time series generated by the system and by the S-NURBS model.

A. Analysis of Statistical Properties

Here we mainly validate the statistical properties of the S-NURBS model, i.e., we consider the value difference and the shape similarity between the two trajectories to study the reconstructing ability of the model. We select two representative measures: the mean absolute error (MAE), which indicates the value difference between each benchmark system and its S-NURBS model, and Pearson's product-moment correlation coefficient (PMCC), which shows their similarity in evolutionary shape. The time series generated by the system is defined as $X = \{x_1, x_2, \ldots, x_n\}$ and the series generated by the S-NURBS model is defined as $Y = \{y_1, y_2, \ldots, y_n\}$.

1) Mean Absolute Error: The formula of MAE is as follows:

$$\mathrm{MAE} = \frac{\sum_{k=1}^{n} \lVert x_k - y_k \rVert}{n}. \tag{8}$$

The MAE of two time series is used as an indication of the value difference. The smaller the MAE value, the closer our model is to the system. However, MAE alone does not reflect model performance, because the error magnitude depends on the numerical scale of the time series used for modeling. We therefore define a new metric, the comparable mean absolute error (CMAE), by dividing the MAE by the difference between the maximal and minimal values in the series, which reflects the system's numerical scale: CMAE = MAE/(max(data) − min(data)). The CMAE can be used to compare different models on a common scale; the smaller the CMAE, the closer the S-NURBS curve is to the original trajectory. Table I lists the MAE and CMAE values of the five benchmark chaotic systems. Table I shows that the S-NURBS method models slow-changing chaotic time series very well, because all the models are very close to the original trajectories: the maximum MAE is no larger than 0.2 and the maximum CMAE is no higher than 0.17%. We also use the CMAE to compare which systems our model fits better. Taking the Lorenz system and the Rössler system as an example, Table I shows that our model is better suited to modeling the time series generated by the Rössler system than those generated by the Lorenz system. This is reasonable because the Rössler system, with only one nonlinear term, is simpler than the Lorenz system.
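The two error measures can be computed in a few lines. The sketch below handles scalar series with the absolute value as the norm, and assumes that "data" in the CMAE definition refers to the original system series; the function names and the sample numbers are illustrative only.

```python
def mae(system, model):
    """Mean absolute error between two equally long scalar series."""
    if len(system) != len(model) or not system:
        raise ValueError("series must be non-empty and of equal length")
    return sum(abs(x - y) for x, y in zip(system, model)) / len(system)

def cmae(system, model):
    """MAE divided by the value range of the system series (assumed 'data')."""
    value_range = max(system) - min(system)
    if value_range == 0:
        raise ValueError("system series is constant")
    return mae(system, model) / value_range

# Made-up numbers, chosen only to show the effect of rescaling.
system = [1.0, 3.0, 5.0, 9.0]
model = [1.5, 3.0, 4.5, 9.0]
err = mae(system, model)       # 0.25
rel_err = cmae(system, model)  # 0.25 / 8 = 0.03125
```

Multiplying both series by the same factor multiplies the MAE by that factor but leaves the CMAE unchanged, which is what makes it comparable across systems of different numerical scale.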

TABLE I
MAE Value and CMAE Value of Every Free Variable

∗1 A represents the Lorenz system. B represents the Rössler system. C represents Chua's system. D represents the forced van der Pol system. E represents the forced Duffing system.
∗2 Column S shows the MAE between the S-NURBS model and the system, taking the combination of all free variables [(x, y, z) or (x, y)] as the phase space.
∗3 Each column (X, Y, Z) refers to the MAE or the CMAE of the corresponding free variable in its reconstructed phase space.
∗4 The value None means that there is no such variable.

TABLE II
PMCC of Every Free Variable

∗1 The meanings of systems A–E are given in Table I.
∗2 Each column (X, Y, Z) refers to the PMCC of the corresponding free variable.
∗3 The value None means that there is no such variable.

2) Pearson's Product-Moment Correlation Coefficient: The formula of PMCC is as follows:

$$\mathrm{PMCC} = \frac{\operatorname{cov}(X, Y)}{\sigma(X) \cdot \sigma(Y)}. \tag{9}$$
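Equation (9) can be evaluated directly from its definition. In this minimal sketch the name `pmcc` is hypothetical, and population (1/n) normalization is used for both the covariance and the standard deviations; the normalization cancels in the ratio, so sample (1/(n−1)) normalization would give the same value.

```python
import math

def pmcc(xs, ys):
    """Pearson correlation: covariance over the product of standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

xs = [0.0, 1.0, 2.0, 4.0, 7.0]
same_shape = [2.0 * x + 1.0 for x in xs]  # scaled and shifted copy
mirrored = [-x for x in xs]               # sign-flipped copy
r_same = pmcc(xs, same_shape)
r_mirror = pmcc(xs, mirrored)
```

A scaled and shifted copy of a series has a coefficient of 1 even though its values differ, which is why PMCC complements MAE: it measures shape rather than value agreement.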

The numerator is the covariance, and the denominator is the product of the standard deviations. The PMCC of two time series is used as an indication of similarity in evolutionary shape. The larger the correlation coefficient, the more strongly one series is positively related to the other and the more similar their shapes are; its maximum value is 1. Table II shows that the S-NURBS curve is strikingly similar to the original trajectory in evolutionary shape: all the PMCC values are higher than 99.97%, which means that the S-NURBS model follows almost the same evolution laws as the actual system.

We have now studied the statistical properties of the S-NURBS model with MAE and PMCC. The CMAE between the time series generated by the benchmark systems and their S-NURBS models is no more than 0.2%; in other words, our model matches the original values at a level higher than 99.8%. The PMCC of the systems and their models shows that the similarities in evolutionary shape are higher than 99.97% in the experiments. These two statistical indices illustrate that our model curve closely matches the actual trajectory. In the next section,


Fig. 1. Trajectory embeddings of the X variable in each benchmark chaotic system and its corresponding S-NURBS model. The left is the trajectory embedding generated by the original time series, and the right is the trajectory embedding generated by the time series of the S-NURBS model. They are generated with the same time delay. (a) Lorenz system. (b) Rössler system. (c) Chua's system. (d) Forced van der Pol oscillator. (e) Forced Duffing oscillator.

we will study the topological properties of the reconstructed model qualitatively by comparing figures that show the topological properties of the attractor on a 2-D plane.

B. Graphical Chaotic Model Validation Techniques Analysis of the Model

Graphical validation techniques are qualitative: they compare, in a 2-D plane, two graphs with chaotic character, one generated by the benchmark system and one by its model. In this section, we choose two indispensable and commonly used graphical validation techniques, trajectory embeddings and the Poincaré section, to study the topological properties of the S-NURBS model. A trajectory embedding is the projection of the phase space onto a 2-D plane, and a Poincaré section is the intersection set of the phase trajectory with a low-dimensional surface. Both reflect the chaotic evolutionary features of the phase space, so we can use these two validation methods to check whether the trajectory reconstructed by S-NURBS recovers the evolutionary features of the original trajectory.

1) Trajectory Embeddings: When analyzing a chaotic system, we can plot its steady-state trajectory in the phase space. For a system with one degree of freedom, this is done by plotting the free variable against its conjugate variable as $(y(t), \dot{y}(t))$. In our situation, we cannot plot a continuous $\dot{y}(t)$ because we are dealing with time series. However, once the time delay τ has been calculated, y(t − τ) is related to $\dot{y}(t)$, which leads us to plot (Y(t), Y(t − τ)) instead. We usually define (Y(t), Y(t − τ), Y(t − 2τ), . . . , Y(t − (m − 1)τ)) as a phase space. As two dimensions of the phase space, (Y(t), Y(t − τ)) defines

the embedded trajectory in the so-called pseudophase plane. The embedded trajectory is a projection of the phase space and has many properties similar to those of the original attractor of the system. Therefore, we can estimate the topological properties of the S-NURBS model by comparing the embedded trajectories of the reconstructed and the original trajectory. We use the time series generated by the free variable X in every benchmark system and the corresponding time series generated by the S-NURBS model to plot their embedded trajectories with the same time delay. The embedded trajectory of the free variable X in every system is shown in Fig. 1. Fig. 1 shows that the embedded trajectory structure of the S-NURBS model is extremely similar to that of the original system. Because the embedded trajectory exhibits many topological properties of the system, this gives preliminary confirmation that the S-NURBS model can recover the topological properties of the original system. Next, we employ the Poincaré section to study the topological properties of the S-NURBS model further.

2) Poincaré Section: For a continuous-time trajectory, when we select a section (named the Poincaré section) in phase space that is not tangent to the trajectory and does not contain it, we get a series of intersections between the trajectory and the section. These intersections form a special graph on the section: when the continuous-time trajectory is formed by a chaotic system, the graph is expected to be a dense, fragmented, fractal set of points on the section. Because the intersections are generated by motion on the chaotic attractor, we can study how similar the topological properties of the reconstructed trajectory are by comparing the graphs formed by the intersections on the same Poincaré section.
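Both graphical checks reduce to simple operations on sampled data. The sketch below builds delay vectors (Y(t), Y(t − τ), …) from a scalar series and collects the points where a sampled trajectory crosses a section. It is illustrative only: the function names are hypothetical, the delay `tau` is given in samples, and the section is assumed to be a coordinate hyperplane crossed in the increasing direction, with crossings located by linear interpolation between consecutive samples.

```python
def delay_embed(series, tau, dim=2):
    """Delay vectors (y[t], y[t - tau], ..., y[t - (dim - 1) * tau])."""
    start = (dim - 1) * tau
    return [tuple(series[t - j * tau] for j in range(dim))
            for t in range(start, len(series))]

def poincare_crossings(traj, axis, level):
    """Interpolated points where coordinate `axis` crosses `level` upward."""
    hits = []
    for a, b in zip(traj, traj[1:]):
        if a[axis] < level <= b[axis]:
            s = (level - a[axis]) / (b[axis] - a[axis])
            hits.append(tuple(ai + s * (bi - ai) for ai, bi in zip(a, b)))
    return hits

# Toy inputs, chosen only to make the behavior easy to follow.
series = [0, 1, 2, 3, 4, 5]
embedded = delay_embed(series, tau=2)
traj = [(0.0, -1.0), (1.0, 1.0), (2.0, -1.0), (3.0, 1.0)]
section = poincare_crossings(traj, axis=1, level=0.0)
```

Applying the same `tau`, `axis`, and `level` to the system series and to the model series gives two point sets that can be plotted side by side, in the manner of Figs. 1 and 2.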


Fig. 2. Poincaré sections of each benchmark chaotic system and its corresponding S-NURBS model. The left is the Poincaré section generated by the original system, and the right is the section generated by the S-NURBS model. They are generated with the same parameters, which are shown below the figures. (a) Lorenz system. (b) Rössler system. (c) Chua's system. (d) Forced van der Pol oscillator. (e) Forced Duffing oscillator.

We model the combination of all free variables in each system with the S-NURBS method (taking the Lorenz system as an example, we model the multidimensional time series (X, Y, Z)). Then we plot the Poincaré sections of the phase trajectories of the system and of the model to compare their similarity. The Poincaré sections of every system and its S-NURBS model are shown in Fig. 2 (the section parameters are given below each graph). Comparing the Poincaré sections in each part of Fig. 2, we find that the section of the trajectory generated by the S-NURBS model is very similar to that of the original trajectory. Because the distribution of intersections on a Poincaré section shows part of the topological features of the system, this supports the conclusion that the S-NURBS model recovers the topological features of the original trajectory well.

In this section, we validated the S-NURBS model qualitatively by comparing the embedded trajectory and the Poincaré section of every benchmark system with those of its S-NURBS model; we found that the topological properties of the reconstructed trajectory are very similar to those of the original trajectory. Next, we calculate the chaotic invariants of the reconstructed trajectory to confirm quantitatively whether the S-NURBS model exhibits the chaotic properties of the system. The largest Lyapunov exponent and the correlation dimension are used in the next section to analyze whether the S-NURBS model recovers the dynamical and geometrical properties of the original system.

C. Analysis of the Time Series Invariants

In practice, we usually calculate some invariants of a time series to show the chaotic properties of a complex system.

When validating the S-NURBS model, we choose the Lyapunov exponent and the correlation dimension to analyze the chaotic properties of the model, because they are very commonly used to show the chaotic properties of time series and to validate chaotic models. The Lyapunov exponent describes the orbit divergence rate of a system, and the correlation dimension describes the convergence dimension of a system (i.e., the status of its phase points). These two indices can serve as model validation criteria, both to express how far the chaotic properties of the S-NURBS model deviate and to estimate whether the constructed trajectory recovers the chaotic properties of the system.

1) Largest Lyapunov Exponent: Sensitivity to initial values is a basic characteristic of chaotic dynamics, i.e., if two trajectories pass through two nearby points in the same direction, they diverge at an exponential rate. This phenomenon is called unstable behavior. The Lyapunov exponent is an index of the mean divergence rate of neighboring trajectories along a certain direction and indicates the dynamical instability of the system. A chaotic attractor must have a positive Lyapunov exponent, whereas none of the Lyapunov exponents of a nonchaotic attractor is positive. Because estimating all Lyapunov exponents from a finite time series suffers large errors, we study the dynamical difference between the system and our model by calculating only the error of the largest Lyapunov exponent; in many circumstances the largest Lyapunov exponent is the only positive one and thus the one that expresses the orbit instability. We calculate the largest Lyapunov exponent of the time series generated by each free variable in the benchmark systems

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination. SHAO et al.: RECOVERING CHAOTIC PROPERTIES FROM SMALL DATA


TABLE III Maximum Lyapunov Exponent of Each System
∗ The rows of the free-variable combinations (labeled S, i.e., (x, y, z) or (x, y)) show the maximum Lyapunov exponents calculated in the phase space of the free-variable combination with the same computational parameters. The other rows show the maximum Lyapunov exponents of the free variables calculated in the reconstructed phase space with the same parameters.

TABLE IV Distribution of Maximum Lyapunov Exponent Error

TABLE V Correlation Dimension of Each System
∗ Each row shows the correlation dimension of the free variables in the benchmark chaotic systems with the same computational parameters.

and their S-NURBS models, and define the largest Lyapunov exponent error (LLEE) as the ratio of the absolute difference between the largest Lyapunov exponents of the system and of the model to that of the system (i.e., |MLE system − MLE model| ÷ MLE system × 100%). We then use the distribution of the LLEE values to estimate whether the S-NURBS model recovers the dynamical property of orbit divergence. Each LLEE value is shown in Table III. Note that the time series of a single variable cannot exhibit the physical properties entirely, and that both the selection of parameters for reconstructing the phase space and the calculation of the largest Lyapunov exponent are quite subjective. The largest Lyapunov exponent of a time series generated by a system is therefore not equal to the theoretical Lyapunov exponent; it only indicates the orbit divergence rate in a certain phase space. From Table III, the largest Lyapunov exponent of our S-NURBS model is very close to that of the original system, so our model recovers the orbit divergence rate of the original system well. We summarize the distribution of the LLEE in Table IV. If we take 4% as an acceptable error, the S-NURBS model recovers the dynamical properties of a large part (61.11%) of the time series generated by continuous-time chaotic systems. If we take 10% as an acceptable error, it recovers the dynamical properties of most (83.33%) of the time series. We therefore believe that the S-NURBS model recovers the dynamical properties of the original system well.

2) Correlation Dimension: The correlation dimension measures the complexity of the system's geometrical distribution and describes the distribution of points in phase space, which is called the fractal structure. In a deterministic system, the correlation dimension indicates the number of independent variables needed to generate a system of the corresponding complexity. The correlation dimension of a chaotic system is not an integer, so if the correlation dimension is D = d + δ (0 < δ < 1), then d + 1 is the necessary number of independent variables constructing the system. As a physical invariant, the correlation dimension can be used to recognize whether a time series is generated by a chaotic system and to show some geometrical information about the time series. The larger the correlation dimension, the more complex the distribution of phase points. In this section, we calculate the correlation dimension of the time series generated by each free variable in the benchmark systems and in its S-NURBS model. We then define the correlation dimension error (CDE) as the ratio of the absolute difference between the correlation dimensions of the system and of the model to that of the system (i.e., |CD system − CD model| ÷ CD system × 100%). We use the distribution of the CDE values to estimate whether the S-NURBS model recovers the geometrical properties, i.e., the distribution complexity of phase points. Each CDE value is shown in Table V. As with the largest Lyapunov exponent, the correlation dimension of a time series generated by a system is not equal to the theoretical correlation dimension; it only indicates the distribution complexity of phase points under certain parameters.
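The LLEE and the CDE share the same form, a relative error in percent, and Tables IV and VI count how many series fall within a given error threshold. The following minimal sketch illustrates both steps. The function names and the numerical values are hypothetical and are not taken from the tables of this paper.

```python
def relative_error_percent(system_value, model_value):
    """Relative error in percent: |system - model| / |system| * 100.

    abs() in the denominator keeps the result nonnegative; for positive
    invariants it coincides with the LLEE and CDE definitions in the text.
    """
    if system_value == 0:
        raise ValueError("system value must be nonzero")
    return abs(system_value - model_value) / abs(system_value) * 100.0


def share_within(errors, threshold_percent):
    """Percentage of error values that do not exceed the threshold."""
    within = sum(1 for e in errors if e <= threshold_percent)
    return 100.0 * within / len(errors)


# Made-up (system, model) invariant pairs, for illustration only.
pairs = [(1.50, 1.47), (0.90, 0.97), (2.05, 2.04), (0.10, 0.13)]
errors = [relative_error_percent(s, m) for s, m in pairs]
for threshold in (4.0, 10.0):
    print(f"within {threshold}%: {share_within(errors, threshold):.2f}% of series")
```

With such a helper, the same routine can be applied to the largest Lyapunov exponents and to the correlation dimensions, which keeps the two error distributions directly comparable.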
From Table V, the correlation dimension of our S-NURBS model is extremely close to that of the original system, so our model recovers the distribution complexity of the phase points of the original system


IEEE TRANSACTIONS ON CYBERNETICS

TABLE VI Distribution of CDE

very well; the largest CDE is only 1.247%. We summarize the distribution of the CDE in Table VI. Table VI shows that the S-NURBS model recovers the geometrical properties of the original system well: it models 46.15% of the time series generated from the benchmark systems within a 0.1% error, and only one group of time series has a CDE higher than 1%. We therefore believe that the S-NURBS model recovers the fractal features of chaotic time series well.

In this section, we have studied the properties of the S-NURBS model from the statistical, topological, dynamical, and geometrical aspects and validated the S-NURBS models built with the time series of five benchmark systems. The curve reconstructed by the S-NURBS method is extremely similar to the original trajectory: the value similarity is more than 99.8% and the evolutive shape similarity is more than 99.97%. Meanwhile, the low-dimensional topological structures of the reconstructed trajectories are very similar to those of the original trajectories. In addition, the reconstructed trajectories recover the chaotic features well: for most groups of time series, they restore the orbit divergence rate within an error of 10% and the distribution complexity of phase points within an error of 1%. All these validation results lead us to conclude that the reconstructed trajectory recovers both the statistical and the chaotic properties of the system very well. The recovery of chaotic properties from artificial small data has thus been achieved. In the next section, we apply the S-NURBS method to the Musa standard dataset to recover chaotic properties from a small piece of a practical time series. The Musa dataset is a group of time series that is very important in software security.
Regarding validation, Aguirre and Billings [22] showed that the methods used above are only necessary conditions for model validity, not sufficient ones: even if a model fails to represent the system, those indices and figures may still look good. After analyzing the existing chaotic analysis methods, they found that only the bifurcation diagram is a sufficient condition for validating a model. However, a precise bifurcation diagram cannot be obtained from a finite time series, so we adopt the several validation methods above to strengthen the reliability of our conclusion. Because we do not know whether this is enough, the application in the next section also serves as a practical validation of the S-NURBS model.

IV. Validate the S-NURBS Model With the Musa Standard Dataset

With the development of computer technology, software is becoming more complex than hardware. It is therefore necessary to guarantee software security. As an important

Fig. 3. Value of Musa standard dataset. We find that the change in value in the dataset is not drastic.

field in software security, software reliability has attracted great attention in past decades, and many software reliability models have been built. Traditional theories of software reliability treat software failure as a stochastic process, but Zou and Li [23] found that the software failure process has both stochastic and chaotic sides. Hence, problems of software reliability can be treated with chaos theory. The Musa standard dataset is an important group of time series in software reliability; it was generated by a test of a system of 21 700 instructions executed by nine programmers at Bell Laboratories and was published by J. D. Musa in 1990. In 2002, Lü et al. [24] showed that the Musa standard dataset is chaotic by estimating its correlation dimension and obtaining a value of 1.49. In this paper, we use the first dataset, which has 136 samples, rather than the datasets with 75 and 86 samples. We model a small part of the Musa standard dataset with the S-NURBS method to validate whether our model is useful for recovering the chaotic properties of the original system in practice. Note that the Musa dataset records failure occurrence times and that the value of the series changes slowly, so S-NURBS can be applied in this situation. The dataset is shown in Fig. 3. Because the value of the dataset changes more drastically than that of the artificial series, we select modeling points every four points rather than every five as in the previous section. The length of the dataset is 136, so the last point in the series cannot be used for modeling if we take a model point at every fourth step. We solve this problem by extending the series by one step (we choose the last value in the dataset as the extension value because it changes the original dataset little) and build the model shown in Fig. 4.
We round the values sampled from the S-NURBS trajectory to the nearest integer because the Musa standard dataset is a set of integers that indicate the time to failure. We then validate the model with the methods used in the previous section. Because the Musa dataset is too short to plot a Poincaré section, we calculate only the quantitative indices; the results are given in Table VII.
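The index arithmetic behind this preparation step (extend 136 samples by one, take every fourth point, round the sampled curve) can be sketched as follows. This is only an illustration under stated assumptions: the series is a synthetic stand-in rather than the Musa data, the helper names are hypothetical, and a piecewise-linear curve is used as a placeholder for the S-NURBS curve, whose construction is not reproduced here.

```python
def extend_by_last_value(series, step):
    """Append copies of the last value until (len - 1) is a multiple of step."""
    extended = list(series)
    while (len(extended) - 1) % step != 0:
        extended.append(extended[-1])
    return extended


def modeling_points(series, step):
    """Return (index, value) pairs taken at every `step`-th sample."""
    return [(i, series[i]) for i in range(0, len(series), step)]


def sample_piecewise_linear(points, length):
    """Placeholder curve: linear interpolation between modeling points.

    This is NOT S-NURBS; it only stands in for a curve sampled at every index.
    """
    values = []
    for (i0, v0), (i1, v1) in zip(points, points[1:]):
        for i in range(i0, i1):
            values.append(v0 + (v1 - v0) * (i - i0) / (i1 - i0))
    values.append(points[-1][1])
    return values[:length]


# Hypothetical stand-in for the 136 integer samples (not the Musa data).
series = [3 * i + (i % 7) for i in range(136)]
extended = extend_by_last_value(series, step=4)
points = modeling_points(extended, step=4)
print(len(series), len(extended), len(points))  # 136 137 35

# Sample the placeholder curve at the original indices and round to integers.
# Python's round() sends exact .5 ties to the nearest even integer.
rounded = [round(v) for v in sample_piecewise_linear(points, len(series))]
```

The counts match those reported in the text: 136 samples are extended to 137, of which 35 are used as modeling points.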


Fig. 4. Original Musa standard dataset and its S-NURBS model. The blue line shows the original Musa dataset, the gently varying red line shows the S-NURBS model, and the blue asterisks show the points used for modeling.

TABLE VII Quantitative Indices of Musa Dataset and S-NURBS Model

Table VII and Fig. 4 show that the S-NURBS method applies well to the Musa standard dataset. The comparable mean absolute error (CMAE) is only 0.45%, which means that the S-NURBS curve is very close to the Musa dataset trajectory. The Pearson's product-moment correlation coefficient (PMCC) is 0.999525, which means that the S-NURBS model reconstructs the shape of the original trajectory almost exactly. Together, these two indices mean that the evolutive trajectory of the S-NURBS model is very similar to the original trajectory. In addition, the LLEE of the model is only 0.604% and the CDE is only 0.941%, which means that the S-NURBS model expresses the chaotic properties of the Musa standard dataset very well. We have thus demonstrated the applicability of S-NURBS in practice by computing both the statistical and the chaotic indices of the Musa dataset and its corresponding S-NURBS model. It also means that the S-NURBS model can be applied in practice to recover chaotic properties from small data (in this experiment, modeling just 35 points recovers the chaotic properties contained in 137 points).
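For readers who wish to reproduce the shape-similarity index, the PMCC is the standard Pearson correlation between the original series and the series sampled from the model. A minimal sketch is given below. The two series are made-up values, not the Musa data or our model output, and the function name is hypothetical.

```python
import math


def pmcc(xs, ys):
    """Pearson's product-moment correlation coefficient of two series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up series for illustration only.
original = [10, 12, 15, 19, 24, 30, 37, 45]
model = [10, 12, 16, 19, 23, 30, 38, 45]
print(f"PMCC = {pmcc(original, model):.6f}")
```

A value close to 1 indicates that the two series rise and fall together, which is the sense in which the PMCC is read as shape similarity in this paper.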

V. Conclusion

In this paper, we first propose to use S-NURBS for recovering chaotic properties from small data by reconstructing a smooth trajectory for signals. We then take five autonomous and nonautonomous chaotic systems to generate artificial small data and validate the model from the statistical, topological, dynamical, and geometrical aspects. By statistical validation (MAE and PMCC), we find that the S-NURBS model trajectories are extremely similar to the original ones: the evolutive error is lower than 0.2% and the similarity of the evolutive trajectory is higher than 99.97%. We then study the chaotic properties of the S-NURBS model further. By qualitative validation (trajectory embeddings and Poincaré sections), which shows the topological information of the system, we find that the S-NURBS model recovers the topological properties of the original system. After qualitative validation, we calculate the largest Lyapunov exponent and the correlation dimension to validate whether the model recovers the chaotic properties from small data. By quantitative validation, we find that 75% of the models keep the largest Lyapunov exponent error within 5% and 88.89% of the models keep it within 10%. Meanwhile, 76.92% of the models keep the CDE below 0.5% and 92.31% of the models keep it within 1%. All these statistical and chaotic indices illustrate that we can recover both the evolutive behaviors and the chaotic properties contained in 10 001 points very well by modeling only 2001 points with the S-NURBS method. Furthermore, we apply the S-NURBS method to recover the chaotic properties from part of the Musa standard dataset, which is very important in the study of software security. In this experiment, we take 35 points to recover the properties contained in 137 points. The experimental results show that the value error is only 0.45%, the shape similarity is higher than 99.95%, and the error in expressing the chaotic properties (i.e., the largest Lyapunov exponent error and the CDE) is no more than 1%.
All these indices mean that the method can also be used to recover chaotic properties from real-world small data, which supports the correctness and usability of the model. In summary, we have shown that the S-NURBS model can recover the statistical and chaotic properties of a system by reconstructing a smooth continuous trajectory from small data. It is also a good way to model slow-changing chaotic signals because the reconstructed trajectory can approach the real one arbitrarily closely as long as we get enough information about the real-world system. In addition, the effectiveness of the S-NURBS model leads us to believe that studying physical systems from a geometric perspective is a feasible and worthy research area. For this reason, we reckon that we have opened up a new horizon for chaotic system research. The tool may be used not only to learn further physical properties of the signal source but also to construct differential equations regardless of the difficulty of polynomial selection. All these research works may lead to new techniques for forecasting and processing the signals or the behaviors of real-world systems.

Appendix A Generating Artificial Time Series

The famous Lorenz [25] system, Rössler [26] system, and Chua's [27] system are chosen to represent the autonomous chaotic systems; at the same time, the forced


van der Pol [28]–[30] oscillator and the forced Duffing [31] oscillator are employed to represent nonautonomous systems. The classical fourth-order Runge–Kutta method is adopted to discretize the trajectories generated by the systems and to sample 11 001 points with a proper sampling interval. Every point has three dimensions (X, Y, Z) in the autonomous systems and two dimensions (X, Y) in the nonautonomous systems. We then remove the first 1000 points to eliminate the transient influence and keep 10 001 points to validate our model. Next, we extract 1/5 of the data (2001 equally spaced points) from the time series to build the S-NURBS model. After modeling, we discretize the S-NURBS curve with the same sampling interval to sample 10 001 points from the model. In the following, the differential equations and the parameters of the five benchmark chaotic systems are described.

A. Differential Equation of Lorenz System

$$
\begin{cases}
\dfrac{dx}{dt} = \sigma(y - x) \\[4pt]
\dfrac{dy}{dt} = x(R - z) - y \\[4pt]
\dfrac{dz}{dt} = xy - bz
\end{cases}
\tag{10}
$$

where we set σ = 16, R = 45.92, b = 4, initial point = (0, 1, 0), and the sampling interval = 0.01.

B. Differential Equation of Rössler System

$$
\begin{cases}
\dfrac{dx}{dt} = -(y + z) \\[4pt]
\dfrac{dy}{dt} = x + ay \\[4pt]
\dfrac{dz}{dt} = b + z(x - c)
\end{cases}
\tag{11}
$$

where we set a = 0.15, b = 0.20, c = 10.0, initial point = (0, 1, 0), and the sampling interval = 0.01.

C. Differential Equation of Chua's System

$$
\begin{cases}
\dfrac{dx}{dt} = \alpha\,(y - x - h(x)) \\[4pt]
\dfrac{dy}{dt} = x - y + z \\[4pt]
\dfrac{dz}{dt} = -\beta y
\end{cases}
\tag{12}
$$

and

$$
h(x) = m_1 x + \frac{1}{2}(m_0 - m_1)\left[\,|x + 1| - |x - 1|\,\right]
$$

where we set α = 9.8, β = 14.87, m0 = −1.27, m1 = −0.68, initial point = (0, 1, 0), and the sampling interval = 0.01.

D. Differential Equation of Forced Van Der Pol Oscillator

$$
\begin{cases}
\dfrac{dx}{dt} = x - \dfrac{x^3}{3} - y + p + q\cos(wt) \\[4pt]
\dfrac{dy}{dt} = c\,(x + a - by)
\end{cases}
\tag{13}
$$

where we set a = 0.7, b = 0.8, c = 0.1, p = 0.0, q = 0.74, w = 1.0, initial point = (0, 1), and the sampling interval = 0.05.

E. Differential Equation of Forced Duffing Oscillator

$$
\frac{d^2x}{dt^2} + \gamma\frac{dx}{dt} - \kappa x + \zeta x^3 = F\cos(\omega t)
$$

i.e.

$$
\begin{cases}
\dfrac{dx}{dt} = y \\[4pt]
\dfrac{dy}{dt} = -\gamma y + \kappa x - \zeta x^3 + F\cos(\omega t)
\end{cases}
\tag{14}
$$

where we set γ = 0.1, κ = 1, ζ = 1, F = 1, ω = 0.5, initial point = (0, 1), and the sampling interval = 0.05.

Appendix B Optimal Weight for the Control Points

The weights of the control points must be obtained by inverse calculation from the weights of the data points. When the data points' weights are given, the optimal weights for the control points are obtained by solving the following quadratic programming problem, which is built from the relationship between them:

$$
\min f(X) = X^{T}X
\tag{15}
$$

subject to

$$
\begin{cases}
w_j^{*} = \displaystyle\sum_{i=0}^{n+k-1} N_{i,k}(u_{j+k})\,w_i, & j = 0, 1, \ldots, n \\[4pt]
w_i \ge 0, & i = 0, 1, \ldots, n + k - 1
\end{cases}
\tag{16}
$$

where

$$
\begin{cases}
X = \left[\,w_1 - w_a^{*},\; w_2 - w_a^{*},\; \ldots,\; w_{n+k-1} - w_a^{*}\,\right]^{T} \\[4pt]
w_a^{*} = \dfrac{1}{n+1}\displaystyle\sum_{j=0}^{n} w_j^{*}
\end{cases}
\tag{17}
$$

After the appropriate control point weights are obtained, the S-NURBS interpolation curve of the actual time sequence can be constructed in accordance with the S-NURBS modeling steps described in the body.

Appendix C Simple Comparison Between S-NURBS and Local Linear Interpolation

In this section, the Lorenz system and the forced van der Pol oscillator are employed to show the recovery performance of S-NURBS. The data used are shown in the previous section. Under some sampling intervals, such as 0.03 (where the model passes through every third data point, i.e., the points {1, 4, 7, . . . , 10 000}), the S-NURBS model cannot pass through the end of the original sampled trajectory, so we discard the last point. The same is done for the other improper sampling intervals, and the error comparison with local linear interpolation is shown in Table VIII. The standard of credibility is determined by the modeler. Taking the Lorenz system as an example, if an error of less than 1 is deemed credible, Table VIII shows that the reconstructed trajectory of the Lorenz system remains credible until the sampling interval exceeds 0.09. Table VIII also indicates that the recovery performance of S-NURBS is always better than that of local linear interpolation, for both the autonomous Lorenz system and the nonautonomous forced van der Pol oscillator.
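The data pipeline of Appendix A and the local linear interpolation baseline of this appendix can be illustrated with the following minimal sketch for the Lorenz system, using the parameters of (10). It is a sketch under stated assumptions rather than the code used for Table VIII: the function names are hypothetical, the S-NURBS model itself is not implemented, and a plain mean absolute error is assumed as the error measure, which may differ from the measure reported in Table VIII.

```python
def lorenz(state, sigma=16.0, R=45.92, b=4.0):
    """Right-hand side of the Lorenz system with the parameters of (10)."""
    x, y, z = state
    return (sigma * (y - x), x * (R - z) - y, x * y - b * z)


def rk4_step(f, state, h):
    """One classical fourth-order Runge-Kutta step for an autonomous system."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(
        s + h / 6.0 * (a1 + 2.0 * a2 + 2.0 * a3 + a4)
        for s, a1, a2, a3, a4 in zip(state, k1, k2, k3, k4)
    )


def generate(f, initial, h, total=11001, transient=1000):
    """Sample `total` points, then drop the first `transient` points."""
    state = tuple(initial)
    points = [state]
    for _ in range(total - 1):
        state = rk4_step(f, state, h)
        points.append(state)
    return points[transient:]


def linear_reconstruction(points, step):
    """Rebuild all samples from every `step`-th point by linear interpolation.

    Assumes (len(points) - 1) is a multiple of `step`.
    """
    rebuilt = []
    for start in range(0, len(points) - 1, step):
        p0, p1 = points[start], points[start + step]
        for j in range(step):
            t = j / step
            rebuilt.append(tuple(a + t * (b - a) for a, b in zip(p0, p1)))
    rebuilt.append(points[-1])
    return rebuilt


def mean_abs_error(original, rebuilt):
    """Mean absolute difference over all samples and coordinates (assumed metric)."""
    total = sum(abs(a - b) for p, q in zip(original, rebuilt) for a, b in zip(p, q))
    return total / (len(original) * len(original[0]))


data = generate(lorenz, (0.0, 1.0, 0.0), h=0.01)  # 10 001 validation points
model_points = data[::5]                           # 2001 modeling points
rebuilt = linear_reconstruction(data, step=5)
print(len(data), len(model_points), len(rebuilt))
print("linear baseline, mean absolute error:", mean_abs_error(data, rebuilt))
```

Increasing `step` mimics a larger sampling interval for the modeling points, so the same routine can be used to observe how the baseline error grows as fewer points are retained.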


TABLE VIII Error of Recovery

Acknowledgment

The authors would like to thank the editors and reviewers for their positive and constructive comments and suggestions, which helped improve the quality of this paper.

References

[1] T. Miyoshi and A. Murata, “Chaotic properties of rhythmic forearm movement,” in Proc. IEEE Int. Conf. Syst., Man, Cybern., vol. 3. Oct. 2000, pp. 2234–2239.
[2] L. Cao, Y. Hong, H. Fang, and G. He, “Predicting chaotic time series with wavelet networks,” Physica D, vol. 85, nos. 1–2, pp. 225–238, 1995.
[3] M. Shen, W. N. Chen, and J. Zhang, “Optimal selection of parameters for nonuniform embedding of chaotic time series using ant colony optimization,” IEEE Trans. Cybern., vol. 43, no. 2, pp. 790–802, Apr. 2013.
[4] Y. Zhao, B. Li, and J. Qin, “H-infinity consensus and synchronization of nonlinear systems based on a novel fuzzy model,” IEEE Trans. Cybern., vol. 43, no. 6, pp. 2157–2169, Dec. 2013.
[5] J. Cremers and A. Hubler, “Construction of differential equations from experimental data,” Z. Naturforsch A, vol. 42, no. 8, pp. 797–802, 1987.
[6] W. Horbelt, J. Timmer, M. J. Bünner, R. Meucci, and M. Ciofini, “Identifying physical properties of a CO2 laser by dynamical modeling of measured time series,” Phys. Rev. E, vol. 64, no. 1, pp. 061222-1–061222-7, 2001.
[7] S. Mangiarotti, R. Coudret, L. Drapeau, and L. Jarlan, “Polynomial search and global modeling: Two algorithms for modeling chaos,” Phys. Rev. E, vol. 86, no. 4, pp. 046205-1–046205-14, Oct. 2012.
[8] A. M. Albano, A. Passamante, T. Hediger, and M. E. Farrell, “Using neural nets to look for chaos,” Physica D, vol. 58, nos. 1–4, pp. 1–9, 1992.
[9] H. J. Kim and K. S. Chang, “A method of model validation for chaotic chemical reaction systems based on neural networks,” Korean J. Chem. Eng., vol. 18, no. 5, pp. 623–629, 2001.
[10] Y. I. Molkov, D. N. Mukhin, E. M. Loskutov, A. M. Feigin, and G. A. Fidelin, “Using the minimum description length principle for global reconstruction of dynamic systems from noisy time series,” Phys. Rev. E, vol. 80, no. 4, pp. 046207-1–046207-6, Oct. 2009.
[11] L. A. Smith, “Identification and prediction of low-dimensional dynamics,” Physica D, vol. 58, nos. 1–4, pp. 50–76, 1992.
[12] B. Pilgram, K. Judd, and A. Mees, “Modelling the dynamics of nonlinear time series using canonical variate analysis,” Physica D, vol. 170, no. 2, pp. 103–117, 2002.
[13] C. Hao, G. Yu, and H. Xia, “Online modeling with tunable RBF network,” IEEE Trans. Cybern., vol. 43, no. 3, pp. 935–947, Jun. 2013.
[14] G. Gouesbet and C. Letellier, “Global vector-field reconstruction by using a multivariate polynomial l2 approximation on nets,” Phys. Rev. E, vol. 49, no. 6, pp. 4955–4972, Jun. 1994.
[15] E. Bagarinao, Jr., T. Nomura, K. Pakdaman, and S. Sato, “Generalized one-parameter bifurcation diagram reconstruction using time series,” Physica D, vol. 124, nos. 1–3, pp. 258–270, 1998.
[16] G. Gouesbet, “Reconstruction of the vector fields of continuous dynamical systems from numerical scalar time series,” Phys. Rev. A, vol. 43, no. 10, pp. 5321–5331, 1991.
[17] M. V. Correa, L. A. Aguirre, and E. M. A. M. Mendes, “Modeling chaotic dynamics with discrete nonlinear rational models,” Int. J. Bifurcat. Chaos, vol. 10, no. 5, pp. 1019–1032, 2000.
[18] Y. L. Ren, G. Li, and J. Zhang, “Lazy collaborative filtering for data sets with missing values,” IEEE Trans. Cybern., vol. 43, no. 6, pp. 1822–1834, Dec. 2013.
[19] L. Piegl and W. Tiller, The NURBS Book, 2nd ed. Berlin, Germany: Springer-Verlag, 1997.
[20] C. Shao and L. Xiao, “Nurbs model for chaotic time series,” in Proc. 3rd Int. Conf. Comput. Res. Develop., vol. 4. Mar. 2011, pp. 135–138.
[21] C. X. Shao, Q. Q. Liu, T. T. Wang, P. F. Yin, and B. Wang, “Series-nonuniform rational B-spline (S-NURBS) model: A geometrical interpolation framework for chaotic data,” Chaos, vol. 23, no. 3, pp. 033132-1–033132-11, 2013.
[22] L. A. Aguirre and S. A. Billings, “Validating identified nonlinear models with chaotic dynamics,” Int. J. Bifurcat. Chaos, vol. 4, no. 1, pp. 109–125, 1994.
[23] F. Z. Zou and C. X. Li, “A chaotic model for software reliability,” Chin. J. Comput., vol. 4, no. 3, pp. 281–291, 2001.
[24] J. H. Lü, J. A. Lu, and S. H. Chen, Chaotic Time Series Analysis and Application. Wuhan, China: Wuhan Univ. Press, 2002.
[25] E. N. Lorenz, “Deterministic nonperiodic flow,” J. Atmos. Sci., vol. 20, no. 2, pp. 130–141, 1963.
[26] O. E. Rössler, “An equation for continuous chaos,” Phys. Lett. A, vol. 57, no. 5, pp. 397–398, 1976.
[27] L. O. Chua, M. Komuro, and T. Matsumoto, “The double scroll family,” IEEE Trans. Circuits Syst., vol. 33, no. 11, pp. 1072–1118, Nov. 1986.
[28] B. van der Pol, “On relaxation-oscillations,” London, Edinburgh Dublin Philosoph. Mag. J. Sci., vol. 2, no. 7, pp. 978–992, 1927.
[29] B. van der Pol and B. van der Mark, “Frequency demultiplication,” Nature, vol. 120, pp. 363–364, Sep. 1927.
[30] T. Hikihara, P. Holmes, T. Kambe, and G. Rega, “Introduction to the focus issue: Fifty years of chaos: Applied and theoretical,” Chaos, vol. 22, no. 4, p. 047501, 2012.
[31] G. Duffing, Erzwungene Schwingungen bei Veränderlicher Eigenfrequenz. Braunschweig, Germany: F. Vieweg und Sohn, 1918.

Chenxi Shao received the M.S. degree in computer science from the University of Science and Technology of China, Hefei, China, in 1995. He is currently an Associate Professor with the Department of Computer Science and Technology, University of Science and Technology of China. He is also with the Anhui Province Key Laboratory of Software in Computing and Communication, Hefei. His current research interests include computer integrated manufacturing, embedded operating system, network and security. Mr. Shao has received several awards, including the 2nd Prize State Science and Technology Progress Award, and HP Information Science Award.

Fang Fang received the B.S. degree in computer science and technology from the Anhui Agricultural University, Hefei, China, in 2011. She is currently pursuing the M.S. degree with the Department of Computer Science and Technology, University of Science and Technology of China, Hefei. Her current research interests include nonlinear modeling and its applications, chaos control and missing data recovery.


Qingqing Liu received the B.S. degree in information management and information systems from Anhui University, Hefei, China, in 2010, and the M.S. degree from the Department of Computer Science and Technology, University of Science and Technology of China, Hefei, in 2013. He is currently a Software Developer at IFLYTEK, Hefei. His current research interests include dynamics modeling, information security, and chaos analysis.

Tingting Wang received the B.S. degree in information and computation science from the Anhui University of Science and Technology, Huainan, China, in 2010, and the M.S. degree from the Department of Computer Science and Technology, University of Science and Technology of China, Hefei, China, in 2013. She is currently a Data Analysis Engineer at ZTE Corporation, Nanjing, China. Her current research interests include modeling, software methodologies, and time series prediction.

Binghong Wang received the B.S. degree in theoretical physics from the University of Science and Technology of China, Hefei, China, in 1967. From 1985 to 1989, he held a post-doctoral position in the Department of Physics, Stevens Institute of Technology. He is currently a Professor in the Department of Theoretical Physics, University of Science and Technology of China, where he is also the Head of the National Key Disciplines of Theoretical Physics, and the Director of the Institute for Theoretical Physics, Nonlinear Science Society of Anhui Province. He has published over 100 relevant papers, which are exerting a significant influence in the international academic circles. Prof. Wang served as the Chairman of meetings or common domestic and international academic conferences over 30 times. He is also a Commentator of the American Mathematical Reviews, and a Reviewer of the Physical Review Letters, Physical Review E, Physica A and Science in China, Chinese Science Bulletin, Chinese Journal of Physics Letters, Journal of Physics, and Nonlinear Dynamics.

Peifeng Yin received the B.Sc. degree in computer science from the University of Science and Technology of China, Hefei, China. He is currently pursuing the Ph.D. degree in computer science and engineering from Pennsylvania State University, State College, PA, USA. His current research interests include application of machine learning and data mining techniques in (location-based) social networks, such as recommendation, user behavior modeling, etc.