Unraveling chaotic attractors by complex networks and measurements of stock market complexity.

Hongduo Cao and Ying Li. Citation: Chaos: An Interdisciplinary Journal of Nonlinear Science 24, 013134 (2014); doi: 10.1063/1.4868258. View online: http://dx.doi.org/10.1063/1.4868258. Published by AIP Publishing.


CHAOS 24, 013134 (2014)

Hongduo Cao and Ying Li^a)
Business School, Sun Yat-Sen University, Guangzhou 510275, China

(Received 17 June 2013; accepted 28 February 2014; published online 17 March 2014)

We present a novel method for measuring the complexity of a time series by unraveling a chaotic attractor modeled on complex networks. The complexity index R, which can potentially be exploited for prediction, has a similar meaning to the Kolmogorov complexity (calculated from the Lempel–Ziv complexity), and is an appropriate measure of a series’ complexity. The proposed method is used to research the complexity of the world’s major capital markets. None of these markets are completely random, and they have different degrees of complexity, both over the entire length of their time series and at a level of detail. However, developing markets differ significantly from mature markets. Specifically, the complexity of mature stock markets is stronger and more stable over time, whereas developing markets exhibit relatively low and unstable complexity over certain time periods, implying a stronger long-term price memory process. © 2014 AIP Publishing LLC. [http://dx.doi.org/10.1063/1.4868258]

The complexity level of a series reflects the amount of information it contains, and its dynamic properties. The complexity level of financial data reflects the property of the market and affects market aspects, such as forecasting, pricing, investment decision-making, and the applicability of the classic Black–Scholes Option Pricing Model. This paper constructs a new complexity index to research the complexity level of time series. The complexity index is used to analyze the complexity level of random and chaotic data, which exist in many fields, such as stock markets. A theoretical explanation of the method, based on the definition of chaos and the connotation of complexity, is given to clarify the meaning of the complexity index. Based on the results from stock data, some useful conclusions are drawn that may be applicable in the analysis of stock markets.

I. INTRODUCTION

Chaotic time series exist in finance, biology, meteorology, and other complex systems. Unlike completely random series, chaotic series display topological transitivity, a dense set of periodic points, and sensitive dependence on initial conditions, according to the classic Devaney definition (Devaney, 1989). The degree of complexity of a chaotic series is related to how close it is to randomness, with a stronger random property implying greater complexity. Many methods have been developed to classify the complexity of chaos, including the Kolmogorov complexity (Kolmogorov, 1965), approximate entropy (ApEn) (Pincus, 1991; 1995), and permutation entropy (Bandt and Pompe, 2002).

The chaotic properties of time series related to stock markets have been the subject of much research. The different complexity of financial series indicates that the internal operating mechanisms of the markets differ. The traditional Efficient Market Hypothesis (EMH) (Fama, 1969, 1970) states that the movement of asset prices follows a geometric Brownian motion, and that stock price volatility can be expressed as a Wiener process, which implies that market prices are completely random. However, many researchers have confirmed that there are fractal and chaotic characteristics in stock markets. For example, Mandelbrot (1960) suggested that capital market returns follow a Pareto distribution (fractal distribution), and Mantegna and Stanley (1995) provided evidence that the stock index exhibits a power-law distribution. It is thought that securities prices, which are driven by somewhat deterministic trends, are not random walks, but can be described by fractional Brownian motion (FBM) with a Hurst exponent H > 1/2 (Hurst, 1951). Small and Tse (2003) also provided evidence that financial data exhibit deterministic nonlinearity. Therefore, the real price movement is not completely random, but contains certain chaotic characteristics. Moreover, FBM indicates a long-range correlation. That is, information about the market not only affects the instantaneous price, as per the EMH, but also contributes to price formation for a considerably longer time.

The complexity of stock price time series affects aspects of markets such as forecasting, pricing, and investment decision-making. Some scholars have updated the classical Black–Scholes Option Pricing Model by introducing the Hurst exponent H. Decreusefond et al. (1999) developed an option pricing formula based on FBM using the Stratonovich integral. Shortly after, Duncan et al. (2000) derived the FBM integral on the basis of the Wick operator when H ∈ (0.5, 1), proving that the market has no arbitrage under the FBM assumption. This led to a new European call option pricing model. Rogers (1997) also studied arbitrage under FBM.

Besides an analysis of H, more efficient ways to depict the complex characteristics of financial time series can elicit better financial models. This paper focuses on a measurement method for the degree of complexity of chaotic time series, particularly those of stock prices. In this study, the unraveling process of a chaotic attractor is described through the application of complex networks. Such networks are used to analyze the complex characteristics of the world’s major capital markets. The complexity of mature markets appears to be similar across markets: it is strong and relatively stable. In comparison, for some time periods, developing markets exhibit lower complexity, implying a more obvious deterministic property.

a) Author to whom correspondence should be addressed. Electronic mail: [email protected].

II. RELATED RESEARCH

It is important to be able to measure the complexity level of chaotic series. However, complexity does not yet have an explicit mathematical definition. The concept of complexity implies lack of predictability or determinacy. (Here, determinacy refers to the predictability of the order and implies that the timing of information is determinate.) The Kolmogorov complexity (Kolmogorov, 1965) defines the intrinsic descriptive complexity of a series. The Kolmogorov complexity of a string is the length of the shortest possible description of the string. It emphasizes “the amount of efficient information,” which in this case refers to the shortest possible description of the string, but this is not generally restricted to a series’ dynamical features. However, the dynamical features of series can partly reflect the amount of efficient information. For example, the amount of efficient information of a regular series (e.g., fixed points, cycle series, and linear series) is smaller than that of random series or chaotic series of the same length. As a result, the complexity of regular series is then lower than that of random series or chaotic series. When discussing complexity, the concept of randomness always occurs. The randomness of a sequence implies the degree to which it does not follow a deterministic pattern, but instead evolves according to some probability distribution. The randomness, different from the predictability and determinism, is a dynamical feature of a series, as is the regularity. (As opposed to randomness, regularity means orderliness, periodicity, sometimes even not varying, and predictable.) We can observe the dynamical features, and then decide or conjecture the predictability and determinism. The index of the complexity, a comprehensive characteristic quantity, should broadly reflect both the dynamical features and attributes, such as predictability and determinism. 
When regular series, chaotic series, and random series are compared, we would expect their complexity to increase in that order. A completely random series has the largest complexity value of 1. The lower the predictability and determinism of a series, the larger its complexity. A complex series displays a mixture of effects, which are probably a result of trend components, regular components, chaotic components, and random components. Some of these are deterministic, but others are not. Decomposing a real signal can be difficult, because it has many components. Therefore, an index that can collectively reflect the complex features of system behavior, including regularity, randomness, predictability, and determinism, will be useful.

However, the Kolmogorov complexity, which is the basic measurement of the complexity of a sequence, is difficult to compute. Nevertheless, Lempel and Ziv (1976) proposed a simple, computable method to calculate the Kolmogorov complexity. The Lempel–Ziv complexity, represented by K in this paper, tends to 1 for random series. A larger value of K implies a greater complexity. Generally, the Lempel–Ziv complexity of a chaotic series is greater than 0 and less than 1. In practical applications, however, the Lempel–Ziv algorithm is computationally expensive for long time series, because different series must be continuously compared during the calculation process. In this paper, we present a fundamentally different method for calculating the degree of complexity. Our method introduces complex networks to express the unraveling process of attractors, enabling us to quantify the degree of complexity in a chaotic series. However, similar to the Kolmogorov complexity, our new complexity index still reflects the amount of efficient information in a series.

Complex networks are a tool for describing complex systems. In a complex network, the basic element is regarded as a node, with the corresponding relationships represented as edges. The structural features of the system and the relationships among elements are studied through network characteristics. The basic topological characteristics of complex networks include the number of nodes, degree distribution, average shortest path length, clustering coefficient, and betweenness. An important parameter for describing network connectivity features is the distribution P(k), which is equal to the density of nodes with degree k in the network. Many real networks have the characteristic of being scale-free (Albert and Barabasi, 2002), which refers to the power-law shape of P(k).

The application of networks to study financial time series has caught the attention of many scholars. Mantegna (1999) transformed financial time series analysis into network analysis, and Bonanno et al. (2001; 2004) analyzed networks of stock returns and global stock markets. In addition, Kim et al. (2001), Caldarelli (2007), and Garlaschelli et al. (2005) studied different networks based on securities markets, and Tse et al. (2010) formulated a full network of US stock prices.

Another field attracting considerable interest is the application of networks to the study of time series. Nicolis et al. (2005) developed a connection between dynamical systems and network theory by mapping the system dynamics into a discrete probabilistic process. Zhang and Small (2006, 2009) mapped time series into complex networks. They found that the networks produced by chaotic time series had small-world and scale-free characteristics. Networks produced by noisy time series, however, have the characteristics of a random graph. Other related research has been conducted by Donner et al. (2010), Xu et al. (2008), and Lacasa et al. (2008), amongst others. In the paper by Donner et al., these networks are called recurrence networks. We will discuss the relation between our method and recurrence networks later in this paper.

Li et al. (2011a) presented another method for transforming time series into networks. Their method is based on the distribution features of time series in m-dimensional reconstructed phase space. The topological characteristics of the network can display the dynamic characteristics of the system. This method transforms fixed points, cycle time series, linear divergence time series, and chaotic time series into a fully connected graph, regular graph, tree, and network with approximate power-law degree distribution, respectively. When the reconstructed phase space dimension m increases, random time series are quickly converted into completely unconnected graphs, but chaotic time series continue to maintain a certain connection. This result allows random and chaotic time series to be distinguished. The following sections of this paper will analyze, explain, and develop this method in depth.

III. DESCRIPTION OF THE PROPOSED METHOD

We measure the complexity and dynamic evolution of the securities market using stock index time series. Our algorithm includes three steps: first, the time series is mapped into a network (including the unraveling chaotic attractor); second, the complexity indicators are calculated according to network parameters; and third, a sliding time window is used to observe the dynamic changes in complexity indicators. The algorithm for mapping a time series to a network is as follows.

A. Algorithm for mapping a time series to a network

The mapping algorithm (Li et al., 2011a; 2011b) includes a definition of nodes, a definition of the distance, and a connection rule.

(1) Definition of a node: A node is defined as a point in an m-dimensional reconstructed phase space. For a time series x_1, x_2, …, x_t, …, x_n (t = 1, 2, …, n), the points in the reconstructed phase space are

$$X_i = [x_i, x_{i+1}, x_{i+2}, \ldots, x_{i+(m-1)}], \quad i = 1, 2, \ldots, k,$$

where m denotes the dimension of the embedding space. The total number of nodes is k = n − m + 1.

(2) Definition of distance: The Euclidean distance between nodes i and j is defined as

$$d_{ij} = \sqrt{|x_{i1} - x_{j1}|^2 + |x_{i2} - x_{j2}|^2 + \cdots + |x_{im} - x_{jm}|^2}. \tag{1}$$

(3) Rule of connection: We define the connection rule as follows. Let d_max denote the maximum distance in phase space, namely d_max = max(d_ij). Then, D = d_max/(k − 1) is called the judgment distance (or the equipartition of the maximum distance in the phase space). Nodes i and j are only connected if d_ij ≤ D.

B. Definition of the complexity index R

As revealed by the results of Li et al. (2011a; 2011b), the critical embedding dimension mc (the value of m at which all edges completely disappear) can reflect the randomness and regularity of a series. For regular series, mc does not exist. Hence, we set mc = ∞ to represent a regular series. Then, mc is a series of integers tending to infinity. A larger value of mc implies that the time series is less random. Based on mc, we introduce a new index to depict the complexity. The complexity index R, which compresses the critical embedding dimension mc into the interval [0, 1], is defined as follows:

$$R = \begin{cases} 1, & m_c < 3, \\ e^{(3 - m_c)\mu}, & m_c \ge 3. \end{cases} \tag{2}$$

Here, μ is the compression parameter that determines the decay rate of R. Figure 1 shows the value of R for different values of μ. As shown in Fig. 1, if μ is too large, then R will decay too rapidly. For example, when μ = 1, R quickly decays to below 0.5, but when μ is too small, R decays too slowly, such as when μ = 0.01. Hence, μ should be assigned an appropriate value. In this article, we set μ = 0.1 to ensure that the value of R in sequences with mc > 30, such as the Lorenz equations, will not rapidly decay to 0. Considering R ∈ [0, 1] makes it convenient to observe the relative complexity level for the same value of μ.

C. Dynamic nature of stock index time series

The characteristics of stock price time series vary over different periods of time. Therefore, we provide an additional method to observe the evolutionary dynamics of time series. First, we construct a network from the entire series based on the above method, which shows the complexity features of the global data. If the results reveal the market to be irregular (R ≠ 0 or mc ≠ ∞), then the following operations are applied. The above conditions ensure the system has complexity at a certain level. A sliding window, whose length is less than that of the time series, is then moved across the series. Within each window, R is calculated and plotted to illustrate the dynamic complexity of different periods of the time series. The sliding window method has four steps:

(1) Set the window length n.
(2) Calculate mc within the window.
(3) Calculate R within the window.
(4) Plot the evolution of R to observe the degree of complexity.

FIG. 1. Relation between R and mc for different μ.
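As a rough numerical illustration of the complexity index of Eq. (2), the following Python sketch evaluates R for a few values of mc and of the compression parameter. The function name `complexity_index` and the argument name `mu` are choices made here for illustration, and `math.inf` is used as a stand-in for a regular series whose edges never vanish.

```python
import math


def complexity_index(mc: float, mu: float = 0.1) -> float:
    """Complexity index R of Eq. (2) for a critical embedding dimension mc."""
    if mc < 3:
        return 1.0
    # exp((3 - mc) * mu) equals 1 at mc = 3 and decays toward 0 as mc grows.
    # math.exp(-inf) is 0.0, so a regular series (mc = inf) gives R = 0.
    return math.exp((3 - mc) * mu)


if __name__ == "__main__":
    # Compare decay rates for a small, a moderate, and a large parameter.
    for mu in (0.01, 0.1, 1.0):
        row = [round(complexity_index(mc, mu), 3) for mc in (3, 4, 10, 30, math.inf)]
        print(f"mu={mu}: {row}")
```

Printing the rows side by side shows how quickly each choice of the parameter compresses large mc values toward 0.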


IV. DISCUSSION OF THE PROPOSED METHOD

A. Analysis of the proposed algorithm: Chaotic attractor unraveling

We first analyze the principle of the algorithm in mapping a time series to a network. Banks (1992) proved that the basic conditions of chaos (Li and Yorke, 1975) are topological transitivity and a dense set of periodic points. The method in this paper mainly uses the characteristic of the denseness of periodic points. However, a precise mathematical definition of “dense” is difficult for real data. Hence, we use the term “close enough” to describe the relation between paired points in high-dimensional space. The orbits of a chaotic system display a certain continuity in high-dimensional space (m ≥ 2). This implies that some adjacent points are “close enough” at this dimension. We explain the term “close enough” as follows. The maximum distance between paired points in phase space, representing the diameter of the system in m-dimensional space, is divided into equal intervals of size D. We say that D is the equipartition of the overall space. We then determine whether points are “close enough” by comparing the distance between them to D. That is, if the distance between a pair of points is less than D, namely d_ij < D, these points are considered to be “close enough.” According to this algorithm, the points located within an attractor will have more connection opportunities. As m increases, however, the difference (distance) between each pair of points of the attractor will appear more significant. When the difference between a pair of points becomes greater than D, namely once d_ij > D, the two corresponding points in the phase space of the attractor separate. When all paired points in phase space are no longer close enough, the attractor is regarded to have unraveled. The unraveling process can be thought of as the process of a network disconnecting, i.e., the disappearance of all edges in a network is equivalent to the attractor unraveling.
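The mapping and the unraveling criterion can be sketched in a few lines of Python. This is only an illustrative reading of the text, not the authors' code: the function names, the use of NumPy, the cap `m_max`, the non-strict comparison d_ij ≤ D, and the choice to report the first m with zero edges (rather than checking that the count stays at zero) are all assumptions made here.

```python
import math

import numpy as np


def embed(x, m):
    """Rows are the phase-space points X_i = [x_i, ..., x_{i+m-1}], i = 1..k."""
    x = np.asarray(x, dtype=float)
    k = len(x) - m + 1  # number of nodes
    return np.stack([x[j:j + k] for j in range(m)], axis=1)


def edge_count(x, m):
    """Number of node pairs with d_ij <= D, where D = d_max / (k - 1)."""
    pts = embed(x, m)
    k = len(pts)
    # All pairwise Euclidean distances; memory grows as k * k * m,
    # so this direct form is meant for short series only.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))[np.triu_indices(k, 1)]
    judgment_distance = dist.max() / (k - 1)
    return int((dist <= judgment_distance).sum())


def critical_dimension(x, m_max=50):
    """Smallest m <= m_max with no edges left; math.inf if none is found."""
    for m in range(1, m_max + 1):
        if len(x) - m + 1 < 3:  # too few nodes left to be meaningful
            break
        if edge_count(x, m) == 0:
            return m
    return math.inf


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print(critical_dimension(rng.random(500)))
```

A constant series keeps every pair connected at any m, and a linear ramp at m = 1 yields a simple chain, which matches the fully connected graph and tree cases described in the text.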
Let mc denote the critical embedding dimension at which all edges completely disappear (i.e., the number of edges stabilizes at 0 as m increases). Figure 2 illustrates the unraveling process of a chaotic attractor. The denser the chaotic attractor, the greater the number of neighbors that are close enough. As a result, it will take a long time for the attractor to become unraveled. Chaos, as orderly disorder, is locally unpredictable and globally predictable. (“Locally unpredictable and globally predictable” means that, over short time periods, nearby states move away from each other. However, for the system to consistently produce stable behavior, over long time periods, the set of behaviors must fall back into itself. The tension of these two properties leads to very elegantly structured chaotic attractors.) A long unraveling time implies that the deterministic behavior of chaos is stronger. Conversely, if the unraveling process is very quick, the chaotic attractor exhibits a lower density, meaning that the degree of randomness is stronger.

FIG. 2. Schematic diagram for the chaotic attractor unraveling as the network becomes disconnected.

For random series, some neighbors, which are close at dimension m = 1, are not close in higher dimensions. Then, at m = 3, all neighbors become separated. This means that no pairwise neighbors are close enough in three (or more)-dimensional space. However, it is difficult to unravel the chaotic series, because points in phase space still have neighbors in a reconstructed space with dimension larger than three.

It should be noted that, at a value of m = mc, the method in this paper separates all pairwise points that were once regarded as nearest neighbors in low-dimensional space. Hence, the method “unravels” rather than “unfolds” a chaotic attractor. Note also that mc does not refer to the appropriate dimension for unfolding an attractor, though it is probably related to the appropriate unfolding dimension. Moreover, R ≠ 0 (mc ≠ ∞) is an indicator of irregularity, disorder, and then probably a certain degree of topological transitivity (i.e., cannot be precisely predictable), which is supported by numerical experiments (Li et al., 2011a; 2011b). This again reflects some property of chaos. For the chaotic behavior of real time series, which are often driven by many forces, it is difficult to define the strict form of the chaos. A method that can reflect the essential property of chaos would be useful when testing for chaos.

There is a similarity between recurrence networks (Donner et al., 2011) and the method in this paper. The unique feature of the proposed method is that the parameter D = d_max/(k − 1) varies as m increases, which ensures that linearly divergent time series are mapped into a tree. Moreover, the proposed method aims to show the complexity of the time series.

B. Relation between R and the complexity level of time series

The distribution of points in the reconstructed phase space can reflect the dynamical features of a series, and then the amount of efficient information, and finally the complexity. The distribution of regular series generally takes on certain patterns in m-dimensional space, which implies relatively little efficient information, and thus a low complexity. The distribution of chaotic series is dense (some pairwise points are close enough) in m-dimensional space, which implies more efficient information and higher complexity than in a regular series. For random series, there is no dense area (no pairwise points are close enough) in three (or higher)-dimensional space, which implies more efficient information than regular series and chaotic series, and higher complexity. The stronger the randomness of a chaotic series, the faster its attractor unravels. In contrast, a longer process of attractor unraveling means more regularity. Therefore, R, as derived from mc, can reflect the complexity of the series and display its dynamical features, including regularity and randomness.

According to the results presented by Li et al. (2011a), mc = 3 for completely random series, and mc = ∞ for regular series, including fixed points, cyclic time series, and linear divergence. This leaves 3 < mc < ∞ for chaotic series. Hence, R = 1 (mc ≤ 3) represents complete randomness, R = 0 (mc = ∞) implies regularity, and 0 < R < 1 (3 < mc < ∞) corresponds to chaos. Therefore, the value of R can reflect the degree of complexity of a series. Furthermore, R ≠ 0 represents an irregular shape, R = 1 corresponds to a completely indeterminate behavior mechanism, and R ≠ 1 represents some kind of determinacy. This determinism implies a prediction probability: R close to 0 or 1 would indicate a different level of prediction precision. For chaotic series, the characteristic of a dense set of periodic points (related to order) decreases the complexity, but the characteristic of topological transitivity (related to disorder) will increase the complexity. R increases with the strength of the randomness of a chaotic series. (Topological transitivity and a dense set of periodic points are the two essential conditions for chaos. Topological transitivity causes unpredictability, whereas the denseness of periodic points is relevant to predictability. Because the “string” of this paper is a time series, the complexity of a time series is related to its dynamical features. The accurate relation between the value of R and the chaotic characteristics requires a theoretical proof. However, R > 0 probably implies the existence of topological transitivity, and R < 1 probably implies that there exist dense periodic points (an attractor). This is an open area of research for the future.) Compared with the Lempel–Ziv complexity K, the advantage of R is that it can reflect both determinism and irregularity. In practice, R is suitable for various types of series, especially irregular series.
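For readers who want a baseline comparable to K, the sketch below counts Lempel–Ziv (1976) phrases in a binarized series and normalizes the count by n/log2(n), a common normalization under which random binary sequences give values near 1. The paper does not state how K was computed, so the median binarization, the normalization, and the function names are assumptions made here.

```python
import math
from statistics import median


def lz76_phrase_count(s: str) -> int:
    """Number of phrases in the Lempel-Ziv (1976) exhaustive parsing of s."""
    n, i, count = len(s), 0, 0
    while i < n:
        length = 1
        # Grow the phrase while it can still be copied from earlier text
        # (the copy may overlap the phrase itself, hence the -1).
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count


def lz_complexity(x) -> float:
    """Normalized complexity c * log2(n) / n of a median-binarized series."""
    med = median(x)
    s = "".join("1" if v > med else "0" for v in x)
    n = len(s)
    return lz76_phrase_count(s) * math.log2(n) / n


if __name__ == "__main__":
    print(lz76_phrase_count("0001101001000101"))
    print(lz_complexity([1, 2] * 50))
```

The repeated substring searches make this direct form slow for long series, which is in line with the time cost of K discussed in the text.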
If R ≠ 0, the series is irregular and exhibits a degree of complexity. In the following, we evaluate K in comparison with R. In principle, K displays the complexity of a one-dimensional series, whereas R reflects higher-dimensional complexity, because the process of calculating R involves m-dimensional phase space reconstruction. Figure 3 shows the comparison between K and R. From this figure, we can deduce the following useful results.

FIG. 3. Comparison of K and R. The label henon_x denotes the x-component of the Henon map, and so on. For R, μ = 0.1.

First, K is almost the same as R for most series, including fixed points (constant), linear data, Henon maps, the Lorenz system, and random series. In chaotic series, especially the Henon series, components of different directions (x and y) have almost the same degree of complexity. This indicates that R performs a similar function to K. In addition, the reconstructions based on the x, y, and z components of the Lorenz system give essentially the same complexity value for x and y, but a different one for z. This is consistent with the fact that the Lorenz attractor cannot be topologically equivalently reconstructed from its z component because of its intrinsic symmetries (Letellier et al., 2002).

Second, the time cost of K is very large, because the Lempel–Ziv algorithm must constantly compare different series. For a random time series with 1000 data points, it takes 15.28 s to calculate K, and only 1.24 s to calculate R. (Computed using an Intel(R) Core(TM) i7 CPU [email protected] GHz, 4 GB RAM, and Matlab software.)

Third, the values of R are more reasonable than those of K. For example, the K value of the periodic sine function is 0.2358, which is greater than that of the Lorenz series, a classical chaotic series. This is an unreasonable result. In contrast, R correctly reflects the complexity of the Lorenz series as greater than that of the periodic sine function.

In summary, R efficiently depicts the degree of complexity of each of the tested series, exhibits the complexity of high-dimensional space, and has a lower time cost than K. The proposed complexity measure R is also a more appropriate index than K for some time series, such as periodic series.

C. Effect of the length of the time series

Longer time series contain more data, and should thus present more plentiful dynamical behavior. We calculate mc for different lengths of time series. For random series, mc is always equal to 3. However, for chaotic series, mc initially decreases before becoming more stable, as shown in Fig. 4. This is because more data are likely to bring about more dynamical patterns, which will make the series appear more disordered. In addition, a stable mc implies that the behavior of the system is steady. To reflect all of the dynamical aspects of a series, the proposed method first calculates the value of mc for the entire set of data, and then the value of mc for sliding windows of the same length for different series. Because R is an exponential function of mc, a small gap between different mc values would not result in a huge difference in R.

FIG. 4. Effect of different lengths n of time series on mc. The mc value of random series remains unchanged for all values of n. The mc values of chaotic series exhibit an initial change, before becoming steady.

D. Effect of noise and non-stationarity

The effect of noise has been discussed by Li et al. (2011b). The structure of a time series will probably be corrupted by the addition of noise. The proposed method is generally effective if the signal-to-noise ratio (SNR) is less than 50. However, once the SNR drops below 10, the structure of a chaotic series becomes corrupted, and it cannot be separated from a random series.

We now discuss the effect of non-stationarity. There are many types of non-stationarity. We give an example of one type to test the effect. Here, a time series is said to be nonstationary if it includes a (linear or nonlinear) trend. For some specific forms, e.g., piecewise linear trends, the nonlinear trend can simply be expressed as a sum of linear trends. Therefore, we add a linear trend to a time series to test the proposed method. The original time series is modified as follows:

$$Y_t = x_t + k \cdot z_t, \tag{3}$$

where z_t denotes the linear trend at time t, x_t is the original time series at time t, k is a ratio parameter representing the strength of the trend, and z_t = at + b for some constants a, b. Figure 5 shows the effect of this trend. For trends of different strength, the reactions of random and chaotic series differ. The trend does not affect the value of R in random series for a certain range of k (k ≤ 2 in this numerical experiment), but affects chaotic series for all values of k. However, the influence is uncertain for chaotic series. For example, when k increases, the value of R in the logistic series increases, attains a maximum and remains steady, and then decreases. When the trend is very strong, namely k > 20, R begins to decay, because the linear trend gradually becomes the main dynamic driving force. We can explain this as follows. In some intervals (k ∈ [0, 0.1]), the orbits of the attractor become less dense. In contrast, the attractors in other intervals (k ∈ [20, 100)) become denser. This trend will disturb the chaotic movement and random motion to a variable degree, depending on its strength.

FIG. 5. Effect of trends of various strengths. k denotes the strength of the trend. When k ≤ 2, the random series has a constant value of R = 1 (mc = 3). For the logistic series, R first increases, remains steady, and then decreases. The gradually increasing linear driving force of the trend will cause the determinism of the system to become progressively more obvious.

V. DATA

The proposed method is tested on seven world stock indexes. These are Standard & Poor’s 500 Index (S&P 500), London’s FTSE 100, the NASDAQ Composite Index, the Dow Jones Industrial Average (DJIA), the Shanghai Composite Index (SHCI), the Shenzhen Component Index (SZCI), and the Hang Seng Index (HSI). Table I shows details of each market, and a standardized time series diagram is presented in Fig. 6.

TABLE I. Name, time period, and number of data of the seven stock markets.

Index       Time period                Number of points
S&P 500     1976.07.01–2010.09.17      8639
FTSE 100    1984.04.02–2010.09.17      6693
NASDAQ      1986.11.03–2010.09.17      6020
DJIA        1985.05.24–2010.09.17      6392
SHCI        1990.12.19–2010.09.17      4844
SZCI        1991.04.03–2010.09.17      3924
HSI         1975.01.06–2010.09.17      8815

FIG. 6. Standardized time series diagram of the seven stock indexes.

VI. RESULTS

A. Networks based on the overall data of the series

Based on the whole time series, a network was built for each market to judge its integral complexity. Figure 7 shows the networks formed from the time series of each market with m = 3. As an example, Fig. 8 shows the evolution of the FTSE 100 network as m increases. As shown in Table II, 3 < mc < ∞ for every market. This implies the price movements are not completely random and have a certain complexity. The complexity characteristic of each market differs significantly, and the data length does not affect the apparent complexity level. The S&P 500 has 8639 data points, and mc = 9. However, the HSI has almost the same number of data points as the S&P 500 (8815 data points), but its mc = 200, which indicates a low complexity. The FTSE 100, DJIA, and NASDAQ have fewer data and smaller mc values than HSI (and also smaller mc values than SHCI and SZCI). The mc values of the SHCI and HSI networks are obviously greater than those of other markets, implying relatively weak randomness, and strong determinism. However, the mc value of the entire dataset cannot reflect the dynamic behavior in different periods. Therefore, a series of sliding windows were applied.

FIG. 7. Two-dimensional networks of all markets with m = 3.

FIG. 8. Evolution of the FTSE 100 network as m increases.

B. Sliding window analysis to show the dynamics of the series

From the above results, we can conclude that the markets are not completely random, but generally chaotic. Yet, in different time periods, the time series exhibit different complexity characteristics. Therefore, to observe the characteristics of different periods for each series, sliding windows were applied to the data. We used window lengths of 22 days (1 month), 66 days (3 months), 264 days (12 months), 528 days (24 months), 792 days (36 months), and 1056 days (48 months). Figure 9 shows the average, maximum, and minimum values of $m_c$ for each market over different window lengths. For each window length, all markets have minimum $m_c$ values of 3, 4, or 5, for which the difference is insignificant. This result suggests that each of the markets has periods of relatively high complexity in which the market is completely random, especially when $m_c = 3$. However, there are significant differences between the markets in terms of the average and maximum $m_c$ values. In particular, the SHCI and SZCI attain significantly larger values than the other markets, implying that the complexity of these two markets is relatively lower. The average $m_c$ value of the other five markets (S&P 500, FTSE 100, NASDAQ, DJIA, and HSI) fluctuates between 5 and 6.6. This indicates that the complexity of these five markets is obvious and stable. With an average $m_c$ of up to 19.3, the complexity of the SHCI and the SZCI is lower than that of the other five markets. The complexity of the SHCI, in particular, is especially low. After counting the $m_c$ values, the mode (the most frequent value) for each window length was found to be in the set {4, 5, 6}. The proportion of times for which $m_c = 5$ is very high (up to 84%). This indicates that the complexity level of each market is high.

TABLE II. Value of $m_c$ for the whole time series of each market.

Market      $m_c$
S&P 500     9
FTSE 100    5
DJIA        6
NASDAQ      12
SHCI        200
SZCI        16
HSI         202

FIG. 9. Average, maximum, and minimum values of $m_c$ for all markets over different sliding window lengths. The x axis denotes the length of the windows. For example, the maximum of $m_c$ refers to the maximum $m_c$ among all $m_c$ calculated by the method defined in Sec. III for all sliding window lengths. (a) Average value of $m_c$, (b) maximum value of $m_c$, and (c) minimum value of $m_c$.

The complexity index R is shown in Fig. 10. As the length of the window increases, the interval over which R varies gradually converges to the R value of the entire series. Because the dynamic behavior of the same market changes consistently, as the window is enlarged (similar to an increasing mesh), the shape of R for the same market becomes similar. As shown in Fig. 10, for the S&P 500, FTSE 100, DJIA, and NASDAQ, R is generally greater than 0.5 once the window length is greater than or equal to three months. In contrast, although the R value of the SHCI, SZCI, and HSI is above 0.5 for most time periods, it is below 0.5 for some periods (highlighted by the red boxes). The range of R for the S&P 500, FTSE 100, DJIA, and NASDAQ is smaller than that for the other three markets. This indicates that the complexity level of mature stock markets, i.e., the S&P 500, FTSE 100, DJIA, and NASDAQ, is relatively stable. From these observations, we conclude that, in most periods, the seven markets exhibit high complexity. The most stable markets are the S&P 500, FTSE 100, DJIA, and NASDAQ. For some periods, the complexity of the SHCI, SZCI, and HSI is relatively lower.

The frequency distributions of R for each market and all window lengths are shown in Fig. 11. For the various markets, the frequency distribution patterns are very similar. The peak frequency mostly occurs at around R = 0.82 (which corresponds to $m_c = 5$, the most frequent value of $m_c$, as mentioned previously), indicating that all markets have consistent characteristics: weak regularity and strong randomness. The similarity between the markets indicates that, though the trading rules or laws of the stock markets may be very different, the complexity level is stable and similar.

VII. SUMMARY

From the above results, we can derive the following conclusions.

As a measure of complexity level, $R \in [0, 1]$ conveniently reflects the degree of complexity in a series. The sliding window approach allows the detailed dynamic behavior to be observed. Increasing the window length is equivalent to gradually enlarging the scale of observation. The complexity index R gradually converges to the value attained when the window length is equal to the length of the entire dataset.

None of the stock markets researched in this paper are completely random, and they have different degrees of complexity, both over the entire length of their time series and at a level of detail. (This is not valid for too small a window (window length = 22 days), where the method is not efficient.) The characteristic of high complexity occurs in many time periods, because the peak frequency occurs at R = 0.82.

From a global perspective, the complexity of mature stock markets, including the S&P 500, FTSE 100, DJIA, and NASDAQ, is relatively high and stable. The SHCI, SZCI, and HSI appear to have significantly reduced complexity over certain time periods. Hence, these three developing markets have relatively low overall complexity. The complexity index R varies greatly in different stages. As newly developing markets, the SHCI and SZCI appear to behave like strong attractors in the early stages of their time series. Afterwards, however, the complexity becomes similar to that of the mature markets. Therefore, we conclude that more developed markets are more complex. This result may be closely related to the symmetry of information, transparency, and the level of rationality of the investors.

In financial markets, EMH implies that information can be immediately reflected in the price. That is, if information is not immediately reflected in the price, long-range correlations will appear. In a completely efficient market, the price follows a random motion. In contrast, the occurrence of long-range price correlations always implies chaotic motion. The results of our research show that the complexity of developed markets is high, which implies that the randomness is high. Therefore, EMH is more suitable for developed markets, and market information affects stock prices over a relatively short term. In the developed markets, the symmetry of information, transparency, and the level of rationality of the investors have become relatively well developed, resulting in an efficient market. For developing markets, however, particularly in their early stages, the relatively weak randomness and strong regularity and determinism resemble a biased stochastic process. The market is weakly efficient because it is immature. The price motion is a biased stochastic process and contains long-range correlations. Therefore, the Black–Scholes Option Pricing Model should be partly revised for developing markets.

FIG. 10. Complexity index R for different sliding window lengths (l = 0.1) for all markets. According to the methods described in Sec. III C, we use sliding windows to observe the value of R over different periods. In each window, one value of R is calculated, and the window slides to the next period, where a new value of R is calculated. Plotting all values of R in one graph, we can observe the dynamic change in R over all periods. Each row shows values of R for a particular market and all sliding window lengths.
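The sliding-window procedure described in the caption of Fig. 10 (compute one value per window, then slide to the next period) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the window lengths are those quoted in the text, but the step size (`step`, defaulting here to non-overlapping windows), the function names, and in particular `placeholder_statistic` are hypothetical. The placeholder returns the standard deviation of first differences and only stands in for the index R or $m_c$ of Sec. III, which is not reproduced here. The final summary mirrors the average, maximum, and minimum reported per window length in Fig. 9.

```python
import random
import statistics

# Window lengths in trading days, as quoted in the text (1 to 48 months).
WINDOW_LENGTHS = [22, 66, 264, 528, 792, 1056]


def sliding_windows(series, length, step=None):
    """Yield windows of `length` points.

    `step` defaults to `length` (non-overlapping windows); this default is an
    assumption, since the step used in the source is not stated.
    """
    step = length if step is None else step
    for start in range(0, len(series) - length + 1, step):
        yield series[start:start + length]


def placeholder_statistic(window):
    """Hypothetical stand-in for the per-window index (NOT the paper's R or m_c)."""
    diffs = [b - a for a, b in zip(window, window[1:])]
    return statistics.pstdev(diffs)


def summarize(series, lengths=WINDOW_LENGTHS, statistic=placeholder_statistic, step=None):
    """Return {length: (mean, max, min)} of the statistic over all windows of that length."""
    summary = {}
    for length in lengths:
        values = [statistic(w) for w in sliding_windows(series, length, step)]
        if values:  # skip window lengths longer than the series
            summary[length] = (statistics.mean(values), max(values), min(values))
    return summary


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic random walk used only for illustration; not market data.
    prices, level = [], 0.0
    for _ in range(3000):
        level += rng.gauss(0.0, 1.0)
        prices.append(level)
    for length, (avg, high, low) in summarize(prices).items():
        print(length, round(avg, 3), round(high, 3), round(low, 3))
```

Passing `step=1` gives maximally overlapping windows instead; the choice changes how many values enter each average, so it should be matched to the analysis being reproduced.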



FIG. 11. Frequency distribution of R for each market and all window lengths. Different colors represent different window lengths.
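A frequency distribution of the kind shown in Fig. 11 can be tabulated with a simple histogram over the range of R, which lies in [0, 1]. The sketch below is illustrative only: the number of bins (`n_bins`), the function names, and the sample values are assumptions chosen for the example and are not taken from the paper's data.

```python
from collections import Counter


def frequency_distribution(r_values, n_bins=20):
    """Relative frequency of R values in `n_bins` equal-width bins on [0, 1]."""
    if not r_values:
        raise ValueError("at least one R value is required")
    counts = Counter()
    for r in r_values:
        if not 0.0 <= r <= 1.0:
            raise ValueError("R is expected to lie in [0, 1]")
        index = min(int(r * n_bins), n_bins - 1)  # R = 1 goes into the last bin
        counts[index] += 1
    total = len(r_values)
    return [counts[i] / total for i in range(n_bins)]


def peak_bin(freqs):
    """Return the (left_edge, right_edge) of the most populated bin."""
    n_bins = len(freqs)
    i = max(range(n_bins), key=freqs.__getitem__)
    return i / n_bins, (i + 1) / n_bins


if __name__ == "__main__":
    # Made-up R values for illustration; not results from the paper.
    sample = [0.82, 0.82, 0.81, 0.47, 0.91, 0.82]
    freqs = frequency_distribution(sample)
    print(peak_bin(freqs))
```

Computing one such distribution per window length and overlaying them gives a comparison across window lengths similar in spirit to the figure.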

VIII. CONCLUSION

Our novel method measures the complexity of time series based on the unraveling of a chaotic attractor. We use the chaotic characteristic of the denseness of periodic points. The concept of “close enough” describes the relation of paired points in high-dimensional space, and thus indirectly reflects the “dense” nature of the attractor. The noise, trend, and length of the time series affect the results.

The value of $R \in [0, 1]$ is a measure of the complexity level. R has a similar meaning to the Kolmogorov complexity (calculated as the Lempel–Ziv complexity). However, R is superior to the Lempel–Ziv complexity, as it can reflect both determinism and irregularity. R can also measure the complexity in high-dimensional space, and has a lower time cost than K.

The degree of complexity of the securities market reflects certain inherent features. Although each stock market has different products, trading rules, scale, traders, and trader psychologies, the complex nature of different markets has both differences and similarities. The complexity reflects the dissemination mechanism of market information, as well as indicating the degree of rational investing, a concept related to herding. Hence, R has great significance for stock markets.

Our empirical results show that the complexity level of mature stock markets is relatively high and stable, but that of developing markets (here, we have used three China-related stock indexes) is relatively low and unstable. To some extent, the different complexity reflects the level of fairness and the transparency of information in the market. The premise of EMH is that, in the stock market, the information available to each investor is equal. Hence, the higher and more stable complexity value of mature markets implies a degree of superiority in information dissemination to every investor. EMH is more applicable to mature markets. In contrast, the complexity level of the developing markets in some time periods is weak and unstable. These markets have shortcomings in information dissemination, market equity, and investor rationality. Therefore, the traditional EMH is relatively unsuitable for developing markets.

ACKNOWLEDGMENTS

This work was supported, in part, by the National Natural Science Foundation of China (Grant Nos. 70801066, 71071167, 71071168, and 71371200), and by a grant from Sun Yat-sen University Basic Research Funding (Grant Nos. 1009028 and 1109115).

Albert, R. and Barabasi, A. L., “Statistical mechanics of complex networks,” Rev. Mod. Phys. 74, 47–97 (2002).
Bandt, C. and Pompe, B., “Permutation entropy: A natural complexity measure for time series,” Phys. Rev. Lett. 88, 174102 (2002).
Banks, J., Brooks, J., Cairns, G., Davis, G., and Stacey, P., “On Devaney’s definition of chaos,” Am. Math. Mon. 99, 332–334 (1992).
Bonanno, G., Caldarelli, G., Lillo, F., Miccieche, S., Vandewalle, N., and Mantegna, R., “Networks of equities in financial markets,” Eur. Phys. J. B 38, 363–371 (2004).
Bonanno, G., Lillo, F., and Mantegna, R., “High-frequency cross-correlation in a set of stocks,” Quantitative Finance 1, 96–104 (2001).
Caldarelli, G., Scale-Free Networks (Oxford University Press, Oxford, 2007).
Decreusefond, L. and Üstünel, A., “Stochastic analysis of the fractional Brownian motion,” Potential Anal. 10, 177–214 (1999).
Devaney, R., An Introduction to Chaotic Dynamical Systems (Addison-Wesley, Redwood City, CA, 1989).
Donner, R. V., Zou, Y., Donges, J. F., Marwan, N., and Kurths, J., “Recurrence networks - A novel paradigm for nonlinear time series analysis,” New J. Phys. 12, 033025 (2010).
Donner, R. V., Small, M., Donges, J. F., Marwan, N., Zou, Y., Xiang, R., and Kurths, J., “Recurrence-based time series analysis by means of complex network methods,” Int. J. Bifurcation Chaos 21, 1019–1046 (2011).
Duncan, T. E., Hu, Y., and Pasik-Duncan, B., “Stochastic calculus for fractional Brownian motion. I. Theory,” SIAM J. Control Opt. 38, 582–612 (2000).
Fama, E. F., “Efficient capital markets: A review of theory and empirical work,” J. Financ. 25, 383–417 (1970).
Fama, E. F., Fisher, L., Jensen, M. C., and Roll, R., “The adjustment of stock prices to new information,” Int. Econ. Rev. 10, 1–21 (1969).
Garlaschelli, D., Battiston, S., Castri, M., Servedio, V., and Caldarelli, G., “The scale-free topology of market investments,” Physica A 350, 491–499 (2005).
Hurst, H. E., “Long-term storage capacity of reservoirs,” Trans. Am. Soc. Civil Eng. 116, 770–799 (1951).
Kim, H.-J., Lee, Y., Kahng, B., and Kim, I., “Scale-free network in financial correlations,” J. Phys. Soc. Jpn. 71(9), 2133–2136 (2002).
Kolmogorov, A. N., “Three approaches to the quantitative definition of information,” Prob. Inform. Trans. 1(1), 1–7 (1965).
Lacasa, L., Luque, B., Ballesteros, F., Luque, J., and Nuno, J. C., “From time series to complex networks: The visibility graph,” Proc. Natl. Acad. Sci. U.S.A. 105, 4972–4975 (2008).
Lempel, A. and Ziv, J., “On the complexity of finite sequences,” IEEE Trans. Inf. Theory 22, 75–81 (1976).
Letellier, C. and Aguirre, L. A., “Investigating nonlinear dynamics from time series: The influence of symmetries and the choice of observables,” Chaos 12, 549–558 (2002).
Li, Y., Cao, H. D., and Tan, Y., “A novel method of identifying time series based on network graph,” Complexity 17, 13–34 (2011a).
Li, Y., Cao, H. D., and Tan, Y., “A comparison of two methods for modeling large-scale data from time series as complex networks,” AIP Adv. 1, 012103 (2011b).
Li, T. Y. and Yorke, J. A., “Period three implies chaos,” Am. Math. Mon. 82, 985–992 (1975).
Mandelbrot, B., “The Pareto-Levy law and the distribution of income,” Int. Econ. Rev. 1, 79–106 (1960).
Mantegna, R. N. and Stanley, H. E., “Scaling behavior in the dynamics of an economic index,” Nature 376, 46–49 (1995).
Mantegna, R. N., “Hierarchical structure in financial markets,” Eur. Phys. J. B 11, 193–197 (1999).
Nicolis, G., García-Cantú, A., and Nicolis, C., “Dynamical aspects of interaction networks,” Int. J. Bifurcat. Chaos 15, 3467–3480 (2005).
Pincus, S. M., “Approximate entropy as a measure of system complexity,” Proc. Natl. Acad. Sci. U.S.A. 88, 2297–2301 (1991).
Pincus, S. M., “Approximate entropy (ApEn) as a complexity measure,” Chaos 5, 110–117 (1995).
Rogers, L. C. G., “Arbitrage with fractional Brownian motion,” Math. Finance 7, 95–105 (1997).
Small, M. and Tse, C. K., “Determinism in financial time series,” Study Nonlinear Dyn. Econ. 7(3), 1–31 (2003).
Small, M., Zhang, J., and Xu, X. K., “Transforming time series into complex networks,” Complex Sciences 5, 2078–2089 (2009).
Tse, C. K., Liu, J., and Lau, F. C. M., “A network perspective of the stock market,” J. Empir. Finance 17, 659–667 (2010).
Xu, X., Zhang, J., and Small, M., “Superfamily phenomena and motifs of networks induced from time series,” Proc. Natl. Acad. Sci. U.S.A. 105, 19601–19605 (2008).
Zhang, J. and Small, M., “Complex network from pseudo periodic time series: Topology vs dynamics,” Phys. Rev. Lett. 96, 238701 (2006).

