
Adaptive Identifier for Uncertain Complex Nonlinear Systems Based on Continuous Neural Networks Mariel Alfaro-Ponce, Amadeo Argüelles Cruz, and Isaac Chairez

Abstract—This paper presents the design of a complex-valued differential neural network identifier for uncertain nonlinear systems defined in the complex domain. This design includes the construction of an adaptive algorithm to adjust the parameters of the identifier. The algorithm is obtained based on a special class of controlled Lyapunov functions. The quality of the identification process is characterized using the practical stability framework; indeed, the region where the identification error converges is derived by the same Lyapunov method. This zone is defined by the power of the uncertainties and perturbations affecting the complex-valued uncertain dynamics. Moreover, the convergence zone is reduced to its lowest possible value using ideas related to the so-called ellipsoid methodology. Two simple but informative numerical examples show how the identifier proposed in this paper can be used to approximate uncertain nonlinear systems valued in the complex domain.

Index Terms—Complex-valued neural networks, continuous neural network, controlled Lyapunov function, nonparametric identifier.

Manuscript received November 14, 2012; revised May 16, 2013; accepted July 20, 2013. This work was supported in part by the National Council of Science and Technology and the Local Federal District of Science and Technology. M. Alfaro-Ponce and A. A. Cruz are with the Centro de Investigación en Computación, CIC-IPN, Mexico D.F. 07738, Mexico (e-mail: [email protected]; [email protected]). I. Chairez is with the Interdisciplinary Unit of Biotechnology, Instituto Politécnico Nacional, Mexico D.F. 07738, Mexico (e-mail: [email protected]). Digital Object Identifier 10.1109/TNNLS.2013.2275959

I. INTRODUCTION

A. Complex-Valued Nonlinear Systems

The concept of complex numbers (CNs) was not accepted as a valid idea within the field of mathematics for a long time [1]. Even when their mathematical characteristics and properties had been successfully described, their applicability in real life was not completely understood before the middle of the 19th century. During the Industrial Revolution, mathematicians finally recognized the importance of CNs not only in purely theoretical problems but also in engineering. Today, CNs are actively used in different areas such as physics, circuit theory, and fluid analysis. To give a couple of examples, CNs appear in Schrödinger's equation of quantum mechanics [2] and arise quite naturally in electrical engineering as part of Laplace transformation theory [3]. CNs are now firmly established in all fields of natural science and engineering.

Once the CN concept was accepted, dynamic systems with states defined in the complex domain were proposed and analyzed. This class of systems has been used to describe a large set of real plants where waves and related phenomena appear. Complex-valued (CV) systems appear in different engineering areas: communications, medical imaging, frequency-based analysis of several types of real plants, etc. In many fields, real-valued methods are not adequate, and CNs offer a more suitable foundation. Among all systems based on CNs, those described in terms of the so-called frequency response are particularly interesting. The analytic signal obtained by the Hilbert transform exists in the complex domain and is considered a universal tool providing the local amplitude and the local phase of a given signal. Besides, it is common to apply the Fourier transformation to periodic signals using the well-known windowing procedure. This process produces a complete magnitude and phase variation within each window applied to the signal under analysis. Collecting all magnitude and phase dynamics through consecutive windows produces a set of nonlinear uncertain dynamics, described in terms of CNs, for each frequency. The uncertainty in such a system arises because a formal description of the aforementioned magnitude/phase variation with respect to time is difficult to obtain. One possible solution to this problem comes from neural network (NN) theory; indeed, the problem can be addressed using complex-valued neurons, that is, neurons whose input and output signals are CNs. This class of neurons was readily adopted, and several studies have focused on the usefulness of complex-valued neural networks (CVNNs) in engineering [4]–[7]. Since the concept of the complex-valued neuron was introduced, several structured organizations of such neurons have been proposed. These structures were named CVNNs.

B. Complex-Valued Neural Networks

The success of real-valued NNs in different engineering areas motivated the use of CVNNs in several studies [8]. Indeed, the growing number of NN applications demonstrates the potential of such methods in different theoretical and practical fields. NNs in the complex domain have become an active research field. In the early 1990s, preliminary important results


regarding CVNNs were obtained. Since then, interesting counterparts of real-valued neural networks have appeared within the CVNN framework. Among others, the complex-valued backpropagation method [9] and the multilayer perceptron architecture [10] were actively developed. Indeed, one can find successful CVNN applications in image analysis, signal processing, pattern recognition, network communications, and other scientific areas [11]. Several models of complex-valued neurons have been proposed, for example, phasor neurons [12], discrete-state phasor neurons [13], amplitude–phase complex-valued neurons [10], real part–imaginary part complex-valued neurons [11], and many others [14]. Several types of neural networks, such as feedforward neural networks, Hopfield networks, and Boltzmann machines, were developed within the CVNN framework [15]. Recently, recurrent CVNNs have been introduced in different frameworks [16], [17]. The complex-valued Hopfield networks proposed in [13], called continuous phasor or discrete phasor neural networks, are particularly interesting. Hirose [18] proposed backpropagation learning algorithms for the amplitude–phase type of CVNN. Benvenuto and Piazza [19] and Nitta and Furuya [9] independently proposed backpropagation learning algorithms for the real part–imaginary part type of CVNN [20]. Boltzmann machines were also extended to complex-valued Boltzmann machines [21], which are continuous models; Kobayashi and Yamazaki proposed their discrete version [22]. In connection with the frequency response mentioned in the previous subsection, the application of the Fourier transformation in many scientific areas, such as communications, robotics, image processing, and especially bioinformatics, can also be treated within CVNNs. For example, in the human brain, an action potential may have different pulse patterns, and the distance between pulses may vary. This suggests that it is appropriate to introduce CNs representing phase and amplitude into NNs. Pattern recognition in electroencephalogram (EEG) signals using CVNNs may also become an active field within the area of so-called brain–machine interfaces.

C. Contribution and Organization of the Paper

Even though the number of CVNNs is continuously growing, there are few applications of such NNs in identification, state estimation, and control design. This paper describes the application of a special class of continuous CVNN, namely the complex-valued differential neural network (CVDNN), used to obtain a numerical nonparametric model of a CV nonlinear uncertain system. The method introduced in this paper includes the algorithm to adjust the weights of the CVDNN, the method to select the parameter adjusting the learning rate, and the convergence proof based on the second method of Lyapunov. Two simple numerical examples show how the technique developed in this paper may be applied to obtain numerical models based on the CVDNN.

The paper is organized as follows. Section II describes the class of CV nonlinear systems considered in this paper,


including the class of nonlinearities and noise affecting the system. Section III describes the approximation of an uncertain nonlinear system based on a special class of continuous NNs. The identifier structure is introduced in Section IV, which includes the class of learning laws used to adjust the parameters defining the identifier. Section V describes the training algorithm used to obtain a numerically solvable scheme of the adjustment laws for the parameters of the identifier. Section VI defines how to optimize the region where the identification error asymptotically converges. The numerical results section demonstrates how the identifier can be implemented numerically. Finally, the paper closes with some concluding remarks.

II. CLASS OF COMPLEX-VALUED NONLINEAR SYSTEMS

The class of nonlinear dynamics with complex state considered in this paper is characterized by the following mathematical model:

$$\dot{x}(t) = f(x(t), u(t)) + \xi(t), \qquad x(0) \text{ fixed and bounded}, \qquad f(\cdot,\cdot) := f_r(\cdot,\cdot) + j f_i(\cdot,\cdot) \tag{1}$$

where $x(t) \in \mathbb{C}^n$, $x(t) := [x_1(t), \ldots, x_n(t)]^{\top}$ is the system state, composed of its real part $x_r(t) \in \mathbb{R}^n$ and imaginary part $x_i(t) \in \mathbb{R}^n$, that is, $x(t) = x_r(t) + j x_i(t)$. The function $u(t) \in \mathbb{R}^m$ represents an exogenous input or a feedback control action; in either case, this signal is assumed to be measurable and bounded. In this paper, the state is also assumed to be measurable. Measurability of the state is understood in the following sense: the magnitude $|x_i(t)|$ and the phase $\arg\{x_i(t)\}$ of each component, $i = 1, \ldots, n$, are available by different mechanisms during the whole period of time while the system evolves. This is not a strong assumption because it is usual to obtain such components in real applications where frequency analysis gives characteristic information for the system beyond the time-evolution information, e.g., in acoustic engineering, impedance analysis, image and signal processing, etc.

The nonlinear CV function $f(\cdot,\cdot): \mathbb{C}^{n+m} \to \mathbb{C}^n$ is composed of two real-valued sections $f_r(\cdot,\cdot)$ and $f_i(\cdot,\cdot)$. These nonlinear functions $f_r(x,u)$ and $f_i(x,u)$ should satisfy a number of conditions such that the complex uncertain nonlinear system (1) has a solution. The system (1) is assumed to be stable; therefore $\|x(t)\| \le x^{+}$, $\forall t \ge 0$. In particular, the following condition is required to solve the CV uncertain system and to obtain the convergence regimen of the identifier based on continuous NNs.

Condition 1: Each nonlinear function $f_r(x,u)$ and $f_i(x,u)$ is continuous with respect to its first argument, and

$$\|f_r(x,u) - f_r(y,u)\|^2 \le L_r\,\|x - y\|^2, \qquad \|f_i(x,u) - f_i(y,u)\|^2 \le L_i\,\|x - y\|^2 \tag{2}$$

for all $x, y \in \mathbb{C}^n$, where $L_r, L_i \in \mathbb{R}^{+}$ are bounded positive constants.

The source of uncertainty comes not only from the low level of knowledge about the mathematical structure of $f(x,u) := f_r(x,u) + j f_i(x,u)$, but also from the presence of


external noise and/or perturbations in the system dynamics. These perturbations are formally represented by $\xi(t) \in \mathbb{C}^n$, fulfilling the following inclusion:

$$\|\xi(t)\|_{\Lambda_\xi}^{2} \le \xi^{+}, \qquad \Lambda_\xi > 0, \quad \Lambda_\xi = \Lambda_\xi^{\top}, \quad \Lambda_\xi \in \mathbb{R}^{n\times n}. \tag{3}$$
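To make the class (1) concrete, the following minimal simulation sketch (our illustration, not part of the original paper) integrates a toy complex-valued plant with a bounded input and a bounded perturbation. The specific field f, noise level, and step size are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, dt, steps = 2, 1e-3, 5000

def f(x, u):
    # Toy complex-valued field: a stable linear part plus a bounded
    # nonlinearity, consistent with the Lipschitz-type Condition 1.
    A = np.array([[-1.0 + 0.5j, 0.2 + 0.0j],
                  [0.0 + 0.0j, -2.0 - 0.3j]])
    return A @ x + 0.1 * (np.tanh(x.real) + 1j * np.tanh(x.imag)) + u

x = np.array([1.0 + 0.5j, -0.3 + 1.0j])
traj = [x.copy()]
for k in range(steps):
    u = 0.5 * np.sin(2 * np.pi * 0.5 * k * dt) * np.ones(n)  # bounded input
    xi = 0.01 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    x = x + dt * (f(x, u) + xi)                              # Euler step
    traj.append(x.copy())

traj = np.array(traj)
mag, phase = np.abs(traj), np.angle(traj)  # the "measurable" quantities
```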

In this paper, we construct an approximate mathematical description of $f(x(t), u(t))$ in order to obtain a plausible and feasible nonparametric mathematical model. This model is constructed using the information coming from the CV state and the real-valued input. This approximation will be called the adaptive identifier based on CVNNs. The method introduced in this paper also gives the adjustment laws for the parameters involved in the approximation model based on the so-called continuous NN. Moreover, a static optimization scheme is proposed to reduce the zone where the identification error asymptotically converges.

III. NEURAL NETWORK APPROXIMATION FOR NONLINEAR COMPLEX-VALUED SYSTEMS

A. Adaptive Approximation

The nonparametric mathematical model is obtained using a particular type of nonlinear least mean square algorithm. The use of this method demands a very important assumption: the pair of nonlinear structures $f_r(x,u)$ and $f_i(x,u)$ should admit a numerical reconstruction based on NNs, in fact, on CVNNs. This pair of CVNN reconstructions is represented by $f_{r,0}(x,u)$ and $f_{i,0}(x,u)$, respectively. This assumption is paramount to admit the existence of a solution for the adaptive modeling problem. A number of possible approximations can be used here; among others, the classical least mean square based on polynomials, sinusoidal functions, wavelet nonlinear functions, and NNs. No matter which selection is adopted, the following construction for representing the uncertain complex nonlinear system should be considered:

$$\dot{x}(t) = Ax(t) + f_{r,0}(x(t), u(t)) + j f_{i,0}(x(t), u(t)) + \eta_r(x(t)) + j\,\eta_i(x(t)) + \xi(t) \tag{4}$$

where $\eta_r(x)$ and $\eta_i(x)$ represent the modeling errors associated with the approximation used in this paper. The matrix $A \in \mathbb{C}^{n\times n}$ is introduced to represent a possible linear section of the nonlinear uncertain system. The class of nonlinear systems analyzed in this paper and the assumption on the existence of a solution lead us to consider that both approximation errors fulfill the following sector restrictions:

$$\|\eta_r(x)\|^2 \le \eta_{r,0} + \eta_{r,1}\|x\|^2, \qquad \|\eta_i(x)\|^2 \le \eta_{i,0} + \eta_{i,1}\|x\|^2, \qquad \eta_{r,0}, \eta_{r,1}, \eta_{i,0}, \eta_{i,1} \in \mathbb{R}^{+}. \tag{5}$$

The Stone–Weierstrass theorem states that, if the number of basis functions tends to infinity, the approximation of the uncertain function can be made exact. Nevertheless, it is practically impossible to construct a suitable numerical algorithm with such a characteristic. In this paper, a finite number of basis functions is used; that is why we consider the modeling errors $\eta_r(x(t))$ and $\eta_i(x(t))$ with the characteristics described in (5). Usually, the nonlinear parts $f_{r,0}(x,u)$ and $f_{i,0}(x,u)$ are represented by linear combinations of nonlinear functions $\Omega(x,u)$, as explained in the literature [23] regarding the approximation capability shown by different NNs. Therefore, the so-called nominal sections will be approximated using the classical linear regression form

$$f_{r,0}(x(t), u(t)) := \Theta_r\,\Omega(x(t), u(t)), \qquad f_{i,0}(x(t), u(t)) := \Theta_i\,\Omega(x(t), u(t)). \tag{6}$$

Here, $\Omega(x(t), u(t))$ represents the basis of the Hilbert space used to reconstruct the uncertain nonlinear system with complex dynamics, and $\Theta_r$ and $\Theta_i$ are parameters used to adjust the contribution of each basis function. In [24], the approximation parameters were proposed as a combination of linear and nonlinear terms. This paper considers a special construction to approximate the uncertain nonlinear system, given by

$$f_{r,0}(x(t), u(t)) := W_{r,1}^{*}\,\psi_1(x(t)) + W_{r,2}^{*}\,\psi_2(x(t))\,u(t), \qquad f_{i,0}(x(t), u(t)) := W_{i,1}^{*}\,\psi_1(x(t)) + W_{i,2}^{*}\,\psi_2(x(t))\,u(t). \tag{7}$$

Therefore,

$$\dot{x}(t) = Ax(t) + W_1^{*}\,\psi_1(x(t)) + W_2^{*}\,\psi_2(x(t))\,u(t) + \eta(x(t)) + \xi(t) \tag{8}$$

where $A = A_r + jA_i$, $W_j^{*} := W_{r,j}^{*} + jW_{i,j}^{*}$ ($j = 1, 2$), and $\eta(x(t)) := \eta_r(x(t)) + j\,\eta_i(x(t))$. The matrices $W_j^{*}$ are unknown but bounded as $\|W_j^{*}\| \le W_j^{+}$, where $\|W_j^{*}\|^{2} := \operatorname{tr}\{[W_j^{*}]^{\dagger}W_j^{*}\}$. One must note that the positive scalar $W_j^{+}$ is known. The functions $\psi_1(x(t))$ and $\psi_2(x(t))$ are basis functions of the same Hilbert space described above. In CVNNs, one of the widely used activation functions is

$$\psi_s(x(t)) := \psi_{r,s}(x(t)) + j\,\psi_{i,s}(x(t)), \quad s = 1, 2, \qquad \psi_{r,s}(x(t)) := \tanh(x_r(t)), \quad \psi_{i,s}(x(t)) := \tanh(x_i(t)) \tag{9}$$

where $\psi_{r,s}(x(t))$ and $\psi_{i,s}(x(t))$ are the real and imaginary parts of the activation function. Here, $\tanh(\cdot)$ is the real-valued hyperbolic tangent, which is continuous and bounded, so that

$$\|\psi_j(x(t))\|^{2} \le L_\psi^{+}, \qquad \|\psi_j(x) - \psi_j(\bar{x})\|^{2} \le L_{\psi_j}\,\|x - \bar{x}\|^{2}, \qquad x, \bar{x} \in \mathbb{C}^n. \tag{10}$$

It should be noticed that, in this paper, the CV basis functions $\psi_1(x(t))$ and $\psi_2(x(t))$ are the same for both the real and imaginary parts. These functions are suitable for processing waves or wave-related information: the wave amplitude corresponds to the amplitude of the complex variable in the NN, while the wave phase corresponds to the phase of the neural variable. The saturation characteristic of the nonlinear function can then be related to the saturation of wave energy, which is widely observed in various physical phenomena.
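As a reading aid, here is a small sketch (our illustration; the dimensions and the scalar input are assumptions) of the split-tanh activation (9) and the nominal regression form (7)-(8):

```python
import numpy as np

def psi(x):
    # Split activation (9): psi(x) = tanh(Re x) + j tanh(Im x),
    # bounded and Lipschitz as required by (10).
    return np.tanh(x.real) + 1j * np.tanh(x.imag)

def f_nominal(x, u, A, W1, W2):
    # Right-hand side of (8) without the modeling-error and noise terms;
    # A, W1, W2 are complex (n x n) matrices and u a scalar input here.
    return A @ x + W1 @ psi(x) + (W2 @ psi(x)) * u
```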


Even when the approximation provided above could be exact (depending on the number of activation functions), a constructive method should be proposed to obtain the true values of $W_j^{*}$. Such a method, based on the nonparametric identifier using NNs, is introduced in the next section.

IV. IDENTIFIER STRUCTURE

The identifier based on NNs is proposed following the classical strategy used within the adaptive parameter identification framework [24]. This construction uses a structural copy of the NN-based approximation of the uncertain system defined in (8). Therefore, the identifier has the following structure:

$$\frac{d}{dt}\hat{x}(t) = A\hat{x}(t) + W_1(t)\,\psi_1(\hat{x}(t)) + W_2(t)\,\psi_2(\hat{x}(t))\,u(t), \qquad \hat{x}(0) \text{ fixed and bounded} \tag{11}$$

where the time-varying weights $W_s(t) := W_{r,s}(t) + jW_{i,s}(t)$ ($s = 1, 2$) are used to adjust the adaptive identifier. This is the main method to construct a general form for this numerical approximation. The identifier state is $\hat{x}(t) \in \mathbb{C}^n$. The matrix $A$ and the functions $\psi_1(\cdot)$, $\psi_2(\cdot)$ have the same meaning as in the previous section. The introduction of this scheme leads to the following description of the problem under analysis.

Problem 1: Based on the uncertain CV nonlinear system (1) and its approximation scheme introduced in (8), design the algorithm

$$\dot{W}_j(t) := \Phi_j\left(\Delta(t), W_j(t)\right), \quad j = 1, 2 \tag{12}$$

to adjust $W_1(t)$ and $W_2(t)$ using the information provided by the identification error $\Delta(t) \in \mathbb{C}^n$, $\Delta(t) := \hat{x}(t) - x(t)$, such that $\Delta(t)$ converges asymptotically to a small zone characterized by the power of the perturbations and uncertainties appearing in (1). This algorithm is referred to as the learning law. Moreover, when perturbations and noise disappear, asymptotic convergence to the origin must be achieved.

The following subsection describes such an algorithm and how it is obtained.
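Before turning to the learning law, a discrete-time reading of the identifier (11) can be sketched as follows (our illustration; psi is the split-tanh activation sketched earlier, and the explicit Euler step size dt is a tuning choice):

```python
def identifier_step(x_hat, u, A, W1, W2, dt):
    # One Euler step of (11):
    #   d/dt x_hat = A x_hat + W1 psi1(x_hat) + W2 psi2(x_hat) u
    dx_hat = A @ x_hat + W1 @ psi(x_hat) + (W2 @ psi(x_hat)) * u
    return x_hat + dt * dx_hat
```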

A. Learning Algorithm

The learning algorithm is obtained using so-called controlled Lyapunov functions. This class of functions generalizes the well-known Lyapunov functions usually needed to show that a nonlinear system is stable (in the Lyapunov sense) or at least practically stable, that is, that the identifier error converges to the origin in the absence of noise and uncertainties, or to a ball near the origin defined by the power of those external perturbations. The controlled Lyapunov function was defined three decades ago and formally introduced in Artstein's theorem [25]. In this paper, a special type of controlled Lyapunov function is used to obtain the learning law and the asymptotic convergence of the identification error $\Delta(t)$. Two different choices can be made here. The first one requires a

class of the so-called vector Lyapunov functions constructed as

$$V_v(\Delta) := \begin{bmatrix} V_{v,1}(\Delta_1) \\ V_{v,2}(\Delta_2) \\ \vdots \\ V_{v,n}(\Delta_n) \end{bmatrix} \tag{13}$$

where $V_{v,h}(\cdot)$ ($h = 1, 2, \ldots, n$) is the Lyapunov candidate function for each individual CV state $\Delta_h$ of the estimation error. Each individual Lyapunov-like function includes the square norm of the corresponding $\Delta_h$. Nevertheless, a nonstandard Lyapunov technique must then be used, namely the vector Lyapunov function theory introduced in [26]. This methodology requires a set of tools that may introduce additional and unnecessary complexity into the identification problem. That is why the following method was preferred.

The second plausible proposal is to decompose each individual state $\Delta_h$ of the estimation error into its real part $\Delta_{h,r}$ and imaginary part $\Delta_{h,i}$ and to construct a new vector

$$\bar{\Delta} := \begin{bmatrix} \Delta_{1,r} & \Delta_{1,i} & \cdots & \Delta_{n,r} & \Delta_{n,i} \end{bmatrix}^{\top} \tag{14}$$

on which a general Lyapunov-like function $V(\cdot)$ is based. Actually, both options are equivalent in some sense: one possible equivalence is obtained by taking the sum of all functions $V_{v,h}(\Delta_h)$ as a possible selection of $V(\cdot)$. Based on the previous statement, in this paper the following structure for the Lyapunov-like function $V(\cdot)$ was selected:

$$V(\Delta, \tilde{W}_1, \tilde{W}_2) := \|\Delta\|_P^{2} + \sum_{j=1}^{2} k_j \operatorname{tr}\left\{\tilde{W}_j^{\dagger}\tilde{W}_j\right\}. \tag{15}$$

This function has a classical quadratic structure with respect to the state variables. Direct application of the second Lyapunov method leads to the following learning algorithm:

$$\dot{W}_j(t) := -k_j^{-1}\left[P\,\Delta(t)\,\Omega_j^{\dagger}(z_j(t)) + \alpha\,\tilde{W}_j(t)\right] \tag{16}$$

where $P > 0$ is a positive-definite symmetric matrix ($P = P^{\top} \in \mathbb{R}^{2n\times 2n}$) that solves the matrix inequality (MI) $\operatorname{Ric}(P, \alpha, R) < 0$, where

$$\operatorname{Ric}(P, \alpha, R) := P\left(A + \frac{\alpha}{2}I_{n\times n}\right) + \left(A + \frac{\alpha}{2}I_{n\times n}\right)^{\dagger}P + PRP + Q$$

$$R := \Lambda_{\xi}^{-1} + \sum_{j=1}^{2} W_j^{+}\Lambda_j^{-1}, \qquad Q := Q_0 + \sum_{j=1}^{2} L_{\psi_j}\Lambda_j. \tag{17}$$

In (16), the vector $z_j(t)$ is defined as

$$z_j(t) := \begin{cases} x(t), & \text{if } j = 1 \\ \begin{bmatrix} x(t) \\ u(t) \end{bmatrix}, & \text{if } j = 2 \end{cases} \tag{18}$$

and the general functions $\Omega_j(z_j(t))$ are constructed as

$$\Omega_j(z_j(t)) := \begin{cases} \psi_1(\hat{x}(t)), & \text{if } j = 1 \\ \psi_2(\hat{x}(t))\,u(t), & \text{if } j = 2. \end{cases} \tag{19}$$
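In discrete time, the learning law (16) with the regressors (19) can be sketched as follows. This is our illustration: it is written directly in complex arithmetic rather than the stacked real/imaginary form used in the analysis, P is assumed to be computed offline from (17), and the surrogate for the unknown W*_j follows the training construction of Section V.

```python
import numpy as np

def weight_step(W, W_bar, Delta, omega, P, k, alpha, dt):
    # One Euler step of (16):
    #   dW/dt = -(1/k) [ P Delta omega^H + alpha W_tilde ]
    # omega = psi1(x_hat) for j = 1, or psi2(x_hat) * u for j = 2, as in (19).
    W_tilde = W - W_bar                               # W_bar stands in for W*
    dW = -(P @ np.outer(Delta, omega.conj()) + alpha * W_tilde) / k
    return W + dt * dW
```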


B. Convergence of the Estimation Error and the Learning Law

The application of the second Lyapunov method and the previous construction leads to the following result.

Theorem 1: If there exist positive and bounded parameters $\alpha$ and $\tau$ such that the linear matrix inequality (LMI) (17) admits at least one positive-definite solution $P = P^{\top} > 0$ with the corresponding parameters, then the identification error converges to a ball around the states of the CV nonlinear uncertain system (1) characterized by

$$\limsup_{T\to\infty}\ \sup_{\eta(T),\,\xi(T)} \|\Delta(T)\|_{P}^{2} \le \frac{\beta}{\alpha}. \tag{20}$$

The parameters defining the zone where the estimation error converges are given by

$$\beta := \xi^{+} + \left(x^{+}\right)^{2}\left(\eta_{r,1} + \eta_{i,1}\right) + \eta_{r,0} + \eta_{i,0}, \qquad \alpha > 0. \tag{21}$$

Proof: The first step comes from the dynamics of the identification error,

$$\dot{\Delta}(t) := A\Delta(t) - \eta(x(t)) - \xi(t) + \tilde{W}_1(t)\,\psi_1(\hat{x}(t)) + W_1^{*}\,\tilde{\psi}_1(x(t), \hat{x}(t)) + \tilde{W}_2(t)\,\psi_2(\hat{x}(t))\,u(t) + W_2^{*}\,\tilde{\psi}_2(x(t), \hat{x}(t))\,u(t). \tag{22}$$

The evaluation of this trajectory on the full time derivative of the Lyapunov function gives

$$\dot{V}(t) := 2\,\Delta^{\dagger}(t)P\dot{\Delta}(t) + \sum_{j=1}^{2} k_j \operatorname{tr}\left\{\tilde{W}_j^{\dagger}(t)\,\dot{W}_j(t)\right\}. \tag{23}$$

Direct substitution of $\dot{\Delta}(t)$ in the term $\Delta^{\dagger}(t)P\dot{\Delta}(t)$ and the repeated application of the inequality $XY^{\dagger} + YX^{\dagger} \le X\Lambda X^{\dagger} + Y\Lambda^{-1}Y^{\dagger}$, valid for any $X, Y \in \mathbb{R}^{r\times s}$ and any $0 < \Lambda = \Lambda^{\top} \in \mathbb{R}^{s\times s}$, leads to

$$\Delta^{\dagger}(t)P\dot{\Delta}(t) \le \Delta^{\dagger}(t)PA\Delta(t) + \sum_{j=1}^{2}\Delta^{\dagger}(t)PW_j^{*}\Lambda_j^{-1}\left[W_j^{*}\right]^{\dagger}P\Delta(t) + \sum_{j=1}^{2}\left\|\Lambda_j^{1/2}\,\tilde{\Omega}_j(x(t), \hat{x}(t), u(t))\right\|^{2} + \Delta^{\dagger}(t)Q\Delta(t) + \beta + \Delta^{\dagger}(t)P\tilde{W}_1(t)\,\psi_1(\hat{x}(t)) + \Delta^{\dagger}(t)P\tilde{W}_2(t)\,\psi_2(\hat{x}(t))\,u(t)$$

$$\tilde{\Omega}_1(x(t), \hat{x}(t), u(t)) = \tilde{\psi}_1(x(t), \hat{x}(t)), \qquad \tilde{\Omega}_2(x(t), \hat{x}(t), u(t)) = \tilde{\psi}_2(x(t), \hat{x}(t))\,u(t). \tag{24}$$

The previous inequality can be inserted into the expression for the full time derivative of the Lyapunov function defined in (15). This substitution transforms the differential inclusion of the full time derivative of the Lyapunov function into

$$\dot{V}(t) \le \Delta^{\dagger}(t)\left[P\left(A + \frac{\alpha}{2}I_{n\times n}\right) + \left(A + \frac{\alpha}{2}I_{n\times n}\right)^{\dagger}P + PRP + Q\right]\Delta(t) - \alpha V(t) + \beta + \sum_{j=1}^{2}\operatorname{tr}\left\{\tilde{W}_j^{\dagger}(t)\left[\dot{W}_j(t) - \Phi_j\left(\Delta(t), W_j(t)\right)\right]\right\}. \tag{25}$$

Based on the previous definition,

$$\dot{V}(t) \le \Delta^{\dagger}(t)\operatorname{Ric}(P, \alpha, R)\,\Delta(t) - \alpha V(t) + \beta + \sum_{j=1}^{2}\operatorname{tr}\left\{\tilde{W}_j^{\dagger}(t)\left[\dot{W}_j(t) - \Phi_j\left(\Delta(t), W_j(t)\right)\right]\right\}. \tag{26}$$

By the assumption of a positive-definite solution of (17) and the learning laws defined in (16), the following inequality is achieved:

$$\dot{V}(t) \le -\alpha V(t) + \beta. \tag{27}$$

Considering the upper bound of the previous inequality, the comparison principle, and the fact that $V(t) > 0$ for all $t \ge 0$, one has

$$V(t) \le e^{-\alpha t}V(0) + \frac{\beta}{\alpha}\left(1 - e^{-\alpha t}\right). \tag{28}$$

Taking the upper limit when $t \to \infty$ finishes the proof.

Corollary 1: If the uncertainties $\eta(x)$ are purely linear ($\eta_{r,0} = \eta_{i,0} = 0$) and external perturbations are absent ($\xi(t) = 0$) for a particular class of asymptotically stable systems ($\limsup_{t\to\infty}\|x(t)\| = 0$), then the trajectories of the identifier error converge to the origin.

Proof: The proof follows directly from the theorem. Under the assumptions stated in the corollary, $\beta = 0$ and the proof is complete.

V. TRAINING ALGORITHM

The learning laws proposed in (16) cannot be implemented as presented: they demand the knowledge of $W_j^{*}$, $j = 1, 2$, which are actually unknown. Nevertheless, a training algorithm can be used to overcome this problem. In this paper, the training follows the method proposed in [23]. The training method produces a set of values $W_j^{*,id}$ that must be near $W_j^{*}$ in some sense. The following procedure shows how the values of $W_j^{*,id}$ are obtained.

1) Collect a few sets $X_k^{tra}$ ($k = 1, \ldots, r$) of discrete state measurements $X_k^{tra} := \{x_k^{tra}(t_s),\ s$ a positive integer or zero, with $t_s - t_{s-1} > 0\}$ or continuous trajectories $X_k^{tra} := \{x_k^{tra}(t),\ t \ge 0\}$ obtained from the uncertain system to be identified by the CVDNN. Evidently, the information obtained from these sets is affected by measurement noise.

2) Propose a parameter identification algorithm to obtain a suitable value of $W^{*}$ using the information collected before. In this step, only a fraction of all the information sets is used to perform the training; in this case, we used 70% of the sets, and the remaining sets were used to validate the training (a data-handling sketch is given after (29) below).

3) Set the weights $W^{*}$ in the identifier structure proposed above.

4) Validate the training using a simple strategy: substitute the weights $W^{*}$ produced by the training and then test the online learning system. This is a key aspect that reveals the differences between the classical static NN and the continuous NN.

The key step in performing the training is the second one. Several methods have been proposed to solve this issue; among others, matrix least mean square methods with several modifications have been used. This section describes how to solve the training scheme using the continuous version of the least mean square method; the discrete version can easily be reproduced following a similar procedure. If the nonlinear least mean square method is applied to the NN representation (8), the following equation is obtained:

$$\dot{x}(t)\,x^{\dagger}(t) = Ax(t)\,x^{\dagger}(t) + W^{*}\,\Omega(x(t), u(t))\,x^{\dagger}(t) + \chi(t)\,x^{\dagger}(t)$$

$$W^{*} := \begin{bmatrix} W_1^{*} & W_2^{*} \end{bmatrix}, \qquad \Omega(x(t), u(t)) := \begin{bmatrix} \psi_1(x(t)) \\ \psi_2(x(t))\,u(t) \end{bmatrix}, \qquad \chi(t) = \eta(x(t)) + \xi(t). \tag{29}$$
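The 70/30 split of step 2 can be organized as below; this is a sketch, and the record layout and function names are our own choices.

```python
import numpy as np

def split_sets(trajectories, train_fraction=0.7, seed=0):
    # trajectories: list of recorded sets X_k^tra (state/input samples).
    # 70% of the sets are used for training, the rest for validation,
    # as described in step 2 of the procedure above.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(trajectories))
    cut = int(train_fraction * len(trajectories))
    return ([trajectories[i] for i in idx[:cut]],
            [trajectories[i] for i in idx[cut:]])
```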

The direct integration from $0$ to $T$ on both sides of (29) leads to

$$\int_{\tau=0}^{T}\dot{x}(\tau)\,x^{\dagger}(\tau)\,d\tau = A\int_{\tau=0}^{T}x(\tau)\,x^{\dagger}(\tau)\,d\tau + W^{*}\int_{\tau=0}^{T}\Omega(x(\tau), u(\tau))\,x^{\dagger}(\tau)\,d\tau + \int_{\tau=0}^{T}\chi(\tau)\,x^{\dagger}(\tau)\,d\tau. \tag{30}$$

Considering that the upper limit of the integration described in the last equation is finite, the weights produced after training will never be the true ones. This is one of the reasons for introducing the online learning procedure, which has already been detailed. The left-hand side of the previous equation can be evaluated (using a simple integration by parts) as

$$\int_{\tau=0}^{T}\dot{x}(\tau)\,x^{\dagger}(\tau)\,d\tau = x(T)\,x^{\dagger}(T) - x(0)\,x^{\dagger}(0) - \int_{\tau=0}^{T}x(\tau)\,\dot{x}^{\dagger}(\tau)\,d\tau. \tag{31}$$

Therefore, by integration by parts, one gets

$$\frac{1}{2}\left[x(T)\,x^{\dagger}(T) - x(0)\,x^{\dagger}(0)\right] = A\int_{\tau=0}^{T}x(\tau)\,x^{\dagger}(\tau)\,d\tau + W^{*}\int_{\tau=0}^{T}\Omega(x(\tau), u(\tau))\,x^{\dagger}(\tau)\,d\tau + \int_{\tau=0}^{T}\chi(\tau)\,x^{\dagger}(\tau)\,d\tau. \tag{32}$$

The term $\int_{\tau=0}^{T}\chi(\tau)\,x^{\dagger}(\tau)\,d\tau$ is not available to solve the previous equation. Here, one can obtain an approximation for $W^{*}$, namely $W^{*,id}$, that is actually the solution of the training algorithm. This solution is based on well-known results on matrix least mean square algorithms; in this paper we omit the details on the solvability of this problem because extensive information is available in [27] and the references therein. The formal expression for this term is

$$W^{*,id} = \left[\Xi - A\int_{\tau=0}^{T}x(\tau)\,x^{\dagger}(\tau)\,d\tau\right]\left[\int_{\tau=0}^{T}\Omega(x(\tau), u(\tau))\,x^{\dagger}(\tau)\,d\tau\right]^{-1}, \qquad \Xi := \frac{1}{2}\left[x(T)\,x^{\dagger}(T) - x(0)\,x^{\dagger}(0)\right]. \tag{33}$$

When the available information for the training process is substantial ($T \to \infty$), the following expression can also be used to obtain the value of $W^{*,id}$:

$$W^{*,id} = \lim_{T\to\infty}\left[\Xi - A\int_{\tau=0}^{T}x(\tau)\,x^{\dagger}(\tau)\,d\tau\right]\left[\int_{\tau=0}^{T}\Omega(x(\tau), u(\tau))\,x^{\dagger}(\tau)\,d\tau\right]^{-1}. \tag{34}$$

No matter which reference is consulted, the following property holds: $\|W^{*,id} - W^{*}\|^{2} \le \Upsilon$, where $\Upsilon$ is a bounded positive constant. The last equation is only valid when $T \to \infty$; if $T$ is finite, a small deviation from the true value is obtained. Therefore, instead of $W^{*,id}$, a value named $\bar{W}^{*,id}$ (obtained with a finite time $T$ in the previous equation) is introduced in the adjustment laws (12). The learning laws defined previously are thus transformed into

$$\dot{W}_j(t) := -k_j P\,\Delta(t)\,\psi_j^{\dagger}(\hat{x}(t)) + \alpha\,\bar{W}_j^{id} \tag{35}$$

where $\bar{W}_j^{id} := W^{*,id} - W^{*}$. This change is used to see how the training process can affect the quality of the identification based on continuous NNs for the CV uncertain system. The utilization of this process is obligatory because no knowledge of $W^{*}$ is assumed. Based on the result presented in the main theorem of this paper, the identification error remains bounded; therefore, the identifier state is also bounded by the assumption presented in this subsection. Under the conditions established in this part, using $W^{*,id}$ instead of $W^{*}$ produces a larger but bounded deviation of the identifier state from the trajectories of the uncertain system. Therefore, the quality of the identification process is reduced, but the decrement is measurable and bounded.
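A discretized reading of (33) can be sketched as follows. This is our construction: the integrals are replaced by Riemann sums over sampled data, the inverse by a pseudo-inverse (the paper assumes invertibility), and a scalar input is assumed.

```python
import numpy as np

def psi(x):
    return np.tanh(x.real) + 1j * np.tanh(x.imag)   # activation (9)

def train_weights(X, U, A, dt):
    # X: (n, N) complex state samples, U: (N,) scalar inputs, A: known
    # linear part, dt: sampling step. Returns W_id ~ [W1 W2], shape (n, 2n).
    Omega = np.vstack([psi(X), psi(X) * U])          # stacked regressor (29)
    Sxx = (X @ X.conj().T) * dt                      # ~ int x x^H dtau
    Sox = (Omega @ X.conj().T) * dt                  # ~ int Omega x^H dtau
    Xi = 0.5 * (np.outer(X[:, -1], X[:, -1].conj())
                - np.outer(X[:, 0], X[:, 0].conj()))
    return (Xi - A @ Sxx) @ np.linalg.pinv(Sox)      # finite-T version of (33)
```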

VI. OPTIMIZATION OF THE CONVERGENCE REGION

The result obtained in Theorem 1 is enough to show where the identification error converges. Nevertheless, the MI used within this result may also be used to reduce the convergence region. This can be done using known results from MI theory. In some papers, this method is known as the invariant


ellipsoid method. The first theorem discussed in this paper showed that

$$\limsup_{t\to\infty}\|\Delta(t)\|^{2} \le \frac{\beta}{\alpha\,\lambda_{\min}\{P\}} \tag{36}$$

where $\lambda_{\min}\{P\}$ is the minimum eigenvalue of the matrix $P$. Equivalently, the previous inequality can be represented as

$$\limsup_{t\to\infty}\ \Delta^{\dagger}(t)\,\frac{\alpha P}{\beta}\,\Delta(t) \le 1. \tag{37}$$

The eventual maximization of a functional $\ell(\alpha, P)$ associated with the previous inequality provides a family of so-called minimal invariant ellipsoids. Results on the ellipsoid technique have introduced several options to define $\ell$: $\ell(\alpha, P) := \operatorname{tr}\{\alpha P/\beta\}$, $\ell(\alpha, P) := \det\{\alpha P/\beta\}$, or the spectral norm $\|\alpha P/\beta\|$. The simplest option is the first one because it is a linear operation. The following theorem uses this method to reduce the convergence region of the estimation error provided by the CVDNN identifier.

Theorem 2: All minimal regions where the identifier trajectories converge are characterized by the positive-definite solutions $P(\alpha)$, $0 < \alpha < \alpha^{*}$, of the following matrix inequality (namely, a Lyapunov-type inequality):

$$\mathcal{W}(P, K, \alpha) := P\left(A + \frac{\alpha}{2}I + \frac{K}{2}\right) + \left(A + \frac{\alpha}{2}I + \frac{K}{2}\right)^{\dagger}P + Q < 0 \tag{38}$$

where $\alpha^{*}$ is a positive bounded scalar. Moreover, the function $\ell(\alpha, P)$ is strictly convex in the interval $0 < \alpha < \alpha^{*}$. Therefore, the static optimization problem

$$\sup_{0 < \alpha < \alpha^{*}} \operatorname{tr}\left\{\frac{\alpha P}{\beta}\right\} \tag{39}$$
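Numerically, the sweep over alpha suggested by Theorem 2 can be organized as a sequence of semidefinite programs, for instance with CVXPY. The sketch below is our construction, not the authors': it uses the Riccati-type constraint (17) handled through a Schur complement (the feedback term K of (38) is omitted), and it assumes A, R, Q are the real, stacked representations of the complex-valued data.

```python
import cvxpy as cp
import numpy as np

def best_ellipsoid(A, R, Q, beta, alphas):
    # For each alpha on a grid, look for P > 0 satisfying
    #   (A + a/2 I)^T P + P (A + a/2 I) + P R P + Q < 0
    # (written in Schur-complement form) and keep the solution
    # maximizing tr(alpha * P / beta), as in (39).
    n = A.shape[0]
    best_P, best_val = None, -np.inf
    for alpha in alphas:
        P = cp.Variable((n, n), symmetric=True)
        Ab = A + 0.5 * alpha * np.eye(n)
        M = cp.bmat([[Ab.T @ P + P @ Ab + Q, P],
                     [P, -np.linalg.inv(R)]])
        prob = cp.Problem(cp.Maximize(cp.trace(alpha * P) / beta),
                          [P >> 1e-6 * np.eye(n), M << 0])
        prob.solve()
        ok = prob.status in ("optimal", "optimal_inaccurate")
        if ok and prob.value > best_val:
            best_P, best_val = P.value, prob.value
    return best_P, best_val
```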
