IEEE TRANSACTIONS ON CYBERNETICS, VOL. 45, NO. 7, JULY 2015


Distributed Cooperative Optimal Control for Multiagent Systems on Directed Graphs: An Inverse Optimal Approach

Huaguang Zhang, Senior Member, IEEE, Tao Feng, Guang-Hong Yang, Senior Member, IEEE, and Hongjing Liang

Abstract—In this paper, the inverse optimal approach is employed to design distributed consensus protocols that guarantee consensus and global optimality with respect to some quadratic performance indexes for identical linear systems on a directed graph. The inverse optimal theory is developed by introducing the notion of partial stability. As a result, the necessary and sufficient conditions for inverse optimality are proposed. By means of the developed inverse optimal theory, the necessary and sufficient conditions are established for globally optimal cooperative control problems on directed graphs. Basic optimal cooperative design procedures are given based on asymptotic properties of the resulting optimal distributed consensus protocols, and the multiagent systems can reach the desired consensus performance (convergence rate and damping rate) asymptotically. Finally, two examples are given to illustrate the effectiveness of the proposed methods.

Index Terms—Asymptotic properties, consensus performance, convergence rate, damping rate, distributed consensus protocols, inverse optimality.

I. INTRODUCTION

Cooperative control of multiagent systems has attracted extensive attention from different fields due to its wide range of applications, e.g., vehicle formations [1] and flocking of groups of mobile autonomous agents [2], and has produced fruitful results [3]–[26]. Generally, consensus problems for multiagent systems can be categorized into the leaderless consensus problem and the leader following consensus problem. For the leaderless consensus problem, as stated in [3] and [4], distributed controllers are designed for each agent such that all agents are eventually driven to an unprescribed common value. This value may be constant or

Manuscript received January 22, 2014; revised June 5, 2014; accepted August 5, 2014. Date of publication September 9, 2014; date of current version June 12, 2015. This work was supported in part by the National Natural Science Foundation of China under Grant 61034005, Grant 61433004, and Grant 61273148, and in part by the National High Technology Research and Development Program of China under Grant 2012AA040104, in part by IAPI Fundamental Research Funds under Grant 2013ZCX14, and in part by the Development Project of Key Laboratory of Liaoning Province. This paper was recommended by Associate Editor P. Shi. H. Zhang and G.-H. Yang are with the College of Information Science and Engineering, Northeastern University, Shenyang 110819, China and also with the State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110004, China (e-mail: [email protected]; [email protected]). T. Feng and H. Liang are with the College of Information Science and Engineering, Northeastern University, Shenyang 110819, China (e-mail: [email protected]; [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TCYB.2014.2350511

time varying, and is generally a function of the initial states of the agents in the communication network [5]. This problem is also known as the cooperative regulator problem [3], [4], or as synchronization or rendezvous [5]. For the leader following consensus problem, a leader agent acts as a command generator, which generates the desired reference trajectory and ignores information from the follower agents [3], [4]. All other agents attempt to follow the trajectory of the leader agent. This problem is also known as the cooperative tracking problem [20], consensus to a leader [21], model reference consensus [22], pinning control [23], or synchronized tracking control [24], [25].

Cooperative optimal control problems have aroused the interest of many researchers [27]–[33]. Optimality of the control protocols gives rise to desirable properties such as reduced sensitivity and robust stability. The main difficulty is that globally optimal problems generally require global information about the agents, which is difficult to obtain in most applications. In addition, for multiagent systems, the graph topology interplays with the system dynamics; hence the globally optimal control problems are fairly complicated. The existing work mainly constructs consensus protocols which minimize a local performance index rather than a global one. For agents with identical linear time-invariant dynamics, a suboptimal design using a local linear quadratic regulator (LQR) design method was presented in [28]. Distributed games on graphs were studied by introducing the notion of iterative Nash equilibrium in [29], where each agent only minimizes its own performance index. In [30], the consensus problem of multiagent differential games for nonlinear systems based on optimal coordination control was solved via a fuzzy adaptive dynamic programming approach [31]. There are few papers working on globally optimal cooperative control for multiagent systems.
Solutions of the globally optimal problem were obtained by designing a local observer in [32], where global information was used. In [33], an optimal linear-consensus algorithm for multiagent systems with single-integrator dynamics was proposed by defining two different global performance indexes. By using the inverse optimality method, an optimality criterion related to the graph topology was established to obtain the distributed optimal control [20], [34]. In [20], an LQR based optimal design method was proposed to obtain the optimal distributed consensus protocols by constructing a global performance index.

The consensus performance problem for multiagent systems is of great significance from a practical point of view. It is

2168-2267 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


known that the transient behaviors of the system depend on the locations of its eigenvalues, especially the one nearest to the imaginary axis. In view of this point, two indexes with respect to the eigenvalues of the closed-loop system were introduced to evaluate the consensus performance in [35]: the convergence rate and the damping rate. The former evaluates the convergence speed of the agents, and the latter evaluates their oscillating behaviors. However, the existing LQR based consensus design methods show great weakness in addressing such problems, since it is difficult to select appropriate weighting matrices such that the eigenvalues of the multiagent system lie in a specified region yielding the desired consensus performance. Moreover, since the graph topology interplays with the system dynamics, the locations of the global closed-loop poles are hard to determine even when the closed-loop poles of each agent system are placed in a specified region. The problem of designing distributed consensus protocols such that the multiagent systems can reach a desired consensus performance remains unsolved. This motivates our work.

This paper aims to design globally optimal distributed consensus protocols which not only optimize some global quadratic performance indexes and guarantee the consensus of the agents, but also place all closed-loop eigenvalues of the multiagent system in a specified region. Therefore, the resulting globally optimal distributed consensus protocols ensure that the multiagent systems reach the desired consensus performance. The main contributions are as follows.

1) Novel inverse optimal results for linear systems are constructed based on the partial stability principle.
2) By means of the inverse optimal results, necessary and sufficient conditions are given for solving the cooperative optimal control problems for leaderless and leader following multiagent systems.
3) Optimal cooperative design procedures are proposed based on asymptotic properties of the resulting optimal distributed consensus protocols, such that the multiagent systems reach the desired consensus performance asymptotically.

The rest of the paper is organized as follows. In Section II, we first review some concepts from graph theory and then give some results on the inverse optimality of linear systems. The main results for cooperative optimal control are given in Section III: necessary and sufficient conditions for the globally optimal cooperative control problems are proposed, and then issues of consensus performance of the multiagent systems are investigated, which leads to novel and simple optimal distributed cooperative design methods. In Section IV, two numerical examples are given to illustrate the superiority of the proposed methods. Conclusions and future work prospects are given in Section V.

Notations: A > 0 (< 0) means matrix A is positive (negative) definite; A ≥ 0 (≤ 0) means matrix A is positive (negative) semi-definite. The Kronecker product is denoted by ⊗. The transpose of matrix A is denoted by A^T. I_n denotes the n-dimensional identity matrix in R^{n×n}. 1_n ∈ R^n is the vector with all elements 1. ker(A) denotes the null space of matrix A. rank(A) denotes the rank of matrix A.
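Since the Kronecker product carries most of the global-form notation below, a minimal numpy sketch may help; the matrices here are arbitrary illustrative choices, not taken from the paper:

```python
import numpy as np

# Kronecker product: np.kron(L, K) replaces each scalar L[i, j] by the block L[i, j] * K.
L = np.array([[ 1.0, -1.0],
              [-1.0,  1.0]])   # a 2-node Laplacian (example)
K = np.array([[2.0, 0.5]])     # a 1x2 feedback gain (example)

LK = np.kron(L, K)             # shape (2, 4)
assert LK.shape == (2, 4)

# Mixed-product property used repeatedly in the proofs: (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD).
rng = np.random.default_rng(0)
A, C = rng.random((2, 2)), rng.random((2, 2))
B, D = rng.random((3, 3)), rng.random((3, 3))
assert np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D))
```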


II. PRELIMINARIES

A. Graph Theory

Consider a weighted digraph G = (V, E, A) with a nonempty finite set of N nodes V = {v_1, v_2, ..., v_N}, a set of edges E ⊂ V × V, and the associated adjacency matrix A = [a_ij] ∈ R^{N×N}. An edge rooted at node j and ending at node i is denoted by (v_j, v_i), which means that information flows from node j to node i. The weight a_ij of edge (v_j, v_i) is positive, i.e., a_ij > 0 if (v_j, v_i) ∈ E; otherwise, a_ij = 0. In this paper, assume that there are no repeated edges and no self loops, i.e., a_ii = 0, ∀i ∈ N, where N = {1, 2, ..., N}. If (v_j, v_i) ∈ E, then node j is called a neighbor of node i. The set of neighbors of node i is denoted by N_i = {j | (v_j, v_i) ∈ E}. Define the in-degree matrix as D = diag{d_i} ∈ R^{N×N} with d_i = Σ_{j∈N_i} a_ij and the Laplacian matrix as L = D − A. Obviously, L1_N = 0. The graph is said to be connected if every two vertices can be joined by a path, and strongly connected if every two vertices can be joined by a directed path. If G is strongly connected, the zero eigenvalue of L is simple, and ker L = span{1_N}. A digraph is said to have a spanning tree if there is a node i_r such that there is a directed path from i_r to every other node in the graph.

B. Inverse Optimality of Linear Systems

Consider the following linear quadratic (LQ) regulator problem:

ẋ = Ax + Bu (1a)
J = ∫_0^∞ (x^T Qx + u^T Ru) dt (1b)
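The graph quantities above can be made concrete with a short numpy sketch; the 3-node digraph is an arbitrary example:

```python
import numpy as np

# Adjacency matrix of a 3-node digraph: a_ij > 0 iff edge (v_j, v_i) exists.
A_adj = np.array([[0.0, 1.0, 0.0],
                  [0.0, 0.0, 2.0],
                  [1.0, 0.0, 0.0]])

D = np.diag(A_adj.sum(axis=1))   # in-degree matrix, d_i = sum_j a_ij
L = D - A_adj                     # Laplacian L = D - A

# Row sums of L vanish, i.e., L 1_N = 0.
assert np.allclose(L @ np.ones(3), 0.0)
```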

where x ∈ R^n is the state vector and u ∈ R^m is the control input. Assume that (A, B) is controllable and B is of full column rank m. The inverse optimal control problem (IOCP) [36], [37] considered in this paper is to find conditions on A, B, and a given stabilizing control

u = −Kx (2)

such that the control law (2) minimizes the cost (1b) for some symmetric nonnegative definite matrix Q and symmetric positive definite matrix R. If Q and R are given, then the LQ regulator is given by the state feedback control u = −Kx, where

K = R^{-1}B^T P (3)

and P is the unique symmetric positive definite solution of the algebraic Riccati equation (ARE)

A^T P + PA − PBR^{-1}B^T P + Q = 0. (4)

Equations (3) and (4) are equivalent to

A^T P + PA − K^T RK + Q = 0 (5a)
B^T P = RK. (5b)

With this expression, the IOCP is to find some symmetric nonnegative definite matrices Q, positive definite matrices R and P which satisfy (5).
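For given Q and R, the gain (3) and the ARE (4) can be computed numerically; a sketch using scipy, where the double-integrator pair (A, B) is an arbitrary example:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Example controllable pair (A, B) with Q >= 0 and R > 0.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P = solve_continuous_are(A, B, Q, R)   # solves A'P + PA - PBR^{-1}B'P + Q = 0, eq. (4)
K = np.linalg.solve(R, B.T @ P)        # K = R^{-1} B' P, eq. (3)

# Check the equivalent form (5a)-(5b).
assert np.allclose(A.T @ P + P @ A - K.T @ R @ K + Q, 0.0, atol=1e-6)
assert np.allclose(B.T @ P, R @ K)
```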


Lemma 1 [20] (Partial Stability): Given an affine manifold S contained in a neighborhood D ⊃ S, if there exists a quadratic function V: D → R, V(x) = x^T Px ≥ 0, such that
1) S = ker P;
2) V̇(x) ≤ 0, ∀x ∈ D;
3) {x ∈ D | V̇(x) = 0} = S;
then S is uniformly asymptotically stable in the sense of Lyapunov.

Lemma 2: If KB is a simple positive semi-definite matrix and rank(KB) = rank(K), then there exist some matrices R = R^T > 0 and P = P^T ≥ 0 such that K = R^{-1}B^T P.

Proof: Since KB is a simple positive semi-definite matrix, there is a nonsingular matrix W such that WKBW^{-1} = Λ = diag{λ_1, ..., λ_m} ≥ 0. Let R be formed as R = W^T ΠW, where the matrix Π satisfies Π = Π^T > 0 and ΠΛ = ΛΠ. Let P = K^T R(RKB)†RK, where (RKB)† is the Moore–Penrose generalized inverse and satisfies

(RKB)(RKB)†(RKB) = RKB. (6)

Since rank(KB) = rank(K), one has

(RKB)(RKB)†RK = RK. (7)
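Lemma 2's construction can be checked numerically; a minimal sketch with Π = I_m, so that R = W^T W and P = K^T R(RKB)†RK, where the particular K and B are arbitrary choices satisfying the hypotheses:

```python
import numpy as np

# Example with m = 1: KB = [[2]] is trivially simple positive definite,
# and rank(KB) = rank(K) = 1.
B = np.array([[0.0],
              [1.0]])
K = np.array([[1.0, 2.0]])

KB = K @ B
W = np.eye(1)                  # KB is already diagonal here
R = W.T @ W                    # R = W' Π W with Π = I
P = K.T @ R @ np.linalg.pinv(R @ KB) @ R @ K   # P = K'R(RKB)†RK

# P is symmetric positive semi-definite and K = R^{-1} B' P, as Lemma 2 asserts.
assert np.allclose(P, P.T)
assert np.all(np.linalg.eigvalsh(P) >= -1e-12)
assert np.allclose(np.linalg.solve(R, B.T @ P), K)
```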

For simplicity, set Π = I_m; then RKB = W^T ΛW is symmetric positive semi-definite, and (RKB)† = W^{-1}Λ†W^{-T}, where Λ† is the diagonal matrix with entries π_i = 1/λ_i if λ_i ≠ 0 and π_i = 0 otherwise. Apparently, (RKB)† is positive semi-definite and P = P^T ≥ 0. Using (7), we obtain K = R^{-1}B^T P.

Remark 1: If KB is simple positive definite, then rank(K) = rank(B) = rank(KB) = m; thus there exists some P = P^T > 0 such that K = R^{-1}B^T P.

Proposition 1: If the following two conditions hold:
1) KB is a simple positive semi-definite matrix and rank(KB) = rank(K), i.e., there exist some matrices R = R^T > 0 and P = P^T ≥ 0 such that K = R^{-1}B^T P;
2) K is stabilized to the null space of P;
then for the IOCP (5), K is optimal and P is a symmetric positive semi-definite solution to the corresponding ARE with some matrices R = R^T > 0 and Q = Q^T.

Proof: Let Q be given as

Q = −A^T P − PA + PBR^{-1}B^T P. (8)

Consider the following performance index:

J(x_0) = ∫_0^∞ (x^T(t)Qx(t) + u^T(t)Ru(t)) dt
       = ∫_0^∞ (x^T(t)Qx(t) + x^T(t)K^T RKx(t)) dt
       = x^T(0)Px(0) − x^T(∞)Px(∞) + ∫_0^∞ x^T(t)(K − R^{-1}B^T P)^T R(K − R^{-1}B^T P)x(t) dt. (9)

Since K is stabilized to the null space of P, lim_{t→∞} Px(t) = 0. The minimum of J(x_0) is achieved when K = R^{-1}B^T P. Therefore, u = −Kx is optimal and (8) is the corresponding ARE.

On the other hand, we have the following conclusion.

Proposition 2: For the IOCP (5), if K is optimal and the corresponding ARE has a symmetric positive semi-definite solution P for some matrices Q = Q^T and R = R^T > 0, then the following holds.
1) K is stabilized to the null space of P.
2) KB is a simple positive semi-definite matrix.

Proof: Let R = T^T T, where T is nonsingular. Combining this with RKB = B^T PB, we have T^{-T}(B^T PB)T^{-1} = T(KB)T^{-1}. Since B^T PB is symmetric positive semi-definite, KB is similar to a symmetric positive semi-definite matrix, i.e., KB is a simple positive semi-definite matrix.

Since K is optimal, it minimizes the following form of quadratic performance index [39]:

J = (1/2) ∫_0^∞ (x^T(t)Qx(t) + u^T(t)Ru(t)) dt. (10)

In addition, the corresponding ARE (4) has the symmetric positive semi-definite solution P. Therefore, the second-order variation of J satisfies the inequality [39]

δ²J = ∫_0^∞ (δx(t)^T Qδx(t) + δu(t)^T Rδu(t)) dt ≥ 0 (11)

where δⁿ(∗) denotes the nth-order variation of the functional ∗. From (9), we have

δ²J = δx^T(0)Pδx(0) − δx^T(∞)Pδx(∞) + ∫_0^∞ (δu(t) − δu*(t))^T R(δu(t) − δu*(t)) dt (12)

where δu*(t) = −R^{-1}B^T Pδx(t). For R > 0, P ≥ 0, and ∀δx(0), δ²J ≥ 0 implies that K is stabilized to the null space of P. If not, A − BK must be unstable; then there exists at least one unstable eigenvalue, denoted by λ, and let ξ be its corresponding eigenvector. For δx(0) = cξ, c > 0, one has δx(t) = ce^{λt}ξ and lim_{t→∞} δx(t) = ∞. Then, unless lim_{t→∞} Pδx(t) = 0, we have lim_{t→∞} δx^T(t)Pδx(t) = ∞ and thus δ²J → −∞. This contradicts the optimality of K.

Proposition 3: For the IOCP (5), K is optimal and the corresponding ARE has a symmetric positive semi-definite solution P for some symmetric matrices Q ≥ 0 and R > 0, if the following two conditions hold.
1) (1/2)K is stabilized to the null space of P.
2) KB is a simple positive semi-definite matrix.

Proof: Using Proposition 1, for the linear system (1a) there exist some matrices P = P^T ≥ 0, R = R^T > 0, and Q̃ = Q̃^T satisfying

(A − (1/2)BK)^T P + P(A − (1/2)BK) + Q̃ + ((1/2)K)^T (2R)((1/2)K) = 0 (13a)
B^T P = RK. (13b)

The state feedback control law u* = −(1/2)Kx minimizes the corresponding performance index J(x_0; u) = ∫_0^∞ (x^T Q̃x + 2u^T Ru) dt.



It is readily known that the optimal value is given as

J*(x_0) = x_0^T Px_0 = ∫_0^∞ x^T (Q̃ + (1/2)K^T RK)x dt (14)

for any x_0 ∈ R^n, x_0 ≠ 0. Since P ≥ 0, (14) indicates that Q̃ + (1/2)K^T RK ≥ 0. Let Q = Q̃ + (1/2)K^T RK. Subtracting K^T RK from both sides of (13a) yields

(A − BK)^T P + P(A − BK) = −Q − K^T RK ≤ 0. (15)

According to Lemma 1, K is stabilized to the null space of P. Since KB is a simple positive semi-definite matrix, K is optimal using Proposition 1, and (15) is the ARE with respect to the state feedback control u = −Kx, which indicates that K is LQ optimal for Q = Q^T ≥ 0 and R = R^T > 0.

For the case where the ARE has a symmetric positive definite solution P, we just give the conclusion; the proof is similar and omitted.

Proposition 4: For the IOCP (5), K is optimal and the corresponding ARE has a symmetric positive definite solution P for some Q = Q^T and R = R^T > 0, if and only if the following holds.
1) A − BK is Hurwitz.
2) KB is a simple positive definite matrix.

Remark 2: If the state weighting matrix Q is restricted to be positive definite, then condition 1) should be replaced by: A − (1/2)BK is Hurwitz.

III. GLOBALLY OPTIMAL DISTRIBUTED CONSENSUS PROTOCOLS DESIGN

In this section, the globally optimal cooperative control problems are considered for multiagent systems with a group of N nodes, distributed on a directed communication graph G, which have the following identical linear time-invariant dynamics:

ẋ_i = Ax_i + Bu_i, ∀i ∈ N (16)

where the state x_i ∈ R^n and the input u_i ∈ R^m. Assume that (A, B) is controllable and the input matrix B is of full column rank m. The optimality of K is invariant under any state coordinate transformation x = Uz of the system (1a), in the sense that K is optimal for the system (1a) if and only if the control law KU is optimal for the transformed system ż = U^{-1}AUz + U^{-1}Bu [38]. Without loss of generality, we assume in this section that the system matrices A and B have the following form:

A = [A_11 A_12; A_21 A_22],  B = [0; B_2] (17)

where A_11 ∈ R^{(n−m)×(n−m)}, A_22 ∈ R^{m×m}, and B_2 ∈ R^{m×m}. Note that B_2 is nonsingular and the matrix pair (A_11, A_12) is controllable [41]. The global form of (16) follows:

ẋ = (I_N ⊗ A)x + (I_N ⊗ B)u (18)

where the state x = (x_1^T, x_2^T, ..., x_N^T)^T ∈ R^{nN} and the input u = (u_1^T, u_2^T, ..., u_N^T)^T ∈ R^{mN}. In the following, the Laplacian matrix L is restricted to be positive semi-definite.

A. Globally Optimal Leader Following Consensus Problem

The globally optimal leader following consensus problem is to design distributed consensus protocols u_i, ∀i ∈ N, such that all nodes of the multiagent system synchronize to the state trajectory of the leader node and simultaneously optimize some global performance indexes. The dynamics of the leader, labeled 0, are given by

ẋ_0 = Ax_0 (19)

where x_0 ∈ R^n is the state. The leader can be considered as a command generator which generates the desired target trajectory. The leader node can be observed from a small subset of nodes in graph G. If node i observes the leader, then an edge (v_0, v_i) is said to exist with weighting gain g_i > 0. A node with g_i > 0 is referred to as a pinned or controlled node. Denote the pinning matrix as G = diag{g_1, ..., g_N}.

Assumption 1: The digraph G contains a spanning tree, and the root node i_r can observe information from the leader node, i.e., g_{i_r} > 0.

Remark 3: Assumption 1 indicates that all eigenvalues of the matrix L + G have positive real part [40].

The local neighborhood error is defined as

ε_i = Σ_{j∈N_i} a_ij(x_i − x_j) + g_i(x_i − x_0) (20)
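The spectral property in Remark 3 can be checked numerically for a given topology; a sketch in which the 3-node spanning-tree digraph and the pinning gains are arbitrary examples:

```python
import numpy as np

# Directed spanning tree 1 -> 2 -> 3, with the root (node 1) pinned.
A_adj = np.array([[0.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])
L = np.diag(A_adj.sum(axis=1)) - A_adj
G = np.diag([1.0, 0.0, 0.0])    # g_1 > 0: only the root observes the leader

# Remark 3: all eigenvalues of L + G have positive real part.
assert np.all(np.linalg.eigvals(L + G).real > 0)
```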

and the global neighborhood tracking error is ξ = [(L + G) ⊗ I_n]δ, where the global disagreement error is δ = x − 1_N ⊗ x_0 ∈ R^{nN}. The distributed consensus protocol in this paper is a state variable feedback control [18]

u_i = −cKε_i (21)

where the scalar coupling gain c > 0 and the feedback control gain matrix K ∈ R^{m×n}. The global form of the distributed consensus protocol is given as

u = −c[(L + G) ⊗ K]δ (22)

where G = diag{g_1, ..., g_N} is the pinning matrix. The global system using the protocol (22) is given as

ẋ = (I_N ⊗ A)x − c[(L + G) ⊗ (BK)]δ (23)

hence we have the global error system

δ̇ = (I_N ⊗ A)δ + (I_N ⊗ B)u. (24)

Using the protocol (22), the global closed-loop error system is formed as

δ̇ = [I_N ⊗ A − c(L + G) ⊗ (BK)]δ. (25)

To achieve synchronization, (24) or (25) must be asymptotically stabilized to the origin.

Lemma 3 [1]: Let λ_i (i ∈ N) be the eigenvalues of the matrix L + G. The global closed-loop system (25) is asymptotically stable if and only if all matrices

A − cλ_i BK, ∀i ∈ N (26)

are Hurwitz, i.e., asymptotically stable.
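Lemma 3 reduces one nN-dimensional stability test to N small ones; a sketch in which the system matrices, gain, and topology are arbitrary examples:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
K = np.array([[1.0, 2.0]])
c = 1.0

# Example L + G with eigenvalues {1, 2} (edge 1 -> 2, both nodes pinned).
L_plus_G = np.array([[ 1.0, 0.0],
                     [-1.0, 2.0]])

# Global test: I_N ⊗ A - c(L+G) ⊗ (BK) Hurwitz ...
Acl = np.kron(np.eye(2), A) - c * np.kron(L_plus_G, B @ K)
global_ok = np.all(np.linalg.eigvals(Acl).real < 0)

# ... iff each A - c*lam_i*BK is Hurwitz (Lemma 3).
local_ok = all(np.all(np.linalg.eigvals(A - c * lam * B @ K).real < 0)
               for lam in np.linalg.eigvals(L_plus_G))
assert global_ok == local_ok
```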


Theorem 1: For the global error system (24), assume that the graph contains a spanning tree with at least one nonzero pinning gain connecting into a root node; then there exist some distributed consensus protocols of the form (22) which are optimal with respect to some global quadratic performance indexes

J = ∫_0^∞ (δ^T Q̄δ + u^T R̄u) dt (27)

with Q̄ = Q̄^T > 0 and R̄ = R̄^T > 0, if and only if L + G is a simple positive definite matrix.

Proof: Necessity: Using Proposition 4, if u = −c[(L + G) ⊗ K]δ is optimal, then the matrix [(L + G) ⊗ K](I_N ⊗ B) is a simple positive definite matrix; hence there exists a nonsingular matrix X ∈ R^{mN×mN} such that

(L + G) ⊗ (KB) = X^{-1}ΘX (28)

where Θ = diag{κ_1, ..., κ_mN}, κ_j > 0, j = 1, ..., mN. The Jordan decompositions of L + G and KB are given as

L + G = Y^{-1}J_1 Y (29)

and

KB = Z^{-1}J_2 Z (30)

where J_1 ∈ R^{N×N} is an upper triangular matrix with its eigenvalues {λ_1, ..., λ_N} on the diagonal, Y ∈ R^{N×N} is nonsingular, J_2 ∈ R^{m×m} is an upper triangular matrix with its eigenvalues {μ_1, ..., μ_m} on the diagonal, and Z ∈ R^{m×m} is nonsingular. Then it follows from (28) that

(Y ⊗ Z)^{-1}(J_1 ⊗ J_2)(Y ⊗ Z) = X^{-1}ΘX. (31)

We readily see that J_1 and J_2 must be simple; hence L + G is simple. Since the graph contains a spanning tree with at least one nonzero pinning gain connecting into a root node, all eigenvalues of L + G have positive real part. Suppose that L + G has a pair of complex roots α_t ± β_t j, where α_t > 0, β_t ≠ 0 ∈ R, and j is the imaginary unit with j² = −1. Let μ_s be an eigenvalue of KB; then μ_s(α_t ± β_t j) must be two eigenvalues of Θ, and we have

κ_p = μ_s(α_t + β_t j) = μ_s α_t + μ_s β_t j > 0 (32)
κ_q = μ_s(α_t − β_t j) = μ_s α_t − μ_s β_t j > 0. (33)

It is easily seen that β_t = 0 and μ_s > 0, so the eigenvalues of L + G must be real and positive, which means L + G is positive definite. Therefore, L + G is a simple positive definite matrix.

Sufficiency: First, we prove that there exists a K = [K_1 K_2] ∈ R^{m×n} such that (L + G) ⊗ (KB) is simple positive definite. Since L + G is simple positive definite, there exists a simple positive definite matrix Π ∈ R^{m×m} (say Π = I_m) such that (L + G) ⊗ Π is simple positive definite. Let K_2 = B_2^{-1}Π; then for any K_1 ∈ R^{m×(n−m)}

K = [K_1  B_2^{-1}Π] (34)

makes (L + G) ⊗ (KB) a simple positive definite matrix.

Consider a similarity transformation of A − cλ_i BK, i ∈ N. Let

T = [I 0; S I] (35)

then it holds that

T [A_11, A_12; A_21 − cλ_i B_2 K_1, A_22 − cλ_i B_2 K_2] T^{-1} = [A_11 − A_12 S, A_12; Δ, SA_12 + A_22 − cλ_i B_2 K_2] (36)

where Δ = SA_11 + A_21 − cλ_i B_2 K_1 − (SA_12 + A_22 − cλ_i B_2 K_2)S. Set Δ = 0; then the eigenvalues of A − cλ_i BK are the n − m eigenvalues of A_11 − A_12 S together with the m eigenvalues of SA_12 + A_22 − cλ_i B_2 K_2. Denote the eigenvalues of A − cλ_i BK by ω_j^i (j = 1, ..., n). Let Ω_{n−m} have the diagonal elements {ω_1^i, ..., ω_{n−m}^i} and Ω_m have the diagonal elements {ω_{n−m+1}^i, ..., ω_n^i}; then there exist a matrix S and nonsingular matrices T_1 ∈ R^{(n−m)×(n−m)} and T_2 ∈ R^{m×m} such that

A_11 − A_12 S = T_1 Ω_{n−m} T_1^{-1} (37a)
SA_12 + A_22 − cλ_i B_2 K_2 = T_2 Ω_m T_2^{-1}. (37b)

Then (37b) can also be written as

T_2 Ω_m T_2^{-1} = SA_12 + A_22 − cλ_i B_2 B_2^{-1}Π. (38)

The matrix S can always be selected such that the n − m eigenvalues {ω_1^i, ..., ω_{n−m}^i} of A − cλ_i BK lie in desired locations in the left half complex plane, due to the controllability of (A_11, A_12), where λ_i denotes the ith eigenvalue of L + G, i = 1, ..., N. For Π > 0 and λ_i > 0, the matrix λ_i B_2 B_2^{-1}Π is positive definite, which implies that, for a given positive number c_min, there exists a suitable Π (say Π = γI_m, where γ > 0 is sufficiently large) such that, ∀c ≥ c_min, all the eigenvalues of SA_12 + A_22 − cλ_i B_2 K_2 are located in the left half complex plane for any given S. In view of this point, set c_min = 1/λ, λ = min_{i∈N} λ_i, let Π be selected such that all the eigenvalues of SA_12 + A_22 − c_min λB_2 K_2 are placed in the left half complex plane, and let K_1 be obtained by setting Δ in (36) to zero; then A − c_min λBK = A − BK is Hurwitz.

Now we prove that there exist some c and Π such that A − cλ_i BK is Hurwitz, i = 1, ..., N. From Remark 1, since K is formed as in (34), there exist some R = R^T > 0 and P = P^T > 0 such that B^T P = RK. Since A − BK is Hurwitz, we have (A − BK)^T P + P(A − BK) < 0, similar to Proposition 3. Let V(x) = x^T Px and take its derivative along the state trajectory ẋ = (A − cλ_i BK)x, ∀c ≥ c_min:

V̇(x) = x^T [(A − cλ_i BK)^T P + P(A − cλ_i BK)]x
     = x^T [(A − BK)^T P + P(A − BK) + 2(1 − cλ_i)K^T RK]x
     < 2(1 − cλ_i)x^T K^T RKx

then we obtain V̇(x) < 0 since cλ_i ≥ 1; thus A − cλ_i BK is Hurwitz, ∀i ∈ N. Therefore, such selections of S, Π, and c ≥ c_min indicate that c(L + G) ⊗ K is asymptotically stabilized to the origin



according to Lemma 3. Using Proposition 4, c[(L + G) ⊗ K] is an optimal feedback gain.

According to Proposition 4 and Remark 2, for the system (24), the feedback gain c[(L + G) ⊗ K] is optimal for some Q̄ = Q̄^T > 0 if I_N ⊗ A − c[(L + G) ⊗ (BK)]/2 is Hurwitz, which can be easily satisfied by setting c ≥ 2c_min. Set c_min = 1/λ; then Π in (34) must be selected such that SA_12 + A_22 − B_2 B_2^{-1}Π is Hurwitz. For simplicity, let Π = γI_m, γ > 0; then the eigenvalues of SA_12 + A_22 − B_2 B_2^{-1}Π are exactly λ(SA_12 + A_22) − γ, where λ(SA_12 + A_22) denotes the eigenvalues of SA_12 + A_22. Therefore, γ is simply selected such that λ(SA_12 + A_22) − γ are located in the desired region in the left half complex plane. This leads to a fairly simple parameterization: K_2 = γB_2^{-1}, and K_1 is solved from Δ = 0 as K_1 = B_2^{-1}(SA_11 + A_21 − SA_12 S − A_22 S + γS). The parameterization of the optimal distributed protocol gain K is given as

K = B_2^{-1}[SA_11 + A_21 − SA_12 S − A_22 S + γS   γI_m]. (39)

Remark 4: The selection c_min = 1/λ is of important significance. For any c ≥ c_min, the parameters S and γ in (39) can be selected simply to guarantee the stability of each agent (A − BK is Hurwitz), without considering the graph topology. That is to say, the design of K is decoupled from the details of the communication graph structure.

In the sufficiency proof of Theorem 1, it is readily seen that a local feedback gain K is a candidate if KB is simple positive definite and A − BK is Hurwitz. In view of Proposition 4, such a local gain can be obtained by the LQR optimal design method, as in [20]. Let Q = Q^T > 0 and R = R^T > 0, and design the state variable feedback control gain as

K = R^{-1}B^T P (40)

where P is the unique positive definite solution of the ARE

A^T P + PA + Q − PBR^{-1}B^T P = 0. (41)

To ensure the asymptotic stability of (24), using [18, Th. 1], the coupling gain satisfies c ≥ 1/(2λ). According to Proposition 4 and Remark 2, we have the following corollary.

Corollary 1 (LQR Optimal Design Method): For the global error system (24), assume that the graph contains a spanning tree with at least one nonzero pinning gain connecting into a root node; then the distributed protocol (22) with the gain K given by (40) is optimal for some global quadratic performance indexes

J = ∫_0^∞ (δ^T Q̄δ + u^T R̄u) dt (42)

with Q̄ = Q̄^T > 0 and R̄ = R̄^T > 0, if:
1) L + G is simple positive definite;
2) the coupling gain c ≥ 1/λ;
where λ = min_{i∈N} λ_i and λ_i denote the eigenvalues of L + G.

Remark 5: The lower bound of the coupling gain c proposed in Corollary 1 is fairly simple compared with that given in [20] and is valid for any choice of Q = Q^T > 0 and R = R^T > 0 in the ARE (41).
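The parameterization (39) can be implemented directly once S and γ are chosen; a sketch for a double-integrator system written in the form (17), where all numerical values are arbitrary design choices:

```python
import numpy as np

# Form (17) with n = 2, m = 1: double integrator, B2 nonsingular.
A11, A12, A21, A22, B2 = 0.0, 1.0, 0.0, 0.0, 1.0
A = np.array([[A11, A12],
              [A21, A22]])
B = np.array([[0.0],
              [B2]])

S, gamma = 2.0, 3.0   # S places the eigenvalue of A11 - A12*S at -2; gamma = 3
K1 = (S * A11 + A21 - S * A12 * S - A22 * S + gamma * S) / B2
K2 = gamma / B2
K = np.array([[K1, K2]])   # eq. (39) with Π = γ I_m

# The eigenvalues of A - BK are {A11 - A12*S, S*A12 + A22 - gamma} = {-2, -1}.
eigs = np.sort(np.linalg.eigvals(A - B @ K).real)
assert np.allclose(eigs, [-2.0, -1.0])
```

Note how the two closed-loop eigenvalues are assigned independently: S shapes the (n − m)-dimensional block and γ shapes the remaining m-dimensional block, with no reference to the graph.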

B. Globally Optimal Leaderless Consensus Problem

For the leaderless consensus problem, all agents are to reach the same state, i.e., ‖x_i − x_j‖ → 0 as t → ∞, ∀i, j ∈ N. The globally optimal leaderless consensus problem is to design distributed consensus protocols u_i, ∀i ∈ N, such that consensus is reached and some global performance indexes are optimized, simultaneously. The local neighborhood error is defined as

ε_i = Σ_{j∈N_i} a_ij(x_i − x_j) (43)

and the global neighborhood error is given as ξ = (L ⊗ I_n)x. Consider the distributed consensus protocol of the form

u_i = −cKε_i (44)

where the scalar coupling gain c > 0 and the feedback control gain matrix K ∈ R^{m×n}. The global form of the distributed consensus protocol is

u = −c(L ⊗ K)x (45)

which gives the global closed-loop system

ẋ = [I_N ⊗ A − cL ⊗ (BK)]x. (46)
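A quick numerical check that the leaderless closed loop (46) reaches consensus reduces, as before, to a per-eigenvalue Hurwitz test over the nonzero Laplacian eigenvalues; a sketch using an undirected path graph (so L is symmetric positive semi-definite), with arbitrary example matrices:

```python
import numpy as np

# Undirected path graph on 3 nodes: L symmetric PSD, eigenvalues {0, 1, 3}.
L = np.array([[ 1.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  1.0]])
lams = np.linalg.eigvalsh(L)

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
K = np.array([[1.0, 2.0]])
c = 1.0 / min(l for l in lams if l > 1e-9)   # coupling gain c >= 1/(min positive eigenvalue)

# Consensus holds when A - c*lam*BK is Hurwitz for every nonzero eigenvalue of L.
assert all(np.all(np.linalg.eigvals(A - c * lam * B @ K).real < 0)
           for lam in lams if lam > 1e-9)
```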

Assume that the graph G is strongly connected; then λ_1 = 0 is a simple eigenvalue of the Laplacian matrix L, and the other N − 1 eigenvalues λ_i, i = 2, ..., N, have positive real part. The main results are stated as follows.

Theorem 2: For the global closed-loop system (46), there exist some controllers of the form (45) with K given by (34), which are optimal with respect to some global quadratic performance indexes

J = ∫_0^∞ (x^T Q̄x + u^T R̄u) dt (47)

with Q̄ = Q̄^T ≥ 0 and R̄ = R̄^T > 0, if and only if L is a simple positive semi-definite matrix. Moreover, the optimal controller (45) is stabilized to the null space of L ⊗ I_n; since the graph G is strongly connected, the consensus of the agents is then reached.

Proof: Sufficiency: The Laplacian matrix L is simple, so there exists a nonsingular matrix T such that L = T^{-1}J_1 T, where J_1 is a diagonal matrix of the eigenvalues of L. Then there exists a K of the form (34) such that cL ⊗ (KB) is simple positive semi-definite. Since KB is simple positive definite, according to Remark 1, there exist symmetric positive definite matrices R and P such that

K = R^{-1}B^T P (48)

then it follows that

cL ⊗ K = c(I_N ⊗ R^{-1})(I_N ⊗ B)^T (L ⊗ P) = (R_1 ⊗ R)^{-1}(I_N ⊗ B)^T [(cR_1 L) ⊗ P] (49)

where R_1 = T^T T. Obviously, rank(cL ⊗ K) = rank(cL ⊗ (KB)) = (N − 1)m. In the following, we prove that cL ⊗ K is asymptotically stabilized to the null space of L ⊗ I_n.


Let c ≥ c_min = 1/λ and K be given by (39), where λ is the minimum positive eigenvalue of L, and S and γ are selected such that A − BK is Hurwitz. From the sufficiency proof of Theorem 1, we see that ∀c ≥ c_min, A − cλ_i BK is Hurwitz and (A − cλ_i BK)^T P + P(A − cλ_i BK) < 0, i = 2, ..., N. Let P̄ = c(R_1 L) ⊗ P = c(T^T J_1 T) ⊗ P; then ker(P̄) = ker(L ⊗ I_n). Consider the following quadratic function:

V(x) = x^T P̄x = cx^T [(T^T J_1 T) ⊗ P]x. (50)

Taking the derivative along the trajectory (46) yields

V̇(x) = x^T P̄[I_N ⊗ A − cL ⊗ (BK)]x + x^T [I_N ⊗ A − cL ⊗ (BK)]^T P̄x (51)
     = cx^T (T ⊗ I_n)^T {[J_1 ⊗ (PA) − cJ_1² ⊗ (PBK)] + [J_1 ⊗ (PA) − cJ_1² ⊗ (PBK)]^T}(T ⊗ I_n)x. (52)

Since the graph G is assumed to be strongly connected, J_1 is diagonal with N − 1 positive eigenvalues and a simple zero eigenvalue, and the matrix T can be selected such that J_1 = diag{0, λ_2, ..., λ_N}. Then we have

V̇(x) = cx^T (T ⊗ I_n)^T diag{0, Ξ_2, ..., Ξ_N}(T ⊗ I_n)x (53)

where Ξ_i = λ_i[(A − cλ_i BK)^T P + P(A − cλ_i BK)] < 0, i = 2, ..., N. Obviously, V̇(x) ≤ 0.

Now we prove that {x ∈ R^{nN} | V̇(x) = 0} = span{1_N ⊗ η}, where η ∈ R^n.

Since ker(P̄) = ker(L ⊗ I_n), it is easily seen that V̇(x) = 0 if x = 1_N ⊗ η, so span{1_N ⊗ η} ⊆ {x ∈ R^{nN} | V̇(x) = 0}. On the other hand, V̇(x) = 0 if and only if

(T ⊗ I_n)x = [1, 0, ..., 0]^T_{N×1} ⊗ ξ (54)

where ξ ∈ R^n. Then we have

P̄x = c(T ⊗ I_n)^T (J_1 ⊗ P)(T ⊗ I_n)x (55)
   = c(T ⊗ I_n)^T [diag{0, λ_2, ..., λ_N} ⊗ P]([1, 0, ..., 0]^T_{N×1} ⊗ ξ) = 0 (56)

hence x ∈ ker(L ⊗ I_n), i.e., {x ∈ R^{nN} | V̇(x) = 0} ⊆ span{1_N ⊗ η}. Therefore, span{1_N ⊗ η} = {x ∈ R^{nN} | V̇(x) = 0}.


Using Lemma 1, cL ⊗ K stabilizes the state to the null space of L ⊗ In. Therefore, all conditions of Proposition 1 are satisfied; the controller (45) is optimal and stabilizes the state to the null space of L ⊗ In. As a result, the consensus of the agents is reached. If the state weighting matrix Q̄ is restricted to be positive semi-definite, one just needs to set c ≥ 2cmin according to Proposition 3.

Necessity: If u = −c(L ⊗ K)x is optimal, then L ⊗ (KB) is simple positive semi-definite by Proposition 2. Similar to the necessity analysis of Theorem 1, we obtain that L is simple positive semi-definite.

Since a local feedback gain K can be readily obtained by using the LQR optimal design method, similar to Corollary 1, a lower bound of the coupling gain c is given in the following corollary.

Corollary 2 (LQR Optimal Design Method): The protocol (45) with the gain K given by (40) is optimal for some global quadratic performance indexes

J = ∫₀^∞ (x^T Q̄ x + u^T R̄ u) dt   (57)

with Q̄ = Q̄^T ≥ 0 and R̄ = R̄^T > 0, if:
1) L is simple positive semi-definite;
2) the coupling gain c ≥ 1/λ;
where λ = min_{i∈N} λi and λi denote the positive eigenvalues of L.

Remark 6: It is worth pointing out that if the graph is undirected, then the constraint on the graph topology is naturally satisfied, since the Laplacian matrix L is symmetric positive semi-definite; thus, all conclusions remain valid.

C. Consensus Performance of the Agents

The main advantage of the LQR-based design methods seems to be their simple implementation. From a practical point of view, however, the desired consensus performance of the multiagent system is of great significance. It is well known that the transient behaviors of a system depend on the locations of its closed-loop poles. Relationships between the weight selection and pole placement have been extensively studied; see [42] and [43], to name just a few. The drawback of these methods is their complexity, which may create computational problems, especially for large-scale multiagent systems. Moreover, since the graph topology interplays with the system dynamics, the locations of the closed-loop poles of the global system are hard to determine even when the poles of each agent system are placed in a specified region. Therefore, global information is necessary to investigate the transient behaviors of multiagent systems. To confront this problem, we develop a novel and simple distributed design scheme by means of the inverse optimal design methods proposed in the above subsections, such that the resulting multiagent systems reach the desired consensus performance asymptotically.

In [35], two indexes with respect to the closed-loop eigenvalues are proposed to evaluate the consensus performance of the agents.
1) Convergence Rate: The nonzero eigenvalue ωρ with minimum absolute value of real part ρ.



2) Damping Rate: The eigenvalue ωθ with maximum argument θ from the negative direction of the real axis.

The convergence rate is used to evaluate the convergence speed of the agents, and the damping rate is used to evaluate their oscillating behaviors. According to Lemma 3, for leader following consensus problems, it is necessary to investigate the locations of the eigenvalues of A − cλiBK, i = 1, . . . , N. Similarly, for leaderless consensus problems, the consensus performance relies on the locations of the eigenvalues of A − cλiBK, i = 2, . . . , N. The following theorem shows the asymptotic behavior of the eigenvalues.

Theorem 3: If K is given by (39), and S is selected such that n − m eigenvalues of A − BK are placed at {ω1d, . . . , ωn−md} in the left half complex plane, then as γ → +∞:
1) there are n − m eigenvalues of A − cλiBK following ωji → ωjd, j = 1, . . . , n − m;
2) there are m eigenvalues of A − cλiBK following ωji → −γcλi, j = n − m + 1, . . . , n;
where cλi ≥ 1, ∀i ∈ N.

Proof: Consider the following dynamic system:

δ̇i = (A − cλiBK) δi.   (58)

Note that the corresponding term in (36) is zero. Using the transformation zi = Tδi, where zi = [z1i^T, z2i^T]^T and T is defined in (35), we have

T (A − cλiBK) T^{-1} = [ A11 − A12S                                A12
                         (1 − cλi)(SA11 + A21 − SA12S − A22S)      SA12 + A22 − γcλi Im ]
                     ≡ [ H11   H12
                         H21   H22 − γcλi Im ].

Let μ = γ^{-1}; then (58) is transformed into

ż1i = H11 z1i + H12 z2i   (59a)
μż2i = μH21 z1i + (μH22 − cλi Im) z2i.   (59b)

Note that (59a) and (59b) have the same form as [41, eqs. (13) and (14)]. As γ → +∞, μ → 0, and the conclusion follows from a development similar to [41, Th. 1].

Theorem 3 indicates that all eigenvalues of A − cλiBK asymptotically tend to the eigenvalues of A − BK as γ → +∞. Obviously, the n − m finite eigenvalues {ω1d, . . . , ωn−md} are dominant eigenvalues, since they completely determine the convergence rate and the damping rate of the agents. Therefore, the desired consensus performance of the agents can be reached asymptotically by specifying the set of eigenvalues {ω1d, . . . , ωn−md} of each agent, which yields the following optimal distributed cooperative design procedures.

Procedure 1: Inverse optimal design for optimal leader following consensus problems.
1) Record the minimum eigenvalue λ of L + G, and set c = 1/λ.
2) Design S such that A11 − A12S places the n − m eigenvalues {ω1d, . . . , ωn−md} at the specified locations.
3) Obtain K by (39) and let γ → +∞.

Fig. 1. Communication topology.

4) Determine the optimal distributed consensus protocols by (22).

Procedure 2: Inverse optimal design for optimal leaderless consensus problems.
1) Record the minimum positive eigenvalue λ of L, and set c = 1/λ.
2) Design S such that A11 − A12S places the n − m eigenvalues {ω1d, . . . , ωn−md} at the desired locations.
3) Obtain K by (39) and let γ → +∞.
4) Determine the optimal distributed consensus protocols by (45).

Remark 7: Note that if the convergence process of ωji → ωjd (j = 1, . . . , n − m) is dramatically slow, then γ may need to be large, which will lead to high-gain controllers. Therefore, in practical applications, a tradeoff must be made between acceptable accuracy of the desired eigenvalue locations and the possibly high gain of the resulting controllers.

IV. SIMULATIONS

In this section, two numerical examples are given to show the design procedures of the optimal distributed protocols and to demonstrate the advantage of the developed design methods.

Example 1 (Leaderless Case): Consider the following multiagent system with six nodes:

ẋi = Axi + Bui, i = 1, . . . , 6   (60)

where

A = [ −1   2   5
       1  −1   2
      −5   0  −1 ],   B = [0, 0, 1]^T.   (61)

The initial state of each agent is generated randomly in [0, 1], and the communication topology is described in Fig. 1. The minimum positive eigenvalue of L is λ = 0.8340; hence, the coupling gain is set as c = 1.1991, which also satisfies the coupling gain selection criterion in [20]. By choosing the weighting matrices Q = I3 and R = 1, the LQR-based optimal feedback gain is given by (40), and the evolutionary process of consensus is shown in Fig. 2. The consensus is reached within 8 s. The corresponding convergence rate and damping rate are shown in Table I. Now we aim to design the optimal distributed consensus protocols (45) such that the resulting global closed-loop


Fig. 2. Consensus process using LQR optimal based distributed consensus protocols: Q = I3 , R = 1.
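As a rough check of the LQR-based local design in Example 1, a gain of the form K = R^{-1}B^T P in (40) can be reproduced with a standard continuous-time algebraic Riccati equation solver. The sketch below uses only the agent matrices given in the text (the Laplacian of Fig. 1 is not reproduced here), and the sampled values of cλi are illustrative:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Agent dynamics of Example 1.
A = np.array([[-1.,  2.,  5.],
              [ 1., -1.,  2.],
              [-5.,  0., -1.]])
B = np.array([[0.], [0.], [1.]])
Q, R = np.eye(3), np.array([[1.]])

# LQR gain K = R^{-1} B^T P, with P the stabilizing CARE solution.
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)

# With c = 1/λ, every product cλ_i is at least 1, and the classical LQR gain
# margin [1/2, ∞) then keeps A − cλ_i BK Hurwitz for each positive λ_i.
# The scale factors below are illustrative (cλ ≈ 1.1991 × 0.8340 ≈ 1).
for scale in (1.0, 1.1991 * 0.8340, 3.0):
    assert np.linalg.eigvals(A - scale * (B @ K)).real.max() < 0
```

The loop checks the single-agent closed loop only; the graph-dependent modes follow from the same matrices scaled by the positive Laplacian eigenvalues.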


Fig. 3. Consensus process using inverse optimal based optimal distributed consensus protocols: {−3 ± 0.5j}, γ = 500.

TABLE I C ONSENSUS P ERFORMANCE U SING LQR O PTIMAL D ESIGN AND I NVERSE O PTIMAL D ESIGN
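The two indexes reported in Table I can be computed directly from a closed-loop spectrum: the convergence rate is the minimum |Re(ω)| over the nonzero eigenvalues, and the damping rate is the maximum argument measured from the negative real axis. A minimal sketch (the sample eigenvalues below are made up for illustration, not the paper's computed values):

```python
import math

def consensus_performance(eigs, tol=1e-9):
    """Return (rho, theta): convergence rate and damping rate of a spectrum."""
    nonzero = [w for w in eigs if abs(w) > tol]
    rho = min(abs(w.real) for w in nonzero)              # convergence rate
    theta = max(math.atan2(abs(w.imag), -w.real) for w in nonzero)  # damping rate
    return rho, theta

# Hypothetical spectrum: a dominant pair at -3 ± 0.5j plus one fast real pole.
rho, theta = consensus_performance(
    [complex(-3, 0.5), complex(-3, -0.5), complex(-10, 0)])
print(rho, theta)  # rho = 3.0, theta = atan2(0.5, 3) ≈ 0.165 rad
```

A smaller θ means less oscillatory transients, which is why placing the dominant pair deeper into the left half plane improves both indexes at once.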

Fig. 4. Two-mass-spring system.

where system (46) asymptotically achieves the desired convergence rate and damping rate specified by {−3 ± 0.5j}. This can be simply implemented by running Procedure 2. To show the asymptotic properties, we set γ = 10, 100, and 500, respectively. The corresponding convergence rates and damping rates are shown in Table I. The desired consensus performance is asymptotically reached, and the resulting evolutionary process of consensus (γ = 500) is shown in Fig. 3. Compared with the agents' behaviors in Fig. 2, the consensus is obviously reached faster in Fig. 3 (within 2 s), and the oscillating behaviors are also improved; these facts verify the superiority of the developed inverse optimal based design methods.

Example 2 (Leader Following Case: Two-Mass-Spring System): This example is taken from [3] with some modifications. It is well known that many industrial applications can be modeled as mass-spring systems. The two-mass-spring system with a single force input is considered and shown in Fig. 4, where m1 and m2 are the two masses, k1 and k2 are the spring constants, the force input u acts on mass 1, and y1 and y2 denote the displacements of the two masses, respectively. Define the state vector x = [x1, x2, x3, x4]^T = [y2, ẏ2, y1, ẏ1]^T; then the two-mass-spring system can be modeled as

ẋ = Ax + Bu   (62)

where

A = [ 0        1   0             0
      −k2/m2   0   k2/m2         0
      0        0   0             1
      k2/m1    0   −(k1+k2)/m1   0 ],   B = [0, 0, 0, 1/m1]^T.   (63)
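As a sanity check on the model (63): with no damping terms, the open-loop two-mass-spring system has a purely imaginary spectrum, so the unforced leader produces a sustained oscillatory reference trajectory. A sketch with the parameter values used in this example:

```python
import numpy as np

# Two-mass-spring parameters of Example 2.
m1, m2, k1, k2 = 0.8, 1.2, 1.5, 1.0

# State x = [y2, y2', y1, y1']; model (63).
A = np.array([[0.,       1., 0.,            0.],
              [-k2/m2,   0., k2/m2,         0.],
              [0.,       0., 0.,            1.],
              [k2/m1,    0., -(k1+k2)/m1,   0.]])
B = np.array([[0.], [0.], [0.], [1. / m1]])

# No damping: every eigenvalue lies on the imaginary axis.
eigs = np.linalg.eigvals(A)
print(np.round(eigs, 4))
assert np.allclose(eigs.real, 0.0, atol=1e-8)
```

This is the expected behavior of any undamped mechanical system M q̈ + K q = 0 with symmetric positive definite stiffness, which is why the followers need the distributed protocol to lock onto the leader's oscillation.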

Let one unforced two-mass-spring system be the leader node, producing a desired state trajectory, and let six other two-mass-spring systems act as follower nodes. These six nodes can obtain state information from their neighbors through the communication topology given in Fig. 5. Let ui, yi,1, and yi,2 be the force input, the displacement of mass 1, and the displacement of mass 2 of the ith node, respectively, where i = 0, 1, . . . , 6. The objective is to design distributed optimal protocols for the follower nodes such that the displacements of the two masses synchronize to those of the leader node. Set m1 = 0.8 kg, m2 = 1.2 kg, k1 = 1.5 N/m, and k2 = 1 N/m. The minimum eigenvalue of L + G is λ = 0.4544; hence, the coupling gain is set as c = 2.2007, which also satisfies the coupling gain selection criterion in [20]. By choosing Q = I4 and R = 1, the LQR-based optimal feedback gain is given by (40), and the evolutionary process of consensus is shown in Fig. 6. The consensus is reached within 15 s; the corresponding convergence rate and damping rate are shown in Table II. To obtain better consensus performance, we run Procedure 1 to obtain the optimal distributed consensus protocols (22) such that the resulting global error closed-loop



TABLE II C ONSENSUS P ERFORMANCE U SING LQR O PTIMAL D ESIGN AND I NVERSE O PTIMAL D ESIGN

Fig. 5. Communication topology.

Fig. 7. Consensus process using inverse optimal based optimal distributed consensus protocols: {−0.5 ± 1.2j, −0.5}, γ = 100.

Fig. 6. Consensus process using LQR based optimal distributed consensus protocols: Q = I4 , R = 1.

system (24) asymptotically reaches the desired convergence rate and damping rate specified by {−0.5 ± 1.2j, −0.5} and, for comparison purposes, by {−0.8 ± 1.2j, −0.8}. We set γ = 100, and the corresponding convergence rates and damping rates are shown in Table II. The desired consensus performance is asymptotically reached, and the resulting evolutionary processes of consensus are shown in Figs. 7 and 8, respectively. Compared with the agents' behaviors in Figs. 6 and 7, the consensus is reached faster (within 5 s) in Fig. 8, and the oscillating behaviors of the agents are also more desirable, which indicates that all states y1, ẏ1, y2, ẏ2 can synchronize to the leader node more rapidly and steadily. The above facts show the effectiveness of the developed inverse optimal based design methods.

V. CONCLUSION

In this paper, the inverse optimal approach has been employed to design distributed consensus protocols that guarantee consensus and global optimality for identical linear

Fig. 8. Consensus process using inverse optimal based optimal distributed consensus protocols: {−0.8 ± 1.2j, −0.8}, γ = 100.

systems on a directed graph. The necessary and sufficient conditions for the inverse optimality have been established. By means of the developed inverse optimality theory, we have proved that for the globally optimal leader following consensus problem on a directed graph which contains a spanning tree with at least one nonzero pinning gain connecting into a root node, such optimal distributed consensus protocols exist


if and only if L + G is simple positive definite. For the globally optimal leaderless consensus problem on a directed strongly connected graph, such optimal distributed consensus protocols exist if and only if L is simple positive semi-definite. Consensus performance has been addressed by investigating the asymptotic properties of the optimal distributed consensus protocols. Simple design procedures have been proposed such that the resulting multiagent system can reach the desired consensus performance asymptotically. It appears that all results remain valid if the graphs are undirected. Two examples have been given to illustrate the effectiveness and superiority of the developed methods. The resulting optimal distributed consensus protocols contain a parameter γ, which tends to infinity so that the agents reach the desired consensus performance asymptotically. However, if the convergence process is dramatically slow, then γ may be large, and hence the resulting optimal distributed consensus protocols may be large in magnitude. Therefore, trial and error iterations are needed to find an acceptable tradeoff value of γ. A remedy for this is left for future study.

R EFERENCES [1] J. A. Fax and R. M. Murray, “Information flow and cooperative control of vehicle formations,” IEEE Trans. Autom. Control, vol. 49, no. 9, pp. 1465–1476, Sep. 2004. [2] R. Olfati-Saber, “Flocking for multi-agent dynamic systems: Algorithms and theory,” IEEE Trans. Autom. Control, vol. 51, no. 3, pp. 401–420, Mar. 2006. [3] H. Zhang and F. L. Lewis, “Lyapunov, adaptive, and optimal design techniques for cooperative systems on directed communication graphs,” IEEE Trans. Ind. Electron., vol. 59, no. 7, pp. 3026–3041, Jul. 2012. [4] Q. Shen, B. Jiang, P. Shi, and J. Zhao, “Cooperative adaptive fuzzy tracking control for networked unknown nonlinear multiagent systems with time-varying actuator faults,” IEEE Trans. Fuzzy Syst., vol. 22, no. 3, pp. 494–504, Jun. 2014. [5] W. Ren, R. Beard, and E. Atkins, “Information consensus in multivehiches cooperative control,” IEEE Control Syst. Mag., vol. 27, no. 2, pp. 71–82, Apr. 2007. [6] H. Liang, H. Zhang, Z. Wang, and J. Wang, “Output regulation of statecoupled linear multi-agent systems with globally reachable topologies,” Neurocomputing, vol. 123, pp. 337–343, Jan. 2014. [7] H. Zhang, L. Cui, X. Zhang, and Y. Luo, “Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method,” IEEE Trans. Neural Netw., vol. 22, no. 12, pp. 2226–2236, Dec. 2011. [8] X. Su, L. Wu, and P. Shi, “Sensor networks with random link failures: Distributed filtering for T-S fuzzy systems,” IEEE Trans. Ind. Informat., vol. 9, no. 3, pp. 1739–1750, Aug. 2013. [9] L. I. Barna and Z. Constantin-Bala, “ERMS: An evolutionary reorganizing multiagent system,” Int. J. Innov. Comput. Inf. Control, vol. 9, no. 3, pp. 1171–1188, Mar. 2013. [10] A. Sedziwy, “Effective graph representation supporting multi-agent distributed computing,” Int. J. Innov. Comput. Inf. Control, vol. 10, no. 1, pp. 101–113, 2014. [11] S. Tong and Y. 
Li, “Adaptive fuzzy output feedback tracking backstepping control of strict-feedback nonlinear systems with unknown dead zones,” IEEE Trans. Fuzzy Syst., vol. 20, no. 1, pp. 168–180, Feb. 2012. [12] S. Tong and Y. Li, “Adaptive fuzzy output feedback control of MIMO nonlinear systems with unknown dead-zone inputs,” IEEE Trans. Fuzzy Syst., vol. 21, no. 1, pp. 134–146, Feb. 2013. [13] S. Tong, B. Huo, and Y. Li, “Observer-based adaptive decentralized fuzzy fault-tolerant control of nonlinear large-scale systems with actuator failures,” IEEE Trans. Fuzzy Syst., vol. 22, no. 1, pp. 1–15, Feb. 2014. [14] S. Tong, Y. Li, Y. Li, and Y. Liu, “Observer-based adaptive fuzzy backstepping control for a class of stochastic nonlinear strict-feedback systems,” IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 41, no. 6, pp. 1693–1704, Dec. 2011.


[15] S. Hara and T. Iwasaki, “Sum-of-squares decomposition via generalized KYP lemma,” IEEE Trans. Autom. Control, vol. 54, no. 5, pp. 1025–1029, May 2009. [16] J. Chen, S. Hara, Q. Li, and R. H. Middleton, “Best achievable tracking performance in sampled-data systems via LTI controllers,” IEEE Trans. Autom. Control, vol. 53, no. 11, pp. 2467–2479, Dec. 2008. [17] X. Xin, S. Hara, and M. Kaneda, “Reduced-order proper H∞ controllers for descriptor systems: Existence conditions and LMI-based design algorithms,” IEEE Trans. Autom. Control, vol. 53, no. 5, pp. 1253–1258, Jun. 2008. [18] H. Zhang and F. L. Lewis, “Optimal design for synchronization of cooperative systems: State feedback, observer and output feedback,” IEEE Trans. Autom. Control, vol. 56, no. 8, pp. 1948–1953, Aug. 2011. [19] W. Ren, R. Beard, and E. Atkins, “A survey of consensus problems in multi-agent coordination,” in Proc. Amer. Control Conf., Portland, OR, USA, 2005, pp. 1859–1864. [20] H. M. Kristian and F. L. Lewis, “Cooperative optimal control for multi-agent systems on directed graph topologies,” IEEE Trans. Autom. Control, vol. 59, no. 3, pp. 769–774, Mar. 2014. [21] W. Wang and J. Slotine, “A theoretical study of different leader roles in networks,” IEEE Trans. Autom. Control, vol. 51, no. 7, pp. 1156–1161, Jul. 2006. [22] W. Ren, K. Moore, and Y. Chen, “High-order and model reference consensus algorithms in cooperative control of multivehicle systems,” J. Dyn. Syst. Meas. Control, vol. 129, no. 5, pp. 678–688, Sep. 2007. [23] X. Wang and G. Chen, “Pinning control of scale-free dynamical networks,” Phys. A Statist. Mech. Appl., vol. 310, nos. 3–4, pp. 521–531, Jul. 2002. [24] R. Cui, S. S. Ge, and B. Ren, “Synchronized tracking control of multiagent system with limited information,” in Proc. 49th IEEE Conf. Decis. Control, Atlanta, GA, USA, 2010, pp. 5480–5485. [25] R. Cui, S. S. Ge, and B. Ren, “Synchronized altitude tracking control of multiple unmanned helicopters,” in Proc. Amer. 
Control Conf., Baltimore, MD, USA, 2010, pp. 4433–4438. [26] D. Tsubakino and S. Hara, “Eigenvector-based characterization for hierarchical multi-agent dynamical systems with low rank interconnection,” in Proc. IEEE Int. Conf. Control Appl. (CCA), Yokohama, Japan, Sep. 2010, pp. 2023–2028. [27] W. B. Dunbar and R. M. Murray, “Distributed receding horizon control for multi-vehicle formation stabilization,” Automatica, vol. 42, no. 4, pp. 549–558, 2006. [28] F. Borelli and T. Keviczky, “Distributed LQR design for identical dynamically decoupled systems,” IEEE Trans. Autom. Control, vol. 53, no. 8, pp. 1901–1912, Sep. 2008. [29] K. G. Vamvoudakis, F. L. Lewis, and G. R. Hudas, “Multi-agent differential graphical games: Online adaptive learning solution for synchronization with optimality,” Automatica, vol. 48, no. 8, pp. 1598–1611, 2012. [30] H. Zhang, J. Zhang, G. H. Yang, and Y. Luo, “Leader-based optimal coordination control for the consensus problem of multi-agent differential games via fuzzy adaptive dynamic programming,” IEEE Trans. Fuzzy Syst., to be published. DOI: 10.1109/TFUZZ.2014.2310238. [31] H. Zhang, D. Liu, Y. Luo, and D. Wang, Adaptive Dynamic Programming for Control-Algorithms and Stability. London, U.K.: Springer, 2013. [32] W. Dong, “Distributed optimal control of multiple systems,” Int. J. Control, vol. 83, no. 10, pp. 2067–2079, 2010. [33] Y. Cao and W. Ren, “Optimal linear-consensus algorithms: An LQR perspective,” IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 40, no. 3, pp. 819–830, Jun. 2010. [34] Z. Qu, M. Simaan, and J. Doug, “Inverse optimality of cooperative control for networked systems,” in Proc. Joint 48th IEEE Conf. Decis. Control 28th Chinese Control Conf. (CDC/CCC), Shanghai, China, Dec. 2009, pp. 1651–1658. [35] S. Hara, S. Hikarua, and K. Tae-Hyoung, “Consensus in hierarchical multi-agent dynamical systems with low-rank interconnections: Analysis of stability and convergence rates,” in Proc. Amer. Control Conf., St. 
Louis, MO, USA, Jun. 2009, pp. 5192–5197. [36] R. E. Kalman, “When is a linear control system optimal?” J. Fluids Eng., vol. 86, no. 1, pp. 81–90, 1964. [37] A. Jameson and E. Kreindler, “Inverse problem of linear optimal control,” SIAM J. Control Optim., vol. 11, no. 1, pp. 1–19, 1973. [38] T. Fujii, “A new approach to the LQ design from the viewpoint of the inverse regulator problem,” IEEE Trans. Autom. Control, vol. 32, no. 11, pp. 995–1004, Nov. 1987. [39] F. L. Lewis, D. Vrabie, and V. L. Syrmos, Optimal Control, 3rd ed. New York, NY, USA: Wiley, 2012.



[40] Z. Li, Z. Duan, G. Chen, and L. Huang, “Consensus of multiagent systems and synchronization of complex networks: A unified viewpoint,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 57, no. 1, pp. 213–224, Jan. 2010. [41] K. D. Young, P. V. Kokotovic, and V. I. Utkin, “A singular perturbation analysis of high-gain feedback systems,” IEEE Trans. Autom. Control, vol. 22, no. 6, pp. 931–938, Dec. 1977. [42] C. A. Harvey and G. Stein, “Quadratic weights for asymptotic regulator properties,” IEEE Trans. Autom. Control, vol. 23, no. 3, pp. 378–387, Jun. 1978. [43] W. M. Haddad and D. S. Bernstein, “Controller design with regional pole constraints,” IEEE Trans. Autom. Control, vol. 37, no. 1, pp. 54–69, Jan. 1992.

Huaguang Zhang (SM’04) received the B.S. and M.S. degrees in control engineering from Northeastern Electric Power University, Jilin, China, in 1982 and 1985, respectively, and the Ph.D. degree in thermal power engineering and automation from Southeast University, Nanjing, China, in 1991. He joined the Department of Automatic Control, Northeastern University, Shenyang, China, as a PostDoctoral Fellow, in 1992. Since 1994, he has been a Professor and the Head of the Electric Automation Institute, Northeastern University. His current research interests include neural network-based control, fuzzy control, chaos control, nonlinear control, signal processing, adaptive dynamic programming, and their industrial applications. He has authored three English monographs, and holds 30 patents. Dr. Zhang was the recipient of the Nationwide Excellent Post-Doctor, the Outstanding Youth Science Foundation Award from the National Natural Science Foundation Committee of China, in 2003, the Cheung Kong Scholar Award from the Education Ministry of China, in 2005, and the IEEE T RANSACTIONS ON N EURAL N ETWORKS Outstanding Paper Award, in 2012. He was an Associate Editor of the IEEE T RANSACTIONS ON C YBERNETICS AND N EUROCOMPUTING and Automatica. He is the Deputy Director of the Intelligent System Engineering Committee of Chinese Association of Artificial Intelligence.

Tao Feng received the B.S. degree in mathematics and applied mathematics from the China University of Petroleum, Dongying, China, in 2008, and the M.S. degree in fundamental mathematics from Northeastern University, Shenyang, China, in 2011, where he is currently pursuing the Ph.D. degree from the College of Information Science and Engineering. His current research interests include approximate dynamic programming, inverse optimal control, and multiagent systems.

Guang-Hong Yang (SM’04) received the B.S. and the M.S. degrees from the Northeast University of Technology, Liaoning, China, in 1983 and 1986, respectively, and the Ph.D. degree in control engineering from Northeastern University (formerly, Northeast University of Technology), Shenyang, China, in 1994. He was a Lecturer/Associate Professor with Northeastern University from 1986 to 1995. He joined Nanyang Technological University, Singapore, as a Post-Doctoral Fellow, in 1996. From 2001 to 2005, he was a Research Scientist/Senior Research Scientist with the National University of Singapore, Singapore. He is currently a Professor with the College of Information Science and Engineering, Northeastern University. His current research interests include fault-tolerant control, fault detection and isolation, nonfragile control systems design, and robust control. Dr. Yang is an Associate Editor of the International Journal of Control, Automation, and Systems, the International Journal of Systems Science, the IET Control Theory and Applications, and the IEEE T RANSACTIONS ON F UZZY S YSTEMS.

Hongjing Liang received the B.S. degree in mathematics from Bohai University, Jinzhou, China, in 2009, and the M.S. degree in fundamental mathematics from Northeastern University, Shenyang, China, in 2011, where he is currently pursuing the Ph.D. degree. His current research interests include multiagent systems, complex systems, and output regulation.
