
A New Continuous-Time Equality-Constrained Optimization to Avoid Singularity

Quan Quan and Kai-Yuan Cai

Abstract— In equality-constrained optimization, a standard regularity assumption is often associated with feasible point methods, namely, that the gradients of the constraints are linearly independent. In practice, the regularity assumption may be violated. In order to avoid such a singularity, a new projection matrix is proposed, based on which a feasible point method for continuous-time, equality-constrained optimization is developed. First, the equality constraint is transformed into a continuous-time dynamical system with solutions that always satisfy the equality constraint. Second, a new projection matrix without singularity is proposed to realize the transformation. An update (or, say, a controller) is subsequently designed to decrease the objective function along the solutions of the transformed continuous-time dynamical system. The invariance principle is then applied to analyze the behavior of the solution. Furthermore, the proposed method is modified to address cases in which solutions do not satisfy the equality constraint. Finally, the proposed optimization approach is applied to three examples to demonstrate its effectiveness.

Index Terms— Continuous-time dynamical systems, equality constraints, optimization, singularity.

I. INTRODUCTION

ACCORDING to the implementation of a differential equation, most approaches to continuous-time optimization can be classified as either a dynamical system [1]–[3] or a neural network [4]–[10]. The dynamical system approach relies on the numerical integration of differential equations on a digital computer. Unlike discrete optimization methods, the step sizes of dynamical system approaches can be automatically controlled in the integration process and can sometimes be made larger than usual. This advantage suggests that the dynamical system approach can, in fact, be comparable with the currently available conventional discrete optimization methods and can facilitate faster convergence [1], [3]. The application of a higher order numerical integration process also enables us to avoid the zigzagging phenomenon, which is often encountered in typical linear extrapolation methods [1]. On the other hand, the neural network approach emphasizes implementation by analog circuits, very large-scale integration, and optical technologies [11]. The major breakthrough of this approach is attributed to [12], which introduced an artificial neural network

Manuscript received July 15, 2014; accepted August 29, 2015. Date of publication September 22, 2015; date of current version January 18, 2016. This work was supported by the National Natural Science Foundation of China under Grant 61473012. The authors are with the Department of Automatic Control, Beijing University of Aeronautics and Astronautics, Beijing 100191, China (e-mail: [email protected]; [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TNNLS.2015.2476348

to solve the traveling salesman problem. By employing analog hardware, the neural network approach offers low computational complexity and is suitable for parallel implementation.

For continuous-time equality-constrained optimization, existing methods can be classified into three categories [1], [4], [8], [13]: 1) the feasible point method (or primal method); 2) the penalty function method; and 3) the Lagrangian multiplier method (or primal-dual method). The feasible point method directly solves the original problem by searching through the feasible region for the optimal solution. Each point in the process is feasible, and the value of the objective function constantly decreases. Compared with the two other methods, the feasible point method offers three significant advantages that highlight its usefulness as a general procedure applicable to almost all nonlinear programming problems [13, p. 360]: 1) the terminating point is feasible if the process is terminated before the solution is reached; 2) the limit point of the convergent sequence of solutions must be at least a local constrained minimum; and 3) the method is applicable to general nonlinear programming problems because it does not rely on special problem structures, such as convexity.

In equality-constrained optimization, a standard regularity assumption is often associated with feasible point methods, namely, that the gradients of the constraints are linearly independent. Moreover, the regularity assumption is also required by the penalty function method [13, p. 417] and the Lagrangian multiplier method [13, p. 476]. However, in practice, the regularity assumption may be violated. In order to avoid such a singularity, a continuous-time feasible point method is proposed to identify the local minimum from a feedback control perspective for a general equality-constrained optimization problem. Compared with global optimization methods, local optimization methods are still necessary. First, they often serve as a basic component of some global optimization schemes, such as the branch-and-bound method [14]. Second, they require less computation for online optimization. Compared with the discrete optimization methods offered by MATLAB, illustrative examples show that the proposed method avoids convergence to a singular point and facilitates faster convergence through numerical integration on a digital computer. Moreover, one illustrative example shows that the proposed projection matrix also outperforms the modified commonly used projection matrix. In view of this, the contributions of this paper are listed as follows.

1) A new projection matrix is proposed to remove a standard regularity assumption that is often associated with feasible point methods, namely, that


the gradients of the constraints are linearly independent (see [1, p. 158, eq. (4)], [2, p. 156, eq. (2.3)], and [8, p. 1669, Assumption 1]). Compared with a modified commonly used projection matrix, the proposed projection matrix achieves better precision. Moreover, its recursive form can be implemented more easily.

2) Based on the proposed projection matrix, a continuous-time, equality-constrained optimization method is developed to avoid convergence to a singular point. The invariance principle is applied to analyze the behavior of the solution.

3) A modified version of the proposed optimization is further developed to address cases in which solutions do not satisfy the equality constraint. This ensures its robustness against uncertainties caused by numerical error or realization by analog hardware.

4) Equality-constrained optimization is formulated as a control problem, in which the equality constraint is transformed into a continuous-time dynamical system. The proposed method is easily accessible, especially to practitioners in the control field.

The following notation is used. $\mathbb{R}^n$ is the Euclidean space of dimension $n$. $\|\cdot\|$ denotes the Euclidean vector norm or the induced matrix norm. $I_n$ is the identity matrix of dimension $n$. $0_{n_1\times n_2}$ denotes a zero vector or a zero matrix of dimension $n_1\times n_2$. $(\cdot)^{\dagger}$ denotes the Moore–Penrose inverse. The direct product $\otimes$ and the $\mathrm{vec}(\cdot)$ operation are defined in Appendix A. The function $[\cdot]_{\times}:\mathbb{R}^3\to\mathbb{R}^{3\times 3}$ and the matrix $H\in\mathbb{R}^{9\times 3}$ are also defined in Appendix A. Suppose $g:\mathbb{R}^n\to\mathbb{R}$. The gradient of $g$ is $\nabla g(x)=\nabla_x g(x)=[\partial g(x)/\partial x_1\ \cdots\ \partial g(x)/\partial x_n]^T\in\mathbb{R}^n$, and the matrix of second partial derivatives of $g(x)$, known as the Hessian, is $\nabla_{xx}g(x)=[\partial^2 g(x)/\partial x_i\partial x_j]_{ij}\in\mathbb{R}^{n\times n}$.

II. PROBLEM FORMULATION

In this section, the considered equality-constrained optimization problem is formulated first. Then, an equality constraint transformation is proposed to replace the equality constraints. Based on it, the objectives are stated.

A. Equality-Constrained Optimization

The class of equality-constrained optimization problems considered here is defined as follows:
$$\min_{x\in\mathbb{R}^n} v(x),\quad \text{s.t.}\ c(x)=0_{m\times 1} \qquad (1)$$
where $v:\mathbb{R}^n\to\mathbb{R}$ is the objective function and $c=[c_1\ c_2\ \cdots\ c_m]^T\in\mathbb{R}^m$, $c_i:\mathbb{R}^n\to\mathbb{R}$, are the equality constraints. Both are twice continuously differentiable. Denote $\nabla c\triangleq[\nabla c_1,\nabla c_2,\ldots,\nabla c_m]\in\mathbb{R}^{n\times m}$. To avoid a trivial case, suppose the constraint (or feasible) set $\mathcal{F}=\{x\in\mathbb{R}^n\mid c(x)=0_{m\times 1}\}\neq\varnothing$.

Definition 1 [15, pp. 316–317]: For the problem (1), a vector $x^*\in\mathcal{F}$ is a global minimum if $v(x^*)\le v(x)$, $\forall x\in\mathcal{F}$; a vector $x^*\in\mathcal{F}$ is a local (strict local) minimum if there is a neighborhood $\mathcal{N}$ of $x^*$ such that $v(x^*)\le v(x)$ ($v(x^*)<v(x)$) for $x\in\mathcal{N}\cap\mathcal{F}$.


Definition 2 [13, p. 325]: A vector $x^*\in\mathcal{F}$ is said to be a regular point if the gradient vectors $\nabla c_1(x^*),\nabla c_2(x^*),\ldots,\nabla c_m(x^*)$ are linearly independent. Otherwise, it is called a singular point.

Definition 3 [16, p. 117]: A function $v(x)$ satisfying $v(0)=0$ and $v(x)>0$ for $x\neq 0_{n\times 1}$ is said to be positive definite.

Remark 1 (On Inequality-Constrained Optimization): Inequality-constrained optimization problems can be transformed into equality-constrained optimization problems. For example, the inequality constraint $x\le 1$, $x\in\mathbb{R}$, can be replaced with an equality constraint $x+z^2=1$, $z\in\mathbb{R}$. Similarly, the inequality constraint $-1\le x\le 1$, $x\in\mathbb{R}$, can be replaced with an equality constraint $x=\sin(z)$, $z\in\mathbb{R}$.

B. Equality Constraint Transformation

Optimization problems are often solved using numerical iterative methods. For an equality-constrained optimization problem, the major difficulty lies in ensuring that each iteration satisfies the constraints and can further move toward the minimum. To address this difficulty, a transformation of the equality constraints is proposed, which is first formulated as an assumption.

Assumption 1: For a given $x_0\in\mathcal{F}$, there exists a function $f:\mathbb{R}^n\to\mathbb{R}^{n\times l}$ such that
$$\dot{x}(t)=f(x(t))u(t),\quad x(0)=x_0 \qquad (2)$$
with solutions that satisfy $x(t)\in\mathcal{F}_u(x_0)\subset\mathcal{F}$, where $\mathcal{F}_u(x_0)=\{x(t)\mid \dot{x}(t)=f(x(t))u(t),\ x(0)=x_0\in\mathcal{F},\ u(t)\in\mathbb{R}^l,\ t\ge 0\}$.

The best choice of $f(x)$ is one that makes $\mathcal{F}_u(x_0)=\mathcal{F}$. This holds for linear constraints. For the sake of simplicity, the variable $t$ will be omitted except when necessary.

Theorem 1: Suppose that $c(x)=Ax$ and $f(x)\equiv A^{\perp}$, where $A^{\perp}$ has full column rank and the space spanned by the columns of $A^{\perp}$ is the null space of $A$. Then, $\mathcal{F}_u(x_0)=\mathcal{F}$, $\forall x_0\in\mathcal{F}$.

Proof: See Appendix B.

From the proof of Theorem 1, the choice of $f(x)$ is, in fact, an accessibility problem in the control field. However, it is difficult to achieve $\mathcal{F}_u(x_0)=\mathcal{F}$ in the general case. For example, if $c(x)=(x_1+1)(x_1-1)$, $x=[x_1\ x_2]^T\in\mathbb{R}^2$, then $\mathcal{F}=\{x\in\mathbb{R}^2\mid x_1=1\ \text{or}\ x_1=-1\}$. Since the two sets $\{x\in\mathbb{R}^2\mid x_1=1\}$ and $\{x\in\mathbb{R}^2\mid x_1=-1\}$ are not connected, the solution of (2) starting from either set cannot access the other. Although $\mathcal{F}_u(x_0)\neq\mathcal{F}$ in such cases, it is expected that the function $f(x)$ is chosen to make the set $\mathcal{F}_u(x_0)$ as large as possible, so that the probability of $x^*\in\mathcal{F}_u(x_0)$ is higher. Motivated by the linear case in Theorem 1, the function $f(x)$ should satisfy $\mathcal{V}_1(x)=\mathcal{V}_2(x)$, where
$$\mathcal{V}_1(x)=\{z\in\mathbb{R}^n\mid \nabla c(x)^T z=0_{m\times 1}\}\quad \text{[null space of }\nabla c(x)^T\text{]}$$
$$\mathcal{V}_2(x)=\{z\in\mathbb{R}^n\mid z=f(x)u,\ u\in\mathbb{R}^l\}\quad \text{[range space of }f(x)\text{]}.$$
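For the linear case of Theorem 1, the matrix $A^{\perp}$ can be obtained numerically as a basis of the null space of $A$. The following is a minimal sketch (not from the paper; it assumes NumPy/SciPy and an arbitrary illustrative constraint matrix) checking that trajectories of (2) with $f(x)\equiv A^{\perp}$ remain in $\mathcal{F}$:

```python
# Minimal sketch of Theorem 1 (assumptions: NumPy/SciPy available; the
# constraint matrix A below is an arbitrary illustrative choice, not from the paper).
import numpy as np
from scipy.linalg import null_space
from scipy.integrate import solve_ivp

A = np.array([[1.0, 2.0, 0.0, -1.0],
              [0.0, 1.0, 1.0,  1.0]])      # linear constraint c(x) = A x = 0
A_perp = null_space(A)                      # columns span the null space of A

x0 = A_perp @ np.array([1.0, -2.0])         # any x0 in F, i.e., A x0 = 0

def u(t):
    # an arbitrary time-varying input, just to move along the feasible set
    return np.array([np.sin(t), np.cos(2.0 * t)])

def rhs(t, x):
    # dynamics (2) with f(x) = A_perp
    return A_perp @ u(t)

sol = solve_ivp(rhs, (0.0, 10.0), x0, rtol=1e-9, atol=1e-12)
print(np.max(np.abs(A @ sol.y)))            # constraint violation stays near 0
```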


For a given $x'\in\mathbb{R}^n$, if $\mathcal{V}_1(x')=\mathcal{V}_2(x')$, then $f(x')$ is called a projection matrix of $\nabla c(x')$. Obviously, it satisfies $\nabla c(x')^T f(x')=0_{m\times l}$. One commonly used projection matrix, denoted by $f_{\mathrm{com}}(x)$, is given as follows [1], [2], [8]:
$$f_{\mathrm{com}}(x)=I_n-\big(\nabla c\,(\nabla c^T\nabla c)^{-1}\nabla c^T\big)(x). \qquad (3)$$
Obviously, the matrix function $f_{\mathrm{com}}(x)$ is a projection matrix at all regular points but fails at singular points, where $\nabla c^T\nabla c$ is not invertible.

C. Objective

This paper aims to propose a continuous-time, equality-constrained optimization method to identify the local minimum from a feedback control perspective. Concretely, the first objective is to propose a new projection matrix that avoids singularity. Based on it, the second objective is to design the update $u$ so that the solutions of (2) achieve a local minimum. By Assumption 1, the update $u$ in (2) can be considered a control input from the feedback control perspective. The objective function $v(x)$ can be considered a Lyapunov-like function, although $v(x)$ is, in fact, not required to be positive definite here. Accordingly, the second objective of this paper can be restated as: design a control input $u$ that decreases $v(x)$ along the solutions of (2) until $x$ reaches a local minimum.

III. NEW PROJECTION MATRIX

In order to avoid such a singularity, a new projection matrix is proposed. Before that, a modification of $f_{\mathrm{com}}(x)$ in (3) is investigated.

A. Modified Commonly Used Projection Matrix

In order to avoid the singularity problem, the commonly used projection matrix (3) is modified as follows:
$$f_{\mathrm{mcom}}(x)=I_n-\big(\nabla c\,(\varepsilon I_m+\nabla c^T\nabla c)^{-1}\nabla c^T\big)(x) \qquad (4)$$
where $\varepsilon>0$ is a small positive scalar. Obviously, the smaller $\varepsilon$ is, the closer $\|\nabla c^T f_{\mathrm{mcom}}\|$ is to zero. On the other hand, a very small $\varepsilon$ will cause an ill-conditioning problem, especially on a low-precision processor.

Example 1: Consider the following gradient vectors:
$$\nabla c_1=[1\ 1\ 1\ 1]^T,\quad \nabla c_2=[2\ 1\ 1\ 1]^T,\quad \nabla c_3=[3\ 2\ 2\ 2]^T \qquad (5)$$
which are linearly dependent, as $\nabla c_3=\nabla c_1+\nabla c_2$. The precision error $e_p=\|\nabla c^T f_{\mathrm{mcom}}\|$ is computed for different $\varepsilon=10^{-k}$, $k=1,\ldots,15$, where $\nabla c=[\nabla c_1,\nabla c_2,\nabla c_3]$ (a numerical sketch of this experiment is given after this subsection). As shown in Fig. 1, the smallest precision error is achieved only at $\varepsilon=10^{-8}$, with a precision error of around $10^{-8}$; reducing $\varepsilon$ further increases the numerical error.

Fig. 1. Precision error of the modified commonly used projection matrix with different $\varepsilon=10^{-k}$.

In order to avoid the singularity problem in Example 1, the best cure is to remove the vector $\nabla c_3$, resulting in $\nabla c_{\mathrm{new}}=[\nabla c_1,\nabla c_2]\in\mathbb{R}^{4\times 2}$. With it, the projection matrix becomes
$$f_{\mathrm{newcom}}=I_4-\nabla c_{\mathrm{new}}\big(\nabla c_{\mathrm{new}}^T\nabla c_{\mathrm{new}}\big)^{-1}\nabla c_{\mathrm{new}}^T.$$
It is easy to see that $\nabla c^T f_{\mathrm{newcom}}\equiv 0_{3\times 4}$. However, this best cure cannot be implemented continuously using (3), and hence cannot be realized by analog hardware. For this purpose, a new projection matrix is proposed in the following.
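The following is a minimal sketch of the Example 1 experiment (an assumption of this edit: NumPy is used in place of the paper's MATLAB setup) showing how the precision error of (4) behaves as $\varepsilon$ shrinks:

```python
# Sketch of Example 1 with the modified projection matrix (4).
# Assumption: NumPy stands in for the paper's MATLAB environment.
import numpy as np

grad_c = np.array([[1.0, 2.0, 3.0],
                   [1.0, 1.0, 2.0],
                   [1.0, 1.0, 2.0],
                   [1.0, 1.0, 2.0]])          # columns are grad c1, grad c2, grad c3 from (5)

def f_mcom(grad_c, eps):
    # modified commonly used projection matrix (4)
    n, m = grad_c.shape
    return np.eye(n) - grad_c @ np.linalg.inv(eps * np.eye(m) + grad_c.T @ grad_c) @ grad_c.T

for k in range(1, 16):
    eps = 10.0 ** (-k)
    e_p = np.linalg.norm(grad_c.T @ f_mcom(grad_c, eps), 2)
    print(f"eps = 1e-{k:02d}   e_p = {e_p:.3e}")
```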

B. New Projection Matrix

A new projection matrix, denoted by $f_{\mathrm{pro}}(x)$, is proposed to avoid the singularity problem. For the special case $c:\mathbb{R}^n\to\mathbb{R}$, such an $f_{\mathrm{pro}}(x)$ is designed in Theorem 2. Subsequently, a method is proposed to construct a projection matrix for the general case $c:\mathbb{R}^n\to\mathbb{R}^m$. Before the design, the following preliminary result is needed.

Lemma 1: Suppose
$$\mathcal{W}_1=\{z\in\mathbb{R}^n\mid L^Tz=0\}$$
$$\mathcal{W}_2=\Big\{z\in\mathbb{R}^n\,\Big|\,z=\Big(I_n-\frac{LL^T}{\delta(\|L\|^2)+\|L\|^2}\Big)u,\ u\in\mathbb{R}^n\Big\}$$
where $L\in\mathbb{R}^n$ and
$$\delta(x)=\begin{cases}1, & x=0,\ x\in\mathbb{R}\\ 0, & x\neq 0,\ x\in\mathbb{R}.\end{cases}$$
Then, $\mathcal{W}_1=\mathcal{W}_2$.

Proof: See Appendix C.

Theorem 2: Suppose that $c:\mathbb{R}^n\to\mathbb{R}$ and the function $f_{\mathrm{pro}}(x)$ is designed to be
$$f_{\mathrm{pro}}=I_n-\frac{\nabla c\,\nabla c^T}{\delta(\|\nabla c\|^2)+\|\nabla c\|^2}. \qquad (6)$$
Then, Assumption 1 is satisfied with $u\in\mathbb{R}^n$, and $\mathcal{V}_1(x)=\mathcal{V}_2(x)$, $\forall x\in\mathbb{R}^n$.

Proof: Since $\dot{c}(x)=\nabla c(x)^T\dot{x}$ and $\dot{x}=f_{\mathrm{pro}}(x)u$, it follows that $\dot{c}(x)\equiv 0$ by Lemma 1. Therefore, Assumption 1 is satisfied with $u\in\mathbb{R}^n$. Further, by Lemma 1, $\mathcal{V}_1(x)=\mathcal{V}_2(x)$.
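As a quick check of (6), the sketch below (an assumption of this edit: NumPy, an exact indicator for $\delta$, and an arbitrary scalar-constraint gradient) verifies that $\nabla c^T f_{\mathrm{pro}}=0$ both at a regular point and at a singular point where $\nabla c=0$:

```python
# Sketch of the new projection matrix (6) for a scalar constraint (assumption: NumPy;
# delta is implemented exactly here, whereas the paper later smooths it for hardware).
import numpy as np

def delta(s):
    return 1.0 if s == 0.0 else 0.0

def f_pro_scalar(grad_c):
    # f_pro = I - grad_c grad_c^T / (delta(||grad_c||^2) + ||grad_c||^2), eq. (6)
    n = grad_c.shape[0]
    s = float(grad_c @ grad_c)
    return np.eye(n) - np.outer(grad_c, grad_c) / (delta(s) + s)

for g in (np.array([1.0, -2.0, 0.5]),      # a regular point
          np.zeros(3)):                    # a singular point (grad c = 0)
    F = f_pro_scalar(g)
    print(np.linalg.norm(g @ F))           # = 0 in both cases; no matrix inversion needed
```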


Theorem 3: Suppose that $c:\mathbb{R}^n\to\mathbb{R}^m$ and that the functions $f_k(x)$ are given in the following recursive form:
$$f_0=I_n$$
$$f_k=f_{k-1}-\frac{f_{k-1}f_{k-1}^T\nabla c_k\,\nabla c_k^Tf_{k-1}}{\delta\big(\|f_{k-1}^T\nabla c_k\|^2\big)+\|f_{k-1}^T\nabla c_k\|^2} \qquad (7)$$
where $k=1,\ldots,m$. Then, Assumption 1 is satisfied with $f_{\mathrm{pro}}=f_m$ and $u\in\mathbb{R}^n$, and $\mathcal{V}_1(x)=\mathcal{V}_2(x)$, $\forall x\in\mathbb{R}^n$.

Proof: See Appendix D.

Remark 2 (On the Best Cure): If $\nabla c_m$ is represented by a linear combination of $\nabla c_i$, $i=1,\ldots,m-1$, then $\nabla c\,\nabla c^T$ is singular. In this case, $\nabla c_m^Tf_{m-1}=0_{1\times n}$ as $\nabla c_i^Tf_{m-1}=0_{1\times n}$, $i=1,\ldots,m-1$. By (7), the projection matrix $f_{\mathrm{pro}}$ degenerates to $f_{\mathrm{pro}}=f_{m-1}$, which is equivalent to removing the term $\nabla c_m$. This is consistent with the best cure mentioned above.

Remark 3 (On Example 1): Let us revisit Example 1. In practice, the impulse function $\delta(x)$ is approximated by some continuous function, such as $\delta(x)\approx e^{-\gamma|x|}$, where $\gamma$ is a large positive scalar. The precision error $e_p=\|\nabla c^Tf_{\mathrm{pro}}\|$ computed with $\delta(x)\approx e^{-30|x|}$ is $e_p=2.7629\times 10^{-10}$ (see the sketch after this remark). This demonstrates the advantage of the proposed projection matrix over $f_{\mathrm{mcom}}$ in (4). Furthermore, compared with (3) or (4), the explicit recursive form of the proposed projection matrix is also easier to realize by analog hardware because it avoids matrix inversion.
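The following minimal sketch (an assumption of this edit: NumPy, with the smoothed $\delta(x)\approx e^{-\gamma|x|}$ mentioned in Remark 3 and $\gamma=30$) builds the recursive projection matrix (7) for the Example 1 gradients and reports the resulting precision error:

```python
# Sketch of the recursive projection matrix (7) with the smoothed delta from Remark 3.
# Assumptions: NumPy; gamma = 30 as in Remark 3; constraint gradients from Example 1.
import numpy as np

def f_pro(grad_c, gamma=30.0):
    # grad_c holds the constraint gradients as columns; returns f_m from (7)
    n, m = grad_c.shape
    f = np.eye(n)
    for k in range(m):
        g = f.T @ grad_c[:, k]                      # f_{k-1}^T grad c_k
        s = float(g @ g)
        delta = np.exp(-gamma * abs(s))             # smooth approximation of delta(s)
        f = f - (f @ np.outer(g, g)) / (delta + s)
    return f

grad_c = np.array([[1.0, 2.0, 3.0],
                   [1.0, 1.0, 2.0],
                   [1.0, 1.0, 2.0],
                   [1.0, 1.0, 2.0]])                # grad c1, grad c2, grad c3 = grad c1 + grad c2

F = f_pro(grad_c)
print(np.linalg.norm(grad_c.T @ F, 2))              # precision error e_p stays tiny (cf. Remark 3)
```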

IV. UPDATE DESIGN AND CONVERGENCE ANALYSIS

In this section, the update (or, say, controller) $u$ is designed to make $\dot{v}(x)\le 0$, so that $v(x)$ is nonincreasing. If $v(x)$ were positive definite, the theory of Lyapunov stability would be available, which would be very familiar to practitioners in the control field. However, in order to keep $v(x)$ general, the objective function $v(x)$ here is not required to be positive definite or convex. Because of this, the analysis is based on the LaSalle invariance theorem [16, pp. 126–129].

A. Update Design

Based on Assumption 1, taking the time derivative of $v(x)$ along the solutions of (2) results in
$$\dot{v}(x)=\nabla v(x)^Tf(x)u \qquad (8)$$
where $\nabla v(x)\in\mathbb{R}^n$. In order to obtain $\dot{v}(x)\le 0$, a direct way of designing $u$ is
$$u=-Q(x)f(x)^T\nabla v(x) \qquad (9)$$
where $Q:\mathbb{R}^n\to\mathbb{R}^{l\times l}$ and $Q(x)\ge\epsilon I_l>0$, $\epsilon>0$, $\forall x\in\mathbb{R}^n$. Then, (8) becomes
$$\dot{v}(x)=-\nabla v(x)^Tf(x)Q(x)f(x)^T\nabla v(x)\le 0. \qquad (10)$$
Substituting (9) into the continuous-time dynamical system (2) results in
$$\dot{x}=-f(x)Q(x)f(x)^T\nabla v(x) \qquad (11)$$
with solutions that always satisfy the constraint $c(x)=0_{m\times 1}$. The closed-loop system corresponding to the continuous-time dynamical system (2) and the controller (9) is shown in Fig. 2.

Fig. 2. Closed-loop control system.
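To make the closed-loop flow (11) concrete, here is a minimal sketch (assumptions of this edit: NumPy/SciPy, the scalar-constraint projection matrix (6), $Q(x)\equiv I$, and a toy problem chosen for illustration) that minimizes $x_1+x_2$ on the unit circle:

```python
# Sketch of the closed-loop system (11): x' = -f(x) Q(x) f(x)^T grad v(x).
# Assumptions: NumPy/SciPy; toy problem v(x) = x1 + x2, c(x) = x1^2 + x2^2 - 1; Q = I.
import numpy as np
from scipy.integrate import solve_ivp

def grad_v(x):
    return np.array([1.0, 1.0])

def grad_c(x):
    return 2.0 * x

def f_pro(x):
    g = grad_c(x)
    s = float(g @ g)
    d = 1.0 if s == 0.0 else 0.0                 # exact delta from (6)
    return np.eye(2) - np.outer(g, g) / (d + s)

def rhs(t, x):
    F = f_pro(x)
    return -F @ (F.T @ grad_v(x))                # (11) with Q(x) = I

x0 = np.array([1.0, 0.0])                        # feasible initial point
sol = solve_ivp(rhs, (0.0, 50.0), x0, rtol=1e-9, atol=1e-12)
print(sol.y[:, -1])                              # approaches [-1/sqrt(2), -1/sqrt(2)]
print(sol.y[0, -1]**2 + sol.y[1, -1]**2 - 1.0)   # constraint violation remains near 0
```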

B. Convergence Analysis

The invariance principle is applied to analyze the behavior of the solution of (11). Readers not interested in these details can proceed directly to Section IV-C.

Theorem 4: Under Assumption 1, given $x_0\in\mathcal{F}$, if the set $\mathcal{K}=\{x\in\mathbb{R}^n\mid v(x)\le v(x_0),\ c(x)=0_{m\times 1}\}$ is bounded, then the solution of (11) starting at $x_0$ approaches $x_l^*\in\mathcal{S}$, where $\mathcal{S}=\{x\in\mathcal{K}\mid \nabla v(x)^Tf(x)=0_{1\times l}\}$. In addition, if $\mathcal{V}_1(x_l^*)=\mathcal{V}_2(x_l^*)$, then there must exist $\lambda^*=[\lambda_1^*\ \lambda_2^*\ \cdots\ \lambda_m^*]^T\in\mathbb{R}^m$ such that $\nabla v(x_l^*)=\sum_{i=1}^{m}\lambda_i^*\nabla c_i(x_l^*)$ and $c(x_l^*)=0_{m\times 1}$, namely, $x_l^*$ is a Karush–Kuhn–Tucker (KKT) point. Furthermore, if $z^T\nabla_{xx}L(x_l^*,\lambda^*)z>0$ for all $z\in\mathcal{V}_1(x_l^*)$, $z\neq 0_{n\times 1}$, where $L(x,\lambda)=v(x)-\sum_{i=1}^{m}\lambda_i c_i(x)$, then $x_l^*$ is a strict local minimum.

Proof: The proof is composed of three propositions. Proposition 1 shows that $\mathcal{K}$ is compact and positively invariant with respect to (11). Proposition 2 shows that the solution of (11) starting at $x_0$ approaches $x_l^*\in\mathcal{S}$. Proposition 3 shows that $x_l^*\in\mathcal{S}$ is a KKT point and, further, a strict local minimum. The three propositions are proven in Appendix E.

Corollary 1: Suppose that $f(x)=f_{\mathrm{pro}}(x)$ as in (7) for $c:\mathbb{R}^n\to\mathbb{R}^m$ and that the set $\mathcal{K}=\{x\in\mathbb{R}^n\mid v(x)\le v(x_0),\ c(x)=0_{m\times 1}\}$ is bounded for a given $x_0\in\mathcal{F}$. Then, the solution of (11) starting at $x_0$ approaches $x_l^*\in\mathcal{S}$, where $\mathcal{S}=\{x\in\mathcal{K}\mid \nabla v(x)^Tf(x)=0_{1\times n}\}$, and $x_l^*$ is a KKT point. In addition, if $z^T\nabla_{xx}L(x_l^*,\lambda^*)z>0$ for all $z\in\mathcal{V}_1(x_l^*)$, $z\neq 0_{n\times 1}$, where $L(x,\lambda)=v(x)-\sum_{i=1}^{m}\lambda_i c_i(x)$, then $x_l^*$ is a strict local minimum.

Proof: Since $\mathcal{V}_1(x_l^*)=\mathcal{V}_2(x_l^*)$ by Theorem 3, the remainder of the proof is the same as that of Theorem 4.

Corollary 2: Consider the following equality-constrained optimization problem:
$$\min_{x\in\mathbb{R}^n} v(x),\quad \text{s.t.}\ Ax=b \qquad (12)$$
where $v(x)$ is convex and twice continuously differentiable, $A\in\mathbb{R}^{m\times n}$ with $\operatorname{rank}A<n$, and $\mathcal{K}=\{x\in\mathbb{R}^n\mid v(x)\le v(x_0),\ Ax=b\}$ is bounded. Then, the solution of (11) with $f(x)\equiv A^{\perp}$ starting at any $x_0\in\mathcal{F}$ approaches the global minimum $x^*$.

Proof: The solution of (11) starting at $x_0$ approaches $x_l^*\in\mathcal{S}$. Since $\operatorname{rank}A<n$, it holds that $\mathcal{V}_1(x_l^*)=\mathcal{V}_2(x_l^*)\neq\varnothing$. Since the equality-constrained optimization (12) is convex, a KKT point $x_l^*$ is the global minimum $x^*$ of the problem (12). The remainder of the proof is the same as that of Theorem 4.
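The linearly constrained case of Corollary 2 is easy to exercise numerically. The following is a minimal sketch (assumptions of this edit: NumPy/SciPy, a small convex quadratic objective chosen for illustration, $Q(x)\equiv I$) of the flow (11) with $f(x)\equiv A^{\perp}$:

```python
# Sketch of Corollary 2: gradient flow (11) with f(x) = A_perp for min v(x) s.t. Ax = b.
# Assumptions: NumPy/SciPy; v(x) = 0.5*||x - x_target||^2 is an illustrative convex objective.
import numpy as np
from scipy.linalg import null_space
from scipy.integrate import solve_ivp

A = np.array([[1.0, 1.0, 1.0]])
b = np.array([1.0])
x_target = np.array([1.0, 2.0, 3.0])

A_perp = null_space(A)                              # f(x) = A_perp (Theorem 1 / Corollary 2)
x0 = np.linalg.lstsq(A, b, rcond=None)[0]           # a feasible initial point

def rhs(t, x):
    grad_v = x - x_target
    return -A_perp @ (A_perp.T @ grad_v)            # (11) with Q = I

sol = solve_ivp(rhs, (0.0, 50.0), x0, rtol=1e-10, atol=1e-12)
x_star = sol.y[:, -1]
print(x_star)                                        # analytic solution: x_target - (sum(x_target) - 1)/3
print(A @ x_star - b)                                # stays feasible
```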


Remark 4 (On the Boundedness of the Set K): If $\mathcal{K}$ is not a bounded set, then $\mathcal{S}$ defined in Theorem 4 may be empty; therefore, the boundedness of the set $\mathcal{K}$ is necessary. For example, take $v(x)=x_1+x_2$ subject to $c(x)=x_1-x_2=0$. The set $\mathcal{K}=\{x\in\mathbb{R}^2\mid x_1+x_2\le v(x_0),\ x_1-x_2=0\}$ is unbounded. According to Theorem 1, $f(x)\equiv[1\ 1]^T$. In this case, $\nabla v(x)^Tf(x)\equiv 2\neq 0$, and the set $\mathcal{S}$ is empty.

C. Modified Closed-Loop Dynamical System

Although the proposed method ensures that the solutions satisfy the constraints, it may fail if $x_0\notin\mathcal{F}$ or if numerical algorithms are used to compute the solutions. Moreover, if the impulse function $\delta$ in (6) is approximated, then the constraints will also be violated. In view of these facts, the following modified closed-loop dynamical system is proposed to amend the situation. Similar to [2], a correction term $-\rho\nabla c(x)c(x)$ is introduced into (11), resulting in
$$\dot{x}=-\rho\nabla c(x)c(x)-f(x)Q(x)f(x)^T\nabla v(x),\quad x(0)=x_0 \qquad (13)$$
where $\rho>0$. Define $v_c(x)=c(x)^Tc(x)$. Then
$$\dot{v}_c(x)=-\rho\,c(x)^T\nabla c(x)^T\nabla c(x)c(x)\le 0$$
where $\nabla c(x)^Tf(x)\equiv 0_{m\times l}$ is utilized. Therefore, the solutions of (13) tend to the feasible set $\mathcal{F}$ if $\nabla c(x)$ has full column rank. Once $c(x)=0_{m\times 1}$, the modified dynamical system (13) degenerates to (11). This self-correcting feature enables the step size to be automatically controlled in the numerical integration process and tolerates uncertainties when the differential equation is realized by analog hardware (a minimal numerical sketch is given just before the examples of Section V). Singular points make $\nabla c(x)$ rank deficient; in this case, the self-correction does not act in the entire space with respect to $c(x)$, namely, $Dc(x)\to 0$ rather than $c(x)\to 0$, where $D\in\mathbb{R}^{p\times m}$, $p<m$. After passing singular points, the self-correction works in the entire space again. Once $c(x)=0$, the property is mainly maintained by the proposed projection matrix.

Remark 5 (On the Matrix Q(x)): The matrix $Q(x)$ plays a role in avoiding instability in the numerical solution of differential equations by normalizing the Lipschitz constant of functions such as $f(x)Q(x)f(x)^T\nabla v(x)$. If the dimension of the dynamical system (13) is low, then one can simply let $Q(x)=\mu/\|f(x)f(x)^T\nabla v(x)\|$, where $\mu>0$. The matrix $Q(x)$ also plays a role in coordinating the convergence rate of all states by minimizing the condition number of matrix functions such as $f(x)Q(x)f(x)^T$, especially when the dimension of $f(x)Q(x)f(x)^T$ is high. Ideally, one would like $f(x)Q(x)f(x)^T=I_n$. However, it is difficult to obtain such a $Q(x)$, since the number of degrees of freedom of $Q(x)\in\mathbb{R}^{l\times l}$ is less than the number of elements of $I_n$. A natural choice is
$$Q(x)=\mu\big(f(x)^Tf(x)\big)^{\dagger}+\epsilon I_l$$
where $\mu,\epsilon>0$. The term $\epsilon I_l$ makes $Q(x)\ge\epsilon I_l>0$. As a result, one has
$$f(x)Q(x)f(x)^T=\mu\,U\Sigma\Sigma^{\dagger}\big(\Sigma\Sigma^{\dagger}\big)^TU^T+\epsilon f(x)f(x)^T.$$
Here, the singular value decomposition of the matrix $f(x)$ is $f(x)=U\Sigma V$, where $U\in\mathbb{R}^{n\times n}$ and $V\in\mathbb{R}^{l\times l}$ are real unitary matrices and $\Sigma\in\mathbb{R}^{n\times l}$ is a rectangular diagonal matrix with non-negative real numbers on the diagonal. If the dynamical system (13) is solved by the numerical integration of differential equations, $Q(x)$ needs to be computed at every step, which costs much time; a time-saving way is to update $Q(x)$ at a reasonable interval. On the other hand, if the dynamical system (13) is realized by a neural network, then $Q(x)$ should be realized by an additional continuous-time dynamical system. Thanks to parallel implementation, this realization can also be carried out online. Finally, the matrix function $Q(x)$ also plays a role in determining the search direction. In the future, it is expected that $Q(x)$ can be designed to generate search directions like those provided by the interior point method and the conjugate gradient method.

Remark 6 (On Large-Scale Optimization): Consider the following equality-constrained optimization problem:
$$\min_{x\in\mathbb{R}^n} v(x),\quad \text{s.t.}\ c(x)=c'(x)+c_0=0_{m\times 1}$$
where $c_0\in\mathbb{R}^m$ is a constant vector depending on the concrete problem. The continuous-time dynamical system (13) can be written as $\dot{x}=-\rho\nabla c(x)(c'(x)+c_0)+h(x)$, where $h(x)=-f(x)Q(x)f(x)^T\nabla v(x)\in\mathbb{R}^n$. For such a class of large-scale optimization problems, the term $h(x)$ is the same for any given $c_0\in\mathbb{R}^m$. In order to make the method efficient, the function $h(x)$ can be derived offline first and realized by analog hardware. As shown in Fig. 3(a), $h(\cdot)$ is a block that takes $x$ as an input and produces $h(x)$ as an output; therefore, $h(x)$ depends on $x$, but $h(\cdot)$ does not. Thanks to the recursive form, as shown in Fig. 3(b), the realization of the proposed projection matrix $f(\cdot)$ can reuse the element $F(\cdot,\cdot)$ repeatedly, which makes the realization easier; the realization of $F(\cdot,\cdot)$ is shown in Fig. 3(c). In practice, the term $h(x)$ can be approximated offline by a static and simple neural network $n_n(x)$ (simpler than $h(x)$, at least) if only a bounded set $x\in\mathcal{D}$ is considered, resulting in $\dot{x}=-\rho\nabla c(x)c(x)+n_n(x)$. Although an approximation error exists, the solution can still converge to the constraints because of the correction term $-\rho\nabla c(x)c(x)$.

Fig. 3. Block diagram realization of (a) (13), (b) the new projection matrix $f(\cdot)$, (c) the element $F(\cdot,\cdot)$, and (d) a multiplier and sum.

V. ILLUSTRATIVE EXAMPLES

Three examples are given. The first example mainly shows that, in the presence of singularity, a commonly used method fails to find the minimum, whereas the proposed method does not. The second example mainly shows that the proposed projection matrix outperforms the modified commonly used projection matrix. The third example mainly shows how the proposed method is applied to a practical problem. Two of the three examples also show the advantage in running time.
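Before turning to the paper's examples, here is the sketch of the modified system (13) referenced in Section IV-C (assumptions of this edit: NumPy/SciPy, the same toy circle problem used earlier, $Q(x)\equiv I$, $\rho=5$, and a smoothed $\delta$), started from an infeasible point to show the self-correcting behavior:

```python
# Sketch of the modified closed-loop system (13) with the correction term -rho*grad_c*c.
# Assumptions: NumPy/SciPy; toy problem v(x) = x1 + x2, c(x) = x1^2 + x2^2 - 1; Q = I; rho = 5.
import numpy as np
from scipy.integrate import solve_ivp

rho = 5.0

def c(x):        return x[0]**2 + x[1]**2 - 1.0
def grad_c(x):   return 2.0 * x
def grad_v(x):   return np.array([1.0, 1.0])

def f_pro(x):
    g = grad_c(x)
    s = float(g @ g)
    d = np.exp(-10.0 * abs(s))                    # smoothed delta, as in Section V
    return np.eye(2) - np.outer(g, g) / (d + s)

def rhs(t, x):
    F = f_pro(x)
    return -rho * grad_c(x) * c(x) - F @ (F.T @ grad_v(x))     # (13)

sol = solve_ivp(rhs, (0.0, 50.0), np.array([2.0, -1.5]), rtol=1e-9, atol=1e-12)  # infeasible x0
print(sol.y[:, -1], c(sol.y[:, -1]))              # ends near [-1/sqrt(2), -1/sqrt(2)], c ~ 0
```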


TABLE I
RESULT FOR EXAMPLE IN SECTION V-A

A. Estimate of Attraction Domain

For a given Lyapunov function, the crucial step in any procedure for estimating the attraction domain is determining the optimal estimate. Consider the system of differential equations
$$\dot{x}=Ax+g(x) \qquad (14)$$
where $x\in\mathbb{R}^n$ is the state vector, $A\in\mathbb{R}^{n\times n}$ is a Hurwitz matrix, and $g:\mathbb{R}^n\to\mathbb{R}^n$ is a vector function. Let $v(x)=x^TPx$ be a given quadratic Lyapunov function for the origin of (14), i.e., $P\in\mathbb{R}^{n\times n}$ is a positive-definite matrix such that $A^TP+PA<0_{n\times n}$. Then, the largest ellipsoidal estimate of the attraction domain of the origin can be computed via the following equality-constrained optimization problem [17]:
$$\min_{x\in\mathbb{R}^n\setminus\{0_{n\times 1}\}} x^TPx,\quad \text{s.t.}\ x^TP[Ax+g(x)]=0.$$
Since $\{x\in\mathbb{R}^n\mid x^TPx\le x_0^TPx_0\}$ is bounded, the subset $\mathcal{K}=\{x\in\mathbb{R}^n\mid x^TPx\le x_0^TPx_0,\ x^TP[Ax+g(x)]=0\}$ is bounded no matter what $g$ is. For simplicity, consider (14) with $x=[x_1\ x_2]^T\in\mathbb{R}^2$, $A=-I_2$, $P=I_2$, and $g(x)=(c(x)+1)[x_1\ x_2]^T$, where $c(x)=(x_1+x_2+2)\big((x_2+1)-0.1(x_1+1)^2\big)$. Then, the optimization problem is formulated as
$$\min_{x\in\mathbb{R}^2\setminus\{0_{2\times 1}\}} \big(x_1^2+x_2^2\big),\quad \text{s.t.}\ \big(x_1^2+x_2^2\big)c(x)=0.$$
Since $x\neq 0_{2\times 1}$, the problem is further formulated as
$$\min_{x\in\mathbb{R}^2} v(x)=x_1^2+x_2^2,\quad \text{s.t.}\ c(x)=0.$$
Then
$$\nabla v(x)=[2x_1\ \ 2x_2]^T,\qquad \nabla c(x)=\begin{bmatrix} d_2-0.1d_1^2-0.2d_1d_3\\ d_2-0.1d_1^2+d_3\end{bmatrix}$$
where $d_1=x_1+1$, $d_2=x_2+1$, and $d_3=x_1+x_2+2$.

In this example, the modified dynamical system (13) is adopted, where $f$ is chosen as $f_{\mathrm{pro}}$ in (6) with $\delta(x)=e^{-\gamma|x|}$, $\gamma=10$, and $\rho=Q=20/\|\nabla c\,c-f_{\mathrm{pro}}f_{\mathrm{pro}}^T\nabla v\|$. It is then solved using the MATLAB function ode45 with variable step size. (In this section, all computation is performed by MATLAB 6.5 on a personal computer (Asus X8AI) with an Intel Core 2 Duo processor at 2.2 GHz.) The comparison with the MATLAB constrained nonlinear multivariable function fmincon is given in Table I. The point $x_s=[-1\ -1]^T$ is a singular point, at which $\nabla c(x_s)=[0\ 0]^T$. As shown in Table I, under the initial points $[-3\ 1]^T\in\mathcal{F}$ and $[2\ -4]^T\in\mathcal{F}$, the MATLAB function finds the singular point instead of the minimum, whereas the proposed method still finds the minimum. Under the initial point $[1\ -4]^T\notin\mathcal{F}$, the proposed method can still find the minimum, similar to the MATLAB function. Under a different initial value, the evolutions of (13) are shown in Fig. 4. As shown, once close to the singular point $[-1\ -1]^T$, the solutions of (13) change direction and then move to the minimum $x_l^*=[0.2061\ -0.8545]^T$. Compared with the discrete optimization methods offered by MATLAB, these results show that the proposed method avoids convergence to a singular point. Moreover, the proposed method is comparable with currently available conventional discrete optimization methods and facilitates even faster convergence, which is consistent with the conclusions in [1] and [3]. The realization by analog hardware is shown in Fig. 5.

Fig. 4. Optimization of the estimate of attraction domain. Solid line: solution evolution. Dotted line: constraint evolution. Dashed-dotted line: objective evolution.

Fig. 5. Block diagram realization for the example in Section V-A. (a) Block diagram realization. (b) Block diagram realization of the new projection matrix.

B. Optimization With Minimum Being Singular Point

Consider the following equality-constrained optimization problem:
$$\min_{x\in\mathbb{R}^3} v(x)=x_1^2+x_2^2+\Big(x_3-\frac{1}{\sqrt{3}}\Big)^2$$
$$\text{s.t.}\ c_1(x)=x_1+x_2+x_3-1=0,\quad c_2(x)=x_1+x_2+x_3^3-1+\frac{1}{\sqrt{3}}-\frac{1}{(\sqrt{3})^3}=0.$$
The minimum is
$$x^*=\Big[\frac{1}{2}\Big(1-\frac{1}{\sqrt{3}}\Big)\ \ \frac{1}{2}\Big(1-\frac{1}{\sqrt{3}}\Big)\ \ \frac{1}{\sqrt{3}}\Big]^T.$$
Meanwhile, the gradients of the constraints are
$$\nabla c_1=[1\ 1\ 1]^T,\qquad \nabla c_2(x)=[1\ 1\ 3x_3^2]^T.$$
Obviously, $\nabla c_1(x^*)=\nabla c_2(x^*)$, namely, the minimum is also a singular point. Therefore, the closer the solution is to the minimum, the more singular the projection matrix $f_{\mathrm{com}}(x)$ in (3) becomes. Hence, the modified projection matrix $f_{\mathrm{mcom}}$ in (4) is considered. The proposed projection matrix is constructed as follows. By (7), $f_1$ is obtained as
$$f_1=I_3-\frac{\nabla c_1\nabla c_1^T}{\delta(\|\nabla c_1\|^2)+\|\nabla c_1\|^2}=\begin{bmatrix} \frac{2}{3} & -\frac{1}{3} & -\frac{1}{3}\\ -\frac{1}{3} & \frac{2}{3} & -\frac{1}{3}\\ -\frac{1}{3} & -\frac{1}{3} & \frac{2}{3}\end{bmatrix}$$
where $\delta(\|\nabla c_1\|^2)=0$ since $\nabla c_1$ is constant and nonzero. Furthermore, the proposed projection matrix is written as
$$f_{\mathrm{pro}}(x)=f_1-\frac{f_1f_1^T\nabla c_2(x)\nabla c_2^T(x)f_1}{\varepsilon+\|f_1^T\nabla c_2(x)\|^2} \qquad (15)$$
where $\delta(\|f_1^T\nabla c_2(x)\|^2)$ is replaced by $\varepsilon$, the same as in $f_{\mathrm{mcom}}$, for comparison. In particular, since $\nabla c_2^T(x^*)f_1=0_{1\times 3}$ at the singular point, one has $f_{\mathrm{pro}}(x^*)=f_1$, which still satisfies $\nabla c(x^*)^Tf_{\mathrm{pro}}(x^*)=0_{2\times 3}$ no matter what $\varepsilon$ is.

In this example, the modified dynamical system (13) is adopted, where $x_0=[-3\ -2\ 1/\sqrt{3}]^T$, $\gamma=10$, $\rho=Q=20/\|\nabla c\,c-ff^T\nabla v\|$, and $f=f_{\mathrm{mcom}}$ and $f_{\mathrm{pro}}$, respectively. The differential equation (13) is solved using the MATLAB function ode45 with variable step size. Denote by $x_{\mathrm{est}}(k)$ the stable solution when $\varepsilon=10^{-k}$, $k=1,\ldots,10$. The results are shown in Fig. 6, where $c(x)=[c_1(x)\ c_2(x)]^T$. As shown in Fig. 6 (first subplot), the solutions obtained using $f_{\mathrm{pro}}$ are closer to the minimum than those obtained using $f_{\mathrm{mcom}}$. Moreover, as shown in Fig. 6 (second subplot), the solutions obtained using $f_{\mathrm{pro}}$ also satisfy the constraints better. These observations can be explained by the fact that, at the singular point, $\nabla c(x^*)^Tf_{\mathrm{pro}}(x^*)=0_{2\times 3}$, whereas $\nabla c(x^*)^Tf_{\mathrm{mcom}}(x^*)\neq 0_{2\times 3}$. Finally, it is observed that the advantage of the proposed projection matrix holds if $\varepsilon\ge 10^{-4}$; this is also useful because a larger $\varepsilon$ implies an easier realization by lower-precision hardware, and vice versa. A numerical check of the two projection matrices at the singular point is sketched below.

Fig. 6. Optimization by applying the two projection matrices $f_{\mathrm{mcom}}$ and $f_{\mathrm{pro}}$ to (13), respectively.
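As referenced above, the comparison at the singular point can be checked directly. The following minimal sketch (assumptions of this edit: NumPy and $\varepsilon=10^{-4}$) evaluates $f_{\mathrm{mcom}}$ from (4) and $f_{\mathrm{pro}}$ from (15) at $x^*$ and prints $\|\nabla c(x^*)^Tf\|$ for each:

```python
# Sketch of the Section V-B comparison at the singular minimum x* (assumptions: NumPy, eps = 1e-4).
import numpy as np

eps = 1e-4
x_star = np.array([0.5 * (1 - 1 / np.sqrt(3)), 0.5 * (1 - 1 / np.sqrt(3)), 1 / np.sqrt(3)])

def grad_c(x):
    return np.column_stack(([1.0, 1.0, 1.0], [1.0, 1.0, 3.0 * x[2] ** 2]))

def f_mcom(G):                                    # eq. (4)
    n, m = G.shape
    return np.eye(n) - G @ np.linalg.inv(eps * np.eye(m) + G.T @ G) @ G.T

def f_pro(x):                                     # eq. (15): f1, then one recursion step
    g1 = np.array([1.0, 1.0, 1.0])
    f1 = np.eye(3) - np.outer(g1, g1) / (g1 @ g1)
    g2 = f1.T @ grad_c(x)[:, 1]
    return f1 - (f1 @ np.outer(g2, g2)) / (eps + g2 @ g2)

G = grad_c(x_star)
print(np.linalg.norm(G.T @ f_mcom(G)))            # nonzero: the constraint directions leak through
print(np.linalg.norm(G.T @ f_pro(x_star)))        # ~0: f_pro(x*) = f1 annihilates both gradients
```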

C. Estimate of Rotation Matrix

1) Problem Formulation: For simplicity, assume that images are taken by two identical pinhole cameras with focal length equal to one. As shown in Fig. 7, the two cameras are specified by the camera centers $C_1,C_2\in\mathbb{R}^3$ and the attached orthogonal camera frames $\{e_1,e_2,e_3\}$ and $\{e_1',e_2',e_3'\}$, respectively. Denote by $T=C_2-C_1\in\mathbb{R}^3$ the translation from the first camera to the second and by $R\in\mathbb{R}^{3\times 3}$ the rotation matrix from the basis vectors $\{e_1,e_2,e_3\}$ to $\{e_1',e_2',e_3'\}$, expressed with respect to the basis $\{e_1,e_2,e_3\}$. Then, it is well known in the computer vision literature [18] that two corresponding image points are represented as
$$m_{1,k}=\frac{1}{M_k(3)}M_k,\qquad m_{2,k}=\frac{1}{M_k'(3)}M_k',\qquad k=1,2,\ldots,N \qquad (16)$$
where $M_k$ and $M_k'$ represent the positions of the $k$th point expressed in the two camera frames $\{e_1,e_2,e_3\}$ and $\{e_1',e_2',e_3'\}$, respectively, and $M_k(3)$ and $M_k'(3)$ represent the third elements of the vectors $M_k$, $M_k'$, respectively. They have the relationship $M_k=RM_k'+T$, $k=1,2,\ldots,N$. These corresponding image points satisfy the epipolar constraints [18, p. 257]
$$m_{1,k}^TE\,m_{2,k}=0,\qquad k=1,2,\ldots,N \qquad (17)$$
where $E=[T]_{\times}R$ is known as the essential matrix. Using the direct product $\otimes$ and the $\mathrm{vec}(\cdot)$ operation, the equations in (17) are equivalent to
$$A\varphi=0_{N\times 1} \qquad (18)$$
where
$$A=\begin{bmatrix} m_{2,1}^T\otimes m_{1,1}^T\\ \vdots\\ m_{2,N}^T\otimes m_{1,N}^T\end{bmatrix}\in\mathbb{R}^{N\times 9},\qquad \varphi=\mathrm{vec}([T]_{\times}R). \qquad (19)$$

Fig. 7. Epipolar geometry.

In practice, the image points $m_{1,k}$ and $m_{2,k}$ are subject to noise, $k=1,2,\ldots,N$. Therefore, $T$ and $R$ are often solved for via the following optimization problem:
$$\min_{x\in\mathbb{R}^{12}} v(x)=\frac{1}{2}\varphi(x)^TA^TA\varphi(x)$$
$$\text{s.t.}\ \frac{1}{2}\big(\|T\|^2-1\big)=0,\qquad \frac{1}{2}\big(R^TR-I_3\big)=0_{3\times 3} \qquad (20)$$
where $x=[T^T\ \mathrm{vec}(R)^T]^T\in\mathbb{R}^{12}$. This is the equality-constrained optimization considered here. In the following, the proposed method is applied to the optimization problem (20).

2) Projection Matrix: By Theorem 2, the projection matrix for the constraint $\frac{1}{2}(\|T\|^2-1)=0$ is
$$f_{\mathrm{pro},1}=I_3-\frac{TT^T}{\delta(\|T\|^2)+\|T\|^2}.$$
Since $\|T\|^2=1$ has to be satisfied exactly or approximately, $\delta(\|T\|^2)=0$. Therefore, the projection matrix for this constraint is
$$f_{\mathrm{pro},1}=I_3-TT^T/\|T\|^2.$$
Then, the constraint is transformed into
$$\dot{T}=\big(I_3-TT^T/\|T\|^2\big)u_1$$
where $u_1\in\mathbb{R}^3$. The constraint $\frac{1}{2}(R^TR-I_3)=0_{3\times 3}$ is transformed into
$$\dot{R}=[u_2]_{\times}R \qquad (21)$$
where $u_2\in\mathbb{R}^3$. If so, then
$$\frac{d}{dt}\big(R^TR\big)=R^T\dot{R}+\dot{R}^TR=R^T\big([u_2]_{\times}+[u_2]_{\times}^T\big)R=0_{3\times 3}.$$
Furthermore, (21) is rewritten as $\mathrm{vec}(\dot{R})=(R^T\otimes I_3)Hu_2$. Then, the continuous-time dynamical system, whose solutions always satisfy the equality constraints $\frac{1}{2}(\|T\|^2-1)=0$ and $\frac{1}{2}(R^TR-I_3)=0_{3\times 3}$, is expressed as (2) with
$$f_{\mathrm{pro}}(x)=\begin{bmatrix} f_{\mathrm{pro},1} & 0_{3\times 3}\\ 0_{9\times 3} & (R^T\otimes I_3)H\end{bmatrix}\in\mathbb{R}^{12\times 6},\qquad u=\begin{bmatrix} u_1\\ u_2\end{bmatrix}\in\mathbb{R}^6. \qquad (22)$$
If the initial value satisfies $\|T(0)\|^2=1$ and $R(0)^TR(0)=I_3$, then all solutions of (2) satisfy the equality constraints.
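The constraint-preserving property behind (22) is easy to verify numerically. The sketch below (assumptions of this edit: NumPy/SciPy and an arbitrary smooth input $u(t)$ chosen for the test) integrates $\dot{T}=(I_3-TT^T/\|T\|^2)u_1$ and $\dot{R}=[u_2]_{\times}R$ and checks that $\|T\|^2=1$ and $R^TR=I_3$ are preserved:

```python
# Sketch verifying the transformed dynamics behind (22): trajectories keep ||T|| = 1 and R^T R = I.
# Assumptions: NumPy/SciPy; the input u(t) is an arbitrary smooth signal chosen for the test.
import numpy as np
from scipy.integrate import solve_ivp

def skew(w):
    # [w]_x such that [w]_x y = w x y
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def rhs(t, s):
    T, R = s[:3], s[3:].reshape(3, 3, order="F")          # x = [T; vec(R)], column-major vec
    u1 = np.array([np.sin(t), np.cos(t), 0.3])
    u2 = np.array([0.2, -0.5 * np.sin(2 * t), 0.1])
    T_dot = (np.eye(3) - np.outer(T, T) / (T @ T)) @ u1    # tangent to the unit sphere
    R_dot = skew(u2) @ R                                   # eq. (21)
    return np.concatenate([T_dot, R_dot.flatten(order="F")])

s0 = np.concatenate([np.array([0.0, 0.0, 1.0]), np.eye(3).flatten(order="F")])
sol = solve_ivp(rhs, (0.0, 10.0), s0, rtol=1e-10, atol=1e-12)
T, R = sol.y[:3, -1], sol.y[3:, -1].reshape(3, 3, order="F")
print(abs(T @ T - 1.0), np.linalg.norm(R.T @ R - np.eye(3)))   # both remain near zero
```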


TABLE II
RESULT FOR EXAMPLE IN SECTION V-C

3) Update Design: Since $\nabla v(x)=[(R^T\otimes I_3)H\ \ I_3\otimes[T]_{\times}]^TA^TA\varphi$, the time derivative of $v(x)$ along the solutions of (2) is
$$\dot{v}(x)=-\varphi^TA^TA\,\Psi(x)^TQ(x)\Psi(x)\,A^TA\varphi\le 0$$
where
$$\Psi(x)=\begin{bmatrix} \big(I_3-TT^T/\|T\|^2\big)^TH^T(R^T\otimes I_3)^T\\ H^T(R^T\otimes I_3)^T(I_3\otimes[T]_{\times})^T\end{bmatrix}\in\mathbb{R}^{6\times 9}.$$
The simplest choice of $Q(x)$ is $Q(x)\equiv I_6$. In this case, the eigenvalues of the matrix $A\Psi(x)^T\Psi(x)A^T$ are often ill-conditioned, namely, $\lambda_{\min}(A\Psi(x)^T\Psi(x)A^T)\ll\lambda_{\max}(A\Psi(x)^T\Psi(x)A^T)$. The convergence rates of the components of $A\varphi(x)$ depend on the eigenvalues of $A\Psi(x)^TQ(x)\Psi(x)A^T$. As a consequence, some components of $A\varphi$ converge fast, while others may converge slowly, which leads to poor asymptotic performance of the closed-loop system. It is expected that each component of $A\varphi$ converges at the same speed as far as possible. Suppose that there exists $\bar{Q}(x)$ such that
$$A\Psi(x)^T\bar{Q}(x)\Psi(x)A^T=I_9.$$

Then
$$\dot{v}(x)=-\varphi^TA^TA\varphi\le 0.$$
By Theorem 4, $x$ will approach the set $\{x\in\mathbb{R}^n\mid A\varphi(x)=0_{N\times 1}\}$, each element of which is a global minimum since $v(x)=0$ on this set. Moreover, each component of $A\varphi$ converges at a similar speed. However, it is difficult to obtain such a $\bar{Q}(x)$, since the number of degrees of freedom of $\bar{Q}(x)\in\mathbb{R}^{6\times 6}$ is less than the number of elements of $I_9$. A modified way is to make $A\Psi(x)^TQ(x)\Psi(x)A^T\approx I_9$. A natural choice is
$$Q(x)=\mu\big(\Psi(x)A^TA\Psi(x)^T\big)^{\dagger}+\epsilon I_6 \qquad (23)$$
where $\mu>0$ and $(\Psi(x)A^TA\Psi(x)^T)^{\dagger}$ denotes the Moore–Penrose inverse of $\Psi(x)A^TA\Psi(x)^T$. The matrix $\epsilon I_6$ makes $Q(x)$ positive definite, where $\epsilon$ is a small positive real number. With this choice, $(\Psi(x)A^TA\Psi(x)^T)^{\dagger}$ needs to be computed at every step, which costs much time; a time-saving way is to update $Q(x)$ at a reasonable interval. Then, (11) becomes
$$\dot{x}=-f_{\mathrm{pro}}(x)Q(x)\Psi(x)A^TA\varphi(x) \qquad (24)$$

where $f_{\mathrm{pro}}(x)$ is defined in (22) and $Q(x)$ is defined in (23). The differential equation can be solved by Runge–Kutta methods. The solutions of (24) satisfy the constraints, where $x=[T^T\ \mathrm{vec}(R)^T]^T$. Moreover, the dynamical system will eventually reach some final resting state.

4) Simulation: Suppose that there exist six points in the field of view, whose positions are expressed in the first camera frame as follows: 1) $M_1=[-1\ 1\ 1]^T$; 2) $M_2=[2\ 0\ 1]^T$; 3) $M_3=[1\ -1\ 1]^T$; 4) $M_4=[-1\ -1\ 1]^T$; 5) $M_5=[1\ 1\ 1]^T$; and 6) $M_6=[-1\ 3\ 1]^T$. Compared with the first camera frame, the second camera frame has translated and rotated with
$$\bar{T}=\begin{bmatrix}1\\ 1\\ -1\end{bmatrix},\qquad \bar{R}=\begin{bmatrix}0.9900 & -0.0894 & 0.1088\\ 0.0993 & 0.9910 & -0.0894\\ -0.0998 & 0.0993 & 0.9900\end{bmatrix}.$$
The image points are generated by (16). Using the generated image points, $A$ is obtained by (19). Set the initial values as follows: $T(0)=[0\ 0\ 1]^T$, $R(0)=I_3$, $\mu=20$, and $\epsilon=0.2$. The differential equation (24) is solved using the MATLAB function ode45 with variable step size. Compared with the MATLAB constrained nonlinear multivariable function fmincon, the comparisons in Table II are obtained. As shown in Table II, the proposed method requires less time to achieve a higher accuracy. Given that $v(x^*)=0$, the solution is a global minimum. The evolution of each element of $x$ is shown in Fig. 8; the state eventually reaches a resting state at a similar speed. Several other simulations with different initial values were also implemented, and, based on the results, the proposed algorithm meets expectations.

Fig. 8. Evolvement of the state.

VI. CONCLUSION

A method for continuous-time, equality-constrained optimization based on a new projection matrix has been proposed for the determination of local minima. With the transformation of the equality constraint into a continuous-time dynamical system, the class of equality-constrained optimization problems is formulated as a control problem. The resultant method is more general than the existing control-theoretic approaches. Thus, the proposed method serves as a potential bridge between the optimization and control theories. Compared with other standard discrete-time methods, the proposed method avoids convergence to a singular point and facilitates faster convergence through numerical integration on a digital computer.

APPENDIX

A. Kronecker Product, vec(·) Operation, and Skew-Symmetric Matrix

The symbol $\mathrm{vec}(X)$ denotes the column vector obtained by stacking the second column of $X$ under the first, then the third, and so on. With $X=[x_{ij}]\in\mathbb{R}^{n\times m}$, the Kronecker product $X\otimes Y$ is the matrix
$$X\otimes Y=\begin{bmatrix} x_{11}Y & \cdots & x_{1m}Y\\ \vdots & \ddots & \vdots\\ x_{n1}Y & \cdots & x_{nm}Y\end{bmatrix}.$$
The relation $\mathrm{vec}(XYZ)=(Z^T\otimes X)\mathrm{vec}(Y)$ holds [19, p. 318]. The cross product of two vectors $x\in\mathbb{R}^3$ and $y\in\mathbb{R}^3$ is denoted by $x\times y=[x]_{\times}y$, where the symbol $[\cdot]_{\times}:\mathbb{R}^3\to\mathbb{R}^{3\times 3}$ is defined as [21, p. 194]
$$[x]_{\times}\triangleq\begin{bmatrix} 0 & -x_3 & x_2\\ x_3 & 0 & -x_1\\ -x_2 & x_1 & 0\end{bmatrix}\in\mathbb{R}^{3\times 3}.$$
By the definition of $[x]_{\times}$, $x\times x=[x]_{\times}x=0_{3\times 1}$, $\forall x\in\mathbb{R}^3$, and $\mathrm{vec}([x]_{\times})=Hx$ with
$$H=\begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 1 & 0 & -1 & 0\\ 0 & 0 & -1 & 0 & 0 & 0 & 1 & 0 & 0\\ 0 & 1 & 0 & -1 & 0 & 0 & 0 & 0 & 0\end{bmatrix}^T\in\mathbb{R}^{9\times 3}.$$
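These identities are straightforward to check numerically. The sketch below (an assumption of this edit: NumPy, with $\mathrm{vec}$ implemented as column-major flattening) verifies $\mathrm{vec}(XYZ)=(Z^T\otimes X)\mathrm{vec}(Y)$ and $\mathrm{vec}([x]_{\times})=Hx$ on random data:

```python
# Sketch checking the Appendix A identities (assumption: NumPy; vec = column-major stacking).
import numpy as np

rng = np.random.default_rng(0)

def vec(M):
    return M.flatten(order="F")                  # stack columns

# vec(X Y Z) = (Z^T kron X) vec(Y)
X, Y, Z = rng.normal(size=(3, 4)), rng.normal(size=(4, 5)), rng.normal(size=(5, 2))
lhs = vec(X @ Y @ Z)
rhs = np.kron(Z.T, X) @ vec(Y)
print(np.max(np.abs(lhs - rhs)))                 # ~ 1e-15

# vec([x]_x) = H x with H in R^{9x3}
H = np.array([[0, 0, 0, 0, 0, 1, 0, -1, 0],
              [0, 0, -1, 0, 0, 0, 1, 0, 0],
              [0, 1, 0, -1, 0, 0, 0, 0, 0]], dtype=float).T

def skew(x):
    return np.array([[0.0, -x[2], x[1]],
                     [x[2], 0.0, -x[0]],
                     [-x[1], x[0], 0.0]])

x = rng.normal(size=3)
print(np.max(np.abs(vec(skew(x)) - H @ x)))      # ~ 0
```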

B. Proof of Theorem 1

Since $\mathcal{F}_u(x_0)\subseteq\mathcal{F}$, the remaining task is to prove $\mathcal{F}\subseteq\mathcal{F}_u(x_0)$, $\forall x_0\in\mathcal{F}$, namely, for any $\bar{x}\in\mathcal{F}$, there exists a control input $u\in\mathbb{R}^l$ that can transfer any initial state $x_0\in\mathcal{F}$ to $\bar{x}$. Since $x_0,\bar{x}\in\mathcal{F}$, there exist $u_0,\bar{u}\in\mathbb{R}^l$ such that $\bar{x}=A^{\perp}\bar{u}$ and $x(0)=A^{\perp}u_0$ by the definition of $A^{\perp}$. Design a control input
$$u(t)=\begin{cases}\dfrac{1}{\bar{t}}(\bar{u}-u_0), & 0\le t\le\bar{t}\\[4pt] 0_{l\times 1}, & t>\bar{t}.\end{cases}$$
With the control input above, it holds that
$$x(t)-x(0)=\int_0^t A^{\perp}u(s)\,ds=\int_0^{\bar{t}} A^{\perp}u(s)\,ds=A^{\perp}\bar{u}-A^{\perp}u_0$$
when $t\ge\bar{t}$. Then, $x(t)=\bar{x}$, $t\ge\bar{t}$. Hence, $\mathcal{F}\subseteq\mathcal{F}_u(x_0)$, $\forall x_0\in\mathcal{F}$. Consequently, $\mathcal{F}=\mathcal{F}_u(x_0)$, $\forall x_0\in\mathcal{F}$.

C. Proof of Lemma 1

Since
$$\delta(\|L\|^2)+\|L\|^2=\begin{cases}1, & \text{if } L=0_{n\times 1}\\ \|L\|^2, & \text{if } L\neq 0_{n\times 1}\end{cases}$$
one has $\delta(\|L\|^2)+\|L\|^2\neq 0$, $\forall L\in\mathbb{R}^n$. According to this, the following relationship holds:
$$L^T\Big(I_n-\frac{LL^T}{\delta(\|L\|^2)+\|L\|^2}\Big)=L^T-\frac{L^T\|L\|^2}{\delta(\|L\|^2)+\|L\|^2}\equiv 0,\quad \forall L\in\mathbb{R}^n.$$
This implies that $L^Tz=0$, $\forall z\in\mathcal{W}_2$, namely, $\mathcal{W}_2\subseteq\mathcal{W}_1$. On the other hand, any $z\in\mathcal{W}_1$ can be rewritten as $z=\big(I_n-LL^T/(\delta(\|L\|^2)+\|L\|^2)\big)z$, where $L^Tz=0$ is utilized. Hence, $\mathcal{W}_1\subseteq\mathcal{W}_2$. Consequently, $\mathcal{W}_1=\mathcal{W}_2$.

D. Proof of Theorem 3

The proof is done by mathematical induction. Denote
$$\mathcal{V}_1^j=\{z\in\mathbb{R}^n\mid \nabla c_i^Tz=0,\ i=1,\ldots,j\},\qquad \mathcal{V}_2^j=\{z\in\mathbb{R}^n\mid z=f_ju_j,\ u_j\in\mathbb{R}^n\},\qquad j\le m.$$
First, by Theorem 2, it is easy to see that the conclusions are satisfied for $j=1$. Suppose that $\mathcal{V}_1^{k-1}=\mathcal{V}_2^{k-1}$ holds; it remains to prove that $\mathcal{V}_1^k=\mathcal{V}_2^k$ holds, which concludes the proof. By $\mathcal{V}_1^{k-1}(x)=\mathcal{V}_2^{k-1}(x)$, one has
$$\mathcal{V}_1^k=\{z\in\mathbb{R}^n\mid \nabla c_k^Tz=0,\ z\in\mathcal{V}_1^{k-1}\}=\{z\in\mathbb{R}^n\mid \nabla c_k^Tz=0,\ z=f_{k-1}u_{k-1},\ u_{k-1}\in\mathbb{R}^n\}=\{z\in\mathbb{R}^n\mid \nabla c_k^Tf_{k-1}u_{k-1}=0,\ z=f_{k-1}u_{k-1},\ u_{k-1}\in\mathbb{R}^n\}.$$
On the other hand, by Lemma 1, one has
$$\nabla c_k^Tf_{k-1}u_{k-1}=0\ \Longleftrightarrow\ u_{k-1}=\Big(I_n-\frac{f_{k-1}^T\nabla c_k\,\nabla c_k^Tf_{k-1}}{\delta\big(\|f_{k-1}^T\nabla c_k\|^2\big)+\|f_{k-1}^T\nabla c_k\|^2}\Big)u_k$$
namely,
$$\mathcal{V}_1^k=\mathcal{V}_2^k=\{z\in\mathbb{R}^n\mid z=f_ku_k,\ u_k\in\mathbb{R}^n\}$$
where
$$f_k=f_{k-1}\Big(I_n-\frac{f_{k-1}^T\nabla c_k\,\nabla c_k^Tf_{k-1}}{\delta\big(\|f_{k-1}^T\nabla c_k\|^2\big)+\|f_{k-1}^T\nabla c_k\|^2}\Big).$$

E. Proof of the Propositions in Theorem 4

The proof of Theorem 4 uses three propositions, whose proofs are as follows.

1) Proof of Proposition 1: In the space $\mathbb{R}^n$, the set $\mathcal{K}$ is compact if and only if it is bounded and closed [20, p. 41, Th. 8.2]. Hence, the remaining task is to prove that $\mathcal{K}$ is closed. Suppose, to the contrary, that $\mathcal{K}$ is not closed. Then, there exists a sequence $x(t_n)\in\mathcal{K}$ with $x(t_n)\to p\notin\mathcal{K}$ as $t_n\to\infty$. However, $v(p)=\lim_{t_n\to\infty}v(x(t_n))\le v(x_0)$ and $c(p)=\lim_{t_n\to\infty}c(x(t_n))=0_{m\times 1}$, which imply $p\in\mathcal{K}$. The contradiction implies that $\mathcal{K}$ is closed. Hence, the set $\mathcal{K}$ is compact. By (10), $v(x)\le v(x_0)$ along (11), $t\ge 0$. By Assumption 1, all solutions of (11) satisfy $c(x)=0_{m\times 1}$. Therefore, $\mathcal{K}$ is positively invariant with respect to (11).

2) Proof of Proposition 2: Since $\mathcal{K}$ is compact and positively invariant with respect to (11), by Theorem 4.4 (the invariance principle) in [16, p. 128], the solution of (11) starting at $x_0$ approaches the set where $\dot{v}(x)=0$, namely, $\nabla v(x)^Tf(x)=0$. In addition, since (11) becomes $\dot{x}=0_{n\times 1}$ in $\mathcal{S}$, the solution approaches a constant vector $x_l^*\in\mathcal{S}$.

3) Proof of Proposition 3: Since $\mathcal{V}_1(x_l^*)=\mathcal{V}_2(x_l^*)$ and $x_l^*\in\mathcal{S}$ satisfies the two equalities $\nabla v(x_l^*)^Tf(x_l^*)=0_{1\times l}$ and $c(x_l^*)=0_{m\times 1}$, for any $z\in\mathcal{V}_1(x_l^*)$ there exists $u$ such that $z=f(x_l^*)u$. As a consequence, for any $z\in\mathcal{V}_1(x_l^*)$, $\nabla v(x_l^*)^Tz=\nabla v(x_l^*)^Tf(x_l^*)u=0$. Then there must exist $\lambda_i^*\in\mathbb{R}$, $i=1,\ldots,m$, such that $\nabla v(x_l^*)=\sum_{i=1}^{m}\lambda_i^*\nabla c_i(x_l^*)$; otherwise, there would exist $\bar{z}\in\mathcal{V}_1(x_l^*)$ with $\nabla v(x_l^*)^T\bar{z}\neq 0$. Therefore, $x_l^*\in\mathcal{S}$ is a KKT point [15, p. 328]. Furthermore, by Theorem 12.6 in [15, p. 345], $x_l^*$ is a strict local minimum if $z^T\nabla_{xx}L(x_l^*,\lambda^*)z>0$ for all $z\in\mathcal{V}_1(x_l^*)$, $z\neq 0$.

REFERENCES

[1] K. Tanabe, "A geometric method in nonlinear programming," J. Optim. Theory Appl., vol. 30, no. 2, pp. 181–210, Feb. 1980.
[2] H. Yamashita, "A differential equation approach to nonlinear programming," Math. Program., vol. 18, no. 1, pp. 155–168, Dec. 1980.
[3] A. A. Brown and M. C. Bartholomew-Biggs, "ODE versus SQP methods for constrained optimization," J. Optim. Theory Appl., vol. 62, no. 3, pp. 371–386, Sep. 1989.
[4] S. Zhang and A. G. Constantinides, "Lagrange programming neural networks," IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 39, no. 7, pp. 441–452, Jul. 1992.
[5] Z.-G. Hou, "A hierarchical optimization neural network for large-scale dynamic systems," Automatica, vol. 37, no. 12, pp. 1931–1940, Dec. 2001.
[6] L.-Z. Liao, H. Qi, and L. Qi, "Neurodynamical optimization," J. Global Optim., vol. 28, no. 2, pp. 175–195, 2004.
[7] Y. Xia and J. Wang, "A recurrent neural network for solving nonlinear convex programs subject to linear constraints," IEEE Trans. Neural Netw., vol. 16, no. 2, pp. 379–386, Mar. 2005.
[8] M. P. Barbarosou and N. G. Maratos, "A nonfeasible gradient projection recurrent neural network for equality-constrained optimization problems," IEEE Trans. Neural Netw., vol. 19, no. 10, pp. 1665–1677, Oct. 2008.
[9] Y. Xia, G. Feng, and J. Wang, "A novel recurrent neural network for solving nonlinear optimization problems with inequality constraints," IEEE Trans. Neural Netw., vol. 19, no. 8, pp. 1340–1353, Aug. 2008.

[10] S. Qin and X. Xue, "A two-layer recurrent neural network for nonsmooth convex optimization problems," IEEE Trans. Neural Netw. Learn. Syst., vol. 26, no. 6, pp. 1149–1160, May 2015.
[11] P.-A. Absil, "Computation with continuous-time dynamical systems," in Proc. Grand Challenge Non-Classical Comput. Int. Workshop, York, U.K., Apr. 2005, pp. 18–19.
[12] J. J. Hopfield and D. W. Tank, "'Neural' computation of decisions in optimization problems," Biol. Cybern., vol. 52, no. 3, pp. 141–152, Jul. 1985.
[13] D. G. Luenberger and Y. Ye, Linear and Nonlinear Programming, 3rd ed. Boston, MA, USA: Springer-Verlag, 2008.
[14] E. L. Lawler and D. E. Wood, "Branch-and-bound methods: A survey," Oper. Res., vol. 14, no. 4, pp. 699–719, Jul./Aug. 1966.
[15] J. Nocedal and S. Wright, Numerical Optimization. New York, NY, USA: Springer-Verlag, 1999.
[16] H. K. Khalil, Nonlinear Systems, 3rd ed. Upper Saddle River, NJ, USA: Prentice-Hall, 2002.
[17] G. Chesi, A. Garulli, A. Tesi, and A. Vicino, "Solving quadratic distance problems: An LMI-based approach," IEEE Trans. Autom. Control, vol. 48, no. 2, pp. 200–212, Feb. 2003.
[18] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, 2nd ed. Cambridge, U.K.: Cambridge Univ. Press, 2003.
[19] U. Helmke and J. B. Moore, Optimization and Dynamical Systems. London, U.K.: Springer-Verlag, 1994.
[20] F. Morgan, Real Analysis and Applications: Including Fourier Series and the Calculus of Variations. Providence, RI, USA: AMS, 2005.
[21] A. Isidori, L. Marconi, and A. Serrani, Robust Autonomous Guidance: An Internal Model-Based Approach. London, U.K.: Springer-Verlag, 2003.

Quan Quan received the B.S. and Ph.D. degrees from Beihang University, Beijing, China, in 2004 and 2010, respectively. He has been an Associate Professor with Beihang University since 2013. His current research interests include vision-based navigation and reliable flight control.

Kai-Yuan Cai received the B.S., M.S., and Ph.D. degrees from Beihang University, Beijing, China, in 1984, 1987, and 1991, respectively. He has been a Full Professor with Beihang University since 1995. He is currently a Cheung Kong Scholar (Chair Professor), jointly appointed by the Ministry of Education of China and the Li Ka Shing Foundation of Hong Kong in 1999. His current research interests include software testing, software reliability, reliable flight control, and software cybernetics.
