Neural Networks 63 (2015) 272–281


Neural network for constrained nonsmooth optimization using Tikhonov regularization

Sitian Qin a, Dejun Fan a,∗, Guangxi Wu a, Lijun Zhao b

a Department of Mathematics, Harbin Institute of Technology at Weihai, Weihai 264209, PR China
b School of Automobile Engineering, Harbin Institute of Technology at Weihai, Weihai 264209, PR China

∗ Corresponding author. E-mail addresses: [email protected] (S. Qin), [email protected] (D. Fan).

Article info

Article history:
Received 18 November 2013
Received in revised form 12 December 2014
Accepted 16 December 2014
Available online 31 December 2014

Keywords:
One-layer neural network
Nonsmooth convex optimization problems
Tikhonov regularization method

Abstract

This paper presents a one-layer neural network to solve nonsmooth convex optimization problems based on the Tikhonov regularization method. Firstly, it is shown that the optimal solution of the original problem can be approximated by the optimal solution of a strongly convex optimization problem. Then, it is proved that for any initial point, the state of the proposed neural network enters the equality feasible region in finite time and is globally convergent to the unique optimal solution of the related strongly convex optimization problem. Compared with existing neural networks, the proposed neural network has lower model complexity and does not need penalty parameters. In the end, some numerical examples and an application are given to illustrate the effectiveness and improvement of the proposed neural network.

1. Introduction

In this paper, we study the following constrained nonsmooth convex optimization problem:

minimize   f(x)
subject to g(x) ≤ 0,
           Ax = b,        (1)

where x = (x1, x2, . . . , xn)T ∈ Rn is the vector of decision variables, f : Rn → R is the objective function, which is convex but possibly nonsmooth, g(x) = (g1(x), g2(x), . . . , gm(x))T : Rn → Rm is an m-dimensional vector-valued function whose components gi are also convex but possibly nonsmooth (i = 1, 2, . . . , m), A ∈ Rr×n is a full row-rank matrix (i.e., rank(A) = r ≤ n), and b = (b1, b2, . . . , br)T ∈ Rr. Throughout this paper, we suppose that f is bounded from below; without loss of generality, we suppose that f(x) ≥ 0 for all x ∈ Rn. Letting E = {x : Ax = b} and I = {x : g(x) ≤ 0}, Ω = E ∩ I is the feasible region of (1). During recent decades, with the aid of hardware implementation, neural dynamical methods for solving optimization problem (1) have received considerable attention (see Bazaraa, Sherali, and Shetty (1993), Bian and Xue (2009, 2013), Cheng et al. (2011), Deng




and Bu (2012), Forti, Nistri, and Quincampoix (2004, 2006), Gao and Liao (2010), Guo, Liu, and Wang (2011), Kennedy and Chua (1988), Liu, Cao, and Chen (2010), Liu, Dang, and Cao (2010), Liu, Guo, and Wang (2012), Liu and Wang (2008a, 2008b, 2011), Qin and Xue (2010a, 2010b), Qin, Xue, and Wang (2013), Tank and Hopfield (1986), Wang (1994), Xia and Wang (2004, 2005), Xue and Bian (2008), Yang and Cao (2010) and Zhang and Constantinides (1992)). Neural dynamical methods are a promising approach for solving optimization problem (1) in real time when the problem has high dimension and complex structure. Since Tank and Hopfield (1986) first proposed a neural network for linear programming, recurrent neural networks for optimization problem (1) and their engineering applications have been widely investigated, and several other neural network models for (1) have been developed, such as the neural network with a finite penalty parameter in Kennedy and Chua (1988), the Lagrangian neural network in Zhang and Constantinides (1992), the projection-type neural network in Xia and Wang (2005), the deterministic annealing neural network in Wang (1994), the generalized neural network in Forti et al. (2004), and some simplified neural networks in Bian and Xue (2009), Cheng et al. (2011), Liu, Cao et al. (2010), Liu and Wang (2008a, 2008b), Xia and Wang (2004) and Xue and Bian (2008). Meanwhile, nonsmooth optimization problems have found important applications in areas such as manipulator control, signal processing and sliding mode control (see Bian and Xue (2009), Cheng et al. (2011), Forti et al. (2004), Liu, Cao et al. (2010), Liu and Wang (2008a, 2008b, 2011) and Xue and Bian


(2008)). Forti et al. (2004) proposed a generalized neural network to solve a much wider class of nonsmooth optimization problems (1) without equality constraints in real time. For general nonsmooth optimization problems (1), Bian and Xue (2009) and Xue and Bian (2008) proposed recurrent neural networks based on subgradients and the penalty parameter method. However, the effectiveness of the penalty function method relies on exact penalty parameters, which are difficult to estimate in real applications. Cheng et al. (2011) proposed a nonsmooth recurrent neural network to solve the nonsmooth optimization problems (1). The neural network in Cheng et al. (2011) can deal with nonsmooth convex optimization problems with a larger class of constraints and is not based on any penalty method; however, it has a complex structure. In order to reduce the model complexity and avoid penalty parameters, the authors of Liu and Wang (2011) and Bian and Xue (2013) presented related simplified neural networks. In Liu and Wang (2011), the authors proposed a novel one-layer neural network modeled by a differential inclusion for solving nonsmooth optimization problems (1), in which the number of neurons equals the number of decision variables. In Bian and Xue (2013), using a regularization term and without any estimate of the exact penalty parameter, the authors proposed a simplified neural network to solve nonsmooth optimization problems (1). However, the conclusions in Liu and Wang (2011) and Bian and Xue (2013) rely on the same assumption int(I) ∩ E ≠ ∅, where int(I) is the interior of the set I. Obviously, this assumption constrains the applicability of the neural networks in Liu and Wang (2011) and Bian and Xue (2013).

The well-known Tikhonov regularization plays an important role in convex optimization and monotone variational inequalities, especially for ill-posed problems. The main idea of Tikhonov regularization for convex optimization is that the original convex problem can be converted into a family of strongly convex problems, and the unique solutions of the strongly convex problems converge to a solution of the original problem as the regularization parameter tends to zero (see Attouch (1996), Attouch and Cominetti (1996), Attouch and Czarnecki (2010), Cominetti, Peypouquet, and Sorin (2008), Hung and Muu (2011) and Oliveira, Santos, and Silva (2012)). For example, Cominetti et al. (2008) studied the following differential inclusion with a Tikhonov regularization scheme:

−u̇(t) ∈ ∂f(u(t)) + ε(t)u(t).        (2)

They proved that if ∫₀^{+∞} ε(t) dt = +∞, then the solution u(t) of (2) converges to x∗, where x∗ is the least-norm element of S = {x ∈ H : 0 ∈ ∂f(x)} and H is a Hilbert space.
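As a concrete illustration of the regularized flow (2) (not taken from the cited works), the following Python sketch applies a forward-Euler discretization to the simple convex function f(x) = max(|x| − 1, 0) on R; the choices ε(t) = 1/√(t + 1), the step size and the horizon are arbitrary.

```python
import numpy as np

# Forward-Euler sketch of the Tikhonov-regularized subgradient flow (2) for
# f(x) = max(|x| - 1, 0). The minimizer set of f is [-1, 1], whose least-norm
# element is 0, so the trajectory should enter [-1, 1] and then drift towards 0.

def subgrad_f(x):
    # one element of the convex subdifferential of f(x) = max(|x| - 1, 0)
    return np.sign(x) if abs(x) > 1.0 else 0.0

def eps(t, eps0=1.0):
    # slowly decaying regularization parameter with divergent integral
    return eps0 / np.sqrt(t + 1.0)

x, t, h = 3.0, 0.0, 1e-2          # initial point, time, step size (ad hoc)
for _ in range(200000):
    x -= h * (subgrad_f(x) + eps(t) * x)   # Euler step of -du/dt in (2)
    t += h

print(x)   # ends near 0, the least-norm minimizer of f
```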


Inspired by Bian and Xue (2013) and the Tikhonov regularization, in this paper we propose a simplified neural network to solve nonsmooth optimization problem (1). The contributions of this paper are as follows. (i) The proposed neural network has low model complexity and removes the assumption int(I) ∩ E ≠ ∅; hence, it has a wider range of application than the neural networks in Bian and Xue (2013) and Liu and Wang (2011). (ii) Unlike the neural networks in Bian and Xue (2009) and Xue and Bian (2008), the proposed neural network is not based on a penalty method, which means that no exact penalty parameter needs to be found in advance. (iii) We relax some assumptions of Bian and Xue (2013), such as the objective function being coercive or the feasible region being bounded, and prove the stability of the proposed neural network under more general conditions.

The remainder of this paper is organized as follows. In Sections 2 and 3, we introduce some definitions and preliminary knowledge needed in this paper, and a one-layer neural network model for solving problem (3) is proposed. In Section 4, the main theoretical results of this paper are given: from any initial point, the state of the proposed neural network exists globally, is unique, and converges to the optimal solution. In Section 5, some numerical examples are given to verify the theoretical results. Finally, Section 6 concludes this paper.

2. Preliminaries

In this section, some definitions and properties concerning set-valued maps and nonsmooth analysis are introduced, which are needed in the remainder of this paper. We refer readers to Aubin and Cellina (1984), Clarke (1983), Filippov (1964) and Tuy (1998) for more thorough discussions. Let Rn be the n-dimensional real Euclidean space with inner product ⟨·, ·⟩ and induced norm ∥ · ∥.

Definition 2.1 (Aubin & Cellina, 1984). A map F : K ⊆ Rn → Rn is called a set-valued map if to each point x ∈ K there corresponds a nonempty set F(x) ⊆ Rn. A set-valued map F : K ⊆ Rn → Rn with nonempty values is said to be upper semicontinuous (U.S.C.) at x̂ ∈ K if for any open set Uy containing F(x̂) there exists a neighborhood Ux of x̂ such that F(Ux) ⊆ Uy. If K is closed, F has nonempty closed values and is bounded in a neighborhood of each point x ∈ K, then F is upper semicontinuous on K if and only if the graph of F (denoted by Gr(F)) is closed, where Gr(F) = {(x, y) ∈ K × Rn : y ∈ F(x)}.

Definition 2.2. Let f be Lipschitz near a given point x0 ∈ Rn, and let v be any other vector in Rn. The generalized directional derivative of f at x0 in the direction v, denoted f°(x0; v), is defined as

f°(x0; v) = lim sup_{y→x0, t↓0} [f(y + tv) − f(y)] / t.

The Clarke subdifferential of f at x0 is given by ∂f(x0) = {ξ ∈ Rn : f°(x0; v) ≥ ⟨ξ, v⟩ for all v in Rn}, which is a subset of Rn. Note that if f is Lipschitz of rank K near x0, then ∂f(x0) is a nonempty, convex, compact subset of Rn and ∥ξ∥ ≤ K for any ξ ∈ ∂f(x0) (see Clarke (1983, Proposition 2.1.2)).

Definition 2.3 (Tuy, 1998). Let K ⊆ Rn be a convex set.
(i) A function f : K → R is convex if for all x, y ∈ K and λ ∈ [0, 1], we have f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y).
(ii) A function f : K → R is strongly convex if there exists β > 0 such that for all x, y ∈ K and λ ∈ [0, 1], we have f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y) − λ(1 − λ)β∥x − y∥².

Convex functions satisfy the following properties.

Proposition 2.1 (Clarke, 1983, Proposition 2.2.7). If f : Rn → R is convex, then
(i) the Clarke subdifferential of f at x coincides with the subdifferential of f at x in the sense of convex analysis, i.e., ∂f(x0) = {ξ ∈ Rn : f(x0) − f(x) ≤ ⟨ξ, x0 − x⟩, ∀x ∈ Rn};
(ii) ∂f is maximal monotone, i.e., ⟨x − x0, v − v0⟩ ≥ 0 for any v ∈ ∂f(x) and v0 ∈ ∂f(x0);
(iii) ∂f is upper semicontinuous.
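The subgradient inequality in Proposition 2.1(i) can be checked numerically on a simple instance; the short sketch below (an illustration only) uses f(x) = ∥x∥₁, for which sign(x0) is one element of ∂f(x0).

```python
import numpy as np

# Numerical check of Proposition 2.1(i) for the convex function f(x) = ||x||_1:
# xi = sign(x0) is a subgradient at x0, i.e. f(x0) - f(x) <= <xi, x0 - x> for all x.

rng = np.random.default_rng(0)
f = lambda x: np.sum(np.abs(x))

x0 = np.array([1.5, -0.3, 0.0, 2.0])
xi = np.sign(x0)          # any value in [-1, 1] would also do for the zero entry

for _ in range(1000):
    x = rng.normal(size=4) * 5.0
    assert f(x0) - f(x) <= np.dot(xi, x0 - x) + 1e-12

print("subgradient inequality verified on random test points")
```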


Definition 2.4. f is said to be regular at x provided
(1) for all v ∈ Rn, the usual one-sided directional derivative f′(x; v) exists;
(2) for all v ∈ Rn, f′(x; v) = f°(x; v).

Obviously, any convex function is regular. Regular functions have a very important property (the chain rule), which has been used in many papers (see Bian and Xue (2009, 2013), Cheng et al. (2011), Forti et al. (2004), Guo et al. (2011), Liu, Cao et al. (2010), Liu, Dang et al. (2010), Liu et al. (2012), Liu and Wang (2008a, 2008b, 2011), Qin and Xue (2010a, 2010b), Qin et al. (2013) and Xue and Bian (2008)).

Lemma 2.1 (Chain Rule (Bellen, Jackiewicz, & Zennaro, 1988)). If V(x) : Rn → R is regular and x(t) : [0, +∞) → Rn is absolutely continuous on any compact interval of [0, +∞), then x(t) and V(x(t)) : [0, +∞) → R are differentiable for a.e. t ∈ [0, +∞), and

V̇(x(t)) = ⟨ξ, ẋ(t)⟩,  ∀ξ ∈ ∂V(x(t)), for a.e. t ∈ [0, +∞).

Lemma 2.2. If K ⊆ Rn is a nonempty closed convex set and x ∈ Rn, then there exists a unique point x̄ ∈ K satisfying

dist(x, K) = ∥x − x̄∥ = min_{y∈K} ∥x − y∥,

where dist(x, K) is the distance between x and K.

3. Problem formulation and model description

The Tikhonov regularization scheme for the nonsmooth convex optimization problem (1) is the following strongly convex optimization problem:

minimize   fk(x) = f(x) + (1/2k)∥x∥²
subject to g(x) ≤ 0,
           Ax = b.        (3)

Obviously, fk is a strongly convex function for each k ∈ N. In this paper, we will propose a simplified neural network to solve the strongly convex optimization problem (3), and the optimal solution of (1) will be approximated by the unique optimal solution of (3) as k → +∞.

Lemma 3.1. For each k ∈ N, the strongly convex optimization problem (3) has a unique optimal solution, denoted by x∗k ∈ Ω = E ∩ I.

Proof. Obviously, the strongly convex optimization problem (3) has at least one optimal solution since fk is bounded below on Ω. Suppose there exist two different optimal points x∗k and x∗∗k of (3). Then fk(x∗k) = fk(x∗∗k) is the optimal value of (3). Since fk is a strongly convex function, there exists a constant β > 0 such that

fk(λx∗k + (1 − λ)x∗∗k) ≤ λfk(x∗k) + (1 − λ)fk(x∗∗k) − βλ(1 − λ)∥x∗k − x∗∗k∥²        (4)

for any λ ∈ [0, 1]. Hence, for any λ ∈ (0, 1), we have

fk(λx∗k + (1 − λ)x∗∗k) ≤ λfk(x∗k) + (1 − λ)fk(x∗k) − βλ(1 − λ)∥x∗k − x∗∗k∥² < fk(x∗k),        (5)

which leads to a contradiction with the optimal value fk(x∗k). Hence, there exists a unique optimal solution of the strongly convex optimization problem (3). □

Without loss of generality, we denote by M the optimal solution set of (1), i.e.,

M = {x∗ ∈ Ω : f(x∗) = min_{x∈Ω} f(x)}.

Obviously, M is a convex subset of Ω. Hence, there exists a unique x∗0 ∈ M satisfying

∥x∗0∥ = min{∥x∥ : x ∈ M}.        (6)

Lemma 3.2. The unique optimal solution x∗k of (3) converges to the smallest-norm optimal solution of (1) as k → +∞, i.e., limk→∞ x∗k = x∗0.

Proof. From the definition of x∗0 and the fact that x∗k ∈ Ω, we have f(x∗0) ≤ f(x∗k). On the other hand, since x∗k is the optimal solution of (3), fk(x∗k) ≤ fk(x∗0), i.e.,

f(x∗k) + (1/2k)∥x∗k∥² ≤ f(x∗0) + (1/2k)∥x∗0∥².

Combining the above inequalities, we conclude that

f(x∗0) ≤ f(x∗k) ≤ f(x∗0) + (1/2k)(∥x∗0∥² − ∥x∗k∥²) ≤ f(x∗0) + (1/2k)∥x∗0∥².        (7)

Hence, limk→+∞ f(x∗k) = f(x∗0), that is, limk→+∞ dist(x∗k, M) = 0. Meanwhile, from inequality (7), we have ∥x∗k∥ ≤ ∥x∗0∥. Then, by the definition of x∗0 in (6), limk→+∞ x∗k = x∗0. □

Based on Lemmas 3.1 and 3.2, in order to solve the nonsmooth convex optimization problem (1), we only need to solve the strongly convex optimization problem (3). This approximation method is also used in some mini-max optimization problems (see Charalambous and Conn (1978), Di Pillo, Grippo, and Lucidi (1997) and Yang and Cao (2010)).

Next, we will propose a one-layer neural network to solve the strongly convex optimization problem (3). First, we introduce the following penalty function, which is also used in Bian and Xue (2009, 2013), Forti et al. (2004), Liu and Wang (2011) and Xue and Bian (2008). Define

P1(x) = Σ_{i=1}^{m} max{0, gi(x)},        (8)

which implies

I = {x : P1(x) ≤ 0}.        (9)

Since gi (i = 1, . . . , m) is convex, P1 is a convex function on Rn and

∂P1(x) = { Σ_{i∈I0(x)} [0, 1]∂gi(x),                              if x ∈ bd(I),
           {0},                                                    if x ∈ int(I),        (10)
           Σ_{i∈I+(x)} ∂gi(x) + Σ_{i∈I0(x)} [0, 1]∂gi(x),          if x ∉ I,

where

I0(x) = {i ∈ {1, 2, . . . , m} : gi(x) = 0},
I+(x) = {i ∈ {1, 2, . . . , m} : gi(x) > 0}.

Additionally, since A is a full row-rank matrix, it is clear that AAT is invertible. Then we denote P = AT(AAT)−1A in this paper.
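The following short sketch (with assumed toy data f, g, A and b, not taken from the paper) collects the objects just defined — the regularized objective fk in (3), the penalty P1 in (8) and the projection matrix P — and checks the identity A(I − P) = 0 that is used repeatedly in Section 4.

```python
import numpy as np

# Assumed toy instance of (1): convex nonsmooth f, convex g, full row-rank A.
k = 1e5
f  = lambda x: np.abs(x[0] - 1.0) + np.abs(x[1] + x[2])        # assumed objective
g  = lambda x: np.array([x[0] + x[1] - 1.0, x[2] - 2.0])       # assumed g(x) <= 0
fk = lambda x: f(x) + np.dot(x, x) / (2.0 * k)                 # strongly convex surrogate (3)
P1 = lambda x: np.sum(np.maximum(0.0, g(x)))                   # penalty (8); I = {P1 <= 0}

A = np.array([[1.0, 1.0, 1.0]])                                # full row rank, r = 1
b = np.array([1.0])
P = A.T @ np.linalg.inv(A @ A.T) @ A                           # P = A^T (A A^T)^{-1} A

print(np.allclose(A @ (np.eye(3) - P), 0.0))                   # True: A(I - P) = 0
```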


Based on the above results, we propose a simplified neural network for solving the strongly convex optimization problem (3), with the following dynamics:

ẋ(t) ∈ −(I − P)(ε(t)∂fk(x(t)) + ∂P1(x(t))) − ATh(Ax(t) − b),        (11)

where ε : [0, ∞) → (0, ∞) is a decreasing function defined by

ε(t) = ε0/∛(t + 1)

with ε0 > 0, and h = (h(x1), h(x2), . . . , h(xn))T, each component of which is defined as

h(xi) = −1,        if xi < 0,
        [−1, 1],   if xi = 0,        i = 1, 2, . . . , n.        (12)
        1,         if xi > 0,

The neural network (11) can be implemented by a generalized nonlinear programming circuit (G-NPC), which was first introduced in Forti et al. (2004). The G-NPC, which derives from a natural extension of the NPC, has a neural-like architecture and also features constraint neurons modeled by ideal diodes with infinite slope in the conducting region (see Bian and Xue (2013) and Forti et al. (2004)). In order to show the implementation of the neural network (11), we introduce the following simple optimization problem:

minimize   f(x) = |x2|
subject to g(x) = x1 + x2 ≤ 0,
           a1x1 + a2x2 = 1.        (13)

The architecture of neural network (11) for the above optimization problem is depicted in Fig. 1, where F = (F1, F2) = ε(t)(∂f(x) + (1/k)x) + ∂P1(x) is depicted in Fig. 2.

Fig. 1. Schematic block diagram of (11) for optimization problem (13).

Fig. 2. Block diagram of F by G-NPC.
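As an illustration of how the dynamics (11) can be simulated, the sketch below applies a crude forward-Euler discretization to problem (13) with the assumed coefficients a1 = 1 and a2 = 2 (the text leaves a1, a2 symbolic); the step size, horizon and subgradient selections are ad hoc choices, not a prescribed implementation.

```python
import numpy as np

# Forward-Euler sketch of the dynamics (11) for problem (13), with assumed
# coefficients a1 = 1, a2 = 2 and ad hoc choices k = 1e5, eps0 = 2, step h = 1e-3.

a1, a2 = 1.0, 2.0                                  # assumed equality-constraint coefficients
A = np.array([[a1, a2]]); b = np.array([1.0])
P = A.T @ np.linalg.inv(A @ A.T) @ A               # P = A^T (A A^T)^{-1} A
Id = np.eye(2)
k, eps0, h = 1e5, 2.0, 1e-3

def grad_fk(x):
    # one element of the subdifferential of f_k, with f(x) = |x2|
    return np.array([0.0, np.sign(x[1])]) + x / k

def grad_P1(x):
    # one element of the subdifferential of P_1, with g(x) = x1 + x2 <= 0
    return np.array([1.0, 1.0]) if x[0] + x[1] > 0 else np.zeros(2)

x, t = np.array([2.0, 3.0]), 0.0                   # arbitrary initial point
for _ in range(200000):
    eps = eps0 / (t + 1.0) ** (1.0 / 3.0)          # eps(t) = eps0 / (t + 1)^(1/3)
    dx = -(Id - P) @ (eps * grad_fk(x) + grad_P1(x)) - A.T @ np.sign(A @ x - b)
    x, t = x + h * dx, t + h

print(x)   # should settle near (-1, 1), the minimizer of (13) when a1 = 1, a2 = 2
```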


4. Convergence analysis

In this section, we shall study the stability of the equilibrium solution generated by the proposed neural network (11). Firstly, we prove that for any initial point, the state of (11) reaches the equality feasible region E in finite time and stays there thereafter; moreover, the state is unique if the initial point lies in E. Then, we prove that the state of (11) converges to the unique optimal solution x∗k of the strongly convex optimization problem (3).

Lemma 4.1. For any initial point x0 ∈ Rn, there is at least a local solution x(t) of neural network (11) defined on a maximal interval [0, T), for some T ∈ (0, +∞]. Moreover, the local solution x(t) will reach the equality feasible region E in finite time and stay there thereafter, i.e., there exists T1 ∈ (0, T) such that x(t) ∈ E, ∀t ∈ [T1, T).

Proof. Obviously, F(x, t) := −(I − P)(ε(t)∂fk(x) + ∂P1(x)) − ATh(Ax − b) is U.S.C. with nonempty convex compact values. By the local viability theorem (see Aubin (1991, Theorem 3.3.4)), for any initial point x0 ∈ Rn, there exist a positive T and a local solution x(t) of neural network (11), t ∈ [0, T). That is, there exist measurable functions γ(t) ∈ ∂fk(x(t)), η(t) ∈ ∂P1(x(t)) and ξ(t) ∈ h(Ax(t) − b) such that

ẋ(t) = −(I − P)(ε(t)γ(t) + η(t)) − ATξ(t)        (14)

for a.e. t ∈ [0, T). Here the positive constant T satisfies one of the following:

either T = +∞, or T < +∞ and lim_{t→T−} ∥x(t)∥ = +∞.        (15)

Let B(x) = ∥Ax − b∥1, where ∥ · ∥1 is the 1-norm on Rn. It is clear that B(x) is convex and regular, and ∂B(x(t)) = ATh(Ax(t) − b). Hence, by the chain rule,

d/dt B(x(t)) = Ḃ(x(t)) = ⟨ζ(t), ẋ(t)⟩,  ∀ζ(t) ∈ ∂B(x(t)).        (16)

Since A(I − P) = A − AAT(AAT)−1A = 0, by (14) and (16), we have

d/dt B(x(t)) = ⟨ATξ(t), −(I − P)(ε(t)γ(t) + η(t)) − ATξ(t)⟩ = −ξ(t)TAATξ(t) ≤ −λmin(AAT)∥ξ(t)∥².        (17)

We next prove that the local solution x(t) will reach E in finite time and stay there thereafter. It is proved by contradiction. Suppose that

x(t) ∉ E,  ∀t ∈ [0, T).        (18)

Case i: T = +∞. Since x(t) ∉ E (i.e., Ax(t) − b ≠ 0), at least one of the components of h(Ax(t) − b) is 1 or −1. Thus, by (17),

d/dt B(x(t)) ≤ −λmin(AAT) < 0.        (19)

Integrating the above inequality from 0 to t, we have

∥Ax(t) − b∥1 ≤ ∥Ax0 − b∥1 − λmin(AAT)t.        (20)

Therefore, Ax(t) − b = 0 when t = ∥Ax0 − b∥1/λmin(AAT), which contradicts assumption (18).

Case ii: T < +∞ and lim_{t→T−} ∥x(t)∥ = +∞. Similar to the above proof, for any t ∈ [0, T),

∥Ax(t) − b∥1 ≤ ∥Ax0 − b∥1 − λmin(AAT)t ≤ ∥Ax0 − b∥1.        (21)

Hence,

λmin(AAT)∥x(t)∥² − 2∥bTA∥ ∥x(t)∥ + bTb ≤ ∥Ax(t) − b∥² ≤ ∥Ax0 − b∥1² < +∞.        (22)

Then, based on the assumptions of Case ii and letting t → T, we have

+∞ = lim_{t→T−} [λmin(AAT)∥x(t)∥² − 2∥bTA∥ ∥x(t)∥ + bTb] < +∞,        (23)

which is also a contradiction. Hence, for any initial point x0 ∉ E, the solution of neural network (11) will reach E in finite time, i.e., there exists T1 ∈ (0, T) such that x(T1) ∈ E.

We next prove that x(t) ∈ E for any t ∈ [T1, T), i.e., the solution of neural network (11) stays inside E once it reaches E. If not, then x(t) leaves E at some t′ > T1. Hence, there must exist an interval (t′, t″) such that x(t) ∉ E for any t ∈ (t′, t″) and ∥Ax(t′) − b∥1 = 0. However, by (17), we have

∥Ax(t″) − b∥1 ≤ ∥Ax(t′) − b∥1 − λmin(AAT)(t″ − t′) = −λmin(AAT)(t″ − t′) < 0,        (24)

which is a contradiction. Then, the state x(t) reaches E in finite time and stays there thereafter. □

Next, we show the uniqueness of the state of (11). We first introduce the following notation. Suppose Ω ⊆ Rn is a convex set. By m(Ω), we denote the element of Ω with the smallest norm, that is,

(1) m(Ω) ∈ Ω,  (2) ∥m(Ω)∥ = min_{ζ∈Ω} ∥ζ∥.        (25)

It is clear that m(Ω) = PΩ(0), where PΩ(·) is the projection onto Ω.

Lemma 4.2. For any initial point x0 ∈ E, there is a unique state x(t) of (11) defined on the maximal interval. Moreover, the unique state x(t) is just the slow solution of neural network (11), i.e., x(t) satisfies

ẋ = −m((I − P)[ε(t)∂fk(x) + ∂P1(x)])        (26)

for a.e. t ≥ 0.

Proof. According to Lemma 4.1, for any initial point, there is at least a local solution of neural network (11) defined on a maximal interval. Let xi(t) be a local solution of neural network (11) with initial point xi(0) ∈ E, i = 1, 2. Then, by definition, there exist measurable functions γi(t) ∈ ∂fk(xi(t)), ηi(t) ∈ ∂P1(xi(t)) and ξi(t) ∈ h(Axi(t) − b) such that

ẋi(t) = −(I − P)(ε(t)γi(t) + ηi(t)) − ATξi(t)

for almost all t. Moreover, by Lemma 4.1, we have xi(t) ∈ E for t > 0, which means that Pxi(t) = AT(AAT)−1Axi(t) = AT(AAT)−1b. Hence, by the maximal monotonicity of the convex subdifferential,

(1/2) d/dt ∥x1(t) − x2(t)∥² = ⟨x1(t) − x2(t), ẋ1(t) − ẋ2(t)⟩
  = ⟨x1(t) − x2(t), −(I − P)[ε(t)(γ1(t) − γ2(t)) + η1(t) − η2(t)] − AT(ξ1(t) − ξ2(t))⟩
  = −ε(t)⟨x1(t) − x2(t), γ1(t) − γ2(t)⟩ − ⟨x1(t) − x2(t), η1(t) − η2(t)⟩ ≤ 0.

Then we have

∥x1(t) − x2(t)∥ ≤ ∥x1(0) − x2(0)∥.        (27)

Therefore, we obtain the uniqueness of the state of neural network (11).

On the other hand, for any initial point x0 ∈ E, let x(t) be a local solution of neural network (11). That is, there exist measurable functions γ(t) ∈ ∂fk(x(t)), η(t) ∈ ∂P1(x(t)) and ξ(t) ∈ h(Ax(t) − b) such that

ẋ(t) = −(I − P)(ε(t)γ(t) + η(t)) − ATξ(t)        (28)

for a.e. t. According to Lemma 4.1, x(t) ∈ E, i.e., Ax(t) = b for t ≥ 0. Hence,

0 = Aẋ(t) = −A(I − P)(ε(t)γ(t) + η(t)) − AATξ(t) = −AATξ(t),

which implies ξ(t) = 0, since AAT is invertible by assumption. Hence, for a.e. t, the solution of neural network (11) also satisfies

ẋ(t) = −(I − P)(ε(t)γ(t) + η(t)),  Ax(t) = b.        (29)

We define a Lyapunov function as follows:

V(x, t) = ε(t)fk(x) + P1(x).        (30)

So

∂xV(x, t) = ε(t)∂fk(x) + ∂P1(x).        (31)

Differentiating V(x(t), t) along neural network (11) (or (29)) and by the chain rule, we have

d/dt V(x, t) = ε̇(t)fk(x) + ⟨ζ(t), ẋ(t)⟩        (32)

for any ζ(t) ∈ ∂xV(x(t), t) and a.e. t. Meanwhile, since (I − P)² = I − P and ẋ(t) = (I − P)ẋ(t),

d/dt V(x, t) = ε̇(t)fk(x) + ⟨(I − P)ζ(t), ẋ(t)⟩.        (33)

In particular, by the arbitrariness of ζ(t) in ∂xV(x(t), t), we have

d/dt V(x, t) = ε̇(t)fk(x) + ⟨m((I − P)∂xV(x, t)), ẋ(t)⟩ = ε̇(t)fk(x) − ⟨ẋ(t), ẋ(t)⟩.        (34)

Hence, for a.e. t, ⟨m((I − P)∂xV(x, t)), ẋ(t)⟩ = −⟨ẋ(t), ẋ(t)⟩, which implies

∥ẋ(t)∥ ≤ ∥m((I − P)∂xV(x, t))∥.

Then, by the definition of m(·) in (25) and the fact that ẋ(t) ∈ −(I − P)∂xV(x, t), for a.e. t, we have

ẋ(t) = −m((I − P)∂xV(x, t)) = −m((I − P)[ε(t)∂fk(x) + ∂P1(x)]). □        (35)

Theorem 4.1. For any initial point x0 ∈ Rn, there is at least a state x(t) of neural network (11) defined on (0, +∞).

Proof. In order to obtain the global existence of the solution x(t) of neural network (11), we only need to prove that the second case in (15) cannot hold. According to the proof of Lemma 4.1, for any initial point x0 ∉ E, there exists T1 ∈ (0, T) such that x(t) ∈ E, ∀t ∈ [T1, T). Hence, without loss of generality, we assume the initial point x0 ∈ E, which means that x(t) ∈ E for all t > 0. Choosing h ∈ (0, T) and by (27), we have

∥x(t + h) − x(t)∥ ≤ ∥x(h) − x(0)∥,  t ∈ (0, T − h).


Then,

lim sup_{t→T−} ∥x(t)∥ = lim sup_{t→(T−h)−} ∥x(t + h)∥ ≤ ∥x(T − h)∥ + ∥x(h) − x(0)∥ < ∞,

which means the second case in (15) cannot hold. Hence, T = +∞. Therefore, for any initial point x0 ∈ Rn, the solution x(t) of (11) exists for t ∈ (0, +∞). □

Lemma 4.1 and Theorem 4.1 show that for any initial point, there exists a global solution of neural network (11), which reaches E in finite time and stays there thereafter. We next prove that the global solution converges to the inequality constraint set I.

Theorem 4.2. For any initial point x0 ∈ Rn, the solution of neural network (11) is globally convergent to the inequality constraint set I, i.e.,

lim_{t→+∞} dist(x(t), I) = 0.        (36)

Proof. According to Lemma 4.1 and Theorem 4.1, for any initial point x0 ∈ Rn, there exists a global solution x(t) of neural network (11), which reaches E in finite time and stays there thereafter. That is, there exist measurable functions γ(t) ∈ ∂fk(x(t)), η(t) ∈ ∂P1(x(t)) and ξ(t) ∈ h(Ax(t) − b) such that

ẋ(t) = −(I − P)(ε(t)γ(t) + η(t)) − ATξ(t)

for almost all t > 0. Moreover, there exists T1 > 0 such that x(t) ∈ E, i.e., Ax(t) = b, for a.e. t > T1. Similar to the proof of (29), for a.e. t > T1,

ẋ(t) = −(I − P)(ε(t)γ(t) + η(t)),  Ax(t) = b.        (37)

It is obvious that if x(t) ∈ E, then PT(x(t) − x∗k) = AT(AAT)−1A(x(t) − x∗k) = 0. Hence, for a.e. t > T1,

(1/2) d/dt ∥x(t) − x∗k∥² = ⟨x(t) − x∗k, ẋ(t)⟩ = ⟨x(t) − x∗k, −(I − P)(ε(t)γ(t) + η(t))⟩ = ⟨x(t) − x∗k, −(ε(t)γ(t) + η(t))⟩.        (38)

Differentiating V(x(t), t) defined in (30) along neural network (11) (or (37)) and by the chain rule, for a.e. t > T1,

d/dt V(x(t), t) = (ε(t)γ(t) + η(t))Tẋ(t) + ε̇(t)fk(x(t))
  ≤ ⟨ε(t)γ(t) + η(t), −(I − P)(ε(t)γ(t) + η(t))⟩ = −∥(I − P)[ε(t)γ(t) + η(t)]∥² ≤ 0.        (39)

On the other hand, V(x, t) is convex in x. Utilizing the convex inequality, we obtain

V(x(t), t) − V(x∗k, t) ≤ ⟨ε(t)γ(t) + η(t), x(t) − x∗k⟩.        (40)

Therefore, combining (38) and (40), we have

V(x(t), t) − V(x∗k, t) ≤ −(d/dt)(1/2)∥x(t) − x∗k∥².        (41)

Integrating both sides of inequality (41) from 0 to t, we have

∫₀ᵗ [V(x(s), s) − V(x∗k, s)] ds ≤ (1/2)∥x0 − x∗k∥² − (1/2)∥x(t) − x∗k∥² ≤ (1/2)∥x0 − x∗k∥².        (42)

Additionally, from inequality (39), we get that V(x(t), t) is nonincreasing in t; thus

tV(x(t), t) − ∫₀ᵗ V(x(s), s) ds ≤ 0.        (43)

Combining (42) and (43) and noting that V(x∗k, s) = ε(s)fk(x∗k) (since P1(x∗k) = 0), we obtain

t[ε(t)fk(x(t)) + P1(x(t))] ≤ (1/2)∥x0 − x∗k∥² + ∫₀ᵗ ε(s)fk(x∗k) ds.        (44)

Then, since ε(t)fk(x(t)) ≥ 0,

P1(x(t)) ≤ ∥x0 − x∗k∥²/(2t) + (1/t) ∫₀ᵗ ε(s)fk(x∗k) ds.        (45)

It is easy to get that

lim_{t→+∞} (1/t) ∫₀ᵗ ε(s) ds = 0.

Letting t → +∞ in (45) gives

lim_{t→+∞} P1(x(t)) = 0,        (46)

which means that the solution x(t) converges to I, i.e., lim_{t→+∞} dist(x(t), I) = 0. □

Remark 1. For any initial point x0 ∈ Rn, the solution x(t) of neural network (11) converges to the feasible region Ω, i.e., lim_{t→+∞} dist(x(t), Ω) = 0. In addition, x∗k is the optimal solution of programming (3); thus x∗k belongs to the feasible region for each k ∈ N. Next, we prove that the trajectory x(t) of the proposed neural network (11) converges not only to Ω, but also to x∗k, for each k ∈ N.

Theorem 4.3. For any initial point x0 ∈ Rn, the solution x(t) of neural network (11) converges to the unique optimal solution of the strongly convex optimization problem (3), i.e., lim_{t→+∞} x(t) = x∗k.

Proof. Similar to the proof of Theorem 4.2 and by the convex inequality, for a.e. t > T1, we have

(1/2) d/dt ∥x(t) − x∗k∥² = ⟨x(t) − x∗k, ẋ(t)⟩ = −⟨x(t) − x∗k, ε(t)γ(t) + η(t)⟩
  ≤ ε(t)(fk(x∗k) − fk(x(t))) + P1(x∗k) − P1(x(t)) ≤ ε(t)(fk(x∗k) − fk(x(t))).        (47)

Obviously, the sign of (d/dt)(1/2)∥x(t) − x∗k∥² depends on the term fk(x∗k) − fk(x(t)). For convenience of discussion, we denote F(x, t) = fk(x∗k) − fk(x),

S1 = {t ∈ [T1, +∞) : F(x(t), t) ≥ 0}  and  S2 = {t ∈ [T1, +∞) : F(x(t), t) < 0}.

The rest of the proof is divided into three cases.

Case 1: The set S1 is bounded. Then there exists T2 ≥ T1 such that

F(x(t), t) = fk(x∗k) − fk(x(t)) < 0,  ∀t ≥ T2.

By (47), we have

(d/dt)(1/2)∥x(t) − x∗k∥² < 0,  ∀t ≥ T2.

Therefore, lim_{t→+∞} (1/2)∥x(t) − x∗k∥² exists. Meanwhile, by the definition of x∗k, we have

lim inf_{t→+∞} fk(x(t)) ≥ fk(x∗k).

Next, we prove that lim_{t→+∞} fk(x(t)) = fk(x∗k). If not, then lim inf_{t→+∞} fk(x(t)) > fk(x∗k). Thus, there are T3 ≥ T2 and ϵ > 0 such that

fk(x(t)) ≥ fk(x∗k) + ϵ,  ∀t ≥ T3.

Then,

∫_{T3}^{+∞} ε(t)(fk(x(t)) − fk(x∗k)) dt ≥ ∫_{T3}^{+∞} ε(t)ϵ dt = +∞.        (48)

On the other hand, from (42), we obtain

∫₀ᵗ [V(x(s), s) − V(x∗k, s)] ds ≤ (1/2)∥x0 − x∗k∥².

Since V(x, t) ≥ ε(t)fk(x) and V(x∗k, t) = ε(t)fk(x∗k), we have

∫₀^{+∞} ε(t)(fk(x(t)) − fk(x∗k)) dt ≤ (1/2)∥x0 − x∗k∥²,        (50)

which leads to a contradiction with (48). Therefore,

lim_{t→+∞} fk(x(t)) = fk(x∗k).

Then there exists a sequence {tn} such that lim_{n→+∞} tn = +∞ and lim_{n→+∞} fk(x(tn)) = fk(x∗k). Since x∗k is the unique optimal solution of the strongly convex optimization problem (3), we get that lim_{n→+∞} x(tn) = x∗k. Hence, lim_{t→+∞} x(t) = x∗k, since lim_{t→+∞} (1/2)∥x(t) − x∗k∥² exists.

Case 2: The set S2 is bounded. Then there exists T2 ≥ T1 such that

fk(x∗k) − fk(x(t)) ≥ 0,  ∀t ≥ T2.

Then lim sup_{t→+∞} fk(x(t)) ≤ fk(x∗k). Utilizing the fact that every limit point of x(t) belongs to Ω, we have

fk(x∗k) ≤ lim inf_{t→+∞} fk(x(t)) ≤ lim sup_{t→+∞} fk(x(t)) ≤ fk(x∗k),

which implies that lim_{t→+∞} x(t) = x∗k.

Case 3: Both S1 and S2 are unbounded. According to the structure of open sets, the open set S2 can be expressed as

S2 = ∪_{i=1}^{∞} (αi, βi)  and  S1 = [0, +∞) \ S2.

The analysis of Case 2 shows that

lim_{t→+∞, t∈S1} x(t) = x∗k.

For t ∈ S2, there exists i ∈ {1, 2, . . .} such that t ∈ (αi, βi), and αi → +∞ as t → +∞. From (47), we have

(1/2)∥x(t) − x∗k∥² ≤ (1/2)∥x(αi) − x∗k∥².

Since αi ∈ S1, we know that

lim_{i→+∞} (1/2)∥x(αi) − x∗k∥² = 0.

Then,

lim_{t→+∞, t∈S2} (1/2)∥x(t) − x∗k∥² ≤ lim_{i→+∞} (1/2)∥x(αi) − x∗k∥² = 0.

Therefore, we confirm that lim_{t→+∞} x(t) = x∗k. □

Remark 2. Numerical methods to implement differential inclusions have already been developed and convergence results proven by Bellen et al. (1988), Mannshardt (1978), Stewart (1990, 2011) and Taubert (1981). According to the conclusions in Stewart (2011) and Taubert (1981), we can use fully implicit Runge–Kutta methods to implement neural network (11).

Remark 3. Theorem 4.3 indicates that neural network (11) can be used to solve the nonsmooth convex optimization problem (1). Recently, nonsmooth convex optimization problems of the form (1) have been studied extensively because of their important applications, and several neural networks have been proposed for (1). In order to show the advantages of the proposed neural network (11), Table 1 compares it with several other existing neural networks for (1). From Table 1, the proposed neural network (11) has lower model complexity and does not require additional assumptions. The following example illustrates this.

Table 1
Comparison of related neural networks for nonsmooth optimization problems (1).

Ref.                    Layers   Neurons    Penalty parameters   Additional assumptions
Herein                  1        n          No                   No
Cheng et al. (2011)     3        n + m + p  No                   No
Xue and Bian (2008)     1        n          No                   int(I) ∩ E ≠ ∅
Bian and Xue (2013)     1        n          No                   int(I) ∩ E ≠ ∅, f is coercive or Ω is bounded
Liu and Wang (2011)     1        n          Yes                  int(I) ∩ E ≠ ∅, f is coercive or Ω is bounded

Example 1. Consider the following nonsmooth convex optimization problem:

minimize   f(x) = e^{|x1−1|} + 2|x2 − 1| + x3
subject to x1 − 2x2 + 1 = 0,
           x1² − 4x2 + 3 ≤ 0,
           x2² + e^{|x3|} ≤ 4.        (51)

After a simple calculation, we obtain the feasible region Ω = I ∩ E = {x = (x1, x2, x3)T : x1 = x2 = 1, −ln 3 ≤ x3 ≤ ln 3}. Obviously, Ω = I ∩ E is bounded, but int(I) ∩ E = ∅. Indeed, if int(I) ∩ E ≠ ∅, then there exists x with x2 = (x1 + 1)/2 and x1² − 4x2 + 3 = (x1 − 1)² < 0, which is a contradiction. Hence int(I) ∩ E = ∅. Thus, the neural networks proposed in Bian and Xue (2013) and Liu and Wang (2011) cannot be used to solve problem (51).


Fig. 3. Transient behavior of x(t ) of the proposed neural network (11) in Example 1.

Here, the problem (51) can be solved by the proposed neural network (11). We choose k = 10^5, ε0 = 2 and randomly select seven initial points Ai, i = 1, . . . , 7. The simulation results show that the state of the neural network (11) converges to the optimal solution x∗k = (1.0000, 1.0000, −1.0986)T, which is illustrated in Fig. 3. Therefore, the approximate optimal solution of (51) is x∗ = (1.0000, 1.0000, −1.0986)T, with objective value f(x∗) = 1 − ln 3 ≈ −0.0986.
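As a quick sanity check (an illustration only), the sketch below verifies that the reported point (1, 1, −ln 3) satisfies the constraints of (51) and evaluates the objective there.

```python
import numpy as np

# Feasibility and objective check for the reported solution of (51); x3 = -ln 3.
f  = lambda x: np.exp(abs(x[0] - 1)) + 2 * abs(x[1] - 1) + x[2]
xs = np.array([1.0, 1.0, -np.log(3.0)])

print(xs[0] - 2 * xs[1] + 1)                        # 0 (equality constraint)
print(xs[0]**2 - 4 * xs[1] + 3 <= 0)                # True
print(xs[1]**2 + np.exp(abs(xs[2])) <= 4 + 1e-12)   # True (constraint is active)
print(f(xs))                                        # 1 - ln 3 = -0.0986...
```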

Fig. 4. Transient behavior of x(t) of the proposed neural network (11) with random initial points in Example 2.

5. Numerical examples and applications

In this section, we provide some numerical examples and an application to illustrate the effectiveness and improvement of the proposed neural network (11) in solving the optimization problem (1). The simulations are conducted in MATLAB 2008.

Example 2. Consider the following nonsmooth convex optimization problem:

minimize   f(x) = e^{|x1+1|} + 3x2 e^{|x3−1|}
subject to 2x1 + 3x2 + 2x3 + 4 = 0,
           x1 + 2x2 + x3 ≤ 0,
           e^{|x3|} ≤ 9.        (52)

It is clear that the objective function f(x) is convex. When x1 = 0 and x2 = 0, f(x) = e, which means that f(x) is not coercive. Furthermore, the feasible region of (52) is

Ω = {(x1, x2, x3)T : −8 + ln 9 ≤ x1 < +∞, −∞ < x2 ≤ 4, −ln 9 ≤ x3 ≤ ln 9},

which is unbounded. Hence, problem (52) cannot be solved by the proposed neural network in Bian and Xue (2013). Here, we apply the neural network (11) to solve the convex problem (52). Similarly, we choose k = 10^5 and ε0 = 2. Fig. 4 shows that the trajectories of the neural network (11) with a set of different initial points converge to the optimal point x∗k = (−0.9377, −1.0640, 0.5333)T. Eventually, we obtain that the approximate optimal solution of (52) is x∗ = (−0.9377, −1.0640, 0.5333)T and f(x∗) = −4.0261.

Fig. 5. Transient behavior of x(t) of the proposed neural network (11) with random initial points in Example 3.

Example 3. Consider the following programming problem:

minimize   f(x) = (x1 + 3x2 + x3)² + 4(x1 − x2)
subject to x1 + x2 + x3 = 1,
           x1² − 6x2 − 4x3 + 3 ≤ 0,
           x ≥ 0.        (53)

Obviously, the objective function f(x) is convex, but not coercive, and the feasible region Ω = {x = (x1, x2, x3)T : 0 ≤ x1 ≤ 2√3 − 3, 0 ≤ x2 ≤ 1, 0 ≤ x3 ≤ 6} is bounded. Here, we choose k = 10^5, ε0 = 0.4 and randomly select a set of initial points. Fig. 5 shows that the trajectories of neural network (11) converge to the optimal point x∗k = (0.0001, −0.0001, 1.0000)T, which can be viewed as an approximate solution of the above problem when k is large. Ultimately, we obtain that the optimal solution of (53) is x∗ = (0, 0, 1)T and the optimal value is f(x∗) = 1.
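The reported solutions of (52) and (53) can likewise be checked directly; the sketch below (illustrative only) verifies feasibility and evaluates the objectives at the reported points.

```python
import numpy as np

# Example 2, problem (52)
f2 = lambda x: np.exp(abs(x[0] + 1)) + 3 * x[1] * np.exp(abs(x[2] - 1))
x2 = np.array([-0.9377, -1.0640, 0.5333])
print(2 * x2[0] + 3 * x2[1] + 2 * x2[2] + 4)      # ~0 (equality constraint)
print(x2[0] + 2 * x2[1] + x2[2] <= 0,             # inequality constraints hold
      np.exp(abs(x2[2])) <= 9)
print(f2(x2))                                     # ~ -4.026

# Example 3, problem (53)
f3 = lambda x: (x[0] + 3 * x[1] + x[2]) ** 2 + 4 * (x[0] - x[1])
x3 = np.array([0.0, 0.0, 1.0])
print(x3.sum() == 1.0,
      x3[0] ** 2 - 6 * x3[1] - 4 * x3[2] + 3 <= 0,
      np.all(x3 >= 0))
print(f3(x3))                                     # 1.0
```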


Fig. 6. Transient behavior of Θ̂(t).

Example 4. Application to a belt conveyor. Belt conveyors, with their simple structure, high transportation efficiency and large throughput, are widely used for handling bulk material. A belt conveyor is a typical energy conversion system which converts electrical energy into mechanical energy. According to the requirements of the conveying process, a belt conveyor can operate separately or work together with other equipment in order to meet the needs of a production line. In this application, the mechanical power PT of a belt conveyor can be expressed as

PT = (FH + FN + Fs + Fst)V,

where V is the belt speed in meters per second (m/s). FH is the main resistance, which can be expressed as

FH = fLg[QRO + QRU + 2(QB + QG) cos δ],

where QRO is the unit mass of the rotating parts of the carrying idlers (kg/m), QRU is the unit mass of the rotating parts of the return idlers (kg/m), QB is the unit mass of the belt (kg/m), QG is the unit mass of the load (kg/m), with QG = T/(3.6V), δ is the inclination angle, f is the friction factor, L is the center-to-center distance of the belt (m), and g is the gravitational acceleration, whose value is 9.8 m/s² in this application. FN is the secondary resistance, which can be expressed as

FN = TV/3.6 + T²/(6.48ρb1²) + GFt,

where ρ is the density of the material (kg/m³), b1 is the interval of the skirt boards (m), and GFt is a constant. Fs is the slope resistance, which can be expressed as

Fs = k1 T²/V² + k2 T/V + k3,

where k1, k2, k3 are constant coefficients related to the structural parameters of the belt conveyor. Fst is the special resistance, which can be expressed as Fst = QG Hg, where H is the net change in elevation (m). Denote

θ1 = 1/(6.48ρb1²),
θ2 = gf(QRO + QRU + 2QB)[L cos δ + L(1 − cos δ)(1 − 2QB/(QRO + QRU + 2QB))] + k3 + GFt,
θ3 = k1,
θ4 = (gL sin δ + gfL cos δ)/3.6 + k2,        (54)

where θ1 is determined by the structural parameters, θ2 by the components of the belt conveyor, θ3 by the operating circumstances and θ4 by the characteristics of the material handled. Therefore, the mechanical power PT can be expressed in terms of the parameter vector Θ = (θ1, θ2, θ3, θ4)T of constant coefficients for a given belt conveyor as

PT = θ1T²V + θ2V + θ3 T²/V + θ4T + V²T/3.6.

In this application, Θ = (0.01, 150, 0.3, 20)T. With the sampling number N = 5, we set the values (V(1), V(2), V(3), V(4), V(5)) = (2, 1, 2.5, 1.2, 1.5) and (T(1), T(2), T(3), T(4), T(5)) = (20, 30, 20, 10, 10). According to the data we set, we obtain

(PT(1), PT(2), PT(3), PT(4), PT(5)) = (790.2222, 1037.3, 867.7222, 410.2, 452.75).

The estimate Θ̂ of Θ can be obtained by solving the following problem:

minimize   f(Θ̂) = Σ_{i=1}^{N} (PT(i) − L(i)Θ̂ − V²(i)T(i)/3.6)²
subject to l ≤ Θ̂ ≤ l̄,        (55)

where L(i) = (T²(i)V(i), V(i), T²(i)/V(i), T(i)), l = (0, 100, 0, 10)T and l̄ = (1, 10⁴, 1, 100)T. Here we use the proposed neural network (11) to solve (55), choosing ε0 = 0.01 and k = 10^5. The equilibrium point of neural network (11) is then Θ̂ = (0.0100, 149.9977, 0.3001, 19.9972)T, which is shown in Fig. 6. The estimation error is ∥Θ − Θ̂∥ = 0.0036.
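The sampled powers and the estimate in (55) can be cross-checked independently of the neural network; the sketch below (illustrative only) regenerates PT(i) from the true parameter vector and solves the box-constrained least-squares problem with a standard solver.

```python
import numpy as np
from scipy.optimize import lsq_linear

# Regenerate the sampled powers from the true parameters and cross-check the
# box-constrained least-squares estimate in (55) with a generic solver.
theta_true = np.array([0.01, 150.0, 0.3, 20.0])
V = np.array([2.0, 1.0, 2.5, 1.2, 1.5])
T = np.array([20.0, 30.0, 20.0, 10.0, 10.0])

L = np.column_stack([T**2 * V, V, T**2 / V, T])      # regressors L(i) in (55)
PT = L @ theta_true + V**2 * T / 3.6                 # P_T = L(i)Θ + V²T/3.6
print(PT)   # [790.2222..., 1037.333..., 867.7222..., 410.2, 452.75]

res = lsq_linear(L, PT - V**2 * T / 3.6,
                 bounds=(np.array([0.0, 100.0, 0.0, 10.0]),
                         np.array([1.0, 1e4, 1.0, 100.0])))
print(res.x)   # recovers (0.01, 150, 0.3, 20) up to numerical accuracy
```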

6. Conclusions

In this paper, based on Tikhonov regularization, we proposed a one-layer neural network to solve a class of convex optimization problems. In the considered problem, the objective function is not required to be coercive, and the feasible region is not restricted to be bounded. It is proved that the smallest-norm optimal solution of the convex programming problem can be approximated by the equilibrium point of the proposed neural network, and the global asymptotic stability of the proposed neural network is studied. Furthermore, some numerical examples are presented to show the improvements of this paper over some related references. Our future work will aim at nonconvex optimization problems.

Acknowledgments

This research is supported by the National Science Foundation of China (61403101, 61179069) and the Weihai Science and Technology Development Plan Project (2013DXGJ06).


References Attouch, H. (1996). Viscosity solutions of minimization problems. SIAM Journal on Optimization, 6, 769–806. Attouch, H., & Cominetti, R. (1996). A dynamical approach to convex minimization coupling approximation with the steepest descent method. Journal of Differential Equations, 128, 519–540. Attouch, H., & Czarnecki, M.-O. (2010). Asymptotic behavior of coupled dynamical systems with multiscale aspects. Journal of Differential Equations, 248, 1315–1344. Aubin, J. P. (1991). Viability theory. Cambridge, MA: Birkhäuser. Aubin, J. P., & Cellina, A. (1984). Differential inclusions. Berlin, Germany: SpringerVerlag. Bazaraa, M. S., Sherali, H. D., & Shetty, C. M. (1993). Nonlinear programming: Theory and algorithms. John Wiley & Sons. Bellen, A., Jackiewicz, Z., & Zennaro, M. (1988). Stability analysis of one-step methods for neutral delay-differential equations. Numerische Mathematik, 52, 605–619. Bian, W., & Xue, X. (2009). Subgradient-based neural networks for nonsmooth nonconvex optimization problems. IEEE Transactions on Neural Networks, 20, 1024–1038. Bian, W., & Xue, X. (2013). Neural network for solving constrained convex optimization problems with global attractivity. IEEE Transactions on Circuits and Systems I: Regular Papers, 60, 710–723. Charalambous, C., & Conn, A. (1978). An efficient method to solve the minimax problem directly. SIAM Journal on Numerical Analysis, 15, 162–187. Cheng, L., Hou, Z.-G., Lin, Y., Tan, M., Zhang, W. C., & Wu, F.-X. (2011). Recurrent neural network for non-smooth convex optimization problems with application to the identification of genetic regulatory networks. IEEE Transactions on Neural Networks, 22, 714–726. Clarke, F. (1983). Optimization and nonsmooth analysis. New York: Wiley. Cominetti, R., Peypouquet, J., & Sorin, S. (2008). Strong asymptotic convergence of evolution equations governed by maximal monotone operators with Tikhonov regularization. Journal of Differential Equations, 245, 3753–3763. Deng, M., & Bu, N. (2012). Robust control for nonlinear systems using passivitybased robust right coprime factorization. IEEE Transactions on Automatic Control, 57, 2599–2604. Di Pillo, G., Grippo, L., & Lucidi, S. (1997). Smooth transformation of the generalized minimax problem. Journal of Optimization Theory and Applications, 95, 1–24. Filippov, A. (1964). Differential equations with discontinuous right-hand side. American Mathematical Society Translations, 42, 199–231. Forti, M., Nistri, P., & Quincampoix, M. (2004). Generalized neural network for nonsmooth nonlinear programming problems. IEEE Transactions on Circuits and Systems I: Regular Papers, 51, 1741–1754. Forti, M., Nistri, P., & Quincampoix, M. (2006). Convergence of neural networks for programming problems via a nonsmooth Lojasiewicz inequality. IEEE Transactions on Neural Networks, 17, 1471–1486. Gao, X.-B., & Liao, L.-Z. (2010). A new one-layer neural network for linear and quadratic programming. IEEE Transactions on Neural Networks, 21, 918–929. Guo, Z., Liu, Q., & Wang, J. (2011). A one-layer recurrent neural network for pseudoconvex optimization subject to linear equality constraints. IEEE Transactions on Neural Networks, 22, 1892–1900. Hung, P. G., & Muu, L. D. (2011). The Tikhonov regularization extended to equilibrium problems involving pseudomonotone bifunctions. Nonlinear Analysis. Theory, Methods & Applications, 74, 6121–6129. Kennedy, M. P., & Chua, L. O. (1988). Neural networks for nonlinear programming. IEEE Transactions on Circuits and Systems, 35, 554–562. 
Liu, Q., Cao, J., & Chen, G. (2010). A novel recurrent neural network with finite-time convergence for linear programming. Neural Computation, 22, 2962–2978.


Liu, Q., Dang, C., & Cao, J. (2010). A novel recurrent neural network with one neuron and finite-time convergence for k-winners-take-all operation. IEEE Transactions on Neural Networks, 21, 1140–1148. Liu, Q., Guo, Z., & Wang, J. (2012). A one-layer recurrent neural network for constrained pseudoconvex optimization and its application for dynamic portfolio optimization. Neural Networks, 26, 99–109. Liu, Q., & Wang, J. (2008a). A one-layer recurrent neural network with a discontinuous activation function for linear programming. Neural Computation, 20, 1366–1383. Liu, Q., & Wang, J. (2008b). A one-layer recurrent neural network with a discontinuous hard-limiting activation function for quadratic programming. IEEE Transactions on Neural Networks, 19, 558–570. Liu, Q., & Wang, J. (2011). A one-layer recurrent neural network for constrained nonsmooth optimization. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 40, 1323–1333. Mannshardt, R. (1978). One-step methods of any order for ordinary differential equations with discontinuous right-hand sides. Numerische Mathematik, 31, 131–152. Oliveira, P., Santos, P., & Silva, A. (2012). A Tikhonov-type regularization for equilibrium problems in Hilbert spaces. Journal of Mathematical Analysis and Applications,. Qin, S., & Xue, X. (2010a). Dynamical analysis of neural networks of subgradient system. IEEE Transactions on Automatic Control, 55, 2347–2352. Qin, S., & Xue, X. (2010b). Dynamical behavior of a class of nonsmooth gradient-like systems. Neurocomputing, 73, 2632–2641. Qin, S., Xue, X., & Wang, P. (2013). Global exponential stability of almost periodic solution of delayed neural networks with discontinuous activations. Information Sciences, 220, 367–378. Stewart, D. E. (1990). High accuracy numerical methods for ordinary differential equations with discontinuous right-hand side. Bulletin of the Australian Mathematical Society, 42, 169–170. Stewart, D. E. (2011). Dynamics with inequalities. Society for Industrial and Applied Mathematics. Tank, D., & Hopfield, J. (1986). Simple ‘neural’ optimization networks: an A/D converter, signal decision circuit, and a linear programming circuit. IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, 33, 533–541. Taubert, D. K. (1981). Converging multistep methods for initial value problems involving multivalued maps. Computing, 27, 123–136. Tuy, H. (1998). Convex analysis and global optimization. New York: Springer-Verlag. Wang, J. (1994). A deterministic annealing neural network for convex programming. Neural Networks, 7, 629–641. Xia, Y., & Wang, J. (2004). A one-layer recurrent neural network for support vector machine learning. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 34, 1261–1269. Xia, Y., & Wang, J. (2005). A recurrent neural network for solving nonlinear convex programs subject to linear constraints. IEEE Transactions on Neural Networks, 16, 379–386. Xue, X., & Bian, W. (2008). Subgradient-based neural networks for nonsmooth convex optimization problems. IEEE Transactions on Circuits and Systems I: Regular Papers, 55, 2378–2391. Yang, Y., & Cao, J. (2010). The optimization technique for solving a class of non-differentiable programming based on neural network method. Nonlinear Analysis: Real World Applications, 11, 1108–1114. Zhang, S., & Constantinides, A. (1992). Lagrange programming neural networks. IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, 39, 441–452.
