RESEARCH ARTICLE
A Conjugate Gradient Algorithm with Function Value Information and N-Step Quadratic Convergence for Unconstrained Optimization

Xiangrong Li¹, Xupei Zhao², Xiabin Duan¹, Xiaoliang Wang¹*

1 Guangxi Colleges and Universities Key Laboratory of Mathematics and Its Applications, College of Mathematics and Information Science, Guangxi University, Nanning, Guangxi, P.R. China; 2 Mathematics and Physics Institute of Henan University of Urban Construction, Pingdingshan, Henan, P.R. China
* [email protected]

Citation: Li X, Zhao X, Duan X, Wang X (2015) A Conjugate Gradient Algorithm with Function Value Information and N-Step Quadratic Convergence for Unconstrained Optimization. PLoS ONE 10(9): e0137166. doi:10.1371/journal.pone.0137166

Editor: Hans A Kestler, Leibniz Institute for Age Research, GERMANY

Abstract
It is generally acknowledged that the conjugate gradient (CG) method achieves global convergence, with at most a linear convergence rate, because CG formulas are generated by linear approximations of the objective function. Quadratically convergent results for CG methods are very limited. We introduce a new PRP method in which a restart strategy is also used. Moreover, the developed method not only possesses n-step quadratic convergence but also exploits both function value information and gradient value information. In this paper, we show that the new PRP method (with either the Armijo line search or the Wolfe line search) is both linearly and quadratically convergent. The numerical experiments demonstrate that the new PRP algorithm is competitive with the normal CG method.
Received: January 18, 2015. Accepted: August 13, 2015. Published: September 18, 2015.

Copyright: © 2015 Li et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Introduction

Consider the unconstrained optimization problem
$$
\min_{x \in \mathbb{R}^{n}} f(x).
$$
If $\rho_{k-1} > 0$, we have
$$
\begin{aligned}
y_{k-1}^{*} &= y_{k-1} + \frac{\rho_{k-1}}{\|s_{k-1}\|^{2}}\, s_{k-1}\\
&= (g_k - g_{k-1}) + \frac{2\,[f(x_{k-1}) - f(x_k)] + \big(g(x_k) + g(x_{k-1})\big)^{T} s_{k-1}}{\|s_{k-1}\|^{2}}\, s_{k-1}\\
&= (g_k - g_{k-1}) + \frac{2\,[f(x_{k-1}) - f(x_k)]}{\|s_{k-1}\|^{2}}\, s_{k-1} + \frac{(g_k + g_{k-1})^{T} s_{k-1}}{\|s_{k-1}\|^{2}}\, s_{k-1}\\
&= 2 g_k + \frac{2\,[f(x_{k-1}) - f(x_k)]}{\|s_{k-1}\|^{2}}\, s_{k-1}.
\end{aligned}
$$
By
$$
f(x_{k-1}) - f(x_k) \le \tfrac{1}{2} M \|x_{k-1} - x_k\|^{2} = \tfrac{1}{2} M \alpha_{k-1}^{2} \|d_{k-1}\|^{2}
$$
and
$$
\|g_k\| \le M \|x_k - x_{k-1}\| = M \alpha_{k-1} \|d_{k-1}\|, \qquad s_{k-1} = \alpha_{k-1} d_{k-1},
$$
we obtain
$$
\|y_{k-1}^{*}\|
\le 2\|g_k\| + \frac{2\,|f(x_{k-1}) - f(x_k)|}{\|s_{k-1}\|}
\le 2\left( M \alpha_{k-1} \|d_{k-1}\| + \frac{\tfrac{1}{2} M \alpha_{k-1}^{2} \|d_{k-1}\|^{2}\,\alpha_{k-1}\|d_{k-1}\|}{\alpha_{k-1}^{2}\|d_{k-1}\|^{2}} \right)
= 3 M \alpha_{k-1} \|d_{k-1}\|.
$$
Then, we have
$$
|\beta_k^{PRP*}| = \frac{|g_k^{T} y_{k-1}^{*}|}{\|g_{k-1}\|^{2}}
\le \frac{\|g_k\|\,\|y_{k-1}^{*}\|}{\|g_{k-1}\|^{2}}
\le \frac{3 M \alpha_{k-1} \|g_k\|\,\|d_{k-1}\|}{c_1 \alpha_{k-1} \|d_{k-1}\|^{2}}
= \frac{3 M c_1^{-1} \|g_k\|}{\|d_{k-1}\|}
$$
and
$$
|\theta_k| = \frac{|g_k^{T} d_{k-1}|}{\|g_{k-1}\|^{2}}
\le \frac{\|g_k\|\,\|d_{k-1}\|}{c_1 \alpha_{k-1} \|d_{k-1}\|^{2}}
= \frac{c_1^{-1} \alpha_{k-1}^{-1} \|g_k\|}{\|d_{k-1}\|}.
$$
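For concreteness, the quantities above can be collected into a single direction update. The following is a hedged Python sketch (all function and variable names are ours), assuming the direction formula (8) has the form $d_k = -g_k + \beta_k^{PRP*} d_{k-1} - \theta_k y_{k-1}^{*}$:

```python
def mprp_direction(gk, gk_prev, d_prev, s_prev, f_prev, f_curr):
    """One MPRP direction update; vectors are plain Python lists.

    Sketch of Eq (8) as we read it: d_k = -g_k + beta * d_{k-1} - theta * y*_{k-1},
    where y*_{k-1} folds function-value information into y_{k-1} when rho_{k-1} > 0.
    """
    dot = lambda a, b: sum(p * q for p, q in zip(a, b))

    y = [a - b for a, b in zip(gk, gk_prev)]          # y_{k-1} = g_k - g_{k-1}
    # rho_{k-1} = 2[f(x_{k-1}) - f(x_k)] + (g(x_k) + g(x_{k-1}))^T s_{k-1}
    rho = 2.0 * (f_prev - f_curr) + dot([a + b for a, b in zip(gk, gk_prev)], s_prev)
    if rho > 0:                                       # modified difference y*_{k-1}
        ss = dot(s_prev, s_prev)
        y = [yi + rho * si / ss for yi, si in zip(y, s_prev)]

    g_prev_sq = dot(gk_prev, gk_prev)
    beta = dot(gk, y) / g_prev_sq                     # beta_k^{PRP*} = g_k^T y*_{k-1} / ||g_{k-1}||^2
    theta = dot(gk, d_prev) / g_prev_sq               # theta_k = g_k^T d_{k-1} / ||g_{k-1}||^2
    return [-g + beta * d - theta * yy for g, d, yy in zip(gk, d_prev, y)]
```

On a one-dimensional quadratic slice the correction term $\rho_{k-1}$ vanishes and the update reduces to the classical PRP form.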
Therefore, it follows from (8) and the Lipschitz continuity of $g$ that
$$
\begin{aligned}
\|d_k\| &= \big\| -g_k + \beta_k^{PRP*} d_{k-1} - \theta_k y_{k-1}^{*} \big\| \\
&\le \|g_k\| + |\beta_k^{PRP*}|\,\|d_{k-1}\| + |\theta_k|\,\|y_{k-1}^{*}\| \\
&\le \|g_k\| + \frac{3 M c_1^{-1}\|g_k\|}{\|d_{k-1}\|}\,\|d_{k-1}\|
      + \frac{c_1^{-1}\alpha_{k-1}^{-1}\|g_k\|}{\|d_{k-1}\|}\cdot 3 M \alpha_{k-1}\|d_{k-1}\| \\
&= \|g_k\| + 3 c_1^{-1} M \|g_k\| + 3 c_1^{-1} M \|g_k\| \\
&= (1 + 6 c_1^{-1} M)\,\|g_k\|. \qquad (17)
\end{aligned}
$$
If the Armijo line search is used and $\alpha_k \neq 1$, then $\alpha_k' = \alpha_k \rho^{-1}$ satisfies
$$
f(x_k + \alpha_k' d_k) - f(x_k) > \sigma_1 \alpha_k' g_k^{T} d_k.
$$
By the mean-value theorem and the above relation, we can deduce that there exists a
scalar $\mu_k \in (0,1)$ satisfying
$$
\begin{aligned}
\sigma_1 \alpha_k' g_k^{T} d_k
&< f(x_k + \alpha_k' d_k) - f(x_k) = \alpha_k'\, g(x_k + \mu_k \alpha_k' d_k)^{T} d_k \\
&= \alpha_k' \big( g(x_k + \mu_k \alpha_k' d_k) - g(x_k) \big)^{T} d_k + \alpha_k' g_k^{T} d_k \\
&= \mu_k (\alpha_k')^{2}\, d_k^{T} \left[ \int_0^1 \nabla^2 f(x_k + t \mu_k \alpha_k' d_k)\, dt \right] d_k + \alpha_k' g_k^{T} d_k \\
&\le (\alpha_k')^{2} M \|d_k\|^{2} + \alpha_k' g_k^{T} d_k.
\end{aligned}
$$
Thus, the relation
$$
\alpha_k = \rho \alpha_k' \ge \frac{(1-\sigma_1)\rho\,(-g_k^{T} d_k)}{M \|d_k\|^{2}} = \frac{(1-\sigma_1)\rho\,\|g_k\|^{2}}{M \|d_k\|^{2}}
$$
holds. By the relation $1 + 6Mc_1^{-1} > 1 + 2Mc_1^{-1} > 0$, we can find that
$$
\frac{1}{1 + 2Mc_1^{-1}} > \frac{1}{1 + 6Mc_1^{-1}}.
$$
Then, we obtain
$$
\alpha_k = \rho \alpha_k' \ge \frac{(1-\sigma_1)\rho\,\|g_k\|^{2}}{M \|d_k\|^{2}}
\ge M^{-1} (1-\sigma_1)\,\rho\,(1 + 6Mc_1^{-1})^{-2} \triangleq c^{*}.
$$
Setting $\bar{c} = \min\{1, c^{*}\}$, we have Eq (16). If the Wolfe line search is used, then using Eq (13), we have
$$
M \alpha_k \|d_k\|^{2} \ge \big( g(x_k + \alpha_k d_k) - g_k \big)^{T} d_k \ge -(1-\sigma_2)\, g_k^{T} d_k.
$$
Analyzing this in the same manner as for the Armijo line search, we can find a positive lower bound for the step size $\alpha_k$. The proof is complete.

Similar to [34], we can establish the following theorem; we state it below but omit its proof.

Theorem 0.1 Let Assumption (i) hold, and let the sequence $\{x_k\}$ be generated by the MPRP method with the Armijo or the Wolfe line search technique. Then, there are constants $a > 0$ and $r \in (0,1)$ that satisfy
$$
\|x_k - x^{*}\| \le a r^{k}. \qquad (18)
$$
The restart MPRP method and its convergence

As in [34], we define an initial step length $\gamma_k$ as follows:
$$
\gamma_k = \frac{\varepsilon_k \|g_k\|^{2}}{d_k^{T}\big( g(x_k + \varepsilon_k d_k) - g(x_k) \big)}, \qquad (19)
$$
using a positive sequence $\{\varepsilon_k\} \to 0$ as $k \to \infty$. Moreover, we can obtain $|\alpha_k - \gamma_k| \to 0$.

Theorem 0.2 Let the sequence $\{x_k\}$ be generated by the MPRP method, and let Assumption (i) hold. Then, for sufficiently large $k$, $\gamma_k$ satisfies the Armijo and the Wolfe-Powell line search conditions.

Theorem 0.2 shows that, for sufficiently large $k$, $\gamma_k$ can be defined by Eq (19) such that the Armijo and Wolfe-Powell line search conditions are satisfied. In the following, $|\gamma_k|$ is used as the initial step length of the restart MPRP method.
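The trial step of Eq (19) costs one extra gradient evaluation and acts as a secant (quasi-Newton) estimate of the minimizing step along $d_k$. A minimal Python sketch under our reading of (19) (names are ours; `eps` plays the role of $\varepsilon_k$):

```python
def initial_step(grad, x, d, eps=1e-4):
    """Trial step length gamma_k of Eq (19), as we read it:
    gamma_k = eps * ||g_k||^2 / (d_k^T (g(x_k + eps*d_k) - g(x_k))).
    For a quadratic with Hessian A this equals ||g||^2 / (d^T A d), which is
    the exact minimizing step along d whenever g^T d = -||g||^2 (e.g. d = -g).
    """
    g = grad(x)
    x_trial = [xi + eps * di for xi, di in zip(x, d)]
    g_trial = grad(x_trial)
    denom = sum(di * (gt - gi) for di, gt, gi in zip(d, g_trial, g))
    return eps * sum(gi * gi for gi in g) / denom
```

For $f(x) = \tfrac12(x_1^2 + 4x_2^2)$ at $x = (1,1)$ with $d = -g = (-1,-4)$, the sketch returns $17/65$, the exact minimizer along $d$.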
Algorithm 4.1 (RMPRP)

Step 0: Given $x_0 \in \mathbb{R}^{n}$, $\varepsilon \ge 0$, $d_0 = -g_0$, and an integer $\gamma > 0$; let $k := 0$.
Step 1: If $\|g_k\| \le \varepsilon$, stop.
Step 2: If the inequality $f(x_k + |\gamma_k| d_k) \le f(x_k) + \sigma_1 |\gamma_k| g_k^{T} d_k$ holds, set $\alpha_k = |\gamma_k|$; otherwise, determine $\alpha_k = \max\{|\gamma_k| \rho^{j} \mid j = 0, 1, 2, \ldots\}$ satisfying
$$
f(x_k + \alpha_k d_k) \le f(x_k) + \sigma_1 \alpha_k g_k^{T} d_k. \qquad (20)
$$
Step 3: Let $x_{k+1} = x_k + \alpha_k d_k$, and $k := k + 1$.
Step 4: If $\|g_k\| \le \varepsilon$, stop.
Step 5: If $k = \gamma$, let $x_0 := x_k$ and go to Step 1.
Step 6: Compute $d_k$ using (8) and go to Step 2.
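The steps above can be assembled into a small driver. The sketch below follows our reading of Algorithm 4.1: Armijo backtracking started from $|\gamma_k|$, a steepest-descent reset every `restart` iterations, and the MPRP direction with function-value information. All names, default parameters, and the safeguards (backtracking cap, descent fallback) are ours, not the authors':

```python
def rmprp(f, grad, x0, eps=1e-6, rho=0.1, sigma1=1e-4, restart=10,
          eps_k=1e-4, max_iter=500):
    """Hedged sketch of Algorithm 4.1 (RMPRP); vectors are Python lists."""
    dot = lambda a, b: sum(p * q for p, q in zip(a, b))
    x = list(x0)
    g = grad(x)
    d = [-gi for gi in g]                       # d_0 = -g_0
    f_x = f(x)
    for k in range(max_iter):
        if dot(g, g) ** 0.5 <= eps:             # Steps 1/4: small gradient -> stop
            break
        gTd = dot(g, d)
        if gTd >= 0:                            # safeguard: fall back to steepest descent
            d = [-gi for gi in g]
            gTd = dot(g, d)
        # Eq (19): trial step gamma_k via one extra gradient evaluation
        xt = [xi + eps_k * di for xi, di in zip(x, d)]
        denom = dot(d, [a - b for a, b in zip(grad(xt), g)])
        gamma = eps_k * dot(g, g) / denom if denom > 0 else 1.0
        alpha = abs(gamma)
        for _ in range(50):                     # Step 2: Armijo backtracking from |gamma_k|
            if f([xi + alpha * di for xi, di in zip(x, d)]) <= f_x + sigma1 * alpha * gTd:
                break
            alpha *= rho
        x_new = [xi + alpha * di for xi, di in zip(x, d)]   # Step 3
        g_new, f_new = grad(x_new), f(x_new)
        if (k + 1) % restart == 0:              # Step 5: periodic restart
            d = [-gi for gi in g_new]
        else:                                   # Step 6: MPRP direction with y*
            s = [a - b for a, b in zip(x_new, x)]
            y = [a - b for a, b in zip(g_new, g)]
            r = 2.0 * (f_x - f_new) + dot([a + b for a, b in zip(g_new, g)], s)
            if r > 0:
                ss = dot(s, s)
                y = [yi + r * si / ss for yi, si in zip(y, s)]
            gg = dot(g, g)
            beta, theta = dot(g_new, y) / gg, dot(g_new, d) / gg
            d = [-a + beta * di - theta * yi for a, di, yi in zip(g_new, d, y)]
        x, g, f_x = x_new, g_new, f_new
    return x
```

On a strictly convex quadratic the trial step $|\gamma_k|$ is accepted immediately by the Armijo test, so the driver behaves like a PRP-type CG method with near-exact line searches.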
We now establish the global convergence of Algorithm 4.1.

Theorem 0.3 Let the conditions of Theorem 0.2 hold. Then, the following relation holds:
$$
\lim_{k \to \infty} \|g_k\| = 0. \qquad (21)
$$

Proof. We prove this theorem by contradiction. Suppose that Eq (21) does not hold; then there exists a constant $\varepsilon_1 > 0$ satisfying, for all $k$,
$$
\|g_k\| \ge \varepsilon_1. \qquad (22)
$$
Using Eqs (10) and (20), if $f$ is bounded from below, we can obtain
$$
\sum_{k=0}^{\infty} \alpha_k^{2} \|d_k\|^{2} < \infty. \qquad (23)
$$
In particular, we have
$$
\lim_{k \to \infty} \alpha_k \|d_k\| = 0. \qquad (24)
$$
If $\lim_{k \to \infty} \alpha_k > 0$, then by Eqs (10) and (24) we obtain $\lim_{k \to \infty} \|g_k\| = 0$. This contradicts Eq (22), so Eq (21) holds. Otherwise, if $\lim_{k \to \infty} \alpha_k = 0$, then there is an infinite index set $K_0$ satisfying
$$
\lim_{k \in K_0,\, k \to \infty} \alpha_k = 0. \qquad (25)
$$
According to Step 2 of Algorithm 4.1, when $k$ is sufficiently large, $\alpha_k \rho^{-1}$ does not satisfy Eq (20), which implies that
$$
f(x_k + \alpha_k \rho^{-1} d_k) - f(x_k) > -\delta \rho^{-2} \alpha_k^{2} \|d_k\|^{2}. \qquad (26)
$$
By Eq (22), similar to the proof of Lemma 2.1 in [33], we can deduce that there is a constant $\varrho > 0$ such that
$$
\|d_k\| \ge \varrho, \qquad \forall k. \qquad (27)
$$
Using Eqs (27) and (10) and the mean-value theorem, we have
$$
\begin{aligned}
f(x_k + \alpha_k \rho^{-1} d_k) - f(x_k)
&= \rho^{-1} \alpha_k\, g(x_k + \xi_0 \rho^{-1} \alpha_k d_k)^{T} d_k \\
&= \rho^{-1} \alpha_k g_k^{T} d_k + \rho^{-1} \alpha_k \big( g(x_k + \xi_0 \rho^{-1} \alpha_k d_k) - g_k \big)^{T} d_k \\
&\le \rho^{-1} \alpha_k g_k^{T} d_k + M \rho^{-2} \alpha_k^{2} \|d_k\|^{2},
\end{aligned}
$$
where $\xi_0 \in (0,1)$ and the last inequality follows from Eq (15). Combining this result with Eq (26), for
all sufficiently large $k \in K_0$, we obtain
$$
\|g_k\|^{2} \le \rho^{-1} (M + \delta)\, \alpha_k \|d_k\|^{2}.
$$
By Eq (27) and $\lim_{k \in K_0,\, k \to \infty} \alpha_k = 0$, the above inequality then implies that $\lim_{k \in K_0,\, k \to \infty} \|g_k\| = 0$. This is also a contradiction. The proof is complete.

Lemma 0.3 Let Assumption (i) hold, and let the sequence $\{x_k\}$ be generated by the RMPRP method. Then, there are four positive numbers $c_i$, $i = 1, 2, 3, 4$, that satisfy
$$
\|g_{k+1}\| \le c_1 \|d_k\|, \qquad |\beta_{k+1}^{PRP*}| \le c_2, \qquad |\theta_{k+1}| \le c_3, \qquad \|d_{k+1}\| \le c_4 \|d_k\|. \qquad (28)
$$
Proof. Considering the first inequality of Eq (28), we have
$$
\|g_{k+1}\| = \|g_k + (g_{k+1} - g_k)\| \le \|g_k\| + |\gamma_k|\, \|\hat{A}_k d_k\|
\le \|d_k\| + \frac{M}{m} \|d_k\| = \Big(1 + \frac{M}{m}\Big) \|d_k\| \triangleq c_1 \|d_k\|, \qquad (29)
$$
where $\hat{A}_k = \int_0^1 \nabla^2 f(x_k + t |\gamma_k| d_k)\, dt$. Using the definition of $\beta_{k+1}^{PRP*}$, we now discuss the three other inequalities of Eq (28). For $\beta_{k+1}^{PRP*}$, we have
$$
|\beta_{k+1}^{PRP*}| = \frac{|g_{k+1}^{T} y_k^{*}|}{\|g_k\|^{2}} \le \frac{\|g_{k+1}\|\, \|y_k^{*}\|}{\|g_k\|^{2}}.
$$
Case 1: If $\rho_k \le 0$, then $y_k^{*} = y_k = g_{k+1} - g_k$. Therefore, we have
$$
|\beta_{k+1}^{PRP*}| \le \frac{\|g_{k+1}\|\, \|g_{k+1} - g_k\|}{\|g_k\|^{2}}
\le \frac{c_1 \|d_k\| \cdot M |\gamma_k|\, \|d_k\|}{|\gamma_k|\, d_k^{T} A_k d_k}
\le \frac{c_1 M \|d_k\|^{2}}{m \|d_k\|^{2}} = c_1 m^{-1} M \triangleq c_2',
$$
where $A_k = \int_0^1 \nabla^2 f(x_k + t \varepsilon_k d_k)\, dt$ and the second inequality follows from the mean-value theorem and Eq (29).
Case 2: If $\rho_k > 0$, then we have
$$
y_k^{*} = y_k + \frac{\rho_k}{\|s_k\|^{2}}\, s_k, \qquad
\rho_k = 2\,[f(x_k) - f(x_{k+1})] + \big(g(x_{k+1}) + g(x_k)\big)^{T} s_k.
$$
Thus, we have
$$
\begin{aligned}
|\beta_{k+1}^{PRP*}| &\le \frac{\|g_{k+1}\|\, \|y_k^{*}\|}{\|g_k\|^{2}} \\
&\le \frac{\|g_{k+1}\|}{\|g_k\|^{2}} \left( \|g_{k+1} - g_k\| + \frac{2\,|f(x_{k+1}) - f(x_k)|\, \|s_k\| + \|g_{k+1} + g_k\|\, \|s_k\|^{2}}{\|s_k\|^{2}} \right) \\
&= \frac{\|g_{k+1}\|\, \|g_{k+1} - g_k\|}{\|g_k\|^{2}} + \frac{\|g_{k+1}\|}{\|g_k\|^{2}} \left( \frac{2\,|f(x_{k+1}) - f(x_k)|}{\|s_k\|} + \|g_{k+1} + g_k\| \right) \\
&\le c_2' + \big[ c_1 (1 + 2 M c_1^{-1}) \big] (M + c_1 + 1) \triangleq c_2''.
\end{aligned}
$$
Let $c_2 = \max\{c_2', c_2''\}$. We obtain $|\beta_{k+1}^{PRP*}| \le c_2$ and
$$
|\theta_{k+1}| = \frac{|g_{k+1}^{T} d_k|}{\|g_k\|^{2}} \le \frac{\|g_{k+1}\|\, \|d_k\|}{\|g_k\|^{2}} \le \frac{c_1 \|d_k\|^{2}}{\|g_k\|^{2}} \le c_1 (1 + 2 M c_1^{-1})^{2} \triangleq c_3.
$$
Using the definition of $d_{k+1}$, we obtain
$$
\|d_{k+1}\| = \big\| -g_{k+1} + \beta_{k+1}^{PRP*} d_k - \theta_{k+1} y_k^{*} \big\|
\le \|g_{k+1}\| + |\beta_{k+1}^{PRP*}|\, \|d_k\| + |\theta_{k+1}|\, \|y_k^{*}\|.
$$
Case 1: If $\rho_k \le 0$, we have
$$
\begin{aligned}
\|d_{k+1}\| &\le \|g_{k+1}\| + |\beta_{k+1}^{PRP*}|\, \|d_k\| + |\theta_{k+1}| \big( \|g_{k+1}\| + \|g_k\| \big) \\
&= (1 + |\theta_{k+1}|)\, \|g_{k+1}\| + |\beta_{k+1}^{PRP*}|\, \|d_k\| + |\theta_{k+1}|\, \|g_k\| \\
&\le (1 + c_3)\, c_1 \|d_k\| + c_2 \|d_k\| + c_3 \|d_k\|
= \big[ (1 + c_3) c_1 + c_2 + c_3 \big] \|d_k\| \triangleq c_4' \|d_k\|.
\end{aligned}
$$
Case 2: If $\rho_k > 0$, we have $\|y_k^{*}\| \le 3 M \alpha_k \|d_k\| \le 3 M \|d_k\|$ (since $0 < \alpha_k \le 1$), and
$$
\|d_{k+1}\| \le \|g_{k+1}\| + |\beta_{k+1}^{PRP*}|\, \|d_k\| + |\theta_{k+1}|\, \|y_k^{*}\|
\le c_1 \|d_k\| + c_2 \|d_k\| + 3 c_3 M \|d_k\| = (c_1 + c_2 + 3 M c_3)\, \|d_k\| \triangleq c_4'' \|d_k\|.
$$
Letting $c_4 = \max\{c_4', c_4''\}$, we obtain $\|d_{k+1}\| \le c_4 \|d_k\|$. The proof is complete.

Assumption (ii) In some neighborhood $N$ of $x^{*}$, $\nabla^2 f$ is Lipschitz continuous.

Similar to [34], we can also establish the n-step quadratic convergence of the RMPRP method. Here, we state the theorem but omit the proof.
Table 1. Definition of the benchmark problems and their features.

| No. | Function | Definition | Multimodal? | Separable? | Regular? |
|---|---|---|---|---|---|
| 1 | Sphere | $f_{Sph}(x) = \sum_{i=1}^{p} x_i^2$; $x_i \in [-5.12, 5.12]$, $x^* = (0, \ldots, 0)$, $f_{Sph}(x^*) = 0$ | no | yes | n/a |
| 2 | Schwefel's (double sum) | $f_{SchDS}(x) = \sum_{i=1}^{p} \big(\sum_{j=1}^{i} x_j\big)^2$; $x_i \in [-65.536, 65.536]$, $x^* = (0, \ldots, 0)$, $f_{SchDS}(x^*) = 0$ | no | no | n/a |
| 3 | Rastrigin | $f_{Ras}(x) = 10p + \sum_{i=1}^{p} \big(x_i^2 - 10\cos(2\pi x_i)\big)$; $x_i \in [-5.12, 5.12]$, $x^* = (0, \ldots, 0)$, $f_{Ras}(x^*) = 0$ | yes | yes | n/a |
| 4 | Schwefel | $f_{Sch}(x) = 418.9829\,p + \sum_{i=1}^{p} x_i \sin\big(\sqrt{|x_i|}\big)$; $x_i \in [-512.03, 511.97]$, $x^* = (-420.9678, \ldots, -420.9678)$, $f_{Sch}(x^*) = 0$ | yes | yes | n/a |
| 5 | Griewank | $f_{Gri}(x) = 1 + \sum_{i=1}^{p} \frac{x_i^2}{4000} - \prod_{i=1}^{p} \cos\big(\frac{x_i}{\sqrt{i}}\big)$; $x_i \in [-600, 600]$, $x^* = (0, \ldots, 0)$, $f_{Gri}(x^*) = 0$ | yes | no | yes |
| 6 | Rosenbrock | $f_{Ros}(x) = \sum_{i=1}^{p-1} \big[100 (x_{i+1} - x_i^2)^2 + (x_i - 1)^2\big]$; $x_i \in [-2.048, 2.048]$, $x^* = (1, \ldots, 1)$, $f_{Ros}(x^*) = 0$ | no | no | n/a |
| 7 | Ackley | $f_{Ack}(x) = 20 + e - 20 e^{-0.2\sqrt{\frac{1}{p}\sum_{i=1}^{p} x_i^2}} - e^{\frac{1}{p}\sum_{i=1}^{p}\cos(2\pi x_i)}$; $x_i \in [-30, 30]$, $x^* = (0, \ldots, 0)$, $f_{Ack}(x^*) = 0$ | yes | no | yes |
| 8 | Langerman | $f_{Lan}(x) = -\sum_{i=1}^{m} c_i\, e^{-\frac{1}{\pi}\sum_{j=1}^{p}(x_j - a_{ij})^2} \cos\big(\pi \sum_{j=1}^{p}(x_j - a_{ij})^2\big)$; $x_i \in [0, 10]$, $m = p$, $x^*$ random, $f_{Lan}(x^*)$ random | yes | no | no |

doi:10.1371/journal.pone.0137166.t001
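A few of the Table 1 benchmarks are straightforward to reproduce. A Python sketch of four of them (definitions follow the table as we read it; `p` is the dimension, and each has global minimum value 0):

```python
import math

def sphere(x):
    """f_Sph(x) = sum x_i^2; minimum 0 at x* = (0, ..., 0)."""
    return sum(xi * xi for xi in x)

def rosenbrock(x):
    """f_Ros(x) = sum 100(x_{i+1} - x_i^2)^2 + (x_i - 1)^2; minimum 0 at (1, ..., 1)."""
    return sum(100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (x[i] - 1.0) ** 2
               for i in range(len(x) - 1))

def griewank(x):
    """f_Gri(x) = 1 + sum x_i^2/4000 - prod cos(x_i / sqrt(i)); minimum 0 at 0."""
    prod = 1.0
    for i, xi in enumerate(x, start=1):
        prod *= math.cos(xi / math.sqrt(i))
    return 1.0 + sum(xi * xi for xi in x) / 4000.0 - prod

def ackley(x):
    """f_Ack(x) = 20 + e - 20 exp(-0.2 sqrt(mean x_i^2)) - exp(mean cos(2 pi x_i))."""
    p = len(x)
    return (20.0 + math.e
            - 20.0 * math.exp(-0.2 * math.sqrt(sum(xi * xi for xi in x) / p))
            - math.exp(sum(math.cos(2.0 * math.pi * xi) for xi in x) / p))
```

The multimodal entries (Griewank, Ackley) are the ones for which a gradient method's result depends strongly on the starting point, which is why Tables 2–3 report four starting points per problem.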
Theorem 0.4 Let Assumptions (i) and (ii) hold. Then, there exists a constant $c^{0} > 0$ such that
$$
\limsup_{k \to \infty} \frac{\|x_{kr+n} - x^{*}\|}{\|x_{kr} - x^{*}\|^{2}} \le c^{0} < \infty. \qquad (30)
$$
That is, the RMPRP method is n-step quadratically convergent.
Numerical Results

This section reports numerical experiments with Algorithm 4.1 and a normal algorithm to demonstrate the effectiveness of the proposed algorithm. The normal algorithm (called Algorithm N) does not use the restart technique; its other steps are the same as those of Algorithm 4.1. We test both algorithms on the benchmark problems listed in Table 1. These benchmark problems, and discussions concerning the choice of test problems for an algorithm, can be found at http://www.cs.cmu.edu/afs/cs/project/jair/pub/volume24/ortizboyer05a-html/node6.html.

The code was written in MATLAB 7.6.0 and run on a PC with a Core 2 Duo E7500 CPU @ 2.93 GHz, 2.00 GB of memory, and the Windows XP operating system. The parameters of the algorithms are chosen as $\rho = 0.1$, $\gamma = 10$, $\sigma_1 = 0.0001$, and $\varepsilon_k = \varepsilon_1 = \cdots = 10^{-4}$. The dimensions can be found in Tables 2–3. Because the line search cannot always ensure that the descent condition $d_k^{T} g_k < 0$ holds, uphill searches can occur in the experiments. To avoid this situation, the step size $\alpha_k$ is accepted if the searching number is larger than ten in every line search technique. The Himmelblau stop rule is used: if $|f(x_k)| > e_1$, set $\mathrm{stop1} = \frac{|f(x_k) - f(x_{k+1})|}{|f(x_k)|}$; otherwise, set $\mathrm{stop1} = |f(x_k) - f(x_{k+1})|$. For each problem, the program stops if $\|g(x)\| < e_3$ or $\mathrm{stop1} < e_2$ is satisfied, where $e_1 = e_2 = e_3 = 10^{-4}$.
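The stopping test described above can be sketched as follows (a hedged reading; the function and argument names are ours):

```python
def should_stop(f_k, f_k1, gnorm, e1=1e-4, e2=1e-4, e3=1e-4):
    """Himmelblau-style stop test as described in the text: use the relative
    decrease |f(x_k) - f(x_{k+1})| / |f(x_k)| when |f(x_k)| > e1, else the
    absolute decrease; also stop when the gradient norm falls below e3."""
    if abs(f_k) > e1:
        stop1 = abs(f_k - f_k1) / abs(f_k)
    else:
        stop1 = abs(f_k - f_k1)
    return gnorm < e3 or stop1 < e2
```

Switching to the absolute decrease near a zero-valued minimum avoids dividing by a vanishing $|f(x_k)|$, which is why the rule branches on $e_1$.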
Table 2. Test results using Algorithm 4.1.

| No. | Dim | NI/NFG/$f(\bar{x})$ | NI/NFG/$f(\bar{x})$ | NI/NFG/$f(\bar{x})$ | NI/NFG/$f(\bar{x})$ |
|---|---|---|---|---|---|
| 1 | $x_0$ | (−5, …) | (−3, …) | (2, …) | (4, …) |
| | 30 | 2/6/8.355554e-023 | 2/6/3.274972e-024 | 2/6/1.455543e-024 | 2/6/5.822172e-024 |
| | 10000 | 2/6/2.788150e-020 | 2/6/1.091657e-021 | 2/6/4.851810e-022 | 2/6/1.940724e-021 |
| | 100000 | 2/6/5.970100e-019 | 2/6/8.813541e-020 | 2/6/4.871394e-021 | 2/6/1.948557e-020 |
| | 103000 | 2/6/5.962874e-019 | 2/6/9.164034e-020 | 2/6/4.997364e-021 | 2/6/1.998946e-020 |
| 2 | $x_0$ | (−0.002, …) | (−0.001, …) | (0.001, …) | (0.0002, …) |
| | 30 | 4/12/7.338081e-006 | 4/12/1.834520e-006 | 4/12/1.834520e-006 | 3/9/4.070200e-007 |
| | 60 | 5/15/1.607584e-005 | 4/12/1.503191e-005 | 4/12/1.503191e-005 | 3/9/3.227252e-006 |
| | 100 | 6/18/2.574911e-005 | 5/15/1.889258e-005 | 5/15/1.889258e-005 | 4/12/2.785399e-006 |
| | 200 | 8/24/3.830680e-005 | 7/21/2.110989e-005 | 7/21/2.110989e-005 | 5/15/6.065297e-006 |
| 3 | $x_0$ | (−0.02, 0, …) | (0.01, 0, …) | (0.001, 0, …) | (0.006, 0, …) |
| | 30 | 3/9/0.000000e+000 | 3/9/0.000000e+000 | 2/6/4.547474e-013 | 3/9/0.000000e+000 |
| | 60 | 3/9/0.000000e+000 | 3/9/0.000000e+000 | 2/6/1.136868e-012 | 3/9/0.000000e+000 |
| | 100 | 3/9/0.000000e+000 | 3/9/0.000000e+000 | 2/6/1.136868e-012 | 3/9/0.000000e+000 |
| | 200 | 3/9/0.000000e+000 | 3/9/0.000000e+000 | 2/6/1.136868e-012 | 3/9/0.000000e+000 |
| 4 | $x_0$ | (−500, …) | (−10, …) | (−300, …) | (50, …) |
| | 30 | 9/28/4.588939e+002 | 5/15/1.233277e+004 | 10/30/8.751388e+003 | 2/16/1.469607e+004 |
| | 1000 | 9/28/1.529646e+004 | 5/15/4.110923e+005 | 10/30/2.917129e+005 | 2/16/4.898690e+005 |
| | 10000 | 9/28/1.529646e+005 | 5/15/4.110923e+006 | 10/30/2.917129e+006 | 2/16/4.898690e+006 |
| | 100000 | 9/28/1.529646e+006 | 5/15/4.110923e+007 | 10/30/2.917129e+007 | 2/16/4.898690e+007 |
| 5 | $x_0$ | (−2, …) | (−1, …) | (1, …) | (2, …) |
| | 30 | 2/6/8.548717e-015 | 2/6/5.644248e-009 | 2/6/5.644248e-009 | 2/6/8.548717e-015 |
| | 1000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 |
| | 10000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 |
| | 100000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 |
| 6 | $x_0$ | (1.01, …) | (0.9999, …) | (1.003, …) | (1.005, …) |
| | 30 | 12/36/6.832422e-005 | 2/6/1.919312e-006 | 6/18/5.497859e-005 | 9/27/6.152662e-005 |
| | 1000 | 12/36/6.562848e-005 | 3/9/3.607120e-007 | 6/18/5.343543e-005 | 9/27/5.799691e-005 |
| | 10000 | 12/36/6.382394e-005 | 3/9/3.615221e-007 | 7/21/4.048983e-005 | 9/27/6.078016e-005 |
| | 100000 | 12/36/5.969884e-005 | 3/9/3.632823e-007 | 7/21/4.484866e-005 | 9/27/5.885821e-005 |
| 7 | $x_0$ | (−0.1, …) | (0.1, …) | (0.01, …) | (0.03, …) |
| | 30 | 11/50/1.684545e+000 | 11/50/1.684545e+000 | 11/58/1.684467e+000 | 6/25/1.684388e+000 |
| | 1000 | 14/67/1.717428e+000 | 14/67/1.717428e+000 | 9/46/1.717374e+000 | 11/56/1.717448e+000 |
| | 10000 | 15/75/1.718189e+000 | 15/75/1.718189e+000 | 9/46/1.718252e+000 | 11/57/1.718327e+000 |
| | 100000 | 15/75/1.718273e+000 | 15/75/1.718273e+000 | 9/46/1.718340e+000 | 11/57/1.718410e+000 |
| 8 | $x_0$ | (−5, …) | (−1, …) | (2, …) | (4, …) |
| | 30 | 1/3/-1.889333e-104 | 2/16/-8.226096e-004 | 1/3/-2.009161e-015 | 1/3/-1.228836e-064 |
| | 300 | 1/3/0.000000e+000 | 1/3/-3.893431e-040 | 1/3/-1.008253e-163 | 1/3/0.000000e+000 |
| | 500 | 1/3/0.000000e+000 | 1/3/-1.459270e-067 | 1/3/-4.297704e-274 | 1/3/0.000000e+000 |
| | 1000 | 1/3/0.000000e+000 | 1/3/-2.213361e-136 | 1/3/0.000000e+000 | 1/3/0.000000e+000 |

doi:10.1371/journal.pone.0137166.t002
Table 3. Test results using Algorithm N.

| No. | Dim | NI/NFG/$f(\bar{x})$ | NI/NFG/$f(\bar{x})$ | NI/NFG/$f(\bar{x})$ | NI/NFG/$f(\bar{x})$ |
|---|---|---|---|---|---|
| 1 | $x_0$ | (−5, …) | (−3, …) | (2, …) | (4, …) |
| | 30 | 2/6/8.355554e-023 | 2/6/3.274972e-024 | 2/6/1.455543e-024 | 2/6/5.822172e-024 |
| | 10000 | 2/6/2.788150e-020 | 2/6/1.091657e-021 | 2/6/4.851810e-022 | 2/6/1.940724e-021 |
| | 100000 | 2/6/5.970100e-019 | 2/6/8.813541e-020 | 2/6/4.871394e-021 | 2/6/1.948557e-020 |
| | 103000 | 2/6/5.962874e-019 | 2/6/9.164034e-020 | 2/6/4.997364e-021 | 2/6/1.998946e-020 |
| 2 | $x_0$ | (−0.002, …) | (−0.001, …) | (0.001, …) | (0.0002, …) |
| | 30 | 4/12/7.338081e-006 | 4/12/1.834520e-006 | 4/12/1.834520e-006 | 3/9/4.070200e-007 |
| | 60 | 5/15/1.607584e-005 | 4/12/1.503191e-005 | 4/12/1.503191e-005 | 3/9/3.227252e-006 |
| | 100 | 6/18/2.574911e-005 | 5/15/1.889258e-005 | 5/15/1.889258e-005 | 4/12/2.785399e-006 |
| | 200 | 8/24/3.830680e-005 | 7/21/2.110989e-005 | 7/21/2.110989e-005 | 5/15/6.065297e-006 |
| 3 | $x_0$ | (−0.02, 0, …) | (0.01, 0, …) | (0.001, 0, …) | (0.006, 0, …) |
| | 30 | 3/9/0.000000e+000 | 3/9/0.000000e+000 | 2/6/4.547474e-013 | 3/9/0.000000e+000 |
| | 60 | 3/9/0.000000e+000 | 3/9/0.000000e+000 | 2/6/1.136868e-012 | 3/9/0.000000e+000 |
| | 100 | 3/9/0.000000e+000 | 3/9/0.000000e+000 | 2/6/1.136868e-012 | 3/9/0.000000e+000 |
| | 200 | 3/9/0.000000e+000 | 3/9/0.000000e+000 | 2/6/1.136868e-012 | 3/9/0.000000e+000 |
| 4 | $x_0$ | (−500, …) | (−10, …) | (−300, …) | (50, …) |
| | 30 | 9/28/4.588939e+002 | 5/15/1.233277e+004 | 10/30/8.751388e+003 | 2/16/1.469607e+004 |
| | 1000 | 9/28/1.529646e+004 | 5/15/4.110923e+005 | 10/30/2.917129e+005 | 2/16/4.898690e+005 |
| | 10000 | 9/28/1.529646e+005 | 5/15/4.110923e+006 | 10/30/2.917129e+006 | 2/16/4.898690e+006 |
| | 100000 | 9/28/1.529646e+006 | 5/15/4.110923e+007 | 10/30/2.917129e+007 | 2/16/4.898690e+007 |
| 5 | $x_0$ | (−2, …) | (−1, …) | (1, …) | (2, …) |
| | 30 | 2/6/8.548717e-015 | 2/6/5.644248e-009 | 2/6/5.644248e-009 | 2/6/8.548717e-015 |
| | 1000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 |
| | 10000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 |
| | 100000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 |
| 6 | $x_0$ | (1.01, …) | (0.9999, …) | (1.003, …) | (1.005, …) |
| | 30 | 16/48/8.156782e-005 | 2/6/1.919312e-006 | 6/18/5.497859e-005 | 9/27/6.152662e-005 |
| | 1000 | 15/45/8.236410e-005 | 3/9/3.607120e-007 | 6/18/5.343543e-005 | 9/27/5.799691e-005 |
| | 10000 | 14/42/8.568194e-005 | 3/9/3.615221e-007 | 7/21/4.048983e-005 | 9/27/6.078016e-005 |
| | 100000 | 13/39/8.458113e-005 | 3/9/3.632823e-007 | 7/21/4.484866e-005 | 9/27/5.885821e-005 |
| 7 | $x_0$ | (−0.1, …) | (0.1, …) | (0.01, …) | (0.03, …) |
| | 30 | 11/50/1.684545e+000 | 11/50/1.684545e+000 | 11/58/1.684467e+000 | 6/25/1.684388e+000 |
| | 1000 | 14/67/1.717428e+000 | 14/67/1.717428e+000 | 9/46/1.717374e+000 | 11/56/1.717448e+000 |
| | 10000 | 15/75/1.718189e+000 | 15/75/1.718189e+000 | 9/46/1.718252e+000 | 11/57/1.718327e+000 |
| | 100000 | 15/75/1.718273e+000 | 15/75/1.718273e+000 | 9/46/1.718340e+000 | 11/57/1.718410e+000 |
| 8 | $x_0$ | (−5, …) | (−1, …) | (2, …) | (4, …) |
| | 30 | 1/3/-1.889333e-104 | 2/16/-8.226096e-004 | 1/3/-2.009161e-015 | 1/3/-1.228836e-064 |
| | 300 | 1/3/0.000000e+000 | 1/3/-3.893431e-040 | 1/3/-1.008253e-163 | 1/3/0.000000e+000 |
| | 500 | 1/3/0.000000e+000 | 1/3/-1.459270e-067 | 1/3/-4.297704e-274 | 1/3/0.000000e+000 |
| | 1000 | 1/3/0.000000e+000 | 1/3/-2.213361e-136 | 1/3/0.000000e+000 | 1/3/0.000000e+000 |

doi:10.1371/journal.pone.0137166.t003
Table 4. The performance of Algorithm 4.1 and Algorithm N on NFG.

| Algorithm 4.1 | Algorithm N |
|---|---|
| 0.99 | 1 |

doi:10.1371/journal.pone.0137166.t004
The program also stops if more than five thousand iterations are performed; in that case, the corresponding method is regarded as having failed on that problem. The columns in Tables 2–3 have the following meanings:

x0: the initial point.
NI: the total number of iterations.
NFG: the total number of function evaluations (NF) and gradient evaluations (NG), i.e., NFG = NF + NG.
Dim: the dimension of the problem.
$f(\bar{x})$: the function value at the point $\bar{x}$ when the program stops.

The results in Tables 2–3 show that these two algorithms effectively solve the benchmark problems, except for the fourth problem. In the experiments, the results for problems 2 and 3 are not satisfactory when the dimension of the problem is large; thus, we use lower dimensions for them. The dimensions of problem 8 are at most 1,000 for a similar reason. However, the dimension of every problem is at least 30, which is fixed. For many problems, the results of the two algorithms are similar, except for those of problem 6. Clearly, the restart algorithm is competitive with the normal algorithm that does not use the restart technique. To show the performance of Algorithm 4.1 and Algorithm N on NFG more clearly, we use the tool (S1 File) in [46] to analyze the algorithms; the results are listed in Table 4. It is easy to see that Algorithm 4.1 outperforms Algorithm N by approximately 1%, and we can conclude that the proposed method is better than the normal method. Thus, we hope that the given algorithm will be utilized in the future.
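Table 4's entries are relative NFG scores. A hedged sketch of one way to produce such numbers: divide each algorithm's total NFG over all runs by the largest total, so the best method has the smallest value; whether this matches the exact tool of [46] is an assumption, and the totals below are illustrative only:

```python
def relative_nfg(totals):
    """Relative NFG performance: each algorithm's total NFG over all test runs
    divided by the worst (largest) total, so smaller values are better.
    This is our reading of the Table 4 comparison, not the authors' exact tool."""
    worst = max(totals.values())
    return {name: t / worst for name, t in totals.items()}
```

With illustrative totals of 990 and 1000 evaluations, the scores come out as 0.99 and 1, matching the shape of Table 4.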
Conclusion

1. This paper presented a new conjugate gradient method that exhibits global convergence, linear convergence, and n-step quadratic convergence under suitable assumptions. The proposed CG formula includes not only gradient value information but also function value information. On the benchmark test problems, the numerical results showed that the given algorithm is more effective than the normal method without the restart technique.

2. We tested the presented algorithm and the normal algorithm without the restart technique on benchmark problems. Both methods were shown to be very effective for solving the given problems. Moreover, we did not fix the dimension at n = 30, and the largest dimension was higher than 100,000 (namely 103,000). Additional numerical problems (such as those in [47]) should be investigated in the future to further examine this algorithm.

3. Recently, we solved nonsmooth optimization problems using related gradient methods and obtained various results; therefore, in the future, we will use the RMPRP method to solve nonsmooth optimization problems and hopefully obtain interesting results. Moreover, we will study the convergence of the CG method under other line search rules.
Supporting Information

S1 File. Supporting Information for Table 4. (PDF)
Acknowledgments The authors would like to thank the editor and the referees for their useful suggestions and comments, which greatly improved the paper.
Author Contributions Conceived and designed the experiments: XZ XW XL. Performed the experiments: XZ XW. Analyzed the data: XW XD. Contributed reagents/materials/analysis tools: XL XZ XW. Wrote the paper: XW XZ.
References

1. Gu B, Sheng VS. Feasibility and finite convergence analysis for accurate on-line v-support vector learning. IEEE Transactions on Neural Networks and Learning Systems. 2013; 24: 1304–1315. doi: 10.1109/TNNLS.2013.2250300
2. Li J, Li XL, Yang B, Sun XM. Segmentation-based image copy-move forgery detection scheme. IEEE Transactions on Information Forensics and Security. 2015; 10: 507–518. doi: 10.1109/TIFS.2014.2381872
3. Wen XZ, Shao L, Fang W, Xue Y. Efficient feature selection and classification for vehicle detection. IEEE Transactions on Circuits and Systems for Video Technology. 2015. doi: 10.1109/TCSVT.2014.2358031
4. Zhang H, Wu J, Nguyen TM, Sun MX. Synthetic aperture radar image segmentation by modified Student's t-mixture model. IEEE Transactions on Geoscience and Remote Sensing. 2014; 52: 4391–4403. doi: 10.1109/TGRS.2013.2281854
5. Fu ZJ. Achieving efficient cloud search services: multi-keyword ranked search over encrypted cloud data supporting parallel computing. IEICE Transactions on Communications. 2015; E98-B: 190–200. doi: 10.1587/transcom.E98.B.190
6. Dai Y, Yuan Y. A nonlinear conjugate gradient method with a strong global convergence property. SIAM J. Optim. 2000; 10: 177–182. doi: 10.1137/S1052623497318992
7. Dai Y, Yuan Y. Nonlinear Conjugate Gradient Methods. Shanghai Scientific and Technical Publishers, 1998.
8. Fletcher R. Practical Methods of Optimization, Vol. I: Unconstrained Optimization. 2nd edition, Wiley, New York, 1997.
9. Fletcher R, Reeves C. Function minimization by conjugate gradients. Comput. J. 1964; 7: 149–154. doi: 10.1093/comjnl/7.2.149
10. Hager WW, Zhang H. A new conjugate gradient method with guaranteed descent and an efficient line search. SIAM J. Optim. 2005; 16: 170–192. doi: 10.1137/030601880
11. Hestenes MR, Stiefel E. Methods of conjugate gradients for solving linear systems. J. Res. Nat. Bur. Stand. 1952; 49: 409–436. doi: 10.6028/jres.049.044
12. Liu Y, Storey C. Efficient generalized conjugate gradient algorithms, part 1: theory. J. Appl. Math. Comput. 1992; 69: 17–41.
13. Polak E. The conjugate gradient method in extreme problems. Comput. Math. Mathem. Phys. 1969; 9: 94–112. doi: 10.1016/0041-5553(69)90035-4
14. Polak E, Ribière G. Note sur la convergence de méthodes de directions conjuguées. Rev. Fran. Inf. Rech. Opérat. 1969; 3: 35–43.
15. Wei Z, Yao S, Liu L. The convergence properties of some new conjugate gradient methods. Appl. Math. Comput. 2006; 183: 1341–1350. doi: 10.1016/j.amc.2006.05.150
16. Yuan GL, Lu XW. A modified PRP conjugate gradient method. Ann. Oper. Res. 2009; 166: 73–90. doi: 10.1007/s10479-008-0420-4
17. Yuan GL, Lu XW, Wei ZX. A conjugate gradient method with descent direction for unconstrained optimization. J. Comput. Appl. Math. 2009; 233: 519–530. doi: 10.1016/j.cam.2009.08.001
18. Dai Y. Analysis of Conjugate Gradient Methods. Ph.D. thesis, Institute of Computational Mathematics and Scientific/Engineering Computing, Chinese Academy of Sciences, 1997.
19. Dai ZF, Tian BS. Global convergence of some modified PRP nonlinear conjugate gradient methods. Optim. Lett.
20. Gilbert JC, Nocedal J. Global convergence properties of conjugate gradient methods for optimization. SIAM J. Optim. 1992; 2: 21–42. doi: 10.1137/0802003
21. Hager WW, Zhang H. Algorithm 851: CG_DESCENT, a conjugate gradient method with guaranteed descent. ACM Trans. Math. Softw. 2006; 32: 113–137. doi: 10.1145/1132973.1132979
22. Powell MJD. Nonconvex minimization calculations and the conjugate gradient method. Lecture Notes in Mathematics, Vol. 1066, Springer-Verlag, Berlin, 1984, pp. 122–141.
23. Powell MJD. Convergence properties of algorithms for nonlinear optimization. SIAM Rev. 1986; 28: 487–500. doi: 10.1137/1028154
24. Yu GH. Nonlinear Self-Scaling Conjugate Gradient Methods for Large-Scale Optimization Problems. Doctoral thesis, Sun Yat-Sen University, 2007.
25. Yuan GL. Modified nonlinear conjugate gradient methods with sufficient descent property for large-scale optimization problems. Optim. Lett. 2009; 3: 11–21. doi: 10.1007/s11590-008-0086-5
26. Yuan YX. Analysis on the conjugate gradient method. Optim. Meth. Soft. 1993; 2: 19–29. doi: 10.1080/10556789308805532
27. Yuan GL, Wei ZX, Li GY. A modified Polak-Ribière-Polyak conjugate gradient algorithm for nonsmooth convex programs. J. Comput. Appl. Math. 2014; 255: 86–96. doi: 10.1016/j.cam.2013.04.032
28. Yuan GL, Wei ZX, Zhao QM. A modified Polak-Ribière-Polyak conjugate gradient algorithm for large-scale optimization problems. IIE Transactions. 2014; 46: 397–413. doi: 10.1080/0740817X.2012.726757
29. Yuan GL, Zhang MJ. A modified Hestenes-Stiefel conjugate gradient algorithm for large-scale optimization. Numerical Functional Analysis and Optimization. 2013; 34: 914–937. doi: 10.1080/01630563.2013.777350
30. Burmeister W. Die Konvergenzordnung des Fletcher-Powell-Algorithmus. Z. Angew. Math. Mech. 1973; 53: 693–699. doi: 10.1002/zamm.19730531007
31. Cohen A. Rate of convergence of several conjugate gradient algorithms. SIAM J. Numer. Anal. 1972; 9: 248–259. doi: 10.1137/0709024
32. Ritter K. On the rate of superlinear convergence of a class of variable metric methods. Numer. Math. 1980; 35: 293–313. doi: 10.1007/BF01396414
33. Zhang L, Zhou W, Li DH. A descent modified Polak-Ribière-Polyak conjugate gradient method and its global convergence. IMA J. Numer. Anal. 2006; 26: 629–640. doi: 10.1093/imanum/drl016
34. Li DH, Tian BS. N-step quadratic convergence of the MPRP method with a restart strategy. J. Comput. Appl. Math. 2011; 235: 4978–4990. doi: 10.1016/j.cam.2011.04.026
35. Broyden CG, Dennis JE, Moré JJ. On the local and superlinear convergence of quasi-Newton methods. J. Inst. Math. Appl. 1973; 12: 223–246. doi: 10.1093/imamat/12.3.223
36. Byrd R, Nocedal J. A tool for the analysis of quasi-Newton methods with application to unconstrained minimization. SIAM J. Numer. Anal. 1989; 26: 727–739. doi: 10.1137/0726042
37. Byrd R, Nocedal J, Yuan Y. Global convergence of a class of quasi-Newton methods on convex problems. SIAM J. Numer. Anal. 1987; 24: 1171–1189. doi: 10.1137/0724077
38. Dai Y. Convergence properties of the BFGS algorithm. SIAM J. Optim. 2003; 13: 693–701. doi: 10.1137/S1052623401383455
39. Dennis JE, Moré JJ. A characterization of superlinear convergence and its application to quasi-Newton methods. Math. Comput. 1974; 28: 549–560. doi: 10.1090/S0025-5718-1974-0343581-1
40. Li D, Fukushima M. A modified BFGS method and its global convergence in nonconvex minimization. J. Comput. Appl. Math. 2001; 129: 15–35. doi: 10.1016/S0377-0427(00)00540-9
41. Li D, Fukushima M. On the global convergence of the BFGS method for nonconvex unconstrained optimization problems. SIAM J. Optim. 2001; 11: 1054–1064. doi: 10.1137/S1052623499354242
42. Liu GH, Han JY, Sun DF. Global convergence analysis of the BFGS algorithm with nonmonotone line search. Optimization. 1995; 34: 147–159. doi: 10.1080/02331939508844101
43. Wei Z, Yu G, Yuan G, Lian Z. The superlinear convergence of a modified BFGS-type method for unconstrained optimization. Comput. Optim. Appl. 2004; 29: 315–332. doi: 10.1023/B:COAP.0000044184.25410.39
44. Wei Z, Li G, Qi L. New quasi-Newton methods for unconstrained optimization problems. Appl. Math. Comput. 2006; 175: 1156–1188. doi: 10.1016/j.amc.2005.08.027
45. Yuan GL, Wei ZX. Convergence analysis of a modified BFGS method on convex minimizations. Comput. Optim. Appl. 2010; 47: 237–255. doi: 10.1007/s10589-008-9219-0
46. Yuan GL, Lu XW. A modified PRP conjugate gradient method. Ann. Oper. Res. 2009; 166: 73–90. doi: 10.1007/s10479-008-0420-4
47. Gould NIM, Orban D, Toint PL. CUTEr (and SifDec): a constrained and unconstrained testing environment, revisited. ACM Trans. Math. Softw. 2003; 29: 373–394.