RESEARCH ARTICLE
A Conjugate Gradient Algorithm with Function Value Information and N-Step Quadratic Convergence for Unconstrained Optimization

Xiangrong Li¹, Xupei Zhao², Xiabin Duan¹, Xiaoliang Wang¹*

1 Guangxi Colleges and Universities Key Laboratory of Mathematics and Its Applications, College of Mathematics and Information Science, Guangxi University, Nanning, Guangxi, P.R. China; 2 Mathematics and Physics Institute of Henan University of Urban Construction, Pingdingshan, Henan, P.R. China
* [email protected]

Citation: Li X, Zhao X, Duan X, Wang X (2015) A Conjugate Gradient Algorithm with Function Value Information and N-Step Quadratic Convergence for Unconstrained Optimization. PLoS ONE 10(9): e0137166. doi:10.1371/journal.pone.0137166

Editor: Hans A Kestler, Leibniz Institute for Age Research, GERMANY

Abstract
It is generally acknowledged that the conjugate gradient (CG) method achieves global convergence, with at most a linear convergence rate, because CG formulas are generated by linear approximations of the objective function. Quadratically convergent results for CG methods are very limited. We introduce a new PRP method in which a restart strategy is also used. Moreover, the developed method not only possesses n-step quadratic convergence but also exploits both function value information and gradient value information. In this paper, we show that the new PRP method (with either the Armijo line search or the Wolfe line search) is both linearly and quadratically convergent. The numerical experiments demonstrate that the new PRP algorithm is competitive with the normal CG method.
Received: January 18, 2015. Accepted: August 13, 2015. Published: September 18, 2015.

Copyright: © 2015 Li et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Introduction

Consider the unconstrained optimization problem
$$
\min_{x \in \mathbb{R}^{n}} f(x).
$$
If $\rho_{k-1} > 0$, we have
$$
\begin{aligned}
y_{k-1}^{*} &= y_{k-1} + \frac{\rho_{k-1}}{\|s_{k-1}\|^{2}}\, s_{k-1}\\
&= (g_k - g_{k-1}) + \frac{2\,[f(x_{k-1}) - f(x_k)] + \big(g(x_k) + g(x_{k-1})\big)^{T} s_{k-1}}{\|s_{k-1}\|^{2}}\, s_{k-1}\\
&= (g_k - g_{k-1}) + \frac{2\,[f(x_{k-1}) - f(x_k)]}{\|s_{k-1}\|^{2}}\, s_{k-1} + \frac{(g_k + g_{k-1})^{T} s_{k-1}}{\|s_{k-1}\|^{2}}\, s_{k-1}\\
&= 2 g_k + \frac{2\,[f(x_{k-1}) - f(x_k)]}{\|s_{k-1}\|^{2}}\, s_{k-1}.
\end{aligned}
$$
By
$$
f(x_{k-1}) - f(x_k) \le \tfrac{1}{2} M \|x_{k-1} - x_k\|^{2} = \tfrac{1}{2} M \alpha_{k-1}^{2} \|d_{k-1}\|^{2}
$$
and
$$
\|g_k\| \le M \|x_k - x_{k-1}\| = M \alpha_{k-1} \|d_{k-1}\|, \qquad s_{k-1} = \alpha_{k-1} d_{k-1},
$$
we obtain
$$
\|y_{k-1}^{*}\|
\le 2\|g_k\| + \frac{2\,|f(x_{k-1}) - f(x_k)|}{\|s_{k-1}\|}
\le 2\left( M \alpha_{k-1} \|d_{k-1}\| + \frac{\tfrac{1}{2} M \alpha_{k-1}^{2} \|d_{k-1}\|^{2}\,\alpha_{k-1}\|d_{k-1}\|}{\alpha_{k-1}^{2}\|d_{k-1}\|^{2}} \right)
= 3 M \alpha_{k-1} \|d_{k-1}\|.
$$
Then, we have
$$
|\beta_k^{PRP*}| = \frac{|g_k^{T} y_{k-1}^{*}|}{\|g_{k-1}\|^{2}}
\le \frac{\|g_k\|\,\|y_{k-1}^{*}\|}{\|g_{k-1}\|^{2}}
\le \frac{3 M \alpha_{k-1} \|g_k\|\,\|d_{k-1}\|}{c_1 \alpha_{k-1} \|d_{k-1}\|^{2}}
= \frac{3 M c_1^{-1} \|g_k\|}{\|d_{k-1}\|}
$$
and
$$
|\theta_k| = \frac{|g_k^{T} d_{k-1}|}{\|g_{k-1}\|^{2}}
\le \frac{\|g_k\|\,\|d_{k-1}\|}{c_1 \alpha_{k-1} \|d_{k-1}\|^{2}}
= \frac{c_1^{-1} \alpha_{k-1}^{-1} \|g_k\|}{\|d_{k-1}\|}.
$$
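For concreteness, the quantities above can be collected into a single direction update. The following is a hedged Python sketch (all function and variable names are ours), assuming the direction formula (8) has the form $d_k = -g_k + \beta_k^{PRP*} d_{k-1} - \theta_k y_{k-1}^{*}$:

```python
def mprp_direction(gk, gk_prev, d_prev, s_prev, f_prev, f_curr):
    """One MPRP direction update; vectors are plain Python lists.

    Sketch of Eq (8) as we read it: d_k = -g_k + beta * d_{k-1} - theta * y*_{k-1},
    where y*_{k-1} folds function-value information into y_{k-1} when rho_{k-1} > 0.
    """
    dot = lambda a, b: sum(p * q for p, q in zip(a, b))

    y = [a - b for a, b in zip(gk, gk_prev)]          # y_{k-1} = g_k - g_{k-1}
    # rho_{k-1} = 2[f(x_{k-1}) - f(x_k)] + (g(x_k) + g(x_{k-1}))^T s_{k-1}
    rho = 2.0 * (f_prev - f_curr) + dot([a + b for a, b in zip(gk, gk_prev)], s_prev)
    if rho > 0:                                       # modified difference y*_{k-1}
        ss = dot(s_prev, s_prev)
        y = [yi + rho * si / ss for yi, si in zip(y, s_prev)]

    g_prev_sq = dot(gk_prev, gk_prev)
    beta = dot(gk, y) / g_prev_sq                     # beta_k^{PRP*} = g_k^T y*_{k-1} / ||g_{k-1}||^2
    theta = dot(gk, d_prev) / g_prev_sq               # theta_k = g_k^T d_{k-1} / ||g_{k-1}||^2
    return [-g + beta * d - theta * yy for g, d, yy in zip(gk, d_prev, y)]
```

On a one-dimensional quadratic slice the correction term $\rho_{k-1}$ vanishes and the update reduces to the classical PRP form.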
Therefore, it follows from (8) and the Lipschitz continuity of $g$ that
$$
\begin{aligned}
\|d_k\| &= \big\| -g_k + \beta_k^{PRP*} d_{k-1} - \theta_k y_{k-1}^{*} \big\| \\
&\le \|g_k\| + |\beta_k^{PRP*}|\,\|d_{k-1}\| + |\theta_k|\,\|y_{k-1}^{*}\| \\
&\le \|g_k\| + \frac{3 M c_1^{-1}\|g_k\|}{\|d_{k-1}\|}\,\|d_{k-1}\|
      + \frac{c_1^{-1}\alpha_{k-1}^{-1}\|g_k\|}{\|d_{k-1}\|}\cdot 3 M \alpha_{k-1}\|d_{k-1}\| \\
&= \|g_k\| + 3 c_1^{-1} M \|g_k\| + 3 c_1^{-1} M \|g_k\| \\
&= (1 + 6 c_1^{-1} M)\,\|g_k\|. \qquad (17)
\end{aligned}
$$
If the Armijo line search is used and $\alpha_k \neq 1$, then $\alpha_k' = \alpha_k \rho^{-1}$ satisfies
$$
f(x_k + \alpha_k' d_k) - f(x_k) > \sigma_1 \alpha_k' g_k^{T} d_k.
$$
By the mean-value theorem and the above relation, we can deduce that there exists a
scalar $\mu_k \in (0,1)$ satisfying
$$
\begin{aligned}
\sigma_1 \alpha_k' g_k^{T} d_k
&< f(x_k + \alpha_k' d_k) - f(x_k) = \alpha_k'\, g(x_k + \mu_k \alpha_k' d_k)^{T} d_k \\
&= \alpha_k' \big( g(x_k + \mu_k \alpha_k' d_k) - g(x_k) \big)^{T} d_k + \alpha_k' g_k^{T} d_k \\
&= \mu_k (\alpha_k')^{2}\, d_k^{T} \left[ \int_0^1 \nabla^2 f(x_k + t \mu_k \alpha_k' d_k)\, dt \right] d_k + \alpha_k' g_k^{T} d_k \\
&\le (\alpha_k')^{2} M \|d_k\|^{2} + \alpha_k' g_k^{T} d_k.
\end{aligned}
$$
Thus, the relation
$$
\alpha_k = \rho \alpha_k' \ge \frac{(1-\sigma_1)\rho\,(-g_k^{T} d_k)}{M \|d_k\|^{2}} = \frac{(1-\sigma_1)\rho\,\|g_k\|^{2}}{M \|d_k\|^{2}}
$$
holds. By the relation $1 + 6Mc_1^{-1} > 1 + 2Mc_1^{-1} > 0$, we can find that
$$
\frac{1}{1 + 2Mc_1^{-1}} > \frac{1}{1 + 6Mc_1^{-1}}.
$$
Then, we obtain
$$
\alpha_k = \rho \alpha_k' \ge \frac{(1-\sigma_1)\rho\,\|g_k\|^{2}}{M \|d_k\|^{2}}
\ge M^{-1} (1-\sigma_1)\,\rho\,(1 + 6Mc_1^{-1})^{-2} \triangleq c^{*}.
$$
Setting $\bar{c} = \min\{1, c^{*}\}$, we have Eq (16). If the Wolfe line search is used, then using Eq (13), we have
$$
M \alpha_k \|d_k\|^{2} \ge \big( g(x_k + \alpha_k d_k) - g_k \big)^{T} d_k \ge -(1-\sigma_2)\, g_k^{T} d_k.
$$
Analyzing this in the same manner as for the Armijo line search, we can find a positive lower bound for the step size $\alpha_k$. The proof is complete.

Similar to [34], we can establish the following theorem; we state it below but omit its proof.

Theorem 0.1 Let Assumption (i) hold, and let the sequence $\{x_k\}$ be generated by the MPRP method with the Armijo or the Wolfe line search technique. Then, there are constants $a > 0$ and $r \in (0,1)$ that satisfy
$$
\|x_k - x^{*}\| \le a r^{k}. \qquad (18)
$$
The restart MPRP method and its convergence

As in [34], we define an initial step length $\gamma_k$ as follows:
$$
\gamma_k = \frac{\varepsilon_k \|g_k\|^{2}}{d_k^{T}\big( g(x_k + \varepsilon_k d_k) - g(x_k) \big)}, \qquad (19)
$$
using a positive sequence $\{\varepsilon_k\} \to 0$ as $k \to \infty$. Moreover, we can obtain $|\alpha_k - \gamma_k| \to 0$.

Theorem 0.2 Let the sequence $\{x_k\}$ be generated by the MPRP method, and let Assumption (i) hold. Then, for sufficiently large $k$, $\gamma_k$ satisfies the Armijo and the Wolfe-Powell line search conditions.

Theorem 0.2 shows that, for sufficiently large $k$, $\gamma_k$ can be defined by Eq (19) such that the Armijo and Wolfe-Powell line search conditions are satisfied. In the following, $|\gamma_k|$ is used as the initial step length of the restart MPRP method.
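The trial step of Eq (19) costs one extra gradient evaluation and acts as a secant (quasi-Newton) estimate of the minimizing step along $d_k$. A minimal Python sketch under our reading of (19) (names are ours; `eps` plays the role of $\varepsilon_k$):

```python
def initial_step(grad, x, d, eps=1e-4):
    """Trial step length gamma_k of Eq (19), as we read it:
    gamma_k = eps * ||g_k||^2 / (d_k^T (g(x_k + eps*d_k) - g(x_k))).
    For a quadratic with Hessian A this equals ||g||^2 / (d^T A d), which is
    the exact minimizing step along d whenever g^T d = -||g||^2 (e.g. d = -g).
    """
    g = grad(x)
    x_trial = [xi + eps * di for xi, di in zip(x, d)]
    g_trial = grad(x_trial)
    denom = sum(di * (gt - gi) for di, gt, gi in zip(d, g_trial, g))
    return eps * sum(gi * gi for gi in g) / denom
```

For $f(x) = \tfrac12(x_1^2 + 4x_2^2)$ at $x = (1,1)$ with $d = -g = (-1,-4)$, the sketch returns $17/65$, the exact minimizer along $d$.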
Algorithm 4.1 (RMPRP)

Step 0: Given $x_0 \in \mathbb{R}^{n}$, $\varepsilon \ge 0$, $d_0 = -g_0$, and an integer $\gamma > 0$; let $k := 0$.
Step 1: If $\|g_k\| \le \varepsilon$, stop.
Step 2: If the inequality $f(x_k + |\gamma_k| d_k) \le f(x_k) + \sigma_1 |\gamma_k| g_k^{T} d_k$ holds, set $\alpha_k = |\gamma_k|$; otherwise, determine $\alpha_k = \max\{|\gamma_k| \rho^{j} \mid j = 0, 1, 2, \ldots\}$ satisfying
$$
f(x_k + \alpha_k d_k) \le f(x_k) + \sigma_1 \alpha_k g_k^{T} d_k. \qquad (20)
$$
Step 3: Let $x_{k+1} = x_k + \alpha_k d_k$, and $k := k + 1$.
Step 4: If $\|g_k\| \le \varepsilon$, stop.
Step 5: If $k = \gamma$, let $x_0 := x_k$ and go to Step 1.
Step 6: Compute $d_k$ using (8) and go to Step 2.
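The steps above can be assembled into a small driver. The sketch below follows our reading of Algorithm 4.1: Armijo backtracking started from $|\gamma_k|$, a steepest-descent reset every `restart` iterations, and the MPRP direction with function-value information. All names, default parameters, and the safeguards (backtracking cap, descent fallback) are ours, not the authors':

```python
def rmprp(f, grad, x0, eps=1e-6, rho=0.1, sigma1=1e-4, restart=10,
          eps_k=1e-4, max_iter=500):
    """Hedged sketch of Algorithm 4.1 (RMPRP); vectors are Python lists."""
    dot = lambda a, b: sum(p * q for p, q in zip(a, b))
    x = list(x0)
    g = grad(x)
    d = [-gi for gi in g]                       # d_0 = -g_0
    f_x = f(x)
    for k in range(max_iter):
        if dot(g, g) ** 0.5 <= eps:             # Steps 1/4: small gradient -> stop
            break
        gTd = dot(g, d)
        if gTd >= 0:                            # safeguard: fall back to steepest descent
            d = [-gi for gi in g]
            gTd = dot(g, d)
        # Eq (19): trial step gamma_k via one extra gradient evaluation
        xt = [xi + eps_k * di for xi, di in zip(x, d)]
        denom = dot(d, [a - b for a, b in zip(grad(xt), g)])
        gamma = eps_k * dot(g, g) / denom if denom > 0 else 1.0
        alpha = abs(gamma)
        for _ in range(50):                     # Step 2: Armijo backtracking from |gamma_k|
            if f([xi + alpha * di for xi, di in zip(x, d)]) <= f_x + sigma1 * alpha * gTd:
                break
            alpha *= rho
        x_new = [xi + alpha * di for xi, di in zip(x, d)]   # Step 3
        g_new, f_new = grad(x_new), f(x_new)
        if (k + 1) % restart == 0:              # Step 5: periodic restart
            d = [-gi for gi in g_new]
        else:                                   # Step 6: MPRP direction with y*
            s = [a - b for a, b in zip(x_new, x)]
            y = [a - b for a, b in zip(g_new, g)]
            r = 2.0 * (f_x - f_new) + dot([a + b for a, b in zip(g_new, g)], s)
            if r > 0:
                ss = dot(s, s)
                y = [yi + r * si / ss for yi, si in zip(y, s)]
            gg = dot(g, g)
            beta, theta = dot(g_new, y) / gg, dot(g_new, d) / gg
            d = [-a + beta * di - theta * yi for a, di, yi in zip(g_new, d, y)]
        x, g, f_x = x_new, g_new, f_new
    return x
```

On a strictly convex quadratic the trial step $|\gamma_k|$ is accepted immediately by the Armijo test, so the driver behaves like a PRP-type CG method with near-exact line searches.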
We now establish the global convergence of Algorithm 4.1.

Theorem 0.3 Let the conditions of Theorem 0.2 hold. Then, the following relation holds:
$$
\lim_{k \to \infty} \|g_k\| = 0. \qquad (21)
$$

Proof. We prove this theorem by contradiction. Suppose that Eq (21) does not hold; then there exists a constant $\varepsilon_1 > 0$ satisfying, for all $k$,
$$
\|g_k\| \ge \varepsilon_1. \qquad (22)
$$
Using Eqs (10) and (20), if $f$ is bounded from below, we can obtain
$$
\sum_{k=0}^{\infty} \alpha_k^{2} \|d_k\|^{2} < \infty. \qquad (23)
$$
In particular, we have
$$
\lim_{k \to \infty} \alpha_k \|d_k\| = 0. \qquad (24)
$$
If $\lim_{k \to \infty} \alpha_k > 0$, then by Eqs (10) and (24) we obtain $\lim_{k \to \infty} \|g_k\| = 0$. This contradicts Eq (22), so Eq (21) holds. Otherwise, if $\lim_{k \to \infty} \alpha_k = 0$, then there is an infinite index set $K_0$ satisfying
$$
\lim_{k \in K_0,\, k \to \infty} \alpha_k = 0. \qquad (25)
$$
According to Step 2 of Algorithm 4.1, when $k$ is sufficiently large, $\alpha_k \rho^{-1}$ does not satisfy Eq (20), which implies that
$$
f(x_k + \alpha_k \rho^{-1} d_k) - f(x_k) > -\delta \rho^{-2} \alpha_k^{2} \|d_k\|^{2}. \qquad (26)
$$
By Eq (22), similar to the proof of Lemma 2.1 in [33], we can deduce that there is a constant $\varrho > 0$ such that
$$
\|d_k\| \ge \varrho, \qquad \forall k. \qquad (27)
$$
Using Eqs (27) and (10) and the mean-value theorem, we have
$$
\begin{aligned}
f(x_k + \alpha_k \rho^{-1} d_k) - f(x_k)
&= \rho^{-1} \alpha_k\, g(x_k + \xi_0 \rho^{-1} \alpha_k d_k)^{T} d_k \\
&= \rho^{-1} \alpha_k g_k^{T} d_k + \rho^{-1} \alpha_k \big( g(x_k + \xi_0 \rho^{-1} \alpha_k d_k) - g_k \big)^{T} d_k \\
&\le \rho^{-1} \alpha_k g_k^{T} d_k + M \rho^{-2} \alpha_k^{2} \|d_k\|^{2},
\end{aligned}
$$
where $\xi_0 \in (0,1)$ and the last inequality follows from Eq (15). Combining this result with Eq (26), for
all sufficiently large $k \in K_0$, we obtain
$$
\|g_k\|^{2} \le \rho^{-1} (M + \delta)\, \alpha_k \|d_k\|^{2}.
$$
By Eq (27) and $\lim_{k \in K_0,\, k \to \infty} \alpha_k = 0$, the above inequality then implies that $\lim_{k \in K_0,\, k \to \infty} \|g_k\| = 0$. This is also a contradiction. The proof is complete.

Lemma 0.3 Let Assumption (i) hold, and let the sequence $\{x_k\}$ be generated by the RMPRP method. Then, there are four positive numbers $c_i$, $i = 1, 2, 3, 4$, that satisfy
$$
\|g_{k+1}\| \le c_1 \|d_k\|, \qquad |\beta_{k+1}^{PRP*}| \le c_2, \qquad |\theta_{k+1}| \le c_3, \qquad \|d_{k+1}\| \le c_4 \|d_k\|. \qquad (28)
$$
Proof. Considering the first inequality of Eq (28), we have
$$
\|g_{k+1}\| = \|g_k + (g_{k+1} - g_k)\| \le \|g_k\| + |\gamma_k|\, \|\hat{A}_k d_k\|
\le \|d_k\| + \frac{M}{m} \|d_k\| = \Big(1 + \frac{M}{m}\Big) \|d_k\| \triangleq c_1 \|d_k\|, \qquad (29)
$$
where $\hat{A}_k = \int_0^1 \nabla^2 f(x_k + t |\gamma_k| d_k)\, dt$. Using the definition of $\beta_{k+1}^{PRP*}$, we now discuss the three other inequalities of Eq (28). For $\beta_{k+1}^{PRP*}$, we have
$$
|\beta_{k+1}^{PRP*}| = \frac{|g_{k+1}^{T} y_k^{*}|}{\|g_k\|^{2}} \le \frac{\|g_{k+1}\|\, \|y_k^{*}\|}{\|g_k\|^{2}}.
$$
Case 1: If $\rho_k \le 0$, then $y_k^{*} = y_k = g_{k+1} - g_k$. Therefore, we have
$$
|\beta_{k+1}^{PRP*}| \le \frac{\|g_{k+1}\|\, \|g_{k+1} - g_k\|}{\|g_k\|^{2}}
\le \frac{c_1 \|d_k\| \cdot M |\gamma_k|\, \|d_k\|}{|\gamma_k|\, d_k^{T} A_k d_k}
\le \frac{c_1 M \|d_k\|^{2}}{m \|d_k\|^{2}} = c_1 m^{-1} M \triangleq c_2',
$$
where $A_k = \int_0^1 \nabla^2 f(x_k + t \varepsilon_k d_k)\, dt$ and the second inequality follows from the mean-value theorem and Eq (29).
Case 2: If $\rho_k > 0$, then we have
$$
y_k^{*} = y_k + \frac{\rho_k}{\|s_k\|^{2}}\, s_k, \qquad
\rho_k = 2\,[f(x_k) - f(x_{k+1})] + \big(g(x_{k+1}) + g(x_k)\big)^{T} s_k.
$$
Thus, we have
$$
\begin{aligned}
|\beta_{k+1}^{PRP*}| &\le \frac{\|g_{k+1}\|\, \|y_k^{*}\|}{\|g_k\|^{2}} \\
&\le \frac{\|g_{k+1}\|}{\|g_k\|^{2}} \left( \|g_{k+1} - g_k\| + \frac{2\,|f(x_{k+1}) - f(x_k)|\, \|s_k\| + \|g_{k+1} + g_k\|\, \|s_k\|^{2}}{\|s_k\|^{2}} \right) \\
&= \frac{\|g_{k+1}\|\, \|g_{k+1} - g_k\|}{\|g_k\|^{2}} + \frac{\|g_{k+1}\|}{\|g_k\|^{2}} \left( \frac{2\,|f(x_{k+1}) - f(x_k)|}{\|s_k\|} + \|g_{k+1} + g_k\| \right) \\
&\le c_2' + \big[ c_1 (1 + 2 M c_1^{-1}) \big] (M + c_1 + 1) \triangleq c_2''.
\end{aligned}
$$
Let $c_2 = \max\{c_2', c_2''\}$. We obtain $|\beta_{k+1}^{PRP*}| \le c_2$ and
$$
|\theta_{k+1}| = \frac{|g_{k+1}^{T} d_k|}{\|g_k\|^{2}} \le \frac{\|g_{k+1}\|\, \|d_k\|}{\|g_k\|^{2}} \le \frac{c_1 \|d_k\|^{2}}{\|g_k\|^{2}} \le c_1 (1 + 2 M c_1^{-1})^{2} \triangleq c_3.
$$
Using the definition of $d_{k+1}$, we obtain
$$
\|d_{k+1}\| = \big\| -g_{k+1} + \beta_{k+1}^{PRP*} d_k - \theta_{k+1} y_k^{*} \big\|
\le \|g_{k+1}\| + |\beta_{k+1}^{PRP*}|\, \|d_k\| + |\theta_{k+1}|\, \|y_k^{*}\|.
$$
Case 1: If $\rho_k \le 0$, we have
$$
\begin{aligned}
\|d_{k+1}\| &\le \|g_{k+1}\| + |\beta_{k+1}^{PRP*}|\, \|d_k\| + |\theta_{k+1}| \big( \|g_{k+1}\| + \|g_k\| \big) \\
&= (1 + |\theta_{k+1}|)\, \|g_{k+1}\| + |\beta_{k+1}^{PRP*}|\, \|d_k\| + |\theta_{k+1}|\, \|g_k\| \\
&\le (1 + c_3)\, c_1 \|d_k\| + c_2 \|d_k\| + c_3 \|d_k\|
= \big[ (1 + c_3) c_1 + c_2 + c_3 \big] \|d_k\| \triangleq c_4' \|d_k\|.
\end{aligned}
$$
Case 2: If $\rho_k > 0$, we have $\|y_k^{*}\| \le 3 M \alpha_k \|d_k\| \le 3 M \|d_k\|$ (since $0 < \alpha_k \le 1$), and
$$
\|d_{k+1}\| \le \|g_{k+1}\| + |\beta_{k+1}^{PRP*}|\, \|d_k\| + |\theta_{k+1}|\, \|y_k^{*}\|
\le c_1 \|d_k\| + c_2 \|d_k\| + 3 c_3 M \|d_k\| = (c_1 + c_2 + 3 M c_3)\, \|d_k\| \triangleq c_4'' \|d_k\|.
$$
Letting $c_4 = \max\{c_4', c_4''\}$, we obtain $\|d_{k+1}\| \le c_4 \|d_k\|$. The proof is complete.

Assumption (ii) In some neighborhood $N$ of $x^{*}$, $\nabla^2 f$ is Lipschitz continuous.

Similar to [34], we can also establish the n-step quadratic convergence of the RMPRP method. Here, we state the theorem but omit the proof.
Table 1. Definition of the benchmark problems and their features.

| No. | Function | Definition | Multimodal? | Separable? | Regular? |
|---|---|---|---|---|---|
| 1 | Sphere | $f_{Sph}(x) = \sum_{i=1}^{p} x_i^2$; $x_i \in [-5.12, 5.12]$, $x^* = (0, \ldots, 0)$, $f_{Sph}(x^*) = 0$ | no | yes | n/a |
| 2 | Schwefel's (double sum) | $f_{SchDS}(x) = \sum_{i=1}^{p} \big(\sum_{j=1}^{i} x_j\big)^2$; $x_i \in [-65.536, 65.536]$, $x^* = (0, \ldots, 0)$, $f_{SchDS}(x^*) = 0$ | no | no | n/a |
| 3 | Rastrigin | $f_{Ras}(x) = 10p + \sum_{i=1}^{p} \big(x_i^2 - 10\cos(2\pi x_i)\big)$; $x_i \in [-5.12, 5.12]$, $x^* = (0, \ldots, 0)$, $f_{Ras}(x^*) = 0$ | yes | yes | n/a |
| 4 | Schwefel | $f_{Sch}(x) = 418.9829\,p + \sum_{i=1}^{p} x_i \sin\big(\sqrt{|x_i|}\big)$; $x_i \in [-512.03, 511.97]$, $x^* = (-420.9678, \ldots, -420.9678)$, $f_{Sch}(x^*) = 0$ | yes | yes | n/a |
| 5 | Griewank | $f_{Gri}(x) = 1 + \sum_{i=1}^{p} \frac{x_i^2}{4000} - \prod_{i=1}^{p} \cos\big(\frac{x_i}{\sqrt{i}}\big)$; $x_i \in [-600, 600]$, $x^* = (0, \ldots, 0)$, $f_{Gri}(x^*) = 0$ | yes | no | yes |
| 6 | Rosenbrock | $f_{Ros}(x) = \sum_{i=1}^{p-1} \big[100 (x_{i+1} - x_i^2)^2 + (x_i - 1)^2\big]$; $x_i \in [-2.048, 2.048]$, $x^* = (1, \ldots, 1)$, $f_{Ros}(x^*) = 0$ | no | no | n/a |
| 7 | Ackley | $f_{Ack}(x) = 20 + e - 20 e^{-0.2\sqrt{\frac{1}{p}\sum_{i=1}^{p} x_i^2}} - e^{\frac{1}{p}\sum_{i=1}^{p}\cos(2\pi x_i)}$; $x_i \in [-30, 30]$, $x^* = (0, \ldots, 0)$, $f_{Ack}(x^*) = 0$ | yes | no | yes |
| 8 | Langerman | $f_{Lan}(x) = -\sum_{i=1}^{m} c_i\, e^{-\frac{1}{\pi}\sum_{j=1}^{p}(x_j - a_{ij})^2} \cos\big(\pi \sum_{j=1}^{p}(x_j - a_{ij})^2\big)$; $x_i \in [0, 10]$, $m = p$, $x^*$ random, $f_{Lan}(x^*)$ random | yes | no | no |

doi:10.1371/journal.pone.0137166.t001
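A few of the Table 1 benchmarks are straightforward to reproduce. A Python sketch of four of them (definitions follow the table as we read it; `p` is the dimension, and each has global minimum value 0):

```python
import math

def sphere(x):
    """f_Sph(x) = sum x_i^2; minimum 0 at x* = (0, ..., 0)."""
    return sum(xi * xi for xi in x)

def rosenbrock(x):
    """f_Ros(x) = sum 100(x_{i+1} - x_i^2)^2 + (x_i - 1)^2; minimum 0 at (1, ..., 1)."""
    return sum(100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (x[i] - 1.0) ** 2
               for i in range(len(x) - 1))

def griewank(x):
    """f_Gri(x) = 1 + sum x_i^2/4000 - prod cos(x_i / sqrt(i)); minimum 0 at 0."""
    prod = 1.0
    for i, xi in enumerate(x, start=1):
        prod *= math.cos(xi / math.sqrt(i))
    return 1.0 + sum(xi * xi for xi in x) / 4000.0 - prod

def ackley(x):
    """f_Ack(x) = 20 + e - 20 exp(-0.2 sqrt(mean x_i^2)) - exp(mean cos(2 pi x_i))."""
    p = len(x)
    return (20.0 + math.e
            - 20.0 * math.exp(-0.2 * math.sqrt(sum(xi * xi for xi in x) / p))
            - math.exp(sum(math.cos(2.0 * math.pi * xi) for xi in x) / p))
```

The multimodal entries (Griewank, Ackley) are the ones for which a gradient method's result depends strongly on the starting point, which is why Tables 2–3 report four starting points per problem.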
Theorem 0.4 Let Assumptions (i) and (ii) hold. Then, there exists a constant $c^{0} > 0$ such that
$$
\limsup_{k \to \infty} \frac{\|x_{kr+n} - x^{*}\|}{\|x_{kr} - x^{*}\|^{2}} \le c^{0} < \infty. \qquad (30)
$$
That is, the RMPRP method is n-step quadratically convergent.
Numerical Results

This section reports numerical experiments with Algorithm 4.1 and a normal algorithm to demonstrate the effectiveness of the proposed algorithm. The normal algorithm (called Algorithm N) does not use the restart technique; its other steps are the same as those of Algorithm 4.1. We test both algorithms on the benchmark problems listed in Table 1. These benchmark problems, and discussions concerning the choice of test problems for an algorithm, can be found at http://www.cs.cmu.edu/afs/cs/project/jair/pub/volume24/ortizboyer05a-html/node6.html.

The code was written in MATLAB 7.6.0 and run on a PC with a Core 2 Duo E7500 CPU @ 2.93 GHz, 2.00 GB of memory, and the Windows XP operating system. The parameters of the algorithms are chosen as $\rho = 0.1$, $\gamma = 10$, $\sigma_1 = 0.0001$, and $\varepsilon_k = \varepsilon_1 = \cdots = 10^{-4}$. The dimensions can be found in Tables 2–3. Because the line search cannot always ensure that the descent condition $d_k^{T} g_k < 0$ holds, uphill searches can occur in the experiments. To avoid this situation, the step size $\alpha_k$ is accepted if the searching number is larger than ten in every line search technique. The Himmelblau stop rule is used: if $|f(x_k)| > e_1$, set $\mathrm{stop1} = \frac{|f(x_k) - f(x_{k+1})|}{|f(x_k)|}$; otherwise, set $\mathrm{stop1} = |f(x_k) - f(x_{k+1})|$. For each problem, the program stops if $\|g(x)\| < e_3$ or $\mathrm{stop1} < e_2$ is satisfied, where $e_1 = e_2 = e_3 = 10^{-4}$.
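The stopping test described above can be sketched as follows (a hedged reading; the function and argument names are ours):

```python
def should_stop(f_k, f_k1, gnorm, e1=1e-4, e2=1e-4, e3=1e-4):
    """Himmelblau-style stop test as described in the text: use the relative
    decrease |f(x_k) - f(x_{k+1})| / |f(x_k)| when |f(x_k)| > e1, else the
    absolute decrease; also stop when the gradient norm falls below e3."""
    if abs(f_k) > e1:
        stop1 = abs(f_k - f_k1) / abs(f_k)
    else:
        stop1 = abs(f_k - f_k1)
    return gnorm < e3 or stop1 < e2
```

Switching to the absolute decrease near a zero-valued minimum avoids dividing by a vanishing $|f(x_k)|$, which is why the rule branches on $e_1$.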
Table 2. Test results using Algorithm 4.1.

| No. | Dim | NI/NFG/$f(\bar{x})$ | NI/NFG/$f(\bar{x})$ | NI/NFG/$f(\bar{x})$ | NI/NFG/$f(\bar{x})$ |
|---|---|---|---|---|---|
| 1 | $x_0$ | (−5, …) | (−3, …) | (2, …) | (4, …) |
| | 30 | 2/6/8.355554e-023 | 2/6/3.274972e-024 | 2/6/1.455543e-024 | 2/6/5.822172e-024 |
| | 10000 | 2/6/2.788150e-020 | 2/6/1.091657e-021 | 2/6/4.851810e-022 | 2/6/1.940724e-021 |
| | 100000 | 2/6/5.970100e-019 | 2/6/8.813541e-020 | 2/6/4.871394e-021 | 2/6/1.948557e-020 |
| | 103000 | 2/6/5.962874e-019 | 2/6/9.164034e-020 | 2/6/4.997364e-021 | 2/6/1.998946e-020 |
| 2 | $x_0$ | (−0.002, …) | (−0.001, …) | (0.001, …) | (0.0002, …) |
| | 30 | 4/12/7.338081e-006 | 4/12/1.834520e-006 | 4/12/1.834520e-006 | 3/9/4.070200e-007 |
| | 60 | 5/15/1.607584e-005 | 4/12/1.503191e-005 | 4/12/1.503191e-005 | 3/9/3.227252e-006 |
| | 100 | 6/18/2.574911e-005 | 5/15/1.889258e-005 | 5/15/1.889258e-005 | 4/12/2.785399e-006 |
| | 200 | 8/24/3.830680e-005 | 7/21/2.110989e-005 | 7/21/2.110989e-005 | 5/15/6.065297e-006 |
| 3 | $x_0$ | (−0.02, 0, …) | (0.01, 0, …) | (0.001, 0, …) | (0.006, 0, …) |
| | 30 | 3/9/0.000000e+000 | 3/9/0.000000e+000 | 2/6/4.547474e-013 | 3/9/0.000000e+000 |
| | 60 | 3/9/0.000000e+000 | 3/9/0.000000e+000 | 2/6/1.136868e-012 | 3/9/0.000000e+000 |
| | 100 | 3/9/0.000000e+000 | 3/9/0.000000e+000 | 2/6/1.136868e-012 | 3/9/0.000000e+000 |
| | 200 | 3/9/0.000000e+000 | 3/9/0.000000e+000 | 2/6/1.136868e-012 | 3/9/0.000000e+000 |
| 4 | $x_0$ | (−500, …) | (−10, …) | (−300, …) | (50, …) |
| | 30 | 9/28/4.588939e+002 | 5/15/1.233277e+004 | 10/30/8.751388e+003 | 2/16/1.469607e+004 |
| | 1000 | 9/28/1.529646e+004 | 5/15/4.110923e+005 | 10/30/2.917129e+005 | 2/16/4.898690e+005 |
| | 10000 | 9/28/1.529646e+005 | 5/15/4.110923e+006 | 10/30/2.917129e+006 | 2/16/4.898690e+006 |
| | 100000 | 9/28/1.529646e+006 | 5/15/4.110923e+007 | 10/30/2.917129e+007 | 2/16/4.898690e+007 |
| 5 | $x_0$ | (−2, …) | (−1, …) | (1, …) | (2, …) |
| | 30 | 2/6/8.548717e-015 | 2/6/5.644248e-009 | 2/6/5.644248e-009 | 2/6/8.548717e-015 |
| | 1000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 |
| | 10000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 |
| | 100000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 |
| 6 | $x_0$ | (1.01, …) | (0.9999, …) | (1.003, …) | (1.005, …) |
| | 30 | 12/36/6.832422e-005 | 2/6/1.919312e-006 | 6/18/5.497859e-005 | 9/27/6.152662e-005 |
| | 1000 | 12/36/6.562848e-005 | 3/9/3.607120e-007 | 6/18/5.343543e-005 | 9/27/5.799691e-005 |
| | 10000 | 12/36/6.382394e-005 | 3/9/3.615221e-007 | 7/21/4.048983e-005 | 9/27/6.078016e-005 |
| | 100000 | 12/36/5.969884e-005 | 3/9/3.632823e-007 | 7/21/4.484866e-005 | 9/27/5.885821e-005 |
| 7 | $x_0$ | (−0.1, …) | (0.1, …) | (0.01, …) | (0.03, …) |
| | 30 | 11/50/1.684545e+000 | 11/50/1.684545e+000 | 11/58/1.684467e+000 | 6/25/1.684388e+000 |
| | 1000 | 14/67/1.717428e+000 | 14/67/1.717428e+000 | 9/46/1.717374e+000 | 11/56/1.717448e+000 |
| | 10000 | 15/75/1.718189e+000 | 15/75/1.718189e+000 | 9/46/1.718252e+000 | 11/57/1.718327e+000 |
| | 100000 | 15/75/1.718273e+000 | 15/75/1.718273e+000 | 9/46/1.718340e+000 | 11/57/1.718410e+000 |
| 8 | $x_0$ | (−5, …) | (−1, …) | (2, …) | (4, …) |
| | 30 | 1/3/-1.889333e-104 | 2/16/-8.226096e-004 | 1/3/-2.009161e-015 | 1/3/-1.228836e-064 |
| | 300 | 1/3/0.000000e+000 | 1/3/-3.893431e-040 | 1/3/-1.008253e-163 | 1/3/0.000000e+000 |
| | 500 | 1/3/0.000000e+000 | 1/3/-1.459270e-067 | 1/3/-4.297704e-274 | 1/3/0.000000e+000 |
| | 1000 | 1/3/0.000000e+000 | 1/3/-2.213361e-136 | 1/3/0.000000e+000 | 1/3/0.000000e+000 |

doi:10.1371/journal.pone.0137166.t002
Table 3. Test results using Algorithm N.

| No. | Dim | NI/NFG/$f(\bar{x})$ | NI/NFG/$f(\bar{x})$ | NI/NFG/$f(\bar{x})$ | NI/NFG/$f(\bar{x})$ |
|---|---|---|---|---|---|
| 1 | $x_0$ | (−5, …) | (−3, …) | (2, …) | (4, …) |
| | 30 | 2/6/8.355554e-023 | 2/6/3.274972e-024 | 2/6/1.455543e-024 | 2/6/5.822172e-024 |
| | 10000 | 2/6/2.788150e-020 | 2/6/1.091657e-021 | 2/6/4.851810e-022 | 2/6/1.940724e-021 |
| | 100000 | 2/6/5.970100e-019 | 2/6/8.813541e-020 | 2/6/4.871394e-021 | 2/6/1.948557e-020 |
| | 103000 | 2/6/5.962874e-019 | 2/6/9.164034e-020 | 2/6/4.997364e-021 | 2/6/1.998946e-020 |
| 2 | $x_0$ | (−0.002, …) | (−0.001, …) | (0.001, …) | (0.0002, …) |
| | 30 | 4/12/7.338081e-006 | 4/12/1.834520e-006 | 4/12/1.834520e-006 | 3/9/4.070200e-007 |
| | 60 | 5/15/1.607584e-005 | 4/12/1.503191e-005 | 4/12/1.503191e-005 | 3/9/3.227252e-006 |
| | 100 | 6/18/2.574911e-005 | 5/15/1.889258e-005 | 5/15/1.889258e-005 | 4/12/2.785399e-006 |
| | 200 | 8/24/3.830680e-005 | 7/21/2.110989e-005 | 7/21/2.110989e-005 | 5/15/6.065297e-006 |
| 3 | $x_0$ | (−0.02, 0, …) | (0.01, 0, …) | (0.001, 0, …) | (0.006, 0, …) |
| | 30 | 3/9/0.000000e+000 | 3/9/0.000000e+000 | 2/6/4.547474e-013 | 3/9/0.000000e+000 |
| | 60 | 3/9/0.000000e+000 | 3/9/0.000000e+000 | 2/6/1.136868e-012 | 3/9/0.000000e+000 |
| | 100 | 3/9/0.000000e+000 | 3/9/0.000000e+000 | 2/6/1.136868e-012 | 3/9/0.000000e+000 |
| | 200 | 3/9/0.000000e+000 | 3/9/0.000000e+000 | 2/6/1.136868e-012 | 3/9/0.000000e+000 |
| 4 | $x_0$ | (−500, …) | (−10, …) | (−300, …) | (50, …) |
| | 30 | 9/28/4.588939e+002 | 5/15/1.233277e+004 | 10/30/8.751388e+003 | 2/16/1.469607e+004 |
| | 1000 | 9/28/1.529646e+004 | 5/15/4.110923e+005 | 10/30/2.917129e+005 | 2/16/4.898690e+005 |
| | 10000 | 9/28/1.529646e+005 | 5/15/4.110923e+006 | 10/30/2.917129e+006 | 2/16/4.898690e+006 |
| | 100000 | 9/28/1.529646e+006 | 5/15/4.110923e+007 | 10/30/2.917129e+007 | 2/16/4.898690e+007 |
| 5 | $x_0$ | (−2, …) | (−1, …) | (1, …) | (2, …) |
| | 30 | 2/6/8.548717e-015 | 2/6/5.644248e-009 | 2/6/5.644248e-009 | 2/6/8.548717e-015 |
| | 1000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 |
| | 10000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 |
| | 100000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 | 2/6/0.000000e+000 |
| 6 | $x_0$ | (1.01, …) | (0.9999, …) | (1.003, …) | (1.005, …) |
| | 30 | 16/48/8.156782e-005 | 2/6/1.919312e-006 | 6/18/5.497859e-005 | 9/27/6.152662e-005 |
| | 1000 | 15/45/8.236410e-005 | 3/9/3.607120e-007 | 6/18/5.343543e-005 | 9/27/5.799691e-005 |
| | 10000 | 14/42/8.568194e-005 | 3/9/3.615221e-007 | 7/21/4.048983e-005 | 9/27/6.078016e-005 |
| | 100000 | 13/39/8.458113e-005 | 3/9/3.632823e-007 | 7/21/4.484866e-005 | 9/27/5.885821e-005 |
| 7 | $x_0$ | (−0.1, …) | (0.1, …) | (0.01, …) | (0.03, …) |
| | 30 | 11/50/1.684545e+000 | 11/50/1.684545e+000 | 11/58/1.684467e+000 | 6/25/1.684388e+000 |
| | 1000 | 14/67/1.717428e+000 | 14/67/1.717428e+000 | 9/46/1.717374e+000 | 11/56/1.717448e+000 |
| | 10000 | 15/75/1.718189e+000 | 15/75/1.718189e+000 | 9/46/1.718252e+000 | 11/57/1.718327e+000 |
| | 100000 | 15/75/1.718273e+000 | 15/75/1.718273e+000 | 9/46/1.718340e+000 | 11/57/1.718410e+000 |
| 8 | $x_0$ | (−5, …) | (−1, …) | (2, …) | (4, …) |
| | 30 | 1/3/-1.889333e-104 | 2/16/-8.226096e-004 | 1/3/-2.009161e-015 | 1/3/-1.228836e-064 |
| | 300 | 1/3/0.000000e+000 | 1/3/-3.893431e-040 | 1/3/-1.008253e-163 | 1/3/0.000000e+000 |
| | 500 | 1/3/0.000000e+000 | 1/3/-1.459270e-067 | 1/3/-4.297704e-274 | 1/3/0.000000e+000 |
| | 1000 | 1/3/0.000000e+000 | 1/3/-2.213361e-136 | 1/3/0.000000e+000 | 1/3/0.000000e+000 |

doi:10.1371/journal.pone.0137166.t003
Table 4. The performance of Algorithm 4.1 and Algorithm N on NFG.

| Algorithm 4.1 | Algorithm N |
|---|---|
| 0.99 | 1 |

doi:10.1371/journal.pone.0137166.t004
The program also stops if more than five thousand iterations are performed; in that case, the corresponding method is regarded as having failed on that problem. The columns in Tables 2–3 have the following meanings:

x0: the initial point.
NI: the total number of iterations.
NFG: the total number of function evaluations (NF) and gradient evaluations (NG), i.e., NFG = NF + NG.
Dim: the dimension of the problem.
$f(\bar{x})$: the function value at the point $\bar{x}$ when the program stops.

The results in Tables 2–3 show that these two algorithms effectively solve the benchmark problems, except for the fourth problem. In the experiments, the results for problems 2 and 3 are not satisfactory when the dimension of the problem is large; thus, we use lower dimensions for them. The dimensions of problem 8 are at most 1,000 for a similar reason. However, the dimension of every problem is at least 30, which is fixed. For many problems, the results of the two algorithms are similar, except for those of problem 6. Clearly, the restart algorithm is competitive with the normal algorithm that does not use the restart technique. To show the performance of Algorithm 4.1 and Algorithm N on NFG more clearly, we use the tool (S1 File) in [46] to analyze the algorithms; the results are listed in Table 4. It is easy to see that Algorithm 4.1 outperforms Algorithm N by approximately 1%, and we can conclude that the proposed method is better than the normal method. Thus, we hope that the given algorithm will be utilized in the future.
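Table 4's entries are relative NFG scores. A hedged sketch of one way to produce such numbers: divide each algorithm's total NFG over all runs by the largest total, so the best method has the smallest value; whether this matches the exact tool of [46] is an assumption, and the totals below are illustrative only:

```python
def relative_nfg(totals):
    """Relative NFG performance: each algorithm's total NFG over all test runs
    divided by the worst (largest) total, so smaller values are better.
    This is our reading of the Table 4 comparison, not the authors' exact tool."""
    worst = max(totals.values())
    return {name: t / worst for name, t in totals.items()}
```

With illustrative totals of 990 and 1000 evaluations, the scores come out as 0.99 and 1, matching the shape of Table 4.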
Conclusion

1. This paper presented a new conjugate gradient method that exhibits global convergence, linear convergence, and n-step quadratic convergence under suitable assumptions. The proposed CG formula includes not only gradient value information but also function value information. On the benchmark test problems, the numerical results showed that the given algorithm is more effective than the normal method without the restart technique.

2. We tested the presented algorithm and the normal algorithm without the restart technique on benchmark problems. Both methods were shown to be very effective for solving the given problems. Moreover, we did not fix the dimension at n = 30, and the largest dimension was higher than 100,000 (namely 103,000). Additional numerical problems (such as those in [47]) should be investigated in the future to further examine this algorithm.

3. Recently, we solved nonsmooth optimization problems using related gradient methods and obtained various results; therefore, in the future, we will use the RMPRP method to solve nonsmooth optimization problems and hopefully obtain interesting results. Moreover, we will study the convergence of the CG method under other line search rules.
Supporting Information

S1 File. Supporting Information for Table 4. (PDF)
Acknowledgments The authors would like to thank the editor and the referees for their useful suggestions and comments, which greatly improved the paper.
Author Contributions Conceived and designed the experiments: XZ XW XL. Performed the experiments: XZ XW. Analyzed the data: XW XD. Contributed reagents/materials/analysis tools: XL XZ XW. Wrote the paper: XW XZ.
References

1. Gu B, Sheng VS. Feasibility and finite convergence analysis for accurate on-line v-support vector learning. IEEE Transactions on Neural Networks and Learning Systems. 2013; 24: 1304–1315. doi: 10.1109/TNNLS.2013.2250300
2. Li J, Li XL, Yang B, Sun XM. Segmentation-based image copy-move forgery detection scheme. IEEE Transactions on Information Forensics and Security. 2015; 10: 507–518. doi: 10.1109/TIFS.2014.2381872
3. Wen XZ, Shao L, Fang W, Xue Y. Efficient feature selection and classification for vehicle detection. IEEE Transactions on Circuits and Systems for Video Technology. 2015. doi: 10.1109/TCSVT.2014.2358031
4. Zhang H, Wu J, Nguyen TM, Sun MX. Synthetic aperture radar image segmentation by modified Student's t-mixture model. IEEE Transactions on Geoscience and Remote Sensing. 2014; 52: 4391–4403. doi: 10.1109/TGRS.2013.2281854
5. Fu ZJ. Achieving efficient cloud search services: multi-keyword ranked search over encrypted cloud data supporting parallel computing. IEICE Transactions on Communications. 2015; E98-B: 190–200. doi: 10.1587/transcom.E98.B.190
6. Dai Y, Yuan Y. A nonlinear conjugate gradient method with a strong global convergence property. SIAM J. Optim. 2000; 10: 177–182. doi: 10.1137/S1052623497318992
7. Dai Y, Yuan Y. Nonlinear Conjugate Gradient Methods. Shanghai Scientific and Technical Publishers, 1998.
8. Fletcher R. Practical Methods of Optimization, Vol. I: Unconstrained Optimization. 2nd edition, Wiley, New York, 1997.
9. Fletcher R, Reeves C. Function minimization by conjugate gradients. Comput. J. 1964; 7: 149–154. doi: 10.1093/comjnl/7.2.149
10. Hager WW, Zhang H. A new conjugate gradient method with guaranteed descent and an efficient line search. SIAM J. Optim. 2005; 16: 170–192. doi: 10.1137/030601880
11. Hestenes MR, Stiefel E. Methods of conjugate gradients for solving linear systems. J. Res. Nat. Bur. Stand. 1952; 49: 409–436. doi: 10.6028/jres.049.044
12. Liu Y, Storey C. Efficient generalized conjugate gradient algorithms, part 1: theory. J. Appl. Math. Comput. 1992; 69: 17–41.
13. Polak E. The conjugate gradient method in extreme problems. Comput. Math. Mathem. Phys. 1969; 9: 94–112. doi: 10.1016/0041-5553(69)90035-4
14. Polak E, Ribière G. Note sur la convergence de méthodes de directions conjuguées. Rev. Fran. Inf. Rech. Opérat. 1969; 3: 35–43.
15. Wei Z, Yao S, Liu L. The convergence properties of some new conjugate gradient methods. Appl. Math. Comput. 2006; 183: 1341–1350. doi: 10.1016/j.amc.2006.05.150
16. Yuan GL, Lu XW. A modified PRP conjugate gradient method. Ann. Oper. Res. 2009; 166: 73–90. doi: 10.1007/s10479-008-0420-4
17. Yuan GL, Lu XW, Wei ZX. A conjugate gradient method with descent direction for unconstrained optimization. J. Comput. Appl. Math. 2009; 233: 519–530. doi: 10.1016/j.cam.2009.08.001
18. Dai Y. Analysis of Conjugate Gradient Methods. Ph.D. thesis, Institute of Computational Mathematics and Scientific/Engineering Computing, Chinese Academy of Sciences, 1997.
19. Dai ZF, Tian BS. Global convergence of some modified PRP nonlinear conjugate gradient methods. Optim. Lett.
20. Gilbert JC, Nocedal J. Global convergence properties of conjugate gradient methods for optimization. SIAM J. Optim. 1992; 2: 21–42. doi: 10.1137/0802003
21. Hager WW, Zhang H. Algorithm 851: CG_DESCENT, a conjugate gradient method with guaranteed descent. ACM Trans. Math. Softw. 2006; 32: 113–137. doi: 10.1145/1132973.1132979
22. Powell MJD. Nonconvex minimization calculations and the conjugate gradient method. Lecture Notes in Mathematics, Vol. 1066, Springer-Verlag, Berlin, 1984, pp. 122–141.
23. Powell MJD. Convergence properties of algorithms for nonlinear optimization. SIAM Rev. 1986; 28: 487–500. doi: 10.1137/1028154
24. Yu GH. Nonlinear Self-Scaling Conjugate Gradient Methods for Large-Scale Optimization Problems. Doctoral thesis, Sun Yat-Sen University, 2007.
25. Yuan GL. Modified nonlinear conjugate gradient methods with sufficient descent property for large-scale optimization problems. Optim. Lett. 2009; 3: 11–21. doi: 10.1007/s11590-008-0086-5
26. Yuan YX. Analysis on the conjugate gradient method. Optim. Meth. Soft. 1993; 2: 19–29. doi: 10.1080/10556789308805532
27. Yuan GL, Wei ZX, Li GY. A modified Polak-Ribière-Polyak conjugate gradient algorithm for nonsmooth convex programs. J. Comput. Appl. Math. 2014; 255: 86–96. doi: 10.1016/j.cam.2013.04.032
28. Yuan GL, Wei ZX, Zhao QM. A modified Polak-Ribière-Polyak conjugate gradient algorithm for large-scale optimization problems. IIE Transactions. 2014; 46: 397–413. doi: 10.1080/0740817X.2012.726757
29. Yuan GL, Zhang MJ. A modified Hestenes-Stiefel conjugate gradient algorithm for large-scale optimization. Numerical Functional Analysis and Optimization. 2013; 34: 914–937. doi: 10.1080/01630563.2013.777350
30. Burmeister W. Die Konvergenzordnung des Fletcher-Powell-Algorithmus. Z. Angew. Math. Mech. 1973; 53: 693–699. doi: 10.1002/zamm.19730531007
31. Cohen A. Rate of convergence of several conjugate gradient algorithms. SIAM J. Numer. Anal. 1972; 9: 248–259. doi: 10.1137/0709024
32. Ritter K. On the rate of superlinear convergence of a class of variable metric methods. Numer. Math. 1980; 35: 293–313. doi: 10.1007/BF01396414
33. Zhang L, Zhou W, Li DH. A descent modified Polak-Ribière-Polyak conjugate gradient method and its global convergence. IMA J. Numer. Anal. 2006; 26: 629–640. doi: 10.1093/imanum/drl016
34. Li DH, Tian BS. N-step quadratic convergence of the MPRP method with a restart strategy. J. Comput. Appl. Math. 2011; 235: 4978–4990. doi: 10.1016/j.cam.2011.04.026
35. Broyden CG, Dennis JE, Moré JJ. On the local and superlinear convergence of quasi-Newton methods. J. Inst. Math. Appl. 1973; 12: 223–246. doi: 10.1093/imamat/12.3.223
36. Byrd R, Nocedal J. A tool for the analysis of quasi-Newton methods with application to unconstrained minimization. SIAM J. Numer. Anal. 1989; 26: 727–739. doi: 10.1137/0726042
37. Byrd R, Nocedal J, Yuan Y. Global convergence of a class of quasi-Newton methods on convex problems. SIAM J. Numer. Anal. 1987; 24: 1171–1189. doi: 10.1137/0724077
38. Dai Y. Convergence properties of the BFGS algorithm. SIAM J. Optim. 2003; 13: 693–701. doi: 10.1137/S1052623401383455
39. Dennis JE, Moré JJ. A characterization of superlinear convergence and its application to quasi-Newton methods. Math. Comput. 1974; 28: 549–560. doi: 10.1090/S0025-5718-1974-0343581-1
40. Li D, Fukushima M. A modified BFGS method and its global convergence in nonconvex minimization. J. Comput. Appl. Math. 2001; 129: 15–35. doi: 10.1016/S0377-0427(00)00540-9
41. Li D, Fukushima M. On the global convergence of the BFGS method for nonconvex unconstrained optimization problems. SIAM J. Optim. 2001; 11: 1054–1064. doi: 10.1137/S1052623499354242
42. Liu GH, Han JY, Sun DF. Global convergence analysis of the BFGS algorithm with nonmonotone line search. Optimization. 1995; 34: 147–159. doi: 10.1080/02331939508844101
43. Wei Z, Yu G, Yuan G, Lian Z. The superlinear convergence of a modified BFGS-type method for unconstrained optimization. Comput. Optim. Appl. 2004; 29: 315–332. doi: 10.1023/B:COAP.0000044184.25410.39
44. Wei Z, Li G, Qi L. New quasi-Newton methods for unconstrained optimization problems. Appl. Math. Comput. 2006; 175: 1156–1188. doi: 10.1016/j.amc.2005.08.027
45. Yuan GL, Wei ZX. Convergence analysis of a modified BFGS method on convex minimizations. Comput. Optim. Appl. 2010; 47: 237–255. doi: 10.1007/s10589-008-9219-0
46. Yuan GL, Lu XW. A modified PRP conjugate gradient method. Ann. Oper. Res. 2009; 166: 73–90. doi: 10.1007/s10479-008-0420-4
47. Gould NIM, Orban D, Toint PL. CUTEr (and SifDec): a constrained and unconstrained testing environment, revisited. ACM Trans. Math. Softw. 2003; 29: 373–394.