# A Modified Nonlinear Conjugate Gradient Algorithm Using Secant Conditions

• Full text: PDF (531 KB) · pp. 676–683 · DOI: 10.12677/AAM.2019.84076

In this paper, based on the idea of Ref. [1], we propose a modified conjugate gradient method using secant conditions for unconstrained optimization problems. The proposed algorithm improves the method of Ref. [1] and has the following properties: the search direction satisfies the sufficient descent property; global convergence is established under suitable assumptions; and numerical results are reported to test its efficiency.

1. Introduction

Consider the unconstrained optimization problem

$\min_{x \in \mathbb{R}^n} f(x),$ (1.1)

Conjugate gradient methods generate iterates by

$x_{k+1} = x_k + \alpha_k d_k, \quad k = 0, 1, 2, \cdots,$

where $\alpha_k$ is the step size and the search direction $d_k$ is given by

$d_k = \begin{cases} -g_k + \beta_k d_{k-1}, & k \ge 1, \\ -g_k, & k = 0, \end{cases}$ (1.2)

where $g_k = \nabla f(x_k)$ and $\beta_k$ is a scalar. Well-known choices of $\beta_k$ include

$\beta_k^{PRP} = \frac{g_{k+1}^T (g_{k+1} - g_k)}{\|g_k\|^2}$ [2] [3], $\beta_k^{HS} = \frac{g_{k+1}^T (g_{k+1} - g_k)}{(g_{k+1} - g_k)^T d_k}$ [4], $\beta_k^{FR} = \frac{\|g_{k+1}\|^2}{\|g_k\|^2}$ [5],

$\beta_k^{DY} = \frac{\|g_{k+1}\|^2}{(g_{k+1} - g_k)^T d_k}$ [6], $\beta_k^{CD} = \frac{\|g_{k+1}\|^2}{-d_k^T g_k}$ [7], $\beta_k^{LS} = \frac{g_{k+1}^T (g_{k+1} - g_k)}{-g_k^T d_k}$ [8].

$\beta_k^{MPRP} = \beta_k^{PRP} - \min\left\{ \beta_k^{PRP}, \frac{\mu \|y_k\|^2}{\|g_k\|^4} g_{k+1}^T d_k \right\},$

$\beta_k^{MMLS+} = \beta_k^{MMLS} - \min\left\{ \beta_k^{MMLS}, \frac{\mu \|y_k^m\|^2}{(-d_k^T g_k)^2} g_{k+1}^T d_k \right\}.$ (1.3)

Here $y_k^m$ is a modified gradient difference chosen so that the secant condition

$B_{k+1} s_k = y_k^m$ (1.4)

holds, where $s_k = x_{k+1} - x_k$.

$\gamma_k^1 = \frac{3 (g_{k+1} + g_k)^T s_k + 6 \left( f(x_k) - f(x_{k+1}) \right)}{\|s_k\|^2},$

$\beta_k^{MMLS*} = \beta_k^{MMLS} - \min\left\{ \beta_k^{MMLS}, \frac{\mu \|y_k^m\|^2}{(-d_k^T g_k)^2} g_{k+1}^T d_k \right\}.$ (1.5)
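As a concrete illustration, the coefficient (1.5) can be evaluated directly once $g_k$, $g_{k+1}$, $d_k$, and $y_k^m$ are available. The following is a minimal NumPy sketch; the function name and the default value of $\mu$ are our own assumptions, and `y_m` stands for the modified gradient difference $y_k^m$ from (1.4):

```python
import numpy as np

def beta_mmls_star(g_next, g, d, y_m, mu=1.0):
    """Sketch of the truncated coefficient in (1.5).

    g_next, g : gradients g_{k+1}, g_k
    d         : current search direction d_k (assumed descent: d @ g < 0)
    y_m       : modified gradient difference y_k^m from (1.4)
    mu        : parameter from the text; the default value here is an assumption
    """
    denom = -d @ g                       # -d_k^T g_k, positive for a descent direction
    beta_mmls = (g_next @ y_m) / denom   # beta_k^{MMLS}
    correction = mu * (y_m @ y_m) / denom**2 * (g_next @ d)
    return beta_mmls - min(beta_mmls, correction)
```

With $\mu > 1/4$, the direction $d_{k+1} = -g_{k+1} + \beta_k^{MMLS*} d_k$ built from this coefficient satisfies the sufficient descent bound derived in Section 3.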

2. Description of the Algorithm

The step size $\alpha_k$ is computed by the Wolfe line search:

$f(x_k + \alpha_k d_k) \le f_k + \delta \alpha_k g_k^T d_k,$ (2.1)

$g(x_k + \alpha_k d_k)^T d_k \ge \sigma g_k^T d_k,$ (2.2)

where $0 < \delta < \sigma < 1$.

The search direction is updated by

$d_{k+1} = -g_{k+1} + \beta_k^{MMLS*} d_k,$ (2.3)

with $\beta_k^{MMLS*}$ given by (1.5).
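To make the scheme concrete, here is a minimal, self-contained sketch of one possible implementation of the iteration (2.3) with a Wolfe line search for (2.1)–(2.2). The function names, the parameter defaults, and the use of the plain gradient difference $y_k = g_{k+1} - g_k$ in place of the modified $y_k^m$ are simplifying assumptions of ours, not the paper's exact algorithm:

```python
import numpy as np

def wolfe_search(f, grad, x, d, delta=1e-4, sigma=0.9, alpha=1.0):
    """Bisection-style line search for the Wolfe conditions (2.1)-(2.2).
    The values of delta and sigma are common defaults, not from the text."""
    lo, hi = 0.0, np.inf
    fx, gxd = f(x), grad(x) @ d
    for _ in range(50):
        xa = x + alpha * d
        if f(xa) > fx + delta * alpha * gxd:   # Armijo condition (2.1) fails
            hi = alpha
        elif grad(xa) @ d < sigma * gxd:       # curvature condition (2.2) fails
            lo = alpha
        else:
            return alpha
        alpha = (lo + hi) / 2 if np.isfinite(hi) else 2.0 * alpha
    return alpha

def modified_cg(f, grad, x0, mu=1.0, tol=1e-6, max_iter=1000):
    """CG iteration with the truncated coefficient of (1.5),
    using y_k = g_{k+1} - g_k as a stand-in for y_k^m."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                     # d_0 = -g_0
    for _ in range(max_iter):
        if np.linalg.norm(g) <= tol:
            break
        alpha = wolfe_search(f, grad, x, d)
        x_new = x + alpha * d
        g_new = grad(x_new)
        y = g_new - g
        denom = -d @ g                         # positive for a descent direction
        beta = (g_new @ y) / denom
        beta -= min(beta, mu * (y @ y) / denom**2 * (g_new @ d))
        d = -g_new + beta * d                  # update (2.3)
        x, g = x_new, g_new
    return x
```

On a simple convex quadratic such as the Sphere function this sketch converges to the minimizer in a few iterations.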

3. Sufficient Descent Property and Global Convergence

The search direction generated by the algorithm satisfies

$d_k^T g_k \le -c \|g_k\|^2$ (3.1)

and

$d_k^T y_k \ge c (1 - \sigma) \|g_k\|^2,$ (3.2)

where $c > 0$ is a constant.

$\begin{aligned} g_{k+1}^T d_{k+1} &= -\|g_{k+1}\|^2 + \beta_k^{MMLS*} d_k^T g_{k+1} \\ &= -\|g_{k+1}\|^2 + \left[ \frac{g_{k+1}^T y_k^m}{-d_k^T g_k} - \min\left\{ \frac{g_{k+1}^T y_k^m}{-d_k^T g_k}, \frac{\mu \|y_k^m\|^2}{(-d_k^T g_k)^2} g_{k+1}^T d_k \right\} \right] d_k^T g_{k+1}. \end{aligned}$ (3.3)

Note that $-d_k^T g_k > 0$. Set $u = \frac{\sqrt{-d_k^T g_k}}{\sqrt{2\mu}} g_{k+1}$ and $v = \frac{\sqrt{2\mu}\, g_{k+1}^T d_k}{\sqrt{-d_k^T g_k}} y_k^m$. We now discuss (3.3) in two cases.

If the minimum in (3.3) is attained by the second term, then

$\begin{aligned} g_{k+1}^T d_{k+1} &= -\|g_{k+1}\|^2 + \left( \frac{g_{k+1}^T y_k^m}{-d_k^T g_k} - \frac{\mu \|y_k^m\|^2}{(-d_k^T g_k)^2} g_{k+1}^T d_k \right) d_k^T g_{k+1} \\ &= \frac{ d_k^T g_{k+1}\, g_{k+1}^T y_k^m - \|g_{k+1}\|^2 (-d_k^T g_k) - \frac{\mu \|y_k^m\|^2}{-d_k^T g_k} (g_{k+1}^T d_k)^2 }{-d_k^T g_k} \\ &= \frac{ u^T v - \frac{1}{2}\left( \|u\|^2 + \|v\|^2 \right) }{-d_k^T g_k} - \left( 1 - \frac{1}{4\mu} \right) \|g_{k+1}\|^2 \\ &\le -\left( 1 - \frac{1}{4\mu} \right) \|g_{k+1}\|^2. \end{aligned}$
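The final inequality in the chain uses only the elementary Young inequality for the auxiliary vectors $u$ and $v$ defined above:

```latex
% Young's inequality: for any vectors u, v,
u^{T}v \;\le\; \|u\|\,\|v\| \;\le\; \tfrac{1}{2}\left(\|u\|^{2} + \|v\|^{2}\right),
% so the fraction involving u and v is nonpositive (recall -d_k^T g_k > 0), leaving
g_{k+1}^{T}d_{k+1} \;\le\; -\left(1 - \tfrac{1}{4\mu}\right)\|g_{k+1}\|^{2}.
```

This bound is negative precisely when $\mu > 1/4$, which is why the sufficient descent property requires that restriction on $\mu$.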

Suppose that there exists a constant $\epsilon_0 > 0$ such that

$\|g_k\| \ge \epsilon_0, \quad \forall k \ge 0.$ (3.4)

1) There exists a constant $L_0 > 0$ such that

$\|\nabla^2 f(x)\| \le L_0, \quad x \in \Omega.$

2) The objective function $f$ is continuously differentiable and bounded below on the level set $\Omega$, and its gradient satisfies the Lipschitz condition: there exists a constant $L > 0$ such that

$\|g(x) - g(y)\| \le L \|x - y\|, \quad \forall x, y \in \Omega.$ (3.5)

Gilbert and Nocedal [14] introduced the following Property (P):

$0 < \gamma_1 \le \|g_k\| \le \gamma_2$ (3.6)

$\|s_k\| \le M_1.$ (3.7)

$\begin{aligned} |\beta_k^{MMLS*}| &\le \left| \frac{g_{k+1}^T y_k^m}{d_k^T g_k} \right| + \frac{\mu \|y_k^m\|^2}{(-d_k^T g_k)^2} |g_{k+1}^T d_k| \\ &\le \frac{ \|g_{k+1}\| \left( \|g_{k+1} - g_k\| + 3\left[ \|g_{k+1} - g_k\| + \|\nabla^2 f(\varsigma_k) s_k\| \right] \right) }{c \|g_k\|^2} + \frac{ \mu \|g_{k+1}\| \|d_k\| \left( \|g_{k+1} - g_k\| + 3\left[ \|g_{k+1} - g_k\| + \|\nabla^2 f(\varsigma_k) s_k\| \right] \right)^2 }{c^2 \|g_k\|^4} \\ &\le \frac{ \left[ L\gamma_2 + 3(L + L_0) \right] \|s_k\| }{c \gamma_1^2} + \frac{ \mu \gamma_2 M \left( L + 3(L + L_0) \right)^2 M_1 \|s_k\| }{c^2 \gamma_1^4} \\ &= \left( \frac{ \left[ L\gamma_2 + 3(L + L_0) \right] c \gamma_1^2 + \mu \gamma_2 M \left( L + 3(L + L_0) \right)^2 M_1 }{c^2 \gamma_1^4} \right) \|s_k\|, \end{aligned}$

$b = \max\left\{ 2, \frac{ \left[ L\gamma_2 + 3(L + L_0) \right] c\gamma_1^2 + \mu\gamma_2 M \left( L + 3(L + L_0) \right)^2 M_1 }{c^2 \gamma_1^4} \right\}$

$\lambda = \frac{c^2 \gamma_1^4}{2b \left\{ \left[ L\gamma_2 + 3(L + L_0) \right] c\gamma_1^2 + \mu\gamma_2 M \left( L + 3(L + L_0) \right)^2 M_1 \right\}}.$

Then, whenever $\|s_k\| \le \lambda$,

$\begin{aligned} |\beta_k^{MMLS*}| &\le \frac{ \left[ L\gamma_2 + 3(L + L_0) \right] c\gamma_1^2 + \mu\gamma_2 M \left( L + 3(L + L_0) \right)^2 M_1 }{c^2 \gamma_1^4} \|s_k\| \\ &\le \frac{ \left[ L\gamma_2 + 3(L + L_0) \right] c\gamma_1^2 + \mu\gamma_2 M \left( L + 3(L + L_0) \right)^2 M_1 }{c^2 \gamma_1^4} \lambda = \frac{1}{2b}. \end{aligned}$

4. Numerical Results

The algorithms were tested on the following benchmark functions:

1) Sphere function: $f_{Sph}(x) = \sum_{i=1}^{n} x_i^2, \quad x_i \in [-5.12, 5.12], \quad f_{Sph}(x^*) = 0.$

2) Schwefel function: $f_{SchDS}(x) = \sum_{i=1}^{n} \left( \sum_{j=1}^{i} x_j \right)^2, \quad x_i \in [-65.536, 65.536], \quad f_{SchDS}(x^*) = 0.$

3) Rastrigin function: $f_{Ras}(x) = 10n + \sum_{i=1}^{n} \left( x_i^2 - 10\cos(2\pi x_i) \right), \quad x_i \in [-5.12, 5.12], \quad f_{Ras}(x^*) = 0.$

4) Griewank function: $f_{Gri}(x) = 1 + \sum_{i=1}^{n} \frac{x_i^2}{4000} - \prod_{i=1}^{n} \cos\frac{x_i}{i}, \quad x_i \in [-600, 600], \quad f_{Gri}(x^*) = 0.$
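For reference, the four benchmark functions can be implemented directly. The following is a minimal NumPy sketch; the function names are our own, and the Griewank product uses $\cos(x_i/i)$ exactly as written above:

```python
import numpy as np

def f_sph(x):
    """Sphere function."""
    return np.sum(x**2)

def f_schds(x):
    """Schwefel double-sum function: sum of squared prefix sums."""
    return np.sum(np.cumsum(x)**2)

def f_ras(x):
    """Rastrigin function."""
    return 10 * x.size + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

def f_gri(x):
    """Griewank function, with cos(x_i / i) as in the text."""
    i = np.arange(1, x.size + 1)
    return 1 + np.sum(x**2) / 4000 - np.prod(np.cos(x / i))
```

Each function attains its minimum value $0$ at $x^* = 0$, which gives a quick sanity check for an implementation.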

$x_0$: initial point; Dim: dimension of the problem; NI: number of iterations; NFG: total number of function and gradient evaluations; Time: CPU time in seconds; $f(x)$: function value at termination.

Table 1. Numerical results

Table 2. The Efficiency of NFG

Supported by the 2017 National Undergraduate Innovation and Entrepreneurship Training Program (201710606041), the Natural Science Foundation of Guangxi (2018JJA110039), and a Yulin Normal University research project (2019YJKY16).

NOTES

*First author.

#Corresponding author.