# Product Sale Forecasting Model Based on Iterative Support Vector Machine

To address the small sample sizes and noise that characterize product sales series, this paper proposes an iterative support vector machine (Iε-SVM). As the parameter ε of Iε-SVM is gradually reduced, the samples most affected by noise are iteratively amended to reduce their influence on the final forecasting model. Iε-SVM is applied to a numerical example and to automobile sales forecasting, and is compared with the ε support vector machine (ε-SVM). The experimental results indicate that Iε-SVM is effective and feasible, and that it yields more accurate forecasts than ε-SVM.

1. Introduction

2. Support Vector Machine

$\begin{array}{l}\underset{w,b,{\xi }^{\left(\ast \right)}}{\mathrm{min}}\text{ }\tau \left(w,b,{\xi }^{\left(\ast \right)}\right)=\frac{1}{2}{‖w‖}^{2}+C\underset{i=1}{\overset{l}{\sum }}\left({\xi }_{i}+{\xi }_{i}^{\ast }\right)\\ \text{s.t.}\left\{\begin{array}{l}{y}_{i}-\left(w\cdot {x}_{i}+b\right)\le \epsilon +{\xi }_{i}\\ \left(w\cdot {x}_{i}+b\right)-{y}_{i}\le \epsilon +{\xi }_{i}^{\ast }\\ {\xi }^{\left(\ast \right)}\ge 0\end{array}\right.\end{array}$ (1)

$\begin{array}{l}\underset{\alpha ,{\alpha }^{*}}{\mathrm{min}}\text{ }W\left(\alpha ,{\alpha }^{*}\right)=\frac{1}{2}\underset{i,j=1}{\overset{l}{\sum }}\left({\alpha }_{i}^{*}-{\alpha }_{i}\right)\left({\alpha }_{j}^{*}-{\alpha }_{j}\right)k\left({x}_{i},{x}_{j}\right)\\ \qquad \qquad -\underset{i=1}{\overset{l}{\sum }}\left({\alpha }_{i}^{*}-{\alpha }_{i}\right){y}_{i}+\underset{i=1}{\overset{l}{\sum }}\left({\alpha }_{i}^{*}+{\alpha }_{i}\right)\epsilon \\ \text{s.t.}\left\{\begin{array}{l}\underset{i=1}{\overset{l}{\sum }}\left({\alpha }_{i}^{*}-{\alpha }_{i}\right)=0\\ {\alpha }_{i},{\alpha }_{i}^{*}\in \left[0,C\right],i=1,\cdots ,l\end{array}\right.\end{array}$ (2)

$f\left(x\right)=\underset{i=1}{\overset{l}{\sum }}\left({\alpha }_{i}^{\ast }-{\alpha }_{i}\right)k\left({x}_{i},x\right)+b$ (3)
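As an illustration of (3), the following minimal sketch evaluates the decision function from given dual coefficients. The function names and the default linear kernel are assumptions made for this example; the coefficients would normally come from solving (2), which is not shown here.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def linear_kernel(u: Vector, v: Vector) -> float:
    """Inner product u . v, one possible choice for the kernel k."""
    return sum(a * b for a, b in zip(u, v))


def decision_function(
    x: Vector,
    train_x: Sequence[Vector],
    alpha: Sequence[float],
    alpha_star: Sequence[float],
    b: float,
    kernel: Callable[[Vector, Vector], float] = linear_kernel,
) -> float:
    """Evaluate f(x) = sum_i (alpha*_i - alpha_i) k(x_i, x) + b, as in (3)."""
    total = 0.0
    for x_i, a, a_s in zip(train_x, alpha, alpha_star):
        total += (a_s - a) * kernel(x_i, x)
    return total + b


# Hypothetical coefficients, chosen only to exercise the formula.
value = decision_function(
    [3.0], train_x=[[1.0], [2.0]], alpha=[0.0, 0.5], alpha_star=[0.5, 0.0], b=0.1
)
```

Training points with ${\alpha }_{i}={\alpha }_{i}^{*}=0$ contribute nothing to the sum, so only the remaining points need to be stored for prediction.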

3. Iterative Support Vector Machine (Iε-SVM)

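ε-SVM uses the ε-insensitive loss function. The region between the curves $y=\underset{i=1}{\overset{l}{\sum }}\left({\alpha }_{i}^{\ast }-{\alpha }_{i}\right)k\left({x}_{i},x\right)+b+\epsilon$ and $y=\underset{i=1}{\overset{l}{\sum }}\left({\alpha }_{i}^{\ast }-{\alpha }_{i}\right)k\left({x}_{i},x\right)+b-\epsilon$ is called the ε-band. When a sample point lies within the ε-band, the model is considered to incur no loss at that point; a loss arises only when the sample point lies outside the ε-band. Regarding the position of a sample point $\left({x}_{i},{y}_{i}\right)$ relative to the ε-band, we have the following theorem.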

i) If ${\alpha }_{i}={\alpha }_{i}^{*}=0$, then the corresponding sample point $\left({x}_{i},{y}_{i}\right)$ must lie inside or on the boundary of the ε-band.

ii) If ${\alpha }_{i}\in \left(0,C\right),{\alpha }_{i}^{*}=0$ or ${\alpha }_{i}=0,{\alpha }_{i}^{*}\in \left(0,C\right)$, then the corresponding sample point $\left({x}_{i},{y}_{i}\right)$ must lie on the boundary of the ε-band.

iii) If ${\alpha }_{i}=C,{\alpha }_{i}^{*}=0$ or ${\alpha }_{i}=0,{\alpha }_{i}^{*}=C$, then the corresponding sample point $\left({x}_{i},{y}_{i}\right)$ must lie outside or on the boundary of the ε-band.

${\alpha }_{i}\left(\epsilon +{\xi }_{i}-{y}_{i}+w\cdot {x}_{i}+b\right)=0$ (4)

${\alpha }_{i}^{*}\left(\epsilon +{\xi }_{i}^{*}+{y}_{i}-w\cdot {x}_{i}-b\right)=0$ (5)

$\left(C-{\alpha }_{i}\right){\xi }_{i}=0$ (6)

$\left(C-{\alpha }_{i}^{*}\right){\xi }_{i}^{*}=0$ (7)

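i) From (6) and (7), ${\xi }_{i}={\xi }_{i}^{*}=0$; by the constraints in (1), the sample point then lies inside or on the boundary of the ε-band.

ii) From (6) and (7), ${\xi }_{i}={\xi }_{i}^{*}=0$. Combining this with (4) and (5), when ${\alpha }_{i}\ne 0$ we have $\epsilon +{\xi }_{i}-{y}_{i}+w\cdot {x}_{i}+b=0$, and when ${\alpha }_{i}^{*}\ne 0$ we have $\epsilon +{\xi }_{i}^{*}+{y}_{i}-w\cdot {x}_{i}-b=0$, so the sample point lies on the boundary of the ε-band.

iii) This follows from (4) to (7).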
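To make the ε-band concrete, here is a small sketch of the ε-insensitive loss and of the three positions (inside, boundary, outside) that the theorem distinguishes. It classifies a point by its residual rather than by its dual coefficients, and the function names and the numerical tolerance `tol` are assumptions introduced for this illustration.

```python
def eps_insensitive_loss(y: float, f_x: float, eps: float) -> float:
    """Loss is zero inside the eps-band and grows linearly outside it."""
    return max(0.0, abs(y - f_x) - eps)


def band_position(y: float, f_x: float, eps: float, tol: float = 1e-9) -> str:
    """Locate the point (x, y) relative to the eps-band around f(x)."""
    residual = abs(y - f_x)
    if residual < eps - tol:
        return "inside"
    if residual <= eps + tol:
        return "boundary"
    return "outside"


# A point 0.05 away from the prediction lies inside a band of half-width 0.1.
loss_inside = eps_insensitive_loss(1.05, 1.0, 0.1)
# A point 0.3 away lies outside and is penalised by the excess over eps.
loss_outside = eps_insensitive_loss(1.3, 1.0, 0.1)
```

Under the theorem, a point reported as "inside" here can only have ${\alpha }_{i}={\alpha }_{i}^{*}=0$, while a point reported as "outside" must have one coefficient equal to C.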

Figure 1. The flow chart of Iε-SVM

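The operating mechanism of Iε-SVM is shown in Figure 1. We first fix the model parameter C and any kernel parameters, choose a relatively large initial value for ε, train an ε-SVM, and update the values of the sample points $\left({x}_{i},{y}_{i}\right)$ that fall outside the ε-band. $\lambda$ is a preset scaling factor, and ε is shrunk by the ratio $\lambda$ at each iteration. $\varsigma$ is a preset threshold: if $\epsilon <\varsigma$ the algorithm terminates; otherwise ε is updated and an ε-SVM is trained on the updated sample set.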

1) If ${\alpha }_{i}\ne C$ and ${\alpha }_{i}^{*}\ne C$, the sample point $\left({x}_{i},{y}_{i}\right)$ lies inside or on the boundary of the ε-band, and we set ${\stackrel{˜}{y}}_{i}={y}_{i}$

2) If ${\alpha }_{i}=C,{\alpha }_{i}^{*}=0$, then ${y}_{i}\ge w\cdot {x}_{i}+b+\epsilon$, and we set ${\stackrel{˜}{y}}_{i}=w\cdot {x}_{i}+b+\epsilon$

3) If ${\alpha }_{i}=0,{\alpha }_{i}^{*}=C$, then ${y}_{i}\le w\cdot {x}_{i}+b-\epsilon$, and we set ${\stackrel{˜}{y}}_{i}=w\cdot {x}_{i}+b-\epsilon$
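The outer loop and the three update rules above can be sketched as follows. This is one reading of the procedure, not the authors' implementation: the update is written as clipping each target to the band around the current prediction, which gives the same result as cases 1) to 3) when points are identified by their residuals. The names `amend_targets`, `iterative_eps_fit`, and `least_squares_line` are hypothetical, and the trainer used in the demonstration is an ordinary least-squares line that ignores ε, a stand-in chosen only to keep the example dependency-free. In practice an ε-SVM solver would be passed in as `train`.

```python
from typing import Callable, List, Sequence, Tuple

Predictor = Callable[[float], float]
# A trainer takes (xs, targets, eps) and returns a fitted predictor.
Trainer = Callable[[Sequence[float], Sequence[float], float], Predictor]


def amend_targets(xs: Sequence[float], ys: Sequence[float],
                  predict: Predictor, eps: float) -> List[float]:
    """Cases 1)-3): targets outside the eps-band move onto its nearest boundary."""
    amended = []
    for x, y in zip(xs, ys):
        f_x = predict(x)
        amended.append(min(max(y, f_x - eps), f_x + eps))
    return amended


def iterative_eps_fit(xs: Sequence[float], ys: Sequence[float], train: Trainer,
                      eps0: float, lam: float,
                      threshold: float) -> Tuple[Predictor, List[float]]:
    """Train, amend targets, shrink eps by lam, and stop once eps < threshold."""
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie in (0, 1) so that eps shrinks")
    if eps0 <= 0.0 or threshold <= 0.0:
        raise ValueError("eps0 and threshold must be positive")
    eps = eps0
    targets = list(ys)
    model = train(xs, targets, eps)
    while True:
        targets = amend_targets(xs, targets, model, eps)
        eps *= lam
        if eps < threshold:
            return model, targets
        model = train(xs, targets, eps)


def least_squares_line(xs: Sequence[float], targets: Sequence[float],
                       eps: float) -> Predictor:
    """Stand-in trainer (NOT an eps-SVM): least-squares line, eps is ignored."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(targets) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, targets)) / sxx
    intercept = mean_t - slope * mean_x
    return lambda x: slope * x + intercept


xs = [1.0, 1.2, 1.4, 1.6, 1.8, 2.0]
ys = [1.0, 1.2, 2.4, 1.6, 1.8, 2.0]  # the third target carries a large error
model, amended = iterative_eps_fit(xs, ys, least_squares_line,
                                   eps0=0.5, lam=0.5, threshold=0.05)
```

Passing the trainer as a callable keeps the amendment logic independent of the solver, so the same loop applies to linear and kernel models.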

$\begin{array}{l}\underset{w,b,{\xi }^{\left(\ast \right)}}{\mathrm{min}}\text{ }\tau \left(w,b,{\xi }^{\left(\ast \right)}\right)=\frac{1}{2}{‖w‖}^{2}+C\underset{i=1}{\overset{l}{\sum }}\left({\xi }_{i}+{\xi }_{i}^{\ast }\right)\\ \text{s.t.}\left\{\begin{array}{l}{\stackrel{˜}{y}}_{i}-\left(w\cdot {x}_{i}+b\right)\le \epsilon +{\xi }_{i}\\ \left(w\cdot {x}_{i}+b\right)-{\stackrel{˜}{y}}_{i}\le \epsilon +{\xi }_{i}^{\ast }\\ {\xi }^{\left(\ast \right)}\ge 0\end{array}\right.\end{array}$ (8)

i) ${\stackrel{˜}{y}}_{i}={y}_{i}$: in this case $\left({x}_{i},{y}_{i}\right)$ lies inside or on the boundary of the ε-band of the curve $y=\stackrel{¯}{w}\cdot x+\stackrel{¯}{b}$, so that:

${\stackrel{˜}{y}}_{i}-\left(\stackrel{¯}{w}\cdot {x}_{i}+\stackrel{¯}{b}\right)\le \epsilon ,$

$\left(\stackrel{¯}{w}\cdot {x}_{i}+\stackrel{¯}{b}\right)-{\stackrel{˜}{y}}_{i}\le \epsilon .$

ii) ${\stackrel{˜}{y}}_{i}=\stackrel{¯}{w}\cdot {x}_{i}+\stackrel{¯}{b}+\epsilon$: in this case $\left({x}_{i},{y}_{i}\right)$ lies outside or on the boundary of the ε-band of the curve $y=\stackrel{¯}{w}\cdot x+\stackrel{¯}{b}$, so that:

${\stackrel{˜}{y}}_{i}-\left(\stackrel{¯}{w}\cdot {x}_{i}+\stackrel{¯}{b}\right)\le \epsilon ,$

$\left(\stackrel{¯}{w}\cdot {x}_{i}+\stackrel{¯}{b}\right)-{\stackrel{˜}{y}}_{i}\le \epsilon .$

iii) ${\stackrel{˜}{y}}_{i}=\stackrel{¯}{w}\cdot {x}_{i}+\stackrel{¯}{b}-\epsilon$: in this case $\left({x}_{i},{y}_{i}\right)$ likewise lies outside or on the boundary of the ε-band of the curve $y=\stackrel{¯}{w}\cdot x+\stackrel{¯}{b}$, so that:

${\stackrel{˜}{y}}_{i}-\left(\stackrel{¯}{w}\cdot {x}_{i}+\stackrel{¯}{b}\right)\le \epsilon ,$

$\left(\stackrel{¯}{w}\cdot {x}_{i}+\stackrel{¯}{b}\right)-{\stackrel{˜}{y}}_{i}\le \epsilon .$

$\tau \left(\stackrel{¯}{w},\stackrel{¯}{b},0\right)\ge \tau \left(\stackrel{˜}{w},\stackrel{˜}{b},{\stackrel{˜}{\xi }}^{\left(\ast \right)}\right),$

$\frac{1}{2}{‖\stackrel{˜}{w}‖}^{2}+C\underset{i=1}{\overset{l}{\sum }}\left({\stackrel{˜}{\xi }}_{i}+{\stackrel{˜}{\xi }}_{i}^{\ast }\right)\le \frac{1}{2}{‖\stackrel{¯}{w}‖}^{2}$, and since ${\stackrel{˜}{\xi }}^{\left(\ast \right)}\ge 0$, we have:

${‖\stackrel{˜}{w}‖}^{2}\le {‖\stackrel{¯}{w}‖}^{2}.$

4. Simulation Experiments

4.1. Comparison Criteria

${e}_{i}=|{y}_{i}-{{y}^{\prime }}_{i}|,$ (9)

1) Maximum forecasting error ($\mathrm{max}\left(e\right)$)

$\mathrm{max}\left(e\right)=\underset{1\le i\le {l}_{2}}{\mathrm{max}}\left({e}_{i}\right),$ (10)

2) Minimum forecasting error ($\mathrm{min}\left(e\right)$)

$\mathrm{min}\left(e\right)=\underset{1\le i\le {l}_{2}}{\mathrm{min}}\left({e}_{i}\right)$ (11)

3) Mean forecasting error ($\text{mean}\left(e\right)$)

$\text{mean}\left(e\right)=\frac{1}{{l}_{2}}\left(\underset{i=1}{\overset{{l}_{2}}{\sum }}{e}_{i}\right)$ (12)

4) Variance of the forecasting error ($\mathrm{var}\left(e\right)$)

$\mathrm{var}\left(e\right)=\frac{1}{{l}_{2}-1}\underset{i=1}{\overset{{l}_{2}}{\sum }}{\left({e}_{i}-\text{mean}\left(e\right)\right)}^{2}$ (13)
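The four criteria in (9) to (13) can be computed with the Python standard library, as in the sketch below. The function names are assumptions for this example; `statistics.variance` is the sample variance with denominator ${l}_{2}-1$, which matches (13), and it needs at least two test samples.

```python
from statistics import mean, variance
from typing import Dict, List, Sequence


def forecast_errors(actual: Sequence[float],
                    predicted: Sequence[float]) -> List[float]:
    """Absolute errors e_i = |y_i - y'_i| over the test set, as in (9)."""
    return [abs(y - y_hat) for y, y_hat in zip(actual, predicted)]


def error_summary(actual: Sequence[float],
                  predicted: Sequence[float]) -> Dict[str, float]:
    """Maximum, minimum, mean and sample variance of the errors, (10)-(13)."""
    e = forecast_errors(actual, predicted)
    return {
        "max": max(e),
        "min": min(e),
        "mean": mean(e),
        "var": variance(e),  # divides by l2 - 1
    }


# Made-up values, used only to exercise the formulas.
summary = error_summary([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
```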

4.2. Numerical Example

$y=x+\xi ,$ (14)

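Here $\xi$ is a random number drawn from $N\left(0,0.05\right)$. Taking x at 36 evenly spaced nodes in the interval $\left[1,2\right]$ and computing the corresponding y from (14) yields 36 data pairs $\left(x,y\right)$, which form the sample set T. T is split into a training set ${T}_{1}$ and a test set ${T}_{2}$: ${T}_{1}$ consists of the first ${l}_{1}$ samples of T, and the remaining samples form ${T}_{2}$. ${l}_{1}$ is set to 6, 12, and 24 in turn, so that the two models can be compared on training sets of different sizes. ε-SVM and Iε-SVM are each trained on ${T}_{1}$ to build a model, and both use the same kernel function: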

$k\left({x}_{i},{x}_{j}\right)={x}_{i}\cdot {x}_{j}$
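The data set of this example can be generated along the following lines. Several details are assumptions, because the text does not fix them: the second parameter of $N\left(0,0.05\right)$ is treated as a standard deviation (it may equally denote the variance), both endpoints of $\left[1,2\right]$ are taken as nodes, and the seed and function names are arbitrary.

```python
import random
from typing import List, Tuple

Sample = Tuple[float, float]


def make_samples(n: int = 36, lo: float = 1.0, hi: float = 2.0,
                 noise_std: float = 0.05, seed: int = 0) -> List[Sample]:
    """Generate (x, y) pairs with y = x + noise on n evenly spaced nodes."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    step = (hi - lo) / (n - 1)
    samples = []
    for i in range(n):
        x = lo + i * step
        samples.append((x, x + rng.gauss(0.0, noise_std)))
    return samples


def split_samples(samples: List[Sample],
                  l1: int) -> Tuple[List[Sample], List[Sample]]:
    """The first l1 samples form the training set T1, the rest the test set T2."""
    return samples[:l1], samples[l1:]


T = make_samples()
splits = {l1: split_samples(T, l1) for l1 in (6, 12, 24)}
```

Because ${T}_{1}$ always takes the leading samples, the test set lies entirely to the right of the training range, so the models are evaluated on extrapolation.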

Table 1. The forecasting errors of two models in the numerical experiment

Figure 2. The decline curve of the average forecasting error of Iε-SVM in iteration when ${l}_{1}=6$

Figure 3. The forecasting results of two models when ${l}_{1}=6$

4.3. Application Example

$K\left({x}_{i},{x}_{j}\right)=\mathrm{exp}\left(-\frac{{‖{x}_{i}-{x}_{j}‖}^{2}}{{\sigma }^{2}}\right)$
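For reference, the Gaussian kernel above can be written as a short function. Note that the exponent is divided by ${\sigma }^{2}$, as in the formula here, rather than by $2{\sigma }^{2}$ as in some other conventions. The function name and the sample inputs are assumptions for this sketch.

```python
import math
from typing import Sequence


def gaussian_kernel(u: Sequence[float], v: Sequence[float],
                    sigma: float) -> float:
    """K(u, v) = exp(-||u - v||^2 / sigma^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / sigma ** 2)


# Identical inputs give the maximum value 1; the value decays with distance.
same = gaussian_kernel([1.0, 2.0], [1.0, 2.0], sigma=1.0)
apart = gaussian_kernel([0.0], [1.0], sigma=1.0)
```

A larger $\sigma$ makes the kernel decay more slowly, so distant training samples have more influence on each prediction.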

Figure 4. The decline curve of the average forecasting error of Iε-SVM in iteration

Figure 5. The forecasting results of two models

Table 2. The forecasting errors of two models in the practical experiment

5. Conclusion

