Journal of Computer Applications, 2017, Vol. 37, Issue (3): 746-749  DOI: 10.11772/j.issn.1001-9081.2017.03.746

### Cite this article

WANG Dingcheng, ZHAO Youzhi, CHEN Beijing, LU Yiyi. Fast learning algorithm of multi-output support vector regression with data-dependent kernel[J]. JOURNAL OF COMPUTER APPLICATIONS, 2017, 37(3): 746-749. DOI: 10.11772/j.issn.1001-9081.2017.03.746.

Fast learning algorithm of multi-output support vector regression with data-dependent kernel
WANG Dingcheng, ZHAO Youzhi, CHEN Beijing, LU Yiyi
School of Computer and Software, Nanjing University of Information Science and Technology, Nanjing Jiangsu 210044, China
Abstract: When the model parameters of the Multi-output Support Vector Regression (MSVR) algorithm are fitted by the gradient descent method, convergence is slow and prediction accuracy is low. To address this, a modified quasi-Newton algorithm (BFGS) with a second-order convergence rate, based on a rank-2 correction rule, was used to fit the model parameters of the MSVR algorithm. At the same time, to guarantee descent during the iteration and global convergence, the step size factor was determined by an inexact line search technique. Furthermore, based on an analysis of the geometric structure of the kernel function in the Support Vector Machine (SVM), a data-dependent kernel function was substituted for the traditional kernel function, yielding a multi-output data-dependent kernel support vector regression model. The model was compared with the multi-output support vector regression models based on the gradient descent method and on the modified Newton method. The experimental results show that with 200 samples, the iteration time of the proposed algorithm is 72.98 s, versus 116.34 s for the modified Newton algorithm and 2065.22 s for the gradient descent method. The proposed algorithm reduces model iteration time and converges faster.
Key words: data-dependent kernel; multi-output support vector regression; optimization algorithm; quasi-Newton algorithm

1 Multi-output data-dependent kernel SVR algorithm

1.1 Multi-output support vector regression algorithm

$T=\{(x_i, y_i) \mid x_i\in \boldsymbol{R}^n,\ y_i\in \boldsymbol{R}^m,\ i=1,2,\ldots ,l\}$ (1)

$f(x_i)=\boldsymbol{W}\cdot \varphi (x_i)+\boldsymbol{B}$ (2)

$\begin{array}{l} \min \sum\limits_{j=1}^{n}\|\boldsymbol{w}_j\|^2+C\sum\limits_{i=1}^{l}\xi_i^2 \\ {\rm s.t.}\ \|y_i-\boldsymbol{W}\cdot \varphi (x_i)-\boldsymbol{B}\|\le \varepsilon +\xi_i;\ \xi_i\ge 0 \end{array}$ (3)

$L_\varepsilon (z)=\begin{cases} 0, & z<\varepsilon \\ (z-\varepsilon)^2, & z\ge \varepsilon \end{cases}$ (4)

$\min \sum\limits_{j=1}^{n}\|\boldsymbol{w}_j\|^2+C\sum\limits_{i=1}^{l}L_\varepsilon (\|\boldsymbol{y}_i-\boldsymbol{W}\cdot \varphi (x_i)-\boldsymbol{B}\|)$ (5)
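As a concrete illustration, the quadratic ε-insensitive loss of Eq. (4) and the objective of Eq. (5) can be sketched in NumPy. This is a minimal sketch, not the authors' implementation; the function names and the explicit feature matrix `Phi` are illustrative assumptions.

```python
import numpy as np

def eps_insensitive_sq(z, eps):
    """Quadratic epsilon-insensitive loss of Eq. (4):
    0 inside the eps-tube, (z - eps)**2 outside."""
    z = np.asarray(z, dtype=float)
    return np.where(z < eps, 0.0, (z - eps) ** 2)

def msvr_objective(W, B, Phi, Y, C, eps):
    """MSVR objective of Eq. (5): squared norms of the weight rows plus
    C times the eps-insensitive loss of each residual norm."""
    resid = np.linalg.norm(Y - Phi @ W.T - B, axis=1)  # ||y_i - W phi(x_i) - B||
    return np.sum(W ** 2) + C * np.sum(eps_insensitive_sq(resid, eps))
```

Note that the loss couples all outputs through the single residual norm per sample, which is what distinguishes MSVR from running an independent SVR per output.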

1.2 Data-dependent kernel function

 $\tilde{K}(x,{x}')=c(x)c({x}')K(x,{x}')$ (6)
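The conformal transformation of Eq. (6) can be sketched with an RBF base kernel. The concrete form of the factor function c(·) is data-dependent and not reproduced in this excerpt, so the sketch simply takes precomputed factor values as inputs; the function names are illustrative.

```python
import numpy as np

def rbf_kernel(X1, X2, gamma=1.0):
    """Plain RBF base kernel K(x, x') = exp(-gamma ||x - x'||^2)."""
    d2 = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2 * X1 @ X2.T
    return np.exp(-gamma * d2)

def data_dependent_kernel(X1, X2, c1, c2, gamma=1.0):
    """Conformal transformation of Eq. (6): K~(x, x') = c(x) c(x') K(x, x').
    c1, c2 hold the factor values c(.) at the rows of X1 and X2."""
    return np.outer(c1, c2) * rbf_kernel(X1, X2, gamma)
```

Since the transformation only rescales each row and column by a positive factor, symmetry and positive semi-definiteness of the base kernel are preserved.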

1.3 MSVR model based on the data-dependent kernel

$L(\boldsymbol{\beta}_{\rm aug})=\lambda \boldsymbol{\beta}_{\rm aug}^{\rm T}\boldsymbol{K}_{\rm aug}\boldsymbol{\beta}_{\rm aug}+\sum\limits_{i=1}^{n}L_\varepsilon (\|\boldsymbol{y}_i-\boldsymbol{K}_i^{\rm T}\boldsymbol{\beta}_{\rm aug}\|)$ (7)

$\beta_{i,j}=-\frac{2}{C}\cdot \frac{\partial }{\partial w_j}L_\varepsilon (\|y_i-\boldsymbol{W}\cdot \varphi (x_i)-\boldsymbol{B}\|)$ (8)

$\left\{ \begin{array}{l} L(\boldsymbol{\beta}_{\rm aug})=\lambda \boldsymbol{\beta}_{\rm aug}^{\rm T}\boldsymbol{K}_{\rm aug}\boldsymbol{\beta}_{\rm aug}+\sum\limits_{i=1}^{n}L_\varepsilon (\|y_i-\boldsymbol{K}_i^{\rm T}\boldsymbol{\beta}_{\rm aug}\|) \\ \boldsymbol{w}_j=\sum\limits_{i=1}^{n}\varphi (x_i)\beta_j^i \end{array} \right.$ (9)

$\boldsymbol{\beta}_{\rm aug}^{\rm new}=\boldsymbol{\beta}_{\rm aug}^{\rm old}-\alpha \boldsymbol{H}^{-1}\boldsymbol{G}$ (10)

$L(\boldsymbol{\beta}_{\rm aug})=\lambda \boldsymbol{\beta}_{\rm aug}^{\rm T}\boldsymbol{K}_{\rm aug}\boldsymbol{\beta}_{\rm aug}+\sum\limits_{i=1}^{M_{\rm sv}}(\|y_i-\boldsymbol{K}_i^{\rm T}\boldsymbol{\beta}_{\rm aug}\|-\varepsilon )^2$ (11)

2 Fast learning algorithm

2.1 Optimization algorithm for MSVR based on the data-dependent kernel

2.2 Determining the step size by the Armijo rule

$\Delta f_k=f(x_k)-f(x_k+\alpha_k d_k)>0$ (12)

Armijo rule: given the current iterate $x_k$ and the search direction $d_k$, choose $\beta \in (0, 1)$ and $\sigma \in (0, 0.5)$. The step size factor is $\alpha_k=\beta^{m_k}$, where $m_k$ is the smallest nonnegative integer $m$ satisfying Eq. (13).

$f(x_k+\beta^m d_k)\le f(x_k)+\sigma \beta^m g_k^{\rm T}d_k$ (13)
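The backtracking search of Eq. (13) translates directly into code. A minimal sketch, with illustrative default parameters (the paper's exact β and σ values are not given in this excerpt):

```python
import numpy as np

def armijo_step(f, grad_f, x, d, beta=0.5, sigma=0.1, max_m=50):
    """Armijo rule of Eq. (13): return alpha = beta**m for the smallest
    nonnegative integer m with
    f(x + beta**m d) <= f(x) + sigma * beta**m * g^T d."""
    fx = f(x)
    gd = float(np.dot(grad_f(x), d))
    for m in range(max_m):
        alpha = beta ** m
        if f(x + alpha * d) <= fx + sigma * alpha * gd:
            return alpha
    return beta ** max_m  # fallback; unreachable for a true descent direction
```

For a descent direction ($g_k^{\rm T}d_k<0$), the sufficient-decrease condition is always satisfied for some finite $m$, so the loop terminates; this is what guarantees descent at every iteration.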

2.3 Choosing the iteration direction by the quasi-Newton method

2.3.1 Classical Newton method

$T(f,x_k,3)=f_k+g_k^{\rm T}(x-x_k)+\frac{1}{2}(x-x_k)^{\rm T}G_k(x-x_k)$ (14)

Differentiating with respect to $x$ and setting the derivative to zero yields the stationary point:

$\nabla T=g_k+G_k(x-x_k)=0$ (15)

 ${{x}_{k+1}}={{x}_{k}}-{{G}_{k}}^{-1}{{g}_{k}}$ (16)
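The iteration of Eq. (16) can be prototyped in a few lines. This is a sketch under the usual caveats: the linear solve replaces the explicit inverse $G_k^{-1}$ for numerical stability, and the stopping tolerance is illustrative.

```python
import numpy as np

def newton_minimize(grad, hess, x0, tol=1e-8, max_iter=50):
    """Classical Newton iteration of Eq. (16): x_{k+1} = x_k - G_k^{-1} g_k."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        x = x - np.linalg.solve(hess(x), g)  # solve G_k p = g_k instead of inverting
    return x
```

On a quadratic objective the method reaches the minimizer in a single step, which is the second-order convergence behavior the modified BFGS method below tries to approximate without forming the true Hessian.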

2.3.2 Modified BFGS algorithm

$\boldsymbol{G}_{k+1}\boldsymbol{s}_k\approx \boldsymbol{y}_k$ (17)

$\boldsymbol{B}_{k+1}=\begin{cases} \boldsymbol{B}_k, & \boldsymbol{s}_k^{\rm T}\boldsymbol{y}_k\le 0 \\ \boldsymbol{B}_k-\dfrac{\boldsymbol{B}_k\boldsymbol{y}_k\boldsymbol{y}_k^{\rm T}\boldsymbol{B}_k}{\boldsymbol{y}_k^{\rm T}\boldsymbol{B}_k\boldsymbol{y}_k}+\dfrac{\boldsymbol{s}_k\boldsymbol{s}_k^{\rm T}}{\boldsymbol{s}_k^{\rm T}\boldsymbol{y}_k}, & \boldsymbol{s}_k^{\rm T}\boldsymbol{y}_k>0 \end{cases}$ (18)
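A direct transcription of the safeguarded rank-2 update in Eq. (18); a sketch, not the authors' code. The curvature test $\boldsymbol{s}_k^{\rm T}\boldsymbol{y}_k>0$ is what keeps the updated matrix positive definite, and when it fails the update is skipped.

```python
import numpy as np

def modified_bfgs_update(B, s, y):
    """Rank-2 correction of Eq. (18), with the update skipped whenever the
    curvature condition s^T y > 0 is violated."""
    s, y = s.reshape(-1, 1), y.reshape(-1, 1)
    sy = float(s.T @ y)
    if sy <= 0:
        return B  # skip: keep B_k unchanged
    By = B @ y
    return B - (By @ By.T) / float(y.T @ B @ y) + (s @ s.T) / sy
```

Because both correction terms are symmetric, symmetry of $\boldsymbol{B}_k$ is preserved across updates.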

2.4 Application of the algorithm to the MSVR-DDK model

3 Simulation experiments

$\left\{ \begin{array}{l} y_1=\sin (3x)+\cos (10x) \\ y_2=\sin (y_1/5)+\cos (y_1/2) \end{array} \right.$ (19)
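Generating this two-output test function is straightforward; a sketch, where the evenly spaced grid of 200 points over an assumed range of [0, 2π] is illustrative, since the paper's exact sampling interval is not given in this excerpt:

```python
import numpy as np

def make_samples(n=200):
    """Two-output toy data of Eq. (19): y2 is a nonlinear function of y1,
    so the outputs are coupled, which is the case MSVR targets."""
    x = np.linspace(0.0, 2.0 * np.pi, n)
    y1 = np.sin(3 * x) + np.cos(10 * x)
    y2 = np.sin(y1 / 5) + np.cos(y1 / 2)
    return x, np.column_stack([y1, y2])
```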

Figure 1 Sample data distribution of MSVR-DDK

Figure 2 Iteration time comparison between modified BFGS algorithm and gradient descent method
Figure 3 Iteration time comparison between modified BFGS algorithm and modified Newton method

Figure 4 Iteration time comparison of gradient descent method, modified BFGS and modified Newton method

Figure 5 Iteration time of modified BFGS algorithm under different kernel functions

 $SIR=N/t$ (20)
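Eq. (20) defines the sample iteration rate as the number of samples N divided by the iteration time t. Applying it to the timings reported in the abstract for N = 200 gives a quick comparison of the three methods:

```python
def sir(n_samples, t_seconds):
    """Sample iteration rate of Eq. (20): SIR = N / t."""
    return n_samples / t_seconds

# Iteration times from the abstract, N = 200 samples
for name, t in [("modified BFGS", 72.98),
                ("modified Newton", 116.34),
                ("gradient descent", 2065.22)]:
    print(f"{name}: SIR = {sir(200, t):.3f} samples/s")
```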

4 Conclusion
