Journal of Guangdong University of Technology, 2024, Vol. 41, Issue (1): 79-85. DOI: 10.12052/gdutxb.220027

### Cite This Article

Liu Jin-neng, Xiao Yan-shan, Liu Bo. A Least Squares Twin Support Vector Machine Method with Uncertain Data[J]. JOURNAL OF GUANGDONG UNIVERSITY OF TECHNOLOGY, 2024, 41(1): 79-85. DOI: 10.12052/gdutxb.220027.


A Least Squares Twin Support Vector Machine Method with Uncertain Data
Liu Jin-neng1, Xiao Yan-shan1, Liu Bo2
1. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China;
2. School of Automation, Guangdong University of Technology, Guangzhou 510006, China
Abstract: The twin support vector machine (TWSVM) solves binary classification problems by learning two nonparallel hyperplanes from two quadratic programming problems. In practical applications, however, the data usually contain uncertain information, which makes it difficult to construct the classification model. This paper proposes a new and efficient uncertain-data-based least squares twin support vector machine (ULSTSVM) method to address the problem of data uncertainty. Firstly, since the data may contain uncertain information, a noise vector is introduced to model the uncertain information of each example. Secondly, the noise vectors are incorporated into the least squares TWSVM. Finally, to solve the derived learning problem, a two-step heuristic framework is employed, in which the least squares TWSVM classifier is trained and the noise vectors are updated alternately. Experiments show that the proposed ULSTSVM outperforms the baselines in training time while achieving comparable classification accuracy. In sum, ULSTSVM adopts noise vectors to model uncertain information and transforms the quadratic programming problems of TWSVM into linear equations, so that both better classification accuracy and higher training efficiency are obtained.
Key words: least squares; twin support vector machine; nonparallel plane learning; data uncertainty; classification

1 Least Squares Twin Support Vector Machine for Uncertain Data

1.1 Linear Model Construction

1.2 Model Solution

1.2.1 Fixing the Noise Vectors and Solving for the Hyperplanes

 $\begin{split} & \min \frac{1}{2}\left\|({\boldsymbol{A}} + \overline {\Delta {\boldsymbol{A}}} ) {{\boldsymbol{\omega}} ^{(1)}} + {{\boldsymbol{e}}_1}{b^{(1)}}\right\|^2 + \frac{{{C_1}}}{2}{( {{{\boldsymbol{\xi}} ^{\left( 1 \right)}}} )^{\text{T}}}{{\boldsymbol{\xi}} ^{\left( 1 \right)}}, \\ & \qquad{\text{s}}{\text{.t}}{\text{.}} - ( {({\boldsymbol{B}} + \overline {\Delta {\boldsymbol{B}}} ) {{\boldsymbol{\omega}} ^{\left( 1 \right)}} + {{\boldsymbol{e}}_2}{b^{(1)}}} ) = {{\boldsymbol{e}}_2} - {{\boldsymbol{\xi}} ^{(1)}} \end{split}$ (3)
 $\begin{split} & \min \frac{1}{2}\left\|({\boldsymbol{B}} + \overline {\Delta {\boldsymbol{B}}} ) {{\boldsymbol{\omega}} ^{(2) }} + {{\boldsymbol{e}}_2}{b^{(2) }}\right\|{^2} + \frac{{{C_2}}}{2}{( {{{\boldsymbol{\xi}} ^{\left( 2 \right) }}} ) ^{\text{T}}}{{\boldsymbol{\xi}} ^{\left( 2 \right) }}, \\ & \qquad{\text{s}}{\text{.t}}.{\text{ }}({\boldsymbol{A}} + \overline {\Delta {\boldsymbol{A}}} ) {{\boldsymbol{\omega}} ^{\left( 2 \right) }} + {{\boldsymbol{e}}_1}{b^{(2) }} = {{\boldsymbol{e}}_1} - {{\boldsymbol{\xi}} ^{(2) }} \end{split}$ (4)

 $\begin{gathered} \min \frac{1}{2}\left\|({\boldsymbol{A}} + \Delta {\boldsymbol{A}}) {{\boldsymbol{\omega}} ^{(1) }} + {{\boldsymbol{e}}_1}{b^{(1) }}\right\|{^2} + \frac{{{C_1}}}{2}\big\|({\boldsymbol{B}} + \Delta {\boldsymbol{B}}) {{\boldsymbol{\omega}} ^{\left( 1 \right) }} + \\ \qquad{{\boldsymbol{e}}_2}{b^{(1) }} + {{\boldsymbol{e}}_2}\big\|{^2} \\ \end{gathered}$ (5)

 $\left( {\begin{array}{*{20}{c}} {{{\boldsymbol{\omega}} ^{\left( 1 \right)}}} \\ {{b^{\left( 1 \right)}}} \end{array}} \right) = - {\left( {{{\boldsymbol{F}}^{\text{T}}}{\boldsymbol{F}} + \frac{1}{{{C_1}}}{{\boldsymbol{E}}^{\text{T}}}{\boldsymbol{E}}} \right)^{ - 1}}{{\boldsymbol{F}}^{\text{T}}}{{\boldsymbol{e}}_2}$ (6)

 $\left( {\begin{array}{*{20}{c}} {{{\boldsymbol{\omega}} ^{\left( 2 \right)}}} \\ {{b^{\left( 2 \right)}}} \end{array}} \right) = {\left( {{{\boldsymbol{E}}^{\text{T}}}{\boldsymbol{E}} + \frac{1}{{{C_2}}}{{\boldsymbol{F}}^{\text{T}}}{\boldsymbol{F}}} \right)^{ - 1}}{{\boldsymbol{E}}^{\text{T}}}{{\boldsymbol{e}}_1}$ (7)

where ${\boldsymbol{E}} = [{\boldsymbol{A}} + \Delta {\boldsymbol{A}}\;\;{{\boldsymbol{e}}_1}]$ and ${\boldsymbol{F}} = [{\boldsymbol{B}} + \Delta {\boldsymbol{B}}\;\;{{\boldsymbol{e}}_2}]$.
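For fixed noise matrices, the two least squares subproblems reduce to plain linear systems. The following NumPy sketch illustrates the computation (our own illustration, not the authors' code; the augmented matrices $E = [A+\Delta A,\ e_1]$ and $F = [B+\Delta B,\ e_2]$ follow the standard LSTSVM construction):

```python
import numpy as np

def solve_hyperplanes(A, B, dA, dB, C1, C2):
    """Solve the two least squares subproblems for fixed noise matrices.

    Sketch only: assumes E = [A + dA, e1] and F = [B + dB, e2] as in the
    standard LSTSVM construction; z2 follows Eq. (7).
    """
    m1, m2 = A.shape[0], B.shape[0]
    E = np.hstack([A + dA, np.ones((m1, 1))])   # positive-class augmented matrix
    F = np.hstack([B + dB, np.ones((m2, 1))])   # negative-class augmented matrix
    e1, e2 = np.ones(m1), np.ones(m2)
    # Plane 1: z1 = -(F^T F + (1/C1) E^T E)^{-1} F^T e2
    z1 = -np.linalg.solve(F.T @ F + (1.0 / C1) * E.T @ E, F.T @ e2)
    # Plane 2 (Eq. (7)): z2 = (E^T E + (1/C2) F^T F)^{-1} E^T e1
    z2 = np.linalg.solve(E.T @ E + (1.0 / C2) * F.T @ F, E.T @ e1)
    w1, b1 = z1[:-1], z1[-1]
    w2, b2 = z2[:-1], z2[-1]
    return w1, b1, w2, b2
```

Each hyperplane thus costs one dense solve of size $(n+1)\times(n+1)$, which is the source of the training-time advantage over quadratic programming.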
1.2.2 Fixing the Hyperplanes and Solving for the Noise Vectors

 $\begin{split} {\nabla _{\Delta {\boldsymbol{A}}}} =& \left( {({\boldsymbol{A}} + \Delta {\boldsymbol{A}})\overline {{{\boldsymbol{\omega }}^{(1)}}} + {{\boldsymbol{e}}_1}\overline {{b^{(1)}}} } \right){\left( {\overline {{{\boldsymbol{\omega }}^{(1)}}} } \right)^{\text{T}}} - {C_2}\big( {{\boldsymbol{e}}_1} - \\ & ({\boldsymbol{A}} + \Delta {\boldsymbol{A}})\overline {{{\boldsymbol{\omega }}^{(2)}}} - {{\boldsymbol{e}}_1}\overline {{b^{(2)}}} \big){\left( {\overline {{{\boldsymbol{\omega}} ^{(2)}}} } \right)^{\text{T}}} \end{split}$ (12)
 $\begin{split} {\nabla _{\Delta {\boldsymbol{B}}}} =& \left( {({\boldsymbol{B}} + \Delta {\boldsymbol{B}})\overline {{{\boldsymbol{\omega }}^{(2)}}} + {{\boldsymbol{e}}_2}\overline {{b^{(2)}}} } \right){\left( {\overline {{{\boldsymbol{\omega }}^{(2)}}} } \right)^{\text{T}}} + {C_1}\big( {{\boldsymbol{e}}_2} + \\ & ({\boldsymbol{B}} + \Delta {\boldsymbol{B}})\overline {{{\boldsymbol{\omega}} ^{(1)}}} + {{\boldsymbol{e}}_2}\overline {{b^{(1)}}} \big){\left( {\overline {{{\boldsymbol{\omega }}^{(1)}}} } \right)^{\text{T}}} \end{split}$ (13)
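The gradients with respect to the noise matrices translate directly into matrix operations. A non-authoritative NumPy sketch, with the slack definitions $\xi^{(1)} = e_2 + (B+\Delta B)\omega^{(1)} + e_2 b^{(1)}$ and $\xi^{(2)} = e_1 - (A+\Delta A)\omega^{(2)} - e_1 b^{(2)}$ taken from the constraints of problems (3) and (4) (all function and variable names are ours):

```python
import numpy as np

def noise_gradients(A, B, dA, dB, w1, b1, w2, b2, C1, C2):
    """Gradients of the joint least squares objective w.r.t. the noise
    matrices, following the structure of Eqs. (12)-(13)."""
    m1, m2 = A.shape[0], B.shape[0]
    e1, e2 = np.ones(m1), np.ones(m2)
    # gradient w.r.t. dA
    r1 = (A + dA) @ w1 + e1 * b1            # residual of plane 1 on its own class
    xi2 = e1 - (A + dA) @ w2 - e1 * b2      # slack of class +1 w.r.t. plane 2
    grad_dA = np.outer(r1, w1) - C2 * np.outer(xi2, w2)
    # gradient w.r.t. dB
    r2 = (B + dB) @ w2 + e2 * b2
    xi1 = e2 + (B + dB) @ w1 + e2 * b1
    grad_dB = np.outer(r2, w2) + C1 * np.outer(xi1, w1)
    return grad_dA, grad_dB
```

A finite-difference check on the joint objective confirms the sign conventions used here.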
1.2.3 Heuristic Framework

Input: data matrices ${\boldsymbol{A}}$, ${\boldsymbol{B}}$; bound parameters ${\delta _{1i}}$, ${\delta _{2j}}$ of the instances
Output: ${{\boldsymbol{\omega }}^{(1)}}$, ${b^{(1)}}$, ${{\boldsymbol{\omega }}^{(2)}}$, ${b^{(2)}}$

1 Construct the data matrices ${\boldsymbol{A}}$ and ${\boldsymbol{B}}$;

2 Initialize $t = 0$, ${F_{{\rm{val}}}}(t) = \infty$;

3 Repeat:

4 　　$t = t + 1$;

5 　　if ($t = 1$):

6 　　　　Initialize $\Delta {\boldsymbol{A}} = {\bf{0}}$ and $\Delta {\boldsymbol{B}} = {\bf{0}}$;

7 　　else:

8 　　　　Update $\Delta {\boldsymbol{A}}$ and $\Delta {\boldsymbol{B}}$ by Eqs. (10) and (11);

9 　　Substitute $\Delta {\boldsymbol{A}}$ and $\Delta {\boldsymbol{B}}$ into problems (3) and (4) and solve them to obtain ${{\boldsymbol{\omega }}^{(1)}}$, ${b^{(1)}}$, ${{\boldsymbol{\omega }}^{(2)}}$ and ${b^{(2)}}$;

10 　Compute the objective value ${F_{{\text{val}}}}(t)$ of problem (1);

11 Until $|{F_{{\text{val}}}}(t) - {F_{{\text{val}}}}(t - 1)| < \varepsilon {F_{\max }}$

12 Return ${{\boldsymbol{\omega }}^{(1)}}$, ${b^{(1)}}$, ${{\boldsymbol{\omega }}^{(2)}}$, ${b^{(2)}}$

1.3 Nonlinear Kernel Functions

Inspired by the nonlinear twin support vector machine, the linear ULSTSVM is extended here to the nonlinear case, with the aim of learning the two kernel-generated hyperplanes (14) and (15).

 $K({{\boldsymbol{x}}^{\text{T}}} + \Delta {\boldsymbol{x}},{{\boldsymbol{C}}^{\text{T}}}){{\boldsymbol{u}}^{(1)}} + {b^{(1)}} = 0$ (14)

 $K({{\boldsymbol{x}}^{\text{T}}} + \Delta {\boldsymbol{x}},{{\boldsymbol{C}}^{\text{T}}}){{\boldsymbol{u}}^{(2)}} + {b^{(2)}} = 0$ (15)

where ${\boldsymbol{C}} = \left( {\begin{array}{*{20}{c}} {{\boldsymbol{A}} + \Delta {\boldsymbol{A}}} \\ {{\boldsymbol{B}} + \Delta {\boldsymbol{B}}} \end{array}} \right)$ and $K$ is an arbitrary kernel function.

In problem (1), $({\boldsymbol{A}} + \Delta {\boldsymbol{A}}){{\boldsymbol{\omega}} ^{\left( 1 \right)}}$, $\left( {{\boldsymbol{B}} + \Delta {\boldsymbol{B}}} \right){{\boldsymbol{\omega}} ^{\left( 2 \right)}}$, $( {{\boldsymbol{B}} + \Delta {\boldsymbol{B}}} ){{\boldsymbol{\omega }}^{\left( 1 \right)}}$ and $({\boldsymbol{A}} + \Delta {\boldsymbol{A}}){{\boldsymbol{\omega }}^{\left( 2 \right)}}$ are replaced by $K({\boldsymbol{A}} + \Delta {\boldsymbol{A}},{{\boldsymbol{C}}^{\text{T}}}){{\boldsymbol{u}}^{\left( 1 \right)}}$, $K({\boldsymbol{B}} + \Delta {\boldsymbol{B}},{{\boldsymbol{C}}^{\text{T}}}){{\boldsymbol{u}}^{\left( 2 \right)}}$, $K({\boldsymbol{B}} + \Delta {\boldsymbol{B}},{{\boldsymbol{C}}^{\text{T}}}){{\boldsymbol{u}}^{\left( 1 \right)}}$ and $K({\boldsymbol{A}} + \Delta {\boldsymbol{A}},{{\boldsymbol{C}}^{\text{T}}}){{\boldsymbol{u}}^{\left( 2 \right)}}$, respectively. The optimization problem of the nonlinear least squares twin support vector machine for uncertain data is

 $\begin{array}{l} \min \dfrac{1}{2}{\left\| {K({\boldsymbol{A}} + \Delta {\boldsymbol{A}},{{\boldsymbol{C}}^{\text{T}}}){{\boldsymbol{u}}^{(1)}} + {{\boldsymbol{e}}_1}{b^{(1)}}} \right\|^2} + \dfrac{1}{2}\big\| K({\boldsymbol{B}} + \Delta {\boldsymbol{B}},{{\boldsymbol{C}}^{\text{T}}}){{\boldsymbol{u}}^{(2)}} + \\ \qquad {{\boldsymbol{e}}_2}{b^{(2)}}\big\|^2 + \dfrac{{{C_1}}}{2}{({{\boldsymbol{\xi }}^{\left(1\right)}})^{\text{T}}}{{\boldsymbol{\xi }}^{\left(1\right)}} + \dfrac{{{C_2}}}{2}{({{\boldsymbol{\xi }}^{\left(2\right)}})^{\text{T}}}{{\boldsymbol{\xi}} ^{\left(2\right)}}, \\ {\text{s}}{\text{.t}}{\text{.}}\left\{ \begin{array}{l} - (K({\boldsymbol{B}} + \Delta {\boldsymbol{B}},{{\boldsymbol{C}}^{{\rm{T}}}}){{\boldsymbol{u}}}^{\left(1\right)} + {{\boldsymbol{e}}}_{2}{b}^{(1)}) = {{\boldsymbol{e}}}_{2} - {{\boldsymbol{\xi }}}^{(1)} \\ K({\boldsymbol{A}} + \Delta {\boldsymbol{A}},{{\boldsymbol{C}}^{{\rm{T}}}}){{\boldsymbol{u}}}^{\left(2\right)} + {{\boldsymbol{e}}}_{1}{b}^{(2)} = {{\boldsymbol{e}}}_{1} - {{\boldsymbol{\xi }}}^{(2)} \\ ||\Delta {{\boldsymbol{A}}}_{i}|| \leqslant {\delta }_{1i},\;i = 1,\cdots ,{m}_{1} \\ ||\Delta {{\boldsymbol{B}}}_{j}|| \leqslant {\delta }_{2j},\;j = 1,\cdots ,{m}_{2} \end{array} \right. \end{array}$ (16)

In the first step, $\Delta {\boldsymbol{A}}$ and $\Delta {\boldsymbol{B}}$ are fixed, and the solution of problem (16) is

 $\left( {\begin{array}{*{20}{c}} {{{\boldsymbol{u}}^{\left( 1 \right) }}} \\ {{b^{\left( 1 \right) }}} \end{array}} \right) = - {\left( {{{\boldsymbol{H}}^{\text{T}}}{\boldsymbol{H}} + \frac{1}{{{C_1}}}{{\boldsymbol{G}}^{\text{T}}}{\boldsymbol{G}}} \right) ^{ - 1}}{{\boldsymbol{H}}^{\text{T}}}{{\boldsymbol{e}}_2}$ (17)
 $\left( {\begin{array}{*{20}{c}} {{{\boldsymbol{u}}^{\left( 2 \right)}}} \\ {{b^{\left( 2 \right)}}} \end{array}} \right) = {\left( {{{\boldsymbol{G}}^{\text{T}}}{\boldsymbol{G}} + \frac{1}{{{C_2}}}{{\boldsymbol{H}}^{\text{T}}}{\boldsymbol{H}}} \right)^{ - 1}}{{\boldsymbol{G}}^{\text{T}}}{{\boldsymbol{e}}_1}$ (18)

where ${\boldsymbol{G}} = [K({\boldsymbol{A}} + \Delta {\boldsymbol{A}},{{\boldsymbol{C}}^{\text{T}}})\;\;{{\boldsymbol{e}}_1}]$ and ${\boldsymbol{H}} = [K({\boldsymbol{B}} + \Delta {\boldsymbol{B}},{{\boldsymbol{C}}^{\text{T}}})\;\;{{\boldsymbol{e}}_2}]$.
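As in the linear case, Eqs. (17) and (18) reduce to two dense linear systems once the kernel matrices are formed. A minimal NumPy sketch (the RBF kernel choice, the small ridge term added for numerical stability, and all names are our illustrative assumptions, not the paper's code):

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """K(X, Y^T) with an RBF kernel; any kernel K works in Eqs. (17)-(18)."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def solve_kernel_hyperplanes(A, B, dA, dB, C1, C2, gamma=1.0):
    m1, m2 = A.shape[0], B.shape[0]
    C = np.vstack([A + dA, B + dB])                       # row-stacked matrix C
    G = np.hstack([rbf_kernel(A + dA, C, gamma), np.ones((m1, 1))])
    H = np.hstack([rbf_kernel(B + dB, C, gamma), np.ones((m2, 1))])
    e1, e2 = np.ones(m1), np.ones(m2)
    # Small ridge: the Gram systems are rank-deficient in the kernel case;
    # this regularizer is our addition, not part of Eqs. (17)-(18).
    reg = 1e-8 * np.eye(G.shape[1])
    # Eq. (17): z1 = -(H^T H + (1/C1) G^T G)^{-1} H^T e2
    z1 = -np.linalg.solve(H.T @ H + (1.0 / C1) * G.T @ G + reg, H.T @ e2)
    # Eq. (18): z2 = (G^T G + (1/C2) H^T H)^{-1} G^T e1
    z2 = np.linalg.solve(G.T @ G + (1.0 / C2) * H.T @ H + reg, G.T @ e1)
    return z1[:-1], z1[-1], z2[:-1], z2[-1]
```

Note that the systems are now of size $(m_1+m_2+1)$, so the kernel variant trades feature dimension for sample size.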

 $\begin{split} {\nabla _{\Delta {\boldsymbol{A}}}} =& \left( {K({\boldsymbol{A}} + \Delta {\boldsymbol{A}},{{\boldsymbol{C}}^{\rm{T}}})\overline {{{\boldsymbol{u}}^{(1)}}} + {{\boldsymbol{e}}_1}\overline {{b^{(1)}}} } \right){\overline {{{\boldsymbol{u}}^{(1)}}} ^{\rm{T}}} \times \\ & K'{({\boldsymbol{A}} + \Delta {\boldsymbol{A}},{{\boldsymbol{C}}^{\rm{T}}})^{\rm{T}}} - {C_2}\big( {{\boldsymbol{e}}_1} - K({\boldsymbol{A}} + \Delta {\boldsymbol{A}},{{\boldsymbol{C}}^{\rm{T}}})\overline {{{\boldsymbol{u}}^{(2)}}} - \\ & {{\boldsymbol{e}}_1}\overline {{b^{(2)}}} \big){\overline {{{\boldsymbol{u}}^{(2)}}} ^{\rm{T}}}K'{({\boldsymbol{A}} + \Delta {\boldsymbol{A}},{{\boldsymbol{C}}^{\rm{T}}})^{\rm{T}}} \end{split}$ (23)
 $\begin{split} {\nabla _{\Delta {\boldsymbol{B}}}} =& (K({\boldsymbol{B}} + \Delta {\boldsymbol{B}},{{\boldsymbol{C}}^{\text{T}}}) \overline {{{{\boldsymbol{u}}}^{(2) }}} + {{\boldsymbol{e}}_2}\overline {{b^{(2) }}} ) {\overline {{{{{\boldsymbol{u}}}}^{(2) }}} ^{\text{T}}} \times \\ & K'{({\boldsymbol{B}} + \Delta {\boldsymbol{B}},{{\boldsymbol{C}}^{\text{T}}}) ^{\text{T}}} + {C_1}(K({\boldsymbol{B}} + \Delta {\boldsymbol{B}},{{\boldsymbol{C}}^{\text{T}}}) \overline {{{{{\boldsymbol{u}}}}^{(1) }}}+ \\ &{{\boldsymbol{e}}_2}\overline {{b^{(1) }}} + {{\boldsymbol{e}}_2}) {\overline {{{{{\boldsymbol{u}}}}^{(1) }}} ^{\text{T}}} K'{({\boldsymbol{B}} + \Delta {\boldsymbol{B}},{{\boldsymbol{C}}^{\text{T}}}) ^{\text{T}}} \end{split}$ (24)
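Putting the pieces together, the two-step heuristic framework of Section 1.2.3 can be sketched for the linear case as follows. The step size, the row-wise projection onto the norm balls, and the stopping constants are illustrative assumptions of ours; the paper's Eqs. (10) and (11) define the actual noise update:

```python
import numpy as np

def ulstsvm_train(A, B, delta1, delta2, C1=1.0, C2=1.0, lr=0.01, eps=1e-4, max_iter=50):
    """Alternate between closed-form hyperplane solving and a projected
    gradient step on the noise matrices (a sketch of the two-step loop)."""
    m1, m2 = A.shape[0], B.shape[0]
    e1, e2 = np.ones(m1), np.ones(m2)
    dA, dB = np.zeros_like(A), np.zeros_like(B)   # t = 1 initialization
    F_prev, F_max = np.inf, None
    for t in range(max_iter):
        # Step 1: fixed noise -> closed-form hyperplanes (cf. Eq. (7))
        E = np.hstack([A + dA, np.ones((m1, 1))])
        Fm = np.hstack([B + dB, np.ones((m2, 1))])
        z1 = -np.linalg.solve(Fm.T @ Fm + (1/C1) * E.T @ E, Fm.T @ e2)
        z2 = np.linalg.solve(E.T @ E + (1/C2) * Fm.T @ Fm, E.T @ e1)
        w1, b1, w2, b2 = z1[:-1], z1[-1], z2[:-1], z2[-1]
        # Step 2: fixed hyperplanes -> gradient step on noise (cf. Eqs. (12)-(13))
        gA = np.outer((A + dA) @ w1 + e1*b1, w1) - C2*np.outer(e1 - (A + dA) @ w2 - e1*b2, w2)
        gB = np.outer((B + dB) @ w2 + e2*b2, w2) + C1*np.outer(e2 + (B + dB) @ w1 + e2*b1, w1)
        dA, dB = dA - lr * gA, dB - lr * gB
        # project each row back into its norm ball ||dA_i|| <= delta1_i
        for d, bound in ((dA, delta1), (dB, delta2)):
            norms = np.linalg.norm(d, axis=1)
            over = norms > bound
            d[over] *= (bound[over] / norms[over])[:, None]
        # objective value F_val(t) of problem (1) and stopping test
        F_val = (0.5*np.sum(((A + dA) @ w1 + e1*b1)**2)
                 + 0.5*np.sum(((B + dB) @ w2 + e2*b2)**2)
                 + C1/2*np.sum((e2 + (B + dB) @ w1 + e2*b1)**2)
                 + C2/2*np.sum((e1 - (A + dA) @ w2 - e1*b2)**2))
        if F_max is None:
            F_max = F_val
        if abs(F_val - F_prev) < eps * F_max:
            break
        F_prev = F_val
    return w1, b1, w2, b2, dA, dB
```

A test point is then assigned to the class whose hyperplane it lies closer to, as in standard TWSVM.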
2 Experiments and Analysis of Results

2.1 Baseline Methods and Experimental Setup

2.2 UCI Datasets

The UCI datasets include "Hepatitis", "WPBC", "BUPA liver", "Heart-Statlog", "Sonar", "Votes", "Australian", "German", "Breast" and "Pima". Ten-fold cross-validation is used to record the results: each dataset is split into ten equal parts; in each round, nine parts form the training set and the remaining part serves as the test set, and the average accuracy is recorded. The uncertain information in the UCI datasets is generated following [19-21]. For each dimension, the standard deviation ${\sigma _i}$ of the dataset is computed and a deviation value is drawn from the range [0, 2${\sigma _i}$]; the deviation value generated for the $i$-th dimension is then used to produce the noisy data, so that the noise differs across dimensions. Noise is randomly added to 40% of the training instances. After the training data are perturbed, the four methods ULSTSVM, LSTSVM, R-TWSVM and TSVC are compared on the UCI datasets. The classification accuracy under the nonlinear RBF kernel is shown in Table 1.
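The per-dimension noise generation described above can be sketched as follows (a non-authoritative illustration of the procedure in [19-21]; the uniform perturbation and all names are our assumptions, since the text only fixes the deviation range):

```python
import numpy as np

def add_uncertain_noise(X, frac=0.4, rng=None):
    """Perturb a random `frac` share of the instances with per-dimension noise.

    For each dimension i a deviation value is drawn from [0, 2*sigma_i],
    where sigma_i is that dimension's standard deviation; the selected
    instances are then shifted by uniform noise of that magnitude.
    """
    rng = np.random.default_rng(rng)
    X = X.copy()
    m, n = X.shape
    sigma = X.std(axis=0)
    dev = rng.uniform(0.0, 2.0 * sigma)          # one deviation value per dimension
    idx = rng.choice(m, size=int(frac * m), replace=False)
    X[idx] += rng.uniform(-1.0, 1.0, size=(len(idx), n)) * dev
    return X, idx
```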

2.3 Real-world Uncertain Dataset

Figure 1 shows the F1-measure values of ULSTSVM and the three baseline methods on the LDPA dataset. ULSTSVM clearly obtains the highest F1 value on the LDPA dataset. Specifically, LSTSVM has the lowest F1 value because it does not take data uncertainty into account. In addition, ULSTSVM achieves classification performance comparable to R-TWSVM and TSVC. The noise vector $\Delta {\boldsymbol{x}}$ introduced in this paper models the uncertain information, the optimization problem is solved with the two-step heuristic framework, and the effectiveness of the method is demonstrated on the LDPA dataset.

Figure 1 F1-measure values of different methods on the LDPA dataset
2.4 Different Noise Levels

Figure 2 Classification accuracy of different methods on the datasets with different levels of noise
2.5 Training Time

Figure 3 Training time of different methods on the UCI datasets
3 Conclusion

References

[1] CORTES C, VAPNIK V. Support-vector networks[J]. Machine Learning, 1995, 20(3): 273-297.

[2] LI S, KWOK J T, ZHU H, et al. Texture classification using the support vector machines[J]. Pattern Recognition, 2003, 36(12): 2883-2893. DOI: 10.1016/S0031-3203(03)00219-X.

[3] KHAN N M, KSANTINI R, AHMAD I S, et al. A novel SVM+NDA model for classification with an application to face recognition[J]. Pattern Recognition, 2012, 45(1): 66-79. DOI: 10.1016/j.patcog.2011.05.004.

[4] BILAL M. Algorithmic optimisation of histogram intersection kernel support vector machine-based pedestrian detection using low complexity features[J]. IET Computer Vision, 2017, 11(5): 350-357. DOI: 10.1049/iet-cvi.2016.0403.

[5] YUAN L, YAO E, TAN G. Automated and precise event detection method for big data in biomedical imaging with support vector machine[J]. International Journal of Computer Systems Science & Engineering, 2018, 33(2): 105-113.

[6] MANGASARIAN O L, WILD E W. Multisurface proximal support vector machine classification via generalized eigenvalues[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005, 28(1): 69-74.

[7] KHEMCHANDANI R, CHANDRA S. Twin support vector machines for pattern classification[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, 29(5): 905-910. DOI: 10.1109/TPAMI.2007.1068.

[8] CHEN X, YANG J, YE Q, et al. Recursive projection twin support vector machine via within-class variance minimization[J]. Pattern Recognition, 2011, 44(10-11): 2643-2655. DOI: 10.1016/j.patcog.2011.03.001.

[9] SHAO Y H, ZHANG C H, WANG X B, et al. Improvements on twin support vector machines[J]. IEEE Transactions on Neural Networks, 2011, 22(6): 962-968. DOI: 10.1109/TNN.2011.2130540.

[10] KUMAR M A, GOPAL M. Least squares twin support vector machines for pattern classification[J]. Expert Systems with Applications, 2009, 36(4): 7535-7543. DOI: 10.1016/j.eswa.2008.09.066.

[11] KUMAR M A, KHEMCHANDANI R, GOPAL M, et al. Knowledge based least squares twin support vector machines[J]. Information Sciences, 2010, 180(23): 4606-4618. DOI: 10.1016/j.ins.2010.07.034.

[12] TANVEER M, SHARMA S, MUHAMMAD K. Large-scale least squares twin SVMs[J]. ACM Transactions on Internet Technology (TOIT), 2021, 21(2): 1-19.

[13] CHEN S G, WU X J, XU J. Locality preserving projection least squares twin support vector machine for pattern classification[J]. Pattern Analysis and Applications, 2020, 23(1): 1-13. DOI: 10.1007/s10044-018-0728-x.

[14] QI Z, TIAN Y, SHI Y. Robust twin support vector machine for pattern classification[J]. Pattern Recognition, 2013, 46(1): 305-316. DOI: 10.1016/j.patcog.2012.06.019.

[15] MAIRAL J, BACH F, PONCE J, et al. Online dictionary learning for sparse coding[C]//Proceedings of the 26th Annual International Conference on Machine Learning. New York: ACM, 2009: 689-696.

[16] ZHAO B, WANG F, ZHANG C. Efficient multiclass maximum margin clustering[C]//Proceedings of the 25th International Conference on Machine Learning. New York: ACM, 2008: 1248-1255.

[17] LIU B, XIAO Y, CAO L, et al. One-class-based uncertain data stream learning[C]//Proceedings of the 2011 SIAM International Conference on Data Mining. Philadelphia: SIAM, 2011: 992-1003.

[18] BI J, ZHANG T. Support vector classification with input data uncertainty[C]//Neural Information Processing Systems. Cambridge: MIT Press, 2004: 161-168.

[19] TSANG S, KAO B, YIP K Y, et al. Decision trees for uncertain data[J]. IEEE Transactions on Knowledge and Data Engineering, 2009, 23(1): 64-78.

[20] REN J, LEE S D, CHEN X, et al. Naive Bayes classification of uncertain data[C]//2009 Ninth IEEE International Conference on Data Mining. Piscataway: IEEE, 2009: 944-949.

[21] GAO C, WANG J. Direct mining of discriminative patterns for classifying uncertain data[C]//Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2010: 861-870.

[22] YIN J, YANG Q, PAN J J. Sensor-based abnormal human-activity detection[J]. IEEE Transactions on Knowledge and Data Engineering, 2008, 20(8): 1082-1090. DOI: 10.1109/TKDE.2007.1042.