An Updated Projection Twin Support Vector Machine for Classification

Abstract. Based on the projection twin support vector machine (PTSVM) and its extensions, this paper describes an updated PTSVM (UPTSVM) for classification. Compared with existing PTSVMs, UPTSVM has three advantages. First, like the standard support vector machine (SVM), UPTSVM keeps its optimization problems consistent between the linear and nonlinear cases, so the nonlinear formulations can be obtained directly from the linear ones; existing PTSVMs lose this consistency because they use an empirical kernel to construct their nonlinear formulations. Second, UPTSVM avoids inverting kernel matrices when solving its dual problems, which both reduces computing time and saves storage space. Third, UPTSVM can be proved practically equivalent to the PTSVM with regularization (RPTSVM). Experimental results on a number of data sets demonstrate the merits of the proposed method.


Introduction
Recently, the projection twin support vector machine (PTSVM) [1], which is built on the multiweight vector support vector machine (MVSVM) [2] and the twin support vector machine (TWSVM) [3], has attracted increasing attention from researchers. For binary classification problems, PTSVM seeks two projection axes, as MVSVM does, and formulates two small-scale optimization problems, similar to TWSVM. Experimental results in [1] indicate that PTSVM achieves generalization ability comparable to MVSVM and TWSVM in some respects.
To further improve the generalization ability of PTSVM, a method named PTSVM with a regularization term (RPTSVM) was proposed in [4]. Compared with PTSVM, RPTSVM has better classification capability because it minimizes the structural risk, as the standard SVM does [5]. To strengthen the local learning ability of PTSVM, a weighted PTSVM (WPTSVM) [6] was proposed by adding a weight for each sample to the primal formulations of PTSVM. Experimental results validate that WPTSVM obtains better classification performance on manifold data sets. Shao et al. proposed a least squares PTSVM (LSPTSVM) [7], which greatly reduces computing time by changing the inequality constraints in PTSVM to equalities and turning the L1 norm in the objective functions of PTSVM into the L2 norm. Just as WPTSVM improves PTSVM, Hua et al. presented a weighted LSPTSVM with local information (LIWLSPTSVM) [8] to improve the local learning ability of LSPTSVM.
However, PTSVM and its extensions cannot avoid inverting kernel matrices during training, which means they need more computing time and storage space to solve their dual problems. In addition, existing PTSVMs use an empirical kernel to construct their nonlinear formulations, which leads to an inconsistency between the linear and nonlinear optimization problems. To overcome these two defects, we propose an updated PTSVM (UPTSVM) in this paper.
The rest of this paper is organized as follows. Section 2 briefly reviews RPTSVM. Section 3 describes our new model, and section 4 reports the experimental results. Section 5 concludes the paper.

Projection twin support vector machine with regularization term
Let us consider a binary classification problem whose training samples are organized as an l1×n matrix A of positive samples (class 1) and an l2×n matrix B of negative samples (class 2). Let m1 and m2 denote the mean vectors of the two classes.
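As a concrete illustration of this setup, the two data matrices and the class means can be built as follows; the sample counts, feature dimension, and random data are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

l1, l2, n = 40, 50, 3                    # sample counts and feature dimension (illustrative)
A = rng.normal(loc=1.0, size=(l1, n))    # l1 x n matrix of positive samples (class 1)
B = rng.normal(loc=-1.0, size=(l2, n))   # l2 x n matrix of negative samples (class 2)

m1 = A.mean(axis=0)                      # mean vector of class 1
m2 = B.mean(axis=0)                      # mean vector of class 2
```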
With the two projection axes w1 and w2, a new sample x is assigned to the class

Class(x) = arg min_{i=1,2} |w_i^T (x - m_i)| / ||w_i||,  (7)

where m_i is the mean of the ith class samples and |·| denotes the absolute value.
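The decision rule can be sketched directly: project the centered sample onto each axis and pick the class whose mean it stays closest to. The axes and means below are toy values chosen by hand, not learned by any PTSVM variant.

```python
import numpy as np

def predict(x, w1, w2, m1, m2):
    """Assign x to the class whose projection axis keeps it closest to that class mean."""
    d1 = abs(w1 @ (x - m1)) / np.linalg.norm(w1)
    d2 = abs(w2 @ (x - m2)) / np.linalg.norm(w2)
    return 1 if d1 <= d2 else 2

# hand-picked toy axes and class means, purely for illustration
w1, w2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
m1, m2 = np.array([0.0, 0.0]), np.array([5.0, 5.0])
label = predict(np.array([0.1, 3.0]), w1, w2, m1, m2)
```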

Linear UPTSVM
In the linear case, the optimization problems of UPTSVM are formulated as problems (8) and (9), in which ξ and ξ* denote the slack vectors.
Consider formulation (8). Compared with formulation (1), the only change is the choice of the upper bounds. Obviously, the equivalence between formulations (1) and (8) can easily be proved, and a similar conclusion holds for formulation (9). To obtain the solutions of (8), we construct the Lagrangian function (10) with nonnegative Lagrangian multipliers for its constraints. The KKT conditions for (8) are then given by (11)-(13). Substituting (11) and (12) into (10) and combining with (13) yields the dual problem (14).
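SVM-type duals such as (14) share the generic shape of a box-constrained quadratic program, min_a 0.5·a^T Q a - e^T a subject to 0 ≤ a ≤ C, which can be solved by simple projected gradient descent. The matrices Q and e below are random stand-ins, not the actual UPTSVM dual matrices.

```python
import numpy as np

rng = np.random.default_rng(1)
m = 6
Mmat = rng.normal(size=(m, m))
Q = Mmat @ Mmat.T + 1e-3 * np.eye(m)   # symmetric positive definite stand-in
e = np.ones(m)
C = 1.0

a = np.zeros(m)
step = 1.0 / np.linalg.norm(Q, 2)      # safe step size: 1 / largest eigenvalue of Q
for _ in range(2000):
    # gradient step on 0.5*a^T Q a - e^T a, then projection onto the box [0, C]^m
    a = np.clip(a - step * (Q @ a - e), 0.0, C)
```

Since a = 0 is feasible with objective value 0, the monotone descent iterations end at a feasible point with a nonpositive objective.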
With these definitions, the dual problem (14) can be reformulated in the compact form (21).
Similarly, the dual problem of (9) can be generated as where  , *  , and  and are Lagrangian multipliers.It can also be described as ) and and After solving the dual problems ( 21) and ( 23), we can obtain the solutions of w 1 and w 2 .A new point x can be predicted according to 1,2 ( ) arg min(| ( ) |), where m i is the mean of the ith class samples.

Nonlinear UPTSVM
In order to solve linearly inseparable problems, we extend the linear UPTSVM to a nonlinear one. Differently from existing PTSVMs, we formulate the nonlinear dual problems directly by substituting a kernel function K(·,·) for the inner products in the linear formulations (14) and (22), as is done in the standard SVM. The resulting formulations define the nonlinear UPTSVM.
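The substitution amounts to replacing every inner product with a kernel evaluation, i.e. precomputing a Gram matrix. A minimal sketch with the RBF kernel exp(-||x_i - x_j||^2 / γ) used later in the experiments (the sample points are invented):

```python
import numpy as np

def rbf_gram(X, Y, gamma):
    """Gram matrix with K[i, j] = exp(-||x_i - y_j||^2 / gamma)."""
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)   # pairwise squared distances
    return np.exp(-sq / gamma)

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K = rbf_gram(X, X, gamma=2.0)
```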

Comparison of UPTSVM with RPTSVM
Consider the linear UPTSVM. Compared with the dual problems (3) and (4) of RPTSVM, the dual problems (14) and (22) of UPTSVM avoid inverting kernel matrices during training, which not only reduces computing time but also saves storage space. The nonlinear UPTSVM has the same advantage.
Comparing the dual formulations of the nonlinear UPTSVM with those of the linear UPTSVM, the only difference is that the kernel function K(·,·) replaces the inner product. That is, UPTSVM maintains the consistency of its linear and nonlinear primal formulations, apart from the choice of kernel function. RPTSVM, by contrast, uses an empirical kernel to construct its nonlinear formulations [4], which leads to an inconsistency between the linear and nonlinear optimization problems.
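This consistency can be seen numerically: with the linear kernel K(x, y) = x^T y, the kernel Gram matrix coincides entry by entry with the inner-product matrix of the linear formulation, so the nonlinear problem collapses to the linear one. The data below are random and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(5, 3))

linear_gram = X @ X.T                                     # inner products of the linear case
kernel_gram = np.array([[x @ y for y in X] for x in X])   # K(x, y) = x^T y, entry by entry
```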

Experimental results
To compare the performance of our UPTSVM with PTSVM and RPTSVM, we conduct experiments on a collection of standard datasets used in [13], [14], [15]. All of the classification methods are implemented in Matlab 7.0 on a computer equipped with an Intel P4 processor (2.3 GHz) and 2 GB of RAM. For brevity, we set C1 = C2 for PTSVM, and C1 = C3 and C2 = C4 for RPTSVM and our UPTSVM. The parameter values are selected from the range {2^i | i = -8, -6, ..., +8}. Table 1 shows the average 10-fold cross-validation results. It shows that the classification ability of our UPTSVM is comparable to that of PTSVM and RPTSVM. However, UPTSVM has higher computational complexity, even though it does not need to compute large inverse matrices. This is mainly because UPTSVM has more variables (2l1 + l2 or 2l2 + l1) than PTSVM and RPTSVM (l1 or l2) during the learning process.
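The model-selection procedure, picking a penalty parameter from {2^i} by 10-fold cross-validation, can be sketched as below. The classifier here is a simple regularized least-squares stand-in, not UPTSVM itself, and the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(1.0, 1.0, (30, 4)), rng.normal(-1.0, 1.0, (30, 4))])
y = np.array([1.0] * 30 + [-1.0] * 30)

def fit(Xtr, ytr, C):
    # regularized least-squares classifier: a simple stand-in for the method being tuned
    n = Xtr.shape[1]
    return np.linalg.solve(Xtr.T @ Xtr + (1.0 / C) * np.eye(n), Xtr.T @ ytr)

def cv_score(C, k=10):
    """Average k-fold cross-validation accuracy for penalty parameter C."""
    idx = np.random.default_rng(1).permutation(len(y))
    folds = np.array_split(idx, k)
    accs = []
    for i in range(k):
        te = folds[i]
        tr = np.concatenate([f for j, f in enumerate(folds) if j != i])
        w = fit(X[tr], y[tr], C)
        accs.append(np.mean(np.sign(X[te] @ w) == y[te]))
    return float(np.mean(accs))

grid = [2.0 ** i for i in range(-8, 9, 2)]   # candidate values {2^i | i = -8, -6, ..., +8}
best_C = max(grid, key=cv_score)
```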
We also conduct experiments on benchmark datasets in the nonlinear case. The RBF kernel exp(-||x_i - x_j||^2 / γ) is chosen for all of the methods, and the value of γ is selected from the set {2^i | i = -1, 0, ..., +7}.

Conclusions
In this paper, an updated PTSVM (UPTSVM) is presented to improve the performance of existing PTSVMs. Compared with existing PTSVMs, UPTSVM avoids inverting kernel matrices during training and maintains the consistency of its linear and nonlinear primal problems. In addition, UPTSVM can be proved practically equivalent to RPTSVM. However, although UPTSVM does not have to compute large inverse matrices as existing PTSVMs do, it has higher computational complexity, so improving its computational efficiency is a goal for future research.

Table 1. Experimental results in linear case.

Table 2 lists the experimental results in the nonlinear case. It reveals that the generalization ability of our UPTSVM is better than that of PTSVM and RPTSVM because of the different kernel trick adopted in constructing the nonlinear formulations. It also shows that UPTSVM is slightly slower than RPTSVM and PTSVM in terms of training time.

Table 2 .
Experimental results in nonlinear case.