SVM

Support Vector Machines for Real-world Pattern Recognition

SVM is a nonlinear pattern recognition algorithm based on kernel methods. In contrast to linear methods, kernel methods map the original input vectors into a higher-dimensional (possibly infinite-dimensional) feature space through a nonlinear kernel function. The nonlinear mapping never needs to be computed explicitly: dot products in the feature space can be evaluated efficiently through the kernel function itself. The dominant feature that makes SVM so attractive is that classes which are not linearly separable in the original space can become linearly separable in the higher-dimensional feature space, so SVM is capable of solving complex nonlinear pattern recognition problems. Other important characteristics of SVM are that training reduces to a convex quadratic programming (QP) problem, and that the solution of this QP problem is sparse.
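The "kernel trick" described above can be illustrated with a small sketch. For a degree-2 polynomial kernel on 2-D inputs (an illustrative choice, not a specific kernel from this work), evaluating the kernel with one dot product in the original space gives exactly the same value as mapping both points through the explicit feature map and taking their dot product in the feature space:

```python
import numpy as np

def poly2_kernel(x, z):
    """Implicit evaluation: one dot product in the original 2-D space."""
    return np.dot(x, z) ** 2

def phi(x):
    """Explicit degree-2 feature map for 2-D inputs (illustration only)."""
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])

implicit = poly2_kernel(x, z)      # (1*3 + 2*0.5)^2 = 16.0
explicit = np.dot(phi(x), phi(z))  # same value via the explicit map
print(implicit, explicit)
```

The two values agree, which is why the nonlinear mapping never has to be computed explicitly, even when the feature space is very high dimensional.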


  Although SVM-based methods show advantages in generalization performance and recognition accuracy, the pattern recognition of real-world data with SVM still faces several challenges that we are trying to solve.
  Huge training data: To make classifier training tractable, most learning algorithms require an amount of training data that scales with the number of inputs. SVM is a kernel method whose training is formulated as a quadratic programming (QP) problem, and its training time and space requirements grow rapidly (roughly quadratically to cubically) with the size of the training set. Hence, a major stumbling block for SVM is the high training time and memory cost on large datasets, which are commonly encountered in real-world pattern recognition applications.
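The space cost alone makes the scaling concrete. A naive kernel-SVM solver that materializes the dense n-by-n kernel matrix in float64 needs 8 * n^2 bytes, so memory grows quadratically with the training-set size (the sample sizes below are arbitrary illustrations):

```python
def kernel_matrix_gib(n):
    """Memory for a dense n x n float64 kernel matrix, in GiB."""
    return 8 * n * n / 2 ** 30

# Quadratic growth: each 10x more samples costs 100x more memory.
for n in (1_000, 100_000, 1_000_000):
    print(f"n = {n:>9,}: {kernel_matrix_gib(n):12.3f} GiB")
```

Already at a hundred thousand samples the full kernel matrix exceeds the memory of most machines, which is why decomposition and chunking strategies are needed for large-scale training.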
  High dimensional input space: Depending on the acquisition resolution, many real-world databases contain hundreds to thousands of measurements per sample. While this high-dimensional input can make a classifier uniquely powerful for a given application, it also complicates the computation and the design of an appropriate method to handle it.
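One common way to tame such inputs, sketched here under the assumption of synthetic 500-dimensional data, is to project onto a few principal components (PCA via SVD) before training the classifier:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 500))   # 200 samples, 500 features (toy data)

Xc = X - X.mean(axis=0)           # center each feature
# Rows of Vt are the principal axes, ordered by explained variance.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 20                            # keep the top-20 components (assumed)
X_reduced = Xc @ Vt[:k].T         # project onto the principal axes

print(X.shape, "->", X_reduced.shape)
```

The 500-dimensional vectors become 20-dimensional ones, shrinking both the kernel evaluations and the risk of fitting noise in irrelevant directions.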
  Noise and interaction: Since the training and test data are usually acquired from the real world, these databases are typically affected by interaction between classes and by many kinds of noise. In most practical classifier designs, noise and interaction blur the boundary between classes. Many noise-reduction approaches have been proposed; nevertheless, the classifier itself is expected to be robust against these imperfections.
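As one example of the noise-reduction approaches mentioned above (not necessarily the one used in this work), an "edited nearest neighbour" filter drops training points whose nearest neighbours mostly disagree with their label before the SVM is trained. The toy data and k=3 below are assumptions for illustration:

```python
import numpy as np

def enn_filter(X, y, k=3):
    """Keep points where a majority of the k nearest neighbours share the label."""
    keep = []
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                   # exclude the point itself
        nn = np.argsort(d)[:k]          # indices of the k nearest neighbours
        if np.sum(y[nn] == y[i]) > k / 2:
            keep.append(i)
    return np.array(keep)

# Two well-separated clusters plus one mislabelled point inside cluster 0.
X = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5], [0.5, 0.5]])
y = np.array([0, 0, 0, 1, 1, 1, 1])     # the last label is noise

kept = enn_filter(X, y)
print(kept)                             # the noisy point (index 6) is removed
```

Cleaning such label noise keeps mislabelled points from becoming support vectors that drag the separating boundary into the wrong class.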
  Imbalance in database: In most practical classification applications, databases are imbalanced: the size of one class is commonly much larger than that of the others. This phenomenon is widespread in the real world and is the main cause of the excursion (shift) of the separation boundary in SVM classifiers. It is therefore necessary to construct a classifier that can correct the separation boundary and overcome this excursion.
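One standard correction, shown here as a sketch rather than the specific method developed in this work, is to weight each class inversely to its frequency, so that errors on the minority class are penalised more heavily (e.g. through a per-class C in the SVM objective):

```python
import numpy as np

def balanced_class_weights(y):
    """Weight each class by n_samples / (n_classes * n_class_samples)."""
    classes, counts = np.unique(y, return_counts=True)
    weights = len(y) / (len(classes) * counts)
    return dict(zip(classes.tolist(), weights.tolist()))

# Toy imbalanced labels: 90 majority-class vs 10 minority-class samples.
y = np.array([0] * 90 + [1] * 10)
print(balanced_class_weights(y))   # {0: 0.555..., 1: 5.0}
```

With these weights the minority class carries nine times the per-sample penalty of the majority class, which counteracts the boundary excursion toward the smaller class.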
  Furthermore, based on the principle of 'divide-and-conquer', we are also developing modular SVM regression systems for complicated time series prediction problems and multiple-SVM classifier systems for complicated data classification problems.
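The combining stage of such a multiple-classifier system can be sketched in its simplest form as a majority vote over the sub-classifiers' outputs; the three hypothetical "experts" below stand in for trained SVM modules:

```python
from collections import Counter

def majority_vote(predictions):
    """Combine one predicted label per expert into a single decision."""
    return Counter(predictions).most_common(1)[0][0]

# Three hypothetical expert modules voting on one test sample.
votes = [1, 0, 1]
print(majority_vote(votes))   # 1
```

Real modular systems typically use weighted or confidence-based combination rules, but the divide-and-conquer idea is the same: each module solves a simpler subproblem, and a combiner merges their decisions.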