SVM is a nonlinear pattern
recognition algorithm based on kernel methods. In contrast to linear methods,
kernel methods map the original input vectors into a higher (possibly
infinite) dimensional feature space through a nonlinear kernel function.
Dot products in this high-dimensional space can be computed efficiently via
the kernel function, without ever computing the nonlinear mapping explicitly.
The dominant feature that makes SVM so attractive is that classes which are
nonlinearly separable in the original space can become linearly separable in
the higher dimensional feature space. Thus SVM is capable of solving complex
nonlinear pattern recognition problems. Further important characteristics of
SVM are that training reduces to a convex quadratic programming (QP) problem,
and that the solution of this QP problem is sparse.
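The kernel trick described above can be seen in a minimal sketch, assuming scikit-learn is available: two concentric circles are not linearly separable in the input plane, yet an RBF-kernel SVM, which implicitly works in a high-dimensional feature space via k(x, z) = exp(-gamma * ||x - z||^2), separates them easily. The dataset and parameter choices here are illustrative, not taken from the text.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric circles: nonlinearly separable in the input space.
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

# A linear SVM cannot separate the rings; an RBF-kernel SVM can,
# because the classes become linearly separable in the feature space.
linear_acc = SVC(kernel="linear").fit(X, y).score(X, y)
rbf_acc = SVC(kernel="rbf", gamma=2.0).fit(X, y).score(X, y)

print(f"linear kernel accuracy: {linear_acc:.2f}")
print(f"RBF kernel accuracy:    {rbf_acc:.2f}")
```

Note also that `SVC` exposes the sparseness mentioned above: only the support vectors (a subset of the training data) appear in the trained decision function.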
Although SVM-based methods show advantages in terms of generalization
performance and recognition accuracy, pattern recognition on real-world
data with SVM still faces several challenges that we are trying to solve.
Huge training data: To make the training of a classifier feasible, most
learning algorithms require an amount of training data that scales with the
number of inputs. SVM is a kernel method formulated as a quadratic
programming (QP) problem: its training time typically grows between
quadratically and cubically with the size of the training set, and the memory
needed for the kernel matrix grows quadratically. Hence, a major stumbling
block for SVM is the high training time and space complexity on large
datasets, which are commonly encountered in real-world pattern recognition
applications.
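The space complexity alone makes the point concrete. A kernel SVM works with the n x n Gram (kernel) matrix over the training set, so memory grows quadratically with the number of samples; the sketch below simply tabulates that growth for float64 entries.

```python
# Memory needed to hold the full n x n kernel matrix in float64.
for n in (1_000, 10_000, 100_000):
    gram_bytes = n * n * 8  # 8 bytes per float64 entry
    print(f"n = {n:>7,}: kernel matrix ~ {gram_bytes / 1e9:.2f} GB")
```

At one hundred thousand samples the full kernel matrix already needs on the order of 80 GB, which is why decomposition and chunking methods are standard for large-scale SVM training.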
High dimensional input space: Depending on the acquisition resolution,
many real-world databases consist of hundreds to thousands of measurements.
While the high dimensionality of this input can make a classifier uniquely
powerful for a given application, it also complicates the computation and the
design of an appropriate method to handle it.
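One common way to tame such high-dimensional inputs, sketched below under the assumption that scikit-learn is available, is to project the data onto its leading principal components before the SVM. The dataset, the component count, and the PCA choice are all illustrative here, not a method proposed in the text.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# 64-dimensional handwritten-digit measurements.
X, y = load_digits(return_X_y=True)

# Keep only the 16 leading principal components, then classify.
clf = make_pipeline(PCA(n_components=16), SVC(kernel="rbf", gamma="scale"))
acc = clf.fit(X, y).score(X, y)
print(f"training accuracy with 16 of 64 dimensions: {acc:.2f}")
```

A quarter of the original dimensions already carries most of the discriminative information in this toy case; in practice the reduced dimension is chosen by validation.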
Noise and interaction: Since the acquisitions (training data and test
data) are usually obtained from the real world, these databases are usually
affected by interaction between classes and by many kinds of noise. In most
practical classifier designs, noise and interaction blur the boundary between
classes. Many noise reduction approaches have been proposed; nevertheless,
the classifier itself should be robust against these imperfections.
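In the standard soft-margin SVM, robustness to such imperfections is governed by the penalty parameter C: a smaller C permits margin violations instead of bending the boundary around mislabeled points. The sketch below, with an illustrative synthetic dataset and injected label noise (not an experiment from the text), compares a loose and a tight penalty.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two mildly overlapping clusters with 5% of the labels flipped.
X, y = make_blobs(n_samples=300, centers=[[-3, 0], [3, 0]],
                  cluster_std=1.5, random_state=0)
rng = np.random.default_rng(0)
flip = rng.choice(len(y), size=15, replace=False)
y_noisy = y.copy()
y_noisy[flip] = 1 - y_noisy[flip]  # inject label noise

accs = {}
for C in (0.1, 100.0):
    clf = SVC(kernel="rbf", C=C).fit(X, y_noisy)
    accs[C] = clf.score(X, y)  # evaluate against the clean labels
    print(f"C = {C}: accuracy on clean labels = {accs[C]:.2f}")
```

With the small C the boundary stays smooth and largely ignores the flipped points; with the large C the model tends to chase them.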
Imbalance in database: In most practical classification applications,
databases are imbalanced: the size of one class is commonly much larger than
that of the others. This phenomenon is widespread in the real world and is
the main cause of the excursion of the separation boundary in SVM
classifiers. Thus, a classifier is required that can adjust the separation
boundary and overcome this excursion.
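One standard way to counter that excursion, sketched here as an illustration rather than the method the text proposes, is to reweight the misclassification penalty per class; in scikit-learn this is `class_weight="balanced"`. Minority-class recall on a synthetic 9:1 imbalanced dataset shows the boundary shift.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import recall_score
from sklearn.svm import SVC

# A 9:1 imbalanced two-class problem.
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1],
                           class_sep=0.8, random_state=0)

plain = SVC(kernel="rbf").fit(X, y)
balanced = SVC(kernel="rbf", class_weight="balanced").fit(X, y)

# Recall on the minority class (label 1) reveals the boundary excursion.
plain_rec = recall_score(y, plain.predict(X), pos_label=1)
balanced_rec = recall_score(y, balanced.predict(X), pos_label=1)
print(f"minority recall, plain:    {plain_rec:.2f}")
print(f"minority recall, balanced: {balanced_rec:.2f}")
```

The unweighted SVM sacrifices minority-class recall to overall accuracy; the reweighted one pushes the boundary back toward the majority class.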
Furthermore, based on the principle of 'divide-and-conquer', we are also
developing modular SVM regression systems to solve the complicated problem of
time series prediction, and multiple-SVM classifier systems to solve
complicated data classification problems.