KEYWORDS: Binary data, Data modeling, Statistical analysis, Machine learning, Data mining, Data acquisition, Interference (communication), Data centers
Support vector machines (SVMs) have been widely used for binary classification, but a large-scale training set imposes a heavy computational cost on SVM training. Researchers have proposed many techniques to improve the training efficiency of SVMs, and a typical class of improved SVMs works by sparsely reducing the training samples. Clustering-based methods are the most common way to achieve this reduction; however, they are easily disturbed by noise points. To solve this problem, this paper proposes a robust and efficient SVM algorithm based on K-Medians clustering (REK-SVM). For each cluster, the cluster center takes the median value of each dimension attribute in the cluster, which reduces the influence of noise points. In particular, when the discretely distributed noise points number fewer than half of the samples in the cluster, the noise interference can be removed completely. The resulting noise-free or noise-reduced subset is then used to train the SVM model. Experimental results show that the algorithm is fast and effective. On classification data containing noise, it far exceeds the standard SVM in both classification accuracy and efficiency. It has the same computational complexity as K-SVM but achieves much higher classification accuracy.
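The median-based cluster center described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the distance metric, the initialization, or how clusters map to classes, so the Manhattan distance, the caller-supplied initial centers, and all function names (`coordinate_median`, `coordinate_mean`, `k_medians`) are assumptions made for illustration. The SVM training step is omitted; the returned centers would stand in for the original samples when training any SVM.

```python
from statistics import median


def coordinate_median(points):
    """Per-dimension median of a list of equal-length tuples."""
    return tuple(median(dim) for dim in zip(*points))


def coordinate_mean(points):
    """Per-dimension mean, shown only for comparison (a K-means style center)."""
    return tuple(sum(dim) / len(points) for dim in zip(*points))


def manhattan(a, b):
    """L1 distance; pairing it with K-medians is an assumption of this sketch."""
    return sum(abs(x - y) for x, y in zip(a, b))


def k_medians(points, init_centers, max_iters=20):
    """Lloyd-style K-medians; returns the final list of cluster centers."""
    centers = list(init_centers)
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda j: manhattan(p, centers[j]))
            clusters[nearest].append(p)
        # Update step: an empty cluster keeps its previous center.
        updated = [coordinate_median(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return centers


# Five clean samples near (1, 1) plus two scattered noise points.
clean = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.0, 1.2), (1.2, 1.0)]
noise = [(40.0, -35.0), (-30.0, 20.0)]
cluster = clean + noise

median_center = coordinate_median(cluster)  # (1.0, 1.0)
mean_center = coordinate_mean(cluster)      # pulled away from the clean samples

# Reduce a noisy set with two groups to two representative centers.
samples = cluster + [(9.0, 1.0), (9.2, 0.8), (8.8, 1.1)]
centers = k_medians(samples, init_centers=[(1.0, 1.0), (9.0, 1.0)])
```

Because the noise points make up fewer than half of each cluster here, both median centers stay among the clean samples, whereas the mean of the first cluster is shifted noticeably by the same two outliers.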