Nu-Support Vector Classification Example in Python

   Support Vector Machines (SVM) is a supervised learning method that can be used for both regression and classification problems. The SVM-based classifier is called the SVC (Support Vector Classifier).
   The nu-support vector classifier (NuSVC) is similar to the SVC; the main difference is that NuSVC replaces the C penalty parameter with a nu parameter, which controls the number of support vectors and margin errors.
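
As a quick, optional illustration of that difference, here is a minimal sketch on a synthetic toy dataset (not part of the Iris workflow below): SVC is tuned through its penalty parameter C, while NuSVC takes a nu value in (0, 1] that bounds the fraction of margin errors from above and the fraction of support vectors from below.

# Minimal sketch: SVC vs. NuSVC on a synthetic toy dataset
from sklearn.datasets import make_classification
from sklearn.svm import SVC, NuSVC

X, y = make_classification(n_samples=200, n_features=4, random_state=1)

svc = SVC(C=1.0).fit(X, y)        # regularization controlled by C
nusvc = NuSVC(nu=0.5).fit(X, y)   # regularization controlled by nu
print("SVC support vectors per class:  ", svc.n_support_)
print("NuSVC support vectors per class:", nusvc.n_support_)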

   In this tutorial, we'll briefly learn how to classify data by using Scikit-learn's NuSVC class in Python. The tutorial covers:
  1. Preparing the data
  2. Training the model
  3. Predicting and accuracy check
  4. Source code listing
   We'll start by loading the required libraries.

from sklearn.svm import NuSVC
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report

Preparing the data

   In this tutorial, we'll use the Iris dataset as the data to classify. We'll load it and define the x (features) and y (labels) parts.

iris = load_iris()
x, y = iris.data, iris.target

Then, we'll split the data into train and test parts, holding out 15 percent of the dataset as test data.

xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.15)
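
Note that train_test_split() shuffles the data randomly, so the numbers reported below will vary from run to run. If a reproducible split is needed, a random_state (and, optionally, stratification by class) can be passed, as in this optional sketch; the seed value 42 is just an arbitrary example.

# Optional: fixed seed and class-stratified split (seed value is arbitrary)
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.15,
                                                random_state=42, stratify=y)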


Training the model

   Next, we'll define the classifier by using the NuSVC class with its default parameters. The default value of nu is 0.5; it can be tuned to suit the data being classified.

nsvc = NuSVC()
print(nsvc)

NuSVC(break_ties=False, cache_size=200, class_weight=None, coef0=0.0,
      decision_function_shape='ovr', degree=3, gamma='scale', kernel='rbf',
      max_iter=-1, nu=0.5, probability=False, random_state=None, shrinking=True,
      tol=0.001, verbose=False)
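
To see how nu affects the fitted model, the optional sketch below refits NuSVC on the training data with a few nu values and prints the number of support vectors per class; since nu acts as a lower bound on the fraction of support vectors, larger values generally keep more of them.

# Optional: compare support vector counts for a few nu values
for nu in (0.1, 0.3, 0.5, 0.7):
    model = NuSVC(nu=nu).fit(xtrain, ytrain)
    print("nu=%.1f support vectors per class: %s" % (nu, model.n_support_))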

Then, we'll fit the model on the train data and check its accuracy score on that same training set.

nsvc.fit(xtrain, ytrain)

score = nsvc.score(xtrain, ytrain)
print("Score: ", score)

Score:  0.9763779527559056

We can also apply cross-validation to the model on the training data and check the average score.

cv_scores = cross_val_score(nsvc, xtrain, ytrain, cv=10)
print("CV average score: %.2f" % cv_scores.mean())

CV average score: 0.97
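
The mean alone can hide variation between folds, so it may also be worth printing the individual fold scores and their spread.

# Optional: inspect the per-fold scores, not just their mean
print("Fold scores:", cv_scores)
print("CV score std: %.2f" % cv_scores.std())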


Predicting and accuracy check

Now, we can predict the test data by using the trained model. After the prediction, we'll check the prediction quality with the confusion_matrix() function.

ypred = nsvc.predict(xtest)

cm = confusion_matrix(ytest, ypred)
print(cm)

[[12  0  0]
 [ 0  4  0]
 [ 0  1  6]]
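
The test accuracy can be read off this matrix (correct predictions are on the diagonal), or computed directly with accuracy_score(); the short optional sketch below shows both ways.

# Optional: test accuracy from the confusion matrix and from accuracy_score()
from sklearn.metrics import accuracy_score
print("Accuracy from matrix:", cm.trace() / cm.sum())
print("accuracy_score:      ", accuracy_score(ytest, ypred))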

We can also create a classification report by using the classification_report() function on the predictions to check other accuracy metrics.

cr = classification_report(ytest, ypred)
print(cr)

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        12
           1       0.80      1.00      0.89         4
           2       1.00      0.86      0.92         7

    accuracy                           0.96        23
   macro avg       0.93      0.95      0.94        23
weighted avg       0.97      0.96      0.96        23

   In this tutorial, we've briefly learned how to classify data by using Scikit-learn's NuSVC class in Python. The full source code is listed below.

Source code listing

from sklearn.svm import NuSVC
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report

iris = load_iris()
x, y = iris.data, iris.target
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.15)

nsvc = NuSVC()
print(nsvc)

nsvc.fit(xtrain, ytrain)
score = nsvc.score(xtrain, ytrain)
print("Score: ", score)

cv_scores = cross_val_score(nsvc, xtrain, ytrain, cv=10)
print("CV average score: %.2f" % cv_scores.mean())

ypred = nsvc.predict(xtest)

cm = confusion_matrix(ytest, ypred)
print(cm)

cr = classification_report(ytest, ypred)
print(cr) 

