Support Vector Machines (SVM) are a supervised learning method that can be used for both classification and regression problems. The SVM-based classifier is called SVC (Support Vector Classifier), and we can use it in classification problems.
The nu-support vector classifier (NuSVC) is similar to SVC, with the difference that NuSVC has a nu parameter that controls the number of support vectors: it is a lower bound on the fraction of training samples used as support vectors and an upper bound on the fraction of margin errors.
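To see what nu does in practice, we can fit NuSVC on the same data with a few different nu values and compare how many support vectors each fitted model keeps. This is a rough sketch; the exact counts depend on the data and the kernel.

```python
from sklearn.datasets import load_iris
from sklearn.svm import NuSVC

x, y = load_iris(return_X_y=True)

# A larger nu forces the model to keep a larger fraction of the
# training samples as support vectors.
for nu in (0.1, 0.5, 0.9):
    model = NuSVC(nu=nu).fit(x, y)
    print("nu=%.1f  support vectors: %d" % (nu, model.n_support_.sum()))
```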
In this tutorial, we'll briefly learn how to classify data by using
Scikit-learn's NuSVC class in Python. The tutorial
covers:
- Preparing the data
- Training the model
- Predicting and accuracy check
- Video tutorial
- Source code listing
```python
from sklearn.svm import NuSVC
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
```
Preparing the data
In this tutorial, we'll use the Iris dataset as the data to classify. First, we'll define the x and y parts of the data.
```python
iris = load_iris()
x, y = iris.data, iris.target
```
Then, we'll split them into train and test parts. Here, we'll extract 15 percent of the dataset as test data.
```python
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.15)
```
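Because train_test_split shuffles at random, a small test set can end up with unbalanced classes. Passing stratify=y keeps the class proportions equal in both parts, and random_state makes the split reproducible. This is an optional refinement, not used in the rest of this tutorial.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

x, y = load_iris(return_X_y=True)

# stratify=y keeps the three Iris classes in near-equal proportion
# in both the train and the test split.
xtr, xte, ytr, yte = train_test_split(
    x, y, test_size=0.15, stratify=y, random_state=42)
print(np.bincount(yte))  # near-equal class counts in the test set
```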
Training the model
Next, we'll define the classifier by using the NuSVC class. We can use the default parameters of the class. The default value of nu is 0.5, and it can be tuned to fit the classification data.
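Since the best nu depends on the data, one simple way to pick it is to compare cross-validation scores over a few candidate values. The candidate grid below is arbitrary, chosen only for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import NuSVC

x, y = load_iris(return_X_y=True)

# Try a few candidate nu values and report the mean CV accuracy of each.
for nu in (0.1, 0.3, 0.5, 0.7):
    scores = cross_val_score(NuSVC(nu=nu), x, y, cv=5)
    print("nu=%.1f  mean CV accuracy: %.3f" % (nu, scores.mean()))
```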
```python
nsvc = NuSVC()
print(nsvc)
```
```
NuSVC(break_ties=False, cache_size=200, class_weight=None, coef0=0.0,
      decision_function_shape='ovr', degree=3, gamma='scale', kernel='rbf',
      max_iter=-1, nu=0.5, probability=False, random_state=None,
      shrinking=True, tol=0.001, verbose=False)
```
Then, we'll fit the model on the training data and check the accuracy score (here measured on the training data itself).
```python
nsvc.fit(xtrain, ytrain)
score = nsvc.score(xtrain, ytrain)
print("Score: ", score)
```
```
Score:  0.9763779527559056
```
We can also apply a cross-validation training method to the model and check the training score.
```python
cv_scores = cross_val_score(nsvc, xtrain, ytrain, cv=10)
print("CV average score: %.2f" % cv_scores.mean())
```
```
CV average score: 0.97
```
Predicting and accuracy check
Now, we can predict the test data by using the trained model. After the prediction, we'll check the results by using the confusion_matrix() function.
```python
ypred = nsvc.predict(xtest)
cm = confusion_matrix(ytest, ypred)
print(cm)
```
```
[[12  0  0]
 [ 0  4  0]
 [ 0  1  6]]
```
We can also create a classification report by using the classification_report() function on the predicted data to check the other accuracy metrics.
```python
cr = classification_report(ytest, ypred)
print(cr)
```
```
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        12
           1       0.80      1.00      0.89         4
           2       1.00      0.86      0.92         7

    accuracy                           0.96        23
   macro avg       0.93      0.95      0.94        23
weighted avg       0.97      0.96      0.96        23
```
In this tutorial, we've briefly learned how to classify data by using Scikit-learn's NuSVC class in Python. The full source code is listed below.
Source code listing
```python
from sklearn.svm import NuSVC
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report

iris = load_iris()
x, y = iris.data, iris.target

xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.15)

nsvc = NuSVC()
print(nsvc)

nsvc.fit(xtrain, ytrain)
score = nsvc.score(xtrain, ytrain)
print("Score: ", score)

cv_scores = cross_val_score(nsvc, xtrain, ytrain, cv=10)
print("CV average score: %.2f" % cv_scores.mean())

ypred = nsvc.predict(xtest)
cm = confusion_matrix(ytest, ypred)
print(cm)

cr = classification_report(ytest, ypred)
print(cr)
```