The caret package's train() function can also fit an SVM model. In this tutorial, we'll briefly learn how to implement the SVM algorithm with both the 'e1071' and 'caret' packages to classify the Iris dataset in R. The tutorial covers:
- Preparing data
- The 'e1071' method
- The caret's train method
- Source code listing
We'll start by loading the required packages.
library(e1071)
library(caret)
Preparing data
We'll use the Iris dataset in this tutorial. First, we'll prepare it by splitting it into the train and test parts.
data(iris)
set.seed(123)
indexes = createDataPartition(iris$Species, p = .9, list = FALSE)
train = iris[indexes, ]
test = iris[-indexes, ]
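createDataPartition() samples within each class, so both parts keep the same species proportions. As a rough illustration of that idea, the sketch below builds a 90/10 split per species in base R. It is not caret's implementation, the names idx_by_class, train_base, and test_base are made up for this example, and the rows it selects will differ from the ones createDataPartition() picks.

```r
data(iris)
set.seed(123)

# Group the row numbers by species (three groups of 50 rows each)
idx_by_class <- split(seq_len(nrow(iris)), iris$Species)

# Draw 90% of the row numbers within each species
train_idx <- unlist(
  lapply(idx_by_class, function(i) sample(i, size = round(0.9 * length(i)))),
  use.names = FALSE
)

train_base <- iris[train_idx, ]
test_base  <- iris[-train_idx, ]

# Each species contributes the same number of rows to each part
print(table(train_base$Species))
print(table(test_base$Species))
```

Checking the class counts with table() is a quick way to confirm that a split is balanced before fitting a model.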
The 'e1071' method
The 'e1071' package provides the svm() function, and we'll use it to define the model. We'll pass the train data to the function.
model_svm = svm(Species~., data=train)
print(model_svm)

Call:
svm(formula = Species ~ ., data = train)

Parameters:
   SVM-Type:  C-classification
 SVM-Kernel:  radial
       cost:  1
      gamma:  0.25

Number of Support Vectors:  49
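The parameters in the printed summary are svm() defaults: a radial kernel, cost of 1, and gamma of 1 divided by the number of predictors (1/4 for the four iris measurements). The sketch below writes those defaults out explicitly. To stay self-contained it fits on the full iris data rather than the train split, and model_explicit is a name chosen only for this example.

```r
library(e1071)

data(iris)

# Same kind of model as above, with the default settings spelled out.
# ncol(iris) - 1 counts the predictor columns (everything except Species).
model_explicit <- svm(Species ~ ., data = iris,
                      type = "C-classification",
                      kernel = "radial",
                      cost = 1,
                      gamma = 1 / (ncol(iris) - 1))

# The fitted object stores the values that were actually used
print(model_explicit$gamma)
print(model_explicit$cost)
```

Writing the parameters out this way makes it easier to change cost or gamma later when tuning the model by hand.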
Now, we can predict test data with the fitted model.
pred = predict(model_svm, test)
Finally, we'll check the accuracy of the predictions with a confusion matrix.
# confusionMatrix() expects the predictions first, then the reference labels
cm = confusionMatrix(pred, test$Species)
print(cm)

Confusion Matrix and Statistics

            Reference
Prediction   setosa versicolor virginica
  setosa          5          0         0
  versicolor      0          5         0
  virginica       0          0         5

Overall Statistics

               Accuracy : 1
                 95% CI : (0.782, 1)
    No Information Rate : 0.3333
    P-Value [Acc > NIR] : 6.969e-08

                  Kappa : 1
 Mcnemar's Test P-Value : NA

Statistics by Class:

                     Class: setosa Class: versicolor Class: virginica
Sensitivity                 1.0000            1.0000           1.0000
Specificity                 1.0000            1.0000           1.0000
Pos Pred Value              1.0000            1.0000           1.0000
Neg Pred Value              1.0000            1.0000           1.0000
Prevalence                  0.3333            0.3333           0.3333
Detection Rate              0.3333            0.3333           0.3333
Detection Prevalence        0.3333            0.3333           0.3333
Balanced Accuracy           1.0000            1.0000           1.0000
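To see where the accuracy and sensitivity figures come from, the sketch below computes them by hand from a cross-table in base R. The label vectors are toy values invented for illustration, not output from the model above, and they include one deliberate error so the numbers are not all 1.

```r
# Toy labels, invented for this example only
actual    <- factor(c("a", "a", "b", "b", "c", "c"), levels = c("a", "b", "c"))
predicted <- factor(c("a", "a", "b", "c", "c", "c"), levels = c("a", "b", "c"))

# Rows are predictions and columns are reference labels,
# the same layout that confusionMatrix() prints
tab <- table(Prediction = predicted, Reference = actual)
print(tab)

# Accuracy: correctly classified cases (the diagonal) over all cases
accuracy <- sum(diag(tab)) / sum(tab)

# Sensitivity per class: correct predictions over the true count of that class
sensitivity <- diag(tab) / colSums(tab)

print(accuracy)
print(sensitivity)
```

Because rows and columns play different roles in these formulas, the order of the arguments passed to confusionMatrix() matters whenever the predictions are not perfect.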
The caret's train method
In this method, we'll use caret's train() function. We'll pass 'svmRadial' as the method argument.
model = train(Species~., data=train, method="svmRadial")
print(model)

Support Vector Machines with Radial Basis Function Kernel

135 samples
  4 predictor
  3 classes: 'setosa', 'versicolor', 'virginica'

No pre-processing
Resampling: Bootstrapped (25 reps)
Summary of sample sizes: 135, 135, 135, 135, 135, 135, ...
Resampling results across tuning parameters:

  C     Accuracy   Kappa
  0.25  0.9381422  0.9060675
  0.50  0.9415891  0.9112876
  1.00  0.9479245  0.9208977

Tuning parameter 'sigma' was held constant at a value of 0.5648255
Accuracy was used to select the optimal model using the largest value.
The final values used for the model were sigma = 0.5648255 and C = 1.
pred = predict(model, test)
# confusionMatrix() expects the predictions first, then the reference labels
cm = confusionMatrix(pred, test$Species)
print(cm)

Confusion Matrix and Statistics

            Reference
Prediction   setosa versicolor virginica
  setosa          5          0         0
  versicolor      0          5         0
  virginica       0          0         5

Overall Statistics

               Accuracy : 1
                 95% CI : (0.782, 1)
    No Information Rate : 0.3333
    P-Value [Acc > NIR] : 6.969e-08

                  Kappa : 1
 Mcnemar's Test P-Value : NA

Statistics by Class:

                     Class: setosa Class: versicolor Class: virginica
Sensitivity                 1.0000            1.0000           1.0000
Specificity                 1.0000            1.0000           1.0000
Pos Pred Value              1.0000            1.0000           1.0000
Neg Pred Value              1.0000            1.0000           1.0000
Prevalence                  0.3333            0.3333           0.3333
Detection Rate              0.3333            0.3333           0.3333
Detection Prevalence        0.3333            0.3333           0.3333
Balanced Accuracy           1.0000            1.0000           1.0000
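By default, train() resamples with 25 bootstrap repetitions and tries three values of C, as the printed summary shows. The sketch below is one possible variation, assuming the kernlab package (which caret uses for 'svmRadial') is installed: it switches to 10-fold cross-validation and widens the search to five values of C. It fits on the full iris data to stay self-contained, and ctrl and model_cv are names chosen only for this example.

```r
library(caret)

data(iris)
set.seed(123)

# 10-fold cross-validation instead of the default bootstrap resampling
ctrl <- trainControl(method = "cv", number = 10)

# tuneLength = 5 asks caret to evaluate five candidate values of C
model_cv <- train(Species ~ ., data = iris,
                  method = "svmRadial",
                  trControl = ctrl,
                  tuneLength = 5)

# Resampled accuracy for each candidate, then the selected parameters
print(model_cv$results[, c("C", "Accuracy", "Kappa")])
print(model_cv$bestTune)
```

Cross-validated accuracy is only an estimate, so it is still worth checking the selected model against a held-out test set as done above.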
In this tutorial, we've briefly learned how to classify data in R with the 'e1071' package's svm() function and caret's train() function. The full source code is listed below.
Source code listing
library(e1071)
library(caret)

# Classification example
data(iris)
set.seed(123)
indexes = createDataPartition(iris$Species, p = .9, list = FALSE)
train = iris[indexes, ]
test = iris[-indexes, ]

model_svm = svm(Species~., data=train)
print(model_svm)

pred = predict(model_svm, test)

# accuracy check (predictions first, then the reference labels)
cm = confusionMatrix(pred, test$Species)
print(cm)

# caret train method
model = train(Species~., data=train, method="svmRadial")
print(model)

pred = predict(model, test)
cm = confusionMatrix(pred, test$Species)
print(cm)