In this post, we'll learn how to use scikit-learn's BayesianRidge estimator class for a regression problem. The tutorial covers:
- Preparing the data
- How to use the model
- Source code listing
from sklearn.linear_model import BayesianRidge
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt
from numpy import sqrt
Preparing the data
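In this tutorial, we'll use the Boston housing dataset. We'll load the dataset and split it into training and test sets, holding out 15% of the samples for testing. Note that load_boston was deprecated in scikit-learn 1.0 and removed in version 1.2, so the code in this post requires an earlier release.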
boston = load_boston()
x, y = boston.data, boston.target
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.15)
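If load_boston is not available in your scikit-learn installation, the same preparation step can be sketched with a dataset that ships with the library. The snippet below uses load_diabetes as a stand-in, which is my substitution and not the data used in this post, so the scores reported later will differ. The random_state value is also an assumption, added only to make the split repeatable.

```python
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

# Stand-in dataset (an assumption of this sketch, not the data used in this post)
x, y = load_diabetes(return_X_y=True)

# Hold out 15% of the rows for testing; random_state makes the split repeatable
xtrain, xtest, ytrain, ytest = train_test_split(
    x, y, test_size=0.15, random_state=42
)

print("train:", xtrain.shape, "test:", xtest.shape)
```

The rest of the tutorial works unchanged on these arrays, since the variable names match those used above.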
How to use the model
Next, we'll define the model with its default parameters and fit it on the training data.
bay_ridge = BayesianRidge()
print(bay_ridge)
BayesianRidge(alpha_1=1e-06, alpha_2=1e-06, compute_score=False, copy_X=True, fit_intercept=True, lambda_1=1e-06, lambda_2=1e-06, n_iter=300, normalize=False, tol=0.001, verbose=False)
bay_ridge.fit(xtrain, ytrain)
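After fitting, the estimator exposes what it has learned through attributes such as coef_, intercept_, alpha_ and lambda_. The following is a minimal sketch of how they might be inspected. It uses synthetic data from make_regression (an assumption, chosen so the snippet runs on its own) rather than the housing data, and the variable names X and model are illustrative.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import BayesianRidge

# Synthetic data so the sketch runs without loading an external dataset
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

model = BayesianRidge()
model.fit(X, y)

# Posterior mean of the regression weights, one per feature
print("Coefficients:", model.coef_)
print("Intercept:", model.intercept_)
# Precisions estimated from the data during fitting
print("Noise precision (alpha_):", model.alpha_)
print("Weight precision (lambda_):", model.lambda_)
```

Unlike plain ridge regression, where the regularization strength is chosen by the user, here alpha_ and lambda_ are estimated from the training data.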
We can check the model score, which is the R-squared metric (coefficient of determination), on the training data.
score = bay_ridge.score(xtrain, ytrain)
print("Model score (R-squared): %.2f" % score)
Model score (R-squared): 0.74
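The score above is computed on the training data, so it can be optimistic. As a rough sketch, the snippet below compares the training score with the score on held-out data and confirms that score() returns the same quantity as r2_score applied to the predictions. The synthetic data and the names X_train, X_test and model are assumptions made for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import BayesianRidge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic regression problem standing in for the housing data
X, y = make_regression(n_samples=300, n_features=8, noise=15.0, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, random_state=1
)

model = BayesianRidge().fit(X_train, y_train)

train_r2 = model.score(X_train, y_train)
test_r2 = model.score(X_test, y_test)
# score() is R-squared, so it should match r2_score on the predictions
manual_r2 = r2_score(y_test, model.predict(X_test))

print("Train R-squared: %.2f" % train_r2)
print("Test R-squared:  %.2f" % test_r2)
print("r2_score check:  %.2f" % manual_r2)
```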
Next, we'll predict the target values for the test data and measure the prediction error with the mean squared error (MSE) and its square root (RMSE).
ypred = bay_ridge.predict(xtest)
mse = mean_squared_error(ytest, ypred)
print("MSE: %.2f" % mse)
MSE: 30.18
print("RMSE: %.2f" % sqrt(mse))
RMSE: 5.49
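Because BayesianRidge is a probabilistic model, predict can also return a standard deviation for each prediction when called with return_std=True. Here is a minimal sketch of that call next to the RMSE computation. It runs on synthetic data (an assumption, so the snippet is self-contained), and the names y_mean and y_std are illustrative.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import BayesianRidge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic regression problem standing in for the housing data
X, y = make_regression(n_samples=300, n_features=8, noise=15.0, random_state=2)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, random_state=2
)

model = BayesianRidge().fit(X_train, y_train)

# return_std=True gives the predictive mean and standard deviation per sample
y_mean, y_std = model.predict(X_test, return_std=True)

rmse = np.sqrt(mean_squared_error(y_test, y_mean))
print("RMSE: %.2f" % rmse)
print("Mean predictive std: %.2f" % y_std.mean())
```

The per-sample standard deviation gives a sense of how uncertain the model is about each prediction, which a single RMSE value cannot show.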
Finally, we'll plot the predicted values against the original test data to visualize the result.
x_ax = range(len(ytest))
plt.scatter(x_ax, ytest, s=5, color="blue", label="original")
plt.plot(x_ax, ypred, lw=0.8, color="red", label="predicted")
plt.legend()
plt.show()
In this post, we've briefly learned how to use the BayesianRidge estimator in Python. The full source code is listed below.
Source code listing
from sklearn.linear_model import BayesianRidge
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt
from numpy import sqrt

boston = load_boston()
x, y = boston.data, boston.target
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.15)

bay_ridge = BayesianRidge()
print(bay_ridge)

bay_ridge.fit(xtrain, ytrain)
score = bay_ridge.score(xtrain, ytrain)
print("Model score (R-squared): %.2f" % score)

ypred = bay_ridge.predict(xtest)
mse = mean_squared_error(ytest, ypred)
print("MSE: %.2f" % mse)
print("RMSE: %.2f" % sqrt(mse))

x_ax = range(len(ytest))
plt.scatter(x_ax, ytest, s=5, color="blue", label="original")
plt.plot(x_ax, ypred, lw=0.8, color="red", label="predicted")
plt.legend()
plt.show()