DataTechNotes: Bayesian Ridge Regression Example in Python

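Bayesian ridge regression is a linear model that places a Gaussian prior on the coefficients. The prior plays the same role as the L2 penalty in Ridge regression, but the regularization strength is estimated from the data instead of being fixed in advance. The BayesianRidge estimator fits this model and reports the posterior mean of the coefficients under the Gaussian assumption.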
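The model behind this description can be written out explicitly. The block below is a sketch following the formulation in the scikit-learn user guide; the symbols (w for the coefficients, alpha for the noise precision, lambda for the coefficient precision) are notation assumed here, not taken from this post.

```latex
\begin{aligned}
p(y \mid X, w, \alpha) &= \mathcal{N}\!\left(y \mid Xw,\ \alpha^{-1} I\right) \\
p(w \mid \lambda) &= \mathcal{N}\!\left(w \mid 0,\ \lambda^{-1} I\right) \\
\alpha &\sim \mathrm{Gamma}(\alpha_1, \alpha_2), \qquad
\lambda \sim \mathrm{Gamma}(\lambda_1, \lambda_2)
\end{aligned}
```

Here alpha_1, alpha_2, lambda_1 and lambda_2 correspond to the constructor parameters of the same names. Their small default values (1e-06) make the Gamma priors nearly non-informative, so alpha and lambda are driven mostly by the data.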
In this post, we'll learn how to use scikit-learn's BayesianRidge estimator class for a regression problem. The tutorial covers:

Preparing the data
How to use the model
Source code listing

We'll start by loading the required modules.

from sklearn.linear_model import BayesianRidge
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt
from numpy import sqrt

Preparing the data

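In this tutorial, we'll use the Boston housing dataset. We'll load the dataset and split it into training and test sets, holding out 15% of the rows for testing. Note that load_boston was deprecated in scikit-learn 1.0 and removed in 1.2, so the code as written needs an older release.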

boston = load_boston()
x, y = boston.data, boston.target
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.15)
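Because the split above is random, the scores reported later will differ from run to run. A minimal sketch of a reproducible split follows. It assumes a synthetic stand-in dataset from make_regression with the same shape as the Boston data (506 rows, 13 features); the noise level and the seed values are arbitrary choices for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

# Synthetic stand-in (an assumption): same shape as the Boston data.
x, y = make_regression(n_samples=506, n_features=13, noise=10.0, random_state=0)

# random_state fixes the shuffle, so the same rows land in each part on every run.
xtrain, xtest, ytrain, ytest = train_test_split(
    x, y, test_size=0.15, random_state=42
)

print(xtrain.shape, xtest.shape)
```

With 506 rows and test_size=0.15, the split gives 430 training rows and 76 test rows.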

How to use the model

Next, we'll define the model with default parameters and fit it on the training data.

bay_ridge = BayesianRidge()
print(bay_ridge)

BayesianRidge(alpha_1=1e-06, alpha_2=1e-06, compute_score=False, copy_X=True,
       fit_intercept=True, lambda_1=1e-06, lambda_2=1e-06, n_iter=300,
       normalize=False, tol=0.001, verbose=False)

bay_ridge.fit(xtrain, ytrain)
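After fitting, the estimator exposes what it has learned as attributes. The sketch below shows how they could be inspected. It uses a small synthetic dataset from make_regression as an assumption, so the printed numbers are not the ones the Boston data would give.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import BayesianRidge

# Synthetic data (an assumption) so the example runs on any scikit-learn version.
x, y = make_regression(n_samples=200, n_features=5, noise=5.0, random_state=0)

model = BayesianRidge()
model.fit(x, y)

# Posterior mean of the coefficients and the fitted intercept.
print("coef_:", model.coef_)
print("intercept_:", model.intercept_)

# Precisions estimated from the data: alpha_ for the noise,
# lambda_ for the coefficients (the regularization strength).
print("alpha_:", model.alpha_)
print("lambda_:", model.lambda_)

# Posterior covariance matrix of the coefficients.
print("sigma_ shape:", model.sigma_.shape)
```

The values alpha_ and lambda_ are the fitted counterparts of the regularization parameter that would have to be chosen by hand in plain Ridge regression.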

We can check the model score, which for a regressor is the R-squared (coefficient of determination). Here it is computed on the training data.

score = bay_ridge.score(xtrain, ytrain)
print("Model score (R-squared): %.2f" % score)

Model score (R-squared): 0.74
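A score on the training data says how well the model fits rows it has already seen, so it can be useful to compute the same metric on the held-out rows as well. The sketch below does this on synthetic data (an assumption) and also checks that score() matches r2_score applied to the predictions.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import BayesianRidge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption), split the same way as in the tutorial.
x, y = make_regression(n_samples=506, n_features=13, noise=10.0, random_state=0)
xtrain, xtest, ytrain, ytest = train_test_split(
    x, y, test_size=0.15, random_state=42
)

model = BayesianRidge().fit(xtrain, ytrain)

train_r2 = model.score(xtrain, ytrain)  # fit quality on seen data
test_r2 = model.score(xtest, ytest)     # fit quality on unseen data

# score() for a regressor is the R-squared of its own predictions.
manual_r2 = r2_score(ytest, model.predict(xtest))

print("Train R-squared: %.2f" % train_r2)
print("Test R-squared: %.2f" % test_r2)
```

A large gap between the two values would suggest overfitting.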

Next, we'll predict the test data and measure the prediction error with MSE and RMSE.

ypred = bay_ridge.predict(xtest)
mse = mean_squared_error(ytest, ypred)
print("MSE: %.2f" % mse)

MSE: 30.18

print("RMSE: %.2f" % sqrt(mse))

RMSE: 5.49
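Because the model is Bayesian, predict can also return a standard deviation for each prediction through its return_std argument. The sketch below, again on synthetic data (an assumption), computes the RMSE as above and then checks how many test targets fall inside an approximate 95% interval. The 1.96 multiplier assumes a Gaussian predictive distribution.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import BayesianRidge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption) in place of the Boston dataset.
x, y = make_regression(n_samples=506, n_features=13, noise=10.0, random_state=0)
xtrain, xtest, ytrain, ytest = train_test_split(
    x, y, test_size=0.15, random_state=42
)

model = BayesianRidge().fit(xtrain, ytrain)

# return_std=True returns the predictive mean and standard deviation.
ymean, ystd = model.predict(xtest, return_std=True)

rmse = np.sqrt(mean_squared_error(ytest, ymean))
print("RMSE: %.2f" % rmse)

# Approximate 95% predictive interval around each prediction.
lower = ymean - 1.96 * ystd
upper = ymean + 1.96 * ystd
coverage = np.mean((ytest >= lower) & (ytest <= upper))
print("Share of test targets inside the interval: %.2f" % coverage)
```

The RMSE is in the same units as the target, which makes it easier to read than the MSE.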

Finally, we'll plot the original test values against the predicted values to visualize the result.

x_ax = range(len(ytest))
plt.scatter(x_ax, ytest, s=5, color="blue", label="original")
plt.plot(x_ax, ypred, lw=0.8, color="red", label="predicted")
plt.legend()
plt.show()

In this post, we've briefly learned how to use the BayesianRidge estimator in Python. The full source code is listed below.

Source code listing

from sklearn.linear_model import BayesianRidge
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt
from numpy import sqrt

boston = load_boston()
x, y = boston.data, boston.target
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.15)

bay_ridge = BayesianRidge()
print(bay_ridge)

bay_ridge.fit(xtrain, ytrain)

score = bay_ridge.score(xtrain, ytrain)
print("Model score (R-squared): %.2f" % score)

ypred = bay_ridge.predict(xtest)
mse = mean_squared_error(ytest, ypred)
print("MSE: %.2f" % mse)
print("RMSE: %.2f" % sqrt(mse))

x_ax = range(len(ytest))
plt.scatter(x_ax, ytest, s=5, color="blue", label="original")
plt.plot(x_ax, ypred, lw=0.8, color="red", label="predicted")
plt.legend()
plt.show()

References:

sklearn.linear_model.BayesianRidge
