In this tutorial, we'll learn how to fit regression data with the LARS and Lasso LARS algorithms in Python. We'll use scikit-learn's Lars and LassoLars estimators and the Boston housing dataset. The post covers:
- Preparing the data
- How to use LARS
- How to use Lasso LARS
- Source code listing
from sklearn import linear_model
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt
from numpy import sqrt
Preparing the data
We'll load the Boston dataset and split it into train and test parts. (Note that load_boston was deprecated in scikit-learn 1.0 and removed in 1.2, so the snippets that use it require an older version of the library.)
boston = load_boston()
x, y = boston.data, boston.target
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.15)
How to use LARS
We'll define the model with the Lars() class (with default parameters) and fit it on the train data.
lars = linear_model.Lars().fit(xtrain, ytrain)
print(lars)
Lars(copy_X=True, eps=2.220446049250313e-16, fit_intercept=True, fit_path=True,
     n_nonzero_coefs=500, normalize=True, positive=False, precompute='auto',
     verbose=False)

Then we check the model coefficients.
print(lars.coef_)
[-1.16800795e-01 1.02016954e-02 -2.99472206e-01 4.21380667e+00 -2.18450214e+01 4.01430635e+00 -9.90351759e-03 -1.60916999e+00 -2.32195752e-01 2.80140313e-02 -1.08077980e+00 1.07377184e-02 -5.02331702e-01]
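To make these numbers easier to read, each coefficient can be paired with its feature name (for the Boston data you would zip boston.feature_names with lars.coef_). The sketch below illustrates the idea on a small synthetic dataset with made-up feature names:

```python
import numpy as np
from sklearn import linear_model

# Synthetic data for illustration: 4 features with known true weights.
rng = np.random.RandomState(0)
X = rng.randn(100, 4)
y = X @ np.array([1.5, 0.0, -2.0, 0.5]) + rng.randn(100) * 0.1

lars = linear_model.Lars().fit(X, y)

# Print each feature name next to its fitted coefficient.
for name, coef in zip(["f0", "f1", "f2", "f3"], lars.coef_):
    print("%3s % .4f" % (name, coef))
```

With the low noise level used here, the fitted coefficients land close to the true weights used to generate the data.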
Next, we'll predict test data and check the MSE and RMSE metrics.
ypred = lars.predict(xtest)
mse = mean_squared_error(ytest, ypred)
print("MSE: %.2f" % mse)
MSE: 36.96
print("RMSE: %.2f" % sqrt(mse))
RMSE: 6.08
Finally, we'll create the plot to visualize the original and predicted data.
x_ax = range(len(ytest))
plt.scatter(x_ax, ytest, s=5, color="blue", label="original")
plt.plot(x_ax, ypred, lw=0.8, color="red", label="predicted")
plt.legend()
plt.show()
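Beyond the defaults, Lars also accepts an n_nonzero_coefs parameter that caps how many features are allowed to enter the model, which is useful when a sparse solution is wanted. A minimal sketch on synthetic data (not the Boston set):

```python
import numpy as np
from sklearn import linear_model

# Synthetic data: only features 0 and 2 actually drive the target.
rng = np.random.RandomState(7)
X = rng.randn(150, 6)
y = X[:, 0] * 2.0 + X[:, 2] * -1.0 + rng.randn(150) * 0.2

# Cap the model at 2 nonzero coefficients; LARS stops once
# two features have entered the active set.
model = linear_model.Lars(n_nonzero_coefs=2).fit(X, y)
print("nonzero coefficients:", np.count_nonzero(model.coef_))
```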
How to use Lasso Lars
LassoLars is a Lasso model fitted with the LARS algorithm. We'll define the model with the LassoLars() class, setting the alpha parameter to 0.1, and fit it on the train data.
lassolars = linear_model.LassoLars(alpha=0.1).fit(xtrain, ytrain)
print(lassolars)
LassoLars(alpha=0.1, copy_X=True, eps=2.220446049250313e-16, fit_intercept=True,
          fit_path=True, max_iter=500, normalize=True, positive=False,
          precompute='auto', verbose=False)

We can check the coefficients.
print(lassolars.coef_)
[ 0. 0. 0. 0. 0. 3.00873485 0. 0. 0. 0. -0.28423008 0. -0.42849354]
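As the output shows, most coefficients are exactly zero: the L1 penalty prunes features, and larger alpha values prune more aggressively. A small sketch on synthetic data (not the Boston set) illustrating this effect:

```python
import numpy as np
from sklearn import linear_model

# Synthetic data: only features 0 and 1 carry signal.
rng = np.random.RandomState(42)
X = rng.randn(200, 10)
y = X[:, 0] * 3.0 + X[:, 1] * -2.0 + rng.randn(200) * 0.5

# Larger alpha drives more LassoLars coefficients exactly to zero.
for alpha in (0.01, 0.1, 1.0):
    model = linear_model.LassoLars(alpha=alpha).fit(X, y)
    print("alpha=%.2f  nonzero coefs: %d"
          % (alpha, np.count_nonzero(model.coef_)))
```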
Next, we'll predict test data and check the MSE and RMSE metrics.
ypred = lassolars.predict(xtest)
mse = mean_squared_error(ytest, ypred)
print("MSE: %.2f" % mse)
MSE: 45.59
print("RMSE: %.2f" % sqrt(mse))
RMSE: 6.75
Finally, we'll create the plot to visualize the original and predicted data.
x_ax = range(len(ytest))
plt.scatter(x_ax, ytest, s=5, color="blue", label="original")
plt.plot(x_ax, ypred, lw=0.8, color="red", label="predicted")
plt.legend()
plt.show()
In this tutorial, we've briefly learned how to fit and predict regression data with LARS and Lasso Lars algorithms.
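As a possible next step, rather than fixing alpha by hand as we did above, scikit-learn's LassoLarsCV estimator selects it by cross-validation. A hedged sketch on synthetic data (not the Boston set):

```python
import numpy as np
from sklearn import linear_model
from sklearn.model_selection import train_test_split

# Synthetic regression data for illustration.
rng = np.random.RandomState(1)
X = rng.randn(300, 8)
y = X[:, 0] * 2.0 - X[:, 3] * 1.5 + rng.randn(300) * 0.3

xtrain, xtest, ytrain, ytest = train_test_split(
    X, y, test_size=0.15, random_state=1)

# LassoLarsCV picks alpha via 5-fold cross-validation on the train set.
model = linear_model.LassoLarsCV(cv=5).fit(xtrain, ytrain)
print("chosen alpha: %.5f" % model.alpha_)
print("test R^2: %.3f" % model.score(xtest, ytest))
```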
Source code listing
from sklearn import linear_model
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt
from numpy import sqrt

boston = load_boston()
x, y = boston.data, boston.target
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.15)

lars = linear_model.Lars().fit(xtrain, ytrain)
print(lars)
print(lars.coef_)

ypred = lars.predict(xtest)
mse = mean_squared_error(ytest, ypred)
print("MSE: %.2f" % mse)
print("RMSE: %.2f" % sqrt(mse))

x_ax = range(len(ytest))
plt.scatter(x_ax, ytest, s=5, color="blue", label="original")
plt.plot(x_ax, ypred, lw=0.8, color="red", label="predicted")
plt.legend()
plt.show()

lassolars = linear_model.LassoLars(alpha=0.1).fit(xtrain, ytrain)
print(lassolars)
print(lassolars.coef_)

ypred = lassolars.predict(xtest)
mse = mean_squared_error(ytest, ypred)
print("MSE: %.2f" % mse)
print("RMSE: %.2f" % sqrt(mse))

x_ax = range(len(ytest))
plt.scatter(x_ax, ytest, s=5, color="blue", label="original")
plt.plot(x_ax, ypred, lw=0.8, color="red", label="predicted")
plt.legend()
plt.show()
References and further reading:
- Least Angle Regression, by Bradley Efron, Trevor Hastie, Iain Johnstone, and Robert Tibshirani (2004)
- Least-Angle Regression, Wikipedia
- sklearn.linear_model.Lars
- sklearn.linear_model.LassoLars