The post covers:
- Creating time series data with pandas.
- Decomposing time series data.
- Forecasting with ARMA/ARIMA models.
import random
import pandas as pd
import matplotlib.pyplot as plt
# Note: ARMA and ARIMA from statsmodels.tsa.arima_model only work on
# statsmodels releases before 0.13.
from statsmodels.tsa.arima_model import ARIMA
from statsmodels.tsa.arima_model import ARMA
from statsmodels.tsa.seasonal import seasonal_decompose
Creating time series data with pandas
For testing purposes, I'll create time series data with the following function.
def CreateTSData(N):
    # Upward trend (i/20) plus two uniform noise terms
    values = [i/20 + random.uniform(-12, 8) + random.uniform(-1, 1)
              for i in range(N)]
    return pd.DataFrame({'value': values})

N = 400    # total number of rows
days = 10  # days to forecast
df = CreateTSData(N)
# pd.date_range replaces the DatetimeIndex(start=..., periods=...) constructor,
# which newer pandas versions no longer accept
df.index = pd.date_range(start='2000-01-01', periods=N, freq='D')
df.head()
               value
2000-01-01 -0.802450
2000-01-02 -0.147009
2000-01-03 -1.862958
2000-01-04  5.919821
2000-01-05  2.061787
Decomposing time series data
Time series decomposition is a method for splitting a data series into components: trend, seasonal, and irregular (noise).
- The trend component reflects the overall direction of the data. It captures the long-term rise or fall of the series level over time.
- The seasonal component is the variation that repeats at regular intervals in the series (e.g., weekly, monthly).
- The irregular (noise) component is the residual, the part that remains after removing the trend and seasonal components.
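To make the three components concrete, here is a minimal sketch of an additive decomposition done by hand in plain Python. It is similar in spirit to a moving-average decomposition, but it is not the code behind seasonal_decompose. The function name decompose_additive, the toy weekly pattern, and the restriction to an odd period are all assumptions made for illustration.

```python
from statistics import mean


def decompose_additive(values, period):
    """Split values into trend, seasonal, and residual parts (additive model)."""
    if period % 2 == 0:
        raise ValueError("this sketch only handles an odd period")
    half = period // 2
    n = len(values)

    # Trend: centered moving average over one full period.
    # It is undefined (None) for the first and last `half` points.
    trend = [None] * n
    for i in range(half, n - half):
        trend[i] = mean(values[i - half:i + half + 1])

    # Seasonal: average detrended value at each position in the cycle,
    # shifted so the seasonal effects average to zero over one period.
    detrended = [v - t if t is not None else None
                 for v, t in zip(values, trend)]
    cycle = [mean(d for d in detrended[pos::period] if d is not None)
             for pos in range(period)]
    offset = mean(cycle)
    seasonal = [cycle[i % period] - offset for i in range(n)]

    # Residual: whatever the trend and seasonal parts do not explain.
    resid = [v - t - s if t is not None else None
             for v, t, s in zip(values, trend, seasonal)]
    return trend, seasonal, resid


# Toy data (hypothetical): a linear trend plus a weekly pattern that sums to zero.
pattern = [0, 1, 2, 3, 2, 1, -9]
values = [0.5 * i + pattern[i % 7] for i in range(28)]
trend, seasonal, resid = decompose_additive(values, period=7)
print(trend[3:6])
print(seasonal[:7])
```

Because the toy series has no noise, the recovered trend is the straight line, the seasonal part is the weekly pattern, and the residual is zero wherever the trend is defined. On the random data used in this post, the residual would carry most of the variation.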
decomp = seasonal_decompose(df["value"])
decomp.plot()
plt.show()
Forecasting with ARMA/ARIMA models
Autoregressive Moving Average (ARMA) and Autoregressive Integrated Moving Average (ARIMA) are commonly used models for forecasting time series data. The ARMA model takes (p, q) values and the ARIMA model takes (p, d, q) values, where p, d, and q are non-negative integers with the following meanings:
p - the number of lag observations in the model, also known as the autoregressive (AR) order.
d - the number of times the raw observations are differenced, also known as the degree of differencing.
q - the size of the moving average window, also known as the moving average (MA) order.
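The d parameter is the easiest of the three to see in isolation. The sketch below shows what differencing a series once or twice does. The helper name difference and the sample values are made up for illustration and are not part of statsmodels.

```python
def difference(values, d=1):
    """Apply first-order differencing d times."""
    result = list(values)
    for _ in range(d):
        # Each pass replaces the series with the changes between neighbours,
        # so the result is one element shorter.
        result = [b - a for a, b in zip(result, result[1:])]
    return result


squares = [1, 4, 9, 16, 25]
print(difference(squares, d=1))  # [3, 5, 7, 9]
print(difference(squares, d=2))  # [2, 2, 2]
```

One pass removes a linear trend and two passes remove a quadratic one, which is why a small d is often enough to make a trending series look stationary. With d = 0 no differencing is applied, and ARIMA(p, 0, q) reduces to ARMA(p, q).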
ARIMA model
The model is created with the ARIMA class. After fitting it, you can check the model summary as shown below.
arima = ARIMA(df, order=(10, 0, 0))
arima = arima.fit()
arima.summary()
Next, we forecast the next 10 days and plot the result alongside the original data.
plt.plot(df)
plt.plot(arima.predict(1, N + days), color="red")
plt.show()
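The part of the red line that extends past the last observation is an out-of-sample forecast. For a pure AR model, such a forecast can be built recursively: each predicted value is fed back in as if it were an observation, with the future noise terms set to their expected value of zero. The sketch below illustrates that idea in plain Python. The function name ar_forecast and the coefficients are hypothetical and are not taken from the fitted model above.

```python
def ar_forecast(history, const, coefs, steps):
    """Forecast `steps` values from an AR(p) model.

    The model is y_t = const + coefs[0]*y_{t-1} + ... + coefs[p-1]*y_{t-p}.
    """
    p = len(coefs)
    if len(history) < p:
        raise ValueError("need at least p past observations")
    window = list(history[-p:])  # the p most recent values, oldest first
    forecasts = []
    for _ in range(steps):
        # coefs[0] applies to the most recent value, so walk the window backwards.
        y_next = const + sum(c * y for c, y in zip(coefs, reversed(window)))
        forecasts.append(y_next)
        # Slide the window: drop the oldest value, append the new forecast.
        window = window[1:] + [y_next]
    return forecasts


# AR(1) with made-up parameters: y_t = 1 + 0.5 * y_{t-1}
print(ar_forecast([4.0], const=1.0, coefs=[0.5], steps=3))  # [3.0, 2.5, 2.25]
```

With a stationary AR model, the forecasts settle toward the long-run mean of the series as the horizon grows (2.0 for the AR(1) parameters used here), so forecasts far ahead tend to flatten out.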
ARMA model
This time we use the ARMA class and fit the model.
arma = ARMA(df, order=(2, 1))
arma = arma.fit()
arma.summary()
Next, we forecast the next 10 days and plot the result alongside the original data.
plt.plot(df)
plt.plot(arma.predict(1, N + days), color="red")
plt.show()
In this post, we have briefly learned how to decompose and forecast time series data in Python. I hope you have found it useful.
The full source code is listed below.
import random
import pandas as pd
import matplotlib.pyplot as plt
# Note: ARMA and ARIMA from statsmodels.tsa.arima_model only work on
# statsmodels releases before 0.13.
from statsmodels.tsa.arima_model import ARIMA
from statsmodels.tsa.arima_model import ARMA
from statsmodels.tsa.seasonal import seasonal_decompose

def CreateTSData(N):
    # Upward trend (i/20) plus two uniform noise terms
    values = [i/20 + random.uniform(-12, 8) + random.uniform(-1, 1)
              for i in range(N)]
    return pd.DataFrame({'value': values})

N = 400    # total number of rows
days = 10  # days to forecast
df = CreateTSData(N)
df.index = pd.date_range(start='2000-01-01', periods=N, freq='D')
df.head()

# Decompose into trend, seasonal, and residual components
decomp = seasonal_decompose(df["value"])
decomp.plot()
plt.show()

# ARIMA model
arima = ARIMA(df, order=(10, 0, 0))
arima = arima.fit()
arima.summary()

plt.plot(df)
plt.plot(arima.predict(1, N + days), color="red")
plt.show()

# ARMA model
arma = ARMA(df, order=(2, 1))
arma = arma.fit()
arma.summary()

plt.plot(df)
plt.plot(arma.predict(1, N + days), color="red")
plt.show()