Forecasting Metrics.

Total Error (TE)

$\sum^{n}_{i}(y_{i} - \hat{y}_{i})$

__total_error(ts:array, f:array)

Computes the total error:

$\sum^{n}_{i}(ts[i] - f[i]), n = len(ts) = len(f)$

Ignores NaN values in the time series or the forecast.


Parameters

ts : np.array with the time series
f : np.array with the forecast


Returns

ts, f = np.array([1, 2, 3]), np.array([.5, 2.5, 2])
assert __total_error(ts=ts, f=f) == 1.

ts, f = np.array([1, 2, 3, np.nan]), np.array([.5, 2.5, 2, 1])
assert __total_error(ts=ts, f=f) == 1
df, _ = moving_average.SMA(
    1,
    df=simulate_data.pandas_time_series(),
)

__total_error(ts=df['time_series'].to_numpy(), f=df['ma_1'].to_numpy())
2.795063162712527
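
The NaN handling described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the name `total_error_sketch` is hypothetical, and it assumes that a NaN in either array simply removes that pair from the sum.

```python
import numpy as np


def total_error_sketch(ts, f):
    """Sum of signed errors ts[i] - f[i], skipping pairs that contain NaN."""
    ts = np.asarray(ts, dtype=float)
    f = np.asarray(f, dtype=float)
    # A NaN in either array makes the difference NaN; np.nansum treats
    # NaN as zero, so that pair drops out of the sum.
    return float(np.nansum(ts - f))


ts = np.array([1, 2, 3, np.nan])
f = np.array([0.5, 2.5, 2, 1])
print(total_error_sketch(ts, f))  # 0.5 - 0.5 + 1.0 = 1.0
```

Because positive and negative errors cancel, a total error close to zero indicates low bias rather than an accurate forecast.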

Absolute Error (AE)

$ \sum^{n}_{i} \vert y(t_{i}) - \hat{y}(t_{i})\vert $

__absolute_error(ts:array, f:array)

Computes the absolute error:

$\sum^{n}_{i}|ts[i] - f[i]|, n = len(ts) = len(f)$

Ignores NaN values in the time series or the forecast.


Parameters

ts : np.array with the time series
f : np.array with the forecast


Returns

ts, f = np.array([1, 2, 3]), np.array([.5, 2.5, 2])
assert __absolute_error(ts=ts, f=f) == 2

ts, f = np.array([np.nan, 1, 2, 3]), np.array([100, .5, 2.5, 2])
assert __absolute_error(ts=ts, f=f) == 2

Squared Error (SE)

$ \sum^{n}_{i} \vert y(t_{i}) - \hat{y}(t_{i})\vert^2 $

__squared_error(ts:array, f:array)

Computes the squared error:

$\sum^{n}_{i}|ts[i] - f[i]|^{2}, \quad n = len(ts) = len(f)$

Ignores NaN values in the time series or the forecast.


Parameters

ts : np.array with the time series
f : np.array with the forecast


Returns

ts, f = np.array([1, 2, 3]), np.array([.5, 2.5, 2])
assert __squared_error(ts=ts, f=f) == 2 * (.5)**2 + 1

ts, f = np.array([1, 2, np.nan, 3]), np.array([.5, 2.5, 10**3, 2])
assert __squared_error(ts=ts, f=f) == 2 * (.5)**2 + 1
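
To show how the absolute and squared errors differ in practice, here is a small sketch. The helper names `absolute_error_sketch` and `squared_error_sketch` are hypothetical stand-ins, written under the assumption that NaN pairs are dropped from the sum. The two forecasts have the same absolute error, but the squared error weighs the single large miss more heavily.

```python
import numpy as np


def absolute_error_sketch(ts, f):
    """Sum of |ts[i] - f[i]| over pairs without NaN."""
    diff = np.asarray(ts, dtype=float) - np.asarray(f, dtype=float)
    return float(np.nansum(np.abs(diff)))


def squared_error_sketch(ts, f):
    """Sum of (ts[i] - f[i])**2 over pairs without NaN."""
    diff = np.asarray(ts, dtype=float) - np.asarray(f, dtype=float)
    return float(np.nansum(diff ** 2))


ts = np.array([1.0, 2.0, 3.0])
small_misses = np.array([0.0, 3.0, 2.0])  # three errors of size 1
one_big_miss = np.array([1.0, 2.0, 6.0])  # one error of size 3

# Same absolute error for both forecasts: 3.0 and 3.0
print(absolute_error_sketch(ts, small_misses), absolute_error_sketch(ts, one_big_miss))
# Different squared error: 3.0 versus 9.0
print(squared_error_sketch(ts, small_misses), squared_error_sketch(ts, one_big_miss))
```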

Mean Error

$\frac{1}{n}\sum^{n}_{i}(y_{i} - \hat{y}_{i})$

__mean_error(ts:array, f:array)

Computes the mean error:

$(1/n)\sum^{n^{*}}_{i}(ts[i] - f[i]), \quad n^{*} = len(ts) = len(f)$

Ignores NaN values in the time series or the forecast. The value n is n* minus the number of ignored values.


Parameters

ts : np.array with the time series
f : np.array with the forecast


Returns

ts, f = np.array([1, 2, 3]), np.array([.5, 2.5, 2])
assert __mean_error(ts=ts, f=f) == (1 / 3) * (1)

ts, f = np.array([1, 2, 10**3, 3]), np.array([.5, 2.5, np.nan, 2])
assert __mean_error(ts=ts, f=f) == (1 / 3) * (1)

ts, f = np.array([np.nan, np.nan]), np.array([.5, 2.5])
assert abs(__mean_error(ts=ts, f=f)) < 10**(-20)
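
The role of n (the number of pairs that survive the NaN filter) can be made explicit with a short sketch. The function `mean_error_sketch` is hypothetical, and returning 0.0 when no valid pair remains is an assumption chosen to agree with the all-NaN assertion above; the actual implementation may reach that result differently.

```python
import numpy as np


def mean_error_sketch(ts, f):
    """Mean of ts[i] - f[i] over the n pairs in which neither value is NaN."""
    diff = np.asarray(ts, dtype=float) - np.asarray(f, dtype=float)
    valid = ~np.isnan(diff)
    n = int(valid.sum())  # n = n* minus the ignored pairs
    if n == 0:
        # Assumption: with nothing left to average, report 0.0.
        return 0.0
    return float(diff[valid].sum() / n)


ts = np.array([1, 2, 10**3, 3])
f = np.array([0.5, 2.5, np.nan, 2])
# The third pair is ignored, so the sum 1.0 is divided by 3, not 4.
print(mean_error_sketch(ts, f))
```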

Mean Absolute Error (MAE)

$ \frac{1}{n}\sum^{n}_{i} \vert y(t_{i}) - \hat{y}(t_{i})\vert $

__mean_absolute_error(ts:array, f:array)

Computes the mean absolute error:

$$(1/n)\sum^{n}_{i} | ts[i] - f[i]|, n = len(ts) = len(f)$$

Ignores NaN values in the time series or the forecast.


Parameters

ts : np.array with the time series
f : np.array with the forecast


Returns

ts, f = np.array([1, 2, 3]), np.array([.5, 2.5, 2])
assert __mean_absolute_error(ts=ts, f=f) == (1 / 3)*(2)


ts, f = np.array([np.nan, np.nan]), np.array([.5, 2.5])
assert abs(__mean_absolute_error(ts=ts, f=f)) < 10**(-20)

Mean Squared Error (MSE)

$\frac{1}{n} \sum^{n}_{i}(y_{i} - \hat{y}_{i})^{2}$

__mean_squared_error(ts:array, f:array)

Computes the mean squared error:

$$(1/n)\sum^{n}_{i} | ts[i] - f[i]|^{2}, \quad n = len(ts) = len(f)$$

Ignores NaN values in the time series or the forecast.


Parameters

ts : np.array with the time series
f : np.array with the forecast


Returns

ts, f = np.array([1, 2, 3]), np.array([.5, 2.5, 2])
assert __mean_squared_error(ts=ts, f=f) == (1 / 3) * (2 * .5**2 + 1)

ts, f = np.array([np.nan, np.nan]), np.array([.5, 2.5])
assert __mean_squared_error(ts=ts, f=f) < 10**(-20)

Root Mean Square Error (RMSE)

$\sqrt{\frac{1}{n} \sum^{n}_{i}(y_{i} - \hat{y}_{i})^{2}}$

__root_mean_square_error(ts:array, f:array)

Computes the root mean square error:

$$\sqrt{(1/n)\sum^{n}_{i} | ts[i] - f[i]|^{2}}, \quad n = len(ts) = len(f)$$

Ignores NaN values in the time series or the forecast.


Parameters

ts : np.array with the time series
f : np.array with the forecast


Returns

ts, f = np.array([1, 2, 3]), np.array([.5, 2.5, 2])
assert __root_mean_square_error(ts=ts, f=f) == np.sqrt((1 / 3) * (2 * .5**2 + 1))

ts, f = np.array([np.nan, np.nan]), np.array([.5, 2.5])
assert __root_mean_square_error(ts=ts, f=f) < 10**(-20)
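
The three mean-based metrics (MAE, MSE, RMSE) are closely related, which a few lines of plain NumPy make visible. This sketch uses the NaN-free sample from the assertions, so no filtering is needed; it is an illustration of the formulas and not the library's code.

```python
import numpy as np

ts = np.array([1.0, 2.0, 3.0])
f = np.array([0.5, 2.5, 2.0])

errors = ts - f                # [0.5, -0.5, 1.0]
mae = np.mean(np.abs(errors))  # (0.5 + 0.5 + 1.0) / 3
mse = np.mean(errors ** 2)     # (0.25 + 0.25 + 1.0) / 3 = 0.5
rmse = np.sqrt(mse)            # square root of the MSE, in the units of ts

print(mae, mse, rmse)
```

Taking the square root brings the MSE back to the units of the time series, and the RMSE is never smaller than the MAE computed on the same errors.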

Mean Percentage Error (MPE)

$\frac{1}{n} \sum^{n}_{i}\frac{y_{i} - \hat{y}_{i}} { y_i} $

__mean_percentage_error(ts:array, f:array)

Computes the mean percentage error:

$$(1/n)\sum^{n^{*}}_{i} (ts[i] - f[i]) / ts[i], \quad n^{*} = len(ts) = len(f)$$

Ignores NaN values and divisions by zero. The value n is n* minus the number of ignored values.


Parameters

ts : np.array with the time series
f : np.array with the forecast


Returns

ts, f = np.array([1, 2, 3]), np.array([.5, 2.5, 2])
assert __mean_percentage_error(ts=ts, f=f) == (.5 / 2 + 1 / 3) * (1 / 3)

ts, f = np.array([np.nan, np.nan]), np.array([.5, 2.5])
assert __mean_percentage_error(ts=ts, f=f) < 10**(-20)
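
Skipping divisions by zero can be handled with an explicit validity mask. The sketch below is one possible approach under stated assumptions: `mean_percentage_error_sketch` is a hypothetical name, pairs with a zero in the time series are dropped before dividing, and an empty selection yields 0.0.

```python
import numpy as np


def mean_percentage_error_sketch(ts, f):
    """Mean of (ts[i] - f[i]) / ts[i], skipping NaN pairs and ts[i] == 0."""
    ts = np.asarray(ts, dtype=float)
    f = np.asarray(f, dtype=float)
    # Keep only the pairs for which the ratio is well defined.
    valid = ~np.isnan(ts) & ~np.isnan(f) & (ts != 0)
    n = int(valid.sum())
    if n == 0:
        return 0.0  # assumption: nothing to average yields 0.0
    return float(np.sum((ts[valid] - f[valid]) / ts[valid]) / n)


ts = np.array([1.0, 2.0, 0.0, 3.0, np.nan])
f = np.array([0.5, 2.5, 1.0, 2.0, 4.0])
# Only 3 of the 5 pairs are used: the zero and the NaN are skipped.
print(mean_percentage_error_sketch(ts, f))
```

Filtering before the division avoids NumPy's divide-by-zero warnings and keeps n equal to the number of ratios that enter the mean.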

Mean Absolute Percent Error (MAPE)

$\frac{1}{n} \sum^{n}_{i}\vert \frac{y_{i} - \hat{y}_{i}} { y_i} \vert $

__mean_absolute_percent_error(ts:array, f:array)

Computes the mean absolute percentage error:

$$(1/n)\sum^{n^{*}}_{i} \vert (ts[i] - f[i]) / ts[i] \vert, \quad n^{*} = len(ts) = len(f)$$

Ignores NaN values and divisions by zero. The value n is n* minus the number of ignored values.


Parameters

ts : np.array with the time series
f : np.array with the forecast


Returns

ts, f = np.array([1, 2, 3]), np.array([.5, 2.5, 2])
assert __mean_absolute_percent_error(ts=ts,
                                     f=f) == np.divide(.5 * (3. / 2) + (1. / 3),
                                                       3)

ts, f = np.array([np.nan, np.nan]), np.array([.5, 2.5])
assert __mean_absolute_percent_error(ts=ts, f=f) < 10**(-20)
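
As a worked check of the first assertion, the MAPE for ts = [1, 2, 3] and f = [0.5, 2.5, 2] can be written out term by term:

$$\frac{1}{3}\left(\left\vert\frac{1 - 0.5}{1}\right\vert + \left\vert\frac{2 - 2.5}{2}\right\vert + \left\vert\frac{3 - 2}{3}\right\vert\right) = \frac{1}{3}\left(0.5 + 0.25 + \frac{1}{3}\right) = \frac{13}{36} \approx 0.361$$

For the same data the MPE is $(0.5 - 0.25 + 1/3)/3 = 7/36 \approx 0.194$: the negative second term partly cancels the others there, while the absolute value in MAPE prevents that cancellation.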

Summary

Summary of metrics

SUMMARY(df:DataFrame=None, val_col:str=None, pred_cols:List[str]=None)

Summary of Prediction Metrics


Parameters

df : dataframe, default None. If None, a simulated dataframe is generated.
val_col : str, default None. Name of the column with the actual values. It should be provided when a dataframe is provided.
pred_cols : List[str], default None. Names of the columns with the predictions. If not provided, all dataframe columns except val_col are used.


Returns

dataframe : Summary results

SUMMARY()

                            ma_1       ma_4
Error
total                   5.014113   7.783203
absolute               21.970438  22.912183
squared                25.868284  30.116424
mean                    0.172900   0.299354
mean absolute           0.757601   0.881238
mean squared            0.892010   1.158324
root mean square        0.944463   1.076255
mean percentage         0.016861   0.027478
mean absolute percent   0.127818   0.144065
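
A table of this shape can be assembled with pandas from per-column error arrays. The sketch below is illustrative only: `summary_sketch` is a hypothetical helper, it covers just four of the nine metrics, and it assumes NaN pairs are dropped before aggregating. It is not the implementation of `SUMMARY`.

```python
import numpy as np
import pandas as pd


def summary_sketch(df, val_col, pred_cols=None):
    """Tabulate a few error metrics, one column per forecast."""
    if pred_cols is None:
        # Mirror the documented default: every column except val_col.
        pred_cols = [c for c in df.columns if c != val_col]
    ts = df[val_col].to_numpy(dtype=float)
    rows = {}
    for col in pred_cols:
        err = ts - df[col].to_numpy(dtype=float)
        err = err[~np.isnan(err)]  # drop pairs that contain NaN
        rows[col] = {
            "total": err.sum(),
            "absolute": np.abs(err).sum(),
            "squared": (err ** 2).sum(),
            "mean": err.mean() if err.size else 0.0,
        }
    # Outer keys become columns, inner keys become the row index.
    out = pd.DataFrame(rows)
    out.index.name = "Error"
    return out


df = pd.DataFrame({"time_series": [1.0, 2.0, 3.0], "ma_1": [0.5, 2.5, 2.0]})
print(summary_sketch(df, val_col="time_series"))
```

Keeping one column per forecast makes it easy to compare models row by row, as in the `ma_1` versus `ma_4` output above.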

Dataframes and Figures

Generates a Time Series Dataframe and a Figure Object

The Values of The Time Series are Simulated

Includes Forecasting with Moving Averages