## 3.5 Prediction intervals

As discussed in Section 1.7, a prediction interval gives an interval within which we expect \(y_{t}\) to lie with a specified probability. For example, assuming that the forecast errors are uncorrelated and normally distributed, a simple 95% prediction interval for the next observation in a time series is \[ \hat{y}_{t} \pm 1.96 \hat\sigma, \] where \(\hat\sigma\) is an estimate of the standard deviation of the forecast distribution. In this book we usually calculate 80% intervals and 95% intervals, although any percentage may be used.

When forecasting one step ahead, the standard deviation of the forecast distribution is almost the same as the standard deviation of the residuals. (In fact, the two standard deviations are identical if there are no parameters to be estimated, as is the case with the naïve method. For forecasting methods involving parameters to be estimated, the standard deviation of the forecast distribution is slightly larger than the residual standard deviation, although this difference is often ignored.)

For example, consider a naïve forecast for the Dow-Jones Index. The last value of the observed series is 3830, so the forecast of the next value of the DJI is 3830. The standard deviation of the residuals from the naïve method is 22.00. Hence, a 95% prediction interval for the next value of the DJI is \[ 3830 \pm 1.96(22.00) = [3787, 3873]. \] Similarly, an 80% prediction interval is given by \[ 3830 \pm 1.28(22.00) = [3802, 3858]. \]
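The arithmetic in this example can be checked with a few lines of base R. This is only a sketch: the variable names are illustrative, and the inputs are the rounded figures quoted in the text rather than values computed from the data.

```r
# Inputs quoted in the text (rounded values, not recomputed from the data)
point_forecast <- 3830   # last observed value, which is the naive forecast
sigma_hat <- 22.00       # residual standard deviation from the naive method

# Multipliers for the 95% and 80% intervals, as used in the text
k <- c("95%" = 1.96, "80%" = 1.28)

# Lower and upper limits: point forecast minus/plus k * sigma_hat
intervals <- cbind(
  lower = point_forecast - k * sigma_hat,
  upper = point_forecast + k * sigma_hat
)

# Round to whole index points for comparison with the intervals in the text
round(intervals)
```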

The value of the multiplier (1.96 or 1.28) determines the percentage of the prediction interval. The following table gives the values to be used for different percentages.

Percentage | Multiplier |
---|---|
50 | 0.67 |
55 | 0.76 |
60 | 0.84 |
65 | 0.93 |
70 | 1.04 |
75 | 1.15 |
80 | 1.28 |
85 | 1.44 |
90 | 1.64 |
95 | 1.96 |
96 | 2.05 |
97 | 2.17 |
98 | 2.33 |
99 | 2.58 |
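Under the normality assumption, these multipliers are quantiles of the standard normal distribution: for a P% interval, the multiplier leaves (100 − P)/2 percent of the probability in each tail. The following sketch computes them with base R's `qnorm()`; the helper name `pi_multiplier` is hypothetical and is introduced here only for illustration.

```r
# Multiplier for a two-sided normal prediction interval.
# `percentage` is the nominal coverage in percent (e.g. 80 or 95).
pi_multiplier <- function(percentage) {
  # Upper-tail cut-off: 0.5 + P/200, e.g. 0.975 for a 95% interval
  qnorm(0.5 + percentage / 200)
}

percentages <- c(50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 96, 97, 98, 99)

# Tabulate the multipliers rounded to two decimals
data.frame(
  Percentage = percentages,
  Multiplier = round(pi_multiplier(percentages), 2)
)
```

The same helper can be used for percentages that are not in the table, for example `pi_multiplier(99.5)`.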

The use of this table and the formula \(\hat{y}_{t} \pm k \hat\sigma\) (where \(k\) is the multiplier) assumes that the residuals are normally distributed and uncorrelated. If either of these conditions does not hold, then this method of producing a prediction interval cannot be used.

The value of prediction intervals is that they express the uncertainty in the forecasts. If we only produce point forecasts, there is no way of telling how accurate the forecasts are. However, if we also produce prediction intervals, then it is clear how much uncertainty is associated with each forecast. For this reason, point forecasts can be of almost no value without the accompanying prediction intervals.

To produce a prediction interval, it is necessary to have an estimate of the standard deviation of the forecast distribution. For one-step forecasts, the residual standard deviation provides a good estimate of the forecast standard deviation. But for multi-step forecasts, a more complicated method of calculation is required. These calculations are usually done with standard forecasting software and need not trouble the forecaster (unless he or she is writing the software!).

A common feature of prediction intervals is that they increase in length as the forecast horizon increases. The further ahead we forecast, the more uncertainty is associated with the forecast, and thus the wider the prediction intervals. However, there are some (non-linear) forecasting methods that do not have this property.

If a transformation has been used, then the prediction interval should be computed on the transformed scale, and the end points back-transformed to give a prediction interval on the original scale. This approach preserves the probability coverage of the prediction interval, although the interval will no longer be symmetric around the point forecast.
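To illustrate back-transformation, the sketch below assumes a log transformation. The forecast and standard deviation on the log scale are made-up values chosen for illustration; they are not taken from the Dow-Jones example.

```r
# Hypothetical values on the log scale (illustrative only)
log_forecast <- log(3830)  # point forecast of log(y)
log_sigma <- 0.05          # assumed forecast standard deviation of log(y)
k <- qnorm(0.975)          # multiplier for a 95% interval

# Interval on the transformed (log) scale, symmetric about log_forecast
log_interval <- log_forecast + c(-1, 1) * k * log_sigma

# Back-transform the end points and the point forecast to the original scale
interval <- exp(log_interval)
point <- exp(log_forecast)

# Distances from the back-transformed point forecast to each end point
c(below = point - interval[1], above = interval[2] - point)
```

Because the exponential function is convex, the upper end point lies further from the back-transformed point forecast than the lower end point does, which is the asymmetry described above.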

Prediction intervals are computed for you when using any of the benchmark forecasting methods. For example, here is the output when using the naïve method for the Dow-Jones index.

```
naive(dj2)
#>     Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
#> 251           3830  3802  3858  3787  3873
#> 252           3830  3790  3870  3769  3891
#> 253           3830  3781  3879  3755  3905
#> 254           3830  3774  3886  3744  3916
#> 255           3830  3767  3893  3734  3926
#> 256           3830  3761  3899  3724  3936
#> 257           3830  3755  3905  3716  3944
#> 258           3830  3750  3910  3708  3952
#> 259           3830  3745  3915  3701  3959
#> 260           3830  3741  3919  3694  3966
```
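The widening of these intervals can be approximated by hand. For the naïve method with uncorrelated residuals, a standard result is that the h-step forecast standard deviation is \(\hat\sigma\sqrt{h}\), where \(\hat\sigma\) is the residual standard deviation. The sketch below applies that formula using the rounded value 22.00 quoted earlier. It is not the internal code of `naive()`, and because the input is rounded, the limits may differ slightly from the output above.

```r
# Inputs quoted earlier in the section (rounded values)
point_forecast <- 3830
sigma_hat <- 22.00

# Forecast horizons 1 to 10 and the assumed h-step standard deviation
h <- 1:10
sigma_h <- sigma_hat * sqrt(h)   # naive method: sd grows with sqrt(h)

# Normal multipliers for 80% and 95% intervals
k80 <- qnorm(0.90)
k95 <- qnorm(0.975)

# Approximate interval limits, rounded to whole index points
approx_intervals <- data.frame(
  h = h,
  lo80 = round(point_forecast - k80 * sigma_h),
  hi80 = round(point_forecast + k80 * sigma_h),
  lo95 = round(point_forecast - k95 * sigma_h),
  hi95 = round(point_forecast + k95 * sigma_h)
)
approx_intervals
```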

When plotted, the prediction intervals are shown as shaded regions, with the strength of colour indicating the probability associated with each interval.

`autoplot(naive(dj2))`