De Econometrist takes a statistical look at the world.

**For hundreds, maybe thousands of years, humans have tried to predict the future accurately. A well-known example is Nostradamus, who wrote ‘Les Prophéties’, a collection of 942 vague statements claiming to describe future events. Some even insist he predicted COVID-19. This was quickly debunked: no conclusive evidence was ever found, and the language he used is so vague that most of his statements can be interpreted in many ways. Besides Nostradamus, there are many ‘fake’ psychics who claim to be able to ‘see’ the future. I do not believe there is any merit to their claims. That is why I will discuss some real applications of predicting the future and give some insight into why tomorrow is so hard to predict.**

One everyday application of prediction that almost everyone uses is the weather forecast. Although forecasts are not 100 percent reliable, they give a good idea of what the weather tomorrow might look like. The further into the future we forecast, the less accurate the forecast becomes. The weather depends on so many external factors that it is currently still impossible to take all of them into account when making a prediction. This problem of not having all external factors included in our forecasting model is a recurring theme.

When we buy a lottery ticket and choose the numbers we wish to bet on, we make a prediction as well. Correctly predicting the winning lottery numbers might be even harder than correctly predicting the weather a year from now. We have at least some data on the weather of past years and can hence make a rough estimate of the weather next year. For a lottery this is not the case: knowing the winning numbers of previous lotteries provides no edge over other people buying tickets. This is because the lottery is a memoryless process. In probability theory, memoryless means that knowing something about the past is of no use for predicting what happens next: every draw is independent of all previous draws.
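The point about memorylessness can be checked with a small simulation. This is only a sketch under simplifying assumptions: the "lottery" here is a single number drawn uniformly from 1 to 10, and the function name `hit_rate` and both betting strategies are made up for illustration.

```python
import random

def hit_rate(strategy, n_numbers=10, n_draws=100_000, seed=0):
    """Fraction of draws in which the strategy's guess matches the winning number."""
    rng = random.Random(seed)
    previous = rng.randint(1, n_numbers)
    hits = 0
    for _ in range(n_draws):
        guess = strategy(previous)          # the strategy may look at the last winner
        winner = rng.randint(1, n_numbers)  # drawn independently of all earlier draws
        hits += guess == winner
        previous = winner
    return hits / n_draws

repeat_last = hit_rate(lambda prev: prev)  # always bet on the previous winning number
always_seven = hit_rate(lambda prev: 7)    # ignore history entirely
print(repeat_last, always_seven)           # both hover around 1/10
```

Using the history of past winners does no better than ignoring it: both strategies win about one time in ten, which is exactly the chance of a blind guess.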

Betting on a horse race might be a better example, because here you might be able to get some edge by looking at past races. If you know a horse has performed consistently well over a period of time, you could argue that there is a good chance this same horse will perform well in the race you are betting on. This process clearly does have memory, but that is definitely not going to guarantee you a profit. A horse race depends not only on past races, but also on an almost unthinkable number of other external factors. Examples are the current condition of the horse and the weight of the jockey, but factors like temperature and humidity may well play a role in the outcome too. Taking all of these external factors into account when devising a good model might improve your chances by a small amount, but you will never come close to predicting a horse race correctly 100 percent of the time, other than by pure coincidence and luck.

The most accurate forecasts are based on data of past events. Econometricians use time series data and dynamic methods to optimize their forecast models. The simplest model is a random walk, in which each value equals the previous value plus a random shock. This model is not very good at forecasting: its best guess for the future is simply the last observed value, so it ignores any pattern in the earlier data and relies only on a random process.
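A minimal sketch of a random walk and its forecast is given below. It assumes standard normal shocks and a starting value of zero, and the helper names (`simulate_random_walk`, `naive_forecast`, `forecast_mse`) are hypothetical, chosen only for this illustration.

```python
import random

def simulate_random_walk(n_steps, seed=0, start=0.0):
    """Return a path where each value is the previous value plus a N(0, 1) shock."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, 1.0))
    return path

def naive_forecast(path, horizon):
    """Random-walk forecast: repeat the last observed value for every future period."""
    return [path[-1]] * horizon

def forecast_mse(path, horizon):
    """Mean squared error of forecasting path[t + horizon] with path[t]."""
    errors = [path[t + horizon] - path[t] for t in range(len(path) - horizon)]
    return sum(e * e for e in errors) / len(errors)

path = simulate_random_walk(20_000)
print(naive_forecast(path, 3))  # three copies of the last observed value
mse_1 = forecast_mse(path, 1)
mse_4 = forecast_mse(path, 4)
print(mse_1, mse_4)             # the error grows with the forecast horizon
```

In this setup the squared forecast error grows roughly in proportion to the horizon, which is one way to see why a random walk says so little about the more distant future.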

A better model should thus use past data about the time series one wishes to predict. Such a model is called an autoregressive model. In an autoregressive model, one or more lags of the data are used to try to explain the data in the current period. This model exploits autocorrelation: the correlation between observations of the same series that lie a fixed number of periods apart. It can be calculated by keeping the distance in time between observations constant and computing the correlation over all pairs of observations that are that distance apart. If there is a significant correlation at a given distance in time, one should consider adding that lag to the model.
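The autocorrelation calculation described above can be sketched in a few lines. The example assumes a simulated first-order autoregressive series with coefficient 0.7 and standard normal shocks; the function names and the parameter `phi` are illustrative choices, not taken from the article.

```python
import random

def autocorrelation(series, lag):
    """Sample autocorrelation: correlation between values that are `lag` periods apart."""
    n = len(series)
    mean = sum(series) / n
    variance_sum = sum((x - mean) ** 2 for x in series)
    covariance_sum = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return covariance_sum / variance_sum

def simulate_ar1(phi, n_steps, seed=0):
    """Simulate y_t = phi * y_{t-1} + e_t with N(0, 1) shocks, starting at zero."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n_steps - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y

series = simulate_ar1(phi=0.7, n_steps=20_000)
r1 = autocorrelation(series, 1)
r2 = autocorrelation(series, 2)
print(r1, r2)  # close to 0.7 and 0.7 ** 2 for this simulated series
```

For a series like this one, the lag-1 autocorrelation is clearly different from zero, which is the signal that the first lag belongs in the model.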

Even this can still be improved. We can, for example, add a moving average process to the autoregressive model. Combining the two gives an autoregressive moving average model, or ARMA model for short. Without using any external data, this is one of the best models that can be used to forecast. But we can still do better.
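For concreteness, the ARMA model can be written out as follows. The notation is an assumption, since no symbols are fixed in the text: $y_t$ is the series, $\varepsilon_t$ is a white-noise shock, $c$ is a constant, and $p$ and $q$ are the numbers of autoregressive and moving average lags.

$$
y_t = c + \sum_{i=1}^{p} \phi_i \, y_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j}
$$

Setting $q = 0$ gives back the pure autoregressive model, and the special case $p = 1$, $q = 0$, $c = 0$, $\phi_1 = 1$ is the random walk.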

The last model we will consider is the autoregressive distributed lag model, or ARDL for short. An ARDL model is very similar to the ARMA model. The difference is that the moving average part of the ARMA model is built from past random shocks, whereas the distributed lag part of the ARDL model is based on one or more external factors. These external factors are called leading indicators. An ARDL model might improve on the ARMA model, but this is not always true: the model will only improve if the right leading indicators are used. These models can be used to forecast the stock market to a certain degree, but many improvements still have to be made before they give very accurate predictions.
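The effect of a useful leading indicator can be illustrated with simulated data. The sketch below is a deliberately minimal ARDL-style regression with one lag of the series and one lag of a single indicator, no intercept, and in-sample errors only. The data-generating process, the coefficients 0.5 and 0.8, and all function names are assumptions made for this example.

```python
import random

def simulate(n_steps, phi=0.5, beta=0.8, seed=0):
    """Simulate y_t = phi*y_{t-1} + beta*x_{t-1} + e_t for a hypothetical indicator x."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n_steps)]
    y = [0.0]
    for t in range(1, n_steps):
        y.append(phi * y[t - 1] + beta * x[t - 1] + rng.gauss(0.0, 1.0))
    return x, y

def fit_two_regressors(a, b, target):
    """Least squares for target ~ c1*a + c2*b (no intercept), via the normal equations."""
    saa = sum(u * u for u in a)
    sbb = sum(v * v for v in b)
    sab = sum(u * v for u, v in zip(a, b))
    sat = sum(u * w for u, w in zip(a, target))
    sbt = sum(v * w for v, w in zip(b, target))
    det = saa * sbb - sab * sab
    return (sat * sbb - sbt * sab) / det, (sbt * saa - sat * sab) / det

def mean_squared(errors):
    return sum(e * e for e in errors) / len(errors)

x, y = simulate(20_000)
y_lag, x_lag, y_now = y[:-1], x[:-1], y[1:]

# Autoregressive model: only the lagged value of y is used.
phi_ar = sum(u * w for u, w in zip(y_lag, y_now)) / sum(u * u for u in y_lag)
mse_ar = mean_squared([w - phi_ar * u for u, w in zip(y_lag, y_now)])

# ARDL-style model: the lagged value of y plus the lagged leading indicator x.
phi_hat, beta_hat = fit_two_regressors(y_lag, x_lag, y_now)
mse_ardl = mean_squared(
    [w - phi_hat * u - beta_hat * v for u, v, w in zip(y_lag, x_lag, y_now)]
)
print(mse_ar, mse_ardl)  # the model with the indicator has the smaller error
```

Here the indicator really does drive the series, so including it lowers the error. If `x` were replaced by an unrelated series, the estimated `beta_hat` would be close to zero and the two errors would be nearly the same, which matches the remark that only the right leading indicators help.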

In conclusion, a lot of progress is being made in the world of forecasting and predicting. Econometricians use dynamic models to predict things like the weather by using past data on the dependent variable and a number of external variables. These models are, however, still not nearly good enough to accurately predict a horse race or a lottery. Maybe someday humans will crack the code and find the key to the future, but looking at current developments, I predict that this is still very far away, if we ever reach it at all.

*This article was written by David Anthonio*