De Econometrist neemt een statistische kijk op de wereld.

**Computers do what we tell them to do, if you know how to operate them. That is why computers play such an important role in our lives. However, sometimes we want a computer to just do something we cannot predict: to generate (a sequence of) random numbers. Random numbers are useful for a large variety of purposes, such as encryption, simulation and sampling. Now when you think about it for a moment, you may wonder how a computer can generate a purely random number by itself. Is it not possible to predict the numbers if you know in what way they are generated?**

When a coin is flipped fairly, we can assume that the outcome is hardly predictable with a probability very close to 50% for both heads and tails. This is because in the physical world, most events depend on many factors that are extremely hard to predict, especially on the very small, quantum physical level. By itself, a computer cannot make a truly random choice between multiple options, no matter how advanced an algorithm it utilises. When you have access to the algorithm, you can predict the numbers that are going to be generated. Therefore, all methods of true random number generation rely on external, physical factors.

An intelligible example of a method to generate a random number is the following: A photon source capable of emitting single photons releases a photon onto a semi-transparent mirror, after which the photon either gets reflected onto sensor A or goes straight through the mirror and ends up at sensor B. This event is unpredictable and makes for an intrinsically random binary choice.

Now when this process is repeated multiple times, one can generate a sequence of 0’s and 1’s, just to be left with a random binary number, which can be converted to either normal numbers or to letters. However, in today’s life, those randomly generated numbers are required in extremely large amounts and on very short notice. Methods that generate purely random numbers, comparable to the above one, oftentimes do not meet such requirements and therefore, numbers are often generated in a so-called pseudorandom fashion.

As the name suggests (‘pseudes’, Greek for ‘feigned’ or ‘deceived’), a pseudorandom number generator (abbreviated PRNG) will not provide a truly random sequence of numbers. However, by using algorithms, some more complex than others, it can turn a shorter sequence of numbers (the seed) into a sequence that is much larger. This way of generating numbers brings some advantages as well as disadvantages. An example of such an algorithm is the ‘middle-square’ method: the seed (in this case consisting of 6 numbers) is squared after which the 6 middle numbers are added to the sequence and used as the next seed.

This process can then be repeated multiple times to generate a large sequence of pseudorandom numbers. It will, however, eventually repeat itself, when the new seed has already passed through the algorithm. Also, some seeds will cause the algorithm to become stationary, for example when the middle numbers are 0’s at some point.

A PRNG can produce outputs that are close to truly random, so that they will pass thorough statistical tests for independence and randomness. It is, however, never flawless.

Problems that often arise with such generators are:

- Shorter than expected loops for some seeds. (The sequence repeats itself after enough iterations, but sometimes this happens quickly)
- Strong correlation of successive values
- Poor dimensional distribution of the output sequences (some sequences can for example never be outputted by specific generators, making the output biased)

A scatter of some pseudorandom generated sequence could look as follows:

When we take , it is hard to discover a pattern, but when we take realisations, we can clearly see that the sequence is repeating, and thus predictable.

Still, for most purposes, these types of number generators are sufficient. When a very large pseudorandom sequence without repetition is produced, it can be extremely hard to find out what kind of algorithm was used to convert the seed and what this seed was.

In general, the more random we want outputs of an PRNG to be, the more time and memory it takes to generate them.

So, coming back to the name pseudorandom, properly generated numbers often look totally random and pass many, if not all, randomness tests, but there will always be some kind of pattern. Citing American mathematician and computer scientist George Marsaglia: “If the numbers are not random, they are at least higgledy-piggledy.” So if they pass tests and are nearly indistinguishable from real random numbers, what is the issue?

As there exist extremely advanced deciphering hardware and software nowadays, a very strong computer, fed with a large pool of such pseudorandom sequences, might be able to crack the algorithm and provide information needed to predict future outputs. For most purposes, such as simulations, there is no point in doing this, as predicting future outputs is useless. However, there are some uses of number generation that require the algorithms and seeds to remain secret, examples of which include encryption and some online betting games. Also many sorts of slot machines contain a PRNG. Recently, some people have exploited those machines, by observing the outcomes and finding a pattern. Therefore, for these sensitive uses of PRNG’s, it is common to let the generator switch between multiple algorithms every so often. This makes it unworkable for outsiders to exploit such a generating system.

Combining a properly generated random seed with switching between many hard-to-crack algorithms therefore makes for a number or sequence that is nearly impossible to distinguish from a true random number. Thus, we can conclude that, even for such sensitive purposes, when enough computing power is available and when handled competently, pseudorandom number generation will just do the trick.

Dit artikel is geschreven door Pieter Dilg