Level 2: Pseudorandom number generators

Welcome to dungeon Level 2! Let's take a look at different pseudorandom number generators in order to grasp the basic concepts. There are many different families and people keep improving the algorithms, but we can look at two famous types: linear congruential generators (LCG) and linear-feedback shift registers (LFSR).

Linear Congruential Generators (LCG)

LCGs are some of the first pseudorandom number generators that were devised. The idea behind them is to use properties of modulo arithmetic to produce seemingly random outputs. An LCG takes the following general form:

$$x_{i+1} = a x_{i} + b \mod m$$

In the above equation, $x_{i}$ is the $i$-th generated pseudorandom number, $a \in (0,m)$ is the multiplier, $b \in [0,m)$ is the increment and $m >= 1$ is the modulus (all integers). Hence, once we have $x_{i}$ we can directly calculate $x_{i+1}$ with the equation above. What about $x_{0}$? That value is called the seed and, as we can see, it completely determines the generated sequence of numbers.

Another thing to note is that the equation above can only produce numbers between $0$ and $m-1$, because of the modulo operator. Also, the values of $a$ and $b$ are relevant and interact with $m$ in certain ways. Hence, choosing a good value of the parameters is paramount.

Let's see some GML code that we can type to implement an LCG. We'll create a struct, but for now let's take a look at a method that actually generates the random numbers, given the parameters above:


self.irnd = function() {
	self.__value = 	self.__index == 0 ?
					self.__seed : 
					(self.__value * self.__multiplier + self.__increment) % self.__modulus;
	self.__index++;
	return self.__value;
}

As you can see, it's pretty straightforward to calculate it (one line of code), and this method will return a value between $0$ and __modulus. The __index variable is our $i$, and will be incremented on every call of the LCG. To be able to use this generator, we need to initialize the struct with the appropriate values (seed, multiplier, increment and modulus), and then call the irnd method. We can see the full code below (I omitted non-critical stuff for brevity):


function manta_lcg_random(_seed = 0, _modulus = power(2,32), _multiplier = 1664525, _increment = 1013904223) constructor {
	
	// Omitted additional instance variables and constructor validation	
	self.__modulus = _modulus;
	self.__multiplier = _multiplier;
	self.__increment = _increment;
	
	self.irnd = function() {
		self.__value = self.__index == 0 ? self.__seed : (self.__value * self.__multiplier + self.__increment) % self.__modulus;
		self.__index++;
		return self.__value;
	}
	
	self.rnd = function() {
		return self.irnd()/self.__modulus;
	}
	
	self.initialize = function(_seed = date_current_datetime()) {		
		// Omitted validations
		self.__seed = _seed;
		self.__index = 0;
		self.__value = 0;
	}
	
	self.initialize(_seed);
	
}

Let's actually create an instance of this struct and generate some values:

self.rng = new manta_lcg_random(0, 8, 5, 1);
var _str = "";
for (var _i=0; _i<20; _i++) {
	_str += string(self.rng.irnd())+" ";
}
show_debug_message(_str);

In this first example we are initalizing the LCG with seed $x_{0}=0$ and setting the LCG parameters to $m=8, a=5, b=6$. Let's check our output:

0 1 6 7 4 5 2 3 0 1 6 7 4 5 2 3 0 1 6 7

We can see that our LCG is generating all numbers between 0 and 7, starting with 0 (the seed given) and that it's repeating itself after 7 digits. We know that the maximum possible number of different values is $m$, so in case we need more numbers (hint: we REALLY do!) we might be tempted to just use $m=20$ for example. Let's see:

self.rng = new manta_lcg_random(0, 20, 5, 1);
var _str = "";
for (var _i=0; _i<20; _i++) {
	_str += string(self.rng.irnd())+" ";
}
show_debug_message(_str);

0 1 6 11 16 1 6 11 16 1 6 11 16 1 6 11 16 1 6 11

Ugh. It actually got worse, only generating three different values after the seed. The fundamental concept here is the relationship between $a, b$ and $m$. We want a big $m$ (to have the maximum possible range) but we need to carefully choose $a$ and $b$. Fortunately, thanks to some modulo arithmetic magic theorem you can check here, one can determine the optimal values of the parameters of LCGs that can generate the maximum number of values. However, that does not mean that those parameters will yield a "good" PRNG according to the test batteries discussed before. One iconic example is RANDU, a LCG developed by IBM and used in the 1970s with parameters $a=65,539, b=0, m=2^{31}$. The sad story is this PRNG is maximal (i.e. generates all numbers between 0 and $2^{31}$, so great period) but fails very important randomness tests and thus it renders really bad results. You can try RANDU for yourself in GameMaker with our struct.

In general, LCG were very important and critical for influencing further progress, but are outdated nowadays and much better PRNGs have been developed in the last 30 years or so. We will now take a look at another family of PRNGs called linear-feedback shift registers.

Linear-Feedback Shift Registers (LFSR)

The basic idea behind LFSRs is to manipulate the (binary representation of) the previous number in the stream using different bitwise operations such as bit shifting and bitwise XOR, AND, OR. Bitwise operations are attractive because they are very fast. We use $>>$ to denote a bit shift to the right, $<<$ a bit shift to the left, $&$ to denote bitwise AND, $|$ to denote bitwise OR and $^$ to denote bitwise XOR. Finally, we can define a tap as the XOR of two offset positions of bits.

One example of an LFSR is the following, where we set a 4-bit period (16 possible numbers from 0 to 15) and perform the following operations:

Start with the previous calculated value;
Take the current value and tap it with its bit-shift to the right;
Obtain the last digit of the previous step;
Bit-shift the original value to the right and substitute the first digit with the one obtained on the previous step.

Again, we can use the bitwise operators in GameMaker and implement the calculation on a struct method (using an auxilliary __tap method):


self.__tap = function(_v, _k) {
	return _v ^ (_v >> _k);
}
	
self.irnd = function() {
	self.__value = 	self.__index == 0 ?
					self.__seed :
					(self.__value >> 1) | ((self.__tap(self.__value, 1) & 1) << 3);
	self.__index++;
	return self.__value;		
}

As we can see, we first compute the tap, then we bitwise-AND it with 1 (i.e. all that remains is the last digit) and then we bit-shift that last digit three places to the left (considering we are using a 4-bit number). Finally, we bitwise-OR the result with the right bit-shift of the original value. For example:

We start with the previous calculated value (let's say, 9 or 1001 in binary);
We take the current value and tap it with its bit-shift to the right. This is, we XOR 1001 with 0100 and get 1101.
We keep only the last digit of the result. We can do that by bitwise-ANDing it with 1 (i.e. 0001): 1101 & 0001 = 0001.
Finally we bit-shift the original value to the right (1001 >> 1 = 0100) and substitute the first digit with the one obtained on the previous step, by bitwise-ORing it with the value. Since we want to modify the first position, we first left bitwise-shift the digit of step 3 three times to the left (0001 << 3 = 1000) and then we do the bitwise-OR operation: 1000 | 0100 = 1100. This is our next random number!

Let's take a look at the full code:


function manta_lfsr_random(_seed=0) constructor {
	
	// Omitted additional instance variables and constructor validation
	self.__tap = function(_v, _k) {
		return _v ^ (_v >> _k);
	}	
	
	self.irnd = function() {
		self.__value = self.__index == 0 ? self.__seed : (self.__value >> 1) | ((self.__tap(self.__value, 1) & 1) << 3);
		self.__index++;
		return self.__value;		
	}
	
	self.rnd = function() {
		return self.irnd()/16;
	}
	
	self.initialize = function(_seed) {
		self.__seed = _seed;
		self.__index = 0;
		self.__value = 0;
	}
	
	self.initialize(_seed);
}

Now let's check out a run of this PRNG using seed 9 (which, again, is 1001 or (1 << 3) | 1):

self.rng = new manta_lfsr_random(9);
var _str = "";
for (var _i=0; _i<20; _i++) {
	_str += bin(self.rng.irnd(), 4)+" ";
}

1001 1100 0110 1011 0101 1010 1101 1110 1111 0111 0011 0001 1000 0100 0010 1001 1100 0110 1011 0101

9 12 6 11 5 10 13 14 15 7 3 1 8 4 2 9 12 6 11 5

We can see we get 15 different values (note it couldn't possibly output 0, otherwise it would be stuck at 0 forever) and then we get the repeat. This is the main mechanism by which LFSRs work. Again, this small example is obviously a very bad PRNG for actual use, but we can obviously build much more complex LFSRs with different taps and shifts that produce very good random numbers. Xorshift and its evolutions are a good example.

While LCGs have tremendous historical importance, LFSRs have been fundamental to develop more advanced pseudorandom number generation algorithms, including most modern ones.

WELL512a: GameMaker's random number generator

There are many other families of PRNGs. One very important one is the Mersenne Twister, a good LFSR-like PRNG created by M. Matsumoto and T. Nishimura in 1997, with a looong period of $2^{19937}-1$ (a Mersenne prime) and the default PRNG used in many languages and packages, including Python, PHP, Lisp, Excel, MATLAB and Ruby. It passes all of the Diehard/Dieharder tests and most of the TestU01 tests, and it is fast. However, it has come of age lately and exhibits some flaws on certain situations.

In 2006, F. Panneton, P. L'Ecuyer and M. Matsumoto created the WELL family of generators. WELL is a recursive acronym for Well-Equidistributed Long-period Linear and generates equally "good" random numbers, almost as fast and fixing many of the Mersenne Twister's flaws. WELL is a family of PRNGs of different periods (indicated by the digits next to the WELL acronym), and is actually a form of LFSR. You can read the technical paper here. The 512 version has a period of $2^{512}$, which is sufficiently long for nearly all applications, and certainly for game development. YoyoGames chose WELL512a as the built-in random generator in GameMaker, which is great news for us game developers, as it's a modern, robust PRNG.

Purely for theoretical purposes, I decided to take a shot at (re)implementing WELL512a in GameMaker, considering it actually doesn't support 512-bit numbers, so we have to work with 64-bit instead. The irnd should look something like this (please note, this is purely an exercise, you should not use this code), where __state is a 16-position array initialized to random numbers:


	self.irnd = function() {
		var a = self.__state[self.__index];
		var c = self.__state[(self.__index+13) & 15];
		var b = a ^ c ^ (a << 16) ^ (c << 15);
		c = self.__state[(self.__index + 9) & 15];
		c ^= (c >> 11);
		self.__state[self.__index] = b ^ c;
		a = self.__state[self.__index];
		var d = a ^ ( (a << 5) & 0xDA442D24 );
		self.__index = (self.__index + 15) & 15;
		a = self.__state[self.__index];
		self.__state[self.__index] = a ^ b ^ d ^ (a << 2) ^ (b << 18) ^ (c << 28);
		return self.__state[self.__index] + power(2,63); // Make it unsigned
	}

Fortunately, as we'll see in the next level of the dungeon, we don't actually have to implement it, we can just use the functions provided by GameMaker and that's it!

GameMaker uses WELL512a... except for the HTML5 target. In this case, GameMaker exports to Javascript, and Javascript uses Math.Random, which in turn tells the browser to generate a random number with its built-in pseudorandom generator. This means the PRNG used in HTML5 exports is browser-specific, although most modern browsers (Chrome, Node.js, Firefox, Safari and Microsoft Edge) use a robust LSFR called xorshift128+. Unfortunately, this means that, even with the same starting seed, you will not get the same sequence of pseudorandom numbers in your game when you test it on the HTML5 target than when you test it in desktop.

Additional notes on PRNGs

Let me add some remarks on pseudorandom generators. First, we started talking about generating numbers between 0 and 1, but all the examples above generated positive integers. Thankfully, we can just divide those numbers by the period of each PRNG to do that, since we know they generate numbers between 0 and the period (exclusive). If you take a look at some of the structs above, I created the rnd method, which does exactly this.

Secondly, note that we can reinitialize/reset/restart a PRNG by setting the seed to the same value we set it at the beginning. As we've seen, the sequence of numbers will repeat. In particular, this means two computers with the same PRNG and the same exact starting seed, will produce the same random outputs. If done correctly on a game, they will consequently produce the exact same game situation/scenario/random level. You've probably seen this applied in many games (Spelunky comes to mind fisrt), where we have "seeded runs", that let multiple players compete in defeating the same exact level, even though levels are procedurally generated (this is particularly notable in speedruns).

Also note that initializing PRNGs with two consecutive (or "close") values for the seed, will result in drastically different runs. This is a chaotic property of PRNGs and it is very much needed to have good randomization.

Thirdly, we don't want players to be presented with the same exact scenario every time they run our game. Hence, we need to start the seed to something different every time. One way of doing this is using the date and time of the system, to a sufficiently high precision (microseconds). We'll see how to do this in GameMaker in the next level.

Other ways of generating random numbers

PRNGs are not the only way of generating random numbers. Nowadays there is specialized hardware and online services that can measure entropy from physical processes (think temperatures, atmospheric pressure, radioactive decay etc.) and using those quantities as random numbers. The fact that these are nature-based phenomena makes them almost impossible to be tampered with, so they are effectively not predictable; so they are "much better" random numbers and thus they are used for cryptographic applications and other stuff. However, in most common scenarios of game development, it is highly unlikely you will ever need to resort to this kind of numbers, unless you are very serious about creating a heavily regulated real money betting game (see this for example).

For curious/stubborn people that have nothing better to do, here's a way of using nature-based random numbers from random.org's API into GameMaker, for example getting 20 realizations of a dice roll:


// Create event
response = http_get("https://www.random.org/integers/?num=20&min=1&max=6&col=1&base=10&format=plain&rnd=new");

// Async HTTP event
if (async_load[? "id"] == response) {
	if (async_load[? "http_status"] == 200) {
		show_debug_message(async_load[? "result"]);
	}
	else {
		show_debug_message(async_load[? "http_status"]);
	}
}

Phew. That was a long one. I guess you leveled up again! Now we're ready to descend again and start simulating actual stuff in GameMaker!