Gaussian samples – Part(2)

Background

In part 1 of this project, I’ve shown how to generate Gaussian samples using the common technique of inversion sampling:

The bad news: the Gaussian inverse CDF is not well-defined, so we have to approximate that function, and the simple Taylor series was used. However, this only samples accurately near the Gaussian mean, and under-samples more extreme values at both ends of the distribution.

Image for post

A: left-side area (uniformly sampled). Tᵢ: the iᵗʰ-degree Taylor series approximation of the inverse Gaussian CDF: √2 Erfinv(2A-1)

Therefore, in this part of the project, we will investigate a more “direct” sampling method that does not depend on the approximation of the Gaussian inverse CDF. This method is called the Box-Muller transform, after the two mathematicians who invented the method in 1958: the British George E. P. Box and the American Mervin E. Muller.

Gaussian

Left: George E. P. Box (1919–2013). Right: Mervin E. Muller (1928–2018)

How does the Box-Muller transform work?

For this project, my goal is to generate Gaussian samples in two dimensions i.e. generating samples whose x and y coordinates are independent standard normals (Gaussian with zero mean and standard deviation of 1). In part 1, I used the inverse Gaussian CDF to generate separately the x and y coordinates from their respective uniform samples (U₁ and U₂):

Gaussian

Generate x and y coordinates from uniform samples (U₁ and U₂) using inverse transform sampling

For the Box-Muller transform, I will also start with the same two uniform samples. However, I will transform these uniform samples into the x and y coordinates using much simpler formulas:

Gaussian

Generate x and y coordinates from uniform samples (U₁ and U₂) using Box-Muller transform

Despite the strong coupling between U₁ and U₂ in each of the two formulas above, the generated x and y coordinates, which are both standard Gaussians, are still surprisingly independent from each other! In the derivation of the Box-Muller transform that follows, I will demonstrate why this is indeed this case.

Derivation of Box-Muller transform

We know that for any two independent random variables x and y, the joint probability density f(x, y) is simply the product of the individual density functions: f(x) and f(y). Furthermore, Pythagorus theorem allows us to combine the x² and y² term in each of the Gaussian density function. This results in the -r²/2 term in the exponential of the joint distribution, where r is the distance from the origin to the 2-D Gaussian sample.

Gaussian

To make it simpler, we then define a variable s that is equal to r²/2. In other words, s is simply the half of squared distance from the origin of our Gaussian sample. Written this way, the joint PDF is simply the product of a constant (1/2π) and an exponential (e⁻ˢ).