Ignacio Darago

Simulating random variables

Whenever we are dealing with random events, we would like to have some way to replicate experiments following certain probability distributions. For instance, flipping a (fair) coin is the experiment simulating a Bernoulli random variable of probability $0.5$. However, when we deal with more complicated random variables, it’s not immediately clear how to produce an experiment to simulate them.

Throughout the entire post, we assume that we have some way of generating a (uniform) random number $U$ in $[0,1]$. Our goal is to design experiments to simulate random variables $X$, starting from such a $U$. For example, instead of flipping a coin, we could simulate the same outcome by picking a random number $U$ in $[0,1]$ and saying that it’s heads if $U \leq 0.5$ and tails if $U > 0.5$.
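As a warm-up, here is what that coin flip might look like in code (a minimal sketch; the function name coin_flip is ours):

import numpy as np

def coin_flip():
    U = np.random.uniform(0, 1)  # uniform random number in [0, 1)
    return "heads" if U <= 0.5 else "tails"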

The “simplest” possible case is when we know the cumulative distribution function $F(x) = \mathrm{P}(X\leq x)$. In some cases, $F(x)$ is strictly increasing and continuous, so it has an inverse function $F^{-1}(p)$, which gives us the unique number $x$ such that $\mathrm{P}(X\leq x) = p$. If we have knowledge of such an inverse function (sometimes called the quantile function), we can produce a very simple “experiment”.

Inverse transform sampling

Consider a uniform random variable $U$ in $[0,1]$, and suppose that we know the inverse of the cumulative distribution function. Then simulating $X$ with the given cumulative distribution function $F$ is as simple as computing $F^{-1}(U)$.

Indeed, this is because $\mathrm{P}(X\leq x) = \mathrm{P}(F^{-1}(U)\leq x) = \mathrm{P}(U\leq F(x)) = F(x)$. Just notice that the second equality holds because $F$ is an increasing function, and so $a\leq b$ is equivalent to $F(a)\leq F(b)$.

The skeleton of such an implementation in Python is as follows:

import numpy as np

# Inverse of the cumulative distribution function of a random variable X we explicitly know
def F_inv(p):
    return  # explicit expression for F^{-1}(p)

def X():
    U = np.random.uniform(0,1) # Generates a uniform random number between 0 and 1
    return F_inv(U)

A simple example where we know the inverse of the cumulative distribution function is when $X$ is an exponential random variable. Recall that an exponential random variable $X\sim\mathrm{exp}(\lambda)$ is often used to model the time between events. Its probability density function is given by

$$f_X(x) = \begin{cases} \lambda e^{-\lambda x} &\text{ if } x>0 \\ 0 &\text{ otherwise,} \end{cases}$$

from which we can easily see that the cumulative distribution function is

$$F_X(x) = \int_{-\infty}^x f_X(t)\,\mathrm{d}t = \begin{cases} 1-e^{-\lambda x} &\text{ if } x>0 \\ 0 &\text{ otherwise,} \end{cases}$$

and the inverse of the cumulative distribution function is

$$F_X^{-1}(p) = -\frac{1}{\lambda}\ln(1-p).$$

Hence we can give a very simple implementation of such a random variable in Python, sketched below. After simulating $10000$ rounds of such an experiment, we obtain the histogram shown below (the actual density function is plotted in red).
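Here is one possible implementation, following the skeleton above (a sketch; we expose the rate $\lambda$ as a parameter, defaulting to $\lambda = 1$):

import numpy as np

def F_inv(p, lam=1.0):
    # Inverse CDF of the exponential distribution with rate lam
    return -np.log(1 - p) / lam

def X(lam=1.0):
    U = np.random.uniform(0, 1)  # uniform random number in [0, 1)
    return F_inv(U, lam)

samples = [X() for _ in range(10000)]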

Figure: Histogram of the simulated exponential random variable.

Rejection sampling

The idea of this method is simple. Suppose that we want to sample points uniformly from some region. Then we can sample uniformly from a box containing our region and keep (or, in scientific terms, accept) the sample if it lies in our region, or reject it if it does not.

For instance, if we want to sample points uniformly in a circle, we may sample points $(x,y)$ in the square $[-1,1]\times[-1,1]$ and accept them if $x^2+y^2 < 1$ or reject them otherwise.

import numpy as np

def sample_circle():
    while True:
        x, y = np.random.uniform(-1, 1, 2)  # uniform point in the square [-1,1] x [-1,1]
        if x**2 + y**2 < 1:  # accept only points inside the unit circle
            return [x, y]
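To generate many such points, we can simply call this function repeatedly (a usage sketch):

points = np.array([sample_circle() for _ in range(10000)])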

The following picture shows $10000$ points generated using this method:

Figure: The points in green are the ones we accepted and the points in red are the ones we rejected.

Suppose now that $X$ is a random variable in one dimension. If we plot the probability density function, we can play the same game as above, but with the region below the graph!

Indeed, if a point $(x,y)$ is uniformly distributed in the region below the graph of $f$, the probability that its first coordinate lands in a small interval around $x$ is proportional to the area of the vertical strip above that interval, which is approximately $f(x)\,\mathrm{d}x$. This tells us that the method of rejection sampling simulates $X$ from its probability density function $f(x)$.

Now the question is where to sample the points to begin with. We need some sort of envelope, that is, a region containing our graph from which we will take the points.

Figure: In red, the graph of the probability density function we want to simulate; in blue, the envelope.

This envelope will generally be the graph of some probability density function $g$ that we know how to simulate, up to some scaling factor $M$.

If we know how to simulate $Y$ with probability density function $g$ such that $Mg(x)\geq f(x)$ for all $x$, then we can simply:

1. Simulate $Y$ with density $g$.
2. Generate a uniform random number $U$ in $[0,1]$.
3. If $U < f(Y)/(Mg(Y))$, accept and return $Y$; otherwise, reject and start over.

The skeleton of the code will look something like this:

import numpy as np

def f(x):
    return # explicit expression

def g(x):
    return # explicit expression

def simulate_Y():
    return # explicit method to simulate random variable with pdf g

M = ...  # explicit constant such that M*g(x) >= f(x) for all x

def rejection_sampling():
    while True:
        U = np.random.uniform(0,1)
        Y = simulate_Y()
        if U < f(Y)/(M*g(Y)):  # accept with probability f(Y)/(M*g(Y))
            return Y

Rejection sampling for a Normal variable

As an example, let’s show how to simulate a normal distribution $X\sim N(0,1)$ with zero mean and standard deviation $1$.

The first observation is that, by symmetry, it’s enough to sample from the absolute value $\vert X\vert$. Indeed, if we know $\vert X\vert$, then we just generate a sign $S=\pm 1$ with probability $0.5$ and set $X = S\vert X\vert$.

The density function of $\vert X\vert$ is given by

$$f(x) = \frac{2}{\sqrt{2\pi}}\, e^{-x^2/2}, \qquad x\geq 0.$$

As an envelope, we can take the exponential distribution $g(x)=e^{-x}$ (which we know how to simulate by the previous example!).

The scaling factor $M$ is given by the maximum value of $\frac{f(x)}{g(x)} = \sqrt{2/\pi}\, e^{x-x^2/2}$, which occurs at $x=1$, so $M = \sqrt{2e/\pi}$. We get, in that case, that $$\frac{f(x)}{Mg(x)} = e^{-(x-1)^2/2}.$$

import numpy as np

def f(x):
    # Density of |X| for X ~ N(0,1)
    return 2*np.exp(-x**2/2)/np.sqrt(2*np.pi) if x >= 0 else 0

def g(x):
    # Density of the exponential envelope
    return np.exp(-x) if x >= 0 else 0

def simulate_Y():
    # Inverse transform sampling for the exponential distribution
    U = np.random.uniform(0, 1)
    return -np.log(1 - U)  # 1 - U avoids log(0), since uniform(0,1) can return 0

def rejection_sampling():
    while True:
        U = np.random.uniform(0, 1)
        Y = simulate_Y()
        if U <= np.exp(-(Y-1)**2 / 2):  # the simplified ratio f(Y)/(M*g(Y))
            S = 1 if np.random.uniform(0, 1) > 0.5 else -1  # random sign
            return S*Y
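As a quick sanity check (a sketch, assuming the functions above are in scope), we can verify that the sample mean and standard deviation come out close to $0$ and $1$:

samples = np.array([rejection_sampling() for _ in range(100000)])
print(samples.mean(), samples.std())  # should be close to 0 and 1, respectively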

Below is the histogram of $100000$ points generated using this algorithm, together with the actual density function in red.

Figure: Histogram of the simulated normal random variable, with the actual density in red.

In the next post, we will see some more sophisticated sampling techniques, such as Markov chain Monte Carlo methods, the Metropolis-Hastings algorithm, and Gibbs sampling.

You can find all the code involved in this post on my GitHub.