Bayesian Analysis - DREAM#

The main Bayesian Sampler in RAT is a MATLAB implementation of the DREAM (DiffeRential Evolution Adaptive Metropolis) algorithm of Vrugt (2016), [1] modified to be compilable to C++ via MATLAB Coder.

As the name suggests, DREAM combines differential evolution with a variant on the Metropolis-Hastings optimisation algorithm.

The basic idea of the Metropolis-Hastings algorithm is that it randomly “walks” around the parameter space, by taking a “step” of a fixed distance and either accepting the step or rejecting it (i.e. going back to the previous step and starting again) based on the following process:

  1. Let \(x\) be the current point, \(x'\) be the proposed new step, and \(f\) the function that calculates chi-squared at any point.

  2. Calculate the ratio

    \[\alpha = f(x')/f(x).\]
  3. Generate a random number \(u\) between 0 and 1.

  4. If \(u \leq \alpha\), accept the candidate. Otherwise, reject it.

This means that while steps into points with better chi-squared are more likely, they are not guaranteed, and the algorithm is also able to step to points of worse fit (but this is less likely to be accepted). The sequence of steps is an example of a Markov chain. After a large number of steps we can infer various things about the shape of our probability distribution for each parameter. The algorithm can be adapted to, for example, adapt the size of the steps to shrink proportionally to the number of steps that have been accepted, with the assumption that the algorithm has found a minimum after many accepted steps.

DREAM modifies this idea and combines it with differential evolution. It creates multiple Markov chains (“walks”) in parallel, and the new proposed points are generated by using a differential evolution process between all the chains:

  1. Let \(v_i\) be the vector representing chain \(i\); \(v_i = (x^i_1, x^i_2, x^i_3,...)\) where \(x^i_j\) is the \(j\)’th point on chain \(i\).

  2. Let there be \(N\) chains, and say we are on iteration \(t\). Use differential evolution over all points on all chains to propose a new point for each chain: \(x^1_t, x^2_t, ..., x^N_t\).

  3. For each point, either accept or reject using the same criteria as in Metropolis-Hastings. If accepted, add it to its respective chain.

This means that over time, DREAM creates \(N\) parallel Markov chains that explore the parameter space, and each chain can ‘learn’ from better-performing chains via the differential evolution process. After some iterations, we start ‘cutting off’ the older end of the chain so that the new proposals are generated from a sample set of better points.

It also performs several auxilliary improvements such as subspace sampling (i.e. generating proposals from only a subset of the parameters at a time), and a correction algorithm for chains that are outliers. It has been found to outperform other adaptive MCMC approaches, and in some cases even outperforms classical optimisation algorithms at parameter estimation. [2]

Algorithm control parameters#

The following parameters in the Controls object are specific to DREAM:

  • nSamples: The number of samples in the initial population for each chain.

  • nChains: The number of Markov chains to use in the algorithm.

  • jumpProbability: A probability range for the size of jumps when performing subspace sampling. Larger values mean more random variation in jump sizes for new proposed values.

  • pUnitGamma: The ‘probability of unit jump’ rate. New proposals are scaled down by a factor \(\gamma = 2.38/2d\), where \(d\) is the current length of the chain. This means that as the chain gets longer and ‘better’, jumps to new points aren’t as far from the rest of the chain (as we are likely close to a minimum of the space). pUnitGamma is the probability that this will be ignored for an iteration and instead \(\gamma = 1\) for that iteration (hence the name: “p unit gamma”, the probability that gamma will be equal to the unit for an iteration). This allows chains to ‘jump’ out of their local area and more easily explore other parts of the parameter space.

  • boundHandling: How steps across the boundary of the space should be handled. Can be:

    • "off": allow steps to pass the boundary.

    • "reflect": if a step passes a boundary, reflect it back across the boundary.

    • "bound": if a step passes a boundary, set it equal to the nearest point on the boundary.

    • "fold": treat the boundary as periodic and ‘wrap the step around’ to the other side of the space.

  • adaptPCR: whether crossover probabilities for differential evolution should be adapted by the algorithm as it runs.