In this case study I introduce the very basics of modeling and inference with stochastic differential equations. To demonstrate the concepts I also provide examples of both prior and posterior inference in Stan.

1 Stochastic Differential Equations

Stochastic differential equations and stochastic processes are rich and sophisticated mathematical subjects, and in this section we will introduce only the main concepts. We will begin by framing ordinary differential equations as a way to generate deterministic evolution and then introduce stochastic differential equations as a way to generate probabilistic evolution. Finally we will discuss how that probabilistic evolution can be approximated in practice.

There are many references for further exploration although I have found that [1] is a particularly comprehensive resource.

1.1 Deterministic Differential Equations

A first-order, ordinary differential equation defines how a real-valued state \(x \in X \subseteq \mathbb{R}^{D}\) varies with respect to infinitesimal changes in some auxiliary, real-valued quantity \(t \in \mathbb{R}\), \[ \frac{ \mathrm{d} x }{ \mathrm{d} t } = f(x, t). \] Here I will consider this auxiliary quantity as time so that the differential equation defines the temporal evolution of some state.

Writing this first-order, ordinary differential equation in terms of differentials, \[ \mathrm{d} x = \mathrm{d} t \, f(x, t), \] allows us to integrate the differential equation from some initial state \(x_{0}\) at time \(t = 0\) to the configuration of the state at a later time \(T\), \[ \begin{align*} \mathrm{d} x &= \mathrm{d} t \, f(x, t) \\ \int_{x_{0}}^{x_{T}} \mathrm{d} x' &= \int_{0}^{T} \mathrm{d} t' \, f(x(t'), t') \\ x_{T} - x_{0} &= \int_{0}^{T} \mathrm{d} t' \, f(x(t'), t') \\ x_{T} &= x_{0} + \int_{0}^{T} \mathrm{d} t' \, f(x(t'), t') \end{align*} \] In other words the differential equation defines a deterministic map from some input state at time \(t = 0\) to some output state at time \(t = T\).

By separating the integration into smaller intervals we can also recover intermediate states. Consider for example the sequence of states \(x_{1}, \ldots, x_{N}\) at the times \(t_{1}, \ldots, t_{N}\) with \[ 0 = t_{0} \lt t_{1} \lt \ldots \lt t_{N} = T. \] Given the initial state \(x_{0}\) at \(t = t_{0} = 0\) we can derive the first state \(x_{1}\) by integrating to \(t_{1}\), \[ x_{1} = x_{0} + \int_{t_{0}}^{t_{1}} \mathrm{d} t' \, f(x(t'), t'). \] At that point we can derive the second state by integrating from \(t_{1}\) to \(t_{2}\), \[ x_{2} = x_{1} + \int_{t_{1}}^{t_{2}} \mathrm{d} t' \, f(x(t'), t'), \] and so on until we derive the final state from \[ x_{N} = x_{N - 1} + \int_{t_{N - 1}}^{T} \mathrm{d} t' \, f(x(t'), t'). \] Consequently the differential equation defines a deterministic map from the input state to any sequence of later states.

If we consider an infinite number of intermediate times then we can recover all of the intermediate states at the times \(t \in [0, T]\) while integrating from \(t = 0\) to \(t = T\). From this perspective the differential equation and initial state define a function from time to state configurations, \[ \begin{alignat*}{6} x :\; &[0, T]& &\rightarrow& \; &X& \\ &t& &\mapsto& &x(t) = x_{0} + \int_{0}^{t} \mathrm{d} t' \, f(x(t'), t')&, \end{alignat*} \] which we refer to as a trajectory.
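To make this concrete, here is a minimal sketch in R, the language used for the analyses later in this case study, of a forward Euler integrator that approximates a trajectory on a grid of times; the function name integrate_ode_euler, the example drift, and the step size are illustrative choices of mine and not part of any later analysis.

# Approximate the trajectory x(t) on a grid of times by accumulating
# forward Euler steps, x[n] = x[n - 1] + dt * f(x[n - 1], t[n - 1]).
integrate_ode_euler <- function(f, x0, ts) {
  xs <- numeric(length(ts))
  xs[1] <- x0
  for (n in 2:length(ts)) {
    dt <- ts[n] - ts[n - 1]
    xs[n] <- xs[n - 1] + dt * f(xs[n - 1], ts[n - 1])
  }
  xs
}

# An illustrative drift that relaxes the state towards 4.
f <- function(x, t) 0.75 * (4 - x)

ts <- seq(0, 10, by=0.01)
xs <- integrate_ode_euler(f, x0=-2, ts=ts)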




1.2 Stochastic Differential Equations

A first-order stochastic differential equation introduces a probabilistic element to the evolution of a system along some auxiliary quantity. In analogy to the deterministic differential equation that we considered above we can define a stochastic differential equation as \[ \mathrm{d} x = \mathrm{d} t \, f(x, t) + \mathrm{d} W_{t} \, g(x, t), \] where \(\mathrm{d} W_{t}\) in the new term denotes a Wiener process that defines an independent distribution of perturbing values at each time \(t\).

Technically there are two common ways to specify stochastic differential equations, depending on how that second probabilistic term is treated. The form used above, and which we will use throughout this study, is known as the Itô form of a stochastic differential equation; a common alternative is the Stratonovich form. Both forms can be used to implement the same probabilistic evolutions, but the precise mathematical details of those implementations are different.

The words "stochastic" and "probabilistic" derive from the same linguistic root, the former through Greek and the latter through Latin. They also refer to the same underlying mathematics. Although "stochastic differential equation" has become conventional in the academic literature these mathematical systems can just as well be referred to as "probabilistic differential equations". In fact "probabilistic" more directly describes how these systems transcend ordinary differential equations.

Although even defining the integration of these systems is complicated, it can be done with enough mathematical fortitude. What results are probabilistic variants of the deterministic evolution we derived from an ordinary differential equation.

Integrating a stochastic differential equation from the initial state \(x'\) at time \(t'\) to the time \(t\) yields not a single new state configuration but rather a probability distribution of state configurations represented by the conditional probability density function \[ \pi(x_{t} \mid x_{t'}) \equiv p(x, t; x', t'). \] Formally this conditional probability density function is implicitly defined by the Fokker-Planck partial differential equation, \[ \frac{ \partial}{\partial t} p(x, t; x', t') = - \frac{ \partial}{\partial x} \left( p(x, t; x', t') \cdot f(x, t) \right) + \frac{1}{2} \frac{ \partial^{2}}{\partial x^{2}} \left( p(x, t; x', t') \cdot g^{2}(x, t) \right) \] with initial value \[ p(x, t = t'; x', t') = \delta(x - x'). \]

Similarly integrating across multiple time intervals defines a joint probability distribution over the corresponding intermediate states, \[ \pi(x_{1}, \ldots, x_{N} \mid x_{0}) = \prod_{n = 1}^{N} \pi(x_{n} \mid x_{n - 1}). \] In the same way integrating over infinitesimal time intervals defines a probability distribution over the infinite-dimensional space of trajectories! These infinite-dimensional probability distributions are referred to as stochastic processes.

Equivalently we can consider the stochastic differential equation as first defining an infinite-dimensional probability distribution over trajectories which we can then marginalize to a finite-dimensional probability distribution over the states at a finite set of times. The similarity to gaussian processes here is no coincidence; gaussian processes are a special case of stochastic processes. Many gaussian processes have not only useful kernel representations but also useful stochastic differential equation representations [2].

In either case we never actually work with that infinite-dimensional model directly. Instead we only ever consider the marginalization to a finite number of states, \[ \pi(x_{1}, \ldots, x_{N} \mid x_{0}) = \prod_{n = 1}^{N} \pi(x_{n} \mid x_{n - 1}). \]

1.3 Numerical Integration of Stochastic Differential Equations

While the Fokker-Planck equation formally defines the conditional probability density functions that specify the probabilistic evolution of a stochastic differential equation, in practice we can rarely solve for that probabilistic evolution in closed form.

In order to incorporate a stochastic differential equation \[ \mathrm{d} x = \mathrm{d} t \, f(x, t) + \mathrm{d} W_{t} \, g(x, t), \] into a model we often have to consider tractable approximations to \(\pi(x_{1}, \ldots, x_{N} \mid x_{0})\). The most common approximation is the Euler-Maruyama approximation which models each conditional distribution as \[ \pi(x_{n} \mid x_{n - 1}) \approx \text{normal} \left( x_{n} \mid x_{n - 1} + f(x_{n - 1}, t_{n - 1}) \cdot (t_{n} - t_{n - 1}), g(x_{n - 1}, t_{n - 1}) \cdot \sqrt{ t_{n} - t_{n - 1} } \right). \] We can interpret this approximation as a first-order Euler approximation to the deterministic evolution, \[ \xi_{n} = x_{n - 1} + f(x_{n - 1}, t_{n - 1}) \cdot (t_{n} - t_{n - 1}), \] convolved with a distribution of perturbations, \[ \begin{align*} \delta_{n} &\sim \text{normal} \left(0, g(x_{n - 1}, t_{n - 1}) \cdot \sqrt{ t_{n} - t_{n - 1} } \right) \\ x_{n} &= \xi_{n} + \delta_{n}. \end{align*} \]

While not the most sophisticated method the Euler-Maruyama approximation is straightforward to implement in practice and can be reasonably accurate if the time differences \(t_{n} - t_{n - 1}\) are small enough. If the states of practical interest are too far apart to achieve a reasonable accuracy then the only way to utilize the Euler-Maruyama approximation is to introduce intermediate states to fill in the gaps.

In other words we have to model \[ \begin{align*} \pi(x_{1}, \ldots, x_{N} \mid x_{0}) &= \prod_{n = 1}^{N} \pi(x_{n} \mid x_{n - 1}) \\ &\approx \prod_{n = 1}^{N} \text{normal} \left( x_{n} \mid x_{n - 1} + f(x_{n - 1}, t_{n - 1}) \cdot (t_{n} - t_{n - 1}), g(x_{n - 1}, t_{n - 1}) \cdot \sqrt{ t_{n} - t_{n - 1} } \right) \end{align*} \] for large enough \(N\) such that each \(t_{n} - t_{n - 1}\) is small enough for the Euler-Maruyama approximation to be sufficiently accurate. Only afterwards can we select the subset of the \(x_{n}\) that we'll need to evaluate the rest of the model.
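To make the mechanics explicit, here is a small sketch in R of ancestral sampling with Euler-Maruyama steps over a fine grid followed by selecting the states of practical interest; the function and variable names, as well as the example drift and diffusion functions, are my own illustrative choices rather than code from the later sections.

# Simulate one realization of dx = dt * f(x, t) + dW * g(x, t) by applying
# Euler-Maruyama steps sequentially along a fine grid of times.
simulate_euler_maruyama <- function(f, g, x0, ts) {
  xs <- numeric(length(ts))
  xs[1] <- x0
  for (n in 2:length(ts)) {
    dt <- ts[n] - ts[n - 1]
    xs[n] <- rnorm(1, mean=xs[n - 1] + f(xs[n - 1], ts[n - 1]) * dt,
                      sd=g(xs[n - 1], ts[n - 1]) * sqrt(dt))
  }
  xs
}

f <- function(x, t) 0.75 * (4 - x)  # illustrative drift
g <- function(x, t) 1               # illustrative diffusion

# A fine grid is needed for the approximation to be reasonably accurate.
fine_ts <- seq(0, 10, by=0.01)
xs <- simulate_euler_maruyama(f, g, x0=-2, ts=fine_ts)

# Keep only every 100th state, i.e. the coarser times of practical interest.
xs_coarse <- xs[seq(1, length(fine_ts), by=100)]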

2 Implementing Stochastic Differential Equations

In this section we'll discuss how to incorporate stochastic differential equations into probabilistic models and infer which state configurations are consistent with observed data, both theoretically and in practice with Stan.

In preparation let's set up our R environment for running Stan,

library(rstan)
rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())
parallel:::setDefaultClusterOptions(setup_strategy = "sequential")

util <- new.env()
source('stan_utility.R', local=util)

set.seed(5584339)

and configure our graphical settings,

library(colormap)

c_light <- c("#DCBCBC")
c_light_highlight <- c("#C79999")
c_mid <- c("#B97C7C")
c_mid_highlight <- c("#A25050")
c_dark <- c("#8F2727")
c_dark_highlight <- c("#7C0000")

nom_colors <- c("#DCBCBC", "#C79999", "#B97C7C", "#A25050", "#8F2727", "#7C0000")

c_light_teal <- c("#6B8E8E")
c_mid_teal <- c("#487575")
c_dark_teal <- c("#1D4F4F")

par(family="CMU Serif", las=1, bty="l", cex.axis=1, cex.lab=1, cex.main=1,
    xaxs="i", yaxs="i", mar = c(5, 5, 3, 1))

2.1 The Ornstein–Uhlenbeck Process

To demonstrate the basic concepts of stochastic differential equations let's consider a mean-reverting Ornstein–Uhlenbeck process defined over the state space \(X = \mathbb{R}\) with the stochastic differential equation \[ \mathrm{d} x = \mathrm{d} t \, \gamma \, (\mu - x) + \mathrm{d} W_{t} \, \tau. \] Here the deterministic evolution defined by \[ \mathrm{d} x = \mathrm{d} t \, \gamma \, (\mu - x) \] alone eventually converges to \(\mu\) with \(\gamma\) determining the speed of that convergence. The probabilistic contribution introduces perturbations around that deterministic behavior that propagate along the evolution. Because of these perturbations almost all of the trajectories that can be realized from this process are continuous but not differentiable.

A huge benefit of the mean-reverting Ornstein–Uhlenbeck process is that the corresponding Fokker-Planck equation can be solved in closed form to give an exact description of the probabilistic evolution from a fixed initial point, \[ \pi(x_{t} \mid x_{t'}) = \text{normal} \left( x_{t} \mid \mu + (x_{t'} - \mu) e^{- \gamma \, (t - t')}, \frac{\tau}{\sqrt{2}} \sqrt{ \frac{ 1 - e^{- 2 \, \gamma \, (t - t')}}{\gamma} } \right), \] or equivalently \[ \pi(x_{t + \delta t} \mid x_{t}) = \text{normal} \left( x_{t + \delta t} \mid \mu + (x_{t} - \mu) e^{-\gamma \, \delta t}, \frac{\tau}{\sqrt{2}} \sqrt{ \frac{ 1 - e^{- 2 \, \gamma \, \delta t}}{\gamma} } \right). \] From this conditional probability density function between neighboring states we can then build up the probability density function for all states given the initial state.

The Euler-Maruyama approximation replaces this exact conditional density function with \[ \pi(x_{t + \delta t} \mid x_{t}) \approx \text{normal} \left( x_{t + \delta t} \mid x_{t} + \gamma \, (\mu - x_{t}) \, \delta t, \tau \, \sqrt{\delta t} \right). \] In this case the Euler-Maruyama approximation is given by the same normal family of probability density functions as the exact solution. Moreover the location and scale parameters of the approximation with the normal family are equivalent to the best linear approximations of the exact location and scale parameters, \[ \begin{align*} \mu + (x_{t} - \mu) e^{-\gamma \, \delta t} &\approx \mu + (x_{t} - \mu) \, (1 - \gamma \, \delta t) \\ &= \mu + x_{t} - \mu - \gamma \, (x_{t} - \mu) \, \delta t \\ &= x_{t} + \gamma \, (\mu - x_{t}) \, \delta t, \end{align*} \] and \[ \begin{align*} \frac{\tau}{\sqrt{2}} \sqrt{ \frac{ 1 - e^{-2 \gamma \, \delta t}}{\gamma} } &= \tau \, \sqrt{ \frac{ 1 - e^{-2 \gamma \, \delta t}}{2 \, \gamma} } \\ &\approx \tau \, \sqrt{ \frac{ 1 - 1 + 2 \gamma \, \delta t}{2 \, \gamma} } \\ &= \tau \, \sqrt{ \frac{ 2 \gamma \, \delta t}{2 \, \gamma} } \\ &= \tau \, \sqrt{ \delta t }. \end{align*} \] This coincidence, however, will not hold in general.
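As a quick numerical illustration, not part of the original analysis, we can compare the exact transition location and scale to their Euler-Maruyama counterparts for a small and a large time step; the helper functions ou_exact and ou_em, and the particular values, are assumptions of mine chosen to match the configuration used below.

# Exact Ornstein-Uhlenbeck transition location and scale over a step dt.
ou_exact <- function(x, dt, mu, gamma, tau) {
  decay <- exp(-gamma * dt)
  c(loc = mu + (x - mu) * decay,
    scale = tau / sqrt(2) * sqrt((1 - decay^2) / gamma))
}

# Euler-Maruyama approximation of the same transition.
ou_em <- function(x, dt, mu, gamma, tau) {
  c(loc = x + gamma * (mu - x) * dt, scale = tau * sqrt(dt))
}

ou_exact(x=-2, dt=0.1, mu=4, gamma=0.75, tau=1)  # close to the approximation
ou_em(x=-2, dt=0.1, mu=4, gamma=0.75, tau=1)

ou_exact(x=-2, dt=2, mu=4, gamma=0.75, tau=1)    # no longer close
ou_em(x=-2, dt=2, mu=4, gamma=0.75, tau=1)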

2.2 Simulating Probabilistic Evolution

In order to simulate the probabilistic evolution defined by a stochastic differential equation we have to sample from the corresponding joint distribution of states. Here let's consider states on a relatively fine partition of times between \(t = 0\) and \(t = 10\).

N <- 100

t <- 10 * (0:N) / N

For convenience I've explicitly included the initial state so that there are \(N + 1\) total times.

Fixing the initial condition \(x_{0}\) allows us to evolve to the \(N\) states that follow in time. If we also fix the Ornstein-Uhlenbeck parameters \(\mu\), \(\gamma\), and \(\tau\) then we can use the conditional density function derived from the Fokker-Planck equation to construct a joint density function for all of the following states, \[ \begin{align*} \pi(x_{1}, \ldots, x_{N} \mid x_{0}, \mu, \gamma, \tau) &= \prod_{n = 1}^{N} \pi(x_{n} \mid x_{n - 1}, \mu, \gamma, \tau) \\ &= \prod_{n = 1}^{N} \text{normal} \left( x_{n} \mid \mu + (x_{n - 1} - \mu) e^{-\gamma \, (t_{n} - t_{n - 1})}, \frac{\tau}{\sqrt{2}} \sqrt{ \frac{ 1 - e^{- 2 \gamma \, (t_{n} - t_{n - 1})}}{\gamma} } \right). \end{align*} \]

Conveniently this joint density function is straightforward to implement in Stan.

writeLines(readLines("stan_programs/ou_analytic.stan"))
data {
  int<lower=1> N;
  real t[N + 1];
  
  real x0;

  real mu;
  real<lower=0> gamma;
  real<lower=0> tau;
}

parameters {
  real x_free[N];
}

transformed parameters {
  real x[N + 1] = append_array(rep_array(x0, 1), x_free);
}

model {
  for (n in 1:N) {
    real decay = exp(-gamma * (t[n + 1] - t[n]));
    x[n + 1] ~ normal(mu + (x[n] - mu) * decay, 0.7071 * tau * sqrt( (1 - decay * decay) / gamma ) );
  }
}

Given the Stan program we can use Markov chain Monte Carlo to simulate realized trajectories evolving from \(x_{0}\).

x0 <- -2

mu <- 4
gamma <- 0.75
tau <- 1

simu_data <- list(N=N, t=t, x0=x0, mu=mu, gamma=gamma, tau=tau)
fit <- stan(file='stan_programs/ou_analytic.stan', data=simu_data,
            seed=494838, refresh=1000)
DIAGNOSTIC(S) FROM PARSER:
Info:
Left-hand side of sampling statement (~) may contain a non-linear transform of a parameter or local variable.
If it does, you need to include a target += statement with the log absolute determinant of the Jacobian of the transform.
Left-hand-side of sampling statement:
    x[n + 1] ~ normal(...)

The warning here arises because we're using the concatenated array x, rather than the declared parameters x_free, to build up the joint density function. Because that concatenation just appends the fixed initial condition to x_free no Jacobian correction is actually needed, and the warning is a false positive.

util$check_all_diagnostics(fit)
[1] "n_eff for parameter x[1] is NaN!"
[1] "  n_eff / iter below 0.001 indicates that the effective sample size has likely been overestimated"
[1] "Rhat for parameter x[1] is NaN!"
[1] "  Rhat above 1.1 indicates that the chains very likely have not mixed"
[1] "0 of 4000 iterations ended with a divergence (0%)"
[1] "0 of 4000 iterations saturated the maximum tree depth of 10 (0%)"
[1] "E-FMI indicated no pathological behavior"

Because x[1] here is just the fixed initial condition we can ignore the effective sample size and \(\hat{R}\) warnings and move on.

ou_anal_mcmc_samples <- extract(fit)

Because our times are so close together we might be tempted to interpolate between each sampled state configuration with a line in order to approximately visualize the trajectories supported by this Ornstein-Uhlenbeck process. Technically the Ornstein-Uhlenbeck process allocates zero probability to piecewise linear trajectories, including these linear interpolations, but this visualization device is still common despite not being formally correct. We just have to be careful to recognize that if we zoomed into any line segment then we would see more variation.

plot_realizations <- function(t, x, name, ylim=c(-3, 7)) {
  I <- length(x[,1])

  plot_idx <- seq(1, I, 80)
  J <- length(plot_idx)
  line_colors <- colormap(colormap=nom_colors, nshades=J)

  plot(1, type="n", xlab="t", ylab="x", main=name,
       xlim=c(t[1], t[length(t)]), ylim=ylim)
  for (j in 1:J)
    lines(t, x[plot_idx[j],], col=line_colors[j], lwd=2)
}

plot_marginal_quantiles <- function(t, x, name, ylim=c(-3, 7)) {
  probs = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)
  cred <- sapply(1:length(t), function(n) quantile(x[,n], probs=probs))

  plot(1, type="n", main=name,
       xlab="t", ylab="x", xlim=c(t[1], t[length(t)]), ylim=ylim)
  polygon(c(t, rev(t)), c(cred[1,], rev(cred[9,])),
          col = c_light, border = NA)
  polygon(c(t, rev(t)), c(cred[2,], rev(cred[8,])),
          col = c_light_highlight, border = NA)
  polygon(c(t, rev(t)), c(cred[3,], rev(cred[7,])),
          col = c_mid, border = NA)
  polygon(c(t, rev(t)), c(cred[4,], rev(cred[6,])),
          col = c_mid_highlight, border = NA)
  lines(t, cred[5,], col=c_dark, lwd=2)
}


par(mfrow=c(1, 2))

plot_realizations(simu_data$t, ou_anal_mcmc_samples$x, "Trajectory Realizations")
plot_marginal_quantiles(simu_data$t, ou_anal_mcmc_samples$x, "Marginal Quantiles")

While Markov chain Monte Carlo worked well here we can also exploit the conditional structure of the joint density function to generate exact samples with ancestral sampling, allowing us to use pure Monte Carlo.

writeLines(readLines("stan_programs/simu_ou_analytic.stan"))
data {
  int<lower=1> N;
  real t[N + 1];
  
  real x0;

  real mu;
  real<lower=0> gamma;
  real<lower=0> tau;
}

generated quantities {
  real x[N + 1];
  x[1] = x0;
  for (n in 1:N) {
    real decay = exp(-gamma * (t[n + 1] - t[n]));
    x[n + 1] = normal_rng(mu + (x[n] - mu) * decay, 0.7071 * tau * sqrt( (1 - decay * decay) / gamma ) );
  }
}
fit <- stan(file='stan_programs/simu_ou_analytic.stan', data=simu_data,
            warmup=0, iter=4000, chains=1, seed=494838,
            algorithm="Fixed_param", refresh=4000)

SAMPLING FOR MODEL 'simu_ou_analytic' NOW (CHAIN 1).
Chain 1: Iteration:    1 / 4000 [  0%]  (Sampling)
Chain 1: Iteration: 4000 / 4000 [100%]  (Sampling)
Chain 1: 
Chain 1:  Elapsed Time: 0 seconds (Warm-up)
Chain 1:                0.040727 seconds (Sampling)
Chain 1:                0.040727 seconds (Total)
Chain 1: 
ou_anal_mc_samples <- extract(fit)

While the resulting simulations are cheaper to generate they yield the same qualitative results as the Markov chain Monte Carlo sampling.

par(mfrow=c(1, 2))

plot_realizations(simu_data$t, ou_anal_mc_samples$x, "Trajectory Realizations")
plot_marginal_quantiles(simu_data$t, ou_anal_mc_samples$x, "Marginal Quantiles")

With the states so close in time we should also be able to approximate this probabilistic evolution with the Euler-Maruyama approximation. As with the analytic model we can implement the Euler-Maruyama joint density function in Stan and then use Markov chain Monte Carlo to generate samples.

writeLines(readLines("stan_programs/ou_em.stan"))
data {
  int<lower=1> N;
  real t[N + 1];
  
  real x0;

  real mu;
  real<lower=0> gamma;
  real<lower=0> tau;
}

parameters {
  real x_free[N];
}

transformed parameters {
  real x[N + 1] = append_array(rep_array(x0, 1), x_free);
}

model {
  for (n in 1:N) {
    real epsilon = t[n + 1] - t[n];
    x[n + 1] ~ normal(x[n] + gamma * (mu - x[n]) * epsilon, tau * sqrt(epsilon) );
  }
}
fit <- stan(file='stan_programs/ou_em.stan', data=simu_data,
            seed=494838, refresh=1000)
DIAGNOSTIC(S) FROM PARSER:
Info:
Left-hand side of sampling statement (~) may contain a non-linear transform of a parameter or local variable.
If it does, you need to include a target += statement with the log absolute determinant of the Jacobian of the transform.
Left-hand-side of sampling statement:
    x[n + 1] ~ normal(...)
util$check_all_diagnostics(fit)
[1] "n_eff for parameter x[1] is NaN!"
[1] "  n_eff / iter below 0.001 indicates that the effective sample size has likely been overestimated"
[1] "Rhat for parameter x[1] is NaN!"
[1] "  Rhat above 1.1 indicates that the chains very likely have not mixed"
[1] "0 of 4000 iterations ended with a divergence (0%)"
[1] "0 of 4000 iterations saturated the maximum tree depth of 10 (0%)"
[1] "E-FMI indicated no pathological behavior"

These warnings are safe to ignore here for the same reasons as above.

ou_em_mcmc_samples <- extract(fit)

par(mfrow=c(1, 2))

plot_realizations(simu_data$t, ou_em_mcmc_samples$x, "Trajectory Realizations")
plot_marginal_quantiles(simu_data$t, ou_em_mcmc_samples$x, "Marginal Quantiles")

The iterative nature of the Euler-Maruyama approximation also naturally admits ancestral sampling.

writeLines(readLines("stan_programs/simu_ou_em.stan"))
data {
  int<lower=1> N;
  real t[N + 1];
  
  real x0;

  real mu;
  real<lower=0> gamma;
  real<lower=0> tau;
}

generated quantities {
  real x[N + 1];
  x[1] = x0;
  for (n in 1:N) {
    real epsilon = t[n + 1] - t[n];
    x[n + 1] = normal_rng(x[n] + gamma * (mu - x[n]) * epsilon, tau * sqrt(epsilon) );
  }
}
fit <- stan(file='stan_programs/simu_ou_em.stan', data=simu_data,
            warmup=0, iter=4000, chains=1, seed=494838,
            algorithm="Fixed_param", refresh=4000)

SAMPLING FOR MODEL 'simu_ou_em' NOW (CHAIN 1).
Chain 1: Iteration:    1 / 4000 [  0%]  (Sampling)
Chain 1: Iteration: 4000 / 4000 [100%]  (Sampling)
Chain 1: 
Chain 1:  Elapsed Time: 0 seconds (Warm-up)
Chain 1:                0.035633 seconds (Sampling)
Chain 1:                0.035633 seconds (Total)
Chain 1: 
ou_em_mc_samples <- extract(fit)
par(mfrow=c(1, 2))

plot_realizations(simu_data$t, ou_em_mc_samples$x, "Trajectory Realizations")
plot_marginal_quantiles(simu_data$t, ou_em_mc_samples$x, "Marginal Quantiles")

Because we have analytic results we can visually confirm that the Euler-Maruyama approximation is doing quite well for this time discretization, although in general validating the accuracy of the approximation is significantly more challenging.

par(mfrow=c(2, 2))

plot_marginal_quantiles(simu_data$t, ou_anal_mcmc_samples$x,
                        "Analytic Markov Chain Monte Carlo")
plot_marginal_quantiles(simu_data$t, ou_anal_mc_samples$x,
                        "Analytic Monte Carlo")
plot_marginal_quantiles(simu_data$t, ou_em_mcmc_samples$x,
                        "Euler-Maruyama Markov Chain Monte Carlo")
plot_marginal_quantiles(simu_data$t, ou_em_mc_samples$x,
                        "Euler-Maruyama Monte Carlo")

When we're interested in inferring the initial condition and the Ornstein-Uhlenbeck parameters we need to give them their own probabilistic structure. This results in a full latent model that covers all of the states and the stochastic process parameters, \[ \begin{align*} \pi(x_{0}, x_{1}, \ldots, x_{N}, \mu, \gamma, \tau) &= \pi(x_{1}, \ldots, x_{N} \mid x_{0}, \mu, \gamma, \tau) \, \pi(x_{0}) \, \pi(\mu, \gamma, \tau) \\ &= \left[ \prod_{n = 1}^{N} \pi(x_{n} \mid x_{n - 1}, \mu, \gamma, \tau) \right] \, \pi(x_{0}) \, \pi(\mu, \gamma, \tau). \end{align*} \]

With just a few modifications of our initial program we can implement this expanded model in Stan.

writeLines(readLines("stan_programs/simu_ou_analytic_full_latent.stan"))
data {
  int<lower=1> N;
  real t[N + 1];

}

generated quantities {
  real mu = normal_rng(0, 10);
  real<lower=0> gamma = fabs(normal_rng(0, 1));
  real<lower=0> tau = fabs(normal_rng(0, 1));
  
  real x0 = normal_rng(0, 3);
  
  real x[N + 1];
  x[1] = x0;
  for (n in 1:N) {
    real decay = exp(-gamma * (t[n + 1] - t[n]));
    x[n + 1] = normal_rng(mu + (x[n] - mu) * decay, 0.7071 * tau * sqrt( (1 - decay * decay) / gamma ) );
  }
}
fit <- stan(file='stan_programs/simu_ou_analytic_full_latent.stan', data=simu_data,
            warmup=0, iter=4000, chains=1, seed=494838,
            algorithm="Fixed_param", refresh=4000)

SAMPLING FOR MODEL 'simu_ou_analytic_full_latent' NOW (CHAIN 1).
Chain 1: Iteration:    1 / 4000 [  0%]  (Sampling)
Chain 1: Iteration: 4000 / 4000 [100%]  (Sampling)
Chain 1: 
Chain 1:  Elapsed Time: 0 seconds (Warm-up)
Chain 1:                0.037339 seconds (Sampling)
Chain 1:                0.037339 seconds (Total)
Chain 1: 
prior_samples <- extract(fit)

By incorporating an entire space of initial conditions and Ornstein-Uhlenbeck configurations we drastically increase the diversity of trajectory behaviors that we can integrate into inferential models.

par(mfrow=c(1, 2))

plot_realizations(simu_data$t, prior_samples$x, "Trajectory Realizations", ylim=c(-30, 30))
plot_marginal_quantiles(simu_data$t, prior_samples$x, "Marginal Quantiles", ylim=c(-30, 30))

2.3 Inferring Stochastic Processes

Speaking of inferential models let's consider observations made at different times, and hence generated from some of the latent states evolving according to our assumed stochastic differential equation, \[ \pi(y_{1}, \ldots, y_{K} | x_{n(1)}, \ldots x_{n(K)}). \] In this context we can ask which configurations of the latent stochastic process are compatible with realized observations \(\tilde{y}_{1}, \ldots, \tilde{y}_{K}\).

To quantify this consistency with Bayesian inference we first construct the full Bayesian model, \[ \begin{align*} \pi(y_{1}, \ldots, y_{K}, x_{0}, x_{1}, \ldots, x_{N}, \mu, \tau, \gamma) &= \;\; \pi(y_{1}, \ldots, y_{K} | x_{n(1)}, \ldots x_{n(K)}) \\ &\quad \cdot \pi(x_{0}, x_{1}, \ldots, x_{N}, \mu, \tau, \gamma), \end{align*} \] and then let Stan condition on the observed data and generate samples from the corresponding posterior distribution, \[ \pi(x_{0}, x_{1}, \ldots, x_{N}, \mu, \gamma, \tau \mid \tilde{y}_{1}, \ldots, \tilde{y}_{K}). \]

Even if we are interested in only the Ornstein-Uhlenbeck parameters \(\mu\), \(\gamma\), and \(\tau\) we first have to construct the full posterior distribution and then marginalize, \[ \begin{align*} \pi(\mu, \gamma, \tau \mid \tilde{y}_{1}, \ldots, \tilde{y}_{K}) &= \int \mathrm{d} x_{0} \prod_{n = 1}^{N} \mathrm{d} x_{n} \, \pi(x_{0}, x_{1}, \ldots, x_{N}, \mu, \gamma, \tau \mid \tilde{y}_{1}, \ldots, \tilde{y}_{K}) \\ &= \int \mathrm{d} x_{0} \prod_{k = 1}^{K} \mathrm{d} x_{n(k)} \, \pi(x_{0}, x_{n(1)}, \ldots, x_{n(K)}, \mu, \gamma, \tau \mid \tilde{y}_{1}, \ldots, \tilde{y}_{K}). \end{align*} \]

Whenever we condition on observed data only the state configurations \(x_{1}, \ldots, x_{N}\) realized from the posterior stochastic process, not the prior stochastic process, are relevant. In particular the simulations we generated from \(\pi(x_{0}, x_{1}, \ldots, x_{N}, \mu, \gamma, \tau)\) above will not be helpful in quantifying our posterior inferences.

To demonstrate proper posterior inference let's bring in some simulated data, along with the true realization from which the data were simulated for comparison.

data <- read_rdump("data/obs.data.R")
truth <- read_rdump("data/truth.data.R")

par(mfrow=c(1, 1))

plot(data$t, truth$x, type="l", lwd=2, main="", col="black",
     xlab="t", ylab="x", xlim=c(0, 10), ylim=c(-8, 10))

points(data$t_obs, data$y_obs, pch=16, cex=1.00, col="white")
points(data$t_obs, data$y_obs, pch=16, cex=0.75, col=c_mid_teal)

Here we will assume that the observations are given by independent normal density functions, \[ \pi(y_{1}, \ldots, y_{K} | x_{n(1)}, \ldots x_{n(K)}) = \prod_{k = 1}^{K} \text{normal}(y_{k} \mid x_{n(k)}, \sigma). \] Because the measurement variability \(\sigma\) is often difficult to distinguish from the Ornstein-Uhlenbeck variability \(\tau\) I will fix \(\sigma\) to its true value to simplify the demonstration here. The potential degeneracies of common stochastic processes are their own long discussion.

With the exact probabilistic evolution model we can construct the full Bayesian model in Stan using only the states at the observed times. This is useful if we are interested in only the Ornstein-Uhlenbeck parameters and not the underlying trajectories.

writeLines(readLines("stan_programs/fit_ou_analytic_obs_t.stan"))
data {
  real t0;
  
  int<lower=1> N_obs;
  real t_obs[N_obs];
  real y_obs[N_obs];
  real<lower=0> sigma;
}

parameters {
  real mu;
  real<lower=0> gamma;
  real<lower=0> tau;

  real x0;
  real x[N_obs];
}

model {
  real decay;
  
  mu ~ normal(0, 10);
  gamma ~ normal(0, 1);
  tau ~ normal(0, 1);
  
  x0 ~ normal(0, 3);
  
  decay = exp(-gamma * (t_obs[1] - t0));
  x[1] ~ normal(mu + (x0 - mu) * decay, 0.7071 * tau * sqrt( (1 - decay * decay) / gamma ) );
  
  for (n in 2:N_obs) {
    decay = exp(-gamma * (t_obs[n] - t_obs[n - 1]));
    x[n] ~ normal(mu + (x[n - 1] - mu) * decay, 0.7071 * tau * sqrt( (1 - decay * decay) / gamma ) );
  }
  
  y_obs ~ normal(x, sigma);
}
data$t0 <- data$t[1]

fit <- stan(file='stan_programs/fit_ou_analytic_obs_t.stan',
            data=data, seed=5838299, refresh=1000)

util$check_all_diagnostics(fit)
[1] "n_eff / iter looks reasonable for all parameters"
[1] "Rhat looks reasonable for all parameters"
[1] "0 of 4000 iterations ended with a divergence (0%)"
[1] "0 of 4000 iterations saturated the maximum tree depth of 10 (0%)"
[1] "E-FMI indicated no pathological behavior"
posterior_samples <- extract(fit)

The marginal posterior distributions demonstrate that the posterior captures the true configuration of the underlying Ornstein-Uhlenbeck process well.

plot_marginal <- function(name, prior_samples, posterior_samples, display_xlims) {
  prior_values <- prior_samples[name][[1]]
  posterior_values <- posterior_samples[name][[1]]
  bin_lims <- range(c(prior_values, posterior_values))
  delta <- diff(range(posterior_values)) / 50
  breaks <- seq(bin_lims[1], bin_lims[2]+ delta, delta)

  y_max <- max(hist(prior_values, breaks=breaks, plot=F)$counts,
               hist(posterior_values, breaks=breaks, plot=F)$counts)

  hist(prior_values, breaks=breaks,
       main="", xlab=name, xlim=display_xlims,
       ylab="", ylim=c(0, 1.05 * y_max), yaxt='n',
       col=c_light, border=c_light_highlight)
  hist(posterior_values, breaks=breaks,
       col=c_dark, border=c_dark_highlight, add=T)
  abline(v=truth[name], col="white", lty=1, lwd=3)
  abline(v=truth[name], col="black", lty=1, lwd=2)
}

par(mfrow=c(2, 2))

plot_marginal("mu", prior_samples, posterior_samples, c(-20, 20))
plot_marginal("gamma", prior_samples, posterior_samples, c(0, 3))
plot_marginal("tau", prior_samples, posterior_samples, c(0, 4))
plot_marginal("x0", prior_samples, posterior_samples, c(-10, 10))

If we want to investigate the latent trajectories consistent with the observations, and not just the state behavior at the observed times, then we need to consider a fine temporal grid and the many state configurations along that grid.

writeLines(readLines("stan_programs/fit_ou_analytic_all_t.stan"))
data {
  int<lower=1> N;
  real t[N + 1];
  
  int<lower=1> N_obs;
  int<lower=1, upper=N + 1> obs_idx[N_obs];
  int<lower=1, upper=N + 1> unobs_idx[N - N_obs];
  real y_obs[N_obs];
  real<lower=0> sigma;
}

parameters {
  real mu;
  real<lower=0> gamma;
  real<lower=0> tau;

  real x0;  
  real x_cp[N_obs];
  real x_ncp[N - N_obs];
}

transformed parameters {
  real x[N + 1]; 
  
  x[1] = x0;
  x[obs_idx] = x_cp;
  
  for (n in 1:(N - N_obs)) {
    int m = unobs_idx[n];  
    real decay = exp(-gamma * (t[m] - t[m - 1]));
    x[m] = mu + (x[m - 1] - mu) * decay + 0.7071 * tau * sqrt( (1 - decay * decay) / gamma ) * x_ncp[n];
  }
}

model {
  mu ~ normal(0, 10);
  gamma ~ normal(0, 1);
  tau ~ normal(0, 1);
  
  x0 ~  normal(0, 3);
    
  for (n in 1:N_obs) {
    int m = obs_idx[n];
    real decay = exp(-gamma * (t[m] - t[m - 1]));
    x_cp[n] ~ normal(mu + (x[m - 1] - mu) * decay, 0.7071 * tau * sqrt( (1 - decay * decay) / gamma ) );
  }
  
  x_ncp ~ normal(0, 1);
  
  y_obs ~ normal(x[obs_idx], sigma);
}

As this case study is meant to be a brief discussion I won't walk through every detail of this more sophisticated Stan program. In summary the analytic probabilistic evolution model introduces many latent normal density functions which can be implemented with the centered and non-centered parameterizations discussed in my hierarchical modeling case study. The best geometry is given by centering only those states that are strongly informed by the realized likelihood function, which in this case are the states at the observed times.
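To spell out the distinction, here is a schematic sketch in R, not code from this case study, of the two parameterizations of one and the same conditional normal density.

# Centered parameterization: the state itself is the parameter.
loc <- 1
scale <- 0.1
x_cp <- rnorm(1, mean=loc, sd=scale)

# Non-centered parameterization: a standardized variable is the parameter and
# the state is reconstructed deterministically, which avoids funnel geometries
# when the likelihood function only weakly informs the state.
x_ncp <- rnorm(1, mean=0, sd=1)
x <- loc + scale * x_ncp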

# Indices of the non-initial states without corresponding observations.
data$unobs_idx <- setdiff(2:(data$N + 1), data$obs_idx)

fit <- stan(file='stan_programs/fit_ou_analytic_all_t.stan',
                  data=data, seed=5838299, refresh=1000)

util$check_all_diagnostics(fit)
[1] "n_eff / iter looks reasonable for all parameters"
[1] "Rhat looks reasonable for all parameters"
[1] "0 of 4000 iterations ended with a divergence (0%)"
[1] "0 of 4000 iterations saturated the maximum tree depth of 10 (0%)"
[1] "E-FMI indicated no pathological behavior"
anal_post_samples <- extract(fit)

With a normal density function connecting each latent state there are many opportunities for funnel degeneracies, but our careful parameterization seems to have been sufficient here. Again a full discussion of the degeneracies inherent to Ornstein-Uhlenbeck processes, let alone other common stochastic processes, is beyond the scope of this case study.

Without any indication of computational problems we can move on to examining our inferences. The marginal posterior distributions are consistent with the previous fit.

par(mfrow=c(2, 2))

plot_marginal("mu", prior_samples, anal_post_samples, c(-20, 20))
plot_marginal("gamma", prior_samples, anal_post_samples, c(0, 3))
plot_marginal("tau", prior_samples, anal_post_samples, c(0, 4))
plot_marginal("x0", prior_samples, anal_post_samples, c(-10, 10))

Now, however, we can visualize the posterior stochastic process using the technically-incorrect-but-practically-common linear interpolation heuristic.

par(mfrow=c(2, 2))

plot_realizations(data$t, prior_samples$x,
                  "Prior Realizations", ylim=c(-12, 12))

plot_realizations(data$t, anal_post_samples$x,
                  "Posterior Realizations", ylim=c(-12, 12))
points(data$t_obs, data$y_obs, pch=16, cex=1.00, col="white")
points(data$t_obs, data$y_obs, pch=16, cex=0.75, col=c_mid_teal)

plot_marginal_quantiles(data$t, prior_samples$x,
                        "Prior Quantiles", yli=c(-12, 12))

plot_marginal_quantiles(data$t, anal_post_samples$x,
                        "Posterior Quantiles", yli=c(-12, 12))
points(data$t_obs, data$y_obs, pch=16, cex=1.00, col="white")
points(data$t_obs, data$y_obs, pch=16, cex=0.75, col=c_mid_teal)

Because we have access to the actual realized trajectory we can also see that our inferred posterior distribution recovers that latent truth.

par(mfrow=c(1, 2))

plot_realizations(data$t,anal_post_samples$x,
                  "Posterior Realizations", ylim=c(-6, 11))
lines(data$t, truth$x, type="l", lwd=3, main="", col="white")
lines(data$t, truth$x, type="l", lwd=2, main="", col="black")
points(data$t_obs, data$y_obs, pch=16, cex=1.00, col="white")
points(data$t_obs, data$y_obs, pch=16, cex=0.75, col=c_mid_teal)

plot_marginal_quantiles(data$t, anal_post_samples$x,
                        "Posterior Quantiles", yli=c(-6, 11))
lines(data$t, truth$x, type="l", lwd=3, main="", col="white")
lines(data$t, truth$x, type="l", lwd=2, main="", col="black")
points(data$t_obs, data$y_obs, pch=16, cex=1.00, col="white")
points(data$t_obs, data$y_obs, pch=16, cex=0.75, col=c_mid_teal)

Finally let's consider how well we can infer these behaviors using the Euler-Maruyama approximation to which we're often limited in practice. In this case the fine temporal grid between the observation times isn't just a convenience but rather a requirement for reasonable accuracy.

writeLines(readLines("stan_programs/fit_ou_em_all_t.stan"))
data {
  int<lower=1> N;
  real t[N + 1];
  
  int<lower=1> N_obs;
  int<lower=1, upper=N + 1> obs_idx[N_obs];
  int<lower=1, upper=N + 1> unobs_idx[N - N_obs];
  real y_obs[N_obs];
  real<lower=0> sigma;
}

parameters {
  real mu;
  real<lower=0> gamma;
  real<lower=0> tau;
  
  real x0;
  real x_cp[N_obs];
  real x_ncp[N - N_obs];
}

transformed parameters {
  real x[N + 1]; 
  
  x[1] = x0;
  x[obs_idx] = x_cp;
  
  for (n in 1:(N - N_obs)) {
    int m = unobs_idx[n];  
    real epsilon = t[m] - t[m - 1];
    x[m] =  x[m - 1] + gamma * (mu - x[m - 1]) * epsilon 
          + tau * sqrt(epsilon) * x_ncp[n];
  }
}

model {
  mu ~ normal(0, 10);
  gamma ~ normal(0, 1);
  tau ~ normal(0, 1);
  
  x0 ~ normal(0, 3);
  
  for (n in 1:N_obs) {
    int m = obs_idx[n];
    real epsilon = t[m] - t[m - 1];
    x_cp[n] ~ normal(x[m - 1] + gamma * (mu - x[m - 1]) * epsilon, 
                     tau * sqrt(epsilon) );
  }
  
  x_ncp ~ normal(0, 1);

  y_obs ~ normal(x[obs_idx], sigma);
}

No matter the structure of the generating stochastic differential equation the Euler-Maruyama approximation always yields latent normal density functions, and the parameterization considerations from above will always be relevant.

fit <- stan(file='stan_programs/fit_ou_em_all_t.stan', data=data, seed=5838299, refresh=1000)

util$check_all_diagnostics(fit)
[1] "n_eff / iter looks reasonable for all parameters"
[1] "Rhat looks reasonable for all parameters"
[1] "0 of 4000 iterations ended with a divergence (0%)"
[1] "0 of 4000 iterations saturated the maximum tree depth of 10 (0%)"
[1] "E-FMI indicated no pathological behavior"
em_post_samples <- extract(fit)

The diagnostics look clean, allowing us to move on to investigation of the posterior distribution. Despite the approximate model the resulting marginal posterior distributions agree with those from the exact model.

par(mfrow=c(2, 2))

plot_marginal("mu", prior_samples, em_post_samples, c(-20, 20))
plot_marginal("gamma", prior_samples, em_post_samples, c(0, 3))
plot_marginal("tau", prior_samples, em_post_samples, c(0, 4))
plot_marginal("x0", prior_samples, em_post_samples, c(-10, 10))

Similarly we are able to recover the true trajectory realization.

par(mfrow=c(1, 2))

plot_realizations(data$t, em_post_samples$x, "Posterior Realizations", ylim=c(-6, 11))
lines(data$t, truth$x, type="l", lwd=3, main="", col="white")
lines(data$t, truth$x, type="l", lwd=2, main="", col="black")
points(data$t_obs, data$y_obs, pch=16, cex=1.00, col="white")
points(data$t_obs, data$y_obs, pch=16, cex=0.75, col=c_mid_teal)

plot_marginal_quantiles(data$t, em_post_samples$x, "Posterior Quantiles", ylim=c(-6, 11))
lines(data$t, truth$x, type="l", lwd=3, main="", col="white")
lines(data$t, truth$x, type="l", lwd=2, main="", col="black")
points(data$t_obs, data$y_obs, pch=16, cex=1.00, col="white")
points(data$t_obs, data$y_obs, pch=16, cex=0.75, col=c_mid_teal)

Visually comparing inferences from the exact and approximate models side by side we don't see any substantial differences, indicating that the Euler-Maruyama approximation is doing well.

par(mfrow=c(2, 2))

plot_realizations(data$t,anal_post_samples$x,
                  "Analytic Posterior Realizations", ylim=c(-6, 11))
plot_realizations(data$t, em_post_samples$x,
                  "Euler-Maruyama Posterior Realizations", ylim=c(-6, 11))

plot_marginal_quantiles(data$t, anal_post_samples$x,
                        "Analytic Posterior Quantiles", yli=c(-6, 11))
plot_marginal_quantiles(data$t, em_post_samples$x,
                        "Euler-Maruyama Posterior Quantiles", yli=c(-6, 11))

When we observe only a few states and aren't interested in inferring latent trajectories the fine temporal grid needed for the Euler-Maruyama approximation to be reasonably accurate can be an expensive burden. If we are interested in recovering the realized trajectory at high temporal resolution anyway, however, then the Euler-Maruyama approximation is quite natural.

That said, quantifying the accuracy of the Euler-Maruyama approximation, and how finely we need to discretize the time intervals between observed states, is a much more subtle and challenging problem when we can't compare to the exact model.
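One rough heuristic, sketched below under the assumptions that the original grid is equally spaced and that the observation times lie on that grid, is to refine the temporal discretization and check that the marginal posterior summaries stabilize; the refit and comparison here are my own addition and not part of the original analysis.

# Double the temporal resolution; original grid indices m map to 2 * m - 1.
refined <- data
refined$N <- 2 * data$N
refined$t <- seq(data$t[1], data$t[data$N + 1], length.out=refined$N + 1)
refined$obs_idx <- 2 * data$obs_idx - 1
refined$unobs_idx <- setdiff(2:(refined$N + 1), refined$obs_idx)

refit <- stan(file='stan_programs/fit_ou_em_all_t.stan', data=refined,
              seed=5838299, refresh=1000)
refined_post_samples <- extract(refit)

# Compare posterior means of the Ornstein-Uhlenbeck parameters across grids.
rbind(original = sapply(em_post_samples[c("mu", "gamma", "tau")], mean),
      refined  = sapply(refined_post_samples[c("mu", "gamma", "tau")], mean))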

3 Alternative Computational Methods

In this introduction I have considered only Monte Carlo and Markov chain Monte Carlo methods for quantifying the prior and posterior distributions induced by a stochastic differential equation model, but there are a variety of other computational methods that can be useful in certain circumstances.

For example if we are interested not in the full posterior distribution, \[ \pi(x_{0}, x_{1}, \ldots, x_{N}, \mu, \gamma, \tau \mid \tilde{y}_{1}, \ldots, \tilde{y}_{K}), \] but only the marginal posterior distribution, \[ \pi(\mu, \gamma, \tau \mid y_{1}, \ldots, y_{K}) = \int \prod_{n = 0}^{N} \mathrm{d} x_{n} \, \pi(x_{1}, \ldots, x_{N}, x_{0}, \mu, \gamma, \tau \mid \tilde{y}_{1}, \ldots, \tilde{y}_{K}), \] then we might consider methods that approximately quantify that marginal distribution directly, such as the Integrated Nested Laplace Approximation [2]. The accuracy of these methods, however, can depend strongly on the details of the stochastic differential equation model and the shape of the realized likelihood function.

Similarly in some cases we might not be able to analyze all of the observations at once but rather only sequentially, for example when observations are streamed in time and discarded once the next observation arrives. In this case our inferential challenge is quantifying not the posterior distribution conditioned on all of the observations at once but rather the sequence of posterior distributions that incorporate the observations one at a time, \[ \pi(x_{0}, x_{1}, \ldots, x_{N}, \mu, \gamma, \tau \mid \tilde{y}_{1}, \ldots, \tilde{y}_{k + 1}) \propto \pi(\tilde{y}_{k + 1} \mid x_{n(k + 1)}) \cdot \pi(x_{0}, x_{1}, \ldots, x_{N}, \mu, \gamma, \tau \mid \tilde{y}_{1}, \ldots, \tilde{y}_{k}). \] Approximately updating only certain posterior expectation values in this way is often referred to as smoothing or filtering.

Quantifying every sequential posterior distribution is typically more computationally demanding than quantifying the single full posterior distribution. When streaming inferences are needed, however, methods like Sequential Monte Carlo may yield sufficiently accurate sequential posterior quantifications [3].

4 Conclusion

Because of their probabilistic nature stochastic differential equations are straightforward to incorporate into probabilistic models, making them a useful modeling technique in many applications. At least in theory approximations like the Euler-Maruyama discretization also provide a general implementation strategy even when the stochastic differential equation can't be solved in closed form.

In practice, however, implementing these approximations well requires care. For these approximations to be reasonably accurate we need to model fine temporal discretizations. Identifying a sufficiently fine discretization is a challenge in and of itself, but in any case the resulting models will be high-dimensional. With prudent parameterizations computational tools like the dynamic Hamiltonian Monte Carlo sampler implemented in Stan are often able to fit these high-dimensional models, but the typical computational expense forces us to moderate our expectations for how generally practical stochastic differential equation models might actually be.

5 Acknowledgements

I thank Sam Power for helpful comments.

A very special thanks to everyone supporting me on Patreon: Aapeli Nevala, Abhinav Katoch, Adam Bartonicek, Adam Fleischhacker, Adam Golinski, Adriano Yoshino, Alan O'Donnell, Alessandro Varacca, alex , Alex ANDORRA, Alexander Bartik, Alexander Noll, Alexander Rosteck, Anders Valind, Andrea Serafino, Andrew Mascioli, Andrew Rouillard, Angus Williams, Antoine Soubret, Ara Winter, Arya, asif zubair, Austin Carter, Austin Rochford, Austin Rochford, Avraham Adler, bbbbb, Ben Matthews, Ben Swallow, Benoit Essiambre, Bertrand Wilden, Bo Schwartz Madsen, Brian Hartley, Bryan Yu, Brynjolfur Gauti Jónsson, Canaan Breiss, Cat Shark, Charles Naylor, Chase Dwelle, Chris Jones, Chris Zawora, Christopher Mehrvarzi, Chuck Carlson, Cole Monnahan, Colin Carroll, Colin McAuliffe, D, Damien Mannion, Damon Bayer, Dan Killian, dan mackinlay, Dan Muck, Dan W Joyce, Dan Weitzenfeld, Daniel Edward Marthaler, Daniel Hocking, Daniel Rowe, Darshan Pandit, David Burdelski, David Humeau, David Pascall, Derek G Miller, dilsher singh dhillon, Doug Rivers, Dylan Murphy, Ed Cashin, Ed Henry, Eduardo, Eric Novik, Erik Banek, Ero Carrera, Ethan Goan, Eugene, Evan Cater, Fabio Zottele, Federico Carrone, Felipe González, Fergus Chadwick, Finn Lindgren, Florian Wellmann, Francesco Corona, Geoff Rollins, George Ho, Granville Matheson, Greg Sutcliffe, Guido Biele, Hamed Bastan-Hagh, Haonan Zhu, Harrison Katz, Henri Wallen, Hugo Botha, Huib Meulenbelt, Hyunji Moon, Ian , Ignacio Vera, Ilaria Prosdocimi, Isaac S, J, J Michael Burgess, Jair Andrade, James Doss-Gollin, James McInerney, James Wade, JamesU, Janek Berger, Jason Martin, Jason Pekos, Jeff Dotson, Jeff Helzner, Jeffrey Arnold, Jeffrey Burnett, Jeffrey Erlich, Jessica Graves, Joel Kronander, Joel Ong, John Flournoy, Jon , Jonathan St-onge, Jonathon Vallejo, Joran Jongerling, Jordi Warmenhoven, Jose Pereira, Josh Weinstock, Joshua Duncan, Joshua Griffith, Joshua Mayer, Josué Mendoza, Justin Bois, Karim Naguib, Karim Osman, Kejia Shi, Konsta Happonen, Kádár András, Lane Harrison, lizzie , Luiz Carvalho, Luiz Pessoa, Marc Dotson, Marc Trunjer Kusk Nielsen, Marcel Lüthi, Marco Gorelli, Marek Kwiatkowski, Mario Becerra, Mark Donoghoe, Mark Worrall, Markus P., Martin Modrák, Matthew, Matthew Hawthorne, Matthew Kay, Matthew Quick, Matthew T Stern, Matthieu , Maurits van der Meer, Maxim Kesin, Melissa Wong, Merlin Noel Heidemanns, Michael Cao, Michael Colaresi, Michael DeWitt, Michael Dillon, Michael Lerner, Michael Tracy, Mick Cooney, Miguel de la Varga, Mike Lawrence, MisterMentat , Mutual Information, Márton Vaitkus, N , Name, Nathaniel Burbank, Nerf-Shepherd , Nicholas Cowie, Nicholas Erskine, Nicholas Ursa, Nick S, Nicolas Frisby, Noah Guzman, Octavio Medina, Ole Rogeberg, Oliver Crook, Olivier Ma, Omri Har Shemesh, Pablo León Villagrá, Patrick Boehnke, Pau Pereira Batlle, Paul Oreto, Peter Smits, Pieter van den Berg , Pintaius Pedilici, ptr, Ramiro Barrantes Reynolds, Ravin Kumar, Raúl Peralta Lozada, Reece Willoughby, Riccardo Fusaroli, Robert Frost, Robert Goldman, Robert kohn, Robert Mitchell V, Robin Taylor, Rong Lu, Ryan Grossman, Ryan McMinds, Rémi , S Hong, Scott Block, Scott Brown, Serena, Seth Axen, Shira, sid phani, Simon Dirmeier, Simon Duane, Simon Lilburn, Srivatsa Srinath, sssz, STAN, Stephanie Fitzgerald, Stephen Lienhard, Stephen Oates, Steve Bertolani, Stone Chen, Sus, Susan Holmes, Svilup, Sören Berg, Tao Ye, Teddy Groves, Teresa Ortiz, Thomas Kealy, Thomas Vladeck, Théo Galy-Fajou, Tiago Cabaço, Tim Howes, Tim Radtke, Tobychev, Tom Knowles, Tom McEwen, 
Tommy Naugle, Tony Wuersch, Tyrel Stokes, Utku Turk, Vincent Arel-Bundock, Virginia Fisher, vittorio trotta, Vladimir Markov, Wil Yegelwel, Will Dearden, Will Farr, Will Kurt, Will Tudor-Evans, yolhaj, yureq, Z, Zach A, Zad Rafi, Zhengchen Cai, and Zwelithini Tunyiswa.

References

[1] Pavliotis, G. A. (2014). Stochastic processes and applications, vol. 60. Springer, New York.

[2] Lindgren, F., Bolin, D. and Rue, H. (2021). The SPDE approach for Gaussian and non-Gaussian fields: 10 years and still running.

[3] Doucet, A., Freitas, N. and Gordon, N. (2001). Sequential Monte Carlo methods in practice. Springer.

License

A repository containing the material used in this case study is available on GitHub.

The code in this case study is copyrighted by Michael Betancourt and licensed under the new BSD (3-clause) license:

https://opensource.org/licenses/BSD-3-Clause

The text and figures in this case study are copyrighted by Michael Betancourt and licensed under the CC BY-NC 4.0 license:

https://creativecommons.org/licenses/by-nc/4.0/

Original Computing Environment

writeLines(readLines(file.path(Sys.getenv("HOME"), ".R/Makevars")))
CC=clang

CXXFLAGS=-O3 -mtune=native -march=native -Wno-unused-variable -Wno-unused-function -Wno-macro-redefined -Wno-unneeded-internal-declaration
CXX=clang++ -arch x86_64 -ftemplate-depth-256

CXX14FLAGS=-O3 -mtune=native -march=native -Wno-unused-variable -Wno-unused-function -Wno-macro-redefined -Wno-unneeded-internal-declaration -Wno-unknown-pragmas
CXX14=clang++ -arch x86_64 -ftemplate-depth-256
sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] colormap_0.1.4       rstan_2.19.3         ggplot2_3.3.1       
[4] StanHeaders_2.21.0-3

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.4.6       pillar_1.4.4       compiler_4.0.2     prettyunits_1.1.1 
 [5] tools_4.0.2        digest_0.6.25      pkgbuild_1.0.8     jsonlite_1.6.1    
 [9] evaluate_0.14      lifecycle_0.2.0    tibble_3.0.1       gtable_0.3.0      
[13] pkgconfig_2.0.3    rlang_0.4.6        cli_2.0.2          curl_4.3          
[17] parallel_4.0.2     yaml_2.2.1         xfun_0.16          loo_2.2.0         
[21] gridExtra_2.3      withr_2.2.0        stringr_1.4.0      dplyr_1.0.0       
[25] knitr_1.29         generics_0.0.2     vctrs_0.3.0        stats4_4.0.2      
[29] grid_4.0.2         tidyselect_1.1.0   glue_1.4.1         inline_0.3.15     
[33] R6_2.4.1           processx_3.4.2     fansi_0.4.1        rmarkdown_2.2     
[37] purrr_0.3.4        callr_3.4.3        magrittr_1.5       codetools_0.2-16  
[41] matrixStats_0.56.0 ps_1.3.3           scales_1.1.1       ellipsis_0.3.1    
[45] htmltools_0.4.0    assertthat_0.2.1   colorspace_1.4-1   V8_3.2.0          
[49] stringi_1.4.6      munsell_0.5.0      crayon_1.3.4