Impactful Research Projects

There are many important open problems within applied computational statistics, many of which fall out of mainstream academic interests and are hence overlooked. Here I collect some of the research projects that I personally think would be particularly impactful in improving tools for practical applications. Unsurprisingly there is a strong focus on the principled implementation of geometric methodologies.

At the moment I am not able to supervise any of these projects but I am happy to answer occasional questions. Additionally I may be interested in collaboration.

Step Size Optimization for Dynamic Hamiltonian Monte Carlo

Early implementations of Hamiltonian Monte Carlo utilized trajectories with static integration times and considered only the final point in the trajectory. In Optimizing The Integrator Step Size for Hamiltonian Monte Carlo colleagues and I derived a general criteria for the optimal integrator step size in these algorithms. Modern implementations of Hamiltonian Monte Carlo, in particular that used in Stan, utilize dynamic integration times and consider all of the points within each trajectory. Scattered empirical results suggest that the step size optimization for these algorithms must be modified.

Can one generalize the optimality criterion for static implementations to dynamic implementations?

Geometric Ergodicity of Dynamic Hamiltonian Monte Carlo

In On the Geometric Ergodicity of Hamiltonian Monte Carlo colleagues and I identified important obstructions to the geometric ergodicity of static Hamiltonian Monte Carlo implementations. Empirical and theoretical evidence, however, indicates that many of these obstructions do not persist to dynamic implementations of Hamiltonian Monte Carlo.

Can one identify obstructions to geometric ergodicity for dynamic implementations?

Note that many of the mathematical techniques utilized in the literature rely on bounding a Markov chain by a diffusion with known properties. These bounds have limited utility for Hamiltonian Monte Carlo, however, which exploits non-diffusive Hamiltonian flow. Significant generalizations may require entirely new techniques, perhaps exploiting the topology of Hamiltonian sets and subsequent properties like Poincaré recurrence and Hamiltonian chaos. See for example “Hamiltonian Chaos and Fractional Dynamics” by Zaslavsky as well as some interesting work in Rapid Mixing of Hamiltonian Monte Carlo on Strongly Log-Concave Distributions.

Divergences for Implicit Symplectic Integrators

Divergences result when a symplectic integrator becomes unstable and numerical trajectories stray away from the true trajectories they are attempting to simulate. In practice they are powerful diagnostics of the failure of Hamiltonian Monte Carlo methods to adequately explore a given target distribution. For more see Section 5 and 6 of A Conceptual Introduction to Hamiltonian Monte Carlo.

Divergences are straightforward to identify using explicit symplectic integrators but they are more subtle for implicit symplectic integrators, in particular those required by Riemannian Hamiltonian Monte Carlo implementations. The problem is that each update of an implicit symplectic integrators requires a fixed point solution of an implicit function, and these fixed point solves may fail to converge. Indeed they are vulnerable to convergence failures in exactly the circumstances that encourage divergences, making it difficult to distinguish between a true divergence and a convergence failure.

Can one design a robust implementation of implicit symplectic integrators that doesn’t compromise the diagnostic power of divergences in Hamiltonian Monte Carlo?

Dynamic Tuning of Higher-Order Symplectic Integrators

Typical implementations of Hamiltonian Monte Carlo utilize the leapfrog integrator, which is a second-order symplectic integrator. Higher-order symplectic integrators are more expensive but offer higher numerical accuracy that has the potential to offer higher overall performance within Hamiltonian Monte Carlo algorithms. The problem is that higher-order symplectic integrators have multiple configurations which introduce additional tuning parameters. Unfortunately the performance of resulting Hamiltonian Monte Carlo implementations can be extremely sensitive to these tunings and the optimal tuning is itself sensitive to the integrator step size, further complicating matters. See Adaptive multi-stage integrators for optimal energy conservation in molecular simulations and Multi-stage splitting integrators for sampling with modified Hamiltonian Monte Carlo methods for some relevant discussion.

Can one design a comprehensive tuning method for implementations of Hamiltonian Monte Carlo that utilize higher-order symplectic integrators that covers the integrator configuration and step size?

Implementing Adiabatic Monte Carlo

In Adiabatic Monte Carlo I introduced a generalization of Hamiltonian Monte Carlo capable of targeting multimodal distributions. Robust implementations of this method share much with robust implementations of Hamiltonian Monte Carlo methods, but there are a few key differences. The most problematic of these is that Hamiltonian Monte Carlo targets a single distribution while Adiabatic Monte Carlo targets an smooth interpolation between two distributions. This difference means that the techniques used to compensate for the numerical error introduced by simulations in Hamiltonian Monte Carlo are not applicable to Adiabatic Monte Carlo.

Can one design an exact algorithmic correction to the numerical error introduced when simulating adiabatic flows in Adiabatic Monte Carlo?

Generalizing Hamiltonian Monte Carlo to Novel Topological Spaces

Hamiltonian Monte Carlo is applicable to any smooth probability distribution defined on a smooth manifold, for example those that admit probability density functions with well-defined gradients. The methodology, however, is a special case of a more general approach that may be applicable to more general spaces.

This more general approach focuses on building flows that preserve the given target distribution from group actions. In particular these flows arise immediately from the orbits of any group whose Haar measure equals or marginalizes to the target distribution. For example, Hamiltonian Monte Carlo can be thought of as a method that constructs a symplectomorphism group that acts on the cotangent bundle of the target space. The symplectomorphic orbits trace out the Hamiltonian trajectories that generate the efficient exploration of the target space. For more discussion see Section 4.3 of The Geometric Foundations of Hamiltonian Monte Carlo.

Taking this group theoretic perspective may help to identify similar methods amenable to non-smooth spaces. For example, might the groups arising in tropical algebras provide the foundation of methods that flow between the topological configurations in phylogenetic spaces?

Can one identify group constructions that provide the foundations for generalizing Hamiltonian Monte Carlo to novel spaces?

Principled Derivation of the Bregman Hamiltonian

A Variational Perspective on Accelerated Methods in Optimization introduced a dynamical system whose evolution reduces to the Nesterov accelerated gradient method under a particular discretization. That dynamical system is defined by a time-dependent Hamiltonian function they denote the Bregman Hamiltonian which considers a scaled version of the objective function as the potential energy and the Bregman divergence of an auxiliary function as the kinetic energy.

Following up on that work, On Symplectic Optimization showed that a formal symplectic discretization of the Bregman Hamiltonian yields stable numerical dynamics with the same qualitative behavior as the original discretization. This motivates a more foundational role for the Bregman Hamiltonian rather than just a means towards deriving Nesterov methods.

Is there a geoemtric argument for why the Bregman Hamiltonian define dynamics that converge to the optimium of the given objective function? Are there other Hamiltonians that yield similar behavior, and if so what properties define the dynamics which converge the fastest? What is the role of the auxiliary function used to define the Bregman divergence?

Convergence Analysis of Symplectic Optimization

A Variational Perspective on Accelerated Methods in Optimization also demonstrated that the exact Bregman dynamics enjoy strong convergence properties which carry over to the particular discretization they consider. The symplectic integration introduced in On Symplectic Optimization offers improved stability, and hence the opportunity for faster simulation of the Bregman dynamics, but the nature of symplectic integration makes it difficult to prove whether or not the resulting numerical dynamics converge at the same rate as the exact dynamics. Acceleration via Symplectic Discretization of High-Resolution Differential Equations makes progress for first-order symplectic integrators, but the techniques they employ do not carry over to higher-order symplectic integrators.

Can one prove that higher-order symplectic integration preserves the convergence rates of the exact Bregman dynamics?

The problem is that most techniques used in the convergence analysis of optimization methods aim to bound the convergence of the method at each iteration. Higher-order symplectic integratators, however, achieve their performance by weaving together updates with large errors that carefully cancel to achieve stability over longer time scales. Conseqeuntly the error in the dynamics, and hence convergence of the numerical dynamics, oscillates and obstructs monotonic bounds.

My intuition is that results for general symplectic integrators will require global analysis of the numerical dynamics, likely exploiting backwards error analysis and the modified, or shadow, Hamiltonians of a given symplectic integrator. For the typical Lipschitz conditions assumed in these analyses one should be able to show that the difference between the true Hamiltonian and the modified Hamiltonian can be bounded and then use that to bound the convergence of the numerical dynamics relative to the exact dynamics.

Michael Betancourt