# Writing

Preprints of my research work are posted on the arXiv as much as possible. Highlights include a long but comprehensive introduction to statistical computing and Hamiltonian Monte Carlo targeted at applied researches, and a more theoretical treatment of the geometric foundations of Hamiltonian Monte Carlo.

I am currently developing a book covering important concepts in probabilistic modeling with Stan. The ultimate goal is a reasonably self-contained treatment demonstrating how to build, evaluate, and utilize probabilistic models that capture domain expertise in applied analyses; the most important prerequisites are a familiarity with calculus and linear algebra.

While there is still much work to be done the currently available chapter drafts are listed below. Although relatively mature these resources are still very much dynamic and improving as I find better organizations of the material and receive feedback from readers. Please don’t hesitate to send comments through email or pull requests on the case study GitHub repositories linked below.

### Part I: Probability Theory

#### Updated Chapters

Chapter 1: Measure and Probability on Finite Sets HTML PDF

Chapter 2: Mathematical Spaces HTML PDF

Chapter 3: Product Spaces HTML PDF

Chapter 4: Measure and Probability on General Spaces HTML PDF

Chapter 5: Expectation Values HTML PDF

Chapter 6: Probability Density Functions HTML PDF

Chapter 7: Transforming Probability Spaces HTML PDF

Chapter 8: Conditional Probability Theory HTML PDF

Chapter 9: Probability Theory on Product Spaces (coming soon!)

Chapter 10: Useful Probability Density Functions (coming soon!)

Chapter 11: Sampling and Monte Carlo (coming soon!)

#### Older Chapters

Probability Theory on Product Spaces

Sampling and Monte Carlo

Common Families of Probabilty Densities

Probabilistic Computation

Markov Chain Monte Carlo

- More Comprehensive Markov chain Monte Carlo examples (Very Rough Draft!)

- arXiv manuscript

- Markov chain Monte Carlo diagnostic code

Hamiltonian Monte Carlo (coming soon!)

- arXiv manuscript

### Part II: Modeling and Inference

Modeling and Inference

Generative Modeling

Prior Modeling

Introduction to Stan

Identifiability and Degeneracies

Principled Model Building Workflow

### Part III: Modeling Techniques

Hierarchical Modeling

Factor Modeling

Modeling Sparsity

Modeling Ordinal Outcomes

Survival Modeling

Modeling Consensus HTML PDF

Modeling Selection HTML PDF

Variate-Covariate Modeling

Modeling Functional Behavior I: Linearized Models

Modeling Functional Behavior II: General Linearized Models

Underdetermined Linearized Regression

The QR Decompostion for Linearized Regression

Modeling Functional Behavior III: Gaussian Processes

Modeling Functional Behavior IV: Stochastic Differential Equations

Modeling Functional Behavior V: The Brownian Bridge

### Case Studies

Inferring Customer Conversion HTML PDF

Inferring Gravity From Data

Modeling Tree Diameter Growth HTML PDF

Inferring Window Size In The Dark HTML PDF

Inferring and Deciding On Die Fairness HTML PDF

### Miscellaneous

Some Ruminations on Containment Prior Modeling

Incomplete Draft Introducing Mixture Modeling Techniques

Incomplete Discussion of Degeneracy In Mixture Modeling

Citation information for these drafts, as well as some other miscellaneous case studies, are listed below.

## General Taylor Models

Taylor approximation is a powerful and general strategy for modeling the behavior of a function within a local neighborhood of inputs. The utility of this strategy, however, can be limited when the output of the target function is constrained to some subset of values. In this case study we'll see how Taylor approximations can be combined with transformations from the constrained output space to an unconstrained output space and back to robustly model the local behavior of constrained functions.

View
(HTML)

betanalpha/knitr_case_studies/general_taylor_models/
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2022). General Taylor Models. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/general_taylor_models, commit 8c568575d8f4ca7700cf9b83faf73dc4a2089df2.*

## Taylor Regression Models

Linear functions of the covariates are ubiquitous in the regression modeling, although it's not clear if that ubiquity is due to any universal utility or just mathematical simplicity. In this case study I consider linear functions of input covariates as local approximations of more general functional behavior within a neighborhood of covariate configurations. The theory of Taylor approximations grounds these models in an explicit context that offers interpretability and guidance for how they, and the heuristics that often accompany them, can be robustly applied in practice.

View
(HTML)

betanalpha/knitr_case_studies/taylor_models/
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2022). Taylor Regression Models. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/taylor_models, commit f99e798124a3d3cd77fc5779c9e8833e94d61a12.*

## Bridge Over Troubled Processes

Brownian motion is a fundamental stochastic process over trajectories that expands around a given initial state. A Brownian bridge restricts this Brownian motion to trajectories that also converge to a given terminal state. In this case study we will briefly review the implementation of both Brownian motion and Brownian bridges in Stan.

View
(HTML)

betanalpha/knitr_case_studies/some_containment_prior_models/
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2022). Bridge Over Troubled Processes. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/brownian_bridge, commit 7436188815c142cafad97a5dd4843dfd3c967af9.*

## Some Ruminations on Containment Prior Modeling

In my prior modeling case study I advocate for the general use of soft containment prior models and suggest some basic families of probability density functions that are well-suited to this task. Here I discuss some more sophisticated containment prior models that may or may not be of use in some applications, and may or may not be an excuse to indulge in some integral calculations. After reviewing the tail probability conditions that specify a particular containment prior model from a given family I consider some containment prior models for unconstrained and positively-constrained, one-dimensional spaces.

View
(HTML)

betanalpha/knitr_case_studies/some_containment_prior_models/
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2022). Some Ruminations on Containment Prior Modeling. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/some_containment_prior_models, commit 0b667a51ac437694b76894acb88ea4c2ec361963.*

## Outwit, Outlast, Outmodel

Survival modeling is a broadly applicable technique for modeling events that are triggered by the persistent accumulation of some stimulus. In this case study I review the basic foundations of survival modeling from this particular perspective and discuss their implementation in Stan. Building upon those foundations I also discuss censored survival models and sequential survival models.

View
(HTML)

betanalpha/knitr_case_studies/survival_modeling/
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2022). Outwit, Outlast, Outmodel. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/survival_modeling, commit 0b667a51ac437694b76894acb88ea4c2ec361963.*

## (Co)variations On A Theme

In this case study we'll review the foundations of variate-covariate modeling and techniques for building and implementing these models in practice, demonstrating the resulting in methodology with some extensive examples. Throughout I will attempt to relate this modeling perspective to more conventional treatments of regression modeling.

View
(HTML)

betanalpha/knitr_case_studies/variate_covariate_modeling/
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2021). (Co)variations On A Theme. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/variate_covariate_modeling, commit e8148832c82cdaaa06b224e184dbe8ba81488753.*

## An Infinitesimal Introduction to Stochastic Differential Equation Modeling

In this case study I introduce the very basics of modeling and inference with stochastic differential equations. To demonstrate the concepts I also provide examples of both prior and posterior inference in Stan.

View
(HTML)

betanalpha/knitr_case_studies/stochastic_differential_equations/
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2021). An Infinitesimal Introduction to Stochastic Differential Equation Modeling. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/stochastic_differential_equations, commit 25275128b6436b21662fc1b204c8733f46c357d5.*

## Prior Modeling

In Bayesian inference the prior model provides a valuable opportunity to incorporate domain expertise into our inferences. Unfortunately this opportunity often becomes a contentious issue in many fields, and this potential value is lost in the debate. In this case study I will discuss the challenges of building prior models that capture meaningful domain expertise and some practical strategies for ameliorating those challenges as much as possible.

View
(HTML)

betanalpha/knitr_case_studies/prior_modeling/
(GitHub)

Dependences: `R, knitr`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2021). Prior Modeling. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/prior_modeling, commit 56606fa62e35f87bc88cec6892b4a4d3587f7029.*

## (What's the Probabilistic Story) Modeling Glory?

Generative modeling is often suggested as a useful approach for designing probabilistic models that capture the relevant structure of a given application. The specific details of this approach, however, is left vague enough to limit how useful it can actually be in practice. In this case study I present an explicit definition of generative modeling as a way to bridge implicit domain expertise and explicit probabilistic models, motivating a wealth of useful model critique and construction techniques.

View
(HTML)

betanalpha/knitr_case_studies/generative_modeling/
(GitHub)

Dependences: `R, knitr`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2021). (What’s the Probabilistic Story) Modeling Glory?. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/generative_modeling, commit 99e07946e73d4c45ea757c4cde7882495e6519a8.*

## Sparsity Blues

As measurements become more complex they often convolve meaningful phenomena with more extraneous phenomena. In order to limit the impact of these irrelevant phenomena on our inferences, and isolate the relevant phenomena, we need models that encourage sparse inferences. In this case study I review the basics of Bayesian inferential sparsity and some of the various strategies for building prior models that incorporate sparsity assumptions.

View
(HTML)

betanalpha/knitr_case_studies/modeling_sparsity/
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2021). Sparsity Blues. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/modeling_sparsity, commit e6bb565825cd3c2533a4b4fee6f1b68a5728956a.*

## Sampling

Probability theory teaches us that the only well-posed way to extract information from a probability distribution is through expectation values. Unfortunately computing expectation values from abstract probability distributions is no easy feat. In some cases, however, we can readily approximate expectation values by averaging functions of interest over special collections of points known as samples. In this case study we carefully define the concept of a sample from a probability distribution and show how those samples can be used to approximate expectation values through the Monte Carlo method.

View
(HTML)

betanalpha/knitr_case_studies/sampling/
(GitHub)

Dependences: `R, knitr`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2021). Sampling. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/sampling, commit eb27fbdb5063220f3d335024d2ae5e2bfdf70955.*

## Factor Modeling

Hierarchical models are a natural way to model heterogeneity across exchangeable contexts, but they become less appropriate when additional information discriminates those contexts. In particular any labels that categorize individual contexts into groups immediately obstructs the full exchangeability, and modeling heterogeneity consistently with these groupings becomes much more subtle. When the contexts are subject to multiple, overlapping categorizations the challenge becomes even more difficult. In this case study we investigate how to generalize exchangeability in the presence of factors that categorize individual contexts and then develop modeling techniques compatible with this generalization.

View
(HTML)

betanalpha/knitr_case_studies/factor_modeling/
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2021). Factor Modeling. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/factor_modeling, commit 6e4566309163ee79f8b7c907e2efce969a96bc54.*

## Hierarchical Modeling

Hierarchical modeling is a powerful technique for modeling heterogeneity and, consequently, it is becoming increasingly ubiquitous in contemporary applied statistics. Unfortunately that ubiquitous application has not brought with it an equivalently ubiquitous understanding for how awkward these models can be to fit in practice. In this case study we dive deep into hierarchical models, from their theoretical motivations to their inherent degeneracies and the strategies needed to ensure robust computation. We will learn not only how to use hierarchical models but also how to use them robustly.

View
(HTML)

betanalpha/knitr_case_studies/hierarchical_modeling/
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2020). Hierarchical Modeling. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/hierarchical_modeling, commit 27c1d260e9ceca710465dc3b02f59f59b729ca43.*

## Robust Gaussian Process Modeling

Gaussian processes are a powerful approach for modeling functional behavior, but as with any tool that power can be wielded responsibly only with deliberate care. In this case study I introduce the basic usage of Gaussian processes in modeling applications, and how to implement them in Stan, before considering some of their subtle but ubiquitous inferential pathologies and the techniques for addressing those pathologies with principled domain expertise.

View
(HTML)

betanalpha/knitr_case_studies/gaussian_processes/
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2020). Robust Gaussian Process Modeling. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/gaussian_processes, commit e10083abbcdb65c745f840ab9d2da58229fa9af3.*

## Identity Crisis

Under ideal conditions only a small neighborhood of model configurations will be consistent with both the observed data and the domain expertise we encode in our prior model, resulting in a posterior distribution that strongly concentrates along each parameter. This not only yields precise inferences and but also facilitates accurate estimation of those inferences. Under more realistic conditions, however, our measurements and domain expertise can be much less informative, allowing our posterior distribution to stretch across more expansive, complex neighborhoods of the model configuration space. These intricate uncertainties complicate not only the utility of our inferences but also our ability to quantify those inferences computationally. In this case study we will explore the theoretical concept of identifiability and its more geometric counterpart degeneracy that better generalizes to applied statistical practice. We will also discuss the principled ways in which we can identify and then compensate for degenerate inferences before demonstrating these strategies in a series of pedagogical examples.

View
(HTML)

betanalpha/knitr_case_studies/identifiability
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2020). Identity Crisis. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/identifiability, commit b71d65d44731ce90bbfc769f3cbc8355efac262f.*

## Towards A Principled Bayesian Workflow (RStan)

Given the posterior distribution derived from a probabilistic model and a particular observation, Bayesian inference is straightforward to implement: inferences, and any decisions based upon them, follow immediately in the form of posterior expectations. Building such a probabilistic model that is satisfactory in a given application, however, is a far more open-ended challenge. In order to ensure robust analyses we need a principled workflow that guides the development of a probabilistic model that is consistent with both our domain expertise and any observed data while also being amenable to accurate computation. In this case study I introduce a principled workflow for building and evaluating probabilistic models in Bayesian inference.

View
(HTML)

betanalpha/knitr_case_studies/principled_bayesian_workflow
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2020). Towards A Principled Bayesian Workflow (RStan). Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/principled_bayesian_workflow, commit aeab31509b8e37ff05b0828f87a3018b1799b401.*

## An Introduction to Stan

Stan is a comprehensive software ecosystem aimed at facilitating the application of Bayesian inference. It features an expressive probabilistic programming language for specifying sophisticated Bayesian models backed by extensive math and algorithm libraries to support automated computation. In this case study I present a thorough introduction to the Stan ecosystem with a particular focus on the modeling language. After a motivating introduction we will review the Stan ecosystem and the fundamentals of the Stan modeling language and the RStan interface. Finally I will demonstrate some more advanced features and debugging techniques in a series of exercises.

View
(HTML)

betanalpha/knitr_case_studies/stan_intro
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2020). An Introduction to Stan. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/stan_intro, commit b474ec1a5a79347f7c9634376c866fe3294d657a.*

## Probability Theory on Product Spaces

Once we are given a well-defined probability distribution probability theory tells us how we can manipulate it for our applied needs. What the theory does not tell us, however, is how to construct a useful probability distribution in the first place. The challenge of building probability distributions in high-dimensional spaces becomes much more manageable when the ambient spaces are product spaces constructed from lower-dimensional component spaces. This product structure serves as scaffolding on which we can build up a probability distribution piece by piece. In this case study I introduce probability theory on product spaces, with an emphasis on how this structure facilitates the construction of probability distributions in practice while also motivating convenient notation, vocabulary, and visualizations.

View
(HTML)

betanalpha/knitr_case_studies/probability_on_product_spaces
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2020). Probability Theory on Product Spaces. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/probability_on_product_spaces, commit dfb96237152414a4c8c1a5d6c8639da9915c2378.*

## Falling (In Love With Principled Modeling)

The concept of measurement and inference is often introduced through what are supposed to be simple experiments, such as inferring gravity from the time it takes objects to fall from various heights. The realizations of these experiments in practice, however, are often much more complex than their idealized designs imply, and any principled statistical analysis will have to go into much more detail than one might expect. At the same time these simple experiments can provide an elegant demonstration of many of they key concepts of a principled Bayesian workflow where an initial model based on the theoretical experimental design is continuously expanded to capture all of the important features exhibited by the realization of the experiment. In this case study I attempt to infer the local gravitational acceleration from falling ball measurements, along the way illustrating strategies for identifying model defects and motivating principled model development.

View
(HTML)

betanalpha/knitr_case_studies/falling
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2020). Falling (In Love With Principled Modeling). Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/falling, commit b474ec1a5a79347f7c9634376c866fe3294d657a.*

## Markov chain Monte Carlo

Markov chain Monte Carlo is one of our best tools in the desperate struggle against high-dimensional probabilistic computation, but its fragility makes it dangerous to wield without adequate training. In order to use the method responsibly, and ensure accurate quantification of the target distribution, practitioners need to know not just how it works under ideal circumstances but also how it breaks under less ideal, but potentially more common, circumstances. Most importantly a practitioner needs to be able to identify when the method breaks and they shouldn’t trust the results that it gives. In other words a responsible user of Markov chain Monte Carlo needs to know how to manage its risks. In this case study I introducing the critical concepts needed to understand how employ Markov chain Monte Carlo responsibly.

View
(HTML)

betanalpha/knitr_case_studies/markov_chain_monte_carlo
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2020). Markov chain Monte Carlo. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/markov_chain_monte_carlo, commit b474ec1a5a79347f7c9634376c866fe3294d657a.*

## Probabilistic Computation

In practice we define probability distributions implicitly through their computational consequences. More formally we define them algorithmically through methods that return expectation values. The sophisticated probability distributions that arise in applied analyses, however, do not often admit exact results and we have to focus instead on numerical algorithms which only _estimate_ the exact expectation values returned by a given probability distribution. Practical probabilistic computation considers the construction of algorithms with well-behaved and well-quantified errors that allow us to understand when we can trust the corrupted answers they give. In this case study I introduce the basics of probabilistic computation with a focus on the challenges that arise as we attempt to scale to problems in more than a few dimenions.

View
(HTML)

betanalpha/knitr_case_studies/probabilistic_computation
(GitHub)

Dependences: `R, knitr`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2019). Probabilistic Computation. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/probabilistic_computation, commit b474ec1a5a79347f7c9634376c866fe3294d657a.*

## Probabilistic Building Blocks

Probability theory defines how we can manipulate and utilize probability distributions, but it provides no guidance for which probability distributions we might want to utilize in a given application. Fortunately we can exploit conditional probability theory to build up high-dimensional probability distributions from many one-dimensional probability distributions about which we can more easily reason. With only a modest collection of probability distributions at their disposal a practitioner has to potential to construct a wide array of sophisticated probability distributions. A cache of interpretable, well-understood probability distributions is a critical piece of any practitioner’s statistical toolkit. In this case study I review common families of probability density functions which provide the foundational elements of this toolkit.

View
(HTML)

betanalpha/knitr_case_studies/probability_densities
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2019). Probabilistic Building Blocks. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/probability_densities, commit b474ec1a5a79347f7c9634376c866fe3294d657a.*

## Ordinal Regression

Regression models quantify statistical correlations by allowing latent effects to moderate the distribution of an observed outcome. Ordinal regression models the influence of a latent effect on an ordinal outcome consisting of discrete but ordered categories. In thise cases study I review the mathematical structure of ordinal regression models and their practical implementation in Stan. In the process I also derive a principled prior model that ensures robustness even when ordinal data are only weakly informative.

View
(HTML)

betanalpha/knitr_case_studies/ordinal_regression
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2019). Ordinal Regression. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/ordinal_regression, commit 1b602ef70a969ff873c04e26101bf44195ce4609.*

## Probabilistic Modeling and Statistical Inference

In this case study we’ll review the foundations of statistical models and statistical inference that advise principled decision making. We’ll place a particular emphasis on Bayesian inference, which utilizes probability theory and statistical modeling to encapsulate information from observed data and our own domain expertise.

View
(HTML)

betanalpha/knitr_case_studies/modeling_and_inference
(GitHub)

Dependences: `R, knitr`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2019). Probabilistic Modeling and Statistical Inference. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/modeling_and_inference, commit b474ec1a5a79347f7c9634376c866fe3294d657a.*

## Underdetermined Linear Regression

A linear regression is underdetermined when there are fewer observations than parameters. In this case the likelihood function does not concentrate on a compact neighborhood but rather an underdetermined hyperplane of degenerate model configurations. The corresponding posterior density has a surprisingly interesting geometry, even with weakly-informative prior densities that ensure a well-defined fit. In this short note I walk through the nature of this geometry and why underdetermined regressions are so hard to fit.

View
(HTML)

betanalpha/knitr_case_studies/underdetermined_linear_regression
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2018). Underdetermined Linear Regression. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/underdetermined_linear_regression, commit b474ec1a5a79347f7c9634376c866fe3294d657a.*

## Towards A Principled Bayesian Workflow (PyStan)

Given the posterior distribution derived from a probabilistic model and a particular observation, Bayesian inference is straightforward to implement: inferences, and any decisions based upon them, follow immediately in the form of posterior expectations. Building such a probabilistic model that is satisfactory in a given application, however, is a far more open-ended challenge. In order to ensure robust analyses we need a principled workflow that guides the development of a probabilistic model that is consistent with both our domain expertise and any observed data while also being amenable to accurate computation. In this case study I introduce a principled workflow for building and evaluating probabilistic models in Bayesian inference.

View
(HTML)

betanalpha/jupyter_case_studies/principled_bayesian_workflow
(GitHub)

Dependences: `Python, Jupyter, PyStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2018). Towards A Principled Bayesian Workflow (PyStan). Retrieved from https://github.com/betanalpha/jupyter_case_studies/tree/master/principled_bayesian_workflow, commit 2580fdeac38f77859b2d1c60f9c4a37237864e63.*

## Conditional Probability Theory (For Scientists and Engineers)

This case study will introduce a conceptual understanding of conditional probability theory and its applications. We’ll begin with a discussion of marginal probability distributions before introducing conditional probability distributions as their complement. Then we’ll examine how different conditional probability distributions can be related to each other through Bayes’ Theorem before considering how all of these objects manifest in probability mass function and probability density function representations. Finally we’ll review some of the important practical applications of the theory.

View
(HTML)

betanalpha/knitr_case_studies/conditional_probability_theory
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2018). Conditional Probability Theory (For Scientists and Engineers). Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/conditional_probability_theory, commit b474ec1a5a79347f7c9634376c866fe3294d657a.*

## Probability Theory (For Scientists and Engineers)

Formal probability theory is a rich and complex field of mathematics with a reputation for being confusing if not outright impenetrable. Much of that intimidation, however, is due not to the abstract mathematics but rather how they are employed in practice. In this case study I attempt to untangle this pedagogical knot to illuminate the basic concepts and manipulations of probability theory and how they can be implemented in practice. Our ultimate goal is to demystify what we can calculate in probability theory and how we can perform those calculations in practice.

View
(HTML)

betanalpha/knitr_case_studies/probability_theory
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2018). Probability Theory (For Scientists and Engineers). Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/probability_theory, commit b474ec1a5a79347f7c9634376c866fe3294d657a.*

## Fitting the Cauchy

In this case study I review various ways of implementing the Cauchy distribution, from the nominal implementation to alternative implementations aimed at ameliorating these difficulties, and demonstrate their relative performance.

View
(HTML)

betanalpha/knitr_case_studies/fitting_the_cauchy
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2018). Fitting the Cauchy. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/fitting_the_cauchy, commit b474ec1a5a79347f7c9634376c866fe3294d657a.*

## The QR Decomposition for Regression Models

This case study reviews the QR decomposition, a technique for decorrelating covariates and, consequently, the resulting posterior distribution in regression models.

View
(HTML)

betanalpha/knitr_case_studies/qr_regression
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2017). The QR Decomposition for Regression Models. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/qr_regression, commit b474ec1a5a79347f7c9634376c866fe3294d657a.*

## Robust PyStan Workflow

This case study demonstrates a proper PyStan workflow that ensures robust inferences with the default dynamic Hamiltonian Monte Carlo algorithm.

View
(HTML)

betanalpha/jupyter_case_studies/pystan_workflow
(GitHub)

Dependences: `Python, Jupyter, PyStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2017). Robust PyStan Workflow. Retrieved from https://github.com/betanalpha/jupyter_case_studies/tree/master/pystan_workflow, commit 2580fdeac38f77859b2d1c60f9c4a37237864e63.*

## Robust RStan Workflow

This case study demonstrates a proper RStan workflow that ensures robust inferences with the default dynamic Hamiltonian Monte Carlo algorithm.

View
(HTML)

betanalpha/knitr_case_studies/rstan_workflow
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2017). Robust RStan Workflow. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/rstan_workflow, commit b474ec1a5a79347f7c9634376c866fe3294d657a.*

## Diagnosing Biased Inference with Divergences

This case study discusses the subtleties of accurate Markov chain Monte Carlo estimation and how divergences can be used to identify biased estimation in practice.

View
(HTML)

betanalpha/knitr_case_studies/divergences_and_bias
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2017). Diagnosing Biased Inference with Divergences. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/divergences_and_bias, commit b474ec1a5a79347f7c9634376c866fe3294d657a.*

## Identifying Bayesian Mixture Models

This case study discusses the common pathologies of Bayesian mixture models as well as some strategies for identifying and overcoming them.

View
(HTML)

betanalpha/knitr_case_studies/identifying_mixture_models
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2017). Identifying Bayesian Mixture Models. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/identifying_mixture_models, commit b474ec1a5a79347f7c9634376c866fe3294d657a.*

## How the Shape of a Weakly Informative Prior Affects Inferences

This case study reviews the basics of weakly-informative priors and how the choice of a specific shape of such a prior affects the resulting posterior distribution.

View
(HTML)

betanalpha/knitr_case_studies/weakly_informative_shapes
(GitHub)

Dependences: `R, knitr, RStan`

Code License: BSD (3 clause),
Text License: CC BY-NC 4.0

Cite As: *Betancourt, Michael (2017). How the Shape of a Weakly Informative Prior Affects Inferences. Retrieved from https://github.com/betanalpha/knitr_case_studies/tree/master/weakly_informative_shapes, commit b474ec1a5a79347f7c9634376c866fe3294d657a.*