Hierarchical models are a natural way to model heterogeneity across exchangeable contexts, but they become less appropriate when additional information discriminates those contexts. In particular any labels that categorize individual contexts into groups immediately obstruct full exchangeability, and modeling heterogeneity consistently with these groupings becomes much more subtle. When the contexts are subject to multiple, overlapping categorizations the challenge becomes even more difficult. In this case study we investigate how to generalize exchangeability in the presence of factors that categorize individual contexts and then develop modeling techniques compatible with this generalization.

We begin by discussing how to consistently model heterogeneity in the presence of a single factor before considering multiple factors, in particular multiple factors whose groups nest within each other and multiple factors whose groups overlap. With this foundation laid we then demonstrate the basic implementation and analysis of these models before working through some more realistic examples.

1 Single Factor Models

Let's begin by considering how heterogeneity manifests in the presence of a single categorizing variable. After defining these categorizing variables as factors we discuss the conditional and marginal exchangeability compatible with categorical groups and the resulting hierarchical models. Finally we reframe these hierarchies in terms of residual variations across the groups.

1.1 Factors

As we saw in my hierarchical modeling case study a set of individual contexts are exchangeable only when we do not have, or ignore, any information that discriminates one context from another. For example the cards in a deck are exchangeable when we can view only their back faces.




Flipping the cards over, however, reveals patterns that break the full exchangeability of the cards.




We will denote such discriminating information with traditional statistics terminology. A factor is a categorical variable that endows each context with one of a finite number of discrete, unordered values, or levels. When there is minimal risk of confusion I will also use "level" to refer to all of the individual contexts grouped with the same level value. In our card example the immediate factor is color, with the three levels corresponding to light red, red, and dark red values. The "dark red level" would then correspond to all cards with dark red coloring.
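To make the terminology concrete, here is a minimal Python sketch of a factor as a list of level assignments, one per context. The six-card deck and the names `card_colors` and `group_by_level` are hypothetical, introduced only for illustration; only the three color levels come from the card example above.

```python
from collections import defaultdict

# The three levels of the color factor from the card example.
color_levels = ["light red", "red", "dark red"]

# Hypothetical deck of six cards: entry n is the level assigned to card n.
card_colors = ["red", "dark red", "light red", "red", "dark red", "light red"]


def group_by_level(assignments):
    """Map each level value to the indices of the contexts assigned to it."""
    groups = defaultdict(list)
    for context, level in enumerate(assignments):
        groups[level].append(context)
    return dict(groups)


groups = group_by_level(card_colors)

# The "dark red level" is the collection of all cards with dark red coloring.
print(groups["dark red"])  # [1, 4]
```

Because each card receives exactly one level, the groups partition the deck: every card index appears in one and only one group.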

Factors can capture any abstract categorization or labeling. For example if the contexts are defined by facilities where a certain product is manufactured then a region factor might assign values of "northeast", "southeast", "southwest", and "northwest" depending on where each facility is located. Additional factors might include where the material input to the manufacturing process is sourced, environmental conditions, and even manager and employee assignments.

1.2 Conditional and Marginal Exchangeability

Regardless of its interpretation, the level assignments within a factor obstruct some of the permutations that would be valid without the factor. In particular although our information is still indifferent to permutations of cards within the same level we will notice permutations that exchange cards between different levels.




Grouping the individual contexts by their level assignments makes it easier to identify, and classify, the permutations that persist after a factor has been introduced. For example the permutations that modify only contexts within a single level are unaffected by the introduction of the factor and they define a conditional exchangeability.
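One way to see which permutations persist is to enumerate them for a small example. The following Python sketch assumes a hypothetical five-card deck; the helper names `preserves_levels` and `levels_touched` are illustrative and not part of any library. It filters all permutations down to those that send every card to a position with the same level, and then picks out the ones that modify contexts in at most one level.

```python
from collections import Counter
from itertools import permutations
from math import factorial

# Hypothetical deck: entry n is the color level of the card in position n.
levels = ["red", "red", "dark red", "dark red", "light red"]


def preserves_levels(perm, levels):
    """True if the card moved into each position has that position's level."""
    return all(levels[i] == levels[j] for i, j in enumerate(perm))


def levels_touched(perm, levels):
    """Levels of the positions whose card is actually changed by perm."""
    return {levels[i] for i, j in enumerate(perm) if i != j}


all_perms = list(permutations(range(len(levels))))

# Permutations that we cannot notice once the colors are visible.
unnoticed = [p for p in all_perms if preserves_levels(p, levels)]

# Of those, the permutations that modify contexts within a single level
# (the identity, which modifies nothing, is included).
single_level = [p for p in unnoticed if len(levels_touched(p, levels)) <= 1]

# The unnoticed permutations are compositions of within-level permutations,
# so their count is the product of the factorials of the level sizes.
expected = 1
for size in Counter(levels).values():
    expected *= factorial(size)

print(len(all_perms), len(unnoticed), expected, len(single_level))  # 120 4 4 3
```

In this toy deck only 4 of the 120 permutations survive the introduction of the factor: the identity, the swap of the two red cards, the swap of the two dark red cards, and the composition of both swaps.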