When constructing models for inference about human cultural traits, we often run into a problem of (possible) dependency: the individual features that evolve are not necessarily independent of each other. Sometimes this is due to obvious implications: a culture living inland will hardly have any traits related to sea fishing. But there are also less evident connections, such as many linguistic universals. When we describe the transitions of these traits through a continuous-time Markov chain (CTMC; there are other models too, such as Felsenstein's threshold model), there are a few shapes of transition matrix we could consider. Each of these CTMC structures has its own advantages and disadvantages, so let's take a closer look at each of them before comparing them.
Independent Evolution
In the simplest case, we ignore the dependency between the characters and let each of them evolve according to its own transition matrix. This may be a reasonable model when the actual dependency is weak, but otherwise it is precisely the model we want to get away from, because it is too restrictive: it cannot represent any correlation between the characters.
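As a minimal sketch of what "independent" means for the joint transition matrix: for two traits with their own rate matrices, the joint rate matrix is the Kronecker sum of the two. The rate values and the names `independent_rate_matrix`, `Q_A` and `Q_B` are made up for illustration; joint states are ordered 00, 01, 10, 11.

```python
import numpy as np

def independent_rate_matrix(q_a, q_b):
    """Joint rate matrix of two independently evolving traits (Kronecker sum)."""
    q_a = np.asarray(q_a, dtype=float)
    q_b = np.asarray(q_b, dtype=float)
    # Trait A changes while B stays put, plus B changes while A stays put.
    return np.kron(q_a, np.eye(len(q_b))) + np.kron(np.eye(len(q_a)), q_b)

# Hypothetical rates, chosen only for illustration.
Q_A = np.array([[-1.0, 1.0],
                [2.0, -2.0]])
Q_B = np.array([[-0.5, 0.5],
                [0.3, -0.3]])

Q_joint = independent_rate_matrix(Q_A, Q_B)
print(Q_joint)
```

Note that the entries for simultaneous changes in both traits (for example 00 to 11) are zero, and that the rate at which A changes is the same whatever the state of B.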
Fully-Dependent Evolution
The other extreme is to assume that transitions between any two combinations of traits can occur, each with its own rate. This is the least restrictive model, but unfortunately it means that the number of parameters explodes: k binary traits have 2^k combined states, and therefore 2^k (2^k - 1) transition rates.
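To get a feeling for how quickly this grows, here is a small sketch that counts the free rates for k binary traits under the independent and the fully-dependent model. The function names are my own, and the counts assume binary traits with no further constraints (such as reversibility) on the rates.

```python
def n_states(k: int) -> int:
    """Number of combined states of k binary traits."""
    return 2 ** k

def n_rates_independent(k: int) -> int:
    """Each trait has its own gain rate and loss rate."""
    return 2 * k

def n_rates_fully_dependent(k: int) -> int:
    """Every ordered pair of distinct combined states has its own rate."""
    n = n_states(k)
    return n * (n - 1)

for k in (1, 2, 3, 5, 10):
    print(k, n_rates_independent(k), n_rates_fully_dependent(k))
```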
Cardinal-Directions Evolution
Another model assumes that only one character can change in infinitesimal time, but that the transition rates depend on the current combination of traits. This cardinal-directions model (CDM) has gained wide use in the literature; Dunn et al., for example, use it to investigate word order universals.
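A minimal sketch of such a rate matrix for binary traits: every combined state gets one rate per trait, namely the rate of flipping that trait while in that state, and all entries that would change two traits at once stay zero. The function name, the dictionary layout and the rate values are assumptions made for this illustration.

```python
import numpy as np
from itertools import product

def cardinal_directions_matrix(rates, n_traits=2):
    """Build a rate matrix in which only single-trait changes are allowed.

    rates maps (state, trait_index) to the rate of flipping that trait
    while the chain is in that combined state.
    """
    states = list(product((0, 1), repeat=n_traits))
    index = {s: i for i, s in enumerate(states)}
    q = np.zeros((len(states), len(states)))
    for s in states:
        for t in range(n_traits):
            target = s[:t] + (1 - s[t],) + s[t + 1:]
            q[index[s], index[target]] = rates[(s, t)]
    # Diagonal entries make every row sum to zero.
    np.fill_diagonal(q, -q.sum(axis=1))
    return states, q

# Eight hypothetical rates for two binary traits.
example_rates = {
    ((0, 0), 0): 1.0, ((0, 0), 1): 0.5,
    ((0, 1), 0): 0.2, ((0, 1), 1): 0.3,
    ((1, 0), 0): 2.0, ((1, 0), 1): 4.0,
    ((1, 1), 0): 0.1, ((1, 1), 1): 0.6,
}
states, Q_cdm = cardinal_directions_matrix(example_rates)
print(states)
print(Q_cdm)
```

For two binary traits this gives 8 free rates, compared with 4 for the independent model and 12 for the fully-dependent one.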
Cardinal-Directions with Clamped Rates
Pagel and Meade have suggested a set of reversible jump operators that interpolate between the independent model and the cardinal-directions model (CDM). These operators allow different transition rates to be clamped to the same value. Independence of two characters is then the special case of the CDM in which the rates are clamped such that the rates of each character do not depend on the state of the other.
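The special case can be checked numerically. The sketch below is not Pagel and Meade's reversible jump machinery, only the end point of it: it builds the two-trait matrix from rates that may or may not be clamped, and tests whether the clamping pattern makes the traits independent. The names `cdm_2x2` and `is_independent` and all rate values are made up for illustration; states are ordered 00, 01, 10, 11 as (A, B).

```python
import numpy as np

def cdm_2x2(a_up, a_down, b_up, b_down):
    """Two-trait cardinal-directions rate matrix.

    a_up[b] is the rate of A going 0 -> 1 while B is in state b,
    b_down[a] the rate of B going 1 -> 0 while A is in state a, and so on.
    """
    q = np.zeros((4, 4))
    for a in (0, 1):
        for b in (0, 1):
            i = 2 * a + b
            q[i, 2 * (1 - a) + b] = a_down[b] if a else a_up[b]
            q[i, 2 * a + (1 - b)] = b_down[a] if b else b_up[a]
    np.fill_diagonal(q, -q.sum(axis=1))
    return q

def is_independent(q, tol=1e-12):
    """True if no trait's rates depend on the state of the other trait."""
    pairs = [(q[0, 2], q[1, 3]),  # A gains, B = 0 vs B = 1
             (q[2, 0], q[3, 1]),  # A losses
             (q[0, 1], q[2, 3]),  # B gains, A = 0 vs A = 1
             (q[1, 0], q[3, 2])]  # B losses
    return all(abs(x - y) < tol for x, y in pairs)

# All four pairs clamped: this is the independent model.
Q_clamped = cdm_2x2((1.0, 1.0), (2.0, 2.0), (0.5, 0.5), (0.3, 0.3))
# A's gain rate differs with B's state: a dependent model with 5 free rates.
Q_partial = cdm_2x2((1.0, 3.0), (2.0, 2.0), (0.5, 0.5), (0.3, 0.3))
print(is_independent(Q_clamped), is_independent(Q_partial))
```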
Continuous-Time Bayesian Networks
Continuous-time Bayesian networks (CTBNs) were developed by the probabilistic graphical model community. The cardinal-directions model above is a special case of CTBNs. A CTBN specifies, for each trait, which other traits its transition rates depend on. If the transition rates of every trait depend on all other traits, we recover the cardinal-directions model. If, on the other hand, the transition rates of each trait depend on no other trait, we recover the independent transition matrix.
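As a rough sketch of this idea for binary traits: the dependency structure is a list of parent sets, the rate of flipping a trait is a function of its own state and the states of its parents only, and the joint rate matrix is assembled from these conditional rates. The function names, the rate function and the example structure are assumptions for illustration, not taken from any CTBN library.

```python
import numpy as np
from itertools import product

def ctbn_joint_matrix(parents, rate):
    """Assemble the joint rate matrix of a CTBN over binary traits.

    parents[t] is a tuple of the traits that trait t depends on;
    rate(t, own_state, parent_states) is the rate of flipping trait t.
    """
    k = len(parents)
    states = list(product((0, 1), repeat=k))
    index = {s: i for i, s in enumerate(states)}
    q = np.zeros((len(states), len(states)))
    for s in states:
        for t in range(k):
            context = tuple(s[p] for p in parents[t])
            target = s[:t] + (1 - s[t],) + s[t + 1:]
            q[index[s], index[target]] = rate(t, s[t], context)
    np.fill_diagonal(q, -q.sum(axis=1))
    return index, q

def n_parameters(parents):
    """Two rates (gain, loss) per trait and per combination of parent states."""
    return sum(2 * 2 ** len(p) for p in parents)

def example_rate(t, own_state, parent_states):
    """Hypothetical rate function, for illustration only."""
    return 1.0 + 0.5 * own_state + 2.0 * sum(parent_states)

# Trait 1 depends on trait 0; traits 0 and 2 depend on nothing.
example_parents = [(), (0,), ()]
index, Q_ctbn = ctbn_joint_matrix(example_parents, example_rate)
print(n_parameters(example_parents))
```

With `parents = [(1,), (0,)]` the count is 8, the two-trait cardinal-directions model; with `parents = [(), ()]` it is 4, the independent model.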
Kronecker-Coupled CTMCs
In the computer science literature on model checking of continuous-time Markov chains (e.g. Buchholz et al.), we find a different way to generalize away from the independent model. This class of transition matrices is used to model parallel systems with locking states, so the fundamental addition to independent Markov chains is the introduction of synchronization events. A Markov chain with such synchronized transitions can easily be decomposed into Kronecker products of simple matrices, which makes sparse storage and matrix exponentiation very simple. On the other hand, this approach shares a severe disadvantage with the independent evolution model: Kronecker-coupled CTMCs can never express that a single combination of states is impossible. All other dependent models described above allow us to set some rates to 0 such that a single state cannot be reached. This is not possible in a Kronecker-coupled CTMC: if one state is made unreachable by setting a rate to 0, at least one other state also becomes unreachable.
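To make the Kronecker structure concrete, here is a minimal sketch for two binary components: the independent part is a Kronecker sum of the local rate matrices, and each synchronization event adds a Kronecker product of per-component jump matrices, together with a diagonal correction that is again a Kronecker product. This follows the general pattern of such descriptors as I understand it; the exact formulation in Buchholz et al. may differ, and all names and rate values are made up for illustration.

```python
import numpy as np

def kron_sum(q_a, q_b):
    """Independent part: each component moves on its own."""
    return np.kron(q_a, np.eye(len(q_b))) + np.kron(np.eye(len(q_a)), q_b)

def sync_term(rate, e_a, e_b):
    """Contribution of one synchronized event in which both components jump.

    e_a and e_b are 0/1 matrices saying where each component moves in the event.
    """
    jump = rate * np.kron(e_a, e_b)
    # Row sums of a Kronecker product are Kronecker products of row sums,
    # so the diagonal correction keeps the Kronecker form.
    correction = rate * np.kron(np.diag(e_a.sum(axis=1)),
                                np.diag(e_b.sum(axis=1)))
    return jump - correction

# Hypothetical local rates and one event that takes both components 0 -> 1.
Q_A = np.array([[-1.0, 1.0], [2.0, -2.0]])
Q_B = np.array([[-0.5, 0.5], [0.3, -0.3]])
E_A = np.array([[0.0, 1.0], [0.0, 0.0]])
E_B = np.array([[0.0, 1.0], [0.0, 0.0]])

Q_sync = kron_sum(Q_A, Q_B) + sync_term(0.7, E_A, E_B)
print(Q_sync)
```

Unlike the cardinal-directions model, this matrix has a non-zero entry for the simultaneous change from 00 to 11.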
Comparison
[A table showing 2×2 transitions, number of parameters, probability of independent model, other notes]
All structures listed here have some advantages and some disadvantages when used as transition models for the evolution of correlated characters. I am sure I have missed some other ways of restricting CTMC transition rates, so I would welcome any comments about those. In addition, there are other classes of models that we should maybe look at. And yet, there are only two (point five) of these that I have seen used in the phylogeny of anthropological features; it might be worth changing that.