Markov aggregation is the task of representing a Markov chain with a large alphabet by a Markov chain with a smaller alphabet, thus reducing model complexity while retaining the computationally and analytically desirable Markov property. In this work we propose an information-theoretic cost function for Markov aggregation. This cost function is motivated by two objectives: (i) the process obtained by observing the original Markov chain through the aggregation mapping should be close to a Markov chain, and (ii) the aggregated Markov chain should retain as much of the temporal dependence structure of the original chain as possible.
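As an illustrative formalization (the notation is ours, not taken from the text: let $X_t$ denote the original Markov chain and $Y_t = g(X_t)$ its image under the aggregation mapping $g$), objectives of this kind are typically captured by information-theoretic quantities such as
\[
  I(X_{t-1}; Y_t \mid Y_{t-1}) \quad\text{and}\quad I(Y_{t-1}; Y_t),
\]
where the first term vanishes when the chain is lumpable with respect to $g$ (so the aggregated process is exactly Markov) and thus quantifies objective (i), while the second term measures the one-step temporal dependence retained by the aggregated chain, corresponding to objective (ii). The cost function actually proposed in this work may combine such terms differently; this sketch only indicates the type of quantities involved.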
We then adapt this framework for clustering and co-clustering. The adapted framework contains well-known co-clustering cost functions, including those of ITCC and IBCC, as special cases. Using this framework, we investigate whether information-theoretic cost functions, both those previously proposed in the literature and those proposed here, are a good choice for clustering problems.
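For reference on one of the special cases mentioned above, the ITCC cost function of Dhillon, Mallela, and Modha is the loss in mutual information incurred by co-clustering: with row and column variables $X$ and $Y$ and clusterings $\hat{X} = f(X)$, $\hat{Y} = g(Y)$ (notation assumed here for illustration), ITCC solves
\[
  \min_{f,\,g}\; I(X;Y) - I(\hat{X};\hat{Y}).
\]
Cost functions of this form are what the adapted framework is claimed to recover as special cases.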