The Complexity of Hidden Markov Models
by Lorenzo Finesso
Hidden Markov Models (HMMs) are mathematical models of uncertain phenomena, well suited to describe complex dynamical behaviours. For quite some time HMMs have been mainly applied to speech recognition, but they are currently spreading to a variety of applications in the areas of digital signal processing, control, and pattern recognition. Popular as they have become, HMMs are not yet mature and still offer a wealth of challenging theoretical problems. HMMs are an ongoing research theme of the System Theory group at LADSEB-CNR.
The simplest probabilistic models of uncertain phenomena are independent processes. The salient feature of these models is their lack of memory: the present value of the process is not influenced by its past or future values. Independent processes are adequate to represent the randomness of games of chance and of equivalently simple physical phenomena, but are of limited use in describing true dynamic behaviours. One step above independent processes, in the hierarchy of probabilistic models, one finds Markov chain (MC) models. The present value of a MC depends on a finite (bounded) number of its most recent past values. If their memory size is correctly adjusted, MCs can be used to describe approximately phenomena with complex dynamical behaviour. Hidden Markov Models are at the top of this hierarchy. Roughly speaking, HMMs are processes that can be represented as functions of MCs with a finite number of states: if X(t) is a (finitely valued) MC and h(.) a given function, then Y(t) = h(X(t)) is an HMM.
The simplicity of the definition and the popularity of HMMs in applied research should not give the (wrong) impression that these are simple generalizations of MCs. In general, the transformation h of the MC X(t) destroys the Markov property and, as a result, the HMM process Y(t) can exhibit infinite memory. Only a few of the fundamental questions concerning HMMs have been solved satisfactorily. In the following, we briefly mention the problems to which researchers working at LADSEB have directly contributed.
The most basic probabilistic question is the characterization of HMMs. Is it possible to decide whether a process Y(t) is an HMM, knowing all of its finite dimensional distributions? The positive answer to this question was provided, independently, by Heller and by Furstenberg in the early sixties and later rederived, within the framework of System Theory, at LADSEB, by Picci in 1976. Another probabilistic problem is the realization of HMMs (on Realization theory, see also the articles by Gombani and van Schuppen in this number). In the context of HMMs, realization theory provides the tools to construct, for a given HMM Y(t), the parameters X(t) and h(.), such that Y(t)=h(X(t)). The minimal number of states required for the construction of X(t) is called the order of the HMM and is a measure of its complexity. Picci and van Schuppen (CWI) have shown that the realization problem for HMMs is equivalent to a problem of positive factorization of positive matrices.
Statistical inference problems for HMMs have been intensively studied since the early sixties, when Baum and his coworkers presented two related algorithms (the Forward-Backward and a form of what is today known as the EM algorithm) that helped solve the Maximum Likelihood parameter estimation problem in the case of finitely valued HMMs. At LADSEB, we are currently working on some aspects of the parameters and order estimation problem for finite and continuous HMMs. The work in this area is in collaboration with Le Gland and Mevel (IRISA-Rennes). Order estimation problems can also be approached with the tools of Information Theory (as pioneered by Rissanen with his MDL, in the context of time series) and are of direct relevance in areas of communi-cations theory such as data compression and source coding. Work on this problem is done jointly with Narayan (University of Maryland).
There are two less standard topics which we have found interesting and on which we are planning future work. In collaboration with Gerencsér (SZTAKI) we are studying estimation problems for quantized ARMA processes, which call for a generalization of HMMs to continuous MCs X(t). In collaboration with Spreij (University of Amsterdam) we study special classes of HMMs hoping to find good approximants of general HMMs in the sense of complexity reduction. For more information on the research of the System Theory group at LADSEB see also the article by Andrea Gombani in this number.
Please contact:
Lorenzo Finesso - LADSEB-CNR
Tel: +39 049 829 5755
E-mail: finesso@ladseb.pd.cnr.it