A Hidden Markov Model (HMM) is a statistical model in which a sequence of observations is generated by a sequence of hidden states that follows a Markov chain. Given the observations, the model can be used to infer the most likely sequence of hidden states. HMMs are used in speech and pattern recognition, computational biology, and other areas of data modeling.
In this tutorial, we'll briefly learn how to implement an HMM by using the HMM package in R. The tutorial covers:
- Transition and emission matrix
- Building HMM model
- Predicting the state
- Source code listing
We'll start by installing and loading the HMM package.
install.packages('HMM')
library(HMM)
Transition and emission matrix
   The transition and emission matrices are the main parameters of an HMM.
- The transition matrix holds the probability of switching from one state to another. Each row corresponds to the current state and must sum to 1.
- The emission matrix holds the probability of observing each element (symbol) in a given state. It has one row per state, and each row must sum to 1.
   Let's see an example. There are two possible states, "Target" and "Outlier", in the test data, and their transition probabilities are as below.
  states <- c("Target","Outlier")
targetProb <- c(0.4, 0.6)
outlierProb <- c(0.6, 0.4)
 
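    Based on those probabilities, we can build the transition probability matrix. We fill the matrix by row, so that each vector becomes the row of its own state and every row sums to 1.
transProb <- matrix(c(targetProb, outlierProb), nrow = 2, byrow = TRUE)
print(transProb)
     [,1] [,2]
[1,]  0.4  0.6
[2,]  0.6  0.4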
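As a quick sanity check, the row-sum requirement can be verified before the matrix is passed to the package. The sketch below uses base R only; `isRowStochastic` is a hypothetical helper name introduced here for illustration and is not part of the HMM package.

```r
# Transition probabilities, one row per "from" state (values from the tutorial).
states <- c("Target", "Outlier")
transProb <- matrix(c(0.4, 0.6,
                      0.6, 0.4),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(from = states, to = states))

# Hypothetical helper (not from the HMM package): TRUE when the matrix has
# no negative entries and every row sums to 1 within a small tolerance.
isRowStochastic <- function(m, tol = 1e-8) {
  all(m >= 0) && all(abs(rowSums(m) - 1) < tol)
}

print(rowSums(transProb))
print(isRowStochastic(transProb))
```

Naming the rows and columns is optional, but it makes it easier to see which row belongs to which state when the matrix is not symmetric.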
Each observed element can be "short", "normal", or "long". The probability of observing each element differs between the two states. We'll set the emission probabilities of the elements for each state as below; each vector sums to 1.
elements <- c("short","normal","long")
targetStateProb <- c(0.1, 0.3, 0.6)
outlierStateProb <- c(0.6, 0.3, 0.1)
   
Next, we'll create the emission probability matrix, with one row per state and one column per element.
emissProb <- matrix(c(targetStateProb, outlierStateProb), nrow = 2, byrow = TRUE)
print(emissProb)
     [,1] [,2] [,3]
[1,]  0.1  0.3  0.6
[2,]  0.6  0.3  0.1
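The emission matrix is easier to read and to index when its rows and columns carry names. Here is a minimal base R sketch with the same values as above; the `dimnames` labels are an illustrative choice, not something the tutorial's code sets.

```r
states   <- c("Target", "Outlier")
elements <- c("short", "normal", "long")

# One row per state, one column per element.
emissProb <- matrix(c(0.1, 0.3, 0.6,   # Target row
                      0.6, 0.3, 0.1),  # Outlier row
                    nrow = length(states), byrow = TRUE,
                    dimnames = list(states = states, symbols = elements))

# Shape check: number of states by number of elements.
stopifnot(identical(dim(emissProb), c(length(states), length(elements))))

# Probability of observing "long" while in the "Target" state.
print(emissProb["Target", "long"])

# Each row is a probability distribution over the elements.
print(rowSums(emissProb))
```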
Building HMM model 
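    The initHMM function initializes a general discrete-time, discrete-space Hidden Markov Model. An HMM lets us infer the hidden states from the observed emissions. Now we can build a model with the above inputs and print it to check its contents. Since we don't pass start probabilities, the model starts in either state with equal probability, as the $startProbs entry in the output shows.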
hmm <- initHMM(States = states, 
               Symbols = elements,
               transProbs=transProb,
               emissionProbs = emissProb)
print(hmm)
$States
[1] "Target"  "Outlier"
$Symbols
[1] "short"  "normal" "long"  
$startProbs
 Target Outlier 
    0.5     0.5 
$transProbs
         to
from      Target Outlier
  Target     0.4     0.6
  Outlier    0.6     0.4
$emissionProbs
         symbols
states    short normal long
  Target    0.1    0.3  0.6
  Outlier   0.6    0.3  0.1    
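    Next, we'll simulate 10 observations with the simHMM function using our hmm model. simHMM simulates a path of states and the corresponding observations for a given HMM. Because the simulation is random, your output will differ from the one below unless you fix the random seed with set.seed first.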
simhmm <- simHMM(hmm, 10)
simulated <- data.frame(state=simhmm$states, element=simhmm$observation)
print(simulated)
     state element
1   Target  normal
2  Outlier  normal
3   Target  normal
4   Target   short
5   Target    long
6  Outlier   short
7  Outlier   short
8  Outlier   short
9   Target    long
10 Outlier   short     
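To make the simulation step less of a black box, here is a rough base R sketch of the sampling process an HMM describes: draw the first state from the start probabilities, then repeatedly draw the next state from the current state's transition row and an element from that state's emission row. `simulateHmm` is a hypothetical helper written for this illustration; it is not the package's implementation of simHMM.

```r
set.seed(42)  # fixed seed so the run is repeatable

states    <- c("Target", "Outlier")
elements  <- c("short", "normal", "long")
startProb <- c(Target = 0.5, Outlier = 0.5)
transProb <- matrix(c(0.4, 0.6, 0.6, 0.4), nrow = 2, byrow = TRUE,
                    dimnames = list(states, states))
emissProb <- matrix(c(0.1, 0.3, 0.6, 0.6, 0.3, 0.1), nrow = 2, byrow = TRUE,
                    dimnames = list(states, elements))

# Hypothetical helper: sample n (state, element) pairs from the model above.
simulateHmm <- function(n) {
  state   <- character(n)
  element <- character(n)
  for (t in seq_len(n)) {
    # The first state uses the start distribution; later states use the
    # transition row of the previous state.
    p <- if (t == 1) startProb else transProb[state[t - 1], ]
    state[t]   <- sample(states, 1, prob = p)
    # The observed element depends only on the current state.
    element[t] <- sample(elements, 1, prob = emissProb[state[t], ])
  }
  data.frame(state = state, element = element)
}

sim <- simulateHmm(10)
print(sim)
```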
Predicting the state
   The viterbi function computes the most probable sequence of hidden states for a sequence of observations under a given HMM. We'll create test data and find the most likely states for those elements with the hmm model.
testElements <- c("long","normal","normal","short",
                 "normal","normal","short","long")
stateViterbi <- viterbi(hmm, testElements)     
We'll check the result.
predState <- data.frame(Element=testElements, State=stateViterbi)
print(predState)
  Element   State
1    long  Target
2  normal Outlier
3  normal  Target
4   short Outlier
5  normal  Target
6  normal  Target
7   short Outlier
8    long  Target    
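For readers curious about what happens during decoding, the following is a minimal base R sketch of the Viterbi recursion in log space, using the same probabilities as the tutorial. `viterbiSketch` is a hypothetical name, and the code is an illustration of the general algorithm under these assumptions rather than the package's own implementation.

```r
states    <- c("Target", "Outlier")
elements  <- c("short", "normal", "long")
startProb <- c(Target = 0.5, Outlier = 0.5)
transProb <- matrix(c(0.4, 0.6, 0.6, 0.4), nrow = 2, byrow = TRUE,
                    dimnames = list(states, states))
emissProb <- matrix(c(0.1, 0.3, 0.6, 0.6, 0.3, 0.1), nrow = 2, byrow = TRUE,
                    dimnames = list(states, elements))

# Hypothetical helper: most probable state path for a vector of elements.
viterbiSketch <- function(obs) {
  n <- length(obs)
  k <- length(states)
  # score[s, t]: best log-probability of any path that ends in state s at step t
  score <- matrix(-Inf, k, n, dimnames = list(states, NULL))
  # back[s, t]: index of the previous state on that best path
  back  <- matrix(NA_integer_, k, n)
  score[, 1] <- log(startProb) + log(emissProb[, obs[1]])
  for (t in seq_len(n)[-1]) {
    for (s in seq_len(k)) {
      cand <- score[, t - 1] + log(transProb[, s])
      back[s, t]  <- which.max(cand)
      score[s, t] <- max(cand) + log(emissProb[s, obs[t]])
    }
  }
  # Trace the best path backwards from the best final state.
  path <- integer(n)
  path[n] <- which.max(score[, n])
  for (t in rev(seq_len(n - 1))) {
    path[t] <- back[path[t + 1], t + 1]
  }
  states[path]
}

print(viterbiSketch(c("long", "short", "long")))
```

Note that "normal" is equally likely in both states, so several paths can share the same probability. In such cases this sketch simply keeps the first maximum it finds, and its result may differ from the package's output at the tied positions.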
In this tutorial, we've briefly learned how to implement a Hidden Markov Model by using the HMM package functions. The full source code is listed below.
Source code listing 
install.packages('HMM')
library(HMM)
states <- c("Target","Outlier")
targetProb <- c(0.4, 0.6)
outlierProb <- c(0.6, 0.4)
transProb <- matrix(c(targetProb, outlierProb), nrow = 2, byrow = TRUE)
print(transProb)
elements <- c("short","normal","long")
targetStateProb <- c(0.1, 0.3, 0.6)
outlierStateProb <- c(0.6, 0.3, 0.1)
emissProb <- matrix(c(targetStateProb, outlierStateProb), nrow = 2, byrow = TRUE)
print(emissProb)
hmm <- initHMM(States = states, 
               Symbols = elements,
               transProbs=transProb,
               emissionProbs = emissProb)
print(hmm)
simhmm <- simHMM(hmm, 10)
simulated <- data.frame(state=simhmm$states, element=simhmm$observation)
print(simulated)
testElements <- c("long","normal","normal","short",
                 "normal","normal","short","long")
stateViterbi <- viterbi(hmm, testElements)
predState <- data.frame(Element=testElements, State=stateViterbi)
print(predState)  
 
Hey!
I think there are some problems with the matrices in this post (maybe it was written against an earlier version of the HMM library?).
The transProbs-matrix needs to be transposed, so that each of the rows sum to 1. In general, this matrix needs to have the same amount of rows and columns.
The emissionProbs-matrix also needs to have the same amount of rows/columns.
These conclusions I have drawn from the documentation of initHMM(..).
Yes, you are right, the rows sum must be equal to 1.
I updated the matrix values. Thanks!