Hidden Markov Model example in R

   Hidden Markov Model (HMM) is a method for representing most likely corresponding sequences of observation data. HMM is used in speech and pattern recognition, computational biology, and other areas of data modeling.

    In this tutorial, we'll briefly learn how to implement HMM by using HMM package in R. The tutorial covers:

  1. Transition and emission matrix
  2. Building HMM model
  3. Predicting the state
  4. Source code listing

We'll start by installing the HMM package.

 
install.packages('HMM')   
 
library(HMM) 
 

 

Transition and emission matrix

   The transition and emission matrix are the main parameters to build HMM.

  • The transition matrix is a probability of switching from one state to another.
  • Emission matrix is a selection probability of the element in a list.

   Let's see an example. There are two possible states called "Target" and "Outlier" in a test data, and their selecting probabilities are as below,

  
states <- c("Target","Outlier")
targetProb <- c(0.4, 0.6)
outlierProb <- c(0.6, 0.4)
 

    Based on those selection probabilities, we can build a transition probability matrix.

  
transProb <- matrix(c(targetProb, outlierProb), 2)
print(transProb)
 
     [,1] [,2]
[1,]  0.4  0.6
[2,]  0.6  0.4 
   

   A state element can be a "short", "long" or "normal". In each state, the selection probability of element is different. We'll set the state probability for each element as below. Here, we'll divide the selection percentage of each element.


elements <- c("short","normal","long")
targetStateProb <- c(0.1, 0.3, 0.6)
outlierStateProb <- c(0.6, 0.3, 0.1)
   

    Next, we'll create emission probability matrix.


emissProb <- matrix(c(targetStateProb,outlierStateProb), 2, byrow = T) 
print(emissProb) 
  
     [,1] [,2] [,3]
[1,]  0.1  0.3  0.6
[2,]  0.6  0.3  0.1 
   

 

Building HMM model

    The initHMM function initializes a general discrete time and space Hidden Markov Model. HMM inferences on states through the observation of emissions. Now, we can build a model with the above inputs and check the summary of the model.


hmm <- initHMM(States = states, 
               Symbols = elements,
               transProbs=transProb,
               emissionProbs = emissProb)
print(hmm) 
 
$States
[1] "Target"  "Outlier"

$Symbols
[1] "short"  "normal" "long"  

$startProbs
 Target Outlier 
    0.5     0.5 

$transProbs
         to
from      Target Outlier
  Target     0.4     0.6
  Outlier    0.6     0.4

$emissionProbs
         symbols
states    short normal long
  Target    0.1    0.3  0.6
  Outlier   0.6    0.3  0.1 
   


    Next, we'll simulate 10 observation elements with a simHMM function using our hmm model. simHMM function simulates a path of states and observation for a given HMM.


simhmm <- simHMM(hmm, 10)
simulated <- data.frame(state=simhmm$states, element=simhmm$observation)

print(simulated)  
 
     state element
1   Target  normal
2  Outlier  normal
3   Target  normal
4   Target   short
5   Target    long
6  Outlier   short
7  Outlier   short
8  Outlier   short
9   Target    long
10 Outlier   short  
   


Predicting the state

   The Viterbi function  calculates the possible state for a sequence of observations for a given HMM. We'll create test data and find out possible states for those elements with the hmm model.


testElements <- c("long","normal","normal","short",
                 "normal","normal","short","long")
stateViterbi <- viterbi(hmm, testElements) 
    

We'll check the result.


predState <- data.frame(Element=testElements, State=stateViterbi)
print(predState)   
  
  Element   State
1    long  Target
2  normal Outlier
3  normal  Target
4   short Outlier
5  normal  Target
6  normal  Target
7   short Outlier
8    long  Target 
   

    In this tutorial, we've briefly learned how to implement Hidden Markov Model by using HMM package functions. Full source code is listed below.

 

Source code listing

 
install.packages('HMM') 
 
library(HMM)

states <- c("Target","Outlier")
targetProb <- c(0.4, 0.6)
outlierProb <- c(0.6, 0.4)

transProb <- matrix(c(targetProb, outlierProb), 2)
print(transProb)

elements <- c("short","normal","long")
targetStateProb <- c(0.1, 0.3, 0.6)
outlierStateProb <- c(0.6, 0.3, 0.1)

emissProb <- matrix(c(targetStateProb,outlierStateProb), 2, byrow = T) 
print(emissProb)

hmm <- initHMM(States = states, 
               Symbols = elements,
               transProbs=transProb,
               emissionProbs = emissProb)
print(hmm)

simhmm <- simHMM(hmm, 10)
simulated <- data.frame(state=simhmm$states, element=simhmm$observation)

print(simulated)

testElements <- c("long","normal","normal","short",
                 "normal","normal","short","long")
stateViterbi <- viterbi(hmm, testElements)

predState <- data.frame(Element=testElements, State=stateViterbi)
print(predState) 
 

 

2 comments:

  1. Hey!
    I think there are some problems with the matrices in this post (maybe it was written against an earlier version of the HMM library?

    The transProbs-matrix needs to be transposed, so that each of the rows sum to 1. In general, this matrix needs to have the same amount of rows and columns.

    The emissionProbs-matrix also needs to have the same amount of rows/columns.

    These conclusions I have drawn from the documentation of initHMM(..).

    ReplyDelete
    Replies
    1. Yes, you are right, the rows sum must be equal to 1.
      I updated matrix values. Thanks!

      Delete