The following libraries and references may be useful for solving this homework.
- NanoHMM library (includes both C and Python implementations).
Part 1: Slow Forward Algorithm
Implement the "slow" version of the forward algorithm. It should run in O(NT). It should support at least 4 states and sequences of length at least 5. This should be your own code, i.e., you are not allowed to use any other libraries or implementations for this part.
// Insert your code here
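A minimal reference sketch in Python (representing A, B, and pi as nested lists is an assumption here; any equivalent representation works):

from itertools import product

def forward_slow(A, B, pi, O):
    """Brute-force P(O | lambda): sum the joint probability of O and Q
    over every possible state sequence Q of length T. O(T * N^T)."""
    N, T = len(A), len(O)
    total = 0.0
    for path in product(range(N), repeat=T):
        p = pi[path[0]] * B[path[0]][O[0]]
        for t in range(1, T):
            p *= A[path[t - 1]][path[t]] * B[path[t]][O[t]]
        total += p
    return total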
Part 2: The Forward Algorithm
Implement the Forward algorithm that runs in O(N^2 * T). It should support sequences of length at least 8 with at least 5 states. Because these numbers are relatively small, your code doesn't have to re-normalize the probabilities at each step of the algorithm. This should be your own code, i.e., you are not allowed to use any other libraries or implementations for this part.
// Insert your code here
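One possible shape for this part, as a Python sketch (same nested-list conventions as above; alpha is stored N-by-T to match the answer format in Part 3):

def forward(A, B, pi, O):
    """Forward algorithm, O(N^2 * T), no re-normalization.
    Returns the N-by-T array alpha, where
    alpha[i][t] = P(O[0..t], state i at time t | lambda)."""
    N, T = len(A), len(O)
    alpha = [[0.0] * T for _ in range(N)]
    for i in range(N):                                   # initialization
        alpha[i][0] = pi[i] * B[i][O[0]]
    for t in range(1, T):                                # induction
        for j in range(N):
            s = sum(alpha[i][t - 1] * A[i][j] for i in range(N))
            alpha[j][t] = s * B[j][O[t]]
    return alpha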
Part 3: Forward Check
Check your implementation of the Forward algorithm by computing the forward variable alpha for the observation sequence O = (0,1,0,2,0,1,0) given each of the HMMs specified in Parts 3A and 3B below.
Part 3A: Forward Check Using HMM with Two States
The HMM for Part 3A is specified below:
A = [[0.66, 0.34],
[1, 0]]
B = [[0.5, 0.25, 0.25],
[0.1, 0.1, 0.8]]
pi = [0.8, 0.2]
// Insert the computed N-by-T array for the forward variable alpha here.
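One way to produce this array, reusing the forward() sketch from Part 2 (Part 3B is identical apart from the three-state parameters):

A = [[0.66, 0.34],
     [1.0, 0.0]]
B = [[0.5, 0.25, 0.25],
     [0.1, 0.1, 0.8]]
pi = [0.8, 0.2]
O = (0, 1, 0, 2, 0, 1, 0)

alpha = forward(A, B, pi, O)        # 2-by-7 array for this HMM
for row in alpha:
    print("  ".join("%.6f" % x for x in row))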
Part 3B: Forward Check Using HMM with Three States
The HMM for Part 3B is specified below:
A = [[0.8, 0.1, 0.1],
[0.4, 0.2, 0.4],
[0, 0.3, 0.7]]
B = [[0.66, 0.34, 0],
[0, 0, 1],
[0.5, 0.4, 0.1]]
pi = [0.6, 0, 0.4]
// Insert the computed N-by-T array for the forward variable alpha here.
Part 4: Likelihood Calculation
Compute the likelihood for each of the following five observation sequences given the single HMM specified below:
O1 = (1,0,0,0,1,0,1)
O2 = (0,0,0,1,1,2,0)
O3 = (1,1,0,1,0,1,2)
O4 = (0,1,0,2,0,1,0)
O5 = (2,2,0,1,1,0,1)
The HMM for Part 4 is specified below:
A = [[0.6, 0.4],
[1, 0]]
B = [[0.7, 0.3, 0],
[0.1, 0.1, 0.8]]
pi = [0.7, 0.3]
Hint: Compute this by adding the elements in the last column of the alpha array that is computed by your Forward algorithm.
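A sketch of this computation, again reusing forward() from Part 2:

A = [[0.6, 0.4],
     [1.0, 0.0]]
B = [[0.7, 0.3, 0.0],
     [0.1, 0.1, 0.8]]
pi = [0.7, 0.3]

sequences = [
    ("O1", (1, 0, 0, 0, 1, 0, 1)),
    ("O2", (0, 0, 0, 1, 1, 2, 0)),
    ("O3", (1, 1, 0, 1, 0, 1, 2)),
    ("O4", (0, 1, 0, 2, 0, 1, 0)),
    ("O5", (2, 2, 0, 1, 1, 0, 1)),
]
for name, O in sequences:
    alpha = forward(A, B, pi, O)
    likelihood = sum(alpha[i][-1] for i in range(len(A)))  # last column
    print("Likelihood for %s = %g" % (name, likelihood))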
// Insert the computed likelihood for each sequence here.
Likelihood for O1 = ?
Likelihood for O2 = ?
Likelihood for O3 = ?
Likelihood for O4 = ?
Likelihood for O5 = ?
Part 5: Match Sequences to HMMs
Use your implementation of the Forward algorithm to compute the likelihood for each of the following five observation sequences given each of the following five HMMs. Fill the table below and indicate with * the most probable HMM for each sequence.
The observation sequences are:
O1 = (1,0,0,0,1,0,1)
O2 = (0,0,0,1,1,2,0)
O3 = (1,1,0,1,0,1,2)
O4 = (0,1,0,2,0,1,0)
O5 = (2,2,0,1,1,0,1)
The HMMs are:
HMM 1:
A = [[1.0, 0.0], [0.5, 0.5]]
B = [[0.4, 0.6, 0.0], [0.0, 0.0, 1.0]]
pi = [0.0, 1.0]
HMM 2:
A = [[0.25, 0.75], [1.0, 0.0]]
B = [[0.0, 1.0, 0.0], [0.66, 0.0, 0.34]]
pi = [1.0, 0.0]
HMM 3:
A = [[0.0, 1.0], [1.0, 0.0]]
B = [[1.0, 0.0, 0.0], [0.0, 0.66, 0.34]]
pi = [1.0, 0.0]
HMM 4:
A = [[1.0, 0.0], [0.44, 0.56]]
B = [[0.36, 0.42, 0.22], [1.0, 0.0, 0.0]]
pi = [0.0, 1.0]
HMM 5:
A = [[0.0, 1.0], [1.0, 0.0]]
B = [[0.25, 0.75, 0.0], [1.0, 0.0, 0.0]]
pi = [1.0, 0.0]
      HMM1     HMM2     HMM3     HMM4     HMM5
O1    L=?      *L=?     ...
O2    *L=?     L=?      ...
O3    L=?      ...
O4    L=?      ...
O5    L=?      ...
// Insert your code here.
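The table can be filled with a double loop over sequences and models; here is a sketch reusing forward() from Part 2 (the output format is one option among many):

hmms = [
    ([[1.0, 0.0], [0.5, 0.5]],   [[0.4, 0.6, 0.0], [0.0, 0.0, 1.0]],    [0.0, 1.0]),
    ([[0.25, 0.75], [1.0, 0.0]], [[0.0, 1.0, 0.0], [0.66, 0.0, 0.34]],  [1.0, 0.0]),
    ([[0.0, 1.0], [1.0, 0.0]],   [[1.0, 0.0, 0.0], [0.0, 0.66, 0.34]],  [1.0, 0.0]),
    ([[1.0, 0.0], [0.44, 0.56]], [[0.36, 0.42, 0.22], [1.0, 0.0, 0.0]], [0.0, 1.0]),
    ([[0.0, 1.0], [1.0, 0.0]],   [[0.25, 0.75, 0.0], [1.0, 0.0, 0.0]],  [1.0, 0.0]),
]
for name, O in sequences:   # the same five sequences as in Part 4
    likes = []
    for A, B, pi in hmms:
        alpha = forward(A, B, pi, O)
        likes.append(sum(alpha[i][-1] for i in range(len(A))))
    star = likes.index(max(likes))          # most probable HMM gets the *
    cells = [("*" if k == star else "") + ("L=%.3e" % v)
             for k, v in enumerate(likes)]
    print(name, "  ".join(cells))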
Part 6: Match Sequences to HMMs (using NanoHMM)
This problem is similar to Part 5, but the sequences are now longer, and your Forward and Backward algorithms may no longer work because they don't re-normalize the probabilities at each step (for long sequences the joint probabilities underflow double precision).
Use the implementation of the Forward algorithm in the NanoHMM library to compute the log-likelihood for each of the following five observation sequences given each of the following five HMMs. Fill the table below and indicate with * the most likely HMM for each sequence. In all cases, N=5, M=6, and T=20.
O1 = (4,2,5,1,5,1,5,3,2,3,2,0,1,0,0,4,4,3,0,1)
O2 = (3,2,3,3,5,5,5,5,1,0,1,4,2,4,3,0,5,3,1,0)
O3 = (4,3,0,3,4,0,1,0,2,0,5,3,2,0,0,5,5,3,5,4)
O4 = (3,4,2,0,5,4,4,3,1,5,3,3,2,3,0,4,2,5,2,4)
O5 = (2,0,5,4,4,2,0,5,5,4,4,2,0,5,4,4,5,5,5,5)
The HMMs are:
HMM 1:
A = [[0.33, 0, 0, 0.67, 0],
[0.67, 0, 0.33, 0, 0],
[0, 1.0, 0.0, 0, 0],
[0, 0, 0, 0.25, 0.75],
[0.0, 0.0, 0.6, 0, 0.4]]
B = [[0.67, 0, 0, 0, 0, 0.33],
[0.0, 1.0, 0, 0, 0, 0],
[0.5, 0, 0, 0, 0, 0.5],
[0, 0, 0, 0.25, 0.75, 0],
[0, 0.0, 0.6, 0.4, 0, 0.0]]
pi = [0.0, 0.0, 0.0, 1.0, 0.0]
HMM 2:
A = [[0.0, 0.0, 1.0, 0, 0.0],
[0.0, 0, 0.0, 0.0, 1.0],
[0.38, 0.0, 0.23, 0.38, 0.0],
[0.0, 0.31, 0.0, 0.69, 0],
[0.0, 0.75, 0.0, 0.25, 0.0]]
B = [[0.0, 0.0, 1.0, 0.0, 0.0, 0.0],
[0.0, 0.6, 0.2, 0.2, 0.0, 0.0],
[0.0, 0.0, 0, 1.0, 0.0, 0],
[0, 0.0, 0, 0.22, 0.0, 0.78],
[0.6, 0.0, 0.0, 0.0, 0.4, 0.0]]
pi = [0.0, 0.0, 1.0, 0.0, 0.0]
HMM 3:
A = [[0, 0.0, 0.32, 0.18, 0.5],
[0.0, 0.0, 0.0, 1.0, 0.0],
[0, 0.0, 0, 0.0, 1.0],
[0, 0.64, 0, 0.0, 0.36],
[1.0, 0.0, 0, 0, 0]]
B = [[0.0, 0.17, 0.33, 0.0, 0.0, 0.5],
[0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
[0.47, 0.0, 0.0, 0.0, 0.0, 0.53],
[0.27, 0.0, 0.0, 0.0, 0.73, 0.0],
[0.66, 0.0, 0.0, 0.33, 0.0, 0.0]]
pi = [0.0, 0.0, 0.0, 1.0, 0.0]
HMM 4:
A = [[0.0, 0.0, 1.0, 0, 0.0],
[0.0, 0, 0.62, 0, 0.38],
[0.0, 0.5, 0.0, 0.5, 0.0],
[0.0, 0.23, 0.0, 0.0, 0.77],
[0.0, 0, 0, 1.0, 0]]
B = [[0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
[0.0, 0.0, 0.62, 0, 0.38, 0.0],
[0, 0.0, 0.0, 0.0, 1, 0],
[0, 0.0, 0, 0.41, 0.18, 0.41],
[0.31, 0.16, 0.37, 0.16, 0, 0.0]]
pi = [1.0, 0.0, 0.0, 0.0, 0]
HMM 5:
A = [[0.5, 0.33, 0, 0.17, 0.0],
[0.0, 0.0, 0.0, 0.0, 1.0],
[0.75, 0.0, 0.25, 0.0, 0.0],
[0.0, 0.0, 0, 1.0, 0.0],
[0.0, 0.0, 1.0, 0.0, 0.0]]
B = [[0.0, 0.0, 0.0, 0.0, 1.0, 0],
[0.0, 0.0, 1.0, 0.0, 0.0, 0.0],
[0.0, 0.0, 0.0, 0.0, 0, 1.0],
[0.0, 0.0, 0.0, 0.0, 0, 1.0],
[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]]
pi = [0.0, 1.0, 0.0, 0.0, 0.0]
      HMM1       HMM2       HMM3       HMM4       HMM5
O1    logL=?     *logL=?    ...
O2    *logL=?    logL=?     ...
O3    logL=?     ...
O4    logL=?     ...
O5    logL=?     ...
// Insert your code here.
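If you want a from-scratch cross-check, the following sketch computes the log-likelihood with a scaled forward pass; it is only a stand-in for the NanoHMM forward routine, whose exact Python API is not reproduced in this handout:

import math

def log_likelihood(A, B, pi, O):
    """Scaled forward pass returning log P(O | lambda). Stand-in for
    the NanoHMM forward call -- substitute the library routine here."""
    N = len(A)
    alpha = [pi[i] * B[i][O[0]] for i in range(N)]
    logL = 0.0
    for t in range(len(O)):
        if t > 0:
            alpha = [B[j][O[t]] * sum(alpha[i] * A[i][j] for i in range(N))
                     for j in range(N)]
        c = sum(alpha)
        if c == 0.0:
            return float("-inf")   # sequence impossible under this HMM
        logL += math.log(c)
        alpha = [a / c for a in alpha]
    return logL

hmms = []       # fill in the five (A, B, pi) triples listed above
sequences = []  # fill in [("O1", (4, 2, 5, ...)), ...] from above
for name, O in sequences:
    logs = [log_likelihood(A, B, pi, O) for A, B, pi in hmms]
    star = logs.index(max(logs))
    print(name, "  ".join(("*" if k == star else "") + ("logL=%.2f" % v)
                          for k, v in enumerate(logs)))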
Part 7: Train HMMs (using the NanoHMM library)
The following five observation sequences are used for both Parts 7A and 7B:
O1 = (4,2,5,1,5,1,5,3,2,3,2,0,1,0,0,4,4,3,0,1)
O2 = (3,2,3,3,5,5,5,5,1,0,1,4,2,4,3,0,5,3,1,0)
O3 = (4,3,0,3,4,0,1,0,2,0,5,3,2,0,0,5,5,3,5,4)
O4 = (3,4,2,0,5,4,4,3,1,5,3,3,2,3,0,4,2,5,2,4)
O5 = (2,0,5,4,4,2,0,5,5,4,4,2,0,5,4,4,5,5,5,5)
Part 7A: Train 3-State HMMs
Train a 3-state HMM for each of the five observation sequences using the Baum-Welch implementation in the NanoHMM library.
Trained HMM for O1: A = ... B = ... pi = ...
Trained HMM for O2: A = ... B = ... pi = ...
...
Part 7B: Train 4-State HMMs
Train a 4-state HMM for each of the five observation sequences using the Baum-Welch implementation in the NanoHMM library.
Trained HMM for O1: A = ... B = ... pi = ...
Trained HMM for O2: A = ... B = ... pi = ...
...
// Insert your code for Parts 7A and 7B here.
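An outline of the training loop; the baum_welch call below is a placeholder, not NanoHMM's actual API (substitute the library's Baum-Welch entry point; a from-scratch sketch also appears under Part EC3):

import random

def random_stochastic(rows, cols):
    """Random row-stochastic matrix for initializing A, B, and pi."""
    m = [[random.random() for _ in range(cols)] for _ in range(rows)]
    return [[x / sum(row) for x in row] for row in m]

N, M = 3, 6     # Part 7A; use N = 4 for Part 7B
sequences = []  # fill in [("O1", (4, 2, 5, ...)), ...] from above
for name, O in sequences:
    A0 = random_stochastic(N, N)
    B0 = random_stochastic(N, M)
    pi0 = random_stochastic(1, N)[0]
    # Placeholder call -- substitute the NanoHMM Baum-Welch routine.
    A, B, pi = baum_welch(A0, B0, pi0, O, n_iter=100)
    print("Trained HMM for %s:" % name)
    print("A =", A)
    print("B =", B)
    print("pi =", pi)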
Extra Credit
For each of the three problems below, you are allowed to use only your own code. In other words, you are not allowed to use any other libraries or implementations for these problems.
Part EC1: Implement the Forward Algorithm with Re-Normalization
// Insert your code here
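A minimal sketch of one possible implementation (same nested-list conventions as in Parts 1 and 2; assumes P(O | lambda) > 0 so every scaling factor is positive):

import math

def forward_renorm(A, B, pi, O):
    """Forward algorithm with re-normalization at every step.
    Returns the normalized N-by-T alpha, the scaling factors c
    (length T), and log P(O | lambda) = sum_t log c[t]."""
    N, T = len(A), len(O)
    alpha = [[0.0] * T for _ in range(N)]
    c = [0.0] * T
    for i in range(N):
        alpha[i][0] = pi[i] * B[i][O[0]]
    c[0] = sum(alpha[i][0] for i in range(N))
    for i in range(N):
        alpha[i][0] /= c[0]
    for t in range(1, T):
        for j in range(N):
            alpha[j][t] = B[j][O[t]] * sum(alpha[i][t - 1] * A[i][j]
                                           for i in range(N))
        c[t] = sum(alpha[j][t] for j in range(N))
        for j in range(N):
            alpha[j][t] /= c[t]
    return alpha, c, sum(math.log(ct) for ct in c)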
Part EC2: Implement the Forward-Backward Algorithm with Re-Normalization
// Insert your code here
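A matching backward pass, scaled with the same factors c produced by forward_renorm above; under this convention the posterior gamma[i][t] = alpha[i][t] * beta[i][t] needs no further normalization:

def backward_renorm(A, B, O, c):
    """Backward pass re-normalized by the forward scaling factors c."""
    N, T = len(A), len(O)
    beta = [[0.0] * T for _ in range(N)]
    for i in range(N):
        beta[i][T - 1] = 1.0
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[i][t] = sum(A[i][j] * B[j][O[t + 1]] * beta[j][t + 1]
                             for j in range(N)) / c[t + 1]
    return beta

def posteriors(alpha, beta):
    """gamma[i][t] = P(state i at time t | O, lambda)."""
    N, T = len(alpha), len(alpha[0])
    return [[alpha[i][t] * beta[i][t] for t in range(T)] for i in range(N)]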
Part EC3: Implement the Baum-Welch Algorithm
// Insert your code here
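A compact Baum-Welch sketch built on forward_renorm (EC1) and backward_renorm (EC2); it assumes every state keeps nonzero occupancy during training (add guards for degenerate cases):

def baum_welch(A, B, pi, O, n_iter=50):
    """Baum-Welch re-estimation for a single observation sequence.
    Returns the updated (A, B, pi)."""
    N, M, T = len(A), len(B[0]), len(O)
    for _ in range(n_iter):
        alpha, c, _ = forward_renorm(A, B, pi, O)
        beta = backward_renorm(A, B, O, c)
        # gamma[i][t]: posterior state probability (see EC2)
        gamma = [[alpha[i][t] * beta[i][t] for t in range(T)]
                 for i in range(N)]
        # xi[i][j][t]: posterior probability of transition i -> j at time t
        xi = [[[alpha[i][t] * A[i][j] * B[j][O[t + 1]] * beta[j][t + 1]
                / c[t + 1] for t in range(T - 1)]
               for j in range(N)] for i in range(N)]
        pi = [gamma[i][0] for i in range(N)]
        A = [[sum(xi[i][j]) / sum(gamma[i][:T - 1]) for j in range(N)]
             for i in range(N)]
        B = [[sum(gamma[j][t] for t in range(T) if O[t] == k)
              / sum(gamma[j]) for k in range(M)] for j in range(N)]
    return A, B, pi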