8,9-POS tagging and HMMs
Use Hidden Markov Models to do POS tagging
Notation:
Sequence of observations over time (sentence): $ O=o_1\dots o_T $
Set of states (tags): $Q=q^1\dots q^N$
Sequence of states over time: $Q = q_1 \dots q_T$
Vocabulary size: $V$
Time step: $t$, not a tag
Matrix of transition probabilities: $A$
$a_{ij}$: the probability of transitioning from state $q^i$ to $q^j$
Matrix of emission (output) probabilities: $B$
$b_i(o_t)$: the probability of emitting $o_t$ from state $q^i$
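The notation above can be made concrete with a toy HMM. This is an illustrative sketch only: the tag set, vocabulary, and all probabilities below are made-up numbers, and the initial distribution $\pi$ is an extra assumption not defined in the notes. It shows how $A$, $B$, and the two sequence types fit together when scoring a tagged sentence.

```python
import numpy as np

# Toy HMM: two tags (states q^1, q^2) and a three-word vocabulary (V = 3).
# All names and numbers are illustrative, not from the notes.
states = ["NOUN", "VERB"]
vocab = {"dogs": 0, "run": 1, "fast": 2}

# A[i, j] = a_ij: probability of transitioning from state q^i to q^j
A = np.array([[0.3, 0.7],
              [0.6, 0.4]])

# B[i, k] = b_i(o): probability of state q^i emitting the word with index k
B = np.array([[0.7, 0.1, 0.2],
              [0.1, 0.6, 0.3]])

# pi[i]: probability of starting in state q^i (assumed, not in the notes)
pi = np.array([0.8, 0.2])

def joint_prob(tags, words):
    """P(Q, O) = pi(q_1) b_{q_1}(o_1) * prod_{t>1} a_{q_{t-1} q_t} b_{q_t}(o_t)."""
    p = pi[tags[0]] * B[tags[0], vocab[words[0]]]
    for t in range(1, len(words)):
        p *= A[tags[t - 1], tags[t]] * B[tags[t], vocab[words[t]]]
    return p

# Score the tag sequence NOUN VERB for the sentence "dogs run":
# 0.8 * 0.7 * 0.7 * 0.6 = 0.2352
print(joint_prob([0, 1], ["dogs", "run"]))
```

Indexing tags and words as integers, with $A$ and $B$ as NumPy arrays, is the standard way these matrices are stored; it also makes the Viterbi and forward algorithms straightforward to vectorize later.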