n-gram language models

Language models estimate the probability of a word sequence, $ \hat P(w_1, w_2, \ldots, w_m)$ -- that is, they evaluate $ \hat P(w_i)$ as defined in equation 1.3 of section 1.14.1.

The probability $ \hat P(w_1, w_2, \ldots, w_m)$ can be decomposed as a product of conditional probabilities:

$\displaystyle \hat P(w_1, w_2, \ldots, w_m) = \prod_{i=1}^{m} \hat P(w_i \;\vert\; w_1, \ldots, w_{i-1})$ (14.1)
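As an illustrative sketch (not HTK's implementation), the chain-rule decomposition in equation (14.1) can be computed for a bigram model, where each history $w_1, \ldots, w_{i-1}$ is truncated to the single previous word. The corpus, function names, and maximum-likelihood estimates below are assumptions for the example:

```python
from collections import Counter

def train_bigram(corpus):
    """Count unigrams and bigrams over tokenised sentences (assumed input format)."""
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus:
        tokens = ["<s>"] + sent          # sentence-start symbol as initial history
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def sequence_prob(sent, unigrams, bigrams):
    """Apply equation (14.1) with a bigram history:
    P(w_1..w_m) ~= product over i of P(w_i | w_{i-1}),
    using unsmoothed maximum-likelihood estimates count(h, w) / count(h)."""
    tokens = ["<s>"] + sent
    p = 1.0
    for h, w in zip(tokens, tokens[1:]):
        p *= bigrams[(h, w)] / unigrams[h]
    return p

# Tiny toy corpus (hypothetical): P("a b") = P(a|<s>) * P(b|a) = 1 * 2/3
corpus = [["a", "b"], ["a", "b"], ["a", "c"]]
uni, bi = train_bigram(corpus)
print(sequence_prob(["a", "b"], uni, bi))
```

A real language model would add smoothing and back-off so that unseen bigrams do not receive zero probability; the unsmoothed estimate here is only to make the product in equation (14.1) concrete.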


