WARNING: The HTK Large Vocabulary Decoder HDECODE has been specifically written for speech recognition tasks using cross-word triphone models. Known restrictions are:
The decoder distributed with HTK, HVITE, is only suitable for small and medium vocabulary systems and systems using bigrams. For larger vocabulary systems, or those requiring trigram language models to be used directly in the search, HDECODE is available as an extension to HTK. HDECODE has been specifically written for large vocabulary speech recognition using cross-word triphone models. Known restrictions are listed above. For detailed usage, see the HDECODE reference page (section 17.6). HDECODE is also used to generate lattices for the discriminative training described in the next section.
In this section, examples are given for using HDECODE for large vocabulary speech recognition. Due to the limitations described above, the word-internal triphone systems generated in the previous stages cannot be used with HDECODE. For this section, it is assumed that there is a cross-word triphone system in the directory hmm20, along with a model list in xwrdtiedlist. In contrast to the previous sections, both the macros and the HMM definitions are stored in the same file, hmm20/models. For an example of how to build a cross-word state-clustered triphone system, see step 9 of the Resource Management (RM) example script in the RM samples tar-ball.
Note: the grammar scale factors used in this section, and in the next section on discriminative training, are consistent with the values used in the previous tutorial sections. However, for large vocabulary speech recognition systems, grammar scale factors in the range 12-15 are commonly used.
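To illustrate why the choice of grammar scale factor matters, the following minimal Python sketch ranks two competing hypotheses under different scale factors. It assumes the usual log-linear combination, in which each word's language model log probability is multiplied by the grammar scale factor and a word insertion penalty is added per word. The function names, the hypothesis scores and the scale value 5.0 are hypothetical and chosen only for illustration; they are not taken from HTK or from this tutorial.

```python
# Illustrative sketch only: all names and numbers here are hypothetical.

def combined_score(acoustic_logprob, lm_logprobs, grammar_scale, insertion_penalty=0.0):
    """Acoustic log likelihood plus scaled LM log probabilities and per-word penalty."""
    lm_part = sum(grammar_scale * lp + insertion_penalty for lp in lm_logprobs)
    return acoustic_logprob + lm_part


def best_hypothesis(hypotheses, grammar_scale, insertion_penalty=0.0):
    """Return the name of the hypothesis with the highest combined score."""
    return max(
        hypotheses,
        key=lambda name: combined_score(
            hypotheses[name][0], hypotheses[name][1], grammar_scale, insertion_penalty
        ),
    )


# Each entry: (acoustic log likelihood, per-word LM log probabilities).
# "A" is preferred by the language model, "B" by the acoustic model.
hypotheses = {
    "A": (-1000.0, [-2.0, -3.0]),
    "B": (-980.0, [-4.0, -3.0]),
}

for scale in (5.0, 12.0):
    print(scale, best_hypothesis(hypotheses, scale))
```

With these made-up scores, the smaller scale factor leaves the acoustically preferred hypothesis on top, while the larger one lets the language model override it. Real systems tune the scale factor on held-out data rather than by inspection.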