Discrete systems have the advantage of low run-time computation. However, vector quantisation reduces accuracy and this can lead to poor performance. As an intermediate between discrete and continuous, a fully tied-mixture system can be used. Tied-mixtures are conceptually just another example of the general parameter tying mechanism built into HTK. However, to use them effectively in speech recognition systems a number of storage and computational optimisations must be made. Hence, they are given special treatment in HTK.
When specific mixtures are tied as in
TI "mix" {*.state[2].mix[1]}
then a Gaussian mixture component is shared across all of the owners
of the tie. In this example, all models will share the same Gaussian
for the first mixture component of state 2. However, if the mixture
component index is missing, then all of the mixture components participating in
the tie are joined rather than tied. More specifically, the commands
JO 128 2.0
TI "mix" {*.state[2-4].mix}
have the following effect. All of the mixture components in states 2 to 4 of
all models are collected into a pool. If the number of components
in the pool exceeds 128, as set by the preceding join command
JO, then
components with the smallest weights are removed until the pool size is exactly
128. Similarly, if the size of the initial pool is less than 128, then mixture
components are split using the same algorithm as for the Mix-Up MU
command. All states then share all of the
mixture components in this pool. The new mixture weights are chosen to be proportional
to the log probability of the corresponding new mixture component mean with
respect to the original distribution for that state. The log is used here
to give a wider spread of mixture weights. All mixture weights are floored
to the value of the second argument of the JO command times
MINMIX.
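The pool reduction and weight flooring described above can be sketched in Python. This is only an illustration under stated assumptions, not HHED's implementation: the function names are hypothetical, the value given to MINMIX is a placeholder (check the HTK source for the real constant), renormalising the weights after flooring is an assumption, and the splitting case for pools smaller than the target is omitted.

```python
# Placeholder value for HTK's MINMIX constant (assumption, not taken from the text).
MINMIX = 1.0e-5


def shrink_pool(pool, target):
    """Keep the `target` highest-weight components of the pool.

    `pool` is a list of (weight, component_id) pairs. Dropping the
    smallest weights until `target` remain is equivalent to keeping
    the `target` largest.
    """
    ordered = sorted(pool, key=lambda c: c[0], reverse=True)
    return ordered[:target]


def floor_weights(weights, jo_floor_arg):
    """Floor each weight at jo_floor_arg * MINMIX, then renormalise.

    Renormalising so the weights sum to one is an assumption of this sketch.
    """
    floor = jo_floor_arg * MINMIX
    floored = [max(w, floor) for w in weights]
    total = sum(floored)
    return [w / total for w in floored]


pool = [(0.5, "a"), (0.1, "b"), (0.3, "c"), (0.05, "d")]
small_pool = shrink_pool(pool, 2)          # keeps components "a" and "c"
weights = floor_weights([0.9, 0.1, 0.0], 2.0)  # 2.0 as in "JO 128 2.0"
```

With `JO 128 2.0`, the floor in this sketch is `2.0 * MINMIX`, so no state ends up with a zero weight on any pooled component.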
The net effect of the above two commands is to create a set of tied-mixture HMMs where the same set of mixture components is shared across all states of all models. However, the type of the HMM set so created will still be SHARED and the internal representation will be the same as for any other set of parameter tyings. To obtain the optimised representation of the tied-mixture weights described in section 7.5, the following HHED HK command must be issued
HK TIEDHS
This will convert the internal representation to the special tied-mixture
form in which all of the tied mixtures are stored in a global table and
referenced implicitly instead
of being referenced explicitly using pointers.
Tied-mixture HMMs work best if the information relating to different sources such as delta coefficients and energy are separated into distinct data streams. This can be done by setting up multiple data stream HMMs from the outset. However, it is simpler to use the SS command in HHED to split the data streams of the currently loaded HMM set. Thus, for example, the command
SS 4
would convert the currently loaded HMMs to use four separate data streams
rather than one. When used in the construction of tied-mixture HMMs
this is analogous to the use of multiple codebooks in discrete density
HMMs.
The procedure for building a set of tied-mixture HMMs may be summarised by the following HHED commands
SS 4
JO 256 2.0
TI st1 {*.state[2-4].stream[1].mix}
JO 128 2.0
TI st2 {*.state[2-4].stream[2].mix}
JO 128 2.0
TI st3 {*.state[2-4].stream[3].mix}
JO 64 2.0
TI st4 {*.state[2-4].stream[4].mix}
HK TIEDHS
When evaluating probabilities in tied-mixture systems, it is often sufficient to sum just the most likely mixture components since for any particular input vector, its probability with respect to many of the Gaussian components will be very low. HTK tools recognise TIEDHS HMM sets as being special in the sense that additional optimisations are possible. When full tied-mixtures are used, then an additional layer of pruning is applied. At each time frame, the log probability of the current observation is computed for each mixture component. Then only those components which lie within a threshold of the most likely component are retained. This pruning is controlled by the -c option in HREST, HEREST and HVITE.
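The pruning step can be illustrated with a minimal Python sketch. It assumes diagonal-covariance Gaussians and a hypothetical `beam` parameter playing the role of the threshold; the function names are illustrative and do not correspond to HTK's internal code or to the exact meaning of the -c option.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def prune_components(x, pool, beam):
    """Return {index: log prob} for pool components within `beam` of the best.

    `pool` is a list of (mean, var) pairs shared by all states.
    """
    logps = [log_gauss_diag(x, m, v) for m, v in pool]
    best = max(logps)
    return {i: lp for i, lp in enumerate(logps) if lp >= best - beam}


def state_log_likelihood(weights, kept):
    """Log of the weighted sum over the retained components only."""
    best = max(kept.values())
    total = sum(weights[i] * math.exp(lp - best) for i, lp in kept.items())
    return best + math.log(total)


# Three pooled one-dimensional components; the third is far from the observation.
pool = [([0.0], [1.0]), ([5.0], [1.0]), ([20.0], [1.0])]
kept = prune_components([0.2], pool, beam=15.0)

# The retained set is computed once per frame and reused for every state,
# since all states share the pool and differ only in their weights.
state_a = state_log_likelihood([0.5, 0.3, 0.2], kept)
state_b = state_log_likelihood([0.1, 0.1, 0.8], kept)
```

The component log probabilities are computed once per frame for the whole pool, so each state only pays for a short weighted sum over the surviving components.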