To conclude this chapter,
this section presents a formal description
of the HMM definition language used by HTK.
Syntax is described using an extended BNF notation in which
alternatives are separated by a vertical bar |, parentheses () denote
factoring, brackets [ ] denote options, and braces {} denote zero or more
repetitions.
All keywords are enclosed in angle brackets and the case of the keyword name is not significant. White space is not significant except within double-quoted strings.
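
The lexical conventions above can be sketched as a small tokenizer. This is only an illustration, not HTK's own reader: the regular expression, the function name `tokenize` and the choice to fold keywords to upper case are assumptions made for the example, as is the treatment of macro markers as a tilde followed by a single letter. The sample text is a deliberately incomplete fragment.

```python
import re

# Hypothetical token pattern: <keyword>, ~x macro marker,
# "quoted string", or a bare word/number.
TOKEN = re.compile(r'<[^<>]*>|~[a-z]|"[^"]*"|[^\s<>"~]+')

def tokenize(text):
    """Split definition text into tokens, folding keyword case."""
    tokens = []
    for tok in TOKEN.findall(text):
        if tok.startswith("<"):
            # Keyword case is not significant, so normalise it.
            tok = "<" + tok[1:-1].strip().upper() + ">"
        tokens.append(tok)
    return tokens

sample = '~h "ph one"   <BeginHMM>\n <numstates> 3 <EndHMM>'
print(tokenize(sample))
```

White space between tokens is discarded, while the space inside the double-quoted name is kept because the quoted string is matched as a single token.
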
The top level structure of a HMM definition is shown by the following rule.
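
    hmmdef = [ ~h macro ]
             <BeginHMM>
               [ globalOpts ]
               <NumStates> short
               state { state }
               transP
               [ duration ]
             <EndHMM>

A HMM definition consists of an optional set of global options followed by the number of states, the definition of each state, the transition matrix and, optionally, the duration parameters for the whole model. The optional ~h macro gives the name of the HMM.
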
The global options are common to all HMMs. They can be given
separately using a ~o option macro

    optmacro = ~o globalOpts

or they can be included in one or more HMM definitions. Global
options may be repeated but no definition can change a previous
definition. All global options must be defined before any other
macro definition is processed. In practice this means that any
HMM system which uses parameter tying must have a ~o option macro ahead of every other macro definition.

The full set of global options is given below. Every HMM set must
define the vector size (via <VecSize>), the stream widths (via <StreamInfo>)
and the observation parameter kind. However, if only the stream
widths are given, then the vector size will be inferred. If
only the vector size is given, then a single stream of identical
width will be assumed. All other options default to null.

    globalOpts = option { option }
    option     = <HmmSetId> string |
                 <StreamInfo> short { short } |
                 <VecSize> short |
                 <ProjSize> short |
                 <InputXform> inputXform |
                 <ParentXform> ~a macro |
                 covkind |
                 durkind |
                 parmkind

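
The defaulting rules for the vector size and stream widths can be sketched as follows. The function name `resolve_sizes` is hypothetical, and two points are assumptions rather than statements from the text: that the inferred vector size is the sum of the stream widths, and that the two values must agree when both are supplied.

```python
def resolve_sizes(vec_size=None, stream_widths=None):
    """Return (vec_size, stream_widths) after applying the defaults."""
    if vec_size is None and stream_widths is None:
        raise ValueError("need <VecSize> or <StreamInfo>")
    if stream_widths is None:
        # Only the vector size given: one stream of identical width.
        stream_widths = [vec_size]
    elif vec_size is None:
        # Only the stream widths given: infer the vector size
        # (assumed here to be the sum of the widths).
        vec_size = sum(stream_widths)
    elif vec_size != sum(stream_widths):
        # Assumed consistency check when both are present.
        raise ValueError("stream widths do not sum to the vector size")
    return vec_size, list(stream_widths)

print(resolve_sizes(vec_size=39))
print(resolve_sizes(stream_widths=[12, 12, 12, 3]))
```
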
The covkind defines the kind of the covariance matrix

    covkind = <DiagC> | <InvDiagC> | <FullC> | <LLTC> | <XformC>

The durkind denotes the type of duration model used according to the following rules

    durkind = <nullD> | <poissonD> | <gammaD> | <genD>

For anything other than <nullD>, a duration vector must be supplied, using the duration syntax given later in this section.

The parameter kind is any legal parameter kind including qualified forms (see section 5.1)

    parmkind = <basekind{_D|_A|_T|_E|_N|_Z|_O|_V|_C|_K}>
    basekind = <discrete> | <lpc> | <lpcepstra> | <mfcc> | <fbank> |
               <melspec> | <lprefc> | <lpdelcep> | <user>

where the syntax rule for parmkind is non-standard in that no spaces are allowed between the base kind and any subsequent qualifiers.

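
As a rough illustration of the parmkind rule, the sketch below splits a parameter kind keyword into its base kind and qualifiers and rejects embedded spaces. The function name `split_parmkind` is hypothetical, and folding the name to upper case is an assumption based on keyword case not being significant.

```python
BASE_KINDS = {"DISCRETE", "LPC", "LPCEPSTRA", "MFCC", "FBANK",
              "MELSPEC", "LPREFC", "LPDELCEP", "USER"}
QUALIFIERS = set("DATENZOVCK")

def split_parmkind(keyword):
    """Split e.g. '<MFCC_E_D>' into ('MFCC', ['E', 'D'])."""
    name = keyword.strip()
    if not (name.startswith("<") and name.endswith(">")):
        raise ValueError("parmkind must be enclosed in angle brackets")
    inner = name[1:-1]
    if any(ch.isspace() for ch in inner):
        # No spaces are allowed between base kind and qualifiers.
        raise ValueError("no spaces allowed inside a parmkind")
    base, *quals = inner.upper().split("_")
    if base not in BASE_KINDS:
        raise ValueError("unknown base kind: " + base)
    for q in quals:
        if q not in QUALIFIERS:
            raise ValueError("unknown qualifier: _" + q)
    return base, quals

print(split_parmkind("<mfcc_E_D_A>"))
```
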
Each state of each HMM must have its own section defining the parameters associated with that state

    state     = <State> short stateinfo
    stateinfo = ~s macro |
                [ mixes ] [ weights ] stream { stream } [ duration ]
    macro     = string

where the short following <State> is the state number. A stateinfo definition consists of an optional specification of the number of mixture components, an optional set of stream weights, followed by a block of information for each stream, optionally terminated with a duration vector. Alternatively, a ~s macro can be given in its place.

The optional mixes in a stateinfo definition specify the number of mixture components (or discrete codebook size) for each stream of that state

    mixes = <NumMixes> short { short }

where there should be one short for each stream. If this specification is omitted, it is assumed that all streams have just one mixture component.

The optional weights in a stateinfo definition define a set of exponent weights for each independent data stream. The syntax is

    weights = ~w macro | <SWeights> short vector
    vector  = float { float }

where the short gives the number of stream weights in the vector.

The definition of each stream depends on the kind of HMM set. In the normal case, it consists of a sequence of mixture component definitions optionally preceded by the stream number. If the stream number is omitted then it is assumed to be 1. For tied-mixture and discrete HMM sets, special forms are used.

    stream = [ <Stream> short ]
             ( mixture { mixture } | tmixpdf | discpdf )

The definition of each mixture component consists of a Gaussian pdf optionally preceded by the mixture number and its weight

    mixture = [ <Mixture> short float ] mixpdf

The tmixpdf option is used only for fully tied mixture sets. Since the mixpdf parts are all macros in a tied mixture system and since they are identical for every stream and state, it is only necessary to know the mixture weights. The tmixpdf syntax allows these to be specified in the following compact form

    tmixpdf    = <TMix> macro weightList
    weightList = repShort { repShort }
    repShort   = short [ * char ]

where each short is a mixture component weight scaled so that a weight of 1.0 is represented by the integer 32767. The optional asterisk followed by a char is used to indicate a repeat count. For example, 0*5 is equivalent to 5 zeroes. The Gaussians which make up the pool of tied mixtures are defined using ~m macros.

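
The compact weight list can be expanded along the following lines. This is a minimal sketch for the tied-mixture case only: the function name `expand_weight_list` is hypothetical, and the repeat count is read as a plain decimal integer, which is an assumption about how the char is written.

```python
import re

def expand_weight_list(text):
    """Expand e.g. '32767 0*3' into a list of float weights."""
    # White space is not significant, so allow '0 * 3' as well as '0*3'.
    compact = re.sub(r"\s*\*\s*", "*", text)
    weights = []
    for item in compact.split():
        value, _, count = item.partition("*")
        repeat = int(count) if count else 1
        # A weight of 1.0 is represented by the integer 32767.
        weights.extend([int(value) / 32767.0] * repeat)
    return weights

print(expand_weight_list("32767 0*3"))
```

For discrete probability HMMs the same list syntax is used, but the values are scaled log probabilities, so the division by 32767 above would not apply.
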
Discrete probability HMMs are defined in a similar way

    discpdf = <DProb> weightList

The only difference is that the weights in the weightList are scaled log probabilities as defined in section 7.6.

The definition of a Gaussian pdf requires the mean vector to be given and one of the possible forms of covariance

    mixpdf  = ~m macro | mean cov [ <GConst> float ]
    mean    = ~u macro | <Mean> short vector
    cov     = var | inv | xform
    var     = ~v macro | <Variance> short vector
    inv     = ~i macro | ( <InvCovar> | <LLTCovar> ) short tmatrix
    xform   = ~x macro | <Xform> short short matrix
    matrix  = float { float }
    tmatrix = matrix

In mean and var, the short preceding the vector defines the length of the vector. In inv, the short preceding the tmatrix gives the size of this square upper triangular matrix. In xform, the two shorts preceding the matrix give the number of rows and columns. The optional <GConst> gives the part of the Gaussian log probability that can be precomputed.

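
To make the tmatrix convention concrete, the sketch below unpacks the floats of an upper triangular matrix of size n into a full square array. The function name `unpack_upper_triangular` is hypothetical, and the storage order is an assumption: values are taken row by row, with only the elements on or above the diagonal present, so n(n+1)/2 floats are expected.

```python
def unpack_upper_triangular(n, values):
    """Rebuild an n x n upper triangular matrix from its packed values."""
    expected = n * (n + 1) // 2
    if len(values) != expected:
        raise ValueError("expected %d values, got %d" % (expected, len(values)))
    matrix = [[0.0] * n for _ in range(n)]
    it = iter(values)
    for i in range(n):
        # Row i holds the entries from the diagonal to the last column.
        for j in range(i, n):
            matrix[i][j] = float(next(it))
    return matrix

for row in unpack_upper_triangular(3, [1, 2, 3, 4, 5, 6]):
    print(row)
```
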
In addition to defining the output distributions, a state can have a duration probability distribution defined for it. However, no current HTK tool can estimate or use these.

    duration = ~d macro | <Duration> short vector

Alternatively, as shown by the top level syntax for a hmmdef, duration parameters can be specified for a whole model.

The transition matrix is defined by

    transP = ~t macro | <TransP> short matrix

where the short in this case should be equal to the number of states in the model.

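
A reader for the explicit form of the transition matrix might check the size against the number of states before reshaping the values. This is a sketch under stated assumptions: the function name `read_transp` is hypothetical, the input is the list of tokens that follow the <TransP> keyword, and the matrix is assumed to be written row by row.

```python
def read_transp(tokens, num_states):
    """Parse 'short matrix' and return the matrix as a list of rows."""
    n = int(tokens[0])
    if n != num_states:
        # The short should equal the number of states in the model.
        raise ValueError("matrix size %d != number of states %d" % (n, num_states))
    values = [float(t) for t in tokens[1:1 + n * n]]
    if len(values) != n * n:
        raise ValueError("expected %d matrix entries" % (n * n))
    # Assumed row-major layout: n consecutive values per row.
    return [values[r * n:(r + 1) * n] for r in range(n)]

rows = read_transp("3  0 1 0  0 0.6 0.4  0 0 0".split(), num_states=3)
for row in rows:
    print(row)
```
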
To support HMM adaptation (as described in chapter 9) baseclasses and regression class trees are defined. A baseclass is defined as

    baseClass = ~b macro baseopts classes
    baseopts  = <MMFIdMask> string
                <Parameters> baseKind
                <NumClasses> int
    baseKind  = MIXBASE | MEANBASE | COVBASE
    classes   = <Class> int itemlist { classes }

where itemlist is a list of mixture components specified using the same conventions as the HHEd command described in section 10.3. A regression class tree may also exist for an HMM set. This is defined by

    regTree     = ~r macro <BaseClass> baseclasses node
    baseclasses = ~b macro | baseopts classes
    node        = ( <Node> int int int { int } |
                    <TNode> int int int { int } ) { node }

Adaptation transforms are defined using

    adaptXForm  = ~a macro adaptOpts <XformSet> xformset
    adaptOpts   = <AdaptKind> adaptKind
                  <BaseClass> baseclasses
                  [ <ParentXForm> parentxform ]
    parentxform = ~a macro | adaptOpts <XformSet> xformset
    adaptKind   = TREE | BASE
    xformset    = <XFormKind> xformKind
                  <NumXForms> int { linxform }
    xformKind   = MLLRMEAN | MLLRCOV | MLLRVAR | CMLLR | SEMIT
    linxform    = <LinXForm> int <VecSize> int
                  [ <OFFSET> xformbias ]
                  <BlockInfo> int int { int } block { block }
    xformbias   = ~y macro | <Bias> short vector
    block       = <Block> int xform

Finally the input transform is defined by

    inputXform = ~j macro | inhead inmatrix
    inhead     = <MMFIdMask> string parmkind [ <PreQual> ]
    inmatrix   = <LinXform> <VecSize> int
                 <BlockInfo> int int { int } block { block }
    block      = <Block> int xform