Strings and Names

Many HTK definition files include names of various types of objects: for example labels, model names, words, etc. In order to achieve some uniformity, HTK applies standard rules for reading strings which are names. These rules are not, however, necessary when using the language modelling tools - see below.

A name string consists of a single white-space-delimited word or a quoted string. Either the single quote ' or the double quote " can be used to quote strings, but the start and end quotes must match. The backslash \ character can also be used to introduce otherwise reserved characters. The character following a backslash is inserted into the string without special processing unless that character is a digit in the range 0 to 7. In that case, the three characters following the backslash are read and interpreted as an octal character code. If any of those three characters is not an octal digit, the result is not well defined.

In summary, the special processing is:

    Notation   Meaning
    \\         \
    \_         represents a space that will not terminate a string
    \'         ' (and will not end a quoted string)
    \"         " (and will not end a quoted string)
    \nnn       the character with octal code \nnn

Note that the above allows the same effect to be achieved in a number of different ways. For example,

    "\"QUOTE" 
    \"QUOTE 
    '"QUOTE' 
    \042QUOTE
all produce the string "QUOTE.
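
The rules above can be illustrated with a small reader. What follows is a minimal sketch in Python, not HTK's own implementation: the function names read_name and read_names are hypothetical, and raising an error for a malformed octal escape or an unterminated quote is a choice made here, since the text only says the result is not well defined. The sketch also assumes that the \_ row of the table denotes a backslash followed by a literal space, which follows from the rule that an escaped character is inserted without special processing.

```python
OCTAL_DIGITS = "01234567"


def read_name(text, pos=0):
    """Read one name string from text, starting at index pos.

    Returns (name, next_pos), or (None, pos) when only white space remains.
    Illustrative sketch of the rules described above, not HTK code.
    """
    n = len(text)
    # Skip leading white space before the name.
    while pos < n and text[pos].isspace():
        pos += 1
    if pos >= n:
        return None, pos

    # A leading ' or " starts a quoted string that must end with the same quote.
    quote = None
    if text[pos] in ("'", '"'):
        quote = text[pos]
        pos += 1

    out = []
    while pos < n:
        ch = text[pos]
        if quote is not None and ch == quote:
            return "".join(out), pos + 1  # matching end quote
        if quote is None and ch.isspace():
            break  # white space ends an unquoted name
        if ch == "\\":
            pos += 1
            if pos >= n:
                raise ValueError("backslash at end of input")
            ch = text[pos]
            if ch in OCTAL_DIGITS:
                # \nnn: exactly three octal digits give a character code.
                digits = text[pos:pos + 3]
                if len(digits) != 3 or any(d not in OCTAL_DIGITS for d in digits):
                    # Assumption: reject input the text calls "not well defined".
                    raise ValueError("malformed octal escape")
                out.append(chr(int(digits, 8)))
                pos += 3
                continue
            # Any other escaped character is taken literally.
        out.append(ch)
        pos += 1

    if quote is not None:
        raise ValueError("unterminated quoted string")
    return "".join(out), pos


def read_names(text):
    """Split text into a list of name strings using read_name."""
    names, pos = [], 0
    while True:
        name, pos = read_name(text, pos)
        if name is None:
            return names
        names.append(name)


if __name__ == "__main__":
    # The four spellings from the example above all give the same name.
    for src in ['"\\"QUOTE"', '\\"QUOTE ', "'\"QUOTE'", "\\042QUOTE"]:
        print(read_name(src)[0])
```

Under these assumptions, read_names applied to the line a\ b "c d" e returns the three names a b, c d and e, because the escaped space and the quoted space do not terminate a name.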

The only exceptions to the above general rules are:

Note that, under some versions of Unix, HTK can support the 8-bit character sets used to represent various orthographies. In such cases, the shell environment variable $LANG usually governs which ISO character set is in use.


