%/* ----------------------------------------------------------- */
%/*                                                             */
%/*                          ___                                */
%/*                       |_| | |_/   SPEECH                    */
%/*                       | | | | \   RECOGNITION               */
%/*                       =========   SOFTWARE                  */
%/*                                                             */
%/*                                                             */
%/* ----------------------------------------------------------- */
%/* developed at:                                               */
%/*                                                             */
%/*      Speech Vision and Robotics group                       */
%/*      Cambridge University Engineering Department            */
%/*      http://svr-www.eng.cam.ac.uk/                          */
%/*                                                             */
%/*      Entropic Cambridge Research Laboratory                 */
%/*      (now part of Microsoft)                                */
%/*                                                             */
%/* ----------------------------------------------------------- */
%/*         Copyright: Microsoft Corporation                    */
%/*          1995-2000 Redmond, Washington USA                  */
%/*                    http://www.microsoft.com                 */
%/*                                                             */
%/*              2002  Cambridge University                     */
%/*                    Engineering Department                   */
%/*                                                             */
%/*   Use of this software is governed by a License Agreement   */
%/*    ** See the file License for the Conditions of Use  **    */
%/*    **     This banner notice must not be removed      **    */
%/*                                                             */
%/* ----------------------------------------------------------- */
%
\mychap{\HTK\ Standard Lattice Format (SLF)}{htkslf}
\index{standard lattice format!definition}

\mysect{SLF Files}{slffiles}

Lattices in \HTK\ are used for storing multiple
hypotheses\index{multiple hypotheses} from the output of a speech
recogniser and for specifying finite state syntax networks for
recognition. The \HTK\ standard lattice format (SLF) is designed to
be extensible and to be able to serve a variety of purposes. However,
in order to facilitate the transfer of lattices\index{lattices}, it
incorporates a core set of common features.

An SLF file can contain zero or more sub-lattices\index{sub-lattices}
followed by a main lattice. Sub-lattices are used for defining
sub-networks prior to their use in subsequent sub-lattices or the main
lattice. They are identified by the presence of a
\texttt{SUBLAT}\index{sublat@\texttt{SUBLAT}} field and they are
terminated by a single period on a line by itself. Sub-lattices offer
a convenient way to structure finite state grammar networks.
They are
never used in the output word lattices generated by a decoder. Some
lattice processing operations like lattice pruning or expansion will
destroy the sub-lattice structure, i.e.\ expand all sub-lattice
references and generate one unstructured lattice.

A lattice consists of optional header\index{lattice!header}
information followed by a sequence of node definitions and a sequence
of link (arc) definitions. Nodes and links are numbered and the first
definition line must give the total number of each.

Each link\index{lattice!link} represents a word instance occurring
between two nodes; however, for more compact storage the nodes often
hold the word labels, since these are frequently common to all words
entering a node (the node effectively represents the end of several
word instances). This is also used in lattices representing word-level
networks where each node is a word end, and each arc is a word
transition.

Each node\index{lattice!node} may optionally be labelled with a word
hypothesis and with a time. Each link has a start and end node number
and may optionally be labelled with a word hypothesis (including the
pronunciation variant, acoustic score and segmentation of the word
hypothesis) and a language model score.

The lattice must have exactly one start node (no incoming arcs) and
one end node (no outgoing arcs). The special word identifier
\verb|!NULL| can be used for the start and end node if necessary.

\mysect{Format}{slfformat}

The format\index{lattice!format} is designed to allow optional
information that at its most detailed gives full identity, alignment
and score (log likelihood) information at the word and phone level to
allow calculation of the alignment and likelihood of an individual
hypothesis. However, without scores or times the lattice is just a
word graph. The format is designed to be extensible.
Further field
names can be defined to allow arbitrary information to be added to the
lattice without making the resulting file unreadable by others.

The lattices are stored in a text file as a series of fields that form
two blocks:
\begin{itemize}
\item A header, specifying general information about the lattice.
\item The node and link definitions.
\end{itemize}
Either block may contain comment lines\index{lattice!comment lines},
for which the first character is a `\#' and the rest of the line is
ignored.

All non-comment lines consist of fields, separated by white space.
Fields consist of an alphanumeric field name, followed by a delimiter
(the character `=' or `\verb|~|') and a (possibly ``quoted'') field
value. Single character field names are reserved for fields defined
in the specification and single character abbreviations may be used
for many of the fields defined below. Field values can be specified
either as normal text (e.g.\ \verb|a=-318.31|) or in a binary
representation if the `=' character is replaced by `\verb|~|'. The
binary representation consists of a 4-byte floating point number (IEEE
754) or a 4-byte integer number stored in big-endian byte order by
default (see section~\ref{s:byteswap} for a discussion of different
byte-orders in HTK).

The convention used to define the current field
names\index{lattice!field names} is that lower case is used for
optional fields and upper case is used for required fields. The
meaning of field names can be dependent on the context in which they
appear.

The header must include a field specifying which utterance was used to
generate the lattice and a field specifying the version of the lattice
specification used.
The header is terminated by a line which defines
the number of nodes and links in the lattice.

The node definitions are optional but if included each node definition
consists of a single line which specifies the node number followed by
optional fields that may (for instance) define the time of the node or
the word hypothesis ending at that node.

The link definitions are required and each link definition consists of
a single line which specifies the link number as well as the start and
end node numbers that it connects to and optionally other information
about the link such as the word identity and language model score. If
word identity information is not present in node definitions then it
must appear in link definitions.

\mysect{Syntax}{slfsyntax}

The following rules define the syntax\index{lattice!syntax} of an SLF
lattice. Any unrecognised fields will be ignored and no user defined
fields may share the first character with pre-defined field names. The
syntax specification below employs the modified BNF notation used in
section~\ref{s:hmmdef}.
For the node and arc field names only the
abbreviated names are given and only the text format is documented in
the syntax.

\begin{verbatim}
latticedef  = latticehead
              lattice { lattice }

latticehead = "VERSION=" number
              "UTTERANCE=" string
              "SUBLAT=" string
              { "vocab=" string | "hmms=" string | "lmname=" string |
                "wdpenalty=" floatnumber | "lmscale=" floatnumber |
                "acscale=" floatnumber | "base=" floatnumber |
                "tscale=" floatnumber }

lattice     = sizespec
              { node }
              { arc }

sizespec    = "N=" intnumber "L=" intnumber

node        = "I=" intnumber
              { "t=" floatnumber | "W=" string |
                "s=" string | "L=" string | "v=" intnumber }

arc         = "J=" intnumber
              "S=" intnumber
              "E=" intnumber
              { "a=" floatnumber | "l=" floatnumber | "r=" floatnumber |
                "W=" string | "v=" intnumber | "d=" segments }

segments    = ":" segment {segment}

segment     = string [ "," floatnumber [ "," floatnumber ]] ":"
\end{verbatim}

\mysect{Field Types}{slffields}

The currently defined fields are as follows:

\begin{verbatim}
 Field          abbr o|c  Description

Header fields
 VERSION=%s      V    o   Lattice specification adhered to
 UTTERANCE=%s    U    o   Utterance identifier
 SUBLAT=%s       S    o   Sub-lattice name
 acscale=%f           o   Scaling factor for acoustic likelihoods
 tscale=%f            o   Scaling factor for times (default 1.0, i.e. seconds)
 base=%f              o   LogBase for Likelihoods (0.0 not logs, default base e)
 lmname=%s            o   Name of Language model
 lmscale=%f           o   Scaling factor for language model
 wdpenalty=%f         o   Word insertion penalty

Lattice Size fields
 NODES=%d        N    c   Number of nodes in lattice
 LINKS=%d        L    c   Number of links in lattice

Node Fields
 I=%d                     Node identifier. Starts node information
 time=%f         t    o   Time from start of utterance (in seconds)
 WORD=%s         W    wc  Word (If lattice labels nodes rather than links)
 L=%s                 wc  Substitute named sub-lattice for this node
 var=%d          v    wo  Pronunciation variant number
 s=%s            s    o   Semantic Tag

Link Fields
 J=%d                     Link identifier. Starts link information
 START=%d        S    c   Start node number (of the link)
 END=%d          E    c   End node number (of the link)
 WORD=%s         W    wc  Word (If lattice labels links rather than nodes)
 var=%d          v    wo  Pronunciation variant number
 div=%s          d    wo  Segmentation (modelname, duration, likelihood) triples
 acoustic=%f     a    wo  Acoustic likelihood of link
 language=%f     l    o   General language model likelihood of link
 r=%f            r    o   Pronunciation probability

Note: The word identity (and associated `w' fields var, div and acoustic)
      must appear on either the link or the end node.

      abbr is a possible single character abbreviation for the field name
      o|c indicates whether field is optional or compulsory.
\end{verbatim}

%              ngram=%f       n    o     NGram likelihood of link

\mysect{Example SLF file}{slfeg}

The following is a real lattice (generated by the \HTK\ Switchboard
Large Vocabulary System with a 54k dictionary and a word fourgram LM)
with word labels occurring on the end nodes of the links.

Note that the \verb|!SENT_START| and \verb|!SENT_END| ``words'' model
initial and final silence.

\begin{verbatim}
VERSION=1.0
UTTERANCE=s22-0017-A_0017Af-s22_000070_000157.plp
lmname=/home/solveb/hub5/lib/lang/fgintcat_54khub500.txt
lmscale=12.00  wdpenalty=-10.00
vocab=/home/solveb/hub5/lib/dicts/54khub500v3.lvx.dct
N=32   L=45
I=0    t=0.00  W=!NULL
I=1    t=0.05  W=!SENT_START  v=1
I=2    t=0.05  W=!SENT_START  v=1
I=3    t=0.15  W=!SENT_START  v=1
I=4    t=0.15  W=!SENT_START  v=1
I=5    t=0.19  W=HOW          v=1
I=6    t=0.29  W=UM           v=1
I=7    t=0.29  W=M            v=1
I=8    t=0.29  W=HUM          v=1
I=9    t=0.70  W=OH           v=1
I=10   t=0.70  W=O            v=1
I=11   t=0.70  W=KOMO         v=1
I=12   t=0.70  W=COMO         v=1
I=13   t=0.70  W=CUOMO        v=1
I=14   t=0.70  W=HELLO        v=1
I=15   t=0.70  W=OH           v=1
I=16   t=0.70  W=LOW          v=1
I=17   t=0.71  W=HELLO        v=1
I=18   t=0.72  W=HELLO        v=1
I=19   t=0.72  W=HELLO        v=1
I=20   t=0.72  W=HELLO        v=1
I=21   t=0.73  W=CUOMO        v=1
I=22   t=0.73  W=HELLO        v=1
I=23   t=0.77  W=I            v=1
I=24   t=0.78  W=I'M          v=1
I=25   t=0.78  W=TO           v=1
I=26   t=0.78  W=AND          v=1
I=27   t=0.78  W=THERE        v=1
I=28   t=0.79  W=YEAH         v=1
I=29   t=0.80  W=IS           v=1
I=30   t=0.88  W=!SENT_END    v=1
I=31   t=0.88  W=!NULL
J=0    S=0   E=1   a=-318.31   l=0.000
J=1    S=0   E=2   a=-318.31   l=0.000
J=2    S=0   E=3   a=-1094.09  l=0.000
J=3    S=0   E=4   a=-1094.09  l=0.000
J=4    S=2   E=5   a=-1063.12  l=-5.496
J=5    S=3   E=6   a=-1112.78  l=-4.395
J=6    S=4   E=7   a=-1086.84  l=-9.363
J=7    S=2   E=8   a=-1876.61  l=-7.896
J=8    S=6   E=9   a=-2673.27  l=-5.586
J=9    S=7   E=10  a=-2673.27  l=-2.936
J=10   S=1   E=11  a=-4497.15  l=-17.078
J=11   S=1   E=12  a=-4497.15  l=-15.043
J=12   S=1   E=13  a=-4497.15  l=-12.415
J=13   S=2   E=14  a=-4521.94  l=-7.289
J=14   S=8   E=15  a=-2673.27  l=-3.422
J=15   S=5   E=16  a=-3450.71  l=-8.403
J=16   S=2   E=17  a=-4635.08  l=-7.289
J=17   S=2   E=18  a=-4724.45  l=-7.289
J=18   S=2   E=19  a=-4724.45  l=-7.289
J=19   S=2   E=20  a=-4724.45  l=-7.289
J=20   S=1   E=21  a=-4796.74  l=-12.415
J=21   S=2   E=22  a=-4821.53  l=-7.289
J=22   S=18  E=23  a=-435.64   l=-4.488
J=23   S=18  E=24  a=-524.33   l=-3.793
J=24   S=19  E=25  a=-520.16   l=-4.378
J=25   S=20  E=26  a=-521.50   l=-3.435
J=26   S=17  E=27  a=-615.12   l=-4.914
J=27   S=22  E=28  a=-514.04   l=-5.352
J=28   S=21  E=29  a=-559.43   l=-1.876
J=29   S=9   E=30  a=-1394.44  l=-2.261
J=30   S=10  E=30  a=-1394.44  l=-1.687
J=31   S=11  E=30  a=-1394.44  l=-2.563
J=32   S=12  E=30  a=-1394.44  l=-2.352
J=33   S=13  E=30  a=-1394.44  l=-3.285
J=34   S=14  E=30  a=-1394.44  l=-0.436
J=35   S=15  E=30  a=-1394.44  l=-2.069
J=36   S=16  E=30  a=-1394.44  l=-2.391
J=37   S=23  E=30  a=-767.55   l=-4.081
J=38   S=24  E=30  a=-692.95   l=-3.868
J=39   S=25  E=30  a=-692.95   l=-2.553
J=40   S=26  E=30  a=-692.95   l=-3.294
J=41   S=27  E=30  a=-692.95   l=-0.855
J=42   S=28  E=30  a=-623.50   l=-0.762
J=43   S=29  E=30  a=-556.71   l=-3.019
J=44   S=30  E=31  a=0.00      l=0.000
\end{verbatim}

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "htkbook"
%%% End:
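
As an illustration of the text format described in this chapter, the
following Python sketch reads the header, size, node and link lines of
a simple SLF file. It is not part of \HTK: the names
\texttt{parse\_fields}, \texttt{parse\_slf} and \texttt{SAMPLE} are
hypothetical, and the toy lattice is made up for the example. The
sketch assumes unquoted text fields only, so it rejects the binary
`\verb|~|' delimiter and does not handle quoted values containing
white space or sub-lattices.

```python
def parse_fields(line):
    """Split one SLF text line into a list of (name, value) pairs."""
    fields = []
    for token in line.split():
        name, sep, value = token.partition("=")
        if not sep:
            # Binary fields (name~value) and malformed tokens end up here.
            raise ValueError(f"not a text name=value field: {token!r}")
        fields.append((name, value))
    return fields


def parse_slf(text):
    """Return (header, size, nodes, links) with all values kept as strings."""
    header, size, nodes, links = {}, {}, {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment line
        fields = parse_fields(line)
        first_name, first_value = fields[0]
        if first_name == "I":            # node definition
            nodes[int(first_value)] = dict(fields[1:])
        elif first_name == "J":          # link definition
            links[int(first_value)] = dict(fields[1:])
        elif first_name in ("N", "NODES", "L", "LINKS"):
            size = dict(fields)          # size line ends the header
        else:
            header.update(fields)        # any other line is header data
    declared = size.get("L", size.get("LINKS"))
    if declared is not None and int(declared) != len(links):
        raise ValueError("link count does not match the size line")
    return header, size, nodes, links


# Toy lattice, invented for this example.
SAMPLE = """\
# toy lattice
VERSION=1.0
UTTERANCE=toy
lmscale=12.00 wdpenalty=-10.00
N=3 L=2
I=0 t=0.00 W=!NULL
I=1 t=0.05 W=!SENT_START v=1
I=2 t=0.19 W=HOW v=1
J=0 S=0 E=1 a=-318.31 l=0.000
J=1 S=1 E=2 a=-1063.12 l=-5.496
"""

header, size, nodes, links = parse_slf(SAMPLE)
print(nodes[2]["W"], float(links[1]["a"]))  # prints: HOW -1063.12
```

Values are left as strings so that the caller decides how to interpret
each field; for instance \texttt{float(links[1]["a"])} gives the
acoustic likelihood of link 1.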