<!-- $Id: anti-ngram.1,v 1.6 2004/12/03 17:59:01 stolcke Exp $ -->
<HTML>
<HEAD>
<TITLE>anti-ngram</TITLE>
</HEAD>
<BODY>
<H1>anti-ngram</H1>
<H2> NAME </H2>
anti-ngram - count posterior-weighted N-grams in N-best lists
<H2> SYNOPSIS </H2>
<B>anti-ngram</B> [<B>-help</B>] <I>option</I> ...
<H2> DESCRIPTION </H2>
<B>anti-ngram</B>
counts the N-grams in a set of N-best hypothesis lists.
Each N-gram count is weighted by the posterior probability of the
hypothesis in which the N-gram occurs.
Thus, <B>anti-ngram</B>
can be used to construct language models of word sequences
that are acoustically confusable with correct hypotheses.
The output counts should be processed with
<B>ngram-count -float-counts</B>
to estimate a language model.
<H2> OPTIONS </H2>
<P>
Each filename argument can be an ASCII file, a compressed file
(name ending in .Z or .gz), or "-" to indicate
stdin/stdout.
</P>
<DL>
<DT><B>-help</B>
<DD>
Print option summary.
<DT><B>-version</B>
<DD>
Print version information.
<DT><B>-refs</B> <I>file</I>
<DD>
Read the reference transcripts from <I>file</I>.
Each line should contain an utterance ID followed by the transcript words.
<DT><B>-nbest-files</B> <I>file</I>
<DD>
Read the list of N-best files from <I>file</I>.
The base components of the filenames must correspond to the utterance IDs found
in the reference file.
<DT><B>-max-nbest</B> <I>n</I>
<DD>
Limit the number of hypotheses read from each N-best list to the first
<I>n</I>.
<DT><B>-order</B> <I>n</I>
<DD>
Set the maximal order (length) of N-grams to count.
The default order is 3.
<DT><B>-lm</B> <I>file</I>
<DD>
Read an ARPA language model from <I>file</I>
and rescore the N-best lists with it prior to counting N-grams.
<DT><B>-classes</B> <I>file</I>
<DD>
Interpret the LM as a class-based N-gram and read class definitions
in <A HREF="classes-format.html">classes-format(5)</A>
from
<I>file</I>.
<DT><B>-tolower</B>
<DD>
Map vocabulary to lowercase, eliminating case distinctions.
<DT><B>-multiwords</B>
<DD>
Split multiwords (words joined by '_') into their components when
reading N-best lists.
<DT><B>-rescore-lmw</B> <I>lmw</I>
<DD>
Set the language model weight used in combining the language model log
probabilities with acoustic log probabilities
(only relevant if separate scores are given in the N-best input).
<DT><B>-rescore-wtw</B> <I>wtw</I>
<DD>
Set the word transition weight used to weight the number of words relative to
the acoustic log probabilities
(only relevant if separate scores are given in the N-best input).
<DT><B>-posterior-scale</B> <I>scale</I>
<DD>
Divide the total weighted log score by <I>scale</I>
when computing normalized posterior probabilities.
This controls the peakedness of the posterior distribution.
The default value is whatever was chosen for
<B>-rescore-lmw</B>,
so that language model scores are scaled to have weight 1,
and acoustic scores have weight 1/<I>lmw</I>.
<DT><B>-all-ngrams</B>
<DD>
Count even those N-grams that occur in the reference string.
By default, N-best N-grams that also occur in the corresponding reference are excluded.
<DT><B>-min-count</B> <I>C</I>
<DD>
Prune all N-grams from the output that have counts less than
<I>C</I>.
<DT><B>-read-counts</B> <I>countsfile</I>
<DD>
Read N-gram counts from a file.
Each line contains an N-gram of words, followed by an integer or fractional count,
all separated by whitespace.
Repeated counts for the same N-gram are added.
N-grams from N-best lists are added to those read with this option.
<DT><B>-write-counts</B> <I>countsfile</I>
<DD>
Write total N-gram counts to
<I>countsfile</I>.
The default is to write to stdout.
<DT><B>-sort</B>
<DD>
Output counts in lexicographic order, as required for
<A HREF="ngram-merge.html">ngram-merge(1)</A>.
<DT><B>-debug</B> <I>level</I>
<DD>
Set debugging output level.
Level 0 means no debugging.
Debugging messages are written to stderr.
</DD>
</DL>
<H2> SEE ALSO </H2>
<A HREF="ngram.html">ngram(1)</A>, <A HREF="ngram-merge.html">ngram-merge(1)</A>,
<A HREF="ngram-count.html">ngram-count(1)</A>, <A HREF="nbest-scripts.html">nbest-scripts(1)</A>,
<A HREF="classes-format.html">classes-format(5)</A>.
<BR>
A. Stolcke et al., "The SRI March 2000 Hub-5 Conversational Speech
Transcription System",
<I>Proc. NIST Speech Transcription Workshop</I>, College Park, MD, 2000.
<H2> BUGS </H2>
There is no
<B>-vocab</B>
option to limit the vocabulary.
<H2> AUTHOR </H2>
Andreas Stolcke &lt;stolcke@speech.sri.com&gt;.
<BR>
Copyright 2000-2004 SRI International
</BODY>
</HTML>