.\" $Id: training-scripts.1,v 1.15 2006/08/11 22:35:11 stolcke Exp $
.TH training-scripts 1 "$Date: 2006/08/11 22:35:11 $" "SRILM Tools"
.SH NAME
training-scripts, compute-oov-rate, continuous-ngram-count, get-gt-counts,
make-abs-discount, make-batch-counts, make-big-lm, make-diacritic-map,
make-google-ngrams, make-gt-discounts, make-kn-counts, make-kn-discounts,
merge-batch-counts, replace-words-with-classes, reverse-ngram-counts,
split-tagged-ngrams, reverse-text, uniform-classes, vp2text
\- miscellaneous conveniences for language model training
.SH SYNOPSIS
.B get-gt-counts
.BI max= K
.BI out= name
.RI [ counts ...]
.br
.B make-abs-discount
.I gtcounts
.br
.B make-gt-discounts
.BI min= min
.BI max= max
.I gtcounts
.br
.B make-kn-counts
.BI order= N
.BI max_per_file= M
.BI output= file
[
.B no_max_order=1
]
.br
.B make-kn-discounts
.BI min= min
.I gtcounts
.br
.B make-batch-counts
.I file-list
.RI [ batch-size
.RI [ filter
.RI [ count-dir
.RI [ options ...]]]]
.br
.B merge-batch-counts
.I count-dir
.RI [ file-list |\c
.IR start-iter ]
.br
.B make-google-ngrams
[
.BI dir= DIR
] [
.BI per_file= N
] [
.B gzip=0
]
.RI [ counts-file ...]
.br
.B continuous-ngram-count
[
.BI order= N
]
.RI [ textfile ...]
.br
.B reverse-ngram-counts
.RI [ counts-file ...]
.br
.B reverse-text
.RI [ textfile ...]
.br
.B split-tagged-ngrams
[
.BI separator= S
]
.RI [ counts-file ...]
.br
.B make-big-lm
.B \-name
.I name
.B \-read
.I counts
[
.B \-trust-totals
.BR \-max-per-file " M"
]
.B \-lm
.I new-model
.RI [ options ...]
.br
.B replace-words-with-classes
.BI classes= classes
[\c
.BI outfile= counts
.BR normalize= 0|1
.BI addone= K
.B have_counts=1
.B partial=1
]
.RI [ textfile ...]
.br
.B uniform-classes
.I classes
.BI > new-classes
.br
.B make-diacritic-map
.I vocab
.br
.B vp2text
.RI [ textfile ...]
.br
.B compute-oov-rate
.I vocab
.RI [ counts ...]
.SH DESCRIPTION
These scripts perform convenience tasks associated with the training of
language models.
They complement and extend the basic N-gram model estimator in
.BR ngram-count (1).
.PP
Since these tools are implemented as scripts they don't
automatically
input or output compressed data files correctly, unlike the main
SRILM tools.
However, since most scripts work with data from standard input or
to standard output (by leaving out the file argument, or specifying it
as ``-'') it is easy to combine them with
.BR gunzip (1)
or
.BR gzip (1)
on the command line.
.PP
Also note that many of the scripts take their options with the
.BR gawk (1)
syntax
.IB option = value
instead of the more common
.BI - option
.IR value .
.PP
.B get-gt-counts
computes the counts-of-counts statistics needed in Good-Turing smoothing.
The frequencies of counts up to
.I K
are computed (default is 10).
The results are stored in a series of files with root
.IR name ,
.BR \fIname\fP.gt1counts ,
.BR \fIname\fP.gt2counts ,
\&...,
.BR \fIname\fP.gt\fIN\fPcounts .
It is assumed that the input counts have been properly merged, i.e.,
that there are no duplicated N-grams.
.PP
.B make-gt-discounts
takes one of the output files of
.B get-gt-counts
and computes the corresponding Good-Turing discounting factors.
The output can then be passed to
.BR ngram-count (1)
via the
.BI \-gt n
options to control the smoothing during model estimation.
Precomputing the GT discounting in this fashion has the advantage that the
GT statistics are not affected by restricting N-grams to a limited vocabulary.
Also,
.BR get-gt-counts / make-gt-discounts
can process arbitrarily large count files, since they do not need to
read the counts into memory (unlike
.BR ngram-count ).
.PP
.B make-abs-discount
computes the absolute discounting constant needed for the
.B ngram-count
.BI \-cdiscount n
options.
Input is one of the files produced by
.BR get-gt-counts .
.PP
.B make-kn-discounts
computes the discounting constants used by the modified Kneser-Ney
smoothing method.
Input is one of the files produced by
.BR get-gt-counts .
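.PP
The precomputed-discount workflow above might be combined as follows.
This is a sketch only; the file names (corpus.txt, corpus.counts,
corpus.lm) and the min/max values are illustrative assumptions.

```shell
# Sketch: precomputed Good-Turing discounting (assumed file names).

# 1. Produce merged trigram counts from the training text.
ngram-count -order 3 -text corpus.txt -write corpus.counts

# 2. Tally counts-of-counts per N-gram order; writes
#    corpus.gt1counts, corpus.gt2counts, corpus.gt3counts.
get-gt-counts max=10 out=corpus corpus.counts

# 3. Turn each counts-of-counts file into discount factors.
for n in 1 2 3; do
    make-gt-discounts min=1 max=7 corpus.gt${n}counts > corpus.gt${n}
done

# 4. Estimate the model, passing the precomputed discounts back in.
ngram-count -order 3 -read corpus.counts \
    -gt1 corpus.gt1 -gt2 corpus.gt2 -gt3 corpus.gt3 \
    -lm corpus.lm
```

Because steps 2 and 3 stream over the count file rather than loading it,
this pipeline also works for counts too large for ngram-count's memory.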
.PP
.B make-batch-counts
performs the first stage in the construction of very large N-gram count files.
.I file-list
is a list of input text files.
Lines starting with a `#' character are ignored.
These files will be grouped into batches of size
.I batch-size
(default 10)
that are then processed in one run of
.B ngram-count
each.
For maximum performance,
.I batch-size
should be as large as possible without triggering paging.
Optionally, a
.I filter
script or program can be given to condition the input texts.
The N-gram count files are left in directory
.I count-dir
(``counts'' by default), where they can be found by a subsequent
run of
.BR merge-batch-counts .
All following
.I options
are passed to
.BR ngram-count ,
e.g., to control N-gram order, vocabulary, etc.
(no options triggering model estimation should be included).
.PP
.B merge-batch-counts
completes the construction of large count files by merging the
batched counts left in
.I count-dir
until a single count file is produced.
Optionally, a
.I file-list
of count files to combine can be specified; otherwise all count files
in
.I count-dir
from a prior run of
.B make-batch-counts
will be merged.
A number as second argument restarts the merging process at iteration
.IR start-iter .
This is convenient if merging fails to complete for some reason
(e.g., for temporary lack of disk space).
.PP
.B make-google-ngrams
takes a sorted count file as input and creates an indexed directory
structure, in a format developed by Google to store very large N-gram
collections.
The resulting directory can then be used with the
.BR ngram-count (1)
.B \-read-google
option.
Optional arguments specify the output directory
.I dir
and the size
.I N
of individual N-gram files
(default is 10 million N-grams per file).
The
.B gzip=0
option writes plain, as opposed to compressed, files.
.PP
.B continuous-ngram-count
generates N-grams that span line breaks (which are usually taken to
be sentence boundaries).
To count N-grams across line breaks use
.br
	continuous-ngram-count \fItextfile\fP |
ngram-count -read -
.br
The argument
.I N
controls the order of N-grams counted (default 3), and
should match the argument of
.B ngram-count
.BR \-order .
.PP
.B reverse-ngram-counts
reverses the word order of N-grams in a counts file or stream.
For example, to recompute lower-order counts from higher-order ones,
but do the summation over preceding words (rather than following words,
as in
.BR ngram-count (1)),
use
.br
	reverse-ngram-counts \fIcount-file\fP | \\
.br
	ngram-count -read - -recompute -write - | \\
.br
	reverse-ngram-counts > \fInew-counts\fP
.PP
.B reverse-text
reverses the word order in text files, line-by-line.
Start- and end-sentence tags, if present, will be preserved.
This reversal is appropriate for preprocessing training data
for LMs that are meant to be used with the
.B ngram
.B \-reverse
option.
.PP
.B split-tagged-ngrams
expands N-gram counts of word/tag pairs into mixed N-grams of words and tags.
The optional
.BI separator= S
argument allows the delimiting character, which defaults to "/",
to be modified.
.PP
.B make-big-lm
constructs large N-gram models in a more memory-efficient way than
.B ngram-count
by itself.
It does so by precomputing the Good-Turing or Kneser-Ney smoothing parameters
from the full set of counts, and then instructing
.B ngram-count
to store only a subset of the counts in memory,
namely those of N-grams to be retained in the model.
The
.I name
parameter is used to name various auxiliary files.
.I counts
contains the raw N-gram counts; it may be (and usually is) a compressed file.
Unlike with
.BR ngram-count ,
the
.B \-read
option can be repeated to concatenate multiple count files, but the arguments
must be regular files; reading from stdin is not supported.
If Good-Turing smoothing is used and the file contains complete lower-order
counts corresponding to the
sums of higher-order counts, then the
.B \-trust-totals
option may be given for efficiency.
All other
.I options
are passed to
.B ngram-count
(only options affecting model estimation should be given).
Smoothing methods
other than Good-Turing and modified Kneser-Ney are not
supported by
.BR make-big-lm .
Kneser-Ney smoothing also requires enough disk space to compute and store the
modified lower-order counts used by the KN method.
This is done using the
.B merge-batch-counts
command, and the
.B \-max-per-file
option controls how many counts are to be stored per batch; it should be
chosen so that these batches fit in real memory.
.PP
.B make-kn-counts
computes the modified lower-order counts used by the KN smoothing method.
It is invoked as a helper script by
.BR make-big-lm .
.PP
.B replace-words-with-classes
replaces expansions of word classes with the corresponding class labels.
.I classes
specifies class expansions in
.BR classes-format (5).
Ambiguities are resolved in favor of the longest matching word strings.
Ties are broken in favor of the expansion listed first in
.IR classes .
Optionally, the file
.I counts
will receive the expansion counts resulting from the replacements.
.B normalize=0
or
.B 1
indicates whether the counts should be normalized to probabilities
(default is 1).
The
.B addone
option may be used to smooth the expansion probabilities by adding
.I K
to each count (default 1).
The option
.B have_counts=1
indicates that the input consists of N-gram counts and that replacement
should be performed on them.
Note this will not merge counts that have been mapped to identical N-grams,
since this is done automatically when
.BR ngram-count (1)
reads count data.
The option
.B partial=1
prevents multi-word class expansions from being replaced when more than
one space character occurs in between the words.
.PP
.B uniform-classes
takes a file in
.BR classes-format (5)
and adds uniform probabilities to expansions that don't have a probability
explicitly stated.
.PP
.B make-diacritic-map
constructs a map file that pairs an ASCII-fied version of the words in
.I vocab
with all the occurring non-ASCII word forms.
Such a map file can then be used with
.BR disambig (1)
and a language model
to reconstruct the non-ASCII word form with
diacritics from an ASCII
text.
.PP
.B vp2text
is a reimplementation of the filter used in the DARPA Hub-3 and Hub-4
CSR evaluations to convert ``verbalized punctuation'' texts to
language model training data.
.PP
.B compute-oov-rate
determines the out-of-vocabulary rate of a corpus from its unigram
.I counts
and a target vocabulary list in
.IR vocab .
.SH "SEE ALSO"
ngram-count(1), ngram(1), classes-format(5), disambig(1), select-vocab(1).
.SH BUGS
Some of the tools could be generalized and/or made more robust to
misuse.
.SH AUTHOR
Andreas Stolcke <stolcke@speech.sri.com>.
.br
Copyright 1995-2006 SRI International
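.PP
Putting the batch-counting tools together, the large-model workflow
described under make-batch-counts, merge-batch-counts, and make-big-lm
might look as follows.
This is a sketch only: the directory and file names (corpus/, file-list,
counts/, big.lm) are assumptions, `cat` is used as a pass-through filter
so that ngram-count options can follow, and `final-merged.gz` stands in
for the merged count file actually produced in step 3.

```shell
# Sketch: building a large LM in batches (assumed names throughout).

# 1. List the training text files, one per line.
ls corpus/*.txt > file-list

# 2. Count trigrams in batches of 50 files; count files go to ./counts.
#    `cat` is a no-op filter, required here so -order can be passed on.
make-batch-counts file-list 50 cat counts -order 3

# 3. Merge the batch counts until a single count file remains.
merge-batch-counts counts

# 4. Estimate a modified Kneser-Ney model without holding all counts
#    in memory; final-merged.gz is a placeholder for the step-3 output.
make-big-lm -name big -order 3 -kndiscount \
    -read counts/final-merged.gz -lm big.lm
```

If step 3 fails partway (e.g., for lack of disk space), it can be
restarted at a given iteration by passing that number as the second
argument to merge-batch-counts, as described above.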
