The IRSTLM language model command in Section 2.3.4 (Baseline System) of the Moses manual is wrong

The manual says:

 ~/irstlm/bin/compile-lm \
   --text yes \
   news-commentary-v8.fr-en.lm.en.gz \
   news-commentary-v8.fr-en.arpa.en

However, the help output of compile-lm says:

compile-lm - compiles an ARPA format LM into an IRSTLM format one

USAGE:
       compile-lm [options] <input-file.lm> [output-file.blm]

DESCRIPTION:
       compile-lm reads a standard LM file in ARPA format and produces
       a compiled representation that the IRST LM toolkit can quickly
       read and process. LM file can be compressed.

OPTIONS:
Parameters:
    Help:      print this help
    d:      verbose output for --eval option; default is 0
    debug:      verbose output for --eval option; default is 0
    dict_load_factor:      sets the load factor for ngram cache; it should be a positive real value; default is 0
    dub:      dictionary upperbound to compute OOV word penalty: default 10^7
    e:      computes perplexity of the specified text file
    eval:      computes perplexity of the specified text file
    f:      filter a binary language model with a word list
    filter:      filter a binary language model with a word list
    h:      print this help
    i:      builds an inverted n-gram binary table for fast access; default if false
    invert:      builds an inverted n-gram binary table for fast access; default if false
    keepunigrams:      filter by keeping all unigrams in the table, default  is true
    ku:      filter by keeping all unigrams in the table, default  is true
    l:      maximum level to load from the LM; if value is larger than the actual LM order, the latter is taken
    level:      maximum level to load from the LM; if value is larger than the actual LM order, the latter is taken
    memmap:      uses memory map to read a binary LM
    mm:      uses memory map to read a binary LM
    ngram_load_factor:      sets the load factor for ngram cache; it should be a positive real value; default is false
    r:      computes N random calls on the specified text file
    randcalls:      computes N random calls on the specified text file
    s:      computes log-prob scores of n-grams from standard input
    score:      computes log-prob scores of n-grams from standard input
    sentence:      computes perplexity at sentence level (identified through the end symbol)
    t:      output is again in text format; default is false
    text:      output is again in text format; default is false
    tmpdir:      directory for temporary computation, default is either the environment variable TMP if defined or "/tmp")
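
In a listing this long, the two entries that matter here (t and text) are easy to miss. A minimal sketch for pulling them out of the help output follows. It assumes IRSTLM lives under ~/irstlm as in the manual's example, that -h prints the help (it is listed above as "h: print this help"), and that the help may go to stderr, which is why stderr is merged into stdout. The helper name show_text_option is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: keep only the help entries for the "t" and "text"
# options. The pattern requires a colon right after the name, so entries
# such as "tmpdir:" are not matched.
show_text_option() {
    grep -E '^[[:space:]]*(t|text):'
}

# Assumption: IRSTLM is installed under ~/irstlm, as in the manual's example.
COMPILE_LM="${COMPILE_LM:-$HOME/irstlm/bin/compile-lm}"

# Only call the real binary if it is actually there.
if [ -x "$COMPILE_LM" ]; then
    "$COMPILE_LM" -h 2>&1 | show_text_option
fi
```

If the installed version describes the option differently from the listing above, that alone would point to a version difference.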

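In other words, judging from this help text, the --text option does not need a trailing yes. I am not sure why Hieu added it; perhaps it is a version difference? I will email the mailing list tonight to ask.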
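
For reference, here is a sketch of the manual's command with --text used as a bare switch, which is how the help text above reads. This is an assumption drawn from that help output only and has not been checked against other IRSTLM releases, some of which may expect a value for the option. The file names and the ~/irstlm path are the ones from the manual's example; build_cmd is a made-up helper that just prints the command line.

```shell
#!/bin/sh
# Assumption: IRSTLM is installed under ~/irstlm, as in the manual's example.
COMPILE_LM="${COMPILE_LM:-$HOME/irstlm/bin/compile-lm}"
INPUT="news-commentary-v8.fr-en.lm.en.gz"
OUTPUT="news-commentary-v8.fr-en.arpa.en"

# Hypothetical helper: print the command line with --text as a bare switch
# (no "yes" after it), following the help text's "default is false" wording.
build_cmd() {
    printf '%s --text %s %s\n' "$COMPILE_LM" "$INPUT" "$OUTPUT"
}

# Run the real tool only if both the binary and the input file exist;
# otherwise just show what would be run.
if [ -x "$COMPILE_LM" ] && [ -f "$INPUT" ]; then
    "$COMPILE_LM" --text "$INPUT" "$OUTPUT"
else
    echo "would run: $(build_cmd)"
fi
```

If this form fails on a given installation, comparing its error message with that of the manual's version should show which syntax that release expects.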
