kenlm won't run (Aborted, core dumped)

Recently our NLP course instructor had us train a language model on a corpus with kenlm. After installing it I couldn't get it to run, but the terminal output itself tells you how to fix the problem: the line just above the final "Aborted (core dumped)" says to
rerun with --discount_fallback
So, insert the --discount_fallback flag into the command
bin/lmplz -o 3 -S 50% --verbose_header --text text/text.txt --arpa MyModel/jisuanji.arpa
giving
bin/lmplz -o 3 -S 50% --verbose_header --discount_fallback --text text/text.txt --arpa MyModel/jisuanji.arpa
and it runs fine.
lianfu@lianfu-Lenovo-Legion-Y7000P-1060:~/kenlm/build$ bin/lmplz -o 3 -S 50% --verbose_header --text text/text.txt --arpa MyModel/jisuanji.arpa
=== 1/5 Counting and sorting n-grams ===
Reading /home/lianfu/kenlm/build/text/text.txt
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100


Unigram tokens 3 types 6
=== 2/5 Calculating and sorting adjusted counts ===
Chain sizes: 1:72 2:2901334272 3:5440002048
/home/lianfu/kenlm/lm/builder/adjust_counts.cc:52 in void lm::builder::{anonymous}::StatCollector::CalculateDiscounts(const lm::builder::DiscountConfig&) threw BadDiscountException because `s.n[j] == 0'.
Could not calculate Kneser-Ney discounts for 1-grams with adjusted count 4 because we didn't observe any 1-grams with adjusted count 3; Is this small or artificial data?
Try deduplicating the input. To override this error for e.g. a class-based model, rerun with --discount_fallback

Aborted (core dumped)
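Before overriding the error, it is worth trying the log's first suggestion: deduplicate the training text, since repeated lines can skew the adjusted n-gram counts that Kneser-Ney discounting needs. A minimal sketch of line-level deduplication (the function name and file paths are my own examples, not from kenlm):

```python
def dedupe_lines(in_path, out_path):
    """Copy in_path to out_path, keeping only the first occurrence of each line."""
    seen = set()
    with open(in_path, encoding="utf-8") as fin, \
         open(out_path, "w", encoding="utf-8") as fout:
        for line in fin:
            if line not in seen:
                seen.add(line)
                fout.write(line)
```

You would then point --text at the deduplicated file, e.g. dedupe_lines("text/text.txt", "text/text.dedup.txt") and rerun lmplz on the new file.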

With the --discount_fallback flag added, the run completes successfully.
