Installing ROUGE 1.5.5 and pyrouge on macOS

Overview of ROUGE tools for Python

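Official ROUGE: ROUGE-1.5.5 (the Perl version). The project is no longer maintained and is hard to use.
pyrouge: a Python wrapper around the official version. It requires installing the Perl version separately, which makes it cumbersome; see https://blog.csdn.net/jolinxia/article/details/77508435.
rouge: a pure-Python implementation, but by its author's own comparison its results differ from those of the official version.
py-rouge: Diego999's version, said to fix the problems in rouge; I have not verified this.
The pure-Python version in Google's seq2seq project (link): according to its author, its results differ from pyrouge's (reason) but show similar trends; they just cannot be compared against scores in published papers.
pythonrouge: released by a Japanese developer; also a wrapper around the Perl version, but it bundles ROUGE-1.5.5 in its source tree.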

Installing ROUGE-1.5.5

References

  • Tutorial: https://blog.csdn.net/qq_32458499/article/details/80282049
  • Installing XML::DOM: https://www.cpan.org/modules/INSTALL.html

Steps

  1. Install Perl. macOS and Linux ship with it; run perl -v, and if version information is printed, Perl is installed.
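  2. Install XML::DOM. First install the cpan tooling (the package is named cpanminus and provides the cpanm command):
    [macOS] brew install cpanminus
    [Linux] sudo apt install cpanminus
    [Windows] http://strawberryperl.com/
    Then run sudo cpan XML::DOM. Be sure to install with sudo; otherwise ROUGE cannot find XML::DOM.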
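  3. Download ROUGE-1.5.5 and configure it
    git clone https://github.com/summanlp/evaluation
    # [Environment variable] Add the following line to ~/.bash_profile (adjust the path to your own copy)
    export ROUGE_EVAL_HOME="/Users/zhouwei/Documents/ROUGE-RELEASE-1.5.5/data/"
    cd ROUGE-RELEASE-1.5.5
    Run: perl runROUGE-test.pl. If the test output appears (the screenshot from the original post is not reproduced here), the installation succeeded.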
    If you see "Cannot open exception db file for reading: data/WordNet-2.0.exc.db", rebuild the exception database:
    cd data/WordNet-2.0-Exceptions/
    mv WordNet-2.0.exc.db WordNet-2.0.exc.db.bak
    ./buildExeptionDB.pl . exc WordNet-2.0.exc.db
    Run the test again; it should now work.
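
The steps above can also be checked from Python before moving on to pyrouge. The following is only a minimal sketch: the function name check_rouge_setup is made up for this example, and it assumes the layout described above (ROUGE-1.5.5.pl in the ROUGE directory and WordNet-2.0.exc.db under data/).

```python
import os
import shutil
import subprocess


def check_rouge_setup(rouge_home):
    """Return a list of problems found; an empty list means every check passed.

    check_rouge_setup is a hypothetical helper, not part of ROUGE or pyrouge.
    """
    problems = []

    # 1. Perl must be on PATH.
    perl = shutil.which("perl")
    if perl is None:
        problems.append("perl not found on PATH")
    else:
        # 2. XML::DOM must be loadable: "perl -MXML::DOM -e 1" exits with 0 if it is.
        result = subprocess.run(
            [perl, "-MXML::DOM", "-e", "1"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            problems.append("Perl module XML::DOM could not be loaded")

    # 3. The ROUGE script and the WordNet exception database must exist.
    for relative in ("ROUGE-1.5.5.pl", os.path.join("data", "WordNet-2.0.exc.db")):
        path = os.path.join(rouge_home, relative)
        if not os.path.isfile(path):
            problems.append("missing " + path)

    # 4. ROUGE_EVAL_HOME should be set in the environment.
    if not os.environ.get("ROUGE_EVAL_HOME"):
        problems.append("ROUGE_EVAL_HOME is not set")

    return problems
```

Calling check_rouge_setup with the path of your own ROUGE-RELEASE-1.5.5 directory and printing the returned list shows which step still needs attention.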

Installing pyrouge

References

  • https://github.com/bheinzerling/pyrouge
  • https://blog.csdn.net/uhauha2929/article/details/79438659

Installation

pip install pyrouge
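Then check the installation path and make sure Python can find the package.
On my Mac the paths are /Users/zhouwei/.local/lib/python3.6/site-packages and /usr/local/lib/python2.7/site-packages,
so they need to be added to PYTHONPATH in ~/.bash_profile, after which the terminal must be restarted.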
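
One way to see whether the current interpreter can find pyrouge, and where, is to ask the import system directly. This is a small sketch using only the standard library; locate_package is a name invented for this example.

```python
import importlib.util
import os
import sys


def locate_package(name):
    """Return the directory holding an importable top-level module, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return os.path.dirname(spec.origin)


if __name__ == "__main__":
    location = locate_package("pyrouge")
    if location is None:
        # Not importable: list the directories Python actually searches.
        print("pyrouge is not importable; sys.path is:")
        for entry in sys.path:
            print("   ", entry)
    else:
        print("pyrouge found in", location)
```

If the package is reported as not importable, compare the printed sys.path entries with the directory pip installed into.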

The installation completes, but the command-line tools do not work well, and pyrouge.test does not pass either.

Testing

Data:
test_data/gold/gold.A.0.txt
test_data/system/system.0.txt
Each line in a file is one sentence, tokenized in advance with tokens separated by spaces:

丰 田 RAV4 性 价 比 超 高 , 前 置 前 驱 , 适 合 年 轻 人
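
The post does not say how the sample line was tokenized. One simple way to produce this format, shown here only as an assumed sketch, is to keep runs of ASCII letters and digits together and put every other non-whitespace character in its own token. The name char_tokenize is made up for this example.

```python
import re

# A token is either a run of ASCII letters/digits (such as "RAV4") or one
# non-whitespace character (a Chinese character or a punctuation mark).
_TOKEN = re.compile(r"[A-Za-z0-9]+|\S")


def char_tokenize(sentence):
    """Return the sentence as tokens joined by single spaces."""
    return " ".join(_TOKEN.findall(sentence))


# Prints the sample line shown above.
print(char_tokenize("丰田RAV4性价比超高,前置前驱,适合年轻人"))
```

A real word segmenter would give different tokens; character-level splitting is just the simplest choice that matches the sample line.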
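
Evaluation script:

from pyrouge import Rouge155

# Pass the ROUGE directory explicitly the first time. Once this has run on a machine,
# a plain Rouge155() will find the path on later runs.
r = Rouge155("/Users/zhouwei/Documents/ROUGE-RELEASE-1.5.5")
r.system_dir = "/Users/zhouwei/Documents/ROUGE-RELEASE-1.5.5/test_data/system"
r.model_dir = "/Users/zhouwei/Documents/ROUGE-RELEASE-1.5.5/test_data/gold"
# The group (\d+) captures the document ID; #ID# in the model pattern stands for that ID.
r.system_filename_pattern = r'system.(\d+).txt'
r.model_filename_pattern = 'gold.[A-Z].#ID#.txt'
# Convert the plain-text summaries to ROUGE's input format and run the evaluation.
output = r.convert_and_evaluate()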

If the log looks like the following, the Python side is working:

2018-11-05 10:04:50,578 [MainThread  ] [INFO ]  Writing summaries.
2018-11-05 10:04:50,581 [MainThread  ] [INFO ]  Processing summaries. Saving system files to /var/folders/pv/f00msc61103dmwyts362vbkr0000gn/T/tmp_ho983em/system and model files to /var/folders/pv/f00msc61103dmwyts362vbkr0000gn/T/tmp_ho983em/model.
2018-11-05 10:04:50,581 [MainThread  ] [INFO ]  Processing files in /Users/zhouwei/Documents/ROUGE-RELEASE-1.5.5/test_data_zw/system.
2018-11-05 10:04:50,581 [MainThread  ] [INFO ]  Processing system.0.txt.
2018-11-05 10:04:50,582 [MainThread  ] [INFO ]  Saved processed files to /var/folders/pv/f00msc61103dmwyts362vbkr0000gn/T/tmp_ho983em/system.
2018-11-05 10:04:50,582 [MainThread  ] [INFO ]  Processing files in /Users/zhouwei/Documents/ROUGE-RELEASE-1.5.5/test_data_zw/gold.
2018-11-05 10:04:50,582 [MainThread  ] [INFO ]  Processing gold.A.0.txt.
2018-11-05 10:04:50,583 [MainThread  ] [INFO ]  Saved processed files to /var/folders/pv/f00msc61103dmwyts362vbkr0000gn/T/tmp_ho983em/model.
2018-11-05 10:04:50,584 [MainThread  ] [INFO ]  Written ROUGE configuration to /var/folders/pv/f00msc61103dmwyts362vbkr0000gn/T/tmpy_0jvm85/rouge_conf.xml
2018-11-05 10:04:50,584 [MainThread  ] [INFO ]  Running ROUGE with command /Users/zhouwei/Documents/ROUGE-RELEASE-1.5.5/ROUGE-1.5.5.pl -e /Users/zhouwei/Documents/ROUGE-RELEASE-1.5.5/data -c 95 -2 -1 -U -r 1000 -n 4 -w 1.2 -a -m /var/folders/pv/f00msc61103dmwyts362vbkr0000gn/T/tmpy_0jvm85/rouge_conf.xml
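
The log shows system.0.txt being processed together with gold.A.0.txt. The way the two filename patterns tie a system summary to its reference summaries can be sketched as below. This illustrates the matching idea as I understand it and is not pyrouge's own code; pair_files and the extra file names gold.B.0.txt and gold.A.1.txt are hypothetical.

```python
import re


def pair_files(system_files, model_files, system_pattern, model_pattern):
    """Map each system file to the model (reference) files that share its ID."""
    system_re = re.compile(system_pattern)
    pairs = {}
    for system_name in sorted(system_files):
        match = system_re.match(system_name)
        if match is None:
            continue  # the file name does not follow the system pattern
        doc_id = match.group(1)
        # Put the captured ID in place of the "#ID#" placeholder, then match.
        model_re = re.compile(model_pattern.replace("#ID#", re.escape(doc_id)))
        pairs[system_name] = sorted(m for m in model_files if model_re.match(m))
    return pairs


pairs = pair_files(
    ["system.0.txt"],
    ["gold.A.0.txt", "gold.B.0.txt", "gold.A.1.txt"],  # last two are hypothetical
    r"system.(\d+).txt",
    "gold.[A-Z].#ID#.txt",
)
print(pairs)
```

In this sketch system.0.txt is paired with gold.A.0.txt and gold.B.0.txt, while gold.A.1.txt is left out because its ID is 1.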

Then:

print(output)

The result:

---------------------------------------------
1 ROUGE-1 Average_R: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
1 ROUGE-1 Average_P: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
1 ROUGE-1 Average_F: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
---------------------------------------------
1 ROUGE-2 Average_R: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
1 ROUGE-2 Average_P: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
1 ROUGE-2 Average_F: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
---------------------------------------------
1 ROUGE-3 Average_R: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
1 ROUGE-3 Average_P: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
1 ROUGE-3 Average_F: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
---------------------------------------------
1 ROUGE-4 Average_R: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
1 ROUGE-4 Average_P: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
1 ROUGE-4 Average_F: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
---------------------------------------------
1 ROUGE-L Average_R: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
1 ROUGE-L Average_P: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
1 ROUGE-L Average_F: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
---------------------------------------------
1 ROUGE-W-1.2 Average_R: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
1 ROUGE-W-1.2 Average_P: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
1 ROUGE-W-1.2 Average_F: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
---------------------------------------------
1 ROUGE-S* Average_R: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
1 ROUGE-S* Average_P: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
1 ROUGE-S* Average_F: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
---------------------------------------------
1 ROUGE-SU* Average_R: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
1 ROUGE-SU* Average_P: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
1 ROUGE-SU* Average_F: 0.00000 (95%-conf.int. 0.00000 - 0.00000)

Convert the output to a Python dict for easier access:

output_dict = r.output_to_dict(output)
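
To make clear what information such a conversion extracts from the text report, here is a stand-alone sketch that parses score lines of the form shown above. It is not pyrouge's implementation; the function name parse_rouge_output and the (metric, measure) key layout are choices made for this example only.

```python
import re

# Matches lines such as:
# 1 ROUGE-1 Average_R: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
_LINE = re.compile(
    r"^\S+ (ROUGE-\S+) Average_([RPF]): (\d\.\d+) "
    r"\(95%-conf\.int\. (\d\.\d+) - (\d\.\d+)\)"
)


def parse_rouge_output(text):
    """Return {(metric, measure): (score, ci_low, ci_high)} for each score line."""
    scores = {}
    for line in text.splitlines():
        match = _LINE.match(line.strip())
        if match:
            metric, measure, score, low, high = match.groups()
            scores[(metric, measure)] = (float(score), float(low), float(high))
    return scores


sample = """\
---------------------------------------------
1 ROUGE-1 Average_R: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
1 ROUGE-1 Average_P: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
1 ROUGE-1 Average_F: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
---------------------------------------------
1 ROUGE-2 Average_R: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
"""
scores = parse_rouge_output(sample)
print(scores[("ROUGE-1", "F")])  # (1.0, 1.0, 1.0)
```

Separator lines and anything else that does not look like a score line are skipped.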
