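I have been reading Natural Language Processing with Python over the past few days. The book asks you to install the NLTK toolkit, and I ran into quite a few problems while installing and configuring it, so I am summarizing them here one by one.
First, note that the latest Python release, 3.3.2, was not fully compatible with the latest NLTK release, 3.0a0, apparently because the latter is based on Python 3.0. I therefore uninstalled the Python 3.3.2 I had and installed Python 3.0.1. My environment is 32-bit Windows 7.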
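Before installing anything, it can help to confirm which interpreter is actually running. The following is a minimal sketch using only the standard library; the helper names `describe_interpreter` and `matches_series` are made up for illustration and are not part of Python or NLTK.

```python
import struct
import sys


def describe_interpreter():
    """Return (major, minor, micro, pointer_bits) for the running Python."""
    # Pointer size in bytes times 8: 32 on a 32-bit build, 64 on a 64-bit build.
    bits = struct.calcsize("P") * 8
    major, minor, micro = sys.version_info[:3]
    return major, minor, micro, bits


def matches_series(version, wanted):
    """True if `version` starts with the (major, minor) pair in `wanted`."""
    return tuple(version[:2]) == tuple(wanted)


if __name__ == "__main__":
    major, minor, micro, bits = describe_interpreter()
    print("Python %d.%d.%d, %d-bit" % (major, minor, micro, bits))
    # (3, 0) is the series this post ended up using.
    print("3.0 series:", matches_series((major, minor), (3, 0)))
```

The bitness matters here because the Windows installers mentioned in this post are 32-bit builds tied to a specific Python series.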
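Download links:
First install the two packages above, one after the other. My installation directory is:
As the book suggests, once the software is installed you need to download the accompanying data. Run the following in IDLE: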
import nltk
nltk.download()
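However, the very first statement fails with: ImportError: No module named yaml
A Google search showed that this means one more package, yaml (PyYAML), has to be installed. Download it from http://pyyaml.org/download/pyyaml/PyYAML-3.10.win32-py3.0.exe and, as before, run the installer with administrator privileges; it again installs itself into the Python directory automatically.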
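To see in one go which modules are still missing, a check along these lines can be run before and after installing PyYAML. It is a minimal sketch; `missing_modules` is a hypothetical helper, and it uses the built-in `__import__` so that it also works on older Python 3 releases.

```python
def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            # Raised when the module, or something it imports, is not installed.
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "yaml" is the import name of PyYAML; "nltk" is the toolkit itself.
    print(missing_modules(["yaml", "nltk"]))
```

An empty list means both imports succeed. If "nltk" is reported while "yaml" is not, the problem lies somewhere other than the yaml dependency.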
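Now import nltk runs without problems. The next step is to download the NLTK book collection, that is, the accompanying data. Note that because the software versions differ, the download procedure described here is not the same as the one in the book.
After running "import nltk", run "nltk.download()". An interface different from the earlier one appears:
Then do the following to start the download: type d, then type book.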
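The interactive steps above (d, then book) can also be scripted. The sketch below assumes that `nltk.download` accepts a collection identifier such as "book" and returns a truthy value on success; that matches the NLTK versions I know of, but treat it as an assumption for 3.0a0. The wrapper `download_with_retry` and its parameters are hypothetical. It retries a few times because the server does not always respond on a slow connection.

```python
import time


def download_with_retry(collection="book", download_fn=None,
                        attempts=3, wait_seconds=5.0):
    """Try to download `collection`, retrying when the call fails.

    `download_fn` is any callable that takes the collection name and
    returns a truthy value on success. If omitted, nltk.download is used
    (assumed to behave that way).
    """
    if download_fn is None:
        import nltk  # imported lazily so the helper can be tried without NLTK
        download_fn = nltk.download

    for attempt in range(1, attempts + 1):
        try:
            if download_fn(collection):
                return True
        except (IOError, OSError) as err:
            # Network failures usually surface as IOError/OSError subclasses.
            print("attempt %d failed: %s" % (attempt, err))
        if attempt < attempts:
            time.sleep(wait_seconds)
    return False
```

Calling `download_with_retry("book")` in IDLE would then stand in for the d / book dialogue. Passing a different `download_fn` makes it possible to try the retry logic without touching the network.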
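I first ran this in my dorm a little after 8 p.m. The campus network is turtle-slow at the best of times, and this was the evening peak, so several attempts in a row reported that the server was not responding. At first I did not realize the network speed was to blame. I searched online and found a claim that the site might be blocked in China and that the data could be downloaded manually from http://nltk.googlecode.com/svn/trunk/nltk_data/index.xml. But that address opened directly in my browser, so I did not believe the explanation. When I tried again around midnight, the download started without trouble. It took about 10 minutes in total and fetched 298 MB of data.
Pay attention to the download location, which the software chooses automatically. If you create a folder named "nltk_data" in the root of drive C, D, or E beforehand, the data is downloaded into that folder; otherwise it goes to the location shown in the figure above. When everything has been downloaded, the interface prints a message; type q to quit the downloader and return to the Python prompt.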
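The drive-root rule described above can be checked with a few lines of standard-library Python. This is only a sketch of that rule as I observed it, not NLTK's own lookup code; `first_existing_dir` is a hypothetical helper.

```python
import os


def first_existing_dir(candidates):
    """Return the first path in `candidates` that is an existing directory, else None."""
    for path in candidates:
        if os.path.isdir(path):
            return path
    return None


if __name__ == "__main__":
    # Drive roots named in the text above; the folder name is "nltk_data".
    roots = ["C:\\", "D:\\", "E:\\"]
    candidates = [os.path.join(root, "nltk_data") for root in roots]
    # Prints the folder the download would land in under that rule, or None.
    print(first_existing_dir(candidates))
```

In NLTK itself, the list `nltk.data.path` holds the directories that are searched for data, so printing it after `import nltk` shows the locations NLTK actually considers.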
The statements that follow then ran without problems:
>>> from nltk.book import *
*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.
text1: Moby Dick by Herman Melville 1851
text2: Sense and Sensibility by Jane Austen 1811
text3: The Book of Genesis
text4: Inaugural Address Corpus
text5: Chat Corpus
text6: Monty Python and the Holy Grail
text7: Wall Street Journal
text8: Personals Corpus
text9: The Man Who Was Thursday by G . K . Chesterton 1908
>>> text1
<Text: Moby Dick by Herman Melville 1851>
>>> text1.concordance('monstrous')
Building index...
Displaying 11 of 11 matches:
ong the former , one was of a most monstrous size . ... This came towards us ,
ON OF THE PSALMS . " Touching that monstrous bulk of the whale or ork we have r
ll over with a heathenish array of monstrous clubs and spears . Some were thick
d as you gazed , and wondered what monstrous cannibal and savage could ever hav
that has survived the flood ; most monstrous and most mountainous ! That Himmal
they might scout at Moby Dick as a monstrous fable , or still worse and more de
th of Radney .'" CHAPTER 55 Of the Monstrous Pictures of Whales . I shall ere l
ing Scenes . In connexion with the monstrous pictures of whales , I am strongly
ere to enter upon those still more monstrous stories of them which are to be fo
ght have been rummaged out of this monstrous cabinet there is no telling . But
of Whale - Bones ; for Whales of a monstrous size are oftentimes cast up dead u
>>>