Problems encountered when installing and using the thulac library on Python 2.7

Environment: Windows 10, Python 2.7.14, thulac-0.1.1
Error message:

Traceback (most recent call last):
File "", line 1, in 
File "C:\Python27\lib\site-packages\thulac\__init__.py", line 58, in __init__
self.__tagging_decoder.init((self.__prefix+"model_c_model.bin"),(self.__prefix+"model_c_dat.bin"),(self.__prefix+"model_c_label.txt"))
File "C:\Python27\lib\site-packages\thulac\character\CBTaggingDecoder.py", line 36, in init
self.model = CBModel(modelFile)
File "C:\Python27\lib\site-packages\thulac\character\CBModel.py", line 58, in __init__
self.fl_weights = struct.unpack("<"+str(self.f_size * self.l_size)+"i", temp)
MemoryError

After some searching, I found a fix.
Edit CBModel.py, locate the __init__ method, and find the following code:

temp = inputfile.read(4 * self.l_size * self.l_size)
self.ll_weights = struct.unpack("<"+str(self.l_size * self.l_size)+"i", temp)
self.ll_weights = tuple(self.ll_weights)

temp = inputfile.read(4 * self.f_size * self.l_size)
self.fl_weights = struct.unpack("<"+str(self.f_size * self.l_size)+"i", temp)

Change it to:

temp = inputfile.read(4 * self.l_size * self.l_size)
self.ll_weights = array.array('i')
self.ll_weights.fromstring(temp)
self.ll_weights = tuple(self.ll_weights)

temp = inputfile.read(4 * self.f_size * self.l_size)
self.fl_weights = array.array('i')
self.fl_weights.fromstring(temp)

Then add the import at the top of the file:

import array

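This works on Python 2.7. Python 3.6 still provides fromstring (as a deprecated alias of frombytes), so the change is compatible there as well. The method was removed in Python 3.9, where frombytes must be used instead.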
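
A likely reason the change helps: struct.unpack builds a tuple holding one Python int object per weight, while array.array stores the raw 4-byte values in a single buffer. The sketch below shows the same decoding in isolation, in a form that should run on both Python 2.7 and Python 3. The helper name read_int32_block and the sample values are made up for illustration and are not part of thulac. It also assumes the model file stores little-endian 4-byte integers, as the "<...i" struct format in the original code suggests.

```python
import array
import struct
import sys


def read_int32_block(raw):
    """Decode little-endian 32-bit ints from raw bytes (hypothetical helper)."""
    values = array.array('i')
    # 'i' is a native C int; the file format assumes 4 bytes per value.
    if values.itemsize != 4:
        raise ValueError("array typecode 'i' is not 4 bytes on this platform")
    if hasattr(values, 'frombytes'):
        values.frombytes(raw)   # Python 3.2 and later
    else:
        values.fromstring(raw)  # Python 2.7
    # array uses native byte order; the "<" struct format is little-endian.
    if sys.byteorder != 'little':
        values.byteswap()
    return values


# Made-up sample data standing in for a weight block read from the model file.
sample = (1, -2, 3, 40000, -50000, 6)
raw = struct.pack("<6i", *sample)

decoded = read_int32_block(raw)
unpacked = struct.unpack("<6i", raw)

# Rough size comparison: 4 bytes per value in the array, versus the tuple
# itself plus the Python int objects it refers to.
array_bytes = decoded.itemsize * len(decoded)
tuple_bytes = sys.getsizeof(unpacked) + sum(sys.getsizeof(v) for v in unpacked)
```

Both approaches yield the same values. Only the in-memory representation differs, which matters once the block holds f_size * l_size entries.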

Source: https://github.com/thunlp/THULAC-Python/issues/25
