Bugs encountered while installing spaCy

This is a problem I ran into myself; it is offered for reference only.
Error: Could not read config.cfg from D:\soft\Anaconda\envs\py38\lib\site-packages\de_core_news_sm\de_core_news_sm-2.2.5\config.cfg

The cause is a version mismatch between the installed spaCy and the de_core_news_sm-2.2.5 model package.

My spaCy is 3.0, so I need to download the model packages built for that version.
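The mismatch can be checked before loading anything. The sketch below assumes spaCy's documented naming convention, where the first two numbers of a model version track the spaCy major and minor version. The helper names (major_minor, model_matches_spacy, installed_version) are made up for illustration and are not part of spaCy.

```python
from importlib import metadata


def major_minor(version):
    """Return the (major, minor) pair of a version string such as '3.0.0'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def model_matches_spacy(spacy_version, model_version):
    """Assumption: a model fits spaCy when major and minor versions are equal."""
    return major_minor(spacy_version) == major_minor(model_version)


def installed_version(package):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    spacy_version = installed_version("spacy")
    model_version = installed_version("de_core_news_sm")
    print("spaCy:", spacy_version, "model:", model_version)
    if spacy_version and model_version:
        print("compatible:", model_matches_spacy(spacy_version, model_version))
```

With the versions from this post, model_matches_spacy("3.0.0", "2.2.5") returns False, which matches the config.cfg error above.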

The official site says to download models with python -m spacy download en_core_web_sm

That failed with the following connection error:
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Max retries exceeded with url: /explosion/spacy-models/master/compatibility.json (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 11004] getaddrinfo failed'))
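"getaddrinfo failed" means the hostname could not be resolved, so the request never reached GitHub. A minimal sketch for checking name resolution from the same machine, using only the standard library, follows. The function name can_resolve is hypothetical.

```python
import socket


def can_resolve(host, port=443):
    """Return True if the hostname resolves to at least one address."""
    try:
        return len(socket.getaddrinfo(host, port)) > 0
    except OSError:
        # socket.gaierror (a subclass of OSError) is raised when lookup fails
        return False


if __name__ == "__main__":
    print(can_resolve("raw.githubusercontent.com"))
```

If this prints False, the download command will keep failing until DNS or the network is fixed, and the manual download described next is the practical way around it.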

The workaround is to download the matching tar.gz directly from the GitHub release page.
URL: https://github.com/explosion/spacy-models/releases/tag/en_core_web_sm-3.0.0

For other languages or versions, just change the package name and version number in the URL.
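Swapping the name and version can be sketched as a small helper. The release page URL follows the one given above. The direct archive URL is an assumption based on GitHub's usual release asset layout, so verify it against the release page before relying on it. The function name release_urls is hypothetical.

```python
def release_urls(model, version):
    """Build the release page URL and the assumed archive URL for a model."""
    tag = f"{model}-{version}"
    base = "https://github.com/explosion/spacy-models/releases"
    return {
        # Same pattern as the URL given in this post
        "page": f"{base}/tag/{tag}",
        # Assumption: the asset is named <tag>.tar.gz under /download/<tag>/
        "archive": f"{base}/download/{tag}/{tag}.tar.gz",
    }


if __name__ == "__main__":
    for name in ("de_core_news_sm", "en_core_web_sm"):
        print(release_urls(name, "3.0.0")["page"])
```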

The next step is to install the downloaded archives:

pip install de_core_news_sm-3.0.0.tar.gz
pip install en_core_web_sm-3.0.0.tar.gz

Then load the models in code:

import spacy

spacy_de = spacy.load('de_core_news_sm')
spacy_en = spacy.load('en_core_web_sm')
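Before loading, it can help to confirm that both model packages are importable in the active environment. This is a minimal sketch using only the standard library. The helper name missing_models is hypothetical.

```python
import importlib.util


def missing_models(names):
    """Return the model package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_models(["de_core_news_sm", "en_core_web_sm"])
    if missing:
        print("Install these first:", ", ".join(missing))
    else:
        import spacy

        spacy_de = spacy.load("de_core_news_sm")
        spacy_en = spacy.load("en_core_web_sm")
```

This only checks that the packages exist. It does not check version compatibility, so a stale 2.2.5 model would still pass this check and then fail at load time.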
