Python raises ValueError: empty vocabulary; perhaps the documents only contain stop words

The code I am running comes from https://github.com/sunxiangguo/chinese_text_classification. I am using Python 3.9 and PyCharm 2020.3.3.

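The training and test sets are the ones bundled with the repository. I had to create two folders myself to hold the text after word segmentation. Later, when I ran the TF-IDF step, I got the following error: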

C:/Users/qianyz/Downloads/chinese_text_classification-master/TFIDF_space.py
Traceback (most recent call last):
  File "C:\Users\qianyz\Downloads\chinese_text_classification-master\TFIDF_space.py", line 41, in <module>
    vector_space(stopword_path, bunch_path, space_path)
  File "C:\Users\qianyz\Downloads\chinese_text_classification-master\TFIDF_space.py", line 30, in vector_space
    tfidfspace.tdm = vectorizer.fit_transform(bunch.contents)
  File "C:\Users\qianyz\venv\Lib\site-packages\sklearn\feature_extraction\text.py", line 1849, in fit_transform
    X = super().fit_transform(raw_documents)
  File "C:\Users\qianyz\venv\Lib\site-packages\sklearn\feature_extraction\text.py", line 1203, in fit_transform
    vocabulary, X = self._count_vocab(raw_documents,self.fixed_vocabulary_)
  File "C:\Users\qianyz\venv\Lib\site-packages\sklearn\feature_extraction\text.py", line 1134, in _count_vocab
    raise ValueError("empty vocabulary; perhaps the documents only"
ValueError: empty vocabulary; perhaps the documents only contain stop words

Process finished with exit code 1

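I have tried many of the fixes suggested online and in blog posts, including changing the analyzer parameter to word and to char, but I still get the same error. Could anyone help me figure this out? Many thanks.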
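One thing that may be worth checking before changing the vectorizer settings is whether `bunch.contents` actually holds any text. This error is raised when no tokens survive, which also happens if the list is empty or every document is blank, for example because the hand-created segmentation folders were never filled. The sketch below uses only the standard library; `summarize_documents` is a hypothetical helper name, and the regular expression is the default `token_pattern` of scikit-learn's vectorizers for `analyzer="word"`, which keeps only tokens of two or more word characters.

```python
import re

# Default token_pattern used by scikit-learn's CountVectorizer/TfidfVectorizer
# when analyzer="word": tokens must have at least two word characters.
DEFAULT_TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")


def summarize_documents(contents, stop_words=()):
    """Hypothetical helper: report why a corpus might yield an empty vocabulary."""
    stop = set(stop_words)
    empty = 0
    vocabulary = set()
    for doc in contents:
        # Documents read in binary mode arrive as bytes; decode them for inspection.
        if isinstance(doc, bytes):
            doc = doc.decode("utf-8", errors="replace")
        if not doc or not doc.strip():
            empty += 1
            continue
        tokens = DEFAULT_TOKEN_PATTERN.findall(doc.lower())
        vocabulary.update(t for t in tokens if t not in stop)
    return {
        "documents": len(contents),
        "empty_documents": empty,
        "vocabulary_size": len(vocabulary),
    }
```

Calling `summarize_documents(bunch.contents)` just before `vectorizer.fit_transform(bunch.contents)` would show which case applies. If `documents` is 0 or equals `empty_documents`, the problem is probably upstream in the segmentation output or its path rather than in the `analyzer` setting. If documents exist but `vocabulary_size` is 0, the token pattern or the stop word list is the more likely culprit.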
