sklearn.externals.joblib.externals.loky.process_executor.BrokenProcessPool: A task has failed to un-serialize

Background

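While working on my master's thesis project I used a random forest model. Once the code was written, every run kept failing with sklearn.externals.joblib.externals.loky.process_executor.BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable. In other words, a task could not be deserialized in a worker process, and the message asks you to make sure every argument passed to the function can be pickled.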
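
Since the message points at pickling, one quick thing to try is checking each argument with the standard pickle module. The sketch below is only an illustration: find_unpicklable is a hypothetical helper name, not part of sklearn or joblib, and the standard pickle module may be stricter than the serializer the loky backend actually uses, so a hit here is a hint rather than proof.

```python
import pickle


def find_unpicklable(**named_args):
    """Return {name: error text} for every argument pickle cannot serialize."""
    problems = {}
    for name, value in named_args.items():
        try:
            pickle.dumps(value)
        except Exception as exc:  # PicklingError, AttributeError, TypeError, ...
            problems[name] = f"{type(exc).__name__}: {exc}"
    return problems


if __name__ == "__main__":
    # Hypothetical arguments: a plain list pickles, a lambda does not.
    report = find_unpicklable(
        data=[1, 2, 3],
        scorer=lambda y_true, y_pred: 0.0,
    )
    for name, reason in report.items():
        print(name, "->", reason)
```

If the report comes back empty for the real arguments, the cause probably lies elsewhere, for example in the worker processes themselves.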

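I had genuinely never run into this before. From ten in the evening until past midnight, more than two hours, I searched Baidu extensively and read several English-language write-ups, none of which quite worked. In the end, changing a parameter and then downgrading sklearn fixed it. I was so pleased that I wrote this post right away, hoping it saves time for whoever hits this next.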

Reproducing the error

The key is the last line!
D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\preprocessing\data.py:617: DataConversionWarning: Data with input dtype int64, float64 were all converted to float64 by StandardScaler.
  return self.partial_fit(X, y)
D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\base.py:465: DataConversionWarning: Data with input dtype int64, float64 were all converted to float64 by StandardScaler.
  return self.fit(X, y, **fit_params).transform(X)
Fitting 3 folds for each of 9 candidates, totalling 27 fits
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
exception calling callback for
sklearn.externals.joblib.externals.loky.process_executor._RemoteTraceback: 
'''
Traceback (most recent call last):
  File "D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\externals\joblib\externals\loky\process_executor.py", line 393, in _process_worker
    call_item = call_queue.get(block=True, timeout=timeout)
  File "C:\Program Files\Python37\lib\multiprocessing\queues.py", line 99, in get
    if not self._rlock.acquire(block, timeout):
PermissionError: [WinError 5] 拒绝访问。
'''
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\externals\joblib\externals\loky\_base.py", line 625, in _invoke_callbacks
    callback(self)
  File "D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\externals\joblib\parallel.py", line 375, in __call__
    self.parallel.dispatch_next()
  File "D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\externals\joblib\parallel.py", line 797, in dispatch_next
    if not self.dispatch_one_batch(self._original_iterator):
  File "D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\externals\joblib\parallel.py", line 825, in dispatch_one_batch
    self._dispatch(tasks)
  File "D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\externals\joblib\parallel.py", line 782, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 506, in apply_async
    future = self._workers.submit(SafeFunction(func))
  File "D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\externals\joblib\externals\loky\reusable_executor.py", line 151, in submit
    fn, *args, **kwargs)
  File "D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\externals\joblib\externals\loky\process_executor.py", line 1016, in submit
    raise self._flags.broken
sklearn.externals.joblib.externals.loky.process_executor.BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.
[Parallel(n_jobs=-1)]: Done   2 tasks      | elapsed:    1.0s
[Parallel(n_jobs=-1)]: Done   3 tasks      | elapsed:    1.0s
[Parallel(n_jobs=-1)]: Done   4 tasks      | elapsed:    1.0s
[Parallel(n_jobs=-1)]: Done   5 tasks      | elapsed:    1.0s
[Parallel(n_jobs=-1)]: Done   6 tasks      | elapsed:    1.0s
[Parallel(n_jobs=-1)]: Done   7 tasks      | elapsed:    1.0s
[Parallel(n_jobs=-1)]: Done   8 tasks      | elapsed:    1.0s
[Parallel(n_jobs=-1)]: Done   9 tasks      | elapsed:    1.0s
[Parallel(n_jobs=-1)]: Done  10 tasks      | elapsed:    1.0s
[Parallel(n_jobs=-1)]: Done  11 tasks      | elapsed:    1.0s
[Parallel(n_jobs=-1)]: Done  12 tasks      | elapsed:    1.0s
[Parallel(n_jobs=-1)]: Done  14 out of  17 | elapsed:    1.0s remaining:    0.1s
错误: 没有找到进程 "19312"。
错误: 没有找到进程 "20840"。
sklearn.externals.joblib.externals.loky.process_executor._RemoteTraceback: 
'''
Traceback (most recent call last):
  File "D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\externals\joblib\externals\loky\process_executor.py", line 393, in _process_worker
    call_item = call_queue.get(block=True, timeout=timeout)
  File "C:\Program Files\Python37\lib\multiprocessing\queues.py", line 99, in get
    if not self._rlock.acquire(block, timeout):
PermissionError: [WinError 5] 拒绝访问。
'''
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Program Files (x86)\myInstall\pycharm\PyCharm 2019.1.1\helpers\pydev\_pydev_bundle\pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "C:\Program Files (x86)\myInstall\pycharm\PyCharm 2019.1.1\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "D:/myCode/spark/spark_ML/buildingModel.py", line 114, in <module>
    pipe_rf.fit(xtrain,ytrain)
  File "D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\pipeline.py", line 267, in fit
    self._final_estimator.fit(Xt, y, **fit_params)
  File "D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\model_selection\_search.py", line 722, in fit
    self._run_search(evaluate_candidates)
  File "D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\model_selection\_search.py", line 1191, in _run_search
    evaluate_candidates(ParameterGrid(self.param_grid))
  File "D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\model_selection\_search.py", line 711, in evaluate_candidates
    cv.split(X, y, groups)))
  File "D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\externals\joblib\parallel.py", line 996, in __call__
    self.retrieve()
  File "D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\externals\joblib\parallel.py", line 899, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 517, in wrap_future_result
    return future.result(timeout=timeout)
  File "C:\Program Files\Python37\lib\concurrent\futures\_base.py", line 432, in result
    return self.__get_result()
  File "C:\Program Files\Python37\lib\concurrent\futures\_base.py", line 384, in __get_result
    raise self._exception
  File "D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\externals\joblib\externals\loky\_base.py", line 625, in _invoke_callbacks
    callback(self)
  File "D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\externals\joblib\parallel.py", line 375, in __call__
    self.parallel.dispatch_next()
  File "D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\externals\joblib\parallel.py", line 797, in dispatch_next
    if not self.dispatch_one_batch(self._original_iterator):
  File "D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\externals\joblib\parallel.py", line 825, in dispatch_one_batch
    self._dispatch(tasks)
  File "D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\externals\joblib\parallel.py", line 782, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 506, in apply_async
    future = self._workers.submit(SafeFunction(func))
  File "D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\externals\joblib\externals\loky\reusable_executor.py", line 151, in submit
    fn, *args, **kwargs)
  File "D:\myCode\PythonTest\MachineLearning\venv\lib\site-packages\sklearn\externals\joblib\externals\loky\process_executor.py", line 1016, in submit
    raise self._flags.broken
sklearn.externals.joblib.externals.loky.process_executor.BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.

The run started out fine, but failed partway through.

Solution

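I searched Baidu for a long time. I started with CSDN, but there were very few results for this kind of problem; perhaps most people never run into it. Then I searched elsewhere, and eventually found an answer on Stack Overflow. Screenshot below.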

[Image 3: screenshot of the Stack Overflow answer]

The answer suggests downgrading to version 0.20.2 and changing the n_jobs parameter. I tried that, but it did not work; the error persisted.
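
When a downgrade seems to have no effect, it can help to confirm which version the active interpreter actually sees. Here is a minimal sketch using only the standard library (importlib.metadata needs Python 3.8 or later). installed_version and version_tuple are hypothetical helper names, and scikit-learn is the distribution name used on PyPI.

```python
import re
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '0.20.2' into (0, 20, 2); stops at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


if __name__ == "__main__":
    found = installed_version("scikit-learn")
    if found is None:
        print("scikit-learn is not installed in this environment")
    else:
        print("scikit-learn", found, version_tuple(found))
        # To pin a release inside the active venv:
        #   pip install scikit-learn==0.20.2
```

On the Python 3.7 interpreter shown in the traceback, printing sklearn.__version__ gives the same information.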

######################################################################################

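After that I went to GitHub and found a similar issue written in English. My English is passable, enough to follow a GitHub thread. (If your English is weaker, you can have Chrome translate the page into Chinese, but it is better to read the original, because Google's translation also translates the code.) After studying it carefully for a while, the gist was again: change the parameter, downgrade the version, and so on.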

[Image 4: screenshot of the GitHub issue]

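Later, after rereading the boxed text in the screenshot and considering that the parameter probably also needed changing, I settled on an approach: changing the parameter and downgrading the version together should work.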

So I downgraded sklearn to 0.20.0 (why this version? I tried others and none worked; only this one did) and then changed the parameter:

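from sklearn.model_selection import GridSearchCV

# Grid search: changing n_jobs=-1 to n_jobs=1 made the error go away
# (rf_reg, rf_grid_params and cv are defined earlier in the script)
rf_gridsearch = GridSearchCV(rf_reg, rf_grid_params, cv=cv, n_jobs=1,
                             scoring='neg_mean_squared_error', verbose=5, refit=True)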
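
For reference, here is a self-contained sketch of the same setup with n_jobs=1, so the search runs in the current process and no worker pool is created. The data is synthetic and the grid values are made up for illustration (the post does not show the originals). Only the StandardScaler step, the 3 folds and the 9 candidates are taken from the log above.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the post's xtrain / ytrain.
X, y = make_regression(n_samples=120, n_features=5, noise=0.1, random_state=0)

rf_reg = RandomForestRegressor(random_state=0)
# Hypothetical 3 x 3 grid (9 candidates); the original values are not shown.
rf_grid_params = {"n_estimators": [5, 10, 15], "max_depth": [2, 4, 6]}

# n_jobs=1 keeps every fit in this process instead of dispatching to workers.
rf_gridsearch = GridSearchCV(rf_reg, rf_grid_params, cv=3, n_jobs=1,
                             scoring="neg_mean_squared_error", refit=True)

pipe_rf = Pipeline([("scale", StandardScaler()), ("search", rf_gridsearch)])
pipe_rf.fit(X, y)

best_params = pipe_rf.named_steps["search"].best_params_
print(best_params)
```

The trade-off is speed: with n_jobs=1 the 27 fits run one after another instead of being spread across 8 workers.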

 ######################################################################################

The approach was right, but the trouble was not over. The previous error was gone, but a new one appeared:

Python DeprecationWarning the imp module is deprecated in favour of importlib

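This one is simple, and strictly speaking it is a warning rather than an error: the imp module is old and has been superseded. imp has been deprecated since Python 3.4, and importlib is the recommended replacement.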

Fix:

Click the link in the message to open the file it points to, comment out the imp import, and import importlib instead.
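
The post does not show which imp calls were in the edited file, so the following is only a sketch of common one-to-one replacements from the standard library. module_is_available is a hypothetical helper name, and json stands in for whatever module the real code loads.

```python
import importlib
import importlib.util


def module_is_available(name):
    """Existence check for a top-level module, in place of imp.find_module()."""
    return importlib.util.find_spec(name) is not None


# In place of imp.load_module(): import a module by its name.
json_module = importlib.import_module("json")

# In place of imp.reload(): re-execute an already imported module.
json_module = importlib.reload(json_module)

if __name__ == "__main__":
    print(module_is_available("json"))
    print(json_module.dumps({"ok": True}))
```

Edits made inside site-packages are lost when the package is reinstalled or upgraded, so a change like this may need repeating.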

After that, everything ran perfectly.

Summary

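It really came down to a version problem. Although it was my first time seeing it, more than two hours of searching without leaving my seat got it solved. The process was tiring, but the feeling after fixing it was great. I watched the program run for ten minutes or so and then happily went off to take a shower. Perhaps that is where the fun of coding lies.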
