Reference: https://github.com/tensorflow/tensorflow/issues/9829
This error can have many causes; here I only share the one I ran into.
Running the following code in Jupyter produced a "kernel died" error.
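import cv2
import numpy as np
from keras.models import load_model

# Load the model
new_model = load_model('./saved_models/my_model.h5')

def test_reprocess(img_path):
    # Read the image (OpenCV loads it as BGR) and convert it to RGB
    img = cv2.imread(img_path)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    # Add a batch dimension and scale pixel values to [0, 1]
    img = np.expand_dims(img, axis=0)
    img = np.array(img)
    img = (1.0 * img) / 255
    return img

img = test_reprocess('239.jpg')
print(new_model.predict(img))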
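The cause was that I was running another Python program that was training with TensorFlow and had taken all of the GPU memory.
In this program, the last line of the code above also tried to use the GPU, but no memory was left, so it failed.
Shutting down the other program in Jupyter fixed it.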
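If the training program has to keep running, one possible workaround (not what I did above, just a sketch) is to hide the GPU from the notebook so that prediction falls back to the CPU. This relies on the CUDA environment variable CUDA_VISIBLE_DEVICES; the helper name hide_gpus is made up for illustration, and it only has an effect if it runs before TensorFlow/Keras is imported in that kernel.

```python
import os


def hide_gpus(env=None):
    """Hide all CUDA devices from libraries that are imported afterwards.

    hide_gpus is a hypothetical helper name. Setting CUDA_VISIBLE_DEVICES
    to "-1" makes CUDA report no usable GPU, so TensorFlow should run on
    the CPU instead of trying to allocate GPU memory.
    """
    env = os.environ if env is None else env
    env["CUDA_VISIBLE_DEVICES"] = "-1"
    return env


# Call this at the very top of the notebook, before importing keras/tensorflow.
hide_gpus()
# from keras.models import load_model   # import only after hide_gpus()
```

Prediction on the CPU will be slower, but for a single image like 239.jpg that is usually acceptable.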
The same TensorFlow GPU memory problem, when running a Python file from the terminal, produces the following error:
tensorflow.python.framework.errors_impl.InternalError: Failed to create session.
Traceback (most recent call last):
  File "/home/ejior/working/demo/test.py", line 12, in <module>
    new_model = load_model('/home/ejior/working/demo/saved_models/my_model.h5')
  File "/home/ejior/anaconda3/envs/tf/lib/python3.6/site-packages/keras/engine/saving.py", line 419, in load_model
    model = _deserialize_model(f, custom_objects, compile)
  File "/home/ejior/anaconda3/envs/tf/lib/python3.6/site-packages/keras/engine/saving.py", line 287, in _deserialize_model
    K.batch_set_value(weight_value_tuples)
  File "/home/ejior/anaconda3/envs/tf/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2470, in batch_set_value
    get_session().run(assign_ops, feed_dict=feed_dict)
  File "/home/ejior/anaconda3/envs/tf/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 186, in get_session
    _SESSION = tf.Session(config=config)
  File "/home/ejior/anaconda3/envs/tf/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1563, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/home/ejior/anaconda3/envs/tf/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 633, in __init__
    self._session = tf_session.TF_NewSession(self._graph._c_graph, opts)
tensorflow.python.framework.errors_impl.InternalError: Failed to create session.
You can check GPU memory usage with the nvidia-smi command:
(tf) ejior@ejior-XPS-8930:~$ nvidia-smi
Tue Jan 15 12:02:41 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 415.25       Driver Version: 415.25       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:01:00.0  On |                  N/A |
| 18%   42C    P2    70W / 250W |  10737MiB / 10986MiB |     33%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1046      G   /usr/lib/xorg/Xorg                           250MiB |
|    0     16027      C   ...e/ejior/anaconda3/envs/tf/bin/python3.6 10475MiB |
+-----------------------------------------------------------------------------+
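The same check can be scripted, for example to verify that enough memory is free before loading a model. The following is a minimal sketch that calls nvidia-smi with its CSV query options and parses the result; the function names and the needed_mib threshold are my own assumptions, not part of any library.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse lines such as '10737, 10986' into (used_mib, total_mib) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Ask nvidia-smi for used and total memory of each GPU, in MiB."""
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        universal_newlines=True,
    )
    return parse_gpu_memory(out)


def has_free_memory(rows, needed_mib):
    """Return True if at least one GPU has needed_mib MiB free.

    needed_mib is a hypothetical threshold; choose it for your own model.
    """
    return any(total - used >= needed_mib for used, total in rows)


# Example (requires nvidia-smi on the PATH):
# if not has_free_memory(query_gpu_memory(), needed_mib=1024):
#     print("Not enough free GPU memory; another process may be holding it.")
```

With the numbers shown in the output above (10737 MiB used of 10986 MiB), only about 249 MiB is free, so a check for 1024 MiB would fail.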
Killing the process that is holding the GPU memory fixes the problem.
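As a rough sketch of how this could be done from Python instead of by hand: list the compute processes reported by nvidia-smi, then send SIGTERM to the one you want to stop. The function names are hypothetical, and the code assumes nvidia-smi reports a numeric memory value for each process; only terminate processes that you own and recognise.

```python
import os
import signal
import subprocess


def parse_compute_apps(csv_text):
    """Parse 'pid, process_name, used_memory' CSV rows from nvidia-smi."""
    apps = []
    for line in csv_text.strip().splitlines():
        pid, rest = line.split(",", 1)
        name, used = rest.rsplit(",", 1)
        used = used.strip()
        apps.append({
            "pid": int(pid.strip()),
            "name": name.strip(),
            # Some setups report a non-numeric value here; keep None then.
            "used_mib": int(used) if used.isdigit() else None,
        })
    return apps


def list_compute_apps():
    """Return the compute (type C) processes currently using a GPU."""
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-compute-apps=pid,process_name,used_memory",
         "--format=csv,noheader,nounits"],
        universal_newlines=True,
    )
    return parse_compute_apps(out)


def terminate(pid):
    """Ask the process to exit, equivalent to running `kill <pid>`."""
    os.kill(pid, signal.SIGTERM)


# Example (requires nvidia-smi on the PATH):
# for app in list_compute_apps():
#     print(app["pid"], app["name"], app["used_mib"])
# terminate(16027)  # the PID of the python3.6 process in the output above
```

Graphics processes such as Xorg (type G in the table) are not returned by this query, which is convenient because they should not be killed anyway.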