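While training the SSD model on my own dataset with TensorFlow, I worked through a lot of blog posts and did manage to train a model of my own. When I tried to test that model, however, the following error appeared.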
2019-10-27 14:47:12.862573: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at save_restore_v2_ops.cc:184 : Not found: Key ssd_300_vgg/block3_box/L2Normalization/gamma not found in checkpoint
Traceback (most recent call last):
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.NotFoundError: Key ssd_300_vgg/block3_box/L2Normalization/gamma not found in checkpoint
  [[{{node save/RestoreV2}}]]
  [[{{node save/RestoreV2}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1276, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Key ssd_300_vgg/block3_box/L2Normalization/gamma not found in checkpoint
  [[node save/RestoreV2 (defined at ssd_notebook.py:53) ]]
  [[node save/RestoreV2 (defined at ssd_notebook.py:53) ]]

Caused by op 'save/RestoreV2', defined at:
  File "ssd_notebook.py", line 53, in <module>
    saver = tf.train.Saver()
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 832, in __init__
    self.build()
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 844, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 881, in _build
    build_save=build_save, build_restore=build_restore)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 513, in _build_internal
    restore_sequentially, reshape)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 332, in _AddRestoreOps
    restore_sequentially)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 580, in bulk_restore
    return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1572, in restore_v2
    name=name)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

NotFoundError (see above for traceback): Key ssd_300_vgg/block3_box/L2Normalization/gamma not found in checkpoint
  [[node save/RestoreV2 (defined at ssd_notebook.py:53) ]]
  [[node save/RestoreV2 (defined at ssd_notebook.py:53) ]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1286, in restore
    names_to_keys = object_graph_key_mapping(save_path)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1591, in object_graph_key_mapping
    checkpointable.OBJECT_GRAPH_PROTO_KEY)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 370, in get_tensor
    status)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: Key _CHECKPOINTABLE_OBJECT_GRAPH not found in checkpoint

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "ssd_notebook.py", line 54, in <module>
    saver.restore(isess, ckpt_filename)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1292, in restore
    err, "a Variable name or other graph key that is missing")
tensorflow.python.framework.errors_impl.NotFoundError: Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key ssd_300_vgg/block3_box/L2Normalization/gamma not found in checkpoint
  [[node save/RestoreV2 (defined at ssd_notebook.py:53) ]]
  [[node save/RestoreV2 (defined at ssd_notebook.py:53) ]]

Caused by op 'save/RestoreV2', defined at:
  File "ssd_notebook.py", line 53, in <module>
    saver = tf.train.Saver()
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 832, in __init__
    self.build()
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 844, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 881, in _build
    build_save=build_save, build_restore=build_restore)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 513, in _build_internal
    restore_sequentially, reshape)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 332, in _AddRestoreOps
    restore_sequentially)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 580, in bulk_restore
    return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1572, in restore_v2
    name=name)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

NotFoundError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key ssd_300_vgg/block3_box/L2Normalization/gamma not found in checkpoint
  [[node save/RestoreV2 (defined at ssd_notebook.py:53) ]]
  [[node save/RestoreV2 (defined at ssd_notebook.py:53) ]]
Tracking this down took a lot of detours, and searching Baidu turned up almost no reports of the same error.
The code I started with was:
ckpt_filename = '../train_model/model.ckpt-1000'
I tried many approaches, such as the one below, but after the change the same error was still reported.
ckpt_filename = tf.train.latest_checkpoint('../train_model/model.ckpt-1000')
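One thing worth noting about the attempt above: `tf.train.latest_checkpoint` expects the directory that holds the checkpoints, not a checkpoint prefix, and it returns `None` when it finds nothing there. The following is a minimal sketch of resolving the path that way, assuming TensorFlow 1.x; the helper names `checkpoint_dir_of` and `resolve_checkpoint` are made up for illustration and are not part of the original code.

```python
import os


def checkpoint_dir_of(prefix):
    """Return the directory containing a checkpoint prefix such as
    '../train_model/model.ckpt-1000' (hypothetical helper)."""
    return os.path.dirname(prefix)


def resolve_checkpoint(prefix):
    """Return the newest checkpoint prefix recorded in the directory of
    `prefix`, falling back to `prefix` itself if none is recorded."""
    import tensorflow as tf  # imported lazily; TensorFlow 1.x is assumed

    # latest_checkpoint() takes the directory and reads its `checkpoint`
    # index file; it returns None when no checkpoint is listed there.
    latest = tf.train.latest_checkpoint(checkpoint_dir_of(prefix))
    return latest if latest is not None else prefix
```

With the paths from this post, `checkpoint_dir_of('../train_model/model.ckpt-1000')` gives `'../train_model'`. This only changes which file is picked; it does not by itself address a key that is missing from the checkpoint.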
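Some posts said the model had not been saved completely, but after many training runs I found that the model had indeed been saved successfully.
Others suggested taking the English message at face value: the key simply does not exist in the ckpt file. After some searching I used the following code to list the keys in the ckpt file.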
import os
from tensorflow.python import pywrap_tensorflow

current_path = '****/SSD_small_object_detection/'
model_dir = os.path.join(current_path, 'train_model')
# Name of the saved ckpt file; yours may differ
checkpoint_path = os.path.join(model_dir, 'model.ckpt-1000')

# Read data from the checkpoint file
reader = pywrap_tensorflow.NewCheckpointReader(checkpoint_path)
var_to_shape_map = reader.get_variable_to_shape_map()

# Print the tensor names
for key in var_to_shape_map:
    print("tensor_name: ", key)
    # print(reader.get_tensor(key))  # prints the values as well; not needed here and it clutters the output
That did produce some output, as shown in the figure below (the list of tensor names stored in the checkpoint).
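The key listing can also be compared against the variables of the graph that the test script builds, which points directly at the names `saver.restore` will fail on. Below is a minimal sketch, assuming TensorFlow 1.x and that the network has already been constructed in the default graph; `missing_keys` and `report_missing` are hypothetical helper names, not part of the original code.

```python
def missing_keys(graph_var_names, checkpoint_keys):
    """Return, sorted, the graph variable names that have no key in the checkpoint."""
    available = set(checkpoint_keys)
    # Graph variable names carry an output suffix such as ':0';
    # checkpoint keys are stored without it.
    stripped = (name.split(':')[0] for name in graph_var_names)
    return sorted(name for name in stripped if name not in available)


def report_missing(checkpoint_path):
    """Compare the default graph's global variables with a checkpoint's keys."""
    import tensorflow as tf  # imported lazily; TensorFlow 1.x is assumed

    reader = tf.train.NewCheckpointReader(checkpoint_path)
    ckpt_keys = reader.get_variable_to_shape_map().keys()
    graph_names = [v.name for v in tf.global_variables()]
    return missing_keys(graph_names, ckpt_keys)
```

For example, `missing_keys(['ssd_300_vgg/block3_box/L2Normalization/gamma:0'], [])` returns `['ssd_300_vgg/block3_box/L2Normalization/gamma']`, which is the key named in the error message.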
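Even with that output, the code was too complicated for me to really follow, so I figured that as a last resort I would step through it in a debugger. Still, I was confident the earlier steps were fine, and eventually I found a solution.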
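So in my own code I changed the definition of saver, replacing saver = tf.train.Saver() (line 53 of ssd_notebook.py in the traceback) with: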
saver = tf.train.import_meta_graph("../train_model/model.ckpt-1000.meta")
With that change, the error was resolved.
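For reference, here is a minimal sketch of the whole restore step with this change, assuming TensorFlow 1.x and that a `.meta` file sits next to the checkpoint; `meta_path_for` and `restore_from_meta` are hypothetical names. `tf.train.import_meta_graph` loads the graph stored in the `.meta` file into the current default graph and returns a Saver built from the saver definition saved with it, so the variable names it asks for are the ones that were written to the checkpoint.

```python
def meta_path_for(ckpt_prefix):
    """Return the .meta file path that accompanies a checkpoint prefix."""
    return ckpt_prefix + '.meta'


def restore_from_meta(sess, ckpt_prefix):
    """Import the saved graph definition and restore its variables into `sess`."""
    import tensorflow as tf  # imported lazily; TensorFlow 1.x is assumed

    # The returned Saver is derived from the graph saved at training time,
    # not from the variables of any network the calling script has built.
    saver = tf.train.import_meta_graph(meta_path_for(ckpt_prefix))
    saver.restore(sess, ckpt_prefix)
    return saver
```

In the notebook this would be called as `restore_from_meta(isess, ckpt_filename)`. Because the restored values belong to the variables of the imported graph, if the script also builds the network itself it may be worth confirming that the detections look reasonable after the change.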