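【Preface】
A record of problems encountered while using TensorFlow and how they were solved; some of them remain unresolved. The main use cases are training models, deploying models in the cloud, and deploying models on mobile devices.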
【Demo】
import tensorflow as tf
v1 = tf.get_variable("v1", shape=[1, 10], initializer=tf.truncated_normal_initializer(stddev=0.1))
y = tf.argmax(v1, 1)
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    print("v1 value: {}".format(v1))
    print("v1 value: {}".format(sess.run(v1)))
    print("Compare result: {}".format(y))
    print("Compare result: {}".format(sess.run(y)))
【Problem】
InvalidArgumentError (see above for traceback): Expected dimension in the range [-1, 1), but got 1
[[Node: ArgMax = ArgMax[T=DT_DOUBLE, Tidx=DT_INT32, output_type=DT_INT64, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ArgMax/input, ArgMax/dimension)]]
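【Reason】
The axis argument passed to tf.argmax is out of range. For a tensor of rank r the valid range is [-r, r); the range [-1, 1) in the message means the tensor that reached ArgMax was rank 1, so axis 1 does not exist. (The node in the traceback has T=DT_DOUBLE and a constant input, so the failing call was likely made on a rank-1 array rather than on the [1, 10] variable in the Demo, for which axis 1 is valid.)
【Solve】
Change the axis argument to -1, which always selects the last dimension.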
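To make the range in the error message concrete, here is a minimal sketch of the bounds rule in plain Python. The helper name normalize_axis is hypothetical and is not part of TensorFlow; it only mirrors the check that produces the message above.

```python
def normalize_axis(axis, rank):
    """Map an axis in [-rank, rank) to its non-negative form, else raise."""
    if not -rank <= axis < rank:
        raise ValueError(
            "Expected dimension in the range [{}, {}), but got {}".format(
                -rank, rank, axis))
    return axis % rank  # -1 maps to rank - 1, the last dimension


# Rank-1 input: -1 is accepted and refers to the only dimension
print(normalize_axis(-1, 1))
# Rank-2 input such as shape [1, 10]: axis 1 is valid
print(normalize_axis(1, 2))
# Rank-1 input with axis 1: rejected, as in the error above
try:
    normalize_axis(1, 1)
except ValueError as err:
    print(err)
```

Using -1 works for any rank of at least 1, which is why it is a safe choice when the rank of the input is not known in advance.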
【Demo】
v1 = tf.get_variable("v1", shape=[1, 10], initializer=tf.truncated_normal_initializer(stddev=0.1))
y = tf.argmax(v1, -1)
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    print("v1 value: {}".format(v1))
    print("v1 value: {}".format(sess.run(v1)))
    print("Compare result: {}".format(y))
    print("Compare result: {}".format(sess.run(y)))
【Result】
v1 value:
v1 value: [[ 0.00457896 -0.15347771 -0.02465441 0.00503899 0.03228834 0.15578857
0.06421228 0.04700307 -0.05741185 0.0248824 ]]
Compare result: Tensor("ArgMax:0", shape=(1,), dtype=int64)
Compare result: [5]
【Problem】
RuntimeWarning: compiletime version 3.5 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.6
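【Solve】
No clean fix was found. The TensorFlow binary was compiled against Python 3.5 but is running under Python 3.6; downgrading to Python 3.5 makes the two versions match.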
【Problem】
Variable conv_w1 already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:
File "", line 58, in init_conv_weights
weights = tf.get_variable(name=name, shape=shape, dtype=tf.float32, initializer=tf.contrib.layers.xavier_initializer_conv2d())
File "", line 101, in Inference
cw_1 = init_conv_weights([3, 3, 3, 16], name='conv_w1')
File "", line 42, in
class Inference(object):
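【Reason】
The variable conv_w1 is being created a second time in the same graph, so the computation graph needs to be rebuilt.
【Solve】
At the top of the file, add a statement that resets the TensorFlow default graph: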
import tensorflow as tf
tf.reset_default_graph()
【Problem】
WARNING:tensorflow:From /home/SP-in-AI/xindq/couplets/seq2seq.py:12: BasicLSTMCell.__init__ (from tensorflow.python.ops.rnn_cell_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This class is deprecated, please use tf.nn.rnn_cell.LSTMCell, which supports all the feature this cell currently has. Please replace the existing code with tf.nn.rnn_cell.LSTMCell(name='basic_lstm_cell').
2019-01-24 17:33:37.439645: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
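【Reason】
The TensorFlow API has changed: BasicLSTMCell is deprecated and is to be replaced by tf.nn.rnn_cell.LSTMCell(name='basic_lstm_cell'). The cpu_feature_guard line at the end of the log is informational and unrelated to the deprecation.
【Solve】
Replace rnn.BasicLSTMCell with tf.nn.rnn_cell.LSTMCell(name='basic_lstm_cell'), keeping the num_units argument of the original call:
import tensorflow as tf
# num_units is required; 128 is only a placeholder for your own hidden size
cell = tf.nn.rnn_cell.LSTMCell(num_units=128, name='basic_lstm_cell')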
【Problem】
seq2seq can't pickle _thread.RLock objects
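【Reason】
Not clear to me. The workaround below (from [1]) overrides __deepcopy__, which suggests the error is raised when the cell objects are deep-copied.
【Solve】
Add the following to the model in seq2seq.py: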
setattr(tf.contrib.rnn.GRUCell, '__deepcopy__', lambda self, _: self)
setattr(tf.contrib.rnn.BasicLSTMCell, '__deepcopy__', lambda self, _: self)
setattr(tf.contrib.rnn.MultiRNNCell, '__deepcopy__', lambda self, _: self)
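The effect of the setattr patch can be reproduced without TensorFlow. The sketch below assumes the cells fail to copy because they hold a thread lock; FakeCell is a hypothetical stand-in, not a TensorFlow class.

```python
import copy
import threading


class FakeCell(object):
    """Hypothetical stand-in for an RNN cell that holds a thread lock."""

    def __init__(self):
        self._lock = threading.RLock()


cell = FakeCell()

# Without the patch, deepcopy tries to pickle the RLock and fails
try:
    copy.deepcopy(cell)
    deepcopy_failed = False
except TypeError as err:
    deepcopy_failed = True
    print("deepcopy failed:", err)

# Same patch as above: make deepcopy return the object itself
setattr(FakeCell, '__deepcopy__', lambda self, _: self)
patched_copy = copy.deepcopy(cell)
print("copy is the original object:", patched_copy is cell)
```

The trade-off is that the "copy" is no longer independent: every place that expected its own copy now shares one object and its state.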
【Problem】
RuntimeError: The Session graph is empty. Add operations to the graph before calling run()
【Reason】
Not yet determined.
【Solve】
Not yet resolved.
【Problem】
ValueError: Fetch argument cannot be interpreted as a Tensor. (Tensor Tensor("model_with_buckets/sequence_loss/truediv:0", shape=(), dtype=float32) is not an element of this graph.)
【Reason】
Not yet determined.
【Solve】
Not yet resolved.
【Problem】
The passed save_path is not a valid checkpoint
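【Reason】
The model name or path is incorrect.
【Solve】
Pass the model path up to, but not including, the file extension. For example, if the checkpoint file is named test_chatbot.ckpt-250.meta, use test_chatbot.ckpt-250 as the model name.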
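A small helper can derive the name to pass from any of the checkpoint's file names. This is a sketch that assumes the usual .meta, .index and .data-NNNNN-of-NNNNN suffixes; checkpoint_prefix is a hypothetical name, not a TensorFlow function.

```python
import re

# Suffixes written next to a checkpoint prefix (assumed checkpoint layout)
_CKPT_SUFFIX = re.compile(r"\.(meta|index|data-\d+-of-\d+)$")


def checkpoint_prefix(path):
    """Strip a trailing checkpoint file suffix, leaving the prefix to restore."""
    return _CKPT_SUFFIX.sub("", path)


print(checkpoint_prefix("test_chatbot.ckpt-250.meta"))
print(checkpoint_prefix("test_chatbot.ckpt-250.data-00000-of-00001"))
```

A path that already has no such suffix is returned unchanged, so the helper is safe to apply to whatever path the caller supplies.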
【Problem】
File "/home/xdq/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1139, in _build
raise ValueError("No variables to save")
ValueError: No variables to save
【Reason】
No variables had been defined before the save call.
【Solve】
Define the variables before the save call.
【Problem】
ZeroDivisionError: float division by zero
【Problem】
raise ValueError("Tensor %s is not an element of this graph." % obj)
ValueError: Tensor Tensor("out_2/truediv:0", shape=(?, ?, 5990), dtype=float32) is not an element of this graph
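【Reason】
The model was loaded inside a route handler of the backend service. It is loaded once when the service starts, but when the route is called later, that loaded model is no longer valid.
【Solve】
Load the model globally, that is: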
loadmodel
route
loadmodel
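One way to read the outline above is "load once at module level, then reuse inside the route". The sketch below shows that structure in plain Python; load_model, MODEL and route are hypothetical names, and the lambda is only a placeholder for a real TensorFlow model.

```python
LOAD_COUNT = 0  # only here to show how many times loading happens


def load_model():
    """Hypothetical loader; a real one would restore the TensorFlow model."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    return lambda x: x * 2  # stand-in for a prediction function


# Loaded once, at module level, when the service starts
MODEL = load_model()


def route(request_value):
    """Hypothetical route handler: reuses the global model, never reloads it."""
    return MODEL(request_value)


print(route(2), route(3), LOAD_COUNT)
```

However many requests arrive, the loader runs only once, and every call goes through the same loaded object.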
[References]
[1] https://stackoverflow.com/questions/44855603/typeerror-cant-pickle-thread-lock-objects-in-seq2seq
[2] http://www.cnblogs.com/yanjj/p/8242595.html