This function computes the weighted cross-entropy loss over all examples (if a sentence has n words, then one word together with its label is one example, so all the examples are all the words in the sentence). The logits argument is a list of 2D Tensors, each of shape [batch_size x num_decoder_symbols]. The return value is a 1D float Tensor of length batch_size, where each element is the cross-entropy of the corresponding input sequence. There is also a closely related function, sequence_loss, which applies tf.reduce_sum to the result of sequence_loss_by_example (and, by default, divides by the batch size), so it returns a scalar float Tensor.
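As a quick shape check, here is a minimal sketch (assuming TF 1.x, where these functions live under tf.contrib.legacy_seq2seq; in 0.x releases they sit under tf.nn.seq2seq; the sizes are illustrative):

# Minimal shape sketch for sequence_loss_by_example vs. sequence_loss.
import tensorflow as tf
from tensorflow.contrib.legacy_seq2seq import sequence_loss_by_example, sequence_loss

batch_size, num_steps, vocab_size = 4, 3, 10
logits = [tf.random_normal([batch_size, vocab_size]) for _ in range(num_steps)]
targets = [tf.random_uniform([batch_size], maxval=vocab_size, dtype=tf.int32)
           for _ in range(num_steps)]
weights = [tf.ones([batch_size]) for _ in range(num_steps)]

per_example_loss = sequence_loss_by_example(logits, targets, weights)  # shape [batch_size]
scalar_loss = sequence_loss(logits, targets, weights)  # scalar: summed, then divided by batch_size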
A closer look:
logits has shape [batch_size*num_steps, vocab_size], where vocab_size is the number of (classification) classes
targets has shape [batch_size*num_steps]
For each of the num_steps steps, sequence_loss_by_example takes the [batch_size, vocab_size] logits together with the corresponding integer targets and computes the softmax cross-entropy between them, i.e. the negative log-probability that the softmax over the vocab_size classes assigns to the target class (it does not simply compare the argmax prediction against the target); each step's loss is multiplied by that step's weight and accumulated over the steps.
http://blog.csdn.net/u012436149/article/details/52874718
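Concretely, the per-example loss is equivalent to the following sketch (an illustrative re-implementation in TF 1.x keyword style, not the library code itself; it assumes the default average_across_timesteps=True and the list-of-tensors inputs described above):

# Illustrative re-implementation of sequence_loss_by_example's core computation.
# logits/targets/weights: lists of length num_steps with per-step tensors as above.
crossents = []
for logit, target, weight in zip(logits, targets, weights):
    # logit: [batch_size, vocab_size]; target: [batch_size] integer class indices
    crossent = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=target, logits=logit)
    crossents.append(crossent * weight)        # weight this step's per-example loss
log_perps = tf.add_n(crossents)                # sum over steps -> shape [batch_size]
log_perps /= tf.add_n(weights) + 1e-12         # average by each example's total weight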
# Defines a list of `num_steps` variables, each 1-D with length `batch_size`.
weights = [tf.get_variable("w", [batch_size]) for _ in range(num_steps)]
loss = seq2seq.sequence_loss_by_example([logits_1, ..., logits_n],
                                        [targets_1, ..., targets_n],
                                        weights,
                                        vocab_size_all)
http://stackoverflow.com/questions/35930950/trainable-weight-for-tensorflow-sequence-loss-by-example
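Note that this snippet targets a very early TensorFlow release: the fourth positional argument (vocab_size_all) appears to correspond to the num_decoder_symbols parameter that early versions of sequence_loss_by_example accepted and later versions dropped. The point of the snippet is that the weights can be trainable variables, so the model learns a per-position weighting instead of using the usual all-ones weights.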
The example below needs one real fix: sequence_loss_by_example expects targets that are integer class indices, but labels here arrive one-hot with shape [batch_size, num_steps, num_class], so they must first be converted (e.g. with tf.argmax). Corrected version:
# logit_list is a Python list of length num_steps; each element has shape [batch_size, num_class]
# labels has shape [batch_size, num_steps, num_class] (one-hot)
def loss(self, logit_list, labels, seq_len):
    logits = tf.pack(logit_list)              # shape = [num_steps, batch_size, num_class] (tf.stack in TF >= 1.0)
    logits = tf.transpose(logits, [1, 0, 2])  # shape = [batch_size, num_steps, num_class]
    logits = tf.reshape(logits, [-1, self.config.num_class])  # shape = [batch_size*num_steps, num_class]
    # Convert the one-hot labels into the integer class indices the loss expects.
    labels = tf.reshape(labels, [-1, self.config.num_class])  # shape = [batch_size*num_steps, num_class]
    targets = tf.argmax(labels, 1)                            # shape = [batch_size*num_steps]
    # loss has shape [batch_size*num_steps]
    loss = tf.nn.seq2seq.sequence_loss_by_example(
        [logits],   # output: one [batch_size*num_steps, num_class] tensor
        [targets],  # target: one [batch_size*num_steps] tensor of class indices
        [tf.ones([self.config.batch_size * self.config.num_steps], dtype=tf.float32)])  # uniform weights
    # Zero out the loss contributed by padded positions beyond each sequence's true length.
    mask = self.mask(seq_len)
    clean_loss = loss * mask
    # Average over the batch. Note this divides by batch_size only, not by num_steps;
    # a true per-word average would divide by the actual number of (unpadded) words instead.
    part_cost = tf.reduce_sum(clean_loss) / self.config.batch_size
    total_loss = part_cost + self.l2_loss()
    return total_loss  # handed to the optimizer
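The mask helper is not shown above; a plausible sketch (an assumption, not the original implementation) returns 1.0 for real positions and 0.0 for padding, flattened to align with the [batch_size*num_steps] loss:

# Hypothetical mask helper (assumed, not from the original code).
def mask(self, seq_len):
    # seq_len: [batch_size] int tensor holding each sequence's true length
    m = tf.sequence_mask(seq_len, maxlen=self.config.num_steps, dtype=tf.float32)
    return tf.reshape(m, [-1])  # shape = [batch_size*num_steps], batch-major like the loss

With such a mask, the per-word average raised in the comments above becomes straightforward:

num_words = tf.reduce_sum(tf.cast(seq_len, tf.float32))  # actual number of unpadded words
per_word_cost = tf.reduce_sum(clean_loss) / num_words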