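Multi-label training applies only to an unmodified Inception v3 network, and it is not the same as transfer learning. This post is based on "Multi-label training based on Inception v3" (see the references); it fixes the errors in that post and further improves the code.
Preparing the dataset: suppose there are 3 classes with roughly 50 images each. The number of images must not be too small (as a rule of thumb, at least 25 per class); otherwise validation fails with a division-by-zero error.
First, take a look at the directory structure.
images: create a subdirectory inside it (multi_image in the figure) to hold all the images. Unlike the official retrain from TF v1.1.0 onward, the images of all classes go into this single directory.
retrained_labels.txt: lists the classes in the dataset; the contents depend on your own classes.
image_labels_dir: contains one file per image, named after the image plus ".txt". For example, if the dataset has an image boat.jpg, this directory contains boat.jpg.txt, which stores that image's labels. An image may have several labels; to keep the test simple, I use getclass.sh to generate a single label for each image.
getclass.sh: put the images of one class in a directory named after that class, place this script in that directory, and run it. It creates a .txt file for each image whose content is the directory name. Afterwards, copy the images to images/multi_image/ and the *.jpg.txt files to image_labels_dir.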
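To see why a small class can end up with no validation images, here is a minimal sketch modelled on how the official retrain.py assigns files to the training, testing and validation sets by hashing the file name. The multi-label retrain.py used in this post may differ in details, and the helper names `assign_split` and `split_counts` as well as the file names are made up for illustration.

```python
import hashlib
import re

# Constant used by the official retrain.py when turning a hash into a percentage.
MAX_NUM_IMAGES_PER_CLASS = 2 ** 27 - 1


def assign_split(file_name, validation_percentage=10, testing_percentage=10):
    """Return 'validation', 'testing' or 'training' for one file name."""
    # Anything after '_nohash_' is ignored, so related images share a split.
    hash_name = re.sub(r'_nohash_.*$', '', file_name)
    digest = hashlib.sha1(hash_name.encode('utf-8')).hexdigest()
    percentage_hash = ((int(digest, 16) % (MAX_NUM_IMAGES_PER_CLASS + 1)) *
                       (100.0 / MAX_NUM_IMAGES_PER_CLASS))
    if percentage_hash < validation_percentage:
        return 'validation'
    if percentage_hash < validation_percentage + testing_percentage:
        return 'testing'
    return 'training'


def split_counts(file_names, **kwargs):
    """Count how many of the given files fall into each split."""
    counts = {'training': 0, 'testing': 0, 'validation': 0}
    for name in file_names:
        counts[assign_split(name, **kwargs)] += 1
    return counts


if __name__ == '__main__':
    for n in (5, 25, 50):
        names = ['img_%03d.jpg' % i for i in range(n)]  # hypothetical file names
        print(n, split_counts(names))
```

Because the split depends only on the hash of each name, nothing guarantees that a handful of images puts even one into the validation set, which is the situation behind the division-by-zero error.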
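#!/bin/bash
# Directory that contains this script; its name is used as the class label.
dir=$(cd "$(dirname "$0")"; pwd)
basedir=$(basename "${dir}")
echo "$basedir"
# For every file in that directory, append the class name to <file name>.txt.
for name in "$dir"/*
do
    echo "$name"
    filename=$(basename "${name}")
    echo "$filename"
    echo "$basedir" >> "${filename}.txt"
done
echo "$0"
# Remove the label file that the loop created for the script itself.
rm -rf "$(basename "$0").txt"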
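Before starting a training run, it can help to confirm that every image has a label file and that no class is too small. The following is a minimal sketch, not part of the project: the function name `check_dataset` is hypothetical, and it assumes each label file holds one label per line (which is what getclass.sh produces for a single label).

```python
import os
from collections import Counter

IMAGE_EXTENSIONS = ('.jpg', '.jpeg')


def check_dataset(image_dir, labels_dir, min_per_class=25):
    """Return (images without a label file, images per label, small labels)."""
    missing = []
    counts = Counter()
    for image_name in sorted(os.listdir(image_dir)):
        if not image_name.lower().endswith(IMAGE_EXTENSIONS):
            continue
        label_path = os.path.join(labels_dir, image_name + '.txt')
        if not os.path.isfile(label_path):
            missing.append(image_name)
            continue
        with open(label_path) as f:
            # Assumption: one label per line. A set drops the duplicate lines
            # that appear if getclass.sh is run twice (it appends with >>).
            labels = {line.strip() for line in f if line.strip()}
        counts.update(labels)
    small = sorted(label for label, n in counts.items() if n < min_per_class)
    return missing, counts, small


if __name__ == '__main__':
    image_dir = os.path.join('images', 'multi_image')
    labels_dir = 'image_labels_dir'
    if os.path.isdir(image_dir) and os.path.isdir(labels_dir):
        missing, counts, small = check_dataset(image_dir, labels_dir)
        print('images without a label file:', missing)
        print('images per label:', dict(counts))
        print('labels below the suggested minimum:', small)
```

The default of 25 mirrors the rule of thumb given earlier in this post; it is not a limit enforced by retrain.py.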
The full retrain code is too long to include here; see my GitHub.
Run the eval_retrain.sh script. If you need to change any of its parameters, set them to suit your case.
python retrain.py \
    --bottleneck_dir=bottlenecks \
    --how_many_training_steps=1000 \
    --model_dir=model_dir \
    --output_graph=retrained_graph.pb \
    --output_labels=retrained_labels.txt \
    --summaries_dir=retrain_logs \
    --image_dir=images
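When it finishes, retrained_graph.pb and the log files (which can be visualized in TensorBoard) are generated in the project directory, and the model can then be tested with label_image. Note: if you use the official label_image.py from TF v1.1.0 or later, the following parameters must be changed: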
input_layer = "Mul"
output_layer = "final_result"
label_image_v0.py in the project is the old version; its code is as follows:
import tensorflow as tf
import sys

# change this as you see fit
image_path = sys.argv[1]

# Read in the image_data
image_data = tf.gfile.FastGFile(image_path, 'rb').read()

# Loads label file, strips off carriage return
label_lines = [line.rstrip() for line
               in tf.gfile.GFile("retrained_labels.txt")]

# Unpersists graph from file
with tf.gfile.FastGFile("retrained_graph.pb", 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
    _ = tf.import_graph_def(graph_def, name='')

with tf.Session() as sess:
    # Feed the image_data as input to the graph and get first prediction
    softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
    predictions = sess.run(softmax_tensor,
                           {'DecodeJpeg/contents:0': image_data})

    # Sort to show labels of first prediction in order of confidence
    top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]
    for node_id in top_k:
        human_string = label_lines[node_id]
        score = predictions[0][node_id]
        print('%s (score = %.5f)' % (human_string, score))

    filename = "results.txt"
    with open(filename, 'a+') as f:
        f.write('\n**%s**\n' % (image_path))
        for node_id in top_k:
            human_string = label_lines[node_id]
            score = predictions[0][node_id]
            f.write('%s (score = %.5f)\n' % (human_string, score))
The new version changes considerably; its code is as follows:
# coding=utf-8
# For use with a model retrained on a different dataset
# ======================================================
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import sys

import numpy as np
import tensorflow as tf
def load_graph(model_file):
    graph = tf.Graph()
    graph_def = tf.GraphDef()
    with open(model_file, "rb") as f:
        graph_def.ParseFromString(f.read())
    with graph.as_default():
        tf.import_graph_def(graph_def)
    return graph
def read_tensor_from_image_file(file_name, input_height=299, input_width=299,
                                input_mean=0, input_std=255):
    input_name = "file_reader"
    output_name = "normalized"
    file_reader = tf.read_file(file_name, input_name)
    if file_name.endswith(".png"):
        image_reader = tf.image.decode_png(file_reader, channels=3,
                                           name='png_reader')
    elif file_name.endswith(".gif"):
        image_reader = tf.squeeze(tf.image.decode_gif(file_reader,
                                                      name='gif_reader'))
    elif file_name.endswith(".bmp"):
        image_reader = tf.image.decode_bmp(file_reader, name='bmp_reader')
    else:
        image_reader = tf.image.decode_jpeg(file_reader, channels=3,
                                            name='jpeg_reader')
    float_caster = tf.cast(image_reader, tf.float32)
    dims_expander = tf.expand_dims(float_caster, 0)
    resized = tf.image.resize_bilinear(dims_expander, [input_height, input_width])
    normalized = tf.divide(tf.subtract(resized, [input_mean]), [input_std])
    sess = tf.Session()
    result = sess.run(normalized)
    return result
def load_labels(label_file):
    label = []
    proto_as_ascii_lines = tf.gfile.GFile(label_file).readlines()
    for l in proto_as_ascii_lines:
        label.append(l.rstrip())
    return label
if __name__ == "__main__":
    # file_name = "tensorflow/examples/label_image/data/grace_hopper.jpg"
    # model_file = \
    #     "tensorflow/examples/label_image/data/inception_v3_2016_08_28_frozen.pb"
    # label_file = "tensorflow/examples/label_image/data/imagenet_slim_labels.txt"
    file_name = 'test.jpg'
    model_file = 'retrained_graph.pb'
    label_file = 'retrained_labels.txt'
    input_height = 299
    input_width = 299
    input_mean = 0
    input_std = 255
    # ------- The following two names must be changed
    input_layer = "Mul"
    output_layer = "final_result"

    parser = argparse.ArgumentParser()
    parser.add_argument("--image", help="image to be processed")
    parser.add_argument("--graph", help="graph/model to be executed")
    parser.add_argument("--labels", help="name of file containing labels")
    parser.add_argument("--input_height", type=int, help="input height")
    parser.add_argument("--input_width", type=int, help="input width")
    parser.add_argument("--input_mean", type=int, help="input mean")
    parser.add_argument("--input_std", type=int, help="input std")
    parser.add_argument("--input_layer", help="name of input layer")
    parser.add_argument("--output_layer", help="name of output layer")
    args = parser.parse_args()

    if args.graph:
        model_file = args.graph
    if args.image:
        file_name = args.image
    if args.labels:
        label_file = args.labels
    if args.input_height:
        input_height = args.input_height
    if args.input_width:
        input_width = args.input_width
    if args.input_mean:
        input_mean = args.input_mean
    if args.input_std:
        input_std = args.input_std
    if args.input_layer:
        input_layer = args.input_layer
    if args.output_layer:
        output_layer = args.output_layer

    graph = load_graph(model_file)
    t = read_tensor_from_image_file(file_name,
                                    input_height=input_height,
                                    input_width=input_width,
                                    input_mean=input_mean,
                                    input_std=input_std)

    input_name = "import/" + input_layer
    output_name = "import/" + output_layer
    input_operation = graph.get_operation_by_name(input_name)
    output_operation = graph.get_operation_by_name(output_name)

    with tf.Session(graph=graph) as sess:
        results = sess.run(output_operation.outputs[0],
                           {input_operation.outputs[0]: t})
    results = np.squeeze(results)

    top_k = results.argsort()[-5:][::-1]
    labels = load_labels(label_file)
    for i in top_k:
        print(labels[i], results[i])
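The input_mean and input_std parameters in the script above control how pixel values are scaled before they reach the input layer: each value becomes (value - input_mean) / input_std. The plain-Python sketch below only illustrates that arithmetic on a few sample values; `normalize_pixels` is a hypothetical helper, not a function from the project.

```python
def normalize_pixels(pixels, input_mean=0, input_std=255):
    """Apply (value - input_mean) / input_std to every pixel value."""
    return [(p - input_mean) / float(input_std) for p in pixels]


if __name__ == '__main__':
    sample = [0, 128, 255]  # darkest, mid-grey and brightest 8-bit values
    # Defaults used in the script above.
    print(normalize_pixels(sample))
    # A different, purely illustrative choice of mean and std.
    print(normalize_pixels(sample, 128, 128))
```

Whichever range the graph's input layer was trained on should be matched at test time, so these two values are worth checking if the scores look off.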
Given a test image of a sailboat, running the two programs gives the following results:
xzy@xzy-ThinkPad-S2:~/PycharmProjects/multi-label$ python label_image_v1.py
2018-01-25 12:49:10.944557: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-01-25 12:49:11.320347: W tensorflow/core/framework/op_def_util.cc:343] Op BatchNormWithGlobalNormalization is deprecated. It will cease to work in GraphDef version 9. Use tf.nn.batch_normalization().
junk 0.971204
boat 0.0287959
xzy@xzy-ThinkPad-S2:~/PycharmProjects/multi-label$ python label_image_v0.py test.jpg
2018-01-25 12:49:32.411879: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-01-25 12:49:32.773317: W tensorflow/core/framework/op_def_util.cc:343] Op BatchNormWithGlobalNormalization is deprecated. It will cease to work in GraphDef version 9. Use tf.nn.batch_normalization().
carrier (score = 0.98324)
boat (score = 0.01676)
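As can be seen, the new version of label_image gives the better result on this test, so it is the recommended one.
See GitHub for the details.
References
Multi-label training based on Inception v3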
CRITICAL: tensorflow:Category has no images - validation
An improved retrain that guarantees at least one validation image