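TensorFlow.NET has released v0.3, so let's try using it for image recognition. This example uses TensorFlow.NET and [NumSharp](https://github.com/scisharp/NumSharp) for image recognition: it runs a pre-trained Inception model on an image and outputs the predicted categories sorted by probability. The original paper is here.
The Inception architecture of GoogLeNet was designed to perform well even under strict constraints on memory and computational budget. Inception's computational cost is also much lower than that of earlier approaches. This makes it feasible to use Inception networks in big-data scenarios, where large volumes of data must be processed at reasonable cost, and in scenarios where memory or computational capacity is inherently limited, such as mobile vision settings.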
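The GoogLeNet architecture conforms to the following design principles:
Let's get started with the actual code.
This example downloads the dataset and extracts it automatically. Some external paths are omitted here; refer to the source code for the actual paths.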
private void PrepareData()
{
    Directory.CreateDirectory(dir);

    // get model file
    string url = "models/inception_v3_2016_08_28_frozen.pb.tar.gz";
    string zipFile = Path.Join(dir, $"{pbFile}.tar.gz");
    Utility.Web.Download(url, zipFile);
    Utility.Compress.ExtractTGZ(zipFile, dir);

    // download sample picture
    string pic = "grace_hopper.jpg";
    Utility.Web.Download($"data/{pic}", Path.Join(dir, pic));
}
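We need to load a sample image to test the pre-trained Inception model: convert it into a tensor and normalize the input image. The pre-trained model takes its input as a 4-dimensional tensor of shape [BATCH_SIZE, INPUT_HEIGHT, INPUT_WIDTH, 3], where:
BATCH_SIZE allows multiple images to be processed at once.
INPUT_HEIGHT is the height of the images the model was trained on.
INPUT_WIDTH is the width of the images the model was trained on.
3 is the number of color values per pixel, (R, G, B), each represented as a float.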
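To make the shape concrete, here is a minimal sketch of how a [BATCH_SIZE, INPUT_HEIGHT, INPUT_WIDTH, 3] tensor maps onto a flat buffer, assuming row-major storage. The class and method names (NhwcLayout, Offset, ElementCount) are hypothetical helpers for illustration and are not part of TensorFlow.NET or NumSharp.

```csharp
using System;

public static class NhwcLayout
{
    // Flat offset of element (n, h, w, c) in a row-major buffer
    // of shape [batch, height, width, channels].
    // Assumption: the last dimension (channels) varies fastest.
    public static int Offset(int n, int h, int w, int c,
                             int height, int width, int channels)
    {
        return ((n * height + h) * width + w) * channels + c;
    }

    // Total number of values stored in a tensor of this shape.
    public static int ElementCount(int batch, int height, int width, int channels)
    {
        return batch * height * width * channels;
    }
}
```

With the 299 x 299 input size used in this example, a batch of one image holds 1 * 299 * 299 * 3 values, and the last value (bottom-right pixel, blue channel) sits at offset ElementCount - 1.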
private NDArray ReadTensorFromImageFile(string file_name,
                                        int input_height = 299,
                                        int input_width = 299,
                                        int input_mean = 0,
                                        int input_std = 255)
{
    return with(tf.Graph().as_default(), graph =>
    {
        // read and decode the JPEG file into an RGB image
        var file_reader = tf.read_file(file_name, "file_reader");
        var image_reader = tf.image.decode_jpeg(file_reader, channels: 3, name: "jpeg_reader");
        var caster = tf.cast(image_reader, tf.float32);
        // add the batch dimension and resize to the model's input size
        var dims_expander = tf.expand_dims(caster, 0);
        var resize = tf.constant(new int[] { input_height, input_width });
        var bilinear = tf.image.resize_bilinear(dims_expander, resize);
        // normalize: (pixel - input_mean) / input_std
        var sub = tf.subtract(bilinear, new float[] { input_mean });
        var normalized = tf.divide(sub, new float[] { input_std });
        return with(tf.Session(graph), sess => sess.run(normalized));
    });
}
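The normalization at the end of the graph above is simply (pixel - input_mean) / input_std applied to every value. As a rough illustration in plain .NET, without TensorFlow.NET, the sketch below applies the same arithmetic to a byte array; PixelNormalizer is a hypothetical helper, and its defaults mirror the input_mean = 0 and input_std = 255 defaults shown above.

```csharp
using System;
using System.Linq;

public static class PixelNormalizer
{
    // Applies (pixel - mean) / std to every value.
    // With mean = 0 and std = 255, bytes in [0, 255] map to floats in [0, 1].
    public static float[] Normalize(byte[] pixels, float mean = 0f, float std = 255f)
    {
        return pixels.Select(p => (p - mean) / std).ToArray();
    }
}
```

For example, PixelNormalizer.Normalize(new byte[] { 0, 128, 255 }) maps the darkest value to 0, the brightest to 1, and 128 to roughly the midpoint.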
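Load the pre-trained Inception model, which is saved in Google's protobuf file format, and set up the input and output operations in a new session. After running the session, you get back a NumPy-like ndarray provided by NumSharp. With NumSharp, you can easily perform a wide range of operations on multi-dimensional arrays in the .NET environment.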
public void Run()
{
    PrepareData();

    var labels = File.ReadAllLines(Path.Join(dir, labelFile));

    // load and normalize the sample image
    var nd = ReadTensorFromImageFile(Path.Join(dir, picFile),
                                     input_height: input_height,
                                     input_width: input_width,
                                     input_mean: input_mean,
                                     input_std: input_std);

    // import the frozen graph and look up its input and output operations
    var graph = Graph.ImportFromPB(Path.Join(dir, pbFile));
    var input_operation = graph.get_operation_by_name(input_name);
    var output_operation = graph.get_operation_by_name(output_name);

    // feed the image tensor and fetch the class probabilities
    var results = with(tf.Session(graph),
        sess => sess.run(output_operation.outputs[0],
                         new FeedItem(input_operation.outputs[0], nd)));

    results = np.squeeze(results);

    // argsort is ascending, so the last 5 indices are the top 5 classes
    var argsort = results.argsort();
    var top_k = argsort.Data()
                       .Skip(results.size - 5)
                       .Reverse()
                       .ToArray();

    foreach (float idx in top_k)
        Console.WriteLine($"{picFile}: {idx} {labels[(int)idx]}, {results[(int)idx]}");
}
The highest probability, 0.8343058, is for "military uniform", which is the correct classification.
2/18/2019 3:56:18 AM Starting InceptionArchGoogLeNet
label_image_data\inception_v3_2016_08_28_frozen.pb.tar.gz already exists.
label_image_data\grace_hopper.jpg already exists.
2019-02-19 21:56:18.684463: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
create_op: Const 'file_reader/filename', inputs: empty, control_inputs: empty, outputs: file_reader/filename:0
create_op: ReadFile 'file_reader', inputs: file_reader/filename:0, control_inputs: empty, outputs: file_reader:0
create_op: DecodeJpeg 'jpeg_reader', inputs: file_reader:0, control_inputs: empty, outputs: jpeg_reader:0
create_op: Cast 'Cast/Cast', inputs: jpeg_reader:0, control_inputs: empty, outputs: Cast/Cast:0
create_op: Const 'ExpandDims/dim', inputs: empty, control_inputs: empty, outputs: ExpandDims/dim:0
create_op: ExpandDims 'ExpandDims', inputs: Cast/Cast:0, ExpandDims/dim:0, control_inputs: empty, outputs: ExpandDims:0
create_op: Const 'Const', inputs: empty, control_inputs: empty, outputs: Const:0
create_op: ResizeBilinear 'ResizeBilinear', inputs: ExpandDims:0, Const:0, control_inputs: empty, outputs: ResizeBilinear:0
create_op: Const 'y', inputs: empty, control_inputs: empty, outputs: y:0
create_op: Sub 'Sub', inputs: ResizeBilinear:0, y:0, control_inputs: empty, outputs: Sub:0
create_op: Const 'y_1', inputs: empty, control_inputs: empty, outputs: y_1:0
create_op: RealDiv 'truediv', inputs: Sub:0, y_1:0, control_inputs: empty, outputs: truediv:0
grace_hopper.jpg: 653 military uniform, 0.8343058
grace_hopper.jpg: 668 mortarboard, 0.02186947
grace_hopper.jpg: 401 academic gown, 0.01035806
grace_hopper.jpg: 716 pickelhaube, 0.008008132
grace_hopper.jpg: 466 bulletproof vest, 0.005350832
2/18/2019 3:56:25 AM Completed InceptionArchGoogLeNet
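The five result lines in the log come from the argsort, Skip, Reverse sequence in Run(). The sketch below reproduces that selection with plain LINQ on a float array, so the logic can be tried without a model; TopK.Indices is a hypothetical helper, not a NumSharp API.

```csharp
using System;
using System.Linq;

public static class TopK
{
    // Returns the indices of the k largest scores, highest score first.
    public static int[] Indices(float[] scores, int k)
    {
        return Enumerable.Range(0, scores.Length)
                         .OrderBy(i => scores[i])               // ascending, like argsort
                         .Skip(Math.Max(0, scores.Length - k))  // keep the last k indices
                         .Reverse()                             // highest score first
                         .ToArray();
    }
}
```

For scores { 0.1f, 0.7f, 0.05f, 0.15f } and k = 2, the ascending index order is 2, 0, 3, 1, so the method returns the indices 1 and 3, in that order.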
The complete code is here.
Author: Eric Chen
Source: https://medium.com/scisharp/image-recognition-in-tensorflow-net-and-numsharp-d87b380344dc