Using tf.contrib.quantize, TensorFlow Lite's quantization tool (updated slowly)

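My first post of 2019, so first of all: Happy New Year, everyone!
This post uses the quantization toolkit that ships with TensorFlow Lite (the official code and user guide are on GitHub). As far as I can tell, the toolkit was only updated in December 2018: the import path changed, and it is now imported directly from tf.contrib.quantize. Few people have tried it so far, so I want to evaluate how well it quantizes and also record how to use it, since the official docs offer very few demos.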
TODO

  • Add the quantization tool to a model
  • Try applying quantization to an already trained model
  • Continue the evaluation

Index

  • Open a trained graph (test)
  • Build a fully quantized model
    • Try the simplest LeNet on the MNIST dataset (with the network wrapped in a class)

Open a trained graph (test)

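I had mostly worked with Keras before, so getting started here took some adjustment.
Environment:
python3.5
tensorflow-gpu==1.12.0
plus a graph that has already been trained and frozen

Here is the code directly, since it is hard to explain in words; use it as a reference.
One confusing point: a .pb file opened this way is a GraphDef, not a tf.Graph. Passing the graph_def straight to the quantization function raises AttributeError: 'GraphDef' object has no attribute 'get_all_collection_keys'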

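import tensorflow as tf
from tensorflow.python.platform import gfile  # provides gfile.FastGFile
from tensorflow.contrib.quantize import *

with tf.Session() as sess:
    # Parse the frozen .pb file into a GraphDef protobuf
    with gfile.FastGFile('./VGG16_freeze.pb', 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        # Import the GraphDef into the session's default graph
        tf.import_graph_def(graph_def, name='vgg')
    # sess.graph is a tf.Graph, which is what the rewriter expects
    graph = sess.graph
    out = create_training_graph(graph, 100)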

However, this code alone still does not work. It fails with the following error:

INFO:tensorflow:Saver not created because there are no variables in the graph to restore
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/quantize/python/quantize.py", line 200, in QuantizeOpWithWeights
    input_idx = next(i for i, v in enumerate(op.inputs)
StopIteration

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "", line 1, in 
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/quantize/python/quantize_graph.py", line 112, in create_training_graph
    device_name_or_function=device_name_or_function)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/quantize/python/quantize_graph.py", line 68, in _create_graph
    quantize.Quantize(g, is_training=is_training)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/quantize/python/quantize.py", line 101, in Quantize
    context.QuantizeOpWithWeights(op, folded=False)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/quantize/python/quantize.py", line 204, in QuantizeOpWithWeights
    raise ValueError('No inputs to quantize for op: %s' % op)
ValueError: No inputs to quantize for op: name: "vgg/block1_conv1/convolution"

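As I read it, this means that loading the graph alone is not enough and the input node also has to be defined, so the full code should add the corresponding tf.placeholder().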

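Notes:
The TF versions differ a lot here. I used 1.6.0 before, where the function definition is completely different from the latest version. It still works, though: 1.6.0 has no quant_delay parameter, so just drop it and call create_training_graph(graph). Anyone on that version can read the function definition directly, so I won't go into it here.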
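If the same script has to run on both versions, one option is to check the function signature at runtime and drop arguments it does not declare. The sketch below is only an illustration in plain Python: call_with_supported_kwargs is a helper I made up, and the two create_training_graph_* functions are hypothetical stand-ins for the two signatures described above, not the real TensorFlow functions.

```python
import inspect


def call_with_supported_kwargs(fn, *args, **kwargs):
    """Call fn, dropping keyword arguments that fn does not declare."""
    params = inspect.signature(fn).parameters
    # If fn itself takes **kwargs, everything can be passed through.
    accepts_any = any(p.kind is inspect.Parameter.VAR_KEYWORD
                      for p in params.values())
    supported = {k: v for k, v in kwargs.items()
                 if accepts_any or k in params}
    return fn(*args, **supported)


# Hypothetical stand-in for the older signature (no quant_delay).
def create_training_graph_old(input_graph=None):
    return ("old", input_graph, None)


# Hypothetical stand-in for the newer signature (with quant_delay).
def create_training_graph_new(input_graph=None, quant_delay=0):
    return ("new", input_graph, quant_delay)


print(call_with_supported_kwargs(create_training_graph_old,
                                 input_graph="g", quant_delay=100))
print(call_with_supported_kwargs(create_training_graph_new,
                                 input_graph="g", quant_delay=100))
```

Silently dropping an argument is convenient but hides the version difference, so in a real script it may be safer to log which arguments were removed.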
I upgraded TF to 1.12.0 anyway, so here is the function definition for that version:

def create_training_graph(input_graph=None, quant_delay=0):
  """Rewrites a training input_graph in place for simulated quantization.
  Variables added by the rewrite get added to the global variables collection.
  This function must be invoked prior to insertion of gradient ops in a graph
  as quantization should be modeled in both forward and backward passes.
  The graph has fake quantization ops inserted to simulate the error
  introduced by quantization. Since the graph is transformed in place,
  the expected behavior of previously held references to nodes and tensors may
  change.
  The default value of quant_delay is suitable for finetuning an already trained
  floating point model (recommended).
  If one wants to train a quantized model from scratch, quant_delay should be
  set to the number of steps it take the floating point model to converge.
  Quantization will be activated at this point and effectively finetune the
  model. If quant_delay is not provided when training from scratch, training can
  often fail.
  Args:
    input_graph: The tf.Graph to be transformed.
    quant_delay: Number of steps after which weights and activations are
      quantized during training.
  Raises:
    ValueError: If elements contains an element that isn't a tf.Tensor or
      tf.Operation.
  """
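
To make the docstring's "fake quantization ops" a bit more concrete, here is a minimal pure-Python sketch of the kind of rounding error they simulate: a float is clamped to a range and snapped to one of 2**num_bits evenly spaced levels, but it stays a float. fake_quantize and its parameters are names I chose for illustration, not the TensorFlow API, and the real ops handle details (for example how the range is tracked and adjusted) that are left out here.

```python
def fake_quantize(x, range_min, range_max, num_bits=8):
    """Snap x to the nearest of 2**num_bits evenly spaced levels in
    [range_min, range_max] and return the result as a float."""
    if range_max <= range_min:
        raise ValueError("range_max must be greater than range_min")
    levels = 2 ** num_bits - 1                # number of steps between levels
    scale = (range_max - range_min) / levels  # width of one step
    clamped = min(max(x, range_min), range_max)
    step_index = round((clamped - range_min) / scale)
    return range_min + step_index * scale


# The difference between input and output is the simulated quantization error.
value = 0.3
approx = fake_quantize(value, 0.0, 1.0)
print(value, approx, abs(value - approx))
```

Because the output is still a float, the rest of the graph can keep training with ordinary float arithmetic while seeing roughly the error an 8-bit model would introduce.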

My current experiment modifies the model definition directly, following the official guide, and it runs:

loss = tf.losses.get_total_loss()
# This is your own loss function
g = tf.get_default_graph()
tf.contrib.quantize.create_training_graph(input_graph=g,
                                          quant_delay=2000000)
# Add the quantization rewrite directly
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
optimizer.minimize(loss)
# Your own optimizer; after this the whole model can be trained
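
The effect of quant_delay can be pictured as a simple gate on the training step. The sketch below is my own reading of the docstring, not TensorFlow code: simulated_value and round_to_quarter are hypothetical names, and the exact boundary (whether quantization starts at step quant_delay or one step later) is an assumption.

```python
def round_to_quarter(x):
    """Toy quantizer: snap x to the nearest multiple of 0.25."""
    return round(x * 4) / 4


def simulated_value(x, step, quant_delay, quantize_fn):
    """Return x unchanged before quant_delay steps, quantize_fn(x) afterwards."""
    if step < quant_delay:
        return x          # still training in plain float
    return quantize_fn(x)  # quantization error is now simulated


for step in (0, 99, 100, 5000):
    print(step, simulated_value(0.3, step, quant_delay=100,
                                quantize_fn=round_to_quarter))
```

Under this reading, with quant_delay=2000000 as in the snippet above, a training run shorter than two million steps would stay in the float regime the whole time, so the delay is worth checking against the actual number of training steps.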

Build a fully quantized model

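The official guide breaks this into 4 steps:
(1) Rewrite the training graph with create_training_graph()
(2) Rewrite the eval graph with create_eval_graph()
These two steps perform fake quantization, simulating the quantization error in the forward and backward passes. After that you build the complete quantized model; TensorFlow Lite can convert the graph produced by create_eval_graph() directly into a fixed-point model.
(3) Freeze the graph with freeze_graph
(4) Convert it with the TensorFlow Lite Optimizing Converter (TOCO)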
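
As a rough illustration of what "fixed-point model" means in the steps above, the sketch below uses the common 8-bit affine scheme real ≈ scale × (q − zero_point), with the float range playing the role of the min/max recorded by the fake quantization ops. The function names are hypothetical and this is not TOCO's code; I am only assuming the converter works along these lines.

```python
def affine_params(range_min, range_max, num_bits=8):
    """Derive (scale, zero_point) so that real ~= scale * (q - zero_point)."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # Widen the range if needed so that real 0.0 is covered.
    range_min = min(range_min, 0.0)
    range_max = max(range_max, 0.0)
    if range_max == range_min:
        raise ValueError("range must have non-zero width")
    scale = (range_max - range_min) / (qmax - qmin)
    zero_point = int(round(qmin - range_min / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(x, scale, zero_point, num_bits=8):
    """Map a float to an unsigned integer code, clamped to the valid range."""
    q = int(round(x / scale)) + zero_point
    return max(0, min(2 ** num_bits - 1, q))


def dequantize(q, scale, zero_point):
    """Map an integer code back to the float it represents."""
    return scale * (q - zero_point)


scale, zero_point = affine_params(-1.0, 3.0)
q = quantize(0.5, scale, zero_point)
print(q, dequantize(q, scale, zero_point))
```

The round trip through quantize and dequantize loses at most about half a step of precision for values inside the range, which is the error the fake quantization ops let the model get used to during training.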
…………………………………………
In practice, though, I ran into a lot of bugs.
Let me start with my own case.

Try the simplest LeNet on the MNIST dataset (with the network wrapped in a class)

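I pulled the source straight from GitHub, and there are plenty of blog posts about it too. The version I used (a hole I dug for myself) has a hand-written LeNet wrapped in a class that bundles the loss, digits, accuracy, optimizer and so on, so modifying it caused a lot of trouble.
Modifying the source
The first step is to add the two graph-rewriting calls. I did not touch the network structure or the training code; I mainly changed the initialization of the Lenet class and the eval part, pasted below.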

class Lenet:
    def __init__(self, is_train=True):  # the is_train flag switches between the two graph-rewriting functions
        self.raw_input_image = tf.placeholder(tf.float32, [None, 784])
        self.input_images = tf.reshape(self.raw_input_image, [-1, 28, 28, 1])
        self.raw_input_label = tf.placeholder("float", [None, 10])
        self.input_labels = tf.cast(self.raw_input_label,tf.int32)
        self.dropout = cfg.KEEP_PROB
        self.is_train = is_train

        with tf.variable_scope("Lenet") as scope:
            self.train_digits = self.construct_net(True)
            scope.reuse_variables()
            self.pred_digits = self.construct_net(False)

        self.prediction = tf.argmax(self.pred_digits, 1)
        self.correct_prediction = tf.equal(tf.argmax(self.pred_digits, 1), tf.argmax(self.input_labels, 1))
        self.train_accuracy = tf.reduce_mean(tf.cast(self.correct_prediction, "float"))


        self.loss = slim.losses.softmax_cross_entropy(self.train_digits, self.input_labels)
        g = tf.get_default_graph()
        # changed part by chutongz: add the two graph-rewriting functions here, nothing more to explain
        if self.is_train:
            create_training_graph(g, 2000000)
        else:
            create_eval_graph(g)

        self.lr = cfg.LEARNING_RATE
        self.train_op = tf.train.AdamOptimizer(self.lr).minimize(self.loss)

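The eval part (bug fixed on January 9, 2019)
Reference blog post
I am not familiar enough with TensorFlow, so I got stuck for a while and hit many bugs around running the process and restoring the model, which I won't detail here. I also took on a new task over the past couple of days, so this post went without an update (muttering quietly).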

import os
import tensorflow as tf
import tensorflow.contrib.slim as slim
import config as cfg
from tensorflow.contrib.quantize import *
import tensorflow.examples.tutorials.mnist.input_data as input_data
from lenet import Lenet
os.environ['CUDA_VISIBLE_DEVICES']='1'

saver = tf.train.import_meta_graph('./checkpoint/variable_q.ckpt.meta')
mnist = input_data.read_data_sets('MNIST_data/', one_hot=True)

with tf.Session() as sess:
    saver.restore(sess, cfg.PARAMETER_FILE)
    g = tf.get_default_graph()
    #print(g.get_operations())
    # After loading the graph, the ops have to be fetched by name first
    input_node = g.get_operation_by_name('Placeholder').outputs[0]
    input_label = g.get_operation_by_name('Placeholder_1').outputs[0]
    # Note: judging by its name, this op is the cross-entropy loss, not an accuracy
    pred = g.get_operation_by_name('softmax_cross_entropy_loss/value').outputs[0]

    # I only use this part to look at the results; it is not important, running it once is enough
    #sess.run(pred, feed_dict={input_node: mnist.test.images, input_label: mnist.test.labels})
    for i in range(200):
        loss_value = sess.run(pred, feed_dict={input_node: mnist.test.images, input_label: mnist.test.labels})
        print("step %d, loss %g" % (i, loss_value))
    
with open('eval_freeze_graph.pb', 'w') as f:
    f.write(str(g.as_graph_def()))

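Output:
LeNet was only trained for about 5000 iterations, so this result is understandable.
The next step is converting the graph.
Just use the bundled freeze_graph tool: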

$ freeze_graph --input_graph=eval_freeze_graph.pb --input_checkpoint=./checkpoint/variable_q.ckpt --output_graph=freeze_eval_graph.pb --output_node_names=Lenet/fc9/Relu

Then use toco to convert it into a .tflite model.
Update, March 1, 2019: for the conversion, see my other blog post.
