Recently I looked into how to use the pretrained BERT models from tensorflow-hub. The official tutorial I found first was inaccessible from China, so I worked through many blog posts and hit many pitfalls; only after finally finding an accessible tutorial did I realize that using the models is actually quite simple.
Versions used in these experiments:
tensorflow: 2.3.0
tensorflow-hub: 0.9.0
python: 3.7.6
Data preparation:
First, anyone familiar with BERT knows it takes three inputs: inputIds, inputMask, and segmentIds. I won't elaborate here, since this is well documented elsewhere.
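As a concrete (and deliberately simplified) sketch of what those three inputs look like for a single sentence, here is a pure-Python helper. `build_bert_inputs` is a hypothetical helper name, and 101/102/0 are the conventional [CLS]/[SEP]/[PAD] ids; the real values come from the model's vocab file.

```python
# Illustrative values only -- the real ids come from the BERT vocab file
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0

def build_bert_inputs(token_ids, max_seq_length):
    """Return (input_ids, input_mask, segment_ids) for one sentence."""
    # Reserve 2 slots for [CLS] and [SEP], truncating if needed
    token_ids = token_ids[:max_seq_length - 2]
    ids = [CLS_ID] + token_ids + [SEP_ID]
    mask = [1] * len(ids)            # 1 = real token, 0 = padding
    pad = max_seq_length - len(ids)
    ids += [PAD_ID] * pad
    mask += [0] * pad
    segs = [0] * max_seq_length      # single sentence -> all segment 0
    return ids, mask, segs

ids, mask, segs = build_bert_inputs([2769, 4263, 872], max_seq_length=8)
# ids  -> [101, 2769, 4263, 872, 102, 0, 0, 0]
# mask -> [1, 1, 1, 1, 1, 0, 0, 0]
# segs -> [0, 0, 0, 0, 0, 0, 0, 0]
```

For a sentence pair, the second sentence's positions would get segment id 1 instead.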
Code to get the BERT outputs directly:
import tensorflow as tf
import tensorflow_hub as hub

BERT_URL = "https://hub.tensorflow.google.cn/tensorflow/bert_zh_L-12_H-768_A-12/2"
max_seq_length = 256
input_word_ids = tf.keras.layers.Input(shape=(max_seq_length,),
                                       dtype=tf.int32, name="input_word_ids")
input_mask = tf.keras.layers.Input(shape=(max_seq_length,),
                                   dtype=tf.int32, name="input_mask")
segment_ids = tf.keras.layers.Input(shape=(max_seq_length,),
                                    dtype=tf.int32, name="segment_ids")
# trainable=False keeps the BERT weights frozen (inference only)
module = hub.KerasLayer(BERT_URL, trainable=False)
# Note the order: it must match [input_word_ids, input_mask, input_type_ids]
pooled_output, sequence_output = module([input_word_ids, input_mask, segment_ids])
# Build the model from the three inputs to the two BERT outputs
model = tf.keras.Model(inputs=[input_word_ids, input_mask, segment_ids],
                       outputs=[pooled_output, sequence_output])
# Get the outputs (inputIds, inputMask, segmentIds are numpy arrays of shape [batch, max_seq_length])
output = model.predict([inputIds, inputMask, segmentIds])
# Results ----> pooled_output: shape=[batch, 768]; sequence_output: shape=[batch, 256, 768]
-------------------------------------------------BUG----------------------------------------------
Here I also tried the approach from the blog in reference link 3 to get the BERT outputs, but ran into a problem:
ValueError: Could not find matching function to call loaded from the SavedModel. Got:
Positional arguments (2 total):
* False
* None
# Experiment 1 -- parameter names taken from https://hub.tensorflow.google.cn/tensorflow/bert_zh_L-12_H-768_A-12/2
outputs,_ = hub_module(input_word_ids=tf.constant(tmp_inputids),
input_mask=tf.constant(tmp_inputMask),
segment_ids=tf.constant(tmp_segmentIds))
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
in
2 outputs,_ = hub_module(input_word_ids=tf.constant(tmp_inputids),
3 input_mask=tf.constant(tmp_inputMask),
----> 4 segment_ids=tf.constant(tmp_segmentIds))
5
6 # # Experiment 2 -- parameter names from the error message
/opt/conda/lib/python3.7/site-packages/tensorflow/python/saved_model/load.py in _call_attribute(instance, *args, **kwargs)
507
508 def _call_attribute(instance, *args, **kwargs):
--> 509 return instance.__call__(*args, **kwargs)
510
511
/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py in __call__(self, *args, **kwds)
778 else:
779 compiler = "nonXla"
--> 780 result = self._call(*args, **kwds)
781
782 new_tracing_count = self._get_tracing_count()
/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py in _call(self, *args, **kwds)
812 # In this case we have not created variables on the first call. So we can
813 # run the first trace but we should fail if variables are created.
--> 814 results = self._stateful_fn(*args, **kwds)
815 if self._created_variables:
816 raise ValueError("Creating variables on a non-first call to a function"
/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/function.py in __call__(self, *args, **kwargs)
2826 """Calls a graph function specialized to the inputs."""
2827 with self._lock:
-> 2828 graph_function, args, kwargs = self._maybe_define_function(args, kwargs)
2829 return graph_function._filtered_call(args, kwargs) # pylint: disable=protected-access
2830
/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/function.py in _maybe_define_function(self, args, kwargs)
3211
3212 self._function_cache.missed.add(call_context_key)
-> 3213 graph_function = self._create_graph_function(args, kwargs)
3214 self._function_cache.primary[cache_key] = graph_function
3215 return graph_function, args, kwargs
/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/function.py in _create_graph_function(self, args, kwargs, override_flat_arg_shapes)
3073 arg_names=arg_names,
3074 override_flat_arg_shapes=override_flat_arg_shapes,
-> 3075 capture_by_value=self._capture_by_value),
3076 self._function_attributes,
3077 function_spec=self.function_spec,
/opt/conda/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py in func_graph_from_py_func(name, python_func, args, kwargs, signature, func_graph, autograph, autograph_options, add_control_dependencies, arg_names, op_return_value, collections, capture_by_value, override_flat_arg_shapes)
984 _, original_func = tf_decorator.unwrap(python_func)
985
--> 986 func_outputs = python_func(*func_args, **func_kwargs)
987
988 # invariant: `func_outputs` contains only Tensors, CompositeTensors,
/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py in wrapped_fn(*args, **kwds)
598 # __wrapped__ allows AutoGraph to swap in a converted function. We give
599 # the function a weak reference to itself to avoid a reference cycle.
--> 600 return weak_wrapped_fn().__wrapped__(*args, **kwds)
601 weak_wrapped_fn = weakref.ref(wrapped_fn)
602
/opt/conda/lib/python3.7/site-packages/tensorflow/python/saved_model/function_deserialization.py in restored_function_body(*args, **kwargs)
255 .format(_pretty_format_positional(args), kwargs,
256 len(saved_function.concrete_functions),
--> 257 "\n\n".join(signature_descriptions)))
258
259 concrete_function_objects = []
ValueError: Could not find matching function to call loaded from the SavedModel. Got:
Positional arguments (2 total):
* False
* None
Keyword arguments: {'input_word_ids': , 'input_mask': , 'segment_ids': }
Expected these arguments to match one of the following 4 option(s):
Option 1:
Positional arguments (3 total):
* [TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/0'), TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/1'), TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/2')]
* True
* None
Keyword arguments: {}
Option 2:
Positional arguments (3 total):
* [TensorSpec(shape=(None, None), dtype=tf.int32, name='input_word_ids'), TensorSpec(shape=(None, None), dtype=tf.int32, name='input_mask'), TensorSpec(shape=(None, None), dtype=tf.int32, name='input_type_ids')]
* True
* None
Keyword arguments: {}
Option 3:
Positional arguments (3 total):
* [TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/0'), TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/1'), TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/2')]
* False
* None
Keyword arguments: {}
Option 4:
Positional arguments (3 total):
* [TensorSpec(shape=(None, None), dtype=tf.int32, name='input_word_ids'), TensorSpec(shape=(None, None), dtype=tf.int32, name='input_mask'), TensorSpec(shape=(None, None), dtype=tf.int32, name='input_type_ids')]
* False
* None
Keyword arguments: {}
From the error message we can see that the BERT model's call accepts 4 alternative forms, each taking 3 positional arguments: the first is the BERT input (a list of three tensors), the second looks like the training flag (bool, True/False in the options), and the third (always None here) is not named in the error; it is presumably the mask argument of a Keras layer call, though the official tutorial does not say.
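To make the matching rule concrete, here is a toy stand-in (pure Python, not the real TF Hub internals) that accepts only the recorded positional form `([ids, mask, type_ids], training, None)` and rejects keyword calls, just like the restored SavedModel does. `restored_bert_call` is a made-up name for illustration.

```python
def restored_bert_call(*args, **kwargs):
    # A restored SavedModel only matches calls against its recorded concrete
    # signatures; keyword names that were never traced cannot match
    if kwargs or len(args) != 3:
        raise ValueError("Could not find matching function: expected "
                         "([ids, mask, type_ids], training, None)")
    inputs, training, mask = args
    if not isinstance(inputs, list) or len(inputs) != 3:
        raise ValueError("first argument must be a list of 3 tensors")
    return "pooled_output", "sequence_output"

# Keyword call fails, just like the traceback above:
try:
    restored_bert_call(input_word_ids=[1], input_mask=[1], input_type_ids=[0])
except ValueError as e:
    print(e)

# Only the positional list form matches:
print(restored_bert_call([[1], [1], [0]], False, None))
```

This is also likely why hub.KerasLayer succeeds where the raw module call fails: Keras fills in the training and mask arguments itself when invoking the wrapped model.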
After changing the parameter names as the error message suggests, I tried again and hit the same problem:
# Experiment 2 -- parameter names taken from Options 2 and 4 of the error message
outputs,_ = hub_module(input_word_ids=tf.constant(tmp_inputids),
input_mask=tf.constant(tmp_inputMask),
input_type_ids=tf.constant(tmp_segmentIds))
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
(stack trace identical to the one above, omitted)
ValueError: Could not find matching function to call loaded from the SavedModel. Got:
Positional arguments (2 total):
* False
* None
Keyword arguments: {'input_word_ids': , 'input_mask': , 'input_type_ids': }
Expected these arguments to match one of the following 4 option(s):
(the same 4 options as above)
The first argument is a list, so I changed the call again:
# Experiment 3
outputs,_ = hub_module([tf.constant(tmp_inputids),
tf.constant(tmp_inputMask),
tf.constant(tmp_segmentIds)])
Sure enough, the same error. A quick search turned up yet another calling style, but it fails identically:
# Experiment 4
bert_inputs = dict(
input_word_ids=tf.constant(tmp_inputids),
input_mask=tf.constant(tmp_inputMask),
input_type_ids=tf.constant(tmp_segmentIds))
outputs, _ = hub_module(bert_inputs)
After all these attempts, it seems the module simply cannot be called directly this way (if you know a working variant of this style, please leave feedback). So let's do it the straightforward way instead: rebuild a model around the pretrained BERT and use predict to get its outputs.
---------------------------------------------------------------------------------------------
Pretrained model + custom layer, full training code:
# URL of the pretrained BERT model; the trailing 2 is the model's version number
# Note that many hub models are still TF1-based, so check carefully which TF version a model targets: some model pages state it explicitly; otherwise, TF2 models are loaded with the tf-hub 0.5.0+ API hub.load(url), while earlier versions used hub.Module(url)
BERT_URL = "https://hub.tensorflow.google.cn/tensorflow/bert_zh_L-12_H-768_A-12/2"
max_seq_length = 256  # sequence length
# Define the 3 inputs
input_word_ids = tf.keras.layers.Input(shape=(max_seq_length,),
                                       dtype=tf.int32, name="input_word_ids")
input_mask = tf.keras.layers.Input(shape=(max_seq_length,),
                                   dtype=tf.int32, name="input_mask")
segment_ids = tf.keras.layers.Input(shape=(max_seq_length,),
                                    dtype=tf.int32, name="segment_ids")
# Pretrained BERT + custom classification layer
module = hub.KerasLayer(BERT_URL, trainable=True)
# Order must match [input_word_ids, input_mask, input_type_ids]
pooled_output, sequence_output = module([input_word_ids, input_mask, segment_ids])
out = tf.keras.layers.Dense(2, activation="softmax")(pooled_output)
model = tf.keras.Model(inputs=[input_word_ids, input_mask, segment_ids], outputs=out)
# The Dense layer already applies softmax, so the loss must expect
# probabilities (from_logits=False); labels are integer class ids (0/1)
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])
# Make sure the inputs and labels are numpy.array
model.fit([inputIds, inputMask, segmentIds], labels,
          batch_size=1, epochs=20,
          validation_data=([inputIds, inputMask, segmentIds], labels),
          verbose=1)
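A quick numeric check of why the head's activation and the loss's from_logits setting must agree (pure Python, values chosen only for illustration): if the layer already applies softmax, a loss configured with from_logits=True would apply softmax a second time and compute the wrong value.

```python
import math

def softmax(logits):
    # Plain softmax; fine for these small illustrative values
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 0.5]          # raw scores for 2 classes
probs = softmax(logits)      # what a softmax Dense layer outputs
assert abs(sum(probs) - 1.0) < 1e-9

# Cross-entropy for true class 0, computed from probabilities:
ce_from_probs = -math.log(probs[0])          # ~0.2014

# A from_logits=True loss fed probabilities re-applies softmax internally,
# yielding a different (wrong, inflated) value:
ce_double_softmax = -math.log(softmax(probs)[0])
print(ce_from_probs, ce_double_softmax)
```

The alternative fix is to drop the softmax activation and keep from_logits=True; either way, the two settings have to match.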
How to fine-tune the pretrained model: pass trainable=False to hub.KerasLayer(), then, following the fine-tuning approach described in the Keras docs, fetch the relevant layers of the module and set their trainable flag back to True. (Not tried.)
Summary:
(1) Load the BERT model via tensorflow-hub;
(2) Build the model with tensorflow.keras: either wire BERT's inputs and outputs up directly, or use the BERT model as one layer inside a larger model;
(3) Train the model, or just extract the BERT outputs;
(4) For fine-tuning, use the pretrained BERT as one layer of the model, then fetch specific layers and toggle their trainable flag; see: the Keras guide on fine-tuning pretrained models
Reference links:
Official tensorflow-hub documentation: https://tensorflow.google.cn/hub/installation
Usage tutorials for tensorflow-hub models (main reference): https://hub.tensorflow.google.cn/s?module-type=text-embedding,text-classification,text-generation,text-language-model,text-question-answering,text-retrieval-question-answering&tf-version=tf2&q=bert
Where I found the tensorflow-hub mirror for China: https://www.cnblogs.com/xingnie/p/12343601.html
----------
This blog post covers much the same ground as mine, just with a different model: https://cloud.tencent.com/developer/article/1537222