TF2 on TPU — Compilation failure: Dynamic dimension propagation on reversed dimension is not supported

 

When training on a TPU under TensorFlow 2, the model contains a reflect-padding op:

...

tf.pad(inputs, [[0, 0], [10, 10], [10, 10], [0, 0]], mode='reflect') 
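As a quick sanity check of what this op produces, here is a minimal sketch with a hypothetical input (a batch of 256x256 RGB images, chosen to match the 276 that shows up in the error message below; `inputs` is a stand-in, not the original model's tensor). Reflect-mode `tf.pad` lowers to the `MirrorPad` op mentioned in the error:

```python
import tensorflow as tf

# Hypothetical input: a batch of two 256x256 RGB images.
inputs = tf.zeros([2, 256, 256, 3])

# mode='reflect' compiles down to the MirrorPad op on TPU.
# Padding 10 on each side of the two spatial dims: 256 + 10 + 10 = 276.
padded = tf.pad(inputs, [[0, 0], [10, 10], [10, 10], [0, 0]], mode='reflect')
print(padded.shape)  # (2, 276, 276, 3) -- the 276 seen in the error message
```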

...

batch_size = 1 * tpu_strategy.num_replicas_in_sync
data_set = tf.data.Dataset.from_tensor_slices(X_data)  # X_data is a NumPy array
data_set = data_set.batch(batch_size)

...

model.fit(data_set, epochs=1)

Training fails with the following error:

UnimplementedError: {{function_node __inference_train_function_18379}} Compilation failure: Dynamic dimension propagation on reversed dimension is not supported %reverse.1029 = f32[<=2,276,276,3]{3,2,1,0} reverse(f32[<=2,276,276,3]{3,2,1,0} %concatenate.1028), dimensions={0}, metadata={op_type="MirrorPad" op_name="style_transfer_net/tf_op_layer_MirrorPad/MirrorPad"}
    TPU compilation failed
     [[{{node tpu_compile_succeeded_assert/_9843984394415337403/_4}}]]

Inspecting data_set:


print(data_set)

The first dimension of data_set (the batch dimension) is None, i.e. dynamic. Since batch_size was explicitly specified, it should in principle be fixed. Checking the official TensorFlow documentation, it turns out that tf.data.Dataset.batch() has a drop_remainder option, which controls whether to drop the final batch when the remaining elements are fewer than batch_size; without it, the batch dimension cannot be statically known.
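The effect of drop_remainder on the static shape can be seen in a small self-contained sketch (toy dataset, not the original X_data):

```python
import tensorflow as tf

ds = tf.data.Dataset.range(10)

# Without drop_remainder, the last batch may be shorter (10 = 3*3 + 1),
# so the batch dimension is unknown at graph-build time.
print(ds.batch(3).element_spec.shape)                       # (None,)

# With drop_remainder=True, every emitted batch has exactly 3 elements,
# so the batch dimension becomes static.
print(ds.batch(3, drop_remainder=True).element_spec.shape)  # (3,)
```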

So I tried adding drop_remainder=True to the .batch() call, i.e. data_set.batch(batch_size, drop_remainder=True).

Printing data_set again, the first dimension now shows the fixed batch size instead of None.

Problem solved! The root cause should be that on TPU (and in multi-GPU setups), MirrorPad does not support a dynamically varying batch_size.
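Putting it together, the fixed input pipeline looks roughly like this (a sketch with stand-in data: X_data and the replica count here are placeholders, since the post does not show the real values; on an actual TPU you would use tpu_strategy.num_replicas_in_sync):

```python
import numpy as np
import tensorflow as tf

# Stand-in data; in the original post X_data comes from elsewhere.
X_data = np.zeros([33, 256, 256, 3], dtype=np.float32)
num_replicas = 8  # placeholder for tpu_strategy.num_replicas_in_sync

batch_size = 1 * num_replicas
data_set = tf.data.Dataset.from_tensor_slices(X_data)
# drop_remainder=True discards the final short batch (33 % 8 = 1 sample left),
# so every batch has the static shape (8, 256, 256, 3) that TPU compilation needs.
data_set = data_set.batch(batch_size, drop_remainder=True)
print(data_set.element_spec.shape)  # (8, 256, 256, 3)
```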
