Fixing tensorflow.python.framework.errors_impl.NotFoundError: Could not find valid device for node.

Error message:
tensorflow.python.framework.errors_impl.NotFoundError: Could not find valid device for node.
Node:{{node Minimum}}
All kernels registered for op Minimum :
 device='CPU'; T in [DT_FLOAT]
 device='CPU'; T in [DT_HALF]
 device='CPU'; T in [DT_BFLOAT16]
 device='CPU'; T in [DT_DOUBLE]
 device='CPU'; T in [DT_INT32]
 device='CPU'; T in [DT_INT64]
 device='GPU'; T in [DT_FLOAT]
 device='GPU'; T in [DT_HALF]
 device='GPU'; T in [DT_DOUBLE]
 device='GPU'; T in [DT_INT64]
 device='GPU'; T in [DT_INT32]
[Op:Minimum]


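At first glance the log looks like a CPU/GPU driver problem. The real cause is the input data: its dtype is not one that TensorFlow can use for training. The data must be a tensor or NumPy array of a supported type such as float32, int32, float64, int64, or bool. For this particular op, the kernel list in the log shows exactly which dtypes Minimum accepts.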
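A quick way to check a value against that kernel list before it reaches the failing op is sketched below. This is only an illustration: `MINIMUM_DTYPES`, `has_supported_dtype`, and the two sample arrays are hypothetical names, and the set of dtype names is simply read off the log above.

```python
import numpy as np

# Assumption: dtype names read off the kernel list in the log above
# (DT_HALF is float16; bfloat16 is omitted because NumPy has no native bfloat16).
MINIMUM_DTYPES = {"float16", "float32", "float64", "int32", "int64"}


def has_supported_dtype(value):
    """Return True if the dtype of `value` appears in MINIMUM_DTYPES."""
    return np.asarray(value).dtype.name in MINIMUM_DTYPES


# Hypothetical sample data: one real-valued batch and one complex batch.
real_batch = np.zeros((2, 3), dtype=np.float32)
complex_batch = np.zeros((2, 3), dtype=np.complex64)

print(has_supported_dtype(real_batch))
print(has_supported_dtype(complex_batch))
```

Calling the helper on each argument of the failing line is usually enough to find which one carries the unexpected dtype.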

OS: Windows 11
TensorFlow: 1.1.5
CUDA: 10.1
cuDNN: 7.2.1

The following is my custom TensorFlow training step, which runs the forward pass and backpropagation:

@tf.function
def train_step(model_train, stft_data, data_input, targets, optimizer):
    with tf.GradientTape(persistent=True) as tape:
        outputs = model_train(data_input)
        loss_value = loss_cross(targets, np.multiply(stft_data, outputs))  # raises NotFoundError
        loss_value = tf.reduce_sum(model_train.losses) + loss_value

    grads = tape.gradient(loss_value, model_train.trainable_variables)
    optimizer.apply_gradients(zip(grads, model_train.trainable_variables))
    return loss_value

The traceback entries that point to the failing line in my own code:
File "D:\PyCharm\TF\huanyuan\CSPAttUNnetMusic\utils\utils_fit.py", line 27, in train_step
  loss_value = loss_cross(targets, np.multiply(stft_data, outputs))
File "D:\Tool\Anaconda3\envs\TF\lib\site-packages\tensorflow_core\python\keras\losses.py", line 989, in binary_crossentropy
  K.binary_crossentropy(y_true, y_pred, from_logits=from_logits), axis=-1)
File "D:\Tool\Anaconda3\envs\TF\lib\site-packages\tensorflow_core\python\keras\backend.py", line 4472, in binary_crossentropy
  output = clip_ops.clip_by_value(output, epsilon_, 1. - epsilon_)

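When the loss was computed, the value passed in had dtype complex64 (a complex type). As the traceback shows, binary_crossentropy calls clip_by_value on the prediction, which runs the Minimum op, and the log lists no Minimum kernel for complex types, so TensorFlow's loss function cannot be evaluated. The value has to be converted to a tensor or NumPy array of type float32, int32, float64, int64, bool, or similar.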
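The failure can be reproduced outside the training loop with a few lines. The sketch below assumes TensorFlow 2.x running eagerly; `pred_complex` is a made-up stand-in for the product of `stft_data` and `outputs`, and taking the magnitude with `tf.abs` is just one possible way to get a real-valued tensor. The exact exception class depends on the TensorFlow version, so the sketch catches a broad `Exception` and prints its name.

```python
import tensorflow as tf

# Hypothetical complex64 prediction, standing in for stft_data * outputs.
pred_complex = tf.constant([0.2 + 0.1j, 0.9 - 0.3j], dtype=tf.complex64)
eps = 1e-7

# Same clipping call that the Keras backend makes inside binary_crossentropy.
try:
    tf.clip_by_value(pred_complex, eps, 1.0 - eps)
    clip_failed = False
except Exception as err:  # exact class varies across TensorFlow versions
    clip_failed = True
    print("clip_by_value failed with", type(err).__name__)

# Assumption: the magnitude is an acceptable real-valued replacement here.
pred_real = tf.abs(pred_complex)
clipped = tf.clip_by_value(pred_real, eps, 1.0 - eps)
print(pred_real.dtype, clipped.dtype)
```

Once the tensor is real-valued, the same clipping call goes through, which is all the loss function needs.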

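So the fix is to convert the offending value to a supported dtype. Only one line of the original code changes, the one that computes loss_value from loss_cross: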

@tf.function
def train_step(model_train, stft_data, data_input, targets, optimizer):
    with tf.GradientTape(persistent=True) as tape:
        outputs = model_train(data_input)
        loss_value = loss_cross(targets, np.array(np.multiply(stft_data, outputs), np.float32))  # changed line
        loss_value = tf.reduce_sum(model_train.losses) + loss_value

    grads = tape.gradient(loss_value, model_train.trainable_variables)
    optimizer.apply_gradients(zip(grads, model_train.trainable_variables))
    return loss_value
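Two points about the NumPy cast are worth checking in your own project. Casting a complex array to np.float32 keeps only the real part, and tf.GradientTape records only TensorFlow operations, so a value that has passed through NumPy is a constant as far as the tape is concerned. A possible variant that stays inside TensorFlow ops is sketched below. It is not the fix used above: the tiny Dense model, the random data, the use of `tf.abs` (magnitude) as the real-valued conversion, and `tf.keras.losses.binary_crossentropy` standing in for `loss_cross` are all assumptions made for illustration, and `@tf.function` is left off so the sketch runs eagerly.

```python
import numpy as np
import tensorflow as tf


def train_step(model, stft_data, data_input, targets, optimizer):
    """One training step that keeps every operation inside TensorFlow."""
    with tf.GradientTape() as tape:
        outputs = model(data_input, training=True)
        # Assumption: the STFT magnitude is the intended real-valued factor.
        # tf.math.real(stft_data) would keep the real part instead.
        masked = tf.abs(stft_data) * outputs
        loss_value = tf.reduce_mean(
            tf.keras.losses.binary_crossentropy(targets, masked))
        if model.losses:  # add regularization losses when the model has any
            loss_value += tf.add_n(model.losses)
    grads = tape.gradient(loss_value, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss_value


# Hypothetical toy data: 2 samples, 4 features, complex64 "STFT" values.
rng = np.random.default_rng(0)
data_input = rng.random((2, 4)).astype(np.float32)
targets = np.array([[0, 1, 1, 0], [1, 0, 0, 1]], dtype=np.float32)
stft_data = (0.5 * rng.random((2, 4))
             + 0.5j * rng.random((2, 4))).astype(np.complex64)

model = tf.keras.Sequential([tf.keras.layers.Dense(4, activation="sigmoid")])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

model(data_input)  # build the weights so they can be snapshotted
before = [v.numpy().copy() for v in model.trainable_variables]
loss = train_step(model, stft_data, data_input, targets, optimizer)
after = [v.numpy() for v in model.trainable_variables]
print("weights updated:",
      any(not np.allclose(b, a) for b, a in zip(before, after)))
```

If the weights do not change after a step in your own training loop, that is a sign the loss is no longer connected to the model through TensorFlow ops.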

Screenshot of the run after the fix:
[Image 2: screenshot of the successful run]

Summary

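This error comes down to the data not matching the dtypes TensorFlow requires for training. Locate the line named in the traceback and convert the data on that line to a tensor or NumPy array of type float32, int32, float64, int64, bool, or similar.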
