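Running test.py produced the following problems. All fixes were run inside the tf environment (source activate tf).
1. No module named 'skimage'
Fix:
pip install scikit-image
2. No module named 'mrcnn'
Running pip install keras directly installed version 2.4.2 automatically; running again then failed with an error saying that Keras 2.4.2 requires TensorFlow 2.2 or later.
I have tensorflow-gpu 2.1.0 installed, and the matching Keras version is 2.3.1,
so I uninstalled the Keras I had just installed and reinstalled the specific version:
pip uninstall keras
pip install keras==2.3.1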
Framework | Env name (--env parameter) | Description
TensorFlow 2.2 | tensorflow-2.2 | TensorFlow 2.2.0 + Keras 2.3.1 on Python 3.7.
TensorFlow 2.1 | tensorflow-2.1 | TensorFlow 2.1.0 + Keras 2.3.1 on Python 3.6.
TensorFlow 2.0 | tensorflow-2.0 | TensorFlow 2.0.0 + Keras 2.3.1 on Python 3.6.
TensorFlow 1.15 | tensorflow-1.15 | TensorFlow 1.15.0 + Keras 2.3.1 on Python 3.6.
TensorFlow 1.14 | tensorflow-1.14 | TensorFlow 1.14.0 + Keras 2.2.5 on Python 3.6.
TensorFlow 1.13 | tensorflow-1.13 | TensorFlow 1.13.0 + Keras 2.2.4 on Python 3.6.
TensorFlow 1.12 | tensorflow-1.12 | TensorFlow 1.12.0 + Keras 2.2.4 on Python 3.6.
TensorFlow 1.12 | tensorflow-1.12:py2 | TensorFlow 1.12.0 + Keras 2.2.4 on Python 2.
TensorFlow 1.11 | tensorflow-1.11 | TensorFlow 1.11.0 + Keras 2.2.4 on Python 3.6.
TensorFlow 1.11 | tensorflow-1.11:py2 | TensorFlow 1.11.0 + Keras 2.2.4 on Python 2.
TensorFlow 1.10 | tensorflow-1.10 | TensorFlow 1.10.0 + Keras 2.2.0 on Python 3.6.
TensorFlow 1.10 | tensorflow-1.10:py2 | TensorFlow 1.10.0 + Keras 2.2.0 on Python 2.
TensorFlow 1.9 | tensorflow-1.9 | TensorFlow 1.9.0 + Keras 2.2.0 on Python 3.6.
TensorFlow 1.9 | tensorflow-1.9:py2 | TensorFlow 1.9.0 + Keras 2.2.0 on Python 2.
TensorFlow 1.8 | tensorflow-1.8 | TensorFlow 1.8.0 + Keras 2.1.6 on Python 3.6.
TensorFlow 1.8 | tensorflow-1.8:py2 | TensorFlow 1.8.0 + Keras 2.1.6 on Python 2.
TensorFlow 1.7 | tensorflow-1.7 | TensorFlow 1.7.0 + Keras 2.1.6 on Python 3.6.
TensorFlow 1.7 | tensorflow-1.7:py2 | TensorFlow 1.7.0 + Keras 2.1.6 on Python 2.
TensorFlow 1.5 | tensorflow-1.5 | TensorFlow 1.5.0 + Keras 2.1.6 on Python 3.6.
TensorFlow 1.5 | tensorflow-1.5:py2 | TensorFlow 1.5.0 + Keras 2.1.6 on Python 2.
TensorFlow 1.4 | tensorflow-1.4 | TensorFlow 1.4.0 + Keras 2.0.8 on Python 3.6.
TensorFlow 1.4 | tensorflow-1.4:py2 | TensorFlow 1.4.0 + Keras 2.0.8 on Python 2.
TensorFlow 1.3 | tensorflow-1.3 | TensorFlow 1.3.0 + Keras 2.0.6 on Python 3.6.
TensorFlow 1.3 | tensorflow-1.3:py2 | TensorFlow 1.3.0 + Keras 2.0.6 on Python 2.
TensorFlow 1.2 | tensorflow-1.2 | TensorFlow 1.2.0 + Keras 2.0.6 on Python 3.5.
TensorFlow 1.2 | tensorflow-1.2:py2 | TensorFlow 1.2.0 + Keras 2.0.6 on Python 2.
TensorFlow 1.1 | tensorflow | TensorFlow 1.1.0 + Keras 2.0.6 on Python 3.5.
TensorFlow 1.1 | tensorflow:py2 | TensorFlow 1.1.0 + Keras 2.0.6 on Python 2.
TensorFlow 1.0 | tensorflow-1.0 | TensorFlow 1.0.0 + Keras 2.0.6 on Python 3.5.
TensorFlow 1.0 | tensorflow-1.0:py2 | TensorFlow 1.0.0 + Keras 2.0.6 on Python 2.
TensorFlow 0.12 | tensorflow-0.12 | TensorFlow 0.12.1 + Keras 1.2.2 on Python 3.5.
TensorFlow 0.12 | tensorflow-0.12:py2 | TensorFlow 0.12.1 + Keras 1.2.2 on Python 2.
PyTorch 1.5 | pytorch-1.5 | PyTorch 1.5.0 + fastai 1.0.61 on Python 3.7.
PyTorch 1.4 | pytorch-1.4 | PyTorch 1.4.0 + fastai 1.0.60 on Python 3.6.
PyTorch 1.3 | pytorch-1.3 | PyTorch 1.3.0 + fastai 1.0.60 on Python 3.6.
PyTorch 1.2 | pytorch-1.2 | PyTorch 1.2.0 + fastai 1.0.60 on Python 3.6.
PyTorch 1.1 | pytorch-1.1 | PyTorch 1.1.0 + fastai 1.0.57 on Python 3.6.
PyTorch 1.0 | pytorch-1.0 | PyTorch 1.0.0 + fastai 1.0.51 on Python 3.6.
PyTorch 1.0 | pytorch-1.0:py2 | PyTorch 1.0.0 on Python 2.
PyTorch 0.4 | pytorch-0.4 | PyTorch 0.4.1 on Python 3.6.
PyTorch 0.4 | pytorch-0.4:py2 | PyTorch 0.4.1 on Python 2.
PyTorch 0.3 | pytorch-0.3 | PyTorch 0.3.1 on Python 3.6.
PyTorch 0.3 | pytorch-0.3:py2 | PyTorch 0.3.1 on Python 2.
PyTorch 0.2 | pytorch-0.2 | PyTorch 0.2.0 on Python 3.5.
PyTorch 0.2 | pytorch-0.2:py2 | PyTorch 0.2.0 on Python 2.
PyTorch 0.1 | pytorch-0.1 | PyTorch 0.1.12 on Python 3.
PyTorch 0.1 | pytorch-0.1:py2 | PyTorch 0.1.12 on Python 2.
Theano 0.9 | theano-0.9 | Theano rel-0.8.2 + Keras 2.0.3 on Python 3.5.
Theano 0.9 | theano-0.9:py2 | Theano rel-0.8.2 + Keras 2.0.3 on Python 2.
Caffe | caffe | Caffe rc4 on Python 3.5.
Caffe | caffe:py2 | Caffe rc4 on Python 2.
Torch | torch | Torch 7 with Python 3 env.
Torch | torch:py2 | Torch 7 with Python 2 env.
Chainer 1.23 | chainer-1.23 | Chainer 1.23.0 on Python 3.
Chainer 1.23 | chainer-1.23:py2 | Chainer 1.23.0 on Python 2.
Chainer 2.0 | chainer-2.0 | Chainer 1.23.0 on Python 3.
Chainer 2.0 | chainer-2.0:py2 | Chainer 1.23.0 on Python 2.
MxNet 1.0 | mxnet | MxNet 1.0.0 on Python 3.6.
MxNet 1.0 | mxnet:py2 | MxNet 1.0.0 on Python 2.
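The TensorFlow/Keras pairings in the table above can be turned into a small lookup, which makes it easy to see which Keras release to pin for an installed TensorFlow. This is only a sketch: the dictionary is copied from the table, and the names TF_TO_KERAS and keras_for_tensorflow are made up for this example.

```python
# TensorFlow major.minor -> Keras version, copied from the table above.
TF_TO_KERAS = {
    "2.2": "2.3.1", "2.1": "2.3.1", "2.0": "2.3.1", "1.15": "2.3.1",
    "1.14": "2.2.5",
    "1.13": "2.2.4", "1.12": "2.2.4", "1.11": "2.2.4",
    "1.10": "2.2.0", "1.9": "2.2.0",
    "1.8": "2.1.6", "1.7": "2.1.6", "1.5": "2.1.6",
    "1.4": "2.0.8",
    "1.3": "2.0.6", "1.2": "2.0.6", "1.1": "2.0.6", "1.0": "2.0.6",
    "0.12": "1.2.2",
}


def keras_for_tensorflow(tf_version):
    """Return the Keras version listed for a TensorFlow version, or None."""
    # Only major.minor matters for the lookup, so "2.1.0" becomes "2.1".
    major_minor = ".".join(tf_version.split(".")[:2])
    return TF_TO_KERAS.get(major_minor)


print(keras_for_tensorflow("2.1.0"))  # 2.3.1
```

For the setup in these notes (tensorflow-gpu 2.1.0) the lookup gives 2.3.1, which is the version pinned with pip install keras==2.3.1.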
3. No module named 'IPython'
pip install ipython
4. module 'tensorflow' has no attribute 'log'
In mask-rcnn/Mask_RCNN-master/mrcnn/model.py, line 341, change tf.log to tf.math.log.
5. module 'tensorflow_core._api.v2.sets' has no attribute 'set_intersection'
In mask-rcnn/Mask_RCNN-master/mrcnn/model.py, line 721, change tf.sets.set_intersection to tf.sets.intersection.
6.
Replace tf.sparse_tensor_to_dense with tf.compat.v1.sparse_tensor_to_dense.
7.
Change tf.to_float(tf.gather(...)) to tf.cast(tf.gather(...), dtype=tf.float32); the rest of the expression stays unchanged.
Alternative: in model.py, parallel_model.py and utils.py under the mrcnn folder, change import tensorflow as tf to import tensorflow.compat.v1 as tf.
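The import swap described above can also be scripted rather than edited by hand. The following is a minimal sketch using only the standard library; the function name use_compat_v1 and the .bak backup suffix are choices made for this example, not part of Mask_RCNN.

```python
import re
import shutil
from pathlib import Path

# Matches a whole line that is exactly "import tensorflow as tf" (any indentation).
IMPORT_PATTERN = re.compile(r"^([ \t]*)import tensorflow as tf[ \t]*$", re.MULTILINE)


def use_compat_v1(path):
    """Switch one file to tensorflow.compat.v1; return True if it was changed."""
    path = Path(path)
    source = path.read_text(encoding="utf-8")
    patched = IMPORT_PATTERN.sub(r"\1import tensorflow.compat.v1 as tf", source)
    if patched == source:
        return False  # nothing to do (already patched or import not present)
    # Keep a copy of the original next to the file before overwriting it.
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    path.write_text(patched, encoding="utf-8")
    return True
```

It would be called once per file, for example use_compat_v1("Mask_RCNN-master/mrcnn/model.py"), and likewise for parallel_model.py and utils.py. Running it a second time leaves the file untouched.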
Running demo.py (the training program) produced the following problems. All fixes were run inside the tf environment (source activate tf).
1. No module named 'cv2'
pip install opencv-python
Installation log:
(tf) ubuntu@ubuntu-precision-3630-tower:~$ pip install opencv-python
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting opencv-python
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/d0/f0/cfe88d262c67825b20d396c778beca21829da061717c7aaa8b421ae5132e/opencv_python-4.2.0.34-cp37-cp37m-manylinux1_x86_64.whl (28.2 MB)
|████████████████████████████████| 28.2 MB 141 kB/s
Requirement already satisfied: numpy>=1.14.5 in ./anaconda3/envs/tf/lib/python3.7/site-packages (from opencv-python) (1.19.0)
Installing collected packages: opencv-python
Successfully installed opencv-python-4.2.0.34
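The "No module named ..." errors in these notes share one pattern: the import name differs from the pip package name. A quick pre-flight check along the following lines could list everything that is missing in one pass. It is a sketch; PIP_NAMES and missing_packages are names invented here, and the mapping only covers the modules mentioned in these notes.

```python
import importlib.util

# Import name -> pip package name, for the modules mentioned in these notes.
PIP_NAMES = {
    "skimage": "scikit-image",
    "cv2": "opencv-python",
    "IPython": "ipython",
    "keras": "keras==2.3.1",
}


def missing_packages(import_names, pip_names=PIP_NAMES):
    """Return the pip package names for top-level modules that cannot be found."""
    missing = []
    for name in import_names:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(pip_names.get(name, name))
    return missing


print(missing_packages(PIP_NAMES))
```

Whatever it prints can be passed to pip install inside the tf environment.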
2. module 'tensorflow' has no attribute 'random_shuffle'
In mask-rcnn/Mask_RCNN-master/mrcnn/model.py, lines 555 and 560, change tf.random_shuffle to tf.compat.v1.random_shuffle, as follows:
positive_indices = tf.compat.v1.random_shuffle(positive_indices)[:positive_count]
3. module 'tensorflow' has no attribute 'log'
In mask-rcnn/Mask_RCNN-master/mrcnn/utils.py, lines 201 and 202, change tf.log to tf.math.log.
4. reduce_mean() got an unexpected keyword argument 'keep_dims'
In mask-rcnn/Mask_RCNN-master/mrcnn/model.py, lines 2189 and 2217, change keep_dims to keepdims.
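Most of the fixes in these notes are plain renames (tf.log, tf.sets.set_intersection, tf.sparse_tensor_to_dense, tf.random_shuffle, keep_dims). Since line numbers differ between copies of Mask_RCNN, a text-based rewrite may be easier than locating each line. The sketch below applies those renames to a source string; RENAMES and modernize are names made up for this example, and the result should be reviewed before it is saved over the original file.

```python
import re

# Old spelling (regex) -> new spelling, taken from the fixes in these notes.
RENAMES = [
    (r"\btf\.log\b", "tf.math.log"),
    (r"\btf\.sets\.set_intersection\b", "tf.sets.intersection"),
    (r"\btf\.sparse_tensor_to_dense\b", "tf.compat.v1.sparse_tensor_to_dense"),
    (r"\btf\.random_shuffle\b", "tf.compat.v1.random_shuffle"),
    (r"\bkeep_dims\b", "keepdims"),
]


def modernize(source):
    """Apply every rename to a string of Python source and return the result."""
    for pattern, replacement in RENAMES:
        # The word boundaries keep names such as tf.logging untouched.
        source = re.sub(pattern, replacement, source)
    return source


print(modernize("loss = tf.reduce_mean(tf.log(x), keep_dims=True)"))
```

The patterns do not match their own replacements, so running the function twice gives the same result as running it once.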
5. OperatorNotAllowedInGraphError: using a `tf.Tensor` as a Python `bool` is not allowed in Graph execution. Use Eager execution or decorate this function with @tf.function.
In mask-rcnn/Mask_RCNN-master/mrcnn/model.py, line 2187, comment out the statement if layer.output in self.keras_model.losses: together with the continue below it, and replace them with the following:
6. 'ParallelModel' object has no attribute 'metrics_tensors'
In mask-rcnn/Mask_RCNN-master/mrcnn/model.py, line 2219, change the statement self.keras_model.metrics_tensors.append(loss) to
self.keras_model.add_metric(loss). Keras 2.3.0 changed this API, so add_metric is used instead.
Alternatively, search model.py for self.keras_model.metrics_tensors.append(loss) and add self.keras_model.metrics_tensors = [] before the for loop that contains this statement.
7. 'Model' object has no attribute '_get_distribution_strategy'
This appears to be caused by a version mismatch.
Go to the Anaconda installation directory, then into lib/site-packages/tensorflow_core/python/keras, and edit callbacks.py (back it up first!). Change the code starting at line 1529 from
    # TensorBoard callback involves writing a summary file in a
    # possibly distributed settings.
    self._log_write_dir = distributed_file_utils.write_dirpath(
        self.log_dir, self.model._get_distribution_strategy())  # pylint: disable=protected-access
to
    # In case this callback is used via native Keras, _get_distribution_strategy does not exist.
    if hasattr(self.model, '_get_distribution_strategy'):
      # TensorBoard callback involves writing a summary file in a
      # possibly distributed settings.
      self._log_write_dir = distributed_file_utils.write_dirpath(
          self.log_dir, self.model._get_distribution_strategy())  # pylint: disable=protected-access
    else:
      self._log_write_dir = self.log_dir
Then change the code starting at around line 1738 from
    # Safely remove the unneeded temp files.
    distributed_file_utils.remove_temp_dirpath(
        self.log_dir, self.model._get_distribution_strategy())  # pylint: disable=protected-access
to
    # In case this callback is used via native Keras, _get_distribution_strategy does not exist.
    if hasattr(self.model, '_get_distribution_strategy'):
      # Safely remove the unneeded temp files.
      distributed_file_utils.remove_temp_dirpath(
          self.log_dir, self.model._get_distribution_strategy())  # pylint: disable=protected-access
Source: https://blog.csdn.net/weixin_44563427/article/details/104904555
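The logic of the callbacks.py patch can be seen in isolation with stand-in objects. This is only an illustration of the hasattr fallback: the two model classes and the write_dirpath function below are hypothetical placeholders, not the TensorFlow implementations.

```python
class NativeKerasModel:
    """Stand-in for a standalone Keras model: no _get_distribution_strategy."""


class TfKerasModel:
    """Stand-in for a tf.keras model that exposes the private method."""

    def _get_distribution_strategy(self):
        return "strategy"


def write_dirpath(log_dir, strategy):
    """Hypothetical placeholder for distributed_file_utils.write_dirpath."""
    return f"{log_dir}/{strategy}"


def resolve_log_write_dir(model, log_dir):
    """Use the strategy-aware path only when the model supports it."""
    if hasattr(model, "_get_distribution_strategy"):
        return write_dirpath(log_dir, model._get_distribution_strategy())
    # Native Keras model: fall back to the plain log directory.
    return log_dir


print(resolve_log_write_dir(NativeKerasModel(), "logs"))  # logs
print(resolve_log_write_dir(TfKerasModel(), "logs"))      # logs/strategy
```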
8. Warning during training:
FutureWarning: Input image dtype is bool. Interpolation is not defined with bool data type. Please set order to 0 or explicitely cast input image to another data type. Starting from version 0.19 a ValueError will be raised instead of this warning.
order = _validate_interpolation_order(image.dtype, order)
9. UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
This is a memory warning and can be ignored.
10. Using a generator with `use_multiprocessing=True` and multiple workers may duplicate your data. Please consider using the `keras.utils.Sequence class.
UserWarning('Using a generator with `use_multiprocessing=True`'
WARNING:tensorflow:Model failed to serialize as JSON. Ignoring... can't pickle _thread.RLock objects
As a result, training does not follow the image id order and a single image gets trained many times, which causes problems.
In mask-rcnn/Mask_RCNN-master/mrcnn/model.py, line 2402, change
workers=workers,
use_multiprocessing=True,
to
workers=1, use_multiprocessing=False,
This switches data loading from multiple workers to a single one.
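The warning itself points at keras.utils.Sequence as the alternative to a generator. The idea is that batches are addressed by index, so parallel workers cannot hand out the same data twice. The sketch below shows that protocol (__len__ and __getitem__) as a plain class so that it runs without Keras; in a real project it would subclass keras.utils.Sequence and return (inputs, targets) instead of ids. BatchSequence is a name made up for this example and is not how Mask_RCNN loads data.

```python
import math


class BatchSequence:
    """Index-addressable batches of image ids."""

    def __init__(self, image_ids, batch_size):
        self.image_ids = list(image_ids)
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches in one epoch.
        return math.ceil(len(self.image_ids) / self.batch_size)

    def __getitem__(self, index):
        # Batch `index` always covers the same slice of ids.
        start = index * self.batch_size
        return self.image_ids[start:start + self.batch_size]


seq = BatchSequence(range(5), batch_size=2)
print([seq[i] for i in range(len(seq))])  # [[0, 1], [2, 3], [4]]
```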
11. /media/ubuntu/wenjian/mask rcnn/demo.py:92: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
temp = yaml.load(f.read())
In /media/ubuntu/wenjian/mask rcnn/demo.py, line 92, change temp = yaml.load(f.read()) to
temp = yaml.load(f.read(), Loader=yaml.FullLoader)
12. CUDA program failed: libcudart.so.10.0: cannot open shared object file: No such file or directory
In /etc/profile (to edit the read-only file, run sudo nautilus), delete the original lines
export PATH=$PATH:/usr/local/cuda-8.0/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-8.0/lib64
and add the following at the end:
export PATH=$PATH:/usr/local/cuda-10.1/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-10.1/lib64
export LIBRARY_PATH=$LIBRARY_PATH:/usr/local/cuda-10.1/lib64
source /etc/profile
This makes the configuration take effect. Then run the program again.
It still reported a similar error, so I edited ~/.bashrc:
sudo gedit ~/.bashrc
and added at the end:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-10.1/lib64
export PATH=$PATH:/usr/local/cuda-10.1/bin
export CUDA_HOME=/usr/local/cuda-10.1
Create a linker configuration file:
sudo vim /etc/ld.so.conf.d/cuda.conf
Add the following line to it:
/usr/local/cuda/lib64
Then run
sudo ldconfig
After that, rerunning the program successfully used the GPU.
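Before rerunning the whole training script, it can help to check from Python whether the dynamic loader can actually open the CUDA libraries, and whether LD_LIBRARY_PATH still points at an old CUDA install. The following is a sketch using only the standard library; the function names are invented here, and the library names in the example call are assumptions based on the CUDA 10.1 paths and the log messages in these notes.

```python
import ctypes
import os
import re


def unloadable_libraries(names):
    """Return the shared libraries in `names` that the loader cannot open."""
    failed = []
    for name in names:
        try:
            ctypes.CDLL(name)
        except OSError:
            failed.append(name)
    return failed


def stale_cuda_entries(ld_library_path, wanted="10.1"):
    """Return path entries that name a CUDA version other than `wanted`."""
    stale = []
    for entry in ld_library_path.split(":"):
        match = re.search(r"cuda-(\d+(?:\.\d+)*)", entry)
        if match and match.group(1) != wanted:
            stale.append(entry)
    return stale


print(unloadable_libraries(["libcudart.so.10.1", "libnvinfer.so.6"]))
print(stale_cuda_entries(os.environ.get("LD_LIBRARY_PATH", "")))
```

An empty first list means the libraries were found; a non-empty second list means an outdated entry such as /usr/local/cuda-8.0/lib64 is still being exported somewhere.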
13. WARNING:tensorflow:Model failed to serialize as JSON. Ignoring... can't pickle _thread.RLock objects
UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape.
This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
Unresolved.
14.
2020-06-24 10:11:15.484320: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-8.0/lib64:
2020-06-24 10:11:15.484372: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-8.0/lib64:
2020-06-24 10:11:15.484379: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
Using TensorFlow backend.
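I tried to install TensorRT in the tf environment but did not succeed. The installation files are in /home/ubuntu/TensorRT-6.0.1.5, and the originally downloaded files are in /wenjian/已下载软件安装包.
This does not affect running the program.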