Mask R-CNN can only run on Linux: solving Mask R-CNN compatibility problems on Ubuntu 16.04 + CUDA 10.1 + tensorflow-gpu 2.1 + Keras 2.3.1...

Running test.py produced the following problems. All fixes were applied inside the tf environment (source activate tf).

1、no module named 'skimage'

Fix:

pip install scikit-image

2、no module named 'mrcnn'

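Running pip install keras directly installed version 2.4.2 automatically. Running the script again then failed with: Keras 2.4.2 requires TensorFlow 2.2 or higher.

I have tensorflow-gpu 2.1.0 installed, and the matching Keras version should be 2.3.1.

So uninstall the Keras that was just installed and reinstall the specific version: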

pip uninstall keras

pip install keras==2.3.1

Framework         Env name (--env parameter)   Description
TensorFlow 2.2    tensorflow-2.2               TensorFlow 2.2.0 + Keras 2.3.1 on Python 3.7.
TensorFlow 2.1    tensorflow-2.1               TensorFlow 2.1.0 + Keras 2.3.1 on Python 3.6.
TensorFlow 2.0    tensorflow-2.0               TensorFlow 2.0.0 + Keras 2.3.1 on Python 3.6.
TensorFlow 1.15   tensorflow-1.15              TensorFlow 1.15.0 + Keras 2.3.1 on Python 3.6.
TensorFlow 1.14   tensorflow-1.14              TensorFlow 1.14.0 + Keras 2.2.5 on Python 3.6.
TensorFlow 1.13   tensorflow-1.13              TensorFlow 1.13.0 + Keras 2.2.4 on Python 3.6.
TensorFlow 1.12   tensorflow-1.12              TensorFlow 1.12.0 + Keras 2.2.4 on Python 3.6.
                  tensorflow-1.12:py2          TensorFlow 1.12.0 + Keras 2.2.4 on Python 2.
TensorFlow 1.11   tensorflow-1.11              TensorFlow 1.11.0 + Keras 2.2.4 on Python 3.6.
                  tensorflow-1.11:py2          TensorFlow 1.11.0 + Keras 2.2.4 on Python 2.
TensorFlow 1.10   tensorflow-1.10              TensorFlow 1.10.0 + Keras 2.2.0 on Python 3.6.
                  tensorflow-1.10:py2          TensorFlow 1.10.0 + Keras 2.2.0 on Python 2.
TensorFlow 1.9    tensorflow-1.9               TensorFlow 1.9.0 + Keras 2.2.0 on Python 3.6.
                  tensorflow-1.9:py2           TensorFlow 1.9.0 + Keras 2.2.0 on Python 2.
TensorFlow 1.8    tensorflow-1.8               TensorFlow 1.8.0 + Keras 2.1.6 on Python 3.6.
                  tensorflow-1.8:py2           TensorFlow 1.8.0 + Keras 2.1.6 on Python 2.
TensorFlow 1.7    tensorflow-1.7               TensorFlow 1.7.0 + Keras 2.1.6 on Python 3.6.
                  tensorflow-1.7:py2           TensorFlow 1.7.0 + Keras 2.1.6 on Python 2.
TensorFlow 1.5    tensorflow-1.5               TensorFlow 1.5.0 + Keras 2.1.6 on Python 3.6.
                  tensorflow-1.5:py2           TensorFlow 1.5.0 + Keras 2.1.6 on Python 2.
TensorFlow 1.4    tensorflow-1.4               TensorFlow 1.4.0 + Keras 2.0.8 on Python 3.6.
                  tensorflow-1.4:py2           TensorFlow 1.4.0 + Keras 2.0.8 on Python 2.
TensorFlow 1.3    tensorflow-1.3               TensorFlow 1.3.0 + Keras 2.0.6 on Python 3.6.
                  tensorflow-1.3:py2           TensorFlow 1.3.0 + Keras 2.0.6 on Python 2.
TensorFlow 1.2    tensorflow-1.2               TensorFlow 1.2.0 + Keras 2.0.6 on Python 3.5.
                  tensorflow-1.2:py2           TensorFlow 1.2.0 + Keras 2.0.6 on Python 2.
TensorFlow 1.1    tensorflow                   TensorFlow 1.1.0 + Keras 2.0.6 on Python 3.5.
                  tensorflow:py2               TensorFlow 1.1.0 + Keras 2.0.6 on Python 2.
TensorFlow 1.0    tensorflow-1.0               TensorFlow 1.0.0 + Keras 2.0.6 on Python 3.5.
                  tensorflow-1.0:py2           TensorFlow 1.0.0 + Keras 2.0.6 on Python 2.
TensorFlow 0.12   tensorflow-0.12              TensorFlow 0.12.1 + Keras 1.2.2 on Python 3.5.
                  tensorflow-0.12:py2          TensorFlow 0.12.1 + Keras 1.2.2 on Python 2.
PyTorch 1.5       pytorch-1.5                  PyTorch 1.5.0 + fastai 1.0.61 on Python 3.7.
PyTorch 1.4       pytorch-1.4                  PyTorch 1.4.0 + fastai 1.0.60 on Python 3.6.
PyTorch 1.3       pytorch-1.3                  PyTorch 1.3.0 + fastai 1.0.60 on Python 3.6.
PyTorch 1.2       pytorch-1.2                  PyTorch 1.2.0 + fastai 1.0.60 on Python 3.6.
PyTorch 1.1       pytorch-1.1                  PyTorch 1.1.0 + fastai 1.0.57 on Python 3.6.
PyTorch 1.0       pytorch-1.0                  PyTorch 1.0.0 + fastai 1.0.51 on Python 3.6.
                  pytorch-1.0:py2              PyTorch 1.0.0 on Python 2.
PyTorch 0.4       pytorch-0.4                  PyTorch 0.4.1 on Python 3.6.
                  pytorch-0.4:py2              PyTorch 0.4.1 on Python 2.
PyTorch 0.3       pytorch-0.3                  PyTorch 0.3.1 on Python 3.6.
                  pytorch-0.3:py2              PyTorch 0.3.1 on Python 2.
PyTorch 0.2       pytorch-0.2                  PyTorch 0.2.0 on Python 3.5.
                  pytorch-0.2:py2              PyTorch 0.2.0 on Python 2.
PyTorch 0.1       pytorch-0.1                  PyTorch 0.1.12 on Python 3.
                  pytorch-0.1:py2              PyTorch 0.1.12 on Python 2.
Theano 0.9        theano-0.9                   Theano rel-0.8.2 + Keras 2.0.3 on Python 3.5.
                  theano-0.9:py2               Theano rel-0.8.2 + Keras 2.0.3 on Python 2.
Caffe             caffe                        Caffe rc4 on Python 3.5.
                  caffe:py2                    Caffe rc4 on Python 2.
Torch             torch                        Torch 7 with Python 3 env.
                  torch:py2                    Torch 7 with Python 2 env.
Chainer 1.23      chainer-1.23                 Chainer 1.23.0 on Python 3.
                  chainer-1.23:py2             Chainer 1.23.0 on Python 2.
Chainer 2.0       chainer-2.0                  Chainer 1.23.0 on Python 3.
                  chainer-2.0:py2              Chainer 1.23.0 on Python 2.
MxNet 1.0         mxnet                        MxNet 1.0.0 on Python 3.6.
                  mxnet:py2                    MxNet 1.0.0 on Python 2.
3、no module named 'IPython'

pip install ipython
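
The missing-module errors in this post (skimage and IPython here, cv2 later in the training section) have the same cause: the import name differs from the pip package name. A minimal sketch for checking them in one go, assuming it is run inside the tf environment. PIP_NAME and missing_pip_packages are hypothetical names used only for this example.

```python
import importlib.util

# Import name -> pip package name, for the modules this post had to install.
PIP_NAME = {
    "skimage": "scikit-image",
    "IPython": "ipython",
    "cv2": "opencv-python",
}


def missing_pip_packages(import_to_pip=PIP_NAME):
    """Return the pip names of modules that cannot be found in this environment."""
    return [
        pip_name
        for module_name, pip_name in import_to_pip.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_pip_packages()
    if missing:
        # Print the install command instead of running it, so nothing changes silently.
        print("pip install " + " ".join(missing))
    else:
        print("all modules found")
```

The script only reports what is missing; the actual pip install is still run by hand.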

4、module 'tensorflow' has no attribute 'log'

In mask-rcnn/Mask_RCNN-master/mrcnn/model.py, line 341, change tf.log to tf.math.log.

5、module 'tensorflow_core._api.v2.sets' has no attribute 'set_intersection'

In mask-rcnn/Mask_RCNN-master/mrcnn/model.py, line 721, change tf.sets.set_intersection to tf.sets.intersection.

6、(The error message for this item was shown only in screenshots in the original post.)

Replace tf.sparse_tensor_to_dense with tf.compat.v1.sparse_tensor_to_dense.

7、(The error message for this item was shown only in a screenshot in the original post.)

Change tf.to_float(tf.gather(...))[..., tf.newaxis] to tf.cast(tf.gather(...), dtype=tf.float32)[..., tf.newaxis]. The rest of the statement, including axis=1, stays unchanged.

Another approach: in model.py, parallel_model.py and utils.py under the mrcnn folder, change import tensorflow as tf to import tensorflow.compat.v1 as tf.

Running demo.py (the training script) produced the following problems. All fixes were applied inside the tf environment (source activate tf).

1、No module named 'cv2'

pip install opencv-python

The installation output:

(tf) ubuntu@ubuntu-precision-3630-tower:~$ pip install opencv-python

Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple

Collecting opencv-python

Downloading https://pypi.tuna.tsinghua.edu.cn/packages/d0/f0/cfe88d262c67825b20d396c778beca21829da061717c7aaa8b421ae5132e/opencv_python-4.2.0.34-cp37-cp37m-manylinux1_x86_64.whl (28.2 MB)

|████████████████████████████████| 28.2 MB 141 kB/s

Requirement already satisfied: numpy>=1.14.5 in ./anaconda3/envs/tf/lib/python3.7/site-packages (from opencv-python) (1.19.0)

Installing collected packages: opencv-python

Successfully installed opencv-python-4.2.0.34

2、 module 'tensorflow' has no attribute 'random_shuffle'

In mask-rcnn/Mask_RCNN-master/mrcnn/model.py, lines 555 and 560, change tf.random_shuffle to tf.compat.v1.random_shuffle, for example:

positive_indices = tf.compat.v1.random_shuffle(positive_indices)[:positive_count]

3、module 'tensorflow' has no attribute 'log'

In mask-rcnn/Mask_RCNN-master/mrcnn/utils.py, lines 201 and 202, change tf.log to tf.math.log.

4、reduce_mean() got an unexpected keyword argument 'keep_dims'

In mask-rcnn/Mask_RCNN-master/mrcnn/model.py, lines 2189 and 2217, change keep_dims to keepdims.
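
The renames described in this post (tf.log, tf.sets.set_intersection, tf.sparse_tensor_to_dense, tf.random_shuffle, keep_dims) are plain text substitutions, so they could also be applied with a short script instead of by hand. The following is a rough sketch, not a tested migration tool: RENAMES and apply_renames are names invented here, and the replacement works on raw text, so it also touches comments, strings, and any unrelated identifier called keep_dims. tf.to_float is left out because its argument has to be rewritten as well.

```python
import re

# (pattern, replacement) pairs for the TF 1.x names this post renames by hand.
RENAMES = [
    (r"\btf\.log\b", "tf.math.log"),
    (r"\btf\.sets\.set_intersection\b", "tf.sets.intersection"),
    (r"\btf\.sparse_tensor_to_dense\b", "tf.compat.v1.sparse_tensor_to_dense"),
    (r"\btf\.random_shuffle\b", "tf.compat.v1.random_shuffle"),
    (r"\bkeep_dims\b", "keepdims"),
]


def apply_renames(source):
    """Return source text with every listed TF 1.x name replaced."""
    for pattern, replacement in RENAMES:
        source = re.sub(pattern, replacement, source)
    return source


old_line = "x = tf.reduce_mean(tf.log(y), keep_dims=True)"
print(apply_renames(old_line))
# x = tf.reduce_mean(tf.math.log(y), keepdims=True)
```

Running it twice gives the same result as running it once, because none of the new names contain the old ones. It is still worth reviewing the diff of model.py and utils.py before keeping the changes.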

5、OperatorNotAllowedInGraphError: using a `tf.Tensor` as a Python `bool` is not allowed in Graph execution. Use Eager execution or decorate this function with @tf.function.

In mask-rcnn/Mask_RCNN-master/mrcnn/model.py, line 2187, comment out the statement if layer.output in self.keras_model.losses: together with the continue below it, and replace them with the code shown in the original post's screenshot (the image is not preserved here).

6、'ParallelModel' object has no attribute 'metrics_tensors'

In mask-rcnn/Mask_RCNN-master/mrcnn/model.py, line 2219, change the statement self.keras_model.metrics_tensors.append(loss) to

self.keras_model.add_metric(loss), because Keras 2.3.0 changed this API and add_metric is used instead.

Alternatively, search model.py for self.keras_model.metrics_tensors.append(loss) and add self.keras_model.metrics_tensors = [] before the for loop that contains this statement.

7、'Model' object has no attribute '_get_distribution_strategy'

This appears to be caused by a version mismatch.

First go to the Anaconda installation directory, enter lib/site-packages/tensorflow_core/python/keras, and edit callbacks.py (back it up first!). Change the code starting at line 1529 from

    # TensorBoard callback involves writing a summary file in a
    # possibly distributed settings.
    self._log_write_dir = distributed_file_utils.write_dirpath(
        self.log_dir, self.model._get_distribution_strategy())  # pylint: disable=protected-access

to

    # In case this callback is used via native Keras, _get_distribution_strategy does not exist.
    if hasattr(self.model, '_get_distribution_strategy'):
        # TensorBoard callback involves writing a summary file in a
        # possibly distributed settings.
        self._log_write_dir = distributed_file_utils.write_dirpath(
            self.log_dir, self.model._get_distribution_strategy())  # pylint: disable=protected-access
    else:
        self._log_write_dir = self.log_dir

Then change the code starting at around line 1738 from

    # Safely remove the unneeded temp files.
    distributed_file_utils.remove_temp_dirpath(
        self.log_dir, self.model._get_distribution_strategy())  # pylint: disable=protected-access

to

    # In case this callback is used via native Keras, _get_distribution_strategy does not exist.
    if hasattr(self.model, '_get_distribution_strategy'):
        # Safely remove the unneeded temp files.
        distributed_file_utils.remove_temp_dirpath(
            self.log_dir, self.model._get_distribution_strategy())  # pylint: disable=protected-access

Original source: https://blog.csdn.net/weixin_44563427/article/details/104904555
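
The patch above is an instance of a simple pattern: check for a private method with hasattr before calling it, and fall back to a plain value otherwise. The sketch below isolates that pattern so it can be run without TensorFlow. Everything in it is a stand-in invented for illustration: resolve_log_write_dir, the two model classes, and fake_write_dirpath (which only plays the role of distributed_file_utils.write_dirpath).

```python
def resolve_log_write_dir(model, log_dir, write_dirpath):
    """Pick the TensorBoard write directory, tolerating native-Keras models."""
    if hasattr(model, "_get_distribution_strategy"):
        # tf.keras model: let the distribution-aware helper choose the path.
        return write_dirpath(log_dir, model._get_distribution_strategy())
    # Native Keras model: no strategy available, so write to log_dir directly.
    return log_dir


class NativeKerasModel:
    """Hypothetical stand-in for a model without the private method."""


class TfKerasModel:
    """Hypothetical stand-in for a model that has the private method."""

    def _get_distribution_strategy(self):
        return "strategy"


def fake_write_dirpath(log_dir, strategy):
    # Hypothetical helper: tags the path so the branch taken is visible.
    return log_dir + "/" + strategy


print(resolve_log_write_dir(NativeKerasModel(), "logs", fake_write_dirpath))  # logs
print(resolve_log_write_dir(TfKerasModel(), "logs", fake_write_dirpath))  # logs/strategy
```

The first call takes the fallback branch, which corresponds to the else: self._log_write_dir = self.log_dir line added to callbacks.py.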

8、A warning appears during training:

FutureWarning: Input image dtype is bool. Interpolation is not defined with bool data type. Please set order to 0 or explicitely cast input image to another data type. Starting from version 0.19 a ValueError will be raised instead of this warning.

order = _validate_interpolation_order(image.dtype, order)

9、 UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.

This is a memory warning and can be ignored.

10、Using a generator with `use_multiprocessing=True` and multiple workers may duplicate your data. Please consider using the `keras.utils.Sequence class.

UserWarning('Using a generator with `use_multiprocessing=True`'

WARNING:tensorflow:Model failed to serialize as JSON. Ignoring... can't pickle _thread.RLock objects

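As a result, training does not follow the image id order and the same image is trained on many times, which causes problems.

In mask-rcnn/Mask_RCNN-master/mrcnn/model.py, line 2402, change

workers=workers,

use_multiprocessing=True,

to

workers=1, use_multiprocessing=False,

so that data loading runs in a single worker instead of multiple processes.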
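
The warning itself suggests keras.utils.Sequence as the alternative to a generator. This post does not go that way, but the idea can be sketched without Keras: a Sequence is addressed by batch index through __len__ and __getitem__, so each batch index always maps to the same images no matter which worker asks for it. ImageBatches below is a hypothetical class written only to show that protocol; in real use it would subclass keras.utils.Sequence and return (inputs, targets) rather than image ids.

```python
import math


class ImageBatches:
    """Index-addressable batches: batch i always covers the same image ids."""

    def __init__(self, image_ids, batch_size):
        self.image_ids = list(image_ids)
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch.
        return math.ceil(len(self.image_ids) / self.batch_size)

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        start = index * self.batch_size
        return self.image_ids[start:start + self.batch_size]


batches = ImageBatches(image_ids=range(5), batch_size=2)
print([batches[i] for i in range(len(batches))])  # [[0, 1], [2, 3], [4]]
```

A plain generator has no such index, which is why copies of it in several processes can yield the same data more than once.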

11、/media/ubuntu/wenjian/mask rcnn/demo.py:92: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.

temp = yaml.load(f.read())

In /media/ubuntu/wenjian/mask rcnn/demo.py, line 92, change temp = yaml.load(f.read()) to

temp = yaml.load(f.read(),Loader=yaml.FullLoader)

12、cuda程序执行出错: libcudart.so.10.0: cannot open shared object file: No such file or directory

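In /etc/profile (to edit a read-only file, open the file manager with sudo nautilus), delete the original lines

export PATH=$PATH:/usr/local/cuda-8.0/bin

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-8.0/lib64

and append the following at the end:

export PATH=$PATH:/usr/local/cuda-10.1/bin

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-10.1/lib64

export LIBRARY_PATH=$LIBRARY_PATH:/usr/local/cuda-10.1/lib64

Then run

source /etc/profile

so that the configuration takes effect, and run the program again.

A similar error was still reported, so open ~/.bashrc:

sudo gedit ~/.bashrc

Append the following at the end (CUDA_HOME is a single directory, not a colon-separated list, so it is set directly):

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-10.1/lib64

export PATH=$PATH:/usr/local/cuda-10.1/bin

export CUDA_HOME=/usr/local/cuda-10.1

Create a linker configuration file:

sudo vim /etc/ld.so.conf.d/cuda.conf

Add the following line to it:

/usr/local/cuda/lib64

Then run

sudo ldconfig

After that, rerunning the program successfully used the GPU.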
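
When a "cannot open shared object file" error persists after editing the environment variables, it helps to see which directories on LD_LIBRARY_PATH actually contain the library. The following is a small sketch for that check, with dirs_with_library being a name invented here. It only inspects the directories listed in the variable and knows nothing about the ldconfig cache, so an empty result does not by itself mean the library cannot be loaded.

```python
import os


def dirs_with_library(lib_prefix, search_path):
    """Return directories from an os.pathsep-separated path that hold lib_prefix*."""
    hits = []
    for directory in search_path.split(os.pathsep):
        # Skip empty entries (e.g. from a trailing separator) and missing directories.
        if not directory or not os.path.isdir(directory):
            continue
        if any(name.startswith(lib_prefix) for name in os.listdir(directory)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    path = os.environ.get("LD_LIBRARY_PATH", "")
    print(dirs_with_library("libcudart.so", path))
```

If the printed list does not include the cuda-10.1 lib64 directory, the shell running the program has not picked up the new exports yet.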

13、WARNING:tensorflow:Model failed to serialize as JSON. Ignoring... can't pickle _thread.RLock objects

UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape.

This may consume a large amount of memory.

"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "

Unresolved.

14、

2020-06-24 10:11:15.484320: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-8.0/lib64:

2020-06-24 10:11:15.484372: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-8.0/lib64:

2020-06-24 10:11:15.484379: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.

Using TensorFlow backend.

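I tried to install TensorRT in the tf environment but did not succeed. The installation files are in /home/ubuntu/TensorRT-6.0.1.5, and the original download is in /wenjian/已下载软件安装包.

This does not affect running the program.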
