TensorFlow - Semantic Segmentation DeepLab API: Demo
TensorFlow - Semantic Segmentation DeepLab API: ModelZoo
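TensorFlow DeepLab also provides training implementations for three semantic segmentation datasets: PASCAL VOC 2012, Cityscapes, and ADE20K.
Data preparation mainly involves downloading the PASCAL VOC 2012 semantic segmentation dataset and converting it to TensorFlow TFRecord format.
Shell script - download_and_convert_voc2012.sh: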
#!/bin/bash
# Usage:
# bash ./download_and_convert_voc2012.sh
#
# The directory layout around this shell script is assumed to be:
#  + datasets
#    - build_data.py
#    - build_voc2012_data.py
#    - download_and_convert_voc2012.sh
#    - remove_gt_colormap.py
#    + pascal_voc_seg
#      + VOCdevkit
#        + VOC2012
#          + JPEGImages
#          + SegmentationClass
#
# Exit immediately if a command exits with a non-zero status.
set -e
CURRENT_DIR=$(pwd)  # current directory
WORK_DIR="./pascal_voc_seg"
mkdir -p "${WORK_DIR}"
cd "${WORK_DIR}"
# Helper function to download and unpack the PASCAL VOC 2012 segmentation dataset.
download_and_uncompress() {
  local BASE_URL=${1}
  local FILENAME=${2}
  if [ ! -f "${FILENAME}" ]; then
    echo "Downloading ${FILENAME} to ${WORK_DIR}"
    wget -nd -c "${BASE_URL}/${FILENAME}"
  fi
  echo "Uncompressing ${FILENAME}"
  tar -xf "${FILENAME}"
}
# Download the VOC 2012 images.
BASE_URL="http://host.robots.ox.ac.uk/pascal/VOC/voc2012/"
FILENAME="VOCtrainval_11-May-2012.tar"
download_and_uncompress "${BASE_URL}" "${FILENAME}"
cd "${CURRENT_DIR}"
# Root path of the PASCAL VOC 2012 segmentation dataset.
PASCAL_ROOT="${WORK_DIR}/VOCdevkit/VOC2012"
# Remove the colormap in the ground truth annotations.
SEG_FOLDER="${PASCAL_ROOT}/SegmentationClass"
SEMANTIC_SEG_FOLDER="${PASCAL_ROOT}/SegmentationClassRaw"
echo "Removing the color map in ground truth annotations..."
python ./remove_gt_colormap.py \
--original_gt_folder="${SEG_FOLDER}" \
--output_dir="${SEMANTIC_SEG_FOLDER}"
# Build TFRecords of the dataset.
# First, create output directory for storing TFRecords.
OUTPUT_DIR="${WORK_DIR}/tfrecord"
mkdir -p "${OUTPUT_DIR}"
IMAGE_FOLDER="${PASCAL_ROOT}/JPEGImages"
LIST_FOLDER="${PASCAL_ROOT}/ImageSets/Segmentation"
echo "Converting PASCAL VOC 2012 dataset..."
python ./build_voc2012_data.py \
--image_folder="${IMAGE_FOLDER}" \
--semantic_segmentation_folder="${SEMANTIC_SEG_FOLDER}" \
--list_folder="${LIST_FOLDER}" \
--image_format="jpg" \
--output_dir="${OUTPUT_DIR}"
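After the script finishes, it can be worth confirming that TFRecord shards were actually written before starting a training run. The snippet below is a minimal sketch: the helper name count_tfrecords is hypothetical, and it assumes the shards carry a .tfrecord extension inside the output directory, which may differ between DeepLab versions.

```shell
# Hypothetical helper: count the *.tfrecord files directly inside a directory.
# Assumes shard files end in ".tfrecord"; adjust the pattern if yours differ.
count_tfrecords() {
  count=0
  for f in "$1"/*.tfrecord; do
    # When nothing matches, the glob stays literal and fails the -f test.
    if [ -f "$f" ]; then
      count=$((count + 1))
    fi
  done
  echo "${count}"
}

# Example usage, run from the datasets directory:
# count_tfrecords ./pascal_voc_seg/tfrecord
```

A result of 0 suggests that the conversion step did not run or wrote its output somewhere else.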
The VOC2012 data directory layout recommended by TensorFlow DeepLab is:
+ datasets
  + pascal_voc_seg
    + VOCdevkit
      + VOC2012
        + JPEGImages
        + SegmentationClass
    + tfrecord
    + exp
      + train_on_train_set  # stores training outputs, visualizations, and other results
        + train
        + eval
        + vis
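The exp folders in this layout are not produced by the conversion script, so they have to be created separately. A minimal sketch follows; the function name make_exp_dirs is hypothetical, and the folder names simply mirror the tree above.

```shell
# Hypothetical helper: create train/eval/vis under <dataset_root>/exp/<exp_name>.
# $1 = dataset root (e.g. ./pascal_voc_seg), $2 = experiment name.
make_exp_dirs() {
  for sub in train eval vis; do
    mkdir -p "$1/exp/$2/${sub}"
  done
}

# Example usage, run from the datasets directory:
# make_exp_dirs ./pascal_voc_seg train_on_train_set
```

The three resulting paths are natural candidates for the train_logdir, eval_logdir, and vis_logdir flags.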
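Notes:
- 1 - To reproduce the published results, train with a large batch size and set fine_tune_batch_norm = True. The small batch size used here is for demonstration only.
- 2 - With limited GPU memory, fine-tune from the provided checkpoints, whose batch norm parameters are already trained, using a smaller learning rate and fine_tune_batch_norm = False.
- 3 - If output_stride=8, change atrous_rates from [6, 12, 18] to [12, 24, 36].
- 4 - The decoder_output_stride flag can be omitted if the decoder structure is not needed.
Taking xception_65 as an example:
Training: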
# From tensorflow/models/research/
python deeplab/train.py \
--logtostderr \
--training_number_of_steps=30000 \
--train_split="train" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--train_crop_size=513 \
--train_crop_size=513 \
--train_batch_size=1 \
--dataset="pascal_voc_seg" \
--tf_initial_checkpoint=${PATH_TO_INITIAL_CHECKPOINT} \
--train_logdir=${PATH_TO_TRAIN_DIR} \
--dataset_dir=${PATH_TO_DATASET}
Evaluation:
# From tensorflow/models/research/
python deeplab/eval.py \
--logtostderr \
--eval_split="val" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--eval_crop_size=513 \
--eval_crop_size=513 \
--dataset="pascal_voc_seg" \
--checkpoint_dir=${PATH_TO_CHECKPOINT} \
--eval_logdir=${PATH_TO_EVAL_DIR} \
--dataset_dir=${PATH_TO_DATASET}
Here ${PATH_TO_CHECKPOINT} is the path to the trained checkpoint, i.e. train_logdir.
Visualization:
# From tensorflow/models/research/
python deeplab/vis.py \
--logtostderr \
--vis_split="val" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--vis_crop_size=513 \
--vis_crop_size=513 \
--dataset="pascal_voc_seg" \
--checkpoint_dir=${PATH_TO_CHECKPOINT} \
--vis_logdir=${PATH_TO_VIS_DIR} \
--dataset_dir=${PATH_TO_DATASET}
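Here ${PATH_TO_CHECKPOINT} is the path to train_logdir. To also save the raw segmentation predictions, set also_save_raw_predictions = True.
If you follow the recommended directory structure above, the following can be run directly: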
tensorboard --logdir=${PATH_TO_LOG_DIRECTORY}
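where ${PATH_TO_LOG_DIRECTORY} is the directory containing the train, eval, and vis folders, e.g. the train_on_train_set path above. TensorBoard may take a few minutes to prepare the data.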
DeepLab provides the local_test.sh shell script, which runs train.py, eval.py, vis.py, and export_model.py on the PASCAL VOC 2012 dataset.
# From tensorflow/models/research/deeplab
sh local_test.sh
#!/bin/bash
# This script is used to run local test on PASCAL VOC 2012. Users could also
# modify from this script for their use case.
#
# Usage:
# # From the tensorflow/models/research/deeplab directory.
# sh ./local_test.sh
#
# Exit immediately if a command exits with a non-zero status.
set -e
# Move one-level up to tensorflow/models/research directory.
cd ..
# Update PYTHONPATH.
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
# Set up the working environment.
CURRENT_DIR=$(pwd)
WORK_DIR="${CURRENT_DIR}/deeplab"
# Run model_test first to make sure the PYTHONPATH is correctly set.
python "${WORK_DIR}"/model_test.py -v
# Go to datasets folder and download PASCAL VOC 2012 segmentation dataset.
DATASET_DIR="datasets"
cd "${WORK_DIR}/${DATASET_DIR}"
sh download_and_convert_voc2012.sh
# Go back to original directory.
cd "${CURRENT_DIR}"
# Set up the working directories.
PASCAL_FOLDER="pascal_voc_seg"
EXP_FOLDER="exp/train_on_trainval_set"
INIT_FOLDER="${WORK_DIR}/${DATASET_DIR}/${PASCAL_FOLDER}/init_models"
TRAIN_LOGDIR="${WORK_DIR}/${DATASET_DIR}/${PASCAL_FOLDER}/${EXP_FOLDER}/train"
EVAL_LOGDIR="${WORK_DIR}/${DATASET_DIR}/${PASCAL_FOLDER}/${EXP_FOLDER}/eval"
VIS_LOGDIR="${WORK_DIR}/${DATASET_DIR}/${PASCAL_FOLDER}/${EXP_FOLDER}/vis"
EXPORT_DIR="${WORK_DIR}/${DATASET_DIR}/${PASCAL_FOLDER}/${EXP_FOLDER}/export"
mkdir -p "${INIT_FOLDER}"
mkdir -p "${TRAIN_LOGDIR}"
mkdir -p "${EVAL_LOGDIR}"
mkdir -p "${VIS_LOGDIR}"
mkdir -p "${EXPORT_DIR}"
# Copy locally the trained checkpoint as the initial checkpoint.
TF_INIT_ROOT="http://download.tensorflow.org/models"
TF_INIT_CKPT="deeplabv3_pascal_train_aug_2018_01_04.tar.gz"
cd "${INIT_FOLDER}"
wget -nd -c "${TF_INIT_ROOT}/${TF_INIT_CKPT}"
tar -xf "${TF_INIT_CKPT}"
cd "${CURRENT_DIR}"
PASCAL_DATASET="${WORK_DIR}/${DATASET_DIR}/${PASCAL_FOLDER}/tfrecord"
# Train 10 iterations.
NUM_ITERATIONS=10
python "${WORK_DIR}"/train.py \
--logtostderr \
--train_split="trainval" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--train_crop_size=513 \
--train_crop_size=513 \
--train_batch_size=4 \
--training_number_of_steps="${NUM_ITERATIONS}" \
--fine_tune_batch_norm=true \
--tf_initial_checkpoint="${INIT_FOLDER}/deeplabv3_pascal_train_aug/model.ckpt" \
--train_logdir="${TRAIN_LOGDIR}" \
--dataset_dir="${PASCAL_DATASET}"
# Run evaluation.
# This performs eval over the full val split (1449 images) and will take a while.
# Using the provided checkpoint, one should expect mIOU=82.20%.
python "${WORK_DIR}"/eval.py \
--logtostderr \
--eval_split="val" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--eval_crop_size=513 \
--eval_crop_size=513 \
--checkpoint_dir="${TRAIN_LOGDIR}" \
--eval_logdir="${EVAL_LOGDIR}" \
--dataset_dir="${PASCAL_DATASET}" \
--max_number_of_evaluations=1
# Visualize the results.
python "${WORK_DIR}"/vis.py \
--logtostderr \
--vis_split="val" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--vis_crop_size=513 \
--vis_crop_size=513 \
--checkpoint_dir="${TRAIN_LOGDIR}" \
--vis_logdir="${VIS_LOGDIR}" \
--dataset_dir="${PASCAL_DATASET}" \
--max_number_of_iterations=1
# Export the trained checkpoint.
CKPT_PATH="${TRAIN_LOGDIR}/model.ckpt-${NUM_ITERATIONS}"
EXPORT_PATH="${EXPORT_DIR}/frozen_inference_graph.pb"
python "${WORK_DIR}"/export_model.py \
--logtostderr \
--checkpoint_path="${CKPT_PATH}" \
--export_path="${EXPORT_PATH}" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--num_classes=21 \
--crop_size=513 \
--crop_size=513 \
--inference_scales=1.0
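Image testing demo: see TensorFlow - Semantic Segmentation DeepLab API: Demo.
Training on the Cityscapes semantic segmentation dataset is similar to training on PASCAL VOC 2012.
The shell script convert_cityscapes.sh converts the dataset to TFRecord format.
The dataset itself must be downloaded beforehand, which requires registration at https://www.cityscapes-dataset.com/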
# From the tensorflow/models/research/deeplab/datasets directory.
sh convert_cityscapes.sh
#!/bin/bash
# Usage:
# bash ./convert_cityscapes.sh
#
# The folder structure is assumed to be:
#  + datasets
#    - build_cityscapes_data.py
#    - convert_cityscapes.sh
#    + cityscapes
#      + cityscapesscripts (downloaded scripts)
#      + gtFine
#      + leftImg8bit
#
# Exit immediately if a command exits with a non-zero status.
set -e
CURRENT_DIR=$(pwd)
WORK_DIR="."
# Root path for Cityscapes dataset.
CITYSCAPES_ROOT="${WORK_DIR}/cityscapes"
# Create training labels.
python "${CITYSCAPES_ROOT}/cityscapesscripts/preparation/createTrainIdLabelImgs.py"
# Build TFRecords of the dataset.
# First, create output directory for storing TFRecords.
OUTPUT_DIR="${CITYSCAPES_ROOT}/tfrecord"
mkdir -p "${OUTPUT_DIR}"
BUILD_SCRIPT="${CURRENT_DIR}/build_cityscapes_data.py"
echo "Converting Cityscapes dataset..."
python "${BUILD_SCRIPT}" \
--cityscapes_root="${CITYSCAPES_ROOT}" \
--output_dir="${OUTPUT_DIR}"
The converted data is saved to ./deeplab/datasets/cityscapes/tfrecord.
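Because Cityscapes has to be downloaded by hand, the conversion script fails partway through if an expected folder is missing. A pre-flight check along the following lines may help; the function name check_cityscapes_layout is hypothetical, and the folder names are taken from the layout described in the script header.

```shell
# Hypothetical helper: verify the folders the conversion script expects.
# $1 = Cityscapes root. Prints each missing folder to stderr and
# returns non-zero if any of them is absent.
check_cityscapes_layout() {
  missing=0
  for sub in cityscapesscripts gtFine leftImg8bit; do
    if [ ! -d "$1/${sub}" ]; then
      echo "Missing: $1/${sub}" >&2
      missing=1
    fi
  done
  return "${missing}"
}

# Example usage, run from the datasets directory:
# check_cityscapes_layout ./cityscapes && sh convert_cityscapes.sh
```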
The recommended dataset directory structure is:
+ datasets
  + cityscapes
    + leftImg8bit
    + gtFine
    + tfrecord
    + exp
      + train_on_train_set
        + train
        + eval
        + vis
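Notes:
- 1 - To reproduce the published results, train with a large batch size and set fine_tune_batch_norm = True. The small batch size used here is for demonstration only.
- 2 - With limited GPU memory, fine-tune from the provided checkpoints, whose batch norm parameters are already trained, using a smaller learning rate and fine_tune_batch_norm = False.
- 3 - If output_stride=8, change atrous_rates from [6, 12, 18] to [12, 24, 36].
- 4 - The decoder_output_stride flag can be omitted if the decoder structure is not needed.
Taking xception_65 as an example:
Training: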
# From tensorflow/models/research/
python deeplab/train.py \
--logtostderr \
--training_number_of_steps=90000 \
--train_split="train" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--train_crop_size=769 \
--train_crop_size=769 \
--train_batch_size=1 \
--dataset="cityscapes" \
--tf_initial_checkpoint=${PATH_TO_INITIAL_CHECKPOINT} \
--train_logdir=${PATH_TO_TRAIN_DIR} \
--dataset_dir=${PATH_TO_DATASET}
Evaluation:
# From tensorflow/models/research/
python deeplab/eval.py \
--logtostderr \
--eval_split="val" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--eval_crop_size=1025 \
--eval_crop_size=2049 \
--dataset="cityscapes" \
--checkpoint_dir=${PATH_TO_CHECKPOINT} \
--eval_logdir=${PATH_TO_EVAL_DIR} \
--dataset_dir=${PATH_TO_DATASET}
Here ${PATH_TO_CHECKPOINT} is the path to the trained checkpoint, i.e. train_logdir.
Visualization:
# From tensorflow/models/research/
python deeplab/vis.py \
--logtostderr \
--vis_split="val" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--vis_crop_size=1025 \
--vis_crop_size=2049 \
--dataset="cityscapes" \
--colormap_type="cityscapes" \
--checkpoint_dir=${PATH_TO_CHECKPOINT} \
--vis_logdir=${PATH_TO_VIS_DIR} \
--dataset_dir=${PATH_TO_DATASET}
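Here ${PATH_TO_CHECKPOINT} is the path to train_logdir. To also save the raw segmentation predictions, set also_save_raw_predictions = True.
If you follow the recommended directory structure above, the following can be run directly: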
tensorboard --logdir=${PATH_TO_LOG_DIRECTORY}
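where ${PATH_TO_LOG_DIRECTORY} is the directory containing the train, eval, and vis folders, e.g. the train_on_train_set path above. TensorBoard may take a few minutes to prepare the data.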
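Training on the ADE20K semantic segmentation dataset is similar to training on PASCAL VOC 2012.
The shell script download_and_convert_ade20k.sh downloads the dataset and converts it to TFRecord format.
# From the tensorflow/models/research/deeplab/datasets directory.
sh download_and_convert_ade20k.sh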
#!/bin/bash
# Usage:
# bash ./download_and_convert_ade20k.sh
#
# The folder structure is assumed to be:
#  + datasets
#    - build_data.py
#    - build_ade20k_data.py
#    - download_and_convert_ade20k.sh
#    + ADE20K
#      + tfrecord
#      + ADEChallengeData2016
#        + annotations
#          + training
#          + validation
#        + images
#          + training
#          + validation
# Exit immediately if a command exits with a non-zero status.
set -e
CURRENT_DIR=$(pwd)
WORK_DIR="./ADE20K"
mkdir -p "${WORK_DIR}"
cd "${WORK_DIR}"
# Helper function to download and unpack ADE20K dataset.
download_and_uncompress() {
local BASE_URL=${1}
local FILENAME=${2}
if [ ! -f "${FILENAME}" ]; then
echo "Downloading ${FILENAME} to ${WORK_DIR}"
wget -nd -c "${BASE_URL}/${FILENAME}"
fi
echo "Uncompressing ${FILENAME}"
unzip "${FILENAME}"
}
# Download the images.
BASE_URL="http://data.csail.mit.edu/places/ADEchallenge"
FILENAME="ADEChallengeData2016.zip"
download_and_uncompress "${BASE_URL}" "${FILENAME}"
cd "${CURRENT_DIR}"
# Root path for ADE20K dataset.
ADE20K_ROOT="${WORK_DIR}/ADEChallengeData2016"
# Build TFRecords of the dataset.
# First, create output directory for storing TFRecords.
OUTPUT_DIR="${WORK_DIR}/tfrecord"
mkdir -p "${OUTPUT_DIR}"
echo "Converting ADE20K dataset..."
python ./build_ade20k_data.py \
--train_image_folder="${ADE20K_ROOT}/images/training/" \
--train_image_label_folder="${ADE20K_ROOT}/annotations/training/" \
--val_image_folder="${ADE20K_ROOT}/images/validation/" \
--val_image_label_folder="${ADE20K_ROOT}/annotations/validation/" \
--output_dir="${OUTPUT_DIR}"
The converted data is saved to ./deeplab/datasets/ADE20K/tfrecord.
The recommended dataset directory structure is:
+ datasets
  - build_data.py
  - build_ade20k_data.py
  - download_and_convert_ade20k.sh
  + ADE20K
    + tfrecord
    + exp
      + train_on_train_set
        + train
        + eval
        + vis
    + ADEChallengeData2016
      + annotations
        + training
        + validation
      + images
        + training
        + validation
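Notes:
- 1 - To fine-tune the batch norm layers, train with a large batch size and set fine_tune_batch_norm = True. The small batch size used here is for demonstration only.
- 2 - With limited GPU memory, fine-tune from the provided checkpoints, whose batch norm parameters are already trained, using a smaller learning rate and fine_tune_batch_norm = False.
- 3 - Tune min_resize_value and max_resize_value to get better results, and make sure that resize_factor == output_stride.
- 4 - If output_stride=8, change atrous_rates from [6, 12, 18] to [12, 24, 36].
- 5 - The decoder_output_stride flag can be omitted if the decoder structure is not needed.
Taking xception_65 as an example:
Training: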
# From tensorflow/models/research/
python deeplab/train.py \
--logtostderr \
--training_number_of_steps=90000 \
--train_split="train" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--train_crop_size=513 \
--train_crop_size=513 \
--train_batch_size=4 \
--min_resize_value=513 \
--max_resize_value=513 \
--resize_factor=16 \
--dataset="ade20k" \
--tf_initial_checkpoint=${PATH_TO_INITIAL_CHECKPOINT} \
--train_logdir=${PATH_TO_TRAIN_DIR} \
--dataset_dir=${PATH_TO_DATASET}
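The requirement that resize_factor equal output_stride is easy to break when only one of the two flags gets edited. The check below is a minimal sketch that could be run before launching training; the function name check_resize_factor is hypothetical and is not part of DeepLab.

```shell
# Hypothetical pre-flight check for the ADE20K training flags.
# $1 = resize_factor, $2 = output_stride.
# Returns 0 when the two values match, non-zero otherwise.
check_resize_factor() {
  if [ "$1" -ne "$2" ]; then
    echo "resize_factor ($1) must equal output_stride ($2)" >&2
    return 1
  fi
}

# Example usage with the values from the training command:
# check_resize_factor 16 16 && echo "flags are consistent"
```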
If you follow the recommended directory structure above, the following can be run directly:
tensorboard --logdir=${PATH_TO_LOG_DIRECTORY}
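where ${PATH_TO_LOG_DIRECTORY} is the directory containing the train, eval, and vis folders, e.g. the train_on_train_set path above. TensorBoard may take a few minutes to prepare the data.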
To use a backbone network other than the provided ones (such as Xception), modify core/feature_extractor.py to support it.
To train on a new dataset, modify datasets/build_{cityscapes,voc2012}_data.py and datasets/segmentation_dataset.py to build it.
The augmented PASCAL VOC training set is available from the PASCAL VOC augmented website.
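DeepLab does not apply DenseCRF post-processing; see densecrf for reference.
If hardware resources are limited, fine-tune directly from the checkpoints DeepLab provides, since their batch norm parameters are already trained. For example, use a small learning rate, set fine_tune_batch_norm = false, and train for more iterations to compensate for the small learning rate.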
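As a rough illustration of this fine-tuning setup, the sketch below assembles the relevant train.py flags. The helper name fine_tune_flags is hypothetical, the numeric values are illustrative assumptions rather than tuned recommendations, and it assumes train.py exposes the learning rate as base_learning_rate, as in the upstream repository.

```shell
# Hypothetical helper: print train.py flags for a low-memory fine-tuning run.
# $1 = base learning rate, $2 = number of training steps.
fine_tune_flags() {
  echo "--fine_tune_batch_norm=false --base_learning_rate=$1 --training_number_of_steps=$2"
}

# Example: a smaller learning rate paired with a longer schedule.
# Both numbers are assumptions chosen only for illustration.
fine_tune_flags 0.00001 60000
```

The printed flags would be appended to the usual train.py command together with the model and dataset flags used elsewhere in this article.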
If you do need to retrain the model, the recommendations are:
- 1 - Set output_stride = 16 or output_stride = 32, and remember to change atrous_rates accordingly, e.g. atrous_rates = [3, 6, 9] for output_stride = 32.
- 2 - Use as many GPUs as possible (set the num_clones parameter in train.py) and make train_batch_size as large as possible.
- 3 - Adjust the train_crop_size parameter in train.py. A smaller value such as 513x513 or 321x321 leaves room for a larger train_batch_size.
- 4 - Use a small backbone network such as MobileNet-v2.
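The output_stride and atrous_rates pairings quoted in this article ([12, 24, 36] for 8, [6, 12, 18] for 16, [3, 6, 9] for 32) follow a simple inverse scaling. The sketch below encodes that pattern; the function name atrous_rates_for is hypothetical, and the scaling rule is inferred from those three pairings rather than from a documented formula.

```shell
# Hypothetical helper: print the three atrous rates for a given output_stride.
# Scales the base rates (6, 12, 18 at output_stride=16) inversely with the
# stride. Only the strides mentioned in the article are accepted.
atrous_rates_for() {
  case "$1" in
    8|16|32) ;;
    *) echo "unsupported output_stride: $1" >&2; return 1 ;;
  esac
  echo "$(( 6 * 16 / $1 )) $(( 12 * 16 / $1 )) $(( 18 * 16 / $1 ))"
}

# Example: turn the rates into repeated --atrous_rates flags.
for rate in $(atrous_rates_for 32); do
  printf -- '--atrous_rates=%s ' "${rate}"
done
echo
```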
In train.py, the num_replicas parameter sets the number of machines used for training, and it is configured together with num_ps_tasks. A common setting is num_ps_tasks = num_replicas / 2.
For more details, see slim.deployment.model_deploy.
Try running:
sh local_test.sh
or
sh local_test_mobilenetv2.sh
First, make sure the results can be reproduced with the settings DeepLab provides.
Then modify the settings and debug from there.
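DeepLab runs inference on whole images, which means eval_crop_size must be set to output_stride * k + 1, where k is an integer chosen so that eval_crop_size is slightly larger than the largest image dimension in the dataset.
For example, for PASCAL, eval_crop_size = 513x513, slightly larger than the maximum image dimension of 512.
Similarly, for Cityscapes, where every image is 1024x2048, eval_crop_size = 1025x2049.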
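The rule eval_crop_size = output_stride * k + 1 can be turned into a small calculation. The sketch below picks the smallest such value that is at least as large as the given image dimension; the function name eval_crop_size_for is hypothetical.

```shell
# Hypothetical helper: smallest value of the form output_stride * k + 1
# that is >= the largest image dimension.
# $1 = largest image dimension, $2 = output_stride.
eval_crop_size_for() {
  # k = ceil((max_dim - 1) / output_stride), using integer arithmetic.
  k=$(( ($1 - 1 + $2 - 1) / $2 ))
  echo $(( $2 * k + 1 ))
}

# Examples with output_stride=16:
eval_crop_size_for 512 16    # PASCAL: prints 513
eval_crop_size_for 1024 16   # Cityscapes height: prints 1025
eval_crop_size_for 2048 16   # Cityscapes width: prints 2049
```

The height and width are computed separately, which is why the Cityscapes commands pass the crop size flag twice with different values.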
Original article: https://www.aiuai.cn/aifarm257.html