(ECCV 2020) Ultra Fast Structure-aware Deep Lane Detection: Paper and Code Walkthrough

This is a work from Zhejiang University. You can go straight to the original author's introduction in Chinese:

https://zhuanlan.zhihu.com/p/157530787

Official source code:

https://github.com/cfzd/Ultra-Fast-Lane-Detection

This post mainly records my own study notes, together with a walkthrough of the official source code. The main text follows:

Abstract

Mainstream approaches today treat lane detection as a semantic segmentation problem, which suffers from poor accuracy in challenging scenes and slow speed. Inspired by human perception, recognizing lanes under severe occlusion and extreme lighting conditions relies mainly on contextual and global information. Concretely, we treat the lane detection process as a row-based selection problem using global features.

Contributions

  1. Reduced computation cost (see the back-of-envelope comparison after this list)
  2. The large receptive field on global features can handle complex scenarios (occlusion, extreme lighting, etc.)
  3. A structural loss
  4. Ultra-fast speed
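To make contribution 1 concrete, a back-of-envelope comparison in the paper's notation: segmentation classifies every pixel of an H × W map over C + 1 categories, while row-based selection only makes C × h row-wise classifications over w + 1 gridding cells:

H \times W \times (C+1) \quad \text{vs.} \quad C \times h \times (w+1)

With a 288 × 800 input and the CULane settings used in this repo (C = 4 lanes, h = 18 row anchors, w = 200 cells), that is roughly 1.15M versus about 14.5k classification targets, nearly two orders of magnitude fewer.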

Introduction

[Figure from the paper's introduction]

Related Work

Traditional methods.
Deep learning methods.

Method

A New Paradigm for Lane Detection

Being fast and coping with the absence of visual cues are both crucial for lane detection. In this section, we show how our formulation is derived by tackling the speed problem and the no-visual-cue problem. For better illustration, Table 1 lists some of the notation used below.

[Table 1 from the paper: notation]

Paradigm Definition
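From the paper, the core definitions are as follows (reproduced from memory, so double-check against the original). For the j-th row anchor of the i-th lane, the prediction is a classification over w gridding cells plus one background class:

P_{i,j,:} = f^{ij}(X), \quad \text{s.t.}\ i \in [1, C],\ j \in [1, h]

where X is the global feature and P_{i,j,:} is a (w+1)-dimensional probability vector. Training uses a row-wise cross-entropy loss against the one-hot targets T_{i,j,:}:

L_{cls} = \sum_{i=1}^{C} \sum_{j=1}^{h} L_{CE}(P_{i,j,:}, T_{i,j,:})

At inference, the continuous lane location on each row is recovered as the expectation over cell indices:

Loc_{i,j} = \sum_{k=1}^{w} k \cdot Prob_{i,j,k}, \qquad Prob_{i,j,:} = \mathrm{softmax}(P_{i,j,1:w})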

Code Walkthrough

Installation

  1. Clone the source code

git clone https://github.com/cfzd/Ultra-Fast-Lane-Detection
cd Ultra-Fast-Lane-Detection

  2. Create and activate a conda virtual environment

conda create -n lane-det python=3.7 -y
conda activate lane-det

  3. Install dependencies

# If you don't have PyTorch
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch
pip install -r requirements.txt

  4. Data preparation

Download CULane and Tusimple, then extract them to $CULANEROOT and $TUSIMPLEROOT. The directory arrangement of Tusimple should look like this:

$TUSIMPLEROOT
|──clips
|──label_data_0313.json
|──label_data_0531.json
|──label_data_0601.json
|──test_tasks_0627.json
|──test_label.json
|──readme.md

The directory arrangement of CULane should look like this:

$CULANEROOT
|──driver_100_30frame
|──driver_161_90frame
|──driver_182_30frame
|──driver_193_90frame
|──driver_23_30frame
|──driver_37_30frame
|──laneseg_label_w16
|──list

The Tusimple dataset does not provide segmentation labels, so we need to generate them from the json annotation files:

python scripts/convert_tusimple.py --root $TUSIMPLEROOT
# this will generate segmentation labels and two list files: train_gt.txt and test.txt

The scripts/convert_tusimple.py script is as follows:

import os
import cv2
import tqdm
import numpy as np
import pdb
import json, argparse


def calc_k(line):
    '''
    Calculate the direction of lanes
    '''
    line_x = line[::2]
    line_y = line[1::2]
    length = np.sqrt((line_x[0]-line_x[-1])**2 + (line_y[0]-line_y[-1])**2)
    if length < 90:
        return -10                                          # if the lane is too short, it will be skipped

    p = np.polyfit(line_x, line_y,deg = 1)
    rad = np.arctan(p[0])
    
    return rad


def draw(im, line, idx, show=False):
    '''
    Generate the segmentation label according to json annotation
    '''
    line_x = line[::2]
    line_y = line[1::2]
    pt0 = (int(line_x[0]),int(line_y[0]))
    if show:
        cv2.putText(im,str(idx),(int(line_x[len(line_x) // 2]),int(line_y[len(line_x) // 2]) - 20),cv2.FONT_HERSHEY_SIMPLEX, 1.0, (255, 255, 255), lineType=cv2.LINE_AA)
        idx = idx * 60
        
    
    for i in range(len(line_x)-1):
        cv2.line(im,pt0,(int(line_x[i+1]),int(line_y[i+1])),(idx,),thickness = 16)
        pt0 = (int(line_x[i+1]),int(line_y[i+1]))


def get_tusimple_list(root, label_list):
    '''
    Get all the files' names from the json annotation
    '''
    label_json_all = []
    for l in label_list:
        l = os.path.join(root,l)
        label_json = [json.loads(line) for line in open(l).readlines()]
        label_json_all += label_json
    names = [l['raw_file'] for l in label_json_all]
    h_samples = [np.array(l['h_samples']) for l in label_json_all]
    lanes = [np.array(l['lanes']) for l in label_json_all]

    line_txt = []
    for i in range(len(lanes)):
        line_txt_i = []
        for j in range(len(lanes[i])):
            if np.all(lanes[i][j] == -2):
                continue
            valid = lanes[i][j] != -2
            line_txt_tmp = [None]*(len(h_samples[i][valid])+len(lanes[i][j][valid]))
            line_txt_tmp[::2] = list(map(str,lanes[i][j][valid]))
            line_txt_tmp[1::2] = list(map(str,h_samples[i][valid]))
            line_txt_i.append(line_txt_tmp)
        line_txt.append(line_txt_i)

    return names,line_txt

def generate_segmentation_and_train_list(root, line_txt, names):
    """
    The lane annotations of the Tusimple dataset are not strictly in order, so we need to find out the correct lane order for segmentation.
    We use the same definition as CULane, in which the four lanes from left to right are represented as 1, 2, 3, 4 in the segmentation label respectively.
    """
    train_gt_fp = open(os.path.join(root,'train_gt.txt'),'w')
    
    for i in tqdm.tqdm(range(len(line_txt))):

        tmp_line = line_txt[i]
        lines = []
        for j in range(len(tmp_line)):
            lines.append(list(map(float,tmp_line[j])))
        
        ks = np.array([calc_k(line) for line in lines])             # get the direction of each lane

        k_neg = ks[ks<0].copy()
        k_pos = ks[ks>0].copy()
        k_neg = k_neg[k_neg != -10]                                      # -10 means the lane is too short and is discarded
        k_pos = k_pos[k_pos != -10]
        k_neg.sort()
        k_pos.sort()

        label_path = names[i][:-3]+'png'
        label = np.zeros((720,1280),dtype=np.uint8)
        bin_label = [0,0,0,0]
        if len(k_neg) == 1:                                           # only one lane on the left
            which_lane = np.where(ks == k_neg[0])[0][0]
            draw(label,lines[which_lane],2)
            bin_label[1] = 1
        elif len(k_neg) == 2:                                         # two lanes on the left
            which_lane = np.where(ks == k_neg[1])[0][0]
            draw(label,lines[which_lane],1)
            which_lane = np.where(ks == k_neg[0])[0][0]
            draw(label,lines[which_lane],2)
            bin_label[0] = 1
            bin_label[1] = 1
        elif len(k_neg) > 2:                                           # for more than two lanes on the left,
            which_lane = np.where(ks == k_neg[1])[0][0]                # we only choose the two lanes closest to the center
            draw(label,lines[which_lane],1)
            which_lane = np.where(ks == k_neg[0])[0][0]
            draw(label,lines[which_lane],2)
            bin_label[0] = 1
            bin_label[1] = 1

        if len(k_pos) == 1:                                            # For the lanes on the right, the same logic is adopted.
            which_lane = np.where(ks == k_pos[0])[0][0]
            draw(label,lines[which_lane],3)
            bin_label[2] = 1
        elif len(k_pos) == 2:
            which_lane = np.where(ks == k_pos[1])[0][0]
            draw(label,lines[which_lane],3)
            which_lane = np.where(ks == k_pos[0])[0][0]
            draw(label,lines[which_lane],4)
            bin_label[2] = 1
            bin_label[3] = 1
        elif len(k_pos) > 2:
            which_lane = np.where(ks == k_pos[-1])[0][0]
            draw(label,lines[which_lane],3)
            which_lane = np.where(ks == k_pos[-2])[0][0]
            draw(label,lines[which_lane],4)
            bin_label[2] = 1
            bin_label[3] = 1

        cv2.imwrite(os.path.join(root,label_path),label)


        train_gt_fp.write(names[i] + ' ' + label_path + ' '+' '.join(list(map(str,bin_label))) + '\n')
    train_gt_fp.close()

def get_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('--root', required=True, help='The root of the Tusimple dataset')
    return parser

if __name__ == "__main__":
    args = get_args().parse_args()

    # training set
    names,line_txt = get_tusimple_list(args.root,  ['label_data_0601.json','label_data_0531.json','label_data_0313.json'])
    # generate segmentation and training list for training
    generate_segmentation_and_train_list(args.root, line_txt, names) # 3268+358=3626

    # testing set
    names,line_txt = get_tusimple_list(args.root, ['test_tasks_0627.json'])
    # generate testing set for testing
    with open(os.path.join(args.root,'test.txt'),'w') as fp: # 2782
        for name in names:
            fp.write(name + '\n')
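For reference, each line that generate_segmentation_and_train_list writes into train_gt.txt has the form image_path label_path b1 b2 b3 b4, where the four binary flags mark whether each of the four lanes (two on the left, two on the right, following the CULane convention) exists in the image. A made-up example line:

clips/0313-1/6040/20.jpg clips/0313-1/6040/20.png 1 1 1 1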

  5. Install the CULane evaluation tool (for testing only)

If you just want to train a model or run a demo, this tool is not needed and you can skip this step. If you want to get evaluation results on CULane, you should install this tool.

This tool requires OpenCV C++. Please follow the instructions here to install OpenCV C++. When building OpenCV, remove anaconda's path from PATH, otherwise the build will fail.

#First you need to install OpenCV C++.
#After installation, make a soft link of OpenCV include path.
ln -s /usr/local/include/opencv4/opencv2 /usr/local/include/opencv2

We provide three compilation pipelines to build CULane's evaluation tool.

Option 1:

cd evaluation/culane
make

Option 2:

cd evaluation/culane
mkdir build && cd build
cmake ..
make
mv culane_evaluator ../evaluate

Option 3 (for Windows users):

mkdir build-vs2017
cd build-vs2017
cmake .. -G "Visual Studio 15 2017 Win64"
cmake --build . --config Release
# or, open the "xxx.sln" file with Visual Studio and click the build button
move culane_evaluator ../evaluate

Note: following the practice in RESA's codebase, it seems step 5 above can be replaced by the following command (to be verified):

sudo apt-get install libopencv-dev

Getting Started

First, modify data_root and log_path in configs/culane.py and configs/tusimple.py according to your environment.

  • data_root is the path of your CULane or Tusimple dataset.
  • log_path is where the tensorboard logs, trained models, and code backups are stored. It should be placed outside this project. A sketch of what such a config file looks like is shown after this list.
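For reference, a config is just a flat Python file of settings. A minimal sketch of configs/tusimple.py (the field names follow the repo, but the values here are illustrative, so check your own copy before training):

# sketch of configs/tusimple.py (values illustrative)
dataset = 'Tusimple'
data_root = '/path/to/TUSIMPLEROOT'    # modify: your dataset path
epoch = 100
batch_size = 32
optimizer = 'SGD'
learning_rate = 0.025
backbone = '18'                        # ResNet-18
griding_num = 100                      # number of gridding cells per row
num_lanes = 4
use_aux = True                         # auxiliary segmentation branch (training only)
log_path = '/path/to/logs'             # modify: keep it outside the project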

For a single GPU:

python train.py configs/path_to_your_config

The train.py file is as follows:

import torch, os, datetime
import numpy as np

from model.model import parsingNet
from data.dataloader import get_train_loader

from utils.dist_utils import dist_print, dist_tqdm, is_main_process, DistSummaryWriter
from utils.factory import get_metric_dict, get_loss_dict, get_optimizer, get_scheduler
from utils.metrics import MultiLabelAcc, AccTopk, Metric_mIoU, update_metrics, reset_metrics

from utils.common import merge_config, save_model, cp_projects
from utils.common import get_work_dir, get_logger

import time


def inference(net, data_label, use_aux):
    if use_aux:
        img, cls_label, seg_label = data_label
        img, cls_label, seg_label = img.cuda(), cls_label.long().cuda(), seg_label.long().cuda()
        cls_out, seg_out = net(img)
        return {'cls_out': cls_out, 'cls_label': cls_label, 'seg_out':seg_out, 'seg_label': seg_label}
    else:
        img, cls_label = data_label
        img, cls_label = img.cuda(), cls_label.long().cuda()
        cls_out = net(img)
        return {'cls_out': cls_out, 'cls_label': cls_label}


def resolve_val_data(results, use_aux):
    results['cls_out'] = torch.argmax(results['cls_out'], dim=1)
    if use_aux:
        results['seg_out'] = torch.argmax(results['seg_out'], dim=1)
    return results


def calc_loss(loss_dict, results, logger, global_step):
    loss = 0

    for i in range(len(loss_dict['name'])):

        data_src = loss_dict['data_src'][i]

        datas = [results[src] for src in data_src]

        loss_cur = loss_dict['op'][i](*datas)

        if global_step % 20 == 0:
            logger.add_scalar('loss/'+loss_dict['name'][i], loss_cur, global_step)

        loss += loss_cur * loss_dict['weight'][i]
    return loss


def train(net, data_loader, loss_dict, optimizer, scheduler,logger, epoch, metric_dict, use_aux):
    net.train()
    progress_bar = dist_tqdm(data_loader)
    t_data_0 = time.time()
    for b_idx, data_label in enumerate(progress_bar):
        t_data_1 = time.time()
        reset_metrics(metric_dict)
        global_step = epoch * len(data_loader) + b_idx

        t_net_0 = time.time()
        results = inference(net, data_label, use_aux)

        loss = calc_loss(loss_dict, results, logger, global_step)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        scheduler.step(global_step)
        t_net_1 = time.time()

        results = resolve_val_data(results, use_aux)

        update_metrics(metric_dict, results)
        if global_step % 20 == 0:
            for me_name, me_op in zip(metric_dict['name'], metric_dict['op']):
                logger.add_scalar('metric/' + me_name, me_op.get(), global_step=global_step)
        logger.add_scalar('meta/lr', optimizer.param_groups[0]['lr'], global_step=global_step)

        if hasattr(progress_bar,'set_postfix'):
            kwargs = {me_name: '%.3f' % me_op.get() for me_name, me_op in zip(metric_dict['name'], metric_dict['op'])}
            progress_bar.set_postfix(loss = '%.3f' % float(loss), 
                                    data_time = '%.3f' % float(t_data_1 - t_data_0), 
                                    net_time = '%.3f' % float(t_net_1 - t_net_0), 
                                    **kwargs)
        t_data_0 = time.time()


if __name__ == "__main__":
    torch.backends.cudnn.benchmark = True

    args, cfg = merge_config()

    work_dir = get_work_dir(cfg)

    distributed = False
    if 'WORLD_SIZE' in os.environ:
        distributed = int(os.environ['WORLD_SIZE']) > 1

    if distributed:
        torch.cuda.set_device(args.local_rank)
        torch.distributed.init_process_group(backend='nccl', init_method='env://')
    dist_print(datetime.datetime.now().strftime('[%Y/%m/%d %H:%M:%S]') + ' start training...')
    dist_print(cfg)
    assert cfg.backbone in ['18','34','50','101','152','50next','101next','50wide','101wide']


    train_loader, cls_num_per_lane = get_train_loader(cfg.batch_size, cfg.data_root, cfg.griding_num, cfg.dataset, cfg.use_aux, distributed, cfg.num_lanes)

    net = parsingNet(pretrained = True, backbone=cfg.backbone,cls_dim = (cfg.griding_num+1,cls_num_per_lane, cfg.num_lanes),use_aux=cfg.use_aux).cuda()

    if distributed:
        net = torch.nn.parallel.DistributedDataParallel(net, device_ids = [args.local_rank])
    optimizer = get_optimizer(net, cfg)

    if cfg.finetune is not None:
        dist_print('finetune from ', cfg.finetune)
        state_all = torch.load(cfg.finetune)['model']
        state_clip = {}  # only use backbone parameters
        for k,v in state_all.items():
            if 'model' in k:
                state_clip[k] = v
        net.load_state_dict(state_clip, strict=False)
    if cfg.resume is not None:
        dist_print('==> Resume model from ' + cfg.resume)
        resume_dict = torch.load(cfg.resume, map_location='cpu')
        net.load_state_dict(resume_dict['model'])
        if 'optimizer' in resume_dict.keys():
            optimizer.load_state_dict(resume_dict['optimizer'])
        resume_epoch = int(os.path.split(cfg.resume)[1][2:5]) + 1
    else:
        resume_epoch = 0



    scheduler = get_scheduler(optimizer, cfg, len(train_loader))
    dist_print(len(train_loader))
    metric_dict = get_metric_dict(cfg)
    loss_dict = get_loss_dict(cfg)
    logger = get_logger(work_dir, cfg)
    cp_projects(args.auto_backup, work_dir)

    for epoch in range(resume_epoch, cfg.epoch):

        train(net, train_loader, loss_dict, optimizer, scheduler,logger, epoch, metric_dict, cfg.use_aux)
        
        save_model(net, optimizer, epoch ,work_dir, distributed)
    logger.close()
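Before moving on to the model, note that contribution 3 (the structural loss) enters through get_loss_dict above: besides the row-wise classification loss, the paper adds an L1-type similarity loss that pushes the predictions of adjacent row anchors to be close, plus a shape loss on the second-order difference of the expected locations. A minimal sketch of the similarity term against the (N, griding_num+1, num_rows, num_lanes) output; this is my paraphrase of the idea, not the exact code in utils/loss.py:

import torch
import torch.nn.functional as F

def similarity_loss(logits):
    # logits: (N, griding_num+1, num_rows, num_lanes)
    # adjacent row anchors should predict similar distributions
    diff = logits[:, :, :-1, :] - logits[:, :, 1:, :]
    return F.smooth_l1_loss(diff, torch.zeros_like(diff))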

The model/model.py module called by the code above is as follows:

import torch
from model.backbone import resnet
import numpy as np

class conv_bn_relu(torch.nn.Module):
    def __init__(self,in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1,bias=False):
        super(conv_bn_relu,self).__init__()
        self.conv = torch.nn.Conv2d(in_channels,out_channels, kernel_size, 
            stride = stride, padding = padding, dilation = dilation,bias = bias)
        self.bn = torch.nn.BatchNorm2d(out_channels)
        self.relu = torch.nn.ReLU()

    def forward(self,x):
        x = self.conv(x)
        x = self.bn(x)
        x = self.relu(x)
        return x


class parsingNet(torch.nn.Module):
    def __init__(self, size=(288, 800), pretrained=True, backbone='50', cls_dim=(37, 10, 4), use_aux=False):
        super(parsingNet, self).__init__()

        self.size = size
        self.w = size[0]
        self.h = size[1]
        self.cls_dim = cls_dim # (num_gridding, num_cls_per_lane, num_of_lanes)
        # num_cls_per_lane is the number of row anchors
        self.use_aux = use_aux
        self.total_dim = np.prod(cls_dim)

        # input : nchw,
        # output: (w+1) * sample_rows * 4 
        self.model = resnet(backbone, pretrained=pretrained)

        if self.use_aux:
            self.aux_header2 = torch.nn.Sequential(
                conv_bn_relu(128, 128, kernel_size=3, stride=1, padding=1) if backbone in ['34','18'] else conv_bn_relu(512, 128, kernel_size=3, stride=1, padding=1),
                conv_bn_relu(128,128,3,padding=1),
                conv_bn_relu(128,128,3,padding=1),
                conv_bn_relu(128,128,3,padding=1),
            )
            self.aux_header3 = torch.nn.Sequential(
                conv_bn_relu(256, 128, kernel_size=3, stride=1, padding=1) if backbone in ['34','18'] else conv_bn_relu(1024, 128, kernel_size=3, stride=1, padding=1),
                conv_bn_relu(128,128,3,padding=1),
                conv_bn_relu(128,128,3,padding=1),
            )
            self.aux_header4 = torch.nn.Sequential(
                conv_bn_relu(512, 128, kernel_size=3, stride=1, padding=1) if backbone in ['34','18'] else conv_bn_relu(2048, 128, kernel_size=3, stride=1, padding=1),
                conv_bn_relu(128,128,3,padding=1),
            )
            self.aux_combine = torch.nn.Sequential(
                conv_bn_relu(384, 256, 3,padding=2,dilation=2),
                conv_bn_relu(256, 128, 3,padding=2,dilation=2),
                conv_bn_relu(128, 128, 3,padding=2,dilation=2),
                conv_bn_relu(128, 128, 3,padding=4,dilation=4),
                torch.nn.Conv2d(128, cls_dim[-1] + 1,1)
                # output : n, num_of_lanes+1, h, w
            )
            initialize_weights(self.aux_header2,self.aux_header3,self.aux_header4,self.aux_combine)

        self.cls = torch.nn.Sequential(
            torch.nn.Linear(1800, 2048),
            torch.nn.ReLU(),
            torch.nn.Linear(2048, self.total_dim),
        )

        self.pool = torch.nn.Conv2d(512,8,1) if backbone in ['34','18'] else torch.nn.Conv2d(2048,8,1)
        # 1/32,2048 channel
        # 288,800 -> 9,40,2048
        # (w+1) * sample_rows * 4
        # 37 * 10 * 4
        initialize_weights(self.cls)

    def forward(self, x):
        # n c h w - > n 2048 sh sw
        # -> n 2048
        x2,x3,fea = self.model(x) # x2:(32,128,36,100) x3:(32,256,18,50) fea:(32,512,9,25)
        if self.use_aux:
            x2 = self.aux_header2(x2) # (32,128,36,100)
            x3 = self.aux_header3(x3) # (32,128,18,50) 
            x3 = torch.nn.functional.interpolate(x3,scale_factor = 2,mode='bilinear') # (32,128,36,100) 
            x4 = self.aux_header4(fea) # (32,128,9,25) 
            x4 = torch.nn.functional.interpolate(x4,scale_factor = 4,mode='bilinear') # (32,128,36,100) 
            aux_seg = torch.cat([x2,x3,x4],dim=1) # (32,384,36,100) 
            aux_seg = self.aux_combine(aux_seg) # (32,3,288,800) 
        else:
            aux_seg = None

        fea = self.pool(fea).view(-1, 1800) # input: (32,512,9,25); before view: (32,8,9,25); after view: (32,1800)

        group_cls = self.cls(fea).view(-1, *self.cls_dim) # (32,101,56,4)

        if self.use_aux:
            return group_cls, aux_seg # group_cls:(32,101,56,4) aux_seg:(32,3,288,800) 

        return group_cls


def initialize_weights(*models):
    for model in models:
        real_init_weights(model)


def real_init_weights(m):

    if isinstance(m, list):
        for mini_m in m:
            real_init_weights(mini_m)
    else:
        if isinstance(m, torch.nn.Conv2d):    
            torch.nn.init.kaiming_normal_(m.weight, nonlinearity='relu')
            if m.bias is not None:
                torch.nn.init.constant_(m.bias, 0)
        elif isinstance(m, torch.nn.Linear):
            m.weight.data.normal_(0.0, std=0.01)
        elif isinstance(m, torch.nn.BatchNorm2d):
            torch.nn.init.constant_(m.weight, 1)
            torch.nn.init.constant_(m.bias, 0)
        elif isinstance(m,torch.nn.Module):
            for mini_m in m.children():
                real_init_weights(mini_m)
        else:
            print('unknown module', m)
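To make the tensor shapes concrete, here is a quick smoke test of parsingNet (pretrained=False avoids the torchvision download; cls_dim matches the Tusimple setting of griding_num=100, 56 row anchors, and 4 lanes):

import torch
from model.model import parsingNet

net = parsingNet(pretrained=False, backbone='18',
                 cls_dim=(101, 56, 4), use_aux=False).eval()
x = torch.randn(1, 3, 288, 800)        # n, c, h, w
with torch.no_grad():
    group_cls = net(x)
print(group_cls.shape)                 # torch.Size([1, 101, 56, 4])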

For multiple GPUs:

sh launch_training.sh

Or:

python -m torch.distributed.launch --nproc_per_node=$NGPUS train.py configs/path_to_your_config

If the pretrained torchvision models are not present, multi-GPU training may trigger the download multiple times. You can manually download the corresponding model first and then restart multi-GPU training.

Because our code has an automatic backup feature that copies all code to log_path according to the gitignore, temporary files not filtered by the gitignore may be copied as well; if they are large, they may block execution. You should therefore keep your working directory clean.

In addition to the configuration-style settings, we also support command-line-style overrides. For example:

python train.py configs/path_to_your_config --batch_size 8

batch_size will be set to 8 during the training phase.

To visualize the log with tensorboard, run:

tensorboard --logdir log_path --bind_all

Trained Models

We provide two trained ResNet-18 models, on CULane and Tusimple.

[Table from the README: the trained models and their metrics]
Tusimple: Google Drive / Baidu Cloud
CULane: Google Drive / Baidu Cloud

For evaluation, run:

mkdir tmp
# This is a bad example, you should put the temp files outside the project.

python test.py configs/culane.py --test_model path_to_culane_18.pth --test_work_dir ./tmp

python test.py configs/tusimple.py --test_model path_to_tusimple_18.pth --test_work_dir ./tmp

Multi-GPU testing is also supported.

Visualization

We provide a script to visualize the detection results. Run the following commands to visualize the test sets of CULane and Tusimple:

python demo.py configs/culane.py --test_model path_to_culane_18.pth
# or
python demo.py configs/tusimple.py --test_model path_to_tusimple_18.pth

Since the Tusimple test set is not sorted, the visualized video may look bad; we do not recommend it.
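For reference, the decoding that turns group_cls into per-row lane positions works roughly like this (a sketch in the spirit of demo.py, not its exact code): softmax over the w gridding cells of each row, take the expected cell index, and mark rows whose argmax falls on the extra class as having no lane:

import numpy as np
import scipy.special

def decode_lanes(out, griding_num=100):
    # out: (griding_num+1, num_rows, num_lanes), logits for one image
    prob = scipy.special.softmax(out[:-1, :, :], axis=0)  # drop the "no lane" class
    idx = (np.arange(griding_num) + 1).reshape(-1, 1, 1)
    loc = np.sum(prob * idx, axis=0)                      # expected cell index per row/lane
    loc[np.argmax(out, axis=0) == griding_num] = 0        # 0 means no lane on this row
    return loc  # scale by roughly (image_width / griding_num) to get pixel x-coordinates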

Speed

To test the runtime, run:

python speed_simple.py  
# this will test the speed with a simple protocol and requires no additional dependencies

python speed_real.py
# this will test the speed with real video or camera input

It will loop 100 times and report the average runtime and fps in your environment.
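If you want to reproduce the simple protocol by hand, the measurement boils down to something like the following sketch (speed_simple.py in the repo is the authoritative version):

import time
import torch
from model.model import parsingNet

net = parsingNet(pretrained=False, backbone='18',
                 cls_dim=(101, 56, 4), use_aux=False).cuda().eval()
x = torch.zeros((1, 3, 288, 800)).cuda()
with torch.no_grad():
    for _ in range(10):                # warm-up
        net(x)
    torch.cuda.synchronize()
    t0 = time.time()
    for _ in range(100):               # loop 100 times, as described above
        net(x)
    torch.cuda.synchronize()
    avg = (time.time() - t0) / 100
print('average runtime: %.2f ms, fps: %.1f' % (avg * 1000, 1.0 / avg))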

Acknowledgements

Thanks zchrissirhcz for the contribution to the compile tool of CULane, KopiSoftware for contributing to the speed test, and ustclbh for testing on the Windows platform.

Citation:

@InProceedings{qin2020ultra,
  author    = {Qin, Zequn and Wang, Huanyu and Li, Xi},
  title     = {Ultra Fast Structure-aware Deep Lane Detection},
  booktitle = {The European Conference on Computer Vision (ECCV)},
  year      = {2020}
}
