End-to-End Lane Detection Based on Instance Segmentation: Paper + Code Walkthrough

Towards End-to-End Lane Detection: an Instance Segmentation Approach

Original paper

https://arxiv.org/pdf/1802.05591v1.pdf

Preface
This is a classic lane detection paper, and there are plenty of code walkthroughs for it online. These are my own study notes.

Abstract

Traditional methods: hand-crafted feature extraction; easily affected by the environment and unable to meet real-time requirements.
Earlier deep-learning papers: can only detect a fixed, predefined number of lanes.
This paper: casts lane detection as an instance segmentation problem, and additionally proposes using a neural network to estimate the inverse perspective transform matrix instead of fixing its parameters, which makes the method more robust to changes in the road.

Runs at 50 fps; validated on the tuSimple dataset.

Main text

Figure 1 shows the framework of the proposed method.
[Figure 1: overall pipeline of the proposed method]

LaneNet

Figure 2: the LaneNet architecture. Lane detection is cast as an instance segmentation problem and trained end to end, which allows an arbitrary number of lanes to be detected.
[Figure 2: LaneNet architecture]
The network combines the benefits of binary lane segmentation with a clustering loss function designed for one-shot instance segmentation. In LaneNet's output, every lane pixel is assigned its corresponding lane id.

Training the two tasks jointly in one multi-task network improves both speed and accuracy. There are two branches:

  • lane segmentation branch: a two-class problem whose output is lane vs. background, so different lanes do not need to be assigned different classes.

To construct the ground-truth segmentation map, all ground-truth lane points are connected into one polyline per lane, and implicit lanes are annotated as well (the ground-truth lanes are drawn through occluding objects such as cars, and also where no lane segment is visibly present, e.g. dashed or faded markings), so the network can also learn to recover hidden lane lines.
Loss function: standard cross-entropy.
Because the lane/background classes are highly unbalanced, bounded inverse class weighting is applied:

$w_{class} = \dfrac{1}{\ln(c + p_{class})}$

where p is the probability of the corresponding class over the whole sample set and c is a hyperparameter (the repo code below uses c = 1.02).
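The same weighting reappears later in the repo's cluster_loss module; a minimal sketch of how the weights can be computed from the label pixel counts (the function name is illustrative, not from the repo):

import torch

def bounded_inverse_class_weights(binary_labels, c=1.02):
    # binary_labels: LongTensor of 0/1 ground-truth pixels
    _, counts = torch.unique(binary_labels, return_counts=True)
    p = counts.float() / counts.sum()   # per-class pixel frequency
    return 1.0 / torch.log(p + c)       # w_class = 1 / ln(c + p_class)

The resulting two-element tensor is what gets passed as weight= to torch.nn.CrossEntropyLoss.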

  • lane embedding branch: splits the segmented lane pixels into different instances. A clustering loss is used to assign a lane id to every pixel from the lane segmentation branch, ignoring background pixels.

Detection methods based on bounding boxes suit compact objects, which lanes are not, so the problem is treated as instance segmentation instead. A one-shot method based on distance metric learning is adopted: the clustering loss is designed so that pixel embeddings of the same lane end up close together while those of different lanes end up far apart. Concretely:

L = L_var + L_dist

$L_{var} = \dfrac{1}{C}\sum_{c=1}^{C}\dfrac{1}{N_c}\sum_{i=1}^{N_c}\big[\lVert \mu_c - x_i \rVert - \delta_v\big]_+^2$

$L_{dist} = \dfrac{1}{C(C-1)}\sum_{c_A=1}^{C}\sum_{c_B=1,\, c_B \neq c_A}^{C}\big[\delta_d - \lVert \mu_{c_A} - \mu_{c_B} \rVert\big]_+^2$

L_var: the variance term. It pulls each pixel embedding towards the mean embedding of its lane; it is hinged, so it only becomes active when a pixel embedding is further than δv from its cluster center.
L_dist: the distance term. It pushes the cluster centers away from each other; it is hinged, so it only becomes active when two cluster centers are closer than δd.
C: number of clusters (lanes)
N_c: number of elements in cluster c
x_i: a pixel embedding vector
μ_c: the mean embedding of cluster c
||·||: the L2 distance
[x]_+ = max(0, x): the hinge

clustering

Clustering is an iterative procedure. To make clustering easy at inference time, the loss above is trained with δd > 6δv (this way, taking any lane embedding as the center of a circle of radius 2δv, all pixels inside that circle can be assigned to the same lane).

At inference time, mean shift is applied first so that the cluster center moves in the direction of increasing density, which prevents outliers from being pulled into the cluster; the pixel embeddings are then partitioned by taking that cluster center as the circle center with radius 2δv and assigning every pixel inside the circle to the same lane. This is repeated until every lane pixel has been assigned to a lane.
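A minimal sketch of this post-processing step, assuming the embeddings of all pixels predicted as "lane" have been stacked into an (N, 4) array (sklearn's MeanShift stands in for the mean-shift step; the repo's own clustering code may differ):

import numpy as np
from sklearn.cluster import MeanShift

def cluster_lane_embeddings(lane_embeddings, delta_v=0.5):
    # lane_embeddings: (N, 4) array, one row per pixel classified as lane
    ms = MeanShift(bandwidth=2 * delta_v, bin_seeding=True).fit(lane_embeddings)
    centers = ms.cluster_centers_
    # assign each pixel to the nearest center, but only within a radius of 2*delta_v
    dists = np.linalg.norm(lane_embeddings[:, None, :] - centers[None, :, :], axis=2)
    lane_ids = dists.argmin(axis=1)
    lane_ids[dists.min(axis=1) > 2 * delta_v] = -1  # leave far-away pixels unassigned
    return lane_ids, centers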

(Author: liyonghong
Link: https://www.jianshu.com/p/c6d38d648509
Source: Jianshu
The copyright belongs to the author. Contact the author for permission before commercial reposting; credit the source for non-commercial reposting.)

Network architecture

It is based on the ENet architecture, as shown in Figure 2. The losses of the two branches are weighted equally and back-propagated through the network.
The ENet paper:

https://arxiv.org/pdf/1606.02147.pdf

A commentary on ENet (in Chinese):

https://blog.csdn.net/u011974639/article/details/78956380

Introductions to the individual ENet sub-modules (Caffe implementation):

https://blog.csdn.net/u013241583/article/details/90170369
https://blog.csdn.net/u013241583/article/details/90171242
https://blog.csdn.net/u013241583/article/details/90174188
https://blog.csdn.net/u013241583/article/details/90174490

ENet network structure:
[Figure: ENet network architecture]
The paper modifies the ENet network slightly:

  • LaneNet's architecture is based on the encoder-decoder network ENet [29], which is therefore modified into a two-branch network. Because ENet's encoder contains many more parameters than its decoder, completely sharing the full encoder between the two tasks would lead to unsatisfactory results [27]. So while the original ENet encoder consists of three stages (stages 1, 2, 3), LaneNet only shares the first two stages (1 and 2) between the branches; stage 3 of the ENet encoder and the full ENet decoder serve as the backbone of each separate branch. The last layer of the segmentation branch outputs a one-channel image (binary segmentation), while the last layer of the embedding branch outputs an N-channel image with embedding dimension N, as shown in Figure 2. Each branch's loss term is weighted equally and back-propagated through the network.

Settings used in the paper

Embedding dimension: 4
δv = 0.5
δd = 3
Input size: 512x256
Adam
batch size = 8
learning rate = 5e-4

HNet

Given the lane instances, the lanes still need to be described parametrically. This is done in a bird's-eye view (which improves the quality of the fit while keeping computation efficient): the instances are first transformed into the bird's-eye view, fitted there, and then transformed back to the original image, with the transformation matrix predicted by a neural network. Details below.

LaneNet outputs a set of pixels per lane, and a curve still has to be fitted through those pixels to obtain a parametric lane. Fitting directly in the input image does not work well (it requires a high-order polynomial), so the LaneNet output (the pixel sets) is transformed into a bird's-eye view before fitting. Using a single fixed transformation matrix gives the result labeled "fixed" in Fig. 4 (2); instead, the paper lets H-Net output the transformation matrix, under which the lane can be fitted well with a low-order polynomial, giving the result labeled "cond" in Fig. 4 (2).
[Figure 4: lane fitting with a fixed transform vs. the H-Net-conditioned transform]
The transformation matrix H has 6 degrees of freedom. (The zeros are placed to enforce that horizontal lines remain horizontal under the transformation.)

In other words, the transformed y coordinate does not depend on the x coordinate:

$H = \begin{bmatrix} a & b & c \\ 0 & d & e \\ 0 & f & 1 \end{bmatrix}$

Lane fitting

(The original paper makes this part sound more complicated than it is; the summary below is enough.)
The lane pixels are transformed with H into the bird's-eye view, where a polynomial x' = f(y') (third order here) is fitted in closed form by least squares, $w = (Y^{T}Y)^{-1}Y^{T}x'$, with each row of Y of the form $[y_i'^3,\; y_i'^2,\; y_i',\; 1]$; the fitted points are then projected back to the image with $H^{-1}$, and H-Net is trained on the mean squared error between the projected fitted x positions and the ground-truth x positions.
The full procedure is illustrated in Figure 3.
[Figure 3: lane fitting through the H-Net-predicted transform]
The H-Net architecture is listed in Table 1.
[Table 1: H-Net network architecture]
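Expressed in code, the fit-and-project-back step could look roughly like this (H is whatever 3x3 matrix H-Net predicts; the function below is an illustrative sketch, not the repo's implementation):

import numpy as np

def fit_lane_birdseye(lane_xy, H, order=3):
    # lane_xy: (N, 2) pixel coordinates of one lane instance in the input image
    pts = np.hstack([lane_xy, np.ones((len(lane_xy), 1))]).T  # homogeneous coordinates, 3xN
    warped = H @ pts
    warped = warped / warped[2]                               # bird's-eye view coordinates
    x_bev, y_bev = warped[0], warped[1]
    coeffs = np.polyfit(y_bev, x_bev, order)                  # least-squares fit of x' = f(y')
    x_fit = np.polyval(coeffs, y_bev)                         # re-evaluate the fitted x' positions
    back = np.linalg.inv(H) @ np.vstack([x_fit, y_bev, np.ones_like(y_bev)])
    back = back / back[2]                                     # project back to the input image
    return back[:2].T                                         # (N, 2) fitted lane points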

Settings used in the paper

Trained for third-order polynomial fitting
Input size: 128x64
Adam
batch size = 10
learning rate = 5e-5

Experiments

The tuSimple dataset: 3626 training images, 2782 test images.
accuracy: the average number of correctly detected points per image.
$acc = \sum_{im} \frac{C_{im}}{S_{im}}$
C_im: number of correctly located points
S_im: number of ground-truth points
(a point counts as correct when it lies within a given threshold of the corresponding ground-truth point)

false positive and false negative scores
$FP = \frac{F_{pred}}{N_{pred}} \qquad FN = \frac{M_{pred}}{N_{gt}}$
F_pred: number of wrongly predicted lanes
N_pred: number of predicted lanes
M_pred: number of missed ground-truth lanes
N_gt: total number of ground-truth lanes
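In code, the three scores reduce to a few ratios; a tiny sketch assuming the counts have already been gathered (illustrative only, not the official evaluation script):

def tusimple_scores(correct_pts, gt_pts, wrong_lanes, pred_lanes, missed_lanes, gt_lanes):
    # correct_pts / gt_pts: per-image lists of C_im and S_im
    acc = sum(c / s for c, s in zip(correct_pts, gt_pts)) / len(gt_pts)  # averaged over images
    fp = wrong_lanes / pred_lanes   # F_pred / N_pred
    fn = missed_lanes / gt_lanes    # M_pred / N_gt
    return acc, fp, fn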


Another excellent write-up (in Chinese):

https://www.jianshu.com/p/c6d38d648509

Code

https://github.com/ms5898/LaneNet-PyTorch

The code uses the PyTorch framework.
Note: this repo does not re-implement H-Net; it simply fits the lanes with sklearn's LinearRegression.
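In other words, the curve-fitting step in this repo is an ordinary polynomial regression in image space; a minimal sketch of that idea (the repo's exact post-processing code may differ):

import numpy as np
from sklearn.linear_model import LinearRegression

def fit_lane_image_space(xs, ys, order=3):
    # xs, ys: pixel coordinates of one clustered lane instance
    Y = np.vander(np.asarray(ys, dtype=float), N=order + 1)   # rows of [y^3, y^2, y, 1]
    reg = LinearRegression().fit(Y, xs)
    # return a callable that predicts x for new row positions y
    return lambda y_new: reg.predict(np.vander(np.asarray(y_new, dtype=float), N=order + 1))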


    Python 3.7
    PyTorch 1.4.0
    torchvision
    sklearn 0.22.1
    NumPy 1.18.2

Dataset

TuSimple dataset

Download

  1. Unzip train_set.zip and test_set.zip into the folder ECBM6040-Project/TUSIMPLE.
  2. Put test_label.json into ECBM6040-Project/TUSIMPLE/test_set (the folder unzipped from test_set.zip).

Preparation

  1. Process train_set into ground truth images, binary ground truth and instance ground truth:
python utils/process_training_dataset_2.py --src_dir (your train_set folder place)
for me this step is: python utils/process_training_dataset_2.py --src_dir /Users/smiffy/Documents/GitHub/ECBM6040-Project/TUSIMPLE/train_set

Walkthrough of process_training_dataset_2.py: this script further splits the TuSimple training set into train/validation/test subsets and saves the processed images.
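Each line of the label JSON files the script iterates over (label_data_*.json) describes one annotated frame, roughly of this form (abbreviated, values illustrative; -2 means "no lane point at this row"):

info_dict = {
    "raw_file": "clips/0313-1/6040/20.jpg",                            # frame path relative to train_set
    "lanes": [[-2, -2, 632, 625, 617], [719, 734, 748, 761, 755]],     # one list of x coordinates per lane
    "h_samples": [240, 250, 260, 270, 280]                             # the fixed y coordinates those x values refer to
}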

import argparse # https://blog.csdn.net/yy_diego/article/details/82851661
import glob
import json
import os
import os.path as ops
import shutil

import cv2
import numpy as np


def init_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('--src_dir', type=str, help='The origin path of unzipped tusimple dataset')
    return parser.parse_args() # returns an argparse.Namespace


def get_image_to_folders(json_label_path, gt_image_dir, gt_binary_dir, gt_instance_dir, src_dir):
    image_nums = len(os.listdir(gt_image_dir)) # number of images already in the output folder (if process_training_dataset_2.py has been run more than once this count will be off; delete the training folder and rerun. It should normally end up at 3626, the size of the training set.)
    with open(json_label_path, 'r') as file:
        for line_index, line in enumerate(file):
            info_dict = json.loads(line)

            raw_file = info_dict['raw_file']
            h_samples = info_dict['h_samples']
            lanes = info_dict['lanes']

            image_path = ops.join(src_dir, raw_file)
            image_name_new = '{:s}.png'.format('{:d}'.format(line_index + image_nums).zfill(4)) # zfill(): left-pad the string with zeros to the given width. https://www.runoob.com/python/att-string-zfill.html
            image_output_path = ops.join(ops.split(src_dir)[0], 'training', 'gt_image', image_name_new)
            binary_output_path = ops.join(ops.split(src_dir)[0], 'training', 'gt_binary_image', image_name_new)
            instance_output_path = ops.join(ops.split(src_dir)[0], 'training', 'gt_instance_image', image_name_new)

            src_image = cv2.imread(image_path, cv2.IMREAD_COLOR) # cv2.IMREAD_COLOR: load a color image, ignoring transparency
            dst_binary_image = np.zeros([src_image.shape[0], src_image.shape[1]], np.uint8)
            dst_instance_image = np.zeros([src_image.shape[0], src_image.shape[1]], np.uint8)

            for lane_index, lane in enumerate(lanes): # TuSimple dataset format: https://blog.csdn.net/qq_38096703/article/details/105513685
                assert len(h_samples) == len(lane) # sanity check: one x value per sampled row
                lane_x = []
                lane_y = []
                for index in range(len(lane)): # skip invalid points
                    if lane[index] == -2:
                        continue
                    else:
                        ptx = lane[index] # valid x coordinate
                        pty = h_samples[index] # valid y coordinate
                        lane_x.append(ptx) # lane_x: all valid x coordinates of one lane in this image
                        lane_y.append(pty) # lane_y: all valid y coordinates of one lane in this image
                if not lane_x:
                    continue
                lane_pts = np.vstack((lane_x, lane_y)).transpose() # np.vstack stacks the arrays vertically (row-wise); transpose() then gives one (x, y) point per row. https://blog.csdn.net/xiongchengluo1129/article/details/79017142
                lane_pts = np.array([lane_pts], np.int64)

                cv2.polylines(dst_binary_image, lane_pts, isClosed=False, color=255, thickness=5)
                cv2.polylines(dst_instance_image, lane_pts, isClosed=False, color=lane_index * 50 + 20, thickness=5) # the gray value encodes the lane id

            cv2.imwrite(binary_output_path, dst_binary_image) # write the images to disk
            cv2.imwrite(instance_output_path, dst_instance_image)
            cv2.imwrite(image_output_path, src_image)
        print('Process {:s} success'.format(json_label_path)) # report that this label file is done


def gen_train_sample(src_dir, b_gt_image_dir, i_gt_image_dir, image_dir):
    os.makedirs('{:s}/txt_for_local'.format(ops.split(src_dir)[0]), exist_ok=True)
    with open('{:s}/txt_for_local/train.txt'.format(ops.split(src_dir)[0]), 'w') as file:
        for image_name in os.listdir(b_gt_image_dir): # os.listdir() returns the list of files and folders in the given directory. https://www.runoob.com/python/os-listdir.html
            if not image_name.endswith('.png'):
                continue
            binary_gt_image_path = ops.join(b_gt_image_dir, image_name)
            instance_gt_image_path = ops.join(i_gt_image_dir, image_name)
            image_path = ops.join(image_dir, image_name)

            b_gt_image = cv2.imread(binary_gt_image_path, cv2.IMREAD_COLOR)
            i_gt_image = cv2.imread(instance_gt_image_path, cv2.IMREAD_COLOR)
            image = cv2.imread(image_path, cv2.IMREAD_COLOR)

            if b_gt_image is None or image is None or i_gt_image is None:
                print('Image set: {:s} broken'.format(image_name))
                continue
            else:
                info = '{:s} {:s} {:s}'.format(image_path, binary_gt_image_path, instance_gt_image_path)
                file.write(info + '\n') # one line per triple of corresponding images
    return


def split_train_txt(src_dir):
    train_file_path =  '{:s}/txt_for_local/train.txt'.format(ops.split(src_dir)[0])
    test_file_path = '{:s}/txt_for_local/test.txt'.format(ops.split(src_dir)[0])
    valid_file_path = '{:s}/txt_for_local/val.txt'.format(ops.split(src_dir)[0])
    with open(train_file_path, 'r') as file: # further split the processed TuSimple training data
        data = file.readlines()
        train_data = data[0:int(len(data)*0.8)] # 2900
        test_data = data[int(len(data)*0.8): int(len(data)*0.9)] # 363
        valid_data = data[int(len(data) * 0.9): -1] # 362
    with open(train_file_path, 'w') as file:
        for d in train_data:
            file.write(d)
    with open(test_file_path, 'w') as file:
        for d in test_data:
            file.write(d)
    with open(valid_file_path, 'w') as file:
        for d in valid_data:
            file.write(d)


def process_tusimple_dataset(src_dir):
    traing_folder_path = ops.join(ops.split(src_dir)[0], 'training') # os.path.split():https://blog.csdn.net/xijuezhu8128/article/details/87861417 os.path.join():https://www.jb51.net/article/171478.htm
    os.makedirs(traing_folder_path, exist_ok=True) # create the output directory

    gt_image_dir = ops.join(traing_folder_path, 'gt_image')
    gt_binary_dir = ops.join(traing_folder_path, 'gt_binary_image')
    gt_instance_dir = ops.join(traing_folder_path, 'gt_instance_image')

    os.makedirs(gt_image_dir, exist_ok=True)
    os.makedirs(gt_binary_dir, exist_ok=True)
    os.makedirs(gt_instance_dir, exist_ok=True)

    for json_label_path in glob.glob('{:s}/*.json'.format(src_dir)): # glob.glob: collect all files matching the pattern. https://blog.csdn.net/georgeai/article/details/81035422
        get_image_to_folders(json_label_path, gt_image_dir, gt_binary_dir, gt_instance_dir, src_dir) # write the images into the folders
    gen_train_sample(src_dir, gt_binary_dir, gt_instance_dir, gt_image_dir) # write the file names from the three folders under training (gt_binary_image, gt_image, gt_instance_image) into train.txt, one matching triple per line
    split_train_txt(src_dir)


if __name__ == '__main__':
    args = init_args()
    process_tusimple_dataset(args.src_dir)

To single-step through process_training_dataset_2.py in a debugger, modify the first function as follows:

def init_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('--src_dir', type=str, default='/home/wqf/ECBM6040-Project/TUSIMPLE/train_set', help='The origin path of unzipped tusimple dataset')
    return parser.parse_args()

Your TUSIMPLE directory should then look like the figure below.
[Figure: expected TUSIMPLE directory layout]
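In text form, the layout the scripts above expect is roughly the following (inferred from the paths used in the code):

ECBM6040-Project/TUSIMPLE/
    train_set/          # unzipped train_set.zip (clips/, label_data_*.json)
    test_set/           # unzipped test_set.zip plus test_label.json
    training/
        gt_image/
        gt_binary_image/
        gt_instance_image/
    txt_for_local/
        train.txt
        val.txt
        test.txt
    Lanenet_output/     # trained models are saved here during training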

Training LaneNet (based on E-Net)

  1. ECBM6040-Project/Notebook-experiment/Dataset Show.ipynb visualizes the dataset used for training. The code is as follows:
import os.path as ops
import numpy as np
import torch
import cv2
import sys
sys.path.append('..') # '..' is the parent directory; without this the import on the next line would fail. https://www.cnblogs.com/mandy-study/p/7735801.html
from dataset.dataset_utils import TUSIMPLE, TUSIMPLE_AUG

# Build The datasets
# root = '/Users/smiffy/Documents/GitHub/TUSIMPLE/Data_Tusimple_PyTorch/training'
root = '../TUSIMPLE/txt_for_local'

train_set = TUSIMPLE(root=root, flag='train')
valid_set = TUSIMPLE(root=root, flag='valid')
test_set = TUSIMPLE(root=root, flag='test')

print('train_set length {}'.format(len(train_set))) # calls __len__; returns 2900
print('valid_set length {}'.format(len(valid_set))) # 362
print('test_set length {}'.format(len(test_set))) # 363

gt, bgt, igt = train_set[280] # pick one sample
print('image type {}'.format(type(gt))) # image type 
print('image size {} \n'.format(gt.size())) # image size torch.Size([3, 256, 512]) 

print('gt binary image type {}'.format(type(bgt))) # gt binary image type 
print('gt binary image size {}'.format(bgt.size())) # gt binary image size torch.Size([256, 512])
print('items in gt binary image {} \n'.format(torch.unique(bgt))) # items in gt binary image tensor([0, 1]) 

print('gt instance type {}'.format(type(igt))) # gt instance type 
print('gt instance size {}'.format(igt.size())) # gt instance size torch.Size([256, 512])
print('items in gt instance {} \n'.format(torch.unique(igt))) # items in gt instance tensor([  0,  20,  70, 120, 170]) 

# Show the images
image_show = ((gt.numpy() + 1) * 127.5).astype(int) # undo the [-1, 1] normalization for display
image_show.shape # (3, 256, 512)

import matplotlib.pyplot as plt
# image_show = image_show[...,::-1]
plt.figure(figsize=(15,15))
image_show = image_show.transpose(1,2,0)
image_show = image_show[...,::-1]
plt.imshow(image_show)

bgt.shape # torch.Size([256, 512])

plt.figure(figsize=(20,20))
ax1 = plt.subplot(121)
plt.imshow(bgt, cmap='gray')
ax1 = plt.subplot(122)
plt.imshow(igt, cmap='gray')

# Aug Dataset
# root = '/Users/smiffy/Documents/GitHub/TUSIMPLE/Data_Tusimple_PyTorch/training'
root = '../TUSIMPLE/txt_for_local'

train_set = TUSIMPLE_AUG(root=root, flag='train')
valid_set = TUSIMPLE_AUG(root=root, flag='valid')
test_set = TUSIMPLE_AUG(root=root, flag='test')

print('train_set length {}'.format(len(train_set))) # 2900x2
print('valid_set length {}'.format(len(valid_set))) # 362x2
print('test_set length {}'.format(len(test_set)))  # 363x2

idx = 280
gt, bgt, igt = train_set[idx]
gt_aug, bgt_aug, igt_aug = train_set[idx+1]
print('image type {}'.format(type(gt)))
print('image size {} \n'.format(gt.size()))

print('gt binary image type {}'.format(type(bgt)))
print('gt binary image size {}'.format(bgt.size()))
print('items in gt binary image {} \n'.format(torch.unique(bgt)))

print('gt instance type {}'.format(type(igt)))
print('gt instance size {}'.format(igt.size()))
print('items in gt instance {} \n'.format(torch.unique(igt)))

image_show = ((gt.numpy() + 1) * 127.5).astype(int)
image_show_aug = ((gt_aug.numpy() + 1) * 127.5).astype(int)
image_show.shape

import matplotlib.pyplot as plt
# image_show = image_show[...,::-1]
plt.figure(figsize=(20,20))
ax1 = plt.subplot(121)
image_show = image_show.transpose(1,2,0)
image_show = image_show[...,::-1]
plt.imshow(image_show)

ax1 = plt.subplot(122)
image_show_aug = image_show_aug.transpose(1,2,0)
image_show_aug = image_show_aug[...,::-1]
plt.imshow(image_show_aug)

plt.show()

plt.figure(figsize=(20,20))
ax1 = plt.subplot(121)
plt.imshow(bgt, cmap='gray')
ax1 = plt.subplot(122)
plt.imshow(igt, cmap='gray')

plt.figure(figsize=(20,20))
ax1 = plt.subplot(121)
plt.imshow(bgt_aug, cmap='gray')
ax1 = plt.subplot(122)
plt.imshow(igt_aug, cmap='gray')

The local module imported above, from dataset.dataset_utils import TUSIMPLE, TUSIMPLE_AUG, is walked through below:

import os.path as ops
import numpy as np
import torch
import cv2
import torchvision


class TUSIMPLE(torch.utils.data.Dataset): # torch.utils.data.Dataset is the abstract class for custom datasets; subclass it and implement __len__ and __getitem__. https://blog.csdn.net/qq_36653505/article/details/83351808
    def __init__(self, root, transforms=None, resize=(512, 256), flag='train'):
        self.root = root
        self.transforms = transforms
        self.resize = resize
        self.flag = flag

        self.img_pathes = []

        self.train_file = ops.join(root, 'train.txt') # the txt files written by the data-preparation script above
        self.val_file = ops.join(root, 'val.txt')
        self.test_file = ops.join(root, 'test.txt')

        if self.flag == 'train':
            file_open = self.train_file
        elif self.flag == 'valid':
            file_open = self.val_file
        else:
            file_open = self.test_file

        with open(file_open, 'r') as file:
            data = file.readlines()
            for l in data: # l: '/home/wqf/ECBM6040-Project/TUSIMPLE/training/gt_image/0487.png /home/wqf/ECBM6040-Project/TUSIMPLE/training/gt_binary_image/0487.png /home/wqf/ECBM6040-Project/TUSIMPLE/training/gt_instance_image/0487.png\n'
                line = l.split() # line:{list:3}
                self.img_pathes.append(line) # {list:{list:3}}

    def __len__(self): # __len__ is the magic method that lets len() work on instances of this class. https://blog.csdn.net/qq_38883271/article/details/96439208
        return len(self.img_pathes) # number of samples

    def __getitem__(self, idx): # __getitem__ lets an instance P be indexed as P[key]. https://blog.csdn.net/chituozha5528/article/details/78354833
        gt_image = cv2.imread(self.img_pathes[idx][0], cv2.IMREAD_UNCHANGED) # read the three images
        gt_binary_image = cv2.imread(self.img_pathes[idx][1], cv2.IMREAD_UNCHANGED)
        gt_instance = cv2.imread(self.img_pathes[idx][2], cv2.IMREAD_UNCHANGED)

        gt_image = cv2.resize(gt_image, dsize=self.resize, interpolation=cv2.INTER_LINEAR) # resize
        gt_binary_image = cv2.resize(gt_binary_image, dsize=self.resize, interpolation=cv2.INTER_NEAREST)
        gt_instance = cv2.resize(gt_instance, dsize=self.resize, interpolation=cv2.INTER_NEAREST)

        gt_image = gt_image / 127.5 - 1.0 # normalize to [-1, 1]
        gt_binary_image = np.array(gt_binary_image / 255.0, dtype=np.uint8) # normalize to [0, 1]
        gt_binary_image = gt_binary_image[:, :, np.newaxis]
        gt_instance = gt_instance[:, :, np.newaxis]

        gt_binary_image = np.transpose(gt_binary_image, (2, 0, 1)) # (1,256,512)
        gt_instance = np.transpose(gt_instance, (2, 0, 1))

        gt_image = torch.tensor(gt_image, dtype=torch.float)
        gt_image = np.transpose(gt_image, (2, 0, 1))
        # trsf = torchvision.transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), inplace=False)
        # gt_image = trsf(gt_image)

        gt_binary_image = torch.tensor(gt_binary_image, dtype=torch.long).view(self.resize[1], self.resize[0])
        #gt_binary_image = torch.tensor(gt_binary_image, dtype=torch.float)
        # gt_instance = torch.tensor(gt_instance, dtype=torch.float)
        gt_instance = torch.tensor(gt_instance, dtype=torch.long).view(self.resize[1], self.resize[0])

        return gt_image, gt_binary_image, gt_instance # return tensors (the author notes being unsure about the exact PyTorch dtype conversions here)

    
class TUSIMPLE_AUG(torch.utils.data.Dataset): # the same dataset with horizontal-flip augmentation
    def __init__(self, root, transforms=None, resize=(512, 256), flag='train'):
        self.root = root
        self.transforms = transforms
        self.resize = resize
        self.flag = flag

        self.img_pathes = []

        self.train_file = ops.join(root, 'train.txt')
        self.val_file = ops.join(root, 'val.txt')
        self.test_file = ops.join(root, 'test.txt')

        if self.flag == 'train':
            file_open = self.train_file
        elif self.flag == 'valid':
            file_open = self.val_file
        else:
            file_open = self.test_file

        with open(file_open, 'r') as file:
            data = file.readlines()
            for l in data:
                line = l.split()
                self.img_pathes.append(line)

    def __len__(self):
        return len(self.img_pathes) * 2

    def __getitem__(self, idx):
        if idx % 2 == 0:
            gt_image = cv2.imread(self.img_pathes[int(idx/2)][0], cv2.IMREAD_UNCHANGED) # cv2.IMREAD_UNCHANGED: read the image as-is, including the alpha (transparency) channel. https://wendao.blog.csdn.net/article/details/98768293
            gt_binary_image = cv2.imread(self.img_pathes[int(idx/2)][1], cv2.IMREAD_UNCHANGED)
            gt_instance = cv2.imread(self.img_pathes[int(idx/2)][2], cv2.IMREAD_UNCHANGED)
        else:
            gt_image = cv2.imread(self.img_pathes[int((idx-1)/2)][0], cv2.IMREAD_UNCHANGED)
            gt_binary_image = cv2.imread(self.img_pathes[int((idx-1)/2)][1], cv2.IMREAD_UNCHANGED)
            gt_instance = cv2.imread(self.img_pathes[int((idx-1)/2)][2], cv2.IMREAD_UNCHANGED)

            gt_image = cv2.flip(gt_image, 1) # flip the image horizontally
            gt_binary_image = cv2.flip(gt_binary_image, 1)
            gt_instance = cv2.flip(gt_instance, 1)

        gt_image = cv2.resize(gt_image, dsize=self.resize, interpolation=cv2.INTER_LINEAR)
        gt_binary_image = cv2.resize(gt_binary_image, dsize=self.resize, interpolation=cv2.INTER_NEAREST)
        gt_instance = cv2.resize(gt_instance, dsize=self.resize, interpolation=cv2.INTER_NEAREST)

        gt_image = gt_image / 127.5 - 1.0
        gt_binary_image = np.array(gt_binary_image / 255.0, dtype=np.uint8)
        gt_binary_image = gt_binary_image[:, :, np.newaxis]
        gt_instance = gt_instance[:, :, np.newaxis]

        gt_binary_image = np.transpose(gt_binary_image, (2, 0, 1))
        gt_instance = np.transpose(gt_instance, (2, 0, 1))

        gt_image = torch.tensor(gt_image, dtype=torch.float)
        gt_image = np.transpose(gt_image, (2, 0, 1))
        # trsf = torchvision.transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), inplace=False)
        # gt_image = trsf(gt_image)

        gt_binary_image = torch.tensor(gt_binary_image, dtype=torch.long).view(self.resize[1], self.resize[0])
        # gt_binary_image = torch.tensor(gt_binary_image, dtype=torch.float)
        # gt_instance = torch.tensor(gt_instance, dtype=torch.float)
        gt_instance = torch.tensor(gt_instance, dtype=torch.long).view(self.resize[1], self.resize[0])

        return gt_image, gt_binary_image, gt_instance
  1. ECBM6040-Project/Train.ipynb trains LaneNet; the models are saved to ECBM6040-Project/TUSIMPLE/Lanenet_output. The code is as follows:
import os.path as ops
import numpy as np
import torch
import cv2
import time
from dataset.dataset_utils import TUSIMPLE
from Lanenet.model2 import Lanenet

# define the dataset
# root = '/Users/smiffy/Documents/GitHub/TUSIMPLE/Data_Tusimple_PyTorch/training'
root = 'TUSIMPLE/txt_for_local'
train_set = TUSIMPLE(root=root, flag='train')
valid_set = TUSIMPLE(root=root, flag='valid')
test_set = TUSIMPLE(root=root, flag='test')

print('train_set length {}'.format(len(train_set)))
print('valid_set length {}'.format(len(valid_set)))
print('test_set length {}'.format(len(test_set)))

gt, bgt, igt = train_set[0]
print('image type {}'.format(type(gt)))
print('image size {} \n'.format(gt.size()))

print('gt binary image type {}'.format(type(bgt)))
print('gt binary image size {}'.format(bgt.size()))
print('items in gt binary image {} \n'.format(torch.unique(bgt)))

print('gt instance type {}'.format(type(igt)))
print('gt instance size {}'.format(igt.size()))
print('items in gt instance {} \n'.format(torch.unique(igt)))

# DataLoader
batch_size = 8

data_loader_train = torch.utils.data.DataLoader(train_set, batch_size=batch_size, shuffle=True, num_workers=0) # num_workers is the number of worker processes; 0 means data is loaded in the main process (the author was unsure why 0 is used here)
data_loader_valid = torch.utils.data.DataLoader(valid_set, batch_size=1, shuffle=True, num_workers=0) # torch.utils.data.DataLoader splits the dataset into batches and yields one batch at a time until the data is exhausted
data_loader_test = torch.utils.data.DataLoader(test_set, batch_size=1, shuffle=False, num_workers=0)

# Model and optim
learning_rate = 5e-4

device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')

LaneNet_model = Lanenet(2, 4) # 2 binary-segmentation classes, 4-dimensional embedding
LaneNet_model.to(device) # move the model to the GPU

params = [p for p in LaneNet_model.parameters() if p.requires_grad] # collect all parameters that require gradient updates. requires_grad: https://blog.csdn.net/xuyi582605786/article/details/104973079/
optimizer = torch.optim.Adam(params, lr=learning_rate, weight_decay=0.0002)

lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1) # learning-rate schedule: decay by 10x every 10 epochs. https://www.cnblogs.com/zf-blog/p/11262906.html

num_epochs = 30

from Lanenet.cluster_loss3 import cluster_loss
criterion = cluster_loss() # instantiate the clustering loss
# criterion = torch.nn.CrossEntropyLoss(weight=torch.tensor([ 1.4393, 27.7296]).cuda())

from torch.autograd import Variable

loss_all = []
for epoch in range(num_epochs): # one pass over data_loader_train is one epoch
    LaneNet_model.train() # switch the model to training mode
    ts = time.time()
    for iter, batch in enumerate(data_loader_train): # one batch per iteration
        input_image = Variable(batch[0]).to(device) # move one batch to the GPU
        binary_labels = Variable(batch[1]).to(device)
        instance_labels = Variable(batch[2]).to(device)
        
        binary_final_logits, instance_embedding = LaneNet_model(input_image) # the network returns the binary segmentation logits and the instance embedding
        # loss = LaneNet_model.compute_loss(binary_logits=binary_final_logits, binary_labels=binary_labels,
        #                               instance_logits=instance_embedding, instance_labels=instance_labels, delta_v=0.5, delta_d=3)
        binary_segmenatation_loss, instance_segmenatation_loss = criterion(binary_logits=binary_final_logits, binary_labels=binary_labels,
                                       instance_logits=instance_embedding, instance_labels=instance_labels, delta_v=0.5, delta_d=3) # compute the two losses
        
        # binary_segmenatation_loss = criterion(binary_final_logits, binary_labels)
        loss = 1*binary_segmenatation_loss + 1*instance_segmenatation_loss # as in the paper, the two losses are weighted equally
        optimizer.zero_grad()
        loss_all.append(loss.item()) # .item() extracts the value of a one-element tensor. https://www.jianshu.com/p/be3276b434b2
        loss.backward()
        optimizer.step()
        
        if iter % 20 == 0:
            print("epoch[{}] iter[{}] loss: [{}, {}] ".format(epoch, iter, binary_segmenatation_loss.item(), instance_segmenatation_loss.item()))
    lr_scheduler.step() # step the learning-rate schedule
    print("Finish epoch[{}], time elapsed[{}]".format(epoch, time.time() - ts))
    torch.save(LaneNet_model.state_dict(), f"TUSIMPLE/Lanenet_output/lanenet_epoch_{epoch}_batch_{8}.model")

# Show the Loss
import matplotlib.pylab as plt
plt.plot(loss_all)

The loss drops very quickly:
[Figure: training loss curve]

The local module imported above, from Lanenet.model2 import Lanenet, is walked through below:

import torch.nn as nn
import torch


class InitialBlock(nn.Module):
    """The initial block is composed of two branches:两个分支
    1. a main branch which performs a regular convolution with stride 2;main分支执行常规卷积,stride=2。输出13层
    2. an extension branch which performs max-pooling. # extension分支执行最大池化。输出3层
    Doing both operations in parallel and concatenating their results # 并行,之后concatenat。总共16层特征图
    allows for efficient downsampling and expansion. The main branch # 可以有效下采样
    outputs 13 feature maps while the extension branch outputs 3, for a 
    total of 16 feature maps after concatenation.
    Keyword arguments:
    - in_channels (int): the number of input channels.
    - out_channels (int): the number output channels.
    - kernel_size (int, optional): the kernel size of the filters used in
    the convolution layer. Default: 3.
    - padding (int, optional): zero-padding added to both sides of the
    input. Default: 0.
    - bias (bool, optional): Adds a learnable bias to the output if
    ``True``. Default: False.
    - relu (bool, optional): When ``True`` ReLU is used as the activation
    function; otherwise, PReLU is used. Default: True.
    """

    def __init__(self,
                 in_channels,
                 out_channels,
                 bias=False,
                 relu=True):
        super().__init__() # https://blog.csdn.net/a__int__/article/details/104600972

        if relu:
            activation = nn.ReLU
        else:
            activation = nn.PReLU

        # Main branch - As stated above the number of output channels for this
        # branch is the total minus 3, since the remaining channels come from
        # the extension branch
        self.main_branch = nn.Conv2d(
            in_channels,
            out_channels - 3,
            kernel_size=3,
            stride=2,
            padding=1,
            bias=bias)

        # Extension branch
        self.ext_branch = nn.MaxPool2d(3, stride=2, padding=1)

        # Initialize batch normalization to be used after concatenation
        self.batch_norm = nn.BatchNorm2d(out_channels)

        # PReLU layer to apply after concatenating the branches
        self.out_activation = activation()

    def forward(self, x):
        main = self.main_branch(x)
        ext = self.ext_branch(x)

        # Concatenate branches
        out = torch.cat((main, ext), 1)

        # Apply batch normalization
        out = self.batch_norm(out)

        return self.out_activation(out)


class RegularBottleneck(nn.Module):
    """Regular bottlenecks are the main building block of ENet.
    Main branch:
    1. Shortcut connection.
    Extension branch:
    1. 1x1 convolution which decreases the number of channels by
    ``internal_ratio``, also called a projection;
    2. regular, dilated or asymmetric convolution;
    3. 1x1 convolution which increases the number of channels back to
    ``channels``, also called an expansion;
    4. dropout as a regularizer.
    Keyword arguments:
    - channels (int): the number of input and output channels.
    - internal_ratio (int, optional): a scale factor applied to
    ``channels`` used to compute the number of
    channels after the projection. eg. given ``channels`` equal to 128 and
    internal_ratio equal to 2 the number of channels after the projection
    is 64. Default: 4.
    - kernel_size (int, optional): the kernel size of the filters used in
    the convolution layer described above in item 2 of the extension
    branch. Default: 3.
    - padding (int, optional): zero-padding added to both sides of the
    input. Default: 0.
    - dilation (int, optional): spacing between kernel elements for the
    convolution described in item 2 of the extension branch. Default: 1.
    (A dilated convolution leaves gaps between the kernel elements.)
    - asymmetric (bool, optional): flags if the convolution described in
    item 2 of the extension branch is asymmetric or not. Default: False.
    - dropout_prob (float, optional): probability of an element to be
    zeroed. Default: 0 (no dropout).
    - bias (bool, optional): Adds a learnable bias to the output if
    ``True``. Default: False.
    - relu (bool, optional): When ``True`` ReLU is used as the activation
    function; otherwise, PReLU is used. Default: True.
    """

    def __init__(self,
                 channels,
                 internal_ratio=4,
                 kernel_size=3,
                 padding=0,
                 dilation=1,
                 asymmetric=False,
                 dropout_prob=0,
                 bias=False,
                 relu=True):
        super().__init__()

        # Check in the internal_scale parameter is within the expected range
        # [1, channels]
        if internal_ratio <= 1 or internal_ratio > channels:
            raise RuntimeError("Value out of range. Expected value in the "
                               "interval [1, {0}], got internal_scale={1}."
                               .format(channels, internal_ratio))

        internal_channels = channels // internal_ratio

        if relu:
            activation = nn.ReLU
        else:
            activation = nn.PReLU

        # Main branch - shortcut connection

        # Extension branch - 1x1 convolution, followed by a regular, dilated or
        # asymmetric convolution, followed by another 1x1 convolution, and,
        # finally, a regularizer (spatial dropout). Number of channels is constant.

        # 1x1 projection convolution
        self.ext_conv1 = nn.Sequential(
            nn.Conv2d(
                channels,
                internal_channels,
                kernel_size=1,
                stride=1,
                bias=bias), nn.BatchNorm2d(internal_channels), activation())

        # If the convolution is asymmetric we split the main convolution in
        # two. Eg. for a 5x5 asymmetric convolution we have two convolution:
        # the first is 5x1 and the second is 1x5 (a factorized convolution).
        if asymmetric:
            self.ext_conv2 = nn.Sequential(
                nn.Conv2d(
                    internal_channels,
                    internal_channels,
                    kernel_size=(kernel_size, 1),
                    stride=1,
                    padding=(padding, 0),
                    dilation=dilation,
                    bias=bias), nn.BatchNorm2d(internal_channels), activation(),
                nn.Conv2d(
                    internal_channels,
                    internal_channels,
                    kernel_size=(1, kernel_size),
                    stride=1,
                    padding=(0, padding),
                    dilation=dilation,
                    bias=bias), nn.BatchNorm2d(internal_channels), activation())
        else:
            self.ext_conv2 = nn.Sequential(
                nn.Conv2d(
                    internal_channels,
                    internal_channels,
                    kernel_size=kernel_size,
                    stride=1,
                    padding=padding,
                    dilation=dilation,
                    bias=bias), nn.BatchNorm2d(internal_channels), activation())

        # 1x1 expansion convolution
        self.ext_conv3 = nn.Sequential(
            nn.Conv2d(
                internal_channels,
                channels,
                kernel_size=1,
                stride=1,
                bias=bias), nn.BatchNorm2d(channels), activation())

        self.ext_regul = nn.Dropout2d(p=dropout_prob)

        # PReLU layer to apply after adding the branches
        self.out_activation = activation()

    def forward(self, x):
        # Main branch shortcut
        main = x

        # Extension branch
        ext = self.ext_conv1(x)
        ext = self.ext_conv2(ext)
        ext = self.ext_conv3(ext)
        ext = self.ext_regul(ext)

        # Add main and extension branches
        out = main + ext

        return self.out_activation(out)


class DownsamplingBottleneck(nn.Module):
    """Downsampling bottlenecks further downsample the feature map size.用于进一步下采样特征图。
    Main branch:
    1. max pooling with stride 2; indices are saved to be used for
    unpooling later.步长为2的最大池化,保存索引用于后续上采样
    Extension branch:
    1. 2x2 convolution with stride 2 that decreases the number of channels
    by ``internal_ratio``, also called a projection;用2x2卷积通过internal_ratio(也叫投影)降采样通道数
    2. regular convolution (by default, 3x3);
    3. 1x1 convolution which increases the number of channels to
    ``out_channels``, also called an expansion;1x1卷积增加输出通道数,也叫expansion
    4. dropout as a regularizer.dropout实现正则化
    Keyword arguments:
    - in_channels (int): the number of input channels.
    - out_channels (int): the number of output channels.
    - internal_ratio (int, optional): a scale factor applied to ``channels``
    used to compute the number of channels after the projection. eg. given
    ``channels`` equal to 128 and internal_ratio equal to 2 the number of
    channels after the projection is 64. Default: 4.
    - return_indices (bool, optional):  if ``True``, will return the max
    indices along with the outputs. Useful when unpooling later.
    - dropout_prob (float, optional): probability of an element to be
    zeroed. Default: 0 (no dropout).
    - bias (bool, optional): Adds a learnable bias to the output if
    ``True``. Default: False.
    - relu (bool, optional): When ``True`` ReLU is used as the activation
    function; otherwise, PReLU is used. Default: True.
    """

    def __init__(self,
                 in_channels,
                 out_channels,
                 internal_ratio=4,
                 return_indices=False,
                 dropout_prob=0,
                 bias=False,
                 relu=True):
        super().__init__()

        # Store parameters that are needed later
        self.return_indices = return_indices

        # Check in the internal_scale parameter is within the expected range
        # [1, channels]
        if internal_ratio <= 1 or internal_ratio > in_channels:
            raise RuntimeError("Value out of range. Expected value in the "
                               "interval [1, {0}], got internal_scale={1}. "
                               .format(in_channels, internal_ratio))

        internal_channels = in_channels // internal_ratio

        if relu:
            activation = nn.ReLU
        else:
            activation = nn.PReLU

        # Main branch - max pooling followed by feature map (channels) padding
        self.main_max1 = nn.MaxPool2d(
            2,
            stride=2,
            return_indices=return_indices)

        # Extension branch - 2x2 convolution, followed by a regular, dilated or
        # asymmetric convolution, followed by another 1x1 convolution. Number
        # of channels is doubled.

        # 2x2 projection convolution with stride 2
        self.ext_conv1 = nn.Sequential(
            nn.Conv2d(
                in_channels,
                internal_channels,
                kernel_size=2,
                stride=2,
                bias=bias), nn.BatchNorm2d(internal_channels), activation())

        # Convolution
        self.ext_conv2 = nn.Sequential(
            nn.Conv2d(
                internal_channels,
                internal_channels,
                kernel_size=3,
                stride=1,
                padding=1,
                bias=bias), nn.BatchNorm2d(internal_channels), activation())

        # 1x1 expansion convolution
        self.ext_conv3 = nn.Sequential(
            nn.Conv2d(
                internal_channels,
                out_channels,
                kernel_size=1,
                stride=1,
                bias=bias), nn.BatchNorm2d(out_channels), activation())

        self.ext_regul = nn.Dropout2d(p=dropout_prob)

        # PReLU layer to apply after concatenating the branches
        self.out_activation = activation()

    def forward(self, x):
        # Main branch shortcut
        if self.return_indices:
            main, max_indices = self.main_max1(x)
        else:
            main = self.main_max1(x)

        # Extension branch
        ext = self.ext_conv1(x)
        ext = self.ext_conv2(ext)
        ext = self.ext_conv3(ext)
        ext = self.ext_regul(ext)

        # Main branch channel padding
        n, ch_ext, h, w = ext.size()
        ch_main = main.size()[1]
        padding = torch.zeros(n, ch_ext - ch_main, h, w)

        # Before concatenating, check if main is on the CPU or GPU and
        # convert padding accordingly
        if main.is_cuda:
            padding = padding.cuda()

        # Concatenate
        main = torch.cat((main, padding), 1)

        # Add main and extension branches
        out = main + ext

        return self.out_activation(out), max_indices


class UpsamplingBottleneck(nn.Module):
    """The upsampling bottlenecks upsample the feature map resolution using max
    pooling indices stored from the corresponding downsampling bottleneck.
    Main branch:
    1. 1x1 convolution with stride 1 that decreases the number of channels by
    ``internal_ratio``, also called a projection;
    2. max unpool layer using the max pool indices from the corresponding
    downsampling max pool layer.
    Extension branch:
    1. 1x1 convolution with stride 1 that decreases the number of channels by
    ``internal_ratio``, also called a projection;
    2. transposed convolution (by default, 3x3);
    3. 1x1 convolution which increases the number of channels to
    ``out_channels``, also called an expansion;
    4. dropout as a regularizer.
    Keyword arguments:
    - in_channels (int): the number of input channels.
    - out_channels (int): the number of output channels.
    - internal_ratio (int, optional): a scale factor applied to ``in_channels``
     used to compute the number of channels after the projection. eg. given
     ``in_channels`` equal to 128 and ``internal_ratio`` equal to 2 the number
     of channels after the projection is 64. Default: 4.
    - dropout_prob (float, optional): probability of an element to be zeroed.
    Default: 0 (no dropout).
    - bias (bool, optional): Adds a learnable bias to the output if ``True``.
    Default: False.
    - relu (bool, optional): When ``True`` ReLU is used as the activation
    function; otherwise, PReLU is used. Default: True.
    """

    def __init__(self,
                 in_channels,
                 out_channels,
                 internal_ratio=4,
                 dropout_prob=0,
                 bias=False,
                 relu=True):
        super().__init__()

        # Check in the internal_scale parameter is within the expected range
        # [1, channels]
        if internal_ratio <= 1 or internal_ratio > in_channels:
            raise RuntimeError("Value out of range. Expected value in the "
                               "interval [1, {0}], got internal_scale={1}. "
                               .format(in_channels, internal_ratio))

        internal_channels = in_channels // internal_ratio

        if relu:
            activation = nn.ReLU
        else:
            activation = nn.PReLU

        # Main branch - max pooling followed by feature map (channels) padding
        self.main_conv1 = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=bias),
            nn.BatchNorm2d(out_channels))

        # Remember that the stride is the same as the kernel_size, just like
        # the max pooling layers
        self.main_unpool1 = nn.MaxUnpool2d(kernel_size=2)

        # Extension branch - 1x1 convolution, followed by a regular, dilated or
        # asymmetric convolution, followed by another 1x1 convolution. Number
        # of channels is doubled.

        # 1x1 projection convolution with stride 1
        self.ext_conv1 = nn.Sequential(
            nn.Conv2d(
                in_channels, internal_channels, kernel_size=1, bias=bias),
            nn.BatchNorm2d(internal_channels), activation())

        # Transposed convolution
        self.ext_tconv1 = nn.ConvTranspose2d(
            internal_channels,
            internal_channels,
            kernel_size=2,
            stride=2,
            bias=bias)
        self.ext_tconv1_bnorm = nn.BatchNorm2d(internal_channels)
        self.ext_tconv1_activation = activation()

        # 1x1 expansion convolution
        self.ext_conv2 = nn.Sequential(
            nn.Conv2d(
                internal_channels, out_channels, kernel_size=1, bias=bias),
            nn.BatchNorm2d(out_channels), activation())

        self.ext_regul = nn.Dropout2d(p=dropout_prob)

        # PReLU layer to apply after concatenating the branches
        self.out_activation = activation()

    def forward(self, x, max_indices, output_size):
        # Main branch shortcut
        main = self.main_conv1(x)
        main = self.main_unpool1(
            main, max_indices, output_size=output_size)

        # Extension branch
        ext = self.ext_conv1(x)
        ext = self.ext_tconv1(ext, output_size=output_size)
        ext = self.ext_tconv1_bnorm(ext)
        ext = self.ext_tconv1_activation(ext)
        ext = self.ext_conv2(ext)
        ext = self.ext_regul(ext)

        # Add main and extension branches
        out = main + ext

        return self.out_activation(out)


class Lanenet(nn.Module): # the standard way to define a custom PyTorch model. https://zhuanlan.zhihu.com/p/88712978 https://blog.csdn.net/weixin_42018112/article/details/90084419
    def __init__(self, binary_seg, embedding_dim, encoder_relu=False, decoder_relu=True):
        super(Lanenet, self).__init__() # on the difference between super().__init__() and super(Class, self).__init__(): https://www.v2ex.com/amp/t/740751 (here they behave the same)

        self.initial_block = InitialBlock(3, 16, relu=encoder_relu) # https://blog.csdn.net/u013241583/article/details/90170369

        # Stage 1 share
        self.downsample1_0 = DownsamplingBottleneck(16, 64, return_indices=True, dropout_prob=0.01, relu=encoder_relu) # https://blog.csdn.net/u013241583/article/details/90171242
        self.regular1_1 = RegularBottleneck(64, padding=1, dropout_prob=0.01, relu=encoder_relu)
        self.regular1_2 = RegularBottleneck(64, padding=1, dropout_prob=0.01, relu=encoder_relu)
        self.regular1_3 = RegularBottleneck(64, padding=1, dropout_prob=0.01, relu=encoder_relu)
        self.regular1_4 = RegularBottleneck(64, padding=1, dropout_prob=0.01, relu=encoder_relu)

        # Stage 2 share
        self.downsample2_0 = DownsamplingBottleneck(64, 128, return_indices=True, dropout_prob=0.1, relu=encoder_relu)
        self.regular2_1 = RegularBottleneck(128, padding=1, dropout_prob=0.1, relu=encoder_relu)
        self.dilated2_2 = RegularBottleneck(128, dilation=2, padding=2, dropout_prob=0.1, relu=encoder_relu)
        self.asymmetric2_3 = RegularBottleneck(128, kernel_size=5, padding=2, asymmetric=True, dropout_prob=0.1, relu=encoder_relu)
        self.dilated2_4 = RegularBottleneck(128, dilation=4, padding=4, dropout_prob=0.1, relu=encoder_relu)
        self.regular2_5 = RegularBottleneck(128, padding=1, dropout_prob=0.1, relu=encoder_relu)
        self.dilated2_6 = RegularBottleneck(128, dilation=8, padding=8, dropout_prob=0.1, relu=encoder_relu)
        self.asymmetric2_7 = RegularBottleneck(128, kernel_size=5, asymmetric=True, padding=2, dropout_prob=0.1, relu=encoder_relu)
        self.dilated2_8 = RegularBottleneck(128, dilation=16, padding=16, dropout_prob=0.1, relu=encoder_relu)

        # stage 3 binary
        self.regular_binary_3_0 = RegularBottleneck(128, padding=1, dropout_prob=0.1, relu=encoder_relu)
        self.dilated_binary_3_1 = RegularBottleneck(128, dilation=2, padding=2, dropout_prob=0.1, relu=encoder_relu)
        self.asymmetric_binary_3_2 = RegularBottleneck(128, kernel_size=5, padding=2, asymmetric=True, dropout_prob=0.1, relu=encoder_relu)
        self.dilated_binary_3_3 = RegularBottleneck(128, dilation=4, padding=4, dropout_prob=0.1, relu=encoder_relu)
        self.regular_binary_3_4 = RegularBottleneck(128, padding=1, dropout_prob=0.1, relu=encoder_relu)
        self.dilated_binary_3_5 = RegularBottleneck(128, dilation=8, padding=8, dropout_prob=0.1, relu=encoder_relu)
        self.asymmetric_binary_3_6 = RegularBottleneck(128, kernel_size=5, asymmetric=True, padding=2, dropout_prob=0.1, relu=encoder_relu)
        self.dilated_binary_3_7 = RegularBottleneck(128, dilation=16, padding=16, dropout_prob=0.1, relu=encoder_relu)

        # stage 3 embedding
        self.regular_embedding_3_0 = RegularBottleneck(128, padding=1, dropout_prob=0.1, relu=encoder_relu)
        self.dilated_embedding_3_1 = RegularBottleneck(128, dilation=2, padding=2, dropout_prob=0.1, relu=encoder_relu)
        self.asymmetric_embedding_3_2 = RegularBottleneck(128, kernel_size=5, padding=2, asymmetric=True, dropout_prob=0.1, relu=encoder_relu)
        self.dilated_embedding_3_3 = RegularBottleneck(128, dilation=4, padding=4, dropout_prob=0.1, relu=encoder_relu)
        self.regular_embedding_3_4 = RegularBottleneck(128, padding=1, dropout_prob=0.1, relu=encoder_relu)
        self.dilated_embedding_3_5 = RegularBottleneck(128, dilation=8, padding=8, dropout_prob=0.1, relu=encoder_relu)
        self.asymmetric_bembedding_3_6 = RegularBottleneck(128, kernel_size=5, asymmetric=True, padding=2, dropout_prob=0.1, relu=encoder_relu)
        self.dilated_embedding_3_7 = RegularBottleneck(128, dilation=16, padding=16, dropout_prob=0.1, relu=encoder_relu)

        # binary branch
        self.upsample_binary_4_0 = UpsamplingBottleneck(128, 64, dropout_prob=0.1, relu=decoder_relu)
        self.regular_binary_4_1 = RegularBottleneck(64, padding=1, dropout_prob=0.1, relu=decoder_relu)
        self.regular_binary_4_2 = RegularBottleneck(64, padding=1, dropout_prob=0.1, relu=decoder_relu)
        self.upsample_binary_5_0 = UpsamplingBottleneck(64, 16, dropout_prob=0.1, relu=decoder_relu)
        self.regular_binary_5_1 = RegularBottleneck(16, padding=1, dropout_prob=0.1, relu=decoder_relu)
        self.binary_transposed_conv = nn.ConvTranspose2d(16, binary_seg, kernel_size=3, stride=2, padding=1, bias=False)

        # embedding branch
        self.upsample_embedding_4_0 = UpsamplingBottleneck(128, 64, dropout_prob=0.1, relu=decoder_relu)
        self.regular_embedding_4_1 = RegularBottleneck(64, padding=1, dropout_prob=0.1, relu=decoder_relu)
        self.regular_embedding_4_2 = RegularBottleneck(64, padding=1, dropout_prob=0.1, relu=decoder_relu)
        self.upsample_embedding_5_0 = UpsamplingBottleneck(64, 16, dropout_prob=0.1, relu=decoder_relu)
        self.regular_embedding_5_1 = RegularBottleneck(16, padding=1, dropout_prob=0.1, relu=decoder_relu)
        self.embedding_transposed_conv = nn.ConvTranspose2d(16, embedding_dim, kernel_size=3, stride=2, padding=1, bias=False)

    def forward(self, x):
        # Initial block
        input_size = x.size() # torch.Size([8, 3, 256, 512])
        x = self.initial_block(x)

        # Stage 1 share
        stage1_input_size = x.size()
        x, max_indices1_0 = self.downsample1_0(x)
        x = self.regular1_1(x)
        x = self.regular1_2(x)
        x = self.regular1_3(x)
        x = self.regular1_4(x)

        # Stage 2 share
        stage2_input_size = x.size()
        x, max_indices2_0 = self.downsample2_0(x)
        x = self.regular2_1(x)
        x = self.dilated2_2(x)
        x = self.asymmetric2_3(x)
        x = self.dilated2_4(x)
        x = self.regular2_5(x)
        x = self.dilated2_6(x)
        x = self.asymmetric2_7(x)
        x = self.dilated2_8(x)

        # stage 3 binary
        x_binary = self.regular_binary_3_0(x)
        x_binary = self.dilated_binary_3_1(x_binary)
        x_binary = self.asymmetric_binary_3_2(x_binary)
        x_binary = self.dilated_binary_3_3(x_binary)
        x_binary = self.regular_binary_3_4(x_binary)
        x_binary = self.dilated_binary_3_5(x_binary)
        x_binary = self.asymmetric_binary_3_6(x_binary)
        x_binary = self.dilated_binary_3_7(x_binary)

        # stage 3 embedding
        x_embedding = self.regular_embedding_3_0(x)
        x_embedding = self.dilated_embedding_3_1(x_embedding)
        x_embedding = self.asymmetric_embedding_3_2(x_embedding)
        x_embedding = self.dilated_embedding_3_3(x_embedding)
        x_embedding = self.regular_embedding_3_4(x_embedding)
        x_embedding = self.dilated_embedding_3_5(x_embedding)
        x_embedding = self.asymmetric_bembedding_3_6(x_embedding)
        x_embedding = self.dilated_embedding_3_7(x_embedding)

        # binary branch
        x_binary = self.upsample_binary_4_0(x_binary, max_indices2_0, output_size=stage2_input_size)
        x_binary = self.regular_binary_4_1(x_binary)
        x_binary = self.regular_binary_4_2(x_binary)
        x_binary = self.upsample_binary_5_0(x_binary, max_indices1_0, output_size=stage1_input_size)
        x_binary = self.regular_binary_5_1(x_binary)
        binary_final_logits = self.binary_transposed_conv(x_binary, output_size=input_size)

        # embedding branch
        x_embedding = self.upsample_embedding_4_0(x_embedding, max_indices2_0, output_size=stage2_input_size)
        x_embedding = self.regular_embedding_4_1(x_embedding)
        x_embedding = self.regular_embedding_4_2(x_embedding)
        x_embedding = self.upsample_embedding_5_0(x_embedding, max_indices1_0, output_size=stage1_input_size)
        x_embedding = self.regular_embedding_5_1(x_embedding)
        instance_notfinal_logits = self.embedding_transposed_conv(x_embedding, output_size=input_size)

        return binary_final_logits, instance_notfinal_logits


if __name__ == '__main__':
    test_input = torch.ones((8, 3, 256, 512))
    net = Lanenet(2, 4)
    binary_final_logits, instance_notfinal_logits = net(test_input)
    print(binary_final_logits.shape)
    print(instance_notfinal_logits.shape)
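Running the __main__ self-test above should print torch.Size([8, 2, 256, 512]) for the binary logits and torch.Size([8, 4, 256, 512]) for the embedding, i.e. the two segmentation channels and the 4-dimensional embedding at the full input resolution (assuming the shapes trace through as in the comments above).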

The local module imported above, from Lanenet.cluster_loss3 import cluster_loss, is walked through below:

import torch
import torch.nn as nn
from torch_scatter import scatter # on installing this package: https://www.cnblogs.com/cykablyat/p/14293500.html. Download the build matching your CUDA and Python versions; for the author the link was https://pytorch-geometric.com/whl/torch-1.7.0.html and the file torch_scatter-latest+cu110-cp38-cp38-linux_x86_64.whl


class cluster_loss_helper(nn.Module): # computes L = L_var + L_dist for the instance-embedding branch
    def __init__(self):
        super(cluster_loss_helper, self).__init__()

    def forward(self, prediction, correct_label, delta_v, delta_d):
        """

        :param prediction: [N, 4, 256, 512]
        :param correct_label: [N, 256, 512]
        :param delta_v:
        :param delta_d:
        :return:
        """
        prediction_reshape = prediction.view(prediction.shape[0], prediction.shape[1],
                                             prediction.shape[2] * prediction.shape[3])  # [N, 4, 131072]
        correct_label_reshape = correct_label.view(correct_label.shape[0], 1,
                                                   correct_label.shape[1] * correct_label.shape[
                                                       2])  # [N, 1, 131072]

        output, inverse_indices, counts = torch.unique(correct_label_reshape, return_inverse=True,
                                                       return_counts=True)
        counts = counts.float()
        num_instances = len(output) # number of distinct labels (lane instances, plus the background label)

        # mu_sum = scatter(prediction_reshape, inverse_indices, dim=2, reduce="sum") # [N, 4, 5]
        # muc = mu_sum/counts # [N, 4, 5]
        muc = scatter(prediction_reshape, inverse_indices, dim=2, reduce="mean")  # [N, 4, 5]: the per-cluster mean embeddings. scatter usage: https://blog.csdn.net/StarfishCu/article/details/108853080

        dis = torch.index_select(muc, 2, inverse_indices.view(inverse_indices.shape[-1]),
                                 out=None)  # [N, 4, 131072]
        dis = dis - prediction_reshape
        dis = torch.norm(dis, dim=1, keepdim=False, out=None, dtype=None)  # [N, 131072]
        dis = dis - delta_v
        dis = torch.clamp(dis, min=0.)  # [N, 131072]
        dis = torch.pow(dis, 2, out=None)

        L_var = scatter(dis, inverse_indices.view(inverse_indices.shape[-1]), dim=1, reduce="mean")  # [N, 3]
        L_var = torch.sum(L_var) / num_instances

        L_dist = torch.tensor(0, dtype=torch.float)
        for A in range(num_instances):
            for B in range(num_instances):
                if A != B:
                    dis = muc[:, :, A] - muc[:, :, B]
                    dis = torch.norm(dis, dim=1, keepdim=False, out=None, dtype=None)
                    dis = delta_d - dis
                    dis = torch.clamp(dis, min=0.)
                    dis = torch.pow(dis, 2, out=None)
                    L_dist = L_dist + dis
        L_dist = L_dist / (num_instances * (num_instances - 1))
        L_dist = L_dist.view([])
        total_loss = L_var + L_dist
        return total_loss


class cluster_loss(nn.Module):
    def __init__(self):
        super(cluster_loss, self).__init__()

    def forward(self, binary_logits, binary_labels,
                instance_logits, instance_labels, delta_v=0.5, delta_d=3):
        # Binary Loss
        # Since the two classes (lane/background) are highly unbalanced, we apply bounded inverse class weighting
        output, counts = torch.unique(binary_labels, return_inverse=False, return_counts=True) # output: the unique label values; counts: how many pixels of each. torch.unique: https://blog.csdn.net/t20134297/article/details/108235355
        counts = counts.float()
        inverse_weights = torch.div(1.0, torch.log(
            torch.add(torch.div(counts, torch.sum(counts)), torch.tensor(1.02, dtype=torch.float)))) # bounded inverse class weighting for the unbalanced lane/background classes

        binary_loss = torch.nn.CrossEntropyLoss(weight=inverse_weights)
        binary_segmenatation_loss = binary_loss(binary_logits, binary_labels) # weighted CrossEntropyLoss

        batch_size = instance_logits.shape[0]
        loss_set = []
        for dimen in range(batch_size):
            loss_set.append(cluster_loss_helper()) # one helper module per sample in the batch

        instance_segmenatation_loss = torch.tensor(0.)#.cuda() 

        for dimen in range(batch_size):
            instance_loss = loss_set[dimen] # the helper for this sample
            # prediction = instance_logits[dimen].view(1, instance_logits.shape[1], instance_logits.shape[2],
            #                                          instance_logits.shape[3])
            # correct_label = instance_labels[dimen].view(1, instance_labels.shape[1], instance_labels.shape[2])
            # instance_segmenatation_loss += instance_loss(prediction, correct_label, delta_v, delta_d)
            prediction = torch.unsqueeze(instance_logits[dimen], 0) # .cuda()
            correct_label = torch.unsqueeze(instance_labels[dimen], 0)# .cuda()
            instance_segmenatation_loss += instance_loss(prediction, correct_label, delta_v, delta_d)

        instance_segmenatation_loss = instance_segmenatation_loss / batch_size
        return binary_segmenatation_loss, instance_segmenatation_loss

Running up to this point, an error may be raised. The error message is as follows:

Traceback (most recent call last):
File “/home/wqf/ECBM6040-Project/test.py”, line 82, in
binary_segmenatation_loss, instance_segmenatation_loss = criterion(
File “/home/wqf/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py”, line 727, in _call_impl
result = self.forward(*input, **kwargs)
File “/home/wqf/ECBM6040-Project/Lanenet/cluster_loss3.py”, line 98, in forward
instance_segmenatation_loss += instance_loss(prediction, correct_label, delta_v, delta_d)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

This says that all tensors must live on the same device, but at least two devices, cuda:0 and cpu, were found. So at the point of the error we print where every variable lives:

            print(delta_d.device)
            print(delta_v.device)
            print(prediction.device)
            print(correct_label.device)

Do not forget the variable on the left-hand side of the assignment either:

            print(instance_segmenatation_loss)

This reveals that delta_d, delta_v and instance_segmenatation_loss are the troublemakers, so we move them onto the GPU:

            delta_v = delta_v.to(device)
            delta_d = delta_d.to(device)
instance_segmenatation_loss = torch.tensor(0.).cuda()

But then another error is raised:

Traceback (most recent call last):
File “/home/wqf/ECBM6040-Project/test.py”, line 82, in
binary_segmenatation_loss, instance_segmenatation_loss = criterion(
File “/home/wqf/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py”, line 727, in _call_impl
result = self.forward(*input, **kwargs)
File “/home/wqf/ECBM6040-Project/Lanenet/cluster_loss3.py”, line 99, in forward
delta_v = delta_v.cuda()
AttributeError: ‘float’ object has no attribute 'cuda’

So we first convert delta_d and delta_v to tensors and then move them to the GPU (the np.float/np.int wrappers below are plain casts; on newer NumPy versions, where np.float and np.int have been removed, the built-in float and int work just as well):

import numpy as np
...
    def forward(self, binary_logits, binary_labels,
                instance_logits, instance_labels, delta_v=np.float(0.5), delta_d=np.int(3)):
                ...
            delta_v = torch.tensor(delta_v)
            delta_d = torch.tensor(delta_d)

After that, the code runs happily.
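As a side note, a small helper like the one below (my own sketch, not part of the repository; it only assumes torch is imported) can save time the next time this kind of device mismatch shows up, since it prints the device of every tensor passed to it:

def report_devices(**tensors):
    # print the device of every tensor argument; non-tensor arguments are reported by type
    for name, value in tensors.items():
        location = value.device if torch.is_tensor(value) else type(value).__name__
        print(f'{name}: {location}')

# usage right before the failing call:
# report_devices(prediction=prediction, correct_label=correct_label,
#                delta_v=delta_v, delta_d=delta_d,
#                instance_segmenatation_loss=instance_segmenatation_loss)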

The modified cluster_loss3 code is as follows:

# _*_ coding utf-8 _*_
# Team:
# Author: wqf
# Created: 2021/1/22 17:10
# File: cluster_loss3.py
# IDE: PyCharm
import torch
import torch.nn as nn
from torch_scatter import scatter # installing this package: https://www.cnblogs.com/cykablyat/p/14293500.html. Make sure to download the wheel matching your CUDA and Python versions; in my case the download page is https://pytorch-geometric.com/whl/torch-1.7.0.html and the file is torch_scatter-latest+cu110-cp38-cp38-linux_x86_64.whl
import numpy as np

class cluster_loss_helper(nn.Module): # computes the clustering loss L = L_var + L_dist for the instance-embedding branch; this is the code form of the paper's formula
    def __init__(self):
        super(cluster_loss_helper, self).__init__()

    def forward(self, prediction, correct_label, delta_v, delta_d):
        """

        :param prediction: [N, 4, 256, 512]
        :param correct_label: [N, 256, 512]
        :param delta_v:
        :param delta_d:
        :return:
        """
        prediction_reshape = prediction.view(prediction.shape[0], prediction.shape[1],
                                             prediction.shape[2] * prediction.shape[3])  # [N, 4, 131072]
        correct_label_reshape = correct_label.view(correct_label.shape[0], 1,
                                                   correct_label.shape[1] * correct_label.shape[
                                                       2])  # [N, 1, 131072]

        output, inverse_indices, counts = torch.unique(correct_label_reshape, return_inverse=True,
                                                       return_counts=True)
        counts = counts.float()
        num_instances = len(output) # number of distinct instance labels (unique values in the label map)

        # mu_sum = scatter(prediction_reshape, inverse_indices, dim=2, reduce="sum") # [N, 4, 5]
        # muc = mu_sum/counts # [N, 4, 5]
        muc = scatter(prediction_reshape, inverse_indices, dim=2, reduce="mean")  # [N, 4, 5] per-instance mean embedding; scatter usage: https://blog.csdn.net/StarfishCu/article/details/108853080

        dis = torch.index_select(muc, 2, inverse_indices.view(inverse_indices.shape[-1]),
                                 out=None)  # [N, 4, 131072]
        dis = dis - prediction_reshape
        dis = torch.norm(dis, dim=1, keepdim=False, out=None, dtype=None)  # [N, 131072]
        dis = dis - delta_v
        dis = torch.clamp(dis, min=0.)  # [N, 131072]
        dis = torch.pow(dis, 2, out=None)

        L_var = scatter(dis, inverse_indices.view(inverse_indices.shape[-1]), dim=1, reduce="mean")  # [N, 3]
        L_var = torch.sum(L_var) / num_instances

        L_dist = torch.tensor(0, dtype=torch.float)
        for A in range(num_instances):
            for B in range(num_instances):
                if A != B:
                    dis = muc[:, :, A] - muc[:, :, B]
                    dis = torch.norm(dis, dim=1, keepdim=False, out=None, dtype=None)
                    dis = delta_d - dis
                    dis = torch.clamp(dis, min=0.)
                    dis = torch.pow(dis, 2, out=None)
                    L_dist = L_dist + dis
        L_dist = L_dist / (num_instances * (num_instances - 1))
        L_dist = L_dist.view([])
        total_loss = L_var + L_dist
        return total_loss


class cluster_loss(nn.Module):
    def __init__(self):
        super(cluster_loss, self).__init__()

    def forward(self, binary_logits, binary_labels,
                instance_logits, instance_labels, delta_v=np.float(0.5), delta_d=np.int(3)):
        # Binary Loss
        # Since the two classes (lane/background) are highly unbalanced, we apply bounded inverse class weighting
        output, counts = torch.unique(binary_labels, return_inverse=False, return_counts=True) # output: the unique label values; counts: how many pixels each value has. torch.unique reference: https://blog.csdn.net/t20134297/article/details/108235355
        counts = counts.float()
        inverse_weights = torch.div(1.0, torch.log(
            torch.add(torch.div(counts, torch.sum(counts)), torch.tensor(1.02, dtype=torch.float)))) # bounded inverse class weighting for the lane/background imbalance: w = 1 / ln(c + p), with c = 1.02

        binary_loss = torch.nn.CrossEntropyLoss(weight=inverse_weights)
        binary_segmenatation_loss = binary_loss(binary_logits, binary_labels) # class-weighted CrossEntropyLoss

        batch_size = instance_logits.shape[0]
        loss_set = []
        for dimen in range(batch_size):
            loss_set.append(cluster_loss_helper()) # one embedding-loss module per image in the batch

        instance_segmenatation_loss = torch.tensor(0.).cuda()

        for dimen in range(batch_size):
            instance_loss = loss_set[dimen] # the embedding-loss module for this image
            # prediction = instance_logits[dimen].view(1, instance_logits.shape[1], instance_logits.shape[2],
            #                                          instance_logits.shape[3])
            # correct_label = instance_labels[dimen].view(1, instance_labels.shape[1], instance_labels.shape[2])
            # instance_segmenatation_loss += instance_loss(prediction, correct_label, delta_v, delta_d)
            prediction = torch.unsqueeze(instance_logits[dimen], 0) # .cuda()
            correct_label = torch.unsqueeze(instance_labels[dimen], 0)# .cuda()

            delta_v = torch.tensor(delta_v)
            delta_d = torch.tensor(delta_d)

            delta_v = delta_v.cuda()
            delta_d = delta_d.cuda()

            instance_segmenatation_loss += instance_loss(prediction, correct_label, delta_v, delta_d)

        instance_segmenatation_loss = instance_segmenatation_loss / batch_size # divide by the batch size to get the per-image instance loss; the binary loss needs no such division because CrossEntropyLoss already averages over the batch by default
        return binary_segmenatation_loss, instance_segmenatation_loss
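For a quick sanity check of the loss interface, the snippet below is a minimal sketch of my own (not from the repository): it assumes the file is saved as Lanenet/cluster_loss3.py as in the traceback above, that torch_scatter is installed with CUDA support, and that a GPU is available. The tensor shapes follow the comments in the code.

import torch
from Lanenet.cluster_loss3 import cluster_loss

criterion = cluster_loss()
binary_logits = torch.randn(2, 2, 256, 512).cuda()           # [N, 2, H, W] raw scores from the segmentation branch
binary_labels = torch.randint(0, 2, (2, 256, 512)).cuda()     # [N, H, W], 0 = background, 1 = lane
instance_logits = torch.randn(2, 4, 256, 512).cuda()          # [N, 4, H, W], 4-dimensional pixel embeddings
instance_labels = torch.randint(0, 5, (2, 256, 512)).cuda()   # [N, H, W], 0 = background, 1..4 = lane ids
b_loss, i_loss = criterion(binary_logits, binary_labels, instance_logits, instance_labels)
print(b_loss.item(), i_loss.item())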

On my RTX 3060 Ti, training one epoch takes roughly 180 s.

  1. ECBM6040-Project/Train_aug.ipynb trains LaneNet with the data processed by data augmentation

The code is not reproduced here because it is the same as before. Just one note: be careful with how the output path is written.
Original code:

torch.save(LaneNet_model.state_dict(),

               f"/TUSIMPLE/Lanenet_output/lanenet_epoch_{epoch}_batch_{8}_AUG.model")

raises an error. Modify it as follows:

torch.save(LaneNet_model.state_dict(),
               f"TUSIMPLE/Lanenet_output/lanenet_epoch_{epoch}_batch_{8}_AUG.model")

Note: no clustering is performed during training. LaneNet outputs the binary segmentation loss and the instance embedding loss, and gradient descent is applied to them directly. I think this is because the number of lanes in each image is known at training time, whereas at test time it is not, so clustering is needed there. During clustering, mean shift is run first so that the cluster center moves towards the densest region of the feature points, which prevents outliers from being chosen as initial cluster centers.
On why the mean shift algorithm converges to dense regions of the feature points: https://blog.csdn.net/u014661698/article/details/84979979
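To build intuition for the mean-shift step, here is a tiny standalone toy example of my own (not from the repository) that clusters a handful of 2-D points with sklearn's MeanShift, the same estimator that process_instance_embedding uses later; each dense group keeps its own cluster while the isolated point ends up in a cluster of its own instead of dragging a center towards it:

import numpy as np
from sklearn.cluster import MeanShift

# two dense groups of points plus one isolated outlier
points = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                   [3.0, 3.0], [3.1, 3.0], [3.0, 3.1],
                   [10.0, 10.0]])
ms = MeanShift(bandwidth=1.5)
ms.fit(points)
print(ms.labels_)           # e.g. [0 0 0 1 1 1 2]: each dense group gets its own label
print(ms.cluster_centers_)  # the centers sit on the density peaks of the two groups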

Evaluating the model on the test set

  1. Use ECBM6040-Project/Notebook-experiment/Evaluation of Lanenet.ipynb to evaluate.
    The code walkthrough is as follows:
import json
import os.path as ops
import numpy as np
import torch
import cv2
import time
import os
import matplotlib.pylab as plt
import sys
from tqdm import tqdm
sys.path.append('..') # add the parent directory to the module search path (explained earlier)
from dataset.dataset_utils import TUSIMPLE
from Lanenet.model2 import Lanenet
from utils.evaluation import gray_to_rgb_emb, process_instance_embedding, video_to_clips

# Load the Model
model_path = '../TUSIMPLE/Lanenet_output/lanenet_epoch_39_batch_8.model'
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
LaneNet_model = Lanenet(2, 4)
LaneNet_model.load_state_dict(torch.load(model_path)) # load the trained weights
LaneNet_model.to(device)
print('success')

# Load the Test Dataset
# The test uses the TuSimple test set `clips` and `test_label.json`, writes the predicted result into `test_tasks_0627.json`, and evaluates it with the TuSimple script in `utils/lane.py`
# write lanes and run_time to `pred_result.json`
pred_json_path = '../TUSIMPLE/test_set/test_tasks_0627.json' # the TuSimple test-task template whose empty 'lanes' fields will be filled with the inference results
json_pred = [json.loads(line) for line in open(pred_json_path).readlines()] # json.loads() turns each JSON string into a dict (https://www.cnblogs.com/hjianhui/p/10387057.html); after this line json_pred is a list with 2782 entries

all_time_forward = [] 
all_time_clustering = []
for i, sample in enumerate(tqdm(json_pred)): # iterate over json_pred together with its index: i is the index, sample is the element
    h_samples = sample['h_samples'] # the sampled row (vertical) coordinates: [160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 610, 620, 630, 640, 650, 660, 670, 680, 690, 700, 710]
    lanes = sample['lanes'] # []
    run_time = sample['run_time'] # list of float. The running time for each frame in the clip. The unit is millisecond.
    raw_file = sample['raw_file']
    img_path = ops.join('../TUSIMPLE/test_set', raw_file)
    # read the image
    gt_img_org = cv2.imread(img_path, cv2.IMREAD_UNCHANGED) # [720, 1280, 3]
    org_shape = gt_img_org.shape
    gt_image = cv2.resize(gt_img_org, dsize=(512, 256), interpolation=cv2.INTER_LINEAR)
    gt_image = gt_image / 127.5 - 1.0
    gt_image = torch.tensor(gt_image, dtype=torch.float)
    gt_image = np.transpose(gt_image, (2, 0, 1)) # Tensor of shape (3, 256, 512): (channels, height, width)
    # Go through the network
    time_start=time.time()
    binary_final_logits, instance_embedding = LaneNet_model(gt_image.unsqueeze(0).cuda()) # returns the predicted binary segmentation logits and the instance embedding
    # binary_final_logits = binary_final_logits.cpu()
    # instance_embedding = instance_embedding.cpu()
    time_end=time.time()
    # Get the final embedding image
    binary_img = torch.argmax(binary_final_logits, dim=1).squeeze().cpu().numpy() # 1 where a lane is predicted, 0 elsewhere
    binary_img[0:50,:] = 0
    clu_start = time.time()
    rbg_emb, cluster_result = process_instance_embedding(instance_embedding.cpu(), binary_img, distance=1.5, lane_num=4)
    clu_end = time.time()
    cluster_result = cv2.resize(cluster_result, dsize=(org_shape[1], org_shape[0]), 
                                interpolation=cv2.INTER_NEAREST)
    elements = np.unique(cluster_result)
    for line_idx in elements:
        if line_idx == 0: # skip the background label
            continue
        else:
            mask = (cluster_result == line_idx) # mask for this lane, shape (720, 1280)
            select_mask = mask[h_samples] # rows sampled at h_samples, shape (56, 1280)
            row_result = []
            for row in range(len(h_samples)): # iterate over the rows given by h_samples
                col_indexes = np.nonzero(select_mask[row])[0] # columns in this row that belong to the lane
                if len(col_indexes) == 0:
                    row_result.append(-2) # no lane point in this row
                else:
                    row_result.append(int(col_indexes.min() + (col_indexes.max()-col_indexes.min())/2)) # record the horizontal midpoint of the lane pixels in this row
            json_pred[i]['lanes'].append(row_result) # before this iteration: {'h_samples': [160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 610, 620, 630, 640, 650, 660, 670, 680, 690, 700, 710], 'lanes': [], 'run_time': 1000, 'raw_file': 'clips/0530/1492626760788443246_0/20.jpg
            json_pred[i]['run_time'] = time_end-time_start # 运行后:{'h_samples': [160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 610, 620, 630, 640, 650, 660, 670, 680, 690, 700, 710], 'lanes': [[-2, -2, -2, -2, -2, -2, -2, -2, -2, 678, 686, 701, 710, 721, 732, 747, 756, 771, 783, 797, 815, 823, 838, 848, 865, 876, 890, 901, 916, 931, 941, 954, 966, 982, 994, 1009, 1021, 1037, 1052, 1062, 1078, 1090, 1103, 1113, 1128, 1138, 1153, 1168, 1180, 1196, 1208, 1221, 1231, 1246, 1253, 1262]], 'run_time': 0.5215342044830322, 'raw_file': 'clips/0530/1492626760788443246_0/20.jpg'}
            all_time_forward.append(time_end-time_start) # forward-pass time for one LaneNet inference
            all_time_clustering.append(clu_end-clu_start) # clustering time for one image

forward_avg = np.sum(all_time_forward[500:2000])/1500 # average over the middle 1500 measurements
cluster_avg = np.sum(all_time_clustering[500:2000])/1500

print('The Forward pass time for one image is: {}ms'.format(forward_avg*1000)) # 33.479649225870766ms
print('The Clustering time for one image is: {}ms'.format(cluster_avg*1000)) # 206.9698650042216ms
print('The total time for one image is: {}ms'.format((cluster_avg+forward_avg)*1000)) # 240.44951423009238ms

print('The speed for forward pass is: {}fps'.format(1/forward_avg)) # 29.868891195767635fps
print('The speed for cluster pass is: {}fps'.format(1/cluster_avg)) # 4.831621260320206fps

with open('../TUSIMPLE/pred.json', 'w') as f:
    for res in json_pred:
        json.dump(res, f)
        f.write('\n')

# Evaluation using TUSIMPLE script
from utils.lane import LaneEval

result = LaneEval.bench_one_submit('../TUSIMPLE/pred.json',
                         '../TUSIMPLE/test_set/test_label.json') # adjust this path to your own setup

print(result)
# [{"name":"Accuracy","value":0.9430533446304444,"order":"desc"},{"name":"FP","value":0.15022166307213028,"order":"asc"},{"name":"FN","value":0.07329858614905349,"order":"asc"}]
# 'order' gives the sort direction used by the TuSimple benchmark ranking: 'desc' means higher is better (Accuracy), 'asc' means lower is better (FP/FN)

# Evaluation for aug result
model_path = '../TUSIMPLE/Lanenet_output/lanenet_epoch_39_batch_8_AUG.model'
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
LaneNet_model = Lanenet(2, 4)
LaneNet_model.load_state_dict(torch.load(model_path, map_location=torch.device('cpu')))
LaneNet_model.to(device)
print('success')

pred_json_path = '../TUSIMPLE/test_set/test_tasks_0627.json'
json_pred = [json.loads(line) for line in open(pred_json_path).readlines()]

all_time_forward = []
all_time_clustering = []
for i, sample in enumerate(tqdm(json_pred)):
    h_samples = sample['h_samples']
    lanes = sample['lanes']
    run_time = sample['run_time']
    raw_file = sample['raw_file']
    img_path = ops.join('../TUSIMPLE/test_set', raw_file)
    # read the image
    gt_img_org = cv2.imread(img_path, cv2.IMREAD_UNCHANGED)
    org_shape = gt_img_org.shape
    gt_image = cv2.resize(gt_img_org, dsize=(512, 256), interpolation=cv2.INTER_LINEAR)
    gt_image = gt_image / 127.5 - 1.0
    gt_image = torch.tensor(gt_image, dtype=torch.float)
    gt_image = np.transpose(gt_image, (2, 0, 1))
    # Go through the network
    time_start=time.time()
    binary_final_logits, instance_embedding = LaneNet_model(gt_image.unsqueeze(0)) 
    # binary_final_logits = binary_final_logits.cpu()
    # instance_embedding = instance_embedding.cpu()
    time_end=time.time()
    # Get the final embedding image
    binary_img = torch.argmax(binary_final_logits, dim=1).squeeze().cpu().numpy()
    binary_img[0:50,:] = 0
    clu_start = time.time()
    rbg_emb, cluster_result = process_instance_embedding(instance_embedding.cpu(), binary_img,
                                                             distance=1.5, lane_num=4)
    clu_end = time.time()
    cluster_result = cv2.resize(cluster_result, dsize=(org_shape[1], org_shape[0]), 
                                interpolation=cv2.INTER_NEAREST)
    elements = np.unique(cluster_result)
    for line_idx in elements:
        if line_idx == 0:
            continue
        else:
            mask = (cluster_result == line_idx)
            select_mask = mask[h_samples]
            row_result = []
            for row in range(len(h_samples)):
                col_indexes = np.nonzero(select_mask[row])[0]
                if len(col_indexes) == 0:
                    row_result.append(-2)
                else:
                    row_result.append(int(col_indexes.min() + (col_indexes.max()-col_indexes.min())/2))
            json_pred[i]['lanes'].append(row_result)
            json_pred[i]['run_time'] = time_end-time_start
            all_time_forward.append(time_end-time_start)
            all_time_clustering.append(clu_end-clu_start)

with open('../TUSIMPLE/pred_aug.json', 'w') as f:
    for res in json_pred:
        json.dump(res, f)
        f.write('\n')

from utils.lane import LaneEval

result = LaneEval.bench_one_submit('../TUSIMPLE/pred_aug.json',
                         '../TUSIMPLE/test_set/test_label.json') 

print(result)

The local module utils.evaluation imported in the file above is walked through below:

import os.path as ops
import numpy as np
import torch
import cv2
import time
import tqdm
import os
from sklearn.cluster import MeanShift, estimate_bandwidth


def gray_to_rgb_emb(gray_img): # convert a gray-scale instance-label image into an RGB visualization
    """
    :param gray_img: torch tensor 256 x 512
    :return: numpy array 256 x 512
    """
    H, W = gray_img.shape
    element = torch.unique(gray_img).numpy()
    rbg_emb = np.zeros((H, W, 3))
    color = [[0, 0, 0], [255, 0, 0], [0, 255, 0], [0, 0, 255], [255, 215, 0], [0, 255, 255]]
    for i in range(len(element)):
        rbg_emb[gray_img == element[i]] = color[i]
    return rbg_emb/255 # return the RGB image normalized to [0, 1]


def process_instance_embedding(instance_embedding, binary_img, distance=1, lane_num=5):
    embedding = instance_embedding[0].detach().numpy().transpose(1, 2, 0) # detach(): drop gradient tracking (refs: https://blog.csdn.net/weixin_33913332/article/details/93300411, https://www.cnblogs.com/jiangkejie/p/9981707.html). embedding shape: (256, 512, 4); 4 is the embedding dimension
    cluster_result = np.zeros(binary_img.shape, dtype=np.int32) # (256, 512), holds the clustering result
    cluster_list = embedding[binary_img > 0] # only cluster pixels where the binary map is positive, e.g. shape (5228, 4)
    mean_shift = MeanShift(bandwidth=distance, bin_seeding=True, n_jobs=-1) # build a MeanShift clusterer (refs: https://blog.csdn.net/weixin_41636030/article/details/88793284, https://zhuanlan.zhihu.com/p/69119285)
    mean_shift.fit(cluster_list) # run the clustering. When single-stepping here in the PyCharm debugger it may raise
    # "OSError: [Errno 9] Bad file descriptor" from pydevd (_pydevd_bundle/pydevd_comm.py, line 290, in _on_run: r = self.sock.recv(1024)),
    # but this does not affect execution
    labels = mean_shift.labels_ # the cluster label of every foreground pixel, shape (5228,)

    cluster_result[binary_img > 0] = labels + 1 # shift labels by 1 so that 0 stays the background
    cluster_result[cluster_result > lane_num] = 0 # if clustering yields more lanes than lane_num, the extra clusters are set to 0 (discarded); lane_num = 4 seems to be because the TuSimple benchmark only cares about the current lane and the lanes to its left and right (left as an open question)
    for idx in np.unique(cluster_result):
        if len(cluster_result[cluster_result == idx]) < 15: # also discard clusters with fewer than 15 pixels
            cluster_result[cluster_result == idx] = 0

    H, W = binary_img.shape
    rbg_emb = np.zeros((H, W, 3))
    color = [[0, 0, 0], [255, 0, 0], [0, 255, 0], [0, 0, 255], [255, 215, 0], [0, 255, 255]]
    element = np.unique(cluster_result)
    for i in range(len(element)):
        rbg_emb[cluster_result == element[i]] = color[i] # paint each instance with its own color

    return rbg_emb / 255, cluster_result # rbg_emb / 255: the color-coded RGB image normalized to [0, 1], shape (256, 512, 3); cluster_result: the clustering result, shape (256, 512), where the lanes are labeled 1, 2, 3, 4, ... and 0 is the background


def video_to_clips(video_name):
    test_video_dir = ops.split(video_name)[0]
    outimg_dir = ops.join(test_video_dir, 'clips')
    if ops.exists(outimg_dir):
        print('Data already exist in {}'.format(outimg_dir))
        return
    if not ops.exists(outimg_dir):
        os.makedirs(outimg_dir)
    video_cap = cv2.VideoCapture(video_name)
    frame_count = 0
    all_frames = []

    while (True):
        ret, frame = video_cap.read()
        if ret is False:
            break
        all_frames.append(frame)
        frame_count = frame_count + 1

    for i, frame in enumerate(all_frames):
        out_frame_name = '{:s}.png'.format('{:d}'.format(i + 1).zfill(6))
        out_frame_path = ops.join(outimg_dir, out_frame_name)
        cv2.imwrite(out_frame_path, frame)
    print('finish process and save in {}'.format(outimg_dir))

The local module imported above via from utils.lane import LaneEval is walked through below:

import numpy as np
from sklearn.linear_model import LinearRegression
import ujson as json


class LaneEval(object):
    lr = LinearRegression() # a LinearRegression instance used to estimate each lane's angle
    pixel_thresh = 20 # base pixel-distance threshold for a point to count as correct
    pt_thresh = 0.85 # per-lane accuracy threshold for a predicted lane to match a ground-truth lane

    @staticmethod
    def get_angle(xs, y_samples): # estimate the lane's inclination by fitting x as a linear function of y
        xs, ys = xs[xs >= 0], y_samples[xs >= 0] # keep only valid lane points (x >= 0)
        if len(xs) > 1:
            LaneEval.lr.fit(ys[:, None], xs) # fit x = k*y + b
            k = LaneEval.lr.coef_[0] # slope dx/dy
            theta = np.arctan(k)
        else:
            theta = 0
        return theta # angle measured from the vertical image axis

    @staticmethod
    def line_accuracy(pred, gt, thresh): # fraction of ground-truth points matched by the prediction
        pred = np.array([p if p >= 0 else -100 for p in pred])
        gt = np.array([g if g >= 0 else -100 for g in gt])
        return np.sum(np.where(np.abs(pred - gt) < thresh, 1., 0.)) / len(gt) # a point counts as correct when |pred_x - gt_x| is below the threshold

    @staticmethod
    def bench(pred, gt, y_samples, running_time):
        if any(len(p) != len(y_samples) for p in pred): 
            raise Exception('Format of lanes error.')
        if running_time > 200 or len(gt) + 2 < len(pred): # if inference takes more than 200 ms, or more than two extra lanes are predicted beyond the ground-truth count, score the frame as a complete miss
            return 0., 0., 1.
        angles = [LaneEval.get_angle(np.array(x_gts), np.array(y_samples)) for x_gts in gt] # gt: all ground-truth lanes of one image; x_gts: the x coordinates of one ground-truth lane; y_samples: the sampled row coordinates; angles: the inclination of each lane
        threshs = [LaneEval.pixel_thresh / np.cos(angle) for angle in angles] # the allowed x offset depends on the lane angle: the more a lane tilts away from vertical, the larger the tolerance, because the same perpendicular distance maps to a larger horizontal distance
        line_accs = []
        fp, fn = 0., 0.
        matched = 0.
        for x_gts, thresh in zip(gt, threshs): # x_gts: one ground-truth lane; thresh: that lane's specific tolerance
            accs = [LaneEval.line_accuracy(np.array(x_preds), np.array(x_gts), thresh) for x_preds in pred] # x_preds: one predicted lane; compute how many of its points match this ground-truth lane (a point matches when the offset is below the threshold)
            max_acc = np.max(accs) if len(accs) > 0 else 0. # since we do not know which prediction corresponds to x_gts, take the prediction that matches it best
            if max_acc < LaneEval.pt_thresh: # if no prediction matches x_gts well enough, this ground-truth lane counts as missed
                fn += 1 # counts the missed ground-truth lanes (FN = missed gt lanes / all gt lanes)
            else:
                matched += 1 # matched
            line_accs.append(max_acc)
        fp = len(pred) - matched # FP = wrongly predicted lanes / predicted lanes; pred are the predicted lanes, matched the ones that hit a ground-truth lane
        if len(gt) > 4 and fn > 0: # if an image has more than 4 ground-truth lanes and something was missed, forgive one miss (some ground truths contain more than 4 lanes while the model assumes at most 4)
            fn -= 1
        s = sum(line_accs)
        if len(gt) > 4: # same reason as above
            s -= min(line_accs)
        return s / max(min(4.0, len(gt)), 1.), fp / len(pred) if len(pred) > 0 else 0., fn / max(min(len(gt), 4.) , 1.) # implements the accuracy, FP and FN formulas, normalized by at most 4 lanes

    @staticmethod # static method: can be called without instantiating the class. https://www.runoob.com/python/python-func-staticmethod.html
    def bench_one_submit(pred_file, gt_file):
        try:
            json_pred = [json.loads(line) for line in open(pred_file).readlines()] # load every prediction record
        except BaseException as e:
            raise Exception('Fail to load json file of the prediction.')
        json_gt = [json.loads(line) for line in open(gt_file).readlines()] # load every ground-truth record
        if len(json_gt) != len(json_pred): # 2782
            raise Exception('We do not get the predictions of all the test tasks')
        gts = {l['raw_file']: l for l in json_gt} #json_gt:[{'lanes': [[-2, -2, -2, -2, -2, -2, -2, -2, -2, -2, 648, 636, 626, 615, 605, 595, 585, 575, 565, 554, 545, 536, 526, 517, 508, 498, 489, 480, 470, 461, 452, 442, 433, 424, 414, 405, 396, 386, 377, 368, 359, 349, 340, 331, 321, 312, 303, 293, 284, 275, 265, 256, 247, 237, 228, 219], [-2, -2, -2, -2, -2, -2, -2, -2, -2, -2, 681, 692, 704, 716, 728, 741, 754, 768, 781, 794, 807, 820, 834, 847, 860, 873, 886, 900, 913, 926, 939, 952, 966, 979, 992, 1005, 1018, 1032, 1045, 1058, 1071, 1084, 1098, 1111, 1124, 1137, 1150, 1164, 1177, 1190, 1203, 1216, 1230, 1243, 1256, 1269], [-2, -2, -2, -2, -2, -2, -2, -2, -2, -2, 713, 746, 778, 811, 845, 880, 916, 951, 986, 1022, 1057, 1092, 1128, 1163, 1198, 1234, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2], [-2, -2, -2, -2, -2, -2, -2, -2, -2, -2, 754, 806, 858, 909, 961, 1013, 1064, 1114, 1164, 1213, 1263, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2]], 'h_samples': [160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 610, 620, 630, 640, 650, 660, 670, 680, 690, 700, 710], 'raw_file': 'clips/0530/1492626760788443246_0/20.jpg'} ……共计2782] gts:{'clips/0530/1492626760788443246_0/20.jpg': {'lanes': [[-2, -2, -2, -2, -2, -2, -2, -2, -2, -2, 648, 636, 626, 615, 605, 595, 585, 575, 565, 554, 545, 536, 526, 517, 508, 498, 489, 480, 470, 461, 452, 442, 433, 424, 414, 405, 396, 386, 377, 368, 359, 349, 340, 331, 321, 312, 303, 293, 284, 275, 265, 256, 247, 237, 228, 219], [-2, -2, -2, -2, -2, -2, -2, -2, -2, -2, 681, 692, 704, 716, 728, 741, 754, 768, 781, 794, 807, 820, 834, 847, 860, 873, 886, 900, 913, 926, 939, 952, 966, 979, 992, 1005, 1018, 1032, 1045, 1058, 1071, 1084, 1098, 1111, 1124, 1137, 1150, 1164, 1177, 1190, 1203, 1216, 1230, 1243, 1256, 1269], [-2, -2, -2, -2, -2, -2, -2, -2, -2, -2, 713, 746, 778, 811, 845, 880, 916, 951, 986, 1022, 1057, 1092, 1128, 1163, 1198, 1234, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2], [-2, -2, -2, -2, -2, -2, -2, -2, -2, -2, 754, 806, 858, 909, 961, 1013, 1064, 1114, 1164, 1213, 1263, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2]], 'h_samples': [160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 610, 620, 630, 640, 650, 660, 670, 680, 690, 700, 710], 'raw_file': 'clips/0530/1492626760788443246_0/20.jpg'}, ……共计2782}。
        accuracy, fp, fn = 0., 0., 0.
        for pred in json_pred: # pred: the prediction record of one image
            if 'raw_file' not in pred or 'lanes' not in pred: #or 'run_time' not in pred:
                raise Exception('raw_file or lanes or run_time not in some predictions.')
            raw_file = pred['raw_file']
            pred_lanes = pred['lanes']
            # run_time = pred['run_time']
            run_time = 100
            if raw_file not in gts:
                raise Exception('Some raw_file from your predictions do not exist in the test tasks.')
            gt = gts[raw_file] # the ground-truth record that matches this prediction
            gt_lanes = gt['lanes']
            y_samples = gt['h_samples']
            try:
                a, p, n = LaneEval.bench(pred_lanes, gt_lanes, y_samples, run_time) # 静态方法。不实例化就可调用。传入预测的车道线pred_lanes:[[-2, -2, -2, -2, -2, -2, -2, -2, -2, 677, 684, 701, 708, 722, 732, 746, 756, 771, 782, 797, 813, 825, 838, 848, 865, 875, 888, 902, 916, 930, 941, 956, 966, 982, 995, 1009, 1019, 1036, 1051, 1065, 1081, 1092, 1106, 1117, 1132, 1143, 1158, 1173, 1187, 1202, 1212, 1227, 1236, 1250, 1257, 1262], [-2, -2, -2, -2, -2, -2, -2, -2, -2, 658, 653, 638, 629, 617, 609, 598, 589, 579, 571, 559, 548, 541, 529, 522, 512, 504, 493, 484, 474, 464, 456, 444, 437, 426, 418, 408, 401, 390, 378, 371, 361, 353, 343, 335, 323, 315, 305, 293, 286, 273, 266, 254, 246, 236, 227, 216], [-2, -2, -2, -2, -2, -2, -2, -2, -2, 701, 722, 762, 789, 823, 846, 883, 909, 946, 974, 1015, 1053, 1076, 1116, 1145, 1187, 1216, 1251, 1266, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2], [-2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, 910, 939, 1001, 1040, 1106, 1161, 1209, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2]];真实车道线:[[-2, -2, -2, -2, -2, -2, -2, -2, -2, -2, 648, 636, 626, 615, 605, 595, 585, 575, 565, 554, 545, 536, 526, 517, 508, 498, 489, 480, 470, 461, 452, 442, 433, 424, 414, 405, 396, 386, 377, 368, 359, 349, 340, 331, 321, 312, 303, 293, 284, 275, 265, 256, 247, 237, 228, 219], [-2, -2, -2, -2, -2, -2, -2, -2, -2, -2, 681, 692, 704, 716, 728, 741, 754, 768, 781, 794, 807, 820, 834, 847, 860, 873, 886, 900, 913, 926, 939, 952, 966, 979, 992, 1005, 1018, 1032, 1045, 1058, 1071, 1084, 1098, 1111, 1124, 1137, 1150, 1164, 1177, 1190, 1203, 1216, 1230, 1243, 1256, 1269], [-2, -2, -2, -2, -2, -2, -2, -2, -2, -2, 713, 746, 778, 811, 845, 880, 916, 951, 986, 1022, 1057, 1092, 1128, 1163, 1198, 1234, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2], [-2, -2, -2, -2, -2, -2, -2, -2, -2, -2, 754, 806, 858, 909, 961, 1013, 1064, 1114, 1164, 1213, 1263, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2]];标注的y点y_samples:[160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 610, 620, 630, 640, 650, 660, 670, 680, 690, 700, 710];run_time:lanenet模型模型预测所用时间
            except BaseException as e:
                raise Exception('Format of lanes error.')
            accuracy += a
            fp += p # False Positive: FP = wrongly predicted lanes / number of predicted lanes

            fn += n # False Negative: FN = missed ground-truth lanes / number of ground-truth lanes
        num = len(gts)
        # the first return parameter is the default ranking parameter
        return json.dumps([
            {'name': 'Accuracy', 'value': accuracy / num, 'order': 'desc'},
            {'name': 'FP', 'value': fp / num, 'order': 'asc'},
            {'name': 'FN', 'value': fn / num, 'order': 'asc'}
        ])


if __name__ == '__main__':
    import sys
    try:
        if len(sys.argv) != 3:
            raise Exception('Invalid input arguments')
        print(LaneEval.bench_one_submit(sys.argv[1], sys.argv[2]))
    except Exception as e:
        print(e)
        sys.exit(str(e))  # Exception objects have no .message attribute in Python 3

As for the LinearRegression example in sklearn, many online tutorials do not explain it clearly; the example in the source docstring is the most direct:

    Examples
    --------
    >>> import numpy as np
    >>> from sklearn.linear_model import LinearRegression
    >>> X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
    >>> # y = 1 * x_0 + 2 * x_1 + 3
    >>> y = np.dot(X, np.array([1, 2])) + 3
    >>> reg = LinearRegression().fit(X, y)  # fit
    >>> reg.score(X, y) # goodness of fit (R^2)
    1.0
    >>> reg.coef_ # fitted coefficients
    array([1., 2.])
    >>> reg.intercept_ # intercept term
    3.0000...
    >>> reg.predict(np.array([[3, 5]])) # predict
    array([16.])
    """
  1. The results are as follows:

基于实例分割方法的端到端车道线检测 论文+代码解读_第17张图片

基于实例分割方法的端到端车道线检测 论文+代码解读_第18张图片
When testing with the model trained on augmented data, an error occurred:

Traceback (most recent call last):
  File "/home/wqf/ECBM6040-Project/Notebook-experiment/Evaluation of Lanenet_AUG.py", line 55, in <module>
    binary_final_logits, instance_embedding = LaneNet_model(gt_image.unsqueeze(0))  #
  File "/home/wqf/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/wqf/ECBM6040-Project/Lanenet/model2.py", line 516, in forward
    x = self.initial_block(x)
  File "/home/wqf/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/wqf/ECBM6040-Project/Lanenet/model2.py", line 59, in forward
    main = self.main_branch(x)
  File "/home/wqf/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/wqf/anaconda3/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 423, in forward
    return self._conv_forward(input, self.weight)
  File "/home/wqf/anaconda3/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 419, in _conv_forward
    return F.conv2d(input, weight, self.bias, self.stride,
RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same

Process finished with exit code 1

This message is again a CPU vs. GPU device problem.

Fixing the bug from the top down, in File "/home/wqf/ECBM6040-Project/Notebook-experiment/Evaluation of Lanenet_AUG.py", line 55:

binary_final_logits, instance_embedding = LaneNet_model(gt_image.unsqueeze(0))  #

Make sure both sides of the assignment are on the GPU. Checking shows that the model weights are on the GPU while gt_image is still on the CPU, so change that line to:

binary_final_logits, instance_embedding = LaneNet_model(gt_image.unsqueeze(0).cuda())

After this change the script runs successfully.

Generating GIF animations

With the file ECBM6040-Project/Notebook-experiment/Generate Video and show the result.ipynb you can generate GIF animations; they are written to ECBM6040-Project/TUSIMPLE/gif_output.

# _*_ coding utf-8 _*_
# Team:
# Author: wqf
# Created: 2021/1/27 17:28
# File: Generate Video and show the result.py
# IDE: PyCharm
import os.path as ops
import numpy as np
import torch
import cv2
import time
import os
import matplotlib.pylab as plt
import sys
from tqdm import tqdm
import imageio

sys.path.append('..')
from dataset.dataset_utils import TUSIMPLE
from Lanenet.model2 import Lanenet
from utils.evaluation import gray_to_rgb_emb, process_instance_embedding, video_to_clips

# Load the Model
model_path = '../TUSIMPLE/Lanenet_output/lanenet_epoch_39_batch_8.model'
LaneNet_model = Lanenet(2, 4)
LaneNet_model.load_state_dict(torch.load(model_path, map_location=torch.device('cpu')))

# Load the Test Dataset
root = '../TUSIMPLE/txt_for_local'
train_set = TUSIMPLE(root=root, flag='train')
valid_set = TUSIMPLE(root=root, flag='valid')
test_set = TUSIMPLE(root=root, flag='test')

print('train_set length {}'.format(len(train_set)))
print('valid_set length {}'.format(len(valid_set)))
print('test_set length {}'.format(len(test_set)))

gt, bgt, igt = test_set[20]
print('image type {}'.format(type(gt)))
print('image size {} \n'.format(gt.size()))

print('gt binary image type {}'.format(type(bgt)))
print('gt binary image size {}'.format(bgt.size()))
print('items in gt binary image {} \n'.format(torch.unique(bgt)))

print('gt instance type {}'.format(type(igt)))
print('gt instance size {}'.format(igt.size()))
print('items in gt instance {} \n'.format(torch.unique(igt)))

data_loader_test = torch.utils.data.DataLoader(test_set, batch_size=1, shuffle=False,
                                               num_workers=0)

# Get one output from the network
binary_final_logits, instance_embedding = LaneNet_model(gt.unsqueeze(0))
print('binary_final_logits shape: {}'.format(binary_final_logits.shape))
print('instance_embedding shape: {}'.format(instance_embedding.shape))

# Show one result on Test Dataset
# For ground truth
gt_image_show = ((gt.numpy() + 1) * 127.5).astype(int)  # [3, 256, 512]

plt.figure(figsize=(20, 10))
ax1 = plt.subplot(221)
image_show = gt_image_show.transpose(1, 2, 0)
image_show = image_show[..., ::-1]
plt.imshow(image_show)

ax1 = plt.subplot(222)
plt.imshow(bgt, cmap='gray')

ax1 = plt.subplot(223)
rbg_emb = gray_to_rgb_emb(igt)
plt.imshow(rbg_emb)

ax1 = plt.subplot(224)
a = 0.7
plt.imshow(a * image_show / 255 + (1 - a) * rbg_emb)

# For binary_final_logits
binary_img = torch.argmax(binary_final_logits, dim=1).squeeze().numpy()  # binary_img: 1 where a lane is predicted, 0 elsewhere
# plt.imshow(binary_img, cmap='gray')

# For instance_embedding
rbg_emb, cluster_result = process_instance_embedding(instance_embedding, binary_img,
                                                     distance=1, lane_num=5)

# Show result
plt.figure(figsize=(20, 10))
ax1 = plt.subplot(221)
plt.imshow(image_show)
plt.title('Original image')

ax1 = plt.subplot(222)
plt.imshow(binary_img, cmap='gray')
plt.title('Binary lane segmentation')

ax1 = plt.subplot(223)
plt.imshow(rbg_emb)
plt.title('Pixel embeddings')

ax1 = plt.subplot(224)
a = 0.7
plt.imshow(a * image_show / 255 + (1 - a) * rbg_emb)
plt.title('Final result')


# Generate videos
### Read test_video.mp4 and split it into clips
# video_name = '../TUSIMPLE/test_video/test_video.mp4'
# video_to_clips(video_name)

# Read clips into dataset
# test_clips_root = '/Users/smiffy/Documents/GitHub/TUSIMPLE/test_video/clips'
def clips_to_gif(test_clips_root, git_root): # test_clips_root: source clip directory, e.g. '../TUSIMPLE/test_clips/1494452927854312215'; git_root: output gif path, e.g. '../TUSIMPLE/gif_output/1494452927854312215.gif'
    img_paths = []
    for img_name in os.listdir(test_clips_root):
        img_paths.append(ops.join(test_clips_root, img_name))
    img_paths.sort() # sort the frame paths (built-in list method)
    gif_frames = []
    for i, img_name in enumerate(img_paths):
        gt_img_org = cv2.imread(img_name, cv2.IMREAD_UNCHANGED)
        org_shape = gt_img_org.shape # (720, 1280, 3)
        gt_image = cv2.resize(gt_img_org, dsize=(512, 256),
                              interpolation=cv2.INTER_LINEAR)
        gt_image = gt_image / 127.5 - 1.0
        gt_image = torch.tensor(gt_image, dtype=torch.float)
        gt_image = np.transpose(gt_image, (2, 0, 1)) # (3, 256, 512)

        binary_final_logits, instance_embedding = LaneNet_model(gt_image.unsqueeze(0))
        binary_img = torch.argmax(binary_final_logits, dim=1).squeeze().numpy()
        binary_img[0:65, :] = 0
        rbg_emb, cluster_result = process_instance_embedding(instance_embedding,
                                                             binary_img,
                                                             distance=1.5, lane_num=4)

        rbg_emb = cv2.resize(rbg_emb, dsize=(org_shape[1], org_shape[0]),
                             interpolation=cv2.INTER_LINEAR)
        a = 0.6
        frame = a * gt_img_org[..., ::-1] / 255 + rbg_emb * (1 - a)
        frame = np.rint(frame * 255)
        frame = frame.astype(np.uint8)
        gif_frames.append(frame)
    imageio.mimsave(git_root, gif_frames, fps=5) # git_root: output gif filename; gif_frames: the frames to write (here 20 frames per clip); fps: frames per second of the resulting gif


clips_root = '../TUSIMPLE/test_clips'
gif_dir = '../TUSIMPLE/gif_output'
if not os.path.exists(gif_dir):
    os.makedirs(gif_dir)
for dir_name in os.listdir(clips_root):
    if dir_name == '.DS_Store':
        continue
    print('Process the clip {} \n'.format(dir_name))
    test_clips_root = ops.join(clips_root, dir_name)
    git_root = ops.join(gif_dir, dir_name) + '.gif'
    clips_to_gif(test_clips_root, git_root)

Phew, that wraps it up.
