MOT (Beyond Pixels: Leveraging Geometry and Shape Cues for Online Multi-Object Tracking) demo walkthrough

Variables:

  • all_wts: weights for the 3D-3D, 3D-2D, 2D-2D and viewpoint terms
    global_id: running vehicle count (next free track ID)
    K_all: camera intrinsics for all sequences
    angleMap: viewpoint-angle lookup map
    detectionsQ_dnos: detection-box IDs of the current (query) frame
    detectionsT_dnos: detection-box IDs of the next (train) frame
    result_for_kitti_eval: estimated results (for KITTI evaluation)
    total_detecions: total number of detections
    params_carCuboid(avgCar_Sz, sz_ub, sz_lb): car cuboid parameters
    sz_ub: upper bound on the car size
    sz_lb: lower bound on the car size
    avgCar_Sz[l, h, w]: average car size: length, height, width
    params_2D3D      = struct( K,  n, h);
    K: camera intrinsics
    n[0,1,0]: road plane normal
    h: camera height above the ground
    detection       = struct( 'dno',      -1.0,    ...  
                      'bbox',       zeros(4,1),   ...
                      'sigma_2D',    zeros(3,3),   ...
                      'yaw',        0.0,           ... 
                      'bvolume',    [],            ... 
                      'bvolume_proj',   [],        ...   
                      'k',              [],        ... 
                      'origin',      zeros(3,1),   ... 
                      'start_frame',    -1.0,      ... 
                      'end_frame',      -1.0,      ... 
                      'sigma_3D',       zeros(4,4)    ...                               
     );
    dno: vehicle (track) ID
    bbox: vehicle detection bounding box
    sigma_2D: 2D noise/uncertainty matrix
    yaw: yaw angle
    bvolume: 3D bounding volume, i.e. the convex hull of the points obtained by applying the yaw and L/B/H errors to the average car cuboid
    bvolume_proj: projection of the bounding volume into the image
    k: convex-hull indices of the bounding volume
    origin: vehicle origin (3D position)
    start_frame: first frame of the track
    end_frame: last frame of the track
    sigma_3D: 3D noise/uncertainty matrix (dimension and yaw uncertainty used to inflate the cuboid)
    num_matches: number of matches
    pose: camera poses from monocular ORB-SLAM [translation, rotation, yaw]
    detections: per-frame vehicle detection matrices
    features_2D2D: per-frame image feature matrices
    start_fno: first frame number
    end_fno: last frame number
    scoreMatrices_for_seq(end_fno, 1): score matrices for every frame of the sequence
    first_time: whether this is the first frame
    num_detectionsQ: number of vehicles detected in the current frame
    num_detectionsT: number of vehicles detected in the next frame
    detectionsQ: detection structs (bounding boxes) of the current frame
    detectionsT: detection structs (bounding boxes) of the next frame
    motion[1*4]: camera motion between the two frames

     

Functions

  • getCanonicalCuboid(avg_sz): builds the canonical 3D cuboid (8 vertices) for the average car size (a minimal sketch follows this list)
    getOffsetBasedOnYaw([car length, car width], yaw): computes the offset that must be applied to X, the centre of the bottom edge of the bounding box, to reach the actual centre of the car; this offset is applied only along Z
    generateScoreMatrices(detectionsQ, detectionsT): builds the score matrices
    get3D3DMatchScore(detectionsT{j}.bvolume, detectionsQ{i}.bvolume): overlap between the 3D bounding volumes
    get3D2DMatchScore(detectionsT{j}.bbox, detectionsQ{i}.bvolume_proj): overlap between a 2D box and a projected 3D bounding volume
    generate2D2DScoreMatrix(features_2D2D{i}, features_2D2D{i+1}): appearance-feature similarity matrix between bounding boxes
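
The body of getCanonicalCuboid is not shown in this demo; a minimal sketch of what it is assumed to do (return an 8x3 matrix of cuboid vertices of size avg_sz, centred at the origin, so that later code can translate it to the car origin B1Q) is:

function cuboid = getCanonicalCuboid(avg_sz)
    % avg_sz = [l; h; w]: average car length, height, width.
    % Returns the 8 cuboid vertices (8x3) centred at the origin;
    % the actual vertex ordering in the original repo may differ.
    l = avg_sz(1); h = avg_sz(2); w = avg_sz(3);
    [sx, sy, sz] = ndgrid([-0.5 0.5], [-0.5 0.5], [-0.5 0.5]);
    cuboid = [sx(:)*l, sy(:)*h, sz(:)*w];
end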

     

Initialise the weights, car parameters and structs, and load the pose matrix, bounding-box matrix, feature matrix and angle-map matrix:

addpath('../third_party');
         % 33  32  22  vp
all_wts = [0.6 0.4 0.2 0.0];
       
dcolors = distinguishable_colors(800);

for ww = 1:size(all_wts,1)
    
    c33 = all_wts(ww,1);
    c32 = all_wts(ww,2);
    c22 = all_wts(ww,3);
    cvp = all_wts(ww,4);       
            
    % road plane normal and camera height
    n =  [0; 1; 0];
    h =  1.72;
    
    % average car size [l; h; w]
    avgCar_Sz = [4.3; 2; 2];
    sz_ub     = [34; 31.5; 31.5];
    sz_lb     = [-34; -31.5; -31.5];
    
    K_all     = load('../Data/calib/calib_all_test.txt');
    
    
    angleMap = containers.Map([ 90:270 90:-1:0 360:-1:270], [0:-1:-180 0:90 90:180]);        
    
    hungarian_association = false;      
    detectionsQ_dnos = [];
    detectionsT_dnos = [];
    
    for s = [2]        
        
        seqNo = s;                                        
    
        result_for_kitti_eval = [];
        
        total_detecions = 0;
        
        %% SETUP PARAMETERS

        image_path    = sprintf('../Data/images/image_02/test/%04d',seqNo);      
        pose_path     = sprintf('../Data/ORBSLAM_pose/test/%04d/KITTITrajectoryComplete_new',seqNo);              
        K             = [ K_all(seqNo+1,1:3); K_all(seqNo+1,5:7); K_all(seqNo+1,9:11)];
                
        % define the car cuboid parameters, camera parameters and the detection struct
        params_carCuboid = struct('avg_Sz',         avgCar_Sz,    ...
                                  'sz_ub',          sz_ub,        ...
                                  'sz_lb',          sz_lb         ...
                                  );
        
        params_2D3D      = struct('K',              K,            ...
                                  'n',              n,            ...
                                  'h',              h             ...
                                  );
        
        detection       = struct( 'dno',           -1.0,         ...       
                                 'bbox',           zeros(4,1),   ...
                                 'sigma_2D',       zeros(3,3),   ...       
                                 'yaw',            0.0,          ...
                                 'bvolume',        [],           ...       
                                 'bvolume_proj',   [],           ...       
                                 'k',              [],           ...       
                                 'origin',         zeros(3,1),   ...
                                 'start_frame',    -1.0,         ...
                                 'end_frame',      -1.0,         ...
                                 'sigma_3D',       zeros(4,4)    ...
                                );
        
        
        
        num_matches = 0;
        num_false_positive = 0;
                
        
        %% SETUP STRUCTS
        global_id = 1;
        
        %% LOAD DATA:
        
        % load pose (Monocular ORBSLAM)
        pose = load(pose_path);      
                                
        %load all detections cell array - variable name 'detections'
        load(sprintf('../Data/RRC_Detections_mat/test/%04d/detections_rrc_test_%02d.mat', seqNo,seqNo));                
        
        % features_2D2D
        load(sprintf('../Data/Features2D_mat/test/%04d/features_2_rrc_test_%02d.mat', seqNo,seqNo));               
        
        start_fno 		 = 1;
        end_fno 		 = 35;%size(pose,1); 
        
        scoreMatrices_for_seq = cell(end_fno, 1);
        
        %% RUN SCORING
        
        first_time = 1;
        tempDQ = [];
        
        figure(1);
        
        for i = 1:end_fno - 1
            
            disp(sprintf('Seq<%02d> | frame : %04d', seqNo, i));
            
            if(first_time)
                num_detectionsQ = size(detections{i},1);
            end
            
            num_detectionsT = size(detections{i+1},1);

If this is the first frame of the sequence, fill a detection struct for every vehicle with its bounding box, its ID, its yaw angle and its 3D uncertainty matrix; because it is the first frame, the global vehicle ID is incremented for each vehicle added:

            if(first_time)
                
                % load the detections into the query struct cell array: x1 y1 x2 y2 confidence ID
                for j = 1:num_detectionsQ
                    
                    detectionsQ{j} = detection; % empty detection struct
                    
                    detectionsQ{j}.dno          = global_id;
                    detectionsQ{j}.bbox         = detections{i}(j,1:4);
                    detectionsQ{j}.yaw          = deg2rad(-90);
                    detectionsQ{j}.sigma_3D     = [1.3 0 0 0; 0 1.1 0 0; 0 0 1.1 0; 0 0 0 deg2rad(0)];
                    
                    global_id = global_id + 1;  % increment the global ID
                end
            end

For the next frame (this loop runs on every iteration, not only after the first frame), set every detection's ID to -1 (it is assigned during association), copy in its bounding box from detections{i+1}, set the yaw to -90° and fill in the 3D uncertainty matrix:

            for j = 1:num_detectionsT
                
                detectionsT{j} = detection; % empty detection struct
                
                detectionsT{j}.dno          = -1;     % will be given after association
                detectionsT{j}.bbox         = detections{i+1}(j,1:4);
                detectionsT{j}.yaw          = deg2rad(-90);
                detectionsT{j}.sigma_3D     = [1.3 0 0 0; 0 1.1 0 0; 0 0 1.1 0; 0 0 0 deg2rad(0)];
                % 1.6, 1.1, 1.1 : 0.8 0.1 0.1
            end

Compute the camera translation between the two frames from the ORB-SLAM poses, set the rotation component θ to 0, and then scale the motion: a monocular ORB-SLAM trajectory is only defined up to scale, so the factor 1.72/44 (camera height over the trajectory scale) brings it to approximately metric units:

            motion = [-(pose(i+1,11)-pose(i,11));
                      -(pose(i+1,12)-pose(i,12));
                      -(pose(i+1,13)-pose(i,13));
                      deg2rad(0)];
            
            % scale the motion
            motion(1:3,1) =  motion(1:3,1).*(1.72/44);

Pass the structs of the two frames, the car parameters, the camera parameters and the camera motion to propagateDetections, which fills in the remaining struct fields:

            [detectionsQ, detectionsT] = propagateDetections(detectionsQ,       ...
                                                             detectionsT,       ...
                                                             params_2D3D,       ...
                                                             params_carCuboid,  ...
                                                             motion             ...
                                                             );

Inside propagateDetections, the bottom-centre of each 2D bounding box is back-projected onto the road plane to get a 3D point, using X = \pi_{G}^{-1}(x) = \frac{h\,K^{-1}x}{n^{T}K^{-1}x}.
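
This is the standard ground-plane back-projection: a pixel x back-projects to the ray X = \lambda K^{-1}x, and requiring the point to lie on the road plane, n^{T}X = h, fixes the depth:

\lambda = \frac{h}{n^{T}K^{-1}x} \quad\Rightarrow\quad X = \pi_{G}^{-1}(x) = \frac{h\,K^{-1}x}{n^{T}K^{-1}x}

which is exactly what the line B1Q = (params_2D3D.h * inv(params_2D3D.K) * b1Q) / (params_2D3D.n' * inv(params_2D3D.K) * b1Q) computes below.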

for i = 1:length(detectionsQ)                        
        
        b1Q = [];
        B1Q = [];
        bvolumeQ1 = [];
        
        % check if the detection is first time i.e. we have to start with
        % the canonical cuboid. Else we already have the cuboid.
        
        if(length(detectionsQ{i}.bvolume) == 0 )
            % get the canonical (average-size) cuboid
            canonicalCuboid = getCanonicalCuboid( params_carCuboid.avg_Sz);   

            % bottom-centre of the bounding box
            b1Q = [detectionsQ{i}.bbox(1) + (detectionsQ{i}.bbox(3) - detectionsQ{i}.bbox(1))/2 ;
                  detectionsQ{i}.bbox(4);
                  1.0];

            % back-project to 3D (onto the road plane)
            B1Q = (params_2D3D.h * inv(params_2D3D.K) * b1Q) / (params_2D3D.n' * inv(params_2D3D.K) * b1Q);

            % apply the offset, which is a function of the yaw, to get the car origin (remember this is an approximation)
            offset_Z = getOffsetBasedOnYaw([params_carCuboid.avg_Sz(1); params_carCuboid.avg_Sz(3)], detectionsQ{i}.yaw);        

            % car origin in the current (query) frame
            B1Q = B1Q + [0; 0; offset_Z];

            % translate the canonical cuboid to the car origin
            canonicalCuboid = canonicalCuboid + repmat(B1Q', 8,1);

            % bounding volume in the current frame
            [bvolumeQ1, k1] = getBoundingVolume(B1Q, canonicalCuboid, detectionsQ{i}.yaw, ... 
                                                detectionsQ{i}.sigma_3D, ...
                                                params_carCuboid.sz_ub, params_carCuboid.sz_lb ...
                                               );  
                                           
            bvolumeQ1 = bvolumeQ1 - repmat([0 params_carCuboid.avg_Sz(2)/2 0], size(bvolumeQ1,1), 1);  
        else
           
            B1Q = detectionsQ{i}.origin;
            bvolumeQ1 = detectionsQ{i}.bvolume;
            
        end
        
        % TODO: propagate the noise and find the new noise ... for now the same noise is kept
        
        % car origin propagated into the train (next) frame
        B2Q = motion(1:3) + B1Q ;
        bvolumeQ1 = bvolumeQ1 + repmat(motion(1:3)', size(bvolumeQ1,1),1);
        % CUBOID IN TRAIN FRAME        
        
        [bvolumeQ2, k2] = getBoundingVolume(B2Q, bvolumeQ1, motion(4), ... 
                                             detectionsQ{i}.sigma_3D, ...
                                             params_carCuboid.sz_ub, params_carCuboid.sz_lb ...
                                            );                 
        
        % offset by h/2 as cars sit on the road
        
        % project the bounding volume into the image and store it as bvolume_proj
        bvolume_proj = (params_2D3D.K*bvolumeQ2')';
        bvolume_proj(:,1:3) = bvolume_proj(:,1:3)./repmat(bvolume_proj(:,3), 1,3);
        kbvolume_proj = convhull(bvolume_proj(:,1:2));
        bvolume_proj = bvolume_proj(kbvolume_proj,:);
        
        % update the struct fields
        
        detectionsQ{i}.bvolume = bvolumeQ2;             
        detectionsQ{i}.bvolume_proj = bvolume_proj;
        detectionsQ{i}.yaw  = detectionsQ{i}.yaw+motion(4); 
        detectionsQ{i}.origin  = B2Q;        
        detectionsQ{i}.k = k2;       
        
                        
    end

Feed the structs into the score functions to get the 3D-3D, 3D-2D and 2D-2D scores, and normalise each row of the 2D-2D score matrix to [0, 1]:

[scoreMat_3D3D, scoreMat_3D2D] = generateScoreMatrices(detectionsQ, detectionsT);
            [scoreMat_2D2D]                = generate2D2DScoreMatrix(features_2D2D{i}, features_2D2D{i+1});
            
            %normalize scoreMat2D2D from -1:1 to 0:1    
            max2D2D = max(scoreMat_2D2D');
            max2D2D = repmat(max2D2D', 1, size(scoreMat_2D2D,2));
            min2D2D = min(scoreMat_2D2D');
            min2D2D = repmat(min2D2D', 1, size(scoreMat_2D2D,2));
            scoreMat_2D2D = (scoreMat_2D2D - min2D2D) ./ (max2D2D-min2D2D +0.0000001) ;
            
            scoreMat_viewpoint = zeros(size(scoreMat_3D3D)); 

其中,generateScoreMatrices函数输入前后两帧的结构体,生成3D-3D和3D-2D得分矩阵:

function [scoreMatrix_3D3D, scoreMatrix_3D2D] = generateScoreMatrices(detectionsQ, detectionsT)

    scoreMatrix_3D3D = zeros(length(detectionsQ), length(detectionsT));
    scoreMatrix_3D2D = zeros(length(detectionsQ), length(detectionsT));
    
    
    % 3D-3D and 3D-2D scores
    for i = 1:length(detectionsQ)                
        for j = 1:length(detectionsT)
            scoreMatrix_3D3D(i,j) = get3D3DMatchScore(detectionsT{j}.bvolume, detectionsQ{i}.bvolume);            
            scoreMatrix_3D2D(i,j) = get3D2DMatchScore(detectionsT{j}.bbox, detectionsQ{i}.bvolume_proj);            
        end
    end
    
end

get3D3DMatchScore:

function score = get3D3DMatchScore(bvolumeQ, bvolumeT)

    % take the ground-plane (X-Z) footprint of each bounding volume, then compute the 3D-3D score
    kbvolumeQ_xz = convhull(bvolumeQ(:,1),  bvolumeQ(:,3)); % convex hull of the footprint
    bvolumeQ_xz  = [bvolumeQ(kbvolumeQ_xz,1) bvolumeQ(kbvolumeQ_xz,3)];
        
    kbvolumeT_xz = convhull(bvolumeT(:,1),  bvolumeT(:,3));
    bvolumeT_xz = [bvolumeT(kbvolumeT_xz,1) bvolumeT(kbvolumeT_xz,3)];
    
    % intersection of the query footprint with the train footprint
    [x_in, y_in] = polybool('intersection', bvolumeQ_xz(:,1), bvolumeQ_xz(:,2), bvolumeT_xz(:,1), bvolumeT_xz(:,2));    
    
    % fraction of the query footprint that lies inside the intersection
    area_in = polyarea(x_in, y_in);    
    bvolumeQ_xz_area  = polyarea(bvolumeQ_xz(:,1), bvolumeQ_xz(:,2));    
    
    score = area_in / bvolumeQ_xz_area;    
end

get3D2DMatchScore:

% compute the overlap between a 2D bounding box and the projected search region
function [score] = get3D2DMatchScore(bbox, searchRegion)
    
    convShapeQ = [bbox(1) bbox(2);
                  bbox(3) bbox(2);
                  bbox(3) bbox(4);
                  bbox(1) bbox(4)];
              
    convShapeT = searchRegion;
    
    % find the intersection of bbox and search region
    [x_in, y_in] = polybool('intersection', convShapeQ(:,1), convShapeQ(:,2), convShapeT(:,1), convShapeT(:,2));    
    
    % compute percentage area of bbox i.e. Q present in the search region    
    area_in = polyarea(x_in, y_in);    
    convShapeQ_area  = polyarea(convShapeQ(:,1), convShapeQ(:,2));    
    
    score = area_in / convShapeQ_area;    
    
end

The 3D-3D, 3D-2D and 2D-2D scores are combined with the weights and stored in scoreMat_all (the 2D-2D score is first masked to where the 3D-2D score is positive):

 if(~isempty(scoreMat_2D2D))                           
                
                scoreMat_3D2D_mask = (scoreMat_3D2D > 0);
                scoreMat_2D2D = scoreMat_2D2D .* scoreMat_3D2D_mask;
                
                scoreMat_all =  c33*scoreMat_3D3D + c32*scoreMat_3D2D + c22*scoreMat_2D2D + cvp*scoreMat_viewpoint;
                
            else
                
                scoreMat_all = c33*scoreMat_3D3D + c32*scoreMat_3D2D + cvp*scoreMat_viewpoint;
            end  

If enabled, use the Hungarian algorithm (munkres) to associate the detections of the two frames:

if(hungarian_association)
                % use the Hungarian algorithm to find the association between the two frames
                [assign, cost]=  munkres(-scoreMat_all);
                
                % give IDs to the  Ts based on assignment or give new ID
                
                for ii = 1:length(assign)
                    
                    % ADD: scoreMat_all(ii,assign(ii)) > 0.005 , to handle
                    % death and birth issues. Not very useful though.
                    if(assign(ii) > 0 && scoreMat_all(ii,assign(ii)) > 0.0   )
                        detectionsT{assign(ii)}.dno = detectionsQ{ii}.dno;
                    end
                end
                
            end

If the Hungarian association is not used, decide greedily from the weighted score matrix whether a detection in the next frame is the same car as one in the current frame:

if(~hungarian_association)
                
                for ii = 1:length(detectionsQ)
                    
                    scoreQT = scoreMat_all(ii,:);
                    [max_v, max_id] = max(scoreQT);
                    
                    % if the maximum score exceeds the threshold, treat it as the same vehicle as the query detection
                    if(max_v > 0.15)
                        % give the Q's id to matching T
                        detectionsT{max_id}.dno = detectionsQ{ii}.dno;
                        
                    end
                    
                end
            end

Any detection in the next frame that was not matched is given a new ID, and the global vehicle count is incremented:

            for ii = 1:length(detectionsT)
                
                if(detectionsT{ii}.dno == -1)
                    detectionsT{ii}.dno = global_id;
                    global_id = global_id + 1;
                end
            end

Assign the next frame's structs to the current frame and continue the loop:

            first_time = 0;
            tempDQ = detectionsQ;
            detectionsQ = detectionsT;    
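
The snippet above is where the walkthrough stops; a minimal sketch of how the loop tail is assumed to continue (the variable names follow the demo, but the exact bookkeeping is an assumption rather than the original code):

            num_detectionsQ = num_detectionsT;        % the train frame becomes the next query frame
            detectionsT     = {};                     % cleared; refilled from detections{i+2} next iteration
            scoreMatrices_for_seq{i} = scoreMat_all;  % keep the per-frame score matrix (assumed)
        end   % frame loop
    end       % sequence loop (for s = [2])
end           % weight-configuration loop (for ww = ...)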

 
