空持千百偈，不如吃茶去

OpenPCDet训练Waymo数据集

一、前言

官方给出的方法：

https://github.com/open-mmlab/OpenPCDet/blob/master/docs/GETTING_STARTED.mdhttps://github.com/open-mmlab/OpenPCDet/blob/master/docs/GETTING_STARTED.md

本文采用两个方法生成数据集：官方方法和自定义路径方法，后者是为了防止内存不够用。

二、使用官方方法生成并训练数据集

1、整理数据

（1）下载官方数据集

Waymo Open Dataset, 包括训练数据：training_0000.tar~training_0031.tar以及验证集数据： validation_0000.tar~validation_0007.tar（如果仅仅只是想用一部分进行训练，可只下载几个tar压缩包就可以，数量不会影响）。

（2）将上述所有xxxx.tar文件解压到data/waymo/raw_data目录(可以得到798训练tfrecord和202验证tfrecord)。

（3）所有的数据集名称需要更改为：segment-xxxxxxxx.tfrecord。

（4）整理目录

在OpenPCDet-master/data文件目录需要整理如下：

├── data

│   ├── waymo

│   │   │── ImageSets

│   │   │── raw_data

│   │   │   │── segment-xxxxxxxx.tfrecord

|   |   |   |── ...

注意：这里只能在OpenPCDet根目录下生成数据集，否则需要更改源码，后面的方法有讲到

简单整理如下图所示：

raw_data目录：

注意：这里raw_data一定要将train和val的数据集放到一起，不能只放其中一个，否则之后eval会报错，无法评估结果。

2、安装工具包

pip install waymo-open-dataset-tf-2.11.0==1.5.0

我安装的是waymo-open-dataset-tf-2.11.0，大家根据需求来安装版本，我的2-1-0不能用。

3、使用命令进行转换

从tfrecord中提取点云数据并生成数据信息：

python -m pcdet.datasets.waymo.waymo_dataset --func create_waymo_infos --cfg_file tools/cfgs/dataset_configs/waymo_dataset.yaml

最终生成的数据集目录如下所示：

# Download Waymo and organize it into the following form:

├── data

│   ├── waymo

│   │   │── ImageSets

│   │   │── raw_data

│   │   │   │── segment-xxxxxxxx.tfrecord

|   |   |   |── ...

|   |   |── waymo_processed_data

│   │   │   │── segment-xxxxxxxx/

|   |   |   |── ...

│   │   │── pcdet_gt_database_train_sampled_xx/

│   │   │── pcdet_waymo_dbinfos_train_sampled_xx.pkl

如图：

4、训练

这里以PV_RCNN++为例，使用命令：

python train.py --cfg_file /home/xd/xyy/OpenPCDet-master/tools/cfgs/waymo_models/pv_rcnn_plusplus.yaml

三、在自定义目录下生成并训练数据集

自定义目录下生成并训练数据集主要是为了节约内存，不同版本pcdet及相关代码共用一个数据集。

1、整理数据

（1）请下载官方数据集：

Waymo Open Dataset, 包括训练数据：training_0000.tar~training_0031.tar以及验证集数据： validation_0000.tar~validation_0007.tar.

（2）将上述所有xxxx.tar文件解压到data/waymo/raw_data目录(可以得到798训练tfrecord和202验证tfrecord)：

（3）所有的数据集名称需要更改为：segment-xxxxxxxx.tfrecord，如果是下载的是individual

（4）整理目录

前三步是与上面一样的。

下面整理目录可以将数据放在xxx/data文件目录下：

├xxx

├── data

│   ├── waymo

│   │   │── ImageSets

│   │   │── raw_data

│   │   │   │── segment-xxxxxxxx.tfrecord

|   |   |   |── ...

以下面图举例，我是放在了自定义文件夹hpc/data/waymo_pcdet_mini中，

raw_data文件夹如下所示：

2、安装工具包

pip install waymo-open-dataset-tf-2.11.0==1.5.0

3、更改waymo源码

注：源码只能在OpenPCDet-master目录下生成数据，这里需要新加一些可以传入的参数进行转换。

（1）更改生成数据源码——waymo_dataset.py

位置：

/OpenPCDet-master/pcdet/datasets/waymo/waymo_dataset.py

更改部分主要是：parser和ROOT_DIR

其中，更改部分如下图所示：

更改后的源码如下：

# OpenPCDet PyTorch Dataloader and Evaluation Tools for Waymo Open Dataset
# Reference https://github.com/open-mmlab/OpenPCDet
# Written by Shaoshuai Shi, Chaoxu Guo
# All Rights Reserved.

import os
import pickle
import copy
import numpy as np
import torch
import multiprocessing
import SharedArray
import torch.distributed as dist
from tqdm import tqdm
from pathlib import Path
from functools import partial

from ...ops.roiaware_pool3d import roiaware_pool3d_utils
from ...utils import box_utils, common_utils
from ..dataset import DatasetTemplate


class WaymoDataset(DatasetTemplate):
    def __init__(self, dataset_cfg, class_names, training=True, root_path=None, logger=None):
        super().__init__(
            dataset_cfg=dataset_cfg, class_names=class_names, training=training, root_path=root_path, logger=logger
        )
        self.data_path = self.root_path / self.dataset_cfg.PROCESSED_DATA_TAG
        self.split = self.dataset_cfg.DATA_SPLIT[self.mode]
        split_dir = self.root_path / 'ImageSets' / (self.split + '.txt')
        self.sample_sequence_list = [x.strip() for x in open(split_dir).readlines()]

        self.infos = []
        self.seq_name_to_infos = self.include_waymo_data(self.mode)

        self.use_shared_memory = self.dataset_cfg.get('USE_SHARED_MEMORY', False) and self.training
        if self.use_shared_memory:
            self.shared_memory_file_limit = self.dataset_cfg.get('SHARED_MEMORY_FILE_LIMIT', 0x7FFFFFFF)
            self.load_data_to_shared_memory()

        if self.dataset_cfg.get('USE_PREDBOX', False):
            self.pred_boxes_dict = self.load_pred_boxes_to_dict(
                pred_boxes_path=self.dataset_cfg.ROI_BOXES_PATH[self.mode]
            )
        else:
            self.pred_boxes_dict = {}

    def set_split(self, split):
        super().__init__(
            dataset_cfg=self.dataset_cfg, class_names=self.class_names, training=self.training,
            root_path=self.root_path, logger=self.logger
        )
        self.split = split
        split_dir = self.root_path / 'ImageSets' / (self.split + '.txt')
        self.sample_sequence_list = [x.strip() for x in open(split_dir).readlines()]
        self.infos = []
        self.seq_name_to_infos = self.include_waymo_data(self.mode)

    def include_waymo_data(self, mode):
        self.logger.info('Loading Waymo dataset')
        waymo_infos = []
        seq_name_to_infos = {}

        num_skipped_infos = 0
        for k in range(len(self.sample_sequence_list)):
            sequence_name = os.path.splitext(self.sample_sequence_list[k])[0]
            info_path = self.data_path / sequence_name / ('%s.pkl' % sequence_name)
            info_path = self.check_sequence_name_with_all_version(info_path)
            if not info_path.exists():
                num_skipped_infos += 1
                continue
            with open(info_path, 'rb') as f:
                infos = pickle.load(f)
                waymo_infos.extend(infos)

            seq_name_to_infos[infos[0]['point_cloud']['lidar_sequence']] = infos

        self.infos.extend(waymo_infos[:])
        self.logger.info('Total skipped info %s' % num_skipped_infos)
        self.logger.info('Total samples for Waymo dataset: %d' % (len(waymo_infos)))

        if self.dataset_cfg.SAMPLED_INTERVAL[mode] > 1:
            sampled_waymo_infos = []
            for k in range(0, len(self.infos), self.dataset_cfg.SAMPLED_INTERVAL[mode]):
                sampled_waymo_infos.append(self.infos[k])
            self.infos = sampled_waymo_infos
            self.logger.info('Total sampled samples for Waymo dataset: %d' % len(self.infos))
            
        use_sequence_data = self.dataset_cfg.get('SEQUENCE_CONFIG', None) is not None and self.dataset_cfg.SEQUENCE_CONFIG.ENABLED
        if not use_sequence_data:
            seq_name_to_infos = None 
        return seq_name_to_infos

    def load_pred_boxes_to_dict(self, pred_boxes_path):
        self.logger.info(f'Loading and reorganizing pred_boxes to dict from path: {pred_boxes_path}')
        with open(pred_boxes_path, 'rb') as f:
            pred_dicts = pickle.load(f)

        pred_boxes_dict = {}
        for index, box_dict in enumerate(pred_dicts):
            seq_name = box_dict['frame_id'][:-4].replace('training_', '').replace('validation_', '')
            sample_idx = int(box_dict['frame_id'][-3:])

            if seq_name not in pred_boxes_dict:
                pred_boxes_dict[seq_name] = {}

            pred_labels = np.array([self.class_names.index(box_dict['name'][k]) + 1 for k in range(box_dict['name'].shape[0])])
            pred_boxes = np.concatenate((box_dict['boxes_lidar'], box_dict['score'][:, np.newaxis], pred_labels[:, np.newaxis]), axis=-1)
            pred_boxes_dict[seq_name][sample_idx] = pred_boxes

        self.logger.info(f'Predicted boxes has been loaded, total sequences: {len(pred_boxes_dict)}')
        return pred_boxes_dict

    def load_data_to_shared_memory(self):
        self.logger.info(f'Loading training data to shared memory (file limit={self.shared_memory_file_limit})')

        cur_rank, num_gpus = common_utils.get_dist_info()
        all_infos = self.infos[:self.shared_memory_file_limit] \
            if self.shared_memory_file_limit < len(self.infos) else self.infos
        cur_infos = all_infos[cur_rank::num_gpus]
        for info in cur_infos:
            pc_info = info['point_cloud']
            sequence_name = pc_info['lidar_sequence']
            sample_idx = pc_info['sample_idx']

            sa_key = f'{sequence_name}___{sample_idx}'
            if os.path.exists(f"/dev/shm/{sa_key}"):
                continue

            points = self.get_lidar(sequence_name, sample_idx)
            common_utils.sa_create(f"shm://{sa_key}", points)

        dist.barrier()
        self.logger.info('Training data has been saved to shared memory')

    def clean_shared_memory(self):
        self.logger.info(f'Clean training data from shared memory (file limit={self.shared_memory_file_limit})')

        cur_rank, num_gpus = common_utils.get_dist_info()
        all_infos = self.infos[:self.shared_memory_file_limit] \
            if self.shared_memory_file_limit < len(self.infos) else self.infos
        cur_infos = all_infos[cur_rank::num_gpus]
        for info in cur_infos:
            pc_info = info['point_cloud']
            sequence_name = pc_info['lidar_sequence']
            sample_idx = pc_info['sample_idx']

            sa_key = f'{sequence_name}___{sample_idx}'
            if not os.path.exists(f"/dev/shm/{sa_key}"):
                continue

            SharedArray.delete(f"shm://{sa_key}")

        if num_gpus > 1:
            dist.barrier()
        self.logger.info('Training data has been deleted from shared memory')

    @staticmethod
    def check_sequence_name_with_all_version(sequence_file):
        if not sequence_file.exists():
            found_sequence_file = sequence_file
            for pre_text in ['training', 'validation', 'testing']:
                if not sequence_file.exists():
                    temp_sequence_file = Path(str(sequence_file).replace('segment', pre_text + '_segment'))
                    if temp_sequence_file.exists():
                        found_sequence_file = temp_sequence_file
                        break
            if not found_sequence_file.exists():
                found_sequence_file = Path(str(sequence_file).replace('_with_camera_labels', ''))
            if found_sequence_file.exists():
                sequence_file = found_sequence_file
        return sequence_file

    def get_infos(self, raw_data_path, save_path, num_workers=multiprocessing.cpu_count(), has_label=True, sampled_interval=1, update_info_only=False):
        from . import waymo_utils
        print('---------------The waymo sample interval is %d, total sequecnes is %d-----------------'
              % (sampled_interval, len(self.sample_sequence_list)))

        process_single_sequence = partial(
            waymo_utils.process_single_sequence,
            save_path=save_path, sampled_interval=sampled_interval, has_label=has_label, update_info_only=update_info_only
        )
        sample_sequence_file_list = [
            self.check_sequence_name_with_all_version(raw_data_path / sequence_file)
            for sequence_file in self.sample_sequence_list
        ]

        # process_single_sequence(sample_sequence_file_list[0])
        with multiprocessing.Pool(num_workers) as p:
            sequence_infos = list(tqdm(p.imap(process_single_sequence, sample_sequence_file_list),
                                       total=len(sample_sequence_file_list)))

        all_sequences_infos = [item for infos in sequence_infos for item in infos]
        return all_sequences_infos

    def get_lidar(self, sequence_name, sample_idx):
        lidar_file = self.data_path / sequence_name / ('%04d.npy' % sample_idx)
        point_features = np.load(lidar_file)  # (N, 7): [x, y, z, intensity, elongation, NLZ_flag]

        points_all, NLZ_flag = point_features[:, 0:5], point_features[:, 5]
        if not self.dataset_cfg.get('DISABLE_NLZ_FLAG_ON_POINTS', False):
            points_all = points_all[NLZ_flag == -1]
        points_all[:, 3] = np.tanh(points_all[:, 3])
        return points_all

    @staticmethod
    def transform_prebox_to_current(pred_boxes3d, pose_pre, pose_cur):
        """

        Args:
            pred_boxes3d (N, 9 or 11): [x, y, z, dx, dy, dz, raw,  score, label]
            pose_pre (4, 4):
            pose_cur (4, 4):
        Returns:

        """
        assert pred_boxes3d.shape[-1] in [9, 11]
        pred_boxes3d = pred_boxes3d.copy()
        expand_bboxes = np.concatenate([pred_boxes3d[:, :3], np.ones((pred_boxes3d.shape[0], 1))], axis=-1)

        bboxes_global = np.dot(expand_bboxes, pose_pre.T)[:, :3]
        expand_bboxes_global = np.concatenate([bboxes_global[:, :3],np.ones((bboxes_global.shape[0], 1))], axis=-1)
        bboxes_pre2cur = np.dot(expand_bboxes_global, np.linalg.inv(pose_cur.T))[:, :3]
        pred_boxes3d[:, 0:3] = bboxes_pre2cur

        if pred_boxes3d.shape[-1] == 11:
            expand_vels = np.concatenate([pred_boxes3d[:, 7:9], np.zeros((pred_boxes3d.shape[0], 1))], axis=-1)
            vels_global = np.dot(expand_vels, pose_pre[:3, :3].T)
            vels_pre2cur = np.dot(vels_global, np.linalg.inv(pose_cur[:3, :3].T))[:,:2]
            pred_boxes3d[:, 7:9] = vels_pre2cur

        pred_boxes3d[:, 6]  = pred_boxes3d[..., 6] + np.arctan2(pose_pre[..., 1, 0], pose_pre[..., 0, 0])
        pred_boxes3d[:, 6]  = pred_boxes3d[..., 6] - np.arctan2(pose_cur[..., 1, 0], pose_cur[..., 0, 0])
        return pred_boxes3d

    @staticmethod
    def reorder_rois_for_refining(pred_bboxes):
        num_max_rois = max([len(bbox) for bbox in pred_bboxes])
        num_max_rois = max(1, num_max_rois)  # at least one faked rois to avoid error
        ordered_bboxes = np.zeros([len(pred_bboxes), num_max_rois, pred_bboxes[0].shape[-1]], dtype=np.float32)

        for bs_idx in range(ordered_bboxes.shape[0]):
            ordered_bboxes[bs_idx, :len(pred_bboxes[bs_idx])] = pred_bboxes[bs_idx]
        return ordered_bboxes

    def get_sequence_data(self, info, points, sequence_name, sample_idx, sequence_cfg, load_pred_boxes=False):
        """
        Args:
            info:
            points:
            sequence_name:
            sample_idx:
            sequence_cfg:
        Returns:
        """

        def remove_ego_points(points, center_radius=1.0):
            mask = ~((np.abs(points[:, 0]) < center_radius) & (np.abs(points[:, 1]) < center_radius))
            return points[mask]

        def load_pred_boxes_from_dict(sequence_name, sample_idx):
            """
            boxes: (N, 11)  [x, y, z, dx, dy, dn, raw, vx, vy, score, label]
            """
            sequence_name = sequence_name.replace('training_', '').replace('validation_', '')
            load_boxes = self.pred_boxes_dict[sequence_name][sample_idx]
            assert load_boxes.shape[-1] == 11
            load_boxes[:, 7:9] = -0.1 * load_boxes[:, 7:9]  # transfer speed to negtive motion from t to t-1
            return load_boxes

        pose_cur = info['pose'].reshape((4, 4))
        num_pts_cur = points.shape[0]
        sample_idx_pre_list = np.clip(sample_idx + np.arange(sequence_cfg.SAMPLE_OFFSET[0], sequence_cfg.SAMPLE_OFFSET[1]), 0, 0x7FFFFFFF)
        sample_idx_pre_list = sample_idx_pre_list[::-1]

        if sequence_cfg.get('ONEHOT_TIMESTAMP', False):
            onehot_cur = np.zeros((points.shape[0], len(sample_idx_pre_list) + 1)).astype(points.dtype)
            onehot_cur[:, 0] = 1
            points = np.hstack([points, onehot_cur])
        else:
            points = np.hstack([points, np.zeros((points.shape[0], 1)).astype(points.dtype)])
        points_pre_all = []
        num_points_pre = []

        pose_all = [pose_cur]
        pred_boxes_all = []
        if load_pred_boxes:
            pred_boxes = load_pred_boxes_from_dict(sequence_name, sample_idx)
            pred_boxes_all.append(pred_boxes)

        sequence_info = self.seq_name_to_infos[sequence_name]

        for idx, sample_idx_pre in enumerate(sample_idx_pre_list):

            points_pre = self.get_lidar(sequence_name, sample_idx_pre)
            pose_pre = sequence_info[sample_idx_pre]['pose'].reshape((4, 4))
            expand_points_pre = np.concatenate([points_pre[:, :3], np.ones((points_pre.shape[0], 1))], axis=-1)
            points_pre_global = np.dot(expand_points_pre, pose_pre.T)[:, :3]
            expand_points_pre_global = np.concatenate([points_pre_global, np.ones((points_pre_global.shape[0], 1))], axis=-1)
            points_pre2cur = np.dot(expand_points_pre_global, np.linalg.inv(pose_cur.T))[:, :3]
            points_pre = np.concatenate([points_pre2cur, points_pre[:, 3:]], axis=-1)
            if sequence_cfg.get('ONEHOT_TIMESTAMP', False):
                onehot_vector = np.zeros((points_pre.shape[0], len(sample_idx_pre_list) + 1))
                onehot_vector[:, idx + 1] = 1
                points_pre = np.hstack([points_pre, onehot_vector])
            else:
                # add timestamp
                points_pre = np.hstack([points_pre, 0.1 * (sample_idx - sample_idx_pre) * np.ones((points_pre.shape[0], 1)).astype(points_pre.dtype)])  # one frame 0.1s
            points_pre = remove_ego_points(points_pre, 1.0)
            points_pre_all.append(points_pre)
            num_points_pre.append(points_pre.shape[0])
            pose_all.append(pose_pre)

            if load_pred_boxes:
                pose_pre = sequence_info[sample_idx_pre]['pose'].reshape((4, 4))
                pred_boxes = load_pred_boxes_from_dict(sequence_name, sample_idx_pre)
                pred_boxes = self.transform_prebox_to_current(pred_boxes, pose_pre, pose_cur)
                pred_boxes_all.append(pred_boxes)

        points = np.concatenate([points] + points_pre_all, axis=0).astype(np.float32)
        num_points_all = np.array([num_pts_cur] + num_points_pre).astype(np.int32)
        poses = np.concatenate(pose_all, axis=0).astype(np.float32)

        if load_pred_boxes:
            temp_pred_boxes = self.reorder_rois_for_refining(pred_boxes_all)
            pred_boxes = temp_pred_boxes[:, :, 0:9]
            pred_scores = temp_pred_boxes[:, :, 9]
            pred_labels = temp_pred_boxes[:, :, 10]
        else:
            pred_boxes = pred_scores = pred_labels = None

        return points, num_points_all, sample_idx_pre_list, poses, pred_boxes, pred_scores, pred_labels

    def __len__(self):
        if self._merge_all_iters_to_one_epoch:
            return len(self.infos) * self.total_epochs

        return len(self.infos)

    def __getitem__(self, index):
        if self._merge_all_iters_to_one_epoch:
            index = index % len(self.infos)

        info = copy.deepcopy(self.infos[index])
        pc_info = info['point_cloud']
        sequence_name = pc_info['lidar_sequence']
        sample_idx = pc_info['sample_idx']
        input_dict = {
            'sample_idx': sample_idx
        }
        if self.use_shared_memory and index < self.shared_memory_file_limit:
            sa_key = f'{sequence_name}___{sample_idx}'
            points = SharedArray.attach(f"shm://{sa_key}").copy()
        else:
            points = self.get_lidar(sequence_name, sample_idx)

        if self.dataset_cfg.get('SEQUENCE_CONFIG', None) is not None and self.dataset_cfg.SEQUENCE_CONFIG.ENABLED:
            points, num_points_all, sample_idx_pre_list, poses, pred_boxes, pred_scores, pred_labels = self.get_sequence_data(
                info, points, sequence_name, sample_idx, self.dataset_cfg.SEQUENCE_CONFIG,
                load_pred_boxes=self.dataset_cfg.get('USE_PREDBOX', False)
            )
            input_dict['poses'] = poses
            if self.dataset_cfg.get('USE_PREDBOX', False):
                input_dict.update({
                    'roi_boxes': pred_boxes,
                    'roi_scores': pred_scores,
                    'roi_labels': pred_labels,
                })

        input_dict.update({
            'points': points,
            'frame_id': info['frame_id'],
        })

        if 'annos' in info:
            annos = info['annos']
            annos = common_utils.drop_info_with_name(annos, name='unknown')

            if self.dataset_cfg.get('INFO_WITH_FAKELIDAR', False):
                gt_boxes_lidar = box_utils.boxes3d_kitti_fakelidar_to_lidar(annos['gt_boxes_lidar'])
            else:
                gt_boxes_lidar = annos['gt_boxes_lidar']

            if self.dataset_cfg.get('TRAIN_WITH_SPEED', False):
                assert gt_boxes_lidar.shape[-1] == 9
            else:
                gt_boxes_lidar = gt_boxes_lidar[:, 0:7]

            if self.training and self.dataset_cfg.get('FILTER_EMPTY_BOXES_FOR_TRAIN', False):
                mask = (annos['num_points_in_gt'] > 0)  # filter empty boxes
                annos['name'] = annos['name'][mask]
                gt_boxes_lidar = gt_boxes_lidar[mask]
                annos['num_points_in_gt'] = annos['num_points_in_gt'][mask]

            input_dict.update({
                'gt_names': annos['name'],
                'gt_boxes': gt_boxes_lidar,
                'num_points_in_gt': annos.get('num_points_in_gt', None)
            })

        data_dict = self.prepare_data(data_dict=input_dict)
        data_dict['metadata'] = info.get('metadata', info['frame_id'])
        data_dict.pop('num_points_in_gt', None)
        return data_dict

    def evaluation(self, det_annos, class_names, **kwargs):
        if 'annos' not in self.infos[0].keys():
            return 'No ground-truth boxes for evaluation', {}

        def kitti_eval(eval_det_annos, eval_gt_annos):
            from ..kitti.kitti_object_eval_python import eval as kitti_eval
            from ..kitti import kitti_utils

            map_name_to_kitti = {
                'Vehicle': 'Car',
                'Pedestrian': 'Pedestrian',
                'Cyclist': 'Cyclist',
                'Sign': 'Sign',
                'Car': 'Car'
            }
            kitti_utils.transform_annotations_to_kitti_format(eval_det_annos, map_name_to_kitti=map_name_to_kitti)
            kitti_utils.transform_annotations_to_kitti_format(
                eval_gt_annos, map_name_to_kitti=map_name_to_kitti,
                info_with_fakelidar=self.dataset_cfg.get('INFO_WITH_FAKELIDAR', False)
            )
            kitti_class_names = [map_name_to_kitti[x] for x in class_names]
            ap_result_str, ap_dict = kitti_eval.get_official_eval_result(
                gt_annos=eval_gt_annos, dt_annos=eval_det_annos, current_classes=kitti_class_names
            )
            return ap_result_str, ap_dict

        def waymo_eval(eval_det_annos, eval_gt_annos):
            from .waymo_eval import OpenPCDetWaymoDetectionMetricsEstimator
            eval = OpenPCDetWaymoDetectionMetricsEstimator()

            ap_dict = eval.waymo_evaluation(
                eval_det_annos, eval_gt_annos, class_name=class_names,
                distance_thresh=1000, fake_gt_infos=self.dataset_cfg.get('INFO_WITH_FAKELIDAR', False)
            )
            ap_result_str = '\n'
            for key in ap_dict:
                ap_dict[key] = ap_dict[key][0]
                ap_result_str += '%s: %.4f \n' % (key, ap_dict[key])

            return ap_result_str, ap_dict

        eval_det_annos = copy.deepcopy(det_annos)
        eval_gt_annos = [copy.deepcopy(info['annos']) for info in self.infos]

        if kwargs['eval_metric'] == 'kitti':
            ap_result_str, ap_dict = kitti_eval(eval_det_annos, eval_gt_annos)
        elif kwargs['eval_metric'] == 'waymo':
            ap_result_str, ap_dict = waymo_eval(eval_det_annos, eval_gt_annos)
        else:
            raise NotImplementedError

        return ap_result_str, ap_dict

    def create_groundtruth_database(self, info_path, save_path, used_classes=None, split='train', sampled_interval=10,
                                    processed_data_tag=None):

        use_sequence_data = self.dataset_cfg.get('SEQUENCE_CONFIG', None) is not None and self.dataset_cfg.SEQUENCE_CONFIG.ENABLED

        if use_sequence_data:
            st_frame, ed_frame = self.dataset_cfg.SEQUENCE_CONFIG.SAMPLE_OFFSET[0], self.dataset_cfg.SEQUENCE_CONFIG.SAMPLE_OFFSET[1]
            self.dataset_cfg.SEQUENCE_CONFIG.SAMPLE_OFFSET[0] = min(-4, st_frame)  # at least we use 5 frames for generating gt database to support various sequence configs (<= 5 frames)
            st_frame = self.dataset_cfg.SEQUENCE_CONFIG.SAMPLE_OFFSET[0]
            database_save_path = save_path / ('%s_gt_database_%s_sampled_%d_multiframe_%s_to_%s' % (processed_data_tag, split, sampled_interval, st_frame, ed_frame))
            db_info_save_path = save_path / ('%s_waymo_dbinfos_%s_sampled_%d_multiframe_%s_to_%s.pkl' % (processed_data_tag, split, sampled_interval, st_frame, ed_frame))
            db_data_save_path = save_path / ('%s_gt_database_%s_sampled_%d_multiframe_%s_to_%s_global.npy' % (processed_data_tag, split, sampled_interval, st_frame, ed_frame))
        else:
            database_save_path = save_path / ('%s_gt_database_%s_sampled_%d' % (processed_data_tag, split, sampled_interval))
            db_info_save_path = save_path / ('%s_waymo_dbinfos_%s_sampled_%d.pkl' % (processed_data_tag, split, sampled_interval))
            db_data_save_path = save_path / ('%s_gt_database_%s_sampled_%d_global.npy' % (processed_data_tag, split, sampled_interval))

        database_save_path.mkdir(parents=True, exist_ok=True)
        all_db_infos = {}
        with open(info_path, 'rb') as f:
            infos = pickle.load(f)

        point_offset_cnt = 0
        stacked_gt_points = []
        for k in tqdm(range(0, len(infos), sampled_interval)):
            # print('gt_database sample: %d/%d' % (k + 1, len(infos)))
            info = infos[k]

            pc_info = info['point_cloud']
            sequence_name = pc_info['lidar_sequence']
            sample_idx = pc_info['sample_idx']
            points = self.get_lidar(sequence_name, sample_idx)

            if use_sequence_data:
                points, num_points_all, sample_idx_pre_list, _, _, _, _ = self.get_sequence_data(
                    info, points, sequence_name, sample_idx, self.dataset_cfg.SEQUENCE_CONFIG
                )

            annos = info['annos']
            names = annos['name']
            difficulty = annos['difficulty']
            gt_boxes = annos['gt_boxes_lidar']

            if k % 4 != 0 and len(names) > 0:
                mask = (names == 'Vehicle')
                names = names[~mask]
                difficulty = difficulty[~mask]
                gt_boxes = gt_boxes[~mask]

            if k % 2 != 0 and len(names) > 0:
                mask = (names == 'Pedestrian')
                names = names[~mask]
                difficulty = difficulty[~mask]
                gt_boxes = gt_boxes[~mask]

            num_obj = gt_boxes.shape[0]
            if num_obj == 0:
                continue

            box_idxs_of_pts = roiaware_pool3d_utils.points_in_boxes_gpu(
                torch.from_numpy(points[:, 0:3]).unsqueeze(dim=0).float().cuda(),
                torch.from_numpy(gt_boxes[:, 0:7]).unsqueeze(dim=0).float().cuda()
            ).long().squeeze(dim=0).cpu().numpy()

            for i in range(num_obj):
                filename = '%s_%04d_%s_%d.bin' % (sequence_name, sample_idx, names[i], i)
                filepath = database_save_path / filename
                gt_points = points[box_idxs_of_pts == i]
                gt_points[:, :3] -= gt_boxes[i, :3]

                if (used_classes is None) or names[i] in used_classes:
                    gt_points = gt_points.astype(np.float32)
                    assert gt_points.dtype == np.float32
                    with open(filepath, 'w') as f:
                        gt_points.tofile(f)

                    db_path = str(filepath.relative_to(self.root_path))  # gt_database/xxxxx.bin
                    db_info = {'name': names[i], 'path': db_path, 'sequence_name': sequence_name,
                               'sample_idx': sample_idx, 'gt_idx': i, 'box3d_lidar': gt_boxes[i],
                               'num_points_in_gt': gt_points.shape[0], 'difficulty': difficulty[i]}

                    # it will be used if you choose to use shared memory for gt sampling
                    stacked_gt_points.append(gt_points)
                    db_info['global_data_offset'] = [point_offset_cnt, point_offset_cnt + gt_points.shape[0]]
                    point_offset_cnt += gt_points.shape[0]

                    if names[i] in all_db_infos:
                        all_db_infos[names[i]].append(db_info)
                    else:
                        all_db_infos[names[i]] = [db_info]
        for k, v in all_db_infos.items():
            print('Database %s: %d' % (k, len(v)))

        with open(db_info_save_path, 'wb') as f:
            pickle.dump(all_db_infos, f)

        # it will be used if you choose to use shared memory for gt sampling
        stacked_gt_points = np.concatenate(stacked_gt_points, axis=0)
        np.save(db_data_save_path, stacked_gt_points)

    def create_gt_database_of_single_scene(self, info_with_idx, database_save_path=None, use_sequence_data=False, used_classes=None,
                                           total_samples=0, use_cuda=False, crop_gt_with_tail=False):
        info, info_idx = info_with_idx
        print('gt_database sample: %d/%d' % (info_idx, total_samples))

        all_db_infos = {}
        pc_info = info['point_cloud']
        sequence_name = pc_info['lidar_sequence']
        sample_idx = pc_info['sample_idx']
        points = self.get_lidar(sequence_name, sample_idx)

        if use_sequence_data:
            points, num_points_all, sample_idx_pre_list, _, _, _, _ = self.get_sequence_data(
                info, points, sequence_name, sample_idx, self.dataset_cfg.SEQUENCE_CONFIG
            )

        annos = info['annos']
        names = annos['name']
        difficulty = annos['difficulty']
        gt_boxes = annos['gt_boxes_lidar']

        if info_idx % 4 != 0 and len(names) > 0:
            mask = (names == 'Vehicle')
            names = names[~mask]
            difficulty = difficulty[~mask]
            gt_boxes = gt_boxes[~mask]

        if info_idx % 2 != 0 and len(names) > 0:
            mask = (names == 'Pedestrian')
            names = names[~mask]
            difficulty = difficulty[~mask]
            gt_boxes = gt_boxes[~mask]

        num_obj = gt_boxes.shape[0]
        if num_obj == 0:
            return {}

        if use_sequence_data and crop_gt_with_tail:
            assert gt_boxes.shape[1] == 9
            speed = gt_boxes[:, 7:9]
            sequence_cfg = self.dataset_cfg.SEQUENCE_CONFIG
            assert sequence_cfg.SAMPLE_OFFSET[1] == 0
            assert sequence_cfg.SAMPLE_OFFSET[0] < 0
            num_frames = sequence_cfg.SAMPLE_OFFSET[1] - sequence_cfg.SAMPLE_OFFSET[0] + 1
            assert num_frames > 1
            latest_center = gt_boxes[:, 0:2]
            oldest_center = latest_center - speed * (num_frames - 1) * 0.1
            new_center = (latest_center + oldest_center) * 0.5
            new_length = gt_boxes[:, 3] + np.linalg.norm(latest_center - oldest_center, axis=-1)
            gt_boxes_crop = gt_boxes.copy()
            gt_boxes_crop[:, 0:2] = new_center
            gt_boxes_crop[:, 3] = new_length

        else:
            gt_boxes_crop = gt_boxes

        if use_cuda:
            box_idxs_of_pts = roiaware_pool3d_utils.points_in_boxes_gpu(
                torch.from_numpy(points[:, 0:3]).unsqueeze(dim=0).float().cuda(),
                torch.from_numpy(gt_boxes_crop[:, 0:7]).unsqueeze(dim=0).float().cuda()
            ).long().squeeze(dim=0).cpu().numpy()
        else:
            box_point_mask = roiaware_pool3d_utils.points_in_boxes_cpu(
                torch.from_numpy(points[:, 0:3]).float(),
                torch.from_numpy(gt_boxes_crop[:, 0:7]).float()
            ).long().numpy()  # (num_boxes, num_points)

        for i in range(num_obj):
            filename = '%s_%04d_%s_%d.bin' % (sequence_name, sample_idx, names[i], i)
            filepath = database_save_path / filename
            if use_cuda:
                gt_points = points[box_idxs_of_pts == i]
            else:
                gt_points = points[box_point_mask[i] > 0]

            gt_points[:, :3] -= gt_boxes[i, :3]

            if (used_classes is None) or names[i] in used_classes:
                gt_points = gt_points.astype(np.float32)
                assert gt_points.dtype == np.float32
                with open(filepath, 'w') as f:
                    gt_points.tofile(f)

                db_path = str(filepath.relative_to(self.root_path))  # gt_database/xxxxx.bin
                db_info = {'name': names[i], 'path': db_path, 'sequence_name': sequence_name,
                            'sample_idx': sample_idx, 'gt_idx': i, 'box3d_lidar': gt_boxes[i],
                            'num_points_in_gt': gt_points.shape[0], 'difficulty': difficulty[i],
                            'box3d_crop': gt_boxes_crop[i]}

                if names[i] in all_db_infos:
                    all_db_infos[names[i]].append(db_info)
                else:
                    all_db_infos[names[i]] = [db_info]
        return all_db_infos

    def create_groundtruth_database_parallel(self, info_path, save_path, used_classes=None, split='train', sampled_interval=10,
                                             processed_data_tag=None, num_workers=16, crop_gt_with_tail=False):
        use_sequence_data = self.dataset_cfg.get('SEQUENCE_CONFIG', None) is not None and self.dataset_cfg.SEQUENCE_CONFIG.ENABLED
        if use_sequence_data:
            st_frame, ed_frame = self.dataset_cfg.SEQUENCE_CONFIG.SAMPLE_OFFSET[0], self.dataset_cfg.SEQUENCE_CONFIG.SAMPLE_OFFSET[1]
            self.dataset_cfg.SEQUENCE_CONFIG.SAMPLE_OFFSET[0] = min(-4, st_frame)  # at least we use 5 frames for generating gt database to support various sequence configs (<= 5 frames)
            st_frame = self.dataset_cfg.SEQUENCE_CONFIG.SAMPLE_OFFSET[0]
            database_save_path = save_path / ('%s_gt_database_%s_sampled_%d_multiframe_%s_to_%s_%sparallel' % (processed_data_tag, split, sampled_interval, st_frame, ed_frame, 'tail_' if crop_gt_with_tail else ''))
            db_info_save_path = save_path / ('%s_waymo_dbinfos_%s_sampled_%d_multiframe_%s_to_%s_%sparallel.pkl' % (processed_data_tag, split, sampled_interval, st_frame, ed_frame, 'tail_' if crop_gt_with_tail else ''))
        else:
            database_save_path = save_path / ('%s_gt_database_%s_sampled_%d_parallel' % (processed_data_tag, split, sampled_interval))
            db_info_save_path = save_path / ('%s_waymo_dbinfos_%s_sampled_%d_parallel.pkl' % (processed_data_tag, split, sampled_interval))
            
        database_save_path.mkdir(parents=True, exist_ok=True)

        with open(info_path, 'rb') as f:
            infos = pickle.load(f)

        print(f'Number workers: {num_workers}')
        create_gt_database_of_single_scene = partial(
            self.create_gt_database_of_single_scene,
            use_sequence_data=use_sequence_data, database_save_path=database_save_path,
            used_classes=used_classes, total_samples=len(infos), use_cuda=False,
            crop_gt_with_tail=crop_gt_with_tail
        )
        # create_gt_database_of_single_scene((infos[300], 0))
        with multiprocessing.Pool(num_workers) as p:
            all_db_infos_list = list(p.map(create_gt_database_of_single_scene, zip(infos, np.arange(len(infos)))))

        all_db_infos = {}

        for cur_db_infos in all_db_infos_list:
            for key, val in cur_db_infos.items():
                if key not in all_db_infos:
                    all_db_infos[key] = val
                else:
                    all_db_infos[key].extend(val)

        for k, v in all_db_infos.items():
            print('Database %s: %d' % (k, len(v)))

        with open(db_info_save_path, 'wb') as f:
            pickle.dump(all_db_infos, f)

def create_waymo_infos(dataset_cfg, class_names, data_path, save_path,
                       raw_data_tag='raw_data', processed_data_tag='waymo_processed_data',
                       workers=min(16, multiprocessing.cpu_count()), update_info_only=False):
    dataset = WaymoDataset(
        dataset_cfg=dataset_cfg, class_names=class_names, root_path=data_path,
        training=False, logger=common_utils.create_logger()
    )
    train_split, val_split = 'train', 'val'

    train_filename = save_path / ('%s_infos_%s.pkl' % (processed_data_tag, train_split))
    val_filename = save_path / ('%s_infos_%s.pkl' % (processed_data_tag, val_split))

    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
    print('---------------Start to generate data infos---------------')

    dataset.set_split(train_split)
    waymo_infos_train = dataset.get_infos(
        raw_data_path=data_path / raw_data_tag,
        save_path=save_path / processed_data_tag, num_workers=workers, has_label=True,
        sampled_interval=1, update_info_only=update_info_only
    )
    with open(train_filename, 'wb') as f:
        pickle.dump(waymo_infos_train, f)
    print('----------------Waymo info train file is saved to %s----------------' % train_filename)

    dataset.set_split(val_split)
    waymo_infos_val = dataset.get_infos(
        raw_data_path=data_path / raw_data_tag,
        save_path=save_path / processed_data_tag, num_workers=workers, has_label=True,
        sampled_interval=1, update_info_only=update_info_only
    )
    with open(val_filename, 'wb') as f:
        pickle.dump(waymo_infos_val, f)
    print('----------------Waymo info val file is saved to %s----------------' % val_filename)

    if update_info_only:
        return

    print('---------------Start create groundtruth database for data augmentation---------------')
    os.environ["CUDA_VISIBLE_DEVICES"] = "0"
    dataset.set_split(train_split)
    dataset.create_groundtruth_database(
        info_path=train_filename, save_path=save_path, split='train', sampled_interval=1,
        used_classes=['Vehicle', 'Pedestrian', 'Cyclist'], processed_data_tag=processed_data_tag
    )
    print('---------------Data preparation Done---------------')


def create_waymo_gt_database(
    dataset_cfg, class_names, data_path, save_path, processed_data_tag='waymo_processed_data',
    workers=min(16, multiprocessing.cpu_count()), use_parallel=False, crop_gt_with_tail=False):
    dataset = WaymoDataset(
        dataset_cfg=dataset_cfg, class_names=class_names, root_path=data_path,
        training=False, logger=common_utils.create_logger()
    )
    train_split = 'train'
    train_filename = save_path / ('%s_infos_%s.pkl' % (processed_data_tag, train_split))

    print('---------------Start create groundtruth database for data augmentation---------------')
    dataset.set_split(train_split)

    if use_parallel:
        dataset.create_groundtruth_database_parallel(
            info_path=train_filename, save_path=save_path, split='train', sampled_interval=1,
            used_classes=['Vehicle', 'Pedestrian', 'Cyclist'], processed_data_tag=processed_data_tag,
            num_workers=workers, crop_gt_with_tail=crop_gt_with_tail
        )
    else:
        dataset.create_groundtruth_database(
            info_path=train_filename, save_path=save_path, split='train', sampled_interval=1,
            used_classes=['Vehicle', 'Pedestrian', 'Cyclist'], processed_data_tag=processed_data_tag
        )
    print('---------------Data preparation Done---------------')


if __name__ == '__main__':
    import argparse
    import yaml
    from easydict import EasyDict

    parser = argparse.ArgumentParser(description='arg parser')
    parser.add_argument('--cfg_file', type=str, default=None, help='specify the config of dataset')
    parser.add_argument('--func', type=str, default='create_waymo_infos', help='')
    parser.add_argument('--processed_data_tag', type=str, default='waymo_processed_data_v0_5_0', help='')
    parser.add_argument('--update_info_only', action='store_true', default=False, help='')
    parser.add_argument('--use_parallel', action='store_true', default=False, help='')
    parser.add_argument('--wo_crop_gt_with_tail', action='store_true', default=False, help='')
    parser.add_argument('--data_path', default=None, help='')

    args = parser.parse_args()
    if args.data_path is not None:
        ROOT_DIR = (Path(args.data_path)).resolve()
    else:
        ROOT_DIR = (Path(__file__).resolve().parent / '../../../').resolve() / 'data'/'waymo'
    # ROOT_DIR = (Path(self.dataset_cfg.DATA_PATH)).resolve()
    if args.func == 'create_waymo_infos':
        try:
            yaml_config = yaml.safe_load(open(args.cfg_file), Loader=yaml.FullLoader)
        except:
            yaml_config = yaml.safe_load(open(args.cfg_file))
        dataset_cfg = EasyDict(yaml_config)
        dataset_cfg.PROCESSED_DATA_TAG = args.processed_data_tag
        create_waymo_infos(
            dataset_cfg=dataset_cfg,
            class_names=['Vehicle', 'Pedestrian', 'Cyclist'],
            data_path=ROOT_DIR,
            save_path=ROOT_DIR,
            raw_data_tag='raw_data',
            processed_data_tag=args.processed_data_tag,
            update_info_only=args.update_info_only
        )
    elif args.func == 'create_waymo_gt_database':
        try:
            yaml_config = yaml.safe_load(open(args.cfg_file), Loader=yaml.FullLoader)
        except:
            yaml_config = yaml.safe_load(open(args.cfg_file))
        dataset_cfg = EasyDict(yaml_config)
        dataset_cfg.PROCESSED_DATA_TAG = args.processed_data_tag
        create_waymo_gt_database(
            dataset_cfg=dataset_cfg,
            class_names=['Vehicle', 'Pedestrian', 'Cyclist'],
            data_path=ROOT_DIR,
            save_path=ROOT_DIR,
            processed_data_tag=args.processed_data_tag,
            use_parallel=args.use_parallel, 
            crop_gt_with_tail=not args.wo_crop_gt_with_tail
        )
    else:
        raise NotImplementedError

（2）更改训练源码

位置：OpenPCDet-master/tools/cfgs/dataset_configs/waymo_dataset.yaml

将DATA_PATH改为自己数据集的路径：

4、使用命令进行转换

从tfrecord中提取点云数据并生成数据信息：

python -m pcdet.datasets.waymo.waymo_dataset --func create_waymo_infos --cfg_file tools/cfgs/dataset_configs/waymo_dataset.yaml --data_path /media/xd/hpc/data/waymo_pcdet_mini/

指令格式：

指令格式为：python -m pcdet.datasets.waymo.waymo_dataset --func create_waymo_infos --cfg_file tools/cfgs/dataset_configs/waymo_dataset.yaml --data_path /media/xd/hpc/data/waymo_pcdet_mini/（waymo数据集路径）

5、训练

这里用PV_RCNN++举例，命令为

python train.py --cfg_file /home/xd/xyy/OpenPCDet-master/tools/cfgs/waymo_models/pv_rcnn_plusplus.yaml

四、关于数据集划分大小测验

我选择测试的是PV-RCNN++，官方的精度为：

测试1/8的结果如下（Traning和validation各取1/8）：

测试一半的数据集时（Traning和validation各取一半），测试结果如下：

这时已经很接近官方的结果了，所以最后还是拿全集去测试。

五、遇到问题

1、ModuleNotFoundError: No module named 'numpy.typing'

答：numpy版本太低了，需要安装numpy>=1.21.0。

2、NotImplementedError: Cannot convert a symbolic Tensor (strided_slice:0) to a numpy array. This error may indicate that you're trying to pass a Tensor to a NumPy call, which is not supported

答：

这是一个很恶心的问题，主要是由于tensorflow与numpy版本不匹配，如果你使用的是官方的指令就容易出现这种错误：

pip install waymo_open_dataset-tf-2-1-0

问题描述：这里的tf-2-1-0指的就是在tensorflow2.1环境下运行，而源码中有些地方需要numpy>=2.21.0（例如：av2需要numpy>=2.21.0)。

提升或降低numpy版本：如果升numpy版本，就会遇到现在这个问题，如果降低numpy版本，又会报问题1的错误。

提升或降低tensorflow版本：不管提升还是降低tensorflow版本，都会由于av2过不了evaluation。

因此会陷入死循环，所以建议提升waymo_open_dataset-tf工具版本。

解决方法：把tensorflow版本提高，用如下指令：

pip install waymo_open_dataset-tf-2.11.0==1.5.0

这个指令会自动安装waymo_open_dataset-tf-2.11.0最新版本工具以及对应版本的tensorflow。

注：对于waymo数据集工具箱与tensorflow版本对应关系如下所示：

tensor版本	waymo_open_dataset版本	numpy版本
tensorflow 2.1至 tensorflow 2.6	waymo_open_dataset-tf-2-1-0至waymo_open_dataset-tf-2-6-0	1.19.2及以下
tensorflow 2.11	waymo_open_dataset-tf-2.11.0	1.21.5

中间的tensorflow2.7-2.10目前都是没有开发的，具体可以参考最新手册及案例：

https://github.com/waymo-research/waymo-open-dataset/blob/master/tutorial/tutorial_v2.ipynbhttps://github.com/waymo-research/waymo-open-dataset/blob/master/tutorial/tutorial_v2.ipynb

3、ImportError: cannot import name 'ParamSpec' from 'typing_extensions'

答：将typing_extenstions降级：

pip install typing-extensions==4.3.0

如果低版本安装不了，可以卸载了重新装一个不同版本的就可以。

参考：

ImportError: cannot import name 'ParamSpec' from 'typing_extensions'-编程语言-CSDN问答https://ask.csdn.net/questions/7829856

4、

ImportError: cannot import name 'TypeGuard' from 'typing_extensions'

解决方法：同问题3.

5、AttributeError: module 'numpy.typing' has no attribute 'NDArray'

答：同问题1，numpy版本太低了，需要安装numpy>=1.21.0。

6、AttributeError: module ‘spconv‘ has no attribute ‘SparseModule‘

答：spconv2.x版本太高，需要安装spconv1.2.1进行降级。

CUDA11.x以上建议安装spconv2.x高版本，除非你是CUDA10.2，否则不建议安装spconv1.2.1，过程异常麻烦，且很大概率会在spconv构建环境build时报错：Subprocess.CalledProcessError。

具体可以参考我以前的博客：

九天毕昇”云平台：python3.7+CUDA10.1+torch1.6.0+spconcv1.2.1安装OpenPCDet全流程_空持千百偈，不如吃茶去的博客-CSDN博客主要在云平台上搭建OpenPCDet环境https://blog.csdn.net/weixin_44013732/article/details/126030624

7、ValueError: need at least one array to concatenate

答：原因有很多，路径错误，文件名称错误，或者是train数据集和val数据集没有放到一起去转化。

8、tensorflow.python.framework.errors_impl.NotFoundError: /home/xd/anaconda3/lib/python3.8/site-packages/waymo_open_dataset/metrics/ops/metrics_ops.so: undefined symbol: _ZNK10tensorflow8OpKernel11TraceStringERKNS_15OpKernelContextEb

答：waymo_open_dataset-tf的版本与tensorflow不匹配。

9、ImportError: cannot import name 'detection_metrics' from 'waymo_open_dataset.metrics.python' (unknown location)

答：重装一遍waymo_open_dataset-tf。

10、ZeroDivisionError: float division by zero

问题详细描述：

Traceback (most recent call last):
  File "train.py", line 229, in
    main()
  File "train.py", line 219, in main
    repeat_eval_ckpt(
  File "/home/xyy/OpenPCDet-master/tools/test.py", line 123, in repeat_eval_ckpt
    tb_dict = eval_utils.eval_one_epoch(
  File "/home/xyy/OpenPCDet-master/tools/eval_utils/eval_utils.py", line 95, in eval_one_epoch
    sec_per_example = (time.time() - start_time) / len(dataloader.dataset)
ZeroDivisionError: float division by zero

解决方法：

将val数据集放到/waymo/raw_data下一起转换。

具体解释如下：

首先我们查看一下报错的信息：

ZeroDivisionError: float division by zero代表着分母为0除不尽。

接着，我们来到报错的位置，尝试输出len(dataloader.dataset)：

所以我们可以得出结论，就是dataloader没有读入数据，因此我们开始向上看信息。

查看前面有没有这样的提示：

2023-08-25 19:44:58,443 INFO Total skipped info 202
2023-08-25 19:44:58,443 INFO Total samples for Waymo dataset: 0

信息中说明，跳过了202个，采样到了0个waymo数据。

我们来到total skipped info程序中，位置在：tools/cfgs/dataset_configs/waymo_dataset.py

我们发现，跳过的个数是依靠num_skipped_info 来计数的，当info_path（也就是数据集路径）中没有val.txt(在data/waymo/ImageSets)所提到的数据，那么就计数加1，所以在这里我们输出info_path。

发现跳过的就是val.txt中的文件，经查发现，/raw_data文件夹下确实忘记把val数据集放进去了。

所以结论很明了，我们只需要将val的解压包进行解压，放到val.

11、TFrecord文件错误

问题描述：

在转换waymo的tfrecord遇到这么一个阴间问题，导致我查了很久：

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/data/conda/envs/pcdet/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/xyy/OpenPCDet-master/pcdet/datasets/waymo/waymo_utils.py", line 225, in process_single_sequence
    for cnt, data in enumerate(dataset):
  File "/home/user/.local/lib/python3.8/site-packages/tensorflow/python/data/ops/iterator_ops.py", line 787, in __next__
    return self._next_internal()
  File "/home/user/.local/lib/python3.8/site-packages/tensorflow/python/data/ops/iterator_ops.py", line 770, in _next_internal
    ret = gen_dataset_ops.iterator_get_next(
  File "/home/user/.local/lib/python3.8/site-packages/tensorflow/python/ops/gen_dataset_ops.py", line 3017, in iterator_get_next
    _ops.raise_from_not_ok_status(e, name)
  File "/home/user/.local/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 7215, in raise_from_not_ok_status
    raise core._status_to_exception(e) from None  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.DataLossError: {{function_node __wrapped__IteratorGetNext_output_types_1_device_/job:localhost/replica:0/task:0/device:CPU:0}} corrupted record at 969164982 [Op:IteratorGetNext]
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/data/conda/envs/pcdet/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/data/conda/envs/pcdet/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/xyy/OpenPCDet-master/pcdet/datasets/waymo/waymo_dataset.py", line 831, in
    create_waymo_infos(
  File "/home/xyy/OpenPCDet-master/pcdet/datasets/waymo/waymo_dataset.py", line 744, in create_waymo_infos
    waymo_infos_train = dataset.get_infos(
  File "/home/xyy/OpenPCDet-master/pcdet/datasets/waymo/waymo_dataset.py", line 199, in get_infos
    sequence_infos = list(tqdm(p.imap(process_single_sequence, sample_sequence_file_list),
  File "/data/conda/envs/pcdet/lib/python3.8/site-packages/tqdm/std.py", line 1178, in __iter__
    for obj in iterable:
  File "/data/conda/envs/pcdet/lib/python3.8/multiprocessing/pool.py", line 868, in next
    raise value
tensorflow.python.framework.errors_impl.DataLossError: {{function_node __wrapped__IteratorGetNext_output_types_1_device_/job:localhost/replica:0/task:0/device:CPU:0}} corrupted record at 969164982 [Op:IteratorGetNext]

问题原因：就是官方数据集在转换tfrecord时出了问题，这个现象在v1.4.1更为常见，因此，我们最主要的目的就是查出问题文件并删除。

省事方法：

将名为segment-9175749307679169289_5933_260_5953_260_with_camera_labels.tfrecord文件删除（或先放到一个新建的文件夹里）。

详细处理步骤：

根据问题描述，我们先来到 /OpenPCDet-master/pcdet/datasets/waymo/waymo_utils.py文件的225行：

我们在代码上面输出一下当前转换的tfrecord文件：

接着看下一个问题，定位到了读条函数那里，我们来到/OpenPCDet-master/pcdet/datasets/waymo/waymo_dataset.py的199行：

我们将读条函数的num_workers改为1，让他一个一个转换，看具体是哪个出了问题：

最后再运行一下生成数据指令：

python -m pcdet.datasets.waymo.waymo_dataset --func create_waymo_infos --cfg_file tools/cfgs/dataset_configs/waymo_dataset.yaml

查看到的结果如下：

因此，我们将这个文件拿去即可接着转换：

没问题喽~。

修改源码如下：

/OpenPCDet-master/pcdet/datasets/waymo/waymo_dataset.py:

# OpenPCDet PyTorch Dataloader and Evaluation Tools for Waymo Open Dataset
# Reference https://github.com/open-mmlab/OpenPCDet
# Written by Shaoshuai Shi, Chaoxu Guo
# All Rights Reserved.

import os
import pickle
import copy
import numpy as np
import torch
import multiprocessing
import SharedArray
import torch.distributed as dist
from tqdm import tqdm
from pathlib import Path
from functools import partial

from ...ops.roiaware_pool3d import roiaware_pool3d_utils
from ...utils import box_utils, common_utils
from ..dataset import DatasetTemplate


class WaymoDataset(DatasetTemplate):
    def __init__(self, dataset_cfg, class_names, training=True, root_path=None, logger=None):
        super().__init__(
            dataset_cfg=dataset_cfg, class_names=class_names, training=training, root_path=root_path, logger=logger
        )
        self.data_path = self.root_path / self.dataset_cfg.PROCESSED_DATA_TAG
        self.split = self.dataset_cfg.DATA_SPLIT[self.mode]
        split_dir = self.root_path / 'ImageSets' / (self.split + '.txt')
        self.sample_sequence_list = [x.strip() for x in open(split_dir).readlines()]

        self.infos = []
        self.seq_name_to_infos = self.include_waymo_data(self.mode)

        self.use_shared_memory = self.dataset_cfg.get('USE_SHARED_MEMORY', False) and self.training
        if self.use_shared_memory:
            self.shared_memory_file_limit = self.dataset_cfg.get('SHARED_MEMORY_FILE_LIMIT', 0x7FFFFFFF)
            self.load_data_to_shared_memory()

        if self.dataset_cfg.get('USE_PREDBOX', False):
            self.pred_boxes_dict = self.load_pred_boxes_to_dict(
                pred_boxes_path=self.dataset_cfg.ROI_BOXES_PATH[self.mode]
            )
        else:
            self.pred_boxes_dict = {}

    def set_split(self, split):
        super().__init__(
            dataset_cfg=self.dataset_cfg, class_names=self.class_names, training=self.training,
            root_path=self.root_path, logger=self.logger
        )
        self.split = split
        split_dir = self.root_path / 'ImageSets' / (self.split + '.txt')
        self.sample_sequence_list = [x.strip() for x in open(split_dir).readlines()]
        self.infos = []
        self.seq_name_to_infos = self.include_waymo_data(self.mode)

    def include_waymo_data(self, mode):
        self.logger.info('Loading Waymo dataset')
        waymo_infos = []
        seq_name_to_infos = {}

        num_skipped_infos = 0
        for k in range(len(self.sample_sequence_list)):
            sequence_name = os.path.splitext(self.sample_sequence_list[k])[0]
            info_path = self.data_path / sequence_name / ('%s.pkl' % sequence_name)
            info_path = self.check_sequence_name_with_all_version(info_path)
            # print(info_path)
            if not info_path.exists():
                num_skipped_infos += 1
                continue
            with open(info_path, 'rb') as f:
                infos = pickle.load(f)
                waymo_infos.extend(infos)

            seq_name_to_infos[infos[0]['point_cloud']['lidar_sequence']] = infos

        self.infos.extend(waymo_infos[:])
        self.logger.info('Total skipped info %s' % num_skipped_infos)
        self.logger.info('Total samples for Waymo dataset: %d' % (len(waymo_infos)))

        if self.dataset_cfg.SAMPLED_INTERVAL[mode] > 1:
            sampled_waymo_infos = []
            for k in range(0, len(self.infos), self.dataset_cfg.SAMPLED_INTERVAL[mode]):
                sampled_waymo_infos.append(self.infos[k])
            self.infos = sampled_waymo_infos
            self.logger.info('Total sampled samples for Waymo dataset: %d' % len(self.infos))

        use_sequence_data = self.dataset_cfg.get('SEQUENCE_CONFIG',
                                                 None) is not None and self.dataset_cfg.SEQUENCE_CONFIG.ENABLED
        if not use_sequence_data:
            seq_name_to_infos = None
        return seq_name_to_infos

    def load_pred_boxes_to_dict(self, pred_boxes_path):
        self.logger.info(f'Loading and reorganizing pred_boxes to dict from path: {pred_boxes_path}')
        with open(pred_boxes_path, 'rb') as f:
            pred_dicts = pickle.load(f)

        pred_boxes_dict = {}
        for index, box_dict in enumerate(pred_dicts):
            seq_name = box_dict['frame_id'][:-4].replace('training_', '').replace('validation_', '')
            sample_idx = int(box_dict['frame_id'][-3:])

            if seq_name not in pred_boxes_dict:
                pred_boxes_dict[seq_name] = {}

            pred_labels = np.array(
                [self.class_names.index(box_dict['name'][k]) + 1 for k in range(box_dict['name'].shape[0])])
            pred_boxes = np.concatenate(
                (box_dict['boxes_lidar'], box_dict['score'][:, np.newaxis], pred_labels[:, np.newaxis]), axis=-1)
            pred_boxes_dict[seq_name][sample_idx] = pred_boxes

        self.logger.info(f'Predicted boxes has been loaded, total sequences: {len(pred_boxes_dict)}')
        return pred_boxes_dict

    def load_data_to_shared_memory(self):
        self.logger.info(f'Loading training data to shared memory (file limit={self.shared_memory_file_limit})')

        cur_rank, num_gpus = common_utils.get_dist_info()
        all_infos = self.infos[:self.shared_memory_file_limit] \
            if self.shared_memory_file_limit < len(self.infos) else self.infos
        cur_infos = all_infos[cur_rank::num_gpus]
        for info in cur_infos:
            pc_info = info['point_cloud']
            sequence_name = pc_info['lidar_sequence']
            sample_idx = pc_info['sample_idx']

            sa_key = f'{sequence_name}___{sample_idx}'
            if os.path.exists(f"/dev/shm/{sa_key}"):
                continue

            points = self.get_lidar(sequence_name, sample_idx)
            common_utils.sa_create(f"shm://{sa_key}", points)

        dist.barrier()
        self.logger.info('Training data has been saved to shared memory')

    def clean_shared_memory(self):
        self.logger.info(f'Clean training data from shared memory (file limit={self.shared_memory_file_limit})')

        cur_rank, num_gpus = common_utils.get_dist_info()
        all_infos = self.infos[:self.shared_memory_file_limit] \
            if self.shared_memory_file_limit < len(self.infos) else self.infos
        cur_infos = all_infos[cur_rank::num_gpus]
        for info in cur_infos:
            pc_info = info['point_cloud']
            sequence_name = pc_info['lidar_sequence']
            sample_idx = pc_info['sample_idx']

            sa_key = f'{sequence_name}___{sample_idx}'
            if not os.path.exists(f"/dev/shm/{sa_key}"):
                continue

            SharedArray.delete(f"shm://{sa_key}")

        if num_gpus > 1:
            dist.barrier()
        self.logger.info('Training data has been deleted from shared memory')

    @staticmethod
    def check_sequence_name_with_all_version(sequence_file):
        if not sequence_file.exists():
            found_sequence_file = sequence_file
            for pre_text in ['training', 'validation', 'testing']:
                if not sequence_file.exists():
                    temp_sequence_file = Path(str(sequence_file).replace('segment', pre_text + '_segment'))
                    if temp_sequence_file.exists():
                        found_sequence_file = temp_sequence_file
                        break
            if not found_sequence_file.exists():
                found_sequence_file = Path(str(sequence_file).replace('_with_camera_labels', ''))
            if found_sequence_file.exists():
                sequence_file = found_sequence_file
        return sequence_file

    def get_infos(self, raw_data_path, save_path, num_workers=multiprocessing.cpu_count(), has_label=True,
                  sampled_interval=1, update_info_only=False):
        from . import waymo_utils
        print('---------------The waymo sample interval is %d, total sequecnes is %d-----------------'
              % (sampled_interval, len(self.sample_sequence_list)))

        process_single_sequence = partial(
            waymo_utils.process_single_sequence,
            save_path=save_path, sampled_interval=sampled_interval, has_label=has_label,
            update_info_only=update_info_only
        )
        sample_sequence_file_list = [
            self.check_sequence_name_with_all_version(raw_data_path / sequence_file)
            for sequence_file in self.sample_sequence_list
        ]

        # process_single_sequence(sample_sequence_file_list[0])
        # 读条
        num_workers=1
        with multiprocessing.Pool(num_workers) as p:
            sequence_infos = list(tqdm(p.imap(process_single_sequence, sample_sequence_file_list),
                                       total=len(sample_sequence_file_list)))

        all_sequences_infos = [item for infos in sequence_infos for item in infos]
        return all_sequences_infos

    def get_lidar(self, sequence_name, sample_idx):
        lidar_file = self.data_path / sequence_name / ('%04d.npy' % sample_idx)
        point_features = np.load(lidar_file)  # (N, 7): [x, y, z, intensity, elongation, NLZ_flag]

        points_all, NLZ_flag = point_features[:, 0:5], point_features[:, 5]
        if not self.dataset_cfg.get('DISABLE_NLZ_FLAG_ON_POINTS', False):
            points_all = points_all[NLZ_flag == -1]
        points_all[:, 3] = np.tanh(points_all[:, 3])
        return points_all

    @staticmethod
    def transform_prebox_to_current(pred_boxes3d, pose_pre, pose_cur):
        """
        Args:
            pred_boxes3d (N, 9 or 11): [x, y, z, dx, dy, dz, raw,  score, label]
            pose_pre (4, 4):
            pose_cur (4, 4):
        Returns:
        """
        assert pred_boxes3d.shape[-1] in [9, 11]
        pred_boxes3d = pred_boxes3d.copy()
        expand_bboxes = np.concatenate([pred_boxes3d[:, :3], np.ones((pred_boxes3d.shape[0], 1))], axis=-1)

        bboxes_global = np.dot(expand_bboxes, pose_pre.T)[:, :3]
        expand_bboxes_global = np.concatenate([bboxes_global[:, :3], np.ones((bboxes_global.shape[0], 1))], axis=-1)
        bboxes_pre2cur = np.dot(expand_bboxes_global, np.linalg.inv(pose_cur.T))[:, :3]
        pred_boxes3d[:, 0:3] = bboxes_pre2cur

        if pred_boxes3d.shape[-1] == 11:
            expand_vels = np.concatenate([pred_boxes3d[:, 7:9], np.zeros((pred_boxes3d.shape[0], 1))], axis=-1)
            vels_global = np.dot(expand_vels, pose_pre[:3, :3].T)
            vels_pre2cur = np.dot(vels_global, np.linalg.inv(pose_cur[:3, :3].T))[:, :2]
            pred_boxes3d[:, 7:9] = vels_pre2cur

        pred_boxes3d[:, 6] = pred_boxes3d[..., 6] + np.arctan2(pose_pre[..., 1, 0], pose_pre[..., 0, 0])
        pred_boxes3d[:, 6] = pred_boxes3d[..., 6] - np.arctan2(pose_cur[..., 1, 0], pose_cur[..., 0, 0])
        return pred_boxes3d

    @staticmethod
    def reorder_rois_for_refining(pred_bboxes):
        num_max_rois = max([len(bbox) for bbox in pred_bboxes])
        num_max_rois = max(1, num_max_rois)  # at least one faked rois to avoid error
        ordered_bboxes = np.zeros([len(pred_bboxes), num_max_rois, pred_bboxes[0].shape[-1]], dtype=np.float32)

        for bs_idx in range(ordered_bboxes.shape[0]):
            ordered_bboxes[bs_idx, :len(pred_bboxes[bs_idx])] = pred_bboxes[bs_idx]
        return ordered_bboxes

    def get_sequence_data(self, info, points, sequence_name, sample_idx, sequence_cfg, load_pred_boxes=False):
        """
        Args:
            info:
            points:
            sequence_name:
            sample_idx:
            sequence_cfg:
        Returns:
        """

        def remove_ego_points(points, center_radius=1.0):
            mask = ~((np.abs(points[:, 0]) < center_radius) & (np.abs(points[:, 1]) < center_radius))
            return points[mask]

        def load_pred_boxes_from_dict(sequence_name, sample_idx):
            """
            boxes: (N, 11)  [x, y, z, dx, dy, dn, raw, vx, vy, score, label]
            """
            sequence_name = sequence_name.replace('training_', '').replace('validation_', '')
            load_boxes = self.pred_boxes_dict[sequence_name][sample_idx]
            assert load_boxes.shape[-1] == 11
            load_boxes[:, 7:9] = -0.1 * load_boxes[:, 7:9]  # transfer speed to negtive motion from t to t-1
            return load_boxes

        pose_cur = info['pose'].reshape((4, 4))
        num_pts_cur = points.shape[0]
        sample_idx_pre_list = np.clip(
            sample_idx + np.arange(sequence_cfg.SAMPLE_OFFSET[0], sequence_cfg.SAMPLE_OFFSET[1]), 0, 0x7FFFFFFF)
        sample_idx_pre_list = sample_idx_pre_list[::-1]

        if sequence_cfg.get('ONEHOT_TIMESTAMP', False):
            onehot_cur = np.zeros((points.shape[0], len(sample_idx_pre_list) + 1)).astype(points.dtype)
            onehot_cur[:, 0] = 1
            points = np.hstack([points, onehot_cur])
        else:
            points = np.hstack([points, np.zeros((points.shape[0], 1)).astype(points.dtype)])
        points_pre_all = []
        num_points_pre = []

        pose_all = [pose_cur]
        pred_boxes_all = []
        if load_pred_boxes:
            pred_boxes = load_pred_boxes_from_dict(sequence_name, sample_idx)
            pred_boxes_all.append(pred_boxes)

        sequence_info = self.seq_name_to_infos[sequence_name]

        for idx, sample_idx_pre in enumerate(sample_idx_pre_list):

            points_pre = self.get_lidar(sequence_name, sample_idx_pre)
            pose_pre = sequence_info[sample_idx_pre]['pose'].reshape((4, 4))
            expand_points_pre = np.concatenate([points_pre[:, :3], np.ones((points_pre.shape[0], 1))], axis=-1)
            points_pre_global = np.dot(expand_points_pre, pose_pre.T)[:, :3]
            expand_points_pre_global = np.concatenate([points_pre_global, np.ones((points_pre_global.shape[0], 1))],
                                                      axis=-1)
            points_pre2cur = np.dot(expand_points_pre_global, np.linalg.inv(pose_cur.T))[:, :3]
            points_pre = np.concatenate([points_pre2cur, points_pre[:, 3:]], axis=-1)
            if sequence_cfg.get('ONEHOT_TIMESTAMP', False):
                onehot_vector = np.zeros((points_pre.shape[0], len(sample_idx_pre_list) + 1))
                onehot_vector[:, idx + 1] = 1
                points_pre = np.hstack([points_pre, onehot_vector])
            else:
                # add timestamp
                points_pre = np.hstack([points_pre,
                                        0.1 * (sample_idx - sample_idx_pre) * np.ones((points_pre.shape[0], 1)).astype(
                                            points_pre.dtype)])  # one frame 0.1s
            points_pre = remove_ego_points(points_pre, 1.0)
            points_pre_all.append(points_pre)
            num_points_pre.append(points_pre.shape[0])
            pose_all.append(pose_pre)

            if load_pred_boxes:
                pose_pre = sequence_info[sample_idx_pre]['pose'].reshape((4, 4))
                pred_boxes = load_pred_boxes_from_dict(sequence_name, sample_idx_pre)
                pred_boxes = self.transform_prebox_to_current(pred_boxes, pose_pre, pose_cur)
                pred_boxes_all.append(pred_boxes)

        points = np.concatenate([points] + points_pre_all, axis=0).astype(np.float32)
        num_points_all = np.array([num_pts_cur] + num_points_pre).astype(np.int32)
        poses = np.concatenate(pose_all, axis=0).astype(np.float32)

        if load_pred_boxes:
            temp_pred_boxes = self.reorder_rois_for_refining(pred_boxes_all)
            pred_boxes = temp_pred_boxes[:, :, 0:9]
            pred_scores = temp_pred_boxes[:, :, 9]
            pred_labels = temp_pred_boxes[:, :, 10]
        else:
            pred_boxes = pred_scores = pred_labels = None

        return points, num_points_all, sample_idx_pre_list, poses, pred_boxes, pred_scores, pred_labels

    def __len__(self):
        if self._merge_all_iters_to_one_epoch:
            return len(self.infos) * self.total_epochs

        return len(self.infos)

    def __getitem__(self, index):
        if self._merge_all_iters_to_one_epoch:
            index = index % len(self.infos)

        info = copy.deepcopy(self.infos[index])
        pc_info = info['point_cloud']
        sequence_name = pc_info['lidar_sequence']
        sample_idx = pc_info['sample_idx']
        input_dict = {
            'sample_idx': sample_idx
        }
        if self.use_shared_memory and index < self.shared_memory_file_limit:
            sa_key = f'{sequence_name}___{sample_idx}'
            points = SharedArray.attach(f"shm://{sa_key}").copy()
        else:
            points = self.get_lidar(sequence_name, sample_idx)

        if self.dataset_cfg.get('SEQUENCE_CONFIG', None) is not None and self.dataset_cfg.SEQUENCE_CONFIG.ENABLED:
            points, num_points_all, sample_idx_pre_list, poses, pred_boxes, pred_scores, pred_labels = self.get_sequence_data(
                info, points, sequence_name, sample_idx, self.dataset_cfg.SEQUENCE_CONFIG,
                load_pred_boxes=self.dataset_cfg.get('USE_PREDBOX', False)
            )
            input_dict['poses'] = poses
            if self.dataset_cfg.get('USE_PREDBOX', False):
                input_dict.update({
                    'roi_boxes': pred_boxes,
                    'roi_scores': pred_scores,
                    'roi_labels': pred_labels,
                })

        input_dict.update({
            'points': points,
            'frame_id': info['frame_id'],
        })

        if 'annos' in info:
            annos = info['annos']
            annos = common_utils.drop_info_with_name(annos, name='unknown')

            if self.dataset_cfg.get('INFO_WITH_FAKELIDAR', False):
                gt_boxes_lidar = box_utils.boxes3d_kitti_fakelidar_to_lidar(annos['gt_boxes_lidar'])
            else:
                gt_boxes_lidar = annos['gt_boxes_lidar']

            if self.dataset_cfg.get('TRAIN_WITH_SPEED', False):
                assert gt_boxes_lidar.shape[-1] == 9
            else:
                gt_boxes_lidar = gt_boxes_lidar[:, 0:7]

            if self.training and self.dataset_cfg.get('FILTER_EMPTY_BOXES_FOR_TRAIN', False):
                mask = (annos['num_points_in_gt'] > 0)  # filter empty boxes
                annos['name'] = annos['name'][mask]
                gt_boxes_lidar = gt_boxes_lidar[mask]
                annos['num_points_in_gt'] = annos['num_points_in_gt'][mask]

            input_dict.update({
                'gt_names': annos['name'],
                'gt_boxes': gt_boxes_lidar,
                'num_points_in_gt': annos.get('num_points_in_gt', None)
            })

        data_dict = self.prepare_data(data_dict=input_dict)
        data_dict['metadata'] = info.get('metadata', info['frame_id'])
        data_dict.pop('num_points_in_gt', None)
        return data_dict

    def evaluation(self, det_annos, class_names, **kwargs):
        if 'annos' not in self.infos[0].keys():
            return 'No ground-truth boxes for evaluation', {}

        def kitti_eval(eval_det_annos, eval_gt_annos):
            from ..kitti.kitti_object_eval_python import eval as kitti_eval
            from ..kitti import kitti_utils

            map_name_to_kitti = {
                'Vehicle': 'Car',
                'Pedestrian': 'Pedestrian',
                'Cyclist': 'Cyclist',
                'Sign': 'Sign',
                'Car': 'Car'
            }
            kitti_utils.transform_annotations_to_kitti_format(eval_det_annos, map_name_to_kitti=map_name_to_kitti)
            kitti_utils.transform_annotations_to_kitti_format(
                eval_gt_annos, map_name_to_kitti=map_name_to_kitti,
                info_with_fakelidar=self.dataset_cfg.get('INFO_WITH_FAKELIDAR', False)
            )
            kitti_class_names = [map_name_to_kitti[x] for x in class_names]
            ap_result_str, ap_dict = kitti_eval.get_official_eval_result(
                gt_annos=eval_gt_annos, dt_annos=eval_det_annos, current_classes=kitti_class_names
            )
            return ap_result_str, ap_dict

        def waymo_eval(eval_det_annos, eval_gt_annos):
            from .waymo_eval import OpenPCDetWaymoDetectionMetricsEstimator
            eval = OpenPCDetWaymoDetectionMetricsEstimator()

            ap_dict = eval.waymo_evaluation(
                eval_det_annos, eval_gt_annos, class_name=class_names,
                distance_thresh=1000, fake_gt_infos=self.dataset_cfg.get('INFO_WITH_FAKELIDAR', False)
            )
            ap_result_str = '\n'
            for key in ap_dict:
                ap_dict[key] = ap_dict[key][0]
                ap_result_str += '%s: %.4f \n' % (key, ap_dict[key])

            return ap_result_str, ap_dict

        eval_det_annos = copy.deepcopy(det_annos)
        eval_gt_annos = [copy.deepcopy(info['annos']) for info in self.infos]

        if kwargs['eval_metric'] == 'kitti':
            ap_result_str, ap_dict = kitti_eval(eval_det_annos, eval_gt_annos)
        elif kwargs['eval_metric'] == 'waymo':
            ap_result_str, ap_dict = waymo_eval(eval_det_annos, eval_gt_annos)
        else:
            raise NotImplementedError

        return ap_result_str, ap_dict

    def create_groundtruth_database(self, info_path, save_path, used_classes=None, split='train', sampled_interval=10,
                                    processed_data_tag=None):

        use_sequence_data = self.dataset_cfg.get('SEQUENCE_CONFIG',
                                                 None) is not None and self.dataset_cfg.SEQUENCE_CONFIG.ENABLED

        if use_sequence_data:
            st_frame, ed_frame = self.dataset_cfg.SEQUENCE_CONFIG.SAMPLE_OFFSET[0], \
            self.dataset_cfg.SEQUENCE_CONFIG.SAMPLE_OFFSET[1]
            self.dataset_cfg.SEQUENCE_CONFIG.SAMPLE_OFFSET[0] = min(-4,
                                                                    st_frame)  # at least we use 5 frames for generating gt database to support various sequence configs (<= 5 frames)
            st_frame = self.dataset_cfg.SEQUENCE_CONFIG.SAMPLE_OFFSET[0]
            database_save_path = save_path / ('%s_gt_database_%s_sampled_%d_multiframe_%s_to_%s' % (
            processed_data_tag, split, sampled_interval, st_frame, ed_frame))
            db_info_save_path = save_path / ('%s_waymo_dbinfos_%s_sampled_%d_multiframe_%s_to_%s.pkl' % (
            processed_data_tag, split, sampled_interval, st_frame, ed_frame))
            db_data_save_path = save_path / ('%s_gt_database_%s_sampled_%d_multiframe_%s_to_%s_global.npy' % (
            processed_data_tag, split, sampled_interval, st_frame, ed_frame))
        else:
            database_save_path = save_path / (
                        '%s_gt_database_%s_sampled_%d' % (processed_data_tag, split, sampled_interval))
            db_info_save_path = save_path / (
                        '%s_waymo_dbinfos_%s_sampled_%d.pkl' % (processed_data_tag, split, sampled_interval))
            db_data_save_path = save_path / (
                        '%s_gt_database_%s_sampled_%d_global.npy' % (processed_data_tag, split, sampled_interval))

        database_save_path.mkdir(parents=True, exist_ok=True)
        all_db_infos = {}
        with open(info_path, 'rb') as f:
            infos = pickle.load(f)

        point_offset_cnt = 0
        stacked_gt_points = []
        for k in tqdm(range(0, len(infos), sampled_interval)):
            # print('gt_database sample: %d/%d' % (k + 1, len(infos)))
            info = infos[k]

            pc_info = info['point_cloud']
            sequence_name = pc_info['lidar_sequence']
            sample_idx = pc_info['sample_idx']
            points = self.get_lidar(sequence_name, sample_idx)

            if use_sequence_data:
                points, num_points_all, sample_idx_pre_list, _, _, _, _ = self.get_sequence_data(
                    info, points, sequence_name, sample_idx, self.dataset_cfg.SEQUENCE_CONFIG
                )

            annos = info['annos']
            names = annos['name']
            difficulty = annos['difficulty']
            gt_boxes = annos['gt_boxes_lidar']

            if k % 4 != 0 and len(names) > 0:
                mask = (names == 'Vehicle')
                names = names[~mask]
                difficulty = difficulty[~mask]
                gt_boxes = gt_boxes[~mask]

            if k % 2 != 0 and len(names) > 0:
                mask = (names == 'Pedestrian')
                names = names[~mask]
                difficulty = difficulty[~mask]
                gt_boxes = gt_boxes[~mask]

            num_obj = gt_boxes.shape[0]
            if num_obj == 0:
                continue

            box_idxs_of_pts = roiaware_pool3d_utils.points_in_boxes_gpu(
                torch.from_numpy(points[:, 0:3]).unsqueeze(dim=0).float().cuda(),
                torch.from_numpy(gt_boxes[:, 0:7]).unsqueeze(dim=0).float().cuda()
            ).long().squeeze(dim=0).cpu().numpy()

            for i in range(num_obj):
                filename = '%s_%04d_%s_%d.bin' % (sequence_name, sample_idx, names[i], i)
                filepath = database_save_path / filename
                gt_points = points[box_idxs_of_pts == i]
                gt_points[:, :3] -= gt_boxes[i, :3]

                if (used_classes is None) or names[i] in used_classes:
                    gt_points = gt_points.astype(np.float32)
                    assert gt_points.dtype == np.float32
                    with open(filepath, 'w') as f:
                        gt_points.tofile(f)

                    db_path = str(filepath.relative_to(self.root_path))  # gt_database/xxxxx.bin
                    db_info = {'name': names[i], 'path': db_path, 'sequence_name': sequence_name,
                               'sample_idx': sample_idx, 'gt_idx': i, 'box3d_lidar': gt_boxes[i],
                               'num_points_in_gt': gt_points.shape[0], 'difficulty': difficulty[i]}

                    # it will be used if you choose to use shared memory for gt sampling
                    stacked_gt_points.append(gt_points)
                    db_info['global_data_offset'] = [point_offset_cnt, point_offset_cnt + gt_points.shape[0]]
                    point_offset_cnt += gt_points.shape[0]

                    if names[i] in all_db_infos:
                        all_db_infos[names[i]].append(db_info)
                    else:
                        all_db_infos[names[i]] = [db_info]
        for k, v in all_db_infos.items():
            print('Database %s: %d' % (k, len(v)))

        with open(db_info_save_path, 'wb') as f:
            pickle.dump(all_db_infos, f)

        # it will be used if you choose to use shared memory for gt sampling
        stacked_gt_points = np.concatenate(stacked_gt_points, axis=0)
        np.save(db_data_save_path, stacked_gt_points)

    def create_gt_database_of_single_scene(self, info_with_idx, database_save_path=None, use_sequence_data=False,
                                           used_classes=None,
                                           total_samples=0, use_cuda=False, crop_gt_with_tail=False):
        info, info_idx = info_with_idx
        print('gt_database sample: %d/%d' % (info_idx, total_samples))

        all_db_infos = {}
        pc_info = info['point_cloud']
        sequence_name = pc_info['lidar_sequence']
        sample_idx = pc_info['sample_idx']
        points = self.get_lidar(sequence_name, sample_idx)

        if use_sequence_data:
            points, num_points_all, sample_idx_pre_list, _, _, _, _ = self.get_sequence_data(
                info, points, sequence_name, sample_idx, self.dataset_cfg.SEQUENCE_CONFIG
            )

        annos = info['annos']
        names = annos['name']
        difficulty = annos['difficulty']
        gt_boxes = annos['gt_boxes_lidar']

        if info_idx % 4 != 0 and len(names) > 0:
            mask = (names == 'Vehicle')
            names = names[~mask]
            difficulty = difficulty[~mask]
            gt_boxes = gt_boxes[~mask]

        if info_idx % 2 != 0 and len(names) > 0:
            mask = (names == 'Pedestrian')
            names = names[~mask]
            difficulty = difficulty[~mask]
            gt_boxes = gt_boxes[~mask]

        num_obj = gt_boxes.shape[0]
        if num_obj == 0:
            return {}

        if use_sequence_data and crop_gt_with_tail:
            assert gt_boxes.shape[1] == 9
            speed = gt_boxes[:, 7:9]
            sequence_cfg = self.dataset_cfg.SEQUENCE_CONFIG
            assert sequence_cfg.SAMPLE_OFFSET[1] == 0
            assert sequence_cfg.SAMPLE_OFFSET[0] < 0
            num_frames = sequence_cfg.SAMPLE_OFFSET[1] - sequence_cfg.SAMPLE_OFFSET[0] + 1
            assert num_frames > 1
            latest_center = gt_boxes[:, 0:2]
            oldest_center = latest_center - speed * (num_frames - 1) * 0.1
            new_center = (latest_center + oldest_center) * 0.5
            new_length = gt_boxes[:, 3] + np.linalg.norm(latest_center - oldest_center, axis=-1)
            gt_boxes_crop = gt_boxes.copy()
            gt_boxes_crop[:, 0:2] = new_center
            gt_boxes_crop[:, 3] = new_length

        else:
            gt_boxes_crop = gt_boxes

        if use_cuda:
            box_idxs_of_pts = roiaware_pool3d_utils.points_in_boxes_gpu(
                torch.from_numpy(points[:, 0:3]).unsqueeze(dim=0).float().cuda(),
                torch.from_numpy(gt_boxes_crop[:, 0:7]).unsqueeze(dim=0).float().cuda()
            ).long().squeeze(dim=0).cpu().numpy()
        else:
            box_point_mask = roiaware_pool3d_utils.points_in_boxes_cpu(
                torch.from_numpy(points[:, 0:3]).float(),
                torch.from_numpy(gt_boxes_crop[:, 0:7]).float()
            ).long().numpy()  # (num_boxes, num_points)

        for i in range(num_obj):
            filename = '%s_%04d_%s_%d.bin' % (sequence_name, sample_idx, names[i], i)
            filepath = database_save_path / filename
            if use_cuda:
                gt_points = points[box_idxs_of_pts == i]
            else:
                gt_points = points[box_point_mask[i] > 0]

            gt_points[:, :3] -= gt_boxes[i, :3]

            if (used_classes is None) or names[i] in used_classes:
                gt_points = gt_points.astype(np.float32)
                assert gt_points.dtype == np.float32
                with open(filepath, 'w') as f:
                    gt_points.tofile(f)

                db_path = str(filepath.relative_to(self.root_path))  # gt_database/xxxxx.bin
                db_info = {'name': names[i], 'path': db_path, 'sequence_name': sequence_name,
                           'sample_idx': sample_idx, 'gt_idx': i, 'box3d_lidar': gt_boxes[i],
                           'num_points_in_gt': gt_points.shape[0], 'difficulty': difficulty[i],
                           'box3d_crop': gt_boxes_crop[i]}

                if names[i] in all_db_infos:
                    all_db_infos[names[i]].append(db_info)
                else:
                    all_db_infos[names[i]] = [db_info]
        return all_db_infos

    def create_groundtruth_database_parallel(self, info_path, save_path, used_classes=None, split='train',
                                             sampled_interval=10,
                                             processed_data_tag=None, num_workers=16, crop_gt_with_tail=False):
        use_sequence_data = self.dataset_cfg.get('SEQUENCE_CONFIG',
                                                 None) is not None and self.dataset_cfg.SEQUENCE_CONFIG.ENABLED
        if use_sequence_data:
            st_frame, ed_frame = self.dataset_cfg.SEQUENCE_CONFIG.SAMPLE_OFFSET[0], \
            self.dataset_cfg.SEQUENCE_CONFIG.SAMPLE_OFFSET[1]
            self.dataset_cfg.SEQUENCE_CONFIG.SAMPLE_OFFSET[0] = min(-4,
                                                                    st_frame)  # at least we use 5 frames for generating gt database to support various sequence configs (<= 5 frames)
            st_frame = self.dataset_cfg.SEQUENCE_CONFIG.SAMPLE_OFFSET[0]
            database_save_path = save_path / ('%s_gt_database_%s_sampled_%d_multiframe_%s_to_%s_%sparallel' % (
            processed_data_tag, split, sampled_interval, st_frame, ed_frame, 'tail_' if crop_gt_with_tail else ''))
            db_info_save_path = save_path / ('%s_waymo_dbinfos_%s_sampled_%d_multiframe_%s_to_%s_%sparallel.pkl' % (
            processed_data_tag, split, sampled_interval, st_frame, ed_frame, 'tail_' if crop_gt_with_tail else ''))
        else:
            database_save_path = save_path / (
                        '%s_gt_database_%s_sampled_%d_parallel' % (processed_data_tag, split, sampled_interval))
            db_info_save_path = save_path / (
                        '%s_waymo_dbinfos_%s_sampled_%d_parallel.pkl' % (processed_data_tag, split, sampled_interval))

        database_save_path.mkdir(parents=True, exist_ok=True)

        with open(info_path, 'rb') as f:
            infos = pickle.load(f)

        print(f'Number workers: {num_workers}')
        create_gt_database_of_single_scene = partial(
            self.create_gt_database_of_single_scene,
            use_sequence_data=use_sequence_data, database_save_path=database_save_path,
            used_classes=used_classes, total_samples=len(infos), use_cuda=False,
            crop_gt_with_tail=crop_gt_with_tail
        )
        # create_gt_database_of_single_scene((infos[300], 0))
        with multiprocessing.Pool(num_workers) as p:
            all_db_infos_list = list(p.map(create_gt_database_of_single_scene, zip(infos, np.arange(len(infos)))))

        all_db_infos = {}

        for cur_db_infos in all_db_infos_list:
            for key, val in cur_db_infos.items():
                if key not in all_db_infos:
                    all_db_infos[key] = val
                else:
                    all_db_infos[key].extend(val)

        for k, v in all_db_infos.items():
            print('Database %s: %d' % (k, len(v)))

        with open(db_info_save_path, 'wb') as f:
            pickle.dump(all_db_infos, f)


def create_waymo_infos(dataset_cfg, class_names, data_path, save_path,
                       raw_data_tag='raw_data', processed_data_tag='waymo_processed_data',
                       workers=min(16, multiprocessing.cpu_count()), update_info_only=False):
    dataset = WaymoDataset(
        dataset_cfg=dataset_cfg, class_names=class_names, root_path=data_path,
        training=False, logger=common_utils.create_logger()
    )
    train_split, val_split = 'train', 'val'

    train_filename = save_path / ('%s_infos_%s.pkl' % (processed_data_tag, train_split))
    val_filename = save_path / ('%s_infos_%s.pkl' % (processed_data_tag, val_split))

    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
    print('---------------Start to generate data infos---------------')

    dataset.set_split(train_split)
    waymo_infos_train = dataset.get_infos(
        raw_data_path=data_path / raw_data_tag,
        save_path=save_path / processed_data_tag, num_workers=workers, has_label=True,
        sampled_interval=1, update_info_only=update_info_only
    )
    with open(train_filename, 'wb') as f:
        pickle.dump(waymo_infos_train, f)
    print('----------------Waymo info train file is saved to %s----------------' % train_filename)

    dataset.set_split(val_split)
    waymo_infos_val = dataset.get_infos(
        raw_data_path=data_path / raw_data_tag,
        save_path=save_path / processed_data_tag, num_workers=workers, has_label=True,
        sampled_interval=1, update_info_only=update_info_only
    )
    with open(val_filename, 'wb') as f:
        pickle.dump(waymo_infos_val, f)
    print('----------------Waymo info val file is saved to %s----------------' % val_filename)

    if update_info_only:
        return

    print('---------------Start create groundtruth database for data augmentation---------------')
    os.environ["CUDA_VISIBLE_DEVICES"] = "0"
    dataset.set_split(train_split)
    dataset.create_groundtruth_database(
        info_path=train_filename, save_path=save_path, split='train', sampled_interval=1,
        used_classes=['Vehicle', 'Pedestrian', 'Cyclist'], processed_data_tag=processed_data_tag
    )
    print('---------------Data preparation Done---------------')


def create_waymo_gt_database(
        dataset_cfg, class_names, data_path, save_path, processed_data_tag='waymo_processed_data',
        workers=min(16, multiprocessing.cpu_count()), use_parallel=False, crop_gt_with_tail=False):
    dataset = WaymoDataset(
        dataset_cfg=dataset_cfg, class_names=class_names, root_path=data_path,
        training=False, logger=common_utils.create_logger()
    )
    train_split = 'train'
    train_filename = save_path / ('%s_infos_%s.pkl' % (processed_data_tag, train_split))

    print('---------------Start create groundtruth database for data augmentation---------------')
    dataset.set_split(train_split)
    # print("train_split:")
    # print(train_split)
    if use_parallel:
        dataset.create_groundtruth_database_parallel(
            info_path=train_filename, save_path=save_path, split='train', sampled_interval=1,
            used_classes=['Vehicle', 'Pedestrian', 'Cyclist'], processed_data_tag=processed_data_tag,
            num_workers=workers, crop_gt_with_tail=crop_gt_with_tail
        )
    else:
        dataset.create_groundtruth_database(
            info_path=train_filename, save_path=save_path, split='train', sampled_interval=1,
            used_classes=['Vehicle', 'Pedestrian', 'Cyclist'], processed_data_tag=processed_data_tag
        )
    print('---------------Data preparation Done---------------')


if __name__ == '__main__':
    import argparse
    import yaml
    from easydict import EasyDict

    parser = argparse.ArgumentParser(description='arg parser')
    parser.add_argument('--cfg_file', type=str, default=None, help='specify the config of dataset')
    parser.add_argument('--func', type=str, default='create_waymo_infos', help='')
    parser.add_argument('--processed_data_tag', type=str, default='waymo_processed_data_v0_5_0', help='')
    parser.add_argument('--update_info_only', action='store_true', default=False, help='')
    parser.add_argument('--use_parallel', action='store_true', default=False, help='')
    parser.add_argument('--wo_crop_gt_with_tail', action='store_true', default=False, help='')
    parser.add_argument('--data_path', default=None, help='')

    args = parser.parse_args()
    if args.data_path is not None:
        ROOT_DIR = (Path(args.data_path)).resolve()
    else:
        ROOT_DIR = (Path(__file__).resolve().parent / '../../../').resolve() / 'data' / 'waymo'
    # ROOT_DIR = (Path(self.dataset_cfg.DATA_PATH)).resolve()
    if args.func == 'create_waymo_infos':
        try:
            yaml_config = yaml.safe_load(open(args.cfg_file), Loader=yaml.FullLoader)
        except:
            yaml_config = yaml.safe_load(open(args.cfg_file))
        dataset_cfg = EasyDict(yaml_config)
        dataset_cfg.PROCESSED_DATA_TAG = args.processed_data_tag
        create_waymo_infos(
            dataset_cfg=dataset_cfg,
            class_names=['Vehicle', 'Pedestrian', 'Cyclist'],
            data_path=ROOT_DIR,
            save_path=ROOT_DIR,
            raw_data_tag='raw_data',
            processed_data_tag=args.processed_data_tag,
            update_info_only=args.update_info_only
        )
    elif args.func == 'create_waymo_gt_database':
        try:
            yaml_config = yaml.safe_load(open(args.cfg_file), Loader=yaml.FullLoader)
        except:
            yaml_config = yaml.safe_load(open(args.cfg_file))
        dataset_cfg = EasyDict(yaml_config)
        dataset_cfg.PROCESSED_DATA_TAG = args.processed_data_tag
        create_waymo_gt_database(
            dataset_cfg=dataset_cfg,
            class_names=['Vehicle', 'Pedestrian', 'Cyclist'],
            data_path=ROOT_DIR,
            save_path=ROOT_DIR,
            processed_data_tag=args.processed_data_tag,
            use_parallel=args.use_parallel,
            crop_gt_with_tail=not args.wo_crop_gt_with_tail
        )
    else:
        raise NotImplementedError

/OpenPCDet-master/pcdet/datasets/waymo/waymo_utils.py:

# OpenPCDet PyTorch Dataloader and Evaluation Tools for Waymo Open Dataset
# Reference https://github.com/open-mmlab/OpenPCDet
# Written by Shaoshuai Shi, Chaoxu Guo
# All Rights Reserved 2019-2020.


import os
import shutil
import pickle
import numpy as np
from ...utils import common_utils
import tensorflow as tf
from waymo_open_dataset.utils import frame_utils, transform_utils, range_image_utils
from waymo_open_dataset import dataset_pb2

try:
    tf.enable_eager_execution()
except:
    pass

WAYMO_CLASSES = ['unknown', 'Vehicle', 'Pedestrian', 'Sign', 'Cyclist']


def generate_labels(frame, pose):
    obj_name, difficulty, dimensions, locations, heading_angles = [], [], [], [], []
    tracking_difficulty, speeds, accelerations, obj_ids = [], [], [], []
    num_points_in_gt = []
    laser_labels = frame.laser_labels

    for i in range(len(laser_labels)):
        box = laser_labels[i].box
        class_ind = laser_labels[i].type
        loc = [box.center_x, box.center_y, box.center_z]
        heading_angles.append(box.heading)
        obj_name.append(WAYMO_CLASSES[class_ind])
        difficulty.append(laser_labels[i].detection_difficulty_level)
        tracking_difficulty.append(laser_labels[i].tracking_difficulty_level)
        dimensions.append([box.length, box.width, box.height])  # lwh in unified coordinate of OpenPCDet
        locations.append(loc)
        obj_ids.append(laser_labels[i].id)
        num_points_in_gt.append(laser_labels[i].num_lidar_points_in_box)
        speeds.append([laser_labels[i].metadata.speed_x, laser_labels[i].metadata.speed_y])
        accelerations.append([laser_labels[i].metadata.accel_x, laser_labels[i].metadata.accel_y])

    annotations = {}
    annotations['name'] = np.array(obj_name)
    annotations['difficulty'] = np.array(difficulty)
    annotations['dimensions'] = np.array(dimensions)
    annotations['location'] = np.array(locations)
    annotations['heading_angles'] = np.array(heading_angles)

    annotations['obj_ids'] = np.array(obj_ids)
    annotations['tracking_difficulty'] = np.array(tracking_difficulty)
    annotations['num_points_in_gt'] = np.array(num_points_in_gt)
    annotations['speed_global'] = np.array(speeds)
    annotations['accel_global'] = np.array(accelerations)

    annotations = common_utils.drop_info_with_name(annotations, name='unknown')
    if annotations['name'].__len__() > 0:
        global_speed = np.pad(annotations['speed_global'], ((0, 0), (0, 1)), mode='constant', constant_values=0)  # (N, 3)
        speed = np.dot(global_speed, np.linalg.inv(pose[:3, :3].T))
        speed = speed[:, :2]
        
        gt_boxes_lidar = np.concatenate([
            annotations['location'], annotations['dimensions'], annotations['heading_angles'][..., np.newaxis], speed],
            axis=1
        )
    else:
        gt_boxes_lidar = np.zeros((0, 9))
    annotations['gt_boxes_lidar'] = gt_boxes_lidar
    return annotations


def convert_range_image_to_point_cloud(frame, range_images, camera_projections, range_image_top_pose, ri_index=(0, 1)):
    """
    Modified from the codes of Waymo Open Dataset.
    Convert range images to point cloud.
    Args:
        frame: open dataset frame
        range_images: A dict of {laser_name, [range_image_first_return, range_image_second_return]}.
        camera_projections: A dict of {laser_name,
            [camera_projection_from_first_return, camera_projection_from_second_return]}.
        range_image_top_pose: range image pixel pose for top lidar.
        ri_index: 0 for the first return, 1 for the second return.

    Returns:
        points: {[N, 3]} list of 3d lidar points of length 5 (number of lidars).
        cp_points: {[N, 6]} list of camera projections of length 5 (number of lidars).
    """
    calibrations = sorted(frame.context.laser_calibrations, key=lambda c: c.name)
    points = []
    cp_points = []
    points_NLZ = []
    points_intensity = []
    points_elongation = []

    frame_pose = tf.convert_to_tensor(np.reshape(np.array(frame.pose.transform), [4, 4]))
    # [H, W, 6]
    range_image_top_pose_tensor = tf.reshape(
        tf.convert_to_tensor(range_image_top_pose.data), range_image_top_pose.shape.dims
    )
    # [H, W, 3, 3]
    range_image_top_pose_tensor_rotation = transform_utils.get_rotation_matrix(
        range_image_top_pose_tensor[..., 0], range_image_top_pose_tensor[..., 1],
        range_image_top_pose_tensor[..., 2])
    range_image_top_pose_tensor_translation = range_image_top_pose_tensor[..., 3:]
    range_image_top_pose_tensor = transform_utils.get_transform(
        range_image_top_pose_tensor_rotation,
        range_image_top_pose_tensor_translation)

    for c in calibrations:
        points_single, cp_points_single, points_NLZ_single, points_intensity_single, points_elongation_single \
            = [], [], [], [], []
        for cur_ri_index in ri_index:
            range_image = range_images[c.name][cur_ri_index]
            if len(c.beam_inclinations) == 0:  # pylint: disable=g-explicit-length-test
                beam_inclinations = range_image_utils.compute_inclination(
                    tf.constant([c.beam_inclination_min, c.beam_inclination_max]),
                    height=range_image.shape.dims[0])
            else:
                beam_inclinations = tf.constant(c.beam_inclinations)

            beam_inclinations = tf.reverse(beam_inclinations, axis=[-1])
            extrinsic = np.reshape(np.array(c.extrinsic.transform), [4, 4])

            range_image_tensor = tf.reshape(
                tf.convert_to_tensor(range_image.data), range_image.shape.dims)
            pixel_pose_local = None
            frame_pose_local = None
            if c.name == dataset_pb2.LaserName.TOP:
                pixel_pose_local = range_image_top_pose_tensor
                pixel_pose_local = tf.expand_dims(pixel_pose_local, axis=0)
                frame_pose_local = tf.expand_dims(frame_pose, axis=0)
            range_image_mask = range_image_tensor[..., 0] > 0
            range_image_NLZ = range_image_tensor[..., 3]
            range_image_intensity = range_image_tensor[..., 1]
            range_image_elongation = range_image_tensor[..., 2]
            range_image_cartesian = range_image_utils.extract_point_cloud_from_range_image(
                tf.expand_dims(range_image_tensor[..., 0], axis=0),
                tf.expand_dims(extrinsic, axis=0),
                tf.expand_dims(tf.convert_to_tensor(beam_inclinations), axis=0),
                pixel_pose=pixel_pose_local,
                frame_pose=frame_pose_local)

            range_image_cartesian = tf.squeeze(range_image_cartesian, axis=0)
            points_tensor = tf.gather_nd(range_image_cartesian,
                                         tf.where(range_image_mask))
            points_NLZ_tensor = tf.gather_nd(range_image_NLZ, tf.compat.v1.where(range_image_mask))
            points_intensity_tensor = tf.gather_nd(range_image_intensity, tf.compat.v1.where(range_image_mask))
            points_elongation_tensor = tf.gather_nd(range_image_elongation, tf.compat.v1.where(range_image_mask))
            cp = camera_projections[c.name][0]
            cp_tensor = tf.reshape(tf.convert_to_tensor(cp.data), cp.shape.dims)
            cp_points_tensor = tf.gather_nd(cp_tensor, tf.where(range_image_mask))

            points_single.append(points_tensor.numpy())
            cp_points_single.append(cp_points_tensor.numpy())
            points_NLZ_single.append(points_NLZ_tensor.numpy())
            points_intensity_single.append(points_intensity_tensor.numpy())
            points_elongation_single.append(points_elongation_tensor.numpy())

        points.append(np.concatenate(points_single, axis=0))
        cp_points.append(np.concatenate(cp_points_single, axis=0))
        points_NLZ.append(np.concatenate(points_NLZ_single, axis=0))
        points_intensity.append(np.concatenate(points_intensity_single, axis=0))
        points_elongation.append(np.concatenate(points_elongation_single, axis=0))

    return points, cp_points, points_NLZ, points_intensity, points_elongation


def save_lidar_points(frame, cur_save_path, use_two_returns=True):
    ret_outputs = frame_utils.parse_range_image_and_camera_projection(frame)
    if len(ret_outputs) == 4:
        range_images, camera_projections, seg_labels, range_image_top_pose = ret_outputs
    else:
        assert len(ret_outputs) == 3
        range_images, camera_projections, range_image_top_pose = ret_outputs

    points, cp_points, points_in_NLZ_flag, points_intensity, points_elongation = convert_range_image_to_point_cloud(
        frame, range_images, camera_projections, range_image_top_pose, ri_index=(0, 1) if use_two_returns else (0,)
    )

    # 3d points in vehicle frame.
    points_all = np.concatenate(points, axis=0)
    points_in_NLZ_flag = np.concatenate(points_in_NLZ_flag, axis=0).reshape(-1, 1)
    points_intensity = np.concatenate(points_intensity, axis=0).reshape(-1, 1)
    points_elongation = np.concatenate(points_elongation, axis=0).reshape(-1, 1)

    num_points_of_each_lidar = [point.shape[0] for point in points]
    save_points = np.concatenate([
        points_all, points_intensity, points_elongation, points_in_NLZ_flag
    ], axis=-1).astype(np.float32)

    np.save(cur_save_path, save_points)
    # print('saving to ', cur_save_path)
    return num_points_of_each_lidar


def process_single_sequence(sequence_file, save_path, sampled_interval, has_label=True, use_two_returns=True, update_info_only=False):
    sequence_name = os.path.splitext(os.path.basename(sequence_file))[0]

    # print('Load record (sampled_interval=%d): %s' % (sampled_interval, sequence_name))
    if not sequence_file.exists():
        print('NotFoundError: %s' % sequence_file)
        return []

    dataset = tf.data.TFRecordDataset(str(sequence_file), compression_type='')
    cur_save_dir = save_path / sequence_name
    cur_save_dir.mkdir(parents=True, exist_ok=True)
    pkl_file = cur_save_dir / ('%s.pkl' % sequence_name)

    sequence_infos = []
    if pkl_file.exists():
        sequence_infos = pickle.load(open(pkl_file, 'rb'))
        sequence_infos_old = None
        if not update_info_only:
            print('Skip sequence since it has been processed before: %s' % pkl_file)

            return sequence_infos
        else:
            sequence_infos_old = sequence_infos
            sequence_infos = []
        # shutil.move(pkl_file, pkl_file + '/1')
    print("##########################sequence_file#############################")
    print(sequence_file)
    for cnt, data in enumerate(dataset):
        if cnt % sampled_interval != 0:
            continue
        # print(sequence_name, cnt)
        frame = dataset_pb2.Frame()
        frame.ParseFromString(bytearray(data.numpy()))

        info = {}
        pc_info = {'num_features': 5, 'lidar_sequence': sequence_name, 'sample_idx': cnt}
        info['point_cloud'] = pc_info

        info['frame_id'] = sequence_name + ('_%03d' % cnt)
        info['metadata'] = {
            'context_name': frame.context.name,
            'timestamp_micros': frame.timestamp_micros
        }
        image_info = {}
        for j in range(5):
            width = frame.context.camera_calibrations[j].width
            height = frame.context.camera_calibrations[j].height
            image_info.update({'image_shape_%d' % j: (height, width)})
        info['image'] = image_info

        pose = np.array(frame.pose.transform, dtype=np.float32).reshape(4, 4)
        info['pose'] = pose

        if has_label:
            annotations = generate_labels(frame, pose=pose)
            info['annos'] = annotations

        if update_info_only and sequence_infos_old is not None:
            assert info['frame_id'] == sequence_infos_old[cnt]['frame_id']
            num_points_of_each_lidar = sequence_infos_old[cnt]['num_points_of_each_lidar']
        else:
            num_points_of_each_lidar = save_lidar_points(
                frame, cur_save_dir / ('%04d.npy' % cnt), use_two_returns=use_two_returns
            )
        info['num_points_of_each_lidar'] = num_points_of_each_lidar

        sequence_infos.append(info)

    with open(pkl_file, 'wb') as f:
        pickle.dump(sequence_infos, f)

    print('Infos are saved to (sampled_interval=%d): %s' % (sampled_interval, pkl_file))
    return sequence_infos

你可能感兴趣的:(3D视觉环境安装,人工智能,ubuntu,python,pytorch)

智慧交通是什么，可以帮助我们解决什么问题? Guheyunyi 运维大数据人工智能信息可视化前端
智慧交通是什么？智慧交通（SmartTransportation）是指利用物联网（IoT）、大数据、人工智能（AI）、云计算、5G通信等先进技术，对交通系统进行智能化管理和优化，以提高交通效率、减少拥堵、降低事故率、提升出行体验，并实现交通资源的合理配置和可持续发展。智慧交通的核心是通过数据采集、分析和应用，实现交通系统的智能化、自动化和协同化，从而构建一个高效、安全、绿色、便捷的交通生态系统。智
Python逆向爬取Tik Tok，MsToken,X-Bogus以及signature 才华是浅浅的耐心 python javascript 前端
自5月起，抖音正式开放Web接口，并不断升级风控机制。从最初的_signature参数，到增加滑块验证，再到如今的JSVM混淆处理，以及mstoken和x-bougs等参数的引入。分析发现，部分国内接口仅需提供Cookie即可访问，无需额外验签，而获取Cookie的方式多种多样，其中利用OpenCV识别滑块验证码是一种简单可行的方法。相比之下，TikTok的接口无需Cookie，但对签名的校验更加
【3D模型】【游戏开发】【Blender】Blender模型分享-狮头木雕附导入方法踏雪无痕老爷子资源介绍 3d blender
导入方法：[Blender]如何导入包含纹理的.blend模型文件在3D建模和渲染工作中，Blender是一款功能强大的免费开源软件。很多时候，我们需要导入.blend后缀的模型文件，同时确保纹理（textures）文件夹中的贴图能够正确加载。本文将介绍详细的导入步骤以及可能遇到的问题和解决方案。1.直接打开.blend文件如果你的.blend文件是一个完整的工程文件，包含了模型和纹理，直接打开即
Browser-Use WebUI项目启动指南思考在马桶上人工智能 chatgpt 经验分享 python
摘要此前发布《Browser-UseWebUI使用体验》博文后，鉴于部分朋友运行时出现问题，重新运行并整理相关内容。本文详细记录WebUI项目启动全过程，涵盖Python3.11+、Chrome浏览器及APIKeys等环境要求，Python环境检查、依赖安装等环境配置步骤，.env文件中环境变量的设置方法。同时，针对启动中如lxml.html.clean依赖缺失、连接被拒等问题给出解决方案，介绍启
Linux篇1-初识Linux 逃跑的机械工 Linux linux
1.Linux能干什么Linux能够进行各种语言的开发工作，基本主要以后端语言为主C++，JAVA,python;Linux能进行各种指令操作，从而完成各种的文件相关的管理工作2.Linux基本指令2.1ls指令在Linux中，以.开头的文件，叫做隐藏文件；ls-a显示隐藏文件隐藏文件：Linux配置文件，可以隐藏起来，防止误操作，起到保护作用；ls-l列出文件的详细信息-d将目录象文件一样显示，
Python获取tiktok视频数据信息 api 爬虫程序媛了了 python 开发语言
Tiktok通过ID爬取视频信息api采集页面如图：https://www.tiktok.com/@basketwithball2.0/video/7273119444522650912?q=irving&t=1706683319923请求APIhttp://api.xxxx.com/tt/video/info?video_id=7273119444522650912&token=test请求参数
Node.js 中使用 RabbitMQ 海上彼尚 node.js node.js rabbitmq 分布式
目录一、RabbitMQ简介二、核心概念解析三、环境搭建（以Ubuntu为例）四、Node.js实战：生产者与消费者1.安装依赖2.生产者代码（发送消息）3.消费者代码（处理消息）五、高级配置与最佳实践六、常见问题与解决方案七、总结一、RabbitMQ简介RabbitMQ是一个基于AMQP协议的开源消息代理工具，专为分布式系统设计。它通过解耦生产者和消费者实现异步通信，支持流量削峰、任务队列、服务
在Ubuntu上安装MEAN Stack的4个步骤 ubuntu
在Ubuntu上安装MEANStack的4个步骤为：1.安装MEAN；2.安装MongoDB；3.安装NodeJS，Git和NPM；4.安装剩余的依赖项。什么是MEANStack？平均堆栈一直在很大程度上升高为基于稳健的基于JavaScript的开发堆栈。名称的意思是指其组件;MongoDB，ExpressJS，Angularjs和NodeJS。第1步：安装MEAN对于此安装，我们将在本指南中使用
在线视频创作平台（Vidnami） deepdata_cn 视频生成视频剪辑视频创作
Vidnami是一款功能强大的在线视频创作平台，前身为ContentSamurai，于2015年推出，2020年更名为Vidnami。它运用人工智能技术，能够分析输入的文本，自动从大量素材中选取合适的图像和视频片段，将文字快速转化为具有专业外观的视频，无需用户具备视频编辑经验。该平台提供多种视频模板、全主题定制功能以及内置的免版权媒体库，包括3000万张图片和3万首音乐，还支持自动配音，用户可以录
Ubuntu执行apt-get install xxx报错怎么办？
在Ubuntu系统中，使用apt-getinstall命令安装软件包时，可能会遇到各种报错。本文将详细介绍Ubuntu执行apt-getinstallxxx报错的解决方法，帮助您快速定位并解决问题。️常见报错及解决方法1.更新源和软件包问题：软件包信息过时，导致无法找到或安装最新的软件包。解决方法：首先确保系统源和软件包是最新的，执行以下命令更新：sudoaptupdatesudoaptupgra
LeetCode98-验证二叉搜索树学习的学习者 LeetCode Python 二叉搜索树
上个星期和导师去了华农一趟名义上是和导师去参加一个国家级的项目其实没我啥事都是我导师在那口若悬河当时和那边的本科生去了另一间会议室交流了关于GAN的知识偶然听说大家都在用pytorch好像最新版的也挺好用的反正就是学术界目前主要用这个框架工业界主要用Tensorflow(没办法，Google出品)这两天也拿来瞧了瞧好像也确实可以的！！！98-验证二叉搜索树给定一个二叉树，判断其是否是一个有效的二叉
基于图像比对的跨平台UI一致性校验工具开发全流程指南——Android/iOS/Web三端自动化测试实战追寻向上 ui android ios
一、需求背景与方案概述1.1为什么需要跨平台UI校验？在移动互联网时代，同一产品需覆盖Android、iOS和Web三端。由于不同平台的开发框架（如Android的MaterialDesign与iOS的Cupertino风格）及渲染引擎差异，UI界面易出现以下问题：布局错位：按钮位置偏移、文本换行不一致视觉差异：颜色色差、字体粗细不同交互逻辑冲突：滑动方向、弹窗动画不一致传统人工测试效率低且易遗漏
【初学者】用Python语言来解释指针的用例与应用场景 lisw05 python python 开发语言
李升伟整理Python本身并不直接支持指针的概念，因为Python是一种高级语言，内存管理由解释器自动处理。不过，Python提供了一些机制（如引用、可变对象等）来实现类似指针的功能。以下是Python中“指针”的用例和应用场景。1.引用机制（类似指针）在Python中，变量是对对象的引用，而不是直接存储对象的值。这种引用机制类似于指针的概念。示例：a=10#a是对整数对象10的引用b=a#b也引
OpenCV第1课OpenCV 介绍及其树莓派下环境的搭建嵌入式老牛树莓派之OpenCV opencv 人工智能计算机视觉
1.机器是如何“看”的我们人类可以通过眼睛看到五颜六色的世界，是因为人眼的视觉细胞中存在分别对红、绿、蓝敏感的3种细胞。其中的光感色素根据光线的不同进行不同比例的分解，从而让我们识别到各种颜色。对人工智能而言，学会“看”也是非常关键的一步。那么机器人是如何看到这个世界的呢？这就涉及到人工智能方向重要的分支--机器视觉。机器视觉即用机器人代替人眼来做测量和判断，通过机器视觉产品（即图像摄取装置，分C
python、JavaScript 、JAVA等实例代码演示教你如何免费获取股票数据（实时数据、历史数据、CDMA、KDJ等指标数据）配有股票数据API接口说明文档详解参数说明蝶澈乐乐 python javascript java 股票数据接口 api 开发语言
近一两年来，股票量化分析逐渐受到广泛关注。而作为这一领域的初学者，首先需要面对的挑战就是如何获取全面且准确的股票数据。因为无论是实时交易数据、历史交易记录、财务数据还是基本面信息，这些数据都是我们进行量化分析时不可或缺的宝贵资源。我们的核心任务是从这些数据中挖掘出有价值的信息，为我们的投资策略提供有力的支持。在寻找数据的过程中，我尝试了多种途径，包括自编网易股票页面爬虫、申万行业数据爬虫，以及同花
31天Python入门——第7天:集合·字典你真的懂了吗? 安然无虞 Python手把手教程 python 开发语言后端
你好，我是安然无虞。文章目录1.集合1.1集合的定义1.2集合的常用操作1.3集合练习2.字典2.1字典的定义2.2嵌套字典和字典的取值2.3字典的常用操作补充知识:字典的优势是查找值效率高2.4字典推导式2.5字典练习很重要的补充练习:希望你能掌握练习一练习二1.集合在之前的章节中,我们学习了列表,元组,字符串.已经可以覆盖七成的使用场景了.那么为什么还要学习集合类型呢.列表:有序可变,元素可重
Opencv计算机视觉编程攻略-第一节图像读取与基本处理 weixin_44242403 深度学习 opencv 计算机视觉
1.图像读取导入依赖项的h文件#include#include#include#include项目Valuecore.hpp基础数据结构和操作（图像存储、矩阵运算、文件I/O）highgui.hpp图像显示、窗口管理、用户交互（图像/视频显示、用户输入处理、结果保存）imgproc.hpp图像处理算法（图像滤波、几何变换、边缘检测、形态学操作）二读取图片Matimage;//图像矩阵std::co
打造城市二手房分析与可视化系统+聚类分析+58爬虫+线性回归 OverlordDuke 聚类算法数据可视化爬虫线性回归算法
打造城市二手房分析与可视化系统+聚类分析+58爬虫+线性回归利用数据实现全面分析数据分析与可视化功能创新的聚类分析功能结语在如今房地产市场日益复杂的背景下，对于投资者、购房者和市场分析师来说，了解市场动态并做出明智的决策至关重要。基于此，我们开发了一款基于Python的城市二手房分析与可视化系统，为用户提供了强大的工具，帮助他们深入了解当地房地产市场。利用数据实现全面分析我们的系统利用爬取的58同
centos7输入python -m bitsandbytes报错CUDA Setup failed despite GPU being available. Please run the follo 小太阳，乐向上 python 开发语言
在centos7.9系统中安装gpu驱动及cuda，跑大模型会报错，提示让输入python-mbitsandbytes依然报错：CUDASETUP:Loadingbinary/usr/local/python3/lib/python3.9/site-packages/bitsandbytes/libbitsandbytes_cuda117.so.../lib64/libstdc++.so.6:ve
Linux安装Anaconda和Jupyter 硬水果糖人工智能 Linux linux jupyter 运维
一、了解Anaconda和Jupyter引言：Anaconda是一个流行的开源数据科学平台，广泛用于数据分析、机器学习、人工智能等领域。它是一个集成了大量科学计算和数据科学工具的Python和R编程语言环境。Anaconda的主要目标是简化数据科学和机器学习的开发流程，提供一个易于安装和管理的环境。而预装了大量常用的Python和R库，这些库涵盖了数据科学的各个方面，包括：数据分析：Pandas、
python-56-基于Vue和Flask进行前后端分离的项目开发示例实战皮皮冰燃 python3 python vue.js flask
文章目录1创建Vue前端项目1.1运行demo1.2实现需求2flask部署上述dist(前后端未分离)2.1代码app.py2.2运行访问3nginx部署(前后端分离)3.1nginx前端服务3.3.1windows安装nginx3.3.2修改nginx.conf配置文件3.3.3启动nginx3.3.3停止nginx3.2启动后端服务3.2.1app.py(去除前端渲染)3.2.2启动flas
爬虫基础--request库详解 amo的代码园_毕设 Java基础爬虫 java spring boot vue.js python 开发语言
爬虫基础–request库详解1.requests模块介绍request库中文文档：https://docs.python-requests.org/zh_CN/latest/user/quickstart.htmlrequests是一个非常流行的PythonHTTP第三方库，它允许你发送各种HTTP请求，处理cookies、会话、连接池、重定向、多种认证方式等，使得处理HTTP请求变得非常便捷，
基于百度翻译的python爬虫示例魂万劫 python 爬虫开发语言百度翻译
(今年java工作真难找啊，有广州java高级岗位招人的好心人麻烦推一下，拜谢。。）花了一周时间，从零基础开始学习了python，学有所获之后，就总想爬些什么，不然感觉不得劲，所以花了一天时间整出了个百度翻译的爬虫示例，主要卡点花在了找token、sign以及调试请求上。代码有点乱，毕竟是demo，但是功能是实现了的。importrequestsimportjs2pyimportrefromurl
关于bitsandbytes安装报错跃跃欲试88 语言模型人工智能 transformer
RunTimeError:CUDASetupfaileddespiteGPUbeingavailable.InspecttheCUDASETUPoutputsabovetofixyourenvironment!ubuntu@VM-0-8-ubuntu:~$python-mbitsandbytesFalse===================================BUGREPORT===
ChatGPT、DeepSeek、GIS与Python机器学习强强联合！地质灾害风险评估、易发性分析、信息化建库及灾后重建 WangYan2022 DeepSeek ChatGPT 地下水地质灾害 DeepSeek ChatGPT GIS 灾后重建
在地质灾害频繁肆虐的当下，精准开展风险评价刻不容缓。如今，一门极具创新性的教程震撼登场，它将ChatGPT、DeepSeek等前沿技术与GIS、Python以及机器学习深度交融，为学员打造出前所未有的学习体验，助力大家在地质灾害风险评价领域强势突围，一路领先。前沿技术融合，铸就智能学习核心动力教程最闪耀的亮点之一，便是大胆引入了ChatGPT和DeepSeek技术。它们恰似无所不能的“数据魔法师”
python3实现爬取淘宝页面的商品的数据信息（selenium+pyquery+mongodb） flood_d mongodb python selenium pyquery 爬虫
1.环境须知做这个爬取的时候需要安装好python3.6和selenium、pyquery等等一些比较常用的爬取和解析库，还需要安装MongoDB这个分布式数据库。2.直接上代码spider.pyimportrefromconfigimport*importpymongofromseleniumimportwebdriverfromselenium.common.exceptionsimportT
一篇文章教会你用Python爬取淘宝评论数据【淘宝商品评论数据接口参数】 Tinalee-电商API接口呀主流电商数据采集API接口淘宝天猫商品API接口淘宝商品评论API接口 python 开发语言人工智能大数据爬虫 java
【一、项目简介】本文主要目标是采集淘宝的评价，找出客户所需要的功能。统计客户评价上面夸哪个功能多，比如防水，容量大，好看等等。【二·淘宝/天猫获得淘宝商品评论API返回值】item_review-获得淘宝商品评论taobao.item_review公共参数名称类型必须描述keyString是调用key（必须以GET方式拼接在URL中）secretString是调用密钥api_nameString是
Python for Android 安装和配置指南舒欣和Queenly
PythonforAndroid安装和配置指南python-for-androidTurnyourPythonapplicationintoanAndroidAPK项目地址:https://gitcode.com/gh_mirrors/py/python-for-android1.项目基础介绍和主要编程语言项目基础介绍PythonforAndroid(p4a)是一个开源工具，旨在将Python应用
python -m bitsandbytes 报错解释与解决 MityKif python 开发语言
RuntimeError:CUDASetupfaileddespiteGPUbeingavailable.Pleaserunthefollowingcommandtogetmoreinformation:python-mbitsandbytesInspecttheoutputofthecommandandseeifyoucanlocateCUDAlibraries.Youmightneedtoad
推特关键词爬虫Python实现最新版（2025.2.20）才华是浅浅的耐心爬虫 python 开发语言
引言随着各类自媒体平台的兴起，数据挖掘和分析变得尤为重要。推特作为全球最大的自媒体平台，越来越来越多的人需要通过爬取其内容进行分析。然后自从马斯克接手推特之后，推特api不可再用，推特的反爬力度也在逐渐增强。今天小编就分享一个推特爬虫的教程。描述这篇文章主要通过关键词爬取帖子内容信息以及帖子作者主页相关信息，用户也可根据自己需要的时间段进行筛选。推特可支持筛选多种语言，我这里先展示中文和英文的。字
解线性方程组 qiuwanchi
package gaodai.matrix; import java.util.ArrayList; import java.util.List; import java.util.Scanner; public class Test { public static void main(String[] args) { Scanner scanner = new Sc
在mysql内部存储代码 annan211 性能 mysql 存储过程触发器
在mysql内部存储代码在mysql内部存储代码，既有优点也有缺点，而且有人倡导有人反对。先看优点： 1 她在服务器内部执行，离数据最近，另外在服务器上执行还可以节省带宽和网络延迟。 2 这是一种代码重用。可以方便的统一业务规则，保证某些行为的一致性，所以也可以提供一定的安全性。 3 可以简化代码的维护和版本更新。 4 可以帮助提升安全，比如提供更细
Android使用Asynchronous Http Client完成登录保存cookie的问题 hotsunshine android
Asynchronous Http Client是android中非常好的异步请求工具除了异步之外还有很多封装比如json的处理，cookie的处理引用 Persistent Cookie Storage with PersistentCookieStore This library also includes a PersistentCookieStore whi
java面试题 Array_06 java 面试
java面试题第一，谈谈final, finally, finalize的区别。 final-修饰符（关键字）如果一个类被声明为final，意味着它不能再派生出新的子类，不能作为父类被继承。因此一个类不能既被声明为 abstract的，又被声明为final的。将变量或方法声明为final，可以保证它们在使用中不被改变。被声明为final的变量必须在声明时给定初值，而在以后的引用中只能
网站加速 oloz 网站加速
前序:本人菜鸟，此文研究总结来源于互联网上的资料，大牛请勿喷！本人虚心学习，多指教. 1、减小网页体积的大小，尽量采用div+css模式，尽量避免复杂的页面结构，能简约就简约。 2、采用Gzip对网页进行压缩； GZIP最早由Jean-loup Gailly和Mark Adler创建，用于UNⅨ系统的文件压缩。我们在Linux中经常会用到后缀为.gz
正确书写单例模式随意而生 java 设计模式单例
　　单例模式算是设计模式中最容易理解，也是最容易手写代码的模式了吧。但是其中的坑却不少，所以也常作为面试题来考。本文主要对几种单例写法的整理，并分析其优缺点。很多都是一些老生常谈的问题，但如果你不知道如何创建一个线程安全的单例，不知道什么是双检锁，那这篇文章可能会帮助到你。　　懒汉式，线程不安全　　当被问到要实现一个单例模式时，很多人的第一反应是写出如下的代码，包括教科书上也是这样
单例模式香水浓 java
懒汉调用getInstance方法时实例化 public class Singleton { private static Singleton instance; private Singleton() {} public static synchronized Singleton getInstance() { if(null == ins
安装Apache问题：系统找不到指定的文件 No installed service named "Apache2" AdyZhang apache http server
安装Apache问题：系统找不到指定的文件 No installed service named "Apache2" 每次到这一步都很小心防它的端口冲突问题，结果，特意留出来的80端口就是不能用，烦。解决方法确保几处： 1、停止IIS启动 2、把端口80改成其它（譬如90，800，，，什么数字都好） 3、防火墙(关掉试试) 在运行处输入 cmd 回车，转到apa
如何在android 文件选择器中选择多个图片或者视频？ aijuans android
我的android app有这样的需求，在进行照片和视频上传的时候，需要一次性的从照片/视频库选择多条进行上传但是android原生态的sdk中，只能一个一个的进行选择和上传。我想知道是否有其他的android上传库可以解决这个问题，提供一个多选的功能，可以使checkbox之类的，一次选择多个处理方法官方的图片选择器(但是不支持所有版本的androi，只支持API Level
mysql中查询生日提醒的日期相关的sql baalwolf mysql
SELECT sysid,user_name,birthday,listid,userhead_50,CONCAT(YEAR(CURDATE()),DATE_FORMAT(birthday,'-%m-%d')),CURDATE(), dayofyear( CONCAT(YEAR(CURDATE()),DATE_FORMAT(birthday,'-%m-%d')))-dayofyear(
MongoDB索引文件破坏后导致查询错误的问题 BigBird2012 mongodb
问题描述： MongoDB在非正常情况下关闭时，可能会导致索引文件破坏，造成数据在更新时没有反映到索引上。解决方案：使用脚本，重建MongoDB所有表的索引。 var names = db.getCollectionNames(); for( var i in names ){ var name = names[i]; print(name);
Javascript Promise bijian1013 JavaScript Promise
Parse JavaScript SDK现在提供了支持大多数异步方法的兼容jquery的Promises模式，那么这意味着什么呢，读完下文你就了解了。一.认识Promises “Promises”代表着在javascript程序里下一个伟大的范式，但是理解他们为什么如此伟大不是件简
[Zookeeper学习笔记九]Zookeeper源代码分析之Zookeeper构造过程 bit1129 zookeeper
Zookeeper重载了几个构造函数，其中构造者可以提供参数最多，可定制性最多的构造函数是 public ZooKeeper(String connectString, int sessionTimeout, Watcher watcher, long sessionId, byte[] sessionPasswd, boolea
【Java命令三】jstack bit1129 jstack
jstack是用于获得当前运行的Java程序所有的线程的运行情况(thread dump），不同于jmap用于获得memory dump [hadoop@hadoop sbin]$ jstack Usage: jstack [-l] <pid> (to connect to running process) jstack -F
jboss 5.1启停脚本　动静分离部署 ronin47
以前启动jboss，往各种xml配置文件，现只要运行一句脚本即可。start nohup sh /**/run.sh -c servicename -b ip -g clustername -u broatcast jboss.messaging.ServerPeerID=int -Djboss.service.binding.set=p
UI之如何打磨设计能力? brotherlamp UI ui教程 ui自学 ui资料 ui视频
在越来越拥挤的初创企业世界里，视觉设计的重要性往往可以与杀手级用户体验比肩。在许多情况下，尤其对于 Web 初创企业而言，这两者都是不可或缺的。前不久我们在《右脑革命：别学编程了，学艺术吧》中也曾发出过重视设计的呼吁。如何才能提高初创企业的设计能力呢?以下是 9 位创始人的体会。 1.找到自己的方式如果你是设计师，要想提高技能可以去设计博客和展示好设计的网站如D-lists或
三色旗算法 bylijinnan java 算法
import java.util.Arrays; /** 问题：假设有一条绳子，上面有红、白、蓝三种颜色的旗子，起初绳子上的旗子颜色并没有顺序，您希望将之分类，并排列为蓝、白、红的顺序，要如何移动次数才会最少，注意您只能在绳子上进行这个动作，而且一次只能调换两个旗子。网上的解法大多类似：在一条绳子上移动，在程式中也就意味只能使用一个阵列，而不使用其它的阵列来
警告:No configuration found for the specified action: \'s chiangfai configuration
1.index.jsp页面form标签未指定namespace属性。  <%@taglib prefix="s" uri="/struts-tags"%> ... <s:form action="submit" method="post"&g
redis -- hash_max_zipmap_entries设置过大有问题 chenchao051 redis hash
使用redis时为了使用hash追求更高的内存使用率，我们一般都用hash结构，并且有时候会把hash_max_zipmap_entries这个值设置的很大，很多资料也推荐设置到1000，默认设置为了512，但是这里有个坑 #define ZIPMAP_BIGLEN 254 #define ZIPMAP_END 255 /* Return th
select into outfile access deny问题 daizj mysql txt 导出数据到文件
本文转自：http://hatemysql.com/2010/06/29/select-into-outfile-access-deny%E9%97%AE%E9%A2%98/ 为应用建立了rnd的帐号，专门为他们查询线上数据库用的，当然，只有他们上了生产网络以后才能连上数据库，安全方面我们还是很注意的，呵呵。授权的语句如下： grant select on armory.* to rn
phpexcel导出excel表简单入门示例 dcj3sjt126com PHP Excel phpexcel
<?php error_reporting(E_ALL); ini_set('display_errors', TRUE); ini_set('display_startup_errors', TRUE); if (PHP_SAPI == 'cli') die('This example should only be run from a Web Brows
美国电影超短200句 dcj3sjt126com 电影
1. I see．我明白了。2. I quit! 我不干了!3. Let go! 放手!4. Me too．我也是。5. My god! 天哪!6. No way! 不行!7. Come on．来吧(赶快)8. Hold on．等一等。9. I agree。我同意。10. Not bad．还不错。11. Not yet．还没。12. See you．再见。13. Shut up!
Java访问远程服务 dyy_gusi httpclient webservice get post
随着webService的崛起，我们开始中会越来越多的使用到访问远程webService服务。当然对于不同的webService框架一般都有自己的client包供使用，但是如果使用webService框架自己的client包，那么必然需要在自己的代码中引入它的包，如果同时调运了多个不同框架的webService，那么就需要同时引入多个不同的clien
Maven的settings.xml配置 geeksun settings.xml
settings.xml是Maven的配置文件，下面解释一下其中的配置含义： settings.xml存在于两个地方： 1.安装的地方：$M2_HOME/conf/settings.xml 2.用户的目录：${user.home}/.m2/settings.xml 前者又被叫做全局配置，后者被称为用户配置。如果两者都存在，它们的内容将被合并，并且用户范围的settings.xml优先。
ubuntu的init与系统服务设置 hongtoushizi ubuntu
转载自： http://iysm.net/?p=178 init Init是位于/sbin/init的一个程序，它是在linux下，在系统启动过程中，初始化所有的设备驱动程序和数据结构等之后，由内核启动的一个用户级程序，并由此init程序进而完成系统的启动过程。 ubuntu与传统的linux略有不同，使用upstart完成系统的启动，但表面上仍维持init程序的形式。运行
跟我学Nginx+Lua开发目录贴 jinnianshilongnian nginx lua
使用Nginx+Lua开发近一年的时间，学习和实践了一些Nginx+Lua开发的架构，为了让更多人使用Nginx+Lua架构开发，利用春节期间总结了一份基本的学习教程，希望对大家有用。也欢迎谈探讨学习一些经验。目录第一章安装Nginx+Lua开发环境第二章 Nginx+Lua开发入门第三章 Redis/SSDB+Twemproxy安装与使用第四章 L
php位运算符注意事项 home198979 位运算 PHP &
$a = $b = $c = 0; $a & $b = 1; $b | $c = 1 问a,b,c最终为多少? 当看到这题时，我犯了一个低级错误，误以为位运算符会改变变量的值。所以得出结果是1 1 0 但是位运算符是不会改变变量的值的，例如： $a=1;$b=2; $a&$b; 这样a,b的值不会有任何改变
Linux shell数组建立和使用技巧 pda158 linux
1.数组定义　　[chengmo@centos5 ~]$ a=(1 2 3 4 5) 　　[chengmo@centos5 ~]$ echo $a 　　1 　　一对括号表示是数组，数组元素用“空格”符号分割开。　　 2.数组读取与赋值　　得到长度：　　[chengmo@centos5 ~]$ echo ${#a[@]} 　　5 　　用${#数组名[@或
hotspot源码(JDK7) ol_beta java HotSpot jvm
源码结构图，方便理解： ├─agent Serviceab
Oracle基本事务和ForAll执行批量DML练习 vipbooks oracle sql
基本事务的使用：从账户一的余额中转100到账户二的余额中去，如果账户二不存在或账户一中的余额不足100则整笔交易回滚 select * from account; -- 创建一张账户表 create table account( -- 账户ID id number(3) not null, -- 账户名称 nam