Baidu PaddlePaddle BMN Temporal Action Localization Framework | Data Preparation and Training Guide (Part 2)

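3. BMN

Part 1 ended with preparing the training data that BMN needs. That data sits under the PaddleVideo-develop/applications/FootballAction/datasets/EuroCup2016/features folder.

(1) Process the training data

Open the PaddleVideo-develop/applications/FootballAction/datasets/script directory and run the following from inside it: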

python get_instance_for_bmn.py

(2) Modify the config file

Open PaddleVideo-develop/applications/FootballAction/train_proposal/configs/bmn_football_v2.0.yaml

Lines 22, 26 and 31: change to the absolute path of the file just generated, PaddleVideo-develop/applications/FootballAction/datasets/EuroCup2016/input_for_bmn/label.json

Lines 38, 49 and 60: change to the absolute path of the feature directory, PaddleVideo-develop/applications/FootballAction/datasets/EuroCup2016/input_for_bmn/feature

Line 84: change to the absolute path of PaddleVideo-develop/applications/FootballAction/BMN_INFERENCE_results, then create a folder named BMN_INFERENCE_results at that location

Line 17: change batch_size to 1
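Since every edit above needs an absolute path, a small helper can print them and create the results folder in one go. This is only a sketch: the function names `bmn_config_paths` and `make_results_dir` are made up for illustration, and `repo_root` is assumed to be wherever your PaddleVideo-develop checkout lives. It does not edit the YAML file itself.

```python
from pathlib import Path


def bmn_config_paths(repo_root):
    """Return the absolute paths that the BMN config edits refer to.

    repo_root is the PaddleVideo-develop checkout (an assumption: point it
    at the location of the repository on your own machine).
    """
    app = Path(repo_root).resolve() / "applications" / "FootballAction"
    bmn_input = app / "datasets" / "EuroCup2016" / "input_for_bmn"
    return {
        "label_json": bmn_input / "label.json",
        "feature_dir": bmn_input / "feature",
        "results_dir": app / "BMN_INFERENCE_results",
    }


def make_results_dir(repo_root):
    """Create BMN_INFERENCE_results if it is missing and return its path."""
    results_dir = bmn_config_paths(repo_root)["results_dir"]
    results_dir.mkdir(parents=True, exist_ok=True)
    return results_dir
```

Calling `make_results_dir("/path/to/PaddleVideo-develop")` and then printing each value of `bmn_config_paths(...)` gives strings that can be pasted into the config.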

(3) Train BMN

Run the following from the PaddleVideo project root:

python -B -m paddle.distributed.launch --gpus="0" --log_dir=./football/logs_bmn main.py  --validate -c applications/FootballAction/train_proposal/configs/bmn_football_v2.0.yaml -o output_dir=./football/bmn

Training then starts.

(4) Export the BMN inference model

python tools/export_model.py -c applications/FootballAction/train_proposal/configs/bmn_football_v2.0.yaml -p ./football/bmn/BMN_epoch_00020.pdparams -o ./football/inference_model

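The official documentation exports the weights from epoch 16, whereas the command above uses epoch 20. In my runs avg_val was nan in every validation pass, most likely because my training data differs from the official set. My guess is that the data is insufficient: the open-source portion may be less than 40% of the full dataset, so it is not surprising that the model learns little.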
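To see how often nan shows up without scrolling through the logs by hand, something like the following could help. It is a rough sketch: `find_nan_lines` is a hypothetical helper, it reads every file under the log directory (for the training command above that would be `./football/logs_bmn`), and it assumes the logs are plain text.

```python
import re
from pathlib import Path

# Matches "nan" as a standalone token, e.g. "avg_val: nan" but not "finance".
NAN_TOKEN = re.compile(r"(?<![A-Za-z0-9_])nan(?![A-Za-z0-9_])", re.IGNORECASE)


def find_nan_lines(log_dir):
    """Return (file name, line number, text) for each log line mentioning nan."""
    hits = []
    for path in sorted(Path(log_dir).rglob("*")):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going if a file has odd bytes.
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if NAN_TOKEN.search(line):
                hits.append((path.name, number, line.strip()))
    return hits
```

If the returned list covers every validation line, the problem is there from the first epoch and picking a different checkpoint is unlikely to help.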

After the export, the model files can be found in the PaddleVideo-develop/football/inference_model folder.
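A quick way to confirm the export worked is to check that the files this guide relies on are actually there. The sketch below assumes the two file names BMN.pdmodel and BMN.pdiparams, which are the ones referenced in this guide; `missing_export_files` is a made-up helper name.

```python
from pathlib import Path

# File names assumed from the paths this guide uses for the exported model.
EXPECTED_FILES = ("BMN.pdmodel", "BMN.pdiparams")


def missing_export_files(inference_dir):
    """Return the expected inference files that are absent from inference_dir."""
    folder = Path(inference_dir)
    return [name for name in EXPECTED_FILES if not (folder / name).is_file()]
```

An empty list from `missing_export_files("./football/inference_model")` means both files exist; anything else names what is missing.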

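(5) Generate temporal proposals

In the PaddleVideo-develop/applications/FootballAction/extractor directory, open extract_bmn.py

Line 66: change to the absolute path of PaddleVideo-develop/applications/FootballAction/datasets/EuroCup2016

Open PaddleVideo-develop/applications/FootballAction/extractor/configs/configs.yaml

Lines 38 and 39: we already changed these once earlier, so they currently point to the officially released weights, which should perform much better than self-trained ones. To demonstrate the whole training pipeline, however, change them to the BMN inference model just exported:

Line 38: PaddleVideo-develop/football/inference_model/BMN.pdmodel

Line 39: PaddleVideo-develop/football/inference_model/BMN.pdiparams

Line 45: change batch_size to 2 (for 8 GB of GPU memory)

Then, also from inside the PaddleVideo-develop/applications/FootballAction/extractor directory, run: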

python extract_bmn.py

When it finishes, the output is located at:

   |--  datasets                   # training datasets and processing scripts
        |--  EuroCup2016            # dataset
            |--  feature_bmn
                 |--  prop.json    # BMN prediction results
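Before moving on, it can be worth checking that prop.json is non-empty. The guide does not describe the file's internal layout, so the sketch below makes no assumption about it and only reports the top-level JSON type and size; `summarize_json` is a hypothetical helper.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Load a JSON file and describe its top-level structure."""
    with Path(path).open(encoding="utf-8") as handle:
        data = json.load(handle)
    summary = {"type": type(data).__name__, "size": None, "keys": []}
    if isinstance(data, (list, dict)):
        summary["size"] = len(data)
    if isinstance(data, dict):
        summary["keys"] = sorted(data)[:5]  # show only the first few keys
    return summary
```

A size of 0 would suggest that extract_bmn.py ran but produced no proposals, which is worth investigating before the LSTM step.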

4. AttentionLSTM

(1) Process the temporal proposals

Open get_instance_for_lstm.py in PaddleVideo-develop/applications/FootballAction/datasets/script

Line 12: change to the path of your own EuroCup2016 folder, PaddleVideo-develop/applications/FootballAction/datasets/EuroCup2016

Then, from inside PaddleVideo-develop/applications/FootballAction/datasets/script, run:

python get_instance_for_lstm.py

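The results are stored in:

   |--  datasets                    # training datasets and processing scripts
        |--  EuroCup2016            # dataset
            |--  input_for_lstm     # proposals for LSTM training
                ├── feature         # features
                ├── label_info.json # label information
                ├── train.txt       # training file list
                └── val.txt         # validation file list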
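The layout above can be checked with a few lines of Python. This is a sketch under one assumption that the guide does not confirm: that train.txt and val.txt list one sample per non-empty line. `lstm_input_report` is a made-up helper name.

```python
from pathlib import Path


def lstm_input_report(input_dir):
    """Report which expected entries exist and how many samples each list holds."""
    root = Path(input_dir)
    report = {
        "feature_dir": (root / "feature").is_dir(),
        "label_info": (root / "label_info.json").is_file(),
    }
    for list_name in ("train.txt", "val.txt"):
        list_path = root / list_name
        if list_path.is_file():
            lines = list_path.read_text(encoding="utf-8").splitlines()
            # Assumption: one sample per non-empty line.
            report[list_name] = sum(1 for line in lines if line.strip())
        else:
            report[list_name] = None  # list file not found
    return report
```

A count of 0 or None for train.txt means the LSTM training step has nothing to read, so it is cheaper to catch that here than after launching training.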

(2) Train AttentionLSTM

Go to the PaddleVideo-develop/applications/FootballAction/train_proposal/configs folder and check whether it contains a file named lstm_football.yaml. If it does not, copy the content below and save it in that folder as lstm_football.yaml:

MODEL: #MODEL field
    framework: "RecognizerAction"
    head:
        name: "ActionAttentionLstmHead"
        num_classes: 8
        feature_num: 2
        feature_dims: [2048, 1024]
        embedding_size: 512

DATASET: #DATASET field
    batch_size: 100
    num_workers: 4
    shuffle_valid: True
    train:
        format: "FeatureDataset"
        file_path: "./applications/FootballAction/datasets/EuroCup2016/input_for_lstm/train.txt" #Mandatory, train data index file path
    valid:
        format: "FeatureDataset"
        file_path: "./applications/FootballAction/datasets/EuroCup2016/input_for_lstm/val.txt" #Mandatory, valid data index file path
    test:
        format: "FeatureDataset"
        file_path: "./applications/FootballAction/datasets/EuroCup2016/input_for_lstm/val.txt" #Mandatory, test data index file path


PIPELINE: #PIPELINE field
    train:
        decode:
            name: "ActionFeatureDecoder"
            max_len: 300
            num_classes: 8
    valid:
        decode:
            name: "ActionFeatureDecoder"
            max_len: 300
            num_classes: 8
    test:
        decode:
            name: "ActionFeatureDecoder"
            max_len: 300
            num_classes: 8

OPTIMIZER: #OPTIMIZER field
    name: 'RMSProp'
    centered: True
    learning_rate:
        name: 'PiecewiseDecay'
        boundaries: [5, 10, 15]
        values: [0.00047, 0.000094, 0.0000188]
    weight_decay:
        name: 'L2'
        value: 8e-4

METRIC:
    name: 'HitOneMetric'
    num_class: 8
    top_k: 5

INFERENCE:
    name: 'AttentionLSTM_Inference_helper'
    num_classes: 8
    feature_num: 2
    feature_dims: [1024, 128]
    embedding_size: 512
    lstm_size: 1024

model_name: "AttentionLSTM"
log_interval: 20 #Optional, the interval of logger, default:10
epochs: 20 #Mandatory, total epoch
save_interval: 2
log_level: "INFO"
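To get a feel for what the OPTIMIZER block's learning-rate schedule does over the 20 epochs, here is a plain-Python sketch of one common reading of a piecewise schedule. The lookup rule in `piecewise_lr` is an assumption for illustration, not a statement about how PaddleVideo or Paddle implements PiecewiseDecay; only the boundaries and values are taken from the config.

```python
def piecewise_lr(epoch, boundaries, values):
    """Return the learning rate for a 0-based epoch under a piecewise schedule.

    Assumed rule: use values[i] for the first boundary that the epoch is
    below; once past the last boundary, keep the final value.
    """
    for boundary, value in zip(boundaries, values):
        if epoch < boundary:
            return value
    return values[-1]


# Numbers copied from the OPTIMIZER section of lstm_football.yaml.
BOUNDARIES = [5, 10, 15]
VALUES = [0.00047, 0.000094, 0.0000188]

# One learning rate per epoch for the 20 epochs set in the config.
schedule = [piecewise_lr(epoch, BOUNDARIES, VALUES) for epoch in range(20)]
```

Under this reading, because the config lists three values for three boundaries, the rate reaches its final value at epoch 10 and stays there for the rest of training.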
