LRCN_activity_recognition

http://www.eecs.berkeley.edu/~lisa_anne/LRCN_video

The authors use Caffe.

Steps to retrain the LRCN activity recognition models:

  1. Extract RGB frames: The script “extract_frames.sh” will convert UCF-101 .avi files to .jpg images. I extracted frames at 30 frames/second.
    Extract video frames.
  2. Compute flow frames: After downloading the code from [1], you can use “create_flow_images_LRCN.m” to compute flow frames. Example flow images for the video “YoYo_g25_c03” are here.
    Compute optical flow.
  3. Train single frame models: Finetune the hybrid model (found here) with video frames to train a single frame model. Use “run_singleFrame_RGB.sh” and “run_singleFrame_flow.sh” to train the RGB and flow models respectively. Make sure to change the “root_folder” param in “train_test_singleFrame_RGB.prototxt” and “train_test_singleFrame_flow.prototxt” as needed. The single frame models I trained can be found here.
    Train a baseline (single frame) model for RGB and for flow separately.
  4. Train LRCN models: Using the single frame models as a starting point, train the LRCN models by running “run_lstm_RGB.sh” and “run_lstm_flow.sh“. The data layer for the LRCN model is a python layer (“sequence_input_layer.py”). Make sure to set “WITH_PYTHON_LAYER := 1” in Makefile.config. Change the paths “flow_frames” and “RGB_frames” in “sequence_input_layer.py” as needed. The models I trained can be found here.
    Train an LRCN model for each modality.
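Step 1 shells out to ffmpeg via “extract_frames.sh”. A minimal sketch of the same job in Python (hypothetical directory layout; `ffmpeg` with its `fps` filter is assumed to be on PATH — this is not the original script):

```python
import os
import subprocess

def ffmpeg_command(avi_path, out_dir, fps=30):
    """Build the ffmpeg command that dumps JPEG frames from one .avi video."""
    name = os.path.splitext(os.path.basename(avi_path))[0]
    # One subdirectory per video, frames numbered 0001.jpg, 0002.jpg, ...
    pattern = os.path.join(out_dir, name, "%04d.jpg")
    return ["ffmpeg", "-i", avi_path, "-vf", "fps={}".format(fps), pattern]

def extract_frames(avi_path, out_dir, fps=30):
    """Extract frames at the given rate (30 fps matches the post above)."""
    name = os.path.splitext(os.path.basename(avi_path))[0]
    os.makedirs(os.path.join(out_dir, name), exist_ok=True)
    subprocess.run(ffmpeg_command(avi_path, out_dir, fps), check=True)
```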

run_singleFrame_RGB.sh

#!/bin/sh
TOOLS=../../build/tools

GLOG_logtostderr=1 $TOOLS/caffe train -solver singleFrame_solver_RGB.prototxt -weights caffe_imagenet_hyb2_wr_rc_solver_sqrt_iter_310000 
echo 'Done.'

GLOG_logtostderr=1 sends glog output to stderr. glog is a lightweight C++ logging library from Google; see its documentation for an introduction.
The solver file for singleFrame_RGB is as follows:

net: "train_test_singleFrame_RGB.prototxt"
test_iter: 75 
test_state: { stage: 'test-on-test' }
test_interval: 100
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 3000
display: 20
max_iter: 5000
momentum: 0.9
weight_decay: 0.005  
snapshot: 5000
snapshot_prefix: "snapshots_singleFrame_RGB"
solver_mode: GPU
device_id: 0 
random_seed: 1701
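With `lr_policy: "step"`, Caffe multiplies the learning rate by `gamma` every `stepsize` iterations, i.e. lr = base_lr * gamma^floor(iter / stepsize). A quick sketch of that schedule with the solver values above (so the rate drops by 10x once, at iteration 3000, before training ends at 5000):

```python
def step_lr(iteration, base_lr=0.001, gamma=0.1, stepsize=3000):
    """Caffe's "step" policy: lr = base_lr * gamma^floor(iter / stepsize)."""
    return base_lr * gamma ** (iteration // stepsize)
```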

run_singleFrame_flow.sh

Almost the same as the RGB script; only the solver parameters differ.

#!/bin/sh
TOOLS=../../build/tools

GLOG_logtostderr=1 $TOOLS/caffe train -solver singleFrame_solver_flow.prototxt -weights caffe_imagenet_hyb2_wr_rc_solver_sqrt_iter_310000 
echo 'Done.'

solver:

net: "train_test_singleFrame_flow.prototxt"
test_iter: 75 
test_state: { stage: 'test-on-test' }
test_interval: 100
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 20000
display: 20
max_iter: 50000
momentum: 0.9
weight_decay: 0.005  
snapshot: 5000
snapshot_prefix: "snapshots_singleFrame_flow"
solver_mode: GPU
device_id: 1 
random_seed: 1701

run_lstm_RGB.sh

#!/bin/bash

TOOLS=../../build/tools

export HDF5_DISABLE_VERSION_CHECK=1
export PYTHONPATH=.

GLOG_logtostderr=1  $TOOLS/caffe train -solver lstm_solver_RGB.prototxt -weights single_frame_all_layers_hyb_RGB_iter_5000.caffemodel  
echo "Done."

The LSTM model is trained starting from the single frame model.
Set PYTHONPATH, uncomment “WITH_PYTHON_LAYER := 1” in Makefile.config, and change the paths “flow_frames” and “RGB_frames” in “sequence_input_layer.py” as needed.
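Unlike the single frame data layer, “sequence_input_layer.py” feeds fixed-length clips of consecutive frames to the LSTM. A hypothetical sketch of the clip-sampling idea (not the actual layer; the 16-frame clip length follows the LRCN paper's activity recognition setup, and the padding strategy here is an assumption):

```python
import random

def sample_clip(frame_paths, clip_length=16):
    """Pick a random window of consecutive frame paths from one video."""
    if len(frame_paths) < clip_length:
        # Assumed handling of short videos: pad by repeating the last frame.
        frame_paths = frame_paths + [frame_paths[-1]] * (clip_length - len(frame_paths))
    start = random.randint(0, len(frame_paths) - clip_length)
    return frame_paths[start:start + clip_length]
```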

solver:

net: "train_test_lstm_RGB.prototxt"
test_iter: 100
test_state: { stage: 'test-on-test' }
test_interval: 100
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 10000
display: 20
max_iter: 30000
momentum: 0.9
weight_decay: 0.005
snapshot: 5000
snapshot_prefix: "snapshots_lstm_RGB"
solver_mode: GPU
device_id: 0
random_seed: 1701
average_loss: 1000
clip_gradients: 5
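`clip_gradients: 5` tells Caffe's solver to rescale the whole gradient whenever its global L2 norm exceeds 5, which helps keep LSTM training stable. A minimal sketch of that rescaling on a flat list of gradient values:

```python
import math

def clip_gradients(grads, clip_threshold=5.0):
    """Uniformly scale gradients down if their global L2 norm exceeds the threshold."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm > clip_threshold:
        scale = clip_threshold / norm
        return [g * scale for g in grads]
    return grads
```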

run_lstm_flow.sh

#!/bin/bash

TOOLS=../../build/tools

export HDF5_DISABLE_VERSION_CHECK=1
export PYTHONPATH=.
#for debugging python layer
GLOG_logtostderr=1  $TOOLS/caffe train -solver lstm_solver_flow.prototxt -weights single_frame_all_layers_hyb_flow_iter_50000.caffemodel  
echo "Done."

solver:

net: "train_test_lstm_flow.prototxt"
test_iter: 100
test_state: { stage: 'test-on-test' }
test_interval: 100
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 20000
display: 20
max_iter: 70000
momentum: 0.9
weight_decay: 0.005
snapshot: 5000
snapshot_prefix: "snapshots_lstm_flow"
solver_mode: GPU
device_id: 0
random_seed: 1701
average_loss: 1000
clip_gradients: 15
