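27-class gesture recognition with the TSM video understanding model and the 20bn-jester-v1 dataset

27-class gesture recognition with TSM-mobilenetv2 and the 20bn-jester-v1 dataset

27-class gesture recognition with TSM-resnet50 and the 20bn-jester-v1 dataset

27-class gesture recognition with TSM-resnet101 and the 20bn-jester-v1 dataset

Paper download: TSM: Temporal Shift Module for Efficient Video Understanding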

# TSM: Temporal Shift Module for Efficient Video Understanding [[Website]](https://hanlab.mit.edu/projects/tsm/) [[arXiv]](https://arxiv.org/abs/1811.08383)[[Demo]](https://www.youtube.com/watch?v=0T6u7S_gq-4)
Modified TSM code that runs with a single command: download link

20bn-jester-v1 dataset on Baidu Netdisk: download link

Recognition demo: 1346899

Gesture types: Doing other things, Drumming Fingers, No gesture, Pulling Hand In, Pulling Two Fingers In, Pushing Hand Away, Pushing Two Fingers Away, Rolling Hand Backward,
Rolling Hand Forward, Shaking Hand, Sliding Two Fingers Down, Sliding Two Fingers Left, Sliding Two Fingers Right, Sliding Two Fingers Up, Stop Sign, Swiping Down,
Swiping Left, Swiping Right, Swiping Up, Thumb Down, Thumb Up, Turning Hand Clockwise, Turning Hand Counterclockwise, Zooming In With Full Hand, Zooming In With Two Fingers,
Zooming Out With Full Hand, Zooming Out With Two Fingers.

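Extracting the 20bn-jester-v1 dataset: concatenate the archive parts and unpack them to obtain the video frames used for training.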

cat 20bn-jester-v1-?? | tar zx
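
After unpacking, it can help to confirm that every video folder actually contains frames before building the file lists. The following is a minimal sketch, assuming each video sits in its own sub-folder of `.jpg` frames; the function name `summarize_frames` is hypothetical and not part of the TSM code.

```python
import os


def summarize_frames(root):
    """Return (video_count, min_frames, max_frames) for the folders under root.

    Assumption: every sub-folder of root is one video whose frames are .jpg files.
    """
    counts = []
    for entry in sorted(os.listdir(root)):
        folder = os.path.join(root, entry)
        if not os.path.isdir(folder):
            continue  # ignore stray files next to the video folders
        frames = [f for f in os.listdir(folder) if f.lower().endswith(".jpg")]
        counts.append(len(frames))
    if not counts:
        return 0, 0, 0
    return len(counts), min(counts), max(counts)
```

Calling `summarize_frames` on the extracted frame folder returns the number of videos together with the shortest and longest frame counts; a minimum of 0 would point to an incomplete extraction.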


Building the 20bn-jester-v1 file lists: the script below generates category.txt, train_videofolder.txt and val_videofolder.txt.

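import os

# Folder that holds one sub-folder of extracted frames per video
data_dir = "/Users/xxx/Data/20bn-jester-v1/20bn-jester-v1/"
if __name__ == '__main__':
    # Common prefix of the annotation files: <prefix>-labels.csv, <prefix>-train.csv, <prefix>-validation.csv
    dataset_name = '/Users/xxx/Data/20bn-jester-v1/jester-v1'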
    with open('%s-labels.csv' % dataset_name) as f:
        lines = f.readlines()
    categories = []
    for line in lines:
        line = line.rstrip()
        categories.append(line)
    categories = sorted(categories)
    with open('20bn-jester-v1/category.txt', 'w') as f:
        f.write('\n'.join(categories))

    dict_categories = {}
    for i, category in enumerate(categories):
        dict_categories[category] = i

    files_input = ['%s-validation.csv' % dataset_name, '%s-train.csv' % dataset_name]
    files_output = ['20bn-jester-v1/val_videofolder.txt', '20bn-jester-v1/train_videofolder.txt']
    for (filename_input, filename_output) in zip(files_input, files_output):
        with open(filename_input) as f:
            lines = f.readlines()
        folders = []
        idx_categories = []
        for line in lines:
            line = line.rstrip()
            items = line.split(';')
            folders.append(items[0])
            idx_categories.append(dict_categories[items[1]])
        output = []
        for i in range(len(folders)):
            curFolder = folders[i]
            curIDX = idx_categories[i]
            # Count the frames in this video folder
            dir_files = os.listdir(os.path.join(data_dir, curFolder))
            # One record per line: "<folder path> <frame count> <class index>"
            output.append('%s %d %d' % (data_dir + curFolder, len(dir_files), curIDX))
            print('%d/%d' % (i + 1, len(folders)))
        with open(filename_output, 'w') as f:
            f.write('\n'.join(output))
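
Each line of the generated list files has the form `<folder path> <frame count> <class index>`. As a quick check before training, the lists can be parsed back and validated. This is a minimal sketch; the function name `parse_list_file` and the default of 27 classes (taken from the gesture list above) are assumptions for illustration.

```python
def parse_list_file(path, num_classes=27):
    """Parse a *_videofolder.txt file into (folder, num_frames, label) tuples.

    Raises ValueError when a record is malformed, has no frames,
    or carries a class index outside [0, num_classes).
    """
    records = []
    with open(path) as f:
        for line_no, line in enumerate(f, start=1):
            line = line.rstrip("\n")
            if not line:
                continue
            # Split from the right so folder paths containing spaces survive
            folder, num_frames, label = line.rsplit(" ", 2)
            num_frames, label = int(num_frames), int(label)
            if num_frames <= 0:
                raise ValueError("line %d: no frames in %s" % (line_no, folder))
            if not 0 <= label < num_classes:
                raise ValueError("line %d: bad class index %d" % (line_no, label))
            records.append((folder, num_frames, label))
    return records
```

For example, `len(parse_list_file('20bn-jester-v1/train_videofolder.txt'))` gives the number of training videos that were indexed.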

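Run the modified TSM source code. The modifications cover the dataset configuration options in ops/dataset_config.py and a bug that appears when running with the mobilenet-v2 backbone.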
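
The modified ops/dataset_config.py is not shown here. As a rough sketch only, assuming the file keeps one function per dataset that returns the category file, the two list files, the frame root and the frame filename pattern, a Jester entry might look like the following. The name `return_jester`, the `ROOT_DATASET` value, the `{:05d}.jpg` frame pattern and every path are assumptions chosen to match the files generated above; they should be adapted to the actual layout.

```python
# Hypothetical placeholder: folder that contains the 20bn-jester-v1/ list files
ROOT_DATASET = '/Users/xxx/Data/'


def return_jester(modality):
    """Return the file locations for the Jester dataset (illustrative sketch)."""
    filename_categories = '20bn-jester-v1/category.txt'
    if modality == 'RGB':
        # Assumed frame naming: 00001.jpg, 00002.jpg, ...
        prefix = '{:05d}.jpg'
        root_data = ROOT_DATASET + '20bn-jester-v1/20bn-jester-v1'
        filename_imglist_train = '20bn-jester-v1/train_videofolder.txt'
        filename_imglist_val = '20bn-jester-v1/val_videofolder.txt'
    else:
        # Only RGB frames are prepared in this walkthrough
        raise NotImplementedError('no such modality: ' + modality)
    return (filename_categories, filename_imglist_train,
            filename_imglist_val, root_data, prefix)
```

Keeping all dataset paths in one function means the training command only needs the dataset name (`jester`) and the modality (`RGB`).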

  python main.py jester RGB \
       --arch mobilenetv2 --num_segments 8 \
       --gd 20 --lr 0.02 --wd 1e-4 --lr_steps 20 40 --epochs 1 \
       --batch-size 32 -j 16 --dropout 0.5 --consensus_type=avg --eval-freq=1 \
       --shift --shift_div=8 --shift_place=blockres --npb
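
The `--shift` and `--shift_div=8` flags enable the temporal shift that gives TSM its name: along the time axis, 1/8 of the channels take their values from the next frame, another 1/8 from the previous frame, and the rest stay in place, with zeros filling the positions that have no neighbour. The following is a minimal pure-Python sketch of that idea, assuming each frame is reduced to a flat list of channel values (spatial dimensions omitted); it is an illustration, not the repository's implementation.

```python
def temporal_shift(frames, shift_div=8):
    """Shift part of the channels along the time axis.

    frames: list of T frames, each a list of C channel values.
    The first C // shift_div channels are read from the next frame,
    the following C // shift_div channels from the previous frame,
    and the remaining channels are copied unchanged.
    """
    t = len(frames)
    c = len(frames[0])
    fold = c // shift_div
    out = [[0] * c for _ in range(t)]  # zero padding at the clip boundaries
    for i in range(t):
        for ch in range(c):
            if ch < fold:
                src = i + 1   # value comes from the next frame
            elif ch < 2 * fold:
                src = i - 1   # value comes from the previous frame
            else:
                src = i       # channel is not shifted
            if 0 <= src < t:
                out[i][ch] = frames[src][ch]
    return out
```

With `shift_div=8`, a quarter of the channels exchange information between neighbouring frames while the operation itself adds no parameters or multiplications.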

 

 
