Representing Videos using Mid-level Discriminative Patches + CVPR13

Video representation based on mid-level discriminative patches

Mine these patches from training videos

Then use these patches as a discriminative vocabulary for action classification

Represent videos in terms of discriminative spatio-temporal patches rather than global feature vectors.
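To make the "patches as a vocabulary" idea concrete, here is a minimal sketch, not the paper's actual pipeline. It assumes each mined patch has been turned into a linear detector (a weight vector plus a bias), and that a video is encoded by the strongest response of each detector over all candidate patches in that video. The name `encode_video`, the max-pooling step, and the random data are assumptions made for illustration only.

```python
import numpy as np

def encode_video(patch_features, detectors, biases):
    """Encode one video as a vector of per-detector maximum responses.

    patch_features: (n_patches, d) descriptors of candidate spatio-temporal patches
    detectors:      (k, d) one linear detector per vocabulary patch (hypothetical)
    biases:         (k,) detector biases
    """
    # Score every candidate patch against every detector: shape (n_patches, k).
    scores = patch_features @ detectors.T + biases
    # Keep only the best-matching patch per detector: shape (k,).
    return scores.max(axis=0)

# Toy data standing in for real descriptors and learned detectors.
rng = np.random.default_rng(0)
patches = rng.normal(size=(50, 16))    # 50 candidate patches, 16-dim descriptors
detectors = rng.normal(size=(5, 16))   # a vocabulary of 5 patch detectors
biases = np.zeros(5)

video_vector = encode_video(patches, detectors, biases)
print(video_vector.shape)
```

The resulting fixed-length vector can be passed to any standard classifier. Keeping the index of the best-matching patch (via `argmax`) as well would retain where each vocabulary patch fired in the video.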


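Video representation methods fall into roughly three categories:

1. Global spatio-temporal templates

2. Bag-of-features models (spatio-temporal interest points, dense interest points, ...): well suited to classification, but less suited to action detection

3. Decomposing the video into patches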
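As a rough illustration of the second category, the sketch below builds a bag-of-features histogram with NumPy. It assumes local descriptors have already been extracted and that a codebook of visual words is available (for example from k-means); the function name `bag_of_features` and the toy values are hypothetical. Note that the histogram keeps only word counts and discards where each descriptor occurred, which is one way to see why this representation suits classification better than detection.

```python
import numpy as np

def bag_of_features(descriptors, codebook):
    """Quantize descriptors to their nearest codeword and return a normalized histogram.

    descriptors: (n, d) local descriptors from one video
    codebook:    (k, d) visual words (assumed to be learned beforehand)
    """
    # Squared Euclidean distance from every descriptor to every codeword: (n, k).
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    # Index of the nearest codeword for each descriptor.
    words = d2.argmin(axis=1)
    # Count occurrences of each word; positions in space and time are lost here.
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / max(hist.sum(), 1.0)

codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
descriptors = np.array([[0.0, 1.0], [9.0, 9.0], [11.0, 10.0], [1.0, 0.0]])
hist = bag_of_features(descriptors, codebook)
print(hist)
```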


Mining discriminative patches...
