Video Highlight Extraction - Research Notes

Approach 1: Use the subtitles or the audio track to find the dialogue-heavy parts

- Extract the audio track:

ffmpeg -i a.mp4 -map 0:a:0 a.mp3

- Extract the RMS level frame by frame:

ffmpeg -i in.mp3 -af astats=metadata=1:reset=1,ametadata=print:key=lavfi.astats.Overall.RMS_level:file=log.txt -f null -

Determining audio level peaks with ffmpeg

https://superuser.com/questions/1183663/determining-audio-level-peaks-with-ffmpeg
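
The log.txt written by the command above alternates a frame line (with pts_time) and a line holding the RMS level in dB. A minimal sketch of turning that log into candidate loud spans follows. The function names, the -30 dB threshold and the 1-second minimum duration are illustrative assumptions, not values from the original notes.

```python
import re

# Each frame in the ametadata log is announced by a line containing pts_time.
FRAME_RE = re.compile(r"pts_time:(\S+)")
RMS_KEY = "lavfi.astats.Overall.RMS_level="


def parse_rms_log(text):
    """Return a list of (time_in_seconds, rms_db) pairs from the log text."""
    samples = []
    current_time = None
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            current_time = float(match.group(1))
        elif line.startswith(RMS_KEY) and current_time is not None:
            # Silent frames are reported as "-inf", which float() accepts.
            samples.append((current_time, float(line[len(RMS_KEY):])))
    return samples


def loud_segments(samples, threshold_db=-30.0, min_duration=1.0):
    """Group consecutive samples at or above threshold_db into (start, end) spans.

    threshold_db and min_duration are assumed values and need tuning per video.
    """
    segments = []
    start = None
    last = None
    for time, rms in samples:
        if rms >= threshold_db:
            if start is None:
                start = time
            last = time
        elif start is not None:
            if last - start >= min_duration:
                segments.append((start, last))
            start = None
    # Close a span that runs to the end of the log.
    if start is not None and last - start >= min_duration:
        segments.append((start, last))
    return segments
```

Typical use would be loud_segments(parse_rms_log(open("log.txt").read())). Loudness is only a rough proxy for dialogue, so the spans are candidates to review rather than final highlights.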

- Analyze the volume of the whole file:

ffmpeg -i input.wav -filter:a volumedetect -f null /dev/null

https://trac.ffmpeg.org/wiki/AudioVolume

https://ffmpeg.org/ffmpeg-filters.html#volumedetect 
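
volumedetect prints its summary (mean_volume, max_volume and a histogram) to stderr. A small sketch of collecting the two headline numbers from Python follows. The helper names are hypothetical, and it assumes ffmpeg is on PATH.

```python
import re
import subprocess

# Matches lines such as:
# [Parsed_volumedetect_0 @ 0x55d0c0] mean_volume: -27.0 dB
VOLUME_RE = re.compile(r"(mean_volume|max_volume):\s*(-?(?:inf|[0-9.]+)) dB")


def parse_volumedetect(stderr_text):
    """Return {'mean_volume': dB, 'max_volume': dB} found in ffmpeg's stderr."""
    return {name: float(value) for name, value in VOLUME_RE.findall(stderr_text)}


def measure_volume(path):
    """Run volumedetect on path and return the parsed levels (needs ffmpeg)."""
    result = subprocess.run(
        ["ffmpeg", "-hide_banner", "-i", path,
         "-filter:a", "volumedetect", "-f", "null", "-"],
        capture_output=True, text=True, check=True,
    )
    return parse_volumedetect(result.stderr)
```

One possible use is to set the per-frame loudness threshold relative to mean_volume instead of hard-coding a dB value.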

- Cut out a segment (stream copy, no re-encoding):

ffmpeg -ss $ss -t 00:05:00 -i $vfile.mp4 -vcodec copy -acodec copy -y $vfile.${ss//:/_}.mp4

https://stackoverflow.com/questions/21420296/how-to-extract-time-accurate-video-segments-with-ffmpeg
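
To cut many segments in a batch, the same command can be assembled in Python. This sketch mirrors the shell line above, including the ${ss//:/_} substitution in the output name. The function names are made up for illustration, and the 5-minute default just repeats the value used above.

```python
import os


def to_timestamp(seconds):
    """Format a number of seconds as HH:MM:SS.mmm for ffmpeg's -ss option."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3600 * 1000)
    minutes, rest = divmod(rest, 60 * 1000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def clip_command(video_path, start, duration="00:05:00"):
    """Build the ffmpeg argument list for a stream-copied clip.

    The output name is the input name with the start time inserted and
    its colons replaced by underscores, like ${ss//:/_} in the shell line.
    """
    stem, ext = os.path.splitext(video_path)
    out_path = f"{stem}.{start.replace(':', '_')}{ext}"
    return [
        "ffmpeg", "-ss", start, "-t", duration, "-i", video_path,
        "-vcodec", "copy", "-acodec", "copy", "-y", out_path,
    ]
```

The list can be passed to subprocess.run(cmd, check=True). Because the streams are copied rather than re-encoded, the cut can only start on a keyframe, so the actual start may differ slightly from the requested time.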

 

Troubleshooting:

ffmpeg volumedetect returns unstable result

https://stackoverflow.com/questions/48673923/ffmpeg-volumedetect-returns-unstable-result

 

Approach 2: Approach 1 plus shot boundary detection

Installing OpenCV: https://www.cnblogs.com/yaoyaohust/p/10228888.html

Shot boundary detection: https://www.cnblogs.com/lynsyklate/p/7840881.html

Hecate, Yahoo's open-source tool: https://github.com/yahoo/hecate
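
A rough sketch of how the two signals could be combined: detect hard cuts from frame-to-frame histogram differences, then widen each loud audio span to the surrounding cuts so a clip does not start or end mid-shot. This is a generic histogram-difference heuristic, not necessarily the method used in the linked post or in Hecate. It assumes the per-frame color histograms were already computed elsewhere (for example with OpenCV) and normalized to sum to 1. The 0.5 threshold is an arbitrary illustrative value.

```python
from bisect import bisect_left, bisect_right


def histogram_distance(hist_a, hist_b):
    """Sum of absolute bin differences; ranges from 0 to 2 for normalized input."""
    return sum(abs(a - b) for a, b in zip(hist_a, hist_b))


def detect_cuts(histograms, fps, threshold=0.5):
    """Return times (seconds) of frames that differ sharply from the previous one."""
    cuts = []
    for index in range(1, len(histograms)):
        if histogram_distance(histograms[index - 1], histograms[index]) > threshold:
            cuts.append(index / fps)
    return cuts


def snap_to_cuts(segment, cuts):
    """Widen (start, end) to the last cut at or before start and the first cut
    at or after end. A side with no such cut is left unchanged.
    cuts must be sorted in ascending order."""
    start, end = segment
    before = bisect_right(cuts, start)   # number of cuts <= start
    after = bisect_left(cuts, end)       # index of the first cut >= end
    new_start = cuts[before - 1] if before > 0 else start
    new_end = cuts[after] if after < len(cuts) else end
    return (new_start, new_end)
```

Gradual transitions such as fades and dissolves will mostly slip past a single-threshold test like this one, which is one reason to look at a dedicated tool.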

 

Approach 3: More time-consuming and technically harder methods

A brief introduction to and analysis of Baidu's BROAD-Video Highlights dataset

https://zhuanlan.zhihu.com/p/31770408

 

Temporal Action Detection: a roundup of 2017 conference papers

https://zhuanlan.zhihu.com/p/31501316

 

Video analysis topics explained: Temporal Action Detection

https://zhuanlan.zhihu.com/p/26603387

 

Video analysis topics explained: Action Recognition

https://zhuanlan.zhihu.com/p/26460437

 

Temporal Action Detection with Structured Segment Networks

From Dahua Lin's group (The Chinese University of Hong Kong)

https://github.com/yjxiong/action-detection

Built on PyTorch and DenseFlow

 

UntrimmedNets for Weakly Supervised Action Recognition and Detection

From Dahua Lin's group (The Chinese University of Hong Kong)

https://github.com/wanglimin/UntrimmedNet

https://github.com/yjxiong/caffe/tree/untrimmednet

Built on Caffe

 

Reposted from: https://www.cnblogs.com/yaoyaohust/p/10986934.html
