Approach 1: Find the dialogue-heavy parts from the subtitles or the audio track
- Extract the audio track:
ffmpeg -i a.mp4 -map 0:a:0 a.mp3
- Extract the RMS level of each audio frame:
ffmpeg -i in.mp3 -af astats=metadata=1:reset=1,ametadata=print:key=lavfi.astats.Overall.RMS_level:file=log.txt -f null -
Determining audio level peaks with ffmpeg
https://superuser.com/questions/1183663/determining-audio-level-peaks-with-ffmpeg
- Analyze the volume of the whole file:
ffmpeg -i input.wav -filter:a volumedetect -f null /dev/null
https://trac.ffmpeg.org/wiki/AudioVolume
https://ffmpeg.org/ffmpeg-filters.html#volumedetect
- Cut out a segment:
ffmpeg -ss "$ss" -t 00:05:00 -i "$vfile.mp4" -vcodec copy -acodec copy -y "$vfile.${ss//:/_}.mp4"
https://stackoverflow.com/questions/21420296/how-to-extract-time-accurate-video-segments-with-ffmpeg
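The steps above can be tied together with a small script. The following is a minimal sketch, not a definitive implementation: it assumes log.txt contains two lines per audio frame (a "frame:... pts:... pts_time:..." line followed by a "lavfi.astats.Overall.RMS_level=..." line), and it treats a frame as "active" when its RMS level exceeds a threshold. The function names, the -30 dB threshold, and the 5-minute window are illustrative assumptions, not part of ffmpeg.

```python
import re
from bisect import bisect_left

# Assumed log layout written by ametadata=print (two lines per audio frame):
#   frame:0    pts:0       pts_time:0
#   lavfi.astats.Overall.RMS_level=-34.1
TIME_RE = re.compile(r"pts_time:(\S+)")
RMS_RE = re.compile(r"lavfi\.astats\.Overall\.RMS_level=(\S+)")


def parse_rms_log(text):
    """Return a list of (time_in_seconds, rms_db) pairs from the log text."""
    samples = []
    current_time = None
    for line in text.splitlines():
        time_match = TIME_RE.search(line)
        if time_match:
            current_time = float(time_match.group(1))
            continue
        rms_match = RMS_RE.search(line)
        if rms_match and current_time is not None:
            # Silent frames are logged as "-inf", which float() accepts.
            samples.append((current_time, float(rms_match.group(1))))
    return samples


def loudest_window_start(samples, window_s=300.0, threshold_db=-30.0, step_s=1.0):
    """Return the start (seconds) of the window with the most active frames.

    The threshold and window length are illustrative assumptions; ties are
    resolved in favour of the earliest window.
    """
    if not samples:
        return 0.0
    active = sorted(t for t, db in samples if db > threshold_db)
    end = max(t for t, _ in samples)
    n_steps = int(max(end - window_s, 0.0) // step_s)
    best_start, best_count = 0.0, -1
    for i in range(n_steps + 1):
        start = i * step_s
        # Count active frames with start <= t < start + window_s.
        count = bisect_left(active, start + window_s) - bisect_left(active, start)
        if count > best_count:
            best_start, best_count = start, count
    return best_start


def format_hms(seconds):
    """Format seconds as HH:MM:SS, the form used for -ss above."""
    total = int(seconds)
    return "%02d:%02d:%02d" % (total // 3600, total % 3600 // 60, total % 60)
```

For example, format_hms(loudest_window_start(parse_rms_log(open("log.txt").read()))) could supply the value of $ss for the cut command. High RMS is only a rough proxy for dialogue, since music and effects are loud too.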
Debugging:
ffmpeg volumedetect returns unstable result
https://stackoverflow.com/questions/48673923/ffmpeg-volumedetect-returns-unstable-result
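When comparing volumedetect results across runs, it may help to turn the summary into numbers first. The sketch below assumes the filter prints lines of the form "[Parsed_volumedetect_0 @ 0x...] mean_volume: -16.0 dB" on stderr; parse_volumedetect and mean_volume_spread are hypothetical helper names, not ffmpeg APIs.

```python
import re

# Matches the two summary lines; histogram_* and n_samples lines are ignored.
VOLUME_RE = re.compile(r"(mean_volume|max_volume):\s*(-?[0-9.]+|-?inf)\s*dB")


def parse_volumedetect(stderr_text):
    """Return a dict with the mean_volume / max_volume values found, in dB."""
    return {key: float(value) for key, value in VOLUME_RE.findall(stderr_text)}


def mean_volume_spread(stderr_texts):
    """Return max minus min of mean_volume over several captured runs."""
    means = []
    for text in stderr_texts:
        parsed = parse_volumedetect(text)
        if "mean_volume" in parsed:
            means.append(parsed["mean_volume"])
    return max(means) - min(means) if means else 0.0
```

A spread well above 0 dB between runs on the same input would confirm the instability before digging into its cause.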
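Approach 2: Approach 1 plus shot boundary detection
Installing OpenCV: https://www.cnblogs.com/yaoyaohust/p/10228888.html
Shot boundary detection: https://www.cnblogs.com/lynsyklate/p/7840881.html
Yahoo's open-source tool Hecate: https://github.com/yahoo/hecate
Approach 3: More time-consuming and technically harder methods
A brief introduction to and analysis of Baidu's BROAD-Video Highlights dataset of video highlight clips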
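One common way to detect shot boundaries is to compare the intensity histograms of consecutive frames and flag a cut when the difference is large. The sketch below illustrates that idea only; it is not the method of the linked article or of Hecate. Frames are assumed to be already decoded into flat lists of 0-255 grayscale values (for example with OpenCV), and the 16 bins and 0.5 threshold are illustrative assumptions.

```python
def gray_histogram(frame, bins=16):
    """Normalized histogram of a frame given as a flat list of 0-255 values."""
    counts = [0] * bins
    for value in frame:
        counts[min(value * bins // 256, bins - 1)] += 1
    total = float(len(frame)) or 1.0
    return [c / total for c in counts]


def histogram_distance(h1, h2):
    """Half the L1 distance between two histograms, so the result is in [0, 1]."""
    return 0.5 * sum(abs(a - b) for a, b in zip(h1, h2))


def detect_shot_boundaries(frames, threshold=0.5):
    """Return the indices of frames that start a new shot."""
    boundaries = []
    previous = None
    for index, frame in enumerate(frames):
        current = gray_histogram(frame)
        if previous is not None and histogram_distance(previous, current) > threshold:
            boundaries.append(index)
        previous = current
    return boundaries


def snap_to_boundary(start_frame, boundaries):
    """Move a segment start back to the nearest boundary at or before it."""
    earlier = [b for b in boundaries if b <= start_frame]
    return max(earlier) if earlier else 0
```

snap_to_boundary is how this would combine with Approach 1: the start chosen from the audio analysis is moved back to the beginning of the shot it falls in, so the clip does not open mid-shot. A fixed threshold misses gradual transitions such as fades and dissolves.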
https://zhuanlan.zhihu.com/p/31770408
A roundup of 2017 conference papers on Temporal Action Detection
https://zhuanlan.zhihu.com/p/31501316
Video Analysis field overview: Temporal Action Detection
https://zhuanlan.zhihu.com/p/26603387
Video Analysis field overview: Action Recognition
https://zhuanlan.zhihu.com/p/26460437
Temporal Action Detection with Structured Segment Networks
From Dahua Lin's team (The Chinese University of Hong Kong)
https://github.com/yjxiong/action-detection
Based on PyTorch and DenseFlow
UntrimmedNets for Weakly Supervised Action Recognition and Detection
From Dahua Lin's team (The Chinese University of Hong Kong)
https://github.com/wanglimin/UntrimmedNet
https://github.com/yjxiong/caffe/tree/untrimmednet
Based on Caffe