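I recently went through several optical flow algorithms; here is how they compare.
1. Farneback algorithm, proposed in 2003, implemented in OpenCV.
2. Brox algorithm, proposed in 2004, implemented in OpenCV.
3. SF (SimpleFlow) algorithm, proposed in 2012, implemented in OpenCV.
4. FlowNet 2.0, proposed in 2016, which uses a deep convolutional neural network to generate optical flow images.
First, a comparison of the algorithms.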
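Algorithm | Year proposed | Running speed | Notes |
---|---|---|---|
Farneback | 2003 | Fairly fast | - |
Brox | 2004 | Fairly fast | Requires a GPU |
SF | 2012 | Extremely slow | - |
FlowNet 2.0 | 2016 | Fairly fast | Requires a GPU and a Docker environment; the generated .flo files must be converted to a usable format such as jpg or png |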
Original frames (10 frames in total; the first frame is omitted):
Optical flow images from the four algorithms (9 flow images):
A few larger images for comparison (three original frames, two optical flow images):
For Farneback, the OpenCV interface function is:
calcOpticalFlowFarneback(InputArray prev, InputArray next, InputOutputArray flow, double pyr_scale,
int levels, int winsize, int iterations, int poly_n, double poly_sigma, int flags)
Interface documentation:
Parameters:
prev – first 8-bit single-channel input image.
next – second input image of the same size and the same type as prev.
flow – computed flow image that has the same size as prev and type CV_32FC2.
pyr_scale – parameter, specifying the image scale (<1) to build pyramids for each image; pyr_scale=0.5 means a classical pyramid, where each next layer is twice smaller than the previous one.
levels – number of pyramid layers including the initial image; levels=1 means that no extra layers are created and only the original images are used.
winsize – averaging window size; larger values increase the algorithm robustness to image noise and give more chances for fast motion detection, but yield more blurred motion field.
iterations – number of iterations the algorithm does at each pyramid level.
poly_n – size of the pixel neighborhood used to find polynomial expansion in each pixel; larger values mean that the image will be approximated with smoother surfaces, yielding more robust algorithm and more blurred motion field, typically poly_n =5 or 7.
poly_sigma – standard deviation of the Gaussian that is used to smooth derivatives used as a basis for the polynomial expansion; for poly_n=5, you can set poly_sigma=1.1, for poly_n=7, a good value would be poly_sigma=1.5.
flags – operation flags that can be a combination of the following:
OPTFLOW_USE_INITIAL_FLOW: uses the input flow as an initial flow approximation.
OPTFLOW_FARNEBACK_GAUSSIAN: uses the Gaussian *winsize x winsize* filter instead of a box filter of the same size for optical flow estimation; usually, this option gives a more accurate flow than with a box filter, at the cost of lower speed; normally, winsize for a Gaussian window should be set to a larger value to achieve the same level of robustness.
Note: no extra configuration is needed; a working OpenCV environment is enough.
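To make the pyramid parameters concrete, here is a minimal Python sketch. The helper names `pyramid_sizes` and `farneback_flow` are my own, not part of OpenCV, and the rounding of layer sizes is an assumption for illustration rather than what OpenCV does internally. The call itself assumes the OpenCV 3+ Python bindings, where `flow` is the third argument; older bindings may order the arguments differently. The parameter values are ordinary starting points, not tuned settings.

```python
def pyramid_sizes(width, height, pyr_scale=0.5, levels=3):
    """Approximate image size at each pyramid layer (layer 0 is the original).

    Illustration only: the rounding here is an assumption, not OpenCV's exact rule.
    """
    if not 0 < pyr_scale < 1:
        raise ValueError("pyr_scale must be in (0, 1)")
    if levels < 1:
        raise ValueError("levels must be >= 1")
    sizes = []
    for k in range(levels):
        scale = pyr_scale ** k
        sizes.append((max(1, round(width * scale)), max(1, round(height * scale))))
    return sizes


def farneback_flow(prev_path, next_path):
    """Dense flow between two image files; returns an HxWx2 float32 array."""
    import cv2  # imported lazily so pyramid_sizes works without OpenCV installed

    prev = cv2.imread(prev_path, 0)  # flag 0 loads as 8-bit single-channel
    nxt = cv2.imread(next_path, 0)
    if prev is None or nxt is None:
        raise IOError("could not read input images")
    # Arguments after the images: flow, pyr_scale, levels, winsize,
    # iterations, poly_n, poly_sigma, flags
    return cv2.calcOpticalFlowFarneback(prev, nxt, None, 0.5, 3, 15, 3, 5, 1.1, 0)
```

For a 640x480 frame with `pyr_scale=0.5` and `levels=3`, the helper lists the original size plus two half-size steps, which matches the documentation's statement that `levels` includes the initial image.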
For Brox, OpenCV provides a class rather than a function: create an object, then call its member function.
cv::gpu::BroxOpticalFlow(float alpha_, float gamma_, float scale_factor_, int inner_iterations_,
int outer_iterations_, int solver_iterations_)
Interface documentation:
//! flow smoothness
float alpha;
//! gradient constancy importance
float gamma;
//! pyramid scale factor
float scale_factor;
//! number of lagged non-linearity iterations (inner loop)
int inner_iterations;
//! number of warping iterations (number of pyramid levels)
int outer_iterations;
//! number of linear system solver iterations
int solver_iterations;
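Note: this algorithm runs on the GPU, so it needs an OpenCV build compiled with GPU support. For building OpenCV with GPU support and the related configuration, see my earlier blog post.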
For SF, the OpenCV interface functions are:
void calcOpticalFlowSF(Mat& from, Mat& to, Mat& flow, int layers, int averaging_block_size, int max_flow)
void calcOpticalFlowSF(Mat& from, Mat& to, Mat& flow, int layers, int averaging_block_size, int max_flow, double sigma_dist, double sigma_color,
int postprocess_window, double sigma_dist_fix, double sigma_color_fix, double occ_thr, int upscale_averaging_radius,
double upscale_sigma_dist, double upscale_sigma_color, double speed_up_thr)
Interface documentation:
prev – First 8-bit 3-channel image.
next – Second 8-bit 3-channel image
flowX – X-coordinate of estimated flow
flowY – Y-coordinate of estimated flow
layers – Number of layers
averaging_block_size – Size of block through which we sum up when calculate cost function for pixel
max_flow – maximal flow that we search at each level
sigma_dist – vector smooth spatial sigma parameter
sigma_color – vector smooth color sigma parameter
postprocess_window – window size for postprocess cross bilateral filter
sigma_dist_fix – spatial sigma for postprocess cross bilateral filter
sigma_color_fix – color sigma for postprocess cross bilateral filter
occ_thr – threshold for detecting occlusions
upscale_averaging_radius – window size for bilateral upscale operation
upscale_sigma_dist – spatial sigma for bilateral upscale operation
upscale_sigma_color – color sigma for bilateral upscale operation
speed_up_thr – threshold to detect point with irregular flow - where flow should be recalculated after upscale
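Note: no extra configuration is needed; a working OpenCV environment is enough. However, in version 2.4.9, which I am using, this interface is implemented in C++. Python also exposes the function (although the official documentation does not mention it), but calling it had no effect, so it is presumably declared without being implemented.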
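FlowNet 2.0 was proposed in the 2016 paper "FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks" and uses a deep convolutional neural network to generate optical flow images. The generated flow is stored in .flo format and must be converted to png, jpg, or a similar format before use.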
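The conversion step can be sketched in plain Python. This assumes the .flo files follow the Middlebury layout (a float32 check value of 202021.25, then int32 width and height, then interleaved float32 u, v values in row-major order, all little-endian). The function names `read_flo` and `flow_to_rgb` are hypothetical, and the colour mapping (direction as hue, magnitude as brightness) is one common choice, not necessarily the one used for the images in this post. Writing the resulting pixels to png or jpg is left to whatever image library is at hand.

```python
import colorsys
import math
import struct

FLO_MAGIC = 202021.25  # check value at the start of a Middlebury-style .flo file


def read_flo(path):
    """Return (width, height, flow), where flow[y][x] is a (u, v) tuple."""
    with open(path, "rb") as f:
        (magic,) = struct.unpack("<f", f.read(4))
        if magic != FLO_MAGIC:
            raise ValueError("not a valid .flo file")
        width, height = struct.unpack("<ii", f.read(8))
        count = 2 * width * height
        data = struct.unpack("<%df" % count, f.read(4 * count))
    flow = []
    for y in range(height):
        row = []
        for x in range(width):
            i = 2 * (y * width + x)
            row.append((data[i], data[i + 1]))
        flow.append(row)
    return width, height, flow


def flow_to_rgb(flow):
    """Map flow direction to hue and magnitude to brightness.

    Returns rows of (r, g, b) byte tuples, normalised by the largest magnitude.
    """
    max_mag = max((math.hypot(u, v) for row in flow for u, v in row), default=0.0) or 1.0
    image = []
    for row in flow:
        out = []
        for u, v in row:
            hue = ((math.atan2(v, u) + math.pi) / (2 * math.pi)) % 1.0
            value = math.hypot(u, v) / max_mag
            r, g, b = colorsys.hsv_to_rgb(hue, 1.0, value)
            out.append((int(r * 255), int(g * 255), int(b * 255)))
        image.append(out)
    return image
```

Pixels with zero motion come out black and the fastest-moving pixel is fully bright, so the result is only comparable within a single frame pair unless a fixed normalisation is used instead.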