shape_based_matching lineMod开源代码学习

github开源代码地址:

https://github.com/meiqua/shape_based_matching

针对匹配精度问题,原作者采用了sub-pixel + icp的方法进行了最后的finetune,涉及到的相关原理可以查看:亚像素边缘提取与ICP2D的理解 - 知乎

涉及到的论文:

[1] Carsten Steger:Unbiased extraction of curvilinear structures from 2D and 3D images.

[2] Linear Least-Squares Optimization for Point-to-Plane ICP Surface Registration

https://download.csdn.net/download/lipeng19930407/87932688

简单记录一下查看代码过程中存在疑惑的点:

1、linemod论文中,梯度扩散是以当前像素为中心点,从r范围的邻域进行梯度扩散。

shape_based_matching lineMod开源代码学习_第1张图片

 而代码中的实现:

/**
 * \brief Spread binary labels in a quantized image.
 *
 * Implements section 2.3 "Spreading the Orientations."
 *
 * \param[in]  src The source 8-bit quantized image.
 * \param[out] dst Destination 8-bit spread image.
 * \param      T   Sampling step. Spread labels T/2 pixels in each direction.
 */
static void spread(const Mat& src, Mat& dst, int T)
{
  // Allocate and zero-initialize spread (OR'ed) image
  dst = Mat::zeros(src.size(), CV_8U);

  // Fill in spread gradient image (section 2.3)
  for (int r = 0; r < T; ++r)
  {
    int height = src.rows - r;
    for (int c = 0; c < T; ++c)
    {
      orUnaligned8u(&src.at(r, c), static_cast(src.step1()), dst.ptr(),
                    static_cast(dst.step1()), src.cols - c, height);
    }
  }
}

此处的实现,目标点位将其右下[0,T]范围内的梯度都进行了或操作,即某梯度只向其左上方T大小的范围内进行了扩散,与下文中的T间隔采样暗合,

shape_based_matching lineMod开源代码学习_第2张图片类似于粗略匹配,且针对噪音或微小形变具备较好的鲁棒性,但因此匹配精度上有所欠缺,才有了后续的icp+subpixel finetune。

2、similarity计算的位置点数

代码如下:

/**
 * \brief Compute similarity measure for a given template at each sampled image location.
 *
 * Uses linear memories to compute the similarity measure as described in Fig. 7.
 *
 * \param[in]  linear_memories Vector of 8 linear memories, one for each label.
 * \param[in]  templ           Template to match against.
 * \param[out] dst             Destination 8-bit similarity image of size (W/T, H/T).
 * \param      size            Size (W, H) of the original input image.
 * \param      T               Sampling step.
 */
static void similarity(const std::vector& linear_memories, const Template& templ,
                Mat& dst, Size size, int T)
{
  // 63 features or less is a special case because the max similarity per-feature is 4.
  // 255/4 = 63, so up to that many we can add up similarities in 8 bits without worrying
  // about overflow. Therefore here we use _mm_add_epi8 as the workhorse, whereas a more
  // general function would use _mm_add_epi16.
  CV_Assert(templ.features.size() <= 63);
  /// @todo Handle more than 255/MAX_RESPONSE features!!

  // Decimate input image size by factor of T
  int W = size.width / T;
  int H = size.height / T;

  // Feature dimensions, decimated by factor T and rounded up
  int wf = (templ.width - 1) / T + 1;
  int hf = (templ.height - 1) / T + 1;

  // Span is the range over which we can shift the template around the input image
  int span_x = W - wf;
  int span_y = H - hf;

  // Compute number of contiguous (in memory) pixels to check when sliding feature over
  // image. This allows template to wrap around left/right border incorrectly, so any
  // wrapped template matches must be filtered out!
  int template_positions = span_y * W + span_x + 1; // why add 1?
  //int template_positions = (span_y - 1) * W + span_x; // More correct?

  /// @todo In old code, dst is buffer of size m_U. Could make it something like
  /// (span_x)x(span_y) instead?
  dst = Mat::zeros(H, W, CV_8U);
  uchar* dst_ptr = dst.ptr();

#if CV_SSE2
  volatile bool haveSSE2 = checkHardwareSupport(CV_CPU_SSE2);
#if CV_SSE3
  volatile bool haveSSE3 = checkHardwareSupport(CV_CPU_SSE3);
#endif
#endif

  // Compute the similarity measure for this template by accumulating the contribution of
  // each feature
  for (int i = 0; i < (int)templ.features.size(); ++i)
  {
    // Add the linear memory at the appropriate offset computed from the location of
    // the feature in the template
    Feature f = templ.features[i];
    // Discard feature if out of bounds
    /// @todo Shouldn't actually see x or y < 0 here?
    if (f.x < 0 || f.x >= size.width || f.y < 0 || f.y >= size.height)
      continue;
    const uchar* lm_ptr = accessLinearMemory(linear_memories, f, T, W);

    // Now we do an aligned/unaligned add of dst_ptr and lm_ptr with template_positions elements
    int j = 0;
    // Process responses 16 at a time if vectorization possible
#if CV_SSE2
#if CV_SSE3
    if (haveSSE3)
    {
      // LDDQU may be more efficient than MOVDQU for unaligned load of next 16 responses
      for ( ; j < template_positions - 15; j += 16)
      {
        __m128i responses = _mm_lddqu_si128(reinterpret_cast(lm_ptr + j));
        __m128i* dst_ptr_sse = reinterpret_cast<__m128i*>(dst_ptr + j);
        *dst_ptr_sse = _mm_add_epi8(*dst_ptr_sse, responses);
      }
    }
    else
#endif
    if (haveSSE2)
    {
      // Fall back to MOVDQU
      for ( ; j < template_positions - 15; j += 16)
      {
        __m128i responses = _mm_loadu_si128(reinterpret_cast(lm_ptr + j));
        __m128i* dst_ptr_sse = reinterpret_cast<__m128i*>(dst_ptr + j);
        *dst_ptr_sse = _mm_add_epi8(*dst_ptr_sse, responses);
      }
    }
#endif
    for ( ; j < template_positions; ++j)
      dst_ptr[j] = uchar(dst_ptr[j] + lm_ptr[j]);
  }
}

模板也是T间隔滑动后,在采样点位置计算similarity的。这样可以充分使用 Spread T 以及 Restructuring the way the response images S 所带来的便利,使得匹配速度更快。

以上是阅读论文+代码时存在疑惑,记录一下

你可能感兴趣的:(学习)