Recognizing Music-Score Symbols with OpenCV ORB Feature Detection

Introduction to the ORB Algorithm

I've recently been studying OpenCV, so here is an update on my score-recognition work. First, the key operation: using ORB features from Python.
The parameters of the ORB algorithm are set with the ORB_create() function. The parameters of ORB_create() and their default values are shown below:

cv2.ORB_create(nfeatures = 500,
               scaleFactor = 1.2,
               nlevels = 8,
               edgeThreshold = 31,
               firstLevel = 0,
               WTA_K = 2,
               scoreType = HARRIS_SCORE,
               patchSize = 31,
               fastThreshold = 20)

Parameter descriptions (a sketch that sets all of them explicitly follows this list):

  • nfeatures - int
    The maximum number of features (keypoints) to locate.

  • scaleFactor - float
    The pyramid decimation ratio; must be greater than 1. ORB uses an image pyramid to find features, so you must supply the scale factor between pyramid levels and the number of levels the pyramid has. scaleFactor = 2 corresponds to a classical pyramid, where each successive level has 4x fewer pixels than the previous one. A large scale factor reduces the number of features found.

  • nlevels - int
    The number of pyramid levels. The smallest level has a linear size equal to input_image_linear_size / pow(scaleFactor, nlevels - firstLevel).

  • edgeThreshold - int
    The size of the border in which no features are detected. Since keypoints cover a patch of a specific size, the edges of the image must be excluded from the search. edgeThreshold should be equal to or greater than the patchSize parameter.

  • firstLevel - int
    Determines which level is treated as the first level of the pyramid. In the current implementation it should be 0; normally, the pyramid level at the original (unscaled) resolution is considered the first level.

  • WTA_K - int
    The number of random pixels used to produce each element of the oriented BRIEF descriptor. Possible values are 2, 3, and 4, with 2 as the default. A value of 3, for example, means that three random pixels are chosen at a time and compared by brightness, and the index of the brightest pixel is returned. Since there are 3 pixels, the returned index will be 0, 1, or 2.

  • scoreType - int
    Can be set to either HARRIS_SCORE or FAST_SCORE. The default, HARRIS_SCORE, means the Harris corner algorithm is used to rank the features; this score is used only to retain the best ones. FAST_SCORE produces slightly less stable keypoints but is somewhat faster to compute.

  • patchSize - int
    The size of the patch used by the oriented BRIEF descriptor. Naturally, on smaller pyramid layers the perceived image area covered by a feature will be larger.
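
For reference, here is a minimal sketch that constructs a detector with every one of these parameters spelled out at its default value (note that in the Python bindings the score constant is exposed as cv2.ORB_HARRIS_SCORE):

import cv2

# Illustrative only: an ORB detector with all parameters given explicitly,
# using the default values listed above
orb = cv2.ORB_create(nfeatures = 500,
                     scaleFactor = 1.2,
                     nlevels = 8,
                     edgeThreshold = 31,
                     firstLevel = 0,
                     WTA_K = 2,
                     scoreType = cv2.ORB_HARRIS_SCORE,
                     patchSize = 31,
                     fastThreshold = 20)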

# Imports needed to run this snippet: OpenCV, matplotlib, and copy to
# duplicate the training image before drawing on it
import cv2
import copy
import matplotlib.pyplot as plt

# Load the training image ('training.png' is a placeholder path), convert it
# to RGB for display and to grayscale for ORB
training_image = cv2.cvtColor(cv2.imread('training.png'), cv2.COLOR_BGR2RGB)
training_gray = cv2.cvtColor(training_image, cv2.COLOR_RGB2GRAY)

# Set the default figure size
plt.rcParams['figure.figsize'] = [14.0, 7.0]

# Set the parameters of the ORB algorithm by specifying the maximum number of keypoints to locate and
# the pyramid decimation ratio
orb = cv2.ORB_create(200, 2.0)

# Find the keypoints in the gray scale training image and compute their ORB descriptor.
# The None parameter is needed to indicate that we are not using a mask.
keypoints, descriptor = orb.detectAndCompute(training_gray, None)

# Create copies of the training image to draw our keypoints on
keyp_without_size = copy.copy(training_image)
keyp_with_size = copy.copy(training_image)

# Draw the keypoints without size or orientation on one copy of the training image 
cv2.drawKeypoints(training_image, keypoints, keyp_without_size, color = (0, 255, 0))

"""
下面的方法就是把关键点以及范围描述出来
"""
# Draw the keypoints with size and orientation on the other copy of the training image
cv2.drawKeypoints(training_image, keypoints, keyp_with_size, flags = cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

# Display the image with the keypoints without size or orientation
plt.subplot(121)
plt.title('Keypoints Without Size or Orientation')
plt.imshow(keyp_without_size)

# Display the image with the keypoints with size and orientation
plt.subplot(122)
plt.title('Keypoints With Size and Orientation')
plt.imshow(keyp_with_size)
plt.show()

# Print the number of keypoints detected
print("\nNumber of keypoints Detected: ", len(keypoints))
[Figure 1: run results]

特征比对

cv2.BFMatcher(normType = cv2.NORM_L2,
              crossCheck = False)

  • normType
    Specifies the metric used to determine the quality of a match. By default normType = cv2.NORM_L2, which measures the distance between two descriptors. For the binary descriptors created by ORB, however, the Hamming metric is more suitable: it determines distance by counting the number of dissimilar bits between the binary descriptors. When an ORB descriptor is created with WTA_K = 2, two random pixels are chosen and compared in brightness, and the index of the brighter pixel is returned as 0 or 1. Such an output occupies only 1 bit, so the cv2.NORM_HAMMING metric should be used. If, on the other hand, the ORB descriptor is created with WTA_K = 3, three random pixels are chosen and compared, and the index of the brightest pixel is returned as 0, 1, or 2. Such an output occupies 2 bits, so a special variant of the Hamming distance called cv2.NORM_HAMMING2 (the 2 stands for 2 bits) should be used instead. For whichever metric is chosen, when the keypoints in the training and query images are compared, the pairs with the smaller metric (the distance between them) are considered the best matches. A short sketch of choosing the norm follows this list.

  • crossCheck - bool
    A Boolean that can be set to True or False. Cross-checking is very useful for eliminating false matches. It works by performing the matching procedure twice: the first time the keypoints in the training image are compared to those in the query image; the second time the comparison is done the other way around (query-image keypoints against training-image keypoints). With cross-checking enabled, a match is only returned when keypoint A in the training image is the best match for keypoint B in the query image and vice versa, i.e., keypoint B in the query image is also the best match for keypoint A in the training image.
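
As a small illustration, the following sketch (the descriptor values are made up) picks the matcher norm from the WTA_K value the descriptors were built with and shows what the Hamming distance actually counts:

import cv2
import numpy as np

# Pick the norm that matches the WTA_K passed to cv2.ORB_create (illustrative)
wta_k = 2
norm = cv2.NORM_HAMMING if wta_k == 2 else cv2.NORM_HAMMING2
bf = cv2.BFMatcher(norm, crossCheck = True)

# The Hamming distance counts the differing bits between two binary descriptors
d1 = np.array([0b10110010], dtype = np.uint8)
d2 = np.array([0b10010011], dtype = np.uint8)
print(np.unpackbits(d1 ^ d2).sum())   # -> 2 differing bits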

# Set the default figure size
plt.rcParams['figure.figsize'] = [34.0, 34.0]

# Load the query image the same way as the training image
# ('query.png' is a placeholder path)
query_image = cv2.cvtColor(cv2.imread('query.png'), cv2.COLOR_BGR2RGB)
query_gray = cv2.cvtColor(query_image, cv2.COLOR_RGB2GRAY)

# Compute ORB keypoints and descriptors for both images
keypoints_train, descriptors_train = orb.detectAndCompute(training_gray, None)
keypoints_query, descriptors_query = orb.detectAndCompute(query_gray, None)

# Create a Brute Force Matcher object. We set crossCheck to True so that the BFMatcher will only return consistent
# pairs. Such technique usually produces best results with minimal number of outliers when there are enough matches.
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck = True)

# Perform the matching between the ORB descriptors of the training image and the query image
matches = bf.match(descriptors_train, descriptors_query)

# The matches with shorter distance are the ones we want. So, we sort the matches according to distance
matches = sorted(matches, key = lambda x : x.distance)

# Connect the keypoints in the training image with their best matching keypoints in the query image.
# The best matches correspond to the first elements in the sorted matches list, since they are the ones
# with the shortest distance. We draw the first 85 matches and use flags = 2 to plot the matching keypoints
# without size or orientation.
result = cv2.drawMatches(training_gray, keypoints_train, query_gray, keypoints_query, matches[:85], None, flags = 2)

# we display the image
plt.title('Best Matching Points', fontsize = 30)
plt.imshow(result)
plt.show()

# Print the number of keypoints detected in the training image
print("Number of Keypoints Detected In The Training Image: ", len(keypoints_train))

# Print the number of keypoints detected in the query image
print("Number of Keypoints Detected In The Query Image: ", len(keypoints_query))

# Print total number of matching Keypoints between the training and query images
print("\nNumber of Matching Keypoints Between The Training and Query Images: ", len(matches))
[Figure 2: best matching points]

Application to Music Scores

The class below crops a target symbol out of one score page, computes its ORB features, and then slides a window of the same size across a page, keeping every window whose descriptors match the target above a threshold.

import cv2
import numpy as np
import matplotlib.pyplot as plt
import os
import copy

"""
使用顺序:
upload_pic(path) 传入文件夹路径
select_pic(i) 选定第i个图
show_pic()  查看此图像
show_area(x,xx,y,yy) 展示图片区域
set_aim_area()  将展示的区域设为目标图像
show_feature()  以默认参数运行ORB   set_cv_parements()修改参数
serch_featrue() 在当前图片中搜索匹配
plot_result()   将搜索到的部分展示出来(最大30 张)
"""
class Score(object):

    def upload_pic(self, path):
        # Collect every file under the folder and read them as RGB images
        piclistdr = self.walkFile_item(path)
        self.picture_list = self.read_picture(piclistdr)
        self.len = len(self.picture_list)

    def select_pic(self, i):
        if i >= self.len:
            print('retry, max index is {}'.format(self.len - 1))
        else:
            self.i = i
            self.picture = self.picture_list[i]
            
    def show_area(self, x, xx, y, yy):
        # Default to the first image if none has been selected yet
        if not hasattr(self, 'i'):
            self.select_pic(0)

        self.area = self.picture[x:xx, y:yy, :]
        plt.imshow(self.area)

    def set_aim_area(self):
        self.aim_area = self.area

    def show_pic(self):
        plt.imshow(self.picture)
    
        
    def show_feature(self):
        self.set_cv_parameters()
        self.aim_area_gray = cv2.cvtColor(self.aim_area, cv2.COLOR_RGB2GRAY)

        self.aimkeypoints, self.aimdescriptor = self.orb.detectAndCompute(self.aim_area_gray, None)

        # Create copies of the target image to draw our keypoints on
        keyp_without_size = copy.copy(self.aim_area)
        keyp_with_size = copy.copy(self.aim_area)
        cv2.drawKeypoints(self.aim_area, self.aimkeypoints, keyp_without_size, color = (0, 255, 0))
        cv2.drawKeypoints(self.aim_area, self.aimkeypoints, keyp_with_size, flags = cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

        # Display the image with the keypoints without size or orientation
        plt.subplot(121)
        plt.title('Keypoints Without Size or Orientation')
        plt.imshow(keyp_without_size)

        # Display the image with the keypoints with size and orientation
        plt.subplot(122)
        plt.title('Keypoints With Size and Orientation')
        plt.imshow(keyp_with_size)
        plt.show()

        # Print the number of keypoints detected
        print("\nNumber of keypoints Detected: ", len(self.aimkeypoints))
        
    def search_feature(self, mid_step, threshold):
        # Make sure an ORB detector exists even if show_feature() was not called first
        if not hasattr(self, 'orb'):
            self.set_cv_parameters()

        dx, dy, cc = self.aim_area.shape
        self.mid_step = mid_step
        self.dx = dx
        self.dy = dy
        aa, bb, cc = self.picture.shape
        xstep, ystep = int((aa - dx) / mid_step), int((bb - dy) / mid_step)

        # Slide a target-sized window over the page in steps of mid_step and keep
        # every location with more than `threshold` cross-checked ORB matches
        self.answer_list = []
        for xs in range(xstep):
            for ys in range(ystep):
                query_image = self.picture[xs * mid_step:xs * mid_step + dx,
                                           ys * mid_step:ys * mid_step + dy, :]

                query_gray = cv2.cvtColor(query_image, cv2.COLOR_RGB2GRAY)
                keypoints_query, descriptors_query = self.orb.detectAndCompute(query_gray, None)
                if descriptors_query is None:
                    continue  # no keypoints found in this window

                bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck = True)
                try:
                    matches = bf.match(self.aimdescriptor, descriptors_query)
                    if len(matches) > threshold:
                        self.answer_list.append([xs, ys, len(matches)])
                except cv2.error:
                    pass
                                          
    def plot_result(self):
        # Show at most 30 matched windows on a 5x6 grid
        for i, j in enumerate(self.answer_list[:30]):
            xs, ys, i = j[0], j[1], i + 1
            ax = plt.subplot(5, 6, i)
            mid_step = self.mid_step
            query_image = self.picture[xs * mid_step:xs * mid_step + self.dx,
                                       ys * mid_step:ys * mid_step + self.dy, :]
            query_gray = cv2.cvtColor(query_image, cv2.COLOR_RGB2GRAY)

            ax.imshow(query_gray)
            ax.set_title("location_{}_{}".format(xs, ys))

   
    def set_cv_parameters(self, nfeatures=200, scale=1.3, edge=21, WTA=2, patch=10, fast=10):
        self.orb = cv2.ORB_create(nfeatures, scale,
                                  edgeThreshold = edge,
                                  firstLevel = 0,
                                  WTA_K = WTA,
                                  patchSize = patch,
                                  fastThreshold = fast)

    def show(self, xs, ys, i):
        ax = plt.subplot(5, 6, i)

        mid_step = self.mid_step
        query_image = self.picture[xs * mid_step:xs * mid_step + self.dx,
                                   ys * mid_step:ys * mid_step + self.dy, :]
        query_gray = cv2.cvtColor(query_image, cv2.COLOR_RGB2GRAY)

        ax.imshow(query_gray)
        ax.set_title("location_{}_{}".format(xs, ys))


    @staticmethod
    def walkFile_F(file):
        # Collect every sub-directory under `file`
        filelist = []
        for root, dirs, files in os.walk(file):
            for d in dirs:
                filelist.append(os.path.join(root, d))
        return filelist

    @staticmethod
    def walkFile_item(file):
        # Collect the path of every file under `file`
        piclist = []
        for root, dirs, files in os.walk(file):
            for f in files:
                piclist.append(os.path.join(root, f))
        return piclist

    @staticmethod
    def read_picture(path):
        # Read every image and convert from OpenCV's BGR order to RGB
        picture_list = [cv2.imread(i) for i in path]
        picture_list = [cv2.cvtColor(i, cv2.COLOR_BGR2RGB) for i in picture_list]
        return picture_list

if __name__ == "__main__":
    score = Score()
    score.upload_pic(r'C:\Users\86355\Desktop\score_detector\img_pdf')
    score.select_pic(7)
    score.show_area(360, 480, 355, 405)
    score.set_aim_area()
    score.show_feature()
    score.select_pic(15)
    score.search_feature(25, 9)
    print(score.aim_area.shape)
    # %matplotlib qt5  (IPython magic; only works in a notebook/IPython session)
    print(len(score.answer_list))
    score.plot_result()
 
[Figure 3: show_feature output]
[Figure 4: matched symbols]

Switching to another snippet:


[Figure 5: a rest symbol]
[Figure 6: features of the rest symbol]

Running with adjusted parameters

    score.select_pic(15)
    score.set_cv_parameters(nfeatures=200, scale=1.4, edge=15, WTA=2, patch=8, fast=15)
    score.search_feature(25, 15)
    print(score.aim_area.shape)
    # %matplotlib qt5  (IPython magic; only works in a notebook/IPython session)
    print(len(score.answer_list))
    score.plot_result()
[Figure 7: run results, with a few false matches]

Summary

Matching with the ORB feature algorithm alone gives only barely acceptable results. Because images of this kind consist of simple, uniform shapes, the default parameters often detect no features at all, so the parameters have to be tuned by hand, and tuning them for every type of symbol is unrealistic. In score recognition, this algorithm is best suited as a screening filter for semi-automatically preparing image data ahead of a deep-learning stage.
