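The descriptor space is quantized into a set of typical examples, and every descriptor in an image is assigned to one of them. First, SIFT features are extracted from every image in the image database; the principles and usage of SIFT are not repeated here. The typical examples are determined by analyzing a set of training images and are treated as visual words. The set of all visual words is called the visual vocabulary, sometimes also the visual codebook. A vocabulary can be created for a specific problem or image type, or for general use when only the visual content needs to be represented.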

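2. Building the "visual vocabulary"

Building the visual vocabulary is essentially a clustering process (an unsupervised form of classification). Here we use K-means clustering, although other methods could be used; for text classification, for example, the support vector machine (SVM) is a reliable choice.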


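  1. Randomly initialize K cluster centers.
  2. Repeat the following steps until the algorithm converges:
     a. Assign each feature to its nearest cluster center.
     b. Recompute each cluster center as the mean of the features assigned to it.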
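The K-means loop described above can be sketched in a few lines of plain Python. This is only an illustration under simple assumptions (a fixed random seed, Euclidean distance, made-up 2-D points); the `train` method later in this post relies on SciPy's `kmeans` instead, and the helper name `kmeans_sketch` is hypothetical.

```python
import random

def kmeans_sketch(points, k, iterations=20, seed=0):
    """Cluster `points` (a list of equal-length tuples) into k groups."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # step 1: random initial centers
    for _ in range(iterations):              # step 2: repeat until stable
        # assignment step: attach every point to its nearest center
        clusters = [[] for _ in range(k)]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[dists.index(min(dists))].append(p)
        # update step: move each center to the mean of its cluster
        new_centers = []
        for c, members in zip(centers, clusters):
            if members:
                new_centers.append(tuple(
                    sum(m[d] for m in members) / len(members)
                    for d in range(len(c))))
            else:
                new_centers.append(c)        # keep the center of an empty cluster
        if new_centers == centers:           # nothing moved: converged
            break
        centers = new_centers
    return centers

# two well-separated groups of made-up 2-D "descriptors"
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers = sorted(kmeans_sketch(points, 2))
print(centers)
```

On this toy data the two centers end up at the means of the two groups; with real SIFT descriptors each point would have 128 dimensions and K would be far larger.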

Too many visual words: high computational cost and a tendency to overfit.

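Typical parameter setting: the number of visual words (the cluster centers found by K-means) is usually K = 3000 to 10000, so the histogram that describes a whole image has 3000 to 10000 dimensions. In practice we have to experiment to find the most suitable number of clusters, that is, the one that gives the most accurate results.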


def train(self,featurefiles,k=100,subsampling=10):
    """ Train a vocabulary from features in files listed
        in featurefiles using k-means with k number of words.
        Subsampling of training data can be used for speedup. """

    nbr_images = len(featurefiles)
    # read the features from file
    descr = []
    descr.append(sift.read_features_from_file(featurefiles[0])[1])
    descriptors = descr[0] #stack all features for k-means
    for i in arange(1,nbr_images):
        descr.append(sift.read_features_from_file(featurefiles[i])[1])
        descriptors = vstack((descriptors,descr[i]))
    # k-means: last number determines number of runs
    self.voc,distortion = kmeans(descriptors[::subsampling,:],k,1)
    self.nbr_words = self.voc.shape[0]
    # go through all training images and project on vocabulary
    imwords = zeros((nbr_images,self.nbr_words))
    for i in range( nbr_images ):
        imwords[i] = self.project(descr[i])
    nbr_occurences = sum( (imwords > 0)*1 ,axis=0)
    self.idf = log( (1.0*nbr_images) / (1.0*nbr_occurences+1) )
    self.trainingdata = featurefiles


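3. Quantizing the input feature set against the visual vocabulary

Images are represented with the words of the vocabulary. SIFT extracts many feature points from each image, and each feature point can be approximated by a word in the vocabulary. Counting how often each word occurs in an image turns the image into an n-dimensional numeric vector. In practice, however, raw counts are very likely to be dominated by a few frequently occurring words, so the counts need to be weighted as described below.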


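A document index is built by counting words to construct a document histogram vector v. Very common words such as "the", "and", and "is" are usually ignored when counting; these are called stop words. Because documents differ in length, the vector is normalized to unit length by dividing by the sum of the histogram. Each element of the histogram vector is usually weighted according to the importance of the corresponding word. Typically, the importance of a word increases in proportion to the number of times it appears in the document and decreases with the number of times it appears in the corpus.
The most common weighting is tf-idf (term frequency-inverse document frequency). The term frequency of word w in document d is:

$$\mathrm{tf}_{w,d} = \frac{n_w}{\sum_j n_j}$$

where n_w is the number of times word w appears in document d. For normalization, n_w is divided by the total number of words in the document. The inverse document frequency is:

$$\mathrm{idf}_{w,d} = \log\frac{|D|}{|\{d : w \in d\}|}$$

where |D| is the number of documents in corpus D and the denominator is the number of documents d in the corpus that contain word w. Multiplying the two gives the tf-idf weight of the corresponding element of v. So in the next step, when converting an input image into a frequency histogram of visual words, we cannot simply use the raw feature counts as the only criterion; the influence of features that carry too much weight and distort the results must be reduced. I later found elsewhere that ready-made tf-idf implementations can be called directly.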
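As a worked illustration of the tf-idf weighting, here is a minimal plain-Python sketch. The three small histograms are made-up data and the function name `tf_idf` is an assumption for this example. It uses the textbook idf, log(|D| / document frequency); the `train` method shown earlier adds 1 to the denominator instead, so its numbers would differ slightly.

```python
import math

def tf_idf(histograms):
    """Weight visual-word counts by tf-idf.

    histograms: one list of word counts per image, all of the same length.
    """
    n_docs = len(histograms)
    n_words = len(histograms[0])
    # document frequency: in how many images does each word occur?
    df = [sum(1 for h in histograms if h[w] > 0) for w in range(n_words)]
    # idf = log(|D| / df); words that never occur get weight 0
    idf = [math.log(n_docs / d) if d else 0.0 for d in df]
    weighted = []
    for h in histograms:
        total = sum(h)  # total number of words in this image
        weighted.append([(c / total) * idf[w] if total else 0.0
                         for w, c in enumerate(h)])
    return weighted

# made-up histograms for three images over a three-word vocabulary
hists = [[2, 1, 0], [1, 0, 1], [1, 0, 0]]
weights = tf_idf(hists)
print(weights)
```

Word 0 occurs in every image, so its idf is log(1) = 0 and it contributes nothing to any image, which is exactly the down-weighting of overly common words that the text describes.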




4. Converting the input image into a frequency histogram of visual words


    def project(self,descriptors):
        """ Project descriptors on the vocabulary
            to create a histogram of words. """
        # histogram of image words 
        imhist = zeros((self.nbr_words))
        words,distance = vq(descriptors,self.voc)
        for w in words:
            imhist[w] += 1
        return imhist
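To make the quantization step concrete, here is a small dependency-free sketch of the same idea: each descriptor is assigned to its nearest visual word and the assignments are counted. The vocabulary and descriptors are made-up 2-D values (real SIFT descriptors have 128 dimensions), and the function name `project_sketch` is hypothetical; it mirrors the idea of the method above rather than its SciPy-based implementation.

```python
def project_sketch(descriptors, vocabulary):
    """Count how many descriptors fall on each visual word."""
    hist = [0] * len(vocabulary)
    for d in descriptors:
        # squared Euclidean distance from this descriptor to every word
        dists = [sum((a - b) ** 2 for a, b in zip(d, word))
                 for word in vocabulary]
        hist[dists.index(min(dists))] += 1   # vote for the nearest word
    return hist

vocabulary = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
descriptors = [(0.2, 0.1), (4.8, 5.3), (5.1, 4.9), (9.7, 0.4)]
hist = project_sketch(descriptors, vocabulary)
print(hist)  # [1, 2, 1]
```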



1. Create the vocabulary

# -*- coding: utf-8 -*-
import pickle
from PCV.imagesearch import vocabulary
from PCV.tools.imtools import get_imlist
from PCV.localdescriptors import sift

imlist = get_imlist('first1000/')
nbr_images = len(imlist)
featlist = [imlist[i][:-3]+'sift' for i in range(nbr_images)]

for i in range(nbr_images):
    sift.process_image(imlist[i], featlist[i])

voc = vocabulary.Vocabulary('ukbenchtest')
voc.train(featlist, 1000, 10)
# saving vocabulary
with open('first1000/vocabulary.pkl', 'wb') as f:
    pickle.dump(voc, f)
print ('vocabulary is:', voc.name, voc.nbr_words)


2. Build the database (imagesearch.py)

from numpy import *
import pickle
import sqlite3
from functools import cmp_to_key
import operator

class Indexer(object):
    def __init__(self,db,voc):
        """ Initialize with the name of the database 
            and a vocabulary object. """
        self.con = sqlite3.connect(db)
        self.voc = voc
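    def __del__(self):
        # close the database connection when the object is destroyed
        self.con.close()
    def db_commit(self):
        # write pending changes to the database file
        self.con.commit()
    def get_id(self,imname):
        """ Get an entry id and add if not present. """
        cur = self.con.execute(
        "select rowid from imlist where filename='%s'" % imname)
        res = cur.fetchone()
        if res==None:
            cur = self.con.execute(
            "insert into imlist(filename) values ('%s')" % imname)
            return cur.lastrowid
        else:
            return res[0]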
    def is_indexed(self,imname):
        """ Returns True if imname has been indexed. """
        im = self.con.execute("select rowid from imlist where filename='%s'" % imname).fetchone()
        return im != None
    def add_to_index(self,imname,descr):
        """ Take an image with feature descriptors, 
            project on vocabulary and add to database. """
        if self.is_indexed(imname): return
        print ('indexing', imname)
        # get the imid
        imid = self.get_id(imname)
        # get the words
        imwords = self.voc.project(descr)
        nbr_words = imwords.shape[0]
        # link each word to image
        for i in range(nbr_words):
            word = imwords[i]
            # wordid is the word number itself
            self.con.execute("insert into imwords(imid,wordid,vocname) values (?,?,?)", (imid,word,self.voc.name))
        # store word histogram for image
        # use pickle to encode NumPy arrays as strings
        self.con.execute("insert into imhistograms(imid,histogram,vocname) values (?,?,?)", (imid,pickle.dumps(imwords),self.voc.name))
    def create_tables(self): 
        """ Create the database tables. """
        self.con.execute('create table imlist(filename)')
        self.con.execute('create table imwords(imid,wordid,vocname)')
        self.con.execute('create table imhistograms(imid,histogram,vocname)')        
        self.con.execute('create index im_idx on imlist(filename)')
        self.con.execute('create index wordid_idx on imwords(wordid)')
        self.con.execute('create index imid_idx on imwords(imid)')
        self.con.execute('create index imidhist_idx on imhistograms(imid)')

class Searcher(object):
    def __init__(self,db,voc):
        """ Initialize with the name of the database. """
        self.con = sqlite3.connect(db)
        self.voc = voc
    def __del__(self):
        # close the database connection when the object is destroyed
        self.con.close()
    def get_imhistogram(self,imname):
        """ Return the word histogram for an image. """
        im_id = self.con.execute(
            "select rowid from imlist where filename='%s'" % imname).fetchone()
        s = self.con.execute(
            "select histogram from imhistograms where rowid='%d'" % im_id).fetchone()
        # use pickle to decode NumPy arrays from string
        return pickle.loads(s[0])
    def candidates_from_word(self,imword):
        """ Get list of images containing imword. """
        im_ids = self.con.execute(
            "select distinct imid from imwords where wordid=%d" % imword).fetchall()
        return [i[0] for i in im_ids]
    def candidates_from_histogram(self,imwords):
        """ Get list of images with similar words. """
        # get the word ids
        words = imwords.nonzero()[0]
        # find candidates
        candidates = []
        for word in words:
            c = self.candidates_from_word(word)
            candidates += c
        # take all unique images and reverse sort on occurrence 
        tmp = [(w,candidates.count(w)) for w in set(candidates)]
        tmp.sort(key=lambda x: x[1], reverse=True)
        # return sorted list, best matches first    
        return [w[0] for w in tmp] 
    def query(self,imname):
        """ Find a list of matching images for imname. """
        h = self.get_imhistogram(imname)
        candidates = self.candidates_from_histogram(h)
        matchscores = []
        for imid in candidates:
            # get the name
            cand_name = self.con.execute(
                "select filename from imlist where rowid=%d" % imid).fetchone()
            cand_h = self.get_imhistogram(cand_name)
            cand_dist = sqrt( sum( self.voc.idf*(h-cand_h)**2 ) )
            matchscores.append( (cand_dist,imid) )
        # return a sorted list of distances and database ids
        matchscores.sort()
        return matchscores
    def get_filename(self,imid):
        """ Return the filename for an image id. """
        s = self.con.execute(
            "select filename from imlist where rowid='%d'" % imid).fetchone()
        return s[0]

def tf_idf_dist(voc,v1,v2):
    v1 /= sum(v1)
    v2 /= sum(v2)
    return sqrt( sum( voc.idf*(v1-v2)**2 ) )

def compute_ukbench_score(src,imlist):
    """ Returns the average number of correct
        images on the top four results of queries. """
    nbr_images = len(imlist)
    pos = zeros((nbr_images,4))
    # get first four results for each image
    for i in range(nbr_images):
        pos[i] = [w[1]-1 for w in src.query(imlist[i])[:4]]
    # compute score and return average
    score = array([ (pos[i]//4)==(i//4) for i in range(nbr_images)])*1.0
    return sum(score) / (nbr_images)

# import PIL and pylab for plotting        
from PIL import Image
from pylab import *

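def plot_results(src,res):
    """ Show images in result list 'res'. """
    figure()
    nbr_results = len(res)
    for i in range(nbr_results):
        imname = src.get_filename(res[i])
        subplot(1,nbr_results,i+1)  # one row, one column per result
        imshow(array(Image.open(imname)))
        axis('off')
    show()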

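SQLite is available through Python's built-in sqlite3 module (older code imported it from pysqlite2). The Indexer class connects to a database and stores a vocabulary object when it is created (in its __init__() method). The __del__() method makes sure the database connection is closed, and db_commit() writes the changes to the database file.
We only need a simple database schema with three tables. The table imlist holds the filenames of all indexed images; imwords holds a word index that records each word, which vocabulary was used, and which images the word appears in; finally, imhistograms holds the full word histogram of each image. Following the vector space model, we need these in order to compare images.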
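The role of the imlist and imwords tables can be tried out with an in-memory SQLite database. This is a minimal sketch under assumptions: the filenames, word ids, and the vocabulary name 'demo' are made up, and it uses parameterized queries (the `?` placeholders) rather than the string formatting used in imagesearch.py.

```python
import sqlite3

con = sqlite3.connect(':memory:')
con.execute('create table imlist(filename)')
con.execute('create table imwords(imid, wordid, vocname)')
con.execute('create index wordid_idx on imwords(wordid)')

# two hypothetical images and the visual words they contain
for name, words in [('a.jpg', [3, 7]), ('b.jpg', [7, 9])]:
    imid = con.execute('insert into imlist(filename) values (?)',
                       (name,)).lastrowid
    con.executemany(
        'insert into imwords(imid, wordid, vocname) values (?, ?, ?)',
        [(imid, w, 'demo') for w in words])
con.commit()

# inverted lookup: which images contain visual word 7?
rows = con.execute(
    'select filename from imlist join imwords on imlist.rowid = imwords.imid '
    'where wordid = ? order by filename', (7,)).fetchall()
print(rows)  # [('a.jpg',), ('b.jpg',)]
```

The index on wordid is what keeps this lookup cheap as the number of indexed images grows.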

3. Add images

# -*- coding: utf-8 -*-
import pickle
from PCV.imagesearch import imagesearch
from PCV.localdescriptors import sift
from sqlite3 import dbapi2 as sqlite
from PCV.tools.imtools import get_imlist

imlist = get_imlist('first1000/')
nbr_images = len(imlist)
featlist = [imlist[i][:-3]+'sift' for i in range(nbr_images)]

# load vocabulary
with open('first1000/vocabulary.pkl', 'rb') as f:
    voc = pickle.load(f)
indx = imagesearch.Indexer('testImaAdd.db',voc)
indx.create_tables()
# go through all images, project features on vocabulary and insert
for i in range(nbr_images)[:1000]:
    locs,descr = sift.read_features_from_file(featlist[i])
    indx.add_to_index(imlist[i],descr)
# commit to database
indx.db_commit()

con = sqlite.connect('testImaAdd.db')
print (con.execute('select count (filename) from imlist').fetchone())
print (con.execute('select * from imlist').fetchone())


from sqlite3 import dbapi2 as sqlite
con = sqlite.connect('test.db') 
print(con.execute('select count (filename) from imlist').fetchone())
print(con.execute('select * from imlist').fetchone())


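The Searcher class in imagesearch.py implements search. A new Searcher object connects to the database and closes the connection when it is deleted. If the image database is large, comparing the query against every histogram in the database is usually not feasible. We need a candidate set of reasonable size (where "reasonable" is determined by search response time, memory requirements, and so on). This is what the word index is for: we use it to obtain a candidate set and then compare only against the candidates.

5. Querying with an image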
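The candidate-set idea can be illustrated without a database. In this sketch the inverted index is a plain dictionary with made-up image ids, and `candidates_sketch` is a hypothetical stand-in for the Searcher method of a similar name: it collects every image that shares at least one visual word with the query and orders them by the number of shared words.

```python
from collections import Counter

# hypothetical inverted index: visual word id -> ids of images containing it
word_index = {0: [1, 2], 1: [2], 2: [2, 3], 3: [4]}

def candidates_sketch(hist, word_index):
    """Rank images by how many of the query's words they share."""
    counts = Counter()
    for word, n in enumerate(hist):
        if n > 0:                       # the query contains this word
            counts.update(word_index.get(word, []))
    # most_common() orders by descending number of shared words
    return [imid for imid, _ in counts.most_common()]

query_hist = [2, 1, 4, 0]               # the query uses words 0, 1 and 2
candidates = candidates_sketch(query_hist, word_index)
print(candidates)
```

Image 2 shares all three query words and is ranked first, while image 4 shares none and never enters the candidate set, so its histogram is never compared.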

import pickle
from PCV.localdescriptors import sift
from PCV.imagesearch import imagesearch
from PCV.geometry import homography
from PCV.tools.imtools import get_imlist
# load the image list and the vocabulary
with open('ukbench_imlist.pkl','rb') as f:
  imlist = pickle.load(f)
  featlist = pickle.load(f)
nbr_images = len(imlist)
with open('vocabulary.pkl', 'rb') as f:
  voc = pickle.load(f)
src = imagesearch.Searcher('test.db',voc)
# index of the query image and number of results to return
q_ind = 50 
nbr_results = 20
# regular query
res_reg = [w[1] for w in src.query(imlist[q_ind])[:nbr_results]] 
print('top matches (regular):', res_reg)
# load the features of the query image
q_locs,q_descr = sift.read_features_from_file(featlist[q_ind]) 
fp = homography.make_homog(q_locs[:,:2].T)
# RANSAC model for homography fitting
model = homography.RansacModel()
rank = {} 
# load the features of each search result
for ndx in res_reg[1:]: 
  locs,descr = sift.read_features_from_file(featlist[ndx])
  # get matches
  matches = sift.match(q_descr,descr)
  ind = matches.nonzero()[0]
  ind2 = matches[ind]
  tp = homography.make_homog(locs[:,:2].T)
  # compute the homography and count inliers; fall back to an empty list if there are not enough matches
  try:
    H,inliers = homography.H_from_ransac(fp[:,ind],tp[:,ind2],model,match_theshold=4)
  except Exception:
    inliers = []
  # store the inlier count
  rank[ndx] = len(inliers)
# sort the dictionary so that the results with the most inliers come first
sorted_rank = sorted(rank.items(), key=lambda t: t[1], reverse=True) 
res_geom = [res_reg[0]]+[s[0] for s in sorted_rank] 
print('top matches (homography):', res_geom)
# show the top search results
imagesearch.plot_results(src,res_reg[:8])
imagesearch.plot_results(src,res_geom[:8])

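First, the image list, the feature list (containing the image filenames and the SIFT feature files, respectively), and the vocabulary are loaded. Then a Searcher object is created, a regular query is run, and the results are stored in the list res_reg. Next, the features of each image in res_reg are loaded and matched against the query image. A homography is estimated from the matches and its inliers are counted. Finally, the dictionary that maps image indices to inlier counts is sorted by decreasing number of inliers.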
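What "counting inliers" means can be shown with a small plain-Python sketch. It assumes a 3x3 homography given as nested lists and a pixel threshold on the reprojection error; the function name `count_inliers` and the sample points are made up, and PCV's own RANSAC routine may define its threshold differently.

```python
def count_inliers(H, src_pts, dst_pts, threshold=4.0):
    """Count matches (x, y) -> (u, v) that H maps to within `threshold` pixels."""
    inliers = 0
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        # apply the 3x3 homography in homogeneous coordinates
        px = H[0][0] * x + H[0][1] * y + H[0][2]
        py = H[1][0] * x + H[1][1] * y + H[1][2]
        pw = H[2][0] * x + H[2][1] * y + H[2][2]
        if pw == 0:
            continue                    # mapped to infinity, cannot be an inlier
        err = ((px / pw - u) ** 2 + (py / pw - v) ** 2) ** 0.5
        if err < threshold:
            inliers += 1
    return inliers

# hypothetical homography: a pure translation by (10, 5)
H = [[1, 0, 10], [0, 1, 5], [0, 0, 1]]
src = [(0, 0), (1, 2), (3, 4)]
dst = [(10, 5), (11, 7), (50, 50)]      # the last match is an outlier
n_inliers = count_inliers(H, src, dst)
print(n_inliers)  # 2
```

A candidate image whose matches are mostly consistent with one homography gets a high count and moves up in the re-ranked list.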


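6. Creating a web application with CherryPy

To build these demos we use the CherryPy package, see http://www.cherrypy.org. CherryPy is a lightweight, pure-Python web server built on an object-oriented model.

7. Image search demo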

# -*- coding: utf-8 -*-
import cherrypy
import pickle
import urllib
import os
from numpy import *
#from PCV.tools.imtools import get_imlist
from PCV.imagesearch import imagesearch
import random

"""
This is the image search demo in Section 7.6.
"""

class SearchDemo:

    def __init__(self):
        # load the image list
        self.path = 'first1000/'
        #self.path = 'D:/python_web/isoutu/first500/'
        self.imlist = [os.path.join(self.path,f) for f in os.listdir(self.path) if f.endswith('.jpg')]
        #self.imlist = get_imlist('./first500/')
        #self.imlist = get_imlist('E:/python/isoutu/first500/')
        self.nbr_images = len(self.imlist)
        print (self.imlist)
        print (self.nbr_images)
        self.ndx = list(range(self.nbr_images))
        print (self.ndx)

        # load the vocabulary
        # f = open('first1000/vocabulary.pkl', 'rb')
        with open('first1000/vocabulary.pkl','rb') as f:
            self.voc = pickle.load(f)

        # maximum number of images to show in the search results
        self.maxres = 10

        # header and footer html
        # NOTE: the original markup was lost in extraction; this minimal
        # page skeleton is an assumed reconstruction.
        self.header = """
            <!doctype html>
            <head>
            <title>Image search</title>
            </head>
            <body>
            """
        self.footer = """
            </body>
            </html>
            """

    def index(self, query=None):
        self.src = imagesearch.Searcher('testImaAdd.db', self.voc)

        html = self.header
        # NOTE: the HTML tags in the strings below were lost in extraction;
        # the markup (linked thumbnails) is an assumed minimal reconstruction.
        html += """
            <br />
            Click an image to search. <a href='?query='>Random selection</a> of images.
            <br /><br />
            """
        if query:
            # query the database and get the top images
            res = self.src.query(query)[:self.maxres]
            for dist, ndx in res:
                imname = self.src.get_filename(ndx)
                html += "<a href='?query="+imname+"'>"
                html += "<img src='"+imname+"' width='100' />"
                print (imname+"################")
                html += "</a>"
        else:
            # show a random selection if there is no query
            random.shuffle(self.ndx)
            for i in self.ndx[:self.maxres]:
                imname = self.imlist[i]
                html += "<a href='?query="+imname+"'>"
                html += "<img src='"+imname+"' width='100' />"
                print (imname+"################")
                html += "</a>"
        html += self.footer
        return html

    index.exposed = True

#conf_path = os.path.dirname(os.path.abspath(__file__))
#conf_path = os.path.join(conf_path, "service.conf")
#cherrypy.config.update(conf_path)
#cherrypy.quickstart(SearchDemo())
cherrypy.quickstart(SearchDemo(), '/', config=os.path.join(os.path.dirname(__file__), 'service.conf'))

Once the server is running, open the link in a browser.
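The demo script passes a file named service.conf to cherrypy.quickstart, but the file itself is not shown in this post. A minimal configuration might look like the fragment below. All values are assumptions for illustration: the host, the port, and in particular the placeholder path, which would need to be the absolute path of the directory that contains the image folder so that CherryPy can serve the thumbnails as static files.

```ini
[global]
server.socket_host = "127.0.0.1"
server.socket_port = 8080

[/]
tools.staticdir.root = "/absolute/path/to/project"
tools.staticdir.on = True
tools.staticdir.dir = ""
```

With settings like these, the demo would be reachable at http://127.0.0.1:8080/.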
