Image Retrieval: Ideas, influences, and trends of the new age (image retrieval survey, literature translation, part 1)

 

Image Retrieval: Ideas, influences, and trends of the new age

ACM literature

 

Literature translation

 

Link to the original paper: http://infolab.stanford.edu/~wangz/project/imsearch/review/JOUR/datta_TR.pdf

This is one of the newer and better surveys of image retrieval, and parts of it are useful when writing a paper.

 

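1.       As the figure shows, Google Scholar can be used to gauge the current state of research on a topic. This is useful data: including a figure like this in a paper supports a better analysis of research at home and abroad.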

 

 

 

2.       The various gaps introduced

Sensory. The sensory gap is the gap between the object in the world and the information in a (computational) description derived from a recording of that scene.

         Sensory gap: the difference between a real-world object and what a recording of the scene captures about it.

Semantic. The semantic gap is the lack of coincidence between the information that one can extract from the visual data and the interpretation that the same data has for a user in a given situation.

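         Semantic gap: the information extracted from visual data does not coincide with how a particular user interprets the same data in a given situation.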

 

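Image retrieval in the past (1990-2000)

3.       In Smeulders et al. [2000], image search is divided into two domains: narrow and broad.

Narrow image domain: limited variability and more easily defined visual features (specific kinds of pictures, such as medical images).

Broad domain: high variability and unpredictability for the same underlying semantic concept (arbitrary pictures on the Web).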

 

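Three broad categories of image search:

(1)     Search by association

The user has no clear intent for a particular picture; the search proceeds by iteratively refined browsing.

(2)     Aimed search

Search for a specific picture.

(3)     Category search

Search for a single picture representative of a semantic class.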

The overall goal therefore remains to bridge the semantic and sensorial gaps using the available visual features of images and relevant domain knowledge to support the varied search categories, ultimately to satiate the user.

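In short, the overall goal is still to bridge the semantic and sensory gaps, using the available visual features of images together with relevant domain knowledge, to support the different search categories and ultimately satisfy the user.

4.       Extracting visual content from an image is split into two parts: image processing and feature construction.

In this context, search has been described as a specification of minimal invariant conditions that model the user intent, geared at reducing the sensory gap due to accidental distortions, clutter, occlusion, etc.

That is, search is described as specifying minimal invariant conditions that model the user's intent, with the aim of reducing the sensory gap caused by accidental distortion, clutter, occlusion, and so on.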

 

 

5.       Once image features were extracted, the question remained as to how they could be indexed and matched against each other for retrieval.

Once image features have been extracted, the remaining question is how to index them and match them against each other for retrieval.

 

6.       Modern image retrieval (2000-2008)

 

 


I. User intent

(1)     Browser: a user who browses pictures with no end goal. A browsing session consists of a series of unrelated searches, and the browser jumps across many different topics during the session.

(2)     Surfer: a user with a moderately clear end goal. A surfer may begin with exploratory behavior, and in later searches has a better idea of what is needed from the system.

(3)     Searcher: a user who is very clear about what is needed from the system. Search sessions are usually short, with related searches leading to the desired result.

One of the few studies categorizes users as experts and novices and studies their interaction patterns with respect to a video library:

CHRISTEL, M. G. AND CONESCU, R. M. 2005. Addressing the challenge of visual information access from digital image and video libraries. In Proceedings of the 5th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL).

 

Summary: What matters to the end user is the interaction with the system and the response it gives. This has motivated human-centered multimedia systems: to gain wide acceptance, image retrieval systems need to adopt a human-centered perspective.

As discussed in:

JAIMES, A., SEBE, N., AND GATICA-PEREZ, D. 2006. Human-centered computing: A multimedia perspective. In Proceedings of the ACM International Conference on Multimedia (Special Session on Human-Centered Multimedia).

 

II. Data scope

Factors such as the diversity of the user base and the expected user traffic of a search system also strongly influence its design. Along this dimension, search data is classified into the following categories:

(1)  Personal collection

(2)  Domain-specific collection

(3)  Enterprise collection

(4)  Archives

(5)  Web

Google’s Picasa system [Picasa 2004] provides a chronological display of images, taking a user on a journey down memory lane.

Some notable implementations:

An FPGA implementation of a color-histogram-based image retrieval system:

KOTOULAS, L. AND ANDREADIS, I. 2003. Colour histogram content-based image retrieval and hardware implementation. IEEE Proc. Circ. Dev. Syst. 150, 5, 387–393.

An FPGA implementation for subimage retrieval within an image database:

NAKANO, K. AND TAKAMICHI, E. 2003. An image retrieval system using FPGAs. In Proceedings of the Asia South Pacific Design Automation Conference.

A method for efficient retrieval in a network of imaging devices:

WOODROW, E. AND HEINZELMAN, W. 2002. Spin-It: A data centric routing protocol for image retrieval in wireless networks. In Proceedings of the IEEE International Conference on Image Processing (ICIP).
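As a software illustration of the color-histogram retrieval idea named in the list above, here is a minimal Python sketch. It is not the hardware design of Kotoulas and Andreadis [2003]: the function names, the default of 4 bins per channel, and the choice of histogram intersection as the similarity measure are all assumptions made for illustration.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Quantize (r, g, b) pixels with values 0..255 into a normalized joint histogram."""
    step = 256 / bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        ri, gi, bi = (int(v / step) for v in (r, g, b))
        hist[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1.0
    total = len(pixels)
    return [h / total for h in hist] if total else hist


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means the normalized histograms are identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def rank_by_histogram(query_pixels, database):
    """Return image names sorted from most to least similar to the query.

    `database` is a hypothetical dict mapping image name -> list of pixels.
    """
    q = color_histogram(query_pixels)
    scored = [(histogram_intersection(q, color_histogram(p)), name)
              for name, p in database.items()]
    return [name for score, name in sorted(scored, key=lambda t: -t[0])]
```

A histogram discards where colors appear in the picture, which makes it cheap to compute and compare but blind to spatial layout.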

Discussion. Regardless of the nature of the collection, as the expected user-base grows, factors such as concurrent query support, efficient caching, and parallel and distributed processing of requests become critical. For future real-world image retrieval systems, both software and hardware approaches to address these issues are essential.

More realistically, dedicated specialized servers, optimized memory and storage support, and highly parallelizable image search algorithms to exploit cluster computing powers are where the future of large-scale image search hardware support lies.

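Summary: Whatever the nature of the collection, as the expected user base grows, concurrent query support, efficient caching, and parallel and distributed request processing become critical. Future real-world image retrieval systems need both software and hardware approaches to these issues. More realistically, the future of hardware support for large-scale image search lies in dedicated servers, optimized memory and storage, and highly parallelizable search algorithms that exploit cluster computing.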

 

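III. Query modalities and processing

In the realm of image retrieval, an important parameter to measure user-system interaction level is the complexity of queries supported by the system.

In other words, the complexity of the queries a system supports is an important measure of how much user-system interaction it allows.

We describe next the various querying modalities, their characteristics, and the system support required thereof.

The lists that follow cover the query modalities, their characteristics, and the system support each requires.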

(1)    Keywords

(2)    Free-text

(3)    Image: search for images similar to a given image

(4)    Graphics

(5)    Composite

From the system's perspective, query processing can be described as:

(1)    Text-based

(2)    Content-based

(3)    Composite

(4)    Interactive-simple

(5)    Interactive-composite

 

Processing text-based queries involves keyword matching using simple set-theoretic operations, and therefore a response can be generated very quickly.

That is, text-based queries are handled by keyword matching with simple set-theoretic operations, so a response can be generated very quickly.
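The set-theoretic keyword matching described above can be sketched as follows. This is only an illustration: the inverted-index layout, the function names, and the `all_of` / `any_of` / `none_of` parameters are assumptions, not something the survey specifies.

```python
from collections import defaultdict


def build_inverted_index(tags_by_image):
    """Map each keyword to the set of image ids tagged with it."""
    index = defaultdict(set)
    for image_id, tags in tags_by_image.items():
        for tag in tags:
            index[tag.lower()].add(image_id)
    return index


def keyword_query(index, all_of=(), any_of=(), none_of=()):
    """Answer a keyword query with set intersection, union, and difference."""
    if all_of:
        # AND: an image must carry every keyword in all_of.
        result = set.intersection(*(index.get(k.lower(), set()) for k in all_of))
    else:
        # No AND terms: start from every tagged image.
        result = set().union(*index.values())
    if any_of:
        # OR: keep images carrying at least one keyword in any_of.
        result &= set().union(*(index.get(k.lower(), set()) for k in any_of))
    for k in none_of:
        # NOT: drop images carrying an excluded keyword.
        result -= index.get(k.lower(), set())
    return result
```

Each query touches only the posting sets of the keywords involved, which is why text-based lookup is fast compared with searching a visual feature space.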

Indexing methods for content-based queries:

R-trees are used for indexing images represented as attributed relational graphs (ARGs).

That is, the images are represented as attributed relational graphs, and R-trees index them.

PETRAKIS, E. G. M., DIPLAROS, A., AND MILIOS, E. 2002. Matching and retrieval of distorted and occluded shapes using dynamic programming. IEEE Trans. Pattern Anal. Mach. Intell. 24, 4, 509–522.

PETRAKIS, E. G. M., FALOUTSOS, C., AND LIN, K. I. 2002. Imagemap: An image indexing method based on spatial similarity. IEEE Trans. Knowl. Data Eng. 14, 5, 979–987.

 

Retrieval of images using wavelet coefficients as image representations and R-trees for indexing has been studied in Natsev et al. [2004].

NATSEV, A., RASTOGI, R., AND SHIM, K. 2004. Walrus: A similarity retrieval algorithm for image databases. IEEE Trans. Knowl. Data Eng. 16, 3, 301–316.

Visual content matching using graph-based image representation and an efficient metric indexing algorithm has been proposed in Berretti et al. [2001].

BERRETTI, S., DEL BIMBO, A., AND VICARIO, E. 2001. Efficient matching and indexing of graph models in content-based retrieval. IEEE Trans. Pattern Anal. Mach. Intell. 23, 10, 1089–1105.

 

Composite querying methods provide the users with more flexibility for expressing themselves.

In other words, composite querying gives users more flexibility to express what they want.

Some recent innovations in querying include sketch-based retrieval of color images [Chalechale et al. 2005].

CHALECHALE, A., NAGHDY, G., AND MERTINS, A. 2005. Sketch-based image matching using angular partitioning. IEEE Trans. Syst. Man Cybernet. 35, 1, 28–41.

Querying using 3D models [Assfalg et al. 2002] has been motivated by the fact that 2D image queries are unable to capture the spatial arrangement of objects within the image.

ASSFALG, J., DEL BIMBO, A., AND PALA, P. 2002. Three-Dimensional interfaces for querying by example in content-based image retrieval. IEEE Trans. Visualiz. Comput. Graphics 8, 4, 305–318.

In another interesting work, a multimodal system involving hand gestures and speech for querying and relevance feedback was presented in Kaster et al. [2003].

That is, the multimodal system lets the user query and give relevance feedback through hand gestures and speech.

KASTER, T., PFEIFFER, M., AND BAUCKHAGE, C. 2003. Combining speech and haptics for intuitive and efficient navigation through image databases. In Proceedings of the 5th International Conference on Multimodal Interfaces (ICMI).

Certain new interaction-based querying paradigms statistically model the user’s interest [Fang et al. 2005]:

FANG, Y. AND GEMAN, D. 2005. Experiments in mental face retrieval. In Proceedings of the International Conference on Audio- and Video-Based Biometric Person Authentication (AVBPA).

FANG, Y., GEMAN, D., AND BOUJEMAA, N. 2005. An interactive system for mental face retrieval. In Proceedings of the ACM SIGMM International Workshop on Multimedia Information Retrieval (MIR) at the International Multimedia Conference.

Others help the user refine her queries by providing cues and hints [Jaimes et al. 2004; Nagamine et al. 2004]:

JAIMES, A., OMURA, K., NAGAMINE, T., AND HIRATA, K. 2004. Memory cues for meeting video retrieval. In Proceedings of the 1st ACM Workshop on Continuous Archival and Retrieval of Personal Experiences (CARPE) at the ACM International Multimedia Conference.

NAGAMINE, T., JAIMES, A., OMURA, K., AND HIRATA, K. 2004. A visuospatial memory cue system for meeting video retrieval. In Proceedings of the ACM International Conference on Multimedia (demonstration).

 

Summary

Discussion. A prerequisite for supporting text-based query processing is the presence of reliable metadata with pictures. However, pictures rarely come with reliable human tags. In recent years, there has been effort put into building interactive, public-domain games for large-scale collection of high-level manual annotations. One such game (the ESP game) has become very popular and has helped accumulate human annotations for about a hundred thousand pictures [von Ahn and Dabbish 2004]. Collection of manual tags for pictures has the dual advantage of: (1) facilitating text-based querying, and (2) building reliable training datasets for content-based analysis and automatic annotation algorithms. As explored in Datta et al. [2007], it is possible to effectively bridge the paradigms of keyword- and content-based search through a unified framework to provide the user the flexibility of both, without losing out on the search scope.

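Text-based query processing depends on reliable metadata, but pictures rarely come with reliable human tags. In recent years, effort has gone into interactive public-domain games that collect high-level manual annotations at scale. The ESP game became very popular and helped gather human annotations for about a hundred thousand pictures. Collecting manual tags has two advantages: it facilitates text-based querying, and it builds reliable training sets for content-based analysis and automatic annotation algorithms. A unified framework can also bridge keyword-based and content-based search, giving users the flexibility of both without narrowing the search scope.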

 

IV. Visualization

Presentation of search results is perhaps one of the most important factors in the acceptance and popularity of an image retrieval system.

(1)     Relevance-ordered: search results are ordered by some numeric measure of relevance.

(2)     Time-ordered: pictures are displayed in chronological order.

(3)     Clustered

(4)     Hierarchical

(5)     Composite
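The first three presentation schemes in the list above can be sketched in a few lines. The `Hit` record, its field names, and the choice to cluster by capture year are hypothetical; the survey does not prescribe a data model, and real systems may cluster on visual content instead of metadata.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Hit:
    image_id: str
    score: float         # assumed convention: higher means more relevant
    taken_at: datetime   # assumed to be available from image metadata


def relevance_ordered(hits):
    """Best match first, by the numeric relevance score."""
    return sorted(hits, key=lambda h: h.score, reverse=True)


def time_ordered(hits):
    """Oldest first, by capture time."""
    return sorted(hits, key=lambda h: h.taken_at)


def clustered_by_year(hits):
    """Group hits by capture year; a stand-in for metadata-based clustering."""
    groups = {}
    for hit in time_ordered(hits):
        groups.setdefault(hit.taken_at.year, []).append(hit)
    return groups
```

A composite presentation would combine these, for example clustering by year and ordering each cluster by relevance.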

 

In order to design interfaces for image retrieval systems, it helps to understand factors like how people manage their digital photographs [Rodden and Wood 2003] or frame their queries for visual art images [Cunningham et al. 2004].

RODDEN, K. AND WOOD, K. 2003. How do people manage their digital photographs? In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI).

CUNNINGHAM, S. J., BAINBRIDGE, D., AND MASOODIAN, M. 2004. How people describe their image information needs: A grounded theory analysis of visual arts queries. In Proceedings of the ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL).

 

In Rodden et al. [2001], user studies on various ways of arranging images for browsing purposes are conducted, and the observation is that both visual-feature-based and concept-based arrangements have their own merits and demerits.

RODDEN, K., BASALAJ, W., SINCLAIR, D., AND WOOD, K. 2001. Does organization by similarity assist image browsing? In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI).  

 

Thinking beyond the typical grid-based arrangement of top matching images, spiral and concentric visualization of retrieval results have been explored in Torres et al. [2003].

TORRES, R. S., SILVA, C. G., MEDEIROS, C. B., AND ROCHA, H. V. 2003. Visual structures for image browsing. In Proceedings of the ACM International Conference on Information and Knowledge Management (CIKM).

 

For personal images, innovative arrangements of query results based on visual content, time-stamps, and efficient use of screen space add new dimensions to the browsing experience [Huynh et al. 2005].

HUYNH, D. F., DRUCKER, S. M., BAUDISCH, P., AND WONG, C. 2005. Time quilt: Scaling up zoomable photo browsers for large, unstructured photo collections. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI).

 

Image transcoding techniques, which aim at adapting multimedia (image and video) content to the capabilities of the client device, have been studied extensively in the last several years [Shanableh and Ghanbari 2000; Vetro et al. 2003; Bertini et al. 2003; Cucchiara et al. 2003].

SHANABLEH, T. AND GHANBARI, M. 2000. Heterogeneous video transcoding to lower spatio-temporal resolutions and different encoding formats. IEEE Trans. Multimed. 2, 2, 101–110.

VETRO, A., CHRISTOPOULOS, C., AND SUN, H. 2003. Video transcoding architectures and techniques: An overview. IEEE Signal Process. Mag. 20, 2, 18–29.

BERTINI, M., CUCCHIARA, R., DEL BIMBO, A., AND PRATI, A. 2003. Object and event detection for semantic annotation and transcoding. In Proceedings of the IEEE International Conference on Multimedia and Expo (ICME).

CUCCHIARA, R., GRANA, C., AND PRATI, A. 2003. Semantic video transcoding using classes of relevance. Int. J. Image Graph. 3, 1, 145–170.

 

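Summary:

Discussion. Study of organizations which maintain image management and retrieval systems has provided useful insights into system design, querying, and visualization. In Tope and Enser [2000], [...]. The final verdict of acceptance or rejection for any visualization scheme comes from end users.

Studies of organizations that maintain image management and retrieval systems have yielded useful insight into system design, querying, and visualization.

While simple, intuitive interfaces such as grid-based displays have become acceptable to most search engine users, advanced visualization techniques may still be in the making.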

 

V. Modern image retrieval systems

Recently, a public-domain search engine called Riya (see Figure 4) has been developed, which incorporates image retrieval and face recognition for searching pictures of people and products on the Web.

Riya is a public-domain search engine that combines image retrieval with face recognition.

www.riya.com

It is also interesting to note that CBIR technology is being applied to domains as diverse as family album management, botany, astronomy, mineralogy, and remote sensing [Zhang et al. 2003; Wang et al. 2002; Csillaghy et al. 2000; Painter et al. 2003; Schroder et al. 2000].

Notably, CBIR technology is applied in fields as varied as family album management, botany, astronomy, mineralogy, and remote sensing.

ZHANG, L., CHEN, L., LI, M., AND ZHANG, H.-J. 2003. In Proceedings of the ACM International Conference on Multimedia. (Topic: face recognition.)

CSILLAGHY, A., HINTERBERGER, H., AND BENZ, A. 2000. Content based image retrieval in astronomy. Inf. Retriev. 3, 3, 229–241. (Topic: astronomy.)

PAINTER, T. H., DOZIER, J., ROBERTS, D. A., DAVIS, R. E., AND GREEN, R. O. 2003. Retrieval of sub-pixel snow-covered area and grain size from imaging spectrometer data. Remote Sens. Env. 85, 1, 64–77. (Topic: snow cover and snow grain size.)

SCHRODER, M., REHRAUER, H., SEIDEL, K., AND DATCU, M. 2000. Interactive learning and probabilistic retrieval in remote sensing image archives. IEEE Trans. Geosci. Remote Sens. 38, 5, 2288–2298.

 

A publicly available similarity search tool [Wang et al. 2001] is being used for an online database of over 800,000 airline-related images [Airliners.Net 2005; Slashdot 2005], for the integration of similarity search functionality into a large collection of art and cultural images [GlobalMemoryNet 2006], and for the incorporation of image similarity into a massive picture archive [Terragalleria 2001] of the renowned travel photographer Q.-T. Luong.

http://airliners.net

Automatic Linguistic Indexing of Pictures - Real-Time (ALIPR), an automatic image annotation system [Li and Wang 2006a; 2008], has been recently made public for people to try to have their pictures annotated. As mentioned earlier, presence of reliable tags with pictures is necessary for text-based image retrieval. As part of the ALIPR search engine, an effort to automatically validate computer generated tags with human-given annotation is being used in an attempt to build a very large collection of searchable images (see Figure 5). Another work-in-progress is a Web image search system [Joshi et al. 2006a] that exploits visual features and textual metadata, using state-of-the-art algorithms, for a comprehensive search experience.

 

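Discussion. Image analysis and retrieval systems have received widespread public and media interest of late [Mirsky 2006; Staedter 2006; CNN 2005]. It is reasonable to hope that in the near future, the technology will diversify to many other domains. We believe that the future of real-world image retrieval lies in exploiting both text- and content-based search technologies. While the former is considered more reliable from a user viewpoint, there is great potential in combining the two to build image search engines that make the hidden part of Web images accessible. This endeavor will hopefully be actualized in the years to come.

In short: image analysis and retrieval systems have recently drawn wide public and media interest, and the technology can reasonably be expected to spread to many other domains. Text-based search is considered more reliable from the user's point of view, but combining it with content-based search has great potential for opening up the hidden part of Web images.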

 

7.       Image retrieval techniques: the real core problems

 

By the nature of its task, the CBIR technology boils down to two intrinsic problems: (a) how to mathematically describe an image, and (b) how to assess the similarity between a pair of images based on their abstracted descriptions.

That is, CBIR comes down to two intrinsic problems: describing an image mathematically, and assessing the similarity of two images from their abstracted descriptions.

The first issue arises because the original representation of an image, which is an array of pixel values, corresponds poorly to our visual response, let alone semantic understanding of the image.

In other words, a raw array of pixel values corresponds poorly to how we perceive the image, and even less to what the image means.

 

We refer to the mathematical description of an image, for retrieval purposes, as its signature. From the design perspective, the extraction of signatures and the calculation of image similarity cannot be cleanly separated. The formulation of signatures determines to a large extent the realm for definitions of similarity measures. On the other hand, intuitions are often the early motivating factors for designing similarity measures in a certain way, which in turn puts requirements on the construction of signatures.

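The mathematical description of an image used for retrieval is called its signature. From a design perspective, signature extraction and similarity computation cannot be cleanly separated: the formulation of signatures largely determines which similarity measures can be defined, while intuitions about similarity often motivate a measure first, which in turn places requirements on how signatures are constructed.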

In comparison with earlier, pre-2000 work in CBIR, a remarkable difference of recent years has been the increased diversity of image signatures. Advances have been made in both the derivation of new features (e.g., shape) and the construction of signatures based on these features, with the latter type of progress being more pronounced. The richness in the mathematical formulation of signatures grows alongside the invention of new methods for measuring similarity.

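Compared with pre-2000 work, recent years show a much greater diversity of image signatures. Progress has been made both in deriving new features (for example, shape) and in constructing signatures from those features, with the latter being more pronounced. The mathematical richness of signatures has grown alongside new methods for measuring similarity.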

In terms of methodology development, a strong trend which has emerged in recent years is the employment of statistical and machine learning techniques in various aspects of the CBIR technology.

In terms of methodology, a strong trend of recent years is the use of statistical and machine learning techniques in many aspects of CBIR.

I. Extracting the visual signature

Most CBIR systems perform feature extraction as a preprocessing step.

 

The current decade has seen great interest in region-based visual signatures, for which segmentation is the quintessential first step. While we begin the discussion with recent progress in image segmentation, we will see in the subsequent section that there is significant interest in segmentation free techniques to feature extraction and signature construction.

 

Image Segmentation

To acquire a region-based signature, a key step is to segment images. Reliable segmentation is especially critical for characterizing shapes within images, without which the shape estimates are largely meaningless. We described earlier a widely used segmentation approach based on k-means clustering. This basic approach enjoys a speed advantage, but is not as refined as some recently developed methods.

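To obtain a region-based signature, the key step is segmenting the image. Reliable segmentation is especially critical for characterizing shapes within images; without it, shape estimates are largely meaningless. The k-means clustering approach described earlier is widely used and fast, but less refined than some recently developed methods.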
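To make the k-means idea concrete, here is a deliberately simplified Python sketch that clusters scalar pixel intensities. It is an illustration only: the function name, the evenly spaced deterministic initialization, and the use of grayscale values in place of richer per-block feature vectors are assumptions, not the procedure of any particular system.

```python
def kmeans_segment(pixels, k=2, iterations=20):
    """Cluster scalar pixel intensities with k-means; return (labels, centers)."""
    lo, hi = min(pixels), max(pixels)
    # Deterministic initialization: spread the centers evenly over the intensity range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest center.
        labels = [min(range(k), key=lambda c: (p - centers[c]) ** 2) for p in pixels]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, label in zip(pixels, labels) if label == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return labels, centers
```

Because each pixel is labeled independently of its neighbors, the resulting regions need not be spatially connected, which is one reason the approach is fast but coarse.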

One of the most important new advances in segmentation employs the normalized cuts criterion [Shi and Malik 2000].

SHI, J. AND MALIK, J. 2000. Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22, 8, 888–905.

 

The problem of image segmentation is mapped to a weighted graph partitioning problem, where the vertex set of the graph is composed of image pixels and the edge weights represent some perceptual similarity between pixel pairs.

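That is, the vertices of the graph are the image pixels, the edge weights represent perceptual similarity between pairs of pixels, and segmenting the image means partitioning this weighted graph.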
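For reference, the normalized cut criterion of Shi and Malik [2000] for splitting the vertex set V into two disjoint sets A and B can be written as below, where w(u, v) is the edge weight between pixels u and v. The notation is restated from that paper rather than quoted from the survey.

```latex
\mathrm{cut}(A,B) = \sum_{u \in A,\; v \in B} w(u,v), \qquad
\mathrm{assoc}(A,V) = \sum_{u \in A,\; t \in V} w(u,t)

\mathrm{Ncut}(A,B) = \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(A,V)} + \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(B,V)}
```

Minimizing Ncut rather than the plain cut discourages splitting off tiny groups of pixels, because a small set has a small assoc value and therefore a large ratio.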

The normalized cut segmentation method in Shi and Malik [2000] is also extended to textured image segmentation by using cues of contour and texture differences [Malik et al. 2001], and to incorporate known partial grouping priors by solving a constrained optimization problem [and Shi 2004].

That is, the normalized cut method has also been extended to textured image segmentation using contour and texture cues, and to incorporate known partial grouping priors by solving a constrained optimization problem.

Searching of medical image collections has been an increasingly important research problem of late, due to the high-throughput, high-resolution, and high-dimensional imaging modalities introduced.

This passage concerns medical image retrieval.

In this domain, 3D brain magnetic resonance (MR) images have been segmented using hidden Markov random fields and the expectation-maximization (EM) algorithm [Zhang et al. 2001], and the spectral clustering approach has found some success in segmenting vertebral bodies from sagittal MR images [Carballido-Gamio et al. 2004].

That is, 3D brain MR images have been segmented with hidden Markov random fields and the expectation-maximization algorithm, and spectral clustering has had some success on vertebral bodies in sagittal MR images.

ZHANG, Y., BRADY, M., AND SMITH, S. 2001. Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm. IEEE Trans. Medical Imag. 20, 1, 45–57.

CARBALLIDO-GAMIO, J., BELONGIE, S., AND MAJUMDAR, S. 2004. Normalized cuts in 3-D for spinal MRI segmentation. IEEE Trans. Medical Imag. 23, 1, 36–44.

 

 

 

 

 

 

8.        

9.        

10.    

11.    

12.    

13.    

14.    

15.    

 Image Retrieval: Ideas, influences, and trends of the new age

ACM literature

 

1.       由这个图可以看到Google Scholar 可以检索到一个科目的最新研究状况,这是一个很有用的数据,在写论文的时候如果能够附上这样一幅图,那么可以更好的将国内外的研究状况做一个分析。

 

 

 

2.       the various gaps introduced  鸿沟介绍

Sensory. The sensory gap is the gap between the object in the world and the information in a (computational) description derived from a recording of that scene.

         感知鸿沟,现实物体和我们对世界的感知差距

Semantic. The semantic gap is the lack of coincidence between the information that one can extract from the visual data and the interpretation that the same data has for a user in a given situation.

         语义鸿沟,人们从视觉数据中抽取的信息和某个用户在特定情况下对相同数据的描述缺乏一致性。

 

过去的图像检索 90-00

3.       In Smeulders et al. [2000]    图像搜索分为两个领域:narrow broad

Narrow image domain: 有限的变化性,更易定义的视觉特征  (特定的图片,如医疗图片)

Broad domain : 高变异性,同样的潜在语义概念的不可预测性 (网上的随意图片)

 

图像搜索的三个宽目录:

(1)     search by association  联合搜索

对于一副图像没有明确的意图,而是通过反复提炼浏览进行搜索

 

(2)     aimed search 有目的的搜索

搜索特定的图片

 

(3)     category search分类搜索

搜索一个语义类的单个图片代表

The overall goal therefore remains to bridge the semantic and sensorial gaps using the available visual features of images and relevant domain knowledge to support the varied search categories, ultimately to satiate the user.

因此,总的目标仍然是缩小语义和感官鸿沟,利用现有的相关领域知识的视觉特征的图像,并支持不同的搜索类别,最终满足一般用户。

4.       从图像抽取视觉内容分为两个部分:图像处理和特征重建。

In this context, search has been described as a specification of minimal invariant conditions that model the user intent, geared at reducing the sensory gap due to accidental distortions, clutter, occlusion, etc

在文中,搜索已被描述为一个最小不变情况的模式下,用户意图减少因意外的扭曲、杂波、闭塞所造成的感知鸿沟

 

 

5.       Once image features were extracted, the question remained as to how they could be indexed and matched against each other for retrieval.

一旦图像特征被抽取,问题将改变成为他们如何被在不同的检索过程从被索引和匹配。

 

6.       现代图像检索 00-08

 

 

一.用户意图

(1)     browser浏览者:用户没有最终目标的浏览图片,浏览会话将由一系列的不相关搜索组成。在搜索会话过程中浏览者将跨越多个不同的主题。

(2)     surfer冲浪者: 冲浪者是指一个用户拥有适度清晰的最终目标。一个冲浪者可能在开始阶段带有探索行为,可以为了接下来的所搜者能够更好的知道他从系统中所需要得到的信息。

(3)     searcher 搜索者:这种用户非常清楚在系统中需要什么信息。搜索会话通常会比较短,使用相关搜索达到所需情况。

One of the few studies categorizes users as experts and novices and studies their interaction patterns with respect to a video library   一些研究将专家和新手进行归类 一些研究对于视频库的影响模式

CHRISTEL, M. G. AND CONESCU, R. M. 2005. Addressing the challenge of visual information access from digital image and video libraries. In Proceedings of the 5th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL). 列举从数码图片和视频库中获得视觉信息的挑战

 

总结: 对于最终用户的所有事项便是系统和用户的相互影响和相应的反应。所以后面建立了以人为本的多媒体系统。为了获得广泛的接受,图像检索系统需要获得一个以人为中心的视野。

如下文所示:

JAIMES, A., SEBE, N., AND GATICA-PEREZ, D. 2006. Human-centered computing: A multimedia perspective. In Proceedings of the ACM International Conference on Multimedia (Special Session on Human-Centered Multimedia). (以人为本的计算,多媒体观点)

 

二.数据范围

制度因素,如搜索的多样性,用户群和用户流量的预期也很大程度上影响了设计。沿着这方面,我们将搜索数据划分为以下类别:

(1)  Personal collection 个人收藏

(2)  Domain-Specific Collection 域的特定集合

(3)  Enterprise Collection 企业集

(4)  Archives  档案

(5)  Web  互联网

(6)   

Google’s Picasa system [Picasa 2004] provides a chronological display of images taking a user on a journey down memory lane

比较重要的几个应用:

implementation of a color-histogram-based image retrieval system 应用颜色直方图的FPGA检索系统:

KOTOULAS, L. AND ANDREADIS, I. 2003. Colour histogram content-based image retrieval and hardware implementation. IEEE Proc. Circ. Dev. Syst. 150, 5, 387–393.     (基于内容的图像直方图检索和硬件实现)

FPGA implementation for subimage retrieval within an image database 使用在数据库中的FPGA自图像检索

NAKANO, K. AND TAKAMICHI, E. 2003. An image retrieval system using FPGAs. In Proceedings of the Asia South Pacific Design Automation Conference.     (一种使用FPGA的图像检索方法)

a method for efficient retrieval in a network of imaging devices  对于网络图像设备的快速检索方法

WOODROW, E. AND HEINZELMAN, W. 2002. Spin-It: A data centric routing protocol for image retrieval in wireless networks. In Proceedings of the IEEE International Conference on Image Processing (ICIP).  (在无线网络中进行计算图像检索的数据为中心的路由协议)

Discussion. Regardless of the nature of the collection, as the expected user-base grows, factors such as concurrent query support, efficient caching, and parallel and distributed processing of requests become critical. For future real-world image retrieval systems, both software and hardware approaches to address these issues are essential.

More realistically, dedicated specialized servers, optimized memory and storage support, and highly parallelizable image search algorithms to exploit cluster computing powers are where the future of large-scale image search hardware support lies.

总结: 不管用户基数随着集合的性质如何增长,各种因素诸如并行查询的支持、高效缓存、并行、分布式处理等要求变得重要。对于未来的现实世界的图像检索系统,同时使用软件和硬件的方法来解决这些问题是至关重要的。更为现实的,专注致力于服务器,优化的内存和存储支持,高并行性图像检索算法,利用云计算的力量是未来大规模图像检索硬件的方向。

 

三.查询方式和处理

In the realm of image retrieval, an important parameter to measure user-system interaction level is the complexity of queries supported by the system. 

在图像检索界,衡量用户系统的交互等级的重要参数就是系统的查询复杂性

We describe next the various querying modalities, their characteristics, and the system support required thereof

我们描述了未来的各种查询方式,他们的特点,系统支持所需的单位。

(1)    Keywords 关键字

(2)    Free-Text 文字

(3)    Image 图像   搜索和所需图像相似的图像

(4)    Graphics 图形

(5)    Composite 组合

从系统查询的角度进行描述:

(1)    Text-Based 基于文本的

(2)    Content-Based 基于内容的

(3)    Composite 混合的

(4)    Interactive-Simple 单一交互式

(5)    Interactive-Composite 混合交互式

 

Processing text-based queries involves keyword matching using simple set-theoretic operations, and therefore a response can be generated very quickly.

处理基于文本的查询涉及到使用简单的关键字匹配的集合论操作,因此可以生成一个响应速度非常快。

文本检索方法:

R-trees are used for indexing images represented as attributed relational graphs (ARGs)

R-trees 作为引用关系图来对图像表示进行检索。

PETRAKIS, E. G. M., DIPLAROS, A., AND MILIOS, E. 2002. Matching and retrieval of distorted and occluded shapes using dynamic programming. IEEE Trans. Pattern Anal. Mach. Intell. 24, 4, 509–522.

PETRAKIS, E. G. M., FALOUTSOS, C., AND LIN, K. I. 2002. Imagemap: An image indexing method based on spatial similarity. IEEE Trans. Knowl. Data Eng. 14, 5, 979–987

 

Retrieval of images using wavelet coefficients as image representations and R-trees for indexing has been studied in Natsev et al. [2004]    使用小波系数作为图像表示进行检索,使用R树进行索引

NATSEV, A., RASTOGI, R., AND SHIM, K. 2004. Walrus: A similarity retrieval algorithm for image databases. IEEE Trans. Knowl. Data Eng. 16, 3, 301–316.    (图像数据库的相似检索算法)

 

Visual content matching using graph-based image representation and an efficient metric indexing algorithm has been proposed in Berretti et al. [2001]   使用基于图的图像表示方法进行视觉内容匹配并使用一个高效的度量索引算法进行索引。

BERRETTI, S., BIMBO, A. D., AND VICARIO, E. 2001. Efficient matching and indexing of graph models in content-based retrieval. IEEE Trans. Pattern Anal. Mach. Intell. 23, 10, 1089–1105.   (基于内容的图像检索中的图模型匹配和索引方法)

 

Composite querying methods provide the users with more flexibility for expressing themselves.

综合查询方法提供了表达自己的用户提供更大的灵活性。

Some recent innovations in querying include sketch-based retrieval of color images [Chalechale et al. 2005].   一些最近发明的查询算法包括 彩色图像基于概括的图像检索。

CHALECHALE, A., NAGHDY, G., AND MERTINS, A. 2005. Sketch-based image matching using angular partitioning. IEEE Trans. Syst. Man Cybernet. 35, 1, 28–41.      (使用角分区的基于概述的图像匹配算法)

Querying using 3D models [Assfalg et al. 2002] has been motivated by the fact that 2D image queries are unable to capture the spatial arrangement of objects within the image. 3D模型检索的出现是由于2D图像检索不能够抓住图像空间布局的特点。

ASSFALG, J., DEL BIMBO, A., AND PALA, P. 2002. Three-Dimensional interfaces for querying by example in content-based image retrieval. IEEE Trans. Visualiz. Comput. Graphics 8, 4, 305–318.   基于内容的图像检索的三维接口

In another interesting work, a multimodal system involving hand gestures and speech for querying and relevance feedback was presented in Kaster et al. [2003].

在另一感兴趣的方面,多模型系统引入了手势和语言来检索并且提出了相关反馈机制

KASTER, T., PFEIFFER, M., AND BAUCKHAGE, C. 2003. Combining speech and haptics for intuitive and efficient navigation through image databases. In Proceedings of the 5th International Conference on Multimidia Interfaces (ICMI). 通过图像数据库对手势和触觉进行直观有效的导航。

Certain new interaction-based querying paradigms which statistically model the user’s interest [Fang et al. 2005],  新的基于互动的查询示例,用来统计用户感兴趣的模型

FANG, Y. AND GEMAN, D. 2005. Experiments in mental face retrieval. In Proceedings of the International Conference on Audio- and Video-Based Biometric Person Authentication (AVBPA).

FANG, Y., GEMAN, D., AND BOUJEMAA, N. 2005. An interactive system for mental face retrieval. In Proceedings of the ACM SIGMM International Workshop on Multimedia Information Retrieval (MIR) at the International Multimedia Conference.

help the user refine her queries by providing cues and hints [Jaimes et al. 2004; Nagamine et al. 2004]

通过提供线索和提示帮助用户定义搜索。

JAIMES, A., OMURA, K., NAGAMINE, T., AND HIRATA, K. 2004. Memory cues for meeting video retrieval. In Proceedings of the 1st ACM Workshop on Continuous Archival and Retrieval of Personal Experiences (CARPE) at the ACM International Multimedia Conference.

NAGAMINE, T., JAIMES, A., OMURA, K., AND HIRATA, K. 2004. A visuospatial memory cue system for meeting video retrieval. In Proceedings of the ACM International Conference on Multimedia (demonstration).

 

总结

Discussion. A prerequisite for supporting text-based query processing is the presence of reliable metadata with pictures. However, pictures rarely come with reliable human tags. In recent years, there has been effort put into building interactive, public-domain games for large-scale collection of high-level manual annotations. One such game (the ESP game) has become very popular and has helped accumulate human annotations for about a hundred thousand pictures [von Ahn and Dabbish 2004]. Collection of manual tags for pictures has the dual advantage of: (1) facilitating text-based querying, and (2) building reliable training datasets for content-based analysis and automatic annotation algorithms. As explored in Datta et al. [2007], it is possible to effectively bridge the paradigms of keyword- and content-based search through a unified framework to provide the user the flexibility of both, without losing out on the search scope.

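In short: text-based retrieval needs reliable image metadata, yet images rarely come with usable human tags. In recent years, effort has gone into building interactive, public-domain games for large-scale collection of high-level manual annotations. The ESP game has become very popular and has helped collect human annotations for about a hundred thousand pictures. Collecting manual tags has two advantages: (1) it facilitates text-based querying, and (2) it builds reliable training datasets for content-based analysis and automatic annotation algorithms. Keyword-based and content-based search can also be bridged through a unified framework that gives the user the flexibility of both without narrowing the search scope.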

 

IV. Visualization

Presentation of search results is perhaps one of the most important factors in the acceptance and popularity of an image retrieval system.   In other words, how results are presented largely determines whether users accept and adopt a retrieval system.

(1)     Relevance-Ordered: results are ranked by some numeric measure of relevance

(2)     Time-Ordered: pictures are shown in chronological order

(3)     Clustered

(4)     Hierarchical

(5)     Composite: a mix of the above
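As a rough illustration of the first three presentation schemes, here is a minimal Python sketch. The `Hit` record, its `score`, `timestamp`, and `cluster` fields, and the sample file names are all hypothetical; they are not taken from the survey or from any particular system.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    name: str
    score: float     # hypothetical relevance score, higher means more relevant
    timestamp: int   # hypothetical capture time (e.g. seconds since some epoch)
    cluster: str     # hypothetical cluster label assigned by some grouping step

def relevance_ordered(hits):
    """Rank results by a numeric relevance measure, best first."""
    return sorted(hits, key=lambda h: h.score, reverse=True)

def time_ordered(hits):
    """Show results in chronological order."""
    return sorted(hits, key=lambda h: h.timestamp)

def clustered(hits):
    """Group results by cluster label, keeping relevance order inside each group."""
    groups = {}
    for hit in relevance_ordered(hits):
        groups.setdefault(hit.cluster, []).append(hit)
    return groups

hits = [
    Hit("beach.jpg", 0.42, 100, "coast"),
    Hit("sunset.jpg", 0.91, 300, "sky"),
    Hit("harbor.jpg", 0.77, 200, "coast"),
]
print([h.name for h in relevance_ordered(hits)])
print([h.name for h in time_ordered(hits)])
print({label: [h.name for h in group] for label, group in clustered(hits).items()})
```

A composite presentation would combine these, for example clustering first and then ordering each cluster by time.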

 

In order to design interfaces for image retrieval systems, it helps to understand factors like how people manage their digital photographs [Rodden and Wood 2003] or frame their queries for visual art images [Cunningham et al. 2004].   That is, interface design benefits from knowing how people actually organize their own photos and how they phrase their queries.

RODDEN, K. AND WOOD, K. 2003. How do people manage their digital photographs? In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI).

CUNNINGHAM, S. J., BAINBRIDGE, D., AND MASOODIAN, M. 2004. How people describe their image information needs: A grounded theory analysis of visual arts queries. In Proceedings of the ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL).

 

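In Rodden et al. [2001], user studies on various ways of arranging images for browsing purposes are conducted, and the observation is that both visual-feature-based and concept-based arrangements have their own merits and demerits.   In short, user studies compared ways of arranging images for browsing and found that arrangements based on visual features and arrangements based on concepts each have their own strengths and weaknesses.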

RODDEN, K., BASALAJ, W., SINCLAIR, D., AND WOOD, K. 2001. Does organization by similarity assist image browsing? In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI).  

 

Thinking beyond the typical grid-based arrangement of top matching images, spiral and concentric visualization of retrieval results have been explored in Torres et al. [2003].   That is, besides the usual grid of top matches, spiral and concentric layouts of retrieval results have been proposed.

TORRES, R. S., SILVA, C. G., MEDEIROS, C. B., AND ROCHA, H. V. 2003. Visual structures for image browsing. In Proceedings of the ACM International Conference on Information and Knowledge Management (CIKM).
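To make the idea of a spiral arrangement concrete, the sketch below places ranked results along an Archimedean spiral so that the best match sits at the center. This is only an assumed, generic layout; the function name `spiral_layout` and the `spacing` and `step` parameters are hypothetical and do not reproduce the visual structures of Torres et al. [2003].

```python
import math

def spiral_layout(num_results, spacing=1.0, step=0.8):
    """Return (x, y) positions for ranked results along the spiral r = spacing * angle.

    Rank 0 (the top match) is placed at the origin; lower-ranked results
    wind outwards, so distance from the center reflects rank.
    """
    positions = []
    for rank in range(num_results):
        angle = step * rank          # angle grows linearly with rank
        radius = spacing * angle     # Archimedean spiral: radius proportional to angle
        positions.append((radius * math.cos(angle), radius * math.sin(angle)))
    return positions

for rank, (x, y) in enumerate(spiral_layout(5)):
    print(rank, round(x, 2), round(y, 2))
```

A concentric layout could be sketched the same way by quantizing the radius into a few fixed rings.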

 

For personal images, innovative arrangements of query results based on visual content, time-stamps, and efficient use of screen space add new dimensions to the browsing experience [Huynh et al. 2005].

HUYNH, D. F., DRUCKER, S. M., BAUDISCH, P., AND WONG, C. 2005. Time quilt: Scaling up zoomable photo browsers for large, unstructured photo collections. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI).

 

Image transcoding techniques, which aim at adapting multimedia (image and video) content to the capabilities of the client device, have been studied extensively in the last several years [Shanableh and Ghanbari 2000; Vetro et al. 2003; Bertini et al. 2003; Cucchiara et al. 2003].   In short, transcoding adapts multimedia content to what the client device can handle, and it has been heavily studied in recent years.

SHANABLEH, T. AND GHANBARI, M. 2000. Heterogeneous video transcoding to lower spatio-temporal resolutions and different encoding formats. IEEE Trans. Multimed. 2, 2, 101–110.

VETRO, A., CHRISTOPOULOS, C., AND SUN, H. 2003. Video transcoding architectures and techniques: An overview. IEEE Signal Process. Mag. 20, 2, 18–29.

BERTINI, M., CUCCHIARA, R., DEL BIMBO, A., AND PRATI, A. 2003. Object and event detection for semantic annotation and transcoding. In Proceedings of the IEEE International Conference on Multimedia and Expo (ICME).

CUCCHIARA, R., GRANA, C., AND PRATI, A. 2003. Semantic video transcoding using classes of relevance. Int. J. Image Graph. 3, 1, 145–170.

 

Summary:

Discussion. Study of organizations which maintain image management and retrieval systems has provided useful insights into system design, querying, and visualization; see Tope and Enser [2000]. The final verdict of acceptance or rejection for any visualization scheme comes from end-users.

 

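Studies of organizations that maintain image management and retrieval systems have yielded insights into system design, querying, and visualization.

Although simple, intuitive interfaces such as grid-based displays have become acceptable to most search engine users, more advanced visualization techniques may still be in the making.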

 

V. Modern image retrieval systems

Recently, a public-domain search engine called Riya (see Figure 4) has been developed, which incorporates image retrieval and face recognition for searching pictures of people and products on the Web.

Riya is a public-domain search engine that combines image retrieval with face recognition.

www.riya.com

It is also interesting to note that CBIR technology is being applied to domains as diverse as family album management, botany, astronomy, mineralogy, and remote sensing [Zhang et al. 2003; Wang et al. 2002; Csillaghy et al. 2000; Painter et al. 2003; Schroder et al. 2000].

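It is also worth noting that content-based image retrieval is being applied in fields as different as family album management, botany, astronomy, mineralogy, and remote sensing.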

ZHANG, L., CHEN, L., LI, M., AND ZHANG, H.-J. 2003. In Proceedings of the ACM International Conference on Multimedia.    (Topic: face recognition.)

 

CSILLAGHY, A., HINTERBERGER, H., AND BENZ, A. 2000. Content based image retrieval in astronomy. Inf. Retriev. 3, 3, 229–241.

 

PAINTER, T. H., DOZIER, J., ROBERTS, D. A., DAVIS, R. E., AND GREEN, R. O. 2003. Retrieval of sub-pixel snow-covered area and grain size from imaging spectrometer data. Remote Sens. Env. 85, 1, 64–77.

 

SCHRODER, M., REHRAUER, H., SEIDEL, K., AND DATCU, M. 2000. Interactive learning and probabilistic retrieval in remote sensing image archives. IEEE Trans. Geosci. Remote Sens. 38, 5, 2288–2298.

 

A publicly available similarity search tool [Wang et al. 2001] is being used for an online database of over 800,000 airline-related images [Airliners.Net 2005; Slashdot 2005]

http://airliners.net

 

the integration of similarity search functionality to a large collection of art and cultural images [GlobalMemoryNet 2006], and the incorporation of image similarity to a massive picture archive [Terragalleria 2001] of the renowned travel photographer Q.-T. Luong.

Automatic Linguistic Indexing of Pictures - Real-Time (ALIPR), an automatic image annotation system [Li and Wang 2006a; 2008], has been recently made public for people to try to have their pictures annotated. As mentioned earlier, presence of reliable tags with pictures is necessary for text-based image retrieval. As part of the ALIPR search engine, an effort to automatically validate computer generated tags with human-given annotation is being used in an attempt to build a very large collection of searchable images (see Figure 5). Another work-in-progress is a Web image search system [Joshi et al. 2006a] that exploits visual features and textual metadata, using state-of-the-art algorithms, for a comprehensive search experience.

 

Discussion. Image analysis and retrieval systems have received widespread public and media interest of late [Mirsky 2006; Staedter 2006; CNN 2005]. It is reasonable to hope that in the near future, the technology will diversify to many other domains. We believe that the future of real-world image retrieval lies in exploiting both text- and content-based search technologies. While the former is considered more reliable from a user viewpoint, there is great potential in combining the two to build automatic image search engines that would make the hidden part of Web images accessible. This endeavor will hopefully be actualized in the years to come.

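In short: image analysis and retrieval systems have recently attracted wide public and media interest, and it is reasonable to expect the technology to spread to many other domains. The future of real-world image retrieval lies in using both text-based and content-based techniques. Text-based retrieval is considered more reliable from the user's point of view, but combining the two holds great potential for building automatic image search engines that reach the hidden part of Web images. Hopefully this will be realized in the coming years.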

 

7.       Image retrieval techniques: the real core problems

 

By the nature of its task, the CBIR technology boils down to two intrinsic problems: (a) how to mathematically describe an image, and (b) how to assess the similarity between a pair of images based on their abstracted descriptions.

Put simply, CBIR comes down to two questions: how to describe an image mathematically, and how to measure the similarity of two images from those abstract descriptions.

The first issue arises because the original representation of an image, which is an array of pixel values, corresponds poorly to our visual response, let alone semantic understanding of the image.

That is, the raw representation of an image is just an array of pixel values, which matches human visual perception poorly and says even less about what the image means.

 

We refer to the mathematical description of an image, for retrieval purposes, as its signature. From the design perspective, the extraction of signatures and the calculation of image similarity cannot be cleanly separated. The formulation of signatures determines to a large extent the realm for definitions of similarity measures. On the other hand, intuitions are often the early motivating factors for designing similarity measures in a certain way, which in turn puts requirements on the construction of signatures.

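For retrieval purposes, the mathematical description of an image is called its signature. From a design standpoint, signature extraction and similarity computation cannot be cleanly separated: the form of the signature largely determines which similarity measures can be defined, while intuitions about how similarity should be measured are often the early motivation and in turn place requirements on how signatures are built.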
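The two problems above (describing an image and comparing two descriptions) can be sketched with a deliberately simple pairing: a normalized gray-level histogram as the signature and histogram intersection as the similarity measure. This pairing is an illustrative assumption, not the signature or measure of any specific system in the survey; the function names, the 4-bin default, and the tiny sample images are hypothetical.

```python
def histogram_signature(pixels, bins=4, max_value=256):
    """Signature: normalized gray-level histogram of a 2-D list of pixel values.

    Pixel values are assumed to be integers in [0, max_value).
    """
    counts = [0] * bins
    total = 0
    for row in pixels:
        for value in row:
            counts[value * bins // max_value] += 1   # map the value to its bin
            total += 1
    return [count / total for count in counts]

def histogram_intersection(sig_a, sig_b):
    """Similarity in [0, 1]: overlap between two normalized histograms."""
    return sum(min(a, b) for a, b in zip(sig_a, sig_b))

dark = [[0, 10], [20, 30]]
bright = [[250, 240], [230, 220]]
mixed = [[0, 10], [250, 240]]

print(histogram_intersection(histogram_signature(dark), histogram_signature(mixed)))
print(histogram_intersection(histogram_signature(dark), histogram_signature(bright)))
```

The coupling described above shows up even here: because the signature is a normalized histogram, an overlap-style measure is natural, whereas a region-based signature would call for a different kind of matching.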

In comparison with earlier, pre-2000 work in CBIR, a remarkable difference of recent years has been the increased diversity of image signatures. Advances have been made in both the derivation of new features (e.g., shape) and the construction of signatures based on these features, with the latter type of progress being more pronounced. The richness in the mathematical formulation of signatures grows alongside the invention of new methods for measuring similarity.

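Compared with pre-2000 work, recent years have brought a much greater diversity of image signatures. Progress has been made both in deriving new features (e.g., shape) and in constructing signatures from those features, with the latter being more pronounced. The mathematical richness of signatures has grown together with new methods for measuring similarity.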

In terms of methodology development, a strong trend which has emerged in recent years is the employment of statistical and machine learning techniques in various aspects of the CBIR technology.

In terms of methodology, a strong trend of recent years is the use of statistical and machine learning techniques throughout CBIR.

I. Visual signature extraction

Most CBIR systems perform feature extraction as a preprocessing step.  In other words, feature extraction is the first processing stage of most content-based image retrieval systems.

 


The current decade has seen great interest in region-based visual signatures, for which segmentation is the quintessential first step. While we begin the discussion with recent progress in image segmentation, we will see in the subsequent section that there is significant interest in segmentation-free techniques for feature extraction and signature construction.

 

 Image Segmentation

To acquire a region-based signature, a key step is to segment images. Reliable segmentation is especially critical for characterizing shapes within images, without which the shape estimates are largely meaningless. We described earlier a widely used segmentation approach based on k-means clustering. This basic approach enjoys a speed advantage, but is not as refined as some recently developed methods.

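To obtain a region-based signature, the key step is segmenting the image. Reliable segmentation is especially critical for characterizing shapes, since shape estimates are largely meaningless without it. The widely used approach described earlier is based on k-means clustering; it is fast, but not as refined as some more recent methods.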
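The basic k-means idea can be sketched as follows. This is a minimal sketch under simplifying assumptions: it clusters raw gray values only (practical systems typically cluster color or texture features), and the function name `kmeans_segment`, the evenly spaced initialization, and the sample image are hypothetical choices made for illustration.

```python
def kmeans_segment(pixels, k=2, iterations=20):
    """Segment a 2-D list of gray values into k regions with plain k-means.

    Returns (label_image, centers). Assumes a non-empty rectangular image.
    """
    values = [v for row in pixels for v in row]
    lo, hi = min(values), max(values)
    # Deterministic initialization: spread the k centers evenly over the intensity range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]

    def nearest(value):
        return min(range(k), key=lambda c: (value - centers[c]) ** 2)

    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest center.
        labels = [nearest(v) for v in values]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(values, labels) if label == c]
            if members:
                centers[c] = sum(members) / len(members)

    labels = [nearest(v) for v in values]
    width = len(pixels[0])
    label_image = [labels[i:i + width] for i in range(0, len(labels), width)]
    return label_image, centers

image = [
    [10, 12, 200],
    [11, 198, 205],
    [9, 201, 199],
]
label_image, centers = kmeans_segment(image, k=2)
for row in label_image:
    print(row)
```

The speed advantage mentioned above comes from each iteration being a single pass over the pixels; the cost is that spatial coherence is ignored, which is one reason graph-based methods can give cleaner regions.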

One of the most important new advances in segmentation employs the normalized cuts criterion [Shi and Malik 2000].  That is, a key recent development in image segmentation is the use of the normalized cuts criterion.

SHI, J. AND MALIK, J. 2000. Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22, 8, 888–905.

 

The problem of image segmentation is mapped to a weighted graph partitioning problem, where the vertex set of the graph is composed of image pixels and the edge weights represent some perceptual similarity between pixel pairs.

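In other words, segmentation is recast as partitioning a weighted graph: the vertices are the image pixels, and each edge weight represents a perceptual similarity between a pair of pixels.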
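The graph formulation can be illustrated on a toy example. The sketch below builds a fully connected affinity matrix from pixel intensities and splits the graph with the standard spectral relaxation of the normalized cut, taking the sign of the second generalized eigenvector of (D - W) y = λ D y. Several things here are assumptions for illustration only: the Gaussian intensity affinity with `sigma=50`, the use of power iteration with deflation instead of a proper eigensolver, the threshold at zero, and the six-pixel "image". It is not the implementation of Shi and Malik [2000].

```python
import math

def affinity(intensities, sigma=50.0):
    """Edge weights: Gaussian similarity of intensities, no self-loops."""
    n = len(intensities)
    return [[0.0 if i == j
             else math.exp(-((intensities[i] - intensities[j]) ** 2) / sigma ** 2)
             for j in range(n)] for i in range(n)]

def normalized_cut_bipartition(W, iterations=200):
    """Two-way split from the second generalized eigenvector of (D - W) y = lambda D y.

    Works on M = D^{-1/2} W D^{-1/2}: its top eigenvector is D^{1/2} 1 (trivial),
    and its second eigenvector z gives y = D^{-1/2} z. Assumes every vertex has
    positive degree.
    """
    n = len(W)
    degree = [sum(row) for row in W]
    root = [math.sqrt(d) for d in degree]
    M = [[W[i][j] / (root[i] * root[j]) for j in range(n)] for i in range(n)]
    scale = math.sqrt(sum(degree))
    trivial = [r / scale for r in root]          # unit-norm trivial eigenvector
    z = [float(i) for i in range(n)]             # deterministic starting vector
    for _ in range(iterations):
        # Deflation: remove the component along the trivial eigenvector.
        dot = sum(a * b for a, b in zip(z, trivial))
        z = [a - dot * b for a, b in zip(z, trivial)]
        # Power step with (M + I), whose eigenvalues are all non-negative.
        z = [z[i] + sum(M[i][j] * z[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(a * a for a in z))
        z = [a / norm for a in z]
    y = [z[i] / root[i] for i in range(n)]
    return [value > 0 for value in y]            # threshold at zero

row_of_pixels = [10, 12, 11, 200, 205, 198]
sides = normalized_cut_bipartition(affinity(row_of_pixels))
print(sides)
```

Which side is labelled `True` is arbitrary; only the grouping matters. On real images the affinity matrix is usually sparse (neighboring pixels only) and a sparse eigensolver would replace the power iteration used here.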

The normalized cut segmentation method in Shi and Malik [2000] is also extended to textured image segmentation by using cues of contour and texture differences [Malik et al. 2001], and to incorporate known partial grouping priors by solving a constrained optimization problem [and Shi 2004].

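The normalized cut method has also been extended to textured image segmentation, and to incorporating known partial grouping priors by solving a constrained optimization problem.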

 

Searching of medical image collections has been an increasingly important research problem of late, due to the high-throughput, high-resolution, and high-dimensional imaging modalities introduced.

(This part covers medical image retrieval.)

In this domain, 3D brain magnetic resonance (MR) images have been segmented using hidden Markov random fields and the expectation-maximization (EM) algorithm [Zhang et al. 2001], and the spectral clustering approach has found some success in segmenting vertebral bodies from sagittal MR images [Carballido-Gamio et al. 2004].

That is, 3D brain MR images have been segmented with hidden Markov random fields and the expectation-maximization algorithm, and spectral clustering has also had some success in segmenting vertebral bodies from spinal MR images.

ZHANG, Y., BRADY, M., AND SMITH, S. 2001. Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm. IEEE Trans. Medical Imag. 20, 1, 45–57.

CARBALLIDO-GAMIO, J., BELONGIE, S., AND MAJUMDAR, S. 2004. Normalized cuts in 3-D for spinal MRI segmentation. IEEE Trans. Medical Imag. 23, 1, 36–44.
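To give a feel for the EM part of such intensity-based segmentation, here is a heavily simplified sketch: EM for a two-component 1-D Gaussian mixture over voxel intensities. It deliberately omits the hidden Markov random field spatial prior that is central to Zhang et al. [2001], so it is not their method. The function names, the initialization, the variance floor `min_var`, and the sample intensities are all hypothetical.

```python
import math

def gaussian(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_two_classes(values, iterations=50, min_var=1e-3):
    """Fit a two-component Gaussian mixture with EM; return (labels, means).

    Assumes the two intensity classes overlap enough that every value keeps
    a non-zero likelihood under at least one component.
    """
    lo, hi = min(values), max(values)
    means = [float(lo), float(hi)]                 # start at the extremes
    spread = max((hi - lo) ** 2 / 4, min_var)
    variances = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: posterior probability of each class for each value.
        resp = []
        for x in values:
            p = [weights[k] * gaussian(x, means[k], variances[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means, and variances from the posteriors.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            variances[k] = max(var, min_var)       # floor avoids a collapsed component
    labels = [0 if r[0] >= r[1] else 1 for r in resp]
    return labels, means

intensities = [30, 32, 29, 31, 120, 118, 122, 119]
labels, means = em_two_classes(intensities)
print(labels, [round(m, 2) for m in means])
```

In the full model, the E-step posteriors would also depend on the labels of neighboring voxels through the random field, which is what encourages spatially coherent regions.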

 

 

 

 
