[Computer Science] [2015.05] Deep Learning-Based Object Detection in Images: Learning Where to Search for Important Regions Using Visual Attention

This is a master's thesis in computer science by Alina Kloss at the Eberhard Karls University of Tübingen, Germany. It is 101 pages long.

Detecting and identifying the different objects in an image quickly and reliably is an important skill for interacting with one's environment. The main problem is that, in theory, all parts of an image have to be searched for objects at many different scales to make sure that no object instance is missed. However, classifying the content of a given image region takes considerable time and effort, and both the time and the computational capacity that an agent can spend on classification are limited.

Humans use a process called visual attention to quickly decide which locations of an image need to be processed in detail and which can be ignored. This allows us to deal with the huge amount of visual information and to use the capacity of our visual system efficiently. In computer vision, researchers have to deal with exactly the same problems, so learning from human behaviour provides a promising way to improve existing algorithms.

In the presented master's thesis, a model is trained with eye tracking data recorded from 15 participants who were asked to search images for objects from three different categories. It uses a deep convolutional neural network to extract features from the input image, which are then combined to form a saliency map. This map indicates which image regions are interesting when searching for the given target object and can thus be used to reduce the parts of the image that have to be processed in detail. The method is based on a recent publication by Kümmerer et al., but in contrast to the original method, which computes general, task-independent saliency, the presented model is supposed to respond differently when searching for different target categories.
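As a rough illustration of the idea in the abstract, the following sketch combines a stack of feature maps into a saliency map and keeps only the most salient locations for detailed processing. It is not the thesis's implementation: the linear weighting, the softmax normalisation, the random stand-in features, the category names, and the `keep_fraction` parameter are all assumptions made for this example.

```python
import numpy as np

def saliency_map(features, weights):
    """Combine feature maps of shape (C, H, W) into a probability map (H, W)."""
    # Weighted sum over the channel axis gives one score per location.
    scores = np.tensordot(weights, features, axes=([0], [0]))
    # Softmax over all locations, shifted by the max for numerical stability.
    exp = np.exp(scores - scores.max())
    return exp / exp.sum()

def top_regions(saliency, keep_fraction=0.25):
    """Boolean mask marking the most salient fraction of locations."""
    k = max(1, int(round(keep_fraction * saliency.size)))
    threshold = np.sort(saliency, axis=None)[-k]
    return saliency >= threshold

rng = np.random.default_rng(0)
# Stand-in for CNN feature maps (hypothetical: 8 channels on a 6x6 grid).
features = rng.random((8, 6, 6))
# One hypothetical weight vector per target category, so that the same
# features yield a different saliency map for each search target.
category_weights = {
    "category_a": rng.normal(size=8),
    "category_b": rng.normal(size=8),
}
maps = {name: saliency_map(features, w) for name, w in category_weights.items()}
# Locations that a detector would examine in detail for category_a.
mask = top_regions(maps["category_a"])
```

Using a separate weight vector per category is one simple way to make the map depend on the search target; a detector would then classify only the regions where `mask` is true.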

1 Introduction

2 Fundamentals and Related Work

3 Methods

4 Tuning and Training the Network

5 Evaluation and Results

6 Discussion

7 Future Work

8 Appendix

Download the original English text:

http://page2.dfpan.com/fs/flcfj2421729a163774/
