An Introduction to Computer Vision
What is Computer Vision?
Computer vision can be understood from two perspectives: (1) computer science and (2) computer engineering.
In computer science, computer vision is the interdisciplinary field that develops theories and methods that allow computers to extract relevant information from digital images or videos. In computer engineering, it is regarded as the interdisciplinary field that develops algorithms and tools to automate perceptual tasks normally performed by the human visual system.
Every image tells a story
One of the goals of computer vision is to perceive the “story” behind the picture. It includes developing mathematical techniques for recovering the three-dimensional shape and appearance of objects in imagery, recognising objects and people in a scene, or even finding out what is happening in an image.
Human perception has its shortcomings
I will show you some optical illusions and tease out what they might tell us about the human visual system.
Which side of the object is brighter?
You may perceive the right side to be brighter. However, both sides appear to have the same colour once you cover the vertical line, as shown below.
In the image below, do the cells appear to pop in or pop out?
What happens when the image is rotated?
Which tile is darker in the image below? A or B?
Tiles A and B actually have the same absolute intensity value (as shown below). Your perception is due to brightness constancy: the visual system attempts to discount illumination when interpreting colours.
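The claim that two patches can share one intensity value while looking different is easy to check numerically. The following is a minimal sketch on a hypothetical synthetic scene, not the image from this article: the scene size, the surround values (40 and 200) and the patch value (120) are all assumptions chosen for illustration.

```python
# A tiny synthetic grayscale scene (values 0-255) stored as nested lists.
# Left half: dark surround (40). Right half: light surround (200).
# Two patches with the SAME intensity (120) sit in the two halves.
# All sizes and values here are illustrative assumptions.
WIDTH, HEIGHT = 12, 6
PATCH_VALUE = 120


def make_scene():
    scene = [[40 if x < WIDTH // 2 else 200 for x in range(WIDTH)]
             for _ in range(HEIGHT)]
    for y in range(2, 4):
        for x in range(2, 4):    # patch A, on the dark surround
            scene[y][x] = PATCH_VALUE
        for x in range(8, 10):   # patch B, on the light surround
            scene[y][x] = PATCH_VALUE
    return scene


def mean_intensity(scene, x0, x1, y0, y1):
    """Average pixel value over the half-open box [x0, x1) x [y0, y1)."""
    values = [scene[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return sum(values) / len(values)


scene = make_scene()
patch_a = mean_intensity(scene, 2, 4, 2, 4)
patch_b = mean_intensity(scene, 8, 10, 2, 4)
print(patch_a, patch_b, patch_a == patch_b)  # 120.0 120.0 True
```

A program reading the pixel values sees two identical patches; the difference a viewer reports comes from how the visual system weighs each patch against its surround.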
A variation of the Hermann grid illusion from Hany Farid is shown below. As you move your eyes over the figure, grey spots appear at the intersections.
Perceptual psychologists have spent decades trying to understand how the visual system works. More examples of unbelievable optical illusions can be found here.
Computer Vision: Applications
You may ask, “Have I ever used computer vision? How? Where?”
Over the last two decades, computer vision research has seen rapid development and is being used today in a wide range of real-world applications, which include:
Optical Character Recognition (OCR): converts images of text into machine-readable text. It also lets you search for handwritten characters in the Notes app on your iPhone.
Object Detection: lets you search for a specific person or object (for example, a cat) in your iPhone's photo library.
iPhone Portrait Mode: uses the phone's camera to create a depth-of-field effect, so you can compose a photo that keeps your subject sharp while blurring the background.
iPhone FaceTime Attention Correction: adjusts your gaze so that you appear to look at the camera while you are actually looking at the screen.
Amazon Go “Just Walk Out” Technology: automates much of the checkout and payment steps associated with a purchase transaction.
Vision-based Biometrics: read the story “How the Afghan Girl was Identified by Her Iris Patterns”.
3D Human Face Capture: learn about the latest research in digital human technology.
Motion Capture for Visual Effects: discover the technology behind shooting motion-capture scenes on location rather than on a sound stage.
Self-driving Cars: vehicles capable of sensing their environment and moving safely with little or no human input.
Image Synthesis: Playform is a tool that lets artists, creators and designers experiment with AI. I played around with the app, drew a landscape and generated the images shown below. My sketch was transformed into different images based on reference styles (e.g. Baroque) or artists (e.g. Vincent van Gogh). A similar application is GauGAN from NVIDIA Research.
Image-guided Surgery: learn how image-guided surgery (IGS) technology is used in the operating room.
Augmented/Mixed/Virtual/Extended Reality: Philips is piloting mixed reality for image-guided minimally invasive procedures.
Although not exhaustive, this list shows how widely computer vision is already used across application areas. It is a very active research area where breakthroughs happen almost every year.
Conclusion
Computer vision tasks generally include:
(a) obtaining simple inferences from individual pixel values
(b) grouping pixels to separate object regions or infer shape information
(c) recognising objects using geometric or statistical pixel information
(d) combining information from multiple images into a coherent whole
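Tasks (a) and (b) can be made concrete with a few lines of code. The following is a minimal sketch, not a production method: the 5x7 image, the threshold of 128 and the function names are all hypothetical choices for illustration. It thresholds each pixel (a per-pixel inference), then groups 4-connected bright pixels into labelled regions with a breadth-first search.

```python
from collections import deque

# Hypothetical 5x7 grayscale image (0-255): two bright blobs on a dark background.
IMAGE = [
    [10,  12, 200, 210,  11,   9,  10],
    [11, 205, 220, 215,  10,   8,  12],
    [ 9,  10,  12,  11,  10, 180, 190],
    [10,  11,   9,  10,  12, 185, 195],
    [12,  10,  11,   9,  10,  11,  10],
]
THRESHOLD = 128  # assumed cut-off between background and object


def threshold(image, cutoff):
    """Task (a): decide per pixel whether it is bright (True) or dark (False)."""
    return [[value > cutoff for value in row] for row in image]


def label_regions(mask):
    """Task (b): group 4-connected foreground pixels into numbered regions."""
    height, width = len(mask), len(mask[0])
    labels = [[0] * width for _ in range(height)]  # 0 means unlabelled
    count = 0
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or labels[y][x]:
                continue
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:  # breadth-first flood fill from the seed pixel
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


def region_sizes(labels, count):
    """Count the pixels that carry each region label."""
    sizes = {label: 0 for label in range(1, count + 1)}
    for row in labels:
        for label in row:
            if label:
                sizes[label] += 1
    return sizes


mask = threshold(IMAGE, THRESHOLD)
labels, count = label_regions(mask)
sizes = region_sizes(labels, count)
print(count, sizes)  # 2 {1: 5, 2: 4}
```

Real images usually need more robust steps than a single global threshold (noise, uneven lighting), but the structure is the same: a decision per pixel first, then grouping of neighbouring pixels into regions.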
Critical issues in computer vision:
(a) sensing: how do sensors obtain images of the world?
(b) encoded information: how do images yield information about the scene, such as colour, texture, shape and motion?
(c) representation: what representations are appropriate?
(d) algorithms: what algorithms are appropriate to process image information and construct scene descriptions?
While computer vision is almost entirely digital image processing at the low level, at the high level it is about knowledge construction, representation and inference:
(a) recognition: identify objects based on low-level information
(b) interpretation: assign meaning to groups of recognised objects
(c) scene analysis: reach a complete understanding of the captured scene
Acknowledgements
This is part of a series of articles that I am writing about computer vision. Some of the included images were taken from books (by Richard Szeliski, Ballard and Brown, and Shapiro and Stockman) and from lecture notes from Brown University, Cornell University, University of Michigan, University of New South Wales and University of California, Berkeley. Original sources are credited where possible.
Translated from: https://medium.com/swlh/computer-vision-introduction-b9d38e3e02cc