AI and Medical Imagery

Introduction

Over the past fifty years, healthcare has come a long way, improving both longevity and quality of life around the world. Technological advancements, easier access to care and a better understanding of the human genome are the main factors that carried the healthcare system this far [1]. Today, however, poor budgeting, growing demand and a shortage of specialists commensurate with that demand have put the healthcare system as we know it in peril. To add fuel to the fire, the COVID-19 pandemic has only highlighted these flaws. The system is calling for change, and AI could be part of that change.

Figure 1: Healthcare data is projected to grow from 153 exabytes in 2013 to 2143 exabytes by the fall of 2020 [3].

In the last decade, Big Data has grown more than tenfold [2], and computing hardware, software and cloud storage have seen comparable advances. This boom in data and technology has, in turn, made it possible to apply AI in industries across all walks of life: finance, education, travel, healthcare and more.

Within the healthcare sector, high-resolution medical images, biosensor physiological data, genome sequence data and digital records account for a large part of the medical data boom from 2013 to 2020 (see Figure 1). Medical professionals, whose capacity is already strained by the ever-growing demand for healthcare, lack the resources and training to analyze and extract value from this enormous influx of data. In view of this imbalance between the number of medical experts and the volume of patient data, AI systems have already begun to be deployed in medical imagery, genome sequencing and physiological biosensor data.

AI in Medical Imagery

With more than 2 billion medical scans performed worldwide every year [4], the need for human and artificial intelligence to converge in medical imagery is only growing. It is understandable, however, that like any established industry, healthcare has its reservations about change, especially change that comes fast. Perhaps a brief history of AI and its role in imagery can help break the ice.

Simply put, AI was conceived to imitate human intelligence (to some degree, see Figure 2), and since this intelligence is man-made, it is called “artificial”. A subcategory of AI is Machine Learning (ML), which in essence is the study of computer algorithms that build a mathematical model from sample data, with the end goal of making predictions when the algorithm encounters similar data.

Figure 2: Comparison between a biological (top) and an artificial (bottom) neuron. The feature inputs of the artificial neuron act as dendrites and the output as axon terminals.

One such ML algorithm is the Neural Network (NN), whose first use dates back to the mid-20th century. Inspired by the neurons in the human brain, an NN is built by combining multiple perceptrons (artificial neurons) and stacking them in layers, one after the other, to create a single-layer or multi-layer network. An NN with multiple layers is also called a Deep Neural Network (DNN), and training a DNN to perform predictive tasks is called Deep Learning. With the advent of Big Data and substantial computing power, DNNs have set the state of the art in a plethora of predictive tasks, becoming a default solution in numerous industries that seek to exploit data.
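To make the idea of stacked artificial neurons a little more tangible, here is a minimal sketch in plain Python. The weights are made up for illustration, and the sigmoid activation is an assumption on my part (a classical perceptron uses a hard threshold instead); the function names `neuron`, `layer` and `forward` are hypothetical, not taken from any library.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of the inputs passed through a sigmoid."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weight_rows, biases):
    """A layer is several neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

def forward(inputs, layers):
    """Feed the outputs of each layer into the next one."""
    activations = inputs
    for weight_rows, biases in layers:
        activations = layer(activations, weight_rows, biases)
    return activations

# Hypothetical weights: 2 inputs, one hidden layer of 2 neurons, 1 output neuron.
network = [
    ([[0.5, -0.4], [0.3, 0.8]], [0.0, -0.1]),  # hidden layer
    ([[1.2, -0.7]], [0.05]),                   # output layer
]
prediction = forward([1.0, 2.0], network)
print(prediction)
```

Training, which is not shown here, consists of adjusting those weights so that the output matches the labels of the sample data.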

Among the different NNs that came into use, Convolutional Neural Networks (CNNs) stood out in particular. Their unique capability to understand and extract information from images was demonstrated at ILSVRC2012, the 2012 edition of an annual competition to determine the best image classification and localization algorithm, where a CNN called AlexNet beat the competition by a large margin [5]. This began a new era of AI in imagery, and it was not long before medical imagery was involved too. Today, concrete applications of AI in medical imagery are already being realized in radiology, pathology, dermatology, ophthalmology and cardiology, to name a few.
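The operation that gives CNNs their name can be sketched in a few lines. The example below slides a small hand-written kernel over a toy 4x4 "image"; in a real CNN the kernel values are learned during training rather than written by hand, and the function name `convolve2d` is my own, not a library call.

```python
def convolve2d(image, kernel):
    """Valid (no padding) 2D cross-correlation, the operation CNN layers call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output

# Toy 4x4 "image": dark on the left, bright on the right.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
# Hand-written kernel that responds to left-to-right increases in brightness.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]
feature_map = convolve2d(image, vertical_edge)
print(feature_map)  # [[0, 18, 0], [0, 18, 0], [0, 18, 0]]
```

The feature map is non-zero only where the kernel straddles the dark-to-bright boundary, which is the sense in which a convolutional layer "extracts" a local pattern from pixels.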

Take the case of standard trauma X-rays, the market segment in which we at AZmed are positioned. The number of examinations has doubled in Europe in recent years, while the number of specialists capable of analyzing this flow of information has not grown in proportion [5]. To make matters worse, 92% of standard trauma X-rays turn out negative for bone lesions. This means specialists spend precious time and energy on routine cases when they could be attending to graver, life-threatening ones. To remedy this, we at AZmed are on a mission to lend radiologists a helping hand in detecting bone lesions in X-rays. Our computer-aided diagnosis tool harnesses AI to optimize radiologists’ workflow, allowing faster and more precise detection and, as a consequence, helping them conserve time and energy for intricate cases.

AI and COVID-19

The world is slowly and cautiously coming out of the Coronavirus Disease 2019 (COVID-19) pandemic. Having infected more than 4.8 million people and caused more than 300,000 deaths across 188 countries [7], COVID-19 has attacked every link of the chain that constitutes society. While the pandemic has caused much hurt and suffering across all dimensions of wellness, it has also exposed some gaping holes in the functioning of our society. The medical sector contains many of these critical holes, including shortages of personnel, equipment and budget, to name a few. In this section, I’ll elaborate on a potential solution to one piece of this puzzle, in the context of medical imagery and COVID-19.

Unfortunately, there is a general consensus in the medical community that COVID-19 cannot be accurately diagnosed using images from a CT-scan [8, 9, 10, 11], which is further corroborated by these studies [12, 13]. This does not mean that CT-scans and X-rays have no role to play in the fight. For example, one of the signs of COVID-19 is pulmonary lesions, and chest X-rays are frequently used to detect them. However, those pulmonary lesions could be caused by bacteria, fungi or other viruses, and distinguishing between these causes on a chest X-ray is a hard task. This is precisely where CNNs can come into play. Their ability to pick up subtle pixel patterns in an image and extract relevant information could prove pivotal in pinpointing the cause of the disease and, needless to say, in ensuring that the infected patient receives the appropriate treatment and care.

Figure 3: Pneumonia and its causes.

Enough talk. Let’s get our neurons a little dirty with a proof of concept for an AI solution that can detect the cause of pneumonia from chest X-rays. It’ll get a little technical from here on, so bear with me. Let’s assume we have a dataset containing:

images of chest X-rays of patients suffering from pneumonia
labels indicating the cause of the pneumonia: Bacterial, Fungal, Viral-Other, Viral-COVID-19 (as seen in Figure 3)

Predicting the cause of pneumonia from a dataset of chest X-rays falls under the category of image classification, and CNNs are the gold standard for classification and detection tasks in images. One way of carrying out image classification with a CNN is a straightforward training procedure (as shown in Figure 4a): the dataset is fed to the CNN, and the CNN trains on it, extracting salient features that help tell images of one class apart from another. While this training methodology works fine on traditional datasets, it is sub-optimal on datasets where there is an implicit hierarchical relationship between labels.
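A small sketch may help show why the flat approach is blind to the hierarchy. It assumes the CNN ends in a softmax over the four labels and is trained with cross-entropy, which is a common setup but not something stated above; the scores and probabilities are invented for illustration.

```python
import math

CLASSES = ["Bacterial", "Fungal", "Viral-Other", "Viral-COVID-19"]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probabilities, true_label):
    """Flat loss: only the probability assigned to the true class matters."""
    return -math.log(probabilities[CLASSES.index(true_label)])

# Hypothetical raw outputs of the CNN's last layer for one X-ray.
scores = [2.0, 0.1, 1.5, 0.3]
probabilities = softmax(scores)
predicted = CLASSES[probabilities.index(max(probabilities))]

# Two hypothetical predictions for a COVID-19 case: one leans towards the
# sibling class (Viral-Other), the other towards an unrelated parent (Bacterial).
near_miss = [0.05, 0.05, 0.6, 0.3]
far_miss = [0.6, 0.05, 0.05, 0.3]
loss_near = cross_entropy(near_miss, "Viral-COVID-19")
loss_far = cross_entropy(far_miss, "Viral-COVID-19")
print(predicted, loss_near == loss_far)
```

Both mistakes receive the same loss, even though confusing COVID-19 with another virus is a smaller error than calling it bacterial. That is the information a flat classifier leaves on the table.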

In our imaginary dataset, the causes of pneumonia can be split into three principal, or parent, categories: Bacterial, Fungal and Viral. The Viral category can be further sub-divided into two child categories, COVID-19 and Other. The classical training procedure ignores these hierarchical relationships and simply classifies the data into the four categories listed earlier. A conditional training approach, by contrast, would better exploit the hierarchical relationships between diseases in a given dataset. In particular, we’ll focus on the conditional training procedure proposed by Chen et al. [12]. The main idea is to carry out the training in two steps:

Figure 4: Classical vs conditional training using CNNs.

Conditional training: step one of the training procedure, which aims at capturing the dependent relationships between parent and child labels. Here a CNN is trained on a partial training set (shown as (1) in Figure 4b) containing only the samples that are positive for a parent category with at least one child category, to classify the child labels: Viral-Other and Viral-COVID-19 in our scenario.
Transfer learning: the second step exploits the transfer learning technique, which reuses knowledge acquired during one training run to help with a related predictive task in another. An example makes this more concrete. Say Novak wants to learn to play tennis; he is likely to learn the sport more quickly if he is already versed in table tennis, because he can transfer certain notions acquired playing table tennis to learning tennis. To apply this technique, the weights from step one are used as the starting point for a second round of training on the complete dataset, with all layers frozen except the last, to classify the parent classes: Bacterial, Viral and Fungal.
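The two steps above can be sketched end to end on toy data. This is a loose illustration under several assumptions of mine, not the procedure from the cited paper: a single tanh layer stands in for the CNN body, each "X-ray" is three made-up numbers, and because step two predicts three classes instead of two, the last layer is re-created rather than reused. All names (`PARENT_OF`, `features`, `train`, `new_head`) are hypothetical.

```python
import math
import random

PARENT_OF = {"Bacterial": "Bacterial", "Fungal": "Fungal",
             "Viral-Other": "Viral", "Viral-COVID-19": "Viral"}

def softmax(scores):
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def features(extractor, x):
    """Shared feature extractor: one tanh layer standing in for the CNN body."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in extractor]

def train(extractor, head, samples, classes, freeze_extractor, lr=0.1, epochs=50):
    """Plain SGD on softmax cross-entropy; optionally leave the extractor untouched."""
    for _ in range(epochs):
        for x, label in samples:
            h = features(extractor, x)
            probs = softmax([sum(w * v for w, v in zip(row, h)) for row in head])
            # Gradient of the loss with respect to each class score: prob - one_hot.
            grad_scores = [p - (1.0 if c == label else 0.0) for p, c in zip(probs, classes)]
            # Back-propagate into the hidden features before the head is updated.
            grad_h = [sum(g * head[k][j] for k, g in enumerate(grad_scores))
                      for j in range(len(h))]
            for k, g in enumerate(grad_scores):
                for j in range(len(h)):
                    head[k][j] -= lr * g * h[j]
            if not freeze_extractor:
                for j, row in enumerate(extractor):
                    grad_pre = grad_h[j] * (1.0 - h[j] ** 2)  # derivative of tanh
                    for i in range(len(row)):
                        row[i] -= lr * grad_pre * x[i]

def new_head(n_classes, n_features, rng):
    """Freshly initialised last layer."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_features)] for _ in range(n_classes)]

rng = random.Random(0)
# Made-up stand-in for X-ray images: 3 numbers per sample (the last is a bias term).
dataset = [
    ([0.9, 0.1, 1.0], "Bacterial"), ([0.8, 0.2, 1.0], "Bacterial"),
    ([0.1, 0.9, 1.0], "Fungal"), ([0.2, 0.8, 1.0], "Fungal"),
    ([-0.8, -0.3, 1.0], "Viral-Other"), ([-0.9, -0.2, 1.0], "Viral-Other"),
    ([-0.3, -0.9, 1.0], "Viral-COVID-19"), ([-0.2, -0.8, 1.0], "Viral-COVID-19"),
]
extractor = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(4)]

# Step 1: conditional training on the Viral-only subset, predicting the child label.
child_classes = ["Viral-Other", "Viral-COVID-19"]
viral_subset = [(x, y) for x, y in dataset if PARENT_OF[y] == "Viral"]
child_head = new_head(len(child_classes), 4, rng)
train(extractor, child_head, viral_subset, child_classes, freeze_extractor=False)

# Step 2: transfer learning on the full dataset, predicting the parent label.
parent_classes = ["Bacterial", "Fungal", "Viral"]
parent_samples = [(x, PARENT_OF[y]) for x, y in dataset]
frozen_copy = [row[:] for row in extractor]
parent_head = new_head(len(parent_classes), 4, rng)
train(extractor, parent_head, parent_samples, parent_classes, freeze_extractor=True)
```

In a real project the same pattern would be expressed with a deep learning framework, where "freezing" typically means excluding the early layers' parameters from the optimizer. The point of the sketch is only the order of operations: train on the subset first, then keep those weights fixed while the last layer learns the parent classes.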

Inference after conditional training involves a specific procedure, which I’ll let you inquisitive souls uncover for yourselves in this study [13]. There you have it: an AI solution for classifying the cause of pneumonia. Medical workers equipped with such a solution would be able to make rapid and reliable diagnoses, two R’s that are essential to compensate for the shortage of medical workers.
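For readers who want a hint before opening the study, one natural way to combine a parent-level model with a child-level model is the chain rule of probability: P(child) = P(parent) x P(child | parent). The sketch below assumes that this is how the two outputs are merged; it is my illustration and should not be read as the exact procedure of the cited study. The probabilities are invented.

```python
def combine(parent_probs, child_given_viral):
    """Chain rule: P(child) = P(Viral) * P(child | Viral)."""
    combined = {label: p for label, p in parent_probs.items() if label != "Viral"}
    for child, p in child_given_viral.items():
        combined[child] = parent_probs["Viral"] * p
    return combined

# Hypothetical outputs of the parent-level and child-level models for one X-ray.
parent_probs = {"Bacterial": 0.25, "Fungal": 0.25, "Viral": 0.5}
child_given_viral = {"Viral-Other": 0.25, "Viral-COVID-19": 0.75}

final = combine(parent_probs, child_given_viral)
diagnosis = max(final, key=final.get)
print(final, diagnosis)
```

With these numbers the combined distribution still sums to 1, and the COVID-19 probability (0.375) is the product of "it is viral" (0.5) and "given that it is viral, it is COVID-19" (0.75).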

Conclusions

As a data scientist in the medical sector, I often get asked the question, “Will AI ever replace medical experts?” Someday, maybe, and come that day, data scientists will most probably find themselves in the same boat as medical experts and people from an array of other fields. But that day is not today, so the question one should be asking is, “Will AI help augment the workflow of medical experts?” The short and the long answer is yes.

As it stands today, AI is nothing but a tool, one that can be shaped towards proficiency on a specific task, be it fraud detection, disease classification or language translation. And like any tool, it must not be mistaken for an alternative to human intervention, which is always going to be present to some degree. To conclude, I shall borrow the words of the renowned cardiologist, scientist and author Eric Jeffrey Topol:

Machines will not replace physicians, but physicians using A.I will soon replace those who aren’t.

About the writer

Having lived across the eastern hemisphere, I became aware of societal problems, among which health and the environment drew my attention in particular. To back my resolve to be part of the solution, I first obtained a master’s in management, energy and environment, followed by another master’s in machine learning and data mining. Armed with technical know-how and resolved to make a positive impact, I decided to join AZmed. Today, I am involved in the R&D of state-of-the-art AI solutions aimed at augmenting the workflow of medical professionals.

Translated from: https://medium.com/azmed/ai-and-medical-imagery-a-much-needed-marriage-33bd4201a8ac