The AI Trust Crisis: How to Move Forward

Social, racial, and gender bias in data and models has emerged as a major concern for the machine learning industry and for society.

Recently, MIT withdrew a popular computer vision dataset from public access after a team of researchers found that it was socially biased and tinged with misogynistic and racist labels.

This discovery in Tiny Images, a dataset of 80 million images, is a perfect example of how social bias proliferates and spreads into machine learning datasets and projects. In 1985, researchers in linguistics and psychology at Princeton University produced a semantic lexicon for the English language called WordNet, which has been widely used for natural language processing tasks. Building on WordNet, MIT scientists released Tiny Images in 2006, a dataset of images compiled from internet image searches for WordNet words. Since WordNet was biased along gender and racial lines, so were the Tiny Images labels associated with the collected images.

Just a few weeks ago, PULSE, a generative model for self-supervised photo upsampling, generated a great deal of noise and concern over bias in AI models, after a pixelated image of former United States president Barack Obama was transformed by PULSE into a high-resolution image of a white man.

The PULSE model transforms pixelated faces into high-resolution images, with racial bias by turning the face of Barack Obama into a white man's face. Image: Twitter / @Chicken3gg

These two examples, among many, embody the bias problem in AI and show the complexity of an issue that extends well beyond data and models. It is a global problem that needs to be tackled from many angles, not just with technical fixes.

How to fix bias issues?

The state of the art is moving fast, but it is safe to say that minimizing the effect of bias in data can be done in many ways. There is no magic pill.

Yann LeCun, Chief AI Scientist at Facebook, explains that the most straightforward way to deal with bias is to be aware of the imbalance in the data and fix it by equalizing the frequencies of samples from the distinct categories. According to LeCun, this is the method used by Facebook in its face recognition system. To illustrate this, LeCun likes to use the following analogy:

“If you go to medical school, you don't spend time studying the common flu and some rare disease in proportion to their frequencies in patients. You spend comparatively more time on rare diseases because you need to develop the ‘features’ for them.” — Yann LeCun

Bias is such a concern in deep learning systems because, due to the non-convexity of the loss, they do not learn and create features for rare categories, whereas this does not happen with logistic regression, for instance.
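
As a concrete, minimal sketch of the frequency-equalizing idea above, one common technique is to resample training data so that rare classes appear as often as frequent ones. The snippet below uses PyTorch's WeightedRandomSampler on a hypothetical imbalanced toy dataset; the numbers and names are illustrative placeholders, not Facebook's actual pipeline.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

# Hypothetical imbalanced toy dataset: features X and integer class labels y.
X = torch.randn(1000, 16)
y = torch.cat([torch.zeros(950, dtype=torch.long),   # 95% majority class
               torch.ones(50, dtype=torch.long)])    # 5% minority class
dataset = TensorDataset(X, y)

# Weight each sample by the inverse frequency of its class,
# so batches are roughly balanced across categories.
class_counts = torch.bincount(y).float()
sample_weights = 1.0 / class_counts[y]

sampler = WeightedRandomSampler(weights=sample_weights,
                                num_samples=len(dataset),
                                replacement=True)
loader = DataLoader(dataset, batch_size=32, sampler=sampler)

# Each batch drawn from `loader` now contains minority-class samples
# far more often than their raw 5% frequency would suggest.
```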

A recent paper proposes a new way to identify and minimize bias with “Invariant Risk Minimization”. In a nutshell, this method aims to train the machine learning model to estimate invariant correlations across multiple training distributions.
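
The paper's practical variant, often referred to as IRMv1, turns this idea into a penalty that pushes a classifier to be simultaneously optimal across training environments. The sketch below is a rough, assumption-laden illustration of that penalty with two hypothetical environments and a simple linear model; it is not the authors' reference implementation.

```python
import torch
import torch.nn.functional as F

def irm_penalty(logits, targets):
    """Gradient penalty of the IRMv1 objective for one environment:
    the squared gradient of the risk w.r.t. a fixed 'dummy' scale of 1.0."""
    scale = torch.ones(1, requires_grad=True)
    loss = F.binary_cross_entropy_with_logits(logits * scale, targets)
    grad = torch.autograd.grad(loss, scale, create_graph=True)[0]
    return (grad ** 2).sum()

# Hypothetical setup: two environments with different spurious correlations.
envs = [(torch.randn(256, 8), torch.randint(0, 2, (256, 1)).float()) for _ in range(2)]
model = torch.nn.Linear(8, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
penalty_weight = 10.0  # trade-off between average risk and invariance

for step in range(100):
    risk, penalty = 0.0, 0.0
    for x, y in envs:
        logits = model(x)
        risk = risk + F.binary_cross_entropy_with_logits(logits, y)
        penalty = penalty + irm_penalty(logits, y)
    loss = risk / len(envs) + penalty_weight * penalty / len(envs)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```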

Bias is not the only concern

These recent events have reopened the debate on trustworthy AI: institutions and individuals working with AI models have to reconsider the way they craft, develop, and deploy AI. Consumers and citizens require the same level of rights and security whether a system is AI-based or not. Today, AI is not readily accepted by the public; most of the time there is a mistrust hovering around it. The ethical use of AI has become a prerequisite for trust.

It's time to break those worries and change the way we make AI.

In Europe, regulators are working on the subject: in April 2019 they released a set of ethics guidelines for trustworthy AI.

High-Level Expert Group on AI (AI HLEG) to publish ethics guidelines for trustworthy AI. Source: European Commission

According to the Guidelines, trustworthy AI should be built around 3 pillars:

  • Ethics: respecting and embracing ethical principles and values

  • Robustness: safe and reliable, from both a technical and a social point of view

  • Lawful: respecting all applicable laws and regulations

EU Guidelines for trustworthy AI.

Around these pillars, the EU regulators have defined seven key practical requirements that any AI system should meet in order to be labeled as trustworthy.

  1. Human oversight: AI systems must keep humans in the loop, providing insights that allow them to make informed decisions while protecting their rights and respecting their values.

  2. Technical robustness and safety: resilience and safety are also a major concern. Higher demands are placed on reliability, accuracy, and reproducibility. AI systems can no longer live as standalone components; they must be accompanied by a back-up system in the event of a problem.

  3. Privacy and data governance: respect for privacy and data protection is mandatory to enable trust, alongside data governance that guarantees data quality, integrity, and accredited access to data.

  4. Diversity, non-discrimination, and fairness: as illustrated in the opening section of this post, bias can have a massive impact, from discrimination to prejudice against specific groups of individuals. A trustworthy AI must avoid bias by design; a minimal sketch of such a check appears after this list. EU regulators also point out that access to AI systems must be universal and non-discriminatory, regardless of disability, and must involve all parties throughout the AI system's life cycle.

  5. Environmental and societal impact: sustainable development is at the core of this key requirement. Concerns about Green AI have grown in the past few years as the computational power required by deep learning models keeps increasing. AI systems must be “green” and benefit all, including future generations, taking into account both the environment and other living beings and species, with careful awareness of their societal impact.

  6. Accountability: responsibility and accountability for AI systems and their outcomes can be ensured through AI audits focusing on algorithms, data, and design.

  7. Transparency: traceability and explainable AI can help achieve trust in AI systems' decisions. Capabilities and limitations must be clearly explained and documented. Humans must be informed when they are interacting directly with an AI system or when their data is being used by one.
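
To make "avoid bias by design" (requirement 4) measurable in practice, teams often track group-level metrics during evaluation. Below is a minimal sketch computing per-group selection rates and their ratio, sometimes called the disparate impact ratio; the groups, data, and the 0.8 rule-of-thumb threshold are illustrative assumptions, not part of the EU guidelines.

```python
import numpy as np

def selection_rates(y_pred, groups):
    """Fraction of positive predictions per group."""
    return {g: float(np.mean(y_pred[groups == g])) for g in np.unique(groups)}

def disparate_impact_ratio(y_pred, groups):
    """Ratio of the lowest to the highest group selection rate.
    Values well below 1.0 suggest one group is favored by the model."""
    rates = selection_rates(y_pred, groups)
    return min(rates.values()) / max(rates.values()), rates

# Hypothetical model outputs on an evaluation set with two groups A and B.
y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0])
groups = np.array(["A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B"])

ratio, rates = disparate_impact_ratio(y_pred, groups)
print(rates)                      # selection rate per group
if ratio < 0.8:                   # 0.8 is a common rule-of-thumb threshold
    print(f"Warning: possible disparate impact (ratio = {ratio:.2f})")
```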

Ethics guidelines for trustworthy AI, European Commission

“Based on this work we can now move forward with creating a regulatory framework where everyone can reap the benefits of trustworthy AI.” — Commissioner for Internal Market, Thierry Breton

How to move forward and solve the AI trust crisis?

These 7 key principles can be used as an entry point to audit the value chain of AI systems, from conception to deployment and maintenance.

The major added value of this document issued by the European Commission is that it comes with a list of practical questions for implementing the key principles I have already mentioned.

This trustworthy AI assessment list, ALTAI, is a great initiative for any company or individual who wants to take their first steps and move forward in solving the AI trust crisis we face today. It transposes the guidelines into an accessible and practical checklist that sponsors, developers, and deployers of AI can use to quickly audit their AI systems. ALTAI is available in a document version and as a web-based tool (prototype version).
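
Without reproducing ALTAI itself, one lightweight way to get started is an internal self-assessment keyed to the seven requirements above. The sketch below is a hypothetical, heavily simplified illustration of that idea; the questions are placeholders and are not the official ALTAI items.

```python
# Hypothetical, simplified self-assessment keyed to the seven requirements
# discussed above; the questions are illustrative, not the official ALTAI ones.
checklist = {
    "Human oversight": [
        "Can a human override or stop the system's decisions?",
    ],
    "Technical robustness and safety": [
        "Is there a fallback plan if the model becomes unavailable or degrades?",
    ],
    "Privacy and data governance": [
        "Is access to training and inference data restricted and logged?",
    ],
    "Diversity, non-discrimination, and fairness": [
        "Are error rates monitored per demographic group?",
    ],
    "Environmental and societal impact": [
        "Is the compute and energy cost of training and serving tracked?",
    ],
    "Accountability": [
        "Is there a named owner responsible for the system's outcomes?",
    ],
    "Transparency": [
        "Are users told when they interact with the AI system?",
    ],
}

def assessment_report(answers):
    """answers maps each question to True/False; returns open items per requirement."""
    return {req: [q for q in qs if not answers.get(q, False)]
            for req, qs in checklist.items()}
```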

Glenn Carstens-Peters on Unsplash

Technical and non-technical methods are suggested to ensure that trustworthy AI is incorporated in the design, development, and rollout phases of an AI system.

The starting point for any company is to define its ethical values and responsibilities in the use of AI and to state them in a Code of Conduct. These should be translated into governance principles, and the key principles should be anchored in the AI system's architecture design. A critical point is to identify, assess, and manage risk through AI risk management tools and protocols. Rules, processes, and control procedures will set up the trustworthy AI framework. Monitoring models and KPIs over time will help detect data or performance drift and raise flags for potential bias.
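
As a minimal sketch of what such monitoring could look like, the snippet below compares a model's score distribution between a reference window and a recent production window with a two-sample Kolmogorov–Smirnov test from scipy; the windows, threshold, and alerting logic are hypothetical choices, not a standard.

```python
import numpy as np
from scipy import stats

def check_score_drift(reference_scores, recent_scores, alpha=0.01):
    """Flag potential drift when the recent score distribution differs
    significantly from the reference window (two-sample KS test)."""
    statistic, p_value = stats.ks_2samp(reference_scores, recent_scores)
    return {"ks_statistic": statistic, "p_value": p_value, "drift": p_value < alpha}

# Hypothetical example: scores logged at validation time vs. recent production.
rng = np.random.default_rng(0)
reference = rng.beta(2, 5, size=5_000)   # scores observed at validation time
recent = rng.beta(2.5, 4, size=1_000)    # scores observed in production

report = check_score_drift(reference, recent)
if report["drift"]:
    # In a real system this would raise an alert and trigger a review,
    # e.g. re-checking per-group error rates for signs of bias.
    print(f"Potential drift detected: {report}")
```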

The ecosystem will also evolve, with emphasis on regulation, standardization, and certification of AI systems, but also on education and awareness to foster an ethical mindset, with the participation of all stakeholders, social dialogue, diversity, and inclusive design teams.

AI audit methodologies will flourish in the coming years, and companies specialized in AI audits and certification will surely pop up on the market. There is without a doubt a huge market around trustworthy AI, and companies will strive to use it as a differentiating feature to stand out from their competitors.

I really hope this post will raise awareness of how important and urgent it is to move forward and start thinking about, designing, and implementing these ethical recommendations in our daily work as AI practitioners.

Trustfully yours, Tomas

Translated from: https://towardsdatascience.com/the-ai-trust-crisis-how-to-move-forward-9b523ea0158d
