GPT-What? A Non-Technical Guide to OpenAI's Groundbreaking New NLP Model

OpenAI's GPT-3 language model gained significant attention last week, leading many to believe that the new technology represents a major inflection point in the development of Natural Language Processing (NLP) tools. Those with early API access through OpenAI's beta program took to Twitter to showcase impressive early tools built with GPT-3:

For non-engineers, this may look like magic, but there is a lot to unpack here. In this article, I will provide a brief overview of GPT and what it can be used for.

What is OpenAI and GPT-3?

OpenAI is an AI research laboratory founded in 2015 by Elon Musk, Sam Altman, and others with the mission of creating AI that benefits all of humanity. The company received an additional $1 billion in funding from Microsoft in 2019 and is considered a leader in AI research and development.

Historically, obtaining large quantities of labelled data to train models has been a major barrier in NLP development (and AI development in general), as labelling is typically extremely time-consuming and expensive. To get around this, scientists use an approach called transfer learning: the representations learned by a previously trained model serve as the starting point for fine-tuning a new model on a different task.

For example, suppose you would like to learn a new language, say German. Initially, you will still think about your sentences in English, then translate and rearrange the words to arrive at the German equivalent. In reality, you are still indirectly applying what you learned about sentence structure, language, and communication from your first language, even though the actual words and grammar are different. This is why learning a new language is typically easier if you already know another one.

Applying this strategy to AI means that we can use pre-trained models to create new models more quickly with less training data. In this great walkthrough, Francois Chollet compared the effectiveness of an AI model trained from scratch to one built from a pre-trained model. His results showed that the latter had 15% greater predictive accuracy after training both with the same amount of training data.

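The mechanics of transfer learning can be sketched in a few lines of Python. This is a toy example, not Chollet's actual code: a frozen random projection stands in for the representations a pre-trained model has already learned, and only a small logistic-regression head is trained on the new task.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pre-trained model: a frozen feature extractor.
# In real transfer learning, these weights come from prior training
# on a large dataset and are left untouched (or only lightly tuned).
W_pretrained = rng.normal(size=(2, 8))

def extract_features(x):
    return np.tanh(x @ W_pretrained)  # frozen: never updated below

# Small labelled dataset for the *new* task (label = sign of first input).
X = rng.normal(size=(200, 2))
y = (X[:, 0] > 0).astype(float)

# Fine-tuning: train only a lightweight logistic-regression head.
w, b = np.zeros(8), 0.0
for _ in range(500):
    H = extract_features(X)
    p = 1.0 / (1.0 + np.exp(-(H @ w + b)))
    grad = p - y                       # gradient of the log loss
    w -= 0.1 * H.T @ grad / len(X)
    b -= 0.1 * grad.mean()

preds = 1.0 / (1.0 + np.exp(-(extract_features(X) @ w + b))) > 0.5
accuracy = (preds == y).mean()
print(f"train accuracy: {accuracy:.2f}")
```

Because only the tiny head is trained, a couple hundred labelled examples suffice; training the feature extractor itself from scratch would demand far more data.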
In 2018, OpenAI presented convincing research showing that this strategy (pairing supervised learning with unsupervised pre-training) is particularly effective for NLP tasks. They first produced a generative pre-trained model ("GPT") using "a diverse corpus of unlabeled text" (i.e. over 7,000 unique unpublished books from a variety of genres), essentially creating a model that "understood" English and language. Next, this pre-trained model could be further fine-tuned and trained to perform specific tasks using supervised learning. As an analogy, this would be like teaching someone English, then training him or her for the specific task of reading and classifying resumes of acceptable and unacceptable candidates for hiring.

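At its core, "generative pre-training" means learning, from raw unlabeled text alone, to predict what comes next. A toy bigram model makes the principle concrete; GPT itself uses a Transformer with billions of parameters rather than word counts, and this corpus is obviously made up.

```python
from collections import Counter, defaultdict

# "Unlabeled corpus": no human annotations, just raw text.
corpus = "the cat sat on the mat . the cat sat on the rug".split()

# Pre-training, toy version: count which word follows which.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word):
    # Generative step: emit the most likely continuation.
    return bigrams[word].most_common(1)[0][0]

# Generate a short continuation, one word at a time.
word, out = "the", ["the"]
for _ in range(4):
    word = predict_next(word)
    out.append(word)
print(" ".join(out))  # prints: the cat sat on the
```

Fine-tuning then adapts such a pre-trained predictor to a specific labelled task, instead of training a model from a blank slate.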
GPT-3 is the latest iteration of the GPT model, first described in May 2020. It contains 175 billion parameters, compared to 1.5 billion in GPT-2 (a 117x increase), and training it consumed several thousand petaflop/s-days of computing power. GPT-3 was fed far more data and tuned with far more parameters than GPT-2, and as a result it has produced some amazing NLP capabilities so far. The volume of data and computing resources required makes it impossible for many organizations to recreate the model, but luckily they won't have to, since OpenAI plans to release access via an API.

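Since the API was still in private beta at the time of writing, its exact shape is not public. The sketch below merely constructs a plausible JSON payload for a text-completion request; the endpoint URL, field names, and parameter values are assumptions for illustration, not documented API.

```python
import json

# Hypothetical endpoint -- consult OpenAI's documentation once access is granted.
API_URL = "https://api.openai.com/v1/completions"

payload = {
    "prompt": "Translate to French: Hello, world!",
    "max_tokens": 32,     # cap on the length of the generated completion
    "temperature": 0.7,   # higher values = more varied output
}

body = json.dumps(payload)
print(body)
# Actually sending it would need an API key, e.g. with the requests library:
#   requests.post(API_URL, json=payload,
#                 headers={"Authorization": "Bearer <YOUR_KEY>"})
```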
Critical reception

Admittedly, GPT-3 didn’t get much attention until last week’s viral tweets by Sharif Shameem and others (above). They demonstrated that GPT-3 could be used to create websites based on plain English instructions, envisioning a new era of no-code technologies where people can create apps by simply describing them in words. Early adopter Kevin Lacker tested the model with a Turing test and saw amazing results. GPT-3 performed exceptionally well in the initial Q&A and displayed many aspects of “common sense” that AI systems traditionally struggle with.

However, the model is far from perfect. Max Woolf published a critical analysis noting several issues, such as model latency, implementation problems, and concerning biases in the training data that need to be reconsidered. Several users have reported these issues on Twitter as well:

OpenAI’s blog discusses some of the key drawbacks of the model, most notably that GPT’s entire understanding of the world is based on the texts it was trained on. Case in point: it was trained in October 2019 and therefore does not know about COVID-19. It is unclear how these texts were chosen and what oversight was performed (or required) in this process.

Additionally, the enormous computing resources required to produce and maintain these models raise serious questions about the environmental impact of AI technologies. Although often overlooked, both hardware and software usage significantly contribute to depletion of energy resources, excessive waste generation, and excessive mining of rare earth minerals with the associated negative impacts to human health.

To quell concerns, OpenAI has repeatedly stated its mission to produce AI for the good of humanity, and it aims to cut off access to its API if misuse is detected. Even its beta access form asks applicants to describe their intentions for the technology and its benefits and risks to society.

Where do we go from here?

Without a doubt, GPT-3 still represents a major milestone in AI development. Many early users have built impressive apps that accurately process natural language and produce amazing results. In summary:

  1. GPT-3 is a major improvement upon GPT-2, with far greater accuracy across a broader range of use cases. This is a significant step forward for AI development, accomplished impressively in just a two-year time frame

  2. Early tools built on GPT-3 show great promise for commercial usability, such as: no-code platforms that let you build apps by describing them; advanced search platforms that use plain English; and better data analytics tools that make data gathering and processing much faster

  3. OpenAI announced plans to release a commercial API, which will enable organizations to build products powered by GPT-3 at scale. However, many questions remain about how exactly this will be executed — pricing, SLA, model latency, etc.

  4. Users have pointed out several issues that need to be addressed before widespread commercial use. Inherent biases in the model, questions around fairness and ethics, and concerns about misuse (fake news, bots, etc.) need to be thought through and oversight might be necessary

  5. OpenAI is openly committed to creating AI for the benefit of humanity, but still, monitoring for misuse at scale will be difficult to achieve. This raises a broader question about the necessity of government involvement to protect the rights of individuals

All told, I'm extremely excited to see which new technologies are built on GPT-3 and how OpenAI continues to improve its model. Increased attention and funding in NLP and GPT-3 might be enough to ward off fears from many critics (myself included) that another AI winter might be coming. Despite the model's shortfalls, I am hoping that everyone can be optimistic about a future where humans and machines communicate with each other in a unified language, and where the ability to create tools using technology is accessible to billions more people.

Translated from: https://towardsdatascience.com/gpt-what-why-this-groundbreaking-model-is-driving-the-future-of-ai-and-nlp-e38fcf891172
