
by Mariya Yao

4 Approaches To Natural Language Processing & Understanding

In 1971, Terry Winograd wrote the SHRDLU program while completing his PhD at MIT.

SHRDLU features a world of toy blocks where the computer translates human commands into physical actions, such as “move the red pyramid next to the blue cube.”

To succeed at such tasks, the computer must build up semantic knowledge iteratively, a process Winograd found to be brittle and limited.

The rise of chatbots and voice-activated technologies has renewed fervor in natural language processing (NLP) and natural language understanding (NLU) techniques that can produce satisfying human-computer dialogs.

Unfortunately, academic breakthroughs have not yet translated into improved user experience. Gizmodo writer Darren Orf declared Messenger chatbots “frustrating and useless” and Facebook admitted a 70% failure rate for their highly anticipated conversational assistant, “M.”

Nevertheless, researchers forge ahead with new plans of attack, occasionally revisiting the same tactics and principles Winograd tried in the 70s.

OpenAI recently leveraged reinforcement learning to teach agents to design their own language by “dropping them into a set of simple worlds, giving them the ability to communicate, and then giving them goals that can be best achieved by communicating with other agents.” The agents independently developed a simple “grounded” language.

MIT Media Lab presents this satisfying clarification on what “grounded” means in the context of language:

“Language is grounded in experience. Unlike dictionaries which define words in terms of other words, humans understand many basic words in terms of associations with sensory-motor experiences. People must interact physically with their world to grasp the essence of words like “red,” “heavy,” and “above.” Abstract words are acquired only in relation to more concretely grounded terms. Grounding is thus a fundamental aspect of spoken language, which enables humans to acquire and to use words and sentences in context.”

The antithesis of grounded language is inferred language. Inferred language derives meaning from words themselves rather than what they represent.

When trained only on large corpuses of text — but not on real-world representations — statistical methods for NLP and NLU lack true understanding of what words mean.

OpenAI points out that such approaches share the weaknesses revealed by John Searle’s famous Chinese Room thought experiment. Equipped with a universal dictionary to map all possible Chinese input sentences to Chinese output sentences, anyone can perform a brute force lookup and produce conversationally acceptable answers without understanding what they’re actually saying.

Why Is Language So Complex?

Percy Liang, a Stanford CS professor and NLP expert, breaks down the various approaches to NLP / NLU into four distinct categories:

  1. Distributional

  2. Frame-based

  3. Model-theoretical

  4. Interactive learning

First, a brief linguistics lesson before we continue on to define and describe those categories.

There are three levels of linguistic analysis:

  1. Syntax — what is grammatically correct?

  2. Semantics — what is the meaning?

  3. Pragmatics — what is the purpose or goal?

Drawing upon a programming analogy, Liang likens successful syntax to “no compiler errors,” semantics to “no implementation bugs,” and pragmatics to “implemented the right algorithm.”

He highlights that sentences can have the same semantics yet different syntax, such as “3+2” versus “2+3”. Similarly, they can have identical syntax yet different semantics: for example, 3/2 is interpreted differently in Python 2.7 and Python 3.
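
Liang's Python example is directly checkable. Under Python 3, the `/` operator performs true division, while `//` reproduces the floor behavior that Python 2.7 applied to the identical syntax:

```python
# Python 2.7 evaluates 3/2 as floor division of two ints, giving 1.
# Python 3 interprets the same syntax as true division.
print(3 / 2)   # 1.5 in Python 3
print(3 // 2)  # 1 -- the operator that preserves Python 2's behavior
```

Same syntax, different semantics: which algorithm the expression denotes depends on the interpreter, exactly the distinction Liang is drawing.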

Ultimately, pragmatics is key, since language is created from the need to motivate an action in the world. If you implement a complex neural network to model a simple coin flip, you have excellent semantics but poor pragmatics since there are a plethora of easier and more efficient approaches to solve the same problem.

Plenty of other linguistics terms exist which demonstrate the complexity of language. Words take on different meanings when combined with other words, such as “light” versus “light bulb” (that is, multi-word expressions), or used in various sentences such as “I stepped into the light” and “the suitcase was light” (polysemy).

Hyponymy shows how a specific instance is related to a general term (a cat is a mammal) and meronymy denotes that one term is a part of another (a cat has a tail). Such relationships must be understood to perform the task of textual entailment, recognizing when one sentence is logically entailed in another. “You’re reading this article” entails the sentence “you can read.”

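
The cat examples can be made concrete with a toy is-a hierarchy. The entries below are invented for illustration; real systems draw hyponymy and meronymy relations from lexical resources such as WordNet:

```python
# A toy is-a (hyponymy) hierarchy; the entries are illustrative only.
IS_A = {
    "cat": "mammal",
    "mammal": "animal",
    "sparrow": "bird",
    "bird": "animal",
}

def entails_is_a(specific: str, general: str) -> bool:
    """Follow is-a links transitively: a cat is a mammal, hence an animal."""
    while specific in IS_A:
        specific = IS_A[specific]
        if specific == general:
            return True
    return False

print(entails_is_a("cat", "animal"))  # True
print(entails_is_a("cat", "bird"))    # False
```

Even this minimal lookup supports a crude form of entailment: “a cat purred” entails “an animal purred” precisely because the hierarchy licenses the substitution.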
Aside from complex lexical relationships, your sentences also involve beliefs, conversational implicatures, and presuppositions. Liang provides excellent examples of each. Superman and Clark Kent are the same person, but Lois Lane believes Superman is a hero while Clark Kent is not.

If you say “Where is the roast beef?” and your conversation partner replies “Well, the dog looks happy”, the conversational implicature is the dog ate the roast beef.

Presuppositions are background assumptions that are true regardless of the truth value of a sentence. “I have stopped eating meat” has the presupposition “I once ate meat” even if you inverted the sentence to “I have not stopped eating meat.”

Adding to the complexity are vagueness, ambiguity, and uncertainty. Uncertainty is when you see a word you don’t know and must guess at the meaning.

If you’re stalking a crush on Facebook and their relationship status says “It’s Complicated”, you already understand vagueness. Richard Socher, Chief Scientist at Salesforce, gave an excellent example of ambiguity at a recent AI conference: “The question ‘can I cut you?’ means very different things if I’m standing next to you in line or if I am holding a knife.”

Now that you’re more enlightened about the myriad challenges of language, let’s return to Liang’s four categories of approaches to semantic analysis in NLP and NLU.

1: Distributional Approaches

Distributional approaches include the large-scale statistical tactics of machine learning and deep learning. These methods typically turn content into word vectors for mathematical analysis and perform quite well at tasks such as part-of-speech tagging (is this a noun or a verb?), dependency parsing (does this part of a sentence modify another part?), and semantic relatedness (are these different words used in similar ways?). These NLP tasks don’t rely on understanding the meaning of words, but rather on the relationship between words themselves.

Such systems are broad, flexible, and scalable. They can be applied widely to different types of text without the need for hand-engineered features or expert-encoded domain knowledge. The downside is that they lack true understanding of real-world semantics and pragmatics. Comparing words to other words, or words to sentences, or sentences to sentences can all result in different outcomes.

Semantic similarity, for example, does not mean synonymy. A nearest neighbor calculation may even deem antonyms as related:

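
Why nearest-neighbor methods conflate antonyms is easy to see with toy context-count vectors: “hot” and “cold” appear in nearly identical contexts, so their cosine similarity is high. The counts below are made up purely to illustrate that effect, not taken from any corpus:

```python
import math

# Toy co-occurrence counts over four contexts (weather, coffee, fruit, yellow).
# Values are invented to mimic the fact that antonyms share contexts.
vectors = {
    "hot":    [9, 7, 0, 0],
    "cold":   [8, 6, 0, 0],
    "banana": [0, 0, 9, 8],
}

def cosine(u, v):
    """Standard cosine similarity between two count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

print(cosine(vectors["hot"], vectors["cold"]))    # ~0.9998 -- antonyms look like near-synonyms
print(cosine(vectors["hot"], vectors["banana"]))  # 0.0
```

The geometry captures “used in similar ways,” not “means the same thing,” which is precisely the gap between semantic similarity and synonymy.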
Advanced modern neural network models, such as the end-to-end attentional memory networks pioneered by Facebook or the joint multi-task model invented by Salesforce, can handle simple question-answering tasks, but are still in early pilot stages for consumer and enterprise use cases.

Thus far, Facebook has only publicly shown that a neural network trained on an absurdly simplified version of The Lord of The Rings can figure out where the elusive One Ring is located.

Although distributional methods achieve breadth, they cannot handle depth. Complex and nuanced questions that rely on linguistic sophistication and contextual world knowledge have yet to be answered satisfactorily.

2: Frame-Based Approach

“A frame is a data-structure for representing a stereotyped situation,” explains Marvin Minsky in his seminal 1974 paper called “A Framework For Representing Knowledge.” Think of frames as a canonical representation for which specifics can be interchanged.

Liang provides the example of a commercial transaction as a frame. In such situations, you typically have a seller, a buyer, goods being exchanged, and an exchange price.

Sentences that are syntactically different but semantically identical — such as “Cynthia sold Bob the bike for $200” and “Bob bought the bike for $200 from Cynthia” — can be fit into the same frame. Parsing then entails first identifying the frame being used, then populating the specific frame parameters — i.e. Cynthia, $200.

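
A minimal sketch of that parse is shown below. The two regular expressions are hand-written for exactly these two surface forms, which is the sense in which frame-based methods “require supervision”; the field names and patterns are illustrative, not a real frame-semantic parser:

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class CommercialTransaction:
    seller: str
    buyer: str
    goods: str
    price: str

# One hand-written pattern per surface realization of the frame.
PATTERNS = [
    re.compile(r"(?P<seller>\w+) sold (?P<buyer>\w+) the (?P<goods>\w+) for (?P<price>\$\d+)"),
    re.compile(r"(?P<buyer>\w+) bought the (?P<goods>\w+) for (?P<price>\$\d+) from (?P<seller>\w+)"),
]

def parse(sentence: str) -> Optional[CommercialTransaction]:
    """Identify which frame pattern applies, then fill its parameters."""
    for pattern in PATTERNS:
        match = pattern.search(sentence)
        if match:
            return CommercialTransaction(**match.groupdict())
    return None

a = parse("Cynthia sold Bob the bike for $200")
b = parse("Bob bought the bike for $200 from Cynthia")
print(a == b)  # True: different syntax, same filled frame
```

Both sentences land in the same `CommercialTransaction` instance, while “Cynthia visited the bike shop yesterday” would match neither pattern, illustrating the incompleteness discussed below.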
The obvious downside of frames is that they require supervision. In some domains, an expert must create them, which limits the scope of frame-based approaches. Frames are also necessarily incomplete. Sentences such as “Cynthia visited the bike shop yesterday” and “Cynthia bought the cheapest bike” cannot be adequately analyzed with the frame we defined above.

3: Model-Theoretical Approach

The third category of semantic analysis falls under the model-theoretical approach. To understand this approach, we’ll introduce two important linguistic concepts: “model theory” and “compositionality”.

Model theory refers to the idea that sentences refer to the world, as in the case with grounded language (i.e. the block is blue). In compositionality, meanings of the parts of a sentence can be combined to deduce the whole meaning.

Liang compares this approach to turning language into computer programs. To determine the answer to the query “what is the largest city in Europe by population”, you first have to identify the concepts of “city” and “Europe” and funnel down your search space to cities contained in Europe. Then you would need to sort the population numbers for each city you’ve shortlisted so far and return the maximum of this value.

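
That decomposition is literally executable against a database. The toy table below is invented for illustration (the figures are not authoritative, and “largest in Europe” depends on how you count), but the filter-then-maximize program structure is exactly what Liang describes:

```python
# A toy database; names and figures are illustrative only.
cities = [
    {"name": "London", "continent": "Europe", "population": 8_900_000},
    {"name": "Madrid", "continent": "Europe", "population": 3_300_000},
    {"name": "Tokyo",  "continent": "Asia",   "population": 13_900_000},
]

# "largest city in Europe by population" compiled into a program:
# first funnel the search space to the concept "city in Europe",
# then maximize over the population attribute.
europe = [c for c in cities if c["continent"] == "Europe"]
answer = max(europe, key=lambda c: c["population"])
print(answer["name"])  # London, within this toy table
```

Compositionality is what makes this work: the meanings of “city”, “in Europe”, and “largest by population” each contribute one step of the program.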
Executing the sentence “Remind me to buy milk after my last meeting on Monday” requires a similar compositional breakdown and recombination.

Models vary from needing heavy-handed supervision by experts to light supervision from average humans on Mechanical Turk. The advantages of model-based methods include full-world representation, rich semantics, and end-to-end processing, which enable such approaches to answer difficult and nuanced search queries.

The major con is that the applications are heavily limited in scope due to the need for hand-engineered features. Applications of model-theoretic approaches to NLU generally start from the easiest, most contained use cases and advance from there.

The holy grail of NLU is both breadth and depth, but in practice you need to trade off between them. Distributional methods have scale and breadth, but shallow understanding. Model-theoretical methods are labor-intensive and narrow in scope. Frame-based methods lie in between.

4: Interactive Learning Approaches

Paul Grice, a British philosopher of language, described language as a cooperative game between speaker and listener. Liang is inclined to agree. He believes that a viable approach to tackling both breadth and depth in language learning is to employ interactive environments where humans teach computers gradually. In such approaches, the pragmatic needs of language inform the development.

To test this theory, Liang developed SHRDLRN as a modern-day version of Winograd’s SHRDLU. In this interactive language game, a human must instruct a computer to move blocks from a starting orientation to an end orientation. The challenge is that the computer starts with no concept of language. Step by step, the human says a sentence and then visually indicates to the computer what the result of the execution should look like.

If a human plays well, he or she adopts consistent language that enables the computer to rapidly build a model of the game environment and map words to colors or positions. The surprising result is that any language will do, even individually invented shorthand notation, as long as you are consistent.

The worst players who take the longest to train the computer often employ inconsistent terminology or illogical steps.

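
The dynamic described above — consistent language trains the computer quickly, inconsistent language slowly — can be sketched with a count-based learner that associates each word of an utterance with the action the human then demonstrates. Everything here is a drastic simplification of SHRDLRN, not its actual algorithm; the action names are invented:

```python
from collections import defaultdict

# word -> action -> co-occurrence count, built up from demonstrations.
counts = defaultdict(lambda: defaultdict(int))

def observe(utterance: str, demonstrated_action: str) -> None:
    """The human says a sentence, then visually shows the intended action."""
    for word in utterance.lower().split():
        counts[word][demonstrated_action] += 1

def predict(utterance: str, actions: list) -> str:
    """Score each candidate action by the learned word associations."""
    def score(action: str) -> int:
        return sum(counts[w][action] for w in utterance.lower().split())
    return max(actions, key=score)

ACTIONS = ["stack_red", "remove_red", "stack_blue"]

# A consistent player reuses the same words for the same actions...
observe("add a red block", "stack_red")
observe("add a blue block", "stack_blue")

# ...so the model generalizes after only a couple of demonstrations.
print(predict("add a red block please", ACTIONS))  # stack_red
```

Note that nothing in the loop assumes English: any notation works, as long as the player uses it consistently, which mirrors the experimental result.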
Liang’s bet is that such approaches would enable computers to solve NLP and NLU problems end-to-end without explicit models. “Language is intrinsically interactive,” he adds. “How do we represent knowledge, context, memory? Maybe we shouldn’t be focused on creating better models, but rather better environments for interactive learning.”

Language is both logical and emotional. We use words to describe both math and poetry. Accommodating the wide range of our expressions in NLP and NLU applications may entail combining the approaches outlined above, ranging from the distributional / breadth-focused methods to model-based systems to interactive learning environments.

We may also need to re-think our approaches entirely, using interactive human-computer based cooperative learning rather than researcher-driven models.

If you have a spare hour and a half, I highly recommend you watch Percy Liang’s entire talk, on which this summary article is based.

A special thanks to Melissa Fabros for recommending Percy’s talk, Matthew Kleinsmith for highlighting the MIT Media Lab definition of “grounded” language, and Jeremy Howard and Rachel Thomas of fast.ai for facilitating our connection and conversation.

If you enjoyed my article, join the TOPBOTS community and get the best bot news and exclusive industry content.

Translated from: https://www.freecodecamp.org/news/how-natural-language-processing-powers-chatbots-4-common-approaches-a077a4de04d4/
