Convolutional Methods for Text

Reposted from: 爱可可-爱生活

tl;dr

  • RNNs work great for text, but convolutions can do it faster

  • Any part of a sentence can influence the semantics of a word. For that reason we want our network to see the entire input at once

  • Getting that big a receptive field can make gradients vanish and our networks fail

  • We can solve the vanishing gradient problem with DenseNets or dilated convolutions (see the sketch after this list)

  • Sometimes we need to generate text. We can use “deconvolutions” to generate arbitrarily long outputs.
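As promised above, here is a minimal sketch of the dilated-convolution idea in PyTorch (the framework, channel counts, and layer counts are my own placeholder choices, not the post's): the dilation doubles at each layer, so the receptive field grows exponentially with depth while the layer count stays small.

```python
import torch
import torch.nn as nn

class DilatedConvEncoder(nn.Module):
    """Stack of 1-D convolutions with exponentially growing dilation.

    With kernel size 3 and dilations 1, 2, 4, 8, the receptive field is
    1 + 2*(1 + 2 + 4 + 8) = 31 steps -- wide coverage from just 4 layers.
    """
    def __init__(self, channels=64, num_layers=4, kernel_size=3):
        super().__init__()
        layers = []
        for i in range(num_layers):
            dilation = 2 ** i
            layers.append(nn.Conv1d(
                channels, channels, kernel_size,
                padding=dilation * (kernel_size - 1) // 2,  # keep seq length fixed
                dilation=dilation))
            layers.append(nn.ReLU())
        self.net = nn.Sequential(*layers)

    def forward(self, x):  # x: (batch, channels, seq_len)
        return self.net(x)

x = torch.randn(8, 64, 100)           # batch of 8 sequences, 100 steps each
print(DilatedConvEncoder()(x).shape)  # torch.Size([8, 64, 100])
```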

Intro

Over the last three years, the field of NLP has gone through a huge revolution thanks to deep learning. The leader of this revolution has been the recurrent neural network, particularly in its manifestation as an LSTM. Concurrently, the field of computer vision has been reshaped by convolutional neural networks. This post explores what we “text people” can learn from our friends who are doing vision.

Common NLP Tasks

To set the stage and agree on a vocabulary, I’d like to introduce a few of the more common tasks in NLP. For the sake of consistency, I’ll assume that all of our model’s inputs are characters and that our “unit of observation” is a sentence. Both of these assumptions are just for the sake of convenience and you can replace characters with words and sentences with documents if you so wish.

Classification

Perhaps the oldest trick in the book is sentence classification. For example, we might want to classify an email subject as indicative of spam, guess the sentiment of a product review, or assign a topic to a document.

The straightforward way to handle this kind of task with an RNN is to feed the entire sentence into it, character by character, and then observe the RNN’s final hidden state.
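As a concrete illustration, here is a minimal character-level classifier in PyTorch (the framework, vocabulary size, and dimensions are my own placeholder choices): the LSTM reads the characters one by one, and only its final hidden state feeds the classifier.

```python
import torch
import torch.nn as nn

class CharLSTMClassifier(nn.Module):
    """Reads a sentence character by character and classifies it
    from the LSTM's final hidden state."""
    def __init__(self, vocab_size=128, embed_dim=32, hidden_dim=64, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.classify = nn.Linear(hidden_dim, num_classes)

    def forward(self, chars):            # chars: (batch, seq_len) int ids
        emb = self.embed(chars)          # (batch, seq_len, embed_dim)
        _, (h_n, _) = self.lstm(emb)     # h_n: (1, batch, hidden_dim)
        return self.classify(h_n[-1])    # logits from the final hidden state

model = CharLSTMClassifier()
sentence = torch.tensor([[ord(c) for c in "free viagra now"]])  # toy char encoding
print(model(sentence).shape)  # torch.Size([1, 2]) -- e.g. spam / not-spam logits
```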

Sequence Labeling

Sequence labeling tasks are tasks that return an output for each input. Examples include part-of-speech tagging and named-entity recognition. While the bare-bones LSTM model is far from the state of the art, it is easy to implement and offers compelling results. See this paper for a more fleshed-out architecture.


An example tagger with a bidirectional LSTM. Source
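To make the figure concrete, here is a bare-bones bidirectional tagger sketch in PyTorch (toy dimensions assumed; this is not the linked paper’s architecture): it emits one tag distribution per input position.

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Emits one label per input step, as in the figure above."""
    def __init__(self, vocab_size=128, embed_dim=32, hidden_dim=64, num_tags=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.bilstm = nn.LSTM(embed_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        self.tag = nn.Linear(2 * hidden_dim, num_tags)  # concat of both directions

    def forward(self, chars):                    # chars: (batch, seq_len)
        out, _ = self.bilstm(self.embed(chars))  # (batch, seq_len, 2*hidden_dim)
        return self.tag(out)                     # (batch, seq_len, num_tags)

tagger = BiLSTMTagger()
x = torch.randint(0, 128, (4, 20))  # 4 sequences of 20 characters
print(tagger(x).shape)              # torch.Size([4, 20, 10]) -- one tag per step
```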

Sequence Generation

Arguably the most impressive results in recent NLP have been in translation. Translation is a mapping of one sequence to another, with no guarantees on the length of the output sentence. For example, the single Hebrew word that opens the Bible, בראשית, translates to the three-word English phrase “In the beginning”.
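The original post develops convolutional generation in its later sections; as a preview of the “deconvolution” bullet from the tl;dr, here is a minimal PyTorch sketch in which stride-2 transposed convolutions let a decoder emit a sequence longer than its encoded input (all shapes here are illustrative assumptions).

```python
import torch
import torch.nn as nn

# A transposed ("de") convolution with stride 2 roughly doubles the sequence
# length, so stacking a few lets a decoder expand a compact encoded sentence
# into an output whose length differs from the input's.
upsample = nn.Sequential(
    nn.ConvTranspose1d(64, 64, kernel_size=4, stride=2, padding=1),
    nn.ReLU(),
    nn.ConvTranspose1d(64, 64, kernel_size=4, stride=2, padding=1),
)

encoded = torch.randn(1, 64, 10)  # e.g. an encoded source sentence, 10 steps
print(upsample(encoded).shape)    # torch.Size([1, 64, 40]) -- 4x longer
```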

Link:

https://medium.com/@TalPerry/convolutional-methods-for-text-d5260fd5675f

Original link:

http://weibo.com/1402400261/F4nWcmOMi?type=comment#_rnd1495539504869
