Paper Study - BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Study notes on the BERT paper

References

  • A good site for learning:
    https://lena-voita.github.io/nlp_course.html#whats_inside_lectures
  • Online video lecture by 李沐 (Mu Li):
    https://www.youtube.com/watch?v=ULD3uIb2MHQ

BERT: Bidirectional Encoder Representations from Transformers.

The authors argue that existing approaches restrict the power of the pre-trained representations because they are unidirectional (OpenAI GPT uses a left-to-right architecture, so in the attention layers each token can only attend to the tokens before it). For sentence-level tasks this restriction is sub-optimal, and for tasks such as question answering it can even be harmful, because the semantics of a sentence need to be processed from both directions.
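To make the left-to-right restriction concrete, here is a minimal sketch (my own illustration in PyTorch, not from the paper) of the attention masks involved: a GPT-style decoder applies a causal mask so each position sees only earlier positions, whereas BERT's encoder applies no such mask, so every position can attend to context on both sides.

```python
import torch

seq_len = 5

# Causal (left-to-right) mask used by GPT-style decoders:
# position i may attend only to positions j <= i.
causal_mask = torch.tril(torch.ones(seq_len, seq_len))

# BERT's encoder uses no causal mask: every position may attend to
# every other position, so context from both sides is visible.
bidirectional_mask = torch.ones(seq_len, seq_len)

print(causal_mask)
# tensor([[1., 0., 0., 0., 0.],
#         [1., 1., 0., 0., 0.],
#         [1., 1., 1., 0., 0.],
#         [1., 1., 1., 1., 0.],
#         [1., 1., 1., 1., 1.]])
```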

BERT alleviates the previously mentioned unidirectionality constraint by using a “masked language model” (MLM) pre-training objective, inspired by the Cloze task (Taylor, 1953). The masked language model randomly masks some of the tokens from the input, and the objective is to predict the original vocabulary id of the masked word based only on its context. Unlike left-to-right language model pre-training, the MLM objective enables the representation to fuse the left and the right context, which allows us to pretrain a deep bidirectional Transformer. In addition to the masked language model, we also use a “next sentence prediction” task that jointly pretrains text-pair representations.
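The paper masks 15% of the input tokens; of the selected tokens, 80% are replaced with [MASK], 10% with a random token, and 10% are left unchanged. The sketch below illustrates that corruption procedure on a token list; the function name and return format are my own simplification, not the authors' code.

```python
import random

def mask_tokens(tokens, vocab, mask_prob=0.15, mask_token="[MASK]", seed=None):
    """BERT-style MLM corruption: pick ~15% of tokens as prediction targets;
    replace each picked token with [MASK] (80%), a random token (10%),
    or keep it unchanged (10%). Returns (corrupted tokens, labels)."""
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)            # the model must predict the original token
            r = rng.random()
            if r < 0.8:
                corrupted.append(mask_token)
            elif r < 0.9:
                corrupted.append(rng.choice(vocab))
            else:
                corrupted.append(tok)
        else:
            labels.append(None)           # not a prediction target
            corrupted.append(tok)
    return corrupted, labels

tokens = "my dog is hairy".split()
vocab = ["my", "dog", "is", "hairy", "cat", "apple"]
print(mask_tokens(tokens, vocab, seed=0))
```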


Supplementary notes

How to apply pre-trained representations

There are two existing strategies for applying pre-trained language representations to downstream tasks: feature-based and fine-tuning.

  • The feature-based approach, such as ELMo (Peters et al., 2018a), uses task-specific architectures that include the pre-trained representations as additional features.
  • The fine-tuning approach, such as the Generative Pre-trained Transformer (OpenAI GPT) (Radford et al., 2018), introduces minimal task-specific parameters, and is trained on the downstream tasks by simply fine-tuning all pretrained parameters.

The two approaches share the same objective function during pre-training, where they use unidirectional language models to learn general language representations.
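As a rough illustration of the two strategies, here is a minimal sketch assuming the Hugging Face transformers library and a binary classification task; the model name, head, and hyperparameters are illustrative choices, not prescribed by the paper.

```python
import torch
from transformers import AutoModel, AutoModelForSequenceClassification

# Feature-based (ELMo-style): freeze the pre-trained encoder and feed its
# hidden states as additional features into a task-specific model.
encoder = AutoModel.from_pretrained("bert-base-uncased")
for p in encoder.parameters():
    p.requires_grad = False               # representations are used as fixed features

task_head = torch.nn.Linear(encoder.config.hidden_size, 2)  # task-specific architecture

# Fine-tuning (GPT/BERT-style): add a minimal classification head and
# update all parameters, pre-trained ones included, on the downstream task.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # every parameter is trainable
```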
