Re33: Paper Reading: Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Languag


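Paper title: Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing
Official ACM Computing Surveys download link:

Official site: Pretrain Language Models
(I won't include screenshots of the meta analysis part of the original paper)

I recently noticed that ACM Computing Surveys accepted this survey in 2022. The published version has fewer pages than the arXiv version, so I kept using the arXiv version for these notes.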



  • 1. What is prompt-based learning
    • 1.1 Prompt Addition
    • 1.2 Answer Search
    • 1.3 Answer Mapping
  • 2. The shift in NLP learning paradigms
  • 3. Design Considerations for Prompting
    • 3.1 Pre-trained Model Choice
    • 3.2 Prompt Engineering
    • 3.3 Answer Engineering
    • 3.4 Multi-Prompt Learning
    • 3.5 Training Strategies for Prompting Methods / Prompt-based Training Strategies
  • 4. Applications
  • 5. Prompt-relevant Topics
  • 6. Challenges
  • 7. Other online resources used while writing this post

1. What is prompt-based learning

Traditional supervised learning predicts an output $y$ from an input $x$ by modeling the probability $P(y \mid x;\theta)$
($\theta$ denotes the model parameters; $y$ comes from a label set)

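Prompt-based learning instead models the probability of text directly with a pre-trained language model:
Input: $x$
Template / prompting function: has two slots, one for the input and one for the output
The template turns $x$ into a textual string prompt $x'$ ($x$ is filled into the template, and $x'$ still contains unfilled slots)
The language model fills the unfilled slots according to probability, giving the final string $\hat{x}$
The final output $y$ is derived from $\hat{x}$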


  1. Tweet: I missed the bus today.
    To predict its sentiment, append: I felt so ____
  2. For translation, $x'$ is: English: I missed the bus today. French: _____
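
The two examples above can be reproduced with a few lines of Python. This is only a minimal sketch: the slot markers `[X]` (input) and `[Z]` (answer) and the helper name `fill_template` are assumptions made for illustration, not something these notes define.

```python
def fill_template(template: str, x: str) -> str:
    """Fill the input slot [X]; the answer slot [Z] stays unfilled."""
    return template.replace("[X]", x)


tweet = "I missed the bus today."

# Sentiment example: the answer slot sits at the end of the prompt.
sentiment_prompt = fill_template("[X] I felt so [Z]", tweet)

# Translation example: same input, different template.
translation_prompt = fill_template("English: [X] French: [Z]", tweet)

print(sentiment_prompt)
print(translation_prompt)
```

The resulting strings play the role of $x'$: the input is in place, and the language model still has to fill `[Z]`.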



1.1 Prompt Addition

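  1. A prompt whose slot is in the middle of the template is a cloze prompt; one whose slot is at the end is a prefix prompt
  2. The template need not consist of natural-language tokens: it can use virtual words (which can also be embedded as continuous vectors) or be continuous vectors directly
  3. The number of slots is not fixed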
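
As a rough illustration of the cloze/prefix distinction in item 1, the sketch below classifies a template by where its answer slot sits. The `[Z]` marker and the function name `prompt_shape` are hypothetical, and the rule is a simplification that only looks at the position of the slot.

```python
def prompt_shape(template: str, answer_slot: str = "[Z]") -> str:
    """Return "prefix" if the answer slot ends the template, else "cloze"."""
    if answer_slot not in template:
        raise ValueError("template has no answer slot")
    if template.rstrip().endswith(answer_slot):
        return "prefix"
    return "cloze"


print(prompt_shape("[X] I felt so [Z]"))                 # slot at the end
print(prompt_shape("[X] Overall, it was a [Z] movie."))  # slot in the middle
```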

1.2 Answer Search

Find the highest-scoring answer $\hat{z}$

$\mathcal{Z}$: the set of values that $z$ can take


Search by argmax or by sampling
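
The two search strategies can be sketched on a toy answer space. The scores below are invented for illustration; in practice they would come from the language model scoring each candidate that fills the slot.

```python
import random


def argmax_search(scores: dict) -> str:
    """Pick the candidate answer with the highest score."""
    return max(scores, key=scores.get)


def sample_answer(scores: dict, rng: random.Random) -> str:
    """Draw a candidate at random, weighted by its score."""
    answers = list(scores)
    weights = [scores[a] for a in answers]
    return rng.choices(answers, weights=weights, k=1)[0]


# Hypothetical LM scores for filling "I missed the bus today. I felt so [Z]".
toy_scores = {"happy": 0.05, "fine": 0.15, "bad": 0.80}

z_hat = argmax_search(toy_scores)
z_sampled = sample_answer(toy_scores, random.Random(0))
print(z_hat, z_sampled)
```

Argmax is deterministic, while sampling can return any candidate with a nonzero score.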

1.3 Answer Mapping

Map $\hat{z}$ to $\hat{y}$
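
A minimal sketch of this mapping step, assuming a hand-written table from answer words to sentiment labels. The table contents and the name `map_answer` are illustrative assumptions.

```python
# Hypothetical mapping: several answers z may share one output label y.
ANSWER_TO_LABEL = {
    "great": "positive",
    "good": "positive",
    "bad": "negative",
    "terrible": "negative",
}


def map_answer(z_hat: str):
    """Map the selected answer to an output label, or None if unknown."""
    return ANSWER_TO_LABEL.get(z_hat.lower())


print(map_answer("terrible"))
print(map_answer("bus"))
```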

2. The shift in NLP learning paradigms

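  1. Fully supervised learning: the traditional machine learning paradigm
    To give the model a suitable inductive bias, early NLP models relied on feature engineering; once neural networks arrived, they relied on architecture engineering.
  2. pre-train and fine-tune
    Relies on objective engineering.
    Not conducive to exploring model architectures: 1. Unsupervised pre-training narrows the range of structural priors to choose from. 2. Pre-training in order to test different architectures is too expensive.
  3. pre-train, prompt, and predict
    Relies on prompt engineering.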


3. Design Considerations for Prompting


3.1 Pre-trained Model Choice


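The choice of training objective depends on how well it fits the particular prompting task: for example, left-to-right AR LMs suit prefix prompts, while reconstruction objectives suit cloze prompts. Standard LM and FTR (full text reconstruction) objectives are better suited to text generation tasks.

Prefix LM and encoder-decoder architectures are a natural fit for text generation tasks, but they can also be adapted to other tasks by modifying the prompt.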

3.2 Prompt Engineering

prompt template engineering → first choose the prompt shape, then decide between a manual and an automated approach

  1. Prompt Shape
    cloze prompts VS. prefix prompts
  2. Manual Template Engineering
  3. Automated Template Learning
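    1. discrete prompts / hard prompts: text (this part always reminds me of traditional NLG methods built on templates/rules; the paper's reference list really does include Re3Sum1, though it doesn't seem to be cited in the body)
      1. Prompt Mining: mine prompts from a corpus
      2. Prompt Paraphrasing: paraphrase an existing seed prompt
      3. Gradient-based Search
      4. Prompt Generation: treat it directly as a text generation task
      5. Prompt Scoring
    2. continuous prompts / soft prompts: vectors in the LM's embedding space
      1. Prefix Tuning
        $M_\phi$: trainable prefix matrix
        $\theta$: fixed pre-trained LM parameters
        If the time step is within the prefix, the value is copied directly from $M_\phi$; otherwise it is computed by the pre-trained model.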
      2. Tuning Initialized with Discrete Prompts
      3. Hard-Soft Prompt Hybrid Tuning
    3. static
    4. dynamic
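
The Prefix Tuning rule noted above (copy from $M_\phi$ inside the prefix, otherwise compute with the frozen LM) can be sketched in plain Python. This is a toy under stated assumptions: `toy_lm_step` is a made-up stand-in for the frozen model, and activations are short lists rather than real hidden states.

```python
def run_with_prefix(prefix_matrix, num_steps, lm_step):
    """Return one activation per time step.

    Steps inside the prefix copy a row of the trainable matrix M_phi;
    later steps are computed by the frozen LM from the history so far.
    """
    history = []
    for i in range(num_steps):
        if i < len(prefix_matrix):
            h = list(prefix_matrix[i])  # copied from M_phi
        else:
            h = lm_step(history)        # computed with fixed theta
        history.append(h)
    return history


def toy_lm_step(history):
    """Hypothetical frozen LM: average of all previous activations."""
    dim = len(history[0])
    return [sum(h[d] for h in history) / len(history) for d in range(dim)]


M_phi = [[1.0, 0.0], [0.0, 1.0]]  # prefix of length 2 (toy values)
states = run_with_prefix(M_phi, num_steps=3, lm_step=toy_lm_step)
print(states)
```

Only `M_phi` would receive gradient updates in training; the stand-in LM has no trainable state here, which mirrors the fixed $\theta$.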

3.3 Answer Engineering

Covers the design of $\mathcal{Z}$ and of the mapping function

  1. answer shape: granularity
    1. tokens
    2. spans: often used with cloze prompts
    3. sentence: often used with prefix prompts
  2. answer design method
    1. Manual Design
      1. Unconstrained Spaces: all possible fill-ins; the answer $z$ is usually mapped directly to $y$
      2. Constrained Spaces
    2. Discrete Answer Search
      1. Answer Paraphrasing: start from an initial answer space $\mathcal{Z}'$ (I didn't follow the rest)
      2. Prune-then-Search
        $y \to z$: verbalizer (I didn't follow the rest)
      3. Label Decomposition: relation extraction
        The probability of an answer span is the sum of the probabilities of its tokens
    3. Continuous Answer Search: omitted
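
The Label Decomposition note above (span probability as the sum of token probabilities) can be illustrated as follows. The abbreviation table, the stopword list, and the token probabilities are all invented for this sketch; a real system would read the probabilities from the language model.

```python
import re

# Hypothetical helpers for turning a relation label into constituent words.
ABBREVIATIONS = {"per": "person", "org": "organization"}
STOPWORDS = {"of", "the"}


def decompose_label(label: str) -> list:
    """Split a relation label into words, expanding known abbreviations."""
    words = re.split(r"[^a-z]+", label.lower())
    return [ABBREVIATIONS.get(w, w) for w in words if w and w not in STOPWORDS]


def span_score(words: list, token_probs: dict) -> float:
    """Score an answer span as the sum of its tokens' probabilities."""
    return sum(token_probs[w] for w in words)


label_words = decompose_label("per:city_of_death")
toy_probs = {"person": 0.25, "city": 0.125, "death": 0.5}  # made-up values
score = span_score(label_words, toy_probs)
print(label_words, score)
```

Because it is a sum, the value works as a ranking score across candidate labels and is not itself a normalized probability.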

3.4 Multi-Prompt Learning


  1. Prompt Ensembling: continuous prompts may be learned from different initializations or random seeds
    1. Uniform averaging
    2. Weighted averaging
    3. Majority voting
    4. Knowledge distillation
    5. Prompt ensembling for text generation: ensemble token by token
  2. Prompt Augmentation / demonstration learning: details omitted
    Provide answered prompts as analogies (the model learns the repeated pattern)
    1. Sample Selection
    2. Sample Ordering
  3. Prompt Composition
  4. Prompt Decomposition
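
The first three ensembling schemes listed under Prompt Ensembling can be sketched on toy label distributions, one per prompt. The distributions and weights below are invented for illustration; how weights would be chosen is outside this sketch.

```python
from collections import Counter


def uniform_average(dists):
    """Average label probabilities over prompts with equal weight."""
    return {l: sum(d[l] for d in dists) / len(dists) for l in dists[0]}


def weighted_average(dists, weights):
    """Average label probabilities with one weight per prompt."""
    total = sum(weights)
    return {
        l: sum(w * d[l] for d, w in zip(dists, weights)) / total
        for l in dists[0]
    }


def majority_vote(dists):
    """Each prompt votes for its top label; the most common label wins."""
    votes = Counter(max(d, key=d.get) for d in dists)
    return votes.most_common(1)[0][0]


# Hypothetical outputs of three different prompts on the same input.
dists = [
    {"positive": 0.75, "negative": 0.25},
    {"positive": 0.25, "negative": 0.75},
    {"positive": 0.625, "negative": 0.375},
]

avg = uniform_average(dists)
wavg = weighted_average(dists, weights=[1, 2, 1])
vote = majority_vote(dists)
print(avg, wavg, vote)
```

With these numbers the uniform average and the vote favor "positive", while doubling the weight of the second prompt flips the weighted average to "negative", which shows how much the weighting choice can matter.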

3.5 Training Strategies for Prompting Methods / Prompt-based Training Strategies

  1. Training Settings
    No training: zero-shot setting (not truly zero-shot; details omitted)
    full-data learning
    few-shot learning
  2. Parameter Update Methods
    1. Promptless Fine-tuning:pre-train and fine-tune strategy
    2. Tuning-free Prompting
      The input can be augmented with answered prompts: in-context learning
    3. Fixed-LM Prompt Tuning: drawbacks omitted
    4. Fixed-prompt LM Tuning
      null prompt
    5. Prompt+LM Tuning: pros and cons omitted
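
For Tuning-free Prompting, augmenting the input with answered prompts (in-context learning) amounts to string construction with no parameter update. The sketch below assumes `[X]`/`[Z]` slot markers and a capital-city toy task; both are illustrative choices, not taken from these notes.

```python
def build_in_context_prompt(template: str, demos: list, query: str) -> str:
    """Prepend answered prompts, then add the query with [Z] left open."""
    lines = [template.replace("[X]", x).replace("[Z]", z) for x, z in demos]
    lines.append(template.replace("[X]", query))
    return "\n".join(lines)


template = "[X]'s capital is [Z]."
demos = [("Great Britain", "London"), ("Japan", "Tokyo")]

prompt = build_in_context_prompt(template, demos, "China")
print(prompt)
```

The frozen LM is then asked to fill the remaining `[Z]`. Which demonstrations to pick and in what order they appear correspond to the Sample Selection and Sample Ordering items in section 3.4.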

4. Applications



  1. Knowledge Probing
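    1. Factual Probing / fact retrieval: measures how much factual knowledge the pre-trained model's representations contain; the focus is on learning the template
    2. Linguistic Probing
  2. Classification-based Tasks: e.g. implemented in the form of slot filling
    1. Text Classification: commonly uses cloze prompts, prompt engineering + answer engineering, few-shot, fixed-prompt LM Tuning
    2. Natural Language Inference (NLI): commonly uses cloze prompts; prompt engineering focuses on template search in few-shot settings. Answer spaces are usually hand-picked from the vocabulary in advance.
  3. Information Extraction: details omitted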
    1. Relation Extraction
    2. Semantic Parsing
    3. Named Entity Recognition (NER)
  4. “Reasoning” in NLP: details omitted
    1. Commonsense Reasoning
    2. Mathematical Reasoning
  5. Question Answering
    extractive QA
    multiple-choice QA
    free-form QA
  6. Text Generation: other details omitted
    prefix prompts + AR pre-trained language models: text summarization, machine translation
    in-context learning
    fixed-LM prompt tuning:data-to-text generation
  7. Automatic Evaluation of Text Generation: modeled as a text generation task (a bit of a matryoshka doll, isn't it)
  8. Multi-modal Learning
  9. Meta-Applications
    1. Domain Adaptation (this looks a bit like text style transfer to me, so presumably there is also prompt-based work on text style transfer?)
    2. Debiasing
    3. Dataset Construction


5. Prompt-relevant Topics


  1. Ensemble Learning VS. prompt ensembling
  2. Few-shot Learning
    Prompt augmentation / priming-based few-shot learning
  3. Larger-context Learning
  4. Query Reformulation
  5. QA-based Task Formulation
  6. Controlled Generation
  7. Supervised Attention
  8. Data Augmentation

6. Challenges

  1. Prompt Design
    1. Tasks beyond Classification and Generation
    2. Prompting with Structured Information
    3. Entanglement of Template and Answer
  2. Answer Engineering
    1. Many-class and Long-answer Classification Tasks
    2. Multiple Answers for Generation Tasks
  3. Selection of Tuning Strategy
  4. Multiple Prompt Learning
    1. Prompt Ensembling
    2. Prompt Composition and Decomposition
    3. Prompt Augmentation
    4. Prompt Sharing
  5. Selection of Pre-trained Models
  6. Theoretical and Empirical Analysis of Prompting
  7. Transferability of Prompts
  8. Combination of Different Paradigms
  9. Calibration of Prompting Methods

