Papers on Text Matching

Table of Contents

  • Preface
  • Traditional Methods
  • Deep Text Matching
    • DSSM
    • CDSSM
    • ARC II
    • CNTN
    • LSTM-RNN
    • MV-LSTM
    • MatchPyramid
    • Match-SRNN
    • KNRM
    • Conv-KNRM
    • DRMM
    • Siamese-LSTM
    • DAM
    • ESIM
    • DUET
    • BiMPM
    • DIIN
    • DRCN
    • RE2
    • DUA
    • BERT
    • to be continued

Preface

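Text matching is widely used in information retrieval, question answering, and dialogue systems; each of these tasks can be abstracted as a matching problem between a query and candidate docs. I picked up the relevant models and methods piecemeal at work, but I still think it is well worth organizing them systematically.
Overall pipeline of a web search engine:
[Figure 1: overall pipeline of a web search engine]
The part marked in red is where text matching sits; it is arguably the core of the whole search engine.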

Traditional Methods

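Traditional methods rely mainly on hand-crafted features, so the focus is on how to choose a suitable learning algorithm that learns the best matching model from those features.
Common methods include BM25, TF-IDF, partial least squares (PLS), Regularized Mapping to Latent Structures (RMLS), Supervised Semantic Indexing (SSI), the Bilingual Topic Model (BLTM), and statistical machine translation (SMT).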
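To make the first item on that list concrete, here is a minimal sketch of Okapi BM25 scoring in plain Python. It is an illustration under stated assumptions rather than a reference implementation: the function name `bm25_scores`, the defaults `k1=1.5` and `b=0.75`, the smoothed IDF variant, and the toy corpus are all my own choices.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized doc against a tokenized query with Okapi BM25.

    k1 and b are common default values, assumed here for illustration.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of docs that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF; the "+ 1.0" inside the log keeps it non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy corpus (made up for this example).
docs = [
    "deep learning for text matching".split(),
    "web search engine ranking".split(),
    "cooking recipes for pasta".split(),
]
query = "text matching".split()
scores = bm25_scores(query, docs)
```

Only the first doc shares terms with the query, so it is the only one with a positive score. A real system would add proper tokenization and an inverted index instead of scanning every doc.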

Deep Text Matching

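Compared with traditional machine learning methods, deep learning methods improve on four fronts:

  1. Neural networks yield richer semantic representations;
  2. Neural networks make it possible to build more powerful text matching models;
  3. Representations and the matching function are learned end to end;
  4. Multimodal matching: a common semantic space can be learned to represent data of different modalities uniformly.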

DSSM

Learning Deep Structured Semantic Models for Web Search using Clickthrough Data

CDSSM

A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval

ARC II

Convolutional Neural Network Architectures for Matching Natural Language Sentences

CNTN

Convolutional Neural Tensor Network Architecture for Community-based Question Answering

LSTM-RNN

Deep Sentence Embedding Using the Long Short Term Memory Network: Analysis and Application to Information Retrieval

MV-LSTM

A Deep Architecture for Semantic Matching with Multiple Positional Sentence Representations

MatchPyramid

Text Matching as Image Recognition

Match-SRNN

Match-SRNN: Modeling the Recursive Matching Structure with Spatial RNN

KNRM

End-to-End Neural Ad-hoc Ranking with Kernel Pooling

Conv-KNRM

Convolutional Neural Networks for Soft-Matching N-Grams in Ad-hoc Search

DRMM

A Deep Relevance Matching Model for Ad-hoc Retrieval

Siamese-LSTM

Siamese Recurrent Architectures for Learning Sentence Similarity

DAM

A Decomposable Attention Model for Natural Language Inference

ESIM

Enhanced LSTM for Natural Language Inference

DUET

Learning to Match using Local and Distributed Representations of Text for Web Search

BiMPM

Bilateral Multi-Perspective Matching for Natural Language Sentences

DIIN

Natural Language Inference over Interaction Space

DRCN

Semantic Sentence Matching with Densely-connected Recurrent and Co-attentive Information

RE2

Simple and Effective Text Matching with Richer Alignment Features

DUA

Modeling Multi-turn Conversation with Deep Utterance Aggregation

BERT

Matching as one of the basic tasks: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
An approach to long-text matching: Simple Applications of BERT for Ad Hoc Document Retrieval
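As I understand the long-text approach, a document that is too long for BERT is scored sentence by sentence, and the top sentence scores are interpolated with a document-level retrieval score. The sketch below only illustrates that aggregation step. Everything in it is an assumption for illustration: `overlap_score` is a hypothetical stand-in for a fine-tuned BERT relevance scorer, and the names `score_long_document`, `weights`, and `alpha`, along with their default values, are made up.

```python
def overlap_score(query, sentence):
    """Hypothetical stand-in for a BERT relevance scorer.

    Returns the fraction of query tokens that appear in the sentence.
    """
    q = set(query.lower().split())
    s = set(sentence.lower().split())
    return len(q & s) / len(q) if q else 0.0


def score_long_document(query, sentences, doc_score,
                        weights=(1.0, 0.5), alpha=0.5, scorer=overlap_score):
    """Interpolate a document-level score with the top sentence-level scores.

    final = alpha * doc_score + (1 - alpha) * sum_i(weights[i] * top_i)
    where top_i is the i-th highest sentence score.
    """
    sentence_scores = [scorer(query, s) for s in sentences]
    # Keep as many top-scoring sentences as there are weights.
    top = sorted(sentence_scores, reverse=True)[:len(weights)]
    sentence_part = sum(w * s for w, s in zip(weights, top))
    return alpha * doc_score + (1 - alpha) * sentence_part


# Toy inputs (made up); doc_score would come from a first-stage ranker such as BM25.
sentences = [
    "Text matching is central to search",
    "The weather is nice",
    "Matching models rank documents",
]
final_score = score_long_document("text matching", sentences, doc_score=2.0)
```

In practice the interpolation weight and the per-sentence weights would be tuned on held-out data, and the scorer would be a BERT model fine-tuned on query-sentence pairs.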

to be continued
