Event Extraction Literature Notes (2020-2021)

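Preface

I previously worked on event extraction (a subfield of information extraction, which is itself a small area of NLP) and collected notes on the literature along the way.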

Event Extraction Literature Notes (2020-2021)
Event Extraction Literature Notes (2019)
Event Extraction Literature Notes (2018)
Event Extraction Literature Notes (2008-2017)

Model overview
Figures (not reproduced here) from: A Compact Survey on Event Extraction: Approaches and Applications

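While reading, I also went through this write-up: "NLP 事件抽取综述(中): 模型篇" (a survey of NLP event extraction, part 2: models).

A $ after a model name means that code has been released.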



Papers

2021

Gen-arg $

Document-Level Event Argument Extraction by Conditional Generation (aclanthology.org)


Uses BART. After reading the official source code, my impression is that it is incomplete.



BRAD

Event Extraction from Historical Texts: A New Dataset for Black Rebellions (aclanthology.org)
No official source code.
The main contribution is a new dataset (the paper gives no public link to it), built from "a corpus of nineteenth-century African American newspapers."
"Our dataset features 5 entity types, 12 event types, and 6 argument roles that concern slavery and black movements between the eighteenth and nineteenth centuries."


TEXT2EVENT $

Paper: https://aclanthology.org/2021.acl-long.217.pdf
Code: luyaojie/Text2Event (github.com)
A possible reference for how to use the schema to constrain the decoding process.
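
To make "using the schema to constrain decoding" concrete, here is a minimal sketch of trie-based constrained greedy decoding. It is a generic illustration, not the Text2Event code: the `<end>` marker, the toy schema, and `score_fn` (a stand-in for the model's next-token scores) are all assumptions made for this example.

```python
from typing import Callable, Iterable, List, Sequence

END = "<end>"  # hypothetical end-of-label marker, not from the paper


def build_trie(labels: Iterable[Sequence[str]]) -> dict:
    """Build a prefix tree over tokenized schema labels."""
    root: dict = {}
    for tokens in labels:
        node = root
        for tok in tokens:
            node = node.setdefault(tok, {})
        node[END] = {}  # mark that a complete label ends here
    return root


def allowed_next(trie: dict, prefix: Sequence[str]) -> List[str]:
    """Tokens the decoder may emit after `prefix` (empty if prefix is off-schema)."""
    node = trie
    for tok in prefix:
        if tok not in node:
            return []
        node = node[tok]
    return sorted(node.keys())


def constrained_greedy(trie: dict, score_fn: Callable[[List[str], str], float]) -> List[str]:
    """Greedy decoding where each step may only pick a token allowed by the trie."""
    prefix: List[str] = []
    while True:
        candidates = allowed_next(trie, prefix)
        best = max(candidates, key=lambda tok: score_fn(prefix, tok))
        if best == END:
            return prefix
        prefix.append(best)


# Toy schema: event types split into word tokens (illustrative only).
schema = [("transport",), ("transfer", "money"), ("transfer", "ownership")]
trie = build_trie(schema)

# Stand-in for model scores: a fixed preference table.
toy_scores = {"transfer": 2.0, "ownership": 3.0, END: 1.0}
decoded = constrained_greedy(trie, lambda prefix, tok: toy_scores.get(tok, 0.0))
print(decoded)
```

Whatever the scores are, the decoder can only return a label that exists in the schema, which is the point of the constraint.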


CasEE $

CasEE: A Joint Learning Framework with Cascade Decoding for Overlapping Event Extraction (aclanthology.org)
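Code: JiaweiSheng/CasEE: Source code for ACL 2021 finding paper: CasEE: A Joint Learning Framework with Cascade Decoding for Overlapping Event Extraction (github.com)
Targets Chinese.
I tried setting up the environment and ran into no problems; the code runs.
I also skimmed the code once.

This paper is modeled on CasRel (arxiv.org), a method for relational triple extraction, and transfers that paradigm to event extraction.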

CasEE architecture: uses CLN (conditional layer normalization) and MSA (multi-head self-attention).

Spans are decoded with two pointers (start position, end position). The drawback is that the thresholds have to be set by hand: "We select tokens with $\hat{t}^{sc}_i > \xi_2$ as the start positions, and those with $\hat{t}^{ec}_i > \xi_3$ as end positions, where $\xi_2, \xi_3 \in [0, 1]$ are scalar thresholds."
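
A minimal sketch of threshold-based start/end pointer decoding, to show why the hand-set thresholds matter. Pairing each start with the nearest end at or after it is an assumption borrowed from CasRel-style decoders, and `decode_spans` and the toy probabilities are made up for this example; the CasEE repository may pair positions differently.

```python
from typing import List, Sequence, Tuple


def decode_spans(
    start_probs: Sequence[float],
    end_probs: Sequence[float],
    start_thr: float = 0.5,
    end_thr: float = 0.5,
) -> List[Tuple[int, int]]:
    """Return (start, end) token spans, both inclusive."""
    starts = [i for i, p in enumerate(start_probs) if p > start_thr]
    ends = [i for i, p in enumerate(end_probs) if p > end_thr]
    spans = []
    for s in starts:
        # Assumed heuristic: take the nearest end position at or after the start.
        following = [e for e in ends if e >= s]
        if following:
            spans.append((s, following[0]))
    return spans


start_probs = [0.1, 0.9, 0.2, 0.7, 0.1]
end_probs = [0.0, 0.2, 0.8, 0.6, 0.3]

print(decode_spans(start_probs, end_probs))               # both thresholds 0.5
print(decode_spans(start_probs, end_probs, end_thr=0.7))  # stricter end threshold
```

Raising the end threshold from 0.5 to 0.7 silently drops the second span, which is why these values usually need tuning on a development set.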

During argument classification there is also a type_soft_constrain operation:

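# b = batch, t = tokens, l = labels (presumably the argument roles)
p_s = torch.sigmoid(self.head_cls(inp))  # start-position probabilities, [b, t, l]
p_e = torch.sigmoid(self.tail_cls(inp))  # end-position probabilities, same shape

# Gate in (0, 1) per label, computed from the event-type embedding
type_soft_constrain = torch.sigmoid(self.gate_linear(type_emb))  # [b, l]
# Broadcast over the token dimension: [b, l] -> [b, 1, l] -> [b, t, l]
type_soft_constrain = type_soft_constrain.unsqueeze(1).expand_as(p_s)
# Scale down labels that the gate considers unlikely for this event type
p_s = p_s * type_soft_constrain
p_e = p_e * type_soft_constrain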

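Different parts of the model get different learning rates. For get_cosine_schedule_with_warmup, see the example in this post: 情感分析bert家族 pytorch实现(ing) (sentiment analysis with the BERT family, PyTorch implementation, in progress). Note that the CasEE excerpt itself calls get_linear_schedule_with_warmup.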

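# AdamW here is the transformers implementation (it accepts correct_bias);
# newer transformers releases deprecate it in favor of torch.optim.AdamW.
from transformers import AdamW, get_linear_schedule_with_warmup

def set_learning_setting(self, config, train_loader, dev_loader, model):
    instances_num = len(train_loader.dataset)
    # Total number of optimizer steps over all epochs
    train_steps = int(instances_num * config.epochs_num / config.batch_size) + 1

    print("Batch size: ", config.batch_size)
    print("The number of training instances:", instances_num)
    print("The number of evaluating instances:", len(dev_loader.dataset))

    # Separate the BERT encoder's parameters from the task-specific ones
    bert_params = set(map(id, model.bert.parameters()))
    other_params = [p for p in model.parameters() if id(p) not in bert_params]
    # The BERT group falls back to the optimizer default (config.lr_bert);
    # the task-specific group overrides it with config.lr_task
    optimizer_grouped_parameters = [
        {'params': model.bert.parameters()},
        {'params': other_params, 'lr': config.lr_task},
    ]

    optimizer = AdamW(optimizer_grouped_parameters, lr=config.lr_bert, correct_bias=False)
    # Linear warmup over the first config.warmup fraction of steps, then linear decay
    scheduler = get_linear_schedule_with_warmup(
        optimizer, num_warmup_steps=train_steps * config.warmup, num_training_steps=train_steps)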
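
Since the text mentions the cosine schedule while the excerpt uses the linear one, the sketch below compares the two learning-rate multipliers in plain Python. The formulas are written to mirror, to my understanding, the shapes of get_linear_schedule_with_warmup and get_cosine_schedule_with_warmup (with the default half-cosine); the function names here are made up for this example and are not library calls.

```python
import math


def linear_warmup_decay(step: int, warmup_steps: int, total_steps: int) -> float:
    """LR multiplier: linear ramp 0 -> 1 during warmup, then linear decay to 0."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


def cosine_warmup_decay(step: int, warmup_steps: int, total_steps: int) -> float:
    """LR multiplier: same warmup, then half a cosine wave from 1 down to 0."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# Compare the two multipliers at a few steps (10 warmup steps, 100 in total).
for step in (0, 5, 10, 30, 55, 80, 100):
    lin = linear_warmup_decay(step, 10, 100)
    cos = cosine_warmup_decay(step, 10, 100)
    print(f"step={step:3d}  linear={lin:.3f}  cosine={cos:.3f}")
```

The effective learning rate of each parameter group is its base rate (lr_bert or lr_task in the excerpt) times this multiplier, so both groups follow the same shape at different scales.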



CLEVE $

CLEVE: Contrastive Pre-training for Event Extraction (aclanthology.org)
Code: THU-KEG/CLEVE (github.com)

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Graph Isomorphism Network
Here we use a state-of-the-art GNN model, Graph Isomorphism Network (Xu et al., 2019), as our graph encoder for its strong representation ability.
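
For reference, the GIN node update from Xu et al. (2019) can be written as follows, where $h_v^{(k)}$ is the representation of node $v$ after layer $k$, $\mathcal{N}(v)$ is its neighbor set, and $\epsilon^{(k)}$ is a learnable or fixed scalar. How CLEVE wires this into its graph encoder is not covered in these notes.

```latex
h_v^{(k)} = \mathrm{MLP}^{(k)}\!\left( \left(1 + \epsilon^{(k)}\right) h_v^{(k-1)} + \sum_{u \in \mathcal{N}(v)} h_u^{(k-1)} \right)
```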


FEAE

Trigger is Not Sufficient: Exploiting Frame-aware Knowledge for Implicit Event Argument Extraction (aclanthology.org)
No official source code.

MRC-based Argument Extraction
Teacher-student Framework


GIT $

Document-level Event Extraction via Heterogeneous Graph-based Interaction Model with a Tracker (aclanthology.org)
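Source code: Document-level Event Extraction via Heterogeneous Graph-based Interaction Model with a Tracker (aclanthology.org)

In a video talk on GIT hosted by AI Drive, the authors also mentioned that training is not end-to-end from the start: gold labels are fed in first and are then gradually replaced by the model's own outputs.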
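
The "gold labels first, model outputs later" idea resembles scheduled sampling. The sketch below is one hypothetical way to implement it; the linear decay, the function names, and the toy labels are all assumptions, since the talk (as summarized here) only says the replacement is gradual.

```python
import random
from typing import List, Sequence


def gold_probability(step: int, total_steps: int, floor: float = 0.0) -> float:
    """Probability of feeding a gold label, decayed linearly from 1.0 to `floor`."""
    frac = min(1.0, max(0.0, step / max(1, total_steps)))
    return max(floor, 1.0 - frac)


def pick_inputs(
    gold: Sequence[str],
    predicted: Sequence[str],
    step: int,
    total_steps: int,
    rng: random.Random,
) -> List[str]:
    """Per item, keep the gold label with the current probability, else the prediction."""
    p = gold_probability(step, total_steps)
    return [g if rng.random() < p else m for g, m in zip(gold, predicted)]


rng = random.Random(0)
gold = ["ORG", "DATE", "SHARE"]
predicted = ["ORG", "O", "DATE"]  # toy model outputs

print(pick_inputs(gold, predicted, step=0, total_steps=100, rng=rng))    # all gold
print(pick_inputs(gold, predicted, step=50, total_steps=100, rng=rng))   # a mix
print(pick_inputs(gold, predicted, step=100, total_steps=100, rng=rng))  # all predicted
```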

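The tracker cannot be parallelized, so it becomes the speed bottleneck of the model. In addition, the order in which arguments are extracted has to be defined in advance.
For example, for Equity Freeze the order Equity Holder -> FrozeShare -> StartDate… has to be specified by hand.
Whether a given order works well can only be found out by training.

The GitHub repository uses a financial dataset.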



NoFPFN $

Revisiting the Evaluation of End-to-end Event Extraction (aclanthology.org)
Source code: dolphin-zs/Doc2EDAG (github.com)

Uses reinforcement learning: "to support diverse preferences of evaluation metrics motivated by different scenarios, we propose a new training paradigm based on reinforcement learning for a typical end-to-end EE model."



GATE $

GATE: Graph Attention Transformer Encoder for Cross-lingual Relation and Event Extraction (arxiv.org)

Ahmad, W. U., Peng, N., & Chang, K.-W. (2021). GATE: Graph Attention Transformer Encoder for Cross-lingual Relation and Event Extraction. Proceedings of the AAAI Conference on Artificial Intelligence, 35(14), 12462-12470. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/17478

Source code: wasiahmad/GATE (github.com)
Cross-lingual.



DualQA

What the Role is vs. What Plays the Role: Semi-Supervised Event Argument Extraction via Dual Question Answering | Proceedings of the AAAI Conference on Artificial Intelligence
No official source code.



GRIT $

GRIT: Generative Role-filler Transformers for Document-level Event Entity Extraction (aclanthology.org)

Source code: xinyadu/grit_doc_event_entity (github.com)

Event Entity Extraction


Partially causal masking strategy
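
As I understand the idea, a partially causal mask lets source (document) tokens attend to each other bidirectionally but not to the generated target, while each target token sees the full source plus the target tokens up to and including itself. The sketch below builds such a mask over a concatenated source+target sequence; `partially_causal_mask` is a name made up for this example and is not taken from the GRIT repository.

```python
from typing import List


def partially_causal_mask(src_len: int, tgt_len: int) -> List[List[bool]]:
    """mask[q][k] is True when query position q may attend to key position k.

    Positions 0..src_len-1 are source tokens; the rest are target tokens.
    """
    n = src_len + tgt_len
    mask = [[False] * n for _ in range(n)]
    for q in range(n):
        for k in range(n):
            if k < src_len:
                # Every position (source or target) can see the whole source.
                mask[q][k] = True
            elif q >= src_len and k <= q:
                # Target positions see earlier target positions and themselves.
                mask[q][k] = True
    return mask


mask = partially_causal_mask(src_len=2, tgt_len=2)
for row in mask:
    print("".join("1" if allowed else "0" for allowed in row))
```

With two source and two target tokens, the upper-right block (source attending to target) stays empty and the lower-right block is lower-triangular.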


Wen et al.

Event Time Extraction and Propagation via Graph Attention Networks (aclanthology.org)
No official source code.



2020

SciBERT $

Biomedical Event Extraction as Multi-turn Question Answering (aclanthology.org)
Source code: allenai/scibert: A BERT model for scientific text. (github.com)

Biomedical event extraction
describing specific relationships between multiple molecular entities, such as genes, proteins, or cellular components

Visualization tool: BioNLP Shared Task 2011: Supporting Resources (aclanthology.org)


HPNet

Joint Event Extraction with Hierarchical Policy Network (aclanthology.org)
No official source code.

M2E2

Cross-media Structured Common Space for Multimedia Event Extraction (aclanthology.org)
No official source code.

MQAEE

Event Extraction as Multi-turn Question Answering (aclanthology.org)
No official source code.

Du et al. $

Event Extraction by Answering (Almost) Natural Questions (aclanthology.org)
Source code: xinyadu/eeqa: Event Extraction by Answering (Almost) Natural Questions (github.com)
My classmate ll is using this one, so I am setting it aside for now to see what he says.


Min et al.

Towards Few-Shot Event Mention Retrieval: An Evaluation Framework and A Siamese Network Approach (aclanthology.org)
No official source code.

  • Sample pairs that are both in the query, and assign them the same class label.
  • Sample pairs such that one of them is in the query but the other is not, and assign this pair the not in same class label.
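
The two sampling rules above can be sketched as follows. This is an illustration of the described pair construction only; `sample_pairs`, the label encoding (1 = same class, 0 = not in same class), and the toy mentions are assumptions, and the query set needs at least two mentions.

```python
import random
from typing import List, Sequence, Tuple


def sample_pairs(
    query_mentions: Sequence[str],
    other_mentions: Sequence[str],
    n_pos: int,
    n_neg: int,
    rng: random.Random,
) -> List[Tuple[str, str, int]]:
    """Build (mention_a, mention_b, label) training pairs for a Siamese model."""
    pairs = []
    for _ in range(n_pos):
        # Both mentions come from the query set: same class.
        a, b = rng.sample(list(query_mentions), 2)
        pairs.append((a, b, 1))
    for _ in range(n_neg):
        # One mention from the query set, one from outside it: not in same class.
        pairs.append((rng.choice(list(query_mentions)), rng.choice(list(other_mentions)), 0))
    return pairs


query = ["troops attacked the town", "rebels stormed the base", "gunmen raided a bank"]
others = ["the company hired 50 people", "she was born in 1990"]

for pair in sample_pairs(query, others, n_pos=2, n_neg=2, rng=random.Random(0)):
    print(pair)
```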

Chen et al.

Reading the Manual: Event Extraction as Definition Comprehension (aclanthology.org)
No official source code.

Mainly aimed at zero-shot and few-shot settings.
I do not yet understand the Approach section…
trigger cls: 72.9
arg cls: 42.4



EEGCN $

Edge-Enhanced Graph Convolution Networks for Event Detection with Syntactic Relation (aclanthology.org)

Source code: cuishiyao96/eegcned (github.com)

The graph model is somewhat unusual:
Edge-Aware Node Update Module first aggregates information from neighbors of each node through specific edge, and Node-Aware Edge Update module refines the edge representation with its connected nodes.

Covers only the event trigger task (event detection).
