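I previously worked on event extraction (a subfield of information extraction, which is itself a small area of NLP) and compiled several literature lists along the way.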
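Event extraction literature roundup (2020-2021)
Event extraction literature roundup (2019)
Event extraction literature roundup (2018)
Event extraction literature roundup (2008-2017)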
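Model overview
Figure source: A Compact Survey on Event Extraction: Approaches and Applications
When I read it, I also went through this write-up: NLP Event Extraction Survey (Part 2): Models
In the figure, models marked with $ have released code.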
Document-Level Event Argument Extraction by Conditional Generation (aclanthology.org)
Uses the BART model, but after reading the official source code I found it incomplete.
Event Extraction from Historical Texts: A New Dataset for Black Rebellions (aclanthology.org)
No official source code.
Proposes a new dataset (the paper gives no public link to it), which is the paper's main contribution.
a corpus of nineteenth-century African American newspapers.
Our dataset features 5 entity types, 12 event types, and 6 argument roles that concern slavery and black movements between the eighteenth and nineteenth centuries.
Paper: https://aclanthology.org/2021.acl-long.217.pdf
Code: luyaojie/Text2Event (github.com)
May be a useful reference for how to use the schema to constrain the decoding process.
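One way to picture schema-constrained decoding is a prefix trie over the label strings the schema allows: at each step the decoder may only pick tokens that continue some valid label. The sketch below is a toy illustration under that assumption, not the Text2Event implementation. The schema contents, `build_trie`, `allowed_next`, and `constrained_greedy` are hypothetical names, and the "model" is just a fixed score dictionary.

```python
from typing import Dict, List

END = "<end>"  # hypothetical end-of-label marker


def build_trie(labels: List[List[str]]) -> Dict:
    """Build a nested-dict prefix trie from tokenized schema labels."""
    root: Dict = {}
    for tokens in labels:
        node = root
        for tok in tokens + [END]:
            node = node.setdefault(tok, {})
    return root


def allowed_next(trie: Dict, prefix: List[str]) -> List[str]:
    """Return the tokens that may follow `prefix` according to the trie."""
    node = trie
    for tok in prefix:
        if tok not in node:
            return []  # the prefix has already left the schema
        node = node[tok]
    return sorted(node.keys())


def constrained_greedy(trie: Dict, scores: Dict[str, float]) -> List[str]:
    """Greedy decoding that only considers schema-valid tokens at each step."""
    prefix: List[str] = []
    while True:
        candidates = allowed_next(trie, prefix)
        best = max(candidates, key=lambda t: scores.get(t, float("-inf")))
        if best == END:
            return prefix
        prefix.append(best)


# Toy schema: event type names split into tokens (made up for illustration).
schema_labels = [["life", "die"], ["life", "marry"], ["conflict", "attack"]]
trie = build_trie(schema_labels)
# Stand-in for model scores; "banana" scores highest but is not in the schema.
toy_scores = {"banana": 9.0, "conflict": 2.0, "life": 1.0,
              "attack": 0.5, "die": 0.4, END: 0.1}
decoded = constrained_greedy(trie, toy_scores)
print(decoded)
```

The out-of-schema token is never considered even though it has the highest score, because `allowed_next` filters the candidates before the argmax.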
CasEE: A Joint Learning Framework with Cascade Decoding for Overlapping Event Extraction (aclanthology.org)
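Code: JiaweiSheng/CasEE: Source code for ACL 2021 finding paper: CasEE: A Joint Learning Framework with Cascade Decoding for Overlapping Event Extraction (github.com)
Targets Chinese text.
I set up the environment and ran into no problems; the code runs.
I also skimmed the code.
The paper essentially follows CasRel (arxiv.org), a model for relational triple extraction, and transfers that paradigm to event extraction.
CasEE architecture:
Uses CLN (Conditional Layer Normalization) and MSA (Multi-Head Self-Attention).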
Uses a two-pointer scheme (start position and end position); the drawback is that the thresholds have to be set by hand: "We select tokens with $\hat{t}^{sc}_i > \xi_2$ as the start positions, and those with $\hat{t}^{ec}_i > \xi_3$ as end positions, where $\xi_2, \xi_3 \in [0, 1]$ are scalar thresholds."
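The thresholded start/end selection quoted above can be sketched in a few lines. This is a simplified illustration, not the CasEE decoder: the pairing rule (each start is matched to the nearest end at or after it), the names `decode_spans`, `start_thr`, and `end_thr`, and the probabilities are all assumptions made for the example.

```python
from typing import List, Tuple


def decode_spans(start_probs: List[float], end_probs: List[float],
                 start_thr: float = 0.5, end_thr: float = 0.5) -> List[Tuple[int, int]]:
    """Pick tokens above the thresholds, then pair each start with the nearest end."""
    starts = [i for i, p in enumerate(start_probs) if p > start_thr]
    ends = [i for i, p in enumerate(end_probs) if p > end_thr]
    spans = []
    for s in starts:
        following = [e for e in ends if e >= s]
        if following:
            spans.append((s, following[0]))  # inclusive token indices
    return spans


# Made-up per-token probabilities for a six-token sentence.
start_probs = [0.1, 0.9, 0.2, 0.1, 0.7, 0.3]
end_probs = [0.1, 0.2, 0.8, 0.1, 0.6, 0.2]

spans_default = decode_spans(start_probs, end_probs)
spans_strict = decode_spans(start_probs, end_probs, start_thr=0.8, end_thr=0.7)
print(spans_default, spans_strict)
```

Raising the thresholds drops the second span, which illustrates why these values need manual tuning on a development set.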
During argument classification there is also a type_soft_constrain step:
p_s = torch.sigmoid(self.head_cls(inp)) # [b, t, l]
p_e = torch.sigmoid(self.tail_cls(inp))
type_soft_constrain = torch.sigmoid(self.gate_linear(type_emb)) # [b, l]
type_soft_constrain = type_soft_constrain.unsqueeze(1).expand_as(p_s)
p_s = p_s * type_soft_constrain
p_e = p_e * type_soft_constrain
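Different parts of the model use different learning rates. For get_cosine_schedule_with_warmup, see this example: Sentiment analysis with the BERT family, PyTorch implementation (in progress). Note that the snippet below calls get_linear_schedule_with_warmup instead.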
# Assumed import: an AdamW that accepts `correct_bias` is the one shipped with `transformers`.
from transformers import AdamW, get_linear_schedule_with_warmup

def set_learning_setting(self, config, train_loader, dev_loader, model):
    instances_num = len(train_loader.dataset)
    train_steps = int(instances_num * config.epochs_num / config.batch_size) + 1
    print("Batch size: ", config.batch_size)
    print("The number of training instances:", instances_num)
    print("The number of evaluating instances:", len(dev_loader.dataset))
    # BERT parameters fall back to the optimizer default (config.lr_bert);
    # every other parameter gets its own rate, config.lr_task.
    bert_params = list(map(id, model.bert.parameters()))
    other_params = filter(lambda p: id(p) not in bert_params, model.parameters())
    optimizer_grouped_parameters = [{'params': model.bert.parameters()}, {'params': other_params, 'lr': config.lr_task}]
    optimizer = AdamW(optimizer_grouped_parameters, lr=config.lr_bert, correct_bias=False)
    # Warmup lasts train_steps * config.warmup steps, followed by linear decay.
    scheduler = get_linear_schedule_with_warmup(optimizer, num_warmup_steps=train_steps * config.warmup, num_training_steps=train_steps)
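For intuition, the learning-rate multiplier produced by a linear warmup schedule can be written out by hand. The function below is a plain-Python sketch of the usual warmup-then-linear-decay rule; `linear_warmup_multiplier` is a hypothetical helper, and the base rates 2e-5 and 1e-4 are placeholder values, not settings taken from the CasEE config.

```python
def linear_warmup_multiplier(step: int, warmup_steps: int, total_steps: int) -> float:
    """Factor applied to each parameter group's base learning rate at `step`."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to 1 during warmup.
        return step / max(1, warmup_steps)
    # Then decay linearly from 1 to 0 over the remaining steps.
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


total_steps, warmup_steps = 100, 10
base_lrs = {"bert": 2e-5, "task": 1e-4}  # placeholder base rates per group

for step in (0, 5, 10, 55, 100):
    factor = linear_warmup_multiplier(step, warmup_steps, total_steps)
    print(step, {name: lr * factor for name, lr in base_lrs.items()})
```

Because the same factor scales every parameter group, the ratio between the BERT rate and the task-layer rate stays constant throughout training.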
CLEVE: Contrastive Pre-training for Event Extraction (aclanthology.org)
Code: THU-KEG/CLEVE (github.com)
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Graph Isomorphism Network
Here we use a state-of-the-art GNN model, Graph Isomorphism Network (Xu et al., 2019), as our graph encoder for its strong representation ability.
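For reference, the node update rule of GIN as given by Xu et al. (2019) is shown below. The notation follows that paper rather than CLEVE: $h_v^{(k)}$ is the representation of node $v$ at layer $k$, $\mathcal{N}(v)$ is the set of its neighbors, and $\epsilon^{(k)}$ is a learnable or fixed scalar.

```latex
h_v^{(k)} = \mathrm{MLP}^{(k)}\!\left( \left(1 + \epsilon^{(k)}\right) h_v^{(k-1)} + \sum_{u \in \mathcal{N}(v)} h_u^{(k-1)} \right)
```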
Trigger is Not Sufficient: Exploiting Frame-aware Knowledge for Implicit Event Argument Extraction (aclanthology.org)
No official source code.
MRC-based Argument Extraction
Teacher-student Framework
Document-level Event Extraction via Heterogeneous Graph-based Interaction Model with a Tracker (aclanthology.org)
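Source code: Document-level Event Extraction via Heterogeneous Graph-based Interaction Model with a Tracker (aclanthology.org)
In their AI Drive video talk on GIT, the authors also mentioned that training is not end-to-end from the start: the model is first fed gold labels, which are then gradually replaced by the model's own outputs.
The tracker cannot run in parallel, which makes it the speed bottleneck of the model. In addition, the order of argument extraction has to be defined in advance.
For example, Equity Freeze here requires manually defining the order Equity Holder -> FrozeShare -> StartDate…
Whether a given order is good or bad can only be found out by training.
The GitHub repository uses a financial dataset.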
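The gradual replacement of gold labels by model outputs mentioned above for GIT resembles scheduled sampling. Below is a minimal sketch of that idea, assuming a linear decay of the gold-label probability; the schedule GIT actually uses is not stated in these notes, and `gold_probability` and `choose_inputs` are hypothetical names.

```python
import random
from typing import List


def gold_probability(step: int, decay_steps: int) -> float:
    """Probability of feeding the gold label; decays linearly from 1 to 0."""
    return max(0.0, 1.0 - step / max(1, decay_steps))


def choose_inputs(gold: List[str], predicted: List[str], p_gold: float,
                  rng: random.Random) -> List[str]:
    """Per item, keep the gold label with probability p_gold, else the prediction."""
    return [g if rng.random() < p_gold else p for g, p in zip(gold, predicted)]


rng = random.Random(0)
gold = ["EquityHolder", "FrozeShare", "StartDate"]
predicted = ["EquityHolder", "O", "StartDate"]  # made-up model output

early = choose_inputs(gold, predicted, gold_probability(0, 1000), rng)
late = choose_inputs(gold, predicted, gold_probability(1000, 1000), rng)
print(early, late)
```

At step 0 the downstream component sees only gold labels; once the decay finishes it sees only predictions, so training ends under the same conditions as inference.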
Revisiting the Evaluation of End-to-end Event Extraction (aclanthology.org)
Source code: dolphin-zs/Doc2EDAG (github.com)
Reinforcement learning: "to support diverse preferences of evaluation metrics motivated by different scenarios, we propose a new training paradigm based on reinforcement learning for a typical end-to-end EE model"
GATE: Graph Attention Transformer Encoder for Cross-lingual Relation and Event Extraction (arxiv.org)
Ahmad, W. U., Peng, N., & Chang, K.-W. (2021). GATE: Graph Attention Transformer Encoder for Cross-lingual Relation and Event Extraction. Proceedings of the AAAI Conference on Artificial Intelligence, 35(14), 12462-12470. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/17478
Source code: wasiahmad/GATE (github.com)
Cross-lingual.
What the Role is vs. What Plays the Role: Semi-Supervised Event Argument Extraction via Dual Question Answering | Proceedings of the AAAI Conference on Artificial Intelligence
No official source code.
GRIT: Generative Role-filler Transformers for Document-level Event Entity Extraction (aclanthology.org)
Source code: xinyadu/grit_doc_event_entity (github.com)
Event Entity Extraction
Partially causal masking strategy
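A partially causal mask of the kind GRIT describes can be sketched as a boolean matrix over the concatenated source and target tokens: source positions see only the source (bidirectionally), while target positions see the whole source plus target positions up to and including themselves. This is my reading of the strategy written in plain Python, not the authors' code; `partially_causal_mask` is a hypothetical name.

```python
from typing import List


def partially_causal_mask(src_len: int, tgt_len: int) -> List[List[bool]]:
    """mask[i][j] is True when position i may attend to position j."""
    total = src_len + tgt_len
    mask = [[False] * total for _ in range(total)]
    for i in range(total):
        for j in range(total):
            if j < src_len:
                mask[i][j] = True  # every position can see every source token
            elif i >= src_len and j <= i:
                mask[i][j] = True  # target tokens see targets up to themselves
    return mask


mask = partially_causal_mask(src_len=3, tgt_len=2)
rows = ["".join("1" if allowed else "0" for allowed in row) for row in mask]
for row in rows:
    print(row)
```

The top-right block stays empty so the encoded source never depends on tokens that have not been generated yet.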
Event Time Extraction and Propagation via Graph Attention Networks (aclanthology.org)
No official source code.
Biomedical Event Extraction as Multi-turn Question Answering (aclanthology.org)
Source code: allenai/scibert: A BERT model for scientific text. (github.com)
Biomedical event extraction
describing specific relationships between multiple molecular entities, such as genes, proteins, or cellular components
Visualization tool: BioNLP Shared Task 2011: Supporting Resources (aclanthology.org)
Joint Event Extraction with Hierarchical Policy Network (aclanthology.org)
No official source code.
Cross-media Structured Common Space for Multimedia Event Extraction (aclanthology.org)
No official source code.
Event Extraction as Multi-turn Question Answering (aclanthology.org)
No official source code.
Event Extraction by Answering (Almost) Natural Questions (aclanthology.org)
Source code: xinyadu/eeqa: Event Extraction by Answering (Almost) Natural Questions (github.com)
My classmate ll is using this paper, so I'll set it aside for now and see what he says.
Towards Few-Shot Event Mention Retrieval: An Evaluation Framework and A Siamese Network Approach (aclanthology.org)
No official source code.
Reading the Manual: Event Extraction as Definition Comprehension (aclanthology.org)
No official source code.
Mainly targets zero-shot and few-shot settings.
I don't yet understand the Approach section…
trigger cls: 72.9
arg cls: 42.4
Edge-Enhanced Graph Convolution Networks for Event Detection with Syntactic Relation (aclanthology.org)
Source code: cuishiyao96/eegcned (github.com)
The graph model is somewhat unusual.
Edge-Aware Node Update Module first aggregates information from neighbors of each node through specific edge, and Node-Aware Edge Update module refines the edge representation with its connected nodes.
Covers only the event trigger task (event detection).