Google's New Synthesizer: Rethinking Self-Attention in Transformer Models
0. Background

Institution: Google Research
Authors: Yi Tay, Dara Bahri, Donald Metzler, Da-Cheng Juan, Zhe Zhao, Che Zheng
Paper: https://arxiv.org/abs/2005.00743

0.1 Abstract

Judging by today's state-of-the-art Transformer-based models, dot-product self-attention appears to be crucial and indispensable. However,