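MAML: Model-Agnostic Meta-Learning. For an introduction to what MAML is, see HERE.
MAML learns how to optimize: it learns an initialization from which ordinary gradient-based optimization adapts quickly to a new task. Because it is model-agnostic, it works purely at the level of the loss, which goes through two levels of gradient updates: an inner update that adapts the parameters to a task, and an outer (meta) update computed from the loss of the adapted parameters. The meta-learning therefore happens on the gradients. MAML's defining feature is this fast learning, obtained by adjusting the parameters with a few gradient steps.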
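To make the two levels of gradient updates concrete, here is a minimal sketch on a toy problem, not the event-extraction model. It assumes each task is a scalar quadratic loss (theta - target)^2, so both gradients can be written by hand; the names inner_update, meta_gradient and maml_train are illustrative only.

```python
def inner_update(theta, target, inner_lr):
    """Inner step: adapt theta to one task with a single gradient step."""
    grad = 2.0 * (theta - target)          # d/dtheta of (theta - target)^2
    return theta - inner_lr * grad


def meta_gradient(theta, target, inner_lr):
    """Outer step: gradient of the post-adaptation loss w.r.t. the initial theta."""
    adapted = inner_update(theta, target, inner_lr)
    # Chain rule: d(adapted)/d(theta) = 1 - 2 * inner_lr for this quadratic loss.
    return 2.0 * (adapted - target) * (1.0 - 2.0 * inner_lr)


def maml_train(targets, theta=0.0, inner_lr=0.1, meta_lr=0.1, steps=200):
    """Average the meta-gradient over all tasks and update the initialization."""
    for _ in range(steps):
        g = sum(meta_gradient(theta, t, inner_lr) for t in targets) / len(targets)
        theta -= meta_lr * g
    return theta


# Two toy tasks with optima at 1.0 and 3.0.
theta = maml_train([1.0, 3.0])
print(theta)
```

In this symmetric toy setting the learned initialization settles midway between the two task optima, so one inner step moves it equally well toward either task.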
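Here event extraction is treated as a sequence labeling problem, and an LSTM + softmax tagger serves as the base model (MAML itself does not depend on that choice). The core of the code is the repeated gradient update driven by the loss.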
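For orientation, a base model of this kind could look like the sketch below. It is an assumption, not the author's network: the class name LSTMTagger, the embedding layer, and all sizes are hypothetical, chosen only so that the output shape matches the [25, 32, 6] = [shot_num, max_len, n_way + 1] noted in the training loop.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LSTMTagger(nn.Module):
    """Hypothetical LSTM + softmax sequence tagger (one tag per token)."""

    def __init__(self, vocab_size, embed_dim, hidden_dim, num_tags):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, num_tags)

    def forward(self, token_ids):
        # token_ids: [batch, max_len] -> logits: [batch, max_len, num_tags]
        hidden, _ = self.lstm(self.embed(token_ids))
        return self.proj(hidden)


torch.manual_seed(0)
n_way = 5                                    # event types; +1 for the "no event" tag
model = LSTMTagger(vocab_size=100, embed_dim=16, hidden_dim=8, num_tags=n_way + 1)

tokens = torch.randint(0, 100, (25, 32))     # [shot_num, max_len]
tags = torch.randint(0, n_way + 1, (25, 32))

logits = model(tokens)
# Token-level cross-entropy; softmax is applied inside cross_entropy.
loss = F.cross_entropy(logits.reshape(-1, n_way + 1), tags.reshape(-1))
# Predicted tag per token, as in the evaluation steps of the training loop.
pred = F.softmax(logits, dim=2).argmax(dim=2)
```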
for i in range(task_num):
    # Each task is n_way-way, k_shot-shot, i.e. n_way * k_shot support samples.
    # 1. run the i-th task and compute loss for k=0
    logits = self.net._get_lstm_features(x_spt[i])  # [25, 32, 6] = [shot_num, max_len, n_way + 1]
    loss = self.Floss(logits, y_spt[i])
    # torch.autograd.grad(y, x) returns the gradient of y with respect to x, i.e. the first derivative dy/dx.
    grad = torch.autograd.grad(loss, self.net.parameters())  # gradient of the support loss w.r.t. the parameters
    # Use the gradient to take one update step: theta' = theta - update_lr * grad
    fast_weights = list(map(lambda p: p[1] - self.update_lr * p[0], zip(grad, self.net.parameters())))
    # this is the loss and accuracy before first update
    with torch.no_grad():
        # [setsz, max_len, n_way + 1]
        logits_q = self.net._get_lstm_features(x_qry[i])
        loss_q = self.Floss(logits_q, y_qry[i])
        losses_q[0] += loss_q
        # predicted tag per token
        pred_q = F.softmax(logits_q, dim=2).argmax(dim=2)
        p, r, f1 = f1_score(pred_q, y_qry[i])
        P[0] = P[0] + p
        R[0] = R[0] + r
        F1[0] = F1[0] + f1
        # correct = torch.eq(pred_q, y_qry[i]).sum().item()
        # corrects[0] = corrects[0] + correct
    # this is the loss and accuracy after the first update
    with torch.no_grad():
        # [setsz, max_len, n_way + 1]
        self.paraUp(fast_weights)
        # fast_weights are the gradient-updated self.net.parameters()
        logits_q = self.net._get_lstm_features(x_qry[i])
        loss_q = self.Floss(logits_q, y_qry[i])
        losses_q[1] += loss_q
        # [setsz, max_len]
        pred_q = F.softmax(logits_q, dim=2).argmax(dim=2)
        p, r, f1 = f1_score(pred_q, y_qry[i])
        P[1] = P[1] + p
        R[1] = R[1] + r
        F1[1] = F1[1] + f1
    for k in range(1, self.update_step):
        # 1. run the i-th task and compute loss for k=1~K-1
        logits = self.net._get_lstm_features(x_spt[i])
        loss = self.Floss(logits, y_spt[i])
        # 2. compute grad on theta_pi
        grad = torch.autograd.grad(loss, self.net.parameters())
        # 3. theta_pi = theta_pi - train_lr * grad
        fast_weights = list(map(lambda p: p[1] - self.update_lr * p[0], zip(grad, fast_weights)))
        # update the parameters
        self.paraUp(fast_weights)
        logits_q = self.net._get_lstm_features(x_qry[i])
        # loss_q will be overwritten and just keep the loss_q on last update step.
        loss_q = self.Floss(logits_q, y_qry[i])
        losses_q[k + 1] += loss_q
        with torch.no_grad():
            pred_q = F.softmax(logits_q, dim=2).argmax(dim=2)
            p, r, f1 = f1_score(pred_q, y_qry[i])
            P[k + 1] = P[k + 1] + p
            R[k + 1] = R[k + 1] + r
            F1[k + 1] = F1[k + 1] + f1
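The loop above ends by accumulating the query loss after each inner step; the meta update then needs the gradient of that loss with respect to the initial parameters. One common way to obtain it is to keep the fast weights as ordinary tensors inside the autograd graph. The sketch below shows this on a toy linear regression, which is an assumption made for brevity: w, mse and the random data are hypothetical stand-ins, and whether the paraUp-based code propagates gradients the same way depends on its implementation, which is not shown here.

```python
import torch

torch.manual_seed(0)
w = torch.zeros(3, requires_grad=True)       # initial (meta) parameters
x_spt, y_spt = torch.randn(8, 3), torch.randn(8)
x_qry, y_qry = torch.randn(8, 3), torch.randn(8)
update_lr = 0.1


def mse(weights, x, y):
    """Mean squared error of a linear model with explicit weights."""
    return ((x @ weights - y) ** 2).mean()


# Inner step on the support set; create_graph=True keeps the gradient
# itself differentiable so the outer step can see through the update.
loss = mse(w, x_spt, y_spt)
(grad,) = torch.autograd.grad(loss, [w], create_graph=True)
fast_w = w - update_lr * grad

# Outer loss on the query set, evaluated with the adapted weights.
loss_q = mse(fast_w, x_qry, y_qry)
loss_q.backward()

meta_grad = w.grad                           # gradient w.r.t. the initial parameters
```

Dropping create_graph=True would give the cheaper first-order approximation, in which the inner gradient is treated as a constant during the outer step.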