While working on my undergraduate graduation project, I was reproducing the following code:
# ...
# We split the whole train dataset into 100 segments.
for i in range(20):
    t1 = time.time()
    total_loss = 0
    train_dataset = QD.QDloadStrokeData(no=i, val=False, transforms=trans)
    train_loader = DataLoader(dataset=train_dataset, batch_size=256, shuffle=False)
    for t, (x, stroke, y) in enumerate(train_loader):
        model.train()
        x = x.to(device=device, dtype=dtype)
        y = y.to(device=device, dtype=torch.long)
        # Add the center feature returned from the ResNet
        scores, cf_pred = model(x)
        # Calculate the cross-entropy loss
        entropy_loss = F.cross_entropy(scores, y)
        # Calculate the center loss
        center_loss = F.mse_loss(cf_pred, cf_class[y])
        loss = entropy_loss + alpha * center_loss
        total_loss += loss
        # ...
It fails with the following error:
RuntimeError Traceback (most recent call last)
/tmp/ipykernel_14660/27735771.py in
----> 1 train(cnn_model,optimizer, epochs=5,args=args)
/tmp/ipykernel_14660/4015440672.py in train(model, optimizer, epochs, args)
100
101 #Caculate the center loss
--> 102 center_loss = F.mse_loss(cf_pred,cf_class[y])
103 print(center_loss.shape)
104
RuntimeError: The size of tensor a (2048) must match the size of tensor b (2088) at non-singleton dimension 1
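The message says that tensor a (size 2048) and tensor b (size 2088) must have the same size along dimension 1.
It does not say which two tensors are mismatched, but the traceback shows that the failure happens in the mse_loss call.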
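As a quick illustration of what "non-singleton dimension" means, here is a minimal sketch of the broadcasting rule behind this message. The 4x8 and 4x9 shapes are made up for the example and are not the project's sizes.

```python
import torch

a = torch.randn(4, 8)

# A size of 1 is a "singleton" dimension and is stretched to match, so this works.
ok = a - torch.randn(4, 1)
print(ok.shape)

# 8 vs 9: the sizes differ and neither is 1, so broadcasting fails on dimension 1.
try:
    a - torch.randn(4, 9)
    message = None
except RuntimeError as err:
    message = str(err)
print(message)
```

The second operation fails for the same reason as the mse_loss call: the two operands disagree on dimension 1 and neither size is 1.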
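The most direct way to find out is to print the shape of every intermediate tensor.
In a separate snippet, feed in random tensors with the same shapes as the real inputs to locate the problem: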
import numpy as np
import torch
import torch.nn.functional as F

# Per-class center features loaded from the project's .npy file
cf_class = torch.from_numpy(np.load("center_feature_ssn.npy"))
# Random stand-ins with the same shapes as the real tensors
scores = torch.randn(256, 40)  # randn already returns float32, no re-wrapping needed
cf_pred = torch.randn(256, 2048)
print('scores:', scores.shape)
y = torch.zeros(256)
y = y.long()  # cross_entropy expects class indices of type long
print('y:', y.shape)
entropy_loss = F.cross_entropy(scores, y)
print('entropy_loss:', entropy_loss.shape)
print('cf_pred:', cf_pred.shape)
print('cf_class[y]:', cf_class[y].shape)
center_loss = F.mse_loss(cf_pred, cf_class[y])
Output:
scores: torch.Size([256, 40])
y: torch.Size([256])
entropy_loss: torch.Size([])
cf_pred: torch.Size([256, 2048])
cf_class[y]: torch.Size([256, 2088])
RuntimeError: The size of tensor a (2048) must match the size of tensor b (2088) at non-singleton dimension 1
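At this point we have successfully reproduced the error (doge). The mismatch is between cf_pred and cf_class[y] along dimension 1, and the sizes are the same numbers that appear in the error message: 2048 and 2088.
Now that we know where the problem is, the next step is to trace how the shapes of these two variables are produced and then make the two feature dimensions agree.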
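One way to catch this kind of mismatch earlier is to wrap the center loss in a small guard that names the offending tensors. The sketch below is only an illustration: center_loss_checked is a hypothetical helper, not part of the original project, and the random tensors stand in for the real model output and the loaded center features.

```python
import torch
import torch.nn.functional as F


def center_loss_checked(cf_pred, cf_class, y):
    """Hypothetical helper: MSE between predicted features and class centers,
    with an explicit shape check that names both tensors."""
    target = cf_class[y]  # pick the center feature of each sample's class
    if cf_pred.shape != target.shape:
        raise ValueError(
            f"cf_pred has shape {tuple(cf_pred.shape)} but cf_class[y] has "
            f"shape {tuple(target.shape)}; the feature dimensions must agree"
        )
    return F.mse_loss(cf_pred, target)


# Stand-in tensors with the shapes seen in the debugging output above.
y = torch.zeros(256, dtype=torch.long)
cf_pred = torch.randn(256, 2048)
cf_class = torch.randn(40, 2088)  # assumed 40 classes, matching scores' second dim

try:
    center_loss_checked(cf_pred, cf_class, y)
    error_text = None
except ValueError as err:
    error_text = str(err)
print(error_text)
```

Whether the fix belongs in the model (so that it outputs 2088 features) or in the file of center features (so that it stores 2048 per class) depends on how each was produced, which the check above cannot decide.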
Appendix:
I also hit another pitfall when calling cross_entropy():
RuntimeError: expected scalar type Long but found Float
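This happens because the cross-entropy loss requires the target, when it holds class indices, to be of type long (int64), while the input should be a floating-point tensor of logits. Adding a conversion such as y = y.long() after y is created fixes it.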
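A minimal sketch of the dtype issue, using small made-up shapes (8 samples, 40 classes). torch.zeros creates a float32 tensor by default, which is why the explicit conversion is needed.

```python
import torch
import torch.nn.functional as F

scores = torch.randn(8, 40)   # floating-point logits: 8 samples, 40 classes

y_float = torch.zeros(8)      # float32 by default, not usable as class indices
y_long = y_float.long()       # convert to int64 class indices
print(y_float.dtype, y_long.dtype)

loss = F.cross_entropy(scores, y_long)
print(loss.shape)
```

Creating the labels with torch.zeros(8, dtype=torch.long) in the first place avoids the extra conversion.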