视频学习
什么是语义分割
一张照片输出多个标签,为每一个像素分配具体的类别。
图像分割——>语义分割(全卷积网络)
扩大感知域
ASPP
PPM
对于Nonlocal Neural NEetworks 的理解,我去知乎上搜索了一下
Non-Local是王小龙在CVPR2018年提出的一个自注意力模型。Non-Local Neural Network和Non-Local Means非局部均值去燥滤波有点相似的感觉。普通的滤波都是3×3的卷积核,然后在整个图片上进行移动,处理的是3×3局部的信息。Non-Local Means操作则是结合了一个比较大的搜索范围,并进行加权。
图像语义分割前沿进展
语义分割面临的挑战
- 大小各异
- 形状复杂
- 环境多变
- 类别众多
在深度学习之前,SIFT,对多尺度信息的处理(人工设计)
引入深度学习,AlexNet,自动的去学习各种各样多尺度的表达
后边包括VGG,感受野几乎与原图一样大,可以获得大尺度的信息,但是感受野相对固定
后来的ResNet多尺度信息感知能力更强
再改进为
应用于很多任务,效果都很不错。
代码练习
代码部分我还不是很熟练,让同学给我讲解了一下
首先是定义SELayer
class SELayer(nn.Module): def __init__(self,channel,r=16): super(SELayer,self).__init__() # 定义自适应平均池化函数,降采样 self.avg_pool = nn.AdaptiveAvgPool2d(1) # 定义两个全连接层 self.fc = nn.Sequential( nn.Linear(channel,round(channel/r)), nn.ReLU(inplace = True), nn.Linear(round(channel/r),channel), nn.Sigmoid() ) def forward(self,x): b,c,_,_ = x.size() out = self.avg_pool(x).view(b,c) out = self.fc(out).view(b,c,1,1) out = x * out.expand_as(x) return out
然后再定义HybridSN
class HybridSN(nn.Module): def __init__(self): super(HybridSN, self).__init__() self.conv3d_1 = nn.Sequential( nn.Conv3d(1, 8, kernel_size=(7, 3, 3), stride=1, padding=0), nn.BatchNorm3d(8), nn.ReLU(inplace = True), ) self.conv3d_2 = nn.Sequential( nn.Conv3d(8, 16, kernel_size=(5, 3, 3), stride=1, padding=0), nn.BatchNorm3d(16), nn.ReLU(inplace = True), ) self.conv3d_3 = nn.Sequential( nn.Conv3d(16, 32, kernel_size=(3, 3, 3), stride=1, padding=0), nn.BatchNorm3d(32), nn.ReLU(inplace = True) ) self.conv2d_4 = nn.Sequential( nn.Conv2d(576, 64, kernel_size=(3, 3), stride=1, padding=0), nn.BatchNorm2d(64), nn.ReLU(inplace = True), ) self.SElayer = SELayer(64,16) self.fc1 = nn.Linear(18496,256) self.fc2 = nn.Linear(256,128) self.fc3 = nn.Linear(128,16) self.dropout = nn.Dropout(p = 0.4) def forward(self,x): out = self.conv3d_1(x) out = self.conv3d_2(out) out = self.conv3d_3(out) out = self.conv2d_4(out.reshape(out.shape[0],-1,19,19)) out = self.SElayer(out) out = out.reshape(out.shape[0],-1) out = F.relu(self.dropout(self.fc1(out))) out = F.relu(self.dropout(self.fc2(out))) out = self.fc3(out) return out
其后的数据训练仍然不变
在模型测试中
net.eval() count = 0 # 模型测试 for inputs, _ in test_loader: inputs = inputs.to(device) outputs = net(inputs) outputs = np.argmax(outputs.detach().cpu().numpy(), axis=1) if count == 0: y_pred_test = outputs count = 1 else: y_pred_test = np.concatenate( (y_pred_test, outputs) ) # 生成分类报告 classification = classification_report(ytest, y_pred_test, digits=4) print(classification)
可得结果