[HED Edge Detection Network: Study Notes]

HED (Holistically-Nested Edge Detection) network

First, a look at the HED network structure (reposted from https://blog.csdn.net/sinat_26917383/article/details/73087831):

The HED model is designed on top of the VGG16 architecture, so it is worth looking at VGG16 first.
[Figure 1: VGG16 network structure]
 

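HED removes the last five layers of VGG16. The trailing fully connected layers and the softmax layer exist mainly for classification, whereas HED only needs to extract image features, so it keeps the earlier convolution and pooling layers (note: the final pooling layer is removed as well). The HED network is sketched below: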

[Figure 2: HED network structure]
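
As a rough illustration of the trimming described above, the sketch below lists the VGG16 layers and drops the last five. The layer names follow the common Caffe naming convention, which is an assumption here, and `hed_backbone` and `side_output_layers` are hypothetical helpers written only for this note.

```python
# Layer names follow the common Caffe VGG16 convention (an assumption).
VGG16_LAYERS = [
    "conv1_1", "conv1_2", "pool1",
    "conv2_1", "conv2_2", "pool2",
    "conv3_1", "conv3_2", "conv3_3", "pool3",
    "conv4_1", "conv4_2", "conv4_3", "pool4",
    "conv5_1", "conv5_2", "conv5_3", "pool5",
    "fc6", "fc7", "fc8", "softmax",
]


def hed_backbone(layers, num_removed=5):
    """Hypothetical helper: drop the trailing layers that HED does not use."""
    return layers[:-num_removed]


def side_output_layers(backbone):
    """Hypothetical helper: pick the last conv layer of every stage,
    i.e. a conv that is followed by a pool or that ends the backbone."""
    taps = []
    for i, name in enumerate(backbone):
        is_last = i == len(backbone) - 1
        if name.startswith("conv") and (is_last or backbone[i + 1].startswith("pool")):
            taps.append(name)
    return taps


backbone = hed_backbone(VGG16_LAYERS)
removed = VGG16_LAYERS[len(backbone):]
taps = side_output_layers(backbone)

print("removed:", removed)
print("side outputs taken from:", taps)
```

The removed layers are the final pooling layer, the three fully connected layers, and the softmax layer, which leaves conv5_3 as the last layer of the backbone.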

Collapsing the feature-map depth:

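      The outputs of the VGG layers conv1_2, conv2_2, conv3_3, conv4_3, and conv5_3 are extracted. Their sizes are [224, 224, 64], [112, 112, 128], [56, 56, 256], [28, 28, 512], and [14, 14, 512] respectively. Because these outputs have to be merged into a single image, the depth of each one is first reduced to 1, and each is then upsampled by a factor of 1, 2, 4, 8, and 16 respectively so that every map has size [224, 224]. Adding them together finally gives an output image of size [224, 224]. The depth is collapsed with a 1*1 convolution whose output depth is 1; the main role of the 1*1 convolution here is to fuse the channels.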
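
The three steps above (1*1 convolution to depth 1, upsampling, summation) can be sketched in plain Python on toy-sized feature maps. This is only an illustration under stated assumptions: the function names are hypothetical, the weights are made up, and nearest-neighbour upsampling stands in for whatever interpolation the real network uses.

```python
def conv1x1_to_depth1(feature, weights, bias=0.0):
    """1x1 convolution with one output channel.
    feature: H x W x C nested lists, weights: C values. Returns H x W."""
    return [[sum(w * v for w, v in zip(weights, pixel)) + bias
             for pixel in row]
            for row in feature]


def upsample_nearest(fmap, factor):
    """Enlarge an H x W map by an integer factor (nearest neighbour,
    chosen here only for simplicity)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def fuse_by_sum(maps):
    """Add equally sized H x W maps element by element."""
    height, width = len(maps[0]), len(maps[0][0])
    return [[sum(m[y][x] for m in maps) for x in range(width)]
            for y in range(height)]


# Upsampling factors implied by the side-output sizes quoted in the text.
side_sizes = [224, 112, 56, 28, 14]
scale_factors = [side_sizes[0] // s for s in side_sizes]

# Toy stand-ins: a 4x4 map with 2 channels and a 2x2 map with 3 channels.
full = [[[1.0, 2.0] for _ in range(4)] for _ in range(4)]
half = [[[1.0, 1.0, 1.0] for _ in range(2)] for _ in range(2)]

side1 = conv1x1_to_depth1(full, [0.5, 0.25])           # already 4x4
side2 = upsample_nearest(conv1x1_to_depth1(half, [1.0, 1.0, 1.0]), 2)
fused = fuse_by_sum([side1, side2])

print(scale_factors)
print(fused[0])
```

In a trained network the 1*1 weights are learned; here they are fixed values so the arithmetic is easy to follow by hand.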

Loss function:

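      Edge pixels are far fewer than non-edge pixels, so the positive and negative samples are heavily imbalanced. Plain cross-entropy therefore does not work well, and a cross-entropy with pos_weight (extra weight on the positive class) is used instead.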
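
A minimal sketch of a pos_weight cross-entropy on logits follows, to show why the weighting matters. The form used is -[pos_weight * y * log(p) + (1 - y) * log(1 - p)] with p = sigmoid(logit). Choosing pos_weight as the ratio of non-edge to edge pixels is an assumption made for this example, not something the source specifies, and `balance_weight` is a hypothetical helper.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def pos_weight_cross_entropy(logits, labels, pos_weight):
    """Mean weighted cross-entropy over all pixels.
    Uses -log(sigmoid(x)) = softplus(-x) and -log(1 - sigmoid(x)) = softplus(x)."""
    total = 0.0
    for x, y in zip(logits, labels):
        total += pos_weight * y * softplus(-x) + (1.0 - y) * softplus(x)
    return total / len(logits)


def balance_weight(labels):
    """Hypothetical heuristic: number of non-edge pixels per edge pixel."""
    pos = sum(labels)
    neg = len(labels) - pos
    return neg / pos


# One edge pixel among four; the "lazy" prediction says non-edge everywhere.
labels = [1, 0, 0, 0]
lazy_logits = [-2.0, -2.0, -2.0, -2.0]

w = balance_weight(labels)
plain = pos_weight_cross_entropy(lazy_logits, labels, pos_weight=1.0)
weighted = pos_weight_cross_entropy(lazy_logits, labels, pos_weight=w)

print("pos_weight:", w)
print("plain loss:", plain)
print("weighted loss:", weighted)
```

With the weighting, missing the single edge pixel is penalised more heavily, so predicting "no edge" everywhere is no longer a cheap solution.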

Reference links:

https://pqpo.me/2019/08/02/machine-learning-hed-smartcropper/

https://blog.csdn.net/qq_36187544/article/details/91447065

Test programs (OpenCV 4.2). Model download link: https://download.csdn.net/download/qq_35054151/12275865



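C++ version:

#include <opencv2/opencv.hpp>
#include <opencv2/dnn.hpp>
#include <iostream>
using namespace cv;
using namespace std;

// Runs HED on src and writes an 8-bit single-channel edge map to dst.
// scaleFactor is passed to blobFromImage as the pixel scale factor
// (the Python version below uses 1.0).
void edgeDetection(const cv::Mat &src, cv::Mat &dst, double scaleFactor)
{
	//CV_DNN_REGISTER_LAYER_CLASS(Crop, CropLayer);
	cv::Size reso(500, 500);
	// Mean values are given in B, G, R order; swapRB and crop are disabled.
	cv::Mat blob = cv::dnn::blobFromImage(src, scaleFactor, reso, cv::Scalar(104.00698793, 116.66876762, 122.67891434), false, false);
	cv::dnn::Net net = cv::dnn::readNet("deploy.prototxt", "hed_pretrained_bsds.caffemodel");
	net.setInput(blob);
	cv::Mat out = net.forward();
	// The output blob is 1 x 1 x H x W; view it as an H x W image, then resize it back.
	cv::Mat edges = out.reshape(1, reso.height);
	cv::Mat resized;
	cv::resize(edges, resized, src.size());
	// The edge map is single-channel float; scale it by 255 into 8 bits.
	resized.convertTo(dst, CV_8U, 255);
}


int main(void)
{
	cv::Mat src = cv::imread("1.jpg");
	if (src.empty())
	{
		cerr << "Could not read 1.jpg" << endl;
		return -1;
	}
	cv::namedWindow("Original", 0);
	cv::imshow("Original", src);

	resize(src, src, cv::Size(500, 500));

	cv::Mat dst;
	edgeDetection(src, dst, 2.2);
	cv::namedWindow("HED", 0);
	cv::imshow("HED", dst);

	cv::waitKey(0);

	return 0;
}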

 

Python version:

import cv2 as cv
import argparse
 
parser = argparse.ArgumentParser(
        description='This sample shows how to define custom OpenCV deep learning layers in Python. '
                    'Holistically-Nested Edge Detection (https://arxiv.org/abs/1504.06375) neural network '
                    'is used as an example model. Find a pre-trained model at https://github.com/s9xie/hed.')
parser.add_argument('--input', help='Path to the input image', default='1.jpg')
parser.add_argument('--prototxt', help='Path to deploy.prototxt', default='deploy.prototxt')
parser.add_argument('--caffemodel', help='Path to hed_pretrained_bsds.caffemodel', default='hed_pretrained_bsds.caffemodel')
parser.add_argument('--width', help='Resize input image to a specific width', default=500, type=int)
parser.add_argument('--height', help='Resize input image to a specific height', default=500, type=int)
args = parser.parse_args()
 
#! [CropLayer]
class CropLayer(object):
    def __init__(self, params, blobs):
        self.xstart = 0
        self.xend = 0
        self.ystart = 0
        self.yend = 0
 
    # Our layer receives two inputs. We need to crop the first input blob
    # to match a shape of the second one (keeping batch size and number of channels)
    def getMemoryShapes(self, inputs):
        inputShape, targetShape = inputs[0], inputs[1]
        batchSize, numChannels = inputShape[0], inputShape[1]
        height, width = targetShape[2], targetShape[3]
 
        # Integer division keeps the offsets as ints (centered crop).
        self.ystart = (inputShape[2] - targetShape[2]) // 2
        self.xstart = (inputShape[3] - targetShape[3]) // 2
 
        self.yend = self.ystart + height
        self.xend = self.xstart + width
 
        return [[batchSize, numChannels, height, width]]
 
    def forward(self, inputs):
        return [inputs[0][:,:,self.ystart:self.yend,self.xstart:self.xend]]
#! [CropLayer]
 
#! [Register]
cv.dnn_registerLayer('Crop', CropLayer)
#! [Register]
 
# Load the model.
net = cv.dnn.readNet(cv.samples.findFile(args.prototxt), cv.samples.findFile(args.caffemodel))
 
kWinName = 'Holistically-Nested Edge Detection'
cv.namedWindow('Input', cv.WINDOW_NORMAL)
cv.namedWindow(kWinName, cv.WINDOW_NORMAL)
 
 
frame = cv.imread(args.input)
if frame is None:
    raise SystemExit('Could not read input image: ' + args.input)
 
 
cv.imshow('Input', frame)
#cv.waitKey(0)
 
# Mean values are given in B, G, R order, so swapRB stays False.
inp = cv.dnn.blobFromImage(frame, scalefactor=1.0, size=(args.width, args.height),
                           mean=(104.00698793, 116.66876762, 122.67891434),
                           swapRB=False, crop=False)
net.setInput(inp)
 
out = net.forward()
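out = out[0, 0]  # drop the batch and channel dimensions: (1, 1, H, W) -> (H, W)
out = cv.resize(out, (frame.shape[1], frame.shape[0]))
cv.imshow(kWinName, out)
# The edge map is a float image; scale it to 8 bits before saving,
# otherwise the written PNG comes out almost black.
cv.imwrite('result.png', cv.convertScaleAbs(out, alpha=255))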
cv.waitKey(0)
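
The centered-crop arithmetic used by CropLayer in the script above can be checked without OpenCV. The sketch below applies the same offset computation to a single 2D plane stored as nested lists; `center_crop_bounds` and `center_crop` are hypothetical helpers written for this note, not part of OpenCV.

```python
def center_crop_bounds(input_hw, target_hw):
    """Return (ystart, yend, xstart, xend) for a centered crop,
    using the same offset formula as CropLayer.getMemoryShapes."""
    in_h, in_w = input_hw
    out_h, out_w = target_hw
    ystart = (in_h - out_h) // 2
    xstart = (in_w - out_w) // 2
    return ystart, ystart + out_h, xstart, xstart + out_w


def center_crop(plane, target_hw):
    """Crop one H x W plane (nested lists) to the target size."""
    y0, y1, x0, x1 = center_crop_bounds((len(plane), len(plane[0])), target_hw)
    return [row[x0:x1] for row in plane[y0:y1]]


# A 6x6 plane whose values encode their position as row * 10 + column.
plane = [[r * 10 + c for c in range(6)] for r in range(6)]
bounds = center_crop_bounds((6, 6), (4, 4))
cropped = center_crop(plane, (4, 4))

print(bounds)
for row in cropped:
    print(row)
```

One row and one column are dropped on each side, which matches what the slice in CropLayer.forward does on the last two axes of the blob.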
 

 

Test image:

[Figure 3: test image]
