Author: Dylan_frank (滔滔)
These are my notes for Week 1 of the second course in Andrew Ng's Coursera Deep Learning Specialization, "Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization". The week covers problems you run into when implementing neural networks and how to handle them, specifically three topics: parameter initialization, regularization, and gradient checking.
Initialization has two sides. One is the dataset itself, which also comes up in ordinary machine learning: you usually just apply a standardization transform so that each feature has zero mean and unit variance.
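For the dataset side, a minimal numpy sketch of that standardization (my own example, not from the assignment):

import numpy as np

def standardize(X):
    """Standardize X of shape (n_features, m_examples): zero mean, unit variance per feature."""
    mu = np.mean(X, axis=1, keepdims=True)
    sigma = np.std(X, axis=1, keepdims=True)
    return (X - mu) / (sigma + 1e-8)     # small epsilon guards against constant features

X = np.array([[1., 2., 3., 4., 5.],
              [10., 20., 30., 40., 50.]])
X_norm = standardize(X)
print(X_norm.mean(axis=1))   # ~[0, 0]
print(X_norm.std(axis=1))    # ~[1, 1]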
The focus below is on parameter initialization.
Let's first look at the experimental results from the assignment, which runs three experiments.
(Since the code and the data live on Coursera's servers, I cannot fully reproduce the experiments here and can only show some of the results.)
(The notebook imports modules containing the course's own helper code and datasets, which is why I cannot reproduce the complete code.)
This is the model used in the experiments:
import numpy as np
import matplotlib.pyplot as plt
# forward_propagation, compute_loss, backward_propagation, update_parameters and the
# initialize_parameters_* helpers are provided by the course's helper module and the notebook itself.

def model(X, Y, learning_rate = 0.01, num_iterations = 15000, print_cost = True, initialization = "he"):
    """
    Implements a three-layer neural network: LINEAR->RELU->LINEAR->RELU->LINEAR->SIGMOID.

    Arguments:
    X -- input data, of shape (2, number of examples)
    Y -- true "label" vector (containing 0 for red dots; 1 for blue dots), of shape (1, number of examples)
    learning_rate -- learning rate for gradient descent
    num_iterations -- number of iterations to run gradient descent
    print_cost -- if True, print the cost every 1000 iterations
    initialization -- flag to choose which initialization to use ("zeros", "random" or "he")

    Returns:
    parameters -- parameters learnt by the model
    """
    grads = {}
    costs = []                            # to keep track of the loss
    m = X.shape[1]                        # number of examples
    layers_dims = [X.shape[0], 10, 5, 1]

    # Initialize parameters dictionary.
    if initialization == "zeros":
        parameters = initialize_parameters_zeros(layers_dims)
    elif initialization == "random":
        parameters = initialize_parameters_random(layers_dims)
    elif initialization == "he":
        parameters = initialize_parameters_he(layers_dims)

    # Loop (gradient descent)
    for i in range(0, num_iterations):
        # Forward propagation: LINEAR -> RELU -> LINEAR -> RELU -> LINEAR -> SIGMOID.
        a3, cache = forward_propagation(X, parameters)
        # Loss
        cost = compute_loss(a3, Y)
        # Backward propagation.
        grads = backward_propagation(X, Y, cache)
        # Update parameters.
        parameters = update_parameters(parameters, grads, learning_rate)
        # Print the loss every 1000 iterations
        if print_cost and i % 1000 == 0:
            print("Cost after iteration {}: {}".format(i, cost))
            costs.append(cost)

    # plot the loss
    plt.plot(costs)
    plt.ylabel('cost')
    plt.xlabel('iterations (per thousands)')
    plt.title("Learning rate =" + str(learning_rate))
    plt.show()

    return parameters
It is a 3-layer neural network.
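In the assignment, the three experiments are run along these lines (train_X and train_Y come from the course's own dataset-loading helper, which I cannot reproduce here, so this only shows how the calls look):

parameters = model(train_X, train_Y, initialization = "zeros")   # experiment 1: all-zero weights
parameters = model(train_X, train_Y, initialization = "random")  # experiment 2: large random weights
parameters = model(train_X, train_Y, initialization = "he")      # experiment 3: He initialization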
This is the result with zero initialization: the cost function does not change at all, because all-zero parameters never break symmetry (every unit in a layer computes the same thing), and the model simply predicts 0 for every example.
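The zero-initialization helper itself is not included in my notes; a minimal reconstruction (my own sketch, following the same conventions as the two helpers below):

import numpy as np

def initialize_parameters_zeros(layers_dims):
    """Initialize every W and b to zeros; this never breaks symmetry."""
    parameters = {}
    L = len(layers_dims)     # number of layers, including the input layer
    for l in range(1, L):
        parameters['W' + str(l)] = np.zeros((layers_dims[l], layers_dims[l-1]))
        parameters['b' + str(l)] = np.zeros((layers_dims[l], 1))
    return parameters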
def initialize_parameters_random(layers_dims):
    """
    Arguments:
    layer_dims -- python array (list) containing the size of each layer.

    Returns:
    parameters -- python dictionary containing your parameters "W1", "b1", ..., "WL", "bL":
                  W1 -- weight matrix of shape (layers_dims[1], layers_dims[0])
                  b1 -- bias vector of shape (layers_dims[1], 1)
                  ...
                  WL -- weight matrix of shape (layers_dims[L], layers_dims[L-1])
                  bL -- bias vector of shape (layers_dims[L], 1)
    """
    np.random.seed(3)            # This seed makes sure your "random" numbers will be the same as ours
    parameters = {}
    L = len(layers_dims)         # integer representing the number of layers

    for l in range(1, L):
        ### START CODE HERE ### (≈ 2 lines of code)
        parameters['W' + str(l)] = np.random.randn(layers_dims[l], layers_dims[l-1]) * 10
        parameters['b' + str(l)] = np.zeros((layers_dims[l], 1))
        ### END CODE HERE ###

    return parameters
Here the weights are drawn from a standard normal distribution and scaled by 10.
The decision boundary it produces looks like this:
You can see that at iteration 0 the cost is already inf. With weights this large, the values propagated through the layers grow (or shrink) roughly exponentially with depth, which causes vanishing/exploding gradients.
Andrew gives a very extreme example of this in the lectures.
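The example is roughly this: if every layer's weight matrix is a bit "larger than the identity" the activations blow up exponentially with depth, and if it is a bit "smaller than the identity" they shrink to nothing. A tiny sketch of my own:

import numpy as np

np.random.seed(0)
x = np.random.randn(5, 1)               # a random input vector

a_big = a_small = x
for l in range(50):                     # 50 "layers", identity activation for simplicity
    a_big   = (1.5 * np.eye(5)) @ a_big     # weights slightly "bigger than identity"
    a_small = (0.5 * np.eye(5)) @ a_small   # weights slightly "smaller than identity"

print(np.linalg.norm(a_big))    # ~1.5**50 times the input norm: explodes
print(np.linalg.norm(a_small))  # ~0.5**50 times the input norm: vanishes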
It is called He initialization because it comes from the paper He et al., 2015.
def initialize_parameters_he(layers_dims):
    """
    Arguments:
    layer_dims -- python array (list) containing the size of each layer.

    Returns:
    parameters -- python dictionary containing your parameters "W1", "b1", ..., "WL", "bL":
                  W1 -- weight matrix of shape (layers_dims[1], layers_dims[0])
                  b1 -- bias vector of shape (layers_dims[1], 1)
                  ...
                  WL -- weight matrix of shape (layers_dims[L], layers_dims[L-1])
                  bL -- bias vector of shape (layers_dims[L], 1)
    """
    np.random.seed(3)
    parameters = {}
    L = len(layers_dims) - 1     # integer representing the number of layers

    for l in range(1, L + 1):
        ### START CODE HERE ### (≈ 2 lines of code)
        parameters['W' + str(l)] = np.random.randn(layers_dims[l], layers_dims[l-1]) * np.sqrt(2. / layers_dims[l-1])
        parameters['b' + str(l)] = np.zeros((layers_dims[l], 1))
        ### END CODE HERE ###

    return parameters
In essence, this sets the variance of each layer's weights to $\frac{2}{\text{layers\_dims}[l-1]}$, i.e. the random weights are scaled by $\sqrt{\frac{2}{\text{layers\_dims}[l-1]}}$.
You can see that the classification result is very good: accuracy on the test set rises to 96% and the cost drops clearly.
As for why it works so well: I don't fully understand it, but roughly speaking the factor of 2 compensates for ReLU zeroing out about half of its inputs, so the variance of the activations stays roughly constant from layer to layer instead of exploding or vanishing.
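A quick way to see this effect (a toy simulation of my own, not part of the assignment): push random data through a deep ReLU stack once with He scaling and once with the *10 scaling used above, and compare the spread of the activations.

import numpy as np

np.random.seed(0)
relu = lambda z: np.maximum(0, z)

layer_dims = [100] * 10                       # a toy deep net: 10 layers of width 100
A_he = A_big = np.random.randn(100, 1000)     # 1000 fake examples

for l in range(1, len(layer_dims)):
    n_prev = layer_dims[l - 1]
    W_he  = np.random.randn(layer_dims[l], n_prev) * np.sqrt(2. / n_prev)   # He scaling
    W_big = np.random.randn(layer_dims[l], n_prev) * 10                     # the *10 scaling from above
    A_he  = relu(W_he  @ A_he)
    A_big = relu(W_big @ A_big)

print(A_he.std())    # stays on the order of 1: no exponential growth or decay
print(A_big.std())   # blows up, by roughly a factor of ~70 per layer in this setting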
What you should remember from this notebook:
- Different initializations lead to different results.
- Zero initialization fails to break symmetry, so every neuron in a layer learns the same thing.
- Initializing weights with values that are too large leads to exploding activations and gradients.
- He initialization works well for networks with ReLU activations.
Next, let's talk about regularization.
This first kind of regularization is already familiar from machine learning: add a penalty term on the parameters to discourage overfitting, i.e.

$$J_{\text{regularized}} = J_{\text{cross-entropy}} + \frac{\lambda}{2m}\sum_{l}\big\lVert W^{[l]}\big\rVert_F^2$$

where the sum runs over the weight matrices of all layers (writing it with a single sum over $W$ is just for convenience).
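A minimal sketch of how this shows up in code (my own illustration of the standard recipe, not the graded assignment functions): the penalty is added to the cost, and each dW gets an extra (λ/m)·W term during backprop.

import numpy as np

def l2_cost_and_grad_terms(parameters, cross_entropy_cost, m, lambd):
    """Return the L2-regularized cost and the extra gradient terms (lambd/m) * W for each layer."""
    L = len(parameters) // 2                   # number of (W, b) layers
    l2_penalty = 0.
    for l in range(1, L + 1):
        l2_penalty += np.sum(np.square(parameters['W' + str(l)]))
    cost = cross_entropy_cost + (lambd / (2. * m)) * l2_penalty
    grad_terms = {'dW' + str(l): (lambd / m) * parameters['W' + str(l)]
                  for l in range(1, L + 1)}
    return cost, grad_terms

# toy check
params = {'W1': np.ones((3, 2)), 'b1': np.zeros((3, 1)),
          'W2': np.ones((1, 3)), 'b2': np.zeros((1, 1))}
cost, extra = l2_cost_and_grad_terms(params, cross_entropy_cost=0.5, m=10, lambd=0.7)
print(cost)           # 0.5 + 0.7/(2*10) * (6 + 3) = 0.815
print(extra['dW1'])   # 0.07 * W1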
Summary:
What you should remember – the implications of L2-regularization on:
- The cost computation: a regularization term is added to the cost.
- The backpropagation function: there are extra terms in the gradients with respect to the weight matrices.
- Weights end up smaller ("weight decay"): weights are pushed to smaller values.
Next up is dropout. To me this was a piece of black magic; I had never come across it before.
Simply put, dropout shuts down each neuron with a certain probability during training. For every layer we generate a random matrix D with entries drawn uniformly from [0, 1) and threshold it at keep_prob, so that each neuron is kept with probability keep_prob (e.g. keep_prob = 0.8). In pseudocode, the mask is applied to the layer's activations A (that is what actually switches neurons off):

D = np.random.rand(*A.shape)
D = D < keep_prob

This turns D into a 0/1 matrix; multiplying the activations by D zeroes some of them out, which shuts down the corresponding neurons:

A *= D
A /= keep_prob

The first line is easy to understand. The second line keeps the expected value seen by the next layer unchanged: since a fraction 1 − keep_prob of the neurons are shut down, the sums flowing into the next layer shrink on average by a factor of keep_prob, so the surviving activations are divided by keep_prob to compensate (this is "inverted dropout").
The implementation:
# GRADED FUNCTION: forward_propagation_with_dropout

def forward_propagation_with_dropout(X, parameters, keep_prob = 0.5):
    """
    Implements the forward propagation: LINEAR -> RELU + DROPOUT -> LINEAR -> RELU + DROPOUT -> LINEAR -> SIGMOID.

    Arguments:
    X -- input dataset, of shape (2, number of examples)
    parameters -- python dictionary containing your parameters "W1", "b1", "W2", "b2", "W3", "b3":
                  W1 -- weight matrix of shape (20, 2)
                  b1 -- bias vector of shape (20, 1)
                  W2 -- weight matrix of shape (3, 20)
                  b2 -- bias vector of shape (3, 1)
                  W3 -- weight matrix of shape (1, 3)
                  b3 -- bias vector of shape (1, 1)
    keep_prob -- probability of keeping a neuron active during drop-out, scalar

    Returns:
    A3 -- last activation value, output of the forward propagation, of shape (1,1)
    cache -- tuple, information stored for computing the backward propagation
    """
    np.random.seed(1)

    # retrieve parameters
    W1 = parameters["W1"]
    b1 = parameters["b1"]
    W2 = parameters["W2"]
    b2 = parameters["b2"]
    W3 = parameters["W3"]
    b3 = parameters["b3"]

    # LINEAR -> RELU -> LINEAR -> RELU -> LINEAR -> SIGMOID
    Z1 = np.dot(W1, X) + b1
    A1 = relu(Z1)
    ### START CODE HERE ### (approx. 4 lines)          # Steps 1-4 below correspond to the Steps 1-4 described above.
    D1 = np.random.rand(A1.shape[0], A1.shape[1])      # Step 1: initialize matrix D1 = np.random.rand(..., ...)
    D1 = D1 < keep_prob                                # Step 2: convert entries of D1 to 0 or 1 (using keep_prob as the threshold)
    A1 *= D1                                           # Step 3: shut down some neurons of A1
    A1 /= keep_prob                                    # Step 4: scale the value of neurons that haven't been shut down
    ### END CODE HERE ###
    Z2 = np.dot(W2, A1) + b2
    A2 = relu(Z2)
    ### START CODE HERE ### (approx. 4 lines)
    D2 = np.random.rand(A2.shape[0], A2.shape[1])      # Step 1: initialize matrix D2 = np.random.rand(..., ...)
    D2 = D2 < keep_prob                                # Step 2: convert entries of D2 to 0 or 1 (using keep_prob as the threshold)
    A2 *= D2                                           # Step 3: shut down some neurons of A2
    A2 /= keep_prob                                    # Step 4: scale the value of neurons that haven't been shut down
    ### END CODE HERE ###
    Z3 = np.dot(W3, A2) + b3
    A3 = sigmoid(Z3)

    cache = (Z1, D1, A1, W1, b1, Z2, D2, A2, W2, b2, Z3, A3, W3, b3)

    return A3, cache
The backward propagation code:
# GRADED FUNCTION: backward_propagation_with_dropout

def backward_propagation_with_dropout(X, Y, cache, keep_prob):
    """
    Implements the backward propagation of our baseline model to which we added dropout.

    Arguments:
    X -- input dataset, of shape (2, number of examples)
    Y -- "true" labels vector, of shape (output size, number of examples)
    cache -- cache output from forward_propagation_with_dropout()
    keep_prob -- probability of keeping a neuron active during drop-out, scalar

    Returns:
    gradients -- A dictionary with the gradients with respect to each parameter, activation and pre-activation variables
    """
    m = X.shape[1]
    (Z1, D1, A1, W1, b1, Z2, D2, A2, W2, b2, Z3, A3, W3, b3) = cache

    dZ3 = A3 - Y
    dW3 = 1./m * np.dot(dZ3, A2.T)
    db3 = 1./m * np.sum(dZ3, axis=1, keepdims = True)
    dA2 = np.dot(W3.T, dZ3)
    ### START CODE HERE ### (≈ 2 lines of code)
    dA2 *= D2            # Step 1: Apply mask D2 to shut down the same neurons as during the forward propagation
    dA2 /= keep_prob     # Step 2: Scale the value of neurons that haven't been shut down
    ### END CODE HERE ###
    dZ2 = np.multiply(dA2, np.int64(A2 > 0))
    dW2 = 1./m * np.dot(dZ2, A1.T)
    db2 = 1./m * np.sum(dZ2, axis=1, keepdims = True)
    dA1 = np.dot(W2.T, dZ2)
    ### START CODE HERE ### (≈ 2 lines of code)
    dA1 *= D1            # Step 1: Apply mask D1 to shut down the same neurons as during the forward propagation
    dA1 /= keep_prob     # Step 2: Scale the value of neurons that haven't been shut down
    ### END CODE HERE ###
    dZ1 = np.multiply(dA1, np.int64(A1 > 0))
    dW1 = 1./m * np.dot(dZ1, X.T)
    db1 = 1./m * np.sum(dZ1, axis=1, keepdims = True)

    gradients = {"dZ3": dZ3, "dW3": dW3, "db3": db3, "dA2": dA2,
                 "dZ2": dZ2, "dW2": dW2, "db2": db2, "dA1": dA1,
                 "dZ1": dZ1, "dW1": dW1, "db1": db1}

    return gradients
Note that backward propagation must reuse the same random mask from the forward pass, so that only the neurons that were active in the forward pass contribute to, and receive, gradient updates.
Here is the effect in practice.
You can see the decision boundary is quite smooth, whereas without any regularization the boundary looks like this:
it overfits, and the accuracy is slightly lower.
Summary
Note:
- A common mistake when using dropout is to use it both in training and testing. You should use dropout (randomly eliminate nodes) only in training.
What you should remember about dropout:
- Dropout is a regularization technique.
- You only use dropout during training. Don’t use dropout (randomly eliminate nodes) during test time.
- Apply dropout both during forward and backward propagation.
- During training time, divide each dropout layer by keep_prob to keep the same expected value for the activations. For example, if keep_prob is 0.5, then we will on average shut down half the nodes, so the output will be scaled by 0.5 since only the remaining half are contributing to the solution. Dividing by 0.5 is equivalent to multiplying by 2. Hence, the output now has the same expected value. You can check that this works even when keep_prob is other values than 0.5.
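The last bullet can be checked numerically; a tiny sketch of my own:

import numpy as np

np.random.seed(0)
keep_prob = 0.5
A = np.random.rand(1000, 1000)                  # fake activations

D = np.random.rand(*A.shape) < keep_prob        # dropout mask
A_drop = (A * D) / keep_prob                    # inverted dropout: mask, then rescale

print(A.mean())        # ~0.5
print(A_drop.mean())   # also ~0.5: the expected value of the activations is preserved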
This part is about how to debug your gradient computation: compute the gradient numerically, compare it with the gradient from back propagation, and if the difference is within a reasonable range, consider the implementation correct.
The numerical method is the centered difference

$$\frac{\partial J}{\partial \theta} \approx \frac{J(\theta + \varepsilon) - J(\theta - \varepsilon)}{2\varepsilon}$$

Then compare it with the back-propagation gradient. Let grad be the gradient from back propagation and gradapprox the numerically computed one; the relative difference is

$$\text{difference} = \frac{\lVert grad - gradapprox \rVert_2}{\lVert grad \rVert_2 + \lVert gradapprox \rVert_2}$$

and the check passes if this is small (e.g. below $10^{-7}$).
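As a 1-D illustration of these two formulas (my own sketch for J(θ) = θ·x, whose analytic gradient is simply x):

import numpy as np

def gradient_check_1d(x, theta, epsilon=1e-7):
    J = lambda t: t * x                                               # J(theta) = theta * x
    grad = x                                                          # analytic gradient dJ/dtheta
    gradapprox = (J(theta + epsilon) - J(theta - epsilon)) / (2 * epsilon)
    numerator = np.abs(grad - gradapprox)                             # |grad - gradapprox|
    denominator = np.abs(grad) + np.abs(gradapprox)
    return numerator / denominator

print(gradient_check_1d(x=2.0, theta=3.0))   # a tiny number (~1e-10): the gradient is correct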
And here is the full n-dimensional code:
# GRADED FUNCTION: gradient_check_n

def gradient_check_n(parameters, gradients, X, Y, epsilon = 1e-7):
    """
    Checks if backward_propagation_n computes correctly the gradient of the cost output by forward_propagation_n

    Arguments:
    parameters -- python dictionary containing your parameters "W1", "b1", "W2", "b2", "W3", "b3"
    gradients -- output of backward_propagation_n, contains gradients of the cost with respect to the parameters
    X -- input data, of shape (input size, number of examples)
    Y -- true "label" vector
    epsilon -- tiny shift to the input to compute the approximated gradient

    Returns:
    difference -- relative difference between the approximated gradient and the backward propagation gradient
    """
    # Set-up variables
    parameters_values, _ = dictionary_to_vector(parameters)
    grad = gradients_to_vector(gradients)
    num_parameters = parameters_values.shape[0]
    J_plus = np.zeros((num_parameters, 1))
    J_minus = np.zeros((num_parameters, 1))
    gradapprox = np.zeros((num_parameters, 1))

    # Compute gradapprox
    for i in range(num_parameters):
        # Compute J_plus[i]. Inputs: "parameters_values, epsilon". Output = "J_plus[i]".
        # "_" is used because forward_propagation_n outputs two values but we only care about the first one
        ### START CODE HERE ### (approx. 3 lines)
        thetaplus = np.copy(parameters_values)                                         # Step 1
        thetaplus[i][0] += epsilon                                                     # Step 2
        J_plus[i], _ = forward_propagation_n(X, Y, vector_to_dictionary(thetaplus))    # Step 3
        ### END CODE HERE ###

        # Compute J_minus[i]. Inputs: "parameters_values, epsilon". Output = "J_minus[i]".
        ### START CODE HERE ### (approx. 3 lines)
        thetaminus = np.copy(parameters_values)                                        # Step 1
        thetaminus[i][0] -= epsilon                                                    # Step 2
        J_minus[i], _ = forward_propagation_n(X, Y, vector_to_dictionary(thetaminus))  # Step 3
        ### END CODE HERE ###

        # Compute gradapprox[i]
        ### START CODE HERE ### (approx. 1 line)
        gradapprox[i] = (J_plus[i] - J_minus[i]) / (2 * epsilon)
        ### END CODE HERE ###

    # Compare gradapprox to backward propagation gradients by computing difference.
    ### START CODE HERE ### (approx. 1 line)
    numerator = np.linalg.norm(grad - gradapprox)                      # Step 1': norm of the difference, not difference of the norms
    denominator = np.linalg.norm(grad) + np.linalg.norm(gradapprox)    # Step 2'
    difference = numerator / denominator                               # Step 3'
    ### END CODE HERE ###

    if difference > 1e-7:
        print("\033[93m" + "There is a mistake in the backward propagation! difference = " + str(difference) + "\033[0m")
    else:
        print("\033[92m" + "Your backward propagation works perfectly fine! difference = " + str(difference) + "\033[0m")

    return difference
Note that the numerical gradient is normally used only for debugging, because it is far too expensive to compute routinely (it needs two forward passes per parameter).
This article was written by Dylan_frank (滔滔). Reposting by crawlers is not permitted; if you repost, please credit the source and contact the author: http://blog.csdn.net/dylan_frank/article/details/77284747