SnailDove

Andrew Ng's Neural Networks and Deep Learning: Week 4 Programming Assignment

Because CSDN's markdown editor is extremely hard to use, this post has been moved here.

Note

These are my personal solutions to the week 4 programming assignments of the course neural-networks-deep-learning. The copyright belongs to deeplearning.ai.

Part 1: Building your Deep Neural Network: Step by Step

1. Packages

Let’s first import all the packages that you will need during this assignment.

numpy is the main package for scientific computing with Python.
matplotlib is a library to plot graphs in Python.
dnn_utils provides some necessary functions for this notebook.
testCases provides some test cases to assess the correctness of your functions
np.random.seed(1) is used to keep all the random function calls consistent. It will help us grade your work. Please don’t change the seed.

import numpy as np;
import h5py;
import matplotlib.pyplot as plt;
from testCases_v3 import *;
from dnn_utils_v2 import sigmoid, sigmoid_backward, relu, relu_backward;

%matplotlib inline
plt.rcParams['figure.figsize'] = (5.0, 4.0); # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest';
plt.rcParams['image.cmap'] = 'gray';

%load_ext autoreload
%autoreload 2

np.random.seed(1);

You can get the support code from here.

the sigmoid function:

def sigmoid(Z):
    """
    Implements the sigmoid activation in numpy

    Arguments:
    Z -- numpy array of any shape

    Returns:
    A -- output of sigmoid(z), same shape as Z
    cache -- returns Z as well, useful during backpropagation
    """

    A = 1 / (1 + np.exp(-Z));
    cache = Z;

    return A, cache;

the sigmoid_backward function:

def sigmoid_backward(dA, cache):
    """
    Implement the backward propagation for a single SIGMOID unit.

    Arguments:
    dA -- post-activation gradient, of any shape
    cache -- 'Z' where we store for computing backward propagation efficiently

    Returns:
    dZ -- Gradient of the cost with respect to Z
    """

    Z = cache;

    s = 1 / (1 + np.exp(-Z));
    dZ = dA * s * (1 - s);

    assert (dZ.shape == Z.shape);

    return dZ;

the relu function:

def relu(Z):
    """
    Implement the RELU function.

    Arguments:
    Z -- Output of the linear layer, of any shape

    Returns:
    A -- Post-activation parameter, of the same shape as Z
    cache -- Z, the input of the activation; stored for computing the backward pass efficiently
    """

    A = np.maximum(0,Z);

    assert(A.shape == Z.shape);

    cache = Z; 
    return A, cache;

the relu_backward function:

def relu_backward(dA, cache):
    """
    Implement the backward propagation for a single RELU unit.

    Arguments:
    dA -- post-activation gradient, of any shape
    cache -- 'Z' where we store for computing backward propagation efficiently

    Returns:
    dZ -- Gradient of the cost with respect to Z
    """

    Z = cache;
    dZ = np.array(dA, copy = True); # just converting dz to a correct object.

    # When z <= 0, you should set dz to 0 as well. 
    dZ[Z <= 0] = 0;

    assert (dZ.shape == Z.shape);

    return dZ;
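
One way to sanity-check the two backward helpers is to compare them against a central finite-difference estimate of the activation's derivative. The sketch below re-declares minimal versions of the four helpers so that it runs on its own; the helper name numeric_dZ, the test values, and the step size eps are assumptions made for illustration and are not part of the assignment.

```python
import numpy as np

def sigmoid(Z):
    return 1 / (1 + np.exp(-Z)), Z

def relu(Z):
    return np.maximum(0, Z), Z

def sigmoid_backward(dA, cache):
    s = 1 / (1 + np.exp(-cache))
    return dA * s * (1 - s)

def relu_backward(dA, cache):
    dZ = np.array(dA, copy=True)
    dZ[cache <= 0] = 0
    return dZ

def numeric_dZ(activation, Z, dA, eps=1e-6):
    # Hypothetical helper: central-difference estimate of dA * g'(Z), element by element.
    plus, _ = activation(Z + eps)
    minus, _ = activation(Z - eps)
    return dA * (plus - minus) / (2 * eps)

# Arbitrary test values, kept away from Z = 0 where ReLU has a kink.
Z = np.array([[-2.0, -0.5, 0.7, 3.0]])
dA = np.array([[1.0, -1.5, 0.25, 2.0]])

sigmoid_gap = np.max(np.abs(sigmoid_backward(dA, Z) - numeric_dZ(sigmoid, Z, dA)))
relu_gap = np.max(np.abs(relu_backward(dA, Z) - numeric_dZ(relu, Z, dA)))
print("max sigmoid gap:", sigmoid_gap)
print("max relu gap:", relu_gap)
```

Both gaps should be tiny (limited by floating-point round-off), which suggests the analytic derivatives agree with the numerical ones on these inputs.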

2. Outline of the Assignment

To build your neural network, you will be implementing several “helper functions”. These helper functions will be used in the next assignment to build a two-layer neural network and an L-layer neural network. Each small helper function you will implement will have detailed instructions that will walk you through the necessary steps. Here is an outline of this assignment, you will:

Initialize the parameters for a two-layer network and for an L-layer neural network.
Implement the forward propagation module (shown in purple in the figure below).
- Complete the LINEAR part of a layer’s forward propagation step (resulting in Z[l]).
- We give you the ACTIVATION function (relu/sigmoid).
- Combine the previous two steps into a new [LINEAR->ACTIVATION] forward function.
- Stack the [LINEAR->RELU] forward function L-1 time (for layers 1 through L-1) and add a [LINEAR->SIGMOID] at the end (for the final layer L). This gives you a new L_model_forward function.
Compute the loss.
Implement the backward propagation module (denoted in red in the figure below).
- Complete the LINEAR part of a layer’s backward propagation step.
- We give you the gradient of the ACTIVATION function (relu_backward/sigmoid_backward).
- Combine the previous two steps into a new [LINEAR->ACTIVATION] backward function.
- Stack [LINEAR->RELU] backward L-1 times and add [LINEAR->SIGMOID] backward in a new L_model_backward function
Finally update the parameters.

Note that for every forward function, there is a corresponding backward function. That is why at every step of your forward module you will be storing some values in a cache. The cached values are useful for computing gradients. In the backpropagation module you will then use the cache to calculate the gradients. This assignment will show you exactly how to carry out each of these steps.
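
The forward/backward pairing with a cache can be illustrated on a toy operation. This is only a sketch: square_forward and square_backward are hypothetical names that do not appear in the assignment, and squaring stands in for a real layer.

```python
import numpy as np

def square_forward(x):
    # Forward step: compute the output and keep the input for the backward step.
    y = x ** 2
    cache = x
    return y, cache

def square_backward(dy, cache):
    # Backward step: chain rule, dL/dx = dL/dy * dy/dx = dy * 2x.
    x = cache
    return dy * 2 * x

x = np.array([1.0, -2.0, 3.0])
y, cache = square_forward(x)
dx = square_backward(np.ones_like(y), cache)
print(y)
print(dx)
```

The backward function never recomputes anything from scratch; it only needs the upstream gradient and whatever the forward function saved.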

3. Initialization

You will write two helper functions that will initialize the parameters for your model. The first function will be used to initialize parameters for a two layer model. The second one will generalize this initialization process to L layers.

3.1 2-layer Neural Network

Exercise: Create and initialize the parameters of the 2-layer neural network.

Instructions:

The model’s structure is: LINEAR -> RELU -> LINEAR -> SIGMOID.
Use random initialization for the weight matrices. Use np.random.randn(shape)*0.01 with the correct shape.
Use zero initialization for the biases. Use np.zeros(shape).

# GRADED FUNCTION: initialize_parameters

def initialize_parameters(n_x, n_h, n_y):
    """
    Argument:
    n_x -- size of the input layer
    n_h -- size of the hidden layer
    n_y -- size of the output layer

    Returns:
    parameters -- python dictionary containing your parameters:
                    W1 -- weight matrix of shape (n_h, n_x)
                    b1 -- bias vector of shape (n_h, 1)
                    W2 -- weight matrix of shape (n_y, n_h)
                    b2 -- bias vector of shape (n_y, 1)
    """

    np.random.seed(1);

    ### START CODE HERE ### (≈ 4 lines of code)
    W1 = np.random.randn(n_h, n_x) * 0.01;
    b1 = np.zeros((n_h, 1));
    W2 = np.random.randn(n_y, n_h) * 0.01;
    b2 = np.zeros((n_y, 1));
    ### END CODE HERE ###

    assert(W1.shape == (n_h, n_x));
    assert(b1.shape == (n_h, 1));
    assert(W2.shape == (n_y, n_h));
    assert(b2.shape == (n_y, 1));

    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2};

    return parameters;

parameters = initialize_parameters(3,2,1);
print("W1 = " + str(parameters["W1"]));
print("b1 = " + str(parameters["b1"]));
print("W2 = " + str(parameters["W2"]));
print("b2 = " + str(parameters["b2"]));

W1 = [[ 0.01624345 -0.00611756 -0.00528172]
 [-0.01072969  0.00865408 -0.02301539]]
b1 = [[0.]
 [0.]]
W2 = [[ 0.01744812 -0.00761207]]
b2 = [[0.]]

3.2 L-layer Neural Network

The initialization for a deeper L-layer neural network is more complicated because there are many more weight matrices and bias vectors. When completing initialize_parameters_deep, you should make sure that your dimensions match between each layer. Recall that $n^{[l]}$ is the number of units in layer $l$. Thus, for example, if the size of our input $X$ is $(12288, 209)$ (with $m = 209$ examples), then:

	Shape of W	Shape of b	Activation	Shape of Activation
Layer 1	(n[1],12288)	(n[1],1)	Z[1]=W[1]X+b[1]	(n[1],209)
Layer 2	(n[2],n[1])	(n[2],1)	Z[2]=W[2]A[1]+b[2]	(n[2],209)

Layer L-1	(n[L−1],n[L−2])	(n[L−1],1)	Z[L−1]=W[L−1]A[L−2]+b[L−1]	(n[L−1],209)
Layer L	(n[L],n[L−1])	(n[L],1)	Z[L]=W[L]A[L−1]+b[L]	(n[L],209)
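
The shape rules in the table can be generated mechanically from a list of layer sizes. In the sketch below, layer_shapes is a hypothetical helper, and the hidden sizes 20, 7 and 5 are illustrative assumptions; only 12288 and 209 come from the table.

```python
def layer_shapes(layer_dims, m):
    # layer_dims[0] is the input size; layer l has layer_dims[l] units.
    rows = []
    for l in range(1, len(layer_dims)):
        rows.append({
            "layer": l,
            "W": (layer_dims[l], layer_dims[l - 1]),
            "b": (layer_dims[l], 1),
            "Z": (layer_dims[l], m),
        })
    return rows

# Hidden sizes are assumptions made for this example.
shapes = layer_shapes([12288, 20, 7, 5, 1], m=209)
for row in shapes:
    print(row)
```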

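Remember that when we compute $WX + b$ in python, it carries out broadcasting. For example, if:

$$W = \begin{bmatrix} j & k & l \\ m & n & o \\ p & q & r \end{bmatrix} \;\;\; X = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \;\;\; b = \begin{bmatrix} s \\ t \\ u \end{bmatrix}$$

Then $WX + b$ will be:

$$WX + b = \begin{bmatrix} (ja + kd + lg) + s & (jb + ke + lh) + s & (jc + kf + li) + s \\ (ma + nd + og) + t & (mb + ne + oh) + t & (mc + nf + oi) + t \\ (pa + qd + rg) + u & (pb + qe + rh) + u & (pc + qf + ri) + u \end{bmatrix}$$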
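
A small numeric sketch of the same broadcasting behaviour is shown below. The concrete values are arbitrary; the point is only that adding a column vector of shape (3, 1) to a (3, 3) matrix gives the same result as explicitly tiling the bias across the columns.

```python
import numpy as np

W = np.arange(1, 10).reshape(3, 3)       # 3 units, 3 inputs
X = np.arange(1, 10).reshape(3, 3).T     # 3 features, 3 examples
b = np.array([[10], [20], [30]])         # one bias per unit, shape (3, 1)

Z = np.dot(W, X) + b                     # b is broadcast across the columns
Z_explicit = np.dot(W, X) + np.tile(b, (1, X.shape[1]))

print(Z)
print(np.array_equal(Z, Z_explicit))
```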

Exercise: Implement initialization for an L-layer Neural Network.

Instructions:

The model’s structure is [LINEAR -> RELU] × (L-1) -> LINEAR -> SIGMOID. I.e., it has L−1 layers using a ReLU activation function followed by an output layer with a sigmoid activation function.
Use random initialization for the weight matrices. Use np.random.randn(shape) * 0.01.
Use zeros initialization for the biases. Use np.zeros(shape).
We will store $n^{[l]}$, the number of units in different layers, in a variable layer_dims. For example, the layer_dims for the “Planar Data classification model” from last week would have been [2,4,1]: There were two inputs, one hidden layer with 4 hidden units, and an output layer with 1 output unit. This means W1’s shape was (4,2), b1 was (4,1), W2 was (1,4) and b2 was (1,1). Now you will generalize this to L layers!

Here is the implementation for L=1 (one layer neural network). It should inspire you to implement the general case (L-layer neural network).

if L == 1:
    parameters["W" + str(L)] = np.random.randn(layer_dims[1], layer_dims[0]) * 0.01;
    parameters["b" + str(L)] = np.zeros((layer_dims[1], 1));

# GRADED FUNCTION: initialize_parameters_deep

def initialize_parameters_deep(layer_dims):
    """
    Arguments:
    layer_dims -- python array (list) containing the dimensions of each layer in our network

    Returns:
    parameters -- python dictionary containing your parameters "W1", "b1", ..., "WL", "bL":
                    Wl -- weight matrix of shape (layer_dims[l], layer_dims[l-1])
                    bl -- bias vector of shape (layer_dims[l], 1)
    """

    np.random.seed(3);
    parameters = {};
    L = len(layer_dims);     # length of layer_dims, i.e. the input layer plus every layer of the network

    for l in range(1, L):
        ### START CODE HERE ### (≈ 2 lines of code)
        parameters["W" + str(l)] = np.random.randn(layer_dims[l], layer_dims[l - 1]) * 0.01;
        parameters["b" + str(l)] = np.zeros((layer_dims[l], 1));
        ### END CODE HERE ###

        assert(parameters['W' + str(l)].shape == (layer_dims[l], layer_dims[l-1]));
        assert(parameters['b' + str(l)].shape == (layer_dims[l], 1));


    return parameters;

parameters = initialize_parameters_deep([5,4,3]);
print("W1 = " + str(parameters["W1"]));
print("b1 = " + str(parameters["b1"]));
print("W2 = " + str(parameters["W2"]));
print("b2 = " + str(parameters["b2"]));

W1 = [[ 0.01788628  0.0043651   0.00096497 -0.01863493 -0.00277388]
 [-0.00354759 -0.00082741 -0.00627001 -0.00043818 -0.00477218]
 [-0.01313865  0.00884622  0.00881318  0.01709573  0.00050034]
 [-0.00404677 -0.0054536  -0.01546477  0.00982367 -0.01101068]]
b1 = [[0.]
 [0.]
 [0.]
 [0.]]
W2 = [[-0.01185047 -0.0020565   0.01486148  0.00236716]
 [-0.01023785 -0.00712993  0.00625245 -0.00160513]
 [-0.00768836 -0.00230031  0.00745056  0.01976111]]
b2 = [[0.]
 [0.]
 [0.]]

4. Forward propagation module

4.1 Linear Forward

Now that you have initialized your parameters, you will do the forward propagation module. You will start by implementing some basic functions that you will use later when implementing the model. You will complete three functions in this order:

LINEAR
LINEAR -> ACTIVATION where ACTIVATION will be either ReLU or Sigmoid.
[LINEAR -> RELU] × (L-1) -> LINEAR -> SIGMOID (whole model)

The linear forward module (vectorized over all the examples) computes the following equation:

$$Z^{[l]} = W^{[l]}A^{[l-1]} + b^{[l]}$$

where $A^{[0]} = X$.

Exercise: Build the linear part of forward propagation.

Reminder:
The mathematical representation of this unit is $Z^{[l]} = W^{[l]}A^{[l-1]} + b^{[l]}$. You may also find np.dot() useful. If your dimensions don’t match, printing W.shape may help.

# GRADED FUNCTION: linear_forward

def linear_forward(A, W, b):
    """
    Implement the linear part of a layer's forward propagation.

    Arguments:
    A -- activations from previous layer (or input data): (size of previous layer, number of examples)
    W -- weights matrix: numpy array of shape (size of current layer, size of previous layer)
    b -- bias vector, numpy array of shape (size of the current layer, 1)

    Returns:
    Z -- the input of the activation function, also called pre-activation parameter 
    cache -- a python tuple containing "A", "W" and "b"; stored for computing the backward pass efficiently
    """

    ### START CODE HERE ### (≈ 1 line of code)
    Z = np.dot(W, A) + b;
    ### END CODE HERE ###

    assert(Z.shape == (W.shape[0], A.shape[1]));
    cache = (A, W, b);

    return Z, cache;

A, W, b = linear_forward_test_case();
Z, linear_cache = linear_forward(A, W, b);
print("Z = " + str(Z));

Z = [[ 3.26295337 -1.23429987]]

linear_forward_test_case:

def linear_forward_test_case():
    np.random.seed(1);
    A = np.random.randn(3,2);
    W = np.random.randn(1,3);
    b = np.random.randn(1,1);
    return A, W, b;
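
To see what "vectorized over all the examples" means, the sketch below compares the single matrix product with an explicit loop over the columns of A. The shapes and the random seed are arbitrary assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed for this sketch
A = rng.standard_normal((3, 5))  # 3 features, 5 examples
W = rng.standard_normal((2, 3))  # 2 units in the current layer
b = rng.standard_normal((2, 1))

# Vectorized: every example (column of A) is handled in one matrix product.
Z_vectorized = np.dot(W, A) + b

# Explicit loop: compute z = W a + b for one example at a time.
Z_loop = np.zeros((W.shape[0], A.shape[1]))
for i in range(A.shape[1]):
    a_i = A[:, i:i + 1]                  # column vector of shape (3, 1)
    Z_loop[:, i:i + 1] = np.dot(W, a_i) + b

print(np.allclose(Z_vectorized, Z_loop))
```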

4.2 Linear-Activation Forward

In this notebook, you will use two activation functions:

Sigmoid: We have provided you with the sigmoid function. This function returns two items: the activation value “a” and a “cache” that contains “Z” (it’s what we will feed in to the corresponding backward function). To use it you could just call:

A, activation_cache = sigmoid(Z);

ReLU: The mathematical formula for ReLU is A=RELU(Z)=max(0,Z). We have provided you with the relu function. This function returns two items: the activation value “A” and a “cache” that contains “Z” (it’s what we will feed in to the corresponding backward function). To use it you could just call:

A, activation_cache = relu(Z);

For more convenience, you are going to group two functions (Linear and Activation) into one function (LINEAR->ACTIVATION). Hence, you will implement a function that does the LINEAR forward step followed by an ACTIVATION forward step.

Exercise: Implement the forward propagation of the LINEAR->ACTIVATION layer. The mathematical relation is:
$$A^{[l]} = g(Z^{[l]}) = g(W^{[l]}A^{[l-1]} + b^{[l]})$$
where the activation “g” can be sigmoid() or relu(). Use linear_forward() and the correct activation function.

# GRADED FUNCTION: linear_activation_forward

def linear_activation_forward(A_prev, W, b, activation):
    """
    Implement the forward propagation for the LINEAR->ACTIVATION layer

    Arguments:
    A_prev -- activations from previous layer (or input data): (size of previous layer, number of examples)
    W -- weights matrix: numpy array of shape (size of current layer, size of previous layer)
    b -- bias vector, numpy array of shape (size of the current layer, 1)
    activation -- the activation to be used in this layer, stored as a text string: "sigmoid" or "relu"

    Returns:
    A -- the output of the activation function, also called the post-activation value 
    cache -- a python tuple containing "linear_cache" and "activation_cache";
             stored for computing the backward pass efficiently
    """

    if activation == "sigmoid":
        # Inputs: "A_prev, W, b". Outputs: "A, activation_cache".
        ### START CODE HERE ### (≈ 2 lines of code)
        Z, linear_cache = linear_forward(A_prev, W, b); # Z, (A_prev, W, b)
        A, activation_cache = sigmoid(Z); # A, (Z)
        ### END CODE HERE ###

    elif activation == "relu":
        # Inputs: "A_prev, W, b". Outputs: "A, activation_cache".
        ### START CODE HERE ### (≈ 2 lines of code)
        Z, linear_cache = linear_forward(A_prev, W, b); 
        A, activation_cache = relu(Z);
        ### END CODE HERE ###

    assert (A.shape == (W.shape[0], A_prev.shape[1]));
    cache = (linear_cache, activation_cache); # ((A_prev, W, b), Z)

    return A, cache;

A_prev, W, b = linear_activation_forward_test_case();

A, linear_activation_cache = linear_activation_forward(A_prev, W, b, activation = "sigmoid");
print("With sigmoid: A = " + str(A));

A, linear_activation_cache = linear_activation_forward(A_prev, W, b, activation = "relu");
print("With ReLU: A = " + str(A));

With sigmoid: A = [[0.96890023 0.11013289]]
With ReLU: A = [[3.43896131 0.        ]]

linear_activation_forward_test_case function:

def linear_activation_forward_test_case():
    np.random.seed(2)
    A_prev = np.random.randn(3,2)
    W = np.random.randn(1,3)
    b = np.random.randn(1,1)
    return A_prev, W, b

Note: In deep learning, the “[LINEAR->ACTIVATION]” computation is counted as a single layer in the neural network, not two layers.

4.3 L-Layer Model

For even more convenience when implementing the L-layer Neural Net, you will need a function that replicates the previous one (linear_activation_forward with RELU) L−1 times, then follows that with one linear_activation_forward with SIGMOID.

Exercise: Implement the forward propagation of the above model.

Instruction: In the code below, the variable AL will denote $A^{[L]} = \sigma(Z^{[L]}) = \sigma(W^{[L]} A^{[L-1]} + b^{[L]})$. (This is sometimes also called Yhat, i.e., this is $\hat{Y}$.)

Tips:

Use the functions you had previously written
Use a for loop to replicate [LINEAR->RELU] (L-1) times
Don’t forget to keep track of the caches in the “caches” list. To add a new value c to a list, you can use list.append(c).

# GRADED FUNCTION: L_model_forward

def L_model_forward(X, parameters):
    """
    Implement forward propagation for the [LINEAR->RELU]*(L-1)->LINEAR->SIGMOID computation

    Arguments:
    X -- data, numpy array of shape (input size, number of examples)
    parameters -- output of initialize_parameters_deep()

    Returns:
    AL -- last post-activation value
    caches -- list of caches containing:
                every cache of linear_relu_forward() (there are L-1 of them, indexed from 0 to L-2)
                the cache of linear_sigmoid_forward() (there is one, indexed L-1)
    """

    caches = []
    A = X
    L = len(parameters) // 2                  # number of layers in the neural network

    # Implement [LINEAR -> RELU]*(L-1). Add "cache" to the "caches" list.
    for l in range(1, L):
        A_prev = A 
        ### START CODE HERE ### (≈ 2 lines of code)
        A, linear_activation_cache = linear_activation_forward(A_prev, parameters["W" + str(l)], parameters["b" + str(l)], "relu");
        caches.append(linear_activation_cache);
        ### END CODE HERE ###

    # Implement LINEAR -> SIGMOID. Add "cache" to the "caches" list.
    ### START CODE HERE ### (≈ 2 lines of code)
    AL, linear_activation_cache = linear_activation_forward(A, parameters["W" + str(L)], parameters["b" + str(L)], "sigmoid");
    caches.append(linear_activation_cache);
    ### END CODE HERE ###

    assert(AL.shape == (1,X.shape[1]));
    
    return AL, caches;

X, parameters = L_model_forward_test_case_2hidden();
AL, caches = L_model_forward(X, parameters);
print("AL = " + str(AL));
print("Length of caches list = " + str(len(caches)));

AL = [[0.03921668 0.70498921 0.19734387 0.04728177]]
Length of caches list = 3

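L_model_forward_test_case function (note that the call above uses the L_model_forward_test_case_2hidden variant, which is not reproduced here):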

def L_model_forward_test_case():
    np.random.seed(1);
    X = np.random.randn(4,2);
    W1 = np.random.randn(3,4);
    b1 = np.random.randn(3,1);
    W2 = np.random.randn(1,3);
    b2 = np.random.randn(1,1);
    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2};

    return X, parameters;

Great! Now you have a full forward propagation that takes the input and outputs a row vector $A^{[L]}$ containing your predictions. It also records all intermediate values in “caches”. Using $A^{[L]}$, you can compute the cost of your predictions.

5. Cost function

Now you will implement forward and backward propagation. You need to compute the cost, because you want to check if your model is actually learning.

Exercise: Compute the cross-entropy cost $J$, using the following formula:
$$-\frac{1}{m} \sum\limits_{i = 1}^{m} \left(y^{(i)}\log\left(a^{[L] (i)}\right) + (1-y^{(i)})\log\left(1- a^{[L] (i)}\right)\right) \tag{4}$$

# GRADED FUNCTION: compute_cost

def compute_cost(AL, Y):
    """
    Implement the cost function defined by equation (4).

    Arguments:
    AL -- probability vector corresponding to your label predictions, shape (1, number of examples)
    Y -- true "label" vector (for example: containing 0 if non-cat, 1 if cat), shape (1, number of examples)

    Returns:
    cost -- cross-entropy cost
    """

    m = Y.shape[1];

    # Compute loss from aL and y.
    ### START CODE HERE ### (≈ 1 lines of code)
    cost = -1 / m * (np.dot(Y, np.log(AL).T) + np.dot(1 - Y, np.log(1 - AL).T));
    ### END CODE HERE ###

    cost = np.squeeze(cost);      # To make sure your cost's shape is what we expect (e.g. this turns [[17]] into 17).
    #assert(isinstance(cost, float));
    assert(cost.shape == ());
    
    return cost;

Y, AL = compute_cost_test_case();
print("cost = " + str(compute_cost(AL, Y)));

cost = 0.41493159961539694

compute_cost_test_case function:

def compute_cost_test_case(): 
    Y = np.asarray([[1, 1, 1]]);
    aL = np.array([[.8,.9,0.4]]); 
    return Y, aL;
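
As a worked check of the cost value printed above, the sketch below evaluates the cross-entropy formula element by element on the same test inputs and compares it with the dot-product form used in compute_cost. The variable names losses, cost_elementwise and cost_dot are introduced here for illustration only.

```python
import numpy as np

Y = np.array([[1, 1, 1]])
AL = np.array([[0.8, 0.9, 0.4]])
m = Y.shape[1]

# Per-example loss: -(y * log(a) + (1 - y) * log(1 - a)).
losses = -(Y * np.log(AL) + (1 - Y) * np.log(1 - AL))
cost_elementwise = np.sum(losses) / m

# Same quantity written with dot products, as in compute_cost.
cost_dot = np.squeeze(-1 / m * (np.dot(Y, np.log(AL).T) + np.dot(1 - Y, np.log(1 - AL).T)))

print(losses)
print(cost_elementwise, cost_dot)
```

Since every label is 1 here, the cost reduces to $-(\ln 0.8 + \ln 0.9 + \ln 0.4)/3 \approx 0.4149$, in agreement with the output above.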

6. Backward propagation module

Just like with forward propagation, you will implement helper functions for backpropagation. Remember that back propagation is used to calculate the gradient of the loss function with respect to the parameters.

Reminder:

Figure 3 : Forward and Backward propagation for LINEAR->RELU->LINEAR->SIGMOID

The purple blocks represent the forward propagation, and the red blocks represent the backward propagation.

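In order to calculate the gradient $dW^{[1]} = \frac{\partial \mathcal{L}}{\partial W^{[1]}}$, you use the previous chain rule and you do $dW^{[1]} = dZ^{[1]} \times \frac{\partial Z^{[1]}}{\partial W^{[1]}}$. During the backpropagation, at each step you multiply your current gradient by the gradient corresponding to the specific layer to get the gradient you wanted. Equivalently, in order to calculate the gradient $db^{[1]} = \frac{\partial \mathcal{L}}{\partial b^{[1]}}$, you use the previous chain rule and you do $db^{[1]} = dZ^{[1]} \times \frac{\partial Z^{[1]}}{\partial b^{[1]}}$. This is why we talk about backpropagation.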

Now, similar to forward propagation, you are going to build the backward propagation in three steps:

LINEAR backward
LINEAR -> ACTIVATION backward where ACTIVATION computes the derivative of either the ReLU or sigmoid activation
[LINEAR -> RELU] × (L-1) -> LINEAR -> SIGMOID backward (whole model)

6.1 Linear backward

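For layer $l$, the linear part is: $Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}$ (followed by an activation).
Suppose you have already calculated the derivative $dZ^{[l]} = \frac{\partial \mathcal{L}}{\partial Z^{[l]}}$. You want to get $(dW^{[l]}, db^{[l]}, dA^{[l-1]})$.

The three outputs $(dW^{[l]}, db^{[l]}, dA^{[l-1]})$ are computed using the input $dZ^{[l]}$. Here are the formulas you need:

$$dW^{[l]} = \frac{\partial \mathcal{L} }{\partial W^{[l]}} = \frac{1}{m} dZ^{[l]} A^{[l-1] T} \tag{5}$$

$$db^{[l]} = \frac{\partial \mathcal{L} }{\partial b^{[l]}} = \frac{1}{m} \sum_{i = 1}^{m} dZ^{[l](i)}\tag{6}$$

$$dA^{[l-1]} = \frac{\partial \mathcal{L} }{\partial A^{[l-1]}} = W^{[l] T} dZ^{[l]} \tag{7}$$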

Exercise: Use the 3 formulas above to implement linear_backward().

# GRADED FUNCTION: linear_backward

def linear_backward(dZ, cache):
    """
    Implement the linear portion of backward propagation for a single layer (layer l)

    Arguments:
    dZ -- Gradient of the cost with respect to the linear output (of current layer l)
    cache -- tuple of values (A_prev, W, b) coming from the forward propagation in the current layer

    Returns:
    dA_prev -- Gradient of the cost with respect to the activation (of the previous layer l-1), same shape as A_prev
    dW -- Gradient of the cost with respect to W (current layer l), same shape as W
    db -- Gradient of the cost with respect to b (current layer l), same shape as b
    """
    A_prev, W, b = cache;
    m = A_prev.shape[1];

    ### START CODE HERE ### (≈ 3 lines of code)
    dW = 1 / m * np.dot(dZ, A_prev.T);
    db = 1 / m * np.sum(dZ, axis = 1, keepdims = True);
    dA_prev = np.dot(W.T, dZ);
    ### END CODE HERE ###

    assert (dA_prev.shape == A_prev.shape);
    assert (dW.shape == W.shape);
    assert (db.shape == b.shape);

    return dA_prev, dW, db;

# Set up some test inputs
dZ, linear_cache = linear_backward_test_case();
dA_prev, dW, db = linear_backward(dZ, linear_cache);
print ("dA_prev = "+ str(dA_prev));
print ("dW = " + str(dW));
print ("db = " + str(db));

dA_prev = [[ 0.51822968 -0.19517421]
 [-0.40506361  0.15255393]
 [ 2.37496825 -0.89445391]]
dW = [[-0.10076895  1.40685096  1.64992505]]
db = [[0.50629448]]

linear_backward_test_case function:

def linear_backward_test_case():
    np.random.seed(1);
    dZ = np.random.randn(1,2);
    A = np.random.randn(3,2);
    W = np.random.randn(1,3);
    b = np.random.randn(1,1);
    linear_cache = (A, W, b);
    return dZ, linear_cache;
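
The dW and db formulas can be checked numerically on a toy cost. The sketch below assumes a hypothetical cost $J = \frac{1}{2m}\sum Z^2$ (so that dZ = Z), which is not the cost used in the assignment; toy_cost, the shapes, and the seed are all illustrative choices. dA_prev is left out of the check because, as defined, it does not carry the $\frac{1}{m}$ factor.

```python
import numpy as np

def linear_backward(dZ, cache):
    A_prev, W, b = cache
    m = A_prev.shape[1]
    dW = 1 / m * np.dot(dZ, A_prev.T)
    db = 1 / m * np.sum(dZ, axis=1, keepdims=True)
    dA_prev = np.dot(W.T, dZ)
    return dA_prev, dW, db

def toy_cost(A_prev, W, b):
    # Hypothetical cost for the check: J = 1/(2m) * sum(Z**2), so dZ = Z.
    Z = np.dot(W, A_prev) + b
    return np.sum(Z ** 2) / (2 * A_prev.shape[1])

rng = np.random.default_rng(1)   # arbitrary seed
A_prev = rng.standard_normal((3, 4))
W = rng.standard_normal((2, 3))
b = rng.standard_normal((2, 1))

dZ = np.dot(W, A_prev) + b
_, dW, db = linear_backward(dZ, (A_prev, W, b))

# Central finite differences on every entry of W and b.
eps = 1e-6
dW_num = np.zeros_like(W)
for idx in np.ndindex(*W.shape):
    step = np.zeros_like(W)
    step[idx] = eps
    dW_num[idx] = (toy_cost(A_prev, W + step, b) - toy_cost(A_prev, W - step, b)) / (2 * eps)

db_num = np.zeros_like(b)
for idx in np.ndindex(*b.shape):
    step = np.zeros_like(b)
    step[idx] = eps
    db_num[idx] = (toy_cost(A_prev, W, b + step) - toy_cost(A_prev, W, b - step)) / (2 * eps)

print(np.max(np.abs(dW - dW_num)), np.max(np.abs(db - db_num)))
```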

6.2 Linear-Activation backward

Next, you will create a function, linear_activation_backward, that merges the two helper functions: linear_backward and the backward step for the activation.

To help you implement linear_activation_backward, we provided two backward functions:

sigmoid_backward: Implements the backward propagation for SIGMOID unit. You can call it as follows:

dZ = sigmoid_backward(dA, activation_cache)

relu_backward: Implements the backward propagation for RELU unit. You can call it as follows:

dZ = relu_backward(dA, activation_cache)

If g(.) is the activation function, sigmoid_backward and relu_backward compute:

$$dZ^{[l]} = dA^{[l]} * g'(Z^{[l]})$$

Exercise: Implement the backpropagation for the LINEAR->ACTIVATION layer.

# GRADED FUNCTION: linear_activation_backward

def linear_activation_backward(dA, cache, activation):
    
    """
    Implement the backward propagation for the LINEAR->ACTIVATION layer.

    Arguments:
    dA -- post-activation gradient for current layer l 
    cache -- tuple of values (linear_cache, activation_cache) we store for computing backward propagation efficiently
    activation -- the activation to be used in this layer, stored as a text string: "sigmoid" or "relu"

    Returns:
    dA_prev -- Gradient of the cost with respect to the activation (of the previous layer l-1), same shape as A_prev
    dW -- Gradient of the cost with respect to W (current layer l), same shape as W
    db -- Gradient of the cost with respect to b (current layer l), same shape as b
    """
    
    linear_cache, activation_cache = cache

    if activation == "relu":
        ### START CODE HERE ### (≈ 2 lines of code)
        dZ = relu_backward(dA, activation_cache);
        dA_prev, dW, db = linear_backward(dZ, linear_cache);
        ### END CODE HERE ###

    elif activation == "sigmoid":
        ### START CODE HERE ### (≈ 2 lines of code)
        dZ = sigmoid_backward(dA, activation_cache);
        dA_prev, dW, db = linear_backward(dZ, linear_cache);
        ### END CODE HERE ###

    return dA_prev, dW, db;

AL, linear_activation_cache = linear_activation_backward_test_case();

dA_prev, dW, db = linear_activation_backward(AL, linear_activation_cache, activation = "sigmoid");
print ("sigmoid:");
print ("dA_prev = "+ str(dA_prev));
print ("dW = " + str(dW));
print ("db = " + str(db) + "\n");

dA_prev, dW, db = linear_activation_backward(AL, linear_activation_cache, activation = "relu");
print ("relu:");
print ("dA_prev = "+ str(dA_prev));
print ("dW = " + str(dW));
print ("db = " + str(db));

sigmoid:
dA_prev = [[ 0.11017994  0.01105339]
 [ 0.09466817  0.00949723]
 [-0.05743092 -0.00576154]]
dW = [[ 0.10266786  0.09778551 -0.01968084]]
db = [[-0.05729622]]

relu:
dA_prev = [[ 0.44090989  0.        ]
 [ 0.37883606  0.        ]
 [-0.2298228   0.        ]]
dW = [[ 0.44513824  0.37371418 -0.10478989]]
db = [[-0.20837892]]

linear_activation_backward_test_case function:

def linear_activation_backward_test_case():
    np.random.seed(2);
    dA = np.random.randn(1,2);
    A = np.random.randn(3,2);
    W = np.random.randn(1,3);
    b = np.random.randn(1,1);
    Z = np.random.randn(1,2);
    linear_cache = (A, W, b);
    activation_cache = Z;
    linear_activation_cache = (linear_cache, activation_cache);

    return dA, linear_activation_cache;

6.3 L-Model Backward

Now you will implement the backward function for the whole network. Recall that when you implemented the L_model_forward function, at each iteration, you stored a cache which contains (X,W,b, and z). In the back propagation module, you will use those variables to compute the gradients. Therefore, in the L_model_backward function, you will iterate through all the hidden layers backward, starting from layer L. On each step, you will use the cached values for layer l to backpropagate through layer l. Figure 5 below shows the backward pass.

Initializing backpropagation:

To backpropagate through this network, we know that the output is $A^{[L]} = \sigma(Z^{[L]})$. Your code thus needs to compute dAL $= \frac{\partial \mathcal{L}}{\partial A^{[L]}}$.
To do so, use this formula (derived using calculus which you don’t need in-depth knowledge of):

dAL = - (np.divide(Y, AL) - np.divide(1 - Y, 1 - AL)) # derivative of cost with respect to AL
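
A quick way to build confidence in this formula is to combine it with the sigmoid backward step: multiplying dAL by $A^{[L]}(1 - A^{[L]})$ should collapse to $A^{[L]} - Y$. The sketch below checks that identity on arbitrary example values; the variable dZL and the inputs are assumptions for illustration.

```python
import numpy as np

Y = np.array([[1, 0, 1, 0]])
AL = np.array([[0.9, 0.2, 0.4, 0.7]])   # arbitrary probabilities strictly between 0 and 1

dAL = -(np.divide(Y, AL) - np.divide(1 - Y, 1 - AL))

# Sigmoid backward step: dZ = dAL * s * (1 - s), where s equals AL at the output layer.
dZL = dAL * AL * (1 - AL)

print(dZL)
print(AL - Y)
```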

You can then use this post-activation gradient dAL to keep going backward. As seen in Figure 5, you can now feed dAL into the LINEAR->SIGMOID backward function you implemented (which will use the cached values stored by the L_model_forward function). After that, you will have to use a for loop to iterate through all the other layers using the LINEAR->RELU backward function. You should store each dA, dW, and db in the grads dictionary. To do so, use this formula:

$$grads["dW" + str(l)] = dW^{[l]}$$

For example, for l=3 this would store $dW^{[l]}$ in grads["dW3"].

Exercise: Implement backpropagation for the [LINEAR->RELU] × (L-1) -> LINEAR -> SIGMOID model.

# GRADED FUNCTION: L_model_backward

def L_model_backward(AL, Y, caches):
    """
    Implement the backward propagation for the [LINEAR->RELU] * (L-1) -> LINEAR -> SIGMOID group

    Arguments:
    AL -- probability vector, output of the forward propagation (L_model_forward())
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat)
    caches -- list of caches containing:
                every cache of linear_activation_forward() with "relu" (it's caches[l], for l in range(L-1) i.e l = 0...L-2)
                the cache of linear_activation_forward() with "sigmoid" (it's caches[L-1])

    Returns:
    grads -- A dictionary with the gradients
             grads["dA" + str(l)] = ... 
             grads["dW" + str(l)] = ...
             grads["db" + str(l)] = ... 
    """
    grads = {};
    L = len(caches); # the number of layers
    m = AL.shape[1];
    Y = Y.reshape(AL.shape); # after this line, Y is the same shape as AL

    # Initializing the backpropagation
    ### START CODE HERE ### (1 line of code)
    dAL = - (np.divide(Y, AL) - np.divide(1 - Y, 1 - AL));
    ### END CODE HERE ###

    # Lth layer (SIGMOID -> LINEAR) gradients. Inputs: "AL, Y, caches". Outputs: "grads["dAL"], grads["dWL"], grads["dbL"]
    ### START CODE HERE ### (approx. 2 lines)
    dA_prev, dW, db = linear_activation_backward(dAL, caches[L - 1], "sigmoid");
    grads["dA" + str(L)], grads["dW" + str(L)], grads["db" + str(L)] = dA_prev, dW, db;
    ### END CODE HERE ###

    for l in reversed(range(L-1)):
        # lth layer: (RELU -> LINEAR) gradients.
        # Inputs: "grads["dA" + str(l + 2)], caches". Outputs: "grads["dA" + str(l + 1)] , grads["dW" + str(l + 1)] , grads["db" + str(l + 1)] 
        ### START CODE HERE ### (approx. 5 lines)
        dA = dA_prev;
        dA_prev, dW, db = linear_activation_backward(dA, caches[l], "relu");
        grads["dA" + str(l + 1)] = dA_prev;
        grads["dW" + str(l + 1)] = dW;
        grads["db" + str(l + 1)] = db;
        ### END CODE HERE ###

    return grads;

AL, Y_assess, caches = L_model_backward_test_case();
grads = L_model_backward(AL, Y_assess, caches);
print_grads(grads);

dW1 = [[0.41010002 0.07807203 0.13798444 0.10502167]
 [0.         0.         0.         0.        ]
 [0.05283652 0.01005865 0.01777766 0.0135308 ]]
db1 = [[-0.22007063]
 [ 0.        ]
 [-0.02835349]]
dA1 = [[ 0.12913162 -0.44014127]
 [-0.14175655  0.48317296]
 [ 0.01663708 -0.05670698]]

L_model_backward_test_case function in testCases_v3.py:

def L_model_backward_test_case():
    """
    X = np.random.rand(3,2)
    Y = np.array([[1, 1]])
    parameters = {'W1': np.array([[ 1.78862847,  0.43650985,  0.09649747]]), 'b1': np.array([[ 0.]])}

    aL, caches = (np.array([[ 0.60298372,  0.87182628]]), [((np.array([[ 0.20445225,  0.87811744],
           [ 0.02738759,  0.67046751],
           [ 0.4173048 ,  0.55868983]]),
    np.array([[ 1.78862847,  0.43650985,  0.09649747]]),
    np.array([[ 0.]])),
   np.array([[ 0.41791293,  1.91720367]]))])
   """
    np.random.seed(3)
    AL = np.random.randn(1, 2)
    Y = np.array([[1, 0]])

    A1 = np.random.randn(4,2)
    W1 = np.random.randn(3,4)
    b1 = np.random.randn(3,1)
    Z1 = np.random.randn(3,2)
    linear_cache_activation_1 = ((A1, W1, b1), Z1)

    A2 = np.random.randn(3,2)
    W2 = np.random.randn(1,3)
    b2 = np.random.randn(1,1)
    Z2 = np.random.randn(1,2)
    linear_cache_activation_2 = ((A2, W2, b2), Z2)

    caches = (linear_cache_activation_1, linear_cache_activation_2)

    return AL, Y, caches

6.4 Update Parameters

In this section you will update the parameters of the model, using gradient descent:

$$ W^{[l]} = W^{[l]} - \alpha \, dW^{[l]} $$
$$ b^{[l]} = b^{[l]} - \alpha \, db^{[l]} $$

where $\alpha$ is the learning rate. After computing the updated parameters, store them in the parameters dictionary.

Exercise: Implement update_parameters() to update your parameters using gradient descent.

Instructions:
Update parameters using gradient descent on every $W^{[l]}$ and $b^{[l]}$ for $l = 1, 2, ..., L$.

# GRADED FUNCTION: update_parameters

def update_parameters(parameters, grads, learning_rate):
    """
    Update parameters using gradient descent

    Arguments:
    parameters -- python dictionary containing your parameters 
    grads -- python dictionary containing your gradients, output of L_model_backward
    learning_rate -- step size (alpha) of the gradient descent update

    Returns:
    parameters -- python dictionary containing your updated parameters 
                  parameters["W" + str(l)] = ... 
                  parameters["b" + str(l)] = ...
    """

    L = len(parameters) // 2 # number of layers in the neural network

    # Update rule for each parameter. Use a for loop.
    ### START CODE HERE ### (≈ 3 lines of code)
    for l in range(L):
        parameters["W" + str(l + 1)] -= learning_rate * grads["dW" + str(l + 1)];
        parameters["b" + str(l + 1)] -= learning_rate * grads["db" + str(l + 1)];
    ### END CODE HERE ###
    return parameters;

parameters, grads = update_parameters_test_case();
parameters = update_parameters(parameters, grads, 0.1);

print ("W1 = "+ str(parameters["W1"]));
print ("b1 = "+ str(parameters["b1"]));
print ("W2 = "+ str(parameters["W2"]));
print ("b2 = "+ str(parameters["b2"]));

W1 = [[-0.59562069 -0.09991781 -2.14584584  1.82662008]
 [-1.76569676 -0.80627147  0.51115557 -1.18258802]
 [-1.0535704  -0.86128581  0.68284052  2.20374577]]
b1 = [[-0.04659241]
 [-1.28888275]
 [ 0.53405496]]
W2 = [[-0.55569196  0.0354055   1.32964895]]
b2 = [[-0.84610769]]

update_parameters_test_case function in testCases_v3.py:

def update_parameters_test_case():
    np.random.seed(2)
    W1 = np.random.randn(3,4)
    b1 = np.random.randn(3,1)
    W2 = np.random.randn(1,3)
    b2 = np.random.randn(1,1)
    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2}
    np.random.seed(3)
    dW1 = np.random.randn(3,4)
    db1 = np.random.randn(3,1)
    dW2 = np.random.randn(1,3)
    db2 = np.random.randn(1,1)
    grads = {"dW1": dW1,
             "db1": db1,
             "dW2": dW2,
             "db2": db2}

    return parameters, grads
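
As a quick sanity check of the update rule, here is a minimal sketch with hand-picked toy values (the numbers are made up for illustration and are not from the assignment). It repeats the same loop as `update_parameters` above and also shows a side effect worth knowing: because the update uses `-=`, the arrays stored in the dictionary are modified in place.

```python
import numpy as np


def update_parameters(parameters, grads, learning_rate):
    """Same rule as above: theta = theta - learning_rate * d_theta."""
    L = len(parameters) // 2  # number of layers
    for l in range(1, L + 1):
        parameters["W" + str(l)] -= learning_rate * grads["dW" + str(l)]
        parameters["b" + str(l)] -= learning_rate * grads["db" + str(l)]
    return parameters


# Toy one-layer example (values chosen so the arithmetic is easy to follow).
W1 = np.array([[1.0, 2.0]])
params = {"W1": W1, "b1": np.array([[0.5]])}
grads = {"dW1": np.array([[10.0, -10.0]]), "db1": np.array([[1.0]])}

updated = update_parameters(params, grads, learning_rate=0.1)

# W1: [1 - 0.1*10, 2 - 0.1*(-10)] = [0, 3];  b1: 0.5 - 0.1*1 = 0.4
print(updated["W1"], updated["b1"])
# The original array object was changed as well, since "-=" works in place.
print(W1 is updated["W1"])
```

If the caller needs to keep the old parameters, copying the dictionary's arrays before the call is one way to do it.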

7. Conclusion

Congrats on implementing all the functions required for building a deep neural network!

We know it was a long assignment but going forward it will only get better. The next part of the assignment is easier.

In the next assignment you will put all these together to build two models:

A two-layer neural network
An L-layer neural network

You will in fact use these models to classify cat vs non-cat images!

Part 2：Deep Neural Network for Image Classification: Application

1. Packages

Let’s first import all the packages that you will need during this assignment.

numpy is the fundamental package for scientific computing with Python.
matplotlib is a library to plot graphs in Python.
h5py is a common package to interact with a dataset that is stored on an H5 file.
PIL and scipy are used here to test your model with your own picture at the end.
dnn_app_utils provides the functions implemented in the “Building your Deep Neural Network: Step by Step” assignment to this notebook.
np.random.seed(1) is used to keep all the random function calls consistent. It will help us grade your work.

import time
import numpy as np
import h5py
import matplotlib.pyplot as plt
import scipy
from PIL import Image
from scipy import ndimage
from dnn_app_utils_v2 import *

%matplotlib inline
plt.rcParams['figure.figsize'] = (5.0, 4.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

%load_ext autoreload
%autoreload 2

np.random.seed(1)

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload

2. Dataset

You will use the same “Cat vs non-Cat” dataset as in “Logistic Regression as a Neural Network” (Assignment 2). The model you built there had 70% test accuracy on classifying cat vs non-cat images. Hopefully, your new model will perform better!

Problem Statement: You are given a dataset (“data.h5”) containing:

a training set of m_train images labelled as cat (1) or non-cat (0)
a test set of m_test images labelled as cat and non-cat
each image is of shape (num_px, num_px, 3) where 3 is for the 3 channels (RGB).

Let’s get more familiar with the dataset. Load the data by running the cell below.

train_x_orig, train_y, test_x_orig, test_y, classes = load_data();

The following code will show you an image in the dataset. Feel free to change the index and re-run the cell multiple times to see other images.

# Example of a picture
index = 10;
plt.imshow(train_x_orig[index]);
print ("y = " + str(train_y[0,index]) + ". It's a " + classes[train_y[0,index]].decode("utf-8") +  " picture.");

y = 0. It's a non-cat picture.

# Explore your dataset 
m_train = train_x_orig.shape[0];
num_px = train_x_orig.shape[1];
m_test = test_x_orig.shape[0];

print ("Number of training examples: " + str(m_train));
print ("Number of testing examples: " + str(m_test));
print ("Each image is of size: (" + str(num_px) + ", " + str(num_px) + ", 3)");
print ("train_x_orig shape: " + str(train_x_orig.shape));
print ("train_y shape: " + str(train_y.shape));
print ("test_x_orig shape: " + str(test_x_orig.shape));
print ("test_y shape: " + str(test_y.shape));

Number of training examples: 209
Number of testing examples: 50
Each image is of size: (64, 64, 3)
train_x_orig shape: (209, 64, 64, 3)
train_y shape: (1, 209)
test_x_orig shape: (50, 64, 64, 3)
test_y shape: (1, 50)

As usual, you reshape and standardize the images before feeding them to the network. The code is given in the cell below.

# Reshape the training and test examples 
train_x_flatten = train_x_orig.reshape(train_x_orig.shape[0], -1).T;   # The "-1" makes reshape flatten the remaining dimensions
test_x_flatten = test_x_orig.reshape(test_x_orig.shape[0], -1).T;

# Standardize data to have feature values between 0 and 1.
train_x = train_x_flatten / 255.;
test_x = test_x_flatten / 255.;

print ("train_x's shape: " + str(train_x.shape));
print ("test_x's shape: " + str(test_x.shape));

train_x's shape: (12288, 209)
test_x's shape: (12288, 50)

12,288 equals 64 × 64 × 3, which is the size of one reshaped image vector.
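
To see what `reshape(m, -1).T` does without loading the dataset, here is a minimal sketch on a tiny made-up array (2 "images" of 2 × 2 pixels, so the toy values are assumptions for illustration only). Each column of the result is one flattened image.

```python
import numpy as np

# Toy stand-in for train_x_orig: 2 "images" of 2 x 2 pixels with 3 channels.
m, num_px = 2, 2
x_orig = np.arange(m * num_px * num_px * 3).reshape(m, num_px, num_px, 3)

# Same pattern as above: keep the example axis, flatten the rest, transpose.
x_flatten = x_orig.reshape(x_orig.shape[0], -1).T

print(x_flatten.shape)  # (num_px * num_px * 3, m)
# Column i holds image i, flattened in row-major order.
print(np.array_equal(x_flatten[:, 1], x_orig[1].ravel()))
```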

3. Architecture of your model

Now that you are familiar with the dataset, it is time to build a deep neural network to distinguish cat images from non-cat images.

You will build two different models:

A 2-layer neural network
An L-layer deep neural network

You will then compare the performance of these models, and also try out different values for L.

Let’s look at the two architectures.

3.1 2-layer neural network

The model can be summarized as: INPUT -> LINEAR -> RELU -> LINEAR -> SIGMOID -> OUTPUT

Detailed Architecture of figure 2:

The input is a (64,64,3) image which is flattened to a vector of size (12288,1).
The corresponding vector $[x_0, x_1, ..., x_{12287}]^T$ is then multiplied by the weight matrix $W^{[1]}$ of size $(n^{[1]}, 12288)$.
You then add a bias term and take its relu to get the following vector: $[a_0^{[1]}, a_1^{[1]}, ..., a_{n^{[1]}-1}^{[1]}]^T$.
You then repeat the same process.
You multiply the resulting vector by $W^{[2]}$ and add your intercept (bias).
Finally, you take the sigmoid of the result. If it is greater than 0.5, you classify it to be a cat.
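
The steps above can be traced shape by shape. The following is a minimal sketch with random stand-in data rather than the assignment's helper functions: the seed, the 0.01 scaling of the weights and the batch of 4 fake examples are assumptions made only so the block runs on its own.

```python
import numpy as np

rng = np.random.default_rng(0)       # arbitrary seed for the sketch
n_x, n_h, n_y, m = 12288, 7, 1, 4    # m = 4 fake examples

X = rng.random((n_x, m))                       # stand-in for train_x
W1 = rng.standard_normal((n_h, n_x)) * 0.01    # (n_h, 12288)
b1 = np.zeros((n_h, 1))
W2 = rng.standard_normal((n_y, n_h)) * 0.01    # (1, n_h)
b2 = np.zeros((n_y, 1))

Z1 = W1 @ X + b1                 # LINEAR
A1 = np.maximum(0, Z1)           # RELU
Z2 = W2 @ A1 + b2                # LINEAR
A2 = 1 / (1 + np.exp(-Z2))       # SIGMOID
is_cat = A2 > 0.5                # classify as cat when the output exceeds 0.5

for name, arr in [("X", X), ("Z1", Z1), ("A1", A1), ("Z2", Z2), ("A2", A2)]:
    print(name, arr.shape)
```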

3.2 L-layer deep neural network

It is hard to represent an L-layer deep neural network with the above representation. However, here is a simplified network representation:

The model can be summarized as: [LINEAR -> RELU] × (L-1) -> LINEAR -> SIGMOID

Detailed Architecture of figure 3:

The input is a (64,64,3) image which is flattened to a vector of size (12288,1).
The corresponding vector $[x_0, x_1, ..., x_{12287}]^T$ is then multiplied by the weight matrix $W^{[1]}$ and then you add the intercept $b^{[1]}$. The result is called the linear unit.
Next, you take the relu of the linear unit. This process could be repeated several times for each $(W^{[l]}, b^{[l]})$ depending on the model architecture.
Finally, you take the sigmoid of the final linear unit. If it is greater than 0.5, you classify it to be a cat.
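
The repeated [LINEAR -> RELU] block followed by a final LINEAR -> SIGMOID can be written as one loop. This is a simplified sketch, not the assignment's `L_model_forward`: it keeps no caches, the function name `forward` is hypothetical, and the small layer sizes and random parameters are assumptions chosen to keep the example fast.

```python
import numpy as np


def forward(X, parameters):
    """Simplified forward pass: [LINEAR -> RELU] * (L-1) -> LINEAR -> SIGMOID."""
    L = len(parameters) // 2              # one W and one b per layer
    A = X
    for l in range(1, L):                 # hidden layers 1 .. L-1
        Z = parameters["W" + str(l)] @ A + parameters["b" + str(l)]
        A = np.maximum(0, Z)              # relu
    ZL = parameters["W" + str(L)] @ A + parameters["b" + str(L)]
    return 1 / (1 + np.exp(-ZL))          # sigmoid


# Toy network: 6 inputs, hidden layers of 4 and 3 units, 1 output.
rng = np.random.default_rng(1)
dims = [6, 4, 3, 1]
parameters = {}
for l in range(1, len(dims)):
    parameters["W" + str(l)] = rng.standard_normal((dims[l], dims[l - 1])) * 0.1
    parameters["b" + str(l)] = np.zeros((dims[l], 1))

AL = forward(rng.random((6, 5)), parameters)   # 5 fake examples
print(AL.shape)
```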

3.3 General methodology

As usual you will follow the Deep Learning methodology to build the model:

Initialize parameters / Define hyperparameters
Loop for num_iterations:
1. Forward propagation
2. Compute cost function
3. Backward propagation
4. Update parameters (using parameters, and grads from backprop)
Use trained parameters to predict labels
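
To make the four steps of the loop concrete before applying them to a deep network, here is a minimal sketch on the smallest possible model: a single sigmoid unit (logistic regression) trained on made-up 2-D data. The data, seed, learning rate and iteration count are all assumptions for illustration; only the structure of the loop mirrors the methodology above.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((2, 50))                 # 2 features, 50 examples
Y = (X[0:1] + X[1:2] > 0).astype(float)          # toy labels, shape (1, 50)
m = X.shape[1]

# Initialize parameters / define hyperparameters
W = np.zeros((1, 2))
b = 0.0
learning_rate = 0.5
costs = []

for i in range(200):
    # 1. Forward propagation
    A = 1 / (1 + np.exp(-(W @ X + b)))
    # 2. Compute cost (cross-entropy, averaged over the examples)
    cost = -np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))
    # 3. Backward propagation
    dZ = A - Y
    dW = dZ @ X.T / m
    db = float(np.mean(dZ))
    # 4. Update parameters
    W -= learning_rate * dW
    b -= learning_rate * db
    costs.append(cost)

# Use trained parameters to predict labels
predictions = (1 / (1 + np.exp(-(W @ X + b))) > 0.5).astype(float)
train_accuracy = float(np.mean(predictions == Y))
print(costs[0], costs[-1], train_accuracy)
```

With all-zero parameters every output is 0.5, so the first recorded cost is log(2) ≈ 0.693, the same starting point seen in the two-layer run later on.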

Let’s now implement those two models!

4. Two-layer neural network

Question: Use the helper functions you have implemented in the previous assignment to build a 2-layer neural network with the following structure: LINEAR -> RELU -> LINEAR -> SIGMOID. The functions you may need and their inputs are:

def initialize_parameters(n_x, n_h, n_y):
    ...
    return parameters 
def linear_activation_forward(A_prev, W, b, activation):
    ...
    return A, cache
def compute_cost(AL, Y):
    ...
    return cost
def linear_activation_backward(dA, cache, activation):
    ...
    return dA_prev, dW, db
def update_parameters(parameters, grads, learning_rate):
    ...
    return parameters

### CONSTANTS DEFINING THE MODEL ####
n_x = 12288;    # num_px * num_px * 3
n_h = 7;
n_y = 1;
layers_dims = (n_x, n_h, n_y);

#GRADED FUNCTION: two_layer_model

def two_layer_model(X, Y, layers_dims, learning_rate = 0.0075, num_iterations = 3000, print_cost=False):
    """
    Implements a two-layer neural network: LINEAR->RELU->LINEAR->SIGMOID.

    Arguments:
    X -- input data, of shape (n_x, number of examples)
    Y -- true "label" vector (containing 1 if cat, 0 if non-cat), of shape (1, number of examples)
    layers_dims -- dimensions of the layers (n_x, n_h, n_y)
    num_iterations -- number of iterations of the optimization loop
    learning_rate -- learning rate of the gradient descent update rule
    print_cost -- If set to True, this will print the cost every 100 iterations 

    Returns:
    parameters -- a dictionary containing W1, W2, b1, and b2
    """

    np.random.seed(1);
    grads = {};
    costs = [];                              # to keep track of the cost
    m = X.shape[1];                           # number of examples
    (n_x, n_h, n_y) = layers_dims;

    # Initialize parameters dictionary, by calling one of the functions you'd previously implemented
    ### START CODE HERE ### (≈ 1 line of code)
    parameters = initialize_parameters(n_x, n_h, n_y);
    ### END CODE HERE ###

    # Get W1, b1, W2 and b2 from the dictionary parameters.
    W1 = parameters["W1"];
    b1 = parameters["b1"];
    W2 = parameters["W2"];
    b2 = parameters["b2"];

    # Loop (gradient descent)

    for i in range(0, num_iterations):

        # Forward propagation: LINEAR -> RELU -> LINEAR -> SIGMOID. Inputs: "X, W1, b1". Output: "A1, cache1, A2, cache2".
        ### START CODE HERE ### (≈ 2 lines of code)
        A1, cache1 = linear_activation_forward(X, W1, b1, "relu");
        A2, cache2 = linear_activation_forward(A1, W2, b2, "sigmoid");
        ### END CODE HERE ###

        # Compute cost
        ### START CODE HERE ### (≈ 1 line of code)
        cost = compute_cost(A2, Y);
        ### END CODE HERE ###

        # Initializing backward propagation
        dA2 = - (np.divide(Y, A2) - np.divide(1 - Y, 1 - A2));

        # Backward propagation. Inputs: "dA2, cache2, cache1". Outputs: "dA1, dW2, db2; also dA0 (not used), dW1, db1".
        ### START CODE HERE ### (≈ 2 lines of code)
        dA1, dW2, db2 = linear_activation_backward(dA2, cache2, "sigmoid");
        dA0, dW1, db1 = linear_activation_backward(dA1, cache1, "relu");
        ### END CODE HERE ###

        # Set grads['dW1'] to dW1, grads['db1'] to db1, grads['dW2'] to dW2, grads['db2'] to db2
        grads['dW1'] = dW1;
        grads['db1'] = db1;
        grads['dW2'] = dW2;
        grads['db2'] = db2;

        # Update parameters.
        ### START CODE HERE ### (approx. 1 line of code)
        parameters = update_parameters(parameters, grads, learning_rate);
        ### END CODE HERE ###

        # Retrieve W1, b1, W2, b2 from parameters
        W1 = parameters["W1"];
        b1 = parameters["b1"];
        W2 = parameters["W2"];
        b2 = parameters["b2"];

        # Print and record the cost every 100 iterations
        if print_cost and i % 100 == 0:
            print("Cost after iteration {}: {}".format(i, np.squeeze(cost)));
            costs.append(cost);

    # plot the cost

    plt.plot(np.squeeze(costs));
    plt.ylabel('cost');
    plt.xlabel('iterations (per hundreds)');
    plt.title("Learning rate =" + str(learning_rate));
    plt.show();

    return parameters;
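
The line that initializes backward propagation, `dA2 = - (np.divide(Y, A2) - np.divide(1 - Y, 1 - A2))`, is the derivative of the per-example cross-entropy loss with respect to the output activation. A minimal numerical check, using made-up values for `Y` and `A2` (assumptions for illustration), compares it against a central finite difference:

```python
import numpy as np


def loss(a, y):
    """Per-example cross-entropy loss, before averaging over the examples."""
    return -(y * np.log(a) + (1 - y) * np.log(1 - a))


Y = np.array([[1.0, 0.0, 1.0]])      # toy labels
A2 = np.array([[0.8, 0.3, 0.6]])     # toy sigmoid outputs

# Analytic derivative, exactly as written in two_layer_model.
dA2 = - (np.divide(Y, A2) - np.divide(1 - Y, 1 - A2))

# Central finite difference of the same loss.
eps = 1e-6
numeric = (loss(A2 + eps, Y) - loss(A2 - eps, Y)) / (2 * eps)

print(dA2)
print(np.allclose(dA2, numeric, atol=1e-5))
```

Note that `dA2` carries no 1/m factor, so in this setup the averaging over examples has to happen when `dW` and `db` are computed, which is presumably done inside the backward helpers.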

Run the cell below to train your parameters and check that your model runs. The cost should be decreasing. It may take up to 5 minutes to run 2500 iterations. Check whether the “Cost after iteration 0” matches the expected output below; if not, click the black square button on the upper bar of the notebook to stop the cell and try to find your error.

parameters = two_layer_model(train_x, train_y, layers_dims = (n_x, n_h, n_y), num_iterations = 2500, print_cost = True);

Cost after iteration 0: 0.693049735659989
Cost after iteration 100: 0.6464320953428849
Cost after iteration 200: 0.6325140647912678
Cost after iteration 300: 0.6015024920354665
Cost after iteration 400: 0.5601966311605748
Cost after iteration 500: 0.5158304772764729
Cost after iteration 600: 0.4754901313943325
Cost after iteration 700: 0.43391631512257495
Cost after iteration 800: 0.4007977536203886
Cost after iteration 900: 0.35807050113237976
Cost after iteration 1000: 0.33942815383664127
Cost after iteration 1100: 0.3052753636196264
Cost after iteration 1200: 0.2749137728213016
Cost after iteration 1300: 0.2468176821061484
Cost after iteration 1400: 0.19850735037466102
Cost after iteration 1500: 0.1744831811255665
Cost after iteration 1600: 0.17080762978096942
Cost after iteration 1700: 0.11306524562164715
Cost after iteration 1800: 0.09629426845937152
Cost after iteration 1900: 0.0834261795972687
Cost after iteration 2000: 0.07439078704319087
Cost after iteration 2100: 0.06630748132267934
Cost after iteration 2200: 0.05919329501038172
Cost after iteration 2300: 0.053361403485605585
Cost after iteration 2400: 0.04855478562877019

Good thing you built a vectorized implementation! Otherwise it might have taken 10 times longer to train this.

Now, you can use the trained parameters to classify images from the dataset. To see your predictions on the training and test sets, run the cell below.

predictions_train = predict(train_x, train_y, parameters);

Accuracy: 0.9999999999999998

predictions_test = predict(test_x, test_y, parameters);

Accuracy: 0.72

the prediction function:

def predict(X, y, parameters):
    """
    This function is used to predict the results of an L-layer neural network.

    Arguments:
    X -- data set of examples you would like to label
    y -- true labels for X, of shape (1, number of examples); used only to print the accuracy
    parameters -- parameters of the trained model

    Returns:
    p -- predictions for the given dataset X
    """

    m = X.shape[1]
    n = len(parameters) // 2 # number of layers in the neural network
    p = np.zeros((1,m))

    # Forward propagation
    probas, caches = L_model_forward(X, parameters)


    # convert probas to 0/1 predictions
    for i in range(0, probas.shape[1]):
        if probas[0,i] > 0.5:
            p[0,i] = 1
        else:
            p[0,i] = 0

    print("Accuracy: "  + str(np.sum((p == y)/m)))

    return p
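
The thresholding loop in `predict` can also be written without a Python loop. The sketch below is an alternative, not the course's implementation: the name `threshold_and_score` and the toy inputs are hypothetical, and it takes the probabilities directly instead of running the forward pass.

```python
import numpy as np


def threshold_and_score(probas, y, threshold=0.5):
    """Turn probabilities into 0/1 predictions and compute the accuracy."""
    p = (probas > threshold).astype(float)   # same rule as the loop above
    accuracy = float(np.mean(p == y))        # fraction of matching labels
    return p, accuracy


probas = np.array([[0.9, 0.2, 0.6, 0.4]])    # toy model outputs
y = np.array([[1, 0, 0, 0]])                 # toy true labels
p, accuracy = threshold_and_score(probas, y)
print(p, accuracy)
```

Taking the mean of the boolean matches, rather than summing `(p == y) / m` term by term, also sidesteps the accumulated rounding that likely explains the `0.9999999999999998` printed for the training set.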

Note: You may notice that running the model on fewer iterations (say 1500) gives better accuracy on the test set. This is called “early stopping” and we will talk about it in the next course. Early stopping is a way to prevent overfitting.

Congratulations! It seems that your 2-layer neural network has better performance (72%) than the logistic regression implementation (70%, assignment week 2). Let’s see if you can do even better with an L-layer model.

5. L-layer Neural Network

Question: Use the helper functions you have implemented previously to build an L-layer neural network with the following structure: [LINEAR -> RELU]×(L-1) -> LINEAR -> SIGMOID. The functions you may need and their inputs are:

def initialize_parameters_deep(layer_dims):
    ...
    return parameters 
def L_model_forward(X, parameters):
    ...
    return AL, caches
def compute_cost(AL, Y):
    ...
    return cost
def L_model_backward(AL, Y, caches):
    ...
    return grads
def update_parameters(parameters, grads, learning_rate):
    ...
    return parameters

### CONSTANTS ###
layers_dims = [12288, 20, 7, 5, 1] #  5-layer model
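
`layers_dims` fully determines the shape of every parameter. The following sketch derives those shapes and the total parameter count; it only does arithmetic on the list and assumes the usual convention that layer l has a weight matrix of shape (n_l, n_{l-1}) and a bias of shape (n_l, 1).

```python
layers_dims = [12288, 20, 7, 5, 1]

shapes = {}
for l in range(1, len(layers_dims)):
    shapes["W" + str(l)] = (layers_dims[l], layers_dims[l - 1])
    shapes["b" + str(l)] = (layers_dims[l], 1)

total = sum(rows * cols for rows, cols in shapes.values())

for name, shape in shapes.items():
    print(name, shape)
print("trainable parameters:", total)   # 245973
print("L =", len(shapes) // 2)          # 4
```

Almost all of the parameters sit in `W1` (20 × 12288 = 245,760). The list has five entries including the input layer, while `len(parameters) // 2`, the L that the helper functions loop over, is 4.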

# GRADED FUNCTION: L_layer_model

def L_layer_model(X, Y, layers_dims, learning_rate = 0.0075, num_iterations = 3000, print_cost=False):#lr was 0.009
    """
    Implements a L-layer neural network: [LINEAR->RELU]*(L-1)->LINEAR->SIGMOID.

    Arguments:
    X -- data, numpy array of shape (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 1 if cat, 0 if non-cat), of shape (1, number of examples)
    layers_dims -- list containing the input size and each layer size, of length (number of layers + 1).
    learning_rate -- learning rate of the gradient descent update rule
    num_iterations -- number of iterations of the optimization loop
    print_cost -- if True, it prints the cost every 100 steps

    Returns:
    parameters -- parameters learnt by the model. They can then be used to predict.
    """

    np.random.seed(1)
    costs = []                         # keep track of cost

    # Parameters initialization.
    ### START CODE HERE ###
    parameters = initialize_parameters_deep(layers_dims);
    ### END CODE HERE ###

    # Loop (gradient descent)
    for i in range(0, num_iterations):

        # Forward propagation: [LINEAR -> RELU]*(L-1) -> LINEAR -> SIGMOID.
        ### START CODE HERE ### (≈ 1 line of code)
        AL, caches = L_model_forward(X, parameters);
        ### END CODE HERE ###

        # Compute cost.
        ### START CODE HERE ### (≈ 1 line of code)
        cost = compute_cost(AL, Y);
        ### END CODE HERE ###

        # Backward propagation.
        ### START CODE HERE ### (≈ 1 line of code)
        grads = L_model_backward(AL, Y, caches);
        ### END CODE HERE ###

        # Update parameters.
        ### START CODE HERE ### (≈ 1 line of code)
        parameters = update_parameters(parameters, grads, learning_rate);
        ### END CODE HERE ###

        # Print and record the cost every 100 iterations
        if print_cost and i % 100 == 0:
            print ("Cost after iteration %i: %f" %(i, cost));
            costs.append(cost);

    # plot the cost
    plt.plot(np.squeeze(costs));
    plt.ylabel('cost');
    plt.xlabel('iterations (per hundreds)');
    plt.title("Learning rate =" + str(learning_rate));
    plt.show();

    return parameters;

You will now train the model as a 5-layer neural network.

Run the cell below to train your model. The cost should decrease overall, though not necessarily between every pair of printed values. It may take up to 5 minutes to run 2500 iterations. Check whether the “Cost after iteration 0” matches the expected output below; if not, click the black square button on the upper bar of the notebook to stop the cell and try to find your error.

parameters = L_layer_model(train_x, train_y, layers_dims, num_iterations = 2500, print_cost = True);

Cost after iteration 0: 0.771749
Cost after iteration 100: 0.672053
Cost after iteration 200: 0.648263
Cost after iteration 300: 0.611507
Cost after iteration 400: 0.567047
Cost after iteration 500: 0.540138
Cost after iteration 600: 0.527930
Cost after iteration 700: 0.465477
Cost after iteration 800: 0.369126
Cost after iteration 900: 0.391747
Cost after iteration 1000: 0.315187
Cost after iteration 1100: 0.272700
Cost after iteration 1200: 0.237419
Cost after iteration 1300: 0.199601
Cost after iteration 1400: 0.189263
Cost after iteration 1500: 0.161189
Cost after iteration 1600: 0.148214
Cost after iteration 1700: 0.137775
Cost after iteration 1800: 0.129740
Cost after iteration 1900: 0.121225
Cost after iteration 2000: 0.113821
Cost after iteration 2100: 0.107839
Cost after iteration 2200: 0.102855
Cost after iteration 2300: 0.100897
Cost after iteration 2400: 0.092878
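
Gradient descent with a fixed learning rate does not guarantee that the cost falls at every step. A short sketch over the values printed above (copied from that log; nothing is re-trained here) finds the checkpoints where the cost went up compared with the previous one:

```python
# Costs copied from the training log above, one value per 100 iterations.
costs = [0.771749, 0.672053, 0.648263, 0.611507, 0.567047,
         0.540138, 0.527930, 0.465477, 0.369126, 0.391747,
         0.315187, 0.272700, 0.237419, 0.199601, 0.189263,
         0.161189, 0.148214, 0.137775, 0.129740, 0.121225,
         0.113821, 0.107839, 0.102855, 0.100897, 0.092878]
iterations = list(range(0, 2500, 100))

# Checkpoints whose cost is higher than the one logged 100 iterations earlier.
increases = [it for it, prev, cur in zip(iterations[1:], costs, costs[1:])
             if cur > prev]
print(increases)  # [900]
```

The single bump at iteration 900 is followed by a further decrease, so the overall trend is still downward.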

pred_train = predict(train_x, train_y, parameters);

Accuracy: 0.9856459330143539

pred_test = predict(test_x, test_y, parameters);

Accuracy: 0.8

Congrats! It seems that your 5-layer neural network has better performance than your 2-layer neural network on the same test set.

This is good performance for this task. Nice job!

In the next course, “Improving deep neural networks”, you will learn how to obtain even higher accuracy by systematically searching for better hyperparameters (learning_rate, layers_dims, num_iterations, and others you’ll learn about in that course).

6. Results Analysis

First, let’s take a look at some images the L-layer model labeled incorrectly. This will show a few mislabeled images.

print_mislabeled_images(classes, test_x, test_y, pred_test);

A few types of images the model tends to do poorly on include:

Cat body in an unusual position
Cat appears against a background of a similar color
Unusual cat color and species
Camera Angle
Brightness of the picture
Scale variation (cat is very large or small in image)
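
To inspect such cases yourself, the first step is finding which test examples were misclassified. The sketch below shows one way to get their indices; `mislabeled_indices` is a hypothetical helper (how `print_mislabeled_images` does it internally is not shown in this post), and the toy arrays stand in for `pred_test` and `test_y`.

```python
import numpy as np


def mislabeled_indices(predictions, labels):
    """Return the column indices where predictions and labels disagree.

    Both inputs are expected to have shape (1, number of examples).
    """
    return np.flatnonzero(predictions[0] != labels[0])


pred = np.array([[1.0, 0.0, 1.0, 1.0, 0.0]])   # toy predictions
true = np.array([[1, 1, 1, 0, 0]])             # toy labels
wrong = mislabeled_indices(pred, true)
print(wrong)  # [1 3]
```

Each returned index can then be used to pull the matching column out of `test_x` and reshape it back to (64, 64, 3) for display.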

7. Test with your own image (optional/ungraded exercise)

Congratulations on finishing this assignment. You can use your own image and see the output of your model. To do that:

Click on “File” in the upper bar of this notebook, then click “Open” to go on your Coursera Hub.
Add your image to this Jupyter Notebook’s directory, in the “images” folder
Change your image’s name in the following code
Run the code and check if the algorithm is right (1 = cat, 0 = non-cat)!

## START CODE HERE ##
my_image = "1.png"; # change this to the name of your image file 
my_label_y = [1]; # the true class of your image (1 -> cat, 0 -> non-cat)
## END CODE HERE ##

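fname = "images/" + my_image;
# ndimage.imread and scipy.misc.imresize were removed from recent SciPy releases; PIL (imported above) does the same job.
image = np.array(Image.open(fname).convert("RGB"));   # force 3 channels so the reshape below is valid
my_image = np.array(Image.fromarray(image).resize((num_px, num_px))).reshape((num_px * num_px * 3, 1));
my_predicted_image = predict(my_image, my_label_y, parameters);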

plt.imshow(image);
print ("y = " + str(np.squeeze(my_predicted_image)) + ", your L-layer model predicts a \"" + classes[int(np.squeeze(my_predicted_image)),].decode("utf-8") +  "\" picture.");

Accuracy: 1.0
y = 1.0, your L-layer model predicts a "cat" picture.

[宇宙与天文]在太空采矿,在太空建造 comsci
我们在太空进行工业活动...但是不太可能把太空工业产品又运回到地面上进行加工,而一般是在哪里开采,就在哪里加工,太空的微重力环境,可能会使我们的工业产品的制造尺度非常巨大.... 地球上制造的最大工业机器是超级油轮和航空母舰,再大些就会遇到困难了,但是在空间船坞中,制造的最大工业机器,可能就没
ORACLE中CONSTRAINT的四对属性 daizj oracle CONSTRAINT
ORACLE中CONSTRAINT的四对属性 summary:在data migrate时,某些表的约束总是困扰着我们,让我们的migratet举步维艰,如何利用约束本身的属性来处理这些问题呢?本文详细介绍了约束的四对属性: Deferrable/not deferrable, Deferred/immediate, enalbe/disable, validate/novalidate,以及如
Gradle入门教程 dengkane gradle
一、寻找gradle的历程一开始的时候，我们只有一个工程，所有要用到的jar包都放到工程目录下面，时间长了，工程越来越大，使用到的jar包也越来越多，难以理解jar之间的依赖关系。再后来我们把旧的工程拆分到不同的工程里，靠ide来管理工程之间的依赖关系，各工程下的jar包依赖是杂乱的。一段时间后，我们发现用ide来管理项程很不方便，比如不方便脱离ide自动构建，于是我们写自己的ant脚本。再后
C语言简单循环示例 dcj3sjt126com c
# include <stdio.h> int main(void) { int i; int count = 0; int sum = 0; float avg; for (i=1; i<=100; i++) { if (i%2==0) { count++; sum += i; } } avg
presentModalViewController 的动画效果 dcj3sjt126com controller
系统自带(四种效果)： presentModalViewController模态的动画效果设置： [cpp] view plain copy UIViewController *detailViewController = [[UIViewController al
java 二分查找 shuizhaosi888 二分查找 java二分查找
需求：在排好顺序的一串数字中，找到数字T 一般解法：从左到右扫描数据，其运行花费线性时间O(N)。然而这个算法并没有用到该表已经排序的事实。 /** * * @param array * 顺序数组 * @param t * 要查找对象 * @return */ public stati
Spring Security（07）——缓存UserDetails 234390216 ehcache 缓存 Spring Security
Spring Security提供了一个实现了可以缓存UserDetails的UserDetailsService实现类，CachingUserDetailsService。该类的构造接收一个用于真正加载UserDetails的UserDetailsService实现类。当需要加载UserDetails时，其首先会从缓存中获取，如果缓存中没
Dozer 深层次复制 jayluns VO maven po
最近在做项目上遇到了一些小问题，因为架构在做设计的时候web前段展示用到了vo层，而在后台进行与数据库层操作的时候用到的是Po层。这样在业务层返回vo到控制层，每一次都需要从po-->转化到vo层，用到BeanUtils.copyProperties(source, target)只能复制简单的属性，因为实体类都配置了hibernate那些关联关系，所以它满足不了现在的需求，但后发现还有个很
CSS规范整理（摘自懒人图库） a409435341 html UI css 浏览器
刚没事闲着在网上瞎逛，找了一篇CSS规范整理，粗略看了一下后还蛮有一定的道理，并自问是否有这样的规范，这也是初入前端开发的人一个很好的规范吧。一、文件规范 1、文件均归档至约定的目录中。具体要求通过豆瓣的CSS规范进行讲解：所有的CSS分为两大类：通用类和业务类。通用的CSS文件，放在如下目录中：基本样式库 /css/core
C++动态链接库创建与使用你不认识的休道人 C++dll
一、创建动态链接库 1.新建工程test中选择”MFC [dll]”dll类型选择第二项"Regular DLL With MFC shared linked"，完成 2.在test.h中添加 extern “C” 返回类型 _declspec(dllexport)函数名(参数列表); 3.在test.cpp中最后写 extern “C” 返回类型 _decls
Android代码混淆之ProGuard rensanning ProGuard
Android应用的Java代码，通过反编译apk文件（dex2jar、apktool）很容易得到源代码，所以在release版本的apk中一定要混淆一下一些关键的Java源码。 ProGuard是一个开源的Java代码混淆器（obfuscation）。ADT r8开始它被默认集成到了Android SDK中。官网： http://proguard.sourceforge.net/
程序员在编程中遇到的奇葩弱智问题 tomcat_oracle jquery 编程 ide
　　现在收集一下：　　排名不分先后，按照发言顺序来的。 1、Jquery插件一个通用函数一直报错，尤其是很明显是存在的函数，很有可能就是你没有引入jquery。。。或者版本不对 2、调试半天没变化：不在同一个文件中调试。这个很可怕，我们很多时候会备份好几个项目，改完发现改错了。有个群友说的好：在汤匙
解决maven-dependency-plugin (goals "copy-dependencies","unpack") is not supported xp9802 dependency
解决办法：在plugins之前添加如下pluginManagement，二者前后顺序如下： [html] view plain copy <build> <pluginManagement

Note

Part 1：Building your Deep Neural Network: Step by Step

1. Packages

2. Outline of the Assignment

3. Initialization

3.1 2-layer Neural Network

3.2 L-layer Neural Network

4 Forward propagation module

4.1 Linear Forward

4.2 Linear-Activation Forward

4.3 L-Layer Model

5. Cost function

6. Backward propagation module

6.1 Linear backward

6.2 Linear-Activation backward

6.3 L-Model Backward

6.4 Update Parameters

7. Conclusion

Part 2：Deep Neural Network for Image Classification: Application

1. Packages

2. Dataset

3. Architecture of your model

3.1 2-layer neural network

3.2 L-layer deep neural network

3.3 General methodology

4. Two-layer neural network

5. L-layer Neural Network

6. Results Analysis

7. Test with your own image (optional/ungraded exercise)
