Practice quiz: Neural Network Training
Question 1: Here is some code that you saw in the lecture:
model.compile(loss=BinaryCrossentropy())
For which type of task would you use the binary cross entropy loss function?
A classification task that has 3 or more classes (categories)
[Correct] Binary classification (classification with exactly 2 classes)
Regression tasks (tasks that predict a number)
BinaryCrossentropy() should not be used for any task.
[Explanation] Yes! Binary cross entropy, which we've also referred to as logistic loss, is used for classifying between two classes (two categories).
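To make the logistic loss concrete, here is a minimal pure-Python sketch for a single example. The function name `binary_cross_entropy` and the clipping constant `eps` are illustrative assumptions, not part of the lecture code or of TensorFlow.

```python
import math

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Logistic loss for one example; y_true is 0 or 1, y_pred is a probability."""
    # Clip the prediction so log() never receives exactly 0 (assumed safeguard).
    p = min(max(y_pred, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

# The true label is 1 in both cases; only the predicted probability differs.
confident_right = binary_cross_entropy(1, 0.9)
confident_wrong = binary_cross_entropy(1, 0.1)
print(confident_right, confident_wrong)
```

The loss is small when the predicted probability agrees with the label and grows quickly when it does not.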
Question 2: Here is code that you saw in the lecture:
model = Sequential([
    Dense(units=25, activation='sigmoid'),
    Dense(units=15, activation='sigmoid'),
    Dense(units=1, activation='sigmoid')
])
model.compile(loss=BinaryCrossentropy())
model.fit(X,y,epochs=100)
Which line of code updates the network parameters in order to reduce the cost?
[Correct] model.fit(X,y,epochs=100)
None of the above -- this code does not update the network parameters.
model = Sequential([...])
model.compile(loss=BinaryCrossentropy())
[Explanation] Yes! The third step of model training is to train the model on data in order to minimize the loss (and the cost).
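As a rough illustration of what a fitting step does conceptually, the sketch below runs plain gradient descent on a single sigmoid unit and checks that the cost goes down. This is not how TensorFlow implements `model.fit`; the helper names (`cost`, `fit`), the toy data, and the learning rate are all assumptions made for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cost(w, b, xs, ys):
    """Average binary cross entropy over the training set."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(xs)

def fit(xs, ys, epochs=100, lr=0.5):
    """Update w and b with gradient descent; one full pass over the data per epoch."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        dw = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        db = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= lr * dw  # parameter update that reduces the cost
        b -= lr * db
    return w, b

xs = [0.0, 1.0, 2.0, 3.0]  # toy inputs (hypothetical)
ys = [0, 0, 1, 1]          # toy binary labels (hypothetical)
cost_before = cost(0.0, 0.0, xs, ys)
w, b = fit(xs, ys, epochs=100)
cost_after = cost(w, b, xs, ys)
print(cost_before, cost_after)
```

Defining the model and choosing the loss do not change any parameters; only the loop inside `fit` does.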
Practice quiz: Activation Functions
Question 1: Which of the following activation functions is the most common choice for the hidden layers of a neural network?
[Correct] ReLU (rectified linear unit)
Sigmoid
Most hidden layers do not use any activation function
Linear
[Explanation] Yes! A ReLU is most often used because it is faster to train than the sigmoid. This is because the ReLU is flat on only one side (the left side), whereas the sigmoid goes flat (horizontal, slope approaching zero) on both sides of the curve.
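The "flat on one side versus flat on both sides" point can be checked numerically. The following sketch, with helper names chosen for illustration, evaluates the slope of each activation at a few inputs.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_slope(z):
    # Derivative of the sigmoid: s * (1 - s)
    s = sigmoid(z)
    return s * (1.0 - s)

def relu(z):
    return max(0.0, z)

def relu_slope(z):
    # Slope is 1 for positive inputs and 0 otherwise
    # (the value at exactly 0 is a convention; 0 is assumed here).
    return 1.0 if z > 0 else 0.0

for z in (-10.0, 0.5, 10.0):
    print(z, sigmoid_slope(z), relu_slope(z))
```

The sigmoid slope is close to zero at both -10 and 10, while the ReLU slope stays at 1 for any positive input, which is what keeps gradient descent from slowing down on that side.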
Question 2: For the task of predicting housing prices, which activation functions could you choose for the output layer? Choose the 2 options that apply.
[Correct] ReLU
[Explanation] Yes! ReLU outputs values 0 or greater, and housing prices are positive values.
[Correct] Linear
[Explanation] Yes! A linear activation function can be used for a regression task where the output can be both negative and positive, but it's also possible to use it for a task where the output is 0 or greater (like with house prices).
Sigmoid
Question 3: True/False? A neural network with many layers but no activation function (in the hidden layers) is not effective; that’s why we should instead use the linear activation function in every hidden layer.
True
[Correct] False
[Explanation] Yes! A neural network with many layers but no activation function is not effective. A linear activation is the same as "no activation function", so switching to it in every hidden layer changes nothing.
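A small sketch of why stacked linear layers add nothing: two single-unit linear layers compose into one linear function. The weights below are arbitrary illustrative values, and scalars are used instead of matrices to keep the example short.

```python
def linear_layer(x, w, b):
    """A single unit with a linear (identity) activation."""
    return w * x + b

# Hypothetical parameters for two stacked linear layers.
w1, b1 = 2.0, 1.0
w2, b2 = -3.0, 0.5

def two_layer(x):
    return linear_layer(linear_layer(x, w1, b1), w2, b2)

# w2 * (w1 * x + b1) + b2 = (w2 * w1) * x + (w2 * b1 + b2)
w_eq = w2 * w1
b_eq = w2 * b1 + b2

def one_layer(x):
    return linear_layer(x, w_eq, b_eq)

for x in (-2.0, 0.0, 3.5):
    print(x, two_layer(x), one_layer(x))
```

Both functions return the same value for every input, so the deeper network can represent nothing beyond what a single linear layer already can.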
Practice quiz: Multiclass Classification
Question 1: For a multiclass classification task that has 4 possible outputs, the sum of all the activations adds up to 1. For a multiclass classification task that has 3 possible outputs, the sum of all the activations should add up to …
It will vary, depending on the input x.
[Correct] 1
More than 1
Less than 1
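The sum-to-one property follows from how softmax normalizes its outputs. Here is a minimal sketch with arbitrary example inputs; subtracting the maximum before exponentiating is an assumed safeguard against overflow and does not change the result.

```python
import math

def softmax(z):
    """Turn a list of scores into activations that sum to 1."""
    m = max(z)  # shift by the maximum for numerical safety
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

three_class = softmax([1.0, 2.0, 3.0])
four_class = softmax([0.5, -1.0, 2.0, 0.0])
print(sum(three_class), sum(four_class))
```

Whatever the number of classes or the input values, each activation is divided by the same total, so the activations always sum to 1 (up to floating-point rounding).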
Question 2: For multiclass classification, the cross entropy loss is used for training the model. If there are 4 possible classes for the output, and for a particular training example, the true class of the example is class 3 (y=3), then what does the cross entropy loss simplify to? [Hint: This loss should get smaller when a_3 gets larger.]
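The hint can be checked with a short sketch. It assumes the standard cross entropy form, in which the loss for an example is the negative log of the activation of its true class; the function name and the activation values are illustrative.

```python
import math

def cross_entropy_loss(activations, y):
    """activations: softmax outputs a_1..a_N; y: true class, 1-based as in the quiz."""
    return -math.log(activations[y - 1])

# Two hypothetical sets of softmax outputs for 4 classes, true class y = 3.
loss_low_a3 = cross_entropy_loss([0.4, 0.3, 0.2, 0.1], y=3)
loss_high_a3 = cross_entropy_loss([0.05, 0.05, 0.85, 0.05], y=3)
print(loss_low_a3, loss_high_a3)
```

Only the activation of the true class enters the loss, and the loss shrinks toward 0 as that activation approaches 1.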
Question 3: For multiclass classification, the recommended way to implement softmax regression is to set from_logits=True in the loss function, and also to define the model's output layer with…
[Correct] a 'linear' activation
a 'softmax' activation
[Explanation] Yes! Set the output as linear, because the loss function handles the calculation of the softmax with a more numerically stable method.
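To see why computing the loss directly from the raw outputs (logits) can be more stable, the sketch below compares a naive "softmax, then log" calculation with a log-sum-exp formulation. This illustrates the general idea only and is not TensorFlow's actual implementation; the function names and logit values are assumptions.

```python
import math

def naive_loss(z, y_index):
    """Softmax first, then -log of the true-class activation."""
    exps = [math.exp(v) for v in z]  # can overflow for large logits
    return -math.log(exps[y_index] / sum(exps))

def loss_from_logits(z, y_index):
    """Same loss rearranged as log(sum(exp(z))) - z[y], with a max shift."""
    m = max(z)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in z))
    return log_sum_exp - z[y_index]

logits = [1000.0, 0.0, -1000.0]  # deliberately extreme values
stable = loss_from_logits(logits, 0)
try:
    naive = naive_loss(logits, 0)
except OverflowError:
    naive = None  # math.exp(1000.0) exceeds the float range
print(stable, naive)
```

For moderate logits the two functions agree; for extreme ones only the rearranged version returns a usable number.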
Practice quiz: Additional Neural Network Concepts
Question 1: The Adam optimizer is the recommended optimizer for finding the optimal parameters of the model. How do you use the Adam optimizer in TensorFlow?
The call to model.compile() will automatically pick the best optimizer, whether it is gradient descent, Adam or something else. So there’s no need to pick an optimizer manually.
[Correct] When calling model.compile, set optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3).
The call to model.compile() uses the Adam optimizer by default.
The Adam optimizer works only with Softmax outputs. So if a neural network has a Softmax output layer, TensorFlow will automatically pick the Adam optimizer.
[Explanation] Correct. Set the optimizer to Adam.
Question 2: The lecture covered a different layer type in which each neuron does not look at all the values of the input vector that is fed into that layer. What is the name of the layer type discussed in the lecture?
1D layer or 2D layer (depending on the input dimension)
[Correct] Convolutional layer
A fully connected layer
Image layer
[Explanation] Correct. For a convolutional layer, each neuron takes as input a subset of the vector that is fed into that layer.
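The "subset of the input" idea can be sketched with a tiny 1D convolution in pure Python. The function name `conv1d`, the window size, and the weights are illustrative assumptions; a real layer would also learn its weights and typically apply an activation.

```python
def conv1d(inputs, kernel, bias=0.0):
    """Each output unit sees only a window of len(kernel) consecutive inputs."""
    k = len(kernel)
    outputs = []
    for start in range(len(inputs) - k + 1):
        window = inputs[start:start + k]  # the subset this unit looks at
        outputs.append(sum(w * x for w, x in zip(kernel, window)) + bias)
    return outputs

signal = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [0.5, 0.5]  # every unit shares the same two weights
result = conv1d(signal, kernel)
print(result)  # [1.5, 2.5, 3.5, 4.5]
```

A fully connected layer would instead give every unit its own weight for all five inputs.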