# Week 4 Quiz - Key Concepts on Deep Neural Networks

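1. What is the "cache" used for in the implementation of forward propagation and backward propagation?

• [ ] It is used to cache the intermediate values of the cost function during training.
• [x] We use it to pass variables computed during forward propagation to the corresponding backward propagation step. It contains useful values for backward propagation to compute derivatives.
• [ ] It is used to keep track of the hyperparameters that we are searching over, to speed up computation.
• [ ] We use it to pass variables computed during backward propagation to the corresponding forward propagation step. It contains useful values for forward propagation to compute activations.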

The "cache" records values from the forward propagation units and sends them to the backward propagation units, because they are needed to compute the chain-rule derivatives.

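2. Which of the following are "hyperparameters"?

• [x] Size of the hidden layers $n^{[l]}$
• [x] Learning rate $\alpha$
• [x] Number of iterations
• [x] Number of layers $L$ in the neural network

Blogger's note: I only list the correct options.
Note: You can check this Quora post or this blog post.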

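3. Which of the following statements is true?

• [x] The deeper layers of a neural network typically compute more complex features of the input than the earlier layers.
• [ ] The earlier layers of a neural network typically compute more complex features of the input than the deeper layers.

Note: You can check the lecture videos. I think Andrew Ng used a CNN (convolutional neural network) example to explain this.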

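4. Vectorization allows you to compute forward propagation in an L-layer neural network without an explicit for-loop (or any other explicit iterative loop) over the layers l = 1, 2, …, L. True/False?

• [ ] True
• [x] False

Note: We cannot avoid the for-loop over the layers.
Blogger's note: Think about the number-of-iterations parameter you pass in: how is it implemented inside the model?

5. Assume we store the values of $n^{[l]}$ in an array called layer_dims, as follows: layer_dims = [n_x, 4, 3, 2, 1]. So layer 1 has four hidden units, layer 2 has three hidden units, and so on. Which for-loop can you use to initialize the model parameters?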

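    # assumes numpy is imported as np and parameter is an empty dict
    for i in range(1, len(layer_dims)):
        parameter['W' + str(i)] = np.random.randn(layer_dims[i], layer_dims[i - 1]) * 0.01
        parameter['b' + str(i)] = np.random.randn(layer_dims[i], 1) * 0.01

6. Which of the following statements about this neural network is correct?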

• [x] The number of layers $L$ is 4. The number of hidden layers is 3.

Note: The input layer ($L^{[0]}$) does not count.

As seen in lecture, the number of layers is counted as the number of hidden layers + 1. The input and output layers are not counted as hidden layers.

7. During forward propagation, the forward function for layer $l$ needs to know which activation function layer $l$ uses (Sigmoid, tanh, ReLU, etc.). During backpropagation, the corresponding backward function also needs to know the activation function for layer $l$, since the gradient depends on it. True/False?

• [x] True
• [ ] False

During backpropagation you need to know which activation was used in the forward propagation to be able to compute the correct derivative.

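8. There are certain functions with the following properties:

(i) To compute the function using a shallow network circuit, you need a large network (where size is measured by the number of logic gates in the network), but (ii) to compute it using a deep network circuit, you need only an exponentially smaller network. True/False?

• [x] True
• [ ] False

Note: See the lectures, where exactly this question was covered.
Blogger's note: I did not fully understand this question and translated it by machine; you can read the English original below.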

9. Consider the following neural network with 2 hidden layers. Which of the following statements are true?

• [x] $W^{[1]}$ has shape (4, 4)
• [x] $b^{[1]}$ has shape (4, 1)
• [x] $W^{[2]}$ has shape (3, 4)
• [x] $b^{[2]}$ has shape (3, 1)
• [x] $b^{[3]}$ has shape (1, 1)
• [x] $W^{[3]}$ has shape (1, 3)

Note: See the image.
Blogger's note: I could not find the image.

10. Whereas the previous question used a specific network, in the general case what is the dimension of $W^{[l]}$, the weight matrix associated with layer $l$?

• [x] $W^{[l]}$ has shape ($n^{[l]}$, $n^{[l-1]}$)

Note: See the image.

## Week 4 Quiz - Key concepts on Deep Neural Networks

1. What is the “cache” used for in our implementation of forward propagation and backward propagation?

• [ ] It is used to cache the intermediate values of the cost function during training.
• [x] We use it to pass variables computed during forward propagation to the corresponding backward propagation step. It contains useful values for backward propagation to compute derivatives.
• [ ] It is used to keep track of the hyperparameters that we are searching over, to speed up computation.
• [ ] We use it to pass variables computed during backward propagation to the corresponding forward propagation step. It contains useful values for forward propagation to compute activations.

The "cache" records values from the forward propagation units and sends them to the backward propagation units, because they are needed to compute the chain-rule derivatives.
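As a rough illustration of this idea, here is a minimal sketch of one ReLU layer in NumPy. The function names `layer_forward` and `layer_backward`, the choice of ReLU, and the contents of the cache tuple are assumptions made for this example, not the course's exact code.

```python
import numpy as np

def layer_forward(A_prev, W, b):
    """Forward step for one ReLU layer; returns the activation and a cache."""
    Z = W @ A_prev + b
    A = np.maximum(0, Z)
    cache = (A_prev, W, Z)  # values the backward step will need later
    return A, cache

def layer_backward(dA, cache):
    """Backward step for the same layer, using only dA and the cache."""
    A_prev, W, Z = cache
    m = A_prev.shape[1]                      # number of examples
    dZ = dA * (Z > 0)                        # ReLU derivative needs the cached Z
    dW = (dZ @ A_prev.T) / m                 # needs the cached A_prev
    db = dZ.sum(axis=1, keepdims=True) / m
    dA_prev = W.T @ dZ                       # needs the cached W
    return dA_prev, dW, db

rng = np.random.default_rng(0)
A0 = rng.standard_normal((3, 5))             # 3 input features, 5 examples
W1 = rng.standard_normal((4, 3)) * 0.01
b1 = np.zeros((4, 1))

A1, cache1 = layer_forward(A0, W1, b1)
dA0, dW1, db1 = layer_backward(np.ones_like(A1), cache1)
```

Each gradient comes out with the same shape as the quantity it belongs to, which is a quick sanity check when wiring the cache through.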

2. Among the following, which ones are “hyperparameters”? (Check all that apply.) I only list correct options.

• size of the hidden layers $n^{[l]}$
• learning rate α
• number of iterations
• number of layers L in the neural network

Note: You can check this Quora post or this blog post.

3. Which of the following statements is true?

• [x] The deeper layers of a neural network are typically computing more complex features of the input than the earlier layers.
• [ ] The earlier layers of a neural network are typically computing more complex features of the input than the deeper layers.

Note: You can check the lecture videos. I think Andrew used a CNN example to explain this.

4. Vectorization allows you to compute forward propagation in an L-layer neural network without an explicit for-loop (or any other explicit iterative loop) over the layers l=1, 2, …,L. True/False?

• [ ] True
• [x] False

Note: We cannot avoid the for-loop iteration over the computations among layers.
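To make this concrete, here is a small sketch of forward propagation over the layers. Vectorization removes the loop over training examples (each layer is a single matrix product), but the loop over `l` remains. The function name `forward`, the ReLU/sigmoid choice, and the example sizes are assumptions for illustration only.

```python
import numpy as np

def forward(X, parameters):
    """Forward pass: ReLU in hidden layers, sigmoid in the output layer."""
    L = len(parameters) // 2          # one W and one b per layer
    A = X
    for l in range(1, L + 1):         # explicit loop over the layers
        Z = parameters["W" + str(l)] @ A + parameters["b" + str(l)]
        A = 1 / (1 + np.exp(-Z)) if l == L else np.maximum(0, Z)
    return A

layer_dims = [5, 4, 3, 1]             # hypothetical sizes, n_x = 5
rng = np.random.default_rng(0)
parameters = {}
for l in range(1, len(layer_dims)):
    parameters["W" + str(l)] = rng.standard_normal((layer_dims[l], layer_dims[l - 1])) * 0.01
    parameters["b" + str(l)] = np.zeros((layer_dims[l], 1))

X = rng.standard_normal((5, 7))       # 7 examples processed at once, no loop over them
AL = forward(X, parameters)
```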

5. Assume we store the values for $n^{[l]}$ in an array called layer_dims, as follows: layer_dims = [n_x, 4, 3, 2, 1]. So layer 1 has four hidden units, layer 2 has three hidden units, and so on. Which of the following for-loops will allow you to initialize the parameters for the model?

    # assumes numpy is imported as np and parameter is an empty dict
    for i in range(1, len(layer_dims)):
        parameter['W' + str(i)] = np.random.randn(layer_dims[i], layer_dims[i - 1]) * 0.01
        parameter['b' + str(i)] = np.random.randn(layer_dims[i], 1) * 0.01

6. Consider the following neural network.

• The number of layers L is 4. The number of hidden layers is 3.

Note: The input layer ($L^{[0]}$) does not count.

As seen in lecture, the number of layers is counted as the number of hidden layers + 1. The input and output layers are not counted as hidden layers.

7. During forward propagation, the forward function for layer $l$ needs to know which activation function that layer uses (Sigmoid, tanh, ReLU, etc.). During backpropagation, the corresponding backward function also needs to know the activation function for layer $l$, since the gradient depends on it. True/False?

• [x] True
• [ ] False

During backpropagation you need to know which activation was used in the forward propagation to be able to compute the correct derivative.
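The dependence on the activation can be seen in a short sketch: the step from dA to dZ multiplies by the derivative of whichever activation was used in the forward pass. The helper name `activation_backward` and its string argument are hypothetical, chosen for this example.

```python
import numpy as np

def activation_backward(dA, Z, activation):
    """Return dZ = dA * g'(Z) for the activation g used in the forward pass."""
    if activation == "sigmoid":
        s = 1 / (1 + np.exp(-Z))
        return dA * s * (1 - s)
    if activation == "tanh":
        return dA * (1 - np.tanh(Z) ** 2)
    if activation == "relu":
        return dA * (Z > 0)
    raise ValueError("unknown activation: " + activation)

Z = np.array([[-1.0, 0.0, 2.0]])
dA = np.ones_like(Z)

dZ_relu = activation_backward(dA, Z, "relu")
dZ_sigmoid = activation_backward(dA, Z, "sigmoid")
dZ_tanh = activation_backward(dA, Z, "tanh")
```

The same dA and Z give three different dZ arrays, so using the wrong activation in the backward function would silently produce wrong gradients.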

8. There are certain functions with the following properties:

(i) To compute the function using a shallow network circuit, you will need a large network (where we measure size by the number of logic gates in the network), but (ii) To compute it using a deep network circuit, you need only an exponentially smaller network. True/False?

• [x] True
• [ ] False

Note: See the lectures, where exactly this idea was explained.
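One function commonly used to illustrate this is the parity (XOR) of n input bits; treating it as the lecture's example is an assumption here, since the note above does not name it. The sketch below builds a tree of two-input XOR gates and counts gates and depth. The comparison figure of roughly 2^(n-1) units for a one-hidden-layer network is quoted as a rough order of magnitude, not derived by the code.

```python
def xor_tree(bits):
    """Compute parity with a tree of 2-input XOR gates.

    Returns (parity, gates_used, depth).
    """
    level = list(bits)
    gates = 0
    depth = 0
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):
            nxt.append(level[i] ^ level[i + 1])  # one XOR gate per pair
            gates += 1
        if len(level) % 2 == 1:                  # odd element passes through
            nxt.append(level[-1])
        level = nxt
        depth += 1
    return level[0], gates, depth

bits = [1, 0, 1, 1, 0, 1, 0, 0]
parity, gates, depth = xor_tree(bits)

n = len(bits)
rough_shallow_units = 2 ** (n - 1)  # rough figure for a single hidden layer (assumption)
```

For 8 inputs the deep circuit uses 7 gates at depth 3, while the rough shallow figure is 128, and the gap widens exponentially as n grows.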

9. Consider the following 2 hidden layer neural network:

Which of the following statements are True? (Check all that apply).

• $W^{[1]}$ will have shape (4, 4)
• $b^{[1]}$ will have shape (4, 1)
• $W^{[2]}$ will have shape (3, 4)
• $b^{[2]}$ will have shape (3, 1)
• $b^{[3]}$ will have shape (1, 1)
• $W^{[3]}$ will have shape (1, 3)

Note: See this image for general formulas.
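Since the image is not reproduced here, the shapes can also be checked numerically. The sketch below assumes the network has layer sizes [4, 4, 3, 1] (four inputs, hidden layers of 4 and 3 units, one output), which is what the listed shapes imply; the variable names are illustrative.

```python
import numpy as np

layer_dims = [4, 4, 3, 1]  # assumed from the shapes listed above
rng = np.random.default_rng(1)

parameters = {}
for l in range(1, len(layer_dims)):
    parameters["W" + str(l)] = rng.standard_normal((layer_dims[l], layer_dims[l - 1])) * 0.01
    parameters["b" + str(l)] = np.zeros((layer_dims[l], 1))

# Collect the shape of every parameter for inspection
shapes = {name: value.shape for name, value in parameters.items()}
for name in sorted(shapes):
    print(name, shapes[name])
```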

10. Whereas the previous question used a specific network, in the general case what is the dimension of $W^{[l]}$, the weight matrix associated with layer $l$?

• $W^{[l]}$ has shape ($n^{[l]}$, $n^{[l-1]}$)

Note: See this image for general formulas.
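The general shape follows from the layer's linear step. As a brief sketch, write $m$ for the number of training examples stacked as columns (a symbol introduced here for illustration) and require the matrix product to be well defined:

```latex
Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}
\qquad
\underbrace{(n^{[l]},\, m)}_{Z^{[l]}}
= \underbrace{(n^{[l]},\, n^{[l-1]})}_{W^{[l]}}
\cdot \underbrace{(n^{[l-1]},\, m)}_{A^{[l-1]}}
+ \underbrace{(n^{[l]},\, 1)}_{b^{[l]}}
```

The inner dimension $n^{[l-1]}$ must match the rows of $A^{[l-1]}$, and $b^{[l]}$ is broadcast across the $m$ columns.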
