Correct Answers 2023-2024 Update Graded A++
1. PCA identifies the axis that accounts for the __________ in the training set. -
ANSWER largest amount of variance
2. What is the standard matrix factorization technique that can decompose the
training set matrix into a dot product of 3 matrices, with the last of these matrices
containing the principal components? - ANSWER Singular Value Decomposition
(SVD)
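A minimal sketch of how the principal components can be read off an SVD, assuming NumPy is available. The toy data set and all variable names are illustrative and not part of the original questions.

```python
import numpy as np

# Toy 2-D training set whose points lie close to the line y = x (illustrative data).
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t, t + 0.05 * rng.normal(size=200)])

# PCA assumes the data is centered around the origin.
X_centered = X - X.mean(axis=0)

# SVD factors X_centered into U @ diag(s) @ Vt.
# The rows of Vt (the last of the three matrices) are the principal components.
U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)
first_pc = Vt[0]

# Fraction of the total variance that lies along each component.
explained_variance_ratio = s**2 / np.sum(s**2)
```

Here the first principal component points roughly along the y = x direction, which is the axis that preserves most of the variance in this toy set.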
3. True or False: Numerical differentiation has high accuracy. - ANSWER
False, it has low accuracy, although it is trivial to implement
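To see why the accuracy is limited, here is a small sketch of a forward-difference approximation compared with a hand-derived derivative. The test function, the step size eps, and the helper names are assumptions chosen for illustration.

```python
import math

def numerical_derivative(f, x, eps=1e-6):
    """Forward-difference approximation of f'(x)."""
    return (f(x + eps) - f(x)) / eps

def f(x):
    return x**2 * math.sin(x)

def exact_derivative(x):
    # d/dx [x^2 sin x] = 2x sin x + x^2 cos x
    return 2 * x * math.sin(x) + x**2 * math.cos(x)

x0 = 1.5
approx = numerical_derivative(f, x0)
exact = exact_derivative(x0)
error = abs(approx - exact)  # small but nonzero: the result is only an approximation
```

The approximation is only a few lines long, but it never matches the exact derivative, and it needs at least one extra call to f per input dimension.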
4. What method is called to restore a previously saved model? - ANSWER
restore()
5. What is the derivative of the step function at 0? - ANSWER undefined
6. What is the output range for a hyperbolic tangent activation function (tanh(z) =
2σ(2z) - 1)? - ANSWER -1 to 1
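The relationship between tanh and the logistic function σ can be checked numerically. This is a small sketch using only the standard library; the sample points are arbitrary.

```python
import math

def sigmoid(z):
    """Logistic function, with outputs in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def tanh_via_sigmoid(z):
    """tanh expressed through the logistic function: 2 * sigmoid(2z) - 1."""
    return 2.0 * sigmoid(2.0 * z) - 1.0

samples = [-5.0, -1.0, 0.0, 0.5, 3.0]
values = [tanh_via_sigmoid(z) for z in samples]
```

Because σ outputs values between 0 and 1, scaling by 2 and subtracting 1 yields values between -1 and 1.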
7. The ________ activation function involves α (hyperparameter that determines how
much the function "leaks") being learned during training instead of being a
hyperparameter. - ANSWER parametric leaky ReLU
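A brief sketch of the difference, with hypothetical helper names: in a leaky ReLU the slope alpha is fixed in advance, while a parametric leaky ReLU treats alpha as a parameter and updates it from the gradient shown below.

```python
def leaky_relu(z, alpha=0.01):
    """Leaky ReLU: identity for positive inputs, slope alpha for negative inputs."""
    return z if z > 0 else alpha * z

def leaky_relu_grad_alpha(z):
    """Derivative of the output with respect to alpha.

    It is nonzero only for negative inputs, which is where a parametric
    leaky ReLU gets the signal it needs to learn alpha during training.
    """
    return 0.0 if z > 0 else z

positive_output = leaky_relu(2.0, alpha=0.2)
negative_output = leaky_relu(-2.0, alpha=0.2)
```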
8. The _______ regularization technique involves each neuron having a probability p of
being temporarily "dropped" at every training step. - ANSWER Dropout
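A minimal sketch of the idea in plain Python. It assumes the common "inverted" variant, in which surviving activations are divided by the keep probability during training so that nothing needs rescaling afterwards; the function name and parameters are illustrative.

```python
import random

def dropout(activations, p, rng, training=True):
    """Drop each activation with probability p (inverted-dropout variant, an assumption)."""
    if not training:
        # At inference time every neuron stays active.
        return list(activations)
    keep_prob = 1.0 - p
    # Survivors are rescaled by 1 / keep_prob so the expected sum is unchanged.
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]

rng = random.Random(42)
activations = [1.0] * 10000
dropped = dropout(activations, p=0.5, rng=rng)
fraction_dropped = dropped.count(0.0) / len(dropped)
```

With p = 0.5, roughly half of the activations are zeroed on each call, and a different random subset is dropped every time.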
9. What function is NOT used to make a pooling layer for a CNN?
A) max_pool()
B) avg_pool()
C) min_pool()
D) all of these are used - ANSWER C) min_pool()
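For intuition about what max and average pooling compute, here is a plain-Python sketch over non-overlapping 2x2 windows. It does not use the TensorFlow functions named above; pool2d and mean are hypothetical helpers.

```python
def pool2d(image, size, reduce_fn):
    """Apply reduce_fn to each non-overlapping size x size window of a 2-D list."""
    rows, cols = len(image), len(image[0])
    return [
        [
            reduce_fn([image[r + i][c + j] for i in range(size) for j in range(size)])
            for c in range(0, cols - size + 1, size)
        ]
        for r in range(0, rows - size + 1, size)
    ]

def mean(values):
    return sum(values) / len(values)

image = [
    [1, 2, 5, 6],
    [3, 4, 7, 8],
    [9, 10, 13, 14],
    [11, 12, 15, 16],
]
max_pooled = pool2d(image, 2, max)   # [[4, 8], [12, 16]]
avg_pooled = pool2d(image, 2, mean)  # [[2.5, 6.5], [10.5, 14.5]]
```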
10. True or False: Autoencoders typically have a symmetrical neural network structure.
- ANSWER True
11. True or False: CNNs are able to generalize much better than DNNs for image
processing tasks such as classification, using fewer training examples. - ANSWER
True
12. A ___________________ normalization layer makes the neurons that most
strongly activate inhibit neurons at the same location but in neighboring feature
maps. - ANSWER local response
13. In Scikit-Learn's GridSearchCV, all you need to do is tell it which
________________ you want it to experiment with, and what values to try out, and it
will evaluate all the possible combinations of _______________ values, using
cross-validation. - ANSWER hyperparameters, hyperparameter
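The "all possible combinations" part can be illustrated without calling Scikit-Learn at all. The sketch below only enumerates a grid the way an exhaustive search would; the hyperparameter names, values, and fold count are hypothetical.

```python
from itertools import product

# Hypothetical grid: hyperparameter names mapped to the values to try.
param_grid = {
    "n_estimators": [3, 10, 30],
    "max_features": [2, 4, 6, 8],
}

# Every combination of values: 3 * 4 = 12 candidate settings.
names = list(param_grid)
combinations = [dict(zip(names, values)) for values in product(*param_grid.values())]

# With k-fold cross-validation, each candidate is trained and scored k times.
n_folds = 5
total_fits = len(combinations) * n_folds
```

This is why the cost of a grid search grows quickly: each additional hyperparameter multiplies the number of candidates, and each candidate is fitted once per fold.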
14. A CNN is faster to train than a DNN for which of the following reasons?
A) Partially connected layers.
B) Reusability of weights.
C) Both A and B. - ANSWER C) Both A and B
15. What type of TensorFlow optimizer may be used to compute an optimal
gradient?
A) LossOptimizer()
B) GradientDescentOptimizer()
C) ErrorOptimizer()
D) SquaredErrorOptimizer() - ANSWER B) GradientDescentOptimizer()
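16. How is a perceptron trained? - ANSWER Perceptrons are trained with an
algorithm that takes into account the error made by the network. Connections that
lead to the wrong output are not reinforced; the weights of connections from inputs
that would have contributed to the correct output are strengthened instead.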
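A minimal sketch of this error-driven update, using the classic perceptron learning rule on a toy AND problem. The step threshold at 0, the learning rate, the epoch count, and the data set are assumptions chosen for illustration.

```python
def step(z):
    """Heaviside step function: the perceptron outputs a hard 0 or 1."""
    return 1 if z >= 0 else 0

def predict(weights, bias, x):
    return step(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train_perceptron(samples, labels, learning_rate=1.0, epochs=10):
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            # error is 0 when the prediction is right, so nothing changes;
            # otherwise the weights of the active inputs move toward the target.
            error = target - predict(weights, bias, x)
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias

# Logical AND is linearly separable, so the rule converges on it.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]
weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
```

Only misclassified instances change the weights, which is the sense in which the error made by the network drives training.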
17. How does TensorFlow treat dependencies in graph nodes? - ANSWER When a node
is evaluated, TensorFlow first evaluates the nodes it depends on. It does not reuse
results across graph runs, so it will recompute the value of a node even if it has
evaluated it in a previous run.
18. Building an RNN using __________ instead of __________ offers several
advantages. (State the function names.) - ANSWER dynamic_rnn(), static_rnn()
19. Which of the following is an example of a vector-to-sequence RNN?
A) Locating pedestrians in a picture.
B) Speech to text.
C) Video captioning.
D) A and C
E) B and C - ANSWER A) Locating pedestrians in a picture.
20. True or False: A classical Perceptron is able to estimate class probabilities. -
ANSWER False
21. How do you clear all nodes from the default graph?
A) Restart the kernel/shell.
B) Call tf.get_default_graph().clear()
C) Call tf.reset_default_graph()
D) A & B