ANSWERS | 2026 UPDATE | WITH COMPLETE
SOLUTION
What does the analogy "AI is the new electricity" refer to?
A: AI is powering personal devices in our homes and offices, similar to
electricity.
B: Through the smart grid, AI is delivering a new wave of electricity.
C: AI runs on computers and is thus powered by electricity, but it is letting
computers do things not possible before.
D: Similar to electricity starting about 100 years ago, AI is transforming multiple
industries. Answer - D
Which of these are reasons for Deep Learning recently taking off? (Check the
three options that apply.)
A: We have access to a lot more computational power.
B: Neural Networks are a brand new field.
C: We have access to a lot more data.
D: Deep learning has resulted in significant improvements in important
applications such as online advertising, speech recognition, and image
recognition. Answer - ACD
Recall this diagram of iterating over different ML ideas. Which of the
statements below are true? (Check all that apply.)
A: Being able to try out ideas quickly allows deep learning engineers to iterate
more quickly.
B: Faster computation can help speed up how long a team takes to iterate to a
good idea.
C: It is faster to train on a big dataset than a small dataset.
D: Recent progress in deep learning algorithms has allowed us to train good
models faster (even without changing the CPU/GPU hardware). Answer - ABD
When an experienced deep learning engineer works on a new problem, they
can usually use insight from previous problems to train a good model on the
first try, without needing to iterate multiple times through different models.
True/False?
A: True
B: False Answer - B
Which one of these plots represents a ReLU activation function? Answer -
The plot that is zero for all negative inputs and rises linearly for positive
inputs. See [relu](https://en.wikipedia.org/wiki/Rectifier_(neural_networks)).
Formula: $$f(x)=\max(0,x)$$
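The formula above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function name `relu` and the sample values are illustrative and not part of the quiz.

```python
import numpy as np

def relu(x):
    """Element-wise ReLU: returns max(0, x) for every entry of x."""
    return np.maximum(0, x)

# Hypothetical sample inputs: negatives map to 0, positives pass through.
values = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
activated = relu(values)
print(activated)
```

Plotting `activated` against `values` would give the characteristic shape: flat at zero on the left, a straight line of slope 1 on the right.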
Images for cat recognition are an example of structured data, because they are
represented as a structured array in a computer. True/False?
A: True
B: False Answer - B
A demographic dataset with statistics on different cities' population, GDP per
capita, and economic growth is an example of unstructured data because it
contains data coming from different sources. True/False?
A: True
B: False Answer - B
Why is an RNN (Recurrent Neural Network) used for machine translation, say
translating English to French? (Check all that apply.)
A: It can be trained as a supervised learning problem.
B: It is strictly more powerful than a Convolutional Neural Network (CNN).
C: It is applicable when the input/output is a sequence (e.g., a sequence of
words).
D: RNNs represent the recurrent process of Idea->Code->Experiment->Idea->....
Answer - AC
In this diagram which we hand-drew in lecture, what do the horizontal axis
(x-axis) and vertical axis (y-axis) represent?
A: • x-axis is the performance of the algorithm
• y-axis (vertical axis) is the amount of data.
B: • x-axis is the amount of data
• y-axis is the size of the model you train.
C: • x-axis is the input to the algorithm
• y-axis is outputs.
D: • x-axis is the amount of data
• y-axis (vertical axis) is the performance of the algorithm. Answer - D
Assuming the trends described in the previous question's figure are accurate
(and hoping you got the axis labels right), which of the following are true?
A: Increasing the training set size generally does not hurt an algorithm's
performance, and it may help significantly.
B: Increasing the size of a neural network generally does not hurt an
algorithm's performance, and it may help significantly.
C: Decreasing the training set size generally does not hurt an algorithm's
performance, and it may help significantly.
D: Decreasing the size of a neural network generally does not hurt an
algorithm's performance, and it may help significantly. Answer - AB
What does a neuron compute?
A: A neuron computes an activation function followed by a linear function ($z =
Wx + b$)
B: A neuron computes a linear function ($z = Wx + b$) followed by an
activation function
C: A neuron computes a function $g$ that scales the input $x$ linearly ($Wx +
b$)
D: A neuron computes the mean of all features before applying the output to
an activation function Answer - B
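The order in answer B (linear step first, activation second) can be sketched as follows. This is a minimal illustration, assuming NumPy and a sigmoid activation; the helper names `sigmoid` and `neuron` and the sample weights are hypothetical choices, not taken from the quiz.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, used here only as an example activation g."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron(W, x, b, g=sigmoid):
    """Compute a = g(z) with z = Wx + b: linear function, then activation."""
    z = W @ x + b   # linear step
    return g(z)     # activation step

# Hypothetical values chosen so that z = 0.5 - 2.0 + 1.0 + 0.5 = 0.
W = np.array([[0.5, -1.0, 2.0]])       # shape (1, 3)
x = np.array([[1.0], [2.0], [0.5]])    # shape (3, 1), one input example
b = 0.5
a = neuron(W, x, b)
print(a)  # sigmoid(0) = 0.5
```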
Which of these is the "Logistic Loss"?
A: $L(\hat{y},y)=|\hat{y}-y|$
B: $L(\hat{y},y)=\max(\hat{y}-y, 0)$
C: $L(\hat{y},y)=|\hat{y}-y|^2$
D: $L(\hat{y},y)=-(y \log(\hat{y})+(1-y) \log(1-\hat{y}))$ Answer - D
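To see how the loss in option D behaves, here is a minimal sketch, assuming NumPy and predictions strictly between 0 and 1 (the logarithm is undefined at the endpoints). The function name `logistic_loss` and the sample predictions are illustrative.

```python
import numpy as np

def logistic_loss(y_hat, y):
    """Logistic loss for a prediction y_hat in (0, 1) and a label y in {0, 1}."""
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

# Hypothetical predictions for a positive example (y = 1).
loss_confident_right = logistic_loss(0.9, 1)   # equals -log(0.9), a small loss
loss_confident_wrong = logistic_loss(0.1, 1)   # equals -log(0.1), a large loss
print(loss_confident_right, loss_confident_wrong)
```

The loss is small when the prediction agrees with the label and grows quickly as the prediction moves toward the wrong class.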
Suppose img is a (32,32,3) array, representing a 32x32 image with 3 color
channels red, green and blue. How do you reshape this into a column vector?
A: x = img.reshape((32*32*3,1))
B: x = img.reshape((32*32,3))
C: x = img.reshape((1,32*32*3))
D: x = img.reshape((3, 32*32)) Answer - A
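The shapes produced by each option can be compared directly. This is a minimal sketch, assuming NumPy and a placeholder all-zeros array standing in for a real image.

```python
import numpy as np

# Placeholder image: 32x32 pixels, 3 color channels (contents are irrelevant here).
img = np.zeros((32, 32, 3))

x = img.reshape((32 * 32 * 3, 1))         # option A: column vector
not_column_b = img.reshape((32 * 32, 3))  # option B: 2-D matrix, not a vector
row_vector = img.reshape((1, 32 * 32 * 3))  # option C: row vector
not_column_d = img.reshape((3, 32 * 32))  # option D: 2-D matrix, not a vector

print(x.shape, not_column_b.shape, row_vector.shape, not_column_d.shape)
```

Only option A yields shape `(3072, 1)`, which is a column vector with one row per pixel value.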