Convolution Features - Answers edges
colors
textures
motifs (corners, shapes)
Receptive field - Answers A region of an image (image patch) from which the node receives
input. Usually denoted by a K1 x K2 matrix.
Convolution vs Cross-correlation - Answers Convolution: flip the kernel (rotate it 180 degrees) and
take the dot product with the image patch
Cross-correlation: take the dot product with the image patch without flipping the kernel
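A minimal sketch of the difference on a single patch, in plain Python. The function names (flip180, patch_dot, cross_correlate_patch, convolve_patch) and the sample values are illustrative assumptions, not part of the original notes.

```python
def flip180(kernel):
    """Rotate a 2D kernel by 180 degrees (reverse the rows, then each row)."""
    return [row[::-1] for row in kernel[::-1]]


def patch_dot(patch, kernel):
    """Sum of element-wise products of a patch and a kernel of the same size."""
    return sum(
        p * k
        for patch_row, kernel_row in zip(patch, kernel)
        for p, k in zip(patch_row, kernel_row)
    )


def cross_correlate_patch(patch, kernel):
    """Cross-correlation: use the kernel as it is."""
    return patch_dot(patch, kernel)


def convolve_patch(patch, kernel):
    """Convolution: flip the kernel first, then take the dot product."""
    return patch_dot(patch, flip180(kernel))


# Hypothetical 2 x 2 patch and kernel, chosen so the two results differ.
patch = [[1, 2], [3, 4]]
kernel = [[1, 0], [0, -1]]

print(cross_correlate_patch(patch, kernel))  # 1*1 + 4*(-1) = -3
print(convolve_patch(patch, kernel))         # flipped kernel is [[-1, 0], [0, 1]] -> 3
```

For a kernel that is symmetric under a 180-degree rotation, the two operations give the same result.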
Advantage of using image patch - Answers 1./ Reduces the input parameters to
K1 x K2 + 1 (bias)
for each output node. Thus, with N output nodes, the total number of input parameters:
N x (K1 x K2 + 1)
2./ Explicitly maintains spatial information
Weight sharing - Answers The weights will represent what types of features we will extract. The
weights (W) will be the same for each output node with respect to a specific kernel, regardless
of the specific image patch we are looking at.
The total number of input parameters:
K1 x K2 + 1
Input parameters with multiple feature extractions - Answers (K1 x K2 + 1) x M
where M is the number of features
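The three parameter counts above can be compared with a short calculation. This is a sketch only: the helper names and the example sizes (a 3 x 3 kernel, N = 676 output nodes, M = 8 features) are assumptions chosen for illustration.

```python
def params_without_sharing(n_outputs, k1, k2):
    """Each output node has its own K1 x K2 weights plus one bias."""
    return n_outputs * (k1 * k2 + 1)


def params_with_sharing(k1, k2, m=1):
    """All output nodes share one K1 x K2 kernel plus one bias per feature."""
    return (k1 * k2 + 1) * m


# Hypothetical sizes: 3 x 3 kernel, 676 output nodes (e.g. a 26 x 26 output map).
print(params_without_sharing(676, 3, 3))  # 676 * 10 = 6760
print(params_with_sharing(3, 3))          # 10
print(params_with_sharing(3, 3, m=8))     # 10 * 8 = 80
```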
Relationship between convolution and cross-correlation - Answers Duality: if cross-correlation is
the forward pass (which is the easier operation), convolution is the backward pass used to
calculate the gradients (and vice versa)
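The duality can be checked numerically in one dimension. The sketch below assumes a forward pass y = cross-correlation(x, w) and a loss L = sum(g[i] * y[i]), where g stands in for the upstream gradient; it then compares the gradient of L with respect to x, estimated by finite differences, against the full convolution of g with w. All names and values are hypothetical.

```python
def cross_correlate_1d(x, w):
    """Valid 1D cross-correlation: slide w over x without flipping it."""
    k = len(w)
    return [sum(x[i + t] * w[t] for t in range(k)) for i in range(len(x) - k + 1)]


def full_convolve_1d(g, w):
    """Full 1D convolution: out[j] = sum over i of g[i] * w[j - i]."""
    out = [0.0] * (len(g) + len(w) - 1)
    for i, gi in enumerate(g):
        for t, wt in enumerate(w):
            out[i + t] += gi * wt
    return out


def loss(x, w, g):
    """Scalar loss L = sum(g[i] * y[i]) with y = cross_correlate_1d(x, w)."""
    return sum(gi * yi for gi, yi in zip(g, cross_correlate_1d(x, w)))


def numerical_input_grad(x, w, g, eps=1e-6):
    """Central finite-difference estimate of dL/dx."""
    grads = []
    for j in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        grads.append((loss(x_plus, w, g) - loss(x_minus, w, g)) / (2 * eps))
    return grads


# Hypothetical input, kernel, and upstream gradient.
x = [1.0, 2.0, -1.0, 0.5, 3.0]
w = [0.5, -1.0, 2.0]
g = [1.0, -2.0, 0.5]

analytic = full_convolve_1d(g, w)
numeric = numerical_input_grad(x, w, g)
print(analytic)  # [0.5, -2.0, 4.25, -4.5, 1.0]
print(all(abs(a - n) < 1e-6 for a, n in zip(analytic, numeric)))  # True
```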
Valid convolution - Answers When the kernel is fully on the image. (No padding)
Output size of the vanilla convolution,
given H, W, K1, K2 - Answers (H - K1 + 1) x (W - K2 + 1)
How to add padding - Answers Increase the size of the image by P on each side (top &
bottom, left & right)
--> (H + 2P) x (W + 2P)
The padded border can be filled with zeros or with a mirror of the image
Stride and its consequences - Answers Number of pixels the patch moves at each step as it
slides across the image.
With a stride larger than 1: loss of information
Used for dimensionality reduction
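The effect of padding and stride on the output size can be sketched with one helper. The general formula used here, floor((H + 2P - K1) / S) + 1 per dimension, is the standard one and is not stated in the notes above; with P = 0 and S = 1 it reduces to (H - K1 + 1) x (W - K2 + 1). The function name and the example sizes are assumptions.

```python
def conv_output_size(h, w, k1, k2, padding=0, stride=1):
    """Output height and width for an H x W image and a K1 x K2 kernel."""
    out_h = (h + 2 * padding - k1) // stride + 1
    out_w = (w + 2 * padding - k2) // stride + 1
    return out_h, out_w


# Hypothetical 32 x 32 image with a 5 x 5 kernel.
print(conv_output_size(32, 32, 5, 5))             # valid convolution: (28, 28)
print(conv_output_size(32, 32, 5, 5, padding=2))  # padding keeps the size: (32, 32)
print(conv_output_size(32, 32, 5, 5, stride=2))   # stride shrinks the output: (14, 14)
```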
Effect of channels on output size - Answers It has no effect on the output size: we
take the dot product for each channel and sum the results.
Effect of channels on parameters - Answers Each channel can have its own weights within
the same kernel.
M x (Ch x K1 x K2 + 1)
where M is the number of kernels and Ch is the number of channels
Effect of multiple kernels (feature extraction) on output size. - Answers The kernel size should
be equal (K1 x K2) for each kernel within the layer. The output size:
(H - K1 + 1) x (W - K2 + 1) x Number of Kernels
Effect of multiple kernels (feature extraction) on parameters - Answers Each channel of each
kernel has its own set of weights, but each kernel has only 1 bias term.
(K1 x K2 x Channels + 1) x M
where M is the number of kernels
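A small forward pass makes both the output size and the parameter count concrete. This is a plain-Python sketch assuming no padding and a stride of 1; the nested-list layout ([Ch][H][W] for the image, [M][Ch][K1][K2] for the kernels) and the function names are assumptions made for illustration.

```python
def conv_layer_forward(image, kernels, biases):
    """Valid cross-correlation of a multi-channel image with M kernels.

    image:   [Ch][H][W], kernels: [M][Ch][K1][K2], biases: [M]
    returns: [M][H - K1 + 1][W - K2 + 1]
    """
    ch, h, w = len(image), len(image[0]), len(image[0][0])
    k1, k2 = len(kernels[0][0]), len(kernels[0][0][0])
    output = []
    for kernel, bias in zip(kernels, biases):
        feature_map = []
        for i in range(h - k1 + 1):
            row = []
            for j in range(w - k2 + 1):
                total = bias  # one bias per kernel
                for c in range(ch):  # products are summed over all channels
                    for a in range(k1):
                        for b in range(k2):
                            total += image[c][i + a][j + b] * kernel[c][a][b]
                row.append(total)
            feature_map.append(row)
        output.append(feature_map)
    return output


def count_parameters(kernels, biases):
    """Number of weights in all kernels plus one bias per kernel."""
    weights = sum(len(row) for kernel in kernels for channel in kernel for row in channel)
    return weights + len(biases)


# Hypothetical sizes: Ch = 3, H = 5, W = 6, K1 = 2, K2 = 3, M = 2.
image = [[[1] * 6 for _ in range(5)] for _ in range(3)]
kernels = [[[[1] * 3 for _ in range(2)] for _ in range(3)] for _ in range(2)]
biases = [0.5, -1.0]

out = conv_layer_forward(image, kernels, biases)
print(len(out), len(out[0]), len(out[0][0]))  # M x (H - K1 + 1) x (W - K2 + 1) = 2 4 4
print(count_parameters(kernels, biases))      # (2 * 3 * 3 + 1) * 2 = 38
```

The output has one feature map per kernel, and its height and width do not depend on the number of input channels.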
What is the purpose of pooling layer? - Answers Dimensionality reduction
How many learned parameters does a max pooling layer have? - Answers None
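A minimal max pooling sketch shows why there is nothing to learn: the output is just the maximum of each window. The window size of 2, the stride of 2, and the function name are assumptions for the example.

```python
def max_pool_2d(feature_map, size=2, stride=2):
    """Take the maximum over each size x size window; no weights are involved."""
    h, w = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[i + a][j + b] for a in range(size) for b in range(size))
            for j in range(0, w - size + 1, stride)
        ]
        for i in range(0, h - size + 1, stride)
    ]


# Hypothetical 4 x 4 feature map, reduced to 2 x 2.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 2, 1, 6],
]
print(max_pool_2d(feature_map))  # [[4, 5], [3, 7]]
```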
Invariance - Answers If the feature changes, moves or rotates slightly on the image, the output
value remains the same. (For example, we classify the image of a cat regardless of where the
cat is in the image)
Equivariance - Answers If the feature translates or moves a little bit, the output values move by
the same translation and can be detected in the new location.
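Both properties can be illustrated in one dimension. In this sketch, the response of a cross-correlation moves with the feature (equivariance), while taking the global maximum of that response gives the same value wherever the feature sits (invariance). Using a global maximum as the invariant output is an assumption made for illustration, and the signal and kernel values are hypothetical.

```python
def cross_correlate_1d(x, w):
    """Valid 1D cross-correlation of signal x with kernel w."""
    k = len(w)
    return [sum(x[i + t] * w[t] for t in range(k)) for i in range(len(x) - k + 1)]


def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


kernel = [1, 2, 1]
signal = [0, 0, 1, 2, 1, 0, 0, 0]   # feature near the left
shifted = [0, 0, 0, 0, 1, 2, 1, 0]  # same feature, moved 2 positions right

response = cross_correlate_1d(signal, kernel)
response_shifted = cross_correlate_1d(shifted, kernel)

print(response)          # [1, 4, 6, 4, 1, 0]
print(response_shifted)  # [0, 0, 1, 4, 6, 4]

# Equivariance: the peak moves by the same 2 positions as the feature.
print(argmax(response), argmax(response_shifted))  # 2 4

# Invariance: the global maximum is the same in both cases.
print(max(response), max(response_shifted))  # 6 6
```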
Why would different kernels learn different features? - Answers Because we initialize them to
different values, each kernel starts at a different point in weight space, so its gradients are
different and it settles into a different local minimum --> kernels learn different features.