Class notes

Computer Vision And Deep Learning Convolution Neural Network

Uploaded on

25-03-2023

Written in

2022/2023

Clear and precise notes on the Convolutional Neural Network (CNN), its architecture, and its evolution, including the changes made to the architecture over time and the advantages and disadvantages of each type of CNN architecture.

Computer Vision and Deep Learning
UNIT - III
Convolutional Neural Network

Notes

Introduction: Convolutional Neural Network (CNN)

A Convolutional Neural Network (CNN) is a type of Deep Learning architecture commonly used
for image classification and recognition tasks. It consists of multiple layers, including
Convolutional layers, Pooling layers, and fully connected layers. The Convolutional layer applies
filters to the input image to extract features, the Pooling layer downsamples the resulting
feature maps to reduce computation, and the fully connected layer makes the final prediction.
The network learns the filters through backpropagation and gradient descent.
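As a rough illustration of how a filter can be learned by gradient descent, the sketch below recovers a 3-tap filter on a toy 1-D signal. The signal, the filter values, the learning rate, and the mean-squared-error loss are all assumptions made for this example; a real CNN computes the same kind of gradient for every layer via backpropagation.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(200)              # toy 1-D "image" (assumed data)
true_filter = np.array([1.0, -2.0, 0.5])  # hypothetical filter to be recovered

# Row i holds the patch x[i], x[i+1], x[i+2] that the filter sees at position i.
n = len(x) - 2
windows = np.stack([x[k:k + n] for k in range(3)], axis=1)
target = windows @ true_filter            # output of the "ideal" filter

w = np.zeros(3)                           # filter weights to learn
lr = 0.1                                  # learning rate (assumed)
for _ in range(500):
    pred = windows @ w                    # forward pass: slide the filter
    grad = 2 * windows.T @ (pred - target) / n   # gradient of mean squared error
    w -= lr * grad                        # gradient descent step

print(np.round(w, 3))                     # should be close to true_filter
```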

Artificial Neural Networks are used in various classification tasks involving images, audio,
and text. Different types of Neural Networks are used for different purposes. For example, to
predict a sequence of words we use Recurrent Neural Networks (more precisely, an LSTM);
similarly, for image classification we use Convolutional Neural Networks.

Convolution Neural Network

Convolutional Neural Networks, or ConvNets, are neural networks that share their parameters.
Imagine you have an image. It can be represented as a cuboid with a width and a height (the
dimensions of the image) and a depth (as images generally have red, green, and blue channels).

Now imagine taking a small patch of this image and running a small neural network on it with,
say, k outputs, and representing them vertically. Now slide that neural network across the whole
image. As a result, we get another image with a different width, height, and depth. Instead of
just the R, G, and B channels we now have more channels but a smaller width and height. This
operation is called Convolution. If the patch were as large as the image itself, this would be a
regular (fully connected) neural network. Because the patch is small and the same weights are
reused at every position, we have far fewer weights.
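The saving in weights can be made concrete with a small count. The sizes below (a 32 x 32 x 3 image, a 5 x 5 patch, k = 12 outputs) are hypothetical values chosen only for illustration.

```python
# Hypothetical sizes, chosen only to illustrate the comparison.
height, width, channels = 32, 32, 3   # input image volume
patch = 5                             # side length of the small patch
k = 12                                # outputs per patch position

# Small network applied to every patch: one set of weights shared by all positions.
shared_weights = patch * patch * channels * k + k      # weights + k biases

# Patch as large as the whole image: a regular fully connected layer with k outputs.
dense_weights = height * width * channels * k + k      # weights + k biases

print(shared_weights)  # 912
print(dense_weights)   # 36876
```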

Now let’s talk about a bit of mathematics that is involved in the whole convolution process.

● Convolution layers consist of a set of learnable filters (the patch described above). Every
filter has a small width and height and the same depth as the input volume (3 if the
input layer is an image).
● For example, suppose we run a convolution on an image of dimension 34 x 34 x 3. The
possible filter sizes are a x a x 3, where ‘a’ can be 3, 5, 7, etc., but is small compared to
the image dimension.
● During the forward pass, we slide each filter across the whole input volume step by step.
The step size is called the stride (often 1, but it can be 2, 3, or even 4 for
high-dimensional images). At each position we compute the dot product between the
filter weights and the patch of the input volume under the filter.
● As we slide our filters we’ll get a 2-D output for each filter. We stack these outputs
together and, as a result, get an output volume with a depth equal to the number of filters.
The network will learn all the filters.
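The forward pass described in the bullets above can be sketched in NumPy as follows. This is a minimal, unoptimized version that assumes no padding; the function name `conv_forward`, the choice of 12 filters, and the random values are assumptions made for illustration. Without padding, each spatial output size is (input size - a) // stride + 1.

```python
import numpy as np

def conv_forward(volume, filters, stride=1):
    """Slide each filter over the volume (no padding) and stack the 2-D outputs.

    volume:  (H, W, C) input volume
    filters: (F, a, a, C) learnable filters with the same depth C as the input
    returns: (H_out, W_out, F) output volume
    """
    H, W, C = volume.shape
    F, a, _, Cf = filters.shape
    assert C == Cf, "filter depth must match input depth"
    H_out = (H - a) // stride + 1
    W_out = (W - a) // stride + 1
    out = np.zeros((H_out, W_out, F))
    for f in range(F):
        for i in range(H_out):
            for j in range(W_out):
                r, c = i * stride, j * stride
                patch = volume[r:r + a, c:c + a, :]
                # Dot product between the filter weights and the input patch.
                out[i, j, f] = np.sum(patch * filters[f])
    return out

rng = np.random.default_rng(0)
image = rng.standard_normal((34, 34, 3))        # 34 x 34 x 3 input, as in the example
filters = rng.standard_normal((12, 3, 3, 3))    # 12 filters of size 3 x 3 x 3 (assumed)

out1 = conv_forward(image, filters, stride=1)
out2 = conv_forward(image, filters, stride=2)
print(out1.shape)  # (32, 32, 12)
print(out2.shape)  # (16, 16, 12)
```

The output depth equals the number of filters in both cases; only the spatial size changes with the stride.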

Layers used to build ConvNets: A ConvNet is a sequence of layers, and every layer transforms
one volume to another through a differentiable function.

Types of layers:
Let’s take an example by running a ConvNet on an image of dimension 32 x 32 x 3.

● Input Layer: This layer holds the raw input of the image with width 32, height 32, and
depth 3.
● Convolution Layer: This layer computes the output volume by computing the dot
product between each filter and the image patches. Suppose we use a total of 12 filters for
this layer. With stride 1 and zero padding that preserves the spatial size, we’ll get an
output volume of dimension 32 x 32 x 12.
● Activation Function Layer: This layer applies an element-wise activation function to
the output of the convolution layer. Some common activation functions are ReLU:
max(0, x), Sigmoid: 1/(1+e^-x), Tanh, Leaky ReLU, etc. The size of the volume remains
unchanged, hence the output volume will have dimension 32 x 32 x 12.
● Pool Layer: This layer is periodically inserted in the ConvNet. Its main function is to
reduce the size of the volume, which makes computation faster, reduces memory use, and
also helps limit overfitting. Two common types of pooling layers are max pooling and
average pooling. If we use a max pool with 2 x 2 filters and stride 2, the resultant volume
will be of dimension 16 x 16 x 12.

● Fully-Connected Layer: This layer is a regular neural network layer that takes the
(flattened) output of the previous layer, computes the class scores, and outputs a 1-D
array of size equal to the number of classes.
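The volume sizes in the layer list above can be checked with a short NumPy sketch. It starts from a random stand-in for the convolution layer's 32 x 32 x 12 output; the number of classes (10) and the random fully connected weights are assumptions, since the notes do not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes = 10  # hypothetical; the notes do not fix a number of classes

# Stand-in for the convolution layer's output on a 32 x 32 x 3 image with 12 filters.
conv_out = rng.standard_normal((32, 32, 12))

# Activation function layer: element-wise ReLU, volume size unchanged.
relu_out = np.maximum(0, conv_out)

# Pool layer: 2 x 2 max pooling with stride 2 (split each spatial axis into blocks of 2).
H, W, D = relu_out.shape
pool_out = relu_out.reshape(H // 2, 2, W // 2, 2, D).max(axis=(1, 3))

# Fully connected layer: flatten the volume, then compute one score per class.
weights = rng.standard_normal((pool_out.size, num_classes)) * 0.01
bias = np.zeros(num_classes)
scores = pool_out.reshape(-1) @ weights + bias

print(relu_out.shape)  # (32, 32, 12)
print(pool_out.shape)  # (16, 16, 12)
print(scores.shape)    # (10,)
```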

Advantages and Disadvantages:

Advantages of Convolutional Neural Networks (CNNs):

● Good at detecting patterns and features in images, videos, and audio signals.
● Robust to translation of the input and, to a lesser degree (typically with the help of
pooling and data augmentation), to rotation and scaling.
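The robustness to translation can be illustrated with a toy example: when a pattern moves in the input, the filter response moves with it, and the maximum response stays the same. The 8 x 8 image, the 2 x 2 "blob" pattern, and the all-ones filter are assumptions made for this sketch.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Plain sliding-window dot product (no padding, stride 1)."""
    a, b = kernel.shape
    H, W = image.shape
    out = np.zeros((H - a + 1, W - b + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + a, j:j + b] * kernel)
    return out

def peak_position(fmap):
    """Row and column of the strongest response, as plain ints."""
    pos = np.unravel_index(fmap.argmax(), fmap.shape)
    return tuple(int(v) for v in pos)

kernel = np.ones((2, 2))            # hypothetical "bright blob" detector

image = np.zeros((8, 8))
image[2:4, 2:4] = 1.0               # blob with top-left corner at row 2, column 2

shifted = np.zeros((8, 8))
shifted[4:6, 3:5] = 1.0             # same blob moved down by 2 and right by 1

fmap = correlate2d_valid(image, kernel)
fmap_shifted = correlate2d_valid(shifted, kernel)

peak = peak_position(fmap)
peak_shifted = peak_position(fmap_shifted)
print(peak, peak_shifted)                 # (2, 2) (4, 3): the response moves with the blob
print(fmap.max() == fmap_shifted.max())   # True: the strongest response is unchanged
```

The same check does not hold for a rotated or rescaled blob with this fixed filter, which is why robustness to those changes usually has to be learned from varied training data.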

Written for

Institution: GHRIETN
Course: UAIL305

Document information

Uploaded on: March 25, 2023
Number of pages: 18
Written in: 2022/2023
Type: Class notes
Professor(s): Rahul
Contains: All classes

Subjects

introduction to cnn
evolution of cnn
cnn architecture
alexnet
zfnet
vgg
inceptionnet
resnets
densenets
convolutional neural network

Get to know the seller

raghavsurya74
