Summary of Recurrent Neural Networks (RNNs)
Instructor: Jaskirat Singh
October 7, 2024
A Recurrent Neural Network (RNN) is a type of neural network designed to handle
sequential data by maintaining a memory of previous inputs, allowing it to capture temporal
dependencies. Here’s a detailed breakdown of the architecture of a simple RNN and how it
differs from a Feedforward Neural Network (FNN):
1 Architecture of a Simple RNN
Basic Structure: A simple RNN is characterized by its recurrent loop, which allows it to
pass information from one step to the next. In a standard RNN, each time step processes a
single element of the input sequence while maintaining a hidden state that captures historical
information.
• Input Sequence ($x_t$): The data is fed in as a sequence of vectors $x_1, x_2, \ldots, x_T$, where $T$ is the length of the sequence. For instance, this could be a time series of temperature readings or the words in a sentence.
• Hidden State ($h_t$): At each time step $t$, the RNN maintains a hidden state $h_t$, which is a summary of all previous inputs in the sequence up to that point. The hidden state is updated recurrently, providing the model with a form of memory.
• Output ($y_t$): The output at each time step, $y_t$, can be a prediction for that particular step, such as the next word in a sentence or a classification score.
Mathematical Representation: The hidden state at time step $t$, $h_t$, is computed as:
\[
h_t = f\left(W_{hh} h_{t-1} + W_{xh} x_t + b_h\right)
\]
Where:
• $f$ is a non-linear activation function, typically $\tanh$ or ReLU.
• $W_{hh}$ is the weight matrix applied to the hidden state from the previous time step.
• $W_{xh}$ is the weight matrix applied to the current input.
• $b_h$ is the bias term of the hidden-state update.
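The equation above updates only the hidden state; the per-step output $y_t$ described earlier is typically computed from $h_t$ with its own parameters. One common formulation (the output weights $W_{hy}$, bias $b_y$, and activation $g$ are standard conventions, not given above) is:
\[
y_t = g\left(W_{hy} h_t + b_y\right)
\]
where $g$ might be a softmax for classification or the identity for regression.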
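To make the recurrence concrete, here is a minimal NumPy sketch of the hidden-state update applied over a toy sequence. All names and dimensions (rnn_step, input_dim, hidden_dim, the random weights) are illustrative assumptions, not part of any particular library:

import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    # One step of the simple RNN recurrence:
    # h_t = tanh(W_hh @ h_{t-1} + W_xh @ x_t + b_h)
    return np.tanh(W_hh @ h_prev + W_xh @ x_t + b_h)

# Illustrative sizes: 3-dimensional inputs, 4-dimensional hidden state, T = 5 steps.
input_dim, hidden_dim, T = 3, 4, 5
rng = np.random.default_rng(0)

W_xh = rng.normal(size=(hidden_dim, input_dim)) * 0.1   # input-to-hidden weights
W_hh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1  # hidden-to-hidden weights
b_h = np.zeros(hidden_dim)                              # hidden bias

xs = rng.normal(size=(T, input_dim))  # toy input sequence x_1, ..., x_T
h = np.zeros(hidden_dim)              # h_0 initialized to zeros

for x_t in xs:
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)  # h_t now summarizes x_1, ..., x_t

print(h)  # final hidden state h_T

Note that the same weight matrices $W_{xh}$ and $W_{hh}$ are reused at every time step; this weight sharing across time is what distinguishes the recurrence from simply stacking feedforward layers.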