Exam (elaborations)

CS7643 QUIZ 4: RECURRENT NETWORKS, EMBEDDINGS & SEQUENCE MODELING

Grade

A+

Uploaded on

21-06-2026

Written in

2025/2026

This document contains study material and practice questions for CS7643 Quiz 4, focusing on recurrent neural networks, embeddings, and sequence modeling techniques in deep learning. Topics include recurrent network architectures, sequence processing, word embeddings, language modeling, long short-term memory (LSTM) networks, gated recurrent units (GRUs), sequence-to-sequence models, attention mechanisms, training challenges, and practical applications in natural language processing and time-series analysis. It is designed to help students prepare for quizzes and strengthen their understanding of sequence-based machine learning models.

Institution

CS7643

Course

CS7643

Content preview

SECTION A: RECURRENT NEURAL NETWORKS (10 Questions)

Q1: In a vanilla RNN with update rule h(t) = tanh(U·x(t) + V·h(t-1) + b), what is the
primary computational disadvantage during training?

A. The model requires O(T²) memory to store all intermediate hidden states.

B. The forward pass cannot be parallelized across time steps due to sequential
dependency. [CORRECT]

C. The backward pass can be fully parallelized using modern GPU architectures.

D. The number of parameters scales linearly with sequence length T.

Correct Answer: B

Rationale: Correct because the hidden state h(t) depends on h(t-1), forcing
sequential computation with runtime O(T) that cannot be parallelized across the
time dimension.
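
The sequential dependency described in this rationale can be seen in a minimal sketch of the forward pass. The scalar weights and the function name rnn_forward are illustrative assumptions, not part of the quiz; a real model would use matrices for U and V.

```python
import math

def rnn_forward(xs, U, V, b, h0=0.0):
    """Run h(t) = tanh(U*x(t) + V*h(t-1) + b) over a sequence (scalar case)."""
    h = h0
    hidden_states = []
    for x in xs:
        # Each iteration reads the h produced by the previous iteration,
        # so the loop over time steps cannot be run in parallel.
        h = math.tanh(U * x + V * h + b)
        hidden_states.append(h)
    return hidden_states

# Hypothetical inputs and weights, chosen only for illustration.
states = rnn_forward([1.0, 0.5, -0.3], U=0.8, V=0.5, b=0.0)
```

The loop body is cheap; the cost comes from having to run its T iterations one after another.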

Q2: A vanilla RNN is trained on sequences of length T=100. Analysis shows that
gradients with respect to early time step inputs are approximately zero. What is
the most likely cause?

A. The learning rate is too high, causing gradient descent to oscillate.

B. The weight matrix V has spectral radius less than 1, causing vanishing
gradients. [CORRECT]

C. The activation function is ReLU rather than tanh.

D. The input dimension is larger than the hidden dimension.

Correct Answer: B

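Rationale: Correct because backpropagating to an earlier time step multiplies
one Jacobian ∂h(t)/∂h(t-1) = diag(1 − h(t)²)·V per step. The tanh derivative is
at most 1, so when V is contractive the product of k such factors shrinks
exponentially in k (roughly like V^k), producing vanishing gradients for early
time steps. "Spectral radius less than 1" is the usual shorthand; the guaranteed
case is a largest singular value below 1, which also implies a spectral radius
below 1.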
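
A small numerical sketch can make the decay concrete. It assumes a scalar RNN (so V is a single number and the Jacobian is (1 − h(t)²)·V); the function name and the input values are hypothetical.

```python
import math

def grad_wrt_initial_state(xs, U, V, b, h0=0.0):
    """Track |d h(t) / d h(0)| for a scalar tanh RNN, one value per step."""
    h = h0
    grad = 1.0
    history = []
    for x in xs:
        h = math.tanh(U * x + V * h + b)
        # Chain rule: one Jacobian factor (1 - h(t)^2) * V per time step.
        grad *= (1.0 - h * h) * V
        history.append(abs(grad))
    return history

# T = 100 steps with |V| < 1 (illustrative values).
history = grad_wrt_initial_state([0.1] * 100, U=1.0, V=0.9, b=0.0)
```

Because every factor has magnitude at most 0.9 here, the value after 100 steps is bounded by 0.9^100, which is below 1e-4.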

Q3: Which RNN architecture is most appropriate for sentiment classification,
where a single sentiment label must be produced for an input sentence of
variable length?

A. N-to-N architecture with one output per word.

B. N-to-1 architecture that maps the final hidden state to a single output.
[CORRECT]

C. 1-to-N architecture that generates a sequence from a single input vector.

D. Encoder-decoder with attention over all intermediate states.

Correct Answer: B

Rationale: Correct because sentiment classification requires mapping a variable-
length input sequence to a single output label, which is precisely the N-to-1
architecture where the final hidden state encodes the entire sequence.
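
As a rough sketch of the N-to-1 pattern, the function below consumes a sequence of any length and emits one probability from the final hidden state. It assumes scalar token features and scalar weights; classify_sentiment, w_out, and b_out are hypothetical names introduced for illustration.

```python
import math

def classify_sentiment(xs, U, V, b, w_out, b_out):
    """N-to-1: read the whole sequence, then map the last h to one output."""
    h = 0.0
    for x in xs:
        h = math.tanh(U * x + V * h + b)
    # Only the final hidden state reaches the output layer.
    logit = w_out * h + b_out
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid -> P(positive)

# Illustrative weights; the same model handles both sequence lengths.
params = dict(U=0.8, V=0.5, b=0.0, w_out=2.0, b_out=0.0)
p_short = classify_sentiment([0.2, 0.9], **params)
p_long = classify_sentiment([0.2, 0.9, -0.4, 0.7, 0.1], **params)
```

Each call returns a single number regardless of how many tokens were read, which is what distinguishes this setup from an N-to-N tagger.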

Q4: During training of a vanilla RNN, gradient norms suddenly spike to values
exceeding 1000. Which technique should be applied?

A. Reduce the learning rate by a factor of 10.

B. Apply gradient clipping to bound the maximum gradient norm. [CORRECT]

C. Switch from SGD to Adam optimizer immediately.

D. Increase the hidden state dimension to absorb larger gradients.

Correct Answer: B

Rationale: Correct because exploding gradients can occur when the spectral
radius of the recurrent weights exceeds 1; gradient clipping rescales the
gradient whenever its norm exceeds a chosen threshold, bounding the update
computed by backpropagation through time without modifying the architecture.
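
Clipping by global norm can be sketched in a few lines. This is a plain-Python illustration over a flat list of gradient values, with a hypothetical threshold of 5.0; deep learning frameworks ship their own utilities for the same operation.

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Rescale grads so their L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads), total_norm
    scale = max_norm / total_norm  # shrink every component by the same factor
    return [g * scale for g in grads], total_norm

# A gradient whose norm has spiked to 1000, as in the question.
clipped, norm_before = clip_by_global_norm([600.0, 800.0], max_norm=5.0)
```

Because all components are scaled by the same factor, the direction of the update is preserved and only its length changes.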

Q5: In teacher forcing during RNN training, what input is fed at time step t+1?

A. The model's own predicted output from time step t.

B. The ground-truth target value from the training data at time step t.
[CORRECT]

C. A weighted average of the prediction and ground truth.

D. The hidden state from time step t passed through the output layer.

Correct Answer: B

Rationale: Correct because teacher forcing feeds the ground-truth output y(t)
from the training data as the input at step t+1 rather than the model's own
prediction; the procedure follows from maximum likelihood estimation and keeps
early prediction errors from compounding during training.
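
The difference between teacher forcing and free-running generation comes down to where each step's input comes from. The sketch below builds the two input sequences side by side; the "<s>" start token, the helper names, and the toy predict_next function are assumptions made for illustration.

```python
def teacher_forcing_inputs(targets, start_token):
    """Inputs for training: step t+1 receives the ground-truth y(t)."""
    return [start_token] + list(targets[:-1])

def free_running_inputs(predict_next, start_token, num_steps):
    """Inputs at inference: step t+1 receives the model's own output at t."""
    inputs = [start_token]
    for _ in range(num_steps - 1):
        inputs.append(predict_next(inputs[-1]))
    return inputs

targets = ["the", "cat", "sat"]
tf_inputs = teacher_forcing_inputs(targets, "<s>")

# A deliberately poor toy model that always predicts "the".
fr_inputs = free_running_inputs(lambda token: "the", "<s>", num_steps=3)
```

With teacher forcing the inputs stay on the ground-truth sequence even when the model predicts badly; in free-running mode the bad prediction is fed back in and repeated.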

Q6: A researcher replaces hidden-to-hidden recurrence with teacher forcing at
every time step during both training and inference. What is the primary
consequence?

A. The model becomes unable to handle variable-length sequences.

B. The model can be parallelized across time steps but loses the ability to
propagate information through hidden states. [CORRECT]

C. The vanishing gradient problem is completely eliminated.

D. The model requires twice as many parameters as a standard RNN.

Correct Answer: B

Rationale: Correct because removing hidden-to-hidden recurrence eliminates the
sequential dependency chain: once every step's input is a known ground-truth
value, all time steps can be computed in parallel. The cost is that the model
loses the recurrent path for propagating information across time steps, making
it less powerful than a true RNN.

Q7: Truncated backpropagation through time (BPTT) with truncation parameter
k=10 on sequences of length T=100 means:

A. Only the first 10 time steps are used in the forward pass.

B. Gradients are backpropagated through at most 10 time steps before
truncation. [CORRECT]

C. The hidden state is reset to zero every 10 time steps.

D. The model processes the sequence in 10 non-overlapping chunks.

Correct Answer: B

Rationale: Correct because truncated BPTT limits the temporal span of gradient
computation to k steps, approximating full BPTT while bounding computational
and memory cost. The forward pass still covers the whole sequence; the trade-off
is that dependencies longer than k steps receive no gradient signal.
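
A minimal sketch of the idea, under simplifying assumptions: a scalar RNN, a "loss" equal to the final hidden state h(T), and a hypothetical function name. The forward pass runs over all T steps, but the backward loop stops after k steps.

```python
import math

def truncated_bptt_grad_V(xs, U, V, b, k, h0=0.0):
    """Gradient of h(T) with respect to V, backpropagated at most k steps."""
    # Forward pass: always covers the full sequence.
    hs = [h0]
    for x in xs:
        hs.append(math.tanh(U * x + V * hs[-1] + b))

    T = len(xs)
    grad_V = 0.0
    upstream = 1.0  # d h(T) / d h(t), starting at t = T
    # Backward pass: visit steps T, T-1, ..., T-k+1 and then stop.
    for t in range(T, max(T - k, 0), -1):
        local = 1.0 - hs[t] ** 2                # tanh derivative at step t
        grad_V += upstream * local * hs[t - 1]  # direct use of V at step t
        upstream *= local * V                   # move one step further back
    return grad_V

xs = [0.5] * 100  # T = 100, illustrative inputs
g_trunc = truncated_bptt_grad_V(xs, U=1.0, V=0.5, b=0.0, k=10)
g_full = truncated_bptt_grad_V(xs, U=1.0, V=0.5, b=0.0, k=100)
```

Setting k equal to the sequence length recovers full BPTT. Practical implementations often also process the sequence in chunks and carry the hidden state across them; that detail is omitted here.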

Q8: Which of the following is NOT a valid criticism of using MLPs for NLP tasks
compared to RNNs?

A. MLPs cannot easily support variable-sized input sequences.

B. MLPs have no inherent mechanism for modeling temporal structure.

C. MLPs require network size to grow with maximum allowed sequence length.

D. MLPs suffer from vanishing gradients across time steps. [CORRECT]

Correct Answer: D

Rationale: Correct because vanishing gradients across time steps is a problem
specific to recurrent architectures with repeated weight multiplication; MLPs

Document information

Uploaded on: June 21, 2026
Number of pages: 19
Written in: 2025/2026
Type: Exam (elaborations)
Contains: Questions & answers

Subjects

recurrent neural networks
word embeddings
sequence modeling
language modeling
lstm
gru
attention mechanisms
deep learning
sequence to sequence models
natural language processing

Get to know the seller

ExamAceStuvia
