Exam (elaborations)

COS4861 Assignment 2 (NLP) Due 2025

Pages
10
Grade
A+
Uploaded on
13-07-2025
Written in
2024/2025

COS4861/0/2025 Assignment 2: Automata and NLP Preprocessing

This document presents a response to Assignment 2 for COS4861/0/2025. It covers both the theory of automata and the practical application of Natural Language Processing (NLP) data preprocessing.

Theoretical Foundations: Automata in NLP

The theoretical section covers Deterministic Finite State Automata (DFSA) and Non-Deterministic Finite State Automata (NDFSA). It defines their components (Q, Σ, δ, q0, F), operations, and key distinctions using formal notation and examples. It proves the equivalence of NDFSA and DFSA by means of the subset construction algorithm and works through a step-by-step conversion example. It also analyzes the role of these automata in core NLP tasks such as tokenization and syntax parsing.

Practical Implementation: NLP Preprocessing Pipeline

The practical component details the implementation of an NLP preprocessing pipeline. Using Python's NLTK library, the pipeline applies the following preprocessing steps to a sample text dataset:

– Tokenization: Breaking down text into individual words or units.

– Stopword removal: Eliminating common, low-information words.

– Stemming: Reducing words to a stem by heuristically stripping affixes.

– Lemmatization: Reducing words to their base dictionary form (lemma) using linguistic knowledge.

Content preview

COS4861

Assignment 2

Natural Language Processing

Due 2025

COS4861/2025 Assignment 2: Natural Language Processing (NLP)



Question 1: Theory of Automata (40 Marks)
1.1 Deterministic Finite State Automaton (DFSA)
A Deterministic Finite State Automaton (DFSA) is defined as a 5-tuple:

M = (Q, Σ, δ, q0, F)

where:

– Q: A finite set of states

– Σ: A finite input alphabet

– δ: A transition function δ : Q × Σ → Q

– q0 ∈ Q: The start state

– F ⊆ Q: A set of accepting states

For every state and input symbol, δ specifies exactly one next state, so the automaton's path through any input string is uniquely determined.
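The 5-tuple can be written down directly as a data structure. The following is a minimal Python sketch, not part of the assignment itself: the class name `DFSA`, its field names, and the example automaton `even_length` are illustrative assumptions. It checks that δ is a total function on Q × Σ, which is what makes the automaton deterministic.

```python
from dataclasses import dataclass


@dataclass
class DFSA:
    """Illustrative container for M = (Q, Σ, δ, q0, F)."""
    states: frozenset      # Q
    alphabet: frozenset    # Σ
    delta: dict            # δ, keyed by (state, symbol)
    start: str             # q0
    accepting: frozenset   # F

    def __post_init__(self):
        if self.start not in self.states:
            raise ValueError("q0 must be a member of Q")
        if not self.accepting <= self.states:
            raise ValueError("F must be a subset of Q")
        # δ : Q × Σ → Q must have exactly one entry per (state, symbol) pair.
        expected = {(q, a) for q in self.states for a in self.alphabet}
        if set(self.delta) != expected:
            raise ValueError("delta must be defined on every (state, symbol) pair")
        if not set(self.delta.values()) <= self.states:
            raise ValueError("delta must map into Q")

    def accepts(self, word):
        """Follow δ from q0 and report whether the final state is in F."""
        state = self.start
        for symbol in word:
            if symbol not in self.alphabet:
                return False
            state = self.delta[(state, symbol)]
        return state in self.accepting


# Hypothetical example: strings over {a} of even length.
even_length = DFSA(
    states=frozenset({"even", "odd"}),
    alphabet=frozenset({"a"}),
    delta={("even", "a"): "odd", ("odd", "a"): "even"},
    start="even",
    accepting=frozenset({"even"}),
)
```

Calling `even_length.accepts("aa")` follows δ twice and ends in the accepting state, while constructing a `DFSA` with a missing transition raises `ValueError`.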

Example: A DFSA that accepts binary strings ending in 01:

Q = {q0, q1, q2}, Σ = {0, 1}, q0 = start state, F = {q2}

δ(q0, 0) = q1, δ(q0, 1) = q0
δ(q1, 0) = q1, δ(q1, 1) = q2
δ(q2, 0) = q1, δ(q2, 1) = q0

State diagram: start -> q0 --0--> q1 --1--> q2 (accepting)
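The example can be checked by simulation. The sketch below assumes the complete six-entry transition table for this language, which includes δ(q0, 0) = q1 and δ(q1, 1) = q2 from the example. The names `DELTA`, `trace`, and `accepts` are illustrative.

```python
# Assumed complete transition table for "binary strings ending in 01".
DELTA = {
    ("q0", "0"): "q1",
    ("q0", "1"): "q0",
    ("q1", "0"): "q1",
    ("q1", "1"): "q2",
    ("q2", "0"): "q1",
    ("q2", "1"): "q0",
}
START = "q0"
ACCEPTING = {"q2"}


def trace(word):
    """Return the sequence of states visited while reading `word`."""
    states = [START]
    for symbol in word:
        states.append(DELTA[(states[-1], symbol)])
    return states


def accepts(word):
    """A word is accepted when the run ends in an accepting state."""
    return trace(word)[-1] in ACCEPTING
```

For the input 1101, `trace("1101")` visits q0, q0, q0, q1, q2, so the string is accepted. For 010 the run ends in q1 and the string is rejected.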




Document information

Uploaded on
July 13, 2025
Number of pages
10
Written in
2024/2025
Type
Exam (elaborations)
Contains
Questions & answers
