UNIT-I
Introduction to Natural Language
Understanding natural language involves studying how humans use language and how
machines can replicate that understanding through computational models.
1. The Study of Language
🔹 What is Language?
Language is a structured system of symbols (words) and rules used for communication.
Human language is complex and varies across cultures and contexts.
🔹 Linguistics and NLP:
Linguistics studies the structure and function of language.
NLP applies computational methods to model and process natural human languages.
Core linguistic areas in NLP: Morphology, Syntax, Semantics, Pragmatics, and
Phonology.
🔹 Why Study Language in NLP?
To teach computers how to understand and generate text/speech just like humans do, we must
understand how language works.
2. Applications of Natural Language Processing (NLP)
NLP is widely used in both everyday and advanced AI applications.
🔹 Application – Examples
Machine Translation – Google Translate, DeepL
Speech Recognition – Alexa, Siri, Google Assistant
Chatbots & Virtual Agents – Customer service bots, mental health assistants
Information Extraction – Extracting facts from news articles or medical records
Sentiment Analysis – Social media analysis, review classification
Text Summarization – News digest apps, academic paper summarizers
Grammar & Spell Checking – Grammarly, Microsoft Word
Question Answering – AI tutors, search engines and assistants such as Bing Chat or ChatGPT
These show the diverse impact of NLP in real life.
3. Evaluating Language Understanding Systems
To judge an NLP system's performance, we use specific evaluation methods.
🔹 Evaluation Metrics:
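Precision – Of the items the system selected, what fraction are relevant? TP / (TP + FP)
Recall – Of all the relevant items, what fraction did the system select? TP / (TP + FN)
(TP, FP, and FN are the counts of true positives, false positives, and false negatives.)
F1-Score – Harmonic mean of precision and recall: 2PR / (P + R)
BLEU Score – Evaluates machine translation quality by n-gram overlap with reference translations
ROUGE Score – Used in summarization tasks; measures overlap with reference summaries
WER (Word Error Rate) – Used in speech recognition: (substitutions + deletions + insertions) / number of reference words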
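The first three metrics and WER are simple enough to compute by hand. The following is a minimal sketch, not a reference implementation: the function names `precision_recall_f1` and `word_error_rate` are chosen here for illustration, items are compared as plain sets, and WER is computed with a standard word-level edit distance on whitespace-split text.

```python
def precision_recall_f1(selected, relevant):
    """Return (precision, recall, F1) for selected items vs. relevant items."""
    selected, relevant = set(selected), set(relevant)
    true_positives = len(selected & relevant)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)
```

For example, if a system selects {a, b, c, d} and the relevant items are {a, b, e}, precision is 2/4 and recall is 2/3. Comparing the hypothesis "the cat sit on mat" with the reference "the cat sat on the mat" gives one substitution and one deletion over six reference words.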
🔹 Types of Evaluation:
Intrinsic Evaluation: Scores a system's direct output on the task itself, usually against a gold standard (e.g., accuracy of grammar correction).
Extrinsic Evaluation: Measures how well an NLP system supports a larger downstream task (e.g.,
improving search results).
4. Different Levels of Language Analysis
Language processing can be broken into levels, each handling a different aspect of form or meaning.
🔹 Level – Description
Phonology – Sounds of language (important in speech-based NLP).
Morphology – Study of word forms, stems, prefixes/suffixes (e.g., 'unbelievable' → un + believe + able).
Syntax – Sentence structure and grammar rules (e.g., Subject-Verb-Object).
Semantics – Study of meanings (word and sentence level).
Pragmatics – Meaning in context (e.g., "Can you pass the salt?" implies a request).
Discourse – How meaning flows between sentences in a paragraph or conversation.
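The morphology example ('unbelievable' → un + believe + able) can be sketched as a toy affix-stripping routine. This is only an illustration: the affix lists, the stem set, and the function name `segment` are hypothetical and hand-picked, and a real morphological analyzer needs a full lexicon and spelling rules.

```python
# Hypothetical, hand-picked affixes and stems; a real analyzer needs a full lexicon.
PREFIXES = ("un", "re", "dis")
SUFFIXES = ("able", "ness", "ing", "ed")
STEMS = {"believe", "kind", "read", "agree"}


def segment(word):
    """Split a word into prefix, stem, and suffix; return [word] if nothing fits."""
    for prefix in ("",) + PREFIXES:
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in ("",) + SUFFIXES:
            if suffix and not rest.endswith(suffix):
                continue
            core = rest[:len(rest) - len(suffix)]
            # Try the core as-is, then with a restored final "e" (believ + e).
            for stem in (core, core + "e"):
                if stem in STEMS:
                    return [part for part in (prefix, stem, suffix) if part]
    return [word]
```

Calling `segment("unbelievable")` tries each prefix and suffix combination until the remaining core, with its final "e" restored, matches a known stem. A word with no matching analysis, such as "table", is returned unsplit.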
5. Representations and Understanding
To make language understandable to machines, it needs to be represented in a format they
can work with.
🔹 Types of Representations:
Parse Trees: Represent the phrase structure of a sentence.
Dependency Trees: Show head-dependent grammatical relations between words.
Semantic Frames: Capture event-related meaning (e.g., who did what to whom).
Logical Forms: Use formal logic to encode sentence meaning.
Embeddings (Vectors): Word2Vec and GloVe map each word to a fixed numeric vector; BERT
produces context-dependent vectors.
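Once words are vectors, similarity between words can be measured numerically, most commonly with cosine similarity. The sketch below uses made-up 3-dimensional vectors purely for illustration. They are not output from Word2Vec, GloVe, or BERT, and the names `TOY_VECTORS` and `cosine_similarity` are assumptions of this example.

```python
import math

# Made-up 3-dimensional vectors for illustration only; real embeddings have
# hundreds of dimensions and are learned from large corpora.
TOY_VECTORS = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


royal = cosine_similarity(TOY_VECTORS["king"], TOY_VECTORS["queen"])
fruit = cosine_similarity(TOY_VECTORS["king"], TOY_VECTORS["apple"])
```

With these toy values, `royal` comes out higher than `fruit`, mirroring the intuition that related words should have vectors pointing in similar directions.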
🔹 Why Represent Language?
Representations bridge the gap between natural language and machine processing
capabilities.
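As a rough illustration of how these representations relate, the sketch below encodes a hand-written dependency analysis of "Mary gave John a book" and derives a semantic frame and a logical form from it. The class names, the function `frame_from_dependencies`, and the analysis itself are assumptions made for this example, not parser output. The relation labels follow Universal Dependencies naming.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    head: str       # governing word
    relation: str   # grammatical relation label
    dependent: str  # word that depends on the head


@dataclass(frozen=True)
class SemanticFrame:
    predicate: str  # the event
    agent: str      # who did it
    theme: str      # what was given
    recipient: str  # to whom

    def logical_form(self):
        """Render the frame as a simple predicate-argument string."""
        return f"{self.predicate}({self.agent}, {self.theme}, {self.recipient})"


# Hand-written analysis of "Mary gave John a book" (not parser output).
dependencies = [
    Dependency("gave", "nsubj", "Mary"),
    Dependency("gave", "iobj", "John"),
    Dependency("gave", "obj", "book"),
    Dependency("book", "det", "a"),
]


def frame_from_dependencies(deps):
    """Map the main verb's subject and objects onto frame roles."""
    subject = next(d for d in deps if d.relation == "nsubj")
    verb = subject.head
    roles = {d.relation: d.dependent for d in deps if d.head == verb}
    return SemanticFrame(verb, roles["nsubj"], roles["obj"], roles["iobj"])


frame = frame_from_dependencies(dependencies)
```

The dependency list captures how words are related, the frame answers "who did what to whom", and `frame.logical_form()` gives a compact predicate-argument rendering of the same meaning.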
6. Organization of Natural Language Understanding Systems
An NLP system processes language through a pipeline of modules.
🔹 Typical NLP Pipeline: