Notes
The file contains:
• Unit 1 notes
• Unit 2 notes
• Unit 3 notes
• Unit 4 notes
• Unit 5 notes
• Diagrams
• Examples
• Assignment questions
UNIT – 1
INTRODUCTION TO BIG DATA (Simplified Notes)
1. What is Big Data?
Big Data refers to data sets that are too large, too
fast-changing, or too varied to be stored and processed
efficiently on a single ordinary computer or in a
traditional relational database.
Today, data comes from everywhere — apps, websites,
sensors, cameras, social media, transactions, etc.
Big Data tools help organizations analyze this huge
volume of data to make better decisions.
2. Why did Big Data become important? (History
Overview)
Earlier (1980s–2000s):
• Companies used databases + data warehouses.
• Mostly structured data (tables with rows and columns) was used.
• Data growth was slow.
After 2005:
• Social media (Facebook, YouTube) grew explosively.
• Cloud platforms and Hadoop appeared.
• Unstructured data (images, videos, text, sensors)
became common.
• Internet of Things (IoT) increased data creation.
Today:
• Every device produces data.
• Companies want insights in real time.
• Big Data technologies allow fast processing,
storage & analysis.
3. Types of Big Data
A) Structured Data
• Present in fixed rows & columns.
• Easy to search.
Example: Employee table, bank transactions.
B) Unstructured Data
• No fixed format.
• Hard to process.
Example: Images, emails, chats, audio, videos.
C) Semi-Structured Data
• Not fully structured but has tags/markers.
Example: JSON, XML, Web logs.
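The difference between the three types shows up as soon as you try to read the data. The sketch below is only an illustration: the employee table, the JSON records, and the message text are made-up samples, not data from these notes.

```python
import csv
import io
import json

# Structured: fixed rows and columns (hypothetical employee table).
structured = io.StringIO("id,name,salary\n1,Asha,50000\n2,Ravi,62000\n")
rows = list(csv.DictReader(structured))
salaries = [int(r["salary"]) for r in rows]  # every row has the same fields

# Semi-structured: fields are tagged, but records may differ in shape.
semi_structured = '[{"user": "asha", "tags": ["a", "b"]}, {"user": "ravi"}]'
records = json.loads(semi_structured)
tag_counts = [len(rec.get("tags", [])) for rec in records]  # "tags" may be missing

# Unstructured: free text with no schema; any structure must be imposed by us.
unstructured = "Hi team, the delivery was late again. Please call me back."
word_count = len(unstructured.split())

print(salaries)
print(tag_counts)
print(word_count)
```

Note how the structured rows can be read by column name directly, the semi-structured records need a default for a missing field, and the unstructured text offers nothing to query until we decide how to split it.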
4. The 3 Vs of Big Data (Characteristics)
1) Volume
• Massive amount of data generated daily.
Examples: Google searches, Instagram photos,
online sales data.
2) Velocity
• Speed at which data is produced & processed.
Example: Stock market updates every second.
3) Variety
• Different kinds of data — text, images, logs,
videos, emails, sensor data, etc.
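As a rough illustration, the three Vs can be measured on a small stream of events. Everything here is an assumption made for the sketch: the event list, its sizes, and the helper name three_vs are hypothetical, and real systems measure these over far larger streams.

```python
from collections import Counter

# Hypothetical event stream: (timestamp in seconds, kind of data, size in bytes).
events = [
    (0.0, "text", 200),
    (0.5, "image", 50_000),
    (1.0, "log", 120),
    (1.5, "video", 900_000),
    (2.0, "text", 180),
]

def three_vs(events):
    """Return (volume in bytes, velocity in events/second, variety as kind counts).

    Assumes a non-empty list of events sorted by timestamp.
    """
    volume = sum(size for _, _, size in events)
    duration = events[-1][0] - events[0][0]
    velocity = len(events) / duration if duration > 0 else float("inf")
    variety = Counter(kind for _, kind, _ in events)
    return volume, velocity, variety

volume, velocity, variety = three_vs(events)
print(volume, velocity, dict(variety))
```

Volume is the total size, velocity is how many events arrive per second, and variety is how many different kinds of data appear.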
5. Classification of Analytics
A) Descriptive Analytics
• Describes what happened.
Examples: Sales reports, website visits, daily
revenue.
B) Predictive Analytics
• Predicts what will happen.
Examples: Movie recommendations, loan default
prediction.
C) Prescriptive Analytics
• Suggests what should be done.
Examples: Uber surge pricing, best delivery route
suggestions.
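The three kinds of analytics can be seen side by side on one tiny data set. This is a minimal sketch under stated assumptions: the sales figures and stock level are invented, the prediction is a simple straight-line fit (real predictive models are usually more elaborate), and the reorder rule is a hypothetical example of a prescriptive step.

```python
# Hypothetical daily unit sales for five days.
sales = [10, 12, 14, 16, 18]

# Descriptive: what happened?
total = sum(sales)
average = total / len(sales)

# Predictive: what will happen?
# Fit a straight line by least squares and extend it by one day.
n = len(sales)
days = list(range(n))
mean_x = sum(days) / n
mean_y = average
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(days, sales))
    / sum((x - mean_x) ** 2 for x in days)
)
intercept = mean_y - slope * mean_x
forecast = intercept + slope * n  # predicted sales for the next day

# Prescriptive: what should be done?
# A simple rule that turns the forecast into an action.
stock = 15  # hypothetical units on hand
order = max(0, round(forecast) - stock)

print(total, average, forecast, order)
```

The descriptive step only summarizes the past, the predictive step estimates a future value, and the prescriptive step converts that estimate into a decision (how many units to order).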