Programming – Quick Notes
Data Science vs Big Data vs Data Analytics
| Term | Description |
|----------------|-------------|
| Data Science | A field that uses tools and techniques (like Python, ML) to extract insights from data. |
| Big Data | Refers to large volumes of complex data (structured or unstructured) that traditional tools can’t handle. |
| Data Analytics | The process of analyzing data to find patterns and trends for decision-making. |
Understanding Big Data Technologies
- Big Data Technologies are tools used to store, process, and analyze large datasets.
- Common ones include:
  - Hadoop
  - Spark
  - Kafka
  - NoSQL databases
What is Hadoop?
- Hadoop is an open-source framework for storing and processing big data.
- It runs on a cluster of machines, splitting data and work across multiple nodes.
- Its core components are HDFS (Hadoop Distributed File System) for storage and MapReduce for processing.
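The MapReduce idea can be illustrated without a cluster. The sketch below is plain Python, not Hadoop code: it assumes each string in `chunks` stands in for a block of a file stored on a different machine, and the function names (`map_phase`, `shuffle_phase`, `reduce_phase`) are made up for illustration.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle_phase(mapped_pairs):
    """Shuffle step: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for word, count in mapped_pairs:
        grouped[word].append(count)
    return grouped


def reduce_phase(grouped):
    """Reduce step: sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(chunks):
    # In a real cluster each chunk would be mapped on a different node;
    # here the chunks are simply processed one after another.
    mapped = []
    for chunk in chunks:
        mapped.extend(map_phase(chunk))
    return reduce_phase(shuffle_phase(mapped))


# Hypothetical input: two "blocks" of one text file.
chunks = ["big data is big", "data is everywhere"]
counts = word_count(chunks)
print(counts)
```

Because every chunk is mapped independently, the map step is the part that can be spread across machines; only the grouped results need to be brought together for the reduce step.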
Understanding Spark
- Apache Spark is a fast data processing engine for big data.
- Processes data in memory where possible, which is typically faster than the disk-based processing of Hadoop MapReduce.
- Supports near-real-time stream processing, machine learning, and SQL.
What is a Data Lake?
- A Data Lake is a central repository that stores raw data in any format (structured, semi-structured, unstructured).
- Good for big data analytics, AI, and machine learning.
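As a rough illustration of "raw data in any format", the sketch below uses a temporary local directory as a stand-in for a data lake. Real data lakes usually sit on distributed or cloud storage. The file names, the `raw` folder, and the sample values are all invented for the example.

```python
import csv
import json
import tempfile
from pathlib import Path


def build_lake(root):
    """Write three raw files in different formats into one directory."""
    raw = Path(root) / "raw"
    raw.mkdir(parents=True, exist_ok=True)

    # Structured data: rows and columns in a CSV file.
    with open(raw / "sales.csv", "w", newline="") as f:
        csv.writer(f).writerows([["id", "amount"], [1, 9.5], [2, 20.0]])

    # Semi-structured data: JSON records with no fixed set of fields.
    (raw / "events.json").write_text(
        json.dumps([{"user": "a", "action": "click"}])
    )

    # Unstructured data: free text.
    (raw / "notes.txt").write_text("customer called about a late delivery")
    return raw


with tempfile.TemporaryDirectory() as tmp:
    raw_dir = build_lake(tmp)

    # The lake holds every format side by side, untouched.
    formats = sorted(p.suffix for p in raw_dir.iterdir())

    # Structure is applied only when a file is read for analysis.
    with open(raw_dir / "sales.csv", newline="") as f:
        total = sum(float(row["amount"]) for row in csv.DictReader(f))

print(formats, total)
```

The data is stored as-is and interpreted only at read time, which is what lets a lake accept files whose structure is not known in advance.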
SQL in Data Science
SQL Attributes:
- SQL (Structured Query Language) is the standard language for querying relational databases.
- Useful in data cleaning, data extraction, and data analysis.
Common SQL Commands:
- SELECT, INSERT, UPDATE, DELETE
- WHERE, GROUP BY, ORDER BY, JOIN
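The commands above can be tried with Python's built-in `sqlite3` module and an in-memory database. This is a minimal sketch: the `customers` and `orders` tables and their rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Two small, hypothetical tables linked by customer_id.
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)

# INSERT: add rows.
cur.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [(1, "Asha"), (2, "Ben")],
)
cur.executemany(
    "INSERT INTO orders (id, customer_id, amount) VALUES (?, ?, ?)",
    [(1, 1, 50.0), (2, 1, 25.0), (3, 2, 10.0), (4, 2, 5.0)],
)

# UPDATE ... WHERE: change one specific row.
cur.execute("UPDATE orders SET amount = 30.0 WHERE id = 2")

# DELETE ... WHERE: remove rows that match a condition.
cur.execute("DELETE FROM orders WHERE amount < 10")

# SELECT with JOIN, GROUP BY, and ORDER BY: total spend per customer.
rows = cur.execute(
    """
    SELECT c.name, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY total DESC
    """
).fetchall()

conn.close()
print(rows)
```

The `?` placeholders pass values separately from the SQL text, which avoids building queries by string concatenation.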
What is NoSQL?
- NoSQL = Not Only SQL
- Used for non-relational or flexible data storage.
- Types: Document (MongoDB), Key-Value (Redis), Wide-Column (Cassandra), Graph (Neo4j)
SQL vs NoSQL
| Feature | SQL | NoSQL |
|----------------|------------------------------------|----------------------------------------|
| Structure | Tables with rows/columns | Flexible: JSON, key-value, etc. |
| Schema | Fixed | Dynamic/flexible |
| Joins | Supported | Limited |
| Best For | Structured data | Unstructured/big data |
| Examples | MySQL, PostgreSQL | MongoDB, Cassandra |
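The "Schema" row can be seen in a small experiment. In this sketch, SQLite plays the SQL side, and a plain list of Python dicts stands in for a document collection. No real NoSQL database is used, and the `users` table and its fields are invented.

```python
import sqlite3

# SQL side: the columns are fixed when the table is created.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (1, "Asha"))

try:
    # "email" is not part of the schema, so this insert fails.
    conn.execute(
        "INSERT INTO users (id, name, email) VALUES (?, ?, ?)",
        (2, "Ben", "ben@example.com"),
    )
    sql_rejected = False
except sqlite3.OperationalError:
    sql_rejected = True
conn.close()

# Document side (stand-in): each record carries its own set of fields.
documents = [
    {"_id": 1, "name": "Asha"},
    {"_id": 2, "name": "Ben", "email": "ben@example.com", "tags": ["new"]},
]
fields_per_doc = [sorted(doc) for doc in documents]

print(sql_rejected, fields_per_doc)
```

On the SQL side, adding the `email` field would first require changing the table definition, while the document-style records can differ from one another without any such step.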