Course name: Fundamentals of Data Science
Fundamentals of Data Science (24BEELY107)
Semester - I/II
Module I
Introduction to Data Science
Definition—Big Data and Data Science Hype—Datafication—Data Science
Profile—Meta Data—Definition—Data Scientist—Statistical Inference—
Populations and Samples—Populations and Samples of Big Data—Modelling-Data
Warehouse—Philosophy of Exploratory Data Analysis—The Data Science
Process—A Data Scientist’s Role in this Process Case Study: Real Direct—Housing
Market Analysis
School of Engineering and Technology 1
1.1 Introduction: Definition
Data science is the in-depth study of large amounts of data. It involves extracting
meaningful insights from raw, structured, and unstructured data using the scientific
method, a range of technologies, and algorithms.
Data science relies on powerful hardware, programming systems, and efficient
algorithms to solve data-related problems. It is widely seen as central to the future
of artificial intelligence.
Data science is about data gathering, analysis, and decision-making.
Data science is about finding patterns in data through analysis and using those
patterns to make predictions about the future.
By using Data Science, companies are able to make:
Better decisions (should we choose A or B?)
Predictive analysis (what will happen next?)
Pattern discoveries (find patterns, or hidden information, in the data)
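As a rough illustration of predictive analysis, the sketch below fits a straight line to a few past values and extends it one step ahead. The revenue figures and the function name fit_line are hypothetical, chosen only for illustration, and a straight-line trend is an assumption that real data may not follow.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical yearly revenue, in arbitrary units (not real data).
years = [1, 2, 3, 4]
revenue = [100.0, 110.0, 120.0, 130.0]

slope, intercept = fit_line(years, revenue)

# Predict the value for the next year by extending the fitted line.
forecast = slope * 5 + intercept
print(f"Forecast for year 5: {forecast:.1f}")
```

The pattern found here is the upward trend (the slope), and the prediction is what that trend implies for the following year.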
Data science is the combination of: statistics, mathematics, programming, and
problem-solving; capturing data in ingenious ways; the ability to look at things
differently; and the activity of cleansing, preparing, and aligning data. This
umbrella term includes various techniques that are used when extracting insights
and information from data.
Need for Data Science
Data Science is used in many industries in the world today, e.g. banking,
consultancy, healthcare, and manufacturing.
For route planning: to discover the best routes for shipping
To foresee delays for flights, ships, trains, etc. (through predictive analysis)
To create promotional offers
To find the best time to deliver goods
To forecast the next year's revenue for a company
To analyze the health benefits of training
To predict who will win elections
Data Science can be applied in nearly every part of a business where data is
available. Examples are:
Consumer goods
Stock markets
Industry
Politics
Logistics companies
E-commerce
A data scientist requires expertise in several areas:
Machine Learning
Statistics
Programming (Python)
Mathematics
Databases
Data science is all about:
Asking the correct questions and analyzing the raw data.
Modeling the data using various complex and efficient algorithms.
Visualizing the data to get a better perspective.
Understanding the data to make better decisions and arrive at a final result.
Fig 1.1 Data Science basics
Example:
Suppose we want to travel from station A to station B by car.
We need to make some decisions: which route will get us to the destination fastest,
which route will have no traffic jams, and which will be the most cost-effective.
These decision factors act as the input data, and analyzing them gives us an
appropriate answer. This analysis of data is called data analysis, which is a part
of data science.
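The route decision can be sketched in a few lines of Python. This is only a minimal illustration: the route names, the numbers, and the weights in the score function are all hypothetical, and combining the three factors into a single score is one possible way to compare routes, not a method given in these notes.

```python
from dataclasses import dataclass


@dataclass
class Route:
    name: str
    travel_minutes: float  # expected driving time without delays
    jam_risk: float        # estimated chance of a traffic jam, 0.0 to 1.0
    cost: float            # fuel plus tolls, in one currency


def score(route, jam_penalty_minutes=30.0, minutes_per_cost_unit=0.1):
    """Combine time, traffic risk, and cost into one number (lower is better).

    The two weights are illustrative assumptions, not calibrated values.
    """
    return (
        route.travel_minutes
        + route.jam_risk * jam_penalty_minutes
        + route.cost * minutes_per_cost_unit
    )


def best_route(routes):
    """Return the route with the lowest combined score."""
    return min(routes, key=score)


# Hypothetical input data for three routes from station A to station B.
routes = [
    Route("highway", travel_minutes=40, jam_risk=0.5, cost=120),
    Route("city road", travel_minutes=60, jam_risk=0.2, cost=60),
    Route("bypass", travel_minutes=50, jam_risk=0.1, cost=90),
]

choice = best_route(routes)
print(f"Best route: {choice.name}")
```

Changing the weights changes the answer, which reflects the point of the example: the decision depends on how the input factors are analyzed.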