techworldthink • February 02, 2022
1. What is data science?
Data science is the field of study that combines domain expertise, programming
skills, and knowledge of mathematics and statistics to extract meaningful insights
from data. Data science practitioners apply machine learning algorithms to numbers,
text, images, video, audio, and more to produce artificial intelligence (AI) systems
that perform tasks that ordinarily require human intelligence. In turn, these systems
generate insights which analysts and business users can translate into tangible
business value.
Data Science can be defined as the study of data, where it comes from, what it
represents, and the ways by which it can be transformed into valuable inputs and
resources to create business and IT strategies.
Data science is the in-depth study of large volumes of data. It involves extracting
meaningful insights from raw, structured, and unstructured data, which is processed
using the scientific method, a range of technologies, and algorithms.
It is a multidisciplinary field that uses tools and techniques to manipulate the data so
that you can find something new and meaningful.
Data science relies on powerful hardware, programming systems, and efficient
algorithms to solve data-related problems, and it underpins much of the ongoing
progress in artificial intelligence.
Data Science is about data gathering, analysis, and decision-making.
Data Science is about finding patterns in data through analysis and using them to
make predictions about the future.
By using Data Science, companies are able to make:
• Better decisions (should we choose A or B?)
• Predictive analysis (what will happen next?)
• Pattern discoveries (find patterns, or hidden information, in the data)
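As a rough illustration of the "predictive analysis" item above, the sketch below fits a straight-line trend to a few past values and extends it one step ahead. The sales figures and the helper name fit_line are invented for this example; real projects would normally use a statistics or machine learning library and far more data.

```python
# Hypothetical monthly sales figures, invented purely for illustration.
monthly_sales = [100.0, 110.0, 120.0, 130.0, 140.0]


def fit_line(ys):
    """Least-squares fit of y = slope * x + intercept, with x = 0, 1, 2, ..."""
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The slope is covariance(x, y) divided by variance(x).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(monthly_sales)
# Forecast the next (sixth) month by extending the fitted line one step.
next_month = slope * len(monthly_sales) + intercept
print(f"trend: {slope:.1f} per month, forecast: {next_month:.1f}")
# prints "trend: 10.0 per month, forecast: 150.0"
```

The pattern found here is the steady increase of 10 per month, and the prediction simply assumes that the pattern continues.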
Data Science is used in many industries in the world today, e.g. banking, consultancy,
healthcare, and manufacturing.
A Data Scientist needs expertise in several areas:
• Machine Learning
• Statistics
• Programming (Python or R)
• Mathematics
• Databases
2. Explain the different types of data.
Almost anything can be turned into data. Building a deep understanding of the
different data types is a crucial prerequisite for doing Exploratory Data Analysis
(EDA) and Feature Engineering for Machine Learning models. You may also need to
convert the data types of some variables so that you can choose appropriate visual
encodings in data visualization and storytelling.
Most data can be categorized into 4 basic types from a Machine Learning perspective:
• Numerical data
• Categorical data
• Time-series data
• Text data
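To make the four types more concrete, here is a minimal sketch that guesses the type of a column from its raw string values. The function guess_data_type, its max_categories threshold, and the sample columns are all assumptions made up for this example. The rules are deliberately crude: a column of ISO-style dates is treated as the sign of time-series data, and a column with only a few distinct values is treated as categorical.

```python
from datetime import datetime


def guess_data_type(values, max_categories=3):
    """Guess one of the four basic types for a list of raw strings (rough heuristic)."""

    def is_number(s):
        try:
            float(s)
            return True
        except ValueError:
            return False

    def is_date(s):
        try:
            # Assumption: dates are written as YYYY-MM-DD only.
            datetime.strptime(s, "%Y-%m-%d")
            return True
        except ValueError:
            return False

    if all(is_number(v) for v in values):
        return "numerical"
    if all(is_date(v) for v in values):
        return "time-series"
    # Only a few distinct values suggests a fixed set of categories.
    if len(set(values)) <= max_categories:
        return "categorical"
    return "text"


# Hypothetical sample columns, invented for illustration.
columns = {
    "age": ["23", "35", "41", "29"],
    "colour": ["red", "blue", "red", "green"],
    "signup_date": ["2022-01-05", "2022-01-06", "2022-01-07", "2022-01-08"],
    "review": ["great product", "arrived late", "would buy again", "too expensive"],
}

for name, values in columns.items():
    print(name, "->", guess_data_type(values))
```

In practice the right type depends on what the variable means, not only on how it looks. For example, a postal code is made of digits but behaves like a category, so guesses like these should always be checked by hand.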