CORRECT ANSWERS #9
SQL (Structured Query Language) - correct answer A language used to access and query a database.
Database - correct answer A set of data stored in a computer.
How is data structured? - correct answer Into tables. Tables can grow large and have a
multitude of columns and records.
Queries - correct answer define the subset of data you are seeking. You can save these
queries, refine them, share them, and run them on different databases.
Syntax - correct answer The specific vocabulary that gives instructions to the computer.
Select all (*) columns from the browse table for the first 10 records. - correct answer
SELECT *
FROM browse
LIMIT 10;
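One way to try the query above, sketched with Python's standard-library sqlite3 module. The columns of the browse table (user_id, page) and its 25 sample rows are made up for illustration; only the query itself comes from the card.

```python
import sqlite3

# In-memory database holding a hypothetical `browse` table.
# The column names are assumptions, not part of the original card.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE browse (user_id INTEGER, page TEXT)")
conn.executemany(
    "INSERT INTO browse VALUES (?, ?)",
    [(i, f"/page/{i}") for i in range(1, 26)],  # 25 sample rows
)

# The query from the card: every column, first 10 records only.
rows = conn.execute("SELECT * FROM browse LIMIT 10;").fetchall()
conn.close()

print(len(rows))
```

Without an ORDER BY clause, which 10 records come back is up to the database, so LIMIT alone is best treated as "any 10 rows".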
Churn rate - correct answer The percent of subscribers to a monthly service who have
canceled.
Formula: cancellations / total subscribers
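A minimal sketch of the churn formula in Python. The function name churn_rate and the subscriber counts are hypothetical, chosen only to show the arithmetic.

```python
def churn_rate(cancellations: int, total_subscribers: int) -> float:
    """Return churn as a fraction: cancellations / total subscribers."""
    if total_subscribers <= 0:
        raise ValueError("total_subscribers must be positive")
    return cancellations / total_subscribers


# Hypothetical month: 250 cancellations out of 2,000 subscribers.
rate = churn_rate(250, 2000)
print(f"{rate:.1%}")  # prints 12.5%
```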
Python - correct answer A general-purpose programming language used by data scientists
for prototyping, visualization, and execution of data analyses on datasets.
CSV file - correct answer Comma-Separated Values. CSV is a text-only spreadsheet
format that lets us store and explore data.
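To make the format concrete, here is a small sketch that reads CSV text with Python's standard-library csv module. The column names and rows are invented sample data.

```python
import csv
import io

# Hypothetical CSV text: a header row, then one record per line,
# with values separated by commas.
csv_text = "name,plan,months\nAda,basic,3\nGrace,premium,12\n"

# DictReader uses the header row as the keys for each record.
reader = csv.DictReader(io.StringIO(csv_text))
records = list(reader)

for record in records:
    print(record["name"], record["plan"], record["months"])
```

Because CSV is text-only, every value is read as a string; converting "12" to a number is a separate step.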
Pandas - correct answer A special set of commands (a library) in Python that lets us analyze
spreadsheet data. Pandas can do a lot of the things that SQL can do.
Import the pandas library and give it a nickname - correct answer import pandas as pd
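A short sketch of the pandas cards in action, assuming pandas is installed. The data is hypothetical, and the SQL shown in the comments is only a rough equivalent.

```python
import io

import pandas as pd

# Hypothetical subscriber data, loaded from CSV text into a DataFrame.
csv_text = "user,plan,canceled\n1,basic,0\n2,basic,1\n3,premium,0\n4,premium,0\n"
df = pd.read_csv(io.StringIO(csv_text))

# Roughly: SELECT * FROM subscribers LIMIT 2
first_two = df.head(2)

# Roughly: SELECT plan, AVG(canceled) FROM subscribers GROUP BY plan
cancel_share = df.groupby("plan")["canceled"].mean()

print(first_two)
print(cancel_share)
```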
Matplotlib - correct answer A plotting library for the Python programming language used to
visualize analysis and share it with others. It allows the creation of line charts, bar charts,
pie charts, and more. It gives precise control over colors and labels so that we can
create the perfect chart to communicate our findings.
Probability - correct answer In data science, probability is often used to simulate
scenarios.
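As an illustration of simulating a scenario, this sketch estimates the chance that two dice sum to 7 by rolling them many times. The scenario and the function name simulate_two_dice are chosen for the example, not taken from the card.

```python
import random


def simulate_two_dice(trials: int, seed: int = 0) -> float:
    """Estimate P(sum of two dice == 7) by repeated simulation."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(
        1
        for _ in range(trials)
        if rng.randint(1, 6) + rng.randint(1, 6) == 7
    )
    return hits / trials


estimate = simulate_two_dice(100_000)
print(round(estimate, 3))
```

The exact probability is 6/36, or about 0.167, so the estimate should land close to that value and get closer as the number of trials grows.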
Normal distribution - correct answer The mean is the middle of the distribution and the
standard deviation is the width.
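A small sketch of how the standard deviation controls the width, using NormalDist from Python's standard-library statistics module (Python 3.8+). The mean of 100 and the standard deviations of 15 and 30 are arbitrary example values.

```python
from statistics import NormalDist

# Hypothetical distribution: mean 100, standard deviation 15.
dist = NormalDist(mu=100, sigma=15)

# Share of values within one standard deviation of the mean (about 68%).
within_one_sd = dist.cdf(115) - dist.cdf(85)

# Same mean, larger standard deviation: the curve is wider,
# so less of it falls inside the same 85-115 interval.
wider = NormalDist(mu=100, sigma=30)
within_same_interval = wider.cdf(115) - wider.cdf(85)

print(round(within_one_sd, 4), round(within_same_interval, 4))
```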
Machine Learning and Algorithms Team - correct answer They use data to make
predictions and to create new products (like recommendation systems).
Clusters - correct answer Small groups of people or things that are close together.
Analytics Team - correct answer They typically drive decision-making by summarizing
data, asking good questions, and developing dashboards.
Data Literacy - correct answer Competence in finding, manipulating, managing, and
interpreting data (numbers, text and/or images) and translating it into information.
Data literacy is about how well we read, interpret, and communicate with data.
Data Gaps (Bad data leads to Bad Prediction) - correct answer The ability to separate
good, mediocre, and poor quality data is a crucial data literacy skill. Data-driven
conclusions are only as strong, robust, and well-supported as the data behind them.
This is also often referred to with the phrase "garbage in, garbage out."
Garbage in, garbage out - correct answer The quality of the predictions made during a
predictive analysis is deeply dependent on the quality of the data used to generate the
predictions.
For example, if a model is trained with mislabeled data, it will produce inaccurate
predictions no matter how good the actual algorithm is. This is commonly referred to as
"garbage in, garbage out."
Addressing Bias - correct answer Bias in data collection leads to poorer quality data.
Recognizing bias in data is a crucial data literacy skill. Questions to ask:
Who participated in the data?
Who is left out?
Who made the data?
Statistics - correct answer helps us test the likelihood of an event happening by random
chance versus systematically.
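One common way to test "random chance versus systematic" is a permutation test, sketched below. This is an illustrative technique, not necessarily the one the course has in mind, and the two groups of measurements are invented.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Share of random relabelings whose difference in means is at least
    as large as the observed difference (two-sided)."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # pretend the group labels were assigned at random
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations


# Hypothetical measurements for two groups.
a = [12, 15, 14, 16, 13, 17, 15, 14]
b = [22, 25, 24, 23, 26, 21, 24, 25]
p = permutation_p_value(a, b)
print(p)
```

A small p-value means random relabeling rarely produces a gap as large as the observed one, which points toward a systematic difference rather than chance.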
Binary Categorical Variables - correct answer Categorical variables can also be binary
or dichotomous variables. Binary variables are nominal categorical variables that
contain only two mutually exclusive categories. Examples of binary variables are whether a
person is pregnant, or whether a house's price is above or below a particular threshold.
Categorical Variables - correct answer Categorical variables consist of data that can be
grouped into distinct categories and are either ordinal (ordered) or nominal (unordered).
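The three kinds of categorical variable described above can be sketched in plain Python. All values here (colors, sizes, prices, and the 300,000 threshold) are hypothetical examples.

```python
from collections import Counter

# Nominal: categories with no inherent order, so we can only count them.
colors = ["red", "blue", "red", "green"]
color_counts = Counter(colors)

# Ordinal: categories with a meaningful order, which we state explicitly.
size_order = ["small", "medium", "large"]
sizes = ["large", "small", "medium", "small"]
sorted_sizes = sorted(sizes, key=size_order.index)

# Binary: exactly two mutually exclusive categories,
# here whether a house price is above a threshold.
threshold = 300_000
prices = [250_000, 410_000, 300_000]
above_threshold = [price > threshold for price in prices]

print(color_counts["red"], sorted_sizes, above_threshold)
```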