D204: The Data Analytics Journey
Complete Questions & Answers.
Data scientists are able to find ______, _________, and _____ in unstructured data.
- - order, meaning, and value
-What is involved in the planning phase? - -
1. Define goals
2. Organize resources
3. Coordinate people
4. Schedule project
-What is involved in the wrangling phase? - -
5. Get data
6. Clean data
7. Explore data
8. Refine data
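The wrangling steps above can be sketched in a few lines of Python. This is only an illustration: the rows, field names, and cleaning rules (drop rows with missing values, drop exact duplicates) are assumptions and do not come from the course material.

```python
# Get data: hypothetical raw rows, as they might arrive from a file or database.
raw_rows = [
    {"size": "M", "color": "red", "quantity": "3"},
    {"size": "M", "color": "red", "quantity": "3"},   # exact duplicate
    {"size": "L", "color": None, "quantity": "5"},    # missing color
    {"size": "S", "color": "blue", "quantity": "2"},
]

# Clean data: skip rows with missing values and rows already seen.
seen = set()
clean_rows = []
for row in raw_rows:
    if any(value is None for value in row.values()):
        continue
    key = tuple(sorted(row.items()))
    if key in seen:
        continue
    seen.add(key)
    # Refine data: convert the quantity from text to an integer.
    clean_rows.append({**row, "quantity": int(row["quantity"])})

# Explore data: a simple numerical summary of what is left.
total_quantity = sum(row["quantity"] for row in clean_rows)
print(len(clean_rows), total_quantity)
```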
-What is involved in the Modeling phase? - -
9. Create model
10. Validate model
11. Evaluate model
12. Refine model
-What is involved in the Applying phase? - -
13. Present model
14. Deploy model
15. Revisit model
16. Archive assets
-____________________ are programming languages that are very frequently used
for data manipulation and modeling. - - Python and R
-___________ are general-purpose languages that are used for the back end, the
foundational elements of data science, and they provide maximum speed. - - C,
C++, and Java
-____________________ is a language for working with relational databases to do
queries and data manipulation. - - SQL
-What does SQL stand for? - - structured query language
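As a rough illustration of querying a relational database with SQL, the sketch below uses Python's built-in sqlite3 module. The table name, column names, and values are hypothetical and chosen only for the example.

```python
import sqlite3

# Hypothetical in-memory table of shirt sales; all names and values are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (color TEXT, size TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("red", "M", 3), ("blue", "L", 5), ("red", "L", 2)],
)

# Query: total quantity sold per color, sorted by color name.
rows = conn.execute(
    "SELECT color, SUM(quantity) FROM sales GROUP BY color ORDER BY color"
).fetchall()
print(rows)
conn.close()
```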
-This is where you actually create the statistical model and you do the linear regression.
You do the decision tree. You do the deep learning neural network. - - Modeling
-These are the developers, and the system architects, the people who focus on the
hardware and the software that make data science possible - - Data engineers
-This is the phase of collecting data. - - Data acquisition
-Which phase? - Working with stakeholders to help them ask better questions so that
both they and you understand the outcome. - - Discovery
-What are the 4 parts of the data analytics cycle? - - Planning, Wrangling, Modeling and
Applying
-This phase is also known as the discovery phase. During this phase, an analyst defines
the major questions of interest that need to be answered, understands the needs of the
stakeholders, and assesses the resource constraints in the project. - - Business
understanding
-____________________ is the person who champions the vision of the project
and has the authority to allocate resources. - - The project sponsor
-__________________ is responsible for making sure things get done on time and
within budget and removes roadblocks. - - Project manager
-___________ is when new requirements are added to the project that increase the
time/resources needed to complete it. - - Scope creep
-What are the 3 types of analysis? - - Descriptive, Predictive, Prescriptive
-___________________________ describes the data that is present. Mean,
Median, Mode, counting things. How many of each size and color of shirt were sold in
the last month? Do we sell more shirts in the summer vs winter? - - Descriptive analysis
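A minimal sketch of descriptive analysis on the shirt example, using only Python's standard library. The sales records and order values below are invented for illustration.

```python
from collections import Counter
from statistics import mean, median, mode

# Hypothetical shirts sold last month as (size, color) pairs.
shirts_sold = [("M", "red"), ("L", "blue"), ("M", "blue"), ("S", "red"), ("M", "red")]
# Hypothetical order values in dollars for the same sales.
order_values = [20, 25, 20, 15, 20]

# Counting things: how many of each size and color were sold.
counts = Counter(shirts_sold)

# Measures of central tendency.
avg = mean(order_values)
mid = median(order_values)
most_common = mode(order_values)

print(counts)
print(avg, mid, most_common)
```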
-____________________ makes predictions about the future state of the business.
Forecasting volumes for example. Based on last summer and winter, what will we sell
next year? - - Predictive analytics
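One simple way to forecast a volume is to fit a straight line to past values and extend it one step. The sketch below does an ordinary least-squares fit by hand; the yearly sales figures are made up, and a real forecast would usually need more data and a check of whether a straight line is a reasonable model.

```python
# Hypothetical yearly shirt sales (year index -> units sold).
years = [1, 2, 3, 4]
sales = [100, 120, 140, 160]

# Ordinary least-squares fit of sales = intercept + slope * year.
x_mean = sum(years) / len(years)
y_mean = sum(sales) / len(sales)
slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(years, sales)) / sum(
    (x - x_mean) ** 2 for x in years
)
intercept = y_mean - slope * x_mean

# Predict next year's sales by extending the fitted line.
forecast = intercept + slope * 5
print(slope, intercept, forecast)
```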
-_______________________ analysis with an end goal of making a
recommendation. What colors and sizes of shirts should we sell to maximize profits? - -
Prescriptive analytics
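Prescriptive analysis ends in a recommendation. A toy sketch, assuming we already have a predicted demand and a profit margin for each shirt option (both sets of numbers are hypothetical): pick the option with the highest expected profit.

```python
# Hypothetical inputs: profit per shirt (dollars) and predicted units sold.
margin_per_shirt = {("red", "M"): 4, ("blue", "L"): 5, ("green", "S"): 3}
predicted_demand = {("red", "M"): 100, ("blue", "L"): 70, ("green", "S"): 90}

# Expected profit for each option = margin * predicted demand.
expected_profit = {
    option: margin_per_shirt[option] * predicted_demand[option]
    for option in margin_per_shirt
}

# The recommendation is the option with the largest expected profit.
recommendation = max(expected_profit, key=expected_profit.get)
print(recommendation, expected_profit[recommendation])
```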
-______________________ is just looking at any variable over time - - Time series
analysis
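A common first step when looking at a variable over time is to smooth it with a moving average. The helper name `moving_average` and the monthly figures are assumptions made for this sketch.

```python
def moving_average(values, window):
    """Return the simple moving average of `values` over `window` points."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i : i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical monthly sales, in time order.
monthly_sales = [10, 20, 30, 40, 50]
smoothed = moving_average(monthly_sales, 3)
print(smoothed)
```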
-____________________ is a programming language that is specific to statistics. It
also has capabilities to visualize data. - - R
-_______________ is a multipurpose programming language that has libraries that
extend its capabilities to do statistical analysis. - - Python
-______________________ are platforms that specialize in visualization. This is
where you can make graphs and charts for presentations and data storytelling to
executive leaders. - - Tableau and Power BI
-_______________________ are instant messaging platforms that facilitate
communication in a faster, but less formal, way than email. - - Teams, Slack
-A European Union law regulating personal data: its citizens must give informed consent
and be able to request or delete their own data that you collect. - - GDPR
-When the researching organization consciously ignores data that calls their results into
question or only presents one side of the results that puts them in a positive light. - -
Conflict of interest
-Sometimes data might not be available and the analyst will use tools such as web
scraping or surveys to acquire it during which phase? - - Data acquisition
-The ____________ states that the sampling distribution of the sample means
approaches a normal distribution as the sample size gets larger. For example, if you
were to take 50 random people out of a population and get their mean age, then take
another 50 random people and get their mean age, and so forth, all of those means
would approximately follow the normal distribution (bell curve). - - Central Limit Theorem
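The theorem can be checked informally with a small simulation. The sketch below builds a deliberately skewed, made-up population (exponentially distributed values with a mean near 40), then repeatedly draws samples of 50 and records each sample mean. The population, the fixed seed, and the number of repetitions are all assumptions for illustration.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical skewed population: exponential values with mean about 40.
population = [rng.expovariate(1 / 40) for _ in range(10_000)]

# Draw 1,000 random samples of 50 and keep the mean of each sample.
sample_means = [mean(rng.sample(population, 50)) for _ in range(1_000)]

# The sample means cluster around the population mean, with much less spread
# than the individual values in the population.
print(round(mean(population), 2), round(mean(sample_means), 2))
print(round(stdev(population), 2), round(stdev(sample_means), 2))
```

Plotting a histogram of `sample_means` would show the roughly bell-shaped curve even though the population itself is strongly skewed.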
-In this phase, the analyst begins to understand the basic nature of data and the
relationships within it. This phase often relies on the use of data visualization tools and
numerical summaries, such as measures of central tendency and variability. - - Data
Exploration
-__________________ enables an analyst to move beyond describing the data to
creating models that enable predicting outcomes of interest. - - Predictive Modeling
-Tools such as _______________ play an important role in automating the training
and use of models. - - Python and R
-In this phase, an analyst tells the story of the data and uses graphs or interactive
dashboards to inform others of the findings from the analyses. - - Reporting and
Visualization
-Even if you have a wide spread of a variable, let's say, age in a population, and you take
lots of sample groups, the mean age of those sample groups would tend to have a normal
distribution. - - Central Limit Theorem
-This is the phase of collecting data. Frequently, data will be retrieved from a database,
perhaps a component of a data warehouse, by using a language like SQL. - - Data
Acquisition
-"Collect the data" is synonymous with ____________________ - - data acquisition