WGU D208 Task 1 (2025): Applying Predictive Analytics for Decision-Making

Uploaded on: 28-02-2025
Written in: 2024/2025


WGU D208 Task 1 (2025): Applying Predictive
Analytics for Decision-Making

College of Information Technology, Western Governors University

D208: Predictive Modeling

Dr. Keiona Middleton

Table of Contents
A: Research Question
1 – Question Identification
2 – Goals & Objectives
B: Method Justification
1 – Four Assumptions of Multiple Linear Regression Methods
2 – Benefits of Using Jupyter Notebook and Python for Analysis
3 – Multiple Linear Regression Analysis Justification
C: Data Preparation
1 – Data Preparation Goals
2 – Statistics Summary
3 – Univariate & Bivariate Statistics
4 – Data Transformation (Data Wrangling)
5 – Data Preparation File
D: Model Analysis
1 – Initial Model
2 – Model Method & Justification
3 – Reduced Model
E: Model Comparison
1 – Initial vs. Reduced Regression Models
2 – Output & Calculations
3 – Copy of Code
F: Data Summary & Implications
1 – Results
2 – Recommendations
G: Demonstration
Panopto Video Presentation
H: Third Party Web-References
I - References

A: Research Question
The “Telecommunications Churn” data was used to demonstrate my ability to practice
predictive modeling. Customers in the telecom sector can choose from a variety of
service providers and actively move between them. The percentage of customers who switch to a
different service provider within a specific time frame is called customer churn. According to
WGU, it is 10 times more expensive to acquire a new customer than it is to keep an existing one
(WGU, 2024). The purpose of this data analysis and predictive modeling exercise is to identify
indicators of customer churn, which in turn provides insight into how to
minimize it.
1 – Question Identification
Can linear regression models predict the future bandwidth usage per year of a customer?
2 – Goals & Objectives
To apply appropriate strategies to avoid or mitigate customer churn, it is essential
that the telecommunications company first understand its customers. More specifically, the
company must fully understand the implications of customer churn. Once it
understands customer churn, it can make predictions from the customer data that it has. The
objective of this analysis is to provide the telecommunications company with a model that
predicts each customer's bandwidth usage per year from a set of independent variables.
This will help the company identify areas in which it can improve the profitability of
the services it provides by either limiting or expanding customers' bandwidth
usage.

B: Method Justification
1 – Four Assumptions of Multiple Linear Regression Methods
Multiple Linear Regression (MLR) will be used for this analysis. For MLR to be effective, four
assumptions must hold:
1. The residuals are approximately normally distributed. Outliers that distort the
distribution can be removed. However, the loss of information must be weighed before
removing any data.
2. There is a linear relationship between each independent variable and the dependent
variable. This can be assessed with a scatter plot.
3. There is no multicollinearity among the independent variables. This means that the
independent variables should not be highly correlated with one another. This can be
checked with the Variance Inflation Factor (VIF).
4. Homoscedasticity is present in the data. This means that the variance
of the errors is constant across all values of the independent variables.
All of these assumptions must be shown to hold for the model's results to be
considered reliable.
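
The multicollinearity check in assumption 3 can be illustrated with a short sketch. It is not the code used in this analysis: it computes the VIF by hand with NumPy (regress each predictor on the others and take 1 / (1 - R²)), and the helper name variance_inflation_factors, the synthetic columns, and the seed are all assumptions made for illustration. A VIF far above 1 suggests that a predictor is largely explained by the others; cutoffs such as 5 or 10 are common rules of thumb rather than fixed rules.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return the VIF of each column of X (rows are observations)."""
    X = np.asarray(X, dtype=float)
    n_rows, n_cols = X.shape
    vifs = []
    for j in range(n_cols):
        y = X[:, j]                                   # predictor being checked
        others = np.delete(X, j, axis=1)              # all remaining predictors
        design = np.column_stack([np.ones(n_rows), others])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        residuals = y - design @ coef
        ss_res = float(residuals @ residuals)
        ss_tot = float(((y - y.mean()) ** 2).sum())
        r_squared = 1.0 - ss_res / ss_tot
        vifs.append(1.0 / (1.0 - r_squared))
    return vifs

# Synthetic stand-ins for customer columns (hypothetical values, not the real dataset).
rng = np.random.default_rng(0)
age = rng.normal(50, 15, 500)
tenure = rng.normal(35, 20, 500)
# Deliberately tied to tenure so that the two columns are collinear.
monthly_charge = 100 + 2.0 * tenure + rng.normal(0, 1.0, 500)

vifs = variance_inflation_factors(np.column_stack([age, tenure, monthly_charge]))
for name, vif in zip(["age", "tenure", "monthly_charge"], vifs):
    print(f"{name}: VIF = {vif:.1f}")
```

In this synthetic setup, age should come out with a VIF close to 1, while tenure and monthly_charge should come out very high, which would be the signal to drop or combine one of them.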

2 – Benefits of Using Jupyter Notebook and Python for Analysis
Jupyter Notebook, an interactive web-based computing platform, was used with the
Python programming language to identify duplicates, missing values, and outliers in the
“Telecommunications Churn” scenario dataset in D206. It was also used to explore the data in
D207. It is a popular platform for coding because you can switch between
different tools and libraries in one place to create visualizations, calculate statistics, and more.
Python’s simple syntax makes it possible to perform complex tasks with little code. For this
exercise, the same tools will be used to answer the research question. The following describes
the packages and libraries that will be used:
• NumPy: provides arrays and mathematical functions
• Pandas: allows for data loading, cleaning, and processing
• Seaborn: high-level interface for creating statistical graphs
• Matplotlib: used to create static or interactive visualizations
• PyLab: procedural interface to Matplotlib
• SciPy: provides algorithms for scientific computing and statistics
• Sklearn (scikit-learn): tool used for predictive data analysis and machine learning

3 – Multiple Linear Regression Analysis Justification
Multiple Linear Regression (MLR) is an appropriate analysis technique because it measures the
effect of more than one independent variable on a single dependent variable. In this case, the
bandwidth per year variable is a continuous variable that can be further examined using independent
variables such as age, tenure, monthly charge, streaming TV, and streaming movies. This will be
useful in determining how the bandwidth usage per year is related to the occurrence of customer
churn.
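
As a minimal sketch of what such a model looks like, the example below fits an MLR by ordinary least squares with NumPy. The data are synthetic: the variable names mirror the predictors mentioned above, but the values, the "true" coefficients, and the noise level are assumptions chosen only to make the example runnable, and they say nothing about the real dataset or about the model built later in this report.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000

# Synthetic predictors (hypothetical ranges, not taken from the churn dataset).
age = rng.integers(18, 90, n).astype(float)
tenure = rng.uniform(1, 72, n)
monthly_charge = rng.uniform(80, 290, n)
streaming_tv = rng.integers(0, 2, n).astype(float)      # 1 = yes, 0 = no
streaming_movies = rng.integers(0, 2, n).astype(float)  # 1 = yes, 0 = no

# Design matrix with an intercept column first.
X = np.column_stack([np.ones(n), age, tenure, monthly_charge,
                     streaming_tv, streaming_movies])

# Assumed coefficients used only to simulate a bandwidth-per-year response.
true_coefs = np.array([500.0, -3.0, 80.0, 2.0, 150.0, 200.0])
bandwidth = X @ true_coefs + rng.normal(0, 50, n)

# Ordinary least squares fit.
coefs, *_ = np.linalg.lstsq(X, bandwidth, rcond=None)

# R-squared: share of the variance in bandwidth explained by the model.
predicted = X @ coefs
ss_res = ((bandwidth - predicted) ** 2).sum()
ss_tot = ((bandwidth - bandwidth.mean()) ** 2).sum()
r_squared = 1 - ss_res / ss_tot

names = ["intercept", "age", "tenure", "monthly_charge",
         "streaming_tv", "streaming_movies"]
for name, value in zip(names, coefs):
    print(f"{name}: {value:.2f}")
print(f"R-squared: {r_squared:.3f}")
```

Each fitted coefficient estimates the change in yearly bandwidth for a one-unit change in that predictor while the other predictors are held constant, which is the property that makes MLR suitable for several independent variables at once.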

C: Data Preparation
1 – Data Preparation Goals
The goal is to clean the provided dataset by identifying duplicates, missing values, and outliers
and mitigating them, if needed. The data will also be wrangled to prepare the categorical
variables for linear regression. This will be done by changing the data types as necessary and
creating nominal gender and churn variables.
Step 1: Import Necessary Packages/Libraries
# Import data-handling packages
import pandas as pd
from pandas import Series, DataFrame
import numpy as np
# Import visualization packages
import seaborn as sns
# Import modeling package
import sklearn
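
The preparation goals stated in this section (find duplicates, find missing values, and turn categorical variables into numeric ones) can be sketched on a toy table. This is an illustration only: the column names, the five sample rows, the choice of median imputation, and the 0/1 mappings are assumptions, not the steps or the data used in this report.

```python
import pandas as pd

# Toy data with one duplicated row and one missing value (hypothetical).
raw = pd.DataFrame({
    "Customer_id": ["A1", "A2", "A2", "A3", "A4"],
    "Gender": ["Male", "Female", "Female", "Female", "Male"],
    "Churn": ["No", "Yes", "Yes", "No", "Yes"],
    "Tenure": [12.0, 40.5, 40.5, None, 3.2],
})

# Detect duplicates and missing values.
duplicate_count = int(raw.duplicated().sum())
missing_counts = raw.isnull().sum()

# Mitigate: drop exact duplicate rows, then fill the missing tenure with the
# median (one possible choice, assumed here).
clean = raw.drop_duplicates().copy()
clean["Tenure"] = clean["Tenure"].fillna(clean["Tenure"].median())

# Re-express the categorical variables as numeric columns for regression.
clean["Churn_numeric"] = clean["Churn"].map({"No": 0, "Yes": 1})
clean["Gender_numeric"] = clean["Gender"].map({"Male": 0, "Female": 1})

print(f"Duplicate rows: {duplicate_count}")
print(missing_counts)
print(clean)
```

The two-value mappings are enough for this toy table. A categorical column with more than two levels would usually need one indicator column per level instead, for example with pandas.get_dummies.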
