Class notes

APPLICATION PROBLEMS-MACHINE LEARNING

Pages
17
Uploaded on
26-04-2025
Written in
2024/2025

These notes focus on real-world application problems in machine learning. They include detailed case studies on churn analysis and prediction using Cox proportional hazards models and other churn prediction techniques. They also cover credit card fraud detection, with emphasis on handling imbalanced data and the use of neural networks. Sentiment analysis and topic mining from New York Times articles are addressed using methods such as cosine similarity, chi-square tests, and N-gram models. Additional natural language processing (NLP) techniques such as part-of-speech tagging, stemming, and chunking are discussed, along with sales funnel analysis. The notes are aimed at students and professionals who want practical insight into applying ML techniques to real-world problems.

UNIT V APPLICATION PROBLEMS

The case studies: churn analysis and prediction using Cox proportional models, and
churn prediction techniques - Credit card fraud analysis with a focus on handling
imbalanced data and neural networks - Sentiment analysis and topic mining from the
New York Times using similarity measures such as cosine similarity, chi-square,
and N-grams; part-of-speech tagging, stemming, chunking - Sales funnel
analysis, A/B testing, and campaign effectiveness - Web page layout effectiveness -
Recommendation systems with collaborative filtering - Customer segmentation and
strategies, lifetime value - Portfolio risk conformance and optimization, and Uber
alternative routing with graph construction and route optimization.

5.1 CHURN ANALYSIS
Churn occurs when a customer discontinues a service or stops using a product. The definition can vary
depending on the industry. For instance, in telecommunications, a churner might be someone who
cancels their subscription, while in retail, it could be a customer who hasn't made a purchase in a set
period.
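The retail-style definition above (no purchase within a set period) can be turned into a label with a few lines of pandas. The following is only a minimal sketch: the 90-day window, the column names, and the sample dates are assumptions chosen for illustration, not values from these notes.

```python
import pandas as pd

def label_churn(last_purchase: pd.Series, as_of: pd.Timestamp, window_days: int = 90) -> pd.Series:
    """Return 1 for customers inactive for more than `window_days` as of `as_of`, else 0."""
    days_inactive = (as_of - last_purchase).dt.days
    return (days_inactive > window_days).astype(int)

# Hypothetical customers and their most recent purchase dates
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "last_purchase": pd.to_datetime(["2024-01-05", "2024-05-20", "2024-04-15"]),
})

# Label churn relative to an assumed observation date of 1 June 2024
customers["churned"] = label_churn(customers["last_purchase"], pd.Timestamp("2024-06-01"))
print(customers)
```

Only the first customer has been inactive for more than 90 days, so only that row is labeled as churned. A subscription business would instead derive the label from an explicit cancellation event.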
Features for Churn Prediction:
• Customer Demographics: Age, gender, location, etc.
• Behavioral Data: Frequency of usage, time since the last interaction, customer support
interactions, etc.
• Transaction Data: Purchase history, average purchase value, payment methods.
• Subscription Information: Type of subscription, renewal dates, discounts applied, etc.
• Customer Feedback: Survey responses, ratings, reviews.


5.2 CHURN ANALYSIS PREDICTION USING COX-PROPORTIONAL MODELS
The Cox Proportional Hazards model, often referred to as the Cox model, is a powerful tool used in
survival analysis to predict the time until an event of interest occurs, such as customer churn, equipment
failure, or patient survival. Unlike traditional regression models, the Cox model specifically accounts for
censored data, where the event has not occurred for some individuals by the end of the study period.


Survival Analysis:
• Survival Time: The time duration until the event occurs.
• Censoring: This occurs when the event has not been observed for some subjects during the study
period. For example, if a customer hasn't churned by the end of the observation period, their data is
censored.
• Hazard Function: The hazard function h(t) represents the instantaneous rate of occurrence
of the event at time t, given that the individual has survived up to time t.
• Survival Function: The survival function S(t) gives the probability that the event has not
occurred by time t.
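To make censoring and the survival function concrete, the sketch below estimates S(t) by hand with the Kaplan-Meier product-limit estimator, the same estimator that lifelines exposes as KaplanMeierFitter. The function name and the five sample customers are hypothetical. A censored customer counts as "at risk" up to their last observed time but never contributes an event.

```python
def kaplan_meier(durations, events):
    """Return [(t, S(t))] at each distinct time where an event was observed.

    durations: observed time for each subject
    events:    1 if the event (churn) was observed, 0 if the subject is censored
    """
    survival = 1.0
    curve = []
    event_times = sorted({d for d, e in zip(durations, events) if e == 1})
    for t in event_times:
        # Subjects still under observation just before time t
        at_risk = sum(1 for d in durations if d >= t)
        # Subjects whose event happens exactly at time t
        observed = sum(1 for d, e in zip(durations, events) if d == t and e == 1)
        survival *= 1 - observed / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical tenures in months; customers 3 and 5 are censored (still active)
durations = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]

curve = kaplan_meier(durations, events)
for t, s in curve:
    print(f"S({t}) = {s:.2f}")
```

With this data the estimate drops to 0.80 at t = 2, 0.60 at t = 3, and 0.30 at t = 5. The censored customer at t = 3 lowers the number at risk afterwards without being counted as a churn event.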
Cox Proportional Hazards Model:
• The Cox model assumes that the hazard function for an individual at time $t$ is the product of a
baseline hazard function $h_0(t)$ and a function of the explanatory variables (covariates):

$$h(t \mid X) = h_0(t)\,\exp(\beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p)$$

where $X_1, X_2, \dots, X_p$ are the covariates (features), and $\beta_1, \beta_2, \dots, \beta_p$ are the coefficients to be estimated.
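Because the covariates enter the hazard only through an exponential factor, the ratio of two customers' hazards does not depend on the baseline hazard or on time. This is the "proportional" part of the model. The sketch below illustrates it with hypothetical coefficients and covariate values; none of the numbers come from a fitted model.

```python
import math

def relative_hazard(x, beta):
    """exp(beta1*x1 + ... + betap*xp): the factor that multiplies the baseline hazard."""
    return math.exp(sum(b * xi for b, xi in zip(beta, x)))

# Hypothetical coefficients for two covariates
beta = [0.5, -1.0]

# Two hypothetical customers that differ by one unit in the first covariate
customer_a = [1.0, 0.0]
customer_b = [0.0, 0.0]

# The baseline hazard cancels in the ratio, so it is never needed here
hazard_ratio = relative_hazard(customer_a, beta) / relative_hazard(customer_b, beta)
print(f"Hazard ratio A vs. B: {hazard_ratio:.3f}")  # equals exp(0.5)
```

A one-unit increase in a covariate multiplies the hazard by the exponential of its coefficient. That quantity is what lifelines reports in the exp(coef) column of the model summary.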
The following program demonstrates the Cox proportional hazards model on telecom churn data:

import pandas as pd
import matplotlib.pyplot as plt
from lifelines import KaplanMeierFitter, CoxPHFitter
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer

# Load the data and drop rows with missing values at the start
df = pd.read_csv("telco_churn.csv").dropna()
# Event indicator: 1 = churned (event observed), 0 = still active (censored)
df['Churn'] = df['Churn'].map({'Yes': 1, 'No': 0})

features = ['MonthlyCharges', 'Contract', 'InternetService']
# The test split is held out for later evaluation and is not used below
train, test = train_test_split(df[features + ['tenure', 'Churn']], test_size=0.2, random_state=42)

# Scale the numeric covariate and one-hot encode the categorical ones
preprocessor = ColumnTransformer([
    ('num', StandardScaler(), ['MonthlyCharges']),
    ('cat', OneHotEncoder(drop='first', sparse_output=False), ['Contract', 'InternetService'])
])
X_train = preprocessor.fit_transform(train.drop(columns=['tenure', 'Churn']))
train_data = pd.DataFrame(X_train, columns=preprocessor.get_feature_names_out())
# Re-attach the duration and event columns (positional, since the index was reset)
train_data[['tenure', 'Churn']] = train[['tenure', 'Churn']].values
train_data.dropna(inplace=True)  # Ensure no NaNs before fitting

# Kaplan-Meier estimate of the overall survival curve
KaplanMeierFitter().fit(train['tenure'], train['Churn']).plot_survival_function()
plt.title("Kaplan-Meier Survival Curve")

# Fit the Cox model: tenure is the duration, Churn is the event indicator
cph = CoxPHFitter().fit(train_data, duration_col='tenure', event_col='Churn')
print(f"\nModel Concordance Index: {cph.concordance_index_:.2f}")

# exp(coef) is the hazard ratio: above 1 raises churn risk, below 1 lowers it
for feature, exp_coef in zip(cph.summary.index, cph.summary["exp(coef)"]):
    effect = "MORE" if exp_coef > 1 else "LESS"
    print(f"Customers with higher '{feature}' values are {effect} likely to churn. (Factor: {exp_coef:.2f})")

plt.show()


5.3 CHURN PREDICTION TECHNIQUES


Churn prediction is a critical task in various industries, especially in businesses where customer
retention is crucial, such as telecom, finance, and subscription-based services. Several techniques can be
used for churn prediction, ranging from traditional statistical methods to advanced machine learning
models. Below is an overview of the most commonly used techniques:
1. Logistic Regression
Description: Logistic regression is a simple and interpretable method used for binary classification
problems like churn prediction. It models the probability that a customer will churn based on various
input features.
How It Works: The model predicts the probability of churn using a sigmoid function applied to a linear
combination of input features.
Advantages: Easy to interpret, fast to train, and works well with linearly separable data.
Disadvantages: May not perform well with complex relationships between features.
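The "sigmoid applied to a linear combination" step described for logistic regression can be sketched in a few lines. The weights, bias, and feature meanings below are hypothetical values picked for illustration. In practice they would be learned from labeled data, for example with scikit-learn's LogisticRegression.

```python
import math

def sigmoid(z: float) -> float:
    """Map any real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def churn_probability(features, weights, bias):
    """Sigmoid of the linear combination bias + sum(w_i * x_i)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

# Hypothetical model: [support calls last month, tenure in months]
weights = [0.8, -0.05]
bias = -1.0

# A customer with 3 support calls and 12 months of tenure
p = churn_probability([3, 12], weights, bias)
print(f"Predicted churn probability: {p:.2f}")
```

Here the linear score is -1.0 + 2.4 - 0.6 = 0.8, which the sigmoid maps to a probability of about 0.69. The sign of each weight shows the direction of a feature's effect, which is why the model is considered easy to interpret.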
2. Decision Trees
Description: Decision trees split the data into subsets based on the most significant features, creating a
tree-like model of decisions.
How It Works: The model recursively splits the data based on the feature that provides the best split
(usually measured by metrics like Gini impurity or information gain).
Advantages: Easy to interpret, handles non-linear relationships well, and can manage both numerical
Professor(s)
Dr.gowri
gvarshini CHENNAI INSTITUTE OF TECHNOLOGY