The case studies cover the following topics:
- Churn analysis and prediction using Cox-proportional models, and churn prediction techniques.
- Credit card fraud analysis with a focus on handling imbalanced data and neural networks.
- Sentiment analysis and topic mining from the New York Times, addressed using similarity measures like cosine similarity, chi-square, and N-grams, along with part-of-speech tagging, stemming, and chunking.
- Sales funnel analysis, A/B testing, and campaign effectiveness.
- Web page layout effectiveness.
- Recommendation systems with collaborative filtering.
- Customer segmentation strategies and lifetime value.
- Portfolio risk conformance and optimization.
- Uber alternative routing with graph construction and route optimization.
5.1 CHURN ANALYSIS
Churn occurs when a customer discontinues a service or stops using a product. The definition can vary
depending on the industry. For instance, in telecommunications, a churner might be someone who
cancels their subscription, while in retail, it could be a customer who hasn't made a purchase in a set
period.
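The retail-style definition can be made concrete with a short sketch. It assumes a 90-day inactivity window and a dictionary of last-purchase dates; the customer IDs, the dates, and the function name `label_churners` are hypothetical and chosen only for illustration.

```python
from datetime import date, timedelta

def label_churners(last_purchase, as_of, inactivity_days=90):
    """Map each customer to True if they made no purchase within the window."""
    cutoff = as_of - timedelta(days=inactivity_days)
    return {cid: last < cutoff for cid, last in last_purchase.items()}

# Hypothetical last-purchase dates per customer
last_purchase = {
    "C001": date(2024, 1, 5),
    "C002": date(2024, 5, 20),
    "C003": date(2024, 3, 2),
}

labels = label_churners(last_purchase, as_of=date(2024, 6, 1), inactivity_days=90)
print(labels)
```

The window length is a business decision. A shorter window flags churn earlier but mislabels customers who simply buy infrequently.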
Features for Churn Prediction:
Customer Demographics: Age, gender, location, etc.
Behavioral Data: Frequency of usage, time since the last interaction, customer support
interactions, etc.
Transaction Data: Purchase history, average purchase value, payment methods.
Subscription Information: Type of subscription, renewal dates, discounts applied, etc.
Customer Feedback: Survey responses, ratings, reviews.
5.2 CHURN ANALYSIS AND PREDICTION USING COX-PROPORTIONAL MODELS
The Cox Proportional Hazards model, often referred to as the Cox model, is a powerful tool used in
survival analysis to predict the time until an event of interest occurs, such as customer churn, equipment
failure, or patient survival. Unlike traditional regression models, the Cox model specifically accounts for
censored data, where the event has not occurred for some individuals by the end of the study period.
Survival Analysis:
Survival Time: The time duration until the event occurs.
Censoring: This occurs when the event has not been observed for some subjects during the study
period. For example, if a customer hasn't churned by the end of the observation period, their data is
censored.
Hazard Function: The hazard function h(t) represents the instantaneous rate of occurrence
of the event at time t, given that the individual has survived up to time t.
Survival Function: The survival function S(t) gives the probability that the event has not
occurred by time t.
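To show how censoring enters an estimate of S(t), here is a minimal sketch of the Kaplan-Meier estimator in plain Python. The durations and event flags are made-up values, and the function name `kaplan_meier` is an assumption for illustration. In practice a library such as lifelines would normally be used instead.

```python
def kaplan_meier(durations, events):
    """Return [(t, S(t))] at each distinct event time.

    durations: observed time for each subject
    events:    1 if the event (churn) was observed, 0 if the subject is censored
    """
    survival = 1.0
    curve = []
    event_times = sorted({d for d, e in zip(durations, events) if e == 1})
    for t in event_times:
        # Censored subjects still count as "at risk" until their observed time
        at_risk = sum(1 for d in durations if d >= t)
        churned = sum(1 for d, e in zip(durations, events) if d == t and e == 1)
        survival *= 1 - churned / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical data: six customers, three of them censored
durations = [2, 3, 3, 5, 8, 8]
events = [1, 1, 0, 1, 0, 0]

km_curve = kaplan_meier(durations, events)
for t, s in km_curve:
    print(f"S({t}) = {s:.3f}")
```

Censored customers never trigger a drop in the curve, but they keep the at-risk count larger for as long as they were observed. Discarding them instead would bias the estimate downward.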
Cox Proportional Hazards Model:
The Cox model assumes that the hazard function for an individual at time t is the product of a
baseline hazard function h0(t) and a function of the explanatory variables (covariates):
h(t | X) = h0(t) · exp(β1X1 + β2X2 + … + βpXp)
where X1, X2, …, Xp are the covariates (features), and β1, β2, …, βp are the coefficients to be estimated.
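In the Cox model the covariates enter through the factor exp(β1X1 + … + βpXp), so the baseline hazard h0(t) cancels when two customers are compared. The sketch below illustrates this with hypothetical coefficients and covariate values; none of the numbers come from a fitted model.

```python
import math

def relative_hazard(coefs, x):
    """exp(b1*x1 + ... + bp*xp): the multiplier applied to the baseline hazard h0(t)."""
    return math.exp(sum(b * v for b, v in zip(coefs, x)))

# Hypothetical coefficients for [scaled monthly charges, has two-year contract]
coefs = [0.4, -1.2]
customer_a = [1.0, 0]  # no two-year contract
customer_b = [1.0, 1]  # same charges, with a two-year contract

# h0(t) cancels in the ratio, so it never needs to be known here
hazard_ratio = relative_hazard(coefs, customer_b) / relative_hazard(coefs, customer_a)
print(f"Hazard ratio (B vs A): {hazard_ratio:.2f}")
```

Under these assumed coefficients the ratio equals exp(-1.2), roughly 0.30. At any point in time, customer B's churn hazard is about 30% of customer A's. This constant ratio over time is the "proportional hazards" assumption.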
The following program fits a Cox proportional hazards model to a telecom churn dataset:
import pandas as pd
import matplotlib.pyplot as plt
from lifelines import KaplanMeierFitter, CoxPHFitter
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer

# Load the data, drop rows with missing values, and encode the event flag as 1/0
df = pd.read_csv("telco_churn.csv").dropna()
df['Churn'] = df['Churn'].map({'Yes': 1, 'No': 0})

features = ['MonthlyCharges', 'Contract', 'InternetService']
train, test = train_test_split(df[features + ['tenure', 'Churn']], test_size=0.2, random_state=42)

# Scale the numeric covariate and one-hot encode the categorical ones
preprocessor = ColumnTransformer([
    ('num', StandardScaler(), ['MonthlyCharges']),
    ('cat', OneHotEncoder(drop='first', sparse_output=False), ['Contract', 'InternetService'])
])
X_train = preprocessor.fit_transform(train.drop(columns=['tenure', 'Churn']))
train_data = pd.DataFrame(X_train, columns=preprocessor.get_feature_names_out())
train_data[['tenure', 'Churn']] = train[['tenure', 'Churn']].values
train_data.dropna(inplace=True)  # Ensure no NaNs before fitting

# Kaplan-Meier estimate of the overall survival curve
KaplanMeierFitter().fit(train['tenure'], train['Churn']).plot_survival_function()

# Fit the Cox model: tenure is the duration, Churn is the event indicator
cph = CoxPHFitter().fit(train_data, duration_col='tenure', event_col='Churn')
print(f"\nModel Concordance Index: {cph.concordance_index_:.2f}")

# exp(coef) is the hazard ratio: above 1 raises the churn hazard, below 1 lowers it
for feature, exp_coef in zip(cph.summary.index, cph.summary["exp(coef)"]):
    effect = "MORE" if exp_coef > 1 else "LESS"
    print(f"Customers with higher '{feature}' values are {effect} likely to churn. (Factor: {exp_coef:.2f})")

plt.title("Kaplan-Meier Survival Curve")
plt.show()
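The program reports a concordance index without saying what it measures. The sketch below is a simplified version of Harrell's concordance index in plain Python: among comparable pairs, it computes the fraction in which the customer who churned earlier was given the higher risk score. The data and the function name are hypothetical, and library implementations may treat tied times differently.

```python
def concordance_index(durations, events, risk_scores):
    """Fraction of comparable pairs that the risk scores rank correctly."""
    concordant = 0.0
    comparable = 0
    n = len(durations)
    for i in range(n):
        for j in range(n):
            # Comparable: subject i had the event strictly before j's observed time
            if events[i] == 1 and durations[i] < durations[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1      # earlier churner had the higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5    # tied scores count as half
    return concordant / comparable

# Hypothetical data: the third customer is censored
durations = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.3, 0.5, 0.1]

c_index = concordance_index(durations, events, risk)
print(f"Concordance index: {c_index:.2f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. In this toy data four of the five comparable pairs are ordered correctly.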
5.3 CHURN PREDICTION TECHNIQUES
Churn prediction is a critical task in various industries, especially in businesses where customer
retention is crucial, such as telecom, finance, and subscription-based services. Several techniques can be
used for churn prediction, ranging from traditional statistical methods to advanced machine learning
models. Below is an overview of the most commonly used techniques:
1. Logistic Regression
Description: Logistic regression is a simple and interpretable method used for binary classification
problems like churn prediction. It models the probability that a customer will churn based on various
input features.
How It Works: The model predicts the probability of churn using a sigmoid function applied to a linear
combination of input features.
Advantages: Easy to interpret, fast to train, and works well with linearly separable data.
Disadvantages: May not perform well with complex relationships between features.
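The "sigmoid of a linear combination" idea can be sketched in plain Python as follows. The toy features (support calls and a scaled contract length), the learning rate, and the epoch count are all assumptions for illustration. A real project would more likely use a library implementation with regularization.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_churn_probability(weights, bias, features):
    """Sigmoid applied to a linear combination of the input features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Batch gradient descent on the log loss (no regularization)."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            error = predict_churn_probability(weights, bias, xi) - yi
            for k, v in enumerate(xi):
                grad_w[k] += error * v
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

# Hypothetical toy data: [support_calls, scaled months on contract]
X = [[5, 0.1], [4, 0.2], [0, 0.9], [1, 0.8], [3, 0.3], [0, 1.0]]
y = [1, 1, 0, 0, 1, 0]

weights, bias = fit_logistic(X, y)
p_high = predict_churn_probability(weights, bias, [5, 0.1])
p_low = predict_churn_probability(weights, bias, [0, 1.0])
print(f"Frequent caller, short contract: {p_high:.2f}")
print(f"No calls, long contract: {p_low:.2f}")
```

Each weight has a direct reading: a positive weight raises the log-odds of churn as the feature grows. This is the basis of the interpretability mentioned above.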
2. Decision Trees
Description: Decision trees split the data into subsets based on the most significant features, creating a
tree-like model of decisions.
How It Works: The model recursively splits the data based on the feature that provides the best split
(usually measured by metrics like Gini impurity or information gain).
Advantages: Easy to interpret, handles non-linear relationships well, and can manage both numerical