Week 11:
K-Means Clustering
DDHA 8800
Data-Driven Decision Making in Health Administration
Problem 17-21
A. This problem asks where splits should be made in order to reduce
diversity. The method used to split the data on gender, age, and education
is a classification tree, also known as a decision tree.
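The Diversity column in the tables that follow is consistent with a two-class diversity index of the form 4p(1 - p), where p is the %Y value for the group, and each Drop rate matches the parent node's diversity minus the split's Total. The Python sketch below rests on those two assumptions, which are inferred from the reported numbers rather than stated in the problem; the function names are illustrative. Group sizes are not listed here, so split_diversity only shows the assumed form of the Total row.

```python
def diversity(p_yes: float) -> float:
    """Assumed two-class diversity index, scaled to peak at 1.0 when p_yes = 0.5."""
    return 4.0 * p_yes * (1.0 - p_yes)


def split_diversity(groups):
    """Assumed form of the Total row: size-weighted average diversity.

    groups is a list of (n_rows, p_yes) pairs, one per branch of the split.
    """
    total_rows = sum(n for n, _ in groups)
    return sum(n / total_rows * diversity(p) for n, p in groups)


# (%Y, reported diversity) pairs copied from the tables.
reported = {
    "F": (0.4941, 0.9999),
    "M": (0.6356, 0.9264),
    "HS": (0.2627, 0.7747),
    "UG, G": (0.7158, 0.8137),
}
for label, (p_yes, table_value) in reported.items():
    # Small gaps are expected because %Y is rounded to two decimals.
    print(label, round(diversity(p_yes), 4), table_value)

# (Total, Drop rate) pairs copied from the tables.
splits = {
    "Gender: F | M": (0.9636, 0.0200),
    "Age: Y | M, O": (0.9780, 0.0056),
    "Age: M | Y, O": (0.9821, 0.0015),
    "Age: O | Y, M": (0.9709, 0.0127),
    "Education: HS | UG, G": (0.8007, 0.1829),
}

# If Drop rate = parent diversity - Total, every split implies the same parent.
parents = {name: round(total + drop, 4) for name, (total, drop) in splits.items()}
print(parents)

# The preferred first split is the one with the largest drop in diversity.
best = max(splits, key=lambda name: splits[name][1])
print(best)
```

Under these assumptions every candidate implies the same parent diversity, and the HS versus UG, G split on education gives the largest drop among the five complete candidates.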
Possible first splits
Gender      %Y        Diversity
F           49.41%    0.9999
M           63.56%    0.9264
Total                 0.9636    Drop rate 0.0200
Age         %Y        Diversity
M, O        58.97%    0.9678
Y           50.94%    0.9996
Total                 0.9780    Drop rate 0.0056

M           53.69%    0.9946
Y, O        57.79%    0.9757
Total                 0.9821    Drop rate 0.0015

O           64.22%    0.9191
Y, M        52.35%    0.9978
Total                 0.9709    Drop rate 0.0127
Education   %Y        Diversity
HS          26.27%    0.7747
UG, G       71.58%    0.8137
Total                 0.8007    Drop rate 0.1829

HS, G       58.56%    0.9707