Central limit theorem:
· univariate case: for x1, ..., xn iid with mean μ and variance σ², √n (x̄_n - μ)/σ is approximately N(0, 1) for large n
· multivariate case: for iid p-dimensional observations x1, ..., xn with mean μ and covariance Σ, √n Σ^(-1/2) (x̄_n - μ) is approximately Np(0, I) for large n
· on the original scaling: x̄_n ~ Np(μ, Σ/n) approximately
· consequently n (x̄_n - μ)ᵀ Σ⁻¹ (x̄_n - μ) is approximately χ²_p

Tests:
· H0: μ = μ0 vs H1: μ ≠ μ0, with μ0 = (μ01, ..., μ0p)ᵀ
· Hotelling's T²: T² = n (x̄ - μ0)ᵀ S⁻¹ (x̄ - μ0), with S the empirical covariance matrix; under H0, T² ~ ((n-1)p / (n-p)) · F(p, n-p), so H0 is rejected for large values of T²

Transformations (to get closer to normality / stabilize the variance):
· count data: log (or square root)
· ratios and proportions p: logit(p) = log(p / (1 - p))
· correlations r: Fisher's z(r) = ½ log((1 + r)/(1 - r))
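A minimal sketch of the one-sample Hotelling T² test above, assuming NumPy/SciPy; the data matrix X, the hypothesised mean mu0 and all numbers are made-up illustrative values:

    # One-sample Hotelling T^2 test: H0: mu = mu0
    import numpy as np
    from scipy.stats import f

    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 3)) + np.array([0.2, 0.0, -0.1])   # n x p data matrix (simulated)
    mu0 = np.zeros(3)                                           # hypothesised mean vector

    n, p = X.shape
    xbar = X.mean(axis=0)                    # sample mean vector
    S = np.cov(X, rowvar=False)              # empirical covariance (divisor n-1)

    diff = xbar - mu0
    T2 = n * diff @ np.linalg.solve(S, diff)        # T^2 = n (xbar-mu0)' S^{-1} (xbar-mu0)
    F_stat = (n - p) / ((n - 1) * p) * T2           # rescaled to an F(p, n-p) statistic
    p_value = f.sf(F_stat, p, n - p)
    print(f"T^2 = {T2:.3f}, F = {F_stat:.3f}, p-value = {p_value:.3f}")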
Introduction:
· arithmetic mean: x̄ = (x̄1, ..., x̄p)ᵀ with x̄_j = (1/n) Σ_{i=1}^n x_ij
· empirical variance: s_jj = 1/(n-1) Σ_{i=1}^n (x_ij - x̄_j)²
· empirical covariance: s_jk = 1/(n-1) Σ_{i=1}^n (x_ij - x̄_j)(x_ik - x̄_k)
· sample correlation: r_jk = s_jk / √(s_jj s_kk)

p-dimensional normal distribution:
· x ~ Np(μ, Σ) with density f(x) = (2π)^(-p/2) |Σ|^(-1/2) exp(-½ (x - μ)ᵀ Σ⁻¹ (x - μ)), where μ is the expectation vector and Σ the covariance matrix
· each single component x_i, i = 1, ..., p, is univariate normal
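A small sketch computing the empirical summaries defined above with NumPy and evaluating the Np(μ, Σ) density with SciPy; the parameters and the simulated data are made up:

    import numpy as np
    from scipy.stats import multivariate_normal

    rng = np.random.default_rng(1)
    mu = np.array([1.0, -2.0])
    Sigma = np.array([[2.0, 0.6],
                      [0.6, 1.0]])
    X = rng.multivariate_normal(mu, Sigma, size=200)   # n x p data matrix

    xbar = X.mean(axis=0)             # arithmetic mean vector
    S = np.cov(X, rowvar=False)       # empirical covariance (divisor n-1)
    R = np.corrcoef(X, rowvar=False)  # sample correlation matrix

    print("mean:", xbar)
    print("covariance:\n", S)
    print("correlation:\n", R)

    # density of the p-dimensional normal, evaluated at the sample mean
    print("f(xbar):", multivariate_normal(mean=mu, cov=Sigma).pdf(xbar))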
Mahalanobis distance:
· replace the univariate standardized distance |x - μ| / σ by the multivariate distance MD(x) = √((x - μ)ᵀ Σ⁻¹ (x - μ))

Properties (for x ~ Np(μ, Σ)):
· standardization: Σ^(-1/2) (x - μ) ~ Np(0, I)
· linearity: for a (q×p) matrix A, Ax ~ Np(Aμ, A Σ Aᵀ)
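A minimal sketch of the Mahalanobis distance MD(x) and of the linearity property Ax ~ Np(Aμ, AΣAᵀ), checked by simulation; μ, Σ, A and the test point are arbitrary example values:

    import numpy as np

    mu = np.array([0.0, 1.0, -1.0])
    Sigma = np.array([[2.0, 0.3, 0.0],
                      [0.3, 1.0, 0.2],
                      [0.0, 0.2, 0.5]])

    def mahalanobis(x, mu, Sigma):
        d = x - mu
        return np.sqrt(d @ np.linalg.solve(Sigma, d))   # sqrt((x-mu)' Sigma^{-1} (x-mu))

    print(mahalanobis(np.array([1.0, 1.0, 0.0]), mu, Sigma))

    # linearity: the empirical covariance of A x should be close to A Sigma A'
    rng = np.random.default_rng(2)
    X = rng.multivariate_normal(mu, Sigma, size=100_000)
    A = np.array([[1.0, -1.0, 0.0],
                  [0.5,  0.5, 1.0]])       # (q x p) matrix with q = 2
    print(np.cov(X @ A.T, rowvar=False))   # approx. A Sigma A'
    print(A @ Sigma @ A.T)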
Data matrix:
· rows x1, ..., xn: the observations; columns X1, ..., Xp: the variables
· for a sample from Np(μ, Σ), every single variable (column) is univariate normal
Notation:
· random vector x = (x1, ..., xp)ᵀ; for a univariate variable, μ denotes the expectation and σ² the variance
· expectation of a random vector/matrix: E(X) = [E(X_ij)] (element-wise)
· covariance: Cov(x_i, y_j) = E[(x_i - E(x_i))(y_j - E(y_j))]
· Cov(x, y) = [Cov(x_i, y_j)] = E[(x - μ_x)(y - μ_y)ᵀ] = E(x yᵀ) - μ_x μ_yᵀ
· Cov(x) = [Cov(x_i, x_j)] = E[(x - μ_x)(x - μ_x)ᵀ]
· rules: Cov(Ax, By) = A Cov(x, y) Bᵀ and Cov(Ax) = A Cov(x) Aᵀ

Spectral theorem (eigendecomposition):
· every symmetric (p×p) matrix Σ can be decomposed as Σ = Γ Λ Γᵀ
· Λ = diag(λ1, ..., λp), where λ1, ..., λp are the eigenvalues of Σ
· Γ = (γ1, ..., γp) is an orthogonal matrix (ΓᵀΓ = I) whose columns γ1, ..., γp are the eigenvectors of Σ
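A short sketch of the eigendecomposition Σ = Γ Λ Γᵀ with NumPy; the matrix below is an arbitrary example:

    import numpy as np

    Sigma = np.array([[2.0, 0.6, 0.2],
                      [0.6, 1.0, 0.1],
                      [0.2, 0.1, 0.5]])

    eigenvalues, Gamma = np.linalg.eigh(Sigma)   # eigh is intended for symmetric matrices
    Lambda = np.diag(eigenvalues)

    print(eigenvalues)                                      # lambda_1, ..., lambda_p
    print(np.allclose(Gamma @ Lambda @ Gamma.T, Sigma))     # Sigma = Gamma Lambda Gamma'
    print(np.allclose(Gamma.T @ Gamma, np.eye(3)))          # Gamma is orthogonal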
Fuzzy K-means:
· instead of a hard assignment, every observation gets membership coefficients u_ik ∈ [0, 1], i = 1, ..., n, k = 1, ..., K, with u_i1 + ... + u_iK = 1 (idea: proportional assignment of each observation to all clusters)
· the cluster means and the memberships are found by minimizing a membership-weighted within-cluster dissimilarity
· if the membership coefficients stay close to 0 and 1, the result is similar to a hard clustering

Model-based clustering:
· the data are modeled as a mixture of multivariate normal distributions with means μ_k = (μ_k1, ..., μ_kp)ᵀ, covariance matrices Σ_k and class probabilities π_1, ..., π_K (π_k = probability that an observation belongs to class k; the proportions sum to 1)
· all parameters are unknown and are estimated by the EM algorithm (expectation maximization)
· restrictions on the covariance matrices, e.g. Σ_1 = ... = Σ_K = σ² I, simplify the models and guard against instability
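A minimal sketch of such model-based clustering via a normal mixture fitted by EM, assuming scikit-learn's GaussianMixture; the two-cluster data and all settings are made up:

    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(3)
    X = np.vstack([rng.normal([0, 0], 0.7, size=(100, 2)),
                   rng.normal([3, 3], 0.7, size=(100, 2))])

    gm = GaussianMixture(n_components=2, covariance_type="full", random_state=0).fit(X)
    print(gm.weights_)                 # estimated class probabilities pi_1, ..., pi_K
    print(gm.means_)                   # estimated means mu_k
    labels = gm.predict(X)             # hard assignment via the largest posterior probability
    probs = gm.predict_proba(X)[:5]    # "soft" class membership probabilities
    print(labels[:5], "\n", probs)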
Evaluation (validity measures), choice of the number of clusters K:
· idea: fit the clustering for several possible numbers of clusters K and compare
· the within-cluster sum of squares W(K) should be small, the between-cluster sum of squares B(K) should be large
· Calinski-Harabasz index: CH(K) = [B(K)/(K - 1)] / [W(K)/(n - K)]; the K with the largest value is best
· Hartigan index: based on comparing W(K) with W(K+1), i.e. on how much the within-cluster sum of squares improves with one additional cluster
· average silhouette: for observation i let d_{i,A} be the average dissimilarity to the other observations in its own cluster and d_{i,B} the smallest average dissimilarity to another cluster; silhouette value s_i = (d_{i,B} - d_{i,A}) / max(d_{i,A}, d_{i,B}) ∈ [-1, 1]; values close to 1: well classified, values close to 0: the observation lies between clusters; both d_{i,A} and d_{i,B} depend on the clustering, and we want the average silhouette over all observations to be close to 1
· Gap statistic: Gap_n(K) = E*_n[log(W_K)] - log(W_K), where W_K is the pooled within-cluster sum of squares for K clusters and E*_n is the expectation under a reference distribution with no cluster structure (estimated by simulating reference data sets of the same sample size n); idea: compare the observed log(W_K) of the classification with what would be expected without any clusters, and choose K where the gap is large
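A sketch of comparing numbers of clusters K with the indices above, assuming scikit-learn for K-means, the Calinski-Harabasz index and the average silhouette, plus a crude gap-statistic estimate with a uniform reference distribution; the data, the range of K and the number of reference samples are made up:

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics import calinski_harabasz_score, silhouette_score

    rng = np.random.default_rng(4)
    X = np.vstack([rng.normal([0, 0], 0.5, size=(80, 2)),
                   rng.normal([4, 0], 0.5, size=(80, 2)),
                   rng.normal([2, 3], 0.5, size=(80, 2))])

    def log_wk(data, k):
        """log of the pooled within-cluster sum of squares of a K-means fit."""
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(data)
        return np.log(km.inertia_)

    lo, hi = X.min(axis=0), X.max(axis=0)
    for k in range(2, 7):
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
        ch = calinski_harabasz_score(X, labels)
        sil = silhouette_score(X, labels)
        # gap: expected log(W_K) under a uniform reference minus the observed log(W_K)
        ref = [log_wk(rng.uniform(lo, hi, size=X.shape), k) for _ in range(10)]
        gap = np.mean(ref) - log_wk(X, k)
        print(f"K={k}: CH={ch:.1f}, avg silhouette={sil:.2f}, gap={gap:.2f}")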
Cluster analysis:
· unsupervised: we want to know whether there are groups; supervised: we know that there are groups
· goal: identify clusters that are highly homogeneous within and clearly heterogeneous between different clusters
· similarity and dissimilarity are measured by distances:
  · Euclidean distance: d(i, j) = ‖x_i - x_j‖₂ = √(Σ_{l=1}^p (x_il - x_jl)²)
  · Manhattan distance: d(i, j) = Σ_{l=1}^p |x_il - x_jl| (L1 norm)
  · the distances are collected in the symmetric (n×n) distance matrix D = [(d_ij)]

K-means clustering (partitioning method):
· an unsupervised method, also called hard clustering: an observation is assigned to a cluster or not (membership 1 or 0) and belongs to exactly one cluster; the number of clusters K is fixed beforehand
· cluster centers m_1, ..., m_K, where m_k is the arithmetic mean of the k-th cluster
· minimize the within-cluster scatter W(C) by minimizing the average squared distance of the observations to their cluster mean
· iterative algorithm (see the sketch below):
  1. initialize the cluster centers (e.g. randomly select K observations)
  2. minimize W(C) by assigning every observation to the cluster with the closest center
  3. calculate new centers as the cluster means; repeat steps 2 and 3 until nothing changes
· the solution depends on the initialization, so restart with different initializations and keep the best result
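A from-scratch sketch of the iterative K-means algorithm as described above (Lloyd's algorithm); K, the simulated data, the number of restarts and the stopping rule are illustrative choices, and empty clusters are not handled:

    import numpy as np

    def kmeans(X, K, n_restarts=5, seed=0):
        rng = np.random.default_rng(seed)
        best_labels, best_centers, best_W = None, None, np.inf
        for _ in range(n_restarts):                                   # restart with different initializations
            centers = X[rng.choice(len(X), size=K, replace=False)]    # 1. randomly selected observations as centers
            while True:
                d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
                labels = d2.argmin(axis=1)                            # 2. assign each obs to the closest center
                new_centers = np.array([X[labels == k].mean(axis=0) for k in range(K)])
                if np.allclose(new_centers, centers):                 # stop when nothing changes
                    break
                centers = new_centers                                 # 3. recompute the cluster means
            W = d2[np.arange(len(X)), labels].sum()                   # within-cluster sum of squares
            if W < best_W:
                best_labels, best_centers, best_W = labels, centers, W
        return best_labels, best_centers, best_W

    rng = np.random.default_rng(5)
    X = np.vstack([rng.normal([0, 0], 0.5, size=(50, 2)),
                   rng.normal([3, 3], 0.5, size=(50, 2))])
    labels, centers, W = kmeans(X, K=2)
    print(centers, W)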
Scatter decomposition:
· total point scatter: T = W(C) + B(C), where W(C) = ½ Σ_k Σ_{i∈C_k} Σ_{j∈C_k} d(i, j) is the within-cluster scatter (to be minimized) and B(C) = ½ Σ_k Σ_{i∈C_k} Σ_{j∉C_k} d(i, j) is the between-cluster scatter (to be maximized); B(C) is large if observations in different clusters are far apart, and since T is fixed, minimizing W(C) is equivalent to maximizing B(C)

Types of classification:
· exhaustive: every observation is assigned to a cluster; non-exhaustive: some observations might not be assigned to a cluster (e.g. outliers)
· non-overlapping: every observation is assigned to only one cluster
· partition: the number of clusters is fixed beforehand (between 1 and n); hierarchy: a nested sequence of partitions for every number of clusters

Hierarchical clustering:
· agglomerative clustering: start with n single-observation clusters and in every step merge the two closest clusters
· divisive clustering: start with 1 cluster containing all observations and split step by step, down to n single clusters
· distance between clusters (linkage):
  · single linkage: minimum distance between observations of the two clusters
  · complete linkage: maximum distance between observations of the two clusters
  · average linkage: average distance, d(C_k, C_l) = 1/(n_k n_l) Σ_{i∈C_k} Σ_{j∈C_l} d(i, j)
  · centroid method: Euclidean distance between the cluster centroids (the arithmetic means of the clusters)
  · Ward's method: merge the two clusters that lead to the minimum increase of the within-cluster variance
· visualization: dendrogram (see the sketch below)
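A brief sketch of agglomerative clustering, assuming SciPy and matplotlib: compute the pairwise distances, apply a linkage method (Ward here, only as an example), cut the hierarchy and plot the dendrogram; the data are simulated:

    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster, dendrogram
    from scipy.spatial.distance import pdist
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(6)
    X = np.vstack([rng.normal([0, 0], 0.4, size=(20, 2)),
                   rng.normal([3, 1], 0.4, size=(20, 2)),
                   rng.normal([1, 4], 0.4, size=(20, 2))])

    d = pdist(X)                      # pairwise Euclidean distances (condensed distance matrix D)
    Z = linkage(d, method="ward")     # alternatives: "single", "complete", "average", "centroid"
    labels = fcluster(Z, t=3, criterion="maxclust")   # cut the hierarchy into 3 clusters
    print(labels)

    dendrogram(Z)                     # visualization of the hierarchy
    plt.show()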