Class notes

Advanced machine learning

Uploaded on

04-07-2022

Written in

2021/2022

These are notes for advanced machine learning, focusing on measurement space, features, typical learning problems, key concepts, etc.

Advanced Machine Learning
Lectures

1 - Introduction
Measurement space, features, typical learning problems, key concepts, what you should
know
Supervised vs unsupervised learning, generative vs discriminative modeling

2 - Representations
Expected risk (R): conditional and total expected risk
Empirical risk (R^): training error, empirical risk minimizer, test error

Distinguish between test error and expected risk
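
A minimal sketch of the distinction between training error, test error and expected risk. The toy model y = 2x + noise, the squared loss and all names are illustrative assumptions, not taken from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n):
    """Draw n points from an assumed toy model y = 2x + Gaussian noise."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = 2.0 * x + rng.normal(0.0, 0.5, size=n)
    return x, y

def empirical_risk(w, x, y):
    """Mean squared loss of the predictor f(x) = w * x on a finite sample."""
    return float(np.mean((y - w * x) ** 2))

x_train, y_train = sample(20)
x_test, y_test = sample(20)

# Empirical risk minimizer for f(x) = w * x under squared loss (closed form).
w_hat = float(x_train @ y_train / (x_train @ x_train))

train_error = empirical_risk(w_hat, x_train, y_train)
test_error = empirical_risk(w_hat, x_test, y_test)

# A very large fresh sample approximates the expected risk R(w_hat);
# the small test set only gives a noisy estimate of it.
x_big, y_big = sample(200_000)
approx_expected_risk = empirical_risk(w_hat, x_big, y_big)
```

The test error is itself a random quantity that depends on the drawn test set, whereas the expected risk is a fixed number for a given predictor.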
Taxonomy of data, object space, measurement

Monadic, dyadic (e.g. pairwise), polyadic
Scales

Nominal (categorical): qualitative, but without quantitative measurements

Ordinal: measurement values are meaningful only with respect to other
measurements, i.e., the rank order of measurements carries the information, not the
numerical differences

Quantitative scale
Interval: the relation of numerical differences carries the information. Invariance
w.r.t. translation and scaling
Ratio: zero value of the scale carries information but not the measurement unit
Absolute: absolute values are meaningful
Mathematical spaces: topological, metric, Euclidean vector, metrizable

Probability spaces: elementary event, sample space, family of sets, algebra of events,
probability of events, probability model (triplet)

Stackexchange: Where a distinction is made between probability function and density,
the pmf applies only to discrete random variables, while the pdf applies to continuous
random variables
ml2016tutorial1: Note: Expected value ≠ Most likely value
Describing dependencies in data by covariance is equivalent to approximation of data
distribution by a Gaussian model.
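
The note that the expected value differs from the most likely value can be checked on a small example. The pmf below is a made-up illustration, not data from the course.

```python
import numpy as np

# Hypothetical pmf of a discrete random variable on the values 0..4.
values = np.array([0, 1, 2, 3, 4])
pmf = np.array([0.4, 0.1, 0.1, 0.1, 0.3])

expected_value = float(values @ pmf)       # sum_k k * p(k) = 1.8
most_likely = int(values[np.argmax(pmf)])  # mode of the pmf = 0
```

Here the expected value 1.8 is not even an attainable outcome, while the most likely value is 0.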

3 - Density Estimation in Regression: Parametric Models
Modeling assumptions for regression, different approaches, Bayesianism and frequentism

Maximum Likelihood Estimation, ML estimation for normal distributions
Procedure: Find the extremum of the log-likelihood function
Wikipedia: Under the additional assumption that the errors are normally distributed,
ordinary least squares (OLS) is the maximum likelihood estimator.
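
A minimal numerical sketch of why OLS coincides with the ML estimate under Gaussian errors: the gradient of the Gaussian log-likelihood with respect to the coefficients (the score) vanishes at the OLS solution. The synthetic data, `beta_true` and the helper `log_likelihood` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed toy data: y = X @ beta_true + Gaussian noise.
n, sigma = 200, 0.3
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, -2.0])
y = X @ beta_true + rng.normal(0.0, sigma, size=n)

def log_likelihood(beta, sigma):
    """Gaussian log-likelihood of y given X, coefficients beta and noise std sigma."""
    resid = y - X @ beta
    return float(-0.5 * n * np.log(2 * np.pi * sigma**2)
                 - 0.5 * resid @ resid / sigma**2)

# OLS solution of min_beta ||y - X beta||^2.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# The score (gradient of the log-likelihood w.r.t. beta) vanishes at beta_ols.
score = X.T @ (y - X @ beta_ols) / sigma**2
```

Because the log-likelihood depends on the coefficients only through the residual sum of squares, maximizing it over the coefficients is the same as minimizing the squared error, whatever the value of sigma.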

Wikipedia: The Gauss-Markov theorem states that in a linear regression model in which
the errors have expectation zero, are uncorrelated and have equal variances, the best
linear unbiased estimator (BLUE) of the coefficients is given by the ordinary least
squares (OLS) estimator, provided it exists. The errors do not need to be normal, nor do
they need to be independent and identically distributed (only uncorrelated with mean
zero and homoscedastic with finite variance).
ml2016tutorial3: Note that if we don't know the real value of μ, we can use its
estimate μ^ to calculate σ^. In this case, however, the variance estimate (σ^)² is
biased, i.e. E[(σ^)²] = (n-1)/n · σ²true ≠ σ²true.
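
A small simulation sketch of this bias, assuming Gaussian data with a made-up true variance of 4 and data sets of size n = 5. Averaged over many data sets, the ML variance estimate comes out near (n-1)/n times the true variance, while dividing by n-1 removes the bias.

```python
import numpy as np

rng = np.random.default_rng(2)
sigma2_true = 4.0
n, trials = 5, 200_000

# Each row is one data set of n samples from N(0, sigma2_true).
data = rng.normal(0.0, np.sqrt(sigma2_true), size=(trials, n))

# ML estimate: plug in the sample mean and divide by n (ddof=0).
var_ml = data.var(axis=1, ddof=0)
# Bessel-corrected estimate: divide by n - 1 (ddof=1).
var_unbiased = data.var(axis=1, ddof=1)

mean_ml = float(var_ml.mean())              # close to (n - 1) / n * sigma2_true
mean_unbiased = float(var_unbiased.mean())  # close to sigma2_true
```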
The James-Stein estimator is a biased estimator of the mean of Gaussian random
vectors. For dimension 3 or higher, it can be shown that the James-Stein estimator
dominates the "ordinary" least squares approach, i.e., it has lower total mean squared
error for every value of the true mean. It is the best-known example of Stein's phenomenon.
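
A simulation sketch of Stein's phenomenon, assuming a single observation x ~ N(θ, I) per trial in d = 10 dimensions and the basic James-Stein form that shrinks toward the origin. The true mean `theta` is an arbitrary choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
d, trials = 10, 50_000            # dimension must be at least 3
theta = np.full(d, 0.5)           # assumed true mean, chosen for illustration

# One observation x ~ N(theta, I) per trial; the ML estimate is x itself.
x = theta + rng.normal(size=(trials, d))

# James-Stein estimator with shrinkage toward the origin, unit noise variance.
shrink = 1.0 - (d - 2) / np.sum(x**2, axis=1, keepdims=True)
theta_js = shrink * x

# Total squared error averaged over trials (risk estimate).
mse_ml = float(np.mean(np.sum((x - theta) ** 2, axis=1)))
mse_js = float(np.mean(np.sum((theta_js - theta) ** 2, axis=1)))
```

The ML risk is close to d, while the James-Stein risk is smaller. The gain is largest when the true mean lies near the shrinkage target.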
Maximum likelihood estimation of variance is biased, but it is nevertheless consistent.
Rao-Cramér (Cramér-Rao) inequality, Fisher information, score etc.

Wikipedia: In its simplest form, the bound states that the variance of any unbiased
estimator is at least as high as the inverse of the Fisher information.
Wikipedia: An unbiased estimator which achieves this lower bound is said to be (fully)
eﬃcient. Such a solution achieves the lowest possible mean squared error among all
unbiased methods, and is therefore the minimum variance unbiased (MVU) estimator.
Wikipedia: The Cramér–Rao bound can also be used to bound the variance of biased
estimators of given bias. In some cases, a biased approach can result in both a variance
and a mean squared error that are below the unbiased Cramér–Rao lower bound.
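
A sketch that checks the bound in the simplest case, assuming n i.i.d. samples from N(μ, σ²) with known σ. The Fisher information about μ is then n/σ², and the sample mean is an unbiased estimator whose variance σ²/n attains the bound. The numerical values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
mu, sigma, n, trials = 1.0, 2.0, 25, 100_000

# Fisher information about mu in n i.i.d. N(mu, sigma^2) samples (sigma known).
fisher_info = n / sigma**2
cramer_rao_bound = 1.0 / fisher_info          # equals sigma^2 / n

# Empirical variance of the sample mean, an unbiased estimator of mu.
samples = rng.normal(mu, sigma, size=(trials, n))
var_sample_mean = float(samples.mean(axis=1).var())
```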
Importance of the Maximum Likelihood Method, realizable model
Summary of MLEs

Consistency, equivariance, asymptotic efficiency, asymptotic normality
Bayesian Learning, on normal distribution, recursive Bayesian estimation
Exercise 2: Having determined the functional form of the prior and likelihood, we want
to compute the posterior. Doing it analytically can be hard in general, but it is easy if
the prior and likelihood form a conjugate pair. Then the posterior will have the same
functional form as the prior, only the parameters differ.

Wikipedia: In Bayesian probability theory, if the posterior distributions are in the same
probability distribution family as the prior probability distribution, the prior and
posterior are then called conjugate distributions, and the prior is called a conjugate
prior for the likelihood function

ml2016tutorial3: Conjugate priors:

the gamma distribution is the conjugate prior for the rate of an exponential likelihood
the normal distribution is the conjugate prior for the mean of a normal likelihood (with known variance)
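
A minimal sketch of the normal-normal conjugate update for an unknown mean with known noise variance, used both in batch form and recursively (one observation at a time). The function name `update_normal_mean`, the prior N(0, 10) and the simulated data are assumptions for illustration.

```python
import numpy as np

def update_normal_mean(prior_mean, prior_var, x, noise_var):
    """Posterior N(mean, var) over mu after observing data x ~ N(mu, noise_var).

    Assumes a N(prior_mean, prior_var) prior on mu and a known noise variance.
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    post_precision = 1.0 / prior_var + x.size / noise_var
    post_var = 1.0 / post_precision
    post_mean = post_var * (prior_mean / prior_var + x.sum() / noise_var)
    return float(post_mean), float(post_var)

rng = np.random.default_rng(5)
data = rng.normal(3.0, 1.0, size=50)   # assumed true mean 3, noise variance 1

# Batch update: condition on all observations at once.
batch = update_normal_mean(0.0, 10.0, data, 1.0)

# Recursive update: the previous posterior becomes the next prior.
mean, var = 0.0, 10.0
for x_i in data:
    mean, var = update_normal_mean(mean, var, x_i, 1.0)
recursive = (mean, var)
```

Both routes give the same posterior, which is what makes recursive Bayesian estimation practical with conjugate pairs: only the two parameters need to be stored between observations.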
ML-Bayes estimation differences
The maximum likelihood method only yields point estimates μ^, σ^ of the parameters, not a
distribution over them! The Bayesian approach instead yields a posterior distribution.
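ml2016tutorial3: simple (unregularized) linear regression corresponds to MLE under
Gaussian noise, regularized linear regression corresponds to MAP with a prior on the
weights (e.g. an L2 penalty corresponds to a zero-mean Gaussian prior).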
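
A sketch of this correspondence, assuming Gaussian noise with known variance and a zero-mean isotropic Gaussian prior on the weights. Under these assumptions the MAP estimate is ridge regression with penalty `lam = noise_var / prior_var`. Data and variances are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
n, d = 30, 5
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + rng.normal(0.0, 0.5, size=n)

noise_var = 0.25   # assumed known variance of the Gaussian noise
prior_var = 1.0    # assumed variance of the zero-mean Gaussian prior on the weights

# MLE: ordinary least squares, solves (X^T X) w = X^T y.
w_mle = np.linalg.solve(X.T @ X, X.T @ y)

# MAP: ridge regression with penalty lam = noise_var / prior_var.
lam = noise_var / prior_var
w_map = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Gradient of the negative log-posterior, which should vanish at w_map.
grad_neg_log_post = X.T @ (X @ w_map - y) / noise_var + w_map / prior_var
```

A broader prior (larger `prior_var`) means a smaller penalty, and the MAP estimate approaches the MLE.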
Schematic behaviour of bias and variance

4 - Regression
Linear regression models, least squares, residual sum of squares (RSS)
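
For reference, the residual sum of squares and its minimizer can be written as follows. The notation (design matrix X with rows x_i, targets y, coefficients β) is assumed here rather than taken from the notes, and the closed form requires X^T X to be invertible.

```latex
\mathrm{RSS}(\beta) = \sum_{i=1}^{n} \bigl(y_i - x_i^\top \beta\bigr)^2
                    = (y - X\beta)^\top (y - X\beta),
\qquad
\hat{\beta} = \arg\min_{\beta} \mathrm{RSS}(\beta) = (X^\top X)^{-1} X^\top y .
```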

Written for

Institution: Massachusetts Institute Of Technology
Course: ADVANCED MACHINE LEARNING

Document information

Uploaded on: July 4, 2022
Number of pages: 11
Written in: 2021/2022
Type: Class notes
Professor(s): Jessedunietz
Contains: All classes
