Written by students who passed Immediately available after payment Read online or as PDF Wrong document? Swap it for free 4.6 TrustPilot
logo-home
Class notes

College Notes Machine Learning Week 1 - 5

Rating
-
Sold
1
Pages
20
Uploaded on
21-03-2021
Written in
2019/2020

College aantekeningen Machine Learning Week 1 - 5 (2 colleges per week)

Institution
Course

Content preview

Machine Learning Week 1:

Hoorcollege 1 Introduction

Deductive reasoning: discrete, unambiguous, known rules
Inductive reasoning: ambiguous, experimental rules, unknown rules -> has to be broken
down to a few rules.
Machine learning: provides systems the ability to automatically learn and improve from
experience without being explicitly programmed (inside other software, in analytics/data
mining, science/statistics)
What makes a good ML problem?
● can’t solve it explicitly, approximate solutions are fine
● limited reliability, predictability, interpretability is fine
● plenty of examples to learn from
- e.g recommending a movie, predicting driving time
Abstract tasks generalize the problem of learning
Online learning: acting and learning at the same time
Reinforcement learning: online learning in a world based on delayed feedback
Offline learning: separate learning and acting - take a fixed dataset of examples, train a
model to learn from these examples, test the model to see if it works
Solve ML problems in general by using a few abstract tasks like classification, regression or
clustering.
1. Supervised: explicit examples of input and output, learn to predict the output for an
unseen input. (mailtjes naar spam of niet) labeled data
- Classification: assign a class to each example
Instances : data we provide our system (examples)
Features of the instance: things we measure
Target value: thing we are trying to learn
Dataset fed to learning algorithm -> produce a classifier: a small machine that
attempts to solve the learning problem.

Example: handwriting recognition, chess problem -> translate problems into
tasks.
Feature space: features plotted against each other
1. Linear classifier: draw a line, since a line is defined by two
parameters, we can also plot in a model space. (the two parameters of
the lines) We define a loss function which tells us how well a particular
model does for our data.
Loss function (model) = performance of model on the data. It takes the
model as input and the data as a constant
2. Decision Tree classifier: studies one feature in isolation at every node.
3. K-nearest neighbour: for a new point, it just looks at the k-points that
are closest and assigns the class that is most frequent in that set.
k = hyperparameter: you have to choose it yourself.



Variations:
Features usually numerical or categorical

, Binary classification: a task with two classes
Multiclass classification: more than two classes
Multilabel classification: none, some or all classes may be true (not
part of course)
Class probabilities/scores: the classifier reports a probability for each
class

- Regression: assign a number to each example. Predict a number instead of a
class. Output space and feature space in same model. We want to model the
relation between the features and the target
Loss function regression: 1/n Sum (f(xi) - yi)2 (MSE)
Regression tree: segment feature space into blocks and assign each a
number.
-> Overfitting (memorizing data instead of generalizing): training loss : the
loss for a given model on the training data. Never judge performance on
training data, leave some unseen data.

The aim of machine learning is not to minimise the loss on the training data,
but to minimise the loss on your test data. You don’t get to see the test data
until you’ve chosen your model!
The problem of machine learning is to choose a model that fits the pattern,
and ignores the noise.

2. Unsupervised: only inputs provided, find any pattern that explains something about
the data
- Clustering: ask the learner to split the instances into a number of clusters.
(unsupervised classification)
K-means: recursive nearest mean until the algorithm converges to a stable
state.
- Density Estimation: we want to learn how likely new data is. It thus produces
a number. The task of modelling the probability distribution behind your data.
- Generative Modeling: building a model from which you can sample new
examples is called generative modeling. -> deep learning




Hoorcollege 2: Linear Models 1

, Linear models and search: search methods are extremely versatile.
- Linear Regression: 1 feature f(x) = wx+b. If we have 2 features, each feature gets its
own weight. Multiple features, we arrange the features of a single instance into a
vector.
-> arbitrary number of features: dot product of the data and the weights and add a
bias. The weights and the bias are the parameters of the model.
Dot product!

Which model fits our data best?
1. Loss function: which tells us how well a particular choice of model does
2. Search method: search the space of all models for a particular model that
results in a low loss.
● Common Loss function for regression: MSE. Squares ensure that negative and
positive residuals don’t cancel out, but also that big errors affect the loss more
heavily than small errors.
-> the loss function maps every point in the model space to a loss value.
● Optimization: p^ = arg min loss X,Y (p) . We are trying to find the input (model
parameters) for which a particular function (the loss) is at its minimum. Find lowest
value as possible
1. Random search: Start with a random point p, loop: pick a point p’ close to p, if
loss p’ is smaller than loss p, p<p’. Analogy: hiker in a snowstorm!
It chooses the next point by sampling uniformly among all points with some
pre-chosen distance.
It works well because our problem is convex: if a line drawn between any two
points on the surface lies entirely above the surface. Implicates there is only 1
minimum.

-> What if the loss surface has multiple local minima? The random search will
go to one of the local minima and get stuck there.
Trick: simulated annealing: still pick the next point that isn’t better than the
current one, but only with a small probability. It still has some probability of
escaping.
In many situations local minima are fine.
● Variations in step size: fixed radius, random uniform or normally distributed.
● Space of linear models is continuous, if the space is discrete such as in tree models
you need to define which models are close to each other.
● Black Box Optimization: random search, simulated annealing. Very simple. We only
need to compute the loss function for each model. Can require many iterations. Also
works for discrete model spaces.
● Parallel Search: random search a couple of times independently.
- Introduce some form of communication between the searches to make it
more useful. ​Population Methods.
Evolutionary Algorithms: rank the population by loss, remove the worst half,
‘breed’ (crossover operator) a new population.
-> Slow + expensive for complex models, difficult to tune.
● Branching Search: the more closely we inspect our local neighbourhood, the faster
we converge.

Written for

Institution
Study
Course

Document information

Uploaded on
March 21, 2021
Number of pages
20
Written in
2019/2020
Type
Class notes
Professor(s)
Peter bloem
Contains
Week 1 - 5 (2 colleges per week)

Subjects

$7.64
Get access to the full document:

Wrong document? Swap it for free Within 14 days of purchase and before downloading, you can choose a different document. You can simply spend the amount again.
Written by students who passed
Immediately available after payment
Read online or as PDF

Get to know the seller
Seller avatar
hannah85
3.0
(1)

Get to know the seller

Seller avatar
hannah85 Vrije Universiteit Amsterdam
Follow You need to be logged in order to follow users or courses
Sold
10
Member since
7 year
Number of followers
10
Documents
0
Last sold
3 year ago

3.0

1 reviews

5
0
4
0
3
1
2
0
1
0

Why students choose Stuvia

Created by fellow students, verified by reviews

Quality you can trust: written by students who passed their tests and reviewed by others who've used these notes.

Didn't get what you expected? Choose another document

No worries! You can instantly pick a different document that better fits what you're looking for.

Pay as you like, start learning right away

No subscription, no commitments. Pay the way you're used to via credit card and download your PDF document instantly.

Student with book image

“Bought, downloaded, and aced it. It really can be that simple.”

Alisha Student

Working on your references?

Create accurate citations in APA, MLA and Harvard with our free citation generator.

Working on your references?

Frequently asked questions