Summary: Exam preparation - Machine Learning for the Quantified Self (XM_40012)

Pages: 14
Uploaded on: 05-10-2023
Written in: 2022/2023

All the important terms, formulas, examples, and facts you need to know for the Machine Learning for the Quantified Self exam.

Machine Learning for the Quantified Self
Terminology
A measurement is one value for an attribute recorded at a specific time point. Examples of
attributes are heart rate and velocity.
A time series is a series of measurements in temporal order.
Supervised learning is the machine learning task of inferring a function from a set of labeled
training data.
In unsupervised learning, there is no target measure (or label), and the goal is to describe
the associations and patterns among the attributes.
Reinforcement learning tries to find optimal actions in a given situation so as to maximize a
numerical reward that does not immediately come with the action but later in time.
An example of an instance is $x_1 = [0,45, \text{low}, 0]$. A target for the instance is $g_1 = [\text{inactive}]$.
Outlier detection
An outlier is an observation point that is distant from other observations. There can be two
causes of an outlier:
- Measurement error (Arnold with a heart rate of 400)
- Variability (Arnold trying to push his limits with a heart rate of 190)
Outliers can be detected and removed using two types of outlier detection:
- Distribution based (we assume a certain distribution of the data)
- Distance based (we only look at the distance between data points)
Distribution-based outlier detection
Chauvenet’s criterion assumes a normal distribution of a single attribute. The mean and
variance of the dataset are used as parameters of the normal distribution. A measurement is
rejected if the probability of observing it is less than $\frac{1}{c \cdot N}$, where $c$ is a parameter
indicating the certainty of the outlier, and $N$ is the size of the dataset.
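A minimal Python sketch of Chauvenet’s criterion follows. It assumes that "the probability of observing" a measurement means the two-sided tail probability under the fitted normal distribution, which is one common reading; the summary does not specify this. The function name, the default c = 2, and the heart-rate values are illustrative assumptions.

```python
import math
import statistics


def chauvenet_outliers(values, c=2.0):
    """Return one boolean per value: True if the value is rejected.

    Assumption: the probability of observing a value is taken as the
    two-sided tail probability P(|Z| >= z) under N(mean, std).
    """
    n = len(values)
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    threshold = 1.0 / (c * n)        # rejection threshold 1 / (c * N)

    flags = []
    for v in values:
        if std == 0:
            # All values are identical, so nothing can be rejected.
            flags.append(False)
            continue
        z = abs(v - mean) / std
        # For a standard normal Z: P(|Z| >= z) = erfc(z / sqrt(2)).
        prob = math.erfc(z / math.sqrt(2))
        flags.append(prob < threshold)
    return flags


# Illustrative data: nine plausible heart rates and one measurement error.
heart_rates = [62, 65, 70, 68, 64, 66, 71, 69, 63, 400]
flags = chauvenet_outliers(heart_rates)
rejected = [v for v, f in zip(heart_rates, flags) if f]
print(rejected)
```

A larger c lowers the threshold, so fewer measurements are rejected.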
Mixture models assume that the data can be described by K normal distributions
$\{N(\mu_1, \sigma_1), \ldots, N(\mu_K, \sigma_K)\}$. All 2K parameters can be estimated by maximizing
the likelihood of the observed data. Points with the lowest probabilities are candidates for
removal.
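The scoring step can be sketched in Python as below. The sketch assumes the parameters have already been estimated; the values used here are hand-picked for illustration, not fitted. The per-component weights are also an assumption, since the summary only mentions the means and standard deviations.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of the normal distribution N(mu, sigma) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_density(x, components):
    """Density of x under a mixture; components is a list of
    (weight, mu, sigma) tuples whose weights sum to 1 (assumed)."""
    return sum(w * normal_pdf(x, mu, sigma) for w, mu, sigma in components)


# Hypothetical parameters for K = 2: resting and exercising heart rates.
components = [(0.5, 65.0, 5.0), (0.5, 150.0, 10.0)]

heart_rates = [62, 66, 70, 145, 152, 158, 110]
densities = {x: mixture_density(x, components) for x in heart_rates}

# The point with the lowest density is the first candidate for removal.
candidate = min(densities, key=densities.get)
print(candidate)
```

In practice the parameters would be fitted to the data first, for example with an expectation-maximization procedure, before points are scored this way.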
Distance-based outlier detection
The simple distance-based approach calls two points close if they are within distance $d_{min}$ of
each other. A point is an outlier when more than a fraction $f_{min}$ of the other points lie outside $d_{min}$.
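A minimal Python sketch of the simple distance-based approach, assuming Euclidean distance and that the fraction is taken over all other points in the dataset. The function name and the example values for d_min and f_min are illustrative.

```python
import math


def simple_distance_outliers(points, d_min, f_min):
    """Return one boolean per point: True if more than a fraction f_min
    of the other points lie farther away than d_min.

    Assumes at least two points and Euclidean distance.
    """
    n = len(points)
    flags = []
    for i, p in enumerate(points):
        # Count the other points that are not close to p.
        far = sum(
            1
            for j, q in enumerate(points)
            if j != i and math.dist(p, q) > d_min
        )
        flags.append(far / (n - 1) > f_min)
    return flags


# Illustrative data: a small cluster and one distant point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
flags = simple_distance_outliers(points, d_min=2.0, f_min=0.5)
print(flags)
```

Because a single global d_min is used, every point in a sparse but legitimate cluster can be flagged, which is the weakness the local outlier factor addresses.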
The local outlier factor also takes the density of the surrounding points into account, to
prevent all points in a less dense cluster from being flagged as outliers. The first step is to
define the k-distance $k_{dist}(x_i)$ of a point $x_i$. This is defined as the largest distance among the
distances to the k closest points. In other words, there should be at most k − 1 points with a
distance less than $k_{dist}$ and at least one point that is exactly $k_{dist}$ away. Together, these points
form the neighborhood set $k_{dist\_nh}(x_i)$.
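The reachability distance of a point $x_i$ to another point $x$ is:
$$k_{reach\_dist}(x_i, x) = \max\left(k_{dist}(x),\ d(x, x_i)\right)$$
This expresses that the reachability distance is the real distance if the point $x_i$ is not among the
k nearest points of $x$ (in that case the value of $d(x, x_i)$ will be larger than $k_{dist}(x)$), and
otherwise it is the $k_{dist}$ of that point. In effect, the distance value of all points within $k_{dist}(x)$
is set equal to $k_{dist}(x)$.
Next, the local reachability density around our point $x_i$ is the inverse of the average
reachability distance to the points in its neighborhood:
$$k_{lrd}(x_i) = \frac{1}{\left(\sum_{x \in k_{dist\_nh}(x_i)} k_{reach\_dist}(x_i, x)\right) / \left|k_{dist\_nh}(x_i)\right|}$$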
Intuitively, this says something about how close $x_i$ is to its neighbors. If a point is among the
k nearest neighbors of $x_i$, but this relationship does not hold the other way, $x_i$ might be an outlier.
The lower the average distance to the neighbors, the higher the local reachability density
becomes.
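The three steps can be sketched in Python as follows. The sketch assumes Euclidean distance, takes the reachability distance of $x_i$ to $x$ as the larger of $k_{dist}(x)$ and $d(x, x_i)$, and takes the local reachability density as the inverse of the average reachability distance over the neighborhood. Function names and data are illustrative, and duplicate points (which would give an average of zero) are not handled.

```python
import math


def k_distance(points, i, k):
    """Largest distance among the distances from points[i] to its k closest points."""
    dists = sorted(math.dist(points[i], q) for j, q in enumerate(points) if j != i)
    return dists[k - 1]


def k_neighborhood(points, i, k):
    """Indices of all points within the k-distance of points[i] (ties included)."""
    kd = k_distance(points, i, k)
    return [
        j
        for j, q in enumerate(points)
        if j != i and math.dist(points[i], q) <= kd
    ]


def reach_dist(points, i, j, k):
    """Reachability distance of points[i] to points[j]."""
    return max(k_distance(points, j, k), math.dist(points[i], points[j]))


def local_reachability_density(points, i, k):
    """Inverse of the average reachability distance to the neighborhood."""
    neighborhood = k_neighborhood(points, i, k)
    average = sum(reach_dist(points, i, j, k) for j in neighborhood) / len(neighborhood)
    return 1.0 / average


# Illustrative data: a dense square of four points and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
lrds = [local_reachability_density(points, i, k=2) for i in range(len(points))]

# The isolated point has the lowest local reachability density.
lowest = min(range(len(points)), key=lambda i: lrds[i])
print(points[lowest])
```

The four points in the square all have neighbors at distance 1, so their densities are equal, while the isolated point only reaches its neighbors over much larger distances.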

Seller: sandervanwerkhooven, Vrije Universiteit Amsterdam