
Elements of Machine Learning summary + practice exam

Pages: 60
Uploaded on: 12-06-2024
Written in: 2023/2024

Summary of all lessons from Elements of Machine Learning, enhanced with explanations by ChatGPT to deepen your understanding. Perfect for your open book exam in Elements of Machine Learning. Additionally, a practice exam based on last year's test is included.


Elements of machine learning
Book: https://www.nrigroupindia.com/e-book/Introduction%20to%20Machine%20Learning%20with%20Python%20(%20PDFDrive.com%20)-min.pdf
Machine learning
Intelligence in machine learning = being able to do a task
Greyhound vs Labrador
We can collect thousands of images for the computer
Or we could describe each dog to the computer:
- Height of the dog
- Weight
- Colour
-> These are called features
More features usually carry more information and make the computer more precise, provided they are relevant to the task
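As a small illustration of describing dogs by features, the sketch below stores each dog as a list of numbers. The measurements and the numeric colour code are invented for this example, not data from the course.

```python
# Each dog is described by three features: height, weight and colour.
# All values are made up for illustration.
FEATURE_NAMES = ["height_cm", "weight_kg", "colour_code"]

dogs = [
    {"breed": "greyhound", "features": [72.0, 30.0, 0]},
    {"breed": "labrador", "features": [57.0, 32.0, 1]},
]

for dog in dogs:
    # Pair every feature name with its value so the vector is readable.
    described = dict(zip(FEATURE_NAMES, dog["features"]))
    print(dog["breed"], described)
```

The computer only ever sees the list of numbers; the feature names are there for the human reader.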
Branches of ML
- Supervised learning: learn from examples + the desired ‘target’ behaviour of the computer. E.g. with an input of data, we give the model examples of how it needs to work -> if it is a dog, it needs to recognize it. We provide both the examples and their labels (the targets).
- Unsupervised learning -> useful if you want to find structure in data. Even if you don’t tell the computer what cats and dogs are, it might find structure and tell you that there are classes. E.g. fraudulent transactions: using structure to find out that a transaction in Nigeria or Cambodia is different from normal.
- Reinforcement learning: learning a behaviour by interacting with an environment and receiving rewards and punishments based on the current behaviour. Connected to reinforcement learning in animals. The algorithm decides how to change its behaviour to maximize the amount of reward.
Supervised vs unsupervised learning
- Supervised learning requires labelled data, that is, data samples that come with a target label
- Labelled data is expensive: someone must look at each sample and assign a label. This needs to be done by real people, who can label at most about 1000 samples per day and need to be paid for it.
- Unlabelled data is cheap, e.g. it is easy to collect millions of images and texts from the internet
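To get a feeling for why labelling is expensive, the arithmetic can be sketched as follows. The rate of 1000 samples per person per day comes from the notes; the dataset size and the number of annotators are assumptions chosen for the example.

```python
import math

SAMPLES_PER_PERSON_PER_DAY = 1000  # rate quoted in the notes


def labelling_days(n_samples: int, n_annotators: int = 1) -> int:
    """Whole working days needed to label n_samples."""
    per_day = SAMPLES_PER_PERSON_PER_DAY * n_annotators
    return math.ceil(n_samples / per_day)


# Hypothetical dataset of one million images.
print(labelling_days(1_000_000))      # one annotator
print(labelling_days(1_000_000, 10))  # ten annotators
```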
Examples
- Email spam filter: given an email, learn to recognize whether it is spam or legit
- Speech recognition: transcribe spoken sentences into text
- Language translation -> a simple language prediction model which uses its knowledge to predict what to answer
- Recommender systems: given data on the online behaviour of users in general (and possibly of a specific user), provide suggestions for items that are most pertinent to a particular user
- Sentiment analysis: given some text (e.g. a tweet or a product review), determine whether the content is positive, negative, or neutral
- Time series forecasting: make future predictions based on historical data (regression)
- Fraud detection
Types of data
Images
- Computers work with numbers, so data has to be represented accordingly
- Pictures/visual data: arrays of numbers (RGB values for each pixel)
- Colour images: 3 colour channels, commonly red, green and blue
- Grayscale: light intensity, integer values between 0 and 255
- Binary: each pixel is either 0 or 1, useful to find contours
Text/language:
- We need to convert letters or words into a format computers can understand.
- Usually we separate text into individual letters or words and convert each into a vector
i. Example: each letter can become a vector with 26 elements, all set to 0 except for the element at the index of the character, which is set to 1. E.g. C = (0,0,1,0,...,0)
ii. With words we can do the same thing using a dictionary, and using a vector of the same length as the dictionary.
iii. The resulting vector is usually too large and sparse to use directly, so it is often passed through an ‘embedding’ function to compress it.
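The three image representations can be sketched with NumPy arrays. This is a minimal illustration: the 2x2 image, the channel averaging used for the grayscale conversion, and the threshold of 128 are all choices made for this example.

```python
import numpy as np

# A tiny 2x2 colour image: shape (height, width, 3 channels), values 0..255.
colour = np.array(
    [[[255, 0, 0], [0, 255, 0]],
     [[0, 0, 255], [255, 255, 255]]],
    dtype=np.uint8,
)

# Grayscale: one intensity per pixel. Averaging the channels is a simple
# choice made here; other weightings exist.
gray = colour.mean(axis=2).astype(np.uint8)

# Binary: each pixel is 0 or 1, here by thresholding at 128 (arbitrary).
binary = (gray >= 128).astype(np.uint8)

print(colour.shape, gray.shape)
print(gray)
print(binary)
```

Only the white pixel ends up as 1 in the binary image; the pure red, green and blue pixels all average to an intensity of 85.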
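The letter and word encodings described above can be sketched with one small helper. The function name `one_hot` and the three-word dictionary are hypothetical, chosen only for this illustration; the embedding step is not shown.

```python
import string


def one_hot(item, vocabulary):
    """Vector of zeros with a single 1 at the item's index in vocabulary."""
    vector = [0] * len(vocabulary)
    vector[vocabulary.index(item)] = 1
    return vector


letters = list(string.ascii_uppercase)  # 26 letters
words = ["cat", "dog", "labrador"]      # toy dictionary (made up)

c = one_hot("C", letters)
print(c[:5], len(c))          # first five elements and the vector length
print(one_hot("dog", words))  # same idea, indexed by the dictionary
```

With a realistic dictionary of tens of thousands of words, almost every element is 0, which is why such vectors are called sparse.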


- Applicable to any categorisation: we need to use a numerical representation rather than the category names
- One-hot encoding is used to represent categorical data with a vector of numbers such that the elements of the vector are all 0 except for the correct category, which is given a 1.
- Categorical data -> discrete and countable.
Often we work with data that consists of multiple different fields.
Example: Credit card fraud detection, objective: Detect whether a given transaction is legit or not
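A record with several fields can be turned into numbers field by field. The sketch below uses `pandas.get_dummies` to one-hot encode a categorical field while leaving the numeric field untouched. The column names and the three transactions are invented for illustration.

```python
import pandas as pd

# Hypothetical transactions; column names and values are invented.
transactions = pd.DataFrame({
    "amount_eur": [12.50, 980.00, 43.10],
    "country": ["NL", "KH", "NL"],
    "is_fraud": [0, 1, 0],  # the label
})

# One-hot encode the categorical field; numeric fields stay as they are.
encoded = pd.get_dummies(transactions, columns=["country"], dtype=int)
print(encoded)
```

Each country becomes its own 0/1 column, so every row is now a vector of numbers plus a label.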
The Iris dataset
- A dataset is a collection of data
- Each instance in the dataset has features/attributes that describe it, and may come with a label/class
- The features of each instance make it possible to determine which species of flower each instance is
Wrapping up
- Machine learning is not magic
- We can work with different types of data, but we have to organize them in a way that
computers can understand (vectors of numbers)
- In simple cases we can see patterns in data by eye
- Machine learning methods can do that for us, even when we can’t
Tutorial 1
Exploring the first data set:
- First import the pandas package: import pandas as pd
- Make sure the file you want to read is in the same folder
- Define the variable like this
i. iris = pd.read_csv("iris.csv"), where pd is the module, read_csv is the function and "iris.csv" is the name of the file
- Look at the first x rows -> iris.head(x)
- Explore the shape by using iris.shape -> (150,5): 150 rows and 5 columns
- Explore how many instances of each type there are by using iris.value_counts("class")
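Put together, the exploration steps look roughly like this. So that the sketch runs without the file, it reads a handful of sample rows from a string instead of iris.csv; the column names other than "class" are an assumption about the file's header.

```python
import io

import pandas as pd

# Stand-in for iris.csv: a few sample rows so the example runs without
# the file. Column names other than "class" are assumed.
csv_text = """sepal_length,sepal_width,petal_length,petal_width,class
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""

# With the real file in the same folder: iris = pd.read_csv("iris.csv")
iris = pd.read_csv(io.StringIO(csv_text))

print(iris.head(2))                # first 2 rows
print(iris.shape)                  # (rows, columns)
print(iris.value_counts("class"))  # number of instances per class
```

On the full dataset the shape would be (150, 5) as stated in the notes; here it is (4, 5) because only four rows are included.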
Lecture 2
Supervised learning and k-NN
Recap
- Methods that extract knowledge from the observed data
- Looking for patterns that can be exploited to solve a task
- Closely related to statistics and optimization

- We usually want to predict something
i. What object is in a picture
ii. Which direction to steer a self-driving car
iii. Which sentence a user is saying
iv. Etc.
Many branches of ML
- Supervised learning
i. Learn from examples + the desired ‘target’ behaviour of the computer. Finding patterns in data
- Unsupervised learning
- Reinforcement learning -> no dataset but an environment, e.g. a robot that acts and gets a reward.
i. The robot gets a reward in cycles, which acts as a grade
ii. It gets +1 if it completes the task or -1 when it doesn’t
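The reward cycle can be sketched with a toy environment. This is not a full reinforcement learning algorithm, only the loop of acting, receiving +1 or -1, and then preferring the behaviour that earned the most reward. The environment, the action names, and the number of cycles are invented for this example.

```python
# Toy environment (invented): only the action "right" completes the task.
def reward(action: str) -> int:
    return 1 if action == "right" else -1


actions = ["left", "right", "stay"]
total_reward = {action: 0 for action in actions}

# Try every action for a few cycles and record the rewards received.
for cycle in range(3):
    for action in actions:
        total_reward[action] += reward(action)

# Change behaviour: prefer the action that collected the most reward.
best_action = max(total_reward, key=total_reward.get)
print(total_reward, best_action)
```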
Supervised learning
- Works by giving examples
- In supervised learning we want to find a model that maps inputs to outputs, given examples of correct pairs.
- The model is a mathematical function -> it takes an input and gives an output
- Example: a cat vs dog classifier takes images as input and outputs whether the image is of a cat or a dog
i. A dataset is collected containing a large number of pictures of cats and dogs
ii. Classifications are encoded with vectors like {0,0,1}
iii. For each picture a human writes a label to tell whether the picture contains a cat or a dog
iv. The dataset is a collection of pairs
v. The {image, label} pairs are used to show our machine learning model how it should behave
vi. The objective is for the chosen ML method to find a function that behaves as shown, and that generalizes to new unseen images.
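Since this lecture is about k-NN, the idea of learning from {input, label} pairs can be sketched with a small k-nearest-neighbours classifier. To keep it short, the inputs here are two hand-made features (height and weight) rather than images, and all numbers are invented.

```python
import math
from collections import Counter

# Labelled examples: (features, label). Features are (height_cm, weight_kg);
# all numbers are invented for illustration.
training_pairs = [
    ((25.0, 4.0), "cat"),
    ((23.0, 5.0), "cat"),
    ((24.0, 3.5), "cat"),
    ((60.0, 30.0), "dog"),
    ((55.0, 25.0), "dog"),
    ((65.0, 35.0), "dog"),
]


def predict(features, pairs, k=3):
    """Label held by the majority of the k closest training examples."""
    nearest = sorted(pairs, key=lambda pair: math.dist(features, pair[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Two inputs that do not appear in the training pairs.
print(predict((58.0, 28.0), training_pairs))
print(predict((26.0, 4.5), training_pairs))
```

The function never saw the two query points during "training", yet it labels them by looking at the most similar known examples, which is a simple form of generalizing to unseen inputs.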
Supervised learning: two tasks
Classification
- The model trained on the data defines a decision boundary that separates the data
Regression
- The model fits the data to describe the relation between 2 features or between a feature (e.g. height) and the label (e.g. yes/no)
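Both tasks can be sketched in a few lines. The regression part fits a straight line with `numpy.polyfit`; the classification part uses a hand-picked threshold as decision boundary, whereas a trained model would learn the boundary from data. The data points and the threshold of 40 are assumptions made for this example.

```python
import numpy as np

# Regression: fit a straight line relating two features. The invented data
# lies exactly on weight = 0.5 * height + 2.
height = np.array([20.0, 40.0, 60.0, 80.0])
weight = 0.5 * height + 2.0
slope, intercept = np.polyfit(height, weight, deg=1)
print(round(slope, 3), round(intercept, 3))


# Classification: a decision boundary on one feature separates the classes.
# The boundary is hand-picked here, not learned.
def classify(height_cm, boundary=40.0):
    return "dog" if height_cm > boundary else "cat"


print(classify(25.0), classify(60.0))
```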

Seller: noahveldhoen, Tilburg University