Summary · 76 pages · Uploaded 08-09-2020 · Academic year 2019/2020

Complete summary of all lectures in the Supercrunchers course; all you need to know for the exam.
0HM270 - Supercrunchers Summary – 19-20 - Q4

Content
Lecture 1: Intro lecture

Lecture 2: User aspects of Recommender systems

Lecture 3: Manager vs Machine and more

Lecture 4: Interactive recommender systems

Lecture 5: Brunswik’s Lens model / Dawes 1974

Lecture 6: Learning analytics and skin cancer detection

Lecture 7: Some notes on prediction

Lecture 8: Netflix for Good – Guest lecture Alain Stark

Lecture 9: Website (online) adaptation

Lecture 1: Intro lecture
Supercrunching = using (sometimes a lot of) data to predict something that
- We normally cannot predict well
- Humans normally tend to predict themselves (by expert judgment or intuition)

HMI = human model interaction

The timeline of ideas: ideas → .. → … → … → world-wide implementation (difficult to get here)
- Which hurdles need to be overcome?
- Can we find consistencies across topics?
- Which kind of crunchers are more likely to be adopted?
- When do which kind of counter-arguments pop up? What can we do about these?
- Etc.

Example: Cook County Hospital
Not enough rooms, overworked staff, many patients without insurance, etc.
The most common complaint: acute chest pain. There was little agreement between physicians on what counts as high, medium, or low risk.
Goldman found that only 4 things matter: ECG, blood pressure, fluid in the lungs, and unstable angina. He turned this into a decision scheme.
Reilly tested Goldman's idea: physicians were right 82% of the time, Goldman's scheme 95% of the time.
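The scheme can be pictured as a simple counting rule. A minimal sketch in Python: the four factors come from the lecture, but the thresholds and the counting logic below are illustrative assumptions, not the actual clinical cut-offs of the Goldman/Reilly protocol.

```python
# Sketch of a Goldman-style counting rule for chest-pain triage.
# The four factors come from the lecture; the thresholds and counting
# logic are illustrative assumptions, NOT the real clinical cut-offs.

def risk_level(ecg_abnormal: bool, low_blood_pressure: bool,
               fluid_in_lungs: bool, unstable_angina: bool) -> str:
    """Classify risk by counting how many of the four factors are present."""
    present = sum([ecg_abnormal, low_blood_pressure,
                   fluid_in_lungs, unstable_angina])
    if present >= 3:
        return "high"
    if present >= 1:
        return "medium"
    return "low"

print(risk_level(True, True, True, False))     # prints "high"
print(risk_level(False, False, False, False))  # prints "low"
```

The point of such a scheme is exactly its rigidity: it stores, retrieves, and combines the same few cues the same way every time, which is where humans are inconsistent.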

Clinical prediction (human (expert)) versus statistical prediction (computer model, scheme, etc.)
Most often, the model wins!! But this depends on the context.

Where to expect that a human will outperform a computer
- Emotion recognition / emotional support
- In situations where social cues are important
- Where human interaction is very important
- Intuition

Why is it that computer models often beat (expert) humans?
In total, there are 88 well-documented reasons/flaws in human judgment. Some of these are:
- Our memory fools us (Wagenaar)
- Dealing with probabilities / base rate neglect (Bar-Hillel)
- We emphasize the improbable (Stickler)
- Confirmation bias (Edwards, Wason)
- Hindsight bias (Fischhoff)
- Cognitive dissonance (Festinger)
- Mental floating frankfurter: what you see when you put your fingertips close together in front of your eyes and try to look through them: a floating piece of meat. You know it is not there, but as soon as you see it, you cannot help it. The same holds for decision-making biases: even though you may know you have them, that knowledge does not help you get rid of them.
- Mental sets: certain ways of thinking you have learned. These make it difficult to think outside the box; you use less of your creativity. (Redelmeier, Tversky)
- Memory: people are not very good at remembering things. We don't only forget things, we also get 'extra stuff in' that is not supposed to be there, so we remember things (partly) wrong.
- Availability heuristic: a mental shortcut that relies on immediate examples that come to mind
when evaluating a specific topic.
- Dealing with probabilities = difficult for people
- Overconfidence. E.g. estimates of how many quiz questions you answered correctly are generally too high. The better you are at something, the worse the overconfidence generally is.
- Finding non-existent patterns. Predict which light will flash next when green appears 2/3 of the time and red 1/3 of the time. The optimal strategy is to always press green, because you don't know what the next outcome will be; this gets you 2/3 correct. The other strategy is to guess each time, pressing green about twice for every red ("probability matching"). This scores lower than 2/3 correct.
- The broken leg cue. E.g. predicting whether you will go to the cinema this Saturday: if I hear you have a broken leg, I know you won't go. In our current situation, the coronavirus is the broken leg cue; because of it, you can predict that people are not going to the cinema. If you know the broken leg cue, in all likelihood you would make a perfect prediction. The problem: humans see broken leg cues everywhere, far more often than they actually should.
- The issue of feedback. People learn when they get immediate and unambiguous feedback, but in many cases such feedback is simply not there: there is often a long delay, it is not obvious what exactly you did right or wrong, and you don't know whether what you did influenced the outcome or something else did.
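The green/red task above can be simulated to show why always pressing the majority colour beats probability matching. A minimal sketch; the trial count and random seed are arbitrary choices:

```python
import random

random.seed(0)

# Simulate the green/red prediction task: green appears 2/3 of the time.
trials = 100_000
outcomes = ["green" if random.random() < 2/3 else "red" for _ in range(trials)]

# Strategy 1 ("maximizing"): always predict the majority colour.
always_green = sum(o == "green" for o in outcomes) / trials

# Strategy 2 ("probability matching"): guess green 2/3 of the time.
matching = sum((random.random() < 2/3) == (o == "green")
               for o in outcomes) / trials

print(f"always green: {always_green:.3f}")  # close to 2/3 ~ 0.667
print(f"matching:     {matching:.3f}")      # close to 5/9 ~ 0.556
```

Matching scores 2/3 · 2/3 + 1/3 · 1/3 = 5/9 in expectation, which is why chasing the "pattern" loses to the boring constant guess.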

Decision making = store, retrieve, combine information + learn from feedback
A human is not very well equipped to do that, a computer is.

Two competing theories
1. Naturalistic decision making (NDM)
a. = an attempt to understand how people make decisions in real-world contexts that
are meaningful and familiar to them.
b. It is not clear why people make a certain decision, but there is experience and intuition built up over time that helps in making the decision (Klein, Shanteau)
c. Counterargument: studies are done in the lab, where decision making is different from normal life.
2. Fast and frugal heuristics

a. People don’t decide in perfect ways, but they have shortcuts which are (over time) good enough to make decisions (Gigerenzer)

Difficult issues when implementing ideas:
- When the model makes a mistake, who can we blame?
- Patients may complain. E.g. "Who is this idiot treating me with a card/scheme? Why can't I get a real doctor, who doesn't need a card?"
Possible solution: look at the scheme before entering the patient's room, so you can remember it and don't need the card in the room with the patient anymore.

Conclusion:
It is not:
- Humans (or experts) are stupid
Instead:
- Models can beat humans, sometimes
- We have quite a good idea as to why this happens: people make mistakes that are consistent (not random)
- Implementation issues are often harder to solve than building the model (modeling is easy, humans are complicated)

Lecture 2: User aspects of Recommender systems
Recommender systems:
- Field that combines machine learning, information retrieval and human-computer
interaction (HCI)
- Help overcome information overload, find relevant stuff in the big pile of information
- Offers personalized suggestions based on what it knows about the user, e.g. history of what
the user liked and disliked
- Main task: predict what other items the user would also like
- The prediction task is part of the recommendation task, but a good prediction does not automatically make a good recommendation.

Most popular methods of recommender systems:
- Content-based filtering
- Collaborative filtering (CF)
o Neighborhood methods
▪ User-based
▪ Item-based
o Matrix factorization / SVD (singular value decomposition)

What data to use to build a user profile?
- Explicit data
o Ratings of individual items
o Different types of scales
- Implicit data
o Click streams
o Wish list
o Purchase data
o Viewing times

Content-based recommender system (personalization)
- User profile is content description of previous interests (expressed through ratings)

- It uses these content features (meta-data) to find other movies
o Meta-data can be the genre, the actors, etc.
- Advantages:
o Profiles are individual in nature and don’t rely on other users (benefit of privacy!)
o Easy to explain and control by the user
o Can be run client-side (privacy!)
- Drawbacks:
o Overspecializes the item selection
▪ Only based on previous ratings by this particular user
o Difficult to get unexpected items
▪ And people value novel, serendipitous items the most. We want to find new
things.
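The content-based idea can be sketched in a few lines: build the profile as the average of the liked items' feature vectors, then rank unseen items by cosine similarity to that profile. All titles, genres, and "liked" choices below are made up for illustration.

```python
import math

# Minimal content-based sketch: items are binary genre vectors (the
# meta-data), and the user profile is the mean vector of liked items.
# Titles, genres, and the liked set are made up for illustration.
genres = ["action", "comedy", "drama"]
items = {
    "Movie A": [1, 0, 0],
    "Movie B": [1, 0, 1],
    "Movie C": [0, 1, 0],
    "Movie D": [1, 1, 0],
}
liked = ["Movie A", "Movie B"]  # movies this user rated highly

# User profile = average of the liked items' feature vectors.
profile = [sum(items[m][i] for m in liked) / len(liked)
           for i in range(len(genres))]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Score every unseen movie against the profile; recommend the best match.
scores = {m: cosine(profile, v) for m, v in items.items() if m not in liked}
best = max(scores, key=scores.get)
print(best)  # the unseen movie closest to the user's genre profile
```

Note how the sketch exhibits the overspecialization drawback directly: only movies sharing genres with past likes can ever score above zero.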

Collaborative filtering (CF)
- Matching user’s ratings with those of similar users
o Find out how users are similar in what they like and dislike
o Completely data driven, no meta-data needed
- Advantages:
o Domain-free and no explicit profiles/content needed
o Can express aspects in the data that are hard to profile
- Drawbacks:
o Cold-start problem: new users have not rated anything / new items have no ratings
yet. So, you don’t know what to recommend.
o Sparsity: each user has only rated a few items, so you are missing a lot of information
o Server-side: privacy issues in data collection and storage

2 types of collaborative filtering (CF):
- Neighborhood methods (clustering, K-NN)
- Latent factor models (matrix factorization, dimensionality reduction methods)
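The latent-factor idea can be sketched with a plain truncated SVD. The ratings matrix below is made up, and filling missing ratings with zeros is a simplification for illustration; real systems optimize only over the observed entries.

```python
import numpy as np

# Sketch of the latent-factor idea: factorize a tiny ratings matrix with
# truncated SVD and rebuild it from k latent factors. Zeros stand in for
# missing ratings here purely for illustration; real systems fit the
# factors on observed ratings only.
R = np.array([[5., 4., 0., 1.],
              [4., 5., 1., 0.],
              [1., 0., 5., 4.],
              [0., 1., 4., 5.]])

U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 2  # keep only the 2 strongest latent factors
R_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# R_hat is a low-rank approximation of R: the entries that were 0
# ("unrated") get filled in with predicted values.
print(np.round(R_hat, 1))
```

The two latent factors here roughly capture the two taste groups in the data (users who like items 1-2 versus items 3-4), without any meta-data being supplied.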

2 types of neighborhood methods:
- User-based collaborative filtering
o Find users similar to A, then form a neighborhood (clique). Find items rated by the clique but not by A and predict how A would rate each of these (weighting other users' ratings by their similarity to A). Recommend the items with the highest predicted ratings.
o Drawbacks:
▪ Computationally expensive, because you have to find similar users among all users in the system (a large database), which takes a lot of time.
- Item-based collaborative filtering
o Similar, but based on similar items:
▪ Find items that are similar (instead of users), by calculating the similarity
between items based on user ratings. Generate a similarity matrix between
the items, based on similarity in the rating profile. So, when movies are rated
similarly by the same people, they are more similar. Use the similarity matrix
to calculate what the expected rating of other items would be.
▪ ‘If you like these items, you might like this as well’
o Computationally better for cases with many more users than items
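The item-based steps above can be sketched as follows. This uses mean-centered (adjusted) cosine similarity, a common choice for rating data; the ratings themselves are made up for illustration.

```python
import math

# Item-based CF sketch with mean-centered (adjusted) cosine similarity.
# Rows are users, columns are items A, B, C; None marks a missing rating.
# The data and the prediction formula are illustrative, not a production recipe.
ratings = [
    [5, 4, None],   # user 0 likes A and B, hasn't rated C
    [4, 5, 1],
    [1, 2, 5],
    [2, 1, 4],
]

def column(j):
    return [row[j] for row in ratings]

def mean(col):
    vals = [v for v in col if v is not None]
    return sum(vals) / len(vals)

def centered_cosine(j, k):
    """Cosine similarity between two item columns after subtracting item means."""
    cj, ck = column(j), column(k)
    mj, mk = mean(cj), mean(ck)
    pairs = [(x - mj, y - mk) for x, y in zip(cj, ck)
             if x is not None and y is not None]
    dot = sum(a * b for a, b in pairs)
    na = math.sqrt(sum(a * a for a, _ in pairs))
    nb = math.sqrt(sum(b * b for _, b in pairs))
    return dot / (na * nb) if na and nb else 0.0

# Predict user 0's rating of item C (index 2) from items A (0) and B (1):
# start at C's mean rating and shift it by the similarity-weighted
# deviations of user 0's own ratings from the other items' means.
user = ratings[0]
sims = {j: centered_cosine(2, j) for j in (0, 1)}
pred = mean(column(2)) + (
    sum(sims[j] * (user[j] - mean(column(j))) for j in (0, 1))
    / sum(abs(s) for s in sims.values())
)
print(f"predicted rating for item C: {pred:.2f}")
```

Because item C is rated oppositely to A and B by the other users, both similarities come out negative and user 0 gets a low predicted rating for C, which matches the 'if you like these items, you might (not) like this' intuition.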

How to measure performance of CF? And how to optimize the prediction model?
- Deviation of algorithmic prediction from actual user ratings
- Training-test set approach: fit the model on a training set, then evaluate its predictions on a held-out test set.
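A standard way to quantify the deviation of predictions from actual ratings is RMSE (root mean squared error). A minimal sketch with made-up held-out ratings:

```python
import math

# Sketch of the deviation measure: RMSE between the model's predictions
# and the actual held-out ratings. The rating pairs are made up.
actual    = [4.0, 3.0, 5.0, 2.0, 4.0]   # held-out test-set ratings
predicted = [3.5, 3.0, 4.0, 2.5, 4.5]   # the model's predictions

rmse = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted))
                 / len(actual))
print(f"RMSE = {rmse:.3f}")  # prints "RMSE = 0.592"
```

Squaring penalizes large misses more heavily than small ones, so a model that is occasionally far off scores worse than one that is consistently slightly off.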

Written by lynnheesterbeek, Technische Universiteit Eindhoven.