Summary: Classification: k Nearest Neighbours
Pages: 3
Uploaded on: 03-01-2022
Written in: 2005/2006

Recall that in classification problems, the task is to decide to which of a predefined, finite set of categories an object belongs. In this lecture, we look at another classification method, known as the k-nearest-neighbours or kNN method.


Classification: k Nearest Neighbours

1 Introduction

Recall that in classification problems, the task is to decide to which of a predefined, finite set of categories an object belongs. In this lecture, we look at another classification method. The method we look at is known as the k-nearest-neighbours or kNN method.

Just as with the probabilistic methods that we looked at in the previous two lectures, we need a dataset of examples. Each example describes an instance and gives the class to which it belongs. As before, we'll assume instances are described by a set of attribute-value pairs, and there is a finite set of class labels L. So the dataset comprises examples of the form <{A1 = a1, A2 = a2, ..., An = an}, class = cl>. For a particular instance x, we will refer to x's value for attribute Ai as x.Ai.

In the probabilistic methods that we looked at, the learning step involved computing probabilities from the dataset. Once this was done, in principle the dataset could be thrown away; classification was done using just the probabilities. In k-nearest-neighbours, the learning step is trivial: we simply store the dataset in the system's memory. Yes, that's it!

In the classification step, we are given an instance q (the query), whose attributes we will refer to as q.Ai, and we wish to know its class. In kNN, the class of q is found as follows:

1. Find the k instances in the dataset that are closest to q.
2. These k instances then vote to determine the class of q.

Ties (whether they arise when finding the closest instances to q or when voting for the class of q) are broken arbitrarily.

In the following visualisation of this method, we assume there are only two attributes A1 and A2, and two different classes (where circles with solid fill represent instances in the dataset of one class and circles with hashed fill represent instances in the dataset of the other class). Query q is here being classified by its 3 nearest neighbours:

[Figure: scatter plot over axes Attribute A1 and Attribute A2, showing the dataset instances of the two classes, the query q, and its k = 3 nearest neighbours.]

All that remains to do is discuss how distance is measured, and how the voting works.

2 Distance

2.1 Local distance functions, global distance functions and weights

A global distance function, dist, can be defined by combining in some way a number of local distance functions, dist_A, one per attribute. The easiest way of combining them is to sum them:

    dist(x, q) =def Σ_{i=1}^{n} dist_Ai(x.Ai, q.Ai)

More generally, the global distance can be defined as a weighted sum of the local distances. The weights wi allow different attributes to have different importances in the computation of the overall distance. Weights sometimes lie between zero and one; a weight of zero would indicate a totally irrelevant attribute.

    dist(x, q) =def Σ_{i=1}^{n} wi × dist_Ai(x.Ai, q.Ai)

A weighted average is also common:

    dist(x, q) =def (Σ_{i=1}^{n} wi × dist_Ai(x.Ai, q.Ai)) / (Σ_{i=1}^{n} wi)

The weights can be supplied by the system designer. There are also ways of learning weights from a dataset.

What we haven't discussed is how to define the local distance functions. We will exemplify the different definitions by computing the distance between query q and instances x1 and x2 below:

               x1       x2       q
    sex        female   male     male
    weight     60       75       70
    amount     4        2        1
    meal       full     full     snack
    duration   90       60       30
    class      over     under    ?

The attributes and their values are: sex (male/female), weight (between 50 and 150 inclusive), amount of alcohol consumed in units (1-16 inc.), last meal consumed today (none, snack or full meal), and duration of drinking session (20-320 minutes inc.). The classes are: over or under the drink driving limit.

2.2 Hamming distance

The easiest local distance function, known as the overlap function, returns 0 if the two values are equal and 1 otherwise:

    dist_A(x.A, q.A) =def 0 if x.A = q.A, 1 otherwise

If the global distance function is defined as the sum of the local distances, then we're simply counting the number of attributes on which the two instances disagree. This is called Hamming distance. Weighted sums and weighted averages are also possible.

Class Exercise. What is the Hamming distance between x1 and q and between x2 and q?

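The overlap function and Hamming distance of Section 2.2 can be sketched in a few lines of Python. The instances below are taken from the worked table in Section 2.1; the function names are illustrative, not from the notes:

```python
# Overlap local distance: 0 if the two values are equal, 1 otherwise.
def overlap(a, b):
    return 0 if a == b else 1

# Hamming global distance: the sum of the overlap distances, i.e. the
# number of attributes on which the two instances disagree.
def hamming(x, q):
    return sum(overlap(x[attr], q[attr]) for attr in q)

# Instances from the worked example (class labels omitted).
x1 = {"sex": "female", "weight": 60, "amount": 4, "meal": "full",  "duration": 90}
x2 = {"sex": "male",   "weight": 75, "amount": 2, "meal": "full",  "duration": 60}
q  = {"sex": "male",   "weight": 70, "amount": 1, "meal": "snack", "duration": 30}

print(hamming(x1, q))  # x1 and q disagree on every attribute -> 5
print(hamming(x2, q))  # x2 and q agree only on sex -> 4
```

Note that running this also answers the class exercise: x1 differs from q on all five attributes, x2 on four of them.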

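The weighted-sum and weighted-average global distances of Section 2.1 can be sketched in the same style. The weights below are invented purely for illustration (including a weight of zero for a supposedly irrelevant attribute), with the overlap function standing in for each local distance:

```python
def overlap(a, b):
    return 0 if a == b else 1

# Weighted sum: dist(x, q) = sum_i wi * dist_Ai(x.Ai, q.Ai)
def weighted_sum(x, q, w):
    return sum(w[a] * overlap(x[a], q[a]) for a in w)

# Weighted average: the weighted sum divided by the sum of the weights.
def weighted_average(x, q, w):
    return weighted_sum(x, q, w) / sum(w.values())

x1 = {"sex": "female", "weight": 60, "amount": 4, "meal": "full",  "duration": 90}
q  = {"sex": "male",   "weight": 70, "amount": 1, "meal": "snack", "duration": 30}

# Illustrative weights only: amount counts most; sex is treated as irrelevant.
w = {"sex": 0.0, "weight": 0.5, "amount": 1.0, "meal": 0.5, "duration": 0.5}

print(weighted_sum(x1, q, w))      # 0*1 + 0.5*1 + 1*1 + 0.5*1 + 0.5*1 = 2.5
print(weighted_average(x1, q, w))  # 2.5 / 2.5 = 1.0
```

Because every local distance here is 0 or 1, the weighted average always lies between zero and one, which makes distances comparable even when weights change.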

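Putting the pieces together, the two-step kNN procedure of Section 1 (store the dataset, find the k closest instances, let them vote) can be sketched as follows. This is a minimal sketch under stated assumptions: the stored dataset is just the two worked instances, Hamming distance is used, and k = 1; the function name knn_classify is my own:

```python
from collections import Counter

def overlap(a, b):
    return 0 if a == b else 1

# Hamming distance: the number of attributes on which the instances disagree.
def dist(x, q):
    return sum(overlap(x[a], q[a]) for a in q)

# kNN classification: find the k stored instances closest to q, then let
# them vote; ties are broken arbitrarily (here, by Counter's ordering).
def knn_classify(dataset, q, k):
    neighbours = sorted(dataset, key=lambda ex: dist(ex[0], q))[:k]
    votes = Counter(cls for _, cls in neighbours)
    return votes.most_common(1)[0][0]

# The "learning" step is just storing (instance, class) pairs.
dataset = [
    ({"sex": "female", "weight": 60, "amount": 4, "meal": "full",  "duration": 90}, "over"),
    ({"sex": "male",   "weight": 75, "amount": 2, "meal": "full",  "duration": 60}, "under"),
]

q = {"sex": "male", "weight": 70, "amount": 1, "meal": "snack", "duration": 30}

print(knn_classify(dataset, q, k=1))  # x2 is nearer (distance 4 vs 5) -> "under"
```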