Summary Machine Learning - PYTHON PART (complete walkthrough)

Uploaded on: 03-06-2020
Written in: 2019/2020

Complete guide through all notebooks. Each type of exercise clearly explained.

Python for Machine Learning
Huge potential helper: https://ml-cheatsheet.readthedocs.io/en/latest/loss_functions.html


Notebook 1: Evaluation
MAE & MSE
We want to calculate the MAE (mean absolute error) and the MSE (mean squared error) to evaluate the model.

MAE: to calculate the mean absolute error, we need three steps:
1. Transform the predicted values and the actual values to arrays
2. Calculate the difference between the two and take its absolute value
3. Take the mean of the absolute error
In code that looks something like this:
import numpy as np

def MAE(pred, actual):
    abs_error = np.abs(np.array(actual) - np.array(pred))
    mae = np.mean(abs_error)
    return mae

MSE: we do exactly the same, except that we now square the error instead of taking its absolute value.
def MSE(pred, actual):
    sq_error = (np.array(actual) - np.array(pred)) ** 2
    mse = np.mean(sq_error)
    return mse
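
As a quick sanity check, the sketch below computes both metrics on a small made-up example. The numbers and the lowercase function names are illustrative assumptions, not taken from the notebook.

```python
import numpy as np

def mae(pred, actual):
    """Mean absolute error: average of |actual - pred|."""
    return np.mean(np.abs(np.array(actual) - np.array(pred)))

def mse(pred, actual):
    """Mean squared error: average of (actual - pred)^2."""
    return np.mean((np.array(actual) - np.array(pred)) ** 2)

# Made-up values, for illustration only
actual = [3.0, 5.0, 2.0, 7.0]
pred = [2.0, 5.0, 4.0, 8.0]

print(mae(pred, actual))  # absolute errors 1, 0, 2, 1 -> mean 1.0
print(mse(pred, actual))  # squared errors 1, 0, 4, 1 -> mean 1.5
```

Note that the MSE punishes the single error of 2 more heavily than the MAE does, because the error is squared before averaging.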



Binary classification
In this exercise we need to calculate the accuracy of a spam filter, which classifies each message as
spam or non-spam. To calculate the accuracy, we count how often the filter was correct.
The trick is to check when the prediction is equal to the actual value.
We can take two routes:
A) For loop:
The steps we need to undertake are:
1. Make a range of the length of the dataset
2. Iterate over each element in the dataset and check if y_pred == y_true. Count += 1 if True
3. Divide the count by the total number of predictions.
def accuracy(y_true, y_pred):
    count = 0
    for i in range(len(y_true)):
        if y_true[i] == y_pred[i]:
            count += 1
    return count / len(y_true)
B) NumPy
Steps:
1. We transform both the predictions and the actual values into arrays
2. We compare y_pred and y_true element-wise. The output is an array that
contains True, True, True, False, True, False, False, etc.
3. Because booleans count as 1 (True) and 0 (False), we can use np.mean() to
get the fraction of cases where y_pred == y_true → this fraction is equal to the
accuracy.

def accuracy_np(y_true, y_pred):
    acc2 = np.mean(np.array(y_true) == np.array(y_pred))
    return acc2
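
To see why the mean of a boolean array equals the accuracy, here is a small sketch with made-up labels (1 = spam, 0 = non-spam; the data is an assumption for illustration).

```python
import numpy as np

# Made-up labels: 1 = spam, 0 = non-spam
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]

# Element-wise comparison gives one boolean per prediction
matches = np.array(y_true) == np.array(y_pred)

print(matches)         # [ True  True False  True False]
print(matches.sum())   # 3 correct predictions
print(matches.mean())  # 3 correct out of 5 -> 0.6
```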

Building a confusion Matrix
If we want to calculate the recall and precision, we need a confusion matrix. We start off
by making an empty matrix and then fill it with counts. This works for both binary and
multi-class classification, as long as the classes are encoded as the integers 0 to N-1.
1. First, we check how many unique classes the list has. The function np.unique
returns all unique values in an array; len() then gives their number as an int.
2. We make an empty matrix using the np.zeros() function, with N x N as its shape
3. We use a for loop to iterate over two zipped lists: y_true & y_pred.
4. We use the values in each iteration step to index the position in the matrix
(row = actual class, column = predicted class), and we add 1 to that position.

def confusion_matrix(y_true, y_pred):
    N = len(np.unique(y_true))
    M = np.zeros((N, N))
    for i, j in zip(y_true, y_pred):
        M[i, j] += 1
    return M

def precision(M):
    TP = M[1, 1]  # actual 1, predicted 1
    FP = M[0, 1]  # actual 0, predicted 1
    return TP / (TP + FP)
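
The text mentions recall as well, but only precision is shown. A minimal sketch of recall, assuming the same layout as above (rows = actual class, columns = predicted class, class 1 = positive); the recall function and the example labels are illustrative additions, not taken from the notebook.

```python
import numpy as np

def confusion_matrix(y_true, y_pred):
    """Count (actual, predicted) pairs; classes must be integers 0..N-1."""
    n = len(np.unique(y_true))
    m = np.zeros((n, n), dtype=int)
    for i, j in zip(y_true, y_pred):
        m[i, j] += 1
    return m

def recall(m):
    """Fraction of actual positives that were predicted positive."""
    tp = m[1, 1]  # actual 1, predicted 1
    fn = m[1, 0]  # actual 1, predicted 0
    return tp / (tp + fn)

# Made-up labels, for illustration only
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]

m = confusion_matrix(y_true, y_pred)
print(m)          # rows: actual 0, actual 1
print(recall(m))  # 2 true positives out of 3 actual positives
```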



You can also use set() operations. Check notebook 1 exercise 7 for this.
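
The notebook exercise itself is not reproduced here. The sketch below is only one possible way set operations could give the same counts, assuming binary labels where 1 is the positive class; the variable names and data are hypothetical.

```python
# Made-up labels, for illustration only (1 = positive class)
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]

# Sets of example indices that are actually / predicted positive
actual_pos = {i for i, y in enumerate(y_true) if y == 1}
predicted_pos = {i for i, y in enumerate(y_pred) if y == 1}

tp = actual_pos & predicted_pos  # in both sets: true positives
fp = predicted_pos - actual_pos  # predicted but not actual: false positives
fn = actual_pos - predicted_pos  # actual but not predicted: false negatives

precision = len(tp) / len(predicted_pos)
recall = len(tp) / len(actual_pos)
print(sorted(tp), sorted(fp), sorted(fn))
print(precision, recall)
```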

Notebook 2: Decision Trees
Decision trees have a recursive structure: if condition A holds, then move on to the following
check. The example below shows how recursive functions work in Python. Essentially, you
call the function within the function itself, but with a different input than in the previous call.
This is an example of a recursive function calculating the factorial:

def factorial(n):
    if n == 0:
        print("This I know! (the base case)")
        return 1
    else:
        print("I don't know the factorial for", n, "let's try", n-1)
        return n * factorial(n-1)

factorial(5)

In the if-statement you define the base case. This is essential because the function keeps on
calling itself until it reaches the base case. Under the hood, Python keeps every unfinished
call, together with its own value of n, on the call stack. When it reaches the base case, it
works its way back up the stack and multiplies the returned value by each stored n.
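
To make the order of calls and returns visible, here is a small sketch that prints each call indented by its recursion depth. The function name factorial_trace and the depth parameter are illustrative additions.

```python
def factorial_trace(n, depth=0):
    """Factorial that prints when each call starts and when it returns."""
    indent = "  " * depth
    print(f"{indent}factorial({n}) called")
    if n == 0:
        result = 1  # base case: no further recursion
    else:
        # This call waits here until the deeper call has returned
        result = n * factorial_trace(n - 1, depth + 1)
    print(f"{indent}factorial({n}) returns {result}")
    return result

factorial_trace(3)
```

All "called" lines are printed first, from the outermost call to the base case; the "returns" lines then appear in reverse order, which is the call stack unwinding.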

Example 2:

def rec_sum(a):
    if len(a) == 1:
        return a[0]
    else:
        return a[0] + rec_sum(a[1:])

rec_sum([1,2,3,4,5,6])

Example 3: We need to count the number of brackets in this nested list:
nested = [[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[13]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]
To do so we use a function that checks whether its argument is a list or not. It keeps on
doing this until the argument is an integer:

def search(a, depth=0):
    if isinstance(a, list):
        return search(a[0], depth + 1)
    else:
        return depth

Here a[0] returns the 0th element of the list, which removes one pair of brackets. If you do
this recursively, it evaluates a[0][0], a[0][0][0] and so on, until the content is not a list
anymore, because 13 is an int. Meanwhile, the depth is increased by 1 for each recursion.
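
A short usage sketch on smaller inputs, where the depth is easy to check by hand. The helper nest is a hypothetical addition that wraps a value in a given number of lists.

```python
def search(a, depth=0):
    """Count how many lists are nested around the innermost value."""
    if isinstance(a, list):
        return search(a[0], depth + 1)
    return depth

def nest(value, n):
    """Hypothetical helper: wrap value in n pairs of brackets."""
    for _ in range(n):
        value = [value]
    return value

print(search(13))            # not a list -> 0
print(search([13]))          # one pair of brackets -> 1
print(search([[[13]]]))      # three pairs -> 3
print(search(nest(13, 10)))  # ten pairs -> 10
```

Note that this only follows the first element of each list, and an empty list would raise an IndexError on a[0].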

Recursion in decision trees
Recursive functions are very useful when dealing with tree structures, which are recursive
structures themselves. We do not know how deep the tree is. All we can see is if the node we are
currently looking at has any children, and if it does we can try to visit those, and repeat this.
Decision trees are usually full binary trees which means that every node has either 0 or 2
children. If it has 0 then it is a leaf node.

We start off by creating a function that builds a node:

def Node(left=None, right=None, feature=None, value=None, predict=None):
    "Return a node in a binary decision tree"
    return dict(left=left, right=right, feature=feature, value=value, predict=predict)

def isLeaf(node):
    """Helper function to check if the current node is a leaf"""
    return node['left'] is None and node['right'] is None

Now that we have specified the function for an empty decision tree, we can start giving it
content:

# We want to first ask about value Round in column at index 2.
root = Node(feature=2, value="Round",

            # If false, in the left branch, which is a leaf node, we'll predict Banana
            left=Node(predict="Banana"),

            # If true, in the right branch we'll ask about the color Red
            right=Node(feature=1, value="Red",

                       # Based on the answer to question about color Red,
                       # we'll predict either Lime
                       left=Node(predict="Lime"),

                       # or Apple
                       right=Node(predict="Apple")))
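
With the tree in place, a recursive function can walk from the root to a leaf. The sketch below is one possible way to do that, not necessarily the notebook's version: the function name predict and the example rows are assumptions, and it follows the comments above (test false → left branch, test true → right branch).

```python
def Node(left=None, right=None, feature=None, value=None, predict=None):
    "Return a node in a binary decision tree"
    return dict(left=left, right=right, feature=feature, value=value, predict=predict)

def isLeaf(node):
    """Helper function to check if the current node is a leaf"""
    return node['left'] is None and node['right'] is None

def predict(node, row):
    """Follow the tree recursively until a leaf is reached."""
    if isLeaf(node):
        return node['predict']  # base case: a leaf holds the prediction
    if row[node['feature']] == node['value']:
        return predict(node['right'], row)  # test is true -> right branch
    return predict(node['left'], row)       # test is false -> left branch

root = Node(feature=2, value="Round",
            left=Node(predict="Banana"),
            right=Node(feature=1, value="Red",
                       left=Node(predict="Lime"),
                       right=Node(predict="Apple")))

# Hypothetical rows: index 1 = color, index 2 = shape (index 0 is unused here)
print(predict(root, ["a", "Red", "Round"]))      # Apple
print(predict(root, ["b", "Green", "Round"]))    # Lime
print(predict(root, ["c", "Yellow", "Curved"]))  # Banana
```

The function does not need to know how deep the tree is: each call only checks whether the current node is a leaf, and otherwise hands the row to one of the two children.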

jeroenverboom Tilburg University