Summary Data Mining classification (1+2) + solutions exercises

Pages: 10
Uploaded on: 04-08-2023
Written in: 2022/2023

This document summarizes the theory covered during the lab sessions. Solutions to the lab exercises are included at the end of the document.

Classification 1
Lag1, Lag2, …, Lag5: percentage return for each of the five previous trading days

Volume: number of shares traded on the previous day

Today: percentage return on the date in question

Direction: whether the market was Up or Down on this date

cor(): produces a matrix containing all pairwise correlations among the predictors




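Calling cor() on the full data set gives an error because the Direction variable is qualitative; exclude that column first.

Correlations between the lag variables and today's return are close to zero: the previous days' returns say little about today's return.

Year and Volume: substantial correlation (the only substantial one)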
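The code screenshot that belonged here did not survive. As a minimal sketch of the correlation step, assuming the data are the `Smarket` data frame from the ISLR2 package (an assumption: the notes do not name the data set) and using the illustrative name `cor_matrix`:

```r
# Assumption: the ISLR2 package, which ships the Smarket data, is installed.
library(ISLR2)

# cor(Smarket) stops with an error because Direction is a factor.
# Drop the qualitative column and correlate the numeric columns only.
cor_matrix <- cor(Smarket[, names(Smarket) != "Direction"])
round(cor_matrix, 3)

# Correlations between the lag variables and today's return (close to zero).
cor_matrix[paste0("Lag", 1:5), "Today"]

# Correlation between Year and Volume (the substantial one).
cor_matrix["Year", "Volume"]
```

Selecting columns by name rather than by position keeps the sketch valid even if the column order differs from what is assumed here.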

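glm(): fits generalized linear models, a class of models that includes logistic regression. The syntax is similar to lm(), except that family = binomial must be passed to tell R to run a logistic regression.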

Lag1

• smallest p-value of all predictors
• negative coefficient: if the market had a positive return yesterday, it is less likely to go up today
• but a p-value of 0.15 is still relatively large: no clear evidence of an association between Lag1 and Direction

coef(): accesses just the coefficients of the fitted model

summary(): gives access to specific aspects of the fitted model, such as the p-values of the coefficients
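The fitting step described above could look roughly like this. It is a sketch, assuming the `Smarket` data from the ISLR2 package and a model with the five lags plus Volume as predictors; the object names `glm_fit` and `p_values` are illustrative, not taken from the notes.

```r
# Assumption: the ISLR2 package, which ships the Smarket data, is installed.
library(ISLR2)

# Logistic regression: family = binomial is what distinguishes this from lm().
glm_fit <- glm(
  Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Lag5 + Volume,
  data = Smarket,
  family = binomial
)

# Full summary table: estimates, standard errors, z-values and p-values.
summary(glm_fit)

# Only the estimated coefficients.
coef(glm_fit)

# Only the p-values, taken from the coefficient matrix of the summary.
p_values <- summary(glm_fit)$coefficients[, "Pr(>|z|)"]
p_values
```

Indexing the summary's coefficient matrix by column name (`"Pr(>|z|)"`) is a little more robust than using the column number.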




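predict(): predicts the probability that the market will go up, given values of the predictors. If no data set is supplied, the probabilities are computed for the training data used to fit the model.

type = "response": tells R to output probabilities of the form P(Y=1|X) rather than, for example, the logit

contrasts(): shows the dummy variable R has created for Direction (1 for Up), which confirms that the probabilities refer to the market going up rather than down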
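A hedged sketch of the prediction step, again assuming the `Smarket` data from the ISLR2 package and the same six-predictor model; `glm_fit` and `glm_probs` are illustrative names.

```r
# Assumption: the ISLR2 package, which ships the Smarket data, is installed.
library(ISLR2)

glm_fit <- glm(
  Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Lag5 + Volume,
  data = Smarket,
  family = binomial
)

# No newdata argument: probabilities are computed for the training data.
# type = "response" returns P(Y = 1 | X) instead of the logit.
glm_probs <- predict(glm_fit, type = "response")
glm_probs[1:10]

# Shows the dummy coding: the row for Up has value 1, so Y = 1 means "Up".
contrasts(Smarket$Direction)
```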




Vector of class predictions based on whether the predicted probability of a market increase is greater than or less than 0.5:




First command: creates a vector of 1,250 Down elements

Second command: changes to Up all of the elements for which the predicted probability of a market increase exceeds 0.5
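The two commands referred to above are not reproduced in this preview. A minimal sketch of what they could look like, assuming the `Smarket` data from the ISLR2 package and illustrative object names (`glm_fit`, `glm_probs`, `glm_pred`):

```r
# Assumption: the ISLR2 package, which ships the Smarket data, is installed.
library(ISLR2)

glm_fit <- glm(
  Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Lag5 + Volume,
  data = Smarket,
  family = binomial
)
glm_probs <- predict(glm_fit, type = "response")

# First command: one "Down" per observation (1,250 in total).
glm_pred <- rep("Down", nrow(Smarket))

# Second command: overwrite with "Up" wherever the predicted probability
# of a market increase exceeds 0.5.
glm_pred[glm_probs > 0.5] <- "Up"

# Number of predictions per class.
table(glm_pred)
```

Using `nrow(Smarket)` instead of a hard-coded 1250 keeps the vector length tied to the data.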

table(): produces a confusion matrix that shows how many observations were correctly or incorrectly classified



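Diagonal elements: correct predictions

Off-diagonal elements: incorrect predictions

Training error rate: 100% - 52.2% = 47.8%, where 52.2% is the share of correct predictions on the training data. Because the model is evaluated on the same data it was fitted on, this error rate tends to be too optimistic.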
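The confusion matrix and the error rate can be reproduced along these lines. This is a sketch under the same assumptions as before: `Smarket` from the ISLR2 package, the six-predictor logistic model, and illustrative names such as `conf_mat`.

```r
# Assumption: the ISLR2 package, which ships the Smarket data, is installed.
library(ISLR2)

glm_fit <- glm(
  Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Lag5 + Volume,
  data = Smarket,
  family = binomial
)
glm_probs <- predict(glm_fit, type = "response")
glm_pred <- rep("Down", nrow(Smarket))
glm_pred[glm_probs > 0.5] <- "Up"

# Confusion matrix: rows are predictions, columns are the true directions.
conf_mat <- table(Predicted = glm_pred, Actual = Smarket$Direction)
conf_mat

# Share of correct predictions = sum of the diagonal / number of observations.
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)

# Equivalent shortcut: fraction of predictions that match the true direction.
mean(glm_pred == Smarket$Direction)

# Training error rate is the complement of the accuracy.
training_error <- 1 - accuracy
round(100 * c(accuracy = accuracy, training_error = training_error), 1)
```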

Seller: Worstje2021 (Universiteit Gent)