Exam (worked answers)

A+ GRADE-INTRODUCTION TO DATA SCIENCE QUESTIONS AND ANSWERS FOR EXAM PREP. 2024/2025 UPDATE.

Rating: -
Sold: -
Pages: 17
Grade: A+
Uploaded on: 02-09-2024
Written in: 2024/2025


COMMONLY ASKED QUESTIONS FOR DATA SCIENCE.
ANSWERS RESEARCHED AND PROVIDED. 2024/2025
UPDATE

1. What is type 1 error and type 2 error? Type 1 error: falsely concluding
that an intervention was successful. Known as a false positive result.
Type 2 error: falsely concluding that an intervention was not successful.
Known as a false negative.
2. What can we do about overfitting? > Regularization (penalizing
model complexity while we're training)
> L2 regularization penalizes really big weights: complexity(model) =
sum of squares of weights
> Instead of minimizing only loss, regularization minimizes
loss + complexity, which is called structural risk
minimization
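The idea of minimizing loss + complexity can be sketched in a few lines. This is only an illustration under stated assumptions: the linear model, the mean squared error loss, and the weighting factor `lam` are not from the notes above and were chosen to keep the example small.

```python
def l2_complexity(weights):
    """L2 complexity term: the sum of the squared weights."""
    return sum(w * w for w in weights)


def mean_squared_error(weights, rows, targets):
    """Average squared error of a linear model (assumed loss for this sketch)."""
    total = 0.0
    for features, target in zip(rows, targets):
        prediction = sum(w * x for w, x in zip(weights, features))
        total += (prediction - target) ** 2
    return total / len(rows)


def regularized_loss(weights, rows, targets, lam=0.1):
    """Loss + lam * complexity; lam is a hypothetical tuning knob."""
    return mean_squared_error(weights, rows, targets) + lam * l2_complexity(weights)


rows = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]]
targets = [3.0, 3.0, 6.0]
small_weights = [1.0, 1.0]    # fits the data exactly, small weights
large_weights = [10.0, -8.0]  # large weights are penalized heavily

print(regularized_loss(small_weights, rows, targets))
print(regularized_loss(large_weights, rows, targets))
```

A larger `lam` pushes the optimizer harder toward small weights, at the cost of a worse fit on the training data.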
3. Describe true positive, false positive, false negative, true negative:
> True positive - we correctly called wolf; the town is saved.
> False positive - we called wolf falsely, the town is mad
> False negative - There was a wolf but we didn't spot it. Chickens
are eaten.
> True negative - no wolf, no alarm. All is well.
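The four outcomes can be counted directly from paired observations. The following is a minimal sketch; the function name `confusion_counts` and the sample data are made up for illustration.

```python
def confusion_counts(actual, predicted):
    """Count TP, FP, FN, TN from two equal-length lists of booleans."""
    tp = fp = fn = tn = 0
    for is_wolf, called_wolf in zip(actual, predicted):
        if called_wolf and is_wolf:
            tp += 1  # wolf present, alarm raised
        elif called_wolf and not is_wolf:
            fp += 1  # no wolf, alarm raised
        elif not called_wolf and is_wolf:
            fn += 1  # wolf present, no alarm
        else:
            tn += 1  # no wolf, no alarm
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


wolf_present = [True, False, True, False, False, True]
cried_wolf = [True, True, False, False, False, True]
counts = confusion_counts(wolf_present, cried_wolf)
print(counts)  # {'TP': 2, 'FP': 1, 'FN': 1, 'TN': 2}
```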
4. What is precision? True Positive / (True Positive + False Positive)
When you classify something as positive, how often are you right?
5. What is recall? True positive / (True positive + False Negative)
Of all the things that are actually positive, how many did you
correctly classify as positive?
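Both ratios follow directly from the counts. A small sketch, where the example counts are invented and returning 0.0 for an undefined ratio is an assumption of this example rather than a universal convention:

```python
def precision(tp, fp):
    """Of everything classified as positive, the fraction that really is positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Of everything that really is positive, the fraction that was found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical counts: 8 true positives, 2 false positives, 4 false negatives.
tp, fp, fn = 8, 2, 4
print(precision(tp, fp))  # 0.8
print(recall(tp, fn))
```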
6. What is an ROC curve? A graph showing the performance of a
classification model at all classification thresholds. The curve plots
two parameters: true positive rate (recall) and false positive rate.
The false positive rate equals 1 - specificity, where specificity (the
true negative rate) is true negative / (true negative + false
positive). Both axes run from 0 to 1

i.e. TPR on the y axis, and FPR on the x axis
7. What is false positive rate? False positive / (false positive + true
negative)
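To see how the thresholds trace out the curve, the sketch below computes one (FPR, TPR) point per threshold. It assumes labels are 0/1, that both classes are present, and that a score at or above the threshold counts as a positive prediction. The data is invented.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs, one per distinct score used as a threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
points = roc_points(labels, scores)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Lowering the threshold can only add predicted positives, so both rates move from 0 toward 1 as the loop proceeds.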


8. What is the bias? An error from erroneous assumptions in the
learning algorithm. High bias can cause an algorithm to miss the relevant
relations between features and target outputs (underfitting).


In a sampling sense, bias is the effect on the model when the sample
systematically misrepresents the 'real' data. Most datasets are a
convenience sample: the data that was easiest to collect.
9. What is variance? An error from sensitivity to small fluctuations in
the training set. High variance can cause an algorithm to model the
random noise in the training data, rather than the intended outputs
(overfitting).
In a sampling sense, variance is the effect on the model because it was
built from this sample rather than that sample.
Variance measures how inconsistent the predictions are with one
another.
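One way to make "this sample rather than that sample" concrete is to retrain a model on many fresh samples and measure how much its prediction at one fixed point moves. The sketch below is a toy setup, not a standard procedure: the data-generating rule (y = x + noise), the two models, and all parameter values are assumptions chosen for illustration.

```python
import random
import statistics


def draw_sample(rng, n=20):
    """Draw n points from an assumed process: y = x + Gaussian noise."""
    sample = []
    for _ in range(n):
        x = rng.random()
        sample.append((x, x + rng.gauss(0.0, 1.0)))
    return sample


def mean_model(sample, x0):
    """Rigid model: ignores x0 and always predicts the average target."""
    return statistics.mean([y for _, y in sample])


def nearest_neighbor_model(sample, x0):
    """Flexible model: copies the target of the single closest training point."""
    nearest = min(sample, key=lambda point: abs(point[0] - x0))
    return nearest[1]


def prediction_variance(model, x0=0.5, repeats=200, seed=0):
    """Variance of the prediction at x0 across independently drawn training sets."""
    rng = random.Random(seed)
    predictions = [model(draw_sample(rng), x0) for _ in range(repeats)]
    return statistics.pvariance(predictions)


mean_var = prediction_variance(mean_model)
knn_var = prediction_variance(nearest_neighbor_model)
print(mean_var, knn_var)
```

The nearest-neighbor model follows the noise of whichever point happens to be closest, so its predictions vary far more between samples than those of the averaging model.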
10. What is skewness? Asymmetry in a statistical distribution, in which
the curve appears distorted or skewed either to the left or to the right.
Skewness can be quantified to define the extent to which a distribution
differs from a normal distribution.
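When the longer tail is on the left, this is called negative skewness
(the tail goes towards negative values).
11. What is kurtosis? The sharpness of the peak of a frequency-distribution
curve. More precisely, it reflects how heavy the tails are compared with
a normal distribution.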
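Both quantities can be computed from central moments. The sketch below uses the simple population (biased) moment formulas, which is an assumption of this example; statistical packages often apply small-sample corrections. Excess kurtosis subtracts 3 so that a normal distribution scores 0.

```python
def central_moment(values, k):
    """k-th central moment: the average of (value - mean) ** k."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** k for v in values) / len(values)


def skewness(values):
    """Third central moment scaled by the standard deviation cubed."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Fourth central moment scaled by the variance squared, minus 3."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


symmetric = [1, 2, 3, 4, 5]
left_tailed = [1, 8, 9, 9, 10, 10, 10]    # long tail towards small values
right_tailed = [1, 1, 1, 2, 2, 3, 10]     # long tail towards large values

print(skewness(symmetric))
print(skewness(left_tailed))   # negative
print(skewness(right_tailed))  # positive
print(excess_kurtosis(symmetric))
```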

12. What are the different ways to handle missing values?
1. Delete the entire row/column
2. Replace by a fixed value (e.g. "unknown")
3. General statistic replacement (replace values by a statistic
associated with a particular column, like mean or median)
4. Grouped statistic replacement (replace values by a statistic
associated with a particular group)
5. Imputation - predict values based on nearest neighbors or
likelihood
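The first four strategies can be sketched on a tiny table using only the standard library. The column names, the data, and the helper names are invented for illustration, and missing values are assumed to be stored as `None`. Model-based imputation (option 5) is not shown.

```python
from statistics import mean

rows = [
    {"city": "A", "age": 30},
    {"city": "A", "age": None},
    {"city": "B", "age": 50},
    {"city": "B", "age": 60},
    {"city": "B", "age": None},
]


def drop_missing(rows, column):
    """Option 1: delete every row where the column is missing."""
    return [r for r in rows if r[column] is not None]


def fill_fixed(rows, column, value):
    """Option 2: replace missing entries with a fixed value."""
    return [{**r, column: value if r[column] is None else r[column]} for r in rows]


def fill_column_mean(rows, column):
    """Option 3: replace missing entries with the mean of the whole column."""
    observed = [r[column] for r in rows if r[column] is not None]
    return fill_fixed(rows, column, mean(observed))


def fill_group_mean(rows, column, group_by):
    """Option 4: replace missing entries with the mean of the row's own group."""
    groups = {}
    for r in rows:
        if r[column] is not None:
            groups.setdefault(r[group_by], []).append(r[column])
    return [
        {**r, column: mean(groups[r[group_by]]) if r[column] is None else r[column]}
        for r in rows
    ]


print(drop_missing(rows, "age"))
print(fill_fixed(rows, "age", "unknown"))
print(fill_column_mean(rows, "age"))
print(fill_group_mean(rows, "age", "city"))
```

Note that `fill_group_mean` as written raises a `KeyError` if a group has no observed values at all; a fuller version would need a fallback for that case.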
13. What kind of feature transformation can you perform on
numeric features?
1. Rounding: round a numeric value to the nearest decimal place, or make
it discrete so it can be turned into a categorical later
2. Discretization: binning a variable so it becomes categorical for
better value management
3. Scaling (change the scale of the variable for better
understanding), e.g. min-max, z-score, etc.
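Rounding and the two scaling methods named above can be sketched as follows. The example assumes the values are not all identical (otherwise both scalers would divide by zero) and uses the population standard deviation for the z-score, which is a choice made for this sketch.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def z_score_scale(values):
    """Center on the mean and divide by the (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


heights = [150.0, 160.0, 170.0, 180.0, 190.0]
scaled = min_max_scale(heights)
standardized = z_score_scale(heights)

print(round(172.4567, 1))  # rounding to one decimal place
print(scaled)              # [0.0, 0.25, 0.5, 0.75, 1.0]
print(standardized)
```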
14. What are some types of discretization methods? 1. Equal-width
binning (bins have equal ranges, roughly the same distribution as the
original variable)
2. Equal-density (frequency) binning - bins have an equal number of
examples/records/rows, giving a uniform distribution
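The difference between the two methods shows up clearly on data with an outlier. A minimal sketch with invented data: the function names are hypothetical, bins are numbered from 0, and tied values may end up in different equal-frequency bins in this simple version.

```python
def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    # The maximum would fall just past the last interval, so clamp it into it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def equal_frequency_bins(values, n_bins):
    """Assign values to n_bins bins holding (roughly) the same number of rows."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


incomes = [10, 12, 15, 18, 20, 22, 25, 100]
width_bins = equal_width_bins(incomes, 4)
frequency_bins = equal_frequency_bins(incomes, 4)
print(width_bins)      # [0, 0, 0, 0, 0, 0, 0, 3]
print(frequency_bins)  # [0, 0, 1, 1, 2, 2, 3, 3]
```

With equal-width bins the single outlier leaves the middle bins empty, mirroring the original skewed distribution, while equal-frequency bins spread the rows evenly.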
15. What are the 5 categories of feature generation? 1. Indicator
features (Attributes that isolate key information)

Seller: TopRevision, University Of California - Los Angeles (UCLA)