What kind of variable is a response variable and why? - CORRECT ANSWER-random,
because it varies with changes in the predictor/s along with other random changes.
What kind of variable is a predicting variable and why? - CORRECT ANSWER-fixed,
because it does not change with the response but it is fixed before the response is
measured.
linear relationship - CORRECT ANSWER-a simple deterministic relationship between 2
factors, x and y
what are three things that a regression analysis is used for? - CORRECT ANSWER-1.
Prediction of the response variable, 2. Modeling the relationship between the response
and explanatory variables, 3. Testing hypotheses of association relationships
B0 = ? - CORRECT ANSWER-intercept
B1 = ? - CORRECT ANSWER-slope
for our linear model where: Y = B0 + B1*X + EPSILON (E), what does the epsilon
represent? - CORRECT ANSWER-deviance of the data from the linear model (error
term)
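As a rough illustration of the model on the card above, the sketch below draws
responses from Y = B0 + B1*X + E with normally distributed errors. The function name
simulate_slr, its parameters, and the example values are hypothetical choices made
for this illustration, not part of the course material.

```python
import random

def simulate_slr(x_values, beta0, beta1, sigma, seed=0):
    """Draw y_i = beta0 + beta1 * x_i + eps_i with eps_i ~ Normal(0, sigma^2)."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same draws
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in x_values]

# With sigma = 0 the error term vanishes and the relationship is deterministic.
exact = simulate_slr([1, 2, 3], beta0=1.0, beta1=2.0, sigma=0.0)

# With sigma > 0 each response deviates randomly from the line.
noisy = simulate_slr([1, 2, 3], beta0=1.0, beta1=2.0, sigma=0.5)
```

Here the x values are fixed inputs and only the responses are random, which matches
the cards on fixed predictors and random responses.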
what are the 4 assumptions of linear regression? - CORRECT ANSWER-Linearity/Mean
Zero, Constant Variance, Independence, Normality
Linearity/Mean zero assumption - CORRECT ANSWER-Means that the expected value
of the errors (deviances) is zero. A violation of this assumption leads to difficulties
in estimating B0 and means that our model does not include a necessary systematic
component
Constant variance assumption - CORRECT ANSWER-Means that it cannot be true that
the model is more accurate for some parts of the population and less accurate for other
parts of the population. A violation can result in less accurate parameter estimates
and poorly-calibrated prediction intervals.
Assumption of Independence - CORRECT ANSWER-Means that the deviances, or in
fact the response variables ys, are independently drawn from the data-generating
process. (A violation most often occurs in time series data.) A violation can result in
very misleading assessments of the strength of regression.
Normality assumption - CORRECT ANSWER-This is needed if we want to do any
confidence or prediction intervals, or hypothesis tests, which we usually do. If this
assumption is violated, hypothesis tests and confidence and prediction intervals can be
very misleading.
what are the 3 parameters we estimate in regression? - CORRECT ANSWER-B0, B1,
sigma squared (variance of the error terms)
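A minimal sketch of how the three estimates could be computed by least squares,
assuming the usual closed-form formulas for simple linear regression. The function
name fit_slr and the small data set are made up for illustration.

```python
def fit_slr(x, y):
    """Least-squares estimates (b0, b1) and the variance estimate sigma2_hat."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx              # estimated slope
    b0 = y_bar - b1 * x_bar       # estimated intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = sse / (n - 2)    # MSE: two DF are used up by b0 and b1
    return b0, b1, sigma2_hat

# Made-up illustrative data, not from the course material.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1, sigma2_hat = fit_slr(x, y)
```

For this data the slope estimate comes out close to 2 and the intercept close to 0.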
response (dependent) variables - CORRECT ANSWER-one particular variable that we
are interested in understanding or modeling (y)
predicting or explanatory (independent) variables - CORRECT ANSWER-a set of other
variables that might be useful in predicting or modeling the response variable (x1, x2)
What do we mean by model parameters in statistics? - CORRECT ANSWER-Model
parameters are unknown quantities, and they stay unknown regardless how much data
are observed. We estimate those parameters given the model assumptions and the
data, but through estimation, we're not identifying the true parameters. We're just
estimating approximations of those parameters.
What is the estimated sampling distribution of s^2? - CORRECT ANSWER-chi-square
with n-1 DF
Why do we lose 1 DF for s^2? - CORRECT ANSWER-we replace the unknown mean mu
with its estimate, the sample mean zbar
what is the relationship between s^2 and sigma^2? - CORRECT ANSWER-S^2
estimates sigma^2
What is the sampling distribution of the estimator of sigma^2? - CORRECT ANSWER-
chi-square with n-2 DF (the estimator is ~ equivalent to MSE)
Why do we lose 2 DF for sigma^2? - CORRECT ANSWER-we replaced two
parameters, B0 and B1, with their estimates
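The degrees-of-freedom cards above can be summarized in formulas. This is a sketch
under the normality assumption; the symbols z_i (observations from one population)
and the hats on the estimators are notation chosen here for illustration.

```latex
% One population: the mean \mu is replaced by \bar{z}, costing 1 DF.
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (z_i - \bar{z})^2,
\qquad \frac{(n-1)\,s^2}{\sigma^2} \sim \chi^2_{n-1}

% Simple linear regression: \beta_0 and \beta_1 are replaced by estimates, costing 2 DF.
\hat{\sigma}^2 = \mathrm{MSE}
  = \frac{1}{n-2}\sum_{i=1}^{n} \bigl(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\bigr)^2,
\qquad \frac{(n-2)\,\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{n-2}
```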
In SLR, we are interested in the behavior of which parameter? - CORRECT ANSWER-
B1
If we have a positive value for B1,.... - CORRECT ANSWER-then that's consistent with
a direct relationship between the predicting variable x and the response variable y.
If we have a negative value for B1,.... - CORRECT ANSWER-then that's consistent with
an inverse relationship between x and y
When B1 is close to zero... - CORRECT ANSWER-we interpret that there is not a
significant association between the predicting variable x and the response variable y.
How do we interpret B1? - CORRECT ANSWER-It is the estimated expected change in
the response variable associated with one unit of change in the predicting variable.
How do we interpret ^B0? - CORRECT ANSWER-It is the estimated expected value of the
response variable, when the predicting variable equals zero.
What is the sampling distribution of ^B1? - CORRECT ANSWER-t distribution with n-2
DF (once sigma^2 is replaced by its estimate, MSE)
What can we use to test for statistical significance? - CORRECT ANSWER-t-test
What would we do if the T value is large (in absolute value)? - CORRECT ANSWER-Reject
the null hypothesis that β1 is equal to zero. If the null hypothesis is rejected, we
interpret this to mean that β1 is statistically significant.
what does 'statistical significance' mean? - CORRECT ANSWER-B1 is statistically
different from zero.
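A minimal sketch of the t-test for the slope, assuming the standard error formula
se(^B1) = sqrt(MSE / Sxx). The function name slope_t_statistic and the data are
hypothetical. The critical value 3.182 is the two-sided 5% cutoff for 3 DF taken from
a t table; it is supplied by hand here rather than computed.

```python
import math

def slope_t_statistic(x, y):
    """Return (b1, se_b1, t) for testing the null hypothesis B1 = 0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)              # estimate of sigma^2
    se_b1 = math.sqrt(mse / s_xx)    # estimated standard error of the slope
    return b1, se_b1, b1 / se_b1

# Made-up illustrative data, not from the course material.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b1, se_b1, t_value = slope_t_statistic(x, y)

T_CRIT = 3.182                       # table value for n - 2 = 3 DF, alpha = 0.05
reject_null = abs(t_value) > T_CRIT  # True means B1 is statistically significant
```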
what is the distribution of ^B1 when sigma^2 is known? - CORRECT ANSWER-Normal
The estimators for the regression coefficients are:
A) Biased but with small variance
B) Unbiased under normality assumptions but biased otherwise.
C) Unbiased regardless of the distribution of the data. - CORRECT ANSWER-C
The assumption of normality:
A) It is needed for deriving the estimators of the regression coefficients.
B) It is not needed for linear regression modeling and inference.
C) It is needed for the sampling distribution of the estimators of the regression
coefficients and hence for inference.
D) It is needed for deriving the expectation and variance of the estimators of the
regression coefficients. - CORRECT ANSWER-C
What is 'X*'? - CORRECT ANSWER-a particular value of the predictor
Where does uncertainty from estimation come from? - CORRECT ANSWER-from the
estimation of the regression parameters alone
Where does uncertainty from prediction come from? - CORRECT ANSWER-from the
estimation of regression parameters and from the newness of the observation itself
what is the prediction interval used for? - CORRECT ANSWER-used to provide an
interval estimate for a prediction of y for one member of the population with a particular
value of x*
what is the confidence interval used for? - CORRECT ANSWER-to provide an interval
estimate for the true average value of y for all members of the population with a
particular value of x*
The estimated versus predicted regression line for a given x*:
A) Have the same variance
B) Have the same expectation
C) Have the same variance and expectation
D) None of the above - CORRECT ANSWER-B
The variability in the prediction comes from:
A) The variability due to a new measurement.
B) The variability due to estimation.
C) The variability due to a new measurement and due to estimation.
D) None of the above. - CORRECT ANSWER-C
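The cards above on confidence and prediction intervals can be sketched in code,
assuming the standard variance formulas for simple linear regression. The function
name slr_intervals and the data are made up; t_crit is a table value supplied by the
caller (3.182 corresponds to 95% intervals with 3 DF).

```python
import math

def slr_intervals(x, y, x_star, t_crit):
    """Return (y_hat, confidence_interval, prediction_interval) at x_star."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = sse / (n - 2)
    y_hat = b0 + b1 * x_star
    # Estimation uncertainty only (mean response at x_star).
    var_mean = sigma2_hat * (1 / n + (x_star - x_bar) ** 2 / s_xx)
    # Estimation uncertainty plus the variance of one new observation.
    var_pred = sigma2_hat + var_mean
    ci_half = t_crit * math.sqrt(var_mean)
    pi_half = t_crit * math.sqrt(var_pred)
    ci = (y_hat - ci_half, y_hat + ci_half)
    pi = (y_hat - pi_half, y_hat + pi_half)
    return y_hat, ci, pi

# Made-up illustrative data, not from the course material.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
y_hat, ci, pi = slr_intervals(x, y, x_star=3, t_crit=3.182)
```

Both intervals are centered on the same fitted value, but the prediction interval is
always wider because it adds the variability of a new measurement.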
residuals - CORRECT ANSWER-the differences between the observed responses and
the fitted responses
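A short sketch of computing residuals from a least-squares fit. The function name
slr_residuals and the data are hypothetical.

```python
def slr_residuals(x, y):
    """Residuals e_i = y_i - (b0 + b1 * x_i) from a least-squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Made-up illustrative data, not from the course material.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
residuals = slr_residuals(x, y)
```

When the model includes an intercept, least-squares residuals sum to zero up to
rounding, so diagnostic plots look at their pattern rather than their total.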
what does residual analysis NOT check for? (for SLR assumptions) - CORRECT
ANSWER-independence
what can we use to check for normality? - CORRECT ANSWER-QQ plot and histogram
what are two ways to transform data? - CORRECT ANSWER-power and log
transformation
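A minimal sketch of the two transformations named above. Treating lam == 0 as the
log transform is a convention borrowed from the Box-Cox family and is an assumption
here; the function name transform and the data are made up.

```python
import math

def transform(values, lam):
    """Power transform v**lam; lam == 0 is treated as the natural log."""
    if lam == 0:
        return [math.log(v) for v in values]
    return [v ** lam for v in values]

# Made-up response that grows exponentially in x.
x = [0, 1, 2, 3]
y = [math.exp(0.5 * xi) for xi in x]

log_y = transform(y, 0)      # linear in x after the log transform
sqrt_y = transform(y, 0.5)   # square-root (power 1/2) transform
```

Both transforms require positive values when lam is 0 or fractional.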
outliers - CORRECT ANSWER-Data points that are far from the majority of the data in
both x and y, or in just x
leverage points - CORRECT ANSWER-Data points that are far from the mean of the x's
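One common way to quantify "far from the mean of the x's" is the leverage value
h_i = 1/n + (x_i - xbar)^2 / Sxx. The sketch below assumes that formula for simple
linear regression; the function name leverages and the data are hypothetical.

```python
def leverages(x):
    """Leverage h_i = 1/n + (x_i - x_bar)^2 / S_xx for simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return [1 / n + (xi - x_bar) ** 2 / s_xx for xi in x]

# Made-up data: the last x value sits far from the others.
x = [1, 2, 3, 4, 20]
h = leverages(x)
most_extreme = h.index(max(h))   # position of the highest-leverage point
```

In simple linear regression the leverages sum to 2, the number of regression
coefficients, so a value near 1 marks a point that dominates the fit.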
influential points - CORRECT ANSWER-Data points that are far from the mean of both
the x's and the y's; they are called influential because they influence the fit of the
regression.
R^2 or coefficient of determination - CORRECT ANSWER-a statistic that efficiently
summarizes how well the x can be used to predict the response variable.
How do we interpret R^2? - CORRECT ANSWER-Proportion of total variability in Y that
can be explained by the regression (that uses X)
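A minimal sketch of computing R^2 as 1 - SSE/SST, where SSE is the residual sum of
squares and SST is the total variability of y around its mean. The function name
r_squared and the data are made up for illustration.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))  # unexplained
    sst = sum((yi - y_bar) ** 2 for yi in y)                        # total
    return 1 - sse / sst

# Made-up illustrative data, not from the course material.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r2 = r_squared(x, y)
```

A value near 1, as with this nearly linear data, means almost all of the variability
in y is explained by the regression on x.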