complete solution 2026
In logistic regression, the relationship between the probability of success and the predicting variables is
nonlinear. - correct answer ✔TRUE: The equation that links the predictors to the probability is:
$p(x_1, \ldots, x_p) = \dfrac{\exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p)}{1 + \exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p)}$
This relationship is not linear.
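A minimal Python sketch of this nonlinearity, assuming a single predictor and hypothetical coefficients (beta0 = -2, beta1 = 1, chosen only for illustration): equal one-unit steps in x produce unequal steps in the probability.

```python
import math

def success_probability(x, beta0=-2.0, beta1=1.0):
    """Logistic response p(x) = exp(eta) / (1 + exp(eta)), with eta = beta0 + beta1 * x.

    beta0 and beta1 are hypothetical values, not estimates from any data set.
    """
    eta = beta0 + beta1 * x
    return math.exp(eta) / (1.0 + math.exp(eta))

# Probabilities at x = 0, 1, 2, 3, 4
probs = [success_probability(x) for x in range(5)]

# Change in probability for each one-unit increase in x
steps = [later - earlier for earlier, later in zip(probs, probs[1:])]

print([round(p, 4) for p in probs])
print([round(s, 4) for s in steps])
```

If the relationship were linear, every entry of `steps` would be identical; here the steps are largest near p = 0.5 and shrink toward the tails.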
In logistic regression, the error terms are assumed to follow a normal distribution. - correct answer
✔FALSE: There are no error terms in logistic regression
The logit function is the log of the ratio of the probability of success to the probability of failure. It is also
known as the log odds function. - correct answer ✔TRUE: $g(p) = \ln\left(\frac{p}{1-p}\right)$.
The logit link function is also known as the log odds function.
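As a small illustration, the logit and its inverse can be sketched in Python as follows. The function names and the probability 0.8 are assumptions made for this example.

```python
import math

def logit(p):
    """Log odds: ln(p / (1 - p)), defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inverse_logit(eta):
    """Map a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

p = 0.8                    # hypothetical probability of success
odds = p / (1.0 - p)       # odds of success versus failure
log_odds = logit(p)        # the logit of p

print(odds, log_odds, inverse_logit(log_odds))
```

Applying `inverse_logit` to the log odds recovers the original probability, which is why the logit can serve as a link between the probability scale and the linear predictor.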
The number of parameters that need to be estimated in a logistic regression model with 6 predicting
variables and an intercept is the same as the number of parameters that need to be estimated in a
standard linear regression model with an intercept and same predicting variables. - correct answer
✔FALSE: As there is no error term in a logistic regression model, there is no additional parameter for the
variance of the error terms. As a result, the number of parameters that need to be estimated in a logistic
regression model with 6 predicting variables and an intercept is 7. The number of parameters that need
to be estimated in a standard linear regression model with an intercept and same predicting variables is
8.
The log-likelihood function is a linear function with a closed-form solution. - correct answer ✔FALSE:
The log-likelihood function is a non-linear function. A numerical algorithm is needed in order to
maximize it.
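One common numerical approach is Newton-Raphson. The sketch below is a minimal version for a single predictor plus intercept, written with the standard library only; the data set and the function name are hypothetical, and statistical software typically uses a more robust variant of the same idea.

```python
import math

def fit_logistic_newton(x, y, max_iter=25, tol=1e-10):
    """Maximize the logistic log-likelihood for the model logit(p) = b0 + b1 * x."""
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        # Score vector (gradient of the log-likelihood)
        u0 = sum(yi - pi for yi, pi in zip(y, p))
        u1 = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))
        # Information matrix (negative Hessian), entries [[i00, i01], [i01, i11]]
        w = [pi * (1.0 - pi) for pi in p]
        i00 = sum(w)
        i01 = sum(xi * wi for xi, wi in zip(x, w))
        i11 = sum(xi * xi * wi for xi, wi in zip(x, w))
        det = i00 * i11 - i01 * i01
        # Newton step: solve the 2x2 system (information) * delta = score
        d0 = (i11 * u0 - i01 * u1) / det
        d1 = (-i01 * u0 + i00 * u1) / det
        b0, b1 = b0 + d0, b1 + d1
        if max(abs(d0), abs(d1)) < tol:
            break
    return b0, b1

# Hypothetical binary data, not perfectly separable, so the maximum exists
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y = [0, 0, 1, 0, 1, 1, 0, 1]
b0_hat, b1_hat = fit_logistic_newton(x, y)
print(b0_hat, b1_hat)
```

The loop is needed because the score equations have no closed-form solution; each pass only moves the estimates closer to the maximizer.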
In logistic regression, the estimated value for a regression coefficient 𝛽𝑖 represents the estimated
expected change in the response variable associated with a one-unit increase in the corresponding
predicting variable, 𝑥𝑖, holding all else in the model fixed. - correct answer ✔FALSE: We interpret
logistic regression coefficients with respect to the odds of success: 𝛽𝑖 is the estimated change in the
log odds for a one-unit increase in 𝑥𝑖, so the odds are multiplied by exp(𝛽𝑖), holding all else in the
model fixed.
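A short Python check of the odds interpretation, assuming a one-predictor model with hypothetical coefficients (beta0 = -1, beta1 = 0.7): the ratio of the odds at x + 1 to the odds at x equals exp(beta1) regardless of x.

```python
import math

beta0, beta1 = -1.0, 0.7   # hypothetical coefficients, for illustration only

def odds(x):
    """Odds of success p / (1 - p) under the logistic model."""
    p = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))
    return p / (1.0 - p)

# Odds ratio for a one-unit increase in x, at two different starting points
ratio_at_2 = odds(3.0) / odds(2.0)
ratio_at_5 = odds(6.0) / odds(5.0)

print(ratio_at_2, ratio_at_5, math.exp(beta1))
```

The coefficient therefore describes a constant multiplicative change in the odds, not a constant additive change in the response or in the probability.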
Under logistic regression, the sampling distribution used for a coefficient estimator is a Chi-squared
distribution when the sample size is large. - correct answer ✔FALSE: The coefficient estimator follows
an approximate normal distribution.
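Because of this approximate normality, a coefficient is usually tested with a Wald z-statistic. The sketch below uses only the standard library; the estimate and standard error are hypothetical numbers, not output from a fitted model.

```python
import math

def wald_z_test(estimate, std_error):
    """Two-sided z-test of H0: coefficient = 0, using the normal approximation."""
    z = estimate / std_error
    # 2 * (1 - Phi(|z|)) written with the complementary error function
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Hypothetical estimate and standard error
z_stat, p_val = wald_z_test(estimate=0.7, std_error=0.25)
print(z_stat, p_val)
```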
When testing a subset of coefficients, deviance follows a chi-square distribution with 𝑞 degrees of
freedom, where 𝑞 is the number of regression coefficients in the reduced model. - correct answer
✔FALSE: When testing a subset of coefficients, the difference in deviance between the reduced and
full models approximately follows a chi-square distribution with 𝑞 degrees of freedom, where 𝑞 is the
number of regression coefficients discarded from the full model to get the reduced model.
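The subset test can be sketched in Python as follows. The two deviance values and q = 3 are hypothetical, and `chi2_sf` is a helper written for this example (a series expansion of the regularized incomplete gamma function), standing in for the chi-square tail function a statistics library would provide.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series: gamma(a, z) = z^a * e^(-z) * sum_n z^n / (a * (a+1) * ... * (a+n))
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= z / (a + n)
        total += term
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower

# Hypothetical deviances: the reduced model drops q = 3 coefficients from the full model
deviance_reduced = 120.4
deviance_full = 110.1
q = 3

test_statistic = deviance_reduced - deviance_full
p_value = chi2_sf(test_statistic, q)
print(test_statistic, p_value)
```

A small p-value suggests that at least one of the q discarded coefficients is nonzero, so the reduced model is rejected in favor of the full model.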
Logistic regression deals with the case where the dependent variable is binary, and the conditional
distribution $Y_i \mid X_{i,1}, \ldots, X_{i,p}$ is Binomial. - correct answer ✔TRUE: Logistic regression is the
generalization of the standard regression model that is used when the response variable y is binary or
binomial.
In logistic regression, if the p-value of the deviance test for goodness-of-fit is smaller than the
significance level 𝛼, then it is plausible that the model is a good fit. - correct answer ✔FALSE: For
logistic regression, if the p-value of the deviance test for goodness-of-fit is large, then it is an indication
that the model is a good fit.
If a logistic regression model provides accurate classification, then we can conclude that it is a good fit
for the data. - correct answer ✔FALSE: "Goodness of fit doesn't guarantee good prediction." And
conversely, good prediction doesn't guarantee that the model is a good fit.
To evaluate whether the model is a good fit, or equivalently whether the assumptions hold, we can
check whether the Pearson or deviance residuals are approximately normally distributed, using a
histogram and normality plots. If they are approximately normally distributed, we conclude that
the model is a good fit.
Another approach to evaluating goodness of fit is through hypothesis testing. In the goodness-of-fit test,
the null hypothesis is that the model fits well, and the alternative is that the model does not fit well. The
test statistic is the sum of squared deviance residuals. Under the null hypothesis of good fit, the test
statistic has an approximate chi-square distribution with n-p-1 degrees of freedom. It is important to
remember that if the p-value is small, we reject the null hypothesis of good fit and conclude that the
model is not a good fit.
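A minimal sketch of the test statistic for binomial data with replications, assuming one predictor (p = 1). The group counts and fitted probabilities below are hypothetical values standing in for the output of a fitted model.

```python
import math

def _term(observed, expected):
    """observed * ln(observed / expected), with the convention 0 * ln(0) = 0."""
    return 0.0 if observed == 0 else observed * math.log(observed / expected)

def deviance_residual(successes, trials, fitted_prob):
    """Signed square root of one group's contribution to the binomial deviance."""
    expected = trials * fitted_prob
    d2 = 2.0 * (_term(successes, expected)
                + _term(trials - successes, trials - expected))
    sign = 1.0 if successes >= expected else -1.0
    return sign * math.sqrt(max(d2, 0.0))

# Hypothetical grouped data: n = 5 groups of 20 trials each
trials = [20, 20, 20, 20, 20]
successes = [2, 5, 9, 14, 18]
fitted = [0.08, 0.27, 0.44, 0.72, 0.88]   # assumed fitted probabilities

residuals = [deviance_residual(y, m, p) for y, m, p in zip(successes, trials, fitted)]
deviance_statistic = sum(r * r for r in residuals)

num_predictors = 1
degrees_of_freedom = len(trials) - num_predictors - 1   # n - p - 1

print(deviance_statistic, degrees_of_freedom)
```

The p-value is the upper-tail probability of a chi-square distribution with `degrees_of_freedom` degrees of freedom evaluated at `deviance_statistic`.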
For both logistic and Poisson regression, the deviance residuals should approximately follow the
standard normal distribution if the model is a good fit for the data. - correct answer ✔TRUE: The
deviance residuals are approximately N(0,1) if the model is a good fit.
The logit link function is the best link function to model binary response data because it always fits the
data better than other link functions. - correct answer ✔FALSE: "The logit function is not the only
function that yields the s-shaped kind of curve. There are other s-shaped functions that are used in
modeling binary responses."
Although there are no error terms in a logistic regression model using binary data with replications, we
can still perform residual analysis. - correct answer ✔TRUE: We can perform residual analysis on the
Pearson residuals or the deviance residuals.
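For binomial data with replications, a Pearson residual standardizes the gap between the observed and expected number of successes. The sketch below assumes hypothetical counts and fitted probabilities.

```python
import math

def pearson_residual(successes, trials, fitted_prob):
    """(observed - expected) / sqrt(binomial variance) for one group."""
    expected = trials * fitted_prob
    variance = trials * fitted_prob * (1.0 - fitted_prob)
    return (successes - expected) / math.sqrt(variance)

# Hypothetical groups: (successes, trials, fitted probability)
groups = [(3, 25, 0.10), (11, 25, 0.40), (21, 25, 0.80)]
pearson = [pearson_residual(y, m, p) for y, m, p in groups]
print([round(r, 3) for r in pearson])
```

These residuals can then be examined with a histogram or a normality plot, as in a residual analysis for standard linear regression.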
For a classification model, the training error rate tends to underestimate the true classification error rate
of the model. - correct answer ✔TRUE: The training error rate is a downward-biased (optimistic)
estimate of the true classification error rate. Hence, the training error rate tends to underestimate the
true classification error rate of the model.
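The optimism of the training error can be seen in a small simulation. This sketch deliberately swaps in a 1-nearest-neighbour classifier instead of logistic regression because it is an extreme case: it reproduces every training label, yet the labels are noisy. The data-generating rule, noise level, and seed are all assumptions made for the example.

```python
import random

def simulate(n, rng, noise=0.2):
    """Draw (x, y) pairs: y = 1 when x > 0.5, with each label flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data

def predict(x, train):
    """Label of the training point closest to x (1-nearest neighbour)."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

def error_rate(data, train):
    """Fraction of points in `data` misclassified by the 1-NN rule built on `train`."""
    wrong = sum(1 for x, y in data if predict(x, train) != y)
    return wrong / len(data)

rng = random.Random(0)
train = simulate(200, rng)
test = simulate(200, rng)   # fresh data from the same process

training_error = error_rate(train, train)
test_error = error_rate(test, train)
print(training_error, test_error)
```

The training error is zero because each training point is its own nearest neighbour, while the error on fresh data is not, which is why held-out data or cross-validation is used to estimate the true classification error rate.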
The estimated regression coefficients in Poisson regression are approximate. - correct answer ✔TRUE:
The parameter estimates are obtained with a numerical algorithm and their standard errors rely on large-sample approximations, so both are approximate.
A t-test is used for testing the statistical significance of a coefficient given all predicting variables in a
Poisson regression model. - correct answer ✔FALSE: We can use a Z-test to test for the statistical
significance of a coefficient given all predicting variables in a Poisson regression model.