Statistics Sophia 2, Statistics Sophia 1.1 Exam/
300 Q&A.
sample statistic - Answer: A measure of an attribute of a sample.
sample mean - Answer: A mean obtained from a sample of a given size. Denoted as x bar.
multiple data sets - Answer: Plotting more than one data set on a scatterplot requires that we
use different colors or symbols for the different data sets so we can see the relationships
separately.
form - Answer: The overall shape of the data points. The form may be linear or nonlinear, or
there may not be any form at all to the points if they form a "cloud."
direction - Answer: The way one variable responds to an increase in the other. With a negative
association, an increase in one variable is associated with a decrease in the other, whereas with
a positive association, an increase in one variable is associated with an increase in the other.
strength - Answer: The closeness of the points to the indicated form. Points that are strongly
linear will all fall on or near a straight line.
explanatory variable - Answer: The variable whose increase or decrease we believe helps
explain a tendency to increase or decrease in some other variable.
response variable - Answer: The variable that tends to increase or decrease due to an increase
or decrease in the explanatory variable.
correlation - Answer: The strength and direction of a linear association between two
quantitative variables.
correlation coefficient - Answer: The numerical value between -1 and +1 that measures the
correlation between two quantitative variables.
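As a rough illustration of how a correlation coefficient can be computed, here is a minimal Python sketch of the standard Pearson formula. The function name `pearson_r` and the `hours`/`scores` data are made up for this example and are not part of the course material.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    x_bar = sum(xs) / n  # sample mean of x
    y_bar = sum(ys) / n  # sample mean of y
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours studied vs. exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]
r = pearson_r(hours, scores)
print(round(r, 3))
```

The result is always between -1 and +1. For this made-up data it is close to +1, which indicates a strong positive linear association.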
positive correlation - Answer: The type of correlation present when two variables have a
correlation coefficient generally greater than or equal to 0.5.
negative correlation - Answer: The type of correlation present when two variables have a
correlation coefficient generally less than or equal to -0.5.
Relative Zero Correlation - Answer: The type of correlation present when two variables have a
correlation coefficient generally between -0.5 and 0.5.
non-linear relationships - Answer: Associations between two variables that can be modeled
better with a curve than a line.
Coefficient of Determination (r^2) - Answer: The percent of variation in the response variable
that can be explained by a linear association with the explanatory variable. It is the square of
the correlation coefficient.
finding r from r squared - Answer: Step 1: Take the square root of r^2. This gives the size
(absolute value) of the correlation coefficient, r.
Step 2: Look at the graph to determine the sign. If the association is positive, r is positive; if
the association is negative, r is negative.
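The two steps can be sketched in Python as follows. This is only an illustration: the function name `r_from_r_squared` and the `association` argument (standing in for what you read off the graph) are assumptions made for this example.

```python
import math

def r_from_r_squared(r_squared, association):
    """Recover r from r^2; `association` is 'positive' or 'negative',
    as judged from the scatterplot."""
    if association not in ("positive", "negative"):
        raise ValueError("association must be 'positive' or 'negative'")
    magnitude = math.sqrt(r_squared)  # Step 1: square root of r^2
    # Step 2: attach the sign indicated by the graph
    return magnitude if association == "positive" else -magnitude

# Hypothetical case: r^2 = 0.64 and the scatterplot slopes downward.
r = r_from_r_squared(0.64, "negative")
print(round(r, 2))
```

The square root alone cannot tell you the sign, because r = 0.8 and r = -0.8 both give r^2 = 0.64.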
outlier - Answer: A point that deviates substantially from the overall form of the remainder of
the data points.
influential points - Answer: Observations that, if removed, significantly change a statistical
measure.
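To see what "significantly changes a statistical measure" can look like, the sketch below compares the slope of a fitted line with and without one unusual point. The helper `slope` uses the standard least-squares slope formula, and the data are invented for illustration.

```python
def slope(xs, ys):
    """Least-squares slope for two equal-length lists of numbers."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical data: four points on the line y = 2x, plus one far-off point.
xs = [1, 2, 3, 4, 10]
ys = [2, 4, 6, 8, 2]

slope_with = slope(xs, ys)               # all five points
slope_without = slope(xs[:-1], ys[:-1])  # last point removed
print(slope_with, slope_without)
```

With the last point included the slope is negative. Without it the slope is 2, so that single observation is influential.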
inappropriate grouping - Answer: Combining together subgroups that should not be combined,
resulting in a weakened, or even reversed, association.
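The reversal described here can be shown with a small made-up data set. In the sketch below, each subgroup has a perfect positive correlation, yet the combined data have a negative one. The function `pearson_r` and both groups are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Two hypothetical subgroups, each with a clear positive association.
group_a_x, group_a_y = [1, 2, 3], [10, 11, 12]
group_b_x, group_b_y = [6, 7, 8], [4, 5, 6]

r_a = pearson_r(group_a_x, group_a_y)
r_b = pearson_r(group_b_x, group_b_y)
# Combining the groups hides the fact that group B sits lower overall.
r_all = pearson_r(group_a_x + group_b_x, group_a_y + group_b_y)
print(r_a, r_b, r_all)
```

Plotting the two groups with different colors or symbols would make the separate positive trends visible.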
correlation - Answer: A statistic which measures the strength and direction of the linear
association between two quantitative variables.
Causation/Cause-and-Effect - Answer: A phenomenon whereby an increase in one variable
directly leads to an increase or decrease in another variable.
causality - Answer: A cause-and-effect relationship between two variables.
Best-Fit Line/Trend Line/Regression Line - Answer: A line that closely approximates the
response values for given explanatory values when the form of the scatterplot is linear.
slope - Answer: The rate of change relating the increase or decrease in y to an increase of 1 in x.
y-intercept - Answer: The value of y when x = 0.
residual - Answer: The difference between the actual value of the response variable for a
particular data point and its predicted value from the regression line.
residual plot - Answer: A scatterplot of the residuals vs. the explanatory variable, as opposed to
the response variable vs. the explanatory variable. It can be used to assess the fit of a line.
Least-Squares Line - Answer: A best-fit line that is found through a process of minimizing the
sum of the squared residuals.
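A minimal Python sketch of a least-squares fit is shown below, using the standard closed-form formulas for the slope and y-intercept. The function name `least_squares_line` and the data are assumptions made for illustration.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the line minimizing the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar  # the line passes through (x-bar, y-bar)
    return slope, intercept

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = least_squares_line(xs, ys)
# Residual = actual y minus the y predicted by the line.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
print(slope, intercept, residuals)
```

Plotting `residuals` against `xs` gives a residual plot. For a least-squares line with an intercept, the residuals sum to zero (up to rounding).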
X-bar (x with a line over it) - Answer: The average x value for a sample.
Y-bar (y with a line over it) - Answer: The average y value for a sample.
Slope of Regression Line - Answer: The amount y changes (on average) for a one-unit increase in
x.
Y-Intercept of Regression Line - Answer: The expected y value when x = 0.
multiple regression - Answer: Using more than one explanatory variable to predict the value of
the response variable.
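One way to sketch multiple regression in plain Python is to solve the normal equations for a least-squares fit. The helper names `solve` and `multiple_regression` and the data are all hypothetical. The response values are generated from a known rule so that the recovered coefficients can be checked.

```python
def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        known = sum(m[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (m[i][n] - known) / m[i][i]
    return coef

def multiple_regression(rows, ys):
    """rows: tuples of explanatory values. Returns [intercept, b1, b2, ...]."""
    design = [[1.0, *row] for row in rows]  # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical data: two explanatory variables per observation.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]  # built from a known rule
coef = multiple_regression(rows, ys)
print([round(c, 6) for c in coef])
```

Because the response was built as 1 + 2*x1 + 3*x2, the fit should recover an intercept near 1 and slopes near 2 and 3. Real data would not fit exactly.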