As the sample size becomes larger, the sampling distribution of the sample mean
approaches a
Normal Distribution.
Poisson distribution
binomial distribution
hypergeometric distribution - ANSWER-Normal Distribution.
*bound on the error of estimation - ANSWER-also known as margin of error; the
maximum likely difference between the observed statistic and the true value of the
population parameter
B = z(stdev/sqrt(n))
n = (z*stdev/B)^2
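A minimal sketch of the two formulas above, assuming a known population stdev and a two-sided confidence level. The function names and the 0.95 default are illustrative choices, not part of the original notes.

```python
import math
from statistics import NormalDist

def margin_of_error(stdev, n, confidence=0.95):
    """B = z * stdev / sqrt(n), with z the two-sided critical value."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * stdev / math.sqrt(n)

def required_sample_size(stdev, bound, confidence=0.95):
    """n = (z * stdev / B)^2, rounded up to a whole observation."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * stdev / bound) ** 2)

# Example with made-up numbers: stdev = 10, n = 100, then a target bound of 2.
print(margin_of_error(10, 100))
print(required_sample_size(10, 2))
```

The sample size is rounded up because rounding down would give a bound slightly larger than the target.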
*central limit theorem - ANSWER-the larger the sample size, the more closely the
sampling distribution of x-bar will resemble a normal distribution
if the population is normal, then x-bar is normally distributed for all values of n
if the population is non-normal, then x-bar is approximately normal only for larger values
of n
In most practical situations, a sample size of 30 may be sufficiently large to allow us to
use the normal distribution as an approximation for the sampling distribution of x-bar.
the definition of "sufficiently large" depends on the extent of nonnormality of x (e.g.
heavily skewed; multimodal)
treat any population that is at least 20 times larger than the sample size as large
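A small simulation can illustrate the theorem. This is only a sketch: the exponential population (heavily skewed), the number of repetitions, and the seed are assumptions chosen for illustration.

```python
import random
from statistics import fmean, pstdev

def sample_means(n, reps=5000, seed=0):
    """Draw `reps` samples of size n from a skewed (exponential) population
    and return the mean of each sample."""
    rng = random.Random(seed)
    return [fmean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]

def skewness(values):
    """Population skewness; close to 0 for a symmetric, normal-like shape."""
    m = fmean(values)
    s = pstdev(values)
    return fmean([((v - m) / s) ** 3 for v in values])

small = sample_means(2)    # x-bar from tiny samples stays clearly skewed
large = sample_means(30)   # x-bar from n = 30 is much closer to symmetric
print(skewness(small), skewness(large))
```

The skewness of x-bar shrinks as n grows, which is the sense in which the sampling distribution "approaches" a normal shape.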
*cluster sampling - ANSWER-A population is divided into clusters using naturally
occurring geographic or other boundaries. Then, clusters are randomly selected and a
sample is collected by randomly selecting from each cluster.
Subdivide the state into small units—either counties or regions, then take samples of
the residents in each of these regions and interview them.
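One possible sketch of the two-stage procedure described above: randomly pick clusters, then randomly pick members within each chosen cluster. The cluster names, sizes, and seed are hypothetical.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=0):
    """clusters maps a cluster name to the list of its members.
    Stage 1: choose n_clusters clusters at random.
    Stage 2: choose n_per_cluster members at random from each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}

# Hypothetical regions, each holding 50 resident IDs.
regions = {
    "north": list(range(0, 50)),
    "south": list(range(50, 100)),
    "east": list(range(100, 150)),
    "west": list(range(150, 200)),
}
print(two_stage_cluster_sample(regions, n_clusters=2, n_per_cluster=5))
```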
*confidence interval estimate - ANSWER-x-bar +- z(stdev/sqrt(n))
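A hedged sketch of the interval x-bar +- z(stdev/sqrt(n)) computed from raw data, assuming the population stdev is known. The data values and the function name are made up for illustration.

```python
import math
from statistics import NormalDist, fmean

def z_confidence_interval(data, stdev, confidence=0.95):
    """Return (lower, upper) for the population mean, given a known stdev."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    x_bar = fmean(data)
    half_width = z * stdev / math.sqrt(len(data))
    return x_bar - half_width, x_bar + half_width

# Made-up sample with an assumed population stdev of 2.
lower, upper = z_confidence_interval([48, 52, 50, 49, 51], stdev=2)
print(lower, upper)
```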
*continuous random variable - ANSWER-one that can assume an uncountable number
of values
cannot list the possible values because there is an infinite number of them
because there is an infinite number of values, the probability of each individual value is
0; only intervals of values have positive probability
*control charts - ANSWER-used to study a process variation over time
distinguishes between assignable and random variations
control limits = mean of sample means +- k (std error of sample means), where k is the number of standard errors (commonly 3)
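A rough sketch of x-bar chart limits, assuming the process stdev is known and the common 3-standard-error convention applies. The sample means, stdev, and sample size below are hypothetical.

```python
import math
from statistics import fmean

def xbar_control_limits(sample_means, process_stdev, sample_size, k=3):
    """Return (lower limit, centerline, upper limit) for an x-bar chart."""
    center = fmean(sample_means)
    std_error = process_stdev / math.sqrt(sample_size)
    return center - k * std_error, center, center + k * std_error

def out_of_control(sample_means, lower, upper):
    """Indices of samples whose mean falls outside the control limits."""
    return [i for i, m in enumerate(sample_means) if m < lower or m > upper]

# Hypothetical process: six samples of size 4, process stdev 0.4.
means = [10.1, 9.9, 10.0, 10.2, 9.8, 11.5]
lower, center, upper = xbar_control_limits(means, process_stdev=0.4, sample_size=4)
print(out_of_control(means, lower, upper))
```

Points outside the limits are candidates for assignable variation; points inside are treated as random variation.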
*difference between two sample means - ANSWER-requires independent random
samples be drawn from each of two normal populations. If this condition is met, then the
sampling distribution of the difference between the two sample means, i.e.
x-bar1 - x-bar2, will be normally distributed.
if the two populations are not both normally distributed, but the sample sizes are "large"
(>30), the distribution of x-bar1 - x-bar2 is approximately normal
mean = mean1 - mean2
stdev = sqrt((stdev1^2/n1) + (stdev2^2/n2))
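A minimal sketch of the sampling distribution of x-bar1 - x-bar2, assuming independent samples and known population stdevs. The numbers in the example are invented.

```python
import math
from statistics import NormalDist

def diff_of_means_distribution(mean1, stdev1, n1, mean2, stdev2, n2):
    """Normal distribution of x-bar1 - x-bar2 for independent samples."""
    mean = mean1 - mean2
    std_error = math.sqrt(stdev1**2 / n1 + stdev2**2 / n2)
    return NormalDist(mean, std_error)

# Invented populations: (mean 100, stdev 15) and (mean 95, stdev 12), n = 36 each.
dist = diff_of_means_distribution(100, 15, 36, 95, 12, 36)
prob_first_larger = 1 - dist.cdf(0)   # P(x-bar1 - x-bar2 > 0)
print(dist.mean, dist.stdev, prob_first_larger)
```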
*discrete variable - ANSWER-a quantitative variable that has either a finite number of
possible values or a countable number of possible values
birth weight of newborns, rounded to the nearest gram
*errors in data acquisition - ANSWER-recording of incorrect responses, due to:
— incorrect measurements being taken because of faulty equipment,
— mistakes made during transcription from primary sources,
— inaccurate recording of data due to misinterpretation of terms, or
— inaccurate responses to questions concerning sensitive issues
*nonresponse error - ANSWER-error (or bias) introduced when responses are not
obtained from some members of the sample, i.e. the sample observations that are
collected may not be representative of the target population
*nonsampling error - ANSWER-errors that are more serious than sampling errors and are due to
mistakes made in the acquisition of data or to the sample observations being selected improperly
Errors in data acquisition
Nonresponse errors
Selection bias
increasing sample size will not reduce this error
*normal approximation to the binomial - ANSWER-For large samples, using the normal
approximation is a more convenient way to calculate binomial probabilities than using
the binomial probability mass function
works best when the number of experiments, n, is large, and the probability of success,
p, is close to 0.5
mean = np
stdev = sqrt(np(1-p))
for the approximation to provide good results:
np ≥ 5 and n(1-p) ≥ 5
To calculate P(X=10) using the normal distribution, we can find the area under the
normal curve between 9.5 & 10.5.
To find the probability of X ≥ 10, we will find P(X > 9.5)
To find the probability of X ≤ 8, we will find P(X < 8.5)
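The continuity correction can be checked against exact binomial probabilities. This is a sketch with assumed values n = 20 and p = 0.5 (so np = 10 and n(1-p) = 10, both at least 5); the helper names are illustrative.

```python
import math
from statistics import NormalDist

def binomial_pmf(k, n, p):
    """Exact P(X = k) for a binomial(n, p) variable."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def _approximating_normal(n, p):
    """Normal distribution with mean np and stdev sqrt(np(1-p))."""
    return NormalDist(n * p, math.sqrt(n * p * (1 - p)))

def approx_equal(k, n, p):
    """P(X = k) as the normal area between k - 0.5 and k + 0.5."""
    normal = _approximating_normal(n, p)
    return normal.cdf(k + 0.5) - normal.cdf(k - 0.5)

def approx_at_least(k, n, p):
    """P(X >= k) as the normal area above k - 0.5."""
    return 1 - _approximating_normal(n, p).cdf(k - 0.5)

def approx_at_most(k, n, p):
    """P(X <= k) as the normal area below k + 0.5."""
    return _approximating_normal(n, p).cdf(k + 0.5)

n, p = 20, 0.5
print(binomial_pmf(10, n, p), approx_equal(10, n, p))
print(sum(binomial_pmf(k, n, p) for k in range(10, n + 1)), approx_at_least(10, n, p))
print(sum(binomial_pmf(k, n, p) for k in range(0, 9)), approx_at_most(8, n, p))
```

Each printed pair shows the exact binomial value next to its normal approximation.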
*sample mean (x-bar) - ANSWER-mean of x-bar = mean
stdev of x-bar (aka std error) = stdev/sqrt(n)
stdev^2 of x-bar = stdev^2/n
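These identities can be verified exactly on a tiny population by listing every possible sample. The sketch below assumes a fair six-sided die as the population and samples of size 2 drawn with replacement; both choices are illustrative.

```python
from itertools import product
from statistics import fmean, pvariance

population = [1, 2, 3, 4, 5, 6]   # assumed population: faces of a fair die
n = 2                             # sample size

# Every equally likely sample of size n, drawn with replacement.
all_means = [fmean(sample) for sample in product(population, repeat=n)]

print(fmean(population), fmean(all_means))              # mean of x-bar vs population mean
print(pvariance(population) / n, pvariance(all_means))  # stdev^2/n vs variance of x-bar
```

The two values on each printed line agree, matching mean of x-bar = mean and stdev^2 of x-bar = stdev^2/n.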
*sample proportion p̂ - ANSWER-estimator of a population proportion of successes
X is the number of successes, n is the sample size
p̂ = X/n (the proportion of successes in the sample)
expected value (mean) = p
variance = (p(1-p))/n
stdev (aka std error) = sqrt((p(1-p))/n)
confidence interval: p̂ +- z*sqrt((p̂(1-p̂))/n)
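A hedged sketch of an interval estimate for a population proportion, assuming the sample is large enough for the normal approximation and using p̂ in place of the unknown p in the standard error. The counts are invented.

```python
import math
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Return (lower, upper) for the population proportion of successes."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    std_error = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * std_error, p_hat + z * std_error

# Invented sample: 40 successes out of 100 observations.
print(proportion_interval(40, 100))
```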
*sampling error - ANSWER-differences between the sample and the population that
exist only because of the observations that happened to be selected for the sample
increasing sample size will reduce this error
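A short simulation sketch of how sampling error shrinks as the sample grows. The population is simulated (roughly normal, mean 50, stdev 10) and the seeds and repetition count are arbitrary assumptions.

```python
import random
from statistics import fmean

def mean_abs_sampling_error(population, n, reps=500, seed=0):
    """Average |x-bar - true mean| over `reps` random samples of size n."""
    rng = random.Random(seed)
    true_mean = fmean(population)
    errors = [abs(fmean(rng.sample(population, n)) - true_mean) for _ in range(reps)]
    return fmean(errors)

# Simulated population; values are not real data.
pop_rng = random.Random(1)
population = [pop_rng.gauss(50, 10) for _ in range(10_000)]

print(mean_abs_sampling_error(population, 10))
print(mean_abs_sampling_error(population, 1000))
```

The second figure is much smaller than the first, since the samples are drawn properly at random and only chance differences remain.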