and answers | Updated New 2026
Reliability
Is the degree to which an assessment tool produces stable and consistent results
Types of Reliability
test-retest, interrater, internal consistency, parallel forms
test-retest reliability
a measure of reliability obtained by administering the same test twice over a period of time to a
group of individuals
parallel forms reliability
Obtained by administering different versions of an assessment tool to the same group of
individuals
inter-rater reliability
Used to assess the degree to which different raters/observers give consistent estimates of the
same phenomenon.
internal consistency reliability
Used to evaluate the degree to which different test items that probe the same construct
produce similar results
Subtypes of internal consistency
Average inter-item correlation, split-half reliability
Average inter-item correlation
is a subtype of internal consistency reliability. It is obtained by taking all of the items on a test
that probe the same construct (e.g., reading comprehension), determining the correlation
coefficient for each pair of items, and finally taking the average of all of these correlation
coefficients. This final step yields the average inter-item correlation.
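The three steps above (collect the items, correlate each pair, average the coefficients) can be sketched in Python. This is a minimal illustration, not a standard library routine: the function names and the scores are hypothetical, and the Pearson coefficient is assumed as the correlation measure.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    # Note: an item on which every respondent scores the same has zero
    # spread and would cause a division by zero here.
    return cov / (spread_x * spread_y)


def average_inter_item_correlation(items):
    """items: one list of scores per test item (one score per respondent)."""
    pairs = list(combinations(items, 2))          # every pair of items
    coefficients = [pearson(a, b) for a, b in pairs]
    return sum(coefficients) / len(coefficients)  # average of all pairs


# Hypothetical data: 3 reading-comprehension items, 5 respondents.
items = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 4],
    [4, 2, 5, 3, 3],
]
avg_r = average_inter_item_correlation(items)
print(round(avg_r, 3))
```

A value closer to 1 suggests the items behave similarly across respondents.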
split-half reliability
A measure of reliability in which a test is split into two parts and an individual's scores on both
halves are compared.
Validity
the extent to which a test measures or predicts what it is supposed to
Types of Validity
face, construct, criterion-related, formative, sampling
face validity (content validity)
The extent to which a test is subjectively viewed as covering the concept it purports to measure.
The relevance of the test as it appears to the test takers.
construct validity
the extent to which variables measure what they are supposed to measure
criterion validity
Used to predict future or current performance; it correlates test results with another criterion
of interest
Formative Validity
Is used to assess how well a measure is able to provide information to help improve the
program under study
sampling validity
Experts assess the scope of the measuring device. Does the test/device measure everything the
researcher is hoping to measure?
Statistics
Collection of methods for planning experiments, obtaining data, organizing, summarizing,
presenting, analyzing, interpreting, and drawing conclusions based on data.
Variability
The extent to which the scores in a data set tend to vary from each other and from the mean.
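For illustration, variability can be quantified in several ways. The sketch below computes three common measures (range, variance, and standard deviation) with Python's standard `statistics` module. The scores are hypothetical, and the population formulas are assumed, meaning the data set is treated as the whole group of interest.

```python
from statistics import mean, pstdev, pvariance

# Hypothetical test scores.
scores = [70, 80, 90, 100]

average = mean(scores)                    # the mean the scores vary around
score_range = max(scores) - min(scores)   # spread between the extremes
variance = pvariance(scores)              # mean squared distance from the mean
std_dev = pstdev(scores)                  # square root of the variance

print(average, score_range, variance, round(std_dev, 2))
```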
Hypothesis