Experimental Design Models
We consider the models that are used in designing an experiment. The experimental conditions, the
experimental setup and the objective of the study primarily determine what type of design is to be
used, and hence which type of design model can be used for the subsequent statistical analysis and
for drawing conclusions. These models are based on one-way classification, two-way classification
(with or without interactions), etc. We discuss them now in detail under the setups of one-way and
two-way classification, which can be extended further to any order of classification.
It has already been described how to develop the likelihood ratio tests for the hypothesis of
equality of more than two means from normal distributions. We now concentrate on deriving the same
tests through the least-squares principle under the setup of the linear regression model. The
design matrix is not necessarily of full rank and consists of 0's and 1's only.
One-way classification:
Suppose $p$ random samples have been drawn independently from $p$ normal populations with the same
variance but different means, the sample sizes being allowed to differ.
Let the observations $Y_{ij}$ follow the linear regression model setup, where $Y_{ij}$ denotes the
$j$th observation of the dependent variable $Y$ when the effect of the $i$th level of the factor is
present.
Then the $Y_{ij}$ are independently normally distributed with
$$E(Y_{ij}) = \mu + \alpha_i, \quad i = 1, 2, \ldots, p, \quad j = 1, 2, \ldots, n_i,$$
$$V(Y_{ij}) = \sigma^2,$$
where
$\mu$ is the general mean effect.
- is fixed.
- gives an idea about the general conditions of the experimental units and treatments.
$\alpha_i$ is the effect of the $i$th level of the factor.
- can be fixed or random.
Analysis of Variance | Chapter 3 | Experimental Design Models | Shalabh, IIT Kanpur
Example: Consider a medical experiment in which three different dosages of a drug (2 mg, 5 mg and
10 mg) are given to patients for controlling fever. These are the 3 levels of the drug, so let
$\alpha_1$, $\alpha_2$ and $\alpha_3$ denote the effects of the 2 mg, 5 mg and 10 mg dosages,
respectively. Let $Y$ denote the time taken by the medicine to reduce the body temperature from
high to normal. Suppose two patients have been given the 2 mg dosage, so $Y_{11}$ and $Y_{12}$
denote their responses. When the 2 mg dosage is given to the two patients, we can write
$$E(Y_{1j}) = \mu + \alpha_1; \quad j = 1, 2.$$
Similarly, if the 5 mg and 10 mg dosages are given to 4 and 7 patients respectively, then the
responses follow the model
$$E(Y_{2j}) = \mu + \alpha_2; \quad j = 1, 2, 3, 4,$$
$$E(Y_{3j}) = \mu + \alpha_3; \quad j = 1, 2, 3, 4, 5, 6, 7.$$
Here $\mu$ denotes the general mean effect, which may be thought of as follows: the human body has
a tendency to fight against the fever, so the time taken by the medicine to bring down the
temperature depends on many factors such as the body weight, height and general health condition
of the patient. So $\mu$ denotes the general effect of all these factors, which is present in all
the observations.
In the terminology of the linear regression model, $\mu$ denotes the intercept term, which is the
value of the response variable when all the independent variables are set to zero. In experimental
designs, models with an intercept term are more commonly used, and so we generally consider these
types of models.
Also, we can express
$$Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}; \quad i = 1, 2, \ldots, p, \quad j = 1, 2, \ldots, n_i,$$
where $\varepsilon_{ij}$ is the random error component in $Y_{ij}$. It indicates the variation due
to uncontrolled causes which can influence the observations. We assume that the
$\varepsilon_{ij}$'s are identically and independently distributed as $N(0, \sigma^2)$ with
$E(\varepsilon_{ij}) = 0$, $Var(\varepsilon_{ij}) = \sigma^2$.
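As an illustration of the model $Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$, the following minimal sketch simulates responses for the dosage example with group sizes 2, 4 and 7. The numerical values of `mu`, `alpha` and `sigma` are hypothetical and chosen only for illustration; they do not come from the text.

```python
import numpy as np

# Hypothetical parameter values, assumed only for this illustration.
mu = 30.0                             # general mean effect
alpha = np.array([5.0, 0.0, -5.0])    # effects of the 2 mg, 5 mg and 10 mg dosages
sigma = 2.0                           # common error standard deviation
n = [2, 4, 7]                         # patients per dosage, as in the example

rng = np.random.default_rng(0)

# Y_ij = mu + alpha_i + eps_ij, with eps_ij drawn independently from N(0, sigma^2)
samples = [mu + alpha[i] + rng.normal(0.0, sigma, size=n[i]) for i in range(len(n))]

for i, y in enumerate(samples, start=1):
    print(f"level {i}: n = {y.size}, E(Y) = {mu + alpha[i - 1]}, sample mean = {y.mean():.2f}")
```

Each group shares the same $\mu$ and $\sigma$ and differs only through its own $\alpha_i$, so the sample means scatter around $\mu + \alpha_i$.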
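Note that the general linear model considered is
$$E(Y) = X\beta$$
for which the expectation of $Y_{ij}$ can be written as
$$E(Y_{ij}) = \beta_i.$$
When all the entries in $X$ are 0's or 1's, then this model can also be re-expressed in the form
$$E(Y_{ij}) = \mu + \alpha_i.$$
This gives rise to some more issues.
Rewrite
$$E(Y_{ij}) = \beta_i = \bar{\beta} + (\beta_i - \bar{\beta}) = \mu + \alpha_i$$
where
$$\mu \equiv \bar{\beta} = \frac{1}{p} \sum_{i=1}^{p} \beta_i, \qquad \alpha_i \equiv \beta_i - \bar{\beta}.$$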
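The reparametrization from the group means $\beta_i$ to a general mean $\mu$ and effects $\alpha_i$ can be sketched numerically. The values of `beta` below are hypothetical, used only to show the arithmetic.

```python
import numpy as np

beta = np.array([35.0, 30.0, 25.0])   # hypothetical group means beta_i

mu = beta.mean()       # mu = average of the beta_i
alpha = beta - mu      # alpha_i = beta_i minus that average

print(mu)              # 30.0
print(alpha)           # effects 5, 0 and -5
print(mu + alpha)      # recovers the original beta_i
```

With $\mu$ and $\alpha_i$ defined this way, $\mu + \alpha_i$ reproduces each $\beta_i$, and the $\alpha_i$ add up to zero by construction.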
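Now let us see the changes in the structure of the design matrix and the vector of regression
coefficients.
The model $E(Y_{ij}) = \beta_i = \mu + \alpha_i$ can now be rewritten as
$$E(Y) = X^* \beta^*$$
$$Cov(Y) = \sigma^2 I$$
where $\beta^* = (\mu, \alpha_1, \alpha_2, \ldots, \alpha_p)'$ is a $(p+1) \times 1$ vector and
$$X^* = \begin{bmatrix} \mathbf{1}_n & X \end{bmatrix}$$
is an $n \times (p+1)$ matrix with $n = n_1 + n_2 + \cdots + n_p$. Its first column $\mathbf{1}_n$
consists of $n$ ones, and $X$ denotes the earlier defined design matrix, which has
- first $n_1$ rows as (1,0,0,…,0),
- second $n_2$ rows as (0,1,0,…,0),
- …, and
- last $n_p$ rows as (0,0,0,…,1).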
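We earlier assumed that $\operatorname{rank}(X) = p$, but can we also say that
$\operatorname{rank}(X^*)$ is also $p$ in the present case?
Since the first column of $X^*$ is the vector sum of all its remaining $p$ columns, and those $p$
columns are the columns of $X$, we have
$$\operatorname{rank}(X^*) = p.$$
Thus $X^*$, which has $p + 1$ columns, is not of full column rank.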
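The rank argument can be checked numerically. The sketch below builds the design matrix with and without the column of ones for the group sizes 2, 4 and 7 of the dosage example; the variable names `X` and `X_star` are chosen here for illustration.

```python
import numpy as np

n = [2, 4, 7]                 # group sizes from the dosage example
p, N = len(n), sum(n)

# Design matrix without intercept: an observation from level i gets the i-th unit vector.
X = np.repeat(np.eye(p), n, axis=0)            # shape (13, 3)

# Augmented design matrix: a column of ones (for mu) placed in front of X.
X_star = np.hstack([np.ones((N, 1)), X])       # shape (13, 4)

print(np.linalg.matrix_rank(X))                # 3
print(np.linalg.matrix_rank(X_star))           # 3, although X_star has 4 columns

# The first column equals the sum of the remaining p columns.
print(np.array_equal(X_star[:, 0], X_star[:, 1:].sum(axis=1)))   # True
```

Adding the column of ones introduces one more parameter but no new linear information, which is why the rank stays at $p$.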
It is thus apparent that not all linear parametric functions of
$\alpha_1, \alpha_2, \ldots, \alpha_p$ are estimable. The question that now arises is: what kind of
linear parametric functions are estimable?
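Consider any linear estimator
$$L = \sum_{i=1}^{p} \sum_{j=1}^{n_i} a_{ij} Y_{ij}$$
with
$$C_i = \sum_{j=1}^{n_i} a_{ij}.$$
Now
$$\begin{aligned}
E(L) &= \sum_{i=1}^{p} \sum_{j=1}^{n_i} a_{ij} E(Y_{ij}) \\
     &= \sum_{i=1}^{p} \sum_{j=1}^{n_i} a_{ij} (\mu + \alpha_i) \\
     &= \mu \sum_{i=1}^{p} \sum_{j=1}^{n_i} a_{ij} + \sum_{i=1}^{p} \sum_{j=1}^{n_i} a_{ij} \alpha_i \\
     &= \mu \left( \sum_{i=1}^{p} C_i \right) + \sum_{i=1}^{p} C_i \alpha_i .
\end{aligned}$$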
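Thus $\sum_{i=1}^{p} C_i \alpha_i$ is estimable if and only if
$$\sum_{i=1}^{p} C_i = 0,$$
i.e., $\sum_{i=1}^{p} C_i \alpha_i$ is a contrast.
Thus, in general, neither $\sum_{i=1}^{p} \alpha_i$ nor any of
$\mu, \alpha_1, \alpha_2, \ldots, \alpha_p$ is estimable. If a linear parametric function of the
$\alpha_i$ is a contrast, then it is estimable.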
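The contrast condition can be illustrated numerically. The sketch below relies on a standard linear-model fact that is not stated in the text above: a linear function of the parameters $(\mu, \alpha_1, \ldots, \alpha_p)$ with coefficient vector $c$ is estimable exactly when $c$ lies in the row space of the augmented design matrix. The helper `is_estimable` is a hypothetical name introduced here, and the group sizes 2, 4 and 7 are those of the dosage example.

```python
import numpy as np

n = [2, 4, 7]
p = len(n)
X = np.repeat(np.eye(p), n, axis=0)
X_star = np.hstack([np.ones((sum(n), 1)), X])   # columns: mu, alpha_1, alpha_2, alpha_3

def is_estimable(c, design):
    """Return True if appending c as a row does not raise the rank of the design,
    i.e. c lies in the row space of the design matrix."""
    c = np.asarray(c, dtype=float)
    base_rank = np.linalg.matrix_rank(design)
    return np.linalg.matrix_rank(np.vstack([design, c])) == base_rank

# Coefficient order: (mu, alpha_1, alpha_2, alpha_3)
print(is_estimable([0, 1, -1, 0], X_star))   # alpha_1 - alpha_2, a contrast: True
print(is_estimable([0, 1, 0, 0], X_star))    # alpha_1 alone: False
print(is_estimable([0, 1, 1, 1], X_star))    # alpha_1 + alpha_2 + alpha_3: False
print(is_estimable([1, 0, 0, 0], X_star))    # mu alone: False
```

Only the first function has coefficients on the $\alpha_i$ that sum to zero, and it is the only one of the four reported as estimable.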