CORRELATION AND REGRESSION
5.1: Introduction
So far we have confined our discussion to distributions involving
only one variable. In practical applications, however, we often come
across sets of data in which each item comprises the values of two or
more variables.
Suppose we have a set of 30 students in a class and we want to
measure the heights and weights of all the students. We observe that each
individual (unit) of the set assumes two values – one relating to the height
and the other to the weight. Such a distribution, in which each individual or
unit of the set is made up of two values, is called a bivariate distribution. The
following examples illustrate the meaning of a bivariate distribution.
(i) The series of marks obtained in two subjects by all the students
in a class of 60.
(ii) The series of sales revenue and advertising expenditure of two
companies in a particular year.
(iii) The series of ages of husbands and wives in a sample of selected
married couples.
Thus in a bivariate distribution, we are given a set of pairs of
observations, wherein each pair represents the values of two variables.
In a bivariate distribution, we are interested in finding a relationship
(if it exists) between the two variables under study.
Correlation is a statistical tool for studying the relationship between
two variables. Correlation analysis comprises the various methods and
techniques used for studying and measuring the extent of that
relationship.
“Two variables are said to be in correlation if the change in one of the
variables results in a change in the other variable”.
5.2: Types of Correlation
There are two important types of correlation. They are (1) Positive
and Negative correlation and (2) Linear and Non-Linear correlation.
5.2.1: Positive and Negative Correlation
If the values of the two variables deviate in the same direction, i.e. if
an increase (or decrease) in the values of one variable results, on average,
in a corresponding increase (or decrease) in the values of the other variable,
the correlation is said to be positive.
Some examples of series of positive correlation are:
(i) Heights and weights;
(ii) Household income and expenditure;
(iii) Price and supply of commodities;
(iv) Amount of rainfall and yield of crops.
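The idea of two variables deviating in the same direction can be checked numerically. The following is a minimal sketch, not a method given in the text: it assumes hypothetical height and weight figures for five students and uses a helper named deviation_product_sum (an illustrative name) that adds up the products of each pair's deviations from the two means. A sum above zero suggests that, on average, the two variables move together.

```python
# Hypothetical data for five students (illustrative values, not from the text).
heights_cm = [150, 155, 160, 165, 170]
weights_kg = [48, 52, 55, 61, 64]


def deviation_product_sum(xs, ys):
    """Sum of the products of deviations of xs and ys from their means."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))


s = deviation_product_sum(heights_cm, weights_kg)

# A positive sum means the deviations mostly share the same sign.
if s > 0:
    direction = "positive"
elif s < 0:
    direction = "negative"
else:
    direction = "none"

print(direction)
```

With these assumed figures the sum is positive, which matches the description of heights and weights as positively correlated.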
Correlation between two variables is said to be negative or inverse if
the variables deviate in opposite directions, that is, if an increase (or
decrease) in the values of one variable results, on average, in a
corresponding decrease (or increase) in the values of the other variable.
Some examples of series of negative correlation are:
(i) Volume and pressure of a perfect gas;
(ii) Current and resistance [keeping the voltage constant] ($R = V/I$);
(iii) Price and demand of goods.
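Example (ii) above can be worked through with a few numbers. The sketch below assumes a constant voltage of 12 volts and a handful of resistance values (both hypothetical), computes the current from I = V/R, and checks that every increase in resistance is paired with a decrease in current. The helper name moves_in_opposite_directions is illustrative, and its step-by-step test is stricter than the "on average" wording of the definition; it is used here only because this relationship is exact.

```python
VOLTAGE = 12.0  # volts, held constant (assumed value for illustration)

# Hypothetical resistance values in ohms, in increasing order.
resistances_ohm = [1.0, 2.0, 3.0, 4.0, 6.0]

# Current follows from R = V / I, rearranged to I = V / R.
currents_amp = [VOLTAGE / r for r in resistances_ohm]


def moves_in_opposite_directions(xs, ys):
    """True if each consecutive change in xs has the opposite sign to the change in ys."""
    steps = zip(zip(xs, xs[1:]), zip(ys, ys[1:]))
    return all((x2 - x1) * (y2 - y1) < 0 for (x1, x2), (y1, y2) in steps)


print(currents_amp)
print(moves_in_opposite_directions(resistances_ohm, currents_amp))
```

As the resistance rises from 1 to 6 ohms the current falls from 12 to 2 amperes, so the two series deviate in opposite directions.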
Graphs of Positive and Negative correlation:
Suppose we are given sets of data relating to heights and weights of
students in a class. They can be plotted on the coordinate plane, using the
x-axis to represent heights and the y-axis to represent weights. The different
graphs shown below illustrate the different types of correlation.
[Figure: scatter diagram for positive correlation]
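A plot of heights on the x-axis against weights on the y-axis can be roughed out in plain text. The following is a minimal sketch under stated assumptions: the data are hypothetical, the function name text_scatter and its width and height parameters are illustrative, and the code assumes the values in each series are not all equal (otherwise the scaling would divide by zero).

```python
def text_scatter(xs, ys, width=21, height=9):
    """Return text rows that mark each (x, y) pair with an 'x' on a character grid."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column or row index on the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "x"  # row 0 is the top of the plot
    return ["".join(cells) for cells in grid]


# Hypothetical heights (x-axis) and weights (y-axis) for five students.
heights_cm = [150, 155, 160, 165, 170]
weights_kg = [48, 52, 55, 61, 64]

rows = text_scatter(heights_cm, weights_kg)
for line in rows:
    print(line)
```

For positively correlated data such as these, the marks run from the lower left of the grid to the upper right.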