When comparing two different variables, two questions come to mind: “is there a
relationship between two variables?” and “how strong is that relationship?”. These questions
can be answered using regression and correlation. Regression answers whether there is a
relationship and correlation answers how strong the relationship is. To introduce both
concepts, let's first discuss simple linear regression.
Simple linear regression is a process for estimating the statistical relationship
between two variables: a dependent variable and an independent variable.
The independent variable, also called the explanatory variable or predictor variable, is
the x-value in the equation. It is used to predict the value of the other variable. The dependent
variable, sometimes called the response variable, is the y-value; it responds to changes in the
explanatory variable.
The objective in linear regression is to obtain the equation of a straight line that
minimizes the sum of the squared vertical distances between the observed y-values and the
line. This straight line approximates the relationship between the two variables.
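In symbols, the quantity that the least squares line makes as small as possible can be written as shown here. The index i, the count n of observed pairs, and the label SSE (sum of squared errors) are notation introduced for illustration only.

$$\text{SSE} = \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2$$

Here y_i is an observed value of the dependent variable and ŷ_i is the value on the line for the corresponding x_i. Among all straight lines, the least squares line is the one with the smallest SSE.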
The simple linear regression equation (also known as the best-fitting line or the least
squares line) is:
𝑦̂ = 𝑎 + 𝑏𝑥
where:
𝑦̂ = predicted value of the dependent variable
𝑎 = intercept
𝑏 = slope
To get the intercept and slope, use the following formulas:
$$a = \bar{y} - b\bar{x} \qquad b = \frac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2} \qquad \bar{x} = \frac{\sum x}{n} \qquad \bar{y} = \frac{\sum y}{n}$$
where:
𝑎 = intercept
𝑦̅ = mean of dependent variable
𝑥̅ = mean of independent variable
𝑏 = slope
𝑥 = independent variable
𝑦 = dependent variable
𝑛 = number of ordered pairs
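The slope and intercept formulas can be sketched in Python as follows. This is a minimal illustration, not part of the original text: the function name `fit_line` and the four sample points (which lie exactly on y = 1 + 2x) are made up for the demonstration.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least squares line y_hat = a + b*x.

    Illustrative helper (hypothetical name); it follows the formulas
    b = (sum(xy) - n*x_bar*y_bar) / (sum(x^2) - n*x_bar^2) and a = y_bar - b*x_bar.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)  # number of ordered pairs
    if n < 2:
        raise ValueError("at least two ordered pairs are needed")

    x_bar = sum(xs) / n  # mean of the independent variable
    y_bar = sum(ys) / n  # mean of the dependent variable
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denominator = sum_x2 - n * x_bar ** 2
    if denominator == 0:
        raise ValueError("all x-values are equal, so the slope is undefined")

    b = (sum_xy - n * x_bar * y_bar) / denominator  # slope
    a = y_bar - b * x_bar                           # intercept
    return a, b


# Made-up points that lie exactly on y = 1 + 2x
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)  # 1.0 2.0
```

Because the sample points fall exactly on a line, the function recovers that line's intercept and slope; with real data the fitted line only approximates the points.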
Example: The XYZ Foods Corp. has accumulated the following data on its promotional
expenditures and sales for the past ten years.

Annual Promotional Expenditures (P100,000)    Annual Sales (P100,000)
 8                                             65
14                                             90
10                                             84
13                                             95
15                                             97
18                                            100
19                                            105
20                                            111
24                                            120
29                                            123

a. Develop a simple regression equation for these data.
b. If the promotional expenditure of XYZ Foods Corp. is P2,500,000, what would be the
company's expected sales?
Solution:
In this example, let x = annual promotional expenditure and y = annual sales. Note
that 𝑛 is the number of ordered pairs in the given data; in this case, 𝑛 = 10.
Step 1: Compute the values needed in the formulas.
  x      y       xy      x²       y²
  8     65      520      64     4,225
 14     90    1,260     196     8,100
 10     84      840     100     7,056
 13     95    1,235     169     9,025
 15     97    1,455     225     9,409
 18    100    1,800     324    10,000
 19    105    1,995     361    11,025
 20    111    2,220     400    12,321
 24    120    2,880     576    14,400
 29    123    3,567     841    15,129
Σx = 170    Σy = 990    Σxy = 17,772    Σx² = 3,256    Σy² = 100,690
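The column totals in Step 1 can be checked with a few lines of Python. This is only a sketch for verifying the arithmetic; the variable names are chosen here for illustration.

```python
# Data from the XYZ Foods Corp. example (both columns in units of P100,000)
x = [8, 14, 10, 13, 15, 18, 19, 20, 24, 29]            # promotional expenditures
y = [65, 90, 84, 95, 97, 100, 105, 111, 120, 123]      # sales

n = len(x)                                    # number of ordered pairs
sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))  # total of the xy column
sum_x2 = sum(xi ** 2 for xi in x)              # total of the x-squared column
sum_y2 = sum(yi ** 2 for yi in y)              # total of the y-squared column

print(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2)
# 10 170 990 17772 3256 100690
```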
Step 2: Compute the mean values of x and y.
$$\bar{x} = \frac{\sum x}{n} = \frac{170}{10} = \mathbf{17} \qquad \bar{y} = \frac{\sum y}{n} = \frac{990}{10} = \mathbf{99}$$
Step 3: Compute the value of the slope (𝑏) and intercept (𝑎).
$$b = \frac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2} = \frac{17{,}772 - 10(17)(99)}{3{,}256 - 10(17^2)} = \frac{17{,}772 - 16{,}830}{3{,}256 - 2{,}890} = \frac{942}{366} = \mathbf{2.57}$$

$$a = \bar{y} - b\bar{x} = 99 - 2.57(17) = 99 - 43.69 = \mathbf{55.31}$$
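The slope and intercept computation can be reproduced from the totals with a short Python sketch. The variable names are illustrative. The sketch also shows that the intercept depends slightly on whether the slope is rounded to 2.57 before it is substituted, as in the hand computation, or carried at full precision.

```python
# Totals taken from Step 1 of the example
n = 10
sum_x, sum_y = 170, 990
sum_xy, sum_x2 = 17_772, 3_256

x_bar = sum_x / n  # 17.0
y_bar = sum_y / n  # 99.0

# Slope: (sum(xy) - n*x_bar*y_bar) / (sum(x^2) - n*x_bar^2) = 942 / 366
b = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)

# Intercept using the slope rounded to two decimals (matches the hand computation)
a_rounded_slope = y_bar - round(b, 2) * x_bar

# Intercept using the unrounded slope
a_full_precision = y_bar - b * x_bar

print(round(b, 2))                 # 2.57
print(round(a_rounded_slope, 2))   # 55.31
print(round(a_full_precision, 2))  # 55.25
```

The small gap between 55.31 and 55.25 comes only from rounding the slope early; either way, the fitted line has a slope of about 2.57.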