Let's calculate a linear regression for the dataset below. We have the age of each person, which we will denote as X, and the sugar level of each person, which we will denote as Y.
Step 1)
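Calculate the mean of X and the mean of Y. The values used in the later steps are 41.16667 for X and 81 for Y. To learn how to calculate the mean, refer to the link.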
Step 2)
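Calculate the standard deviation of X and of Y. The values used in step 7 are Sx = 15.75331 and Sy = 11.45426. To learn how to calculate the standard deviation, refer to the link.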
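Steps 1 and 2 can be sketched with Python's standard `statistics` module. The two lists below are illustrative placeholders, not the dataset from this article, and the variable names are assumptions made for the example. `statistics.stdev` uses the sample formula (dividing by n - 1); whichever formula you pick, use the same one for both X and Y, because the slope in step 7 depends only on the ratio Sy/Sx.

```python
import statistics

# Illustrative placeholder values only; NOT the article's dataset.
ages_example = [20, 30, 40, 50]
sugar_example = [70, 78, 80, 92]

# Step 1: means of X and Y.
mean_x = statistics.mean(ages_example)
mean_y = statistics.mean(sugar_example)

# Step 2: sample standard deviations (stdev divides by n - 1).
s_x = statistics.stdev(ages_example)
s_y = statistics.stdev(sugar_example)

print(mean_x, mean_y, s_x, s_y)
```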
Step 3)
Now calculate the difference between each data point and its mean: each x minus the mean of X, and each y minus the mean of Y.
Step 4)
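Now calculate the sum of the products of the X and Y differences from step 3. This sum is the numerator of the Pearson correlation formula in step 6, where its value is 478.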
Step 5)
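Now calculate the sum of the squared differences from step 3, separately for X and for Y. These two sums, 1240.833333 for X and 656 for Y, go under the square root in the Pearson correlation formula in step 6.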
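Steps 3 to 5 can be sketched as one small function. This is a minimal illustration rather than the article's own code: `deviation_sums` is a hypothetical helper name, and the input lists are placeholder values, not the article's dataset.

```python
def deviation_sums(xs, ys):
    """Return (sum of products, sum of squared x deviations,
    sum of squared y deviations) for two equal-length lists."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and the same length")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Step 3: difference of each data point from its mean.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]

    # Step 4: sum of the products of the paired differences.
    sum_xy = sum(a * b for a, b in zip(dx, dy))

    # Step 5: sums of the squared differences.
    sum_xx = sum(a * a for a in dx)
    sum_yy = sum(b * b for b in dy)
    return sum_xy, sum_xx, sum_yy


# Illustrative placeholder values only; NOT the article's dataset.
sum_xy, sum_xx, sum_yy = deviation_sums([20, 30, 40, 50], [70, 78, 80, 92])
print(sum_xy, sum_xx, sum_yy)
```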
Step 6)
Calculate Pearson correlation:
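Formula to calculate Pearson correlation:
r = sum of (x - mean of X) * (y - mean of Y) / square root (sum of (x - mean of X)^2 * sum of (y - mean of Y)^2)
We already calculated the numerator at step 4 and the two sums under the square root at step 5, so now we substitute those values into the formula.
r = 478 / square root (1240.833333 * 656) = 0.529809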
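The substitution in step 6 can be checked with a few lines of Python. The three sums are the figures quoted in this step; the variable names are assumptions made for the example.

```python
import math

# Sums quoted in step 6 of the article.
sum_xy = 478.0          # sum of products of the X and Y differences
sum_xx = 1240.833333    # sum of squared X differences
sum_yy = 656.0          # sum of squared Y differences

# Pearson correlation: numerator over the square root of the product of sums.
r = sum_xy / math.sqrt(sum_xx * sum_yy)
print(round(r, 6))
```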
Step 7)
Calculate m, which is called the slope; in AP Statistics it is also called beta. Sx and Sy are the standard deviations of X and Y from step 2.
Slope m = Pearson correlation * Sy / Sx
m = 0.529809 * 11.45426/15.75331 = 0.38522
So, we found the value of m = 0.38522
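The slope calculation can be reproduced as follows, using the figures quoted in steps 6 and 7. The cross-check relies on a standard identity: r * Sy / Sx is algebraically the same as the sum of products divided by the sum of squared X differences. The variable names are assumptions made for the example.

```python
# Figures quoted in the article.
r = 0.529809       # Pearson correlation from step 6
s_x = 15.75331     # standard deviation of X (age)
s_y = 11.45426     # standard deviation of Y (sugar level)

# Step 7: slope from the correlation and the two standard deviations.
m = r * s_y / s_x

# Cross-check with the sums from step 6: sum of products / sum of squared X differences.
m_check = 478 / 1240.833333

print(round(m, 5), round(m_check, 5))
```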
Step 8)
Now we will calculate c, which is called the intercept: the predicted value of Y when X is 0.
Intercept c = mean of Y - m * mean of X
c = 81 - 0.38522 * 41.16667 = 65.1417
Now we have everything we need to predict a value of Y using the line Y = m * X + c. Suppose we want to know the sugar level of a person whose age is 30. Sugar level is Y and age is X.
Y = 0.38522 * 30 + 65.1417 ≈ 76.70
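Step 8 and the final prediction can be sketched together. The slope and the two means are the figures quoted in the article; `predict_sugar` is a hypothetical helper name introduced only for this example.

```python
# Figures quoted in the article.
m = 0.38522          # slope from step 7
mean_x = 41.16667    # mean age
mean_y = 81.0        # mean sugar level

# Step 8: intercept, chosen so the line passes through (mean_x, mean_y).
c = mean_y - m * mean_x


def predict_sugar(age):
    """Predict sugar level from age with the fitted line Y = m * X + c."""
    return m * age + c


prediction = predict_sugar(30)
print(round(c, 4), round(prediction, 2))
```

Because the line is fitted to a small sample, a prediction like this is only as reliable as the data behind it, especially for ages outside the range of the dataset.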