### Chapter 11: Regression Analysis: Simple Linear Regression

### Residuals and Total Squared Error

A regression line is the *best-fitting* straight line through a set of data points. Regression analysis is all about predicting values, and what makes a regression line 'best-fitting' is that it has the lowest possible amount of prediction error.

In the context of regression, the amount of prediction error is expressed in terms of *residuals*.


Residual

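A **residual** is the signed vertical distance between a data point and the regression line and is denoted by #r#. It is positive when the point lies above the line and negative when the point lies below it.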

Calculating Residuals

To calculate a residual, take a point #(X,Y)# from the data and determine the height of the regression line at #X#. This height is the *predicted value* of #Y# and is denoted by #\hat{Y}#.

Next, subtract the predicted value #\hat{Y}# from the observed value #Y# to determine the value of the residual:

\[r_i = Y_i - \hat{Y}_i\]

Calculation of Residuals

Consider the regression equation #\hat{Y}=2X# and the data points #(1,3)#, #(3,1)#, and #(4,3)#. The residuals of these three data points are calculated as follows:

- For the first point #(1,3)#:
- #\purple{\hat{Y}_1}=2\cdot 1=2#
- #\blue{Y_1} = 3#
- #\orange{r_1}= Y_1-\hat{Y}_1=3-2=1#.

- For the second point #(3,1)#:
- #\purple{\hat{Y}_2}=2\cdot 3=6#
- #\blue{Y_2}=1#
- #\orange{r_2}= Y_2-\hat{Y}_2 = 1-6 =-5#.

- For the last point #(4,3)#:
- #\purple{\hat{Y}_3} =2\cdot 4=8#
- #\blue{Y_3}=3#
- #\orange{r_3}= Y_3-\hat{Y}_3 = 3-8=-5#.
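The same calculation can be sketched in a few lines of Python. This is only an illustration of the example above: the function names `predict` and `residual` are made up here and are not part of the course material.

```python
def predict(x):
    """Height of the example regression line Y-hat = 2X at x."""
    return 2 * x


def residual(x, y):
    """Observed value minus predicted value: r = Y - Y-hat."""
    return y - predict(x)


# The three data points from the example above.
points = [(1, 3), (3, 1), (4, 3)]

residuals = [residual(x, y) for x, y in points]
print(residuals)  # [1, -5, -5]
```

A positive residual means the point lies above the line, and a negative residual means it lies below.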


One of the most commonly used measures to summarize the total amount of prediction error is the *Total Squared Error*.


Total Squared Error

The **Total Squared Error** is the sum of the squared residuals and is often abbreviated TSE.

\[\text{TSE} = \sum{r^2} = \sum{(Y-\hat{Y})^2}\]

The reason for squaring the residuals before adding them together is to prevent positive and negative residuals from canceling one another. Consequently, the total squared error can never be negative, and it equals zero only when every data point lies exactly on the regression line.
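To see the canceling effect, consider a small sketch with two hypothetical residuals, #2# and #-2#. These values are invented for illustration and do not come from the example data.

```python
# Hypothetical residuals: one point 2 units above the line, one 2 units below.
hypothetical_residuals = [2, -2]

# Adding the raw residuals hides the error: the two values cancel.
plain_sum = sum(hypothetical_residuals)

# Squaring first makes every term non-negative, so nothing cancels.
squared_sum = sum(r ** 2 for r in hypothetical_residuals)

print(plain_sum)    # 0
print(squared_sum)  # 8
```

The plain sum suggests a perfect fit even though neither point lies on the line, while the sum of squares reflects both errors.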

Calculation of Total Squared Error

Consider the regression line and residuals from the previous example. In this case, the *Total Squared Error* is:

\[\begin{array}{rcl}
\text{TSE} &=& \sum{(Y-\hat{Y})^2}\\
&=& (Y_1-\hat{Y}_1)^2 + (Y_2-\hat{Y}_2)^2 + (Y_3-\hat{Y}_3)^2\\
&=& (3-2)^2+(1-6)^2+(3-8)^2\\
&=& 1^2 + (-5)^2 + (-5)^2\\
&=& 1 + 25 + 25\\
&=& 51
\end{array}\]
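As a check on this arithmetic, the calculation can be sketched in Python. The helper name `total_squared_error` is an assumption made for this illustration, not a function from any library.

```python
def total_squared_error(points, predict):
    """Sum of squared residuals (Y - Y-hat)^2 over all data points."""
    return sum((y - predict(x)) ** 2 for x, y in points)


# Data points and regression line Y-hat = 2X from the example.
points = [(1, 3), (3, 1), (4, 3)]
tse = total_squared_error(points, lambda x: 2 * x)

print(tse)  # 51
```

Passing the line as a function makes it easy to compare the TSE of different candidate lines on the same data.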
