In the ice cream truck example at the beginning of this chapter, it is clear that the hotter the day is, the more ice cream is sold. In this case, the warmer weather causes more ice cream to be sold.
In general, however, you have to be very careful about making such a causal claim. A regression analysis on its own can never prove that a causal relationship exists between two variables.
One thing to keep in mind when interpreting the results of a regression analysis is the possible existence of a lurking variable.
A lurking variable, also called a confounding variable, is a (hidden) third variable that influences both the outcome and the predictor variable, thereby creating the illusion that the two are directly related to one another.
Such an illusory relationship between two variables is also called a spurious association.
For example, it is possible to predict the amount of damage done by a fire based on the number of firefighters that are sent to put it out. This does not mean, however, that the firefighters are the cause of the damage.
To illustrate this point, consider the following two statements:
- The larger the fire, the more damage is caused.
- The larger the fire, the more firefighters are sent to put it out.
Here, the size of the fire acts as a lurking variable that influences both the amount of damage done by the fire and the number of firefighters that are sent to put it out, thereby creating a spurious association between the two variables.
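The firefighter example can be made concrete with a small simulation. The sketch below is only an illustration under invented assumptions: the variable names, sample size, coefficients, and noise levels are all hypothetical and are not taken from real fire data. Fire size drives both the number of firefighters and the damage, while the firefighters have no direct effect on the damage at all. The code compares the raw correlation between firefighters and damage with the correlation that remains after the linear effect of fire size has been removed from both variables (a partial correlation).

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, xs):
    """Residuals from a simple least-squares regression of ys on xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Hypothetical data-generating process (all numbers are assumptions):
# fire size is the lurking variable that drives both other variables.
rng = random.Random(0)  # fixed seed so the run is reproducible
n = 2000
fire_size = [rng.uniform(1, 10) for _ in range(n)]
firefighters = [3 * s + rng.gauss(0, 2) for s in fire_size]
damage = [10 * s + rng.gauss(0, 5) for s in fire_size]  # no firefighter term

# Raw association between firefighters and damage.
raw_r = pearson(firefighters, damage)

# Association that remains once fire size is accounted for in both variables.
partial_r = pearson(residuals(firefighters, fire_size),
                    residuals(damage, fire_size))

print(f"raw correlation:     {raw_r:.3f}")
print(f"partial correlation: {partial_r:.3f}")
```

In this simulated setting the raw correlation is strong, whereas the partial correlation is close to zero, which is what a spurious association looks like once the lurking variable is taken into account. In practice the lurking variable is often unobserved, so this kind of adjustment is only possible when it has actually been measured.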