# Correlation and causality

© Dec 2006 Paul Cooijmans

(It is assumed in this article that the reader already knows what a correlation is.)

It is often said, typically in a warning manner, that a correlation does not imply a causal relation. This well-meant advice, however, is wrong.

Any correlation implies a causal relation, unless it arose by chance; the smaller its p value, the less plausible that chance explanation is.

For clarity: A correlation as meant here is not zero, as the value of zero denotes the absence of correlation. The significance of a correlation, typically reported as a p value, is the probability of obtaining a correlation at least that strong purely by chance if the true correlation were zero.

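In other words, the smaller the p value, the harder it is to attribute the correlation to chance alone, and the stronger the case for a causal relation. Strictly speaking, the p value is calculated under the assumption that the true correlation is zero, so it is not itself the probability that a causal relation is absent, and [1 - significance] is not the probability that one is present.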
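As an illustration of the p value described above, it can be estimated with a permutation test: shuffling one variable destroys any real link, so the share of shuffles that still yield a correlation at least as strong as the observed one approximates the p value. The following is only a minimal sketch; the function names, sample size, and simulated data are assumptions made for the example, not part of the original article.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, n_shuffles=2000, seed=0):
    """Estimate a two-sided p value by shuffling ys to mimic 'true correlation is zero'."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # The +1 terms count the observed arrangement itself, so the estimate is never 0.
    return (hits + 1) / (n_shuffles + 1)


# Simulated data (an assumption for illustration): B depends on A in one case only.
rng = random.Random(1)
a = [rng.gauss(0, 1) for _ in range(50)]
b_related = [x + rng.gauss(0, 1) for x in a]
b_unrelated = [rng.gauss(0, 1) for _ in range(50)]

p_related = permutation_p_value(a, b_related)
p_unrelated = permutation_p_value(a, b_unrelated)
print(f"p value when B depends on A: {p_related:.4f}")
print(f"p value when B is unrelated: {p_unrelated:.4f}")
```

With data like these, the related pair should give a very small p value, while the unrelated pair can land anywhere between 0 and 1.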

Then on to the nature and direction of the causality. If a correlation exists between A and B and is not due to chance, at least one of the following explanations applies:

• A causes B;
• B causes A;
• There is at least one third variable that causes both A and B, linking them.

It is the last explanation, the common cause, that is often overlooked, and that may be what some people are really trying to say with "correlation does not imply causality". But a common cause is a cause nevertheless, so there is causality after all. There can be no correlation without causality, except by chance, as reflected in the correlation's significance.
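The common-cause case can be made concrete with a small simulation. In the sketch below, a third variable C drives both A and B, while A and B have no direct influence on each other; the variable names, effect sizes, and sample size are assumptions chosen for illustration. The partial correlation, computed with the standard first-order formula, shows what remains of the link between A and B once C is taken into account.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(42)
n = 5000

# C is the common cause; A and B each equal C plus their own independent noise.
c = [rng.gauss(0, 1) for _ in range(n)]
a = [ci + rng.gauss(0, 1) for ci in c]
b = [ci + rng.gauss(0, 1) for ci in c]

r_ab = pearson_r(a, b)
r_ac = pearson_r(a, c)
r_bc = pearson_r(b, c)

# First-order partial correlation of A and B, controlling for C.
r_ab_given_c = (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac**2) * (1 - r_bc**2))

print(f"correlation of A and B:            {r_ab:.3f}")
print(f"same, after controlling for C:     {r_ab_given_c:.3f}")
```

Under these assumptions the raw correlation between A and B should come out around 0.5, and the partial correlation close to zero: the correlation is real and has a cause, but that cause is C rather than A or B.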

Exactly which variable is a cause, and through which mechanism it works, can be found out and proven by studying all intercorrelations between a broader set of variables, by logical thinking, or by other non-statistical methods like experiment.
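Why an experiment can settle the direction of causality may be sketched as follows. The experimenter sets A by coin flip, so nothing else can influence A, and then checks whether B responds. The two "worlds" below, along with the function name and all numbers, are hypothetical and exist only for this illustration.

```python
import random


def mean(values):
    return sum(values) / len(values)


def run_experiment(world, n=4000, seed=7):
    """Set A at random, observe B, and return mean B at high A minus mean B at low A."""
    rng = random.Random(seed)
    b_when_a_low = []
    b_when_a_high = []
    for _ in range(n):
        c = rng.gauss(0, 1)          # third variable, not controlled by the experimenter
        a = rng.choice([0.0, 2.0])   # A is set by coin flip, independent of C
        if world == "a_causes_b":
            b = a + rng.gauss(0, 1)  # B responds to A
        elif world == "common_cause":
            b = c + rng.gauss(0, 1)  # B responds only to C
        else:
            raise ValueError(f"unknown world: {world}")
        if a > 0:
            b_when_a_high.append(b)
        else:
            b_when_a_low.append(b)
    return mean(b_when_a_high) - mean(b_when_a_low)


effect_if_a_causes_b = run_experiment("a_causes_b")
effect_if_common_cause = run_experiment("common_cause")
print(f"change in B when A is raised, if A causes B:       {effect_if_a_causes_b:.2f}")
print(f"change in B when A is raised, if C causes A and B: {effect_if_common_cause:.2f}")
```

If A causes B, raising A shifts B by about the amount A was raised; if the observed correlation came only from a common cause, manipulating A leaves B essentially unchanged.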