Correlation and causality

© 2006-2020 Paul Cooijmans

(It is assumed in this article that the reader already knows what a correlation is.)

It is often said, typically as a warning, that a correlation does not imply a causal relation. This well-meant advice, however, is wrong.

Any correlation implies a causal relation, with a probability that decreases as its p value increases.

For clarity: a correlation as meant here is nonzero, since a value of zero denotes the absence of correlation. The significance of a correlation, typically reported as a p value, is the probability of a correlation at least that strong resulting from chance (coincidence) if the true correlation were zero.

In other words, the p value is the probability that the correlation does not imply a causal relation, and [1 - p value] is the probability that it does.
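As an illustration of how the significance of a correlation can be estimated, the following is a minimal sketch of a permutation test, using only the Python standard library. It is not taken from this article; the function names `pearson_r` and `permutation_p_value`, the sample size, and the simulated data are all assumptions made for the example. Shuffling one variable breaks any real link between the two, so the shuffled correlations show what chance alone produces when the true correlation is zero.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, n_shuffles=2000, seed=0):
    """Estimate the two-sided p value of the observed correlation.

    Counts how often a shuffled copy of ys correlates with xs at least
    as strongly (in absolute value) as the original ys does.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_shuffles + 1)


# Hypothetical data: b depends on a plus independent noise.
rng = random.Random(1)
a = [rng.gauss(0, 1) for _ in range(50)]
b = [x + rng.gauss(0, 1) for x in a]

r = pearson_r(a, b)
p = permutation_p_value(a, b)
print(f"r = {r:.2f}, p = {p:.4f}")
```

With data like these, where one variable is built from the other, the estimated p value should come out very small; with two unrelated variables it would typically be large.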

Next, the nature and direction of the causality: if a correlation exists between A and B, at least one of the following explanations applies:

A causes B;
B causes A;
A and B have a common cause.

It is the last explanation, the common cause, that is often overlooked, and that may be what some people are really trying to say with "correlation does not imply causality". But a common cause is a cause nevertheless, so there is causality after all. There can be no correlation without causality, except by chance (coincidence), as reflected in the correlation's significance.
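The common-cause case can be made concrete with a small simulation. This is a sketch under stated assumptions, not something from the article: the variables `a`, `b`, and `c`, the noise levels, and the sample size are hypothetical. Here `c` influences both `a` and `b`, while `a` and `b` never influence each other, yet the two still correlate.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(42)
n = 2000

# Hypothetical common cause C.
c = [rng.gauss(0, 1) for _ in range(n)]
# A and B each depend on C plus their own independent noise;
# neither is computed from the other.
a = [ci + rng.gauss(0, 1) for ci in c]
b = [ci + rng.gauss(0, 1) for ci in c]

r_ab = pearson_r(a, b)
print(f"r(A, B) = {r_ab:.2f}")
```

Under these assumptions the theoretical correlation between A and B is 0.5, produced entirely by their shared dependence on C.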

Exactly which variable is a cause, and through which mechanism it works, can be found out and proven by studying all intercorrelations among a broader set of variables, by logical thinking, or by other non-statistical methods such as experiments.
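One standard way of studying the intercorrelations among three variables is the first-order partial correlation, which estimates the correlation between A and B after the linear effect of a third variable C is removed. The sketch below is an assumption-laden illustration and not necessarily the method the author has in mind; the helper `partial_r` and the simulated data are hypothetical. A partial correlation near zero is consistent with C being a common cause of A and B, though also with C lying on a causal chain between them, so it narrows down the explanations rather than settling them by itself.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


rng = random.Random(7)
n = 5000

# Hypothetical data: C drives both A and B; A and B do not affect each other.
c = [rng.gauss(0, 1) for _ in range(n)]
a = [ci + rng.gauss(0, 1) for ci in c]
b = [ci + rng.gauss(0, 1) for ci in c]

r_ab = pearson_r(a, b)
r_ac = pearson_r(a, c)
r_bc = pearson_r(b, c)
r_ab_given_c = partial_r(r_ab, r_ac, r_bc)

print(f"r(A, B)     = {r_ab:.2f}")
print(f"r(A, B | C) = {r_ab_given_c:.2f}")
```

In this simulated setup the raw correlation between A and B is clearly nonzero, while the partial correlation controlling for C should be close to zero.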