# Tag: statistics

## Best links of the week #76

### Best links of the week from 24th August to 30th August

#### Videos

1. The Science Behind the Butterfly Effect at Veritasium.

## Spurious Independence: is it real?

### First things first: Spurious Dependence

Depending on your background, you have probably heard of spurious dependence in one way or another. It also goes by the name of spurious association, by the famous quote “correlation does not imply causation”, and by other versions of the same idea: you cannot say that $X$ necessarily causes $Y$ (or vice versa) solely because $X$ and $Y$ are associated, that is, because they tend to occur together. Even if one of the events always happens before the other, say $X$ preceding $Y$, you still cannot say that $X$ causes $Y$. A statistical test that is very famous in economics, the Granger causality test, is built around exactly this kind of temporal precedence.

> The Granger causality test is a statistical hypothesis test for determining whether one time series is useful in forecasting another, first proposed in 1969.[1] Ordinarily, regressions reflect “mere” correlations, but Clive Granger argued that causality in economics could be tested for by measuring the ability to predict the future values of a time series using prior values of another time series. Since the question of “true causality” is deeply philosophical, and because of the post hoc ergo propter hoc fallacy of assuming that one thing preceding another can be used as a proof of causation, econometricians assert that the Granger test finds only “predictive causality”.
>
> Granger Causality at Wikipedia.
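To make “predictive causality” a bit more concrete, here is a minimal sketch of the idea behind the test, using only the Python standard library. It is not the full Granger procedure: it assumes a single lag, a simulated pair of series, and hypothetical helper names (`residualize`, `granger_f`) that do not come from any library. It compares a model that predicts a series from its own past with a model that also uses the past of the other series, and summarizes the improvement as an F statistic.

```python
import random


def mean(values):
    return sum(values) / len(values)


def residualize(target, regressor):
    """Residuals of a least-squares fit of target on regressor (with intercept)."""
    mt, mr = mean(target), mean(regressor)
    slope = (sum((r - mr) * (t - mt) for r, t in zip(regressor, target))
             / sum((r - mr) ** 2 for r in regressor))
    return [(t - mt) - slope * (r - mr) for r, t in zip(regressor, target)]


def granger_f(cause, effect):
    """F statistic for adding cause[t-1] to a one-lag model of effect[t].

    Illustrative helper (not a library function), one lag only.
    """
    target, own_lag, other_lag = effect[1:], effect[:-1], cause[:-1]
    # Restricted model: effect[t] explained by effect[t-1] only.
    res_target = residualize(target, own_lag)
    rss_restricted = sum(e * e for e in res_target)
    # Unrestricted model via Frisch-Waugh: partial the own lag out of the
    # other lag, then regress residual on residual.
    res_other = residualize(other_lag, own_lag)
    rss_unrestricted = sum(e * e for e in residualize(res_target, res_other))
    dof = len(target) - 3  # intercept + two slopes
    return (rss_restricted - rss_unrestricted) / (rss_unrestricted / dof)


# Simulated data (an assumption for illustration): past x feeds into y,
# but past y does not feed into x.
rng = random.Random(1)
x, y = [0.0], [0.0]
for _ in range(1999):
    x_prev, y_prev = x[-1], y[-1]
    x.append(0.5 * x_prev + rng.gauss(0, 1))
    y.append(0.3 * y_prev + 0.8 * x_prev + rng.gauss(0, 1))

f_x_to_y = granger_f(x, y)
f_y_to_x = granger_f(y, x)
print(f"F for 'x Granger-causes y': {f_x_to_y:.1f}")
print(f"F for 'y Granger-causes x': {f_y_to_x:.1f}")
```

A large F statistic only says that the past of one series improves the forecast of the other. It says nothing about why, which is exactly the limitation the quote points at.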

The Latin phrase post hoc ergo propter hoc translates to “after this, therefore because of this”. It is now widely accepted that Granger causality is not an adequate tool for inferring causal relationships. This is one of the reasons why, when the Granger causality test finds that past values of $X$ help predict $Y$, it is said that $X$ Granger-causes $Y$ instead of saying that $X$ causes $Y$.

Maybe it is not clear to you why an association between two variables, plus the fact that one always precedes the other, is not enough to say that one causes the other. One possible explanation is a third, lurking variable $C$, also known as a confounder, that causes both events, a phenomenon known as confounding. By ignoring $C$ (which in some contexts happens by design, through a strong assumption called unconfoundedness), you fail to realize that $X$ and $Y$ are actually conditionally independent given $C$. Since you ignored it, they seem dependent, associated. A very famous and straightforward example is the positive correlation between (a) ice cream sales and deaths by drowning or (b) ice cream sales and the homicide rate.
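The confounding story can be checked with a small simulation. The sketch below is illustrative only: it assumes that temperature plays the role of the confounder $C$ (the usual textbook explanation, not something measured here), and it uses made-up linear effects and Gaussian noise. It compares the plain correlation between $X$ and $Y$ with their partial correlation once $C$ has been regressed out of both.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation between two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, cs):
    """What is left of ys after a least-squares fit on cs (with intercept)."""
    n = len(ys)
    mc, my = sum(cs) / n, sum(ys) / n
    slope = (sum((c - mc) * (y - my) for c, y in zip(cs, ys))
             / sum((c - mc) ** 2 for c in cs))
    return [(y - my) - slope * (c - mc) for c, y in zip(cs, ys)]


rng = random.Random(0)
n = 5000

# Hypothetical confounder C: temperature (standardized units).
temperature = [rng.gauss(0, 1) for _ in range(n)]
# X and Y both depend on C, but not on each other (made-up coefficients).
ice_cream = [2.0 * c + rng.gauss(0, 1) for c in temperature]
drownings = [1.5 * c + rng.gauss(0, 1) for c in temperature]

# Ignoring C: X and Y look strongly associated.
marginal = pearson(ice_cream, drownings)
# Accounting for C: correlate what remains after removing C from both.
partial = pearson(residuals(ice_cream, temperature),
                  residuals(drownings, temperature))

print(f"correlation ignoring C:  {marginal:.2f}")
print(f"correlation adjusting C: {partial:.2f}")
```

In this linear, Gaussian setup a partial correlation near zero corresponds to conditional independence given $C$. With other kinds of dependence, a near-zero partial correlation would be a weaker statement.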