Spurious Independence: is it real?

Reading Time: 14 minutes

First things first: Spurious Dependence

Depending on your background, you have probably already heard of spurious dependence in one way or another. It also goes by the name of spurious association, by the famous quote “correlation does not imply causation”, and by other versions of the same idea: you cannot say that X necessarily causes Y (or vice versa) solely because X and Y are associated, that is, because they tend to occur together. Even if one of the events always happens before the other, say X preceding Y, you still cannot say that X causes Y. There is a statistical test, very famous in economics, known as Granger causality.

The Granger causality test is a statistical hypothesis test for determining whether one time series is useful in forecasting another, first proposed in 1969.[1] Ordinarily, regressions reflect “mere” correlations, but Clive Granger argued that causality in economics could be tested for by measuring the ability to predict the future values of a time series using prior values of another time series. Since the question of “true causality” is deeply philosophical, and because of the post hoc ergo propter hoc fallacy of assuming that one thing preceding another can be used as a proof of causation, econometricians assert that the Granger test finds only “predictive causality”.

Granger Causality at Wikipedia.
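To make “predictive causality” concrete, here is a minimal sketch of the idea behind the test, assuming a single lag and synthetic data. The helper names (`lagged`, `rss`, `granger_f_stat`) are hypothetical and written only for this illustration. The sketch compares a regression of Y on its own past with one that also includes the past of X, and it reports only the F statistic, not a p-value.

```python
import numpy as np

def lagged(series, lags):
    """Columns series[t-1], ..., series[t-lags] for t = lags, ..., n-1."""
    n = len(series)
    return np.column_stack([series[lags - k : n - k] for k in range(1, lags + 1)])

def rss(design, target):
    """Residual sum of squares of an ordinary least squares fit."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    return float(resid @ resid)

def granger_f_stat(x, y, lags=1):
    """F statistic for 'past x helps predict y beyond past y' (hypothetical helper)."""
    target = y[lags:]
    ones = np.ones((len(target), 1))
    restricted = np.hstack([ones, lagged(y, lags)])            # past of y only
    unrestricted = np.hstack([restricted, lagged(x, lags)])    # plus past of x
    rss_r = rss(restricted, target)
    rss_u = rss(unrestricted, target)
    dof = len(target) - unrestricted.shape[1]
    return ((rss_r - rss_u) / lags) / (rss_u / dof)

# Synthetic data (an assumption of this sketch): y follows x with a delay of one step.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
y = np.zeros(n)
y[1:] = 0.8 * x[:-1] + 0.5 * rng.normal(size=n - 1)

f_x_to_y = granger_f_stat(x, y)
f_y_to_x = granger_f_stat(y, x)
print(f"F statistic, x -> y: {f_x_to_y:.1f}")
print(f"F statistic, y -> x: {f_y_to_x:.1f}")
```

A large statistic in one direction only says that past values of X improve the forecast of Y. It says nothing about why, which is exactly the limitation the quote points out.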

The post hoc ergo propter hoc fallacy is also known as “after this, therefore because of this”. It’s pretty clear today that Granger causality is not an adequate tool to infer causal relationships, and this is one of the reasons why, when X and Y are tested with the Granger causality test and an association is found, it is said that X Granger-causes Y instead of saying that X causes Y.

Maybe it’s not clear to you why the association between two variables, together with the fact that one always precedes the other, is not enough to say that one causes the other. One explanation, in a hypothetical situation, would be a third lurking variable C, also known as a confounder, that causes both events, a phenomenon known as confounding. By ignoring the existence of C (assuming that no such variable exists is a strong assumption called unconfoundedness, which in some contexts holds by design), you fail to realize that the events X and Y are actually independent once this third variable C, the confounder, is taken into consideration. Since you ignored it, they seem dependent, associated. A very famous and straightforward example is the positive correlation between (a) ice cream sales and deaths by drowning or (b) ice cream sales and the homicide rate.
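The confounding story can be checked with a small simulation. The sketch below rests on assumptions that are not in the text: the variables are Gaussian, all effects are linear, and the coefficients are arbitrary. Under those assumptions, a partial correlation close to zero is a reasonable stand-in for “independent given C”. The function names are hypothetical helpers for this illustration.

```python
import numpy as np

def simulate_confounding(n=20_000, seed=0):
    """C causes both X and Y; X and Y have no effect on each other."""
    rng = np.random.default_rng(seed)
    c = rng.normal(size=n)               # the confounder
    x = 2.0 * c + rng.normal(size=n)     # depends on C only
    y = 1.5 * c + rng.normal(size=n)     # depends on C only
    return c, x, y

def residualize(v, c):
    """Remove the linear effect of c from v with a least squares fit."""
    design = np.column_stack([np.ones_like(c), c])
    coef, *_ = np.linalg.lstsq(design, v, rcond=None)
    return v - design @ coef

def marginal_and_partial_corr(c, x, y):
    """Correlation of X and Y, ignoring C and after adjusting for C."""
    marginal = np.corrcoef(x, y)[0, 1]
    partial = np.corrcoef(residualize(x, c), residualize(y, c))[0, 1]
    return marginal, partial

c, x, y = simulate_confounding()
marginal, partial = marginal_and_partial_corr(c, x, y)
print(f"corr(X, Y) ignoring C:      {marginal:.3f}")
print(f"corr(X, Y) adjusting for C: {partial:.3f}")
```

Ignoring C, the two variables look strongly associated. Once the linear effect of C is removed from both, the association essentially vanishes, which is the pattern described above.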


Best links of the week #53

Best links of the week #52

Best links of the week #51

Best links of the week #50

Best links of the week #49

Best links of the week #48

Reading Time: 4 minutes

Best links of the week from 2nd December to 8th December


Links

  1. What can you expect at NeurIPS 2019? at PacktPub.
  2. NeurIPS 2019 Livestream.
  3. Is it worth attending NeurIPS if you’re not an academic?
  4. Colliders in Epidemiology: an educational interactive web application.
  5. Technical training in Bioinformatics with a FAPESP scholarship at Agência FAPESP.
  6. Properties of the OLS estimator at StatLect.

Blog/posts

  1. NIPS 2017: 1st day at Dmytro Mishkin’s Medium.
  2. NIPS 2017, Day 4 (orals + symposium) at Dmytro Mishkin’s Medium.
  3. Nine things I wish I had known the first time I came to NIPS at Jennifer Wortman Vaughan’s Medium (Some extra comments on this post here).
  4. Preparing for NeurIPS at Keren Gu’s Medium.
  5. How to NeurIPS at Jade Abbott’s Medium.
  6. How to analyze a research paper at Jade Abbott’s Medium.
  7. NeurIPS 2018 Through the Eyes of First-Timers at Synced Review.
  8. NeurIPS: A Beginner’s Guide at Max Marion’s website.
  9. Linear Regression at Natan Anael’s Medium.
  10. Beware the Propensity Score: It’s a Collider at Fernando Martel García’s GitHub.
  11. Remarks on Chen and Pearl on causality in econometrics textbooks at Chris Auld’s Blog.

Videos

  1. Origin of Markov chains at Khan Academy Labs’s YouTube channel.
  2. OpenAI Plays Hide and Seek…and Breaks The Game! at Two Minute Papers’s YouTube channel.
  3. Artificial Intelligence Debate – Yann LeCun vs. Gary Marcus – Does AI Need More Innate Machinery? at The Artificial Intelligence Channel on YouTube. I wrote a Twitter thread here commenting on the debate.

Podcasts

  1. Quantitative Bias Analysis with Matt Fox at Casual Inference Podcast.
  2. Would you trust a “Robot Judge”? at Spin de Notícias.

Positions available

  1. Research Scientist – Machine Learning for Autonomous Driving Behavior at Bosch.
  2. Research Engineer – Machine Learning for Autonomous Driving Behavior at Bosch.
  3. Research Intern in Machine Learning at Bosch.
  4. 8 Ph.D. fellowships at Institut Curie.
  5. Assistant or Associate Professor in Machine Learning at Université de Montreal.
  6. Ph.D. and Postdoctoral opportunities in Recurrent Neural Networks and Related Machines That Learn Algorithms at Swiss AI Lab IDSIA.
  7. Research position in Natural Language Processing / Machine Learning at INESCTEC.
  8. Research position in Machine Learning at INESCTEC.
  9. Four Postdoctoral Research Assistant positions in AI for Healthcare at the University of Oxford.
  10. Two Senior Research Associates in Biomedical Engineering at the University of Oxford.
  11. Two Research Scientists in AI for Healthcare at the University of Oxford.
  12. 50+ opportunities in Engineering, Information Technology and Research at ACT.
  13. Research Fellow/Senior Research Fellow in Machine Learning For Autonomous Robot at UCL.
  14. Research Fellow/Senior Research Fellow in Machine Learning for Climate Science at UCL.
  15. Postdoctoral fellowship in fairness aware artificial intelligence, federated learning, dynamic sequence data modeling, and recommendation at the University of Arkansas.
    The next opportunities (30+) are reserved for readers registered in the newsletter. By having registered, you will receive updates on the posts in the blog!

Best links of the week #47

Reading Time: 3 minutes

Best links of the week from 25th November to 1st December


Links

  1. Researchers Have Successfully Tricked A.I. Into Seeing The Wrong Things at PopSci.
  2. Fooling the machine at PopSci.
  3. Why isn’t confounding a statistical concept? at Judea Pearl’s discussion with readers.
  4. The impossibility of asymmetric causation at Judea Pearl’s discussion with readers.
  5. d-SEPARATION WITHOUT TEARS at Judea Pearl’s discussion with readers. There is an interactive adaptation from this at dagitty’s website here.
  6. An Illustration of Pearl’s Simpson Machine at dagitty.
  7. Do you think you know DAG terminology? This game can help you test your skills. There is also another game here for testing your knowledge of covariate roles and another one about the Table 2 Fallacy. All of these are at dagitty.
  8. On causality and decision trees at Judea Pearl’s discussion with readers.
  9. On causality and decision trees (cont.) at Judea Pearl’s discussion with readers.
  10. Back-door criterion and epidemiology at Judea Pearl’s discussion with readers.
  11. Indirect Effects at Judea Pearl’s discussion with readers.
  12. The meaning of counterfactuals at Judea Pearl’s discussion with readers.
  13. Has causality been defined? at Judea Pearl’s discussion with readers.
  14. The tidyverse for Machine Learning presentation by Bruna Wundervald at satRday São Paulo.
  15. Centrality measures as a proxy for causal influence? at Fabian Dablander’s website.
  16. 12-year-old boy already works as a data scientist at Olhar Digital.
  17. CGU launches the new Painel Correição em Dados at CGU.

Blog/posts

  1. Causality in Machine Learning 101 for Dummies like Me by Sangeet Moy Das at Towards Data Science.
  2. An introduction to Causal inference at Fabian Dablander’s Blog.
  3. Spurious correlations and random walks at Fabian Dablander’s Blog.
  4. Curve fitting and the Gaussian distribution at Fabian Dablander’s Blog.
  5. In Review: Ten Great Ideas About Chance at Fabian Dablander’s Blog.
  6. Using causal graphs to understand missingness and how to deal with it at Cookie Scientist.

Videos

  1. A network of science: 150 years of Nature papers at nature video’s YouTube channel.
  2. ViennaR Meetup March 2019 | Hadley Wickham Tidy Data at Quantargo’s YouTube channel.
  3. Causal Graphs by Julian Schüssler at MZES Methods Bites’s YouTube channel.

Positions available

  1. Lecturer/Senior Lecturer/Reader in Media & Data Science at the University of Glasgow.
  2. Ph.D. fellowship in Machine Learning for Robot Manipulation at Bosch.
  3. Fully Funded Ph.D. position in AI and Machine Learning for mental well being at Örebro University.
  4. Research Assistant in Computer Vision and Deep Learning at Edge Hill University.
  5. Tenure Track ML Teaching Professor Position at UCSD.
  6. Post-doctoral fellowship (Genomics) at Instituto Tecnológico Vale.
  7. Data Science Vice President at Big Cloud.
  8. Director of Data Science at Ideal Team Consulting.
  9. Data Governance and Architecture Manager at Wiz.
  10. Senior Business Intelligence Analyst at SumUp.
  11. Data Architect – Restaurant Product at iFood.
  12. Lead Data Engineer at QuintoAndar.
  13. Software Engineer at Google.
  14. Senior SQL Server/ETL Developer at Cognizant.
  15. Data Architect D2- Lunch DFN at iFood.

Best links of the week #46