Uncategorized

Best links of the week #74

Reading time: 3 minutes

Best links of the week from 20th July to 17th August

Links

  1. Match your manuscript to a potential journal (Clarivate).
  2. Match your manuscript to a potential journal (Elsevier).
  3. Match your manuscript to a potential journal (Springer).
  4. Cohen’s d at Wikiversity.
  5. Why, When and How to Adjust Your P Values? at Cell Journal.
  6. GitHub CLI.

Blog posts

  1. Entenda de uma vez por todas o que são testes unitários, para que servem e como fazê-los [Understand once and for all what unit tests are, what they are for, and how to write them] at Dayvson Lima’s Medium.
  2. Unit Testing in R at Towards Data Science.
  3. Independent and Identically Distributed Data (IID) at Statistics by Jim.
  4. Ser cético não implica ser cínico! [Being a skeptic does not mean being a cynic!] at Portal Deviante.
  5. Probabilistic Graphical Models Tutorial — Part 1 by Prasoon Goyal at Cube Dev’s Medium.
  6. Probabilistic Graphical Models Tutorial — Part 2 by Prasoon Goyal at Cube Dev’s Medium.
  7. Learning GitHub Actions: Creating Beautiful PR Comments by Ivan Shcheklein.
  8. Continuous Machine Learning at The Dataist Storyteller.
  9. Using Continuous Machine Learning to Run Your ML Pipeline at Vaithy Narayanan’s Medium.
  10. Improve your workflow by managing your machine learning experiments using Sacred at Déborah Mesquita’s blog.
  11. A gentle introduction to D3: how to build a reusable bubble chart at Déborah Mesquita’s blog.
  12. The Rise of DataOps (from the ashes of Data Governance) by Ryan Gross at Towards Data Science.

Videos

  1. The Great Debate: THE STORYTELLING OF SCIENCE (Part 1/2).
  2. The Great Debate: THE STORYTELLING OF SCIENCE (Part 2/2).
  3. What Is And How To Calculate Cohen’s d? at Top Tip Bio’s YouTube channel.
  4. What are degrees of freedom? at James Gilbert’s YouTube channel.
  5. Data Analysis: Why do we test the null hypothesis? at James Gilbert’s YouTube channel.
  6. Testing For Normality – Clearly Explained at Top Tip Bio’s YouTube channel.
  7. Pearson Correlation Explained (Inc. Test Assumptions) at Top Tip Bio’s YouTube channel.
  8. The Shape of Data: Distributions: Crash Course Statistics #7 at CrashCourse’s YouTube channel.
  9. Regression: Crash Course Statistics #32 at CrashCourse’s YouTube channel.
  10. The Multiple Comparisons Problem at Sprightly Pedagogue’s YouTube channel.
  11. MLOps Tutorial #1: Intro to Continuous Integration for ML at DVCorg’s YouTube channel.
  12. MLOps Tutorial #2: When data is too big for Git at DVCorg’s YouTube channel.
  13. MLOps Tutorial #3: Track ML models with Git & GitHub Actions at DVCorg’s YouTube channel.
  14. Introduction to Bayesian Networks | Implement Bayesian Networks In Python at edureka!’s YouTube channel.
  15. Bayesian Network – Exact Inference Example (With Numbers, FULL Walk-Through) at John McVickar’s YouTube channel.

Podcast

  1. Iniciativa monitora o distanciamento social no Brasil (#999) [Initiative monitors social distancing in Brazil] at Spin de Notícias.

Causality, Data Science, R, tools, Uncategorized

Continuous Machine Learning – Part I

Reading time: 9 minutes
Image by Taras Tymoshchuck from here.

This is a three-part series on Continuous Machine Learning. You can check Part II here and Part III here.

What is it?

Continuous Machine Learning (CML) applies the ideas of Continuous Integration and Continuous Delivery (CI/CD), well-known practices in Software Engineering and DevOps, to Machine Learning and Data Science projects.

What is this post about?

I will cover a set of tools that can make your life as a Data Scientist much more interesting. We will use MIIC, a network inference algorithm, to infer the network of a famous dataset (alarm from bnlearn). We will then use (1) git to track our code, (2) DVC to track our dataset, outputs and pipeline, (3) GitHub as a git remote and (4) Google Drive as a DVC remote. I’ve written a tutorial on managing Data Science projects with DVC, so if you’re interested in it, open a tab here to check it later.

BestLinks

Best links of the week #73

Reading time: 2 minutes

Best links of the week from 6th July to 19th July

Links

  1. Left-hand & right-hand side nomenclature in regression models at Cross Validated.
  2. Scientists invite 4,000 music fans to a live concert to assess spread of coronavirus at Classic FM.
  3. Solicitando dados via lei de acesso a informação [Requesting data through the access to information law] at Escola de Dados.
  4. Doctor Penguin: Catch the Latest AI+Healthcare Research.
  5. R Weekly.

Blog posts

  1. The Difference between Linear and Nonlinear Regression Models at Statistics By Jim.
  2. Multicollinearity in Regression Analysis: Problems, Detection, and Solutions at Statistics By Jim.
  3. How To Interpret R-squared in Regression Analysis at Statistics By Jim.
  4. Check Your Residual Plots to Ensure Trustworthy Regression Results! at Statistics By Jim.
  5. Standard Error of the Regression vs. R-squared at Statistics By Jim.
  6. R-squared Is Not Valid for Nonlinear Regression at Statistics By Jim.
  7. How to Choose Between Linear and Nonlinear Regression at Statistics By Jim.
  8. Heteroscedasticity in Regression Analysis at Statistics By Jim.

BestLinks

Best links of the week #72

Reading time: 2 minutes

Best links of the week from 15th June to 5th July

Links

  1. Os 11 melhores canais de Data Science no Telegram [The 11 best Data Science channels on Telegram] at Insight.
  2. Prove Your Grit in our Competitions at bitgrit.
  3. Confounding in epidemiological studies at Health Knowledge.
  4. Cochran–Mantel–Haenszel statistics at Wikipedia.
  5. University and college students, learn for free with Coursera!
  6. Determine the most significant overlap between subsets of two or three sorted lists with DynaVenn.
  7. Scholarly Community Encyclopedia.
  8. Category mistake at Wikipedia.
  9. Black swan theory at Wikipedia.
  10. Hindsight bias at Wikipedia.
  11. Pareidolia e Apofenia [Pareidolia and Apophenia] at Wikipedia.
  12. Levenshtein distance at Wikipedia.
  13. A periodic table of visualization methods at Visual Literacy.
  14. Why It’s Hard to Evaluate State Policies in the Pandemic at Penn LDI.

Blog posts

  1. A Gentle Introduction to Concept Drift in Machine Learning at Machine Learning Mastery.
  2. What is the difference between Bagging and Boosting? at QuantDare.
  3. A PRIMER TO ENSEMBLE LEARNING – BAGGING AND BOOSTING at Analytics India Mag.
  4. O modelo de #SquadGoals do Spotify falhou [Spotify’s #SquadGoals model failed] at Flavio Clesio’s Blog.
  5. That one weird third variable problem nobody ever mentions: Conditioning on a collider at the 100 CI.
  6. Why Statistics Don’t Capture The Full Extent Of The Systemic Bias In Policing at Five Thirty Eight.
  7. Why Is the Average Human Body Temperature Decreasing? at Science and Philosophy’s Medium.
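  8. CITAÇÃO DE CITAÇÃO SEGUNDO AS REGRAS ABNT: ACABE COM SUAS DÚVIDAS! [Citing a citation under the ABNT rules: clear up your doubts!] at Blog PPEC.
  9. FAPESP cria repositório de informações clínicas para subsidiar pesquisas sobre COVID-19 [FAPESP creates a repository of clinical information to support research on COVID-19] at Agência FAPESP.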

Videos

  1. The Super Mario Effect – Tricking Your Brain into Learning More by Mark Rober at TEDx Talks’ YouTube channel.
  2. Por que a concorrência abre suas lojas perto das outras? [Why do competitors open their stores next to one another?] at TED-Ed’s YouTube channel.

Podcast

  1. Causalidade na saúde [Causality in health] at Dados e Saúde.

Uncategorized

Best links of the week #71

Reading time: 2 minutes

Best links of the week from 8th June to 14th June

Links

  1. Hipótese de Sapir-Whorf [Sapir-Whorf hypothesis] at Wikipedia.
  2. Live 2020 Max Planck Lecture by Geoffrey Hinton (June 23rd).
  3. Um livro ilustrado de maus argumentos [An illustrated book of bad arguments].
  4. Your logical fallacy is.
  5. Brazilian Symposium on Bioinformatics 2020.
  6. When 511 Epidemiologists Expect to Fly, Hug and Do 18 Other Everyday Activities Again at the New York Times.
  7. 7 Reasons Why Studying a Bachelor’s Degree Abroad Is Better than in Your Home Town at Study Portals Masters.
  8. 7 Decisive Reasons to Study Abroad in 2020 – Why You Won’t Regret It at Study Portals Bachelors.

Blog posts

  1. “Depois disso, logo, causado por isso”… Será? [“After this, therefore because of this”… Really?] at Portal Deviante.
  2. Why is Linear Algebra Taught So Badly? by Callum Ballard at Towards Data Science.
  3. Why is Data Science Losing Its Charm? by Harshit Ahuja at Towards Data Science.

Videos

  1. Dividing by zero? at Eddie Woo’s YouTube channel.
  2. Why is 0! = 1? at Eddie Woo’s YouTube channel.
  3. What is 0 to the power of 0? at Eddie Woo’s YouTube channel.

Podcast

  1. A Ciência e a COVID-19 (SciCast #380) [Science and COVID-19] at Portal Deviante.

BestLinks

Best links of the week #70

Reading time: 2 minutes

Best links of the week from 18th May to 7th June

Links

  1. Here are 450 Ivy League courses you can take online right now for free at FreeCodeCamp.
  2. Our weird behavior during the pandemic is messing with AI models at MIT Technology Review.
  3. microdatasus Python Package.
  4. Meet xaringan: Making slides in R Markdown by Alison Hill at Advanced R Markdown Workshop.
  5. How to Make Slides in R by Zhi Yang.
  6. Recall bias at Wikipedia.
  7. Confidence Interval cartoon at xkcd.

Blog posts

  1. Machine Learning is too easy at John Langford’s Blog.
  2. Naive Bayes for Dummies; A Simple Explanation at Data Science Central.
  3. Support Vector Machines for dummies; A Simple Explanation at Aylien’s Blog.
  4. Everything You Wanted to Know about the Kernel Trick (But Were Too Afraid to Ask) at Eric Kim’s Blog.
  5. Machina Machinae Lupus est? at Portal Deviante.

Videos

  1. Ten Craziest Things Cells Do by Wallace Marshall at iBiology’s YouTube channel.
  2. O mundo a partir do coronavírus, ed. 09 | Modelos computacionais e isolamento social [The world after the coronavirus, ed. 09 | Computational models and social isolation] at Academia Brasileira de Ciências.

Uncategorized

Best links of the week #69

Reading time: < 1 minute

Best links of the week from 11th May to 17th May

Links

  1. Changes in new release of R (4.0.0).

Blog posts

  1. May ’20 DVC❤️Heartbeat at DVC Blog.
  2. Isotonic Regression is THE Coolest Machine-Learning Model You Might Not Have Heard Of by Emmett Boudreau at Towards Data Science.
  3. Econometrics 101 for data scientists by Mahbubul Alam at Towards Data Science.
  4. Panel data regression: a powerful time series modeling technique by Mahbubul Alam at Towards Data Science.
  5. Detecting stationarity in time series data by Shay Palachy at KDnuggets.

Uncategorized

Best links of the week #68

Reading time: 2 minutes

Best links of the week from 4th May to 10th May

Links

  1. COVID-19 dashboard by NatalNet lab (UFRN).
  2. COVID-19 dashboard by Brain Institute (UFRN).
  3. Rt Covid-19.
  4. Monitor COVID-19.
  5. Estimativas de R(t) por Estados do Brasil [R(t) estimates by Brazilian state] at Flavio Figueiredo’s website.
  6. Many other COVID-19 dashboards.
  7. COVID-19 Projections Using Machine Learning.
  8. OBSERVATÓRIO DA CIÊNCIA.
  9. Join the DVC Ambassador Program! at DVC Blog.
  10. Resultados da pesquisa de mercado de Data Science feita pelo Data Hackers [Results of the Data Science market survey conducted by Data Hackers] at Kaggle.

BestLinks

Best links of the week #66

Reading time: 2 minutes

Best links of the week from 20th April to 26th April

Links

  1. COVID-19 na Perspectiva Geográfica e Estatística [COVID-19 from a geographic and statistical perspective] at IBGE.
  2. Postmortem documentation at Wikipedia.
  3. Pesquisa de datasets do Google [Google’s dataset search] at Dados Abertos.
  4. Preocupações sobre abertura de dados e respostas que têm se mostrado efetivas [Concerns about opening data and responses that have proven effective] at Dados Abertos.
  5. COVID-19 Analysis Repository at Christian S. Perone’s GitHub.