A simple project tutorial with R/RMarkdown, Packrat, Git, and DVC.
The pain of managing a Data Science project
Something has been bothering me for a while: reproducibility and data tracking in data science projects. I had read about some technologies but never really tried any of them until recently, when I could no longer stand the feeling of losing track of my analyses. At some point, I decided to give DVC a try after some friends, mostly Flávio Clésio, suggested it to me. In this post, I will talk about Git, DVC, R, RMarkdown, and Packrat, everything I think you may need to manage your data science project, but the focus is definitely on DVC.
Best links of the week from 29th July to 4th August
- Some interesting shiny apps at Tychobra.
- Learn git branching!
- Learn vim.
- rThreeJS R Package.
- Difference Between Covariance and Correlation at Key Differences.
- Variance vs. Covariance: What’s the Difference? at Investopedia.
- Difference Between Correlation and Regression at Key Differences.
- Difference Between Parametric and Nonparametric Test at Key Differences.
- Preferential attachment at Wikipedia.
- Voice automated shiny app (example here) at Yihui Xie’s GitHub.
- Webcam (face) automated shiny app (example here) at Yihui Xie’s GitHub.
- Xaringan (presentation on xaringan here) at Yihui Xie’s GitHub.
- Learn R fast with fasteR!
- We’re told that too much screen time hurts our kids. Where’s the evidence? at The Guardian.
- pagedown: Creating beautiful PDFs with R Markdown and CSS at rstudio::conf 2019 website.
- Why scientists also need to be good communicators (in Portuguese) at NEXO Jornal.
- Portugal creates a special visa to attract Brazilian IT professionals (in Portuguese) at Folha de São Paulo.
Best links of the week from 27th May to 2nd June
- Samsung AI Can Turn a Single Portrait Into a Realistic Talking Head at PetaPixel.
- Let’s Encrypt (free certificate authority) at MLAIT.
- Public data from the French government.
- Paris opens a data center to control its digital infrastructure.
- genderBR, an R package that predicts gender from Brazilian first names using data from the 2010 Census of the Instituto Brasileiro de Geografia e Estatística.
- Git Cherry Pick at Atlassian Git Tutorials.
- Refs and the Reflog at Atlassian Git Tutorials.
- Advanced Git log at Atlassian Git Tutorials.
- Merging vs. Rebasing at Atlassian Git Tutorials.
- Intro to Cherry Picking with Git at PreviousNext.
- What makes the data scientist the professional most sought after by HR departments? (in Portuguese) at StartSe.
- 8 indispensable skills for data scientists (in Portuguese) at CIO.