Tag: datascience

BestLinks

Best links of the week #79

Reading time: 2 minutes

Best links of the week from 14th September to 27th September

Links

  1. Painel de análise do excesso de mortalidade por causas naturais no Brasil em 2020 at CONASS.
  2. iDiscover.
  3. Several dashboards related to COVID19 in Brazil by IBGE.
  4. Divulgação de Candidaturas e Contas Eleitorais.
  5. The Cognitive Bias Index at WikiMedia.
  6. Loki’s Wager and The Merchant of Venice at Wikipedia.
  7. GAN School by Junior Koch.
  8. Fato ou Fake COVID-19.
Data Science, PhD, R

Manage your Data Science Project in R

Reading time: 9 minutes

A simple project tutorial with R/RMarkdown, Packrat, Git, and DVC.

Source: Here.

The pain of managing a Data Science project

Something has been bothering me for a while: reproducibility and data tracking in data science projects. I had read about some of the available tools but never really tried any of them until recently, when I could no longer stand the feeling of losing track of my analyses. I decided to give DVC a try after some friends, especially Flávio Clésio, suggested it to me. In this post, I will talk about Git, DVC, R, RMarkdown, and Packrat, which together cover everything I think you need to manage your Data Science project, but the focus is definitely on DVC.

BestLinks

Best links of the week #49

Reading time: 2 minutes

Best links of the week from 23rd December to 29th December

Links

  1. How Margaret Dayhoff Brought Modern Computing to Biology at Smithsonian Magazine.
  2. Humans have a natural lifespan of only 38 years at Daily Mail (paper here).
  3. When Surgeons Listen To Their Preferred Music, Their Stitches Are Better and Faster at FAC Medicine.
  4. Artificial intelligence predictions for 2020: 16 experts have their say at Verdict.
  5. Unpacking the Black Box in Artificial Intelligence for Medicine at Undark.
  6. Vanishing gradient problem at Wikipedia.
  7. O que acontece com as crianças-prodígio quando elas crescem? at BBC News.
BestLinks

Best links of the week #47

Reading time: 3 minutes

Best links of the week from 25th November to 1st December

Links

  1. Researchers Have Successfully Tricked A.I. Into Seeing The Wrong Things at PopSci.
  2. Fooling the machine at PopSci.
  3. Why isn’t confounding a statistical concept? at Judea Pearl’s discussion with readers.
  4. The impossibility of asymmetric causation at Judea Pearl’s discussion with readers.
  5. d-SEPARATION WITHOUT TEARS at Judea Pearl’s discussion with readers. An interactive adaptation of it is available on dagitty’s website here.
  6. An Illustration of Pearl’s Simpson Machine at dagitty.
  7. Do you think you know DAG terminology? This game lets you test your skills. There is another game here for testing your knowledge of covariate roles, and another one about the Table 2 Fallacy. All of these are at dagitty.
  8. On causality and decision trees at Judea Pearl’s discussion with readers.
  9. On causality and decision trees (cont.) at Judea Pearl’s discussion with readers.
  10. Back-door criterion and epidemiology at Judea Pearl’s discussion with readers.
  11. Indirect Effects at Judea Pearl’s discussion with readers.
  12. The meaning of counterfactuals at Judea Pearl’s discussion with readers.
  13. Has causality been defined? at Judea Pearl’s discussion with readers.
  14. The tidyverse for Machine Learning presentation by Bruna Wundervald at satRday São Paulo.
  15. Centrality measures as a proxy for causal influence? at Fabian Dablander’s website.
  16. Garoto de 12 anos já trabalha como cientista de dados at Olhar Digital.
  17. CGU lança novo Painel Correição em Dados at CGU.

Blog/posts

  1. Causality in Machine Learning 101 for Dummies like Me by Sangeet Moy Das at Towards Data Science.
  2. An introduction to Causal inference at Fabian Dablander’s Blog.
  3. Spurious correlations and random walks at Fabian Dablander’s Blog.
  4. Curve fitting and the Gaussian distribution at Fabian Dablander’s Blog.
  5. In Review: Ten Great Ideas About Chance at Fabian Dablander’s Blog.
  6. Using causal graphs to understand missingness and how to deal with it at Cookie Scientist.

Videos

  1. A network of science: 150 years of Nature papers at nature video’s YouTube channel.
  2. ViennaR Meetup March 2019 | Hadley Wickham Tidy Data at Quantargo’s YouTube channel.
  3. Causal Graphs by Julian Schüssler at MZES Methods Bites’s YouTube channel.

Positions available

  1. Lecturer/Senior Lecturer/Reader in Media & Data Science at the University of Glasgow.
  2. Ph.D. fellowship in Machine Learning for Robot Manipulation at Bosch.
  3. Fully Funded Ph.D. position in AI and Machine Learning for mental well being at Örebro University.
  4. Research Assistant in Computer Vision and Deep Learning at Edge Hill University.
  5. Tenure Track ML Teaching Professor Position at UCSD.
  6. Post-doctoral fellowship (Genomics) at Instituto Tecnológico Vale.
  7. Data Science Vice President at Big Cloud.
  8. Director of Data Science at Ideal Team Consulting.
  9. Gerente de Governança e Arquitetura de Dados at Wiz.
  10. Senior Business Intelligence Analyst at SumUp.
  11. Data Architect – Restaurant Product at iFood.
  12. Lead Data Engineer at QuintoAndar.
  13. Software Engineer at Google.
  14. Senior SQL Server/ETL Developer at Cognizant.
  15. Data Architect D2- Lunch DFN at iFood.
    The remaining opportunities (30+) are reserved for readers subscribed to the newsletter. Subscribers also receive updates on new blog posts!
BestLinks

Best links of the week #16

Reading time: 2 minutes

Best links of the week from 22nd April to 28th April

You can check this comic here

Links

  1. Do more with R: drag-and-drop ggplot at InfoWorld.
  2. Apart from esquisse, the package mentioned in the link above, there is another one that allows you to drag-and-drop and plot your data: ggplotAssist.
  3. DreamRs is a French R consulting firm. On their website, they have made some shiny apps built on real data publicly available, such as RATP traffic and a GitHub dashboard.
  4. VCs just invested $8 million into this startup that gave away its software for free because they noticed how much people loved it!
  5. Cheat Sheets for several software tools and concepts related to Data Science at Asif Bhat’s GitHub.
  6. Data Science must-read articles, tutorials and useful links at Asif Bhat’s GitHub.
  7. Math required for Data Science at Asif Bhat’s GitHub.
  8. Quick overview of Statistics for Biologists (it’s useful for pretty much everybody, you don’t say no to an offer of knowledge :-).
  9. How can I show the intermediate steps of a long routine in R? at StackOverflow.
  10. ‘Friendly’ reviewers rate grant applications more highly at Nature.
  11. Calm down, everyone. Keeping dead pig cells alive is not ‘brain resuscitation’ at Los Angeles Times.
  12. Uber is publicly sharing some data!
  13. Need help choosing the right visualization method? From Data to Viz can help you!
  14. IBM releases Diversity in Faces, a dataset with over 1 million annotated images to help fight bias at Turing Tribe.
  15. Até 2030, AI contribuirá em mais de US$ 15,7 trilhões para economia global at Computer World.
  16. A extraordinária cientista que estudou o cérebro de Einstein e revolucionou a neurociência moderna at Época Negócios.
  17. TerraBrasilis, an open-access platform of public geographical data for environmental monitoring.
BestLinks, R

Best links of the week #15

Reading time: 2 minutes

Best links of the week from 15th April to 21st April

Links

  1. When clustering, choosing an appropriate k (the number of clusters) can be hard, depending on the algorithm used. Some algorithms do not require it, but for the ones that do, such as k-means, have a look at the elbow method or at the silhouette of the objects with respect to their clusters to evaluate candidate values of k.
  2. Dunder Data is a professional training company dedicated to teaching data science and machine learning. There is paid and free online material.
  3. Software Carpentry, teaching basic lab skills for research computing.
  4. ROpenSci, transforming science through open data and software.
  5. mlmaisleve, quick and light concepts about Machine Learning.
  6. kite, Code Faster in Python with Line-of-Code Completions.
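
To make the elbow and silhouette ideas from item 1 concrete, here is a minimal sketch in plain Python. It is not taken from the linked material: the toy dataset, the helper names (sq_dist, init_centers, kmeans, silhouette) and the deterministic farthest-point seeding (used instead of the usual random initialization so the run is repeatable) are all assumptions made for illustration. For each candidate k it records the within-cluster sum of squares (the quantity plotted in the elbow method) and the mean silhouette, then picks the k with the highest silhouette.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def init_centers(points, k):
    """Deterministic farthest-point seeding (an assumption made for repeatability)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda c: sq_dist(p, centers[c])) for p in points]

def kmeans(points, k, iters=100):
    """Plain Lloyd's algorithm; returns (labels, within-cluster sum of squares)."""
    centers = init_centers(points, k)
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center in place
        if new_centers == centers:
            break
        centers = new_centers
    labels = assign(points, centers)
    inertia = sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))
    return labels, inertia

def silhouette(points, labels):
    """Mean silhouette over all points; singleton clusters score 0 by convention."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Toy data (hypothetical): three tight, well-separated groups of five points each.
offsets = [(-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5), (0.0, 0.0)]
blob_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + dx, cy + dy) for cx, cy in blob_centers for dx, dy in offsets]

inertias, silhouettes = {}, {}
for k in range(1, 6):
    labels, inertia = kmeans(points, k)
    inertias[k] = inertia
    if k > 1:  # the silhouette needs at least two clusters
        silhouettes[k] = silhouette(points, labels)

best_k = max(silhouettes, key=silhouettes.get)
for k in sorted(inertias):
    print(k, round(inertias[k], 3), round(silhouettes.get(k, float("nan")), 3))
print("k with the highest mean silhouette:", best_k)
```

On data like this, the within-cluster sum of squares drops sharply up to the true number of groups and only slowly afterwards (the "elbow"), while the mean silhouette peaks at that same k. On real data both signals are usually noisier, so treat them as guidance rather than a verdict.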
BestLinks

Best links of the week #14

Reading time: 2 minutes

Best links of the week from 8th April to 14th April

Source: Business Broadway.

Links

  1. Many more images like the one above at Business Broadway.
  2. Websites with challenges and exercises at Gabriel Fonseca’s GitHub page.
  3. Support innovation in healthcare with Hacking Health! There are several chapters around the world, including several in Brazil and in France :-).
  4. What are some examples of “Correlation does not equal causation?” at Quora.
  5. Does no correlation imply no causality? at Cross Validated.
  6. PEARL VS RUBIN (GELMAN) at Dokyun Lee’s website.
  7. Virgilio, your new Mentor for Data Science E-Learning at Giacomo Ciarlini.
  8. A quick reference for data visualization.
  9. Dev Tube.
  10. Por que preciso de “Análise de Componentes Principais” ou PCA na mineração de dados? at Quora.
  11. Harvard lança 15 cursos gratuitos de Inteligência Artificial at Estagio Online.
  12. Os testes de Harvard selecionam seus genes at Deviante.
  13. A realidade biopsicossocial da violência at Deviante.
BestLinks

Best links of the week #13

Reading time: 2 minutes

Best links of the week from 1st April to 7th April


Links

  1. Feature Engineering presentation by HJ van Veen (Nubank Brasil).
  2. Winning Data Science Competitions presentation by Owen Zhang (Data Robot).
  3. Tips and tricks to win kaggle data science competitions by raddar.
  4. 2019 Best Data Science Bootcamps.
  5. Free open public domain football data (football.db) for Brazil here and here.
  6. A weekly email of useful links for people interested in building data platforms.
  7. Top GAN Research Papers Every Machine Learning Enthusiast Must Peruse at Analytics India Magazine.
BestLinks

Best links of the week #8

Reading time: 2 minutes

Best links of the week from 25th February to 3rd March.

Links

  1. I don’t like notebooks at Jupyter Conference 2018 by Joel Grus.
  2. Twitter thread on Regression to the Mean Bias in a published paper at Andrew Althouse’s Twitter feed.
  3. The hipster effect: Why anti-conformists always end up looking the same at MIT Technology Review.
  4. An archive of datasets distributed with R.
  5. Beautiful, customizable, publication-ready model summaries in R (R Package) at Vincent Arel-Bundock’s GitHub account.
  6. Advanced R, a book by Hadley Wickham.
  7. R for Data Science, a book by Hadley Wickham.
  8. R packages, a book by Hadley Wickham.
  9. Data Visualization: A practical introduction, a book by Kieran Healy.
  10. Diversos cursos gratuitos na Data Science Academy (Com certificado) at Pelando.
  11. Estudar no Exterior: o caminho das pedras com Anna Giselle Ribeiro at Deviante.
  12. Dados de pesquisas eleitorais no Brasil at Poder360.
  13. Novo portal do IBGE compara estatísticas econômicas e sociais de 193 nações at Agência de Notícias IBGE.
  14. Estatísticas do Comércio Exterior (data visualization and raw data) at Ministério da Economia, Indústria, Comércio Exterior e Serviços.
Uncategorized

Best links of the week #5

Reading time: < 1 minute

Best links of the week from 4th February to 10th February.

Links

  1. Como controlar o braço de outra pessoa com o poder da sua mente? at UOL.
  2. vidente is an R package I am currently writing to parse and analyze data from the Surveillance, Epidemiology and End Results (SEER) Program, which collects cancer incidence and survival data covering over one third of the US population.
  3. Ciência de Dados com R is a book on Data Science using R at Instituto Brasileiro de Pesquisa e Análise de Dados.
  4. Data Science & Machine Learning Course at Ivanovitch Silva’s GitHub repository.
  5. A receita dos candidatos a deputado federal em 2018 at Nexo Jornal.
  6. AI 100: The Artificial Intelligence Startups Redefining Industries at CB Insights.
  7. The open-source and crowd-sourced conference website.
  8. Ranking of IT conferences.