{"id":3896,"date":"2020-08-10T21:42:29","date_gmt":"2020-08-10T21:42:29","guid":{"rendered":"https:\/\/mribeirodantas.xyz\/blog\/?p=3896"},"modified":"2025-01-31T04:37:50","modified_gmt":"2025-01-31T02:37:50","slug":"continuous-machine-learning","status":"publish","type":"post","link":"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/","title":{"rendered":"Continuous Machine Learning &#8211; Part I"},"content":{"rendered":"<span class=\"span-reading-time rt-reading-time\" style=\"display: block;\"><span class=\"rt-label rt-prefix\"><b>Reading time: <\/span> <span class=\"rt-time\"> 9<\/span> <span class=\"rt-label rt-postfix\">minutes<\/b><\/span><\/span>\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" width=\"640\" height=\"523\" data-attachment-id=\"3972\" data-permalink=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/mlops\/\" data-orig-file=\"https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/MLOps.png?fit=1470%2C1202&amp;ssl=1\" data-orig-size=\"1470,1202\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"MLOps\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/MLOps.png?fit=300%2C245&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/MLOps.png?fit=640%2C523&amp;ssl=1\" 
src=\"https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/MLOps.png?resize=640%2C523&#038;ssl=1\" alt=\"\" class=\"wp-image-3972\" srcset=\"https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/MLOps.png?resize=1024%2C837&amp;ssl=1 1024w, https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/MLOps.png?resize=300%2C245&amp;ssl=1 300w, https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/MLOps.png?resize=768%2C628&amp;ssl=1 768w, https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/MLOps.png?w=1470&amp;ssl=1 1470w, https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/MLOps.png?w=1280&amp;ssl=1 1280w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><figcaption>Image by <a href=\"https:\/\/medium.com\/@taras_tymoshchuck?source=post_page-----c52b49af38c9----------------------\">Taras Tymoshchuck<\/a> from <a href=\"https:\/\/medium.com\/datadriveninvestor\/mlops-practices-and-its-benefits-c52b49af38c9\">here<\/a>.<\/figcaption><\/figure>\n\n\n\n<p>This is a 3 part series about Continuous Machine Learning. 
You can check Part II <a href=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/18\/continuous-machine-learning-part-ii\/\">here<\/a> and Part III here.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What is it?<\/h2>\n\n\n\n<p>Continuous Machine Learning (<span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span>) applies the ideas of <a href=\"https:\/\/en.wikipedia.org\/wiki\/CI\/CD\">Continuous Integration and Continuous Delivery (CI\/CD)<\/a>, well-known practices in <span id=\"wiki-tooltip-1\" data-tooltip-content=\"#wiki-tooltip-box-1\" data-wiki_num=\"1\" data-wiki_id=\"27010\" data-wiki_title=\"Software engineering\" data-wiki_section=\"\" data-wiki_base_url=\"https:\/\/en.wikipedia.org\/w\/api.php\" data-wiki_url=\"https:\/\/en.wikipedia.org\/wiki\/Software_engineering\" data-wiki_thumbnail=\"default\"><a class=\"wiki-tooltip\" href=\"https:\/\/en.wikipedia.org\/wiki\/Software_engineering\" target=\"_blank\" rel=\"noopener noreferrer\" onclick=\"return isClickEnabled( 'hover', 'none' );\">Software Engineering<\/a><\/span> \/ <span id=\"wiki-tooltip-2\" data-tooltip-content=\"#wiki-tooltip-box-2\" data-wiki_num=\"2\" data-wiki_id=\"27488100\" data-wiki_title=\"DevOps\" data-wiki_section=\"\" data-wiki_base_url=\"https:\/\/en.wikipedia.org\/w\/api.php\" data-wiki_url=\"https:\/\/en.wikipedia.org\/wiki\/DevOps\" data-wiki_thumbnail=\"default\"><a class=\"wiki-tooltip\" href=\"https:\/\/en.wikipedia.org\/wiki\/DevOps\" target=\"_blank\" rel=\"noopener noreferrer\" onclick=\"return isClickEnabled( 'hover', 'none' );\">DevOps<\/a><\/span>, to Machine Learning and Data Science projects.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What is this post about?<\/h2>\n\n\n\n<p>I will cover a set of tools that can make your life as a Data Scientist much more interesting. 
We will use <a href=\"http:\/\/github.com\/miicTeam\/miic_R_package\" target=\"_blank\" rel=\"noreferrer noopener\">MIIC<\/a>, a network inference algorithm, to infer the network of a famous dataset (<a href=\"https:\/\/rdrr.io\/cran\/bnlearn\/man\/alarm.html\" target=\"_blank\" rel=\"noreferrer noopener\">alarm from bnlearn<\/a>). We will then use (1) <a href=\"https:\/\/git-scm.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">git<\/a> to track our code, (2) <a href=\"https:\/\/dvc.org\/\" target=\"_blank\" rel=\"noreferrer noopener\">DVC<\/a> to track our dataset, outputs and pipeline, (3) <a href=\"http:\/\/github.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">GitHub<\/a> as a git remote and (4) <a href=\"http:\/\/drive.google.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">Google Drive<\/a> as a DVC remote. I&#8217;ve written a tutorial on managing Data Science projects with DVC, so if you&#8217;re interested in it <a href=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/03\/05\/r-dvc-and-rmarkdown\/\" target=\"_blank\" rel=\"noreferrer noopener\">open a tab here<\/a> to check it later.<\/p>\n\n\n\n<!--more-->\n\n\n\n<p>I don&#8217;t really like having to go to the GitHub website all the time, so I will also introduce you to <a href=\"https:\/\/github.com\/cli\/cli\">gh<\/a>, GitHub&#8217;s official command line application. We will also use <a href=\"https:\/\/cml.dev\/\" target=\"_blank\" rel=\"noreferrer noopener\"><span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span><\/a>, an open-source library for implementing continuous integration &amp; delivery (CI\/CD) in machine learning projects, which links git, DVC and GitHub Actions. The idea is that every time a chosen event happens in your repository (a push, for example), some actions are triggered and executed by GitHub Actions on GitHub&#8217;s computing infrastructure (they call it a GitHub runner, though it&#8217;s just a virtual machine. 
See more about it <a href=\"https:\/\/docs.github.com\/en\/actions\/reference\/virtual-environments-for-github-hosted-runners\">here<\/a>). One example would be using branches as experiments in your ML project, such as several runs of the same algorithm with different parameters. Every time you commit a parameter change and push, a report is generated, making it easier (and prettier) to compare the results across parameter values.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Time to start.<\/h2>\n\n\n\n<p>Let&#8217;s create our repository on GitHub and make a local copy of it. From the command line! (Instructions <a href=\"https:\/\/github.com\/cli\/cli\">here<\/a> to install gh).<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4123\" data-id=\"4123\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4123\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-1\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-sh\" title=\"CML Part I - 1\">mkdir -p $HOME\/dev\r\ncd $HOME\/dev\r\ngh repo create dvc-miic-cml -d 'GitHub repo to play with DVC, MIIC and CML' --public<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>You will be asked if you want to create a local copy of this repository. If you say no, you will have to clone the repository later, so reply <strong>Y<\/strong> and press enter. 
After that, enter the directory.<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4124\" data-id=\"4124\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4124\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-2\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-sh\" title=\"CML Part I - 2\">cd dvc-miic-cml<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>Let&#8217;s create a README file so that we can describe the purpose of the repository. On GitHub, this is usually a file named README.md written in <a href=\"https:\/\/github.com\/adam-p\/markdown-here\/wiki\/Markdown-Cheatsheet\" target=\"_blank\" rel=\"noreferrer noopener\">Markdown format<\/a>.<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4125\" data-id=\"4125\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4125\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-3\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-sh\" title=\"CML Part I - 3\">echo '# DVC-MIIC-CML' &gt; README.md\r\necho 'This is a sample repository for testing DVC, MIIC and CML in GitHub.' &gt;&gt; README.md\r\necho 'The analyses will be performed using [MIIC](https:\/\/github.com\/miicTeam\/miic_R_package) to infer the network from the [alarm dataset](https:\/\/rdrr.io\/cran\/bnlearn\/man\/alarm.html).' 
&gt;&gt; README.md<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>We will also download a license file (GNU GPLv3) from the <a href=\"http:\/\/gnu.org\/\" target=\"_blank\" rel=\"noreferrer noopener\">GNU website<\/a> and name it LICENSE, as is commonly done on GitHub.<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4126\" data-id=\"4126\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4126\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-4\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-sh\" title=\"CML Part I - 4\">wget -c https:\/\/www.gnu.org\/licenses\/gpl-3.0.txt -O LICENSE<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>We will then add our two new files to the index of our git repository with the <em>git add<\/em> command and then commit to the repository, that is, save a snapshot of our local repository. Afterwards, we will push our modifications to GitHub to make sure anyone can see the most up-to-date version of our repository. Since this is the first time we&#8217;re pushing, we need to tell git which remote branch to push to by default. We do that with the <em>--set-upstream<\/em> parameter. In the future, when we want to push to the default branch, we can just type <em>git push<\/em>.<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4127\" data-id=\"4127\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4127\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-5\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-sh\" title=\"CML Part I - 5\"># The dot tells git add to stage everything in the current folder\r\ngit add .\r\ngit commit -m 'Initial commit'\r\ngit push --set-upstream origin master<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>You can check your repository at GitHub now. It should be updated! 
<a href=\"https:\/\/github.com\/mribeirodantas\/dvc-miic-cml\/tree\/72e420f47300d998d02e85b35df869c911a2b740\" target=\"_blank\" rel=\"noreferrer noopener\">Mine is<\/a>. Git is not supposed to track data, output files (metrics files, plots, reports) or pipelines. That&#8217;s where DVC fits in. Let&#8217;s start by tracking the alarm dataset. You can download it from the official MIIC website by clicking <a href=\"https:\/\/miic.curie.fr\/datasets\/alarm1000samples.txt\">here<\/a>. On the command line, this is what we would do:<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4128\" data-id=\"4128\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4128\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-6\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-sh\" title=\"CML Part I - 6\">wget -c https:\/\/miic.curie.fr\/datasets\/alarm1000samples.txt -O alarm.tsv<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>I don&#8217;t really like it when git repositories are just a bunch of files thrown into the root folder, so let&#8217;s make it a bit more organized.<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4129\" data-id=\"4129\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4129\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-7\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-sh\" title=\"CML Part I - 7\">mkdir data\r\nmv alarm.tsv data\/<\/pre>\n\t\t\t<\/div>\n\n\n\n<h2 class=\"wp-block-heading\">DVC enters the scene<\/h2>\n\n\n\n<p>If you type <em>git status<\/em>, you will see that the folder is untracked, which is a bit annoying since (a) git is not supposed to track data and (b) you do not want that either. One of the things that DVC does, after being told to track files, is to tell git to ignore such files. 
After all, DVC will be taking care of them! Before telling DVC what to track, though, you must tell DVC you want it to work in this repository. Just as you would initialize a new git repository with <em>git init<\/em> (when not using gh), you initialize DVC with <em>dvc init<\/em>. Let&#8217;s do this and then tell DVC to track our dataset. Instructions on DVC installation can be found <a href=\"https:\/\/dvc.org\/doc\/install\">here<\/a>.<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4130\" data-id=\"4130\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4130\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-8\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-sh\" title=\"CML Part I - 8\">dvc init\r\ndvc add data\/alarm.tsv<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>Some files will be created by DVC. These metadata files must be tracked by git, so let&#8217;s just add everything new to the index and commit it. Our dataset won&#8217;t be added because DVC added it to .gitignore, a hidden file that tells git what to ignore.<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4131\" data-id=\"4131\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4131\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-9\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-sh\" title=\"CML Part I -  9\">git add .\r\ngit commit -m 'Initiates DVC and asks it to track alarm dataset'<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>To be clear about the git workflow: you don&#8217;t have to push every time you commit. We won&#8217;t push now, for example (though we could). Just like GitHub is a git remote, you can also have a DVC remote. It can be Dropbox, an Amazon S3 bucket, Google Drive or even a folder on your computer or on an external disk. 
For simplicity here, let&#8217;s use Google Drive.<\/p>\n\n\n\n<p>I went to the Google Drive website, logged in with my account, and created a new folder at the root of my drive named <em>dvc-miic-<span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span> example<\/em>. The URL is <strong><a href=\"https:\/\/drive.google.com\/drive\/u\/1\/folders\/188CmpQIYqKOgvcgaLZOxz1GqlwTasv8c\">https:\/\/drive.google.com\/drive\/u\/1\/folders\/188CmpQIYqKOgvcgaLZOxz1GqlwTasv8c<\/a><\/strong><\/p>\n\n\n\n<p>What you need here is the last part after the <strong>folders\/<\/strong>, that is, <strong>188CmpQIYqKOgvcgaLZOxz1GqlwTasv8c<\/strong>. Let&#8217;s set this as our DVC remote now with the following command:<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4132\" data-id=\"4132\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4132\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-10\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-sh\" title=\"CML Part I - 10\">dvc remote add -d myremote gdrive:\/\/188CmpQIYqKOgvcgaLZOxz1GqlwTasv8c<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>The <strong>-d<\/strong> parameter tells DVC that this is your default remote. Without a default, you have to name the remote explicitly (for example with <code>-r myremote<\/code>) whenever you run a command that talks to a remote. We used <em>git push<\/em> to push to our git remote. Can you guess what command we should use to push to our DVC remote at Google Drive? 
I&#8217;m sure you guessed it right!<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4133\" data-id=\"4133\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4133\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-11\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-sh\" title=\"CML Part I - 11\">dvc push<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>If you check your folder in Google Drive you will see it is no longer empty. You can&#8217;t really understand what&#8217;s there, but take my word for it: DVC knows how to interpret it \ud83d\ude1b . Out of habit, you type <em>git status<\/em> and you realize something changed in your repository. <em>Wait, what!?<\/em> By adding a default remote, the DVC configuration file was changed. You could <code>git add<\/code> the change and <code>git commit<\/code> it, but for didactic reasons I will do something else: I will amend the last commit to include it and, while doing so, update the commit message. Amending is useful when you committed something but forgot to include a file, or you decided that your last commit message wasn&#8217;t that good. 
So you change your last commit, instead of making a new one!<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4134\" data-id=\"4134\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4134\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-12\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-sh\" title=\"CML Part I - 12\">git add .\r\ngit commit --amend -m 'Initiates DVC, sets the default remote and asks it to track alarm dataset'<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>Ok, let&#8217;s push now.<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4135\" data-id=\"4135\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4135\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-13\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-sh\" title=\"CML Part I - 13\">git push<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>The GitHub page for your repository should look like <a href=\"https:\/\/github.com\/mribeirodantas\/dvc-miic-cml\/tree\/d2249961e24e331e53428ada9071aedb93a42517\" target=\"_blank\" rel=\"noreferrer noopener\">this<\/a>. You may wonder why there are two files in your data folder since I told you git won&#8217;t be used to track data. One of the files is .gitignore, so that git won&#8217;t nag you that the dataset file is untracked when it is in fact tracked (by DVC). The .dvc file is a metadata file used by DVC and it contains a hash built out of the content of the dataset. 
That&#8217;s how DVC knows if the dataset changed, because the hash will change.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">DVC Note<\/h2>\n\n\n\n<p>If someone is interested in this repository (maybe you are), they would start as with any other GitHub repository: by cloning it!<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4136\" data-id=\"4136\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4136\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-14\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-sh\" title=\"CML Part I - 14\">git clone https:\/\/github.com\/mribeirodantas\/dvc-miic-cml.git\r\ncd dvc-miic-cml<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>By checking the data folder with <code>ls data<\/code>, you will realize the dataset is not there. Well, of course it is not there, right? You only cloned the git repository. Let&#8217;s use <em>dvc pull<\/em> to pull what DVC is tracking for this repository.<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4137\" data-id=\"4137\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4137\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-15\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-sh\" title=\"CML Part I - 15\">dvc pull<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>Now it&#8217;s there \ud83d\ude42 . Let&#8217;s start writing our network inference script. We will use MIIC (Multivariate information-based inductive causation) for that. 
Create a file named <code>infer_network.R<\/code> with the content below:<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4138\" data-id=\"4138\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4138\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-16\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-r\" title=\"CML Part I - 16\">library(miic)\r\nalarm_dataset &lt;- read.table('data\/alarm.tsv', header = TRUE)\r\nres &lt;- miic(input_data = alarm_dataset)\r\ntotal_edges &lt;- nrow(res$all.edges.summary)\r\nretained_edges &lt;- nrow(res$all.edges.summary[res$all.edges.summary$type == 'P', ])\r\nratio_edges &lt;- paste0('Ratio of retained edges: ', retained_edges\/total_edges)\r\nwrite.table(ratio_edges, file = 'metrics.txt', col.names = FALSE, row.names = FALSE)<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>This code loads the miic R package, reads the dataset into the R environment, runs miic to infer the network and calculates the ratio of retained edges to the number of possible edges. Then, the ratio is saved to a file named metrics.txt.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">GitHub Actions<\/h2>\n\n\n\n<p>Now it&#8217;s time to start playing with GitHub Actions to make <span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span> work for us. Every time we push a new commit to the repository, the model will be rebuilt and our metrics recalculated.<\/p>\n\n\n\n<p>In order to use GitHub Actions, we need to create a special file in a special folder. The path from within your git repository is: <strong>.github\/workflows<\/strong><\/p>\n\n\n\n<p>Inside the folder, you have to create your GitHub Action file. The name is not important, but it must be a file in <a href=\"https:\/\/en.wikipedia.org\/wiki\/YAML\" target=\"_blank\" rel=\"noreferrer noopener\">YAML format<\/a>. 
Let&#8217;s create a file named <code><span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span>.yaml<\/code> inside the path mentioned above.<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4139\" data-id=\"4139\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4139\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-17\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-sh\" title=\"CML Part I - 17\">mkdir -p .github\/workflows\r\ncd .github\/workflows<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>Then put the code below inside <code><span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span>.yaml<\/code>. This workflow asks for a machine running the latest version of Ubuntu, sets up an R environment, checks out the current git repository, installs MIIC, DVC and their dependencies, runs <code>dvc pull<\/code> to fetch our dataset, calls the <code>infer_network.R<\/code> script, which saves the metrics to a file, and finally prints that file.<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4140\" data-id=\"4140\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4140\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-18\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-yaml\" title=\"CML Part I - 18\">name: dvc-cml-miic\r\non: [push]\r\njobs:\r\n  run:\r\n    runs-on: [ubuntu-latest]\r\n    steps:\r\n      - uses: r-lib\/actions\/setup-r@master\r\n        with:\r\n          version: '3.6.1'\r\n      - uses: actions\/checkout@v2\r\n      - name: cml_run\r\n        env:\r\n          repo_token: ${{ secrets.GITHUB_TOKEN }}\r\n          GDRIVE_CREDENTIALS_DATA: ${{ secrets.GDRIVE_CREDENTIALS_DATA }}\r\n        run: |\r\n          # Install miic and 
dependencies\r\n          wget -c https:\/\/github.com\/miicTeam\/miic_R_package\/archive\/v1.4.2.tar.gz\r\n          tar -xvzf v1.4.2.tar.gz\r\n          cd miic_R_package-1.4.2\r\n          R --silent -e &quot;install.packages(c(\\&quot;igraph\\&quot;, \\&quot;ppcor\\&quot;, \\&quot;scales\\&quot;, \\&quot;Rcpp\\&quot;))&quot;\r\n          R CMD INSTALL . --preclean\r\n          cd ..\r\n          # Install Python packages\r\n          pip install --upgrade pip\r\n          pip install wheel\r\n          pip install PyDrive2==1.6.0 --use-feature=2020-resolver\r\n          # Install DVC\r\n          wget -c https:\/\/github.com\/iterative\/dvc\/releases\/download\/1.4.0\/dvc_1.4.0_amd64.deb\r\n          sudo apt install .\/dvc_1.4.0_amd64.deb\r\n          # Run DVC\r\n          dvc pull\r\n          Rscript infer_network.R\r\n          # Write your CML report\r\n          echo &quot;MODEL METRICS&quot;\r\n          cat metrics.txt<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>Instead of committing this to the master (default) branch, we will create an experiment branch. That&#8217;s how you should use DVC! We will analyze the raw version of the alarm dataset, no pre-processing, so I will call this branch raw_alarm_dataset.<\/p>\n\n\n\n<p>You have used <code>dvc pull<\/code> already, so your machine is already authenticated with Google Drive. 
<a href=\"https:\/\/docs.github.com\/en\/actions\/configuring-and-managing-workflows\/creating-and-storing-encrypted-secrets\" target=\"_blank\" rel=\"noreferrer noopener\">Create a GitHub secret<\/a> with the content of the file <code>.dvc\/tmp\/gdrive-user-credentials.json<\/code> and name it GDRIVE_CREDENTIALS_DATA.<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4141\" data-id=\"4141\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4141\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-19\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-sh\" title=\"CML Part I - 19\">git checkout -b raw_alarm_dataset\r\n# We are still inside .github\/workflows, so `git add .` would miss\r\n# infer_network.R at the repository root. -A stages everything.\r\ngit add -A\r\ngit commit -m 'Infers alarm network with MIIC and default parameters'\r\ngit push origin raw_alarm_dataset\r\ngh pr create --title 'Network inference of alarm dataset'<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>Now, <a href=\"https:\/\/github.com\/mribeirodantas\/dvc-miic-cml\/pull\/1\/checks\" target=\"_blank\" rel=\"noreferrer noopener\">go to GitHub and check what&#8217;s happening<\/a>. 
If everything goes according to plan, you will see something like the image below when the check is over.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" width=\"640\" height=\"223\" data-attachment-id=\"3943\" data-permalink=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/screen_2020-08-10_22-16-29\/\" data-orig-file=\"https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/screen_2020-08-10_22-16-29.png?fit=3427%2C1196&amp;ssl=1\" data-orig-size=\"3427,1196\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"screen_2020-08-10_22-16-29\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/screen_2020-08-10_22-16-29.png?fit=300%2C105&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/screen_2020-08-10_22-16-29.png?fit=640%2C223&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/screen_2020-08-10_22-16-29.png?resize=640%2C223&#038;ssl=1\" alt=\"\" class=\"wp-image-3943\" srcset=\"https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/screen_2020-08-10_22-16-29.png?resize=1024%2C357&amp;ssl=1 1024w, https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/screen_2020-08-10_22-16-29.png?resize=300%2C105&amp;ssl=1 300w, 
https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/screen_2020-08-10_22-16-29.png?resize=768%2C268&amp;ssl=1 768w, https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/screen_2020-08-10_22-16-29.png?resize=1536%2C536&amp;ssl=1 1536w, https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/screen_2020-08-10_22-16-29.png?resize=2048%2C715&amp;ssl=1 2048w, https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/screen_2020-08-10_22-16-29.png?w=1280&amp;ssl=1 1280w, https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/screen_2020-08-10_22-16-29.png?w=1920&amp;ssl=1 1920w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><\/figure>\n\n\n\n<p>Well&#8230; You got your metrics printed out in the checks log file. Cool, but you probably agree with me that we should expect something more elegant, right? Hehe `^^<\/p>\n\n\n\n<p>Let&#8217;s add some lines to our <code>infer_network.R<\/code> script to make it plot the network, and then let&#8217;s change the last part to make use of <span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span> functionalities. 
The new <code>infer_network.R<\/code> should look like:<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4142\" data-id=\"4142\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4142\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-20\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-r\" title=\"CML Part I - 20\">library(miic)\r\nalarm_dataset &lt;- read.table('data\/alarm.tsv', header = TRUE)\r\nres &lt;- miic(input_data = alarm_dataset)\r\ntotal_edges &lt;- nrow(res$all.edges.summary)\r\nretained_edges &lt;- nrow(res$all.edges.summary[res$all.edges.summary$type == 'P', ])\r\nratio_edges &lt;- paste0('Ratio of retained edges: ', retained_edges\/total_edges)\r\nwrite.table(ratio_edges, file = 'metrics.txt', col.names = FALSE, row.names = FALSE)\r\n# Plot network\r\npng(file='network_diagram.png')\r\nmiic.plot(res)\r\ndev.off()<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>And the new <code><span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span>.yaml<\/code> file should look like the code below. 
The new thing now is that we&#8217;re also installing <span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span> and making use of it.<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4143\" data-id=\"4143\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4143\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-21\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-yaml\" title=\"CML Part I - 21\">name: dvc-<span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span>-miic\r\non: [push]\r\njobs:\r\n  run:\r\n    runs-on: [ubuntu-latest]\r\n    steps:\r\n      - uses: r-lib\/actions\/setup-r@master\r\n        with:\r\n          version: '3.6.1'\r\n      - uses: actions\/checkout@v2\r\n      - name: cml_run\r\n        env:\r\n          repo_token: ${{ secrets.GITHUB_TOKEN }}\r\n          GDRIVE_CREDENTIALS_DATA: ${{ secrets.GDRIVE_CREDENTIALS_DATA }}\r\n        run: |\r\n\r\n          # Install miic and dependencies\r\n          wget -c https:\/\/github.com\/miicTeam\/miic_R_package\/archive\/v1.4.2.tar.gz\r\n          tar -xvzf v1.4.2.tar.gz\r\n          cd miic_R_package-1.4.2\r\n          R --silent -e &quot;install.packages(c(\\&quot;igraph\\&quot;, \\&quot;ppcor\\&quot;, \\&quot;scales\\&quot;, \\&quot;Rcpp\\&quot;))&quot;\r\n          R CMD INSTALL . 
--preclean\r\n          cd ..\r\n          # Install Python packages\r\n          pip install --upgrade pip\r\n          pip install wheel\r\n          pip install PyDrive2==1.6.0 --use-feature=2020-resolver\r\n          # Install DVC\r\n          wget -c https:\/\/github.com\/iterative\/dvc\/releases\/download\/1.4.0\/dvc_1.4.0_amd64.deb\r\n          sudo apt install .\/dvc_1.4.0_amd64.deb\r\n          # Run DVC\r\n          dvc pull\r\n          Rscript infer_network.R\r\n\r\n          # Install <span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span>\r\n          npm init --yes\r\n          npm i @dvcorg\/cml@latest\r\n          # Write your <span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span> report\r\n          echo &quot;## Model Metrics&quot; &gt; report.md\r\n          cat metrics.txt &gt;&gt; report.md\r\n          echo &quot;## Data visualization&quot; &gt;&gt; report.md\r\n          npx cml-publish network_diagram.png --md &gt;&gt; report.md\r\n          npx cml-send-comment report.md<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>Let&#8217;s commit.<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4144\" data-id=\"4144\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4144\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-22\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-sh\" title=\"CML Part I - 22\">git add .\r\ngit commit -m 'Uses <span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span> to improve PR feedback'\r\ngit push origin raw_alarm_dataset<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>Once the checks are done, you should see an automatic comment with your report, like the one in the figure 
below.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" width=\"640\" height=\"489\" data-attachment-id=\"3948\" data-permalink=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/screen_2020-08-10_22-27-49\/\" data-orig-file=\"https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/screen_2020-08-10_22-27-49.png?fit=1124%2C860&amp;ssl=1\" data-orig-size=\"1124,860\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"screen_2020-08-10_22-27-49\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/screen_2020-08-10_22-27-49.png?fit=300%2C230&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/screen_2020-08-10_22-27-49.png?fit=640%2C489&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/screen_2020-08-10_22-27-49.png?resize=640%2C489&#038;ssl=1\" alt=\"\" class=\"wp-image-3948\" srcset=\"https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/screen_2020-08-10_22-27-49.png?resize=1024%2C783&amp;ssl=1 1024w, https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/screen_2020-08-10_22-27-49.png?resize=300%2C230&amp;ssl=1 300w, https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/screen_2020-08-10_22-27-49.png?resize=768%2C588&amp;ssl=1 768w, 
https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/screen_2020-08-10_22-27-49.png?w=1124&amp;ssl=1 1124w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><\/figure>\n\n\n\n<p>Let&#8217;s say that I think too many edges have been removed and maybe the network is not consistent. I will change the <code>infer_network.R<\/code> script to make MIIC look for a <a href=\"http:\/\/kinefold.curie.fr\/isambertlab\/research_interpretable_constraint-based-methods.htm\">consistent<\/a> network. The third line now looks like:<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4145\" data-id=\"4145\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4145\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-23\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-r\" title=\"CML Part I - 23\">res &lt;- miic(input_data = alarm_dataset, consistent='orientation')<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>And commit.<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4146\" data-id=\"4146\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4146\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-24\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-sh\" title=\"CML Part I - 24\">git add .\r\ngit commit -m 'Makes network consistent'\r\ngit push origin raw_alarm_dataset<\/pre>\n\t\t\t<\/div>\n\n\n\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" width=\"564\" height=\"925\" data-attachment-id=\"3949\" data-permalink=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/screen_2020-08-10_22-36-45\/\" data-orig-file=\"https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/screen_2020-08-10_22-36-45.png?fit=564%2C925&amp;ssl=1\" 
data-orig-size=\"564,925\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"screen_2020-08-10_22-36-45\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/screen_2020-08-10_22-36-45.png?fit=183%2C300&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/screen_2020-08-10_22-36-45.png?fit=564%2C925&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/screen_2020-08-10_22-36-45.png?resize=564%2C925&#038;ssl=1\" alt=\"\" class=\"wp-image-3949\" srcset=\"https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/screen_2020-08-10_22-36-45.png?w=564&amp;ssl=1 564w, https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/screen_2020-08-10_22-36-45.png?resize=183%2C300&amp;ssl=1 183w\" sizes=\"auto, (max-width: 564px) 100vw, 564px\" \/><\/figure>\n\n\n\n<p>So now I think it&#8217;s right and I should approve the pull request \ud83d\ude42 . 
I could do it by clicking the green \u201cMerge pull request\u201d button, or I could once again use <code>gh<\/code>, GitHub&#8217;s official command-line application.<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4147\" data-id=\"4147\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4147\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-25\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-sh\" title=\"CML Part I - 25\">gh pr merge 1<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>It will ask you two questions. I chose to create a merge commit and to keep the branch, both locally and on GitHub. To go back to the master branch, run:<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4148\" data-id=\"4148\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4148\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-26\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-sh\" title=\"CML Part I - 26\">git checkout master<\/pre>\n\t\t\t<\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Using Docker containers<\/h2>\n\n\n\n<p>You probably noticed that the checks take a while, and depending on how many things you need to install, they can take very long. One way around this is to use a Docker container that already has your dependencies installed. The workflow as written so far will already work with a container of your own; after all, I&#8217;m installing <span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span> manually. 
If you don&#8217;t want to build a container of your own, but also don&#8217;t want to download and install <span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span> on every check, you can use <span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span>&#8217;s official Docker container.<\/p>\n\n\n\n<p>Since we merged a pull request, our remote (GitHub) is now ahead of our local repository. To update the local repository, let&#8217;s run <code>git pull<\/code>, and then create a new branch.<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4149\" data-id=\"4149\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4149\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-27\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-sh\" title=\"CML Part I - 27\">git pull\r\ngit checkout -b <span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span>_container<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>Change your <code><span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span>.yaml<\/code> to the code below.<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4150\" data-id=\"4150\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4150\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-28\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-yaml\" title=\"CML Part I - 28\">name: dvc-<span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span>-miic\r\non: [push]\r\njobs:\r\n  run:\r\n    runs-on: [ubuntu-latest]\r\n    container: docker:\/\/dvcorg\/cml\r\n    steps:\r\n      - uses: actions\/checkout@v2\r\n        \r\n      - uses: r-lib\/actions\/setup-r@master\r\n        with:\r\n          version: '3.6.1'\r\n\r\n      - name: cml_run\r\n        env:\r\n          repo_token: ${{ secrets.GITHUB_TOKEN }}\r\n          GDRIVE_CREDENTIALS_DATA: ${{ secrets.GDRIVE_CREDENTIALS_DATA }}\r\n        run: |\r\n          # Install miic and dependencies\r\n          wget -c https:\/\/github.com\/miicTeam\/miic_R_package\/archive\/v1.4.2.tar.gz\r\n          tar -xvzf v1.4.2.tar.gz\r\n          cd miic_R_package-1.4.2\r\n          R --silent -e &quot;install.packages(c(\\&quot;igraph\\&quot;, \\&quot;ppcor\\&quot;, \\&quot;scales\\&quot;, \\&quot;Rcpp\\&quot;))&quot;\r\n          R CMD INSTALL . --preclean\r\n          cd ..\r\n\r\n          # Run DVC\r\n          dvc pull\r\n          Rscript infer_network.R\r\n\r\n          # Write your <span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span> report\r\n          echo &quot;## Model Metrics&quot; &gt; report.md\r\n          cat metrics.txt &gt;&gt; report.md\r\n          echo &quot;## Data visualization&quot; &gt;&gt; report.md\r\n          cml-publish network_diagram.png --md &gt;&gt; report.md\r\n          cml-send-comment report.md<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>Let&#8217;s add the changed file, commit it, push, and create a Pull Request (PR).<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4151\" data-id=\"4151\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4151\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-29\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-sh\" title=\"CML Part I - 29\">git add .\r\ngit commit -m 'Makes use of <span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span> container'\r\ngit push origin <span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span>_container\r\ngh pr create --title 'Use <span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span> container'<\/pre>\n\t\t\t<\/div>\n\n\n\n<p>Everything 
should have run fine, as it did <a href=\"https:\/\/github.com\/mribeirodantas\/dvc-miic-cml\/pull\/2\/checks\">here<\/a>. You can merge the pull request and then <code>git pull<\/code> to update your local copy.<\/p>\n\n\n<div class=\"snippetcpt-wrap\" id=\"snippet-4152\" data-id=\"4152\" data-edit=\"\" data-copy=\"\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896?snippet=25a28b258b&#038;id=4152\" data-fullscreen=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/code-snippets\/cml-part-i-30\/?full-screen=1\">\n\t\t\t\t<pre class=\"prettyprint linenums lang-sh\" title=\"CML Part I - 30\">gh pr merge 2\r\ngit pull<\/pre>\n\t\t\t<\/div>\n\n\n\n<h2 class=\"wp-block-heading\">What else?<\/h2>\n\n\n\n<p>DVC is not limited to data tracking. We could also track our pipeline, including output files such as the image that our <code>infer_network.R<\/code> script plots. Imagine a preprocessing script that delivers a preprocessed dataset to the <code>infer_network.R<\/code> script, which then generates the image of the network. Instead of running each of these scripts by hand (and we can easily think of scenarios that are much more complicated), we can use DVC to define a pipeline, and a single command (<code>dvc repro<\/code>) in our GitHub Actions workflow file would be enough to reproduce the whole pipeline.<\/p>\n\n\n\n<p>Besides, instead of installing the same things (R, DVC, <span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span>&#8230;) every time we push to the repository, we could have a Docker container with them already installed. This could save us some time :-). In our case, for example, downloading, compiling and installing MIIC takes a few minutes that could be spared if it were already installed in a Docker container. 
For our simple example, the time spent downloading and setting up the Docker container may outweigh what it saves, but as complexity and dependencies grow, the benefits become more evident.<\/p>\n\n\n\n<p>That&#8217;s it for today, folks! \ud83d\ude09<\/p>\n\n\n\n<p>You would not be reading this post if it weren&#8217;t for <a href=\"https:\/\/twitter.com\/DrElleOBrien\">Elle O&#8217;Brien<\/a>, who told me so many things about <span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span>, along with presentations and examples, and <a href=\"https:\/\/twitter.com\/g_ortega_david\">David Ortega<\/a>, who helped me set up the R environment within the <span class='tooltipsall tooltipsincontent classtoolTips0'>CML<\/span> Docker container.<\/p>\n<script type=\"text\/javascript\"> toolTips('.classtoolTips0','Continuous Machine Learning is the equivalent of Continuous Integration and Continuous Delivery (CI\/CD) for Machine Learning.'); <\/script>","protected":false},"excerpt":{"rendered":"<p><span class=\"span-reading-time rt-reading-time\" style=\"display: block;\"><span class=\"rt-label rt-prefix\"><b>Reading time: <\/span> <span class=\"rt-time\"> 9<\/span> <span class=\"rt-label rt-postfix\">minutes<\/b><\/span><\/span>Continuous Machine Learning has come to revolutionize Machine Learning, Data Science and Software Engineering! 
I will teach you how to exploit this through CML, DVC and MIIC in this blog post \ud83d\ude42<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_uag_custom_page_level_css":"","jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","enabled":false},"version":2}},"categories":[196,69,24,209,1],"tags":[9,219,28,208,63,25],"class_list":["post-3896","post","type-post","status-publish","format-standard","hentry","category-causality","category-data-science","category-r","category-tools","category-uncategorized","tag-causality","tag-cml","tag-data-science","tag-dvc","tag-machine-learning","tag-r"],"jetpack_publicize_connections":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v24.4 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Continuous Machine Learning - Part I - The Dataist Storyteller<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Continuous Machine Learning - Part I - The Dataist Storyteller\" \/>\n<meta property=\"og:description\" content=\"Reading time:  9 minutesContinuous Machine Learning has come to revolutionize Machine Learning, Data Science and Software Engineering! 
I will teach you how to exploit this through CML, DVC and MIIC in this blog post :-)\" \/>\n<meta property=\"og:url\" content=\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/\" \/>\n<meta property=\"og:site_name\" content=\"The Dataist Storyteller\" \/>\n<meta property=\"article:published_time\" content=\"2020-08-10T21:42:29+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-01-31T02:37:50+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/MLOps-1024x837.png\" \/>\n<meta name=\"author\" content=\"mribeirodantas\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"mribeirodantas\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"12 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/\"},\"author\":{\"name\":\"mribeirodantas\",\"@id\":\"https:\/\/mribeirodantas.xyz\/blog\/#\/schema\/person\/2856ebf8edffabf1f4bbca59bade5957\"},\"headline\":\"Continuous Machine Learning &#8211; Part 
I\",\"datePublished\":\"2020-08-10T21:42:29+00:00\",\"dateModified\":\"2025-01-31T02:37:50+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/\"},\"wordCount\":2417,\"commentCount\":3,\"publisher\":{\"@id\":\"https:\/\/mribeirodantas.xyz\/blog\/#\/schema\/person\/2856ebf8edffabf1f4bbca59bade5957\"},\"image\":{\"@id\":\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/MLOps-1024x837.png\",\"keywords\":[\"causality\",\"CML\",\"data science\",\"DVC\",\"machine learning\",\"r\"],\"articleSection\":[\"Causality\",\"Data Science\",\"R\",\"tools\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/\",\"url\":\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/\",\"name\":\"Continuous Machine Learning - Part I - The Dataist 
Storyteller\",\"isPartOf\":{\"@id\":\"https:\/\/mribeirodantas.xyz\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/MLOps-1024x837.png\",\"datePublished\":\"2020-08-10T21:42:29+00:00\",\"dateModified\":\"2025-01-31T02:37:50+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/#primaryimage\",\"url\":\"https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/MLOps.png?fit=1470%2C1202&ssl=1\",\"contentUrl\":\"https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/MLOps.png?fit=1470%2C1202&ssl=1\",\"width\":1470,\"height\":1202},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/mribeirodantas.xyz\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Continuous Machine Learning &#8211; Part I\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/mribeirodantas.xyz\/blog\/#website\",\"url\":\"https:\/\/mribeirodantas.xyz\/blog\/\",\"name\":\"The Dataist Storyteller\",\"description\":\"Telling stories backed by 
data\",\"publisher\":{\"@id\":\"https:\/\/mribeirodantas.xyz\/blog\/#\/schema\/person\/2856ebf8edffabf1f4bbca59bade5957\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/mribeirodantas.xyz\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\/\/mribeirodantas.xyz\/blog\/#\/schema\/person\/2856ebf8edffabf1f4bbca59bade5957\",\"name\":\"mribeirodantas\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/mribeirodantas.xyz\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/6687720529e55feab1680cbd98da5c7f?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/6687720529e55feab1680cbd98da5c7f?s=96&d=mm&r=g\",\"caption\":\"mribeirodantas\"},\"logo\":{\"@id\":\"https:\/\/mribeirodantas.xyz\/blog\/#\/schema\/person\/image\/\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Continuous Machine Learning - Part I - The Dataist Storyteller","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/","og_locale":"en_US","og_type":"article","og_title":"Continuous Machine Learning - Part I - The Dataist Storyteller","og_description":"Reading time:  9 minutesContinuous Machine Learning has come to revolutionize Machine Learning, Data Science and Software Engineering! 
I will teach you how to exploit this through CML, DVC and MIIC in this blog post :-)","og_url":"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/","og_site_name":"The Dataist Storyteller","article_published_time":"2020-08-10T21:42:29+00:00","article_modified_time":"2025-01-31T02:37:50+00:00","og_image":[{"url":"https:\/\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/MLOps-1024x837.png","type":"","width":"","height":""}],"author":"mribeirodantas","twitter_card":"summary_large_image","twitter_misc":{"Written by":"mribeirodantas","Est. reading time":"12 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/#article","isPartOf":{"@id":"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/"},"author":{"name":"mribeirodantas","@id":"https:\/\/mribeirodantas.xyz\/blog\/#\/schema\/person\/2856ebf8edffabf1f4bbca59bade5957"},"headline":"Continuous Machine Learning &#8211; Part I","datePublished":"2020-08-10T21:42:29+00:00","dateModified":"2025-01-31T02:37:50+00:00","mainEntityOfPage":{"@id":"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/"},"wordCount":2417,"commentCount":3,"publisher":{"@id":"https:\/\/mribeirodantas.xyz\/blog\/#\/schema\/person\/2856ebf8edffabf1f4bbca59bade5957"},"image":{"@id":"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/#primaryimage"},"thumbnailUrl":"https:\/\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/MLOps-1024x837.png","keywords":["causality","CML","data science","DVC","machine learning","r"],"articleSection":["Causality","Data 
Science","R","tools"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/","url":"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/","name":"Continuous Machine Learning - Part I - The Dataist Storyteller","isPartOf":{"@id":"https:\/\/mribeirodantas.xyz\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/#primaryimage"},"image":{"@id":"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/#primaryimage"},"thumbnailUrl":"https:\/\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/MLOps-1024x837.png","datePublished":"2020-08-10T21:42:29+00:00","dateModified":"2025-01-31T02:37:50+00:00","breadcrumb":{"@id":"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/#primaryimage","url":"https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/MLOps.png?fit=1470%2C1202&ssl=1","contentUrl":"https:\/\/i0.wp.com\/mribeirodantas.xyz\/blog\/wp-content\/uploads\/2020\/08\/MLOps.png?fit=1470%2C1202&ssl=1","width":1470,"height":1202},{"@type":"BreadcrumbList","@id":"https:\/\/mribeirodantas.xyz\/blog\/index.php\/2020\/08\/10\/continuous-machine-learning\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/mribeirodantas.xyz\/blog\/"},{"@type":
"ListItem","position":2,"name":"Continuous Machine Learning &#8211; Part I"}]},{"@type":"WebSite","@id":"https:\/\/mribeirodantas.xyz\/blog\/#website","url":"https:\/\/mribeirodantas.xyz\/blog\/","name":"The Dataist Storyteller","description":"Telling stories backed by data","publisher":{"@id":"https:\/\/mribeirodantas.xyz\/blog\/#\/schema\/person\/2856ebf8edffabf1f4bbca59bade5957"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/mribeirodantas.xyz\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"https:\/\/mribeirodantas.xyz\/blog\/#\/schema\/person\/2856ebf8edffabf1f4bbca59bade5957","name":"mribeirodantas","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/mribeirodantas.xyz\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/6687720529e55feab1680cbd98da5c7f?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/6687720529e55feab1680cbd98da5c7f?s=96&d=mm&r=g","caption":"mribeirodantas"},"logo":{"@id":"https:\/\/mribeirodantas.xyz\/blog\/#\/schema\/person\/image\/"}}]}},"jetpack_featured_media_url":"","uagb_featured_image_src":{"full":false,"thumbnail":false,"medium":false,"medium_large":false,"large":false,"1536x1536":false,"2048x2048":false},"uagb_author_info":{"display_name":"mribeirodantas","author_link":"https:\/\/mribeirodantas.xyz\/blog\/index.php\/author\/mribeirodantas\/"},"uagb_comment_info":19,"uagb_excerpt":"Reading time: 9 minutesContinuous Machine Learning has come to revolutionize Machine Learning, Data Science and Software Engineering! 
I will teach you how to exploit this through CML, DVC and MIIC in this blog post :-)","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/paw9jx-10Q","jetpack-related-posts":[],"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/mribeirodantas.xyz\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mribeirodantas.xyz\/blog\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mribeirodantas.xyz\/blog\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mribeirodantas.xyz\/blog\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mribeirodantas.xyz\/blog\/index.php\/wp-json\/wp\/v2\/comments?post=3896"}],"version-history":[{"count":82,"href":"https:\/\/mribeirodantas.xyz\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896\/revisions"}],"predecessor-version":[{"id":4204,"href":"https:\/\/mribeirodantas.xyz\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3896\/revisions\/4204"}],"wp:attachment":[{"href":"https:\/\/mribeirodantas.xyz\/blog\/index.php\/wp-json\/wp\/v2\/media?parent=3896"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mribeirodantas.xyz\/blog\/index.php\/wp-json\/wp\/v2\/categories?post=3896"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mribeirodantas.xyz\/blog\/index.php\/wp-json\/wp\/v2\/tags?post=3896"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}