We all know how painful keeping track of your machine learning experiments can be.
You train a bunch of models of different flavours (Random Forests, XGBoost, Neural Networks, etc.).
For each model, you explore a range of hyper-parameters. Then you compute the performance metrics on some test data.
Sometimes you change the training data by adding or removing features.
At other times, you work in a team and have to combine your results with those of other data scientists...
How do you manage these experiments so that they are easily traceable and therefore reproducible? MLflow is perfectly suited to this task.
To learn more about MLflow, watch the video tutorial.
Here's what I'll discuss:
- Setting up MLflow locally to track some machine learning experiments I performed on a dataset
- For each model fit, using MLflow to track (see the sketch after this list):
  - metrics
  - hyper-parameters
  - source scripts executing the run
  - code version
  - notes & comments
- Comparing different runs through the MLflow UI
- Setting up a remote tracking server on AWS
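
To make the tracking items above concrete, here is a minimal sketch of a single tracked run. It uses an illustrative scikit-learn model and dataset, not the exact code from my experiments. Parameters, metrics and notes are logged explicitly; when a run is launched from a script under git, MLflow also records the script name and commit hash automatically as system tags (`mlflow.source.name`, `mlflow.source.git.commit`).

```python
import mlflow
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Illustrative data and hyper-parameters, not the ones used in the post
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
params = {"n_estimators": 100, "max_depth": 5}

with mlflow.start_run(run_name="random-forest-baseline"):
    # Hyper-parameters
    mlflow.log_params(params)

    model = RandomForestClassifier(**params).fit(X_train, y_train)

    # Metrics on the held-out test set
    mlflow.log_metric("accuracy", accuracy_score(y_test, model.predict(X_test)))

    # Notes & comments (this tag feeds the Notes field in the MLflow UI)
    mlflow.set_tag("mlflow.note.content", "Baseline random forest, all features")
```

Running `mlflow ui` in the same directory and opening http://localhost:5000 then lets you browse and compare the runs side by side.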
To reproduce my experiments and set up MLflow on AWS, have a look at my GitHub repo.
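
Once a tracking server is running on AWS, the only change on the client side is pointing MLflow at the remote URI. This is just a sketch; the host name below is a placeholder, not an address from the repo.

```python
import mlflow

# Placeholder address: replace with your own tracking server URL on AWS
mlflow.set_tracking_uri("http://ec2-xx-xx-xx-xx.compute.amazonaws.com:5000")
mlflow.set_experiment("my-experiment")  # subsequent runs are logged to the remote server
```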
Happy coding 💻