Increasingly, data science projects do not simply end with a few key takeaway conclusions, but result in trained machine learning (ML) models going into production. With these models taking ever more critical roles in organisations’ services and tooling, it’s extremely important for us to track how models were created, know why a particular model was selected over other candidates, and be able to reproduce them when necessary.
At Faculty, we believe that knowing where models came from and being able to reproduce their training will become ever more important as AI becomes more widespread. To address these needs, we’re working on a range of new features to help data scientists manage their workflow in Faculty Platform. The first of these to be launched is our experiment tracking feature.
Experiment tracking allows data scientists to keep track of the parameters, code, and data that went into producing a trained machine learning model. It supports storing the trained model and any associated metrics or artifacts centrally. With experiment tracking on Faculty it’s now easy to keep a full history and lineage of what went into producing a model, as well as to reproduce the results. For larger teams it supports stronger collaboration by giving everyone visibility of all experiments.
Why is tracking experiments important?
A typical process for training a model as part of a data science project involves first importing and preparing training data, selecting an appropriate model type, then training the model using the prepared data. A data scientist will usually try several different model types and sets of associated hyperparameters (inputs to the optimisation algorithm used to train the model). This results in a number of different models being trained, which the data scientist will select the best from based on one or more performance metrics, in addition to other considerations such as model explainability and complexity.
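The selection loop described above can be sketched in a few lines of scikit-learn. This is a minimal illustration, not Faculty’s implementation: the digits dataset and the candidate gamma values are stand-ins for a real project’s data and hyperparameter grid.

```python
from sklearn import datasets, metrics, svm
from sklearn.model_selection import train_test_split

# Import and prepare training data (digits is a small built-in dataset).
digits = datasets.load_digits()
data_train, data_test, target_train, target_test = train_test_split(
    digits.data, digits.target, random_state=0
)

# Train one model per candidate hyperparameter value and record a metric.
results = {}
for gamma in (0.0001, 0.001, 0.01):
    classifier = svm.SVC(gamma=gamma).fit(data_train, target_train)
    predictions = classifier.predict(data_test)
    results[gamma] = metrics.accuracy_score(target_test, predictions)

# Select the best candidate by its performance metric.
best_gamma = max(results, key=results.get)
```

In practice the choice would also weigh explainability and complexity, which a single accuracy number doesn’t capture.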
Once a model is selected and put into production, the data scientist will want to remember where it came from and be able to retrace their steps in the future. Considering the workflow discussed above, this requires tracking:
- The exact training data used
- What code was used to prepare it for model training
- What model implementation and hyperparameters were used to train it
- What version of software packages were used to train the model
Tracking all of this information manually for every model trained would be very time-consuming, so in practice data scientists typically don’t do it. This frequently results in models in production whose source is unknown – not a great situation to be in if the model starts performing in unexpected ways.
How our experiment tracking feature works
There are already a number of tools for tracking the training of data science models, both open source and commercial. After extensive testing and user research, we decided to implement a tracking feature that would integrate MLflow, an open source project. It’s a feature-rich tool supporting many common data science libraries, and has a growing community and developer base. Crucially, data scientists need to make only trivial changes to their code to use MLflow.
After collaboration with the principal MLflow developers at Databricks, including contributions to the open source project, we’ve integrated MLflow into Faculty Platform as the experiment tracking feature. This works out of the box in Faculty: all you need to do is import MLflow into your existing ML code and start using it:
import mlflow
from sklearn import metrics, svm

# data_train, target_train, data_test, target_test prepared earlier

gamma = 0.001
mlflow.log_param("gamma", gamma)

classifier = svm.SVC(gamma=gamma)
classifier.fit(data_train, target_train)

predictions = classifier.predict(data_test)
accuracy = metrics.accuracy_score(target_test, predictions)
mlflow.log_metric("accuracy", accuracy)
Running your model training code with these MLflow annotations will create a run in the new Experiments screen:
Experiment tracking with MLflow in Faculty Platform is already a really useful feature for keeping records in your data science workflow, but we’re working on further improvements to make it even better, including:
- Filtering and sorting of runs by parameters, metrics and other values
- Comparison of runs by plotting parameter and metric values
- Viewing how metrics change during model training
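On the last point, MLflow’s log_metric already accepts a step argument, so a metric can be recorded repeatedly during training and later plotted against epoch. A minimal sketch, with the loss value a stand-in for a real training loss:

```python
import mlflow

with mlflow.start_run():
    for epoch in range(3):
        training_loss = 1.0 / (epoch + 1)  # stand-in for a real loss value
        mlflow.log_metric("loss", training_loss, step=epoch)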
If you’re interested in a demo of Faculty Platform or in finding out more, please get in touch.