Why Use MLflow to Package Models
MLflow allows you to package and distribute your model in a way that makes it easy for other users to run inference with it. It also makes it easy for model serving platforms (e.g., SageMaker, Databricks, Azure ML) to serve your model as a web service. Read about the MLflow Model Format to learn more.
Specifically, using MLflow provides the following concrete benefits:
Uniform packaging of model code and weights: Packages your model inference code and weights in a uniform way that is independent of the ML framework used to train the model, making the model significantly easier to distribute.
Clear input/output validation and documentation: Provides a schema specification of the model's inputs and outputs, which enables automatic data validation at inference time and the generation of clear documentation for your model (see the short example after this list).
Scoring API independent of ML training framework: Provides a uniform scoring API that is independent of the ML framework used to train the model. This significantly improves the usability of your model and its deployability on model serving platforms.
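For example, a model signature (the schema that MLflow validates against and documents) can be inferred from example inputs and outputs. The snippet below is a minimal sketch; the column names are made up purely for illustration and are not part of the packaging template:

```python
# Minimal sketch: deriving an MLflow model signature from example data.
# The column names here are hypothetical, purely for illustration.
import pandas as pd
from mlflow.models import infer_signature

example_input = pd.DataFrame({"cell_id": ["c1", "c2"], "assay": ["10x", "10x"]})
example_output = pd.DataFrame({"embedding_norm": [0.13, 0.27]})

# infer_signature captures column names and types; MLflow uses the resulting
# signature to validate inference requests and to document the model.
signature = infer_signature(example_input, example_output)
print(signature)
```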
Quick overview of MLflow packaging
Below is a quick overview of the steps required to package a model as an MLflow Model. At the end of these steps, you and the end users of your model can run inference with a single command!
You will be given a pre-configured directory structure that contains all the files necessary to package a model as an MLflow Model. You only need to modify a few files (see below) to get your model packaged and to run inference on it. Here is the pre-configured directory structure containing the code:
<your_model_name>_mlflow_pkg
├── requirements.in
├── generate_requirements_txt.py
├── model_data/
├── model_code/
│ ├── __init__.py
│ └── <your_model_name>_mlflow_model.py
├── model_spec.py
└── mlflow_packager.py
1. Put your model's top-level Python package dependencies in requirements.in and run generate_requirements_txt.py to capture all transitive dependencies.
2. Download your model weights and any auxiliary data to the model_data/ directory.
3. Specify the input and output datatypes of your model in model_spec.py (see the sketch at the end of this overview).
4. Implement 3 hook methods in <your_model_name>_mlflow_model.py to complete the inference implementation for your model (see the sketch at the end of this overview).
5. Run the mlflow_packager.py script to package up your model. The MLflow package will look something like this (example from transcriptformer; see the original model repo here):
mlflow_model_artifact/
├── MLmodel
├── artifacts
│ └── tf_sapiens
│ ├── config.json
│ ├── model_weights.pt
│ └── vocabs
│ ├── assay_vocab.json
│ └── homo_sapiens_gene.h5
├── code
│ └── model_code
│ ├── __init__.py
│ └── transcriptformer_mlflow_model.py
├── conda.yaml
├── input_example.json
├── python_env.yaml
├── python_model.pkl
├── requirements.txt
└── serving_input_example.json
6. Copy the serving_input_example.json and modify it to point to an input file on your disk.
7. You can now run inference with a single command:

mlflow models predict --model-uri ./mlflow_model_artifact --content-type json --input-path serving_input_example.json --output-path test_output.json --env-manager conda
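To give a flavor of step 3, here is a minimal sketch of how input and output datatypes can be declared with MLflow's schema classes. The actual contents and structure of model_spec.py are defined by the packaging template; the field names below are hypothetical:

```python
# Illustrative sketch only -- the real model_spec.py is provided by the
# packaging template. Field names here are hypothetical.
from mlflow.models import ModelSignature
from mlflow.types import ColSpec, Schema

# Declare the columns (and types) the model expects and returns. MLflow uses
# this signature to validate inference requests and to document the model.
input_schema = Schema([ColSpec("string", "cell_id"), ColSpec("string", "assay")])
output_schema = Schema([ColSpec("double", "embedding_norm")])

signature = ModelSignature(inputs=input_schema, outputs=output_schema)
```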
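To give a flavor of step 4: the template defines its own hook methods in <your_model_name>_mlflow_model.py, but they ultimately map onto MLflow's standard custom-model interface (mlflow.pyfunc.PythonModel). The sketch below uses that standard interface; the artifact key and helper functions are hypothetical:

```python
# Illustrative sketch of a custom MLflow model, not the template's actual hooks.
# load_my_model and run_inference are hypothetical helpers; the "tf_sapiens"
# artifact key is an assumption based on the example layout above.
import mlflow.pyfunc


class MyMLflowModel(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        # context.artifacts maps artifact names to local paths inside the
        # packaged model; load weights and vocabularies from there.
        weights_dir = context.artifacts["tf_sapiens"]
        self.model = load_my_model(weights_dir)

    def predict(self, context, model_input):
        # model_input has already been validated against the signature declared
        # in model_spec.py; return data matching the declared output schema.
        return run_inference(self.model, model_input)
```

Under the hood, mlflow_packager.py presumably bundles this class, the declared signature, and the contents of model_data/ via something like mlflow.pyfunc.save_model, which is what produces the mlflow_model_artifact/ layout shown above.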