Prepare Storage for Data Loading and Model Checkpointing

Data loading and model checkpointing are crucial to deep learning (DL) workloads, especially distributed DL. You need to prepare a FUSE mount that each worker can use for data loading, model checkpointing, and logging to a shared storage location.
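As a rough illustration of how workers might share one mount, the sketch below derives a read-only data directory, a shared checkpoint directory, and a per-worker log directory from a single mount root. The function name worker_dirs, the directory layout, and the example mount path are assumptions made for illustration; they are not prescribed by Azure Databricks.

```python
from pathlib import Path


def worker_dirs(mount_root, run_name, rank):
    """Return the directories one worker uses under a shared mount.

    Hypothetical layout (an assumption, not a Databricks convention):
      <mount_root>/data                         input data, shared by all runs
      <mount_root>/<run_name>/checkpoints       checkpoints, shared by all workers
      <mount_root>/<run_name>/logs/worker-<rank>  logs, one directory per worker
    """
    root = Path(mount_root)
    run_dir = root / run_name
    dirs = {
        "data": root / "data",
        "checkpoints": run_dir / "checkpoints",
        "logs": run_dir / "logs" / f"worker-{rank}",
    }
    # Create only the directories this worker writes to.
    for key in ("checkpoints", "logs"):
        dirs[key].mkdir(parents=True, exist_ok=True)
    return dirs
```

A worker with rank 3 could call worker_dirs("/mnt/blobfuse", "run-001", 3), where /mnt/blobfuse stands in for wherever the FUSE mount is attached. Giving each worker its own log directory avoids several workers appending to the same file on the shared storage.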

DBFS FUSE was not designed to handle data loading and model checkpointing. To achieve good I/O performance for DL, Azure Databricks recommends using a custom FUSE client.

Azure Databricks recommends that you use the blobfuse client, an open source project that provides a virtual file system backed by Azure Blob Storage. For information about blobfuse, see the blobfuse GitHub website.

To mount an Azure Blob Storage container as a file system with blobfuse, Azure Databricks recommends that you use an init script. The example notebook below demonstrates how to generate an init script and configure a cluster to run the script.
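The general shape of such an init script can be sketched in Python as follows. This is a minimal sketch, not the content of the example notebook: it assumes the blobfuse v1 command line and configuration file format, assumes blobfuse is already installed on the node (the installation steps are omitted), and uses placeholder paths, account names, and helper names (build_blobfuse_config, build_init_script) that are hypothetical.

```python
def build_blobfuse_config(account_name, account_key, container_name):
    """Return the text of a blobfuse v1 connection config file."""
    return (
        f"accountName {account_name}\n"
        f"accountKey {account_key}\n"
        f"containerName {container_name}\n"
    )


def build_init_script(mount_point, tmp_path, config_path):
    """Return a bash init script that mounts the container with blobfuse.

    Assumes blobfuse is already installed and that config_path points to a
    file in the format produced by build_blobfuse_config.
    """
    lines = [
        "#!/bin/bash",
        "set -e",
        # Create the mount point and the local cache directory blobfuse needs.
        f"mkdir -p {mount_point} {tmp_path}",
        # Mount the container; allow_other lets non-root processes use it.
        f"blobfuse {mount_point} --tmp-path={tmp_path} "
        f"--config-file={config_path} -o allow_other",
    ]
    return "\n".join(lines) + "\n"


# Placeholder values for illustration only.
config_text = build_blobfuse_config("<account-name>", "<account-key>", "<container>")
script_text = build_init_script(
    mount_point="/mnt/blobfuse",
    tmp_path="/mnt/resource/blobfusetmp",
    config_path="/etc/blobfuse/connection.cfg",
)
```

In a notebook, script_text could then be written to a location the cluster can read, for example with dbutils.fs.put, and that location set as the cluster's init script. Because the config file holds the storage account key in plain text, you would likely want to restrict its permissions and source the key from a secret store rather than hard-coding it.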