Save Spark DataFrames and Datasets to TFRecord Files

You can use spark-tensorflow-connector to save Spark DataFrames to TFRecord files. See Spark-TensorFlow data conversion for details.
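As a rough illustration of the save step, the sketch below writes a DataFrame through the connector's `tfrecords` data source. It assumes the spark-tensorflow-connector library is already attached to the cluster. The helper names `save_as_tfrecords` and `demo`, the sample columns, and the output path are hypothetical and chosen only for this example.

```python
def save_as_tfrecords(df, path, record_type="Example", mode="overwrite"):
    """Write a Spark DataFrame to `path` as TFRecord files.

    Assumption: the spark-tensorflow-connector JAR is installed on the
    cluster, which is what registers the "tfrecords" data source.
    """
    (
        df.write
        .format("tfrecords")
        # recordType is "Example" or "SequenceExample" in the connector.
        .option("recordType", record_type)
        .mode(mode)
        .save(path)
    )


def demo():
    """Build a tiny DataFrame and save it (run this on a Spark cluster)."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame(
        [(0, [0.0, 1.0]), (1, [1.0, 0.0])],
        ["label", "features"],
    )
    # Hypothetical output directory; replace with your own location.
    save_as_tfrecords(df, "/tmp/example-tfrecords")
```

Calling `demo()` in a notebook cell should produce a directory of TFRecord part files at the chosen path, one or more per DataFrame partition.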

The example notebook below shows how to load MNIST image data into Spark DataFrames and save it as TFRecord files with spark-tensorflow-connector. Before running the notebook, you must:

  1. Prepare storage mounts for distributed data loading.
  2. Configure your FUSE_MOUNT_LOCATION in the notebook.
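
The second prerequisite might look like the sketch below. It assumes the FUSE mount lives under `/dbfs/`, so that the same directory is reachable from Spark through a `dbfs:/` URI. The mount value, the `mnist` subdirectory, and both helper functions are hypothetical; substitute the location you prepared in step 1.

```python
import posixpath

# Hypothetical value: point this at the storage mount you prepared.
FUSE_MOUNT_LOCATION = "/dbfs/ml/tfrecords-demo"


def fuse_to_dbfs_uri(fuse_path):
    """Translate a local FUSE path under /dbfs/ to a dbfs:/ URI for Spark.

    Assumption: the mount follows the /dbfs/<path> <-> dbfs:/<path> mapping.
    """
    prefix = "/dbfs/"
    if not fuse_path.startswith(prefix):
        raise ValueError("expected a path under /dbfs/, got: " + fuse_path)
    return "dbfs:/" + fuse_path[len(prefix):]


def tfrecord_output_dir(mount, name="mnist"):
    """Return the directory under the mount where TFRecord files are written."""
    return posixpath.join(mount, name)


# Local path for code that reads files directly through the FUSE mount.
local_dir = tfrecord_output_dir(FUSE_MOUNT_LOCATION)
# Equivalent URI to pass to Spark readers and writers.
spark_dir = fuse_to_dbfs_uri(local_dir)
```

Keeping both forms of the path lets Spark write the files through the URI while single-node code reads them back through the local mount.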