LZO Compressed Files

Due to licensing restrictions, the LZO compression codec is not available by default on Azure Databricks clusters. To read LZO compressed files, you must use an init script to install the codec on your cluster at launch time.

This topic includes two notebooks:

Init LZO compressed files
  • Builds the LZO codec.
  • Creates an init script that:
    • Installs the LZO compression libraries and the lzop command, and copies the LZO codec to the proper class path.
    • Configures Spark to use the LZO compression codec.
Read LZO compressed files
  • Uses the codec installed by the init script.

Init LZO compressed files
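
As a rough illustration of the steps such an init script performs, the following Python sketch assembles the script text and the matching Spark settings. It is a sketch under stated assumptions, not the notebook's actual contents: the function name build_init_script, the jar directory /databricks/jars, the Debian package names lzop and liblzo2-dev, and the example jar path are all assumptions made for illustration. The codec class names are those published by the hadoop-lzo project.

```python
import shlex

# Assumed location of cluster jars; adjust to your environment.
DEFAULT_JAR_DIR = "/databricks/jars"

# Spark settings that register the LZO codecs with Hadoop.
# Setting io.compression.codecs replaces the default list, so the
# standard codecs are listed again alongside the LZO ones.
LZO_SPARK_CONF = {
    "spark.hadoop.io.compression.codecs": ",".join([
        "org.apache.hadoop.io.compress.DefaultCodec",
        "org.apache.hadoop.io.compress.GzipCodec",
        "com.hadoop.compression.lzo.LzoCodec",
        "com.hadoop.compression.lzo.LzopCodec",
    ]),
    "spark.hadoop.io.compression.codec.lzo.class":
        "com.hadoop.compression.lzo.LzoCodec",
}


def build_init_script(codec_jar: str, jar_dir: str = DEFAULT_JAR_DIR) -> str:
    """Return the text of a bash script that installs LZO support.

    codec_jar is the path to an already built LZO codec jar
    (hypothetical; produced by the build step of the init notebook).
    """
    lines = [
        "#!/bin/bash",
        "set -e",
        "apt-get update",
        # Install the LZO libraries and the lzop command.
        "apt-get install -y lzop liblzo2-dev",
        # Copy the codec jar onto the class path.
        f"cp {shlex.quote(codec_jar)} {shlex.quote(jar_dir)}/",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_init_script("/dbfs/tmp/hadoop-lzo.jar"))
```

The sketch only generates text. Saving the script where the cluster can run it at launch, and applying the settings in LZO_SPARK_CONF to the cluster's Spark configuration, depend on the workspace and are left out here.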

Read LZO compressed files
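
Once the codec is installed, reading an LZO compressed text file might look like the following minimal PySpark sketch. It assumes the hadoop-lzo jar is on the class path, so that the input format class com.hadoop.mapreduce.LzoTextInputFormat is available. The helper name read_lzo_text and the example path are hypothetical.

```python
# Input format class provided by the hadoop-lzo project.
LZO_TEXT_INPUT_FORMAT = "com.hadoop.mapreduce.LzoTextInputFormat"


def read_lzo_text(sc, path):
    """Return an RDD of text lines read from LZO compressed files.

    sc is an existing SparkContext (in a notebook, the predefined `sc`).
    """
    pairs = sc.newAPIHadoopFile(
        path,
        LZO_TEXT_INPUT_FORMAT,
        "org.apache.hadoop.io.LongWritable",  # key: byte offset
        "org.apache.hadoop.io.Text",          # value: line contents
    )
    # Each record is an (offset, line) pair; keep only the line.
    return pairs.map(lambda kv: kv[1])
```

In a notebook this could be called as read_lzo_text(sc, "/tmp/example.lzo").take(5), where the path is a placeholder for your own data.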