Vacuum

Clean up files associated with a table. The syntax and behavior of this command differ for Spark tables and Databricks Delta tables.

Vacuum a Spark table

VACUUM ([db_name.]table_name|path) [RETAIN num HOURS]

Recursively vacuum directories associated with the Spark table and remove uncommitted files older than a retention threshold. The default threshold is 7 days (168 hours). DBIO automatically triggers VACUUM operations as data is written. See Clean up uncommitted files for more information.

RETAIN num HOURS
The retention threshold, in hours.
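
As an illustration of the Spark table syntax, the statements below are a minimal sketch. The database and table names (sales, events) are hypothetical, and the 24-hour retention value is an arbitrary choice for the example, not a recommendation.

```sql
-- Hypothetical table sales.events; no RETAIN clause, so the default
-- threshold of 7 days applies.
VACUUM sales.events;

-- Same hypothetical table, removing uncommitted files older than
-- 24 hours (illustrative value).
VACUUM sales.events RETAIN 24 HOURS;
```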

Vacuum a Databricks Delta table

VACUUM ([db_name.]table_name|path) [RETAIN num HOURS] [DRY RUN]

Recursively vacuum directories associated with the Databricks Delta table and remove files that are no longer in the transaction log and are older than a retention threshold. The default threshold is 7 days (168 hours). Unlike VACUUM on Spark tables, VACUUM on Databricks Delta tables is not triggered automatically; it must be run explicitly. See Garbage collection for more information.

RETAIN num HOURS
The retention threshold, in hours.
DRY RUN
Return a list of files to be deleted, without deleting them.
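
As an illustration of the Databricks Delta syntax, the statements below are a minimal sketch. The table name sales.events_delta is hypothetical, and 168 hours is simply the 7-day default written out explicitly.

```sql
-- Hypothetical Databricks Delta table sales.events_delta.
-- List the files that would be deleted under the default 7-day threshold.
VACUUM sales.events_delta DRY RUN;

-- 7 days * 24 hours = 168 hours: the default threshold stated explicitly.
VACUUM sales.events_delta RETAIN 168 HOURS;
```

Running the DRY RUN form first is one way to review the list of files before they are removed.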