When you create a cluster, you can either provide a fixed number of workers for the cluster or provide a minimum and maximum number of workers for the cluster.
When you specify a fixed-size cluster, Azure Databricks ensures that your cluster has the specified number of workers. When you provide a range for the number of workers, Databricks chooses the appropriate number of workers required to run your job.
To allow Databricks to resize your cluster automatically, you specify that the cluster is autoscaling and provide the min and max range of workers.
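As a rough illustration of the two options, the sketch below builds a cluster specification as a Python dictionary. The `num_workers` and `autoscale` (`min_workers`, `max_workers`) field names follow the Databricks Clusters REST API; the helper name `cluster_spec`, the cluster names, and the runtime and node type placeholders are assumptions made for this example, not values from this page.

```python
def cluster_spec(name, spark_version, node_type_id, *, num_workers=None, autoscale=None):
    """Build a cluster specification with a fixed size or an autoscaling range.

    `autoscale` is a (min_workers, max_workers) tuple. Exactly one of
    `num_workers` and `autoscale` must be given.
    """
    if (num_workers is None) == (autoscale is None):
        raise ValueError("specify exactly one of num_workers or autoscale")

    spec = {
        "cluster_name": name,
        "spark_version": spark_version,
        "node_type_id": node_type_id,
    }
    if autoscale is not None:
        min_workers, max_workers = autoscale
        # Sanity check assumed for this sketch: the range must be well ordered.
        if not 0 <= min_workers <= max_workers:
            raise ValueError("need 0 <= min_workers <= max_workers")
        spec["autoscale"] = {"min_workers": min_workers, "max_workers": max_workers}
    else:
        spec["num_workers"] = num_workers
    return spec


# Placeholder values; substitute a runtime version and node type valid in your workspace.
fixed = cluster_spec("fixed-demo", "<runtime-version>", "<node-type>", num_workers=8)
autoscaling = cluster_spec("autoscale-demo", "<runtime-version>", "<node-type>", autoscale=(2, 8))
```

The fixed-size specification carries only `num_workers`, while the autoscaling one carries only the `autoscale` range, which mirrors the either/or choice described above.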
Autoscaling works best with Databricks Runtime 3.4 and above.
When you select autoscaling, the cluster size is automatically adjusted between the minimum and maximum number of workers during the cluster’s lifetime.
During runtime, Databricks dynamically reallocates workers to account for the characteristics of your job. Certain parts of your pipeline may be more computationally demanding than others, and Databricks automatically adds workers during these phases of your job (and removes them when they’re no longer needed).
Autoscaling makes it easier to achieve high cluster utilization, because you don’t need to provision the cluster to match a workload. This applies especially to workloads whose requirements change over time (like exploring a dataset during the course of a day), but it can also apply to a one-time shorter workload whose provisioning requirements are unknown. Autoscaling thus offers two advantages:
- Workloads can run faster compared to a constant-sized under-provisioned cluster.
- Autoscaling clusters can reduce overall costs compared to a statically-sized cluster.
Depending on the workload and on the size of the static cluster you compare against, autoscaling gives you one or both of these benefits. Note that the cluster size can drop below the selected minimum number of workers when the cloud provider terminates instances. In this case, Databricks keeps retrying to increase the cluster size.
You enable autoscaling by selecting the Enable Autoscaling checkbox at the time of cluster creation and entering the minimum and maximum number of workers.
Databricks monitors load on Spark clusters and decides whether to scale a cluster up or down and by how much. If a cluster has pending Spark tasks, the cluster scales up. If a cluster does not have any pending Spark tasks, the cluster scales down. The autoscaling algorithm uses exponential steps to ensure that users experience fast workloads while maintaining efficient cluster utilization.
Clusters with no pending tasks do not scale up. This usually indicates that the cluster is fully utilized, and adding more nodes will not make the processing faster.
For example, a cluster that currently has 16 running tasks and 16 pending tasks (total tasks minus running tasks) will be scaled up.
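The scaling rule described above can be sketched as a small simulation. This is only an illustration under stated assumptions: the exact Databricks algorithm is not given here, so the doubling step size, the step reset when the direction changes, and the function name `simulate_autoscaling` are all hypothetical. What it does take from the text is that pending tasks trigger a scale-up, no pending tasks trigger a scale-down, the step grows exponentially, and the size stays within the configured bounds.

```python
def simulate_autoscaling(pending_per_tick, start_workers, min_workers, max_workers):
    """Return the cluster size after each tick of a toy autoscaling loop.

    Assumed behaviour (not the documented algorithm): the step starts at 1,
    doubles while the cluster keeps moving in the same direction, and resets
    to 1 when the direction flips.
    """
    workers = start_workers
    step = 1
    last_direction = 0  # +1 = scaled up, -1 = scaled down, 0 = no history yet
    sizes = []
    for pending in pending_per_tick:
        # Pending tasks mean more capacity is needed; none means scale down.
        direction = 1 if pending > 0 else -1
        step = step * 2 if direction == last_direction else 1
        # Never leave the configured [min_workers, max_workers] range.
        workers = max(min_workers, min(max_workers, workers + direction * step))
        last_direction = direction
        sizes.append(workers)
    return sizes


# Three ticks with 16 pending tasks, then two ticks with none.
sizes = simulate_autoscaling([16, 16, 16, 0, 0], start_workers=2, min_workers=2, max_workers=10)
print(sizes)  # [3, 5, 9, 8, 6]
```

In this toy run the cluster grows by 1, 2, and then 4 workers while tasks are pending, then shrinks once the backlog clears.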
If you reconfigure a static cluster to be an autoscaling cluster, Databricks immediately resizes the cluster within the minimum and maximum bounds and then starts autoscaling. As an example, the table below demonstrates what happens to clusters with a certain initial size if you reconfigure a cluster to autoscale between 5 and 10 nodes.
|Initial size|Size after reconfiguration|
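The immediate resize amounts to clamping the current size into the new range. A minimal sketch, assuming the hypothetical helper name `size_after_reconfiguration` and initial sizes chosen purely for illustration:

```python
def size_after_reconfiguration(initial_size, min_workers, max_workers):
    """Clamp the current cluster size into [min_workers, max_workers]."""
    return max(min_workers, min(initial_size, max_workers))


# Reconfigure to autoscale between 5 and 10 nodes.
# The initial sizes below are illustrative, not taken from the table.
for initial in (3, 6, 12):
    print(initial, "->", size_after_reconfiguration(initial, 5, 10))
# 3 -> 5
# 6 -> 6
# 12 -> 10
```

A cluster already inside the range keeps its size; one below the minimum grows to the minimum, and one above the maximum shrinks to the maximum.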
Autoscaling for jobs uses a different algorithm from the standard autoscaling used for interactive clusters, and is recommended only with Databricks Runtime 3.4 and above. This feature allows a jobs cluster to scale up and down more aggressively in response to load and is designed to improve resource utilization. In particular, a cluster can scale down idle VMs even when tasks are running on other VMs. To enable this feature for a job running Databricks Runtime 3.4 or higher, select the Enable Autoscaling option on the Configure Cluster page. For a demonstration of the benefits of job autoscaling, see the blog post on Optimized Autoscaling.
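To make the idea of releasing idle VMs while other VMs stay busy concrete, here is a minimal sketch. It is not the Databricks implementation: how idle VMs are actually selected is not described here, and the function name `idle_workers_to_remove`, the worker identifiers, and the rule of never going below the minimum are assumptions for illustration.

```python
def idle_workers_to_remove(running_tasks_by_worker, min_workers):
    """Pick idle workers that could be released without going below min_workers.

    `running_tasks_by_worker` maps a worker id to its number of running tasks.
    Workers with running tasks are never selected.
    """
    idle = [worker for worker, tasks in running_tasks_by_worker.items() if tasks == 0]
    # How many workers can go before the cluster would hit its minimum size.
    removable = max(0, len(running_tasks_by_worker) - min_workers)
    return idle[:removable]


# Hypothetical cluster: two busy workers and two idle ones.
load = {"worker-1": 4, "worker-2": 0, "worker-3": 0, "worker-4": 2}
print(idle_workers_to_remove(load, min_workers=2))  # ['worker-2', 'worker-3']
print(idle_workers_to_remove(load, min_workers=3))  # ['worker-2']
```

The busy workers keep running their tasks in both cases; only the number of idle workers released changes with the configured minimum.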