Deploying Azure Databricks in your Azure Virtual Network

The default deployment of Azure Databricks is a fully managed service on Azure: all data plane resources, including the virtual network (VNet) that all clusters are associated with, are deployed to a locked resource group. If you require network customization, however, you can deploy Azure Databricks data plane resources in your own virtual network (sometimes called VNet injection).

Deploying Azure Databricks data plane resources to your own virtual network also lets you take advantage of flexible CIDR ranges (anywhere between /16-/24 for the virtual network and between /18-/26 for the subnets).

Preview

The ability to deploy Azure Databricks to your own virtual network is in Public Preview.

Important

You cannot replace the virtual network for an existing workspace. If your current workspace cannot accommodate the required number of active cluster nodes, we recommend that you create another workspace in a larger virtual network. Follow the detailed migration steps to copy resources (notebooks, cluster configurations, jobs) from the old workspace to the new one.

Virtual network requirements

The virtual network that you deploy your Azure Databricks workspace to must meet the following requirements:

  • Location: The virtual network must reside in the same location as the Azure Databricks workspace.

  • Subnets: The virtual network must include two subnets dedicated to Azure Databricks:

    • A private subnet with a configured network security group that allows cluster-internal communication.
    • A public subnet with a configured network security group that allows communication with the Azure Databricks control plane.
  • Address space: A CIDR block between /16 - /24 for the virtual network and a CIDR block between /18 - /26 for the private and public subnets.

  • Whitelisting: All outbound and inbound traffic between the subnets and the Azure Databricks control plane must be whitelisted.
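The address-space requirement above can be checked before deployment. The following is a minimal sketch using Python's standard `ipaddress` module; the function name `check_address_space` and the example addresses are hypothetical, and the containment and overlap checks are general sanity checks added here rather than rules stated in this topic.

```python
import ipaddress

VNET_PREFIX_RANGE = (16, 24)    # /16 (largest) through /24 (smallest)
SUBNET_PREFIX_RANGE = (18, 26)  # /18 (largest) through /26 (smallest)


def check_address_space(vnet_cidr, public_cidr, private_cidr):
    """Return a list of problems found; an empty list means the sizes fit."""
    problems = []
    vnet = ipaddress.ip_network(vnet_cidr)
    if not VNET_PREFIX_RANGE[0] <= vnet.prefixlen <= VNET_PREFIX_RANGE[1]:
        problems.append(f"virtual network {vnet} is outside /16 - /24")

    subnets = {}
    for name, cidr in (("public", public_cidr), ("private", private_cidr)):
        subnet = ipaddress.ip_network(cidr)
        subnets[name] = subnet
        if not SUBNET_PREFIX_RANGE[0] <= subnet.prefixlen <= SUBNET_PREFIX_RANGE[1]:
            problems.append(f"{name} subnet {subnet} is outside /18 - /26")
        # Assumption: each subnet must lie inside the virtual network.
        if not subnet.subnet_of(vnet):
            problems.append(f"{name} subnet {subnet} is not inside {vnet}")

    # Assumption: the two dedicated subnets must not share addresses.
    if subnets["public"].overlaps(subnets["private"]):
        problems.append("public and private subnets overlap")
    return problems


print(check_address_space("10.0.0.0/16", "10.0.0.0/18", "10.0.64.0/18"))  # prints []
```

An empty list means the three ranges satisfy the size limits; any other result names the range that needs to change.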

You can use the Azure Databricks workspace deployment interface in the Azure portal to automatically configure an existing virtual network with the required subnets, network security group, and whitelisting settings, or you can use Azure-Databricks-supplied ARM templates to configure your virtual network and deploy your workspace.

Create the Azure Databricks workspace in the Azure portal

This section describes how to create an Azure Databricks workspace in the Azure portal and deploy it in your own existing virtual network. Azure Databricks updates the virtual network with two new subnets and network security groups using CIDR ranges provided by you, whitelists inbound and outbound subnet traffic, and deploys the workspace to the updated virtual network.

Note

If you want more control over the configuration of the virtual network—for example, you may want to use existing subnets, use existing network security groups, or create your own security rules—you can use Azure-Databricks-supplied ARM templates instead of the portal UI. See Advanced configuration using ARM templates.

Prerequisites

You must prepare a virtual network to which you will deploy the Azure Databricks workspace:

  • You can use an existing virtual network or create a new one, but the virtual network must be in the same region as the Azure Databricks workspace that you plan to create.

  • A CIDR range between /16 - /24 is required for the virtual network.

    Warning

    A workspace with a smaller virtual network (that is, a longer CIDR prefix such as /24) can run out of IP addresses (network space) more quickly than a workspace with a larger virtual network. For example, a workspace with a /24 virtual network and /26 subnets can have a maximum of 64 nodes active at a time, whereas a workspace with a /20 virtual network and /22 subnets can house a maximum of 1024 nodes.

    Your subnets will be created automatically when you configure your workspace, and you will have the opportunity to provide the CIDR range for the subnets during configuration.
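The node limits quoted in the warning match the raw number of addresses in a subnet, which is 2 raised to (32 minus the prefix length). The sketch below reproduces that arithmetic with Python's standard `ipaddress` module; the helper name `addresses_in` and the 10.0.0.0 addresses are illustrative only, and the count does not account for any addresses the platform may reserve.

```python
import ipaddress


def addresses_in(cidr):
    """Raw number of IPv4 addresses in a CIDR block: 2 ** (32 - prefix)."""
    return ipaddress.ip_network(cidr).num_addresses


for subnet in ("10.0.0.0/26", "10.0.0.0/22"):
    print(subnet, addresses_in(subnet))
# prints:
# 10.0.0.0/26 64
# 10.0.0.0/22 1024
```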

Configure the virtual network

  1. In the Azure portal, select + Create a resource > Analytics > Azure Databricks to open the Azure Databricks Service dialog.

  2. Follow the configuration steps described in Step 2: Create an Azure Databricks workspace in the Getting Started Guide, and select the Deploy Azure Databricks workspace in your Virtual Network option.

  3. Select the virtual network you want to use.

  4. Provide CIDR ranges in a block between /18 - /26 for two subnets, dedicated to Azure Databricks:

    • A public subnet will be created with an associated network security group that allows communication with the Azure Databricks control plane.
    • A private subnet will be created with an associated network security group that allows cluster-internal communication.
  5. Click Create to deploy the Azure Databricks workspace to the virtual network.
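Step 4 asks for two CIDR ranges. One way to pick them, sketched below under the assumption that the first two equally sized blocks of the virtual network are free, is to carve them out with Python's standard `ipaddress` module. The function name `propose_subnets` and the example virtual network are hypothetical; the portal does not require the ranges to be chosen this way.

```python
import ipaddress
from itertools import islice


def propose_subnets(vnet_cidr, subnet_prefix):
    """Return (public, private) CIDR strings: the first two blocks of the given size."""
    vnet = ipaddress.ip_network(vnet_cidr)
    if not 18 <= subnet_prefix <= 26:
        raise ValueError("subnet prefix must be between /18 and /26")
    if subnet_prefix <= vnet.prefixlen:
        raise ValueError("subnets must be smaller than the virtual network")
    # subnets() yields blocks lazily, so only the first two are materialized.
    public, private = islice(vnet.subnets(new_prefix=subnet_prefix), 2)
    return str(public), str(private)


print(propose_subnets("10.1.0.0/20", 22))  # prints ('10.1.0.0/22', '10.1.4.0/22')
```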

Note

When you create a new virtual network (VNet) workspace in Azure Databricks, an Azure Databricks Managed Resource Group is created by default. This resource group contains a virtual network named workers-vnet and a network security group named workers-sg. The resource group is created as part of the normal operation of the Azure Databricks system. The resource group is not modifiable, and it is not used to create virtual machines. Only the customer-managed group is used to create virtual machines.

Advanced configuration using ARM templates

If you want more control over the configuration of the virtual network—for example, you want to use existing subnets, use existing network security groups, or create your own security rules—you can use the following ARM templates instead of the portal-UI-based automatic virtual network configuration and workspace deployment.

All in one

To create a virtual network, network security groups, and Azure Databricks workspace all in one, use the All-in-one Template for Databricks VNet Injected Workspaces.

When you use this template, you do not need to do any manual whitelisting of subnet traffic.

Network security groups

To create network security groups with the required rules for an existing virtual network, use the Network Security Group Template for Databricks VNet Injection.

When you use this template, you do not need to do any manual whitelisting of subnet traffic.

Virtual network

To create a virtual network with the proper public and private subnets, use the Virtual Network Template for Databricks VNet Injection.

If you use this template without also using the network security groups template, you must manually add whitelisting rules to the network security groups you use with the virtual network.

Azure Databricks workspace

To deploy an Azure Databricks workspace to an existing virtual network that has public and private subnets and properly configured network security groups already set up, use the Workspace Template for Databricks VNet Injection.

If you use this template without also using the network security groups template, you must manually add whitelisting rules to the network security groups you use with the virtual network.

Whitelisting subnet traffic

If you do not use the portal-UI-based automatic virtual network configuration and workspace deployment or ARM templates to create your network security groups, you must manually whitelist the following traffic on your subnets. IP addresses are listed by region in the second table.

Direction  Protocol  Source                Source Port  Destination            Destination Port
Inbound              VirtualNetwork
Inbound              Control Plane NAT IP                                      22
Inbound              Control Plane NAT IP                                      5557
Outbound                                                Webapp IP
Outbound                                                SQL (service tag)
Outbound                                                Storage (service tag)
Outbound                                                VirtualNetwork

Whitelist subnet traffic using the following IP addresses. For SQL (metastore) and Storage (artifact and log storage), you should use the Sql and Storage service tags.

Azure Databricks Region   Service            Public IP
Australia Central         Control Plane NAT  13.70.105.50/32
                          Webapp             13.75.218.172/32
Australia Central 2       Control Plane NAT  13.70.105.50/32
                          Webapp             13.75.218.172/32
Australia East            Control Plane NAT  13.70.105.50/32
                          Webapp             13.75.218.172/32
Australia Southeast       Control Plane NAT  13.70.105.50/32
                          Webapp             13.75.218.172/32
Canada Central            Control Plane NAT  40.85.223.25/32
                          Webapp             13.71.184.74/32
Canada East               Control Plane NAT  40.85.223.25/32
                          Webapp             13.71.184.74/32
Central India             Control Plane NAT  104.211.101.14/32
                          Webapp             104.211.89.81/32
Central US                Control Plane NAT  23.101.152.95/32
                          Webapp             40.70.58.221/32
East Asia                 Control Plane NAT  52.187.0.85/32
                          Webapp             52.187.145.107/32
East US                   Control Plane NAT  23.101.152.95/32
                          Webapp             40.70.58.221/32
East US 2                 Control Plane NAT  23.101.152.95/32
                          Webapp             40.70.58.221/32
Japan East                Control Plane NAT  13.78.19.235/32
                          Webapp             52.246.160.72/32
Japan West                Control Plane NAT  13.78.19.235/32
                          Webapp             52.246.160.72/32
Korea Central             Control Plane NAT  52.141.6.181/32
                          Webapp             52.141.22.164/32
North Central US          Control Plane NAT  23.101.152.95/32
                          Webapp             40.70.58.221/32
North Europe              Control Plane NAT  23.100.0.135/32
                          Webapp             52.232.19.246/32
South Africa North        Control Plane NAT  40.127.5.82/32
                          Webapp             102.133.224.24/32
South Central US          Control Plane NAT  40.83.178.242/32
                          Webapp             40.118.174.12/32
South India               Control Plane NAT  104.211.101.14/32
                          Webapp             104.211.89.81/32
Southeast Asia            Control Plane NAT  52.187.0.85/32
                          Webapp             52.187.145.107/32
UK South                  Control Plane NAT  51.140.203.27/32
                          Webapp             51.140.204.4/32
UK West                   Control Plane NAT  51.140.203.27/32
                          Webapp             51.140.204.4/32
West Europe               Control Plane NAT  23.100.0.135/32
                          Webapp             52.232.19.246/32
West India                Control Plane NAT  104.211.101.14/32
                          Webapp             104.211.89.81/32
West US                   Control Plane NAT  40.83.178.242/32
                          Webapp             40.118.174.12/32
West US 2                 Control Plane NAT  40.83.178.242/32
                          Webapp             40.118.174.12/32
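To show how the two tables combine, the sketch below builds the list of traffic entries for one region as plain Python data. It is illustrative only: the names `REGION_IPS` and `required_rules` and the dictionary keys are hypothetical, only three regions are copied from the table above, and it assumes that the named endpoint is the source for inbound entries and the destination for outbound entries, with 22 and 5557 as destination ports. It does not call any Azure API.

```python
# Values copied from the region table above; extend with other regions as needed.
REGION_IPS = {
    "East US": {"control_plane_nat": "23.101.152.95/32", "webapp": "40.70.58.221/32"},
    "West Europe": {"control_plane_nat": "23.100.0.135/32", "webapp": "52.232.19.246/32"},
    "Japan East": {"control_plane_nat": "13.78.19.235/32", "webapp": "52.246.160.72/32"},
}


def required_rules(region):
    """Return the traffic entries to allow for a region, as a list of dicts."""
    ips = REGION_IPS[region]  # raises KeyError for a region not copied in above
    nat = ips["control_plane_nat"]
    return [
        {"direction": "Inbound", "source": "VirtualNetwork"},
        {"direction": "Inbound", "source": nat, "destination_port": 22},
        {"direction": "Inbound", "source": nat, "destination_port": 5557},
        {"direction": "Outbound", "destination": ips["webapp"]},
        {"direction": "Outbound", "destination": "Sql"},      # service tag
        {"direction": "Outbound", "destination": "Storage"},  # service tag
        {"direction": "Outbound", "destination": "VirtualNetwork"},
    ]


for rule in required_rules("West Europe"):
    print(rule)
```

Keeping the entries as data makes it straightforward to compare them against the rules actually present in a network security group, whichever tool you use to read those rules.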

Troubleshooting

Workspace launch errors

Launching a workspace in a custom virtual network fails on the Azure Databricks sign-in screen with the following error: “We’ve encountered an error creating your workspace. Please make sure the custom network configuration is correct and try again.”
This error is caused by a network configuration not meeting requirements. Confirm that you followed the instructions in this topic when you created the workspace.

Cluster creation errors

Instances Unreachable: Resources were not reachable via SSH.
Possible cause: traffic from control plane to workers is blocked. Fix by ensuring that inbound security rules meet requirements. If you are deploying to an existing virtual network connected to your on-premises network, review your setup using the information supplied in Connecting your Azure Databricks Workspace to your On-Premises Network.

Unexpected Launch Failure: An unexpected error was encountered while setting up the cluster. Please retry and contact Azure Databricks if the problem persists. Internal error message: Timeout while placing node.
Possible cause: traffic from workers to Azure Storage endpoints is blocked. Fix by ensuring that outbound security rules meet requirements. If you are using custom DNS servers, also check the status of the DNS servers in your virtual network.

Cloud Provider Launch Failure: A cloud provider error was encountered while setting up the cluster. See the Azure Databricks guide for more information. Azure error code: AuthorizationFailed/InvalidResourceReference.
Possible cause: the virtual network or subnets no longer exist. Make sure the virtual network and subnets exist.

Cluster terminated. Reason: Spark Startup Failure: Spark was not able to start in time. This issue can be caused by a malfunctioning Hive metastore, invalid Spark configurations, or malfunctioning init scripts. Please refer to the Spark driver logs to troubleshoot this issue, and contact Databricks if the problem persists. Internal error message: Spark failed to start: Driver failed to start in time.
Possible cause: the container cannot talk to the hosting instance or the DBFS storage account. Fix by adding a custom route to the subnets for the DBFS storage account with the next hop being Internet.

Notebook command errors

Command is hanging

Possible cause: worker-to-worker communication is blocked. Fix by making sure the inbound security rules meet requirements.

Notebook workflow fails with the exception: com.databricks.WorkflowException: org.apache.http.conn.ConnectTimeoutException
Possible cause: traffic from workers to Azure Databricks Webapp is blocked. Fix by making sure the outbound security rules meet requirements.
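
The error-to-cause pairs in this section can be kept as a small lookup for triaging log output. The sketch below is one possible arrangement, not an Azure Databricks tool: the names `KNOWN_ERRORS` and `likely_cause` are hypothetical, and the marker strings are fragments taken from the error messages quoted above.

```python
# Each entry: (fragments that identify the error, possible cause from this section).
KNOWN_ERRORS = [
    (("Instances Unreachable",),
     "Traffic from control plane to workers is blocked; check inbound security rules."),
    (("Timeout while placing node",),
     "Traffic from workers to Azure Storage endpoints is blocked; "
     "check outbound security rules and any custom DNS servers."),
    (("AuthorizationFailed", "InvalidResourceReference"),
     "The virtual network or subnets no longer exist."),
    (("Driver failed to start in time",),
     "The container cannot talk to the hosting instance or the DBFS storage account; "
     "add a custom route for the DBFS storage account with next hop Internet."),
    (("ConnectTimeoutException",),
     "Traffic from workers to the Azure Databricks Webapp is blocked; "
     "check outbound security rules."),
]


def likely_cause(message):
    """Return the possible cause for a known error message, or None if unrecognized."""
    for markers, cause in KNOWN_ERRORS:
        if any(marker in message for marker in markers):
            return cause
    return None


print(likely_cause("Azure error code: InvalidResourceReference"))
# prints: The virtual network or subnets no longer exist.
```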