Deploying Azure Databricks in your Azure Virtual Network (Preview)

The default deployment of Azure Databricks is a fully managed service on Azure: all data plane resources, including a virtual network that all clusters will be associated with, are deployed to a locked resource group. If you require network customization, however, you can deploy Azure Databricks in your own Azure Virtual Network (VNet), enabling you to:

  • Connect Azure Databricks to other Azure services (such as Azure Storage) in a secure manner using service endpoints.
  • Connect to on-premises data sources for use with Azure Databricks, taking advantage of user-defined routes.
  • Connect Azure Databricks to a network virtual appliance to inspect all outbound traffic and take actions according to allow and deny rules.
  • Configure Azure Databricks to use custom DNS.
  • Configure inbound network security group (NSG) rules to allow traffic to additional ports, such as the port used by H2O Flow, or outbound NSG rules to restrict egress traffic.
  • Deploy Azure Databricks workspaces in your existing Azure Virtual Network.

Deploying Azure Databricks to your own VNet also lets you take advantage of smaller CIDR ranges: between /16 and /20 for the virtual network, and between /18 and /22 for the subnets.
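To make those prefix lengths concrete, the following minimal Python sketch computes the total number of IPv4 addresses in a block of each size. The function name address_count is hypothetical and used only for illustration; the counts are raw block sizes and do not subtract any addresses that the platform reserves.

```python
def address_count(prefix_len: int) -> int:
    """Total IPv4 addresses in a CIDR block with the given prefix length."""
    if not 0 <= prefix_len <= 32:
        raise ValueError("IPv4 prefix length must be between 0 and 32")
    return 2 ** (32 - prefix_len)


if __name__ == "__main__":
    # Prefix ranges quoted in the text: /16-/20 (VNet), /18-/22 (subnets).
    for label, prefixes in (("VNet", range(16, 21)), ("Subnet", range(18, 23))):
        for p in prefixes:
            print(f"{label} /{p}: {address_count(p)} addresses")
```

A larger prefix length means a smaller block, so a /20 virtual network holds one sixteenth of the addresses of a /16.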

Note

The ability to deploy Azure Databricks to your own VNet is a preview feature and requires enrollment. To enroll, your subscription must be whitelisted by Azure Databricks. Work directly with your account team, or send the following details to feedback_azuredatabricks@service.microsoft.com:

  • Subscription ID
  • Region
  • Description of why and how you plan to use the feature

You will be contacted and provided with detailed deployment instructions and assistance.

Workflow

To deploy Azure Databricks in your own Azure Virtual Network, do the following:

  1. Prepare or update a virtual network to which you would like to deploy an Azure Databricks workspace.

    Two new subnets, dedicated to Azure Databricks, are required in the virtual network:

    • A private subnet with an associated network security group that allows cluster-internal communication.
    • A public subnet with an associated network security group that allows communication with the Azure Databricks control plane.

    The virtual network requires a CIDR block between /16 and /20, and the private and public subnets each require a CIDR block between /18 and /22.

  2. Create an Azure Databricks workspace in the configured virtual network.

You will be provided with ARM templates for use in configuring your Azure Virtual Network and creating your Azure Databricks workspace.
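Before you deploy, it can help to sanity-check the address plan from step 1. The sketch below is one possible way to do that with Python's standard ipaddress module; it is not part of the provided ARM templates. The function name check_address_plan and the example address ranges are hypothetical. In addition to the prefix ranges stated above, it assumes the usual virtual network rules that each subnet must lie inside the virtual network's address space and that the two subnets must not overlap.

```python
import ipaddress

# Prefix-length limits taken from step 1 of the workflow.
VNET_PREFIX_RANGE = (16, 20)
SUBNET_PREFIX_RANGE = (18, 22)


def check_address_plan(vnet_cidr, private_cidr, public_cidr):
    """Return a list of problems; an empty list means the plan passes."""
    problems = []
    vnet = ipaddress.ip_network(vnet_cidr)
    subnets = {
        "private": ipaddress.ip_network(private_cidr),
        "public": ipaddress.ip_network(public_cidr),
    }

    lo, hi = VNET_PREFIX_RANGE
    if not lo <= vnet.prefixlen <= hi:
        problems.append(f"VNet prefix /{vnet.prefixlen} is outside /{lo}-/{hi}")

    lo, hi = SUBNET_PREFIX_RANGE
    for name, net in subnets.items():
        if not lo <= net.prefixlen <= hi:
            problems.append(f"{name} subnet prefix /{net.prefixlen} is outside /{lo}-/{hi}")
        # Assumption: subnets must fall within the VNet address space.
        if not net.subnet_of(vnet):
            problems.append(f"{name} subnet {net} is not inside VNet {vnet}")

    # Assumption: the two dedicated subnets must not share addresses.
    if subnets["private"].overlaps(subnets["public"]):
        problems.append("private and public subnets overlap")

    return problems


if __name__ == "__main__":
    # Hypothetical address ranges, for illustration only.
    for problem in check_address_plan("10.0.0.0/16", "10.0.0.0/18", "10.0.64.0/18"):
        print(problem)
```

Returning a list of problems rather than raising on the first one lets you see every issue in the plan in a single pass.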