Configuring SCIM Provisioning for Microsoft Azure Active Directory

To enable provisioning to Azure Databricks using Azure Active Directory (Azure AD), you must create an enterprise application for each Azure Databricks workspace.

Note

Provisioning configuration is entirely separate from the process of setting up authentication and conditional access for Azure Databricks workspaces. Authentication for Azure Databricks is handled automatically by Azure Active Directory, using the OpenID Connect protocol flow. Conditional access, which lets you create rules to require multi-factor authentication or restrict logins to local networks, can be established at the service level. For instructions, see Conditional Access.

Prerequisites

  • Your Azure AD account must be a Premium edition account, and you must be a global administrator for that account to enable provisioning.

Create an enterprise application and connect to the Azure Databricks SCIM API

In the following examples, replace <databricks-instance> with the <region>.azuredatabricks.net domain name of your Azure Databricks deployment.

  1. Generate a personal access token in Azure Databricks and copy it. See Generate a token.

    You will provide this to Azure AD in a subsequent step.

    Important

    Generate this token as an Azure Databricks admin who will not be managed by the Azure AD enterprise application that you set up below. An Azure Databricks admin user who is managed by this enterprise application can be deprovisioned using Azure AD, which would cause your SCIM provisioning integration to be disabled.

  2. In your Azure Portal, go to Azure Active Directory > Enterprise Applications.

  3. Click + New Application above the application list, and then, under Add your own app, click Non-gallery application.

  4. Enter a Name for the application and click Add.

    Use a name that will help administrators find it, like <your-workspace-name>-provisioning.

  5. Under the Manage menu, click Provisioning.

  6. From the Provisioning Mode drop-down, select Automatic.

  7. Enter the Tenant URL:

    https://<databricks-instance>/api/2.0/preview/scim
    

    For example:

    https://westus.azuredatabricks.net/api/2.0/preview/scim
    
  8. In the Secret Token field, enter the Azure Databricks personal access token that you generated in step 1.

  9. Click Test Connection and wait for the message that confirms that the credentials are authorized to enable provisioning.

  10. Click Save.
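The connection that Test Connection checks can also be sketched outside the portal. The following Python sketch uses only the standard library to build the Tenant URL from step 7 and prepare a bearer-token request with the personal access token from step 1. The helper names, the `/v2/Users` suffix, and the `Accept` header are assumptions made for illustration; they are not part of the portal procedure above.

```python
import urllib.request

SCIM_PATH = "/api/2.0/preview/scim"  # path shown in step 7


def tenant_url(databricks_instance: str) -> str:
    """Return the Tenant URL for a <region>.azuredatabricks.net domain."""
    host = databricks_instance.strip().rstrip("/")
    if host.startswith("https://"):
        host = host[len("https://"):]
    return f"https://{host}{SCIM_PATH}"


def build_users_request(databricks_instance: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET request authenticated with the token."""
    # Assumption: the SCIM users resource is served under <Tenant URL>/v2/Users.
    url = tenant_url(databricks_instance) + "/v2/Users"
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/scim+json",
        },
        method="GET",
    )


def check_connection(databricks_instance: str, token: str) -> int:
    """Send the request and return the HTTP status code.

    This performs a real network call; urllib raises HTTPError for
    4xx and 5xx responses instead of returning them.
    """
    request = build_users_request(databricks_instance, token)
    with urllib.request.urlopen(request) as response:
        return response.status
```

Calling `tenant_url("westus.azuredatabricks.net")` yields the same URL as the example in step 7. Only `check_connection` touches the network, so the URL and header construction can be inspected without a live workspace.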

Define user and group attribute mappings

User and group mappings define how data should flow between Azure AD and your Azure Databricks workspace.

  1. After you test and save the connection, go to the Mappings section and click Synchronize Azure Active Directory Users to customappsso.

  2. Under Attribute Mappings, delete all of the deletable default mappings.

    You may not be able to delete the first mapping in the table. You can modify it in the next step.

    [Image: attribute mappings list]
  3. Select Show advanced options and click Edit attribute list for customappsso.

  4. In the Edit Attribute List, make the following changes and click Save:

    • Delete the name.formatted attribute.
    • Select the Primary and Required options for the userName attribute.
  5. Click the mailNickname attribute mapping to open the Edit Attribute pane and make the following changes:

    • Change Source attribute mailNickname to userPrincipalName.
    • Change Target attribute externalID to userName.
  6. Click OK.

  7. In the Attribute Mappings section, click Add New Mapping to add the required attribute mappings.

    [Image: add attribute mappings]

    Add each of the following mappings, with the properties indicated.

    Mapping Type | Source attribute / Expression | Default value | Target attribute | Match object using attribute | Matching precedence | Apply this mapping
    Direct | extensionAttribute1 | | id | No | | Always
    Direct | mail | | emails[type eq "work"].value | No | | Always
    Expression | Join(" ", [givenName], [surname]) | | displayName | No | | Always
    Expression | Switch([IsSoftDeleted], , "False", "True", "True", "False") | | active | No | | Always

    When you’re done, your Attribute Mappings list will look like this:

    [Image: user attribute mappings]
  8. Click Save and return to the <Your-enterprise-application> pane to map your group attributes.

  9. Go to the Mappings section and click Synchronize Azure Active Directory Groups to customappsso.

  10. In the Attribute Mappings section, delete the following group mapping attributes:

    • objectID
    • mail
    • mailEnabled
    • securityEnabled
  11. Add a new attribute, extensionAttribute1, with the following properties:

    Mapping Type | Source attribute / Expression | Default value | Target attribute | Match object using attribute | Matching precedence | Apply this mapping
    Direct | extensionAttribute1 | | id | Yes | 2 | Always

    When you’re done, your Attribute Mappings list will look like this:

    [Image: group attribute mappings]
  12. Click Save.
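To make the user mappings concrete, the following Python sketch shows roughly what they compute for a single Azure AD user. It is an illustration only: the `join` and `switch` helpers approximate the Join and Switch mapping expressions, the input dictionary keys simply reuse the Azure AD attribute names from the steps above, and the output dictionary is a hypothetical stand-in for the data sent to Azure Databricks, not the exact payload that Azure AD produces.

```python
def join(separator, *values):
    """Approximate Join(separator, value1, value2, ...)."""
    return separator.join(values)


def switch(source, default, *pairs):
    """Approximate Switch(source, default, key1, value1, key2, value2, ...)."""
    mapping = dict(zip(pairs[0::2], pairs[1::2]))
    return mapping.get(source, default)


def to_target_user(aad_user: dict) -> dict:
    """Apply the user attribute mappings from this section to one user."""
    soft_deleted = str(aad_user.get("IsSoftDeleted", False))  # "True" or "False"
    return {
        # userPrincipalName -> userName (the edited mailNickname mapping)
        "userName": aad_user["userPrincipalName"],
        # extensionAttribute1 -> id
        "id": aad_user.get("extensionAttribute1"),
        # mail -> emails[type eq "work"].value
        "emails": [{"type": "work", "value": aad_user["mail"]}],
        # Join(" ", [givenName], [surname]) -> displayName
        "displayName": join(" ", aad_user["givenName"], aad_user["surname"]),
        # Switch([IsSoftDeleted], , "False", "True", "True", "False") -> active
        "active": switch(soft_deleted, None, "False", "True", "True", "False"),
    }


example = to_target_user({
    "userPrincipalName": "ada@example.com",
    "mail": "ada@example.com",
    "givenName": "Ada",
    "surname": "Lovelace",
    "IsSoftDeleted": False,
})
```

In this sketch the Switch expression inverts the soft-delete flag, so a user who is not soft-deleted in Azure AD is marked active in the target. The value stays a string, as in the expression itself.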

Add owners and start provisioning

  1. On the <Your-enterprise-application> pane, go to Manage > Owners.

  2. Add yourself and any other users who should be able to manage the provisioning integration.

  3. Go to Manage > Provisioning and, under Settings, set Scope to Sync only assigned users and groups.

    This option syncs only users and groups assigned to the enterprise application, and is our recommended approach.

  4. To start the synchronization of users and groups from Azure AD to Azure Databricks, set the Provisioning Status switch to On.

  5. Click Save.

  6. Test your provisioning setup:

    1. Go to Manage > Users and groups.
    2. Add some users and groups: click Add, select the users and groups, and click the Assign button.
    3. Go to Manage > Provisioning and select Clear current state and restart synchronization.
    4. Wait a few minutes and check that the users and groups have been added to your Azure Databricks workspace.

Any additional users and groups that you add and assign will automatically be provisioned when Azure AD schedules the next sync.
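The check in step 6 can be partly automated. The sketch below assumes that you have already fetched a user list from the workspace SCIM endpoint and that it has the standard SCIM 2.0 list shape, with a `Resources` array whose entries carry `userName`. It then reports which assigned users have not yet appeared. The function names and the sample data are hypothetical.

```python
def provisioned_user_names(list_response: dict) -> set:
    """Collect lowercase userName values from a SCIM list response."""
    return {
        resource["userName"].lower()
        for resource in list_response.get("Resources", [])
        if "userName" in resource
    }


def missing_users(assigned_user_names, list_response: dict) -> list:
    """Return assigned users that are not present in the workspace yet."""
    present = provisioned_user_names(list_response)
    return sorted(
        name for name in assigned_user_names if name.lower() not in present
    )


# Hypothetical sample data shaped like a SCIM 2.0 list response.
sample_response = {
    "schemas": ["urn:ietf:params:scim:api:messages:2.0:ListResponse"],
    "totalResults": 2,
    "Resources": [
        {"userName": "ada@example.com"},
        {"userName": "Grace@example.com"},
    ],
}

still_missing = missing_users(
    ["ada@example.com", "grace@example.com", "linus@example.com"],
    sample_response,
)
```

An empty result means every assigned user has been provisioned. A non-empty result shortly after assignment may only mean that the next scheduled sync has not run yet.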

Important

Do not assign the Azure Databricks admin whose secret token (bearer token) was used to set up this enterprise application.

Provisioning tips

  • Users and groups that existed in Azure Databricks before you enabled provisioning behave as follows during a provisioning sync:
    • They are merged if they also exist in this Azure AD enterprise application.
    • They are ignored if they don’t exist in this Azure AD enterprise application.
  • User permissions that are assigned individually and are duplicated through membership in a group remain after the group membership is removed for the user.
  • Users removed from an Azure Databricks workspace directly, using the Azure Databricks Admin console:
    • Lose access to that Azure Databricks workspace but may still have access to other Azure Databricks workspaces.
    • Will not be synced again using Azure AD provisioning, even if they remain in the enterprise application.
  • The initial Azure AD sync is triggered immediately after you turn on provisioning. Subsequent syncs are triggered every 20-40 minutes, depending on the number of users and groups in the application. See Provisioning summary report in the Azure AD documentation.
  • The “admins” group is a reserved group in Azure Databricks and cannot be removed.
  • Groups cannot be renamed in Azure Databricks; do not attempt to rename them in Azure AD.
  • You can use the Azure Databricks Groups API or the Groups UI to get a list of members of any Azure Databricks group.
  • You cannot update Azure Databricks usernames and email addresses.
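As a sketch of the Groups API tip above, the following Python helpers build a member-listing URL and split a response into user and group names. The endpoint path `/api/2.0/groups/list-members`, the `group_name` query parameter, and the `members`, `user_name`, and `group_name` response fields are assumptions about the Groups API; verify them against the Groups API reference before relying on them. The helper names and sample data are hypothetical.

```python
from urllib.parse import urlencode


def list_members_url(databricks_instance: str, group_name: str) -> str:
    """Build the URL for listing the members of one group."""
    # Assumption: endpoint path and parameter name of the Groups API.
    query = urlencode({"group_name": group_name})
    return f"https://{databricks_instance}/api/2.0/groups/list-members?{query}"


def split_members(response: dict):
    """Separate a members response into (user names, nested group names)."""
    users, groups = [], []
    for member in response.get("members", []):
        if "user_name" in member:
            users.append(member["user_name"])
        elif "group_name" in member:
            groups.append(member["group_name"])
    return users, groups


# Hypothetical response body used to exercise the parser.
sample = {
    "members": [
        {"user_name": "ada@example.com"},
        {"group_name": "data-engineers"},
    ]
}
users, groups = split_members(sample)
```

The request itself would carry the same bearer token header used for provisioning. It is omitted here so that the sketch runs without a workspace.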

Troubleshooting

Users and groups do not sync

Usually this happens because the attribute mappings for users and groups are incorrect. Compare your mappings against the ones recommended in this topic, and remove any mapping that is not recommended. The issue could also be that the Azure Databricks admin user whose personal access token connects Azure AD to Azure Databricks has lost admin status or has an invalid token. Log in to the Azure Databricks Admin console as that user and confirm that the user is still an admin and that the access token is still valid.

After initial sync the users and groups are not syncing

After the initial sync, Azure AD does not sync immediately when user and group assignments change. It schedules a sync with the application after a delay that depends on the number of users and groups. To start a sync immediately, go to Manage > Provisioning for the enterprise application and select Clear current state and restart synchronization.