Set Permissions on a Data Object

Note

Access control requires the Azure Databricks Premium Plan.

The Azure Databricks view-based data governance model lets you programmatically grant, deny, and revoke access to your data from the Spark SQL API.

This data governance model lets you control access to securable objects like tables, databases, views, and functions. It also allows for fine-grained access control (to a particular subset of a table, for example) by setting permissions on derived views created from arbitrary queries. The Azure Databricks SQL query analyzer enforces these access control policies at runtime on clusters with table access control enabled.

This topic describes the privileges, objects, and ownership rules that make up the Azure Databricks view-based data access control model. It also describes how to grant, deny, and revoke object privileges.

Requirements

Before you can grant, deny, or revoke privileges on data objects, an administrator must enable table access control for the cluster. For information about how to create a cluster that follows this security model, see Enable Table Access Control.

View-based access control model

This section describes the Azure Databricks view-based data access control model.

Privileges

  • SELECT – gives read access to an object
  • CREATE – gives ability to create an object (for example, a table in a database)
  • MODIFY – gives ability to add/delete/modify data to/from an object
  • READ_METADATA – gives ability to view an object and its metadata
  • CREATE_NAMED_FUNCTION – gives ability to create a named UDF in an existing catalog or database
  • ALL PRIVILEGES – gives all privileges (gets translated into all the above privileges)

Objects

The privileges above can apply to the following classes of objects:

  • CATALOG - controls access to the entire data catalog.
  • DATABASE - controls access to a database.
  • TABLE - controls access to a managed or external table.
  • VIEW - controls access to SQL views.
  • FUNCTION - controls access to a named function.
  • ANONYMOUS FUNCTION - controls access to anonymous or temporary functions.
  • ANY FILE - controls access to the underlying filesystem.

Object ownership

For some actions, the ownership of the object (table/view/database) determines if you are authorized to perform the action. The user who creates the table, view, or database becomes its owner. In the case of tables and views, the owner is granted all privileges, and can grant those privileges to other users.

For example, ownership determines whether or not you can grant permissions on derived objects to other users. Suppose User A owns Table T and grants User B SELECT permission on Table T. Now, even though User B can select from Table T, User B cannot grant SELECT permission on Table T to User C, because User A is still the owner of the underlying Table T.

Furthermore, User B cannot circumvent this restriction simply by creating a View V on Table T and granting permissions on that view to User C. When Databricks checks for permissions for User C to access View V, it also checks that the owner of V and underlying Table T are the same. If the owners are not the same, User C must also have select permissions on underlying Table T.

To summarize, the owner of an object controls access to that object. This applies to tables as well as to views that are used for fine-grained access to tables.

Users and groups

Privileges can be granted to users or groups that are created via the groups API. Each user is uniquely identified by their username (which typically maps to their email address) in Databricks. Users who are workspace administrators in Databricks belong to a special admin role and can also access objects that they haven’t been given explicit access to.

Privilege hierarchy

Privileges on objects are hierarchical. This means that granting or denying a privilege on the entire CATALOG automatically grants or denies the privilege to all of the databases (as well as all tables and views). Similarly, granting or denying a privilege to a given DATABASE automatically grants or denies the privilege to all tables and views in that database.

If a user is an admin or an owner of the specified object they will always be able to perform all actions. Otherwise if the action is denied they will never be able to perform it. If none of the preceding rules apply then a user can only perform the action if it is granted.

Grant, deny, and revoke object privileges

Commands

You use the following commands to manage object privileges:

GRANT

Grant a specific permission on an object to a user or principal. Granting a permission on a database (for example a SELECT permission) has the effect of implicitly granting that permission on all objects in that database. Granting a specific permission on the catalog has the effect of implicitly granting that permission on all databases in the catalog.

GRANT
  privilege_type [, privilege_type ] ...
  ON (CATALOG | DATABASE db_name | [TABLE] table_name | [VIEW] view_name | [FUNCTION] function_name | ANONYMOUS FUNCTION | ANY FILE)
  TO user

privilege_type
  : SELECT | CREATE | MODIFY | READ_METADATA | CREATE_NAMED_FUNCTION | ALL PRIVILEGES

DENY

Deny a specific permission on an object to a user or principal. Denying a permission on a database (for example a SELECT permission) has the effect of implicitly denying that permission on all objects in that database. Denying a specific permission on the catalog has the effect of implicitly denying that permission on all databases in the catalog.

DENY can be used to ensure that a user or principal cannot access the specified object, despite any implicit or explicit GRANTs. When an object is accessed, Databricks first checks if there are any explicit or implicit DENYs on the object before checking if there are any explicit or implicit GRANTs.

For example, suppose there is a database db with tables t1 and t2. A user is initially granted SELECT privileges on db. The user can access t1 and t2 due to the GRANT on the database db.

Now if the administrator issues a DENY on table t1, the user will no longer be able to access t1. If the administrator issues a DENY on database db, the user will not be able to access any tables in db even if there is an explicit GRANT on these tables. That is, the DENY always supersedes the GRANT.

DENY
  privilege_type [, privilege_type ] ...
  ON (CATALOG | DATABASE db_name | [TABLE] table_name | [VIEW] view_name | [FUNCTION] function_name | ANONYMOUS FUNCTION | ANY FILE)
  TO user

privilege_type
  : SELECT | CREATE | MODIFY | READ_METADATA | CREATE_NAMED_FUNCTION | ALL PRIVILEGES

REVOKE

Revoke an explicitly granted or denied permission on an object from a user. A REVOKE is strictly scoped to the object specified in the command and does not cascade to contained objects.

For example, suppose there is a database db with tables t1 and t2. A user is initially granted SELECT privileges on db and on t1. The user can access t2 due to the GRANT on the database db.

Now, if the administrator revokes the SELECT privilege on db, the user will no longer be able to access t2, but will still be able to access t1 since there is an explicit GRANT on table t1.

Note that if the administrator instead revokes the SELECT on table t1 but still keeps the SELECT on database db, the user can still access t1 because the SELECT on the database db implicitly confers privileges on the table t1.

REVOKE
  privilege_type [, privilege_type ] ...
  ON (CATALOG | DATABASE db_name | [TABLE] table_name | [VIEW] view_name | [FUNCTION] function_name | ANONYMOUS FUNCTION | ANY FILE)
  FROM user

privilege_type
  : SELECT | CREATE | MODIFY | READ_METADATA | CREATE_NAMED_FUNCTION | ALL PRIVILEGES

Examples

GRANT SELECT ON DATABASE database_name to `user1@databricks.com`;
DENY SELECT ON table_name to `user2@databricks.com`;
GRANT SELECT ON ANONYMOUS FUNCTION to `user3@databricks.com`;
GRANT SELECT ON ANY FILE to `user4@databricks.com`;
REVOKE ALL PRIVILEGES ON DATABASE default FROM `user1@databricks.com`
REVOKE SELECT ON table_name default FROM `user2@databricks.com`

SHOW GRANT

Display all permissions (including inherited, denied, and granted) that affect the specified object.

SHOW GRANT [user] ON (CATALOG | DATABASE db_name | [TABLE] table_name | [VIEW] view_name | [FUNCTION] function_name | ANONYMOUS FUNCTION | ANY FILE)

Example

SHOW GRANT `user1@databricks.com` ON DATABASE default

View-based access control

You can configure fine-grained access control (to rows and columns matching specific conditions, for example) by granting access to derived views that contain arbitrary queries.

Example

CREATE OR REPLACE VIEW view_name AS SELECT columnA, columnB FROM table_name WHERE columnC > 1000;
GRANT SELECT ON VIEW view_name to `user1@databricks.com`;

Privileges required for SQL operations

The following table maps privileges to SQL operations:

Operations ↓ / Privilege → SELECT CREATE MODIFY READ_METADATA CREATE_NAMED_FUNCTION Ownership Admin
CREATE TABLE   x       x x
DROP TABLE     x     x x
DESCRIBE TABLE       x   x x
ALTER TABLE     x     x x
DROP TABLE     x     x x
CREATE VIEW   x       x x
DROP VIEW     x     x x
SELECT x         x x
CREATE FUNCTION         x x x
MSCK           x x
CREATE DATABASE   x       x x
EXPLAIN       x   x x
DROP DATABASE     x     x x
GRANT           x x
REVOKE           x x

Frequently asked questions (FAQ)

I created a table or view but now I can’t query, drop, or modify it.

This error can occur because you created that table on a cluster without Table ACLs enabled. When Table ACLs are disabled on a cluster, owners are not registered when a table, view, or database is created. To mitigate this problem an admin must assign an owner to the table using the following command:

ALTER TABLE <table-name> OWNER TO `<user-name>@<user-domain>.com`;

If you are not the new owner, the owner or an admin must provide SELECT permission to you using the following command:

GRANT SELECT ON TABLE <table-name> to `<user-name>@<user-domain>.com`;
How do I grant permissions on global and local temporary views?
Unfortunately permissions on global and local temporary views are not supported. Local temporary views are visible only within the same session, and views created in the global_temp database are visible to all users sharing a cluster. However, permissions on the underlying tables and views referenced by any temporary views are enforced.
How do I grant a user or group permissions on multiple tables at once?

A grant, deny, or revoke statement can be applied to only one object at a time. The recommended way to organize and grant permissions on multiple tables to a principal is via databases. Granting a principal SELECT permission on a database implicitly grants that principal SELECT permissions on all tables and views in that database. For example, if a database D has tables T1 and T2, and an admin issues the following GRANT command:

GRANT SELECT ON DATABASE D to `<user>@databricks.com`

The principal <user>@databricks.com can select from tables T1 and T2, as well as any tables and views created in database D in the future.

How do I grant a user permissions on all tables except one?

You grant SELECT permission to the database and then deny SELECT permission for the specific table you want to restrict access to.

GRANT SELECT ON DATABASE D to `<user>@databricks.com`
DENY SELECT ON TABLE D.T to `<user>@databricks.com`

The principal <user>@databricks.com can select from all tables in D except D.T.

I granted a user SELECT permissions on a view, but when that user tries to SELECT from that view, they get the error User does not have permission SELECT on table.

This common error is caused by the way object ownership affects the ability to grant permissions. You see this behavior because you are not the owner of Table T, for one of the following reasons:

  • Someone else created Table T and is the owner of Table T.
  • Table T has no registered owner because it was created using a cluster for which Table ACL is disabled.

Suppose there is a View V on Table T and User U tries to select from View V. Azure Databricks checks that:

  • User U has SELECT permission on Table T.
  • One of the following is true:
    • View V and Table T have the same owner. In this case the owner of Table T has granted fine-grained access to user U.
    • User U can select directly from Table T. If the owner of Table T has allowed User U to select from all of Table T, User U can select from any view on Table T.

As described in the Object ownership section, these conditions ensure that only the owner of an object can grant other users access to that object.

You can test if a table has an owner by using SHOW GRANT ON <table-name>. If you do not see an entry with ActionType OWN, then the table does not have an owner. If this is the case, an admin user can set the table owner by running the following:

ALTER TABLE <table-name> OWNER TO `<user-name>@<user-domain>.com`;
I tried to run sc.parallelize on a cluster with Table ACLs enabled and it doesn’t work.
On clusters with Table ACLs enabled you can use only the Spark SQL and Python DataFrame APIs. The RDD API is disallowed for security reasons, since Azure Databricks does not have the ability to inspect and authorize code within an RDD.