This resource allows you to generically manage access control in a Databricks workspace. It guarantees that only _admins_, the _authenticated principal_, and the principals declared within access_control blocks have the specified access. It is not possible to remove management rights from the _admins_ group.
It's possible to separate cluster access control into three different permission levels: CAN_ATTACH_TO, CAN_RESTART, and CAN_MANAGE:
resource "databricks_group" "auto" {
display_name = "Automation"
}
resource "databricks_group" "eng" {
display_name = "Engineering"
}
resource "databricks_group" "ds" {
display_name = "Data Science"
}
data "databricks_spark_version" "latest" {}
data "databricks_node_type" "smallest" {
local_disk = true
}
resource "databricks_cluster" "shared_autoscaling" {
cluster_name = "Shared Autoscaling"
spark_version = data.databricks_spark_version.latest.id
node_type_id = data.databricks_node_type.smallest.id
autotermination_minutes = 60
autoscale {
min_workers = 1
max_workers = 10
}
}
resource "databricks_permissions" "cluster_usage" {
cluster_id = databricks_cluster.shared_autoscaling.id
access_control {
group_name = databricks_group.auto.display_name
permission_level = "CAN_ATTACH_TO"
}
access_control {
group_name = databricks_group.eng.display_name
permission_level = "CAN_RESTART"
}
access_control {
group_name = databricks_group.ds.display_name
permission_level = "CAN_MANAGE"
}
}
Cluster policies allow the creation of clusters that match a given policy. It's possible to assign the CAN_USE permission to users and groups:
resource "databricks_group" "ds" {
display_name = "Data Science"
}
resource "databricks_group" "eng" {
display_name = "Engineering"
}
resource "databricks_cluster_policy" "something_simple" {
name = "Some simple policy"
definition = jsonencode({
"spark_conf.spark.hadoop.javax.jdo.option.ConnectionURL" : {
"type" : "forbidden"
},
"spark_conf.spark.secondkey" : {
"type" : "forbidden"
}
})
}
resource "databricks_permissions" "policy_usage" {
cluster_policy_id = databricks_cluster_policy.something_simple.id
access_control {
group_name = databricks_group.ds.display_name
permission_level = "CAN_USE"
}
access_control {
group_name = databricks_group.eng.display_name
permission_level = "CAN_USE"
}
}
Instance pool access control allows you to assign CAN_ATTACH_TO and CAN_MANAGE permissions to users, service principals, and groups. It's also possible to grant the right to create instance pools to individual groups, users, and service principals.
resource "databricks_group" "auto" {
display_name = "Automation"
}
resource "databricks_group" "eng" {
display_name = "Engineering"
}
data "databricks_node_type" "smallest" {
local_disk = true
}
resource "databricks_instance_pool" "this" {
instance_pool_name = "Reserved Instances"
idle_instance_autotermination_minutes = 60
node_type_id = data.databricks_node_type.smallest.id
min_idle_instances = 0
max_capacity = 10
}
resource "databricks_permissions" "pool_usage" {
instance_pool_id = databricks_instance_pool.this.id
access_control {
group_name = databricks_group.auto.display_name
permission_level = "CAN_ATTACH_TO"
}
access_control {
group_name = databricks_group.eng.display_name
permission_level = "CAN_MANAGE"
}
}
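The right to create instance pools mentioned above is not set through databricks_permissions. A minimal sketch, assuming the allow_instance_pool_create attribute of databricks_group is the intended mechanism (the group name is illustrative):

```hcl
# Hypothetical group whose members may create instance pools.
# allow_instance_pool_create is an entitlement on the group itself,
# not a permission level in an access_control block.
resource "databricks_group" "pool_creators" {
  display_name               = "Pool Creators"
  allow_instance_pool_create = true
}
```

The same attribute is also expected on databricks_user and databricks_service_principal for granting the right to individual principals.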
There are four assignable permission levels for databricks_job: CAN_VIEW, CAN_MANAGE_RUN, IS_OWNER, and CAN_MANAGE. Admins are granted the CAN_MANAGE permission by default, and they can assign that permission to non-admin users and service principals.
The creator of a job has the IS_OWNER permission. Destroying the databricks_permissions resource for a job would revert ownership to the creator.
resource "databricks_group" "auto" {
display_name = "Automation"
}
resource "databricks_group" "eng" {
display_name = "Engineering"
}
resource "databricks_service_principal" "aws_principal" {
display_name = "main"
}
data "databricks_spark_version" "latest" {}
data "databricks_node_type" "smallest" {
local_disk = true
}
resource "databricks_job" "this" {
name = "Featurization"
max_concurrent_runs = 1
task {
task_key = "task1"
new_cluster {
num_workers = 300
spark_version = data.databricks_spark_version.latest.id
node_type_id = data.databricks_node_type.smallest.id
}
notebook_task {
notebook_path = "/Production/MakeFeatures"
}
}
}
resource "databricks_permissions" "job_usage" {
job_id = databricks_job.this.id
access_control {
group_name = "users"
permission_level = "CAN_VIEW"
}
access_control {
group_name = databricks_group.auto.display_name
permission_level = "CAN_MANAGE_RUN"
}
access_control {
group_name = databricks_group.eng.display_name
permission_level = "CAN_MANAGE"
}
access_control {
service_principal_name = databricks_service_principal.aws_principal.application_id
permission_level = "IS_OWNER"
}
}
There are four assignable permission levels for databricks_pipeline: CAN_VIEW, CAN_RUN, CAN_MANAGE, and IS_OWNER. Admins are granted the CAN_MANAGE permission by default, and they can assign that permission to non-admin users and service principals.
The creator of a pipeline has the IS_OWNER permission. Destroying the databricks_permissions resource for a pipeline would revert ownership to the creator.
resource "databricks_group" "eng" {
display_name = "Engineering"
}
data "databricks_current_user" "me" {}
resource "databricks_notebook" "dlt_demo" {
content_base64 = base64encode(<<-EOT
  import dlt
  json_path = "/databricks-datasets/wikipedia-datasets/data-001/clickstream/raw-uncompressed-json/2015_2_clickstream.json"
  @dlt.table(
      comment="The raw wikipedia clickstream dataset, ingested from /databricks-datasets."
  )
  def clickstream_raw():
      return (spark.read.format("json").load(json_path))
  EOT
)
language = "PYTHON"
path = "${data.databricks_current_user.me.home}/DLT_Demo"
}
resource "databricks_pipeline" "this" {
name = "DLT Demo Pipeline (${data.databricks_current_user.me.alphanumeric})"
storage = "/test/tf-pipeline"
configuration = {
key1 = "value1"
key2 = "value2"
}
library {
notebook {
path = databricks_notebook.dlt_demo.id
}
}
continuous = false
filters {
include = ["com.databricks.include"]
exclude = ["com.databricks.exclude"]
}
}
resource "databricks_permissions" "dlt_usage" {
pipeline_id = databricks_pipeline.this.id
access_control {
group_name = "users"
permission_level = "CAN_VIEW"
}
access_control {
group_name = databricks_group.eng.display_name
permission_level = "CAN_MANAGE"
}
}
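The pipeline example above does not assign IS_OWNER. The following sketch shows how ownership might be given to an individual user through user_name. It is written as a variant of the dlt_usage resource rather than as a second resource for the same pipeline, since two permission resources on one object would likely overwrite each other. The databricks_user resource and its e-mail address are hypothetical; databricks_pipeline.this and databricks_group.eng refer to the example above.

```hcl
# Hypothetical user who should own the pipeline.
resource "databricks_user" "pipeline_owner" {
  user_name = "pipeline.owner@example.com"
}

# Variant of dlt_usage: same pipeline, with an explicit owner.
resource "databricks_permissions" "dlt_usage" {
  pipeline_id = databricks_pipeline.this.id

  access_control {
    user_name        = databricks_user.pipeline_owner.user_name
    permission_level = "IS_OWNER"
  }

  access_control {
    group_name       = databricks_group.eng.display_name
    permission_level = "CAN_MANAGE"
  }
}
```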
Valid permission levels for databricks_notebook are: CAN_READ, CAN_RUN, CAN_EDIT, and CAN_MANAGE.
resource "databricks_group" "auto" {
display_name = "Automation"
}
resource "databricks_group" "eng" {
display_name = "Engineering"
}
resource "databricks_notebook" "this" {
content_base64 = base64encode("# Welcome to your Python notebook")
path = "/Production/ETL/Features"
language = "PYTHON"
}
resource "databricks_permissions" "notebook_usage" {
notebook_path = databricks_notebook.this.path
access_control {
group_name = "users"
permission_level = "CAN_READ"
}
access_control {
group_name = databricks_group.auto.display_name
permission_level = "CAN_RUN"
}
access_control {
group_name = databricks_group.eng.display_name
permission_level = "CAN_EDIT"
}
}
Valid permission levels for databricks_workspace_file are: CAN_READ, CAN_RUN, CAN_EDIT, and CAN_MANAGE.
resource "databricks_group" "auto" {
display_name = "Automation"
}
resource "databricks_group" "eng" {
display_name = "Engineering"
}
resource "databricks_workspace_file" "this" {
content_base64 = base64encode("print('Hello World')")
path = "/Production/ETL/Features.py"
}
resource "databricks_permissions" "workspace_file_usage" {
workspace_file_path = databricks_workspace_file.this.path
access_control {
group_name = "users"
permission_level = "CAN_READ"
}
access_control {
group_name = databricks_group.auto.display_name
permission_level = "CAN_RUN"
}
access_control {
group_name = databricks_group.eng.display_name
permission_level = "CAN_EDIT"
}
}
Valid permission levels for folders of databricks_directory are: CAN_READ, CAN_RUN, CAN_EDIT, and CAN_MANAGE. Notebooks and experiments in a folder inherit all permission settings of that folder. For example, a user (or service principal) that has CAN_RUN permission on a folder has CAN_RUN permission on the notebooks in that folder.
All users (or service principals) have CAN_MANAGE permission for items in the Workspace > Shared folder. You can grant CAN_MANAGE permission to notebooks and folders by moving them to the Shared folder.
All users (or service principals) have CAN_MANAGE permission for objects they create.
In a user's home directory, that user (or service principal) has CAN_MANAGE permission. All other users (or service principals) can list their directories.
resource "databricks_group" "auto" {
display_name = "Automation"
}
resource "databricks_group" "eng" {
display_name = "Engineering"
}
resource "databricks_directory" "this" {
path = "/Production/ETL"
}
resource "databricks_permissions" "folder_usage" {
directory_path = databricks_directory.this.path
depends_on = [databricks_directory.this]
access_control {
group_name = "users"
permission_level = "CAN_READ"
}
access_control {
group_name = databricks_group.auto.display_name
permission_level = "CAN_RUN"
}
access_control {
group_name = databricks_group.eng.display_name
permission_level = "CAN_EDIT"
}
}
Valid permission levels for databricks_repo are: CAN_READ, CAN_RUN, CAN_EDIT, and CAN_MANAGE.
resource "databricks_group" "auto" {
display_name = "Automation"
}
resource "databricks_group" "eng" {
display_name = "Engineering"
}
resource "databricks_repo" "this" {
url = "https://github.com/user/demo.git"
}
resource "databricks_permissions" "repo_usage" {
repo_id = databricks_repo.this.id
access_control {
group_name = "users"
permission_level = "CAN_READ"
}
access_control {
group_name = databricks_group.auto.display_name
permission_level = "CAN_RUN"
}
access_control {
group_name = databricks_group.eng.display_name
permission_level = "CAN_EDIT"
}
}
Valid permission levels for databricks_mlflow_experiment are: CAN_READ, CAN_EDIT, and CAN_MANAGE.
data "databricks_current_user" "me" {}
resource "databricks_mlflow_experiment" "this" {
name = "${data.databricks_current_user.me.home}/Sample"
artifact_location = "dbfs:/tmp/my-experiment"
description = "My MLflow experiment description"
}
resource "databricks_group" "auto" {
display_name = "Automation"
}
resource "databricks_group" "eng" {
display_name = "Engineering"
}
resource "databricks_permissions" "experiment_usage" {
experiment_id = databricks_mlflow_experiment.this.id
access_control {
group_name = "users"
permission_level = "CAN_READ"
}
access_control {
group_name = databricks_group.auto.display_name
permission_level = "CAN_MANAGE"
}
access_control {
group_name = databricks_group.eng.display_name
permission_level = "CAN_EDIT"
}
}
Valid permission levels for databricks_mlflow_model are: CAN_READ, CAN_EDIT, CAN_MANAGE_STAGING_VERSIONS, CAN_MANAGE_PRODUCTION_VERSIONS, and CAN_MANAGE. You can also manage permissions for all MLflow models by setting registered_model_id = "root".
resource "databricks_mlflow_model" "this" {
name = "SomePredictions"
}
resource "databricks_group" "auto" {
display_name = "Automation"
}
resource "databricks_group" "eng" {
display_name = "Engineering"
}
resource "databricks_permissions" "model_usage" {
registered_model_id = databricks_mlflow_model.this.registered_model_id
access_control {
group_name = "users"
permission_level = "CAN_READ"
}
access_control {
group_name = databricks_group.auto.display_name
permission_level = "CAN_MANAGE_PRODUCTION_VERSIONS"
}
access_control {
group_name = databricks_group.eng.display_name
permission_level = "CAN_MANAGE_STAGING_VERSIONS"
}
}
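The registered_model_id = "root" form mentioned above is not shown in the example. A minimal sketch of how it might look, assuming read access for the built-in users group is the desired baseline (the resource name all_models is illustrative):

```hcl
# Applies to all MLflow registered models instead of a single model.
resource "databricks_permissions" "all_models" {
  registered_model_id = "root"

  access_control {
    group_name       = "users"
    permission_level = "CAN_READ"
  }
}
```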
Valid permission levels for databricks_model_serving are: CAN_VIEW, CAN_QUERY, and CAN_MANAGE.
resource "databricks_model_serving" "this" {
name = "tf-test"
config {
served_models {
name = "prod_model"
model_name = "test"
model_version = "1"
workload_size = "Small"
scale_to_zero_enabled = true
}
}
}
resource "databricks_group" "auto" {
display_name = "Automation"
}
resource "databricks_group" "eng" {
display_name = "Engineering"
}
resource "databricks_permissions" "ml_serving_usage" {
serving_endpoint_id = databricks_model_serving.this.serving_endpoint_id
access_control {
group_name = "users"
permission_level = "CAN_VIEW"
}
access_control {
group_name = databricks_group.auto.display_name
permission_level = "CAN_MANAGE"
}
access_control {
group_name = databricks_group.eng.display_name
permission_level = "CAN_QUERY"
}
}
By default on AWS deployments, all admin users can sign in to Databricks using either SSO or their username and password, and all API users can authenticate to the Databricks REST APIs using their username and password. As an admin, you can limit admin users’ and API users’ ability to authenticate with their username and password by configuring CAN_USE permissions using password access control.
resource "databricks_group" "guests" {
display_name = "Guest Users"
}
resource "databricks_permissions" "password_usage" {
authorization = "passwords"
access_control {
group_name = databricks_group.guests.display_name
permission_level = "CAN_USE"
}
}
At least one personal access token must exist in the workspace before you can manage token permissions.
The only permission that can be assigned to a non-admin group is CAN_USE, while _admins_ CAN_MANAGE all tokens:
resource "databricks_group" "auto" {
display_name = "Automation"
}
resource "databricks_group" "eng" {
display_name = "Engineering"
}
resource "databricks_permissions" "token_usage" {
authorization = "tokens"
access_control {
group_name = databricks_group.auto.display_name
permission_level = "CAN_USE"
}
access_control {
group_name = databricks_group.eng.display_name
permission_level = "CAN_USE"
}
}
SQL warehouses have three possible permissions: IS_OWNER, CAN_USE, and CAN_MANAGE:
data "databricks_current_user" "me" {}
resource "databricks_group" "auto" {
display_name = "Automation"
}
resource "databricks_group" "eng" {
display_name = "Engineering"
}
resource "databricks_sql_endpoint" "this" {
name = "Endpoint of ${data.databricks_current_user.me.alphanumeric}"
cluster_size = "Small"
max_num_clusters = 1
tags {
custom_tags {
key = "City"
value = "Amsterdam"
}
}
}
resource "databricks_permissions" "endpoint_usage" {
sql_endpoint_id = databricks_sql_endpoint.this.id
access_control {
group_name = databricks_group.auto.display_name
permission_level = "CAN_USE"
}
access_control {
group_name = databricks_group.eng.display_name
permission_level = "CAN_MANAGE"
}
}
SQL dashboards have three possible permissions: CAN_VIEW, CAN_RUN, and CAN_MANAGE:
resource "databricks_group" "auto" {
display_name = "Automation"
}
resource "databricks_group" "eng" {
display_name = "Engineering"
}
resource "databricks_permissions" "dashboard_usage" {
sql_dashboard_id = "3244325"
access_control {
group_name = databricks_group.auto.display_name
permission_level = "CAN_RUN"
}
access_control {
group_name = databricks_group.eng.display_name
permission_level = "CAN_MANAGE"
}
}
SQL queries have three possible permissions: CAN_VIEW, CAN_RUN, and CAN_MANAGE:
resource "databricks_group" "auto" {
display_name = "Automation"
}
resource "databricks_group" "eng" {
display_name = "Engineering"
}
resource "databricks_permissions" "query_usage" {
sql_query_id = "3244325"
access_control {
group_name = databricks_group.auto.display_name
permission_level = "CAN_RUN"
}
access_control {
group_name = databricks_group.eng.display_name
permission_level = "CAN_MANAGE"
}
}
SQL alerts have three possible permissions: CAN_VIEW, CAN_RUN, and CAN_MANAGE:
resource "databricks_group" "auto" {
display_name = "Automation"
}
resource "databricks_group" "eng" {
display_name = "Engineering"
}
resource "databricks_permissions" "alert_usage" {
sql_alert_id = "3244325"
access_control {
group_name = databricks_group.auto.display_name
permission_level = "CAN_RUN"
}
access_control {
group_name = databricks_group.eng.display_name
permission_level = "CAN_MANAGE"
}
}
Instance profiles are not managed by the General Permissions API. Use databricks_group_instance_profile and databricks_user_instance_profile to allow users or groups to use specific AWS EC2 IAM roles.
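As an illustration of the group variant, the following sketch attaches an instance profile to a group. The ARN and resource names are placeholders, not values from this page.

```hcl
# Placeholder ARN; substitute a real instance profile from your AWS account.
resource "databricks_instance_profile" "etl" {
  instance_profile_arn = "arn:aws:iam::000000000000:instance-profile/example-etl"
}

resource "databricks_group" "eng" {
  display_name = "Engineering"
}

# Lets members of the group launch clusters with this instance profile.
resource "databricks_group_instance_profile" "eng_etl" {
  group_id            = databricks_group.eng.id
  instance_profile_id = databricks_instance_profile.etl.id
}
```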
You can control access to databricks_secret through the initial_manage_principal argument on databricks_secret_scope or through databricks_secret_acl, so that users (or service principals) can READ, WRITE, or MANAGE entries within a secret scope.
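A minimal sketch of both mechanisms, assuming a scope named app-secrets and an Engineering group that should only read secrets (both names are illustrative):

```hcl
resource "databricks_group" "eng" {
  display_name = "Engineering"
}

# initial_manage_principal = "users" grants MANAGE on the scope to all users
# at creation time. Omit it to leave MANAGE with the creator only.
resource "databricks_secret_scope" "app" {
  name                     = "app-secrets"
  initial_manage_principal = "users"
}

# Per-principal ACL on the scope: READ, WRITE, or MANAGE.
resource "databricks_secret_acl" "eng_read" {
  scope      = databricks_secret_scope.app.name
  principal  = databricks_group.eng.display_name
  permission = "READ"
}
```

In practice you would probably pick one of the two approaches per scope, since granting MANAGE to all users makes a narrower READ ACL redundant.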
The General Permissions API does not apply to access control for tables, which have to be managed separately using the databricks_sql_permissions resource, though you're encouraged to use Unity Catalog or migrate to it.
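For reference, a table ACL managed this way might look like the sketch below. The table name foo and the Engineering principal are assumptions made for illustration.

```hcl
# Legacy table access control on a Hive metastore table.
resource "databricks_sql_permissions" "foo_table" {
  table = "foo"

  privilege_assignments {
    principal  = "Engineering"
    privileges = ["SELECT", "MODIFY"]
  }
}
```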
Initially, no users have access to data in Unity Catalog; access has to be assigned through the databricks_grants resource.
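A hedged sketch of such a grant, assuming a catalog named sandbox already exists and that the privilege names below match your provider version (older versions used different names such as USAGE):

```hcl
# Grants on a Unity Catalog catalog; the catalog and principal are illustrative.
resource "databricks_grants" "sandbox" {
  catalog = "sandbox"

  grant {
    principal  = "Data Science"
    privileges = ["USE_CATALOG", "USE_SCHEMA", "SELECT"]
  }
}
```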
One type argument and at least one access control block argument are required.
Exactly one of the following arguments is required:
cluster_id - cluster id
cluster_policy_id - cluster policy id
instance_pool_id - instance pool id
job_id - job id
pipeline_id - pipeline id
notebook_id - ID of notebook within workspace
notebook_path - path of notebook
directory_id - directory id
directory_path - path of directory
repo_id - repo id
repo_path - path of databricks repo directory (/Repos/<username>/...)
experiment_id - MLflow experiment id
registered_model_id - MLflow registered model id
serving_endpoint_id - Model Serving endpoint id.
authorization - either tokens or passwords.
sql_endpoint_id - SQL warehouse id
sql_dashboard_id - SQL dashboard id
sql_query_id - SQL query id
sql_alert_id - SQL alert id
One or more access_control blocks are required to actually set the permission levels:
access_control {
group_name = databricks_group.datascience.display_name
permission_level = "CAN_USE"
}
Arguments for the access_control block are:
permission_level - (Required) permission level according to specific resource. See examples above for the reference.
Exactly one of the below arguments is required:
user_name - (Optional) name of the user.
service_principal_name - (Optional) Application ID of the service_principal.
group_name - (Optional) name of the group. We recommend setting permissions on groups.
In addition to all arguments above, the following attributes are exported:
id - Canonical unique identifier for the permissions in form of /object_type/object_id.
object_type - type of permissions.
The resource permissions can be imported using the object id
terraform import databricks_permissions.this /<object type>/<object id>
Configuration file:
resource "databricks_mlflow_model" "model" {
name = "example_model"
description = "MLflow registered model"
}
resource "databricks_permissions" "model_usage" {
registered_model_id = databricks_mlflow_model.model.registered_model_id
access_control {
group_name = "users"
permission_level = "CAN_READ"
}
}
Import command:
terraform import databricks_permissions.model_usage /registered-models/<registered_model_id>
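On Terraform 1.5 or later, the same import can probably be expressed declaratively with an import block instead of the CLI command. This is a sketch of that alternative rather than something this page prescribes; the ID placeholder is kept as-is and must be replaced with the real registered model ID.

```hcl
# Declarative equivalent of the import command above (Terraform >= 1.5).
import {
  to = databricks_permissions.model_usage
  id = "/registered-models/<registered_model_id>"
}
```

With this block in the configuration, terraform plan should show the pending import alongside any other changes.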