databricks_instance_profile Resource

This resource allows you to manage AWS EC2 instance profiles that users can launch databricks_cluster and access data, like databricks_mount. The following example demonstrates how to create an instance profile and create a cluster with it. When creating a new databricks_instance_profile, Databricks validates that it has sufficient permissions to launch instances with the instance profile. This validation uses AWS dry-run mode for the AWS EC2 RunInstances API.

variable "crossaccount_role_name" {
  type        = string
  description = "Role that you've specified on https://accounts.cloud.databricks.com/#aws"
}
data "aws_iam_policy_document" "assume_role_for_ec2" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRole"]
    principals {
      identifiers = ["ec2.amazonaws.com"]
      type        = "Service"
    }
  }
}
resource "aws_iam_role" "role_for_s3_access" {
  name               = "shared-ec2-role-for-s3"
  description        = "Role for shared access"
  assume_role_policy = data.aws_iam_policy_document.assume_role_for_ec2.json
}
data "aws_iam_policy_document" "pass_role_for_s3_access" {
  statement {
    effect    = "Allow"
    actions   = ["iam:PassRole"]
    resources = [aws_iam_role.role_for_s3_access.arn]
  }
}
resource "aws_iam_policy" "pass_role_for_s3_access" {
  name   = "shared-pass-role-for-s3-access"
  path   = "/"
  policy = data.aws_iam_policy_document.pass_role_for_s3_access.json
}
resource "aws_iam_role_policy_attachment" "cross_account" {
  policy_arn = aws_iam_policy.pass_role_for_s3_access.arn
  role       = var.crossaccount_role_name
}
resource "aws_iam_instance_profile" "shared" {
  name = "shared-instance-profile"
  role = aws_iam_role.role_for_s3_access.name
}
resource "databricks_instance_profile" "shared" {
  instance_profile_arn = aws_iam_instance_profile.shared.arn
}
data "databricks_spark_version" "latest" {}
data "databricks_node_type" "smallest" {
  local_disk = true
}
resource "databricks_cluster" "this" {
  cluster_name            = "Shared Autoscaling"
  spark_version           = data.databricks_spark_version.latest.id
  node_type_id            = data.databricks_node_type.smallest.id
  autotermination_minutes = 20
  autoscale {
    min_workers = 1
    max_workers = 50
  }
  aws_attributes {
    instance_profile_arn   = databricks_instance_profile.shared.id
    availability           = "SPOT"
    zone_id                = "us-east-1"
    first_on_demand        = 1
    spot_bid_price_percent = 100
  }
}

Usage with Cluster Policies

It is advised to keep all common configurations in Cluster Policies to maintain control of the environments launched, so databricks_cluster above could be replaced with databricks_cluster_policy:

resource "databricks_cluster_policy" "this" {
  name = "Policy with predefined instance profile"
  definition = jsonencode({
    # most likely policy might have way more things init.
    "aws_attributes.instance_profile_arn" : {
      "type" : "fixed",
      "value" : databricks_instance_profile.shared.arn
    }
  })
}

Granting access to all users

You can make instance profile available to all users by associating it with the special group called users through databricks_group data source.

resource "databricks_instance_profile" "this" {
  instance_profile_arn = aws_iam_instance_profile.shared.arn
}

data "databricks_group" "users" {
  display_name = "users"
}

resource "databricks_group_instance_profile" "all" {
  group_id            = data.databricks_group.users.id
  instance_profile_id = databricks_instance_profile.this.id
}

Usage with Databricks SQL serverless

When the instance profile ARN and its associated IAM role ARN don't match and the instance profile is intended for use with Databricks SQL serverless, the iam_role_arn parameter can be specified.

data "aws_iam_policy_document" "sql_serverless_assume_role" {
  statement {
    actions = ["sts:AssumeRole"]
    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::790110701330:role/serverless-customer-resource-role"]
    }
    condition {
      test     = "StringEquals"
      variable = "sts:ExternalID"
      values = [
        "databricks-serverless-<YOUR_WORKSPACE_ID1>",
        "databricks-serverless-<YOUR_WORKSPACE_ID2>"
      ]
    }
  }
}

resource "aws_iam_role" "this" {
  name               = "my-databricks-sql-serverless-role"
  assume_role_policy = data.aws_iam_policy_document.sql_serverless_assume_role.json
}

resource "aws_iam_instance_profile" "this" {
  name = "my-databricks-sql-serverless-instance-profile"
  role = aws_iam_role.this.name
}

resource "databricks_instance_profile" "this" {
  instance_profile_arn = aws_iam_instance_profile.this.arn
  iam_role_arn         = aws_iam_role.this.arn
}

Argument Reference

The following arguments are supported:

Attribute Reference

In addition to all arguments above, the following attributes are exported:

Import

The resource instance profile can be imported using the ARN of it

terraform import databricks_instance_profile.this <instance-profile-arn>