databricks_instance_pool Resource

This resource allows you to manage instance pools. An instance pool reduces cluster start and auto-scaling times by maintaining a set of idle, ready-to-use cloud instances. When a cluster attached to a pool needs an instance, it first attempts to allocate one of the pool's idle instances. If the pool has no idle instances, it expands by allocating a new instance from the instance provider to accommodate the cluster's request. When a cluster releases an instance, the instance returns to the pool and is free for another cluster to use. Only clusters attached to a pool can use that pool's idle instances.

Example Usage

data "databricks_node_type" "smallest" {
}

resource "databricks_instance_pool" "smallest_nodes" {
  instance_pool_name = "Smallest Nodes"
  min_idle_instances = 0
  max_capacity       = 300
  node_type_id       = data.databricks_node_type.smallest.id
  aws_attributes {
    availability           = "ON_DEMAND"
    zone_id                = "us-east-1a"
    spot_bid_price_percent = "100"
  }
  idle_instance_autotermination_minutes = 10
  disk_spec {
    disk_type {
      ebs_volume_type = "GENERAL_PURPOSE_SSD"
    }
    disk_size  = 80
    disk_count = 1
  }
}

Argument Reference

The following arguments are supported:

aws_attributes Configuration Block

The following options are available:

azure_attributes Configuration Block

The optional azure_attributes configuration block contains attributes related to instance pools on Azure.

The following options are available:
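For example, a pool of Azure spot instances might look like the following sketch. It assumes the availability and spot_bid_max_price attributes, where a spot_bid_max_price of -1 bids up to the current on-demand price:

```hcl
resource "databricks_instance_pool" "azure_spot" {
  instance_pool_name = "Azure Spot Nodes"
  min_idle_instances = 0
  max_capacity       = 100
  node_type_id       = data.databricks_node_type.smallest.id
  azure_attributes {
    availability       = "SPOT_AZURE"
    spot_bid_max_price = -1 # bid up to the on-demand price
  }
}
```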

gcp_attributes Configuration Block

The optional gcp_attributes configuration block contains attributes related to instance pools on GCP.

The following options are available:
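For example, a pool of preemptible GCP instances might look like the following sketch. It assumes the gcp_availability attribute, with PREEMPTIBLE_WITH_FALLBACK_GCP falling back to on-demand instances when preemptible capacity is unavailable:

```hcl
resource "databricks_instance_pool" "gcp_preemptible" {
  instance_pool_name = "GCP Preemptible Nodes"
  min_idle_instances = 0
  max_capacity       = 100
  node_type_id       = data.databricks_node_type.smallest.id
  gcp_attributes {
    gcp_availability = "PREEMPTIBLE_WITH_FALLBACK_GCP"
  }
}
```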

disk_spec Configuration Block

Within disk_spec, make sure to use ebs_volume_type only on an AWS deployment of Databricks and azure_disk_volume_type only on an Azure deployment of Databricks.

disk_type sub-block

ebs_volume_type - (Optional) (String) The EBS volume type to use. Options are: GENERAL_PURPOSE_SSD (Provision extra storage using AWS gp2 EBS volumes) or THROUGHPUT_OPTIMIZED_HDD (Provision extra storage using AWS st1 volumes)

azure_disk_volume_type - (Optional) (String) The type of Azure disk to use. Options are: PREMIUM_LRS (Premium storage tier, backed by SSDs) or STANDARD_LRS (Standard storage tier, backed by HDDs)
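On Azure, the disk_spec block has the same shape as the AWS example above, with azure_disk_volume_type in place of ebs_volume_type; a sketch:

```hcl
  disk_spec {
    disk_type {
      azure_disk_volume_type = "PREMIUM_LRS"
    }
    disk_size  = 80
    disk_count = 1
  }
```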

preloaded_docker_image sub-block

Databricks Container Services lets you specify a Docker image when you create a cluster. You need to enable Container Services on the Admin Console / Advanced page in the user interface. By enabling this feature, you acknowledge and agree that your usage of this feature is subject to the applicable additional terms. You can instruct the instance pool to pre-download the Docker image onto its instances, so that when a node is acquired for a cluster that requires a custom Docker image, the setup process is faster.

The preloaded_docker_image configuration block has the following attributes:

Example usage with azurerm_container_registry and docker_registry_image, which you can adapt to your specific use case:

resource "docker_registry_image" "this" {
  name = "${azurerm_container_registry.this.login_server}/sample:latest"
  build {
    # ...
  }
}

resource "databricks_instance_pool" "this" {
  # ...
  preloaded_docker_image {
    url = docker_registry_image.this.name
    basic_auth {
      username = azurerm_container_registry.this.admin_username
      password = azurerm_container_registry.this.admin_password
    }
  }
}

Attribute Reference

In addition to all arguments above, the following attributes are exported:

Access Control

Import

The resource instance pool can be imported using its id:

terraform import databricks_instance_pool.this <instance-pool-id>