Provides a SageMaker Endpoint resource.
Basic usage:
resource "aws_sagemaker_endpoint" "e" {
  name                 = "my-endpoint"
  endpoint_config_name = aws_sagemaker_endpoint_configuration.ec.name

  tags = {
    Name = "foo"
  }
}
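Deployment configuration usage (a hedged sketch, not an official example): the deployment_config arguments documented on this page can be combined roughly as follows. The sketch assumes the aws_sagemaker_endpoint_configuration.ec resource referenced in the basic example; the endpoint names, the CloudWatch alarm name "my-endpoint-5xx-errors", and all numeric values are hypothetical and chosen only to fall inside the documented ranges.

```hcl
# Sketch only: names, the alarm, and numeric values are illustrative assumptions.

# Blue/green update with canary traffic shifting and automatic rollback.
resource "aws_sagemaker_endpoint" "canary" {
  name                 = "my-canary-endpoint"
  endpoint_config_name = aws_sagemaker_endpoint_configuration.ec.name

  deployment_config {
    blue_green_update_policy {
      traffic_routing_configuration {
        type                     = "CANARY"
        wait_interval_in_seconds = 300

        # First traffic step: 10% of capacity (documented limit is 50%).
        canary_size {
          type  = "CAPACITY_PERCENT"
          value = 10
        }
      }

      termination_wait_in_seconds = 600

      # Kept larger than termination_wait_in_seconds + wait_interval_in_seconds.
      maximum_execution_timeout_in_seconds = 3600
    }

    auto_rollback_configuration {
      alarms {
        alarm_name = "my-endpoint-5xx-errors" # hypothetical alarm name
      }
    }
  }
}

# Rolling update: only one update policy is set per deployment_config.
resource "aws_sagemaker_endpoint" "rolling" {
  name                 = "my-rolling-endpoint"
  endpoint_config_name = aws_sagemaker_endpoint_configuration.ec.name

  deployment_config {
    rolling_update_policy {
      wait_interval_in_seconds             = 300
      maximum_execution_timeout_in_seconds = 3600

      # Each rolling step moves 25% of capacity (documented range is 5% to 50%).
      maximum_batch_size {
        type  = "CAPACITY_PERCENT"
        value = 25
      }
    }
  }
}
```

The two resources are kept separate because a deployment configuration should carry a single update policy: either blue_green_update_policy or rolling_update_policy.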
This resource supports the following arguments:
endpoint_config_name
- (Required) The name of the endpoint configuration to use.
deployment_config
- (Optional) The deployment configuration for an endpoint, which contains the desired deployment strategy and rollback configurations. See Deployment Config.
name
- (Optional) The name of the endpoint. If omitted, Terraform will assign a random, unique name.
tags
- (Optional) A map of tags to assign to the resource. If the provider has a default_tags configuration block, tags with matching keys overwrite those defined at the provider level.

Deployment Config:
blue_green_update_policy
- (Optional) Update policy for a blue/green deployment. If this update policy is specified, SageMaker creates a new fleet during the deployment while maintaining the old fleet. SageMaker flips traffic to the new fleet according to the specified traffic routing configuration. Only one update policy should be used in the deployment configuration. If no update policy is specified, SageMaker uses a blue/green deployment strategy with all-at-once traffic shifting by default. See Blue Green Update Config.
auto_rollback_configuration
- (Optional) Automatic rollback configuration for handling endpoint deployment failures and recovery. See Auto Rollback Configuration.
rolling_update_policy
- (Optional) Specifies a rolling deployment strategy for updating a SageMaker endpoint. See Rolling Update Policy.

Blue Green Update Config:
traffic_routing_configuration
- (Required) Defines the traffic routing strategy to shift traffic from the old fleet to the new fleet during an endpoint deployment. See Traffic Routing Configuration.
maximum_execution_timeout_in_seconds
- (Optional) Maximum execution timeout for the deployment. Note that the timeout value should be larger than the total waiting time specified in termination_wait_in_seconds and wait_interval_in_seconds. Valid values are between 600 and 14400.
termination_wait_in_seconds
- (Optional) Additional waiting time in seconds after the completion of an endpoint deployment before terminating the old endpoint fleet. Default is 0. Valid values are between 0 and 3600.

Rolling Update Policy:
maximum_batch_size
- (Required) Batch size for each rolling step to provision capacity and turn on traffic on the new endpoint fleet, and terminate capacity on the old endpoint fleet. Value must be between 5% and 50% of the variant's total instance count. See Maximum Batch Size.
maximum_execution_timeout_in_seconds
- (Optional) The time limit for the total deployment. Exceeding this limit causes a timeout. Valid values are between 600 and 14400.
rollback_maximum_batch_size
- (Optional) Batch size for each rolling step during rollback to the old endpoint fleet. Each step provisions capacity and turns on traffic on the old endpoint fleet, and terminates capacity on the new endpoint fleet. If omitted, the value defaults to 100% of total capacity, which brings up the whole capacity of the old fleet at once during rollback. See Rollback Maximum Batch Size.
wait_interval_in_seconds
- (Required) The length of the baking period, during which SageMaker monitors alarms for each batch on the new fleet. Valid values are between 0 and 3600.

Traffic Routing Configuration:
type
- (Required) Traffic routing strategy type. Valid values are: ALL_AT_ONCE, CANARY, and LINEAR.
wait_interval_in_seconds
- (Required) The waiting time (in seconds) between incremental steps to turn on traffic on the new endpoint fleet. Valid values are between 0 and 3600.
canary_size
- (Optional) Batch size for the first step to turn on traffic on the new endpoint fleet. Value must be less than or equal to 50% of the variant's total instance count. See Canary Size.
linear_step_size
- (Optional) Batch size for each step to turn on traffic on the new endpoint fleet. Value must be between 10% and 50% of the variant's total instance count. See Linear Step Size.

Maximum Batch Size, Rollback Maximum Batch Size, Canary Size, and Linear Step Size:
Each of these blocks supports the same arguments.
type
- (Required) Specifies the endpoint capacity type. Valid values are: INSTANCE_COUNT or CAPACITY_PERCENT.
value
- (Required) Defines the capacity size, either as a number of instances or a capacity percentage.

Auto Rollback Configuration:
alarms
- (Required) List of CloudWatch alarms in your account that are configured to monitor metrics on an endpoint. If any alarms are tripped during a deployment, SageMaker rolls back the deployment. See Alarms.

Alarms:
alarm_name
- (Required) The name of a CloudWatch alarm in your account.

This resource exports the following attributes in addition to the arguments above:
arn
- The Amazon Resource Name (ARN) assigned by AWS to this endpoint.
name
- The name of the endpoint.
tags_all
- A map of tags assigned to the resource, including those inherited from the provider default_tags configuration block.

In Terraform v1.5.0 and later, use an import block to import endpoints using the name. For example:
import {
  to = aws_sagemaker_endpoint.test_endpoint
  id = "my-endpoint"
}
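An import block targets a resource address, so the configuration normally also contains a resource block at that address. A minimal sketch of such a block follows; the endpoint configuration name "my-endpoint-config" is a hypothetical placeholder and would need to match the configuration the existing endpoint actually uses.

```hcl
# Sketch only: resource block that receives the imported endpoint.
resource "aws_sagemaker_endpoint" "test_endpoint" {
  name                 = "my-endpoint"
  endpoint_config_name = "my-endpoint-config" # hypothetical placeholder
}
```

If the arguments here differ from the real endpoint, the next plan is expected to show a diff rather than a clean import.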
Using terraform import, import endpoints using the name. For example:
% terraform import aws_sagemaker_endpoint.test_endpoint my-endpoint