aws-cdk-lib.aws_sagemaker.CfnEndpointConfig.ProductionVariantProperty

interface ProductionVariantProperty

Language     Type name
.NET         Amazon.CDK.AWS.Sagemaker.CfnEndpointConfig.ProductionVariantProperty
Go           github.com/aws/aws-cdk-go/awscdk/v2/awssagemaker#CfnEndpointConfig_ProductionVariantProperty
Java         software.amazon.awscdk.services.sagemaker.CfnEndpointConfig.ProductionVariantProperty
Python       aws_cdk.aws_sagemaker.CfnEndpointConfig.ProductionVariantProperty
TypeScript   aws-cdk-lib » aws_sagemaker » CfnEndpointConfig » ProductionVariantProperty

Specifies a model that you want to host and the resources to deploy for hosting it.

If you are deploying multiple models, tell Amazon SageMaker how to distribute traffic among them by setting initialVariantWeight on each production variant.

Example

// The code below shows an example of how to instantiate this type.
// The values are placeholders you should change.
import { aws_sagemaker as sagemaker } from 'aws-cdk-lib';
const productionVariantProperty: sagemaker.CfnEndpointConfig.ProductionVariantProperty = {
  initialVariantWeight: 123,
  modelName: 'modelName',
  variantName: 'variantName',

  // the properties below are optional
  acceleratorType: 'acceleratorType',
  containerStartupHealthCheckTimeoutInSeconds: 123,
  enableSsmAccess: false,
  initialInstanceCount: 123,
  instanceType: 'instanceType',
  modelDataDownloadTimeoutInSeconds: 123,
  serverlessConfig: {
    maxConcurrency: 123,
    memorySizeInMb: 123,

    // the properties below are optional
    provisionedConcurrency: 123,
  },
  volumeSizeInGb: 123,
};

Properties

Name                                          Type                                    Description
initialVariantWeight                          number                                  Determines initial traffic distribution among all of the models that you specify in the endpoint configuration.
modelName                                     string                                  The name of the model that you want to host.
variantName                                   string                                  The name of the production variant.
acceleratorType?                              string                                  The size of the Elastic Inference (EI) instance to use for the production variant.
containerStartupHealthCheckTimeoutInSeconds?  number                                  CfnEndpointConfig.ProductionVariantProperty.ContainerStartupHealthCheckTimeoutInSeconds.
enableSsmAccess?                              boolean | IResolvable                   CfnEndpointConfig.ProductionVariantProperty.EnableSSMAccess.
initialInstanceCount?                         number                                  Number of instances to launch initially.
instanceType?                                 string                                  The ML compute instance type.
modelDataDownloadTimeoutInSeconds?            number                                  CfnEndpointConfig.ProductionVariantProperty.ModelDataDownloadTimeoutInSeconds.
serverlessConfig?                             IResolvable | ServerlessConfigProperty  The serverless configuration for an endpoint.
volumeSizeInGb?                               number                                  CfnEndpointConfig.ProductionVariantProperty.VolumeSizeInGB.

initialVariantWeight

Type: number

Determines initial traffic distribution among all of the models that you specify in the endpoint configuration.

The traffic to a production variant is determined by the ratio of the VariantWeight to the sum of all VariantWeight values across all ProductionVariants. If unspecified, it defaults to 1.0.
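
The ratio described above can be worked through in a few lines of TypeScript. This is a minimal sketch for illustration only: the trafficShares helper and the VariantWeight interface are hypothetical names, not part of aws-cdk-lib, and the variant names and weights are placeholders.

```typescript
// Hypothetical helper (not part of aws-cdk-lib): computes each variant's
// share of traffic as its weight divided by the sum of all weights.
interface VariantWeight {
  variantName: string;
  initialVariantWeight: number;
}

function trafficShares(variants: VariantWeight[]): Map<string, number> {
  const total = variants.reduce((sum, v) => sum + v.initialVariantWeight, 0);
  if (total <= 0) {
    throw new Error('The sum of all variant weights must be positive');
  }
  const shares = new Map<string, number>();
  for (const v of variants) {
    shares.set(v.variantName, v.initialVariantWeight / total);
  }
  return shares;
}

// Placeholder variants: weights 3 and 1 give a 3:1 split.
const shares = trafficShares([
  { variantName: 'variantA', initialVariantWeight: 3 },
  { variantName: 'variantB', initialVariantWeight: 1 },
]);
// variantA receives 3 / (3 + 1) = 0.75 of the traffic, variantB 0.25.
```

Only the ratio matters, so weights of 3 and 1 produce the same split as 0.75 and 0.25.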


modelName

Type: string

The name of the model that you want to host.

This is the name that you specified when creating the model.


variantName

Type: string

The name of the production variant.


acceleratorType?

Type: string (optional)

The size of the Elastic Inference (EI) instance to use for the production variant.

EI instances provide on-demand GPU computing for inference. For more information, see Using Elastic Inference in Amazon SageMaker.


containerStartupHealthCheckTimeoutInSeconds?

Type: number (optional)

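The timeout value, in seconds, for your inference container to pass the health check by SageMaker Hosting. Corresponds to CfnEndpointConfig.ProductionVariantProperty.ContainerStartupHealthCheckTimeoutInSeconds.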


enableSsmAccess?

Type: boolean | IResolvable (optional)

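Turns on native AWS Systems Manager (SSM) access for a production variant behind an endpoint. Corresponds to CfnEndpointConfig.ProductionVariantProperty.EnableSSMAccess.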


initialInstanceCount?

Type: number (optional)

Number of instances to launch initially.


instanceType?

Type: string (optional)

The ML compute instance type.


modelDataDownloadTimeoutInSeconds?

Type: number (optional)

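The timeout value, in seconds, to download and extract the model that you want to host from Amazon S3 to the individual inference instance associated with this production variant. Corresponds to CfnEndpointConfig.ProductionVariantProperty.ModelDataDownloadTimeoutInSeconds.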


serverlessConfig?

Type: IResolvable | ServerlessConfigProperty (optional)

The serverless configuration for an endpoint.

Specifies a serverless endpoint configuration instead of an instance-based endpoint configuration.
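
The difference between the two hosting modes can be sketched as follows. The interfaces below are local stand-ins that mirror the property names listed on this page, so the block runs without aws-cdk-lib; the hostingMode function is hypothetical, and the rule it enforces (a variant sets either serverlessConfig or instanceType with initialInstanceCount, not both) is an assumption drawn from the word "instead" above, not something this page spells out. All values are placeholders.

```typescript
// Local stand-ins mirroring the property names on this page. In a CDK app the
// real type is sagemaker.CfnEndpointConfig.ProductionVariantProperty.
interface ServerlessConfigSketch {
  maxConcurrency: number;
  memorySizeInMb: number;
  provisionedConcurrency?: number;
}

interface ProductionVariantSketch {
  initialVariantWeight: number;
  modelName: string;
  variantName: string;
  initialInstanceCount?: number;
  instanceType?: string;
  serverlessConfig?: ServerlessConfigSketch;
}

// Hypothetical check (assumption): a variant is serverless or instance-based,
// never both and never neither.
function hostingMode(v: ProductionVariantSketch): 'serverless' | 'instance-based' {
  const hasInstanceSettings =
    v.instanceType !== undefined || v.initialInstanceCount !== undefined;
  if (v.serverlessConfig !== undefined) {
    if (hasInstanceSettings) {
      throw new Error(`${v.variantName}: serverlessConfig combined with instance settings`);
    }
    return 'serverless';
  }
  if (v.instanceType !== undefined && v.initialInstanceCount !== undefined) {
    return 'instance-based';
  }
  throw new Error(`${v.variantName}: set serverlessConfig or instanceType and initialInstanceCount`);
}

// Placeholder serverless variant: no instance type or count.
const serverlessVariant: ProductionVariantSketch = {
  initialVariantWeight: 1,
  modelName: 'modelName',
  variantName: 'serverlessVariant',
  serverlessConfig: { maxConcurrency: 5, memorySizeInMb: 2048 },
};

// Placeholder instance-based variant: no serverlessConfig.
const instanceVariant: ProductionVariantSketch = {
  initialVariantWeight: 1,
  modelName: 'modelName',
  variantName: 'instanceVariant',
  initialInstanceCount: 1,
  instanceType: 'ml.m5.large',
};

const serverlessMode = hostingMode(serverlessVariant);
const instanceMode = hostingMode(instanceVariant);
```

Because the stand-in interfaces use the same property names as this page, either object literal can be adapted to the real ProductionVariantProperty type.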


volumeSizeInGb?

Type: number (optional)

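The size, in GB, of the ML storage volume attached to each inference instance associated with the production variant. Corresponds to CfnEndpointConfig.ProductionVariantProperty.VolumeSizeInGB.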