aws-cdk-lib.aws_stepfunctions_tasks.SageMakerCreateTransformJobProps

interface SageMakerCreateTransformJobProps

Language | Type name
.NET | Amazon.CDK.AWS.StepFunctions.Tasks.SageMakerCreateTransformJobProps
Go | github.com/aws/aws-cdk-go/awscdk/v2/awsstepfunctionstasks#SageMakerCreateTransformJobProps
Java | software.amazon.awscdk.services.stepfunctions.tasks.SageMakerCreateTransformJobProps
Python | aws_cdk.aws_stepfunctions_tasks.SageMakerCreateTransformJobProps
TypeScript (source) | aws-cdk-lib » aws_stepfunctions_tasks » SageMakerCreateTransformJobProps

Properties for creating an Amazon SageMaker transform job task.

Example

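import { Duration } from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as tasks from 'aws-cdk-lib/aws-stepfunctions-tasks';

// `this` is the enclosing construct, for example a Stack.
new tasks.SageMakerCreateTransformJob(this, 'Batch Inference', {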
  transformJobName: 'MyTransformJob',
  modelName: 'MyModelName',
  modelClientOptions: {
    invocationsMaxRetries: 3,  // default is 0
    invocationsTimeout: Duration.minutes(5),  // default is 60 seconds
  },
  transformInput: {
    transformDataSource: {
      s3DataSource: {
        s3Uri: 's3://inputbucket/train',
        s3DataType: tasks.S3DataType.S3_PREFIX,
      }
    }
  },
  transformOutput: {
    s3OutputPath: 's3://outputbucket/TransformJobOutputPath',
  },
  transformResources: {
    instanceCount: 1,
    instanceType: ec2.InstanceType.of(ec2.InstanceClass.M4, ec2.InstanceSize.XLARGE),
  }
});

Properties

Name | Type | Description
modelName | string | Name of the model that you want to use for the transform job.
transformInput | TransformInput | Dataset to be transformed and the Amazon S3 location where it is stored.
transformJobName | string | Transform Job Name.
transformOutput | TransformOutput | S3 location where you want Amazon SageMaker to save the results from the transform job.
batchStrategy? | BatchStrategy | Number of records to include in a mini-batch for an HTTP inference request.
comment? | string | An optional description for this state.
credentials? | Credentials | Credentials for an IAM Role that the State Machine assumes for executing the task.
environment? | { [string]: string } | Environment variables to set in the Docker container.
heartbeat?⚠️ | Duration | Timeout for the heartbeat.
heartbeatTimeout? | Timeout | Timeout for the heartbeat.
inputPath? | string | JSONPath expression to select part of the state to be the input to this state.
integrationPattern? | IntegrationPattern | AWS Step Functions integrates with services directly in the Amazon States Language.
maxConcurrentTransforms? | number | Maximum number of parallel requests that can be sent to each instance in a transform job.
maxPayload? | Size | Maximum allowed size of the payload, in MB.
modelClientOptions? | ModelClientOptions | Configures the timeout and maximum number of retries for processing a transform job invocation.
outputPath? | string | JSONPath expression to select a portion of the state output to pass to the next state.
resultPath? | string | JSONPath expression to indicate where to inject the state's output.
resultSelector? | { [string]: any } | The JSON that will replace the state's raw result and become the effective result before ResultPath is applied.
role? | IRole | Role for the Transform Job.
tags? | { [string]: string } | Tags to be applied to the transform job.
taskTimeout? | Timeout | Timeout for the task.
timeout?⚠️ | Duration | Timeout for the task.
transformResources? | TransformResources | ML compute instances for the transform job.

modelName

Type: string

Name of the model that you want to use for the transform job.


transformInput

Type: TransformInput

Dataset to be transformed and the Amazon S3 location where it is stored.


transformJobName

Type: string

Transform Job Name.


transformOutput

Type: TransformOutput

S3 location where you want Amazon SageMaker to save the results from the transform job.


batchStrategy?

Type: BatchStrategy (optional, default: No batch strategy)

Number of records to include in a mini-batch for an HTTP inference request.


comment?

Type: string (optional, default: No comment)

An optional description for this state.


credentials?

Type: Credentials (optional, default: None (Task is executed using the State Machine's execution role))

Credentials for an IAM Role that the State Machine assumes for executing the task.

This enables cross-account resource invocations.

See also: https://docs.aws.amazon.com/step-functions/latest/dg/concepts-access-cross-acct-resources.html
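As a minimal sketch of the cross-account case, the task below is configured with an imported role. The account ID, role name, construct IDs, and the standalone `Stack` are placeholders chosen for illustration; they do not come from this reference.

```typescript
import { Stack } from 'aws-cdk-lib';
import * as iam from 'aws-cdk-lib/aws-iam';
import * as sfn from 'aws-cdk-lib/aws-stepfunctions';
import * as tasks from 'aws-cdk-lib/aws-stepfunctions-tasks';

// Throwaway stack for illustration; in an app this would be your own Stack.
const stack = new Stack();

// Hypothetical role that lives in another account (placeholder ARN).
const crossAccountRole = iam.Role.fromRoleArn(
  stack,
  'CrossAccountRole',
  'arn:aws:iam::123456789012:role/TransformJobRunner',
);

const crossAccountTask = new tasks.SageMakerCreateTransformJob(stack, 'Cross Account Inference', {
  transformJobName: 'MyTransformJob',
  modelName: 'MyModelName',
  transformInput: {
    transformDataSource: {
      s3DataSource: { s3Uri: 's3://inputbucket/train' },
    },
  },
  transformOutput: { s3OutputPath: 's3://outputbucket/TransformJobOutputPath' },
  // The state machine assumes this role when it executes the task.
  credentials: { role: sfn.TaskRole.fromRole(crossAccountRole) },
});
```

If the role ARN is only known at run time, `sfn.TaskRole.fromRoleArnJsonPath` accepts a JSONPath expression instead of a role object.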


environment?

Type: { [string]: string } (optional, default: No environment variables)

Environment variables to set in the Docker container.


heartbeat?⚠️

⚠️ Deprecated: use heartbeatTimeout

Type: Duration (optional, default: None)

Timeout for the heartbeat.


heartbeatTimeout?

Type: Timeout (optional, default: None)

Timeout for the heartbeat.


inputPath?

Type: string (optional, default: The entire task input (JSON path '$'))

JSONPath expression to select part of the state to be the input to this state.

May also be the special value JsonPath.DISCARD, which will cause the effective input to be the empty object {}.


integrationPattern?

Type: IntegrationPattern (optional, default: IntegrationPattern.REQUEST_RESPONSE for most tasks. IntegrationPattern.RUN_JOB for the following exceptions: BatchSubmitJob, EmrAddStep, EmrCreateCluster, EmrTerminateCluster, and EmrContainersStartJobRun.)

AWS Step Functions integrates with services directly in the Amazon States Language.

You can control these AWS services using service integration patterns.

See also: https://docs.aws.amazon.com/step-functions/latest/dg/connect-to-resource.html#connect-wait-token
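With the request-response pattern the state moves on as soon as the API call returns, while the run-a-job pattern makes the state wait for the job to complete. The sketch below opts into the latter for a transform job. The standalone `Stack`, construct ID, and bucket names are illustrative assumptions.

```typescript
import { Stack } from 'aws-cdk-lib';
import * as sfn from 'aws-cdk-lib/aws-stepfunctions';
import * as tasks from 'aws-cdk-lib/aws-stepfunctions-tasks';

// Throwaway stack for illustration; in an app this would be your own Stack.
const stack = new Stack();

const waitingTask = new tasks.SageMakerCreateTransformJob(stack, 'Batch Inference And Wait', {
  transformJobName: 'MyTransformJob',
  modelName: 'MyModelName',
  transformInput: {
    transformDataSource: {
      s3DataSource: { s3Uri: 's3://inputbucket/train' },
    },
  },
  transformOutput: { s3OutputPath: 's3://outputbucket/TransformJobOutputPath' },
  // Wait for the transform job to finish before moving to the next state.
  integrationPattern: sfn.IntegrationPattern.RUN_JOB,
});
```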


maxConcurrentTransforms?

Type: number (optional, default: Amazon SageMaker checks the optional execution-parameters to determine the settings for your chosen algorithm. If the execution-parameters endpoint is not enabled, the default value is 1.)

Maximum number of parallel requests that can be sent to each instance in a transform job.


maxPayload?

Type: Size (optional, default: 6 MB)

Maximum allowed size of the payload, in MB.
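The batching-related properties (batchStrategy, maxConcurrentTransforms, maxPayload) are usually tuned together. The following sketch sets all three, plus an environment variable. The numeric values, the `LOG_LEVEL` variable, and the standalone `Stack` are arbitrary assumptions for illustration, not recommendations.

```typescript
import { Size, Stack } from 'aws-cdk-lib';
import * as tasks from 'aws-cdk-lib/aws-stepfunctions-tasks';

// Throwaway stack for illustration; in an app this would be your own Stack.
const stack = new Stack();

const tunedTask = new tasks.SageMakerCreateTransformJob(stack, 'Tuned Batch Inference', {
  transformJobName: 'MyTransformJob',
  modelName: 'MyModelName',
  transformInput: {
    transformDataSource: {
      s3DataSource: { s3Uri: 's3://inputbucket/train' },
    },
  },
  transformOutput: { s3OutputPath: 's3://outputbucket/TransformJobOutputPath' },
  // Pack several records into each inference request.
  batchStrategy: tasks.BatchStrategy.MULTI_RECORD,
  // Up to four parallel requests per instance (illustrative value).
  maxConcurrentTransforms: 4,
  // Payload limit per request; Size handles the unit conversion.
  maxPayload: Size.mebibytes(10),
  // Hypothetical variable passed to the model container.
  environment: { LOG_LEVEL: 'info' },
});
```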


modelClientOptions?

Type: ModelClientOptions (optional, default: 0 retries and 60 seconds of timeout)

Configures the timeout and maximum number of retries for processing a transform job invocation.


outputPath?

Type: string (optional, default: The entire JSON node determined by the state input, the task result, and resultPath is passed to the next state (JSON path '$'))

JSONPath expression to select a portion of the state output to pass to the next state.

May also be the special value JsonPath.DISCARD, which will cause the effective output to be the empty object {}.


resultPath?

Type: string (optional, default: Replaces the entire input with the result (JSON path '$'))

JSONPath expression to indicate where to inject the state's output.

May also be the special value JsonPath.DISCARD, which will cause the state's input to become its output.


resultSelector?

Type: { [string]: any } (optional, default: None)

The JSON that will replace the state's raw result and become the effective result before ResultPath is applied.

You can use ResultSelector to create a payload with values that are static or selected from the state's raw result.

See also: https://docs.aws.amazon.com/step-functions/latest/dg/input-output-inputpath-params.html#input-output-resultselector
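The order in which inputPath, resultSelector, resultPath, and outputPath are applied can be easier to follow in code. The sketch below is a simplified model of that flow, not part of the CDK or the Step Functions runtime: `select`, `inject`, and `processState` are hypothetical helpers, only `$` and dotted paths such as `$.a.b` are handled, and `null` stands in for `JsonPath.DISCARD`.

```typescript
type JsonObject = { [key: string]: unknown };

// Resolve '$' or a dotted path such as '$.a.b' (tiny JSONPath subset).
function select(doc: unknown, path: string): unknown {
  if (path === '$') return doc;
  return path.slice(2).split('.').reduce<unknown>(
    (node, key) => (node as JsonObject | undefined)?.[key],
    doc,
  );
}

// Return a copy of `doc` with `value` written at `path`.
function inject(doc: JsonObject, path: string, value: unknown): unknown {
  if (path === '$') return value;
  const copy: JsonObject = JSON.parse(JSON.stringify(doc));
  const keys = path.slice(2).split('.');
  let node = copy;
  for (const key of keys.slice(0, -1)) {
    const child = node[key];
    if (typeof child !== 'object' || child === null) node[key] = {};
    node = node[key] as JsonObject;
  }
  node[keys[keys.length - 1]] = value;
  return copy;
}

interface Paths {
  inputPath?: string | null; // null models JsonPath.DISCARD
  resultSelector?: JsonObject;
  resultPath?: string | null;
  outputPath?: string | null;
}

function processState(
  stateInput: JsonObject,
  paths: Paths,
  runTask: (input: unknown) => JsonObject,
): unknown {
  // 1. inputPath picks the effective input (DISCARD yields {}).
  const effectiveInput = paths.inputPath === null ? {} : select(stateInput, paths.inputPath ?? '$');
  // 2. The task runs and produces its raw result.
  const rawResult = runTask(effectiveInput);
  // 3. resultSelector reshapes the raw result; keys ending in '.$' are paths.
  let result: unknown = rawResult;
  if (paths.resultSelector) {
    const selected: JsonObject = {};
    for (const [key, value] of Object.entries(paths.resultSelector)) {
      if (key.endsWith('.$')) selected[key.slice(0, -2)] = select(rawResult, value as string);
      else selected[key] = value;
    }
    result = selected;
  }
  // 4. resultPath injects the result into the state input (DISCARD keeps the input).
  const combined = paths.resultPath === null
    ? stateInput
    : inject(stateInput, paths.resultPath ?? '$', result);
  // 5. outputPath picks what is passed to the next state (DISCARD yields {}).
  return paths.outputPath === null ? {} : select(combined, paths.outputPath ?? '$');
}

// Stubbed task result; the ARN is a placeholder.
const output = processState(
  { dataset: { uri: 's3://inputbucket/train' }, requestId: 'abc' },
  {
    inputPath: '$.dataset',
    resultSelector: { 'jobArn.$': '$.TransformJobArn', source: 'batch-inference' },
    resultPath: '$.transform',
  },
  () => ({ TransformJobArn: 'arn:aws:sagemaker:us-east-1:123456789012:transform-job/MyTransformJob' }),
);
console.log(JSON.stringify(output, null, 2));
```

In this run the original `dataset` and `requestId` fields survive, and the reshaped result appears under `transform`, because resultPath writes into the state input rather than into the effective input selected by inputPath.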


role?

Type: IRole (optional, default: A role is created with AmazonSageMakerFullAccess managed policy)

Role for the Transform Job.


tags?

Type: { [string]: string } (optional, default: No tags)

Tags to be applied to the transform job.


taskTimeout?

Type: Timeout (optional, default: None)

Timeout for the task.


timeout?⚠️

⚠️ Deprecated: use taskTimeout

Type: Duration (optional, default: None)

Timeout for the task.
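Code that still uses the deprecated heartbeat and timeout properties can switch to heartbeatTimeout and taskTimeout by wrapping the same `Duration` values in `sfn.Timeout.duration`. The sketch below shows the replacement form. The durations, construct ID, and standalone `Stack` are illustrative assumptions.

```typescript
import { Duration, Stack } from 'aws-cdk-lib';
import * as sfn from 'aws-cdk-lib/aws-stepfunctions';
import * as tasks from 'aws-cdk-lib/aws-stepfunctions-tasks';

// Throwaway stack for illustration; in an app this would be your own Stack.
const stack = new Stack();

const timedTask = new tasks.SageMakerCreateTransformJob(stack, 'Timed Batch Inference', {
  transformJobName: 'MyTransformJob',
  modelName: 'MyModelName',
  transformInput: {
    transformDataSource: {
      s3DataSource: { s3Uri: 's3://inputbucket/train' },
    },
  },
  transformOutput: { s3OutputPath: 's3://outputbucket/TransformJobOutputPath' },
  // Replaces the deprecated `heartbeat: Duration.minutes(5)`.
  heartbeatTimeout: sfn.Timeout.duration(Duration.minutes(5)),
  // Replaces the deprecated `timeout: Duration.hours(2)`.
  taskTimeout: sfn.Timeout.duration(Duration.hours(2)),
});
```

`sfn.Timeout.at` takes a JSONPath expression instead, for cases where the timeout is supplied in the state input.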


transformResources?

Type: TransformResources (optional, default: 1 instance of type M4.XLarge)

ML compute instances for the transform job.
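To override the default of one M4.XLarge instance, pass a transformResources object. The sketch below requests two larger instances; the instance class, size, count, and standalone `Stack` are arbitrary choices for illustration.

```typescript
import { Stack } from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as tasks from 'aws-cdk-lib/aws-stepfunctions-tasks';

// Throwaway stack for illustration; in an app this would be your own Stack.
const stack = new Stack();

const scaledTask = new tasks.SageMakerCreateTransformJob(stack, 'Scaled Batch Inference', {
  transformJobName: 'MyTransformJob',
  modelName: 'MyModelName',
  transformInput: {
    transformDataSource: {
      s3DataSource: { s3Uri: 's3://inputbucket/train' },
    },
  },
  transformOutput: { s3OutputPath: 's3://outputbucket/TransformJobOutputPath' },
  transformResources: {
    // Two instances share the transform workload.
    instanceCount: 2,
    instanceType: ec2.InstanceType.of(ec2.InstanceClass.M5, ec2.InstanceSize.XLARGE2),
  },
});
```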