Provides a Glue Job resource.
# Spark ETL job that runs a Python script; the command name defaults to glueetl.
resource "aws_glue_job" "example" {
  name     = "example"
  role_arn = aws_iam_role.example.arn

  command {
    script_location = "s3://${aws_s3_bucket.example.bucket}/example.py"
  }
}
# Ray job: requires glue_version 4.0 or greater and the Z.2X worker type.
resource "aws_glue_job" "example" {
  name         = "example"
  role_arn     = aws_iam_role.example.arn
  glue_version = "4.0"
  worker_type  = "Z.2X"

  command {
    name            = "glueray"
    python_version  = "3.9"
    runtime         = "Ray2.4"
    script_location = "s3://${aws_s3_bucket.example.bucket}/example.py"
  }
}
# Scala job: the language is selected through the --job-language default argument.
resource "aws_glue_job" "example" {
  name     = "example"
  role_arn = aws_iam_role.example.arn

  command {
    script_location = "s3://${aws_s3_bucket.example.bucket}/example.scala"
  }

  default_arguments = {
    "--job-language" = "scala"
  }
}
# Streaming job: the command name is set to gluestreaming.
resource "aws_glue_job" "example" {
  name     = "example streaming job"
  role_arn = aws_iam_role.example.arn

  command {
    name            = "gluestreaming"
    script_location = "s3://${aws_s3_bucket.example.bucket}/example.script"
  }
}
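The argument reference notes that max_capacity must be set when the command name is pythonshell, and none of the examples here cover that case. The following is a minimal sketch of a Python shell job under that rule. The resource label python_shell and the job name are illustrative, and aws_iam_role.example and aws_s3_bucket.example are assumed to be defined elsewhere, as in the other examples.

```hcl
# Python shell job (illustrative sketch; names are assumptions, not from the provider docs).
resource "aws_glue_job" "python_shell" {
  name     = "example-python-shell"
  role_arn = aws_iam_role.example.arn

  # Python shell jobs accept either 0.0625 or 1.0 DPU.
  max_capacity = 0.0625

  command {
    name            = "pythonshell"
    python_version  = "3.9"
    script_location = "s3://${aws_s3_bucket.example.bucket}/example.py"
  }
}
```

Because max_capacity is used here, number_of_workers and worker_type are left unset.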
# Enabling CloudWatch logs and metrics: the job writes continuous logs to this log group.
resource "aws_cloudwatch_log_group" "example" {
  name              = "example"
  retention_in_days = 14
}

resource "aws_glue_job" "example" {
  # ... other configuration ...

  default_arguments = {
    # ... potentially other arguments ...
    "--continuous-log-logGroup"          = aws_cloudwatch_log_group.example.name
    "--enable-continuous-cloudwatch-log" = "true"
    "--enable-continuous-log-filter"     = "true"
    "--enable-metrics"                   = ""
  }
}
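The examples above do not use the execution_property and notification_property blocks, or the retry, timeout, and worker-sizing arguments described in the argument reference. The following sketch combines them in one job. All values are arbitrary illustrations, the resource label tuned and the job name are hypothetical, and aws_iam_role.example and aws_s3_bucket.example are assumed to exist as in the earlier examples.

```hcl
# Spark ETL job with explicit sizing, retry, timeout, and notification settings
# (illustrative values only).
resource "aws_glue_job" "tuned" {
  name         = "example-tuned"
  role_arn     = aws_iam_role.example.arn
  glue_version = "4.0"

  # With glue_version 2.0 and above, size the job with workers instead of max_capacity.
  worker_type       = "G.1X"
  number_of_workers = 2

  max_retries = 1  # retry a failed run once
  timeout     = 60 # minutes

  command {
    script_location = "s3://${aws_s3_bucket.example.bucket}/example.py"
  }

  execution_property {
    max_concurrent_runs = 2
  }

  notification_property {
    notify_delay_after = 10 # minutes after the run starts
  }

  # Name-value pairs listed here cannot be overridden for this job.
  non_overridable_arguments = {
    "--job-language" = "python"
  }
}
```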
This resource supports the following arguments:
command - (Required) The command of the job. Defined below.
connections - (Optional) The list of connections used for this job.
default_arguments - (Optional) The map of default arguments for this job. You can specify arguments here that your own job-execution script consumes, as well as arguments that AWS Glue itself consumes. For information about how to specify and consume your own job arguments, see the Calling AWS Glue APIs in Python topic in the developer guide. For information about the key-value pairs that AWS Glue consumes to set up your job, see the Special Parameters Used by AWS Glue topic in the developer guide.
non_overridable_arguments - (Optional) Non-overridable arguments for this job, specified as name-value pairs.
description - (Optional) Description of the job.
execution_property - (Optional) Execution property of the job. Defined below.
glue_version - (Optional) The version of Glue to use, for example "1.0". Ray jobs should set this to 4.0 or greater. For information about available versions, see the AWS Glue Release Notes.
execution_class - (Optional) Indicates whether the job runs with a standard or flexible execution class. The standard execution class is ideal for time-sensitive workloads that require fast job startup and dedicated resources. Valid values: FLEX, STANDARD.
max_capacity - (Optional) The maximum number of AWS Glue data processing units (DPUs) that can be allocated when this job runs. Required when the command name is pythonshell, in which case it accepts either 0.0625 or 1.0. With glue_version 2.0 and above, use the number_of_workers and worker_type arguments instead.
max_retries - (Optional) The maximum number of times to retry this job if it fails.
name - (Required) The name you assign to this job. It must be unique in your account.
notification_property - (Optional) Notification property of the job. Defined below.
role_arn - (Required) The ARN of the IAM role associated with this job.
tags - (Optional) Key-value map of resource tags. If configured with a provider default_tags configuration block present, tags with matching keys will overwrite those defined at the provider-level.
timeout - (Optional) The job timeout in minutes. The default is 2880 minutes (48 hours) for glueetl and pythonshell jobs, and null (unlimited) for gluestreaming jobs.
security_configuration - (Optional) The name of the Security Configuration to be associated with the job.
worker_type - (Optional) The type of predefined worker that is allocated when a job runs. Accepts a value of Standard, G.1X, G.2X, or G.025X for Spark jobs. Accepts the value Z.2X for Ray jobs.
number_of_workers - (Optional) The number of workers of the defined worker_type that are allocated when a job runs.

The command block supports the following arguments:
name - (Optional) The name of the job command. Defaults to glueetl. Use pythonshell for Python Shell Job Type, glueray for Ray Job Type, or gluestreaming for Streaming Job Type. max_capacity needs to be set if pythonshell is chosen.
script_location - (Required) Specifies the S3 path to a script that executes a job.
python_version - (Optional) The Python version being used to execute a Python shell job. Allowed values are 2, 3 or 3.9. Version 3 refers to Python 3.6.
runtime - (Optional) In Ray jobs, runtime is used to specify the versions of Ray, Python and additional libraries available in your environment. This field is not used in other job types. For supported runtime environment values, see Working with Ray jobs in the Glue Developer Guide.

The execution_property block supports the following arguments:
max_concurrent_runs - (Optional) The maximum number of concurrent runs allowed for a job. The default is 1.

The notification_property block supports the following arguments:
notify_delay_after - (Optional) After a job run starts, the number of minutes to wait before sending a job run delay notification.

This resource exports the following attributes in addition to the arguments above:
arn - Amazon Resource Name (ARN) of the Glue Job.
id - Job name.
tags_all - A map of tags assigned to the resource, including those inherited from the provider default_tags configuration block.

In Terraform v1.5.0 and later, use an import block to import Glue Jobs using name. For example:
import {
  to = aws_glue_job.MyJob
  id = "MyJob"
}
Using terraform import, import Glue Jobs using name. For example:
% terraform import aws_glue_job.MyJob MyJob
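Once a job is managed by Terraform, whether created or imported, its exported attributes can be referenced from other configuration. As a small sketch, assuming a job declared as aws_glue_job.example like the ones in the examples, the following output blocks expose its arn and id. The output names are illustrative.

```hcl
# Expose the exported attributes of an existing aws_glue_job resource
# (output names are illustrative).
output "glue_job_arn" {
  value = aws_glue_job.example.arn
}

output "glue_job_name" {
  # For this resource, id is the job name.
  value = aws_glue_job.example.id
}
```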