Resource: aws_glue_crawler

Manages a Glue Crawler. More information can be found in the AWS Glue Developer Guide.

Example Usage

DynamoDB Target Example

resource "aws_glue_crawler" "example" {
  database_name = aws_glue_catalog_database.example.name
  name          = "example"
  role          = aws_iam_role.example.arn

  dynamodb_target {
    path = "table-name"
  }
}

JDBC Target Example

resource "aws_glue_crawler" "example" {
  database_name = aws_glue_catalog_database.example.name
  name          = "example"
  role          = aws_iam_role.example.arn

  jdbc_target {
    connection_name = aws_glue_connection.example.name
    path            = "database-name/%"
  }
}

S3 Target Example

resource "aws_glue_crawler" "example" {
  database_name = aws_glue_catalog_database.example.name
  name          = "example"
  role          = aws_iam_role.example.arn

  s3_target {
    path = "s3://${aws_s3_bucket.example.bucket}"
  }
}

Catalog Target Example

resource "aws_glue_crawler" "example" {
  database_name = aws_glue_catalog_database.example.name
  name          = "example"
  role          = aws_iam_role.example.arn

  catalog_target {
    database_name = aws_glue_catalog_database.example.name
    tables        = [aws_glue_catalog_table.example.name]
  }

  schema_change_policy {
    delete_behavior = "LOG"
  }

  configuration = <<EOF
{
  "Version":1.0,
  "Grouping": {
    "TableGroupingPolicy": "CombineCompatibleSchemas"
  }
}
EOF
}

MongoDB Target Example

resource "aws_glue_crawler" "example" {
  database_name = aws_glue_catalog_database.example.name
  name          = "example"
  role          = aws_iam_role.example.arn

  mongodb_target {
    connection_name = aws_glue_connection.example.name
    path            = "database-name/%"
  }
}

Configuration Settings Example

resource "aws_glue_crawler" "events_crawler" {
  database_name = aws_glue_catalog_database.glue_database.name
  schedule      = "cron(0 1 * * ? *)"
  name          = "events_crawler_${var.environment_name}"
  role          = aws_iam_role.glue_role.arn
  tags          = var.tags

  configuration = jsonencode(
    {
      Grouping = {
        TableGroupingPolicy = "CombineCompatibleSchemas"
      }
      CrawlerOutput = {
        Partitions = { AddOrUpdateBehavior = "InheritFromTable" }
      }
      Version = 1
    }
  )

  s3_target {
    path = "s3://${aws_s3_bucket.data_lake_bucket.bucket}"
  }
}

Argument Reference

This resource supports the following arguments:

DynamoDB Target

JDBC Target

S3 Target

Catalog Target

MongoDB Target

Hudi Target

Iceberg Target

Delta Target

Schema Change Policy
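
A minimal sketch of a schema_change_policy block, assuming the value sets accepted by the AWS Glue API (delete_behavior: LOG, DELETE_FROM_DATABASE, or DEPRECATE_IN_DATABASE; update_behavior: LOG or UPDATE_IN_DATABASE). The referenced database, role, and bucket follow the examples above and are assumed to be defined elsewhere in the configuration.

```hcl
resource "aws_glue_crawler" "example" {
  database_name = aws_glue_catalog_database.example.name
  name          = "example"
  role          = aws_iam_role.example.arn

  s3_target {
    path = "s3://${aws_s3_bucket.example.bucket}"
  }

  schema_change_policy {
    # Mark tables as deprecated in the catalog instead of deleting them
    # when the crawler no longer finds the underlying objects.
    delete_behavior = "DEPRECATE_IN_DATABASE"

    # Apply detected schema changes to the existing catalog tables.
    update_behavior = "UPDATE_IN_DATABASE"
  }
}
```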

Lake Formation Configuration

Lineage Configuration
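
A minimal sketch of a lineage_configuration block, assuming the crawler_lineage_settings argument takes ENABLE or DISABLE as in the AWS Glue API. The referenced database, role, and bucket follow the examples above and are assumed to be defined elsewhere.

```hcl
resource "aws_glue_crawler" "example" {
  database_name = aws_glue_catalog_database.example.name
  name          = "example"
  role          = aws_iam_role.example.arn

  s3_target {
    path = "s3://${aws_s3_bucket.example.bucket}"
  }

  lineage_configuration {
    # Assumed values: "ENABLE" or "DISABLE".
    crawler_lineage_settings = "ENABLE"
  }
}
```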

Recrawl Policy
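
A minimal sketch of a recrawl_policy block for incremental crawls of an S3 target, assuming recrawl_behavior accepts CRAWL_EVERYTHING, CRAWL_NEW_FOLDERS_ONLY, or CRAWL_EVENT_MODE as in the AWS Glue API. AWS Glue is understood to require both schema change behaviors to be LOG when only new folders are crawled, so the sketch sets them explicitly. The referenced database, role, and bucket follow the examples above and are assumed to be defined elsewhere.

```hcl
resource "aws_glue_crawler" "example" {
  database_name = aws_glue_catalog_database.example.name
  name          = "example"
  role          = aws_iam_role.example.arn

  s3_target {
    path = "s3://${aws_s3_bucket.example.bucket}"
  }

  recrawl_policy {
    # Only crawl folders added since the last crawler run.
    recrawl_behavior = "CRAWL_NEW_FOLDERS_ONLY"
  }

  schema_change_policy {
    # Assumption: incremental crawls require logging-only behaviors.
    delete_behavior = "LOG"
    update_behavior = "LOG"
  }
}
```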

Attribute Reference

This resource exports the following attributes in addition to the arguments above:

Import

In Terraform v1.5.0 and later, use an import block to import a Glue Crawler by its name. For example:

import {
  to = aws_glue_crawler.MyJob
  id = "MyJob"
}

Alternatively, use the terraform import command to import a Glue Crawler by its name. For example:

% terraform import aws_glue_crawler.MyJob MyJob