aws-cdk-lib.aws_glue.CfnCrawlerProps

interface CfnCrawlerProps

Language    Type name
.NET        Amazon.CDK.AWS.Glue.CfnCrawlerProps
Go          github.com/aws/aws-cdk-go/awscdk/v2/awsglue#CfnCrawlerProps
Java        software.amazon.awscdk.services.glue.CfnCrawlerProps
Python      aws_cdk.aws_glue.CfnCrawlerProps
TypeScript  aws-cdk-lib » aws_glue » CfnCrawlerProps

Properties for defining a CfnCrawler.

Example

// The code below shows an example of how to instantiate this type.
// The values are placeholders you should change.
import { aws_glue as glue } from 'aws-cdk-lib';

declare const tags: any;
const cfnCrawlerProps: glue.CfnCrawlerProps = {
  role: 'role',
  targets: {
    catalogTargets: [{
      connectionName: 'connectionName',
      databaseName: 'databaseName',
      dlqEventQueueArn: 'dlqEventQueueArn',
      eventQueueArn: 'eventQueueArn',
      tables: ['tables'],
    }],
    deltaTargets: [{
      connectionName: 'connectionName',
      createNativeDeltaTable: false,
      deltaTables: ['deltaTables'],
      writeManifest: false,
    }],
    dynamoDbTargets: [{
      path: 'path',
    }],
    jdbcTargets: [{
      connectionName: 'connectionName',
      exclusions: ['exclusions'],
      path: 'path',
    }],
    mongoDbTargets: [{
      connectionName: 'connectionName',
      path: 'path',
    }],
    s3Targets: [{
      connectionName: 'connectionName',
      dlqEventQueueArn: 'dlqEventQueueArn',
      eventQueueArn: 'eventQueueArn',
      exclusions: ['exclusions'],
      path: 'path',
      sampleSize: 123,
    }],
  },

  // the properties below are optional
  classifiers: ['classifiers'],
  configuration: 'configuration',
  crawlerSecurityConfiguration: 'crawlerSecurityConfiguration',
  databaseName: 'databaseName',
  description: 'description',
  name: 'name',
  recrawlPolicy: {
    recrawlBehavior: 'recrawlBehavior',
  },
  schedule: {
    scheduleExpression: 'scheduleExpression',
  },
  schemaChangePolicy: {
    deleteBehavior: 'deleteBehavior',
    updateBehavior: 'updateBehavior',
  },
  tablePrefix: 'tablePrefix',
  tags: tags,
};

Properties

Name                           Type                                      Description
role                           string                                    The Amazon Resource Name (ARN) of an IAM role that's used to access customer resources, such as Amazon Simple Storage Service (Amazon S3) data.
targets                        IResolvable | TargetsProperty             A collection of targets to crawl.
classifiers?                   string[]                                  A list of UTF-8 strings that specify the names of custom classifiers that are associated with the crawler.
configuration?                 string                                    Crawler configuration information.
crawlerSecurityConfiguration?  string                                    The name of the SecurityConfiguration structure to be used by this crawler.
databaseName?                  string                                    The name of the database in which the crawler's output is stored.
description?                   string                                    A description of the crawler.
name?                          string                                    The name of the crawler.
recrawlPolicy?                 IResolvable | RecrawlPolicyProperty       A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
schedule?                      IResolvable | ScheduleProperty            For scheduled crawlers, the schedule when the crawler runs.
schemaChangePolicy?            IResolvable | SchemaChangePolicyProperty  The policy that specifies update and delete behaviors for the crawler.
tablePrefix?                   string                                    The prefix added to the names of tables that are created.
tags?                          any                                       The tags to use with this crawler.

role

Type: string

The Amazon Resource Name (ARN) of an IAM role that's used to access customer resources, such as Amazon Simple Storage Service (Amazon S3) data.


targets

Type: IResolvable | TargetsProperty

A collection of targets to crawl.


classifiers?

Type: string[] (optional)

A list of UTF-8 strings that specify the names of custom classifiers that are associated with the crawler.


configuration?

Type: string (optional)

Crawler configuration information.

This versioned JSON string allows users to specify aspects of a crawler's behavior. For more information, see Configuring a Crawler.
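
Because the property takes a JSON string rather than an object, one convenient approach is to build the configuration as a plain object and serialize it. The following is a minimal sketch, not a definitive setup: the `Version`, `CrawlerOutput`, and `Grouping` keys and their values come from the AWS Glue crawler configuration format rather than from this page, so verify them against the Glue documentation before relying on them.

```typescript
// Hypothetical crawler configuration. The keys and values below are
// assumptions based on the AWS Glue crawler configuration format.
const crawlerConfiguration = {
  Version: 1.0,
  CrawlerOutput: {
    // Assumed option: new partitions inherit metadata from their table.
    Partitions: { AddOrUpdateBehavior: 'InheritFromTable' },
  },
  Grouping: {
    // Assumed option: combine compatible schemas into a single table.
    TableGroupingPolicy: 'CombineCompatibleSchemas',
  },
};

// The `configuration` property expects a string, so serialize the object.
const configuration: string = JSON.stringify(crawlerConfiguration);
```

The resulting `configuration` string can then be assigned to the `configuration` property of a `CfnCrawlerProps` object.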


crawlerSecurityConfiguration?

Type: string (optional)

The name of the SecurityConfiguration structure to be used by this crawler.


databaseName?

Type: string (optional)

The name of the database in which the crawler's output is stored.


description?

Type: string (optional)

A description of the crawler.


name?

Type: string (optional)

The name of the crawler.


recrawlPolicy?

Type: IResolvable | RecrawlPolicyProperty (optional)

A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
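
As an illustration of the two choices described above, the sketch below sets the policy to crawl only newly added folders. The `RecrawlBehavior` type alias is a hypothetical helper introduced here for readability (the CDK property itself is typed as `string`), and the listed values are assumed from the AWS Glue API, not stated on this page.

```typescript
// Hypothetical helper type. The values are assumptions taken from the
// AWS Glue API; the CDK types `recrawlBehavior` as a plain string.
type RecrawlBehavior =
  | 'CRAWL_EVERYTHING'        // crawl the entire dataset again
  | 'CRAWL_NEW_FOLDERS_ONLY'  // crawl only folders added since the last run
  | 'CRAWL_EVENT_MODE';       // crawl only changes identified by events

// Shape matches RecrawlPolicyProperty: a single `recrawlBehavior` field.
const recrawlPolicy: { recrawlBehavior: RecrawlBehavior } = {
  recrawlBehavior: 'CRAWL_NEW_FOLDERS_ONLY',
};
```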


schedule?

Type: IResolvable | ScheduleProperty (optional)

For scheduled crawlers, the schedule when the crawler runs.
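
A schedule is expressed through a single `scheduleExpression` string. The sketch below assumes the AWS `cron(...)` expression syntax with six fields (minutes, hours, day of month, month, day of week, year); the specific time chosen is only a placeholder.

```typescript
// Hypothetical schedule: run every day at 12:15 UTC.
// Assumes the six-field AWS cron syntax:
//   minutes hours day-of-month month day-of-week year
const schedule = {
  scheduleExpression: 'cron(15 12 * * ? *)',
};
```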


schemaChangePolicy?

Type: IResolvable | SchemaChangePolicyProperty (optional)

The policy that specifies update and delete behaviors for the crawler.

The policy tells the crawler what to do in the event that it detects a change in a table that already exists in the customer's database at the time of the crawl. The SchemaChangePolicy does not affect whether or how new tables and partitions are added. New tables and partitions are always created regardless of the SchemaChangePolicy on a crawler.

The SchemaChangePolicy consists of two components, UpdateBehavior and DeleteBehavior.
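
To make the two components concrete, the following sketch updates changed tables in the catalog and marks deleted ones as deprecated. The `UpdateBehavior` and `DeleteBehavior` type aliases are hypothetical helpers defined here (the CDK types both fields as `string`), and the listed values are assumed from the AWS Glue API, not taken from this page.

```typescript
// Hypothetical helper types. The values are assumptions from the AWS Glue API.
type UpdateBehavior = 'LOG' | 'UPDATE_IN_DATABASE';
type DeleteBehavior = 'LOG' | 'DELETE_FROM_DATABASE' | 'DEPRECATE_IN_DATABASE';

// Shape matches SchemaChangePolicyProperty: both fields are optional strings.
const schemaChangePolicy: {
  updateBehavior?: UpdateBehavior;
  deleteBehavior?: DeleteBehavior;
} = {
  // Apply detected schema changes to the existing table definition.
  updateBehavior: 'UPDATE_IN_DATABASE',
  // Mark tables whose source has disappeared as deprecated instead of deleting them.
  deleteBehavior: 'DEPRECATE_IN_DATABASE',
};
```

Neither setting changes how new tables and partitions are handled; as noted above, those are always created.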


tablePrefix?

Type: string (optional)

The prefix added to the names of tables that are created.


tags?

Type: any (optional)

The tags to use with this crawler.