Using HCP Terraform's Continuous Validation feature with the AWS Provider

Continuous Validation in HCP Terraform

The Continuous Validation feature in HCP Terraform allows users to make assertions about their infrastructure between applied runs. This helps users to identify issues at the time they first appear and avoid situations where a change is only identified once it causes a customer-facing problem.

Users can add checks to their Terraform configuration using check blocks. Check blocks contain assertions, each defined with a custom condition expression and an error message. When the condition expression evaluates to true, the check passes; when it evaluates to false, Terraform shows a warning that includes the user-defined error message.
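
As a minimal sketch of that structure, the check below compares the AWS account ID returned by the aws_caller_identity data source against an expected value. The variable expected_account_id and the check name are hypothetical and exist only for illustration.

```hcl
# Hypothetical input: the account ID this configuration is expected to run in.
variable "expected_account_id" {
  type = string
}

check "expected_account" {
  # Scoped data source: only this check block can reference it.
  data "aws_caller_identity" "current" {}

  assert {
    # The check passes when this expression is true.
    condition = data.aws_caller_identity.current.account_id == var.expected_account_id
    # Shown as part of the warning when the condition is false.
    error_message = "Running against AWS account ${data.aws_caller_identity.current.account_id}, expected ${var.expected_account_id}."
  }
}
```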

Custom conditions can be created using data from Terraform providers’ resources and data sources. Data can also be combined from multiple sources; for example, you can use checks to monitor expirable resources by comparing a resource’s expiration date attribute to the current time returned by Terraform’s built-in time functions.
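
The expiration pattern described above might look like the following sketch. It assumes an aws_acm_certificate.example resource is declared elsewhere in the configuration and has already been issued, so that its not_after attribute holds an RFC 3339 timestamp. The 30-day (720h) threshold and all names are illustrative assumptions.

```hcl
locals {
  # Hypothetical threshold: warn when fewer than 30 days (720h) remain.
  expiry_warning_time = timeadd(timestamp(), "720h")
}

check "certificate_not_expiring" {
  assert {
    # timecmp returns 1 when the first timestamp is later than the second,
    # so the check passes while the certificate expires after the threshold.
    condition = timecmp(aws_acm_certificate.example.not_after, local.expiry_warning_time) > 0
    error_message = format("ACM certificate for '%s' expires at %s, within the next 30 days.",
      aws_acm_certificate.example.domain_name,
      aws_acm_certificate.example.not_after
    )
  }
}
```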

Below, this guide shows examples of how data returned by the AWS provider can be used to define checks in your Terraform configuration.

Example - Ensure your AWS account is within budget (aws_budgets_budget)

AWS Budgets allows you to track and take action on your AWS costs and usage. You can use AWS Budgets to monitor your aggregate utilization and coverage metrics for your Reserved Instances (RIs) or Savings Plans.

The example below shows how a check block can be used to assert that your spend remains within the budgets you have established.

check "check_budget_exceeded" {
  data "aws_budgets_budget" "example" {
    name = aws_budgets_budget.example.name
  }

  assert {
    condition = !data.aws_budgets_budget.example.budget_exceeded
    error_message = format("AWS budget has been exceeded! Calculated spend: '%s' and budget limit: '%s'",
      data.aws_budgets_budget.example.calculated_spend[0].actual_spend[0].amount,
      data.aws_budgets_budget.example.budget_limit[0].amount
    )
  }
}

If the calculated spend exceeds the budget limit, the check block assertion will return a warning similar to the following:

│ Warning: Check block assertion failed
│
│   on main.tf line 43, in check "check_budget_exceeded":
│   43:     condition = !data.aws_budgets_budget.example.budget_exceeded
│     ├────────────────
│     │ data.aws_budgets_budget.example.budget_exceeded is true
│
│ AWS budget has been exceeded! Calculated spend: '1550.0' and budget limit: '1200.0'

Example - Check GuardDuty for Threats (aws_guardduty_finding_ids)

Amazon GuardDuty is a threat detection service that continuously monitors for malicious activity and unauthorized behavior to protect your AWS accounts, workloads, and data stored in Amazon S3. The cloud simplifies the collection and aggregation of account and network activity, but continuously analyzing event log data for potential threats can be time consuming for security teams. GuardDuty offers an intelligent and cost-effective option for continuous threat detection in the AWS Cloud.

The following example shows how a check block can be used to assert that Amazon GuardDuty has not identified any threats.

data "aws_guardduty_detector" "example" {}

check "check_guardduty_findings" {
  data "aws_guardduty_finding_ids" "example" {
    detector_id = data.aws_guardduty_detector.example.id
  }

  assert {
    condition = !data.aws_guardduty_finding_ids.example.has_findings
    error_message = format("AWS GuardDuty detector '%s' has %d open findings!",
      data.aws_guardduty_finding_ids.example.detector_id,
      length(data.aws_guardduty_finding_ids.example.finding_ids),
    )
  }
}

If findings are present, the check block assertion will return a warning similar to the following:

│ Warning: Check block assertion failed
│
│   on main.tf line 24, in check "check_guardduty_findings":
│   24:     condition = !data.aws_guardduty_finding_ids.example.has_findings
│     ├────────────────
│     │ data.aws_guardduty_finding_ids.example.has_findings is true
│
│ AWS GuardDuty detector 'abcdef123456' has 9 open findings!

Example - Check for unused IAM roles (aws_iam_role)

AWS IAM tracks role usage, including the last used date and region. This information is returned with the aws_iam_role data source, and can be used in continuous validation to check for unused roles. AWS reports activity for the trailing 400 days. If a role is unused within that period, the last_used_date will be an empty string ("").

In the example below, the timecmp function checks that last_used_date is more recent than the unused_limit local value (30 days ago). The coalesce function handles empty ("") last_used_date values safely by falling back to unused_limit. In that case timecmp compares unused_limit with itself and returns 0, so the > 0 condition fails and the check reports the role as unused.

locals {
  unused_limit = timeadd(timestamp(), "-720h")
}

check "check_iam_role_unused" {
  data "aws_iam_role" "example" {
    name = aws_iam_role.example.name
  }

  assert {
    condition = (
      timecmp(
        coalesce(data.aws_iam_role.example.role_last_used[0].last_used_date, local.unused_limit),
        local.unused_limit,
      ) > 0
    )
    error_message = format("AWS IAM role '%s' has not been used in the last 30 days!",
      data.aws_iam_role.example.name,
    )
  }
}
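
The "-720h" duration in the example above is 30 days expressed in hours (30 × 24 = 720). As a sketch, the window could be made configurable by deriving the duration from a variable. The unused_days variable is a hypothetical addition, and this locals block would replace the one shown above rather than sit alongside it.

```hcl
# Hypothetical input variable; not part of the original example.
variable "unused_days" {
  type        = number
  default     = 30
  description = "Whole number of days without use after which a role is reported."
}

locals {
  # With the default of 30 days: 30 * 24 = 720, producing the duration "-720h".
  unused_limit = timeadd(timestamp(), format("-%dh", var.unused_days * 24))
}
```

If you adopt this, the error message should also reference var.unused_days instead of the hard-coded "30 days".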

Example - Check EKS Cluster Instance Health and Availability (aws_eks_cluster)

Amazon Elastic Kubernetes Service (Amazon EKS) is a managed service for running Kubernetes in the AWS cloud and in on-premises data centers. In the cloud, Amazon EKS automatically manages the availability and scalability of the Kubernetes control plane nodes, which are responsible for scheduling containers, managing application availability, storing cluster data, and other key tasks. With Amazon EKS, you get the performance, scale, reliability, and availability of AWS infrastructure, as well as integrations with AWS networking and security services.

The example below shows how a check block can be used to assert that your cluster is in good health and available.

check "aws_eks_cluster_default" {
  assert {
    condition     = aws_eks_cluster.default.status == "ACTIVE"
    error_message = "EKS cluster ${aws_eks_cluster.default.id} status is ${aws_eks_cluster.default.status}"
  }
}
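
The check above reads the status from the managed resource. A variant that follows the nested data source pattern used in the earlier examples might look like the sketch below. It assumes the same aws_eks_cluster.default resource exists, and the check name is hypothetical. One reason to consider this form is that errors raised while reading a data source scoped to a check block are reported as warnings and do not block the run.

```hcl
check "aws_eks_cluster_default_lookup" {
  # Look the cluster up by name through the aws_eks_cluster data source.
  data "aws_eks_cluster" "default" {
    name = aws_eks_cluster.default.name
  }

  assert {
    condition     = data.aws_eks_cluster.default.status == "ACTIVE"
    error_message = "EKS cluster ${data.aws_eks_cluster.default.name} status is ${data.aws_eks_cluster.default.status}"
  }
}
```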

Example - Check for EC2 Stopped Instances (aws_instances)

Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides secure, resizable compute capacity in the cloud. It offers reliable, scalable infrastructure on demand, lets you scale capacity within minutes, and carries an SLA commitment of 99.99% availability.

The example below shows how a check block can be used to assert that none of your EC2 instances are in the stopped state.

check "aws_instances_stopped" {
  data "aws_instances" "example" {
    # instance_state_names expects a list of state names.
    instance_state_names = ["stopped"]
  }

  assert {
    # Pass only when no stopped instances are found.
    condition     = length(data.aws_instances.example.ids) == 0
    error_message = format("Found stopped EC2 instances! Instance IDs: %s", join(", ", data.aws_instances.example.ids))
  }
}