Tracing Infrastructure Changes with AWS Provider User Agent header

“Failures are a given and everything will eventually fail over time.” - Werner Vogels, CTO, Amazon.com

And when an application goes down or an infrastructure change goes wrong, the first question is always “What changed, and who changed it?” At a high level, CloudTrail gives you a view into the API calls that could have led to the issue you are investigating. But connecting those calls back to a specific deployment, commit, or person requires more context. If you use Terraform to deploy your AWS workloads, you have a way to add that context. No, I am not talking about tags. I am talking about the provider’s user agent mechanism: a way to embed custom metadata into every API call Terraform makes.

Phew, user agent and context, and no AI involved! This post looks at the ways you can introduce this metadata.

What is a User Agent Header?

Every AWS API call includes a User-Agent header that identifies the client making the request. The AWS Terraform provider sends a string like the sample below:

APN/1.0 HashiCorp/1.0 Terraform/1.15.0-alpha20260204 (+https://www.terraform.io) 
terraform-provider-aws/6.32.1 (+https://registry.terraform.io/providers/hashicorp/aws) 
aws-sdk-go-v2/1.41.1 ua/2.1 os/macos lang/go#1.25.7 md/GOOS#darwin md/GOARCH#arm64 
api/sts#1.41.6 m/v,w  test-infrastructure/1.0.0 (local-test)" tf_aws.sdk=aws-sdk-go-v2 timestamp=2026-02-14T22:48:03.584-0500

This tells us the Terraform version, the provider version, and the SDK details, but nothing about your deployment context. You can see it by running terraform plan with debug logging enabled:

TF_LOG=DEBUG terraform plan 2>&1 | grep "http.user_agent"

The provider documentation lists three ways to add the information you want:

The user_agent provider argument.
The TF_APPEND_USER_AGENT environment variable.
The provider_meta user_agent argument (introduced in section 3 below; a deeper dive needs a post of its own).

This post focuses only on the AWS provider, though the AWSCC provider also supports custom user agents. For Cloud Control API based calls, the user agent data is not passed through to CloudTrail (yet).

Provider-Level User Agent

Basic Configuration

Let’s see this in action with the latest AWS provider. All you need to do is use the user_agent attribute. For example, the below configuration

provider "aws" {
  region = "us-east-1"

  user_agent = [
    "Manu added this"
  ]
}

Yields…

 http.user_agent="APN/1.0 HashiCorp/1.0 Terraform/1.15.0-alpha20260204 (+https://www.terraform.io) 
 terraform-provider-aws/6.32.1 (+https://registry.terraform.io/providers/hashicorp/aws) 
 aws-sdk-go-v2/1.41.1 ua/2.1 os/macos lang/go#1.25.7 md/GOOS#darwin md/GOARCH#arm64 api/sts#1.41.6 m/v,w 
 Manu added this

1. Provider Argument with Function

One thing I have realized over the years is that you learn something new every time you dive deep into something you want to write about. What was it this time? The provider function provider::aws::user_agent(). I wasn’t aware it existed.

Let’s revise our example to use the function. Keep in mind that the function requires three arguments: a product name, a product version, and a comment.

provider "aws" {
  region = "us-east-1"

  user_agent = [
    provider::aws::user_agent("example-demo", "0.0.1", "a comment"),
    "other-demo/0.0.2 (other comment)"
  ]
}

This adds example-demo/0.0.1 (a comment) other-demo/0.0.2 (other comment) to every API call’s user agent string. When you run terraform plan, you’ll see:

http.user_agent="APN/1.0 HashiCorp/1.0 Terraform/1.15.0-alpha20260204 
# removed other pieces for brevity
example-demo/0.0.1 (a comment) other-demo/0.0.2 (other comment)"

You still get the provider-injected metadata; your custom information is simply appended to the end of it.

2. Environment Variable

Let’s see how to do this with the TF_APPEND_USER_AGENT environment variable while keeping the user_agent argument in the provider.

export TF_APPEND_USER_AGENT="GitHub Actions/sha1235363 BuildID/1234"

# versions.tf
provider "aws" {
  user_agent = [
    provider::aws::user_agent("example-demo", "0.0.1", "a comment")
  ]
}

Result:

APN/1.0 HashiCorp/1.0 Terraform/1.15.0-alpha20260204  # injected by provider
example-demo/0.0.1 (a comment)  # injected by `user_agent` input
GitHub Actions/sha1235363 BuildID/1234 # injected by `TF_APPEND_USER_AGENT`

Where is this useful? As in my example, your build system can set it to add build metadata. When something changes, the build information is right there in the user agent.

3. provider_meta argument

What about modules? (WHAT ABOUT THEM?) One of the recommended practices when writing modules is to leave the provider block out. So how would you inject this metadata if it is so important?

Usage analysis of Git-sourced Terraform modules has always been a pain point for module authors trying to gauge the impact of a change. Sourcing from the Terraform registry usually gives you some information through download counts. The customers I work with usually keep private modules in some form of VCS (GitHub, GitLab, etc.). One of the messiest ways I have had to track usage is forcing module users to add a common infra_version tag and then writing it back to a repo or other destination as a JSON document. But this depends on users being compliant, which can be difficult at times.

That’s where the provider_meta argument in the terraform block comes in.

# modules/app/versions.tf
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 6.32"
    }
  }
  
  provider_meta "aws" {
    user_agent = ["acme-vpc-module/2.1.0"]
  }
}

Functions cannot be used in the terraform block, so you must format the string manually for provider_meta.
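
Since the string has to be formatted by hand here, a small helper in a release script can keep it consistent with the name/version (comment) shape shown earlier. The sketch below is my own illustration, not the provider's implementation: the function name format_user_agent is hypothetical, and skipping empty parts and rejecting names with spaces are assumptions on my part.

```python
import re

# Characters allowed in an HTTP product token (no spaces, slashes, or parentheses).
_TOKEN = re.compile(r"^[!#$%&'*+\-.^_`|~0-9A-Za-z]+$")


def format_user_agent(name: str, version: str = "", comment: str = "") -> str:
    """Build 'name/version (comment)', skipping empty parts (hypothetical helper)."""
    if not name:
        raise ValueError("name is required")
    for label, value in (("name", name), ("version", version)):
        if value and not _TOKEN.match(value):
            raise ValueError(f"{label} {value!r} is not a valid product token")
    product = f"{name}/{version}" if version else name
    return f"{product} ({comment})" if comment else product


# The value you would paste into provider_meta's user_agent list.
print(format_user_agent("acme-vpc-module", "2.1.0"))
print(format_user_agent("example-demo", "0.0.1", "a comment"))
```

Validating the name up front is a design choice: a stray space would split one product into two tokens and make the later CloudTrail parsing harder.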

Module authors can inject their own user agent information using the provider_meta block. This is particularly useful for:

Tracking adoption of internal modules across teams
Debugging issues specific to module versions
Understanding which module components are most active
Identifying which teams are using outdated module versions

Note: This assumes the module authors can gather some of this information, especially within an organization. I am not aware of a mechanism today that lets a public module author access this information.

Combining provider user_agent and provider_meta

When you use both the provider-level user_agent and module-level provider_meta, they are additive - both appear in the final user agent string.

Example configuration:

# Root module
provider "aws" {
  region = "us-east-1"
  user_agent = [
    provider::aws::user_agent("github-actions", "ww1xx2y", "development")
  ]
}

module "test" {
  source = "./modules/test-module"
}

# Module: modules/test-module/main.tf
terraform {
  provider_meta "aws" {
    user_agent = ["test-module/1.0.0"]
  }
}

Results in CloudTrail:

http.user_agent="APN/1.0 HashiCorp/1.0 Terraform/1.15.0-alpha20260204 (+https://www.terraform.io) 
terraform-provider-aws/6.32.1 (+https://registry.terraform.io/providers/hashicorp/aws) 
aws-sdk-go-v2/1.41.1 ua/2.1 os/macos lang/go#1.25.7 md/GOOS#darwin md/GOARCH#arm64 api/sts#1.41.6 m/v,w
github-actions/ww1xx2y (development) # from the user_agent
test-module/1.0.0 # from the provider_meta

The module’s provider_meta is appended to the root module’s user_agent, giving you full traceability from CI system → build version → environment → specific module making the call.

Usage patterns

CI/CD Integration

Most CI/CD systems expose environment variables you can inject. Taking GitHub Actions as an example, the variables below keep default values that are used when you deploy from your local machine. You can absolutely add any additional identifiers your team needs.

# GitHub Actions
variable "ci_system" {
  description = "CI/CD system name"
  type        = string
  default     = "local"
}

variable "build_version" {
  description = "Build or commit version"
  type        = string
  default     = "dev"
}

variable "environment" {
  description = "Environment"
  type        = string
  default     = "development"
}

provider "aws" {
  region = "us-east-1"

  user_agent = [
    provider::aws::user_agent(
      var.ci_system,
      var.build_version,
      var.environment
    )
  ]
}

In your GitHub Actions workflow, you can rely on the default values GitHub provides. Values under env: are not shell-expanded, so the commit SHA is read through the github context rather than as $GITHUB_SHA.

- name: Terraform Apply
  env:
    TF_VAR_ci_system: "github-actions"
    TF_VAR_build_version: ${{ github.sha }}
    TF_VAR_environment: "staging"
  run: terraform apply -auto-approve

Now every API call includes the exact commit and environment. When you run this, the user agent will show:

http.user_agent="... github-actions/abc123f (staging)"

For GitLab CI/CD, use the pre-defined variables to inject via those pipelines.

# .gitlab-ci.yml
terraform:apply:
  script:
    - export TF_VAR_ci_system="gitlab-ci"
    - export TF_VAR_build_version=$CI_COMMIT_SHORT_SHA
    - export TF_VAR_environment="staging"
    - terraform apply -auto-approve

For a local deployment, the variables use their default values (unless explicitly overridden), which gives us the local/dev combination.
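
If you would rather not repeat this mapping in every pipeline file, a small wrapper can pick the values from whichever CI system it finds itself in. This is only a sketch: GITHUB_ACTIONS, GITHUB_SHA, GITLAB_CI and CI_COMMIT_SHORT_SHA are the variables those systems set, while the function name and the DEPLOY_ENVIRONMENT variable are hypothetical names I made up for the example.

```python
import os
from typing import Mapping


def terraform_vars_from_ci(env: Mapping[str, str]) -> dict:
    """Map CI-provided environment variables to the TF_VAR_* values used above."""
    if env.get("GITHUB_ACTIONS") == "true":
        system, version = "github-actions", env.get("GITHUB_SHA", "dev")
    elif env.get("GITLAB_CI") == "true":
        system, version = "gitlab-ci", env.get("CI_COMMIT_SHORT_SHA", "dev")
    else:
        # Same defaults as the Terraform variables: a local run.
        system, version = "local", "dev"
    return {
        "TF_VAR_ci_system": system,
        "TF_VAR_build_version": version,
        # DEPLOY_ENVIRONMENT is a hypothetical variable your pipeline would set.
        "TF_VAR_environment": env.get("DEPLOY_ENVIRONMENT", "development"),
    }


if __name__ == "__main__":
    for key, value in terraform_vars_from_ci(os.environ).items():
        print(f"export {key}={value}")
```

Passing the environment in as an argument, instead of reading os.environ inside the function, keeps the mapping easy to test without touching the real environment.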

Querying CloudTrail

Let’s see how this ends up on the AWS account side, specifically CloudTrail.

Finding Recent Terraform Operations

# Get the most recent GetCallerIdentity event with user agent
aws cloudtrail lookup-events \
    --lookup-attributes AttributeKey=EventName,AttributeValue=GetCallerIdentity \
    --max-items 1 \
    --region us-east-1 \
    --query 'Events[0].CloudTrailEvent' | jq -r 'fromjson | .userAgent'

This returns the full user agent string:

APN/1.0 HashiCorp/1.0 Terraform/1.15.0-alpha20260204 (+https://www.terraform.io) terraform-provider-aws/6.32.1 (+https://registry.terraform.io/providers/hashicorp/aws) aws-sdk-go-v2/1.41.1 ua/2.1 os/macos lang/go#1.25.7 md/GOOS#darwin md/GOARCH#arm64 api/sts#1.41.6 m/v,w 
local/dev (development) # from our local tf vars
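
The jq step above is needed because CloudTrailEvent is itself a JSON document stored as a string inside the lookup-events response. Here is the same decoding as a minimal Python sketch, working on an already-fetched response so it needs no AWS access. The function name user_agents and the sample payload are mine, for illustration only.

```python
import json


def user_agents(lookup_response: dict) -> list:
    """Return the userAgent of each event in a lookup-events response."""
    agents = []
    for event in lookup_response.get("Events", []):
        # CloudTrailEvent is a JSON document serialized as a string.
        record = json.loads(event["CloudTrailEvent"])
        agents.append(record.get("userAgent", ""))
    return agents


# Made-up sample with the same shape as the CLI output.
sample = {
    "Events": [
        {
            "CloudTrailEvent": json.dumps(
                {"eventName": "GetCallerIdentity", "userAgent": "... m/v,w local/dev (development)"}
            )
        }
    ]
}
print(user_agents(sample)[0])
```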

Extracting Just the Custom Component

Kiro to the rescue. I wanted to see whether I could get just the custom user agent metadata I added.

# Extract only the custom user agent portion
aws cloudtrail lookup-events \
    --lookup-attributes AttributeKey=EventName,AttributeValue=GetCallerIdentity \
    --max-items 1 \
    --region us-east-1 \
    --query 'Events[0].CloudTrailEvent' | jq -r 'fromjson | .userAgent' | grep -o '[a-z-]*/[^ ]* ([^)]*)' | tail -1

Output:

local/dev (development)
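
For anything beyond a one-liner, the same extraction can be sketched in Python and split into its parts. The first pattern is the grep expression from above, unchanged; the function name and the second pattern are my own additions. Like the grep, it only finds tokens that carry a comment, so a bare test-module/1.0.0 from provider_meta would be skipped.

```python
import re
from typing import Optional

# Same pattern as the grep above: "name/version (comment)".
TOKEN_WITH_COMMENT = re.compile(r"[a-z-]*/[^ ]* \([^)]*\)")
# Splits one matched token into its three parts.
PARTS = re.compile(r"^(?P<name>[^/]+)/(?P<version>\S+) \((?P<comment>[^)]*)\)$")


def last_custom_token(user_agent: str) -> Optional[dict]:
    """Return the last 'name/version (comment)' token as a dict, or None."""
    matches = TOKEN_WITH_COMMENT.findall(user_agent)
    if not matches:
        return None
    parts = PARTS.match(matches[-1])
    return parts.groupdict() if parts else None


ua = (
    "APN/1.0 HashiCorp/1.0 Terraform/1.15.0-alpha20260204 (+https://www.terraform.io) "
    "terraform-provider-aws/6.32.1 (+https://registry.terraform.io/providers/hashicorp/aws) "
    "aws-sdk-go-v2/1.41.1 ua/2.1 os/macos lang/go#1.25.7 md/GOOS#darwin md/GOARCH#arm64 "
    "api/sts#1.41.6 m/v,w local/dev (development)"
)
print(last_custom_token(ua))
```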

Analysis options

Sample Dashboard

Getting CloudWatch to build a dashboard from this metadata was the harder part. Kiro CLI helped with the Logs Insights queries, and a sample is below. I chose to keep information such as resource deletions and IAM policy changes at the top of the dashboard, since I feel that is something you would want to be aware of.

User agent metadata dashboard

CloudWatch Dashboard JSON

{
  "widgets": [
    {
      "type": "log",
      "x": 0,
      "y": 0,
      "width": 12,
      "height": 6,
      "properties": {
        "query": "SOURCE '/aws/cloudtrail/build-tracking'\n| fields @timestamp, eventName, userAgent\n| filter userAgent like /github-actions/ or userAgent like /local/\n| parse userAgent /m\\/v,w\\s+(?<CISystem>[^\\/]+)\\/(?<CommitID>[^\\s\\)]+)/\n| filter eventName in ['DeleteRole', 'DeleteFunction', 'DeleteBucket', 'DeleteTopic', 'DeleteQueue', 'DeleteTable']\n| display @timestamp, CISystem, CommitID, eventName\n| sort @timestamp desc\n| limit 20",
        "region": "us-east-1",
        "title": "⚠️ Critical: Deletions by Build"
      }
    },
    {
      "type": "log",
      "x": 12,
      "y": 0,
      "width": 12,
      "height": 6,
      "properties": {
        "query": "SOURCE '/aws/cloudtrail/build-tracking'\n| fields @timestamp, eventName, userAgent\n| filter userAgent like /github-actions/ or userAgent like /local/\n| parse userAgent /m\\/v,w\\s+(?<CISystem>[^\\/]+)\\/(?<CommitID>[^\\s\\)]+)/\n| filter eventName like /Put.*Policy/ or eventName like /Delete.*Policy/\n| display @timestamp, CISystem, CommitID, eventName\n| sort @timestamp desc\n| limit 20",
        "region": "us-east-1",
        "title": "IAM Policy Changes by Build"
      }
    },
    {
      "type": "log",
      "x": 0,
      "y": 6,
      "width": 12,
      "height": 6,
      "properties": {
        "query": "SOURCE '/aws/cloudtrail/build-tracking'\n| fields eventName\n| filter userAgent like /github-actions/ or userAgent like /local/\n| filter eventName in ['CreateRole', 'DeleteRole', 'CreateFunction', 'DeleteFunction', 'CreateBucket', 'DeleteBucket', 'CreateTopic', 'DeleteTopic', 'PutRolePolicy', 'DeleteRolePolicy']\n| stats count() as Count by eventName\n| sort Count desc",
        "region": "us-east-1",
        "title": "Resource Operations Distribution",
        "view": "bar"
      }
    },
    {
      "type": "log",
      "x": 12,
      "y": 6,
      "width": 12,
      "height": 6,
      "properties": {
        "query": "SOURCE '/aws/cloudtrail/build-tracking'\n| fields @timestamp\n| filter userAgent like /github-actions/ or userAgent like /local/\n| filter eventName not like /Describe/ and eventName not like /List/ and eventName not like /Get/ and eventName != 'AssumeRole' and eventName != 'GetCallerIdentity'\n| parse userAgent /m\\/v,w\\s+(?<CISystem>[^\\/]+)\\/(?<CommitID>[^\\s\\)]+)/\n| stats count() as Deployments by CISystem",
        "region": "us-east-1",
        "title": "CI System Distribution",
        "view": "pie"
      }
    },
    {
      "type": "log",
      "x": 0,
      "y": 12,
      "width": 12,
      "height": 6,
      "properties": {
        "query": "SOURCE '/aws/cloudtrail/build-tracking'\n| fields @timestamp, eventName\n| filter userAgent like /github-actions/ or userAgent like /local/\n| filter eventName not like /Describe/ and eventName not like /List/ and eventName not like /Get/ and eventName != 'AssumeRole' and eventName != 'GetCallerIdentity'\n| stats count() as Events by datefloor(@timestamp, 5m)",
        "region": "us-east-1",
        "title": "Deployment Activity Timeline (Write Ops Only)",
        "view": "timeSeries"
      }
    },
    {
      "type": "log",
      "x": 12,
      "y": 12,
      "width": 12,
      "height": 6,
      "properties": {
        "query": "SOURCE '/aws/cloudtrail/build-tracking'\n| fields @timestamp, eventName, userAgent\n| filter userAgent like /github-actions/ or userAgent like /local/\n| parse userAgent /m\\/v,w\\s+(?<CISystem>[^\\/]+)\\/(?<CommitID>[^\\s\\)]+)/\n| filter eventName in ['CreateBucket', 'DeleteBucket', 'CreateRole', 'DeleteRole', 'CreateFunction', 'DeleteFunction', 'CreateTopic', 'DeleteTopic', 'CreateQueue', 'DeleteQueue', 'CreateTable', 'DeleteTable', 'PutRolePolicy', 'DeleteRolePolicy']\n| display @timestamp, CISystem, CommitID, eventName\n| sort @timestamp desc\n| limit 50",
        "region": "us-east-1",
        "title": "Resource Changes by Build (Detailed)"
      }
    },
    {
      "type": "log",
      "x": 0,
      "y": 18,
      "width": 24,
      "height": 6,
      "properties": {
        "query": "SOURCE '/aws/cloudtrail/build-tracking'\n| fields @timestamp, eventName, userAgent\n| filter userAgent like /github-actions/ or userAgent like /local/\n| filter eventName not like /Describe/ and eventName not like /List/ and eventName not like /Get/ and eventName != 'AssumeRole' and eventName != 'GetCallerIdentity'\n| parse userAgent /m\\/v,w\\s+(?<CISystem>[^\\/]+)\\/(?<CommitID>[^\\s\\)]+)/\n| stats count() as Events by CISystem, CommitID\n| sort Events desc\n| limit 30",
        "region": "us-east-1",
        "title": "Commit IDs by CI System"
      }
    }
  ]
}
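
Before trusting the parse expression in these widgets, it helps to try it against a few user agent strings offline. The sketch below uses the same pattern in Python syntax (named groups are written (?P<...>) there). The function name and the sample strings are mine. Note the pattern anchors on the literal m/v,w segment; I believe that segment can differ between calls or SDK versions, so treat the anchor as an assumption to verify against your own logs.

```python
import re
from typing import Optional

# Python spelling of the pattern used in the dashboard queries.
BUILD = re.compile(r"m/v,w\s+(?P<CISystem>[^/]+)/(?P<CommitID>[^\s)]+)")


def parse_build(user_agent: str) -> Optional[dict]:
    """Return CISystem and CommitID, or None when the anchor is not found."""
    match = BUILD.search(user_agent)
    return match.groupdict() if match else None


# Shortened, made-up user agent strings for illustration.
matching = "aws-sdk-go-v2/1.41.1 api/sts#1.41.6 m/v,w github-actions/ww1xx2y (development)"
different_anchor = "aws-sdk-go-v2/1.41.1 api/sts#1.41.6 m/E,n github-actions/ww1xx2y (development)"

print(parse_build(matching))
print(parse_build(different_anchor))
```

If the second case shows up in your data, those events silently drop out of the widgets, which is worth knowing before relying on the counts.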

CloudTrail MCP server

With the recent MCP explosion, investigating these issues with a prompt is easier as well. What is usually missing is the metadata that helps trace actions to their source. From the deployments I initiated, what if I ask about the queue I know was deleted?

Scenario 1

When was the SQS queue task-queue deleted and how/who ?
..
Running tool lookup_events with the param (from mcp server: awslabs.cloudtrail-mcp-server)
 ⋮  {
 ⋮    "attribute_key": "EventName",
 ⋮    "attribute_value": "DeleteQueue",
 ⋮    "max_results": 50,
 ⋮    "start_time": "90 days ago"
 ⋮  }
 - Completed in 0.884s

> The SQS queue task-queue was deleted on February 15, 2026 at 11:44:48 AM EST (16:44:48 UTC).

How/Who:
- **User**: github-role (via IAM role Admin)
- **Method**: Terraform destroy via AWS provider
- **Tool**: Terraform 1.15.0-alpha20260204 with AWS provider 6.32.1
- **CI System**: github-actions
- **Build Version**: qq7rr8s
- **Additional metadata**: development

Maybe we should put more information into the third argument (the comment) to make it more useful for investigations.

Scenario 2

Identify the number of times and versions of "test-module" is used in resource management using CloudTrail. You can check GetCallerIdentity for example.

> CloudTrail Lake hasn't ingested the recent events yet. Let me use lookup_events which queries the last 90
days directly:
Running tool lookup_events with the param (from mcp server: awslabs.cloudtrail-mcp-server)
 ⋮  {
 ⋮    "attribute_key": "EventName",
 ⋮    "attribute_value": "GetCallerIdentity",
 ⋮    "max_results": 50,
 ⋮    "region": "us-east-1",
 ⋮    "start_time": "1 day ago"
 ⋮  }
 - Completed in 1.574s

> Perfect! Now let me extract and count the test-module versions from the user agents:

Version breakdown:
  test-module/1.0.0: 4 calls
  test-module/1.0.1: 2 calls
  test-module/1.1.0: 1 calls
 - Completed in 1.272s

> ## Summary

3 versions of "test-module" are being used in resource management:

### Version Usage:
1. test-module/1.0.0 - 4 API calls (most used)
2. test-module/1.0.1 - 2 API calls
3. test-module/1.1.0 - 1 API call

### Timeline:
- **15:52:29** - test-module/1.0.0 (deployment-v1)
- **15:52:47** - test-module/1.0.1 (deployment-v2, 2 calls)
- **15:53:04** - test-module/1.1.0 (deployment-v3)

Additional earlier usage of test-module/1.0.0 was found at 15:42:50, 15:37:49, and 15:29:35, indicating the
module was tested multiple times before the final deployments.

I will have to revisit this with more solid examples of module usage to refine my own prompt. But extending the dashboard to surface the provider_meta data shouldn’t be hard.
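
The counting the assistant did in Scenario 2 is simple enough to do yourself once you have the user agent strings. Here is a rough sketch under the assumption that the module token looks like test-module/1.0.0, as in the provider_meta example. The function name and the sample strings are made up for illustration.

```python
import re
from collections import Counter
from typing import Iterable


def module_versions(user_agents: Iterable[str], module_name: str) -> Counter:
    """Count how often each 'module_name/version' token appears."""
    # (?<!\S) makes sure the name starts a token, not the tail of another one.
    pattern = re.compile(rf"(?<!\S){re.escape(module_name)}/(\S+)")
    counts = Counter()
    for user_agent in user_agents:
        match = pattern.search(user_agent)
        if match:
            counts[f"{module_name}/{match.group(1)}"] += 1
    return counts


# Made-up sample data, not taken from a real trail.
agents = [
    "... m/v,w github-actions/aa1 (development) test-module/1.0.0",
    "... m/v,w github-actions/bb2 (development) test-module/1.0.0",
    "... m/v,w github-actions/cc3 (development) test-module/1.0.1",
    "... m/v,w github-actions/dd4 (development) my-test-module/9.9.9",
]
for token, calls in module_versions(agents, "test-module").most_common():
    print(token, calls)
```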

Hypothetical: incident analysis

Here’s how this could help during an incident:

Scenario: An S3 bucket’s public access settings were unexpectedly changed at 3:47 PM.

Without User Agent Context:

CloudTrail shows: PutBucketPublicAccessBlock called by terraform-assumed-role

Back to the drawing board: you look through the application repos, find the deployment times (if deployed via CI/CD), and try to correlate.

With User Agent Context:

CloudTrail shows: 
User-Agent: ... github-actions/22030165088 (37e04a2-deploy-production)

Now you can:

Go directly to GitHub Actions run #22030165088
See the exact commit (37e04a2)
Review the PR that introduced the change
Identify the folks involved in the authoring and approval stage to understand the intent of the change.
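
The first two steps can even be automated. The sketch below turns the user agent token from this hypothetical incident into links, assuming the token follows the run-id (sha-label) layout shown above. The function name and the acme/infra repository are placeholders I invented.

```python
import re

# Matches "github-actions/<run id> (<short or full sha>-<label>)".
TOKEN = re.compile(
    r"github-actions/(?P<run_id>\d+) \((?P<sha>[0-9a-f]{7,40})-(?P<label>[^)]*)\)"
)


def incident_links(user_agent: str, repo: str) -> dict:
    """Build GitHub links for the run and commit found in a user agent."""
    match = TOKEN.search(user_agent)
    if match is None:
        return {}
    base = f"https://github.com/{repo}"
    return {
        "run": f"{base}/actions/runs/{match['run_id']}",
        "commit": f"{base}/commit/{match['sha']}",
        "label": match["label"],
    }


# "acme/infra" is a placeholder repository name.
links = incident_links("... github-actions/22030165088 (37e04a2-deploy-production)", "acme/infra")
for name, value in links.items():
    print(name, value)
```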

Why Not Tags?

You might be wondering: “Can’t I just use AWS tags to track this information?” I feel the two complement each other: the user agent answers deployment provenance, while tags answer resource classification.

Thinking out loud..

Not all AWS resources support tags
Failed API calls still carry the user agent, so you can trace an operation that never succeeded but still caused issues.
Delete a resource and its metadata is almost gone (you can probably recover some details using, say, AWS Config), but CloudTrail retains the user agent data.
Tags can be modified or removed after deployment, breaking the audit trail.

Conclusion

The AWS provider’s custom user agent mechanism adds another layer of observability to infrastructure lifecycle management. By embedding deployment context (CI/CD run IDs, commit SHAs, and any other metadata you need), you create a direct link between infrastructure changes and the code that caused them.