Synchronizing Terraform State Files to External Storage from TFC/TFE
As organizations scale their infrastructure with Terraform, managing state becomes increasingly critical. While HCP Terraform (previously known as Terraform Cloud/TFC) and Terraform Enterprise (TFE) provide robust state management capabilities, some scenarios may require maintaining additional copies of state files in external storage systems. Today, I’ll walk through a couple of approaches for synchronizing your Terraform state from TFC/TFE to Amazon S3 or similar storage services, should your organization require it.
Why Maintain External Copies of Terraform State?
Before diving into implementation, let’s consider why you might need this capability:
- Disaster recovery: Maintaining additional backups beyond TFC/TFE’s built-in mechanisms
- Compliance requirements: Meeting regulatory needs for data retention or storage location
- Resiliency management: Reviewing resiliency details with AWS Resilience Hub (We will look at this in a later post.)
- Integration with custom tools: Supporting internal systems that require access to state data
Implementation Approaches
Let’s explore two primary approaches to accomplish this state synchronization.
Approach 1: CLI-Based Workflow
If you’re utilizing a CLI-based workflow with TFC/TFE, the process is relatively straightforward:
- After executing your Terraform operations, run terraform state pull to retrieve the current state
- Save this output to a file
- Upload the file to your external storage (e.g., Amazon S3)
Here’s a simple bash script demonstrating this approach:
```bash
#!/bin/bash

# Pull current state
terraform state pull > current_state.tfstate

# Upload to S3
aws s3 cp current_state.tfstate s3://your-bucket-name/path/to/state/current_state.tfstate

# Cleanup
rm current_state.tfstate
```
This approach is simple and can be integrated into your existing CLI-based HCP Terraform workflow, as sketched below.
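For instance, a minimal wrapper could chain the apply and the sync so the external copy never lags behind the real state. This is a sketch, assuming the AWS CLI is already configured; the bucket name and key prefix are placeholders, and the timestamped key is an optional variation that keeps point-in-time copies instead of overwriting a single object:

```bash
#!/bin/bash
# Minimal sketch: run the usual apply, then sync a timestamped copy of the
# fresh state. Assumes the AWS CLI is configured; the bucket is a placeholder.
set -euo pipefail

terraform apply -auto-approve

# Pull the state TFC/TFE now holds
terraform state pull > current_state.tfstate

# Timestamped key keeps point-in-time copies instead of overwriting one object
aws s3 cp current_state.tfstate \
  "s3://your-bucket-name/state-backups/$(date -u +%Y%m%dT%H%M%SZ).tfstate"

rm current_state.tfstate
```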
Approach 2: API-Based with CI/CD Integration
For VCS-driven workflows, accessing the Terraform API directly provides greater flexibility. This method can be implemented in any CI/CD system with access to your TFC/TFE environment.
An example implementation using GitHub Actions is laid out here.
The implementation follows these key steps:
- Authenticate with the Terraform Cloud/Enterprise API
- Retrieve the workspace details to identify the current state version
- Download the state file
- Upload it to Amazon S3
Here’s a simplified version of the bash implementation:
```bash
#!/bin/bash

# Set variables
TF_WORKSPACE="your-workspace-name"
TF_ORG="your-organization"
S3_BUCKET="your-s3-bucket"
TFE_TOKEN="your-tfe-token"

# Get workspace ID
WORKSPACE_ID=$(curl -s \
  --header "Authorization: Bearer $TFE_TOKEN" \
  --header "Content-Type: application/vnd.api+json" \
  "https://app.terraform.io/api/v2/organizations/${TF_ORG}/workspaces/${TF_WORKSPACE}" \
  | jq -r '.data.id')

# Get current state version and state URL
STATE_VERSION_ID=$(curl -s \
  --header "Authorization: Bearer $TFE_TOKEN" \
  --header "Content-Type: application/vnd.api+json" \
  "https://app.terraform.io/api/v2/workspaces/${WORKSPACE_ID}/current-state-version" \
  | jq -r '.data.id')

STATE_URL=$(curl -s \
  --header "Authorization: Bearer $TFE_TOKEN" \
  --header "Content-Type: application/vnd.api+json" \
  "https://app.terraform.io/api/v2/state-versions/${STATE_VERSION_ID}" \
  | jq -r '.data.attributes."hosted-state-download-url"')

# Download state file
# -L is required: the state URL captured above gets redirected
curl -sL \
  --header "Authorization: Bearer $TFE_TOKEN" \
  --header "Content-Type: application/vnd.api+json" \
  "$STATE_URL" \
  --output terraform.tfstate

# Upload to S3
aws s3 cp terraform.tfstate s3://${S3_BUCKET}/terraform.tfstate
```
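Because this script overwrites the same terraform.tfstate object on every run, you may want to enable versioning on the bucket so earlier state revisions remain recoverable. A one-off setup command (bucket name is a placeholder) looks like this:

```bash
# Enable S3 versioning so each synced revision of terraform.tfstate is retained
aws s3api put-bucket-versioning \
  --bucket your-s3-bucket \
  --versioning-configuration Status=Enabled
```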
Setting Up in GitHub Actions
The repository provides a complete GitHub Actions workflow for this task. Here’s how you might set it up:
- Store your Terraform Cloud/Enterprise API token and AWS credentials as GitHub secrets
- Create a workflow file (e.g., .github/workflows/sync-state.yml)
- Configure the workflow to run on your preferred schedule or trigger
```yaml
name: Sync Terraform State to S3

on:
  schedule:
    - cron: '0 0 * * *' # Daily at midnight
  workflow_dispatch: # Allow manual triggering

permissions:
  id-token: write # Required for OIDC role assumption by configure-aws-credentials
  contents: read

jobs:
  sync-state:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.IAM_ROLE }} # role allowed to be assumed by GitHub
          aws-region: us-east-1

      - name: Sync state to S3
        env:
          TFE_TOKEN: ${{ secrets.TF_TOKEN }}
          TF_ORG: "your-org-name"
          TF_WORKSPACE: "your-workspace"
          S3_BUCKET: "your-bucket-name"
        run: |
          # Script to sync state (as shown above)
```
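In practice you would keep the full sync script from the previous section in the repository (for example, at a hypothetical path such as scripts/sync-state.sh) and have the run step execute it with bash scripts/sync-state.sh, so the workflow and the script stay versioned together.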
The only requirements are:
- Access to the Terraform Cloud/Enterprise API
- Proper authentication credentials for both TFC/TFE and your storage provider (a quick token check is sketched below)
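Before wiring the token into CI, a quick smoke test against the account details endpoint confirms that authentication works; jq is used here only to print the username:

```bash
# Verify the token can reach the TFC/TFE API before adding it to CI secrets
curl -s \
  --header "Authorization: Bearer $TFE_TOKEN" \
  --header "Content-Type: application/vnd.api+json" \
  "https://app.terraform.io/api/v2/account/details" \
  | jq -r '.data.attributes.username'
```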
Best Practices
When implementing this solution, consider these best practices:
- Security: Store API keys and credentials securely (e.g., in CI/CD secrets)
- Access Control: Restrict access to both TFC/TFE APIs and external storage (see the least-privilege sketch after this list)
- Scheduling: Determine appropriate synchronization frequency based on change cadence
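As one illustration of the access-control point, a least-privilege policy for the sync role could allow only object writes under the state prefix. The role name, bucket, and prefix below are hypothetical placeholders:

```bash
# Hypothetical least-privilege inline policy: the sync role may only write
# objects under the state prefix of the backup bucket.
cat > sync-state-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::your-bucket-name/path/to/state/*"
    }
  ]
}
EOF

aws iam put-role-policy \
  --role-name github-state-sync-role \
  --policy-name sync-state-s3 \
  --policy-document file://sync-state-policy.json
```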
Conclusion
While Terraform Cloud and Enterprise provide robust state management capabilities, there are legitimate scenarios where maintaining external copies of your state files could be required. The approaches outlined in this post offer flexible solutions whether you’re using CLI-based workflows or VCS-driven automation.
Note: Always ensure your state file handling complies with your organization’s security policies, as state files may contain sensitive information.