HashiConf '24: GitHub Actions workflows
HashiConf ‘25 is just a month away. I will be speaking at HashiConf this year on the Cloud engineering track. Session details are below.
Making AI work for you: Terraform engineer blueprint
AI coding assistants for HashiCorp Terraform often give inconsistent results. This session shows proven techniques for reliable infrastructure code using AI. Learn practical strategies: repository context, reusable prompts, MCP servers, and validation workflows. See live demos with Amazon Q Developer covering provider contributions, module building, and troubleshooting. Get actionable templates and skills that work across any AI assistant.
From HashiConf '24, I thought I would bring back some of my hallway track slides on “GitHub based Terraform workflow deployments”, most of which came out of work I did with customers. These are sample workflow patterns which you can use or tweak for your organization's needs. I am using GitHub Actions as the CI system because it is easy for anyone reading to reproduce in a personal repo hosted on GitHub.
Components
- GitHub Actions workflows
- HCP Terraform workspaces
- Terraform configuration in a GitHub repo
- AWS credentials, made available either to GitHub Actions via OIDC or through dynamic credentials which HCP Terraform retrieves from AWS using OIDC. Either option is preferable to static credentials.
- HCP Terraform workspace set up for your infra stack.
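For the dynamic credentials option, the AWS side can be sketched roughly as below. This is a minimal sketch, not the setup used with any particular customer: the organization name my-org, the workspace name my-infra-stack and the role name are hypothetical placeholders, and the role still needs permission policies attached for whatever your stack provisions.

```hcl
# Fetch the TLS certificate of HCP Terraform to derive the OIDC thumbprint.
data "tls_certificate" "hcp_terraform" {
  url = "https://app.terraform.io"
}

# Register HCP Terraform as an OIDC identity provider in the AWS account.
resource "aws_iam_openid_connect_provider" "hcp_terraform" {
  url             = data.tls_certificate.hcp_terraform.url
  client_id_list  = ["aws.workload.identity"]
  thumbprint_list = [data.tls_certificate.hcp_terraform.certificates[0].sha1_fingerprint]
}

# Trust policy: only runs from the named workspace may assume the role.
data "aws_iam_policy_document" "trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [aws_iam_openid_connect_provider.hcp_terraform.arn]
    }

    condition {
      test     = "StringEquals"
      variable = "app.terraform.io:aud"
      values   = ["aws.workload.identity"]
    }

    condition {
      test     = "StringLike"
      variable = "app.terraform.io:sub"
      # Hypothetical organization and workspace names; replace with your own.
      values = ["organization:my-org:project:*:workspace:my-infra-stack:run_phase:*"]
    }
  }
}

# Role that HCP Terraform runs assume; attach permission policies separately.
resource "aws_iam_role" "hcp_terraform" {
  name               = "hcp-terraform-run-role" # hypothetical name
  assume_role_policy = data.aws_iam_policy_document.trust.json
}
```

On the workspace, the environment variables TFC_AWS_PROVIDER_AUTH (set to true) and TFC_AWS_RUN_ROLE_ARN (set to the role ARN) switch the runs over to these short-lived credentials.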
Workflow scenarios
Let’s set the baseline
GitHub Actions lets you trigger workflows on a number of conditions. A few are named below.
- Commit to a branch
- On demand: select the workflow to trigger from the Actions tab of the repo.
- On a schedule
- The top portion of the image shows a commit triggering the workflow, which in this case provisions your infrastructure to AWS via HCP Terraform.
- The bottom portion shows an “on demand” workflow with manual triggers a user can invoke. As the workflow author, you can offer users choices for the target environment, the type of run (plan only or apply) and anything else that makes sense to you. I use the environment and type inputs to let users run a speculative plan against higher environments like pre-prod or prod once a change in main is deployed to dev. The plan then gives the user a good idea of what to expect, based on the incoming configuration and what is currently provisioned in the target account.
Custom & Cleanup workflows
As I work with customers, the types of workflows and deployment patterns they ask for differ. I have customers who prefer to have PRs deploy to a single dev/sandbox account, followed by a main branch based reconciliation. Others have the luxury of provisioning multiple instances of a workload into a single account or multiple accounts based on the PR source branch. In those latter scenarios, you could use prefixes and tags to keep resources distinct for each branch being provisioned to an account.
- State management and proper naming standards become crucial in this case. Keep in mind that there are elevated costs associated with this flow, since each PR provisions its own instance of the resources. I usually recommend a stale PR workflow which removes the infrastructure based on some external trigger, or a manual cleanup workflow in case a PR is stuck or blocked.
- Clean up workflows: Essentially a mechanism allowing the users or application owners in a repo to run a terraform destroy on their infrastructure in a certain account. This could be for multiple reasons: a move to a new account, an app being decommissioned and so on.
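The per-branch prefixes and tags could look something like the sketch below. It assumes the workflow passes the PR source branch in as a variable; branch_name, the myapp prefix, the region and the SQS queue are all hypothetical and only there to illustrate the naming.

```hcl
# Hypothetical input: the PR source branch, supplied by the CI workflow.
variable "branch_name" {
  type = string
}

locals {
  # Lowercase the branch, replace runs of other characters with "-",
  # and cap the length so resource names stay within service limits.
  branch_slug = substr(replace(lower(var.branch_name), "/[^a-z0-9]+/", "-"), 0, 20)
  name_prefix = "myapp-${local.branch_slug}"
}

provider "aws" {
  region = "us-east-1" # assumption; use your own region

  # Tag every resource with its branch so stale stacks are easy to find.
  default_tags {
    tags = {
      Branch    = var.branch_name
      ManagedBy = "terraform"
    }
  }
}

# Example resource whose name is unique per branch.
resource "aws_sqs_queue" "events" {
  name = "${local.name_prefix}-events"
}
```

Each branch also needs its own state, for example a separate HCP Terraform workspace per branch, otherwise one PR's run would overwrite another's resources. The Branch tag gives a cleanup workflow something to filter on.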
Feature flag workflow
Feature flags are not a new thing in software delivery lifecycles. They help you differentiate what is deployed from what is released. This was my proposal to a team who didn’t have a dedicated feature flagging mechanism like LaunchDarkly. With infrastructure, it is slightly more difficult to pull off too.
- Situation: Teams following trunk based development with a feature which was merged in, but is not ready to be consumed by users.
- Proposal: Use a combination of GitHub environment variables along with the count meta argument in Terraform.
- Count is set on infrastructure configurations which you are not yet ready to provision to higher environments. The count is derived from an environment value set to true or false, which conditionally provisions those infrastructure components.
- I am taking Jira story id as the name of the environment variable as it is easier to trace back to a feature that way.
- Use an env var of the same name across all your target environments, say dev/preprod/prod, with only dev set to true.
- When the Terraform workflow runs, the specific component is deployed in dev and not deployed in higher environments.
The picture shows:
- A complete feature flag workflow using story ID JIRA-1234
- How environment variables in GitHub (FF_JIRA_1234) map to Terraform variables and infrastructure configuration
- Flow: Developer commits → GitHub triggers deployment workflows → Terraform provisions AWS RDS cluster with feature flag controls
- The infrastructure code using count = tobool(var.FF_JIRA_1234) ? 1 : 0 to conditionally create resources
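The Terraform side of the flag can be sketched as follows. The variable name and the count expression come from the slide; the cluster arguments are illustrative assumptions, and how the GitHub environment variable reaches Terraform (a workspace variable, a tfvars file written by the workflow, and so on) depends on your setup and is left out here.

```hcl
# Feature flag for story JIRA-1234. Typed as string because values
# arriving from environment variables are strings ("true" / "false").
variable "FF_JIRA_1234" {
  description = "Set to true only in environments where the feature is released"
  type        = string
  default     = "false"
}

# The cluster is created only where the flag is true.
# All arguments below are illustrative assumptions, not from the slide.
resource "aws_rds_cluster" "feature" {
  count = tobool(var.FF_JIRA_1234) ? 1 : 0

  cluster_identifier          = "jira-1234-cluster"
  engine                      = "aurora-postgresql"
  master_username             = "dbadmin"
  manage_master_user_password = true
  skip_final_snapshot         = true
}
```

With FF_JIRA_1234 set to true only in dev, a plan against preprod or prod shows no cluster to add.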
Feature flag hygiene
What follows a feature flag based implementation is a mechanism to remove it. If I am adding feature flags, I always raise a PR to clean them up as part of the same story, and that PR may stay open on the repo until cleanup time. The hygiene steps depend on what your team decides good looks like. Let's say that after an arbitrary month of reviewing the infra component in production, you decide that it is going to stay as it is, which removes the need for the feature flag.
The picture:
- Shows the cleanup process after feature development is complete
- Developer removes environment variables and Terraform variables from the story
- Creates a PR to merge changes to default branch
- Highlights that references to infrastructure components under a feature flag should use try functions to avoid deployment issues
- Shows “no infra changes” when the feature flag is removed
Keep in mind that I am leaving the count argument set to 1 going forward, as changing it would otherwise create drift.
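The try function mentioned on the slide matters because a resource with count = 0 has no instance at index 0, so a direct reference to it fails. A minimal sketch, using the built-in terraform_data resource (Terraform 1.4+) as a stand-in for the real component so that it runs without cloud credentials; the endpoint value is made up:

```hcl
variable "FF_JIRA_1234" {
  type    = string
  default = "false"
}

# Stand-in for the flagged component (hypothetical value).
resource "terraform_data" "feature" {
  count = tobool(var.FF_JIRA_1234) ? 1 : 0
  input = "feature-endpoint.example.internal"
}

# try() falls back to null when the instance does not exist,
# instead of failing the plan with an invalid index error.
output "feature_endpoint" {
  value = try(terraform_data.feature[0].output, null)
}
```

one(terraform_data.feature[*].output) is an alternative that also returns null when there are no instances.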
Deployment to Regulated environments
Situation:
The team needed to provision Kafka topics, service accounts and access controls in a private Confluent Kafka cluster for their streaming workloads, but faced networking restrictions that prevented using GitHub Actions or Terraform workflows. The existing solution was manual and required maintaining complex firewall rules with the networking team, plus IP-based access controls on the Confluent accounts, to allow the Terraform Enterprise host to deploy from 200+ customer repositories. The same situation can arise when provisioning workloads into on-prem or regulated environments which your Terraform Cloud hosts cannot reach directly.
The picture shows two workflows:
- Set up a Terraform Cloud agent which becomes a conduit to the regulated environment. In this case the agents were provisioned on a Fargate cluster residing in the same VPC, in an account which is privately linked to the Confluent cluster. On the HCP Terraform or TFE host, you create an agent pool and an API token for it, which the agent uses to communicate with the host. The execution settings on a Terraform workspace can then be set to run on the agent rather than on the default host which HCP Terraform provides.
- The user workflow proceeds the same as always. The Terraform Enterprise (or HCP Terraform) host offloads the actual execution of the Terraform workflow to the agent running on the Fargate cluster, which is able to provision the resources on the privately linked cluster.
- Reference implementation : https://github.com/aws-ia/terraform-aws-tf-cloud-agents
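The agent pool, token and workspace execution settings described above can themselves be managed with the tfe provider. A minimal sketch, assuming the workspace already exists and the provider authenticates through the TFE_TOKEN environment variable; the pool name and the variables are hypothetical, and the Fargate service itself is covered by the reference implementation rather than shown here.

```hcl
terraform {
  required_providers {
    tfe = {
      source = "hashicorp/tfe"
    }
  }
}

variable "organization" {
  type = string
}

variable "workspace_name" {
  type = string
}

# Pool that the agents on the Fargate cluster register into.
resource "tfe_agent_pool" "private" {
  name         = "confluent-private-pool" # hypothetical name
  organization = var.organization
}

# Token the agent presents when it connects to the host.
resource "tfe_agent_token" "fargate" {
  agent_pool_id = tfe_agent_pool.private.id
  description   = "Token for agents running on Fargate"
}

# Look up the existing workspace and point its runs at the pool.
data "tfe_workspace" "stack" {
  name         = var.workspace_name
  organization = var.organization
}

resource "tfe_workspace_settings" "stack" {
  workspace_id   = data.tfe_workspace.stack.id
  execution_mode = "agent"
  agent_pool_id  = tfe_agent_pool.private.id
}

output "agent_token" {
  value     = tfe_agent_token.fargate.token
  sensitive = true
}
```

The agent container reads the token from the TFC_AGENT_TOKEN environment variable, so in practice you would store the output in a secret store and inject it into the task definition instead of printing it.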
Conclusion
The workflows I have described are examples I have come across or used. They may or may not work without changes in your environment. Are there any other use cases you have come across which deviate from the standard GitHub Actions based workflows for Terraform?