Terraform snapshot testing: A practical approach to unit testing

Achieving faster feedback cycles for your Terraform engineers and reducing development expenditure without sacrificing control.

Oct 29, 2025

“The aesthetics of a product can never hope to make up for poor-quality technology.”
— Bruno Sacco

When Bruno Sacco led Mercedes-Benz design from 1975 to 1999, he didn’t just create beautiful cars. He created cars built to last 20 to 30 years. The W123, W126, W124 weren’t just elegant - they were engineered with a rigor that made them legendary for reliability. Many are still on roads today, decades later, clocking mile after mile, day in, day out.

Sacco understood something fundamental about building things that endure: design and engineering must go hand in hand. As he put it, “The aesthetics of a product can never hope to make up for poor-quality technology.” You don’t mask engineering shortcuts with styling. You validate the engineering first, then express that quality through design.

Your Terraform code, powering your business-critical infrastructure, deserves the same respect. Yet most teams today deploy infrastructure to test infrastructure. They spin up resources, wait for feedback, and hope nothing breaks, while accumulating dependencies on cloud APIs and shared state files, along with longer deployment times, higher spend, and bottlenecks. Every successful deployment increases confidence while hiding fundamental fragility: tight coupling, snowflake-like infrastructure modules / stacks, and a lack of real modularity.

**Bruno Sacco** - “It sometimes happens that a project is elected on the basis of a precise, planned obsolescence of the product to ensure future demand for new models,” Sacco said, though he rejected this approach for Mercedes.

Main Insights

Traditional Terraform testing is fragile and difficult: Most teams deploy infrastructure to test infrastructure, which creates dependencies on cloud APIs, shared state files, deployment time, and budget;
Snapshot testing follows the test pyramid: Unit tests should be fast, numerous, and cheap. Snapshot testing treats Terraform plans as unit test artefacts - you generate snapshots of planned changes and commit them to version control. Tests run in seconds using only read-only credentials, catching configuration errors before expensive deployments;
Fast feedback before deploying test infrastructure: Tests complete in seconds with no actual resources created, no state files to manage, and no deployment waiting. This enables parallel development without shared state file contention, catches problems early when they’re cheapest to fix, and makes unintended changes visible immediately in code review as snapshot diffs;
Snapshot testing has clear limitations: It cannot test composed infrastructure (cross-stack dependencies need integration testing through deployment), is limited to plan validation (doesn’t verify infrastructure actually works), requires disciplined module design, and needs snapshot maintenance. This is one layer in a comprehensive testing strategy - you still need deployment to non-production environments for integration testing and production-like testing for performance and reliability.

The problem with traditional Terraform testing

When you develop Terraform modules, you face a dilemma. The most reliable way to test your code is to run terraform apply and see what actually gets created. But this approach has practical problems that slow teams down.

Reading HashiCorp’s own documentation on Terraform testing, you can infer that traditional integration tests, which deploy real resources, can be time-consuming and expensive to develop. This may force your (most likely) operations team to create and maintain highly detailed tests, or worse, to write them in Go, all while struggling to keep the systems and the business running. Google Cloud’s best practices guide acknowledges that “running a Terraform test creates, modifies, and destroys real infrastructure, so your tests can potentially be time-consuming and expensive.”

For Terraform specifically, developers wait minutes or hours for feedback on their changes. Testing blocks access to shared state files, preventing parallel development. Teams spend significant time and money deploying test infrastructure just to validate code correctness.

There are several mechanisms available for testing Terraform. Let’s look into some of the best known and most recent ones.

Terratest

Terratest is a Go library developed by Gruntwork that automates infrastructure testing by deploying real resources. Tests are written in Go using the standard Go testing package. A typical Terratest workflow executes terraform init and terraform apply to deploy actual infrastructure to cloud environments, validates that the deployed resources work correctly through HTTP requests, API calls, or SSH connections, and then runs terraform destroy to clean up.

The library provides helper functions for common infrastructure testing tasks across Terraform, Kubernetes, Docker, Packer, and major cloud providers. Tests run using the go test command and can be integrated into CI/CD pipelines like any other Go test suite.

The main advantage is comprehensive end-to-end validation: Terratest deploys real infrastructure and verifies it actually works, catching provider-specific issues, API behaviors, and integration problems that plan-only testing misses. The framework is mature with extensive examples, documentation, and a large community.

However, tests are expensive and slow, often taking 30+ minutes to complete because they create actual cloud resources, incurring infrastructure costs with each test run. Teams need Go expertise to write and maintain tests, which adds a learning curve if developers aren’t already familiar with the language. The approach also requires managing test resource cleanup, handling cloud provider rate limits, and dealing with test flakiness from network issues or cloud service disruptions. For enterprises, this means Terratest is valuable for critical integration testing but impractical for rapid feedback during development.

Terraform Test Framework

The Terraform Test Framework is HashiCorp’s native testing capability introduced in Terraform 1.6, allowing tests to be written in HCL using .tftest.hcl files. Tests consist of run blocks that execute terraform plan or terraform apply commands against the configuration, with assert blocks whose conditions must evaluate to true for the test to pass.

By default, tests create ephemeral infrastructure in temporary state files separate from production state, preventing interference with existing resources. Tests are stored in a tests/ directory within the module and executed with the terraform test command. The framework supports both unit testing (plan-only validation) and integration testing (actual deployment), and includes features like helper modules for test setup, mock providers for simulating resources without cloud costs, and the ability to reference outputs from previous run blocks.

The framework’s primary advantage is native integration: tests use the same HCL language developers already know, requiring no additional language skills, and the terraform test command is built into Terraform itself with no extra dependencies. Tests are straightforward to write and maintain within the existing Terraform workflow, and the framework provides safety through isolated ephemeral state.

However, the framework is relatively new (introduced in 2023) with limited community examples and best practices compared to mature alternatives. Like Terratest, integration tests still deploy real infrastructure with associated time and cost penalties, though unit tests using plan-only mode or mock providers can provide faster feedback.

The framework is less flexible than code-based testing tools - you’re constrained by HCL’s capabilities and the prescribed testing workflow. Additionally, maintaining extensive test suites in HCL can become cumbersome as complexity grows, since HCL offers only restricted forms of the programming constructs (loops, functions, conditionals) and abstractions available in general-purpose languages, making it harder to reduce duplication and manage test data across large module collections.

A fast approach: Snapshot Testing

Following the Test Pyramid as described by Martin Fowler, unit tests in application land should be fast, numerous, and cheap to run. In infrastructure land, however, traditional Terraform testing doesn’t easily allow for this kind of rapid iteration.

Enter snapshot testing: treating Terraform plans as unit test artifacts. You create test stacks that exercise your modules with different configurations, generate snapshots of the planned changes, and commit those snapshots to version control. On every commit, the test suite verifies that your code still produces the expected plan.

This catches problems early. Configuration errors, unintended resource changes, and broken module logic surface immediately in code review as snapshot diffs. You see what changed before any deployment happens.
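To make the idea concrete, here is a minimal sketch of what comparing a fresh plan against a committed snapshot amounts to. It is an illustration, not the library's implementation: the function name `diff_values` and the sample data are hypothetical, and the dictionaries only loosely mimic planned values.

```python
from typing import Any


def diff_values(expected: Any, actual: Any, path: str = "") -> list[str]:
    """Return human-readable differences between two nested JSON-like values."""
    if isinstance(expected, dict) and isinstance(actual, dict):
        diffs: list[str] = []
        for key in sorted(set(expected) | set(actual)):
            child = f"{path}.{key}" if path else str(key)
            if key not in actual:
                diffs.append(f"removed: {child}")
            elif key not in expected:
                diffs.append(f"added: {child}")
            else:
                diffs.extend(diff_values(expected[key], actual[key], child))
        return diffs
    if expected != actual:
        return [f"changed: {path}: {expected!r} -> {actual!r}"]
    return []


# Hypothetical committed snapshot and freshly planned values.
snapshot = {"aws_instance.web": {"instance_type": "t3.micro", "tags": {"env": "test"}}}
planned = {"aws_instance.web": {"instance_type": "t3.large", "tags": {"env": "test"}}}

for line in diff_values(snapshot, planned):
    print(line)
```

An empty result means the plan still matches the snapshot. Anything else is a change that a reviewer should either accept, by updating the snapshot, or reject.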

The approach doesn’t replace deployment testing: you still need to deploy to non-production environments and validate that infrastructure actually works. But snapshot testing happens first, catching obvious problems when they’re cheapest to fix, and confirming that the planned configuration matches what you intended.

**Snapshot as a form of Unit Testing:** Test cheaply before deploying non-production infrastructure

Here’s how we use Terraform / OpenTofu snapshot testing in the context of complex shared infrastructure:

The maintainer of the module / stack makes changes to the code, updates test cases, regenerates the snapshot to reflect these new changes, and verifies that the synthesis and planned values are correct;
When committing and pushing, the snapshot test runs in the pipeline to verify that the module builds and generates the desired configuration, and compares it with the snapshots;
Deploy to non-prod environments to test operation and integration at scale;
Deploy to production environment to operate and scale the business.

In the same context of complex shared infrastructure, we favour a GitLab Flow approach for the reasons explained in “Simplifying Infrastructure - part 3”. The inclusion of snapshot testing should be done at the commit or push event, before applying the changes in the development environment.

Here’s the archetype of the GitLab Flow approach applied to Terraform / OpenTofu complex shared infrastructure, as described in the same blog post.

**GitLab Flow:** Managing shared infrastructure deployments while minimising risk.

How it works in practice

We produced a library to perform Terraform Snapshot Testing, using standard Python testing tools. It is available on PyPI, and the source code and examples are on GitHub - terraform-snapshot-test. Tests run in seconds using only read-only provider credentials. No state files, no deployed resources, no waiting.

Here’s the development and test cycle:

You work on your Terraform module as usual. In a tests/ folder, you create test stacks that instantiate the module with different configurations;
When you’re ready, you generate snapshots by running pytest -m terraform --snapshot-update -s. This initializes Terraform, uses read-only credentials to create a plan, and persists both the synthesis and planned values as JSON snapshots;
You commit these snapshots to version control. They serve as the expected output for your module;
Expectations can be specified in YAML files that are easy to write and maintain, one per stack / module being tested;
For the advanced user, it is possible to go directly into the Python layer to extend and write further test cases;
On every subsequent commit, running pytest compares the current plan against the committed snapshots. Any differences fail the test.
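The plan-generation step in this cycle can be sketched with the Terraform CLI alone. The snippet below is an assumption about how such a step could look, not the library's code: the function names and the `tfplan` file name are hypothetical, and it assumes the test stack has no remote backend configured. It relies only on `terraform init`, `terraform plan -out`, and `terraform show -json`.

```python
import json
import subprocess
from pathlib import Path


def plan_commands(plan_file: str = "tfplan") -> list[list[str]]:
    """Terraform CLI calls that produce a plan as JSON without applying anything."""
    return [
        ["terraform", "init", "-input=false"],
        ["terraform", "plan", "-input=false", "-lock=false", f"-out={plan_file}"],
        ["terraform", "show", "-json", plan_file],
    ]


def planned_values_from(plan_json: str) -> dict:
    """Extract the planned_values section from `terraform show -json` output."""
    return json.loads(plan_json).get("planned_values", {})


def generate_planned_values(stack_dir: Path) -> dict:
    """Run the commands inside a test stack directory and return its planned values."""
    output = ""
    for command in plan_commands():
        result = subprocess.run(
            command, cwd=stack_dir, check=True, capture_output=True, text=True
        )
        output = result.stdout  # the last command prints the plan as JSON
    return planned_values_from(output)
```

Calling `generate_planned_values(Path("tests/my_stack"))` (a hypothetical path) needs Terraform installed and read-only provider credentials in the environment. Nothing is applied, so no resources are created.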

The library works with AWS, GitLab, GitHub, and other providers. The only requirement is read-only API credentials to allow Terraform to query the provider during planning.

Advantages of Terraform snapshot testing

Here are some of the advantages of introducing this practice into your infrastructure development process, before attempting to deploy into the test environments.

Fast feedback cycles: Tests complete in seconds, not minutes or hours. Developers get immediate feedback on code changes without waiting for infrastructure deployment.

No infrastructure costs: The approach uses read-only credentials for planning only. No actual resources are created during testing. No state files to manage or clean up.

Unintended changes become visible: Snapshot diffs make all infrastructure modifications explicit. Unexpected changes surface immediately in code review. This guards against regressions when refactoring modules.

Parallel development: Multiple team members can test simultaneously without shared state file contention or blocking on deployment environments.

Better module design: The approach forces you to write truly modular, reusable code. Modules must accept configuration through variables and receive their dependencies as inputs rather than coupling tightly to remote state.

Limitations of Terraform snapshot testing

Snapshot testing works well for isolated modules but has limitations.

It cannot test composed infrastructure: Testing multiple stacks that reference each other’s outputs is difficult. Cross-stack dependencies require “remote state” referencing, which this approach bypasses. Complex multi-stack compositions still need integration testing via deployment;
It’s limited to plan validation: Snapshots only validate what Terraform intends to create, not whether infrastructure actually works as expected. The approach cannot catch provider-specific issues, API behaviour, or runtime behaviour;
Snapshots require maintenance: They must be updated when intentional changes occur. Reviewing snapshot diffs requires understanding Terraform plan JSON. Provider version updates can cause false positives from irrelevant changes.
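One way to reduce this maintenance noise is to strip fields you consider irrelevant before comparing. The sketch below is an assumption, not something the library is known to do: `strip_ignored` and the choice of ignored keys are hypothetical, and the sample plans are trimmed-down stand-ins for real plan JSON.

```python
from typing import Any

# Keys treated as noise in this sketch. The selection is an assumption:
# pick the fields that change in your snapshots without a module change.
IGNORED_KEYS = frozenset({"terraform_version", "timestamp"})


def strip_ignored(value: Any, ignored: frozenset = IGNORED_KEYS) -> Any:
    """Return a copy of a JSON-like value with ignored keys removed at any depth."""
    if isinstance(value, dict):
        return {
            key: strip_ignored(child, ignored)
            for key, child in value.items()
            if key not in ignored
        }
    if isinstance(value, list):
        return [strip_ignored(item, ignored) for item in value]
    return value


old_plan = {"terraform_version": "1.6.0", "planned_values": {"root_module": {"resources": []}}}
new_plan = {"terraform_version": "1.7.0", "planned_values": {"root_module": {"resources": []}}}

# After stripping, a tool upgrade alone no longer shows up as a difference.
assert strip_ignored(old_plan) == strip_ignored(new_plan)
```

The trade-off is that a genuine change to an ignored field goes unnoticed, so the ignore list is best kept short and reviewed like any other test code.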

This is one layer in a comprehensive testing strategy, not the entire strategy. You still need deployment to non-production environments for integration testing, manual verification of deployed infrastructure, and production-like testing for performance and reliability.

Working backwards from the outcome

Testing exists to give you confidence that your code does what you intend. Traditional Terraform testing achieves this through deployment, but deployment is expensive and slow. The question isn’t whether to test, but where in the development cycle to catch problems.

Snapshot testing moves problem detection earlier, where fixes are cheaper. It won’t catch everything - no single testing approach does. But it catches enough to be useful, and it does so quickly enough to fit into a tight development loop.

The real value is in the feedback cycle - write code, run tests, see what changed, adjust. Do this ten times in the time it would take to deploy once. Catch configuration errors before they reach non-production environments. Find broken logic before it enters code review.

This aligns with what we know about effective testing strategies - successful teams use multiple testing layers. Unit tests that run in under a second provide the fastest feedback. Snapshot testing fits this pattern.

Building with battle-tested components

The library uses standard Python testing tools. If your team already uses pytest, there’s no new testing framework to learn. Tests are defined in familiar Python. Snapshots use Syrupy, a mature snapshot testing library from the Python ecosystem.

This matters because tools that fit existing workflows are more likely to be adopted. You don’t need to convince your team to learn Terratest or Go. You don’t need to set up special CI/CD infrastructure. If you can run pytest, you can run these tests.

Getting started

The library is available on PyPI, and the source code and examples are on GitHub.

Setup involves creating a pytest.ini file with environment variables for your use case, creating test stacks in a tests/ folder, and overriding your provider configuration to prevent interaction with state backends. Examples for AWS, GitLab, and GitHub are included in the repository.

If you want to extend the use of expectations, you can create an assertions folder in the tests folder of the Terraform stack, with one YAML file per stack which you want to test. In the expectations files you can write the objects and configuration which you want to ensure will exist in the synthesis and planned_values snapshots.
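Conceptually, an expectation is a partial structure that must be found inside the snapshot. The sketch below illustrates that idea only. It does not reproduce the library's YAML format or matching rules: `contains` is a hypothetical helper, and the `expectation` dictionary stands in for an already-parsed YAML file.

```python
from typing import Any


def contains(actual: Any, expected: Any) -> bool:
    """True when everything in `expected` is also present in `actual`."""
    if isinstance(expected, dict):
        return isinstance(actual, dict) and all(
            key in actual and contains(actual[key], value)
            for key, value in expected.items()
        )
    if isinstance(expected, list):
        # Each expected item must match at least one actual item.
        return isinstance(actual, list) and all(
            any(contains(item, wanted) for item in actual) for wanted in expected
        )
    return actual == expected


# Hypothetical planned values for a stack with a single S3 bucket.
planned_values = {"root_module": {"resources": [{
    "address": "aws_s3_bucket.logs",
    "type": "aws_s3_bucket",
    "values": {"bucket": "acme-logs", "force_destroy": False},
}]}}

# Only the attributes we care about are listed; everything else is ignored.
expectation = {"root_module": {"resources": [
    {"type": "aws_s3_bucket", "values": {"bucket": "acme-logs"}},
]}}

assert contains(planned_values, expectation)
```

Partial matching keeps expectations short and stable: they name the handful of attributes that matter, while the full snapshot still records everything else.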

From there, it’s a standard pytest workflow. Run tests locally during development. Run them in CI/CD on every commit. Review snapshot diffs in pull requests alongside code changes.

Final thoughts

Most infrastructure testing advice tells you to test by deploying. This makes sense for integration and end-to-end testing. But for unit testing modules and catching basic errors, deployment is overkill.

Snapshot testing fills the gap between static analysis and integration testing. It’s faster than deploying but more thorough than linting. It catches problems that static analysis misses without the cost and complexity of spinning up infrastructure.

If your team writes Terraform modules, if you spend time waiting for test deployments, or if you want faster feedback on infrastructure changes, snapshot testing is worth considering. It won’t solve every testing problem. But for the problems it does solve, it solves them quickly and cheaply.

Use snapshot testing for fast feedback on module logic. Then use deployment testing to verify that things actually work. Only then can you confidently deploy into production, having already gained faster delivery times at a lower testing and validation cost.

I invite you to subscribe to my blog, and to read a few of my favourite case-studies describing how some of my clients achieved success in their high-stakes technology projects, using the very same approach described.

Have a great day!

João


Beyond the Blueprint
