Picking the right AWS Core Network service in 3 steps

Jul 16, 2025

“If it's a two way door decision, you pick a door, you walk out, you spend a little time there, it turns out to be the wrong decision. You can come back in and pick another door. Some decisions are so consequential and so important and so hard to reverse that they really are one way door decisions. You go in that door, you're not coming back. And those decisions have to be made very deliberately, very carefully.”
— Jeff Bezos

Building an AWS Core Network is like constructing a highway system for a city that doesn't exist yet. You're making infrastructure decisions today that will either enable or constrain your organisation's digital future.

This blog post will help you avoid costly architectural mistakes by providing a proven set of principles for selecting the right AWS core networking service based on your organisation's scale, operational maturity, and cost tolerance, potentially saving you from expensive migration projects and operational headaches down the road.

Key Takeaways

This is a strategic business decision, not just a technical choice - Your networking architecture will either accelerate or constrain growth for 3-5 years;
Plan for failure scenarios - Have clear migration paths and warning signs for when your chosen approach isn't working;
Align with business strategy - Choose based on your growth plans, not current state;
Investment in expertise pays off - The right skills can make complex architectures simple, while lacking skills makes simple architectures complex;
Choose the simplest architecture that won't constrain your business in 3-5 years.

Do I need a Core Network?

A Cloud Core Network is the foundational networking infrastructure that connects all your cloud resources across regions and accounts through a centralised, automated routing system.

You most likely need a Core Network if you check any two of these use-cases:

Multiple AWS accounts (and if you only have one, we need to talk about your blast radius strategy);
Cross-region operations (because data sovereignty is real, unlike your “cloud-first” strategy from 2019);
Hybrid connectivity requirements (your on-premises datacentre isn't going anywhere, despite what the CTO promised);
Complex routing and application needs (when "just use PrivateLink" stops being a viable solution).

Let’s take a quick tour of the menu of fundamental AWS services for building a core network:

VPC Peering;
Transit Gateways;
Cloud WAN.

VPC “Spaghetti” Peering

VPC peering appears simple and elegant when you're connecting four or five VPCs. The concept is straightforward: create direct connections between VPCs, enabling resources to communicate directly. For small deployments, this approach offers immediate benefits - low latency, predictable costs, and minimal configuration overhead.

However, the mathematics of peering becomes unforgiving as you scale. The number of peering connections in a full mesh grows quadratically, not linearly (see picture below): n VPCs need n × (n - 1) / 2 connections. With five VPCs, you need ten peering connections. With ten VPCs, you need forty-five connections. By the time you reach fifteen VPCs, you're managing 105 individual peering relationships.

**Spaghetti Networks:** Scaling a full mesh network takes a mental effort similar to counting strands of spaghetti during a meal in a colourful and loud Italian restaurant.
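The connection counts quoted above follow from the standard full-mesh formula, where every VPC pairs with every other VPC exactly once. A minimal sketch (the function name `peering_connections` is illustrative, not part of any AWS tooling):

```python
def peering_connections(vpc_count: int) -> int:
    """Number of peering connections needed to fully mesh `vpc_count` VPCs."""
    if vpc_count < 0:
        raise ValueError("vpc_count must be non-negative")
    # Every VPC pairs with every other VPC exactly once: n * (n - 1) / 2.
    return vpc_count * (vpc_count - 1) // 2


for n in (5, 10, 15):
    print(f"{n} VPCs -> {peering_connections(n)} peering connections")
```

Doubling the number of VPCs roughly quadruples the number of connections, which is why the approach stops being manageable well before the counts look large.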

Each peering connection requires manual configuration and ongoing maintenance as your network grows. Route tables become increasingly complex, making troubleshooting difficult and changes risky. The cost structure of VPC peering also contains surprises. While there are no charges for the peering connections themselves, data transfer costs can accumulate quickly, especially for cross-region communication.

Transit Gateways - Fundamental Core Routing

Transit Gateway emerged as AWS's answer to VPC peering's scalability limitations. Instead of creating point-to-point connections, Transit Gateways provide a regional central router through which all VPCs can communicate. This model dramatically simplifies the connectivity matrix by deploying a hub-and-spoke topology (picture below) in the form of a central virtual router.

Transit Gateways support advanced routing policies, enabling sophisticated traffic segmentation and security controls. You can create isolated routing domains within the same infrastructure, allowing different business units or environments to share networking resources while maintaining separation.

**Hub and Spoke Networks:** Scaling and control become an easier exercise through increased centralisation.

However, Transit Gateways introduce their own complexities. Managing multiple Transit Gateways across regions requires careful coordination of routing policies and connection configurations. The full-mesh connectivity between Transit Gateways, while powerful, becomes increasingly difficult to manage as the number of regions grows beyond four. While Terraform and CloudFormation can manage individual components effectively, coordinating routing configurations across multiple regions and accounts requires sophisticated state management and dependency handling.

Cloud WAN - Global Abstracted Core Routing

Cloud WAN represents AWS's latest evolution in core networking architecture, providing a global overlay that abstracts much of the complexity inherent in multi-region Transit Gateway deployments. Rather than managing individual Transit Gateways and their interconnections, Cloud WAN presents a unified global network that spans regions and accounts.

The abstraction benefits are substantial. Cloud WAN automatically handles the routing between regions, eliminating the need to manually configure and maintain Transit Gateway peering relationships. This reduces both the initial configuration complexity and the ongoing operational burden of managing global networks.

**“Fabric” of Hub and Spoke Networks:** Interconnected region-based hub and spoke networks delivering a de-facto fabric (in reality it is an AWS-managed full mesh of Transit Gateways).

At the heart of Cloud WAN is the Network Policy, which serves as the central configuration document that defines how your global network operates. This policy creates logical network segments, ensuring that applications in one segment cannot inadvertently communicate with resources in another segment unless explicitly permitted. The policy also controls network steering and association logic, determining which VPCs automatically join specific segments based on criteria like account tags, region, or resource characteristics.
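To make the idea of a policy document concrete, here is a minimal sketch of what such a policy could look like, built as a Python dictionary and serialised to JSON. The segment names, ASN range, edge locations, and the `segment` tag key are all assumptions chosen for illustration, and the field names follow the Cloud WAN policy JSON format as I understand it, so validate them against the AWS documentation before use.

```python
import json

# Illustrative policy only: segment names, ASN range, regions, and the
# "segment" tag key are assumptions, not values from a real deployment.
network_policy = {
    "version": "2021.12",
    "core-network-configuration": {
        "asn-ranges": ["64512-64555"],
        "edge-locations": [{"location": "eu-west-1"}, {"location": "us-east-1"}],
    },
    # Segments are isolated routing domains; nothing crosses between them
    # unless the policy explicitly shares routes.
    "segments": [
        {"name": "production", "require-attachment-acceptance": True},
        {"name": "development", "require-attachment-acceptance": False},
    ],
    # Attachment policies map attachments to segments based on their tags.
    "attachment-policies": [
        {
            "rule-number": 100,
            "condition-logic": "or",
            "conditions": [
                {"type": "tag-value", "operator": "equals",
                 "key": "segment", "value": "production"}
            ],
            "action": {"association-method": "constant", "segment": "production"},
        },
        {
            "rule-number": 200,
            "condition-logic": "or",
            "conditions": [
                {"type": "tag-value", "operator": "equals",
                 "key": "segment", "value": "development"}
            ],
            "action": {"association-method": "constant", "segment": "development"},
        },
    ],
}

print(json.dumps(network_policy, indent=2))
```

Keeping the policy as data in version control means a segmentation change becomes a reviewable diff rather than a series of console clicks.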

The operational challenges of Cloud WAN become more manageable with purpose-built automation that addresses real-world complexities. One practical solution that emerged from field experience involves automating VPC admission to Cloud WAN networks through intelligent event processing. The diagram below shows the solution's event-based processing, which happens in the control plane, and how this simplifies the user experience.

**AWS Cloud WAN Attachment Manager:** Simplified event-based architecture which looks at account tagging and IP addressing before allowing new VPCs into the network, with the event-based processing happening far away from the user.
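The admission check described above can be sketched in a few lines. This is not the Attachment Manager's actual implementation: the function `admit_vpc`, its parameters, and the `segment` tag key are hypothetical names used only to illustrate the two checks the caption mentions (account tagging and IP addressing).

```python
import ipaddress


def admit_vpc(account_tags, vpc_cidr, admitted_cidrs, allowed_segments, tag_key="segment"):
    """Return (admitted, reason) for a VPC asking to join the core network.

    Hypothetical sketch: checks the account tag first, then CIDR overlap.
    """
    segment = account_tags.get(tag_key)
    if segment not in allowed_segments:
        return False, f"account tag '{tag_key}' is missing or not an allowed segment"

    candidate = ipaddress.ip_network(vpc_cidr)
    for cidr in admitted_cidrs:
        # Overlapping address space would create ambiguous routes in the core.
        if candidate.overlaps(ipaddress.ip_network(cidr)):
            return False, f"{vpc_cidr} overlaps with already admitted {cidr}"

    return True, f"admit to segment '{segment}'"


ok, reason = admit_vpc(
    account_tags={"segment": "production"},
    vpc_cidr="10.1.0.0/16",
    admitted_cidrs=["10.0.0.0/16"],
    allowed_segments={"production", "development"},
)
print(ok, reason)
```

In an event-driven setup, a function like this would run when a new attachment request arrives, so that the requester never has to interact with the network team directly.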

Another aspect to take into account relates to the Service Insertion feature for Cloud WAN, which was announced in June 2024. This feature is of critical importance because it fixes an early service fragility related to asymmetric traffic, which required a flow to be inspected twice by East-West network firewalls (in the source and destination regions), and that inspection represents the lion’s share of the core network cost. During early validation and testing of the feature for one of my largest clients (6x AWS Regions, approx. 600 Attachments), we saw a lack of network summarisation capabilities, and its stateless service steering nature requires global firewall policies.

However, do keep Service Insertion on your radar! As this feature matures and improves, I believe it will be a massive tailwind for better core network economics without sacrificing scalability, global reach, segmentation, or security.

Cost Functions of AWS Core Routing

Understanding the cost implications of different networking approaches requires examining both direct infrastructure costs and operational expenses. The economics vary significantly based on scale, usage patterns, and organisational maturity.

VPC peering offers the cheapest cost structure for small deployments. You pay only for data transfer across AZs and Regions, with no additional infrastructure charges. However, the operational costs grow much faster than the number of VPCs, as the management overhead of maintaining numerous peering relationships becomes prohibitive.

Transit Gateways introduce a different cost model. You pay hourly charges for each connected attachment (e.g. VPCs, VPNs, TGW Peering, Direct Connect). The data processing charges add another cost dimension, particularly for high-volume applications. While the infrastructure costs are higher than VPC peering for small deployments, the operational efficiencies can justify the expense at moderate scale.

Cloud WAN's cost structure is the most complex, with charges for the network edge locations, number of attachments, and data processing. Given that Cloud WAN abstracts, manages, and orchestrates a full mesh of Transit Gateways across the requested edge locations, it is no surprise that a premium would be demanded by AWS, and tolerated by clients, for the use of this service.

But how much is this premium? How does the cost function of a full-mesh Transit Gateway network managed by the client compare with the cost function of using Cloud WAN? To answer these questions, I present a simple model that looks into both Transit Gateway and Cloud WAN pricing for the Ireland Region (eu-west-1), using the parameters below.

We also assume that 15% of the total traffic is inter-region, which incurs a charge of $0.02 per GB of traffic.
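One way to express a model of this kind is a fixed part (hourly attachment and edge location charges) plus a variable part (per-GB charges). The sketch below is my own reconstruction under stated assumptions, not the exact model behind the charts: the `Pricing` fields and the `daily_cost` function are illustrative names, the only figures taken from the text are the 15% inter-region share and the $0.02 per GB inter-region charge, and the prices in the usage example are placeholders rather than AWS list prices.

```python
from dataclasses import dataclass

INTER_REGION_SHARE = 0.15    # from the text: 15% of traffic is inter-region
INTER_REGION_PER_GB = 0.02   # from the text: $0.02 per inter-region GB
HOURS_PER_DAY = 24


@dataclass(frozen=True)
class Pricing:
    attachment_hourly: float   # $ per attachment per hour
    processing_per_gb: float   # $ per GB processed
    edge_hourly: float = 0.0   # $ per edge location per hour (assumed 0 for Transit Gateway)


def daily_cost(p: Pricing, attachments: int, regions: int, traffic_gb_per_day: float) -> float:
    """Daily cost = fixed hourly charges + per-GB charges (simplified model)."""
    fixed = HOURS_PER_DAY * (attachments * p.attachment_hourly + regions * p.edge_hourly)
    variable = traffic_gb_per_day * (
        p.processing_per_gb + INTER_REGION_SHARE * INTER_REGION_PER_GB
    )
    return fixed + variable


# Placeholder prices for illustration only; substitute the current AWS price list.
example = Pricing(attachment_hourly=0.10, processing_per_gb=0.01)
print(daily_cost(example, attachments=10, regions=2, traffic_gb_per_day=1_000))
```

The model deliberately ignores details such as Transit Gateway peering attachments between regions, so treat it as a starting point for your own spreadsheet rather than a billing estimate.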

Impact of Traffic Volume in Network Costs

Let’s start by looking at Traffic Volume as a variable, with 3 regions and 250 attachments. This scenario starts with a total daily volume of 2 PB, followed by 10 steps of 25% traffic volume increase.

**Daily cost as a function of Traffic Volume**

For these conditions, you can see the cost functions of the two services behave as two parallel lines, with the slope being defined as the price of transported GB. The premium demanded by Cloud WAN is explained by the higher attachment hourly fee and by the hourly edge region rate. In relative terms, the premium starts to decay as we increase the traffic volume and “eat” the attachment and regional price difference.
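The decay follows directly from the shape of the two cost functions: when both services share the same per-GB slope, the absolute gap stays constant while the total cost grows. A minimal sketch, using placeholder fixed costs and a placeholder per-GB price (not real AWS figures), and assuming the 25% steps compound:

```python
def relative_premium(fixed_a: float, fixed_b: float, per_gb: float, traffic_gb: float) -> float:
    """Premium of service B over service A, as a fraction of A's total cost."""
    cost_a = fixed_a + per_gb * traffic_gb
    cost_b = fixed_b + per_gb * traffic_gb
    return (cost_b - cost_a) / cost_a


# Placeholder values: daily fixed costs of 100 and 130, and 0.02 per GB.
volumes = [1_000 * 1.25 ** step for step in range(11)]  # start + 10 steps of +25%
premiums = [relative_premium(100.0, 130.0, 0.02, v) for v in volumes]

for volume, premium in zip(volumes, premiums):
    print(f"{volume:>10.0f} GB/day -> premium {premium:.1%}")
```

The printed premium shrinks at every step even though the absolute difference between the two services never changes.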

Impact of Traffic Volume and Attachments in Network Costs

For this scenario, we keep the same Traffic Volume steps and number of Regions as in the previous scenario, and start scaling up the number of attachments. We start with 250 attachments, followed by 10 steps of 15% increase in attachments.

**Daily cost as a function of the number of Attachments and Traffic Volume**

For these conditions, you can see the cost function of Cloud WAN starts to accelerate as we increase the number of attachments, with the slope being driven by the hourly attachment price (which is 30% higher than Transit Gateway's). The relative premium still decays as we increase the traffic volume and “eat” the attachment and regional price difference, but at a slower speed (there is a higher pull explained by the 30% higher attachment price).

Impact of Traffic Volume, Attachments, and Regions in Network Costs

For this scenario, we keep the same Traffic Volume steps, and we start increasing both the number of Regions and the number of Attachments. We start with 250 attachments and 3 regions, followed by 10 steps, each adding 1 region and increasing attachments by 33.33% (the initial proportional growth of attachments to regions).

**Daily cost as a function of the number of Regions, Attachments, and Traffic Volume**

For these conditions, you can see the cost function of Cloud WAN starts to accelerate as we increase the number of attachments and regions, with the slope being driven by the hourly attachment price (which is 30% higher than Transit Gateway's) and the hourly edge region rate. The relative premium still decays as we increase the traffic volume and “eat” the attachment and regional price difference, but at an even slower speed (there is a higher pull explained by the 30% higher attachment price and the increase in the number of active regions).

Operational Complexity

The hidden costs in all approaches relate to complexity management. Complex networks require more skilled personnel, more sophisticated monitoring tools, and more time for change management. These operational costs often exceed the direct infrastructure expenses, making architectural decisions that reduce complexity valuable even when they increase infrastructure costs.

This dimension is quite difficult to model, given that every team is different, and use-cases across different industries also tend to have different segmentation and traffic inspection requirements.

Making the Decision

Here are three steps to take when picking your AWS Core Network Service:

Assess your scale dimension: Count your current and projected VPCs, regions, and accounts over the next 3-5 years - if you have fewer than 5 VPCs in a single region, VPC peering suffices; 5-50 VPCs across 1-3 regions favour Transit Gateways; more than 50 VPCs across 3+ regions justify Cloud WAN.
Evaluate complexity and fragility tolerance: Determine your team's operational maturity and appetite for managing network complexity - choose VPC peering if you prefer simple, manual management; Transit Gateways if you can invest in Infrastructure as Code practices; Cloud WAN if you have plans to scale beyond 3 AWS Regions or if you don’t want to manage the network attachment process.
Calculate total cost of ownership: Beyond infrastructure costs, factor in operational expenses, team training, and the premium you're willing to pay for reduced complexity - remember that Cloud WAN typically costs 12-34% more than equivalent Transit Gateway implementations, but this premium decreases as traffic volume increases and may be justified by operational savings.
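The scale thresholds from the first step can be written down as a small helper. This is only a sketch of that single step: the function name `suggest_service` is hypothetical, the thresholds are the ones stated above, and combinations the thresholds do not cover are deliberately reported as unclear rather than guessed, since steps two and three should decide those cases.

```python
def suggest_service(vpcs: int, regions: int) -> str:
    """Map projected VPC and region counts to the scale guidance in step one."""
    if vpcs < 5 and regions == 1:
        return "VPC peering"
    if 5 <= vpcs <= 50 and 1 <= regions <= 3:
        return "Transit Gateway"
    if vpcs > 50 and regions >= 3:
        return "Cloud WAN"
    # Not covered by the stated thresholds: weigh operational maturity and TCO.
    return "no clear fit"


for vpcs, regions in [(4, 1), (20, 2), (120, 5), (60, 1)]:
    print(f"{vpcs} VPCs across {regions} region(s): {suggest_service(vpcs, regions)}")
```

Feed it your projected numbers for the next 3-5 years rather than today's, since the point of the exercise is to avoid outgrowing the choice.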

Red Flags to Avoid:

Choosing Cloud WAN without DevOps and networking expertise;
Implementing Transit Gateway without Infrastructure-as-Code;
Selecting VPC Peering for multi-region businesses;
Making networking decisions without input from compliance/security teams.

I invite you to subscribe to my blog, and to read a few of my favourite case-studies describing how some of my clients achieved success in their high-stakes technology projects, using the very same approach described.

Have a great day!

João


Beyond the Blueprint
