If you are running applications across multiple regions and need to route users to the closest backend server, Cloudflare Load Balancers with geo-steering is one of the best solutions available. In this guide, I will walk you through how to set up Cloudflare load balancers using Terraform, configure geographic traffic routing, implement health checks for automatic failover, and package everything into a reusable Terraform module.
This is based on a real-world production setup routing API traffic across EU and US regions.
Table of Contents
- Table of Contents
- What is a Cloudflare Load Balancer?
- What Problem Does It Solve?
- Key Concepts
- Prerequisites
- Architecture Overview
- Project Structure
- Step 1: Configure the Terraform Provider
- Step 2: Define Root Variables
- Step 3: Build the Reusable Terraform Module
- Step 4: Configure Health Check Monitors
- Step 5: Create Origin Pools
- Step 6: Create the Load Balancer with Geo-Steering
- Step 7: Define Module Outputs
- Step 8: Instantiate the Module for Multiple Environments
- Step 9: Deploy the Infrastructure
- Understanding Geo-Steering in Depth
- Understanding Steering Policies
- Session Affinity Explained
- Health Checks and Automatic Failover
- Best Practices
- 1. Always Use Proxied Mode
- 2. Set Appropriate Health Check Intervals
- 3. Use ip_cookie Session Affinity
- 4. Define Country Pools for Important Markets
- 5. Set the Fallback Pool to Your Most Reliable Region
- 6. Use Terraform Modules for Consistency
- 7. Store State Remotely
- 8. Use Separate Health Monitors Per Region
- Troubleshooting Common Issues
- Conclusion
What is a Cloudflare Load Balancer?
A Cloudflare Load Balancer is a DNS-based traffic distribution service that sits in front of your origin servers. Unlike traditional load balancers that operate at the infrastructure level (like AWS ALB or Nginx), Cloudflare’s load balancer operates at the DNS and Anycast network level, making routing decisions before traffic even reaches your infrastructure.
When a user makes a request to your domain, Cloudflare’s global network (spanning 300+ data centers) intercepts the request and intelligently routes it to the most appropriate backend server based on rules you define — such as geographic proximity, server health, or latency.
Key characteristics of Cloudflare Load Balancers:
- DNS-level routing: Routing decisions happen at the edge, before traffic reaches your servers
- Global Anycast network: Leverages Cloudflare’s 300+ PoPs (Points of Presence) worldwide
- Built-in health monitoring: Automatically detects unhealthy origins and reroutes traffic
- Multiple steering policies: Supports geographic, latency-based, random, and proximity-based routing
- Proxy integration: Works seamlessly with Cloudflare’s CDN, WAF, and DDoS protection
What Problem Does It Solve?
Consider this scenario: You have an API serving users globally. Your servers are deployed in two regions — EU (Ireland, eu-west-1) and US (Oregon, us-west-2). Without a geographic load balancer:
- High latency for distant users: A user in Kenya hitting a US-based server experiences 200-400ms of additional latency per request
- No automatic failover: If your EU server goes down, EU users get errors until someone manually switches DNS
- Uneven load distribution: All traffic hits a single region, leaving the other underutilized
- Poor user experience: Slow API responses lead to degraded app performance and user frustration
Cloudflare Load Balancers solve all of these problems by:
- Routing users to the nearest healthy region — reducing latency
- Automatically failing over when a region becomes unhealthy — improving reliability
- Distributing traffic geographically — balancing server load
- Maintaining session affinity — ensuring consistent user experiences
Key Concepts
Before diving into the implementation, let us understand the key components:
Origins
An origin is a backend server that handles requests. This could be an IP address, a hostname pointing to an AWS ELB, an EKS ingress endpoint, or any server that can respond to HTTP/HTTPS requests.
Pools
A pool is a group of one or more origins, typically in the same geographic region. For example, you might have an “EU Pool” containing your European servers and a “US Pool” containing your American servers. Pools are monitored for health and traffic is routed to healthy pools.
Monitors (Health Checks)
A monitor defines how Cloudflare checks the health of origins within a pool. Monitors send periodic HTTP/HTTPS requests to a specified endpoint (like /health) and mark origins as healthy or unhealthy based on the response.
Steering Policies
A steering policy determines how Cloudflare routes traffic across pools. Options include:
| Policy | Description |
|---|---|
| `geo` | Routes based on user’s geographic location (region/country) |
| `dynamic_latency` | Routes based on observed round-trip time |
| `proximity_strict` | Routes to the geographically closest pool |
| `random` | Distributes traffic randomly across pools |
| `off` | Uses only the default pool order (first healthy pool wins) |
Session Affinity
Session affinity ensures that requests from the same user are consistently routed to the same origin server. This is critical for applications that maintain server-side state.
Prerequisites
Before you begin, ensure you have:
- Terraform >= 1.0 installed (install guide)
- A Cloudflare account with a domain configured
- A Cloudflare API token with the following permissions:
- Zone > Load Balancers > Edit
- Zone > DNS > Edit
- Account > Load Balancing: Monitors and Pools > Edit
- Origin servers deployed in at least two regions with a health check endpoint (e.g., `/health`)
- Your Cloudflare Zone ID and Account ID (found in the Cloudflare dashboard under your domain’s overview page)
Creating a Cloudflare API Token
- Go to the Cloudflare dashboard
- Navigate to My Profile > API Tokens
- Click Create Token
- Use the Custom Token template
- Add the permissions listed above
- Restrict to your specific zone for security
Architecture Overview
Here is what we are building:
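A simplified sketch of the topology (pool names and regions are illustrative, based on the setup described in this guide):

```
                User request: api-lb.example.com
                              |
              Cloudflare Anycast edge (300+ PoPs)
                              |
         Load balancer (geo-steering + health monitors)
                 /                          \
        EU Pool (WEU/EEU)            US Pool (WNAM/ENAM)
        origin: eu-west-1            origin: us-west-2
```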
Traffic flow:
- A user makes a request to `api-lb.example.com`
- Cloudflare’s Anycast network receives the request at the nearest PoP
- The load balancer determines the user’s geographic location
- Geo-steering routes the request to the appropriate regional pool
- Health monitors ensure only healthy origins receive traffic
- Session affinity keeps the user on the same origin for subsequent requests
Project Structure
We will organize our Terraform code using modules for reusability:
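The layout might look like this (file names follow the ones referenced throughout the guide):

```
.
├── provider.tf
├── variables.tf
├── main.tf
├── outputs.tf
├── terraform.tfvars        # gitignored
└── modules/
    └── cloudflare-geo-load-balancer/
        ├── variables.tf
        ├── main.tf
        └── outputs.tf
```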
This modular approach allows you to reuse the same load balancer configuration across multiple environments (staging, preprod, production) and services.
Step 1: Configure the Terraform Provider
First, set up the Terraform provider configuration. Create provider.tf:
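A minimal `provider.tf` sketch for the 4.x provider (the version pin and variable name are illustrative):

```hcl
terraform {
  required_version = ">= 1.0"

  required_providers {
    cloudflare = {
      source  = "cloudflare/cloudflare"
      version = "~> 4.0"
    }
  }
}

provider "cloudflare" {
  # Scoped API token, passed in as a sensitive variable
  api_token = var.cloudflare_api_token
}
```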
We are using the Cloudflare Terraform provider version 4.x. The api_token method is the recommended authentication approach — it is more secure than using a global API key because tokens can be scoped to specific permissions.
Tip: For production deployments, always use a remote backend (like S3 + DynamoDB) to store your Terraform state. This enables team collaboration and state locking to prevent concurrent modifications.
Step 2: Define Root Variables
Create variables.tf at the root level for shared configuration:
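A sketch of the root variables, using the names referenced throughout this guide:

```hcl
variable "cloudflare_api_token" {
  description = "Cloudflare API token with Load Balancing permissions"
  type        = string
  sensitive   = true
}

variable "cloudflare_zone_id" {
  description = "Zone ID of the domain (from the dashboard overview page)"
  type        = string
}

variable "cloudflare_account_id" {
  description = "Cloudflare account ID"
  type        = string
}

variable "domain" {
  description = "Base domain, e.g. example.com"
  type        = string
}
```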
The sensitive = true flag on the API token ensures Terraform will never display this value in plan output or logs.
Store these values in a terraform.tfvars file (which should be in your .gitignore):
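A `terraform.tfvars` sketch with placeholder values:

```hcl
cloudflare_api_token  = "your-api-token"
cloudflare_zone_id    = "your-zone-id"
cloudflare_account_id = "your-account-id"
domain                = "example.com"
```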
Alternatively, you can use environment variables:
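Terraform reads any environment variable prefixed with `TF_VAR_` as a variable value, so the same inputs can be supplied without a tfvars file (values below are placeholders):

```shell
export TF_VAR_cloudflare_api_token="your-api-token"
export TF_VAR_cloudflare_zone_id="your-zone-id"
export TF_VAR_cloudflare_account_id="your-account-id"
```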
Step 3: Build the Reusable Terraform Module
Now let us build the core module. Create the module variables in modules/cloudflare-geo-load-balancer/variables.tf:
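A sketch of the module variables, including the `validation` block discussed below (variable names are illustrative; adjust the allowed steering values to whatever your provider version supports):

```hcl
variable "name" {
  description = "DNS name of the load balancer, e.g. api-lb.example.com"
  type        = string
}

variable "zone_id" {
  type = string
}

variable "account_id" {
  type = string
}

variable "eu_origin_address" {
  description = "Hostname or IP of the EU origin"
  type        = string
}

variable "us_origin_address" {
  description = "Hostname or IP of the US origin"
  type        = string
}

variable "origin_host_header" {
  description = "Host header sent to origins and health checks"
  type        = string
}

variable "steering_policy" {
  type    = string
  default = "geo"

  validation {
    condition     = contains(["off", "geo", "random", "dynamic_latency", "proximity_strict"], var.steering_policy)
    error_message = "steering_policy must be one of: off, geo, random, dynamic_latency, proximity_strict."
  }
}

variable "proxied" {
  type    = bool
  default = true
}

variable "ttl" {
  description = "DNS TTL in seconds; ignored when proxied"
  type        = number
  default     = 30
}

variable "session_affinity" {
  type    = string
  default = "ip_cookie"
}
```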
Notice the validation block on steering_policy — this prevents invalid values from being passed, catching configuration errors early during terraform plan rather than at apply time.
Step 4: Configure Health Check Monitors
Health checks are the foundation of reliable load balancing. They continuously probe your origins to ensure traffic is only sent to healthy servers.
Add the following to modules/cloudflare-geo-load-balancer/main.tf:
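A sketch of the EU monitor, using the options explained below (the US monitor is identical apart from its description and `Host` header; variable names are illustrative):

```hcl
resource "cloudflare_load_balancer_monitor" "eu" {
  account_id       = var.account_id
  description      = "EU origin health check"
  type             = "https"
  port             = 443          # e.g. conditional: var.https ? 443 : 80
  method           = "GET"
  path             = "/health"
  interval         = 60           # seconds between probes
  retries          = 2            # consecutive failures before unhealthy
  timeout          = 5            # seconds to wait for a response
  expected_codes   = "200"
  allow_insecure   = false        # require a valid certificate
  follow_redirects = false        # set true if /health redirects

  # Host header must match what the origin expects
  header {
    header = "Host"
    values = [var.origin_host_header]
  }
}
```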
Let us break down the key configuration options:
- `type`: The protocol to use — `https` or `http`. Always use `https` in production.
- `port`: Dynamically set to `443` for HTTPS or `80` for HTTP using a conditional expression.
- `path`: The endpoint to probe. Your application should expose a `/health` endpoint that returns a `200` status when the server is healthy.
- `interval`: How often (in seconds) Cloudflare sends a health check. `60` seconds is a good balance between responsiveness and not overloading your servers.
- `retries`: Number of consecutive failures before marking an origin as unhealthy. Setting this to `2` prevents false positives from transient network issues.
- `timeout`: How long to wait for a response. `5` seconds is reasonable for most API health endpoints.
- `expected_codes`: The HTTP status code that indicates a healthy origin. Use `"200"` for standard health checks.
- `allow_insecure`: Whether to skip SSL certificate verification. Keep this `false` in production to ensure valid certificates.
- `follow_redirects`: Whether to follow HTTP 3xx redirects during health checks. Enable this if your health endpoint redirects (e.g., HTTP to HTTPS redirect).
- `header`: The `Host` header is critical — it must match the origin’s expected hostname for proper routing, especially when origins are behind shared infrastructure like a reverse proxy or ingress controller.
Why Separate Monitors Per Region?
We create separate monitors for EU and US origins rather than sharing one. This is because each monitor sends the appropriate Host header matching its specific origin. If your origins share the same health check path, you might be tempted to reuse a monitor, but the Host header difference makes separate monitors necessary.
Step 5: Create Origin Pools
Pools group your origins by region. Each pool is associated with a health monitor and configured for specific check regions.
Add the following to the same main.tf file:
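A sketch of the EU pool (the US pool is analogous, pointing at the US monitor and using `check_regions = ["WNAM"]`; names are illustrative):

```hcl
resource "cloudflare_load_balancer_pool" "eu" {
  account_id      = var.account_id
  name            = "${var.name}-eu-pool"
  monitor         = cloudflare_load_balancer_monitor.eu.id
  check_regions   = ["WEU"]   # probe from Western Europe
  minimum_origins = 1

  origins {
    name    = "eu-origin"
    address = var.eu_origin_address
    enabled = true

    # Host header forwarded to the origin
    header {
      header = "Host"
      values = [var.origin_host_header]
    }
  }

  origin_steering {
    policy = "random"
  }
}
```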
Let us break down the important pool configuration:
check_regions
This determines where Cloudflare runs health checks from. Cloudflare has health check probes in multiple regions. By setting check_regions = ["WEU"] for the EU pool, health checks are performed from Western Europe data centers — giving you an accurate picture of the pool’s health from the perspective of users in that region.
Available check regions include:
| Code | Region |
|---|---|
| `WEU` | Western Europe |
| `EEU` | Eastern Europe |
| `WNAM` | Western North America |
| `ENAM` | Eastern North America |
| `WSAM` | Western South America |
| `ESAM` | Eastern South America |
| `OC` | Oceania |
| `WAS` | Western Asia |
| `EAS` | Eastern Asia |
| `SAS` | Southern Asia |
| `SEAS` | Southeast Asia |
| `NEAS` | Northeast Asia |
| `NAF` | Northern Africa |
| `SAF` | Southern Africa |
| `ME` | Middle East |
minimum_origins
The minimum number of healthy origins required for the pool to be considered healthy. With a single origin per pool, set this to 1. If you have multiple origins in a pool, you might set this higher.
origin_steering
Determines how traffic is distributed among origins within a pool. With a single origin this does not matter much, but when you have multiple origins in a pool, random distributes traffic evenly.
The Host Header in Origins
The header block sets the Host header on requests forwarded to the origin. This is essential when your origins are behind shared infrastructure (like an ingress controller or reverse proxy) that routes based on the Host header. Without this, your origin might not know which service to route the request to.
Step 6: Create the Load Balancer with Geo-Steering
Now for the main event — the load balancer itself with geographic steering configuration:
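A sketch of the load balancer resource, tying together the pools, steering rules, and location strategy discussed in the sections that follow (resource and variable names are illustrative):

```hcl
resource "cloudflare_load_balancer" "this" {
  zone_id          = var.zone_id
  name             = var.name   # e.g. api-lb.example.com
  default_pool_ids = [
    cloudflare_load_balancer_pool.eu.id,
    cloudflare_load_balancer_pool.us.id,
  ]
  fallback_pool_id = cloudflare_load_balancer_pool.eu.id
  proxied          = var.proxied
  ttl              = var.proxied ? null : var.ttl
  steering_policy  = var.steering_policy    # "geo"
  session_affinity = var.session_affinity   # "ip_cookie"

  # Region-level routing
  region_pools {
    region   = "WEU"
    pool_ids = [cloudflare_load_balancer_pool.eu.id]
  }

  region_pools {
    region   = "EEU"
    pool_ids = [cloudflare_load_balancer_pool.eu.id]
  }

  region_pools {
    region   = "WNAM"
    pool_ids = [cloudflare_load_balancer_pool.us.id]
  }

  region_pools {
    region   = "ENAM"
    pool_ids = [cloudflare_load_balancer_pool.us.id]
  }

  # Country-level override: Kenya is closer to the EU region
  country_pools {
    country  = "KE"
    pool_ids = [cloudflare_load_balancer_pool.eu.id]
  }

  location_strategy {
    mode       = "resolver_ip"
    prefer_ecs = "never"
  }
}
```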
This is where all the magic happens. Let me explain each section:
default_pool_ids
This defines the ordered list of pools used when no geo-steering rule matches the user’s location. The order matters — the first pool in the list is preferred. For regions not explicitly defined (like Asia or South America), traffic falls through to these default pools.
fallback_pool_id
The pool of last resort. When all pools in a region are unhealthy, traffic is sent to the fallback pool. Choose your most reliable region for this.
proxied
When set to true, traffic flows through Cloudflare’s network, enabling:
- DDoS protection
- WAF (Web Application Firewall)
- CDN caching
- SSL/TLS termination
- Bot management
When proxied, the TTL is automatically managed by Cloudflare (hence ttl = var.proxied ? null : var.ttl).
region_pools
Region-level routing rules. These define which pool(s) serve traffic from specific geographic regions. You can specify multiple pools per region for fallback:
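For example, a rule can list a second pool as an in-rule fallback (pool references are illustrative):

```hcl
region_pools {
  region   = "WEU"
  pool_ids = [
    cloudflare_load_balancer_pool.eu.id,  # preferred
    cloudflare_load_balancer_pool.us.id,  # used if the EU pool is unhealthy
  ]
}
```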
country_pools
Country-level routing takes highest precedence in geo-steering. Use this for countries that need specific routing, such as routing Kenya (KE) to the EU pool because it is geographically closer to Europe than to the US:
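A sketch of such a country-level rule:

```hcl
country_pools {
  country  = "KE"   # Kenya → EU, geographically closer than the US
  pool_ids = [cloudflare_load_balancer_pool.eu.id]
}
```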
Country codes follow the ISO 3166-1 alpha-2 standard.
location_strategy
Configures how Cloudflare determines the user’s location:
- `mode = "resolver_ip"`: Uses the IP address of the user’s DNS resolver to determine location. This is the default and works well for most cases.
- `prefer_ecs = "never"`: Disables EDNS Client Subnet (ECS). ECS can provide more accurate location data but has privacy implications. Set to `"always"` if you need maximum accuracy and your DNS resolvers support it.
Step 7: Define Module Outputs
Create modules/cloudflare-geo-load-balancer/outputs.tf to expose useful information:
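A sketch of the module outputs (output names are illustrative):

```hcl
output "load_balancer_id" {
  value = cloudflare_load_balancer.this.id
}

output "load_balancer_hostname" {
  value = cloudflare_load_balancer.this.name
}

output "eu_pool_id" {
  value = cloudflare_load_balancer_pool.eu.id
}

output "us_pool_id" {
  value = cloudflare_load_balancer_pool.us.id
}
```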
These outputs are useful for:
- Referencing resources in other Terraform configurations
- Debugging and monitoring through the Cloudflare dashboard
- Integration with CI/CD pipelines
Step 8: Instantiate the Module for Multiple Environments
Now we can use our module to create load balancers for different environments. Create the root main.tf:
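A sketch of two module instantiations (hostnames and origin addresses are hypothetical):

```hcl
module "api_lb_production" {
  source = "./modules/cloudflare-geo-load-balancer"

  name               = "api-lb.example.com"
  zone_id            = var.cloudflare_zone_id
  account_id         = var.cloudflare_account_id
  eu_origin_address  = "eu-prod.origin.example.com"
  us_origin_address  = "us-prod.origin.example.com"
  origin_host_header = "api.example.com"
  steering_policy    = "geo"
}

module "api_lb_staging" {
  source = "./modules/cloudflare-geo-load-balancer"

  name               = "api-lb-staging.example.com"
  zone_id            = var.cloudflare_zone_id
  account_id         = var.cloudflare_account_id
  eu_origin_address  = "eu-staging.origin.example.com"
  us_origin_address  = "us-staging.origin.example.com"
  origin_host_header = "api-staging.example.com"
  steering_policy    = "geo"
}
```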
And the root outputs.tf:
| |
Notice how each environment uses the exact same module with different parameters. This is the power of Terraform modules — one codebase, multiple deployments, guaranteed consistency.
Step 9: Deploy the Infrastructure
Initialize Terraform and deploy:
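The standard Terraform workflow applies:

```shell
terraform init              # download the Cloudflare provider and set up the backend
terraform plan -out=tfplan  # review the monitors, pools, and load balancers to be created
terraform apply tfplan      # create the resources
```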
You should see output similar to:
| |
Verify the Setup
After deploying, verify your load balancer is working:
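For example, you can inspect the response headers with `curl` (hostname is illustrative):

```shell
# Which Cloudflare PoP handled the request? (CF-Ray header)
curl -sI https://api-lb.example.com/health | grep -i cf-ray

# Does the health endpoint return 200 through the load balancer?
curl -s -o /dev/null -w "%{http_code}\n" https://api-lb.example.com/health
```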
The CF-Ray header includes a three-letter airport code indicating which Cloudflare data center handled the request (e.g., NBO for Nairobi, IAD for Washington DC).
Understanding Geo-Steering in Depth
Geo-steering uses a priority hierarchy to determine which pool serves a request. Understanding this hierarchy is critical for correct configuration:
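The hierarchy, from highest to lowest precedence:

```
1. country_pools     — exact country match (e.g. KE)
2. region_pools      — region match (e.g. WEU, ENAM)
3. default_pool_ids  — ordered fallback when no geo rule matches
4. fallback_pool_id  — last resort when all pools are unhealthy
```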
Example: Request from Kenya
- Country check: `KE` is defined in `country_pools` → routes to EU Pool
- Even though Kenya is in Africa (not WEU or EEU), the country-level rule takes precedence
Example: Request from Germany
- Country check: `DE` is NOT defined in `country_pools`
- Region check: Germany is in `WEU` → routes to EU Pool
Example: Request from Japan
- Country check: `JP` is NOT defined in `country_pools`
- Region check: No region pool defined for East Asia
- Default pools: Falls through to `[EU Pool, US Pool]` — EU Pool is tried first
Adding More Regions
To add coverage for more regions, simply add more region_pools blocks:
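For example (pool references are illustrative, and region codes come from the table above):

```hcl
region_pools {
  region   = "OC"   # Oceania
  pool_ids = [cloudflare_load_balancer_pool.us.id]   # or a dedicated APAC pool
}

region_pools {
  region   = "SEAS" # Southeast Asia
  pool_ids = [cloudflare_load_balancer_pool.us.id]
}
```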
Understanding Steering Policies
Cloudflare offers multiple steering policies. Choose based on your needs:
geo — Geographic Steering
Routes traffic based on the user’s geographic location using region_pools and country_pools. This is the most explicit and predictable option.
Best for: Applications with strict data residency requirements, or when you want explicit control over regional routing.
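In the load balancer resource:

```hcl
steering_policy = "geo"
# Requires region_pools and/or country_pools blocks to be defined
```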
dynamic_latency — Latency-Based Steering
Cloudflare measures round-trip time to each pool and routes users to the lowest-latency pool. The routing table is built dynamically based on observed performance.
Best for: Performance-optimized applications where the fastest response time matters most.
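In the load balancer resource:

```hcl
steering_policy = "dynamic_latency"
```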
proximity_strict — Proximity Steering
Routes to the geographically closest pool based on the user’s resolver IP or EDNS Client Subnet data. Unlike geo, this does not require explicit region/country mapping.
Best for: Simple geographic routing without needing explicit region mapping. Works well as a free alternative to geo.
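In the load balancer resource:

```hcl
steering_policy = "proximity_strict"
```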
random — Random Steering
Distributes traffic randomly across all healthy pools. Each pool has an equal chance of receiving any request.
Best for: Simple load distribution when geography does not matter.
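In the load balancer resource:

```hcl
steering_policy = "random"
```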
off — No Steering
Uses the default_pool_ids order. Traffic always goes to the first healthy pool. Acts as a simple active/passive failover.
Best for: Active/passive setups where you want all traffic on one pool unless it fails.
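In the load balancer resource:

```hcl
steering_policy = "off"
# Traffic follows the order of default_pool_ids: first healthy pool wins
```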
Session Affinity Explained
Session affinity ensures that subsequent requests from the same user go to the same origin server. This is important for applications that maintain server-side state (like sessions, shopping carts, or WebSocket connections).
none — No Affinity
Every request is routed independently. Best for stateless APIs.
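In the load balancer resource:

```hcl
session_affinity = "none"
```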
cookie — Cookie-Based Affinity
Cloudflare sets a cookie (__cflb) that pins the user to a specific origin. The cookie is transparent to the user and your application.
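In the load balancer resource (the TTL value is illustrative):

```hcl
session_affinity     = "cookie"
session_affinity_ttl = 1800   # cookie lifetime in seconds
```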
ip_cookie — IP + Cookie Affinity
Combines the user’s IP address with a cookie for stronger session pinning. Even if the cookie is lost, the IP address provides a fallback for affinity.
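In the load balancer resource (the TTL value is illustrative):

```hcl
session_affinity     = "ip_cookie"
session_affinity_ttl = 1800
```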
This is the recommended option for most applications. It provides the best balance of reliability and consistency.
Health Checks and Automatic Failover
Health checks are what make load balancers truly valuable. Without them, you are just doing static DNS routing. Here is how the failover mechanism works:
Normal Operation
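With both pools healthy, traffic splits by geography:

```
EU user ──→ EU Pool (healthy) ──→ EU origin (eu-west-1)
US user ──→ US Pool (healthy) ──→ US origin (us-west-2)
```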
EU Origin Becomes Unhealthy
After 2 consecutive failed health checks (configurable via retries):
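EU traffic fails over to the next healthy pool:

```
EU user ──→ EU Pool (unhealthy ✗)
              └──→ failover ──→ US Pool (healthy) ──→ US origin
US user ──→ US Pool (healthy) ──→ US origin
```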
Both Origins Unhealthy
Traffic is sent to the fallback pool (EU Pool). Even when unhealthy, the fallback pool receives traffic as a last resort — this is better than returning an error to users.
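All traffic goes to the fallback pool as a best effort:

```
All pools unhealthy ✗
Any user ──→ fallback_pool_id (EU Pool) ──→ best-effort delivery
```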
Implementing a Health Check Endpoint
Your application needs a /health endpoint. Here is a minimal example:
Node.js (Express):
| |
Python (FastAPI):
| |
Go (net/http):
Tip: Keep health check endpoints lightweight. They are called every 60 seconds from multiple Cloudflare data centers. Avoid expensive database queries — a simple connection check is sufficient.
Best Practices
1. Always Use Proxied Mode
Setting proxied = true routes traffic through Cloudflare’s network, giving you DDoS protection, WAF, caching, and SSL termination for free. Only disable this if you have a specific reason (like needing to expose the origin IP).
2. Set Appropriate Health Check Intervals
- 60 seconds: Good for production APIs. Balances responsiveness with server load.
- 30 seconds: Better for critical services where faster failover is needed.
- 120 seconds: Acceptable for non-critical services to reduce health check traffic.
3. Use ip_cookie Session Affinity
For most applications, ip_cookie provides the best balance. It maintains session consistency while handling edge cases like cookie clearing.
4. Define Country Pools for Important Markets
If you have users concentrated in specific countries, define explicit country_pools to ensure they always hit the optimal region:
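For example (country choices here are hypothetical):

```hcl
country_pools {
  country  = "KE"   # Kenya → EU, geographically closer
  pool_ids = [cloudflare_load_balancer_pool.eu.id]
}

country_pools {
  country  = "BR"   # Brazil → US
  pool_ids = [cloudflare_load_balancer_pool.us.id]
}
```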
5. Set the Fallback Pool to Your Most Reliable Region
The fallback pool is your safety net. Choose the region with the highest uptime and capacity.
6. Use Terraform Modules for Consistency
As shown in this guide, wrapping your load balancer configuration in a reusable module ensures consistency across environments and services. One bug fix or improvement applies everywhere.
7. Store State Remotely
For team environments, always use a remote backend for Terraform state:
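An S3 + DynamoDB backend sketch (bucket, key, and table names are placeholders):

```hcl
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"                       # placeholder
    key            = "cloudflare/load-balancers/terraform.tfstate"
    region         = "eu-west-1"
    dynamodb_table = "terraform-locks"                          # enables state locking
    encrypt        = true
  }
}
```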
8. Use Separate Health Monitors Per Region
Never share a health monitor between pools. Each monitor should have the correct Host header for its specific origin to ensure accurate health reporting.
Troubleshooting Common Issues
Health Checks Failing
Symptom: Pool shows as unhealthy in Cloudflare dashboard.
Common causes:
- Wrong `Host` header: Ensure the `header` block in both the monitor and origin matches the origin’s expected hostname
- SSL certificate issues: If using HTTPS, ensure `allow_insecure = false` and your origin has a valid certificate. For internal/self-signed certs, temporarily set `allow_insecure = true`
- Firewall blocking Cloudflare IPs: Ensure your origin allows traffic from Cloudflare’s IP ranges
- Health endpoint not responding with expected code: Verify your `/health` endpoint returns `200` (not `204`, `301`, etc.)
Traffic Not Routing to Expected Region
Symptom: Users in Europe are hitting the US pool.
Common causes:
- DNS resolver location: The user’s DNS resolver might be in a different region. VPNs and public DNS services (like Google 8.8.8.8) can cause this
- Missing `region_pools` entries: Ensure all relevant regions are defined
- Country pool overriding region pool: Remember `country_pools` takes precedence over `region_pools`
Session Affinity Not Working
Symptom: Users are bouncing between origins.
Common causes:
- Cookies blocked: The client might be blocking Cloudflare’s `__cflb` cookie. Use `ip_cookie` for better reliability
- Multiple load balancer layers: If you have another load balancer (like AWS ALB) between Cloudflare and your origin, session affinity might be broken at that layer
Terraform Plan Shows Unexpected Changes
Symptom: Running terraform plan shows changes even though you haven’t modified anything.
Common causes:
- Order sensitivity: Cloudflare may return pool IDs in a different order. Use `lifecycle { ignore_changes }` if needed
- TTL being set when proxied: When `proxied = true`, TTL is managed by Cloudflare. The conditional `ttl = var.proxied ? null : var.ttl` handles this
Conclusion
Cloudflare Load Balancers with geo-steering provide a powerful way to distribute traffic globally, reduce latency, and ensure high availability for your applications. By using Terraform to manage this infrastructure, you get the benefits of version control, repeatability, and consistency across environments.
In this guide, we covered:
- Setting up the Cloudflare Terraform provider
- Building a reusable Terraform module for geo-distributed load balancers
- Configuring health check monitors for automatic failover
- Creating regional origin pools with proper `Host` headers
- Implementing geo-steering with region and country-level routing
- Managing session affinity for stateful applications
- Deploying across multiple environments with a single module
The modular approach means adding a new environment or service is as simple as adding a new module block — all the complexity is encapsulated and tested.
For further reading, check out the Cloudflare Load Balancing documentation and the Cloudflare Terraform provider reference.