HashiCorp Vault on Kubernetes: A Comprehensive Production Guide
Introduction
In modern cloud-native architectures, managing secrets securely is one of the most critical challenges. Hardcoding credentials, storing API keys in environment variables, or committing secrets to version control are dangerous practices that lead to security breaches. This is where HashiCorp Vault comes in—a powerful, enterprise-grade secrets management solution designed for the cloud-native era.
This comprehensive guide will walk you through deploying Vault on Kubernetes from scratch, implementing production-grade hardening, and integrating it with your applications using real-world examples from a production infrastructure.
What is HashiCorp Vault?
HashiCorp Vault is a tool for securely accessing secrets. A secret is anything you want to tightly control access to, such as API keys, passwords, certificates, or encryption keys. Vault provides a unified interface to any secret while providing tight access control and recording a detailed audit log.
Core Capabilities
Secrets Management
- Securely store and access secrets through a unified API
- Dynamic secrets generation with automatic revocation
- Secret versioning and rotation
- Support for multiple secret backends (KV, databases, cloud providers)
Data Encryption
- Encrypt data in-transit and at-rest
- Encryption as a service for applications
- Cryptographic operations without exposing keys
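Encryption as a service is easiest to see with the transit engine: the application sends data to Vault for encryption or decryption, and the key itself never leaves Vault. A minimal sketch (the key name myapp is illustrative):

# Enable the transit engine and create a named encryption key
vault secrets enable transit
vault write -f transit/keys/myapp

# Encrypt: Vault returns a ciphertext; the key never leaves Vault
vault write transit/encrypt/myapp plaintext=$(echo -n "my secret data" | base64)

# Decrypt: Vault returns the base64-encoded plaintext
vault write transit/decrypt/myapp ciphertext="vault:v1:..."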
Identity-Based Access
- Fine-grained access control with policies
- Multiple authentication methods (tokens, Kubernetes, LDAP, etc.)
- Secret leasing with time-to-live (TTL)
Audit and Compliance
- Detailed audit logs of all operations
- Policy enforcement and compliance reporting
- Secret usage tracking
Why Do We Need Vault?
The Secret Sprawl Problem
Without a centralized secrets management solution, organizations face several challenges:
- Secrets Scattered Everywhere: Developers store secrets in config files, environment variables, CI/CD systems, and wikis
- No Rotation Strategy: Static secrets that never change become security liabilities
- No Audit Trail: When a breach occurs, you can’t determine what was accessed or by whom
- Access Control Nightmare: Managing who has access to what secrets becomes unmaintainable
- Compliance Issues: Regulations like GDPR, SOC 2, and PCI-DSS require strict secrets management
How Vault Solves These Problems
Centralized Secret Storage
All secrets are stored in one place with encryption at rest. Vault becomes the single source of truth for all sensitive data.
Dynamic Secrets
Instead of sharing long-lived credentials, Vault generates short-lived credentials on-demand. When an application needs database access, Vault creates temporary credentials that expire automatically.
Audit Everything
Every operation is logged: who accessed what secret, when, and from where. This provides complete visibility for security teams.
Fine-Grained Access Control
Policies define exactly what each application or user can access. An application can only read the secrets it needs, nothing more.
Automated Secret Rotation
Vault can automatically rotate secrets without downtime, reducing the window of compromise.
Architecture Overview
Before diving into the setup, let’s understand the architecture we’ll be deploying:
High Availability Setup
                 External Access
        HTTPS (TLS via Let's Encrypt)
                       │
                       ▼
                ┌─────────────┐
                │   Traefik   │  (Ingress Controller)
                │   Ingress   │
                └─────────────┘
                       │
                       │ HTTP (internal)
                       ▼
    ┌─────────────────────────────────────┐
    │      Vault Service (ClusterIP)      │
    │           vault.vault:8200          │
    └─────────────────────────────────────┘
                       │
                  ┌────┼────┐
                  │    │    │
                  ▼    ▼    ▼
                ┌───┐┌───┐┌───┐
                │V-0││V-1││V-2│  (3-node HA cluster)
                └───┘└───┘└───┘
                  │    │    │    Raft consensus protocol
                  │    │    │    (leader election)
    ┌─────────────────────────────────────┐
    │       Integrated Raft Storage       │
    │    ┌─────┐   ┌─────┐   ┌─────┐      │
    │    │PVC-0│   │PVC-1│   │PVC-2│      │  (20Gi each)
    │    └─────┘   └─────┘   └─────┘      │
    └─────────────────────────────────────┘
    ┌─────────────────────────────────────┐
    │            Audit Storage            │
    │    ┌─────┐   ┌─────┐   ┌─────┐      │
    │    │Audit│   │Audit│   │Audit│      │  (10Gi each)
    │    └─────┘   └─────┘   └─────┘      │
    └─────────────────────────────────────┘

              Integration Layer

    ┌─────────────────────────────────────┐
    │      External Secrets Operator      │  Syncs Vault secrets to
    └─────────────────────────────────────┘  Kubernetes Secrets
                       │
                       │ Reads secrets from Vault
                       ▼
    ┌─────────────────────────────────────┐
    │           Application Pods          │  Use standard K8s secrets
    │       (no direct Vault access)      │
    └─────────────────────────────────────┘
Key Components
- Vault Cluster: 3 replicas for high availability using Raft integrated storage
- Storage: Persistent volumes for data and audit logs
- Ingress: TLS-terminated external access via Traefik
- Network Policy: Restricts access to authorized namespaces only
- External Secrets Operator: Syncs Vault secrets to Kubernetes secrets automatically
Prerequisites
Before beginning the Vault deployment, ensure you have the following in place:
Required Infrastructure
- Kubernetes Cluster: Version 1.20+ running and accessible
- kubectl: Configured and authenticated to your cluster
- Helmfile: Installed for declarative Helm chart management
- Cert-Manager: Deployed for automatic TLS certificate management
- Ingress Controller: Traefik (or similar) for external access
- Storage Class: Available for persistent volume claims (e.g., local-path)
# Install kubectl
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
# Install helmfile
wget https://github.com/helmfile/helmfile/releases/download/v0.157.0/helmfile_0.157.0_linux_amd64.tar.gz
tar -xzf helmfile_0.157.0_linux_amd64.tar.gz
sudo mv helmfile /usr/local/bin/
# Install helm
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
# Verify installations
kubectl version --client
helmfile --version
helm version
Storage Considerations
Vault requires persistent storage for:
- Data Storage: Raft integrated storage (20Gi per pod recommended)
- Audit Logs: Compliance and security auditing (10Gi per pod recommended)
Ground-Up Setup on Kubernetes
Step 1: Prepare the Configuration
Create the directory structure for your Vault deployment:
mkdir -p k8s/releases/vault
cd k8s/releases/vault
Create the helmfile.yaml:
helmDefaults:
  createNamespace: true
  timeout: 600
  wait: true
  atomic: true

repositories:
  - name: hashicorp
    url: https://helm.releases.hashicorp.com

releases:
  - name: vault
    namespace: vault
    chart: hashicorp/vault
    version: "0.31.0"
    values:
      - ./values.yml
    hooks:
      - events: ["presync"]
        command: "kubectl"
        args: ["apply", "-f", "./network-policy.yml"]
Create the values.yml with production-ready configuration:
global:
  openshift: false
  tlsDisable: true  # TLS termination at ingress

server:
  enabled: true

  # High Availability Configuration
  ha:
    enabled: true
    replicas: 3
    raft:
      enabled: true
      setNodeId: true
      config: |
        ui = true

        listener "tcp" {
          address = "[::]:8200"
          cluster_address = "[::]:8201"
          tls_disable = true
        }

        storage "raft" {
          path = "/vault/data"
        }

        service_registration "kubernetes" {}

        cluster_addr = "http://$(POD_IP):8201"
        api_addr = "http://$(POD_IP):8200"

  # Resource Configuration for Production
  resources:
    requests:
      memory: 512Mi
      cpu: 250m
    limits:
      memory: 2Gi
      cpu: 1000m

  # Persistent Storage Configuration
  dataStorage:
    enabled: true
    size: 20Gi
    storageClass: local-path
    accessMode: ReadWriteOnce

  # Audit Storage for Compliance
  auditStorage:
    enabled: true
    size: 10Gi
    storageClass: local-path
    accessMode: ReadWriteOnce

  # Security Configuration
  securityContext:
    runAsNonRoot: true
    runAsUser: 100
    runAsGroup: 1000
    fsGroup: 1000

  # Service Configuration
  service:
    enabled: true
    type: ClusterIP
    port: 8200
    targetPort: 8200

  # Ingress Configuration
  ingress:
    enabled: true
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt-prod-issuer
      traefik.ingress.kubernetes.io/router.entrypoints: websecure
      traefik.ingress.kubernetes.io/service.serversscheme: http
    ingressClassName: traefik
    pathType: Prefix
    hosts:
      - host: vault.yourdomain.com
        paths:
          - /
    tls:
      - secretName: vault-tls
        hosts:
          - vault.yourdomain.com

  # Pod Disruption Budget for HA
  podDisruptionBudget:
    enabled: true
    minAvailable: 2

  # Anti-affinity to spread pods across nodes
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchExpressions:
                - key: app.kubernetes.io/name
                  operator: In
                  values:
                    - vault
            topologyKey: kubernetes.io/hostname

  # Health check probes
  readinessProbe:
    enabled: true
    path: "/v1/sys/health?standbyok=true&sealedcode=204&uninitcode=204"
  livenessProbe:
    enabled: true
    path: "/v1/sys/health?standbyok=true&sealedcode=204&uninitcode=204"
    initialDelaySeconds: 60

  # Environment variables
  extraEnvironmentVars:
    VAULT_LOG_LEVEL: info
    VAULT_LOG_FORMAT: json

# UI Configuration
ui:
  enabled: true
  serviceType: ClusterIP

# Disable components not needed
injector:
  enabled: false

csi:
  enabled: false
Create the network-policy.yml for network security:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: vault-network-policy
  namespace: vault
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: vault
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Allow from vault namespace (Raft communication)
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: vault
      ports:
        - protocol: TCP
          port: 8200
        - protocol: TCP
          port: 8201
    # Allow from external-secrets namespace
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: external-secrets
      ports:
        - protocol: TCP
          port: 8200
    # Allow from application namespaces
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: default
      ports:
        - protocol: TCP
          port: 8200
    # Allow from ingress controller
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              app.kubernetes.io/name: traefik
      ports:
        - protocol: TCP
          port: 8200
  egress:
    # Allow all egress for DNS, cert-manager, etc.
    - {}
Step 2: Deploy Vault
Deploy Vault using helmfile:
# Preview what will be deployed
helmfile template
# Deploy Vault
helmfile sync
Monitor the deployment:
# Watch pods come up
kubectl get pods -n vault -w
# Expected output (pods will be 0/1 until initialized and unsealed):
# NAME READY STATUS RESTARTS AGE
# vault-0 0/1 Running 0 2m
# vault-1 0/1 Running 0 2m
# vault-2 0/1 Running 0 2m
# Check persistent volumes
kubectl get pvc -n vault
# Check ingress
kubectl get ingress -n vault
# Verify TLS certificate
kubectl get certificate -n vault
Step 3: Initialize Vault
Vault starts sealed and uninitialized. Initialization creates the encryption keys and unseal keys:
# Exec into the first pod
kubectl exec -it vault-0 -n vault -- sh
Inside the pod, initialize Vault with Shamir secret sharing:
vault operator init -key-shares=5 -key-threshold=3
This command will output:
Unseal Key 1: [REDACTED]
Unseal Key 2: [REDACTED]
Unseal Key 3: [REDACTED]
Unseal Key 4: [REDACTED]
Unseal Key 5: [REDACTED]
Initial Root Token: [REDACTED]
CRITICAL SECURITY STEP: Immediately save these keys in a secure password manager:
- The 5 unseal keys
- The root token
- NEVER commit these to git or share them insecurely
- Store them in separate secure locations
- You need 3 of 5 keys to unseal Vault
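As one way to honor the "separate secure locations" rule, here is a hedged sketch that encrypts each unseal key to a different custodian with gpg (the recipient addresses are hypothetical):

# Hypothetical: encrypt each unseal key to a different custodian's GPG key
echo "[KEY-1]" | gpg --encrypt --armor -r alice@example.com > unseal-key-1.asc
echo "[KEY-2]" | gpg --encrypt --armor -r bob@example.com > unseal-key-2.asc
echo "[KEY-3]" | gpg --encrypt --armor -r carol@example.com > unseal-key-3.asc
# ...and store each encrypted file in a separate secure location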
Step 4: Unseal the Vault Cluster
Each Vault pod must be unsealed independently using 3 of the 5 unseal keys.
Unseal vault-0 (the node you just initialized):
# Still inside vault-0 pod
vault operator unseal [KEY-1]
# Output: Sealed: true, Unseal Progress: 1/3
vault operator unseal [KEY-2]
# Output: Sealed: true, Unseal Progress: 2/3
vault operator unseal [KEY-3]
# Output: Sealed: false (UNSEALED!)
exit
Unseal vault-1:
kubectl exec -it vault-1 -n vault -- sh
# First, join the Raft cluster
vault operator raft join http://vault-0.vault-internal:8200
# Then unseal
vault operator unseal [KEY-1]
vault operator unseal [KEY-2]
vault operator unseal [KEY-3]
exit
Unseal vault-2:
kubectl exec -it vault-2 -n vault -- sh
# Join Raft cluster
vault operator raft join http://vault-0.vault-internal:8200
# Unseal
vault operator unseal [KEY-1]
vault operator unseal [KEY-2]
vault operator unseal [KEY-3]
exit
Verify all pods are now ready:
kubectl get pods -n vault
# Expected output:
# NAME READY STATUS RESTARTS AGE
# vault-0 1/1 Running 0 10m
# vault-1 1/1 Running 0 10m
# vault-2 1/1 Running 0 10m
Step 5: Verify High Availability
Check the Raft cluster status:
kubectl exec -it vault-0 -n vault -- sh
# Login
vault login [ROOT-TOKEN]
# Check Raft peers
vault operator raft list-peers
# Expected output:
# Node Address State Voter
# ---- ------- ----- -----
# vault-0 vault-0.vault-internal leader true
# vault-1 vault-1.vault-internal follower true
# vault-2 vault-2.vault-internal follower true
# Check cluster status
vault status
exit
Step 6: Enable Secrets Engines
Vault requires explicit enabling of secrets engines. Enable the KV (Key-Value) v2 secrets engine:
kubectl exec -it vault-0 -n vault -- sh
vault login [ROOT-TOKEN]
# Enable KV v2 secrets engine
vault secrets enable -version=2 kv
# Verify
vault secrets list
# Output should include:
# Path Type Description
# ---- ---- -----------
# kv/ kv n/a
# ...
exit
Step 7: Create Access Policies
Never use the root token for application access. Create policies with minimal required permissions:
kubectl exec -it vault-0 -n vault -- sh
vault login [ROOT-TOKEN]
# Create a read-only policy for external-secrets operator
vault policy write external-secrets-readonly - <<EOF
# Allow reading secrets
path "kv/data/*" {
capabilities = ["read", "list"]
}
# Allow listing secret paths
path "kv/metadata/*" {
capabilities = ["list"]
}
EOF
# Verify policy
vault policy read external-secrets-readonly
exit
Step 8: Create Service Token
Create a token with the policy for the External Secrets Operator:
kubectl exec -it vault-0 -n vault -- sh
vault login [ROOT-TOKEN]
# Create token with policy
vault token create \
-policy=external-secrets-readonly \
-no-default-policy \
-orphan \
-renewable=false \
-period=0
# Save the token value (hvs.CAESIJ...)
# Store this securely - you'll need it for External Secrets
exit
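Before handing the token to anything, confirm it can do no more than intended; vault token capabilities reports the effective capabilities of a token on a given path:

# Run inside a Vault pod (or anywhere the Vault CLI can reach the server)
vault token capabilities [SERVICE-TOKEN] kv/data/infra/database-credentials
# Expected: list, read

vault token capabilities [SERVICE-TOKEN] sys/policies/acl
# Expected: deny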
Production Hardening
Now that Vault is deployed and running, let’s implement production-grade hardening.
1. Enable Audit Logging
Audit logs are crucial for compliance and security investigations:
kubectl exec -it vault-0 -n vault -- sh
vault login [ROOT-TOKEN]
# Enable file-based audit logging
vault audit enable file file_path=/vault/audit/audit.log
# Verify audit is enabled
vault audit list
# Output:
# Path Type Description
# ---- ---- -----------
# file/ file n/a
exit
Audit logs will now be written to the persistent audit volume mounted at /vault/audit/.
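Each audit entry is a single JSON object (secret values are HMAC'd rather than logged in plaintext), so it can be inspected with standard tooling. A quick sketch, assuming jq is available on your workstation:

# Show the type, path, and caller of the most recent audit entry
kubectl exec vault-0 -n vault -- tail -n 1 /vault/audit/audit.log | \
  jq '{type: .type, path: .request.path, caller: .auth.display_name}'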
2. Implement Network Security
The network policy we created earlier restricts access to Vault. Let’s verify it:
# View the network policy
kubectl describe networkpolicy vault-network-policy -n vault
# Test connectivity from allowed namespace
kubectl run test -it --rm --image=busybox -n external-secrets -- wget -O- http://vault.vault:8200/v1/sys/health
# Test connectivity from disallowed namespace (should fail)
kubectl run test -it --rm --image=busybox -n some-other-ns -- wget -O- http://vault.vault:8200/v1/sys/health
3. Set Resource Limits
Resource limits prevent resource exhaustion attacks and ensure fair resource sharing. Our configuration already includes:
resources:
  requests:
    memory: 512Mi
    cpu: 250m
  limits:
    memory: 2Gi
    cpu: 1000m
Monitor resource usage:
# View resource consumption
kubectl top pods -n vault
# View resource requests/limits
kubectl describe pods -n vault | grep -A 5 "Limits:"
4. Implement Pod Disruption Budget
The Pod Disruption Budget ensures high availability during cluster maintenance:
podDisruptionBudget:
  enabled: true
  minAvailable: 2
This guarantees at least 2 Vault pods are always running during voluntary disruptions (node drains, upgrades, etc.).
Verify the PDB:
kubectl get pdb -n vault
kubectl describe pdb vault -n vault
5. Spread Pods with Anti-Affinity
Anti-affinity spreads Vault pods across different nodes to prevent a single point of failure:
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
              - key: app.kubernetes.io/name
                operator: In
                values:
                  - vault
          topologyKey: kubernetes.io/hostname
Verify pod distribution:
kubectl get pods -n vault -o wide
# Pods should be on different nodes
6. Secure TLS Configuration
Our setup uses TLS termination at the ingress with Let’s Encrypt certificates:
- External traffic: HTTPS with automatic certificate renewal
- Internal traffic: HTTP (within trusted cluster network)
Verify TLS certificate:
kubectl get certificate -n vault vault-tls -o yaml
# Check certificate is ready
kubectl describe certificate vault-tls -n vault
# Test external access
curl -I https://vault.yourdomain.com/ui/
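To confirm which certificate is actually served at the edge, and when it expires, openssl can be pointed at the ingress:

echo | openssl s_client -connect vault.yourdomain.com:443 \
  -servername vault.yourdomain.com 2>/dev/null | \
  openssl x509 -noout -issuer -dates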
7. Implement Backup Strategy
Automated backups are critical for disaster recovery. Create a backup script at backup-all-secrets.sh:
#!/bin/bash
set -e
BACKUP_DIR="vault-backup-$(date +%Y%m%d-%H%M%S)"
VAULT_NAMESPACE="vault"
VAULT_POD="vault-0"
echo "Creating backup: $BACKUP_DIR"
mkdir -p "$BACKUP_DIR"
cd "$BACKUP_DIR"
# 1. Backup Vault Raft snapshot
echo "Backing up Vault Raft snapshot..."
kubectl exec -it "$VAULT_POD" -n "$VAULT_NAMESPACE" -- \
vault operator raft snapshot save /tmp/vault-snapshot.snap
kubectl cp "$VAULT_NAMESPACE/$VAULT_POD:/tmp/vault-snapshot.snap" \
./vault-raft-snapshot.snap
# 2. Backup Kubernetes secrets
echo "Backing up Kubernetes secrets..."
mkdir -p k8s-secrets
kubectl get namespaces -o jsonpath='{.items[*].metadata.name}' | \
tr ' ' '\n' > k8s-secrets/namespaces.txt
while IFS= read -r ns; do
echo " Namespace: $ns"
kubectl get secrets -n "$ns" -o yaml > "k8s-secrets/secrets-$ns.yaml"
done < k8s-secrets/namespaces.txt
# 3. Backup External Secrets configuration
echo "Backing up External Secrets configuration..."
mkdir -p external-secrets-config
kubectl get clustersecretstore -o yaml > \
external-secrets-config/all-clustersecretstores.yaml
kubectl get externalsecret --all-namespaces -o yaml > \
external-secrets-config/all-externalsecrets.yaml
# 4. Create archive
cd ..
echo "Creating archive..."
tar -czf "$BACKUP_DIR.tar.gz" "$BACKUP_DIR"
shasum -a 256 "$BACKUP_DIR.tar.gz" > "$BACKUP_DIR.tar.gz.sha256"
echo "Backup complete: $BACKUP_DIR.tar.gz"
echo "Checksum: $(cat $BACKUP_DIR.tar.gz.sha256)"
Make it executable and run regularly:
chmod +x backup-all-secrets.sh
# Manual backup
./backup-all-secrets.sh
# Or schedule with cron (daily at 2 AM)
0 2 * * * /path/to/backup-all-secrets.sh
Upload backups to secure off-site storage:
# Upload to S3
aws s3 cp vault-backup-*.tar.gz s3://your-backup-bucket/vault-backups/
# Or use rsync to remote server
rsync -avz vault-backup-*.tar.gz user@backup-server:/backups/vault/
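If you would rather keep snapshots in-cluster, the same Raft snapshot can be taken on a schedule by a Kubernetes CronJob. A sketch, in which the vault-snapshot-token secret (holding a token permitted to read sys/storage/raft/snapshot) and the vault-snapshots PVC are assumptions you would create first:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: vault-raft-snapshot
  namespace: vault
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: snapshot
              image: hashicorp/vault:1.15.2
              env:
                - name: VAULT_ADDR
                  value: http://vault.vault:8200
                - name: VAULT_TOKEN
                  valueFrom:
                    secretKeyRef:
                      name: vault-snapshot-token   # assumed: token with read on sys/storage/raft/snapshot
                      key: token
              command: ["/bin/sh", "-c"]
              args:
                - vault operator raft snapshot save /snapshots/vault-$(date +%Y%m%d).snap
              volumeMounts:
                - name: snapshots
                  mountPath: /snapshots
          volumes:
            - name: snapshots
              persistentVolumeClaim:
                claimName: vault-snapshots   # assumed: PVC provisioned for snapshot output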
8. Security Best Practices Summary
Secrets Management:
- ✅ Store unseal keys in a secure password manager (separate locations)
- ✅ Revoke root token after initial setup
- ✅ Use policies for all access (never use root token for applications)
- ✅ Enable audit logging for all operations
- ✅ Rotate tokens regularly
Network Security:
- ✅ Network policies restrict access to authorized namespaces only
- ✅ TLS for external access via ingress
- ✅ Service mesh (optional): Consider Istio/Linkerd for mTLS between pods
Infrastructure Security:
- ✅ Run as non-root user
- ✅ Read-only root filesystem
- ✅ Resource limits prevent resource exhaustion
- ✅ Pod Security admission (Pod Security Policies are removed in Kubernetes 1.25+)
- ✅ Regular security updates
Operational Security:
- ✅ Automated backups to off-site storage
- ✅ Test disaster recovery procedures
- ✅ Monitor audit logs for suspicious activity
- ✅ Alert on Vault seal events
- ✅ Document runbooks for incidents
Working with Vault: Real-World Examples
Example 1: Storing and Retrieving Secrets
Create a secret in Vault:
kubectl exec -it vault-0 -n vault -- sh
vault login [ROOT-TOKEN]
# Store a database credential
vault kv put kv/infra/database-credentials \
username=dbuser \
password=super-secret-password \
host=postgres.database.svc.cluster.local \
port=5432 \
database=myapp
# List secrets
vault kv list kv/infra
# Read a secret
vault kv get kv/infra/database-credentials
# Read specific field
vault kv get -field=password kv/infra/database-credentials
exit
Example 2: Secret Versioning
Vault KV v2 engine maintains version history:
kubectl exec -it vault-0 -n vault -- sh
vault login [ROOT-TOKEN]
# Update the secret (creates version 2)
vault kv put kv/infra/database-credentials \
username=dbuser \
password=new-super-secret-password \
host=postgres.database.svc.cluster.local \
port=5432 \
database=myapp
# Read latest version
vault kv get kv/infra/database-credentials
# Read specific version
vault kv get -version=1 kv/infra/database-credentials
# View version history
vault kv metadata get kv/infra/database-credentials
exit
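If an update turns out to be bad, KV v2 can also roll back: the rollback command writes the data of an older version as a brand-new version, so history stays intact. Inside a Vault pod (after vault login):

vault kv rollback -version=1 kv/infra/database-credentials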
Example 3: Integration with External Secrets Operator
External Secrets Operator syncs Vault secrets to Kubernetes secrets automatically. This is the recommended way to consume Vault secrets in Kubernetes.
Step 1: Install External Secrets Operator
helm repo add external-secrets https://charts.external-secrets.io
helm install external-secrets \
external-secrets/external-secrets \
-n external-secrets \
--create-namespace
Step 2: Create Vault Token Secret
# Create secret with the token we generated earlier
kubectl create secret generic vault-token \
-n external-secrets \
--from-literal=token=hvs.CAESIJ...YOUR-TOKEN-HERE
Step 3: Create ClusterSecretStore
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: vault-backend
spec:
  provider:
    vault:
      server: http://vault.vault:8200
      path: kv
      version: v2
      auth:
        tokenSecretRef:
          name: vault-token
          key: token
          namespace: external-secrets
Apply it:
kubectl apply -f clustersecretstore.yaml
# Verify it's ready
kubectl get clustersecretstore vault-backend
# Expected output:
# NAME AGE STATUS CAPABILITIES READY
# vault-backend 5s Valid ReadWrite True
Step 4: Create ExternalSecret
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: database-credentials
  namespace: default
spec:
  refreshInterval: 5m
  secretStoreRef:
    kind: ClusterSecretStore
    name: vault-backend
  target:
    name: database-credentials
    creationPolicy: Owner
  dataFrom:
    - extract:
        key: infra/database-credentials
Apply and verify:
kubectl apply -f externalsecret.yaml
# Check ExternalSecret status
kubectl get externalsecret database-credentials
# Verify Kubernetes secret was created
kubectl get secret database-credentials
# View secret contents (base64 decoded)
kubectl get secret database-credentials -o jsonpath='{.data.username}' | base64 -d
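The dataFrom/extract form above copies every key from the Vault secret into the Kubernetes secret. When only specific fields are needed, or they should be renamed, a data section with remoteRef can be used instead. A sketch of the alternative spec (same metadata and secretStoreRef as above):

spec:
  refreshInterval: 5m
  secretStoreRef:
    kind: ClusterSecretStore
    name: vault-backend
  target:
    name: database-credentials
    creationPolicy: Owner
  data:
    - secretKey: DB_PASSWORD           # key name in the resulting Kubernetes secret
      remoteRef:
        key: infra/database-credentials
        property: password             # field inside the Vault secret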
Step 5: Use in Application
apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  containers:
    - name: app
      image: myapp:latest
      env:
        - name: DB_USERNAME
          valueFrom:
            secretKeyRef:
              name: database-credentials
              key: username
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: database-credentials
              key: password
        - name: DB_HOST
          valueFrom:
            secretKeyRef:
              name: database-credentials
              key: host
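If the application should receive every key in the secret as an environment variable, envFrom is a more compact alternative; each secret key becomes a variable name verbatim, so name your Vault keys accordingly:

containers:
  - name: app
    image: myapp:latest
    envFrom:
      - secretRef:
          name: database-credentials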
Example 4: Organizing Secrets by Environment
Use path-based organization for different environments:
kubectl exec -it vault-0 -n vault -- sh
vault login [ROOT-TOKEN]
# Development environment secrets
vault kv put kv/dev/api-keys \
stripe_key=sk_test_... \
sendgrid_key=SG.test...
# Staging environment secrets
vault kv put kv/staging/api-keys \
stripe_key=sk_test_... \
sendgrid_key=SG.staging...
# Production environment secrets
vault kv put kv/prod/api-keys \
stripe_key=sk_live_... \
sendgrid_key=SG.prod...
# Create environment-specific policies
vault policy write dev-readonly - <<EOF
path "kv/data/dev/*" {
capabilities = ["read", "list"]
}
EOF
vault policy write prod-readonly - <<EOF
path "kv/data/prod/*" {
capabilities = ["read", "list"]
}
EOF
exit
Create environment-specific ExternalSecrets:
# Production ExternalSecret
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: api-keys
  namespace: production
spec:
  refreshInterval: 5m
  secretStoreRef:
    kind: ClusterSecretStore
    name: vault-backend
  target:
    name: api-keys
  dataFrom:
    - extract:
        key: prod/api-keys
---
# Development ExternalSecret
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: api-keys
  namespace: development
spec:
  refreshInterval: 5m
  secretStoreRef:
    kind: ClusterSecretStore
    name: vault-backend
  target:
    name: api-keys
  dataFrom:
    - extract:
        key: dev/api-keys
Example 5: Bulk Secret Migration
When migrating from another secrets management solution:
# Export secrets from old system (example: Kubernetes secrets)
kubectl get secret old-secret -n default -o json | \
  jq -r '.data | to_entries[] | "\(.key)=\(.value | @base64d)"' > secrets.env

# Copy the file into the Vault pod so the CLI can read it
kubectl cp secrets.env vault/vault-0:/tmp/secrets.env

# Import to Vault
kubectl exec -it vault-0 -n vault -- sh
vault login [ROOT-TOKEN]

# Import each entry as its own secret
while IFS='=' read -r key value; do
  vault kv put kv/migrated/$key value="$value"
done < /tmp/secrets.env

# Or import all key-value pairs into a single secret
vault kv put kv/migrated/application $(tr '\n' ' ' < /tmp/secrets.env)

exit
Disaster Recovery and Backup Procedures
Creating Backups
Automated Daily Backup (Recommended):
The backup script we created earlier should run daily via cron:
# Add to crontab
crontab -e
# Add this line (daily at 2 AM)
0 2 * * * /path/to/k8s/releases/vault/backup-all-secrets.sh >> /var/log/vault-backup.log 2>&1
Manual Backup:
cd k8s/releases/vault
./backup-all-secrets.sh
# Upload to S3
aws s3 cp vault-backup-$(date +%Y%m%d)*.tar.gz \
s3://your-backup-bucket/vault-backups/
|
What Gets Backed Up:
- Vault Raft snapshot (complete Vault data)
- All Kubernetes secrets (all namespaces)
- ExternalSecret definitions
- ClusterSecretStore configuration
- Vault deployment configuration
- Current system state documentation
Restoring from Backup
Restore Vault Data from Raft Snapshot:
# Extract backup
tar -xzf vault-backup-YYYYMMDD-HHMMSS.tar.gz
cd vault-backup-YYYYMMDD-HHMMSS
# Copy snapshot to Vault pod
kubectl cp vault-raft-snapshot.snap vault/vault-0:/tmp/vault-restore.snap
# Restore the snapshot
kubectl exec -it vault-0 -n vault -- sh
vault login [ROOT-TOKEN]
# Force restore (overwrites current data)
vault operator raft snapshot restore -force /tmp/vault-restore.snap
exit
# Restart all Vault pods
kubectl delete pod vault-0 vault-1 vault-2 -n vault
# Wait for pods to restart
kubectl get pods -n vault -w
# Unseal all pods (see unsealing procedure above)
Restore Kubernetes Secrets:
# Restore all secrets from backup
cd vault-backup-YYYYMMDD-HHMMSS
for file in k8s-secrets/secrets-*.yaml; do
echo "Restoring $file..."
kubectl apply -f "$file"
done
# Or restore specific namespace
kubectl apply -f k8s-secrets/secrets-vault.yaml
Restore External Secrets Configuration:
# Restore ClusterSecretStore
kubectl apply -f external-secrets-config/all-clustersecretstores.yaml
# Restore all ExternalSecrets
kubectl apply -f external-secrets-config/all-externalsecrets.yaml
Testing Disaster Recovery
Regularly test your backup and restore procedures in a non-production environment:
# 1. Create a test namespace
kubectl create namespace vault-dr-test
# 2. Deploy a test Vault instance
# (use same helmfile with different namespace)
# 3. Restore from production backup
# (follow restore procedures above)
# 4. Verify data integrity
kubectl exec -it vault-0 -n vault-dr-test -- sh
vault login [ROOT-TOKEN]
vault kv list kv/
vault kv get kv/infra/database-credentials
exit
# 5. Clean up test environment
kubectl delete namespace vault-dr-test
Monitoring and Observability
Health Checks
Vault provides health endpoints for monitoring:
# Check Vault status
curl https://vault.yourdomain.com/v1/sys/health
# Response codes:
# 200: Initialized, unsealed, active
# 429: Unsealed, standby
# 472: Disaster recovery mode
# 473: Performance standby
# 501: Not initialized
# 503: Sealed
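The endpoint also returns a JSON body that scripts and dashboards can consume directly, for example with jq:

curl -s https://vault.yourdomain.com/v1/sys/health | \
  jq '{initialized, sealed, standby}'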
Metrics and Monitoring
Prometheus Integration (if Prometheus is installed):
Enable Vault telemetry in values.yml:
server:
  ha:
    raft:
      config: |
        # ...existing listener and storage stanzas...
        telemetry {
          prometheus_retention_time = "30s"
          disable_hostname = true
        }

serverTelemetry:
  serviceMonitor:
    enabled: true
    selectors:
      release: prometheus
Key metrics to monitor:
- vault_core_unsealed: Vault seal status (should always be 1)
- vault_core_leader: Leader election status
- vault_runtime_alloc_bytes: Memory usage
- vault_raft_peers: Number of Raft peers (should be 3)
- vault_token_count: Number of active tokens
- vault_audit_log_request: Audit log write latency
Alerting Rules
Example Prometheus alert rules:
groups:
  - name: vault
    interval: 30s
    rules:
      - alert: VaultSealed
        expr: vault_core_unsealed == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Vault is sealed"
          description: "Vault instance {{ $labels.instance }} is sealed"

      - alert: VaultDown
        expr: up{job="vault"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Vault is down"
          description: "Vault instance {{ $labels.instance }} is down"

      - alert: VaultNoLeader
        expr: max(vault_core_leader) == 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Vault has no leader"
          description: "Vault cluster has no elected leader"
Log Management
View Vault logs:
# View logs from all Vault pods
kubectl logs -n vault -l app.kubernetes.io/name=vault --tail=100 -f
# View logs from specific pod
kubectl logs -n vault vault-0 -f
# View audit logs
kubectl exec -it vault-0 -n vault -- tail -f /vault/audit/audit.log
Ship logs to centralized logging (ELK, Loki, etc.):
# Example: Fluent Bit configuration for log shipping
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: logging
data:
  fluent-bit.conf: |
    [INPUT]
        Name              tail
        Path              /var/log/containers/vault-*_vault_*.log
        Parser            docker
        Tag               vault.*
        Refresh_Interval  5

    [OUTPUT]
        Name   es
        Match  vault.*
        Host   elasticsearch
        Port   9200
        Index  vault-logs
Troubleshooting Common Issues
Issue 1: Vault Pods Not Ready
Symptoms: Pods show 0/1 READY
Diagnosis:
kubectl get pods -n vault
kubectl describe pod vault-0 -n vault
kubectl logs vault-0 -n vault
Common Causes:
- Vault is sealed: Unseal the pods (see unsealing procedure)
- Not initialized: Initialize Vault (see initialization procedure)
- Raft not joined: Join pods to Raft cluster
- Storage issues: Check PVC status
Resolution:
# Check if sealed
kubectl exec -it vault-0 -n vault -- vault status
# Unseal if needed
kubectl exec -it vault-0 -n vault -- vault operator unseal [KEY]
Issue 2: Cannot Access Vault UI
Symptoms: Browser shows connection error or 502
Diagnosis:
# Check ingress
kubectl get ingress -n vault
kubectl describe ingress vault -n vault
# Check certificate
kubectl get certificate -n vault
kubectl describe certificate vault-tls -n vault
# Check service
kubectl get svc -n vault
Common Causes:
- Certificate not ready: Wait for cert-manager to issue certificate
- DNS not configured: Ensure DNS points to ingress
- Ingress misconfigured: Check ingress annotations
Issue 3: ExternalSecrets Not Syncing
Symptoms: Kubernetes secrets not created or not updated
Diagnosis:
# Check ExternalSecret status
kubectl get externalsecret -A
kubectl describe externalsecret [name] -n [namespace]
# Check ClusterSecretStore
kubectl get clustersecretstore
kubectl describe clustersecretstore vault-backend
# Check operator logs
kubectl logs -n external-secrets -l app.kubernetes.io/name=external-secrets
Common Causes:
- Invalid token: Recreate Vault token and update secret
- Wrong secret path: Verify path exists in Vault
- Network policy blocking: Check network policies allow external-secrets namespace
Resolution:
# Test connectivity from external-secrets pod
kubectl run test -it --rm --image=busybox -n external-secrets -- \
wget -O- http://vault.vault:8200/v1/sys/health
# Recreate token
kubectl exec -it vault-0 -n vault -- sh
vault login [ROOT-TOKEN]
vault token create -policy=external-secrets-readonly
exit
# Update secret
kubectl delete secret vault-token -n external-secrets
kubectl create secret generic vault-token \
-n external-secrets \
--from-literal=token=hvs.NEW-TOKEN...
Issue 4: Raft Cluster Issues
Symptoms: Leader election fails, nodes out of sync
Diagnosis:
kubectl exec -it vault-0 -n vault -- sh
vault login [ROOT-TOKEN]
vault operator raft list-peers
vault operator raft configuration
exit
Common Causes:
- Network partition: Check pod-to-pod connectivity
- Node failure: Check node status
- Storage issues: Check PVC status
Resolution:
# Remove failed peer
kubectl exec -it vault-0 -n vault -- sh
vault login [ROOT-TOKEN]
vault operator raft remove-peer vault-2
exit
# Delete pod to recreate
kubectl delete pod vault-2 -n vault
# Re-join to cluster
kubectl exec -it vault-2 -n vault -- sh
vault operator raft join http://vault-0.vault-internal:8200
vault operator unseal [KEY-1]
vault operator unseal [KEY-2]
vault operator unseal [KEY-3]
exit
Issue 5: High Memory Usage
Symptoms: Vault pods using excessive memory, OOM kills
Diagnosis:
# Check resource usage
kubectl top pods -n vault
# Check memory limits
kubectl describe pod vault-0 -n vault | grep -A 10 "Limits:"
# View metrics (if Prometheus enabled)
# Query: rate(vault_runtime_alloc_bytes[5m])
Resolution:
# Increase memory limits in values.yml:
resources:
  limits:
    memory: 4Gi  # Increased from 2Gi

# Apply changes
helmfile sync

# Or restart pods to clear memory
kubectl delete pod vault-0 -n vault
# Wait and unseal
Advanced Topics
Kubernetes Authentication
Enable Kubernetes auth for pods to authenticate directly to Vault:
kubectl exec -it vault-0 -n vault -- sh
vault login [ROOT-TOKEN]
# Enable Kubernetes auth
vault auth enable kubernetes
# Configure Kubernetes auth
vault write auth/kubernetes/config \
kubernetes_host="https://$KUBERNETES_PORT_443_TCP_ADDR:443"
# Create role for application
vault write auth/kubernetes/role/myapp \
bound_service_account_names=myapp \
bound_service_account_namespaces=default \
policies=myapp-policy \
ttl=24h
exit
Application code example (Go):
import (
	"os"

	"github.com/hashicorp/vault/api"
)

// Read the pod's service account JWT
jwt, _ := os.ReadFile("/var/run/secrets/kubernetes.io/serviceaccount/token")

// Authenticate to Vault via the Kubernetes auth method
client, _ := api.NewClient(api.DefaultConfig())
auth, _ := client.Logical().Write("auth/kubernetes/login", map[string]interface{}{
	"jwt":  string(jwt),
	"role": "myapp",
})

// Use the returned client token for subsequent requests
client.SetToken(auth.Auth.ClientToken)

// Read a secret from the KV v2 engine
// (error handling elided for brevity; check every error in real code)
secret, _ := client.Logical().Read("kv/data/myapp/config")
Dynamic Database Credentials
Generate short-lived database credentials dynamically:
kubectl exec -it vault-0 -n vault -- sh
vault login [ROOT-TOKEN]
# Enable database secrets engine
vault secrets enable database
# Configure PostgreSQL connection
vault write database/config/myapp-db \
plugin_name=postgresql-database-plugin \
allowed_roles="myapp-role" \
connection_url="postgresql://{{username}}:{{password}}@postgres.database:5432/mydb" \
username="vault" \
password="vault-password"
# Create role with SQL statements
vault write database/roles/myapp-role \
db_name=myapp-db \
creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}'; \
GRANT SELECT ON ALL TABLES IN SCHEMA public TO \"{{name}}\";" \
default_ttl="1h" \
max_ttl="24h"
# Generate credentials
vault read database/creds/myapp-role
# Output:
# Key Value
# --- -----
# lease_id database/creds/myapp-role/xxxxx
# lease_duration 1h
# username v-root-myapp-ro-xxxxx
# password A1a-xxxxxxxxxxxxxxxx
exit
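Each generated credential is tied to a lease, so its lifetime can be managed explicitly; the lease_id comes from the read above:

# Extend the lease before default_ttl expires (up to max_ttl)
vault lease renew database/creds/myapp-role/xxxxx

# Revoke immediately; Vault deletes the temporary database role
vault lease revoke database/creds/myapp-role/xxxxx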
PKI and Certificate Management
Use Vault as a private CA for mTLS:
kubectl exec -it vault-0 -n vault -- sh
vault login [ROOT-TOKEN]
# Enable PKI secrets engine
vault secrets enable pki
# Set max lease TTL
vault secrets tune -max-lease-ttl=87600h pki
# Generate root CA
vault write -field=certificate pki/root/generate/internal \
common_name="My Internal CA" \
ttl=87600h > CA_cert.crt
# Configure CA and CRL URLs
vault write pki/config/urls \
issuing_certificates="https://vault.yourdomain.com/v1/pki/ca" \
crl_distribution_points="https://vault.yourdomain.com/v1/pki/crl"
# Create role
vault write pki/roles/myapp-role \
allowed_domains="myapp.svc.cluster.local" \
allow_subdomains=true \
max_ttl="720h"
# Issue certificate
vault write pki/issue/myapp-role \
common_name="myapp.myapp.svc.cluster.local" \
ttl="24h"
exit
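Because cert-manager is already running in this cluster, it can request certificates from this PKI directly through a Vault issuer. A sketch, in which the vault-issuer-token secret (holding a token permitted to use pki/sign/myapp-role) is an assumption:

apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: vault-issuer
  namespace: myapp
spec:
  vault:
    server: http://vault.vault:8200
    path: pki/sign/myapp-role
    auth:
      tokenSecretRef:
        name: vault-issuer-token   # assumed: secret holding a Vault token
        key: token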
Migration Guide
Migrating from OpenBao/Other Solutions
If you’re migrating from another Vault-compatible solution:
Step 1: Export secrets from old system
# Example: Export from OpenBao
kubectl exec -it openbao-0 -n openbao -- sh
bao login [TOKEN]
# List all paths
bao kv list -format=json kv/ > paths.json
# Export each secret (script this for multiple secrets)
bao kv get -format=json kv/infra/secret1 > secret1.json
exit
Step 2: Import to Vault
kubectl exec -it vault-0 -n vault -- sh
vault login [ROOT-TOKEN]
# Import secrets
# (Process JSON files and import)
vault kv put kv/infra/secret1 \
key1=value1 \
key2=value2
exit
Step 3: Update ExternalSecrets
# Update all ExternalSecret resources to use new ClusterSecretStore
kubectl get externalsecret --all-namespaces -o yaml | \
sed 's/old-backend/vault-backend/g' | \
kubectl apply -f -
Step 4: Verify and cleanup
# Verify all ExternalSecrets are syncing
kubectl get externalsecret --all-namespaces
# Verify applications are working
kubectl get pods --all-namespaces
# Remove old system after verification period
kubectl delete clustersecretstore old-backend
helmfile destroy -f old-system/helmfile.yaml
Conclusion
You now have a production-ready HashiCorp Vault deployment on Kubernetes with:
✅ High Availability: 3-node Raft cluster with automatic failover
✅ Security Hardening: Network policies, audit logging, RBAC, and encryption
✅ Automated Backups: Daily snapshots with disaster recovery procedures
✅ External Secrets Integration: Seamless Kubernetes secrets synchronization
✅ Monitoring: Health checks, metrics, and alerting
✅ Production Operations: Troubleshooting guides and runbooks
Key Takeaways
- Never use root token in production: Create policies and service tokens
- Store unseal keys securely: Use password managers, separate storage
- Test backups regularly: Verify restore procedures work
- Monitor seal status: Alert immediately if Vault becomes sealed
- Rotate credentials: Implement regular rotation for all tokens
- Network security matters: Use network policies to restrict access
- Audit everything: Enable and monitor audit logs
Next Steps
- Set up monitoring dashboards: Create Grafana dashboards for Vault metrics
- Implement secret rotation: Automate credential rotation workflows
- Document runbooks: Create operational playbooks for your team
- Train your team: Ensure everyone understands Vault operations
- Regular security audits: Review policies and access patterns monthly
Support
For issues or questions:
- Review the troubleshooting section above
- Check Vault logs: kubectl logs -n vault vault-0
- Consult HashiCorp community forums
- Open issues on the vault-helm GitHub repository
About the Author: This guide is based on real-world production infrastructure running at scale. The configurations and practices described have been battle-tested in production environments handling millions of requests daily.
Last Updated: January 31, 2026
Version: 1.0
Vault Chart Version: 0.31.0
Vault Version: 1.15.2+