HashiCorp Vault on Kubernetes: A Comprehensive Production Guide

Complete step-by-step guide to deploying, hardening, and operating HashiCorp Vault on Kubernetes with high availability, Raft storage, automated backups, External Secrets integration, and production-ready security configurations


Introduction

In modern cloud-native architectures, managing secrets securely is one of the most critical challenges. Hardcoding credentials, storing API keys in environment variables, or committing secrets to version control are dangerous practices that lead to security breaches. This is where HashiCorp Vault comes in—a powerful, enterprise-grade secrets management solution designed for the cloud-native era.

This comprehensive guide will walk you through deploying Vault on Kubernetes from scratch, implementing production-grade hardening, and integrating it with your applications using real-world examples from a production infrastructure.

What is HashiCorp Vault?

HashiCorp Vault is a tool for securely accessing secrets. A secret is anything you want to tightly control access to, such as API keys, passwords, certificates, or encryption keys. Vault provides a unified interface to any secret while providing tight access control and recording a detailed audit log.

Core Capabilities

Secrets Management

  • Securely store and access secrets through a unified API
  • Dynamic secrets generation with automatic revocation
  • Secret versioning and rotation
  • Support for multiple secret backends (KV, databases, cloud providers)

Data Encryption

  • Encrypt data in-transit and at-rest
  • Encryption as a service for applications
  • Cryptographic operations without exposing keys

Identity-Based Access

  • Fine-grained access control with policies
  • Multiple authentication methods (tokens, Kubernetes, LDAP, etc.)
  • Secret leasing with time-to-live (TTL)

Audit and Compliance

  • Detailed audit logs of all operations
  • Policy enforcement and compliance reporting
  • Secret usage tracking

Why Do We Need Vault?

The Secret Sprawl Problem

Without a centralized secrets management solution, organizations face several challenges:

  1. Secrets Scattered Everywhere: Developers store secrets in config files, environment variables, CI/CD systems, and wikis
  2. No Rotation Strategy: Static secrets that never change become security liabilities
  3. No Audit Trail: When a breach occurs, you can’t determine what was accessed or by whom
  4. Access Control Nightmare: Managing who has access to what secrets becomes unmaintainable
  5. Compliance Issues: Regulations like GDPR, SOC 2, and PCI-DSS require strict secrets management

How Vault Solves These Problems

Centralized Secret Storage All secrets are stored in one place with encryption at rest. Vault becomes the single source of truth for all sensitive data.

Dynamic Secrets Instead of sharing long-lived credentials, Vault generates short-lived credentials on-demand. When an application needs database access, Vault creates temporary credentials that expire automatically.

Audit Everything Every operation is logged: who accessed what secret, when, and from where. This provides complete visibility for security teams.

Fine-Grained Access Control Policies define exactly what each application or user can access. An application can only read the secrets it needs, nothing more.

Automated Secret Rotation Vault can automatically rotate secrets without downtime, reducing the window of compromise.

Architecture Overview

Before diving into the setup, let’s understand the architecture we’ll be deploying:

High Availability Setup

┌─────────────────────────────────────────────────────────────┐
│                        External Access                      │
│                                                             │
│  HTTPS (TLS via Let's Encrypt)                             │
│         │                                                   │
│         ▼                                                   │
│  ┌─────────────┐                                          │
│  │   Traefik   │  (Ingress Controller)                    │
│  │   Ingress   │                                          │
│  └─────────────┘                                          │
│         │                                                   │
│         │ HTTP (internal)                                  │
│         ▼                                                   │
│  ┌─────────────────────────────────────────────────────┐  │
│  │            Vault Service (ClusterIP)                │  │
│  │                vault.vault:8200                     │  │
│  └─────────────────────────────────────────────────────┘  │
│         │                                                   │
│    ┌────┼────┐                                             │
│    │    │    │                                             │
│    ▼    ▼    ▼                                             │
│  ┌───┐ ┌───┐ ┌───┐                                        │
│  │V-0│ │V-1│ │V-2│  (3-node HA Cluster)                  │
│  └───┘ └───┘ └───┘                                        │
│    │    │    │                                             │
│    │    │    │   Raft Consensus Protocol                  │
│    │    │    │   (Leader Election)                        │
│    │    │    │                                             │
│  ┌─────────────────────────────────────────────────────┐  │
│  │          Integrated Raft Storage                    │  │
│  │  ┌─────┐    ┌─────┐    ┌─────┐                     │  │
│  │  │PVC-0│    │PVC-1│    │PVC-2│  (20Gi each)        │  │
│  │  └─────┘    └─────┘    └─────┘                     │  │
│  └─────────────────────────────────────────────────────┘  │
│                                                             │
│  ┌─────────────────────────────────────────────────────┐  │
│  │          Audit Storage                              │  │
│  │  ┌─────┐    ┌─────┐    ┌─────┐                     │  │
│  │  │Audit│    │Audit│    │Audit│  (10Gi each)        │  │
│  │  └─────┘    └─────┘    └─────┘                     │  │
│  └─────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│                     Integration Layer                       │
│                                                             │
│  ┌─────────────────────────┐                               │
│  │  External Secrets       │                               │
│  │  Operator               │  Syncs Vault secrets to       │
│  │                         │  Kubernetes Secrets           │
│  └─────────────────────────┘                               │
│             │                                               │
│             │ Reads secrets from Vault                     │
│             ▼                                               │
│  ┌─────────────────────────┐                               │
│  │  Application Pods       │  Use standard K8s secrets     │
│  │  (No direct Vault       │                               │
│  │   interaction needed)   │                               │
│  └─────────────────────────┘                               │
└─────────────────────────────────────────────────────────────┘

Key Components

  • Vault Cluster: 3 replicas for high availability using Raft integrated storage
  • Storage: Persistent volumes for data and audit logs
  • Ingress: TLS-terminated external access via Traefik
  • Network Policy: Restricts access to authorized namespaces only
  • External Secrets Operator: Syncs Vault secrets to Kubernetes secrets automatically

Prerequisites

Before beginning the Vault deployment, ensure you have the following in place:

Required Infrastructure

  1. Kubernetes Cluster: Version 1.20+ running and accessible
  2. kubectl: Configured and authenticated to your cluster
  3. Helmfile: Installed for declarative Helm chart management
  4. Cert-Manager: Deployed for automatic TLS certificate management
  5. Ingress Controller: Traefik (or similar) for external access
  6. Storage Class: Available for persistent volume claims (e.g., local-path)

Required Tools

# Install kubectl
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl

# Install helmfile
wget https://github.com/helmfile/helmfile/releases/download/v0.157.0/helmfile_0.157.0_linux_amd64.tar.gz
tar -xzf helmfile_0.157.0_linux_amd64.tar.gz
sudo mv helmfile /usr/local/bin/

# Install helm
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

# Verify installations
kubectl version --client
helmfile --version
helm version

Storage Considerations

Vault requires persistent storage for:

  • Data Storage: Raft integrated storage (20Gi per pod recommended)
  • Audit Logs: Compliance and security auditing (10Gi per pod recommended)

Ground-Up Setup on Kubernetes

Step 1: Prepare the Configuration

Create the directory structure for your Vault deployment:

mkdir -p k8s/releases/vault
cd k8s/releases/vault

Create the helmfile.yaml:

helmDefaults:
  createNamespace: true
  timeout: 600
  wait: true
  atomic: true

repositories:
  - name: hashicorp
    url: https://helm.releases.hashicorp.com

releases:
  - name: vault
    namespace: vault
    chart: hashicorp/vault
    version: "0.31.0"
    values:
      - ./values.yml
    hooks:
      - events: ["presync"]
        command: "kubectl"
        args: ["apply", "-f", "./network-policy.yml"]

Create the values.yml with production-ready configuration:

global:
  openShiftSupport: false
  tlsDisable: true # TLS termination at ingress

server:
  enabled: true

  # High Availability Configuration
  ha:
    enabled: true
    replicas: 3
    raft:
      enabled: true
      setNodeId: true
      config: |
        ui = true

        listener "tcp" {
          address = "[::]:8200"
          cluster_address = "[::]:8201"
          tls_disable = true
        }

        storage "raft" {
          path = "/vault/data"
        }

        service_registration "kubernetes" {}

        cluster_addr = "http://$(POD_IP):8201"
        api_addr = "http://$(POD_IP):8200"

  # Resource Configuration for Production
  resources:
    requests:
      memory: 512Mi
      cpu: 250m
    limits:
      memory: 2Gi
      cpu: 1000m

  # Persistent Storage Configuration
  dataStorage:
    enabled: true
    size: 20Gi
    storageClass: local-path
    accessMode: ReadWriteOnce

  # Audit Storage for Compliance
  auditStorage:
    enabled: true
    size: 10Gi
    storageClass: local-path
    accessMode: ReadWriteOnce

  # Security Configuration
  securityContext:
    runAsNonRoot: true
    runAsUser: 100
    runAsGroup: 1000
    fsGroup: 1000

  # Service Configuration
  service:
    enabled: true
    type: ClusterIP
    port: 8200
    targetPort: 8200

  # Ingress Configuration
  ingress:
    enabled: true
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt-prod-issuer
      traefik.ingress.kubernetes.io/router.entrypoints: websecure
      traefik.ingress.kubernetes.io/service.serversscheme: http
    ingressClassName: traefik
    pathType: Prefix
    hosts:
      - host: vault.yourdomain.com
        paths:
          - /
    tls:
      - secretName: vault-tls
        hosts:
          - vault.yourdomain.com

  # Pod Disruption Budget for HA
  podDisruptionBudget:
    enabled: true
    minAvailable: 2

  # Anti-affinity to spread pods across nodes
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchExpressions:
                - key: app.kubernetes.io/name
                  operator: In
                  values:
                    - vault
            topologyKey: kubernetes.io/hostname

  # Health check probes
  readinessProbe:
    enabled: true
    path: "/v1/sys/health?standbyok=true&sealedcode=204&uninitcode=204"

  livenessProbe:
    enabled: true
    path: "/v1/sys/health?standbyok=true&sealedcode=204&uninitcode=204"
    initialDelaySeconds: 60

  # Environment variables
  extraEnvironmentVars:
    VAULT_LOG_LEVEL: info
    VAULT_LOG_FORMAT: json

# UI Configuration
ui:
  enabled: true
  serviceType: ClusterIP

# Disable components not needed
injector:
  enabled: false

csi:
  enabled: false

Create the network-policy.yml for network security:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: vault-network-policy
  namespace: vault
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: vault
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Allow from vault namespace (Raft communication)
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: vault
      ports:
        - protocol: TCP
          port: 8200
        - protocol: TCP
          port: 8201
    # Allow from external-secrets namespace
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: external-secrets
      ports:
        - protocol: TCP
          port: 8200
    # Allow from application namespaces
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: default
      ports:
        - protocol: TCP
          port: 8200
    # Allow from ingress controller
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              app.kubernetes.io/name: traefik
      ports:
        - protocol: TCP
          port: 8200
  egress:
    # Allow all egress for DNS, cert-manager, etc.
    - {}

Step 2: Deploy Vault

Deploy Vault using helmfile:

# Preview what will be deployed
helmfile template

# Deploy Vault
helmfile sync

Monitor the deployment:

# Watch pods come up
kubectl get pods -n vault -w

# Expected output (pods will be 0/1 until initialized and unsealed):
# NAME      READY   STATUS    RESTARTS   AGE
# vault-0   0/1     Running   0          2m
# vault-1   0/1     Running   0          2m
# vault-2   0/1     Running   0          2m

# Check persistent volumes
kubectl get pvc -n vault

# Check ingress
kubectl get ingress -n vault

# Verify TLS certificate
kubectl get certificate -n vault

Step 3: Initialize Vault

Vault starts sealed and uninitialized. Initialization creates the encryption keys and unseal keys:

# Exec into the first pod
kubectl exec -it vault-0 -n vault -- sh

Inside the pod, initialize Vault with Shamir secret sharing:

vault operator init -key-shares=5 -key-threshold=3

This command will output:

Unseal Key 1: [REDACTED]
Unseal Key 2: [REDACTED]
Unseal Key 3: [REDACTED]
Unseal Key 4: [REDACTED]
Unseal Key 5: [REDACTED]

Initial Root Token: [REDACTED]

CRITICAL SECURITY STEP: Immediately save these keys in a secure password manager:

  • The 5 unseal keys
  • The root token
  • NEVER commit these to git or share them insecurely
  • Store them in separate secure locations
  • You need 3 of 5 keys to unseal Vault

Step 4: Unseal the Vault Cluster

Each Vault pod must be unsealed independently using 3 of the 5 unseal keys.

Unseal vault-0 (the node you just initialized):

# Still inside vault-0 pod
vault operator unseal [KEY-1]
# Output: Sealed: true, Unseal Progress: 1/3

vault operator unseal [KEY-2]
# Output: Sealed: true, Unseal Progress: 2/3

vault operator unseal [KEY-3]
# Output: Sealed: false (UNSEALED!)

exit

Unseal vault-1:

kubectl exec -it vault-1 -n vault -- sh

# First, join the Raft cluster
vault operator raft join http://vault-0.vault-internal:8200

# Then unseal
vault operator unseal [KEY-1]
vault operator unseal [KEY-2]
vault operator unseal [KEY-3]

exit

Unseal vault-2:

kubectl exec -it vault-2 -n vault -- sh

# Join Raft cluster
vault operator raft join http://vault-0.vault-internal:8200

# Unseal
vault operator unseal [KEY-1]
vault operator unseal [KEY-2]
vault operator unseal [KEY-3]

exit

Verify all pods are now ready:

kubectl get pods -n vault

# Expected output:
# NAME      READY   STATUS    RESTARTS   AGE
# vault-0   1/1     Running   0          10m
# vault-1   1/1     Running   0          10m
# vault-2   1/1     Running   0          10m

Step 5: Verify High Availability

Check the Raft cluster status:

kubectl exec -it vault-0 -n vault -- sh

# Login
vault login [ROOT-TOKEN]

# Check Raft peers
vault operator raft list-peers

# Expected output:
# Node       Address                   State       Voter
# ----       -------                   -----       -----
# vault-0    vault-0.vault-internal    leader      true
# vault-1    vault-1.vault-internal    follower    true
# vault-2    vault-2.vault-internal    follower    true

# Check cluster status
vault status

exit

Step 6: Enable Secrets Engines

Vault requires explicit enabling of secrets engines. Enable the KV (Key-Value) v2 secrets engine:

kubectl exec -it vault-0 -n vault -- sh

vault login [ROOT-TOKEN]

# Enable KV v2 secrets engine
vault secrets enable -version=2 kv

# Verify
vault secrets list

# Output should include:
# Path          Type         Description
# ----          ----         -----------
# kv/           kv           n/a
# ...

exit

Step 7: Create Access Policies

Never use the root token for application access. Create policies with minimal required permissions:

kubectl exec -it vault-0 -n vault -- sh

vault login [ROOT-TOKEN]

# Create a read-only policy for external-secrets operator
vault policy write external-secrets-readonly - <<EOF
# Allow reading secrets
path "kv/data/*" {
  capabilities = ["read", "list"]
}

# Allow listing secret paths
path "kv/metadata/*" {
  capabilities = ["list"]
}
EOF

# Verify policy
vault policy read external-secrets-readonly

exit

Step 8: Create Service Token

Create a token with the policy for the External Secrets Operator:

kubectl exec -it vault-0 -n vault -- sh

vault login [ROOT-TOKEN]

# Create token with policy
vault token create \
  -policy=external-secrets-readonly \
  -no-default-policy \
  -orphan \
  -renewable=false \
  -period=0

# Save the token value (hvs.CAESIJ...)
# Store this securely - you'll need it for External Secrets

exit

Production Hardening

Now that Vault is deployed and running, let’s implement production-grade hardening.

1. Enable Audit Logging

Audit logs are crucial for compliance and security investigations:

kubectl exec -it vault-0 -n vault -- sh

vault login [ROOT-TOKEN]

# Enable file-based audit logging
vault audit enable file file_path=/vault/audit/audit.log

# Verify audit is enabled
vault audit list

# Output:
# Path     Type    Description
# ----     ----    -----------
# file/    file    n/a

exit

Audit logs will now be written to the persistent audit volume mounted at /vault/audit/.
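Each line of that file is a JSON object describing one request or response. As a self-contained sketch of how you might summarize access patterns after pulling a log off the pod with `kubectl cp` (the sample entries below are deliberately simplified; real entries carry many more fields, and a JSON-aware tool such as jq is more robust than grep):

```shell
# Write a simplified sample of audit entries (illustrative shape only)
cat > audit-sample.log <<'EOF'
{"type":"response","request":{"path":"kv/data/infra/database-credentials"}}
{"type":"response","request":{"path":"kv/data/infra/database-credentials"}}
{"type":"response","request":{"path":"sys/health"}}
EOF

# Count accesses per path using plain POSIX tools
grep -o '"path":"[^"]*"' audit-sample.log | sort | uniq -c | sort -rn
```

On a real log the same pipeline highlights unusually hot paths, which is often the first hint of a misbehaving or compromised client.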

2. Implement Network Security

The network policy we created earlier restricts access to Vault. Let’s verify it:

# View the network policy
kubectl describe networkpolicy vault-network-policy -n vault

# Test connectivity from allowed namespace
kubectl run test -it --rm --image=busybox -n external-secrets -- wget -O- http://vault.vault:8200/v1/sys/health

# Test connectivity from disallowed namespace (should fail)
kubectl run test -it --rm --image=busybox -n some-other-ns -- wget -O- http://vault.vault:8200/v1/sys/health

3. Configure Resource Limits

Resource limits prevent resource exhaustion attacks and ensure fair resource sharing. Our configuration already includes:

resources:
  requests:
    memory: 512Mi
    cpu: 250m
  limits:
    memory: 2Gi
    cpu: 1000m

Monitor resource usage:

# View resource consumption
kubectl top pods -n vault

# View resource requests/limits
kubectl describe pods -n vault | grep -A 5 "Limits:"

4. Implement Pod Disruption Budget

The Pod Disruption Budget ensures high availability during cluster maintenance:

podDisruptionBudget:
  enabled: true
  minAvailable: 2

This guarantees at least 2 Vault pods are running during voluntary disruptions (node drains, upgrades, etc.), which also preserves the 2-of-3 quorum Raft needs to keep a leader elected.

Verify the PDB:

kubectl get pdb -n vault
kubectl describe pdb vault -n vault

5. Configure Pod Anti-Affinity

Anti-affinity spreads Vault pods across different nodes so that the loss of a single node cannot take down the cluster:

affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
              - key: app.kubernetes.io/name
                operator: In
                values:
                  - vault
          topologyKey: kubernetes.io/hostname

Verify pod distribution:

kubectl get pods -n vault -o wide

# Pods should be on different nodes

6. Secure TLS Configuration

Our setup uses TLS termination at the ingress with Let’s Encrypt certificates:

  • External traffic: HTTPS with automatic certificate renewal
  • Internal traffic: HTTP (within trusted cluster network)

Verify TLS certificate:

kubectl get certificate -n vault vault-tls -o yaml

# Check certificate is ready
kubectl describe certificate vault-tls -n vault

# Test external access
curl -I https://vault.yourdomain.com/ui/

7. Implement Backup Strategy

Automated backups are critical for disaster recovery. Create a backup script at backup-all-secrets.sh:

#!/bin/bash
set -e

BACKUP_DIR="vault-backup-$(date +%Y%m%d-%H%M%S)"
VAULT_NAMESPACE="vault"
VAULT_POD="vault-0"

echo "Creating backup: $BACKUP_DIR"
mkdir -p "$BACKUP_DIR"
cd "$BACKUP_DIR"

# 1. Backup Vault Raft snapshot
# (the in-pod CLI must be authenticated: set VAULT_TOKEN in the pod or run
#  `vault login` there first; -it is omitted so this also works from cron)
echo "Backing up Vault Raft snapshot..."
kubectl exec "$VAULT_POD" -n "$VAULT_NAMESPACE" -- \
  vault operator raft snapshot save /tmp/vault-snapshot.snap

kubectl cp "$VAULT_NAMESPACE/$VAULT_POD:/tmp/vault-snapshot.snap" \
  ./vault-raft-snapshot.snap

# 2. Backup Kubernetes secrets
echo "Backing up Kubernetes secrets..."
mkdir -p k8s-secrets

kubectl get namespaces -o jsonpath='{.items[*].metadata.name}' | \
  tr ' ' '\n' > k8s-secrets/namespaces.txt

while IFS= read -r ns; do
  echo "  Namespace: $ns"
  kubectl get secrets -n "$ns" -o yaml > "k8s-secrets/secrets-$ns.yaml"
done < k8s-secrets/namespaces.txt

# 3. Backup External Secrets configuration
echo "Backing up External Secrets configuration..."
mkdir -p external-secrets-config

kubectl get clustersecretstore -o yaml > \
  external-secrets-config/all-clustersecretstores.yaml

kubectl get externalsecret --all-namespaces -o yaml > \
  external-secrets-config/all-externalsecrets.yaml

# 4. Create archive
cd ..
echo "Creating archive..."
tar -czf "$BACKUP_DIR.tar.gz" "$BACKUP_DIR"
shasum -a 256 "$BACKUP_DIR.tar.gz" > "$BACKUP_DIR.tar.gz.sha256"

echo "Backup complete: $BACKUP_DIR.tar.gz"
echo "Checksum: $(cat $BACKUP_DIR.tar.gz.sha256)"

Make it executable and run regularly:

chmod +x backup-all-secrets.sh

# Manual backup
./backup-all-secrets.sh

# Or schedule with cron (daily at 2 AM)
0 2 * * * /path/to/backup-all-secrets.sh

Upload backups to secure off-site storage:

# Upload to S3
aws s3 cp vault-backup-*.tar.gz s3://your-backup-bucket/vault-backups/

# Or use rsync to remote server
rsync -avz vault-backup-*.tar.gz user@backup-server:/backups/vault/

8. Security Best Practices Summary

Secrets Management:

  • ✅ Store unseal keys in a secure password manager (separate locations)
  • ✅ Revoke root token after initial setup
  • ✅ Use policies for all access (never use root token for applications)
  • ✅ Enable audit logging for all operations
  • ✅ Rotate tokens regularly

Network Security:

  • ✅ Network policies restrict access to authorized namespaces only
  • ✅ TLS for external access via ingress
  • ✅ Service mesh (optional): Consider Istio/Linkerd for mTLS between pods

Infrastructure Security:

  • ✅ Run as non-root user
  • ✅ Read-only root filesystem
  • ✅ Resource limits prevent resource exhaustion
  • ✅ Pod security policies/admission controllers
  • ✅ Regular security updates

Operational Security:

  • ✅ Automated backups to off-site storage
  • ✅ Test disaster recovery procedures
  • ✅ Monitor audit logs for suspicious activity
  • ✅ Alert on Vault seal events
  • ✅ Document runbooks for incidents

Working with Vault: Real-World Examples

Example 1: Storing and Retrieving Secrets

Create a secret in Vault:

kubectl exec -it vault-0 -n vault -- sh

vault login [ROOT-TOKEN]

# Store a database credential
vault kv put kv/infra/database-credentials \
  username=dbuser \
  password=super-secret-password \
  host=postgres.database.svc.cluster.local \
  port=5432 \
  database=myapp

# List secrets
vault kv list kv/infra

# Read a secret
vault kv get kv/infra/database-credentials

# Read specific field
vault kv get -field=password kv/infra/database-credentials

exit

Example 2: Secret Versioning

Vault KV v2 engine maintains version history:

kubectl exec -it vault-0 -n vault -- sh

vault login [ROOT-TOKEN]

# Update the secret (creates version 2)
vault kv put kv/infra/database-credentials \
  username=dbuser \
  password=new-super-secret-password \
  host=postgres.database.svc.cluster.local \
  port=5432 \
  database=myapp

# Read latest version
vault kv get kv/infra/database-credentials

# Read specific version
vault kv get -version=1 kv/infra/database-credentials

# View version history
vault kv metadata get kv/infra/database-credentials

exit

Example 3: Integration with External Secrets Operator

External Secrets Operator syncs Vault secrets to Kubernetes secrets automatically. This is the recommended way to consume Vault secrets in Kubernetes.

Step 1: Install External Secrets Operator

helm repo add external-secrets https://charts.external-secrets.io

helm install external-secrets \
  external-secrets/external-secrets \
  -n external-secrets \
  --create-namespace

Step 2: Create Vault Token Secret

# Create secret with the token we generated earlier
kubectl create secret generic vault-token \
  -n external-secrets \
  --from-literal=token=hvs.CAESIJ...YOUR-TOKEN-HERE

Step 3: Create ClusterSecretStore

apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: vault-backend
spec:
  provider:
    vault:
      server: http://vault.vault:8200
      path: kv
      version: v2
      auth:
        tokenSecretRef:
          name: vault-token
          key: token
          namespace: external-secrets

Apply it:

kubectl apply -f clustersecretstore.yaml

# Verify it's ready
kubectl get clustersecretstore vault-backend

# Expected output:
# NAME            AGE   STATUS   CAPABILITIES   READY
# vault-backend   5s    Valid    ReadWrite      True

Step 4: Create ExternalSecret

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: database-credentials
  namespace: default
spec:
  refreshInterval: 5m
  secretStoreRef:
    kind: ClusterSecretStore
    name: vault-backend
  target:
    name: database-credentials
    creationPolicy: Owner
  dataFrom:
    - extract:
        key: infra/database-credentials

Apply and verify:

kubectl apply -f externalsecret.yaml

# Check ExternalSecret status
kubectl get externalsecret database-credentials

# Verify Kubernetes secret was created
kubectl get secret database-credentials

# View secret contents (base64 decoded)
kubectl get secret database-credentials -o jsonpath='{.data.username}' | base64 -d
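`dataFrom` with `extract` copies every key from the Vault secret into the Kubernetes secret. When a workload should only ever see specific fields, the `data` form with `remoteRef` narrows the sync; a sketch against the same store (the resource name is illustrative):

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: database-password-only
  namespace: default
spec:
  refreshInterval: 5m
  secretStoreRef:
    kind: ClusterSecretStore
    name: vault-backend
  target:
    name: database-password-only
  data:
    - secretKey: password          # key in the resulting Kubernetes secret
      remoteRef:
        key: infra/database-credentials
        property: password         # field within the Vault secret
```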

Step 5: Use in Application

apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  containers:
    - name: app
      image: myapp:latest
      env:
        - name: DB_USERNAME
          valueFrom:
            secretKeyRef:
              name: database-credentials
              key: username
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: database-credentials
              key: password
        - name: DB_HOST
          valueFrom:
            secretKeyRef:
              name: database-credentials
              key: host
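When the application genuinely needs every key in the secret, `envFrom` is a more compact equivalent of the per-key entries above; note that the variable names then match the secret keys exactly (username, password, host) rather than the DB_* names:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  containers:
    - name: app
      image: myapp:latest
      envFrom:
        - secretRef:
            name: database-credentials  # each key becomes an env var as-is
```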

Example 4: Organizing Secrets by Environment

Use path-based organization for different environments:

kubectl exec -it vault-0 -n vault -- sh

vault login [ROOT-TOKEN]

# Development environment secrets
vault kv put kv/dev/api-keys \
  stripe_key=sk_test_... \
  sendgrid_key=SG.test...

# Staging environment secrets
vault kv put kv/staging/api-keys \
  stripe_key=sk_test_... \
  sendgrid_key=SG.staging...

# Production environment secrets
vault kv put kv/prod/api-keys \
  stripe_key=sk_live_... \
  sendgrid_key=SG.prod...

# Create environment-specific policies
vault policy write dev-readonly - <<EOF
path "kv/data/dev/*" {
  capabilities = ["read", "list"]
}
EOF

vault policy write prod-readonly - <<EOF
path "kv/data/prod/*" {
  capabilities = ["read", "list"]
}
EOF

exit

Create environment-specific ExternalSecrets:

# Production ExternalSecret
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: api-keys
  namespace: production
spec:
  refreshInterval: 5m
  secretStoreRef:
    kind: ClusterSecretStore
    name: vault-backend
  target:
    name: api-keys
  dataFrom:
    - extract:
        key: prod/api-keys
---
# Development ExternalSecret
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: api-keys
  namespace: development
spec:
  refreshInterval: 5m
  secretStoreRef:
    kind: ClusterSecretStore
    name: vault-backend
  target:
    name: api-keys
  dataFrom:
    - extract:
        key: dev/api-keys

Example 5: Bulk Secret Migration

When migrating from another secrets management solution:

# Export secrets from old system (example: Kubernetes secrets)
kubectl get secret old-secret -n default -o json | \
  jq -r '.data | to_entries[] | "\(.key)=\(.value | @base64d)"' > secrets.env

# Import to Vault: copy the file into the pod, then exec in
kubectl cp secrets.env vault/vault-0:/tmp/secrets.env
kubectl exec -it vault-0 -n vault -- sh

vault login [ROOT-TOKEN]

# Read from the copied file and create one secret per key
while IFS='=' read -r key value; do
  vault kv put kv/migrated/$key value="$value"
done < /tmp/secrets.env

# Or import all key=value pairs into a single secret
# (note: this breaks if any value contains whitespace)
vault kv put kv/migrated/application \
  $(tr '\n' ' ' < /tmp/secrets.env)

exit
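A more robust variant avoids the env-file round trip entirely: decode the Kubernetes secret's data map to plain JSON and feed it to `vault kv put <path> -`, which reads JSON from stdin. A sketch, with `old-secret` and the target path as illustrative names:

```shell
# Sketch: decode a Kubernetes secret's data map (base64-encoded values)
# into the plain JSON object that `vault kv put <path> -` accepts on stdin.
decode_secret_data() {
  jq '.data | map_values(@base64d)'
}

# Usage (assumes kubectl access and a valid Vault token in the pod session):
#   kubectl get secret old-secret -n default -o json | decode_secret_data \
#     | kubectl exec -i vault-0 -n vault -- vault kv put kv/migrated/old-secret -
```

This preserves values containing spaces or `=` characters, which the shell-loop approach can mangle.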

Disaster Recovery and Backup Procedures

Creating Backups

Automated Daily Backup (Recommended):

The backup script we created earlier should run daily via cron:

# Add to crontab
crontab -e

# Add this line (daily at 2 AM)
0 2 * * * /path/to/k8s/releases/vault/backup-all-secrets.sh >> /var/log/vault-backup.log 2>&1
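Daily archives accumulate quickly, so pair the cron job with a retention policy. A minimal sketch, assuming a local backup directory and a 30-day window (both are assumptions to adjust):

```shell
# Sketch: delete local backup archives older than 30 days. BACKUP_DIR is
# wherever backup-all-secrets.sh writes its vault-backup-*.tar.gz files
# (the default below is illustrative).
BACKUP_DIR="${BACKUP_DIR:-/var/backups/vault}"
if [ -d "$BACKUP_DIR" ]; then
  find "$BACKUP_DIR" -name 'vault-backup-*.tar.gz' -mtime +30 -delete
fi
```

Run it from the same crontab entry, or as a separate line after the backup completes.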

Manual Backup:

cd k8s/releases/vault
./backup-all-secrets.sh

# Upload to S3
aws s3 cp vault-backup-$(date +%Y%m%d)*.tar.gz \
  s3://your-backup-bucket/vault-backups/

What Gets Backed Up:

  1. Vault Raft snapshot (complete Vault data)
  2. All Kubernetes secrets (all namespaces)
  3. ExternalSecret definitions
  4. ClusterSecretStore configuration
  5. Vault deployment configuration
  6. Current system state documentation

Restoring from Backup

Restore Vault Data from Raft Snapshot:

# Extract backup
tar -xzf vault-backup-YYYYMMDD-HHMMSS.tar.gz
cd vault-backup-YYYYMMDD-HHMMSS

# Copy snapshot to Vault pod
kubectl cp vault-raft-snapshot.snap vault/vault-0:/tmp/vault-restore.snap

# Restore the snapshot
kubectl exec -it vault-0 -n vault -- sh

vault login [ROOT-TOKEN]

# Force restore (overwrites current data)
vault operator raft snapshot restore -force /tmp/vault-restore.snap

exit

# Restart all Vault pods
kubectl delete pod vault-0 vault-1 vault-2 -n vault

# Wait for pods to restart
kubectl get pods -n vault -w

# Unseal all pods (see unsealing procedure above)

Restore Kubernetes Secrets:

# Restore all secrets from backup
cd vault-backup-YYYYMMDD-HHMMSS

for file in k8s-secrets/secrets-*.yaml; do
  echo "Restoring $file..."
  kubectl apply -f "$file"
done

# Or restore specific namespace
kubectl apply -f k8s-secrets/secrets-vault.yaml

Restore External Secrets Configuration:

# Restore ClusterSecretStore
kubectl apply -f external-secrets-config/all-clustersecretstores.yaml

# Restore all ExternalSecrets
kubectl apply -f external-secrets-config/all-externalsecrets.yaml

Testing Disaster Recovery

Regularly test your backup and restore procedures in a non-production environment:

# 1. Create a test namespace
kubectl create namespace vault-dr-test

# 2. Deploy a test Vault instance
# (use same helmfile with different namespace)

# 3. Restore from production backup
# (follow restore procedures above)

# 4. Verify data integrity
kubectl exec -it vault-0 -n vault-dr-test -- sh
vault login [ROOT-TOKEN]
vault kv list kv/
vault kv get kv/infra/database-credentials
exit

# 5. Clean up test environment
kubectl delete namespace vault-dr-test

Monitoring and Observability

Health Checks

Vault provides health endpoints for monitoring:

# Check Vault status
curl https://vault.yourdomain.com/v1/sys/health

# Response codes:
# 200: Initialized, unsealed, active
# 429: Unsealed, standby
# 472: Disaster recovery mode
# 473: Performance standby
# 501: Not initialized
# 503: Sealed
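The status codes above can be wrapped in a small helper for external probe scripts. A sketch; the URL in the usage comment is your Vault ingress:

```shell
# Sketch: map the /v1/sys/health HTTP status code to a readable state.
vault_health_state() {
  case "$1" in
    200) echo "initialized, unsealed, active" ;;
    429) echo "unsealed, standby" ;;
    472) echo "disaster recovery mode" ;;
    473) echo "performance standby" ;;
    501) echo "not initialized" ;;
    503) echo "sealed" ;;
    *)   echo "unknown status: $1" ;;
  esac
}

# Usage:
#   code=$(curl -s -o /dev/null -w '%{http_code}' https://vault.yourdomain.com/v1/sys/health)
#   vault_health_state "$code"
```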

Metrics and Monitoring

Prometheus Integration (if Prometheus is installed):

Enable Vault telemetry in values.yml:

server:
  extraEnvironmentVars:
    VAULT_TELEMETRY_PROMETHEUS_RETENTION_TIME: "30s"

serverTelemetry:
  serviceMonitor:
    enabled: true
    selectors:
      release: prometheus

Key metrics to monitor:

  • vault_core_unsealed: Vault seal status (should always be 1)
  • vault_core_leader: Leader election status
  • vault_runtime_alloc_bytes: Memory usage
  • vault_raft_peers: Number of Raft peers (should be 3)
  • vault_token_count: Number of active tokens
  • vault_audit_log_request: Audit log write latency

Alerting Rules

Example Prometheus alert rules:

groups:
  - name: vault
    interval: 30s
    rules:
      - alert: VaultSealed
        expr: vault_core_unsealed == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Vault is sealed"
          description: "Vault instance {{ $labels.instance }} is sealed"

      - alert: VaultDown
        expr: up{job="vault"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Vault is down"
          description: "Vault instance {{ $labels.instance }} is down"

      - alert: VaultNoLeader
        expr: max(vault_core_leader) == 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Vault has no leader"
          description: "Vault cluster has no elected leader"

Log Management

View Vault logs:

# View logs from all Vault pods
kubectl logs -n vault -l app.kubernetes.io/name=vault --tail=100 -f

# View logs from specific pod
kubectl logs -n vault vault-0 -f

# View audit logs
kubectl exec -it vault-0 -n vault -- tail -f /vault/audit/audit.log

Ship logs to centralized logging (ELK, Loki, etc.):

# Example: Fluent Bit DaemonSet for log shipping
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: logging
data:
  fluent-bit.conf: |
    [INPUT]
        Name              tail
        Path              /var/log/containers/vault-*_vault_*.log
        Parser            docker
        Tag               vault.*
        Refresh_Interval  5

    [OUTPUT]
        Name              es
        Match             vault.*
        Host              elasticsearch
        Port              9200
        Index             vault-logs

Troubleshooting Common Issues

Issue 1: Vault Pods Not Ready

Symptoms: Pods show 0/1 READY

Diagnosis:

kubectl get pods -n vault
kubectl describe pod vault-0 -n vault
kubectl logs vault-0 -n vault

Common Causes:

  1. Vault is sealed: Unseal the pods (see unsealing procedure)
  2. Not initialized: Initialize Vault (see initialization procedure)
  3. Raft not joined: Join pods to Raft cluster
  4. Storage issues: Check PVC status

Resolution:

# Check if sealed
kubectl exec -it vault-0 -n vault -- vault status

# Unseal if needed
kubectl exec -it vault-0 -n vault -- vault operator unseal [KEY]

Issue 2: Cannot Access Vault UI

Symptoms: Browser shows connection error or 502

Diagnosis:

# Check ingress
kubectl get ingress -n vault
kubectl describe ingress vault -n vault

# Check certificate
kubectl get certificate -n vault
kubectl describe certificate vault-tls -n vault

# Check service
kubectl get svc -n vault

Common Causes:

  1. Certificate not ready: Wait for cert-manager to issue certificate
  2. DNS not configured: Ensure DNS points to ingress
  3. Ingress misconfigured: Check ingress annotations

Issue 3: ExternalSecrets Not Syncing

Symptoms: Kubernetes secrets not created or not updated

Diagnosis:

# Check ExternalSecret status
kubectl get externalsecret -A
kubectl describe externalsecret [name] -n [namespace]

# Check ClusterSecretStore
kubectl get clustersecretstore
kubectl describe clustersecretstore vault-backend

# Check operator logs
kubectl logs -n external-secrets -l app.kubernetes.io/name=external-secrets

Common Causes:

  1. Invalid token: Recreate Vault token and update secret
  2. Wrong secret path: Verify path exists in Vault
  3. Network policy blocking: Check network policies allow external-secrets namespace

Resolution:

# Test connectivity from external-secrets pod
kubectl run test -it --rm --image=busybox -n external-secrets -- \
  wget -O- http://vault.vault:8200/v1/sys/health

# Recreate token
kubectl exec -it vault-0 -n vault -- sh
vault login [ROOT-TOKEN]
vault token create -policy=external-secrets-readonly
exit

# Update secret
kubectl delete secret vault-token -n external-secrets
kubectl create secret generic vault-token \
  -n external-secrets \
  --from-literal=token=hvs.NEW-TOKEN...

Issue 4: Raft Cluster Issues

Symptoms: Leader election fails, nodes out of sync

Diagnosis:

kubectl exec -it vault-0 -n vault -- sh
vault login [ROOT-TOKEN]
vault operator raft list-peers
vault operator raft configuration
exit

Common Causes:

  1. Network partition: Check pod-to-pod connectivity
  2. Node failure: Check node status
  3. Storage issues: Check PVC status

Resolution:

# Remove failed peer
kubectl exec -it vault-0 -n vault -- sh
vault login [ROOT-TOKEN]
vault operator raft remove-peer vault-2
exit

# Delete pod to recreate
kubectl delete pod vault-2 -n vault

# Re-join to cluster
kubectl exec -it vault-2 -n vault -- sh
vault operator raft join http://vault-0.vault-internal:8200
vault operator unseal [KEY-1]
vault operator unseal [KEY-2]
vault operator unseal [KEY-3]
exit

Issue 5: High Memory Usage

Symptoms: Vault pods using excessive memory, OOM kills

Diagnosis:

# Check resource usage
kubectl top pods -n vault

# Check memory limits
kubectl describe pod vault-0 -n vault | grep -A 10 "Limits:"

# View metrics (if Prometheus enabled)
# Query: vault_runtime_alloc_bytes  (a gauge; graph it directly, not rate())

Resolution:

# Increase memory limits in values.yml
resources:
  limits:
    memory: 4Gi  # Increase from 2Gi

# Apply changes
helmfile sync

# Or restart pods to clear memory
kubectl delete pod vault-0 -n vault
# Wait and unseal

Advanced Topics

Kubernetes Authentication

Enable Kubernetes auth for pods to authenticate directly to Vault:

kubectl exec -it vault-0 -n vault -- sh

vault login [ROOT-TOKEN]

# Enable Kubernetes auth
vault auth enable kubernetes

# Configure Kubernetes auth
vault write auth/kubernetes/config \
  kubernetes_host="https://$KUBERNETES_PORT_443_TCP_ADDR:443"

# Create role for application
vault write auth/kubernetes/role/myapp \
  bound_service_account_names=myapp \
  bound_service_account_namespaces=default \
  policies=myapp-policy \
  ttl=24h

exit
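Before wiring this into application code, you can verify the role works over the HTTP API from any pod running the bound service account. A sketch; the role name and in-cluster Vault address follow the configuration above:

```shell
# Sketch: build the JSON login payload for the Kubernetes auth method from
# the mounted service account token.
k8s_login_payload() {
  jq -n --arg jwt "$1" --arg role "$2" '{jwt: $jwt, role: $role}'
}

# Usage (from inside a pod bound to the role):
#   JWT=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
#   k8s_login_payload "$JWT" myapp \
#     | curl -s --data @- http://vault.vault:8200/v1/auth/kubernetes/login \
#     | jq -r '.auth.client_token'
```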

Application code example (Go):

import (
    "os"

    "github.com/hashicorp/vault/api"
)

// Read the pod's service account JWT (os.ReadFile replaces the
// deprecated ioutil.ReadFile)
jwt, err := os.ReadFile("/var/run/secrets/kubernetes.io/serviceaccount/token")
if err != nil {
    return err
}

// Authenticate to Vault via the Kubernetes auth method
client, err := api.NewClient(api.DefaultConfig())
if err != nil {
    return err
}
secret, err := client.Logical().Write("auth/kubernetes/login", map[string]interface{}{
    "jwt":  string(jwt),
    "role": "myapp",
})
if err != nil {
    return err
}

// Use the returned client token for subsequent requests
client.SetToken(secret.Auth.ClientToken)

// Read a secret (note the data/ segment in KV v2 paths)
secret, err = client.Logical().Read("kv/data/myapp/config")

Dynamic Database Credentials

Generate short-lived database credentials dynamically:

kubectl exec -it vault-0 -n vault -- sh

vault login [ROOT-TOKEN]

# Enable database secrets engine
vault secrets enable database

# Configure PostgreSQL connection
vault write database/config/myapp-db \
  plugin_name=postgresql-database-plugin \
  allowed_roles="myapp-role" \
  connection_url="postgresql://{{username}}:{{password}}@postgres.database:5432/mydb" \
  username="vault" \
  password="vault-password"

# Create role with SQL statements
vault write database/roles/myapp-role \
  db_name=myapp-db \
  creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}'; \
    GRANT SELECT ON ALL TABLES IN SCHEMA public TO \"{{name}}\";" \
  default_ttl="1h" \
  max_ttl="24h"

# Generate credentials
vault read database/creds/myapp-role

# Output:
# Key                Value
# ---                -----
# lease_id           database/creds/myapp-role/xxxxx
# lease_duration     1h
# username           v-root-myapp-ro-xxxxx
# password           A1a-xxxxxxxxxxxxxxxx

exit
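Scripts and applications usually consume this output as JSON rather than the table shown above. A sketch for extracting the fields (field names follow the response format above):

```shell
# Sketch: pull the username, password, and lease ID out of a JSON
# credentials response from the database secrets engine.
parse_db_creds() {
  jq -r '"\(.data.username):\(.data.password) lease=\(.lease_id)"'
}

# Usage (inside the Vault pod, after vault login):
#   vault read -format=json database/creds/myapp-role | parse_db_creds
```

The lease ID is what you pass to `vault lease renew` or `vault lease revoke` to extend or cut short the credentials' lifetime.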

PKI and Certificate Management

Use Vault as a private CA for mTLS:

kubectl exec -it vault-0 -n vault -- sh

vault login [ROOT-TOKEN]

# Enable PKI secrets engine
vault secrets enable pki

# Set max lease TTL
vault secrets tune -max-lease-ttl=87600h pki

# Generate root CA
vault write -field=certificate pki/root/generate/internal \
  common_name="My Internal CA" \
  ttl=87600h > CA_cert.crt

# Configure CA and CRL URLs
vault write pki/config/urls \
  issuing_certificates="https://vault.yourdomain.com/v1/pki/ca" \
  crl_distribution_points="https://vault.yourdomain.com/v1/pki/crl"

# Create role
vault write pki/roles/myapp-role \
  allowed_domains="myapp.svc.cluster.local" \
  allow_subdomains=true \
  max_ttl="720h"

# Issue certificate
vault write pki/issue/myapp-role \
  common_name="myapp.myapp.svc.cluster.local" \
  ttl="24h"

exit
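To actually mount the issued certificate into a pod, split the JSON response into the files a TLS secret expects. A sketch; the field names follow the `pki/issue` response format:

```shell
# Sketch: split a JSON pki/issue response into tls.crt and tls.key.
split_issued_cert() {
  # $1: path to the saved JSON response
  jq -r '.data.certificate' "$1" > tls.crt
  jq -r '.data.private_key' "$1" > tls.key
}

# Usage (inside the Vault pod, after vault login):
#   vault write -format=json pki/issue/myapp-role \
#     common_name="myapp.myapp.svc.cluster.local" ttl="24h" > issued.json
#   split_issued_cert issued.json
#   kubectl create secret tls myapp-tls --cert=tls.crt --key=tls.key
```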

Migration Guide

Migrating from OpenBao/Other Solutions

If you’re migrating from another Vault-compatible solution:

Step 1: Export secrets from old system

# Example: Export from OpenBao
kubectl exec -it openbao-0 -n openbao -- sh

bao login [TOKEN]

# List all paths
bao kv list -format=json kv/ > paths.json

# Export each secret (script this for multiple secrets)
bao kv get -format=json kv/infra/secret1 > secret1.json

exit

Step 2: Import to Vault

kubectl exec -it vault-0 -n vault -- sh

vault login [ROOT-TOKEN]

# Import secrets
# (Process JSON files and import)
vault kv put kv/infra/secret1 \
  key1=value1 \
  key2=value2

exit
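Rather than retyping each key by hand, the exported JSON files can be fed straight back in: `vault kv put <path> -` accepts a JSON object on stdin, and the exported `.data.data` nesting is the standard KV v2 response layout. A sketch, assuming a valid token in the pod session:

```shell
# Sketch: reduce a `bao kv get -format=json` export to the plain JSON map
# that `vault kv put <path> -` accepts on stdin (KV v2 nests the secret
# under .data.data).
export_to_kv_payload() {
  jq '.data.data'
}

# Usage:
#   export_to_kv_payload < secret1.json \
#     | kubectl exec -i vault-0 -n vault -- vault kv put kv/infra/secret1 -
```

Loop this over every exported file to migrate all secrets in one pass.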

Step 3: Update ExternalSecrets

# Update all ExternalSecret resources to use new ClusterSecretStore
kubectl get externalsecret --all-namespaces -o yaml | \
  sed 's/old-backend/vault-backend/g' | \
  kubectl apply -f -

Step 4: Verify and cleanup

# Verify all ExternalSecrets are syncing
kubectl get externalsecret --all-namespaces

# Verify applications are working
kubectl get pods --all-namespaces

# Remove old system after verification period
kubectl delete clustersecretstore old-backend
helmfile destroy -f old-system/helmfile.yaml

Conclusion

You now have a production-ready HashiCorp Vault deployment on Kubernetes with:

  • ✅ High Availability: 3-node Raft cluster with automatic failover
  • ✅ Security Hardening: Network policies, audit logging, RBAC, and encryption
  • ✅ Automated Backups: Daily snapshots with disaster recovery procedures
  • ✅ External Secrets Integration: Seamless Kubernetes secrets synchronization
  • ✅ Monitoring: Health checks, metrics, and alerting
  • ✅ Production Operations: Troubleshooting guides and runbooks

Key Takeaways

  1. Never use root token in production: Create policies and service tokens
  2. Store unseal keys securely: Use password managers, separate storage
  3. Test backups regularly: Verify restore procedures work
  4. Monitor seal status: Alert immediately if Vault becomes sealed
  5. Rotate credentials: Implement regular rotation for all tokens
  6. Network security matters: Use network policies to restrict access
  7. Audit everything: Enable and monitor audit logs

Next Steps

  • Set up monitoring dashboards: Create Grafana dashboards for Vault metrics
  • Implement secret rotation: Automate credential rotation workflows
  • Document runbooks: Create operational playbooks for your team
  • Train your team: Ensure everyone understands Vault operations
  • Regular security audits: Review policies and access patterns monthly

Support

For issues or questions:

  • Review the troubleshooting section above
  • Check Vault logs: kubectl logs -n vault vault-0
  • Consult HashiCorp community forums
  • Open issues on the vault-helm GitHub repository

About the Author: This guide is based on real-world production infrastructure running at scale. The configurations and practices described have been battle-tested in production environments handling millions of requests daily.

Last Updated: January 31, 2026
Version: 1.0
Vault Chart Version: 0.31.0
Vault Version: 1.15.2+
