Kubernetes monitoring on Amazon EKS gets complex when you run multiple clusters or need fast root-cause analysis. Robusta is an AI-powered Kubernetes observability platform that enriches alerts with logs, events, and metrics—and can integrate with Prometheus, Alertmanager, and Slack. This guide walks you through deploying Robusta on EKS with Helm, IRSA (IAM Roles for Service Accounts), and External Secrets, so you get production-ready monitoring without hardcoded credentials.
In this guide you’ll learn:
- What Robusta is and how it fits into EKS monitoring
- How to create IAM roles for Robusta with Terraform (IRSA)
- How to configure Slack and the Robusta UI, and store secrets in AWS Secrets Manager
- How to deploy Robusta with Helm/Helmfile and verify it across clusters
- Troubleshooting, security, and cost tips for running Robusta on EKS
What is Robusta?
Robusta is a Kubernetes troubleshooting and monitoring platform that combines alert enrichment, automated diagnostics, and AI-powered root cause analysis. Unlike traditional monitoring tools, Robusta:
- Enriches alerts automatically with relevant logs, metrics, and Kubernetes events
- Uses AI (Holmes GPT) to analyze issues and suggest fixes
- Integrates with existing tools like Slack, Prometheus, and Alertmanager
- Provides a centralized UI for historical analysis across all clusters
- Automates common troubleshooting tasks through customizable playbooks
Architecture Overview
Before diving into the setup, let’s understand how Robusta integrates with EKS.
Key Components:
- Runner: Executes playbooks, enriches alerts, collects cluster data
- Forwarder: Watches Kubernetes API and relays events to the runner
- IRSA (IAM Roles for Service Accounts): Provides AWS permissions without credentials
- External Secrets: Injects credentials from AWS Secrets Manager
- Robusta UI Sink: Sends data to the centralized platform
Prerequisites
Before starting, ensure you have:
Tools Installed
- kubectl - Kubernetes CLI
- helm - Kubernetes package manager
- helmfile - Declarative Helm deployment tool
- aws-cli - AWS command line interface
- terraform - Infrastructure as code tool
- robusta-cli - Robusta command line tool
Existing Infrastructure
- Amazon EKS clusters (one or more)—see how to create and manage an EKS cluster with eksctl if you need to create one
- External Secrets Operator deployed (we use it to sync secrets from AWS; see how to use External Secrets with AWS Secrets Manager for setup)
- AWS Secrets Manager access
- Slack workspace (for notifications)
Required Permissions
- EKS cluster admin access
- AWS IAM permissions to create roles and policies
- AWS Secrets Manager write access
- Slack admin access to create apps
Step 1: Create IAM Roles with Terraform
Robusta needs read-only access to AWS services for enriching alerts with CloudWatch logs, EC2 instance information, and EKS metadata. We’ll use IRSA (IAM Roles for Service Accounts) for secure, credential-less access.
Directory Structure
First, create this directory structure for your Terraform configuration:
1.1 Set Up Terraform Configuration
First, let’s create the Terraform provider configuration:
File: terraform/robusta-iam/providers.tf
1.2 Create IAM Policy
Create the IAM policy that defines what permissions Robusta will have:
File: terraform/robusta-iam/policy.json
Why these permissions?
- CloudWatch Logs: Robusta retrieves pod logs when alerts fire
- EKS: Gets cluster and node group information for context
- EC2: Describes instances to correlate pods with underlying infrastructure
- CloudWatch Metrics: Enriches alerts with CPU, memory, and other metrics
File: terraform/robusta-iam/variables.tf
File: terraform/robusta-iam/main.tf
1.3 Create IRSA Role
Now create the IAM role that the Robusta service account will assume:
File: terraform/robusta-iam/irsa.tf
File: terraform/robusta-iam/outputs.tf
Key Points:
- OIDC Provider: Links the IAM role to your EKS cluster’s identity provider
- Service Account: Only the robusta:robusta-runner-service-account can assume this role
- Policy Attachment: Attaches the policy we created in the previous step
1.4 Deploy the IAM Resources
Create a variables file for your environment:
File: terraform/robusta-iam/prod-us-west-2.tfvars
Finding your OIDC Provider ID:
Now deploy the Terraform configuration:
After applying, Terraform will output the IAM role ARN:
Copy the role_arn output - you’ll need it for the Helm values files in Step 5.
Repeat these steps for all your environments and regions:
Step 2: Configure Slack Integration
Robusta sends rich, context-aware notifications to Slack. Let’s set up the Slack app.
2.1 Create Slack App
- Go to Slack API Apps
- Click Create New App → From scratch
- Name it “Robusta Monitoring” and select your workspace
- Click Create App
2.2 Configure Bot Permissions
- Go to OAuth & Permissions in the left sidebar
- Scroll to Bot Token Scopes
- Add these scopes:
  - chat:write - Send messages to channels
  - chat:write.public - Post to channels without joining
  - files:write - Upload files (logs, graphs, etc.)
2.3 Install App and Get Token
- Scroll to the top and click Install to Workspace
- Review permissions and click Allow
- Copy the Bot User OAuth Token (starts with xoxb-)
- Save this token - you’ll need it shortly
2.4 Add Bot to Channel
- Create or select a Slack channel (e.g., #robusta-alerts)
- In the channel, type: /invite @Robusta Monitoring
- Or click channel name → Integrations → Add apps
Step 3: Set Up Robusta Platform Account
Robusta provides a centralized UI for viewing alerts, running investigations, and using AI analysis across all your clusters.
3.1 Install Robusta CLI
3.2 Generate UI Integration Token
When prompted:
- Email: Your work email (used for login)
- Organization: Your company name (e.g., “Acme Corp”)
The CLI will output a base64-encoded token. Copy this token - we’ll add it to AWS Secrets Manager.
Step 4: Store Credentials in AWS Secrets Manager
We’ll use AWS Secrets Manager to securely store all Robusta credentials, then inject them into Kubernetes using the External Secrets Operator. If you haven’t set up External Secrets yet, follow how to use External Secrets with AWS Secrets Manager first.
4.1 Create the Secret
Where to get these values:
- accountId: Go to platform.robusta.dev/settings
- signingKey: Go to platform.robusta.dev/settings/api-keys and generate a new key
- robustaUiToken: From Step 3.2 above
- slackChannel: Your Slack channel name (include the #)
- slackApiKey: From Step 2.3 above
4.2 Verify Secret
Step 5: Create Helm Release Configuration
Now we’ll create the declarative Helm configuration for deploying Robusta across multiple clusters.
5.1 Directory Structure
Create this structure in your infrastructure repository:
5.2 Create Helmfile
File: k8s/releases/robusta/helmfile.yaml
Key Features:
- Environment Variables: Uses CLUSTER_ENV and CLUSTER_REGION to select the correct values file
- Presync Hooks: Creates namespace and applies External Secret before Helm installation
- Version Pinning: Explicit version ensures reproducible deployments
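The three features above could be sketched in a helmfile roughly as follows. This is a minimal illustration, not the original file: the release name, the hook command, and the assumption that external-secret.yaml sits next to helmfile.yaml are all hypothetical, and the repository URL should be checked against the Robusta installation docs.

```yaml
# Sketch of k8s/releases/robusta/helmfile.yaml (illustrative, not the original file)
repositories:
  - name: robusta
    url: https://robusta-charts.storage.googleapis.com

releases:
  - name: robusta
    namespace: robusta
    chart: robusta/robusta
    version: 0.32.0   # pinned so every cluster installs the same chart
    values:
      # CLUSTER_ENV=prod and CLUSTER_REGION=us-west-2 select values-prod-us-west-2.yml;
      # requiredEnv makes helmfile fail fast if either variable is unset
      - values-{{ requiredEnv "CLUSTER_ENV" }}-{{ requiredEnv "CLUSTER_REGION" }}.yml
    hooks:
      # Runs before the chart is installed: create the namespace idempotently,
      # then apply the ExternalSecret so robusta-secrets exists for the runner
      - events: ["presync"]
        showlogs: true
        command: sh
        args:
          - -c
          - >-
            kubectl create namespace robusta --dry-run=client -o yaml | kubectl apply -f - &&
            kubectl apply -n robusta -f external-secret.yaml
```

Piping the namespace through `--dry-run=client` into `kubectl apply` keeps the hook safe to re-run on every sync.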
5.3 Create External Secret Configuration
File: k8s/releases/robusta/external-secret.yaml
How it works:
- External Secrets Operator reads from AWS Secrets Manager
- Creates a Kubernetes Secret named robusta-secrets
- Refreshes every hour to pick up any credential changes
- Robusta pods read the keys of this secret as environment variables
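A manifest matching that description might look like the sketch below. The store name `aws-secrets-manager` and the secret path `robusta/credentials` are hypothetical placeholders, and the `apiVersion` depends on the External Secrets Operator release you run (newer releases serve `external-secrets.io/v1`).

```yaml
# Sketch of k8s/releases/robusta/external-secret.yaml (names are placeholders)
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: robusta-secrets
  namespace: robusta
spec:
  refreshInterval: 1h               # re-read AWS Secrets Manager every hour
  secretStoreRef:
    name: aws-secrets-manager       # hypothetical ClusterSecretStore name
    kind: ClusterSecretStore
  target:
    name: robusta-secrets           # Kubernetes Secret created by the operator
    creationPolicy: Owner
  dataFrom:
    # Copies every JSON key of the AWS secret (accountId, signingKey,
    # robustaUiToken, slackChannel, slackApiKey) into the Kubernetes Secret
    - extract:
        key: robusta/credentials    # hypothetical secret name in AWS Secrets Manager
```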
5.4 Create Values Files
Robusta v0.32.0 uses a two-step secret injection pattern:
- Load secrets as environment variables via runner.additional_env_vars
- Reference them in configuration using {{ env.VARIABLE }} syntax
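As an illustration of the two steps, a values fragment could wire the secret keys through like this. The environment variable names and sink names are arbitrary choices for this sketch. The secret key names follow Step 4.1.

```yaml
# Step 1: expose keys of the robusta-secrets Secret as env vars on the runner
runner:
  additional_env_vars:
    - name: ACCOUNT_ID
      valueFrom:
        secretKeyRef: {name: robusta-secrets, key: accountId}
    - name: SIGNING_KEY
      valueFrom:
        secretKeyRef: {name: robusta-secrets, key: signingKey}
    - name: ROBUSTA_UI_TOKEN
      valueFrom:
        secretKeyRef: {name: robusta-secrets, key: robustaUiToken}
    - name: SLACK_CHANNEL
      valueFrom:
        secretKeyRef: {name: robusta-secrets, key: slackChannel}
    - name: SLACK_API_KEY
      valueFrom:
        secretKeyRef: {name: robusta-secrets, key: slackApiKey}

# Step 2: reference the env vars; the quotes keep YAML from parsing "{{" as a mapping
globalConfig:
  account_id: "{{ env.ACCOUNT_ID }}"
  signing_key: "{{ env.SIGNING_KEY }}"

sinksConfig:
  - robusta_sink:
      name: robusta_ui_sink
      token: "{{ env.ROBUSTA_UI_TOKEN }}"
  - slack_sink:
      name: main_slack_sink
      slack_channel: "{{ env.SLACK_CHANNEL }}"
      api_key: "{{ env.SLACK_API_KEY }}"
```

Because the values file has a plain `.yml` extension, helmfile passes the `{{ env.* }}` strings through unrendered and Robusta resolves them at runtime.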
File: k8s/releases/robusta/values-prod-us-west-2.yml
Important Configuration Points:
- Service Account Annotation: Must match the IAM role ARN from Step 1
- Cluster Name: Unique identifier for this cluster in Robusta UI
- Empty Prometheus URLs: Prevents harmless startup warnings if you don’t use Prometheus
- Two Sinks:
  - robusta_sink: Sends data to Robusta UI
  - slack_sink: Sends real-time notifications to Slack
- Custom Playbook: Example that enriches pod crash loop alerts with events
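The non-secret parts of these points might be expressed as in the following sketch. The account ID and role name in the ARN are placeholders, and the `runnerServiceAccount` key is an assumption about the chart layout that should be confirmed against the chart's own values.yaml for your version.

```yaml
# Sketch of the cluster-specific part of values-prod-us-west-2.yml
clusterName: production-cluster-us-west-2   # shown as the cluster name in the Robusta UI

runnerServiceAccount:                        # assumed chart key, verify for your chart version
  annotations:
    # Placeholder ARN: use the role_arn output from Step 1
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/robusta-prod-us-west-2

globalConfig:
  prometheus_url: ""      # left empty when Prometheus is not used
  alertmanager_url: ""

customPlaybooks:
  # Attach recent pod events whenever a pod enters a crash loop
  - triggers:
      - on_pod_crash_loop:
          restart_reason: CrashLoopBackOff
    actions:
      - pod_events_enricher: {}
```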
Create similar files for:
- values-prod-eu-west-1.yml
- values-stage-us-west-2.yml
- values-stage-eu-west-1.yml
Adjust:
- clusterName for each environment/region
- eks.amazonaws.com/role-arn to match your IAM roles
- Resource limits (staging can use less than production)
Step 6: Deploy Robusta
Now that everything is configured, let’s deploy!
6.1 Deploy to Production US West 2
What happens:
- Helmfile creates the robusta namespace
- Applies the ExternalSecret to create robusta-secrets
- Installs Robusta Helm chart with the production US West 2 values
- Runner pod starts and connects to Robusta UI
- Forwarder pod starts watching Kubernetes API
6.2 Verify Deployment
6.3 Deploy to Other Clusters
Repeat for each cluster:
Step 7: Verify in Robusta UI
7.1 Access the UI
- Go to platform.robusta.dev
- Log in with your email from Step 3
- Navigate to Clusters page
You should see all your clusters listed with:
- Name: e.g., “production-cluster-us-west-2”
- Status: Green “Connected”
- Version: “0.32.0”
- Last Seen: Recent timestamp
7.2 Explore the Timeline
Click on Timeline to see:
- Pod crashes and restarts
- Deployment updates
- Node changes
- Kubernetes events
Each event is enriched with:
- Related logs
- Resource manifests
- Recent changes
- AI analysis (if enabled)
7.3 Test Slack Integration
Create a test alert:
Clean up:
Step 8: Customize Playbooks
Playbooks are Robusta’s automation rules. They define what happens when specific events occur.
8.1 Add a Deployment Update Playbook
Edit your values file:
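One possible shape for such a playbook is sketched here. It is an assumption about what the addition could look like, based on Robusta's documented `on_deployment_update` trigger and `resource_babysitter` action, so check the parameter names against the actions reference for your version.

```yaml
customPlaybooks:
  # Append to the existing customPlaybooks list rather than replacing it
  - triggers:
      - on_deployment_update: {}
    actions:
      # Report a diff when the watched fields of a Deployment change
      - resource_babysitter:
          fields_to_monitor: ["spec.replicas"]
```

Narrowing `fields_to_monitor` keeps notifications focused on the changes you care about instead of every spec edit.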
Redeploy:
8.2 Common Enrichment Actions
Here are useful enrichment actions you can add to any playbook:
- pod_events_enricher: {} - Show recent pod events
- logs_enricher: {} - Attach pod logs
- pod_graph_enricher: {} - Add CPU/memory graphs
- prometheus_enricher: {} - Add Prometheus metrics (if configured)
- related_pods: {} - Show related pods (same deployment, etc.)
- deployment_events_enricher: {} - Show deployment events
- resource_events_enricher: {} - Show resource-level events
Full reference: Robusta Actions Documentation
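Several enrichers can be stacked under one trigger, and they run in the order listed. The fragment below is a hedged example that pairs two of the actions above with the `on_pod_oom_killed` trigger. That pairing is an illustrative choice, not something this guide prescribes.

```yaml
customPlaybooks:
  - triggers:
      - on_pod_oom_killed: {}
    actions:
      # Each action appends its own block to the same notification
      - logs_enricher: {}
      - pod_events_enricher: {}
```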
Troubleshooting
Pod CrashLoopBackOff
Check logs:
Common issues:
Action not found (e.g., “Action pod_events not found”)
- Cause: Action names changed in v0.32.0
- Fix: Use pod_events_enricher instead of pod_events
Invalid configuration format
- Cause: Using old valueFrom syntax in globalConfig
- Fix: Use {{ env.VAR }} syntax instead
Missing environment variables
- Cause: runner.additional_env_vars not configured
- Fix: Ensure all env vars are loaded from secrets
Slack Errors
Error: missing_scope - Need files:write
The Slack bot is missing required permissions.
Fix:
- Go to api.slack.com/apps → Select your app
- Go to OAuth & Permissions
- Under Bot Token Scopes, add
files:write - Click Reinstall App
- Copy the new Bot OAuth Token
- Update AWS Secrets Manager:
Error: not_in_channel
The bot hasn’t been added to the Slack channel.
Fix:
- Go to your Slack channel
- Type /invite @Robusta Monitoring (the app name chosen in Step 2.1)
- Or click channel name → Integrations → Add apps
No restart needed - works immediately.
Cluster Not Appearing in UI
1. Check if runner is connected:
Look for:
2. Verify UI token:
3. Check forwarder logs:
4. Force data sync:
IAM Permission Issues
Error: AccessDenied when calling AWS APIs
Verify IRSA configuration:
Test role assumption:
Verify IAM role exists:
Best Practices
1. Use GitOps for Configuration
Store all Robusta configuration in Git:
- Values files tracked in version control
- External Secret configurations committed
- Helm releases managed by ArgoCD or Flux
2. Separate Secrets by Environment
Create separate AWS Secrets Manager secrets for staging and production:
Update ExternalSecret per environment:
3. Use Specific IAM Permissions
Follow the principle of least privilege:
- Read-only access to AWS services
- Scoped to specific resources where possible
- Separate roles per environment
4. Monitor Resource Usage
Track Robusta’s resource consumption:
Adjust resources in values files based on actual usage.
5. Keep Robusta Updated
Check for new releases:
Update helmfile.yaml:
Test in staging first, then production.
6. Create Custom Playbooks for Your Stack
Tailor Robusta to your specific needs:
Performance Considerations
Resource Requirements
Based on cluster size:
| Cluster Size | Runner CPU | Runner Memory | Forwarder CPU | Forwarder Memory |
|---|---|---|---|---|
| Small (<50 pods) | 100m | 512Mi | 100m | 256Mi |
| Medium (50-200 pods) | 200m | 1Gi | 200m | 512Mi |
| Large (200+ pods) | 500m | 2Gi | 500m | 1Gi |
Network Considerations
Robusta generates outbound traffic to:
- Robusta UI: Event data and heartbeats
- Slack API: Notification payloads
- AWS APIs: CloudWatch Logs, EKS, EC2 queries
Ensure network policies and security groups allow these connections.
Storage Considerations
Robusta is stateless and doesn’t require persistent storage. However:
- Temporary files use emptyDir volumes
- Logs are rotated automatically
- No PVC needed
Security Considerations
1. IRSA Instead of IAM Users
Never use IAM access keys:
- ✅ Use IRSA (IAM Roles for Service Accounts)
- ❌ Don’t create IAM users with access keys
- ❌ Don’t mount AWS credentials in pods
2. Secret Rotation
Rotate credentials regularly:
3. Network Policies
Restrict Robusta’s network access:
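A minimal egress policy along these lines is sketched below. The policy name is hypothetical, and the `app: robusta-runner` label is assumed from the log commands used elsewhere in this guide. On EKS the policy is only enforced if a network policy engine is enabled (for example the VPC CNI network policy feature or Calico).

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: robusta-runner-egress     # hypothetical name
  namespace: robusta
spec:
  podSelector:
    matchLabels:
      app: robusta-runner         # assumed pod label
  policyTypes:
    - Egress                      # ingress to the runner is left unrestricted
  egress:
    # DNS lookups
    - ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    # HTTPS to the Robusta UI, Slack, AWS APIs, and the Kubernetes API server
    - ports:
        - protocol: TCP
          port: 443
```

If the runner also queries an in-cluster Prometheus, that port needs its own egress rule. Otherwise metric enrichment will silently time out.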
4. Audit Logging
Enable audit logs for Robusta actions:
Cost Optimization
1. Right-Size Resources
Start conservative, scale up based on metrics:
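As a starting point, the "Small" row of the sizing table could be written as resource requests like this. Treating the table figures as requests, and `kubewatch` as the chart key for the forwarder, are assumptions to verify against the chart's values.yaml.

```yaml
runner:
  resources:
    requests:
      cpu: 100m
      memory: 512Mi

kubewatch:            # assumed chart key for the forwarder
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
```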
2. Use Spot Instances
Robusta tolerates interruptions well:
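One way to steer the runner onto Spot capacity is a node selector plus a matching toleration, as sketched here. `eks.amazonaws.com/capacityType` is the label EKS managed node groups set, while the `spot` taint key is a hypothetical example of a taint you may have applied yourself. The `runner.nodeSelector` and `runner.tolerations` keys are assumed to be supported by your chart version.

```yaml
runner:
  nodeSelector:
    eks.amazonaws.com/capacityType: SPOT   # label set by EKS managed node groups
  tolerations:
    - key: spot                            # hypothetical taint on your Spot nodes
      operator: Equal
      value: "true"
      effect: NoSchedule
```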
3. Optimize External Secrets Refresh
Reduce API calls to AWS Secrets Manager:
Credentials don’t change frequently, so daily refresh is sufficient.
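In the ExternalSecret manifest this is a one-field change, shown here as a fragment of the `spec`:

```yaml
spec:
  refreshInterval: 24h   # was 1h; one sync per day per ExternalSecret
```

The trade-off is that a rotated credential can take up to a day to reach the cluster unless you trigger a sync manually.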
Frequently Asked Questions (FAQ)
What is Robusta in Kubernetes?
Robusta is a Kubernetes troubleshooting and observability platform that automatically enriches alerts with logs, metrics, and events. It connects to Prometheus and Alertmanager, sends notifications to Slack (and other channels), and offers AI-assisted root cause analysis via the Robusta UI.
How do I install Robusta on EKS?
Install Robusta on Amazon EKS by: (1) creating IAM policy and IRSA role with Terraform, (2) storing credentials in AWS Secrets Manager and syncing them with External Secrets, (3) deploying the Robusta Helm chart with a values file that references those secrets and your Slack/UI config. Use helmfile sync (or helm install) after setting CLUSTER_ENV and CLUSTER_REGION.
Does Robusta work with Prometheus and Alertmanager?
Yes. Robusta integrates with Prometheus and Alertmanager. You can set globalConfig.prometheus_url and globalConfig.alertmanager_url in your Helm values, and use playbooks triggered by on_prometheus_alert to enrich and route alerts to Slack or the Robusta UI.
Why use IRSA for Robusta on EKS?
IRSA (IAM Roles for Service Accounts) lets the Robusta runner use AWS APIs (CloudWatch Logs, EKS, EC2) without storing access keys. The pod assumes an IAM role via the service account annotation, which is more secure and easier to rotate than static credentials.
How often should I rotate Robusta credentials?
Rotate Slack tokens, Robusta signing keys, and UI tokens periodically (e.g. every 90 days). Update the values in AWS Secrets Manager, then restart the runner (e.g. kubectl rollout restart deployment robusta-runner -n robusta) so it picks up the new credentials after External Secrets syncs.
Conclusion
You now have a production-ready Robusta deployment across multiple EKS clusters! This setup provides:
✅ AI-powered troubleshooting with Holmes GPT
✅ Real-time Slack notifications with rich context
✅ Centralized UI for all clusters
✅ Secure credential management with External Secrets
✅ IAM roles without hardcoded credentials
✅ Multi-region deployment with GitOps
✅ Custom playbooks for your specific needs
Next Steps
- Add Prometheus Integration: Connect Robusta to Prometheus for metric enrichment—see our production Prometheus on Kubernetes guide and how to install and configure Prometheus Alertmanager for alerting.
- Enable Holmes AI: Set up AI-powered root cause analysis in the Robusta UI.
- Create More Playbooks: Automate responses to common issues.
- Set Up Alertmanager: Integrate with existing alerting infrastructure.
- Configure MS Teams/PagerDuty: Add additional notification channels.
Questions or Issues?
If you encounter any problems:
- Check the Robusta Troubleshooting Guide
- Review logs: kubectl logs -n robusta -l app=robusta-runner
- Join Robusta Slack Community
- Open an issue on GitHub
Happy monitoring! 🚀