Version: v1.0.0-rc.1 (pre-release)

RCA Agent

The RCA (Root Cause Analysis) Agent is an AI-powered component that analyzes logs, metrics, and traces from your OpenChoreo deployments to generate reports with likely root causes of issues. It integrates with Large Language Models (LLMs) to provide intelligent analysis and actionable insights.

Prerequisites

Before enabling the RCA Agent, ensure the following:

  • OpenChoreo Observability Plane installed (optionally with the Prometheus Metrics Module for richer analysis).
  • An LLM API key from OpenAI (support for other providers is coming soon).
  • Alerting configured for your components, with enableAiRootCauseAnalysis enabled.

Note: To manage LLM costs, enable automatic RCA only for critical alerts.

Tip: For best compatibility, we recommend using OpenAI models. Support for other providers will be available soon.

Enabling the RCA Agent

Step 1: Store secrets in OpenBao

Store your LLM API key:

kubectl exec -n openbao openbao-0 -- \
  env BAO_ADDR=http://127.0.0.1:8200 BAO_TOKEN=root \
  bao kv put secret/rca-llm-api-key value="<YOUR_LLM_API_KEY>"
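To confirm the write succeeded without echoing the key back to the terminal, you can read the field and print only its size. The sketch below is illustrative: stored_secret_length is a hypothetical helper name, and it assumes the same pod, address, and token as the command above.

```shell
# Print the byte length of the stored value rather than the value itself,
# so the API key does not end up in terminal scrollback.
stored_secret_length() {
  secret_path="$1"   # for example: secret/rca-llm-api-key
  kubectl exec -n openbao openbao-0 -- \
    env BAO_ADDR=http://127.0.0.1:8200 BAO_TOKEN=root \
    bao kv get -field=value "$secret_path" | wc -c
}
```

Run it as stored_secret_length secret/rca-llm-api-key. A result of 0 suggests the path is empty or the read failed.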

Step 2: Create the ExternalSecret

Create an ExternalSecret that pulls all required values into a single Kubernetes Secret. The secret is referenced by rca.secretName, and all of its keys are injected as environment variables via envFrom.

kubectl apply -f - <<EOF
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: rca-agent-secret
  namespace: openchoreo-observability-plane
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: default
  target:
    name: rca-agent-secret
  data:
    - secretKey: RCA_LLM_API_KEY
      remoteRef:
        key: rca-llm-api-key
        property: value
    - secretKey: OAUTH_CLIENT_SECRET
      remoteRef:
        key: rca-oauth-client-secret
        property: value
EOF
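Before moving on, it can help to check that the External Secrets Operator has actually synced the values. The following is a minimal sketch, assuming the resource names used above; verify_rca_secret is a hypothetical helper name.

```shell
# Wait for the ExternalSecret to report Ready, then list the keys of the
# generated Secret. `kubectl describe secret` shows key names and sizes only,
# not the secret values.
verify_rca_secret() {
  ns=openchoreo-observability-plane
  kubectl wait --for=condition=Ready externalsecret/rca-agent-secret \
    -n "$ns" --timeout=60s || return 1
  kubectl describe secret rca-agent-secret -n "$ns"
}
```

If the wait times out, kubectl describe externalsecret rca-agent-secret -n openchoreo-observability-plane usually shows which remote key could not be resolved.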

Step 3: Upgrade the Observability Plane

Enable the RCA Agent and configure the LLM model. The --reuse-values flag preserves your existing configuration.

helm upgrade --install openchoreo-observability-plane oci://ghcr.io/openchoreo/helm-charts/openchoreo-observability-plane \
  --version 1.0.0-rc.1 \
  --namespace openchoreo-observability-plane \
  --reuse-values \
  --set rca.enabled=true \
  --set rca.llm.modelName=<model-name>

Note: If the observability plane and control plane are in separate clusters, set rca.controlPlaneUrl to the control plane API URL (defaults to http://api.openchoreo.localhost:8080):

--set rca.controlPlaneUrl=<control-plane-api-url>
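If you prefer to keep these settings in version control, the --set flags from this step can be expressed as a values file. The sketch below only restates the keys shown above; rca_values and the file name rca-values.yaml are illustrative choices, not part of OpenChoreo.

```shell
# Emit a Helm values override equivalent to:
#   --set rca.enabled=true --set rca.llm.modelName=<model-name>
rca_values() {
  model_name="$1"
  cat <<EOF
rca:
  enabled: true
  llm:
    modelName: "${model_name}"
EOF
}
```

For example, run rca_values "<model-name>" > rca-values.yaml and replace the two --set flags in the helm upgrade command with -f rca-values.yaml. A controlPlaneUrl key could be added under rca in the same file if your clusters are separate.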

Step 4: Register with the control plane

Configure rcaAgentURL in the ClusterObservabilityPlane resource so the control plane knows where to reach the agent:

kubectl patch clusterobservabilityplane default --type=merge -p '{"spec":{"rcaAgentURL":"http://rca-agent.openchoreo.localhost:11080"}}'
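To confirm the patch was applied, you can read the field back and compare it with the URL you intended to set. This is a small sketch; check_rca_agent_url is a hypothetical helper name, and it assumes the resource is named default as in the command above.

```shell
# Read spec.rcaAgentURL from the ClusterObservabilityPlane and compare it
# with the expected value passed as the first argument.
check_rca_agent_url() {
  expected="$1"
  actual=$(kubectl get clusterobservabilityplane default \
    -o jsonpath='{.spec.rcaAgentURL}')
  if [ "$actual" = "$expected" ]; then
    echo "rcaAgentURL is set to $actual"
  else
    echo "rcaAgentURL mismatch: got '$actual', expected '$expected'" >&2
    return 1
  fi
}
```

Call it with the same URL you patched in, for example check_rca_agent_url http://rca-agent.openchoreo.localhost:11080.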

Step 5: Verify the installation

Check that the RCA Agent pod is running:

kubectl get pods -n openchoreo-observability-plane -l app.kubernetes.io/component=ai-rca-agent
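If the pod does not reach the Running state, its recent logs are usually the quickest pointer to a missing secret key or an unreachable LLM endpoint. The sketch below waits for readiness and falls back to printing logs; rca_agent_status is a hypothetical helper name and the 120-second timeout is an arbitrary choice.

```shell
# Wait for the RCA Agent pod to become Ready. If it does not, print the
# last 50 log lines and return a non-zero status.
rca_agent_status() {
  ns=openchoreo-observability-plane
  selector=app.kubernetes.io/component=ai-rca-agent
  kubectl wait --for=condition=Ready pod -l "$selector" \
    -n "$ns" --timeout=120s || {
    echo "RCA Agent pod is not Ready; recent logs:" >&2
    kubectl logs -l "$selector" -n "$ns" --tail=50
    return 1
  }
}
```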

If you are using the default identity provider (Thunder) and the default SQLite report storage, your setup is complete. The sections below are only needed if you are configuring an external identity provider or PostgreSQL for report storage.

Authentication and Authorization

By default, OpenChoreo configures Thunder as the identity provider for the RCA Agent with a pre-configured OAuth client for testing purposes. If you are using an external identity provider, follow the steps below to configure authentication and authorization.

Authentication

Create an OAuth 2.0 client that supports the client_credentials grant type for service-to-service authentication.

Store your OAuth client secret in OpenBao:

kubectl exec -n openbao openbao-0 -- \
  env BAO_ADDR=http://127.0.0.1:8200 BAO_TOKEN=root \
  bao kv put secret/rca-oauth-client-secret value="<YOUR_OAUTH_CLIENT_SECRET>"

Then configure the Observability Plane Helm values with your client credentials:

security:
  oidc:
    tokenUrl: "<your-idp-token-url>"

rca:
  secretName: "rca-agent-secret"
  oauth:
    clientId: "<your-client-id>"

See Identity Provider Configuration for detailed setup instructions.
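Before deploying, you may want to confirm that the new client can obtain a token with the client_credentials grant. The sketch below sends a standard OAuth 2.0 token request with HTTP Basic client authentication. request_token is a hypothetical helper, and some identity providers expect the credentials in the request body or require a scope parameter, so adjust it to your provider.

```shell
# Request an access token using the OAuth 2.0 client_credentials grant.
# Arguments: token URL, client ID, client secret.
request_token() {
  token_url="$1"
  client_id="$2"
  client_secret="$3"
  curl -sS -X POST "$token_url" \
    -u "${client_id}:${client_secret}" \
    -d grant_type=client_credentials
}
```

A successful response is a JSON document containing an access_token field. Passing the secret from an environment variable rather than typing it inline keeps it out of your shell history.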

Authorization

The RCA Agent uses the client_credentials grant to authenticate with the OpenChoreo API as a service account. The API matches the sub claim in the issued JWT to identify the caller, so the new client must be granted the rca-agent role via a bootstrap authorization mapping.

Add the following to your Control Plane values override, replacing <your-client-id> with the same client ID used above:

openchoreoApi:
  config:
    security:
      authorization:
        bootstrap:
          mappings:
            - name: rca-agent-binding
              roleRef:
                name: rca-agent
              entitlement:
                claim: sub
                value: "<your-client-id>"
              effect: allow
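The mapping only matches if the sub claim in the token equals the configured value, and not every identity provider sets sub to the client ID. To see what your provider issues, you can decode the payload of an access token. The helper below is a sketch (jwt_payload is a hypothetical name). It assumes the access token is a JWT and it does not verify the signature, so use it for inspection only.

```shell
# Decode the payload (second dot-separated segment) of a JWT.
# This does NOT verify the signature.
jwt_payload() {
  # Convert base64url to standard base64.
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the padding that base64url encoding strips.
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  printf '%s' "$payload" | base64 --decode
}
```

Run jwt_payload "<access-token>" and compare the sub field in the printed JSON with the value in the mapping.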

Report Storage

By default, RCA reports are stored in SQLite on a persistent volume, so no external database is required.

For production deployments that need horizontal scaling or shared storage, you can use PostgreSQL instead.

Using PostgreSQL

Store the PostgreSQL connection URI in OpenBao:

kubectl exec -n openbao openbao-0 -- \
  env BAO_ADDR=http://127.0.0.1:8200 BAO_TOKEN=root \
  bao kv put secret/rca-sql-backend-uri value="postgresql+asyncpg://<USER>:<PASSWORD>@<HOST>:<PORT>/<DBNAME>"

Add the SQL_BACKEND_URI key to the ExternalSecret from Step 2:

kubectl patch externalsecret rca-agent-secret -n openchoreo-observability-plane --type=json \
  -p '[{"op":"add","path":"/spec/data/-","value":{"secretKey":"SQL_BACKEND_URI","remoteRef":{"key":"rca-sql-backend-uri","property":"value"}}}]'
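Because the patch appends to an existing list, it is worth confirming that the new entry is present alongside the keys from Step 2. The following sketch reads the key names back from the ExternalSecret; has_sql_backend_key is a hypothetical helper name.

```shell
# Succeed only if the ExternalSecret declares a SQL_BACKEND_URI entry.
has_sql_backend_key() {
  kubectl get externalsecret rca-agent-secret \
    -n openchoreo-observability-plane \
    -o jsonpath='{.spec.data[*].secretKey}' \
    | tr ' ' '\n' | grep -qx 'SQL_BACKEND_URI'
}
```

For example: has_sql_backend_key && echo "SQL_BACKEND_URI is configured".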

Then set the report backend in your Helm values:

rca:
  reportBackend: postgresql
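One way to apply this value is to rerun the upgrade from Step 3 with the extra setting. The sketch below mirrors that command and assumes the same release name, chart version, and namespace; enable_postgres_backend is a hypothetical wrapper, and the --set form is simply the command-line equivalent of the values snippet above.

```shell
# Re-run the Observability Plane upgrade with the PostgreSQL report backend.
# --reuse-values keeps the settings applied in earlier steps.
enable_postgres_backend() {
  helm upgrade --install openchoreo-observability-plane \
    oci://ghcr.io/openchoreo/helm-charts/openchoreo-observability-plane \
    --version 1.0.0-rc.1 \
    --namespace openchoreo-observability-plane \
    --reuse-values \
    --set rca.reportBackend=postgresql
}
```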