Version: v0.17.x

RCA Agent

The RCA (Root Cause Analysis) Agent is an AI-powered component that analyzes logs, metrics, and traces from your OpenChoreo deployments to generate reports with likely root causes of issues. It integrates with Large Language Models (LLMs) to provide intelligent analysis and actionable insights.

Prerequisites

Before enabling the RCA Agent, ensure the following:

note

Enable automatic RCA only for critical alerts to manage LLM costs.

LLM Configuration

The RCA Agent requires an LLM provider to perform root cause analysis. Configure the model name and API key:

Configuration Parameters:

  • rca.llm.modelName: LLM model name (e.g., gpt-5, claude-sonnet-4-5, gemini-2.0-flash)
  • rca.llm.apiKey: API key for the LLM provider

tip

For best results, we recommend using the latest models from OpenAI or Anthropic.

Enabling the RCA Agent

Step 1: Create the OpenSearch credentials secret

The RCA Agent stores its reports in OpenSearch and needs credentials to connect. Create a secret named observer-opensearch-credentials containing your OpenSearch username and password. If you followed the local setup guide, the following ExternalSecret pulls them from the ClusterSecretStore seeded during installation:

kubectl apply -f - <<EOF
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: observer-opensearch-credentials
  namespace: openchoreo-observability-plane
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: default
  target:
    name: observer-opensearch-credentials
  data:
    - secretKey: username
      remoteRef:
        key: opensearch-username
        property: value
    - secretKey: password
      remoteRef:
        key: opensearch-password
        property: value
EOF

tip

If you are not using External Secrets Operator, create the secret directly:

kubectl create secret generic observer-opensearch-credentials \
  --from-literal=username=<opensearch-username> \
  --from-literal=password=<opensearch-password> \
  -n openchoreo-observability-plane
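
Before moving on, it can help to confirm that the secret exists and carries the expected keys. The sketch below wraps a standard kubectl get with a jsonpath query. The function name opensearch_secret_username is made up for illustration, and the sketch assumes a base64 binary that accepts -d.

```shell
# Hypothetical helper: print the username stored in the OpenSearch credentials secret.
opensearch_secret_username() {
  # Secret values are base64-encoded under .data, so decode before printing.
  kubectl get secret observer-opensearch-credentials \
    -n openchoreo-observability-plane \
    -o jsonpath='{.data.username}' | base64 -d
}
```

Calling opensearch_secret_username should print the username you supplied. An empty result or a NotFound error suggests the secret is not in place yet (with External Secrets Operator, the ExternalSecret may not have synced).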

Step 2: Upgrade the Observability Plane

Enable the RCA Agent and configure the LLM. The --reuse-values flag preserves your existing configuration.

If you followed the local setup guide, run:

helm upgrade --install openchoreo-observability-plane oci://ghcr.io/openchoreo/helm-charts/openchoreo-observability-plane \
  --version 0.17.0 \
  --namespace openchoreo-observability-plane \
  --reuse-values \
  --set rca.enabled=true \
  --set rca.llm.modelName=<model-name> \
  --set rca.llm.apiKey=<api-key>

note

If the observability plane and control plane are in separate clusters, set rca.controlPlaneUrl to the control plane API URL (defaults to http://api.openchoreo.localhost:8080):

--set rca.controlPlaneUrl=<control-plane-api-url>
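
Because the API key is a credential, one option is to supply the <model-name> and <api-key> placeholders from environment variables and check that they are set before running the upgrade. This is only a sketch: the variable names RCA_LLM_MODEL_NAME and RCA_LLM_API_KEY and the function require_llm_settings are hypothetical and are not read by the chart itself.

```shell
# Hypothetical helper: fail early if the model name or API key is missing.
require_llm_settings() {
  if [ -z "${RCA_LLM_MODEL_NAME:-}" ]; then
    echo "RCA_LLM_MODEL_NAME is not set" >&2
    return 1
  fi
  if [ -z "${RCA_LLM_API_KEY:-}" ]; then
    echo "RCA_LLM_API_KEY is not set" >&2
    return 1
  fi
}

# Usage (same flags as the helm upgrade command above):
#   require_llm_settings && helm upgrade ... \
#     --set rca.llm.modelName="$RCA_LLM_MODEL_NAME" \
#     --set rca.llm.apiKey="$RCA_LLM_API_KEY"
```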

Step 3: Register the RCA Agent with the control plane

Configure rcaAgentURL in the ObservabilityPlane resource so the control plane knows where to reach the agent:

kubectl patch observabilityplane default -n default --type=merge -p '{"spec":{"rcaAgentURL":"http://rca-agent.openchoreo.localhost:11080"}}'
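
To confirm that the patch took effect, the field can be read back with a jsonpath query. This is a minimal sketch that reuses the resource name and namespace from the command above; the function name get_rca_agent_url is only illustrative.

```shell
# Hypothetical helper: print the rcaAgentURL currently set on the ObservabilityPlane.
get_rca_agent_url() {
  kubectl get observabilityplane default -n default \
    -o jsonpath='{.spec.rcaAgentURL}'
}
```

If the patch was applied, get_rca_agent_url prints the URL passed to kubectl patch.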

Authentication and Authorization

By default, OpenChoreo configures Thunder as the identity provider for the RCA Agent with a pre-configured OAuth client for testing purposes. If you are using an external identity provider, follow the steps below to configure both authentication and authorization for the new client.

Authentication

Create an OAuth 2.0 client that supports the client_credentials grant type for service-to-service authentication, and configure the Observability Plane with the client credentials:

security:
  oidc:
    tokenUrl: "<your-idp-token-url>"

rca:
  oauth:
    clientId: "<your-client-id>"
    clientSecret: "<your-client-secret>"

See Identity Provider Configuration for detailed setup instructions.

Authorization

Authorization is enabled by default. The RCA Agent uses the client_credentials grant to authenticate with the OpenChoreo API as a service account, and the API identifies the caller by the sub claim in the issued JWT. The new client must therefore be granted the rca-agent role through a bootstrap authorization mapping.

Add the following to your Control Plane values override, replacing <your-client-id> with the same client ID used in the authentication configuration above:

openchoreoApi:
  config:
    security:
      authorization:
        bootstrap:
          mappings:
            - name: rca-agent-binding
              roleRef:
                name: rca-agent
              entitlement:
                claim: sub
                value: "<your-client-id>"
              effect: allow
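
Since the mapping matches on the sub claim, it can be worth checking what your identity provider actually puts in that claim for a client_credentials token. The sketch below decodes the payload segment of a JWT without verifying its signature. It assumes you have already obtained an access token from your IdP (how to do so is provider-specific), that the token is a JWT, and that base64 accepts -d. The function name jwt_payload is illustrative.

```shell
# Hypothetical helper: print the decoded payload (second segment) of a JWT.
# This only inspects claims; it does not verify the signature.
jwt_payload() {
  # Take the second dot-separated segment and map base64url to standard base64.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the padding that base64url encoding strips.
  case $(( ${#seg} % 4 )) in
    2) seg="${seg}==" ;;
    3) seg="${seg}=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}

# Usage: jwt_payload "$ACCESS_TOKEN"
```

The sub value in the printed JSON should match the value configured in the rca-agent-binding mapping.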

Verifying the Installation

Check that the RCA Agent pod is running:

kubectl get pods -n openchoreo-observability-plane -l app.kubernetes.io/component=ai-rca-agent
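
If the pod is still starting, a short wrapper around kubectl wait and kubectl logs can make the check less manual. This is a sketch that reuses the namespace and label selector from the command above; the function name check_rca_agent, the two-minute timeout, and the 50-line log tail are arbitrary choices.

```shell
# Hypothetical helper: wait for the RCA Agent pod, then show its recent logs.
check_rca_agent() {
  ns=openchoreo-observability-plane
  selector=app.kubernetes.io/component=ai-rca-agent
  # Block until the pod reports Ready, or give up after two minutes.
  kubectl wait --for=condition=Ready pod -l "$selector" -n "$ns" --timeout=120s || return 1
  # Print the last 50 log lines from the matching pod(s).
  kubectl logs -l "$selector" -n "$ns" --tail=50
}
```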