Version: v1.0.0-rc.1 (pre-release)

Observability Plane

Dependencies

This chart installs the following community modules as sub-charts; they serve as the default observability modules. For the full configuration options of each module, refer to its official documentation in the community modules repository.

| Name | Repository | Condition |
| --- | --- | --- |
| `observability-logs-opensearch` | https://github.com/openchoreo/community-modules/tree/main/observability-logs-opensearch | `observability-logs-opensearch.enabled` |
| `observability-metrics-prometheus` | https://github.com/openchoreo/community-modules/tree/main/observability-metrics-prometheus | `observability-metrics-prometheus.enabled` |
| `observability-tracing-opensearch` | https://github.com/openchoreo/community-modules/tree/main/observability-tracing-opensearch | `observability-tracing-opensearch.enabled` |
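
As a rough sketch, assuming each condition above maps to a top-level values key (the usual Helm sub-chart convention), an override that keeps the logs and metrics modules but skips the tracing module might look like the following. The file name is hypothetical.

```yaml
# values-override.yaml (hypothetical file name)
observability-logs-opensearch:
  enabled: true
observability-metrics-prometheus:
  enabled: true
observability-tracing-opensearch:
  enabled: false   # do not install the default tracing module
```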

Cluster Agent

Cluster Agent configuration for WebSocket-based communication with the control plane's cluster gateway

| Parameter | Description | Type | Default |
| --- | --- | --- | --- |
| `clusterAgent.affinity` | Affinity rules for pod scheduling | object | `{}` |
| `clusterAgent.heartbeatInterval` | Interval between heartbeat messages to the control plane | string | `30s` |
| `clusterAgent.image.pullPolicy` | Image pull policy for the cluster agent | string | `IfNotPresent` |
| `clusterAgent.image.repository` | Container image repository for the cluster agent | string | `ghcr.io/openchoreo/cluster-agent` |
| `clusterAgent.image.tag` | Container image tag (defaults to Chart.AppVersion if empty) | string | |
| `clusterAgent.logLevel` | Log level for the cluster agent (debug, info, warn, error) | string | `info` |
| `clusterAgent.name` | Name of the cluster agent deployment and associated resources | string | `cluster-agent-observabilityplane` |
| `clusterAgent.nodeSelector` | Node selector for pod scheduling | object | `{}` |
| `clusterAgent.planeID` | Logical plane identifier for multi-tenancy. Multiple CRs with the same planeID share one agent. Defaults to Helm release name if not specified. | string | `default` |
| `clusterAgent.planeType` | Type of plane this agent serves | string | `observabilityplane` |
| `clusterAgent.podAnnotations` | Annotations to add to cluster agent pods | object | `{}` |
| `clusterAgent.podDisruptionBudget.enabled` | Enable PodDisruptionBudget for cluster agent | boolean | `false` |
| `clusterAgent.podDisruptionBudget.maxUnavailable` | Maximum number of pods that can be unavailable | integer,null | `null` |
| `clusterAgent.podDisruptionBudget.minAvailable` | Minimum number of pods that must be available | integer | `1` |
| `clusterAgent.podSecurityContext.fsGroup` | Filesystem group ID | integer | `1000` |
| `clusterAgent.podSecurityContext.runAsNonRoot` | Run as non-root user | boolean | `true` |
| `clusterAgent.podSecurityContext.runAsUser` | User ID to run as | integer | `1000` |
| `clusterAgent.priorityClass.create` | Create a priority class | boolean | `false` |
| `clusterAgent.priorityClass.name` | Name of the priority class | string | `cluster-agent-observabilityplane` |
| `clusterAgent.priorityClass.value` | Priority value | integer | `900000` |
| `clusterAgent.rbac.create` | Create ClusterRole and ClusterRoleBinding for the agent | boolean | `true` |
| `clusterAgent.reconnectDelay` | Delay before reconnecting after connection loss | string | `5s` |
| `clusterAgent.replicas` | Number of cluster agent pod replicas | integer | `1` |
| `clusterAgent.resources.limits.cpu` | CPU limit | string | `100m` |
| `clusterAgent.resources.limits.memory` | Memory limit | string | `256Mi` |
| `clusterAgent.resources.requests.cpu` | CPU request | string | `50m` |
| `clusterAgent.resources.requests.memory` | Memory request | string | `128Mi` |
| `clusterAgent.securityContext.allowPrivilegeEscalation` | Prevent privilege escalation | boolean | `false` |
| `clusterAgent.securityContext.capabilities.drop` | Capabilities to drop | array | |
| `clusterAgent.securityContext.readOnlyRootFilesystem` | Mount root filesystem as read-only | boolean | `true` |
| `clusterAgent.serverCANamespace` | Namespace where cluster-gateway CA ConfigMap exists | string | `openchoreo-control-plane` |
| `clusterAgent.serverUrl` | WebSocket URL of the cluster gateway in the control plane | string | `wss://cluster-gateway.openchoreo-control-plane.svc.cluster.local:8443/ws` |
| `clusterAgent.serviceAccount.annotations` | Annotations to add to the service account | object | `{}` |
| `clusterAgent.serviceAccount.create` | Create a dedicated service account | boolean | `true` |
| `clusterAgent.serviceAccount.name` | Name of the service account | string | `cluster-agent-observabilityplane` |
| `clusterAgent.tls.caSecretName` | CA secret name for signing agent client certificates. If empty, self-signed certs will be generated (required for multi-cluster setup). | string | `cluster-gateway-ca` |
| `clusterAgent.tls.caSecretNamespace` | Namespace where the CA secret exists. If empty, self-signed certs will be generated (required for multi-cluster setup). | string | `openchoreo-control-plane` |
| `clusterAgent.tls.caValue` | Inline CA certificate in PEM format (for multi-cluster, takes precedence) | string | |
| `clusterAgent.tls.clientSecretName` | Name of the client certificate Secret | string | `cluster-agent-tls` |
| `clusterAgent.tls.duration` | Certificate validity duration (e.g., 2160h = 90 days) | string | `2160h` |
| `clusterAgent.tls.enabled` | Enable TLS for cluster agent communication | boolean | `true` |
| `clusterAgent.tls.generateCerts` | Generate client certificates locally using cert-manager with a self-signed CA | boolean | `true` |
| `clusterAgent.tls.renewBefore` | Time before expiry to renew certificate (e.g., 360h = 15 days) | string | `360h` |
| `clusterAgent.tls.secretName` | Name of the Secret containing client certificate and key | string | `cluster-agent-tls` |
| `clusterAgent.tls.serverCAConfigMap` | Name of the ConfigMap containing server CA certificate | string | `cluster-gateway-ca` |
| `clusterAgent.tls.serverCAValue` | Inline server CA certificate in PEM format (for multi-cluster setups) | string | |
| `clusterAgent.tolerations` | Tolerations for pod scheduling | array | `[]` |
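
To illustrate how these parameters combine, here is a minimal sketch of a multi-cluster override, where the agent reaches a cluster gateway in a different cluster. The gateway hostname, the plane identifier, and the PEM placeholder are assumptions for illustration. How the chart actually resolves the CA settings should be confirmed against the chart templates.

```yaml
clusterAgent:
  planeID: observability-prod        # hypothetical plane identifier
  # Placeholder host; the default points at the in-cluster gateway service
  serverUrl: wss://cluster-gateway.example.com:8443/ws
  heartbeatInterval: 30s
  reconnectDelay: 5s
  tls:
    enabled: true
    generateCerts: true
    # Per the table, empty CA secret fields cause self-signed certs
    # to be generated (the multi-cluster case)
    caSecretName: ""
    caSecretNamespace: ""
    # Inline server CA, since the gateway's CA ConfigMap is assumed
    # not to exist in this cluster
    serverCAValue: |
      -----BEGIN CERTIFICATE-----
      <server CA certificate in PEM format>
      -----END CERTIFICATE-----
```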

Controller Manager

Configuration for the observability plane controller manager that reconciles ObservabilityAlertRules and other CRDs

| Parameter | Description | Type | Default |
| --- | --- | --- | --- |
| `controllerManager.affinity` | Affinity rules for pod scheduling | object | `{}` |
| `controllerManager.clusterGateway.enabled` | Enable cluster gateway integration for multi-cluster setups | boolean | `false` |
| `controllerManager.clusterGateway.tls.caConfigMap` | Name of the ConfigMap containing the gateway CA certificate | string | `cluster-gateway-ca` |
| `controllerManager.clusterGateway.tls.caPath` | Path to the CA certificate file for gateway verification | string | `/etc/cluster-gateway/ca.crt` |
| `controllerManager.clusterGateway.url` | URL of the cluster gateway service in the control plane | string | `https://cluster-gateway.openchoreo-control-plane.svc.cluster.local:8443` |
| `controllerManager.containerSecurityContext.allowPrivilegeEscalation` | Prevent privilege escalation within the container | boolean | `false` |
| `controllerManager.containerSecurityContext.capabilities.drop` | Capabilities to drop from the container | array | `["ALL"]` |
| `controllerManager.containerSecurityContext.readOnlyRootFilesystem` | Mount root filesystem as read-only | boolean | `false` |
| `controllerManager.containerSecurityContext.seccompProfile.type` | Seccomp profile type | string | `RuntimeDefault` |
| `controllerManager.deploymentPlane` | Identifier for this deployment plane type | string | `observabilityplane` |
| `controllerManager.enabled` | Enable or disable the controller manager deployment | boolean | `true` |
| `controllerManager.image.pullPolicy` | Image pull policy for the controller manager container | string | `IfNotPresent` |
| `controllerManager.image.repository` | Container image repository for the controller manager | string | `ghcr.io/openchoreo/controller` |
| `controllerManager.image.tag` | Container image tag (defaults to Chart.AppVersion if empty) | string | |
| `controllerManager.manager.args` | Command line arguments passed to the controller manager | array | |
| `controllerManager.manager.env.enableWebhooks` | Enable or disable admission webhooks | string | `false` |
| `controllerManager.name` | Name of the controller manager deployment and associated resources | string | `controller-manager` |
| `controllerManager.nodeSelector` | Node selector for pod scheduling constraints | object | `{}` |
| `controllerManager.podSecurityContext.fsGroup` | Group ID for filesystem access | integer | `1000` |
| `controllerManager.podSecurityContext.runAsGroup` | Group ID to run the container process | integer | `1000` |
| `controllerManager.podSecurityContext.runAsNonRoot` | Require the container to run as a non-root user | boolean | `true` |
| `controllerManager.podSecurityContext.runAsUser` | User ID to run the container process | integer | `1000` |
| `controllerManager.priorityClass.create` | Create a priority class for the controller manager | boolean | `false` |
| `controllerManager.priorityClass.name` | Name of the priority class | string | `observabilityplane-controller-manager` |
| `controllerManager.priorityClass.value` | Priority value (higher values indicate higher priority) | integer | `900000` |
| `controllerManager.replicas` | Number of controller manager pod replicas | integer | `1` |
| `controllerManager.resources.limits.cpu` | CPU limit for the controller manager | string | `500m` |
| `controllerManager.resources.limits.memory` | Memory limit for the controller manager | string | `512Mi` |
| `controllerManager.resources.requests.cpu` | CPU request for the controller manager | string | `100m` |
| `controllerManager.resources.requests.memory` | Memory request for the controller manager | string | `256Mi` |
| `controllerManager.serviceAccount.annotations` | Annotations to add to the service account | object | `{}` |
| `controllerManager.serviceAccount.create` | Create a dedicated service account for the controller manager | boolean | `true` |
| `controllerManager.tolerations` | Tolerations for pod scheduling on tainted nodes | array | `[]` |
| `controllerManager.topologySpreadConstraints` | Topology spread constraints for pod distribution across failure domains | array | `[]` |
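
A minimal sketch of enabling the cluster gateway integration for a multi-cluster setup follows. The gateway URL is a placeholder; the remaining values repeat the documented defaults so the relationship between the CA ConfigMap and the mounted path is visible.

```yaml
controllerManager:
  enabled: true
  clusterGateway:
    enabled: true                                   # default is false
    url: https://cluster-gateway.example.com:8443   # placeholder host
    tls:
      caConfigMap: cluster-gateway-ca       # ConfigMap holding the gateway CA
      caPath: /etc/cluster-gateway/ca.crt   # CA file path used for verification
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 512Mi
```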

Gateway

Gateway resource configuration for observability plane routing

| Parameter | Description | Type | Default |
| --- | --- | --- | --- |
| `gateway.annotations` | Annotations added to the Gateway resource. Use this to configure cert-manager, external-dns, or other integrations. | object | `{}` |
| `gateway.enabled` | Enable Gateway CR creation | boolean | `true` |
| `gateway.httpPort` | HTTP listener port | integer | `80` |
| `gateway.httpsPort` | HTTPS listener port | integer | `443` |
| `gateway.infrastructure` | Gateway infrastructure configuration passed to the generated Service. Used to configure cloud provider load balancer settings via annotations. Example for AWS with Elastic IP: `infrastructure: annotations: service.beta.kubernetes.io/aws-load-balancer-type: "external"` | object | |
| `gateway.tls.certificateRefs` | TLS certificate references for the HTTPS listener. Each entry references a Secret containing the TLS cert/key pair. | array | |
| `gateway.tls.enabled` | Enable HTTPS listener on the gateway. When false, only the HTTP listener is created. | boolean | `true` |
| `gateway.tls.hostname` | Hostname pattern for the HTTPS listener (SNI matching) | string | `*.openchoreo.invalid` |
| `gateway.tlsPassthrough.enabled` | Enable TLS passthrough listener (used for OpenSearch direct access) | boolean | `false` |
| `gateway.tlsPassthrough.hostname` | Hostname for TLS passthrough listener | string | |
| `gateway.tlsPassthrough.port` | Port for TLS passthrough listener | integer | `11443` |
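
The sketch below shows one way the gateway values might be combined for an HTTPS listener behind an AWS load balancer. The hostname and Secret name are hypothetical. The shape of each `certificateRefs` entry is an assumption based on the Kubernetes Gateway API, where a certificate reference names a Secret; check the chart's values schema before relying on it.

```yaml
gateway:
  enabled: true
  httpPort: 80
  httpsPort: 443
  tls:
    enabled: true
    hostname: "*.observability.example.com"   # placeholder hostname pattern
    certificateRefs:
      - name: observability-gateway-tls       # hypothetical TLS Secret name
  infrastructure:
    annotations:
      # Annotation taken from the parameter description above
      service.beta.kubernetes.io/aws-load-balancer-type: "external"
```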

Global

Global values shared across all components in the observability plane

| Parameter | Description | Type | Default |
| --- | --- | --- | --- |
| `global.commonLabels` | Common labels applied to all resources created by this chart | object | `{}` |
| `global.installationMode` | Installation mode of OpenChoreo. Supported: singleCluster, multiCluster, quickStart | string | `singleCluster` |

Kubernetes Cluster Domain

Kubernetes cluster domain used for service discovery DNS resolution

| Parameter | Description | Type | Default |
| --- | --- | --- | --- |
| `kubernetesClusterDomain` | Kubernetes cluster domain used for service discovery DNS resolution | string | `cluster.local` |
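
As a small sketch, the global values and the cluster domain could be set together as follows. The label and the custom domain are illustrative assumptions; most clusters keep the default `cluster.local`.

```yaml
global:
  installationMode: multiCluster    # one of singleCluster, multiCluster, quickStart
  commonLabels:
    team: platform                  # hypothetical label applied to all resources
kubernetesClusterDomain: cluster.internal   # hypothetical non-default cluster domain
```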

Observer

OpenChoreo Observer is the service that powers the Observer API used to query logs, metrics, traces, alerts, and incidents. It also owns:

  • The internal alerts API used by the observability plane controller to create/update/delete alert rules in OpenSearch and Prometheus.
  • The alert webhook endpoint that Alertmanager and OpenSearch call when alerts fire.
  • The alert/incident store used by the alert and incident query APIs.

Use the values in this section together with the alerting-related values in the observability-metrics-prometheus and observability-logs-opensearch sub-charts when configuring alerting and RCA for a deployment.

| Parameter | Description | Type | Default |
| --- | --- | --- | --- |
| `observer.alertStoreBackend` | Alert entry storage backend for fired alerts (sqlite, postgresql) | string | `sqlite` |
| `observer.alertStoreSqliteSize` | PVC size for SQLite alert entry storage | string | `128Mi` |
| `observer.authzTlsInsecureSkipVerify` | Skip TLS certificate verification when calling the control plane authz service (use for self-signed certs) | boolean | `false` |
| `observer.controlPlaneApiUrl` | Control plane API base URL used by observer | string | `http://api.openchoreo.localhost:8080` |
| `observer.cors.allowedOrigins` | List of allowed origins for CORS requests. Empty list disables CORS. | array | |
| `observer.extraEnvs` | Extra environment variables for the Observer container (can be used to point to custom alert/incident stores or adapters) | array | |
| `observer.http.enabled` | Enable HTTPRoute | boolean | `true` |
| `observer.http.hostnames` | HTTPRoute hostnames | array | |
| `observer.image.pullPolicy` | Image pull policy for the Observer container | string | `IfNotPresent` |
| `observer.image.repository` | Container image repository for the Observer | string | `ghcr.io/openchoreo/observer` |
| `observer.image.tag` | Container image tag (defaults to Chart.AppVersion if empty) | string | |
| `observer.internalService.port` | Service port for the Observer internal API (used for alert rule and webhook endpoints) | integer | `8081` |
| `observer.logLevel` | Log level for the Observer service (debug, info, warn, error) | string | `info` |
| `observer.logsAdapter.enabled` | Enable logs adapter for fetching logs from an external adapter | boolean | `false` |
| `observer.logsAdapter.timeout` | Timeout for logs adapter requests | string | `30s` |
| `observer.logsAdapter.url` | URL of the logs adapter service | string | `http://logs-adapter:9098` |
| `observer.oauthClientId` | OAuth2 client ID used by the Observer when calling the control plane API | string | `openchoreo-observer` |
| `observer.secretName` | Name of an existing Secret injected via envFrom. Required keys: OPENSEARCH_USERNAME, OPENSEARCH_PASSWORD, UID_RESOLVER_OAUTH_CLIENT_SECRET. Optional keys: ALERT_STORE_DSN (when alertStoreBackend is postgresql). | string | |
| `observer.openSearchSecretName` | Name of an existing Secret with 'username' and 'password' keys for OpenSearch authentication. Required. | string | |
| `observer.prometheus.address` | Prometheus server address (auto-constructed from release name if empty) | string | |
| `observer.prometheus.timeout` | Timeout for Prometheus queries | string | `30s` |
| `observer.replicas` | Number of Observer pod replicas | integer | `1` |
| `observer.resources.limits.cpu` | CPU limit for the Observer | string | `200m` |
| `observer.resources.limits.memory` | Memory limit for the Observer | string | `200Mi` |
| `observer.resources.requests.cpu` | CPU request for the Observer | string | `100m` |
| `observer.resources.requests.memory` | Memory request for the Observer | string | `128Mi` |
| `observer.security.subjectTypes` | Subject type configurations for JWT subject resolution | array | |
| `observer.service.port` | Service port for the Observer API | integer | `8080` |
| `observer.service.type` | Kubernetes service type | string | `ClusterIP` |
| `observer.tracingAdapter.enabled` | Enable tracing adapter for fetching traces from an external adapter | boolean | `false` |
| `observer.tracingAdapter.timeout` | Timeout for tracing adapter requests | string | `30s` |
| `observer.tracingAdapter.url` | URL of the tracing adapter service | string | `http://tracing-adapter:9100` |
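
For illustration, a sketch of Observer values that switch the alert store to PostgreSQL might look like this. Both Secret names, the hostname, and the CORS origin are hypothetical; the Secrets must already exist and carry the keys listed in the table.

```yaml
observer:
  alertStoreBackend: postgresql
  # Existing Secret (hypothetical name) injected via envFrom. It must provide
  # OPENSEARCH_USERNAME, OPENSEARCH_PASSWORD, UID_RESOLVER_OAUTH_CLIENT_SECRET,
  # and, for the postgresql backend, ALERT_STORE_DSN.
  secretName: observer-secret
  # Existing Secret (hypothetical name) with 'username' and 'password' keys
  openSearchSecretName: observer-opensearch-credentials
  logLevel: info
  http:
    enabled: true
    hostnames:
      - observer.example.com          # placeholder HTTPRoute hostname
  cors:
    allowedOrigins:
      - https://console.example.com   # placeholder origin; empty list disables CORS
```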

RCA

AI-powered Root Cause Analysis agent configuration

| Parameter | Description | Type | Default |
| --- | --- | --- | --- |
| `rca.authz.timeoutSeconds` | Authorization request timeout in seconds | integer | `30` |
| `rca.controlPlaneUrl` | Control plane API base URL used by rca-agent | string | `http://api.openchoreo.localhost:8080` |
| `rca.cors.allowedOrigins` | List of allowed origins for CORS requests. Empty list disables CORS. | array | |
| `rca.enabled` | Enable RCA agent deployment | boolean | `false` |
| `rca.http.enabled` | Enable HTTPRoute | boolean | `true` |
| `rca.http.hostnames` | HTTPRoute hostnames | array | |
| `rca.image.pullPolicy` | Image pull policy | string | `IfNotPresent` |
| `rca.image.repository` | Container image repository | string | `ghcr.io/openchoreo/ai-rca-agent` |
| `rca.image.tag` | Container image tag (defaults to Chart.AppVersion if empty) | string | |
| `rca.llm.modelName` | LLM model name (e.g., gpt-5.2) | string | |
| `rca.logLevel` | Log level for the RCA agent | string | `INFO` |
| `rca.name` | Name of the RCA agent deployment | string | `ai-rca-agent` |
| `rca.oauth.clientId` | OAuth2 client ID registered with the IDP | string | `openchoreo-rca-agent` |
| `rca.observerMcpUrl` | Observer MCP endpoint URL | string | |
| `rca.remedAgent` | Enable remediation agent | boolean | `true` |
| `rca.replicas` | Number of RCA agent replicas (must be 1 for sqlite) | integer | `1` |
| `rca.resources.limits.cpu` | CPU limit | string | `250m` |
| `rca.resources.limits.memory` | Memory limit | string | `1536Mi` |
| `rca.resources.requests.cpu` | CPU request | string | `100m` |
| `rca.resources.requests.memory` | Memory request | string | `1024Mi` |
| `rca.secretName` | Name of an existing Secret injected via envFrom. Required keys: RCA_LLM_API_KEY, OAUTH_CLIENT_SECRET. Optional keys: SQL_BACKEND_URI (when reportBackend is postgresql). | string | `rca-agent-secret` |
| `rca.reportBackend` | Report storage backend type (sqlite, postgresql) | string | `sqlite` |
| `rca.sqliteStorageSize` | PVC storage size for SQLite (only when reportBackend is sqlite) | string | `128Mi` |
| `rca.service.port` | Service port | integer | `8080` |
| `rca.service.type` | Service type | string | `ClusterIP` |
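
Because the RCA agent is disabled by default, a minimal sketch of turning it on with the SQLite report backend is shown below. The model name is the example from the table, and the referenced Secret is assumed to already exist with the required keys.

```yaml
rca:
  enabled: true                 # default is false
  replicas: 1                   # must stay at 1 while reportBackend is sqlite
  reportBackend: sqlite
  sqliteStorageSize: 128Mi
  # Existing Secret providing RCA_LLM_API_KEY and OAUTH_CLIENT_SECRET
  secretName: rca-agent-secret
  llm:
    modelName: gpt-5.2          # example value from the parameter description
  remedAgent: true
```

Switching `reportBackend` to `postgresql` would additionally require the optional `SQL_BACKEND_URI` key in the same Secret, as the table notes.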

Security

Common security configuration shared across all components

| Parameter | Description | Type | Default |
| --- | --- | --- | --- |
| `security.enabled` | Global security toggle - when disabled, authentication is turned off for all components | boolean | `true` |
| `security.jwt.audience` | Expected audience claim in JWT tokens | string | |
| `security.oidc.authServerBaseUrl` | Base URL for the authorization server (used for OAuth metadata) | string | |
| `security.oidc.issuer` | OIDC issuer URL | string | |
| `security.oidc.jwksUrl` | JWKS URL for token verification | string | |
| `security.oidc.jwksUrlTlsInsecureSkipVerify` | Skip TLS verification for JWKS URL | string | `false` |
| `security.oidc.tokenUrl` | OIDC token endpoint URL | string | |
| `security.oidc.uidResolverTlsInsecureSkipVerify` | Skip TLS verification for the UID resolver OAuth token endpoint (for self-signed certs) | string | `false` |
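
A sketch of the security values for an external identity provider follows. Every URL and the audience value are placeholders for illustration; the actual endpoint paths depend on the identity provider in use. The two skip-verify flags are quoted because the table lists their type as string.

```yaml
security:
  enabled: true
  jwt:
    audience: observability-api                     # hypothetical audience claim
  oidc:
    issuer: https://idp.example.com                 # placeholder issuer
    authServerBaseUrl: https://idp.example.com
    jwksUrl: https://idp.example.com/oauth2/jwks    # placeholder path
    tokenUrl: https://idp.example.com/oauth2/token  # placeholder path
    jwksUrlTlsInsecureSkipVerify: "false"
    uidResolverTlsInsecureSkipVerify: "false"
```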

TLS

Global TLS certificate configuration using cert-manager

| Parameter | Description | Type | Default |
| --- | --- | --- | --- |
| `tls.dnsNames` | DNS names for generated wildcard certificate. Required when tls.enabled=true | array | `[]` |
| `tls.enabled` | Enable TLS certificate generation for the observability plane | boolean | `false` |
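
Since `tls.dnsNames` is required once `tls.enabled` is true, the two values are normally set together. A minimal sketch, using a placeholder domain:

```yaml
tls:
  enabled: true
  dnsNames:
    - "*.observability.example.com"   # placeholder wildcard DNS name
```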