Version: v0.9.x

ObservabilityAlertRule

An ObservabilityAlertRule defines a rule for monitoring runtime observability data (metrics or logs) and triggering alerts when specific conditions are met.

Generated Resources

ObservabilityAlertRule resources are typically generated automatically by the OpenChoreo control plane during component releases. They are derived from the alert definitions specified in a component's traits.

Usage Recommendation

Do not create ObservabilityAlertRule resources manually. Instead, define alert rules as a Trait within your component definition, using either the default observability-alertrule trait or a custom trait. This keeps each alert rule scoped to its component and managed as part of the component's lifecycle across environments.

Example: Defining Alerts as Traits

In your Component CR, add the alert rule as a trait. This example uses the default observability-alertrule trait:

apiVersion: openchoreo.dev/v1alpha1
kind: Component
metadata:
  name: my-service
spec:
  # ... other component fields ...
  traits:
    - name: observability-alertrule
      instanceName: high-error-rate-log-alert
      parameters:
        description: "Triggered when the error log count exceeds 50 in 5 minutes."
        severity: "critical"
        source:
          type: "log"
          query: "status:error"
        condition:
          window: 5m
          interval: 1m
          operator: gt
          threshold: 50

To override environment-specific parameters for the alert rule, add an entry under traitOverrides in the ReleaseBinding CR, keyed by the trait's instanceName:

apiVersion: openchoreo.dev/v1alpha1
kind: ReleaseBinding
metadata:
  name: my-service-production
  namespace: default
spec:
  owner:
    projectName: default
    componentName: my-service
  environment: production

  traitOverrides:
    high-error-rate-log-alert:
      enabled: true
      enableAiRootCauseAnalysis: false
      notificationChannel: devops-email-notifications

The control plane will then generate the corresponding ObservabilityAlertRule resource for each environment where this component is released.
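Because overrides live in each ReleaseBinding, the same trait can be tuned separately per environment. The fragment below is a minimal sketch of that idea, not a confirmed configuration: the binding name my-service-development and the environment name development are hypothetical, and it assumes a disabled rule still accepts the same override keys shown above.

```yaml
apiVersion: openchoreo.dev/v1alpha1
kind: ReleaseBinding
metadata:
  name: my-service-development      # hypothetical binding name
  namespace: default
spec:
  owner:
    projectName: default
    componentName: my-service
  environment: development          # hypothetical environment name
  traitOverrides:
    high-error-rate-log-alert:      # must match the trait's instanceName
      enabled: false                # rule is not evaluated in this environment
      notificationChannel: devops-email-notifications
```

With this binding alongside the production one, the alert would only be evaluated in production, while the trait definition in the Component stays unchanged.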

API Version

openchoreo.dev/v1alpha1

Resource Definition

Metadata

ObservabilityAlertRule resources are namespace-scoped and typically created within the project-environment namespace.

apiVersion: openchoreo.dev/v1alpha1
kind: ObservabilityAlertRule
metadata:
  name: <rule-name>
  namespace: <project-environment-namespace>

Spec Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Unique identifier for the alert rule |
| description | string | No | A human-friendly summary of the alert rule |
| severity | AlertSeverity | No | Describes how urgent the alert is (info, warning, critical) |
| enabled | boolean | No | Toggles whether this alert rule should be evaluated. Defaults to true |
| enableAiRootCauseAnalysis | boolean | No | Allows an attached AI engine to perform root cause analysis and generate a report when the alert is triggered |
| notificationChannel | string | Yes | Name of the ObservabilityAlertsNotificationChannel to notify |
| source | AlertSource | Yes | Specifies the observability source type (log or metrics) and query that drives the rule |
| condition | AlertCondition | Yes | Controls when an alert should be triggered based on the source data |

AlertSeverity

| Value | Description |
| --- | --- |
| info | Informational alerts |
| warning | Warning-level alerts |
| critical | Critical alerts |

AlertSource

Specifies where and how events are pulled for evaluation.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| type | AlertSourceType | Yes | The telemetry source type (log, metrics) |
| query | string | No | The query for log-based alerting (e.g., status:error) |
| metric | string | No | The metric name for metrics-based alerting (e.g., cpu, memory) |

AlertSourceType

| Value | Description |
| --- | --- |
| log | Log-based alerting (Powered by OpenSearch) |
| metrics | Usage metrics-based alerting (Powered by Prometheus) |

AlertCondition

Represents the conditions under which an alert should be triggered.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| window | duration | Yes | The time window over which data is aggregated before comparison (e.g., 5m) |
| interval | duration | Yes | How often the alert rule is evaluated (e.g., 1m) |
| operator | AlertConditionOperator | Yes | Comparison operator used for evaluation |
| threshold | integer | Yes | Trigger value for the configured operator |

AlertConditionOperator

| Value | Description |
| --- | --- |
| gt | Greater than threshold |
| lt | Less than threshold |
| gte | Greater than or equal to threshold |
| lte | Less than or equal to threshold |
| eq | Equals the threshold |
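To show how the AlertCondition fields combine, here is a small sketch of a condition block. The values are hypothetical. Reading it as "every 2 minutes, aggregate the last 10 minutes and compare against the threshold" is an interpretation of the window and interval descriptions above, not a statement about the evaluation engine's exact behavior.

```yaml
# Hypothetical values for illustration only.
condition:
  window: 10m      # aggregate the trailing 10 minutes of source data
  interval: 2m     # re-evaluate the rule every 2 minutes
  operator: gte    # trigger when the aggregated value is >= threshold
  threshold: 100   # must be an integer
```

Choosing gte rather than gt here only changes whether a value of exactly 100 triggers the alert.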

Examples

Log-based Alert Rule

apiVersion: openchoreo.dev/v1alpha1
kind: ObservabilityAlertRule
metadata:
  name: error-logs-alert
  namespace: my-project-production
spec:
  name: Error Logs Detected
  description: Triggered when more than 10 error logs are detected in 1 minute.
  severity: critical
  notificationChannel: devops-email-notifications
  source:
    type: log
    query: 'status: "error"'
  condition:
    window: 1m
    interval: 1m
    operator: gt
    threshold: 10
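
Metrics-based Alert Rule (illustrative)

For comparison, a metrics-based rule would presumably use source.type: metrics together with the metric field instead of query. The following is a sketch under stated assumptions rather than a verified manifest: the rule name high-cpu-alert and its description are hypothetical, and it assumes the threshold for the cpu metric is interpreted as a percentage, which the field reference does not specify.

```yaml
apiVersion: openchoreo.dev/v1alpha1
kind: ObservabilityAlertRule
metadata:
  name: high-cpu-alert                 # hypothetical rule name
  namespace: my-project-production
spec:
  name: High CPU Usage
  description: Triggered when CPU usage exceeds the threshold over 5 minutes.
  severity: warning
  notificationChannel: devops-email-notifications
  source:
    type: metrics                      # metrics-based alerting
    metric: cpu                        # metric name, per the AlertSource fields
  condition:
    window: 5m
    interval: 1m
    operator: gt
    threshold: 80                      # assumed to be a percentage
```

As with the log-based rule, in practice this resource would be generated from a trait on the Component rather than applied by hand.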