Multi-Cluster Production Setup
Deploy OpenChoreo planes across multiple Kubernetes clusters for production isolation, independent scaling, and blast radius reduction.
Before You Begin
Read Deployment Planning to understand:
- Multi-cluster architecture and communication
- Domain requirements per plane
- TLS certificate options
Prerequisites
- Multiple Kubernetes 1.32+ clusters
- kubectl contexts configured for each cluster
- Helm installed
- Your base domain (e.g., example.com)
- DNS access to create records
- LoadBalancer support in each cluster
- cert-manager installed in each cluster
Set your domain and cluster contexts:
export DOMAIN="example.com"
export CP_CONTEXT="my-control-plane-cluster"
export DP_CONTEXT="my-data-plane-cluster"
export BP_CONTEXT="my-build-plane-cluster"
export OP_CONTEXT="my-observability-cluster"
Verify access to all clusters:
kubectl --context $CP_CONTEXT get nodes
kubectl --context $DP_CONTEXT get nodes
kubectl --context $BP_CONTEXT get nodes
kubectl --context $OP_CONTEXT get nodes
Install cert-manager (per cluster)
Run on each cluster that needs cert-manager:
helm upgrade --install cert-manager oci://quay.io/jetstack/charts/cert-manager \
--kube-context $CONTEXT \
--namespace cert-manager \
--create-namespace \
--set crds.enabled=true
Wait for cert-manager to be ready:
kubectl --context $CONTEXT wait --for=condition=available deployment/cert-manager -n cert-manager --timeout=120s
Replace $CONTEXT with $CP_CONTEXT, $DP_CONTEXT, $BP_CONTEXT, or $OP_CONTEXT as needed.
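Rather than repeating the two commands per cluster by hand, the per-cluster install can be wrapped in a small helper. This is a sketch; the `install_cert_manager` function name is ours, not part of OpenChoreo or cert-manager:

```shell
# Hypothetical helper: install cert-manager into one cluster context and
# wait for the controller deployment to become available.
install_cert_manager() {
  local context="$1"
  helm upgrade --install cert-manager oci://quay.io/jetstack/charts/cert-manager \
    --kube-context "$context" \
    --namespace cert-manager \
    --create-namespace \
    --set crds.enabled=true
  kubectl --context "$context" wait --for=condition=available \
    deployment/cert-manager -n cert-manager --timeout=120s
}

# Run once per plane cluster, e.g.:
# for ctx in "$CP_CONTEXT" "$DP_CONTEXT" "$BP_CONTEXT" "$OP_CONTEXT"; do
#   install_cert_manager "$ctx"
# done
```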
Step 1: Setup Control Plane Cluster
helm upgrade --install openchoreo-control-plane oci://ghcr.io/openchoreo/helm-charts/openchoreo-control-plane \
--version 0.8.0 \
--kube-context $CP_CONTEXT \
--namespace openchoreo-control-plane \
--create-namespace \
--set global.baseDomain=openchoreo.${DOMAIN} \
--set global.tls.enabled=true \
--set "backstage.ingress.tls[0].secretName=control-plane-tls" \
--set "backstage.ingress.tls[0].hosts[0]=openchoreo.${DOMAIN}" \
--set "openchoreoApi.ingress.tls[0].secretName=control-plane-tls" \
--set "openchoreoApi.ingress.tls[0].hosts[0]=api.openchoreo.${DOMAIN}" \
--set "thunder.ocIngress.tls[0].secretName=control-plane-tls" \
--set "thunder.ocIngress.tls[0].hosts[0]=thunder.openchoreo.${DOMAIN}" \
--set thunder.configuration.server.publicUrl=https://thunder.openchoreo.${DOMAIN} \
--set thunder.configuration.gateClient.hostname=thunder.openchoreo.${DOMAIN} \
--set thunder.configuration.gateClient.port=443 \
--set thunder.configuration.gateClient.scheme="https" \
--set clusterGateway.ingress.enabled=true \
--set clusterGateway.ingress.ingressClassName=traefik \
--set "clusterGateway.ingress.hosts[0].host=cluster-gateway.openchoreo.${DOMAIN}"
Wait for pods to start:
kubectl --context $CP_CONTEXT get pods -n openchoreo-control-plane -w
Configure TLS
- Standard (GKE, AKS, etc.)
- AWS EKS
Wait for LoadBalancer to get an external IP (press Ctrl+C once EXTERNAL-IP appears):
kubectl --context $CP_CONTEXT get svc openchoreo-traefik -n openchoreo-control-plane -w
EKS LoadBalancers are private by default and return a hostname instead of an IP.
Make the LoadBalancer internet-facing:
kubectl --context $CP_CONTEXT patch svc openchoreo-traefik -n openchoreo-control-plane \
-p '{"metadata":{"annotations":{"service.beta.kubernetes.io/aws-load-balancer-scheme":"internet-facing"}}}'
Wait for the new LoadBalancer to be provisioned (this may take 1-2 minutes). Press Ctrl+C once EXTERNAL-IP (hostname) appears:
kubectl --context $CP_CONTEXT get svc openchoreo-traefik -n openchoreo-control-plane -w
For DNS records, use the LoadBalancer hostname (CNAME) or resolve it to an IP.
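If your DNS provider cannot take a CNAME at that point, one way to look up the current addresses behind the ELB hostname is a small wrapper around kubectl and dig. A sketch; the `lb_hostname` helper name is ours:

```shell
# Hypothetical helper: print the LoadBalancer hostname of a Service.
lb_hostname() {
  local context="$1" svc="$2" ns="$3"
  kubectl --context "$context" get svc "$svc" -n "$ns" \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
}

# Resolve the ELB hostname to its current IPs. ELB IPs can rotate over time,
# so prefer a CNAME/ALIAS record where your DNS provider supports it.
# dig +short "$(lb_hostname "$CP_CONTEXT" openchoreo-traefik openchoreo-control-plane)"
```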
Create DNS records:
| Record | Value |
|---|---|
| openchoreo.$DOMAIN | Control Plane LoadBalancer IP |
| api.openchoreo.$DOMAIN | Control Plane LoadBalancer IP |
| thunder.openchoreo.$DOMAIN | Control Plane LoadBalancer IP |
| cluster-gateway.openchoreo.$DOMAIN | Control Plane LoadBalancer IP |
The cluster-gateway.openchoreo.$DOMAIN record is required for multi-cluster communication. Data Plane, Build Plane, and Observability Plane agents connect to this endpoint.
Configure TLS
- Using cert-manager
- Bring Your Own Certificates
- HTTP-01 Solver
- DNS-01 Solver
- Existing Issuer
kubectl --context $CP_CONTEXT apply -f - <<EOF
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt
  namespace: openchoreo-control-plane
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@${DOMAIN}
    privateKeySecretRef:
      name: letsencrypt-account-key
    solvers:
      - http01:
          ingress:
            ingressClassName: openchoreo-traefik
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: control-plane-tls
  namespace: openchoreo-control-plane
spec:
  secretName: control-plane-tls
  issuerRef:
    name: letsencrypt
    kind: Issuer
  dnsNames:
    - "openchoreo.${DOMAIN}"
    - "api.openchoreo.${DOMAIN}"
    - "thunder.openchoreo.${DOMAIN}"
EOF
Create secret with DNS provider credentials (example: Cloudflare):
kubectl --context $CP_CONTEXT create secret generic cloudflare-api-token \
--from-literal=api-token=YOUR_CLOUDFLARE_API_TOKEN \
-n openchoreo-control-plane
Create the Issuer and Certificate:
kubectl --context $CP_CONTEXT apply -f - <<EOF
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt
  namespace: openchoreo-control-plane
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@${DOMAIN}
    privateKeySecretRef:
      name: letsencrypt-account-key
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-api-token
              key: api-token
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: control-plane-tls
  namespace: openchoreo-control-plane
spec:
  secretName: control-plane-tls
  issuerRef:
    name: letsencrypt
    kind: Issuer
  dnsNames:
    - "openchoreo.${DOMAIN}"
    - "api.openchoreo.${DOMAIN}"
    - "thunder.openchoreo.${DOMAIN}"
EOF
kubectl --context $CP_CONTEXT apply -f - <<EOF
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: control-plane-tls
  namespace: openchoreo-control-plane
spec:
  secretName: control-plane-tls
  issuerRef:
    name: YOUR_ISSUER_NAME
    kind: Issuer # or ClusterIssuer
  dnsNames:
    - "openchoreo.${DOMAIN}"
    - "api.openchoreo.${DOMAIN}"
    - "thunder.openchoreo.${DOMAIN}"
EOF
Wait for certificate:
kubectl --context $CP_CONTEXT get certificate control-plane-tls -n openchoreo-control-plane -w
kubectl --context $CP_CONTEXT create secret tls control-plane-tls \
--cert=./path/to/cert.pem \
--key=./path/to/key.pem \
-n openchoreo-control-plane
Update Cluster Gateway Certificate for External Access
The cluster-gateway certificate must include the external hostname for remote planes to connect. Without this step, agents will fail with "certificate is valid for ... not cluster-gateway.openchoreo.$DOMAIN".
Add the external hostname to the cluster-gateway certificate:
kubectl --context $CP_CONTEXT patch certificate cluster-gateway-tls -n openchoreo-control-plane --type='json' \
-p="[{\"op\": \"add\", \"path\": \"/spec/dnsNames/-\", \"value\": \"cluster-gateway.openchoreo.${DOMAIN}\"}]"
Delete the existing secret to trigger certificate regeneration:
kubectl --context $CP_CONTEXT delete secret cluster-gateway-tls -n openchoreo-control-plane
Wait for the certificate to be reissued:
kubectl --context $CP_CONTEXT get certificate cluster-gateway-tls -n openchoreo-control-plane -w
Restart the cluster-gateway to pick up the new certificate:
kubectl --context $CP_CONTEXT rollout restart deployment cluster-gateway -n openchoreo-control-plane
kubectl --context $CP_CONTEXT rollout status deployment cluster-gateway -n openchoreo-control-plane --timeout=60s
Extract Server CA for Remote Planes
Remote planes (Data Plane, Build Plane, Observability Plane) need the cluster-gateway server CA certificate to establish secure connections.
Wait for the cluster-gateway CA ConfigMap to be ready:
kubectl --context $CP_CONTEXT wait --for=jsonpath='{.data.ca\.crt}' \
configmap/cluster-gateway-ca -n openchoreo-control-plane --timeout=120s
Extract the server CA certificate:
mkdir -p ./agent-cas
kubectl --context $CP_CONTEXT get configmap cluster-gateway-ca \
-n openchoreo-control-plane -o jsonpath='{.data.ca\.crt}' > ./agent-cas/server-ca.crt
Verify the certificate was extracted:
cat ./agent-cas/server-ca.crt | head -5
You should see -----BEGIN CERTIFICATE-----.
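Beyond eyeballing the PEM header, openssl can confirm the file is a parseable certificate and show its subject and expiry. A sketch; the `check_ca` helper name is ours:

```shell
# Hypothetical helper: fail unless the file is a valid PEM certificate,
# and print its subject and expiry for a quick sanity check.
check_ca() {
  openssl x509 -in "$1" -noout -subject -enddate
}

# check_ca ./agent-cas/server-ca.crt
```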
Step 2: Setup Data Plane Cluster
helm upgrade --install openchoreo-data-plane oci://ghcr.io/openchoreo/helm-charts/openchoreo-data-plane \
--version 0.8.0 \
--kube-context $DP_CONTEXT \
--namespace openchoreo-data-plane \
--create-namespace \
--set gateway.httpPort=19080 \
--set gateway.httpsPort=19443 \
--set-file clusterAgent.tls.serverCAValue=./agent-cas/server-ca.crt \
--set clusterAgent.tls.generateCerts=true \
--set clusterAgent.tls.caSecretName="" \
--set clusterAgent.tls.caSecretNamespace="" \
--set clusterAgent.serverUrl=wss://cluster-gateway.openchoreo.${DOMAIN}:443/ws
Configure TLS
- Standard (GKE, AKS, etc.)
- AWS EKS
Wait for the kgateway LoadBalancer to get an external IP (press Ctrl+C once EXTERNAL-IP appears):
kubectl --context $DP_CONTEXT get svc gateway-default -n openchoreo-data-plane -w
Make the LoadBalancer internet-facing:
kubectl --context $DP_CONTEXT patch svc gateway-default -n openchoreo-data-plane \
-p '{"metadata":{"annotations":{"service.beta.kubernetes.io/aws-load-balancer-scheme":"internet-facing"}}}'
Wait for the new LoadBalancer to be provisioned (this may take 1-2 minutes). Press Ctrl+C once EXTERNAL-IP (hostname) appears:
kubectl --context $DP_CONTEXT get svc gateway-default -n openchoreo-data-plane -w
Create DNS records pointing to the kgateway LoadBalancer IP:
| Record | Value |
|---|---|
| apps.openchoreo.$DOMAIN | Data Plane Gateway LoadBalancer IP |
| *.apps.openchoreo.$DOMAIN | Data Plane Gateway LoadBalancer IP |
Configure TLS
The data plane gateway uses wildcard hostnames (*.apps.openchoreo.$DOMAIN). HTTP-01 validation cannot issue wildcard certificates - only DNS-01 validation or bring-your-own certificates work.
- Using cert-manager (DNS-01)
- Bring Your Own Certificates
DNS-01 validation requires configuring your DNS provider. See cert-manager DNS01 docs for all supported providers.
Example for Cloudflare:
Create secret with API token:
kubectl --context $DP_CONTEXT create secret generic cloudflare-api-token \
--from-literal=api-token=YOUR_CLOUDFLARE_API_TOKEN \
-n openchoreo-data-plane
Create the Issuer and Certificate:
kubectl --context $DP_CONTEXT apply -f - <<EOF
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt
  namespace: openchoreo-data-plane
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@${DOMAIN}
    privateKeySecretRef:
      name: letsencrypt-account-key
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-api-token
              key: api-token
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: openchoreo-gateway-tls
  namespace: openchoreo-data-plane
spec:
  secretName: openchoreo-gateway-tls
  issuerRef:
    name: letsencrypt
    kind: Issuer
  dnsNames:
    - "apps.openchoreo.${DOMAIN}"
    - "*.apps.openchoreo.${DOMAIN}"
EOF
Wait for certificate:
kubectl --context $DP_CONTEXT get certificate openchoreo-gateway-tls -n openchoreo-data-plane -w
kubectl --context $DP_CONTEXT create secret tls openchoreo-gateway-tls \
--cert=./path/to/apps-cert.pem \
--key=./path/to/apps-key.pem \
-n openchoreo-data-plane
Configure Gateway for Public HTTPS
The kgateway's HTTPS listener needs to be updated to accept public hostnames. Patch the Gateway to add a listener for your domain:
kubectl --context $DP_CONTEXT patch gateway gateway-default -n openchoreo-data-plane --type='json' -p='[
  {
    "op": "add",
    "path": "/spec/listeners/-",
    "value": {
      "name": "https-public",
      "port": 9443,
      "protocol": "HTTPS",
      "hostname": "*.apps.openchoreo.'${DOMAIN}'",
      "allowedRoutes": {
        "namespaces": {
          "from": "All"
        }
      },
      "tls": {
        "mode": "Terminate",
        "certificateRefs": [
          {
            "kind": "Secret",
            "name": "openchoreo-gateway-tls",
            "namespace": "openchoreo-data-plane"
          }
        ]
      }
    }
  }
]'
Verify the new listener is programmed:
kubectl --context $DP_CONTEXT get gateway gateway-default -n openchoreo-data-plane -o jsonpath='{.status.listeners[?(@.name=="https-public")].conditions[?(@.type=="Programmed")].status}'
You should see True.
Register Data Plane
Wait for the cluster-agent certificate to be ready:
kubectl --context $DP_CONTEXT get certificate cluster-agent-dataplane-tls -n openchoreo-data-plane -w
Register the Data Plane with the Control Plane:
DP_CA_CERT=$(kubectl --context $DP_CONTEXT get secret cluster-agent-tls -n openchoreo-data-plane -o jsonpath='{.data.ca\.crt}' | base64 -d)
kubectl --context $CP_CONTEXT apply -f - <<EOF
apiVersion: openchoreo.dev/v1alpha1
kind: DataPlane
metadata:
  name: default
  namespace: default
spec:
  agent:
    enabled: true
    clientCA:
      value: |
$(echo "$DP_CA_CERT" | sed 's/^/        /')
  gateway:
    organizationVirtualHost: "openchoreoapis.internal"
    publicVirtualHost: "apps.openchoreo.${DOMAIN}"
  secretStoreRef:
    name: default
EOF
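The `sed` substitution inside the heredoc exists only to indent each line of the multi-line certificate so it nests correctly under the YAML `value: |` block scalar (eight spaces for this manifest's nesting depth). A standalone illustration of the trick, using a placeholder certificate body:

```shell
# Placeholder certificate text for illustration only.
CERT='-----BEGIN CERTIFICATE-----
(base64 lines here)
-----END CERTIFICATE-----'

# Prefix every line with 8 spaces so the block nests under "value: |";
# without the indentation the generated YAML would be invalid.
echo "$CERT" | sed 's/^/        /'
```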
Verify:
kubectl --context $CP_CONTEXT get dataplane -n default
kubectl --context $DP_CONTEXT get pods -n openchoreo-data-plane
kubectl --context $DP_CONTEXT logs -n openchoreo-data-plane -l app=cluster-agent --tail=10
You should see "connected to control plane" in the agent logs.
Step 3: Setup Build Plane Cluster (Optional)
The initial helm install may show an error for the push-buildpack-cache-images job. This is expected because the job runs before TLS is configured. The build plane components will deploy successfully despite this error.
helm upgrade --install openchoreo-build-plane oci://ghcr.io/openchoreo/helm-charts/openchoreo-build-plane \
--version 0.8.0 \
--kube-context $BP_CONTEXT \
--namespace openchoreo-build-plane \
--create-namespace \
--set clusterAgent.enabled=true \
--set global.baseDomain=openchoreo.${DOMAIN} \
--set-file clusterAgent.tls.serverCAValue=./agent-cas/server-ca.crt \
--set clusterAgent.tls.generateCerts=true \
--set clusterAgent.tls.caSecretName="" \
--set clusterAgent.tls.caSecretNamespace="" \
--set clusterAgent.serverUrl=wss://cluster-gateway.openchoreo.${DOMAIN}:443/ws
Install Traefik for Registry Ingress
The Build Plane needs Traefik to expose the container registry with TLS:
helm repo add traefik https://traefik.github.io/charts
helm repo update
helm upgrade --install traefik traefik/traefik \
--kube-context $BP_CONTEXT \
--namespace openchoreo-build-plane \
--set ports.websecure.expose.default=true \
--set ports.web.expose.default=true \
--wait
Configure TLS
Wait for the Traefik LoadBalancer to get an external IP:
kubectl --context $BP_CONTEXT get svc traefik -n openchoreo-build-plane -w
Create DNS record pointing to the Build Plane Traefik LoadBalancer IP:
| Record | Value |
|---|---|
| registry.openchoreo.$DOMAIN | Build Plane Traefik LoadBalancer IP |
Configure TLS
The container registry must present a valid, trusted TLS certificate. The example below uses DNS-01 validation; you can also bring your own certificate.
- Using cert-manager (DNS-01)
- Bring Your Own Certificates
Create secret with DNS provider credentials:
kubectl --context $BP_CONTEXT create secret generic cloudflare-api-token \
--from-literal=api-token=YOUR_CLOUDFLARE_API_TOKEN \
-n openchoreo-build-plane
Create the Issuer and Certificate:
kubectl --context $BP_CONTEXT apply -f - <<EOF
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt
  namespace: openchoreo-build-plane
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@${DOMAIN}
    privateKeySecretRef:
      name: letsencrypt-account-key
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-api-token
              key: api-token
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: build-plane-tls
  namespace: openchoreo-build-plane
spec:
  secretName: build-plane-tls
  issuerRef:
    name: letsencrypt
    kind: Issuer
  dnsNames:
    - "registry.openchoreo.${DOMAIN}"
EOF
kubectl --context $BP_CONTEXT create secret tls build-plane-tls \
--cert=./path/to/registry-cert.pem \
--key=./path/to/registry-key.pem \
-n openchoreo-build-plane
Wait for certificate:
kubectl --context $BP_CONTEXT get certificate build-plane-tls -n openchoreo-build-plane -w
Configure Ingress for Registry
Create an Ingress resource to expose the registry:
kubectl --context $BP_CONTEXT apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: registry
  namespace: openchoreo-build-plane
  annotations:
    traefik.ingress.kubernetes.io/router.tls: "true"
spec:
  ingressClassName: traefik
  tls:
    - hosts:
        - registry.openchoreo.${DOMAIN}
      secretName: build-plane-tls
  rules:
    - host: registry.openchoreo.${DOMAIN}
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: registry
                port:
                  number: 5000
EOF
Verify registry is accessible:
curl -I https://registry.openchoreo.${DOMAIN}/v2/
You should get a 200 OK response.
Upgrade with TLS - After the certificate and Ingress are ready:
helm upgrade --install openchoreo-build-plane oci://ghcr.io/openchoreo/helm-charts/openchoreo-build-plane \
--version 0.8.0 \
--kube-context $BP_CONTEXT \
--namespace openchoreo-build-plane \
--reuse-values \
--set global.tls.enabled=true \
--set global.tls.secretName=build-plane-tls
Register Build Plane
Wait for the cluster-agent certificate to be ready:
kubectl --context $BP_CONTEXT get certificate cluster-agent-buildplane-tls -n openchoreo-build-plane -w
Register the Build Plane with the Control Plane:
BP_CA_CERT=$(kubectl --context $BP_CONTEXT get secret cluster-agent-tls -n openchoreo-build-plane -o jsonpath='{.data.ca\.crt}' | base64 -d)
kubectl --context $CP_CONTEXT apply -f - <<EOF
apiVersion: openchoreo.dev/v1alpha1
kind: BuildPlane
metadata:
  name: default
  namespace: default
spec:
  agent:
    enabled: true
    clientCA:
      value: |
$(echo "$BP_CA_CERT" | sed 's/^/        /')
EOF
Verify:
kubectl --context $CP_CONTEXT get buildplane -n default
kubectl --context $BP_CONTEXT get pods -n openchoreo-build-plane
kubectl --context $BP_CONTEXT logs -n openchoreo-build-plane -l app=cluster-agent --tail=10
You should see "connected to control plane" in the agent logs.
Step 4: Setup Observability Plane Cluster (Optional)
- Minimal
- HA Mode
Single-node OpenSearch for development or small deployments.
helm upgrade --install openchoreo-observability-plane oci://ghcr.io/openchoreo/helm-charts/openchoreo-observability-plane \
--version 0.8.0 \
--kube-context $OP_CONTEXT \
--namespace openchoreo-observability-plane \
--create-namespace \
--set openSearch.enabled=true \
--set openSearchCluster.enabled=false \
--set clusterAgent.enabled=true \
--set-file clusterAgent.tls.serverCAValue=./agent-cas/server-ca.crt \
--set clusterAgent.tls.generateCerts=true \
--set clusterAgent.tls.caSecretName="" \
--set clusterAgent.tls.caSecretNamespace="" \
--set clusterAgent.serverUrl=wss://cluster-gateway.openchoreo.${DOMAIN}:443/ws \
--timeout 10m
Clustered OpenSearch using the OpenSearch Operator for high availability.
Install OpenSearch Operator:
helm repo add opensearch-operator https://opensearch-project.github.io/opensearch-k8s-operator/
helm repo update
helm upgrade --install opensearch-operator opensearch-operator/opensearch-operator \
--create-namespace \
--kube-context $OP_CONTEXT \
--namespace openchoreo-observability-plane \
--version 2.8.0
Wait for the operator to be ready:
kubectl --context $OP_CONTEXT wait --for=condition=available \
deployment/opensearch-operator-controller-manager \
-n openchoreo-observability-plane --timeout=120s
Install Observability Plane:
helm upgrade --install openchoreo-observability-plane oci://ghcr.io/openchoreo/helm-charts/openchoreo-observability-plane \
--version 0.8.0 \
--kube-context $OP_CONTEXT \
--namespace openchoreo-observability-plane \
--set clusterAgent.enabled=true \
--set-file clusterAgent.tls.serverCAValue=./agent-cas/server-ca.crt \
--set clusterAgent.tls.generateCerts=true \
--set clusterAgent.tls.caSecretName="" \
--set clusterAgent.tls.caSecretNamespace="" \
--set clusterAgent.serverUrl=wss://cluster-gateway.openchoreo.${DOMAIN}:443/ws \
--timeout 10m
Configure Cross-Cluster Observer Access
Get observability cluster node IP for the observer URL:
OP_NODE_IP=$(kubectl --context $OP_CONTEXT get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}')
echo "Observer Node IP: $OP_NODE_IP"
Use InternalIP if clusters are in the same VPC/VNet. Use ExternalIP if clusters are in different networks.
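The same jsonpath works for either address type; a small helper makes the choice explicit. This is a sketch, and the `node_address` function name is ours:

```shell
# Hypothetical helper: print the first node's address of a given type
# ("InternalIP" or "ExternalIP") from the observability cluster.
node_address() {
  local type="$1"
  kubectl --context "$OP_CONTEXT" get nodes \
    -o jsonpath="{.items[0].status.addresses[?(@.type==\"${type}\")].address}"
}

# Same VPC/VNet:       OP_NODE_IP=$(node_address InternalIP)
# Different networks:  OP_NODE_IP=$(node_address ExternalIP)
```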
Expose observer service via NodePort for cross-cluster access:
kubectl --context $OP_CONTEXT patch svc observer -n openchoreo-observability-plane --type='json' \
-p='[{"op": "replace", "path": "/spec/type", "value": "NodePort"}, {"op": "add", "path": "/spec/ports/0/nodePort", "value": 30880}]'
Register Observability Plane
Wait for the cluster-agent certificate to be ready:
kubectl --context $OP_CONTEXT get certificate cluster-agent-observabilityplane-tls -n openchoreo-observability-plane -w
Register the Observability Plane with the Control Plane:
OP_CA_CERT=$(kubectl --context $OP_CONTEXT get secret cluster-agent-tls -n openchoreo-observability-plane -o jsonpath='{.data.ca\.crt}' | base64 -d)
kubectl --context $CP_CONTEXT apply -f - <<EOF
apiVersion: openchoreo.dev/v1alpha1
kind: ObservabilityPlane
metadata:
  name: default
  namespace: default
spec:
  agent:
    enabled: true
    clientCA:
      value: |
$(echo "$OP_CA_CERT" | sed 's/^/        /')
  observerURL: http://${OP_NODE_IP}:30880
EOF
Link the Data Plane (and Build Plane if installed) to use observability:
kubectl --context $CP_CONTEXT patch dataplane default -n default --type merge -p '{"spec":{"observabilityPlaneRef":"default"}}'
kubectl --context $CP_CONTEXT patch buildplane default -n default --type merge -p '{"spec":{"observabilityPlaneRef":"default"}}'
Verify:
kubectl --context $CP_CONTEXT get observabilityplane -n default
kubectl --context $OP_CONTEXT get pods -n openchoreo-observability-plane
kubectl --context $OP_CONTEXT logs -n openchoreo-observability-plane -l app=cluster-agent --tail=10
You should see "connected to control plane" in the agent logs.
DNS Records Summary
Here's a complete list of all DNS records required for multi-cluster setup:
| Record | Points To | Cluster |
|---|---|---|
| openchoreo.$DOMAIN | Control Plane Traefik LoadBalancer IP | Control Plane |
| api.openchoreo.$DOMAIN | Control Plane Traefik LoadBalancer IP | Control Plane |
| thunder.openchoreo.$DOMAIN | Control Plane Traefik LoadBalancer IP | Control Plane |
| cluster-gateway.openchoreo.$DOMAIN | Control Plane Traefik LoadBalancer IP | Control Plane |
| apps.openchoreo.$DOMAIN | Data Plane kgateway LoadBalancer IP | Data Plane |
| *.apps.openchoreo.$DOMAIN | Data Plane kgateway LoadBalancer IP | Data Plane |
| registry.openchoreo.$DOMAIN | Build Plane Traefik LoadBalancer IP | Build Plane (optional) |

Registry DNS must point to the Build Plane Traefik LoadBalancer, not the Control Plane.
Access URLs
| Service | URL |
|---|---|
| Console | https://openchoreo.$DOMAIN |
| API | https://api.openchoreo.$DOMAIN |
| Deployed Apps | https://<environment>.apps.openchoreo.$DOMAIN:9443/<component-name>/... |
| Registry | https://registry.openchoreo.$DOMAIN (if Build Plane) |
The kgateway exposes HTTPS on port 9443. Deployed applications are accessed via this port (e.g., https://development.apps.openchoreo.example.com:9443/my-service/...).
Default credentials: admin@openchoreo.dev / Admin@123
Verify Installation
After completing the setup, verify all planes are connected:
# Check all planes are registered
kubectl --context $CP_CONTEXT get dataplane,buildplane,observabilityplane -n default
# Check agent connectivity (should show "connected to control plane")
kubectl --context $DP_CONTEXT logs -n openchoreo-data-plane -l app=cluster-agent --tail=5
kubectl --context $BP_CONTEXT logs -n openchoreo-build-plane -l app=cluster-agent --tail=5
kubectl --context $OP_CONTEXT logs -n openchoreo-observability-plane -l app=cluster-agent --tail=5
Test the console is accessible:
curl -I https://openchoreo.${DOMAIN}
Next Steps
- Deploy your first component
- Review Operations Guide for maintenance procedures
Cleanup
Delete plane registrations from control plane:
kubectl --context $CP_CONTEXT delete dataplane default -n default 2>/dev/null
kubectl --context $CP_CONTEXT delete buildplane default -n default 2>/dev/null
kubectl --context $CP_CONTEXT delete observabilityplane default -n default 2>/dev/null
Data Plane Cluster:
helm uninstall openchoreo-data-plane -n openchoreo-data-plane --kube-context $DP_CONTEXT
kubectl --context $DP_CONTEXT delete namespace openchoreo-data-plane cert-manager
Build Plane Cluster (if installed):
helm uninstall traefik -n openchoreo-build-plane --kube-context $BP_CONTEXT
helm uninstall openchoreo-build-plane -n openchoreo-build-plane --kube-context $BP_CONTEXT
kubectl --context $BP_CONTEXT delete namespace openchoreo-build-plane cert-manager
Observability Plane Cluster (if installed):
helm uninstall openchoreo-observability-plane -n openchoreo-observability-plane --kube-context $OP_CONTEXT
helm uninstall opensearch-operator -n openchoreo-observability-plane --kube-context $OP_CONTEXT 2>/dev/null
kubectl --context $OP_CONTEXT delete namespace openchoreo-observability-plane
Control Plane Cluster (last):
helm uninstall openchoreo-control-plane -n openchoreo-control-plane --kube-context $CP_CONTEXT
kubectl --context $CP_CONTEXT delete namespace openchoreo-control-plane cert-manager
Delete CRDs from control plane:
kubectl --context $CP_CONTEXT get crd -o name | grep -E '\.openchoreo\.dev$' | xargs -r kubectl --context $CP_CONTEXT delete
Clean up extracted CA files:
rm -rf ./agent-cas
Troubleshooting
Agent not connecting
Check agent logs in data plane:
kubectl --context $DP_CONTEXT logs -n openchoreo-data-plane -l app=cluster-agent --tail=50
Check cluster-gateway logs in control plane:
kubectl --context $CP_CONTEXT logs -n openchoreo-control-plane -l app=cluster-gateway --tail=50
Common issues:
- "connection refused": Control plane cluster-gateway not ready or DNS not configured
- "certificate signed by unknown authority": Server CA not correctly configured - verify
./agent-cas/server-ca.crtwas extracted correctly - "certificate is valid for ... not cluster-gateway.openchoreo.$DOMAIN": Cluster gateway certificate doesn't include external hostname - follow the "Update Cluster Gateway Certificate" step
- "WebSocket connection failed": Network connectivity between clusters or firewall blocking port 443
Cluster Gateway Ingress not working
Verify the IngressRouteTCP was created:
kubectl --context $CP_CONTEXT get ingressroutetcp -n openchoreo-control-plane
Check Traefik logs:
kubectl --context $CP_CONTEXT logs -n openchoreo-control-plane -l app.kubernetes.io/name=traefik --tail=50
Verify DNS resolves to the LoadBalancer IP:
dig cluster-gateway.openchoreo.${DOMAIN}
Certificate not issuing
kubectl --context $CONTEXT describe certificate <name> -n <namespace>
kubectl --context $CONTEXT get challenges -A
DNS resolution issues
Verify DNS records are propagated:
dig openchoreo.${DOMAIN}
dig api.openchoreo.${DOMAIN}
dig apps.openchoreo.${DOMAIN}
dig cluster-gateway.openchoreo.${DOMAIN}
dig registry.openchoreo.${DOMAIN}
Registry not accessible
Verify the registry Ingress:
kubectl --context $BP_CONTEXT get ingress -n openchoreo-build-plane
Test registry connectivity:
curl -v https://registry.openchoreo.${DOMAIN}/v2/
Build workflow fails with registry errors
Check if the registry certificate is valid and accessible from within the build plane:
kubectl --context $BP_CONTEXT run -it --rm curl --image=curlimages/curl --restart=Never -- \
curl -v https://registry.openchoreo.${DOMAIN}/v2/
Deployed apps not accessible
Check the kgateway service is running and has an external IP:
kubectl --context $DP_CONTEXT get svc gateway-default -n openchoreo-data-plane
kubectl --context $DP_CONTEXT get pods -n openchoreo-data-plane -l app.kubernetes.io/name=gateway-default
Verify the gateway certificate is ready:
kubectl --context $DP_CONTEXT get certificate openchoreo-gateway-tls -n openchoreo-data-plane
Test connectivity on port 9443:
curl -v https://<environment>.apps.openchoreo.${DOMAIN}:9443/<component-name>/...