Multi-Cluster Connectivity
When deploying OpenChoreo across multiple Kubernetes clusters (e.g., separate Control Plane and Data Plane clusters), you must explicitly establish trust between them. This guide covers the step-by-step process of exchanging certificates to secure the WebSocket connection between planes.
Overview
The OpenChoreo Control Plane runs a Cluster Gateway that listens for incoming WebSocket connections from remote planes (Data Plane, Build Plane, Observability Plane). This connection is secured using Mutual TLS (mTLS).
- Server Trust: The remote plane must trust the Control Plane's CA certificate to verify the Cluster Gateway's identity.
- Client Authentication: The remote plane's Cluster Agent generates its own client certificate using a local self-signed issuer. This certificate must be registered with the Control Plane to allow the connection.
In multi-cluster deployments, the agent's client certificate is self-signed because the Control Plane's CA private key cannot be shared across clusters. The Control Plane trusts the agent because you explicitly register the agent's certificate in the corresponding DataPlane, BuildPlane, or ObservabilityPlane resource.
Prerequisites
- Control Plane installed in a primary cluster
- Remote Cluster (Data, Build, or Observability) where the remote plane will be installed
- kubectl context configured for both clusters
- cert-manager installed on both clusters (Control Plane for gateway certificate, remote cluster for agent certificate)
Step 0: Configure Cluster Gateway DNS Names
Before extracting the CA, ensure the Control Plane's Cluster Gateway certificate includes your public DNS name. By default, the certificate only includes internal cluster DNS names.
When installing or upgrading the Control Plane, add your public DNS name:
# Keep your other values alongside the --set flags below
helm upgrade --install openchoreo-control-plane oci://ghcr.io/openchoreo/helm-charts/openchoreo-control-plane \
  --namespace openchoreo-control-plane \
  --set "clusterGateway.tls.dnsNames[0]=cluster-gateway.openchoreo-control-plane.svc" \
  --set "clusterGateway.tls.dnsNames[1]=cluster-gateway.openchoreo-control-plane.svc.cluster.local" \
  --set "clusterGateway.tls.dnsNames[2]=cluster-gateway.openchoreo.${DOMAIN}"
Or in a values file:
clusterGateway:
  tls:
    dnsNames:
      - cluster-gateway.openchoreo-control-plane.svc
      - cluster-gateway.openchoreo-control-plane.svc.cluster.local
      - cluster-gateway.openchoreo.example.com # Your public DNS name
The serverUrl hostname in Step 2 must match one of the DNS names in the Cluster Gateway certificate. If not, TLS verification will fail with a certificate error.
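The hostname rule above can be checked before installing anything. The following is a minimal sketch in plain shell; url_host and host_in_dns_names are hypothetical helper names, not part of OpenChoreo, and the check is an exact string match that does not handle wildcard entries.

```shell
# Hypothetical helper: print the host part of a ws:// or wss:// URL.
url_host() {
  host=${1#*://}   # drop the scheme
  host=${host%%/*} # drop the path
  host=${host%%:*} # drop an optional :port
  printf '%s\n' "$host"
}

# Hypothetical helper: succeed if the URL's host equals one of the
# DNS names passed as the remaining arguments (exact match only).
host_in_dns_names() {
  want=$(url_host "$1")
  shift
  for name in "$@"; do
    [ "$name" = "$want" ] && return 0
  done
  return 1
}

DOMAIN="example.com"
if host_in_dns_names "wss://cluster-gateway.openchoreo.${DOMAIN}/ws" \
  cluster-gateway.openchoreo-control-plane.svc \
  cluster-gateway.openchoreo-control-plane.svc.cluster.local \
  "cluster-gateway.openchoreo.${DOMAIN}"; then
  echo "serverUrl host is covered by the configured dnsNames"
else
  echo "serverUrl host is missing from dnsNames" >&2
fi
```

Pass the same list you set under clusterGateway.tls.dnsNames; a miss here predicts the x509 hostname error described in Troubleshooting.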
Step 1: Extract Control Plane CA
The Control Plane generates a Certificate Authority (CA) used to sign the Cluster Gateway's serving certificate. Remote planes need this CA to verify they are connecting to the authentic Control Plane.
Run this command against your Control Plane cluster:
# Set your Control Plane context and namespace
export CP_CONTEXT="my-control-plane-cluster"
export CP_NAMESPACE="openchoreo-control-plane"
# Extract the CA certificate from the ConfigMap
export CP_CA_CERT=$(kubectl --context $CP_CONTEXT get configmap cluster-gateway-ca \
-n $CP_NAMESPACE -o jsonpath='{.data.ca\.crt}')
# Verify the output (should start with -----BEGIN CERTIFICATE-----)
echo "$CP_CA_CERT" | head -n 5
Ensure you extract the CA from the cluster-gateway-ca ConfigMap, not the TLS Secret. The ConfigMap contains the public CA certificate in plain text, which is what the Helm chart expects.
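A quick way to catch the ConfigMap-versus-Secret mix-up is to confirm the extracted value is plain-text PEM rather than base64. This is a minimal sketch; is_pem_cert is a hypothetical helper that only inspects the first and last lines and does not validate the certificate itself.

```shell
# Hypothetical helper: succeed if the argument starts and ends with
# PEM certificate markers (one certificate or a chain).
is_pem_cert() {
  first=$(printf '%s\n' "$1" | sed -n '1p')
  last=$(printf '%s\n' "$1" | sed -n '$p')
  [ "$first" = "-----BEGIN CERTIFICATE-----" ] &&
    [ "$last" = "-----END CERTIFICATE-----" ]
}

# CP_CA_CERT is the variable exported in Step 1.
if is_pem_cert "${CP_CA_CERT:-}"; then
  echo "CP_CA_CERT looks like a PEM certificate"
else
  echo "CP_CA_CERT is empty or not plain-text PEM" >&2
fi
```

If openssl is installed, piping the value through openssl x509 -noout -subject -enddate gives a stronger check, because it actually parses the certificate.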
Step 2: Install Remote Plane with CA
When installing a remote plane (e.g., Data Plane), pass the extracted CA certificate to the Helm chart. This configures the Cluster Agent to trust your Control Plane and to use a locally generated client certificate.
Example: Data Plane Installation
First, save the extracted CA certificate to a file:
# Save the CA certificate to a file
echo "$CP_CA_CERT" > ./server-ca.crt
Then install the Data Plane:
# Set your Data Plane context
export DP_CONTEXT="my-data-plane-cluster"
export DOMAIN="example.com"
helm upgrade --install openchoreo-data-plane oci://ghcr.io/openchoreo/helm-charts/openchoreo-data-plane \
--version <version> \
--kube-context $DP_CONTEXT \
--namespace openchoreo-data-plane \
--create-namespace \
--set clusterAgent.enabled=true \
--set clusterAgent.serverUrl="wss://cluster-gateway.openchoreo.${DOMAIN}/ws" \
--set clusterAgent.tls.enabled=true \
--set clusterAgent.tls.generateCerts=true \
--set-file clusterAgent.tls.serverCAValue=./server-ca.crt \
--set clusterAgent.tls.caSecretName="" \
--set clusterAgent.tls.caSecretNamespace=""
Key Parameters
| Parameter | Description |
|---|---|
| clusterAgent.serverUrl | The public WebSocket URL of your Control Plane's Cluster Gateway (e.g., wss://cluster-gateway.openchoreo.example.com/ws) |
| clusterAgent.tls.generateCerts | Set to true to generate client certificates locally instead of copying them from the Control Plane |
| clusterAgent.tls.serverCAValue | The CA certificate file from Step 1. Used to verify the Cluster Gateway's identity. Use --set-file for proper PEM formatting. |
| clusterAgent.tls.caSecretName | Set to empty ("") to use a self-signed issuer for generating the agent's client certificate |
| clusterAgent.tls.caSecretNamespace | Set to empty ("") along with caSecretName |
In multi-cluster deployments, the remote plane cannot access the Control Plane's CA private key (only the public certificate is available). Therefore, the agent generates its own client certificate using a local self-signed issuer. This certificate is then registered with the Control Plane in Step 4, establishing mutual trust.
Step 3: Extract Agent Client Certificate
After the remote plane is installed, its cert-manager will generate a client certificate for the Cluster Agent. You need to extract this certificate to register the plane.
Run this command against your Remote Cluster (Data/Build/Observability):
# Set your Remote Plane context and namespace
export REMOTE_CONTEXT="my-data-plane-cluster"
export REMOTE_NAMESPACE="openchoreo-data-plane"
# Wait for the certificate to be ready
kubectl --context $REMOTE_CONTEXT wait --for=condition=Ready \
certificate/cluster-agent-dataplane-tls -n $REMOTE_NAMESPACE --timeout=120s
# Extract the agent's client certificate
export AGENT_CERT=$(kubectl --context $REMOTE_CONTEXT get secret cluster-agent-tls \
-n $REMOTE_NAMESPACE -o jsonpath='{.data.tls\.crt}' | base64 -d)
# Verify the output (should start with -----BEGIN CERTIFICATE-----)
echo "$AGENT_CERT" | head -n 5
If the certificate is not ready, check the cert-manager logs and the Certificate resource status:
kubectl --context $REMOTE_CONTEXT describe certificate cluster-agent-dataplane-tls -n $REMOTE_NAMESPACE
Step 4: Register Plane in Control Plane
Finally, register the remote plane by creating the appropriate custom resource (DataPlane, BuildPlane, or ObservabilityPlane) in the Control Plane cluster. You must embed the AGENT_CERT extracted in Step 3.
Example: Registering a Data Plane
# Create the DataPlane resource in the Control Plane
cat <<EOF | kubectl --context $CP_CONTEXT apply -f -
apiVersion: openchoreo.dev/v1alpha1
kind: DataPlane
metadata:
  name: production-us-east
  namespace: default # Or your organization's namespace
spec:
  agent:
    enabled: true
    clientCA:
      value: |
$(echo "$AGENT_CERT" | sed 's/^/        /')
  gateway:
    publicVirtualHost: "apps.openchoreo.${DOMAIN}"
    organizationVirtualHost: "openchoreoapis.internal"
  secretStoreRef:
    name: default
EOF
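The heredoc relies on sed to indent every line of the certificate so that it sits inside the value: | block scalar; YAML requires those lines to be indented deeper than the value key. The sketch below previews that step with a placeholder certificate (SAMPLE_CERT is made up for illustration) and assumes eight spaces, two levels below clientCA, as the target indentation.

```shell
# Placeholder standing in for $AGENT_CERT; not a real certificate.
SAMPLE_CERT='-----BEGIN CERTIFICATE-----
MIIBplaceholderplaceholder
-----END CERTIFICATE-----'

# Prefix every line with eight spaces so it nests under "value: |".
indented=$(printf '%s\n' "$SAMPLE_CERT" | sed 's/^/        /')

# Print the fragment instead of applying it, to inspect the nesting.
cat <<EOF
spec:
  agent:
    clientCA:
      value: |
$indented
EOF
```

Replacing kubectl apply with cat in the real command gives the same kind of preview for the full manifest before anything is sent to the cluster.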
Verification
Once registered, the Control Plane will accept connections from the agent. You can verify the connection by checking the agent logs:
kubectl --context $REMOTE_CONTEXT logs -n $REMOTE_NAMESPACE -l app.kubernetes.io/name=cluster-agent --tail=20
You should see a message indicating successful connection: Connected to control plane at wss://...
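Because the agent may take a few seconds to connect after registration, the log check can be wrapped in a retry loop. This is a minimal sketch; wait_for_agent is a hypothetical helper, the retry count and delay are arbitrary defaults, and it assumes REMOTE_CONTEXT and REMOTE_NAMESPACE are exported as in Step 3.

```shell
# Hypothetical helper: poll the agent logs until the success message
# appears. Usage: wait_for_agent [attempts] [delay_seconds]
wait_for_agent() {
  attempts=${1:-12}
  delay=${2:-5}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if kubectl --context "$REMOTE_CONTEXT" logs -n "$REMOTE_NAMESPACE" \
      -l app.kubernetes.io/name=cluster-agent --tail=50 2>/dev/null |
      grep -q "Connected to control plane"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example call (commented out so that loading this file has no side effects):
# wait_for_agent 12 5 && echo "agent connected" || echo "agent not connected" >&2
```

The loop only matches the message text quoted above; if the agent's log wording differs in your version, adjust the grep pattern accordingly.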
Other Plane Types
The same process applies to Build Plane and Observability Plane. The key differences are the namespace and plane-specific configuration.
Build Plane
helm upgrade --install openchoreo-build-plane oci://ghcr.io/openchoreo/helm-charts/openchoreo-build-plane \
--version <version> \
--kube-context $REMOTE_CONTEXT \
--namespace openchoreo-build-plane \
--create-namespace \
--set clusterAgent.enabled=true \
--set clusterAgent.serverUrl="wss://cluster-gateway.openchoreo.${DOMAIN}/ws" \
--set clusterAgent.tls.enabled=true \
--set clusterAgent.tls.generateCerts=true \
--set-file clusterAgent.tls.serverCAValue=./server-ca.crt \
--set clusterAgent.tls.caSecretName="" \
--set clusterAgent.tls.caSecretNamespace=""
Observability Plane
helm upgrade --install openchoreo-observability-plane oci://ghcr.io/openchoreo/helm-charts/openchoreo-observability-plane \
--version <version> \
--kube-context $REMOTE_CONTEXT \
--namespace openchoreo-observability-plane \
--create-namespace \
--set clusterAgent.enabled=true \
--set clusterAgent.serverUrl="wss://cluster-gateway.openchoreo.${DOMAIN}/ws" \
--set clusterAgent.tls.enabled=true \
--set clusterAgent.tls.generateCerts=true \
--set-file clusterAgent.tls.serverCAValue=./server-ca.crt \
--set clusterAgent.tls.caSecretName="" \
--set clusterAgent.tls.caSecretNamespace=""
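The install commands for the three plane types differ only in the release, chart, and namespace name. The sketch below makes that explicit with a hypothetical plane_install_cmd helper that prints the command rather than running it, so the <version> placeholder and the variables stay visible for review.

```shell
# Hypothetical helper: print the Helm command for a plane type
# (data, build, or observability) without executing it.
plane_install_cmd() {
  plane=$1
  case "$plane" in
    data | build | observability) ;;
    *)
      echo "unknown plane type: $plane" >&2
      return 1
      ;;
  esac
  release="openchoreo-${plane}-plane"
  cat <<EOF
helm upgrade --install ${release} oci://ghcr.io/openchoreo/helm-charts/${release} \\
  --version <version> \\
  --kube-context \$REMOTE_CONTEXT \\
  --namespace ${release} \\
  --create-namespace \\
  --set clusterAgent.enabled=true \\
  --set clusterAgent.serverUrl="wss://cluster-gateway.openchoreo.\${DOMAIN}/ws" \\
  --set clusterAgent.tls.enabled=true \\
  --set clusterAgent.tls.generateCerts=true \\
  --set-file clusterAgent.tls.serverCAValue=./server-ca.crt \\
  --set clusterAgent.tls.caSecretName="" \\
  --set clusterAgent.tls.caSecretNamespace=""
EOF
}

plane_install_cmd build
```

Any plane-specific values still need to be added by hand to the printed command.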
Troubleshooting
Certificate Not Ready
If the certificate fails to become ready:
# Check certificate status (run against the remote cluster)
kubectl --context $REMOTE_CONTEXT describe certificate cluster-agent-dataplane-tls -n openchoreo-data-plane
# Check cert-manager logs
kubectl --context $REMOTE_CONTEXT logs -n cert-manager deployment/cert-manager
Common issues:
- Issuer not found: Ensure the Helm chart completed successfully and the self-signed issuer was created
- Permission denied: Check that cert-manager has permissions to create secrets in the namespace
Agent Cannot Connect
If the agent fails to connect to the Control Plane:
# Check agent logs for connection errors (run against the remote cluster)
kubectl --context $REMOTE_CONTEXT logs -n openchoreo-data-plane -l app=cluster-agent
# Verify the server CA ConfigMap exists
kubectl --context $REMOTE_CONTEXT get configmap cluster-gateway-ca -n openchoreo-data-plane
Common issues:
- Certificate verification failed: Ensure serverCAValue contains the correct CA certificate from the Control Plane
- Connection refused: Verify the serverUrl is accessible from the remote cluster and the Cluster Gateway ingress is properly configured
TLS Certificate Error (x509: certificate is valid for X, not Y)
If you see an error like x509: certificate is valid for cluster-gateway.openchoreo-control-plane.svc, not cluster-gateway.openchoreo.example.com, the Cluster Gateway's server certificate doesn't include your public DNS name. Update the Control Plane with the correct DNS names (see Step 0):
# Check current certificate DNS names (run against the Control Plane cluster)
kubectl --context $CP_CONTEXT get certificate cluster-gateway-tls -n openchoreo-control-plane -o jsonpath='{.spec.dnsNames}'
# After updating the Helm values, the certificate will be re-issued
# You may need to delete the old certificate to force regeneration
kubectl --context $CP_CONTEXT delete certificate cluster-gateway-tls -n openchoreo-control-plane
After updating, re-extract the CA certificate (Step 1) as the certificate may have been regenerated.
Agent Certificate Not Trusted
If the Control Plane rejects the agent's connection:
# Check Control Plane controller-manager logs
kubectl --context $CP_CONTEXT logs -n openchoreo-control-plane deployment/controller-manager
Ensure the DataPlane, BuildPlane, or ObservabilityPlane resource has the correct agent certificate in spec.agent.clientCA.value.
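One way to confirm this is to compare the certificate stored in the resource with the one the agent actually holds. The sketch below uses hypothetical helpers (normalize_pem, pem_equal) that ignore whitespace differences introduced by YAML indentation. The commented kubectl call assumes the resource is addressable as dataplane and reuses the example name and namespace from Step 4.

```shell
# Hypothetical helper: strip all whitespace so indentation and
# line-ending differences do not affect the comparison.
normalize_pem() {
  printf '%s' "$1" | tr -d ' \t\r\n'
}

# Hypothetical helper: succeed only if both values are non-empty
# and identical after normalization.
pem_equal() {
  [ -n "$1" ] && [ "$(normalize_pem "$1")" = "$(normalize_pem "$2")" ]
}

# Example usage (commented out; resource name and namespace are assumptions):
# REGISTERED=$(kubectl --context "$CP_CONTEXT" get dataplane production-us-east \
#   -n default -o jsonpath='{.spec.agent.clientCA.value}')
# pem_equal "$AGENT_CERT" "$REGISTERED" && echo "match" || echo "mismatch" >&2
```

A mismatch usually means the agent certificate was re-issued after registration, in which case repeating Steps 3 and 4 brings the two back in sync.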