Multi-Cluster Kubernetes — Federation, Submariner, and Cluster API
Running a single Kubernetes cluster is straightforward until your application needs to survive a regional outage, comply with data sovereignty laws, or serve users on three continents without 200ms latency. At that point, you need multiple clusters — and that changes everything about how you deploy, network, and manage workloads.
Why Multi-Cluster?
Before adding the complexity of multiple clusters, make sure you actually need them. Here are the legitimate reasons:
| Reason | Example | Single-Cluster Alternative? |
|---|---|---|
| Disaster Recovery | Region goes down, app stays up | Multi-AZ (partial protection only) |
| Data Sovereignty | EU data must stay in EU region | Namespace isolation (does not satisfy regulators) |
| Latency | Users in Asia, US, Europe need <50ms | CDN helps for static, not for APIs |
| Blast Radius | Bad deploy kills one cluster, not all | Canary deploys (still single point of failure) |
| Team Isolation | Platform team vs application teams | Namespaces + RBAC (works for smaller orgs) |
| Hybrid/Multi-Cloud | Avoid vendor lock-in, use best-of-breed | Not possible with single cluster |
If your answer is "we just want it for fun" — stop here. Multi-cluster is operationally expensive.
Multi-Cluster Patterns
The pattern you choose dictates your architecture, tooling, and recovery time:
| Pattern | Description | RTO | Complexity | Use Case |
|---|---|---|---|---|
| Active-Active | All clusters serve traffic simultaneously | ~0 | Very High | Global apps, zero downtime |
| Active-Passive | Standby cluster takes over on failure | Minutes | Medium | DR for critical apps |
| Hub-Spoke | Central management cluster, edge workload clusters | Varies | High | Edge computing, retail |
| Federated | Clusters loosely coupled, shared config | Varies | High | Multi-team, multi-region |
KubeFed — Kubernetes Federation v2
KubeFed lets you distribute resources across clusters from a single control plane: you define a FederatedDeployment, and KubeFed pushes it to member clusters. Note that the KubeFed project has since been archived by its maintainers, so treat it as an illustration of the federation pattern rather than a tool to adopt for new builds.
Install KubeFed
```bash
# Add the KubeFed Helm repo
helm repo add kubefed-charts https://raw.githubusercontent.com/kubernetes-sigs/kubefed/master/charts
helm repo update

# Install KubeFed in the host cluster
helm install kubefed kubefed-charts/kubefed \
  --namespace kube-federation-system \
  --create-namespace \
  --set controllermanager.replicaCount=2

# Join member clusters
kubefedctl join cluster-us-east \
  --cluster-context=us-east-ctx \
  --host-cluster-context=hub-ctx \
  --v=2

kubefedctl join cluster-eu-west \
  --cluster-context=eu-west-ctx \
  --host-cluster-context=hub-ctx \
  --v=2

# Verify joined clusters
kubectl get kubefedclusters -n kube-federation-system
```
Federated Deployment
```yaml
apiVersion: types.kubefed.io/v1beta1
kind: FederatedDeployment
metadata:
  name: payment-api
  namespace: production
spec:
  template:
    metadata:
      labels:
        app: payment-api
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: payment-api
      template:
        metadata:
          labels:
            app: payment-api   # pod labels must match the selector
        spec:
          containers:
          - name: payment-api
            image: myregistry/payment-api:v2.1.0
            resources:
              requests:
                cpu: 100m
                memory: 128Mi
  placement:
    clusters:
    - name: cluster-us-east
    - name: cluster-eu-west
  overrides:
  - clusterName: cluster-eu-west
    clusterOverrides:
    - path: "/spec/replicas"
      value: 5  # More replicas in EU due to higher traffic
```
This creates a Deployment in both clusters, with 3 replicas in US-East and 5 in EU-West.
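Instead of hard-coding per-cluster replica counts in overrides, KubeFed can also split a total replica budget by weight using a ReplicaSchedulingPreference. A sketch, reusing the cluster names from the example above (the weights and total are illustrative values):

```yaml
apiVersion: scheduling.kubefed.io/v1alpha1
kind: ReplicaSchedulingPreference
metadata:
  name: payment-api        # must match the FederatedDeployment name
  namespace: production
spec:
  targetKind: FederatedDeployment
  totalReplicas: 8
  clusters:
    cluster-us-east:
      weight: 3            # receives roughly 3/8 of the replicas
    cluster-eu-west:
      weight: 5            # receives roughly 5/8 of the replicas
```

The controller rebalances replicas across clusters as the weights or total change, which is easier to reason about than maintaining a separate override per cluster.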
Submariner — Cross-Cluster Networking
The hardest part of multi-cluster is networking. Pods in Cluster A cannot reach pods in Cluster B by default — their Pod CIDRs are isolated. Submariner solves this by creating encrypted tunnels between clusters.
Install Submariner
```bash
# Install subctl CLI
curl -Ls https://get.submariner.io | VERSION=v0.18.0 bash

# Deploy the broker (on hub cluster)
subctl deploy-broker --kubeconfig hub-kubeconfig

# Join clusters to the broker
subctl join --kubeconfig us-east-kubeconfig broker-info.subm \
  --clusterid cluster-us-east \
  --natt=false

subctl join --kubeconfig eu-west-kubeconfig broker-info.subm \
  --clusterid cluster-eu-west \
  --natt=false

# Verify connectivity
subctl show connections
subctl verify --kubecontext us-east-ctx --tocontext eu-west-ctx --only connectivity
```
Export a Service Across Clusters
```bash
# In cluster-us-east, export the payment-api service
subctl export service payment-api -n production

# Pods in cluster-eu-west can now reach it via:
#   payment-api.production.svc.clusterset.local
```

From a pod in cluster-eu-west:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-cross-cluster
spec:
  restartPolicy: Never   # one-shot curl; avoids a crash loop once the command exits
  containers:
  - name: curl
    image: curlimages/curl:latest
    command: ["curl", "http://payment-api.production.svc.clusterset.local:8080/health"]
```
Cluster API — Lifecycle Management
Cluster API (CAPI) treats clusters as Kubernetes resources. You declare a cluster in YAML, and CAPI provisions the infrastructure, bootstraps the nodes, and manages the lifecycle — just like a Deployment manages pods.
Bootstrap a Management Cluster
```bash
# Install clusterctl
curl -L https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.7.0/clusterctl-linux-amd64 -o clusterctl
chmod +x clusterctl && sudo mv clusterctl /usr/local/bin/

# Initialize the management cluster with AWS provider
export AWS_REGION=us-east-1
export AWS_ACCESS_KEY_ID=<your-key>
export AWS_SECRET_ACCESS_KEY=<your-secret>
clusterctl init --infrastructure aws

# Generate a workload cluster manifest
clusterctl generate cluster prod-us-east \
  --kubernetes-version v1.29.0 \
  --control-plane-machine-count 3 \
  --worker-machine-count 5 \
  > prod-us-east-cluster.yaml

# Create the cluster
kubectl apply -f prod-us-east-cluster.yaml

# Watch cluster provisioning
kubectl get clusters -w
kubectl get machines
```
Cluster YAML Structure
```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: prod-us-east
  namespace: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/16"]
    services:
      cidrBlocks: ["10.96.0.0/12"]
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: prod-us-east-control-plane
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
    kind: AWSCluster
    name: prod-us-east
```
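The Cluster object above only defines the control plane and networking; worker nodes come from a MachineDeployment, which scales and rolls nodes the way a Deployment rolls pods. A sketch of what `clusterctl generate` produces for the 5 workers (the bootstrap and infrastructure template names are assumptions):

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: prod-us-east-md-0
  namespace: default
spec:
  clusterName: prod-us-east
  replicas: 5
  selector:
    matchLabels: {}
  template:
    spec:
      clusterName: prod-us-east
      version: v1.29.0
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: prod-us-east-md-0     # hypothetical template name
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
        kind: AWSMachineTemplate
        name: prod-us-east-md-0       # hypothetical template name
```

Scaling workers is then `kubectl scale machinedeployment prod-us-east-md-0 --replicas=10`, and bumping `version` performs a rolling node replacement.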
Multi-Cluster Service Mesh with Istio
Istio supports multi-cluster deployments where services in different clusters can communicate transparently through the mesh.
```bash
# Install Istio on both clusters with a shared mesh ID
istioctl install --context=us-east-ctx \
  --set values.global.meshID=goel-mesh \
  --set values.global.multiCluster.clusterName=cluster-us-east \
  --set values.global.network=network-us

istioctl install --context=eu-west-ctx \
  --set values.global.meshID=goel-mesh \
  --set values.global.multiCluster.clusterName=cluster-eu-west \
  --set values.global.network=network-eu

# Create remote secrets for cross-cluster discovery
istioctl create-remote-secret --context=us-east-ctx --name=cluster-us-east | \
  kubectl apply -f - --context=eu-west-ctx

istioctl create-remote-secret --context=eu-west-ctx --name=cluster-eu-west | \
  kubectl apply -f - --context=us-east-ctx

# Verify cross-cluster mesh
istioctl remote-clusters --context=us-east-ctx
```
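Once the mesh spans both clusters, you usually want traffic to stay local and only fail over cross-cluster when local endpoints are unhealthy. A sketch of a DestinationRule for that (the region values are assumptions and must match the clusters' `topology.kubernetes.io/region` node labels; outlier detection is required for locality failover to take effect):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: payment-api-failover
  namespace: production
spec:
  host: payment-api.production.svc.cluster.local
  trafficPolicy:
    loadBalancer:
      localityLbSetting:
        enabled: true
        failover:
        - from: us-east        # hypothetical region label
          to: eu-west          # hypothetical region label
    outlierDetection:
      consecutive5xxErrors: 3  # eject an endpoint after 3 consecutive 5xx
      interval: 30s
      baseEjectionTime: 60s
```

Without the outlierDetection block, Envoy has no health signal to trigger the failover, so requests keep going to the local (broken) endpoints.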
Liqo — Virtual Nodes and Resource Sharing
Liqo extends your cluster by creating virtual nodes that represent remote clusters. Pods scheduled on a virtual node actually run in the remote cluster — the scheduler does not even know the difference.
```bash
# Install Liqo on both clusters
curl --fail -LS "https://get.liqo.io" | bash

# Peer clusters together
liqoctl peer --remoteconfig remote-cluster-config.yaml

# Check virtual nodes
kubectl get nodes
# NAME                   STATUS  ROLES   AGE  VERSION
# node-1                 Ready   worker  30d  v1.29.0
# node-2                 Ready   worker  30d  v1.29.0
# liqo-cluster-eu-west   Ready   agent   5m   v1.29.0   <- virtual node!

# Enable offloading for a namespace so its pods can land on the virtual node
kubectl label namespace production liqo.io/scheduling-enabled=true
```
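For finer control than a namespace label, Liqo exposes offloading as a resource. A sketch of a NamespaceOffloading object, based on Liqo's offloading API (the strategy values here are assumptions to check against your Liqo version):

```yaml
apiVersion: offloading.liqo.io/v1alpha1
kind: NamespaceOffloading
metadata:
  name: offloading            # Liqo expects this fixed name per namespace
  namespace: production
spec:
  namespaceMappingStrategy: DefaultName
  podOffloadingStrategy: LocalAndRemote  # pods may run locally or on virtual nodes
```

`LocalAndRemote` leaves placement to the scheduler; a `Remote` strategy would force every pod in the namespace onto the peered cluster.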
Admiralty — Multi-Cluster Scheduling
Admiralty provides a clean multi-cluster scheduling layer. You create a "source" pod in one cluster, and Admiralty creates a "delegate" pod in a target cluster.
# Label the source cluster namespace
apiVersion: v1
kind: Namespace
metadata:
name: production
labels:
multicluster-scheduler: enabled
---
# ClusterTarget defines where pods can be scheduled
apiVersion: multicluster.admiralty.io/v1alpha1
kind: ClusterTarget
metadata:
name: cluster-eu-west
spec:
kubeconfigSecret:
name: eu-west-kubeconfig
---
# Annotate pods for multi-cluster scheduling
apiVersion: apps/v1
kind: Deployment
metadata:
name: payment-api
spec:
replicas: 6
template:
metadata:
annotations:
multicluster.admiralty.io/elect: "" # Enable multi-cluster scheduling
spec:
containers:
- name: payment-api
image: myregistry/payment-api:v2.1.0
Managing Config Across Clusters with ArgoCD ApplicationSet
ArgoCD ApplicationSet is the most practical way to deploy the same application across multiple clusters. One ApplicationSet generates an Application for each target cluster.
```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: payment-api
  namespace: argocd
spec:
  generators:
  - clusters:
      selector:
        matchLabels:
          environment: production
      values:
        revision: main
  template:
    metadata:
      name: 'payment-api-{{name}}'
    spec:
      project: default
      source:
        repoURL: https://github.com/myorg/k8s-manifests
        targetRevision: '{{values.revision}}'
        path: apps/payment-api/overlays/{{metadata.labels.region}}
      destination:
        server: '{{server}}'
        namespace: production
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
```
This generates one ArgoCD Application per cluster labeled environment: production. Each gets the right overlay for its region automatically.
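The cluster generator discovers clusters from the Secrets that register them with ArgoCD, so the labels it matches live on those Secrets. A sketch of such a registration (the server URL and credentials are placeholders):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: cluster-eu-west
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
    environment: production   # matched by the ApplicationSet's cluster selector
    region: eu-west           # consumed as {{metadata.labels.region}} in the path
type: Opaque
stringData:
  name: cluster-eu-west
  server: https://eu-west.example.com:6443   # hypothetical API server URL
  config: |
    {
      "bearerToken": "<redacted>",
      "tlsClientConfig": { "insecure": false }
    }
```

Adding a cluster to the fleet is then just creating one labeled Secret; the ApplicationSet picks it up and deploys automatically.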
Challenges and Trade-Offs
Multi-cluster is not free. Here is what you are signing up for:
| Challenge | Impact | Mitigation |
|---|---|---|
| Networking complexity | Cross-cluster DNS, firewall rules, encryption | Submariner, Cilium ClusterMesh |
| Data consistency | Distributed databases, eventual consistency | CockroachDB, Vitess, or single-region DB |
| Observability | Logs and metrics scattered across clusters | Thanos for Prometheus, centralized logging |
| Secret management | Syncing secrets across clusters | External Secrets Operator + Vault |
| Cost | More control planes, cross-region traffic | Start with 2 clusters, not 5 |
| Cognitive load | Engineers must understand multi-cluster context | Good abstractions, platform team |
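The secret-management mitigation in the table typically means each cluster runs External Secrets Operator and pulls from a shared Vault, so nothing is synced cluster-to-cluster. A sketch of the per-cluster resource (the store name and Vault path are assumptions):

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: payment-api-credentials
  namespace: production
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend             # hypothetical ClusterSecretStore
    kind: ClusterSecretStore
  target:
    name: payment-api-credentials   # Kubernetes Secret created in each cluster
  data:
  - secretKey: api-key
    remoteRef:
      key: secret/payment-api       # hypothetical Vault path
      property: api-key
```

Because every cluster pulls the same path, rotating the value in Vault propagates everywhere within the refresh interval, with no cross-cluster secret replication to secure.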
Start small. Run two clusters — one primary and one DR — before attempting active-active across three regions. Get your observability, GitOps, and networking right on two clusters first. The jump from two to five is easier than the jump from one to two.
Multi-cluster Kubernetes is where infrastructure engineering gets genuinely hard. There is no single tool that solves everything — you pick the tools that match your specific requirements. Federation for resource distribution, Submariner or Cilium for networking, Cluster API for provisioning, and ArgoCD for deployment. Layer them carefully, test your failovers, and always ask yourself: do we actually need this complexity?
