Zero Static Authority in Multi-Cluster GitOps

A GitOps controller — Argo CD, Flux — sits in one cluster and reconciles many. To reach the others it needs to authenticate to each remote kube-apiserver. The default way to do that is to store a credential: a bearer token, a client certificate, a kubeconfig, kept as a Kubernetes Secret in the control cluster. In Argo CD these are the objects labelled argocd.argoproj.io/secret-type: cluster.

Stop and look at what that Secret is. It is a long-lived credential, usually with broad rights, pointing at your entire fleet, sitting at rest in one place. It does not rotate on its own. It is in your backups. It is the single most valuable thing an attacker can find in your control cluster, because owning it is owning every cluster it can reach. The whole discipline of boring architecture is about not keeping liabilities you don’t have to — and a stored fleet-wide token is a liability you keep purely because issuing a fresh, short-lived one at the moment of use seemed like more work.

This piece is about doing that extra work, in order. The target is zero static authority: no long-lived credential that authenticates the controller to a remote cluster is ever written down. The identity the controller presents is minted on demand, expires within the hour (within minutes, for a JWT), and lives only in memory. The mechanism is SPIFFE for the identity model and SPIRE for issuing it.

Before any of that, a caveat: this is not a tool for a fleet you could count on one hand. SPIRE is a control plane you operate, not a library you import. It is exactly the kind of capability you should be slow to take on, the way a modular monolith beats microservices until you’ve earned the split. It earns its weight only when you genuinely run a fleet: many clusters, many teams, an audit requirement that a stolen Secret would fail. If a single shared cluster covers you, the cheapest secure credential is the one you never stand up the machinery to issue. The rest of this assumes you have actually arrived at the fleet.

The one rule up front

State it before the mechanics: the credential that crosses a cluster boundary must be short-lived and never stored. Everything below is in service of that one sentence. If at the end you still have a long-lived token written to disk somewhere on the authentication path, you have built complexity without buying the property you came for — the same wrong turn as adding a read replica to dodge a missing index.

“Short-lived and never stored” has a precise meaning here. The identity is an X.509 certificate whose default lifetime is one hour (or a JWT whose default lifetime is five minutes), fetched by the workload from a local socket with no token of its own to present, and rotated automatically at half its life. There is no secret to steal that is worth stealing tomorrow.

What SPIFFE and SPIRE actually are

SPIFFE is a spec; SPIRE is the reference implementation that issues the things the spec describes. Four nouns carry the whole design.

A SPIFFE ID is a URI: spiffe://<trust-domain>/<path>, e.g. spiffe://prod.example.org/ns/argocd/sa/application-controller. The authority part is the trust domain — one logical root of trust, one CA. The path identifies the workload. That is the name a workload proves it holds.
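To make the shape of the name concrete, here is a minimal sketch that splits a SPIFFE ID into its trust domain and path using only the Python standard library. The function and class names are illustrative, and the checks are a simplified subset of what the SPIFFE spec requires; it is not a substitute for a real SPIFFE library.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class SpiffeID(NamedTuple):
    trust_domain: str
    path: str


def parse_spiffe_id(uri: str) -> SpiffeID:
    """Split a SPIFFE ID into trust domain and path (simplified validation only)."""
    parts = urlsplit(uri)
    if parts.scheme != "spiffe":
        raise ValueError(f"not a SPIFFE ID: scheme is {parts.scheme!r}")
    if not parts.netloc:
        raise ValueError("SPIFFE ID has no trust domain")
    if parts.query or parts.fragment:
        raise ValueError("SPIFFE ID must not carry a query or fragment")
    return SpiffeID(trust_domain=parts.netloc, path=parts.path)


sid = parse_spiffe_id("spiffe://prod.example.org/ns/argocd/sa/application-controller")
print(sid.trust_domain)  # prod.example.org
print(sid.path)          # /ns/argocd/sa/application-controller
```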

An SVID (SPIFFE Verifiable Identity Document) is the proof. Two forms:

X.509-SVID — an X.509 certificate with the SPIFFE ID in the URI SAN (not the CN, which matters later). You do mTLS with it.
JWT-SVID — a signed JWT whose sub is the SPIFFE ID and whose aud you scope to the intended verifier. You present it as a bearer token.
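The claims a verifier cares about in a JWT-SVID can be sketched as follows. This is an illustration under loud assumptions: the token is a hand-built, unsigned placeholder, the audience name spoke-apiserver is hypothetical, and the code deliberately does not verify the signature, which a real verifier must do against the trust domain's keys before looking at any claim.

```python
import base64
import json


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def read_claims(token: str) -> dict:
    """Decode the payload of a compact JWT. Does NOT verify the signature."""
    _header, payload, _signature = token.split(".")
    return json.loads(b64url_decode(payload))


def claims_acceptable(claims: dict, expected_audience: str, now: float) -> bool:
    """Check sub is a SPIFFE ID, aud includes the verifier, and exp is in the future."""
    aud = claims.get("aud", [])
    audiences = [aud] if isinstance(aud, str) else list(aud)
    return (
        str(claims.get("sub", "")).startswith("spiffe://")
        and expected_audience in audiences
        and now < claims.get("exp", 0)
    )


# Hand-built placeholder token for illustration only; not a real SVID.
now = 1_700_000_000
demo_claims = {
    "sub": "spiffe://prod.example.org/ns/argocd/sa/application-controller",
    "aud": ["spoke-apiserver"],  # hypothetical audience name
    "exp": now + 300,            # five-minute lifetime
}
demo_token = ".".join([
    b64url_encode(json.dumps({"alg": "ES256", "typ": "JWT"}).encode()),
    b64url_encode(json.dumps(demo_claims).encode()),
    "placeholder-signature",
])

claims = read_claims(demo_token)
print(claims_acceptable(claims, "spoke-apiserver", now + 60))   # True
print(claims_acceptable(claims, "another-verifier", now + 60))  # False
print(claims_acceptable(claims, "spoke-apiserver", now + 301))  # False
```

The second and third calls show the two properties that make a leaked JWT-SVID low-value: it is useless against any verifier other than the one named in aud, and useless to anyone after exp.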

The lifetimes are the point. In SPIRE’s server config the defaults are default_x509_svid_ttl: 1h and default_jwt_svid_ttl: 5m, signed by a CA whose own ca_ttl defaults to 24h. These are short on purpose: a leaked SVID is worthless within the hour, which is exactly why it is safe to use it where you used to store a token.
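The arithmetic behind those defaults is worth seeing once. The sketch below hard-codes the three default TTLs quoted above and computes the half-life rotation point and how long a leaked credential stays usable; the helper names are illustrative, not SPIRE APIs.

```python
from datetime import timedelta

# Defaults quoted in the text above (SPIRE server configuration).
DEFAULT_X509_SVID_TTL = timedelta(hours=1)
DEFAULT_JWT_SVID_TTL = timedelta(minutes=5)
DEFAULT_CA_TTL = timedelta(hours=24)


def half_life(ttl: timedelta) -> timedelta:
    """Point in the lifetime at which rotation is expected to begin."""
    return ttl / 2


def usable_after_leak(ttl: timedelta, age_at_leak: timedelta) -> timedelta:
    """How long a credential remains valid if it leaks at the given age."""
    return max(ttl - age_at_leak, timedelta(0))


print(half_life(DEFAULT_X509_SVID_TTL))                                  # 0:30:00
print(usable_after_leak(DEFAULT_X509_SVID_TTL, timedelta(minutes=45)))   # 0:15:00
print(usable_after_leak(DEFAULT_JWT_SVID_TTL, timedelta(0)))             # 0:05:00
print(DEFAULT_CA_TTL // DEFAULT_X509_SVID_TTL)                           # 24
```

The last line is the ratio that matters operationally: with these defaults one CA lifetime spans 24 X.509-SVID lifetimes, so the signing key itself turns over daily.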

The thing that makes “never stored” possible is the Workload API, served by a SPIRE agent over a local Unix domain socket (the documented default is /tmp/spire-agent/public/api.sock; many deployments mount it at /run/spire/sockets/...). The workload connects to that socket and asks for its SVID. It presents no credential to do so. The agent identifies the caller by inspecting the calling process through the kernel — its PID, and from that its container, namespace, service account — and performs workload attestation against a set of selectors. Identity is established by what the process verifiably is, not by what secret it holds. That inversion is the entire reason there is nothing to store.

Underneath, SPIRE is two components and two attestations:

The spire-server holds the CA, signs SVIDs, and keeps the registration entries. Its CA is self-signed by default, or an intermediate chained to your corporate PKI via an UpstreamAuthority plugin — a decision you make once and live with.
The spire-agent runs on every node. It first proves the node to the server — node attestation — using a plugin like k8s_psat, which validates a projected service-account token through the Kubernetes TokenReview API. No shared secret is planted on the node. Then it does workload attestation for each local process before handing it an SVID.

A registration entry is the server-side record that ties a SPIFFE ID to a parent (the node/agent) and a set of selectors (namespace, service account, image). It is the policy: this workload, attested this way, gets this identity. Managing the lifecycle of these entries is real work, and it is where most of the ongoing operational cost lives.
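A toy model of that policy lookup may help. The sketch below assumes an entry applies when its parent matches the attesting agent and every selector on the entry is among the selectors the agent discovered for the calling process. The agent ID, the service account name, and the data structures are illustrative; this is not SPIRE's implementation or data model.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class RegistrationEntry:
    spiffe_id: str
    parent_id: str
    selectors: FrozenSet[str]


def matching_ids(entries: Iterable[RegistrationEntry], agent_id: str,
                 attested: Iterable[str]) -> List[str]:
    """Return the SPIFFE IDs whose entry selectors are all present on the workload."""
    attested_set = frozenset(attested)
    return [
        e.spiffe_id
        for e in entries
        if e.parent_id == agent_id and e.selectors <= attested_set
    ]


# Illustrative values only.
AGENT = "spiffe://prod.example.org/spire/agent/k8s_psat/hub/node-1"
entries = [
    RegistrationEntry(
        spiffe_id="spiffe://prod.example.org/ns/argocd/sa/application-controller",
        parent_id=AGENT,
        selectors=frozenset({"k8s:ns:argocd", "k8s:sa:application-controller"}),
    ),
]

# Selectors the agent derived from the calling process (hypothetical).
controller = ["k8s:ns:argocd", "k8s:sa:application-controller", "k8s:pod-name:ctrl-0"]
intruder = ["k8s:ns:default", "k8s:sa:default"]

print(matching_ids(entries, AGENT, controller))  # one SPIFFE ID
print(matching_ids(entries, AGENT, intruder))    # []
```

The subset test is why selector choice is the real policy decision: an entry with a single broad selector hands the identity to everything that shares it.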

Federation: trusting an SVID from another cluster

One trust domain is one root of trust. A multi-cluster fleet usually means multiple trust domains — one per cluster, or per region — and now a workload in domain A must be able to validate an SVID minted in domain B. That is federation, and its unit is the trust bundle: the public CA certs and JWT signing keys (a JWKS) that let you verify a domain’s SVIDs.

Each domain stands up a bundle endpoint — a URL serving its current bundle, the SPIFFE analogue of OIDC’s jwks_uri — and each peer polls it to stay current. Two profiles, and the difference is exactly the bootstrap question:

https_web — the endpoint is fronted by a Web-PKI certificate from a public CA. The peer validates it with the ordinary public trust store, so no initial bundle has to be exchanged out of band. Trust bootstraps off the existing Web PKI.
https_spiffe — the endpoint authenticates with its own X.509-SVID. To talk to it the first time, the peer must already hold that domain’s initial bundle. So the very first bundle has to arrive out of band, after which the latest fetched bundle is used going forward.

On a registration entry, federatesWith lists the foreign trust domains a workload is allowed to authenticate against; the agent then delivers those foreign bundles to the workload over the Workload API. Bundles refresh on the spiffe_refresh_hint (commonly around five minutes), and you publish a new signing key several refresh cycles before you start using it, so federated peers have already learned it when the rotation lands.
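The pre-publication rule is simple arithmetic, sketched here under assumptions: a refresh hint of five minutes and a safety margin of three refresh cycles. Both numbers are illustrative choices, not values mandated by SPIFFE or SPIRE.

```python
from datetime import datetime, timedelta, timezone


def earliest_safe_activation(published_at: datetime,
                             refresh_hint: timedelta,
                             cycles: int = 3) -> datetime:
    """Earliest time to start signing with a key that was published at published_at.

    Waiting several refresh cycles gives every polling peer a few chances
    to fetch the bundle that contains the new key.
    """
    return published_at + cycles * refresh_hint


published = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
activate = earliest_safe_activation(published, timedelta(minutes=5))
print(activate.isoformat())  # 2024-01-01T00:15:00+00:00
```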

Note the honest shape of “zero static authority”: federation relocates the bootstrap trust decision, it does not erase it. With https_spiffe you ship one initial bundle by hand; with https_web you lean on the public CA system. Either way the root of trust still comes from somewhere — what you’ve eliminated is the long-lived, broadly-scoped, per-cluster credential, not the one-time trust anchor. That is the right trade, but call it what it is.

Making the apiserver accept an SVID

Here is where intent meets the Kubernetes API, and where it is easy to be wrong. There are two mechanisms, and only one is clean today.

The JWT-SVID-as-OIDC path (the one that works). Kubernetes can be told to trust an external OIDC issuer. SPIRE ships an OIDC Discovery Provider that serves /.well-known/openid-configuration and a JWKS backed by SPIRE’s JWT-SVID signing keys. You point the remote (spoke) cluster’s apiserver at that provider as an OIDC issuer. The controller fetches a fresh JWT-SVID from its local Workload API and presents it as a bearer token; the apiserver validates it against SPIRE’s JWKS and maps a claim (typically sub, the SPIFFE ID) to a Kubernetes username, which you then authorize with ordinary RBAC. No Secret, no stored token: a fresh JWT each time the client needs a credential, expiring in minutes.

This is not a thought experiment. Red Hat’s OpenShift GitOps (4.20+) ships exactly this as a supported integration: Argo CD uses a client-go ExecCredential plugin (apiVersion: client.authentication.k8s.io/v1beta1) via execProviderConfig, which on each call reads the SPIFFE socket (SPIFFE_ENDPOINT_SOCKET), requests a JWT-SVID with the audience the spoke apiserver expects (SPIFFE_JWT_AUDIENCE), and hands it over. The credential is manufactured at the moment of use and discarded.
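The contract such a plugin has to satisfy is small: print an ExecCredential JSON document on stdout. The sketch below shows only that output shape. The fetch function is a stand-in that returns a placeholder token rather than talking to the Workload API, the default audience is hypothetical, and nothing here reflects how the OpenShift GitOps plugin is actually implemented.

```python
import json
import os
from datetime import datetime, timedelta, timezone
from typing import Tuple


def fake_fetch_jwt_svid(socket_path: str, audience: str) -> Tuple[str, datetime]:
    """Stand-in for a Workload API call. Returns a placeholder, not a real JWT-SVID."""
    expires_at = datetime.now(timezone.utc) + timedelta(minutes=5)
    return "header.payload.signature", expires_at


def exec_credential(token: str, expires_at: datetime) -> str:
    """Render the ExecCredential document that client-go reads from the plugin's stdout."""
    doc = {
        "apiVersion": "client.authentication.k8s.io/v1beta1",
        "kind": "ExecCredential",
        "status": {
            "token": token,
            # RFC 3339 timestamp; lets the client know when to ask again.
            "expirationTimestamp": expires_at.astimezone(timezone.utc)
                                             .strftime("%Y-%m-%dT%H:%M:%SZ"),
        },
    }
    return json.dumps(doc)


# Environment variable names are the ones mentioned in the text; defaults are assumptions.
socket_path = os.environ.get("SPIFFE_ENDPOINT_SOCKET", "unix:///tmp/spire-agent/public/api.sock")
audience = os.environ.get("SPIFFE_JWT_AUDIENCE", "spoke-apiserver")

token, expires_at = fake_fetch_jwt_svid(socket_path, audience)
print(exec_credential(token, expires_at))
```

Note what is absent: no file is read, no Secret is mounted, and the only inputs are a socket path and an audience string, neither of which is sensitive.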

The mTLS-with-X.509-SVID path (the trap). The instinct is to do straight mTLS: the controller presents its X.509-SVID, the apiserver trusts the SPIFFE bundle as a client CA, done. It is not done. Kubernetes client-cert auth derives the username from the certificate’s Subject CN and groups from the Subject O — and the SPIFFE ID lives in the URI SAN, which apiserver client-cert auth does not read. A raw SPIFFE X.509-SVID therefore does not map to a Kubernetes user. To use mTLS you insert a proxy — Ghostunnel or Envoy — in front of the apiserver: it terminates SPIFFE mTLS, validates the URI-SAN SPIFFE ID, and forwards with an identity the apiserver does understand. Ghostunnel consumes the Workload API directly for its own rotating SVIDs. This works, but it is a moving part in front of every apiserver, and the OIDC path avoids it.
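The mismatch can be shown with a toy model of the two readers. This is a deliberately simplified sketch: certificates are plain data objects rather than parsed X.509, the demo SVID is assumed to carry no CN in its subject, and the two functions only mimic which fields each side consults. Neither is the real apiserver or proxy logic.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ClientCert:
    common_name: str = ""
    organizations: List[str] = field(default_factory=list)
    uri_sans: List[str] = field(default_factory=list)


def apiserver_identity(cert: ClientCert) -> Optional[Tuple[str, List[str]]]:
    """Mimic client-cert auth: username from Subject CN, groups from Subject O."""
    if not cert.common_name:
        return None  # nothing to use as a username
    return cert.common_name, list(cert.organizations)


def proxy_identity(cert: ClientCert, trust_domain: str) -> Optional[str]:
    """Mimic a SPIFFE-aware proxy: read the SPIFFE ID from the URI SAN."""
    prefix = f"spiffe://{trust_domain}/"
    for uri in cert.uri_sans:
        if uri.startswith(prefix):
            return uri
    return None


# Assumed shape of an X.509-SVID for this demo: identity in the URI SAN, no CN.
svid = ClientCert(
    organizations=["SPIRE"],
    uri_sans=["spiffe://prod.example.org/ns/argocd/sa/application-controller"],
)

print(apiserver_identity(svid))                  # None
print(proxy_identity(svid, "prod.example.org"))  # the SPIFFE ID
```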

On maturity, be honest. Upstream Argo CD has no native SPIFFE auth; it works through the generic ExecCredential plugin pattern, and the most mature productization is OpenShift’s. Flux can deploy and manage SPIRE like any other GitOps workload, and its OCI registry auth can consume a JWT-SVID, but a first-class “authenticate Flux to a remote cluster via SPIFFE instead of a kubeconfig” feature is not documented; it rides the same OIDC/exec-credential/proxy plumbing. Treat native remote-cluster SPIFFE auth as a pattern you assemble, not a checkbox you enable.

What breaks, and what it costs

This is the part the architecture diagrams omit. You have removed a stored credential; in return you have made an identity service a hard dependency on the critical path, and short-lived things fail in ways long-lived things didn’t.

SVID rotation is now an availability dependency. SVIDs are short — X.509 an hour, JWT five minutes — and rotate at half-life. If the agent or the Workload API is down, or the agent can’t reach the server to renew, the SVID expires and authentication stops. The SPIRE server is on the critical path for every issuance and renewal. Agents cache credentials and tolerate brief server outages; the agent’s availability_target knob (must be ≥ 24h if set) makes it rotate early to bank headroom for graceful downtime. But the failure mode is real and new: identity-plane down means fleet auth down.
Clock skew is now an outage class. A five-minute JWT is unforgiving. A few minutes of drift between a controller and a spoke apiserver rejects valid tokens. NTP discipline across the fleet stops being hygiene and becomes a hard requirement.
A federation bundle endpoint that goes dark breaks cross-cluster auth — silently, later. If a peer’s bundle endpoint is unreachable past the refresh window and that domain rotates its keys, your cached bundle goes stale and cross-domain SVID validation fails, even though both clusters are individually healthy. The failure shows up at rotation time, not at outage time, which makes it nasty to diagnose.
The identity plane is now a stateful, HA-critical service. SPIRE servers in HA share one SQL datastore; the default SQLite is single-node, so production means a highly-available MySQL/PostgreSQL that the entire fleet’s ability to authenticate depends on. For multi-cluster you choose a topology — nested SPIRE (a root server issuing intermediates to downstream servers, surviving a root outage) or federation across per-cluster trust domains — and each adds its own operational surface.
The operational weight is a whole new control plane. Server plus agents on every cluster; the registration-entry lifecycle (selectors per workload, kept in sync with deployments); the upstream-CA decision; bundle endpoints and federation relationships; the OIDC Discovery Provider; and, on the mTLS path, a proxy per apiserver. None of this existed when the answer was “store a kubeconfig.”
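Of the failure modes above, clock skew is the easiest to quantify. The sketch below assumes the verifier checks only the expiry against its own clock, with no leeway; real verifiers may allow some tolerance, so treat the numbers as the worst case rather than a description of any particular apiserver.

```python
from datetime import timedelta


def usable_window(ttl: timedelta, verifier_ahead_by: timedelta) -> timedelta:
    """Validity left on a freshly issued token, as seen by a verifier whose clock runs ahead."""
    return max(ttl - verifier_ahead_by, timedelta(0))


JWT_TTL = timedelta(minutes=5)

print(usable_window(JWT_TTL, timedelta(seconds=0)))  # 0:05:00
print(usable_window(JWT_TTL, timedelta(minutes=2)))  # 0:03:00
print(usable_window(JWT_TTL, timedelta(minutes=6)))  # 0:00:00
```

With a one-hour credential, two minutes of drift costs about three percent of the lifetime. With a five-minute credential it costs forty percent, and six minutes of drift means every token is rejected on arrival.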

Why the order matters

The steps are not a menu; they are a sequence, and skipping steps is how teams get the cost without the benefit.

1. Stand up SPIRE: server + agents, node attestation       (foundation)
2. Issue workload SVIDs to the controller via Workload API
3. Make the spoke apiserver trust SPIRE (OIDC path first)
4. Federate trust domains across clusters
5. Delete the stored kubeconfig Secrets                     (the payoff)

Step 5 is the whole point, and it is only safe once 1–4 actually work. The common failure is to keep the old Secret “as a fallback” — which means the long-lived fleet-wide credential is still at rest, still in your backups, still the thing an attacker takes, and you are now also running SPIRE. You have paid for the control plane and kept the liability. Either the stored credential is gone or you have not done this; there is no half-credit.
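Whether step 5 is actually done is checkable. The sketch below takes Secret objects as already-parsed dictionaries (for example, the items from a JSON export of the control cluster's Secrets) and reports which Argo CD cluster Secrets still embed a static credential. The config field names it looks for (bearerToken, password, tlsClientConfig.keyData) are assumptions based on Argo CD's declarative cluster Secret format; verify them against the version you run. A cluster Secret that carries only an exec provider configuration holds no credential and is not flagged.

```python
import base64
import json
from typing import List

CLUSTER_LABEL = "argocd.argoproj.io/secret-type"


def static_credential_fields(secret: dict) -> List[str]:
    """Return the static-credential fields found in an Argo CD cluster Secret."""
    labels = secret.get("metadata", {}).get("labels") or {}
    if labels.get(CLUSTER_LABEL) != "cluster":
        return []
    raw = (secret.get("data") or {}).get("config")
    if raw is None:
        return []
    config = json.loads(base64.b64decode(raw))
    found = []
    if config.get("bearerToken"):
        found.append("bearerToken")
    if config.get("password"):
        found.append("password")
    if (config.get("tlsClientConfig") or {}).get("keyData"):
        found.append("tlsClientConfig.keyData")
    return found


def cluster_secret(name: str, config: dict) -> dict:
    """Build a demo Secret the way the API returns it (base64-encoded data)."""
    return {
        "metadata": {"name": name, "labels": {CLUSTER_LABEL: "cluster"}},
        "data": {"config": base64.b64encode(json.dumps(config).encode()).decode()},
    }


# Illustrative objects only.
legacy = cluster_secret("spoke-a", {"bearerToken": "long-lived-token"})
migrated = cluster_secret("spoke-b", {"execProviderConfig": {"command": "spiffe-plugin"}})

for s in (legacy, migrated):
    print(s["metadata"]["name"], static_credential_fields(s))
```

Run against the real export, an empty result for every cluster Secret is the definition of done. Anything else is the “fallback” still sitting at rest.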

The reverse error is reaching for SPIRE before the order can pay off — standing up a federated identity plane for a couple of clusters whose Secret would be fine. That is just-in-case complexity wearing a security badge: machinery built for a blast radius you don’t yet have. The machinery is justified by the fleet, the audit requirement, and the blast radius of a stolen token. Below that line, the cheapest secure credential really is the boring one you never have to issue.

Where this approach genuinely belongs is clear: a fleet large enough that a static per-cluster credential is an unacceptable blast radius, with the operational maturity to run a stateful, HA, on-the-critical-path identity service and keep its clocks in sync. A team that has arrived there already pays for that maturity elsewhere, and at that point zero static authority is not gold-plating; it is the credential model the blast radius was always demanding. Not before.

See also: the SPIFFE and SPIRE docs define the identity model and the server/agent components, and the SPIFFE Federation spec specifies the bundle-endpoint profiles this design relies on.