Kubernetes is often criticized (somewhat unfairly) for being complex to operate, pushing most people to rely on managed offerings.
However, k3s flipped this on its head somewhat by bundling a full Kubernetes distro into a single binary.
This is super handy, especially for running in small environments like IoT. While isolated components can be good for very high scale, advanced deployments, in smaller environments operating a pile of microservices can just be a burden -- this is exactly why Istio chose to re-architect into a more monolithic design years back.
However, it's still not really that minimal. On an empty cluster, after running k3d cluster create test, we end up with a variety of pods:
$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                      READY   STATUS      RESTARTS   AGE
kube-system   local-path-provisioner-6c86858495-gc9jq   1/1     Running     0          2m18s
kube-system   coredns-6799fbcd5-pdf4b                   1/1     Running     0          2m18s
kube-system   helm-install-traefik-crd-cp9s2            0/1     Completed   0          2m18s
kube-system   helm-install-traefik-pch7c                0/1     Completed   1          2m18s
kube-system   traefik-f4564c4f4-q4lkj                   1/1     Running     0          2m8s
kube-system   metrics-server-54fd9b65b-d69w6            1/1     Running     0          2m18s
kube-system   svclb-traefik-58c5bb65-sq54b              2/2     Running     0          2m8s
k3d is a handy tool to deploy k3s inside Docker, which makes it easy to test out.
What gives? How did our "single binary Kubernetes" turn into 6 different containers?
While k3s embeds quite a few components (kube-proxy, flannel, containerd, kubelet, ...) into one binary, others are deferred to run as standard pods in the cluster.
Additionally, once we deploy our favorite service mesh, we are going to have even more pods, moving us further from my goal of no pods.
Kubernetes without pods?
So the question is -- can we get a fully functional Kubernetes and Istio deployment by pushing the ideas of k3s further, and embedding full cluster functionality into a single binary?
Warning: these are experimental concepts; do not even think about doing this in production!
First, we can just cut some cruft right away and turn off a few optional components: servicelb (needed for LoadBalancer services), traefik (needed for Ingress), local-storage (needed for PVCs), and metrics-server (needed for kubectl top).
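For a standalone k3s install (outside of k3d), the same cleanup can be done through k3s's configuration file, where each repeatable --disable flag becomes a list entry. A minimal sketch, assuming the default config path:

# /etc/rancher/k3s/config.yaml
# Each entry corresponds to a --disable flag on the k3s server.
disable:
  - servicelb
  - traefik
  - local-storage
  - metrics-server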
That leaves coredns and Istio.
If we are going for minimalism, we are definitely going to want Istio's ambient mode, which operates without sidecars entirely.
Fortunately, this comes out of the box with full DNS support, which allows us to drop coredns as well.
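For context, outside of this experiment ambient mode is just a standard install profile; a minimal sketch of what that looks like with the IstioOperator API (we won't apply this here, since the whole point is to avoid installing anything into the cluster):

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: ambient
spec:
  # The ambient profile deploys istiod, istio-cni, and ztunnel (no sidecars).
  profile: ambient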
With this, we can drop everything in kube-system, if we can get Istio ambient running. This is pretty straightforward; the tricky part is doing so without adding more pods for Istio.
Embedding Istio
With a fork of k3s, I modified it to embed Istio directly into the binary. k3s can run as a server and/or agent; typically you would have one server, and every other node runs as an agent.
On the server, we want to run Istiod (Istio's control plane). On the agent, we want to run istio-cni (the per-node control plane) and ztunnel (the per-node data plane). All three of these can be embedded directly in k3s with some work!
With this custom build, we can spin up a new k3d cluster with some custom configuration to disable the components we no longer need:
apiVersion: k3d.io/v1alpha5
kind: Simple
metadata:
  name: podless
servers: 1
agents: 1
options:
  k3d:
    wait: true
    timeout: "60s"
    disableLoadbalancer: true
    disableRollback: true
  k3s:
    extraArgs:
    - arg: --disable-cloud-controller
      nodeFilters:
      - server:*
    - arg: --disable-kube-proxy
      nodeFilters:
      - server:*
    - arg: --disable-network-policy
      nodeFilters:
      - server:*
    - arg: --disable-helm-controller
      nodeFilters:
      - server:*
    - arg: --disable=coredns,servicelb,traefik,local-storage,metrics-server
      nodeFilters:
      - server:*
Here we disable all the pods we saw above, plus a few extras.
One notable extra is kube-proxy. Like some other projects, Istio's ztunnel can effectively replace kube-proxy for most use cases.
Podless service mesh
With everything in place, how does our cluster look?
$ kubectl get pods --all-namespaces
No resources found
Looking good so far....
Of course, running nothing is easy; the real challenge is keeping the cluster functional.
Let's deploy some application pods. Again, these are the only pods in the cluster:
$ kubectl get pods --all-namespaces
NAMESPACE   NAME                     READY   STATUS    RESTARTS   AGE
default     shell-5fff89ccf5-98kgg   1/1     Running   0          19s
default     echo-66d88ff694-9qprp    1/1     Running   0          14s
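For these pods to actually be captured by ztunnel, their namespace has to be enrolled in the ambient mesh. In a standard ambient install that is done with a namespace label (the embedded setup needs the same enrollment); a sketch for the default namespace:

apiVersion: v1
kind: Namespace
metadata:
  name: default
  labels:
    # Enroll every pod in this namespace into the ambient mesh.
    istio.io/dataplane-mode: ambient

The same thing can be done imperatively with kubectl label namespace default istio.io/dataplane-mode=ambient.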
And we can send traffic:
$ kubectl exec deploy/shell -- curl -s echo
RequestHeader=Accept:*/*
RequestHeader=User-Agent:curl/8.5.0
Hostname=echo-66d88ff694-9qprp
Traffic is fully functional, including Service traffic (formerly handled by kube-proxy) and DNS (formerly handled by coredns).
ztunnel handles all of this now, and additionally tunnels everything over a secure mTLS transport.
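This mTLS is automatic but permissive by default; if we wanted to require it, a standard PeerAuthentication policy would do the job (shown here as a sketch, not something this walkthrough depends on):

apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: default
  namespace: default
spec:
  mtls:
    # Reject any plaintext traffic to workloads in this namespace.
    mode: STRICT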
In addition to mTLS encryption, we can also apply policies based on the mTLS identity. Again, these are all enforced by ztunnel.
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: allow-default
spec:
  action: ALLOW
  selector:
    matchLabels:
      app: echo
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/default/sa/shell"]
Now traffic from the shell workload in the default namespace is allowed, but no other traffic is. We can see this by sending traffic from shell, and from a new test workload I deployed in the other namespace:
$ kubectl exec deploy/shell -- curl -s echo
RequestHeader=Accept:*/*
RequestHeader=User-Agent:curl/8.5.0
Hostname=echo-66d88ff694-9qprp
$ kubectl exec deploy/shell -n other -- curl -s echo
command terminated with exit code 56
As expected, our other application is denied!
Additionally, if we want, we can upgrade our traffic to go through a full HTTP proxy ("waypoint"):
$ istioctl x waypoint apply --enroll-namespace
waypoint default/waypoint applied
$ kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
echo-66d88ff694-czd65       1/1     Running   0          93m
shell-56bd5dbdbf-f4gh9      1/1     Running   0          93m
waypoint-7cd4dc789f-2s7z2   1/1     Running   0          41s
$ kubectl exec deploy/shell -- curl -s echo
RequestHeader=Accept:*/*
RequestHeader=User-Agent:curl/8.5.0
RequestHeader=X-Request-Id:18d72190-9caa-4162-8bc5-4c11518d7568
Hostname=echo-66d88ff694-czd65
Now that our waypoint is deployed, all traffic to the namespace is automatically forwarded to it, where full HTTP policies can be enforced.
Here, we can see the X-Request-Id header was added to our request, but there is tons of other functionality we get automatically without any configuration, and even more we can configure.
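As one example of the kind of L7 policy the waypoint can enforce, here is a sketch of an AuthorizationPolicy bound to the waypoint (using the name istioctl gave it above) that only allows GET requests:

apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: echo-get-only
  namespace: default
spec:
  # Bind the policy to the waypoint proxy rather than to individual pods.
  targetRefs:
  - kind: Gateway
    group: gateway.networking.k8s.io
    name: waypoint
  action: ALLOW
  rules:
  - to:
    - operation:
        methods: ["GET"]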
Wrapping up
In the end, we were able to deploy a full Kubernetes cluster and service mesh with all of the infrastructure components hidden inside a single binary on each node -- no pods are required for cluster functionality.
Is this all practical? Not really. However, it does go to show that the perceptions that Kubernetes/Istio are overly bloated and complex to run aren't so accurate.
Is it actually meaningfully simpler than a typical cluster? Sort of... we are legitimately replacing two components (kube-proxy and coredns), but the rest we are essentially just hiding and bundling. Hiding and bundling still helps, but it is clearly not as meaningful as replacing components entirely.
That being said, hiding things is great for social media engagement, and k3s has had smashing success by, effectively, just hiding and bundling, so clearly it provides some tangible benefits.