Kubernetes is often criticized (somewhat unfairly) for being complex to operate, pushing most people to rely on managed offerings.
However, k3s flipped this on its head somewhat by bundling a full Kubernetes distro into a single binary.
This is super handy, especially for running in small environments like IoT. While isolated components can be good for very high scale, advanced deployments, in smaller environments operating a pile of microservices can just be a burden -- this is exactly why Istio chose to re-architect into a more monolithic design years back.
However, it's still not really that minimal. On an empty cluster, after running k3d cluster create test, we end up with a variety of pods:
$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                      READY   STATUS      RESTARTS   AGE
kube-system   local-path-provisioner-6c86858495-gc9jq   1/1     Running     0          2m18s
kube-system   coredns-6799fbcd5-pdf4b                   1/1     Running     0          2m18s
kube-system   helm-install-traefik-crd-cp9s2            0/1     Completed   0          2m18s
kube-system   helm-install-traefik-pch7c                0/1     Completed   1          2m18s
kube-system   traefik-f4564c4f4-q4lkj                   1/1     Running     0          2m8s
kube-system   metrics-server-54fd9b65b-d69w6            1/1     Running     0          2m18s
kube-system   svclb-traefik-58c5bb65-sq54b              2/2     Running     0          2m8s
k3d is a handy tool to deploy k3s inside Docker, which makes it easy to test out.
What gives? How did our "single binary Kubernetes" turn into 6 different containers?
While k3s embeds quite a few components (kube-proxy, flannel, containerd, kubelet, ...) into one binary, others are deferred to run as standard pods in the cluster.
Additionally, once we deploy our favorite service mesh, we are going to have even more pods, moving us further from my goal of no pods.
Kubernetes without pods?
So the question is -- can we get a fully functional Kubernetes and Istio deployment by pushing the ideas of k3s further, and embedding full cluster functionality into a single binary?
Warning: these are experimental concepts; do not even think about doing this in production!
First, we can just cut some cruft right away and turn off a few optional components: servicelb (needed for LoadBalancer services), traefik (needed for Ingress), local-storage (needed for PVCs), and metrics-server (needed for kubectl top).
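For a standalone k3s install (outside of k3d), the same cleanup can be done through k3s's configuration file, where each repeatable --disable flag becomes a list entry. A minimal sketch, assuming the default config path:

# /etc/rancher/k3s/config.yaml
# Each entry corresponds to a --disable flag on the k3s server.
disable:
  - servicelb
  - traefik
  - local-storage
  - metrics-server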
That leaves coredns and Istio.
If we are going for minimalism, we are definitely going to want Istio's ambient mode, which operates without sidecars entirely.
Fortunately, this comes out of the box with full DNS support, which allows us to drop coredns as well.
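For context, outside of this experiment ambient mode is just a standard install profile; a minimal sketch of what that looks like with the IstioOperator API (we won't apply this here, since the whole point is to avoid installing anything into the cluster):

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: ambient
spec:
  # The ambient profile deploys istiod, istio-cni, and ztunnel (no sidecars).
  profile: ambient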
With this, we can drop everything in kube-system, if we can get Istio ambient running. This is pretty straightforward; the tricky part is doing so without adding more pods for Istio.
Embedding Istio
With a fork of k3s, I modified it to embed Istio directly into the binary. k3s can run as a server and/or agent; typically you would have one server, and every other node runs as an agent.
On the server, we want to run Istiod (Istio's control plane). On the agent, we want to run istio-cni (the per-node control plane) and ztunnel (the per-node data plane). All three of these can be embedded directly in k3s with some work!
With this custom build, we can spin up a new k3d cluster with some custom configuration to disable the components we no longer need:
apiVersion: k3d.io/v1alpha5
kind: Simple
metadata:
  name: podless
servers: 1
agents: 1
options:
  k3d:
    wait: true
    timeout: "60s"
    disableLoadbalancer: true
    disableRollback: true
  k3s:
    extraArgs:
    - arg: --disable-cloud-controller
      nodeFilters:
      - server:*
    - arg: --disable-kube-proxy
      nodeFilters:
      - server:*
    - arg: --disable-network-policy
      nodeFilters:
      - server:*
    - arg: --disable-helm-controller
      nodeFilters:
      - server:*
    - arg: --disable=coredns,servicelb,traefik,local-storage,metrics-server
      nodeFilters:
      - server:*
Here we disable all the pods we saw above, plus a few extras.
One notable extra is kube-proxy. Like some other projects, Istio's ztunnel can effectively replace kube-proxy for most use cases.
Podless service mesh
With everything in place, how does our cluster look?
$ kubectl get pods --all-namespaces
No resources found
Looking good so far....
Of course, running nothing is easy; the real challenge is keeping the cluster functional.
Let's deploy some application pods. Again, these are the only pods in the cluster:
$ kubectl get pods --all-namespaces
NAMESPACE   NAME                     READY   STATUS    RESTARTS   AGE
default     shell-5fff89ccf5-98kgg   1/1     Running   0          19s
default     echo-66d88ff694-9qprp    1/1     Running   0          14s
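For these pods to actually be captured by ztunnel, their namespace has to be enrolled in the ambient mesh. In a standard ambient install that is done with a namespace label (the embedded setup needs the same enrollment); a sketch for the default namespace:

apiVersion: v1
kind: Namespace
metadata:
  name: default
  labels:
    # Enroll every pod in this namespace into the ambient mesh.
    istio.io/dataplane-mode: ambient

The same thing can be done imperatively with kubectl label namespace default istio.io/dataplane-mode=ambient.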
And we can send traffic:
$ kubectl exec deploy/shell -- curl -s echo
RequestHeader=Accept:*/*
RequestHeader=User-Agent:curl/8.5.0
Hostname=echo-66d88ff694-9qprp
Traffic is fully functional, including Service traffic (formerly handled by kube-proxy) and DNS (formerly handled by coredns).
ztunnel handles all of this now, and additionally tunnels everything over a secure mTLS transport.
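This mTLS is automatic but permissive by default; if we wanted to require it, a standard PeerAuthentication policy would do the job (shown here as a sketch, not something this walkthrough depends on):

apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: default
  namespace: default
spec:
  mtls:
    # Reject any plaintext traffic to workloads in this namespace.
    mode: STRICT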
In addition to mTLS encryption, we can also apply policies based on the mTLS identity. Again, these are all enforced by ztunnel.
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: allow-default
spec:
  action: ALLOW
  selector:
    matchLabels:
      app: echo
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/default/sa/shell"]
Now traffic from the shell workload in the default namespace is allowed, but no other traffic is. We can see this by sending traffic from shell, and from a new test workload I deployed in the other namespace:
$ kubectl exec deploy/shell -- curl -s echo
RequestHeader=Accept:*/*
RequestHeader=User-Agent:curl/8.5.0
Hostname=echo-66d88ff694-9qprp
$ kubectl exec deploy/shell -n other -- curl -s echo
command terminated with exit code 56
As expected, our other application is denied!
Additionally, if we want, we can upgrade our traffic to go through a full HTTP proxy ("waypoint"):
$ istioctl x waypoint apply --enroll-namespace
waypoint default/waypoint applied
$ kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
echo-66d88ff694-czd65       1/1     Running   0          93m
shell-56bd5dbdbf-f4gh9      1/1     Running   0          93m
waypoint-7cd4dc789f-2s7z2   1/1     Running   0          41s
$ kubectl exec deploy/shell -- curl -s echo
RequestHeader=Accept:*/*
RequestHeader=User-Agent:curl/8.5.0
RequestHeader=X-Request-Id:18d72190-9caa-4162-8bc5-4c11518d7568
Hostname=echo-66d88ff694-czd65
Now that our waypoint is deployed, all traffic to the namespace is automatically forwarded to it, where full HTTP policies can be enforced.
Here, we can see the X-Request-Id header was added to our request, but there is tons of other functionality we get automatically without any configuration, and even more we can configure.
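As one example of the kind of L7 policy the waypoint can enforce, here is a sketch of an AuthorizationPolicy bound to the waypoint (using the name istioctl gave it above) that only allows GET requests:

apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: echo-get-only
  namespace: default
spec:
  # Bind the policy to the waypoint proxy rather than to individual pods.
  targetRefs:
  - kind: Gateway
    group: gateway.networking.k8s.io
    name: waypoint
  action: ALLOW
  rules:
  - to:
    - operation:
        methods: ["GET"]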
Wrapping up
In the end, we were able to deploy a full Kubernetes cluster and service mesh with all of the infrastructure components hidden inside a single binary on each node -- no pods are required for cluster functionality.
Is this all practical? Not really. However, it does go to show that the perceptions that Kubernetes/Istio are overly bloated and complex to run aren't so accurate.
Is it actually meaningfully simpler than a typical cluster? Sort of... we are legitimately replacing two components (kube-proxy and coredns), but the rest we are essentially just hiding and bundling. Hiding and bundling still helps, but it is clearly not as meaningful as replacing components entirely.
That being said, hiding things is great for social media engagement, and k3s has had smashing success by, effectively, just hiding and bundling, so clearly it provides some tangible benefits.