When we first started designing what eventually became Istio ambient mode, we explored many directions, both in terms of implementation and in terms of what our goals were. What resonated most, though, was that we wanted to provide an incredibly easy onboarding story for a subset of functionality. This subset, ultimately, was getting mutual TLS (mTLS) deployed for all service-to-service communication within a cluster. I talk a bit more about this here.
Since then, I think we have delivered on this promise... and gone even further! In this post, I want to highlight some of the areas where I think ambient delivers serious value to users with minimal complexity.
Zero to mTLS
Given this is the origin story, it only makes sense to start here!
All we need to get started:
$ istioctl install --set profile=ambient
$ kubectl label namespace --all istio.io/dataplane-mode=ambient
This will install Istio in ambient mode, and enroll all namespaces. With these two steps, all traffic in the cluster will automatically be mTLS encrypted!
In the future, we may allow enabling all namespaces by default, without labelling each namespace.
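For clusters managed declaratively, the labelling step above can also be expressed on the Namespace object itself. This is a minimal sketch; the namespace name `my-app` is a hypothetical placeholder, and only the `istio.io/dataplane-mode` label comes from the command above.

```yaml
# Hypothetical namespace; only the label is taken from the kubectl command above.
apiVersion: v1
kind: Namespace
metadata:
  name: my-app
  labels:
    # Enrolls every pod in this namespace into ambient mode.
    istio.io/dataplane-mode: ambient
```

Removing the label should take the namespace back out of the mesh, which is part of what makes enrollment low risk to try.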
Operational complexity
Those familiar with sidecars may ask: "can't I just do that with sidecars (with the istio-injection=enabled label)?"
Sort of... the complexity we are trying to address with ambient mode is not just about the complexity of a demo, where sidecar looks essentially the same, but rather the complexity of real production operations.
A fundamental difference between ambient mode and sidecar mode is that ambient mode is specifically designed to be safe to enroll for any service. This differs from sidecars, which modify the behavior of applications and introduce a number of requirements that applications must meet. Some of these are intentional features -- for instance, sidecars will automatically load balance HTTP requests -- but these features introduce risk.
In ambient mode, we took the opposite approach. The initial enablement is designed to be zero risk -- so stable, scalable, and compatible that it could be turned on by default by a Kubernetes provider. Features (which trade off compatibility) are opt-in (as we will do a bit later!).
HTTP Telemetry
This section assumes usage of Gloo Mesh
One of the coolest things I think we were able to build is full HTTP observability into ambient mesh, without compromising on our goals of making the base layer "zero risk".
Typically, HTTP processing is slow (adding up to 400% overhead), unsafe (one of the most common sources of vulnerabilities), and a common source of compatibility issues. Initially, this pushed HTTP processing to the waypoint layer, requiring users to opt in.
In Gloo Mesh, we created a novel telemetry system that solves all of these problems. Our implementation has full compatibility with applications, is protected from safety issues, and comes with under 1% performance overhead!
All of these factors led to us being able to enable this by default for users, making this effectively a zero-step operation. Instead, we will use our 2 step budget to deploy some handy tools to visualize it:
$ kubectl apply -f samples/addons/prometheus.yaml
$ kubectl apply -f samples/addons/kiali.yaml
Now we can check out the Kiali graph and see a full view of our services, powered by HTTP metrics from Istio.
Egress Gateways
While the "Istio is complex" meme is typically a bit overdone, one place it's no exaggeration is Egress Gateways. Deploying egress gateways has always been the one task I feel embarrassed to recommend to users, given how complex it is.
With sidecars, the most basic possible setup looks like this:
# One time setup
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: egress
  namespace: istio-egress
spec:
  gatewayClassName: istio
  listeners:
  - name: default
    port: 80
    protocol: HTTP
---
# Per host
apiVersion: networking.istio.io/v1
kind: ServiceEntry
metadata:
  name: httpbin
  namespace: istio-egress
spec:
  hosts:
  - httpbin.org
  ports:
  - number: 80
    name: http-port
    protocol: HTTP
  resolution: DNS
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: direct-httpbin-to-egress-gateway
  namespace: istio-egress
spec:
  parentRefs:
  - kind: ServiceEntry
    group: networking.istio.io
    name: httpbin
  rules:
  - backendRefs:
    - name: egress-istio
      port: 80
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: forward-httpbin-from-egress-gateway
  namespace: istio-egress
spec:
  parentRefs:
  - name: egress
  hostnames:
  - httpbin.org
  rules:
  - backendRefs:
    - kind: Hostname
      group: networking.istio.io
      name: httpbin.org
      port: 80
Note that most of this is needed for each host.
And this setup is about as bare-bones as it gets: it doesn't do mTLS to the egress gateway, or apply any interesting policies.
With ambient, this becomes way easier, and in a pretty interesting way. There is no code anywhere in ambient mode about "Egress Gateways". However, there is a "waypoint proxy" concept which automatically captures traffic to a service. We can utilize this feature to our advantage, by associating an external service with the waypoint.
As promised, this is done in only 2 steps (we will count 1 object as 1 step):
$ istioctl waypoint apply --enroll-namespace --namespace istio-egress
$ cat <<EOF | kubectl apply -f -
apiVersion: networking.istio.io/v1
kind: ServiceEntry
metadata:
  name: httpbin
  namespace: istio-egress
spec:
  hosts:
  - httpbin.org
  ports:
  - number: 80
    name: http-port
    protocol: HTTP
  resolution: DNS
EOF
Not only does this replace the huge configuration needed to get a barebones setup, it works a lot better, too. Traffic between the client and the egress gateway (or "egress waypoint") is automatically encrypted with mTLS.
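For anyone who prefers manifests over CLI commands, the waypoint step can be approximated declaratively. The sketch below is what I would expect `istioctl waypoint apply --enroll-namespace` to produce with current defaults (a Gateway named `waypoint` plus a namespace label); treat the exact fields as an assumption and check the output of `istioctl waypoint generate` for your Istio version.

```yaml
# Assumed equivalent of the istioctl waypoint command; verify against
# `istioctl waypoint generate --namespace istio-egress` for your version.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: waypoint
  namespace: istio-egress
spec:
  gatewayClassName: istio-waypoint
  listeners:
  - name: mesh
    port: 15008
    protocol: HBONE
---
# The --enroll-namespace flag corresponds to this label: services in the
# namespace (including our ServiceEntry) are routed through the waypoint.
apiVersion: v1
kind: Namespace
metadata:
  name: istio-egress
  labels:
    istio.io/use-waypoint: waypoint
```

Nothing here mentions egress at all; the ServiceEntry simply lives in a namespace that uses a waypoint.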
Additionally, going beyond the basics and applying policy is easier as well. Rather than applying a policy to our Gateway, then trying to match httpbin in some way, we can attach the policy directly to the ServiceEntry. This is simpler to reason about and less error-prone:
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: httpbin
  namespace: istio-egress
spec:
  targetRefs:
  - kind: ServiceEntry
    group: networking.istio.io
    name: httpbin # the ServiceEntry's metadata.name, not its hostname
  action: ALLOW
  rules: # ...
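To make the elided rules concrete, here is one possible way to fill them in. This is purely illustrative: the service account `my-client` in the `default` namespace is a hypothetical client identity, and the GET-only restriction is just an example of an HTTP-level rule the waypoint can enforce.

```yaml
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: httpbin
  namespace: istio-egress
spec:
  targetRefs:
  - kind: ServiceEntry
    group: networking.istio.io
    name: httpbin
  action: ALLOW
  rules:
  - from:
    - source:
        # Hypothetical client identity; replace with a real service account.
        principals:
        - cluster.local/ns/default/sa/my-client
    to:
    - operation:
        # Example restriction: only allow read-only requests to httpbin.org.
        methods: ["GET"]
```

Because an ALLOW policy now targets this ServiceEntry, requests to httpbin.org that do not match a rule should be denied at the waypoint.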