Istio offers a huge variety of features, but not many opinions on which ones you should use. In this post, I cover which Istio features I think you should and should not use, based on my experience with users of each feature and the issues they do (or do not) run into, my knowledge of each feature's inner workings, and subjective gut feelings.

The hope is we don't end up with a "JavaScript: The Good Parts" situation.

JavaScript: The Good Parts; a short book

Rubric

  • ✅ Adopt if needed ✅: these features are generally reliable and can be used broadly. However, any feature brings some risk and complexity, so it's best to avoid anything that isn't needed; there is intentionally no "Always adopt" category.
  • ⚠️ Adopt with caution ⚠️: these features are reasonably reliable, but come with some caveats. In general, they are safe to use if these caveats are acceptable.
  • 🛑 Avoid if possible 🛑: these features are not very reliable, and should be avoided unless there is a very compelling need for the feature. When doing so, use caution.
  • ☠️ Avoid at all costs ☠️: do not use these features!

Installation

Istio offers a variety of different installation options.

  • Istioctl install: ✅ Adopt if needed ✅
  • Helm install: ✅ Adopt if needed ✅
  • Istio Operator: ☠️ Avoid at all costs ☠️

Istioctl and Helm are roughly equivalent in stability; use whichever fits best in your environment. Helm tends to integrate much better with other tooling like Terraform, ArgoCD, etc., so it is a reasonable first choice. The Istio Operator (not to be confused with the IstioOperator API passed to istioctl install) is an extremely bad option. Do not use it under any circumstances. More discussion on operators here.

  • Multi-cluster: ⚠️ Adopt with caution ⚠️
  • Multi-network: 🛑 Avoid if possible 🛑

Multi-cluster and Multi-network are two common features that draw users to Istio. Both are very powerful, mature, and widely adopted, but come with some risk. Multi-cluster changes the routing behavior of Kubernetes in risky ways; any Service with the same name in another cluster is merged, so you will get cross-cluster routing without an explicit opt-in. Multi-network carries the above risk plus some additional ones: it sends all cross-network traffic through a TCP proxy. This negatively impacts load balancing and introduces a high-impact single point of failure into network communication.

For many users, the benefits are well worth the risks. However, it's best to make sure you actually need these features before you buy in.

  • External Istiod: ⚠️ Adopt with caution ⚠️

This is a fairly niche feature, but if you have a large-scale deployment it can be a convenient way to manage Istio. Do not use this if you are operating at small scale.

  • Revisions and Tags: ✅ Adopt if needed ✅

In general, these are great features to reduce risk and increase ease of upgrades. If you plan to upgrade Istio, you should use these. If you don't plan to upgrade Istio... rethink your plans?
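As a rough sketch of what this looks like in practice, a namespace opts in to a revision (or a tag pointing at one) through the istio.io/rev label. The namespace name and the prod-stable tag below are made up for illustration, and the tag is assumed to exist already.

```yaml
# Hypothetical namespace that gets its sidecars from whatever revision
# the "prod-stable" tag currently points at.
apiVersion: v1
kind: Namespace
metadata:
  name: my-app              # placeholder name
  labels:
    istio.io/rev: prod-stable   # placeholder tag; a revision name also works here
```

The appeal of a tag is that an upgrade becomes "move the tag, then restart workloads" rather than relabeling every namespace.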

  • CNI: ⚠️ Adopt with caution ⚠️

Istio CNI reduces privileges on pods, but comes with some operational risks. Most of these can be mitigated, but make sure they are acceptable before adopting.

Networking

  • DNS Proxy: ⚠️ Adopt with caution ⚠️

While the DNS Proxy feature offers some benefits, messing with DNS is scary business. When "it's always DNS", do you really want to add Istio to the picture as well?

  • DNS Auto Allocation: 🛑 Avoid if possible 🛑

This feature was borderline "Avoid at all costs". Unlike the broader DNS Proxy, auto allocation has two issues.

Fundamentally, it is returning bogus IP addresses to the application; applications might do strange things when they get bogus IP addresses.

In terms of its implementation, the allocation of IP addresses is not stable and has a long history of bugs. When the IP allocation changes, outages result until things reset to the new state; this could take a while depending on the client. Notably, this can be fixed by explicitly setting the bogus IPs in the ServiceEntry yourself (just make sure to avoid collisions), though this doesn't change the more fundamental issue above.
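For reference, pinning the address yourself might look something like the sketch below. The host, resource name, and IP are all placeholders I picked for illustration; choose an address that cannot collide with anything real in your environment or with other ServiceEntries.

```yaml
# Sketch: a ServiceEntry with an explicitly chosen address, so the IP handed
# to applications does not depend on automatic allocation.
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: external-example        # placeholder name
spec:
  hosts:
  - example.com                 # placeholder host
  addresses:
  - 240.241.0.10                # placeholder "bogus" IP; must be unique across ServiceEntries
  location: MESH_EXTERNAL
  resolution: DNS
  ports:
  - number: 443
    name: tls
    protocol: TLS
```

This only stabilizes the address; applications still receive an IP that is not the real destination.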

  • Locality Load Balancing: ⚠️ Adopt with caution ⚠️

This is a great feature, but is surprisingly hard to operate safely and correctly. Generally, the pros outweigh the cons (cloud egress $$$), though.

Be sure to watch out for unbalanced workloads across localities, and remember that outlierDetection is required.
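A minimal sketch of a DestinationRule that pairs locality load balancing with outlierDetection is below. The host and the thresholds are illustrative assumptions, not recommendations; tune them for your own traffic.

```yaml
# Sketch: locality load balancing with the outlier detection it depends on.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: example-locality        # placeholder name
spec:
  host: example.default.svc.cluster.local   # placeholder host
  trafficPolicy:
    loadBalancer:
      localityLbSetting:
        enabled: true
    outlierDetection:
      consecutive5xxErrors: 5   # illustrative threshold
      interval: 30s             # illustrative value
      baseEjectionTime: 30s     # illustrative value
```

Without the outlierDetection block, the proxy has no signal for deciding that local endpoints are unhealthy and that traffic should fail over to another locality.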

Security

  • Mutual TLS: ✅ Adopt if needed ✅

This is, for many, the bread and butter of Istio. While any feature adds some risk, this is one of the most battle tested aspects of Istio. Better yet, it doesn't require any configuration to get started (though it can be improved with Authorization Policies).

  • Authorization Policies: ✅ Adopt if needed ✅

Overall these go very well with Mutual TLS, and are critical to building a secure mesh. One caveat is that, especially in gateways, configuration generated by these rules can add up to become quite expensive.
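To give a sense of how the two fit together, here is a minimal sketch: a mesh-wide PeerAuthentication requiring mutual TLS, plus an AuthorizationPolicy that only admits one caller identity. The namespaces, labels, and service account are hypothetical, and the PeerAuthentication assumes istio-system is your root namespace.

```yaml
# Sketch: require mutual TLS mesh-wide (assumes istio-system is the root namespace).
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
---
# Sketch: only allow GET requests from the (hypothetical) frontend service account.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-frontend          # placeholder name
  namespace: backend            # placeholder namespace
spec:
  selector:
    matchLabels:
      app: backend              # placeholder label
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/frontend/sa/frontend"]
    to:
    - operation:
        methods: ["GET"]
```

The principals match relies on the identity established by mutual TLS, which is why the two features pair so naturally.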

  • JWT Request Authentication: 🛑 Avoid if possible 🛑
  • External Authorization: 🛑 Avoid if possible 🛑

While these features have pretty compelling use cases, they come with quite a bit of risk and runtime dependencies that are not fully understood or documented. Where possible, I would avoid these features, or adopt cautiously.

Extensibility

  • WebAssembly (WASM): 🛑 Avoid if possible 🛑

WASM in Istio has lingered as "alpha" for quite a while, and introduces risks around binary distribution, performance, and instability in the WASM runtime. While it's probably the best extension mechanism in Istio today (see below for a worse option), it's still best to avoid unless it's critically required at this point.

  • EnvoyFilter: ☠️ Avoid at all costs ☠️

EnvoyFilter is, objectively, the worst feature in Istio for stability. Essentially, it allows arbitrary patching of the Envoy configuration that Istio generates. An analogy would be to provide a fast-moving project a git diff that is patched dynamically and recompiled; EnvoyFilter is only slightly more stable than that. In addition to risks of breakage, particularly around upgrades, safe usage requires a deep understanding of Envoy, which is surprisingly hard.

That being said, EnvoyFilter brings great power along with it. I would urge you to resist the temptation to use it as much as reasonably possible.

  • Rate Limiting: ☠️ Avoid at all costs ☠️

This feature is not particularly mature (it depends on EnvoyFilter!); best to avoid for now.