This is part two of a series.
- Envoy Quirks Part 1: Clear Route Cache
- Envoy Quirks Part 2: Filter Chain Match (this post)
One of Envoy's core features is, of course, its ability to match traffic and route it to the appropriate destination. This is done at two levels generally:
- Filter Chain Matches define the top-level matching of traffic, matching on attributes of the TCP and TLS handshake like port and SNI.
- Route Matches define the matching of HTTP traffic, matching on attributes of the HTTP request like path and headers.
Filter chain matchers are notoriously tricky to get right, and learning to use them correctly has become a sort of rite of passage for Envoy users. Let's dig in.
A simple matcher
First, let's start with a pretty simple example showing two filter chains. This actually exposes plaintext and TLS on the same port, which is neat:
admin:
  address:
    socket_address:
      address: ::1
      port_value: 9901
static_resources:
  listeners:
  - name: chains
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 10000
    listener_filters:
    - name: envoy.filters.listener.tls_inspector
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.listener.tls_inspector.v3.TlsInspector
    filter_chains:
    - name: tls
      filter_chain_match:
        application_protocols: ["h2", "http/1.1"]
        transport_protocol: tls
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
          common_tls_context:
            alpn_protocols: ["h2", "http/1.1"]
            tls_certificates:
            - certificate_chain:
                filename: /home/john/.secrets/localhost-lo.pem
              private_key:
                filename: /home/john/.secrets/localhost-lo-key.pem
      filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: route
          codec_type: AUTO
          route_config:
            name: route
            virtual_hosts:
            - name: route
              domains: ["*"]
              routes:
              - match:
                  prefix: /
                direct_response:
                  status: 200
                  body:
                    inline_string: "listener 10000: TLS matched\n"
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
    - name: plaintext
      filter_chain_match:
        transport_protocol: raw_buffer
      filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: route
          codec_type: AUTO
          route_config:
            name: route
            virtual_hosts:
            - name: route
              domains: ["*"]
              routes:
              - match:
                  prefix: /
                direct_response:
                  status: 200
                  body:
                    inline_string: "listener 10000: plaintext\n"
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
The key part:
- name: tls
  filter_chain_match:
    transport_protocol: tls
- name: plaintext
  filter_chain_match:
    transport_protocol: raw_buffer
So our first filter chain matches all TLS traffic, and the second filter chain matches all plaintext traffic. This is pretty straightforward, and works as expected. We can send HTTPS requests (HTTP/1.1 or HTTP/2) as well as plaintext HTTP requests, and they will be matched to the appropriate filter chain and return the expected response.
$ client http://localhost:10000
listener 10000: plaintext
$ client https://localhost:10000 --http2
listener 10000: TLS matched
$ client https://localhost:10000
listener 10000: TLS matched
Adding another matcher
Now, let's suppose we have a new requirement that a specific domain should have some different behavior for HTTP/1.1 traffic. We might add another matcher:
- name: tls-example.com
  filter_chain_match:
    server_names: ["example.com"]
    transport_protocol: tls
    application_protocols: ["http/1.1"]
  # Some HTTP/1.1 specific overrides here
Now let's try to send some requests with the SNI of "example.com":
$ client https://localhost:10000 --server-name example.com
listener 10000: TLS example.com
Works great! And if we send HTTP/2 instead, that will be handled by the original filter chain matching all TLS traffic... right?
$ client https://localhost:10000 --server-name example.com --http2
Get "https://localhost:10000": read tcp 127.0.0.1:52898->127.0.0.1:10000: read: connection reset by peer
What!?
When we had 2 matchers, this traffic was handled. But adding a 3rd matcher that does not match causes the traffic to be dropped?
Under the hood
Despite the weird behavior, this is actually working as intended and documented. However, it differs from any other matching algorithm I have seen before, and can be a huge gotcha for users.
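Rather than simply finding the first (or "best") match, Envoy evaluates the match criteria in a fixed order (destination port, destination IP, server name, transport protocol, application protocols, and then several source-address criteria) and filters out candidate chains at each step.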
So when we have 3 matchers, and a request comes in with the server name example.com, we first look for filter chains that match the server name.
We find one -- our third matcher -- and discard the rest.
As matching progresses, we fail to match at a later step (application protocols), because the only remaining candidate accepts just http/1.1. There is no backtracking to the chains that were discarded earlier, so the connection is dropped.
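To make the elimination concrete, here is a rough Python model of that ordering. This is only a sketch and not Envoy's actual code: the class and function names are made up for illustration, only three of the criteria are modeled, wildcard server names are ignored, and I am assuming the HTTP/2 client offers only "h2" via ALPN.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FilterChain:
    """Hypothetical stand-in for a filter chain and its filter_chain_match."""
    name: str
    server_names: List[str] = field(default_factory=list)
    transport_protocol: str = ""
    application_protocols: List[str] = field(default_factory=list)


@dataclass
class Connection:
    """What the listener filters learned about a new connection."""
    server_name: str = ""
    transport_protocol: str = "raw_buffer"
    application_protocols: List[str] = field(default_factory=list)


def narrow(chains, is_set, matches):
    # Chains that explicitly match this criterion win outright...
    explicit = [c for c in chains if is_set(c) and matches(c)]
    if explicit:
        return explicit
    # ...otherwise only chains that leave the criterion unset survive.
    return [c for c in chains if not is_set(c)]


def find_filter_chain(chains, conn) -> Optional[str]:
    # Criteria are applied in a fixed order; each step shrinks the candidates.
    steps = [
        (lambda c: bool(c.server_names),
         lambda c: conn.server_name in c.server_names),
        (lambda c: bool(c.transport_protocol),
         lambda c: c.transport_protocol == conn.transport_protocol),
        (lambda c: bool(c.application_protocols),
         lambda c: any(p in c.application_protocols
                       for p in conn.application_protocols)),
    ]
    candidates = list(chains)
    for is_set, matches in steps:
        candidates = narrow(candidates, is_set, matches)
        if not candidates:
            return None  # chains discarded in earlier steps are never revisited
    return candidates[0].name


TLS = FilterChain("tls", transport_protocol="tls",
                  application_protocols=["h2", "http/1.1"])
PLAINTEXT = FilterChain("plaintext", transport_protocol="raw_buffer")
TLS_EXAMPLE = FilterChain("tls-example.com", server_names=["example.com"],
                          transport_protocol="tls",
                          application_protocols=["http/1.1"])

# Assumption: the HTTP/2 client offers only "h2" via ALPN.
h2_example = Connection("example.com", "tls", ["h2"])

print(find_filter_chain([TLS, PLAINTEXT], h2_example))               # tls
print(find_filter_chain([TLS, PLAINTEXT, TLS_EXAMPLE], h2_example))  # None
```

With two chains, nothing claims the server name, so the generic tls chain survives every step. Once the third chain claims example.com, it becomes the only candidate after the server name step, and the later ALPN mismatch leaves nothing to fall back to.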
What to do?
Back when I first ran into this many years ago, the options were pretty bleak: for every match type we wanted (4 I think), we had to duplicate every filter chain, leading to an explosion of filter chains and a configuration that was very difficult to maintain.
Later, a new field, default_filter_chain, was added as a bit of an escape hatch for some specific scenarios, but it didn't solve the general case for complex matches.
Eventually, however, the new filter_chain_matcher field (that is "matcher" rather than "match") was added, giving full control over a tree-like matching flow.
While the config is very complex for achieving simple tasks (too big for me to include here, but you can see some examples), it does offer full control and a more intuitive matching flow that is more in line with what users expect.