Merged
30 commits
9cd388d
xDS: ext_proc: add GRPC processing mode, and clean up docs
markdroth Mar 13, 2025
c30e220
Merge remote-tracking branch 'upstream/main' into ext_proc_grpc
markdroth Jul 10, 2025
ed81c37
add missing import
markdroth Jul 11, 2025
b30a64b
add missing build dep
markdroth Jul 11, 2025
2800967
spelling
markdroth Jul 11, 2025
42bd600
fix formatting
markdroth Jul 11, 2025
532f526
attempt to fix doc generation
markdroth Jul 11, 2025
5166f35
Merge remote-tracking branch 'upstream/main' into ext_proc_grpc
markdroth Sep 10, 2025
ace4e47
attempt to fix RST formatting
markdroth Sep 10, 2025
8c307bc
fix wording
markdroth Sep 10, 2025
bc9a0ad
clarify streaming behavior of GRPC body send mode
markdroth Sep 12, 2025
975e849
add comment about timeout behavior vs. body mode, and send_body_witho…
markdroth Sep 12, 2025
e8afc4b
try to fix formatting
markdroth Sep 12, 2025
532c31d
GRPC mode will support timers, and note that FULL_DUPLEX_STREAMED doe…
markdroth Sep 12, 2025
cd3b71f
[xDS] improve documentation for ext_proc
markdroth Sep 12, 2025
4b7f6f5
clarify wording
markdroth Sep 12, 2025
e7cbd4d
formatting and wording tweak
markdroth Sep 12, 2025
aa8d3cc
GRPC mode uses StreamedBodyResponse
markdroth Sep 12, 2025
6932eb6
clarify GRPC mode description, and note that message timeouts are not…
markdroth Sep 12, 2025
0015fbd
ext_proc server can inject EOS on gRPC client message
markdroth Sep 12, 2025
708120f
add a safe way for ext_proc server to close the stream with OK status
markdroth Sep 12, 2025
8b95c52
Update api/envoy/extensions/filters/http/ext_proc/v3/processing_mode.…
markdroth Sep 16, 2025
9186b12
clarify wording
markdroth Sep 16, 2025
51224c4
Merge branch 'ext_proc_doc_update' into ext_proc_grpc
markdroth Sep 16, 2025
d2334a8
add end_of_stream_without_message field
markdroth Sep 16, 2025
ced8f28
add end_of_stream_without_message in response also
markdroth Sep 16, 2025
28c5bd6
Merge remote-tracking branch 'upstream/main' into ext_proc_grpc
markdroth Sep 18, 2025
af6aa49
note that GRPC mode is similar to FULL_DUPLEX_STREAMED
markdroth Oct 2, 2025
ec538a3
fix space
markdroth Oct 2, 2025
4fa776a
add a bit to indicate whether gRPC messages are compressed
markdroth Oct 7, 2025
1 change: 1 addition & 0 deletions api/envoy/extensions/filters/http/ext_proc/v3/BUILD
@@ -6,6 +6,7 @@ licenses(["notice"]) # Apache 2

api_proto_package(
deps = [
"//envoy/annotations:pkg",
"//envoy/config/common/mutation_rules/v3:pkg",
"//envoy/config/core/v3:pkg",
"//envoy/type/matcher/v3:pkg",
79 changes: 43 additions & 36 deletions api/envoy/extensions/filters/http/ext_proc/v3/ext_proc.proto
@@ -16,6 +16,7 @@ import "google/protobuf/wrappers.proto";

import "xds/annotations/v3/status.proto";

import "envoy/annotations/deprecation.proto";
import "udpa/annotations/migrate.proto";
import "udpa/annotations/status.proto";
import "validate/validate.proto";
@@ -48,18 +49,24 @@ option (udpa.annotations.file_status).package_version_status = ACTIVE;
//
// * Whether it receives the response message at all.
// * Whether it receives the message body at all, in separate chunks, or as a single buffer.
// * Whether subsequent HTTP requests are transmitted synchronously or whether they are
// sent asynchronously.
// * To modify request or response trailers if they already exist.
//
// The filter supports up to six different processing steps. Each is represented by
// a gRPC stream message that is sent to the external processor. For each message, the
// processor must send a matching response.
//
// * Request headers: Contains the headers from the original HTTP request.
// * Request body: Delivered if they are present and sent in a single message if
// the ``BUFFERED`` or ``BUFFERED_PARTIAL`` mode is chosen, in multiple messages if the
// ``STREAMED`` mode is chosen, and not at all otherwise.
// * Request body: If the body is present, the behavior depends on the
// body send mode:
// * ``BUFFERED`` or ``BUFFERED_PARTIAL``: Entire body is sent to the
// external processor in a single message.
// * ``STREAMED`` or ``FULL_DUPLEX_STREAMED``: Body will be split across
// multiple messages sent to the external processor.
// * ``GRPC``: As each gRPC message arrives, it will be sent to the external
// processor. There will be exactly one gRPC message in each message
// sent to the external processor.
// * ``NONE``: Body will not be sent to the external processor.
//
// * Request trailers: Delivered if they are present and if the trailer mode is set
// to ``SEND``.
// * Response headers: Contains the headers from the HTTP response. Keep in mind
@@ -75,7 +82,7 @@ option (udpa.annotations.file_status).package_version_status = ACTIVE;
// from the external processor. The latter is only enabled if ``allow_mode_override`` is
// set to true. This way, a processor may, for example, use information
// in the request header to determine whether the message body must be examined, or whether
// the proxy should simply stream it straight through.
// the data plane should simply stream it straight through.
//
// All of this together allows a server to process the filter traffic in fairly
// sophisticated ways. For example:
@@ -84,12 +91,8 @@ option (udpa.annotations.file_status).package_version_status = ACTIVE;
// on the content of the headers.
// * A server may choose to immediately reject some messages based on their HTTP
// headers (or other dynamic metadata) and more carefully examine others.
// * A server may asynchronously monitor traffic coming through the filter by inspecting
// headers, bodies, or both, and then decide to switch to a synchronous processing
// mode, either permanently or temporarily.
//
// The protocol itself is based on a bidirectional gRPC stream. Envoy will send the
// server
// The protocol itself is based on a bidirectional gRPC stream. The data plane will send the server
// :ref:`ProcessingRequest <envoy_v3_api_msg_service.ext_proc.v3.ProcessingRequest>`
// messages, and the server must reply with
// :ref:`ProcessingResponse <envoy_v3_api_msg_service.ext_proc.v3.ProcessingResponse>`.
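For orientation, a minimal filter configuration exercising these knobs might look like the sketch below. This is not part of the PR; the cluster name and all values are illustrative assumptions.

```yaml
# Hypothetical ext_proc filter config; cluster name and values are illustrative.
http_filters:
- name: envoy.filters.http.ext_proc
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.ext_proc.v3.ExternalProcessor
    grpc_service:
      envoy_grpc:
        cluster_name: ext_proc_cluster  # assumed cluster for the processor
    processing_mode:
      request_header_mode: SEND
      response_header_mode: SEND
      request_body_mode: STREAMED
    message_timeout: 0.2s  # matches the documented 200ms default
```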
@@ -124,7 +127,6 @@ message ExternalProcessor {
reserved "async_mode";

// Configuration for the gRPC service that the filter will communicate with.
// The filter supports both the "Envoy" and "Google" gRPC clients.
// Only one of ``grpc_service`` or ``http_service`` can be set.
// It is required that one of them must be set.
config.core.v3.GrpcService grpc_service = 1
@@ -140,14 +142,14 @@ message ExternalProcessor {
// cannot be configured to send any body or trailers. i.e., ``http_service`` only supports
// sending request or response headers to the side stream server.
//
// With this configuration, Envoy behavior:
// With this configuration, the data plane behavior is:
//
// 1. The headers are first put in a proto message
// :ref:`ProcessingRequest <envoy_v3_api_msg_service.ext_proc.v3.ProcessingRequest>`.
//
// 2. This proto message is then transcoded into a JSON text.
//
// 3. Envoy then sends an HTTP POST message with content-type as "application/json",
// 3. The data plane then sends an HTTP POST message with content-type as "application/json",
// and this JSON text as body to the side stream server.
//
// After the side-stream receives this HTTP request message, it is expected to do as follows:
@@ -160,7 +162,7 @@ message ExternalProcessor {
//
// 3. It converts the ``ProcessingResponse`` proto message into a JSON text.
//
// 4. It then sends an HTTP response back to Envoy with status code as ``"200"``,
// 4. It then sends an HTTP response back to the data plane with status code as ``"200"``,
// ``content-type`` as ``"application/json"`` and sets the JSON text as the body.
//
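To make the four server-side steps concrete, here is a small sketch (not part of the PR) of what the side-stream server's handler could do. The proto-JSON field names and the header-mutation shape are illustrative assumptions, not the normative schema:

```python
import json

def handle_ext_proc_post(body: bytes) -> tuple[int, dict, bytes]:
    """Sketch of one side-stream exchange: parse the transcoded
    ProcessingRequest JSON, build a ProcessingResponse, reply as JSON."""
    processing_request = json.loads(body)  # steps 1-2: examine the request

    # Step 3: build an illustrative ProcessingResponse-shaped dict.
    # (Field names here are assumptions based on the proto JSON mapping.)
    processing_response = {
        "requestHeaders": {
            "response": {
                "headerMutation": {
                    "setHeaders": [
                        {"header": {"key": "x-ext-proc-seen", "rawValue": "dHJ1ZQ=="}}
                    ]
                }
            }
        }
    }

    # Step 4: HTTP 200 with content-type application/json and the JSON body.
    return 200, {"content-type": "application/json"}, json.dumps(processing_response).encode()
```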
ExtProcHttpService http_service = 20 [
@@ -194,28 +196,30 @@ message ExternalProcessor {
// sent. See ``ProcessingMode`` for details.
ProcessingMode processing_mode = 3;

// Envoy provides a number of :ref:`attributes <arch_overview_attributes>`
// The data plane provides a number of :ref:`attributes <arch_overview_attributes>`
// for expressive policies. Each attribute name provided in this field will be
// matched against that list and populated in the ``request_headers`` message.
// matched against that list and populated in the
// :ref:`ProcessingRequest.attributes <envoy_v3_api_field_service.ext_proc.v3.ProcessingRequest.attributes>` field.
// See the :ref:`attribute documentation <arch_overview_request_attributes>`
// for the list of supported attributes and their types.
repeated string request_attributes = 5;

// Envoy provides a number of :ref:`attributes <arch_overview_attributes>`
// The data plane provides a number of :ref:`attributes <arch_overview_attributes>`
// for expressive policies. Each attribute name provided in this field will be
// matched against that list and populated in the ``response_headers`` message.
// matched against that list and populated in the
// :ref:`ProcessingRequest.attributes <envoy_v3_api_field_service.ext_proc.v3.ProcessingRequest.attributes>` field.
// See the :ref:`attribute documentation <arch_overview_attributes>`
// for the list of supported attributes and their types.
repeated string response_attributes = 6;

// Specifies the timeout for each individual message sent on the stream and
// when the filter is running in synchronous mode. Whenever the proxy sends
// a message on the stream that requires a response, it will reset this timer,
// and will stop processing and return an error (subject to the processing mode)
// if the timer expires before a matching response is received. There is no
// timeout when the filter is running in asynchronous mode. Zero is a valid
// config which means the timer will be triggered immediately. If not
// configured, default is 200 milliseconds.
// Specifies the timeout for each individual message sent on the stream.
// Whenever the data plane sends a message on the stream that requires a
// response, it will reset this timer, and will stop processing and return
// an error (subject to the processing mode) if the timer expires before a
// matching response is received. There is no timeout when the filter is
// running in observability mode. Zero is a valid config which means the
// timer will be triggered immediately. If not configured, default is 200
// milliseconds.
google.protobuf.Duration message_timeout = 7 [(validate.rules).duration = {
lte {seconds: 3600}
gte {}
@@ -232,7 +236,7 @@ message ExternalProcessor {
// :ref:`header_prefix <envoy_v3_api_field_config.bootstrap.v3.Bootstrap.header_prefix>`
// (which is usually "x-envoy").
// Note that changing headers such as "host" or ":authority" may not in itself
// change Envoy's routing decision, as routes can be cached. To also force the
// change the data plane's routing decision, as routes can be cached. To also force the
// route to be recomputed, set the
// :ref:`clear_route_cache <envoy_v3_api_field_service.ext_proc.v3.CommonResponse.clear_route_cache>`
// field to true in the same response.
@@ -274,10 +278,11 @@ message ExternalProcessor {

// If true, send each part of the HTTP request or response specified by ``ProcessingMode``
// without pausing on filter chain iteration. It is "Send and Go" mode that can be used
// by external processor to observe Envoy data and status. In this mode:
// by external processor to observe the request's data and status. In this mode:
//
// 1. Only ``STREAMED`` body processing mode is supported and any other body processing modes will be
// ignored. ``NONE`` mode (i.e., skip body processing) will still work as expected.
// 1. Only ``STREAMED`` and ``GRPC`` body processing modes are supported and any other body
// processing modes will be ignored. ``NONE`` mode (i.e., skip body processing) will still
// work as expected.
//
// 2. External processor should not send back processing response, as any responses will be ignored.
// This also means that
@@ -314,12 +319,13 @@ message ExternalProcessor {
// Specifies the deferred closure timeout for gRPC stream that connects to external processor. Currently, the deferred stream closure
// is only used in :ref:`observability_mode <envoy_v3_api_field_extensions.filters.http.ext_proc.v3.ExternalProcessor.observability_mode>`.
// In observability mode, gRPC streams may be held open to the external processor longer than the lifetime of the regular client to
// backend stream lifetime. In this case, Envoy will eventually timeout the external processor stream according to this time limit.
// backend stream lifetime. In this case, the data plane will eventually timeout the external processor stream according to this time limit.
// The default value is 5000 milliseconds (5 seconds) if not specified.
google.protobuf.Duration deferred_close_timeout = 19;

// Send body to the side stream server once it arrives without waiting for the header response from that server.
// It only works for ``STREAMED`` body processing mode. For any other body processing modes, it is ignored.
// It only works for ``STREAMED`` and ``GRPC`` body processing modes. For any other body
// processing modes, it is ignored.
// The server has two options upon receiving a header request:
//
// 1. Instant Response: send the header response as soon as the header request is received.
@@ -328,9 +334,9 @@
//
// In all scenarios, the header-body ordering must always be maintained.
//
// If enabled Envoy will ignore the
// If enabled the data plane will ignore the
// :ref:`mode_override <envoy_v3_api_field_service.ext_proc.v3.ProcessingResponse.mode_override>`
// value that the server sends in the header response. This is because Envoy may have already
// value that the server sends in the header response. This is because the data plane may have already
// sent the body to the server, prior to processing the header response.
bool send_body_without_waiting_for_header_response = 21;

@@ -434,7 +440,8 @@ message ExtProcOverrides {

// [#not-implemented-hide:]
// Set a different asynchronous processing option than the default.
bool async_mode = 2;
// Deprecated and not implemented.
bool async_mode = 2 [deprecated = true, (envoy.annotations.deprecated_at_minor_version) = "3.0"];

// [#not-implemented-hide:]
// Set different optional attributes than the default setting of the
api/envoy/extensions/filters/http/ext_proc/v3/processing_mode.proto
@@ -65,8 +65,7 @@ message ProcessingMode {
// Do not send the body at all. This is the default.
NONE = 0;

// Stream the body to the server in pieces as they arrive at the
// proxy.
// Stream the body to the server in pieces as they are seen.
STREAMED = 1;

// Buffer the message body in memory and send the entire body at once.
@@ -79,11 +78,11 @@
// up to the buffer limit will be sent.
BUFFERED_PARTIAL = 3;

// Envoy streams the body to the server in pieces as they arrive.
// The ext_proc client streams the body to the server in pieces as they arrive.
//
// 1) The server may choose to buffer any number of chunks of data before processing them.
// After it finishes buffering, the server processes the buffered data. Then it splits the processed
// data into any number of chunks, and streams them back to Envoy one by one.
// data into any number of chunks, and streams them back to the client one by one.
// The server may continuously do so until the complete body is processed.
// The individual response chunk size is recommended to be no greater than 64K bytes, or
// :ref:`max_receive_message_length <envoy_v3_api_field_config.core.v3.GrpcService.EnvoyGrpc.max_receive_message_length>`
@@ -98,17 +97,29 @@
//
// In this body mode:
// * The corresponding trailer mode has to be set to ``SEND``.
// * Envoy will send body and trailers (if present) to the server as they arrive.
// * The client will send body and trailers (if present) to the server as they arrive.
// Sending the trailers (if present) informs the server that the complete body has arrived.
// In case there are no trailers, then Envoy will set
// In case there are no trailers, then the client will set
// :ref:`end_of_stream <envoy_v3_api_field_service.ext_proc.v3.HttpBody.end_of_stream>`
// to true as part of the last body chunk request to notify the server that no other data is to be sent.
// * The server needs to send
// :ref:`StreamedBodyResponse <envoy_v3_api_msg_service.ext_proc.v3.StreamedBodyResponse>`
// to Envoy in the body response.
// * Envoy will stream the body chunks in the responses from the server to the upstream/downstream as they arrive.
// to the client in the body response.
// * The client will stream the body chunks in the responses from the server to the upstream/downstream as they arrive.

FULL_DUPLEX_STREAMED = 4;

// [#not-implemented-hide:]
// gRPC traffic. In this mode, the ext_proc client will de-frame the
// individual gRPC messages inside the HTTP/2 DATA frames, and as each
// message is de-framed, it will be sent to the ext_proc server as a
// :ref:`request_body
// <envoy_v3_api_field_service.ext_proc.v3.ProcessingRequest.request_body>`
// or :ref:`response_body
// <envoy_v3_api_field_service.ext_proc.v3.ProcessingRequest.response_body>`.
// If the ext_proc server modifies the body, that modified body will
// be used to replace the gRPC message in the stream.
GRPC = 5;
Contributor

This protocol mode does not seem to define a new behavior but rather a semantic interpretation of content of the body bytes. An ext_proc client that sends a body containing a complete gRPC message using the GRPC mode and any other mode is indistinguishable to the ext_proc server - i.e. they will all be processed the same way.

In the end I think this change complicates the already complicated and confusing protocol even more without offering a tangible benefit. Moreover, server implementations still have to support gRPC messages split over multiple ext_proc messages, since this is an existing behavior.

Contributor Author (@markdroth, Sep 23, 2025)

This protocol mode does not seem to define a new behavior but rather a semantic interpretation of content of the body bytes. An ext_proc client that sends a body containing a complete gRPC message using the GRPC mode and any other mode is indistinguishable to the ext_proc server - i.e. they will all be processed the same way.

That's not actually true -- the contents sent to the ext_proc server are actually different under this mode, so the ext_proc server can absolutely tell the difference.

For gRPC traffic, the content of the HTTP/2 DATA frames is a sequence of framed gRPC messages. However, there is no guarantee that each DATA frame contains exactly one framed gRPC message; it's entirely possible to have a single framed gRPC message span multiple DATA frames, and it's also entirely possible to have multiple framed gRPC messages within a single DATA frame. The goal of the new GRPC body send mode is to handle the buffering and deframing in the data plane instead of making every ext_proc server handle it itself.

In an existing body send mode like FULL_DUPLEX_STREAMED, every ext_proc server will need to handle buffering and deframing of gRPC messages as it receives DATA frame chunks. In the new GRPC mode, the ext_proc server sees only the deframed gRPC messages, and it always gets exactly one deframed gRPC message in each body message on the ext_proc stream.
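The framing being described is gRPC's length-prefixed message format: each message on the wire is a 1-byte compressed flag, a 4-byte big-endian length, then the payload. The sketch below (not from the PR) shows why DATA-frame-sized chunks need not align with message boundaries, and the incremental buffering and deframing work that GRPC mode would move into the data plane:

```python
import struct

def frame(msg: bytes, compressed: bool = False) -> bytes:
    """gRPC length-prefixed framing: 1-byte flag + 4-byte big-endian length."""
    return struct.pack(">BI", 1 if compressed else 0, len(msg)) + msg

class GrpcDeframer:
    """Recovers whole gRPC messages from arbitrarily split byte chunks."""

    def __init__(self) -> None:
        self._buf = b""

    def push(self, chunk: bytes) -> list[tuple[bool, bytes]]:
        """Feed one chunk (e.g. one DATA frame's worth); return completed messages."""
        self._buf += chunk
        messages = []
        while len(self._buf) >= 5:
            flag, length = struct.unpack(">BI", self._buf[:5])
            if len(self._buf) < 5 + length:
                break  # message incomplete: keep buffering
            messages.append((bool(flag), self._buf[5:5 + length]))
            self._buf = self._buf[5 + length:]
        return messages

# Two framed messages, split at boundaries unrelated to the framing:
deframer = GrpcDeframer()
wire = frame(b"hello") + frame(b"world!", compressed=True)
out = deframer.push(wire[:3]) + deframer.push(wire[3:14]) + deframer.push(wire[14:])
# out == [(False, b"hello"), (True, b"world!")]
```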

In addition, note that gRPC cannot implement any other mode here, because our filters do not have access to the raw HTTP/2 DATA frames. In gRPC, we handle deframing the gRPC messages in our transport layer (i.e., the same place where we handle the HTTP/2 protocol), and our filters see only the deframed gRPC messages. The only way we could implement the current FULL_DUPLEX_STREAMED protocol would be some horrible hack where we'd need to re-add the HTTP/2 framing to each message before we send it to the ext_proc server, which really doesn't make sense -- and it would probably hurt performance by requiring additional memory allocations. And since we want to support the ability for users to switch between proxy-based setups and proxyless gRPC, we need some common protocol that both data planes can support for gRPC traffic.

So basically, there are two arguments for this new mode:

  1. We want to make things easier by having the gRPC buffering and framing handled in the data plane instead of in each individual ext_proc server.
  2. gRPC cannot support any other body send mode, and we want a common mode that works with both data planes for gRPC traffic.

server implementations still have to support gRPC messages split over multiple ext_proc messages since this is an existing behavior.

I think there's a clear path to removing the need for that in existing servers:

  1. Add support for GRPC mode in the ext_proc server. (Note that the ext_proc request already tells the server what body send mode the client is using, so during this migration, the server can tell how it's supposed to handle each ext_proc stream.)
  2. Change the configuration of all data planes to use GRPC mode.
  3. Once all data planes have been reconfigured, it will be safe to remove support for other body send modes from the ext_proc server.

Contributor

The BodySendMode describes the behavior of the state machine for exchanging body content with the server. You are describing two things simultaneously with this new value: what is contained in the body payload and (possibly) new state transitions. The latter part is not described here. Does this value imply that the state machine is the same as STREAMED but with a payload of gRPC messages? Or is it the same as FULL_DUPLEX_STREAMED? Or something new entirely? And why can't I use the BUFFERED state machine if I know I will only have unary gRPC messages?

Contributor Author

The state machine is essentially the same as FULL_DUPLEX_STREAMED, just with the addition of the buffering and deframing of gRPC messages. In other words, the data plane will buffer only until it sees a complete gRPC message, at which point it will deframe the message and stream it to the ext_proc server. And the body responses sent back from the ext_proc server will each be an individual gRPC message as well.

I don't think it makes sense to use BUFFERED mode here. Note that the gRPC wire protocol is designed to handle streaming as a first-class citizen, and unary RPCs are just a special case: a unary RPC is a bidi stream that just happens to have a single request followed by a single result. Inside the HTTP/2 DATA frames, the request and response messages are still encoded with the same gRPC framing as in streaming cases. So using BUFFERED mode would still require the data plane to send the raw contents of the HTTP/2 DATA frames, and it would require the ext_proc server to handle the deframing, which is what we're trying to avoid here. (I don't think we want to add a BUFFERED_GRPC mode to make it handle the framing, and even if we did, the behavior would wind up being the same as the GRPC mode proposed here, so I think that would add complexity without actually providing any benefit.)

Contributor

@markdroth can the gRPC client just re-use FULL_DUPLEX_STREAMED mode then? The gRPC client implementation can do the buffering and deframing before sending the message to the ext_proc server, as an implementation detail of the gRPC client.

Contributor Author

Based on the above comment, the proposal is that the only body send mode supported by proxyless gRPC is GRPC? If so, I take it that the expectation is that Envoy would not have to implement GRPC at all, right?

No, my expectation is that Envoy would also implement GRPC mode. Users that want to use ext_proc for gRPC traffic would want to migrate to this mode if they need to use proxyless gRPC.

I think it is a net loss to not have FULL_DUPLEX_STREAMED mode for proxyless gRPC. Part of the appeal of ext_proc is that extension authors can do things which aren't supported by the Envoy/proxyless gRPC stack and/or are too expensive or risky to do in either place. We recently had this discussion for compression and decompression, because the CPU cost of doing it in the proxy was a significant driver to moving it to ext_proc — at least then, it could be paid for by the ext_proc server owner rather than the proxy/client.

Take for example an ext_proc server which implements a new form of compression, let's call it Zsupreme. For cost reasons or whatever, the customer deploys an ext_proc server on GCE that sets the appropriate headers and performs Zsupreme compression on the body bytes for the proxy — AIUI, this is not possible using GRPC send mode.

If the goal is to implement compression on a per-message basis, which is the way compression is normally done for gRPC, then it should be totally possible in GRPC mode.

If the goal is to implement compression for the entire HTTP/2 stream, even compressing the gRPC framing within the HTTP/2 DATA frames, then that's not something gRPC will be able to support in the first place, regardless of how we structure the configuration. In gRPC, we deal with the gRPC framing in our transport layer, and the xDS HTTP filters are above that, so the ext_proc filter will see only the deframed gRPC messages. It's not possible for the filter to replace the raw HTTP/2 DATA frames, since it simply doesn't operate at that layer.

Given gRPC's architecture, I don't think there's actually anything that the ext_proc server could do that we could actually support that it won't be able to do with GRPC mode.

Contributor Author

BTW, I will note that this is yet another reason why it makes sense for gRPC to not support FULL_DUPLEX_STREAMED, because that may fool ext_proc server authors into thinking that they can do things that gRPC simply can't support.

Contributor

After thinking more and looking at other ext_proc implementations I think it would be preferable to use a separate field to indicate content and leave body sending mode as is. Proxyless gRPC would then use content_type: GRPC and ...body_mode: FULL_DUPLEX_STREAMED. Proxyless gRPC does need to implement other modes like BUFFERED or STREAMED.

The reason is we want to add support for other content types such as JSON-RPC and we want to be able to send them in all modes, most importantly in BUFFERED mode, since some ext_proc servers only support this mode and it will delay adoption if we have to force implementations to use FULL_DUPLEX_STREAMED to be able to process other content types.

If we put the GRPC content as the send mode then we have to add content-type for JSON-RPC anyway and then have to explain that it works in all cases except GRPC mode, which is not great.

Using content-type; GRPC and ..._body_mode: FULL_DUPLEX_STREAMED will work exactly the same way as GRPC body send mode without limiting implementation from using other send modes either for GRPC or for other protocols we are adding.

Contributor Author

I understand the desire to support other content-aware modes like JSON-RPC, and I agree that that should be possible. So let's talk about the best way to do that.

It's not clear to me that introducing a separate content type setting actually provides a better story for doing that, because it's not actually an independent dimension from the body send mode. The content type would actually directly affect the behavior of the body send mode -- e.g., the exact buffering behavior of FULL_DUPLEX_STREAMED will be different depending on whether the content type is GRPC -- which means that both code and humans will have to look at both settings in order to understand the behavior.

Specifically:

  • In terms of the data plane implementation, the code implementing the body send mode in the data plane will need to understand the content type and alter its behavior accordingly, so I don't think this change actually makes anything simpler there: the content type setting will not reduce implementation complexity in Envoy.
  • The presence of the content type setting will make ext_proc server implementations more complex, because any ext_proc server will need to look at two different settings in order to decide how to interpret the body chunks that it's receiving and know what body chunks it needs to send back.
  • There are certain combinations of body send mode and content type that are inherently nonsensical and would not be supported (e.g., STREAMED and GRPC). If they are two separate knobs, then every data plane and every ext_proc server will need to decide how to handle these unsupported combinations.
  • For humans, it will be harder to understand the behavior if the behavior is dictated by two different settings that interact with each other so deeply rather than just one. We will need to document the behavior for every possible combination of body send mode and content type: each body send mode will need to document which content types it supports and how that content type affects the state machine, and each content type will need to document which body send modes it supports. For example, we'll need to document things like "GRPC content_type is supported only in FULL_DUPLEX_STREAMED body send mode" and "in FULL_DUPLEX_STREAMED body send mode, if the content type is GRPC, each body chunk sent to or from the ext_proc server is a single deframed gRPC message". In contrast, if we have just one setting for body send mode, then it's much easier for users to understand the behavior of that mode.

Given that these two things are really just a single dimension, I think it would make things much easier for the configuration to reflect that. I would suggest instead defining new body send modes for each actual body send behavior, including those that are content-aware. In other words:

  • We can rename GRPC body send mode to FULL_DUPLEX_STREAMED_GRPC.
  • In the future, we can add BUFFERED_JSON_RPC and FULL_DUPLEX_STREAMED_JSON_RPC if needed.

I realize that this approach will wind up with a larger number of body send modes, but I think it will ultimately make it easier to understand and implement.

To be clear, I think we are going to ultimately support the same set of modes either way; the question here is only about which configuration structure winds up being easier to understand and which adds more complexity. If there's a strong argument for doing this as a separate content type knob, we can make that work, but it seems worse to me in terms of complexity and understandability.

Thoughts...?


As an aside, it's still not clear to me that it makes sense to support BUFFERED mode for gRPC traffic, for the following reasons:

  • This mode is useful only for unary RPCs, where there is only a single gRPC message on the stream. However, there is no indication in the wire protocol whether the RPC is unary or streaming, so the data plane would not even know that the RPC is streaming until after it sends the initial gRPC message to the ext_proc server. And it's not clear how the data plane should react at that point.
  • This mode does not send the body to the ext_proc server until after it receives the ext_proc response for the request headers, which means that it adds significant latency. This is something gRPC works hard to avoid, especially for unary RPCs: we send the client headers, message, and a half close in a single write on the TCP connection for performance reasons.

That having been said, we don't need to decide this right now, as long as we leave room for the possibility in the future. I think that if we rename the GRPC mode to FULL_DUPLEX_STREAMED_GRPC, it will leave an easy way to add support for a BUFFERED_GRPC mode in the future if we decide to do that.

Contributor

It's not clear to me that introducing a separate content type setting actually provides a better story for doing that, because it's not actually an independent dimension from the body send mode.

Why can't it be independent? The bytes that the ext_proc filter sends to the server are opaque to it; there is no specific meaning attached to them today. Today they are implied to be a portion (not even corresponding to H/2 DATA or H/1 chunks) of the HTTP body, because no other protocols are used with ext_proc. This can be made explicit with an additional content-type field.

We'd like to make ext_proc and other filters work generically with the protocols encapsulated within HTTP. I think we can make this work by adding ambient frame information to the request that ext_proc can use to generically support sending and receiving of encapsulated protocols. We have at least 2 in the pipeline: GRPC, JSON-RPC/MCP. WebSocket is also being asked for, but this is less definitive, so we want to give developers a flexible way to add support for encapsulated protocols without having to also modify ext_proc protocol.

}

// How to handle the request header. Default is "SEND".