307 changes: 307 additions & 0 deletions A110-child-channel-plugins.md
@@ -0,0 +1,307 @@
A110: Child Channel Options
----
* Author(s): [Abhishek Agrawal](mailto:agrawalabhi@google.com)
* Approver: a11r
* Status: In Review
* Implemented in: Core, Java, Go
* Last updated: 2025-12-24
* Discussion at: https://groups.google.com/g/grpc-io/c/EBIp3uud-Bo

## Abstract

This proposal introduces a mechanism to configure "child channels": channels
created internally by gRPC components (such as the xDS control plane channel).
These channels are not created directly by the user, making it difficult to
inject necessary configuration for metrics, tracing, etc. from the user
application. This design proposes an approach for users to pass configuration
options to these internal channels.
gRPC will support the xDS Extension Config Discovery Service (ECDS),

## Background

Complex gRPC ecosystems often require the creation of auxiliary channels that
are not directly instantiated by the user application. The primary examples are:

Review comment (Member):

I'd like to also have an example here of doing this from within an LB policy. The obvious example is RLS, but since that isn't actually a public API, I guess we'll have to cite grpclb as the example instead. However, let's note that we're just using grpclb as an example for how we will handle this plumbing in LB policies, and that this proposal does not actually mandate any behavior changes for grpclb specifically. (In C-core, at least, there's a bunch of crufty legacy behavior for grpclb that I don't want to touch at this point, so we will not implement this proposal for grpclb.)

Are Java or Go going to have any problems plumbing the child channel options down into resolvers or LB policies? @ejona86 @dfawley @easwars


1. xDS (the "x Discovery Service" family of APIs): When a user creates a channel
with an xDS target, the gRPC library internally creates a separate channel to
communicate with the xDS control plane.
2. External Authorization (ext_authz): As described
in [gRFC A92](https://github.com/grpc/proposal/pull/481), the gRPC server or
client may create an internal channel to contact an external authorization
service.
3. External Processing (ext_proc): As described
in [gRFC A93](https://github.com/grpc/proposal/pull/484), filters may create
internal channels to call external processing servers.

### Related Proposals

* [A27: xDS-Based Global Load Balancing](https://github.com/grpc/proposal/blob/master/A27-xds-global-load-balancing.md)
* [A66: Otel Stats](https://github.com/grpc/proposal/blob/master/A66-otel-stats.md)
* [A72: OpenTelemetry Tracing](https://github.com/grpc/proposal/blob/master/A72-open-telemetry-tracing.md)
* [A92: xDS ExtAuthz Support](https://github.com/grpc/proposal/pull/481)
* [A93: xDS ExtProc Support](https://github.com/grpc/proposal/pull/484)

### The Problem

The primary motivation for this feature is the need to configure observability
on a per-child-channel basis.

* StatsPlugins & Tracing: Users need to configure metric sinks (as described in
gRFC [A66](https://github.com/grpc/proposal/blob/master/A66-otel-stats.md)
and [A72](https://github.com/grpc/proposal/blob/master/A72-open-telemetry-tracing.md))
so that telemetry from internal channels is correctly tagged and exported.
* Interceptors: Users may need to apply specific interceptors (e.g., for
logging or tracing) to internal traffic.

These configurations cannot be set globally because different parts of an
application may require different configurations, such as different metric
backends.

## Proposal

We introduce the concept of **Child Channel Options**. This is a configuration
container attached to a parent channel that is strictly designated for use by
its children.

### Encapsulation

The user API must allow "nesting" of channel options. A user creating a Parent
Channel `P` can provide a set of options `O_child`.

* `O_child` is opaque to `P`. `P` does not apply these options to itself.
* `O_child` is carried in `P`'s state, available for extraction by internal
components.
* The configuration provided by `O_child` is strictly uniform across all child
channels of a particular parent channel.
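
The nesting described above can be sketched with a functional-options pattern. This is a minimal illustration only, not the proposed API: `channelConfig`, `Option`, `WithInterceptor`, `WithChildOptions`, and `newChild` are hypothetical names invented for this sketch, and interceptors are modeled as plain strings.

```go
package main

import "fmt"

// channelConfig is a hypothetical stand-in for a channel's resolved settings.
type channelConfig struct {
	name         string
	interceptors []string // options applied to this channel itself
	childOptions []Option // O_child: stored on the parent, never applied to it
}

// Option mutates a channelConfig (functional-options pattern).
type Option func(*channelConfig)

// WithInterceptor is an ordinary option; it takes effect on the channel it is
// passed to.
func WithInterceptor(name string) Option {
	return func(c *channelConfig) { c.interceptors = append(c.interceptors, name) }
}

// WithChildOptions nests opts inside the parent without applying them to it.
func WithChildOptions(opts ...Option) Option {
	return func(c *channelConfig) { c.childOptions = opts }
}

// newChannel builds a channel by applying opts in order.
func newChannel(name string, opts ...Option) *channelConfig {
	c := &channelConfig{name: name}
	for _, o := range opts {
		o(c)
	}
	return c
}

// newChild builds a child channel from the parent's stored O_child. Every
// child of the same parent receives the same set of options.
func (p *channelConfig) newChild(name string) *channelConfig {
	return newChannel(name, p.childOptions...)
}

func main() {
	parent := newChannel("parent",
		WithInterceptor("auth"),
		WithChildOptions(WithInterceptor("tracing")),
	)
	xds := parent.newChild("xds")
	authz := parent.newChild("ext_authz")
	fmt.Println(parent.interceptors, xds.interceptors, authz.interceptors)
}
```

In this sketch the parent ends up with only its own interceptor, while both children receive the identical nested set, which mirrors the opacity and uniformity rules listed above.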

### Propagation

When an internal component (e.g., an xDS client factory or an auth filter)
attached to `P` needs to create a Child Channel `C`:

1. It retrieves `O_child` from `P`.
2. It applies `O_child` to the configuration of `C`.

### Precedence and Merging

The Child Channel `C` typically requires some internal
configuration `O_internal` (e.g., target URIs, or internal interceptors).

* Merge Rule: `O_child` and `O_internal` are merged. If the environment supports
global channel options, `O_child` options override global channel options.
* Conflict Resolution: Mandatory internal settings (`O_internal`) generally take
precedence over user-provided child options (`O_child`) to ensure correctness.
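
One way to picture the merge rule is as layered maps applied in increasing order of precedence. The sketch below assumes the strictest reading, in which internal settings always win; the text only says they "generally" do. The `options` type, the `mergeOptions` helper, and all keys and values are hypothetical.

```go
package main

import "fmt"

// options is a hypothetical key/value view of channel settings.
type options map[string]string

// mergeOptions layers settings in increasing precedence:
// global < O_child < O_internal. Later layers overwrite earlier ones.
func mergeOptions(global, child, internal options) options {
	merged := options{}
	for _, layer := range []options{global, child, internal} {
		for k, v := range layer {
			merged[k] = v
		}
	}
	return merged
}

func main() {
	global := options{"stats": "global-sink", "keepalive": "30s"}
	child := options{"stats": "child-sink", "target": "user-supplied"}
	internal := options{"target": "xds-server.example:443"}

	merged := mergeOptions(global, child, internal)
	fmt.Println(merged["stats"], merged["keepalive"], merged["target"])
}
```

Here the child's `stats` value overrides the global one, the global `keepalive` survives because nothing overrides it, and the internal `target` replaces the user-supplied value.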

### Shared Resources

Certain internal channels, specifically the **xDS Control Plane Client**, are
often pooled and shared across multiple parent channels within a process based
on the target URI (see
[gRFC A27](https://github.com/grpc/proposal/blob/master/A27-xds-global-load-balancing.md)).

If multiple Parent Channels (`P1`, `P2`) point to the same xDS target but
provide *different* Child Channel Options (`O_child1`, `O_child2`):

* Behavior: The shared client is created using the options from the first parent
channel that triggers its creation (e.g., `O_child1`).
* Subsequent Usage: When `P2` requests the client, it receives the existing
shared client. `O_child2` is effectively ignored for that specific shared
resource.
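
The first-creator-wins behavior can be sketched as a pool keyed by target. This is an illustrative model only: `clientPool`, `sharedClient`, and `getOrCreate` are hypothetical names, and child options are modeled as strings.

```go
package main

import (
	"fmt"
	"sync"
)

// sharedClient is a hypothetical stand-in for a pooled xDS client.
type sharedClient struct {
	target  string
	options []string // the child options it was created with
}

// clientPool shares one client per target across parent channels.
type clientPool struct {
	mu      sync.Mutex
	clients map[string]*sharedClient
}

func newClientPool() *clientPool {
	return &clientPool{clients: map[string]*sharedClient{}}
}

// getOrCreate returns the existing client for target if there is one. In that
// case childOpts is ignored; only the first caller's options are ever used.
func (p *clientPool) getOrCreate(target string, childOpts []string) *sharedClient {
	p.mu.Lock()
	defer p.mu.Unlock()
	if c, ok := p.clients[target]; ok {
		return c
	}
	c := &sharedClient{target: target, options: append([]string(nil), childOpts...)}
	p.clients[target] = c
	return c
}

func main() {
	pool := newClientPool()
	c1 := pool.getOrCreate("xds-server.example:443", []string{"O_child1"}) // from P1
	c2 := pool.getOrCreate("xds-server.example:443", []string{"O_child2"}) // from P2
	fmt.Println(c1 == c2, c2.options)
}
```

Both parents end up holding the same client, and that client carries only the options supplied by `P1`.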

Review comment (Member), on lines +99 to +103:

@ejona86 @dfawley Should we define this specific behavior, or should we just say that it's up to the implementation? I don't feel strongly either way, but I wanted to make sure we're all on the same page.


### Language Implementations

#### Java

In Java, configuration is achieved by accepting functions (callbacks).
The API allows users to pass a `Consumer<ManagedChannelBuilder<?>>` (or a
similar functional interface). When an internal library (e.g., xDS, grpclb)
creates a child channel, it applies this user-provided function to the builder
before further configuring the channel.

* ##### Configuration Interface

Define a new public API interface, `ChannelConfigurer`, to encapsulate the
configuration logic for channels.

```java
import io.grpc.ManagedChannelBuilder;
import io.grpc.ServerBuilder;

// Captures the intent of the plugin.
// Consumes a builder to modify it before further configuring the channel or server.
public interface ChannelConfigurer {
  /**
   * Configures the given channel builder.
   *
   * @param builder the channel builder to configure
   */
  default void configureChannelBuilder(ManagedChannelBuilder<?> builder) {}

  /**
   * Configures the given server builder.
   *
   * @param builder the server builder to configure
   */
  default void configureServerBuilder(ServerBuilder<?> builder) {}
}
```

* ##### API Changes

* ManagedChannelBuilder:
Add `ManagedChannelBuilder#childChannelConfigurer(ChannelConfigurer channelConfigurer)`
to allow users to register this configurer.
* XdsServerBuilder:
Add `XdsServerBuilder#childChannelConfigurer(ChannelConfigurer configurer)`
to allow users to provide configuration for any internal channels created
by the server (e.g., connections to external authorization or processing
services).

* ##### Usage Example

```java
// Define the configurer for internal child channels.
// `sink` is assumed to be a metric sink instance created elsewhere.
ChannelConfigurer myInternalConfig = new ChannelConfigurer() {
  @Override
  public void configureChannelBuilder(ManagedChannelBuilder<?> builder) {
    builder.addMetricSink(sink);
  }
};

// Apply it to the parent channel
ManagedChannel channel = ManagedChannelBuilder.forTarget("xds:///my-service")
    .childChannelConfigurer(myInternalConfig) // <--- Configuration injected here
    .build();
```

#### Go

In Go, both the Client (`grpc.NewClient`) and the Server (`xds.NewGRPCServer`)
create internal child channels. We introduce mechanisms to pass `DialOption`s
into these internal channels from both entry points.

* ##### New API for Child Channel Options

* Client-Side: `WithChildChannelOptions`

For standard clients, we introduce a `DialOption` wrapper.

```go
// WithChildChannelOptions returns a DialOption that specifies a list of
// DialOptions to be applied to any internal child channels.
func WithChildChannelOptions(opts ...DialOption) DialOption {
	return newFuncDialOption(func(o *dialOptions) {
		o.childChannelOptions = opts
	})
}
```

* Server-Side: `WithChildDialOptions`

For xDS-enabled servers, we introduce a `ServerOption` wrapper.
Since `xds.NewGRPCServer` creates an internal xDS client to fetch listener
configurations, it requires a way to apply `DialOptions` (such as **Socket
Options** or **Stats Handlers**) to that internal connection.

```go
// WithChildDialOptions returns a ServerOption that specifies a list of
// DialOptions to be applied to the server's internal child channels
// (e.g., the xDS control plane connection).
func WithChildDialOptions(opts ...DialOption) ServerOption {
	return newFuncServerOption(func(o *serverOptions) {
		o.childDialOptions = opts
	})
}
```

* ##### Usage Example (User-Side Code)

This design provides users with the flexibility to define independent
configurations for parent and child channels within a single `grpc.NewClient` call.
For example, a parent channel can be configured with transport security (mTLS)
while the internal child channels (such as the xDS control plane connection)
are configured with specific interceptors or a custom authority.

```go
package main

import (
	"log"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

func main() {
	// Define configuration specifically for the internal control plane.
	// otelHandler is assumed to be a stats.Handler created elsewhere.
	internalOpts := []grpc.DialOption{
		// Inject the OTel handler here. It will only measure traffic on the
		// internal child channels (e.g., to the xDS server).
		grpc.WithStatsHandler(otelHandler),
	}

	// Create the Parent Channel
	conn, err := grpc.NewClient("xds:///my-service",
		// Parent channel configuration (Data Plane)
		grpc.WithTransportCredentials(insecure.NewCredentials()),

		// Child channel configuration (Control Plane)
		// The OTel handler inside here applies ONLY to the child channels.
		grpc.WithChildChannelOptions(internalOpts...),
	)
	if err != nil {
		log.Fatalf("failed to create client: %v", err)
	}
	defer conn.Close()

	// ... use conn ...
}
```

#### Core (C/C++)

In gRPC Core, we utilize the existing `ChannelArgs` mechanism recursively to
pass configuration to internal channels. We define a standard argument key whose
value is a pointer to another `grpc_channel_args` structure. This "Nested
Arguments" pattern allows the parent channel to carry a specific subset of
arguments intended solely for its children.

* ##### Configuration Mechanism

We define a new channel argument key. The value associated with this key is a
pointer to a `grpc_channel_args` struct, managed via a pointer vtable to
ensure correct ownership and copying.

```c
// A pointer argument key. The value is a pointer to a grpc_channel_args
// struct containing the subset of options for child channels.
#define GRPC_ARG_CHILD_CHANNEL_ARGS "grpc.child_channel.args"
```

* ##### API Changes

We add a helper method to the C++ `ChannelArguments` class to simplify packing
the nested arguments safely.

```cpp
// Sets the channel arguments to be used for child channels.
void SetChildChannelArgs(const ChannelArguments& args);
```

* ##### Usage Example (User-Side Code)

```cpp
grpc::ChannelArguments child_channel_args;
// E.g., add a custom tracing interceptor specifically for child channels
child_channel_args.SetPointer(GRPC_ARG_TRACING_PROVIDER, my_tracing_provider);

grpc::ChannelArguments parent_args;
// Pass the nested args up
parent_args.SetChildChannelArgs(child_channel_args);

std::shared_ptr<grpc::Channel> channel =
    grpc::CreateCustomChannel("xds:///my-service", credentials, parent_args);
```

## Rationale

Review comment (Member):

We need to say somewhere that we are not trying to provide any identity information about which channel is being created. The configuration provided by this gRFC is required to be uniform across child channels of a particular channel.

Review comment (Member):

Are we sure this should be a non-goal? E.g. we might have xds client connections needing a different set of options from RLS connections/etc. It seems pretty plausible given that one team will own the control plane, but a different team would own the ext_proc services, so different credentials might be needed or different telemetry configuration may be desired.

This doesn't have to be hard; it could be a simple map from name->config.

Review comment (Member):

This is not for credentials. Both of those channels would send telemetry to the same destination (configured here), so they'd use the same credentials. We would solve multi-ownership problems by allowing applying telemetry multiple times, a different time for each consumer/configuration.

I don't disagree that someone may want to identify the purpose of the child channel, but that then means we need some way to identify the purpose of the child channel. That's a bigger lift and opens up lots of complexities on how you will categorize the child channels' purposes. In our TL discussion about it, nobody could come up with a strong need for the info, so we left it for future work if/when it is needed.


### Why not Global Configuration?

We reject global configuration (static variables) because it prevents
multi-tenant applications from isolating configurations. For example, one client
may need to export metrics to Prometheus, while another in the same process
exports to Cloud Monitoring. Furthermore, libraries want to configure their
channels but cannot do so globally without affecting the host application.