feat: remote dyn filter basics #7979

Open: discord9 wants to merge 5 commits into main from remote_dyn_filter_01_02

Contributor

@discord9 discord9 commented Apr 16, 2026

I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

GreptimeTeam/greptime-proto#314

Summary

  • define the Phase 1 remote dynamic filter wire ABI, including remote_query_id, DynFilterUpdate, and shared payload encode/decode helpers
  • add the region unary RPC scaffolding for remote dynamic filter update / unregister handling between frontend and datanode
  • keep this PR scoped to scaffolding only, leaving frontend producer and datanode apply/runtime work to follow-up changes

Why

This splits out the protocol and control-plane groundwork into a smaller reviewable PR before the larger end-to-end remote dynamic filter implementation. It gives us a stable ABI and RPC entrypoint first, so later work can build on a reviewed contract instead of mixing transport and runtime changes together.

Included in this PR

  • wire ABI groundwork
  • region RPC scaffolding

Not included

  • frontend producer registration/build-link logic
  • datanode apply/runtime wrapper integration
  • lifecycle cleanup / observability / validation follow-ups

PR Checklist

Please convert it to a draft if some of the following conditions are not met.

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.
  • This PR requires documentation updates.
  • API changes are backward compatible.
  • Schema or data changes are backward compatible.

Signed-off-by: discord9 <discord9@163.com>
Signed-off-by: discord9 <discord9@163.com>
github-actions bot added the size/M and docs-not-required (This change does not impact docs.) labels on Apr 16, 2026
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a remote dynamic filter protocol, enabling the serialization and transmission of DataFusion physical expressions between components. Key additions include the DynFilterPayload and DynFilterUpdate structures, along with client and server-side handlers for dynamic filter updates and unregistrations. The PR also implements automatic generation of a unique remote_query_id using UUID v7 within the QueryContext for better request tracking. Feedback highlights a naming inconsistency between internal structures and gRPC messages regarding the 'epoch' field, as well as a potential issue with strict column name validation when qualifiers are present.
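For readers unfamiliar with UUID v7, the sketch below shows the bit layout defined in RFC 9562: a 48-bit Unix millisecond timestamp, a version nibble, and random bits with a 2-bit variant marker. It is only an illustration of the format. The helper name `uuid_v7_from_parts` and the caller-supplied random bytes are assumptions made here so the example needs no external crate; the PR presumably uses a UUID library rather than hand-rolled code like this.

```rust
/// Formats a UUID version 7 from a Unix timestamp in milliseconds and
/// 10 caller-supplied random bytes, following the RFC 9562 layout.
/// Hypothetical helper for illustration only; not taken from the PR.
pub fn uuid_v7_from_parts(unix_ms: u64, rand: [u8; 10]) -> String {
    let mut b = [0u8; 16];
    // Bytes 0..6: low 48 bits of the timestamp, big-endian.
    b[..6].copy_from_slice(&unix_ms.to_be_bytes()[2..8]);
    // Byte 6: version nibble 0b0111, then 4 random bits.
    b[6] = 0x70 | (rand[0] & 0x0f);
    b[7] = rand[1];
    // Byte 8: variant bits 0b10, then 6 random bits.
    b[8] = 0x80 | (rand[2] & 0x3f);
    // Bytes 9..16: remaining random bytes.
    b[9..16].copy_from_slice(&rand[3..10]);

    let hex: String = b.iter().map(|x| format!("{:02x}", x)).collect();
    format!(
        "{}-{}-{}-{}-{}",
        &hex[0..8],
        &hex[8..12],
        &hex[12..16],
        &hex[16..20],
        &hex[20..32]
    )
}
```

Because the timestamp occupies the most significant bits, IDs generated later sort after earlier ones as strings, which can help when correlating requests in logs.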

    pub protocol_version: u32,
    pub query_id: String,
    pub filter_id: String,
    pub epoch: u64,
Contributor

medium

The field name epoch in the internal DynFilterUpdate struct is inconsistent with the generation field used in the gRPC RemoteDynFilterUpdate message (as seen in the tests and client code). It is recommended to use consistent naming to avoid confusion.
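One way to act on this suggestion is to rename the internal field and keep the two types aligned through `From` conversions, so the mapping lives in one place. The sketch below assumes the four fields quoted above are the whole internal struct, and it uses a hand-written `RemoteDynFilterUpdate` as a stand-in for the generated gRPC message, whose real definition in greptime-proto may differ.

```rust
/// Internal update with the fields quoted in the review.
/// `generation` replaces `epoch` here to match the wire message name.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct DynFilterUpdate {
    pub protocol_version: u32,
    pub query_id: String,
    pub filter_id: String,
    pub generation: u64,
}

/// Hypothetical stand-in for the generated gRPC message; the real
/// message may carry additional fields (for example the payload).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct RemoteDynFilterUpdate {
    pub protocol_version: u32,
    pub query_id: String,
    pub filter_id: String,
    pub generation: u64,
}

impl From<DynFilterUpdate> for RemoteDynFilterUpdate {
    fn from(u: DynFilterUpdate) -> Self {
        Self {
            protocol_version: u.protocol_version,
            query_id: u.query_id,
            filter_id: u.filter_id,
            generation: u.generation,
        }
    }
}

impl From<RemoteDynFilterUpdate> for DynFilterUpdate {
    fn from(m: RemoteDynFilterUpdate) -> Self {
        Self {
            protocol_version: m.protocol_version,
            query_id: m.query_id,
            filter_id: m.filter_id,
            generation: m.generation,
        }
    }
}
```

If the internal name `epoch` is kept instead, the same two `From` impls would be the single place where `epoch` and `generation` are mapped to each other.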

Comment on lines +126 to +133
if field.name() != column.name() {
    return Err(DataFusionError::Plan(format!(
        "Decoded Column name/index mismatch: payload has '{}' at index {}, but schema field is '{}'",
        column.name(),
        column.index(),
        field.name()
    )));
}
Contributor

medium

The strict name check field.name() != column.name() might be too restrictive if the DataFusion Column expression contains qualifiers (e.g., table.column) while the input schema fields are unqualified. Consider using a more robust comparison or ensuring that qualifiers are handled consistently during serialization and deserialization.
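A minimal sketch of a more tolerant comparison follows. It assumes qualifiers are dot-separated and that bare column names contain no dots, which does not hold for quoted identifiers; both helper names are hypothetical and work on plain strings rather than DataFusion types.

```rust
/// Returns the part of `name` after the last '.', or the whole name
/// when there is no dot.
/// Assumption: qualifiers are dot-separated and bare column names
/// contain no dots; quoted identifiers would need a real parser.
fn unqualified(name: &str) -> &str {
    name.rsplit('.').next().unwrap_or(name)
}

/// Accepts an exact match, or a match on the unqualified parts.
pub fn column_matches_field(field_name: &str, column_name: &str) -> bool {
    field_name == column_name || unqualified(field_name) == unqualified(column_name)
}
```

Note the trade-off: this also accepts `a.host` against `b.host`, so it loosens the check. Normalizing qualifiers at encode time, the other option the comment mentions, would keep the decode-side check strict.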

Signed-off-by: discord9 <discord9@163.com>
@discord9 discord9 marked this pull request as ready for review April 16, 2026 12:14
@discord9 discord9 requested review from a team, evenyag and v0y4g3r as code owners April 16, 2026 12:14
-    .channel(channel);
+    .channel(channel)
+    .set_extension(
+        REMOTE_QUERY_ID_EXTENSION_KEY.to_string(),
Collaborator

remote_query_id is being introduced here as the internal correlation key for remote dynamic filters, but this builder still applies every client-supplied hint after generating it. Because x-greptime-hints accepts arbitrary keys, an external gRPC caller can set remote_query_id=... and force collisions with another query's control-plane traffic. The same overwrite path exists for HTTP via http::hints::extract_hints(). Once follow-up PRs start using this ID to register/update/unregister filters, this becomes a spoofable internal identifier.
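One possible mitigation is to treat the key as reserved and drop it from client-supplied hints before they are applied. The sketch below models extensions as a plain `HashMap`; the function `apply_client_hints`, the `RESERVED_KEYS` list, and the literal key value `"remote_query_id"` are assumptions for illustration, not the PR's actual builder API.

```rust
use std::collections::HashMap;

/// Assumed key value; the PR's `REMOTE_QUERY_ID_EXTENSION_KEY` may differ.
pub const REMOTE_QUERY_ID_EXTENSION_KEY: &str = "remote_query_id";

/// Keys that only the server is allowed to set (hypothetical list).
const RESERVED_KEYS: &[&str] = &[REMOTE_QUERY_ID_EXTENSION_KEY];

/// Copies client hints into `extensions`, skipping reserved keys.
/// Returns the rejected keys so the caller can log them or fail the request.
pub fn apply_client_hints(
    extensions: &mut HashMap<String, String>,
    hints: Vec<(String, String)>,
) -> Vec<String> {
    let mut rejected = Vec::new();
    for (key, value) in hints {
        // Case-insensitive match so `Remote_Query_Id` cannot slip through.
        if RESERVED_KEYS.iter().any(|r| r.eq_ignore_ascii_case(&key)) {
            rejected.push(key);
        } else {
            extensions.insert(key, value);
        }
    }
    rejected
}
```

An alternative with the same effect would be to generate the ID after all hints are applied, so the server-side value always wins regardless of what the client sent.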

Member

Agree

Signed-off-by: discord9 <discord9@163.com>
#[derive(Clone, Debug, PartialEq, Eq, Serialize, Deserialize)]
#[non_exhaustive]
#[serde(tag = "kind", content = "payload", rename_all = "snake_case")]
pub enum DynFilterPayload {
Member

Always document the public structs and functions.

    .as_ref()
    .context(error::MissingRequiredFieldSnafu { name: "action" })?
{
    api::v1::region::remote_dyn_filter_request::Action::Update(update) => {
Member

use api::v1::region::remote_dyn_filter_request::Action

Make the code clean.
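The effect of the suggested import can be sketched as follows. The nested `api` module here is a hand-written stand-in that mirrors the path from the diff; the `Unregister` variant, the `String` payloads, and the `describe` function are assumptions, and the plain `String` error replaces the snafu context used in the PR.

```rust
// Hypothetical stand-in mirroring the generated module path from the diff.
pub mod api {
    pub mod v1 {
        pub mod region {
            pub mod remote_dyn_filter_request {
                #[derive(Debug, Clone, PartialEq)]
                pub enum Action {
                    Update(String),
                    Unregister(String),
                }
            }
        }
    }
}

// Import once so the match arms stay short.
use api::v1::region::remote_dyn_filter_request::Action;

/// Dispatches on the action, returning an error when it is missing.
pub fn describe(action: Option<&Action>) -> Result<String, String> {
    match action.ok_or_else(|| "missing required field: action".to_string())? {
        Action::Update(id) => Ok(format!("update {}", id)),
        Action::Unregister(id) => Ok(format!("unregister {}", id)),
    }
}
```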

-    .channel(channel);
+    .channel(channel)
+    .set_extension(
+        REMOTE_QUERY_ID_EXTENSION_KEY.to_string(),
Member

Agree

Signed-off-by: discord9 <discord9@163.com>
3 participants