Feature: reject queries that cause partition fullscan #1711
lqriu wants to merge 1 commit into apache:main
Add a plan-time check that rejects queries on partitioned tables when no effective partition pruning occurs, preventing unintended full partition scans that waste cluster resources.

New GUC parameters:
- `reject_partition_fullscan` (bool, default on): enable/disable
- `partition_fullscan_threshold` (int, default 0): max partitions allowed after pruning. 0 = reject only true fullscans.

Planner path (`inherit.c`):
- Check in `expand_partitioned_rtentry()` after `prune_append_rel_partitions()` returns
- Exempts parameterized queries (Param nodes in `baserestrictinfo`)

ORCA path (`orca.c`):
- Post-plan check in `optimize_query()` using `plan_tree_walker`
- Inspects `part_prune_info` on 7 Dynamic scan node types
- Skips PartitionSelector (JOIN dynamic pruning)
- Exempts nodes with initial/exec_pruning_steps

Exemptions: `enable_partition_pruning=off`, single-partition tables, parameterized queries, JOIN-based dynamic partition selection.
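The decision rule described above can be sketched in plain C. This is a minimal model rather than the patch's code: `FullscanGucs`, `PruneResult` and `should_reject_fullscan` are hypothetical names, and reading a non-zero threshold as "reject when more than that many partitions survive pruning" is an assumption drawn from the GUC description.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the GUCs named in the description. */
typedef struct FullscanGucs {
    bool reject_partition_fullscan;     /* default: on */
    int  partition_fullscan_threshold;  /* default: 0 */
    bool enable_partition_pruning;      /* existing PostgreSQL GUC */
} FullscanGucs;

/* Hypothetical summary of one partitioned table after pruning. */
typedef struct PruneResult {
    int  nparts;          /* partitions before pruning */
    int  num_live_parts;  /* partitions left after pruning */
    bool has_param_qual;  /* a Param node appears in the restriction quals */
} PruneResult;

bool
should_reject_fullscan(const FullscanGucs *gucs, const PruneResult *res)
{
    assert(res->num_live_parts >= 0 && res->num_live_parts <= res->nparts);

    /* Exemptions listed in the description. */
    if (!gucs->reject_partition_fullscan || !gucs->enable_partition_pruning)
        return false;
    if (res->nparts <= 1)       /* single-partition table */
        return false;
    if (res->has_param_qual)    /* may still be pruned at run time */
        return false;

    /* Threshold 0: reject only when nothing was pruned. */
    if (gucs->partition_fullscan_threshold == 0)
        return res->num_live_parts == res->nparts;

    /* Assumed reading of a non-zero threshold. */
    return res->num_live_parts > gucs->partition_fullscan_threshold;
}
```

With the default settings, a 12-partition table that keeps all 12 partitions is rejected, while the same table pruned to 3 partitions is accepted.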
Hi, @lqriu welcome!🎊 Thanks for taking the effort to make our project better! 🙌 Keep making such awesome contributions!
Thanks for the contribution! The feature addresses a real operational pain point.

One architectural suggestion: could this be implemented as an extension and placed under `contrib/` (or `gpcontrib/`)? Using PostgreSQL's `planner_hook`, the extension fires after the planner (including ORCA) returns the final `PlannedStmt`, allowing a plan tree walk to detect missing pruning steps, which is essentially what the ORCA path in `orca.c` already does. GUCs can be registered from the extension side via `DefineCustomBoolVariable`/`DefineCustomIntVariable`, keeping all the knobs without touching core GUC tables.

The main trade-off is that `planner_hook` only sees the finished plan, while the current standard planner path hooks mid-planning inside `expand_partitioned_rtentry()` to compare `num_live_parts` vs `nparts`. For the `partition_fullscan_threshold` feature this is slightly less precise, but for the primary use case of rejecting true full scans (zero pruning steps), the extension approach is fully equivalent.

Placing this in `contrib/` would:
- Keep core optimizer files (`inherit.c`, `orca.c`, `guc.c`) untouched, reducing cross-version maintenance burden
- Make the feature easier to evolve and backport independently

Would you be open to exploring that direction? Happy to discuss if there are cases where the extension approach falls short of your requirements.
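The wrapping pattern suggested here can be modelled without PostgreSQL headers. In the sketch below every type and function (`MockPlannedStmt`, `mock_planner_hook`, `mock_standard_planner`, `fullscan_planner`, `mock_module_init`) is a simplified stand-in invented for illustration; a real extension would install its function in `planner_hook` from `_PG_init()`, register its GUCs with `DefineCustomBoolVariable`/`DefineCustomIntVariable`, and raise an error instead of setting a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for PlannedStmt: only what the check needs. */
typedef struct MockPlannedStmt {
    bool has_unpruned_partition_scan;
    bool rejected;   /* a real extension would raise an error instead */
} MockPlannedStmt;

typedef MockPlannedStmt *(*mock_planner_hook_type)(const char *query);

/* Stand-in for the global hook pointer; NULL means "no hook installed". */
mock_planner_hook_type mock_planner_hook = NULL;

static mock_planner_hook_type prev_hook = NULL;
static MockPlannedStmt last_plan;

/* Stand-in for the built-in planner. */
MockPlannedStmt *
mock_standard_planner(const char *query)
{
    /* Toy rule: a query without "WHERE" is planned as an unpruned scan. */
    last_plan.has_unpruned_partition_scan = (strstr(query, "WHERE") == NULL);
    last_plan.rejected = false;
    return &last_plan;
}

/* The extension's hook: plan first, then inspect the finished plan. */
MockPlannedStmt *
fullscan_planner(const char *query)
{
    MockPlannedStmt *stmt;

    assert(query != NULL);
    stmt = prev_hook ? prev_hook(query) : mock_standard_planner(query);
    if (stmt->has_unpruned_partition_scan)
        stmt->rejected = true;
    return stmt;
}

/* Stand-in for _PG_init(): remember any earlier hook, then install ours. */
void
mock_module_init(void)
{
    prev_hook = mock_planner_hook;
    mock_planner_hook = fullscan_planner;
}
```

Saving the previous hook and calling it first keeps the check compatible with other extensions that also wrap the planner.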
Hi,
Thank you for the review and the suggestion to implement this as an extension under `gpcontrib/`. I fully agree with the approach — keeping core optimizer files untouched significantly reduces maintenance burden across versions.
I've started reworking the implementation as an extension using `planner_hook`. The hook wraps `standard_planner()` (which internally dispatches to either ORCA or the PG planner), then walks the resulting `PlannedStmt` plan tree checking `PartitionPruneInfo` on each partition scan node.
**The ORCA path works perfectly** — ORCA always generates `DynamicSeqScan` (or similar Dynamic nodes) with a populated `part_prune_info`, even for queries with no WHERE clause. The extension can detect `present_parts == nparts` with empty pruning steps and reject the query.
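A sketch of that ORCA-path test, under simplifying assumptions: `MockPruneInfo` is a hypothetical flattened stand-in for the pruning information attached to a Dynamic scan node, and the set of present partitions is modelled as a 64-bit mask rather than a PostgreSQL `Bitmapset`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, flattened view of one node's pruning information. */
typedef struct MockPruneInfo {
    int      nparts;           /* total partitions of the table */
    uint64_t present_parts;    /* bit i set => partition i is scanned */
    int      n_initial_steps;  /* length of initial_pruning_steps */
    int      n_exec_steps;     /* length of exec_pruning_steps */
} MockPruneInfo;

/* Count set bits without relying on compiler builtins. */
int
count_present(uint64_t mask)
{
    int n = 0;

    while (mask != 0)
    {
        mask &= mask - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* True when every partition is scanned and no pruning steps exist. */
bool
is_unpruned_scan(const MockPruneInfo *info)
{
    assert(info->nparts > 0 && info->nparts <= 64);

    if (info->n_initial_steps > 0 || info->n_exec_steps > 0)
        return false;   /* run-time pruning can still drop partitions */
    return count_present(info->present_parts) == info->nparts;
}
```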
However, I've identified a gap in the **standard PG Planner path** that I'd like to discuss. There are three common fullscan scenarios where the Planner does not generate `PartitionPruneInfo` at all, making them invisible to the post-plan hook:
**1. No WHERE clause**
`baserestrictinfo` is NIL → `prunequal` is NIL → `createplan.c` line 1487 (`if (prunequal != NIL)`) skips `make_partition_pruneinfo()` entirely → Append node's `part_prune_info` is NULL.
**2. WHERE 1=1 (or WHERE true)**
After constant folding, the expression is eliminated and `baserestrictinfo` becomes NIL — identical to case 1.
**3. WHERE clause on a non-partition-key column (e.g., `WHERE status = 'active'` on a date-partitioned table)**
`prunequal` is non-NIL, so `make_partition_pruneinfo()` is called. But inside `gen_partprune_steps_internal()`, `match_clause_to_partition_key()` returns `PARTCLAUSE_NOMATCH` for every clause (none reference the partition key). No pruning steps are generated → `make_partition_pruneinfo()` returns NULL → `part_prune_info` is NULL.
In all three cases, the extension's plan-tree walker sees `part_prune_info == NULL` and has no way to distinguish "no pruning attempted on a partitioned table" from "not a partitioned table at all."
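The third case can be illustrated with a toy model. `MockClause` and `count_pruning_steps` are hypothetical names, and matching by column name alone is a deliberate simplification: the real `match_clause_to_partition_key()` also looks at operators and expression shape.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical restriction clause, reduced to the column it references. */
typedef struct MockClause {
    const char *column;
} MockClause;

/*
 * Number of clauses that could drive pruning. Zero corresponds to the
 * situation above where no pruning steps are generated and the plan
 * node ends up without pruning information.
 */
int
count_pruning_steps(const MockClause *clauses, size_t n, const char *partkey)
{
    int steps = 0;

    assert(partkey != NULL);
    for (size_t i = 0; i < n; i++)
    {
        if (strcmp(clauses[i].column, partkey) == 0)
            steps++;
    }
    return steps;
}
```

A clause on `status` against a table partitioned by a date column yields zero steps, which is indistinguishable, from the finished plan alone, from a query with no WHERE clause.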
**Impact assessment:**
Since Cloudberry defaults to ORCA (`optimizer = on`), and ORCA handles the vast majority of queries, this gap only affects:
- Sessions with `SET optimizer = off`
- Queries that ORCA cannot handle and that fall back to the PG planner (~5-10%)
For the default ORCA path, the extension approach is fully equivalent to the core-modification approach.
**My question:**
Is this trade-off acceptable for the project? I see a few possible paths forward:
1. **Accept the trade-off** — document the Planner-path limitation and proceed with the pure extension approach. This is what your review suggested and covers the primary use case.
2. **Hybrid approach** — use the extension for GUC registration and ORCA-path checking (via `planner_hook`), but also add a minimal check in `inherit.c`'s `expand_partitioned_rtentry()` for the Planner path. This would touch one core file but provide complete coverage.
3. **Enhance the extension** — explore alternative detection methods for the Planner path, such as checking if an Append node has partition-child subplans but no `part_prune_info` attached. This avoids core changes but adds complexity.
I'd appreciate your guidance on which direction the project prefers. Happy to proceed with any of these approaches.
Best regards,
Liu Qiren
On May 1, 2026, at 02:17, Jianghua.yjh wrote:

> yjhjstz left a comment (apache/cloudberry#1711)
>
> Thanks for the contribution! The feature addresses a real operational pain point.
>
> One architectural suggestion: could this be implemented as an extension and placed under contrib/ (or gpcontrib/)? Using PostgreSQL's planner_hook, the extension fires after the planner (including ORCA) returns the final PlannedStmt, allowing a plan tree walk to detect missing pruning steps, which is essentially what the ORCA path in orca.c already does. GUCs can be registered from the extension side via DefineCustomBoolVariable/DefineCustomIntVariable, keeping all the knobs without touching core GUC tables.
>
> The main trade-off is that planner_hook only sees the finished plan, while the current standard planner path hooks mid-planning inside expand_partitioned_rtentry() to compare num_live_parts vs nparts. For the partition_fullscan_threshold feature this is slightly less precise, but for the primary use case of rejecting true full scans (zero pruning steps), the extension approach is fully equivalent.
>
> Placing this in contrib/ would:
> - Keep core optimizer files (inherit.c, orca.c, guc.c) untouched, reducing cross-version maintenance burden
> - Make the feature easier to evolve and backport independently
>
> Would you be open to exploring that direction? Happy to discuss if there are cases where the extension approach falls short of your requirements.
Thank you for the detailed analysis. I prefer Option 3: enhancing the extension to handle the standard planner path without touching core files.

For the Append-node heuristic: if a node is an Append with partition children but `part_prune_info == NULL`, we can cross-reference the RTE to confirm it's a partitioned table (via `RTE_RELATION` + `inh = true`), then treat it as an undetected full scan. This keeps the implementation fully self-contained in the extension.
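The heuristic can be sketched with mock types. Everything below (`MockRte`, `MockAppend`, `append_is_undetected_fullscan`) is a hypothetical simplification: a single `parent_rti` field stands in for however the real Append node identifies its parent relation, and the sketch does not try to tell partitioned tables apart from plain inheritance parents, which a real implementation would probably need to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum MockRteKind { MOCK_RTE_RELATION, MOCK_RTE_SUBQUERY } MockRteKind;

/* Hypothetical range-table entry: only the two fields the check uses. */
typedef struct MockRte {
    MockRteKind rtekind;
    bool        inh;    /* children were expanded from this entry */
} MockRte;

/* Hypothetical Append node summary. */
typedef struct MockAppend {
    int  parent_rti;           /* 1-based range-table index, 0 if none */
    int  nsubplans;            /* number of child subplans */
    bool has_part_prune_info;  /* part_prune_info != NULL */
} MockAppend;

bool
append_is_undetected_fullscan(const MockAppend *app,
                              const MockRte *rtable, int nrtes)
{
    const MockRte *rte;

    assert(app != NULL && rtable != NULL);

    if (app->has_part_prune_info)
        return false;   /* covered by the regular pruning-info check */
    if (app->parent_rti < 1 || app->parent_rti > nrtes)
        return false;   /* no parent relation, e.g. a UNION ALL append */
    if (app->nsubplans <= 1)
        return false;   /* single-partition exemption */

    rte = &rtable[app->parent_rti - 1];
    return rte->rtekind == MOCK_RTE_RELATION && rte->inh;
}
```

The early return for Append nodes without a parent relation is what keeps ordinary `UNION ALL` plans from being flagged.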
Closing this PR in favor of a new implementation as a gpcontrib extension, per reviewer feedback. The new PR uses `planner_hook` instead of modifying core optimizer files. See the new PR for the extension-based approach.
What does this PR do?
Add a plan-time check that rejects queries on partitioned tables when no
effective partition pruning occurs, preventing unintended full partition
scans in production environments.
Two new GUC parameters:
- `reject_partition_fullscan` (bool, default ON): enable/disable
- `partition_fullscan_threshold` (int, default 0): max partitions allowed after pruning; 0 means reject only true fullscans

Planner path (`inherit.c`)
- Check in `expand_partitioned_rtentry()` after `prune_append_rel_partitions()` returns
- Compares `num_live_parts` vs `relinfo->nparts`

ORCA path (`orca.c`)
- Post-plan check in `optimize_query()` via `plan_tree_walker`
- Inspects `part_prune_info` on 7 node types (Append, MergeAppend, DynamicSeqScan, DynamicIndexScan, DynamicIndexOnlyScan, DynamicBitmapHeapScan, DynamicForeignScan)
- Exempts nodes with `initial_pruning_steps` or `exec_pruning_steps` (runtime pruning capable)

Exemptions
- `reject_partition_fullscan = off`
- `enable_partition_pruning = off`

Error message
Type of Change
Breaking Changes
Queries that previously performed full partition scans will now be
rejected by default. Users can disable with:
`SET reject_partition_fullscan = off;`

Test Plan
- `partition_fullscan_reject.sql` (12 scenarios)
- `make installcheck`
- `make -C src/test installcheck-cbdb-parallel`

Impact
per partitioned table; ORCA path adds one lightweight plan_tree_walker
Compliance Checklist
Files Changed
- `src/backend/optimizer/path/costsize.c`
- `src/include/optimizer/cost.h`
- `src/backend/utils/misc/guc.c`
- `src/include/utils/unsync_guc_name.h`
- `src/backend/optimizer/util/inherit.c`
- `src/backend/optimizer/plan/orca.c`
- `src/test/regress/sql/partition_fullscan_reject.sql`
- `src/test/regress/greenplum_schedule`