Feature: reject queries that cause partition fullscan #1711

Closed
lqriu wants to merge 1 commit into apache:main from lqriu:feature/reject-partition-fullscan

lqriu commented Apr 30, 2026

What does this PR do?

Add a plan-time check that rejects queries on partitioned tables when no
effective partition pruning occurs, preventing unintended full partition
scans in production environments.

Two new GUC parameters:

  • reject_partition_fullscan (bool, default ON) — enable/disable
  • partition_fullscan_threshold (int, default 0) — max partitions allowed
    after pruning; 0 means reject only true fullscans

Planner path (inherit.c)

  • Check inserted in expand_partitioned_rtentry() after
    prune_append_rel_partitions() returns
  • Compares num_live_parts vs relinfo->nparts
  • Exempts queries with Param nodes (prepared statements, subquery params)
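
The planner-path decision described above can be modeled in a few lines of self-contained C. This is a minimal sketch, not the code from the patch: the struct and function names are hypothetical, and the behavior for a non-zero threshold (reject when more than `partition_fullscan_threshold` partitions survive pruning) is an assumption inferred from the GUC description.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the inputs the check looks at. */
typedef struct FullscanCheckInput
{
    bool reject_partition_fullscan;    /* new GUC, default on */
    int  partition_fullscan_threshold; /* new GUC, default 0 */
    bool enable_partition_pruning;     /* existing GUC */
    int  nparts;                       /* total partitions of the table */
    int  num_live_parts;               /* partitions left after pruning */
    bool has_param;                    /* restriction clauses contain a Param */
} FullscanCheckInput;

/* Returns true when the query would be rejected under the assumed rules. */
bool
should_reject_fullscan(const FullscanCheckInput *in)
{
    /* Exemptions: check disabled, pruning disabled, trivial table, Param. */
    if (!in->reject_partition_fullscan || !in->enable_partition_pruning)
        return false;
    if (in->nparts <= 1)
        return false;
    if (in->has_param)
        return false;

    /* Threshold 0: reject only when nothing was pruned at all. */
    if (in->partition_fullscan_threshold == 0)
        return in->num_live_parts >= in->nparts;

    /* Assumption: otherwise reject when more partitions survive than allowed. */
    return in->num_live_parts > in->partition_fullscan_threshold;
}
```

With the defaults, a 12-partition table where all 12 partitions survive pruning is rejected, while the same table pruned down to 3 partitions passes.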

ORCA path (orca.c)

  • Post-plan check in optimize_query() via plan_tree_walker
  • Inspects part_prune_info on 7 node types (Append, MergeAppend,
    DynamicSeqScan, DynamicIndexScan, DynamicIndexOnlyScan,
    DynamicBitmapHeapScan, DynamicForeignScan)
  • Skips PartitionSelector (JOIN dynamic pruning)
  • Exempts nodes with initial_pruning_steps or exec_pruning_steps
    (runtime pruning capable)
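
The post-plan walk can be illustrated with a toy plan tree. This is a sketch under stated assumptions rather than the patch's code: `ToyPlan`, `ToyNodeTag`, and both functions are hypothetical stand-ins for PostgreSQL's plan nodes and `plan_tree_walker`, the partition counts are assumed to be derivable from `part_prune_info`, and "skipping" PartitionSelector is modeled as simply not inspecting that node.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node tags; only the ones relevant to the check are modeled. */
typedef enum ToyNodeTag
{
    TOY_APPEND,
    TOY_MERGE_APPEND,
    TOY_DYNAMIC_SEQ_SCAN,
    TOY_DYNAMIC_INDEX_SCAN,
    TOY_DYNAMIC_INDEX_ONLY_SCAN,
    TOY_DYNAMIC_BITMAP_HEAP_SCAN,
    TOY_DYNAMIC_FOREIGN_SCAN,
    TOY_PARTITION_SELECTOR,
    TOY_OTHER
} ToyNodeTag;

typedef struct ToyPlan
{
    ToyNodeTag      tag;
    bool            has_runtime_pruning_steps; /* initial or exec pruning steps */
    int             nparts;                    /* total partitions (assumed known) */
    int             num_live_parts;            /* partitions the node will scan */
    struct ToyPlan *left;
    struct ToyPlan *right;
} ToyPlan;

/* The 7 node types the PR description says are inspected. */
bool
toy_is_checked_node(ToyNodeTag tag)
{
    switch (tag)
    {
        case TOY_APPEND:
        case TOY_MERGE_APPEND:
        case TOY_DYNAMIC_SEQ_SCAN:
        case TOY_DYNAMIC_INDEX_SCAN:
        case TOY_DYNAMIC_INDEX_ONLY_SCAN:
        case TOY_DYNAMIC_BITMAP_HEAP_SCAN:
        case TOY_DYNAMIC_FOREIGN_SCAN:
            return true;
        default:
            return false; /* includes TOY_PARTITION_SELECTOR */
    }
}

/* Recursive walk: true if any inspected node scans every partition. */
bool
toy_plan_has_fullscan(const ToyPlan *node)
{
    if (node == NULL)
        return false;
    if (toy_is_checked_node(node->tag) &&
        !node->has_runtime_pruning_steps &&
        node->nparts > 1 &&
        node->num_live_parts >= node->nparts)
        return true;
    return toy_plan_has_fullscan(node->left) ||
           toy_plan_has_fullscan(node->right);
}
```

A node that carries runtime pruning steps is never flagged in this model, which mirrors the exemption for runtime-pruning-capable nodes listed above.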

Exemptions

| Scenario | Behavior |
|---|---|
| reject_partition_fullscan = off | Allowed |
| enable_partition_pruning = off | Allowed |
| Single-partition table (nparts <= 1) | Allowed |
| Parameterized query ($1, PARAM_EXEC) | Allowed (runtime pruning) |
| PartitionSelector (ORCA JOIN pruning) | Allowed |
| No WHERE / WHERE 1=1 | Rejected |
| WHERE on non-partition-key column | Rejected |

Error message

ERROR: partitioned table "schema.table" full partition scan is not
       allowed, N partitions would be scanned
HINT:  Add a WHERE clause on the partition key to enable partition pruning.

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation update

Breaking Changes

Queries that previously performed full partition scans are now rejected
by default, because reject_partition_fullscan defaults to on. To disable
the check:
SET reject_partition_fullscan = off;

Test Plan

  • New regression test: partition_fullscan_reject.sql (12 scenarios)
    • Basic rejection (no WHERE, WHERE 1=1, non-partition-key WHERE)
    • Pruning passes (WHERE on partition key)
    • GUC on/off toggle
    • enable_partition_pruning=off exemption
    • Single-partition table exemption
    • Threshold mode (threshold=2)
    • Prepared statement Param exemption
    • UPDATE/DELETE rejection
    • Subquery propagation
    • ORCA path verification
  • make installcheck
  • make -C src/test installcheck-cbdb-parallel

Impact

  • Performance: negligible — Planner path adds one integer comparison
    per partitioned table; ORCA path adds one lightweight plan_tree_walker
  • User-facing: new ERROR for unfiltered partition queries; two new GUCs
  • Risk: low — read-only checks, GUC-controllable, zero ORCA C++ changes

Compliance Checklist

  • I have read the contribution guide
  • My code follows PostgreSQL coding conventions
  • New functionality includes tests
  • I have reviewed my changes for security implications
  • I have requested appropriate reviewers

Files Changed

| File | Change |
|---|---|
| src/backend/optimizer/path/costsize.c | GUC variable definitions (+2) |
| src/include/optimizer/cost.h | extern declarations (+2) |
| src/backend/utils/misc/guc.c | GUC registration (+27) |
| src/include/utils/unsync_guc_name.h | GUC name registration (+2) |
| src/backend/optimizer/util/inherit.c | Planner path check (+85) |
| src/backend/optimizer/plan/orca.c | ORCA path check (+164) |
| src/test/regress/sql/partition_fullscan_reject.sql | New test (+127) |
| src/test/regress/greenplum_schedule | Register test (+2) |

Add a plan-time check that rejects queries on partitioned tables when
no effective partition pruning occurs, preventing unintended full
partition scans that waste cluster resources.

New GUC parameters:
- reject_partition_fullscan (bool, default on): enable/disable
- partition_fullscan_threshold (int, default 0): max partitions
  allowed after pruning. 0 = reject only true fullscans.

Planner path (inherit.c):
- Check in expand_partitioned_rtentry() after
  prune_append_rel_partitions() returns
- Exempts parameterized queries (Param nodes in baserestrictinfo)

ORCA path (orca.c):
- Post-plan check in optimize_query() using plan_tree_walker
- Inspects part_prune_info on 7 node types (Append, MergeAppend,
  and five Dynamic*Scan types)
- Skips PartitionSelector (JOIN dynamic pruning)
- Exempts nodes with initial/exec_pruning_steps

Exemptions: enable_partition_pruning=off, single-partition tables,
parameterized queries, JOIN-based dynamic partition selection.

@github-actions github-actions Bot left a comment

Hi, @lqriu welcome!🎊 Thanks for taking the effort to make our project better! 🙌 Keep making such awesome contributions!

yjhjstz (Member) commented Apr 30, 2026

Thanks for the contribution! The feature addresses a real operational pain point.

One architectural suggestion: could this be implemented as an extension and placed under contrib/ (or gpcontrib/)? Using PostgreSQL's planner_hook, the extension fires after the planner (including ORCA) returns the final PlannedStmt, allowing a plan tree walk to detect missing pruning steps — which is essentially what the ORCA path in orca.c already does. GUCs can be registered from the extension side via DefineCustomBoolVariable/DefineCustomIntVariable, keeping all the knobs without touching core GUC tables.

The main trade-off is that planner_hook only sees the finished plan, while the current standard planner path hooks mid-planning inside expand_partitioned_rtentry() to compare num_live_parts vs nparts. For the partition_fullscan_threshold feature this is slightly less precise — but for the primary use case of rejecting true full scans (zero pruning steps), the extension approach is fully equivalent.

Placing this in contrib/ would:

  • Keep core optimizer files (inherit.c, orca.c, guc.c) untouched, reducing cross-version maintenance burden
  • Make the feature easier to evolve and backport independently

Would you be open to exploring that direction? Happy to discuss if there are cases where the extension approach falls short of your requirements.
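
The hook-based approach suggested above relies on the usual save-and-chain pattern for PostgreSQL hooks: the extension stores the previous hook value at load time, installs its own function, and calls the previous hook (or the standard planner) before inspecting the result. The sketch below models only that pattern in self-contained C. Every `toy_*` name is a hypothetical stand-in, not a PostgreSQL symbol, and a boolean flag takes the place of raising an ERROR.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the finished plan. */
typedef struct ToyPlannedStmt
{
    int unpruned_partition_scans; /* scans found without any pruning */
} ToyPlannedStmt;

typedef ToyPlannedStmt (*toy_planner_hook_type) (int unpruned_scans_in_query);

/* Stand-ins for the core hook variable, the custom GUC, and the error. */
toy_planner_hook_type toy_planner_hook = NULL;
bool toy_reject_partition_fullscan = true;
bool toy_query_rejected = false;

static toy_planner_hook_type prev_toy_planner_hook = NULL;

/* Stand-in for the built-in planner (standard planner or ORCA). */
ToyPlannedStmt
toy_standard_planner(int unpruned_scans_in_query)
{
    ToyPlannedStmt stmt = { unpruned_scans_in_query };
    return stmt;
}

/* The extension's hook: plan first, then inspect the finished plan. */
ToyPlannedStmt
toy_reject_fullscan_planner(int unpruned_scans_in_query)
{
    ToyPlannedStmt stmt = prev_toy_planner_hook
        ? prev_toy_planner_hook(unpruned_scans_in_query)
        : toy_standard_planner(unpruned_scans_in_query);

    toy_query_rejected = toy_reject_partition_fullscan &&
                         stmt.unpruned_partition_scans > 0;
    return stmt;
}

/* Stand-in for the extension's load-time initialization. */
void
toy_extension_init(void)
{
    prev_toy_planner_hook = toy_planner_hook; /* keep any earlier hook */
    toy_planner_hook = toy_reject_fullscan_planner;
}

/* Stand-in for the core entry point that dispatches to the hook. */
ToyPlannedStmt
toy_plan_query(int unpruned_scans_in_query)
{
    return toy_planner_hook
        ? toy_planner_hook(unpruned_scans_in_query)
        : toy_standard_planner(unpruned_scans_in_query);
}
```

Because the check runs only after planning completes, the hook sees the same finished plan regardless of which planner produced it, which is the property the suggestion depends on.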

lqriu (Author) commented May 1, 2026 via email

yjhjstz (Member) commented May 1, 2026

Thank you for the detailed analysis. I prefer Option 3 — enhancing the extension to handle the standard planner path without touching core files.

For the Append-node heuristic: if a node is an Append with partition children but part_prune_info == NULL, we can cross-reference the RTE to confirm it's a partitioned table (via RTE_RELATION + inh = true), then treat it as an undetected full scan. This keeps the implementation fully self-contained in the extension.
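
The Append-node heuristic can be written down as a single predicate. This is a minimal sketch in self-contained C: `ToyAppendInfo` and `ToyRteKind` are hypothetical stand-ins that flatten the plan node and its range table entry into one struct, and the predicate encodes only the conditions named in the comment above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the range table entry kind. */
typedef enum ToyRteKind
{
    TOY_RTE_RELATION,
    TOY_RTE_SUBQUERY,
    TOY_RTE_OTHER
} ToyRteKind;

/* Hypothetical flattened view of a plan node plus its parent RTE. */
typedef struct ToyAppendInfo
{
    bool       is_append;           /* node is an Append */
    bool       has_part_prune_info; /* part_prune_info != NULL */
    ToyRteKind rtekind;             /* kind of the referenced RTE */
    bool       inh;                 /* RTE's inh flag */
} ToyAppendInfo;

/*
 * True when an Append has no pruning info but its RTE looks like a
 * partitioned parent (RTE_RELATION with inh = true).
 */
bool
toy_is_undetected_fullscan(const ToyAppendInfo *info)
{
    return info->is_append &&
           !info->has_part_prune_info &&
           info->rtekind == TOY_RTE_RELATION &&
           info->inh;
}
```

Since `inh` is also set for traditional inheritance parents, an actual implementation would probably need an extra check that the relation is a partitioned table; that refinement is an assumption here, not something stated in the thread.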

lqriu (Author) commented May 2, 2026

Closing this PR in favor of a new implementation as a gpcontrib extension, per reviewer feedback. The new PR uses planner_hook instead of modifying core optimizer files.

See the new PR for the extension-based approach.
