Skip to content

[subquery][test] COUNT-containing scalar subquery rewrite in fallback PostgreSQL optimizer path #2

@Excaliiiibur

Description

@Excaliiiibur

Greenplum version or build

open-gpdb OPENGPDB_STABLE (GP6/PG9.4-based line)

Expected behavior

Correlated scalar aggregate subqueries should preserve empty-input semantics after pull-up rewrite.

If target expression contains COUNT, unmatched rows must evaluate with aggregate empty-input defaults (e.g. COUNT -> 0).

Actual behavior

In fallback PostgreSQL optimizer rewrite path (cdbsubselect.c, convert_EXPR_to_join), COUNT-containing scalar-subquery expressions can return incorrect results on unmatched rows.

This is not ORCA-specific. It affects execution paths that fall back to PostgreSQL planner.

Step to reproduce the behavior

CREATE TABLE t_outer(a int, b int) DISTRIBUTED BY (a);
CREATE TABLE t_inner(b int) DISTRIBUTED BY (b);
INSERT INTO t_outer VALUES (0,1),(0,2);
INSERT INTO t_inner VALUES (1);

set optimizer=off;
SELECT a,b
FROM t_outer o
WHERE (o.a + 1) > (
  SELECT count(*) + 1
  FROM t_inner i
  WHERE i.b = o.b
)
ORDER BY b;

Additional impact scope

Besides optimizer=off, the same risk exists for other query shapes that trigger ORCA fallback to PostgreSQL planner (for example unsupported features/GUC combinations where planning switches to Postgres optimizer).

So the issue scope is: fallback PostgreSQL planner rewrite path, not only manual ORCA-off sessions.

Related PR: #1

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions