Skip to content

[CALCITE-7620] Result of FILTER clause in window functions is incorrect#5040

Open
xuzifu666 wants to merge 3 commits into
apache:mainfrom
xuzifu666:filter_fix
Open

[CALCITE-7620] Result of FILTER clause in window functions is incorrect#5040
xuzifu666 wants to merge 3 commits into
apache:mainfrom
xuzifu666:filter_fix

Conversation

@xuzifu666

Copy link
Copy Markdown
Member

* partition by ORDER BY keys. The output order is therefore not defined by
* a simple collation in the general case, so we conservatively report no
* collations. */
public static @Nullable List<RelCollation> window(RelMetadataQuery mq, RelNode input,

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for modifying RelMdCollation.window is that the original window sorting derivation was too optimistic, which would cause the optimizer to mistakenly believe that the window output retained the input order, thus mistakenly deleting the top-level Sort.
The original implementation had the following problem:

  1. Previously, RelMdCollation.window directly returned mq.collations(input), meaning "the window operator will preserve the order of the input rows as is." However, the actual implementation of EnumerableWindow first groups the rows by the PARTITION BY key using SortedMultiMap, and then sorts them within each group by the window ORDER BY key. Therefore, the input order is not preserved; the global output order is PARTITION BY keys + ORDER BY keys, not simply the input order.
    This caused the top-level Sort to be incorrectly optimized away.

  2. When order by empno is written in the SQL, if the window also happens to be sorted by empno, the optimizer will mistakenly assume that the window output is globally ordered, thus deleting the top-level EnumerableSort. The resulting output is grouped by deptno, not sorted by empno.

this.hints = calc.getHints();
this.cluster = calc.getCluster();
this.traits = calc.getTraitSet();
this.traits = calc.getTraitSet()

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for modifying CalcRelSplitter.java is that when ProjectToWindowRule splits Calc/Project containing window functions, it passes the original node's trait set (including the contaminated collation) to the new node after splitting, causing the optimizer to incorrectly remove the top-level Sort before the window expands.

!ok

# Test 3: Multiple FILTER with OVER on different aggregates
select empno, deptno,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another case

select ename, job, hiredate,
  avg(sal) over (order by hiredate rows 3 preceding) as avg_sal,
  avg(sal) filter (where job = 'MANAGER') over (order by hiredate rows 3 preceding)
    as avg_mgr_sal
from emp
order by hiredate;

@xuzifu666 xuzifu666 Jun 23, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, this test has been added; the AVG_MGR_SAL field is related to this filter modification, and the data is consistent with https://onecompiler.com/postgresql/44smkpfxb

@sonarqubecloud

Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants