patternsqlModerate
ROW_NUMBER() without PARTITION BY still generates Segment iterator
Viewed 0 times
generateswithoutpartitioniteratorsegmentrow_numberstill
Problem
I'm writing on an upcoming blog post of mine on ranking and aggregate window functions, specifically the Segment and Sequence Project iterators. The way I understand it is that Segment identifies rows in a stream that constitute the end/beginning of a group, so the following query:
Will use Segment to tell when a row belongs to a different group other than the previous row. The Sequence Project iterator then does the actual row number calculation, based on the output of the Segment iterator's output.
But the following query, using that logic, shouldn't have to include a Segment, because there's no partition expression.
However, When I try this hypothesis both these queries use a Segment operator. The only difference is that the second query does not need a
Example
SELECT ROW_NUMBER() OVER (PARTITION BY someGroup ORDER BY someOrder)Will use Segment to tell when a row belongs to a different group other than the previous row. The Sequence Project iterator then does the actual row number calculation, based on the output of the Segment iterator's output.
But the following query, using that logic, shouldn't have to include a Segment, because there's no partition expression.
SELECT ROW_NUMBER() OVER (ORDER BY someGroup, someOrder)However, When I try this hypothesis both these queries use a Segment operator. The only difference is that the second query does not need a
GroupBy on the Segment. Doesn't that eliminate the need for a Segment in the first place?Example
CREATE TABLE dbo.someTable (
someGroup int NOT NULL,
someOrder int NOT NULL,
someValue numeric(8, 2) NOT NULL,
PRIMARY KEY CLUSTERED (someGroup, someOrder)
);
--- Query 1:
SELECT ROW_NUMBER() OVER (PARTITION BY someGroup ORDER BY someOrder)
FROM dbo.someTable;
--- Query 2:
SELECT ROW_NUMBER() OVER (ORDER BY someGroup, someOrder)
FROM dbo.someTable;Solution
I found this 6 year old blog post mentioning the same behavior.
It looks like
If the segment is needed in most cases, and in the cases where it's not needed it's essentially a zero-cost non-operation, it's a lot simpler to just always include it in the plan when a windowing function is used.
It looks like
ROW_NUMBER() always includes a segment operator, whether PARTITION BY is used or not. If I had to guess I would say this is because it makes creating a query plan easier on the engine.If the segment is needed in most cases, and in the cases where it's not needed it's essentially a zero-cost non-operation, it's a lot simpler to just always include it in the plan when a windowing function is used.
Context
StackExchange Database Administrators Q#84090, answer score: 12
Revisions (0)
No revisions yet.