HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlModerate

ROW_NUMBER() without PARTITION BY still generates Segment iterator

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
generateswithoutpartitioniteratorsegmentrow_numberstill

Problem

I'm writing on an upcoming blog post of mine on ranking and aggregate window functions, specifically the Segment and Sequence Project iterators. The way I understand it is that Segment identifies rows in a stream that constitute the end/beginning of a group, so the following query:

SELECT ROW_NUMBER() OVER (PARTITION BY someGroup ORDER BY someOrder)


Will use Segment to tell when a row belongs to a different group other than the previous row. The Sequence Project iterator then does the actual row number calculation, based on the output of the Segment iterator's output.

But the following query, using that logic, shouldn't have to include a Segment, because there's no partition expression.

SELECT ROW_NUMBER() OVER (ORDER BY someGroup, someOrder)


However, When I try this hypothesis both these queries use a Segment operator. The only difference is that the second query does not need a GroupBy on the Segment. Doesn't that eliminate the need for a Segment in the first place?

Example

CREATE TABLE dbo.someTable (
    someGroup   int NOT NULL,
    someOrder   int NOT NULL,
    someValue   numeric(8, 2) NOT NULL,
    PRIMARY KEY CLUSTERED (someGroup, someOrder)
);

--- Query 1:
SELECT ROW_NUMBER() OVER (PARTITION BY someGroup ORDER BY someOrder)
FROM dbo.someTable;

--- Query 2:
SELECT ROW_NUMBER() OVER (ORDER BY someGroup, someOrder)
FROM dbo.someTable;

Solution

I found this 6 year old blog post mentioning the same behavior.

It looks like ROW_NUMBER() always includes a segment operator, whether PARTITION BY is used or not. If I had to guess I would say this is because it makes creating a query plan easier on the engine.

If the segment is needed in most cases, and in the cases where it's not needed it's essentially a zero-cost non-operation, it's a lot simpler to just always include it in the plan when a windowing function is used.

Context

StackExchange Database Administrators Q#84090, answer score: 12

Revisions (0)

No revisions yet.