HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlModerate

Is order by necessary after a group by clause?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
afternecessaryordergroupclause

Problem

Suppose I have a simple query

SELECT col1, col2, COUNT(col3)  
FROM tbl_name    
WHERE condition    
GROUP BY col1, col2    
ORDER BY col1 ;


My understanding is group by clause seems to already have a sorting functionality. DB engine would sort col1 then col2 if adjacent rows of col1 happen to have same values. As least that is my experience with sql server.

If that is the case, order by clause here isn't really necessary?

Solution

My understanding is group by clause seems to already have a sorting
functionality.

There are two basic approaches that SQL Server can use. A stream aggregate requires the data to be sorted by the group by keys. This can be either supplied by an index or might need an explicit sort. A stream aggregate is order preserving in that the rows output from that operator are in the same order as the input. This does not imply any guarantee about the eventual output from the query as a whole however. You only get that if you add an ORDER BY.

For a GROUP BY col1, col2 the input stream can be ordered by either col1, col2 or col2, col1 to be acceptable for the stream aggregate in this case. The addition of an index might change the decision of the optimiser as to which one to use.

The other basic approach is that of a hash aggregate. In which the grouping keys are hashed. This is not at all order preserving and will likely output rows in seemingly random orders.

Additionally for parallel plans there might be hybrid approaches, e.g. Each thread could have a local stream aggregate with the thread results then aggregated at global level with a hash aggregate.

Adding ORDER BY means that SQL Server will ensure that the rows are delivered in the desired order. If you are currently observing an output order of col1, col2 then likely you are getting a stream aggregate ordered by those two columns. The addition of an explicit order by won't change the execution plan to add any additional sort as SQL Server will recognize that the output of this aggregate is already in the desired order.

Context

StackExchange Database Administrators Q#142872, answer score: 16

Revisions (0)

No revisions yet.