patternsqlModerate
Are results from UNION ALL clauses always appended in order?
Viewed 0 times
appendedorderallareunionalwaysresultsfromclauses
Problem
As per standard SQL
However, Postgres uses an "Append" step for plain cases of
Obviously this does not hold for
I have based queries on this behavior in the past. Now I have met a claim Postgres might return rows out of order here, but unsubstantiated with actual proof.
The current Postgres manual has this to say on the matter:
of
order in which the rows are actually returned). Furthermore, it
eliminates duplicate rows from its result, in the same way as
This is pretty unclear. Does the cited order apply to the list of
Can anybody show an example, where rows are returned out of order, breaking the sequence of
If not that, is there reason to believe this might change in a future version?
UNION / UNION ALL do not guarantee any particular sort order without an outer ORDER BY clause - like there is hardly any place in SQL where sort order is guaranteed without ORDER BY.However, Postgres uses an "Append" step for plain cases of
UNION ALL, hence results from the first leg (even if unsorted among their partition) always come before the next leg, etc. Postgres simply appends results from each leg in the given order. This is particularly relevant with a LIMIT clause:SELECT 1 FROM tbl -- or any complex query
UNION ALL
SELECT 2
LIMIT 1Obviously this does not hold for
UNION (without ALL). But other than that, I have never seen Postgres return out of order, i.e. '2' from the above query, while the first SELECT would also return row(s). Not even if the first leg is extremely expensive.I have based queries on this behavior in the past. Now I have met a claim Postgres might return rows out of order here, but unsubstantiated with actual proof.
The current Postgres manual has this to say on the matter:
UNION effectively appends the result of query2 to the resultof
query1 (although there is no guarantee that this is theorder in which the rows are actually returned). Furthermore, it
eliminates duplicate rows from its result, in the same way as
DISTINCT, unless UNION ALL is used.This is pretty unclear. Does the cited order apply to the list of
SELECT clauses, or rows within each clause, or just the returned set? Also, UNION ALL is only mentioned in the second sentence, so it remains unclear if the all-important first sentence is supposed to apply to UNION ALL ...Can anybody show an example, where rows are returned out of order, breaking the sequence of
UNION ALL clauses? In any version of Postgres. (Even if the latest version would be most interesting.)If not that, is there reason to believe this might change in a future version?
ORDER BY is not the immediate issue here. The questionSolution
There was a similar question recently in pgsql-docs mailing list,
Clarify the ordering guarantees in combining queries (or lack thereof):
I was trying to understand what - if any - are the guarantees with regards
to ordering for combining queries (UNION/UNION ALL/...). From this
message1, it seems that UNION ALL does preserve the ordering of the
operand queries, whereas UNION does not (presumably neither do INTERSECT,
INTERSECT ALL, EXCEPT and EXCEPT ALL).
The documentation[2] makes no mention of this, I'd suggest adding a note
clarifying this.
Tom Lane (and others) replied:
Since the documentation doesn't make a guarantee there is none. If you
want ordered output use ORDER BY.
No, there is no guarantee. It's just that UNION ALL works this way today
(preserving the order of the subselects) - and I'm not even sure about
that, it may not preserve the order in all cases, with different indexes or
partitioning or a parallel plan, etc.
In any case, there is no guarantee that the behaviour will not change in
the future due to planner improvements.
Yeah, that. You can get a parallelized plan today for UNION ALL:
It's true that in simple non-parallelized cases we'll do the first query then the second, but SQL doesn't promise that to be true and neither does Postgres.
Clarify the ordering guarantees in combining queries (or lack thereof):
I was trying to understand what - if any - are the guarantees with regards
to ordering for combining queries (UNION/UNION ALL/...). From this
message1, it seems that UNION ALL does preserve the ordering of the
operand queries, whereas UNION does not (presumably neither do INTERSECT,
INTERSECT ALL, EXCEPT and EXCEPT ALL).
The documentation[2] makes no mention of this, I'd suggest adding a note
clarifying this.
Tom Lane (and others) replied:
Since the documentation doesn't make a guarantee there is none. If you
want ordered output use ORDER BY.
No, there is no guarantee. It's just that UNION ALL works this way today
(preserving the order of the subselects) - and I'm not even sure about
that, it may not preserve the order in all cases, with different indexes or
partitioning or a parallel plan, etc.
In any case, there is no guarantee that the behaviour will not change in
the future due to planner improvements.
Yeah, that. You can get a parallelized plan today for UNION ALL:
=# explain analyze select * from foo union all select * from foo;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=0.00..208552.05 rows=5120008 width=244) (actual time=0.652..390.135 rows=5120000 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Append (cost=0.00..208552.05 rows=2133336 width=244) (actual time=0.021..228.848 rows=1706667 loops=3)
-> Parallel Seq Scan on foo (cost=0.00..98942.68 rows=1066668 width=244) (actual time=0.453..78.084 rows=853333 loops=3)
-> Parallel Seq Scan on foo foo_1 (cost=0.00..98942.68 rows=1066668 width=244) (actual time=0.024..125.299 rows=1280000 loops=2)
Planning Time: 0.094 ms
Execution Time: 488.352 msIt's true that in simple non-parallelized cases we'll do the first query then the second, but SQL doesn't promise that to be true and neither does Postgres.
Code Snippets
=# explain analyze select * from foo union all select * from foo;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=0.00..208552.05 rows=5120008 width=244) (actual time=0.652..390.135 rows=5120000 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Append (cost=0.00..208552.05 rows=2133336 width=244) (actual time=0.021..228.848 rows=1706667 loops=3)
-> Parallel Seq Scan on foo (cost=0.00..98942.68 rows=1066668 width=244) (actual time=0.453..78.084 rows=853333 loops=3)
-> Parallel Seq Scan on foo foo_1 (cost=0.00..98942.68 rows=1066668 width=244) (actual time=0.024..125.299 rows=1280000 loops=2)
Planning Time: 0.094 ms
Execution Time: 488.352 msContext
StackExchange Database Administrators Q#316818, answer score: 15
Revisions (0)
No revisions yet.