snippetsqlMinor
How to combine ORDER BY and LIMIT with an aggregate function?
Viewed 0 times
ordercombinewithlimitfunctionhowandaggregate
Problem
Here is a fiddle for my question.
I have a simple table layout:
I want to select all classes, and for each class, I want the first two person identifiers of the belonging persons sorted by descending name.
I solved this with the following query:
Note: I could have used a correlation subquery in the
As you can see, I am applying
Is there a way to avoid that? My train of thought:
-
First, obviously I cannot remove the
-
Second, I think that the
Should I rewrite the query?
I have a simple table layout:
class
person: belongs to a classI want to select all classes, and for each class, I want the first two person identifiers of the belonging persons sorted by descending name.
I solved this with the following query:
select c.identifier, array_agg(p.identifier order by p.name desc) as persons
from class as c
left join lateral (
select p.identifier, p.name
from person as p
where p.class_identifier = c.identifier
order by p.name desc
limit 2
) as p
on true
group by c.identifier
order by c.identifierNote: I could have used a correlation subquery in the
SELECT clause, but I am trying to avoid that as part of a learning process.As you can see, I am applying
order by p.name desc in two places:- in the subquery
- in the aggregate function
Is there a way to avoid that? My train of thought:
-
First, obviously I cannot remove the
order by in the subquery, as that would give a query which does not meet my requirement as stated above.-
Second, I think that the
order by in the aggregate function cannot be left out, as row order of the subquery is not necessarily preserved in the aggregate function?Should I rewrite the query?
Solution
I am applying
Yes. Aggregate with an ARRAY constructor in the lateral subquery directly:
You also don't need
I replaced the
db<>fiddle here.
Related:
Order of rows in subqueries
To address your comment:
I learned that order of rows in a subquery is never guaranteed to be preserved in the outer query.
Well, no. While the SQL standard does not offer any guarantees, there are limited guarantees in Postgres. The manual:
This ordering is unspecified by default, but can be controlled by
writing an
Section 4.2.7. Alternatively, supplying the input values from a sorted
subquery will usually work. For example:
Beware that this approach can fail if the outer query level contains
additional processing, such as a join, because that might cause the
subquery's output to be reordered before the aggregate is computed.
If all you do in the next level is to aggregate rows, the order is positively guaranteed. Any yes, what we feed to the ARRAY constructor is a subquery, too. That's not the point. It would work with
But I expect the ARRAY constructor to be faster for the case. See:
order by p.name desc in two places ... Is there a way to avoid that?Yes. Aggregate with an ARRAY constructor in the lateral subquery directly:
SELECT c.identifier, p.persons
FROM class c
CROSS JOIN LATERAL (
SELECT ARRAY (
SELECT identifier
FROM person
WHERE class_identifier = c.identifier
ORDER BY name DESC
LIMIT 2
) AS persons
) p
ORDER BY c.identifier;You also don't need
GROUP BY in the outer SELECT this way. Shorter, cleaner, faster.I replaced the
LEFT JOIN with a plain CROSS JOIN since the ARRAY constructor always returns exactly 1 row. (Like you pointed out in a comment.)db<>fiddle here.
Related:
- Preserve order of array elements after join
Order of rows in subqueries
To address your comment:
I learned that order of rows in a subquery is never guaranteed to be preserved in the outer query.
Well, no. While the SQL standard does not offer any guarantees, there are limited guarantees in Postgres. The manual:
This ordering is unspecified by default, but can be controlled by
writing an
ORDER BY clause within the aggregate call, as shown inSection 4.2.7. Alternatively, supplying the input values from a sorted
subquery will usually work. For example:
SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;Beware that this approach can fail if the outer query level contains
additional processing, such as a join, because that might cause the
subquery's output to be reordered before the aggregate is computed.
If all you do in the next level is to aggregate rows, the order is positively guaranteed. Any yes, what we feed to the ARRAY constructor is a subquery, too. That's not the point. It would work with
array_agg() as well:SELECT c.identifier, p.persons
FROM class c
CROSS JOIN LATERAL (
SELECT array_agg(identifier) AS persons
FROM (
SELECT identifier
FROM person
WHERE class_identifier = c.identifier
ORDER BY name DESC
LIMIT 2
) sub
) p
ORDER BY c.identifier;But I expect the ARRAY constructor to be faster for the case. See:
- Why is array_agg() slower than the non-aggregate ARRAY() constructor?
- convert right side of join of many to many into array
Code Snippets
SELECT c.identifier, p.persons
FROM class c
CROSS JOIN LATERAL (
SELECT ARRAY (
SELECT identifier
FROM person
WHERE class_identifier = c.identifier
ORDER BY name DESC
LIMIT 2
) AS persons
) p
ORDER BY c.identifier;SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;SELECT c.identifier, p.persons
FROM class c
CROSS JOIN LATERAL (
SELECT array_agg(identifier) AS persons
FROM (
SELECT identifier
FROM person
WHERE class_identifier = c.identifier
ORDER BY name DESC
LIMIT 2
) sub
) p
ORDER BY c.identifier;Context
StackExchange Database Administrators Q#213592, answer score: 8
Revisions (0)
No revisions yet.