patternsqlMinor
Count rows with max per group and additional condition
Viewed 0 times
rowsconditionpergroupwithmaxandcountadditional
Problem
I have the following table:
horse_main_id | race_id | horse_name | rating100 | race_result
---------------+---------+-------------------+-----------+-------------
23 | 159 | Angelology | 126 | 1
24 | 159 | Arwoc | 118 |
25 | 159 | Marlo Man | 118 |
26 | 159 | Hucknall | 113 |
27 | 159 | It's Electrifying | 110 | 2
28 | 159 | Monte Carlo | 107 |
29 | 159 | Cross Constance | 103 |
30 | 160 | Auld Burns | 119 |
31 | 160 | Diamond Jim | 117 |
33 | 160 | Livery | 115 |
34 | 160 | Paraggi | 115 |
35 | 160 | I'll'ava'alf | 110 | 2
36 | 160 | Jacks 'n' Kings | 108 |
32 | 160 | Fastnet Isle | 115 | 1
I would like to operate on groups of
So this example should return only 1, as for race_id 159 group the max rating100 of 126 also has race_result = 1, but for race_id 160 group, the max rating100 is 119, but does not have race_result = 1.
This is the closest I've got (was told I should use window functions):
SELECT * FROM (
SELECT horse_name, race_id, race_result
, max(rating100) OVER (partition by race_id) AS max_rating
FROM horse_main) t
WHERE race_result = 1;
horse_main_id | race_id | horse_name | rating100 | race_result
---------------+---------+-------------------+-----------+-------------
23 | 159 | Angelology | 126 | 1
24 | 159 | Arwoc | 118 |
25 | 159 | Marlo Man | 118 |
26 | 159 | Hucknall | 113 |
27 | 159 | It's Electrifying | 110 | 2
28 | 159 | Monte Carlo | 107 |
29 | 159 | Cross Constance | 103 |
30 | 160 | Auld Burns | 119 |
31 | 160 | Diamond Jim | 117 |
33 | 160 | Livery | 115 |
34 | 160 | Paraggi | 115 |
35 | 160 | I'll'ava'alf | 110 | 2
36 | 160 | Jacks 'n' Kings | 108 |
32 | 160 | Fastnet Isle | 115 | 1
I would like to operate on groups of
race_id, find all groups that have the max(rating100) for their group and race_result = 1, and count all the times this happens.So this example should return only 1, as for race_id 159 group the max rating100 of 126 also has race_result = 1, but for race_id 160 group, the max rating100 is 119, but does not have race_result = 1.
This is the closest I've got (was told I should use window functions):
SELECT * FROM (
SELECT horse_name, race_id, race_result
, max(rating100) OVER (partition by race_id) AS max_rating
FROM horse_main) t
WHERE race_result = 1;
Solution
Since you need a subquery in either case, I would use a plain aggregate in the subquery (may be cheaper):
If there are many rows per
Or use the window function
An index on
Assuming
SELECT count(*)
FROM (SELECT race_id, max(rating100) AS rating100
FROM horse_main
GROUP BY 1) x
JOIN horse_main h USING (race_id, rating100)
WHERE h.race_result = 1;If there are many rows per
race_id, it will be faster to get group-wise maxima with one of these techniques:- Optimize GROUP BY query to retrieve latest record per user
Or use the window function
rank() - that's what you count effectively: all greatest rating100 per race_id with race_result = 1:SELECT count(*)
FROM (SELECT race_result, rank() OVER (PARTITION BY race_id ORDER BY rating100 DESC) AS rnk
FROM horse_main) x
WHERE rnk = 1
AND race_result = 1;An index on
(race_id, rating100 DESC, race_result DESC) should help performance with big tables.Assuming
rating100 is defined NOT NULL, else you need to add NULLS LAST in query and index.Code Snippets
SELECT count(*)
FROM (SELECT race_id, max(rating100) AS rating100
FROM horse_main
GROUP BY 1) x
JOIN horse_main h USING (race_id, rating100)
WHERE h.race_result = 1;SELECT count(*)
FROM (SELECT race_result, rank() OVER (PARTITION BY race_id ORDER BY rating100 DESC) AS rnk
FROM horse_main) x
WHERE rnk = 1
AND race_result = 1;Context
StackExchange Database Administrators Q#139115, answer score: 4
Revisions (0)
No revisions yet.