patternMinor
MySQL - maximum of sum over different months with ties over multiple years
Problem
This question was inspired by this one [closed] and is virtually identical to this one, but using a different RDBMS (PostgreSQL vs. MySQL).
Suppose I have a list of tumours (this data is simulated from real data):
CREATE table illness (nature_of_illness VARCHAR(25), created_at DATETIME);
INSERT INTO illness VALUES ('Cervix', '2018-01-03 15:45:40');
INSERT INTO illness VALUES ('Cervix', '2018-01-03 15:45:40');
INSERT INTO illness VALUES ('Cervix', '2018-01-03 15:45:40');
INSERT INTO illness VALUES ('Cervix', '2018-01-03 15:45:40');
INSERT INTO illness VALUES ('Cervix', '2018-01-03 15:45:40');
INSERT INTO illness VALUES ('Lung', '2018-01-03 17:50:32');
INSERT INTO illness VALUES ('Lung', '2018-02-03 17:50:32');
INSERT INTO illness VALUES ('Lung', '2018-02-03 17:50:32');
INSERT INTO illness VALUES ('Lung', '2018-02-03 17:50:32');
INSERT INTO illness VALUES ('Cervix', '2018-02-03 17:50:32');
-- 2017, with 1 Cervix and Lung each for the month of Jan - tie!
INSERT INTO illness VALUES ('Cervix', '2017-01-03 15:45:40');
INSERT INTO illness VALUES ('Lung', '2017-01-03 17:50:32');
INSERT INTO illness VALUES ('Lung', '2017-02-03 17:50:32');
INSERT INTO illness VALUES ('Lung', '2017-02-03 17:50:32');
INSERT INTO illness VALUES ('Lung', '2017-02-03 17:50:32');
INSERT INTO illness VALUES ('Cervix', '2017-02-03 17:50:32');
I want to find out which particular tumour was most common in each month of each year. So far, so good!
Note, however, that for month 1 of 2017 there is a tie. Picking one of the tied tumours arbitrarily would make no sense, so all ties have to be included in the result, which makes the problem much more challenging.
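To make the tie visible, it can help to list the per-month counts first. This is only a minimal sketch against the illness table defined above; the aliases c_year, c_month and month_count are illustrative names, not part of the original schema.

```sql
-- Count each tumour type per calendar month (one row per year/month/type).
SELECT
  EXTRACT(YEAR FROM created_at)  AS c_year,
  EXTRACT(MONTH FROM created_at) AS c_month,
  nature_of_illness,
  COUNT(*)                       AS month_count
FROM illness
GROUP BY c_year, c_month, nature_of_illness
ORDER BY c_year, c_month, nature_of_illness;
```

On the sample data, the two rows for 2017, month 1 (Cervix and Lung) both have a month_count of 1, which is the tie that the final result has to preserve.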
The correct answer is:
Year  Month  Tumour count  Type
2017  1      1             Cervix  -- note tie
2017  1      1             Lung    -- " "
2017  2      3             Lung
2018  1      5             Cervix
2018  2      3             Lung
A further bonus would b
Solution
My attempt to solve this is as follows. I would appreciate any advice on how this query could be improved:
SELECT
  t3.c_year  AS "Year",
  t3.c_month AS "Month",
  t3.il_mc   AS "Tumour count",
  t4.ill_nat AS "Type"
FROM
(
  SELECT c_year, c_month, il_mc FROM
  (
    SELECT
      c_year,
      c_month,
      MAX(month_count) AS il_mc
    FROM
    (
      SELECT nature_of_illness AS illness,
        EXTRACT(YEAR FROM created_at) AS c_year,
        EXTRACT(MONTH FROM created_at) AS c_month,
        COUNT(EXTRACT(MONTH FROM created_at)) AS month_count
      FROM illness
      GROUP BY illness, c_year, c_month
      ORDER BY c_year, c_month
    ) AS t1
    GROUP BY c_year, c_month
  ) AS t2
) AS t3
JOIN
(
  SELECT
    EXTRACT(YEAR FROM created_at) AS t_year,
    EXTRACT(MONTH FROM created_at) AS t_month,
    nature_of_illness AS ill_nat,
    COUNT(nature_of_illness) AS ill_cnt
  FROM illness
  GROUP BY t_year, t_month, nature_of_illness
  ORDER BY t_year, t_month, nature_of_illness
) AS t4
ON t3.c_year = t4.t_year
AND t3.c_month = t4.t_month
AND t3.il_mc = t4.ill_cnt;
And it does give the correct result, as can be seen in the fiddle here!
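One possible simplification, offered as a sketch rather than as the original author's method, is to rank the monthly counts with a window function. This assumes MySQL 8.0 or later (earlier versions have neither CTEs nor window functions); the names monthly, ranked and rnk are illustrative. RANK() gives tied counts the same rank, so filtering on rank 1 keeps every tied tumour.

```sql
WITH monthly AS (
  -- One row per year/month/tumour type with its count.
  SELECT
    EXTRACT(YEAR FROM created_at)  AS c_year,
    EXTRACT(MONTH FROM created_at) AS c_month,
    nature_of_illness,
    COUNT(*)                       AS month_count
  FROM illness
  GROUP BY c_year, c_month, nature_of_illness
),
ranked AS (
  -- Rank the types within each month; equal counts share the same rank.
  SELECT
    c_year,
    c_month,
    nature_of_illness,
    month_count,
    RANK() OVER (PARTITION BY c_year, c_month
                 ORDER BY month_count DESC) AS rnk
  FROM monthly
)
SELECT
  c_year            AS "Year",
  c_month           AS "Month",
  month_count       AS "Tumour count",
  nature_of_illness AS "Type"
FROM ranked
WHERE rnk = 1
ORDER BY c_year, c_month, nature_of_illness;
```

This reads the table once instead of twice and avoids the self-join on the maximum count. Whether it is actually faster would depend on the data volume and available indexes, so it is worth comparing both plans with EXPLAIN.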
Context
StackExchange Database Administrators Q#206030, answer score: 5