patternsqlMinor
Single query to get only newest entry of grouped redundant entries
Viewed 0 times
entrynewestqueryredundantsinglegetgroupedonlyentries
Problem
A table
There is an index on:
I want a list of the latest files (
This query works perfect, but is very slow. Is there a faster alternative in a single query?
files looks like:id, file_name, vendor_id, created
1, z.txt, 2, 2011
2, z.txt, 2, 2011
3, a.txt, 1, 2016
4, a.txt, 1, 2016
5, a.txt, 1, 2016
6, b.txt, 1, 2015
7, y.txt, 2, 2015
8, x.txt, 2, 2014There is an index on:
file_name, vendor_id and created.I want a list of the latest files (
newest_file) of each vendor_id in a single query.SELECT vendor_id, SUBSTRING_INDEX(GROUP_CONCAT(file_name ORDER BY created DESC), ',', 1) AS newest_file
FROM files GROUP BY vendor_idThis query works perfect, but is very slow. Is there a faster alternative in a single query?
CREATE TABLE `files` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`created` datetime DEFAULT NULL,
`file_name` varchar(255) CHARACTER SET utf8 DEFAULT NULL,
`vendor_id` int(11) NOT NULL DEFAULT '0',
`description` mediumtext CHARACTER SET utf8,
`color` varchar(255) CHARACTER SET utf8 DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `vendor_id` (`vendor_id`),
KEY `file_name` (`file_name`),
KEY `created` (`created`),
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1Solution
The best index for this query would be on
There are various other methods to rewrite this type of queries, without using
One way to rewrite (needs the same index as above). It assumes that
If
(vendor_id, created, file_name), with this specific order on the indexed columns.There are various other methods to rewrite this type of queries, without using
GROUP_CONCAT(). There is even a tag, at SO and this site: greatest-n-per-group. One way to rewrite (needs the same index as above). It assumes that
(vendor_id, created) is UNIQUE (ie. there are no two rows with same vendor and created) or it isn't unique and you want all the duplicate/tied results:SELECT
v.vendor_id,
f.file_name AS newest_file,
f.created -- you can have this in the results as well
FROM
( SELECT vendor_id -- you can replace the "v" subquery with
FROM files -- a single "vendors" table,
GROUP BY vendor_id -- if there is one
) AS v
JOIN files AS f
ON f.vendor_id = v.vendor_id
AND f.created =
( SELECT fi.created
FROM files AS fi
WHERE fi.vendor_id = v.vendor_id
ORDER BY fi.created DESC
LIMIT 1
) ;If
(vendor_id, created) is not UNIQUE and/or you want just one row of the duplicates, change the joining condition to:-- the query as it is
--
JOIN files AS f
ON f.id =
( SELECT fi.id
FROM files AS fi
--
-- the query as it isCode Snippets
SELECT
v.vendor_id,
f.file_name AS newest_file,
f.created -- you can have this in the results as well
FROM
( SELECT vendor_id -- you can replace the "v" subquery with
FROM files -- a single "vendors" table,
GROUP BY vendor_id -- if there is one
) AS v
JOIN files AS f
ON f.vendor_id = v.vendor_id
AND f.created =
( SELECT fi.created
FROM files AS fi
WHERE fi.vendor_id = v.vendor_id
ORDER BY fi.created DESC
LIMIT 1
) ;-- the query as it is
--
JOIN files AS f
ON f.id =
( SELECT fi.id
FROM files AS fi
--
-- the query as it isContext
StackExchange Database Administrators Q#152207, answer score: 2
Revisions (0)
No revisions yet.