patternsqlMinor
Performance issues with groupwise MAX in JOIN clause
Viewed 0 times
issueswithjoinmaxperformanceclausegroupwise
Problem
Problem
I have some assets in my application which are updated from time to time in an asynchronous fashion.
The example I'm gonna use here is
The relevant parts of the table structure are:
Now I have a new requirement in which I must return the latest update information along side the vehicle information itself.
Groupwise MAX Solutions
After digging a little, I've found this blog article. I then tried to use the suggested "uncorrelated subquery" approach, since it's deemed the best one:
Uncorrelated subquery
This query has an average execution time of
I thought that was too slow, so I decided to g
I have some assets in my application which are updated from time to time in an asynchronous fashion.
The example I'm gonna use here is
Vehicles. There are two tables:Vehicles: holds information about the vehicles themselves
VehicleUpdates: holds information about all updates that happened for that vehicle.
The relevant parts of the table structure are:
CREATE TABLE Vehicles (
id varchar(50) NOT NULL,
organizationId varchar(50) NOT NULL,
plate char(7) NOT NULL,
vehicleInfo json DEFAULT NULL,
createdAt timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
updatedAt timestamp NULL DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (id),
UNIQUE KEY unq_Vehicles_orgId_plate_idx (organizationId,plate) USING BTREE,
KEY Vehicles_createdAt_idx (createdAt),
);
CREATE TABLE VehicleUpdates (
id varchar(50) NOT NULL,
organizationId varchar(50) NOT NULL,
vehiclePlate char(7) NOT NULL,
status varchar(15) NOT NULL,
createdAt timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
updatedAt timestamp NULL DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (id),
KEY VehicleUpdates_orgId_vhclPlt_createdAt_idx (organizationId,vehiclePlate,createdAt) USING BTREE
);
Now I have a new requirement in which I must return the latest update information along side the vehicle information itself.
Groupwise MAX Solutions
After digging a little, I've found this blog article. I then tried to use the suggested "uncorrelated subquery" approach, since it's deemed the best one:
Uncorrelated subquery
SELECT vu1.*
FROM VehicleUpdates AS vu1
JOIN
( SELECT vehiclePlate, organizationId, MAX(createdAt) AS createdAt
FROM VehicleUpdates
GROUP BY organizationId, vehiclePlate
) AS vu2 USING (organizationId, vehiclePlate, createdAt);
This query has an average execution time of
275 ms in my production database.I thought that was too slow, so I decided to g
Solution
I feel so stupid! Just found the problem.
For some reason,
That's the reason why the
After converting
Likewise, the "LEFT JOIN" approach execution plan changed from:
To:
So, the performance now for the different queries are:
LEFT JOIN
Always runs bellow
Unc
For some reason,
Vehicles and VehicleUpdates had different charsets (utf8mb4 and utf8, respectively) in production.That's the reason why the
EXPLAIN result for the "uncorrelated subquery" approach had a full table scan in one of its steps:| "select_type" | "table" | "type" | "possible_keys" | "key" | "key_len" | "ref" | "rows" | "filtered" | "Extra"
| PRIMARY | v | ALL | | | | | 14456 | 100 |
| PRIMARY | | ALL | | | | | 29289 | 100 | Using where
| PRIMARY | vu1 | ref | VehicleUpdates_orgId_vhclPlt_createdAt_idx | VehicleUpdates_orgId_vhclPlt_createdAt_idx | "327" | vu2.organizationId,vu2.vehiclePlate,vu2.createdAt | 1 | 100 | Using where
| DERIVED | VehicleUpdates | range | VehicleUpdates_orgId_vhclPlt_createdAt_idx | VehicleUpdates_orgId_vhclPlt_createdAt_idx | "323" | | 29289 | 100 | Using index for group-byAfter converting
VehicleUpdates to utf8mb4, the EXPLAIN result is:| "select_type" | "table" | "type" | "possible_keys" | "key" | "key_len" | "ref" | "rows" | "filtered" | "Extra"
| PRIMARY | v | ref | Vehicles_orgId_status_idx | Vehicles_orgId_status_idx | "202" | const | 188 | 100 |
| PRIMARY | | ref | | | "230" | v.plate,v.organizationId | 10 | 100 |
| PRIMARY | vu1 | ref | VehicleUpdates_orgId_vhclPlt_createdAt_idx | VehicleUpdates_orgId_vhclPlt_createdAt_idx | "234" | v.organizationId,vu2.vehiclePlate,vu2.createdAt | 1 | 100 | Using where
| DERIVED | VehicleUpdates | ref | VehicleUpdates_orgId_vhclPlt_createdAt_idx | VehicleUpdates_orgId_vhclPlt_createdAt_idx | "202" | const | 24090 | 100 | Using where; Using indexLikewise, the "LEFT JOIN" approach execution plan changed from:
| "select_type" | "table" | "type" | "possible_keys" | "key" | "key_len" | "ref" | "rows" | "filtered" | "Extra"
| SIMPLE | v | ref | unq_Vehicles_orgId_plate_idx,Vehicles_orgId_status_idx | unq_Vehicles_orgId_plate_idx | "202" | const | 30 | 100 |
| SIMPLE | vu1 | ALL | | | | | 263171 | 100 | Using where; Using join buffer (Block Nested Loop)
| SIMPLE | vu2 | ref | VehicleUpdates_orgId_vhclPlt_createdAt_idx | VehicleUpdates_orgId_vhclPlt_createdAt_idx | "173" | vu1.organizationId,vu1.vehiclePlate | 10 | 10 | Using where; Not exists; Using indexTo:
| "select_type" | "table" | "type" | "possible_keys" | "key" | "key_len" | "ref" | "rows" | "filtered" | "Extra"
| SIMPLE | v | ref | Vehicles_orgId_status_idx | Vehicles_orgId_status_idx | "202" | const | 188 | 100 |
| SIMPLE | vu1 | ref | VehicleUpdates_orgId_vhclPlt_createdAt_idx | VehicleUpdates_orgId_vhclPlt_createdAt_idx | "230" | v.organizationId,v.plate | 9 | 100 |
| SIMPLE | vu2 | ref | VehicleUpdates_orgId_vhclPlt_createdAt_idx | VehicleUpdates_orgId_vhclPlt_createdAt_idx | "230" | vu1.organizationId,vu1.vehiclePlate | 9 | 10 | Using where; Not exists; Using indexSo, the performance now for the different queries are:
LEFT JOIN
SELECT v., vu1.
FROM Vehicles AS v
LEFT JOIN VehicleUpdates AS vu1
ON v.plate = vu1.vehiclePlate
AND v.organizationId = vu1.organizationId
LEFT JOIN VehicleUpdates AS vu2
ON vu1.organizationId = vu2.organizationId
AND vu1.vehiclePlate = vu2.vehiclePlate
AND vu2.createdAt > vu1.createdAt
where v.organizationId = ''
AND vu2.id IS NULL
LIMIT 100;
Always runs bellow
50 ms.Unc
Code Snippets
| "select_type" | "table" | "type" | "possible_keys" | "key" | "key_len" | "ref" | "rows" | "filtered" | "Extra"
| PRIMARY | v | ALL | | | | | 14456 | 100 |
| PRIMARY | <derived3> | ALL | | | | | 29289 | 100 | Using where
| PRIMARY | vu1 | ref | VehicleUpdates_orgId_vhclPlt_createdAt_idx | VehicleUpdates_orgId_vhclPlt_createdAt_idx | "327" | vu2.organizationId,vu2.vehiclePlate,vu2.createdAt | 1 | 100 | Using where
| DERIVED | VehicleUpdates | range | VehicleUpdates_orgId_vhclPlt_createdAt_idx | VehicleUpdates_orgId_vhclPlt_createdAt_idx | "323" | | 29289 | 100 | Using index for group-by| "select_type" | "table" | "type" | "possible_keys" | "key" | "key_len" | "ref" | "rows" | "filtered" | "Extra"
| PRIMARY | v | ref | Vehicles_orgId_status_idx | Vehicles_orgId_status_idx | "202" | const | 188 | 100 |
| PRIMARY | <derived2> | ref | <auto_key1> | <auto_key1> | "230" | v.plate,v.organizationId | 10 | 100 |
| PRIMARY | vu1 | ref | VehicleUpdates_orgId_vhclPlt_createdAt_idx | VehicleUpdates_orgId_vhclPlt_createdAt_idx | "234" | v.organizationId,vu2.vehiclePlate,vu2.createdAt | 1 | 100 | Using where
| DERIVED | VehicleUpdates | ref | VehicleUpdates_orgId_vhclPlt_createdAt_idx | VehicleUpdates_orgId_vhclPlt_createdAt_idx | "202" | const | 24090 | 100 | Using where; Using index| "select_type" | "table" | "type" | "possible_keys" | "key" | "key_len" | "ref" | "rows" | "filtered" | "Extra"
| SIMPLE | v | ref | unq_Vehicles_orgId_plate_idx,Vehicles_orgId_status_idx | unq_Vehicles_orgId_plate_idx | "202" | const | 30 | 100 |
| SIMPLE | vu1 | ALL | | | | | 263171 | 100 | Using where; Using join buffer (Block Nested Loop)
| SIMPLE | vu2 | ref | VehicleUpdates_orgId_vhclPlt_createdAt_idx | VehicleUpdates_orgId_vhclPlt_createdAt_idx | "173" | vu1.organizationId,vu1.vehiclePlate | 10 | 10 | Using where; Not exists; Using index| "select_type" | "table" | "type" | "possible_keys" | "key" | "key_len" | "ref" | "rows" | "filtered" | "Extra"
| SIMPLE | v | ref | Vehicles_orgId_status_idx | Vehicles_orgId_status_idx | "202" | const | 188 | 100 |
| SIMPLE | vu1 | ref | VehicleUpdates_orgId_vhclPlt_createdAt_idx | VehicleUpdates_orgId_vhclPlt_createdAt_idx | "230" | v.organizationId,v.plate | 9 | 100 |
| SIMPLE | vu2 | ref | VehicleUpdates_orgId_vhclPlt_createdAt_idx | VehicleUpdates_orgId_vhclPlt_createdAt_idx | "230" | vu1.organizationId,vu1.vehiclePlate | 9 | 10 | Using where; Not exists; Using indexContext
StackExchange Database Administrators Q#229643, answer score: 2
Revisions (0)
No revisions yet.