Queries hung for no specific reason
Problem
I've been having a strange issue for the past few days. I run data updates at night that shouldn't take more than a few hours.
Several weeks ago a single script took two days to complete. I ran it again when the server wasn't processing anything and it took minutes.
I figured this is some kind of race condition, or something triggered when the server load is high.
It does reproduce every night, but not at the same step. I update a lot of databases that are all processed the same way. It doesn't hang on the same database every night, but it always hangs on the same specific query.
Here's the specific query:
create temporary table trip_types (
    id int unsigned not null auto_increment primary key,
    stops_taken varchar(32) not null default '',
    index trip_type_idx (stops_taken)
)
select (
    select md5(group_concat(st.stop_iid))
    from stop_times st
    where st.trip_iid = t.id
    group by st.trip_iid
    order by st.stop_sequence
) as stops_taken
from trips t
group by stops_taken;

update trips t
set trip_type = (
    select id
    from trip_types
    where stops_taken = (
        select md5(group_concat(stop_iid))
        from stop_times
        where trip_iid = t.id
        group by trip_iid
        order by stop_sequence asc
    )
    group by stops_taken
);
Basically, a trip is a collection of stops (plus times of departure, hence stop_times), ordered by stop_times.stop_sequence. I want to compare identical trips regardless of the time of departure, so I'm comparing the ordered list of stops taken.
My goal is to have an INT in the trips table that indicates the trip type (the list of stops taken), so that I can differentiate trips using just the INT column:
- take one trip
- compute the list of stops taken
- save it as an md5 in a temp table
- once all trips are computed, put the ID from the temp table back in the trips table
So my script does this in two steps: compute the md5 and store it in a temp table, then fill it back into trips.
I had an op
Solution
I suspect that both the md5 function and the group_concat function are very slow. So, you should probably avoid calculating them twice. Unfortunately this work-around involves an additional temporary table.
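One thing worth checking when hashing group_concat output: MySQL truncates the result of GROUP_CONCAT to group_concat_max_len bytes (1024 by default), so two long trips whose stop lists differ only after the cut-off would get the same md5. A minimal sketch for the session that runs the nightly script; the value 1000000 is an arbitrary assumption, not a recommendation:

```sql
-- Show the current limit for this session (the MySQL default is 1024 bytes).
SHOW VARIABLES LIKE 'group_concat_max_len';

-- Raise it for this session only, before computing the md5 values.
-- 1000000 is an assumed value; size it to the longest stop list you expect.
SET SESSION group_concat_max_len = 1000000;
```

Setting it at session level keeps the change local to the update script.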
I assume you'll be computing this either periodically or from a trigger whenever there's a "real" update to trips, and I suspect that maintaining trip_type_idx is causing stalls while your script runs, so you may want to drop the key entirely before the update. That way it won't be recomputed halfway through your updates. By the same token, you'll also want to drop any foreign keys that point to this field.
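To see which foreign keys would be affected, something like the following sketch could be run first. It assumes the column is trips.trip_type in the current database; child_table and fk_name in the trailing comment are hypothetical placeholders, not names from your schema:

```sql
-- List foreign keys that reference trips.trip_type, plus any foreign key
-- defined on trips.trip_type itself (which needs an index on that column).
SELECT kcu.TABLE_NAME,
       kcu.COLUMN_NAME,
       kcu.CONSTRAINT_NAME,
       kcu.REFERENCED_TABLE_NAME,
       kcu.REFERENCED_COLUMN_NAME
FROM information_schema.KEY_COLUMN_USAGE AS kcu
WHERE kcu.REFERENCED_TABLE_NAME IS NOT NULL
  AND (
        (kcu.REFERENCED_TABLE_SCHEMA = DATABASE()
         AND kcu.REFERENCED_TABLE_NAME = 'trips'
         AND kcu.REFERENCED_COLUMN_NAME = 'trip_type')
     OR (kcu.TABLE_SCHEMA = DATABASE()
         AND kcu.TABLE_NAME = 'trips'
         AND kcu.COLUMN_NAME = 'trip_type')
      );

-- Each constraint found would then be dropped with a statement of this form
-- (hypothetical names):
--   ALTER TABLE child_table DROP FOREIGN KEY fk_name;
```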
-- Drop the index so it is not maintained during the update.
ALTER TABLE trips DROP KEY trip_type_idx;

-- One row per trip: the md5 of its ordered stop list, computed only once.
-- Note: DISTINCT collapses repeated visits to the same stop within a trip.
CREATE TEMPORARY TABLE trans_trip_types (
    trip_id int unsigned not null primary key,
    md5_stops_taken varchar(32) not null,
    INDEX idx_md5 (md5_stops_taken)
)
SELECT st.trip_iid AS trip_id,
       md5(Group_Concat(DISTINCT st.stop_iid ORDER BY st.stop_sequence)) AS md5_stops_taken
FROM stop_times AS st
JOIN trips t
  ON st.trip_iid = t.id
GROUP BY st.trip_iid;

-- One row per distinct stop list; the generated id is the trip type.
create temporary table trip_types (
    id int unsigned not null auto_increment primary key, -- back to auto_increment
    stops_taken varchar(32) not null default '',
    INDEX trip_type_idx (stops_taken)
)
SELECT DISTINCT md5_stops_taken AS stops_taken
FROM trans_trip_types ORDER BY md5_stops_taken;

-- Look up each trip's type through its precomputed md5.
UPDATE trips AS t
SET t.trip_type =
    (SELECT tt.id
     FROM trans_trip_types AS ttt
     JOIN trip_types AS tt
       ON tt.stops_taken = ttt.md5_stops_taken
     WHERE ttt.trip_id = t.id
     LIMIT 1
    );

DROP TEMPORARY TABLE trans_trip_types;
ALTER TABLE trips ADD KEY trip_type_idx (trip_type);
As jkavalik points out, you cannot simply ENABLE and DISABLE keys (e.g. ALTER TABLE trips DISABLE KEYS) for InnoDB tables in MySQL, but index re-creation should be faster than updating indexes one at a time.
Misquoting the MySQL manual ever so slightly:
[InnoDB] does this with a special algorithm [fast index creation as of MySQL version 5.7] that is much faster than inserting keys one by one. Using ALTER TABLE ... [ADD KEY] requires the INDEX privilege in addition to the privileges mentioned earlier.
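As a possible alternative to the correlated subquery in the UPDATE step above, MySQL's multi-table UPDATE syntax can express the same lookup as a join. This is only a sketch under the same assumptions as the script (trans_trip_types keyed by trip_id with an md5_stops_taken column, trip_types keyed by stops_taken); it would replace that UPDATE and has to run before trans_trip_types is dropped:

```sql
-- Join each trip to its precomputed md5, then to the type id for that md5.
-- Each temporary table is referenced only once, which MySQL requires.
UPDATE trips AS t
JOIN trans_trip_types AS ttt
  ON ttt.trip_id = t.id
JOIN trip_types AS tt
  ON tt.stops_taken = ttt.md5_stops_taken
SET t.trip_type = tt.id;
```

One behavioral difference: trips with no rows in stop_times are left untouched here, whereas the subquery form sets their trip_type to NULL.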
Context
StackExchange Database Administrators Q#117802, answer score: 2