debugMinor
How to fix performance issue with row_number and multiple schema?
Viewed 0 times
fixwithissueperformancerow_numberhowandmultipleschema
Problem
After receiving answer to How can I get a valid rank counter?, I adapted to my own system. But now, I have a performance issue. All of my requests are very fast (less than 0.0005s for most of them), but when using
Here is a complete example:
Schema 1, named
This table actual have more than 400 rows.
Schema 2, named
This table actual have more than 2000 rows.
My complete query is like that:
It takes around 0.22s.
Now, let's check part by part:
When running
When running:
It takes around 0.0017s.
With
With
Query analyzed:
How can I fix this performance issue?
Note: the column "coins" here is an example. In reality, more than 60 columns will use this request. So add index for each column like this is not an option for me (too many index, with too different values)
Note 2: Can't do a db fiddle as it's on multiple database and with lot of data, sorry.
ROW_NUMBER() with multiple schemas, it take more than 0.2s.Here is a complete example:
Schema 1, named
sanctions, with a table named bans and composed of:id, auto increment field
uuid, varchar with index
- others content not linked with question
This table actual have more than 400 rows.
Schema 2, named
stats, with a table named players and composed of:id, auto increment field
uuid, varchar with index
coins, double
- others content not linked with question
This table actual have more than 2000 rows.
My complete query is like that:
SELECT
uuid,
(SELECT count(*) FROM sanctions.bans WHERE uuid = p.uuid) as nb,
row_number() OVER (order by coins DESC) counter
FROM stats.players p;
It takes around 0.22s.
Now, let's check part by part:
When running
SELECT count(*) FROM arkbans.litebans_bans WHERE uuid = p.uuid (and by replacing p.uuid by a value), I never go more than 0.0002s.When running:
SELECT
uuid,
row_number() OVER (order by coins DESC) counter
FROM stats.players p;
It takes around 0.0017s.
With
ANALYZE key:With
ANALYZE FORMAT=JSON : hereQuery analyzed:
WITH Bans AS
(
SELECT uuid, COUNT(*) AS BanCount
FROM sanctions.bans
GROUP BY uuid
)
SELECT
p.uuid,
COUNT(b.BanCount) as nb,
row_number() OVER (order by MAX(p.coins) DESC) counter
FROM stats.players p
LEFT JOIN Bans b ON p.uuid = b.uuid
How can I fix this performance issue?
Note: the column "coins" here is an example. In reality, more than 60 columns will use this request. So add index for each column like this is not an option for me (too many index, with too different values)
Note 2: Can't do a db fiddle as it's on multiple database and with lot of data, sorry.
Solution
When running
Yea running it for a single
You should write your query in a more relationally performant way, with an actual
You may find this re-write even more performant, by directly joining and then grouping the results:
As mentioned in the comments, an index on
Your
"attached_condition": "trigcond(stats.p.uuid = convert(b.uuid using utf8mb4))"
This
SELECT count(*) FROM arkbans.litebans_bans WHERE uuid = p.uuid (and by replacing p.uuid by a value), I never go more than 0.0002s.Yea running it for a single
uuid is only 0.0002s but how many rows / uuids are there in your stats.players table? And if you multiply that count by 0.0002s, how long is the total runtime now? That's effectively what you're doing when you have an inlined expression in your SELECT list that filters on a given row of the outer table. (It's not exactly the same thing, but close enough for you to get the idea.)You should write your query in a more relationally performant way, with an actual
JOIN, like so:WITH Bans AS
(
SELECT uuid, COUNT(1) AS BanCount
FROM sanctions.bans
GROUP BY uuid
)
SELECT
p.uuid,
IFNULL(b.BanCount, 0) as nb,
row_number() OVER (order by p.coins DESC) counter
FROM stats.players p
LEFT JOIN Bans b
ON p.uuid = b.uuid;You may find this re-write even more performant, by directly joining and then grouping the results:
SELECT
p.uuid,
COUNT(b.BanCount) as nb,
row_number() OVER (order by MAX(p.coins) DESC) counter
FROM stats.players p
LEFT JOIN sanctions.bans b
ON p.uuid = b.uuid
GROUP BY p.uuid;As mentioned in the comments, an index on
(uuid, coins) on the stats.players table may be better suited for your type of query.Your
ANALYZE FORMAT=JSON is showing that most of the time is spent on sanctions.bans. Specifically this line is interesting:"attached_condition": "trigcond(stats.p.uuid = convert(b.uuid using utf8mb4))"
This
convert(b.uuid using utf8mb4) indicates to me that your uuid column in the sanctions.bans table is a different character set than the one in the stats.players table. This is called implicit conversion, and can cause performance issues. Intuitively I feel like this is your bottleneck. Please verify and make sure that both fields are the same character sets (and collations).Code Snippets
WITH Bans AS
(
SELECT uuid, COUNT(1) AS BanCount
FROM sanctions.bans
GROUP BY uuid
)
SELECT
p.uuid,
IFNULL(b.BanCount, 0) as nb,
row_number() OVER (order by p.coins DESC) counter
FROM stats.players p
LEFT JOIN Bans b
ON p.uuid = b.uuid;SELECT
p.uuid,
COUNT(b.BanCount) as nb,
row_number() OVER (order by MAX(p.coins) DESC) counter
FROM stats.players p
LEFT JOIN sanctions.bans b
ON p.uuid = b.uuid
GROUP BY p.uuid;Context
StackExchange Database Administrators Q#326150, answer score: 2
Revisions (0)
No revisions yet.