patternsqlMinor
What to do after indexing and partitioning to improve performance
Viewed 0 times
afterwhatindexingimproveperformanceandpartitioning
Problem
I have a large table with about 100 million rows.
I have done indexing and partitioning in the table, but still queries sometimes take >100 or even >200 seconds to run.
So I was thinking, what is the next step after indexing and partitioning to improvise the MySQL performance.
And this question is not only about this table, but about general MySQL practices. Like doing indexing is one of the basic things, and after indexing second thing is partitioning. But what after that? My question can be little vague, but I think this will be helpful to many people searching for similar ansers.
To give you an idea, here is an overview of my table
Following is the output of
```
+-------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------+------------+----------+--------------+-------------+-----------+-------------+----------+-
I have done indexing and partitioning in the table, but still queries sometimes take >100 or even >200 seconds to run.
So I was thinking, what is the next step after indexing and partitioning to improvise the MySQL performance.
- Changing the code logic is one option (but that is reducing the data, I am concerned about how to tune MySQL to work with this data only)
- Upgrading the hardware and MySQL versions in another.(I am already using SSD, 8 core CPU and 32GBs RAM, and resources are available most of the time. )
And this question is not only about this table, but about general MySQL practices. Like doing indexing is one of the basic things, and after indexing second thing is partitioning. But what after that? My question can be little vague, but I think this will be helpful to many people searching for similar ansers.
To give you an idea, here is an overview of my table
large_table :+------+------------+---------------+---------------------+
| user | mobile | is_first_time | time_send |
+------+------------+---------------+---------------------+
| a | xxxxxxxxxx | 0 | 2018-03-12 00:00:00 |
+------+------------+---------------+---------------------+
| b | xxxxxxxxxx | 1 | 2018-04-02 07:08:09 |
+------+------------+---------------+---------------------+
| c | xxxxxxxxxx | 0 | 2018-01-03 01:02:03 |
+------+------------+---------------+---------------------+Following is the output of
SHOW INDEXES FOR large_table```
+-------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------+------------+----------+--------------+-------------+-----------+-------------+----------+-
Solution
-
-
You already have this; keep it:
One way to 'fix' the update is to build a table (
and have
The other solution to the Update is to upgrade to 5.6, then 5.7. Anyway, 5.5 will soon be End-Of-Life'd.
-
To save a little space, and a little speed, consider
-
Get rid of the redundant
PARTITIONing buys you no performance. Get rid of it. (Also, 400 partitions is so many that that, itself, slows things down. More discussion.)-
You already have this; keep it:
PRIMARY KEY(user, mobile)- Change the
UPDATE. Although "row constructors" as inWHERE(a,b) IN ((1,2),...)is tempting syntax, it is awful at performance. The Optimizer does not (until version 5.7.3) do anything useful with that. It scans the entire table! (Note how thek firstUPDATEin thePROCESSLISTis at 91 seconds, and seems to be blocking all the other queries.)
One way to 'fix' the update is to build a table (
tmp) with the IN items, then do a multi-table UPDATE withWHERE large.user = tmp.user
AND large.mobile = tmp.mobileand have
PRIMARY KEY(user, mobile) on tmp.The other solution to the Update is to upgrade to 5.6, then 5.7. Anyway, 5.5 will soon be End-Of-Life'd.
-
To save a little space, and a little speed, consider
DECIMAL(11) for mobile. It would take 5 bytes instead of BIGINT, which takes 8.-
Get rid of the redundant
KEY(user). A rule: If you have INDEX(a,b), you don't need INDEX(a). And PRIMARY KEY is Unique and a key.
-
Don't say count(mobile), simply say COUNT(). When you specify an expression, it checks that for being NULL, which is not relevant in your case. Although SELECT COUNT() FROM lt WHERE user = xx might use INDEX(key), the overhead of that otherwise-unneeded index makes it probably not worth having.
-
Be cautious about MAX(user). 100M is not terribly far from 2 billion. Note, making it INT UNSIGNED would raise the limit to 4 billion, while still taking only 4 bytes.
I hope I have provided you some 'knowledge' as well as multiple solutions toward improving the table's performance.
After all these suggestions, I suspect you will rarely see a Time greater than "1" in PROCESSLIST`.Code Snippets
WHERE large.user = tmp.user
AND large.mobile = tmp.mobileContext
StackExchange Database Administrators Q#208349, answer score: 3
Revisions (0)
No revisions yet.