patternsqlMinor
Mysql 5.5 InnoDB growing CPU - Going crazy
Viewed 0 times
goinginnodbmysqlgrowingcrazycpu
Problem
Good morning,
I'm currently facing a problem where CPU usage keeps going up until the server crash.
I've tried multiple best practices proposed in this forum and on Percona, reinstalled the server twice already. Nothing seems to work.
The application is a small script that perform SELECT on range indexes, UPDATE on primary key and bulk insert on a UNIQUE keys in a very intensive way.
99.9% of the queries are running on a single InnoDB tables.
Following is the mysqltuner.pl output (which did not help me at all):
My.cnf file is optimised to the best of my knowledge and what I've seen being used or adviced by pros:
```
# general tuning
# family of join_buffer_size ,sort_buffer_size, read_buffer_size, read_rnd_buffer_size
sort_buffer_size = 4M
read_rnd_buffer_size = 512K
table_cache = 1024
key_buffer_size = 128M
#show global status like 'Max_used_connections'
max_connections = 100
#determine how many thread MySQL keep in memory to handle new conn
#100 - ((Threads_created/Connections)*100)
thread_cache_size = 50
#max size of what is being cached see SHOW GLOBAL STATUS LIKE 'Qc%'
#query_cache_limit = 2M
query_cache_size = 0
query_cache_type = 0
#innodb
I'm currently facing a problem where CPU usage keeps going up until the server crash.
I've tried multiple best practices proposed in this forum and on Percona, reinstalled the server twice already. Nothing seems to work.
The application is a small script that perform SELECT on range indexes, UPDATE on primary key and bulk insert on a UNIQUE keys in a very intensive way.
99.9% of the queries are running on a single InnoDB tables.
Following is the mysqltuner.pl output (which did not help me at all):
[--] Reads / Writes: 12% / 88%
[--] Total buffers: 4.2G global + 5.0M per thread (100 max threads)
[OK] Maximum possible memory usage: 4.7G (59% of installed RAM)
[OK] Slow queries: 0% (1/1M)
[OK] Highest usage of available connections: 42% (42/100)
[OK] Key buffer size / total MyISAM indexes: 128.0M/29.3M
[!!] Query cache is disabled
[OK] Sorts requiring temporary tables: 0% (0 temp sorts / 2 sorts)
[OK] Temporary tables created on disk: 19% (63 on disk / 327 total)
[OK] Thread cache hit rate: 58% (42 created / 102 connections)
[OK] Table cache hit rate: 38% (126 open / 325 opened)
[OK] Open file limit used: 3% (68/2K)
[OK] Table locks acquired immediately: 96% (1M immediate / 1M locks)
[OK] InnoDB buffer pool / data size: 4.0G/358.1M
[OK] InnoDB log waits: 0My.cnf file is optimised to the best of my knowledge and what I've seen being used or adviced by pros:
```
# general tuning
# family of join_buffer_size ,sort_buffer_size, read_buffer_size, read_rnd_buffer_size
sort_buffer_size = 4M
read_rnd_buffer_size = 512K
table_cache = 1024
key_buffer_size = 128M
#show global status like 'Max_used_connections'
max_connections = 100
#determine how many thread MySQL keep in memory to handle new conn
#100 - ((Threads_created/Connections)*100)
thread_cache_size = 50
#max size of what is being cached see SHOW GLOBAL STATUS LIKE 'Qc%'
#query_cache_limit = 2M
query_cache_size = 0
query_cache_type = 0
#innodb
Solution
I'm going to explain how I solved the performance problem that I was fighting with for weeks. Feel free to criticize, the result can only be better for people in need.
First I read the book "High Performance MySQL". It is a great book because they expose the ways of profiling right from the start. Basically when you are done reading the first 2-3 chapters, you have enough tools to start digging. I wasn't aware of most of the profiling methods and I learned a lot.
I decided to start digging from the query_log by setting
Then i picked the query that was causing trouble. In fact there was only one choice possible since all the queries were indexed, but this particular one was working on a large table. Went through everything here.
The target table was:
The suspicious query was:
There was a BTREE index on (updated,domain_id,stop_visit) so the explain plan gave nothing out.
Then I tried querying the profiling like this:
Nothing wrong was observed and the query was executing with timings under 0.001. No temporary tables, indexes are used... Everything looks OK but obviously there is a underlying problem that I knew it was there.
Now I thought why is Handler_read_next so high when the index was tailored to be selective (or i thought so). Well it wasn't. And more importantly I still don't have an answer to this question. After a bunch of random tries I found out that the ENUM column (stop_visit) was causing this very high Handler_read_next value. Changing the datatype to tinyint [0,1] or char(1) ['Y','N'] rewarded me with nothing.
At this point, I decided to break everything and re-start from scratch with the knowledge acquired. This resulted in the following changes to the initial table:
```
+-----------+---------------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+---------------------+------+-----+-------------------+----------------+
| id | bigint(20) unsigned | NO | PRI | NULL | auto_increment |
| url_crc | int(10) unsigned | NO | MUL | NULL | |
| updated | timestamp | NO | MUL | CURRENT_TIMESTAMP | |
| domain_id | tinyint(3) unsigned | NO | | NULL | |
| owner | int(10) unsigned | NO | MUL | 0 | |
| url | varch
First I read the book "High Performance MySQL". It is a great book because they expose the ways of profiling right from the start. Basically when you are done reading the first 2-3 chapters, you have enough tools to start digging. I wasn't aware of most of the profiling methods and I learned a lot.
I decided to start digging from the query_log by setting
query_log_time=0.Then i picked the query that was causing trouble. In fact there was only one choice possible since all the queries were indexed, but this particular one was working on a large table. Went through everything here.
The target table was:
+------------+---------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+---------------+------+-----+-------------------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| domain_id | int(11) | NO | MUL | NULL | |
| url | varchar(250) | NO | UNI | 0 | |
| visited | enum('Y','N') | NO | | N | |
| updated | timestamp | NO | | CURRENT_TIMESTAMP | |
| hits | smallint(6) | NO | | 0 | |
| stop_visit | enum('Y','N') | NO | | N | |
+------------+---------------+------+-----+-------------------+----------------+The suspicious query was:
SELECT id FROM tracker WHERE updated<=NOW() and domain_id=2 and stop_visit='N' LIMIT 1;There was a BTREE index on (updated,domain_id,stop_visit) so the explain plan gave nothing out.
Then I tried querying the profiling like this:
SET profiling=1;
SHOW profiles;
SET @query_id=1
SELECT STATE, SUM(DURATION) AS Total_R,
ROUND(
100 * SUM(DURATION) /
(SELECT SUM(DURATION)
FROM INFORMATION_SCHEMA.PROFILING
WHERE QUERY_ID = @query_id
), 2) AS Pct_R,
COUNT(*) AS Calls,
SUM(DURATION) / COUNT(*) AS "R/Call"
FROM INFORMATION_SCHEMA.PROFILING
WHERE QUERY_ID = @query_id
GROUP BY STATE
ORDER BY Total_R DESC;Nothing wrong was observed and the query was executing with timings under 0.001. No temporary tables, indexes are used... Everything looks OK but obviously there is a underlying problem that I knew it was there.
SHOW STATUS and FLUSH STATUS showed something really weird:FLUSH status;
SELECT id FROM tracker WHERE updated<=NOW() and domain_id=2 and stop_visit='N' LIMIT 1;
show status where variable_name like 'Created%' or variable_name like 'Handler%';
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| Created_tmp_disk_tables | 0 |
| Created_tmp_files | 6 |
| Created_tmp_tables | 0 |
| Handler_commit | 0 |
| Handler_delete | 0 |
| Handler_discover | 0 |
| Handler_prepare | 0 |
| Handler_read_first | 0 |
| Handler_read_key | 0 |
| Handler_read_last | 0 |
| Handler_read_next | 44687 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_next | 0 |
| Handler_rollback | 0 |
| Handler_savepoint | 0 |
| Handler_savepoint_rollback | 0 |
| Handler_update | 0 |
| Handler_write | 0 |
+----------------------------+-------+Now I thought why is Handler_read_next so high when the index was tailored to be selective (or i thought so). Well it wasn't. And more importantly I still don't have an answer to this question. After a bunch of random tries I found out that the ENUM column (stop_visit) was causing this very high Handler_read_next value. Changing the datatype to tinyint [0,1] or char(1) ['Y','N'] rewarded me with nothing.
At this point, I decided to break everything and re-start from scratch with the knowledge acquired. This resulted in the following changes to the initial table:
```
+-----------+---------------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+---------------------+------+-----+-------------------+----------------+
| id | bigint(20) unsigned | NO | PRI | NULL | auto_increment |
| url_crc | int(10) unsigned | NO | MUL | NULL | |
| updated | timestamp | NO | MUL | CURRENT_TIMESTAMP | |
| domain_id | tinyint(3) unsigned | NO | | NULL | |
| owner | int(10) unsigned | NO | MUL | 0 | |
| url | varch
Code Snippets
+------------+---------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+---------------+------+-----+-------------------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| domain_id | int(11) | NO | MUL | NULL | |
| url | varchar(250) | NO | UNI | 0 | |
| visited | enum('Y','N') | NO | | N | |
| updated | timestamp | NO | | CURRENT_TIMESTAMP | |
| hits | smallint(6) | NO | | 0 | |
| stop_visit | enum('Y','N') | NO | | N | |
+------------+---------------+------+-----+-------------------+----------------+SELECT id FROM tracker WHERE updated<=NOW() and domain_id=2 and stop_visit='N' LIMIT 1;SET profiling=1;
SHOW profiles;
SET @query_id=1
SELECT STATE, SUM(DURATION) AS Total_R,
ROUND(
100 * SUM(DURATION) /
(SELECT SUM(DURATION)
FROM INFORMATION_SCHEMA.PROFILING
WHERE QUERY_ID = @query_id
), 2) AS Pct_R,
COUNT(*) AS Calls,
SUM(DURATION) / COUNT(*) AS "R/Call"
FROM INFORMATION_SCHEMA.PROFILING
WHERE QUERY_ID = @query_id
GROUP BY STATE
ORDER BY Total_R DESC;FLUSH status;
SELECT id FROM tracker WHERE updated<=NOW() and domain_id=2 and stop_visit='N' LIMIT 1;
show status where variable_name like 'Created%' or variable_name like 'Handler%';
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| Created_tmp_disk_tables | 0 |
| Created_tmp_files | 6 |
| Created_tmp_tables | 0 |
| Handler_commit | 0 |
| Handler_delete | 0 |
| Handler_discover | 0 |
| Handler_prepare | 0 |
| Handler_read_first | 0 |
| Handler_read_key | 0 |
| Handler_read_last | 0 |
| Handler_read_next | 44687 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_next | 0 |
| Handler_rollback | 0 |
| Handler_savepoint | 0 |
| Handler_savepoint_rollback | 0 |
| Handler_update | 0 |
| Handler_write | 0 |
+----------------------------+-------++-----------+---------------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+---------------------+------+-----+-------------------+----------------+
| id | bigint(20) unsigned | NO | PRI | NULL | auto_increment |
| url_crc | int(10) unsigned | NO | MUL | NULL | |
| updated | timestamp | NO | MUL | CURRENT_TIMESTAMP | |
| domain_id | tinyint(3) unsigned | NO | | NULL | |
| owner | int(10) unsigned | NO | MUL | 0 | |
| url | varchar(255) | NO | | NULL | |
| hits | tinyint(3) unsigned | NO | | 0 | |
+-----------+---------------------+------+-----+-------------------+----------------+Context
StackExchange Database Administrators Q#63540, answer score: 9
Revisions (0)
No revisions yet.