
InnoDB, MySQL 5.5.28: signal 11 segmentation faults under high load (my.cnf included)

Submitted by: @import:stackexchange-dba

Problem

We have a high-end server (128 GB RAM, 32-core Xeon, SSD RAID 10) running Ubuntu 12.04 with MySQL 5.5.28.
We run random imports into large InnoDB tables, each over 50 GB. At random, after a few hours of heavy load, mysqld gets a signal 11 and crashes.

We have tried moving to different hardware. Doing a full dump (but not a restore yet) gives no issues.
Wouldn't a dump usually fail on corrupted tables?

The crash log and my.cnf are below.

```
17:48:34 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

key_buffer_size=536870912
read_buffer_size=131072
max_used_connections=324
max_threads=200
thread_count=308
connection_count=308
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 965187 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7fc7eb1b5040
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fadf6abfe60 thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x29)[0x7fc758522759]
/usr/sbin/mysqld(handle_fatal_signal+0x483)[0x7fc7583e9ae3]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0)[0x7fc75713bcb0]
/usr/sbin/mysqld(+0x6671b0)[0x7fc75863a1b0]
/usr/sbin/mysqld(+0x61d6b9)[0x7fc7585f06b9]
/usr/sbin/mysqld(+0x630d12)[0x7fc758603d12]
/usr/sbin/mysqld(+0x6319c2)[0x7fc7586049c2]
/usr/sbin/mysqld(+0x631d85)[0x7fc758604d85]
/usr/sbin/mysqld(+0x626e7d)[0x7fc7585f9e7d]
/usr/sbin/mysqld(+0x633cea)[0x7fc758606cea]
/usr/sbin/mysqld(+0x6347e2)[0x7fc7586077e2]
/usr/sbin/mysqld(+0
```

Solution

It seems theoretically possible that a table could still dump properly if the corruption were in the indexes, which aren't dumped.

It should not be possible for anything in your configuration to cause MySQL to crash with a Signal 11, a segmentation fault.

I've been staring at this for a while now, and I haven't come up with answers, just questions (in no particular order):

  • Have you run memory diagnostics on your server? You mentioned that you "tried to move hardware", but you also said you have not yet tried restoring your dump, so I'm not clear on exactly what you moved. Resist the temptation to think "it can't be that." Test the memory.

  • Is your system using any swap space at all? Hopefully not, but if (and only if) it is, reduce innodb_buffer_pool_size until it isn't: there's no point in buffering into memory that gets swapped out, and the swap partition could be introducing problems of its own. This one is a stretch, but worth eliminating, I think.

  • Did this problem appear after an upgrade to 5.5.28, or is this a new application or deployment?

  • If it's new, have you tried reproducing the problem with MySQL 5.6?

  • Is partitioning involved? That means touching more code.

  • Are you using a binary distribution of MySQL that you downloaded from Oracle (tar/deb/rpm)? Or is it from Ubuntu (I always use generic tar binaries, so I don't know what the current version of MySQL 5.5 is in 12.04 LTS) or another source? Or did you compile it from source?

  • Are you using any unusual plugins or UDFs?
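
For the swap question above, here is a minimal sketch of one way to check, assuming a Linux host where /proc/meminfo exposes SwapTotal and SwapFree in kB. The helper name swap_used_kib is made up for this illustration. A non-zero result only shows that some pages currently sit in swap, not that the server is actively swapping under load.

```python
def swap_used_kib(meminfo_text):
    """Return swap currently in use, in kB, from /proc/meminfo-style text.

    Hypothetical helper for illustration; it assumes lines shaped like
    "SwapTotal:  4194300 kB", which is what Linux normally prints.
    """
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    # Missing fields are treated as 0, so a host without swap reports 0.
    return fields.get("SwapTotal", 0) - fields.get("SwapFree", 0)


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        print("swap in use (kB):", swap_used_kib(f.read()))
```

If this reports anything other than 0 while the imports are running, that is the cue to lower innodb_buffer_pool_size and test again.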



This could be a bug, but when you hear the sound of hooves, suspect horses before zebras (at least where I come from).

update (from comments):

"Another" memory bug?

Checking the memory would be the first thing I would try, for sure.

The snapshots should be giving you a reliable backup, I agree, but if there's any kind of binary weirdness going on in your files, it would be perfectly replicated. It will take some time, but restoring to a fresh system from mysqldump files would be a better test, since all of the table structures would be rebuilt from scratch. Since the table structures seem to be valid, this may well change nothing, but it feels like you're at the point where every possibility needs to be pinned down. Clearly, what you're seeing should not be happening.

For a new test system, though, I would install the server using the "Linux - Generic 2.6 (x86, 64-bit), Compressed TAR Archive" package from the download site. Download the tarball, verify its MD5 checksum, then tar xvzf it into /usr/local and symlink the resulting directory to /usr/local/mysql. (I think Ubuntu still puts it in /var/lib/mysql, so you can probably do this even without removing the distro version, as long as you don't have the other copy running.) Then move the "data" directory from inside /usr/local/mysql to whatever partition it needs to live on (if different), and symlink it back into /usr/local/mysql/data. Put your config file at /usr/local/mysql/my.cnf and pass --defaults-file=/usr/local/mysql/my.cnf as the first option, both when running the install scripts and when starting the server. This prevents any other my.cnf files (such as those in /etc) from being read.
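
The "verify the MD5 checksum" step can be sketched as follows. This is only an illustration: the helper names md5_hex and checksum_matches are made up here, and the expected value is whatever checksum the download page publishes for the file you fetched.

```python
import hashlib


def md5_hex(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks so that a large
    tarball does not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected_hex):
    """Compare the file's MD5 with the published value, ignoring case and
    surrounding whitespace in the expected string."""
    return md5_hex(path) == expected_hex.strip().lower()
```

For example, checksum_matches("mysql.tar.gz", published_md5) should return True before you unpack anything; the file name and the published_md5 variable are placeholders. A mismatch means the download is corrupt or is not the file you think it is.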

The rest of the setup is pretty straightforward. It's more work, but it completely eliminates the "black box" of using the package manager. The real motivation here, though, is that the distro packages may have been compiled from source, and the resulting binaries could differ slightly from the "official" Oracle binaries.

Context

StackExchange Database Administrators Q#29086, answer score: 2
