Login / Register
Developer ZoneBugs HomeReport a bugStatisticsAdvanced searchSaved searchesTags
Bug #15868 InnoDB hangs under 200 threads making INSERTs: 'thread thrashing'
Submitted: 19 Dec 2005 21:51 Modified: 31 Aug 2007 3:23
Reporter: Vadim Tkachenko
Status: Closed
Category: Server: InnoDB Severity: S2 (Serious)
Version: 4.1, 5.0 OS: Linux (Linux)
Assigned to: Paul DuBois Target Version:
ViewAdd CommentFilesDeveloperEdit SubmissionView Progress Log
[19 Dec 2005 21:51] Vadim Tkachenko
Description:
InnoDB hangs and crashed under Sysbench Insert benchmarks with 2048 threads.
there is crash in code
if (sync_array_print_long_waits()) {
fatal_cnt++;
if (fatal_cnt > 5) {
fprintf(stderr,
"InnoDB: Error: semaphore wait has lasted > %lu seconds\n"
"InnoDB: We intentionally crash the server, because it appears to be hung.\n",
srv_fatal_semaphore_wait_threshold);
ut_error;
}
} else {
fatal_cnt = 0;
}
How to repeat:
sysbench --test=oltp --oltp-table-size=100000000 --oltp-table-name=sbtest --
oltp-test-mode=nontrx --oltp-nontrx-mode=insert --max-requests=0 --num-threads=2048
--mysql-db=sbtest run
Suggested fix:
Obviously there is a problem with sync_array.
I'd test new version of Osku's patch, I'm waiting for fix of patch.[20 Dec 2005 9:41] Heikki Tuuri
Vadim,
please post what mysqld prints to the .err log.
Heikki[20 Dec 2005 22:08] Vadim Tkachenko
Full .err file takes 32MB I cannot post it.
Here is the end of output:
InnoDB: ###### Diagnostic info printed to the standard error stream
InnoDB: Error: semaphore wait has lasted > 600 seconds
InnoDB: We intentionally crash the server, because it appears to be hung.
051220 22:05:19InnoDB: Assertion failure in thread 2921647024 in file srv0srv.c line
2080
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/mysql/en/Forcing_recovery.html
InnoDB: about forcing recovery.
InnoDB: Thread 2642688944 stopped in file sync0arr.c line 126
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.
key_buffer_size=8388600
read_buffer_size=131072
max_used_connections=2048
max_connections=3000
threads_connected=2048
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 584168 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
thd=(nil)
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0xae24c74c, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x81622ec
0xb75a8d28
(nil)
0xb74d9e8a
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow
instructions on how to resolve the stack trace. Reso
lved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.[20 Dec 2005 22:53] Heikki Tuuri
Vadim,
attach it as gzipped.
Heikki[21 Dec 2005 8:26] Vadim Tkachenko
Here is link to gzipped error log:
http://mysql.apachephp.com/files/err.log.crash.gz[4 Jan 2006 17:27] Heikki Tuuri
Assigning this to Osku.[16 Jan 2006 10:47] Osku Salerma
I tried testing this with my latest patches to sync array on a dual-CPU Xeon 3.2 GHz / 6
GB memory box, but I have problems running the sysbench commands given in this bug.
First, 2048 threads are too much for some reason, it fails with weird "out of memory"
errors. Limiting the threads to 250 fails with this:
ALERT: failed to execute mysql_stmt_execute(): Err1062 Duplicate entry '0' for key 1
FATAL: database error, exiting...[16 Jan 2006 11:04] Vadim Tkachenko
Osku,
you can try thread_stack=128KB for mysql
and ulimit -s 128 for sysbench to increase memory.
and you should prepare benchmark before run by command:
sysbench --test=oltp --oltp-table-size=100000000 --oltp-table-name=sbtest
--mysql-db=sbtest prepare[16 Jan 2006 11:38] Osku Salerma
Vadim,
I did run the prepare command (it refuses to run if you haven't). Though I used the exact
same command as for run, only substituting "prepare" for "run".[16 Jan 2006 12:33] Vadim Tkachenko
Osku,
please try this sysbench:
http://mysql.apachephp.com/files/sysbench-0.4.4-sp.tar.gz[17 Jan 2006 12:47] Osku Salerma
sysbench 0.4.4 starts with 250 threads, but eventually the server dies with:
Warning: found too many locks at write: second write lock
Warning: found too many locks at write: enter write_lock
Warning: found too many locks at write: second write lock
Warning: found too many locks at write: start of release lock
Warning: found too many locks at write: after releasing lock
Warning: found too many locks at write: thr_unlock
Warning: found too many locks at write: start of release lock
Warning: found too many locks at write: second write lock
Warning: found too many locks at write: enter write_lock
Warning: found too many locks at write: second write lock
safe_mutex: Trying to lock unitialized mutex at os0sync.c, line 602
mysqld got signal 6;
I'm not sure if InnoDB should even be using the safe_mutex functions from MySQL...I will
investigate this.[17 Jan 2006 14:43] Vadim Tkachenko
Osku,
You are using debug version, I did not test it.
With debug version you will have absolutely diffferent results.[17 Jan 2006 17:44] Heikki Tuuri
Osku,
can you run sysbench inside gdb, so that we would know where that uninitialized mutex
happens?
Hmm... there might a a bug in InnoDB in creating trx objects or trx_undo objects with
many concurrent transactions. Maybe in some case an InnoDB mutex object is not
initialized though InnoDB uses it, or InnoDB uses a freed mutex object.
Regards,
Heikki[19 Jan 2006 14:44] Heikki Tuuri
Hi!
The unitilialized mutex error was probably due to the sync0arr.c bug in
...signal_object() that I fixed last night. It tried to set an event in a garbage pointer
which involves locking an OS mutex at a random location in memory.
Vadim's original crash, however, should be studied. I have never tested InnoDB with 2000
threads.
Regards,
Heikki[15 Feb 2006 16:45] Ady Wicaksono
I have the same cases, i use only 200 thread... and yes... this bug is ignoring me since I
upgrade MySQL4 to MySQL5.....
I use MySQL 5.0.18 Max ... RedHat 9 Kernel Linux 2.4.20-8smp #1 SMP Thu Mar 13 17:45:54
EST 2003 i686 i686 i386 GNU/Linux. RAM 2 Gbytes.
I Run
# ./sysbench --test=oltp --oltp-table-size=100000000 --oltp-table-name=test
--oltp-test-mode=nontrx --oltp-nontrx-mode=insert --max-requests=0 --num-threads=200
--mysql-db=test --mysql-user=root --mysql-password=simbah65infokom run
sysbench v0.4.4-sp: multi-threaded system evaluation benchmark
No DB drivers specified, using mysql
Running the test with following options:
Number of threads: 200
Doing OLTP test.
Running non-transactional test
Using Special distribution (12 iterations, 1 pct of values are returned in 75 pct
cases)
Using "BEGIN" for starting transactions
Threads started!
From processlist, many INSERT/UPDATE hang
mysql> show processlist;
+-----+------+-----------------+------+---------+------+--------+----------------------------------+
| Id | User | Host | db | Command | Time | State | Info
|
+-----+------+-----------------+------+---------+------+--------+----------------------------------+
| 3 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 4 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 5 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 6 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 7 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 8 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 9 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 10 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 11 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 12 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 13 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 14 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 15 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 16 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 17 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 18 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 19 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 20 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 21 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 22 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 23 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 24 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 25 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 26 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 27 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 28 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 29 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
| 30 | root | localhost | test | Execute | 365 | update | INSERT INTO test
values(?,?,?,?) |
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 1598035, signal count 95217
--Thread 2250521152 has waited at btr0cur.c line 359 for 220.00 seconds the semaphore:
X-lock on RW-latch at 0x4615e4bc created in file dict0dict.c line 3654
a writer (thread id 2252289472) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0cur.c line 366
Last time write locked in file btr0cur.c line 359
--Thread 2232825664 has waited at btr0cur.c line 366 for 220.00 seconds the semaphore:
S-lock on RW-latch at 0x4615e4bc created in file dict0dict.c line 3654
a writer (thread id 2252289472) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0cur.c line 366
Last time write locked in file btr0cur.c line 359
--Thread 2217884992 has waited at btr0cur.c line 366 for 220.00 seconds the semaphore:
S-lock on RW-latch at 0x4615e4bc created in file dict0dict.c line 3654
a writer (thread id 2252289472) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0cur.c line 366
Last time write locked in file btr0cur.c line 359
--Thread 2250324672 has waited at btr0cur.c line 366 for 220.00 seconds the semaphore:
S-lock on RW-latch at 0x4615e4bc created in file dict0dict.c line 3654
a writer (thread id 2252289472) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0cur.c line 366
Last time write locked in file btr0cur.c line 359
--Thread 2223582912 has waited at btr0cur.c line 366 for 220.00 seconds the semaphore:
S-lock on RW-latch at 0x4615e4bc created in file dict0dict.c line 3654
a writer (thread id 2252289472) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0cur.c line 366
Last time write locked in file btr0cur.c line 359
--Thread 2227516608 has waited at btr0cur.c line 366 for 220.00 seconds the semaphore:
S-lock on RW-latch at 0x4615e4bc created in file dict0dict.c line 3654
a writer (thread id 2252289472) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0cur.c line 366
Last time write locked in file btr0cur.c line 359
--Thread 2249538752 has waited at btr0cur.c line 366 for 220.00 seconds the semaphore:
S-lock on RW-latch at 0x4615e4bc created in file dict0dict.c line 3654
a writer (thread id 2252289472) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0cur.c line 366[15 Feb 2006 17:16] Ady Wicaksono
FYI, I just tried in MySQL 4.1.18, and found that MySQL 4.1.18 is also affected by this
bug[15 Feb 2006 17:27] Ady Wicaksono
Below tho innodb status for MySQL 4.1.18 testing, after a long time hang
InnoDB: ###### Diagnostic info printed to the standard error stream
InnoDB: Error: semaphore wait has lasted > 600 seconds
InnoDB: We intentionally crash the server, because it appears to be hung.
060215 23:22:02InnoDB: Assertion failure in thread 1788512832 in file srv0srv.c line
1899
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/mysql/en/Forcing_recovery.html
InnoDB: about forcing recovery.
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.
key_buffer_size=402653184
read_buffer_size=12578816
max_used_connections=202
max_connections=910
threads_connected=202
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 1782208 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
thd=(nil)
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0x6a9a82ec, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x815b8f0
0x4004a618
(nil)
0x420de407
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow
instructions on how to resolve the stack trace
. Resolved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.
InnoDB: Thread 1780124352 stopped in file os0sync.c line 501[15 Feb 2006 17:35] Ady Wicaksono
I also need to inform that, hang on MySQL 5 happened faster than MySQL 4, MySQL is just
inserting a million of rows and then hang, but MySQL 4 could inserting billion of rows
and then hang[15 Feb 2006 20:47] Heikki Tuuri
Ady,
please attach the COMPLETE UNEDITED .err log to this bug report. Preferably from the
5.0.xx server.
Regards,
Heikki[16 Feb 2006 4:17] Ady Wicaksono
Hi Heikki
Please find the log here http://202.147.198.67/bugSysBenchMySQL.txt.gz
BTW, why i can't upload file on "files" menu?
Thx[16 Feb 2006 14:13] Heikki Tuuri
Ady,
in both your and Vadim's reports the 'main thread' of InnoDB seems to have died or hung
after a very small number of INSERTs, in your case only 200 000! The main thread is not
waiting for any semaphore. It is just stuck somewhere.
You are using prepared statements (with the notation ?,?,?,?) in the test. Do you get a
hang also with normal statements?
Regards,
Heikki[16 Feb 2006 14:32] Heikki Tuuri
Ady,
a thread is apparently doing a B-tree page split:
--Thread 2252289472 has waited at fsp0fsp.c line 686 for 256.00 seconds the sem\
aphore:
X-lock on RW-latch at 0x40140410 created in file buf0buf.c line 485
a writer (thread id 2252289472) has reserved it in mode wait exclusive
number of readers 1, waiters flag 1
Last time read locked in file buf0flu.c line 576
Last time write locked in file fsp0fsp.c line 686
But the associated buffer pool page most probably is latched by the main thread which is
doing a flush:
Pending writes: LRU 0, flush list 9, single page 0
Main thread process no. 10176, id 1788045632, state: flushing buffer pool pages
The question is what makes the main thread vulnerable in the flush operation under a high
concurrent load. Could it use so much stack that the stack gets overwritten by another
thread.
Have you set thread stacks in some specific way?
Regards,
Heikki[16 Feb 2006 14:37] Heikki Tuuri
I am raising the priority and editing the synopsis to reflect this better.[17 Feb 2006 4:45] Ady Wicaksono
Heikki
Question:
You are using prepared statements (with the notation ?,?,?,?) in the test. Do
you get a hang also with normal statements?
Answer:
I haven't test. But, when i upgrade MySQL4 to MySQL5, my web application get a hang on
high load.
I'll try to test later using Apache Benchmark (ab). However it's simply a PHP 4 page that
doing insert on MySQL directly using php_mysql module.
When there're huge HTTP request hit this PHP page, it's equivalent with huge insert on
MySQL. Since PHP has no database pooling, it will make many connection on MySQL and each
connection will run insert into bla bla on table.
Finally, my friend suggest me to use SQLRelay (sqlrelay.sourceforge.net) so database
connection will be limited, and insert into bla bla on table will be limited too without
sacrifice HTTP request.
Question:
Have you set thread stacks in some specific way?
Answer:
No. MySQL thread stack size should be 196kbyte as default right?
Maybe useful, below is kernel configuration about max threads
# cat /proc/sys/kernel/threads-max
14336
Here is my.cnf:
========== cut here =============
[client]
port = 3306
socket = /var/lib/mysql/mysql.sock
# The MySQL server
[mysqld]
port = 3306
#bind-address = 127.0.0.1
socket = /var/lib/mysql/mysql.sock
max_connections = 910
max_connect_errors = 9999999
skip-locking
key_buffer = 384M
max_allowed_packet = 16M
table_cache = 2049
sort_buffer_size = 12M
read_buffer_size = 12M
myisam_sort_buffer_size = 64M
thread_cache = 48
query_cache_size = 32M
# Try number of CPU's*2 for thread_concurrency
thread_concurrency = 8
innodb_data_home_dir =
innodb_data_file_path =
/mysql1/data1:2000M;/mysql1/data2:2000M;/mysql1/data3:2000M;/mysql1/data4:2000M;/mysql1/data5:1000M;/m
ysql2/data6:2000M;/mysql2/data7:2000M;/mysql2/data8:2000M;/mysql2/data9:2000M;/mysql2/data10:1000M;/mysql0/data11:2000M;/mysql
0/data12:2000M;/mysql0/data13:2000M;/mysql0/data14:2000M;
# ------ OLD ---
innodb_log_group_home_dir = /mysql1
innodb_log_arch_dir = /mysql1
# You can set .._buffer_pool_size up to 50 - 80 %
# of RAM but beware of setting memory usage too high
innodb_buffer_pool_size = 512M
innodb_additional_mem_pool_size = 40M
# Set .._log_file_size to 25 % of buffer pool size
innodb_log_file_size = 100M
innodb_log_buffer_size = 8M
innodb_flush_log_at_trx_commit = 1
innodb_lock_wait_timeout = 50
[mysqldump]
quick
max_allowed_packet = 16M
add-drop-table=FALSE
========== cut here =============[17 Feb 2006 23:39] Heikki Tuuri
Ady,
when you get more hangs, please post the complete unedited .err log here. They can be a
valuable clue about what is wrong.
I will try to test this on a 4 x Opteron/Linux next week.
Regards,
Heikki[20 Feb 2006 12:12] Ady Wicaksono
Heikki,
I'm preparing an SSH account for you in my Server to test if you interesting,
please contact me via email at
[email protected].
What about your test at opteron linux?[28 Feb 2006 18:17] Heikki Tuuri
Hi!
I have now tested this on a single-processor computer, with 240 threads. But no hang
observed with the latest MySQL-5.0 tree with the compile-pentium-debug build. I ran both
inside gdb and without it. I do get the warnings that Osku described.
My next step is to test on a 4 x AMD64 Linux.
Regards,
Heikki
heikki@127:~/sysbench-0.4.4-sp/sysbench$ ./sysbench --test=oltp --oltp-table-size=1000000
--oltp-table-name=sbtest --mysql-db=sbtest prepare
sysbench v0.4.4-sp: multi-threaded system evaluation benchmark
No DB drivers specified, using mysql
Creating table 'sbtest'...
Creating 1000000 records in table 'sbtest'...
heikki@127:~/sysbench-0.4.4-sp/sysbench$ ./sysbench --test=oltp --oltp-table-size=1000000
--oltp-table-name=sbtest --oltp-test-mode=nontrx --oltp-nontrx-mode=insert
--max-requests=0 --num-threads=240 --mysql-db=sbtest run
sysbench v0.4.4-sp: multi-threaded system evaluation benchmark
No DB drivers specified, using mysql
Running the test with following options:
Number of threads: 240
Doing OLTP test.
Running non-transactional test
Using Special distribution (12 iterations, 1 pct of values are returned in 75 pct
cases)
Using "BEGIN" for starting transactions
Threads started!
heikki@127:~/sysbench-0.4.4-sp/sysbench$ uname -a
Linux 127 2.6.8-2-686 #1 Tue Aug 16 13:22:48 UTC 2005 i686 GNU/Linux
heikki@127:~/mysql-5.0/sql$ ./mysqld
060228 18:57:39 [Warning] Changed limits: max_open_files: 1024 max_connections: 1014
table_cache: 64
060228 18:57:39 InnoDB: Started; log sequence number 6 1313589783
060228 18:57:39 [Warning] mysql.user table is not updated to new password format;
Disabling new password usage until mysql_fix_privilege_tables is run
060228 18:57:39 [Note] ./mysqld: ready for connections.
Version: '5.0.19-debug' socket: '/home/heikki/bugsocket' port: 3307 Source
distribution
Warning: found too many locks at write: second write lock
Warning: found too many locks at write: enter write_lock
Warning: found too many locks at write: second write lock
Warning: found too many locks at write: enter write_lock
Warning: found too many locks at write: second write lock
Warning: found too many locks at write: enter write_lock
Warning: found too many locks at write: second write lock
Warning: found too many locks at write: enter write_lock
Warning: found too many locks at write: second write lock
Warning: found too many locks at write: enter write_lock
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 98328845; in additional pool allocated 3380992
Buffer pool size 4096
Free buffers 1482
Database pages 2613
Modified db pages 448
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages read 14708, created 29603, written 31373
0.00 reads/s, 0.57 creates/s, 0.00 writes/s
Buffer pool hit rate 1000 / 1000
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
1 read views open inside InnoDB
Main thread process no. 3132, id 1275108272, state: making checkpoint
Number of rows inserted 2017486, updated 0, deleted 0, read 1000963
26.65 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================[28 Feb 2006 19:08] Heikki Tuuri
Hi!
I tested 5.0.18 built with -O3 -g, with 900 threads on a 4 X AMD64 computer. But I could
not get any hang. I cannot test with more threads now because the open files limit of
1024 would be exceeded.
I am putting this bug report to status 'Can't repeat'. The problems with 2048 threads may
have been due to mysqld exhausting the resources of the OS.
Regards,
Heikki
bash-3.00$ uname -a
Linux staso03 2.6.9-22.ELsmp #1 SMP Mon Sep 19 18:00:54 EDT 2005 x86_64 x86_64 x86_64
GNU/Linux
bash-3.00$ ./sysbench --test=oltp --oltp-table-size=100000000 --oltp-table-name=sbtest
--oltp-test-mode=nontrx --oltp-nontrx-mode=insert --max-requests=0 --num-threads=900
--mysql-db=sbtest run
sysbench v0.4.4-sp: multi-threaded system evaluation benchmark
No DB drivers specified, using mysql
Running the test with following options:
Number of threads: 900
Doing OLTP test.
Running non-transactional test
Using Special distribution (12 iterations, 1 pct of values are returned in 75 pct
cases)
Using "BEGIN" for starting transactions
Threads started!
bash-3.00$ ./mysqld
InnoDB: The first specified data file /tmp/ibdata did not exist:
InnoDB: a new database to be created!
060228 9:26:24 InnoDB: Setting file /tmp/ibdata size to 12 MB
InnoDB: Database physically writes the file full: wait...
060228 9:26:25 InnoDB: Log file ./ib_logfile0 did not exist: new to be created
InnoDB: Setting log file ./ib_logfile0 size to 5 MB
InnoDB: Database physically writes the file full: wait...
060228 9:26:25 InnoDB: Log file ./ib_logfile1 did not exist: new to be created
InnoDB: Setting log file ./ib_logfile1 size to 5 MB
InnoDB: Database physically writes the file full: wait...
InnoDB: Doublewrite buffer not found: creating new
InnoDB: Doublewrite buffer created
InnoDB: Creating foreign key constraint system tables
InnoDB: Foreign key constraint system tables created
060228 9:26:26 InnoDB: Started; log sequence number 0 0
060228 9:26:26 [Note] ./mysqld: ready for connections.
Version: '5.0.18' socket: '/home/htuuri/mysocket' port: 3307 Source distribution
060228 9:32:33 [Note] ./mysqld: Normal shutdown
060228 9:32:33 InnoDB: Starting shutdown...
060228 9:32:36 InnoDB: Shutdown completed; log sequence number 0 43655
060228 9:32:36 [Note] ./mysqld: Shutdown complete
bash-3.00$ ./mysqld
060228 9:33:26 [Warning] Changed limits: max_open_files: 1024 max_connections: 1014
table_cache: 64
060228 9:33:26 InnoDB: Started; log sequence number 0 43655
060228 9:33:26 [Note] ./mysqld: ready for connections.
Version: '5.0.18' socket: '/home/htuuri/mysocket' port: 3307 Source distribution
060228 9:36:01 [ERROR] Error in accept: Too many open files
| 2600 | sbtest | localhost | sbtest | Execute | 4 | update | INSERT INTO sbtest
values(?,?,?,?) |
| 2601 | sbtest | localhost | sbtest | Execute | 4 | update | INSERT INTO sbtest
values(?,?,?,?) |
| 2602 | sbtest | localhost | sbtest | Execute | 4 | update | INSERT INTO sbtest
values(?,?,?,?) |
| 2603 | sbtest | localhost | sbtest | Execute | 4 | update | INSERT INTO sbtest
values(?,?,?,?) |
| 2604 | sbtest | localhost | sbtest | Execute | 6 | update | INSERT INTO sbtest
values(?,?,?,?) |
| 2605 | sbtest | localhost | sbtest | Execute | 4 | update | INSERT INTO sbtest
values(?,?,?,?) |
| 2606 | sbtest | localhost | sbtest | Execute | 4 | update | INSERT INTO sbtest
values(?,?,?,?) |
| 2607 | sbtest | localhost | sbtest | Execute | 4 | update | INSERT INTO sbtest
values(?,?,?,?) |
| 2608 | sbtest | localhost | sbtest | Execute | 2 | update | INSERT INTO sbtest
values(?,?,?,?) |
+------+--------+-----------+--------+---------+------+--------+------------------------------------+
901 rows in set (0.01 sec)
mysql> show variables like 'innodb%';
+---------------------------------+----------------------------+
| Variable_name | Value |
+---------------------------------+----------------------------+
| innodb_additional_mem_pool_size | 10485760 |
| innodb_autoextend_increment | 8 |
| innodb_buffer_pool_awe_mem_mb | 0 |
| innodb_buffer_pool_size | 67108864 |
| innodb_checksums | ON |
| innodb_commit_concurrency | 0 |
| innodb_concurrency_tickets | 500 |
| innodb_data_file_path | /tmp/ibdata:12M:autoextend |
| innodb_data_home_dir | |
| innodb_doublewrite | ON |
| innodb_fast_shutdown | 1 |
| innodb_file_io_threads | 4 |
| innodb_file_per_table | OFF |
| innodb_flush_log_at_trx_commit | 1 |
| innodb_flush_method | |
| innodb_force_recovery | 0 |
| innodb_lock_wait_timeout | 50 |
| innodb_locks_unsafe_for_binlog | OFF |
| innodb_log_arch_dir | |
| innodb_log_archive | OFF |
| innodb_log_buffer_size | 8388608 |
| innodb_log_file_size | 5242880 |
| innodb_log_files_in_group | 2 |
| innodb_log_group_home_dir | ./ |
| innodb_max_dirty_pages_pct | 90 |
| innodb_max_purge_lag | 0 |
| innodb_mirrored_log_groups | 1 |
| innodb_open_files | 300 |
| innodb_support_xa | ON |
| innodb_sync_spin_loops | 20 |
| innodb_table_locks | ON |
| innodb_thread_concurrency | 20 |
| innodb_thread_sleep_delay | 10000 |
+---------------------------------+----------------------------+
33 rows in set (0.00 sec)
mysql>[1 Mar 2006 8:22] Vadim Tkachenko
Heikki,
please note I reported about 2048 threads.
for example MyISAM can easy handle ~3000 threads.
to increase limit of open files you could use
system wide ulimit -n 10000[1 Mar 2006 21:06] Heikki Tuuri
Vadim,
I am not the sysadmin of the 4 X AMD64 computer. Was the bug visible only with 2048
threads, and not, for example, 990 threads? I tested now 990 threads with no problems.
srv_max_n_threads is the size of the sync array, and it is set to 10 000 or bigger in
srv0start.c. The value 2048 should not be any hard limit.
I will next work on the sync arr optimization bug #15815. If this bug was caused by the
OS choking under too many thread switches, then fixing #15815 may solve also this bug.
Regards,
Heikki[1 Mar 2006 21:17] Vadim Tkachenko
Heikki,
I tested with 1024 threads.
InnoDB works fine under such workload.[2 Mar 2006 13:41] Ady Wicaksono
Vadim
Why i can't create more than 254 thread?
# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 7168
virtual memory (kbytes, -v) unlimited
# ulimit -n 5012
# ./sysbench --test=oltp --oltp-table-size=100000000 --oltp-table-name=sbtest
--oltp-test-mode=nontrx --oltp-nontrx-mode=insert --max-requests=0 --num-threads=400
--mysql-db=test run --mysql-password=sbtest --mysql-user=sbtest --mysql-host=127.0.0.1
sysbench v0.4.1: multi-threaded system evaluation benchmark
No DB drivers specified, using mysql
Running the test with following options:
Number of threads: 400
Doing OLTP test.
Running non-transactional test
Using Special distribution (12 iterations, 1 pct of values are returned in 75 pct
cases)
Using "BEGIN" for starting transactions
FATAL: Thread #254 creation failed errno = 12 (Cannot allocate memory)[2 Mar 2006 13:43] Ady Wicaksono
Heikki,
You use kernel 2.6, is it kernel issue or what? Maybe you have idea
I use kernel 2.4 SMP[2 Mar 2006 13:56] Vadim Tkachenko
Ady,
You have too high stack size (8M for stack is giant)
Please change
ulimit -s 192[2 Mar 2006 14:30] Ady Wicaksono
I try with 600 threads and MySQL got hang
# ulimit -s 192
# ./sysbench --test=oltp --oltp-table-size=100000000 --oltp-table-name=sbtest
--oltp-test-mode=nontrx --oltp-nontrx-mode=insert --max-requests=0 --num-threads=600
--mysql-db=test run --mysql-password=sbtest --mysql-user=sbtest --mysql-host=127.0.0.1
sysbench v0.4.1: multi-threaded system evaluation benchmark
No DB drivers specified, using mysql
Running the test with following options:
Number of threads: 600
Doing OLTP test.
Running non-transactional test
Using Special distribution (12 iterations, 1 pct of values are returned in 75 pct
cases)
Using "BEGIN" for starting transactions
Threads started!
-------------- mysql processlist full of these INSERT that wait for more than 624
seconds :( ----------
| 10722 | sbtest | localhost:60789 | test | Execute | 624 | update | INSERT INTO sbtest
(k,c,pad) values(?,?,?) |
| 10723 | sbtest | localhost:60790 | test | Execute | 628 | update | INSERT INTO sbtest
(k,c,pad) values(?,?,?) |
| 10724 | sbtest | localhost:60791 | test | Execute | 628 | update | INSERT INTO sbtest
(k,c,pad) values(?,?,?) |
| 10725 | sbtest | localhost:60792 | test | Execute | 624 | update | INSERT INTO sbtest
(k,c,pad) values(?,?,?) |
| 10726 | sbtest | localhost:60793 | test | Execute | 624 | update | INSERT INTO sbtest
(k,c,pad) values(?,?,?) |
| 10727 | sbtest | localhost:60794 | test | Execute | 624 | update | INSERT INTO sbtest
(k,c,pad) values(?,?,?) |
| 10728 | sbtest | localhost:60795 | test | Execute | 624 | update | INSERT INTO sbtest
(k,c,pad) values(?,?,?) |
| 10729 | sbtest | localhost:60796 | test | Execute | 624 | update | INSERT INTO sbtest
(k,c,pad) values(?,?,?) |
| 10730 | sbtest | localhost:60797 | test | Execute | 624 | update | INSERT INTO sbtest
(k,c,pad) values(?,?,?) |
| 10731 | sbtest | localhost:60798 | test | Execute | 624 | update | INSERT INTO sbtest
(k,c,pad) values(?,?,?) |
| 10732 | sbtest | localhost:60799 | test | Execute | 629 | update | INSERT INTO sbtest
(k,c,pad) values(?,?,?) |
| 10733 | sbtest | localhost:60800 | test | Execute | 624 | update | INSERT INTO sbtest
(k,c,pad) values(?,?,?) |
| 10734 | sbtest | localhost:60801 | test | Execute | 624 | update | INSERT INTO sbtest
(k,c,pad) values(?,?,?) |
| 10735 | sbtest | localhost:60802 | test | Execute | 624 | update | INSERT INTO sbtest
(k,c,pad) values(?,?,?) |
-------------- mysql processlist ----------
-------------- Last lines of mysql.err ----------
InnoDB: ###### Starts InnoDB Monitor for 30 secs to print diagnostic info:
InnoDB: Pending preads 0, pwrites 0
InnoDB: ###### Diagnostic info printed to the standard error stream
InnoDB: Error: semaphore wait has lasted > 600 seconds
InnoDB: We intentionally crash the server, because it appears to be hung.
060302 20:25:40InnoDB: Assertion failure in thread 1779657152 in file srv0srv.c line
2092
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/mysql/en/Forcing_recovery.html
InnoDB: about forcing recovery.
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.
key_buffer_size=10485760
read_buffer_size=520192
max_used_connections=901
max_connections=910
threads_connected=602
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 938432 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
thd=(nil)
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0x6a13626c, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x81916c0
0x40048618
(nil)
0x420de407
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow
instructions on how to resolve the stack trac
e. Resolved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.
Number of processes running now: 0
060302 20:25:40 mysqld restarted
060302 20:25:40 InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
060302 20:25:41 InnoDB: Starting log scan based on checkpoint at
InnoDB: log sequence number 6 389178624.
InnoDB: Doing recovery: scanned up to log sequence number 6 389238551
060302 20:25:41 InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38
39 40 41 42 43 44 45 46 47 48 49 50
51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79
80 81 82 83 84 85 86 87 88 89 90 91 9
2 93 94 95 96 97 98 99
InnoDB: Apply batch completed
InnoDB: Last MySQL binlog file position 0 0, file name
060302 20:25:41 InnoDB: Started; log sequence number 6 389238551
060302 20:25:41 [Note] /usr/sbin/mysqld-max: ready for connections.
Version: '5.0.18-max' socket: '/var/lib/mysql/mysql.sock' port: 3306 MySQL Community
Edition - Experimental (GPL)
-------------- Last lines of mysql.err ----------
Heikki, i think this bug is not can't repeat, you should try with kernel 2.4 SMP
# uname -a
Linux db 2.4.20-8smp #1 SMP Thu Mar 13 17:45:54 EST 2003 i686 i686 i386 GNU/Linux
Thanks[2 Mar 2006 15:20] Ady Wicaksono
Vadim,
May i know, u try this sysbench on what Linux kernel version?[2 Mar 2006 15:53] Vadim Tkachenko
Ady,
RedHat ES 3,
2.4.21-15.ELsmp
quadxeon[3 Mar 2006 4:55] Ady Wicaksono
Heikki,
I try to test this problem on the same machine, Dell PE 2850, Intel XEON 3.0Hgz
HyperThread, 2 Gbytes Memory but using Kernel 2.6
The problem is the same :(
# uname -a
Linux backup 2.6.11-1.1369_FC4smp #1 SMP Thu Jun 2 23:08:39 EDT 2005 i686 i686 i386
GNU/Linux
# MySQL version 5.0.15-max
# ulimit -s 192
# ulimit -n 4096
# ./sysbench --test=oltp --oltp-table-size=100000000 --oltp-table-name=sbtest
--oltp-test-mode=nontrx --oltp-nontrx-mode=insert --max-requests=0 --num-threads=2048
--mysql-db=test run --mysql-password=sbtest --mysql-user=sbtest --mysql-host=127.0.0.1
sysbench v0.4.4-sp: multi-threaded system evaluation benchmark
No DB drivers specified, using mysql
Running the test with following options:
Number of threads: 2048
Doing OLTP test.
Running non-transactional test
Using Special distribution (12 iterations, 1 pct of values are returned in 75 pct
cases)
Using "BEGIN" for starting transactions
Threads started!
..........
On other side, these're MySQL processlist
------------------------
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 6155 to server version: 5.0.15-max
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> show processlist;
+------+------+-----------------+------+---------+------+--------------+------------------------------------+
| Id | User | Host | db | Command | Time | State | Info
|
+------+------+-----------------+------+---------+------+--------------+------------------------------------+
| 4106 | root | localhost:46749 | test | Execute | 142 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4107 | root | localhost:46750 | test | Execute | 228 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4108 | root | localhost:46751 | test | Execute | 228 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4109 | root | localhost:46752 | test | Execute | 227 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4110 | root | localhost:46753 | test | Execute | 228 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4111 | root | localhost:46754 | test | Execute | 228 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4112 | root | localhost:46755 | test | Execute | 9 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4113 | root | localhost:46756 | test | Execute | 228 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4114 | root | localhost:46757 | test | Execute | 228 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4115 | root | localhost:46758 | test | Execute | 228 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4116 | root | localhost:46759 | test | Execute | 228 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4117 | root | localhost:46760 | test | Execute | 227 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4118 | root | localhost:46761 | test | Execute | 90 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4119 | root | localhost:46762 | test | Execute | 228 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4120 | root | localhost:46763 | test | Execute | 228 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4121 | root | localhost:46764 | test | Execute | 228 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4122 | root | localhost:46765 | test | Execute | 226 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4123 | root | localhost:46766 | test | Execute | 227 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4124 | root | localhost:46767 | test | Execute | 227 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4125 | root | localhost:46768 | test | Execute | 227 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4126 | root | localhost:46769 | test | Execute | 227 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4127 | root | localhost:46770 | test | Execute | 228 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4128 | root | localhost:46771 | test | Execute | 227 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4129 | root | localhost:46772 | test | Execute | 226 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4130 | root | localhost:46773 | test | Execute | 187 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4131 | root | localhost:46774 | test | Execute | 228 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4132 | root | localhost:46775 | test | Execute | 228 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4133 | root | localhost:46776 | test | Execute | 229 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4134 | root | localhost:46777 | test | Execute | 144 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4135 | root | localhost:46778 | test | Execute | 228 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4136 | root | localhost:46779 | test | Execute | 226 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4137 | root | localhost:46780 | test | Execute | 228 | update | INSERT INTO
sbtest values(?,?,?,?) |
| 4138 | root | localhost:46781 | test | Execute | 228 | update | INSERT INTO
sbtest values(?,?,?,?) |
---------------------------------
Heiki, this bug drives me crazy :), I use it on application with huge insert/update,
currently i use db pooling (SQLRelay) to decrease the impact of this bug.[3 Mar 2006 5:08] Ady Wicaksono
For more information, last test on kernel 2.6
1. mysql user could use up to 2.8 Gbytes Memory
2. Here is the mysql process
mysql 21360 130 25.8 2067072 536888 pts/0 Sl 17:36 37:52 /usr/sbin/mysqld-max
--basedir=/ --datadir=/var/lib/mysql --
3. After a few minutes.... i decided to stop sysbench, but MySQL not crash, it simply
hang, but very annoying
4. I run sysbench again but with 600 thread, and it's work fine...
# ./sysbench --test=oltp --oltp-table-size=100000000 --oltp-table-name=sbtest
--oltp-test-mode=nontrx --oltp-nontrx-mode=insert --max-requests=0 --num-threads=600
--mysql-db=test run --mysql-password=simbah65infokom --mysql-user=root
--mysql-host=127.0.0.1
sysbench v0.4.4-sp: multi-threaded system evaluation benchmark
No DB drivers specified, using mysql
Running the test with following options:
Number of threads: 600
Doing OLTP test.
Running non-transactional test
Using Special distribution (12 iterations, 1 pct of values are returned in 75 pct
cases)
Using "BEGIN" for starting transactions
Threads started!
No INSERT process hang.[3 Mar 2006 18:40] Heikki Tuuri
Ady,
does it hang with 990 threads? With 1500 threads?
Regards,
Heikki[4 Mar 2006 5:34] Ady Wicaksono
I try with 990 threads, and below the result.
$ uname -a
Linux db 2.6.11-1.1369_FC4smp #1 SMP Thu Jun 2 23:08:39 EDT 2005 i686 i686 i386
GNU/Linux
$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Xeon(TM) CPU 3.00GHz
stepping : 1
cpu MHz : 2993.085
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36
clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe lm pni monitor ds_cpl cid cx16 xtpr
bogomips : 5931.00
processor : 1
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Xeon(TM) CPU 3.00GHz
stepping : 1
cpu MHz : 2993.085
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36
clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe lm pni monitor ds_cpl cid cx16 xtpr
bogomips : 5980.16
$ gcc -v
Using built-in specs.
Target: i386-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-libgcj-multifile
--enable-languages=c,c++,objc,java,f95,ada --enable-java-awt=gtk
--with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre --host=i386-redhat-linux
Thread model: posix
gcc version 4.0.0 20050519 (Red Hat 4.0.0-8)
$ ps aux|grep mysqld-max
mysql 21360 5.0 23.4 1859492 485968 ? Sl Mar03 75:08 /usr/sbin/mysqld-max
--basedir=/ --datadir=/var/lib/mysql --user=mysql
--pid-file=/var/lib/mysql/backuplistener.pid --skip-locking --port=3306
--socket=/var/lib/mysql/mysql.sock
$ free
total used free shared buffers cached
Mem: 2074700 1568536 506164 0 118524 806772
-/+ buffers/cache: 643240 1431460
Swap: 4192924 44 4192880
# ulimit -s 192
# ulimit -n 4096
# ./sysbench --test=oltp --oltp-table-size=100000000 --oltp-table-name=sbtest
--oltp-test-mode=nontrx --oltp-nontrx-mode=insert --max-requests=0 --num-threads=990
--mysql-db=test run --mysql-password=sbtest --mysql-user=sbtest --mysql-host=127.0.0.1
sysbench v0.4.4-sp: multi-threaded system evaluation benchmark
No DB drivers specified, using mysql
Running the test with following options:
Number of threads: 990
Doing OLTP test.
Running non-transactional test
Using Special distribution (12 iterations, 1 pct of values are returned in 75 pct
cases)
Using "BEGIN" for starting transactions
Threads started!
No INSERT Hang, but the performance of insert is poor, look at the data below, as a end
user, 990 threads should be equivalent to successfull 990 insert/seconds, but it's not
happened here. I don't know the explanation
mysql> select NOW(); select count(*) from sbtest;
+---------------------+
| NOW() |
+---------------------+
| 2006-03-04 18:27:15 |
+---------------------+
1 row in set (0.00 sec)
+----------+
| count(*) |
+----------+
| 17038 |
+----------+
1 row in set (34.57 sec)
mysql> select NOW(); select count(*) from sbtest;
+---------------------+
| NOW() |
+---------------------+
| 2006-03-04 18:27:52 |
+---------------------+
1 row in set (0.00 sec)
+----------+
| count(*) |
+----------+
| 17488 |
+----------+
1 row in set (1 min 37.25 sec)
I'll try later with 1500 thread[7 Mar 2006 16:44] Ady Wicaksono
Heikki,
I cancel to test with 1500 thread,
This bug maybe relatted to http://bugs.mysql.com/bug.php?id=15815,
We decide to downgrade to MySQL 4.1.x, which more stable on concurrent insert
Thank you.[10 Mar 2006 9:20] Jeff Freund
From the descriptions here, I believe that this issue just happened for us. Machine has
been 100% stable in production for ~2 months before this incident.
We are running 5.0.18-standard-log on 2xDual Core Opteron, with
2.6.9-22.0.1.106.unsupportedsmp #1 SMP x86_64 x86_64 x86_64 GNU/Linux. Max allowed
connections=400.
InnoDB appeared to just freeze up. Alive check for database (select 1;) ran as usual -
anything accessing innodb tables was stuck. Typical traffic on the db averages a few
hundred up to pecks of about 1500 queries per second, split between
selects/inserts/updates.[10 Mar 2006 9:54] Ady Wicaksono
Hi Jeff,
For a temporary solutions while you want using MySQL 5.0.x with InnoDB
you may use SQLRelay (sqlrelay.sourceforge.net) to limit connection ~ limit threads of
MySQL. It quite usefull.
However, i prefer to downgrade my MySQL from 5.0.x to MySQL 4.[10 Mar 2006 16:23] Heikki Tuuri
Jeff,
can you please post the complete, unedited .err log from the hung server?
Did you run prepared statements?
It could be something different from the hang problem in this bug report. This bug report
only contains hangs that are triggered by an artificial workload of several hundred
threads in sysbench. I have not seen any report from a real-world application.
Regards,
Heikki[10 Mar 2006 17:53] Jeff Freund
Thanks Heikki -- We do use Prepared Statements.
There actually were no entries in the .err log before/while it was frozen up. Below I
have included the contents of the .err log upon restart.
Also, I have the results of SHOW INNODB STATUS while it was "frozen" - however it is too
big to post in the comments here, and I can't attach a file as I am not the original
submitter. Please advise on how to get you that information.
Thanks.
Jeff
.err
---------------------------------------------
060309 23:31:25 [Note] /usr/sbin/mysqld: Normal shutdown
060309 23:31:25 [Note] Slave I/O thread killed while reading event
060309 23:31:25 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.000011',
position 33467209
060309 23:38:06 mysqld started
060309 23:38:06 InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
060309 23:38:07 InnoDB: Starting log scan based on checkpoint at
InnoDB: log sequence number 39 3064686203.
InnoDB: Doing recovery: scanned up to log sequence number 39 3064686203
InnoDB: Transaction 0 341644458 was in the XA prepared state.
InnoDB: 15 transaction(s) which must be rolled back or cleaned up
InnoDB: in total 14 row operations to undo
InnoDB: Trx id counter is 0 341645824
InnoDB: In a MySQL replication slave the last master binlog file
InnoDB: position 0 1464524, file name mysql-bin.000011
InnoDB: Last MySQL binlog file position 0 1070123614, file name
/db_logs/mysql/repl/mysql-bin.000086
InnoDB: Starting in background the rollback of uncommitted transactions
060309 23:38:08 InnoDB: Rolling back trx with id 0 341645461, 1 rows to undo
060309 23:38:08 InnoDB: Started; log sequence number 39 3064686203
060309 23:38:08 [Note] Recovering after a crash using /db_logs/mysql/repl/mysql-bin
InnoDB: Rolling back of trx id 0 341645461 completed
060309 23:38:08 InnoDB: Rolling back trx with id 0 341645428, 1 rows to undo
InnoDB: Rolling back of trx id 0 341645428 completed
060309 23:38:08 InnoDB: Rolling back trx with id 0 341645339, 1 rows to undo
InnoDB: Rolling back of trx id 0 341645339 completed
060309 23:38:08 InnoDB: Rolling back trx with id 0 341645162, 1 rows to undo
InnoDB: Rolling back of trx id 0 341645162 completed
060309 23:38:08 InnoDB: Rolling back trx with id 0 341645122, 1 rows to undo
InnoDB: Rolling back of trx id 0 341645122 completed
060309 23:38:08 InnoDB: Rolling back trx with id 0 341645120, 1 rows to undo
InnoDB: Rolling back of trx id 0 341645120 completed
060309 23:38:08 InnoDB: Rolling back trx with id 0 341645088, 1 rows to undo
InnoDB: Rolling back of trx id 0 341645088 completed
060309 23:38:08 InnoDB: Rolling back trx with id 0 341645056, 1 rows to undo
InnoDB: Rolling back of trx id 0 341645056 completed
060309 23:38:08 InnoDB: Rolling back trx with id 0 341645049, 1 rows to undo
InnoDB: Rolling back of trx id 0 341645049 completed
060309 23:38:08 InnoDB: Rolling back trx with id 0 341645030, 1 rows to undo
InnoDB: Rolling back of trx id 0 341645030 completed
060309 23:38:08 InnoDB: Rolling back trx with id 0 341644587, 1 rows to undo
InnoDB: Rolling back of trx id 0 341644587 completed
060309 23:38:08 InnoDB: Rolling back trx with id 0 341644570, 1 rows to undo
InnoDB: Rolling back of trx id 0 341644570 completed
060309 23:38:08 InnoDB: Rolling back trx with id 0 341644522, 1 rows to undo
InnoDB: Rolling back of trx id 0 341644522 completed
060309 23:38:08 InnoDB: Rolling back trx with id 0 341644491, 1 rows to undo
InnoDB: Rolling back of trx id 0 341644491 completed
060309 23:38:08 InnoDB: Rollback of non-prepared transactions completed
060309 23:38:18 [Note] Starting crash recovery...
060309 23:38:18 InnoDB: Starting recovery for XA transactions...
060309 23:38:18 InnoDB: Transaction 0 341644458 in prepared state after recovery
060309 23:38:18 InnoDB: Transaction contains changes to 22 rows
060309 23:38:18 InnoDB: 1 transactions in prepared state after recovery
060309 23:38:18 [Note] Found 1 prepared transaction(s) in InnoDB
060309 23:38:18 [Note] Crash recovery finished.
060309 23:38:18 [ERROR] /usr/sbin/mysqld: Table './mysql/user' is marked as crashed and
should be repaired
060309 23:38:18 [Warning] Checking table: './mysql/user'
060309 23:38:18 [ERROR] 1 client is using or hasn't closed the table properly
060309 23:38:18 [ERROR] /usr/sbin/mysqld: Table './mysql/db' is marked as crashed and
should be repaired
060309 23:38:18 [Warning] Checking table: './mysql/db'
060309 23:38:18 [ERROR] 1 client is using or hasn't closed the table properly
060309 23:38:18 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.0.18-standard-log' socket: '/db_data/mysql/mysql.sock' port: 3306 MySQL
Community Edition - Standard (GPL)
060309 23:38:18 [Note] Slave SQL thread initialized, starting replication in log
'mysql-bin.000011' at position 1464524, relay log './cms-mysql01-relay-bin.001929'
position: 235
060309 23:38:18 [Note] Slave I/O thread: connected to master 'repl@cms-mysql02:3306',
replication started in log 'mysql-bin.000011' at position 33467209[10 Mar 2006 18:32] Heikki Tuuri
Jeff,
please send the output to
[email protected]
If the output is > 1 MB, then gzip it.
Regards,
Heikki[10 Mar 2006 23:26] Heikki Tuuri
Jeff,
your hang report looks different from the ones by Vadim and Ady.
---TRANSACTION 0 341644491, ACTIVE 2908 sec, process no 15722, OS thread id 1179335008,
thread declared inside InnoDB 498
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1216, undo log entries 1
MySQL thread id 334757, query id 395178082
---TRANSACTION 0 341644458, ACTIVE (PREPARED) 2908 sec, process no 15722, OS thread id
1177528672
6 lock struct(s), heap size 1216, undo log entries 22
MySQL thread id 334694, query id 395178059
commit
Trx read view will not see trx with id >= 0 341644469, sees < 0 341644381
The threads are not waiting for an InnoDB semaphore, but are probably waiting for a MySQL
mutex. There was a similar bug report in the Forums a few weeks ago. I think also that was
associated with prepared statements.
You may be hitting this bug:
http://bugs.mysql.com/bug.php?id=18116
Or you may want to open a new bug report.
Regards,
Heikki[10 Mar 2006 23:42] Heikki Tuuri
It was Anda Zhao who reported a hang that looks similar to Jeff's:
http://forums.mysql.com/read.php?22,72478,75227#msg-75227[29 Mar 2006 22:34] Heikki Tuuri
Hmm... Vadim's original hang report looks similar to Ady's.
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 107048906; in additional pool allocated 17150464
Buffer pool size 4096
Free buffers 0
Database pages 4095
Modified db pages 56
Pending reads 0
Pending writes: LRU 0, flush list 2, single page 0
Pages read 14376, created 29335, written 29457
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
1 read views open inside InnoDB
Main thread process no. 13784, id 2909039536, state: flushing buffer pool pages
It is the main thread that is stuck flushing buffer pool pages.
The main thread:
--Thread 2909039536 has waited at buf0flu.c line 586 for 11.00 seconds the sema\
phore:
S-lock on RW-latch at 0xb1f13c58 created in file buf0buf.c line 485
a writer (thread id 2661153712) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file ../include/btr0btr.ic line 28
Last time write locked in file ../include/btr0btr.ic line 28
In Vadim's output we see the main thread waiting for a semaphore for a few seconds, then
waking up, and then ending up waiting again! This looks like a livelock of the main
thread. The question is why the write I/O thread does not handle the pending writes.
In buf0flu.c, the main thread does call buf_flush_buffered_writes() before waiting for
the S-latch:
if (!locked) {
buf_flush_buffered_writes();
rw_lock_s_lock_gen(&(block->lock), BUF_IO_WRITE);
}[29 Mar 2006 22:40] Heikki Tuuri
The bug may be an InnoDB race condition in fil0fil.c or os0file.c. For some reason a few
pending write requests are never handled but are lost.[8 Apr 2006 0:19] Heikki Tuuri
Hi!
I analyzed Vadim's .err log in more detail. We see that the master thread, which is doing
a buffer pool flush, becomes starved when it tries to get an S-latch on a certain page.
Many other threads acquire an X-latch to the same page meanwhile.
buf0flu.c:
mutex_exit(&(buf_pool->mutex));
if (!locked) {
buf_flush_buffered_writes();
rw_lock_s_lock_gen(&(block->lock), BUF_IO_WRITE); ### waits
here!
}
Looks like this is not a genuine hang. All the way until the assertion, there are threads
that have just waited for < 10 seconds for a semaphore. But some threads have waited for >
600 seconds, that is why InnoDB gets an assertion failure and crashes mysqld.
Thus, this problem probably will be fixed when #15815 is fixed.
Regards,
Heikki
--Thread 2909039536 has waited at buf0flu.c line 586 for 11.00 seconds the sema\
phore:
S-lock on RW-latch at 0xb1f13c58 created in file buf0buf.c line 485
a writer (thread id 2661153712) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file ../include/btr0btr.ic line 28
Last time write locked in file ../include/btr0btr.ic line 28
--Thread 2909039536 has waited at buf0flu.c line 586 for 4.00 seconds the semap\
hore:
S-lock on RW-latch at 0xb1f13c58 created in file buf0buf.c line 485
a writer (thread id 2877311920) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file ../include/btr0btr.ic line 28
Last time write locked in file ../include/btr0btr.ic line 28
--Thread 2909039536 has waited at buf0flu.c line 586 for 1.00 seconds the semap\
hore:
S-lock on RW-latch at 0xb1f13c58 created in file buf0buf.c line 485
a writer (thread id 2810477488) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file ../include/btr0btr.ic line 28
Last time write locked in file ../include/btr0btr.ic line 28
--Thread 2909039536 has waited at buf0flu.c line 586 for 0.00 seconds the semap\
hore:
S-lock on RW-latch at 0xb1f13c58 created in file buf0buf.c line 485
a writer (thread id 2654129072) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file ../include/btr0btr.ic line 28
Last time write locked in file ../include/btr0btr.ic line 28
--Thread 2909039536 has waited at buf0flu.c line 586 for 5.00 seconds the semap\
hore:
S-lock on RW-latch at 0xb1f13c58 created in file buf0buf.c line 485
a writer (thread id 2659748784) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file ../include/btr0btr.ic line 28
Last time write locked in file ../include/btr0btr.ic line 28
--Thread 2909039536 has waited at buf0flu.c line 586 for 5.00 seconds the semap\
hore:
S-lock on RW-latch at 0xb1f13c58 created in file buf0buf.c line 485
a writer (thread id 2505006000) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file ../include/btr0btr.ic line 28
Last time write locked in file ../include/btr0btr.ic line 28[8 Apr 2006 1:06] Heikki Tuuri
A workaround for this bug may be setting in my.cnf innodb_thread_concurrency=8 or even
=1.
The default is 8 in 4.1.x. In 5.0, the default is not to restrict it at all. That may
explain why these hang bugs started to flow in with 5.0. It is our old friend, 'thread
thrashing' that in an extreme form may look like a hang.[8 Apr 2006 7:50] Heikki Tuuri
Assigning this to Marko.
Patch for 5.0: in mysqld.cc:
{"innodb_thread_concurrency", OPT_INNODB_THREAD_CONCURRENCY,
"Helps in performance tuning in heavily concurrent environments. "
"Sets the maximum number of threads allowed inside InnoDB.