kernel: INFO: task httpd:31276 blocked for more than 120 seconds.

环境centos5.5x86_64,这台web给数据采集用的,主要是抓取数据.启动过错出错信息:
Call Trace:
 [<ffffffff802bf8db>] out_of_memory+0x8b/0x203
 [<ffffffff8020f8a2>] __alloc_pages+0x27f/0x308
 [<ffffffff8025d117>] getnstimeofday+0x10/0x28
 [<ffffffff80213836>] __do_page_cache_readahead+0xc6/0x1ab
 [<ffffffff802140ff>] filemap_nopage+0x14c/0x360
 [<ffffffff80208ebc>] __handle_mm_fault+0x443/0x1348
 [<ffffffff8020622a>] hypercall_page+0x22a/0x1000
 [<ffffffff8026769a>] do_page_fault+0xf7b/0x12e0
 [<ffffffff8026ef4b>] monotonic_clock+0x35/0x7b
 [<ffffffff80262dd3>] thread_return+0x6c/0x113
 [<ffffffff8026082b>] error_exit+0x0/0x6e
DMA per-cpu:
cpu 0 hot: high 186, batch 31 used:26
cpu 0 cold: high 62, batch 15 used:58
cpu 1 hot: high 186, batch 31 used:35
cpu 1 cold: high 62, batch 15 used:49
DMA32 per-cpu: empty
Normal per-cpu: empty
HighMem per-cpu: empty
Free pages:        5760kB (0kB HighMem)
Active:46859 inactive:24599 dirty:0 writeback:0 unstable:0 free:1440 slab:28082 mapped-file:5 mapped-anon:76709 pagetables:390527
DMA free:5760kB min:5800kB low:7248kB high:8700kB active:187436kB inactive:98396kB present:2105344kB pages_scanned:328111276 all_unreclaimable? yes
lowmem_reserve[]: 0 0 0 0
DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 16*4kB 68*8kB 2*16kB 4*32kB 2*64kB 2*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 1*4096kB = 5760kB
DMA32: empty
Normal: empty
HighMem: empty
45 pagecache pages
Swap cache: add 14188563, delete 14188552, find 585539/2267924, race 4109+888
Free swap  = 0kB
Total swap = 4192956kB
重启后查看message错误日志:
Dec  9 12:09:33 SN7x kernel: INFO: task httpd:31276 blocked for more than 120 seconds.
Dec  9 12:09:33 SN7x kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec  9 12:09:33 SN7x kernel: httpd         D 011d5cb2df1c1910     0 31276   6472         31284 31269 (NOTLB)
Dec  9 12:09:33 SN7x kernel:  ffff88005e587d28  0000000000000286  0000000050c40dea  ffff88007f67d678 
Dec  9 12:09:33 SN7x kernel:  0000000000000008  ffff8800614cf820  ffffffff804f4b80  0000000000003d62 
Dec  9 12:09:33 SN7x kernel:  ffff8800614cfa08  ffffffff00000001 
Dec  9 12:09:33 SN7x kernel: Call Trace:
Dec  9 12:09:34 SN7x kernel:  [<ffffffff802b653d>] delayacct_end+0x5d/0x86
Dec  9 12:09:34 SN7x kernel:  [<ffffffff80263a7e>] __mutex_lock_slowpath+0x60/0x9b
Dec  9 12:09:35 SN7x kernel:  [<ffffffff80263ac8>] .text.lock.mutex+0xf/0x14
Dec  9 12:09:35 SN7x kernel:  [<ffffffff802cb8e0>] remove_exclusive_swap_page+0x10b/0x116
Dec  9 12:09:35 SN7x kernel:  [<ffffffff80222073>] generic_file_aio_write+0x4e/0xc1
Dec  9 12:09:35 SN7x kernel:  [<ffffffff8804c1b6>] :ext3:ext3_file_write+0x16/0x91
Dec  9 12:09:35 SN7x kernel:  [<ffffffff80218705>] do_sync_write+0xc7/0x104
Dec  9 12:09:35 SN7x kernel:  [<ffffffff802676cd>] do_page_fault+0xfae/0x12e0
Dec  9 12:09:35 SN7x kernel:  [<ffffffff8029c60a>] autoremove_wake_function+0x0/0x2e
Dec  9 12:09:35 SN7x kernel:  [<ffffffff802172b2>] vfs_write+0xce/0x174
Dec  9 12:09:35 SN7x kernel:  [<ffffffff80217aff>] sys_write+0x45/0x6e
Dec  9 12:09:36 SN7x kernel:  [<ffffffff802602f9>] tracesys+0xab/0xb6
 
查询结果:
 httpd:31276 blocked for more than 120 seconds.  httpd进程阻塞了120s,查了下:
This is a know bug. By default Linux uses up to 40% of the available memory for file system caching. 
 
原理:linux会设置40%的可用内存用来做系统cache,当flush数据时这40%内存中的数据由于和IO同步问题导致超时(120s)简单讲就是设置在文件 /etc/sysctl.conf中加入 “vm.dirty_ratio=10″
 
目前处理方案:
更改maxclients,将其进程数的最大值设大点或
临时处理更改了下内核参数有待后续观察:
'echo 0 > /proc/sys/kernel/hung_task_timeout_secs.
在进程处理于可中断的睡眠状态S 时,进程要等待 如 套接字、信号量被挂起,进入到对应事件等待队列中的的这个时间太长,占用了队列空间使进程长时间处于阻塞状态,影响了进程队列的有序调度。
 

 

你可能感兴趣的:(httpd,BLOCKED)