OCFS2 No space left on device

The last time "No space left on device" appeared, I assumed it was caused by running out of inodes, but it was actually an OCFS2 bug.

Reference: Metalink ID 1232702.1, https://support.oracle.com

 

java.lang.IllegalStateException: java.lang.IllegalStateException: /pic/claimDbpic/dubang/2010/1/0507/705072010370702000189/06010/Thumb70507201037070200018906010_3.JPG (No space left on device)

 

Last login: Thu Oct 21 10:18:12 2010 from 10.0.1.26
[jboss@ca-be00-ser05 ~]$ df -i
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/sda1            33587200  260074 33327126    1% /
tmpfs                3084467       1 3084466    1% /dev/shm
/dev/mapper/pic      32768160 32440546  327614  100% /pic
/dev/mapper/app      3276960  619879 2657081   19% /app
/dev/mapper/mpath5   503316480 6076467 497240013    2% /picNL

[jboss@ca-be00-ser05 ~]$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             125G   11G  108G   9% /
tmpfs                  12G     0   12G   0% /dev/shm
/dev/mapper/pic      1001G  991G   10G 100% /pic
/dev/mapper/app       101G   19G   82G  19% /app
/dev/mapper/mpath5     15T  186G   15T   2% /picNL
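
The two df views above can also be collected from a script. The following is a minimal sketch, assuming Python 3 on a Linux host; `percent_used` and `usage` are hypothetical helper names, and the percentages only approximate what df prints, since df rounds up. On /pic both inodes and blocks still had some headroom (327614 free inodes, 10G free), which is the first hint that the ENOSPC error had a different cause.

```python
import os

def percent_used(total, free):
    """Percentage of `total` that is not free; 0.0 when the filesystem reports no total."""
    if total <= 0:
        return 0.0
    return 100.0 * (total - free) / total

def usage(path):
    """Summarise block and inode usage for the filesystem that holds `path`."""
    st = os.statvfs(path)
    return {
        "block_pct": percent_used(st.f_blocks, st.f_bfree),
        "inode_pct": percent_used(st.f_files, st.f_ffree),
        "free_bytes": st.f_bavail * st.f_frsize,
        "free_inodes": st.f_favail,
    }

if __name__ == "__main__":
    # "/pic" is the mount point from the transcript; fall back to "/" on other hosts.
    target = "/pic" if os.path.ismount("/pic") else "/"
    print(target, usage(target))
```

If a write fails with "No space left on device" while both percentages are below 100, the limit being hit is probably not the one df reports.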

 

[root@ca-be00-ser05 ~]# more /var/log/messages

Oct 21 07:41:40 ca-be00-ser05 kernel: (ocfs2_wq,6482,1):ocfs2_delete_inode:999 ERROR: status = -2
Oct 21 08:10:05 ca-be00-ser05 ntpd[6983]: time reset -0.572586 s
Oct 21 08:14:21 ca-be00-ser05 ntpd[6983]: synchronized to LOCAL(0), stratum 8
Oct 21 08:15:26 ca-be00-ser05 ntpd[6983]: synchronized to 10.1.2.101, stratum 2
Oct 21 08:22:58 ca-be00-ser05 ntpd[6983]: synchronized to LOCAL(0), stratum 8
Oct 21 08:23:21 ca-be00-ser05 kernel: (ocfs2_wq,6482,11):ocfs2_orphan_del:1841 ERROR: status = -2
Oct 21 08:23:21 ca-be00-ser05 kernel: (ocfs2_wq,6482,11):ocfs2_remove_inode:628 ERROR: status = -2
Oct 21 08:23:21 ca-be00-ser05 kernel: (ocfs2_wq,6482,11):ocfs2_wipe_inode:754 ERROR: status = -2
Oct 21 08:23:21 ca-be00-ser05 kernel: (ocfs2_wq,6482,11):ocfs2_delete_inode:999 ERROR: status = -2
Oct 21 08:23:21 ca-be00-ser05 kernel: (ocfs2_wq,6482,11):ocfs2_orphan_del:1841 ERROR: status = -2
Oct 21 08:23:21 ca-be00-ser05 kernel: (ocfs2_wq,6482,11):ocfs2_remove_inode:628 ERROR: status = -2
Oct 21 08:23:21 ca-be00-ser05 kernel: (ocfs2_wq,6482,11):ocfs2_wipe_inode:754 ERROR: status = -2
Oct 21 08:23:21 ca-be00-ser05 kernel: (ocfs2_wq,6482,11):ocfs2_delete_inode:999 ERROR: status = -2
Oct 21 08:23:21 ca-be00-ser05 kernel: (ocfs2_wq,6482,11):ocfs2_orphan_del:1841 ERROR: status = -2
Oct 21 08:23:21 ca-be00-ser05 kernel: (ocfs2_wq,6482,11):ocfs2_remove_inode:628 ERROR: status = -2
Oct 21 08:23:21 ca-be00-ser05 kernel: (ocfs2_wq,6482,11):ocfs2_wipe_inode:754 ERROR: status = -2
Oct 21 08:23:21 ca-be00-ser05 kernel: (ocfs2_wq,6482,11):ocfs2_delete_inode:999 ERROR: status = -2
Oct 21 08:29:07 ca-be00-ser05 ntpd[6983]: synchronized to 10.1.2.101, stratum 11
Oct 21 08:56:54 ca-be00-ser05 ntpd[6983]: time reset +0.174235 s
Oct 21 09:00:14 ca-be00-ser05 ntpd[6983]: synchronized to LOCAL(0), stratum 8
Oct 21 09:01:21 ca-be00-ser05 ntpd[6983]: synchronized to 10.1.2.101, stratum 2
Oct 21 09:13:45 ca-be00-ser05 kernel: (ocfs2_wq,6482,7):ocfs2_orphan_del:1841 ERROR: status = -2
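
To get an overview of a long log like the one above, the OCFS2 kernel errors can be grouped by function and status. This is a rough sketch, assuming Python 3 and the line format shown in this excerpt; `count_ocfs2_errors` is a hypothetical name, and translating the status into an errno name assumes the kernel reports a negated Linux errno (so -2 would be ENOENT).

```python
import errno
import re
from collections import Counter

# Matches lines such as:
# Oct 21 08:23:21 host kernel: (ocfs2_wq,6482,11):ocfs2_orphan_del:1841 ERROR: status = -2
PATTERN = re.compile(
    r"kernel: \([^,]+,\d+,\d+\):(?P<func>ocfs2_\w+):\d+ ERROR: status = (?P<status>-?\d+)"
)

def count_ocfs2_errors(lines):
    """Count OCFS2 kernel errors, keyed by (function name, errno name)."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue  # ntpd and other unrelated messages are skipped
        status = int(match.group("status"))
        # Assumption: status is a negated errno; unknown values are kept as numbers.
        name = errno.errorcode.get(-status, str(status))
        counts[(match.group("func"), name)] += 1
    return counts

if __name__ == "__main__":
    with open("/var/log/messages", errors="replace") as log:
        for (func, name), n in count_ocfs2_errors(log).most_common():
            print(f"{n:6d}  {func}  {name}")
```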

 

Solution

1) Permanent solution:
- Upgrade to OCFS2 1.6, back up the files on the volume, reformat it with the 1.6 mkfs.ocfs2, and restore the volume.
- Prior to 1.6, all clusters in a cluster group had to be contiguous bits in the global bitmap. With 1.6, a group is broken down into subgroups, which requires a much smaller number of contiguous bits.
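
The contiguity requirement explains why a volume with free space can still return "No space left on device". The toy model below is only an illustration, assuming Python 3: it is not OCFS2's actual allocator, and `longest_free_run` and `can_allocate` are hypothetical names. It treats the bitmap as a list of 0 (free) and 1 (used) bits and checks whether a run of the required length exists.

```python
def longest_free_run(bitmap):
    """Length of the longest run of 0 (free) bits in a sequence of 0/1 values."""
    best = run = 0
    for bit in bitmap:
        run = run + 1 if bit == 0 else 0
        best = max(best, run)
    return best

def can_allocate(bitmap, needed):
    """True if `needed` contiguous free bits exist somewhere in the bitmap."""
    return longest_free_run(bitmap) >= needed

# Fragmented toy bitmap: half of the bits are free, but never two in a row.
fragmented = [1, 0] * 8

if __name__ == "__main__":
    print("free bits:", fragmented.count(0))
    print("longest free run:", longest_free_run(fragmented))
    print("can allocate 4 contiguous bits:", can_allocate(fragmented, 4))
    print("can allocate 1 bit:", can_allocate(fragmented, 1))
```

In this model a request for a large contiguous run fails although half of the bitmap is free, while a request for a small run succeeds, which mirrors the reason smaller subgroups help.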

2) Temporary solutions:

a) Upgrade to OCFS2 1.4, back up the files on the volume, reformat it with the 1.4 mkfs.ocfs2, and restore the volume.
- This reserves 0.3% of the volume to allow for metadata expansion, but the problem could still recur.

b) Check for "spare slots" and remove one or more of them
- When a volume is created, the "-N" parameter of "mkfs.ocfs2" specifies the maximum number of nodes that can concurrently mount the volume. This reserves a "slot" for each node.
If many more node slots were created than are required, there will be unused space that can be reclaimed.
To do this:
i) unmount the volume on all nodes
ii) run "tunefs.ocfs2 -N <n>" where <n> is the reduced number of node slots
iii) remount the volume and rerun the operation to expand the file
- Note: do not perform this proactively; otherwise the freed space might get used by other objects.

c) Identify one or more large files that can be moved.
i) move it/them off the volume, and check the global bitmap for sufficient contiguous bits.
ii) move it/them back to the volume.
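
The slot-reduction steps in b) can be written down as a dry-run plan before touching the cluster. This is a sketch only, assuming Python 3; `slot_reduction_plan` is a hypothetical helper, the slot count 4 is an example value, the device and mount point are taken from the df output earlier, and passing the device as the last argument to tunefs.ocfs2 as well as mounting by mount point via /etc/fstab are assumptions to verify against the local man pages. Nothing is executed; the commands are only printed.

```python
import shlex

def slot_reduction_plan(device, mountpoint, slots):
    """Return the commands for steps i) to iii) as argv lists; nothing is executed."""
    if slots < 1:
        raise ValueError("at least one node slot is required")
    return [
        ["umount", mountpoint],                      # i) repeat on every node
        ["tunefs.ocfs2", "-N", str(slots), device],  # ii) run on a single node
        ["mount", mountpoint],                       # iii) assumes an /etc/fstab entry
    ]

if __name__ == "__main__":
    # Example values: 4 is a hypothetical slot count for illustration.
    for argv in slot_reduction_plan("/dev/mapper/pic", "/pic", 4):
        print(" ".join(shlex.quote(arg) for arg in argv))
```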
