The df and du commands provide different system information and I cannot write to a partition that df says is 100% full. Which is correct and why does the system not allow any data writes to this partition?
Environment
- Red Hat Enterprise Linux (RHEL)
Issue
- Unable to find which files are consuming disk space.
- Why is there a mismatch between df and du outputs?
- The df and du commands provide different information and I cannot write to a partition that df says is 100% full. Which is correct and why does the system not allow any data writes to this partition?
- df says that the system is out of disk space on one of my partitions, but du shows plenty of space left. For example:

    # df -h
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/md1        7.9G  7.8G     0 100% /var
    # du -sh /var
    50M     var
Resolution
There are multiple possible causes for a difference in 'df' and 'du' output:
- [Mounted-over] - Files in a directory that now has another file system mounted over it. Please see the Resolution section of How can I see what is consuming space underneath a mounted partition?.
- [Open/deleted] - Running processes holding open deleted files. To free up the space, the processes holding open the deleted files must exit. Please see the Diagnostic Steps section for more information on locating such processes.
- [Filesystem-state] - Filesystem corruption could lead to such a difference. If the filesystem in question is on an LVM volume and there is free space in its volume group, then creating a snapshot of the volume and running "fsck" on the snapshot is recommended to find out if the filesystem is corrupted.
Root Cause
Running processes holding open deleted files [Open/deleted]
df may report that the partition is full if the inodes or disk space are consumed by a running process. This may happen when a service such as samba has written a very large log file. When logrotate runs nightly, it deletes the big log file but does not restart the service, so the process still holds the file open and the space stays allocated. Restarting the process should release the handle on the now-deleted file, and the space should be freed.
- [Open/Deleted] - Check lsof output for files listed as (deleted). These files have been deleted (or more properly, unlinked) from the filesystem tree, but because one or more processes still have them open, the disk space they occupy cannot be reclaimed. So the df command will still account for these open/deleted files, while du, which scans the filesystem, won't see them any more and will not account for them. For example, below we see four deleted files listed in lsof, but an attempt to list the files fails since they have been removed/unlinked from the filesystem's directory structure.

    # lsof | grep COMMAND ; lsof | grep deleted
    COMMAND   PID USER  FD  TYPE DEVICE SIZE/OFF   NODE NAME
    oracle  21869 1000 16w   REG  253,5  7476824 130802 /oracle/diag/EIPDB2_ping_21869.trc (deleted)
    oracle  21869 1000 17w   REG  253,5  8676828 130804 /oracle/diag/EIPDB2_ping_21869.trm (deleted)
    oracle  21927 1000 18w   REG  253,5  7375910 130983 /oracle/diag/EIPDB2_asmb_21927.trc (deleted)
    oracle  21927 1000 19w   REG  253,5  6748612 130658 /oracle/diag/EIPDB2_asmb_21927.trm (deleted)
    # ls -lh /oracle/diag/EIPDB2*.tr*
    #
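The open/deleted behaviour can be reproduced safely with a scratch file. The following is only a minimal sketch, assuming a Linux system with /proc mounted; the descriptor number 3 and the variable names are arbitrary choices for illustration and are not part of the original article.

```shell
#!/bin/sh
# Create a scratch file, keep it open on descriptor 3, then unlink it.
scratch=$(mktemp)
exec 3>"$scratch"
echo "still here" >&3
rm -f "$scratch"

# The name is gone from the directory tree, so du and ls cannot find it...
if [ ! -e "$scratch" ]; then
    echo "path removed: $scratch"
fi

# ...but the inode is still open. readlink inherits descriptor 3, so
# /proc/self/fd/3 refers to its own copy of the open file; on Linux the
# link target of an unlinked file ends in "(deleted)".
fd_target=$(readlink /proc/self/fd/3)
echo "descriptor 3 -> $fd_target"

# Closing the last open descriptor is what finally releases the blocks.
exec 3>&-
```

The same thing happens on a larger scale when a daemon keeps writing to a log file that logrotate has already removed: the space is returned only when the daemon closes the file or exits.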
Files in a directory that now has another file system mounted over it [Mounted-over]
If a path given to df / du contains a secondary file system, mounted on top of a primary file system, and the primary file system contains files which are now hidden, the output will be different. For example, suppose /opt/foo is a separate file system mounted on top of /opt. Prior to mounting the secondary file system /opt/foo, the primary file system contained a directory /opt/foo which had files in it. Once the secondary file system is mounted onto /opt/foo, the files from the primary file system inside the /opt/foo directory are no longer visible, but still consume space.
- [Mounted Over] - There may be files hidden under a mounted file system. For instance, if the directory /mnt/test contained large files, and then an NFS file system was mounted on /mnt/test, df would continue to account for the size of those files but du would not. If the NFS file system mounted on /mnt/test cannot be unmounted for inspection (for example, because that would cause a production outage), this can be verified by bind mounting the suspected filesystem on a different directory and inspecting its contents:

    $ mkdir /tmp/root_chk
    $ mount --bind / /tmp/root_chk
    $ du -h /tmp/root_chk/mnt/test
- For example, starting with the following, we see that the df command reports /tmp is using 3.1G of allocated space but du is only reporting 11M:

    # df -TPh /tmp/* | sort -u -k 1 -r
    Filesystem             Type  Size  Used Avail Use% Mounted on
    /dev/sdc2              ext3  147G  188M  140G   1% /tmp/mnt
    /dev/mapper/vga-lvtmp  ext4  167G  3.1G  156G   2% /tmp
    # du -sxkh /tmp
    11M     /tmp
- Now we run the above bind commands and find that there are files hiding under the /tmp/mnt directory used as a mount point. These files exist within the /tmp filesystem, but once another filesystem is mounted on that directory, those files become invisible to the du command's filesystem scanning functions.

    # mkdir /mnt_check                  # [1]
    # mount --bind /tmp /mnt_check      # [2]
    # du -sxkh /mnt_check               # [3]
    3.1G    /mnt_check
    # ls -lh /mnt_check/mnt             # [4]
    total 3.1G
    -rw-rw-r--. 1 user user 1.0G Nov 19 14:08 file1
    -rw-rw-r--. 1 user user 2.0G Nov 19 14:22 file2
    # umount /mnt_check                 # [5]
    # rmdir /mnt_check                  # [6]
- Notes on the above:
  - [1] Create a temporary directory onto which the main filesystem (the one that has another filesystem mounted on top of it) can be bind mounted. In this case, the main filesystem is /tmp, and another filesystem is mounted on /tmp/mnt, a directory within the /tmp filesystem. The temporary directory must be created on a different filesystem, in this case / (root).
  - [2] Bind mount (--bind) the main filesystem on the temporary directory created above. This mounts only the main filesystem of interest; any filesystems mounted on top of that main filesystem do not follow to the new mount point.
  - [3] Run du on the bind-mounted filesystem; the expected 3.1G now shows up (versus the starting point above of just 11M). This means that one or more mount point directories on the main filesystem, here /tmp, contain files that belong to the main (/tmp) filesystem and that become invisible to the du command once another filesystem is mounted over them.
  - [4] Check the /tmp/mnt directory, the place where another filesystem is mounted. The ls -lh command shows two pre-existing files that are stored in the /tmp filesystem, inside its /tmp/mnt directory. Once another filesystem is mounted over that directory, those two files cannot be seen by the du command but are still accounted for by the df command against the main /tmp filesystem. file1 and file2 need to be removed (moved or deleted) from the /tmp/mnt directory. This can be done through the bind mount point (/mnt_check/mnt in this example, which is equivalent to /tmp/mnt before another filesystem is mounted onto that directory).
  - [5] Unmount the bind mount after correcting the issue with files being present within the mount point.
  - [6] Remove the temporary directory used as a mount point.
- To correct the above problem, the two files file1 and file2 that are contained within the /tmp/mnt directory need to be moved or deleted, so that /tmp/mnt is empty before a filesystem is mounted on top of it. See step [4] above.
- Note that even after the files under the mount point are deleted, df will not exactly match du output. Each filesystem allocates internal structures of its own that do not show up within the visible files of the filesystem. Typically this filesystem-specific overhead makes df output 2-5% larger than du output. That is normal and expected. As an example, after deleting file1 and file2 above, df showed 98M in use, but du only 11M. That extra 87M is roughly 0.05% of the 167G filesystem, well below the threshold for concern.
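To judge whether a remaining df/du difference is within the normal overhead, the gap can be expressed as a share of the filesystem size. The helper below is a rough sketch and not part of the original article; the function name gap_report and its argument order are assumptions made for illustration, and all three values are expected in KiB.

```shell
#!/bin/sh
# gap_report USED_KB DU_KB SIZE_KB   (hypothetical helper)
# Prints the df/du gap in KiB and as a percentage of the filesystem size.
gap_report() {
    awk -v used="$1" -v du="$2" -v size="$3" 'BEGIN {
        gap = used - du
        printf "gap_kb=%d gap_pct=%.2f\n", gap, (gap / size) * 100
    }'
}

# Possible usage with live values for one mount point, run as root so that
# du can read every directory:
#   used_kb=$(df -Pk /tmp | awk 'NR==2 {print $3}')
#   size_kb=$(df -Pk /tmp | awk 'NR==2 {print $2}')
#   du_kb=$(du -skx /tmp | awk '{print $1}')
#   gap_report "$used_kb" "$du_kb" "$size_kb"
```

A gap of a few percent or less is usually filesystem overhead; a much larger gap points back to the open/deleted or mounted-over causes described above.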
Diagnostic Steps
Files in a directory that now has another file system mounted over it
To check and remedy this condition, please see How can I see what is consuming space underneath a mounted partition?.
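Before bind mounting, it can help to list which mount points sit inside the filesystem being investigated, since each of them is a directory that may be hiding files. The following is a minimal sketch only; the function name nested_mounts is hypothetical, and it assumes input in the /proc/mounts layout, where the second field is the mount point (paths containing spaces appear there with the space escaped as \040).

```shell
#!/bin/sh
# nested_mounts PARENT   (hypothetical helper)
# Reads /proc/mounts-style lines on stdin and prints every mount point
# that sits strictly below PARENT.
nested_mounts() {
    awk -v parent="$1" '
        BEGIN { prefix = (parent == "/") ? "/" : parent "/" }
        $2 != parent && index($2, prefix) == 1 { print $2 }
    '
}

# Possible usage:
#   nested_mounts /tmp < /proc/mounts
```

Each directory printed is a candidate to inspect through the bind mount, as /mnt_check/mnt was in the example above.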
Running processes holding open deleted files
Use the lsof command as follows:
# lsof | grep deleted
nmbd 16408 root cwd DIR 9,1 0 163846 /var/log/samba (deleted)
nmbd 16408 root 13w REG 9,1 924442067 163964 /var/log/samba/nmbd.log (deleted)
Note: To see every process that may be holding deleted files open, run the command as root.
The 7th column in the output tells you how big the file is in bytes. The 9th column tells you which file is being held open. The 1st column tells you which process is holding this file descriptor open. In the above example, the size of /var/log/samba/nmbd.log is 924442067 bytes, which is almost 1 GB. Since the nmbd service is a part of samba, running /sbin/service smb restart should fix the problem.
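When many deleted files are listed, adding up the 7th column by hand is tedious. The sketch below totals it with awk; the function name sum_deleted_bytes is hypothetical, and it assumes the nine-column lsof layout shown above. Some lsof versions print an extra TID column, and file names containing spaces also shift the fields, so treat the result as an estimate.

```shell
#!/bin/sh
# sum_deleted_bytes   (hypothetical helper)
# Reads lsof output on stdin and prints the total size in bytes of regular
# files whose name is flagged "(deleted)". Assumes SIZE/OFF is column 7
# and TYPE is column 5, as in the sample output above.
sum_deleted_bytes() {
    awk '$NF == "(deleted)" && $5 == "REG" { total += $7 }
         END { printf "%d\n", total }'
}

# Possible usage (as root):
#   lsof | sum_deleted_bytes
# lsof also accepts +L1, which restricts the listing to open files whose
# link count is below 1, that is, unlinked files.
```

Note that a file opened by several processes appears once per process, so the total can overstate the space that restarting the processes would actually free.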