The df and du commands provide different system information and I cannot write to a partition that df says is 100% full.

The df and du commands provide different system information and I cannot write to a partition that df says is 100% full. Which is correct, and why does the system not allow any data writes to this partition?

 

Environment

  • Red Hat Enterprise Linux (RHEL)

Issue

  • Unable to find which files are consuming disk space.
  • Why is there a mismatch between df and du outputs?
  • The df and du commands provide different information and I cannot write to a partition that df says is 100% full. Which is correct, and why does the system not allow any data writes to this partition?

  • df says that the system is out of disk space on one of my partitions, but du shows plenty of space left. For example:

    # df -h
    Filesystem            Size  Used Avail Use% Mounted on
    /dev/md1              7.9G  7.8G     0 100% /var
    
    # du -sh /var
    50M     var
    

Resolution

There are multiple possible causes for a difference between df and du output:

  • [Mounted-over] - Files in a directory that now has another file system mounted over it. Please see the Resolution section of "How can I see what is consuming space underneath a mounted partition?".
  • [Open/deleted] - Running processes holding open deleted files. To free up the space, the processes holding open the deleted files must exit. Please see the Diagnostic Steps section for more information on locating such processes.
  • [Filesystem-state] - Filesystem corruption could lead to such a difference. If the filesystem in question is on an LVM logical volume and there is free space in its volume group, then creating a snapshot of the volume and running fsck against the snapshot is recommended to find out whether the filesystem is corrupted.
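
Before chasing any one of these causes, it can help to put a number on the mismatch. The following is a minimal sketch, not part of the original solution: `df_du_gap` is a hypothetical helper name, and it assumes GNU df and du as shipped with RHEL. It prints the "Used" figure from df, the du total for the same path, and the difference, all in KiB.

```bash
#!/bin/bash
# Sketch: compare df "Used" with the du total for one path (values in KiB).
# df_du_gap is a hypothetical helper, not a standard tool.
df_du_gap() {
    local path=$1 df_used du_used
    # -P keeps each filesystem on a single line; -k reports 1024-byte blocks.
    df_used=$(df -Pk -- "$path" | awk 'NR == 2 { print $3 }')
    # -s prints only the grand total; -x stays on one filesystem.
    du_used=$(du -sxk -- "$path" 2>/dev/null | awk '{ print $1 }')
    printf 'df_used_kb=%s du_kb=%s gap_kb=%s\n' \
        "$df_used" "$du_used" "$(( df_used - du_used ))"
}
```

Called as `df_du_gap /var` (as root, so that du can read every directory), a large positive gap_kb suggests one of the causes listed above. The result is only meaningful when the path is the mount point of the filesystem, because df always reports on the whole filesystem that contains the path.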

Root Cause

Running processes holding open deleted files [Open/deleted]

df may report that the partition is full if the inodes or disk space are consumed by a running process. This may happen when a service such as samba has filled up the log file. When logrotate runs nightly, it deletes the big log file but does not restart the service, so the space is still being reserved by the process. Restarting the process should release the handle on the now-deleted file, and the space should be freed.

  • [Open/Deleted] - Check lsof output for files listed as (deleted). These files have been deleted (or, more properly, unlinked) from the filesystem tree, but because one or more processes still have them open, the disk space they occupy cannot be reclaimed. The df command therefore still accounts for these open/deleted files, while du, which scans the filesystem tree, no longer sees them and does not account for them. For example, below we see four deleted files listed in lsof, but an attempt to list the files returns nothing because they have been removed/unlinked from the filesystem's directory structure.

    # lsof | grep COMMAND ; lsof | grep deleted
    COMMAND      PID     USER   FD      TYPE       DEVICE  SIZE/OFF       NODE NAME
    oracle     21869     1000   16w      REG        253,5   7476824     130802 /oracle/diag/EIPDB2_ping_21869.trc (deleted)
    oracle     21869     1000   17w      REG        253,5   8676828     130804 /oracle/diag/EIPDB2_ping_21869.trm (deleted)
    oracle     21927     1000   18w      REG        253,5   7375910     130983 /oracle/diag/EIPDB2_asmb_21927.trc (deleted)
    oracle     21927     1000   19w      REG        253,5   6748612     130658 /oracle/diag/EIPDB2_asmb_21927.trm (deleted)
    
    # ls -lh /oracle/diag/EIPDB2*.tr*
    #
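
The behaviour can be reproduced safely in a scratch directory. The sketch below is an illustration rather than part of the original solution; it assumes a Linux system with /proc mounted, and the file name demo.log and the use of file descriptor 9 are arbitrary choices.

```bash
#!/bin/bash
# Sketch: create a file, keep it open, unlink it, and inspect what remains.
workdir=$(mktemp -d)
exec 9>"$workdir/demo.log"      # open fd 9 for writing in this shell
printf 'some log data\n' >&9
rm "$workdir/demo.log"          # unlink the name while fd 9 is still open

# The name is gone from the directory tree, so ls and du cannot see it ...
if [ -e "$workdir/demo.log" ]; then visible=yes; else visible=no; fi
# ... but the kernel still tracks the open file and marks it "(deleted)".
link_target=$(readlink "/proc/$$/fd/9")

echo "visible in directory: $visible"
echo "fd 9 points to: $link_target"

exec 9>&-                       # closing the descriptor releases the space
rmdir "$workdir"
```

Only when the last descriptor is closed, which for a daemon usually means restarting it or having it reopen its log files, does the filesystem release the blocks.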
    

Files in a directory that now has another file system mounted over it [Mounted-over]

If a path given to df / du contains a secondary file system mounted on a directory of a primary file system, and that directory on the primary file system contains files which are now hidden, the output will be different. For example, suppose /opt/foo is a separate file system mounted on a directory within the /opt file system. Prior to mounting the secondary file system on /opt/foo, the primary file system contained a directory /opt/foo which had files in it. Once the secondary file system is mounted onto /opt/foo, the files from the primary file system inside the /opt/foo directory are no longer visible, but they still consume space.

  • [Mounted Over] - There may be files hidden under a mounted file system. For instance, if the directory /mnt/test contained large files, and then an NFS file system was mounted on /mnt/test, df would continue to account for the size of those files but du would not. If the NFS filesystem mounted on /mnt/test cannot be unmounted for inspection because that would cause a production outage, this can be verified by bind mounting the suspected filesystem on a different directory and inspecting its contents.

    $ mkdir /tmp/root_chk
    $ mount --bind / /tmp/root_chk
    $ du -h /tmp/root_chk/mnt/test
    
  • For example, starting with the following, we see that the df command reports /tmp is using 3.1G of allocated space but du is only reporting 11M:

    # df -TPh /tmp/* | sort -u -k 1 -r
    Filesystem            Type  Size  Used Avail Use% Mounted on
    /dev/sdc2             ext3  147G  188M  140G   1% /tmp/mnt
    /dev/mapper/vga-lvtmp ext4  167G  3.1G  156G   2% /tmp

    # du -sxkh /tmp
    11M /tmp

  • Now we run the above bind commands and find that there are files hiding under the /tmp/mnt directory used as a mount point. These files exist within the /tmp filesystem, but once another filesystem is mounted on that directory, those files become invisible to the du command's filesystem scanning functions.

    # mkdir /mnt_check                              # [1]
    # mount --bind /tmp /mnt_check                  # [2]
    # du -sxkh /mnt_check                           # [3]
    3.1G /mnt_check
    # ls -lh /mnt_check/mnt                         # [4]
    total 3.1G
    -rw-rw-r--. 1 user user 1.0G Nov 19 14:08 file1
    -rw-rw-r--. 1 user user 2.0G Nov 19 14:22 file2
    # umount /mnt_check                             # [5]
    # rmdir  /mnt_check                             # [6]

  • Notes on above:

    • [1] create a temporary directory onto which we can bind mount the main filesystem that another filesystem is mounted on top of. In this case, the main filesystem is /tmp and another filesystem is mounted on /tmp/mnt, a directory within the /tmp filesystem. The temporary directory must be created on a different filesystem, in this case / (root).
    • [2] --bind mount the main filesystem to the temporary directory created above. This will only mount the main filesystem of interest -- any filesystems mounted on top of that main filesystem won't follow to this new mount point.
    • [3] perform a du command on the new --bind mounted filesystem and we can see the expected 3.1G shows up (vs the starting point above of just 11M). This means that one or more mount point directories on the main filesystem, here /tmp, contain files that belong to the main (/tmp) filesystem and that become invisible to the du command once another filesystem is mounted over the top of the /tmp/mnt directory.
    • [4] check the /tmp/mnt directory; it is the place where another filesystem is mounted. The ls -lh command shows that there are two pre-existing files in the /tmp/mnt directory that belong to the /tmp filesystem. Once another filesystem is mounted over that directory, those two files cannot be seen by the du command but are still accounted for by the df command against the main /tmp filesystem. file1 and file2 need to be removed -- moved or deleted -- from the /tmp/mnt directory. These actions can be performed on the --bind mount point (/mnt_check/mnt in this example, which is equivalent to /tmp/mnt before another filesystem is mounted onto that directory).
    • [5] unmount the --bind filesystem after correcting the issue with files being present within the mount point.
    • [6] remove the temporary directory used as a mount point.
  • To correct the above problem, the two files -- file1 and file2 -- that are contained within the /tmp/mnt directory need to be moved or deleted so as to empty /tmp/mnt directory before a filesystem is mounted on top of the /tmp/mnt directory mount point. See step [4] above.

  • Note that even after the files under the mount point are deleted, df will not exactly match du output. Every filesystem allocates space for internal structures that do not show up as visible files within the filesystem. This overhead typically makes df output 2-5% larger than du output. That is normal and expected. As an example, after deleting file1 and file2 above, df showed 98M in use, but du only 11M. That extra 87M is roughly 0.05% of the 167G filesystem -- well below the threshold for concern.
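
On a filesystem with several nested mount points, the check in step [4] can be repeated for each one. The following is a minimal sketch under stated assumptions: `report_nonempty_dirs` is a hypothetical helper name, GNU find is assumed, and the directories passed in are the mount point directories as seen through the --bind mount (for example /mnt_check/mnt), where they should normally be empty.

```bash
#!/bin/bash
# Sketch: print every directory argument that contains at least one entry.
# report_nonempty_dirs is a hypothetical helper, not a standard tool.
report_nonempty_dirs() {
    local dir
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        # -quit stops at the first entry found, so large directories stay cheap.
        if [ -n "$(find "$dir" -mindepth 1 -maxdepth 1 -print -quit)" ]; then
            printf '%s\n' "$dir"
        fi
    done
}
```

With the example above still bind mounted, `report_nonempty_dirs /mnt_check/mnt` would print /mnt_check/mnt, because file1 and file2 sit underneath the mount point.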

Diagnostic Steps

Files in a directory that now has another file system mounted over it

To check and remedy this condition, please see "How can I see what is consuming space underneath a mounted partition?".

Running processes holding open deleted files

Use the lsof command as follows:

# lsof | grep deleted
nmbd      16408   root  cwd    DIR        9,1          0    163846 /var/log/samba (deleted)
nmbd      16408   root   13w   REG        9,1  924442067    163964 /var/log/samba/nmbd.log (deleted)

Note: To list all processes that may be holding deleted files open, run the command as root.

The 7th column in the output tells you how big the file is in bytes. The 9th column tells you which file is being held open. The 1st column tells which process is holding this file descriptor open. In the above example, the size of /var/log/samba/nmbd.log is 924442067 bytes, which is almost 1 GB. Since the nmbd service is a part of samba, running /sbin/service smb restart should fix the problem. 
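
When many deleted files are listed, adding up the 7th column by hand is tedious. The sketch below totals it with awk. `sum_deleted_bytes` is a hypothetical helper name, and it assumes the nine-column lsof layout shown above; some lsof versions print extra columns (for example a thread ID), which would shift the field numbers.

```bash
#!/bin/bash
# Sketch: total the SIZE/OFF column (field 7) of regular files marked "(deleted)".
# Reads lsof-style lines on standard input and prints the sum in bytes.
# sum_deleted_bytes is a hypothetical helper, not a standard tool.
sum_deleted_bytes() {
    awk '$NF == "(deleted)" && $5 == "REG" { total += $7 }
         END { printf "%.0f\n", total }'
}
```

A possible invocation, run as root, is `lsof -nP | sum_deleted_bytes`. A file held open by several processes appears once per process, so the total is an upper bound on the space that would be released rather than an exact figure.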
