A rite of passage for a Unix admin is to be deceived by the ‘du’ and ‘df’ commands. Every single one of us has cleared space in a full file system only to run ‘df’ and be greeted by that ever-present 100%. What gives?
The answer lies in the way that each command gathers disk allocation information.
‘du’ is a user-level program that walks the file system tree and adds up the blocks allocated to files, directories and links as reported by stat(). It can only count what it can reach through the directory tree, so it doesn’t see file system metadata or anything that is no longer linked there. ‘df’ reads the file system’s own disk allocation maps, so it accounts for everything that is allocated.
In every case, ‘df’ is the more reliable indicator of file system allocation status. Here is a real example of a common reason why:
# df -g
..
/dev/logicalv 72.00 0.00 100% 13579 96% /filesystem
..
# du -s /filesystem
32671176 /filesystem
# fuser -d /filesystem
/filesystem 123456
Process 123456 has at least one file open in /filesystem that has since been deleted.
Notice that ‘df’ reports the full 72G as used, while ‘du’ reports only 32G. The application had multiple files open when they were deleted from the file system. Because a process still holds them open, their space has not been freed, and ‘df’ correctly reports it as used. ‘du’, however, only sees files that are still present in the directory tree and doesn’t see metadata, so it leaves that space out of its total.
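The effect is easy to reproduce in a scratch directory. What follows is only a minimal sketch, assuming a POSIX shell with `mktemp -d` and `/dev/zero` available; `demo_dir` and `big.dat` are made-up names for illustration. It holds a file open on a descriptor, deletes it, and compares what ‘du’ sees before and after.

```sh
#!/bin/sh
# Sketch: reproduce the du/df gap with a deleted-but-open file.
# demo_dir and big.dat are hypothetical names used only for this demo.
demo_dir=$(mktemp -d)
dd if=/dev/zero of="$demo_dir/big.dat" bs=1024 count=2048 2>/dev/null

# du walks the tree, so the 2 MB file is counted here
du_before=$(du -sk "$demo_dir" | awk '{print $1}')

# Hold the file open on descriptor 3, then unlink it
exec 3<"$demo_dir/big.dat"
rm "$demo_dir/big.dat"

# The file is gone from the tree, so du no longer counts it,
# even though its blocks stay allocated while descriptor 3 is open
du_after=$(du -sk "$demo_dir" | awk '{print $1}')

echo "du before delete: ${du_before} KB"
echo "du after delete:  ${du_after} KB"

# Closing the descriptor is what finally releases the blocks
exec 3<&-
rmdir "$demo_dir"
```

Between the `rm` and the final `exec 3<&-`, ‘df’ on the same file system would still count those blocks as used, which is exactly the gap shown in the case above.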
This also ties into making sure files are not currently open before deleting them during cleanup. This can be done with fuser and is explained in more detail in this fuser post.
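As a rough illustration of that check, here is a small sketch of a wrapper that refuses to delete a file while any process has it open. `safe_rm` is a hypothetical helper name, not a standard command, and the sketch assumes a `fuser` that writes process IDs to standard output and everything else to standard error.

```sh
#!/bin/sh
# Sketch: delete a file only if no process currently has it open.
# safe_rm is a hypothetical helper name, not a standard command.
safe_rm() {
    file=$1

    # Without fuser we cannot check, so refuse rather than guess
    command -v fuser >/dev/null 2>&1 || return 2

    # fuser writes PIDs to stdout; file names and diagnostics go to stderr
    pids=$(fuser "$file" 2>/dev/null)

    if [ -n "$pids" ]; then
        echo "skipping $file, still open by: $pids" >&2
        return 1
    fi

    rm -- "$file"
}
```

A call would look like `safe_rm /filesystem/old.log`, where the path is only an example. The check and the delete are not atomic, so a process could still open the file in between; treat this as a safeguard for routine cleanup rather than a guarantee.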
‘du’ is perfectly adequate to quickly find the larger files in a file system for cleanup, but, when it comes to accurately reporting the allocation status of a file system, stick with ‘df’.
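For that kind of cleanup hunt, something along these lines may be enough. It is a sketch rather than a definitive tool: `largest` is a hypothetical function name, and it relies only on standard `find`, `du`, `sort` and `head` options.

```sh
#!/bin/sh
# Sketch: list the N largest regular files under a directory, sizes in KB.
# largest is a hypothetical helper name used for illustration.
largest() {
    dir=$1
    count=${2:-10}

    # -xdev keeps find on one file system, matching what df reports on;
    # du -k prints "size-in-KB<TAB>path" for each file found
    find "$dir" -xdev -type f -exec du -k {} + |
        sort -rn |
        head -n "$count"
}
```

Running `largest /filesystem 20` would print the twenty biggest files, largest first. Once the candidates are removed, confirm the result with ‘df’ rather than ‘du’.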