i have a filesystem exported from a Solaris host via NFSv3 to a Linux client. according to df -k on both sides, this filesystem has 2290522928 KB space used:

    Filesystem      kbytes       used      avail capacity  Mounted on
    ift         5119991744 2290522928 2829468816      45%  /aux0

Solaris df -h shows this as 2.1TB:

    Filesystem  size  used  avail capacity  Mounted on
    ift         4.8T  2.1T   2.6T      45%  /aux0

however, Linux df -h shows this as 2.2TB:

    Filesystem                   Size  Used Avail Use% Mounted on
    clematis:/aux0/hemlock-home  4.8T  2.2T  2.7T  45% /home

(notice available is different too.)

the Linux (coreutils df) output is incorrect. when rounded using base-2 multiples, 2290522928 KB is 2.1TB. when rounded using base-10 multiples (which wouldn't make much sense anyway), the output is still wrong, because it would be 2.3TB then.
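The arithmetic behind the two displays can be checked directly. The sketch below is only an illustration: it assumes the df -k figures are in 1024-byte units and that the two tools differ only in how they round to one decimal place (to nearest versus upward), which the report itself does not confirm. The helper name `to_tib_tenths` and its mode names are made up for this example.

```python
KIB_PER_TIB = 1024 ** 3  # 1 TiB = 1024^3 KiB


def to_tib_tenths(kib, mode):
    """Scale a KiB count to TiB with one decimal place, using integer math.

    mode is "floor", "nearest" or "ceil" (hypothetical names for this sketch).
    """
    scaled = kib * 10  # work in tenths of a TiB
    if mode == "floor":
        tenths = scaled // KIB_PER_TIB
    elif mode == "nearest":
        tenths = (2 * scaled + KIB_PER_TIB) // (2 * KIB_PER_TIB)
    elif mode == "ceil":
        tenths = -(-scaled // KIB_PER_TIB)
    else:
        raise ValueError(mode)
    return tenths / 10


# Figures taken from the df -k output in the report.
for label, kib in [("size", 5119991744), ("used", 2290522928), ("avail", 2829468816)]:
    print(label, [to_tib_tenths(kib, m) for m in ("floor", "nearest", "ceil")])
```

Under these assumptions, rounding to nearest gives the Solaris column (4.8, 2.1, 2.6) and rounding upward gives the Linux column (4.8, 2.2, 2.7).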
I am seeing incorrect %use when displaying data from a 500GB USB external drive. Example output:

    /dev/sde1  480040596 310726424 144929512  69% /media/wdp7

Precise calc. (on HP11C) is Use% = 68.193%, which should not round upward.

I am doing a long rm -rf to clean out approx. 200GB of old files. While this is going on in a background job, I run

    # while sleep 5; do echo $(date) $(df .)|cut -d ' ' -f 1,10-; done

in the foreground. When Use% dropped to 67.998%, the df display changed to 68%. This, I think, demonstrates that the problem is in the actual calc., not in the use of human-friendly number display.

HTH

PS: I have several 500GB USB drives and often defer deleting old files until a time when the delete does not compete for cycles with useful work. It's OK to ask for test runs of new code.

Wish list item: Enable correct computation of Use% that is over 100% (because of the 5% safety buffer that is built in somewhere).
Paul E Condon wrote:

Thank you for the report. But I think this is not a bug in df but is instead a misunderstanding of how it operates. Please see this FAQ entry:

    http://www.gnu.org/software/coreutils/faq/#df-Size-and-Used-and-Available-do-not-add-up

Is that the issue you are seeing?

In any case, df simply passes along the values reported by the kernel in the statfs call. Therefore any actual calculation problems will be root caused in the kernel and not in the df program. To see the values that the kernel is returning to df's statfs call please run the following command and report the contents of the file.

    $ strace -v -e trace=statfs -o /tmp/df.strace.out df /dev/sde1

Bob
No and yes. I am aware of the fact that the %use denominator is the sum of Used (U) and Available (A), and that U+A is 0.95 * (1k-blocks). My 'precise' calc. is 100*U/(U+A). The output transitions to a new, lower value as the 'precise value' transitions from 68.007 to 67.995, which I think is strong evidence that the code is ignoring the (1k-blocks) number and only using U and A.

I think the kernel calc is being done wrong, but the correct calc. can surely be done in user space. Much as the kernel reports utterly spurious precision of modification times (down to 1 nanosec), which is ignored by the coreutils by the simple expedient of truncation.

Of course this is a MINOR bug. I think coreutils should give the user a self-consistent view of what the situation is. I have no idea what the actual U and A values are. They may be garbage also, in which case I'm asking for self-consistent garbage in preference to manifestly false garbage.

I rather like the idea of having a 5% safety allowance, and having %use report 100% when there is still 25GB available on a 500GB disk. That is explained somewhere and is easy to understand and appreciate. But rm is SLOW on these big disks. I've been watching the progress of rm more often than I'd like, and I noticed that my mental extrapolations of when the process would be done weren't giving the correct answer, and it was because of this bug, so I report.

My suspicion is that the U and A values that are reported by the kernel are pretty honest data. To get them wrong would require extra code, and extra code deliberately introduced in order to make a dishonest report is pretty unbelievable. Maybe on Wall Street, but not in the Linux kernel.

I don't have strace installed on the computer where this is happening. I attempted to install it, but the computer crashed while running aptitude. I close now and go to recovering from the crash. But I don't expect that df is fudging the numbers that it gets from the kernel.
I DO suspect that the % calc is incorrectly done in the kernel, but on learning that the calc is done in the kernel, I think that is itself a minor bug. There are many uses for the kernel in embedded systems where %use is never needed. Getting it out of the kernel could save a few dozen bytes, perhaps. Cheers,
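The two relationships described in this message can be checked against the figures from the original report. This is a small sketch, not df's code, and the function names are invented for illustration.

```python
def precise_use_percent(used, available):
    """Unrounded 100*U/(U+A), the 'precise' value described in the message."""
    return 100.0 * used / (used + available)


def nonroot_fraction(total_blocks, used, available):
    """Fraction of the 1K-block total that is covered by U+A."""
    return (used + available) / total_blocks


# Figures from the df line in the original report (1K blocks).
TOTAL, USED, AVAIL = 480040596, 310726424, 144929512

print(f"Use% = {precise_use_percent(USED, AVAIL):.3f}")
print(f"(U+A)/total = {nonroot_fraction(TOTAL, USED, AVAIL):.4f}")
```

For these figures U+A comes to a little under 95% of the 1K-block total, and the unrounded percentage is about 68.19.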
Paul E Condon wrote:

I have been noticing that the ext4 w/ fsync fiasco is making everything very much slower while saying that it is trying to make things faster. The irony is tragic. I don't know if that is what you are suffering from, but it is possible.

Sometimes it is a bug. Sometimes it is not. Thank you for the report just the same.

In any case I apologize for not spending the time to completely understand your report before sending my reply. So often people don't take minfree into consideration, and so I pointed to the FAQ on the topic. While the other numbers are just reported from statfs, the use percentage is calculated. Sorry for getting ahead of myself.

I agree with your analysis that used / (used + available) in your case is 310726424 / ( 310726424 + 144929512 ) = 0.68193, as you reported, which is not equal to the 69% that the tool emitted.

I looked at the code and if I am following the correct code path then it is basically doing the following:

    total             = f_blocks
    available_to_root = f_bfree
    available         = f_bavail
    used              = total - available_to_root
    nonroot_total     = used + available
    u100              = used * 100
    pct               = u100 / nonroot_total + (u100 % nonroot_total != 0)

Knowing the values returned from the statfs system call would fill in the values for f_blocks, f_bfree, and f_bavail and should allow us to know how this calculation is processed.

Again, my apologies for not fully understanding the nature of your bug report at that time.

Instead of running aptitude (which, going by your description, makes me think it ran out of memory and got the OOM killer involved) you could copy the strace deb over and then install it directly with dpkg -i, which would use much less memory and very likely succeed where aptitude failed. You could even help aptitude along with aptitude download strace and then dpkg -i strace*.deb at that point. Just ideas for you.

Alternatively it would be relatively easy to put together a very small C program that printed the results of the statfs call directly.
Or perhaps print it from Perl's syscall interface. Please let me know if you have too much trouble getting strace installed and I will suggest something.

Bob
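The calculation outlined in the message above amounts to an integer ceiling of 100*used/(used+available). The sketch below reproduces that step and, along the lines of the small program suggested there, reads the raw counts with Python's os.statvfs rather than C or Perl. It is an illustration under assumptions, not df's actual source: `ceil_percent` and `raw_counts` are names made up here, and taking used as f_blocks - f_bfree is this sketch's reading of how the Used column is derived.

```python
import os


def ceil_percent(used, available):
    """Integer percentage rounded up, following the outlined calculation."""
    nonroot_total = used + available
    if nonroot_total == 0:
        return None  # nothing meaningful to report
    u100 = used * 100
    # Integer quotient, plus one whenever there is any remainder.
    return u100 // nonroot_total + (u100 % nonroot_total != 0)


def raw_counts(path):
    """Return (f_blocks, f_bfree, f_bavail) as reported by statvfs for path."""
    st = os.statvfs(path)
    return st.f_blocks, st.f_bfree, st.f_bavail


if __name__ == "__main__":
    # Figures from the original report; the exact ratio is about 68.19%.
    print(ceil_percent(310726424, 144929512))

    # Live values for the current directory (POSIX only).
    blocks, bfree, bavail = raw_counts(".")
    print(blocks, bfree, bavail, ceil_percent(blocks - bfree, bavail))
```

With the reported figures this yields 69, and any exact value even slightly above 68 also comes out as 69. That is consistent with the observation that the display changed to 68% only once the exact value fell below 68.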
No problem about delayed understanding. And the bug really is minor. I've been puzzling about it for a LOOOONG time while waiting for rm to complete. Finally did enough careful observation to convince myself that it was real.

Actually, I don't use ext4. I'm still using ext3. A few months ago I thought I had a problem with ext3, but the symptoms disappeared while I was trying to document it, about the time I threw away a bad disk. My guess is that that disk was corrupting something that made other disks also appear to be bad. But I didn't dig it out of the trash to pursue that theory.

Before I got this email I had already done a clean install of Squeeze, which seems to have gotten the box working again. It was strange. Investigation done before the reinstall indicated that the system clock had stopped two days earlier. Things are working much better now.

So back to the minor bug: My thinking about strace is that it might be overkill. At some point people operating in user space (me, in particular, but perhaps you, also) need to trust data returned by the kernel to a system call. Here we have two kinds of data: disk size, U, and A are real data. But %use is the result of a trivial calculation that uses some of these real data as input, and where the result has no effect on the proper functioning of the kernel. I suspect that the trivial calculation in the kernel has a silly bug. That it is done wrong could easily go unnoticed by kernel developers. Such calculations should not be done by the kernel. They belong close to the formatting code that introduces the '%' character into the output stream, IMHO.

So, I propose that you ignore the %use number given by the kernel, and replace it with a calculated value that is consistent with the other numbers on the line as the line is being formatted. The problem is more cosmetic than real. Three numbers in a row that purport to be related by simple calculation, but are not, is --- ugly.
There is already a situation where the data returned by the kernel is ignored by the coreutils code: the kernel in recent years has started returning nine orders of magnitude of sub-second precision that had not been in the last-modified time before. At least five or six of those OoM are utterly spurious, and all of them are lost when the file is written to disk, so the coreutils code ignores them all.

Also, /tmp/df.strace.out is empty after running your suggested diagnostic. So is the above discussion an instance of sour grapes reasoning?
Paul E Condon wrote:
...
...
kernel differences? The above works fine for me with unstable's
2.6.32-5-amd64 and self-compiled df from coreutils-8.7.x:
This variant might be more useful:
$ strace -v -e file -o /tmp/df.strace.out df /dev/sde1
If it too leaves the output file empty, try this:
$ strace -o /tmp/df.strace.out df /dev/sde1
1> I don't use ext4. How can that be my problem?
2> I DO use ssh to get into the host on which I am seeing the problem. I could try on one of my three other Squeeze boxes, but none of them have a /dev/sde1, so someone might think I'm fudging the data.

The originally suggested diagnostic seems not to conform to the man page's use of -e, but I've never used strace before and may misunderstand. To run it I just copy and paste between xterms in the Gnome GUI.

The simplified string,

    strace -o /tmp/df.strace.out df /dev/sde1

gives output that seems garbled when viewed using less and/or cat. I will not post it for fear of breakage.

Did you read my last letter? What do you think of the idea of just doing the divide and round inside df?

Your comment about a certain kernel developer confirms my impression that raising issues as to the correctness of kernel code can be VERY counter-productive. Better to ignore output that one doesn't find useful, and produce, from more basic data, output that is more to one's liking. As I mentioned, this is already being done with certain time-stamps by coreutils developers.

Cheers,
Paul E Condon wrote: It is a severe breach of etiquette to forward private email without permission to a public mailing list. Bob