#563118 du cannot sort its output without help from other programs

Package:
coreutils
Source:
coreutils
Description:
GNU core utilities
Submitter:
Juhapekka Tolvanen
Date:
2022-06-17 18:33:06 UTC
Severity:
wishlist
#563118#5
Date:
2009-12-30 23:35:59 UTC
From:
To:

I wanted to do this: show the size of each file and directory located
in $PWD in human-readable format AND sort the output by those sizes.
Formerly I did it like this:

du -s * | sort -n | awk '{print $2}' | xargs du -sh

But that breaks if a file name contains a space or linefeed. After
reading the manuals I came up with this:

du -s * | sort -n | cut -f 2- | tr '\n' '\0' | xargs -r0 -I "{}" du -sh "{}"

But it still breaks if a file name contains a linefeed.
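
The linefeed problem can be reproduced with a small sketch. It assumes
GNU du (for the -0 option) and uses a throwaway directory; the file
names are made up for illustration. A name containing a linefeed shows
up as an extra record in line-based output, while NUL-terminated output
yields exactly one terminator per file.

```shell
tmp=$(mktemp -d)
nl='
'
# Two files; the first name contains a linefeed.
touch "$tmp/one${nl}two" "$tmp/three"

# Newline-delimited output: the embedded linefeed looks like a third record.
lines=$(du -s -- "$tmp"/* | wc -l)

# NUL-delimited output (GNU du -0): one terminator per file.
records=$(du -s -0 -- "$tmp"/* | tr -cd '\0' | wc -c)

echo "lines=$lines records=$records"
rm -rf "$tmp"
```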

I wanted to be sure that I get what I want, whatever characters the
file names contain. Therefore I wrote this script:

http://iki.fi/juhtolv/hacks/sh/sortdu.sh.bz2

Some other guy created this kind of script:

http://inz.fi/sortdu2.txt

But it would be much easier if du had some sorting functionality.

#563118#10
Date:
2009-12-31 00:10:31 UTC
From:
To:
Juhapekka Tolvanen wrote:

Sorting on human sizes appeared in 7.5.  Try this:

  du -sh *  | sort -k 1h,1

Bob

#563118#15
Date:
2009-12-31 02:15:49 UTC
From:
To:
Hi Juhapekka,

Juhapekka Tolvanen wrote:
Just use du -sh * | sort -h .

Erik

#563118#25
Date:
2009-12-31 02:53:37 UTC
From:
To:
Well, sorting functionality should be contained within sort.
Since coreutils 7.5, sort has the -h option to directly
sort the output of du -sh.

cheers,
Pádraig.

#563118#35
Date:
2009-12-31 12:48:55 UTC
From:
To:
According to Juhapekka Tolvanen on 12/30/2009 4:35 PM:

What's wrong with:

du -sh0 * | sort -hz | tr '\0' '\n'

besides needing coreutils 7.5 or newer?

Rather, it IS much easier by using du's nul-termination functionality,
coupled with sort's nul-termination and human-size sorting.  Use each tool
for what it is good at.

#563118#45
Date:
2010-01-01 18:17:28 UTC
From:
To:
On Thu, 31 Dec 2009, +14:51:44 EET (UTC +0200),
Eric Blake <ebb9@byu.net> pressed some keys:

What if the size of one directory is rounded UP to 1M and the size of
another directory is rounded DOWN to 1M? How can the sort command know
which one is bigger? I am sure that sorting must be done by size in
bytes, not by size in human-readable units.
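
The rounding concern can be illustrated with a small sketch. The human
helper below is hypothetical and rounds to the nearest tenth of a MiB,
which is not du's exact rounding rule, but it shows how two different
byte counts can collapse into the same human-readable string while a
numeric sort on the byte counts still tells them apart.

```shell
# Hypothetical helper: format a byte count as MiB with one decimal place.
human() {
    LC_ALL=C awk -v b="$1" 'BEGIN { printf "%.1fM\n", b / 1048576 }'
}

a=1048576   # exactly 1 MiB
b=1100000   # a little over 1 MiB

# Both values are displayed identically after rounding.
human "$a"
human "$b"

# Sorting on the raw byte counts still orders them correctly.
printf '%s\n' "$b" "$a" | sort -n
```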

#563118#50
Date:
2010-01-07 23:14:45 UTC
From:
To:
Juhapekka Tolvanen wrote:

Okay. What is wrong with:

du -B 1 -s -0 * \
   | sort -nz \
   | awk -v RS='\0' '{print $2}' \
   | xargs -d '\0' du -sh

?

Incidentally, instead of re-executing 'du' via xargs, I do this:

awk -v RS='\0' '{
   if ( $1 > 0x3F000000 )
     printf "%6.1fG\t", $1 / 0x40000000 ;
   else if ( $1 > 0xF0000 )
     printf "%6.1fM\t", $1 / 0x100000 ;
   else if ( $1 > 0x300 )
     printf "%6.1fK\t", $1 / 0x400 ;
   else
     printf "%6i\t", $1 ;
   $1="" ;
   print $0 ;
}'

...which has the advantage of being able to cope with a 'du -c' line
(not to mention better alignment and less disk I/O).

#563118#55
Date:
2010-01-08 04:58:58 UTC
From:
To:
% ls
000misc/                  Front242/            Ministry/         VNV_Nation/
Aavikko/                  Frontline_Assembly/  Nitzer_Ebb/       Wendy_Carlos/
Apoptygma_Berzerk/        Kemopetrol/          Organ/            YMO/
Art_of_Noise/             KMFDM/               Rammstein/
Coil/                     Kraftwerk/           Revolting_Cocks/
Einstuerzende_Neubauten/  Laibach/             Type_O_Negative/

% du -B 1 -s -0 * | sort -nz | awk -v RS='\0' '{print $2}' | xargs -d '\0' du -sh
du: tiedostoa
”Coil\nLaibach\nWendy_Carlos\nAavikko\nRevolting_Cocks\nFrontline_Assembly\nNitzer_Ebb\nArt_of_Noise\nEinstuerzende_Neubauten\nFront242\nKemopetrol\nKMFDM\nVNV_Nation\nRammstein\nType_O_Negative\nMinistry\nApoptygma_Berzerk\nOrgan\nYMO\n000misc\nKraftwerk\n”
ei voi käsitellä: Tiedostoa tai hakemistoa ei ole
[1]    21748 done       du -B 1 -s -0 * |
       21749 done       sort -nz |
       21750 done       awk -v RS='\0' '{print $2}' |
       21751 exit 123   xargs -d '\0' du -sh

In English it says "du: cannot access (very long text): No such file
or directory."

% whence -savc awk
/usr/bin/awk -> /usr/bin/gawk
/opt/heirloom/5bin/posix2001/awk -> /opt/heirloom/5bin/posix2001/nawk
/opt/heirloom/5bin/awk -> /opt/heirloom/5bin/oawk
/opt/heirloom/5bin/posix/awk -> /opt/heirloom/5bin/posix/nawk
/opt/heirloom/5bin/s42/awk -> /opt/heirloom/5bin/posix/nawk
/opt/plan9/bin/awk


Anyway, this WorksForMe™:

http://iki.fi/juhtolv/hacks/sh/sortdu.sh.bz2

How is that awk snippet actually used? Can you provide it as a
shell script file?

#563118#60
Date:
2010-01-08 17:28:27 UTC
From:
To:
Juhapekka Tolvanen wrote:

Meh, guess I should try it. Should be:

du -B 1 -s -0 * \
     | sort -nz \
     | awk -v RS='\0' '{printf "%s\0", $2}' \
     | xargs -0 du -sh

But that doesn't work when files have spaces in them. (I took the '$2'
from your original script, btw.)

You'd do better to replace the bits after 'sort' with:

awk -v RS='\0' '{
    if ( $1 > 0x3F000000 )
      printf "%6.1fG\t", $1 / 0x40000000 ;
    else if ( $1 > 0xF0000 )
      printf "%6.1fM\t", $1 / 0x100000 ;
    else if ( $1 > 0x300 )
      printf "%6.1fK\t", $1 / 0x400 ;
    else
      printf "%6i\t", $1 ;
    $1="" ;
    print $0 ;
}'

This should output the file names without any sort of mangling (I
think), but even if it does mangle them, it's just output at this stage;
you aren't trying to re-stat the files, so you won't get errors even if
mangling occurs.

You pipe 'du's output into it, i.e. 'du <args> | sort -nz | <snippet>'.

The attached script acts like 'du -h' with sorted output; some arguments
might mess it up but generally you can pass other 'du' arguments to it
(and of course file names).
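
For the xargs variant, the problem with spaces comes from awk's
whitespace field splitting. A minimal sketch of one way around it,
assuming GNU du, sort, cut and xargs (cut -z in particular needs a
newer coreutils than the 7.5 mentioned in this thread): cut on the tab
that du puts between size and name, so the rest of the record is kept
verbatim. The function name sortdu_sketch is made up for illustration
and is not the attached script.

```shell
# Hypothetical helper: list the given paths smallest-first in du -sh format.
sortdu_sketch() {
    du -B 1 -s -0 -- "$@" |     # exact byte counts, NUL-terminated records
        sort -nz |              # numeric sort on the leading byte count
        cut -z -f 2- |          # drop the size; keep the name, spaces included
        xargs -0 -r du -sh --   # re-run du for display, in sorted order
}
```

It would be called as 'sortdu_sketch *'. Unlike the awk formatter
above, this re-runs du on every file, so it costs extra disk I/O.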

#563118#65
Date:
2010-02-26 06:18:07 UTC
From:
To:
See bug #571575 . It is evident that in order to solve that bug some
new utility is needed, and such utilities belong in a package called
moreutils. Let's call that utility "sortdu". At least it is a good
working title.

#563118#70
Date:
2010-05-11 19:58:39 UTC
From:
To:
Juhapekka Tolvanen wrote:

I'm rejecting the idea of adding a sortdu to moreutils. The point of
moreutils is to gather new tools in the spirit of the original unix
tools. Writing a new program because sort and du -h don't mix w/o loss
of accuracy is not in that spirit.

There's nothing really wrong with the shell pipelines shown in #563118,
aside from them being a bit complex. That complexity could be reduced if
du had a way to format its output like this:

412 412K .

Then the pipeline becomes just

du --format="%i %h %f" | sort -n | awk '{print $2, $3}'
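
Note that --format above is a proposed option, not something du
provides. A similar sort-first, format-afterwards flow can be sketched
with numfmt, assuming a GNU coreutils release that ships it (it did not
exist in the 7.5 release discussed earlier). The function name
sorted_human and the sample sizes are made up for illustration; with
real data the input would come from something like 'du -B 1 -s *'.

```shell
# Hypothetical helper: sort "bytes<TAB>name" lines by exact size, then
# convert only the first field to a human-readable size.
sorted_human() {
    sort -n | numfmt --to=iec --field=1
}

# Sample input standing in for the output of: du -B 1 -s *
printf '1100000\tb\n1048576\ta\n2048\tc\n' | sorted_human
```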

#563118#80
Date:
2022-05-16 15:12:31 UTC
From:
To:

#563118#85
Date:
2022-06-17 17:26:09 UTC
From:
To: