Dear Maintainer,
The -l (--table-column-limit) option to the "column" utility does not
work correctly when the input contains runs of two or more consecutive
spaces. The option is supposed to set a maximum number of columns, with
the last column containing all remaining data on the line.
An example of how it is supposed to work can be seen when the input is
delimited by a single space. For example:
$ printf '1 2 3 4 5 6\nOne Two Three Four Five Six\n' \
| column -t -l5
1    2    3      4     5 6
One  Two  Three  Four  Five Six
Note how column 5 is the maximum column, so the data for column six is
simply appended. That is the correct behavior.
However, the problem is easy to trigger: pipe the output of column
back into itself. Because the first pass pads its columns with extra
spaces, the second pass receives multi-space input. This should be a
no-op, but instead it mangles the data:
$ printf '1 2 3 4 5 6\nOne Two Three Four Five Six\n' \
| column -t -l5 \
| column -t -l5
1    2    3      4     3
One  Two  Three  Four  ur
As you can see, the fifth column has been overwritten by data from
previous columns. (Perhaps a pointer problem?)
Any data with multiple spaces will trigger the bug. For example, the
output from 'ls -l':
$ ls -lh | column -t -l7
total 500K
drwxr-xr-x 2 ben ben 4.0K Jan an
-rwxr-xr-x 1 ben ben 2.7K Jul ul
drwxr-xr-x 5 ben ben 4.0K Dec ec
-rw-r--r-- 1 ben ben 116K Nov ov
-rw-r--r-- 1 ben ben 31K Nov Nov
drwxr-xr-x 2 ben ben 4.0K Mar ar
-rw-r--r-- 1 ben ben 225 Oct Oct
drwxr-xr-x 2 ben ben 12K Jan Jan
drwxr-xr-x 12 ben ben 260K Jan n
* * * * *
This may be irrelevant, but I noticed in the source that there is some
code which seems suspicious at lines 459 and 470:
   457      if (ctl->maxncols && n + 1 == ctl->maxncols) {
   458          if (nchars + skip < len)
-> 459              wcdata = wcs0 + (nchars + skip);
   460          else
   461              wcdata = NULL;
   462      } else {
   463          wcdata = local_wcstok(ctl, wcs, &sv);
   464
   465          /* For the default separator ('greedy' mode) it uses
   466           * strtok() and it skips leading white chars. In this
   467           * case we need to remember size of the ignored white
   468           * chars due to wcdata calculation in maxncols case */
   469          if (wcdata && ctl->greedy
-> 470              && n == 0 && nchars == 0 && wcdata > wcs)
   471              skip = wcdata - wcs;
   472      }
On line 459, pointer arithmetic is used to find the start of the last
column within the original string. However, the resulting offset is a
few characters short, perhaps because skip is always zero. In my
experiments, the test on lines 469-470 always failed, so `skip` was
never changed.
The reference to wide characters made me wonder if that was the issue,
but neither exporting LANG=C nor recompiling with HAVE_WIDECHAR=0 helped.