Comment 2 for bug 868747

Jim Meyering (meyering) wrote : Re: [Bug 868747] Re: fmt -f <unlimited> should be supported

jimav wrote:
> Strictly speaking this is an enhancement request.
>
> fmt imposes an artificial limit on the maximum output line length
> controlled by the -f option, which prevents using this tool to "join"

You meant -w, not -f, throughout.

Thanks for the suggestion. Note that the code has this:

    /* Size of paragraph buffer, in words and characters. Longer paragraphs
       are handled neatly (cf. flush_paragraph()), so long as these values
       are considerably greater than required by the width. These values
       cannot be extended indefinitely: doing so would run into size limits
       and/or cause more overflows in cost calculations. FIXME: Remove these
       arbitrary limits. */

    #define MAXWORDS 1000
    #define MAXCHARS 5000

where MAXCHARS/2 specifies the largest width.
I.e., fmt -w 2500 works, but not 2501.
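
To spell out that arithmetic, here is a tiny Python model of the limit. It is only a sketch: the name width_is_accepted is hypothetical, and the real check inside fmt may be written quite differently. The one thing taken from the source is MAXCHARS and the MAXCHARS/2 relationship described above.

```python
# Value copied from the #define quoted above.
MAXCHARS = 5000

# Assumption for illustration: the largest usable width is MAXCHARS / 2.
MAX_WIDTH = MAXCHARS // 2


def width_is_accepted(width):
    """Hypothetical helper: True if `width` is within the modeled limit."""
    return width <= MAX_WIDTH


print(MAX_WIDTH)                # 2500
print(width_is_accepted(2500))  # True
print(width_is_accepted(2501))  # False
```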

We agree that there should not be such a limit.
But the internals of fmt are not pretty, significantly less
so than most other parts of coreutils, and as the comment says,
we cannot easily raise those limits arbitrarily.

In the meantime, what can you do if you want truly unlimited-length
paragraphs? It's not trivial since you want to retain paragraph delimiters.
This perl command should do the trick.
It processes your input a paragraph at a time, replacing each newline
(and spaces before/after) with a single space:

    perl -00ple 's/\s*\n\s*/ /g'

E.g., given this input,

1
2
3
4

1
2
3
4
5

it prints this:

    $ (seq 4; echo; seq 5) | perl -00ple 's/\s*\n\s*/ /g'
    1 2 3 4

    1 2 3 4 5

It doesn't preserve indentation, but if you're just going to
paste it into LibreOffice, that should be fine.
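
If perl is not at hand, the same idea can be sketched in Python. This is a rough equivalent, not a byte-for-byte reimplementation of the one-liner: the function name join_paragraphs is made up for illustration, it assumes paragraphs are separated by empty lines, and it ends the output with a single newline.

```python
import re


def join_paragraphs(text):
    """Join the lines of each paragraph into one line.

    Paragraphs are assumed to be separated by one or more empty lines;
    a single blank line is kept between them in the result.
    """
    if not text.strip():
        return ""
    # Drop leading/trailing newlines, then split on runs of 2+ newlines.
    paragraphs = re.split(r"\n{2,}", text.strip("\n"))
    # Within a paragraph, replace each newline (and any whitespace
    # around it) with a single space.
    joined = [re.sub(r"\s*\n\s*", " ", p) for p in paragraphs]
    return "\n\n".join(joined) + "\n"


# Same input as the "(seq 4; echo; seq 5)" example above.
sample = "1\n2\n3\n4\n\n1\n2\n3\n4\n5\n"
print(join_paragraphs(sample), end="")
# prints:
# 1 2 3 4
#
# 1 2 3 4 5
```

Like the perl command, this leaves any indentation on the first line of a paragraph alone and collapses it on the continuation lines.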

I've Cc'd the upstream bug-tracker, so we'll have a bug number there, too.