Comment 3 for bug 868747

jimav (james-avera) wrote : Re: [Bug 868747] Re: fmt -f <unlimited> should be supported

Hi,
Thanks for the substantive reply, and the Perl -00 trick.
-Jim

>________________________________
>From: Jim Meyering <email address hidden>
>To: <email address hidden>
>Sent: Wednesday, October 5, 2011 11:39 PM
>Subject: Re: [Bug 868747] Re: fmt -f <unlimited> should be supported
>
>jimav wrote:
>>  Strictly speaking this is an enhancement request.
>>
>>  fmt imposes an artificial limit on the maximum output line length
>>  controlled by the -f option, which prevents using this tool to "join"
>
>You meant -w, not -f, throughout.
>
>Thanks for the suggestion.  Note that the code has this:
>
>    /* Size of paragraph buffer, in words and characters.  Longer paragraphs
>      are handled neatly (cf. flush_paragraph()), so long as these values
>      are considerably greater than required by the width.  These values
>      cannot be extended indefinitely: doing so would run into size limits
>      and/or cause more overflows in cost calculations.  FIXME: Remove these
>      arbitrary limits.  */
>
>    #define MAXWORDS    1000
>    #define MAXCHARS    5000
>
>where MAXCHARS/2 specifies the largest width.
>I.e., fmt -w 2500 works, but not 2501.
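The width limit described above can be sketched in C roughly as follows. This is a minimal illustration, not the actual fmt source: `width_is_valid` is a hypothetical name, and only the `MAXCHARS` value and the `MAXCHARS/2` bound are taken from the quoted message.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAXCHARS 5000   /* value quoted from the fmt source above */

/* Hypothetical helper (not from fmt): returns true when ARG is a
   decimal number no larger than MAXCHARS / 2.  */
bool width_is_valid(const char *arg)
{
    char *end;

    errno = 0;
    unsigned long width = strtoul(arg, &end, 10);

    /* Reject overflow, empty input, and trailing garbage.  */
    if (errno != 0 || end == arg || *end != '\0')
        return false;

    return width <= (unsigned long) (MAXCHARS / 2);
}
```

Under these assumptions, `width_is_valid("2500")` is true, while `width_is_valid("2501")` and `width_is_valid("9999")` are false, which matches the "invalid width" error in the report.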
>
>We agree that there should not be such a limit.
>But the internals of fmt are not pretty (significantly less so than
>most other parts of coreutils), and as the comment says, we cannot
>simply raise those limits arbitrarily.
>
>In the meantime, what can you do if you want truly unlimited-length
>paragraphs?  It's not trivial, since you want to retain paragraph delimiters.
>This Perl command should do the trick.
>It processes your input a paragraph at a time, replacing each newline
>(and spaces before/after) with a single space:
>
>    perl -00ple 's/\s*\n\s*/ /g'
>
>E.g., given this input,
>
>1
>2
>3
>4
>
>1
>2
>3
>4
>5
>
>It prints this:
>
>    $ (seq 4; echo; seq 5) | perl -00ple 's/\s*\n\s*/ /g'
>    1 2 3 4
>
>    1 2 3 4 5
>
>It doesn't preserve indentation, but if you're just going to
>paste the result into LibreOffice, that should be fine.
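For readers who want the same idea without Perl, the paragraph-joining step can be sketched in C as below. This is an approximation under stated assumptions, not a byte-for-byte equivalent of the one-liner: `join_paragraphs` and `is_blank` are hypothetical names, whitespace-only lines are treated as paragraph breaks, leading and trailing blanks are stripped from every line, and the output ends with a single newline.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: true for the blanks trimmed from line ends.  */
static bool is_blank(char c)
{
    return c == ' ' || c == '\t' || c == '\r';
}

/* Hypothetical helper: joins the lines of each paragraph in IN with a
   single space and separates paragraphs with one empty line.  Returns
   a newly allocated string (caller frees), or NULL if malloc fails.  */
char *join_paragraphs(const char *in)
{
    /* The output is at most one byte longer than the input (the final
       newline), plus one byte for the terminating NUL.  */
    char *out = malloc(strlen(in) + 2);
    if (out == NULL)
        return NULL;

    size_t n = 0;               /* bytes written so far */
    bool in_paragraph = false;  /* true while inside a paragraph */

    while (*in != '\0') {
        /* Locate the end of the current line and the start of the next.  */
        const char *eol = strchr(in, '\n');
        const char *next = eol ? eol + 1 : in + strlen(in);
        const char *end = eol ? eol : next;

        /* Trim blanks at both ends of the line.  */
        while (in < end && is_blank(*in))
            in++;
        while (end > in && is_blank(end[-1]))
            end--;

        if (in == end) {
            in_paragraph = false;       /* empty line ends the paragraph */
        } else {
            if (in_paragraph) {
                out[n++] = ' ';         /* join with the previous line */
            } else if (n > 0) {
                out[n++] = '\n';        /* finish the previous paragraph */
                out[n++] = '\n';        /* and leave one empty line */
            }
            memcpy(out + n, in, (size_t) (end - in));
            n += (size_t) (end - in);
            in_paragraph = true;
        }
        in = next;
    }

    if (n > 0)
        out[n++] = '\n';
    out[n] = '\0';
    return out;
}
```

Because the whole input is held in memory and no line width is involved, paragraph length is bounded only by available memory rather than by a fixed limit.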
>
>I've Cc'd the upstream bug-tracker, so we'll have a bug number there,
>too.
>
>--
>You received this bug notification because you are subscribed to the bug
>report.
>https://bugs.launchpad.net/bugs/868747
>
>Title:
>  fmt -f <unlimited> should be supported
>
>Status in “coreutils” package in Ubuntu:
>  New
>
>Bug description:
>  Strictly speaking this is an enhancement request.
>
>  fmt imposes an artificial limit on the maximum output line length
>  controlled by the -f option, which prevents using this tool to "join"
>  together all lines in each paragraph (for any paragraph size).  This
>  operation is necessary to prepare plain-text for import into a word
>  processor such as LibreOffice, where "manual line breaks" (i.e.
>  newlines in the middle of paragraphs) are undesirable.  If fmt could
>  be run with an effectively unlimited max line length, then it could be
>  used for this purpose.
>
>  Ideally there would be a way to specify an explicitly unlimited output
>  line length (say, -f -1).
>
>  SUMMARY:
>    'fmt -f 9999 file.txt' gives error "invalid width".  It should accept an arbitrarily-large value (up to max 32-bit integer).
>    Ideally, an option would allow specifying an explicitly unlimited (or maximum) output line length.
>
>  ProblemType: Bug
>  DistroRelease: Ubuntu 10.10
>  Package: coreutils 8.5-1ubuntu3
>  ProcVersionSignature: Ubuntu 2.6.35-30.59-generic 2.6.35.13
>  Uname: Linux 2.6.35-30-generic x86_64
>  Architecture: amd64
>  Date: Wed Oct  5 14:45:27 2011
>  InstallationMedia: Ubuntu 10.10 "Maverick Meerkat" - Release Candidate amd64 (20100928)
>  ProcEnviron:
>   PATH=(custom, user)
>   LANG=en_US.utf8
>   SHELL=/bin/bash
>  SourcePackage: coreutils