who command gets "who: memory exhausted" for certain inputs

Bug #1109327 reported by Chris J Arges on 2013-01-29
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
GLibC
Fix Released
Medium
eglibc (Debian)
Fix Released
Unknown
eglibc (Ubuntu)
Undecided
Unassigned
Lucid
Medium
Adam Conrad
Precise
Medium
Adam Conrad
Quantal
Medium
Adam Conrad

Bug Description

SRU Justification:
[Impact]
 * When using who, certain characters can cause issues with eglibc's vfprintf causing memory issues and who to print 'memory exhausted'.

[Test Case]

 * Download wtmp.clean.
 * locale-gen en_US.UTF-8 # if necessary
 * LANG=en_US.UTF-8 who wtmp.clean
 * If you see 'who: memory exhausted' the test failed.

[Regression Potential]

 * This patch reverts a change that fixes the issue in BZ #6530.
 * The patch also adds a test case to check for handling of incomplete multi-byte characters.
 * upstream patch URL: http://sourceware.org/git/?p=glibc.git;a=commit;h=715a900c9085907fa749589bf738b192b1a2bda5

--

* Description:
When running who with the attached file we get an error of "who: memory exhausted".
    $ who wtmp.clean

This works fine in newer versions of eglibc. I was able to determine that coreutils was not the problem
by using the precise version of coreutils with the raring eglibc version. In that case the problem went away.
In addition I've compiled the precise version for raring, and the problem is not present.

* Versions affected:
This affects current Lucid, Oneiric, Precise and Quantal eglibc versions.
2.11.1-0ubuntu7.12
2.13-20ubuntu5.3
2.15-0ubuntu10.4
2.15-0ubuntu20.1
But does not affect Raring eglibc ( 2.17-0ubuntu1 )

With the following testcase, it happens while it shouldn't, according to
the manual:
-----8<-------
#include <stdio.h>
#include <locale.h>

#define STR "²éľÂíɱ²¡¶¾£¬ÖܺèµtÄúµÄ360²»×¨Òµ£¡"

int main(void) {
        char buf[200];
        setlocale(LC_ALL, "");
        printf("%d\n", snprintf(buf, 150, "%.50s", STR));
        return 0;
}
----->8-------

The manual page has this to say:
 About precision:
  An optional precision, in the form of a period (&#8216;.&#8217;) followed by an
  optional decimal digit string.(...) This gives (...) the maximum
  number of characters to be printed from a string for s and S
  conversions.

 About s:
  If no l modifier is present: The const char * argument is expected to
  be a pointer to an array of character type(...)

  If an l modifier is present: The const wchar_t * argument is expected
  to be a pointer to an array of wide characters. Wide characters from
  the array are converted to multibyte characters (...)

There is no "l" modifier, but still, the string goes through the
multibyte conversion code, and fails because the string is invalid
multibyte.

Note, it only works with non UTF-8 locale set in LC_CTYPE or LC_ALL.

This is debian bug http://bugs.debian.org/208308

Err the title is bogus, the thing is that sprintf returns -1 bogusly, if you
run the previous testcase using LC_ALL=ja_JP.EUC-JP e.g.

%s should not care about multibyte at all

Here's a simpler testcase for this bug courtesy of Jonathan Nieder:

    #include <stdio.h>
    #include <locale.h>

    int main(void)
    {
        int n;

        setlocale(LC_CTYPE, "");
        n = printf("%.11s\n", "Author: \277");
        perror("printf");
        fprintf(stderr, "return value: %d\n", n);
        return 0;
    }

Under a C locale that'll do the right thing:

    $ LANG=C ./test
    Author: &#65533;
    printf: Success
    return value: 10

But not under a UTF-8 locale, since \277 isn't a valid UTF-8 sequence:

    $ LANG=en_US.utf8 ./test
    printf: Invalid or incomplete multibyte or wide character
    return value: -1

This may be a duplicate of another longstanding bug report that Mr. Drepper has marked as WONTFIX despite it being obviously non-conformant. He seems to believe, for whatever reason, that avoiding writing out incomplete multibyte sequences in incorrect programs that use %.NNs formats haphazardly with multibyte strings is more important than not breaking correct programs that are using %.NNs with strings not intended to be interpreted as multibyte characters in the current locale. Neither C nor POSIX makes any requirement that strings used with the %s format specifier be valid multibyte character sequences in the current locale; glibc is simply wrong on this issue.

I'm moving this bug to unassigned. We need someone to review this issue and confirm the problem.

What would really make this bug stand out is if someone did the work to review what the *standards* say about this particular behaviour. Reviewing what should be accepted is 90% of the work in issues like this. Help on this from the bug submitter would be very useful.

My pleasure. Per N1256 (C99+TC1/2/3):

s
If no l length modifier is present, the argument shall be a pointer to the initial element of an array of character type.246) Characters from the array are written up to (but not including) the terminating null character. If the precision is specified, no more than that many bytes are written. If the precision is not specified or is greater than the size of the array, the array shall contain a null character. If an l length modifier is present, the argument shall be a pointer to the initial element of an array of wchar_t type. Wide characters from the array are converted to multibyte characters (each as if by a call to the wcrtomb function, with the conversion state described by an mbstate_t object initialized to zero before the first wide character is converted) up to and including a terminating null wide character. The resulting multibyte characters are written up to (but not including) the terminating null character (byte). If no precision is specified, the array shall contain a null wide character. If a precision is specified, no more than that many bytes are written (including shift sequences, if any), and the array shall contain a null wide character if, to equal the multibyte character sequence length given by the precision, the function would need to access a wide character one past the end of the array. In no case is a partial multibyte character written.247)

And the referenced footnotes:

246) No special provisions are made for multibyte characters.

247) Redundant shift sequences may result if multibyte characters have a state-dependent encoding.

I don't think it could be clearer than what footnote 246 says about %s without the l modifier.

Many thanks Rich, that really helps. I'll put this into the queue to see if we can fix it for 2.16.

Any progress on this? I'm going to try to find and reopen the bug it's a duplicate of; perhaps then they can be marked as duplicate.

I'm moving this to 2.17. We didn't get a chance to look into this issue. We are in the code freeze for 2.16 and I think it will take considerable effort for the community to develop and test a fix for this.

Based on experience from the sinking feeling of watching target milestones slowly march forward on the Mozilla and Chromium bugtrackers: could you target it at no particular milestone instead? That way, anyone stumbling on this will realize that it is their own responsibility to get it fixed quickly if they care for that to happen.

Thanks,
Jonathan

(In reply to comment #10)
> Based on experience from the sinking feeling of watching target milestones
> slowly march forward on the Mozilla and Chromium bugtrackers: could you target
> it at no particular milestone instead? That way, anyone stumbling on this will
> realize that it is their own responsibility to get it fixed quickly if they
> care for that to happen.
>
> Thanks,
> Jonathan

Jonathan, That's a discussion that should be had on <email address hidden>. Please start a thread there :-)

Fixing this bug should not be complex or invasive. Basically, it's a case of special-case code having been incorrectly written for a case that does not need to be, and should not be, special-cased. The fix should essentially just remove this incorrect code. I haven't read the source yet, but I'd be willing to look into it and possibly submit a patch.

Created attachment 6472
patch to fix the bug

This patch fixes the bug. As I suspected, it's all -'s, no +'s. If you want to fix the indention on the one line that's left, you can and it'll be a single +...

Hi,

Rich Felker wrote:

> --- vfprintf.c.orig
> +++ vfprintf.c
> @@ -1168,42 +1168,7 @@
> else if (!is_long && spec != L_('S')) \
> { \
> if (prec != -1) \
> - { \
> - /* Search for the end of the string, but don't search past \
> - the length (in bytes) specified by the precision. Also \
> - don't use incomplete characters. */ \
> - if (_NL_CURRENT_WORD (LC_CTYPE, _NL_CTYPE_MB_CUR_MAX) == 1) \
> len = __strnlen (string, prec); \
> - else \
> - { \
> - /* In case we have a multibyte character set the \
> - situation is more complicated. We must not copy \
> - bytes at the end which form an incomplete character. */\
[...]
> - len = str2 - string - (ps.__count & 7); \
> - } \
> - } \
> else \
> len = strlen (string); \
> } \
> This patch fixes the bug. As I suspected, it's all -'s, no +'s.

Thanks. wprintf(3) tells me:

 s
  If no l modifier is present: The const char * argument
  is expected to be a pointer to an array of character
  type (pointer to a string) containing a multibyte character
  sequence beginning in the initial shift state.
  Characters from the array are converted to wide
  characters (each by a call to the mbrtowc(3) function
  with a conversion state starting in the initial state
  before the first byte). The resulting wide characters
  are written up to (but not including) the terminating
  null wide character.
[etc]

C99 says that in this case "the argument shall be a pointer to the
initial element of a character array containing a multibyte character
sequence beginning in the initial shift state". POSIX uses similar
wording. C99 describes an error condition:

 The fwprintf function returns the number of wide characters
 transmitted, or a negative value if an output or encoding
 error occurred.

POSIX does not document what value errno should have in this case, but
presumably EILSEQ is typical, for consistency with mbrtowc.

Luckily these considerations only matter in the defined(COMPILE_WPRINTF)
case. Your patch only modifies the !defined(COMPILE_WPRINTF) code.
So for what it's worth:

Reviewed-by: Jonathan Nieder <email address hidden>

The bug has nothing to do with wprintf. It applies to the non-wide printf functions. The text you cited is about wprintf and is therefore irrelevant.

(In reply to comment #15)
> The bug has nothing to do with wprintf. It applies to the non-wide printf
> functions. The text you cited is about wprintf and is therefore irrelevant.

Did you read my message to the end? Is your reply meant to motivate people
to help you?

I would be thrilled in printf and wprintf were defined in separate files
with shared code (if any) in a third file, but that's a topic for another
day.

Apologies for jumping on your reply. I did read the whole thing but didn't get what you intended the message to be, and the whole history of this bug has been folks misquoting text that has nothing to do with the non-long %s format in the non-wide printf functions to justify the buggy behavior... :-(

With that said, thanks for the review.

(In reply to comment #17)
> Apologies for jumping on your reply. I did read the whole thing but didn't get
> what you intended the message to be, and the whole history of this bug has been
> folks misquoting text that has nothing to do with the non-long %s format in the
> non-wide printf functions to justify the buggy behavior... :-(

I had been assuming that the bug was introduced by copying the wprintf()
implementation for reuse in printf() --- i.e., no misunderstanding of the
standard but just simple operator error.

But actually the patch introducing the regression (40c014b3, "Correct
handling of multibyte character strings in %s format with precision",
2000-07-22) only affected vsnprintf() and co. The changelog comment
makes no sense. There is no explanation on libc-alpha@ or libc-hacker@
from around that time. It looks like a simple, unilateral, bad decision.

There's a patch identical to yours complete with changelog and copyright
assignment at <http://sourceware.org/ml/libc-hacker/2010-08/msg00009.html>.
It just reverts 40c014b3. It was ignored. A story from a worse time.

Thanks for fixing it.

A testcase for the fix should be added to the glibc testsuite (this
applies to any bug fix that can reasonably be covered in the testsuite and
isn't fixing an existing test failure). (It's probably a good idea to add
a test for wprintf as well to verify that wprintf continues to behave as
expected, if this isn't already covered by existing wprintf tests.)

I don't think there's any concern about breaking wprintf. wprintf inherently operates by converting the argument to %s to wide characters since its output it wide characters; as such, there's no way it could avoid failing when the input is an invalid multibyte sequence unless it completely ignored the failure return.

Created attachment 6474
test conversion from UTF-8 and handling of incomplete characters (fails)

I assume Joseph means to spend some time to avoid future authors breaking
wprintf now that our attention's on it, not to avoid breaking wprintf when
applying your patch. Your patch doesn't even touch wprintf code.

I tried writing a wprintf testcase, and it's not gone so well. For
example, I tried

  wchar_t buf[1000];
  int n;

  if (!setlocale(LC_ALL, "ja_JP.UTF-8)) {
    perror("setlocale");
    return 1;
  }
  n = swprintf(buf, sizeof(buf)/sizeof(wchar_t), L"%3s",
        "\xef\xbd\xa1g\xef\xbd\xa2h\xef\xbd\xa3i\xef\xbd\xa4j");

I expected swprintf to return 3 and buf to equal

  { 0xff61, 'g', 0xff62, 0 }

Instead, swprintf returns 8 and converts the entire string.

That's because you made the classic printf mistake of confusing width and precision.

Try L"%.3s".

Created attachment 6475
Test swprintf's handling of incomplete multibyte characters

(In reply to comment #22)
> That's because you made the classic printf mistake of confusing width and
> precision.
>
> Try L"%.3s".

Good catch, thanks. Here's a corrected test, which works fine.

On Fri, 22 Jun 2012, carlos_odonell at mentor dot com wrote:

> > Based on experience from the sinking feeling of watching target milestones
> > slowly march forward on the Mozilla and Chromium bugtrackers: could you target
> > it at no particular milestone instead? That way, anyone stumbling on this will
> > realize that it is their own responsibility to get it fixed quickly if they
> > care for that to happen.
> >
> > Thanks,
> > Jonathan
>
> Jonathan, That's a discussion that should be had on <email address hidden>.
> Please start a thread there :-)

Yes, we do need to discuss how best to use milestones based on our
experiences of the 2.16 release cycle. Experience with bug trackers of
other problems would certainly be a useful contribution to that discussion
(and the related one of how to get bugs fixed faster than they are filed).

When a bug has been known for over 4 years and there's such a simple fix, why is there discussion of putting off fixing it until later? Or am I misunderstanding Jonathan's reply again?

(In reply to comment #25)
> When a bug has been known for over 4 years and there's such a simple fix, why
> is there discussion of putting off fixing it until later? Or am I
> misunderstanding Jonathan's reply again?

This bug will likely be fixed in 2.17 development, with a backport to 2.16.1 if requested. Does that match your expectations?

Created attachment 6477
Test sprintf's handling of incomplete multibyte characters

And here's a test to make sure the bug doesn't rear its head again in the future.

I personally think Rich's fix should be safe to apply for 2.16.

commit 715a900c
Author: Jeff Law <email address hidden>
Date: Fri Sep 28 12:48:42 2012 -0600

    2012-09-28 Andreas Schwab <email address hidden>

            [BZ #6530]
            * stdio-common/vfprintf.c (process_string_arg): Revert
            2000-07-22 change.

Jonathan,

Could you please open one new issue per backport request and use keywords to mark which branch to backport the patch to.

Also we expect the person or groups that needs the backport to do the work of making and testing the patch. The release branch manager simply does the approval and analysis of risk for the branch.

Thanks.

Sure, no problem.

Do I understand correctly that this means testing the patch against
2.16, 2.15, 2.14, 2.12, and 2.11 and then filing one bug for each
branch describing the result?

(In reply to comment #30)
> Sure, no problem.
>
> Do I understand correctly that this means testing the patch against
> 2.16, 2.15, 2.14, 2.12, and 2.11 and then filing one bug for each
> branch describing the result?

Yes, that way we can track the backport to each branch individually and the various release branch maintainers and the release branch shareholders can comment on the change. The keyword allows a release branch manager to quickly identify issues open against their branch.

If you have any feedback about this process I'd be more than happy to hear it. We don't have much good formal process for bugzilla, but we're trying :-)

On Wed, 3 Oct 2012, jrnieder at gmail dot com wrote:

> Do I understand correctly that this means testing the patch against
> 2.16, 2.15, 2.14, 2.12, and 2.11 and then filing one bug for each
> branch describing the result?

There is no real evidence of 2.14 or 2.12 being alive - no response from
their maintainer when I asked on libc-alpha whether they are alive or dead
- and 2.11 is considered dead
<http://sourceware.org/ml/libc-alpha/2012-08/msg00557.html>. I don't
think it's worth doing any backports to those three branches.

Backport requests for known living branches filed as bug 14667 and bug 14667.

(In reply to comment #33)
> Backport requests for known living branches filed as bug 14667 and bug 14667.

... and bug 14668. Sorry for the noise.

Chris J Arges (arges) wrote :
Changed in eglibc (Ubuntu Precise):
assignee: nobody → Chris J Arges (arges)
Changed in eglibc (Ubuntu Quantal):
assignee: nobody → Chris J Arges (arges)
Changed in eglibc (Ubuntu Precise):
importance: Undecided → Medium
Changed in eglibc (Ubuntu Quantal):
importance: Undecided → Medium
Changed in eglibc (Ubuntu Precise):
status: New → In Progress
Changed in eglibc (Ubuntu Quantal):
status: New → In Progress
Changed in eglibc (Ubuntu):
assignee: Chris J Arges (arges) → nobody
status: In Progress → Fix Released
importance: Medium → Undecided
Chris J Arges (arges) wrote :

If I install eglibc 2.15-0ubuntu20 it fails, if I install 2.16-0ubuntu1 it works.
Tested with packages here: https://launchpad.net/ubuntu/raring/+source/eglibc

Chris J Arges (arges) wrote :

This seems to work within a precise chroot, but a non-chroot precise environment it fails.

Chris J Arges (arges) wrote :

A workaround to this issue is to use 'export LANG=C' before issuing the command.

Adam Conrad (adconrad) wrote :

A quick test here suggests that the attached backport patch should DTRT. I think. Unless my testing methodology sucks. Putting it through a proper rebuild test to see.

Adam Conrad (adconrad) wrote :

Okay, tested with a proper rebuild, and that patch works. Will include it in the next eglibc SRU round.

Changed in eglibc (Ubuntu Precise):
assignee: Chris J Arges (arges) → Adam Conrad (adconrad)
Changed in eglibc (Ubuntu Quantal):
assignee: Chris J Arges (arges) → Adam Conrad (adconrad)
Adam Conrad (adconrad) on 2013-02-07
affects: eglibc → glibc
Chris J Arges (arges) wrote :

http://people.canonical.com/~adconrad/lp1109327/

I tested this on Precise and verified that it does fix the issue.

Chris J Arges (arges) on 2013-02-07
description: updated
Chris J Arges (arges) on 2013-02-08
Changed in eglibc (Ubuntu Lucid):
importance: Undecided → Medium
status: New → In Progress
Chris J Arges (arges) wrote :

This also affects the lucid version. I did a test build with Adam's asplodey-who patch here:
http://people.canonical.com/~arges/lp1109327/
I tested this and it fixes the issue.

Changed in eglibc (Ubuntu Lucid):
assignee: nobody → Adam Conrad (adconrad)
Changed in glibc:
importance: Unknown → Medium
status: Unknown → Fix Released
Changed in eglibc (Debian):
status: Unknown → Fix Released

*** Bug 16298 has been marked as a duplicate of this bug. ***

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, release/2.15/master has been updated
       via 7def6a63a0298d2e5275820db22eb40ac3dcbe4b (commit)
      from 1ba48eb07a72690406c0ffda642a963c88639752 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=7def6a63a0298d2e5275820db22eb40ac3dcbe4b

commit 7def6a63a0298d2e5275820db22eb40ac3dcbe4b
Author: Carlos O'Donell <email address hidden>
Date: Thu Jan 30 13:19:58 2014 -0500

    Don't parse %s format argument as multibyte string

    2012-09-28 Andreas Schwab <email address hidden>

                [BZ #6530]
                * stdio-common/vfprintf.c (process_string_arg): Revert
                2000-07-22 change.

    2011-09-28 Jonathan Nieder <email address hidden>

                * stdio-common/Makefile (tst-sprintf-ENV): Set environment
                for testcase.
                * stdio-common/tst-sprintf.c: Include <locale.h>
                (main): Test sprintf's handling of incomplete multibyte
                characters.

    (cherry picked from commit 715a900c9085907fa749589bf738b192b1a2bda5)

-----------------------------------------------------------------------

Summary of changes:
 ChangeLog | 14 ++++++++++++++
 NEWS | 8 ++++----
 stdio-common/Makefile | 1 +
 stdio-common/tst-sprintf.c | 13 +++++++++++++
 stdio-common/vfprintf.c | 38 +++-----------------------------------
 5 files changed, 35 insertions(+), 39 deletions(-)

*** Bug 260998 has been marked as a duplicate of this bug. ***
Seen from the domain http://volichat.com
Page where seen: http://volichat.com/adult-chat-rooms
Marked for reference. Resolved as fixed @bugzilla.

Rolf Leggewie (r0lf) wrote :

quantal has seen the end of its life and is no longer receiving any updates. Marking the quantal task for this ticket as "Won't Fix".

Changed in eglibc (Ubuntu Quantal):
status: In Progress → Won't Fix
Adam Conrad (adconrad) on 2015-03-25
Changed in eglibc (Ubuntu Lucid):
status: In Progress → Won't Fix

Hello Chris, or anyone else affected,

Accepted eglibc into precise-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/eglibc/2.15-0ubuntu10.12 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in eglibc (Ubuntu Precise):
status: In Progress → Fix Committed
tags: added: verification-needed
Adam Conrad (adconrad) wrote :

Verified that libc6/2.15-0ubuntu10.12 fixes this bug.

tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package eglibc - 2.15-0ubuntu10.12

---------------
eglibc (2.15-0ubuntu10.12) precise; urgency=medium

  * cvs-vfprintf-multibyte.diff: Fix "memory exhausted" bug in who, by no
    longer parsing %s format arguments as multibyte strings (LP: #1109327)
  * cvs-__SSE_MATH__-feraiseexcept.diff: Check for __SSE_MATH__ in x86_64
    feraiseexcept to fix backported -m32 builds of GCC 4.8 (LP: #1165387)
  * cvs-canonical-name.diff: Don't incorrectly do a PTR lookup when asked
    to do a canonical lookup for a host using AI_CANONNAME (LP: #1057526)
  * cvs-atomic-fastbins.diff: Fix race in free() of fastbin (LP: #1020210)
 -- Adam Conrad <email address hidden> Wed, 25 Mar 2015 13:28:41 -0600

Changed in eglibc (Ubuntu Precise):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for eglibc has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Bug attachments

Remote bug watches

Bug watches keep track of this bug in other bug trackers.