pidstat outputs to log with invalid characters

Bug #1716113 reported by Doug McMahon on 2017-09-09
This bug affects 1 person
Affects Status Importance Assigned to Milestone
sysstat (Debian)
Fix Released
sysstat (Ubuntu)
Christian Ehrhardt 

Bug Description

For purpose of bug most simplistic example, can get much worse depending on usage
Test case:
In gnome terminal:
pidstat 2 5 > mem.log

Open mem.log in text editor to see.

ProblemType: Bug
DistroRelease: Ubuntu 17.10
Package: sysstat 11.5.7-1
ProcVersionSignature: Ubuntu 4.12.0-12.13-generic 4.12.8
Uname: Linux 4.12.0-12-generic x86_64
NonfreeKernelModules: nvidia_uvm nvidia_drm nvidia_modeset nvidia
ApportVersion: 2.20.7-0ubuntu1
Architecture: amd64
CurrentDesktop: ubuntu:GNOME
Date: Sat Sep 9 07:10:47 2017
InstallationDate: Installed on 2017-09-07 (1 days ago)
InstallationMedia: Ubuntu 17.10 "Artful Aardvark" - Alpha amd64 (20170906)
 PATH=(custom, no user)
SourcePackage: sysstat
UpgradeStatus: No upgrade log present (probably fresh install)

Doug McMahon (mc3man) wrote :
Changed in sysstat (Ubuntu):
status: New → Confirmed

Yeah this is really an issue, I can confirm the finding.
By comparing Zesty and Artful output and alinging the chars that don't matter I found that is:

25 75 73 72 0a 30 2e 32 35 0a
25 75 73 72 0a 30 2e 32 35 00 0a

You see the extra 00, no idea where that comes from yet.

Changed in sysstat (Ubuntu):
assignee: nobody → ChristianEhrhardt (paelzer)

Will look at that TMRW-morning

Running without the redirect I realized that sysstat now has colored output.
So I assumed that this might be a misdetection of the capabilities of the output.

a) pidstat -> colored
b) sudo pidstat -> non colored
c) pidstat to file -> special chars

Yet OTOH I'd not see the special chars in b), so the detection might be ok, but the non colorized output could be broken.

But then the color chars would need some actual content.
As I sniffed yesterday the extra chars are just always "00" bytes.
And I now found that ALL of them are.
All 00 is no color code, but maybe a failed effect of the colorizing code changes.

I checked changes since 11.5.7 on but there is no fix to that issue. Debugging the code that prints this now ...

$ S_COLORS=never pidstat 4 3 > nocol.log
still has the bad chars

$ S_COLORS=always pidstat 4 3 > withcol.log
Has what you'd expect from colors like "^[[0m^[[32;22m"

So detection is correct - the redirect runs in the no-color mode (unless forced by the environment variable), but the non colorized mode is the one that adds these broken extra "00" bytes.

That is the backtrace of the call adding the "00"

#0 write () at ../sysdeps/unix/syscall-template.S:84
#1 0x00007f2160fa4917 in _IO_new_file_write (f=0x7f21612ea600 <_IO_2_1_stdout_>, data=0x7f21612ea683 <_IO_2_1_stdout_+131>, n=1)
    at fileops.c:1271
#2 0x00007f2160fa6462 in new_do_write (to_do=1, data=0x7f21612ea683 <_IO_2_1_stdout_+131> "", fp=0x7f21612ea600 <_IO_2_1_stdout_>)
    at fileops.c:526
#3 _IO_new_do_write (fp=fp@entry=0x7f21612ea600 <_IO_2_1_stdout_>, data=0x7f21612ea683 <_IO_2_1_stdout_+131> "", to_do=1) at fileops.c:502
#4 0x00007f2160fa693b in _IO_new_file_overflow (f=0x7f21612ea600 <_IO_2_1_stdout_>, ch=0) at fileops.c:867
#5 0x00007f2160f9c810 in putchar (c=c@entry=0) at putchar.c:28
#6 0x000000010000af54 in printf (__fmt=<synthetic pointer>) at /usr/include/x86_64-linux-gnu/bits/stdio2.h:104
#7 cprintf_pc (human=<optimized out>, num=5, wi=7, wd=2) at common.c:1329
#8 0x0000000100005bf8 in write_pid_task_cpu_stats (prev=0, curr=1, dis=<optimized out>, disp_avg=0, prev_string=<optimized out>,
    curr_string=0x7fffffffe0c0 "09:29:10", itv=200, g_itv=200) at pidstat.c:1645
#9 0x0000000100007c5e in write_stats_core (prev=0, curr=1, dis=1, disp_avg=0, prev_string=0x7fffffffe080 "09:29:03",
    curr_string=0x7fffffffe0c0 "09:29:10") at pidstat.c:2320
#10 0x0000000100007ed4 in write_stats (curr=1, dis=1) at pidstat.c:2432
#11 0x000000010000819a in rw_pidstat_loop (dis_hdr=1, rows=23) at pidstat.c:2528
#12 0x0000000100002518 in main (argc=3, argv=0x7fffffffe468) at pidstat.c:2845

That sequence is meant to normalize the color again after a former color output.
1328 printf("%s", sc_normal);
1329 printf("%c", u);

Frame 8 above does not condition check on color code being requested.
Instead there is "init_colors", but the logic there is odd:

/* Read S_COLORS environment variable */
if (((e = getenv(ENV_COLORS)) == NULL) ||
    !strcmp(e, C_NEVER) ||
    (strcmp(e, C_ALWAYS) && !isatty(STDOUT_FILENO))) {
=> then reset colors to "nothing"

But that is ?!?!
If the var is not found then it is supposed to be "auto" which means keep color codes - ok
If set to never it is supposed to reset them, but the !strcmp avoids just that - broken?.
Then if set to "always" but not a tty then reset the colors - ok, but ...
What if it is set to "auto" as documented in the man page, then it is supposed to disable on !isatty as well IMHO but it won't.

Then on the print it does:
1327 »···»···printf(" %*.*f", wi, wd, val); <- Value
1328 »···»···printf("%s", sc_normal); <- Color code (if set)
1329 »···»···printf("%c", u); <- '\0' in non human mode

So there are two things:
1. check if the color setting code is right and fix if needed
2. check if the '\0' follow up is our special char and if if it is so

writing some debug code now ...

Ok, as expected it is the '\0' suffix in non-human mode.

Broken since:

You can not have an "empty" char constant like ''.
But '\0' is literal "00" byte.
So that is the error.

Instead this should use '%' or '' as a string.

Simple reference program:
cat test.c

int main()
    static const char percent[] = "%";
    static const char empty[] = "";
    static const char *unit = empty;
    printf("FOO%s", unit);
    unit = percent;
    printf("BAR%s", unit);

That would output % or nothing.
$ ./test > foo
$ hexdump -C foo
00000000 46 4f 4f 42 41 52 25 |FOOBAR%|

There are a dozen ways to do string constants in programs (static consts, DEFINES, ...).
I'll suggest the way closest to the current code upstream and we can follow the outcome of the discussion.

Not a big fan of this solution but it is local to the function where it broke.

I'd like a bool flag more actually ...

alternative solution

Debian is already ahead since the Artful feature freeze, they will pick it up with the next merge from upstream. Waiting to see what happens upstream before filing a bug there.

Changed in sysstat (Ubuntu):
status: Confirmed → Triaged
status: Triaged → In Progress

Linked an MP for review as a fix into Ubuntu Artful.
This might be adapted depending on the speed and content of the upstream PR-review/acceptance.

tags: added: patch

Added Debian bug that I reported.

Changed in sysstat (Debian):
status: Unknown → New
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package sysstat - 11.5.7-1ubuntu1

sysstat (11.5.7-1ubuntu1) artful; urgency=medium

  * d/p/fix-00-byte-after-values-when-human-is-not-set.patch: Fix unwanted
    00 bytes in output when --human is not set (LP: #1716113).

 -- Christian Ehrhardt <email address hidden> Tue, 12 Sep 2017 14:11:01 +0200

Changed in sysstat (Ubuntu):
status: In Progress → Fix Released
Changed in sysstat (Debian):
status: New → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.