Bug #341827 “hplip filled user.log with 2.2Gb of log on error in...” : Bugs : HPLIP

zebul666 (zebul666) wrote on 2009-03-12:

#1

hp-check -t output (7.6 KiB, text/plain)

zebul666 (zebul666) wrote on 2009-03-12:

#2

No, it is not dead. I had to remove every document from the CUPS queue before my printer would print again.

Could a single document have caused that? The Acrobat Reader plugin?

David Suffield (david-suffield) wrote on 2009-03-16:

#3

What kind of system are you running? Distro?

Is there more syslog info? I see USB IO (musb.c) messages but no up-stream caller messages. There are no loops at this level of the code.

Based on the error messages something hung your printer...

zebul666 (zebul666) wrote on 2009-03-17:

#4

It's 32-bit Linux. The distro is Arch Linux (current as of the date above).

In user.log I got the lines above repeated a huge number of times.
The same error lines are in /var/log/errors.log.

In everything.log I got these lines just before the errors given above; I switched on the printer at 15:38:
Mar 12 15:38:37 soho kernel: usb 1-4: new high speed USB device using ehci_hcd and address 5
Mar 12 15:38:37 soho kernel: usb 1-4: configuration #1 chosen from 1 choice
Mar 12 15:38:37 soho kernel: usblp0: USB Bidirectional printer dev 5 if 1 alt 0 proto 2 vid 0x03F0 pid 0x5611
Mar 12 15:38:37 soho kernel: usbcore: registered new interface driver usblp
Mar 12 15:38:37 soho load-modules.sh: 'usb:v03F0p5611d0100dc00dsc00dp00icFFiscCCip00' is not a valid module or alias name
Mar 12 15:38:37 soho kernel: Initializing USB Mass Storage driver...
Mar 12 15:38:37 soho kernel: scsi4 : SCSI emulation for USB Mass Storage devices
Mar 12 15:38:37 soho kernel: usb-storage: device found at 5
Mar 12 15:38:37 soho kernel: usbcore: registered new interface driver usb-storage
Mar 12 15:38:37 soho kernel: USB Mass Storage support registered.
Mar 12 15:38:37 soho kernel: usb-storage: waiting for device to settle before scanning
Mar 12 15:38:37 soho load-modules.sh: 'usb:v03F0p5611d0100dc00dsc00dp00icFFiscFFipFF' is not a valid module or alias name
Mar 12 15:38:42 soho kernel: scsi 4:0:0:0: Direct-Access HP Photosmart C3180 1.00 PQ: 0 ANSI: 2
Mar 12 15:38:42 soho kernel: sd 4:0:0:0: [sdb] Attached SCSI removable disk
Mar 12 15:38:42 soho kernel: sd 4:0:0:0: Attached scsi generic sg2 type 0
Mar 12 15:38:42 soho kernel: usb-storage: device scan complete
Mar 12 15:39:19 soho kernel: usblp0: removed
Mar 12 15:39:21 soho kernel: usb 1-4: USB disconnect, address 5

What's strange is that the printer seems to have disappeared.
Could it be caused by hardware?

So do I need to lower the info level of syslog-ng for hplip?

David Suffield (david-suffield) wrote on 2009-03-17:

#5

I have never used archlinux, but your kernel did disconnect the device from the USB bus for some reason after initial connection. Linux kernel USB support can be a problem with some no-name PCs.

Try the following in order.

1. Remove any hubs, plug directly into PC.
2. Try a certified HIGH-SPEED or USB 2.0 cable.
3. Try a different USB port on the PC.
4. Try running FULL-SPEED or USB 1.1 (i.e. via a BIOS setting or a USB 1.1 hub).
5. Try a different PC or distro.

-dave

Aaron Albright (albrigha-deactivatedaccount) wrote on 2009-03-27:

#6

Are you still having a problem with this?

Thanks.

Aaron

Changed in hplip:
assignee:	nobody → kalosaurusrex
status:	New → Triaged

zebul666 (zebul666) wrote on 2009-03-28:

#7

It's not a no-name PC, it's a Dell Inspiron 531 ;-)
It is a USB 2.0 cable (certified high-speed, I think).

This has happened only once.
But it's a problem that my logs grow like that. In fact, three logs each grew to 2.2GB: user.log, everything.log and errors.log, which makes 6.6GB in total! I know we have big hard disks now, but... it was only 2.2GB because I stopped this mess by rebooting.

Do you consider that it's up to me to configure syslog (if that's possible) so that such a thing never happens again?
I think something must be done to avoid the same message being repeated over and over.
Something like that exists for kernel messages (where you can see "(repeated 6 times)" after some messages).

How do you properly stop such a thing from happening (i.e. stop the log from growing) other than by rebooting?
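
Editorial sketch: the "(repeated 6 times)" behaviour mentioned above can be illustrated as a filter over an existing log. This is a minimal sketch, not something hplip or syslog-ng provides; the function names `strip_timestamp` and `collapse_repeats` are hypothetical, and it assumes each line starts with the usual three-field "Mon DD HH:MM:SS" prefix.

```python
from itertools import groupby


def strip_timestamp(line):
    """Drop the leading 'Mon DD HH:MM:SS' fields so identical messages compare equal."""
    parts = line.split(None, 3)
    return parts[3] if len(parts) == 4 else line


def collapse_repeats(lines):
    """Yield the first line of each run of identical messages, then a repeat note."""
    for _, run in groupby(lines, key=strip_timestamp):
        run = list(run)
        yield run[0]
        if len(run) > 1:
            yield "(last message repeated %d times)" % (len(run) - 1)
```

This only collapses consecutive identical messages. A cycle of several distinct lines repeating in turn would pass through unchanged, and the filter shrinks a log after the fact rather than stopping it from growing.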

zebul666 (zebul666) wrote on 2009-03-28:

#8

And Arch Linux uses an almost vanilla kernel. It is not heavily patched the way the big distros' kernels usually are.

Ariel Faigon (ariel.faigon) wrote on 2009-04-15:

#9

Seeing the same problem here: Ubuntu 8.10 (Intrepid). Different HP printer.

The log spewing is a big problem. When it happens it can quickly fill the disk.

The printer is alive and well, but the software thinks it isn't and goes out of its way to make the problem much worse.

To add to the problem logcheck starts failing because syslog and user.log are filling up so fast it can never catch up. logcheck calls logtail which calls grep and this grep (with many regexps) is running more slowly than the logs get added to!

Here's what's repeated (hundreds per second) in the logs:

Apr 14 18:55:16 ze DeskJet_970C?serial=SG99E1V1SMJQ: io/hpmud/musb.c 688: invalid deviceid retry ret=-19: No such device
Apr 14 18:55:16 ze DeskJet_970C?serial=SG99E1V1SMJQ: io/hpmud/musb.c 731: invalid device_status: No such device
Apr 14 18:55:16 ze DeskJet_970C?serial=SG99E1V1SMJQ: io/hpmud/musb.c 976: bulk_write failed buf=0xbfb5dadc size=4736 len=-19: No such device
Apr 14 18:55:16 ze DeskJet_970C?serial=SG99E1V1SMJQ: io/hpmud/musb.c 1338: unable to write data hp:/usb/DeskJet_970C?serial=SG99E1V1SMJQ: No such device
Apr 14 18:55:16 ze DeskJet_970C?serial=SG99E1V1SMJQ: io/hpmud/musb.c 679: invalid deviceid wIndex=0, retrying wIndex=0: No such device

[the above 5 lines repeat at extremely high rate killing system performance and filling up the disk]

Thanks for the attention to this bug.
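
Editorial sketch: to see which hpmud call sites dominate a flooded log, the lines can be tallied by the "io/hpmud/<file>.c <line>:" tag visible in the messages above. This is a minimal sketch under that assumption about the message format; `HPMUD_TAG` and `count_hpmud_sites` are hypothetical names, not part of hplip.

```python
import re
from collections import Counter

# Matches the "io/hpmud/<file>.c <line>:" tag seen in the log excerpts.
HPMUD_TAG = re.compile(r"(io/hpmud/\w+\.c) (\d+):")


def count_hpmud_sites(lines):
    """Count log lines per (source file, line number) call site."""
    counts = Counter()
    for line in lines:
        match = HPMUD_TAG.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts
```

Feeding it the lines of a log file and calling `most_common()` on the result lists the noisiest call sites first.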

David Suffield (david-suffield) wrote on 2009-04-15:

#10

Based on the syslog errors you are still getting a USB device disconnect during the print job. Assuming you have not tried a different PC, this is a USB kernel issue with your PC.

I can't explain the syslog filling up. There should be a 30 second loop error message in the syslog, but there is none. We will try and reproduce the syslog failure.

maximi89 (maximi89) wrote on 2009-04-25:

#11

I'm having the same problem; I think it is a problem with CUPS.
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=525361
https://bugs.launchpad.net/bugs/344592

In Red Hat Bugzilla #515481, Ted (ted-redhat-bugs) wrote on 2009-08-04:

#17

Description of problem:

Print jobs occasionally fail to print to a USB-connected HP PhotoSmart 2575,
requiring a cupsrestart to complete the job.

Versions of various components which have exhibited the problem:

$ uname -a 
Linux topaz.bugfinder.co.uk 2.6.27.25-170.2.72.fc10.i686 #1 SMP Sun Jun 21 19:03:24 EDT 2009 i686 i686 i386 GNU/Linux

$ rpm -q cups glibc hal hplip hpijs foomatic 
cups-1.3.10-5.fc10.i386
glibc-2.9-3.i686
hal-0.5.12-14.20081027git.fc10.i386
hplip-2.8.12-6b.fc10.i386
hpijs-2.8.12-6b.fc10.i386
foomatic-3.0.2-70.fc10.i386

Occasionally, and there doesn't appear to be a specific way of triggering the bug as far as I can tell, a submitted print job will stall in the print queue, usually after at least one page of a multi-page job has printed.

At that point, in the syslog, we usually see the following set of messages:

Aug  4 11:30:55 topaz kernel: usblp0: removed
Aug  4 11:30:55 topaz hal_lpadmin: Running hal_lpadmin
Aug  4 11:30:56 topaz hal_lpadmin: hal_lpadmin triggered by usblp kernel module
Aug  4 11:30:56 topaz hal_lpadmin: Using device ID from HAL database entry
Aug  4 11:30:56 topaz hal_lpadmin: remove
Aug  4 11:30:56 topaz hal_lpadmin: Found configured printer: Photosmart_2570_series
Aug  4 11:31:26 topaz kernel: usb 1-7: reset high speed USB device using ehci_hcd and address 3
Aug  4 11:31:26 topaz kernel: usblp0: USB Bidirectional printer dev 3 if 1 alt 0 proto 2 vid 0x03F0 pid 0x4E11
Aug  4 11:31:26 topaz kernel: usb 1-7: usbfs: process 10998 (hp) did not claim interface 1 before use
Aug  4 11:31:26 topaz Photosmart_2570_series?serial=MY65T112BM04DY: io/hpmud/musb.c 1024: bulk_write failed buf=0xbf988b3c size=512 len=-16: No data available
Aug  4 11:31:26 topaz Photosmart_2570_series?serial=MY65T112BM04DY: io/hpmud/musb.c 1386: unable to write data hp:/usb/Photosmart_2570_series?serial=MY65T112BM04DY: No data available
Aug  4 11:31:26 topaz kernel: usb 1-7: usbfs: process 10905 (hp) did not claim interface 1 before use
Aug  4 11:31:26 topaz Photosmart_2570_series?serial=MY65T112BM04DY: io/hpmud/musb.c 726: invalid deviceid wIndex=1, retrying wIndex=100: Device or resource busy
Aug  4 11:31:26 topaz hal_lpadmin: Running hal_lpadmin
Aug  4 11:31:27 topaz hal_lpadmin: hal_lpadmin triggered by usblp kernel module
Aug  4 11:31:27 topaz hal_lpadmin: Using device ID from HAL database entry
Aug  4 11:31:27 topaz hal_lpadmin: add
Aug  4 11:31:27 topaz hal_lpadmin: URIs: ['hp:/usb/Photosmart_2570_series?serial=MY65T112BM04DY', 'usb://HP/Photosmart%202570%20series?serial=MY65T112BM04DY', 'hal:///org/freedesktop/Hal/devices/usb_device_3f0_4e11_MY65T112BM04DY_if1_printer_MY65T112BM04DY']
Aug  4 11:31:27 topaz hal_lpadmin: HPLIP Fax URIs: None
Aug  4 11:31:27 topaz hal_lpadmin: Not adding printer: Photosmart_2570_series already exists
Aug  4 11:31:36 topaz kernel: usb 1-7: usbfs: process 10905 (hp) did not claim interface 0 before use
Aug  4 11:31:36 topaz kernel: usb 1-7: usbfs: process 11008 (hp) did not claim interface 1 before use
Aug  4 11:31:36 topaz Photosmart_2570_series?serial=MY65T112BM04DY: io/hpmud/musb.c 1024: bulk_write failed buf=0xbf988b3c size=512 len=-16: Device or resource busy
Aug  4 11:31:36 topaz Photosmart_2570_series?serial=MY65T112BM04DY: io/hpmud/musb.c 1386: unable to write data hp:/usb/Photosmart_2570_series?serial=MY65T112BM04DY: Device or resource busy
Aug  4 11:31:36 topaz kernel: usb 1-7: usbfs: process 10905 (hp) did not claim interface 1 before use
Aug  4 11:31:36 topaz Photosmart_2570_series?serial=MY65T112BM04DY: io/hpmud/musb.c 726: invalid deviceid wIndex=1, retrying wIndex=100: Device or resource busy
Aug  4 11:31:46 topaz kernel: usb 1-7: usbfs: process 11040 (hp) did not claim interface 1 before use
Aug  4 11:31:46 topaz Photosmart_2570_series?serial=MY65T112BM04DY: io/hpmud/musb.c 1024: bulk_write failed buf=0xbf988b3c size=512 len=-16: Device or resource busy
Aug  4 11:31:46 topaz Photosmart_2570_series?serial=MY65T112BM04DY: io/hpmud/musb.c 1386: unable to write data hp:/usb/Photosmart_2570_series?serial=MY65T112BM04DY: Device or resource busy

It should be noted that when I originally found this problem a while ago, the syslog filled up at the rate of several gigabytes an hour as all the "hpmud" messages were not rate-limited.

To reduce the extent to which the bug could flatten my log partition, I patched the hplip source code with:

$ more hplip-2.8.12-hpmud-bugthrottle.patch 
diff -uNr hplip-2.8.12.orig/io/hpmud/hpmudi.h hplip-2.8.12/io/hpmud/hpmudi.h
--- hplip-2.8.12.orig/io/hpmud/hpmudi.h	2008-12-17 20:41:08.000000000 +0000
+++ hplip-2.8.12/io/hpmud/hpmudi.h	2009-06-07 10:23:56.000000000 +0100
@@ -60,6 +60,9 @@
 #define _STRINGIZE(x) #x
 #define STRINGIZE(x) _STRINGIZE(x)
 
+// Impose a throttle on any BUG related error messages to avoid disk overflow ...
+#define HPMUD_BUG_THROTTLE 10000000  /* microseconds */
+#define HPMUD_BUG_SEC_THROTTLE 10  /* seconds */
 #define BUG(args...) syslog(LOG_ERR, __FILE__ " " STRINGIZE(__LINE__) ": " args)
 
 #ifdef HPMUD_DEBUG
diff -uNr hplip-2.8.12.orig/io/hpmud/musb.c hplip-2.8.12/io/hpmud/musb.c
--- hplip-2.8.12.orig/io/hpmud/musb.c	2008-12-17 20:41:08.000000000 +0000
+++ hplip-2.8.12/io/hpmud/musb.c	2009-06-07 10:27:59.000000000 +0100
@@ -125,6 +125,7 @@
       {
          /* This retry is necessary for lj1000 and lj1005. des 12/12/07 */
          BUG("get_string_descriptor zero result, retrying...");
+	 sleep(HPMUD_BUG_SEC_THROTTLE);
          continue;
       }
       break;
@@ -723,6 +724,7 @@
    {
       /* Following retry is necessary for a firmware problem with PS A420 products. DES 4/17/07 */
       BUG("invalid deviceid wIndex=%x, retrying wIndex=%x: %m\n", interface, interface << 8);
+      sleep(HPMUD_BUG_SEC_THROTTLE);
       rlen = usb_control_msg(hd, 
              USB_ENDPOINT_IN | USB_TYPE_CLASS | USB_RECIP_INTERFACE, /* bmRequestType */
              USB_REQ_GET_STATUS,        /* bRequest */
$
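
Editorial sketch: the patch above throttles by sleeping inside the retry paths. A different approach, shown here only as an illustration, is to rate-limit at the logging call itself so that retries are not slowed down. The class `ThrottledLogger` and its parameters are hypothetical; the 10-second default merely mirrors HPMUD_BUG_SEC_THROTTLE from the patch.

```python
import time


class ThrottledLogger:
    """Emit at most one message per key every `interval` seconds."""

    def __init__(self, emit, interval=10.0, clock=time.monotonic):
        self._emit = emit          # callable that actually writes the message
        self._interval = interval
        self._clock = clock        # injectable so the behaviour can be tested
        self._last = {}            # key -> time of the last emitted message
        self._suppressed = {}      # key -> messages dropped since then

    def log(self, key, message):
        now = self._clock()
        last = self._last.get(key)
        if last is not None and now - last < self._interval:
            # Too soon after the previous message for this key: count and drop.
            self._suppressed[key] = self._suppressed.get(key, 0) + 1
            return
        dropped = self._suppressed.pop(key, 0)
        if dropped:
            message = "%s (%d similar messages suppressed)" % (message, dropped)
        self._last[key] = now
        self._emit(message)
```

Keying on the call site (for example the file and line number) keeps one noisy message from hiding unrelated ones. Unlike the sleep-based patch, this limits log volume only and leaves the retry loop running at full speed.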

The one extra piece of diagnostic is that a restart of CUPS is seemingly sufficient to fix the problem. Usually, this then means that the stuck print job is resubmitted to the printer and usually then completes OK.

At that point we get the following in syslog:

...
Aug  4 11:32:16 topaz Photosmart_2570_series?serial=MY65T112BM04DY: io/hpmud/musb.c 726: invalid deviceid wIndex=1, retrying wIndex=100: Device or resource busy
Aug  4 11:32:26 topaz kernel: usb 1-7: usbfs: process 11054 (hp) did not claim interface 1 before use
Aug  4 11:32:26 topaz Photosmart_2570_series?serial=MY65T112BM04DY: io/hpmud/musb.c 1024: bulk_write failed buf=0xbf988b3c size=512 len=-16: Device or resource busy
Aug  4 11:32:26 topaz Photosmart_2570_series?serial=MY65T112BM04DY: io/hpmud/musb.c 1386: unable to write data hp:/usb/Photosmart_2570_series?serial=MY65T112BM04DY: Device or resource busy
Aug  4 11:32:26 topaz kernel: usb 1-7: usbfs: process 10905 (hp) did not claim interface 1 before use
Aug  4 11:32:26 topaz Photosmart_2570_series?serial=MY65T112BM04DY: io/hpmud/musb.c 726: invalid deviceid wIndex=1, retrying wIndex=100: Device or resource busy
Aug  4 11:32:34 topaz sudo:     ejtr : TTY=pts/3 ; PWD=/home/ejtr ; USER=root ; COMMAND=/usr/local/bin/cupsrestart
Aug  4 11:32:35 topaz kernel: usblp0: removed
Aug  4 11:32:35 topaz hal_lpadmin: Running hal_lpadmin
Aug  4 11:32:36 topaz hal_lpadmin: hal_lpadmin triggered by usblp kernel module
Aug  4 11:32:36 topaz hal_lpadmin: Using device ID from HAL database entry
Aug  4 11:32:36 topaz hal_lpadmin: remove
Aug  4 11:32:36 topaz hal_lpadmin: Found configured printer: Photosmart_2570_series
Aug  4 11:36:32 topaz syslog-ng[2040]: Log statistics; processed='center(queued)=4035', processed='center(received)=1567', processed='destination(d_boot)=0', processed='destination(d_auth)=8', processed='destination(d_debug)=1372', processed='destination(d_cron)=40', processed='destination(d_mlal)=0', processed='destination(d_errors)=1151', processed='destination(d_mesg)=1138', processed='destination(d_smoothwall)=195', processed='destination(d_cons)=0', processed='destination(d_spol)=0', processed='destination(d_mail)=131', processed='source(s_remote)=195', processed='source(s_sys)=1372'

I note that F11 has a much more recent version of hplip, so I may well try backporting that to F10 to see whether the issue is resolved. At the moment, I presume the bug to be in hplip, but it may well be that the real problem is in the usb drivers in the kernel itself.

In Red Hat Bugzilla #515481, Tim (tim-redhat-bugs) wrote on 2009-08-04:

#18

What's happening here is that the usblp kernel module is getting unloaded by the 'hp' backend, but loaded again by some other process (probably by accessing /dev/usb/lp0).

Can you run 'ps axfw' and attach the output here, if it happens again?
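
Editorial sketch: alongside the 'ps axfw' output, it may help to record whether the usblp module is currently loaded. The sketch below reads /proc/modules, where the first field of each line is the module name; the function names `loaded_modules` and `usblp_loaded` are hypothetical.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def usblp_loaded(path="/proc/modules"):
    """Report whether the usblp kernel module is loaded right now."""
    with open(path) as handle:
        return "usblp" in loaded_modules(handle.read())
```

Calling `usblp_loaded()` periodically while a job is printing would show when the module comes back, though not which process caused it.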

In Red Hat Bugzilla #515481, Tim (tim-redhat-bugs) wrote on 2009-08-04:

#19

In fact, even without it happening it would be useful to see the output of 'lpstat -s'.

In Red Hat Bugzilla #515481, Mikkel (mikkel-redhat-bugs) wrote on 2009-08-04:

#20

I have a similar problem with a USB-connected HP PhotoSmart 257x, where the /var/log/messages file is quickly filled with

Jul 29 19:37:07 localhost Photosmart_2570_series?serial=MY6A4311RT04DY: io/hpmud/musb.c 725: invalid deviceid wIndex=1, retrying wIndex=100: Device or resource busy
Jul 29 19:37:07 localhost kernel: usb 2-1: usbfs: process 22388 (hp) did not claim interface 1 before use
Jul 29 19:37:07 localhost Photosmart_2570_series?serial=MY6A4311RT04DY: io/hpmud/musb.c 1022: bulk_write failed buf=0xbf9171fc size=8192 len=-16: Device or resource busy
Jul 29 19:37:07 localhost Photosmart_2570_series?serial=MY6A4311RT04DY: io/hpmud/musb.c 1384: unable to write data hp:/usb/Photosmart_2570_series?serial=MY6A4311RT04DY: Device or resource busy

I don't know exactly what triggers this, as the printer is apparently working fine both when printing and when scanning.

Installed versions:

hplip-2.8.12-6.fc10.i386
cups-1.3.10-5.fc10.i386

Bug 468272 may be related to this, and people have reported similar problems elsewhere:
https://bugs.launchpad.net/ubuntu/+source/sane-backends/+bug/313504
https://bugs.launchpad.net/hplip/+bug/341827

In Red Hat Bugzilla #515481, Ted (ted-redhat-bugs) wrote on 2009-08-05:

#21

FWIW:

$ lpstat -s
system default destination: Photosmart_2570_series
device for Cups-PDF: cups-pdf:/
device for Photosmart_2570_series: hp:/usb/Photosmart_2570_series?serial=MY65T112BM04DY

Meanwhile, since I've seen other reports that even the latest hplip might suffer from this, I've backported the current F11 hplip-3.9.2 rpm to F10, and added yet another bit of debug in the hpmud code, as per the patch below. With luck, this might provide a further clue as and when it happens again, albeit at the expense of creating a large debug log every day.

diff -uNr hplip-3.9.2-orig/io/hpmud/hpmudi.h hplip-3.9.2/io/hpmud/hpmudi.h
--- hplip-3.9.2-orig/io/hpmud/hpmudi.h	2009-02-20 00:36:44.000000000 +0000
+++ hplip-3.9.2/io/hpmud/hpmudi.h	2009-08-05 17:10:02.000000000 +0100
@@ -55,7 +55,7 @@
 #include "pp.h"
 #endif
 
-//#define HPMUD_DEBUG
+#define HPMUD_DEBUG 1
 
 #define _STRINGIZE(x) #x
 #define STRINGIZE(x) _STRINGIZE(x)
@@ -63,10 +63,10 @@
 #define BUG(args...) syslog(LOG_ERR, __FILE__ " " STRINGIZE(__LINE__) ": " args)
 
 #ifdef HPMUD_DEBUG
-   #define DBG(args...) syslog(LOG_INFO, __FILE__ " " STRINGIZE(__LINE__) ": " args)
+   #define DBG(args...) syslog(LOG_DEBUG, __FILE__ " " STRINGIZE(__LINE__) ": " args)
 //   #define DBG(args...) fprintf(stderr, __FILE__ " " STRINGIZE(__LINE__) ": " args)
    #define DBG_DUMP(data, size) sysdump((data), (size))
-   #define DBG_SZ(args...) syslog(LOG_INFO, args)
+   #define DBG_SZ(args...) syslog(LOG_DEBUG, args)
 #else
    #define DBG(args...)
    #define DBG_DUMP(data, size)

My cupsrestart script now also contains a ps invocation to help with the debugging as and when it trips up....

$ cat /usr/local/bin/cupsrestart 
#!/bin/sh

currentid=`id -un`

if [ $currentid = "root" ]; then
	echo Restarting CUPS Printer Services ...
	service cups status
	service cups condrestart
	service cups status
	sleep 5
else
	( lpstat -t ; echo "==================================" ; ps axfw ) | mail -s "User $currentid about to restart CUPS via $0" root
	sudo $0
	lpstat -t | mail -s "User $currentid restarted CUPS via $0" root
fi
$

In Red Hat Bugzilla #515481, Ted (ted-redhat-bugs) wrote on 2009-08-06:

#22

One small extra piece of diagnostic. Going back through the bits of history I have for this issue, I see that when the bug occurs, this message:

"reset high speed USB device"

always seems to occur exactly 30 seconds after this message:

"Found configured printer"

For instance:

...
Jun  6 13:50:21 topaz kernel: usblp0: removed
Jun  6 13:50:22 topaz hpijs: WARNING: color pen has low ink
Jun  6 13:50:22 topaz hpijs: STATE: +marker-supply-low-warning
Jun  6 13:50:52 topaz kernel: usb 1-7: reset high speed USB device using ehci_hcd and address 3
Jun  6 13:50:52 topaz kernel: usblp0: USB Bidirectional printer dev 3 if 1 alt 0 proto 2 vid 0x03F0 pid 0x4E11
Jun  6 13:50:52 topaz kernel: usb 1-7: usbfs: process 5839 (hp) did not claim interface 1 before use
...

...
Jul  9 22:15:33 topaz kernel: usblp0: removed
Jul  9 22:15:33 topaz hal_lpadmin: Running hal_lpadmin
Jul  9 22:15:34 topaz hal_lpadmin: hal_lpadmin triggered by usblp kernel module
Jul  9 22:15:34 topaz hal_lpadmin: Using device ID from HAL database entry
Jul  9 22:15:34 topaz hal_lpadmin: remove
Jul  9 22:15:34 topaz hal_lpadmin: Found configured printer: Photosmart_2570_series
Jul  9 22:16:04 topaz kernel: usb 1-7: reset high speed USB device using ehci_hcd and address 3
Jul  9 22:16:04 topaz kernel: usblp0: USB Bidirectional printer dev 3 if 1 alt 0 proto 2 vid 0x03F0 pid 0x4E11
Jul  9 22:16:04 topaz kernel: usb 1-7: usbfs: process 9692 (hp) did not claim interface 1 before use
Jul  9 22:16:04 topaz kernel: usb 1-7: usbfs: process 9517 (hp) did not claim interface 1 before use
...

...
Aug  4 11:30:55 topaz kernel: usblp0: removed
Aug  4 11:30:55 topaz hal_lpadmin: Running hal_lpadmin
Aug  4 11:30:56 topaz hal_lpadmin: hal_lpadmin triggered by usblp kernel module
Aug  4 11:30:56 topaz hal_lpadmin: Using device ID from HAL database entry
Aug  4 11:30:56 topaz hal_lpadmin: remove
Aug  4 11:30:56 topaz hal_lpadmin: Found configured printer: Photosmart_2570_series
Aug  4 11:31:26 topaz kernel: usb 1-7: reset high speed USB device using ehci_hcd and address 3
Aug  4 11:31:26 topaz kernel: usblp0: USB Bidirectional printer dev 3 if 1 alt 0 proto 2 vid 0x03F0 pid 0x4E11
Aug  4 11:31:26 topaz kernel: usb 1-7: usbfs: process 10998 (hp) did not claim interface 1 before use
...
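
Editorial sketch: the 30-second gap described above can be checked mechanically across a whole log. This is a minimal sketch assuming one message per line with the usual "Mon DD HH:MM:SS" prefix; `parse_ts` and `reset_delays` are hypothetical names, and the year is passed in because syslog timestamps omit it.

```python
from datetime import datetime


def parse_ts(line, year=2009):
    """Parse the 'Mon DD HH:MM:SS' prefix of a syslog line."""
    month, day, clock = line.split(None, 3)[:3]
    return datetime.strptime(
        "%d %s %s %s" % (year, month, day, clock), "%Y %b %d %H:%M:%S"
    )


def reset_delays(lines, before="Found configured printer",
                 after="reset high speed USB device", year=2009):
    """Return the seconds from each `before` line to the next `after` line."""
    delays = []
    start = None
    for line in lines:
        if before in line:
            start = parse_ts(line, year)
        elif after in line and start is not None:
            delays.append((parse_ts(line, year) - start).total_seconds())
            start = None
    return delays
```

The Jun 6 excerpt has no "Found configured printer" line, so a different `before` marker (for example "usblp0: removed") would be needed to cover it.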

In Red Hat Bugzilla #515481, Ted (ted-redhat-bugs) wrote on 2009-08-12:

#23

Created attachment 357176
Trimmed debug level syslog for a recent sample incident on 10th August

In Red Hat Bugzilla #515481, Ted (ted-redhat-bugs) wrote on 2009-08-12:

#24

Created attachment 357177
Trimmed xsession ( and hence printer-applet ) log for a recent sample incident on 10th August

In Red Hat Bugzilla #515481, Ted (ted-redhat-bugs) wrote on 2009-08-12:

#25

The recent log attachments for both xsession and syslog cover an incident from 10th August.

On this occasion, I had my debug-patch version of hplip-2.8.12 running, i.e.:

$ rpm -q --changelog hplip|more
* Wed Aug 05 2009 Ted Rule <ejtr@layer3.co.uk> 2.8.12-6c
- Add hpmud debug patch

* Tue Jul 14 2009 Ted Rule <ejtr@layer3.co.uk> 2.8.12-6b
- Add bug throttle patch

* Tue Jan 27 2009 Tim Waugh <twaugh@redhat.com> 2.8.12-6
- Only ship compressed PPD files.

where the extra debugging can be seen in the syslog trail.

At the same time, I tweaked gnome a bit to get some debugging out of the printer applet, as in:

$ diff -u /usr/share/system-config-printer/applet.py.orig /usr/share/system-config-printer/applet.py
--- /usr/share/system-config-printer/applet.py.orig	2009-03-25 18:55:26.000000000 +0000
+++ /usr/share/system-config-printer/applet.py	2009-08-07 12:11:18.000000000 +0100
@@ -251,6 +251,12 @@
         show_help ()
         sys.exit (1)
 
+    set_debugging (True)
+    if get_debugging () == False:
+        print >> sys.stderr, ("%s: unable to initialize debugging" %PROGRAM_NAME)
+    elif get_debugging () == True:
+        print >> sys.stderr, ("%s: able to initialize debugging" %PROGRAM_NAME)
+
     for opt, optarg in opts:
         if opt == "--help":
             show_help ()
$

$ diff -u /usr/share/system-config-printer/debug.py.orig /usr/share/system-config-printer/debug.py
--- /usr/share/system-config-printer/debug.py.orig	2009-03-25 18:55:26.000000000 +0000
+++ /usr/share/system-config-printer/debug.py	2009-08-07 12:13:39.000000000 +0100
@@ -19,12 +19,13 @@
 
 import sys
 import traceback
+import time
 
 _debug=False
 def debugprint (x):
     if _debug:
         try:
-            print >>sys.stderr, x
+            print >>sys.stderr, time.strftime("%b %d %Y %H:%M:%S"), x
         except:
             pass
 
$
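For anyone wanting the same timestamped debug output on a current Python, the idea in the debug.py tweak above can be sketched roughly as follows. This is only a sketch, not the system-config-printer code: it assumes Python 3, and the `stream` and `now` parameters are hypothetical additions so the function can be exercised without the real stderr or clock.

```python
import sys
import time

_debug = False


def set_debugging(enabled):
    """Turn debug output on or off."""
    global _debug
    _debug = enabled


def debugprint(message, stream=None, now=None):
    """Write message with a syslog-like timestamp prefix when debugging is on.

    stream and now are hypothetical parameters (not in the original file);
    they default to sys.stderr and the current local time.
    """
    if not _debug:
        return
    if stream is None:
        stream = sys.stderr
    if now is None:
        now = time.localtime()
    stamp = time.strftime("%b %d %Y %H:%M:%S", now)
    try:
        print(stamp, message, file=stream)
    except OSError:
        # Debug output should never take the applet down.
        pass
```

Using the same "%b %d ... %H:%M:%S" layout as syslog makes it easier to line up the applet's messages with /var/log entries.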

The incident happened when I tried to print a 5 page Email, and it was the first print Job after the initial boot-up of the system. As it happened, the printer had too little paper to complete the Job.

When the print Job "stalled", I had this in the process list:

[ejtr@topaz ~]$ ps axfu
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         2  0.0  0.0      0     0 ?        S<   17:25   0:00 [kthreadd]
root         3  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [migration/0]
root         4  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [ksoftirqd/0]
root         5  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [watchdog/0]
root         6  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [migration/1]
root         7  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [ksoftirqd/1]
root         8  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [watchdog/1]
root         9  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [events/0]
root        10  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [events/1]
root        11  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [khelper]
root        85  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kintegrityd/0]
root        86  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kintegrityd/1]
root        88  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kblockd/0]
root        89  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kblockd/1]
root        91  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kacpid]
root        92  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kacpi_notify]
root       174  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [cqueue]
root       178  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [ata/0]
root       179  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [ata/1]
root       180  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [ata_aux]
root       182  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [ksuspend_usbd]
root       187  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [khubd]
root       190  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kseriod]
root       240  0.0  0.0      0     0 ?        S    17:25   0:00  \_ [pdflush]
root       241  0.0  0.0      0     0 ?        S    17:25   0:00  \_ [pdflush]
root       242  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kswapd0]
root       290  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [aio/0]
root       291  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [aio/1]
root       485  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [scsi_eh_0]
root       488  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [scsi_eh_1]
root       607  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kpsmoused]
root       614  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kstriped]
root       617  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [ksnapd]
root       675  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kdmflush]
root       676  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kdmflush]
root       677  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kdmflush]
root       678  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kdmflush]
root       679  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kdmflush]
root       680  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kdmflush]
root       681  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kjournald]
root      1173  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [scsi_eh_2]
root      1174  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [usb-storage]
root      1176  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [scsi_eh_3]
root      1177  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [usb-storage]
root      1582  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kauditd]
root      1651  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kmpathd/0]
root      1652  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kmpathd/1]
root      1653  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kmpath_handlerd]
root      1681  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kjournald]
root      1682  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kjournald]
root      1683  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kjournald]
root      1684  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kjournald]
root      1685  0.0  0.0      0     0 ?        S<   17:25   0:00  \_ [kjournald]
root         1  0.0  0.0   2012   776 ?        Ss   17:25   0:00 /sbin/init
root       736  0.0  0.0   2588  1160 ?        S<s  17:25   0:00 /sbin/udevd -d
root      2019  0.0  0.0  12780   792 ?        S<sl 17:25   0:00 auditd
root      2021  0.0  0.0  12300   752 ?        S<sl 17:25   0:00  \_ /sbin/audispd
root      2046  0.0  0.0   3524  1224 ?        Ss   17:25   0:00 /sbin/syslog-ng -p /var/run/syslogd.pid
dbus      2067  0.0  0.0  13528  1556 ?        Ssl  17:25   0:00 dbus-daemon --system
root      2077  0.0  0.0   1884   584 ?        Ss   17:25   0:00 /usr/sbin/acpid
68        2085  0.0  0.2   6948  4508 ?        Ss   17:25   0:00 hald
root      2089  0.0  0.0   3548  1052 ?        S    17:25   0:00  \_ hald-runner
root      2248  0.0  0.0   3628   920 ?        S    17:25   0:00      \_ hald-addon-storage: polling /dev/sdb (every 2 sec)
root      2250  0.0  0.0   3628   924 ?        S    17:25   0:00      \_ hald-addon-storage: polling /dev/sr0 (every 2 sec)
68        2252  0.0  0.0   3188   940 ?        S    17:25   0:00      \_ hald-addon-acpi: listening on acpid socket /var/run/acpid.socket
root      2323  0.0  0.0   3624  1032 ?        S    17:25   0:00      \_ hald-addon-input: Listening on /dev/input/event4 /dev/input/event5 
root      2088  0.0  0.1   9468  2140 ?        Ssl  17:25   0:00 /usr/sbin/console-kit-daemon
root      2241  0.0  0.1  17304  2168 ?        Ssl  17:25   0:00 NetworkManager --pid-file=/var/run/NetworkManager/NetworkManager.pid
root      2331  0.0  0.0   2580  1096 ?        S    17:25   0:00  \_ /sbin/dhclient -d -sf /usr/libexec/nm-dhcp-client.action -pf /var/run/d
root      2256  0.0  0.0   5624  1196 ?        S    17:25   0:00 /usr/sbin/wpa_supplicant -c /etc/wpa_supplicant/wpa_supplicant.conf -u -f /
root      2262  0.0  0.1   8136  3632 ?        S    17:25   0:00 /usr/sbin/nm-system-settings --config /etc/NetworkManager/nm-system-setting
root      2263  0.0  0.0   6200  1096 ?        Ssl  17:25   0:00 automount
root      2281  0.0  1.4  63944 30944 ?        Ssl  17:25   0:01 /usr/bin/python -E /usr/sbin/setroubleshootd
root      2299  0.0  0.0   7252  1024 ?        Ss   17:25   0:00 /usr/sbin/sshd
root      2307  0.0  0.0   2720   860 ?        Ss   17:25   0:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
ntp       2315  0.0  0.0   4696  1396 ?        Ss   17:25   0:00 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
root      2335  0.0  1.4  35360 29476 ?        Ss   17:25   0:01 /usr/bin/spamd -d -c -m5 -H -r /var/run/spamd.pid                    
root      2664  0.0  1.3  35360 27072 ?        S    17:25   0:00  \_ spamd child                                                          
root      2665  0.0  1.3  35360 27072 ?        S    17:25   0:00  \_ spamd child                                                          
root      2370  0.0  0.0   2120   380 ?        Ss   17:25   0:00 /usr/sbin/gpm -m /dev/input/mice -t exps2
root      2383  0.0  0.0   5268  1220 ?        Ss   17:25   0:00 crond
root      2406  0.0  0.0   9888  1120 ?        Ss   17:25   0:00 kerneloops
smmsp     2434  0.0  0.0   8808  1480 ?        Ss   17:25   0:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
root      2436  0.0  0.0   9092  1700 ?        Ss   17:25   0:00 sendmail: accepting connections   
root      2440  0.0  0.1  17092  2628 ?        Ss   17:25   0:00 smbd -D
root      2459  0.0  0.0  17092  1080 ?        S    17:25   0:00  \_ smbd -D
root      2451  0.0  0.0   2236   332 ?        Ss   17:25   0:00 /usr/sbin/atd
avahi     2461  0.0  0.0   2880  1404 ?        Ss   17:25   0:00 avahi-daemon: running [topaz.local]
avahi     2462  0.0  0.0   2880   324 ?        Ss   17:25   0:00  \_ avahi-daemon: chroot helper
root      2471  0.0  0.1  10636  3060 ?        Ss   17:25   0:00 cupsd
lp        3584  0.0  0.3  10908  6272 ?        S    17:34   0:00  \_ /usr/bin/perl /usr/lib/cups/filter/foomatic-rip 273 ejtr evolution job 
lp        3589  0.0  0.2  10908  4620 ?        S    17:34   0:00  |   \_ /usr/bin/perl /usr/lib/cups/filter/foomatic-rip 273 ejtr evolution 
lp        3590  0.0  0.1  10908  4060 ?        S    17:34   0:00  |       \_ /usr/bin/perl /usr/lib/cups/filter/foomatic-rip 273 ejtr evolut
lp        3591  0.0  1.4  55192 30384 ?        S    17:34   0:00  |       \_ gs -sstdout=%stderr -dBATCH -dPARANOIDSAFER -dQUIET -dNOPAUSE -
lp        3599  0.0  0.0   7596  1480 ?        S    17:34   0:00  |           \_ hpijs
lp        3585  0.0  0.0  15872  1252 ?        S    17:34   0:00  \_ hp:/usb/Photosmart_2570_series?serial=MY65T112BM04DY 273 ejtr evolution
root      2482  0.0  0.0   1884   528 ?        SNs  17:25   0:00 anacron -s
root      2495  0.0  0.0   3520   520 ?        S    17:25   0:00 /usr/sbin/smartd -q never
root      2498  0.0  0.0   7232  2052 ?        Ss   17:25   0:00 /usr/sbin/gdm-binary -nodaemon
root      2565  0.0  0.1   7604  2672 ?        S    17:25   0:00  \_ /usr/libexec/gdm-simple-slave --display-id /org/gnome/DisplayManager/Di
root      2566  4.1  2.2  56688 46028 tty1     Ss+  17:25   1:15      \_ /usr/bin/Xorg :0 -nr -verbose -auth /var/run/gdm/auth-for-gdm-TgoeQ
root      2667  0.0  0.1   6160  2784 ?        S    17:25   0:00      \_ /usr/libexec/gdm-session-worker
ejtr      2707  0.0  0.3  31804  7076 ?        Ssl  17:25   0:00          \_ gnome-session
ejtr      2983  0.0  0.1  15552  2920 ?        S    17:25   0:00              \_ /usr/lib/gnome-session/helpers/gnome-keyring-daemon-wrapper
ejtr      2995  0.0  0.6  26532 13724 ?        S    17:25   0:01              \_ metacity
ejtr      2997  0.0  0.8  56696 17156 ?        S    17:25   0:00              \_ gnome-panel
ejtr      2998  0.1  1.7 109992 36020 ?        S    17:25   0:03              \_ nautilus --no-desktop --browser
ejtr      3001  0.0  0.5  70272 11656 ?        S    17:25   0:00              \_ nm-applet --sm-disable
ejtr      3004  0.0  0.5  29520 11272 ?        S    17:25   0:00              \_ gnome-power-manager
ejtr      3008  0.0  0.3  20456  7124 ?        S    17:25   0:00              \_ imsettings-applet --disable-xsettings
ejtr      3010  0.0  0.3  20256  6572 ?        S    17:25   0:00              \_ bluetooth-applet
ejtr      3011  0.0  0.5  46280 10452 ?        S    17:25   0:00              \_ gpk-update-icon
ejtr      3012  0.1  1.0  60620 21028 ?        S    17:25   0:01              \_ python /usr/share/system-config-printer/applet.py
ejtr      3016  0.0  0.2  18856  5596 ?        S    17:25   0:00              \_ kerneloops-applet
root      2499  0.0  0.0   1872   456 tty4     Ss+  17:25   0:00 /sbin/mingetty tty4
root      2500  0.0  0.0   1872   456 tty5     Ss+  17:25   0:00 /sbin/mingetty tty5
root      2501  0.0  0.0   1872   456 tty2     Ss+  17:25   0:00 /sbin/mingetty tty2
root      2502  0.0  0.0   1872   452 tty3     Ss+  17:25   0:00 /sbin/mingetty tty3
root      2503  0.0  0.0   1872   452 tty6     Ss+  17:25   0:00 /sbin/mingetty tty6
gdm       2587  0.0  0.0   3156   544 ?        S    17:25   0:00 /usr/bin/dbus-launch --exit-with-session
ejtr      2705  0.0  0.1  26056  2224 ?        S    17:25   0:00 /usr/bin/gnome-keyring-daemon -d --login
ejtr      2765  0.0  0.0   3156   548 ?        S    17:25   0:00 dbus-launch --sh-syntax --exit-with-session
ejtr      2766  0.0  0.0  13528  1444 ?        Ssl  17:25   0:00 /bin/dbus-daemon --fork --print-pid 7 --print-address 9 --session
ejtr      2809  0.0  0.0   5528  2044 ?        S    17:25   0:00 /usr/libexec/im-settings-daemon
ejtr      2811  0.0  0.1   4788  2592 ?        S    17:25   0:00 /usr/libexec/im-info-daemon
ejtr      2813  0.0  0.0   6572  1964 ?        S    17:25   0:00 /usr/libexec/gvfsd
ejtr      2883  0.0  0.1  35652  2164 ?        Ssl  17:25   0:00 /usr/libexec//gvfs-fuse-daemon /home/ejtr/.gvfs
ejtr      2941  0.0  0.0   4956  2036 ?        S    17:25   0:00 /usr/libexec/gconf-im-settings-daemon
ejtr      2943  0.1  0.2   9380  5108 ?        S    17:25   0:03 /usr/libexec/gconfd-2
ejtr      2988  0.0  0.4  40672 10212 ?        Ssl  17:25   0:00 /usr/libexec/gnome-settings-daemon
ejtr      2996  0.0  0.1  28892  3184 ?        Ss   17:25   0:01 gnome-screensaver
ejtr      3000  0.0  0.1  40796  3100 ?        Ssl  17:25   0:00 /usr/libexec/bonobo-activation-server --ac-activate --ior-output-fd=17
ejtr      3007  0.0  0.6  51244 14284 ?        S    17:25   0:00 /usr/libexec/notification-daemon
ejtr      3024  0.0  0.2  98064  4488 ?        Ssl  17:25   0:00 /usr/bin/pulseaudio --start
ejtr      3032  0.0  0.0   7912  2044 ?        S    17:25   0:00  \_ /usr/libexec/pulse/gconf-helper
ejtr      3046  0.0  0.1   7056  2660 ?        S    17:25   0:00 /usr/libexec/gvfs-hal-volume-monitor
ejtr      3048  0.0  0.0   6752  1952 ?        S    17:25   0:00 /usr/libexec/gvfs-gphoto2-volume-monitor
ejtr      3050  0.1  0.7  55884 16336 ?        S    17:25   0:02 /usr/libexec/wnck-applet --oaf-activate-iid=OAFIID:GNOME_Wncklet_Factory --
ejtr      3052  0.0  0.4  50912  9656 ?        S    17:25   0:00 /usr/libexec/trashapplet --oaf-activate-iid=OAFIID:GNOME_Panel_TrashApplet_
ejtr      3055  0.0  0.1  17072  2544 ?        S    17:25   0:00 /usr/libexec/gvfsd-trash --spawner :1.4 /org/gtk/gvfs/exec_spaw/0
ejtr      3061  0.0  0.0   6540  1960 ?        S    17:25   0:00 /usr/libexec/gvfsd-burn --spawner :1.4 /org/gtk/gvfs/exec_spaw/2
ejtr      3069  0.0  0.7  77192 15572 ?        Sl   17:25   0:00 /usr/libexec/mixer_applet2 --oaf-activate-iid=OAFIID:GNOME_MixerApplet_Fact
ejtr      3071  0.0  0.7  37300 15408 ?        S    17:25   0:00 /usr/libexec/clock-applet --oaf-activate-iid=OAFIID:GNOME_ClockApplet_Facto
ejtr      3075  0.0  0.6  52860 13532 ?        S    17:25   0:00 /usr/libexec/gdm-user-switch-applet --oaf-activate-iid=OAFIID:GNOME_FastUse
ejtr      3080  0.0  0.3  26792  8112 ?        S    17:25   0:00 /usr/libexec/notification-area-applet --oaf-activate-iid=OAFIID:GNOME_Notif
ejtr      3096  0.0  0.0   4812  1164 ?        S    17:25   0:00 /bin/sh /usr/lib/firefox-3.0.13/run-mozilla.sh /usr/lib/firefox-3.0.13/fire
ejtr      3115  1.3  4.9 260312 102892 ?       Sl   17:25   0:24  \_ /usr/lib/firefox-3.0.13/firefox
ejtr      3103  4.5  4.6 338276 96268 ?        Sl   17:25   1:20 evolution
ejtr      3123  0.0  0.5  89008 12104 ?        Sl   17:25   0:00 /usr/libexec/evolution-data-server-2.24 --oaf-activate-iid=OAFIID:GNOME_Evo
ejtr      3161  0.2  1.5 123612 32656 ?        Sl   17:26   0:05 gnome-terminal --geometry 140x45
ejtr      3172  0.0  0.0   2844   624 ?        S    17:26   0:00  \_ gnome-pty-helper
ejtr      3173  0.0  0.0   5004  1624 pts/0    Ss   17:26   0:00  \_ bash
ejtr      4009  0.0  0.0   5004  1612 pts/3    Ss+  17:37   0:00  \_ bash
ejtr      4104  0.0  0.0   5004  1640 pts/4    Ss   17:39   0:00  \_ bash
ejtr      4583  0.0  0.0   4636   956 pts/4    R+   17:55   0:00      \_ ps axfu
ejtr      3189  0.0  0.5  69620 11368 ?        Sl   17:26   0:00 /usr/libexec/evolution/2.24/evolution-alarm-notify --oaf-activate-iid=OAFII
...
$

In particular, of course, I had this:

...
root      2471  0.0  0.1  10636  3060 ?        Ss   17:25   0:00 cupsd
lp        3584  0.0  0.3  10908  6272 ?        S    17:34   0:00  \_ /usr/bin/perl /usr/lib/cups/filter/foomatic-rip 273 ejtr evolution job 
lp        3589  0.0  0.2  10908  4620 ?        S    17:34   0:00  |   \_ /usr/bin/perl /usr/lib/cups/filter/foomatic-rip 273 ejtr evolution 
lp        3590  0.0  0.1  10908  4060 ?        S    17:34   0:00  |       \_ /usr/bin/perl /usr/lib/cups/filter/foomatic-rip 273 ejtr evolut
lp        3591  0.0  1.4  55192 30384 ?        S    17:34   0:00  |       \_ gs -sstdout=%stderr -dBATCH -dPARANOIDSAFER -dQUIET -dNOPAUSE -
lp        3599  0.0  0.0   7596  1480 ?        S    17:34   0:00  |           \_ hpijs
lp        3585  0.0  0.0  15872  1252 ?        S    17:34   0:00  \_ hp:/usb/Photosmart_2570_series?serial=MY65T112BM04DY 273 ejtr evolution
...

/var/log/debug shows this as the first sign of the error:

...
1981 Aug 10 17:34:37 topaz kernel: usb 1-7: usbfs: process 3793 (hp) did not claim interface 1 before use
...

preceded by this:
...
1976 Aug 10 17:34:37 topaz kernel: usb 1-7: reset high speed USB device using ehci_hcd and address 3
...
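
To pick these two messages out of a large syslog without reading it by hand, something along these lines might help. It is a rough sketch that assumes the kernel message wording shown above; the function name and regular expressions are my own and not part of hplip or any existing tool.

```python
import re

# Patterns modelled on the two kernel messages quoted above.
RESET_RE = re.compile(r"kernel: usb (?P<port>[\d.-]+): reset .* USB device")
CLAIM_RE = re.compile(
    r"kernel: usb (?P<port>[\d.-]+): usbfs: process (?P<pid>\d+) "
    r"\((?P<name>[^)]+)\) did not claim interface (?P<iface>\d+) before use"
)


def find_unclaimed_after_reset(lines):
    """Yield (reset_line, claim_line) for the first claim error after each reset.

    Only the first "did not claim interface" line following a reset on the
    same USB port is reported, so a flood of repeats gives one pair.
    """
    last_reset = {}
    for line in lines:
        reset = RESET_RE.search(line)
        if reset:
            last_reset[reset.group("port")] = line
            continue
        claim = CLAIM_RE.search(line)
        if claim and claim.group("port") in last_reset:
            yield last_reset.pop(claim.group("port")), line
```

Feeding it the lines of /var/log/debug should give one pair per incident, which makes it quicker to check whether every hang is preceded by a USB reset.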

We note that the printer-applet seems to have this in its debug:

Aug 10 2009 17:34:06 <monitor.Monitor instance at 0xa021d4c>: printer `Photosmart_2570_series' has event `printer-state-changed'

but nothing thereafter.... which seems wrong.

Once I had gathered as much debug as I could, such as the process list, I restarted CUPS. Thereafter I got more debug; interestingly, the printer applet reported "Out of Paper" as soon as CUPS restarted. This can be seen in the xsession debug by tallying the timestamps with syslog.

Maybe this means that I can provoke the error if I print a multi-page document, but fill the printer with less paper than is needed to finish the Job? If I can guarantee to provoke the issue with a known set of conditions, we'd be much closer to fixing the issue.

In Red Hat Bugzilla #515481, Ted (ted-redhat-bugs) wrote on 2009-08-24:

#26

I note from the HPLIP website that version 3.9.4 includes this intriguing comment in the Release Notes:

"Moved the hpmud_open_device() call in hp.c to after the first read from hpcups or hpijs."

Given my suspicion that this problem may have something to do with some sort of race condition between CUPS and the Desktop Printer Applet, and bidirectional access to the printer in general, I wonder if this change in 3.9.4 may be significant.
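
To illustrate why the ordering might matter, here is a rough sketch of the pattern the release note describes: the device is opened only after the first read from the filter has returned data. This is not HPLIP's C code. The `opener` callable and the `send_job` name are hypothetical, and the sketch only shows the ordering, not real USB access.

```python
def send_job(job_stream, opener, chunk_size=4096):
    """Copy job_stream to a device that is opened lazily.

    opener is a hypothetical callable returning an object with write() and
    close(). It is not called until the first read has produced data, so an
    empty or stalled job never holds the device open.
    Returns the number of bytes written.
    """
    device = None
    written = 0
    try:
        while True:
            chunk = job_stream.read(chunk_size)
            if not chunk:
                break
            if device is None:
                # Deferred open: the filter has produced data by now.
                device = opener()
            device.write(chunk)
            written += len(chunk)
    finally:
        if device is not None:
            device.close()
    return written
```

If the backend opens the printer before the filter chain has produced anything, the device is held for however long rendering takes, which widens the window in which another process could touch the same printer. Whether that is what actually bites here is still only my suspicion.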

F10 currently has a 2.8.12 hplip RPM, whilst F11 uses 3.9.2. The latest RPM in the F12 tree seems to be 3.9.8, so it may be possible to fix this with a backport of that RPM.

In Red Hat Bugzilla #515481, Ted (ted-redhat-bugs) wrote on 2009-08-27:

#27

Given the comments on the HPLIP site, I've rebuilt a copy of the current F12 hplip 3.9.8 RPM on F10, adding my previously mentioned debug and throttle patches for good measure, as in:

$ rpm -q --changelog hplip| head
* Mon Aug 24 2009 Ted Rule <email address hidden> 3.9.8-6c
- Add hpmud debug patch

* Mon Aug 24 2009 Ted Rule <email address hidden> 3.9.8-6b
- Add bug throttle patch

* Wed Aug 19 2009 Tim Waugh <email address hidden> 3.9.8-6
- Make sure to avoid handwritten asm.
- Don't use obsolete configure options.
$

The F12 SRPM appears to rebuild without modification on F10, BTW.

This all seems to work Ok. No hangs so far, but then I haven't been using it for very long...

Meanwhile, the backport fails on a couple of issues, which I've had to manually work round.

Firstly, the SELinux policy for hplip-3.9.8 seems to need an additional permission so that hplip can read /var/lib/hp/hplip.state. I would imagine that this permission already exists in F12's current selinux-policy.

$ cat selinuxpolicy/localhplip.te

module localhplip 1.0.0;

########################################
#
# Declarations
#

require {
type hplip_t;
type var_lib_t;
type file_t;
class file { read_file_perms };
}

# Grant hplip permission to read /var/lib/hp/hplip.state
# Aug 26 10:29:08 workstation setroubleshoot: SELinux is preventing python (hplip_t) "read" to ./hplip.state (var_lib_t). For complete SELinux messages. run sealert -l fbafa21b-e3f1-4ec5-b4be-51cca204777d
allow hplip_t var_lib_t:file { read_file_perms };
auditallow hplip_t var_lib_t:file { read_file_perms };
$

Secondly, a variant of bugzilla 424331 becomes a problem:

https://bugzilla.redhat.com/show_bug.cgi?id=424331

/dev/bus/usb ends up with a missing group permission. Presumably hplip version 2 set this itself somehow? Anyhow, F12 seems to fix this by making the whole of /dev/bus/usb chmod 664. I've limited the change by making the g+w permission only apply to HP branded devices with this tweak to udev:

[ejtr@topaz ~]$ grep usb /lib/udev/rules.d/50-udev-default.rules
...
# libusb device nodes
SUBSYSTEM=="usb", ACTION=="add", ENV{DEVTYPE}=="usb_device", ATTRS{idVendor}=="03f0", NAME="bus/usb/$env{BUSNUM}/$env{DEVNUM}", GROUP="lp", MODE="0664"
SUBSYSTEM=="usb", ACTION=="add", ENV{DEVTYPE}=="usb_device", ATTRS{idVendor}!="03f0", NAME="bus/usb/$env{BUSNUM}/$env{DEVNUM}", MODE="0644"
SUBSYSTEM=="usb", KERNEL=="lp*", NAME="usb/%k", SYMLINK+="usb%k", GROUP="lp"
...
$
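
A quick way to confirm that the rule took effect on a given device node is to look at its mode bits. The helper below is a small sketch; the function names are my own, and any /dev/bus/usb path passed to it would depend on where the printer enumerates.

```python
import os
import stat


def group_can_write(path):
    """Return True if the file's mode grants write permission to its group."""
    return bool(os.stat(path).st_mode & stat.S_IWGRP)


def mode_string(path):
    """Return the permission bits as a four-digit octal string, e.g. '0664'."""
    return "%04o" % stat.S_IMODE(os.stat(path).st_mode)
```

With the udev tweak above, the HP device node should report 0664 and be group-writable, while other USB device nodes should stay at 0644.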

In Red Hat Bugzilla #515481, Ted (ted-redhat-bugs) wrote on 2009-09-16:

#28

Created attachment 361219
syslog snippet for Sep 15 printer hang

In Red Hat Bugzilla #515481, Ted (ted-redhat-bugs) wrote on 2009-09-16:

#29

Created attachment 361220
lpstat / process status listing for Sep 15 Printer hang.

In Red Hat Bugzilla #515481, Ted (ted-redhat-bugs) wrote on 2009-09-16:

#30

We had gone a month or so without a printer hang, and in particular had seen none since I manually upgraded to hplip-3.9.8 backported from F12, but an incident yesterday required a CUPS restart.

The incident timeline started at around 19:00 BST yesterday. Print Jobs had been submitted, but nothing seemed to be happening.

lpstat showed that the printer had been in a disabled state since Sep 12th, though no printing had been attempted at that time.

My first action was simply to "cupsenable" the printer, which initially appeared to start the queued print Jobs flowing again.

However, after printing the first page of the first queued job, the printer apparently hung. Looking at the syslog at that point, I noted that I could see the familiar "did not claim interface 1 before use" message repeating every 10 seconds.
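
Since a message repeating like this is exactly what can fill a log file, here is a rough illustration of the general throttling idea. It is not the throttle patch in my RPM, which is C code and is not shown in this thread; the class and method names are hypothetical.

```python
import time


class LogThrottle:
    """Suppress repeats of the same message within `interval` seconds."""

    def __init__(self, interval=60.0, clock=time.monotonic):
        self.interval = interval
        self.clock = clock      # injectable so the class can be tested
        self._last = {}         # message -> time it was last emitted
        self._dropped = {}      # message -> repeats suppressed since then

    def should_log(self, message):
        """Return True if message should be written now."""
        now = self.clock()
        last = self._last.get(message)
        if last is not None and now - last < self.interval:
            self._dropped[message] = self._dropped.get(message, 0) + 1
            return False
        self._last[message] = now
        return True

    def take_dropped(self, message):
        """Return and reset the count of suppressed repeats for message."""
        return self._dropped.pop(message, 0)
```

A caller would check `should_log()` before each write and could report `take_dropped()` as a "repeated N times" summary when the message is next emitted.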

I therefore restarted CUPS using my /usr/local/bin/cupsrestart script, which duly dumped an lpstat/ps listing to Email as attached, before restarting CUPS. Once CUPS restarted, the outstanding print Jobs all printed successfully to completion.

The debug/status messages which I managed to capture at that time are attached.

Sadly, I failed to note the exact time at which I ran cupsenable, or when exactly the printer had entered the disabled state. Judging from the logs, the cupsenable action corresponds to this message:

...
Sep 15 19:31:06 topaz Photosmart_2570_series?serial=MY65T112BM04DY: io/hpmud/hpmud.c 348: [15869] hpmud_init()
...

Seemingly things started to go wrong just after this message:

...
Sep 15 19:31:38 topaz kernel: usb 1-7: reset high speed USB device using ehci_hcd and address 3
Sep 15 19:31:38 topaz kernel: usblp0: USB Bidirectional printer dev 3 if 1 alt 0 proto 2 vid 0x03F0 pid 0x4E11
...

And the cupsrestart was finally performed here:

...
Sep 15 19:35:37 topaz sudo: anne : TTY=pts/1 ; PWD=/home/anne ; USER=root ; COMMAND=/usr/local/bin/cupsrestart
...
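
For tallying a timeline like this, the gaps between the three syslog timestamps can be worked out with a few lines of Python. This is a small sketch with helper names of my own; it assumes the usual "Mon DD HH:MM:SS" syslog prefix and takes the year as a parameter, since syslog does not record it.

```python
from datetime import datetime


def syslog_time(line, year=2009):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a syslog line."""
    stamp = " ".join(line.split()[:3])
    return datetime.strptime("%d %s" % (year, stamp), "%Y %b %d %H:%M:%S")


def seconds_between(earlier, later, year=2009):
    """Return the whole seconds elapsed between two syslog lines."""
    delta = syslog_time(later, year) - syslog_time(earlier, year)
    return int(delta.total_seconds())


if __name__ == "__main__":
    # Timestamps of the three messages quoted above.
    init, reset, restart = "Sep 15 19:31:06", "Sep 15 19:31:38", "Sep 15 19:35:37"
    print("hpmud_init to USB reset:", seconds_between(init, reset), "s")
    print("USB reset to cupsrestart:", seconds_between(reset, restart), "s")
```

On these figures the USB reset came 32 seconds after the hpmud_init() message, and the cupsrestart just under four minutes after the reset.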

Remember, as ever, that the copy of hplip I have running is a backported copy of 3.9.8 from F12 with additional debugging enabled via "#define HPMUD_DEBUG 1" in io/hpmud/hpmudi.h.

Last night's incident was slightly more unusual in that the printer subsystem had already ended up in a disabled state, but the behaviour of the system after I had invoked cupsenable seems to be identical to previous incidents.