hplip filled user.log with 2.2Gb of log on error in 10 minutes

Bug #341827 reported by zebul666 on 2009-03-12
28
This bug affects 4 people
Affects Status Importance Assigned to Milestone
HPLIP
Undecided
Unassigned
Baltix
Undecided
Unassigned
Fedora
Won't Fix
Low
hplip (Debian)
Confirmed
Unknown

Bug Description

i tried to print on my photosmart C3180
and it didn't work.
then i wondered why my hdd was making that noise. then i found that syslog-ng was using almost 100% of my cpu
and then found that user.log was 2.2Gb in size and growing.

/var/log/user.log is full of lines like that
Mar 12 15:39:21 soho Photosmart_C3100_series?serial=MY716D32F804P9: io/hpmud/musb.c 1022: bulk_write failed buf=0xbffdf15c size=8192 len=-19: No data available
Mar 12 15:39:21 soho Photosmart_C3100_series?serial=MY716D32F804P9: io/hpmud/musb.c 1384: unable to write data hp:/usb/Photosmart_C3100_series?serial=MY716D32F804P9: No data available
Mar 12 15:39:21 soho Photosmart_C3100_series?serial=MY716D32F804P9: io/hpmud/musb.c 725: invalid deviceid wIndex=1, retrying wIndex=100: No such device
Mar 12 15:39:21 soho Photosmart_C3100_series?serial=MY716D32F804P9: io/hpmud/musb.c 734: invalid deviceid retry ret=-19: No such device
Mar 12 15:39:21 soho Photosmart_C3100_series?serial=MY716D32F804P9: io/hpmud/musb.c 777: invalid device_status: No such device
[...]
and the same is repeated 3070670 times !

it has grown of 1.6Gb of data in 10 minutes. i guess my harddrive would have been full if i had leave at that moment.
so the only way i found to stop this mess is to reboot.
I couldn't unload any module because there is none. how do i had stop this mess in a proper way without rebooting ?

it is one thing that my printer dies (to be confirmed). but the hplip driver has _not_ to fill my drive with its log!

is my printer dead ?

zebul666 (zebul666) wrote :
zebul666 (zebul666) wrote :

no it is not dead. i had to remove any document in cups queue so that my printer began again to print.

is only one document could have cause that ? the acrobat reader plugin ?

David Suffield (david-suffield) wrote :

What kind of system are you running? Distro?

Is there more syslog info? I see USB IO (musb.c) messages but no up-stream caller messages. There are no loops at this level of the code.

Based on the error messages something hung your printer...

zebul666 (zebul666) wrote :

it's linux 32 bit. distro is archlinux (current of the date above)

i got in user.log the lines above repeated a big number of times.
there is the same errors line in /var/log/errors.log

in everything.log, i got those lines just before the errors given above, i switch on the printer at 15:38
Mar 12 15:38:37 soho kernel: usb 1-4: new high speed USB device using ehci_hcd and address 5
Mar 12 15:38:37 soho kernel: usb 1-4: configuration #1 chosen from 1 choice
Mar 12 15:38:37 soho kernel: usblp0: USB Bidirectional printer dev 5 if 1 alt 0 proto 2 vid 0x03F0 pid 0x5611
Mar 12 15:38:37 soho kernel: usbcore: registered new interface driver usblp
Mar 12 15:38:37 soho load-modules.sh: 'usb:v03F0p5611d0100dc00dsc00dp00icFFiscCCip00' is not a valid module or alias name
Mar 12 15:38:37 soho kernel: Initializing USB Mass Storage driver...
Mar 12 15:38:37 soho kernel: scsi4 : SCSI emulation for USB Mass Storage devices
Mar 12 15:38:37 soho kernel: usb-storage: device found at 5
Mar 12 15:38:37 soho kernel: usbcore: registered new interface driver usb-storage
Mar 12 15:38:37 soho kernel: USB Mass Storage support registered.
Mar 12 15:38:37 soho kernel: usb-storage: waiting for device to settle before scanning
Mar 12 15:38:37 soho load-modules.sh: 'usb:v03F0p5611d0100dc00dsc00dp00icFFiscFFipFF' is not a valid module or alias name
Mar 12 15:38:42 soho kernel: scsi 4:0:0:0: Direct-Access HP Photosmart C3180 1.00 PQ: 0 ANSI: 2
Mar 12 15:38:42 soho kernel: sd 4:0:0:0: [sdb] Attached SCSI removable disk
Mar 12 15:38:42 soho kernel: sd 4:0:0:0: Attached scsi generic sg2 type 0
Mar 12 15:38:42 soho kernel: usb-storage: device scan complete
Mar 12 15:39:19 soho kernel: usblp0: removed
Mar 12 15:39:21 soho kernel: usb 1-4: USB disconnect, address 5

what's strange, it seems the printer disapeared ?
could it be cause by hardware ?

so i need to lower the info level of syslog-ng for hplip ?

David Suffield (david-suffield) wrote :

I have never used archlinux, but your kernel did disconnect the device from the USB bus for some reason after initial connection. Linux kernel USB support can be a problem with some no-name PCs.

Try the following in order.

1. Remove any hubs, plug directly into PC.
2. Try a certified HIGH-SPEED or USB 2.0 cable.
3. Try a different USB port on the PC.
4. Try running FULL-SPEED or USB 1.1 (ie: BIOS setting or USB 1.1 hub).
5. Try a different PC or distro.

-dave

Are you still having a problem with this?

Thanks.

Aaron

Changed in hplip:
assignee: nobody → kalosaurusrex
status: New → Triaged
zebul666 (zebul666) wrote :

it's not a no-name PC, it's a DELL inspiron 531 ;-)
it is a usb 2.0 cable (certified high-speed i think)

this has happened only once.
but that's a problem that my log grows like that. in fact they were 3 logs that grows to 2.2GB : user.log, everything.log and errors.log that makes 6.6GB ! i know we have big hdd now but ... it was 2.2GB only because i stop this mess by rebooting

do you consider it's me that must configure syslog (if it's possible) so that such a thing never happen again ?
i think something must be done to avoid the repeat of the same message over and over.
something like that exist in kernel message (where you can see (repeated 6 times) after some message)

how do you properly stop such a thing to happen (i.e. stop the log to grow ) except by rebooting ???

zebul666 (zebul666) wrote :

and archlinux used almost vanilla kernel. it is not over patched like big distro used to do.

Ariel Faigon (ariel.faigon) wrote :

Seeing the same problem here: Ubuntu 8.10 (Intrepid). DIfferent HP printer.

The log spewing is a big problem. When it happens it can quickly fill the disk.

The printer is alive and well, but the software thinks it isn't and goes out of its way to make the problem much worse.

To add to the problem logcheck starts failing because syslog and user.log are filling up so fast it can never catch up. logcheck calls logtail which calls grep and this grep (with many regexps) is running more slowly than the logs get added to!

Here's what's repeated (hundreds per second) in the logs:

Apr 14 18:55:16 ze DeskJet_970C?serial=SG99E1V1SMJQ: io/hpmud/musb.c 688: invalid deviceid retry ret=-19: No such device
Apr 14 18:55:16 ze DeskJet_970C?serial=SG99E1V1SMJQ: io/hpmud/musb.c 731: invalid device_status: No such device
Apr 14 18:55:16 ze DeskJet_970C?serial=SG99E1V1SMJQ: io/hpmud/musb.c 976: bulk_write failed buf=0xbfb5dadc size=4736 len=-19: No such device
Apr 14 18:55:16 ze DeskJet_970C?serial=SG99E1V1SMJQ: io/hpmud/musb.c 1338: unable to write data hp:/usb/DeskJet_970C?serial=SG99E1V1SMJQ: No such device
Apr 14 18:55:16 ze DeskJet_970C?serial=SG99E1V1SMJQ: io/hpmud/musb.c 679: invalid deviceid wIndex=0, retrying wIndex=0: No such device

[the above 5 lines repeat at extremely high rate killing system performance and filling up the disk]

Thanks for the attention to this bug.

Based on the syslog errors you are still getting a USB device disconnect during the print job. Assuming you have not tried a different PC this is USB kernel issue with your PC.

I can't explain the syslog filling up. There should be a 30 second loop error message in the syslog, but there is none. We will try and reproduce the syslog failure.

maximi89 (maximi89) wrote :

i'm having the same problem, i think it is a problem with cups.
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=525361
https://bugs.launchpad.net/bugs/344592

Download full text (8.3 KiB)

Description of problem:

Print jobs occasionally fail to print to a USB-connected HP PhotoSmart 2575,
requiring a cupsrestart to complete the job.

Versions of various components which have exhibited the problem:

$ uname -a
Linux topaz.bugfinder.co.uk 2.6.27.25-170.2.72.fc10.i686 #1 SMP Sun Jun 21 19:03:24 EDT 2009 i686 i686 i386 GNU/Linux

$ rpm -q cups glibc hal hplip hpijs foomatic
cups-1.3.10-5.fc10.i386
glibc-2.9-3.i686
hal-0.5.12-14.20081027git.fc10.i386
hplip-2.8.12-6b.fc10.i386
hpijs-2.8.12-6b.fc10.i386
foomatic-3.0.2-70.fc10.i386

Occasionally, and there doesn't appear to be a specific way of triggering the bug as far as I can tell, a submitted print job will stall in the print queue, usually after at least one page of a multi-page job has printed.

At that point, in the syslog, we usually see the following set of messages:

Aug 4 11:30:55 topaz kernel: usblp0: removed
Aug 4 11:30:55 topaz hal_lpadmin: Running hal_lpadmin
Aug 4 11:30:56 topaz hal_lpadmin: hal_lpadmin triggered by usblp kernel module
Aug 4 11:30:56 topaz hal_lpadmin: Using device ID from HAL database entry
Aug 4 11:30:56 topaz hal_lpadmin: remove
Aug 4 11:30:56 topaz hal_lpadmin: Found configured printer: Photosmart_2570_series
Aug 4 11:31:26 topaz kernel: usb 1-7: reset high speed USB device using ehci_hcd and address 3
Aug 4 11:31:26 topaz kernel: usblp0: USB Bidirectional printer dev 3 if 1 alt 0 proto 2 vid 0x03F0 pid 0x4E11
Aug 4 11:31:26 topaz kernel: usb 1-7: usbfs: process 10998 (hp) did not claim interface 1 before use
Aug 4 11:31:26 topaz Photosmart_2570_series?serial=MY65T112BM04DY: io/hpmud/musb.c 1024: bulk_write failed buf=0xbf988b3c size=512 len=-16: No data available
Aug 4 11:31:26 topaz Photosmart_2570_series?serial=MY65T112BM04DY: io/hpmud/musb.c 1386: unable to write data hp:/usb/Photosmart_2570_series?serial=MY65T112BM04DY: No data available
Aug 4 11:31:26 topaz kernel: usb 1-7: usbfs: process 10905 (hp) did not claim interface 1 before use
Aug 4 11:31:26 topaz Photosmart_2570_series?serial=MY65T112BM04DY: io/hpmud/musb.c 726: invalid deviceid wIndex=1, retrying wIndex=100: Device or resource busy
Aug 4 11:31:26 topaz hal_lpadmin: Running hal_lpadmin
Aug 4 11:31:27 topaz hal_lpadmin: hal_lpadmin triggered by usblp kernel module
Aug 4 11:31:27 topaz hal_lpadmin: Using device ID from HAL database entry
Aug 4 11:31:27 topaz hal_lpadmin: add
Aug 4 11:31:27 topaz hal_lpadmin: URIs: ['hp:/usb/Photosmart_2570_series?serial=MY65T112BM04DY', 'usb://HP/Photosmart%202570%20series?serial=MY65T112BM04DY', 'hal:///org/freedesktop/Hal/devices/usb_device_3f0_4e11_MY65T112BM04DY_if1_printer_MY65T112BM04DY']
Aug 4 11:31:27 topaz hal_lpadmin: HPLIP Fax URIs: None
Aug 4 11:31:27 topaz hal_lpadmin: Not adding printer: Photosmart_2570_series already exists
Aug 4 11:31:36 topaz kernel: usb 1-7: usbfs: process 10905 (hp) did not claim interface 0 before use
Aug 4 11:31:36 topaz kernel: usb 1-7: usbfs: process 11008 (hp) did not claim interface 1 before use
Aug 4 11:31:36 topaz Photosmart_2570_series?serial=MY65T112BM04DY: io/hpmud/musb.c 1024: bulk_write failed buf=0xbf988b3c size=512 len=-16: Device or resource busy
Aug 4 11:31:36 topaz Photo...

Read more...

What's happening here is that the usblp kernel module is getting unloaded by the 'hp' backend, but loaded again by some other process (probably by accessing /dev/usb/lp0).

Can you run 'ps axfw' and attach the output here, if it happens again?

In fact, even without it happening it would be useful to see the output of 'lpstat -s'.

I have a similar problem with a USB connected HP PhotoSmart 257x, where the /var/log/messages file quickly is filled with

Jul 29 19:37:07 localhost Photosmart_2570_series?serial=MY6A4311RT04DY: io/hpmud/musb.c 725: invalid deviceid wIndex=1, retrying wIndex=100: Device or resource busy
Jul 29 19:37:07 localhost kernel: usb 2-1: usbfs: process 22388 (hp) did not claim interface 1 before use
Jul 29 19:37:07 localhost Photosmart_2570_series?serial=MY6A4311RT04DY: io/hpmud/musb.c 1022: bulk_write failed buf=0xbf9171fc size=8192 len=-16: Device or resource busy
Jul 29 19:37:07 localhost Photosmart_2570_series?serial=MY6A4311RT04DY: io/hpmud/musb.c 1384: unable to write data hp:/usb/Photosmart_2570_series?serial=MY6A4311RT04DY: Device or resource busy

I don't know exactly what triggers this, as the printer is apparently working fine both when printing and when scanning.

Installed versions:

hplip-2.8.12-6.fc10.i386
cups-1.3.10-5.fc10.i386

Bug 468272 may be related to this, and people have reported similar problems elsewhere:
https://bugs.launchpad.net/ubuntu/+source/sane-backends/+bug/313504
https://bugs.launchpad.net/hplip/+bug/341827

FWIW:

$ lpstat -s
system default destination: Photosmart_2570_series
device for Cups-PDF: cups-pdf:/
device for Photosmart_2570_series: hp:/usb/Photosmart_2570_series?serial=MY65T112BM04DY

Meanwhile, since I've seen other reports that even the latest hplip might suffer from this, I've backported the current F11 hplip-3.9.2 rpm to F10, and added yet another bit of debug in the hpmud code, as per the patch below. With luck, this might provide a further clue as and when it happens again, albeit at the expense of creating a large debug log every day.

diff -uNr hplip-3.9.2-orig/io/hpmud/hpmudi.h hplip-3.9.2/io/hpmud/hpmudi.h
--- hplip-3.9.2-orig/io/hpmud/hpmudi.h 2009-02-20 00:36:44.000000000 +0000
+++ hplip-3.9.2/io/hpmud/hpmudi.h 2009-08-05 17:10:02.000000000 +0100
@@ -55,7 +55,7 @@
 #include "pp.h"
 #endif

-//#define HPMUD_DEBUG
+#define HPMUD_DEBUG 1

 #define _STRINGIZE(x) #x
 #define STRINGIZE(x) _STRINGIZE(x)
@@ -63,10 +63,10 @@
 #define BUG(args...) syslog(LOG_ERR, __FILE__ " " STRINGIZE(__LINE__) ": " args)

 #ifdef HPMUD_DEBUG
- #define DBG(args...) syslog(LOG_INFO, __FILE__ " " STRINGIZE(__LINE__) ": " args)
+ #define DBG(args...) syslog(LOG_DEBUG, __FILE__ " " STRINGIZE(__LINE__) ": " args)
 // #define DBG(args...) fprintf(stderr, __FILE__ " " STRINGIZE(__LINE__) ": " args)
    #define DBG_DUMP(data, size) sysdump((data), (size))
- #define DBG_SZ(args...) syslog(LOG_INFO, args)
+ #define DBG_SZ(args...) syslog(LOG_DEBUG, args)
 #else
    #define DBG(args...)
    #define DBG_DUMP(data, size)

My cupsrestart script now also contains a ps invocation to help with the debugging as and when it trips up....

$ cat /usr/local/bin/cupsrestart
#!/bin/sh

currentid=`id -un`

if [ $currentid = "root" ]; then
 echo Restarting CUPS Printer Services ...
 service cups status
 service cups condrestart
 service cups status
 sleep 5
else
 ( lpstat -t ; echo "==================================" ; ps axfw ) | mail -s "User $currentid about to restart CUPS via $0" root
 sudo $0
 lpstat -t | mail -s "User $currentid restarted CUPS via $0" root
fi
$

One small extra piece of diagnostic. Going back through the bits of history I have for this issue, I see that when the bug occurs, this message:

   "reset high speed USB device"

always seems to occur exactly 30 seconds after this message:

   "Found configured printer"

For instance:

...
Jun 6 13:50:21 topaz kernel: usblp0: removed
Jun 6 13:50:22 topaz hpijs: WARNING: color pen has low ink
Jun 6 13:50:22 topaz hpijs: STATE: +marker-supply-low-warning
Jun 6 13:50:52 topaz kernel: usb 1-7: reset high speed USB device using ehci_hcd and address 3
Jun 6 13:50:52 topaz kernel: usblp0: USB Bidirectional printer dev 3 if 1 alt 0 proto 2 vid 0x03F0 pid 0x4E11
Jun 6 13:50:52 topaz kernel: usb 1-7: usbfs: process 5839 (hp) did not claim interface 1 before use
...

...
Jul 9 22:15:33 topaz kernel: usblp0: removed
Jul 9 22:15:33 topaz hal_lpadmin: Running hal_lpadmin
Jul 9 22:15:34 topaz hal_lpadmin: hal_lpadmin triggered by usblp kernel module
Jul 9 22:15:34 topaz hal_lpadmin: Using device ID from HAL database entry
Jul 9 22:15:34 topaz hal_lpadmin: remove
Jul 9 22:15:34 topaz hal_lpadmin: Found configured printer: Photosmart_2570_series
Jul 9 22:16:04 topaz kernel: usb 1-7: reset high speed USB device using ehci_hcd and address 3
Jul 9 22:16:04 topaz kernel: usblp0: USB Bidirectional printer dev 3 if 1 alt 0 proto 2 vid 0x03F0 pid 0x4E11
Jul 9 22:16:04 topaz kernel: usb 1-7: usbfs: process 9692 (hp) did not claim interface 1 before use
Jul 9 22:16:04 topaz kernel: usb 1-7: usbfs: process 9517 (hp) did not claim interface 1 before use
...

...
Aug 4 11:30:55 topaz kernel: usblp0: removed
Aug 4 11:30:55 topaz hal_lpadmin: Running hal_lpadmin
Aug 4 11:30:56 topaz hal_lpadmin: hal_lpadmin triggered by usblp kernel module
Aug 4 11:30:56 topaz hal_lpadmin: Using device ID from HAL database entry
Aug 4 11:30:56 topaz hal_lpadmin: remove
Aug 4 11:30:56 topaz hal_lpadmin: Found configured printer:
Photosmart_2570_series
Aug 4 11:31:26 topaz kernel: usb 1-7: reset high speed USB device using
ehci_hcd and address 3
Aug 4 11:31:26 topaz kernel: usblp0: USB Bidirectional printer dev 3 if 1 alt
0 proto 2 vid 0x03F0 pid 0x4E11
Aug 4 11:31:26 topaz kernel: usb 1-7: usbfs: process 10998 (hp) did not claim
interface 1 before use
...

Created attachment 357176
Trimmed debug level syslog for a recent sample incident on 10th August

Created attachment 357177
Trimmed xsession ( and hence printer-applet ) log for a recent sample incident on 10th August

Download full text (19.0 KiB)

The recent log attachments for both xsession and syslog cover an incident from 10th August.

On this occasion, I had my debug-patch version of hplip-2.8.12 running, i.e.:

$ rpm -q --changelog hplip|more
* Wed Aug 05 2009 Ted Rule <email address hidden> 2.8.12-6c
- Add hpmud debug patch

* Tue Jul 14 2009 Ted Rule <email address hidden> 2.8.12-6b
- Add bug throttle patch

* Tue Jan 27 2009 Tim Waugh <email address hidden> 2.8.12-6
- Only ship compressed PPD files.

where the extra debugging can be seen in the syslog trail.

At the same time, I tweaked gnome a bit to get some debugging out of the printer applet, as in:

$ diff -u /usr/share/system-config-printer/applet.py.orig /usr/share/system-config-printer/applet.py
--- /usr/share/system-config-printer/applet.py.orig 2009-03-25 18:55:26.000000000 +0000
+++ /usr/share/system-config-printer/applet.py 2009-08-07 12:11:18.000000000 +0100
@@ -251,6 +251,12 @@
         show_help ()
         sys.exit (1)

+ set_debugging (True)
+ if get_debugging () == False:
+ print >> sys.stderr, ("%s: unable to initialize debugging" %PROGRAM_NAME)
+ elif get_debugging () == True:
+ print >> sys.stderr, ("%s: able to initialize debugging" %PROGRAM_NAME)
+
     for opt, optarg in opts:
         if opt == "--help":
             show_help ()
$

$ diff -u /usr/share/system-config-printer/debug.py.orig /usr/share/system-config-printer/debug.py
--- /usr/share/system-config-printer/debug.py.orig 2009-03-25 18:55:26.000000000 +0000
+++ /usr/share/system-config-printer/debug.py 2009-08-07 12:13:39.000000000 +0100
@@ -19,12 +19,13 @@

 import sys
 import traceback
+import time

 _debug=False
 def debugprint (x):
     if _debug:
         try:
- print >>sys.stderr, x
+ print >>sys.stderr, time.strftime("%b %d %Y %H:%M:%S"), x
         except:
             pass

$

The incident happened when I tried to print a 5 page Email, and it was the first print Job after the initial boot-up of the system. As it happened, the printer had too little paper to complete the Job.

When the print Job "stalled", I had this in the process list:

[ejtr@topaz ~]$ ps axfu
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 2 0.0 0.0 0 0 ? S< 17:25 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S< 17:25 0:00 \_ [migration/0]
root 4 0.0 0.0 0 0 ? S< 17:25 0:00 \_ [ksoftirqd/0]
root 5 0.0 0.0 0 0 ? S< 17:25 0:00 \_ [watchdog/0]
root 6 0.0 0.0 0 0 ? S< 17:25 0:00 \_ [migration/1]
root 7 0.0 0.0 0 0 ? S< 17:25 0:00 \_ [ksoftirqd/1]
root 8 0.0 0.0 0 0 ? S< 17:25 0:00 \_ [watchdog/1]
root 9 0.0 0.0 0 0 ? S< 17:25 0:00 \_ [events/0]
root 10 0.0 0.0 0 0 ? S< 17:25 0:00 \_ [events/1]
root 11 0.0 0.0 0 0 ? S< 17:25 0:00 \_ [khelper]
root 85 0.0 0.0 0 0 ? S< 17:25 0:00 \_ [kintegrityd/0]
root 86 0.0 0.0 0 0 ? S< 17:25 0:00 \_ [kintegrity...

I note from the HPLIP website, that version 3.9.4 includes this intriguing comment in the Release Notes:

"Moved the hpmud_open_device() call in hp.c to after the first read from hpcups or hpijs."

Given my suspicion that this problem may have something to do with some sort of race condition between CUPS and the Desktop Printer Applet, and bidirectional access to the printer in general, I wonder if this change in 3.9.4 may be significant.

F10 currently has a 2.8.12 hplip RPM, whilst F11 uses 3.9.2. The latest RPM in the F12 tree seems to be 3.9.8, so it may be possible to fix this with a backport of that RPM.

Given the comments on the HPLIP site, I've rebuilt a copy of the current F12 hplip 3.9.8 RPM on F10, adding my previously mentioned debug and throttle patches for good measure, as in:

$ rpm -q --changelog hplip| head
* Mon Aug 24 2009 Ted Rule <email address hidden> 3.9.8-6c
- Add hpmud debug patch

* Mon Aug 24 2009 Ted Rule <email address hidden> 3.9.8-6b
- Add bug throttle patch

* Wed Aug 19 2009 Tim Waugh <email address hidden> 3.9.8-6
- Make sure to avoid handwritten asm.
- Don't use obsolete configure options.
$

The F12 SRPM appears to rebuild without modification on F10, BTW.

This all seems to work Ok. No hangs so far, but then I haven't been using it for very long...

Meanwhile, the backport fails on a couple of issues, which I've had to manually work round.

Firstly, the SELinux policy for hplip-3.9.8 seems to need an additional permission so that hplip can read /var/lib/hp/hplip.state. I would imagine that this permission already exists in F12's current selinux-policy.

$ cat selinuxpolicy/localhplip.te

module localhplip 1.0.0;

########################################
#
# Declarations
#

require {
 type hplip_t;
 type var_lib_t;
 type file_t;
 class file { read_file_perms };
}

# Grant hplip permission to read /var/lib/hp/hplip.state
# Aug 26 10:29:08 workstation setroubleshoot: SELinux is preventing python (hplip_t) "read" to ./hplip.state (var_lib_t). For complete SELinux messages. run sealert -l fbafa21b-e3f1-4ec5-b4be-51cca204777d
allow hplip_t var_lib_t:file { read_file_perms };
auditallow hplip_t var_lib_t:file { read_file_perms };
$

Secondly, a variant of bugzilla 424331 becomes a problem:

https://bugzilla.redhat.com/show_bug.cgi?id=424331

/dev/bus/usb ends up with a missing group permission. Presumably hplip version 2 set this itself somehow? Anyhow, F12 seems to fix this by making the whole of /dev/bus/usb chmod 664. I've limited the change by making the g+w permission only apply to HP branded devices with this tweak to udev:

[ejtr@topaz ~]$ grep usb /lib/udev/rules.d/50-udev-default.rules
...
# libusb device nodes
SUBSYSTEM=="usb", ACTION=="add", ENV{DEVTYPE}=="usb_device", ATTRS{idVendor}=="03f0", NAME="bus/usb/$env{BUSNUM}/$env{DEVNUM}", GROUP="lp", MODE="0664"
SUBSYSTEM=="usb", ACTION=="add", ENV{DEVTYPE}=="usb_device", ATTRS{idVendor}!="03f0", NAME="bus/usb/$env{BUSNUM}/$env{DEVNUM}", MODE="0644"
SUBSYSTEM=="usb", KERNEL=="lp*", NAME="usb/%k", SYMLINK+="usb%k", GROUP="lp"
...
$

Created attachment 361219
syslog snippet for Sep 15 printer hang

Created attachment 361220
lpstat / process status listing for Sep 15 Printer hang.

Having lasted for a month or so since the last printer hang, and in particular none since I manually upgraded to hplip-3.9.8 backported from F11, we had an incident yesterday which required a CUPS restart.

The incident timeline started at around 19:00 BST yesterday. Print Jobs had been submitted, but nothing seemed to be happening.

lpstat showed that the printer had been in a disabled state since Sep 12th, though no printing had been attempted at that time.

My first action was simply to "cupsenable" the printer, which initially appeared to start the queued print Jobs flowing again.

However, after printing the first page of the first queued job, the printer apparently hung. Looking at the syslog at that point, I noted that I could see the familiar "did not claim interface 1 before use" message repeating every 10 seconds.

I therrefore restarted cups using my /usr/local/bin/cupsrestart script, which duly dumped an lpstat/ps listing to Email as attached, before restarting CUPS. Once CUPS restarted, the outstanding print Job all printed successfully to completion.

The debug/status messages which I managed to capture at that time are attached.

Sadly, I failed to note down the exact timestamps of performing the cupsenable, nor when exactly the printer had come up in disabled state. Judging from the logs, the cupsenable action corresponds to this message:

...
Sep 15 19:31:06 topaz Photosmart_2570_series?serial=MY65T112BM04DY: io/hpmud/hpmud.c 348: [15869] hpmud_init()
...

Seemingly things started to go wrong just after this message:

...
Sep 15 19:31:38 topaz kernel: usb 1-7: reset high speed USB device using ehci_hcd and address 3
Sep 15 19:31:38 topaz kernel: usblp0: USB Bidirectional printer dev 3 if 1 alt 0 proto 2 vid 0x03F0 pid 0x4E11
...

And the cupsrestart was finally performed here:

...
Sep 15 19:35:37 topaz sudo: anne : TTY=pts/1 ; PWD=/home/anne ; USER=root ; COMMAND=/usr/local/bin/cupsrestart
...

Remember, as ever, that the copy of hplip I have running is a backported copy of 3.9.8 from F11 with additional debugging enabled via "#define HPMUD_DEBUG 1" in io/hpmud/hpmudi.h.

Last nights incident was slightly more unusual in that the printer subsystem had already ended up in a disabled state, but the behaviour of the system after I had invoked cupsenable seems to be identical to previous incidents.

So it looks like the usblp module is getting loaded spontaneously. This seems to be because of a spontaneous USB reset:

Sep 15 19:31:38 topaz hp[15869]: io/hpmud/hpmud.c 627: [15869] hpmud_channel_write() dd=1 cd=2 buf=0xbfccc2cc size=8192 sectime=45
Sep 15 19:31:38 topaz kernel: usb 1-7: reset high speed USB device using ehci_hcd and address 3
Sep 15 19:31:38 topaz kernel: usblp0: USB Bidirectional printer dev 3 if 1 alt 0 proto 2 vid 0x03F0 pid 0x4E11
Sep 15 19:31:38 topaz kernel: usb 1-7: usbfs: process 16047 (hp) did not claim interface 1 before use

In this situation the hp backend asks the printer what's up by examining its Device ID. Of course, the printer is happy; the problem is that our USB channel handle is no longer valid.

I've reported this upstream, along with a patch to work around the problem.

I've checked the fix in to devel, F-12, and F-11, but not F-10: the code is quite different in Fedora 10 which makes it harder to fix.

Would you be able to re-base the package you are testing on the hplip-3.9.8-14.fc11 package in F-11?

I'm currently based on a backport of hplip-3.9.8-6.fc12.src.rpm, so this shouldn't be a problem, other than the fact that I can't seem to find that version on download.fedora.redhat.com.

There's a 3.9.8-16-fc12 in devel on the mirrors, but that was created a week ago, so it doesn't seem to be relevant.

The F11 mirror sites only seem to have 3.9.8-8 or 3.9.8-12 versions.

Can you clarify which version you'd like me to try and backport and test on F10?

Thanks.

Download full text (3.2 KiB)

In attempting to backport your F11 build, I get a libtool error.

Any advice on furthering this backport, so that I can continue to test?

My additional debug options added to SPEC:

$ diff -u hplip.spec.3.9.8-14-f11 hplip.spec.3.9.8-14c
--- hplip.spec.3.9.8-14-f11 2009-09-29 17:11:42.000000000 +0100
+++ hplip.spec.3.9.8-14c 2009-10-01 13:43:25.000000000 +0100
@@ -1,7 +1,7 @@
 Summary: HP Linux Imaging and Printing Project
 Name: hplip
 Version: 3.9.8
-Release: 14%{?dist}
+Release: 14c%{?dist}
 License: GPLv2+ and MIT
 Group: System Environment/Daemons
 Conflicts: system-config-printer < 0.6.132
@@ -25,6 +25,10 @@
 Patch9: hplip-requirespageregion.patch
 Patch10: hplip-discovery-method.patch
 Patch11: hplip-device-reconnected.patch
+
+Patch14: hplip-3.9.8-hpmud-debug.patch
+Patch15: hplip-2.8.12-hpmud-bugthrottle.patch
+
 BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-root-%(%{__id_u} -n)

 Requires(pre): /sbin/service
@@ -147,6 +151,10 @@
 # Give up trying to print a job to a reconnected device (bug #515481).
 %patch11 -p1 -b .device-reconnected

+# Throttle any hpmud module craziness - and add some more debug
+%patch14 -p1 -b .hpmud-debug
+%patch15 -p1 -b .hpmud-throttle
+
 %build
 aclocal
 automake --foreign --add-missing --copy
@@ -364,6 +372,10 @@
 exit 0

 %changelog
+* Thu Oct 1 2009 Ted Rule <email address hidden> 3.9.8-14c
+- Add hpmud debug patch
+- Add bug throttle patch
+
 * Tue Sep 29 2009 Tim Waugh <email address hidden> 3.9.8-14
 - Give up trying to print a job to a reconnected device (bug #515481).
$

/bin/sh ./libtool --tag=CC --mode=compile gcc -DPACKAGE_NAME=\"HP\ Linux\ Imaging\ and\ Printing\" -DPACKAGE_TARNAME=\"hplip\" -DPACKAGE_VERSION=\"3.9.8\" -DPACKAGE_STRING=\"HP\ Linux\ Imaging\ and\ Printing\ 3.9.8\" -DPACKAGE_BUGREPORT=\"3.9.8.36\" -DPACKAGE=\"hplip\" -DVERSION=\"3.9.8\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DHAVE_LIBDL=1 -DHAVE_PTHREAD_H=1 -DHAVE_JPEGLIB_H=1 -DHAVE_UINT32_T=1 -DHAVE_LIBHPIP=1 -DHAVE_PPORT=1 -DHAVE_LIBCRYPTO=1 -DHAVE_LIBNETSNMP=1 -DHAVE_NET_SNMP_NET_SNMP_CONFIG_H=1 -DHAVE_LINUX_PPDEV_H=1 -DHAVE_CUPS_CUPS_H=1 -DHAVE_USB_H=1 -DHAVE_PYTHON2_5_PYTHON_H=1 -DHAVE_DBUS=1 -I. -Iip -Iio/hpmud -Iscan/sane -I/usr/include/python2.5 -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -MT cupsext_la-cupsext.lo -MD -MP -MF .deps/cupsext_la-cupsext.Tpo -c -o cupsext_la-cupsext.lo `test -f 'prnt/cupsext/cupsext.c' || echo './'`prnt/cupsext/cupsext.c
./libtool: line 467: CDPATH: command not found
./libtool: line 1145: func_opt_split: command not found
libtool: Version mismatch error. This is libtool 2.2.6 Debian-2.2.6a-1ubuntu1, but the
libtool: definition of this LT_INIT comes from an older release.
libtool: You should recreate aclocal.m4 with macros from libtool 2.2.6 Debian-2.2.6a-1ubuntu1
libtool: and run autoconf again.
make: *** [cupsext_la-cupsext.lo] Error 63
error: Bad exit status from /var/tmp/rpm-tmp.7VErLG (%build...

Read more...

Pending a fix to the libtool problem, I've added the device-reconnected patch to my backport from the original hplip-3.9.8-6.fc12.src.rpm.

$ diff -u hplip.spec.3.9.8.orig hplip.spec.3.9.8-6d
--- hplip.spec.3.9.8.orig 2009-08-19 11:41:30.000000000 +0100
+++ hplip.spec.3.9.8-6d 2009-10-01 16:15:17.000000000 +0100
@@ -1,7 +1,7 @@
 Summary: HP Linux Imaging and Printing Project
 Name: hplip
 Version: 3.9.8
-Release: 6%{?dist}
+Release: 6d%{?dist}
 License: GPLv2+ and MIT
 Group: System Environment/Daemons
 Conflicts: system-config-printer < 0.6.132
@@ -17,6 +17,12 @@
 Patch2: hplip-strstr-const.patch
 Patch3: hplip-ui-optional.patch
 Patch4: hplip-no-asm.patch
+
+Patch11: hplip-device-reconnected.patch
+
+Patch14: hplip-3.9.8-hpmud-debug.patch
+Patch15: hplip-2.8.12-hpmud-bugthrottle.patch
+
 BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-root-%(%{__id_u} -n)

 Requires(pre): /sbin/service
@@ -109,6 +115,13 @@
 # Make sure to avoid handwritten asm.
 %patch4 -p1 -b .no-asm

+# Give up trying to print a job to a reconnected device (bug #515481).
+%patch11 -p1 -b .device-reconnected
+
+# Throttle any hpmud module craziness - and add some more debug
+%patch14 -p1 -b .hpmud-debug
+%patch15 -p1 -b .hpmud-throttle
+
 %build
 %configure \
  --enable-scan-build --enable-gui-build --enable-fax-build \
@@ -310,6 +323,15 @@
 exit 0

 %changelog
+* Thu Oct 1 2009 Tim Waugh <email address hidden> 3.9.8-6d
+- Give up trying to print a job to a reconnected device (bug #515481).
+
+* Mon Aug 24 2009 Ted Rule <email address hidden> 3.9.8-6c
+- Add hpmud debug patch
+
+* Mon Aug 24 2009 Ted Rule <email address hidden> 3.9.8-6b
+- Add bug throttle patch
+
 * Wed Aug 19 2009 Tim Waugh <email address hidden> 3.9.8-6
 - Make sure to avoid handwritten asm.
 - Don't use obsolete configure options.

Backporting the patch to your version will be fine for testing this.

(The other way is to remove hplip-non-scripts.patch and the calls to aclocal, automake and autoconf.)

Following the installation of the "reconnect" patch, the nature of the failure mode has changed. When the error occurs, the user is presented with a CUPS(?) error dialog showing that an error has occurred, and asking that they perform various diagnostic tasks.

Seemingly as a consequence of the patch, the printer appears to consistently end up in a "disable" state - as shown by lpstat -t.

Because of the "disabled" state, my cupsrestart script was seemingly insufficient to clear the error completely; I've modified it now to perform a conditional "cupsenable" prior to "service cups restart". The combination of these actions seems to consistently clear the error and cause the stuck print job to be correctly resubmitted to the printer.

The attached lpstat+ps+syslog diagnostic shows an example of a printer Job failing at ~11:43, and cupsrestart being run to free the stuck print Job at ~13:07

Created attachment 364340
lpstat / process / syslog listings for Oct 10 Printer hang

PJBrs (pjbeers) wrote :

I think I have the same problem on Ubuntu Karmic Koala (HPLIP 3.9.8). The printer model is HP Photosmart 2575. HPLIP fills up four different logfiles at high speed, quickly filling the root partition. I don't know how the error is provoked, although it appears that printing more than one page at one time gets the printer stuck, and at that moment the logs start fill, until the entire root partition is full.

/var/log/daemon.log is filled with repeated sequences of:

Nov 5 15:43:11 floor-laptop hp[27078]: io/hpmud/musb.c 725: invalid deviceid wIndex=1, retrying wIndex=100: Device or resource busy
Nov 5 15:43:11 floor-laptop hp[27078]: io/hpmud/musb.c 1022: bulk_write failed buf=0xbfdc3524 size=7680 len=-16: Device or resource busy
Nov 5 15:43:11 floor-laptop hp[27078]: io/hpmud/musb.c 1389: unable to write data hp:/usb/Photosmart_2570_series?serial=MY71C310G204DY: Device or resource busy

/var/log/syslog is filled with repeated messages like:

Nov 5 15:43:11 floor-laptop kernel: [ 7809.347948] usb 1-1: usbfs: process 27317 (hp) did not claim interface 1 before use
Nov 5 15:43:11 floor-laptop hp[27078]: io/hpmud/musb.c 1022: bulk_write failed buf=0xbfdc3524 size=7680 len=-16: Device or resource busy
Nov 5 15:43:11 floor-laptop kernel: [ 7809.347988] usb 1-1: usbfs: process 27078 (hp) did not claim interface 1 before use
Nov 5 15:43:11 floor-laptop hp[27078]: io/hpmud/musb.c 1389: unable to write data hp:/usb/Photosmart_2570_series?serial=MY71C310G204DY: Device or resource busy
Nov 5 15:43:11 floor-laptop hp[27078]: io/hpmud/musb.c 725: invalid deviceid wIndex=1, retrying wIndex=100: Device or resource busy
Nov 5 15:43:11 floor-laptop hp[27078]: io/hpmud/musb.c 1022: bulk_write failed buf=0xbfdc3524 size=7680 len=-16: Device or resource busy
Nov 5 15:43:11 floor-laptop hp[27078]: io/hpmud/musb.c 1389: unable to write data hp:/usb/Photosmart_2570_series?serial=MY71C310G204DY: Device or resource busy

/var/log/kern.log is filled with repeated messages like:

Nov 5 15:43:11 floor-laptop kernel: [ 7809.403956] usb 1-1: usbfs: process 27398 (hp) did not claim interface 1 before use

/var/log/messages also is filled with hplip-related messages, but they weren't different from what was in the other files.

Actually, I had the same problem with Ubuntu Jaunty Jackalope, but I had hoped that upgrading to Karmic Koala would solve it. In other words, this appears to be a quite long-standing bug.

In my view, the main problem with this bug is that novice and intermediate users are not able to solve the problems that are caused by this bug. They don't know where to look when to get error messages because their root partition is full. In the mean time, their computers are rendered useless because they often cannot open their documents any more, let alone save them. In fact, in my case I couldn't even save a draft email message in Thunderbird.

I noticed that the last message in this bugreport is from April 2009. Is their any activity regarding the solution of this problem?

PJBrs (pjbeers) wrote :

Oops! Type:

s/when to get/when they get/

This message is a reminder that Fedora 10 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 10. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora
'version' of '10'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version prior to Fedora 10's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 10 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora please change the 'version' of this
bug to the applicable version. If you are unable to change the version,
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

The process we are following is described here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

PJBrs (pjbeers) wrote :

It's been a week since I posted; is there any way I can help solve this problem?

Try the hplip 3.9.10. I made a change musb.c to address this problem when the USB device disconnects intermittently.

Of coarse this only addresses the syslog filling up issue not your hardware issue.

Changed in fedora:
status: Unknown → In Progress
martin_jacobs (martin-jacobs) wrote :

Seems to be similar to bug #439550
https://bugs.launchpad.net/hplip/+bug/439550
This one happens if printer connected by LAN (in my case WLAN) is switched on after job processing has been started by CUPS.

Fedora 10 changed to end-of-life (EOL) status on 2009-12-17. Fedora 10 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

Changed in fedora:
status: In Progress → Won't Fix
Changed in hplip (Debian):
status: Unknown → Confirmed
Changed in hplip (Debian):
status: Confirmed → Fix Released
Changed in hplip (Debian):
status: Fix Released → Confirmed
Changed in fedora:
importance: Unknown → Low
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.