usbhid-ups regression (APC BE525-RS)

Bug #915985 reported by Dmitry Andreychuk on 2012-01-13
22
This bug affects 4 people
Affects Status Importance Assigned to Milestone
nut (Ubuntu)
Medium
Arnaud Quette

Bug Description

Ubuntu 11.10
nut 2.6.1-2ubuntu2
UPS model: APC Back-UPS ES 525 (BE525-RS)
driver: usbhid-ups

I get the following error approximately every 30 seconds:
usb 4-2: usbfs: USBDEVFS_CONTROL failed cmd usbhid-ups rqt 161 rq 1 len 2 ret -75
The simalar error is described in the driver's known issues but increasing pollinterval doesn't help - the interval between error messages just increases accordingly.
I also noticed that battery.charge returned by upsc doesn't change when the UPS is actually discharging. I called upsc, unplugged UPS and called upsc again after a while: ups.status, battery.runtime and input.voltage reasonably changed but battery.charge stayed at 100%. So maybe the retreiving of this variable is working incorrectly.
Part of my ups.conf:
[be525]
  driver = usbhid-ups
  port = auto
  pollinterval = 60

Output for "sudo /lib/nut/usbhid-ups -u nut -a be525 -DDDD" is attached.

I've tested a few other versions and got this:
2.4.3-1ubuntu3.1 - ok
2.6.0-1ubuntu3 - ok
2.6.1-2ubuntu2 - error
2.6.2-1ubuntu1 - error
2.6.3 (upstream) - error. Actually I didn't manage to run daemon that I built from source, so I ran just the driver and it reported the same error.

Looks like this problem appeared in v2.6.1.

Dmitry Andreychuk (and-dmitry) wrote :

Hi Dmitry,

2012/1/13 Dmitry Andreychuk
> Public bug reported:
>
> Ubuntu 11.10
> nut 2.6.1-2ubuntu2
> UPS model: APC Back-UPS ES 525 (BE525-RS)
> driver: usbhid-ups
>
> I get the following error approximately every 30 seconds:
> usb 4-2: usbfs: USBDEVFS_CONTROL failed cmd usbhid-ups rqt 161 rq 1 len 2 ret -75
> The simalar error is described in the driver's known issues but increasing pollinterval doesn't help - the interval between error messages just increases accordingly.
> I also noticed that battery.charge returned by upsc doesn't change when the UPS is actually discharging. I called upsc, unplugged UPS and called upsc again after a while: ups.status, battery.runtime and input.voltage reasonably changed but battery.charge stayed at 100%. So maybe the retreiving of this variable is working incorrectly.

as per your provided log, both issues are related:
errno -75 is "EOVERFLOW" ("Value too large for defined data type").

We can see 1 successful retrieval of battery. charge (HID path:
UPS.PowerSummary.RemainingCapacity, ReportID: 0x0c), obtained through
the interrupt pipe (ie, not requested explicitly, but sent by the
device itself, upon change)

All other request are failing, with EOVERFLOW, which should mean a
fault from the device (sending more data than it should

> I've tested a few other versions and got this:
> 2.4.3-1ubuntu3.1 - ok
> 2.6.0-1ubuntu3 - ok
> 2.6.1-2ubuntu2 - error
> 2.6.2-1ubuntu1 - error
> 2.6.3 (upstream) - error. Actually I didn't manage to run daemon that I built from source, so I ran just the driver and it reported the same error.

could you please send an output from a working version (preferably 2.6.0)?

cheers,
Arnaud
--
Linux / Unix Expert R&D - Eaton - http://powerquality.eaton.com
Network UPS Tools (NUT) Project Leader - http://www.networkupstools.org/
Debian Developer - http://www.debian.org
Free Software Developer - http://arnaud.quette.free.fr/

Dmitry Andreychuk (and-dmitry) wrote :

Hi Arnaud,

Thanks for your reply! Attaching output for 2.6.0-1ubuntu3.

Dave Walker (davewalker) on 2012-01-17
Changed in nut (Ubuntu):
importance: Undecided → Medium

Hi Sergey,

cc'ing NUT users list, since others may be interested in the topic
(for those, follow the bug link below).

2012/1/16 Dmitry Andreychuk <email address hidden>:
> Hi Arnaud,
>
> Thanks for your reply! Attaching output for 2.6.0-1ubuntu3.
>
>
> ** Attachment added: "usbhid-ups_2.6.0-1ubuntu3.log"
>   https://bugs.launchpad.net/ubuntu/+source/nut/+bug/915985/+attachment/2676822/+files/usbhid-ups_2.6.0-1ubuntu3.log
>
> --
> You received this bug notification because you are subscribed to nut in
> Ubuntu.
> https://bugs.launchpad.net/bugs/915985
>
> Title:
>  usbhid-ups regression (APC BE525-RS)

ok, I got it!
first, both of your issues are related.
we reverted in 2.6.1 a change on how we get data from the UPS.
we were previously requesting a very large size, while we now only ask
for the data actual size.

your device is reporting extra garbage data (buggy firmware) when
requesting ("Feature report") battery.charge
(UPS.PowerSummary.RemainingCapacity), while the notified version
("Input report", sent by the device upon changes, like SNMP traps) is
good.

this cause, at libusb level, an EOVERFLOW with 2.6.1+, which is logged
every 30 sec. (polling freq), and cause this data to not be processed.

now, Input report data should take over, and feed battery.charge!
I'd be interested in seeing a debug output of a faulty version
(2.6.1+), that includes a power failure.

the bad news: ATM, I've no solution that can be applied upstream and
distributed.
the good news: well, you already know: keep running 2.6.0 for the time being.

cheers,
Arnaud

Dmitry Andreychuk (and-dmitry) wrote :

Attaching logs with power failure for versions 2.6.1-2ubuntu2 and 2.6.3.
Power failures and power returns are marked with "NOTE" in the logs.

I noticed that UPS.PowerSummary.RemainingCapacity came with input report only once. The device didn't send it upon changes for some reason.

Dmitry Andreychuk (and-dmitry) wrote :
Arnaud Quette (aquette) wrote :

2012/1/19 Dmitry Andreychuk <email address hidden>:
> Attaching logs with power failure for versions 2.6.1-2ubuntu2 and 2.6.3.
> Power failures and power returns are marked with "NOTE" in the logs.
>
> I noticed that UPS.PowerSummary.RemainingCapacity came with input report
> only once. The device didn't send it upon changes for some reason.

buggy firmware again...

Attached is a patch against the latest development version (Subversion trunk).
It also applies on 2.6.2 and 2.6.3 using (from within the top source dir):
$ patch -p0 < /path/to/apc-hid-overflow.diff

Could you please test it (same procedure as before) and report back
(including a driver debug output) to see if things are going better?

cheers,
Arnaud
--
Linux / Unix Expert R&D - Eaton - http://powerquality.eaton.com
Network UPS Tools (NUT) Project Leader - http://www.networkupstools.org/
Debian Developer - http://www.debian.org
Free Software Developer - http://arnaud.quette.free.fr/

The attachment "apc-hid-overflow.diff" of this bug report has been identified as being a patch. The ubuntu-reviewers team has been subscribed to the bug report so that they can review the patch. In the event that this is in fact not a patch you can resolve this situation by removing the tag 'patch' from the bug report and editing the attachment so that it is not flagged as a patch. Additionally, if you are member of the ubuntu-reviewers team please also unsubscribe the team from this bug report.

[This is an automated message performed by a Launchpad user owned by Brian Murray. Please contact him regarding any issues with the action taken in this bug report.]

tags: added: patch
Dmitry Andreychuk (and-dmitry) wrote :

The patch didn't change the output. I took a look at the sources:
libusb_get_report() returns libusb_strerror(ret, __func__) which returns 0 for ret == -EOVERFLOW. Then in refresh_report_buffer(): r == 0 and the patch doesn't work. The log message would be incorrect anyway because "r" is not the size of response but a negative error code in this situation.

I made a quick additional patch for 2.6.3 to get the output you wanted. Looks like the value is still updated only once.

--- drivers/libhid.c.bak 2012-01-24 18:40:32.437247314 +0400
+++ drivers/libhid.c 2012-01-24 18:52:50.624696914 +0400
@@ -155,7 +155,7 @@

        /* Some buggy firmware send more data than expected. Log this, but process
         * data anyway */
- if (rbuf->len[id] != r) {
+ if (r > 0 && rbuf->len[id] != r) {
                upsdebugx(2, "%s: expected %d bytes, but got %d instead", __func__, rbuf->len[id], r);
                upsdebug_hex(3, "Report[err]", rbuf->data[id], r);
        } else {
--- drivers/libusb.c.bak 2012-01-24 15:08:13.090612484 +0400
+++ drivers/libusb.c 2012-01-24 15:08:21.578110181 +0400
@@ -378,7 +378,7 @@
        case -EOVERFLOW: /* Value too large for defined data type */
        case -EPROTO: /* Protocol error */
                upsdebugx(2, "%s: %s", desc, usb_strerror());
- return 0;
+ return ret;

        default: /* Undetermined, log only */
                upslogx(LOG_DEBUG, "%s: %s", desc, usb_strerror());

Hi Robert and Dmitry,

I'm crossing LP with the nut mailing list, since this is the same EOVERFLOW
issue.

@Dmitry: it comes out that my previous patch was missing the libusb.c part
:-/

to both, the attached patch should fix your issue.
please send compressed debug output to confirm the fix.

I'll have to do some more testing tomorrow since it's "blind coded" (Ie,
just compiled, not tested with HW)

cheers,
Arnaud

Arnaud Quette (aquette) wrote :

2012/4/11 Robert Ayrapetyan <email address hidden>

> Hi.
>

Hi Robert,

please check your subscription.
I'm still told that you're not subscribed!

Mine output after patch looks same (attached).
>

in fact, it's different.
It's no more an overflow error, but an I/O error!

have you patched your 2.6.1 or used trunk + the patch?
in the latter case, I'd be interested in a trace without the patch, to see
if we're still on the overflow side.

On Wed, Apr 11, 2012 at 12:55 AM, Arnaud Quette <email address hidden>
> wrote:
> > Hi Robert and Dmitry,
> >
> > I'm crossing LP with the nut mailing list, since this is the same
> EOVERFLOW
> > issue.
> >
> > @Dmitry: it comes out that my previous patch was missing the libusb.c
> part
> > :-/
> >
> > to both, the attached patch should fix your issue.
> > please send compressed debug output to confirm the fix.
> >
> > I'll have to do some more testing tomorrow since it's "blind coded" (Ie,
> > just compiled, not tested with HW)
> >
> > cheers,
> > Arnaud
>

cheers,
Arnaud
--
Linux / Unix Expert R&D - Eaton - http://powerquality.eaton.com
Network UPS Tools (NUT) Project Leader - http://www.networkupstools.org/
Debian Developer - http://www.debian.org
Free Software Developer - http://arnaud.quette.free.fr/

Arnaud Quette (aquette) wrote :

2012/4/11 Robert Ayrapetyan <email address hidden>

> Seems I've missed confirmation mail somewhere, now registered.
>
>
> > have you patched your 2.6.1 or used trunk + the patch?
> I've used trunk + the patch.
>
>
> > I'd be interested in a trace without the patch, to
> > see if we're still on the overflow side.
>
> trunk (rev.3529) "with no patch" log attached.
>
> Btw I've checked all logs I've sent so far - all of them contain:
>
> 1.199490 libusb_get_report: Unknown error
> 1.199512 Can't retrieve Report 0c: Input/output error
>
> Seems you are talking about Dmitry's case (overflow).
>

you're right: your issues are the same, but BSD doesn't report EOVERFLOW
has it should.
while Dmitry is running on Linux, the patch should work there.

@Robert: could you please run the test again (with wathever version), using
debug level 5 (-DDDDD) and doing "export USB_DEBUG=3" before launching the
driver?

if there is no way to catch overflow on BSD, I fear I'll have to create a
driver option to bypass this...

cheers,
Arno

>
>
> On 04/11/12 16:35, Arnaud Quette wrote:
>
>>
>> 2012/4/11 Robert Ayrapetyan <<email address hidden>
>> <mailto:robert.ayrapetyan@**gmail.com <email address hidden>>>
>>
>>
>> Hi.
>>
>>
>> Hi Robert,
>>
>> please check your subscription.
>> I'm still told that you're not subscribed!
>>
>> Mine output after patch looks same (attached).
>>
>>
>> in fact, it's different.
>> It's no more an overflow error, but an I/O error!
>>
>> have you patched your 2.6.1 or used trunk + the patch?
>> in the latter case, I'd be interested in a trace without the patch, to
>> see if we're still on the overflow side.
>>
>> On Wed, Apr 11, 2012 at 12:55 AM, Arnaud Quette
>> <<email address hidden> <mailto:<email address hidden>>**> wrote:
>> > Hi Robert and Dmitry,
>> >
>> > I'm crossing LP with the nut mailing list, since this is the same
>> EOVERFLOW
>> > issue.
>> >
>> > @Dmitry: it comes out that my previous patch was missing the
>> libusb.c part
>> > :-/
>> >
>> > to both, the attached patch should fix your issue.
>> > please send compressed debug output to confirm the fix.
>> >
>> > I'll have to do some more testing tomorrow since it's "blind
>> coded" (Ie,
>> > just compiled, not tested with HW)
>> >
>> > cheers,
>> > Arnaud
>>
>>
>> cheers,
>> Arnaud
>> --
>> Linux / Unix Expert R&D - Eaton - http://powerquality.eaton.com
>> Network UPS Tools (NUT) Project Leader - http://www.networkupstools.**
>> org/ <http://www.networkupstools.org/>
>> Debian Developer - http://www.debian.org
>> Free Software Developer - http://arnaud.quette.free.fr/
>>
>>

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nut (Ubuntu):
status: New → Confirmed
Arnaud Quette (aquette) wrote :

Dimitry,

can you please test the last patch I sent (apc-hid-overflow-2.diff) and report the output?

cheers,
--- Arnaud

Changed in nut (Ubuntu):
assignee: nobody → Arnaud Quette (aquette)
status: Confirmed → In Progress
Dmitry Andreychuk (and-dmitry) wrote :

I'll test it next week, as soon as i get to the hardware.

2012/6/9 Dmitry Andreychuk <email address hidden>

> I'll test it next week, as soon as i get to the hardware.
>

great, thanks.

cheers,
-- Arnaud

Dmitry Andreychuk (and-dmitry) wrote :

Attaching logs for svn3663 + patch

Arnaud Quette (aquette) wrote :

Dmitry,

2012/9/11 Frederic Bohe <email address hidden>

> Please see http://lists.alioth.debian.org/pipermail/nut-
> upsuser/2012-September/007909.html
>

please check the above reference pointed by Fred.
This is the solution you were waiting for.

To Ubuntu packagers fellows: the patch to be applied are
http://trac.networkupstools.org/projects/nut/changeset/3721
http://trac.networkupstools.org/projects/nut/changeset/3723

These patches also applies to 2.6.1 (the version of the present report),
though there are a few hunks on the first one.
Since these only impacts source files, there is nothing else required than
patching before the build.

cheers,
Arnaud
--
Linux / Unix / Opensource Engineering Expert - Eaton -
http://opensource.eaton.com
Network UPS Tools (NUT) Project Leader - http://www.networkupstools.org
Debian Developer - http://www.debian.org
Free Software Developer - http://arnaud.quette.fr

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers