bttv driver dies with VIDIOC_DQBUF error

Bug #260251 reported by Stephan Fabel
This bug affects 3 people
Affects          Status    Importance   Assigned to   Milestone
OpenCV           Unknown   Unknown
linux (Ubuntu)   Invalid   Undecided    Unassigned
Nominated for Hardy by Tim Howard
Declined for Intrepid by Brad Figg
Declined for Jaunty by Brad Figg
Declined for Karmic by Brad Figg
Nominated for Lucid by Stephan Fabel

Bug Description

When accessing the v4l video interface using the OpenCV cvQueryFrame() method, after an unpredictable amount of time, the bttv driver gives the following error message:

VIDIOC_DQBUF error 5, Input/output error

See also http://robolab.tek.sdu.dk/mediawiki/index.php/Using_OpenCV_on_live_video for a full description of the symptom and the solution given there (downgrading to kernel 2.6.18).

I'd like to keep my kernel version. Please update the bttv driver to fix the bug that was introduced since the driver included with kernel image 2.6.18.

If you tell me how, I can go on and test patches/compile new modules. This is somewhat urgent; I'd appreciate timely feedback.

Stephan Fabel (sfabel)
description: updated
Tim Howard (mexeled) wrote :

Same error here, using this card: http://www.amazon.com/Hauppauge-ImpactVCB-Capture-166-Profile/dp/B0002P4TK0/ref=sr_1_1?ie=UTF8&s=electronics&qid=1219903140&sr=8-1

Several seconds after opening the stream (in OpenCV and C++) it crashes with the error given in the bug description.

Very unusual though, because I can access all the streams using tvtime just fine. Anyone getting the same thing?

Leann Ogasawara (leannogasawara) wrote :

The Ubuntu Kernel Team is planning to move to the 2.6.27 kernel for the upcoming Intrepid Ibex 8.10 release. As a result, the kernel team would appreciate it if you could please test this newer 2.6.27 Ubuntu kernel. There are two ways you should be able to test:

1) If you are comfortable installing packages on your own, the linux-image-2.6.27-* package is currently available for you to install and test.

--or--

2) The upcoming Alpha5 for Intrepid Ibex 8.10 will contain this newer 2.6.27 Ubuntu kernel. Alpha5 is set to be released Thursday Sept 4. Please watch http://www.ubuntu.com/testing for Alpha5 to be announced. You should then be able to test via a LiveCD.

Please let us know immediately if this newer 2.6.27 kernel resolves the bug reported here or if the issue remains. More importantly, please open a new bug report for each new bug/regression introduced by the 2.6.27 kernel and tag the bug report with 'linux-2.6.27'. Also, please specifically note if the issue does or does not appear in the 2.6.26 kernel. Thanks again, we really appreciate your help and feedback.

Tim Howard (mexeled) wrote :

Kernel 2.6.27-rc3 broke tvtime, and using OpenCV still does not work.

I've included the code I am using to capture and display frames. There are two techniques in the code, neither of which works in 2.6.24 or 2.6.27-rc3.

Tim Howard (mexeled) wrote :

Still not working here...

David Coles (dcoles) wrote :

I've encountered this problem using a card driven by bttv (Hauppauge, bt878) and OpenCV. It's quite odd, since we've been able to capture video successfully using GStreamer, but an OpenCV-based application (using the Python bindings) will either not start or crash after a seemingly random amount of time with the message 'VIDIOC_DQBUF error 5, Input/output error'. This isn't caught by any of the error handling in Python and causes the interpreter to terminate.

I'm planning on trying the 2.6.28 kernel to see if it has any effect.

Changed in linux:
status: New → Confirmed
ciril (ciril-mocnik) wrote :

I have had the same problem on Fedora 8 (64-bit). As a temporary solution I commented out all occurrences of errno_exit ("VIDIOC_DQBUF") in /opencv/otherlibs/highgui/cvcap_v4l.cpp and rebuilt OpenCV. See also: http://article.gmane.org/gmane.comp.lib.opencv/7660/match=vidioc%5fdqbuf

David Coles (dcoles) wrote :

Thanks for the link, ciril.
Having a look at both the current svn and the current package in intrepid, they both contain that patch to fix the EAGAIN. It looks like any other error will still cause OpenCV to terminate. In my case the bttv card returns error 5, EIO, which according to the V4L spec may or may not indicate a serious error. While I'm quite tempted just to patch it locally to get the thing to work, I'd rather fix this bug properly.

From http://v4l2spec.bytesex.org/spec-single/v4l2.html#VIDIOC-QBUF :
EIO: VIDIOC_DQBUF failed due to an internal error. Can also indicate temporary problems like signal loss. Note the driver might dequeue an (empty) buffer despite returning an error, or even stop capturing.

http://opencvlibrary.svn.sourceforge.net/viewvc/opencvlibrary/trunk/opencv/src/highgui/cvcap_v4l.cpp?revision=1428 :
    if (-1 == xioctl (capture->deviceHandle, VIDIOC_DQBUF, &buf)) {
        switch (errno) {
        case EAGAIN:
            return 0;

        case EIO:
            /* Could ignore EIO, see spec. */

            /* fall through */

        default:
            /* display the error and stop processing */
            perror ("VIDIOC_DQBUF");
            return 1;
        }
    }

There are also a lot of warnings in dmesg generated by the card:
bttv0: timeout: drop=15799 irq=1410075/32669932, risc=21789a2c, bits: HSYNC OFLOW

I'm really not quite sure if this is an OpenCV bug (should it be ignoring EIO?) or a bttv bug (should this even be throwing EIO?). It's not exactly clear from the V4L2 spec what the proper response should be. Having a look at GStreamer's v4l2 code ( http://cgit.freedesktop.org/gstreamer/gst-plugins-good/tree/sys/v4l2/v4l2src_calls.c ) shows a fair bit of buffer checking to try to recover safely from this error, though I'm not sure I quite understand what it's trying to do at a glance. Any ideas?
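
The question raised above (should EIO from VIDIOC_DQBUF be fatal?) can be made concrete with a small sketch. It assumes one possible policy: EAGAIN means "no frame yet", EIO is treated as possibly temporary, as the quoted spec text allows, and everything else is fatal. The enum and function names are hypothetical and are not part of OpenCV or the V4L2 headers.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical names, for illustration only; nothing here is OpenCV or V4L2 API. */
enum dqbuf_action {
    DQBUF_RETRY_LATER,  /* no frame ready yet; poll again */
    DQBUF_SKIP_FRAME,   /* possibly temporary; drop this frame and carry on */
    DQBUF_FATAL         /* report the error and stop capturing */
};

/* Map the errno left by a failed VIDIOC_DQBUF ioctl to an action.
 * This is one possible policy, not the behaviour of any released OpenCV. */
enum dqbuf_action classify_dqbuf_errno(int err)
{
    assert(err > 0);  /* errno values are positive */

    switch (err) {
    case EAGAIN:
        return DQBUF_RETRY_LATER;
    case EIO:
        /* The spec text quoted above says EIO "can also indicate
         * temporary problems like signal loss". */
        return DQBUF_SKIP_FRAME;
    default:
        return DQBUF_FATAL;
    }
}
```

A capture loop would call this with the saved errno right after the ioctl fails. Whether skipping on EIO is actually safe for bttv is exactly what this thread leaves open.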

David Coles (dcoles) wrote :

I've just rebuilt the package with a small patch that will cause the highgui to ignore the EIO error. This has also been checked in upstream as an experimental fix. It seems to fix the problem, but I'd like to test it for a bit longer than the 5 minutes I had tonight.

Tim Howard (mexeled) wrote :

I can't even get the sourceforge SVN repository to build.
Keep getting this on make:
No rule to make target `cxcore/cxjacobieigens.cpp', needed by `cxjacobieigens.lo'.

Anywho, what source are you using to build? It looks like the repo I have already does what your patch changes.
.. But I can't get it to build.

David Coles (dcoles) wrote :

At the moment I'm building using the current intrepid package source. I've not personally tried building the current SVN source since it looks like some fairly serious changes have been made, but one of the guys I was working with did and found it was a bit of a mess to remove. (It wasn't built with GTK bindings, yet even after running 'make uninstall' it didn't seem to remove all the shared libraries correctly.)

It seems like this stops it terminating, but it looks like it still locks up after a short time (accompanied by a 'bttv0: timeout: drop=15799 irq=1410075/32669932, risc=21789a2c, bits: HSYNC OFLOW' in dmesg). I might try copying some of the stuff from the GStreamer code to see if that's what prevents GStreamer suffering the same problems.

David Coles (dcoles) wrote :

Ok. Since I didn't have much luck with the patch against the current Ubuntu package, I've grabbed the latest version from Subversion and built that (using the cmake/make combination). Unfortunately, while programs no longer terminate with the 'VIDIOC_DQBUF error 5, Input/output error', they quickly get stuck in an EAGAIN loop (usually after 2 or 3 EIO errors). Again it seems to be very closely correlated with the 'bttv0: timeout' in the kernel log, though I can't seem to reproduce this error with any other program that uses v4l2.

I've opened up another bug on the OpenCV bugtracker and hopefully we might be able to work out what on earth is going on.
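
The EAGAIN loop described above suggests bounding the number of retries so that a stalled device produces an error instead of a hang. The following is a minimal sketch of that idea, not the OpenCV code: `dequeue_fn`, `dequeue_bounded`, and the `flaky_source` stand-in are hypothetical names, and the callback is assumed to behave like `ioctl(fd, VIDIOC_DQBUF, &buf)` (0 on success, -1 with errno set).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type: returns 0 on success, or -1 with errno set,
 * mirroring how a VIDIOC_DQBUF ioctl reports failure. */
typedef int (*dequeue_fn)(void *ctx);

/* Try to dequeue one frame, giving up after max_attempts tries.
 * Returns 0 on success, -1 on an unexpected error or when the
 * attempts run out. */
int dequeue_bounded(dequeue_fn dequeue, void *ctx, int max_attempts)
{
    assert(dequeue != NULL);

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        if (dequeue(ctx) == 0)
            return 0;
        if (errno != EAGAIN && errno != EIO)
            return -1;  /* unexpected error: do not retry */
        /* EAGAIN or EIO: a real caller would wait here, for example
         * with select() and a timeout, before trying again. */
    }
    return -1;  /* budget exhausted: report instead of spinning forever */
}

/* Stand-in for a capture device, for demonstration only: fails
 * failures_left times with the given errno, then succeeds. */
struct flaky_source {
    int failures_left;
    int err;
};

int flaky_dequeue(void *ctx)
{
    struct flaky_source *src = (struct flaky_source *)ctx;

    if (src->failures_left > 0) {
        src->failures_left--;
        errno = src->err;
        return -1;
    }
    return 0;
}
```

With the stand-in, a source that fails twice with EIO and then recovers succeeds within a budget of five attempts, while one that keeps returning EAGAIN makes `dequeue_bounded` return -1 after the fifth attempt. The retry budget is an arbitrary choice, and this does nothing about the underlying 'bttv0: timeout'.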

David Coles (dcoles) wrote :

Ok, this may in fact be a kernel bug. I've just tried installing the 2.6.28 kernel from Jaunty and I'm no longer getting these EIO errors and kernel warning messages when trying out the sample programs. I'd still like to try this with a proper program, so I need to chase up my project partner, who has our OpenCV code.

If it does fix it, just moving to the 2.6.28 kernel might be the best solution especially with Jaunty in late Alpha now. (Still quite odd that only one application appeared to trigger the problem).

itbroke (jwlangston21) wrote :

I'm still having this same problem (VIDIOC_DQBUF error 5, bttv timeout) with openCV using both Intrepid and Jaunty. For me, nothing has changed between kernel versions. When the video comes up, the display is purely black, and quickly terminates along with new dmesg entries.

David Coles (dcoles) wrote :

Alas, yes. After a proper test this morning we still had the program crash with the VIDIOC_DQBUF error. So unfortunately, no, it doesn't look like the 2.6.28 kernel fixed the bug.

David Coles (dcoles) wrote :

Current open bug on the OpenCV bugtracker.

David Coles (dcoles) wrote :

There is another experimental fix in upstream subversion.
http://opencvlibrary.svn.sourceforge.net/viewvc/opencvlibrary/trunk/opencv/src/highgui/cvcap_v4l.cpp?r1=1604&r2=1609

I've rebuilt a local copy of the current package with this fix included and am seeing whether this makes things any better.

itbroke (jwlangston21) wrote :

I recently reinstalled with Ubuntu 8.10 and installed all the OpenCV items from the Synaptic package manager. I'm not having the problem anymore.

Tim Howard (mexeled) wrote :

Still not working for me.

Now using Ubuntu 9.04 with packages from Synaptic.

Tim Howard (mexeled) wrote :

I should also note that the error now happens immediately; the window does not even show up.

kernel-janitor (kernel-janitor) wrote :

[This is an automated message. Apologies if it has reached you inappropriately.]

This bug was flagged as having a patch attached. The Ubuntu Kernel Team's preferred policy is for all patches to be submitted and accepted into the upstream kernel before agreeing to merge them into the Ubuntu kernel. The goal for the Ubuntu kernel is to have little to no divergence from the upstream linux kernel source.

https://wiki.ubuntu.com/KernelTeam/KernelPatches has been written to document the suggested policy and procedures for helping get a patch merged upstream and subsequently into the Ubuntu kernel. Please take the time to review that wiki if this patch should be considered for inclusion into the upstream and Ubuntu kernel. Let us know if you have any questions or need any help via the Ubuntu Kernel Team mailing list. Thanks in advance.

tags: added: kj-comment
Stephan Fabel (sfabel) wrote :

I've been trying with
- OpenCV 1.0.0 from source
- OpenCV 1.1.0pre1 from source
- OpenCV SVN (current as of today)
- of course the packages included in Jaunty

NOTHING seems to work. I've applied the patch attached to this bug report, and I am getting the infinite loop condition that David mentioned.

After a year has gone by, will the V4L driver be updated to properly handle buffer I/O so that third party libraries won't have to do that? It seems to be a kernel issue and the frustrating thing is - I've tried it with Debian Etch (i.e. Kernel 2.6.18) and that works just fine.

Could someone with more insight in V4L look at the differences between the versions? I will happily try things for anybody, but I really need this to work.

Stephan Fabel (sfabel) wrote :

This is the error I get with the OpenCV SVN as of today (Aug 3, 2009):
libv4l2: error dequeuing buf: Resource temporarily unavailable

This pointed me here: https://bugs.launchpad.net/ubuntu/+source/libv4l/+bug/303174
and then here: https://bugs.launchpad.net/ubuntu/+source/libv4l/+bug/260918

Using their "workaround" (LD_PRELOAD=/usr/lib/libv4l/v4l1compat.so), I've been getting this output:

libv4l2: error dequeuing buf: Input/output error
VIDIOC_DQBUF: Input/output error
[..]
libv4l2: error dequeuing buf: Resource temporarily unavailable
[..]

Stephan Fabel (sfabel) wrote :

Sorry, one last thing: tvtime seems to do things differently, since it works perfectly fine with my card and camera. So no kernel bug after all?

Tim Howard (mexeled) wrote :

@Stephan Fabel: I've noticed the exact same thing. tvtime seems to work just fine with my cameras (last I tried them).

David Coles (dcoles) wrote :

Hi Stephan,

There seems to be something very odd going on with OpenCV's v4l driver and the way it interacts with the v4l/bttv driver. Whilst http://opencvlibrary.svn.sourceforge.net/viewvc/opencvlibrary?view=rev&revision=1609 at least prevented the EIO from crashing OpenCV, it was still accompanied by kernel warnings ("bttv0: timeout") and a drop in frame rate as it was re-buffered.

Since no other programs seemed to encounter this problem, the solution we took was to use GStreamer to capture from the v4l2src and then pass the image data into OpenCV ourselves. This seemed to avoid the issues with highgui's v4l capture drivers (although in the current OpenCV package the Python setters were broken and required us to hack together our own little Python wrapper). See http://www.tardis.ed.ac.uk/~dcoles/sdpgroup2/workspace/vision/gstcapture.py for an example.

It seemed to either be a kernel bug uncovered by an unusual way of using v4l with this device or the OpenCV v4l code not handling the device correctly. I noticed that the OpenCV subversion now supports a few higher level capture devices such as libv4l and gstreamer which may be a way around these issues. Unfortunately I no longer have access to the bttv hardware we were using and so can't really do any more testing.

Stephan Fabel (sfabel) wrote :

Found that my bttv driver shared an IRQ with my network card. So I disabled the network (modprobe -r b44), and voila - no real buffer problems with my capture card anymore. It does not seem stable, but it is generally working.

Could it be the motherboard?

If someone could point me to a resource where I can learn how to direct either my ethernet or my bttv to a different IRQ, I would be eternally grateful (modinfo gave ambiguous results).

Thanks for the links, I will take a look at them anyway. You're saying that GStreamer has an "easy to use" interface to access V4L devices also?

Thanks,
Stephan
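
The shared-IRQ observation above can be checked by scanning /proc/interrupts for a line that names the capture driver together with another handler. This is a rough sketch with hypothetical function names; it relies on the assumption that handlers sharing an IRQ appear comma-separated on one line, and it only detects sharing, it does not reassign IRQs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Heuristic: returns 1 if this /proc/interrupts line names `device`
 * and lists more than one handler (a comma is present), else 0. */
int irq_line_is_shared(const char *line, const char *device)
{
    assert(line != NULL && device != NULL);
    return strstr(line, device) != NULL && strchr(line, ',') != NULL;
}

/* Print every line of `path` (normally "/proc/interrupts") on which
 * `device` shares its IRQ. Returns the number of such lines, or -1
 * if the file cannot be opened. Lines longer than the buffer are read
 * in pieces, which could confuse the heuristic on machines with very
 * many CPUs. */
int report_shared_irqs(const char *path, const char *device)
{
    char line[4096];
    int count = 0;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof line, fp) != NULL) {
        if (irq_line_is_shared(line, device)) {
            fputs(line, stdout);
            count++;
        }
    }
    fclose(fp);
    return count;
}
```

Calling `report_shared_irqs("/proc/interrupts", "bttv")` would print any IRQ line on which bttv has company. A made-up line such as " 17:  1410075  IO-APIC-fasteoi  bttv0, eth0" counts as shared, while a line listing only bttv0 does not.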

Tim Howard (mexeled) wrote :

I was not able to get it working by disabling the network (unless I'm misunderstanding something).

I did 'modprobe -r b44' and disabled networking and 'eth0' was no longer showing up in 'cat /proc/interrupts'.
However, I still got the same error as always before.
Tvtime doesn't work on my test machine (something about GATOS drivers), but xawtv shows the video data just fine (regardless of the network status).

I've attached a file containing the report of 'cat /proc/interrupts' *before* I disabled networking.

Andy Whitcroft (apw) wrote :

Can we confirm whether this combination is now working on Maverick/Natty? (You may be able to test in a live CD environment.) It seems that we still do not know whether this is a library bug or an unusual access method triggering a kernel bug.

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Stephan Fabel (sfabel) wrote :

I haven't had any issues with OpenCV 2.1 under Maverick (as of today). Workaround for me was to disable the ffmpeg support and use gstreamer instead (as soon as that was stable enough).

Thanks,
Stephan

Tim Gardner (timg-tpi) wrote :

expiring for lack of traffic

Changed in linux (Ubuntu):
status: Incomplete → Invalid