Comment 23 for bug 897773

Jeff Lane  (bladernr) wrote :

So I've done some digging... First, an anecdotal story:

A man was experiencing extreme pain whenever he touched his arm in a particular place. So he goes to the doctor to have it checked out. The man says "Doctor, every time I push on this spot on my arm, I get horrible pain."

The doctor looks at the man and says, "Well then, stop pushing there."

After looking at this and recreating it a few times, I think I know what's going on here. Now keep in mind that this is just a guess... but I think that the vpd file in the sysfs directory for this card is actually a direct interface to the card itself, meant to be accessed only via an API of some sort. If you run file on it and on other items in the directory, you'll note that it's a "regular file" as opposed to "ASCII text" or some other file type.

For comparison, if you run file on /proc/kcore, it too is a "regular file".
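As a rough illustration of why "regular file" tells us so little, here is a minimal Python sketch that classifies a path from its lstat() data alone, without ever opening it. The describe() helper is a made-up name for this example, not something from udev_resource. Run against a sysfs attribute, I'd expect it to report "regular file" whether or not reading that attribute is safe.

```python
import os
import stat


def describe(path):
    """Return a coarse file type for path using lstat(), without opening it."""
    mode = os.lstat(path).st_mode
    if stat.S_ISREG(mode):
        return "regular file"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISLNK(mode):
        return "symlink"
    return "other"
```

The point is only that the file type comes from metadata, so it can't tell you what happens inside the kernel when the file is actually read.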

I mention this because the behaviour seen here is alarmingly similar to what happened on older kernels (2.2.x or 2.4.x, IIRC), where you could cat /proc/kcore to bring a system crashing to a halt.

So what I think is happening is that by accessing that vpd file (either using cat or in the udev_resource script, which is doing an open() on the vpd file) we are inadvertently causing the kernel to hang. I'd like to poke at this further by checking into the uevent file in that device's sysfs directory, but at the moment I'm unable to access the system, even after a reboot. It may need to be re-installed or at least poked manually. Power-cycling it remotely hasn't worked so far.

In any case, I think Tim's already got this sorted out. My only concern with his suggested solution is whether or not blacklisting vpd will break data collection in other places. In other words, while /path/to/vpd here may be breaking the system, is there a case where /path/to/vpd actually contains parseable data that doesn't trigger a hung system when you try opening it?
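To make the concern concrete, here is a minimal Python sketch of what a name-based blacklist could look like. This is my own guess at the shape of it, not Tim's actual patch or the real udev_resource code: read_attributes(), BLACKLISTED_ATTRIBUTES and max_bytes are all hypothetical names, and "vpd" is the only entry taken from this bug.

```python
import os

# Hypothetical blacklist; "vpd" is the only name that comes from this bug.
BLACKLISTED_ATTRIBUTES = {"vpd"}


def read_attributes(device_dir, blacklist=BLACKLISTED_ATTRIBUTES, max_bytes=4096):
    """Read the attribute files directly inside device_dir.

    Files whose name is in blacklist are skipped without being opened.
    Returns a dict mapping attribute name to the raw bytes read.
    """
    attributes = {}
    for name in sorted(os.listdir(device_dir)):
        if name in blacklist:
            continue  # never open these, since opening is what triggers the hang
        path = os.path.join(device_dir, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories and anything that is not a file
        try:
            with open(path, "rb") as handle:
                attributes[name] = handle.read(max_bytes)
        except OSError:
            continue  # unreadable (for example write-only) attributes are ignored
    return attributes
```

Note that matching on the bare name skips every vpd file on every device, which is exactly the trade-off I'm asking about. Matching on the full device path instead would narrow it to this one card, at the cost of having to know the problem devices up front.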