[Latitude XT3] Firewire nonfunctional after suspend

Bug #881688 reported by Marc Legris
This bug affects 2 people
Affects          Status   Importance  Assigned to  Milestone
linux (Ubuntu)   Invalid  High        Ming Lei
Oneiric          Invalid  Undecided   Unassigned

Bug Description

After this system is suspended and resumed, FireWire devices are no longer automounted. dmesg currently shows errors related to FireWire.

ProblemType: Bug
DistroRelease: Ubuntu 11.10
Package: linux-image-3.0.0-12-generic-pae 3.0.0-12.20
ProcVersionSignature: Ubuntu 3.0.0-12.20-generic-pae 3.0.4
Uname: Linux 3.0.0-12-generic-pae i686
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
ApportVersion: 1.23-0ubuntu3
Architecture: i386
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: PCH [HDA Intel PCH], device 0: STAC92xx Analog [STAC92xx Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: ubuntu 1694 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'PCH'/'HDA Intel PCH at 0xe1c60000 irq 44'
   Mixer name : 'Intel CougarPoint HDMI'
   Components : 'HDA:111d76e7,102804b4,00100102 HDA:80862805,80860101,00100000'
   Controls : 27
   Simple ctrls : 13
Date: Tue Oct 25 17:10:20 2011
HibernationDevice: RESUME=UUID=799761b2-bd4f-4a89-91f3-5c096ba13b5f
InstallationMedia: Ubuntu 11.10 "Oneiric Ocelot" - Release i386 (20111012)
MachineType: Dell Inc. Latitude XT3
ProcEnviron:
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.0.0-12-generic-pae root=UUID=abb4b1a7-23cd-481b-a38d-303cf210912e ro quiet splash initcall_debug initcall_debug vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-3.0.0-12-generic-pae N/A
 linux-backports-modules-3.0.0-12-generic-pae N/A
 linux-firmware 1.60
SourcePackage: linux
StagingDrivers: mei
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/20/2011
dmi.bios.vendor: Dell Inc.
dmi.bios.version: X23
dmi.board.name: 09HM99
dmi.board.vendor: Dell Inc.
dmi.board.version: X00
dmi.chassis.type: 9
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvrX23:bd04/20/2011:svnDellInc.:pnLatitudeXT3:pvr01:rvnDellInc.:rn09HM99:rvrX00:cvnDellInc.:ct9:cvr:
dmi.product.name: Latitude XT3
dmi.product.version: 01
dmi.sys.vendor: Dell Inc.

Revision history for this message
Marc Legris (maaarc-deactivatedaccount-deactivatedaccount-deactivatedaccount-deactivatedaccount-deactivatedaccount) wrote :
Changed in linux (Ubuntu):
assignee: nobody → Chris Van Hoof (vanhoof)
description: updated
Brad Figg (brad-figg)
Changed in linux (Ubuntu):
status: New → Confirmed
Jeff Lane  (bladernr)
tags: added: blocks-hwcert-enablement
Chris Van Hoof (vanhoof)
Changed in linux (Ubuntu):
assignee: Chris Van Hoof (vanhoof) → Ming Lei (tom-leiming)
importance: Undecided → High
Ara Pulido (ara)
tags: added: blocks-hwcert
Revision history for this message
Ming Lei (tom-leiming) wrote :

Could you test the upstream kernel to see if the issue has been fixed?

http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.2-rc2-oneiric/

If the issue is still present on the upstream kernel, please post the dmesg
output from that kernel so that upstream developers can help with it.

thanks,

Revision history for this message
Marc Legris (maaarc-deactivatedaccount-deactivatedaccount-deactivatedaccount-deactivatedaccount-deactivatedaccount) wrote :

Hi Ming,

After updating to the mainline kernel, the system can now use firewire devices after suspend.

Revision history for this message
Ming Lei (tom-leiming) wrote : Re: [Bug 881688] Re: [Latitude XT3] Firewire nonfunctional after suspend
  • ppp1 (122.4 KiB, application/octet-stream; name=ppp1)

Hi Marc,

As we discussed and tested yesterday, your FireWire HDD still can't be
automounted on the upstream kernel (3.2-rc2). The steps to reproduce are:

 - run the system sleep test first (I have to do it this way for remote
   access to the Latitude XT3):
       echo devices > /sys/power/pm_test
       echo mem > /sys/power/state
 - plug the FireWire HDD into the machine
 - the HDD can't be recognized, and the kernel dumps the failure
   message below [1]

The dmesg from the upstream kernel is attached.

[1]
[  372.989761] firewire_ohci: Register access failure - please notify <email address hidden>
[  373.005783] firewire_core: skipped bus generations, destroying all nodes

thanks,
--
Ming Lei
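
The reproduction steps above could be wrapped in a small script. This is only a sketch, assuming a kernel that exposes /sys/power/pm_test and a root shell; SYSFS_ROOT and both function names are hypothetical and exist only so the steps can be dry-run against a scratch directory.

```shell
#!/bin/sh
# Sketch of the pm_test "devices" cycle from the steps above.
# SYSFS_ROOT is a hypothetical override for dry runs; on a real
# machine it defaults to /sys and the writes need root.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

pm_test_devices_cycle() {
    # Select the "devices" test level: devices are suspended, held
    # briefly, then resumed, without a full platform suspend.
    echo devices > "$SYSFS_ROOT/power/pm_test" || return 1
    # Trigger the (test) suspend-to-RAM transition.
    echo mem > "$SYSFS_ROOT/power/state" || return 1
}

# Reduce a kernel log read from stdin to its FireWire-related lines.
firewire_log_lines() {
    grep -i 'firewire'
}
```

A run would then look like `pm_test_devices_cycle && dmesg | firewire_log_lines`. Writing `none` back to pm_test afterwards returns the machine to normal suspend behaviour.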

Revision history for this message
Ming Lei (tom-leiming) wrote :

Hi,

I have also reported the issue upstream: <email address hidden>

thanks,
--
Ming Lei

Revision history for this message
Marc Legris (maaarc-deactivatedaccount-deactivatedaccount-deactivatedaccount-deactivatedaccount-deactivatedaccount) wrote :

Ming, I believe the way you put the system to sleep is causing the issue. If I put the system to sleep through the applet, it will be able to automount the firewire hdd after resuming.

Revision history for this message
Ming Lei (tom-leiming) wrote :

Hi Marc,

On Sat, Dec 3, 2011 at 4:24 AM, Marc Legris <email address hidden> wrote:
> Ming, I believe the way you put the system to sleep is causing the

In fact, pm_test in "devices" mode only puts all devices to sleep,
keeps them in the sleep state for 5 seconds, and then resumes them.

> issue. If I put the system to sleep through the applet, it will be able
> to automount the firewire hdd after resuming.

OK. I will continue testing via the applet (pm_suspend), but I need your
help to enable WOL on the machine, since, as you know, I have to access
the machine remotely.

thanks,
--
Ming Lei

Revision history for this message
Stefan Richter (stefan-r-ubz) wrote :

What FireWire target devices did you test? According to the first 3 bytes of the GUID (this part is the OUI), it is a device from Buffalo, Inc.
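
As a rough illustration of the OUI remark above: the OUI is simply the first 3 bytes (6 hex digits) of the 64-bit GUID. The helper below is a hypothetical sketch, and the GUID used in the usage note is made up, not the one from this report.

```shell
# Print the OUI (first 3 bytes) of a 64-bit FireWire GUID given as
# 16 hex digits, with or without a leading "0x".
# guid_to_oui is a hypothetical helper name.
guid_to_oui() {
    guid="${1#0x}"
    # Reject anything that contains a non-hex character.
    case "$guid" in
        *[!0-9a-fA-F]*) return 1 ;;
    esac
    # A GUID is 8 bytes, i.e. exactly 16 hex digits.
    [ "${#guid}" -eq 16 ] || return 1
    printf '%s\n' "$guid" | cut -c1-6
}
```

For example, `guid_to_oui 0x0123456789abcdef` prints `012345`, which can then be looked up in the IEEE OUI registry. On a running system the GUID would presumably come from the device's `guid` attribute under /sys/bus/firewire/devices/, assuming that attribute is present on the kernel in use.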

Observations about CurrentDmesg.txt from comment 1:

  - There is no message along the lines of "PM: resume of
    drv:firewire_ohci dev:0000:09:00.0 complete after ...msecs".
    Should there be such a message?

  - There is no "Register access failure".

  - After PM resume, self ID receive DMA, AT-req DMA, and AR-resp DMA
    work fine, which is evident by "rediscovered device fw1" from
    firewire-core.

  - Ditto, physical DMA and AR-req DMA work, since firewire-sbp2
    receives an SBP-2 reconnect status block and six pairs of SBP-2
    login and logout status blocks from the target.

So, to me it looks unlikely that there is a controller problem.

More likely to me is that the target firmware is crap since it refuses re-login. The status codes mean:

  - Received for the reconnect request ORB:
    0:9 = Request complete. Function rejected.

  - Received for each logout and login request ORB:
    0:4 = Request complete. Access denied.
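
When reading through a long log, the two status pairs listed above could be translated with a small lookup. This is a sketch only: sbp2_status_text is a hypothetical name, and it deliberately covers just the two codes mentioned in this comment rather than the full SBP-2 status table.

```shell
# Translate a "response:sbp_status" pair into the meaning given above.
# Only the two codes from this comment are covered; anything else is
# reported as not covered instead of being guessed.
sbp2_status_text() {
    case "$1" in
        0:4) echo "Request complete. Access denied." ;;
        0:9) echo "Request complete. Function rejected." ;;
        *)   echo "not covered by this sketch: $1" ;;
    esac
}
```

For example, `sbp2_status_text 0:9` prints "Request complete. Function rejected."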

Of course, there /still/ could be a controller problem because perhaps the physical DMA delivered malformed reconnect or logout or login ORBs to the target. But I think this is highly unlikely.

So far, I don't know what to make of this information.

I was thinking about asking for a debug log with "firewire-ohci debug=3" (logging of AT, AR, and self-ID-R DMA interrupt events), but that would show only a fraction of the picture since most of the SBP-2 communication occurs without CPU interrupts by means of OHCI-1394 physical DMA (something like remote DMA).

Revision history for this message
Stefan Richter (stefan-r-ubz) wrote :

Re comment 6 and 7:

So, the disk is fine after resume if the PC was suspended only for a few seconds, but becomes inaccessible if the PC was suspended for longer?

Or does the HDD keep spinning in one case and spins down in the other?

Revision history for this message
Stefan Richter (stefan-r-ubz) wrote :

> Or does the HDD keep spinning in one case and spins down in the other?

Of course, either way /should/ work. But if it makes a difference, it would mean that the target itself enters a low power state from which it is unable to resume due to a firmware bug.

Would
# echo 0 > /sys/class/scsi_disk/6:0:0:0/manage_start_stop
before suspend make a difference?
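
For repeated tests, the command above could be wrapped so that the previous setting is printed before it is overwritten and can be restored afterwards. A minimal sketch, assuming the disk address 6:0:0:0 from the command above; SYSFS_ROOT and disable_start_stop are hypothetical names used for illustration.

```shell
# SYSFS_ROOT is a hypothetical override for dry runs; default is /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Turn off start/stop management for one SCSI disk (e.g. 6:0:0:0).
# Prints the previous value so it can be written back later.
disable_start_stop() {
    attr="$SYSFS_ROOT/class/scsi_disk/$1/manage_start_stop"
    # Bail out if the attribute is missing or not writable (needs root).
    [ -w "$attr" ] || return 1
    cat "$attr"
    echo 0 > "$attr"
}
```

Typical use would be `old=$(disable_start_stop 6:0:0:0)` before suspend, then writing `$old` back to the same attribute once the test is done.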

Revision history for this message
Ming Lei (tom-leiming) wrote :

Hi,

On Sat, Dec 3, 2011 at 8:39 PM, Stefan Richter
<email address hidden> wrote:
> Re comment 6 and 7:
>
> So, the disk is fine after resume if the PC was suspended only for a few
> seconds, but becomes inaccessible if the PC was suspended for longer?
>
> Or does the HDD keep spinning in one case and spins down in the other?
>

It looks like neither of the two cases applies.

Before I ran 'echo mem > /sys/power/state' (pm_test: devices), there
should have been no FireWire HDD plugged into the bus. After resume
from system sleep (pm_test: devices), Marc plugged one FireWire HDD
into the bus, but the HDD was not recognized by the kernel.

Marc, correct me if my description is not consistent with your test.

thanks,
--
Ming Lei

Revision history for this message
Stefan Richter (stefan-r-ubz) wrote :

For the failure mode that a disk is not recognized if plugged in _after_ resume, please provide kernel logs with firewire-ohci IRQ logging enabled:

# echo 7 > /sys/module/firewire_ohci/parameters/debug
(suspend)
(resume)
(attach FireWire disk)
(collect the dmesg)

To be sure, repeat this with additional logging of bus reset events:

# modprobe -r firewire-ohci
# modprobe firewire-ohci debug=15
(suspend)
(resume)
(attach FireWire disk)
(collect the dmesg)

Beware, either debug level may possibly cause huge logs.
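
To make the debug values above easier to read: the parameter is a bitmask. The sketch below assumes the bit meanings from the module's parameter description as I recall it (1 = AT/AR events, 2 = self-IDs, 4 = IRQs, 8 = busReset events); `modinfo firewire-ohci` on the kernel in use is the authoritative source. ohci_debug_flags is a hypothetical helper name.

```shell
# List the log categories enabled by a firewire-ohci debug= value.
# Bit assignments are an assumption to be checked against
# `modinfo firewire-ohci`.
ohci_debug_flags() {
    mask="$1"
    out=""
    if [ $(($mask & 1)) -ne 0 ]; then out="$out AT/AR-events"; fi
    if [ $(($mask & 2)) -ne 0 ]; then out="$out self-IDs"; fi
    if [ $(($mask & 4)) -ne 0 ]; then out="$out IRQs"; fi
    if [ $(($mask & 8)) -ne 0 ]; then out="$out busReset-events"; fi
    # Strip the leading space left by the first appended category.
    echo "${out# }"
}
```

Under that assumption, `ohci_debug_flags 7` prints `AT/AR-events self-IDs IRQs`, and `ohci_debug_flags 15` adds `busReset-events`, which matches the two runs requested above.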

Revision history for this message
Ming Lei (tom-leiming) wrote :

Hi Stefan,

We will do the tests and post the logs, thanks.

Marc, could you help run the tests per Stefan's comment (#12) and post
the debug logs? As you know, I have no way to plug in the HDD remotely.

thanks,
--
Ming Lei


Revision history for this message
Marc Legris (maaarc-deactivatedaccount-deactivatedaccount-deactivatedaccount-deactivatedaccount-deactivatedaccount) wrote :

dmesg with mainline kernel (w/o modprobe)

Revision history for this message
Marc Legris (maaarc-deactivatedaccount-deactivatedaccount-deactivatedaccount-deactivatedaccount-deactivatedaccount) wrote :

dmesg with mainline kernel (w/ modprobe)

Revision history for this message
Marc Legris (maaarc-deactivatedaccount-deactivatedaccount-deactivatedaccount-deactivatedaccount-deactivatedaccount) wrote :

dmesg with oneiric kernel (w/o modprobe)

Revision history for this message
Marc Legris (maaarc-deactivatedaccount-deactivatedaccount-deactivatedaccount-deactivatedaccount-deactivatedaccount) wrote :
Revision history for this message
Marc Legris (maaarc-deactivatedaccount-deactivatedaccount-deactivatedaccount-deactivatedaccount-deactivatedaccount) wrote :

dmesg with mainline kernel (w/ modprobe)

Revision history for this message
Marc Legris (maaarc-deactivatedaccount-deactivatedaccount-deactivatedaccount-deactivatedaccount-deactivatedaccount) wrote :

After a reinstall of Oneiric, I'm seeing the FireWire HDD being automounted.

tags: added: rls-mgr-o-tracking
Revision history for this message
Ara Pulido (ara) wrote :

Marc, are you suggesting that you no longer see this issue?

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Revision history for this message
Marc Legris (maaarc-deactivatedaccount-deactivatedaccount-deactivatedaccount-deactivatedaccount-deactivatedaccount) wrote :

Correct, though I am now running "echo 7 > /sys/module/firewire_ohci/parameters/debug", which is the only thing different this time. I'm not sure whether this would cause the FireWire device to function properly. I'd have to test with and without the added parameter.

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux (Ubuntu Oneiric):
status: New → Confirmed
Revision history for this message
Marc Legris (maaarc-deactivatedaccount-deactivatedaccount-deactivatedaccount-deactivatedaccount-deactivatedaccount) wrote :

Tested with a fresh install of Oneiric; it seems to be working normally. Perhaps the first time was a fluke.

Changed in linux (Ubuntu):
status: Incomplete → Invalid
Ara Pulido (ara)
Changed in linux (Ubuntu Oneiric):
status: Confirmed → Invalid