Xen AACRaid kernel panic

Bug #232017 reported by Jason Kendall
Affects: linux (Ubuntu)
Status: Triaged
Importance: Medium
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Binary package hint: linux-image-2.6.24-17-xen

When moving files on this server, the kernel panics at random points. I have not been able to get console redirection to work, so I don't have the full trace; screen pictures will be attached.

It seems that the panic occurs once the onboard RAID controller cache fills up; no files are actually written to the disk.

root@xen:~# cat /proc/kmsg
<3>[ 107.140983] PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:06:08.1
<0>[ 107.141292] ------------[ cut here ]------------
<2>[ 107.141484] kernel BUG at /build/buildd/linux-2.6.24/debian/build/custom-source-xen/drivers/scsi/aacraid/aachba.c:2574!

Attempted with acpi=off

Does not happen in -generic or -server versions.
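
For anyone collecting similar traces, here is a minimal sketch (not part of the original report) that scans kernel log lines for the SW-IOMMU exhaustion message quoted above and extracts the requested size and the PCI device. The function name and the regular expression are illustrative assumptions based only on the message format shown in this report.

```python
import re

# Pattern derived from the single log line quoted in this report;
# other kernels may word the message differently.
SWIOTLB_RE = re.compile(
    r"PCI-DMA: Out of SW-IOMMU space for (?P<bytes>\d+) bytes "
    r"at device (?P<device>\S+)"
)


def find_swiotlb_exhaustion(lines):
    """Return a (requested_bytes, pci_device) tuple for each matching line."""
    hits = []
    for line in lines:
        match = SWIOTLB_RE.search(line)
        if match:
            hits.append((int(match.group("bytes")), match.group("device")))
    return hits


# The two lines below are copied from the report's /proc/kmsg output.
sample = [
    "<3>[ 107.140983] PCI-DMA: Out of SW-IOMMU space for 65536 bytes "
    "at device 0000:06:08.1",
    "<0>[ 107.141292] ------------[ cut here ]------------",
]
print(find_swiotlb_exhaustion(sample))
```

The device address it reports can then be matched against lspci output to see which controller ran out of bounce-buffer space.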

Tags: xen
Jason Kendall (coolacid) wrote :

output of:

uname -a
lspci
dmesg

and cat /proc/kmsg when the panic happens.

Richard Seguin (sectech) wrote :

Thanks for reporting this bug and any supporting documentation. Since this bug has enough information provided for a developer to begin work, I'm going to mark it as confirmed and let them handle it from here. Thanks for taking the time to make Ubuntu better!

Changed in linux:
status: New → Confirmed
Richard Seguin (sectech) wrote :

* Assigning to linux-source-2.6.24 as this is the main kernel package.

Jason Kendall (coolacid) wrote :

I should mention that although there are two RAID controllers in this machine, drives are connected only to the following card:

06:08.1 RAID bus controller: Dell PowerEdge Expandable RAID Controller 3/Di (rev 01)

It has the latest firmware (as of 5 days ago)

This is a PE2650, PowerEdge 2650 (meta-tags for searching)

Jason Kendall (coolacid) wrote :

Following another bug report, I was able to get Xen console redirection working. See attached output.

Jason Kendall (coolacid) wrote :

Looking into the "PCI-DMA: Out of SW-IOMMU space" error, I was directed to a few options.

I tried swiotlb=force and got a panic. I assumed it was the same panic, but I could be wrong.

As per the xen-devel list, I tried raising this setting (whatever it controls) with swiotlb=128 and got a different backtrace; see attached.

Jason Kendall (coolacid) wrote :

OK, I just read the CentOS Xen implementation notes, which cover swiotlb: I had it on the wrong grub line.

Setting swiotlb=128 seems to have stabilized the dom0 so far.

This seems to be a workaround, however; I don't think it should be required.
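
Since the option was first placed on the wrong grub line, it may help to confirm that the running kernel actually received it. The following is a minimal sketch, assuming the kernel's boot parameters are readable from /proc/cmdline as on a standard Linux system; the helper names and the sample command line are hypothetical and not taken from this report.

```python
def kernel_param(cmdline, name):
    """Return the value of `name=value` in a kernel command line, or None."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == name and sep:
            return value
    return None


def running_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as handle:
        return handle.read().strip()


# Hypothetical command line, for illustration only.
example = "root=/dev/sda1 ro console=tty0 swiotlb=128"
print(kernel_param(example, "swiotlb"))
```

On the affected dom0, calling kernel_param(running_cmdline(), "swiotlb") should return None if the option ended up on a line the dom0 kernel never sees.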

Jason Kendall (coolacid) wrote :

Update: as soon as I bring up a domU, it panics. I'll pull reports on this panic later; for now I'm treating the two panics as possibly related.

Jason Kendall (coolacid) wrote :

Adding swiotlb=128 to the domU config also corrected the domU panic.

ie:

extra = '2 console=xvc0 swiotlb=128'
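
xm domU config files use Python syntax, so the same line can be assembled from its parts. This is only a sketch: the variable names runlevel and kernel_args are hypothetical, and reading the leading 2 as the runlevel passed to init is an assumption, not something stated in this report.

```python
# Sketch of an xm-style domU config fragment (xm config files use Python syntax).
# `runlevel` and `kernel_args` are hypothetical helper names; only the
# resulting `extra` string comes from the report.
runlevel = 2  # assumption: the leading "2" selects the guest's runlevel
kernel_args = ["console=xvc0", "swiotlb=128"]

# Join everything into the single string handed to the guest kernel.
extra = " ".join([str(runlevel)] + kernel_args)
print(extra)
```

Keeping the arguments in a list makes it easier to add or drop swiotlb while testing whether the workaround is still needed.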

TJ Fontaine (tjfontaine) wrote :

I can reliably oops the 2.6.24-server kernel with my AAR-2410SA in 4x500GB RAID 1+0 mode; any real I/O, for instance via NFS or iSCSI, causes the oops. Increasing the swiotlb value makes the system unbootable, with an almost instant oops. I built a vanilla 2.6.26.2 kernel and experienced the same error. I have another box running Feisty and 2.6.20-16-server in RAID 5 with lots of consistent I/O, and it never experiences a problem. Below is the lspci -vv output for the card; attached is the most recent trace with linux-image-debug-server. I'll now also try with -generic.

02:02.0 RAID bus controller: Adaptec AAC-RAID (rev 01)
 Subsystem: Adaptec AAR-2410SA PCI SATA 4ch (Jaguar II)
 Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B-
 Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=slow >TAbort- <TAbort- <MAbort- >SERR- <PERR-
 Latency: 64 (250ns min, 250ns max), Cache Line Size: 32 bytes
 Interrupt: pin A routed to IRQ 28
 Region 0: Memory at e8000000 (32-bit, prefetchable) [size=64M]
 Expansion ROM at ff6e0000 [disabled] [size=32K]
 Capabilities: [80] Power Management version 2
  Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
  Status: D0 PME-Enable- DSel=0 DScale=0 PME-

TJ Fontaine (tjfontaine) wrote :

It turned out not to be a bug with my aacraid card but rather the MTU settings on my r8169 network adapter; see http://lkml.org/lkml/2008/8/20/287 for more information.

Leann Ogasawara (leannogasawara) wrote :

The Ubuntu Kernel Team is planning to move to the 2.6.27 kernel for the upcoming Intrepid Ibex 8.10 release. As a result, the kernel team would appreciate it if you could please test this newer 2.6.27 Ubuntu kernel. There are two ways you should be able to test:

1) If you are comfortable installing packages on your own, the linux-image-2.6.27-* package is currently available for you to install and test.

--or--

2) The upcoming Alpha5 for Intrepid Ibex 8.10 will contain this newer 2.6.27 Ubuntu kernel. Alpha5 is set to be released Thursday Sept 4. Please watch http://www.ubuntu.com/testing for Alpha5 to be announced. You should then be able to test via a LiveCD.

Please let us know immediately if this newer 2.6.27 kernel resolves the bug reported here or if the issue remains. More importantly, please open a new bug report for each new bug/regression introduced by the 2.6.27 kernel and tag the bug report with 'linux-2.6.27'. Also, please specifically note if the issue does or does not appear in the 2.6.26 kernel. Thanks again, we really appreciate your help and feedback.

Jason Kendall (coolacid) wrote :

Leann;

This is a Xen issue. I was unable to find a Xen version of the linux-image-2.6.27- tree in Hardy or Intrepid.

Regards,
Jay

Leann Ogasawara (leannogasawara) wrote :

It's my understanding that Xen domU support is built into the current Intrepid kernel. I don't think there are any plans to incorporate a dom0 kernel for Intrepid, though.

Jason Kendall (coolacid) wrote :

Since I believe Xen will not be included in Intrepid, what is Ubuntu's virtualization platform of choice? What needs to be done to rally the troops to get Xen put back on the supported list?

Richard Seguin (sectech)
Changed in linux:
importance: Undecided → Medium
status: Confirmed → Triaged
tags: added: kj-omit
tags: added: xen
removed: cft-2.6.27 kj-omit