DMA: Out of SW-IOMMU space

Reported by Luis Alvarado on 2012-10-02
This bug affects 4 people
Affects: linux (Ubuntu)
Importance: Medium
Assigned to: Unassigned

Bug Description

This problem affected me in 12.04, as reported here (it behaves the same way): https://bugs.launchpad.net/ubuntu/+source/linux/+bug/961618. It also behaves like this one: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/856496

Basically, the error output starts, and after around 45 minutes of errors the whole system freezes completely, so I have to force a reboot.

I upgraded to Ubuntu 12.10 Beta2 since the problem was mentioned to be solved in Kernel 3.4 as seen here: https://bugzilla.kernel.org/show_bug.cgi?id=42976

As you can see in the information provided, I am using version 3.5 of the kernel and the problem persists. It happens at least 2 to 5 times daily, and more often if I watch a YouTube video, download through torrent, or connect to any bandwidth-consuming online site or app.

Here are some of the error lines from syslog:

 DMA: Out of SW-IOMMU space for 490 bytes at device 0000:03:02.0
 DMA: Out of SW-IOMMU space for 517 bytes at device 0000:03:02.0
 DMA: Out of SW-IOMMU space for 561 bytes at device 0000:03:00.0
 DMA: Out of SW-IOMMU space for 562 bytes at device 0000:03:02.0
 DMA: Out of SW-IOMMU space for 607 bytes at device 0000:03:02.0
 DMA: Out of SW-IOMMU space for 670 bytes at device 0000:03:02.0
 DMA: Out of SW-IOMMU space for 89 bytes at device 0000:03:02.0
 DMA: Out of SW-IOMMU space for 92 bytes at device 0000:03:00.0
 DMA: Out of SW-IOMMU space for 94 bytes at device 0000:03:02.0
 DMA: Out of SW-IOMMU space for 97 bytes at device 0000:03:02.0

Grouping the errors by PCI ID with sort | uniq, I found that the following two network devices give the same problem:

03:02.0 Network controller: Ralink corp. RT2800 802.11n PCI

03:00.0 Network controller: Broadcom Corporation BCM4321 802.11b/g/n (rev 01)
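
The sort | uniq grouping above could also be done with a short script. This is only a minimal sketch, assuming the syslog lines keep the exact format shown above; the function name `count_swiotlb_errors` and the `sample` list are illustrative and not part of any tool.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   DMA: Out of SW-IOMMU space for 490 bytes at device 0000:03:02.0
PATTERN = re.compile(r"DMA: Out of SW-IOMMU space for (\d+) bytes at device (\S+)")


def count_swiotlb_errors(lines):
    """Return a Counter mapping PCI address -> number of matching error lines."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


# Three lines copied from the syslog excerpt in this report.
sample = [
    " DMA: Out of SW-IOMMU space for 490 bytes at device 0000:03:02.0",
    " DMA: Out of SW-IOMMU space for 517 bytes at device 0000:03:02.0",
    " DMA: Out of SW-IOMMU space for 561 bytes at device 0000:03:00.0",
]

counts = count_swiotlb_errors(sample)
for device, hits in counts.most_common():
    print(device, hits)
```

To run it against a real log, the `sample` list could be replaced with the lines of an opened syslog file; the resulting PCI addresses can then be matched against lspci output by hand.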

I have 4 network cards:

       product: 82579V Gigabit Network Connection (Internal Intel DZ68DB Motherboard Wired Card)
       product: BCM4321 802.11b/g/n (Broadcom Chipset, Linksys WMP300N http://homesupport.cisco.com/en-us/support/adapters/WMP300N)
       product: RTL8169 PCI Gigabit Ethernet Controller (PCI Realtek Wired Card)
       product: RT2800 802.11n PCI (Wifi-N Ralink 2 Antennas)

ProblemType: Bug
DistroRelease: Ubuntu 12.10
Package: network-manager 0.9.6.0-0ubuntu7
ProcVersionSignature: Ubuntu 3.5.0-16.25-generic 3.5.4
Uname: Linux 3.5.0-16-generic x86_64
NonfreeKernelModules: nvidia wl
ApportVersion: 2.6.1-0ubuntu1
Architecture: amd64
Date: Tue Oct 2 10:21:42 2012
ExecutablePath: /usr/sbin/NetworkManager
IfupdownConfig:
 auto lo
 iface lo inet loopback
InstallationMedia: Ubuntu 12.04.1 LTS "Precise Pangolin" - Release amd64 (20120817.1)
IpRoute:
 default via 192.168.1.1 dev wlan0 proto static
 169.254.0.0/16 dev wlan0 scope link metric 1000
 192.168.1.0/24 dev wlan0 proto kernel scope link src 192.168.1.5 metric 9
NetworkManager.state:
 [main]
 NetworkingEnabled=true
 WirelessEnabled=true
 WWANEnabled=true
 WimaxEnabled=true
ProcEnviron:
 TERM=linux
 PATH=(custom, no user)
 LANG=en_US.UTF-8
SourcePackage: network-manager
UpgradeStatus: Upgraded to quantal on 2012-10-01 (1 days ago)
mtime.conffile..etc.NetworkManager.NetworkManager.conf: 2012-09-12T16:03:17.935620
nmcli-dev:
 DEVICE TYPE STATE DBUS-PATH
 wlan0 802-11-wireless connected /org/freedesktop/NetworkManager/Devices/3
 eth0 802-3-ethernet unavailable /org/freedesktop/NetworkManager/Devices/2
 eth2 802-11-wireless disconnected /org/freedesktop/NetworkManager/Devices/1
 eth1 802-3-ethernet unavailable /org/freedesktop/NetworkManager/Devices/0
nmcli-nm:
 RUNNING VERSION STATE NET-ENABLED WIFI-HARDWARE WIFI WWAN-HARDWARE WWAN
 running 0.9.6.0 connected enabled enabled enabled enabled disabled
---
ApportVersion: 2.6.1-0ubuntu1
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: cyrex 1861 F.... pulseaudio
 /dev/snd/controlC0: cyrex 1861 F.... pulseaudio
DistroRelease: Ubuntu 12.10
HibernationDevice: RESUME=UUID=653cced3-1151-4767-828a-9fdfe18c244c
InstallationMedia: Ubuntu 12.04.1 LTS "Precise Pangolin" - Release amd64 (20120817.1)
NonfreeKernelModules: nvidia wl
Package: linux (not installed)
ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.5.0-16-generic root=UUID=75c610ab-0326-4472-994b-5992295df48e ro quiet splash vt.handoff=7
ProcVersionSignature: Ubuntu 3.5.0-16.25-generic 3.5.4
RelatedPackageVersions:
 linux-restricted-modules-3.5.0-16-generic N/A
 linux-backports-modules-3.5.0-16-generic N/A
 linux-firmware 1.94
Tags: quantal running-unity
Uname: Linux 3.5.0-16-generic x86_64
UpgradeStatus: Upgraded to quantal on 2012-10-01 (2 days ago)
UserGroups: adm cdrom dip lpadmin plugdev sambashare sudo
dmi.bios.date: 06/15/2012
dmi.bios.vendor: Intel Corp.
dmi.bios.version: DBZ6810H.86A.0043.2012.0615.2054
dmi.board.asset.tag: To be filled by O.E.M.
dmi.board.name: DZ68DB
dmi.board.vendor: Intel Corporation
dmi.board.version: AAG27985-104
dmi.chassis.type: 3
dmi.modalias: dmi:bvnIntelCorp.:bvrDBZ6810H.86A.0043.2012.0615.2054:bd06/15/2012:svn:pn:pvr:rvnIntelCorporation:rnDZ68DB:rvrAAG27985-104:cvn:ct3:cvr:

Luis Alvarado (luisalvarado) wrote :

Testing Kernel 3.6 from here: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.6-quantal/ and it has been at least 10 hours with no problems. With Kernel 3.5.x the problem appeared at least every 2 to 3 hours. Sometimes almost 4 hours.

Will keep checking.

Luis Alvarado (luisalvarado) wrote :

Locked again after a total of 11 hours 30 minutes. I will have to disassemble the PC just to check it out.

Reassigning to 'linux'.

affects: network-manager (Ubuntu) → linux (Ubuntu)

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1060268

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete

apport information

tags: added: apport-collected running-unity
description: updated

Luis Alvarado (luisalvarado) wrote :

I used apport-collect as mentioned. Is there anything else I can help you with?

If I may, I found another group of lines that might help when doing dmesg:

phy0 -> rt2x00queue_write_tx_frame: Error - Dropping frame due to full tx queue 0.
phy0 -> rt2x00queue_write_tx_frame: Error - Dropping frame due to full tx queue 2.
phy0 -> rt2x00lib_rxdone: Error - Wrong frame size 0 max 3840.
phy0 -> rt2x00lib_rxdone: Error - Wrong frame size 4065 max 3840.

Checking the loaded modules and their dependencies, I saw this (in case it helps with the problem):

cyrex@cyrex:~$ lsmod |grep rt2
rt2800pci 18750 0
rt2800lib 63557 1 rt2800pci
crc_ccitt 12667 1 rt2800lib
rt2x00pci 14578 1 rt2800pci
rt2x00lib 55527 3 rt2800pci,rt2800lib,rt2x00pci
mac80211 564631 3 rt2800lib,rt2x00pci,rt2x00lib
cfg80211 212917 2 rt2x00lib,mac80211
eeprom_93cx6 13302 1 rt2800pci

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

So this issue also exists in the upstream v3.6 kernel[0]?

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.6-quantal/

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
Luis Alvarado (luisalvarado) wrote :

Yes, correct. I tested with the 3.6 kernel (from the link you provided) and the issue still exists. It appears faster if I use a torrent downloader like Transmission, watch videos on YouTube, or connect to a game: anything that makes heavy use of the network connection, especially over wireless.

Luis Alvarado (luisalvarado) wrote :

Tested again with the official 3.5 version and the 3.6 version in 12.10, this time without doing anything intensive on the web like using Transmission or downloading something heavy. I did some G+ and Facebook surfing, then started watching a movie that was already on my PC (not streaming or downloading it). About 30 minutes into the movie, the whole system froze and I had to do a hard reboot. The time between freezes was around 5 to 6 hours, double or almost triple what I get when I use Transmission or do heavy web surfing.

Luis Alvarado (luisalvarado) wrote :

Is this still going to be "Incomplete", or can it be "Confirmed", since I just sent everything that was asked for? If there is anything else, please ask, since with this problem I won't be able to actually use 12.10. I must say the problem started in 12.04 around June/July with an update that appeared in Update Manager.

Luis Alvarado (luisalvarado) wrote :

OK, without doing anything at all on the Internet for I don't know how many hours (I guess about 16), the computer locked up with the same problem. It lasts longer the less I use the Internet or even the LAN. I should mention that the first symptom is Network Manager showing me that the wireless connection has dropped and asking for the password to reconnect. The reconnection fails every time the bug appears, even though the connection is set to connect automatically. When it does not connect on the 2nd or 3rd try, I am 100% sure that dmesg in the terminal will show the errors, which continue for 30 to 45 minutes from the moment they appear until the PC locks up completely.

Luis Alvarado (luisalvarado) wrote :

Here is the same problem but for other Distros:

Red Hat - https://bugzilla.redhat.com/show_bug.cgi?id=787054

Kernel Bug - https://bugzilla.kernel.org/show_bug.cgi?id=42976 (Shows fixed in 3.4.x but still appears in 3.5 and 3.6.)

Luis Alvarado (luisalvarado) wrote :

Update - Had to resort to using Wired connections. Using either the Broadcom wireless card or the Ralink one will give the same problem. Using Kernel 3.5.0.17.

Luis Alvarado (luisalvarado) wrote :

The problem is fixed for the moment. I had to delete Ubuntu 12.10 and reinstall it from scratch. The first time, I had installed it as an upgrade from 12.04 to 12.10.

I seem to have this bug again in 13.04 when using CIFS.

JaSauders (jasauders) wrote :

I ran into this as well in 13.04 (Feb 27 daily build, fully updated/upgraded). I experienced it with Samba over Nautilus, as well as over rsync/ssh.

I experienced it with a 13.04 laptop over wireless pushing data to a server running Ubuntu Server 12.04.2. I also experienced it with a custom built 13.04 desktop pushing data to the same desktop.

tags: added: raring
Bruce Pieterse (octoquad) wrote :

I've been getting this since 3.8.0.9 on Raring, and in my case it targets the IDE interface: Intel Corporation NM10/ICH7 Family SATA Controller [IDE mode].

I am able to copy a small amount of data over the network, but once the copy has been running for ~30 seconds the syslog fills with the following error messages:

Mar 9 14:56:15 hostname kernel: [60968.260769] EXT4-fs warning (device sdb1): ext4_end_bio:317: I/O error writing to inode 40897497 (offset 99905536 size 524288 starting block 362881223)
Mar 9 14:56:15 hostname kernel: [60968.260780] ata3: EH complete
Mar 9 14:56:24 hostname kernel: [60977.971703] DMA: Out of SW-IOMMU space for 28672 bytes at device 0000:00:1f.2
Mar 9 14:56:24 hostname kernel: [60977.972457] ata3.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Mar 9 14:56:24 hostname kernel: [60977.972467] ata3.01: failed command: WRITE DMA EXT
Mar 9 14:56:24 hostname kernel: [60977.972477] ata3.01: cmd 35/00:00:70:8c:09/00:04:ad:00:00/f0 tag 0 dma 524288 out
Mar 9 14:56:24 hostname kernel: [60977.972477] res 50/00:00:ff:6e:4d/00:00:ad:00:00/f0 Emask 0x40 (internal error)
Mar 9 14:56:24 hostname kernel: [60977.972482] ata3.01: status: { DRDY }
Mar 9 14:56:24 hostname kernel: [60978.008539] ata3.00: configured for UDMA/133
Mar 9 14:56:24 hostname kernel: [60978.024411] ata3.01: configured for UDMA/133
Mar 9 14:56:24 hostname kernel: [60978.024430] ata3: EH complete
Mar 9 14:56:24 hostname kernel: [60978.024963] DMA: Out of SW-IOMMU space for 28672 bytes at device 0000:00:1f.2
Mar 9 14:56:24 hostname kernel: [60978.024999] ata3.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Mar 9 14:56:24 hostname kernel: [60978.025007] ata3.01: failed command: WRITE DMA EXT
Mar 9 14:56:24 hostname kernel: [60978.025017] ata3.01: cmd 35/00:00:70:8c:09/00:04:ad:00:00/f0 tag 0 dma 524288 out
Mar 9 14:56:24 hostname kernel: [60978.025017] res 50/00:00:af:88:e0/00:00:e8:00:00/f0 Emask 0x40 (internal error)
Mar 9 14:56:24 hostname kernel: [60978.025022] ata3.01: status: { DRDY }
Mar 9 14:56:24 hostname kernel: [60978.048531] ata3.00: configured for UDMA/133
Mar 9 14:56:24 hostname kernel: [60978.064397] ata3.01: configured for UDMA/133
Mar 9 14:56:24 hostname kernel: [60978.064417] ata3: EH complete
Mar 9 14:56:24 hostname kernel: [60978.064954] DMA: Out of SW-IOMMU space for 28672 bytes at device 0000:00:1f.2
Mar 9 14:56:24 hostname kernel: [60978.064988] ata3.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Mar 9 14:56:24 hostname kernel: [60978.064994] ata3.01: failed command: WRITE DMA EXT
Mar 9 14:56:24 hostname kernel: [60978.065003] ata3.01: cmd 35/00:00:70:8c:09/00:04:ad:00:00/f0 tag 0 dma 524288 out
Mar 9 14:56:24 hostname kernel: [60978.065003] res 50/00:00:af:88:e0/00:00:e8:00:00/f0 Emask 0x40 (internal error)
Mar 9 14:56:24 hostname kernel: [60978.065008] ata3.01: status: { DRDY }
Mar 9 14:56:24 hostname kernel: [60978.088678] ata3.00: configured for UDMA/133
Mar 9 14:56:24 hostname kernel: [60978.104420] ata3.01: configured for UDMA/133
Mar 9 14:56:24 hostname kernel: [60978.104440] ata3: EH complete
Mar 9 14:56:24 hostname kernel: [60978.104980] DMA: Out of SW-IOMMU spac...

Luis Alvarado (luisalvarado) wrote :

Tested with Ubuntu 13.04 32-bit and 64-bit. Still happening (it was also happening in 12.10 64-bit). The easiest ways I found to reproduce the error were to try to reconnect to the same router, or to scan using iwlist while still connected to the router, especially with "sudo iwlist wlan0 s".

Here is some of the output:

wlan0: authenticate with f8:d1:11:26:c3:9e
[ 6206.226760] wlan0: direct probe to f8:d1:11:26:c3:9e (try 1/3)
[ 6206.430418] wlan0: direct probe to f8:d1:11:26:c3:9e (try 2/3)
[ 6206.634248] wlan0: direct probe to f8:d1:11:26:c3:9e (try 3/3)
[ 6206.838058] wlan0: authentication with f8:d1:11:26:c3:9e timed out
[ 6209.980125] wlan0: authenticate with f8:d1:11:26:c3:9e
[ 6209.995686] wlan0: direct probe to f8:d1:11:26:c3:9e (try 1/3)
[ 6210.199351] wlan0: direct probe to f8:d1:11:26:c3:9e (try 2/3)
[ 6210.403216] wlan0: direct probe to f8:d1:11:26:c3:9e (try 3/3)
[ 6210.607021] wlan0: authentication with f8:d1:11:26:c3:9e timed out
[ 6229.543820] phy0 -> rt2x00queue_write_tx_frame: Error - Dropping frame due to full tx queue 0.
[ 6234.060826] wlan0: authenticate with f8:d1:11:26:c3:9e
[ 6234.076294] wlan0: direct probe to f8:d1:11:26:c3:9e (try 1/3)
[ 6234.279988] wlan0: direct probe to f8:d1:11:26:c3:9e (try 2/3)
[ 6234.483827] wlan0: direct probe to f8:d1:11:26:c3:9e (try 3/3)
[ 6234.687659] wlan0: authentication with f8:d1:11:26:c3:9e timed out
[ 6234.855571] phy0 -> rt2x00queue_write_tx_frame: Error - Dropping frame due to full tx queue 0.
..... Continues .....
[ 6253.620460] phy0 -> rt2x00queue_write_tx_frame: Error - Dropping frame due to full tx queue 0.
[ 6259.187958] phy0 -> rt2x00queue_write_tx_frame: Error - Dropping frame due to full tx queue 0.
[ 6262.162264] wlan0: authenticate with f8:d1:11:26:c3:9e
[ 6262.177645] wlan0: direct probe to f8:d1:11:26:c3:9e (try 1/3)
[ 6262.381338] wlan0: direct probe to f8:d1:11:26:c3:9e (try 2/3)
[ 6262.585206] wlan0: direct probe to f8:d1:11:26:c3:9e (try 3/3)
[ 6262.789040] wlan0: authentication with f8:d1:11:26:c3:9e timed out
[ 6313.144511] phy0 -> rt2x00queue_write_tx_frame: Error - Dropping frame due to full tx queue 0.
..... Continues .....
[ 6346.117978] phy0 -> rt2x00queue_write_tx_frame: Error - Dropping frame due to full tx queue 0.
[ 6389.087371] phy0 -> rt2x00queue_write_tx_frame: Error - Dropping frame due to full tx queue 0.
[ 6428.139905] phy0 -> rt2x00queue_write_tx_frame: Error - Dropping frame due to full tx queue 0.
[ 6435.949642] phy0 -> rt2x00queue_write_tx_frame: Error - Dropping frame due to full tx queue 0.
[ 6496.996490] phy0 -> rt2x00queue_write_tx_frame: Error - Dropping frame due to full tx queue 0.
[ 6527.312068] phy0 -> rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0x00000062].
[ 6528.910782] phy0 -> rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0x00000062].
[ 6528.910788] phy0 -> rt2800pci_set_device_state: Error - Device failed to enter state 4 (-5).
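
For anyone comparing logs like the one above, the recurring messages can be tallied with a short script. This is only a rough sketch, assuming dmesg lines in the "[ seconds] message" format shown here; the function name `summarize` and the category labels are made up for illustration.

```python
import re

# Splits a dmesg line into its timestamp (seconds since boot) and message text.
LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")

# Hypothetical category labels mapped to substrings seen in the log above.
KINDS = {
    "auth_timeout": "timed out",
    "tx_queue_full": "Dropping frame due to full tx queue",
    "wpdma_busy": "WPDMA TX/RX busy",
}


def summarize(lines):
    """Map each category to (count, first timestamp, last timestamp)."""
    summary = {}
    for line in lines:
        match = LINE.match(line.strip())
        if not match:
            continue  # skip lines without a timestamp prefix
        ts, msg = float(match.group(1)), match.group(2)
        for kind, needle in KINDS.items():
            if needle in msg:
                count, first, _ = summary.get(kind, (0, ts, ts))
                summary[kind] = (count + 1, first, ts)
    return summary


# Lines copied from the output above.
sample = [
    "[ 6206.838058] wlan0: authentication with f8:d1:11:26:c3:9e timed out",
    "[ 6210.607021] wlan0: authentication with f8:d1:11:26:c3:9e timed out",
    "[ 6229.543820] phy0 -> rt2x00queue_write_tx_frame: Error - Dropping frame due to full tx queue 0.",
    "[ 6527.312068] phy0 -> rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0x00000062].",
]

summary = summarize(sample)
for kind, (count, first, last) in summary.items():
    print(f"{kind}: {count} event(s) between {first:.1f}s and {last:.1f}s")
```

The first and last timestamps per category give a quick view of how long each kind of error persisted before the device stopped responding.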