NBD kernel block driver hangs on heavy usage
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| linux (Ubuntu) | Won't Fix | Undecided | Unassigned | |
| linux-source-2.6.20 (Ubuntu) | Won't Fix | Undecided | Unassigned | |
Bug Description
Binary package hint: kernel-
The block device driver for NBD hangs on heavy usage. This is always reproducible for me, but I haven't found too many other people with the same problem. Not sure if that means that nobody is using nbd in production, or if nbd is just not popular on Ubuntu.
Steps to reproduce:
sudo modprobe nbd debugflags=65535 # debugflags has no effect on the bug, and nothing appears in dmesg anyway, although a huge number of requests will appear in kmsg.
dd if=/dev/zero of=/tmp/foo.img bs=4096 seek=512000 count=1
nbd-server 2000 /tmp/foo.img
sudo nbd-client localhost 2000 /dev/nbd0
sudo mke2fs /dev/nbd0
sudo mount /dev/nbd0 /mnt/nbd0
sudo chown enki:enki /mnt/nbd0 -R
bonnie++ -u 1000 -d /mnt/nbd0 -r 256 -x 10
Generally the kernel stops issuing new requests and the device hangs during the first bonnie++ run. Occasionally it gets further before hanging. Sometimes (rarely), the kernel stops responding to all new file opens, not just nbd requests.
I found the problem while working on the next release of the elasticdrive project, and found that it recurred with the stock nbd-server as well. Since elasticdrive completely replaces the server and client with "from scratch" re-implementations, and the hang happens with both, I am led to believe that this is a separate issue, not a bug in elasticdrive.
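For reference, the dd step in the reproduction writes a single 4096-byte block after seeking 512000 blocks into a new file, so the backing image is roughly 2 GB and, on a filesystem that supports holes, almost entirely sparse. The arithmetic can be sketched as below; image_bytes is a hypothetical helper name, not part of any tool used in this report.

```shell
# Size in bytes of a new file created by: dd bs=BS seek=SEEK count=COUNT
# (hypothetical helper, for illustration only)
image_bytes() {
    bs=$1
    seek=$2
    count=$3
    # dd skips SEEK output blocks, then writes COUNT blocks of BS bytes
    echo $(( bs * (seek + count) ))
}

image_bytes 4096 512000 1   # prints 2097156096
```

That is about 1.95 GiB, which is the size of the device that mke2fs and bonnie++ then operate on.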
=== BORING DETAILS FOLLOW ===
Running system processes:
enki@mobilepig:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 2908 1848 ? Ss Aug30 0:01 /sbin/init
root 2 0.0 0.0 0 0 ? S Aug30 0:00 [migration/0]
root 3 0.0 0.0 0 0 ? SN Aug30 0:00 [ksoftirqd/0]
root 4 0.0 0.0 0 0 ? S Aug30 0:00 [watchdog/0]
root 5 0.0 0.0 0 0 ? S< Aug30 0:00 [events/0]
root 6 0.0 0.0 0 0 ? S< Aug30 0:00 [khelper]
root 7 0.0 0.0 0 0 ? S< Aug30 0:00 [kthread]
root 30 0.0 0.0 0 0 ? S< Aug30 0:00 [kblockd/0]
root 31 0.0 0.0 0 0 ? S< Aug30 0:00 [kacpid]
root 32 0.0 0.0 0 0 ? S< Aug30 0:00 [kacpi_notify]
root 147 0.0 0.0 0 0 ? S< Aug30 0:00 [kseriod]
root 170 0.0 0.0 0 0 ? S< Aug30 0:00 [kswapd0]
root 171 0.0 0.0 0 0 ? S< Aug30 0:00 [aio/0]
root 2032 0.0 0.0 0 0 ? S< Aug30 0:00 [ksuspend_usbd]
root 2033 0.0 0.0 0 0 ? S< Aug30 0:00 [khubd]
root 2066 0.0 0.0 0 0 ? S< Aug30 0:00 [khpsbpkt]
root 2211 0.0 0.0 0 0 ? S< Aug30 0:00 [knodemgrd_0]
root 2219 0.0 0.0 0 0 ? S< Aug30 0:04 [ata/0]
root 2220 0.0 0.0 0 0 ? S< Aug30 0:00 [ata_aux]
root 2227 0.0 0.0 0 0 ? S< Aug30 0:07 [scsi_eh_0]
root 2228 0.0 0.0 0 0 ? S< Aug30 0:00 [scsi_eh_1]
root 2273 0.0 0.0 1864 844 ? Ss 07:30 0:00 /usr/sbin/anacron -s
root 2411 0.0 0.0 0 0 ? S< Aug30 0:01 [kjournald]
root 2541 0.0 0.0 1712 484 ? S 07:35 0:00 /bin/sh -c nice run-parts --report /etc/cron.daily
root 2542 0.0 0.0 1636 640 ? SN 07:35 0:00 run-parts --report /etc/cron.daily
root 2616 0.0 0.0 2932 1276 ? S<s Aug30 0:00 /sbin/udevd --daemon
cupsys 2796 0.0 0.0 4720 2000 ? SNs 07:37 0:00 /usr/sbin/cupsd
root 2868 0.0 0.0 1712 488 ? SNs 07:37 0:00 /bin/sh /etc/cron.
root 2869 0.1 0.0 1892 764 ? DN 07:37 0:07 /usr/bin/updatedb
root 3445 0.0 0.0 0 0 ? S< Aug30 0:00 [pccardd]
root 3454 0.0 0.0 0 0 ? S< Aug30 0:00 [pccardd]
root 3455 0.0 0.0 0 0 ? S< Aug30 0:00 [ipw2200/0]
root 3460 0.0 0.0 0 0 ? S< Aug30 0:00 [kpsmoused]
root 3690 0.0 0.0 0 0 ? S< Aug30 0:00 [hda_codec]
root 4077 0.0 0.0 0 0 ? S< Aug30 0:00 [kjournald]
root 4079 0.0 0.0 0 0 ? S< Aug30 0:00 [kjournald]
daemon 4162 0.0 0.0 1764 384 ? Ss Aug30 0:00 /sbin/portmap
root 4276 0.0 0.0 11376 1128 ? Ssl Aug30 0:00 /usr/sbin/ccsd
root 4373 0.0 0.0 1648 508 tty4 Ss+ Aug30 0:00 /sbin/getty 38400 tty4
root 4374 0.0 0.0 1648 508 tty5 Ss+ Aug30 0:00 /sbin/getty 38400 tty5
root 4378 0.0 0.0 1648 508 tty2 Ss+ Aug30 0:00 /sbin/getty 38400 tty2
root 4379 0.0 0.0 1652 512 tty3 Ss+ Aug30 0:00 /sbin/getty 38400 tty3
root 4380 0.0 0.0 1652 512 tty1 Ss+ Aug30 0:00 /sbin/getty 38400 tty1
root 4381 0.0 0.0 1652 512 tty6 Ss+ Aug30 0:00 /sbin/getty 38400 tty6
root 4632 0.0 0.0 2260 1244 ? Ss Aug30 0:00 /usr/sbin/acpid -c /etc/acpi/events -s /var/run/
root 4727 0.0 0.0 1704 648 ? Ss Aug30 0:00 /sbin/syslogd
root 4782 0.0 0.0 1796 528 ? Ss Aug30 0:00 /bin/dd bs 1 if /proc/kmsg of /var/run/klogd/kmsg
klog 4784 0.0 0.0 2428 1372 ? Ss Aug30 0:00 /sbin/klogd -P /var/run/klogd/kmsg
103 4805 0.0 0.0 2848 1136 ? Ss Aug30 0:00 /usr/bin/
106 4821 0.0 0.1 5440 3684 ? Ss Aug30 0:05 /usr/sbin/hald
root 4822 0.0 0.0 2880 1048 ? S Aug30 0:00 hald-runner
106 4828 0.0 0.0 2108 900 ? S Aug30 0:00 hald-addon-
106 4829 0.0 0.0 2108 896 ? S Aug30 0:00 hald-addon-
106 4830 0.0 0.0 2108 896 ? S Aug30 0:00 hald-addon-
root 4832 0.0 0.0 2936 1096 ? S Aug30 0:00 /usr/lib/
106 4833 0.0 0.0 2108 928 ? S Aug30 0:00 hald-addon-acpi: listening on acpid socket /var/run/
106 4834 0.0 0.0 2108 928 ? S Aug30 0:00 hald-addon-
106 4845 0.0 0.0 3048 1184 ? D Aug30 0:06 hald-addon-storage: polling /dev/scd0 (every 2 sec)
root 4858 0.0 0.0 1940 840 ? Ss Aug30 0:00 /usr/sbin/dhcdbd --system
root 4873 0.0 0.1 29752 2144 ? Ssl Aug30 0:00 /usr/sbin/
avahi 4891 0.0 0.0 2672 1388 ? Ss Aug30 0:00 avahi-daemon: registering [mobilepig.local]
avahi 4892 0.0 0.0 2672 468 ? Ss Aug30 0:00 avahi-daemon: chroot helper
root 4907 0.0 0.0 3060 1268 ? Ss Aug30 0:00 /usr/sbin/
root 4921 0.0 0.0 2872 804 ? Ss Aug30 0:00 /usr/bin/
root 4922 0.0 0.0 2716 1216 ? S Aug30 0:00 dbus-daemon --session --print-address --nofork
root 4963 0.0 0.0 12216 1408 ? Ss Aug30 0:00 /usr/sbin/gdm
root 4964 0.0 0.1 12576 2392 ? S Aug30 0:00 /usr/sbin/gdm
root 4979 0.1 2.5 59192 52776 tty7 RLs+ Aug30 0:57 /usr/bin/
root 5056 0.0 0.0 6412 832 ? Ss Aug30 0:00 /usr/sbin/hpiod
hplip 5059 0.0 0.2 10048 4988 ? S Aug30 0:00 python /usr/sbin/hpssd
root 5115 0.0 0.0 1712 524 ? S Aug30 0:00 /bin/sh /usr/bin/
mysql 5163 0.0 0.8 127656 17560 ? Sl Aug30 0:01 /usr/sbin/mysqld --basedir=/usr --datadir=
root 5165 0.0 0.0 1636 532 ? S Aug30 0:00 logger -p daemon.err -t mysqld_safe -i -t mysqld
root 5309 0.0 0.2 6000 4800 ? S Aug30 0:00 /usr/bin/memcached -vv -m 64 -p 11211 -u root
partimag 5335 0.0 0.0 5976 1684 ? Ss Aug30 0:00 /usr/sbin/
root 5357 0.0 0.0 6040 1264 ? Ss Aug30 0:00 /usr/sbin/nmbd -D
root 5359 0.0 0.1 9300 2244 ? Ss Aug30 0:00 /usr/sbin/smbd -D
root 5366 0.0 0.0 9300 1056 ? S Aug30 0:00 /usr/sbin/smbd -D
root 5381 0.0 0.0 5088 948 ? Ss Aug30 0:00 /usr/sbin/sshd
statd 5484 0.0 0.0 1840 752 ? Ss Aug30 0:00 /sbin/rpc.statd
root 5504 0.0 0.0 3648 544 ? Ss Aug30 0:00 /usr/sbin/
root 5529 0.0 0.0 2696 1036 ? Ss Aug30 0:00 /usr/sbin/hcid -x -s
root 5553 0.0 0.0 0 0 ? S< Aug30 0:00 [krfcommd]
ntp 5592 0.0 0.0 4268 1240 ? Ss Aug30 0:00 /usr/sbin/ntpd -p /var/run/ntpd.pid -u 119:122 -g
daemon 5632 0.0 0.0 1908 420 ? Ss Aug30 0:00 /usr/sbin/atd
root 5646 0.0 0.0 2284 896 ? Ss Aug30 0:00 /usr/sbin/cron
root 5711 0.0 0.0 1504 160 ? S Aug30 0:00 /usr/bin/
root 5726 0.0 0.0 1504 160 ? S Aug30 0:00 /usr/bin/
root 5728 0.0 0.0 1500 160 ? S Aug30 0:00 /usr/bin/
root 5743 0.0 0.0 1496 156 ? S Aug30 0:00 /usr/bin/
root 5758 0.0 0.0 1752 448 ? Ss Aug30 0:00 /usr/bin/vmnet-natd -d /var/run/
root 5781 0.0 0.0 1816 364 ? Ss Aug30 0:00 /usr/bin/
root 5782 0.0 0.0 1816 368 ? Ss Aug30 0:00 /usr/bin/
root 6005 0.0 0.0 0 0 ? S< Aug30 0:00 [kondemand/0]
root 6083 0.0 0.0 3788 1444 ? S Aug30 0:00 /sbin/wpa_
dhcp 6137 0.0 0.0 2456 1216 ? S Aug30 0:00 /sbin/dhclient -1 -lf /var/lib/
root 6790 0.0 0.0 0 0 ? S Aug30 0:00 [pdflush]
root 6792 0.0 0.0 0 0 ? D Aug30 0:00 [pdflush]
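Three processes in the listing above have a STAT value beginning with D (updatedb, hald-addon-storage, and one pdflush thread). D means uninterruptible sleep, which usually indicates a task waiting on I/O; the listing alone does not establish which of them are blocked on nbd. A minimal sketch for pulling such processes out of `ps aux` style output follows. list_d_state is a hypothetical helper name, and the column positions assume the procps `ps aux` layout shown above.

```shell
# Filter `ps aux`-style output (read from stdin) down to processes whose
# STAT field (column 8) begins with "D", i.e. uninterruptible sleep.
# Prints: PID STAT COMMAND (first word of the command only).
# Hypothetical helper; assumes the procps `ps aux` column layout.
list_d_state() {
    awk '$8 ~ /^D/ { print $2, $8, $11 }'
}

# On a live system:
#   ps aux | list_d_state
```

Running this repeatedly while bonnie++ is stalled should show whether the same tasks stay stuck in D state.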
All Loaded Modules:
Module Size Used by
ext2 66824 1
nbd 23328 1
af_packet 23816 4
vmnet 32932 16
vmmon 185676 0
binfmt_misc 12680 1
rfcomm 40856 1
l2cap 25856 5 rfcomm
bluetooth 55908 4 rfcomm,l2cap
nfs 240876 0
lockd 64904 1 nfs
sunrpc 161340 3 nfs,lockd
ppdev 10116 0
speedstep_centrino 9920 0
cpufreq_userspace 5408 0
cpufreq_stats 7360 0
cpufreq_powersave 2688 0
cpufreq_ondemand 9228 1
freq_table 5792 3 speedstep_
cpufreq_
tc1100_wmi 8068 0
pcc_acpi 13184 0
dev_acpi 12292 0
sony_acpi 6284 0
video 16388 0
sbs 15652 0
i2c_ec 6016 1 sbs
dock 10268 0
button 8720 0
battery 10756 0
container 5248 0
ac 6020 0
asus_acpi 17308 0
backlight 7040 1 asus_acpi
ipv6 268960 18
lock_dlm 22092 0
gfs2 349068 1 lock_dlm
dlm 92948 1 lock_dlm
configfs 27536 2 dlm
sbp2 23812 0
parport_pc 36388 0
lp 12452 0
parport 36936 3 ppdev,parport_pc,lp
snd_hda_intel 21912 1
snd_hda_codec 205056 1 snd_hda_intel
snd_pcm_oss 44544 0
snd_mixer_oss 17408 1 snd_pcm_oss
snd_pcm 79876 3 snd_hda_
snd_seq_dummy 4740 0
joydev 10816 0
snd_seq_oss 32896 0
nvidia 6837140 22
pcmcia 39212 0
snd_seq_midi 9600 0
snd_rawmidi 25472 1 snd_seq_midi
snd_seq_midi_event 8448 2 snd_seq_
i2c_core 22656 2 i2c_ec,nvidia
snd_seq 52592 6 snd_seq_
snd_timer 23684 2 snd_pcm,snd_seq
snd_seq_device 9100 5 snd_seq_
snd 54020 12 snd_hda_
soundcore 8672 1 snd
pcspkr 4224 0
intel_agp 25244 1
agpgart 35400 2 nvidia,intel_agp
psmouse 38920 0
serio_raw 7940 0
ipw2200 148040 0
snd_page_alloc 10888 2 snd_hda_
ieee80211 34760 1 ipw2200
ieee80211_crypt 7040 1 ieee80211
sk98lin 156896 0
yenta_socket 27532 2
rsrc_nonstatic 14080 1 yenta_socket
pcmcia_core 40852 3 pcmcia,
iTCO_wdt 11812 0
iTCO_vendor_support 4868 1 iTCO_wdt
shpchp 34324 0
pci_hotplug 32576 1 shpchp
tsdev 8768 0
evdev 11008 5
ext3 133128 3
jbd 59816 1 ext3
mbcache 9604 2 ext2,ext3
sg 36252 0
sr_mod 17060 0
cdrom 37664 1 sr_mod
sd_mod 23428 5
ata_piix 15492 4
ata_generic 9092 0
libata 125720 2 ata_piix,
scsi_mod 142348 5 sbp2,sg,
ohci1394 36528 0
ieee1394 299448 2 sbp2,ohci1394
skge 40848 0
generic 5124 0 [permanent]
ehci_hcd 34188 0
uhci_hcd 25360 0
usbcore 134280 3 ehci_hcd,uhci_hcd
raid10 26240 0
raid456 126480 0
xor 16648 1 raid456
raid1 25600 0
raid0 9600 0
multipath 9856 0
linear 7296 0
md_mod 79764 6 raid10,
thermal 14856 0
processor 31048 1 thermal
fan 5636 0
dm_mod 59084 0
fbcon 42656 0
tileblit 3584 1 fbcon
font 9216 1 fbcon
bitblit 6912 1 fbcon
softcursor 3200 1 bitblit
vesafb 9220 0
capability 5896 0
commoncap 8192 1 capability
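In the module listing above, nbd and ext2 each show a "Used by" count of 1, which is consistent with one connected nbd device and one mounted ext2 filesystem (an interpretation, not something the listing proves). A small sketch for reading that count from lsmod-style output; module_use_count is a hypothetical helper name.

```shell
# Print the "Used by" count (column 3) for one module, given lsmod-style
# input on stdin. Prints nothing if the module is not listed.
# Hypothetical helper, for illustration only.
module_use_count() {
    awk -v mod="$1" '$1 == mod { print $3 }'
}

# On a live system:
#   lsmod | module_use_count nbd
```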

Beginning with the Hardy Heron 8.04 development cycle, all open Ubuntu kernel bugs need to be reported against the "linux" kernel package. We are automatically migrating this bug to the new "linux" package. However, development has already begun for the upcoming Intrepid Ibex 8.10 release. It would be helpful if you could test the upcoming release and verify whether this is still an issue: http://www.ubuntu.com/testing . If the issue still exists, please update this report by changing the Status of the "linux" task from "Incomplete" to "New". We appreciate your patience and understanding as we make this transition. Thanks!