serial8250: too much work for irq3

Bug #1719339 reported by Denys Zagorui
20
This bug affects 4 people
Affects Status Importance Assigned to Milestone
QEMU
Expired
Undecided
Unassigned

Bug Description

It's know issue and sometimes mentioned since 2007. But it seems not fixed.

http://lists.gnu.org/archive/html/qemu-devel/2008-02/msg00140.html
https://bugzilla.redhat.com/show_bug.cgi?id=986761
http://old-list-archives.xenproject.org/archives/html/xen-devel/2009-02/msg00696.html

I don't think fixes like increases PASS_LIMIT (/drivers/tty/serial/8250.c) or remove this annoying message (https://patchwork.kernel.org/patch/3920801/) is real fix. Some fix was proposed by H. Peter Anvin https://lkml.org/lkml/2008/2/7/485.

Can reproduce on Debian Strech host (Qemu 1:2.8+dfsg-6+deb9u2), Ubuntu 16.04.2 LTS (Qemu 1:2.5+dfsg-5ubuntu10.15) also tried to use master branch (QEMU emulator version 2.10.50 (v2.10.0-766-ga43415ebfd-dirty)) if we write a lot of message into console (dmesg or dd if=/dev/zero of=/dev/ttyS1).

/usr/local/bin/qemu-system-x86_64 -name guest=ultra1,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-27-ultra1/master-key.aes -machine pc-i440fx-2.8,accel=kvm,usb=off,dump-guest-core=off -cpu Skylake-Client,ds=on,acpi=on,ss=on,ht=on,tm=on,pbe=on,dtes64=on,monitor=on,ds_cpl=on,vmx=on,smx=on,est=on,tm2=on,xtpr=on,pdcm=on,osxsave=on,tsc_adjust=on,clflushopt=on,pdpe1gb=on -m 4096 -realtime mlock=off -smp 4,sockets=1,cores=4,threads=1 -uuid 4537ca29-73b2-40c3-9b43-666de182ba5f -display none -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-27-ultra1/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x8.0x7 -drive file=/home/dzagorui/csr/csr_disk.qcow2,format=qcow2,if=none,id=drive-ide0-0-0 -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -netdev tap,fd=26,id=hostnet0 -device e1000,netdev=hostnet0,id=net0,mac=52:54:00:a9:4c:86,bus=pci.0,addr=0x3 -chardev socket,id=charserial0,host=127.0.0.1,port=4000,telnet,server,nowait -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charserial1,host=127.0.0.1,port=4001,telnet,server,nowait -device isa-serial,chardev=charserial1,id=serial1 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x2 -msg timestamp=on

Tags: canonical-is
Revision history for this message
Denys Zagorui (dzagorui) wrote :

Use simpler setup for reproducing.
Was used only qemu-system-x86_64 (without using high-level wrappers and managers of virtual machines: libvirt, virsh, virt-install, virt-manager etc..). My setup with two consoles:

/usr/local/bin/qemu-system-x86_64 -cpu host -enable-kvm -m 256 -smp 4 -kernel /home/dzagorui//bzImage -append 'root=/dev/ram0 loglevel=9 rw console=ttyS0' -initrd /home/dzagorui/initrd.cpio -display none -chardev socket,id=charserial0,host=127.0.0.1,port=4002,telnet,server,nowait -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charserial1,host=127.0.0.1,port=4003,telnet,server,nowait -device isa-serial,chardev=charserial1,id=serial1

I noticed one thing, that -smp parameter affects this issue. When -smp 1 i can't reproduce this issue at all, when -smp 2 i can produce this issue only in second console (ttyS1), when -smp 4 and higher the issue produces on both consoles (ttyS1/ttyS0).
My Host cpu i5-6200U has 2 cores and 4 threads.

For reproducing was used this commands (no matter what console we use ttyS1 or ttyS0):
#dmesg > /dev/ttyS*
#dd if=/dev/zero of=/dev/ttyS*

Denys Zagorui (dzagorui)
Changed in qemu:
assignee: nobody → Denys Zagorui (dzagorui)
status: New → In Progress
Denys Zagorui (dzagorui)
Changed in qemu:
assignee: Denys Zagorui (dzagorui) → nobody
Denys Zagorui (dzagorui)
Changed in qemu:
status: In Progress → New
Revision history for this message
Paul Gear (paulgear) wrote :

I'm seeing this on AWS EC2 when there's (apparently) high logging volume to the console, very similarly to https://www.reddit.com/r/sysadmin/comments/6zuqad/mongodb_aws_ec2_serial8250_too_much_work_for_irq4/

Paul Gear (paulgear)
tags: added: canonical-is
Revision history for this message
Paul Gear (paulgear) wrote :

On further investigation of my instance, there appeared to be no high logging volume to the console, nor anything using the /dev/ttyS0 other than agetty. Switching from the generic kernel to the AWS kernel seems to have stabilised it.

Revision history for this message
Paul Gear (paulgear) wrote :

Further update: AWS kernel experienced the same error messages after just over 3 hours of runtime.

Revision history for this message
Thomas Huth (th-huth) wrote :

The QEMU project is currently considering to move its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now.
If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience.

Changed in qemu:
status: New → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for QEMU because there has been no activity for 60 days.]

Changed in qemu:
status: Incomplete → Expired
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.