qemu exits with -11 when connecting to a port redirect before the service starts listening

Bug #932539 reported by Stéphane Graber
This bug affects 2 people
Affects            Status        Importance  Assigned to  Milestone
QEMU               Fix Released  Undecided   Unassigned
qemu-kvm (Ubuntu)  Fix Released  High        Unassigned

Bug Description

This was detected initially as a crash in the auto upgrade tester.
The upgrade tester basically spawns a kvm instance in the background with a port redirect from localhost:54322 to tcp:22 in the VM, then waits for that port to accept an ssh connection before continuing the upgrade testing.

In the past (Oneiric), all worked well, but since Precise we get qemu exiting with -11 at every single test :(

A quick reproducer is:
 - start a VM that has openssh-server installed with: -net user,hostfwd=tcp::54322-:22
 - immediately start "ssh -p 54322 127.0.0.1" before the VM starts booting (BIOS/GRUB state)

Then wait for sshd to start in the VM and qemu will exit with -11.
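
For reference, the "wait for the port" step can be sketched roughly as below. This is not the upgrade tester's actual code; probe_ssh_banner and wait_for_ssh are made-up names, and the sketch assumes that the forwarded port accepts connections on the host side before sshd is up in the guest, so it waits for the "SSH-" banner rather than for a successful connect. Note that pointing this at an affected qemu is the same access pattern as the reproducer above, so it would be expected to trigger the crash rather than avoid it.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: connect to 127.0.0.1:port and check whether the peer
 * sends a banner starting with "SSH-". timeout_sec is a per-read timeout and
 * should be > 0 (0 would mean "block forever" for SO_RCVTIMEO).
 * Returns 1 if the banner was seen, 0 otherwise.
 */
int probe_ssh_banner(uint16_t port, int timeout_sec)
{
    struct sockaddr_in addr;
    struct timeval tv = { .tv_sec = timeout_sec, .tv_usec = 0 };
    char buf[4];
    size_t got = 0;

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return 0;

    /* Do not hang forever if the guest never answers. */
    setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
        /* Read until we have 4 bytes, or the peer closes / the read times out. */
        while (got < sizeof(buf)) {
            ssize_t n = recv(fd, buf + got, sizeof(buf) - got, 0);
            if (n <= 0)
                break;
            got += (size_t)n;
        }
    }
    close(fd);

    return got == sizeof(buf) && memcmp(buf, "SSH-", 4) == 0;
}

/*
 * Hypothetical helper: retry the probe up to `attempts` times, sleeping
 * delay_sec seconds after each failed try. Returns 0 once sshd answers,
 * -1 if it never did.
 */
int wait_for_ssh(uint16_t port, int attempts, unsigned delay_sec)
{
    assert(attempts > 0);
    for (int i = 0; i < attempts; i++) {
        if (probe_ssh_banner(port, 2))
            return 0;
        sleep(delay_sec);
    }
    return -1;
}
```

With the reproducer's port this would be called as wait_for_ssh(54322, 60, 5); the attempt count and delay are arbitrary.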

Serge Hallyn (serge-hallyn) wrote :

Starting program: /usr/bin/kvm -drive file=delme.img -m 512 -vnc :1 -net nic,model=virtio -net user,hostfwd=tcp::2222-:22
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffee443700 (LWP 11818)]
[New Thread 0x7fffedc42700 (LWP 11819)]
[New Thread 0x7fffc605d700 (LWP 11825)]

Program received signal SIGSEGV, Segmentation fault.
0x00005555556812f3 in slirp_insque (a=0x0, b=0x555556329498) at slirp/misc.c:27
27 slirp/misc.c: No such file or directory.
(gdb) where
#0 0x00005555556812f3 in slirp_insque (a=0x0, b=0x555556329498) at slirp/misc.c:27
#1 0x000055555567fc18 in if_start (slirp=0x555556329430) at slirp/if.c:194
#2 0x0000555555680d50 in ip_output (so=0x555556628b00, m0=0x555556629800) at slirp/ip_output.c:84
#3 0x0000555555686850 in tcp_output (tp=0x555556628bb0) at slirp/tcp_output.c:456
#4 0x0000555555688133 in tcp_timers (timer=0, tp=0x555556628bb0) at slirp/tcp_timer.c:242
#5 tcp_slowtimo (slirp=0x555556329430) at slirp/tcp_timer.c:88
#6 0x00005555556835d8 in slirp_select_poll (readfds=0x7fffffffda30, writefds=0x7fffffffdab0,
    xfds=0x7fffffffdb30, select_error=0) at slirp/slirp.c:433
#7 0x000055555565536d in main_loop_wait (nonblocking=<optimized out>) at main-loop.c:465
#8 0x00005555555c060f in main_loop () at /build/buildd/qemu-kvm-1.0+noroms/vl.c:1482
#9 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
    at /build/buildd/qemu-kvm-1.0+noroms/vl.c:3523

Changed in qemu-kvm (Ubuntu):
status: New → Confirmed
importance: Undecided → High
Serge Hallyn (serge-hallyn) wrote :

inline void
insque(void *a, void *b)
{
        register struct quehead *element = (struct quehead *) a;
        register struct quehead *head = (struct quehead *) b;
        element->qh_link = head->qh_link;

(line 27 is the last line)

(gdb) p *element
Cannot access memory at address 0x0
(gdb) p a
$3 = (void *) 0x0

This is called from here in slirp/if.c:

        /* If there are more packets for this session, re-queue them */
        if (ifm->ifs_next != /* ifm->ifs_prev != */ ifm) {
                insque(ifm->ifs_next, ifqt);
                ifs_remque(ifm);
        }

It sounds like ifm expects its last element to have ifm->ifs_next = ifm,
but it's actually == NULL.

I don't see any changes to this file that are likely to have introduced the
regression, so I'm looking further up the stack.
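
To illustrate why a NULL link ends up in insque(), here is a minimal standalone sketch of the circular-list convention that the check above relies on. It is not the slirp code: struct qnode, q_init, q_has_more and q_insert_after are simplified, made-up stand-ins. The assumption is that a node with nothing queued behind it points to itself, so "next != self" means "more packets"; a node whose next pointer is NULL also satisfies that test, and the NULL is then passed on as the element to insert.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a slirp queue link (illustrative only). */
struct qnode {
    struct qnode *next;
    struct qnode *prev;
};

/* A node with nothing else queued is a one-element ring: it links to itself. */
void q_init(struct qnode *n)
{
    n->next = n;
    n->prev = n;
}

/* Same shape as the test in if_start(): "more packets" means next is not self. */
int q_has_more(const struct qnode *n)
{
    return n->next != n;
}

/*
 * Insert `element` right after `head`. Unlike the insque() excerpt above,
 * this returns -1 instead of dereferencing a NULL element.
 */
int q_insert_after(struct qnode *element, struct qnode *head)
{
    if (element == NULL || head == NULL)
        return -1;
    assert(head->next != NULL);   /* head must already be a well-formed ring */

    element->next = head->next;
    element->prev = head;
    head->next->prev = element;
    head->next = element;
    return 0;
}
```

A node set up with q_init() reports q_has_more() == 0, while a node left with next == NULL reports 1, and q_insert_after(node.next, &head) then fails with -1 at the point where the unguarded version would segfault. The NULL guard only makes the symptom visible; the real question is still which path leaves the link uninitialised.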

Serge Hallyn (serge-hallyn) wrote :

Reproduced with up-to-date qemu.git:

./x86_64-softmmu/qemu-system-x86_64 --enable-kvm -drive file=../../delme.img -m 512 -vnc :1 -net nic,model=virtio -net user,hostfwd=tcp::2222-:22
[... immediately ssh -p 2222 localhost in another terminal, then wait while VM starts to boot ...]
Segmentation fault (core dumped)

Serge Hallyn (serge-hallyn) wrote :

I was thinking 1ab74cea060d776b19857c3babc64d729bbdba5c might have introduced it, but at that commit it doesn't happen.

Serge Hallyn (serge-hallyn) wrote :

Oddly, a bisect suggests this was introduced by

commit e3a110b527f749a2acec079c261f4481aadd3edc:
    slirp: Only start packet expiration for delayed ones

which seems rather innocent.

Changed in qemu:
status: New → Confirmed
Serge Hallyn (serge-hallyn) wrote :

This appears to be fixed upstream.

Changed in qemu:
status: Confirmed → Fix Released
Serge Hallyn (serge-hallyn) wrote :

Cherry-picked the slirp/if.c patches from upstream, which fixed the problem for me. Pushing the resulting package.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package qemu-kvm - 1.0+noroms-0ubuntu8

---------------
qemu-kvm (1.0+noroms-0ubuntu8) precise; urgency=low

  * debian/patches/slirp-*: fix bad exit with -11 when connecting to a port
    redirect before the service starts listening. (LP: #932539)
 -- Serge Hallyn <email address hidden> Fri, 16 Mar 2012 16:34:05 -0500

Changed in qemu-kvm (Ubuntu):
status: Confirmed → Fix Released