trusty-updates qemu-slof network boot breaks subsequent OpenFirmware operations

Bug #1503929 reported by William Grant on 2015-10-08
This bug affects 3 people
Affects        Status       Importance  Assigned to    Milestone
slof (Ubuntu)  Invalid      Undecided   Unassigned
  Trusty       In Progress  High        William Grant

Bug Description

qemu-slof 20140630+dfsg-1ubuntu1~14.04 from trusty-updates appears to corrupt memory during some network boot operations, while 20140630+dfsg-1ubuntu1 (utopic, identical source, but built with gcc 4.9 rather than trusty's 4.8) works fine.

For example, a "boot net" that performs a TFTP download causes a subsequent "boot cdrom" to fail. This happens even if the TFTP transfer connects but then fails (e.g. because the file doesn't exist), but not if DHCP or the connection itself fails. http://paste.ubuntu.com/12710600/ has examples to reproduce the good and bad cases, no manual DHCP/TFTP setup required.

Another manifestation of the memory corruption: a TFTP-booted GRUB works fine, except that the MAC address GRUB uses for its own TFTP operations has its last octet replaced with 0xd1. This was originally noticed when MAAS failed to boot KVM instances, as it assumed the MAC that GRUB used matched the one that Linux knew about.

Both cases are fixed by using a qemu-slof built with gcc 4.9 instead of the 4.8-built archive version.

A tcpdump excerpt covering the final TFTP packet loading GRUB itself, and the first ARP from within GRUB:

  04:01:23.536384 52:54:00:55:40:13 (oui Unknown) > 52:54:00:17:ed:f1 (oui Unknown), ethertype IPv4 (0x0800), length 60: 10.10.0.107.2001 > 10.10.0.2.38623: UDP, length 4
  04:01:23.876989 52:54:00:55:40:d1 (oui Unknown) > Broadcast, ethertype ARP (0x0806), length 60: Request who-has 10.10.0.2 tell 10.10.0.107, length 46
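
The MAC change in the excerpt above can be checked mechanically. This is a minimal sketch, assuming tcpdump was run with -e so the source MAC is the second whitespace-separated field; src_mac is a hypothetical helper name, not part of any tool:

```shell
# Hypothetical helper: print the source MAC (second field) of a "tcpdump -e" line.
src_mac() {
  printf '%s\n' "$1" | awk '{ print $2 }'
}

# The two lines from the excerpt above, copied verbatim.
tftp_line='04:01:23.536384 52:54:00:55:40:13 (oui Unknown) > 52:54:00:17:ed:f1 (oui Unknown), ethertype IPv4 (0x0800), length 60: 10.10.0.107.2001 > 10.10.0.2.38623: UDP, length 4'
arp_line='04:01:23.876989 52:54:00:55:40:d1 (oui Unknown) > Broadcast, ethertype ARP (0x0806), length 60: Request who-has 10.10.0.2 tell 10.10.0.107, length 46'

slof_mac=$(src_mac "$tftp_line")   # MAC used by SLOF for the final TFTP packet
grub_mac=$(src_mac "$arp_line")    # MAC used by GRUB for its first ARP request

echo "SLOF TFTP source MAC: $slof_mac"
echo "GRUB ARP source MAC:  $grub_mac"

# Compare the first five octets and the last octet separately.
if [ "${slof_mac%:*}" = "${grub_mac%:*}" ] && [ "${slof_mac##*:}" != "${grub_mac##*:}" ]; then
  echo "only the last octet differs: ${slof_mac##*:} -> ${grub_mac##*:}"
fi
```

The same helper could be pointed at a live capture to spot the mismatch that confused MAAS, though field positions may differ with other tcpdump options.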

[Test Case]

On a system with qemu-system-ppc installed (KVM or emulated, both are fine):

  qemu-system-ppc64 -nographic -vga none -M pseries -cdrom /path/to/ppc64el/mini.iso -device virtio-net-pci,vlan=0 -net user,tftp=/does/not/exist,vlan=0 -boot order=nd

The network boot should always fail, and with trusty-updates' 20140630+dfsg-1ubuntu1~14.04 the subsequent CD-ROM boot will also fail. With the bug fixed, the CD-ROM boot will succeed and a GRUB menu will appear.
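
To repeat the test case against more than one ISO or qemu-slof build, the command above can be wrapped in a small function. This is only a sketch: slof_test_cmd is a hypothetical helper name, it only prints the command line rather than running it, and it assumes the ISO path contains no whitespace.

```shell
# Hypothetical helper: print the test-case command line for a given ISO path.
# The flags are exactly those from the test case above; only the path varies.
slof_test_cmd() {
  iso=$1
  printf '%s ' qemu-system-ppc64 -nographic -vga none -M pseries \
    -cdrom "$iso" \
    -device virtio-net-pci,vlan=0 \
    -net user,tftp=/does/not/exist,vlan=0 \
    -boot order=nd
  printf '\n'
}

# Print the command for the placeholder path used in the test case.
slof_test_cmd /path/to/ppc64el/mini.iso
```

The printed line can be pasted into a terminal once per qemu-slof build under test. Judging success (a GRUB menu appearing after the failed network boot) is still done by eye.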

[Regression Potential]

Minimal. The patch is clearly correct, as all three callers (all in the Ethernet driver) already expect the function to take arguments in that order. The only risks are that the larger buffers will cause other objects to shift around, or that something else depends on the reliable memory corruption from the Ethernet driver.

William Grant (wgrant) on 2015-10-08
Changed in slof (Ubuntu):
status: New → Invalid
Changed in slof (Ubuntu Trusty):
importance: Undecided → High
status: New → Triaged
Changed in slof (Ubuntu Trusty):
assignee: nobody → Nikunj A Dadhania (nikunjad)
Nikunj A Dadhania (nikunjad) wrote :

We found a bug in this code earlier and fixed it with:

http://git.qemu.org/?p=SLOF.git;a=commit;h=8d96fe983fac761677453b06b0d7054db8e681f6

With the older compiler it did not show up, though. Can you check with this fix?

William Grant (wgrant) wrote :

That patch fixes both issues. Thanks!

William Grant (wgrant) on 2015-10-08
description: updated
William Grant (wgrant) on 2015-10-08
Changed in slof (Ubuntu Trusty):
assignee: Nikunj A Dadhania (nikunjad) → William Grant (wgrant)
status: Triaged → In Progress