partition table updates require a reboot

Bug #1410363 reported by James Page
This bug affects 1 person
Affects         Status   Importance  Assigned to  Milestone
linux (Ubuntu)  Invalid  High        Unassigned

Bug Description

We're seeing a problem in automated testing of ceph and swift: partition table updates always require a reboot before the kernel picks them up:

2015-01-13 16:50:07 INFO mon-relation-changed Setting name!
2015-01-13 16:50:07 INFO mon-relation-changed partNum is 1
2015-01-13 16:50:07 INFO mon-relation-changed REALLY setting name!
2015-01-13 16:50:07 INFO mon-relation-changed Warning: The kernel is still using the old partition table.
2015-01-13 16:50:07 INFO mon-relation-changed The new table will be used at the next reboot.
2015-01-13 16:50:07 INFO mon-relation-changed The operation has completed successfully.
2015-01-13 16:50:09 INFO mon-relation-changed Setting name!
2015-01-13 16:50:09 INFO mon-relation-changed partNum is 0
2015-01-13 16:50:09 INFO mon-relation-changed REALLY setting name!
2015-01-13 16:50:09 INFO mon-relation-changed Warning: The kernel is still using the old partition table.
2015-01-13 16:50:09 INFO mon-relation-changed The new table will be used at the next reboot.
2015-01-13 16:50:09 INFO mon-relation-changed The operation has completed successfully.
2015-01-13 16:50:09 INFO mon-relation-changed mkfs.xfs: cannot open /dev/vdb1: Device or resource busy
2015-01-13 16:50:09 INFO mon-relation-changed ceph-disk: Error: Command '['/sbin/mkfs', '-t', 'xfs', '-f', '-i', 'size=2048', '--', '/dev/vdb1']' returned non-zero exit status 1
2015-01-13 16:50:09 ERROR juju-log mon:1: Unable to initialize device: /dev/vdb
2015-01-13 16:50:09 INFO mon-relation-changed Traceback (most recent call last):
2015-01-13 16:50:09 INFO mon-relation-changed File "/var/lib/juju/agents/unit-ceph-0/charm/hooks/mon-relation-changed", line 381, in <module>
2015-01-13 16:50:09 INFO mon-relation-changed hooks.execute(sys.argv)
2015-01-13 16:50:09 INFO mon-relation-changed File "/var/lib/juju/agents/unit-ceph-0/charm/hooks/charmhelpers/core/hookenv.py", line 528, in execute
2015-01-13 16:50:09 INFO mon-relation-changed self._hooks[hook_name]()
2015-01-13 16:50:09 INFO mon-relation-changed File "/var/lib/juju/agents/unit-ceph-0/charm/hooks/mon-relation-changed", line 217, in mon_relation
2015-01-13 16:50:09 INFO mon-relation-changed reformat_osd(), config('ignore-device-errors'))
2015-01-13 16:50:09 INFO mon-relation-changed File "/var/lib/juju/agents/unit-ceph-0/charm/hooks/ceph.py", line 327, in osdize
2015-01-13 16:50:09 INFO mon-relation-changed osdize_dev(dev, osd_format, osd_journal, reformat_osd, ignore_errors)
2015-01-13 16:50:09 INFO mon-relation-changed File "/var/lib/juju/agents/unit-ceph-0/charm/hooks/ceph.py", line 375, in osdize_dev
2015-01-13 16:50:09 INFO mon-relation-changed raise e
2015-01-13 16:50:09 INFO mon-relation-changed subprocess.CalledProcessError: Command '['ceph-disk-prepare', '--fs-type', u'xfs', '--zap-disk', u'/dev/vdb']' returned non-zero exit status 1

This is obviously blocking deployment and subsequent testing; previous Ubuntu releases have been OK.
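
One quick check when mkfs reports "Device or resource busy" is whether the disk or one of its partitions is still mounted. The following is a minimal sketch, not part of the charm: the function name `mounts_on_device` is hypothetical, and it assumes `/proc/mounts`-style input (source device in the first field, mount point in the second).

```python
def mounts_on_device(mounts_text, device):
    """Return (source, mountpoint) pairs for a disk or its partitions.

    mounts_text is the content of /proc/mounts (or text in the same format);
    device is a whole-disk node such as "/dev/vdb".
    """
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        source, mountpoint = fields[0], fields[1]
        if source == device:
            hits.append((source, mountpoint))
            continue
        if source.startswith(device):
            # Accept partition suffixes such as "1" (vdb1) or "p1" (nvme0n1p1),
            # but not a different disk that shares the prefix (vdbb).
            suffix = source[len(device):]
            if suffix.startswith("p"):
                suffix = suffix[1:]
            if suffix.isdigit():
                hits.append((source, mountpoint))
    return hits


# Example use on a live system (Linux only):
#     with open("/proc/mounts") as f:
#         print(mounts_on_device(f.read(), "/dev/vdb"))
```

An empty result only shows that nothing is listed as mounted; it does not prove that the kernel has released the device.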

ProblemType: Bug
DistroRelease: Ubuntu 15.04
Package: linux-image-generic (not installed)
ProcVersionSignature: Ubuntu 3.18.0-8.9-generic 3.18.1
Uname: Linux 3.18.0-8-generic x86_64
AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 2: ls: cannot access /dev/snd/: No such file or directory
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.15.1-0ubuntu2
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
CRDA: Error: [Errno 2] No such file or directory: 'iw'
Date: Tue Jan 13 16:51:36 2015
Ec2AMI: ami-0000006e
Ec2AMIManifest: FIXME
Ec2AvailabilityZone: nova
Ec2InstanceType: m1.small
Ec2Kernel: aki-00000002
Ec2Ramdisk: ari-00000002
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
Lsusb: Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: OpenStack Foundation OpenStack Nova
PciMultimedia:

ProcFB:

ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.18.0-8-generic root=LABEL=cloudimg-rootfs ro console=tty1 console=ttyS0
RelatedPackageVersions:
 linux-restricted-modules-3.18.0-8-generic N/A
 linux-backports-modules-3.18.0-8-generic N/A
 linux-firmware N/A
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 01/01/2011
dmi.bios.vendor: Bochs
dmi.bios.version: Bochs
dmi.chassis.type: 1
dmi.chassis.vendor: Bochs
dmi.modalias: dmi:bvnBochs:bvrBochs:bd01/01/2011:svnOpenStackFoundation:pnOpenStackNova:pvr2014.1.3:cvnBochs:ct1:cvr:
dmi.product.name: OpenStack Nova
dmi.product.version: 2014.1.3
dmi.sys.vendor: OpenStack Foundation

James Page (james-page) wrote :
Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Do you happen to know if this was also the case with older kernels?

Changed in linux (Ubuntu):
importance: Undecided → High
tags: added: kernel-key
Stefan Bader (smb) wrote :

As discussed on IRC, this sounds like a re-appearance of the issue where some data used by cloud-init was present on the second disk. After boot and setup, this ends up in a state where unmounting succeeds but the jbd process related to the mount remains stuck.

Last time, this stopped being reproducible at some point. So we had the same issue with 3.16 kernels too, but we don't know exactly when it gets triggered.

In order to do experiments locally, we need to know exactly what kind of data is on vdb. All the documentation I can find for cloud-init claims it requires either an ISO filesystem or vfat for instance data, and for OpenStack it sounds like only the network is used. So what would be the contents of vdb, and what do we need to comply with when creating the fs (I think Trusty is running on the creating host; any fs label?).
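
For anyone trying to reproduce this, a lingering journal thread can be looked for by scanning process names. The following is a minimal sketch under stated assumptions: the helper names `read_comms` and `journal_threads` are hypothetical, and it assumes the stuck thread is a jbd2 kernel thread named `jbd2/<device>-<n>` (for example `jbd2/vdb-8`). If the filesystem uses the older jbd code, the thread name will differ and this check will not find it.

```python
import os


def read_comms(proc_root="/proc"):
    """Collect (pid, comm) pairs for all processes visible under proc_root."""
    procs = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                procs.append((int(entry), f.read().strip()))
        except OSError:
            continue  # process exited while we were scanning
    return procs


def journal_threads(procs, device):
    """Return the (pid, comm) pairs that look like jbd2 threads for device.

    Assumes the naming scheme "jbd2/<device name>..."; partitions of the
    device (e.g. vdb1 for vdb) match as well.
    """
    prefix = "jbd2/" + os.path.basename(device)
    return [(pid, comm) for pid, comm in procs if comm.startswith(prefix)]


# Example use on a live system (Linux only):
#     print(journal_threads(read_comms(), "/dev/vdb"))
```

A non-empty result after the filesystem has been unmounted would be consistent with the stuck state described above.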

Stefan Bader (smb) wrote :

Alternatively, could you provide us with a baby-steps approach to get a *stack instance to the point of failure?
Or: does one have enough control to trigger a memory dump of an instance?

Stefan Bader (smb)
Changed in linux (Ubuntu):
assignee: nobody → Stefan Bader (smb)
Stefan Bader (smb)
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
tags: added: kernel-da-key
removed: kernel-key
Ryan Beisner (1chb1n)
tags: added: openstack uosci
Stefan Bader (smb) wrote :

No updates for more than a year. Closing as invalid.

Changed in linux (Ubuntu):
assignee: Stefan Bader (smb) → nobody
status: Incomplete → Invalid