[2.1.5] Deployment failure on reprovisioned systems with bcache devices
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MAAS | Invalid | Undecided | Unassigned |
curtin | Fix Released | High | Unassigned |
Bug Description
We're currently unable to deploy Xenial machines using PCI NVMe bcache devices; install fails with the following error:
Apr 6 10:55:26 mutus cloud-init[3633]: Creating new GPT entries.
Apr 6 10:55:26 mutus kernel: [ 51.753265] sda:
Apr 6 10:55:26 mutus cloud-init[3633]: The operation has completed successfully.
Apr 6 10:55:26 mutus kernel: [ 51.771281] sda:
Apr 6 10:55:26 mutus kernel: [ 51.790637] sda:
Apr 6 10:55:27 mutus kernel: [ 52.237888] nvme0n1:
Apr 6 10:55:28 mutus cloud-init[3633]: Creating new GPT entries.
Apr 6 10:55:28 mutus kernel: [ 53.259122] nvme0n1:
Apr 6 10:55:28 mutus cloud-init[3633]: The operation has completed successfully.
Apr 6 10:55:28 mutus kernel: [ 53.307976] nvme0n1:
Apr 6 10:55:28 mutus kernel: [ 53.324729] nvme0n1:
Apr 6 10:55:28 mutus kernel: [ 53.356890] nvme0n1:
Apr 6 10:55:28 mutus kernel: [ 53.529895] sda:
Apr 6 10:55:28 mutus kernel: [ 53.531137] sda:
Apr 6 10:55:28 mutus kernel: [ 53.557188] sda:
Apr 6 10:55:28 mutus kernel: [ 53.574467] sda:
Apr 6 10:55:29 mutus kernel: [ 54.578195] sda: sda1
Apr 6 10:55:29 mutus kernel: [ 54.688927] sda: sda1
Apr 6 10:55:29 mutus kernel: [ 54.788675] sda: sda1
Apr 6 10:55:29 mutus kernel: [ 54.889419] sda: sda1
Apr 6 10:55:30 mutus kernel: [ 54.971285] sda: sda1
Apr 6 10:55:30 mutus kernel: [ 55.065077] sda: sda1
Apr 6 10:55:30 mutus kernel: [ 55.149155] sda: sda1
Apr 6 10:55:30 mutus kernel: [ 55.354850] sda: sda1
Apr 6 10:55:30 mutus kernel: [ 55.405415] sda: sda1
Apr 6 10:55:30 mutus kernel: [ 55.407443] sda: sda1 sda2
Apr 6 10:55:31 mutus kernel: [ 56.408871] sda: sda1 sda2
Apr 6 10:55:31 mutus kernel: [ 56.478676] sda: sda1 sda2
Apr 6 10:55:31 mutus kernel: [ 56.573829] sda: sda1 sda2
Apr 6 10:55:31 mutus kernel: [ 56.671560] sda: sda1 sda2
Apr 6 10:55:31 mutus kernel: [ 56.767427] sda: sda1 sda2
Apr 6 10:55:31 mutus kernel: [ 56.875916] sda: sda1 sda2
Apr 6 10:55:32 mutus kernel: [ 56.962714] sda: sda1 sda2
Apr 6 10:55:32 mutus kernel: [ 57.172994] sda: sda1 sda2
Apr 6 10:55:32 mutus kernel: [ 57.223798] sda: sda1 sda2
Apr 6 10:55:32 mutus kernel: [ 57.226333] sda: sda1 sda2 sda3
Apr 6 10:55:33 mutus kernel: [ 58.227166] sda: sda1 sda2 sda3
Apr 6 10:55:33 mutus kernel: [ 58.303846] sda: sda1 sda2 sda3
Apr 6 10:55:33 mutus kernel: [ 58.415606] sda: sda1 sda2 sda3
Apr 6 10:55:33 mutus kernel: [ 58.515676] sda: sda1 sda2 sda3
Apr 6 10:55:33 mutus kernel: [ 58.615585] sda: sda1 sda2 sda3
Apr 6 10:55:33 mutus kernel: [ 58.734039] sda: sda1 sda2 sda3
Apr 6 10:55:33 mutus kernel: [ 58.833126] sda: sda1 sda2 sda3
Apr 6 10:55:34 mutus kernel: [ 59.069875] nvme0n1:
Apr 6 10:55:34 mutus kernel: [ 59.100977] nvme0n1:
Apr 6 10:55:34 mutus kernel: [ 59.134993] nvme0n1:
Apr 6 10:55:34 mutus kernel: [ 59.174833] nvme0n1:
Apr 6 10:55:34 mutus kernel: [ 59.190097] nvme0n1: p1
Apr 6 10:55:34 mutus kernel: [ 59.794421] bcache: bch_journal_
Apr 6 10:55:34 mutus kernel: [ 59.794620] bcache: register_cache() registered cache device nvme0n1p1
Apr 6 10:55:35 mutus kernel: [ 60.190482] bcache: register_bcache() error opening /dev/nvme0n1p1: device already registered
Apr 6 10:55:35 mutus kernel: [ 60.204020] bcache: register_bcache() error opening /dev/nvme0n1p1: device already registered
Apr 6 10:55:35 mutus kernel: [ 60.208338] bcache: register_bcache() error opening /dev/nvme0n1p1: device already registered
Apr 6 10:55:35 mutus kernel: [ 60.212611] bcache: register_bcache() error opening /dev/nvme0n1p1: device already registered
Apr 6 10:55:35 mutus kernel: [ 60.223886] bcache: register_bcache() error opening /dev/nvme0n1p1: device already registered
Apr 6 10:55:35 mutus kernel: [ 60.237373] bcache: register_bcache() error opening /dev/nvme0n1p1: device already registered
Apr 6 10:55:35 mutus kernel: [ 60.249949] bcache: register_bcache() error opening /dev/nvme0n1p1: device already registered
Apr 6 10:55:35 mutus kernel: [ 60.254018] bcache: register_bcache() error opening /dev/nvme0n1p1: device already registered
Apr 6 10:55:35 mutus kernel: [ 60.258041] bcache: register_bcache() error opening /dev/nvme0n1p1: device already registered
Apr 6 10:55:35 mutus kernel: [ 60.268776] bcache: register_bcache() error opening /dev/nvme0n1p1: device already registered
Apr 6 10:55:35 mutus kernel: [ 60.282611] bcache: register_bcache() error opening /dev/nvme0n1p1: device already registered
Apr 6 10:55:35 mutus kernel: [ 60.295236] bcache: register_bcache() error opening /dev/nvme0n1p1: device already registered
Apr 6 10:55:35 mutus kernel: [ 60.299748] bcache: register_bcache() error opening /dev/nvme0n1p1: device already registered
Apr 6 10:55:35 mutus kernel: [ 60.303715] bcache: register_bcache() error opening /dev/nvme0n1p1: device already registered
Apr 6 10:55:35 mutus kernel: [ 60.315059] bcache: register_bcache() error opening /dev/nvme0n1p1: device already registered
Apr 6 10:55:35 mutus kernel: [ 60.329561] bcache: register_bcache() error opening /dev/nvme0n1p1: device already registered
Apr 6 10:55:35 mutus kernel: [ 60.343202] bcache: register_bcache() error opening /dev/nvme0n1p1: device already registered
Apr 6 10:55:35 mutus kernel: [ 60.347250] bcache: register_bcache() error opening /dev/nvme0n1p1: device already registered
Apr 6 10:55:35 mutus kernel: [ 60.351080] bcache: register_bcache() error opening /dev/nvme0n1p1: device already registered
Apr 6 10:55:35 mutus kernel: [ 60.361475] bcache: register_bcache() error opening /dev/nvme0n1p1: device already registered
Apr 6 10:55:35 mutus cloud-init[3633]: Failed to exclusively open path: /dev/nvme0n1p1
Apr 6 10:55:35 mutus cloud-init[3633]: Traceback (most recent call last):
Apr 6 10:55:35 mutus cloud-init[3633]: File "/curtin/
Apr 6 10:55:35 mutus cloud-init[3633]: fd = os.open(path, os.O_RDWR | os.O_EXCL)
Apr 6 10:55:35 mutus cloud-init[3633]: OSError: [Errno 16] Device or resource busy: '/dev/nvme0n1p1'
Apr 6 10:55:35 mutus cloud-init[3633]: Device holders with exclusive access: []
Apr 6 10:55:35 mutus cloud-init[3633]: Device mounts: []
Apr 6 10:55:35 mutus cloud-init[3633]: An error occured handling 'nvme0n1-part1': OSError - [Errno 16] Device or resource busy: '/dev/nvme0n1p1'
Apr 6 10:55:35 mutus cloud-init[3633]: [Errno 16] Device or resource busy: '/dev/nvme0n1p1'
Apr 6 10:55:35 mutus cloud-init[3633]: curtin: Installation failed with exception: Unexpected error while running command.
Apr 6 10:55:35 mutus cloud-init[3633]: Command: ['curtin', 'block-meta', 'custom']
Apr 6 10:55:35 mutus cloud-init[3633]: Exit code: 3
Apr 6 10:55:35 mutus cloud-init[3633]: Reason: -
Apr 6 10:55:35 mutus cloud-init[3633]: Stdout: Creating new GPT entries.
Apr 6 10:55:35 mutus cloud-init[3633]: The operation has completed successfully.
Apr 6 10:55:35 mutus cloud-init[3633]: Creating new GPT entries.
Apr 6 10:55:35 mutus cloud-init[3633]: The operation has completed successfully.
Apr 6 10:55:35 mutus cloud-init[3633]: Failed to exclusively open path: /dev/nvme0n1p1
Apr 6 10:55:35 mutus cloud-init[3633]: Traceback (most recent call last):
Apr 6 10:55:35 mutus cloud-init[3633]: File "/curtin/
Apr 6 10:55:35 mutus cloud-init[3633]: fd = os.open(path, os.O_RDWR | os.O_EXCL)
Apr 6 10:55:35 mutus cloud-init[3633]: OSError: [Errno 16] Device or resource busy: '/dev/nvme0n1p1'
Apr 6 10:55:35 mutus cloud-init[3633]: Device holders with exclusive access: []
Apr 6 10:55:35 mutus cloud-init[3633]: Device mounts: []
Apr 6 10:55:35 mutus cloud-init[3633]: An error occured handling 'nvme0n1-part1': OSError - [Errno 16] Device or resource busy: '/dev/nvme0n1p1'
Apr 6 10:55:35 mutus cloud-init[3633]: [Errno 16] Device or resource busy: '/dev/nvme0n1p1'
Apr 6 10:55:36 mutus cloud-init[3633]:
Apr 6 10:55:36 mutus cloud-init[3633]: Stderr: ''
Apr 6 10:55:36 mutus cloud-init[3633]: Unexpected error while running command.
Apr 6 10:55:36 mutus cloud-init[3633]: Command: ['curtin', 'block-meta', 'custom']
Apr 6 10:55:36 mutus cloud-init[3633]: Exit code: 3
Apr 6 10:55:36 mutus cloud-init[3633]: Reason: -
Apr 6 10:55:36 mutus cloud-init[3633]: Stdout: Creating new GPT entries.
Apr 6 10:55:36 mutus cloud-init[3633]: The operation has completed successfully.
Apr 6 10:55:36 mutus cloud-init[3633]: Creating new GPT entries.
Apr 6 10:55:36 mutus cloud-init[3633]: The operation has completed successfully.
Apr 6 10:55:36 mutus cloud-init[3633]: Failed to exclusively open path: /dev/nvme0n1p1
Apr 6 10:55:36 mutus cloud-init[3633]: Traceback (most recent call last):
Apr 6 10:55:36 mutus cloud-init[3633]: File "/curtin/
Apr 6 10:55:36 mutus cloud-init[3633]: fd = os.open(path, os.O_RDWR | os.O_EXCL)
Apr 6 10:55:36 mutus cloud-init[3633]: OSError: [Errno 16] Device or resource busy: '/dev/nvme0n1p1'
Apr 6 10:55:36 mutus cloud-init[3633]: Device holders with exclusive access: []
Apr 6 10:55:36 mutus cloud-init[3633]: Device mounts: []
Apr 6 10:55:36 mutus cloud-init[3633]: An error occured handling 'nvme0n1-part1': OSError - [Errno 16] Device or resource busy: '/dev/nvme0n1p1'
Apr 6 10:55:37 mutus cloud-init[3633]: [Errno 16] Device or resource busy: '/dev/nvme0n1p1'
Apr 6 10:55:37 mutus cloud-init[3633]:
Apr 6 10:55:37 mutus cloud-init[3633]: Stderr: ''
Apr 6 10:55:37 mutus cloud-init[3633]: Cloud-init v. 0.7.9 running 'modules:final' at Thu, 06 Apr 2017 10:55:21 +0000. Up 45.64 seconds.
Apr 6 10:55:37 mutus cloud-init[3633]: 2017-04-06 10:55:35,776 - util.py[WARNING]: Failed running /var/lib/
Apr 6 10:55:37 mutus cloud-init[3633]: 2017-04-06 10:55:35,779 - cc_scripts_
Apr 6 10:55:37 mutus cloud-init[3633]: 2017-04-06 10:55:35,830 - util.py[WARNING]: Running module scripts-user (<module 'cloudinit.
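The traceback above fails at `os.open(path, os.O_RDWR | os.O_EXCL)` with errno 16 (EBUSY). On Linux, opening a block device with `O_EXCL` fails this way while the kernel or another process holds the device. The kernel log shows bcache registering nvme0n1p1 just before the failure, so bcache may be the holder here. A minimal sketch of that check follows; it is not curtin's implementation. The function name `exclusive_open_status` and the `opener` parameter are hypothetical, added only so the logic can be exercised without a real block device.

```python
import errno
import os


def exclusive_open_status(path, opener=os.open):
    """Return 'ok' if path opens exclusively, 'busy' if the open hits EBUSY.

    `opener` is a hypothetical injection point for testing; it defaults
    to os.open, the call that appears in the traceback.
    """
    try:
        fd = opener(path, os.O_RDWR | os.O_EXCL)
    except OSError as exc:
        if exc.errno == errno.EBUSY:
            # Same condition as "[Errno 16] Device or resource busy".
            return "busy"
        raise  # Any other error (missing device, permissions) is not handled here.
    os.close(fd)
    return "ok"
```

On an affected machine, `exclusive_open_status("/dev/nvme0n1p1")` would be expected to return `"busy"` for as long as the device stays claimed.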
The servers are configured with a single cache set and two bcache devices: one fronting sda (created using the bcache storage profile in MAAS) and the second fronting a larger 3 TB spindle.
We also tried with just a single bcache device, but saw the same issue.
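To confirm whether a partition is still claimed by bcache on a reprovisioned machine, one option is to inspect sysfs. The sketch below assumes the layout described in the kernel's bcache documentation: a registered device exposes a `bcache` directory under its sysfs node, and a cache device attached to a cache set has a `set` symlink pointing at `/sys/fs/bcache/<cset-uuid>`. The function name `bcache_state` and the `sysfs_root` parameter are hypothetical, and this is not the logic in curtin's `clear_holders.py`.

```python
import os


def bcache_state(devname, sysfs_root="/sys"):
    """Report whether a block device looks registered with bcache.

    Assumes <sysfs_root>/class/block/<devname>/bcache exists for registered
    devices and that a cache device links to its cache set through 'set'.
    """
    bcache_dir = os.path.join(sysfs_root, "class", "block", devname, "bcache")
    state = {"registered": os.path.isdir(bcache_dir), "cache_set": None}

    set_link = os.path.join(bcache_dir, "set")
    if os.path.islink(set_link):
        # The link target's final path component is the cache set UUID.
        state["cache_set"] = os.path.basename(os.path.realpath(set_link))
    return state
```

Calling `bcache_state("nvme0n1p1")` before partitioning would show whether a cache set left over from the previous deployment is still active and would need to be stopped first.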
Related branches
- Server Team CI bot: Approve (continuous-integration)
- Scott Moser (community): Approve
Diff: 684 lines (+559/-22), 6 files modified
- curtin/block/clear_holders.py (+81/-14)
- curtin/util.py (+21/-0)
- examples/tests/nvme_bcache.yaml (+114/-0)
- tests/unittests/test_clear_holders.py (+197/-8)
- tests/unittests/test_util.py (+69/-0)
- tests/vmtests/test_nvme.py (+77/-0)
Note that this same hardware deployed successfully with bcache under MAAS 1.9 yesterday.