nova-compute/0 unit error when running func tests from dev-basic-disco-stein

Bug #1828043 reported by Nikolay Vinogradov
Affects: OpenStack Nova Compute Charm
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

I tried to run the functional tests of the nova-compute charm on ServerStack.
gate-basic-bionic-stein and gate-basic-bionic-rocky pass, but dev-basic-cosmic-rocky and dev-basic-disco-stein fail.
Revision used to reproduce:

ubuntu@nova-compute-tests-approach:~/nova-compute/tests$ git show HEAD
commit 4e9bcbc28b9e7dca66c9b22d886ac5c1979eb73e (HEAD -> master, origin/master, origin/HEAD)
Merge: a076963 599245b
Author: Zuul <email address hidden>
Date: Fri May 3 14:38:09 2019 +0000

    Merge "config: add notification-format option"

Local changes to tox.ini:

ubuntu@nova-compute-tests-approach:~/nova-compute$ git diff tox.ini
diff --git a/tox.ini b/tox.ini
index 8b80e0f..6c1a0f3 100644
--- a/tox.ini
+++ b/tox.ini
@@ -5,6 +5,8 @@
 envlist = pep8,py27,py35,py36
 skipsdist = True
 skip_missing_interpreters = True
+toxworkdir=/tmp/.tox
+

 [testenv]
 setenv = VIRTUAL_ENV={envdir}
@@ -96,7 +98,7 @@ basepython = python2.7
 deps = -r{toxinidir}/requirements.txt
        -r{toxinidir}/test-requirements.txt
 commands =
- bundletester -vl DEBUG -r json -o func-results.json gate-basic-bionic-rocky --no-destroy
+ bundletester -vl DEBUG -r json -o func-results.json dev-basic-disco-stein --no-destroy

 [testenv:func27-dfs]
 # Charm Functional Test

Then run:
ubuntu@nova-compute-tests-approach:~/nova-compute$ tox -e func27-smoke
...

After a while, juju status shows the nova-compute/0 unit in an error state:
...
nova-compute/0* error idle 5 10.5.0.19 hook failed: "config-changed"
...

ubuntu@nova-compute-tests-approach:~$ juju debug-log --include nova-compute --lines 50 | grep config-changed
unit-nova-compute-0: 20:04:34 DEBUG unit.nova-compute/0.config-changed ERROR no relation id specified
unit-nova-compute-0: 20:04:34 DEBUG unit.nova-compute/0.config-changed none
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed Setting /etc/apparmor.d/usr.bin.nova-compute to enforce mode.
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed mkfs.xfs: cannot open /dev/vdb: Device or resource busy
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed Traceback (most recent call last):
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed File "/var/lib/juju/agents/unit-nova-compute-0/charm/hooks/config-changed", line 654, in <module>
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed main()
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed File "/var/lib/juju/agents/unit-nova-compute-0/charm/hooks/config-changed", line 647, in main
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed hooks.execute(sys.argv)
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed File "/var/lib/juju/agents/unit-nova-compute-0/charm/hooks/charmhelpers/core/hookenv.py", line 914, in execute
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed self._hooks[hook_name]()
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed File "/var/lib/juju/agents/unit-nova-compute-0/charm/hooks/charmhelpers/contrib/openstack/utils.py", line 1546, in wrapped_f
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed stopstart, restart_functions)
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed File "/var/lib/juju/agents/unit-nova-compute-0/charm/hooks/charmhelpers/core/host.py", line 741, in restart_on_change_helper
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed r = lambda_f()
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed File "/var/lib/juju/agents/unit-nova-compute-0/charm/hooks/charmhelpers/contrib/openstack/utils.py", line 1545, in <lambda>
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed (lambda: f(*args, **kwargs)), __restart_map_cache['cache'],
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed File "/var/lib/juju/agents/unit-nova-compute-0/charm/hooks/charmhelpers/contrib/hardening/harden.py", line 93, in _harden_inner2
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed return f(*args, **kwargs)
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed File "/var/lib/juju/agents/unit-nova-compute-0/charm/hooks/config-changed", line 259, in config_changed
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed configure_local_ephemeral_storage()
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed File "/var/lib/juju/agents/unit-nova-compute-0/charm/hooks/nova_compute_utils.py", line 1002, in configure_local_ephemeral_storage
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed mkfs_xfs(dev, force=True)
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed File "/var/lib/juju/agents/unit-nova-compute-0/charm/hooks/charmhelpers/contrib/storage/linux/utils.py", line 126, in mkfs_xfs
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed check_call(cmd)
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed File "/usr/lib/python3.7/subprocess.py", line 347, in check_call
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed raise CalledProcessError(retcode, cmd)
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed subprocess.CalledProcessError: Command '['mkfs.xfs', '-f', '-i', 'size=1024', '/dev/vdb']' returned non-zero exit status 1.
unit-nova-compute-0: 20:04:37 ERROR juju.worker.uniter.operation hook "config-changed" failed: exit status 1
unit-nova-compute-0: 20:04:37 INFO juju.worker.uniter awaiting error resolution for "config-changed" hook
unit-nova-compute-0: 20:05:15 INFO juju.worker.uniter awaiting error resolution for "config-changed" hook
Now the problem is clearer:
unit-nova-compute-0: 20:04:37 DEBUG unit.nova-compute/0.config-changed mkfs.xfs: cannot open /dev/vdb: Device or resource busy
caused by executing
mkfs.xfs -f -i size=1024 /dev/vdb
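The failing call can be sketched as follows (a minimal sketch mirroring the arguments visible in the traceback; `build_mkfs_cmd` is a hypothetical helper name, not the actual charmhelpers code):

```python
import subprocess

def build_mkfs_cmd(device, force=True, inode_size=1024):
    """Build the mkfs.xfs command line seen in the traceback above."""
    cmd = ['mkfs.xfs']
    if force:
        cmd.append('-f')  # overwrite any existing filesystem signature
    cmd += ['-i', 'size=%d' % inode_size, device]
    return cmd

def mkfs_xfs(device):
    # check_call raises CalledProcessError on any non-zero exit status,
    # which is what aborts the config-changed hook when the device is busy.
    subprocess.check_call(build_mkfs_cmd(device))
```

Note that `-f` only forces overwriting an existing filesystem signature; it does not help when the kernel still holds the device open.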
Let's log in to the unit and take a closer look:
ubuntu@nova-compute-tests-approach:~$ juju ssh nova-compute/0
ubuntu@juju-12e7f5-default-5:~$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop0 7:0 0 89.3M 1 loop /snap/core/6673
loop1 7:1 0 53.9M 1 loop /snap/lxd/10601
vda 252:0 0 15G 0 disk
├─vda1 252:1 0 14.9G 0 part /
├─vda14 252:14 0 4M 0 part
└─vda15 252:15 0 106M 0 part /boot/efi
vdb 252:16 0 15G 0 disk
vdc 252:32 0 4G 0 disk [SWAP]
We have a 15G vdb disk.
ubuntu@juju-12e7f5-default-5:~$ sudo mkfs.xfs -f -i size=1024 /dev/vdb
mkfs.xfs: cannot open /dev/vdb: Device or resource busy

The command still fails. Checking dmesg shows that vdb was mounted as EXT4 during early boot:
ubuntu@juju-12e7f5-default-5:~$ sudo dmesg -T | grep vdb
[Tue Apr 30 19:43:33 2019] virtio_blk virtio2: [vdb] 31457280 512-byte logical blocks (16.1 GB/15.0 GiB)
[Tue Apr 30 19:43:51 2019] EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: (null)

ubuntu@juju-12e7f5-default-5:~$ sudo umount /dev/vdb
umount: /dev/vdb: not mounted.
root@juju-12e7f5-default-5:/dev/mapper# lsblk -P
NAME="loop0" MAJ:MIN="7:0" RM="0" SIZE="89.3M" RO="1" TYPE="loop" MOUNTPOINT="/snap/core/6673"
NAME="loop1" MAJ:MIN="7:1" RM="0" SIZE="53.9M" RO="1" TYPE="loop" MOUNTPOINT="/snap/lxd/10601"
NAME="vda" MAJ:MIN="252:0" RM="0" SIZE="15G" RO="0" TYPE="disk" MOUNTPOINT=""
NAME="vda1" MAJ:MIN="252:1" RM="0" SIZE="14.9G" RO="0" TYPE="part" MOUNTPOINT="/"
NAME="vda14" MAJ:MIN="252:14" RM="0" SIZE="4M" RO="0" TYPE="part" MOUNTPOINT=""
NAME="vda15" MAJ:MIN="252:15" RM="0" SIZE="106M" RO="0" TYPE="part" MOUNTPOINT="/boot/efi"
NAME="vdb" MAJ:MIN="252:16" RM="0" SIZE="15G" RO="0" TYPE="disk" MOUNTPOINT=""
NAME="vdc" MAJ:MIN="252:32" RM="0" SIZE="4G" RO="0" TYPE="disk" MOUNTPOINT="[SWAP]"

Though the device still seems to be held by the kernel:
ubuntu@juju-12e7f5-default-5:/var/log$ sudo lsof | grep vdb
jbd2/vdb- 679 root cwd DIR 252,1 4096 2 /
jbd2/vdb- 679 root rtd DIR 252,1 4096 2 /
jbd2/vdb- 679 root txt unknown /proc/679/exe
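The jbd2/vdb- entries are kernel threads flushing an ext4 journal, i.e. the kernel still considers an ext4 filesystem on vdb active even though no mountpoint is visible, which is exactly why the exclusive open fails. One way to spot such holders programmatically (a hypothetical sketch scanning /proc, assuming Linux):

```python
import glob

def jbd2_threads(dev_name):
    """Return PIDs of jbd2 journal kernel threads for a device name
    such as 'vdb'. A non-empty result means an ext3/ext4 journal is
    still active on the device, so mkfs will see it as busy."""
    pids = []
    for comm_path in glob.glob('/proc/[0-9]*/comm'):
        try:
            with open(comm_path) as f:
                comm = f.read().strip()
        except OSError:
            continue  # the process exited while we were scanning
        if comm.startswith('jbd2/%s-' % dev_name):
            pids.append(int(comm_path.split('/')[2]))
    return pids
```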

Similar (but old) issue: https://bugzilla.redhat.com/show_bug.cgi?id=851970

Revision history for this message
Nikolay Vinogradov (nikolay.vinogradov) wrote :

Tail of the failing tox run's output:
....
DEBUG:runner:2019-05-07 13:04:09 Config specifies num units for subordinate: neutron-openvswitch
DEBUG:runner:2019-05-07 13:09:38 The following units had errors:
DEBUG:runner: unit: nova-compute/0: machine: 5 agent-state: error details: hook failed: "config-changed"
DEBUG:runner:2019-05-07 13:09:38 Deployment stopped. run time: 358.56
DEBUG:runner:Traceback (most recent call last):
DEBUG:runner: File "/tmp/bundletester-ms0ttD/nova-compute/tests/dev-basic-cosmic-rocky", line 22, in <module>
DEBUG:runner: deployment = NovaBasicDeployment(series='cosmic')
DEBUG:runner: File "/tmp/bundletester-ms0ttD/nova-compute/tests/basic_deployment.py", line 63, in __init__
DEBUG:runner: self._deploy()
DEBUG:runner: File "/tmp/.tox/func27-smoke/local/lib/python2.7/site-packages/charmhelpers/contrib/amulet/deployment.py", line 85, in _deploy
DEBUG:runner: self.d.setup(timeout=timeout)
DEBUG:runner: File "/tmp/.tox/func27-smoke/local/lib/python2.7/site-packages/amulet/deployer.py", line 714, in setup
DEBUG:runner: subprocess.check_call(shlex.split(cmd))
DEBUG:runner: File "/usr/lib/python2.7/subprocess.py", line 190, in check_call
DEBUG:runner: raise CalledProcessError(retcode, cmd)
DEBUG:runner:subprocess.CalledProcessError: Command '['juju-deployer', '-W', '-c', '/tmp/amulet-juju-deployer-_BTSPR/deployer-schema.json', '-e', 'nickv-tenant-serverstack:admin/default', '-t', '5500', 'nickv-tenant-serverstack:admin/default']' returned non-zero exit status 1
DEBUG:runner:Exit Code: 1
DEBUG:bundletester.utils:Updating JUJU_MODEL: "nickv-tenant-serverstack:admin/default" -> ""
DEBUG:bundletester.fetchers:git rev-parse HEAD: 4e9bcbc28b9e7dca66c9b22d886ac5c1979eb73e

ERROR: InvocationError: '/tmp/.tox/func27-smoke/bin/bundletester -vl DEBUG -r json -o func-results.json dev-basic-cosmic-rocky --no-destroy'

Revision history for this message
Liam Young (gnuoy) wrote :
Revision history for this message
Nikolay Vinogradov (nikolay.vinogradov) wrote :

Looks like the root cause is the same. Can be closed as a duplicate of https://bugs.launchpad.net/charm-lxd/+bug/1801349/
