getting "juju.worker.diskmanager error" after recovery

Bug #1750683 reported by Anastasia
This bug affects 2 people
Affects: Canonical Juju
Status: Triaged
Importance: Low
Assigned to: Unassigned
Milestone: none listed

Bug Description

I got the errors below after recovering machines one at a time.

The scenario:
I enabled HA with 3 nodes, dropped one machine and recovered it. After that recovery, I dropped another machine and recovered it.
What should I check further?

juju 2.3.2-xenial-amd64
Ubuntu 16.04.3 LTS
MAAS Version: 2.3.0-6434-gd354690-0ubuntu1~16.04.1

machine-1: 11:15:06 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:15:08 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:15:36 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:15:38 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:16:06 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:16:08 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:16:36 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:16:38 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:17:06 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:17:08 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:17:36 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:17:38 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:18:06 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:18:08 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:18:36 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:18:38 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:19:06 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:19:08 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:19:36 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:19:38 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:20:06 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:20:08 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:20:36 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:20:38 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:21:06 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:21:08 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:21:36 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:21:38 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:22:06 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:22:08 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:22:36 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:22:38 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:23:06 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:23:08 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:23:36 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:23:38 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:24:06 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:24:08 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:24:36 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:24:38 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:25:06 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:25:08 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:25:36 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:25:38 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:26:06 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:26:08 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:26:36 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:26:38 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:27:06 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:27:08 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:27:36 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:27:38 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:28:06 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:28:08 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:28:36 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:28:38 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:29:06 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:29:08 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-1: 11:29:36 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2
machine-18: 11:29:38 ERROR juju.worker.diskmanager error getting hardware info for "cciss/c0d0" from sysfs: udevadm failed: exit status 2

John A Meinel (jameinel)
Changed in juju:
status: New → Triaged
John A Meinel (jameinel) wrote :

for "cciss/c0d0" from sysfs: udevadm failed: exit status 2

^- it definitely seems like we need better logging. The place where I see "udevadm" running is in:
./worker/diskmanager/lsblk.go

It appears that we might have a Tracef log call that gives a bit more information.
However, we are calling exec.Command().Output(), which seems to gather only stdout from the spawned udevadm, so we don't actually see any errors that it might be writing to stderr.

If we had actual output, then maybe it would have told us:
 if err != nil {
  msg := "udevadm failed"
  if output := bytes.TrimSpace(output); len(output) > 0 {
   msg += fmt.Sprintf(" (%s)", output)
  }
  return errors.Annotate(err, msg)
 }

but we really need to be trapping both stdout and stderr.
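
As a rough illustration of trapping both streams, here is a minimal sketch in Go. It is not the actual Juju code: the helper name runCapturing is hypothetical, and a deliberately failing "sh -c" command stands in for udevadm so the example runs on any Unix-like machine. It attaches separate buffers to stdout and stderr and folds the stderr text into the returned error. (For what it is worth, the os/exec documentation says that when Cmd.Stderr is nil, Output() stores the captured stderr in the ExitError.Stderr field, so that field may be another place to look.)

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
)

// runCapturing is a hypothetical helper (not part of Juju). It runs the named
// command, returns whatever the command wrote to stdout, and on failure
// includes the command's stderr text in the returned error.
func runCapturing(name string, args ...string) ([]byte, error) {
	var stdout, stderr bytes.Buffer
	cmd := exec.Command(name, args...)
	cmd.Stdout = &stdout
	cmd.Stderr = &stderr
	if err := cmd.Run(); err != nil {
		msg := fmt.Sprintf("%s failed", name)
		// Append the diagnostic text only if the command produced any.
		if detail := bytes.TrimSpace(stderr.Bytes()); len(detail) > 0 {
			msg += fmt.Sprintf(" (%s)", detail)
		}
		return stdout.Bytes(), fmt.Errorf("%s: %w", msg, err)
	}
	return stdout.Bytes(), nil
}

func main() {
	// Stand-in for udevadm: write a message to stderr and exit with status 2.
	_, err := runCapturing("sh", "-c", "echo 'device node not found' >&2; exit 2")
	fmt.Println(err)
}
```

With the stand-in command above, the printed error is "sh failed (device node not found): exit status 2", so the log line would carry the reason alongside the exit status. Keeping stdout and stderr in separate buffers, rather than merging them, leaves stdout clean for parsing on the success path.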

I don't recognize a "cciss/c0d0" device offhand. Google suggests it has something to do with HP Smart Array drivers:
http://man7.org/linux/man-pages/man4/cciss.4.html

Changed in juju:
importance: Undecided → High
Canonical Juju QA Bot (juju-qa-bot) wrote :

This bug has not been updated in 2 years, so we're marking it Low importance. If you believe this is incorrect, please update the importance.

Changed in juju:
importance: High → Low
tags: added: expirebugs-bot