Activity log for bug #1454469

Date Who What changed Old value New value Message
2015-05-13 01:04:18 Larry Michel bug added bug
2015-05-13 01:04:18 Larry Michel attachment added logs.tar.gz https://bugs.launchpad.net/bugs/1454469/+attachment/4396562/+files/logs.tar.gz
2015-05-13 01:06:02 Larry Michel description

Old value:

We had 8 nodes transition to failed state within 2 minutes. I couldn't tell from the logs (attached) what could have caused that to happen. It hit that rash of failures then things went relatively quiet for the past day. This resulted in pending state and a few of these servers came from a singled OpenStack deployment:

  '1':
    agent-state: pending
    containers:
      1/lxc/0:
        instance-id: pending
        series: trusty
      1/lxc/1:
        instance-id: pending
        series: trusty
    dns-name: hayward-43.oil
    hardware: arch=amd64 cpu-cores=8 mem=16384M tags=hw-ok,oil,hardware-sm15k,hw-glance-sm15k
    instance-id: /MAAS/api/1.0/nodes/node-9df8a42a-c4cd-11e3-824b-00163efc5068/
    series: trusty
  '2':
    agent-state: pending
    containers:
      2/lxc/0:
        instance-id: pending
        series: trusty
    dns-name: hayward-21.oil
    hardware: arch=amd64 cpu-cores=8 mem=16384M tags=hw-ok,oil-slave-4,hardware-sm15k,hw-glance-sm15k
    instance-id: /MAAS/api/1.0/nodes/node-a38685ec-c4cd-11e3-8102-00163efc5068/
    series: trusty
  '3':
    agent-state: pending
    containers:
      3/lxc/0:
        instance-id: pending
        series: trusty
    dns-name: apsaras.oil
    hardware: arch=amd64 cpu-cores=4 mem=32768M tags=hw-ok,oil-slave-1,hardware-dell-poweredge-R210,console-com2
    instance-id: /MAAS/api/1.0/nodes/node-5f9c14e6-ae98-11e3-b194-00163efc5068/
    series: trusty
  ...
  '5':
    agent-state: pending
    containers:
      5/lxc/0:
        instance-id: pending
        series: trusty
      5/lxc/1:
        instance-id: pending
        series: trusty
    dns-name: hayward-49.oil
    hardware: arch=amd64 cpu-cores=8 mem=32768M tags=hw-ok,oil-slave-1,hardware-sm15k,hw-glance-sm15k
    instance-id: /MAAS/api/1.0/nodes/node-9ff3a32e-c4cd-11e3-824b-00163efc5068/
    series: trusty

From maas logs: Node failed to deploy, but we're looking at a number of 8 failed deployments during a ~2 minute window:

May 11 13:16:29 maas-trusty-back-may22 maas.node: [INFO] hayward-43: Status transition from DEPLOYING to FAILED_DEPLOYMENT
May 11 13:16:35 maas-trusty-back-may22 maas.node: [INFO] hayward-36: Status transition from DEPLOYING to FAILED_DEPLOYMENT
May 11 13:16:53 maas-trusty-back-may22 maas.node: [INFO] hayward-21: Status transition from DEPLOYING to FAILED_DEPLOYMENT
May 11 13:17:18 maas-trusty-back-may22 maas.node: [INFO] apsaras.local: Status transition from DEPLOYING to FAILED_DEPLOYMENT
May 11 13:18:08 maas-trusty-back-may22 maas.node: [INFO] reading.local: Status transition from DEPLOYING to FAILED_DEPLOYMENT
May 11 13:18:12 maas-trusty-back-may22 maas.node: [INFO] glover.local: Status transition from DEPLOYING to FAILED_DEPLOYMENT
May 11 13:18:19 maas-trusty-back-may22 maas.node: [INFO] hayward-55: Status transition from DEPLOYING to FAILED_DEPLOYMENT
May 11 13:18:19 maas-trusty-back-may22 maas.node: [INFO] hayward-49: Status transition from DEPLOYING to FAILED_DEPLOYMENT

I have attached some logs from maas server.

New value:

We had 8 nodes transition to failed state within 2 minutes. I couldn't tell from the logs (attached) what could have caused that to happen. It hit that rash of failures then things went relatively quiet for the past day. This resulted in the servers being in pending state in juju_status.yaml with a few of these servers came from one OpenStack deployment build.

See below from juju_status.yaml:

  '1':
    agent-state: pending
    containers:
      1/lxc/0:
        instance-id: pending
        series: trusty
      1/lxc/1:
        instance-id: pending
        series: trusty
    dns-name: hayward-43.oil
    hardware: arch=amd64 cpu-cores=8 mem=16384M tags=hw-ok,oil,hardware-sm15k,hw-glance-sm15k
    instance-id: /MAAS/api/1.0/nodes/node-9df8a42a-c4cd-11e3-824b-00163efc5068/
    series: trusty
  '2':
    agent-state: pending
    containers:
      2/lxc/0:
        instance-id: pending
        series: trusty
    dns-name: hayward-21.oil
    hardware: arch=amd64 cpu-cores=8 mem=16384M tags=hw-ok,oil-slave-4,hardware-sm15k,hw-glance-sm15k
    instance-id: /MAAS/api/1.0/nodes/node-a38685ec-c4cd-11e3-8102-00163efc5068/
    series: trusty
  '3':
    agent-state: pending
    containers:
      3/lxc/0:
        instance-id: pending
        series: trusty
    dns-name: apsaras.oil
    hardware: arch=amd64 cpu-cores=4 mem=32768M tags=hw-ok,oil-slave-1,hardware-dell-poweredge-R210,console-com2
    instance-id: /MAAS/api/1.0/nodes/node-5f9c14e6-ae98-11e3-b194-00163efc5068/
    series: trusty
  ...
  '5':
    agent-state: pending
    containers:
      5/lxc/0:
        instance-id: pending
        series: trusty
      5/lxc/1:
        instance-id: pending
        series: trusty
    dns-name: hayward-49.oil
    hardware: arch=amd64 cpu-cores=8 mem=32768M tags=hw-ok,oil-slave-1,hardware-sm15k,hw-glance-sm15k
    instance-id: /MAAS/api/1.0/nodes/node-9ff3a32e-c4cd-11e3-824b-00163efc5068/
    series: trusty

From maas logs: Node failed to deploy, but we're looking at a number of 8 failed deployments during a ~2 minute window:

May 11 13:16:29 maas-trusty-back-may22 maas.node: [INFO] hayward-43: Status transition from DEPLOYING to FAILED_DEPLOYMENT
May 11 13:16:35 maas-trusty-back-may22 maas.node: [INFO] hayward-36: Status transition from DEPLOYING to FAILED_DEPLOYMENT
May 11 13:16:53 maas-trusty-back-may22 maas.node: [INFO] hayward-21: Status transition from DEPLOYING to FAILED_DEPLOYMENT
May 11 13:17:18 maas-trusty-back-may22 maas.node: [INFO] apsaras.local: Status transition from DEPLOYING to FAILED_DEPLOYMENT
May 11 13:18:08 maas-trusty-back-may22 maas.node: [INFO] reading.local: Status transition from DEPLOYING to FAILED_DEPLOYMENT
May 11 13:18:12 maas-trusty-back-may22 maas.node: [INFO] glover.local: Status transition from DEPLOYING to FAILED_DEPLOYMENT
May 11 13:18:19 maas-trusty-back-may22 maas.node: [INFO] hayward-55: Status transition from DEPLOYING to FAILED_DEPLOYMENT
May 11 13:18:19 maas-trusty-back-may22 maas.node: [INFO] hayward-49: Status transition from DEPLOYING to FAILED_DEPLOYMENT

I have attached some logs from maas server.
2015-05-13 02:23:42 Blake Rouse maas: status New Incomplete
2015-05-13 02:40:18 Larry Michel attachment added logs.tar.gz https://bugs.launchpad.net/maas/+bug/1454469/+attachment/4396587/+files/logs.tar.gz
2015-05-13 02:40:39 Larry Michel maas: status Incomplete New
2015-05-13 03:07:46 Blake Rouse maas: status New Incomplete
2015-07-20 04:17:23 Launchpad Janitor maas: status Incomplete Expired
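
For anyone triaging a similar burst, the "8 failures in a ~2 minute window" observation from the description can be checked mechanically. The following is a minimal sketch, not part of MAAS: the regular expression, the function names, and the assumed year (2015, taken from the bug date, since syslog timestamps carry no year) are all illustrative assumptions based only on the log lines quoted in this report.

```python
import re
from datetime import datetime

# Pattern for the maas.node lines quoted in the bug description.
# Assumption: other MAAS versions may format these lines differently.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ maas\.node: \[INFO\] "
    r"(?P<node>\S+): Status transition from (?P<old>\w+) to (?P<new>\w+)$"
)

SAMPLE_LOG = """\
May 11 13:16:29 maas-trusty-back-may22 maas.node: [INFO] hayward-43: Status transition from DEPLOYING to FAILED_DEPLOYMENT
May 11 13:16:35 maas-trusty-back-may22 maas.node: [INFO] hayward-36: Status transition from DEPLOYING to FAILED_DEPLOYMENT
May 11 13:16:53 maas-trusty-back-may22 maas.node: [INFO] hayward-21: Status transition from DEPLOYING to FAILED_DEPLOYMENT
May 11 13:17:18 maas-trusty-back-may22 maas.node: [INFO] apsaras.local: Status transition from DEPLOYING to FAILED_DEPLOYMENT
May 11 13:18:08 maas-trusty-back-may22 maas.node: [INFO] reading.local: Status transition from DEPLOYING to FAILED_DEPLOYMENT
May 11 13:18:12 maas-trusty-back-may22 maas.node: [INFO] glover.local: Status transition from DEPLOYING to FAILED_DEPLOYMENT
May 11 13:18:19 maas-trusty-back-may22 maas.node: [INFO] hayward-55: Status transition from DEPLOYING to FAILED_DEPLOYMENT
May 11 13:18:19 maas-trusty-back-may22 maas.node: [INFO] hayward-49: Status transition from DEPLOYING to FAILED_DEPLOYMENT
"""


def parse_transitions(lines, year=2015):
    """Return (timestamp, node, old_status, new_status) for each matching line."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip lines that are not status transitions
        # The year is an assumption; syslog timestamps do not include one.
        when = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        events.append((when, m["node"], m["old"], m["new"]))
    return events


def failed_deployments(events):
    """Keep only transitions into FAILED_DEPLOYMENT, sorted by time."""
    return sorted(e for e in events if e[3] == "FAILED_DEPLOYMENT")


if __name__ == "__main__":
    failures = failed_deployments(parse_transitions(SAMPLE_LOG.splitlines()))
    span = failures[-1][0] - failures[0][0]
    print(len(failures), span)  # prints "8 0:01:50"
```

On the excerpt quoted above, this reports 8 failures spread over 1 minute 50 seconds, which matches the reporter's description. It says nothing about the cause, which the logs in this report did not establish.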