Fuel for OpenStack

Lost node from previous deployment seen as bootstrap, but is not functional

Bug #1250137 reported by Tatyanka on 2013-11-11

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
Fuel for OpenStack	Won't Fix	Medium	Alexandr Notchenko	Fuel for OpenStack 4.0

Bug Description

ISO 4.0-22 Havana
Steps to reproduce:
1. Deploy an env with Ubuntu on KVM (1 controller + 1 compute + 2 ceph + rados), Neutron GRE. The deployment did not succeed.
2. Delete the failed env.
3. Wait until the slave nodes are discovered again after the deletion.
4. Try to deploy a simple env on CentOS (1 controller/cinder + compute, Nova Flat DHCP).
5. The deployment hangs on CentOS installation: CentOS was installed successfully on one node, the second one stays at bootstrap.
SSH to the admin node and run cobbler list:
[root@nailgun ~]# cobbler list
distros:
   bootstrap
   centos-x86_64
   ubuntu_1204_x86_64

profiles:
   bootstrap
   centos-x86_64
   ubuntu_1204_x86_64

systems:
   default
   node-8
   node-9

repos:

images:

mgmtclasses:

packages:
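
To compare the systems Cobbler knows about with the nodes that actually answer, the output above can be split into sections. This is only a minimal sketch: parse_cobbler_list is a hypothetical helper (not part of Cobbler or Fuel), and it assumes the layout shown above, with section names at the start of a line ending in ":" and indented entries below them.

```python
def parse_cobbler_list(text):
    """Group the lines of `cobbler list` output by section name."""
    sections = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        # A section header starts in column 0 and ends with a colon.
        if not line[0].isspace() and stripped.endswith(":"):
            current = stripped[:-1]
            sections[current] = []
        elif current is not None:
            sections[current].append(stripped)
    return sections


# Shortened copy of the output pasted in this report.
SAMPLE = """\
distros:
   bootstrap
   centos-x86_64
   ubuntu_1204_x86_64

systems:
   default
   node-8
   node-9

repos:

images:
"""

systems = parse_cobbler_list(SAMPLE)["systems"]
# "default" is the catch-all system entry, not a deployed node.
nodes = [name for name in systems if name != "default"]
print(nodes)
```

With the sample above only node-8 and node-9 are left, which matches the two nodes of the second environment.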

SSH to node-9. The hostname turns out to be node-4, and Ubuntu is installed on it:

[root@nailgun ~]# ssh node-9
Warning: the RSA host key for 'node-9' differs from the key for the IP address '10.108.0.7'
Offending key for IP in /root/.ssh/known_hosts:2
Matching host key in /root/.ssh/known_hosts:5
Are you sure you want to continue connecting (yes/no)? yes
Welcome to Ubuntu 12.04 LTS (GNU/Linux 3.8.0-31-generic x86_64)

* Documentation: https://help.ubuntu.com/
Last login: Mon Nov 11 15:30:18 2013 from 10.108.0.2
root@node-4:~#
The node IP is:

eth0 Link encap:Ethernet HWaddr 64:0e:dd:b6:94:67
          inet addr:10.108.0.7 Bcast:10.108.0.255 Mask:255.255.255.0
          inet6 addr: fe80::660e:ddff:feb6:9467/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:22794 errors:0 dropped:3714 overruns:0 frame:0
          TX packets:11426 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:51768264 (51.7 MB) TX bytes:2875390 (2.8 MB)

There is also an error in nailgun on PUT for this node:
2013-11-11 13:41:23 ERROR (logger) Response code '500 Internal Server Error' for PUT /api/nodes/ from 10.108.0.7:40075
2013-11-11 13:41:22 ERROR (logger) Traceback (most recent call last):
  File "/opt/nailgun/lib/python2.6/site-packages/web/application.py", line 239, in process
    return self.handle()
  File "/opt/nailgun/lib/python2.6/site-packages/web/application.py", line 230, in handle
    return self._delegate(fn, self.fvars, args)
  File "/opt/nailgun/lib/python2.6/site-packages/web/application.py", line 420, in _delegate
    return handle_class(cls)
  File "/opt/nailgun/lib/python2.6/site-packages/web/application.py", line 396, in handle_class
    return tocall(*args)
  File "<string>", line 2, in PUT
  File "/opt/nailgun/lib/python2.6/site-packages/nailgun/api/handlers/base.py", line 55, in content_json
    data = func(*args, **kwargs)
  File "/opt/nailgun/lib/python2.6/site-packages/nailgun/api/handlers/node.py", line 394, in PUT
    db().commit()
  File "/opt/nailgun/lib/python2.6/site-packages/sqlalchemy/orm/session.py", line 656, in commit
    self.transaction.commit()
  File "/opt/nailgun/lib/python2.6/site-packages/sqlalchemy/orm/session.py", line 314, in commit
    self._prepare_impl()
  File "/opt/nailgun/lib/python2.6/site-packages/sqlalchemy/orm/session.py", line 298, in _prepare_impl
    self.session.flush()
  File "/opt/nailgun/lib/python2.6/site-packages/sqlalchemy/orm/session.py", line 1583, in flush
    self._flush(objects)
  File "/opt/nailgun/lib/python2.6/site-packages/sqlalchemy/orm/session.py", line 1654, in _flush
    flush_context.execute()
  File "/opt/nailgun/lib/python2.6/site-packages/sqlalchemy/orm/unitofwork.py", line 331, in execute
    rec.execute(self)
  File "/opt/nailgun/lib/python2.6/site-packages/sqlalchemy/orm/unitofwork.py", line 475, in execute
    uow
  File "/opt/nailgun/lib/python2.6/site-packages/sqlalchemy/orm/persistence.py", line 59, in save_obj
    mapper, table, update)
  File "/opt/nailgun/lib/python2.6/site-packages/sqlalchemy/orm/persistence.py", line 485, in _emit_update_statements
    execute(statement, params)
  File "/opt/nailgun/lib/python2.6/site-packages/sqlalchemy/engine/base.py", line 1449, in execute
    params)
  File "/opt/nailgun/lib/python2.6/site-packages/sqlalchemy/engine/base.py", line 1584, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/opt/nailgun/lib/python2.6/site-packages/sqlalchemy/engine/base.py", line 1698, in _execute_context
    context)
  File "/opt/nailgun/lib/python2.6/site-packages/sqlalchemy/engine/base.py", line 1691, in _execute_context
    context)
  File "/opt/nailgun/lib/python2.6/site-packages/sqlalchemy/engine/default.py", line 331, in do_execute
    cursor.execute(statement, parameters)
IntegrityError: (IntegrityError) null value in column "mac" violates not-null constraint
'UPDATE nodes SET meta=%(meta)s, mac=%(mac)s, ip=%(ip)s WHERE nodes.id = %(nodes_id)s' {'nodes_id': 2, 'mac': None, 'meta': '{"system": {"fqdn": "node-2.test.domain.local", "manufacturer": "KVM"}, "interfaces": [{"mac": "64:91:2C:52:6D:69", "max_speed": null, "name": "eth3", "current_speed": null}, {"mac": "64:AB:6F:11:92:95", "max_speed": null, "name": "eth2", "current_speed": null}, {"mac": "64:33:9D:A0:A0:BF", "max_speed": null, "name": "eth1", "current_speed": null}, {"mac": "64:4C:D6:E3:4B:22", "max_speed": null, "name": "eth0", "current_speed": null}], "disks": [{"model": null, "disk": "disk/by-path/pci-0000:00:09.0-virtio-pci-virtio6", "name": "vdc", "size": 21474836480}, {"model": null, "disk": "disk/by-path/pci-0000:00:08.0-virtio-pci-virtio5", "name": "vdb", "size": 21474836480}, {"model": null, "disk": "disk/by-path/pci-0000:00:07.0-virtio-pci-virtio4", "name": "vda", "size": 21474836480}], "cpu": {"real": 0, "total": 1, "spec": [{"model": "QEMU Virtual CPU version 1.0", "frequency": 3410}]}, "memory": {"slots": 1, "total": 1073741824, "maximum_capacity": 1073741824, "devices": [{"type": "RAM", "size": 1073741824}]}}', 'ip': u'192.168.0.3'}

It seems that the node was not deleted properly.
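
The failing statement in the traceback sets mac to None on a column declared NOT NULL. The sketch below reproduces that class of error with the standard sqlite3 module standing in for the real Nailgun database (an assumption made only to keep the example self-contained). The table is reduced to three columns, the values are samples taken from the log above, and validate_node_update is a hypothetical guard, not an existing Nailgun function.

```python
import sqlite3


def validate_node_update(data):
    """Return a list of problems that would make the UPDATE fail (hypothetical helper)."""
    problems = []
    if not data.get("mac"):
        problems.append("mac is missing or empty")
    return problems


conn = sqlite3.connect(":memory:")
# Reduced stand-in for the nodes table: mac must never be NULL.
conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, mac TEXT NOT NULL, ip TEXT)")
conn.execute("INSERT INTO nodes (id, mac, ip) VALUES (2, '64:4C:D6:E3:4B:22', '10.108.0.7')")

# Same shape as the parameters in the traceback: mac is None.
update = {"nodes_id": 2, "mac": None, "ip": "192.168.0.3"}

try:
    conn.execute("UPDATE nodes SET mac = :mac, ip = :ip WHERE id = :nodes_id", update)
    error = None
except sqlite3.IntegrityError as exc:
    error = str(exc)

print(error)                         # constraint violation reported by the database
print(validate_node_update(update))  # the same problem caught before touching the database
```

Checking the payload before the commit would let the handler answer with a client error instead of a 500, but whether that is the right behaviour for this API is a design decision outside this report.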

Tatyanka (tatyana-leontovich) wrote on 2013-11-11:

fuel-snapshot-2013-11-11_15-36-48.tgz (1.6 MiB, application/x-tar)

Changed in fuel:
importance:	Undecided → Critical

Dmitry Pyzhov (dpyzhov) wrote on 2013-11-11:

The node failed to reboot after cluster deletion. Later it was successfully discovered as a new node.

First, we should alert the user about 'new' pre-deployed nodes.
Second, we should not assume that mcollective on such a node is able to reboot it.

We need a design for this use case.
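
One possible heuristic for the first point, as a rough sketch only: a node that registers as newly discovered but reports a provisioned-style FQDN such as node-2.test.domain.local (the value seen in the meta above) is probably a leftover from a previous deployment. The function name, the regular expression, and the assumption that a real bootstrap node reports a different hostname are all hypothetical, not existing Nailgun behaviour.

```python
import re

# Assumption: provisioned nodes are named node-<id>.<domain>, as in the log above.
PROVISIONED_FQDN = re.compile(r"^node-\d+\.")


def looks_predeployed(meta):
    """Guess whether a 'discovered' node still runs an OS from an old deployment."""
    fqdn = meta.get("system", {}).get("fqdn", "")
    return bool(PROVISIONED_FQDN.match(fqdn))


# Fragment of the meta reported in the failing PUT request.
reported = {"system": {"fqdn": "node-2.test.domain.local", "manufacturer": "KVM"}}
print(looks_predeployed(reported))                            # True
print(looks_predeployed({"system": {"fqdn": "bootstrap"}}))   # False
```

A check like this could only drive a warning to the user. It does not address the second point, that the node cannot be trusted to reboot itself.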

Changed in fuel:
importance:	Critical → Medium
summary:	- Inconsist deployment of second environment after forst one has been deleted
	+ Lost node from previous deployment seen as bootstrap, but is not functional

Mike Scherbakov (mihgen) on 2013-11-14

Changed in fuel:
milestone:	none → 4.0

Dmitry Pyzhov (dpyzhov) on 2013-11-19

Changed in fuel:
assignee:	Dmitry Pyzhov (lux-place) → Alexandr Notchenko (anotchenko)

Evgeniy L (rustyrobot) on 2013-11-19

Changed in fuel:
status:	New → Confirmed
status:	Confirmed → Triaged

Dmitry Pyzhov (dpyzhov) wrote on 2013-11-21:

Not reproducible

Changed in fuel:
status:	Triaged → Invalid

Dmitry Pyzhov (dpyzhov) on 2013-11-21

Changed in fuel:
status:	Invalid → Won't Fix
