Fuel 7.0, VLAN segmentation, with Ceph and RadosGW.
A) Reset env + Deploy = Error
B) Delete env + Re-create the same env + Deploy = Success
Possibly very close to https://bugs.launchpad.net/fuel/+bug/1529870, but the Astute error is not exactly the same.
Here are some log excerpts from scenario A:
node-1.domain.tld/neutron-openvswitch-agent.log:2016-01-09T20:15:12.451205+00:00 err: 2016-01-09 20:15:12.449 4584 ERROR neutron.agent.ovsdb.impl_vsctl [req-31db460f-e1bd-40a6-81a0-d7f81c0e66c3 ] Unable to execute ['ovs-vsctl', '--timeout=10', '--oneline', '--format=json', '--', '--columns=type', 'list', 'Interface', 'int-br-prv'].
node-11.domain.tld/bootstrap/agent.log:2016-01-09T17:31:30.477050+00:00 debug: 17:31:29.857064 #2239] DEBUG -- : Response: status: 409 body: {"message": "Node with mac 90:B1:1C:28:D0:AB already exists - doing nothing", "errors": []}
node-12.domain.tld/apache2_error.log:2016-01-09T20:07:10.082747+00:00 err: [Sat Jan 09 20:07:02.982889 2016] [fastcgi:error] [pid 7009:tid 140517287319296] (2)No such file or directory: [client 240.0.0.2:53166] FastCGI: failed to connect to server "/var/www/radosgw/s3gw.fcgi": connect() failed
node-6.domain.tld/apache2_error.log:2016-01-09T19:15:57.680770+00:00 err: [Sat Jan 09 19:15:54.518571 2016] [fastcgi:error] [pid 25243:tid 140675924281088] (2)No such file or directory: [client 10.102.255.51:48750] FastCGI: failed to connect to server "/var/www/radosgw/s3gw.fcgi": connect() failed
node-8.domain.tld/apache2_error.log:2016-01-09T19:51:15.950208+00:00 err: [Sat Jan 09 19:51:14.744628 2016] [fastcgi:error] [pid 29025:tid 140315004450560] (2)No such file or directory: [client 240.0.0.2:48587] FastCGI: failed to connect to server "/var/www/radosgw/s3gw.fcgi": connect() failed
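The FastCGI failures above mean apache could not reach the radosgw endpoint. A minimal triage sketch (the path is the one from the errors; everything else is an assumption, not the actual deployment tooling):

```shell
#!/bin/sh
# Report whether the FastCGI script apache is configured to talk to
# actually exists and is executable. A missing script, or a radosgw
# process that never started, both produce the connect() failures above.
check_radosgw_fcgi() {
  script="${1:-/var/www/radosgw/s3gw.fcgi}"
  if [ -x "$script" ]; then
    echo "ok: $script"
  else
    echo "missing-or-not-executable: $script"
  fi
}
```

On the failing nodes one would also check that the radosgw process itself is running, since the .fcgi script only proxies to it.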
node-13.domain.tld/ceph-osd.log:2016-01-09T20:22:27.017267+00:00 emerg: 2016-01-09 20:22:27.012366 7f2c473fc800 -1 auth: error reading file: /var/lib/ceph/tmp/mnt.UA889K/keyring: can't open /var/lib/ceph/tmp/mnt.UA889K/keyring: (2) No such file or directory
node-13.domain.tld/puppet-apply.log:2016-01-09T20:22:51.135691+00:00 notice: (/Stage[main]/Ceph::Osds/Ceph::Osds::Osd[/dev/sdu3]/Exec[ceph-deploy osd prepare node-13:/dev/sdu3]/returns) [node-13][WARNING] Error: Partition(s) 1 on /dev/sdu3 have been written, but we have been unable to inform the kernel of the change, probably because it/they are in use. As a result, the old partition(s) will remain in use. You should reboot now before making further changes.
node-7.domain.tld/puppet-apply.log:2016-01-09T20:14:38.523784+00:00 notice: (/Stage[main]/Ceph::Osds/Ceph::Osds::Osd[/dev/sdj3]/Exec[ceph-deploy osd prepare node-7:/dev/sdj3]/returns) [node-7][WARNING] Error: Partition(s) 1 on /dev/sdj3 have been written, but we have been unable to inform the kernel of the change, probably because it/they are in use. As a result, the old partition(s) will remain in use. You should reboot now before making further changes.
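The two warnings above identify the disks whose new partition tables the kernel could not re-read. A small helper to list every affected device from a puppet-apply log (a sketch; the grep pattern is taken verbatim from the warnings):

```shell
#!/bin/sh
# List the devices for which ceph-deploy reported that the kernel
# could not be informed of the changed partition table.
extract_stale_devices() {
  grep -o 'Partition(s) [0-9]* on /dev/[a-z0-9]*' "$1" \
    | awk '{print $NF}' \
    | sort -u
}
```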
node-14.domain.tld/ceph-osd.log:2016-01-09T20:22:25.922524+00:00 emerg: 2016-01-09 20:22:25.920223 7f771fdc4800 -1 auth: error reading file: /var/lib/ceph/tmp/mnt.cvHNTH/keyring: can't open /var/lib/ceph/tmp/mnt.cvHNTH/keyring: (2) No such file or directory
node-14.domain.tld/neutron-openvswitch-agent.log:2016-01-09T20:18:38.342165+00:00 err: 2016-01-09 20:18:38.338 10659 ERROR neutron.agent.ovsdb.impl_vsctl [req-990323c3-da16-41c0-92c8-75e3cc42e59a ] Unable to execute ['ovs-vsctl', '--timeout=10', '--oneline', '--format=json', '--', '--columns=type', 'list', 'Interface', 'int-br-prv'].
The diagnostic snapshot is 1.2 GB and available upon request (ping me on Slack or send a mail).
This looks like a ceph-deploy-specific bug: on the reset environment the kernel keeps the old partition tables in use (see the warnings on node-7 and node-13), and the subsequent keyring errors on node-13 and node-14 suggest the OSD prepare step then mounts stale partitions.
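If the stale partition tables are indeed the trigger, wiping the OSD disks before redeploying the reset environment should avoid it. A hedged sketch (sgdisk and partprobe are standard tools; the device arguments are examples, and the function only prints the commands unless FORCE=1):

```shell
#!/bin/sh
# Pre-redeploy cleanup sketch: zap the GPT/MBR on each OSD disk and ask
# the kernel to re-read the (now empty) partition table. Dry-run by
# default; set FORCE=1 to actually execute. DESTROYS DATA when forced.
zap_osd_disks() {
  for dev in "$@"; do
    for cmd in "sgdisk --zap-all $dev" "partprobe $dev"; do
      if [ "${FORCE:-0}" = "1" ]; then
        $cmd
      else
        echo "would run: $cmd"
      fi
    done
  done
}
```

Example dry run: `zap_osd_disks /dev/sdu /dev/sdj` prints the sgdisk and partprobe commands without touching the disks.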