Floating IP unreachable while upgrading the first controller

Bug #1543972 reported by sryabin
Affects             Status     Importance  Assigned to     Milestone
Fuel for OpenStack  Won't Fix  High        Sergey Abramov
Mitaka              Won't Fix  High        Sergey Abramov

Bug Description

Upgrade from 6.1 to 7.0

Before upgrading the controller, crm status shows:

Last updated: Wed Feb 10 08:42:27 2016
Last change: Tue Feb 9 19:38:59 2016
Stack: corosync
Current DC: node-1.domain.tld (1) - partition with quorum
Version: 1.1.12-561c4cf
3 Nodes configured
43 Resources configured

Online: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]

 Clone Set: clone_p_vrouter [p_vrouter]
     Started: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
 vip__management (ocf::fuel:ns_IPaddr2): Started node-1.domain.tld
 vip__public_vrouter (ocf::fuel:ns_IPaddr2): Started node-1.domain.tld
 vip__management_vrouter (ocf::fuel:ns_IPaddr2): Started node-1.domain.tld
 vip__public (ocf::fuel:ns_IPaddr2): Started node-2.domain.tld
 Master/Slave Set: master_p_conntrackd [p_conntrackd]
     Masters: [ node-1.domain.tld ]
     Slaves: [ node-2.domain.tld node-3.domain.tld ]
 Clone Set: clone_p_haproxy [p_haproxy]
     Started: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
 Clone Set: clone_p_dns [p_dns]
     Started: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
 Clone Set: clone_p_mysql [p_mysql]
     Started: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
 Master/Slave Set: master_p_rabbitmq-server [p_rabbitmq-server]
     Masters: [ node-1.domain.tld ]
     Slaves: [ node-2.domain.tld node-3.domain.tld ]
 Clone Set: clone_p_neutron-plugin-openvswitch-agent [p_neutron-plugin-openvswitch-agent]
     Started: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
 Clone Set: clone_p_neutron-dhcp-agent [p_neutron-dhcp-agent]
     Started: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
 Clone Set: clone_p_neutron-metadata-agent [p_neutron-metadata-agent]
     Started: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
 Clone Set: clone_p_neutron-l3-agent [p_neutron-l3-agent]
     Started: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
 Clone Set: clone_p_heat-engine [p_heat-engine]
     Started: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
 Clone Set: clone_p_ntp [p_ntp]
     Started: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
 Clone Set: clone_ping_vip__public [ping_vip__public]
     Started: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]

While the controller upgrade is running, crm status shows:

Stack: corosync
Current DC: node-3.domain.tld (3) - partition with quorum
Version: 1.1.12-561c4cf
3 Nodes configured
43 Resources configured

Online: [ node-2.domain.tld node-3.domain.tld ]
OFFLINE: [ node-1.domain.tld ]

 Clone Set: clone_p_vrouter [p_vrouter]
     Started: [ node-2.domain.tld node-3.domain.tld ]
 vip__management (ocf::fuel:ns_IPaddr2): Started node-2.domain.tld
 vip__public_vrouter (ocf::fuel:ns_IPaddr2): Started node-3.domain.tld
 vip__management_vrouter (ocf::fuel:ns_IPaddr2): Started node-3.domain.tld
 vip__public (ocf::fuel:ns_IPaddr2): Started node-2.domain.tld
 Master/Slave Set: master_p_conntrackd [p_conntrackd]
     Masters: [ node-3.domain.tld ]
     Slaves: [ node-2.domain.tld ]
 Clone Set: clone_p_haproxy [p_haproxy]
     Started: [ node-2.domain.tld node-3.domain.tld ]
 Clone Set: clone_p_dns [p_dns]
     Started: [ node-2.domain.tld node-3.domain.tld ]
 Clone Set: clone_p_mysql [p_mysql]
     Started: [ node-2.domain.tld node-3.domain.tld ]
 Master/Slave Set: master_p_rabbitmq-server [p_rabbitmq-server]
     p_rabbitmq-server (ocf::fuel:rabbitmq-server): FAILED Master node-2.domain.tld
     Slaves: [ node-3.domain.tld ]
 Clone Set: clone_p_neutron-plugin-openvswitch-agent [p_neutron-plugin-openvswitch-agent]
     Started: [ node-2.domain.tld node-3.domain.tld ]
 Clone Set: clone_p_neutron-dhcp-agent [p_neutron-dhcp-agent]
     Started: [ node-2.domain.tld node-3.domain.tld ]
 Clone Set: clone_p_neutron-metadata-agent [p_neutron-metadata-agent]
     Started: [ node-2.domain.tld node-3.domain.tld ]
 Clone Set: clone_p_neutron-l3-agent [p_neutron-l3-agent]
     Started: [ node-2.domain.tld node-3.domain.tld ]
 Clone Set: clone_p_heat-engine [p_heat-engine]
     Started: [ node-2.domain.tld node-3.domain.tld ]
 Clone Set: clone_p_ntp [p_ntp]
     Started: [ node-2.domain.tld node-3.domain.tld ]
 Clone Set: clone_ping_vip__public [ping_vip__public]
     Started: [ node-2.domain.tld node-3.domain.tld ]

Failed actions:
    p_rabbitmq-server_monitor_30000 on node-3.domain.tld 'unknown error' (1): call=172, status=complete, last-rc-change='Wed Feb 10 09:17:43 2016', queued=0ms, exec=0ms
    p_rabbitmq-server_promote_0 on node-2.domain.tld 'unknown error' (1): call=279, status=complete, last-rc-change='Wed Feb 10 09:16:37 2016', queued=2ms, exec=52397ms
    p_rabbitmq-server_promote_0 on node-2.domain.tld 'unknown error' (1): call=279, status=complete, last-rc-change='Wed Feb 10 09:16:37 2016', queued=2ms, exec=52397ms

At that moment the floating IPs were unreachable:
Wed Feb 10 09:24:13 UTC 2016 10.21.6.161 is unreachable
Wed Feb 10 09:24:13 UTC 2016 10.21.6.170 is unreachable
Wed Feb 10 09:24:13 UTC 2016 10.21.6.171 is unreachable
Wed Feb 10 09:24:13 UTC 2016 10.21.6.172 is unreachable
Wed Feb 10 09:24:13 UTC 2016 10.21.6.173 is unreachable
Wed Feb 10 09:24:13 UTC 2016 10.21.6.174 is unreachable
Wed Feb 10 09:24:13 UTC 2016 10.21.6.175 is unreachable
Wed Feb 10 09:24:13 UTC 2016 10.21.6.162 is unreachable
Wed Feb 10 09:24:13 UTC 2016 10.21.6.163 is unreachable
Wed Feb 10 09:24:13 UTC 2016 10.21.6.164 is unreachable
Wed Feb 10 09:24:13 UTC 2016 10.21.6.165 is unreachable
Wed Feb 10 09:24:13 UTC 2016 10.21.6.166 is unreachable
Wed Feb 10 09:24:13 UTC 2016 10.21.6.167 is unreachable
Wed Feb 10 09:24:13 UTC 2016 10.21.6.168 is unreachable
Wed Feb 10 09:24:13 UTC 2016 10.21.6.169 is unreachable
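
A reachability log like the one above could be produced with a small probe along these lines. This is only a sketch, not the script the reporter used: the function names are hypothetical, and it assumes a Linux `ping` that accepts `-c` (packet count) and `-W` (reply timeout in seconds).

```python
import subprocess
from datetime import datetime, timezone


def format_line(when, ip, reachable):
    """Render one log line in the same shape as the report above."""
    stamp = when.strftime("%a %b %d %H:%M:%S UTC %Y")
    state = "reachable" if reachable else "unreachable"
    return f"{stamp} {ip} is {state}"


def is_reachable(ip, timeout_s=1):
    """Send a single ICMP echo request; True if ping exits with status 0.

    Assumes Linux iputils ping flags: -c 1 (one packet), -W (timeout, seconds).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def probe(ips):
    """Check every floating IP once and return the formatted log lines."""
    now = datetime.now(timezone.utc)
    return [format_line(now, ip, is_reachable(ip)) for ip in ips]
```

Calling `probe(["10.21.6.161", "10.21.6.162"])` in a loop during the upgrade would give a rough measure of how long the floating IPs stay down.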

Changed in fuel:
status: New → Incomplete
importance: Undecided → High
milestone: none → 8.0-updates
assignee: nobody → Fuel Octane (fuel-octane-team)
tags: added: feature-upgrade
Changed in fuel:
assignee: Fuel Octane (fuel-octane-team) → Fuel Octane Dev Team (fuel-octane)
tags: added: team-upgrades
Changed in fuel:
assignee: Fuel Octane Dev Team (fuel-octane) → sryabin (sryabin)
sryabin (sryabin) wrote :

We can't eliminate floating IP downtime, but we can reduce it by disabling the Neutron L3 agent on the node before shutting it down.

For example:
pcs resource ban p_neutron-l3-agent node-1.domain.tld
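
As a rough illustration of how an upgrade tool might wrap this step, the sketch below builds the ban command and a matching `pcs resource clear` to lift the ban once the node is back. It is not the change proposed in the reviews linked in this report: the helper names and the `dry_run` flag are hypothetical, and clearing the ban afterwards is an assumption about the desired workflow.

```python
import subprocess

# Resource name taken from the crm status output in this report.
L3_AGENT = "p_neutron-l3-agent"


def ban_command(node, resource=L3_AGENT):
    """Command that keeps the resource off `node` (run before shutdown)."""
    return ["pcs", "resource", "ban", resource, node]


def clear_command(node, resource=L3_AGENT):
    """Command that removes the ban constraint (assumed post-upgrade step)."""
    return ["pcs", "resource", "clear", resource, node]


def run(cmd, dry_run=True):
    """Return the command line as a string; execute it only if dry_run is False."""
    if not dry_run:
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return " ".join(cmd)
```

With `dry_run=True` (the default here) nothing is executed, so `run(ban_command("node-1.domain.tld"))` just returns the command string for inspection.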

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-octane (master)

Fix proposed to branch: master
Review: https://review.openstack.org/288133

Changed in fuel:
status: Incomplete → In Progress
tags: added: area-python
Changed in fuel:
assignee: sryabin (sryabin) → Oleg S. Gelbukh (gelbuhos)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-octane (stable/8.0)

Fix proposed to branch: stable/8.0
Review: https://review.openstack.org/353968

Changed in fuel:
assignee: Oleg S. Gelbukh (gelbuhos) → Sergey Abramov (sabramov)
OpenStack Infra (hudson-openstack) wrote : Change abandoned on fuel-octane (stable/8.0)

Change abandoned by Fuel DevOps Robot (<email address hidden>) on branch: stable/8.0
Review: https://review.openstack.org/353968
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

OpenStack Infra (hudson-openstack) wrote : Change abandoned on fuel-octane (master)

Change abandoned by Fuel DevOps Robot (<email address hidden>) on branch: master
Review: https://review.openstack.org/288133
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Anton Matveev (amatveev) wrote :

Octane-specific; moving to Won't Fix due to shifted priorities.

Changed in fuel:
status: In Progress → Won't Fix