Pacemaker misbehaves when both nodes have the same hostname

Bug #1318912 reported by Walter
This bug affects 2 people
Affects:     openstack-manuals
Status:      Fix Released
Importance:  High
Assigned to: Bogdan Dobrelya
Milestone:   kilo

Bug Description

-----------------------------------
Built: 2014-05-13T00:13:53+00:00
git SHA: 8d32b87efb3847b41907e36d4ae5d1e78732f5aa
URL: http://docs.openstack.org/high-availability-guide/content/ch-network.html
source File: file:/home/jenkins/workspace/openstack-manuals-tox-doc-publishdocs/doc/high-availability-guide/bk-ha-guide.xml
xml:id: ch-network

   The manual says: "Both nodes should have the same hostname since the Networking scheduler will be aware of one node, for example a virtual router attached to a single L3 node."
    But when I tested this on two servers with the same hostname, after installing the corosync and pacemaker services on them (with no resources configured), the crm_mon output went into an endless loop, and the corosync log filled with messages like: May 09 22:25:40 [2149] TEST crmd: warning: crm_get_peer: Node 'TEST' and 'TEST' share the same cluster nodeid: 1678901258. After this I set a different nodeid in /etc/corosync/corosync.conf on each test node, but it didn't help.
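    For illustration, a corosync.conf nodelist that gives each node a distinct name and nodeid could look roughly like the snippet below (the hostnames and addresses are placeholders, and the exact layout depends on the corosync version in use):

        nodelist {
            node {
                ring0_addr: 192.168.0.11
                name: network1
                nodeid: 1
            }
            node {
                ring0_addr: 192.168.0.12
                name: network2
                nodeid: 2
            }
        }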
    So I set a different hostname for each server and configured pacemaker just as the manual describes (apart from the hostname). The neutron-dhcp-agent and neutron-metadata-agent work well, but the neutron-l3-agent does not: VM instances cannot reach the external network, and the gateway of the VM instances cannot be reached either.
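    For reference, the l3-agent primitive from the guide is configured along these lines (a rough sketch only; the parameter names follow the ocf:openstack:neutron-agent-l3 resource agent and may differ between versions):

        primitive p_neutron-l3-agent ocf:openstack:neutron-agent-l3 \
            params config="/etc/neutron/neutron.conf" \
                   plugin_config="/etc/neutron/l3_agent.ini" \
            op monitor interval="30s" timeout="30s"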
    After two days of checking, I finally found that we can use "neutron l3-agent-router-remove network1_l3_agentid external-routeid" and "neutron l3-agent-router-add network2_l3_agentid external-routeid" to let the backup l3-agent take over when the former network node is down (assume the two nodes are named network1 and network2). Alternatively, we can update the routerl3agentbindings table in the neutron database directly. If it makes sense, I think we can change the neutron-agent-l3 script: in its neutron_l3_agent_start() function, only a few lines are needed to make it work well.
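    A minimal manual failover with the neutron CLI would look roughly like this (the agent and router IDs are placeholders; "neutron agent-list" and "neutron l3-agent-list-hosting-router" show the real ones):

        # see which l3-agent currently hosts the router
        neutron l3-agent-list-hosting-router <external-router-id>

        # move the router from the failed node's agent to the surviving one
        neutron l3-agent-router-remove <network1-l3-agent-id> <external-router-id>
        neutron l3-agent-router-add <network2-l3-agent-id> <external-router-id>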

Walter (walterxj)
description: updated
Tom Fifield (fifieldt)
Changed in openstack-manuals:
status: New → Confirmed
importance: Undecided → High
milestone: none → juno
Revision history for this message
Walter (walterxj) wrote :

   With the help of Marica (<email address hidden>) from the OpenStack mailing list, I changed the RA named neutron-agent-l3 and now it works. I have attached my modified RA; I hope it helps anyone else.
   In my setup, both nodes are given different hostnames in /etc/sysconfig/network, and when the l3-agent is started by pacemaker, the node's hostname is changed to network-controller automatically, so whichever node starts the l3-agent, its hostname will be the same. This idea is from Marica (a sketch of the change is at the end of this comment).
   I changed a few lines based on Marica's RA and https://bugs.launchpad.net/openstack-manuals/+bug/1252131
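   For anyone following along, the change reduces to a few extra lines in the RA's start path, roughly like this sketch (the function name comes from the stock neutron-agent-l3 RA as mentioned above; the exact placement of the hostname call is an assumption):

      neutron_l3_agent_start() {
          # force a common hostname on whichever node starts the l3-agent,
          # so the Networking scheduler always sees the same L3 node
          hostname network-controller

          # ... original start logic of the resource agent continues here ...
      }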

Tom Fifield (fifieldt)
Changed in openstack-manuals:
status: Confirmed → Triaged
milestone: juno → kilo
tags: added: ha-guide
Changed in openstack-manuals:
assignee: nobody → Bogdan Dobrelya (bogdando)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ha-guide (master)

Fix proposed to branch: master
Review: https://review.openstack.org/135920

Changed in openstack-manuals:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ha-guide (master)

Reviewed: https://review.openstack.org/135920
Committed: https://git.openstack.org/cgit/openstack/ha-guide/commit/?id=04b18f1ccf131971b52158c2767fb37e528048e6
Submitter: Jenkins
Branch: master

commit 04b18f1ccf131971b52158c2767fb37e528048e6
Author: Bogdan Dobrelya <email address hidden>
Date: Thu Nov 20 12:35:49 2014 +0100

    Fix a note about network node names

    Closes-bug: #1318912

    Change-Id: I1e60848c6d9dee71599b75772cc331323819a4ab
    Signed-off-by: Bogdan Dobrelya <email address hidden>

Changed in openstack-manuals:
status: In Progress → Fix Released