SYMC: Multiple HAProxy processes getting spawned for single LBaaS

Bug #1495702 reported by Varun Lodaya
This bug affects 1 person
Affects              Status         Importance  Assigned to            Milestone
Juniper Openstack    (status tracked in Trunk)
  R2.0               Fix Committed  High        Divakar Dharanalakota
  R2.20              Fix Committed  High        Divakar Dharanalakota
  R2.21.x            Fix Committed  High        Divakar Dharanalakota
  R3.0               Fix Committed  High        Divakar Dharanalakota
  Trunk              Fix Committed  High        Divakar Dharanalakota
OpenContrail         New            High        Unassigned

Bug Description

Hi,

We are seeing this issue with many LBaaS instances: multiple HAProxy processes get spawned for a single LBaaS. LB updates do not work because of this. Any update to an LB member/pool is picked up by only one process, and the others keep the old config, causing inconsistencies. Made sure there is nbproc config in the configuration file.

The following is one of the affected LBaaS instances:
Pool_ID: 03e9ed77-aed9-4162-8773-46520ad34651

root@b0e002ash2004:/var/log/contrail# ps -ef | grep haproxy
haproxy 3196 1 0 May15 ? 00:04:58 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -D -p /var/run/haproxy.pid
root 7547 888 0 21:47 pts/5 00:00:00 grep --color=auto haproxy
nobody 25786 1 0 Sep10 ? 00:00:10 haproxy -f /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg -D -p /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg.pid -sf 14627
nobody 25809 1 0 Sep10 ? 00:00:09 haproxy -f /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg -D -p /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg.pid -sf 14627
nobody 25824 1 0 Sep10 ? 00:00:09 haproxy -f /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg -D -p /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg.pid -sf 14627
nobody 25855 1 0 Sep10 ? 00:00:09 haproxy -f /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg -D -p /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg.pid -sf 14627
nobody 25870 1 0 Sep10 ? 00:00:09 haproxy -f /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg -D -p /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg.pid -sf 14627
nobody 25885 1 0 Sep10 ? 00:00:10 haproxy -f /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg -D -p /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg.pid -sf 14627
nobody 25900 1 0 Sep10 ? 00:00:09 haproxy -f /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg -D -p /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg.pid -sf 14627
nobody 25915 1 0 Sep10 ? 00:00:09 haproxy -f /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg -D -p /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg.pid -sf 14627
nobody 25936 1 0 Sep10 ? 00:00:09 haproxy -f /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg -D -p /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg.pid -sf 14627
nobody 25951 1 0 Sep10 ? 00:00:09 haproxy -f /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg -D -p /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg.pid -sf 14627
nobody 41005 1 0 Aug29 ? 00:00:41 haproxy -f /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg -D -p /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg.pid -sf 32187
nobody 41035 1 0 Aug29 ? 00:00:41 haproxy -f /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg -D -p /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg.pid -sf 32187
nobody 41050 1 0 Aug29 ? 00:00:41 haproxy -f /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg -D -p /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg.pid -sf 32187
nobody 41101 1 0 Aug29 ? 00:00:41 haproxy -f /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg -D -p /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg.pid -sf 32187
nobody 41132 1 0 Aug29 ? 00:00:41 haproxy -f /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg -D -p /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg.pid -sf 32187
nobody 41162 1 0 Aug29 ? 00:00:42 haproxy -f /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg -D -p /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg.pid -sf 32187
nobody 41200 1 0 Aug29 ? 00:00:43 haproxy -f /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg -D -p /var/lib/contrail/loadbalancer/03e9ed77-aed9-4162-8773-46520ad34651/etc/haproxy/haproxy.cfg.pid -sf 32187
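
A listing like the one above is easier to read once the processes are grouped by config file and by the PID passed to `-sf`. The following is only a diagnostic sketch, not part of Contrail: the function name `group_haproxy_processes` is hypothetical, and it assumes `ps -ef` style lines in which the command is the eighth whitespace-separated field, as in the output above.

```python
import os
from collections import defaultdict


def _arg_after(tokens, flag):
    """Return the token that follows `flag`, or None if the flag is absent."""
    if flag in tokens[:-1]:
        return tokens[tokens.index(flag) + 1]
    return None


def group_haproxy_processes(ps_lines):
    """Group haproxy PIDs by (config file, PID given to -sf).

    Assumes `ps -ef` layout: UID PID PPID C STIME TTY TIME CMD...
    """
    groups = defaultdict(list)
    for line in ps_lines:
        tokens = line.split()
        # Skip short lines and anything whose command is not haproxy
        # (for example the `grep haproxy` process itself).
        if len(tokens) < 8 or os.path.basename(tokens[7]) != "haproxy":
            continue
        cfg = _arg_after(tokens, "-f")
        old_pid = _arg_after(tokens, "-sf")
        groups[(cfg, old_pid)].append(int(tokens[1]))
    return dict(groups)
```

Applied to the listing above, this would put the ten processes started on Sep10 under `-sf 14627` and the seven started on Aug29 under `-sf 32187`, which makes it visible that repeated reloads kept pointing at the same old PID.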

Changed in opencontrail:
importance: Undecided → High
Jeba Paulaiyan (jebap)
no longer affects: juniperopenstack/r2.0
Revision history for this message
Varun Lodaya (varun-lodaya) wrote :

Also, no errors whatsoever in the vrouter-agent log files

Revision history for this message
Jeba Paulaiyan (jebap) wrote :

Assigning to Vivek Garg to re-create internally.

Revision history for this message
Varun Lodaya (varun-lodaya) wrote :

The haproxy config file for the LBaaS with many instances did not look correct. Attaching the observed config:

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.0

Review in progress for https://review.opencontrail.org/14410
Submitter: Divakar Dharanalakota (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.20.x

Review in progress for https://review.opencontrail.org/14440
Submitter: Divakar Dharanalakota (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.20

Review in progress for https://review.opencontrail.org/14442
Submitter: Divakar Dharanalakota (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.0

Review in progress for https://review.opencontrail.org/14444
Submitter: Divakar Dharanalakota (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.20

Review in progress for https://review.opencontrail.org/14442
Submitter: Divakar Dharanalakota (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/14442
Committed: http://github.org/Juniper/contrail-controller/commit/9bbe8d2ec3add95af5602fc8a8ed77c2dcd68be6
Submitter: Zuul
Branch: R2.20

commit 9bbe8d2ec3add95af5602fc8a8ed77c2dcd68be6
Author: Divakar <email address hidden>
Date: Thu Oct 15 16:11:50 2015 +0530

Taking the haproxy process id from config.pid file

The haproxy config is updated by providing the new configuration file
and the old haproxy process id. This starts a new haproxy process while
letting the old haproxy process handle the old sessions. The new haproxy
handles the new sessions. If a new configuration update comes while both
old and new haproxy processes are running, another haproxy process needs
to be started by providing the latest running haproxy process id.

In the existing update code, the haproxy process id is found from "ps"
output. This results in getting an older process id rather than the
latest haproxy process id. As a result, the second haproxy process never
gets killed, leaving two haproxies running forever.

The latest haproxy process, which handles the new sessions, is always
recorded in a file by haproxy itself. As a fix, this process id is used to
start a new haproxy rather than taking the pid from "ps". When the
loadbalancer VIP is deleted, Agent deletes the directory where the haproxy
config file, stats file and pid file are stored. This can result in the
vrouter_netns.py script failing, as the pid is not found to stop haproxy.
To overcome this, the config file, pid file and sock file are not stored
in an explicit directory corresponding to the pool. Only the config file
and sock file are deleted by Agent, and the pid file is deleted by the
vrouter_netns script.

Change-Id: Id7e1a74a4f076f02c052f860ab5b183adf950640
closes-bug: #1495702
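
The reload approach described in this commit message can be sketched roughly as follows. This is not the code from the commit: the helper names `read_pids` and `build_reload_command` are hypothetical, and the sketch only assumes the haproxy flags already visible in the `ps` output of this report (`-f`, `-D`, `-p`, `-sf`). The PIDs handed to `-sf` come from the pid file that haproxy itself writes, not from `ps`.

```python
def read_pids(pid_file):
    """Return the PIDs recorded in pid_file, or [] if it is missing or invalid."""
    try:
        with open(pid_file) as f:
            return [int(tok) for tok in f.read().split()]
    except (OSError, ValueError):
        return []


def build_reload_command(cfg_file, pid_file):
    """Build a haproxy command line that soft-stops the latest instance.

    On first start the pid file does not exist yet, so no -sf is added.
    """
    cmd = ["haproxy", "-f", cfg_file, "-D", "-p", pid_file]
    old_pids = read_pids(pid_file)
    if old_pids:
        # -sf asks the listed processes to finish their sessions and exit.
        cmd += ["-sf"] + [str(pid) for pid in old_pids]
    return cmd
```

Because haproxy rewrites the file given to `-p` on every start, each reload built this way targets the most recently started instance, so older instances are not left behind when updates arrive in quick succession.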

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/14444
Committed: http://github.org/Juniper/contrail-controller/commit/6083a6b34aba1eecd9bea3bc4712553da3c9dc4d
Submitter: Zuul
Branch: R2.0

commit 6083a6b34aba1eecd9bea3bc4712553da3c9dc4d
Author: Divakar <email address hidden>
Date: Thu Oct 15 16:38:59 2015 +0530

Taking the haproxy process id from config.pid file

The haproxy config is updated by providing the new configuration file
and the old haproxy process id. This starts a new haproxy process while
letting the old haproxy process handle the old sessions. The new haproxy
handles the new sessions. If a new configuration update comes while both
old and new haproxy processes are running, another haproxy process needs
to be started by providing the latest running haproxy process id.

In the existing update code, the haproxy process id is found from "ps"
output. This results in getting an older process id rather than the
latest haproxy process id. As a result, the second haproxy process never
gets killed, leaving two haproxies running forever.

The latest haproxy process, which handles the new sessions, is always
recorded in a file by haproxy itself. As a fix, this process id is used to
start a new haproxy rather than taking the pid from "ps". When the
loadbalancer VIP is deleted, Agent deletes the directory where the haproxy
config file, stats file and pid file are stored. This can result in the
vrouter_netns.py script failing, as the pid is not found to stop haproxy.
To overcome this, the config file, pid file and sock file are not stored
in an explicit directory corresponding to the pool. Only the config file
and sock file are deleted by Agent, and the pid file is deleted by the
vrouter_netns script.

Change-Id: I2f9bda829f90859ae55991e7e4313bc2119c44a1
closes-bug: #1495702
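
The delete path mentioned in this commit message (the script stops haproxy using the pid file, then removes the pid file itself) might look something like the sketch below. It is an illustration under assumptions, not the vrouter_netns.py code: the function name `stop_haproxy` and the injectable `kill` parameter are hypothetical, the latter added only so the logic can be exercised without real processes.

```python
import os
import signal


def stop_haproxy(pid_file, kill=os.kill):
    """Send SIGTERM to the PIDs recorded in pid_file, then remove the file.

    Returns the list of PIDs that were actually signalled.
    """
    try:
        with open(pid_file) as f:
            pids = [int(tok) for tok in f.read().split()]
    except (OSError, ValueError):
        return []  # no usable pid file: treat haproxy as already stopped

    signalled = []
    for pid in pids:
        try:
            kill(pid, signal.SIGTERM)
            signalled.append(pid)
        except ProcessLookupError:
            pass  # process already exited

    try:
        os.remove(pid_file)
    except FileNotFoundError:
        pass  # removed concurrently; nothing left to clean up
    return signalled
```

Treating a missing pid file as "already stopped" keeps the delete path from failing when the file has gone away first, which is the failure mode the commit message describes.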

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.21.x

Review in progress for https://review.opencontrail.org/14877
Submitter: Rudra Rugge (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/14877
Committed: http://github.org/Juniper/contrail-controller/commit/2bfbb468a25a5779e2eec15d935ace0990e00d87
Submitter: Zuul
Branch: R2.21.x

commit 2bfbb468a25a5779e2eec15d935ace0990e00d87
Author: Divakar <email address hidden>
Date: Thu Oct 15 16:11:50 2015 +0530

Taking the haproxy process id from config.pid file

The haproxy config is updated by providing the new configuration file
and the old haproxy process id. This starts a new haproxy process while
letting the old haproxy process handle the old sessions. The new haproxy
handles the new sessions. If a new configuration update comes while both
old and new haproxy processes are running, another haproxy process needs
to be started by providing the latest running haproxy process id.

In the existing update code, the haproxy process id is found from "ps"
output. This results in getting an older process id rather than the
latest haproxy process id. As a result, the second haproxy process never
gets killed, leaving two haproxies running forever.

The latest haproxy process, which handles the new sessions, is always
recorded in a file by haproxy itself. As a fix, this process id is used to
start a new haproxy rather than taking the pid from "ps". When the
loadbalancer VIP is deleted, Agent deletes the directory where the haproxy
config file, stats file and pid file are stored. This can result in the
vrouter_netns.py script failing, as the pid is not found to stop haproxy.
To overcome this, the config file, pid file and sock file are not stored
in an explicit directory corresponding to the pool. Only the config file
and sock file are deleted by Agent, and the pid file is deleted by the
vrouter_netns script.

Change-Id: Id7e1a74a4f076f02c052f860ab5b183adf950640
closes-bug: #1495702

summary: - Multiple HAProxy processes getting spawned for single LBaaS
+ SYMC: Multiple HAProxy processes getting spawned for single LBaaS
Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/17834
Submitter: Divakar Dharanalakota (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/17834
Committed: http://github.org/Juniper/contrail-controller/commit/8a30644055aeec528a6223223adc5a1d9f76b925
Submitter: Zuul
Branch: R3.0

commit 8a30644055aeec528a6223223adc5a1d9f76b925
Author: Divakar <email address hidden>
Date: Thu Oct 15 16:11:50 2015 +0530

Taking the haproxy process id from config.pid file

The haproxy config is updated by providing the new configuration file
and the old haproxy process id. This starts a new haproxy process while
letting the old haproxy process handle the old sessions. The new haproxy
handles the new sessions. If a new configuration update comes while both
old and new haproxy processes are running, another haproxy process needs
to be started by providing the latest running haproxy process id.

In the existing update code, the haproxy process id is found from "ps"
output. This results in getting an older process id rather than the
latest haproxy process id. As a result, the second haproxy process never
gets killed, leaving two haproxies running forever.

The latest haproxy process, which handles the new sessions, is always
recorded in a file by haproxy itself. As a fix, this process id is used to
start a new haproxy rather than taking the pid from "ps". When the
loadbalancer VIP is deleted, Agent deletes the directory where the haproxy
config file, stats file and pid file are stored. This can result in the
vrouter_netns.py script failing, as the pid is not found to stop haproxy.
To overcome this, the config file, pid file and sock file are not stored
in an explicit directory corresponding to the pool. Only the config file
and sock file are deleted by Agent, and the pid file is deleted by the
vrouter_netns script.

closes-bug: #1495702

Conflicts:
 src/nodemgr/vrouter_nodemgr/haproxy_stats.py
 src/vnsw/agent/oper/instance_manager.cc
 src/vnsw/agent/oper/netns_instance_adapter.cc
 src/vnsw/agent/oper/test/instance_manager_test.cc
 src/vnsw/opencontrail-vrouter-netns/opencontrail_vrouter_netns/vrouter_netns.py

Change-Id: If5fd9be0c5229aa68f4b104d3a566f6ac612814e

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/18000
Submitter: Divakar Dharanalakota (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/18000
Committed: http://github.org/Juniper/contrail-controller/commit/04f36c5282f0b4efc9a798c922020151dda26a7a
Submitter: Zuul
Branch: master

commit 04f36c5282f0b4efc9a798c922020151dda26a7a
Author: Divakar <email address hidden>
Date: Thu Oct 15 16:11:50 2015 +0530

Taking the haproxy process id from config.pid file

The haproxy config is updated by providing the new configuration file
and the old haproxy process id. This starts a new haproxy process while
letting the old haproxy process handle the old sessions. The new haproxy
handles the new sessions. If a new configuration update comes while both
old and new haproxy processes are running, another haproxy process needs
to be started by providing the latest running haproxy process id.

In the existing update code, the haproxy process id is found from "ps"
output. This results in getting an older process id rather than the
latest haproxy process id. As a result, the second haproxy process never
gets killed, leaving two haproxies running forever.

The latest haproxy process, which handles the new sessions, is always
recorded in a file by haproxy itself. As a fix, this process id is used to
start a new haproxy rather than taking the pid from "ps". When the
loadbalancer VIP is deleted, Agent deletes the directory where the haproxy
config file, stats file and pid file are stored. This can result in the
vrouter_netns.py script failing, as the pid is not found to stop haproxy.
To overcome this, the config file, pid file and sock file are not stored
in an explicit directory corresponding to the pool. Only the config file
and sock file are deleted by Agent, and the pid file is deleted by the
vrouter_netns script.

Change-Id: Iebc6230e0eef73c4e18b18e4fc0ca65b8af6b4e4
closes-bug: #1495702
