Contrail analytics response time varies based on the number of VN/VMI when one of the control node fails

Bug #1719236 reported by vijaya kumar shankaran on 2017-09-25
This bug affects 1 person
Affects: Juniper Openstack (status tracked in Trunk)

Series   Status          Importance   Assigned to
R3.1     Fix Committed   High         Zhiqiang Cui
R3.2     Fix Committed   High         Zhiqiang Cui
R4.0     Fix Committed   High         Zhiqiang Cui
R4.1     Fix Committed   High         Zhiqiang Cui
R5.0     Fix Committed   High         Zhiqiang Cui
Trunk    Fix Committed   High         Zhiqiang Cui

Bug Description

The customer is testing analytics response time when one of the control nodes fails. The response time varies with the number of VNs and VMIs: the greater the number of VNs and VMIs, the longer the response takes.

The customer setup is as follows:
3 Control, config
3 collector
3 DB
1 openstack
6 compute nodes
2 TSN nodes

/etc/contrail/contrail-vrouter-agent.conf is modified on each compute node to point to the collector nodes.
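For illustration only, such an override might look like the fragment below. The section name, key, and collector port are assumptions (commonly `collectors` under `[DEFAULT]` with the collector's sandesh port 8086 in this era of Contrail), so verify against your release; this is not the customer's actual file:

```ini
# /etc/contrail/contrail-vrouter-agent.conf (illustrative fragment only)
[DEFAULT]
# Point the agent at the three collector nodes explicitly
collectors = 10.0.0.124:8086 10.0.0.125:8086 10.0.0.126:8086
```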
The customer has provided scripts to create VNs and VMIs and to query analytics. They shut down one of the control nodes and note the time. They see a large difference in the time to get a correct response to the analytics queries, depending on the number of VNs and VMIs:
With one control node shut down:

VN     VMI    Response time
303    600    5 sec
1500   3000   50 sec
3000   6000   approx. 2 min

The above delta roughly doubles when two control nodes are shut down.
Is this intended behavior?
Why is this difference noticed in a clustered scenario when collector nodes stop responding (the nodes are shut down to replicate this)? The DB nodes are all up and running when performing this test.
Can the response time be reduced and made consistent irrespective of the number of interfaces?

I could replicate the issue in the lab up to 1500 VNs and 3000 VMIs. Due to resource constraints I couldn't scale this higher.

When querying for VMIs we were getting HTTP 200 OK as the response but nothing pertaining to the interface or network (output of script):

Valid response
10.204.74.242:8081 default-domain:mock:vmi_ntt-comp5_0100_01 200 {"UveVMInterfaceAgent": {"ip6_active": f

Invalid response
10.204.74.242:8081 default-domain:mock:vmi_ntt-comp6_0100_01 200 {}
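Both responses above come back as HTTP 200, so the status code alone cannot distinguish them; only the body tells a populated UVE from an empty one. A small helper (a sketch, not part of the customer's script) makes that check explicit:

```python
import json

def has_uve_data(body):
    """Return True when an analytics UVE response body actually carries data.

    The analytics-api returns HTTP 200 in both cases above, so we must
    inspect the JSON body: an empty object {} means the UVE is
    (temporarily) missing from this node.
    """
    try:
        payload = json.loads(body)
    except ValueError:  # not JSON at all
        return False
    return bool(payload)

# For the truncated payloads above:
#   has_uve_data('{"UveVMInterfaceAgent": {"ip6_active": false}}')  -> True
#   has_uve_data('{}')                                              -> False
```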

From contrail-alarm-gen.log (sv-25_log-large_sv-24_down):

09/05/2017 11:46:02 AM [contrail-alarm-gen]: -uve-3 An exception of type LeaderNotAvailableError occured. Arguments:

LeaderNotAvailableError: LeaderNotAvailableError - 5 - This error is thrown if we are in the middle of a leadership election and there is currently no leader for this partition and hence it is unavailable for writes.

09/05/2017 11:46:07 AM [contrail-alarm-gen]: -uve-23 An exception of type LeaderNotAvailableError occured. Arguments:
LeaderNotAvailableError: LeaderNotAvailableError - 5 - This error is thrown if we are in the middle of a leadership election and there is currently no leader for this partition and hence it is unavailable for writes.

09/05/2017 11:46:07 AM [contrail-alarm-gen]: Error: Consumer Failure LeaderNotAvailableError occured. Arguments:
LeaderNotAvailableError: LeaderNotAvailableError - 5 - This error is thrown if we are in the middle of a leadership election and there is currently no leader for this partition and hence it is unavailable for writ

09/05/2017 11:51:00 AM [contrail-alarm-gen]: redis-uve failed Error connecting to 192.168.0.124:6379. timed out. for key ObjectVRouter:sv-39: (u'192.168.0.124', 6379, 1149) tb Traceback (most recent call last):
ConnectionError: Error connecting to 192.168.0.124:6379. timed out.

09/05/2017 11:51:00 AM [contrail-alarm-gen]: redis-uve failed Error connecting to 192.168.0.124:6379. timed out. for key ObjectGeneratorInfo:sv-21:Control:contrail-dns:0: (u'192.168.0.124', 6379, 1149) tb Traceback (most recent call last):

ConnectionError: Error connecting to 192.168.0.124:6379. timed out.

09/05/2017 11:51:00 AM [contrail-alarm-gen]: Exception KeyError in notif worker. Arguments:
((u'192.168.0.124', 6379, 1149),) : traceback Traceback (most recent call last):

09/05/2017 11:51:04 AM [contrail-alarm-gen]: -uve-12 An exception of type KeyError occured. Arguments:

09/05/2017 11:51:40 AM [contrail-alarm-gen]: Starting part 2 collectors [u'192.168.0.126:6379', u'192.168.0.125:6379']

information type: Proprietary → Private
information type: Private → Public

Hi,

Any Update?

Best Regards,
Vijay Kumar

Hi Vijay,

Can you please confirm the Ubuntu version?
UVE aggregation doesn't use Kafka if the Ubuntu version is 12.X,
so I need to know the Ubuntu version to look at the right code path.

Thanks,
Sundar

Hi Sundar,

Customer is testing this on Ubuntu 14.04.5.

Best Regards,
Vijay Kumar

Hi Sundar,

Could you please provide us an update?

Best Regards,
Vijay Kumar

Hi Sundar,

Any updates?

Best Regards,
Vijay Kumar

Changed in juniperopenstack:
assignee: nobody → Arvind (arvindv)

Hi Arvind,

This is a long-pending issue and the customer is looking forward to an update.

Could you please provide us an update or an ETA?

Best Regards,
Vijay Kumar

Hi Arvind,

Any updates?

Best Regards,
Vijay Kumar

Arvind (arvindv) wrote :

Hi Vijaya Kumar,

Sorry for the delayed response.
I am trying to understand the setup and the queries being issued...

1) Are control and config on the same node, or are they on different nodes?
2) Can you explain a bit about the script they are using to query analytics?
We want to understand how the queries are being issued. Are they issuing individual GET requests for each VN and VMI and then timing the entire operation? Do they wait for each request to succeed before issuing the next one? Are they trying to determine the time taken to issue (300, 600) GET requests vs (1500, 3000) GET requests vs (3000, 6000) GET requests? [If there are more API requests, more time will be taken, so please clarify if my understanding of the issue is wrong.]

3) Do they allow time between each control-node shutdown and issuing queries against analytics?
This can be checked by looking at redis.log on the analytics node and making sure no deletes are happening before issuing the query against analytics.

4) Also, the messages you have reported are unrelated to the issue reported by the customer. You are experiencing connectivity issues from analytics to redis. Before issuing queries, make sure contrail-status on the analytics node is ok.
Thanks
Arvind

Arvind (arvindv) wrote :

Can you please let us know the contrail-version?

description: updated

Hi Arvind,

Please find answers for your queries inline

1) Are control and config on the same node, or are they on different nodes?
Yes, they are on the same node (attaching testbed.py for your reference).

2) Can you explain a bit about the script they are using to query analytics?
We want to understand how the queries are being issued. Are they issuing individual GET requests for each VN and VMI and then timing the entire operation? Do they wait for each request to succeed before issuing the next one? Are they trying to determine the time taken to issue (300, 600) GET requests vs (1500, 3000) GET requests vs (3000, 6000) GET requests? [If there are more API requests, more time will be taken, so please clarify if my understanding of the issue is wrong.]
The customer has created a script, get_port_uve.py, to query the API servers.

The customer runs the query with the following parameters:
python get_port_uve.py --api_ips 10.0.0.100:8081 10.0.0.124:9081 10.0.0.125:9081 10.0.0.126:9081 --search 0 0500_01

In the above example:
10.0.0.124:9081 sv-24 Collector
10.0.0.125:9081 sv-25 Collector
10.0.0.126:9081 sv-26 Collector

The customer has created dummy VNs and VMIs. A VMI (port) is created with the following naming syntax:

vmi_<hostname>_<VN id>_<port id>

In the above command the customer is searching for VMIs ending with port 0500_01. This is a sequential search.
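The customer's get_port_uve.py itself is not attached here; the sketch below reproduces the same sequential one-request-at-a-time pattern described above. The UVE URL layout is an assumption based on the analytics REST API (/analytics/uves/virtual-machine-interface/<fq-name>), and the injectable fetch function is purely for illustration:

```python
import json
from urllib.request import urlopen

# Assumed analytics-api REST path for VMI UVEs
UVE_PATH = "/analytics/uves/virtual-machine-interface/"

def uve_url(endpoint, fq_name):
    """Build the analytics-api URL for one VMI UVE,
    e.g. http://10.0.0.124:9081/analytics/uves/virtual-machine-interface/<fq-name>
    """
    return "http://%s%s%s" % (endpoint, UVE_PATH, fq_name)

def query_vmi_uves(api_ips, fq_names, fetch=None, timeout=5):
    """Sequentially GET each VMI UVE from each analytics-api endpoint.

    Each request completes before the next is issued, which is why total
    wall-clock time grows with the number of VMIs queried. `fetch` is
    injectable so the loop can be exercised without a live cluster.
    """
    if fetch is None:
        fetch = lambda url: urlopen(url, timeout=timeout).read()
    results = []
    for endpoint in api_ips:
        for fq_name in fq_names:
            body = fetch(uve_url(endpoint, fq_name))
            # An empty dict here corresponds to the "invalid response" case.
            results.append((endpoint, fq_name, json.loads(body)))
    return results
```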

The output of the script and the issue are as shown below. Two timestamps matter:
1) The last time when all 3 nodes respond with non-zero-length JSON contents.
2) The time when the remaining 2 nodes start to respond with non-zero-length JSON contents stably.

For example, in the case of large_sv-24_down.log, the timestamp of 1) before I shut down sv-24 was 11:49:57.

-------------------------------------------------------------------------------------------------------------
virtual-machine-interfaces at 2017/09/05 11:49:57
-------------------------------------------------------------------------------------------------------------
node object status result
---------------- --------------------------------------- ------- -----------------------------------------
10.0.0.100:8081 default-domain:mock6:vmi_sv-39_0500_01 200 {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.100:8081 default-domain:mock3:vmi_sv-36_0500_01 200 {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.100:8081 default-domain:mock4:vmi_sv-37_0500_01 200 {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.100:8081 default-domain:mock1:vmi_sv-34_0500_01 200 {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.100:8081 default-domain:mock2:vmi_sv-35_0500_01 200 {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.100:8081 default-domain:mock5:vmi_sv-38_0500_01 200 {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.124:9081 default-domain:mock6:vmi_sv-39_0500_01 200 {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.124:9081 default-domain:mock3:vmi_sv-36_0500_01 200 {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.124:9081 default-domain:mock4:vmi_sv-37_0500_01 200 {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.124:9081 default-domain:mock1:vmi_sv-34_0500_01 200 {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.124:9081 default-domain:mock2:vmi_sv-35_0500_01 200 {"UveVMInterfaceAgent": {"ip6_active":...

Hi Arvind,

Any update?

Best Regards,
Vijay Kumar

Hi Arvind,

Do we have any update?

Best Regards,
Vijay Kumar

Arvind (arvindv) wrote :

Thanks, Vijay Kumar.
1) Do you have the get_port_uve.py script? Do you send the UVE requests both to the VIP and to the internal IPs? [I would like to understand how you are issuing the requests to the analytics APIs.]
2) Can I access your testbed? I would like to debug it. We don't have a multinode setup with the release you are trying.
Thanks
Arvind

Arvind (arvindv) wrote :

Hi VijayKumar,

With regard to your concern about not being able to read the UVEs while doing HA, it is by design.
We will not be able to read the UVEs (until the generators connect to the collectors on the new nodes) if they belonged to a partition that was owned by the node that went down.
In your case, the reason there is a temporary empty JSON returned by analytics-api for your queries is that the UVEs owned by the node that went down took about a minute to show up on the other (surviving) nodes.

So when comparing the time taken by analytics-api to answer queries, we cannot include this downtime. Let me know if you have a concern about fetch times for the various configurations (500, 1000, 1500 VNs) outside this window.
Thanks
Arvind

Hi Arvind,

When you mention generators, do you mean contrail-alarm-gen?

I am trying to follow up on this part of the update:
"In your case, the reason why there is a temporary empty json returned by the analytics-api for your queries are because the UVE's owned by the node that went down took a min to show up."

Does that time depend on the number of VNs/VMIs?

Best Regards,
Vijay Kumar

Changed in juniperopenstack:
importance: Undecided → Critical
Vineet Gupta (vineetrf) on 2018-03-19
tags: added: nttc
tags: added: 2017-0905-0258 jtac
tags: added: config
tags: added: analytics
removed: config

Review in progress for https://review.opencontrail.org/41986
Submitter: Zhiqiang Cui (<email address hidden>)

Review in progress for https://review.opencontrail.org/41990
Submitter: Zhiqiang Cui (<email address hidden>)

Review in progress for https://review.opencontrail.org/41991
Submitter: Zhiqiang Cui (<email address hidden>)

Review in progress for https://review.opencontrail.org/41992
Submitter: Zhiqiang Cui (<email address hidden>)

Review in progress for https://review.opencontrail.org/42319
Submitter: Zhiqiang Cui (<email address hidden>)

Review in progress for https://review.opencontrail.org/42338
Submitter: Zhiqiang Cui (<email address hidden>)

Review in progress for https://review.opencontrail.org/42341
Submitter: Zhiqiang Cui (<email address hidden>)

vivekananda shenoy (vshenoy83) wrote :

Hi Zhiqiang,

What is the ETA for this issue ?

Regards,
Vivek

Reviewed: https://review.opencontrail.org/41986
Committed: http://github.com/Juniper/contrail-controller/commit/666fe371caced9db7c7206dda95b5c744ef72fcc
Submitter: Zuul (<email address hidden>)
Branch: R3.1

commit 666fe371caced9db7c7206dda95b5c744ef72fcc
Author: zcui <email address hidden>
Date: Mon Apr 16 17:55:34 2018 -0700

Add new config option use_aggregated_uve_db

Provide an option to enable/disable serve UVE queries from the aggregated
UVE db. In a scale setup, HA event (alarm-gen down), alarm-gen may take a
long time to aggregate the UVEs depending on the number of UVEs, number
of partitions it owns, etc., So, enabling this option would reduce the
down time in case of HA events (collector down - time taken for generators
to connect to the new collector and resync the UVEs, alarm-gen down - no
impact on UVE queries, only the alarms would be reevaluated)

Change-Id: If39df7650ec514d4645e3eae30edc7b8ed0b5d1d
Closes-Bug: #1719236
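Judging from the commit message, operators would toggle this behaviour in the analytics-api configuration. The fragment below is a hypothetical illustration; the file path, section name, and value shown are assumptions, so check the contrail-analytics-api.conf shipped with your release:

```ini
# /etc/contrail/contrail-analytics-api.conf (hypothetical fragment)
[DEFAULTS]
# When disabled, UVE queries are not served from the aggregated UVE db,
# reducing UVE read downtime during HA events such as alarm-gen failure.
use_aggregated_uve_db = False
```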

Reviewed: https://review.opencontrail.org/42341
Committed: http://github.com/Juniper/contrail-controller/commit/a101ea508f9d63662fae243b4928bc3a170e2c8d
Submitter: Zuul (<email address hidden>)
Branch: R4.1

commit a101ea508f9d63662fae243b4928bc3a170e2c8d
Author: zcui <email address hidden>
Date: Fri Apr 20 14:51:19 2018 -0700

Add new config option use_aggregated_uve_db

Provide an option to enable/disable serve UVE queries from the aggregated
UVE db. In a scale setup, HA event (alarm-gen down), alarm-gen may take a
long time to aggregate the UVEs depending on the number of UVEs, number
of partitions it owns, etc., So, enabling this option would reduce the
down time in case of HA events (collector down - time taken for generators
to connect to the new collector and resync the UVEs, alarm-gen down - no
impact on UVE queries, only the alarms would be reevaluated)

Change-Id: Id2efdb4ec6a600697442e081042a903a01812dd9
Closes-Bug: #1719236

Reviewed: https://review.opencontrail.org/42319
Committed: http://github.com/Juniper/contrail-controller/commit/fdda437edceb65ffce60efe3d40fe2e084fff664
Submitter: Zuul (<email address hidden>)
Branch: R3.2

commit fdda437edceb65ffce60efe3d40fe2e084fff664
Author: zcui <email address hidden>
Date: Mon Apr 16 17:55:34 2018 -0700

Add new config option use_aggregated_uve_db

Provide an option to enable/disable serve UVE queries from the aggregated
UVE db. In a scale setup, HA event (alarm-gen down), alarm-gen may take a
long time to aggregate the UVEs depending on the number of UVEs, number
of partitions it owns, etc., So, enabling this option would reduce the
down time in case of HA events (collector down - time taken for generators
to connect to the new collector and resync the UVEs, alarm-gen down - no
impact on UVE queries, only the alarms would be reevaluated)

Change-Id: If39df7650ec514d4645e3eae30edc7b8ed0b5d1d
Closes-Bug: #1719236

Reviewed: https://review.opencontrail.org/42338
Committed: http://github.com/Juniper/contrail-controller/commit/8f184ca728d3812c85a78f6404a68885cb841729
Submitter: Zuul (<email address hidden>)
Branch: R4.0

commit 8f184ca728d3812c85a78f6404a68885cb841729
Author: zcui <email address hidden>
Date: Fri Apr 20 14:34:30 2018 -0700

Add new config option use_aggregated_uve_db

Provide an option to enable/disable serve UVE queries from the aggregated
UVE db. In a scale setup, HA event (alarm-gen down), alarm-gen may take a
long time to aggregate the UVEs depending on the number of UVEs, number
of partitions it owns, etc., So, enabling this option would reduce the
down time in case of HA events (collector down - time taken for generators
to connect to the new collector and resync the UVEs, alarm-gen down - no
impact on UVE queries, only the alarms would be reevaluated)

Change-Id: I600e2c33671a1c16545732cd64d1493678095198
Closes-Bug: #1719236

Review in progress for https://review.opencontrail.org/44035
Submitter: Zhiqiang Cui (<email address hidden>)

Review in progress for https://review.opencontrail.org/44207
Submitter: Zhiqiang Cui (<email address hidden>)

Review in progress for https://review.opencontrail.org/44220
Submitter: Zhiqiang Cui (<email address hidden>)

Review in progress for https://review.opencontrail.org/44221
Submitter: Zhiqiang Cui (<email address hidden>)

Review in progress for https://review.opencontrail.org/44222
Submitter: Zhiqiang Cui (<email address hidden>)

Review in progress for https://review.opencontrail.org/44223
Submitter: Zhiqiang Cui (<email address hidden>)

Reviewed: https://review.opencontrail.org/44207
Committed: http://github.com/Juniper/contrail-analytics/commit/9abcdec4c72d86d7a2e345b1611d6ef8cf137bb8
Submitter: Zuul v3 CI (<email address hidden>)
Branch: master

commit 9abcdec4c72d86d7a2e345b1611d6ef8cf137bb8
Author: zcui <email address hidden>
Date: Thu Jun 21 16:17:06 2018 -0700

Add new config option use_aggregated_uve_db

Provide an option to enable/disable serve UVE queries from the aggregated
UVE db. In a scale setup, HA event (alarm-gen down), alarm-gen may take a
long time to aggregate the UVEs depending on the number of UVEs, number
of partitions it owns, etc., So, enabling this option would reduce the
down time in case of HA events (collector down - time taken for generators
to connect to the new collector and resync the UVEs, alarm-gen down - no
impact on UVE queries, only the alarms would be reevaluated)

For R5.0, we delete the way to read from DB7, and read from DB1 directly

Change-Id: I064a5d38d8ae04770e033251ab12ee7b329054cc
Closes-Bug: #1719236

Reviewed: https://review.opencontrail.org/44220
Committed: http://github.com/Juniper/contrail-controller/commit/f8167648d12ebb2340b334d84798c141ca43d991
Submitter: Zuul (<email address hidden>)
Branch: R4.1

commit f8167648d12ebb2340b334d84798c141ca43d991
Author: zcui <email address hidden>
Date: Thu Jun 28 15:19:03 2018 -0700

Add new config option use_aggregated_uve_db

Provide an option to enable/disable serve UVE queries from the aggregated
UVE db. In a scale setup, HA event (alarm-gen down), alarm-gen may take a
long time to aggregate the UVEs depending on the number of UVEs, number
of partitions it owns, etc., So, enabling this option would reduce the
down time in case of HA events (collector down - time taken for generators
to connect to the new collector and resync the UVEs, alarm-gen down - no
impact on UVE queries, only the alarms would be reevaluated)

This commit fix bug: legacy get_alarm do not use _usecache flag

Closes-Bug: #1719236

Change-Id: I03dcee1a1b4103587197774cccd2901e4faa3a42

Reviewed: https://review.opencontrail.org/44035
Committed: http://github.com/Juniper/contrail-analytics/commit/f7daaa6e8d8ecd617c4aba218f647bf2bfc4112e
Submitter: Zuul v3 CI (<email address hidden>)
Branch: R5.0

commit f7daaa6e8d8ecd617c4aba218f647bf2bfc4112e
Author: zcui <email address hidden>
Date: Thu Jun 21 16:17:06 2018 -0700

Add new config option use_aggregated_uve_db

Provide an option to enable/disable serve UVE queries from the aggregated
UVE db. In a scale setup, HA event (alarm-gen down), alarm-gen may take a
long time to aggregate the UVEs depending on the number of UVEs, number
of partitions it owns, etc., So, enabling this option would reduce the
down time in case of HA events (collector down - time taken for generators
to connect to the new collector and resync the UVEs, alarm-gen down - no
impact on UVE queries, only the alarms would be reevaluated)

For R5.0, we delete the way to read from DB7, and read from DB1 directly

Change-Id: I064a5d38d8ae04770e033251ab12ee7b329054cc
Closes-Bug: #1719236

Reviewed: https://review.opencontrail.org/44222
Committed: http://github.com/Juniper/contrail-controller/commit/5901f91c41ce26add952bb8f2922f72fc4ac53b8
Submitter: Zuul (<email address hidden>)
Branch: R3.2

commit 5901f91c41ce26add952bb8f2922f72fc4ac53b8
Author: zcui <email address hidden>
Date: Thu Jun 28 15:19:03 2018 -0700

Add new config option use_aggregated_uve_db

Provide an option to enable/disable serve UVE queries from the aggregated
UVE db. In a scale setup, HA event (alarm-gen down), alarm-gen may take a
long time to aggregate the UVEs depending on the number of UVEs, number
of partitions it owns, etc., So, enabling this option would reduce the
down time in case of HA events (collector down - time taken for generators
to connect to the new collector and resync the UVEs, alarm-gen down - no
impact on UVE queries, only the alarms would be reevaluated)

This commit fix bug: legacy get_alarm do not use _usecache flag

Closes-Bug: #1719236

Change-Id: I03dcee1a1b4103587197774cccd2901e4faa3a42

Reviewed: https://review.opencontrail.org/44221
Committed: http://github.com/Juniper/contrail-controller/commit/0deb61632a1217446132d9453afb0e70646b465f
Submitter: Zuul (<email address hidden>)
Branch: R4.0

commit 0deb61632a1217446132d9453afb0e70646b465f
Author: zcui <email address hidden>
Date: Thu Jun 28 15:19:03 2018 -0700

Add new config option use_aggregated_uve_db

Provide an option to enable/disable serve UVE queries from the aggregated
UVE db. In a scale setup, HA event (alarm-gen down), alarm-gen may take a
long time to aggregate the UVEs depending on the number of UVEs, number
of partitions it owns, etc., So, enabling this option would reduce the
down time in case of HA events (collector down - time taken for generators
to connect to the new collector and resync the UVEs, alarm-gen down - no
impact on UVE queries, only the alarms would be reevaluated)

This commit fix bug: legacy get_alarm do not use _usecache flag

Closes-Bug: #1719236

Change-Id: I03dcee1a1b4103587197774cccd2901e4faa3a42

Reviewed: https://review.opencontrail.org/44223
Committed: http://github.com/Juniper/contrail-controller/commit/36ce23cd579cf03cb3edbd4f2cbbe77b4e510751
Submitter: Zuul (<email address hidden>)
Branch: R3.1

commit 36ce23cd579cf03cb3edbd4f2cbbe77b4e510751
Author: zcui <email address hidden>
Date: Thu Jun 28 15:19:03 2018 -0700

Add new config option use_aggregated_uve_db

Provide an option to enable/disable serve UVE queries from the aggregated
UVE db. In a scale setup, HA event (alarm-gen down), alarm-gen may take a
long time to aggregate the UVEs depending on the number of UVEs, number
of partitions it owns, etc., So, enabling this option would reduce the
down time in case of HA events (collector down - time taken for generators
to connect to the new collector and resync the UVEs, alarm-gen down - no
impact on UVE queries, only the alarms would be reevaluated)

This commit fix bug: legacy get_alarm do not use _usecache flag

Closes-Bug: #1719236

Change-Id: I03dcee1a1b4103587197774cccd2901e4faa3a42
