Microsoft Azure Enablement: azure-lb missing in Bionic

Bug #1895343 reported by Rafael David Tinoco
This bug affects 1 person
Affects                   Status        Importance  Assigned to      Milestone
resource-agents (Ubuntu)  Fix Released  Undecided   Unassigned
Bionic                    Fix Released  High        Lucas Kanashiro
Focal                     Fix Released  High        Lucas Kanashiro

Bug Description

[Impact]

azure-lb is a handy resource agent for anyone maintaining a cluster on the Azure platform. It facilitates the operation of a load balancer implemented by Azure.

In Focal, azure-lb is already available. However, to keep all the releases in the same state, a patch fixing a bug in azure-lb is needed. This is the upstream patch:

https://github.com/ClusterLabs/resource-agents/commit/d22700fc

In Bionic, the agent is not available, and the full feature will be backported as a way to better support the HA stack on Azure. The changes are self-contained since each resource agent is maintained as a separate script.

[Test Case]

On Azure create a Corosync/Pacemaker cluster with 3 VMs and make sure you have the following CIB:

node 1: vm01
node 2: vm02
node 3: vm03
primitive lb-healthprobe azure-lb \
        params port=8000 nc="/bin/nc" \
        op monitor interval=10 timeout=20
property cib-bootstrap-options: \
        have-watchdog=false \
        dc-version=2.0.3-4b1f869f0f \
        cluster-infrastructure=corosync \
        cluster-name=clufocal \
        no-quorum-policy=stop \
        last-lrm-refresh=1603822692 \
        maintenance-mode=false
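
As a rough sketch, the `lb-healthprobe` primitive from the CIB above could be generated and loaded as shown below. The helper name `azure_lb_primitive` and the file `/tmp/lb.crm` are hypothetical; the resource name, port and `nc` path are the ones used in this test case.

```shell
# Hypothetical helper: print a crm shell definition for an azure-lb
# primitive. Arguments: resource name, probe port, optional path to nc.
azure_lb_primitive() {
    name=$1
    port=$2
    nc=${3:-/bin/nc}
    printf 'primitive %s azure-lb params port=%s nc="%s" op monitor interval=10 timeout=20\n' \
        "$name" "$port" "$nc"
}

# Possible usage on one cluster node (not run here):
#   azure_lb_primitive lb-healthprobe 8000 > /tmp/lb.crm
#   sudo crm configure load update /tmp/lb.crm
```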

The status of the cluster should be:

$ sudo crm status
Cluster Summary:
  * Stack: corosync
  * Current DC: vm02 (version 2.0.3-4b1f869f0f) - partition with quorum
  * Last updated: Wed Dec 9 18:16:32 2020
  * Last change: Wed Dec 9 18:16:30 2020 by root via crm_attribute on vm01
  * 3 nodes configured
  * 1 resource instances configured

Node List:
  * Online: [ vm01 vm02 vm03 ]

Full List of Resources:
  * lb-healthprobe (ocf::heartbeat:azure-lb): Started vm02
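
To check the status above from a script, a small filter over the `crm status` text may be enough. This is only a sketch: the function name `resource_node` is hypothetical, and it assumes the resource line ends with the node name, as in the output shown here.

```shell
# Hypothetical helper: read `crm status` output on stdin and print the
# node on which the named resource is reported as Started.
# Exits non-zero if no such line is found.
resource_node() {
    awk -v name="$1" '
        index($0, name) && /Started/ { print $NF; found = 1; exit }
        END { exit !found }
    '
}

# Possible usage (not run here):
#   sudo crm status | resource_node lb-healthprobe
```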

To take advantage of the azure-lb resource agent you need to have an Azure load balancer configured. You can find information on how to do it here:

https://docs.microsoft.com/en-us/azure/load-balancer/quickstart-load-balancer-standard-public-cli
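
Before involving the load balancer, it can help to confirm that something is listening on the probe port of the node running the resource. The sketch below assumes the health probe targets the same port as the `port` parameter (8000 in the CIB above) and that `bash` and `timeout` are available; the function name `probe_port` is hypothetical.

```shell
# Hypothetical helper: succeed if a TCP connection to host:port can be
# opened within 2 seconds, using bash's /dev/tcp pseudo-device.
probe_port() {
    host=$1
    port=$2
    timeout 2 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Possible usage from another VM in the same network (not run here):
#   probe_port vm02 8000 && echo "listener is up"
```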

[Where problems could occur]

As mentioned previously, each resource agent is maintained as a separate script, so for Focal, where the agent is already present, a problem could only occur in the existing azure-lb resource agent. In Bionic the risk of a regression is quite low since the agent does not exist yet.

[Original description]

This bug is part of the HA enablement for Microsoft Azure Cloud.
(https://discourse.ubuntu.com/t/ubuntu-ha-cluster-in-microsoft-azure-cloud/)

"""
Need one additional Azure package to support load balancer. If you refer the
following article, we use Azure load balancer to find the active replica in
availability group for SQL Server and routing connections appropriately.
"""

[rafaeldtinoco@focal ~]$ apt-file list resource-agents | grep -i azure

resource-agents: /usr/lib/ocf/resource.d/heartbeat/azure-events
resource-agents: /usr/lib/ocf/resource.d/heartbeat/azure-lb
resource-agents: /usr/share/man/man7/ocf_heartbeat_azure-events.7.gz
resource-agents: /usr/share/man/man7/ocf_heartbeat_azure-lb.7.gz

[rafaeldtinoco@bionic ~]$ apt-file list resource-agents | grep -i azure
<nothing>
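
The same check can be made directly on a cluster node by looking for the agent script under the path listed above. A minimal sketch, with a hypothetical function name `has_agent` and an optional second argument that exists only to make the check testable outside `/usr/lib/ocf`:

```shell
# Hypothetical helper: succeed if the named heartbeat resource agent is
# installed and executable. The OCF root defaults to /usr/lib/ocf.
has_agent() {
    agent=$1
    root=${2:-/usr/lib/ocf}
    [ -x "$root/resource.d/heartbeat/$agent" ]
}

# Possible usage (not run here):
#   has_agent azure-lb && echo "azure-lb is available"
```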

In Ubuntu we have:

$ rmadison resource-agents
 resource-agents | 1:4.1.0~rc1-1ubuntu1.2 | bionic-updates
 resource-agents | 1:4.5.0-2ubuntu2 | focal
 resource-agents | 1:4.6.1-1ubuntu1 | groovy

In Debian:

$ rmad resource-agents

resource-agents | 1:4.2.0-2+deb10u2 | stable
resource-agents | 1:4.6.1-1~bpo10+1 | buster-backports
resource-agents | 1:4.6.1-1 | testing
resource-agents | 1:4.6.1-1 | unstable

From upstream:

commit 771b49a1
Author: Oyvind Albrigtsen <email address hidden>
Date: Wed Nov 29 14:09:06 2017

    azure-lb: new resource agent

(and other commits from 2018 and 2019 as fixes)

[rafaeldtinoco@upstream resource-agents]$ git describe --tags 771b49a1
v4.1.0-1-g771b49a1

[rafaeldtinoco@upstream resource-agents]$ git tag --contains 771b49a1
v4.1.1
v4.1.1rc1

Considering the version in Bionic is 4.1.0~rc1, we probably missed the new
resource agent by a narrow margin. It should be easy to backport and, after
discussing with a member of the SRU (Stable Release Updates) team, it may be
considered HW enablement within the SRU guidelines.
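
The `git describe --tags` output quoted above has the form `<tag>-<commits since tag>-g<abbreviated hash>`, so `v4.1.0-1-g771b49a1` means the agent landed one commit after the v4.1.0 tag. A small sketch that splits such a string (the function name `parse_describe` is hypothetical):

```shell
# Hypothetical helper: split `git describe` output of the form
# <tag>-<count>-g<hash> into its three parts.
parse_describe() {
    desc=$1
    hash=${desc##*-g}     # text after the last "-g"
    rest=${desc%-g*}      # text before the last "-g"
    count=${rest##*-}     # number of commits after the tag
    tag=${rest%-*}        # the tag itself
    printf 'tag=%s commits_after=%s commit=%s\n' "$tag" "$count" "$hash"
}

# Possible usage inside an upstream checkout (not run here):
#   parse_describe "$(git describe --tags 771b49a1)"
```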


summary: - Azure: azure-lb & azure-events agents should be backported to Bionic
+ Microsoft Azure Enablement: azure-lb & azure-events missing in Bionic
Changed in resource-agents (Ubuntu Bionic):
status: New → Confirmed
Changed in resource-agents (Ubuntu Focal):
status: New → Confirmed
Changed in resource-agents (Ubuntu Bionic):
importance: Undecided → High
Changed in resource-agents (Ubuntu Focal):
importance: Undecided → High
Changed in resource-agents (Ubuntu):
status: New → Fix Released
Rafael David Tinoco (rafaeldtinoco) wrote : Re: Microsoft Azure Enablement: azure-lb & azure-events missing in Bionic

## To backport (or make sure it exists) to FOCAL:

# azure-lb

commit 771b49a1 (exists)
    azure-lb: new resource agent

commit c5e465fc (exists)
    azure-lb: remove reference to status from usage

commit d22700fc (needed)
    azure-lb: Don't redirect nc listener output to pidfile

# azure-events (to the same code level as in Focal)

commit 2512b396 (v4.2.0) (exists)
    Initial version of the AzEvents RA

commit 328bb0e4 (exists)
    AzEvents: Use configure to replace shebang line

----

https://docs.microsoft.com/en-us/azure/virtual-machines/workloads/sap/high-availability-guide-suse-nfs (shows the issue addressed by the netcat fix for azure-lb)

----

## To backport (or make sure it exists) to BIONIC:

## azure-lb

commit 771b49a1 (needed)
    azure-lb: new resource agent

commit c5e465fc (needed)
    azure-lb: remove reference to status from usage

commit d22700fc (needed)
    azure-lb: Don't redirect nc listener output to pidfile

## azure-events (to the same code level as in Focal)

commit 2512b396 (v4.2.0) (needed)
    Initial version of the AzEvents RA

commit 328bb0e4 (needed)
    AzEvents: Use configure to replace shebang line

Rafael David Tinoco (rafaeldtinoco) wrote :

For BIONIC, azure-events also has to have:

commit cb87d027

    azure-events: change message log level for the non action messages
    Reduces the verbosity on the log when the RA has no events to process.
    The messages can still be seen using the verbose parameter.

commit d2c47ec3

    Fix implicit bytes conversion that breaks py3.
    Reduces the amount of errors messages using default value on
    crm_attribute

commit ca15b9dc

    dev: azure-events: Add custom user agent

commit 9890deba

    dev: AzEvents: Start using ocf.py (#1161)

commit e7b1a18a

    dev: AzEvents: Use pacemaker commands to set standby

commit 416f0b1f

    Implemented review feedback from krig

So it can be in the same level as Focal.

Rafael David Tinoco (rafaeldtinoco) wrote :

For Bionic, I think it's okay to have azure-lb backported, as it is a standalone shell script that acts as a resource agent. Unfortunately, I don't see AzEvents being backported and accepted as an SRU, because it takes more changes to integrate it into the existing Bionic resource-agents.

These are the patches implementing the AzEvents (renamed to azure-events later):

328bb0e 2018-07-18 10:11 -0700 TNiekamp AzEvents: Use configure to replace shebang line
416f0b1 2018-09-14 15:21 -0700 TNiekamp Implemented review feedback from krig
e7b1a18 2018-10-17 09:10 +0200 KGrönlund dev: AzEvents: Use pacemaker commands to set standby
9890deb 2018-10-17 09:06 +0200 KGrönlund dev: AzEvents: Start using ocf.py
ca15b9d 2018-10-18 09:23 +0200 KGrönlund dev: azure-events: Add custom user agent

and it would require ocf.py to be brought in as well, plus all the build changes needed for autotools to recognize Python scripts... not suitable for an SRU, for sure.

With that said, I'll keep my merge request to bring *at least* the azure-lb resource agent to Bionic at:

https://code.launchpad.net/~rafaeldtinoco/ubuntu/+source/resource-agents/+git/resource-agents/+merge/392746

My suggestion to those who need azure-events is to either migrate to Focal or Groovy, or to use the back-ported Groovy HA stack at:

https://launchpad.net/~ubuntu-ha/+archive/ubuntu/groovy-ha-stack

Changed in resource-agents (Ubuntu Bionic):
status: Confirmed → In Progress
Rafael David Tinoco (rafaeldtinoco) wrote :

For Focal...

I think we should SRU:

[rafaeldtinoco@groovy resource-agents]$ git log --oneline v4.5.0..HEAD -- heartbeat/azure-lb
d22700fc azure-lb: Don't redirect nc listener output to pidfile

that is the only fix I backported to Bionic (in the SRU being proposed) and I think that it should also be SRU'ed to Focal as the fix is straightforward.
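
For illustration only: the commit title suggests that the listener's output was ending up in the PID file. The sketch below shows the general pattern of keeping the two apart; it is not the upstream azure-lb code, and the function name `start_listener` is hypothetical.

```shell
# Generic illustration, not the upstream agent code.
# Start a background command, discard its output, and record only its
# PID in the pidfile.
start_listener() {
    pidfile=$1
    shift
    # Problematic pattern (assumed from the commit title):
    #   "$@" > "$pidfile" 2>&1 &
    # which lets the listener's own output land in the pidfile.
    "$@" >/dev/null 2>&1 &
    echo "$!" > "$pidfile"
}

# Possible usage (not run here):
#   start_listener /var/run/lb.pid /bin/nc -l -k 8000
```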

In regards to azure-events:

From Focal to HEAD upstream we have:

[rafaeldtinoco@groovy resource-agents]$ git log --oneline v4.5.0..HEAD -- heartbeat/azure-events*
1ab5d71b azure-events: report error if jsondata not received
f2bf1d8a azure-events: import URLError and encode postData when necessary
57424bd1 azure-events: only decode() when exec() output not of type str
cc69a8fa azure-events: handle exceptions in urlopen (#1496)

but from Focal to Groovy we have only:

cc69a8fa azure-events: handle exceptions in urlopen (#1496)

I think there won't be anything to be done unless we have a real problem we can have a test case for (and solve in another SRU).

description: updated
Changed in resource-agents (Ubuntu Focal):
status: Confirmed → In Progress
Changed in resource-agents (Ubuntu Bionic):
assignee: nobody → Lucas Kanashiro (lucaskanashiro)
Changed in resource-agents (Ubuntu Focal):
assignee: nobody → Lucas Kanashiro (lucaskanashiro)
Lucas Kanashiro (lucaskanashiro) wrote :

After checking all the changes needed to backport azure-events to Bionic I agree with Rafael (comment #4). Let's recommend users wanting to use azure-events to upgrade to Focal. I am removing azure-events from the scope of the proposed SRUs.

summary: - Microsoft Azure Enablement: azure-lb & azure-events missing in Bionic
+ Microsoft Azure Enablement: azure-lb missing in Bionic
description: updated
description: updated
Łukasz Zemczak (sil2100) wrote : Please test proposed package

Hello Rafael, or anyone else affected,

Accepted resource-agents into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/resource-agents/1:4.5.0-2ubuntu2.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.
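
A minimal sketch of enabling -proposed for this test, following the approach described on the EnableProposed wiki page. It assumes an amd64 system using archive.ubuntu.com; the function name `proposed_line` and the file name `proposed.list` are hypothetical.

```shell
# Hypothetical helper: print an APT source line for the -proposed
# pocket of the given Ubuntu release (e.g. focal or bionic).
proposed_line() {
    printf 'deb http://archive.ubuntu.com/ubuntu %s-proposed main restricted universe multiverse\n' "$1"
}

# Possible usage on a test node (not run here):
#   proposed_line focal | sudo tee /etc/apt/sources.list.d/proposed.list
#   sudo apt update
#   sudo apt install -t focal-proposed resource-agents
```

Installing with `-t focal-proposed` pulls only the named package from -proposed rather than upgrading everything available there.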

Changed in resource-agents (Ubuntu Focal):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-focal
Łukasz Zemczak (sil2100) wrote :

Hello Rafael, or anyone else affected,

Accepted resource-agents into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/resource-agents/1:4.1.0~rc1-1ubuntu1.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in resource-agents (Ubuntu Bionic):
status: In Progress → Fix Committed
tags: added: verification-needed-bionic
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (resource-agents/1:4.1.0~rc1-1ubuntu1.3)

All autopkgtests for the newly accepted resource-agents (1:4.1.0~rc1-1ubuntu1.3) for bionic have finished running.
The following regressions have been reported in tests triggered by the package:

pacemaker/1.1.18-0ubuntu1.3 (armhf)
resource-agents/1:4.1.0~rc1-1ubuntu1.3 (armhf)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/bionic/update_excuses.html#resource-agents

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Lucas Kanashiro (lucaskanashiro) wrote :

Hi SRU team,

I took a look at the 2 "regressions" listed above and it seems that neither of them originates from this SRU. The pacemaker failure happens only on armhf, and the same failure can be seen in the previous triggers for glibc (after some retries it passed for some reason):

https://autopkgtest.ubuntu.com/packages/pacemaker/bionic/armhf

The resource-agents failure on armhf is because ldirectord cannot be installed on this architecture (unrelated to the changes proposed in the context of this SRU):

https://autopkgtest.ubuntu.com/packages/resource-agents/bionic/armhf

With that in mind, I'd like to propose to mark them as badtest (pacemaker/armhf and resource-agents/armhf).

Lucas Kanashiro (lucaskanashiro) wrote :

I verified the package in proposed and it is working as expected. I updated all the nodes of my Bionic cluster on Azure and the load balancer feature is still working fine:

lucas@vm01:~$ sudo crm status
Stack: corosync
Current DC: vm03 (version 1.1.18-2b07d5c5a9) - partition with quorum
Last updated: Mon Jan 18 19:08:39 2021
Last change: Tue Dec 15 13:04:03 2020 by root via cibadmin on vm01

3 nodes configured
4 resources configured

Online: [ vm01 vm02 vm03 ]

Full list of resources:

 lb-healthprobe (ocf::heartbeat:azure-lb): Started vm02
 fence-vm01 (stonith:fence_azure_arm): Started vm03
 fence-vm02 (stonith:fence_azure_arm): Started vm01
 fence-vm03 (stonith:fence_azure_arm): Started vm02

lucas@vm01:~$ dpkg -l resource-agents
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-==============================================-============================-============================-==================================================================================================
ii resource-agents 1:4.1.0~rc1-1ubuntu1.3 amd64 Cluster Resource Agents

Lucas Kanashiro (lucaskanashiro) wrote :

I did the same verification work in a Focal cluster on Azure and everything worked as expected.

Lucas Kanashiro (lucaskanashiro) wrote :

Sorry, I forgot to mention the version of the package I used in my Focal verification work. Here it is:

lucas@vm01:~$ dpkg -l | grep resource-agents
ii resource-agents 1:4.5.0-2ubuntu2.2 amd64 Cluster Resource Agents
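
When repeating this verification on several nodes, the installed version can be pulled out of the `dpkg -l` listing with a short filter. A sketch with a hypothetical function name `pkg_version`; it assumes the package name appears without an architecture suffix, as in the output above.

```shell
# Hypothetical helper: read `dpkg -l` output on stdin and print the
# version of the named package if it is fully installed ("ii").
pkg_version() {
    awk -v pkg="$1" '$1 == "ii" && $2 == pkg { print $3 }'
}

# Possible usage (not run here):
#   dpkg -l | pkg_version resource-agents
```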

tags: added: verification-done verification-done-bionic verification-done-focal
removed: verification-needed verification-needed-bionic verification-needed-focal
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package resource-agents - 1:4.5.0-2ubuntu2.2

---------------
resource-agents (1:4.5.0-2ubuntu2.2) focal; urgency=medium

  * Microsoft Azure HA Enablement: azure-lb (LP: #1895343).
    d/p/lp1895343-01-azure-lb-Dont-redirect-nc-listener-output-to-pidfile.patch
    was introduced to backport upstream patch.

 -- Rafael David Tinoco <email address hidden> Mon, 26 Oct 2020 04:00:13 +0000

Changed in resource-agents (Ubuntu Focal):
status: Fix Committed → Fix Released
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for resource-agents has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package resource-agents - 1:4.1.0~rc1-1ubuntu1.3

---------------
resource-agents (1:4.1.0~rc1-1ubuntu1.3) bionic; urgency=medium

  * Microsoft Azure HA Enablement: azure-lb (LP: #1895343)
    - d/p/lp1895343-01-azure-lb-new-resource-agent.patch
    - d/p/lp1895343-02-azure-lb-Dont-redirect-nc-listener-output-to-pidfile.patch

 -- Rafael David Tinoco <email address hidden> Fri, 23 Oct 2020 18:41:22 +0000

Changed in resource-agents (Ubuntu Bionic):
status: Fix Committed → Fix Released