Brief Description
-----------------
During an upgrade from Release 7 to Release 8, if upgrade-activate fails and has to be executed again,
the runtime manifest platform::network::update_platform_nfs_ip_references may be called a second time.
On that second run it can fail because the following variable is empty:
$plat_nfs_ip = $::platform::network::mgmt::params::platform_nfs_address
This happens when the manifest has already run once and the platform_nfs_address entry was removed from
system.yaml but not from the database.
When upgrade-activate is called again, the check that verifies whether platform-nfs-ip is in the database
returns TRUE, so the runtime manifest executes again, but this time $plat_nfs_ip is empty.
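The failure mode described above can be illustrated with a minimal Python sketch. It is not the Puppet code from network.pp and not the proposed fix; the function names and parameters are hypothetical and only model how an empty address produces the malformed command, and how a guard would make the step safe to repeat.

```python
from typing import Optional


def build_ip_del_unguarded(plat_nfs_ip: Optional[str], prefixlen: int, ifname: str) -> str:
    """Build the command the way a plain string interpolation would."""
    # An undefined value interpolates as an empty string, which leaves
    # only the prefix length in the address argument.
    ip = plat_nfs_ip or ""
    return f"ip addr del {ip}/{prefixlen} dev {ifname}"


def build_ip_del_guarded(plat_nfs_ip: Optional[str], prefixlen: int, ifname: str) -> Optional[str]:
    """Return the command, or None when there is no address left to remove."""
    # Hypothetical guard: an empty or undefined address means the cleanup
    # already happened, so the step is skipped instead of failing.
    if not plat_nfs_ip:
        return None
    return f"ip addr del {plat_nfs_ip}/{prefixlen} dev {ifname}"


if __name__ == "__main__":
    # First run: address still present in the hiera data.
    print(build_ip_del_guarded("192.168.30.166", 29, "vlan603"))
    # Second run: entry already removed, unguarded version is malformed.
    print(build_ip_del_unguarded("", 29, "vlan603"))
    # Second run with the guard: nothing to execute.
    print(build_ip_del_guarded("", 29, "vlan603"))
```

With an empty address the unguarded variant yields the same 'ip addr del /29 dev vlan603' command that appears in the failing log, while the guarded variant returns nothing to execute.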
Severity
--------
Major: Upgrade cannot be completed
Steps to Reproduce
------------------
This is not easy to reproduce; I had to change the code to simulate the error.
Using AIO-DX:
Start an upgrade from an old CentOS release (i.e. Rel 7) to the new Debian Rel 8.
Upgrade controller-1 and controller-0.
Before the activate, change some config to force a failure during the activate
(I changed network.pp and set $plat_nfs_ip to undef, and also to '').
Run the upgrade-activate.
Check whether the Ruby error occurs.
Expected Behavior
-----------------
The runtime manifest platform::network::update_platform_nfs_ip_references must execute without problems
and must be safe to run again without returning an error.
Actual Behavior
---------------
The runtime manifest platform::network::update_platform_nfs_ip_references executes correctly the first time,
but fails if it runs again.
Reproducibility
---------------
Seen once.
System Configuration
--------------------
AIO-DX, IPv4
Branch/Pull Time/Commit
-----------------------
This issue is a side effect of the change: https://bugs.launchpad.net/starlingx/+bug/2012387
Last Pass
---------
N/A
Timestamp/Logs
--------------
The first time, the command executed correctly:
2023-04-01T23:31:29.203 Debug: 2023-04-01 23:31:29 +0000 Executing: 'sm-deprovision service-group-member controller-services platform-nfs-ip --apply'
2023-04-01T23:31:30.372 Notice: 2023-04-01 23:31:30 +0000 /Stage[main]/Platform::Network::Update_platform_nfs_ip_references/Exec[Deprovision platform-nfs-ip (service-group-member platform-nfs-ip)]/returns: executed successfully
2023-04-01T23:31:30.374 Debug: 2023-04-01 23:31:30 +0000 /Stage[main]/Platform::Network::Update_platform_nfs_ip_references/Exec[Deprovision platform-nfs-ip (service-group-member platform-nfs-ip)]: The container Class[Platform::Network::Update_platform_nfs_ip_references] will propagate my refresh event
2023-04-01T23:31:30.375 Debug: 2023-04-01 23:31:30 +0000 Exec[Deprovision Platform-NFS IP service in SM (service platform-nfs-ip)](provider=posix): Executing 'sm-deprovision service platform-nfs-ip'
2023-04-01T23:31:30.377 Debug: 2023-04-01 23:31:30 +0000 Executing: 'sm-deprovision service platform-nfs-ip'
2023-04-01T23:31:30.541 Notice: 2023-04-01 23:31:30 +0000 /Stage[main]/Platform::Network::Update_platform_nfs_ip_references/Exec[Deprovision Platform-NFS IP service in SM (service platform-nfs-ip)]/returns: executed successfully
2023-04-01T23:31:30.543 Debug: 2023-04-01 23:31:30 +0000 /Stage[main]/Platform::Network::Update_platform_nfs_ip_references/Exec[Deprovision Platform-NFS IP service in SM (service platform-nfs-ip)]: The container Class[Platform::Network::Update_platform_nfs_ip_references] will propagate my refresh event
2023-04-01T23:31:30.545 Debug: 2023-04-01 23:31:30 +0000 Exec[Removing Plaform NFS IP address from interface: vlan603](provider=posix): Executing check 'ip -br addr show dev vlan603 2>/dev/null | grep '192.168.30.166/29' 1>/dev/null'
2023-04-01T23:31:30.547 Debug: 2023-04-01 23:31:30 +0000 Executing: 'ip -br addr show dev vlan603 2>/dev/null | grep '192.168.30.166/29' 1>/dev/null'
2023-04-01T23:31:30.549 Debug: 2023-04-01 23:31:30 +0000 Exec[Removing Plaform NFS IP address from interface: vlan603](provider=posix): Executing 'ip addr del 192.168.30.166/29 dev vlan603'
2023-04-01T23:31:30.551 Debug: 2023-04-01 23:31:30 +0000 Executing: 'ip addr del 192.168.30.166/29 dev vlan603'
2023-04-01T23:31:30.553 Notice: 2023-04-01 23:31:30 +0000 /Stage[main]/Platform::Network::Update_platform_nfs_ip_references/Exec[Removing Plaform NFS IP address from interface: vlan603]/returns: executed successfully
2023-04-01T23:31:30.555 Debug: 2023-04-01 23:31:30 +0000 /Stage[main]/Platform::Network::Update_platform_nfs_ip_references/Exec[Removing Plaform NFS IP address from interface: vlan603]: The container Class[Platform::Network::Update_platform_nfs_ip_references] will propagate my refresh event
2023-04-01T23:31:30.560 Debug: 2023-04-01 23:31:30 +0000 Exec[Removing Plaform NFS IP address from /22.12/dnsmasq.hosts](provider=posix): Executing 'sed -i '/controller-platform-nfs/d' /opt/platform/config/22.12/dnsmasq.hosts'
2023-04-01T23:31:30.562 Debug: 2023-04-01 23:31:30 +0000 Executing: 'sed -i '/controller-platform-nfs/d' /opt/platform/config/22.12/dnsmasq.hosts'
2023-04-01T23:31:30.564 Notice: 2023-04-01 23:31:30 +0000 /Stage[main]/Platform::Network::Update_platform_nfs_ip_references/Exec[Removing Plaform NFS IP address from /22.12/dnsmasq.hosts]/returns: executed successfully
2023-04-01T23:31:30.566 Debug: 2023-04-01 23:31:30 +0000 /Stage[main]/Platform::Network::Update_platform_nfs_ip_references/Exec[Removing Plaform NFS IP address from /22.12/dnsmasq.hosts]: The container Class[Platform::Network::Update_platform_nfs_ip_references] will propagate my refresh event
2023-04-01T23:31:30.571 Debug: 2023-04-01 23:31:30 +0000 Exec[Removing Plaform NFS IP address from /22.12/hieradata/system.yaml](provider=posix): Executing 'sed -i '/platform_nfs_address/d' /opt/platform/puppet/22.12/hieradata/system.yaml'
2023-04-01T23:31:30.573 Debug: 2023-04-01 23:31:30 +0000 Executing: 'sed -i '/platform_nfs_address/d' /opt/platform/puppet/22.12/hieradata/system.yaml'
Then the manifest is called again and fails, because the address part of the command is now empty:
2023-04-01T23:31:46.407 Notice: 2023-04-01 23:31:46 +0000 /Stage[main]/Platform::Network::Update_platform_nfs_ip_references/Exec[Deprovision Platform-NFS IP service in SM (service platform-nfs-ip)]/returns: executed successfully
2023-04-01T23:31:46.408 Debug: 2023-04-01 23:31:46 +0000 /Stage[main]/Platform::Network::Update_platform_nfs_ip_references/Exec[Deprovision Platform-NFS IP service in SM (service platform-nfs-ip)]: The container Class[Platform::Network::Update_platform_nfs_ip_references] will propagate my refresh event
2023-04-01T23:31:46.410 Debug: 2023-04-01 23:31:46 +0000 Exec[Removing Plaform NFS IP address from interface: vlan603](provider=posix): Executing check 'ip -br addr show dev vlan603 2>/dev/null | grep '/29' 1>/dev/null'
2023-04-01T23:31:46.412 Debug: 2023-04-01 23:31:46 +0000 Executing: 'ip -br addr show dev vlan603 2>/dev/null | grep '/29' 1>/dev/null'
2023-04-01T23:31:46.414 Debug: 2023-04-01 23:31:46 +0000 Exec[Removing Plaform NFS IP address from interface: vlan603](provider=posix): Executing 'ip addr del /29 dev vlan603'
2023-04-01T23:31:46.415 Debug: 2023-04-01 23:31:46 +0000 Executing: 'ip addr del /29 dev vlan603'
2023-04-01T23:31:46.418 Notice: 2023-04-01 23:31:46 +0000 /Stage[main]/Platform::Network::Update_platform_nfs_ip_references/Exec[Removing Plaform NFS IP address from interface: vlan603]/returns: Error: any valid prefix is expected rather than "/29".
2023-04-01T23:31:46.419 Error: 2023-04-01 23:31:46 +0000 'ip addr del /29 dev vlan603' returned 1 instead of one of [0]
2023-04-01T23:31:46.421 /usr/lib/ruby/vendor_ruby/puppet/util/errors.rb:157:in `fail'
2023-04-01T23:31:46.423 /usr/lib/ruby/vendor_ruby/puppet/type/exec.rb:168:in `sync'
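To check a Puppet log for this failure signature, a small Python sketch like the one below could be used. It is only an illustration: the function name and the regular expression are assumptions based on the log lines above, not part of any StarlingX tooling.

```python
import re
from typing import Iterable, List, Tuple

# Matches an 'ip addr del' invocation whose address part is missing,
# for example "ip addr del /29 dev vlan603".
EMPTY_ADDR_DEL = re.compile(r"ip addr del /(\d+) dev ([\w.-]+)")


def find_empty_address_deletes(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (prefix length, interface) for each line with an empty address."""
    hits = []
    for line in lines:
        match = EMPTY_ADDR_DEL.search(line)
        if match:
            hits.append((int(match.group(1)), match.group(2)))
    return hits


if __name__ == "__main__":
    sample = [
        "2023-04-01T23:31:30.551 Debug: Executing: 'ip addr del 192.168.30.166/29 dev vlan603'",
        "2023-04-01T23:31:46.415 Debug: Executing: 'ip addr del /29 dev vlan603'",
    ]
    for prefixlen, ifname in find_empty_address_deletes(sample):
        print(f"empty address in 'ip addr del' (prefix /{prefixlen}, interface {ifname})")
```

Only the second sample line is reported, since the first one carries a complete address.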
Test Activity
-------------
Regression Testing
Workaround
----------
Abort the upgrade and start again.
Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/config/+/879683