PTP pull status fails if notification pod moves to other host

Bug #1991793 reported by Douglas Henrique Koerich
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Douglas Henrique Koerich

Bug Description

Brief Description
-----------------
After the ptp-notification pod moves from one host to another, a running vDU sidecar starts failing with 404 Not Found when pulling status (https://docs.starlingx.io/api-ref/ptp-notification-armada-app/api_ptp_notifications_definition_v1.html#pull-status-notifications)

Severity
--------
Major

Steps to Reproduce
------------------
- Set label "ptp-notification=true" to controller-1;
- Apply ptp-notification and install a pod with sidecar;
- Check pull status is working;
- Move label "ptp-notification=true" to controller-0, the ptp-notification pod is expected to move as well;
- From sidecar still running in the pod at controller-1, try pulling status again;
- /ocloudNotifications/v1/.../CurrentState will return "404 Not Found"
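The pull in the last two steps can be sketched in Python as below. This is a minimal sketch, not the test tooling used for this report: the sidecar address (SIDECAR_BASE_URL) and the helper names are assumptions, and only the /ocloudNotifications/v1/PTP/CurrentState path is taken from the logs in this report.

```python
import urllib.error
import urllib.request

# Assumption: placeholder address of the sidecar API as seen from the
# application container; the host and port are not taken from this report.
SIDECAR_BASE_URL = "http://127.0.0.1:8080"


def current_state_url(base_url, resource="PTP"):
    """Build the v1 pull-status URL that appears in the logs of this report."""
    return f"{base_url.rstrip('/')}/ocloudNotifications/v1/{resource}/CurrentState"


def pull_current_state(base_url=SIDECAR_BASE_URL, timeout=5.0):
    """Return (http_status, body_text) for one pull of the PTP state."""
    request = urllib.request.Request(current_state_url(base_url), method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # A 404 here matches the failure seen after the notification pod moved.
        return err.code, err.read().decode("utf-8", errors="replace")
```

Calling pull_current_state() before and after moving the label should show the status changing from 200 to 404 on an affected build.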

Expected Behavior
------------------
The sidecar should reach the API server even when it is running on another host

Actual Behavior
----------------
The sidecar only reaches the API server when it is running on the same host where the sidecar is installed

Reproducibility
---------------
Reproducible

System Configuration
--------------------
All except AIO-SX

Last Pass
---------
This is a new test scenario

Timestamp/Logs
--------------
Before moving the server, returned 200 OK:
2022-10-03T13:16:45.762801209Z stdout F 2022-10-03 13:16:45,762 [DEBUG ] [notificationclientsdk.client.base][MainThread] Created Broker client:controller-1,rabbit://admin:admin@[172.16.166.133]:5672
2022-10-03T13:16:45.936465044Z stdout F 2022-10-03 13:16:45,935 [INFO ] [pecan.commands.serve][MainThread] "GET /ocloudNotifications/v1/PTP/CurrentState HTTP/1.1" 200 169

Server moves:
2022-10-03T17:14:58.038412664Z stdout F 2022-10-03 17:14:58,038 [DEBUG ] [notificationclientsdk.services.notification_worker][MainThread] consume location info @controller-0:{'NodeName': 'controller-0', 'PodIP': '172.16.192.65', 'ResourceTypes': ['PTP'], 'Timestamp': 1664817297.893791}
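
The location event above carries the new node name and pod IP. When scanning sidecar logs for such moves, the payload can be extracted as sketched below. This is an illustration only: it assumes the payload is printed as a Python dict literal, as in the line above, and the helper name parse_location_info is hypothetical.

```python
import ast

# Log line copied from this report (container runtime prefix omitted).
LOG_LINE = (
    "2022-10-03 17:14:58,038 [DEBUG ] "
    "[notificationclientsdk.services.notification_worker][MainThread] "
    "consume location info @controller-0:{'NodeName': 'controller-0', "
    "'PodIP': '172.16.192.65', 'ResourceTypes': ['PTP'], "
    "'Timestamp': 1664817297.893791}"
)

MARKER = "consume location info @"


def parse_location_info(line):
    """Return the location dict logged by the notification worker, or None."""
    _, found, tail = line.partition(MARKER)
    if not found:
        return None
    # tail looks like "controller-0:{...}"; everything after the first colon
    # is the dict literal.
    _, _, payload = tail.partition(":")
    return ast.literal_eval(payload)


info = parse_location_info(LOG_LINE)
print(info["NodeName"], info["PodIP"])
```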

After the move, returned 404 Not Found:
2022-10-03T17:21:27.099411681Z stdout F 2022-10-03 17:21:27,098 [DEBUG ] [notificationclientsdk.client.base][MainThread] Created Broker client:controller-1,rabbit://admin:admin@[172.16.166.133]:5672
2022-10-03T17:21:46.127691768Z stdout F 2022-10-03 17:21:46,126 [WARNING ] [pecan.commands.serve][MainThread] "GET /ocloudNotifications/v1/PTP/CurrentState HTTP/1.1" 404 62

Test Activity
-------------
Feature Testing

Workaround
----------
Move the sidecar to the host where the server is currently running

description: updated
Changed in starlingx:
status: New → In Progress
assignee: nobody → Douglas Henrique Koerich (dkoerich-wr)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ptp-notification-armada-app (master)
Ghada Khalil (gkhalil)
tags: added: stx.apps stx.networking
Changed in starlingx:
importance: Undecided → Medium
tags: added: stx.8.0
Revision history for this message
Douglas Henrique Koerich (dkoerich-wr) wrote : Re: New pulls and previous subscriptions from vdu on node-X start failing when notification pod moves to node-Y

A similar issue was found with the subscriptions created prior to the move of the notification pod.

summary: - PTP pull status fails if notification pod moves to other host
+ New pulls and previous subscriptions from vdu on node-X start failing
+ when notification pod moves to node-Y
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on ptp-notification-armada-app (master)

Change abandoned by "Douglas Henrique Koerich <email address hidden>" on branch: master
Review: https://review.opendev.org/c/starlingx/ptp-notification-armada-app/+/860494
Reason: Daemon context is not shared among the running threads.

summary: - New pulls and previous subscriptions from vdu on node-X start failing
- when notification pod moves to node-Y
+ PTP pull status fails if notification pod moves to other host
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ptp-notification-armada-app (master)

Reviewed: https://review.opendev.org/c/starlingx/ptp-notification-armada-app/+/860494
Committed: https://opendev.org/starlingx/ptp-notification-armada-app/commit/4c58b9e8b333e2ca6fe95368bf9301cff21192dd
Submitter: "Zuul (22348)"
Branch: master

commit 4c58b9e8b333e2ca6fe95368bf9301cff21192dd
Author: Douglas Henrique Koerich <email address hidden>
Date: Wed Oct 5 16:06:05 2022 -0300

    Update name of service node to pull PTP status

    Since neither in PTP notification API v1 or v2 the pull of PTP state:
    https://docs.starlingx.io/api-ref/ptp-notification-armada-app/api_ptp_notifications_definition_v1.html#pull-status-notifications
    https://docs.starlingx.io/api-ref/ptp-notification-armada-app/api_ptp_notifications_definition_v2.html#pull-status-notifications
    contains the name or address of the node running the PTP tracking
    service, in the scenario where that service (server) moves to another
    node while the sidecar (client) remains in the original node where the
    service was running before, further attempts to pull the PTP state
    fail with "404 Not Found".

    This change stores in the (now shared among threads) daemon context the
    node where service is running, and updates with latest location event
    triggered at notification worker.
    Instead of taking the residing node of sidecar ("THIS_NODE_NAME"), it
    reads the node name from the context to call the GET method of API.

    Test Plan:
    PASS: Installed new version of sidecar image and changed location of
          service, pulling PTP state with success before and after move.

    Closes-Bug: #1991793
    Signed-off-by: Douglas Henrique Koerich <email address hidden>
    Change-Id: Ie3f72c5b84f1d9093d6ee906bbc11f9fd4ceb31b
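
The approach described in the commit message above can be illustrated with the following sketch. It is not the code from the merged change: the class DaemonContext, its methods, and node_for_pull are hypothetical names, and a threading.Lock is assumed as the sharing mechanism. It only shows the idea of keeping the service node in a context shared among threads, updating it on each location event, and reading it when pulling instead of using the sidecar's own node name.

```python
import threading


class DaemonContext:
    """Hypothetical thread-safe holder for the node running the PTP service."""

    def __init__(self, initial_node):
        self._lock = threading.Lock()
        self._service_node = initial_node

    def update_from_location(self, location_info):
        """Record the node named in the latest location event, if any."""
        node = location_info.get("NodeName")
        if node:
            with self._lock:
                self._service_node = node

    @property
    def service_node(self):
        with self._lock:
            return self._service_node


def node_for_pull(context):
    """Pick the target node from the shared context, not THIS_NODE_NAME."""
    return context.service_node


# The sidecar starts on controller-1, where the service initially runs too.
context = DaemonContext(initial_node="controller-1")

# A worker thread consumes a location event announcing the move.
worker = threading.Thread(
    target=context.update_from_location,
    args=({"NodeName": "controller-0", "PodIP": "172.16.192.65"},),
)
worker.start()
worker.join()

print(node_for_pull(context))
```

With a context that is not shared, the API thread would keep reading the stale node name, which matches the reason given when the first version of the review was abandoned.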

Changed in starlingx:
status: In Progress → Fix Released
Revision history for this message
Douglas Henrique Koerich (dkoerich-wr) wrote :

Reopened to fix issue with O-RAN API v2.

Changed in starlingx:
status: Fix Released → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ptp-notification-armada-app (master)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ptp-notification-armada-app (master)

Reviewed: https://review.opendev.org/c/starlingx/ptp-notification-armada-app/+/864112
Committed: https://opendev.org/starlingx/ptp-notification-armada-app/commit/dfd4dc8c369a0166ca60a6f71cb1c09272a59eb4
Submitter: "Zuul (22348)"
Branch: master

commit dfd4dc8c369a0166ca60a6f71cb1c09272a59eb4
Author: Douglas Henrique Koerich <email address hidden>
Date: Wed Nov 9 09:56:37 2022 -0300

    Fix O-RAN (API v2) pull of current state

    While the change in
    https://review.opendev.org/c/starlingx/ptp-notification-armada-app/+/860494
    solved the problem of moving PTP tracking service across nodes for
    notification API v1, it didn't managed properly for v2.
    This change fixes the problem for v2 while improves the solution for v1.

    Test Plan:
    PASS: Both v1 & v2 of API tested for pull of current states.

    Closes-Bug: #1991793
    Signed-off-by: Douglas Henrique Koerich <email address hidden>
    Change-Id: I1afe8a864e8cda909743c2e91b93a7bc8dda66e8

Changed in starlingx:
status: In Progress → Fix Released