PTP mode=legacy causing host reboot loop in node

Bug #1795071 reported by Anujeyan Manokeran
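+-----------+--------------+------------+----------------+-----------+
| Affects   | Status       | Importance | Assigned to    | Milestone |
+-----------+--------------+------------+----------------+-----------+
| StarlingX | Fix Released | Medium     | Eric MacDonald |           |
+-----------+--------------+------------+----------------+-----------+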

Bug Description

Hosts go into a reboot loop after lock and unlock when PTP is enabled with mode=legacy on the hardware lab. mode=legacy is not a supported parameter for this hardware lab. I think this parameter should be removed if most hardware labs do not support it, since it causes a reboot loop.

system ptp-show
+--------------+--------------------------------------+
| Property | Value |
+--------------+--------------------------------------+
| uuid | 4f31663f-a341-4c32-85d8-c87d62ed3fb3 |
| enabled | True |
| mode | legacy |
| transport | l2 |
| mechanism | e2e |
| isystem_uuid | 26990655-a38b-4d37-82a0-09ceff90eb1f |
| created_at | 2018-09-17T21:02:47.832837+00:00 |
| updated_at | 2018-09-28T19:52:15.686067+00:00 |
+--------------+--------------------------------------+

fm alarm-list
+-------+-------------------------------------------------------+--------------------------------------+----------+-------------+
| Alarm | Reason Text | Entity ID | Severity | Time Stamp |
| ID | | | | |
+-------+-------------------------------------------------------+--------------------------------------+----------+-------------+
| 200. | compute-1 experienced a service-affecting failure. | host=compute-1 | critical | 2018-09-28T |
| 004 | Auto-recovery in progress. Manual Lock and Unlock may | | | 20:35:18. |
| | be required if auto-recovery is unsuccessful. | | | 642736 |
| | | | | |
| 300. | No enabled compute host with connectivity to provider | service=networking.providernet= | major | 2018-09-28T |
| 004 | network. | 10a40226-d4c8-40ad-ab9f-d1f0344866e4 | | 20:30:50. |
| | | | | 739766 |
| | | | | |
| 250. | controller-1 Configuration is out-of-date. | host=controller-1 | major | 2018-09-28T |
| 001 | | | | 20:21:33. |
| | | | | 098745 |
| | | | | |
+-------+-------------------------------------------------------+-----

Severity
--------
Major

Steps to Reproduce
------------------
Follow the install procedure, then:
1. system ptp-modify --mode=legacy --enabled=True
2. Lock and unlock all the hosts to clear the configuration out-of-date alarms. As in the description, the hosts went into a reboot loop.

Expected Behavior
------------------
No reboot.

Actual Behavior
----------------
As per description

Reproducibility
---------------
100% reproducible

System Configuration
--------------------
Storage system

Branch/Pull Time/Commit
-----------------------
Master build of 2018-09-17 12:17:07 -0400

Timestamp/Logs
--------------
28 20:35:40 UTC 2018

description: updated
Ghada Khalil (gkhalil) wrote :

Assigning to Alex to triage before deciding on the importance and target release

Changed in starlingx:
assignee: nobody → Alex Kozyrev (akozyrev)
tags: added: stx.config
Ghada Khalil (gkhalil) wrote :

From Alex:
There is no way to check PTP hardware capabilities up front.
So, after the PTP configuration is applied, we rely on the ptp4l process to tell us whether the configuration is valid.
If the configuration is not valid, ptp4l fails to start and the node goes into a reboot loop.
We are waiting for the collectd work to complete before changing this behavior: an alarm will be raised in case of an unsupported PTP mode.
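
For illustration only, the proposed behavior (raise an alarm when the PTP service fails to start, rather than failing the host) could be sketched roughly as below. This is not the StarlingX maintenance or collectd code. The function names, the grace period, and the returned strings are all hypothetical, and the sketch assumes that any exit of the daemon within the grace period means the configuration was rejected.

```python
import subprocess
import sys
from typing import List, Optional


def probe_service(cmd: List[str], grace_seconds: float = 2.0) -> Optional[int]:
    """Start cmd and watch it for grace_seconds (hypothetical helper).

    Returns the exit code if the process exits within the grace period,
    or None if it is still running, which this sketch treats as a
    successful start. The probe process is terminated afterwards.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.terminate()
        proc.wait()
        return None


def handle_ptp_start(cmd: List[str], grace_seconds: float = 2.0) -> str:
    """Report an alarm instead of failing the host (assumed policy)."""
    code = probe_service(cmd, grace_seconds)
    if code is None:
        return "ok"
    # Assumption: an early exit means the PTP configuration is unsupported.
    return f"alarm: PTP service exited with code {code}; host left in service"


if __name__ == "__main__":
    # Stand-in for a PTP daemon that rejects its configuration.
    failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(handle_ptp_start(failing, grace_seconds=5.0))
```

On a real host the stand-in command would be replaced by the actual ptp4l invocation, and the alarm string by a call into the fault management system; both are left out here because their exact form is not given in this report.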

Changed in starlingx:
importance: Undecided → Medium
status: New → Triaged
tags: added: stx.2019.03
Ghada Khalil (gkhalil) wrote :

Targeting stx.2019.03 -- issue is triggered by entering an unsupported ptp configuration. The user can recover by changing the config to a valid one. Therefore, not required for stx.2018.10.
Alarms will be added in the future.

Ken Young (kenyis)
tags: added: stx.2019.05
removed: stx.2019.03
OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-config (master)

Fix proposed to branch: master
Review: https://review.openstack.org/637052

Changed in starlingx:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Change abandoned on stx-config (master)

Change abandoned by Alex Kozyrev (<email address hidden>) on branch: master
Review: https://review.openstack.org/637052

Ken Young (kenyis)
Changed in starlingx:
assignee: Alex Kozyrev (akozyrev) → Eric MacDonald (rocksolidmtce)
Ken Young (kenyis)
tags: added: stx.2.0
removed: stx.2019.05
Ghada Khalil (gkhalil)
tags: added: stx.retestneeded
Eric MacDonald (rocksolidmtce) wrote :

This issue is fixed by the following update.

https://review.openstack.org/#/c/647527/

Changed in starlingx:
status: In Progress → Fix Released
Anujeyan Manokeran (anujeyan) wrote :

Verified in load 2019-04-10 01:30:00 +0000

Changed in starlingx:
status: Fix Released → Incomplete
status: Incomplete → Confirmed
status: Confirmed → Fix Committed
status: Fix Committed → Fix Released
tags: removed: stx.2.0 stx.config stx.retestneeded
Ghada Khalil (gkhalil)
tags: added: stx.2.0 stx.config