ptp-notification application apply failed because /var/run/ptp4l no longer exists

Bug #1961358 reported by Ghada Khalil
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Cole Walker

Bug Description

Brief Description
-----------------
ptp-notification application apply-failed because /var/run/ptp4l no longer exists

Severity
--------
Major

Steps to Reproduce
------------------
system host-update controller-0 clock_synchronization=ptp
system ptp-instance-add ptp1 ptp4l
system host-ptp-instance-assign controller-0 ptp1
system ptp-instance-parameter-add ptp1 domainNumber=24 slaveOnly=1
system ptp-interface-add if1 ptp1
system host-if-ptp-assign controller-0 oam0 if1
system ptp-instance-apply

system host-label-assign controller-0 ptp-registration=true ptp-notification=true
system application-upload -n ptp-notification /usr/local/share/applications/helm/ptp-notification*.tgz
system application-apply ptp-notification

Expected Behavior
-----------------
ptp-notification application is applied successfully

Actual Behavior
---------------
ptp-notification application apply-failed

Reproducibility
---------------
Reproducible

System Configuration
---------------------
Any system configured with PTP

Branch/Pull Time/Commit
-----------------------
2022-02-08_20-00-06

Last Pass
---------
Before the PTP Dual NIC feature: https://storyboard.openstack.org/#!/story/2009248

Timestamp/Logs
--------------
[sysadmin@controller-0 ~(keystone_admin)]$ system application-list

+--------------------------+---------+-----------------------------------+----------------------------------+--------------+---------------------+
| application | version | manifest name | manifest file | status | progress |
+--------------------------+---------+-----------------------------------+----------------------------------+--------------+---------------------+
| cert-manager | 1.0-29 | cert-manager-manifest | certmanager-manifest.yaml | applied | completed |
| nginx-ingress-controller | 1.1-20 | nginx-ingress-controller-manifest | nginx_ingress_controller_manifes | applied | completed |
| | | | t.yaml | | |
| | | | | | |
| oidc-auth-apps | 1.0-63 | oidc-auth-manifest | manifest.yaml | uploaded | completed |
| platform-integ-apps | 1.0-44 | platform-integration-manifest | manifest.yaml | applied | completed |
| ptp-notification | 1.0-52 | ptp-notification-manifest | ptp_notification_manifest.yaml | apply-failed | operation aborted, |
| | | | | | check logs for |
| | | | | | detail |
| | | | | | |
| rook-ceph-apps | 1.0-14 | rook-ceph-manifest | manifest.yaml | uploaded | completed |
+--------------------------+---------+-----------------------------------+----------------------------------+--------------+---------------------+

controller-0:~# kubectl describe pods -n notification ptp-ptp-notification-m555d

Name: ptp-ptp-notification-m555d
Namespace: notification
Priority: 0
Node: controller-0/abcd:204::2
Start Time: Wed, 09 Feb 2022 23:26:05 +0000
Labels: app=ptp-notification
                controller-revision-hash=666b9c9774
                pod-template-generation=1
                release=ptp-ptp-notification
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: DaemonSet/ptp-ptp-notification
Containers:
  ptp-notification-rabbitmq:
    Container ID:
    Image: registry.local:9001/docker.io/rabbitmq:3.8.11-management
    Image ID:
    Port: <none>
    Host Port: <none>
    State: Waiting
      Reason: ContainerCreating
    Ready: False
    Restart Count: 0
    Environment:
      THIS_NODE_NAME: (v1:spec.nodeName)
      THIS_POD_IP: (v1:status.podIP)
      THIS_NAMESPACE: notification
      RABBITMQ_DEFAULT_USER: admin
      RABBITMQ_DEFAULT_PASS: admin
      RABBITMQ_DEFAULT_PORT: 5672
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ddt72 (ro)
  ptp-notification-location:
    Container ID:
    Image: registry.local:9001/docker.io/starlingx/locationservice-base:stx.5.0-v1.0.1
    Image ID:
    Port: <none>
    Host Port: <none>
    Command:
      /bin/bash
      /mnt/locationservice_start.sh
    State: Waiting
      Reason: ContainerCreating
    Ready: False
    Restart Count: 0
    Environment:
      THIS_NODE_NAME: (v1:spec.nodeName)
      THIS_POD_IP: (v1:status.podIP)
      THIS_NAMESPACE: notification
      REGISTRATION_HOST: registration.notification.svc.cluster.local
      REGISTRATION_USER: admin
      REGISTRATION_PASS: admin
      REGISTRATION_PORT: 5672
      NOTIFICATIONSERVICE_USER: admin
      NOTIFICATIONSERVICE_PASS: admin
      NOTIFICATIONSERVICE_PORT: 5672
    Mounts:
      /mnt from scripts (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ddt72 (ro)
  ptp-notification-ptptracking:
    Container ID:
    Image: registry.local:9001/docker.io/starlingx/notificationservice-base:stx.5.0-v1.0.4
    Image ID:
    Port: <none>
    Host Port: <none>
    Command:
      /bin/bash
      /mnt/ptptracking_start.sh
    State: Waiting
      Reason: ContainerCreating
    Ready: False
    Restart Count: 0
    Environment:
      THIS_NODE_NAME: (v1:spec.nodeName)
      THIS_POD_IP: (v1:status.podIP)
      THIS_NAMESPACE: notification
      PTP_DEVICE_SIMULATED: false
      PTP_HOLDOVER_SECONDS: 15
      PTP_POLL_FREQ_SECONDS: 2
      NOTIFICATIONSERVICE_USER: admin
      NOTIFICATIONSERVICE_PASS: admin
      NOTIFICATIONSERVICE_PORT: 5672
      REGISTRATION_USER: admin
      REGISTRATION_PASS: admin
      REGISTRATION_PORT: 5672
      REGISTRATION_HOST: registration.notification.svc.cluster.local
    Mounts:
      /mnt from scripts (rw)
      /ptp/ from conf (ro)
      /usr/sbin/pmc from pmc (rw)
      /var/run/ from ptpdir (rw)
      /var/run/ptp4l from varrun (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ddt72 (ro)
Conditions:
  Type Status
  Initialized True
  Ready False
  ContainersReady False
  PodScheduled True
Volumes:
  scripts:
    Type: ConfigMap (a volume populated by a ConfigMap)
    Name: ptp-notification-scripts-configmap
    Optional: false
  ptpdir:
    Type: HostPath (bare host directory volume)
    Path: /var/run/
    HostPathType: Directory
  varrun:
    Type: HostPath (bare host directory volume)
    Path: /var/run/ptp4l
    HostPathType: Socket
  pmc:
    Type: HostPath (bare host directory volume)
    Path: /usr/sbin/pmc
    HostPathType:
  conf:
    Type: HostPath (bare host directory volume)
    Path: /etc/
    HostPathType: Directory
  kube-api-access-ddt72:
    Type: Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds: 3607
    ConfigMapName: kube-root-ca.crt
    ConfigMapOptional: <nil>

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: apps/v1
kind: DaemonSet
metadata:
  annotations:
    deprecated.daemonset.template.generation: "1"
  creationTimestamp: "2022-02-09T23:26:05Z"
  generation: 1
  labels:
    app: ptp-notification
    chart: ptp-notification
    release: ptp-ptp-notification
  name: ptp-ptp-notification
  namespace: notification
  resourceVersion: "38765"
  selfLink: /apis/apps/v1/namespaces/notification/daemonsets/ptp-ptp-notification
  uid: faf54fc4-9487-4fb2-be36-8f7c1f8f2936
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: ptp-notification
      release: ptp-ptp-notification
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: ptp-notification
        release: ptp-ptp-notification
      namespace: notification
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: ptp-notification
                operator: In
                values:
                - "true"
      containers:
      - env:
        - name: THIS_NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        - name: THIS_POD_IP
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.podIP
        - name: THIS_NAMESPACE
          value: notification
        - name: RABBITMQ_DEFAULT_USER
          value: admin
        - name: RABBITMQ_DEFAULT_PASS
          value: admin
        - name: RABBITMQ_DEFAULT_PORT
          value: "5672"
        image: registry.local:9001/docker.io/rabbitmq:3.8.11-management
        imagePullPolicy: IfNotPresent
        name: ptp-notification-rabbitmq
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      [... remainder of DaemonSet spec truncated in the original report ...]
    DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node-role.kubernetes.io/master:NoSchedule op=Exists
                             node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type Reason Age From Message
  ---- ------ ---- ---- -------
  Normal Scheduled 16h default-scheduler Successfully assigned notification/ptp-ptp-notification-m555d to controller-0
  Warning FailedMount 124m (x61 over 16h) kubelet Unable to attach or mount volumes: unmounted volumes=[varrun], unattached volumes=[scripts ptpdir varrun pmc conf kube-api-access-ddt72]: timed out waiting for the condition
  Warning FailedMount 50m (x70 over 16h) kubelet Unable to attach or mount volumes: unmounted volumes=[varrun], unattached volumes=[pmc conf kube-api-access-ddt72 scripts ptpdir varrun]: timed out waiting for the condition
  Warning FailedMount 29m (x61 over 16h) kubelet Unable to attach or mount volumes: unmounted volumes=[varrun], unattached volumes=[varrun pmc conf kube-api-access-ddt72 scripts ptpdir]: timed out waiting for the condition
  Warning FailedMount 25m (x67 over 16h) kubelet Unable to attach or mount volumes: unmounted volumes=[varrun], unattached volumes=[conf kube-api-access-ddt72 scripts ptpdir varrun pmc]: timed out waiting for the condition
  Warning FailedMount 10m (x502 over 16h) kubelet MountVolume.SetUp failed for volume "varrun" : hostPath type check failed: /var/run/ptp4l is not a socket file
  Warning FailedMount 4m46s (x70 over 16h) kubelet Unable to attach or mount volumes: unmounted volumes=[varrun], unattached volumes=[ptpdir varrun pmc conf kube-api-access-ddt72 scripts]: timed out waiting for the condition
  Warning FailedMount 15s (x107 over 16h) kubelet Unable to attach or mount volumes: unmounted volumes=[varrun], unattached volumes=[kube-api-access-ddt72 scripts ptpdir varrun pmc conf]: timed out waiting for the condition

Test Activity
-------------
Feature Testing

Workaround
----------
Unknown

Revision history for this message
Ghada Khalil (gkhalil) wrote :

screening: stx.7.0 / medium: issue introduced by new stx.7.0 feature: https://storyboard.openstack.org/#!/story/2009248

tags: added: stx.7.0 stx.networking
Changed in starlingx:
importance: Undecided → Medium
status: New → Triaged
assignee: nobody → Cole Walker (cwalops)
Changed in starlingx:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to ptp-notification-armada-app (master)

Reviewed: https://review.opendev.org/c/starlingx/ptp-notification-armada-app/+/832178
Committed: https://opendev.org/starlingx/ptp-notification-armada-app/commit/f626e3a6b890b4e9990f39d8fe98b39057ee5ed4
Submitter: "Zuul (22348)"
Branch: master

commit f626e3a6b890b4e9990f39d8fe98b39057ee5ed4
Author: Cole Walker <email address hidden>
Date: Fri Mar 4 16:17:28 2022 -0500

    [PTP SyncE] Support ptp-notification on GM node

    This change adds support for the ptp-notification armada app to track
    the clock status on a node operating as GM. This is required for nodes
    running Westport Channel NICs using the GNSS module as the clock and
    time of day source. This will require a rebuild of the
    notificationservice container image.

    This also fixes an issue where ptp-notification would no longer deploy
    with multi-instance ptp because several of the paths that were being
    mounted to the pods have been changed. The fix was to change these paths
    to user-configurable variables which can be supplied to the notification
    application as helm overrides. This change is only to the helm charts
    and does not require an image build.

    Testing:

    PASS: Build and deploy stx-ptp-notification-helm with multi instance
    ptp.

    PASS: Build notificationservice container and deploy with helm charts,
    validated that a GM node reported locked/holdover/freerun status
    correctly.

    Closes-Bug: 1961358
    Story: 2009130
    Task: 44700

    Signed-off-by: Cole Walker <email address hidden>
    Change-Id: Ibc1fc6c6342f873ea75c2e4015eb4c910b7010fd
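
Per the commit message above, the fix exposes the mounted host paths as Helm values that can be supplied as overrides. A sketch of what such an overrides file might look like is below; the key names and paths are purely illustrative placeholders, not the chart's actual keys — consult the released chart's values.yaml for the real names.

```yaml
# Hypothetical helm-overrides file (key names are illustrative only):
# point the ptptracking container at the per-instance ptp4l socket
# created by multi-instance PTP instead of the legacy /var/run/ptp4l.
ptptracking:
  ptp4lSocketPath: /var/run/ptp4l-ptp1   # example; depends on instance name
  ptp4lConfigPath: /etc/linuxptp/ptpinstance
```

On StarlingX, overrides of this kind would typically be applied with `system helm-override-update` for the ptp-notification chart before re-running `system application-apply ptp-notification`.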

Changed in starlingx:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ptp-notification-armada-app (r/stx.6.0)
OpenStack Infra (hudson-openstack) wrote : Fix merged to ptp-notification-armada-app (r/stx.6.0)

Reviewed: https://review.opendev.org/c/starlingx/ptp-notification-armada-app/+/862725
Committed: https://opendev.org/starlingx/ptp-notification-armada-app/commit/a5297275b49dc6679f891c0537e37c0fe7ffa078
Submitter: "Zuul (22348)"
Branch: r/stx.6.0

commit a5297275b49dc6679f891c0537e37c0fe7ffa078
Author: Cole Walker <email address hidden>
Date: Fri Mar 4 16:17:28 2022 -0500

    [PTP SyncE] Support ptp-notification on GM node

    This change adds support for the ptp-notification armada app to track
    the clock status on a node operating as GM. This is required for nodes
    running Westport Channel NICs using the GNSS module as the clock and
    time of day source. This will require a rebuild of the
    notificationservice container image.

    This also fixes an issue where ptp-notification would no longer deploy
    with multi-instance ptp because several of the paths that were being
    mounted to the pods have been changed. The fix was to change these paths
    to user-configurable variables which can be supplied to the notification
    application as helm overrides. This change is only to the helm charts
    and does not require an image build.

    Testing:

    PASS: Build and deploy stx-ptp-notification-helm with multi instance
    ptp.

    PASS: Build notificationservice container and deploy with helm charts,
    validated that a GM node reported locked/holdover/freerun status
    correctly.

    Closes-Bug: 1961358
    Story: 2009130
    Task: 44700

    Signed-off-by: Cole Walker <email address hidden>
    Change-Id: Ibc1fc6c6342f873ea75c2e4015eb4c910b7010fd
    (cherry picked from commit f626e3a6b890b4e9990f39d8fe98b39057ee5ed4)
