Error executing `timeout` command in ceph-init-wrapper

Bug #2037728 reported by Tiago Lucas Leal
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: Low
Assigned to: Tiago Lucas Leal

Bug Description

Brief Description
-----------------
The following error appears several times in the log file ceph-process-states.log:

/etc/init.d/ceph-init-wrapper osd.0 WARN: Error executing: timeout ceph daemon osd.0 config set debug_osd 20/20 errorcode: 125 output: timeout: invalid time interval ‘ceph’ Try 'timeout --help' for more information.
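
The message suggests that the duration argument expanded to nothing, so `timeout` took the next word, `ceph`, as its time interval. Below is a minimal sketch of that failure mode, assuming GNU coreutils `timeout`. Here `echo hello` stands in for the real `ceph daemon` call, and the empty `WAIT_FOR_CMD` is an assumption about how the duration went missing, not the wrapper's actual code:

```shell
#!/bin/bash
# Assumption: the duration variable is empty when the command line is built.
WAIT_FOR_CMD=""

# Unquoted and empty, the variable disappears from the argument list, so
# timeout sees "echo" where it expects a DURATION and refuses to run.
# GNU coreutils timeout exits with 125 when timeout itself fails.
rc=0
output=$(timeout $WAIT_FOR_CMD echo hello 2>&1) || rc=$?

echo "errorcode: $rc output: $output"
```

The wrapped command never runs in this case, which matches the errorcode 125 reported in the log.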

Severity
--------
Minor: System/Feature is usable with minor issue

Steps to Reproduce
------------------
Force osd.0 to hang, then check that /var/log/ceph/ceph-process-states.log contains an "Error executing: timeout 10 ceph daemon osd.0" message.
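
To check the log from a shell, something like the following could be used. This is only a sketch: `count_timeout_errors` is a hypothetical helper, not part of ceph-init-wrapper, and the search string is taken from the message quoted in this report.

```shell
#!/bin/bash
# Hypothetical helper: print how many lines of a log file contain the
# "Error executing: timeout" message. Prints 0 when nothing matches.
count_timeout_errors() {
    local logfile=$1
    # grep -c exits non-zero when the count is 0, so ignore its status.
    grep -cF "Error executing: timeout" "$logfile" || true
}

logfile=/var/log/ceph/ceph-process-states.log
if [ -r "$logfile" ]; then
    echo "matches in $logfile: $(count_timeout_errors "$logfile")"
else
    echo "log file $logfile not readable on this host"
fi
```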

Expected Behavior
------------------
ceph-init-wrapper must be able to execute `execute_ceph_cmd` successfully

Actual Behavior
----------------
The function does not run the command; the `timeout` call fails instead, and the error is written to the log.

Reproducibility
---------------
The error was found while analyzing the log file ceph-process-states.log. It seems to occur every time ceph-init-wrapper calls the function `execute_ceph_cmd`.

System Configuration
--------------------
Found in StarlingX 8.0.0.

Timestamp/Logs
--------------
Look for the message below:

/etc/init.d/ceph-init-wrapper osd.0 WARN: Error executing: timeout ceph daemon osd.0 config set debug_osd 20/20 errorcode: 125 output: timeout: invalid time interval ‘ceph’ Try 'timeout --help' for more information.

Tiago Lucas Leal (tleal)
Changed in starlingx:
assignee: nobody → Tiago Lucas Leal (tleal)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to integ (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/integ/+/896952

Changed in starlingx:
status: New → In Progress
tags: added: stx stx.9.0
OpenStack Infra (hudson-openstack) wrote : Fix merged to integ (master)

Reviewed: https://review.opendev.org/c/starlingx/integ/+/896952
Committed: https://opendev.org/starlingx/integ/commit/10a6701d7134a93930c84e1555714e8863218e8c
Submitter: "Zuul (22348)"
Branch: master

commit 10a6701d7134a93930c84e1555714e8863218e8c
Author: Tiago Leal <email address hidden>
Date: Fri Sep 29 12:28:27 2023 -0300

    Fix timeout command in ceph-init-wrapper

    When analyzing the ceph-process-states.log file, we observed a
    recurring error scenario. In the /etc/init.d/ceph-init-wrapper
    osd.0 script, the 'timeout' was consistently failing with error
    code 125 on the execute_ceph_cmd function call. This failure was
    due to the absence of a mandatory value parameter, causing
    'timeout' to interpret 'ceph' as an invalid time interval.

    To solve this bug, we introduced the necessary initialization
    of the $WAIT_FOR_CMD variable. This ensures that the command is
    executed correctly, addressing the issue and preventing the
    recurrence of the 'timeout' error.

    Test Plan:
      - PASS: Force the disk process to be reported as hung and
        check the aforementioned log for the desired output.

    Closes-Bug: 2037728
    Change-Id: Ic337b212b74c0cc76f25f4aaf9a99d77f8d9250d
    Signed-off-by: Tiago Leal <email address hidden>
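
For illustration, the fix described in the commit message can be sketched as follows. This is not the merged change: `run_with_timeout` is a hypothetical stand-in for `execute_ceph_cmd`, the default of 10 seconds is an assumption, and `echo hello` replaces the real `ceph daemon` command.

```shell
#!/bin/bash
# Assumed default; the value used by the real script may differ.
WAIT_FOR_CMD=${WAIT_FOR_CMD:-10}

# Hypothetical stand-in for execute_ceph_cmd: run a command under timeout,
# print its output, and log a warning on stderr if it fails.
run_with_timeout() {
    local output rc=0
    # Quoting the variable keeps the duration as its own argument, so an
    # empty value would be rejected by timeout rather than silently
    # shifting the command name into the DURATION position.
    output=$(timeout "$WAIT_FOR_CMD" "$@" 2>&1) || rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "WARN: Error executing: timeout $WAIT_FOR_CMD $* errorcode: $rc output: $output" >&2
    fi
    printf '%s\n' "$output"
    return "$rc"
}

result=$(run_with_timeout echo hello)
echo "result: $result"
```

With the variable initialized before the call, `timeout` receives a valid duration and the wrapped command runs.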

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
tags: added: stx.integ
removed: stx
Changed in starlingx:
importance: Undecided → Low