Add toshiba plugin support

Bug #1866897 reported by dann frazier
12
This bug affects 1 person
Affects Status Importance Assigned to Milestone
nvme-cli (Ubuntu)
Fix Released
Undecided
Unassigned
Bionic
In Progress
Medium
Matthew Ruffell
Eoan
Fix Released
Undecided
Unassigned
Focal
Fix Released
Undecided
Unassigned

Bug Description

[Impact]

The version of nvme-cli in Bionic (1.5-ubuntu1) lacks support for functionality unique to Toshiba NVMe devices, provided by the toshiba-nvme plugin:

$ sudo nvme toshiba
nvme-1.5

usage: nvme toshiba <command> [<device>] [<args>]

The '<device>' may be either an NVMe character device (ex: /dev/nvme0) or an
nvme block device (ex: /dev/nvme0n1).

Toshiba NVME plugin

The following are all implemented sub-commands:
  vs-smart-add-log Extended SMART information
  vs-internal-log Get Internal Log
  clear-pcie-correctable-errors Clear PCIe correctable error count
  version Shows the program version
  help Display this help

Several customers have asked for this plugin to be included, so they can see the internal logs from their Toshiba NVMe devices.

[Test Case]

You can find a test package for Bionic here:

https://launchpad.net/~mruffell/+archive/ubuntu/sf258218-test

Once that is installed, verify that the toshiba plugin exists:

$ sudo nvme toshiba

This will print plugin help, and a list of subcommands which are supported.

From there, you can run each of the plugin commands to test them:

$ for cmd in vs-smart-add-log vs-internal-log clear-pcie-correctable-errors; do
  sudo nvme toshiba $cmd /dev/nvme0n1
done

I have tested the package without access to hardware, and the help commands work as expected:

https://paste.ubuntu.com/p/FhJwc6wscC/

[Regression Risk]

I believe that this SRU falls under the "Other safe cases" category under
https://wiki.ubuntu.com/StableReleaseUpdates#Other_safe_cases

This SRU fulfills the first three cases for applicability while also improving the user experience of those with Toshiba NVMe devices.

For case one:
(1) has an obviously safe patch: We are adding a standalone plugin separated completely from existing code, since it is implemented down the new "nvme toshiba" code path. The patch itself only reads hardware registers and prints the contents to the console, and does not do anything too serious.
(2) "affects an application" - changes are limited to nvme-cli only, as kernel support is already present.

For case two:
This SRU will enable the use of Toshiba NVMe devices, while not affecting any existing NVMe devices from other manufacturers, since the subcommand is implemented alongside other manufactures subcommands, and does not modify any existing code.

For case three:
The changes are unintrusive, again, to being limited to plugin which implements a new subcommand. The code itself is simplistic, and has not been dramatically modified since it first landed in upstream compared to the present day. The feature landed in nvme-cli version 1.6, which appears in cosmic onwards, meaning upgrading from bionic to a newer release will not cause upgrade regressions.

I did a diff of the proposed patch to the current upstream master branch, and the new changes are more or less cosmetic:

https://paste.ubuntu.com/p/TBy3HbPGV6/

If you want to review each commit to current master branch, I have provided them below:

e5786ad argconfig: Remove unused paramters
1d37bc9 nvme-cli: Code cleanup
507ded5 nvme-vendor: fix c99 declarations in vendor plugins
f392181 Refactor plugins in a file hierarchy

We have included one fixup commit already in this SRU, check [Other Info] for details.

I will ask a customer to test the package on real hardware soon, and try and get a printout of their session.

As such, I believe this will not introduce any regressions. If a regression were to happen it would be limited to the "nvme toshiba" subcommand, and not any other nvme-cli functionality.

[Other Info]

The commits which add support for Toshiba devices are:

commit 3d52c76a74a5fe894d9fbfcdfc9c713958f5717f
Author: Andrew Meir <email address hidden>
Date: Wed Mar 28 17:42:53 2018 +0200
Subject: Add toshiba plugin code and command documentation.
Link: https://github.com/linux-nvme/nvme-cli/commit/3d52c76a74a5fe894d9fbfcdfc9c713958f5717f

commit daf7b2d2cb53d8621444113503739b327091bec5
Author: Andrew Meir <email address hidden>
Date: Wed Mar 28 20:21:34 2018 +0200
Subject: Add toshiba plugin to makefile rules.
Link: https://github.com/linux-nvme/nvme-cli/commit/daf7b2d2cb53d8621444113503739b327091bec5

There is also a fixup commit which is required, and is a part of this SRU:

commit 5e9239edb89dd753352a8e981502f42b9d7be87c
Author: Andrew Meir <email address hidden>
Date: Fri Apr 13 14:51:17 2018 +0200
Subject: Fixes for log page access.
Link: https://github.com/linux-nvme/nvme-cli/commit/5e9239edb89dd753352a8e981502f42b9d7be87c

All three commits landed in version 1.6 of nvme-cli, and bionic only needs the backport:

$ rmadison nvme-cli
nvme-cli | 1.5-1 | bionic/universe | amd64
nvme-cli | 1.5-1ubuntu1 | bionic-updates/universe | amd64
nvme-cli | 1.7-1 | eoan/universe | amd64
nvme-cli | 1.9-1 | focal/universe | amd64

There was some minor backporting required. I merged the commit "Add toshiba plugin to makefile rules." into the primary Toshiba plugin patch, since it only changes the Makefile, which needed some context adjustment.

There was also a parameter change needed for nvme_set_feature() in clear_correctable_errors(), and you can find my analysis and modification for that in Comment #3.

Tags: sts
Revision history for this message
dann frazier (dannf) wrote :

Cherry-picking the following commit causes a build failure:
  3d52c76 Add toshiba plugin code and command documentation.
This is due to a change due to a since-added parameter to nvme_set_feature() in:
  54f5e4a Add support for decoding IO Determinism features

Following the pattern of changes in that commit, I think we can just drop that parameter like so:

diff --git a/toshiba-nvme.c b/toshiba-nvme.c
index a3b6c13..4bd03b7 100644
--- a/toshiba-nvme.c
+++ b/toshiba-nvme.c
@@ -627,7 +627,7 @@ static int clear_correctable_errors(int argc, char **argv, struct command *cmd,
        const bool save = false;
        __u32 result;
        err = nvme_set_feature(fd, namespace_id, feature_id, value, cdw12, save,
- 0, NULL, &result);
+ NULL, &result);
        if (err) {
                fprintf(stderr, "%s: couldn't clear PCIe correctable errors \n", __func__);
        }

Revision history for this message
dann frazier (dannf) wrote :

I also suggest we pull in:
  5e9239e Fixes for log page access.

Changed in nvme-cli (Ubuntu Focal):
status: New → Fix Released
Changed in nvme-cli (Ubuntu Eoan):
status: New → Fix Released
Changed in nvme-cli (Ubuntu Bionic):
status: New → Confirmed
Revision history for this message
Matthew Ruffell (mruffell) wrote :
Download full text (3.8 KiB)

After reviewing the code for adding the cdw12 parameter to nvme_set_feature(), I believe it is safe to drop the parameter.

Analysis:
54f5e4a Add support for decoding IO Determinism features

@@ -477,13 +478,13 @@ int nvme_feature(int fd, __u8 opcode, __u32 nsid, __u32 cdw10, __u32 cdw11,
        return err;
 }

-int nvme_set_feature(int fd, __u32 nsid, __u8 fid, __u32 value, bool save,
- __u32 data_len, void *data, __u32 *result)
+int nvme_set_feature(int fd, __u32 nsid, __u8 fid, __u32 value, __u32 cdw12,
+ bool save, __u32 data_len, void *data, __u32 *result)
 {
        __u32 cdw10 = fid | (save ? 1 << 31 : 0);

        return nvme_feature(fd, nvme_admin_set_features, nsid, cdw10, value,
- data_len, data, result);
+ cdw12, data_len, data, result);
 }

The cdw12 parameter is added, and passed straight through to nvme_feature().

From there, all that is done with it is to set a field in a struct:

@@ -459,13 +459,14 @@ int nvme_sanitize_log(int fd, struct nvme_sanitize_log_page *sanitize_log)
 }

 int nvme_feature(int fd, __u8 opcode, __u32 nsid, __u32 cdw10, __u32 cdw11,
- __u32 data_len, void *data, __u32 *result)
+ __u32 cdw12, __u32 data_len, void *data, __u32 *result)
 {
        struct nvme_admin_cmd cmd = {
                .opcode = opcode,
                .nsid = nsid,
                .cdw10 = cdw10,
                .cdw11 = cdw11,
+ .cdw11 = cdw12,
                .addr = (__u64)(uintptr_t) data,
                .data_len = data_len,
        };

Is it just me, or is .cdw11 set twice? Is this a mistake? It is! Let's see

commit 2d8627ad3eff1afb11d0cad533ce53d019014eea
Author: Muhammad Ahmad <email address hidden>
Date: Fri Mar 2 00:07:57 2018 -0600

    Fixed a bug where cdw11 was being overwritten by cdw12

diff --git a/nvme-ioctl.c b/nvme-ioctl.c
index 5166446..e8ba935 100644
--- a/nvme-ioctl.c
+++ b/nvme-ioctl.c
@@ -472,7 +472,7 @@ int nvme_feature(int fd, __u8 opcode, __u32 nsid, __u32 cdw10, __u32 cdw11,
                .nsid = nsid,
                .cdw10 = cdw10,
                .cdw11 = cdw11,
- .cdw11 = cdw12,
+ .cdw12 = cdw12,
                .addr = (__u64)(uintptr_t) data,
                .data_len = data_len,
        };

Looking at clear_correctable_errors(), we see that cdw12 is set to 0:

+ const __u32 cdw12 = 0;
+ const bool save = false;
+ __u32 result;
+ err = nvme_set_feature(fd, namespace_id, feature_id, value, cdw12, save,
+ 0, NULL, &result);

The cdw12 struct member is not used at all in nvme-cli 1.5, the version in bionic, and "54f5e4a Add support for decoding IO Determinism features" does nothing with the variable. I believe it is safe to drop cdw12 from nvme_set_feature().

Note, Dann, your patch is wrong, since you drop the wrong parameter. You are dropping data_len instead of cdw12.

The correct patch is:

diff --git a/toshiba-nvme.c b/toshiba-nvme.c
index a3b6c13..0694b8b 100644
--- a/toshiba-nvme.c
+++ b/toshib...

Read more...

Changed in nvme-cli (Ubuntu Bionic):
importance: Undecided → Medium
assignee: nobody → Matthew Ruffell (mruffell)
status: Confirmed → In Progress
tags: added: sts
Revision history for this message
dann frazier (dannf) wrote : Re: [Bug 1866897] Re: Add toshiba plugin support

On Tue, Mar 10, 2020 at 3:50 PM Matthew Ruffell
<email address hidden> wrote:
> Note, Dann, your patch is wrong, since you drop the wrong parameter. You
> are dropping data_len instead of cdw12.

Aha! Nice catch Matthew.

description: updated
description: updated
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.