> separate machines which have different controller serial numbers
> I believe that the controller is going to be the *same* on the commissioned node as when it is deployed
Yes, for 1 controller per server.
For the 1 server/2+ controllers case, each controller (say, of the same model) will have a different serial number. If the original controller through which a namespace was accessible is removed from the server, the expectation with multipath is that another controller will provide access to it.
I believe this may be problematic in the following case:
1) 2 controllers are present in a server;
2) the server is commissioned and deployed (udev rules tied to WWN and SERIAL are generated for discovered namespaces);
3) the server is shut down and one card is removed from it (the one with the serial written to a udev rule);
4) the server is brought back up and namespaces are rediscovered through a different controller with a different serial => by-dname rules do not match because the serial number differs.
Theoretically, just rebooting the server without removing 1 controller might yield this situation as well depending on the order they are enumerated in.
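To make the failure mode concrete, here is a small simulation of the matching logic. It is only a sketch, not the actual udev rule evaluation: the `Rule`, `Namespace`, and `match_dname` names, the WWN value, and the serials `SN-A`/`SN-B` are all made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical stand-in for a generated by-dname udev rule."""
    wwn: str
    serial: Optional[str]  # None means "do not match on serial"
    dname: str


@dataclass(frozen=True)
class Namespace:
    """A namespace as seen through one particular controller."""
    wwn: str
    serial: str  # serial reported via the controller used for discovery


def match_dname(rules: List[Rule], ns: Namespace) -> Optional[str]:
    """Return the dname of the first rule matching ns, or None."""
    for rule in rules:
        if rule.wwn != ns.wwn:
            continue
        if rule.serial is not None and rule.serial != ns.serial:
            continue
        return rule.dname
    return None


# Made-up identifiers: one namespace reachable through two controllers.
WWN = "eui.0000000000000001"
via_a = Namespace(wwn=WWN, serial="SN-A")  # path seen at commissioning
via_b = Namespace(wwn=WWN, serial="SN-B")  # path seen after card removal

rules_wwn_and_serial = [Rule(wwn=WWN, serial="SN-A", dname="nvme-data")]
rules_wwn_only = [Rule(wwn=WWN, serial=None, dname="nvme-data")]

print(match_dname(rules_wwn_and_serial, via_a))  # rule matches
print(match_dname(rules_wwn_and_serial, via_b))  # no match: serial differs
print(match_dname(rules_wwn_only, via_b))        # matches on WWN alone
```

In this toy model the rule keyed on both WWN and serial stops matching as soon as the namespace is seen through the second controller, while a WWN-only rule keeps matching.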
Does it look like a sane scenario or am I missing something obvious?
As for the implementation, in the 1.3 NVMe spec controllers have a CMIC field that indicates whether an NVM subsystem can contain multiple controllers, and a subsystem NQN (SUBNQN), which is used to group controllers under a subsystem in sysfs.
https://github.com/torvalds/linux/commit/180de0070048340868c7bc841fc12e75556bb629
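For illustration, a rough sketch of grouping controllers by subsystem NQN from userspace. It assumes each controller directory under `/sys/class/nvme` exposes `subsysnqn` and `serial` attribute files; that layout is my assumption and may differ between kernel versions. The `group_by_subsystem` helper is made up for this example.

```python
import os
from collections import defaultdict
from typing import Dict, List, Tuple


def group_by_subsystem(root: str = "/sys/class/nvme") -> Dict[str, List[Tuple[str, str]]]:
    """Map subsystem NQN -> [(controller name, serial), ...].

    Assumes every controller directory under root has 'subsysnqn'
    and 'serial' files; controllers lacking either are skipped.
    """
    groups: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    if not os.path.isdir(root):
        return {}
    for name in sorted(os.listdir(root)):
        ctrl = os.path.join(root, name)
        try:
            with open(os.path.join(ctrl, "subsysnqn")) as f:
                nqn = f.read().strip()
            with open(os.path.join(ctrl, "serial")) as f:
                serial = f.read().strip()
        except OSError:
            continue  # attribute missing or unreadable
        groups[nqn].append((name, serial))
    return dict(groups)


if __name__ == "__main__":
    for nqn, ctrls in group_by_subsystem().items():
        print(nqn, ctrls)
```

If two controllers report the same subsystem NQN, they end up in the same list, which is the situation where a namespace could be reached through either of them.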
> consider it a udev/kernel bug if nvme0c33n1 were to ever show up in a value in the udev database
Right, agreed.