[bionic][azure] fence_scsi unable to unfence from another node
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
fence-agents (Ubuntu) | Invalid | Undecided | Unassigned | |
Bionic | Invalid | Undecided | Unassigned | |
Bug Description
While testing fence_scsi with Azure's new shared disk feature, I found that I can register a key and acquire a reservation with it from the host the key was generated for. However, trying to unregister that key from another host fails with errors.
Because of this problem, I can't use the fence_scsi agent for fencing in Microsoft Azure.
-------
rafaeldtinoco@
3abe0000
#### Registering a node into the shared disk manually:
rafaeldtinoco@
Executing: /usr/bin/sg_turs /dev/sdc
0
Executing: /usr/bin/sg_turs /dev/sdc
0
Executing: /usr/bin/sg_persist -n -i -k -d /dev/sdc
0 PR generation=0x8, there are NO registered reservation keys
No registration for key 3abe0000 on device /dev/sdc
Executing: /usr/bin/sg_turs /dev/sdc
0
Executing: /usr/bin/sg_persist -n -o -I -S 3abe0000 -d /dev/sdc
0
Executing: /usr/bin/sg_turs /dev/sdc
0
Executing: /usr/bin/sg_persist -n -i -k -d /dev/sdc
0 PR generation=0x9, 1 registered reservation key follows:
0x3abe0000
Executing: /usr/bin/sg_turs /dev/sdc
0
Executing: /usr/bin/sg_persist -n -i -r -d /dev/sdc
0 PR generation=0x9, there is NO reservation held
Executing: /usr/bin/sg_turs /dev/sdc
0
Executing: /usr/bin/sg_persist -n -o -R -T 5 -K 3abe0000 -d /dev/sdc
0
Executing: /usr/bin/sg_turs /dev/sdc
0
Executing: /usr/bin/sg_turs /dev/sdc
0
Executing: /usr/bin/sg_persist -n -i -k -d /dev/sdc
0 PR generation=0x9, 1 registered reservation key follows:
0x3abe0000
Success: Powered ON
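The key listings in the log above (`sg_persist -n -i -k`) are what the agent relies on to confirm that a registration took effect. A minimal sketch of reading that output, assuming the text format shown in this report; `parse_registered_keys` is a hypothetical helper for illustration, not fence_scsi's actual code:

```python
import re


def parse_registered_keys(output):
    """Return (generation, keys) from `sg_persist -n -i -k` style output.

    Assumes the format seen in this report: one "PR generation=0x..." line,
    followed by zero or more lines that each hold a single hex key.
    """
    gen_match = re.search(r"PR generation=(0x[0-9a-fA-F]+)", output)
    if gen_match is None:
        raise ValueError("no PR generation line found")
    generation = int(gen_match.group(1), 16)
    # Keys appear alone on their own line, e.g. "0x3abe0000".
    key_lines = re.findall(r"^[ \t]*(0x[0-9a-fA-F]+)[ \t]*$", output, re.MULTILINE)
    return generation, [int(key, 16) for key in key_lines]


# Sample text copied from the log above (without the leading return code).
before = "PR generation=0x8, there are NO registered reservation keys\n"
after = "PR generation=0x9, 1 registered reservation key follows:\n0x3abe0000\n"

print(parse_registered_keys(before))  # (8, [])
generation, keys = parse_registered_keys(after)
print(generation, [hex(key) for key in keys])  # 9 ['0x3abe0000']
```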
#### Trying to un-register from the same node HAS TO FAIL (fence_scsi does not allow it)
rafaeldtinoco@
Delay 0 second(s) before logging in to the fence device
Executing: /usr/bin/sg_turs /dev/sdc
0
Executing: /usr/bin/sg_turs /dev/sdc
0
Executing: /usr/bin/sg_persist -n -i -k -d /dev/sdc
0 PR generation=0x9, 1 registered reservation key follows:
0x3abe0000
Failed: keys cannot be same. You can not fence yourself.
#### Trying to un-register from another node (has to succeed, that is the fencing purpose!)
rafaeldtinoco@
Delay 0 second(s) before logging in to the fence device
Executing: /usr/bin/sg_turs /dev/sdc
0
Executing: /usr/bin/sg_turs /dev/sdc
0
Executing: /usr/bin/sg_persist -n -i -k -d /dev/sdc
0 PR generation=0x9, 1 registered reservation key follows:
0x3abe0000
Executing: /usr/bin/sg_turs /dev/sdc
0
Executing: /usr/bin/sg_persist -n -i -k -d /dev/sdc
0 PR generation=0x9, 1 registered reservation key follows:
0x3abe0000
Executing: /usr/bin/sg_turs /dev/sdc
0
Executing: /usr/bin/sg_persist -n -o -A -T 5 -K 62ed0001 -S 3abe0000 -d /dev/sdc
99 persistent reserve out: transport: Host_status=0x07 [DID_ERROR]
Driver_status=0x00 [DRIVER_OK]
PR out (Preempt and abort): Sense category: -1, try '-v' option for more information
Executing: /usr/bin/sg_turs /dev/sdc
0
Executing: /usr/bin/sg_persist -n -i -k -d /dev/sdc
0 PR generation=0x9, 1 registered reservation key follows:
0x3abe0000
Failed to remove key 3abe0000 on device /dev/sdc
Failed to verify 1 device(s)
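For reference, the three `sg_persist` calls that matter in the logs above (register, reserve, preempt and abort) can be written down as argument lists. This is only a sketch: the function names are hypothetical, the flags are copied verbatim from the "Executing:" lines, and nothing is executed against a device.

```python
SG_PERSIST = "/usr/bin/sg_persist"  # path as shown in the log


def register_cmd(device, key):
    # Register and ignore any existing key (-I) with service action key -S.
    return [SG_PERSIST, "-n", "-o", "-I", "-S", key, "-d", device]


def reserve_cmd(device, key, prout_type=5):
    # Reserve (-R) with type 5: Write Exclusive, registrants only.
    return [SG_PERSIST, "-n", "-o", "-R", "-T", str(prout_type), "-K", key, "-d", device]


def preempt_abort_cmd(device, own_key, victim_key, prout_type=5):
    # Mirrors the check seen in the log: a node must not fence itself.
    if own_key == victim_key:
        raise ValueError("keys cannot be same. You can not fence yourself.")
    # Preempt and abort (-A): -K is the caller's key, -S the key to remove.
    return [SG_PERSIST, "-n", "-o", "-A", "-T", str(prout_type),
            "-K", own_key, "-S", victim_key, "-d", device]


# The failing call from the second node, rebuilt from its pieces.
print(" ".join(preempt_abort_cmd("/dev/sdc", "62ed0001", "3abe0000")))
```

The last line reproduces the command that returned 99 with `DID_ERROR` above, which makes it easy to rerun by hand with `-v` added.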
-------
I have also found that, on the SAME node (the one whose key is in place), I'm able to release and unregister the node's key:
rafaeldtinoco@
Msft Virtual Disk 1.0
Peripheral device type: disk
rafaeldtinoco@
Msft Virtual Disk 1.0
Peripheral device type: disk
-------
But on a different node, after the reservation has been set by the first node, I CANNOT release and unregister the node's key:
rafaeldtinoco@
Msft Virtual Disk 1.0
Peripheral device type: disk
PR generation=0x9, 1 registered reservation key follows:
0x3abe0000
rafaeldtinoco@
Msft Virtual Disk 1.0
Peripheral device type: disk
persistent reserve out: transport: Host_status=0x07 [DID_ERROR]
Driver_status=0x00 [DRIVER_OK]
rafaeldtinoco@
Msft Virtual Disk 1.0
Peripheral device type: disk
persistent reserve out: transport: Host_status=0x07 [DID_ERROR]
Driver_status=0x00 [DRIVER_OK]
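When comparing runs across nodes, it can help to pull the host and driver status out of the error text automatically. A small sketch, assuming the `Name=0xNN [LABEL]` layout shown above; `parse_transport_status` and `transport_failed` are hypothetical names, and treating host status 0 as success follows the Linux `DID_OK` convention:

```python
import re

# Matches e.g. "Host_status=0x07 [DID_ERROR]" or "Driver_status=0x00 [DRIVER_OK]".
STATUS_RE = re.compile(r"(Host_status|Driver_status)=(0x[0-9a-fA-F]+)\s+\[(\w+)\]")


def parse_transport_status(output):
    """Map each status name to a (numeric value, label) pair."""
    return {name: (int(value, 16), label)
            for name, value, label in STATUS_RE.findall(output)}


def transport_failed(output):
    """True when a host status is present and is not 0 (DID_OK)."""
    host = parse_transport_status(output).get("Host_status")
    return host is not None and host[0] != 0


# Error text copied from the failing node's output above.
failing = (
    "persistent reserve out: transport: Host_status=0x07 [DID_ERROR]\n"
    "Driver_status=0x00 [DRIVER_OK]\n"
)

print(parse_transport_status(failing))
print(transport_failed(failing))  # True
```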
Changed in fence-agents (Ubuntu): | |
importance: | Undecided → High |
importance: | High → Undecided |
Changed in fence-agents (Ubuntu Bionic): | |
status: | New → Confirmed |
importance: | Undecided → High |
assignee: | nobody → Rafael David Tinoco (rafaeldtinoco) |
Creating an easier reproducer:
Node 01:
rafaeldtinoco@clubionic01:~$ sudo sg_persist --out --register --param-sark=3abe0000 /dev/sdc
Msft Virtual Disk 1.0
Peripheral device type: disk
rafaeldtinoco@clubionic01:~$ sudo sg_persist --out --reserve --param-rk=3abe0000 --prout-type=5 /dev/sdc
Msft Virtual Disk 1.0
Peripheral device type: disk
rafaeldtinoco@clubionic01:~$ sudo sg_persist -r /dev/sdc
Msft Virtual Disk 1.0
Peripheral device type: disk
PR generation=0xb, Reservation follows:
Key=0x3abe0000
scope: LU_SCOPE, type: Write Exclusive, registrants only
----
Node 02 (has to be able to remove node 01's reservation):
rafaeldtinoco@clubionic02:~$ sudo sg_persist -v --out --release --param-rk=3abe0000 --prout-type=5 /dev/sdc
inquiry cdb: 12 00 00 00 24 00
Msft Virtual Disk 1.0
Peripheral device type: disk
Persistent Reservation Out cmd: 5f 02 05 00 00 00 00 00 18 00
persistent reserve out: transport: Host_status=0x07 [DID_ERROR]
Driver_status=0x00 [DRIVER_OK]
PR out (Release): Sense category: -1
The command fails: sg_persist reports a transport error (Host_status=0x07 [DID_ERROR]) and no usable sense category (-1).
----
But if we try to remove the reservation from the same node, it works:
rafaeldtinoco@clubionic01:~$ sudo sg_persist -v --out --release --param-rk=3abe0000 --prout-type=5 /dev/sdc
inquiry cdb: 12 00 00 00 24 00
Msft Virtual Disk 1.0
Peripheral device type: disk
Persistent Reservation Out cmd: 5f 02 05 00 00 00 00 00 18 00
PR out: command (Release) successful
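Both nodes send the same Persistent Reservation Out CDB (`5f 02 05 00 00 00 00 00 18 00`), so the request itself does not differ between the failing and the working run. A short sketch that decodes those bytes, assuming the SPC layout for this command (opcode 0x5f, service action in the low 5 bits of byte 1, scope and type sharing byte 2, parameter list length in bytes 5 to 8); `decode_prout_cdb` is a hypothetical helper and the lookup tables list only common values:

```python
SERVICE_ACTIONS = {
    0x00: "REGISTER",
    0x01: "RESERVE",
    0x02: "RELEASE",
    0x03: "CLEAR",
    0x04: "PREEMPT",
    0x05: "PREEMPT AND ABORT",
    0x06: "REGISTER AND IGNORE EXISTING KEY",
}

RESERVATION_TYPES = {
    0x1: "Write Exclusive",
    0x3: "Exclusive Access",
    0x5: "Write Exclusive, registrants only",
    0x6: "Exclusive Access, registrants only",
}


def decode_prout_cdb(hex_string):
    """Decode a 10-byte PERSISTENT RESERVE OUT CDB given as spaced hex."""
    cdb = bytes.fromhex(hex_string)
    if len(cdb) != 10 or cdb[0] != 0x5F:
        raise ValueError("not a 10-byte PERSISTENT RESERVE OUT CDB")
    return {
        "service_action": SERVICE_ACTIONS.get(cdb[1] & 0x1F, "unknown"),
        "scope": cdb[2] >> 4,  # 0 is LU_SCOPE
        "type": RESERVATION_TYPES.get(cdb[2] & 0x0F, "unknown"),
        "parameter_list_length": int.from_bytes(cdb[5:9], "big"),
    }


# CDB printed by `sg_persist -v` on both nodes.
decoded = decode_prout_cdb("5f 02 05 00 00 00 00 00 18 00")
print(decoded)
```

The decoded fields match the command line: a RELEASE with LU scope, type 5, and a 24-byte parameter list.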