ibmvscsis: Do not send aborted task response

Bug #1689365 reported by bugproxy
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
The Ubuntu-power-systems project
Fix Released
High
Manoj Iyer
linux (Ubuntu)
Incomplete
High
Manoj Iyer
Zesty
Fix Released
Undecided
Unassigned

Bug Description

[Impact]
ibmvscsis: Do not send aborted task response

The driver is sending a response to the actual scsi op that was
aborted by an abort task TM, while LIO is sending a response to
the abort task TM.

ibmvscsis_tgt does not send the response to the client until
release_cmd time. The reason for this was because if we did it
at queue_status time, then the client would be free to reuse the
tag for that command, but we're still using the tag until the
command is released at release_cmd time, so we chose to delay
sending the response until then. That then caused this issue, because
release_cmd is always called, even if queue_status is not.

SCSI spec says that the initiator that sends the abort task
TM NEVER gets a response to the aborted op and with the current
code it will send a response. Thus this fix will remove that response
if the CMD_T_ABORTED && !CMD_T_TAS.

Another case with a small timing window is the case where if LIO sends a
TMR_DOES_NOT_EXIST, and the release_cmd callback is called for the TMR Abort
cmd before the release_cmd for the (attemped) aborted cmd, then we need to
ensure that we send the response for the (attempted) abort cmd to the client
before we send the response for the TMR Abort cmd.

[Test Case]
As per comment #11, this requires sending manual abort signals to trigger the bug.

[Fix]
ibmvscsis: Fix the incorrect req_lim_delta
ibmvscsis: Clear left-over abort_cmd pointers
ibmvscsis: Do not send aborted task response
target: Fix unknown fabric callback queue-full errors

[Regression Potential]
Patches are confined to ibmvscsi driver and target driver.

bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-154092 severity-high targetmilestone-inin1610
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → linux (Ubuntu)
Manoj Iyer (manjo)
Changed in linux (Ubuntu):
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Manoj Iyer (manjo)
Revision history for this message
Manoj Iyer (manjo) wrote :

Can you please test the kernel in https://launchpad.net/~ubuntu-power-triage/+archive/ubuntu/test/ and report back here?

Changed in linux (Ubuntu):
importance: Undecided → High
status: New → Incomplete
Changed in ubuntu-power-systems:
status: New → Incomplete
Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2017-05-10 14:58 EDT-------
Which has also been accepted by maintainer - https://www.spinics.net/lists/linux-scsi/msg108639.html

Revision history for this message
bugproxy (bugproxy) wrote : ibmvscsis: Clear left-over abort_cmd pointers

------- Comment on attachment From <email address hidden> 2017-05-10 14:57 EDT-------

This is an additional needed fix to the original commit 25e78531268e ("ibmvscsis: Do not send aborted task response") that was previously posted here.

Revision history for this message
bugproxy (bugproxy) wrote : ibmvscsis: Fix the incorrect req_lim_delta

------- Comment on attachment From <email address hidden> 2017-05-10 16:17 EDT-------

This patch is also an addition to both:
commit 25e78531268e ("ibmvscsis: Do not send aborted task response")
commit 38b2788edbd6 ("ibmvscsis: Clear left-over abort_cmd pointers")

bugproxy (bugproxy)
tags: added: architecture-ppc64
removed: architecture-ppc64le
Revision history for this message
Manoj Iyer (manjo) wrote :

I am a bit confused by the last couple of comments. The current test kernel has the following patches:

ibmvscsis: Do not send aborted task response
target: Fix unknown fabric callback queue-full errors

are suggesting that in addition to those the following patches are also required?

ibmvscsis: Clear left-over abort_cmd pointers
ibmvscsis: Fix the incorrect req_lim_delta

Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2017-05-15 10:02 EDT-------
ibmvscsis: Clear left-over abort_cmd pointers
ibmvscsis: Fix the incorrect req_lim_delta

Yes these two are also required, after further testing of the original patch, these two patches were pushed out also to address problems with original patch.

Revision history for this message
Manoj Iyer (manjo) wrote :

Will roll up a test kernel once these patches land upstream.

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-05-24 09:58 EDT-------
The patch are upstreamed now, since 4.12 rc1 window is now open.

Manoj Iyer (manjo)
tags: added: ubuntu-17.04
Revision history for this message
Manoj Iyer (manjo) wrote :

I have submitted this patch series for SRU. Please test the kernel in https://launchpad.net/~ubuntu-power-triage/+archive/ubuntu/lp1689365/ and report back here so that the Ubuntu kernel team has added confidence is picking up these patches. I boot tested the kernel on a Power8 and did not find any regressions. A functional test to make sure the patch fixes the issue is useful.

Manoj Iyer (manjo)
description: updated
Manoj Iyer (manjo)
description: updated
Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-06-05 11:15 EDT-------
Works fine, thanks!

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-06-05 11:48 EDT-------
So in terms of testing... I had someone in our internal IBMi OS team manually send abort operations so that it would trigger this bug. I wouldn't know how you would test it since the bug was found through them in the first place.

Manoj Iyer (manjo)
description: updated
Manoj Iyer (manjo)
Changed in ubuntu-power-systems:
importance: Undecided → High
Manoj Iyer (manjo)
tags: added: triage-a
Revision history for this message
Manoj Iyer (manjo) wrote :

I got the following review comments for SRU from kernel team. IBM could you please comment?

Looking at upstream 4.9.y, I see the patches to the ibmvscsis driver [2-4]
applied but not the change to the genric target driver. And I am not seeing any
justification why patch #1 is strictly required.

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-07-31 11:57 EDT-------
The reason target: Fix unknown fabric callback queue-full errors is required is because if ibmvscsi driver returns a non -EAGAIN or -ENOMEM error back to target-core then outgoing responses are leaked. This case is seen in ibmvscsis_write_pending, we added code to return 0 to address cases where if client failed or response queue is down to just return success to LIO, since they can't do anything about it. (This is part of ibmvscsis: Do not send aborted task response patch) But for cases where we attempted to transfer the data and it failed and we return a -EIO, target core would leak. Thus we need the target-core fix to address the 2nd part of that issue.

static int ibmvscsis_write_pending(struct se_cmd *se_cmd)
{
struct ibmvscsis_cmd *cmd = container_of(se_cmd, struct ibmvscsis_cmd,
se_cmd);
struct scsi_info *vscsi = cmd->adapter;
struct iu_entry *iue = cmd->iue;
int rc;

/*
* If CLIENT_FAILED OR RESPONSE_Q_DOWN, then just return success
* since LIO can't do anything about it, and we dont want to
* attempt an srp_transfer_data.
*/
if ((vscsi->flags & (CLIENT_FAILED | RESPONSE_Q_DOWN))) {
pr_err("write_pending failed since: %d\n", vscsi->flags);
return 0;
}

rc = srp_transfer_data(cmd, &vio_iu(iue)->srp.cmd, ibmvscsis_rdma,
1, 1);
if (rc) {
pr_err("srp_transfer_data() failed: %d\n", rc);
return -EIO;
}
/*
* We now tell TCM to add this WRITE CDB directly into the TCM storage
* object execution queue.
*/
target_execute_cmd(se_cmd);
return 0;
}

Changed in linux (Ubuntu Zesty):
status: New → Fix Committed
Manoj Iyer (manjo)
Changed in ubuntu-power-systems:
status: Incomplete → Fix Committed
Revision history for this message
Kleber Sacilotto de Souza (kleber-souza) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-zesty' to 'verification-done-zesty'. If the problem still exists, change the tag 'verification-needed-zesty' to 'verification-failed-zesty'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-zesty
bugproxy (bugproxy)
tags: added: verification-done-zesty
removed: verification-needed-zesty
Changed in ubuntu-power-systems:
assignee: nobody → Manoj Iyer (manjo)
Manoj Iyer (manjo)
tags: added: triage-g
removed: triage-a
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (8.5 KiB)

This bug was fixed in the package linux - 4.10.0-33.37

---------------
linux (4.10.0-33.37) zesty; urgency=low

  * linux: 4.10.0-33.37 -proposed tracker (LP: #1709303)

  * CVE-2017-1000112
    - Revert "udp: consistently apply ufo or fragmentation"
    - udp: consistently apply ufo or fragmentation

  * CVE-2017-1000111
    - Revert "net-packet: fix race in packet_set_ring on PACKET_RESERVE"
    - packet: fix tp_reserve race in packet_set_ring

  * ThunderX: soft lockup on 4.8+ kernels when running qemu-efi with vhost=on
    (LP: #1673564)
    - irqchip/gic-v3: Add missing system register definitions
    - arm64: KVM: Do not use stack-protector to compile EL2 code
    - KVM: arm/arm64: vgic-v3: Use PREbits to infer the number of ICH_APxRn_EL2
      registers
    - KVM: arm/arm64: vgic-v3: Fix nr_pre_bits bitfield extraction
    - arm64: Add a facility to turn an ESR syndrome into a sysreg encoding
    - KVM: arm/arm64: vgic-v3: Add accessors for the ICH_APxRn_EL2 registers
    - KVM: arm64: Make kvm_condition_valid32() accessible from EL2
    - KVM: arm64: vgic-v3: Add hook to handle guest GICv3 sysreg accesses at EL2
    - KVM: arm64: vgic-v3: Add ICV_BPR1_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_IGRPEN1_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_IAR1_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_EOIR1_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_AP1Rn_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_HPPIR1_EL1 handler
    - KVM: arm64: vgic-v3: Enable trapping of Group-1 system registers
    - KVM: arm64: Enable GICv3 Group-1 sysreg trapping via command-line
    - KVM: arm64: vgic-v3: Add ICV_BPR0_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_IGNREN0_EL1 handler
    - KVM: arm64: vgic-v3: Add misc Group-0 handlers
    - KVM: arm64: vgic-v3: Enable trapping of Group-0 system registers
    - KVM: arm64: Enable GICv3 Group-0 sysreg trapping via command-line
    - arm64: Add MIDR values for Cavium cn83XX SoCs
    - [Config] CONFIG_CAVIUM_ERRATUM_30115=y
    - arm64: Add workaround for Cavium Thunder erratum 30115
    - KVM: arm64: vgic-v3: Add ICV_DIR_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_RPR_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_CTLR_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_PMR_EL1 handler
    - KVM: arm64: Enable GICv3 common sysreg trapping via command-line
    - KVM: arm64: vgic-v3: Log which GICv3 system registers are trapped
    - arm64: KVM: Make unexpected reads from WO registers inject an undef
    - KVM: arm64: Log an error if trapping a read-from-write-only GICv3 access
    - KVM: arm64: Log an error if trapping a write-to-read-only GICv3 access

  * ibmvscsis: Do not send aborted task response (LP: #1689365)
    - target: Fix unknown fabric callback queue-full errors
    - ibmvscsis: Do not send aborted task response
    - ibmvscsis: Clear left-over abort_cmd pointers
    - ibmvscsis: Fix the incorrect req_lim_delta

  * hisi_sas performance improvements (LP: #1708734)
    - scsi: hisi_sas: define hisi_sas_device.device_id as int
    - scsi: hisi_sas: optimise the usage of hisi_hba.lock
    - scsi: hisi_sas: relocate sata_done_v2_hw()
    - scsi: hisi_sas: optimise DMA slot memory

  * hisi_sas...

Read more...

Changed in linux (Ubuntu Zesty):
status: Fix Committed → Fix Released
bugproxy (bugproxy)
tags: removed: bugnameltc-154092 severity-high triage-g ubuntu-17.04 verification-done-zesty
bugproxy (bugproxy)
tags: added: bugnameltc-154092 severity-high
Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-08-30 16:30 EDT-------
I added the keyword: verification-done-zesty

tags: added: verification-done-zesty
Changed in ubuntu-power-systems:
status: Fix Committed → Fix Released
bugproxy (bugproxy)
tags: added: targetmilestone-inin1704
removed: targetmilestone-inin1610
Brad Figg (brad-figg)
tags: added: cscc
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.