linux/include
Nicholas Bellinger bd4e2d2907 target: Fix NULL dereference during LUN lookup + active I/O shutdown
When transport_clear_lun_ref() is shutting down a se_lun via
configfs with new I/O in-flight, it's possible to trigger a
NULL pointer dereference in transport_lookup_cmd_lun() due
to the fact percpu_ref_get() doesn't do any __PERCPU_REF_DEAD
checking before incrementing lun->lun_ref.count after
lun->lun_ref has switched to atomic_t mode.

This results in a NULL pointer dereference as LUN shutdown
code in core_tpg_remove_lun() continues running after the
existing ->release() -> core_tpg_lun_ref_release() callback
completes, and clears the RCU protected se_lun->lun_se_dev
pointer.

During the OOPs, the state of lun->lun_ref in the process
which triggered the NULL pointer dereference looks like
the following on v4.1.y stable code:

struct se_lun {
  lun_link_magic = 4294932337,
  lun_status = TRANSPORT_LUN_STATUS_FREE,

  .....

  lun_se_dev = 0x0,
  lun_sep = 0x0,

  .....

  lun_ref = {
    count = {
      counter = 1
    },
    percpu_count_ptr = 3,
    release = 0xffffffffa02fa1e0 <core_tpg_lun_ref_release>,
    confirm_switch = 0x0,
    force_atomic = false,
    rcu = {
      next = 0xffff88154fa1a5d0,
      func = 0xffffffff8137c4c0 <percpu_ref_switch_to_atomic_rcu>
    }
  }
}

To address this bug, use percpu_ref_tryget_live() to ensure
once __PERCPU_REF_DEAD is visable on all CPUs and ->lun_ref
has switched to atomic_t, all new I/Os will fail to obtain
a new lun->lun_ref reference.

Also use an explicit percpu_ref_kill_and_confirm() callback
to block on ->lun_ref_comp to allow the first stage and
associated RCU grace period to complete, and then block on
->lun_ref_shutdown waiting for the final percpu_ref_put()
to drop the last reference via transport_lun_remove_cmd()
before continuing with core_tpg_remove_lun() shutdown.

Reported-by: Rob Millner <rlm@daterainc.com>
Tested-by: Rob Millner <rlm@daterainc.com>
Cc: Rob Millner <rlm@daterainc.com>
Tested-by: Vaibhav Tandon <vst@datera.io>
Cc: Vaibhav Tandon <vst@datera.io>
Tested-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> # v3.14+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-02-26 16:08:44 -08:00
..
acpi More ACPI updates for v4.10-rc1 2016-12-22 10:19:32 -08:00
asm-generic modversions: treat symbol CRCs as 32 bit quantities 2017-02-03 08:28:25 -08:00
clocksource
crypto This pull contains one set of changes: a conversion of the crypto DocBook 2016-12-17 16:00:34 -08:00
drm drm: Don't race connector registration 2017-01-30 10:17:32 +01:00
dt-bindings dt-bindings: mfd: Remove TPS65217 interrupts 2016-12-27 10:06:00 -08:00
keys
kvm KVM: arm64: Access CNTHCTL_EL2 bit fields correctly on VHE systems 2017-01-13 11:19:25 +00:00
linux Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-02-04 12:18:01 -08:00
math-emu
media
memory
misc
net ipv6: fix flow labels when the traffic class is non-0 2017-01-31 13:16:59 -05:00
pcmcia
ras
rdma RDMA/core: Add the function ib_mtu_int_to_enum 2017-01-24 15:34:22 -05:00
rxrpc
scsi Merge remote-tracking branch 'mkp-scsi/fixes' into fixes 2017-01-17 17:32:54 -05:00
soc ARCv2: MCIP: update the BCR per current changes 2017-01-24 11:05:59 -08:00
sound Merge remote-tracking branches 'asoc/fix/arizona', 'asoc/fix/dpcm', 'asoc/fix/dwc', 'asoc/fix/fsl-ssi' and 'asoc/fix/hdmi-codec' into asoc-linus 2017-01-10 10:47:50 +00:00
target target: Fix NULL dereference during LUN lookup + active I/O shutdown 2017-02-26 16:08:44 -08:00
trace Merge branch 'for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2017-01-13 17:40:22 -08:00
uapi uapi: fix linux/target_core_user.h userspace compilation errors 2017-02-18 21:44:59 -08:00
video
xen xen: features and fixes for 4.10 rc0 2016-12-13 16:07:55 -08:00
Kbuild