linux/include
Joseph Qi 4c6994806f blk-throttle: fix race between blkcg_bio_issue_check() and cgroup_rmdir()
We've triggered a WARNING in blk_throtl_bio() when throttling writeback
io, which complains blkg->refcnt is already 0 when calling blkg_get(),
and then kernel crashes with invalid page request.
After investigating this issue, we've found it is caused by a race
between blkcg_bio_issue_check() and cgroup_rmdir(), which is described
below:

writeback kworker               cgroup_rmdir
                                  cgroup_destroy_locked
                                    kill_css
                                      css_killed_ref_fn
                                        css_killed_work_fn
                                          offline_css
                                            blkcg_css_offline
  blkcg_bio_issue_check
    rcu_read_lock
    blkg_lookup
                                              spin_trylock(q->queue_lock)
                                              blkg_destroy
                                              spin_unlock(q->queue_lock)
    blk_throtl_bio
    spin_lock_irq(q->queue_lock)
    ...
    spin_unlock_irq(q->queue_lock)
  rcu_read_unlock

Since rcu can only prevent blkg from releasing when it is being used,
the blkg->refcnt can be decreased to 0 during blkg_destroy() and schedule
blkg release.
Then trying to blkg_get() in blk_throtl_bio() will complains the WARNING.
And then the corresponding blkg_put() will schedule blkg release again,
which result in double free.
This race is introduced by commit ae11889636 ("blkcg: consolidate blkg
creation in blkcg_bio_issue_check()"). Before this commit, it will
lookup first and then try to lookup/create again with queue_lock. Since
revive this logic is a bit drastic, so fix it by only offlining pd during
blkcg_css_offline(), and move the rest destruction (especially
blkg_put()) into blkcg_css_free(), which should be the right way as
discussed.

Fixes: ae11889636 ("blkcg: consolidate blkg creation in blkcg_bio_issue_check()")
Reported-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-16 10:35:12 -06:00
..
acpi ACPICA: Update version to 20180105 2018-02-06 10:32:13 +01:00
asm-generic bug.h: work around GCC PR82365 in BUG() 2018-02-21 15:35:43 -08:00
clocksource
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2018-01-31 14:22:45 -08:00
drm drm/graphics pull request for v4.16-rc1 2018-02-01 17:48:47 -08:00
dt-bindings MIPS changes for 4.16 2018-02-07 11:22:44 -08:00
keys
kvm KVM changes for 4.16 2018-02-10 13:16:35 -08:00
linux blk-throttle: fix race between blkcg_bio_issue_check() and cgroup_rmdir() 2018-03-16 10:35:12 -06:00
math-emu
media vfs: do bulk POLL* -> EPOLL* replacement 2018-02-11 14:34:03 -08:00
memory
misc powerpc updates for 4.16 2018-02-02 10:01:04 -08:00
net udplite: fix partial checksum initialization 2018-02-16 15:57:42 -05:00
pcmcia
ras
rdma IB/uverbs: Use u64_to_user_ptr() not a union 2018-02-15 14:59:45 -07:00
scsi SCSI postmerge on 20180202 2018-02-03 13:07:56 -08:00
soc ARM: SoC driver updates for 4.16 2018-02-01 16:35:31 -08:00
sound Merge branch 'topic/fixes' into for-linus 2018-02-12 09:36:26 +01:00
target
trace Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-02-14 17:02:15 -08:00
uapi blktrace_api.h: fix comment for struct blk_user_trace_setup 2018-02-26 12:26:02 -07:00
video fbdev changes for v4.16: 2018-02-07 13:10:43 -08:00
xen