linux/arch
Will Deacon f069faba68 arm64: mm: Use READ_ONCE when dereferencing pointer to pte table
On kernels built with support for transparent huge pages, different CPUs
can access the PMD concurrently due to e.g. fast GUP or page_vma_mapped_walk
and they must take care to use READ_ONCE to avoid value tearing or caching
of stale values by the compiler. Unfortunately, these functions call into
our pgtable macros, which don't use READ_ONCE, and compiler caching has
been observed to cause the following crash during ext4 writeback:

PC is at check_pte+0x20/0x170
LR is at page_vma_mapped_walk+0x2e0/0x540
[...]
Process doio (pid: 2463, stack limit = 0xffff00000f2e8000)
Call trace:
[<ffff000008233328>] check_pte+0x20/0x170
[<ffff000008233758>] page_vma_mapped_walk+0x2e0/0x540
[<ffff000008234adc>] page_mkclean_one+0xac/0x278
[<ffff000008234d98>] rmap_walk_file+0xf0/0x238
[<ffff000008236e74>] rmap_walk+0x64/0xa0
[<ffff0000082370c8>] page_mkclean+0x90/0xa8
[<ffff0000081f3c64>] clear_page_dirty_for_io+0x84/0x2a8
[<ffff00000832f984>] mpage_submit_page+0x34/0x98
[<ffff00000832fb4c>] mpage_process_page_bufs+0x164/0x170
[<ffff00000832fc8c>] mpage_prepare_extent_to_map+0x134/0x2b8
[<ffff00000833530c>] ext4_writepages+0x484/0xe30
[<ffff0000081f6ab4>] do_writepages+0x44/0xe8
[<ffff0000081e5bd4>] __filemap_fdatawrite_range+0xbc/0x110
[<ffff0000081e5e68>] file_write_and_wait_range+0x48/0xd8
[<ffff000008324310>] ext4_sync_file+0x80/0x4b8
[<ffff0000082bd434>] vfs_fsync_range+0x64/0xc0
[<ffff0000082332b4>] SyS_msync+0x194/0x1e8

This is because page_vma_mapped_walk loads the PMD twice before calling
pte_offset_map: the first time without READ_ONCE (where it gets all zeroes
due to a concurrent pmdp_invalidate) and the second time with READ_ONCE
(where it sees a valid table pointer due to a concurrent pmd_populate).
However, the compiler inlines everything and caches the first value in
a register, which is subsequently used in pte_offset_phys which returns
a junk pointer that is later dereferenced when attempting to access the
relevant pte.

This patch fixes the issue by using READ_ONCE in pte_offset_phys to ensure
that a stale value is not used. Whilst this is a point fix for a known
failure (and simple to backport), a full fix moving all of our page table
accessors over to {READ,WRITE}_ONCE and consistently using READ_ONCE in
page_vma_mapped_walk is in the works for a future kernel release.

Cc: Jon Masters <jcm@redhat.com>
Cc: Timur Tabi <timur@codeaurora.org>
Cc: <stable@vger.kernel.org>
Fixes: f27176cfc3 ("mm: convert page_mkclean_one() to use page_vma_mapped_walk()")
Tested-by: Richard Ruigrok <rruigrok@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-09-29 16:46:43 +01:00
..
alpha Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2017-09-11 18:34:47 -07:00
arc arch: remove unused *_segments() macros/functions 2017-09-22 12:59:52 -10:00
arm arm/syscalls: Optimize address limit check 2017-09-17 19:45:33 +02:00
arm64 arm64: mm: Use READ_ONCE when dereferencing pointer to pte table 2017-09-29 16:46:43 +01:00
blackfin Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2017-09-11 18:34:47 -07:00
c6x arch: remove unused *_segments() macros/functions 2017-09-22 12:59:52 -10:00
cris MTD changes for 4.14: 2017-09-09 14:48:21 -07:00
frv arch: remove unused *_segments() macros/functions 2017-09-22 12:59:52 -10:00
h8300 arch: define CPU_BIG_ENDIAN for all fixed big endian archs 2017-09-08 18:26:48 -07:00
hexagon Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-09-04 11:52:29 -07:00
ia64 Merge branch 'work.ipc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-09-14 17:37:26 -07:00
m32r arch: remove unused *_segments() macros/functions 2017-09-22 12:59:52 -10:00
m68k Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu 2017-09-10 21:07:39 -07:00
metag arch: remove unused *_segments() macros/functions 2017-09-22 12:59:52 -10:00
microblaze Merge branch 'akpm' (patches from Andrew) 2017-09-09 10:30:07 -07:00
mips pci-v4.14-fixes-2 2017-09-22 13:09:11 -10:00
mn10300 arch: remove unused *_segments() macros/functions 2017-09-22 12:59:52 -10:00
nios2 nios2 update for v4.14-rc1 2017-09-15 12:47:21 -07:00
openrisc OpenRISC patches for 4.14 2017-09-13 11:52:18 -07:00
parisc parisc: Unbreak bootloader due to gcc-7 optimizations 2017-09-22 22:26:43 +02:00
powerpc powerpc/pseries: Fix parent_dn reference leak in add_dt_node() 2017-09-21 19:33:16 +10:00
s390 s390/topology: enable / disable topology dynamically 2017-09-20 13:47:55 +02:00
score
sh arch: remove unused *_segments() macros/functions 2017-09-22 12:59:52 -10:00
sparc Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2017-09-11 18:34:47 -07:00
tile Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile 2017-09-22 13:02:54 -10:00
um arch: remove unused *_segments() macros/functions 2017-09-22 12:59:52 -10:00
unicore32
x86 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-09-24 12:33:58 -07:00
xtensa arch: remove unused *_segments() macros/functions 2017-09-22 12:59:52 -10:00
.gitignore
Kconfig - For the randstruct plugin, enable automatic randomization of structures 2017-09-07 20:30:19 -07:00