linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-24 20:01:55 +00:00

Author	SHA1	Message	Date
Andrea Arcangeli	26c191788f	mm: pmd_read_atomic: fix 32bit PAE pmd walk vs pmd_populate SMP race condition When holding the mmap_sem for reading, pmd_offset_map_lock should only run on a pmd_t that has been read atomically from the pmdp pointer, otherwise we may read only half of it leading to this crash. PID: 11679 TASK: f06e8000 CPU: 3 COMMAND: "do_race_2_panic" #0 [f06a9dd8] crash_kexec at c049b5ec #1 [f06a9e2c] oops_end at c083d1c2 #2 [f06a9e40] no_context at c0433ded #3 [f06a9e64] bad_area_nosemaphore at c043401a #4 [f06a9e6c] __do_page_fault at c0434493 #5 [f06a9eec] do_page_fault at c083eb45 #6 [f06a9f04] error_code (via page_fault) at c083c5d5 EAX: 01fb470c EBX: fff35000 ECX: 00000003 EDX: 00000100 EBP: 00000000 DS: 007b ESI: 9e201000 ES: 007b EDI: 01fb4700 GS: 00e0 CS: 0060 EIP: c083bc14 ERR: ffffffff EFLAGS: 00010246 #7 [f06a9f38] _spin_lock at c083bc14 #8 [f06a9f44] sys_mincore at c0507b7d #9 [f06a9fb0] system_call at c083becd start len EAX: ffffffda EBX: 9e200000 ECX: 00001000 EDX: 6228537f DS: 007b ESI: 00000000 ES: 007b EDI: 003d0f00 SS: 007b ESP: 62285354 EBP: 62285388 GS: 0033 CS: 0073 EIP: 00291416 ERR: 000000da EFLAGS: 00000286 This should be a longstanding bug affecting x86 32bit PAE without THP. Only archs with 64bit large pmd_t and 32bit unsigned long should be affected. With THP enabled the barrier() in pmd_none_or_trans_huge_or_clear_bad() would partly hide the bug when the pmd transition from none to stable, by forcing a re-read of the pmd in pmd_offset_map_lock, but when THP is enabled a new set of problem arises by the fact could then transition freely in any of the none, pmd_trans_huge or pmd_trans_stable states. So making the barrier in pmd_none_or_trans_huge_or_clear_bad() unconditional isn't good idea and it would be a flakey solution. This should be fully fixed by introducing a pmd_read_atomic that reads the pmd in order with THP disabled, or by reading the pmd atomically with cmpxchg8b with THP enabled. Luckily this new race condition only triggers in the places that must already be covered by pmd_none_or_trans_huge_or_clear_bad() so the fix is localized there but this bug is not related to THP. NOTE: this can trigger on x86 32bit systems with PAE enabled with more than 4G of ram, otherwise the high part of the pmd will never risk to be truncated because it would be zero at all times, in turn so hiding the SMP race. This bug was discovered and fully debugged by Ulrich, quote: ---- [..] pmd_none_or_trans_huge_or_clear_bad() loads the content of edx and eax. 496 static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t pmd) 497 { 498 /* depend on compiler for an atomic pmd read / 499 pmd_t pmdval = pmd; // edi = pmd pointer 0xc0507a74 <sys_mincore+548>: mov 0x8(%esp),%edi ... // edx = PTE page table high address 0xc0507a84 <sys_mincore+564>: mov 0x4(%edi),%edx ... // eax = PTE page table low address 0xc0507a8e <sys_mincore+574>: mov (%edi),%eax [..] Please note that the PMD is not read atomically. These are two "mov" instructions where the high order bits of the PMD entry are fetched first. Hence, the above machine code is prone to the following race. - The PMD entry {high\|low} is 0x0000000000000000. The "mov" at 0xc0507a84 loads 0x00000000 into edx. - A page fault (on another CPU) sneaks in between the two "mov" instructions and instantiates the PMD. - The PMD entry {high\|low} is now 0x00000003fda38067. The "mov" at 0xc0507a8e loads 0xfda38067 into eax. ---- Reported-by: Ulrich Obergfell <uobergfe@redhat.com> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Petr Matousek <pmatouse@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-29 16:22:24 -07:00
Shaohua Li	4981d01ead	x86: Flush TLB if PGD entry is changed in i386 PAE mode According to intel CPU manual, every time PGD entry is changed in i386 PAE mode, we need do a full TLB flush. Current code follows this and there is comment for this too in the code. But current code misses the multi-threaded case. A changed page table might be used by several CPUs, every such CPU should flush TLB. Usually this isn't a problem, because we prepopulate all PGD entries at process fork. But when the process does munmap and follows new mmap, this issue will be triggered. When it happens, some CPUs keep doing page faults: http://marc.info/?l=linux-kernel&m=129915020508238&w=2 Reported-by: Yasunori Goto<y-goto@jp.fujitsu.com> Tested-by: Yasunori Goto<y-goto@jp.fujitsu.com> Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Shaohua Li<shaohua.li@intel.com> Cc: Mallick Asit K <asit.k.mallick@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-mm <linux-mm@kvack.org> Cc: stable <stable@kernel.org> LKML-Reference: <1300246649.2337.95.camel@sli10-conroe> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-03-18 11:44:01 +01:00
Johannes Weiner	f2d6bfe9ff	thp: add x86 32bit support Add support for transparent hugepages to x86 32bit. Share the same VM_ bitflag for VM_MAPPED_COPY. mm/nommu.c will never support transparent hugepages. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-13 17:32:44 -08:00
Jeremy Fitzhardinge	71ff49d71b	x86: with the last user gone, remove set_pte_present Impact: cleanup set_pte_present() is no longer used, directly or indirectly, so remove it. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Xen-devel <xen-devel@lists.xensource.com> Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Alok Kataria <akataria@vmware.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Avi Kivity <avi@redhat.com> LKML-Reference: <1237406613-2929-2-git-send-email-jeremy@goop.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-19 14:04:19 +01:00
Jeremy Fitzhardinge	deb79cfb36	x86: unify pud_none Impact: cleanup Unify and demacro pud_none. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2009-02-06 12:31:51 -08:00
Jeremy Fitzhardinge	a61bb29af4	x86: unify pgd_bad Impact: cleanup Unify and demacro pgd_bad. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2009-02-06 12:31:50 -08:00
Jeremy Fitzhardinge	01ade20d5a	x86: unify pmd_offset Impact: cleanup Unify and demacro pmd_offset. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2009-02-06 12:31:49 -08:00
Jeremy Fitzhardinge	f476961cb1	x86: unify pud_page Impact: cleanup Unify and demacro pud_page. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2009-02-06 12:31:48 -08:00
Jeremy Fitzhardinge	6fff47e3ac	x86: unify pud_page_vaddr Impact: cleanup Unify and demacro pud_page_vaddr. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2009-02-06 12:31:48 -08:00
Jeremy Fitzhardinge	5ba7c91341	x86: unify pud_present Impact: cleanup Unify and demacro pud_present. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2009-02-06 12:31:07 -08:00
Jeremy Fitzhardinge	8de01da35e	x86: unify pte_same Impact: cleanup Unify and demacro pte_same. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2009-02-06 12:26:08 -08:00
Jeremy Fitzhardinge	a034a010f4	x86: unify pte_none Impact: cleanup Unify and demacro pte_none. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2009-02-06 12:26:08 -08:00
Jan Beulich	1796316a8b	x86: consolidate __swp_XXX() macros Impact: cleanup, code robustization The __swp_...() macros silently relied upon which bits are used for _PAGE_FILE and _PAGE_PROTNONE. After having changed _PAGE_PROTNONE in our Xen kernel to no longer overlap _PAGE_PAT, live locks and crashes were reported that could have been avoided if these macros properly used the symbolic constants. Since, as pointed out earlier, for Xen Dom0 support mainline likewise will need to eliminate the conflict between _PAGE_PAT and _PAGE_PROTNONE, this patch does all the necessary adjustments, plus it introduces a mechanism to check consistency between MAX_SWAPFILES_SHIFT and the actual encoding macros. This also fixes a latent bug in that x86-64 used a 6-bit mask in __swp_type(), and if MAX_SWAPFILES_SHIFT was increased beyond 5 in (the seemingly unrelated) linux/swap.h, this would have resulted in a collision with _PAGE_FILE. Non-PAE 32-bit code gets similarly adjusted for its pte_to_pgoff() and pgoff_to_pte() calculations. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-16 18:34:51 +01:00
Jan Beulich	ab00fee30c	i386/PAE: fix pud_page() Impact: cleanup To the unsuspecting user it is quite annoying that this broken and inconsistent with x86-64 definition still exists. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-30 11:47:50 +01:00
H. Peter Anvin	1965aae3c9	x86: Fix ASM_X86__ header guards Change header guards named "ASM_X86__" to "_ASM_X86_" since: a. the double underscore is ugly and pointless. b. no leading underscore violates namespace constraints. Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-10-22 22:55:23 -07:00
Al Viro	bb8985586b	x86, um: ... and asm-x86 move Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-10-22 22:55:20 -07:00

16 Commits