It is an error to request a disabled XSAVE/XSAVES component address.
For that case, make __raw_xsave_addr() return a NULL and issue a
warning.
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Cc: H. Peter Anvin <h.peter.anvin@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1468253937-40008-3-git-send-email-fenghua.yu@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When the kernel is using XSAVES compacted format, we cannot do
__copy_from_user() from a signal frame, which has standard-format data.
Fix it by using copyin_to_xsaves(), which converts between formats and
filters out all supervisor states that we do not allow userspace to
write.
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Cc: H. Peter Anvin <h.peter.anvin@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1468253937-40008-2-git-send-email-fenghua.yu@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The arrays xstate_offsets[] and xstate_sizes[] record XSAVE standard-
format offsets and sizes. Values for non-extended state components
fpu and xmm's were not initialized or used. Ptrace format conversion
needs them. Fix it.
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/cf3ea36cf30e2a99e37da6483e65446d018ff0a7.1466179491.git.yu-cheng.yu@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Component offset print out was incorrect for XSAVES. Correct it and move
to a separate function.
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/86602a8ac400626c6eca7125c3e15934866fc38e.1466179491.git.yu-cheng.yu@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
XSAVES uses compacted format and is a kernel instruction. The kernel
should use standard-format, non-supervisor state data for PTRACE.
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
[ Edited away artificial linebreaks. ]
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/de3d80949001305fe389799973b675cab055c457.1466179491.git.yu-cheng.yu@intel.com
[ Made various readability edits. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
CPUID function 0x0d, sub function (i, i > 1) returns in ebx the offset of
xstate component i. Zero is returned for a supervisor state. A supervisor
state can only be saved by XSAVES and XSAVES uses a compacted format.
There is no fixed offset for a supervisor state. This patch checks and
makes sure a supervisor state offset is not recorded or mis-used. This has
no effect in practice as we currently use no supervisor states, but it
would be good to fix.
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/81b29e40d35d4cec9f2511a856fe769f34935a3f.1466179491.git.yu-cheng.yu@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
CPUID function 0x0d, sub function (i, i > 1) returns in ecx[1] the
alignment requirement of component 'i' when the compacted format is used.
If ecx[1] is 0, component 'i' is located immediately following the preceding
component. If ecx[1] is 1, component 'i' is located on the next 64-byte
boundary following the preceding component.
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/331e2bef1a0a7a584f06adde095b6bbfbe166472.1466179491.git.yu-cheng.yu@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull x86 fixes from Ingo Molnar:
"Three fixes:
- A boot crash fix with certain configs
- a MAINTAINERS entry update
- Documentation typo fixes"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/Documentation: Fix various typos in Documentation/x86/ files
x86/amd_nb: Fix boot crash on non-AMD systems
MAINTAINERS: Update the Calgary IOMMU entry
Fix boot crash that triggers if this driver is built into a kernel and
run on non-AMD systems.
AMD northbridges users call amd_cache_northbridges() and it returns
a negative value to signal that we weren't able to cache/detect any
northbridges on the system.
At least, it should do so as all its callers expect it to do so. But it
does return a negative value only when kmalloc() fails.
Fix it to return -ENODEV if there are no NBs cached as otherwise, amd_nb
users like amd64_edac, for example, which relies on it to know whether
it should load or not, gets loaded on systems like Intel Xeons where it
shouldn't.
Reported-and-tested-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1466097230-5333-2-git-send-email-bp@alien8.de
Link: https://lkml.kernel.org/r/5761BEB0.9000807@cybernetics.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
There is a generic function __pvclock_read_cycles to be used to get both
flags and cycles. For function pvclock_read_flags, it's useless to get
cycles value. To make this function be more effective, get this variable
flags directly in function.
Signed-off-by: Minfei Huang <mnghuan@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Protocol for the "version" fields is: hypervisor raises it (making it
uneven) before it starts updating the fields and raises it again (making
it even) when it is done. Thus the guest can make sure the time values
it got are consistent by checking the version before and after reading
them.
Add CPU barries after getting version value just like what function
vread_pvclock does, because all of callees in this function is inline.
Fixes: 502dfeff23
Cc: stable@vger.kernel.org
Signed-off-by: Minfei Huang <mnghuan@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull x86 kprobe fix from Thomas Gleixner:
"A single fix clearing the TF bit when a fault is single stepped"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
kprobes/x86: Clear TF bit in fault on single-stepping
Merge misc fixes from Andrew Morton:
"Two weeks worth of fixes here"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (41 commits)
init/main.c: fix initcall_blacklisted on ia64, ppc64 and parisc64
autofs: don't get stuck in a loop if vfs_write() returns an error
mm/page_owner: avoid null pointer dereference
tools/vm/slabinfo: fix spelling mistake: "Ocurrences" -> "Occurrences"
fs/nilfs2: fix potential underflow in call to crc32_le
oom, suspend: fix oom_reaper vs. oom_killer_disable race
ocfs2: disable BUG assertions in reading blocks
mm, compaction: abort free scanner if split fails
mm: prevent KASAN false positives in kmemleak
mm/hugetlb: clear compound_mapcount when freeing gigantic pages
mm/swap.c: flush lru pvecs on compound page arrival
memcg: css_alloc should return an ERR_PTR value on error
memcg: mem_cgroup_migrate() may be called with irq disabled
hugetlb: fix nr_pmds accounting with shared page tables
Revert "mm: disable fault around on emulated access bit architecture"
Revert "mm: make faultaround produce old ptes"
mailmap: add Boris Brezillon's email
mailmap: add Antoine Tenart's email
mm, sl[au]b: add __GFP_ATOMIC to the GFP reclaim mask
mm: mempool: kasan: don't poot mempool objects in quarantine
...
__GFP_REPEAT has a rather weak semantic but since it has been introduced
around 2.6.12 it has been ignored for low order allocations.
PGALLOC_GFP uses __GFP_REPEAT but none of the allocation which uses this
flag is for more than order-0. This means that this flag has never been
actually useful here because it has always been used only for
PAGE_ALLOC_COSTLY requests.
Link: http://lkml.kernel.org/r/1464599699-30131-3-git-send-email-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As the actual pointer value is the same for the thread stack allocation
and the thread_info, code that confused the two worked fine, but will
break when the thread info is moved away from the stack allocation. It
also looks very confusing.
For example, the kprobe code wanted to know the current top of stack.
To do that, it used this:
(unsigned long)current_thread_info() + THREAD_SIZE
which did indeed give the correct value. But it's not only a fairly
nonsensical expression, it's also rather complex, especially since we
actually have this:
static inline unsigned long current_top_of_stack(void)
which not only gives us the value we are interested in, but happens to
be how "current_thread_info()" is currently defined as:
(struct thread_info *)(current_top_of_stack() - THREAD_SIZE);
so using current_thread_info() to figure out the top of the stack really
is a very round-about thing to do.
The other cases are just simpler confusion about task_thread_info() vs
task_stack_page(), which currently return the same pointer - but if you
want the stack page, you really should be using the latter one.
And there was one entirely unused assignment of the current stack to a
thread_info pointer.
All cleaned up to make more sense today, and make it easier to move the
thread_info away from the stack in the future.
No semantic changes.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
None of the code actually wants a thread_info, it all wants a
task_struct, and it's just converting to a thread_info pointer much too
early.
No semantic change.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
XSAVES is a kernel instruction and uses a compacted format. When working
with user space, the kernel should provide standard-format, non-supervisor
state data. We cannot do __copy_to_user() from a compacted-format kernel
xstate area to a signal frame.
Dave Hansen proposes this method to simplify copy xstate directly to user.
This patch is based on an earlier patch from Fenghua Yu <fenghua.yu@intel.com>
Originally-from: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/c36f419d525517d04209a28dd8e1e5af9000036e.1463760376.git.yu-cheng.yu@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Keep init_fpstate.xsave.header.xfeatures as zero for init optimization.
This is important for init optimization that is implemented in processor.
If a bit corresponding to an xstate in xstate_bv is 0, it means the
xstate is in init status and will not be read from memory to the processor
during XRSTOR/XRSTORS instruction. This largely impacts context switch
performance.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/2fb4ec7f18b76e8cda057a8c0038def74a9b8044.1463760376.git.yu-cheng.yu@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
User space uses standard format xsave area. fpstate in signal frame
should have standard format size.
To explicitly distinguish between xstate size in kernel space and the
one in user space, we rename 'xstate_size' to 'fpu_kernel_xstate_size'.
Cleanup only, no change in functionality.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
[ Rebased the patch and cleaned up the naming. ]
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/2ecbae347a5152d94be52adf7d0f3b7305d90d99.1463760376.git.yu-cheng.yu@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The kernel xstate area can be in standard or compacted format;
it is always in standard format for user mode. When XSAVES is
enabled, the kernel uses the compacted format and it is necessary
to use a separate fpu_user_xstate_size for signal/ptrace frames.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
[ Rebased the patch and cleaned up the naming. ]
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/8756ec34dabddfc727cda5743195eb81e8caf91c.1463760376.git.yu-cheng.yu@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fix kprobe_fault_handler() to clear the TF (trap flag) bit of
the flags register in the case of a fault fixup on single-stepping.
If we put a kprobe on the instruction which caused a
page fault (e.g. actual mov instructions in copy_user_*),
that fault happens on the single-stepping buffer. In this
case, kprobes resets running instance so that the CPU can
retry execution on the original ip address.
However, current code forgets to reset the TF bit. Since this
fault happens with TF bit set for enabling single-stepping,
when it retries, it causes a debug exception and kprobes
can not handle it because it already reset itself.
On the most of x86-64 platform, it can be easily reproduced
by using kprobe tracer. E.g.
# cd /sys/kernel/debug/tracing
# echo p copy_user_enhanced_fast_string+5 > kprobe_events
# echo 1 > events/kprobes/enable
And you'll see a kernel panic on do_debug(), since the debug
trap is not handled by kprobes.
To fix this problem, we just need to clear the TF bit when
resetting running kprobe.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: systemtap@sourceware.org
Cc: stable@vger.kernel.org # All the way back to ancient kernels
Link: http://lkml.kernel.org/r/20160611140648.25885.37482.stgit@devbox
[ Updated the comments. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
On a 4-socket Brickland system, hot-removing one ioapic is fine.
Hot-removing the 2nd one causes panic in mp_unregister_ioapic()
while calling release_resource().
It is because the iomem_res pointer has already been released
when removing the first ioapic.
To explain the use of &res[num] here: res is assigned to ioapic_resources,
and later in ioapic_insert_resources() we do:
struct resource *r = ioapic_resources;
for_each_ioapic(i) {
insert_resource(&iomem_resource, r);
r++;
}
Here 'r' is treated as an arry of 'struct resource', and the r++ ensures
that each element of the array is inserted separately. Thus we should call
release_resouce() on each element at &res[num].
Fix it by assigning the correct pointers to ioapics[i].iomem_res in
ioapic_setup_resources().
Signed-off-by: Rui Wang <rui.y.wang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: tony.luck@intel.com
Cc: linux-pci@vger.kernel.org
Cc: rjw@rjwysocki.net
Cc: linux-acpi@vger.kernel.org
Cc: bhelgaas@google.com
Link: http://lkml.kernel.org/r/1465369193-4816-3-git-send-email-rui.y.wang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Forcing in_interrupt() to return true if we're not in a bona fide
interrupt confuses the softirq code. This fixes warnings like:
NOHZ: local_softirq_pending 282
... which can happen when running things like selftests/x86.
This will change perf's static percpu buffer usage in IST context.
I think this is okay, and it's changing the behavior to match
historical (pre-4.0) behavior.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Fixes: 9592747538 ("x86, traps: Track entry into and exit from IST context")
Link: http://lkml.kernel.org/r/cdc215f94d118d691d73df35275022331156fb45.1464130360.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We need to reenable the topology extensions CPUID leafs on newer models
too, if BIOS has disabled them, as we rely on them to get proper compute
unit topology.
Make the printk a once thing, while at it.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rui Huang <ray.huang@amd.com>
Cc: Sherry Hurwitz <sherry.hurwitz@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-hwmon@vger.kernel.org
Link: http://lkml.kernel.org/r/1464775468-23355-1-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
I've been carrying this patch around for a bit and it's helped me
solve at least a couple FPU-related bugs. In addition to using
it for debugging, I also drug it out because using AVX (and
AVX2/AVX-512) can have serious power consequences for a modern
core. It's very important to be able to figure out who is using
it.
It's also insanely useful to go out and see who is using a given
feature, like MPX or Memory Protection Keys. If you, for
instance, want to find all processes using protection keys, you
can do:
echo 'xfeatures & 0x200' > filter
Since 0x200 is the protection keys feature bit.
Note that this touches the KVM code. KVM did a CREATE_TRACE_POINTS
and then included a bunch of random headers. If anyone one of
those included other tracepoints, it would have defined the *OTHER*
tracepoints. That's bogus, so move it to the right place.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160601174220.3CDFB90E@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull x86 fixes from Ingo Molnar:
"Misc fixes: EFI, entry code, pkeys and MPX fixes, TASK_SIZE cleanups
and a tsc frequency table fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Switch from TASK_SIZE to TASK_SIZE_MAX in the page fault code
x86/fsgsbase/64: Use TASK_SIZE_MAX for FSBASE/GSBASE upper limits
x86/mm/mpx: Work around MPX erratum SKD046
x86/entry/64: Fix stack return address retrieval in thunk
x86/efi: Fix 7-parameter efi_call()s
x86/cpufeature, x86/mm/pkeys: Fix broken compile-time disabling of pkeys
x86/tsc: Add missing Cherrytrail frequency to the table
Implement the protection method for the crash kernel memory reservation
for the 64-bit x86 kdump.
Signed-off-by: Xunlei Pang <xlpang@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Minfei Huang <mhuang@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1) I forgot that I had another selftest to stress test the ftrace
instance creation. It was actually suppose to go into the 4.6
merge window, but I never committed it. I almost forgot about it
again, but noticed it was missing from your tree.
2) Soumya PN sent me a clean up patch to not disable interrupts when
taking the tasklist_lock for read, as it's unnecessary because
that lock is never taken for write in irq context.
3) Newer gcc's can cause the jump in the function_graph code to the
global ftrace_stub label to be a short jump instead of a long one.
As that jump is dynamically converted to jump to the trace code to
do function graph tracing, and that conversion expects a long jump
it can corrupt the ftrace_stub itself (it's directly after that call).
One way to prevent gcc from using a short jump is to declare the
ftrace_stub as a weak function, which we do here to keep gcc from
optimizing too much.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJXQhYQAAoJEKKk/i67LK/82pAH/3XzRCP366HqWnKdvluPB8vX
UnVoXGAX1Eh2ZpvlPIJBXNYOZlnGRMMMAoeI+su31FoJHrzTzfGXvRynTkZPFZtd
XakvHfACjtGtvi2MuCN1t9/d1ty/ob2o05KB9qc+JRlzHM09qTL/HX8hwZeEsMQ4
NYgEY4Y727LOSCrJieLktchpwtie77q8Wq25oiWIVWOyDjpCsPnZyaOqaQSANot9
Gd00cixbMam7Ba1BjoRsRQZaT2pYZ8vt7HDXDBfAOW1oOjalWARLhRg/zww1V3WD
DEptuEeyAgMJS3v76Z6Sbk/QM7hyGUWCcmC2qaN1yc2n1Sh+zBOiN1eyiiUh/2U=
=ERxv
-----END PGP SIGNATURE-----
Merge tag 'trace-v4.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull motr tracing updates from Steven Rostedt:
"Three more changes.
- I forgot that I had another selftest to stress test the ftrace
instance creation. It was actually suppose to go into the 4.6
merge window, but I never committed it. I almost forgot about it
again, but noticed it was missing from your tree.
- Soumya PN sent me a clean up patch to not disable interrupts when
taking the tasklist_lock for read, as it's unnecessary because that
lock is never taken for write in irq context.
- Newer gcc's can cause the jump in the function_graph code to the
global ftrace_stub label to be a short jump instead of a long one.
As that jump is dynamically converted to jump to the trace code to
do function graph tracing, and that conversion expects a long jump
it can corrupt the ftrace_stub itself (it's directly after that
call). One way to prevent gcc from using a short jump is to
declare the ftrace_stub as a weak function, which we do here to
keep gcc from optimizing too much"
* tag 'trace-v4.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftrace/x86: Set ftrace_stub to weak to prevent gcc from using short jumps to it
ftrace: Don't disable irqs when taking the tasklist_lock read_lock
ftracetest: Add instance created, delete, read and enable event test
printk() takes some locks and could not be used a safe way in NMI
context.
The chance of a deadlock is real especially when printing stacks from
all CPUs. This particular problem has been addressed on x86 by the
commit a9edc88093 ("x86/nmi: Perform a safe NMI stack trace on all
CPUs").
The patchset brings two big advantages. First, it makes the NMI
backtraces safe on all architectures for free. Second, it makes all NMI
messages almost safe on all architectures (the temporary buffer is
limited. We still should keep the number of messages in NMI context at
minimum).
Note that there already are several messages printed in NMI context:
WARN_ON(in_nmi()), BUG_ON(in_nmi()), anything being printed out from MCE
handlers. These are not easy to avoid.
This patch reuses most of the code and makes it generic. It is useful
for all messages and architectures that support NMI.
The alternative printk_func is set when entering and is reseted when
leaving NMI context. It queues IRQ work to copy the messages into the
main ring buffer in a safe context.
__printk_nmi_flush() copies all available messages and reset the buffer.
Then we could use a simple cmpxchg operations to get synchronized with
writers. There is also used a spinlock to get synchronized with other
flushers.
We do not longer use seq_buf because it depends on external lock. It
would be hard to make all supported operations safe for a lockless use.
It would be confusing and error prone to make only some operations safe.
The code is put into separate printk/nmi.c as suggested by Steven
Rostedt. It needs a per-CPU buffer and is compiled only on
architectures that call nmi_enter(). This is achieved by the new
HAVE_NMI Kconfig flag.
The are MN10300 and Xtensa architectures. We need to clean up NMI
handling there first. Let's do it separately.
The patch is heavily based on the draft from Peter Zijlstra, see
https://lkml.org/lkml/2015/6/10/327
[arnd@arndb.de: printk-nmi: use %zu format string for size_t]
[akpm@linux-foundation.org: min_t->min - all types are size_t here]
Signed-off-by: Petr Mladek <pmladek@suse.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Jan Kara <jack@suse.cz>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> [arm part]
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Jiri Kosina <jkosina@suse.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: David Miller <davem@davemloft.net>
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We need to call exit_thread from copy_process in a fail path. So make it
accept task_struct as a parameter.
[v2]
* s390: exit_thread_runtime_instr doesn't make sense to be called for
non-current tasks.
* arm: fix the comment in vfp_thread_copy
* change 'me' to 'tsk' for task_struct
* now we can change only archs that actually have exit_thread
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David Howells <dhowells@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matt Fleming reported seeing crashes when enabling and disabling
function profiling which uses function graph tracer. Later Namhyung Kim
hit a similar issue and he found that the issue was due to the jmp to
ftrace_stub in ftrace_graph_call was only two bytes, and when it was
changed to jump to the tracing code, it overwrote the ftrace_stub that
was after it.
Masami Hiramatsu bisected this down to a binutils change:
8dcea93252a9ea7dff57e85220a719e2a5e8ab41 is the first bad commit
commit 8dcea93252a9ea7dff57e85220a719e2a5e8ab41
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri May 15 03:17:31 2015 -0700
Add -mshared option to x86 ELF assembler
This patch adds -mshared option to x86 ELF assembler. By default,
assembler will optimize out non-PLT relocations against defined non-weak
global branch targets with default visibility. The -mshared option tells
the assembler to generate code which may go into a shared library
where all non-weak global branch targets with default visibility can
be preempted. The resulting code is slightly bigger. This option
only affects the handling of branch instructions.
Declaring ftrace_stub as a weak call prevents gas from using two byte
jumps to it, which would be converted to a jump to the function graph
code.
Link: http://lkml.kernel.org/r/20160516230035.1dbae571@gandalf.local.home
Reported-by: Matt Fleming <matt@codeblueprint.co.uk>
Reported-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Matt Fleming <matt@codeblueprint.co.uk>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The GSBASE upper limit exists to prevent user code from confusing
the paranoid idtentry path. The FSBASE upper limit is just for
consistency. There's no need to enforce a smaller limit for 32-bit
tasks.
Just use TASK_SIZE_MAX. This simplifies the logic and will save a
few bytes of code.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/5357f2fe0f103eabf005773b70722451eab09a89.1462897104.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This erratum essentially causes the CPU to forget which privilege
level it is operating on (kernel vs. user) for the purposes of MPX.
This erratum can only be triggered when a system is not using
Supervisor Mode Execution Prevention (SMEP). Our workaround for
the erratum is to ensure that MPX can only be used in cases where
SMEP is present in the processor and is enabled.
This erratum only affects Core processors. Atom is unaffected.
But, there is no architectural way to determine Atom vs. Core.
So, we just apply this workaround to all processors. It's
possible that it will mistakenly disable MPX on some Atom
processsors or future unaffected Core processors. There are
currently no processors that have MPX and not SMEP. It would
take something akin to a hypervisor masking SMEP out on an Atom
processor for this to present itself on current hardware.
More details can be found at:
http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/desktop-6th-gen-core-family-spec-update.pdf
"
SKD046 Branch Instructions May Initialize MPX Bound Registers Incorrectly
Problem:
Depending on the current Intel MPX (Memory Protection
Extensions) configuration, execution of certain branch
instructions (near CALL, near RET, near JMP, and Jcc
instructions) without a BND prefix (F2H) initialize the MPX bound
registers. Due to this erratum, such a branch instruction that is
executed both with CPL = 3 and with CPL < 3 may not use the
correct MPX configuration register (BNDCFGU or BNDCFGS,
respectively) for determining whether to initialize the bound
registers; it may thus initialize the bound registers when it
should not, or fail to initialize them when it should.
Implication:
A branch instruction that has executed both in user mode and in
supervisor mode (from the same linear address) may cause a #BR
(bound range fault) when it should not have or may not cause a
#BR when it should have. Workaround An operating system can
avoid this erratum by setting CR4.SMEP[bit 20] to enable
supervisor-mode execution prevention (SMEP). When SMEP is
enabled, no code can be executed both with CPL = 3 and with CPL < 3.
"
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160512220400.3B35F1BC@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull security subsystem updates from James Morris:
"Highlights:
- A new LSM, "LoadPin", from Kees Cook is added, which allows forcing
of modules and firmware to be loaded from a specific device (this
is from ChromeOS, where the device as a whole is verified
cryptographically via dm-verity).
This is disabled by default but can be configured to be enabled by
default (don't do this if you don't know what you're doing).
- Keys: allow authentication data to be stored in an asymmetric key.
Lots of general fixes and updates.
- SELinux: add restrictions for loading of kernel modules via
finit_module(). Distinguish non-init user namespace capability
checks. Apply execstack check on thread stacks"
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (48 commits)
LSM: LoadPin: provide enablement CONFIG
Yama: use atomic allocations when reporting
seccomp: Fix comment typo
ima: add support for creating files using the mknodat syscall
ima: fix ima_inode_post_setattr
vfs: forbid write access when reading a file into memory
fs: fix over-zealous use of "const"
selinux: apply execstack check on thread stacks
selinux: distinguish non-init user namespace capability checks
LSM: LoadPin for kernel file loading restrictions
fs: define a string representation of the kernel_read_file_id enumeration
Yama: consolidate error reporting
string_helpers: add kstrdup_quotable_file
string_helpers: add kstrdup_quotable_cmdline
string_helpers: add kstrdup_quotable
selinux: check ss_initialized before revalidating an inode label
selinux: delay inode label lookup as long as possible
selinux: don't revalidate an inode's label when explicitly setting it
selinux: Change bool variable name to index.
KEYS: Add KEYCTL_DH_COMPUTE command
...
Pull livepatching updates from Jiri Kosina:
- remove of our own implementation of architecture-specific relocation
code and leveraging existing code in the module loader to perform
arch-dependent work, from Jessica Yu.
The relevant patches have been acked by Rusty (for module.c) and
Heiko (for s390).
- live patching support for ppc64le, which is a joint work of Michael
Ellerman and Torsten Duwe. This is coming from topic branch that is
share between livepatching.git and ppc tree.
- addition of livepatching documentation from Petr Mladek
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
livepatch: make object/func-walking helpers more robust
livepatch: Add some basic livepatch documentation
powerpc/livepatch: Add live patching support on ppc64le
powerpc/livepatch: Add livepatch stack to struct thread_info
powerpc/livepatch: Add livepatch header
livepatch: Allow architectures to specify an alternate ftrace location
ftrace: Make ftrace_location_range() global
livepatch: robustify klp_register_patch() API error checking
Documentation: livepatch: outline Elf format and requirements for patch modules
livepatch: reuse module loader code to write relocations
module: s390: keep mod_arch_specific for livepatch modules
module: preserve Elf information for livepatch modules
Elf: add livepatch-specific Elf constants
- In-kernel ACPICA code update to the upstream release 20160422
adding support for ACPI 6.1 along with some previously missing
bits of ACPI 6.0 support, making a fair amount of fixes and
cleanups and reducing divergences between the upstream ACPICA
and the in-kernel code (Bob Moore, Lv Zheng, Al Stone, Aleksey
Makarov, Will Miles).
- ACPI Generic Event Device (GED) support and a fix for it (Sinan Kaya,
Paul Gortmaker).
- INT3406 thermal driver for display thermal management and ACPI
backlight support code reorganization related to it (Aaron Lu,
Arnd Bergmann).
- Support for exporting the value returned by the _HRV (hardware
revision) ACPI object via sysfs (Betty Dall).
- Removal of the EXPERT dependency for ACPI on ARM64 (Mark Brown).
- Rework of the handling of ACPI _OSI mechanism allowing the
_OSI("Darwin") support to be overridden from the kernel command
line among other things (Lv Zheng, Chen Yu).
- Rework of the ACPI tables override mechanism to prepare it for
the introduction of overlays support going forward (Lv Zheng,
Rafael Wysocki).
- Fixes related to the ECDT support and module-level execution
of AML (Lv Zheng).
- ACPI PCI interrupts management update to make it work better on
ARM64 mostly (Sinan Kaya).
- ACPI SRAT handling update to make the code process all entires
in the table order regardless of the entry type (Lukasz Anaczkowski).
- EFI power off support for full-hardware ACPI platforms that don't
support ACPI S5 (Chen Yu).
- Fixes and cleanups related to the ACPI core's sysfs interface
(Dan Carpenter, Betty Dall).
- acpi_dev_present() API rework to reduce possible confusion related
to it (Lukas Wunner).
- Removal of CLK_IS_ROOT from two ACPI drivers (Stephen Boyd).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJXOjM+AAoJEILEb/54YlRxNO4P/0FsajR2iXfHybiHyJq+Iddk
MX+Jealb5klnXXtuih90oOHft9NypV1ESO7bcmjSz+2tuSgoXifdI3GO0aFghj7v
h8SaVpCGzlm+u8y+Ppbxk+eWHAV1+ohV8uaO47yDUjuyZgG6c702QqrJVaqunQoq
KQd+kqK5bhcaLhrx9Ro0I4Jbz0TdFa8j7noUTRXtDfJ9V4xZ3a6QfXz3H6GU4L31
kNKjroxkFXpHMj2mYXuskqw2IWoRZw7Z7kpLv0dM44nko6c+oM8/9BIx4xh1IbR4
vvgn/C2QYe45fz4Or/qmrPzGZ/kQtLiiVC2B/GWbCTezu3Px9E3V2NI0xLktVe0g
Y/MsRdzMs0TInWSVezOlTONmfcqZgPhbSmsuI9PJ7izxmzOLVk6tjXARkzWe2gQ0
N/nOd7I8AMsTMdpBCvf6xjJXqHRl6jdXuHAIhcPC5DINQ0daz8FZ4Cw42MtVKo0I
2OiZ7ZnAnDDHrptV9VwtEvo60Uw/QG8EhdMWyQVaFWe1pFNM9nQtD0P2QeMWUHhZ
YL7Q63nM8flQIywcSj7jyMWroWZMOI/cFOLGxZjz+yXA3fRizl4J22kJ392gSQti
da1X8OBKsOvYQutkeGeQCNYWp4j5uKpoMoR4iR4dOLNqguWxaicDSZgsU8cAAk0k
W+lRS/E8l+we5rxEZYOd
=rAwm
-----END PGP SIGNATURE-----
Merge tag 'acpi-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"The new features here are ACPI 6.1 support (and some previously
missing bits of ACPI 6.0 support) in ACPICA and two new drivers, a
driver for the ACPI Generic Event Device (GED) feature introduced by
ACPI 6.1 and the INT3406 thermal driver for display thermal
management. Also the value returned by the _HRV (hardware revision)
ACPI object will be exported to user space via sysfs now.
In addition to that, ACPI on ARM64 will not depend on EXPERT any more.
The rest is mostly fixes and cleanups and some code reorganization.
Specifics:
- In-kernel ACPICA code update to the upstream release 20160422
adding support for ACPI 6.1 along with some previously missing bits
of ACPI 6.0 support, making a fair amount of fixes and cleanups and
reducing divergences between the upstream ACPICA and the in-kernel
code (Bob Moore, Lv Zheng, Al Stone, Aleksey Makarov, Will Miles)
- ACPI Generic Event Device (GED) support and a fix for it (Sinan
Kaya, Paul Gortmaker)
- INT3406 thermal driver for display thermal management and ACPI
backlight support code reorganization related to it (Aaron Lu, Arnd
Bergmann)
- Support for exporting the value returned by the _HRV (hardware
revision) ACPI object via sysfs (Betty Dall)
- Removal of the EXPERT dependency for ACPI on ARM64 (Mark Brown)
- Rework of the handling of ACPI _OSI mechanism allowing the
_OSI("Darwin") support to be overridden from the kernel command
line among other things (Lv Zheng, Chen Yu)
- Rework of the ACPI tables override mechanism to prepare it for the
introduction of overlays support going forward (Lv Zheng, Rafael
Wysocki)
- Fixes related to the ECDT support and module-level execution of AML
(Lv Zheng)
- ACPI PCI interrupts management update to make it work better on
ARM64 mostly (Sinan Kaya)
- ACPI SRAT handling update to make the code process all entires in
the table order regardless of the entry type (Lukasz Anaczkowski)
- EFI power off support for full-hardware ACPI platforms that don't
support ACPI S5 (Chen Yu)
- Fixes and cleanups related to the ACPI core's sysfs interface (Dan
Carpenter, Betty Dall)
- acpi_dev_present() API rework to reduce possible confusion related
to it (Lukas Wunner)
- Removal of CLK_IS_ROOT from two ACPI drivers (Stephen Boyd)"
* tag 'acpi-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (82 commits)
ACPI / video: mark acpi_video_get_levels() inline
Thermal / ACPI / video: add INT3406 thermal driver
ACPI / GED: make evged.c explicitly non-modular
ACPI / tables: Fix DSDT override mechanism
ACPI / sysfs: fix error code in get_status()
ACPICA: Update version to 20160422
ACPICA: Move all ASCII utilities to a common file
ACPICA: ACPI 2.0, Hardware: Add access_width/bit_offset support for acpi_hw_write()
ACPICA: ACPI 2.0, Hardware: Add access_width/bit_offset support in acpi_hw_read()
ACPICA: Executer: Introduce a set of macros to handle bit width mask generation
ACPICA: Hardware: Add optimized access bit width support
ACPICA: Utilities: Add ACPI_IS_ALIGNED() macro
ACPICA: Renamed some #defined flag constants for clarity
ACPICA: ACPI 6.0, tools/iasl: Add support for new resource descriptors
ACPICA: ACPI 6.0: Update _BIX support for new package element
ACPICA: ACPI 6.1: Support for new PCCT subtable
ACPICA: Refactor evaluate_object to reduce nesting
ACPICA: Divergence: remove unwanted spaces for typedef
ACPI,PCI,IRQ: remove SCI penalize function
ACPI,PCI,IRQ: remove redundant code in acpi_irq_penalty_init()
..
Pull x86 platform updates from Ingo Molnar:
"The main change is the addition of SGI/UV4 support"
* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
x86/platform/UV: Fix incorrect nodes and pnodes for cpuless and memoryless nodes
x86/platform/UV: Remove Obsolete GRU MMR address translation
x86/platform/UV: Update physical address conversions for UV4
x86/platform/UV: Build GAM reference tables
x86/platform/UV: Support UV4 socket address changes
x86/platform/UV: Add obtaining GAM Range Table from UV BIOS
x86/platform/UV: Add UV4 addressing discovery function
x86/platform/UV: Fold blade info into per node hub info structs
x86/platform/UV: Allocate common per node hub info structs on local node
x86/platform/UV: Move blade local processor ID to the per cpu info struct
x86/platform/UV: Move scir info to the per cpu info struct
x86/platform/UV: Create per cpu info structs to replace per hub info structs
x86/platform/UV: Update MMIOH setup function to work for both UV3 and UV4
x86/platform/UV: Clean up redunduncies after merge of UV4 MMR definitions
x86/platform/UV: Add UV4 Specific MMR definitions
x86/platform/UV: Prep for UV4 MMR updates
x86/platform/UV: Add UV MMR Illegal Access Function
x86/platform/UV: Add UV4 Specific Defines
x86/platform/UV: Add UV Architecture Defines
x86/platform/UV: Add Initial UV4 definitions
...
Pull x86 debug cleanup from Ingo Molnar:
"A printk() output simplification"
* 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/dumpstack: Combine some printk()s
Pull x86 boot updates from Ingo Molnar:
"The biggest changes in this cycle were:
- prepare for more KASLR related changes, by restructuring, cleaning
up and fixing the existing boot code. (Kees Cook, Baoquan He,
Yinghai Lu)
- simplifly/concentrate subarch handling code, eliminate
paravirt_enabled() usage. (Luis R Rodriguez)"
* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
x86/KASLR: Clarify purpose of each get_random_long()
x86/KASLR: Add virtual address choosing function
x86/KASLR: Return earliest overlap when avoiding regions
x86/KASLR: Add 'struct slot_area' to manage random_addr slots
x86/boot: Add missing file header comments
x86/KASLR: Initialize mapping_info every time
x86/boot: Comment what finalize_identity_maps() does
x86/KASLR: Build identity mappings on demand
x86/boot: Split out kernel_ident_mapping_init()
x86/boot: Clean up indenting for asm/boot.h
x86/KASLR: Improve comments around the mem_avoid[] logic
x86/boot: Simplify pointer casting in choose_random_location()
x86/KASLR: Consolidate mem_avoid[] entries
x86/boot: Clean up pointer casting
x86/boot: Warn on future overlapping memcpy() use
x86/boot: Extract error reporting functions
x86/boot: Correctly bounds-check relocations
x86/KASLR: Clean up unused code from old 'run_size' and rename it to 'kernel_total_size'
x86/boot: Fix "run_size" calculation
x86/boot: Calculate decompression size during boot not build
...
Pull x86 asm updates from Ingo Molnar:
"The main changes in this cycle were:
- MSR access API fixes and enhancements (Andy Lutomirski)
- early exception handling improvements (Andy Lutomirski)
- user-space FS/GS prctl usage fixes and improvements (Andy
Lutomirski)
- Remove the cpu_has_*() APIs and replace them with equivalents
(Borislav Petkov)
- task switch micro-optimization (Brian Gerst)
- 32-bit entry code simplification (Denys Vlasenko)
- enhance PAT handling in enumated CPUs (Toshi Kani)
... and lots of other cleanups/fixlets"
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
x86/arch_prctl/64: Restore accidentally removed put_cpu() in ARCH_SET_GS
x86/entry/32: Remove asmlinkage_protect()
x86/entry/32: Remove GET_THREAD_INFO() from entry code
x86/entry, sched/x86: Don't save/restore EFLAGS on task switch
x86/asm/entry/32: Simplify pushes of zeroed pt_regs->REGs
selftests/x86/ldt_gdt: Test set_thread_area() deletion of an active segment
x86/tls: Synchronize segment registers in set_thread_area()
x86/asm/64: Rename thread_struct's fs and gs to fsbase and gsbase
x86/arch_prctl/64: Remove FSBASE/GSBASE < 4G optimization
x86/segments/64: When load_gs_index fails, clear the base
x86/segments/64: When loadsegment(fs, ...) fails, clear the base
x86/asm: Make asm/alternative.h safe from assembly
x86/asm: Stop depending on ptrace.h in alternative.h
x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall()
x86/asm: Make sure verify_cpu() has a good stack
x86/extable: Add a comment about early exception handlers
x86/msr: Set the return value to zero when native_rdmsr_safe() fails
x86/paravirt: Make "unsafe" MSR accesses unsafe even if PARAVIRT=y
x86/paravirt: Add paravirt_{read,write}_msr()
x86/msr: Carry on after a non-"safe" MSR access fails
...
Pull RAS updates from Ingo Molnar:
"Main changes in this cycle were:
- AMD MCE/RAS handling updates (Yazen Ghannam, Aravind
Gopalakrishnan)
- Cleanups (Borislav Petkov)
- logging fix (Tony Luck)"
* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/RAS: Add SMCA support to AMD Error Injector
EDAC, mce_amd: Detect SMCA using X86_FEATURE_SMCA
x86/mce: Update AMD mcheck init to use cpu_has() facilities
x86/cpu: Add detection of AMD RAS Capabilities
x86/mce/AMD: Save an indentation level in prepare_threshold_block()
x86/mce/AMD: Disable LogDeferredInMcaStat for SMCA systems
x86/mce/AMD: Log Deferred Errors using SMCA MCA_DE{STAT,ADDR} registers
x86/mce: Detect local MCEs properly
x86/mce: Look in genpool instead of mcelog for pending error records
x86/mce: Detect and use SMCA-specific msr_ops
x86/mce: Define vendor-specific MSR accessors
x86/mce: Carve out writes to MCx_STATUS and MCx_CTL
x86/mce: Grade uncorrected errors for SMCA-enabled systems
x86/mce: Log MCEs after a warm rest on AMD, Fam17h and later
x86/mce: Remove explicit smp_rmb() when starting CPUs sync
x86/RAS: Rename AMD MCE injector config item
Pull perf updates from Ingo Molnar:
"Bigger kernel side changes:
- Add backwards writing capability to the perf ring-buffer code,
which is preparation for future advanced features like robust
'overwrite support' and snapshot mode. (Wang Nan)
- Add pause and resume ioctls for the perf ringbuffer (Wang Nan)
- x86 Intel cstate code cleanups and reorgnization (Thomas Gleixner)
- x86 Intel uncore and CPU PMU driver updates (Kan Liang, Peter
Zijlstra)
- x86 AUX (Intel PT) related enhancements and updates (Alexander
Shishkin)
- x86 MSR PMU driver enhancements and updates (Huang Rui)
- ... and lots of other changes spread out over 40+ commits.
Biggest tooling side changes:
- 'perf trace' features and enhancements. (Arnaldo Carvalho de Melo)
- BPF tooling updates (Wang Nan)
- 'perf sched' updates (Jiri Olsa)
- 'perf probe' updates (Masami Hiramatsu)
- ... plus 200+ other enhancements, fixes and cleanups to tools/
The merge commits, the shortlog and the changelogs contain a lot more
details"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (249 commits)
perf/core: Disable the event on a truncated AUX record
perf/x86/intel/pt: Generate PMI in the STOP region as well
perf buildid-cache: Use lsdir() for looking up buildid caches
perf symbols: Use lsdir() for the search in kcore cache directory
perf tools: Use SBUILD_ID_SIZE where applicable
perf tools: Fix lsdir to set errno correctly
perf trace: Move seccomp args beautifiers to tools/perf/trace/beauty/
perf trace: Move flock op beautifier to tools/perf/trace/beauty/
perf build: Add build-test for debug-frame on arm/arm64
perf build: Add build-test for libunwind cross-platforms support
perf script: Fix export of callchains with recursion in db-export
perf script: Fix callchain addresses in db-export
perf script: Fix symbol insertion behavior in db-export
perf symbols: Add dso__insert_symbol function
perf scripting python: Use Py_FatalError instead of die()
perf tools: Remove xrealloc and ALLOC_GROW
perf help: Do not use ALLOC_GROW in add_cmd_list
perf pmu: Make pmu_formats_string to check return value of strbuf
perf header: Make topology checkers to check return value of strbuf
perf tools: Make alias handler to check return value of strbuf
...
Pull EFI updates from Ingo Molnar:
"The main changes in this cycle were:
- Drop the unused EFI_SYSTEM_TABLES efi.flags bit and ensure the
ARM/arm64 EFI System Table mapping is read-only (Ard Biesheuvel)
- Add a comment to explain that one of the code paths in the x86/pat
code is only executed for EFI boot (Matt Fleming)
- Improve Secure Boot status checks on arm64 and handle unexpected
errors (Linn Crosetto)
- Remove the global EFI memory map variable 'memmap' as the same
information is already available in efi::memmap (Matt Fleming)
- Add EFI Memory Attribute table support for ARM/arm64 (Ard
Biesheuvel)
- Add EFI GOP framebuffer support for ARM/arm64 (Ard Biesheuvel)
- Add EFI Bootloader Control driver for storing reboot(2) data in EFI
variables for consumption by bootloaders (Jeremy Compostella)
- Add Core EFI capsule support (Matt Fleming)
- Add EFI capsule char driver (Kweh, Hock Leong)
- Unify EFI memory map code for ARM and arm64 (Ard Biesheuvel)
- Add generic EFI support for detecting when firmware corrupts CPU
status register bits (like IRQ flags) when performing EFI runtime
service calls (Mark Rutland)
... and other misc cleanups"
* 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
efivarfs: Make efivarfs_file_ioctl() static
efi: Merge boolean flag arguments
efi/capsule: Move 'capsule' to the stack in efi_capsule_supported()
efibc: Fix excessive stack footprint warning
efi/capsule: Make efi_capsule_pending() lockless
efi: Remove unnecessary (and buggy) .memmap initialization from the Xen EFI driver
efi/runtime-wrappers: Remove ARCH_EFI_IRQ_FLAGS_MASK #ifdef
x86/efi: Enable runtime call flag checking
arm/efi: Enable runtime call flag checking
arm64/efi: Enable runtime call flag checking
efi/runtime-wrappers: Detect firmware IRQ flag corruption
efi/runtime-wrappers: Remove redundant #ifdefs
x86/efi: Move to generic {__,}efi_call_virt()
arm/efi: Move to generic {__,}efi_call_virt()
arm64/efi: Move to generic {__,}efi_call_virt()
efi/runtime-wrappers: Add {__,}efi_call_virt() templates
efi/arm-init: Reserve rather than unmap the memory map for ARM as well
efi: Add misc char driver interface to update EFI firmware
x86/efi: Force EFI reboot to process pending capsules
efi: Add 'capsule' update support
...
Pull core signal updates from Ingo Molnar:
"These updates from Stas Sergeev and Andy Lutomirski, improve the
sigaltstack interface by extending its ABI with the SS_AUTODISARM
feature, which makes it possible to use swapcontext() in a sighandler
that works on sigaltstack. Without this flag, the subsequent signal
will corrupt the state of the switched-away sighandler.
The inspiration is more robust dosemu signal handling"
* 'core-signals-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
signals/sigaltstack: Change SS_AUTODISARM to (1U << 31)
signals/sigaltstack: Report current flag bits in sigaltstack()
selftests/sigaltstack: Fix the sigaltstack test on old kernels
signals/sigaltstack: If SS_AUTODISARM, bypass on_sig_stack()
selftests/sigaltstack: Add new testcase for sigaltstack(SS_ONSTACK|SS_AUTODISARM)
signals/sigaltstack: Implement SS_AUTODISARM flag
signals/sigaltstack: Prepare to add new SS_xxx flags
signals/sigaltstack, x86/signals: Unify the x86 sigaltstack check with other architectures
* acpi-pci:
ACPI,PCI,IRQ: remove SCI penalize function
ACPI,PCI,IRQ: remove redundant code in acpi_irq_penalty_init()
ACPI,PCI,IRQ: reduce static IRQ array size to 16
ACPI,PCI,IRQ: reduce resource requirements
* acpi-misc:
ACPI / sysfs: fix error code in get_status()
ACPI / device_sysfs: Clean up checkpatch errors
ACPI / device_sysfs: Change _SUN and _STA show functions error return to EIO
ACPI / device_sysfs: Add sysfs support for _HRV hardware revision
arm64: defconfig: Enable ACPI
ACPI / ARM64: Remove EXPERT dependency for ACPI on ARM64
ACPI / ARM64: Don't enable ACPI by default on ARM64
acer-wmi: Use acpi_dev_found()
eeepc-wmi: Use acpi_dev_found()
ACPI / utils: Rename acpi_dev_present()
* acpi-tools:
tools/power/acpi: close file only if it is open