The nr_sort_keys field is to carry the number of sort entries in a
hpp_list or hists to determine the depth of indentation of a hist entry.
As it's only used in hierarchy mode and now we have used nr_hpp_node for
this reason, there's no need to keep it anymore. Let's get rid of it.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457531222-18130-7-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The hist_browser__fprintf_hierarchy_entry() if to dump current output
into a file so it needs to be sync-ed with the corresponding function
hist_browser__show_hierarchy_entry(). So use hists->nr_hpp_node to
indent width and use first fmt_node to print overhead columns instead of
checking whether it's a sort entry (or dynamic entry).
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457531222-18130-6-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It's not used anymore and the output format is accessed by the hpp_list
pointer instead when hierarchy is enabled. Let's get rid of it.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457531222-18130-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When a command-line filter is applied in hierarchy mode, output is
broken especially when filtering on lower level. The higher level
entries doesn't show up so it's hard to see the results.
Also it needs to handle multi sort keys in a single hierarchy level.
Before:
$ perf report --hierarchy -s 'cpu,{dso,comm}' --comms swapper --stdio
...
# Overhead CPU / Shared Object+Command
# ........... ...........................
#
13.79% [kernel.vmlinux] swapper
31.71% 000
13.80% [kernel.vmlinux] swapper
0.43% [e1000e] swapper
11.89% [kernel.vmlinux] swapper
9.18% [kernel.vmlinux] swapper
After:
# Overhead CPU / Shared Object+Command
# ........... ...............................
#
33.09% 003
13.79% [kernel.vmlinux] swapper
31.71% 000
13.80% [kernel.vmlinux] swapper
0.43% [e1000e] swapper
21.90% 002
11.89% [kernel.vmlinux] swapper
13.30% 001
9.18% [kernel.vmlinux] swapper
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457531222-18130-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Those functions are for checkinf if a given perf_hpp_fmt is a
filter-related sort entry. With hierarchy mode, it needs to check
filters on the hist entries with its own hpp format list.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457531222-18130-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When hierarchy mode is enabled each output format is in a separate hpp
list. So when applying a filter it should check all formats in the
list. Currently it only checks a single ->fmt field which was not set
properly.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457531222-18130-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Build jitdump only on architectures defined in util/genelf.h file, to avoid
breaking the build on such arches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Davidlohr Bueso <dbueso@suse.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Mel Gorman <mgorman@suse.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20160310164113.GA11357@krava.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add myself as the maintainer of the NAND subsystem.
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Acked-by: Richard Weinberger <richard@nod.at>
Acked-by: Brian Norris <computersforpeace@gmail.com>
Acked-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
is a bit larger because the surrounding code needed a cleanup, but
nothing worrisome.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJW4UwZAAoJEL/70l94x66DG3YH/0PfUr4sW0jnWRVXmYlPVka4
sNFYrdtYnx08PwXu2sWMm1F+OBXlF/t0ZSJXJ9OBF8WdKIu8TU4yBOINRAvGO/oE
slrivjktLTKgicTtIXP5BpRR14ohwHIGcuiIlppxvnhmQz1/rMtig7fvhZxYI545
lJyIbyquNR86tiVdUSG9/T9+ulXXXCvOspYv8jPXZx7VKBXKTvp5P5qavSqciRb+
O9RqY+GDCR/5vrw+MV0J7H9ZydeEJeD02LcWguTGMATTm0RCrhydvSbou42UcKfY
osWii0kwt2LhcM/sTOz+cWnLJ6gwU9T+ZtJTTbLvYWXWDLP/+icp9ACMkwNciNo=
=/y4V
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"A few simple fixes for ARM, x86, PPC and generic code.
The x86 MMU fix is a bit larger because the surrounding code needed a
cleanup, but nothing worrisome"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: MMU: fix reserved bit check for ept=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0
KVM: MMU: fix ept=0/pte.u=1/pte.w=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0 combo
kvm: cap halt polling at exactly halt_poll_ns
KVM: s390: correct fprs on SIGP (STOP AND) STORE STATUS
KVM: VMX: disable PEBS before a guest entry
KVM: PPC: Book3S HV: Sanitize special-purpose register values on guest exit
- Temporarily disable huge pages built using contiguous ptes
- Ensure vmemmap region is sufficiently aligned for sparsemem sections
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABCgAGBQJW4FSYAAoJELescNyEwWM0cp4H/0+9iMTqb3KowIW1vQPNnOG2
BG/RVsGlCPAwBu0V4FdcY7eK4fQ9J+/UCmVd5/SrlpjNpblinxRPihbIzs4bToqa
It02wNk7ISPm0oYtyGrRu1TpC5AvMykcluZkU/CUk0sjZlBAi8WSaJiqftFuZSGH
lhyhARO4KscbAUUhwDDYNKuWmLbmyOpt9RM2fziNQdjSp+8czCoCR9G+JXiPQFsJ
ORU10BqBDCyFpp8/NhM55qA76FJo6RCBUWx/6L1oJJxjvahkmPba/hhnfI7+Xj1u
3FKAntJ6wVZeKqRsIkOlECoU/mrgjlTByTFN+o3KhOky8ZYBHoveIQWtsqNMFd4=
=7d1o
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"I thought we were done for 4.5, but then the 64k-page chaps came
crawling out of the woodwork. *sigh*
The vmemmap fix I sent for -rc7 caused a regression with 64k pages and
sparsemem and at some point during the release cycle the new hugetlb
code using contiguous ptes started failing the libhugetlbfs tests with
64k pages enabled.
So here are a couple of patches that fix the vmemmap alignment and
disable the new hugetlb page sizes whilst a proper fix is being
developed:
- Temporarily disable huge pages built using contiguous ptes
- Ensure vmemmap region is sufficiently aligned for sparsemem
sections"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: hugetlb: partial revert of 66b3923a1a
arm64: account for sparsemem section alignment when choosing vmemmap offset
Pull s390 fixes from Martin Schwidefsky:
"Three bug fixes:
- The fix for the page table corruption (CVE-2016-2143)
- The diagnose statistics introduced a regression for the dasd diag
driver
- Boot crash on systems without the set-program-parameters facility"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/mm: four page table levels vs. fork
s390/cpumf: Fix lpp detection
s390/dasd: fix diag 0x250 inline assembly
The legacy media controller userspace API exposes entity types that
carry both type and function information. The new API replaces the type
with a function. It preserves backward compatibility by defining legacy
functions for the existing types and using them in drivers.
This works fine, as long as newer entity functions won't be added.
Unfortunately, some tools, like media-ctl with --print-dot argument
rely on the now legacy MEDIA_ENT_T_V4L2_SUBDEV and MEDIA_ENT_T_DEVNODE
numeric ranges to identify what entities will be shown.
Also, if the entity doesn't match those ranges, it will ignore the
major/minor information on devnodes, and won't be getting the devnode
name via udev or sysfs.
As we're now adding devices outside the old range, the legacy ioctl
needs to map the new entity functions into a type at the old range,
or otherwise we'll have a regression.
Detected on all released media-ctl versions (e. g. versions <= 1.10).
Fix this by deriving the type from the function to emulate the legacy
API if the function isn't in the legacy functions range.
Reported-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Commit abd4f7505b ("x86: i386-show-unhandled-signals-v3") did turn on
the showing-unhandled-signal behaviour for i386 for some exception handlers,
but for no reason do_trap() is left out (my naive guess is because turning it on
for do_trap() would be too noisy since do_trap() is shared by several exceptions).
And since the same commit make "show_unhandled_signals" a debug tunable(in
/proc/sys/debug/exception-trace), and x86 by default turning it on.
So it would be strange for i386 users who turing it on manually and expect
seeing the unhandled signal output in log, but nothing.
This patch turns it on for i386 in do_trap() as well.
Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@suse.de
Cc: dave.hansen@linux.intel.com
Cc: heukelum@fastmail.fm
Cc: jbeulich@novell.com
Cc: jdike@addtoit.com
Cc: joe@perches.com
Cc: luto@kernel.org
Link: http://lkml.kernel.org/r/1457612398-4568-1-git-send-email-nasa4836@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Large memory Haswell-EX systems with multiple DIMMs per channel were
sometimes reporting the wrong DIMM.
Found three problems:
1) Debug printouts for socket and channel interleave were not interpreting
the register fields correctly. The socket interleave field is a 2^X
value (0=1, 1=2, 2=4, 3=8). The channel interleave is X+1 (0=1, 1=2,
2=3. 3=4).
2) Actual use of the socket interleave value didn't interpret as 2^X
3) Conversion of address to channel address was complicated, and wrong.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The of_io_request_and_map() returns a valid pointer in iomem region or
ERR_PTR(), check for NULL always fails and may cause a NULL pointer
dereference on error path.
Fixes: 25e34b4431 ("irqchip/mxs: Prepare driver for hardware with different offsets")
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Oleksij Rempel <linux@rempel-privat.de>
Cc: Sascha Hauer <kernel@pengutronix.de>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1457486500-10237-1-git-send-email-vz@mleia.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The of_io_request_and_map() returns a valid pointer in iomem region or
ERR_PTR(), check for NULL always fails and may cause a NULL pointer
dereference on error path.
Fixes: 0e841b04c8 ("irqchip/sunxi-nmi: Switch to of_io_request_and_map() from of_iomap()")
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Chen-Yu Tsai <wens@csie.org>
Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1457486489-10189-1-git-send-email-vz@mleia.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Export irq_chip_*_parent(), irq_domain_create_hierarchy(),
irq_domain_set_hwirq_and_chip(), irq_domain_reset_irq_data(),
irq_domain_alloc/free_irqs_parent()
So gpio drivers can be built as modules. First user: gpio-xgene-sb
Signed-off-by: Quan Nguyen <qnguyen@apm.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Phong Vo <pvo@apm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: patches@apm.com
Cc: Loc Ho <lho@apm.com>
Cc: Keyur Chudgar <kchudgar@apm.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Link: https://lists.01.org/pipermail/kbuild-all/2016-February/017914.html
Link: http://lkml.kernel.org/r/1457017012-10628-1-git-send-email-qnguyen@apm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
When computing the residue we need two pieces of information: the current
descriptor and the remaining data of the current descriptor. To get
that information, we need to read consecutively two registers but we
can't do it in an atomic way. For that reason, we have to check manually
that current descriptor has not changed.
Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Suggested-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
Reported-by: David Engraf <david.engraf@sysgo.com>
Tested-by: David Engraf <david.engraf@sysgo.com>
Fixes: e1f7c9eee7 ("dmaengine: at_xdmac: creation of the atmel
eXtended DMA Controller driver")
Cc: stable@vger.kernel.org #4.1 and later
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
We do use this_cpu_ptr(&cpu_tss) as a cacheline-aligned, seldomly
accessed per-cpu var as the MONITORX target in delay_mwaitx(). However,
when called in preemptible context, this_cpu_ptr -> smp_processor_id() ->
debug_smp_processor_id() fires:
BUG: using smp_processor_id() in preemptible [00000000] code: udevd/312
caller is delay_mwaitx+0x40/0xa0
But we don't care about that check - we only need cpu_tss as a MONITORX
target and it doesn't really matter which CPU's var we're touching as
we're going idle anyway. Fix that.
Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: spg_linux_kernel@amd.com
Link: http://lkml.kernel.org/r/20160309205622.GG6564@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
KVM has special logic to handle pages with pte.u=1 and pte.w=0 when
CR0.WP=1. These pages' SPTEs flip continuously between two states:
U=1/W=0 (user and supervisor reads allowed, supervisor writes not allowed)
and U=0/W=1 (supervisor reads and writes allowed, user writes not allowed).
When SMEP is in effect, however, U=0 will enable kernel execution of
this page. To avoid this, KVM also sets NX=1 in the shadow PTE together
with U=0, making the two states U=1/W=0/NX=gpte.NX and U=0/W=1/NX=1.
When guest EFER has the NX bit cleared, the reserved bit check thinks
that the latter state is invalid; teach it that the smep_andnot_wp case
will also use the NX bit of SPTEs.
Cc: stable@vger.kernel.org
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.inel.com>
Fixes: c258b62b26
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Yes, all of these are needed. :) This is admittedly a bit odd, but
kvm-unit-tests access.flat tests this if you run it with "-cpu host"
and of course ept=0.
KVM runs the guest with CR0.WP=1, so it must handle supervisor writes
specially when pte.u=1/pte.w=0/CR0.WP=0. Such writes cause a fault
when U=1 and W=0 in the SPTE, but they must succeed because CR0.WP=0.
When KVM gets the fault, it sets U=0 and W=1 in the shadow PTE and
restarts execution. This will still cause a user write to fault, while
supervisor writes will succeed. User reads will fault spuriously now,
and KVM will then flip U and W again in the SPTE (U=1, W=0). User reads
will be enabled and supervisor writes disabled, going back to the
originary situation where supervisor writes fault spuriously.
When SMEP is in effect, however, U=0 will enable kernel execution of
this page. To avoid this, KVM also sets NX=1 in the shadow PTE together
with U=0. If the guest has not enabled NX, the result is a continuous
stream of page faults due to the NX bit being reserved.
The fix is to force EFER.NX=1 even if the CPU is taking care of the EFER
switch. (All machines with SMEP have the CPU_LOAD_IA32_EFER vm-entry
control, so they do not use user-return notifiers for EFER---if they did,
EFER.NX would be forced to the same value as the host).
There is another bug in the reserved bit check, which I've split to a
separate patch for easier application to stable kernels.
Cc: stable@vger.kernel.org
Cc: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Fixes: f6577a5fa1
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now that slow-path syscalls always enter C before enabling
interrupts, it's straightforward to call enter_from_user_mode() before
enabling interrupts rather than doing it as part of entry tracing.
With this change, we should finally be able to retire exception_enter().
This will also enable optimizations based on knowing that we never
change context tracking state with interrupts on.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/bc376ecf87921a495e874ff98139b1ca2f5c5dd7.1457558566.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We want all of the syscall entries to run with interrupts off so that
we can efficiently run context tracking before enabling interrupts.
This will regress int $0x80 performance on 32-bit kernels by a
couple of cycles. This shouldn't matter much -- int $0x80 is not a
fast path.
This effectively reverts:
657c1eea00 ("x86/entry/32: Fix entry_INT80_32() to expect interrupts to be on")
... and fixes the same issue differently.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/59b4f90c9ebfccd8c937305dbbbca680bc74b905.1457558566.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We can micro-optimize this call and mildly relax the
barrier requirements by relying on ctrl + rmb, keeping
the acquire semantics. In addition, this is pretty much
the now standard for busy-waiting under such restraints.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave@stgolabs.net
Link: http://lkml.kernel.org/r/1457574936-19065-3-git-send-email-dbueso@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
While the compiler tends to already to it for us (except for
csd_unlock), make it explicit. These helpers mainly deal with
the ->flags, are short-lived and can be called, for example,
from smp_call_function_many().
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave@stgolabs.net
Link: http://lkml.kernel.org/r/1457574936-19065-2-git-send-email-dbueso@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Leonid Shatz noticed that the SDM interpretation of the following
recent commit:
394db20ca2 ("x86/fpu: Disable AVX when eagerfpu is off")
... is incorrect and that the original behavior of the FPU code was correct.
Because AVX is not stated in CR0 TS bit description, it was mistakenly
believed to be not supported for lazy context switch. This turns out
to be false:
Intel Software Developer's Manual Vol. 3A, Sec. 2.5 Control Registers:
'TS Task Switched bit (bit 3 of CR0) -- Allows the saving of the x87 FPU/
MMX/SSE/SSE2/SSE3/SSSE3/SSE4 context on a task switch to be delayed until
an x87 FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 instruction is actually executed
by the new task.'
Intel Software Developer's Manual Vol. 2A, Sec. 2.4 Instruction Exception
Specification:
'AVX instructions refer to exceptions by classes that include #NM
"Device Not Available" exception for lazy context switch.'
So revert the commit.
Reported-by: Leonid Shatz <leonid.shatz@ravellosystems.com>
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1457569734-3785-1-git-send-email-yu-cheng.yu@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo suggested that the comments should explain when the various
entries are used. This adds these explanations and improves other
parts of the comments.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/9524ecef7a295347294300045d08354d6a57c6e7.1457578375.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Now that SYSENTER with TF set puts X86_EFLAGS_TF directly into
regs->flags, we don't need a TIF_SINGLESTEP fixup in the syscall
entry code. Remove it.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/2d15f24da52dafc9d2f0b8d76f55544f4779c517.1457578375.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The first instruction of the SYSENTER entry runs on its own tiny
stack. That stack can be used if a #DB or NMI is delivered before
the SYSENTER prologue switches to a real stack.
We have code in place to prevent us from overflowing the tiny stack.
For added paranoia, add a canary to the stack and check it in
do_debug() -- that way, if something goes wrong with the #DB logic,
we'll eventually notice.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/6ff9a806f39098b166dc2c41c1db744df5272f29.1457578375.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Right after SYSENTER, we can get a #DB or NMI. On x86_32, there's no IST,
so the exception handler is invoked on the temporary SYSENTER stack.
Because the SYSENTER stack is very small, we have a fixup to switch
off the stack quickly when this happens. The old fixup had several issues:
1. It checked the interrupt frame's CS and EIP. This wasn't
obviously correct on Xen or if vm86 mode was in use [1].
2. In the NMI handler, it did some frightening digging into the
stack frame. I'm not convinced this digging was correct.
3. The fixup didn't switch stacks and then switch back. Instead, it
synthesized a brand new stack frame that would redirect the IRET
back to the SYSENTER code. That frame was highly questionable.
For one thing, if NMI nested inside #DB, we would effectively
abort the #DB prologue, which was probably safe but was
frightening. For another, the code used PUSHFL to write the
FLAGS portion of the frame, which was simply bogus -- by the time
PUSHFL was called, at least TF, NT, VM, and all of the arithmetic
flags were clobbered.
Simplify this considerably. Instead of looking at the saved frame
to see where we came from, check the hardware ESP register against
the SYSENTER stack directly. Malicious user code cannot spoof the
kernel ESP register, and by moving the check after SAVE_ALL, we can
use normal PER_CPU accesses to find all the relevant addresses.
With this patch applied, the improved syscall_nt_32 test finally
passes on 32-bit kernels.
[1] It isn't obviously correct, but it is nonetheless safe from vm86
shenanigans as far as I can tell. A user can't point EIP at
entry_SYSENTER_32 while in vm86 mode because entry_SYSENTER_32,
like all kernel addresses, is greater than 0xffff and would thus
violate the CS segment limit.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/b2cdbc037031c07ecf2c40a96069318aec0e7971.1457578375.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The SYSENTER stack is only used on 32-bit kernels. Remove it on 64-bit kernels.
( We may end up using it down the road on 64-bit kernels. If so,
we'll re-enable it for CONFIG_IA32_EMULATION. )
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/9dbd18429f9ff61a76b6eda97a9ea20510b9f6ba.1457578375.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Due to a blatant design error, SYSENTER doesn't clear TF (single-step).
As a result, if a user does SYSENTER with TF set, we will single-step
through the kernel until something clears TF. There is absolutely
nothing we can do to prevent this short of turning off SYSENTER [1].
Simplify the handling considerably with two changes:
1. We already sanitize EFLAGS in SYSENTER to clear NT and AC. We can
add TF to that list of flags to sanitize with no overhead whatsoever.
2. Teach do_debug() to ignore single-step traps in the SYSENTER prologue.
That's all we need to do.
Don't get too excited -- our handling is still buggy on 32-bit
kernels. There's nothing wrong with the SYSENTER code itself, but
the #DB prologue has a clever fixup for traps on the very first
instruction of entry_SYSENTER_32, and the fixup doesn't work quite
correctly. The next two patches will fix that.
[1] We could probably prevent it by forcing BTF on at all times and
making sure we clear TF before any branches in the SYSENTER
code. Needless to say, this is a bad idea.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/a30d2ea06fe4b621fe6a9ef911b02c0f38feb6f2.1457578375.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Leaving any bits set in DR6 on return from a debug exception is
asking for trouble. Prevent it by writing zero right away and
clarify the comment.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/3857676e1be8fb27db4b89bbb1e2052b7f435ff4.1457578375.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The SDM says that debug exceptions clear BTF, and we need to keep
TIF_BLOCKSTEP in sync with BTF. Clear it unconditionally and improve
the comment.
I suspect that the fact that kmemcheck could cause TIF_BLOCKSTEP not
to be cleared was just an oversight.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/fa86e55d196e6dde5b38839595bde2a292c52fdc.1457578375.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We weren't restoring FLAGS at all on SYSEXIT. Apparently no one cared.
With this patch applied, native kernels should always honor
task_pt_regs()->flags, which opens the door for some sys_iopl()
cleanups. I'll do those as a separate series, though, since getting
it right will involve tweaking some paravirt ops.
( The short version is that, before this patch, sys_iopl(), invoked via
SYSENTER, wasn't guaranteed to ever transfer the updated
regs->flags, so sys_iopl() had to change the hardware flags register
as well. )
Reported-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/3f98b207472dc9784838eb5ca2b89dcc845ce269.1457578375.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This makes the 32-bit code work just like the 64-bit code. It should
speed up syscalls on 32-bit kernels on Skylake by something like 20
cycles (by analogy to the 64-bit compat case).
It also cleans up NT just like we do for the 64-bit case.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/07daef3d44bd1ed62a2c866e143e8df64edb40ee.1457578375.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
CLAC is slow, and the SYSENTER code already has an unlikely path
that runs if unusual flags are set. Drop the CLAC and instead rely
on the unlikely path to clear AC.
This seems to save ~24 cycles on my Skylake laptop. (Hey, Intel,
make this faster please!)
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/90d6db2189f9add83bc7bddd75a0c19ebbd676b2.1457578375.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Setting TF prevents fastpath returns in most cases, which causes the
test to fail on 32-bit kernels because 32-bit kernels do not, in
fact, handle NT correctly on SYSENTER entries.
The next patch will fix 32-bit kernels.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/bd4bb48af6b10c0dc84aec6dbcf487ed25683495.1457578375.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The fork of a process with four page table levels is broken since
git commit 6252d702c5 "[S390] dynamic page tables."
All new mm contexts are created with three page table levels and
an asce limit of 4TB. If the parent has four levels dup_mmap will
add vmas to the new context which are outside of the asce limit.
The subsequent call to copy_page_range will walk the three level
page table structure of the new process with non-zero pgd and pud
indexes. This leads to memory clobbers as the pgd_index *and* the
pud_index is added to the mm->pgd pointer without a pgd_deref
in between.
The init_new_context() function is selecting the number of page
table levels for a new context. The function is used by mm_init()
which in turn is called by dup_mm() and mm_alloc(). These two are
used by fork() and exec(). The init_new_context() function can
distinguish the two cases by looking at mm->context.asce_limit,
for fork() the mm struct has been copied and the number of page
table levels may not change. For exec() the mm_alloc() function
set the new mm structure to zero, in this case a three-level page
table is created as the temporary stack space is located at
STACK_TOP_MAX = 4TB.
This fixes CVE-2016-2143.
Reported-by: Marcin Kościelnicki <koriakin@0x04.net>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
A few driver specific fixes for the Rockchip and i.MX SPI controllers,
especially for the i.MX they're annoying bugs if you run into them.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJW4POVAAoJECTWi3JdVIfQMRYH/R3bNmMQHZB+qBm6vVL28Pdo
OXNejqStZLJfvdP6+xNXKijomMmvL/Xd+Jf2XMhDcEH8FZDVyemp9WIfQg8LMsy8
uSYxfNIitL7T4xfPkKF8J0z3E81lc6EDGtv9M/7XWYJM1FrfMibQB3lUv1OK6+Gd
z3Q6lpTblh6o6sqN0m2g23uSv3FGhoSdTNMhLeT3Wo5L3SfGFupO35Um5pbX8nHZ
ZKmVNu1wmnWvf879NsaRs8W7btP/l/lcaz+aKB0HlJfzjUjyojsgSZu+EYTBXEDF
ZOe3YZnodXhnFjyHLAcI09aXeDbtLlx4QHGv36OL8ObWHX9M8UYG9pQUuh5/Ujc=
=gJMG
-----END PGP SIGNATURE-----
Merge tag 'spi-fix-v4.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A few driver specific fixes for the Rockchip and i.MX SPI controllers,
especially for the i.MX they're annoying bugs if you run into them"
* tag 'spi-fix-v4.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: imx: fix spi resource leak with dma transfer
spi: imx: allow only WML aligned transfers to use DMA
spi: rockchip: add missing spi_master_put
spi: rockchip: disable runtime pm when in err case
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJW4N4xAAoJEPL5WVaVDYGjJQsH/i/9SP178CiaMeUp22PHmETi
UpCaQd9AY3xGGIjCktL2DC4NC86fjsRMYl1FJdVMxElUx54fuEU17wEW4BZyjUhI
aF9X7LfxQcxe+CRsY37ZdJ19nmE6EUZay8Vt/tB2LK/RvfruLNYmnzX5MmmjJY/S
1TKz6Jy5M0DTl+jpod2nv/xJ2j32WSPul8Un/iBinC16LPH+Q7KZRVjFLlf/krsM
SvZ1G6I70P7t9HW88BO9KhiYyxxuwqWC6SSoPMKTr4WeGnYQbA2JE6PJPktqsq76
Q91ucFkkGi+DZuZe5EuDMYMBrwaHQG8hKG3ueCj/pTu9IRErW94uO++H03bichk=
=Yjfq
-----END PGP SIGNATURE-----
Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fix from Ted Ts'o:
"This fixes a regression which crept in v4.5-rc5"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: iterate over buffer heads correctly in move_extent_per_page()
Pull drm fixes from Dave Airlie:
"A few imx fixes I missed from a couple of weeks ago, they still aren't
that big and fix some regression and a fail to boot problem.
Other than that, a couple of regression fixes for radeon/amdgpu, one
regression fix for vmwgfx and one regression fix for tda998x"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
Revert "drm/radeon/pm: adjust display configuration after powerstate"
drm/amdgpu/dp: add back special handling for NUTMEG
drm/radeon/dp: add back special handling for NUTMEG
drm/i2c: tda998x: Choose between atomic or non atomic dpms helper
drm/vmwgfx: Add back ->detect() and ->fill_modes()
drm/radeon: Fix error handling in radeon_flip_work_func.
drm/amdgpu: Fix error handling in amdgpu_flip_work_func.
drm/imx: Add missing DRM_FORMAT_RGB565 to ipu_plane_formats
drm/imx: notify DRM core about CRTC vblank state
gpu: ipu-v3: Reset IPU before activating IRQ
gpu: ipu-v3: Do not bail out on missing optional port nodes
if the current cpu is offline. But I forgot that in 3.18, we added lockdep
checks to test RCU usage even when the event is disabled. Although there
cannot be any bug when a cpu is going offline, we now get false warnings
triggered by the added checks of the event being disabled.
I removed the check from the tracepoint code itself, and added it to the
condition section (which is "1" for 'no condition'). This way the online
cpu check will get checked in all the right locations.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJW4N3rAAoJEKKk/i67LK/8WBIH/1WS6n919iU5sbdwmb843o4v
KTeZ962l7fsiU+Op2ha4eLO6qXwa85X1sq3yHxo7APttE0oN933b6VQFjEs+HnqU
INZorOQpy7soztHNewr48hdS0Z/x57xHywuf9i1K51zKCycuhupS6eZxN65zcuZp
jeyg1dWqcvUzQcbxc5xflt0+n27txUpHix3e290aNoH9cya7gdbXi5dWAQgM8Kfm
l8i2DJeyEy9nAKMjsKpKvdPkV5C8ZMGS1sJc/Psx9MGL08kM5Lqtuu8gkrvjqiLk
HYmWPKQ+l3OROORd6Sia88SniPT9ZU4A73CobgPt5flr8BvU51kxLeObD8myXps=
=s1tp
-----END PGP SIGNATURE-----
Merge tag 'trace-fixes-v4.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
"I previously sent a fix that prevents all trace events from being
called if the current cpu is offline.
But I forgot that in 3.18, we added lockdep checks to test RCU usage
even when the event is disabled. Although there cannot be any bug
when a cpu is going offline, we now get false warnings triggered by
the added checks of the event being disabled.
I removed the check from the tracepoint code itself, and added it to
the condition section (which is "1" for 'no condition'). This way the
online cpu check will get checked in all the right locations"
* tag 'trace-fixes-v4.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix check for cpu online when event is disabled
In commit bcff24887d ("ext4: don't read blocks from disk after extents
being swapped") bh is not updated correctly in the for loop and wrong
data has been written to disk. generic/324 catches this on sub-page
block size ext4.
Fixes: bcff24887d ("ext4: don't read blocks from disk after extentsbeing swapped")
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Merge fixes from Andrew Morton:
"13 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
dma-mapping: avoid oops when parameter cpu_addr is null
mm/hugetlb: use EOPNOTSUPP in hugetlb sysctl handlers
memremap: check pfn validity before passing to pfn_to_page()
mm, thp: fix migration of PTE-mapped transparent huge pages
dax: check return value of dax_radix_entry()
ocfs2: fix return value from ocfs2_page_mkwrite()
arm64: kasan: clear stale stack poison
sched/kasan: remove stale KASAN poison after hotplug
kasan: add functions to clear stack poison
mm: fix mixed zone detection in devm_memremap_pages
list: kill list_force_poison()
mm: __delete_from_page_cache show Bad page if mapped
mm/hugetlb: hugetlb_no_page: rate-limit warning message
To keep consistent with kfree, which tolerate ptr is NULL. We do this
because sometimes we may use goto statement, so that success and failure
case can share parts of the code. But unfortunately, dma_free_coherent
called with parameter cpu_addr is null will cause oops, such as showed
below:
Unable to handle kernel paging request at virtual address ffffffc020d3b2b8
pgd = ffffffc083a61000
[ffffffc020d3b2b8] *pgd=0000000000000000, *pud=0000000000000000
CPU: 4 PID: 1489 Comm: malloc_dma_1 Tainted: G O 4.1.12 #1
Hardware name: ARM64 (DT)
PC is at __dma_free_coherent.isra.10+0x74/0xc8
LR is at __dma_free+0x9c/0xb0
Process malloc_dma_1 (pid: 1489, stack limit = 0xffffffc0837fc020)
[...]
Call trace:
__dma_free_coherent.isra.10+0x74/0xc8
__dma_free+0x9c/0xb0
malloc_dma+0x104/0x158 [dma_alloc_coherent_mtmalloc]
kthread+0xec/0xfc
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>