Now, multiple CPUs can receive an external NMI simultaneously by
specifying the "apic_extnmi=all" command line parameter. When we take
a crash dump by using external NMI with this option, we fail to save
registers into the crash dump. This happens as follows:
CPU 0 CPU 1
================================ =============================
receive an external NMI
default_do_nmi() receive an external NMI
spin_lock(&nmi_reason_lock) default_do_nmi()
io_check_error() spin_lock(&nmi_reason_lock)
panic() busy loop
...
kdump_nmi_shootdown_cpus()
issue NMI IPI -----------> blocked until IRET
busy loop...
Here, since CPU 1 is in NMI context, an additional NMI from CPU 0
remains unhandled until CPU 1 IRETs. However, CPU 1 will never execute
IRET so the NMI is not handled and the callback function to save
registers is never called.
To solve this issue, we check if the IPI for crash dumping was issued
while waiting for nmi_reason_lock to be released, and if so, call its
callback function directly. If the IPI is not issued (e.g. kdump is
disabled), the actual behavior doesn't change.
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: kexec@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stefan Lippers-Hollmann <s.l-h@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/20151210065245.4587.39316.stgit@softrs
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch introduces a command line parameter apic_extnmi:
apic_extnmi=( bsp|all|none )
The default value is "bsp" and this is the current behavior: only the
Boot-Strapping Processor receives an external NMI.
"all" allows external NMIs to be broadcast to all CPUs. This would
raise the success rate of panic on NMI when BSP hangs in NMI context
or the external NMI is swallowed by other NMI handlers on the BSP.
If you specify "none", no CPUs receive external NMIs. This is useful for
the dump capture kernel so that it cannot be shot down by accidentally
pressing the external NMI button (on platforms which have it) while
saving a crash dump.
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Bandan Das <bsd@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: kexec@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: "Maciej W. Rozycki" <macro@linux-mips.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/20151210014632.25437.43778.stgit@softrs
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Currently, panic() and crash_kexec() can be called at the same time.
For example (x86 case):
CPU 0:
oops_end()
crash_kexec()
mutex_trylock() // acquired
nmi_shootdown_cpus() // stop other CPUs
CPU 1:
panic()
crash_kexec()
mutex_trylock() // failed to acquire
smp_send_stop() // stop other CPUs
infinite loop
If CPU 1 calls smp_send_stop() before nmi_shootdown_cpus(), kdump
fails.
In another case:
CPU 0:
oops_end()
crash_kexec()
mutex_trylock() // acquired
<NMI>
io_check_error()
panic()
crash_kexec()
mutex_trylock() // failed to acquire
infinite loop
Clearly, this is an undesirable result.
To fix this problem, this patch changes crash_kexec() to exclude others
by using the panic_cpu atomic.
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: kexec@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Minfei Huang <mnfhuang@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/20151210014630.25437.94161.stgit@softrs
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Currently, kdump_nmi_shootdown_cpus(), a subroutine of crash_kexec(),
sends an NMI IPI to CPUs which haven't called panic() to stop them,
save their register information and do some cleanups for crash dumping.
However, if such a CPU is infinitely looping in NMI context, we fail to
save its register information into the crash dump.
For example, this can happen when unknown NMIs are broadcast to all
CPUs as follows:
CPU 0 CPU 1
=========================== ==========================
receive an unknown NMI
unknown_nmi_error()
panic() receive an unknown NMI
spin_trylock(&panic_lock) unknown_nmi_error()
crash_kexec() panic()
spin_trylock(&panic_lock)
panic_smp_self_stop()
infinite loop
kdump_nmi_shootdown_cpus()
issue NMI IPI -----------> blocked until IRET
infinite loop...
Here, since CPU 1 is in NMI context, the second NMI from CPU 0 is
blocked until CPU 1 executes IRET. However, CPU 1 never executes IRET,
so the NMI is not handled and the callback function to save registers is
never called.
In practice, this can happen on some servers which broadcast NMIs to all
CPUs when the NMI button is pushed.
To save registers in this case, we need to:
a) Return from NMI handler instead of looping infinitely
or
b) Call the callback function directly from the infinite loop
Inherently, a) is risky because NMI is also used to prevent corrupted
data from being propagated to devices. So, we chose b).
This patch does the following:
1. Move the infinite looping of CPUs which haven't called panic() in NMI
context (actually done by panic_smp_self_stop()) outside of panic() to
enable us to refer pt_regs. Please note that panic_smp_self_stop() is
still used for normal context.
2. Call a callback of kdump_nmi_shootdown_cpus() directly to save
registers and do some cleanups after setting waiting_for_crash_ipi which
is used for counting down the number of CPUs which handled the callback
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Gobinda Charan Maji <gobinda.cemk07@gmail.com>
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Javi Merino <javi.merino@arm.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: kexec@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: lkml <linux-kernel@vger.kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Stefan Lippers-Hollmann <s.l-h@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Link: http://lkml.kernel.org/r/20151210014628.25437.75256.stgit@softrs
[ Cleanup comments, fixup formatting. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
If panic on NMI happens just after panic() on the same CPU, panic() is
recursively called. Kernel stalls, as a result, after failing to acquire
panic_lock.
To avoid this problem, don't call panic() in NMI context if we've
already entered panic().
For that, introduce nmi_panic() macro to reduce code duplication. In
the case of panic on NMI, don't return from NMI handlers if another CPU
already panicked.
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Gobinda Charan Maji <gobinda.cemk07@gmail.com>
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Javi Merino <javi.merino@arm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: kexec@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: lkml <linux-kernel@vger.kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Link: http://lkml.kernel.org/r/20151210014626.25437.13302.stgit@softrs
[ Cleanup comments, fixup formatting. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
- Fix a potential regression in the generic power domains
framework introduced during the 4.3 development cycle that
may lead to spurious failures of system suspend in certain
situations (Ulf Hansson).
- Fix a problem in the power capping RAPL (Running Average
Power Limits) driver that causes it to initialize successfully
on some systems where it is not supposed to do that which is
due to an incorrect check in an initialization routine (Prarit
Bhargava).
- Fix a build problem in the cpufreq Tegra driver that depends
on the regulator framework, but that dependency is not reflected
in Kconfig (Arnd Bergmann).
- Fix a recent mistake in the intel_pstate driver where a numeric
constant is used directly instead of a symbol defined specifically
for the case in question (Prarit Bhargava).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJWdNhOAAoJEILEb/54YlRxFBUP/0KwbDo7BLdIjUjyi7y3IG39
oxcb4volIE69Z7e332Mm6y+hG/CXcfOwTde16+WIYOZfmmHfP1c6bamOoJOwcFI8
RAU9UmMtsXJaEjtiYx6+J97hLh3pmORZdXPnJqLPICIq0/kG356nqoPDEoTDaXGp
JRbt4h6mcTFX/rErI1hpk4/QXQMOSKSQUbz4bzTfozgn4y1j0DJhr+TBOoCZJdjE
r9h6Uk4y1E4f9kHy/25wgKlEB3LPqBm5IcFF1lxzndYIyGQJV829FYo+CuVoLMpj
tlM1yFhdqSZ9ejCPkAac75i92trWaxADuO9WT/RNhSGEyvpzh43IvGZKbJzKoXIf
svHkMwCCOcI4x6f6ZXpINSAmk8Vl+QzlBy35f8U9F5JX1ZPN21MsQRX1+QfzKFO/
TnsGW/GXPIH0cooucBqMS+NieN61R3AKYWsc4Tka7aXWGpdzvr/hvwaEoItxtTti
WXNn21PxOrQxcSjb46Mwikeniyhw2sLveZJRW2jCgwkYSLOOB3+gz+yu+q2yYlXh
rzr+ObihVMq0DUte2JfFSUYL2hJDIkwTIa/Rqmq2PJShDBDUmw38RIXZ2EcnxdiA
g32cboGtS5QvsSQzM2tr9fbioxl7n0zva0p64fSUq97fuX0em7y4kMR4m3y7WOE4
Nlpb0/dyECRpCvFBuuin
=da89
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-4.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix a potential regression introduced during the 4.3 cycle
(generic power domains framework), a nasty bug that has been present
forever (power capping RAPL driver), a build issue (Tegra cpufreq
driver) and a minor ugliness introduced recently (intel_pstate).
Specifics:
- Fix a potential regression in the generic power domains framework
introduced during the 4.3 development cycle that may lead to
spurious failures of system suspend in certain situations (Ulf
Hansson).
- Fix a problem in the power capping RAPL (Running Average Power
Limits) driver that causes it to initialize successfully on some
systems where it is not supposed to do that which is due to an
incorrect check in an initialization routine (Prarit Bhargava).
- Fix a build problem in the cpufreq Tegra driver that depends on the
regulator framework, but that dependency is not reflected in
Kconfig (Arnd Bergmann).
- Fix a recent mistake in the intel_pstate driver where a numeric
constant is used directly instead of a symbol defined specifically
for the case in question (Prarit Bhargava)"
* tag 'pm+acpi-4.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
powercap / RAPL: fix BIOS lock check
cpufreq: intel_pstate: Minor cleanup for FRAC_BITS
cpufreq: tegra: add regulator dependency for T124
PM / Domains: Allow runtime PM callbacks to be re-used during system PM
Three fixes this time, two in SES picked up by KASAN for various types of
buffer overrun. The first is a USB array which returns page 8 whatever is
asked for and causes us to overrun with incorrect data format assumptions and
the second is an invalid iteration of page 10 (the additional information
page). The final one is a reversion of a NULL deref fix which caused
suspend/resume not to be called in pairs leading to incorrect device operation
(Jens has queued a more proper fix for the problem in block).
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABAgAGBQJWdLxTAAoJEDeqqVYsXL0MwOYH+wYb27NxfyA7+q7z/dFz+LhQ
B9RlUfnEw57vVz7KEwleqJ9uA2jprCQndMqRoelmWtxeu5CVUBbq/1ONDWvPX2ha
Prr3wVp+SbqbtzmvGQrQ8If7o4iS47fXtwUe5RRDBdfKMUfXs7LeVBgQrpZsqlkE
va6LNKVqzYW4sneC+CfWcwwyedLGeaphNBYygKtCm7SfEkbnfH5+zhWH9JWwtYXf
r8VCCUnmF69ocx4a7MZLnSAJuXfzaJl45c0nhRiHTiokW7KYuylJm0Zd1PYkhwhV
rQr53otJsdPTyZUjmeCdS6PBlGp/HVdYIOyKt5b4Ti2S71ij9R52YPY6BdtIWeQ=
=6New
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Three fixes this time, two in SES picked up by KASAN for various types
of buffer overrun. The first is a USB array which returns page 8
whatever is asked for and causes us to overrun with incorrect data
format assumptions and the second is an invalid iteration of page 10
(the additional information page).
The final fix is a reversion of a NULL deref fix which caused
suspend/resume not to be called in pairs leading to incorrect device
operation (Jens has queued a more proper fix for the problem in
block)"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
ses: fix additional element traversal bug
Revert "SCSI: Fix NULL pointer dereference in runtime PM"
ses: Fix problems with simple enclosures
Pull btrfs fixes from Chris Mason:
"A couple of small fixes"
* 'for-linus-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: check prepare_uptodate_page() error code earlier
Btrfs: check for empty bitmap list in setup_cluster_bitmaps
btrfs: fix misleading warning when space cache failed to load
Btrfs: fix transaction handle leak in balance
Btrfs: fix unprotected list move from unused_bgs to deleted_bgs list
Merge misc fixes from Andrew Morton:
"Three patches"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
include/linux/mmdebug.h: should include linux/bug.h
mm/zswap: change incorrect strncmp use to strcmp
proc: fix -ESRCH error when writing to /proc/$pid/coredump_filter
mmdebug.h uses BUILD_BUG_ON_INVALID(), assuming someone else included
linux/bug.h. Include it ourselves.
This saves build-failures such as:
arch/arm64/include/asm/pgtable.h: In function 'set_pte_at':
arch/arm64/include/asm/pgtable.h:281:3: error: implicit declaration of function 'BUILD_BUG_ON_INVALID' [-Werror=implicit-function-declaration]
VM_WARN_ONCE(!pte_young(pte),
Fixes: 02602a18c3 ("bug: completely remove code generated by disabled VM_BUG_ON()")
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change the use of strncmp in zswap_pool_find_get() to strcmp.
The use of strncmp is no longer correct, now that zswap_zpool_type is
not an array; sizeof() will return the size of a pointer, which isn't
the right length to compare. We don't need to use strncmp anyway,
because the existing params and the passed in params are all guaranteed
to be null terminated, so strcmp should be used.
Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Reported-by: Weijie Yang <weijie.yang@samsung.com>
Cc: Seth Jennings <sjennings@variantweb.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Writing to /proc/$pid/coredump_filter always returns -ESRCH because commit
774636e19e ("proc: convert to kstrto*()/kstrto*_from_user()") removed
the setting of ret after the get_proc_task call and incorrectly left it as
-ESRCH. Instead, return 0 when successful.
Example breakage:
echo 0 > /proc/self/coredump_filter
bash: echo: write error: No such process
Fixes: 774636e19e ("proc: convert to kstrto*()/kstrto*_from_user()")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: <stable@vger.kernel.org> [4.3+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Select CONFIG_BITREVERSE for sht15 driver to avoid build failure
if it is not configured.
- Force wait for conversion time for the first valid data in tmp102
driver to avoid reporting erroneous data to the thermal subsystem.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWdGPZAAoJEMsfJm/On5mB8cMP/3G7+GElyMhUtNfXszliz66n
1pVi+Y/+YumGzLxzi1QMyw+4edQGGDLUZarciuQ4ADtpJ0FCFPwLrv3SyVMW4Cxf
/4OpKQo/2677xfh8Y2ZrTbVBgN3y5/6LNCdy4qNKOw4GVYjF7+7oL4owgSzyUmG/
vHZs9u0TQo78opTfdC1lUr+DjzkfFNvaEtRHywSQjdzylMjpgeTSU/Cr5pzcWeoh
wL8/mpS8KmIwiCYz12dohlqF8i7a7TV5HE5QKbDQqFXEoWSt5lUiaSkUrqVZUXlW
heXxM9CG01bicHkZJ4AOjMoqtEcZyfiU1MlumvTUAtVtzWkyHOWHQgGjIJFE6o6f
caXYjGUpTEhe5TSmMQW3Wlp6AEKegse5VLTR2Y4UzHkR28GoyR543MGtkR6w1Ux8
sea9JWwyYdvIcwmol9ivdG7V/ymCpJhT3yZxBIJSCrKbCfQuBszfnSmAUqMbFRBj
YD2FBVMAyt2rVjzPwUA9CKK/ERmgUZXftdiPCbcnHHPeT4tzx+KKzTE2cOpHYl3t
IeID+95pXoAukPF6F1kq6y2zR2Uzdqn5f5VeekGn9Bo+aZmL3zwAZFUgF4u9Tqzy
cO7Cn1b5fb6W5pdFEImK57irZSSSb6GZLlZXsHq9RRezdFQvJdyX0bPsX1Z7Fn+b
zuz5e1GhF+l5LpUUIKz8
=h83R
-----END PGP SIGNATURE-----
Merge tag 'hwmon-for-linus-v4.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- Select CONFIG_BITREVERSE for sht15 driver to avoid build failure if
it is not configured.
- Force wait for conversion time for the first valid data in tmp102
driver to avoid reporting erroneous data to the thermal subsystem.
* tag 'hwmon-for-linus-v4.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (sht15) Select CONFIG_BITREVERSE
hwmon: (tmp102) Force wait for conversion time for the first valid data
* Two similar fixes for the Intel and AMD IOMMU drivers to add
proper access checks before calling handle_mm_fault.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJWdCp7AAoJECvwRC2XARrjjAIP/0ihW2zF4R622RgY1C1Cm62j
0eb/R4UqjI3PG0KsURgDHcIm9JP5Z//dgKTOtNX9KOkHlXLcO9MMSD5chVBd4HKG
+Mgx7RM+Mr7f6ElRUa6s1GY1tcJlGf43fW5cMQ44BJIqVXlE47go4U09D86DVgXy
KgyBxQldeOrkXZvAG82WLjGgkdGALQjbDlI8ktmfYWXAvIRWNGJqWY16BwAYOWfb
9d3+1JPekSSBWHC6H+qbkDb8ueO69/Ux0HL5z2Q0zchqGjBb1gnfwLcz865KZpOB
qUwsKFSXTl+jPCrAaLYJnVqAnH4qqKaF6WKAJSIHObTSVqXKHpFHrQrlGVzOvYNn
s3216KIMsxG2nnvSgXCOFGqM/810MH2MSo8YcF5A3celrka3j2Gj08mxInrZXN7D
3p51HSwq8ePo4i5jppT5ldOBSjNV9N3wKWcjDb4OL+OfkJc/u2VbSHNQtpvTclsV
V6VSfWLDC8BCmUveMH2TrawQWkKOz0LqgqfQPX+VvSCIM7tgkrgVsTJrijPtGOs1
zid/A/cfqMdBezSVALrZfB4OVBaM2UL2LJmmLJgApYV+N55Oxmx+nxnMr0aT5KlY
crjcnVaypkq3rG1Wjpt+nTTwtllB0yXNEywQcu2edeswmaQCqsEgQRsDqi6S2/+S
c8l9JKoTrB4+vToYjXyW
=qrAB
-----END PGP SIGNATURE-----
Merge tag 'iommu-fixes-v4.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:
"Two similar fixes for the Intel and AMD IOMMU drivers to add proper
access checks before calling handle_mm_fault"
* tag 'iommu-fixes-v4.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/vt-d: Do access checks before calling handle_mm_fault()
iommu/amd: Do proper access checking before calling handle_mm_fault()
- XSA-155 security fixes to backend drivers.
- XSA-157 security fixes to pciback.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJWdDrXAAoJEFxbo/MsZsTR3N0H/0Lvz6MWBARCje7livbz7nqE
PS0Bea+2yAfNhCDDiDlpV0lor8qlyfWDF6lGhLjItldAzahag3ZDKDf1Z/lcQvhf
3MwFOcOVZE8lLtvLT6LGnPuehi1Mfdi1Qk1/zQhPhsq6+FLPLT2y+whmBihp8mMh
C12f7KRg5r3U7eZXNB6MEtGA0RFrOp0lBdvsiZx3qyVLpezj9mIe0NueQqwY3QCS
xQ0fILp/x2EnZNZuzgghFTPRxMAx5ReOezgn9Rzvq4aThD+irz1y6ghkYN4rG2s2
tyYOTqBnjJEJEQ+wmYMhnfCwVvDffztG+uI9hqN31QFJiNB0xsjSWFCkDAWchiU=
=Argz
-----END PGP SIGNATURE-----
Merge tag 'for-linus-4.4-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen bug fixes from David Vrabel:
- XSA-155 security fixes to backend drivers.
- XSA-157 security fixes to pciback.
* tag 'for-linus-4.4-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen-pciback: fix up cleanup path when alloc fails
xen/pciback: Don't allow MSI-X ops if PCI_COMMAND_MEMORY is not set.
xen/pciback: For XEN_PCI_OP_disable_msi[|x] only disable if device has MSI(X) enabled.
xen/pciback: Do not install an IRQ handler for MSI interrupts.
xen/pciback: Return error on XEN_PCI_OP_enable_msix when device has MSI or MSI-X enabled
xen/pciback: Return error on XEN_PCI_OP_enable_msi when device has MSI or MSI-X enabled
xen/pciback: Save xen_pci_op commands before processing it
xen-scsiback: safely copy requests
xen-blkback: read from indirect descriptors only once
xen-blkback: only read request operation from shared ring once
xen-netback: use RING_COPY_REQUEST() throughout
xen-netback: don't use last request to determine minimum Tx credit
xen: Add RING_COPY_REQUEST()
xen/x86/pvh: Use HVM's flush_tlb_others op
xen: Resume PMU from non-atomic context
xen/events/fifo: Consume unprocessed events when a CPU dies
As usual in rc6, this update contains only a few HD-audio and
USB-audio device-specific quirks: yet another Thinkpad noise fixes,
Dell headphone mic fixes, and AudioQuest DragonFly fixes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJWc8fIAAoJEGwxgFQ9KSmkxtMP/0k05GN1EBQyKVhAFgO24N0g
zNoLAEzNylXfFD3p8Dq0YVrnebQ8AuA5SaCZmREjikG/r02JozniDhTFKk2igb0d
RSo+aRC6TnsIfKw7wNIlNUFM3c7ysuHecMtP7RKux6GSNM4lyfKmUWcUXKZkxq97
CXFEyZ82vX3Y7vc5vrXblx5pr0c61Urzel/b8li+noBec8G91FMgAbAIddjZBFhU
J14qSWlWnY92aiclYNeH9CHTk8j5gVkS4Vg2XsDrl1iOSHrKrAc40Tm1+sIjT7RA
2OoAjGyrSdW6v3rxakbqT/Wmz6lZOhfeoRwNCphLbhN5UalNj8QryvoF5ypU/ypr
oPDKTmy4AJ1XU9kGmr5OfTIOl4XVMTK4QpcTqkDJQW3sBDCk0vunZdk84YDx/rNf
26GjpeDfqzEGW9CQdPbpKgeaMMIPSdBD62IIuTD0lRQuelz2KH35PYsj6u+kYqJD
vgPUeETN5TCaqyafIWaYmUIcXWEfYRLQMUXJK0G+317bKFcBeQQAU9UnHennwmGl
AGLggl2zfAafspl/wImbKtvhHWmGAKPBCS2R0YNuO83Wvl5RJG/Jt7hyyUxi7n8h
bD5ao9BCpTnIqCqZskuJYGbNV4fVHTthjcqhak0+vAf0ymYBF4VlSAOySE3BrO/m
qkkEH2zJ2/IAOjJlvZfg
=bf1H
-----END PGP SIGNATURE-----
Merge tag 'sound-4.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"As usual in rc6, this update contains only a few HD-audio and
USB-audio device-specific quirks: yet another Thinkpad noise fixes,
Dell headphone mic fixes, and AudioQuest DragonFly fixes"
* tag 'sound-4.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Add a fixup for Thinkpad X1 Carbon 2nd
ALSA: hda - Set codec to D3 at reboot/shutdown on Thinkpads
ALSA: hda - Apply click noise workaround for Thinkpads generically
ALSA: hda - Fix headphone mic input on a few Dell ALC293 machines
ALSA: usb-audio: Add sample rate inquiry quirk for AudioQuest DragonFly
ALSA: usb-audio: Add a more accurate volume quirk for AudioQuest DragonFly
A little bit of a last-minute change for the device tree "fixed partition"
binding. This is needed because we might want to reuse the 'partitions' subnode
for other sorts of partitioning descriptions -- e.g., for describing which
on-flash partition format(s) might be used on the system.
Also tone down a warning message, since it is probably going to show up on a
lot of systems where it should just be ignored.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWc3C5AAoJEFySrpd9RFgtFlIP/jXyyH0ZmfRopa4idhF7bit1
aVZO3YuVw0UG8iXhHp4tpIIHt68P4VdozMV9iehy8dm31Jx/l71rMfTh6TQDmb9Q
mhzY5Oz5lu7zQvoj0pzIeQOAdg83r+eVCjpSfA8yHiVyoHz5CNHJHGiUyerJ/NLG
KCLWZEz9/VEEcuiOHs27dnvSgxQygqAf2gCGJQlvXSZJV6mv50l5KPxkt/vMe3Fz
3i1hjcOvKW5Md1DZHMBg06Mma04pDDE55whZConrGzIMd1jPQ/IbIveud2SmmuU8
+JzRKv//DIqxdJdbp3ybSh2eOPaRx63fAUOUtUdfxQkwz25avs11YWr8daQ4PFsy
wiDR/wTaP9kZyXwupFYqDhsvj7NXHVp/i00Q0FMq7ryrXYMbrOWMwAZgeTRZ3unr
Bq5Js+lbl9H/NVG/GPIKLrDj5n8eNh7DnU5x1ZpSu5wVNxXoSMNuU8piv2nb81RT
Cj4r6eARg5g0+SCz9h57BCIeVwGRtJT9T/PLga6osL/DN3DhMxBf2nSokA53lyXu
1dJh0m+LzgULKJV0s/0/GZ2E8TbZFqgzqq74YCMywciDa9qwvO3qgZdzM55102yx
QNzC5yQrVBraeDQ0dJ8VSOm6cB7sg7O66QhYAu6d/YdqB3CTuto2ehK2ffKLkpkg
5C6hpaIHCchBcFpqeHsT
=UcSG
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20151217' of git://git.infradead.org/linux-mtd
Pull MTD fixes from Brian Norris:
"I was holding out on this pull request for a bit, since there are a
few other small issues being discussed that look like 4.4-rc
regressions. Hopefully I can get those stabilized soon, but these are
ready at any rate:
- A little bit of a last-minute change for the device tree "fixed
partition" binding. This is needed because we might want to reuse
the 'partitions' subnode for other sorts of partitioning
descriptions -- e.g., for describing which on-flash partition
format(s) might be used on the system.
- Also tone down a warning message, since it is probably going to
show up on a lot of systems where it should just be ignored"
* tag 'for-linus-20151217' of git://git.infradead.org/linux-mtd:
doc: dt: mtd: partitions: add compatible property to "partitions" node
mtd: ofpart: don't complain about missing 'partitions' node too loudly
Driver requested device firmware version string during probe using
only 24 byte long buffer. That buffer is too small for newer firmware
versions, which causes device firmware hang - device stops responding
to any commands after that. Increase buffer size to 128 which should
be enough for any current and future version strings.
Link: https://github.com/airspy/host/issues/27
Cc: <stable@vger.kernel.org> # 3.17+
Reported-by: Benjamin Vernoux <bvernoux@gmail.com>
Signed-off-by: Antti Palosaari <crope@iki.fi>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Used Avago MGA-81563 RF amplifier could be destroyed pretty easily
with too strong signal or transmitting to bad antenna.
Add module parameter 'enable_rf_gain_ctrl' which allows enabling
RF gain control - otherwise, default without the module parameter,
RF gain control is set to 'grabbed' state which prevents setting
value to the control.
Signed-off-by: Antti Palosaari <crope@iki.fi>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
drivers/media/usb/hackrf/hackrf.c:1533 hackrf_probe()
error: we previously assumed 'dev' could be null (see line 1366)
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Antti Palosaari <crope@iki.fi>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
When allocating a pciback device fails, clear the private
field. This could lead to an use-after free, however
the 'really_probe' takes care of setting
dev_set_drvdata(dev, NULL) in its failure path (which we would
exercise if the ->probe function failed), so we we
are OK. However lets be defensive as the code can change.
Going forward we should clean up the pci_set_drvdata(dev, NULL)
in the various code-base. That will be for another day.
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reported-by: Jonathan Creekmore <jonathan.creekmore@gmail.com>
Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
If CONFIG_BITREVERSE is not built-in, the sht15 driver fails to link:
drivers/built-in.o: In function `sht15_crc8':
drivers/hwmon/sht15.c:195: undefined reference to `byte_rev_table'
This adds a Kconfig 'select' statement, like all other users of
bitrev.h have it.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 33836ee985 ("hwmon:change sht15_reverse()")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
commit f598282f51 ("PCI: Fix the NIU MSI-X problem in a better way")
teaches us that dealing with MSI-X can be troublesome.
Further checks in the MSI-X architecture shows that if the
PCI_COMMAND_MEMORY bit is turned of in the PCI_COMMAND we
may not be able to access the BAR (since they are memory regions).
Since the MSI-X tables are located in there.. that can lead
to us causing PCIe errors. Inhibit us performing any
operation on the MSI-X unless the MEMORY bit is set.
Note that Xen hypervisor with:
"x86/MSI-X: access MSI-X table only after having enabled MSI-X"
will return:
xen_pciback: 0000:0a:00.1: error -6 enabling MSI-X for guest 3!
When the generic MSI code tries to setup the PIRQ without
MEMORY bit set. Which means with later versions of Xen
(4.6) this patch is not neccessary.
This is part of XSA-157
CC: stable@vger.kernel.org
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Otherwise just continue on, returning the same values as
previously (return of 0, and op->result has the PIRQ value).
This does not change the behavior of XEN_PCI_OP_disable_msi[|x].
The pci_disable_msi or pci_disable_msix have the checks for
msi_enabled or msix_enabled so they will error out immediately.
However the guest can still call these operations and cause
us to disable the 'ack_intr'. That means the backend IRQ handler
for the legacy interrupt will not respond to interrupts anymore.
This will lead to (if the device is causing an interrupt storm)
for the Linux generic code to disable the interrupt line.
Naturally this will only happen if the device in question
is plugged in on the motherboard on shared level interrupt GSI.
This is part of XSA-157
CC: stable@vger.kernel.org
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Otherwise an guest can subvert the generic MSI code to trigger
an BUG_ON condition during MSI interrupt freeing:
for (i = 0; i < entry->nvec_used; i++)
BUG_ON(irq_has_action(entry->irq + i));
Xen PCI backed installs an IRQ handler (request_irq) for
the dev->irq whenever the guest writes PCI_COMMAND_MEMORY
(or PCI_COMMAND_IO) to the PCI_COMMAND register. This is
done in case the device has legacy interrupts the GSI line
is shared by the backend devices.
To subvert the backend the guest needs to make the backend
to change the dev->irq from the GSI to the MSI interrupt line,
make the backend allocate an interrupt handler, and then command
the backend to free the MSI interrupt and hit the BUG_ON.
Since the backend only calls 'request_irq' when the guest
writes to the PCI_COMMAND register the guest needs to call
XEN_PCI_OP_enable_msi before any other operation. This will
cause the generic MSI code to setup an MSI entry and
populate dev->irq with the new PIRQ value.
Then the guest can write to PCI_COMMAND PCI_COMMAND_MEMORY
and cause the backend to setup an IRQ handler for dev->irq
(which instead of the GSI value has the MSI pirq). See
'xen_pcibk_control_isr'.
Then the guest disables the MSI: XEN_PCI_OP_disable_msi
which ends up triggering the BUG_ON condition in 'free_msi_irqs'
as there is an IRQ handler for the entry->irq (dev->irq).
Note that this cannot be done using MSI-X as the generic
code does not over-write dev->irq with the MSI-X PIRQ values.
The patch inhibits setting up the IRQ handler if MSI or
MSI-X (for symmetry reasons) code had been called successfully.
P.S.
Xen PCIBack when it sets up the device for the guest consumption
ends up writting 0 to the PCI_COMMAND (see xen_pcibk_reset_device).
XSA-120 addendum patch removed that - however when upstreaming said
addendum we found that it caused issues with qemu upstream. That
has now been fixed in qemu upstream.
This is part of XSA-157
CC: stable@vger.kernel.org
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The guest sequence of:
a) XEN_PCI_OP_enable_msix
b) XEN_PCI_OP_enable_msix
results in hitting an NULL pointer due to using freed pointers.
The device passed in the guest MUST have MSI-X capability.
The a) constructs and SysFS representation of MSI and MSI groups.
The b) adds a second set of them but adding in to SysFS fails (duplicate entry).
'populate_msi_sysfs' frees the newly allocated msi_irq_groups (note that
in a) pdev->msi_irq_groups is still set) and also free's ALL of the
MSI-X entries of the device (the ones allocated in step a) and b)).
The unwind code: 'free_msi_irqs' deletes all the entries and tries to
delete the pdev->msi_irq_groups (which hasn't been set to NULL).
However the pointers in the SysFS are already freed and we hit an
NULL pointer further on when 'strlen' is attempted on a freed pointer.
The patch adds a simple check in the XEN_PCI_OP_enable_msix to guard
against that. The check for msi_enabled is not stricly neccessary.
This is part of XSA-157
CC: stable@vger.kernel.org
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The guest sequence of:
a) XEN_PCI_OP_enable_msi
b) XEN_PCI_OP_enable_msi
c) XEN_PCI_OP_disable_msi
results in hitting an BUG_ON condition in the msi.c code.
The MSI code uses an dev->msi_list to which it adds MSI entries.
Under the above conditions an BUG_ON() can be hit. The device
passed in the guest MUST have MSI capability.
The a) adds the entry to the dev->msi_list and sets msi_enabled.
The b) adds a second entry but adding in to SysFS fails (duplicate entry)
and deletes all of the entries from msi_list and returns (with msi_enabled
is still set). c) pci_disable_msi passes the msi_enabled checks and hits:
BUG_ON(list_empty(dev_to_msi_list(&dev->dev)));
and blows up.
The patch adds a simple check in the XEN_PCI_OP_enable_msi to guard
against that. The check for msix_enabled is not stricly neccessary.
This is part of XSA-157.
CC: stable@vger.kernel.org
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Double fetch vulnerabilities that happen when a variable is
fetched twice from shared memory but a security check is only
performed the first time.
The xen_pcibk_do_op function performs a switch statements on the op->cmd
value which is stored in shared memory. Interestingly this can result
in a double fetch vulnerability depending on the performed compiler
optimization.
This patch fixes it by saving the xen_pci_op command before
processing it. We also use 'barrier' to make sure that the
compiler does not perform any optimization.
This is part of XSA155.
CC: stable@vger.kernel.org
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The copy of the ring request was lacking a following barrier(),
potentially allowing the compiler to optimize the copy away.
Use RING_COPY_REQUEST() to ensure the request is copied to local
memory.
This is part of XSA155.
CC: stable@vger.kernel.org
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Since indirect descriptors are in memory shared with the frontend, the
frontend could alter the first_sect and last_sect values after they have
been validated but before they are recorded in the request. This may
result in I/O requests that overflow the foreign page, possibly
overwriting local pages when the I/O request is executed.
When parsing indirect descriptors, only read first_sect and last_sect
once.
This is part of XSA155.
CC: stable@vger.kernel.org
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
A compiler may load a switch statement value multiple times, which could
be bad when the value is in memory shared with the frontend.
When converting a non-native request to a native one, ensure that
src->operation is only loaded once by using READ_ONCE().
This is part of XSA155.
CC: stable@vger.kernel.org
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Instead of open-coding memcpy()s and directly accessing Tx and Rx
requests, use the new RING_COPY_REQUEST() that ensures the local copy
is correct.
This is more than is strictly necessary for guest Rx requests since
only the id and gref fields are used and it is harmless if the
frontend modifies these.
This is part of XSA155.
CC: stable@vger.kernel.org
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The last from guest transmitted request gives no indication about the
minimum amount of credit that the guest might need to send a packet
since the last packet might have been a small one.
Instead allow for the worst case 128 KiB packet.
This is part of XSA155.
CC: stable@vger.kernel.org
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Using RING_GET_REQUEST() on a shared ring is easy to use incorrectly
(i.e., by not considering that the other end may alter the data in the
shared ring while it is being inspected). Safe usage of a request
generally requires taking a local copy.
Provide a RING_COPY_REQUEST() macro to use instead of
RING_GET_REQUEST() and an open-coded memcpy(). This takes care of
ensuring that the copy is done correctly regardless of any possible
compiler optimizations.
Use a volatile source to prevent the compiler from reordering or
omitting the copy.
This is part of XSA155.
CC: stable@vger.kernel.org
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Pull networking fixes from David Miller:
1) Fix uninitialized variable warnings in nfnetlink_queue, a lot of
people reported this... From Arnd Bergmann.
2) Don't init mutex twice in i40e driver, from Jesse Brandeburg.
3) Fix spurious EBUSY in rhashtable, from Herbert Xu.
4) Missing DMA unmaps in mvpp2 driver, from Marcin Wojtas.
5) Fix race with work structure access in pppoe driver causing
corruptions, from Guillaume Nault.
6) Fix OOPS due to sh_eth_rx() not checking whether netdev_alloc_skb()
actually succeeded or not, from Sergei Shtylyov.
7) Don't lose flags when settifn IFA_F_OPTIMISTIC in ipv6 code, from
Bjørn Mork.
8) VXLAN_HD_RCO defined incorrectly, fix from Jiri Benc.
9) Fix clock source used for cookies in SCTP, from Marcelo Ricardo
Leitner.
10) aurora driver needs HAS_DMA dependency, from Geert Uytterhoeven.
11) ndo_fill_metadata_dst op of vxlan has to handle ipv6 tunneling
properly as well, from Jiri Benc.
12) Handle request sockets properly in xfrm layer, from Eric Dumazet.
13) Double stats update in ipv6 geneve transmit path, fix from Pravin B
Shelar.
14) sk->sk_policy[] needs RCU protection, and as a result
xfrm_policy_destroy() needs to free policies using an RCU grace
period, from Eric Dumazet.
15) SCTP needs to clone ipv6 tx options in order to avoid use after
free, from Eric Dumazet.
16) Missing kbuild export if ila.h, from Stephen Hemminger.
17) Missing mdiobus_alloc() return value checking in mdio-mux.c, from
Tobias Klauser.
18) Validate protocol value range in ->create() methods, from Hannes
Frederic Sowa.
19) Fix early socket demux races that result in illegal dst reuse, from
Eric Dumazet.
20) Validate socket address length in pptp code, from WANG Cong.
21) skb_reorder_vlan_header() uses incorrect offset and can corrupt
packets, from Vlad Yasevich.
22) Fix memory leaks in nl80211 registry code, from Ola Olsson.
23) Timeout loop count handing fixes in mISDN, xgbe, qlge, sfc, and
qlcnic. From Dan Carpenter.
24) msg.msg_iocb needs to be cleared in recvfrom() otherwise, for
example, AF_ALG will interpret it as an async call. From Tadeusz
Struk.
25) inetpeer_set_addr_v4 forgets to initialize the 'vif' field, from
Eric Dumazet.
26) rhashtable enforces the minimum table size not early enough,
breaking how we calculate the per-cpu lock allocations. From
Herbert Xu.
27) Fix FCC port lockup in 82xx driver, from Martin Roth.
28) FOU sockets need to be freed using RCU, from Hannes Frederic Sowa.
29) Fix out-of-bounds access in __skb_complete_tx_timestamp() and
sock_setsockopt() wrt. timestamp handling. From WANG Cong.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (117 commits)
net: check both type and procotol for tcp sockets
drivers: net: xgene: fix Tx flow control
tcp: restore fastopen with no data in SYN packet
af_unix: Revert 'lock_interruptible' in stream receive code
fou: clean up socket with kfree_rcu
82xx: FCC: Fixing a bug causing to FCC port lock-up
gianfar: Don't enable RX Filer if not supported
net: fix warnings in 'make htmldocs' by moving macro definition out of field declaration
rhashtable: Fix walker list corruption
rhashtable: Enforce minimum size on initial hash table
inet: tcp: fix inetpeer_set_addr_v4()
ipv6: automatically enable stable privacy mode if stable_secret set
net: fix uninitialized variable issue
bluetooth: Validate socket address length in sco_sock_bind().
net_sched: make qdisc_tree_decrease_qlen() work for non mq
ser_gigaset: remove unnecessary kfree() calls from release method
ser_gigaset: fix deallocation of platform device structure
ser_gigaset: turn nonsense checks into WARN_ON
ser_gigaset: fix up NULL checks
qlcnic: fix a timeout loop
...
Dmitry reported the following out-of-bound access:
Call Trace:
[<ffffffff816cec2e>] __asan_report_load4_noabort+0x3e/0x40
mm/kasan/report.c:294
[<ffffffff84affb14>] sock_setsockopt+0x1284/0x13d0 net/core/sock.c:880
[< inline >] SYSC_setsockopt net/socket.c:1746
[<ffffffff84aed7ee>] SyS_setsockopt+0x1fe/0x240 net/socket.c:1729
[<ffffffff85c18c76>] entry_SYSCALL_64_fastpath+0x16/0x7a
arch/x86/entry/entry_64.S:185
This is because we mistake a raw socket as a tcp socket.
We should check both sk->sk_type and sk->sk_protocol to ensure
it is a tcp socket.
Willem points out __skb_complete_tx_timestamp() needs to fix as well.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the Tx flow control is based on reading the hardware state,
which is not accurate since it may not reflect the descriptors that
are not yet reached the memory.
To accurately control the Tx flow, changing it to be software based.
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuchung tracked a regression caused by commit 57be5bdad7 ("ip: convert
tcp_sendmsg() to iov_iter primitives") for TCP Fast Open.
Some Fast Open users do not actually add any data in the SYN packet.
Fixes: 57be5bdad7 ("ip: convert tcp_sendmsg() to iov_iter primitives")
Reported-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With b3ca9b02b0, the AF_UNIX SOCK_STREAM
receive code was changed from using mutex_lock(&u->readlock) to
mutex_lock_interruptible(&u->readlock) to prevent signals from being
delayed for an indefinite time if a thread sleeping on the mutex
happened to be selected for handling the signal. But this was never a
problem with the stream receive code (as opposed to its datagram
counterpart) as that never went to sleep waiting for new messages with the
mutex held and thus, wouldn't cause secondary readers to block on the
mutex waiting for the sleeping primary reader. As the interruptible
locking makes the code more complicated in exchange for no benefit,
change it back to using mutex_lock.
Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull drm fixes from Dave Airlie:
"Some i915 fixes, one omap fix, one core regression fix.
Not even enough fixes for a twelve days of xmas song, which seemms
good"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm: Don't overwrite UNVERFIED mode status to OK
drm/omap: fix fbdev pix format to support all platforms
drm/i915: Do a better job at disabling primary plane in the noatomic case.
drm/i915/skl: Double RC6 WRL always on
drm/i915/skl: Disable coarse power gating up until F0
drm/i915: Remove incorrect warning in context cleanup
The Cavium guys reported a soft lockup on their arm64 machine, caused by
commit c55a6ffa62 ("locking/osq: Relax atomic semantics"):
mutex_optimistic_spin+0x9c/0x1d0
__mutex_lock_slowpath+0x44/0x158
mutex_lock+0x54/0x58
kernfs_iop_permission+0x38/0x70
__inode_permission+0x88/0xd8
inode_permission+0x30/0x6c
link_path_walk+0x68/0x4d4
path_openat+0xb4/0x2bc
do_filp_open+0x74/0xd0
do_sys_open+0x14c/0x228
SyS_openat+0x3c/0x48
el0_svc_naked+0x24/0x28
This is because in osq_lock we initialise the node for the current CPU:
node->locked = 0;
node->next = NULL;
node->cpu = curr;
and then publish the current CPU in the lock tail:
old = atomic_xchg_acquire(&lock->tail, curr);
Once the update to lock->tail is visible to another CPU, the node is
then live and can be both read and updated by concurrent lockers.
Unfortunately, the ACQUIRE semantics of the xchg operation mean that
there is no guarantee the contents of the node will be visible before
lock tail is updated. This can lead to lock corruption when, for
example, a concurrent locker races to set the next field.
Fixes: c55a6ffa62 ("locking/osq: Relax atomic semantics"):
Reported-by: David Daney <ddaney@caviumnetworks.com>
Reported-by: Andrew Pinski <andrew.pinski@caviumnetworks.com>
Tested-by: Andrew Pinski <andrew.pinski@caviumnetworks.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1449856001-21177-1-git-send-email-will.deacon@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull libnvdimm fixes from Dan Williams:
- Two bug fixes for misuse of PAGE_MASK in scatterlist and dma-debug.
These are tagged for -stable. The scatterlist impact is potentially
corrupted dma addresses on HIGHMEM enabled platforms.
- A minor locking fix for the NFIT hot-add implementation that is new
in 4.4-rc. This would only trigger in the case a hot-add raced
driver removal.
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
dma-debug: Fix dma_debug_entry offset calculation
Revert "scatterlist: use sg_phys()"
nfit: acpi_nfit_notify(): Do not leave device locked