emac4syn chips has availability to use 8192 rx/tx fifo buffer sizes,
in current state if we set it up in dts 8192 as example, we will get
only 2048 which may impact on network speed.
Signed-off-by: Ivan Mikhaylov <ivan@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the sysfs attribute hangs off the RMI bus, which doesn't go away during
firmware flash, it needs to be explicitly removed, otherwise we would try and
register the same attribute twice.
This reverts commit 36a44af5c1.
Signed-off-by: Nick Dyer <nick@shmanahar.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
The code will try to access dev->iotlb when processing
VHOST_IOTLB_INVALIDATE even if it was not initialized which may lead
to NULL pointer dereference. Fixes this by check dev->iotlb before.
Fixes: 6b1e6cc785 ("vhost: new device IOTLB API")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We used to call mutex_lock() in vhost_dev_lock_vqs() which tries to
hold mutexes of all virtqueues. This may confuse lockdep to report a
possible deadlock because of trying to hold locks belong to same
class. Switch to use mutex_lock_nested() to avoid false positive.
Fixes: 6b1e6cc785 ("vhost: new device IOTLB API")
Reported-by: syzbot+dbb7c1161485e61b0241@syzkaller.appspotmail.com
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since TC block changes drivers are required to check if
the TC hw offload flag is set on the interface themselves.
Fixes: 2f4b411a3d ("i40e: Enable cloud filters via tc-flower")
Fixes: 44ae12a768 ("net: sched: move the can_offload check from binding phase to rule insertion phase")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Amritha Nambiar <amritha.nambiar@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the typo CONFIG_CRYPTO_DES_SPARC64 => CONFIG_CRYPTO_CAMELLIA_SPARC64
Fixes: 81658ad0d9 ("sparc64: Add CAMELLIA driver making use of the new camellia opcodes.")
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Kalderon says:
====================
qed: rdma bug fixes
This patch contains two small bug fixes related to RDMA.
Both related to resource reservations.
====================
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A tid was allocated for reserved MR during initialization but
not freed. This lead to an annoying output message during
rdma unload flow.
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Double reservation for kernel dedicated dpi was performed.
Once in the core module and once in qedr.
Remove the reservation from core.
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert says:
====================
kcm: fix two syzcaller issues
In this patch set:
- Don't allow attaching non-TCP or listener sockets to a KCM mux.
- In kcm_attach Check if sk_user_data is already set. This is
under lock to avoid race conditions. More work is need to make
all of the users of sk_user_data to use the same locking.
- v2
Remove unncessary check for not PF_KCM in kcm_attach (suggested by
Guillaume Nault)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This is needed to prevent sk_user_data being overwritten.
The check is done under the callback lock. This should prevent
a socket from being attached twice to a KCM mux. It also prevents
a socket from being attached for other use cases of sk_user_data
as long as the other cases set sk_user_data under the lock.
Followup work is needed to unify all the use cases of sk_user_data
to use the same locking.
Reported-by: syzbot+114b15f2be420a8886c3@syzkaller.appspotmail.com
Fixes: ab7ac4eb98 ("kcm: Kernel Connection Multiplexor module")
Signed-off-by: Tom Herbert <tom@quantonium.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TCP sockets for IPv4 and IPv6 that are not listeners or in closed
stated are allowed to be attached to a KCM mux.
Fixes: ab7ac4eb98 ("kcm: Kernel Connection Multiplexor module")
Reported-by: syzbot+8865eaff7f9acd593945@syzkaller.appspotmail.com
Signed-off-by: Tom Herbert <tom@quantonium.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TCF_LAYER_LINK and TCF_LAYER_NETWORK returned the same pointer as
skb->data points to the network header.
Use skb_mac_header instead.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
'ptr' is shifted by the offset and then validated,
the memcmp should not add it a second time.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In fixing the readdir+pagefault deadlock I accidentally introduced a
stale entry regression in readdir. If we get close to full for the
temporary buffer, and then skip a few delayed deletions, and then try to
add another entry that won't fit, we will emit the entries we found and
retry. Unfortunately we delete entries from our del_list as we find
them, assuming we won't need them. However our pos will be with
whatever our last entry was, which could be before the delayed deletions
we skipped, so the next search will add the deleted entries back into
our readdir buffer. So instead don't delete entries we find in our
del_list so we can make sure we always find our delayed deletions. This
is a slight perf hit for readdir with lots of pending deletions, but
hopefully this isn't a common occurrence. If it is we can revist this
and optimize it.
cc: stable@vger.kernel.org
Fixes: 23b5ec7494 ("btrfs: fix readdir deadlock with pagefault")
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
One was that ORC didn't know how to handle the ftrace callbacks in general
(which Josh fixed). The other was that ORC would just bail if it hit a
dynamically allocated trampoline. Which means all ftrace stack tracing that
happens from the function tracer would produce no results (that includes
killing the max stack size tracer). I added a check to the ORC unwinder to
see if the trampoline belonged to ftrace, and if it did, use the orc entry
of the static trampoline that was used to create the dynamic one (it would
be identical).
Finally, I noticed that the skip values of the stack tracing were out of
whack. I went through and fixed them up.
-----BEGIN PGP SIGNATURE-----
iQHIBAABCgAyFiEEPm6V/WuN2kyArTUe1a05Y9njSUkFAlpohNcUHHJvc3RlZHRA
Z29vZG1pcy5vcmcACgkQ1a05Y9njSUnJ4wv/evoOzbuF67P1N1ci9qjtAuUzOGMA
jr/x/kHj/jE+w5diXTw0XOlaWzK6tB8BEfaVVcljjjjUdoXzULXCv5zR59ARioio
VhnTt1VPr+4fc5huTcIXYf8NXTNoqzLVBIR7+iO9Qk1v5nwFcJjQThj42enCXQR4
sHdeOGpW4N8UKZ1yw+i95a/JibLbnwiQRQRtMXOqxvJiplJsytWlqZkVsOyMFwA9
X+X6FbBAJYwSktMvcEXvvq7BGTuCEJF2R6+P0M40yjBLkJKxa4knM39/zt8DiZFl
bTu9JGfo1uEuMR3I+l7pxNOcPY/8rYbkWS7GAjXxMZk8Zb7iHxaQeQyJSeJbUhaV
ZS+dWVv6zkDvE1Lf0jdeyfjN8HzEZUTYRaFvDZ6JhuykxcJzCLef23uW8laheghO
w58JpwzzVliPe3Iw9E4z8isEeXCU9FYGoxu8NvkP13O72j7D5ZnwwhGAAbgtmYpL
Ez92F0JIh3bNdpEJ7XgJG9ZF9uIhRoGI+D9l
=JDuB
-----END PGP SIGNATURE-----
Merge tag 'trace-v4.15-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"With the new ORC unwinder, ftrace stack tracing became disfunctional.
One was that ORC didn't know how to handle the ftrace callbacks in
general (which Josh fixed).
The other was that ORC would just bail if it hit a dynamically
allocated trampoline. Which means all ftrace stack tracing that
happens from the function tracer would produce no results (that
includes killing the max stack size tracer). I added a check to the
ORC unwinder to see if the trampoline belonged to ftrace, and if it
did, use the orc entry of the static trampoline that was used to
create the dynamic one (it would be identical).
Finally, I noticed that the skip values of the stack tracing were out
of whack. I went through and fixed them up"
* tag 'trace-v4.15-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Update stack trace skipping for ORC unwinder
ftrace, orc, x86: Handle ftrace dynamically allocated trampolines
x86/ftrace: Fix ORC unwinding from ftrace handlers
We're seeing a raise of automated reports from testing tools and reports
about address leaks that are not really exploitable as-is, many of which
do not represent an immediate risk justifying to work in closed places.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Setup .enable_reg/.enable_mask/.enable_val fields, then we can use the
regmap helpers for enable/disable/is_enabled callback implementation.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
This reverts commit 6cfb521ac0.
Turns out distros do not want to make retpoline as part of their "ABI",
so this patch should not have been merged. Sorry Andi, this was my
fault, I suggested it when your original patch was the "correct" way of
doing this instead.
Reported-by: Jiri Kosina <jikos@kernel.org>
Fixes: 6cfb521ac0 ("module: Add retpoline tag to VERMAGIC")
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: rusty@rustcorp.com.au
Cc: arjan.van.de.ven@intel.com
Cc: jeyu@kernel.org
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use pr_debug instead of hand crafted macros. This way it is not needed to
re-compile the kernel to enable bsg debug outputs and it's possible to
selectively enable specific prints.
Cc: Joe Perches <joe@perches.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Attributes that only implement .seq_ops are read-only, any write to
them should be rejected. But currently kernel would crash when
writing to such debugfs entries, e.g.
chmod +w /sys/kernel/debug/block/<dev>/requeue_list
echo 0 > /sys/kernel/debug/block/<dev>/requeue_list
chmod -w /sys/kernel/debug/block/<dev>/requeue_list
Fix it by returning -EPERM in blk_mq_debugfs_write() when writing to
such attributes.
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Driver periodically samples all neighbors configured in device
in order to update the kernel regarding their state. When finding
an entry configured in HW that doesn't show in neigh_lookup()
driver logs an error message.
This introduces a race when removing multiple neighbors -
it's possible that a given entry would still be configured in HW
as its removal is still being processed but is already removed
from the kernel's neighbor tables.
Simply remove the error message and gracefully accept such events.
Fixes: c723c735fa ("mlxsw: spectrum_router: Periodically update the kernel's neigh table")
Fixes: 60f040ca11 ("mlxsw: spectrum_router: Periodically dump active IPv6 neighbours")
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert says:
====================
pull request (net): ipsec 2018-01-24
1) Only offloads SAs after they are fully initialized.
Otherwise a NIC may receive packets on a SA we can
not yet handle in the stack.
From Yossi Kuperman.
2) Fix negative refcount in case of a failing offload.
From Aviad Yehezkel.
3) Fix inner IP ptoro version when decapsulating
from interaddress family tunnels.
From Yossi Kuperman.
4) Use true or false for boolean variables instead of an
integer value in xfrm_get_type_offload.
From Gustavo A. R. Silva.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes races and potential use after free in the
cmma migration code.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJaaJimAAoJEBF7vIC1phx8fIsP/1ShgCjg1Bcgc9FI/Rgo8diG
oycjpv7BXp2dcrgW2FRB1LFoRpMoWAwOy4YNoMls/9gqemupYYa97F3viP3oVH56
wQaZjgRdpnvflmti+3S9Ca3aws7WUPuAONTld44TJZaVJXM8g52VtjPD5APMWFqL
agawGopcmoxUX5qE12BAxCx1lx5/hToF5bcVULCmNogW9Cl9QOUw+8W3ezmaoJ0w
37FVyK5Xv+JvF8gXJlhLwXkcFyaEfLNUsoJdOCgi89/aULmp1z3/EmOTos3vTQwT
83lNxdjXuS+Ps4qGUnNwNTThoXhdC+rVbAZd/tbLi/8CtjvSeKqB0T8sMgvAC1Aa
HXYgNbEtIxjdJYbWGbTy+8uCIHLgVMiYRnPymqa6j0Y8Tk07ji2g/r0kZNotykD6
Hb4CQ41MKAXljpLEXV1L1a/LKrfQBhlTIR4VCPy9GBN+4AgV+KzjF0l1N0B9r/RD
sqSsttkbGNyNkPpCT+EGQa1d82on7T9lm40Ag6AAaBaw5x6lPwLAmDCslst5c3Fx
pGmoJk6fe64saQcs49zOiPiBYrlawW69IIVJaHWWOG3z+EdgpMjvtn1XWJpkyPZq
N0MDPn5+VwPRPQ3qKpaCRuVaUNn2gGuZSpx2/DsaY8gF65BLIIRjLt/SpLGxw3wU
AUY13gafv82jM6xlpc0p
=aAYd
-----END PGP SIGNATURE-----
Merge tag 'kvm-s390-master-4.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux
KVM: s390: another fix for cmma migration
This fixes races and potential use after free in the
cmma migration code.
Fixes the following sparse warnings :
line 767: warning: incorrect type in assignment (different base types)
line 767: expected unsigned int [unsigned] [assigned] [usertype] val_out
line 767: got restricted __le32 [usertype] <noident>
line 776: warning: cast to restricted __le32
This takes advantage of readl/writel to do the endianness reordering,
and removes an extra variable in the function.
Fixes: f68a7dcb91 ("spi: a3700: Add full-duplex support")
Signed-off-by: Maxime Chevallier <maxime.chevallier@smile.fr>
Reviewed-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Fixes the following sparse warnings :
line 504: warning: incorrect type in assignment (different base types)
line 504: expected unsigned int [unsigned] [usertype] val
line 504: got restricted __le32 [usertype] <noident>
line 527: warning: cast to restricted __le32
This is solved by removing endian-converson functions, since the
converted values are going through readl/writel anyway, which take care
of the conversion.
Fixes: 6fd6fd68c9 ("spi: armada-3700: Fix padding when sending not 4-byte aligned data")
Signed-off-by: Maxime Chevallier <maxime.chevallier@smile.fr>
Reviewed-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Some parts of the cmma migration bitmap is already protected
with the kvm->lock (e.g. the migration start). On the other
hand the read of the cmma bits is not protected against a
concurrent free, neither is the emulation of the ESSA instruction.
Let's extend the locking to all related ioctls by using
the slots lock for
- kvm_s390_vm_start_migration
- kvm_s390_vm_stop_migration
- kvm_s390_set_cmma_bits
- kvm_s390_get_cmma_bits
In addition to that, we use synchronize_srcu before freeing
the migration structure as all users hold kvm->srcu for read.
(e.g. the ESSA handler).
Reported-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: stable@vger.kernel.org # 4.13+
Fixes: 190df4a212 (KVM: s390: CMMA tracking, ESSA emulation, migration mode)
Reviewed-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.
Fixes: f9bb304ce8 ("mmc: mmci: Add support for setting pad type via pinctrl")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Patrice Chotard <patrice.chotard@st.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Centaur CPU has a constant frequency TSC and that TSC does not stop in
C-States. But because the corresponding TSC feature flags are not set for
that CPU, the TSC is treated as not constant frequency and assumed to stop
in C-States, which makes it an unreliable and unusable clock source.
Setting those flags tells the kernel that the TSC is usable, so it will
select it over HPET. The effect of this is that reading time stamps (from
kernel or user space) will be faster and more efficent.
Signed-off-by: davidwang <davidwang@zhaoxin.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: qiyuanwang@zhaoxin.com
Cc: linux-pm@vger.kernel.org
Cc: brucechang@via-alliance.com
Cc: cooperyan@zhaoxin.com
Cc: benjaminpan@viatech.com
Link: https://lkml.kernel.org/r/1516616057-5158-1-git-send-email-davidwang@zhaoxin.com
Commit 24c2503255 ("x86/microcode: Do not access the initrd after it has
been freed") fixed attempts to access initrd from the microcode loader
after it has been freed. However, a similar KASAN warning was reported
(stack trace edited):
smpboot: Booting Node 0 Processor 1 APIC 0x11
==================================================================
BUG: KASAN: use-after-free in find_cpio_data+0x9b5/0xa50
Read of size 1 at addr ffff880035ffd000 by task swapper/1/0
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.14.8-slack #7
Hardware name: System manufacturer System Product Name/A88X-PLUS, BIOS 3003 03/10/2016
Call Trace:
dump_stack
print_address_description
kasan_report
? find_cpio_data
__asan_report_load1_noabort
find_cpio_data
find_microcode_in_initrd
__load_ucode_amd
load_ucode_amd_ap
load_ucode_ap
After some investigation, it turned out that a merge was done using the
wrong side to resolve, leading to picking up the previous state, before
the 24c2503255 fix. Therefore the Fixes tag below contains a merge
commit.
Revert the mismerge by catching the save_microcode_in_initrd_amd()
retval and thus letting the function exit with the last return statement
so that initrd_gone can be set to true.
Fixes: f26483eaed ("Merge branch 'x86/urgent' into x86/microcode, to resolve conflicts")
Reported-by: <higuita@gmx.net>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=198295
Link: https://lkml.kernel.org/r/20180123104133.918-2-bp@alien8.de
Commit b94b737331 ("x86/microcode/intel: Extend BDW late-loading with a
revision check") reduced the impact of erratum BDF90 for Broadwell model
79.
The impact can be reduced further by checking the size of the last level
cache portion per core.
Tony: "The erratum says the problem only occurs on the large-cache SKUs.
So we only need to avoid the update if we are on a big cache SKU that is
also running old microcode."
For more details, see erratum BDF90 in document #334165 (Intel Xeon
Processor E7-8800/4800 v4 Product Family Specification Update) from
September 2017.
Fixes: b94b737331 ("x86/microcode/intel: Extend BDW late-loading with a revision check")
Signed-off-by: Jia Zhang <zhang.jia@linux.alibaba.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1516321542-31161-1-git-send-email-zhang.jia@linux.alibaba.com
The AMD power module can be loaded on non AMD platforms, but unload fails
with the following Oops:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: __list_del_entry_valid+0x29/0x90
Call Trace:
perf_pmu_unregister+0x25/0xf0
amd_power_pmu_exit+0x1c/0xd23 [power]
SyS_delete_module+0x1a8/0x2b0
? exit_to_usermode_loop+0x8f/0xb0
entry_SYSCALL_64_fastpath+0x20/0x83
Return -ENODEV instead of 0 from the module init function if the CPU does
not match.
Fixes: c7ab62bfbe ("perf/x86/amd/power: Add AMD accumulated power reporting mechanism")
Signed-off-by: Xiao Liang <xiliang@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180122061252.6394-1-xiliang@redhat.com
CONFIG_IRQ_DOMAIN_DEBUG is similar to CONFIG_GENERIC_IRQ_DEBUGFS,
just with less information.
Spring cleanup time.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Yang Shunyong <shunyong.yang@hxt-semitech.com>
Link: https://lkml.kernel.org/r/20180117142647.23622-1-marc.zyngier@arm.com
It doesn't make sense to have an indirect call thunk with esp/rsp as
retpoline code won't work correctly with the stack pointer register.
Removing it will help compiler writers to catch error in case such
a thunk call is emitted incorrectly.
Fixes: 76b043848f ("x86/retpoline: Add initial retpoline support")
Suggested-by: Jeff Law <law@redhat.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Kees Cook <keescook@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1516658974-27852-1-git-send-email-longman@redhat.com
debug_show_all_locks() iterates all tasks and print held locks whole
holding tasklist_lock. This can take a while on a slow console device
and may end up triggering NMI hardlockup detector if someone else ends
up waiting for tasklist_lock.
Touch the NMI watchdog while printing the held locks to avoid
spuriously triggering the hardlockup detector.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-team@fb.com
Link: http://lkml.kernel.org/r/20180122220055.GB1771050@devbig577.frc2.facebook.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Both Geert and DaveJ reported that the recent futex commit:
c1e2f0eaf0 ("futex: Avoid violating the 10th rule of futex")
introduced a problem with setting OWNER_DEAD. We set the bit on an
uninitialized variable and then entirely optimize it away as a
dead-store.
Move the setting of the bit to where it is more useful.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: c1e2f0eaf0 ("futex: Avoid violating the 10th rule of futex")
Link: http://lkml.kernel.org/r/20180122103947.GD2228@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
with the introduction of commit
b0eb57cb97, it appears that rq->buf_info
is improperly handled. While it is heap allocated when an rx queue is
setup, and freed when torn down, an old line of code in
vmxnet3_rq_destroy was not properly removed, leading to rq->buf_info[0]
being set to NULL prior to its being freed, causing a memory leak, which
eventually exhausts the system on repeated create/destroy operations
(for example, when the mtu of a vmxnet3 interface is changed
frequently.
Fix is pretty straight forward, just move the NULL set to after the
free.
Tested by myself with successful results
Applies to net, and should likely be queued for stable, please
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-By: boyang@redhat.com
CC: boyang@redhat.com
CC: Shrikrishna Khare <skhare@vmware.com>
CC: "VMware, Inc." <pv-drivers@vmware.com>
CC: David S. Miller <davem@davemloft.net>
Acked-by: Shrikrishna Khare <skhare@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 513674b5a2 ("net: reevalulate autoflowlabel setting after
sysctl setting") removed the initialisation of
ipv6_pinfo::autoflowlabel and added a second flag to indicate
whether this field or the net namespace default should be used.
The getsockopt() handling for this case was not updated, so it
currently returns 0 for all sockets for which IPV6_AUTOFLOWLABEL is
not explicitly enabled. Fix it to return the effective value, whether
that has been set at the socket or net namespace level.
Fixes: 513674b5a2 ("net: reevalulate autoflowlabel setting after sysctl ...")
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
In pppoe_sendmsg(), reserving dev->hard_header_len bytes of headroom
was probably fine before the introduction of ->needed_headroom in
commit f5184d267c ("net: Allow netdevices to specify needed head/tailroom").
But now, virtual devices typically advertise the size of their overhead
in dev->needed_headroom, so we must also take it into account in
skb_reserve().
Allocation size of skb is also updated to take dev->needed_tailroom
into account and replace the arbitrary 32 bytes with the real size of
a PPPoE header.
This issue was discovered by syzbot, who connected a pppoe socket to a
gre device which had dev->header_ops->create == ipgre_header and
dev->hard_header_len == 0. Therefore, PPPoE didn't reserve any
headroom, and dev_hard_header() crashed when ipgre_header() tried to
prepend its header to skb->data.
skbuff: skb_under_panic: text:000000001d390b3a len:31 put:24
head:00000000d8ed776f data:000000008150e823 tail:0x7 end:0xc0 dev:gre0
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:104!
invalid opcode: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 3670 Comm: syzkaller801466 Not tainted
4.15.0-rc7-next-20180115+ #97
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
RIP: 0010:skb_panic+0x162/0x1f0 net/core/skbuff.c:100
RSP: 0018:ffff8801d9bd7840 EFLAGS: 00010282
RAX: 0000000000000083 RBX: ffff8801d4f083c0 RCX: 0000000000000000
RDX: 0000000000000083 RSI: 1ffff1003b37ae92 RDI: ffffed003b37aefc
RBP: ffff8801d9bd78a8 R08: 1ffff1003b37ae8a R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff86200de0
R13: ffffffff84a981ad R14: 0000000000000018 R15: ffff8801d2d34180
FS: 00000000019c4880(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000208bc000 CR3: 00000001d9111001 CR4: 00000000001606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
skb_under_panic net/core/skbuff.c:114 [inline]
skb_push+0xce/0xf0 net/core/skbuff.c:1714
ipgre_header+0x6d/0x4e0 net/ipv4/ip_gre.c:879
dev_hard_header include/linux/netdevice.h:2723 [inline]
pppoe_sendmsg+0x58e/0x8b0 drivers/net/ppp/pppoe.c:890
sock_sendmsg_nosec net/socket.c:630 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:640
sock_write_iter+0x31a/0x5d0 net/socket.c:909
call_write_iter include/linux/fs.h:1775 [inline]
do_iter_readv_writev+0x525/0x7f0 fs/read_write.c:653
do_iter_write+0x154/0x540 fs/read_write.c:932
vfs_writev+0x18a/0x340 fs/read_write.c:977
do_writev+0xfc/0x2a0 fs/read_write.c:1012
SYSC_writev fs/read_write.c:1085 [inline]
SyS_writev+0x27/0x30 fs/read_write.c:1082
entry_SYSCALL_64_fastpath+0x29/0xa0
Admittedly PPPoE shouldn't be allowed to run on non Ethernet-like
interfaces, but reserving space for ->needed_headroom is a more
fundamental issue that needs to be addressed first.
Same problem exists for __pppoe_xmit(), which also needs to take
dev->needed_headroom into account in skb_cow_head().
Fixes: f5184d267c ("net: Allow netdevices to specify needed head/tailroom")
Reported-by: syzbot+ed0838d0fa4c4f2b528e20286e6dc63effc7c14d@syzkaller.appspotmail.com
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the addition of ORC unwinder and FRAME POINTER unwinder, the stack
trace skipping requirements have changed.
I went through the tracing stack trace dumps with ORC and with frame
pointers and recalculated the proper values.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
The function tracer can create a dynamically allocated trampoline that is
called by the function mcount or fentry hook that is used to call the
function callback that is registered. The problem is that the orc undwinder
will bail if it encounters one of these trampolines. This breaks the stack
trace of function callbacks, which include the stack tracer and setting the
stack trace for individual functions.
Since these dynamic trampolines are basically copies of the static ftrace
trampolines defined in ftrace_*.S, we do not need to create new orc entries
for the dynamic trampolines. Finding the return address on the stack will be
identical as the functions that were copied to create the dynamic
trampolines. When encountering a ftrace dynamic trampoline, we can just use
the orc entry of the ftrace static function that was copied for that
trampoline.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJaZ5GHAAoJEFmIoMA60/r8UmAP/AjrS+C+OSMHzM/U/h+UHmou
DRg5Tg7gG3PjnnKULEQno4BSGM5VJeXpM+QC8Ypqa2I1T4GTHVHiSn6qtOi4yn0R
k9CAGyeRD1zYH4Mtw4dyUOVE5j0ANoBuji0KklawqaxcPFJDh6FhSCbChPq0WjIA
ZarpJZ4YKeJF5ZYXFs4G5W6EjnXokYogjZL3lzCOVXoTC9aVfo+NoJ9Hg9P38VA3
WmYaLUEbIzV40JXkDieZq7nqO8m3mFBtxi7+74BvsjtgozW0og1tPNI6Hige6/U3
dUAzrdNGQg52T3pTMm8r0+efagDV11UO0IBhVhktNzTTVaUggxluBEza//FaQ85m
PLm7vsnwtIALujnp74VcFnTsVM2yYa9NeaPNIQ9FQ6kOB7TS0/ELtyHaPLNS8JoM
lhssdhT3jmVNg86UG2pOi9MJZhyYb6SQmWAGmYopzwEn/idBTOcLsTdcbptF5LfP
0hbc9BLI0wCF8sv0JXD4IBQFWN264z1vPGNDWD4cnkEMPiAJ23h1ySpS3S0HjZe3
c6AMEMNj5E1b6unwIebBHfeSj4DUUnfyypb4/oICNxCThQBIzPyXtnlk07DNFMOl
9pRZBebfUl5xAZeWz2Sxxvs0PqMc8QEDa5j8YzLS3cgz9y1i80Hn6fGxSeHptgC+
Zb4SYRr5S82kU3SbAtnY
=tRVW
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.15-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fix from Bjorn Helgaas:
"Fix AMD regression due to not re-enabling the big window on resume
(Christian König)"
* tag 'pci-v4.15-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
x86/PCI: Enable AMD 64-bit window on resume
The queue count says the highest queue that's been allocated, so don't
reallocate a queue lower than that.
Fixes: 147b27e4bd ("nvme-pci: allocate device queues storage space at probe")
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Steven Rostedt discovered that the ftrace stack tracer is broken when
it's used with the ORC unwinder. The problem is that objtool is
instructed by the Makefile to ignore the ftrace_64.S code, so it doesn't
generate any ORC data for it.
Fix it by making the asm code objtool-friendly:
- Objtool doesn't like the fact that save_mcount_regs pushes RBP at the
beginning, but it's never restored (directly, at least). So just skip
the original RBP push, which is only needed for frame pointers anyway.
- Annotate some functions as normal callable functions with
ENTRY/ENDPROC.
- Add an empty unwind hint to return_to_handler(). The return address
isn't on the stack, so there's nothing ORC can do there. It will just
punt in the unlikely case it tries to unwind from that code.
With all that fixed, remove the OBJECT_FILES_NON_STANDARD Makefile
annotation so objtool can read the file.
Link: http://lkml.kernel.org/r/20180123040746.ih4ep3tk4pbjvg7c@treble
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Pull networking fixes from David Miller:
1) Fix divide by zero in mlx5, from Talut Batheesh.
2) Guard against invalid GSO packets coming from untrusted guests and
arriving in qdisc_pkt_len_init(), from Eric Dumazet.
3) Similarly add such protection to the various protocol GSO handlers.
From Willem de Bruijn.
4) Fix regression added to IGMP source address checking for IGMPv3
reports, from Felix Feitkau.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
tls: Correct length of scatterlist in tls_sw_sendpage
be2net: restore properly promisc mode after queues reconfiguration
net: igmp: fix source address check for IGMPv3 reports
gso: validate gso_type in GSO handlers
net: qdisc_pkt_len_init() should be more robust
ibmvnic: Allocate and request vpd in init_resources
ibmvnic: Revert to previous mtu when unsupported value requested
ibmvnic: Modify buffer size and number of queues on failover
rds: tcp: compute m_ack_seq as offset from ->write_seq
usbnet: silence an unnecessary warning
cxgb4: fix endianness for vlan value in cxgb4_tc_flower
cxgb4: set filter type to 1 for ETH_P_IPV6
net/mlx5e: Fix fixpoint divide exception in mlx5e_am_stats_compare
We inadvertently set it again on the source bio, but we need
to set it on the new split bio instead.
Fixes: fbbaf700e7 ("block: trace completion of all bios.")
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Assign true or false to boolean variables instead of an integer value.
This issue was detected with the help of Coccinelle.
Fixes: ffdb5211da ("xfrm: Auto-load xfrm offload modules")
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
IPSec tunnel mode supports encapsulation of IPv4 over IPv6 and vice-versa.
The outer IP header is stripped and the inner IP inherits the original
Ethernet header. Tcpdump fails to properly decode the inner packet in
case that h_proto is different than the inner IP version.
Fix h_proto to reflect the inner IP version.
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>