When doing HW restart we tear down aggregations.
Since at this point we are not TX'ing any aggregation, while
the peer is still sending RX aggregation over the air, it will
make sense to tear down the RX aggregations first.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The previous path metric update from RANN frame has not considered
the own link metric toward the transmitting mesh STA. Fix this.
Reported-by: Michael65535
Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
When connected to a QoS/WMM AP, mac80211 should use a QoS NDP
for probing it, instead of a regular non-QoS one, fix this.
Change all the drivers to *not* allow QoS NDP for now, even
though it looks like most of them should be OK with that.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
If we want to add a datapath flow, which has more than 500 vxlan outputs'
action, we will get the following error reports:
openvswitch: netlink: Flow action size 32832 bytes exceeds max
openvswitch: netlink: Flow action size 32832 bytes exceeds max
openvswitch: netlink: Actions may not be safe on all matching packets
... ...
It seems that we can simply enlarge the MAX_ACTIONS_BUFSIZE to fix it, but
this is not the root cause. For example, for a vxlan output action, we need
about 60 bytes for the nlattr, but after it is converted to the flow
action, it only occupies 24 bytes. This means that we can still support
more than 1000 vxlan output actions for a single datapath flow under the
the current 32k max limitation.
So even if the nla_len(attr) is larger than MAX_ACTIONS_BUFSIZE, we
shouldn't report EINVAL and keep it move on, as the judgement can be
done by the reserve_sfa_size.
Signed-off-by: zhangliping <zhangliping02@baidu.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull ARM fixes from Russell King:
- LPAE fixes for kernel-readonly regions
- Fix for get_user_pages_fast on LPAE systems
- avoid tying decompressor to a particular platform if DEBUG_LL is
enabled
- BUG if we attempt to return to userspace but the to-be-restored PSR
value keeps us in privileged mode (defeating an issue that ftracetest
found)
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: BUG if jumping to usermode address in kernel mode
ARM: 8722/1: mm: make STRICT_KERNEL_RWX effective for LPAE
ARM: 8721/1: mm: dump: check hardware RO bit for LPAE
ARM: make decompressor debug output user selectable
ARM: fix get_user_pages_fast
Pull irq fixes from Thomas Glexiner:
- unbreak the irq trigger type check for legacy platforms
- a handful fixes for ARM GIC v3/4 interrupt controllers
- a few trivial fixes all over the place
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq/matrix: Make - vs ?: Precedence explicit
irqchip/imgpdc: Use resource_size function on resource object
irqchip/qcom: Fix u32 comparison with value less than zero
irqchip/exiu: Fix return value check in exiu_init()
irqchip/gic-v3-its: Remove artificial dependency on PCI
irqchip/gic-v4: Add forward definition of struct irq_domain_ops
irqchip/gic-v3: pr_err() strings should end with newlines
irqchip/s3c24xx: pr_err() strings should end with newlines
irqchip/gic-v3: Fix ppi-partitions lookup
irqchip/gic-v4: Clear IRQ_DISABLE_UNLAZY again if mapping fails
genirq: Track whether the trigger type has been set
Pull misc x86 fixes from Ingo Molnar:
- topology enumeration fixes
- KASAN fix
- two entry fixes (not yet the big series related to KASLR)
- remove obsolete code
- instruction decoder fix
- better /dev/mem sanity checks, hopefully working better this time
- pkeys fixes
- two ACPI fixes
- 5-level paging related fixes
- UMIP fixes that should make application visible faults more debuggable
- boot fix for weird virtualization environment
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
x86/decoder: Add new TEST instruction pattern
x86/PCI: Remove unused HyperTransport interrupt support
x86/umip: Fix insn_get_code_seg_params()'s return value
x86/boot/KASLR: Remove unused variable
x86/entry/64: Add missing irqflags tracing to native_load_gs_index()
x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow
x86/entry/64: Fix entry_SYSCALL_64_after_hwframe() IRQ tracing
x86/pkeys/selftests: Fix protection keys write() warning
x86/pkeys/selftests: Rename 'si_pkey' to 'siginfo_pkey'
x86/mpx/selftests: Fix up weird arrays
x86/pkeys: Update documentation about availability
x86/umip: Print a warning into the syslog if UMIP-protected instructions are used
x86/smpboot: Fix __max_logical_packages estimate
x86/topology: Avoid wasting 128k for package id array
perf/x86/intel/uncore: Cache logical pkg id in uncore driver
x86/acpi: Reduce code duplication in mp_override_legacy_irq()
x86/acpi: Handle SCI interrupts above legacy space gracefully
x86/boot: Fix boot failure when SMP MP-table is based at 0
x86/mm: Limit mmap() of /dev/mem to valid physical addresses
x86/selftests: Add test for mapping placement for 5-level paging
...
Pull scheduler fixes from Ingo Molnar:
"Misc fixes: a documentation fix, a Sparse warning fix and a debugging
fix"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/debug: Fix task state recording/printout
sched/deadline: Don't use dubious signed bitfields
sched/deadline: Fix the description of runtime accounting in the documentation
Pull perf fixes from Ingo Molnar:
"Misc fixes: two PMU driver fixes and a memory leak fix"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Fix memory leak triggered by perf --namespace
perf/x86/intel/uncore: Add event constraint for BDX PCU
perf/x86/intel: Hide TSX events when RTM is not supported
Pull static key fix from Ingo Molnar:
"Fix a boot warning related to bad init ordering of the static keys
self-test"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
jump_label: Invoke jump_label_test() via early_initcall()
Pull objtool fixes from Ingo Molnar:
"A handful of objtool fixes, most of them related to making the UAPI
header-syncing warnings easier to read and easier to act upon"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tools/headers: Sync objtool UAPI header
objtool: Fix cross-build
objtool: Move kernel headers/code sync check to a script
objtool: Move synced files to their original relative locations
objtool: Make unreachable annotation inline asms explicitly volatile
objtool: Add a comment for the unreachable annotation macros
gso_type is being used in binary AND operations together with SKB_GSO_UDP.
The issue is that variable gso_type is of type unsigned short and
SKB_GSO_UDP expands to more than 16 bits:
SKB_GSO_UDP = 1 << 16
this makes any binary AND operation between gso_type and SKB_GSO_UDP to
be always zero, hence making some code unreachable and likely causing
undesired behavior.
Fix this by changing the data type of variable gso_type to unsigned int.
Addresses-Coverity-ID: 1462223
Fixes: 0c19f846d5 ("net: accept UFO datagrams from tuntap and packet")
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Detect if we are returning to usermode via the normal kernel exit paths
but the saved PSR value indicates that we are in kernel mode. This
could occur due to corrupted stack state, which has been observed with
"ftracetest".
This ensures that we catch the problem case before we get to user code.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
New file seems to have missed the SPDX license scan and update.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Setting the refcount to 0 when allocating a tree to match the number of
switch devices it holds may cause an 'increment on 0; use-after-free',
if CONFIG_REFCOUNT_FULL is enabled.
To fix this, do not decrement the refcount of a newly allocated tree,
increment it when an already allocated tree is found, and decrement it
after the probing of a switch, as done with the previous behavior.
At the same time, make dsa_tree_get and dsa_tree_put accept a NULL
argument to simplify callers, and return the tree after incrementation,
as most kref users like of_node_get and of_node_put do.
Fixes: 8e5bf9759a ("net: dsa: simplify tree reference counting")
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using the host personality, VMCI will grab a mutex for any
queue pair access. In the detach callback for the vmci vsock
transport, we call vsock_stream_has_data while holding a spinlock,
and vsock_stream_has_data will access a queue pair.
To avoid this, we can simply omit calling vsock_stream_has_data
for host side queue pairs, since the QPs are empty per default
when the guest has detached.
This bug affects users of VMware Workstation using kernel version
4.4 and later.
Testing: Ran vsock tests between guest and host, and verified that
with this change, the host isn't calling vsock_stream_has_data
during detach. Ran mixedTest between guest and host using both
guest and host as server.
v2: Rebased on top of recent change to sk_state values
Reviewed-by: Adit Ranadive <aditr@vmware.com>
Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull timer updates from Thomas Gleixner:
- The final conversion of timer wheel timers to timer_setup().
A few manual conversions and a large coccinelle assisted sweep and
the removal of the old initialization mechanisms and the related
code.
- Remove the now unused VSYSCALL update code
- Fix permissions of /proc/timer_list. I still need to get rid of that
file completely
- Rename a misnomed clocksource function and remove a stale declaration
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
m68k/macboing: Fix missed timer callback assignment
treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts
timer: Remove redundant __setup_timer*() macros
timer: Pass function down to initialization routines
timer: Remove unused data arguments from macros
timer: Switch callback prototype to take struct timer_list * argument
timer: Pass timer_list pointer to callbacks unconditionally
Coccinelle: Remove setup_timer.cocci
timer: Remove setup_*timer() interface
timer: Remove init_timer() interface
treewide: setup_timer() -> timer_setup() (2 field)
treewide: setup_timer() -> timer_setup()
treewide: init_timer() -> setup_timer()
treewide: Switch DEFINE_TIMER callbacks to struct timer_list *
s390: cmm: Convert timers to use timer_setup()
lightnvm: Convert timers to use timer_setup()
drivers/net: cris: Convert timers to use timer_setup()
drm/vc4: Convert timers to use timer_setup()
block/laptop_mode: Convert timers to use timer_setup()
net/atm/mpc: Avoid open-coded assignment of timer callback function
...
- Use pwd instead of /bin/pwd for portability
- Clean up Makefiles
- Fix ld-option for clang
- Fix malloc'ed data size in Kconfig
- Fix parallel building along with coccicheck
- Fix a minor issue of package building
- Prompt to use "rpm-pkg" instead of "rpm"
- Clean up *.i and *.lst patterns by "make clean"
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJaGEuYAAoJED2LAQed4NsG1H8P/1cKSZk7G9tA1L7DndvoTNLz
oc+wM7dUQ3NBndBsyymGe3/cvmlPoB6tam5otAyHaDjBsiGL6pQAeCi1RjJ24aMO
BHb3vUWLh7GiGnIpe9uMNYIghuRpHdK53rkh9uPaKXmLZBSHtOdd81wHlZGsUdyY
6U2HHUM6c6KiueTYuULiP2wXE6u0YhREjb5MUL6SThAoME0tDaf7JnWguo7sKJAv
CUlw9196LoDtj/2VRN3Rd9baGefqDw05PYqUZ9RWfCcl/h8gm7Xo9jW8THRLTIj5
hRbqNRDHEBjN0QY1FFghrfQf8Iud7zyZC2bzD+0PeUHtIMsb2zA+0zNzdOLN8y9f
VC34LcQjpG4k9jgrRabAt5xJKcKPLA3rBfkZdB36NKnyj0EFndGoR5YaYsINFBrd
Asg58Es6OQVINvLdinInpp2GAZAz30zNuF1VivjKgDIUpMw1yU7x5h+Sea3+b5n2
hO439hlQj4Hb5yNrKeDVaOUJ8eY4yIUtGAUm1vFXODdC4y5FcH7aNf5GiY5bxC7n
warleTm/DxCV9CI2MANOJmJgC2XgdE7/JNMNvWQnjcmOvhFywyYE65c26QMLIQSM
k2L0b26FtlES3aCUUKaAmBvAGEZznturBg0fmpnuIULHvYWf2aLndzWDv6C5LhiQ
mC3FAJc1rzDfQR7UsPrs
=a2Td
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v4.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull more Kbuild updates from Masahiro Yamada:
- use 'pwd' instead of '/bin/pwd' for portability
- clean up Makefiles
- fix ld-option for clang
- fix malloc'ed data size in Kconfig
- fix parallel building along with coccicheck
- fix a minor issue of package building
- prompt to use "rpm-pkg" instead of "rpm"
- clean up *.i and *.lst patterns by "make clean"
* tag 'kbuild-v4.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: drop $(extra-y) from real-objs-y
kbuild: clean up *.i and *.lst patterns by make clean
kbuild: rpm: prompt to use "rpm-pkg" if "rpm" target is used
kbuild: pkg: use --transform option to prefix paths in tar
coccinelle: fix parallel build with CHECK=scripts/coccicheck
kconfig/symbol.c: use correct pointer type argument for sizeof
kbuild: Set KBUILD_CFLAGS before incl. arch Makefile
kbuild: remove all dummy assignments to obj-
kbuild: create built-in.o automatically if parent directory wants it
kbuild: /bin/pwd -> pwd
-----BEGIN PGP SIGNATURE-----
iQIVAwUAWhglCPSw1s6N8H32AQJr5g/7BFKQ5KrbkPcjJTjP18bgVTFDq2in6/ui
3aYXvcI5dqKzfGyCZkFYS48tSnvNeWKVYbgsLsSOdDHLQ40QW4mDnJmbtK1A9Adx
scXgQsgGdyK3NrIFBWPcKCbttiomj4pDQhkc5MVYxy/hFhXAB7J2CvNvxgkA5suv
K14cg1y9hbY2WSe+/dXBB8WNCmL4CSXV23sb2Dy+JkPUGOE+DhGTwdbK5DSDr2FN
wJOkEle7k1fsHn3z8S5CK+h2p5lwy26KXMD+boEQS8UvFwq+SMm4J3Emkk7L6BvQ
WDbQlvGt1EF/+O6GTaB/FKZd2pO51sf5BNPuVoFyk5AmhNrZcTOPyQl83JHedHGp
nlKWOI8bOWYeRZEeBnrXfoEkOAs9U0NKZk6+NOxgXrhDBkmcBwyMqrHNgaP6iY45
ducE3UCKsL0a0yC/lz9usq6gM2QIbd1BB2RcVoFRAFHk7DU7aLxtgTZRF3NFT36n
vKVUIPbAMh+T8lzxw/bJmyfiyVZZpIlxMdkJmyWPMelgw8R4c448kXcQwQ5kofBz
0UeZGcYZ7+B/XUtkvfL3ZSGzRJN0k5ibA3gMKwhUd+UvyG1hVB4m1Tg9cO6EWHtS
vbj+GL2D/SDRmjCGKv5HmImik5cHWufjqjxJHW+0LolkqTw500RZDScT0pxLpHdT
sK6AHEamcn8=
=v3Rx
-----END PGP SIGNATURE-----
Merge tag 'afs-fixes-20171124' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull AFS fixes from David Howells:
- Make AFS file locking work again.
- Don't write to a page that's being written out, but wait for it to
complete.
- Do d_drop() and d_add() in the right places.
- Put keys on error paths.
- Remove some redundant code.
* tag 'afs-fixes-20171124' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
afs: remove redundant assignment of dvnode to itself
afs: cell: Remove unnecessary code in afs_lookup_cell
afs: Fix signal handling in some file ops
afs: Fix some dentry handling in dir ops and missing key_puts
afs: Make afs_write_begin() avoid writing to a page that's being stored
afs: Fix file locking
tcf_block_put_ext has assumed that all filters (and thus their goto
actions) are destroyed in RCU callback and thus can not race with our
list iteration. However, that is not true during netns cleanup (see
tcf_exts_get_net comment).
Prevent the user after free by holding all chains (except 0, that one is
already held). foreach_safe is not enough in this case.
To reproduce, run the following in a netns and then delete the ns:
ip link add dtest type dummy
tc qdisc add dev dtest ingress
tc filter add dev dtest chain 1 parent ffff: handle 1 prio 1 flower action goto chain 2
Fixes: 822e86d997 ("net_sched: remove tcf_block_put_deferred()")
Signed-off-by: Roman Kapl <code@rkapl.cz>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 86dabda426 ("net: thunderbolt: Clear finished Tx frame bus
address in tbnet_tx_callback()") fixed a DMA-API violation where the
driver called dma_unmap_page() in tbnet_free_buffers() for a bus address
that might already be unmapped. The fix was to zero out the bus address
of a frame in tbnet_tx_callback().
However, as pointed out by David Miller, zero might well be valid
mapping (at least in theory) so it is not good idea to use it here.
It turns out that we don't need the whole map/unmap dance for Tx buffers
at all. Instead we can map the buffers when they are initially allocated
and unmap them when the interface is brought down. In between we just
DMA sync the buffers for the CPU or device as needed.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't offload IP header checksum to NIC.
This fixes a previous patch which enabled checksum offloading
for both IPv4 and IPv6 packets. So L3 checksum offload was
getting enabled for IPv6 pkts. And HW is dropping these pkts
as it assumes the pkt is IPv4 when IP csum offload is set
in the SQ descriptor.
Fixes: 3a9024f52c ("net: thunderx: Enable TSO and checksum offloads for ipv6")
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@auriga.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* GICv4 Support for KVM/ARM
All ARM patches were in next-20171113. I have postponed most x86 fixes
to 4.15-rc2 and UMIP to 4.16, but there are fixes that would be good to
have already in 4.15-rc1:
* re-introduce support for CPUs without virtual NMI (cc stable)
and allow testing of KVM without virtual NMI on available CPUs
* fix long-standing performance issues with assigned devices on AMD
(cc stable)
-----BEGIN PGP SIGNATURE-----
iQEcBAABCAAGBQJaGECGAAoJEED/6hsPKofoT08H/AuaMi8qprw2BNpVBbQxWRWM
O4WPk7yz1zB4SkdRNrPzCMBy+qoK7FcV/3BpsFPuQS4NHQ+GvQ87N/7tUbouVyl6
CuPGJMCnNzMQ8GvLOJgB1/sz+uW5W/ph3y8kv1UP3/hNCZU4fqukoUeLroOH/wr6
N3bSY8bok7ycdpgybHmbUHY0Yk4IUk3m0RXWY9U5Jl3sjoNEwCw3pWdrq9Swfs/6
W8QJRdE4Z6KHPqW5sRnPj24IpoUpCxu+IT+gPuGlDUCN/h3sfhYvMS6GgDrCjiiZ
2z1TwaIAo+wGjlBQzGmyTUjUPjbGew+f3ixBlf2BtmNutX+tX2qsVfl1NKXYTto=
=GGge
-----END PGP SIGNATURE-----
Merge tag 'kvm-4.15-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Radim Krčmář:
"Trimmed second batch of KVM changes for Linux 4.15:
- GICv4 Support for KVM/ARM
- re-introduce support for CPUs without virtual NMI (cc stable) and
allow testing of KVM without virtual NMI on available CPUs
- fix long-standing performance issues with assigned devices on AMD
(cc stable)"
* tag 'kvm-4.15-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (30 commits)
kvm: vmx: Allow disabling virtual NMI support
kvm: vmx: Reinstate support for CPUs without virtual NMI
KVM: SVM: obey guest PAT
KVM: arm/arm64: Don't queue VLPIs on INV/INVALL
KVM: arm/arm64: Fix GICv4 ITS initialization issues
KVM: arm/arm64: GICv4: Theory of operations
KVM: arm/arm64: GICv4: Enable VLPI support
KVM: arm/arm64: GICv4: Prevent userspace from changing doorbell affinity
KVM: arm/arm64: GICv4: Prevent a VM using GICv4 from being saved
KVM: arm/arm64: GICv4: Enable virtual cpuif if VLPIs can be delivered
KVM: arm/arm64: GICv4: Hook vPE scheduling into vgic flush/sync
KVM: arm/arm64: GICv4: Use the doorbell interrupt as an unblocking source
KVM: arm/arm64: GICv4: Add doorbell interrupt handling
KVM: arm/arm64: GICv4: Use pending_last as a scheduling hint
KVM: arm/arm64: GICv4: Handle INVALL applied to a vPE
KVM: arm/arm64: GICv4: Propagate property updates to VLPIs
KVM: arm/arm64: GICv4: Handle MOVALL applied to a vPE
KVM: arm/arm64: GICv4: Handle CLEAR applied to a VLPI
KVM: arm/arm64: GICv4: Propagate affinity changes to the physical ITS
KVM: arm/arm64: GICv4: Unmap VLPI when freeing an LPI
...
A small batch of fixes, about 50% tagged for stable and the rest for recently
merged code.
There's one more fix for the >128T handling on hash. Once a process had
requested a single mmap above 128T we would then always search above 128T. The
correct behaviour is to consider the hint address in isolation for each mmap
request.
Then a couple of fixes for the IMC PMU, a missing EXPORT_SYMBOL in VAS, a fix
for STRICT_KERNEL_RWX on 32-bit, and a fix to correctly identify P9 DD2.1 but in
code that is currently not used by default.
Thanks to:
Aneesh Kumar K.V, Christophe Leroy, Madhavan Srinivasan, Sukadev Bhattiprolu.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJaF/VqAAoJEFHr6jzI4aWA994P/3NNXkSASJHjLrIlQAKXtmx9
lrv1v+6MbPWhyB8Q8LVnnC3Ab2LTHnkccjq2Jw0bP0RQ86HF4mH7Sb7N5Wj0cG+M
5NioikvGE057ncLfxVhesOK0C9Lhc7Zb+zphXZliYP76IGxwbxorJRepeZctVkyO
KPMv4eaImdblVn71aoQQSlepON4+/rtiW2yo5u98uCqR+Ttds4J1fiDZ4TNrBYRP
Ilh6DmA//CWvN+KsGT+brRd/PjEkxQKHyS8px3lxRl4cwCJucXPCik/Gn9t6OiMw
3S6y1Mu8nrh4z+YepKv6APvl2DEwwXn8w9f85kn+QiE9Qp3Z/wckW9/4LT5FeuKE
L8E3dKq2NzJ9oDs/20sVbBvVR7CUvBoyWytsXVkmmlC6sVReTrYAJ1UP9HnNvcF6
be4zYUKusU83uG6saGgchRrPUrD31XKXw8Piv9EoWo1Uz7VgWCkxidclRNocgeDO
k5VxYnRd9jPsv2pCzXH2YmuQAypGUh12IPTxEOnSt5uzXSXcamZJBLKp5fAJ/9dl
jD6GlRQMX8JpNRJzxOBLly3CmwQBw2ekOuPLXI+M/ilks66AGK8lp4bg5cWwDGNe
puzmRJ2mO3dnFlVUHBQ5LyX8ne4yunin1JZB1YQ4xm8yxZbGO2AdypEWMSkPKNPN
fkrGPlwQ1JwFheMbHHLj
=gv70
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"A small batch of fixes, about 50% tagged for stable and the rest for
recently merged code.
There's one more fix for the >128T handling on hash. Once a process
had requested a single mmap above 128T we would then always search
above 128T. The correct behaviour is to consider the hint address in
isolation for each mmap request.
Then a couple of fixes for the IMC PMU, a missing EXPORT_SYMBOL in
VAS, a fix for STRICT_KERNEL_RWX on 32-bit, and a fix to correctly
identify P9 DD2.1 but in code that is currently not used by default.
Thanks to: Aneesh Kumar K.V, Christophe Leroy, Madhavan Srinivasan,
Sukadev Bhattiprolu"
* tag 'powerpc-4.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s: Fix Power9 DD2.1 logic in DT CPU features
powerpc/perf: Fix IMC_MAX_PMU macro
powerpc/perf: Fix pmu_count to count only nest imc pmus
powerpc: Fix boot on BOOK3S_32 with CONFIG_STRICT_KERNEL_RWX
powerpc/perf/imc: Use cpu_to_node() not topology_physical_package_id()
powerpc/vas: Export chip_to_vas_id()
powerpc/64s/slice: Use addr limit when computing slice mask
Pull SCSI target updates from Nicholas Bellinger:
"This series is predominantly bug-fixes, with a few small improvements
that have been outstanding over the last release cycle.
As usual, the associated bug-fixes have CC' tags for stable.
Also, things have been particularly quiet wrt new developments the
last months, with most folks continuing to focus on stability atop 4.x
stable kernels for their respective production configurations.
Also at this point, the stable trees have been synced up with
mainline. This will continue to be a priority, as production users
tend to run exclusively atop stable kernels, a few releases behind
mainline.
The highlights include:
- Fix PR PREEMPT_AND_ABORT null pointer dereference regression in
v4.11+ (tangwenji)
- Fix OOPs during removing TCMU device (Xiubo Li + Zhang Zhuoyu)
- Add netlink command reply supported option for each device (Kenjiro
Nakayama)
- cxgbit: Abort the TCP connection in case of data out timeout (Varun
Prakash)
- Fix PR/ALUA file path truncation (David Disseldorp)
- Fix double se_cmd completion during ->cmd_time_out (Mike Christie)
- Fix QUEUE_FULL + SCSI task attribute handling in 4.1+ (Bryant Ly +
nab)
- Fix quiese during transport_write_pending_qf endless loop (nab)
- Avoid early CMD_T_PRE_EXECUTE failures during ABORT_TASK in 3.14+
(Don White + nab)"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (35 commits)
tcmu: Add a missing unlock on an error path
tcmu: Fix some memory corruption
iscsi-target: Fix non-immediate TMR reference leak
iscsi-target: Make TASK_REASSIGN use proper se_cmd->cmd_kref
target: Avoid early CMD_T_PRE_EXECUTE failures during ABORT_TASK
target: Fix quiese during transport_write_pending_qf endless loop
target: Fix caw_sem leak in transport_generic_request_failure
target: Fix QUEUE_FULL + SCSI task attribute handling
iSCSI-target: Use common error handling code in iscsi_decode_text_input()
target/iscsi: Detect conn_cmd_list corruption early
target/iscsi: Fix a race condition in iscsit_add_reject_from_cmd()
target/iscsi: Modify iscsit_do_crypto_hash_buf() prototype
target/iscsi: Fix endianness in an error message
target/iscsi: Use min() in iscsit_dump_data_payload() instead of open-coding it
target/iscsi: Define OFFLOAD_BUF_SIZE once
target: Inline transport_put_cmd()
target: Suppress gcc 7 fallthrough warnings
target: Move a declaration of a global variable into a header file
tcmu: fix double se_cmd completion
target: return SAM_STAT_TASK_SET_FULL for TCM_OUT_OF_RESOURCES
...
The skcipher_walk_aead_common function calls scatterwalk_copychunks on
the input and output walks to skip the associated data. If the AD end
at an SG list entry boundary, then after these calls the walks will
still be pointing to the end of the skipped region.
These offsets are later checked for alignment in skcipher_walk_next,
so the skcipher_walk may detect the alignment incorrectly.
This patch fixes it by calling scatterwalk_done after the copychunks
calls to ensure that the offsets refer to the right SG list entry.
Fixes: b286d8b1a6 ("crypto: skcipher - Add skcipher walk interface")
Cc: <stable@vger.kernel.org>
Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When regulatory database certificates are built-in, they're
currently using the SHA256 digest algorithm, so add that to
the build in that case.
Also add a note that for custom certificates, one may need
to add the right algorithms.
Reported-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The function pci_unmap_page is obsolete. So it is replaced with
the function dma_unmap_page.
CC: Srinivas Eeda <srinivas.eeda@oracle.com>
CC: Joe Jin <joe.jin@oracle.com>
CC: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQIVAwUAWhfyR/Sw1s6N8H32AQITwA//YY35HFqAiox+QiUJAeW0E/KZGwcGHzW9
KTdofOzwNQzrpWD4tcjeYHxccmO/qD8kQssMcfJlPpknAXE4bPg4OC50PoAEc3Xm
UTurZ2dT0T8lK/wzfouYFdVfS1Pvd4tLlQ2HZC4DarJNSD1O37NLVPkbkrgfpwMx
pekNUi6nDt7KTcpX2ysRJbYJJ8QrRmGIW1Jk+rtZYhgxGF7RwLGzCttl1RxMemAw
ecilmlCYvaiawyJkhd03rAY6U76SMq/lH/Pq8XhRZjlVD/NzdGr7RaTxkISv713W
q7Xk7kIf71OJNlHoAXa/J+4GbUDJBO0SFermD15MFJZ/mKCfSPsqKxSXFXti3CBj
bgotkD2txkmToLKxiETB57JzEtqH8nr7aWVAccHP4n+EKUA0iNo7tgWi9e0jW0eX
OxlEA9dfS+gQkGp7uwnxyWmBwCyEIUAu7smUWo4eEqsw6Kq4je6vxth5f++FlcSF
mKuNm85l9BYUVKqGSvKnHvH8fZSoURzNqvovcj/tF8JJL90WXWt9hoWFt8WpqERj
erPU+viMgKaA+iB/ZKKLlcoMt6xaOanXfQFNzFjcfq0Y0jfrWna8lFF5KeCV1W1j
1HKTWwPQypy2v3JTkjpGGqY4zYZc7Y0heN8msVNg7MR9MeurKrXu2Kv113PzA6Cc
pmTwehqC7hU=
=Rakt
-----END PGP SIGNATURE-----
Merge tag 'rxrpc-fixes-20171124' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc: Fixes and improvements
Here's a set of patches that fix and improve some stuff in the AF_RXRPC
protocol:
The patches are:
(1) Unlock mutex returned by rxrpc_accept_call().
(2) Don't set connection upgrade by default.
(3) Differentiate the call->user_mutex used by the kernel from that used
by userspace calling sendmsg() to avoid lockdep warnings.
(4) Delay terminal ACK transmission to a work queue so that it can be
replaced by the next call if there is one.
(5) Split the call parameters from the connection parameters so that more
call-specific parameters can be passed through.
(6) Fix the call timeouts to work the same as for other RxRPC/AFS
implementations.
(7) Don't transmit DELAY ACKs immediately, but instead delay them slightly
so that can be discarded or can represent more packets.
(8) Use RTT to calculate certain protocol timeouts.
(9) Add a timeout to detect lost ACK/DATA packets.
(10) Add a keepalive function so that we ping the peer if we haven't
transmitted for a short while, thereby keeping intervening firewall
routes open.
(11) Make service endpoints expire like they're supposed to so that the UDP
port can be reused.
(12) Fix connection expiry timers to make cleanup happen in a more timely
fashion.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes a missed function prototype callback from the timer conversions.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20171123221902.GA75727@beast
The assignment of dvnode to itself is redundant and can be removed.
Cleans up warning detected by cppcheck:
fs/afs/dir.c:975: (warning) Redundant assignment of 'dvnode' to itself.
Fixes: d2ddc776a4 ("afs: Overhaul volume and server record caching and fileserver rotation")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Due to recent changes this piece of code is no longer needed.
Addresses-Coverity-ID: 1462033
Link: https://lkml.kernel.org/r/4923.1510957307@warthog.procyon.org.uk
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
afs_mkdir(), afs_create(), afs_link() and afs_symlink() all need to drop
the target dentry if a signal causes the operation to be killed immediately
before we try to contact the server.
Signed-off-by: David Howells <dhowells@redhat.com>
Fix some of dentry handling in AFS directory ops:
(1) Do d_drop() on the new_dentry before assigning a new inode to it in
afs_vnode_new_inode(). It's fine to do this before calling afs_iget()
because the operation has taken place on the server.
(2) Replace d_instantiate()/d_rehash() with d_add().
(3) Don't d_drop() the new_dentry in afs_rename() on error.
Also fix afs_link() and afs_rename() to call key_put() on all error paths
where the key is taken.
Signed-off-by: David Howells <dhowells@redhat.com>
Make afs_write_begin() wait for a page that's marked PG_writeback because:
(1) We need to avoid interference with the data being stored so that the
data on the server ends up in a defined state.
(2) page->private is used to track the window of dirty data within a page,
but it's also used by the storage code to track what's being written,
being cleared by the completion notification. Ownership can't be
relinquished by the storage code until completion because it a store
fails, the data must be remarked dirty.
Tracing shows something like the following (edited):
x86_64-linux-gn-15940 [1] afs_page_dirty: vn=ffff8800bef33800 9c75 begin 0-125
kworker/u8:3-114 [2] afs_page_dirty: vn=ffff8800bef33800 9c75 store+ 0-125
x86_64-linux-gn-15940 [1] afs_page_dirty: vn=ffff8800bef33800 9c75 begin 0-2052
kworker/u8:3-114 [2] afs_page_dirty: vn=ffff8800bef33800 9c75 clear 0-2052
kworker/u8:3-114 [2] afs_page_dirty: vn=ffff8800bef33800 9c75 store 0-0
kworker/u8:3-114 [2] afs_page_dirty: vn=ffff8800bef33800 9c75 WARN 0-0
The clear (completion) corresponding to the store+ (store continuation from
a previous page) happens between the second begin (afs_write_begin) and the
store corresponding to that. This results in the second store not seeing
any data to write back, leading to the following warning:
WARNING: CPU: 2 PID: 114 at ../fs/afs/write.c:403 afs_write_back_from_locked_page+0x19d/0x76c [kafs]
Modules linked in: kafs(E)
CPU: 2 PID: 114 Comm: kworker/u8:3 Tainted: G E 4.14.0-fscache+ #242
Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
Workqueue: writeback wb_workfn (flush-afs-2)
task: ffff8800cad72600 task.stack: ffff8800cad44000
RIP: 0010:afs_write_back_from_locked_page+0x19d/0x76c [kafs]
RSP: 0018:ffff8800cad47aa0 EFLAGS: 00010246
RAX: 0000000000000001 RBX: ffff8800bef33a20 RCX: 0000000000000000
RDX: 000000000000000f RSI: ffffffff81c5d0e0 RDI: ffff8800cad72e78
RBP: ffff8800d31ea1e8 R08: ffff8800c1358000 R09: ffff8800ca00e400
R10: ffff8800cad47a38 R11: ffff8800c5d9e400 R12: 0000000000000000
R13: ffffea0002d9df00 R14: ffffffffa0023c1c R15: 0000000000007fdf
FS: 0000000000000000(0000) GS:ffff8800ca700000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f85ac6c4000 CR3: 0000000001c10001 CR4: 00000000001606e0
Call Trace:
? clear_page_dirty_for_io+0x23a/0x267
afs_writepages_region+0x1be/0x286 [kafs]
afs_writepages+0x60/0x127 [kafs]
do_writepages+0x36/0x70
__writeback_single_inode+0x12f/0x635
writeback_sb_inodes+0x2cc/0x452
__writeback_inodes_wb+0x68/0x9f
wb_writeback+0x208/0x470
? wb_workfn+0x22b/0x565
wb_workfn+0x22b/0x565
? worker_thread+0x230/0x2ac
process_one_work+0x2cc/0x517
? worker_thread+0x230/0x2ac
worker_thread+0x1d4/0x2ac
? rescuer_thread+0x29b/0x29b
kthread+0x15d/0x165
? kthread_create_on_node+0x3f/0x3f
? call_usermodehelper_exec_async+0x118/0x11f
ret_from_fork+0x24/0x30
Signed-off-by: David Howells <dhowells@redhat.com>
Fix the rxrpc connection expiry timers so that connections for closed
AF_RXRPC sockets get deleted in a more timely fashion, freeing up the
transport UDP port much more quickly.
(1) Replace the delayed work items with work items plus timers so that
timer_reduce() can be used to shorten them and so that the timer
doesn't requeue the work item if the net namespace is dead.
(2) Don't use queue_delayed_work() as that won't alter the timeout if the
timer is already running.
(3) Don't rearm the timers if the network namespace is dead.
Signed-off-by: David Howells <dhowells@redhat.com>
RxRPC service endpoints expire like they're supposed to by the following
means:
(1) Mark dead rxrpc_net structs (with ->live) rather than twiddling the
global service conn timeout, otherwise the first rxrpc_net struct to
die will cause connections on all others to expire immediately from
then on.
(2) Mark local service endpoints for which the socket has been closed
(->service_closed) so that the expiration timeout can be much
shortened for service and client connections going through that
endpoint.
(3) rxrpc_put_service_conn() needs to schedule the reaper when the usage
count reaches 1, not 0, as idle conns have a 1 count.
(4) The accumulator for the earliest time we might want to schedule for
should be initialised to jiffies + MAX_JIFFY_OFFSET, not ULONG_MAX as
the comparison functions use signed arithmetic.
(5) Simplify the expiration handling, adding the expiration value to the
idle timestamp each time rather than keeping track of the time in the
past before which the idle timestamp must go to be expired. This is
much easier to read.
(6) Ignore the timeouts if the net namespace is dead.
(7) Restart the service reaper work item rather the client reaper.
Signed-off-by: David Howells <dhowells@redhat.com>
We need to transmit a packet every so often to act as a keepalive for the
peer (which has a timeout from the last time it received a packet) and also
to prevent any intervening firewalls from closing the route.
Do this by resetting a timer every time we transmit a packet. If the timer
ever expires, we transmit a PING ACK packet and thereby also elicit a PING
RESPONSE ACK from the other side - which prevents our last-rx timeout from
expiring.
The timer is set to 1/6 of the last-rx timeout so that we can detect the
other side going away if it misses 6 replies in a row.
This is particularly necessary for servers where the processing of the
service function may take a significant amount of time.
Signed-off-by: David Howells <dhowells@redhat.com>
Add an extra timeout that is set/updated when we send a DATA packet that
has the request-ack flag set. This allows us to detect if we don't get an
ACK in response to the latest flagged packet.
The ACK packet is adjudged to have been lost if it doesn't turn up within
2*RTT of the transmission.
If the timeout occurs, we schedule the sending of a PING ACK to find out
the state of the other side. If a new DATA packet is ready to go sooner,
we cancel the sending of the ping and set the request-ack flag on that
instead.
If we get back a PING-RESPONSE ACK that indicates a lower tx_top than what
we had at the time of the ping transmission, we adjudge all the DATA
packets sent between the response tx_top and the ping-time tx_top to have
been lost and retransmit immediately.
Rather than sending a PING ACK, we could just pick a DATA packet and
speculatively retransmit that with request-ack set. It should result in
either a REQUESTED ACK or a DUPLICATE ACK which we can then use in lieu the
a PING-RESPONSE ACK mentioned above.
Signed-off-by: David Howells <dhowells@redhat.com>
Express protocol timeouts for data retransmission and deferred ack
generation in terms on RTT rather than specified timeouts once we have
sufficient RTT samples.
For the moment, this requires just one RTT sample to be able to use this
for ack deferral and two for data retransmission.
The data retransmission timeout is set at RTT*1.5 and the ACK deferral
timeout is set at RTT.
Note that the calculated timeout is limited to a minimum of 4ns to make
sure it doesn't happen too quickly.
Signed-off-by: David Howells <dhowells@redhat.com>
Don't transmit a DELAY ACK immediately on proposal when the Rx window is
rotated, but rather defer it to the work function. This means that we have
a chance to queue/consume more received packets before we actually send the
DELAY ACK, or even cancel it entirely, thereby reducing the number of
packets transmitted.
We do, however, want to continue sending other types of packet immediately,
particularly REQUESTED ACKs, as they may be used for RTT calculation by the
other side.
Signed-off-by: David Howells <dhowells@redhat.com>
Fix the rxrpc call expiration timeouts and make them settable from
userspace. By analogy with other rx implementations, there should be three
timeouts:
(1) "Normal timeout"
This is set for all calls and is triggered if we haven't received any
packets from the peer in a while. It is measured from the last time
we received any packet on that call. This is not reset by any
connection packets (such as CHALLENGE/RESPONSE packets).
If a service operation takes a long time, the server should generate
PING ACKs at a duration that's substantially less than the normal
timeout so is to keep both sides alive. This is set at 1/6 of normal
timeout.
(2) "Idle timeout"
This is set only for a service call and is triggered if we stop
receiving the DATA packets that comprise the request data. It is
measured from the last time we received a DATA packet.
(3) "Hard timeout"
This can be set for a call and specified the maximum lifetime of that
call. It should not be specified by default. Some operations (such
as volume transfer) take a long time.
Allow userspace to set/change the timeouts on a call with sendmsg, using a
control message:
RXRPC_SET_CALL_TIMEOUTS
The data to the message is a number of 32-bit words, not all of which need
be given:
u32 hard_timeout; /* sec from first packet */
u32 idle_timeout; /* msec from packet Rx */
u32 normal_timeout; /* msec from data Rx */
This can be set in combination with any other sendmsg() that affects a
call.
Signed-off-by: David Howells <dhowells@redhat.com>
When rxrpc_sendmsg() parses the control message buffer, it places the
parameters extracted into a structure, but lumps together call parameters
(such as user call ID) with operation parameters (such as whether to send
data, send an abort or accept a call).
Split the call parameters out into their own structure, a copy of which is
then embedded in the operation parameters struct.
The call parameters struct is then passed down into the places that need it
instead of passing the individual parameters. This allows for extra call
parameters to be added.
Signed-off-by: David Howells <dhowells@redhat.com>
Delay terminal ACK transmission on a client call by deferring it to the
connection processor. This allows it to be skipped if we can send the next
call instead, the first DATA packet of which will implicitly ack this call.
Signed-off-by: David Howells <dhowells@redhat.com>
Don't set upgrade by default when creating a call from sendmsg(). This is
a holdover from when I was testing the code.
Signed-off-by: David Howells <dhowells@redhat.com>
The caller of rxrpc_accept_call() must release the lock on call->user_mutex
returned by that function.
Signed-off-by: David Howells <dhowells@redhat.com>
The recent conversion of the task state recording to use task_state_index()
broke the sched_switch tracepoint task state output.
task_state_index() returns surprisingly an index (0-7) which is then
printed with __print_flags() applying bitmasks. Not really working and
resulting in weird states like 'prev_state=t' instead of 'prev_state=I'.
Use TASK_REPORT_MAX instead of TASK_STATE_MAX to report preemption. Build a
bitmask from the return value of task_state_index() and store it in
entry->prev_state, which makes __print_flags() work as expected.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: stable@vger.kernel.org
Fixes: efb40f588b ("sched/tracing: Fix trace_sched_switch task-state printing")
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1711221304180.1751@nanos
Signed-off-by: Ingo Molnar <mingo@kernel.org>