fix the raster config setting for different iceland configs.
Signed-off-by: Ken Wang <Qingqing.Wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
iwpriv app uses iw_point structure to send data to Kernel. The iw_point
structure holds a pointer. For compatibility Kernel converts the pointer
as required for WEXT IOCTLs (SIOCIWFIRST to SIOCIWLAST). Some drivers
may use iw_handler_def.private_args to populate iwpriv commands instead
of iw_handler_def.private. For those case, the IOCTLs from
SIOCIWFIRSTPRIV to SIOCIWLASTPRIV will follow the path ndo_do_ioctl().
Accordingly when the filled up iw_point structure comes from 32 bit
iwpriv to 64 bit Kernel, Kernel will not convert the pointer and sends
it to driver. So, the driver may get the invalid data.
The pointer conversion for the IOCTLs (SIOCIWFIRSTPRIV to
SIOCIWLASTPRIV), which follow the path ndo_do_ioctl(), is mandatory.
This patch adds pointer conversion from 32 bit to 64 bit and vice versa,
if the ioctl comes from 32 bit iwpriv to 64 bit Kernel.
Cc: stable@vger.kernel.org
Signed-off-by: Prasun Maiti <prasunmaiti87@gmail.com>
Signed-off-by: Ujjal Roy <royujjal@gmail.com>
Tested-by: Dibyajyoti Ghosh <dibyajyotig@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Since set_tx_power and set_antenna are frequently implemented
without the matching get_tx_power/get_antenna, we shouldn't
have added warnings for those. Remove them.
The remaining ones are correct and need to be implemented
symmetrically for correct operation.
Cc: stable@vger.kernel.org
Fixes: de3bb771f4 ("cfg80211: add more warnings for inconsistent ops")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Kabylake shows up as PCI ID 0xa171. And Kabylake-LP as 0x9d71.
Since these are similar to Skylake add these to SKL_PLUS macro
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
When we need to create a new aggregate to enqueue the skb we call kzalloc.
If that fails we returned ENOBUFS without freeing the skb.
Spotted during code review.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ip6 GRE tap device should not be forced to down state to change
the mac address and should allow live address change for tap device
similar to ipv4 gre.
Signed-off-by: Shweta Choudaha <schoudah@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski says:
====================
incremental cls_u32 hardware offload fixes
These are incremental changes from v1 of cls_u32 fixes.
First patch is reposted in its entirety, patch 2 is an
incremental change from patch 2 of the original series.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Return an error if user requested skip-sw and the underlaying
hardware cannot handle tc offloads (or offloads are disabled).
This patch fixes the knode handling.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Errors reported by u32_replace_hw_hnode() were not propagated.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
merged before 4.7rc1, plus two new fixes that have come in since then.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCgAGBQJXVd6tAAoJELXWKTbR/J7oLGEP/0Y9PXDu3UzEXyyFhNC80L2D
S+UiW0SnZvcc0uWGts75timIq1CfodZbtN2ePymTLgyDWCOyUxdE0YhTh2NjwPjU
THmDiXia2RfkKYn/wU2ahHjCPIbyGt1ryjEOc/XflvfWGbNwgeLYY3PlzfxCej3F
rJKefcNarS5RJO90/HLJJwH2ZiDlLomMIqjBLco0al7kv5jYdf1mxJ0pzISWTDk2
10G7QM9s496t0weJ2RJxhTuylelomzZZ6+RUBAoUKNaqFrEunV6f1sjWX9vQZD0E
9zMQ+bj02jKa6yyVRyjS8t0SvdbUxXMWVrd9eU0hGa4TRANaZtTRsm4/1DKvD6+5
lKlw6fDzCoWkjkJSvDEu01GvWktFszO4exLU7MDzXXMmG2CU3Mo+0lA0KynAPjaV
CmiseVgGKB1VJZXVYfrXGdYYrqpCPZD04ZARvSEL8FeEGXCp2ggoLYOfIauSys0P
AVzQymAWSrR31uO7QI7hgos0k4lxSdNrGUjD5HivlJMBH4SeEvhQ5pSTBMamnGTV
qsezZeKg68kqF/JsSUmru9rQTrULFVpyHl/6SMmBj5KKwz5oHpCEMCmoSLxfI7lf
XkC2T8JrH5AVDvrGGKZxKxhxcw2wzbt8zGmkT9mDjnUVZdPdXDWOLS4AkJ5HZ1hf
N03d++EsGS/1cTwT70kE
=52r8
-----END PGP SIGNATURE-----
Merge tag 'drm-vc4-fixes-2016-06-06' of github.com:anholt/linux into drm-fixes
This pull request brings in vblank/pageflip fixes I had hoped to see
merged before 4.7rc1, plus two new fixes that have come in since then.
* tag 'drm-vc4-fixes-2016-06-06' of github.com:anholt/linux:
drm/vc4: Make pageflip completion handling more robust.
drm/vc4: Fix ioctl permissions for render nodes.
drm/vc4: Return -EBUSY if there's already a pending flip event.
drm/vc4: Fix drm_vblank_put/get imbalance in page flip path.
drm/vc4: Fix get_vblank_counter with proper no-op for Linux 4.4+
Fixes for two issues reported by KASAN, a display engine hang due to
incorrect BIOS table parsing, and incorrect LTC interrupt handling on
Maxwell which could lead to a never-ending interrupt storm.
* 'linux-4.7' of git://github.com/skeggsb/linux:
drm/nouveau/disp/sor/gm107: training pattern registers are like gm200
drm/nouveau/disp/sor/gf119: both links use the same training register
drm/nouveau/core: swap the order of imem/fb
drm/nouveau/fbcon: fix out-of-bounds memory accesses
drm/nouveau/gr/gf100-: update sm error decoding from gk20a nvgpu headers
drm/nouveau/ltc/gm107-: fix typo in the address of NV_PLTCG_LTC0_LTS0_INTR
drm/nouveau/bios/disp: fix handling of "match any protocol" entries
Using flat regmap cache instead of RB-tree to avoid the following
lockdep warning on driver load:
WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:2755 lockdep_trace_alloc+0x15c/0x160()
DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags))
The RB-tree regmap cache needs to allocate new space on first
writes. However, allocations in an atomic context (e.g. when a
spinlock is held) are not allowed. The function regmap_write
calls map->lock, which acquires a spinlock in the fast_io case.
Since the FSL DCU driver uses MMIO, the regmap bus of type
regmap_mmio is being used which has fast_io set to true.
Use flat regmap cache and specify max register to be large
enouth to cover all registers available in LS1021a and Vybrids
register space.
Signed-off-by: Stefan Agner <stefan@agner.ch>
Cc: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
7000-series SFC NICs connected with an SFP+ module currently fail to
report any supported link speeds.
Reported-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Reviewed-by: Jarod Wilson <jarod@redhat.com>
Tested-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
"make htmldocs" complains otherwise:
.//net/core/gen_stats.c:65: warning: No description found for parameter 'padattr'
.//net/core/gen_stats.c:101: warning: No description found for parameter 'padattr'
Fixes: 9854518ea0 ("sched: align nlattr properly when needed")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
At present we perform an xfrm_lookup() for each UDPv6 message we
send. The lookup involves querying the flow cache (flow_cache_lookup)
and, in case of a cache miss, creating an XFRM bundle.
If we miss the flow cache, we can end up creating a new bundle and
deriving the path MTU (xfrm_init_pmtu) from on an already transformed
dst_entry, which we pass from the socket cache (sk->sk_dst_cache) down
to xfrm_lookup(). This can happen only if we're caching the dst_entry
in the socket, that is when we're using a connected UDP socket.
To put it another way, the path MTU shrinks each time we miss the flow
cache, which later on leads to incorrectly fragmented payload. It can
be observed with ESPv6 in transport mode:
1) Set up a transformation and lower the MTU to trigger fragmentation
# ip xfrm policy add dir out src ::1 dst ::1 \
tmpl src ::1 dst ::1 proto esp spi 1
# ip xfrm state add src ::1 dst ::1 \
proto esp spi 1 enc 'aes' 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b
# ip link set dev lo mtu 1500
2) Monitor the packet flow and set up an UDP sink
# tcpdump -ni lo -ttt &
# socat udp6-listen:12345,fork /dev/null &
3) Send a datagram that needs fragmentation with a connected socket
# perl -e 'print "@" x 1470 | socat - udp6:[::1]:12345
2016/06/07 18:52:52 socat[724] E read(3, 0x555bb3d5ba00, 8192): Protocol error
00:00:00.000000 IP6 ::1 > ::1: frag (0|1448) ESP(spi=0x00000001,seq=0x2), length 1448
00:00:00.000014 IP6 ::1 > ::1: frag (1448|32)
00:00:00.000050 IP6 ::1 > ::1: ESP(spi=0x00000001,seq=0x3), length 1272
(^ ICMPv6 Parameter Problem)
00:00:00.000022 IP6 ::1 > ::1: ESP(spi=0x00000001,seq=0x5), length 136
4) Compare it to a non-connected socket
# perl -e 'print "@" x 1500' | socat - udp6-sendto:[::1]:12345
00:00:40.535488 IP6 ::1 > ::1: frag (0|1448) ESP(spi=0x00000001,seq=0x6), length 1448
00:00:00.000010 IP6 ::1 > ::1: frag (1448|64)
What happens in step (3) is:
1) when connecting the socket in __ip6_datagram_connect(), we
perform an XFRM lookup, miss the flow cache, create an XFRM
bundle, and cache the destination,
2) afterwards, when sending the datagram, we perform an XFRM lookup,
again, miss the flow cache (due to mismatch of flowi6_iif and
flowi6_oif, which is an issue of its own), and recreate an XFRM
bundle based on the cached (and already transformed) destination.
To prevent the recreation of an XFRM bundle, avoid an XFRM lookup
altogether whenever we already have a destination entry cached in the
socket. This prevents the path MTU shrinkage and brings us on par with
UDPv4.
The fix also benefits connected PINGv6 sockets, another user of
ip6_sk_dst_lookup_flow(), who also suffer messages being transformed
twice.
Joint work with Hannes Frederic Sowa.
Reported-by: Jan Tluka <jtluka@redhat.com>
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The of_find_net_device_by_node() function is defined in
<linux/of_net.h> but not included in the .c file that
implements it. Fix the following warning by including the
header:
net/core/net-sysfs.c:1494:19: warning: symbol 'of_find_net_device_by_node' was not declared. Should it be static?
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The missing br_vlan_should_use() test caused creation of an unneeded
local fdb entry on changing mac address of a bridge device when there is
a vlan which is configured on a bridge port but not on the bridge
device.
Fixes: 2594e9064a ("bridge: vlan: add per-vlan struct and move to rhashtables")
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull vfs fixes from Al Viro:
"Fixes for crap of assorted ages: EOPENSTALE one is 4.2+, autofs one is
4.6, d_walk - 3.2+.
The atomic_open() and coredump ones are regressions from this window"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
coredump: fix dumping through pipes
fix a regression in atomic_open()
fix d_walk()/non-delayed __d_free() race
autofs braino fix for do_last()
fix EOPENSTALE bug in do_last()
The offset in the core file used to be tracked with ->written field of
the coredump_params structure. The field was retired in favour of
file->f_pos.
However, ->f_pos is not maintained for pipes which leads to breakage.
Restore explicit tracking of the offset in coredump_params. Introduce
->pos field for this purpose since ->written was already reused.
Fixes: a008393951 ("get rid of coredump_params->written").
Reported-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
Signed-off-by: Mateusz Guzik <mguzik@redhat.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
open("/foo/no_such_file", O_RDONLY | O_CREAT) on should fail with
EACCES when /foo is not writable; failing with ENOENT is obviously
wrong. That got broken by a braino introduced when moving the
creat_error logics from atomic_open() to lookup_open(). Easy to
fix, fortunately.
Spotted-by: "Yan, Zheng" <ukernel@gmail.com>
Tested-by: "Yan, Zheng" <ukernel@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Ascend-to-parent logics in d_walk() depends on all encountered child
dentries not getting freed without an RCU delay. Unfortunately, in
quite a few cases it is not true, with hard-to-hit oopsable race as
the result.
Fortunately, the fix is simiple; right now the rule is "if it ever
been hashed, freeing must be delayed" and changing it to "if it
ever had a parent, freeing must be delayed" closes that hole and
covers all cases the old rule used to cover. Moreover, pipes and
sockets remain _not_ covered, so we do not introduce RCU delay in
the cases which are the reason for having that delay conditional
in the first place.
Cc: stable@vger.kernel.org # v3.2+ (and watch out for __d_materialise_dentry())
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
When turbo is disabled, the ->set_policy() interface is broken.
For example, when turbo is disabled and cpuinfo.max = 2900000 (full
max turbo frequency), setting the limits results in frequency less
than the requested one:
Set 1000000 KHz results in 0700000 KHz
Set 1500000 KHz results in 1100000 KHz
Set 2000000 KHz results in 1500000 KHz
This is because the limits->max_perf fraction is calculated using
the max turbo frequency as the reference, but when the max P-State is
capped in intel_pstate_get_min_max(), the reference is not the max
turbo P-State. This results in reducing max P-State.
One option is to always use max turbo as reference for calculating
limits. But this will not be correct. By definition the intel_pstate
sysfs limits, shows percentage of available performance. So when
BIOS has disabled turbo, the available performance is max non turbo.
So the max_perf_pct should still show 100%.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw : Subject & changelog, rewrite in fewer lines of code ]
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The limits->max_perf is rounded_up but immediately overwritten by
another assignment to limits->max_perf.
Move that operation to the correct location.
While here also added a pr_debug() call in ->set_policy to aid in
debugging.
Fixes: 785ee27881 (cpufreq: intel_pstate: Fix limits->max_perf rounding error)
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw : Subject & changelog ]
Cc: 4.4+ <stable@vger.kernel.org> # 4.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pablo Neira Ayuso says:
====================
Netfilter/IPVS fixes for net
The following patchset contains two Netfilter/IPVS fixes for your net
tree, they are:
1) Fix missing alignment in next offset calculation for standard
targets, introduced in the previous merge window, patch from
Florian Westphal.
2) Fix to correct the handling of outgoing connections which use the
SIP-pe such that the binding of a real-server is updated when needed.
This was an omission from changes introduced by Marco Angaroni in
the previous merge window too, to allow handling of outgoing
connections by the SIP-pe. Patch and report came via Simon Horman.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The v6 tcp stats scan do not provide TLP and ER timer information
correctly like the v4 version . This patch fixes that.
Fixes: 6ba8a3b19e ("tcp: Tail loss probe (TLP)")
Fixes: eed530b6c6 ("tcp: early retransmit")
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When offloading classifiers such as u32 or flower to hardware, and the
qdisc is clsact (TC_H_CLSACT), then we need to differentiate its classes,
since not all of them handle ingress, therefore we must leave those in
software path. Add a .tcf_cl_offload() callback, so we can generically
handle them, tested on ixgbe.
Fixes: 10cbc68434 ("net/sched: cls_flower: Hardware offloaded filters statistics support")
Fixes: 5b33f48842 ("net/flower: Introduce hardware offload support")
Fixes: a1b7c5fd7f ("net: sched: add cls_u32 offload hooks for netdevs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We properly scan the flow list to count number of packets,
but John passed 0 to gnet_stats_copy_queue() so we report
a zero value to user space instead of the result.
Fixes: 6401585366 ("net: sched: restrict use of qstats qlen")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: John Fastabend <john.r.fastabend@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski says:
====================
cls_u32 hardware offload fixes
This set fixes two small issues with error codes I noticed
in cls_u32. Second patch could be viewed as user space API
change but that portion of API is not part of any release,
yet.
Compile tested only.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Return an error if user requested skip-sw and the underlaying
hardware cannot handle tc offloads (or offloads are disabled).
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
'err' variable is not set in this test, we would return whatever
previous test set 'err' to.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix clang build warning:
./include/uapi/linux/gtp.h:1:9: warning: '_UAPI_LINUX_GTP_H_' is
used as a header guard here, followed by #define of a different
macro [-Wheader-guard]
fix by defining _UAPI_LINUX_GTP_H_ and not _UAPI_LINUX_GTP_H__
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
stragglers that didn't get merged by anyone this time around. Better to
do it now than wait for another one to pop up. There's also a minor
maintainers update and a Kconfig fix.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJXVoIpAAoJEK0CiJfG5JUlk/wP/Ro3dPTTJW8tf1kabMWwYRym
PRsKeBNUbiAbbiDdYFcDVgVrxMpkeRQX+qoTPT37FypMbyDnu+rIEeWqHyyNCdzR
4+di548c8XzStBMPNGaKG+WWVDOU/rRWGrun1vc2NR8JohgWFBx8ciV9Kht4g+Ss
5ggm0E/ZKV5Hj7SuiBVbzMsZ/jufDM/V9NeIHy5Gnz6dPuRBkzrvwu9obJ/QLCWE
mh7eRug4C+6xYaQrPbXzgxTXqRJQkk/M27ArodVhvZy16gPr70HC+oNGUJwHk+Fs
yiqx9wicQuxNQqibgOC087RjUTDfFcGLdV71ouIQQhWuZFdQlHr9RfKaq+v1g/DB
s3n8whjHJAukU4i34btG3Mq1UcoLTL4vkOYMW+2yjvUfdUdY5BtKGphrkPO5xKMP
4hpAKkNW3ViTLn3cJQMuk5OgzPr0XrVjd++GtU7XjczzDKx8j9vTbhyZL0mRl+6s
jx8GU4hGuEkuhBIfWENNe2W2rf4TBrfQeiLsJt9nLFY4yqJRNByplkMmL75in/cD
PzbF647286PJYJdhjP0n70E2jyZbfyGYaUdZ9rbuwbEtA3XpOq4ZiWG0ZqPi7aOf
UickP3QW0AoY4Y0QhZ+thTcNxAZPPq6IfEzFNvzGXArR6msQLYzF9Y1dQ/HuXmZP
+tyYKKBCZbKObv463cM3
=JewH
-----END PGP SIGNATURE-----
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"This finally removes the CLK_IS_ROOT flag by picking up the last few
stragglers that didn't get merged by anyone this time around.
Better to do it now than wait for another one to pop up. There's also
a minor maintainers update and a Kconfig fix"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: nxp: Select MFD_SYSCON for creg driver
MAINTAINERS: Add file patterns for clock device tree bindings
clk: Remove CLK_IS_ROOT flag
clk: microchip: Remove CLK_IS_ROOT
powerpc/512x: clk: Remove CLK_IS_ROOT
vexpress/spc: Remove CLK_IS_ROOT
trivial fix to spelling mistakes and add missing newline in pr_err
messages
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan says:
====================
bnxt_en: Bug fixes.
Fix a race condition and VLAN rx acceleration logic.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Since both CTAG and STAG rx acceleration must be enabled together, we
only need to check one feature flag (NETIF_F_HW_VLAN_CTAG_RX) before
calling __vlan_hwaccel_put_tag().
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The hardware can only be set to strip or not strip both the VLAN CTAG and
STAG. It cannot strip one and not strip the other. Add logic to
bnxt_fix_features() to toggle both feature flags when the user is toggling
one of them.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set the is_push flag in the software BD before the tx data is pushed to
the chip. It is possible to get the tx interrupt as soon as the tx data
is pushed. The tx handler will not handle the event properly if the
is_push flag is not set and it will crash.
Signed-off-by: Michael Chan <michael.chan@broadocm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
net/rxrpc/rxkad.c:1165:1-3: WARNING: PTR_ERR_OR_ZERO can be used
Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR
Generated by: scripts/coccinelle/api/ptr_ret.cocci
CC: David Howells <dhowells@redhat.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sowmini Varadhan says:
====================
RDS: TCP: socket locking RDS packet assembly fixes
This three part patchset fixes bugs in synchronization between
rds_tcp_accept_one() and the rds-tcp send/recv path.
Patch 1 ensures that the lock_sock() is taken appropriately
and the RDS datagram reassembly state is reset to synchronize
with the receive path.
Patch 2 ensures that partially sent RDS datagrams will get
retransmitted after rds_tcp_accept_one() switches sockets.
Patch 3 fixes a race window which would prematurely re-enable
rds_send_xmit() before the rds_tcp_connection setup has been
completed in rds_tcp_accept_one().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The send path needs to be quiesced before resetting callbacks from
rds_tcp_accept_one(), and commit eb19284026 ("RDS:TCP: Synchronize
rds_tcp_accept_one with rds_send_xmit when resetting t_sock") achieves
this using the c_state and RDS_IN_XMIT bit following the pattern
used by rds_conn_shutdown(). However this leaves the possibility
of a race window as shown in the sequence below
take t_conn_lock in rds_tcp_conn_connect
send outgoing syn to peer
drop t_conn_lock in rds_tcp_conn_connect
incoming from peer triggers rds_tcp_accept_one, conn is
marked CONNECTING
wait for RDS_IN_XMIT to quiesce any rds_send_xmit threads
call rds_tcp_reset_callbacks
[.. race-window where incoming syn-ack can cause the conn
to be marked UP from rds_tcp_state_change ..]
lock_sock called from rds_tcp_reset_callbacks, and we set
t_sock to null
As soon as the conn is marked UP in the race-window above, rds_send_xmit()
threads will proceed to rds_tcp_xmit and may encounter a null-pointer
deref on the t_sock.
Given that rds_tcp_state_change() is invoked in softirq context, whereas
rds_tcp_reset_callbacks() is in workq context, and testing for RDS_IN_XMIT
after lock_sock could result in a deadlock with tcp_sendmsg, this
commit fixes the race by using a new c_state, RDS_TCP_RESETTING, which
will prevent a transition to RDS_CONN_UP from rds_tcp_state_change().
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we switch a connection's sockets in rds_tcp_rest_callbacks,
any partially sent datagram must be retransmitted on the new
socket so that the receiver can correctly reassmble the RDS
datagram. Use rds_send_reset() which is designed for this purpose.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When rds_tcp_accept_one() has to replace the existing tcp socket
with a newer tcp socket (duelling-syn resolution), it must lock_sock()
to suppress the rds_tcp_data_recv() path while callbacks are being
changed. Also, existing RDS datagram reassembly state must be reset,
so that the next datagram on the new socket does not have corrupted
state. Similarly when resetting the newly accepted socket, appropriate
locks and synchronization is needed.
This commit ensures correct synchronization by invoking
kernel_sock_shutdown to reset a newly accepted sock, and by taking
appropriate lock_sock()s (for old and new sockets) when resetting
existing callbacks.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
My prior attempt to fix the backlogs of parents failed.
If we return NET_XMIT_CN, our parents wont increase their backlog,
so our qdisc_tree_reduce_backlog() should take this into account.
v2: Florian Westphal pointed out that we could drop the packet,
so we need to save qdisc_pkt_len(skb) in a temp variable before
calling fq_codel_drop()
Fixes: 9d18562a22 ("fq_codel: add batch ability to fq_codel_drop()")
Fixes: 2ccccf5fb4 ("net_sched: update hierarchical backlog too")
Reported-by: Stas Nichiporovich <stasn77@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In bpf_perf_event_read() and bpf_perf_event_output(), we must use
READ_ONCE() for fetching the struct file pointer, which could get
updated concurrently, so we must prevent the compiler from potential
refetching.
We already do this with tail calls for fetching the related bpf_prog,
but not so on stored perf events. Semantics for both are the same
with regards to updates.
Fixes: a43eec3042 ("bpf: introduce bpf_perf_event_output() helper")
Fixes: 35578d7984 ("bpf: Implement function bpf_perf_event_read() that get the selected hardware PMU conuter")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull userns fixes from Eric Biederman:
"This contains two small but significant fixes to fs/namespace.c.
The first adds a filesystem refcount drop on error. The second
corrects a test in fs_fully_visible which could be abused to allow
mounting of proc or sysfs, when that should not be allowed.
To keep myself honest I have tested to ensure the incorrect test in
fs_fully_visible actually allows improper mounting of proc before the
fix and that when fixed the improper mounting is not allowed"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
mnt: fs_fully_visible test the proper mount for MNT_LOCKED
mnt: If fs_fully_visible fails call put_filesystem.
ipoib_neigh_get unconditionally updates the "alive" variable member on
any packet send. This prevents the neighbor garbage collection from
cleaning out a dead neighbor entry if we are still queueing packets
for it. If the queue for this neighbor is full, then don't update the
alive timestamp. That way the neighbor can time out even if packets
are still being queued as long as none of them are being sent.
Fixes: b63b70d877 ("IPoIB: Use a private hash table for path lookup in xmit path")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>