This patch avoids having TCP sender or congestion control
overestimate the min RTT by orders of magnitude. This happens when
all the samples in the windowed filter are one-packet transfer
like small request and health-check like chit-chat, which is farily
common for applications using persistent connections. This patch
tries to conservatively labels and skip RTT samples obtained from
this type of workload.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Letting tipc_poll() dereference a socket's pointer to struct tipc_group
entails a race risk, as the group item may be deleted in a concurrent
tipc_sk_join() or tipc_sk_leave() thread.
We now move the 'open' flag in struct tipc_group to struct tipc_sock,
and let the former retain only a pointer to the moved field. This will
eliminate the race risk.
Reported-by: syzbot+799dafde0286795858ac@syzkaller.appspotmail.com
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the switch block in l2tp_nl_cmd_session_create() that
checks pseudowire-specific parameters since just L2TP_PWTYPE_ETH and
L2TP_PWTYPE_PPP are currently supported and no actual checks are
performed. Moreover the L2TP_PWTYPE_IP/default case presents a harmless
issue in error handling (break instead of goto out_tunnel)
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Acked-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
on T6, IPv6 filter would occupy 2 tids instead of 4.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lorenzo Bianconi says:
====================
l2tp: set l2specific_len based on l2specific_type
Do not rely on l2specific_len value provided by userspace but set sublayer
length according to l2specific_type.
Mark L2TP_ATTR_L2SPEC_LEN attribute as not used
Changes since v2:
- drop the patch related to a fix in the switch default case in
l2tp_nl_cmd_session_create()
- use L2SPECTYPE_NONE as default case in l2tp_get_l2specific_len()
Changes since v1:
- remove l2specific_len parameter
- add sanity check on l2specific_type provided by userspace
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Guillaume Nault <g.nault@alphalink.fr>
Tested-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove l2specific_len configuration parameter since now L2-Specific
Sublayer length is computed according to l2specific_type provided by
userspace.
Reviewed-by: Guillaume Nault <g.nault@alphalink.fr>
Tested-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove l2specific_len dependency while building l2tpv3 header or
parsing the received frame since default L2-Specific Sublayer is
always four bytes long and we don't need to rely on a user supplied
value.
Moreover in l2tp netlink code there are no sanity checks to
enforce the relation between l2specific_len and l2specific_type,
so sending a malformed netlink message is possible to set
l2specific_type to L2TP_L2SPECTYPE_DEFAULT (or even
L2TP_L2SPECTYPE_NONE) and set l2specific_len to a value greater than
4 leaking memory on the wire and sending corrupted frames.
Reviewed-by: Guillaume Nault <g.nault@alphalink.fr>
Tested-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add sanity check on l2specific_type provided by userspace in
l2tp_nl_cmd_session_create() since just L2TP_L2SPECTYPE_DEFAULT and
L2TP_L2SPECTYPE_NONE are currently supported.
Moreover explicitly set l2specific_type to L2TP_L2SPECTYPE_DEFAULT
only if the userspace does not provide a value for it
Reviewed-by: Guillaume Nault <g.nault@alphalink.fr>
Tested-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rahul Lakkireddy says:
====================
cxgb4: reduce memory footprint for collecting firmware dump
Firmware dump can be large (upto 2 GB). In low memory conditions,
ethtool fails to allocate such large memory. So, use zlib deflate
to compress collected firmware dump.
Patch 1 updates collection logic to use compression.
Patch 2 adds zlib deflate to compress collected firmware dump.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Use zlib deflate to compress firmware dump. Collect and compress
as much firmware dump as possible into a 32 MB buffer.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update firmware dump collection logic to use compression when available.
Let collection logic attempt to do compression, instead of returning out
of memory early.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Helmut reported a bug about devision by zero while
running traffic and doing physical cable pull test.
When the cable unplugged the ppms become zero, so when
dividing the current ppms by the previous ppms in the
next dim iteration there is devision by zero.
This patch prevent this division for both ppms and epms.
Fixes: c3164d2fc4 ("net/mlx5e: Added BW check for DIM decision mechanism")
Fixes: 4c4dbb4a73 ("net/mlx5e: Move dynamic interrupt coalescing code to include/linux")
Reported-by: Helmut Grauer <helmut.grauer@de.ibm.com>
Signed-off-by: Talat Batheesh <talatb@mellanox.com>
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes the following sparse warnings:
net/core/devlink.c:2297:25: warning:
symbol 'devlink_resource_find' was not declared. Should it be static?
net/core/devlink.c:2322:6: warning:
symbol 'devlink_resource_validate_children' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes the following sparse warning:
drivers/net/ethernet/mellanox/mlxsw/spectrum_kvdl.c:289:5: warning:
symbol 'mlxsw_sp_kvdl_part_occ' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The variable miistat is not used. So it is removed.
CC: Srinivas Eeda <srinivas.eeda@oracle.com>
CC: Joe Jin <joe.jin@oracle.com>
CC: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit 41033f029e ("snmp: Remove duplicate OUTMCAST stat
increment") one line of code became unneeded.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When CONFIG_KASAN is set, we can use relatively large amounts of kernel
stack space:
net/caif/cfctrl.c:555:1: warning: the frame size of 1600 bytes is larger than 1280 bytes [-Wframe-larger-than=]
This adds convenience wrappers around cfpkt_extr_head(), which is responsible
for most of the stack growth. With those wrapper functions, gcc apparently
starts reusing the stack slots for each instance, thus avoiding the
problem.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQFHBAABCgAxFiEE4bay/IylYqM/npjQHv7KIOw4HPYFAlphvhMTHG1rbEBwZW5n
dXRyb25peC5kZQAKCRAe/sog7Dgc9ocrB/92eX9hS7luraFld9CwH2Dq6U9vVTUE
bEUq9tLznxA8encUX6ulxBow0ZO/vzWjinDy+kjY4dzvswEV0SHm6YgAtd3uvj7W
4pLNfjl0aqNPdjM3mbr93isyNWixyr1VAX9hCi3sCgTbE29Ms9OmC3rF7HM8bI+J
thCAVxnNt0P9HngclMiopV92weDsGwcUUw/oU1+FpzEacjrp4frhh/9T1YCng4Ab
HAWgdTM1k5mKWQDUxzasOQrzcz+WncOox6dzw7IxWDLtNhCwBAHWu8drBLTCWEU6
e8NiKjjMuLafmhZZpXw+17GDeTYcxpzj7KQ+KR/HV2abXhAmrM+oGsTu
=MdGk
-----END PGP SIGNATURE-----
Merge tag 'linux-can-next-for-4.16-20180119' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next
Marc Kleine-Budde says:
====================
pull-request: can-next 2018-01-16
this is a pull request for net-next/master consisting of 1 patch.
This patch by Arnd Bergmann for the m_can driver silences a compiler
warning if CONFIG_PM is not selected.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Final few patches before the merge window, nothing really special.
ath9k
* add MSI support (not enabled by default yet)
rtlwifi
* support A-MSDU in A-MPDU aggregation
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJaYa7GAAoJEG4XJFUm622bp+IIAJuAvqfr5G1056cmvvpIDF1i
tsiNcmTkYlV33+99t1GXLgXATM5oYLOe4KLgVk+SxG2mCn2ZIun/TUUExM8tebIr
a6U1loiWhvztBHkzSglfIGknFHH6Ib6gZ6ti6cb5PpZE/ey1bsjlNBjvI5k2di5h
2KobJfQPk34e/DgJI49wYO6CwuzuT+rnNaWFzKEUoKm6lwGeUqpukV90gXN1O/qj
Xt2A9xIlUUEXHBfkJIef34+YMs/c9jeXqwoBMZIs8G/tQGeYOMjqbn7kTLW1MpMX
bu+59PCmshGdM/OMJsOZux5tm4DFGWaK/FYkedAJ1i5u4WzBEr1IJfo19es4oJM=
=kFNX
-----END PGP SIGNATURE-----
Merge tag 'wireless-drivers-next-for-davem-2018-01-19' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
wireless-drivers-next patches for 4.16
Final few patches before the merge window, nothing really special.
ath9k
* add MSI support (not enabled by default yet)
rtlwifi
* support A-MSDU in A-MPDU aggregation
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Building without CONFIG_PM results in a harmless warning:
drivers/net/can/m_can/m_can.c:1763:12: error: 'm_can_runtime_resume' defined but not used [-Werror=unused-function]
drivers/net/can/m_can/m_can.c:1752:12: error: 'm_can_runtime_suspend' defined but not used [-Werror=unused-function]
Marking the functions as __maybe_unused lets the compiler
silently drop them instead.
Fixes: cdf8259d65 ("can: m_can: Add PM Support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
gcc-4.4.4 has problems witn anon union initializers. Work around this.
net/sched/sch_prio.c: In function 'prio_dump_offload':
net/sched/sch_prio.c:260: error: unknown field 'stats' specified in initializer
net/sched/sch_prio.c:260: warning: initialization makes integer from pointer without a cast
net/sched/sch_prio.c:261: error: unknown field 'stats' specified in initializer
net/sched/sch_prio.c:261: warning: initialization makes integer from pointer without a cast
Fixes: 7fdb61b44c ("net: sch: prio: Add offload ability to PRIO qdisc")
Cc: Nogah Frankel <nogahf@mellanox.com>
Cc: Yuval Mintz <yuvalm@mellanox.com>
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trivial fix removes unneeded semicolons after if blocks.
This issue was detected by using the Coccinelle software.
Signed-off-by: Christopher Díaz Riveros <chrisadr@gentoo.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The trailing semicolon is an empty statement that does no operation.
Removing it since it doesn't do anything.
Signed-off-by: Luis de Bethencourt <luisbg@kernel.org>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When tcm->tcm_ifindex == TCM_IFINDEX_MAGIC_BLOCK, parent is still passed
down but the value is never used. Compiler does not recognize it and
issues a warning. Silence it down initializing parent to 0.
Fixes: 7960d1daf2 ("net: sched: use block index as a handle instead of qdisc when block is shared")
Reported-by: David Miller <davem@davemloft.net>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to the fact that A-MSDU deaggregation is done in software,
we set this flag to support the A-MSDU in A-MPDU
Signed-off-by: Steven Ting <steventing@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
wil6210 maintainer email and mail list has changed, hence update
its MAINTAINERS entry accordingly.
Signed-off-by: Maya Erez <merez@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
-----BEGIN PGP SIGNATURE-----
iQFHBAABCgAxFiEE4bay/IylYqM/npjQHv7KIOw4HPYFAlpeCCQTHG1rbEBwZW5n
dXRyb25peC5kZQAKCRAe/sog7Dgc9ti9B/4jAoCYTuYXnkRvS34jdgQQV99QyC8M
EpcAgzHo2kYNPDf5q8TEVheXxvA9XeGiA+TtjL9cNowxwMMtJev3/dmBtOmU6jek
RgOsYlR8guxBHOx8pj1IMl1xPoWTLfg4Kv1qnpXx3zvhOP610G/edBBSZt659MGF
SW6pBoNivbl+EYSM5x81QIfqJ4NlD5AKY4PeeSnGrnSthd8EFNp2zKkcY8nMOJ0D
Gm2YyxGJXh+lHse965DQRNg+owZxIWyheQplulVrw9v34LOjbFko3Cd+D9KLW5MG
LVVRJ3E0jm7W75AvxNcv2WP+lZVcDxXqsxFH0dP8WOJNZiKZeJ5aZfji
=ROXk
-----END PGP SIGNATURE-----
Merge tag 'linux-can-next-for-4.16-20180116' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next
Marc Kleine-Budde says:
====================
pull-request: can-next 2018-01-16
this is a pull request for net-next/master consisting of 9 patches.
This is a series of patches, some of them initially by Franklin S Cooper
Jr, which was picked up by Faiz Abbas. Faiz Abbas added some patches
while working on this series, I contributed one as well.
The first two patches add support to CAN device infrastructure to limit
the bitrate of a CAN adapter if the used CAN-transceiver has a certain
maximum bitrate.
The remaining patches improve the m_can driver. They add support for
bitrate limiting to the driver, clean up the driver and add support for
runtime PM.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The trailing semicolon is an empty statement that does no operation.
It is completely stripped out by the compiler. Removing it since it doesn't do
anything.
Fixes: 5f35227ea3 ("net: Generalize ndo_gso_check to ndo_features_check")
Signed-off-by: Luis de Bethencourt <luisbg@kernel.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
restructure the code which adds support for configuring
PCIe VF via mgmt netdevice. which was added by
commit 7829451c69 ("cxgb4: Add control net_device for
configuring PCIe VF")
Original work by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
idr_find() is safe under rcu_read_lock() and
maybe_get_net() guarantees that net is alive.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
peernet2id_alloc() is racy without rtnl_lock() as refcount_read(&peer->count)
under net->nsid_lock does not guarantee, peer is alive:
rcu_read_lock()
peernet2id_alloc() ..
spin_lock_bh(&net->nsid_lock) ..
refcount_read(&peer->count) (!= 0) ..
.. put_net()
.. cleanup_net()
.. for_each_net(tmp)
.. spin_lock_bh(&tmp->nsid_lock)
.. __peernet2id(tmp, net) == -1
.. ..
.. ..
__peernet2id_alloc(alloc == true) ..
.. ..
rcu_read_unlock() ..
.. synchronize_rcu()
.. kmem_cache_free(net)
After the above situation, net::netns_id contains id pointing to freed memory,
and any other dereferencing by the id will operate with this freed memory.
Currently, peernet2id_alloc() is used under rtnl_lock() everywhere except
ovs_vport_cmd_fill_info(), and this race can't occur. But peernet2id_alloc()
is generic interface, and better we fix it before someone really starts
use it in wrong context.
v2: Don't place refcount_read(&net->count) under net->nsid_lock
as suggested by Eric W. Biederman <ebiederm@xmission.com>
v3: Rebase on top of net-next
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang says:
====================
tun: allow to attach eBPF filter
This series tries to implement eBPF socket filter for tun. This could
be used for implementing efficient virtio-net receive filter for
vhost-net.
Changes from V2:
- fix typo
- remove unnecessary double check
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch allows userspace to attach eBPF filter to tun. This will
allow to implement VM dataplane filtering in a more efficient way
compared to cBPF filter by allowing either qemu or libvirt to
attach eBPF filter to tun.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To be reused by other eBPF program other than queue selection.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko says:
====================
net: sched: allow qdiscs to share filter block instances
Currently the filters added to qdiscs are independent. So for example if you
have 2 netdevices and you create ingress qdisc on both and you want to add
identical filter rules both, you need to add them twice. This patchset
makes this easier and mainly saves resources allowing to share all filters
within a qdisc - I call it a "filter block". Also this helps to save
resources when we do offload to hw for example to expensive TCAM.
So back to the example. First, we create 2 qdiscs. Both will share
block number 22. "22" is just an identification:
$ tc qdisc add dev ens7 ingress_block 22 ingress
^^^^^^^^^^^^^^^^
$ tc qdisc add dev ens8 ingress_block 22 ingress
^^^^^^^^^^^^^^^^
If we don't specify "block" command line option, no shared block would
be created:
$ tc qdisc add dev ens9 ingress
Now if we list the qdiscs, we will see the block index in the output:
$ tc qdisc
qdisc ingress ffff: dev ens7 parent ffff:fff1 ingress_block 22
qdisc ingress ffff: dev ens8 parent ffff:fff1 ingress_block 22
qdisc ingress ffff: dev ens9 parent ffff:fff1
To make is more visual, the situation looks like this:
ens7 ingress qdisc ens7 ingress qdisc
| |
| |
+----------> block 22 <----------+
Unlimited number of qdiscs may share the same block.
Note that this patchset introduces block sharing support also for clsact
qdisc:
$ tc qdisc add dev ens10 ingress_block 23 egress_block 24 clsact
$ tc qdisc show dev ens10
qdisc clsact ffff: dev ens10 parent ffff:fff1 ingress_block 23 egress_block 24
We can add filter using the block index:
$ tc filter add block 22 protocol ip pref 25 flower dst_ip 192.168.0.0/16 action drop
Note we cannot use the qdisc for filter manipulations of shared blocks:
$ tc filter add dev ens8 ingress protocol ip pref 1 flower dst_ip 192.168.100.2 action drop
Error: This filter block is shared. Please use the block index to manipulate the filters.
We will see the same output if we list filters for ingress qdisc of
ens7 and ens8, also for the block 22:
$ tc filter show block 22
filter block 22 protocol ip pref 25 flower chain 0
filter block 22 protocol ip pref 25 flower chain 0 handle 0x1
...
$ tc filter show dev ens7 ingress
filter block 22 protocol ip pref 25 flower chain 0
filter block 22 protocol ip pref 25 flower chain 0 handle 0x1
...
$ tc filter show dev ens8 ingress
filter block 22 protocol ip pref 25 flower chain 0
filter block 22 protocol ip pref 25 flower chain 0 handle 0x1
...
---
v10->v11:
- patch 2:
- fixed error path when register_pernet_subsys fails pointed out by Cong
- patch 9:
- rebased on top of the current net-next
v9->v10:
- patch 7:
- fixed ifindex magic in the patch description
- userspace patches:
- added manpages and patch descriptions
v8->v9:
- patch "net: sched: add rt netlink message type for block get" was
removed, userspace check filter existence using qdisc dump
v7->v8:
- patch 7:
- added comment to ifindex block magic
- patch 9:
- new patch
- patch 10:
- base this on the patch that introduces qdisc-generic block index
attributes parsing/dumping
- patch 13:
- rebased on top of current net-next
v6->v7:
- patch 1:
- unsquashed shared block patch that was previously squashed by mistake
- fixed error path in block create - freeing chain 0
- patch 2:
- new patch - splitted from the previous one as it got accidentaly
squashed in the rebasing process in the past
- converted to idr extended
- removed auto-generating of block indexes. Callers have to explicily
tell that the block is shared by passing non-zero block index
- fixed error path in block get ext - freeing chain 0
- patch 7:
- changed extack message for block index handle as suggested by DaveA
- added extack message when block index does not exist
- the block ifindex magic is in define and change to 0xffffffff
as suggested by Jamal
- patch 8:
- new patch implementing RTM_GETBLOCK in order to query if the block
with some index exists
- patch 9:
- adjust to the core changes and check block index attributes for being 0
v5->v6:
- added patch 6 that introduces block handle
v4->v5:
- patch 5:
- add tracking of binding of devs that are unable to offload and check
that before block cbs call.
v3->v4:
- patch 1:
- rebased on top of the current net-next
- added some extack strings
- patch 3:
- rebased on top of the current net-next
- patch 5:
- propagate netdev_ops->ndo_setup_tc error up to tcf_block_offload_bind
caller
- patch 7:
- rebased on top of the current net-next
v2->v3:
- removed original patch 1, removing tp->q cls_bpf dependency. Fixed by
Jakub in the meantime.
- patch 1:
- rebased on top of the current net-next
- patch 5:
- new patch
- patch 8:
- removed "p_" prefix from block index function args
- patch 10:
- add tc offload feature handling
====================
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
No need to convert from mlxsw_sp_port to net_device and back again.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Benefit from the prepared TC and in-driver ACL infrastructure and
introduce block sharing offload. For that, a new struct "block" is
introduced in spectrum_acl in order to hold a list of specific
block-port bindings.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead, pass netdev and ingress flag to ruleset unbind op.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to prepare for follow-up changes, make the bind/unbind helpers
very simple. That required move of ht insertion/removal and bind/unbind
calls into mlxsw_sp_acl_ruleset_create/destroy.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Benefit from the previously introduced shared filter blocks
infrastructure and allow ingress and clsact qdisc instances to share
filter blocks. The block index is coming from userspace as qdisc option.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce two new attributes to be used for qdisc creation and dumping.
One for ingress block, one for egress block. Introduce a set of ops that
qdisc which supports block sharing would implement.
Passing block indexes in qdisc change is not supported yet and it is
checked and forbidded.
In future, these attributes are to be reused for specifying block
indexes for classes as well. As of this moment however, it is not
supported so a check is in place to forbid it.
Suggested-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As the tcm_ifindex with value TCM_IFINDEX_MAGIC_BLOCK is invalid ifindex,
use it to indicate that we work with block, instead of qdisc.
So if tcm_ifindex is set to TCM_IFINDEX_MAGIC_BLOCK, tcm_parent is used
to carry block_index.
If the block is set to be shared between at least 2 qdiscs, it is
forbidden to use the qdisc handle to add/delete filters. In that case,
userspace has to pass block_index.
Also, for dump of the filters, in case the block is shared in between at
least 2 qdiscs, the each filter is dumped with tcm_ifindex value
TCM_IFINDEX_MAGIC_BLOCK and tcm_parent set to block_index. That gives
the user clear indication, that the filter belongs to a shared block
and not only to one qdisc under which it is dumped.
Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During block bind, we need to check tc offload feature. If it is
disabled yet still the block contains offloaded filters, forbid the
bind. Also forbid to register callback for a block that already
contains offloaded filters, as the play back is not supported now.
For keeping track of offloaded filters there is a new counter
introduced, alongside with couple of helpers called from cls_* code.
These helpers set and clear TCA_CLS_FLAGS_IN_HW flag.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Both are no longer used, so remove them.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Couple of classifiers call netif_keep_dst directly on q->dev. That is
not possible to do directly for shared blocke where multiple qdiscs are
owning the block. So introduce a infrastructure to keep track of the
block owners in list and use this list to implement block variant of
netif_keep_dst.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use block index in the messages instead.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow qdiscs to share filter blocks among them. Each qdisc type has to
use block get/put extended modifications that enable sharing.
Shared blocks are tracked within each net namespace and identified
by u32 index. This index is passed from user during the qdisc creation.
If user passes index that is not used by any other qdisc, new block
is created. If user passes index that is already used, the existing
block will be re-used.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
So far, there was possible only to register a single filter chain
pointer to block->chain[0]. However, when the blocks will get shareable,
we need to allow multiple filter chain pointers registration.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan says:
====================
bnxt_en: Updates for net-next.
First, we upgrade the firmware interface spec. Due to a change in
the toolchains, the auto-generated bnxt_hsi.h does not match the
old bnxt_hsi.h and the patch is really big. This should be just
one-time. Going forward, changes should be incremental.
The next 10 patches implement a new scheme for the PF and VF drivers
to allocate and reserve resources. The new scheme is more flexible
and allows dynamic and asymmetric distribution of resources, whereas
the old scheme is static and even distribution.
The last few patches add cacheline size setting, a couple of PCI IDs,
better management of VF MAC address, and a better parent switchdev ID
for dual-port devices.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>