Commit Graph

739260 Commits

Author SHA1 Message Date
Jian Shen
2097fdefa5 net: hns3: add result checking for VF when modify unicast mac address
VF changes unicast mac address by sending mailbox msg to PF, then PF
completes the mac address modification. It may fail when the target
uc mac address is already in the mac_vlan table. VF should be aware
of it by reading the message result.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-11 22:53:32 -04:00
Jian Shen
d07b6bb435 net: hns3: add existence checking before adding unicast mac address
It's not allowed to add two same unicast mac address entries to the
mac_vlan table. When modify the uc mac address of a VF device to the
same value with the PF device's, the PF device will lose its entry of
the mac_vlan table.

Lookup the mac address in the mac_vlan table, and add it if the entry
is inexistent.

Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-11 22:53:32 -04:00
Jian Shen
eefd00a5d7 net: hns3: fix return value error of hclge_get_mac_vlan_cmd_status()
Error code -EIO was used to indicate mutilple errors in function
hclge_get_mac_vlan_cmd_status().This patch fixes it by using
error code depending on the error type.

For no space error, return -ENOSPC.
For entry not found, return -ENOENT.
For command send fail, return -EIO.
For invalid op code, return -EINVAL.

Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-11 22:53:32 -04:00
Jian Shen
aa7a795eec net: hns3: fix error type definition of return value
An enum type variable was used to store an "int" type return value.
This patch fixes it.

Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-11 22:53:32 -04:00
Yunsheng Lin
5d02a58dae net: hns3: fix for buffer overflow smatch warning
This patch fixes the buffer overflow warning by refactoring
hclgevf_bind_ring_to_vector and hclge_get_ring_chain_from_mbx.

Fixes: e2cb1dec97 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Fixes: dde1a86e93 ("net: hns3: Add mailbox support to PF driver")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-11 22:53:32 -04:00
Yunsheng Lin
f96818a7cc net: hns3: fix for loopback failure when vlan filter is enable
When vlan ctag filter is enabled, the loopback selftest fails because
loopback selftest does not support vlan.

This patch fixes it by disabling the vlan ctag filter when runnig
loopback selftest.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-11 22:53:32 -04:00
Peng Li
64fd2300fc net: hns3: add support for querying pfc puase packets statistic
This patch add support for querying pfc puase packets statistic
in hclge_ieee_getpfc, which is used to tell user how many pfc
puase packets have been sent and received by this mac port.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-11 22:53:32 -04:00
Peng Li
f8d291f00b net: hns3: fix rx path skb->truesize reporting bug
Original skb->truesize reports the received packet size,
not the actual buffer size NIC driver allocated(1 Page).
The linux net protocol will misjudge the true size of rx queue.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-11 22:53:31 -04:00
Fuyun Liang
e98d7183f6 net: hns3: unify the pause params setup function
Since the firmware cmd to setup mac pause params is the same as the
firmware cmd to pfc pause params, this patch unifies the pause params
setup function.

Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-11 22:53:31 -04:00
Fuyun Liang
20e4bf982b net: hns3: fix for ipv6 address loss problem after setting channels
The function of dev_close and dev_open is just likes ifconfig <netif> down
and ifconfig <netif> up. The ipv6 address will be lost after dev_close and
dev_open are called. This patch uses hns3_nic_net_stop to replace dev_close
and uses hns3_nic_net_open to replace dev_open.

Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-11 22:53:31 -04:00
Fuyun Liang
8cc6c1f77b net: hns3: fix for netdev not running problem after calling net_stop and net_open
The link status update function is called by timer every second. But
net_stop and net_open may be called with very short intervals. The link
status update function can not detect the link state has changed. It
causes the netdev not running problem.

This patch fixes it by updating the link state in ae_stop function.

Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-11 22:53:31 -04:00
Fuyun Liang
590980558b net: hns3: add existence check when remove old uc mac address
When driver is in initial state, the mac_vlan table table is empty.
So the delete operation for mac address must fail. Existence check
is needed here. Otherwise, the error message will make user confused.

Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-11 22:53:31 -04:00
David S. Miller
daa13701a9 Merge branch 'selftests-forwarding-Tweaks-and-a-new-test'
Ido Schimmel says:

====================
selftests: forwarding: Tweaks and a new test

First patch adds a new test for VLAN-unaware bridges.

Next two patches make the tests fail in case they are missing interfaces
or dependencies.

Last patch allows one to create the veth interfaces even without the
optional configuration file.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-11 22:44:24 -04:00
Ido Schimmel
59be45c375 selftests: forwarding: Allow creation of interfaces without a config file
Some users want to be able to run the tests without a configuration file
which is useful when one needs to test both virtual and physical
interfaces on the same machine.

Move the defines that set the type of interface to create and whether to
create it away from the optional configuration file to the library like
the rest of the defines.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-11 22:44:24 -04:00
Ido Schimmel
231b85abaa selftests: forwarding: Exit with error when missing interfaces
Returning 0 gives a false sense of success when the required modules did
not even manage to be initialized and register the required net devices.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-11 22:44:24 -04:00
Ido Schimmel
ff0162af9e selftests: forwarding: Exit with error when missing dependencies
We already return an error when some dependencies (e.g., 'jq') are
missing so lets be consistent and do that for all.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-11 22:44:24 -04:00
Ido Schimmel
3a021ab564 selftests: forwarding: Add a test for VLAN-unaware bridge
Similar to the VLAN-aware bridge test, test the VLAN-unaware bridge and
make sure that ping, FDB learning and flooding work as expected.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-11 22:44:23 -04:00
David S. Miller
f44b1886a5 Merge branch 's390-qeth-next'
Julian Wiedmann says:

====================
s390/qeth: updates 2018-03-09

here is the current pile of qeth patches for net-next. Just the usual
small updates and clean ups. Please apply.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 13:10:06 -05:00
Julian Wiedmann
b1d5e36b41 s390/qeth: shrink qeth_ipaddr struct
Using up 8 bytes in every ipaddr object to store SETIP/DELIP flags is
rather wasteful. Except for takeover eligibility, the flag values all
just depend on the address type, so determine them on demand.

While at it reorder the struct to fill an alignment hole.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 13:10:05 -05:00
Julian Wiedmann
1617dae25e s390/qeth: extract helpers for managing special IPs
Reduce code duplication.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 13:10:05 -05:00
Julian Wiedmann
b9caa98c51 s390/qeth: simplify card look-up on IP notification
On an IP event, current code tries to determine if the netdev belongs
to a L3 card by walking all qeth cards in the system, and then all of
their VLAN devices too. Short-cut the whole thing by identifying a L3
device through its netdev_ops.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 13:10:05 -05:00
Julian Wiedmann
d66b1c0df3 s390/qeth: restructure IP notification handlers
Extract a helper that does the actual work & returns the right NOTIFY_*
responses, and start putting the temporary ipaddr container objects
on the stack rather than kmalloc'ing them. They are small, and this
reduces the confusion of which objects actually get added to qeth's
IP tables.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 13:10:05 -05:00
Julian Wiedmann
1b45c80be0 s390/qeth: reset NAPI context during queue init
init_qdio_queues() resets the Input Queue's overall QDIO state, and
positions the buffer cursor back to 0. So this is the obvious place to
also reset the queue's NAPI context (in contrast to doing it rather
randomly in the middle of the big set_online() path).
No functional change.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 13:10:05 -05:00
Julian Wiedmann
04f673983b s390/qeth: reduce RX skb setup
Newly-allocated skbs default to PACKET_HOST, and eth_type_trans() is
smart enough to determine any other packet type from the frame's
destination address.
So except for the IQD sniffer case, there is no need to set up
skb->pkt_type manually.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 13:10:05 -05:00
Julian Wiedmann
37cf05d2ff s390/qeth: allocate skb from NAPI cache
napi_alloc_skb() doesn't need to disable IRQs during the allocation,
and thus may save us a few cycles.
Doing so requires a small fix-up in the HiperTransport path, which
currently assumes a fixed NET_SKB_PAD headroom padding. napi_alloc_skb()
adds an additional NET_IP_ALIGN padding, so use the proper helper for
setting up the mac_header offset.

Use this opportunity to convert the non-NAPI path to netdev_alloc_skb(),
which means that skb->dev is now always set-up during allocation and
doesn't need to be assigned manually.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 13:10:04 -05:00
Julian Wiedmann
f0ea8bfbc1 s390/qeth: pass correct length to header_ops->create()
We need to pass the *payload* length, not the L2 address length.
For qeth (using eth_header()) this is merely a cosmetic change:
the parameter only matters when building headers for ETH_P_802_2
or ETH_P_802_3, whereas our fake headers are built with
ETH_P_IP / ETH_P_IPV6 / 0.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 13:10:04 -05:00
Julian Wiedmann
d3aacac4e4 s390/qeth: advertise IFF_UNICAST_FLT
qeth implements HW-based Unicast Filtering (via SETVMAC) on L2 devices.
Tell the stack, so it knows that receiving traffic for secondary
addresses doesn't require full-blown promiscuous mode.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 13:10:04 -05:00
Julian Wiedmann
0f34294527 s390/qeth: support SG for more device types
NETIF_F_SG support is currently limited to OSA (and for L2 even OSD)
devices. Advertise it for some more device types (OSM, L2 OSX, z/VM OSA)
that share the same code paths. For now, keep it switched off by
default on these devices.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 13:10:04 -05:00
Julian Wiedmann
d857e11193 s390/qeth: remove outdated portname debug msg
The 'portname' attribute is deprecated and setting it has no effect.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 13:10:04 -05:00
Julian Wiedmann
ff5caa7a28 s390/qeth: use __ipa_cmd() for casting an IPA cmd buffer
"s390/qeth: fix SETIP command handling" introduced a new helper, apply
it driver-wide.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 13:10:04 -05:00
Paolo Abeni
f5426250a6 net: introduce IFF_NO_RX_HANDLER
Some network devices - notably ipvlan slave - are not compatible with
any kind of rx_handler. Currently the hook can be installed but any
configuration (bridge, bond, macsec, ...) is nonfunctional.

This change allocates a priv_flag bit to mark such devices and explicitly
forbid installing a rx_handler if such bit is set. The new bit is used
by ipvlan slave device.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 13:00:08 -05:00
Ganesh Goudar
d185efc1da cxgb4: increase max tx rate limit to 100 Gbps
T6 cards can support up to 100 G speeds. So, increase
max programmable tx rate limit to 100 Gbps.

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 12:57:35 -05:00
Gustavo A. R. Silva
35951393bb pktgen: Remove VLA usage
In preparation to enabling -Wvla, remove VLA usage and replace it
with a fixed-length array instead.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 11:57:17 -05:00
Vaibhav Murkute
ff3c1b1a81 drivers: vhost: vsock: fixed a brace coding style issue
Fixed a coding style issue.

Signed-off-by: Vaibhav Murkute <vaibhavmurkute88@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 11:39:44 -05:00
David S. Miller
feaf751d8d Merge branch 'hns3-fixes-for-configuration-lost-problems'
Peng Li says:

====================
fixes for configuration lost problems

This patchset refactors some functions and some bugs in order
to fix the configuration loss problem when resetting and
setting channel number.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 11:33:15 -05:00
Yunsheng Lin
7a242b232a net: hns3: fix for coal configuation lost when setting the channel
This patch fixes the coalesce configuation lost problem when
setting the channel number by restoring all vectors's coalesce
configuation to vector 0's, because all vectors belonging to
the same netdev have the same coalesce configuation for now.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 11:33:14 -05:00
Yunsheng Lin
9bc727a9d5 net: hns3: refactor the coalesce related struct
This patch refoctors the coalesce related struct by introducing
the hns3_enet_coalesce struct, in order to fix the coalesce
configuation lost problem when changing the channel number.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 11:33:14 -05:00
Yunsheng Lin
dd38c72604 net: hns3: fix for coalesce configuration lost during reset
Coalesce configuration will be set to default value by
hns3_nic_init_vector_data during reset, which causes the
coalesce configuration loss problem.

This patch fixes it by setting the default value in
hns3_nic_alloc_vector_data, which will not be called in the
reset process.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 11:33:14 -05:00
Yunsheng Lin
0d3e6631de net: hns3: refactor the get/put_vector function
There is a get_vector function, which allocate the vectors
for a client, but there is not a put_vector to free the
vector.

This patch introduces the put_vector function in order to
fix the coalesce configuration lost problem during reset
process.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 11:33:14 -05:00
Yunsheng Lin
ec77789032 net: hns3: fix for use-after-free when setting ring parameter
In hns3_set_ringparam, hns3_uninit_all_ring frees the
memory pointed by priv->ring_data[i].ring, and
hns3_change_all_ring_bd_num use that pointer without mallocing,
which will cause a use-after-free problem.

The patch fixes it by not freeing the memory in
hns3_uninit_all_ring, and uses hns3_put_ring_config to free it
when necessary.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 11:33:14 -05:00
Yunsheng Lin
f31c1ba668 net: hns3: fix for pause configuration lost during reset
Pause configuration will be set to default value by hclge_tm_schd_init
during reset, which causes the RSS configuration loss problem.

This patch fixes it by calling hclge_tm_init_hw during reset process
, which will set the pause configuration to default value.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 11:33:14 -05:00
Yunsheng Lin
268f5dfade net: hns3: fix for RSS configuration loss problem during reset
RSS configuration will be set to default value by hclge_rss_init_hw
during reset, which causes the RSS configuration loss problem.

This patch fixes it by setting the default value in
hclge_rss_init_cfg function, which will not be called in the reset
process.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 11:33:13 -05:00
Yunsheng Lin
6f2af42955 net: hns3: refactor the hclge_get/set_rss_tuple function
This patch refactors the hclge_get/set_rss_tuple function
in order to fix the rss configuration loss problem during
reset process.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 11:33:13 -05:00
Yunsheng Lin
89523cfaa5 net: hns3: refactor the hclge_get/set_rss function
This patch refactors the hclge_get/set_rss function in
order to fix the rss configuration loss problem during
reset process.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 11:33:13 -05:00
David S. Miller
0623951eb8 Merge branch 'sched-action-events'
Roman Mashak says:

====================
Fix event generation for actions batch Add/Delete mode

When adding or deleting a batch of entries, the kernel sends upto
TCA_ACT_MAX_PRIO entries in an event to user space. However it does not
consider that the action sizes may vary and require different skb sizes.

For example :

% cat tc-batch.sh
TC="sudo /mnt/iproute2.git/tc/tc"

$TC actions flush action gact
for i in `seq 1 $1`;
do
   cmd="action pass index $i "
   args=$args$cmd
done
$TC actions add $args
%
% ./tc-batch.sh 32
Error: Failed to fill netlink attributes while adding TC action.
We have an error talking to the kernel
%

This patchset introduces new callback in tc_action_ops, which calculates
the action size, and passes size to tcf_add_notify()/tcf_del_notify(). The
patch fixes act_gact, and the rest of actions will be updated in the
follow-up patches.

v3:
   Fixed tcf_action_fill_size() to return shared attrs length when
   action ->get_fill_size() isn't implemented.
v2:
   Restructured patches to make them bisectable.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 11:25:12 -05:00
Roman Mashak
9c5c9c5737 net sched actions: implement get_fill_size routine in act_gact
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 11:25:12 -05:00
Roman Mashak
4e76e75d6a net sched actions: calculate add/delete event message size
Introduce routines to calculate size of the shared tc netlink attributes
and the full message size including netlink header and tc service header.

Update add/delete action logic to have the size for event messages,
the size is passed to tcf_add_notify() and tcf_del_notify() where the
notification message is being allocated and constructed.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 11:25:11 -05:00
Roman Mashak
a03b91b176 net sched actions: add new tc_action_ops callback
Add a new callback in tc_action_ops, it will be needed by the tc actions
to compute its size when a ADD/DELETE notification message is constructed.
This routine has to take into account optional/variable size TLVs specific
per action.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 11:25:11 -05:00
Roman Mashak
d04e6990c9 net sched actions: update Add/Delete action API with new argument
Introduce a new function argument to carry total attributes size for
correct allocation of skb in event messages.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 11:25:11 -05:00
Eric Dumazet
79134e6ce2 net: do not create fallback tunnels for non-default namespaces
fallback tunnels (like tunl0, gre0, gretap0, erspan0, sit0,
ip6tnl0, ip6gre0) are automatically created when the corresponding
module is loaded.

These tunnels are also automatically created when a new network
namespace is created, at a great cost.

In many cases, netns are used for isolation purposes, and these
extra network devices are a waste of resources. We are using
thousands of netns per host, and hit the netns creation/delete
bottleneck a lot. (Many thanks to Kirill for recent work on this)

Add a new sysctl so that we can opt-out from this automatic creation.

Note that these tunnels are still created for the initial namespace,
to be the least intrusive for typical setups.

Tested:
lpk43:~# cat add_del_unshare.sh
for i in `seq 1 40`
do
 (for j in `seq 1 100` ; do  unshare -n /bin/true >/dev/null ; done) &
done
wait

lpk43:~# echo 0 >/proc/sys/net/core/fb_tunnels_only_for_init_net
lpk43:~# time ./add_del_unshare.sh

real	0m37.521s
user	0m0.886s
sys	7m7.084s
lpk43:~# echo 1 >/proc/sys/net/core/fb_tunnels_only_for_init_net
lpk43:~# time ./add_del_unshare.sh

real	0m4.761s
user	0m0.851s
sys	1m8.343s
lpk43:~#

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 11:23:11 -05:00