linux

Author	SHA1	Message	Date
Andrii Nakryiko	babf316409	bpf: Add bpf_link_new_file that doesn't install FD Add bpf_link_new_file() API for cases when we need to ensure anon_inode is successfully created before we proceed with expensive BPF program attachment procedure, which will require equally (if not more so) expensive and potentially failing compensation detachment procedure just because anon_inode creation failed. This API allows to simplify code by ensuring first that anon_inode is created and after BPF program is attached proceed with fd_install() that can't fail. After anon_inode file is created, link can't be just kfree()'d anymore, because its destruction will be performed by deferred file_operations->release call. For this, bpf_link API required specifying two separate operations: release() and dealloc(), former performing detachment only, while the latter frees memory used by bpf_link itself. dealloc() needs to be specified, because struct bpf_link is frequently embedded into link type-specific container struct (e.g., struct bpf_raw_tp_link), so bpf_link itself doesn't know how to properly free the memory. In case when anon_inode file was successfully created, but subsequent BPF attachment failed, bpf_link needs to be marked as "defunct", so that file's release() callback will perform only memory deallocation, but no detachment. Convert raw tracepoint and tracing attachment to new API and eliminate detachment from error handling path. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200309231051.1270337-1-andriin@fb.com	2020-03-11 14:02:48 +01:00
Nicolas Cavallari	ba32679cac	mac80211: Do not send mesh HWMP PREQ if HWMP is disabled When trying to transmit to an unknown destination, the mesh code would unconditionally transmit a HWMP PREQ even if HWMP is not the current path selection algorithm. Signed-off-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr> Link: https://lore.kernel.org/r/20200305140409.12204-1-cavallar@lri.fr Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-03-11 09:04:14 +01:00
Jakub Kicinski	5cde05c61c	nl80211: add missing attribute validation for channel switch Add missing attribute validation for NL80211_ATTR_OPER_CLASS to the netlink policy. Fixes: `1057d35ede` ("cfg80211: introduce TDLS channel switch commands") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20200303051058.4089398-4-kuba@kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-03-11 08:58:39 +01:00
Jakub Kicinski	056e9375e1	nl80211: add missing attribute validation for beacon report scanning Add missing attribute validation for beacon report scanning to the netlink policy. Fixes: `1d76250bd3` ("nl80211: support beacon report scanning") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20200303051058.4089398-3-kuba@kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-03-11 08:58:31 +01:00
Jakub Kicinski	0e1a1d853e	nl80211: add missing attribute validation for critical protocol indication Add missing attribute validation for critical protocol fields to the netlink policy. Fixes: `5de1798489` ("cfg80211: introduce critical protocol indication from user-space") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20200303051058.4089398-2-kuba@kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-03-11 08:58:27 +01:00
David S. Miller	96ee187bad	Merge branch 'ethtool-consolidate-irq-coalescing-part-3' Jakub Kicinski says: ==================== ethtool: consolidate irq coalescing - part 3 Convert more drivers following the groundwork laid in a recent patch set [1] and continued in [2]. The aim of the effort is to consolidate irq coalescing parameter validation in the core. This set converts 15 drivers in drivers/net/ethernet. 3 more conversion sets to come. None of the drivers here checked all unsupported parameters. [1] https://lore.kernel.org/netdev/20200305051542.991898-1-kuba@kernel.org/ [2] https://lore.kernel.org/netdev/20200306010602.1620354-1-kuba@kernel.org/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:28:54 -07:00
Jakub Kicinski	d13f1167ab	net: gemini: reject unsupported coalescing params Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:28:53 -07:00
Jakub Kicinski	009ab69b4b	net: cxgb4vf: reject unsupported coalescing params Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:28:53 -07:00
Jakub Kicinski	5608c64179	net: cxgb4: reject unsupported coalescing params Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:28:53 -07:00
Jakub Kicinski	62923b6abe	net: cxgb3: reject unsupported coalescing params Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:28:53 -07:00
Jakub Kicinski	d824178d0f	net: cxgb2: reject unsupported coalescing params Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:28:53 -07:00
Jakub Kicinski	bd4be35b4a	net: mlx4: reject unsupported coalescing params Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:28:53 -07:00
Jakub Kicinski	812df69beb	net: liquidio: reject unsupported coalescing params Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:28:53 -07:00
Jakub Kicinski	659d0760b0	net: bna: reject unsupported coalescing params Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:28:53 -07:00
Jakub Kicinski	3eb2efbea1	net: tg3: reject unsupported coalescing params Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:28:53 -07:00
Jakub Kicinski	f6f508c07a	net: bcmgenet: reject unsupported coalescing params Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject all unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:28:53 -07:00
Jakub Kicinski	a0dadb331d	net: bnx2x: reject unsupported coalescing params Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:28:53 -07:00
Jakub Kicinski	05c531452f	net: bnx2: reject unsupported coalescing params Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:28:53 -07:00
Jakub Kicinski	f4a76615f0	net: systemport: reject unsupported coalescing params Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject most of unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:28:52 -07:00
Jakub Kicinski	fcca747f18	net: aquantia: reject all unsupported coalescing params Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver only rejected some of the unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:28:52 -07:00
Jakub Kicinski	8e4f90caf0	net: ena: reject unsupported coalescing params Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Sameeh Jubran <sameehj@amazon.com> Acked-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:28:52 -07:00
Heiner Kallweit	314a9cbbfb	r8169: simplify getting stats by using netdev_stats_to_stats64 Let netdev_stats_to_stats64() do the copy work for us. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:23:44 -07:00
Heiner Kallweit	047521d7b1	r8169: let rtl8169_mark_to_asic clear rx descriptor field opts2 Clearing opts2 belongs to preparing the descriptor for DMA engine use. Therefore move it into rtl8169_mark_to_asic(). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:23:23 -07:00
David S. Miller	6ee2425804	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 100GbE Intel Wired LAN Driver Updates 2020-03-10 This series contains updates to ice and iavf drivers. Cleaned up unnecessary parenthesis, which was pointed out by Sergei Shtylyov. Mitch updates the iavf and ice drivers to expand the limitation on the number of queues that the driver can support to account for the newer 800-series capabilities. Brett cleans up the error messages for both SR-IOV and non SR-IOV use cases. Fixed the logic when the ice driver is removed and a bare-metal VF is passing traffic, which was causing a transmit hang on the VF. Updated the ice driver to display "Link detected" field via ethtool, when the driver is in safe mode. Updated ice driver to properly set VLAN pruning when transmit anti-spoof is off. Avinash fixed a corner case in DCB, when switching from IEEE to CEE mode, the DCBX mode does not get properly updated. Dave updates the logic when switching from software DCB to firmware DCB to renegotiate DCBX to ensure the firmware agent has up to date information about the DCB settings of the link partner. Lukasz increases the PF's mailbox receive queue size to the maximum to prevent potential bottleneck or slow down occurring from the PF's mailbox receive queue being full. Bruce updates the ice driver to use strscpy() instead of strlcpy(). Cleaned up variable names that were not very descriptive with names that had more meaning. Anirudh replaces the use of ENOTSUPP with EOPNOTSUPP in the ice driver. Jake fixed up a function header comment to properly reflect the variable size and use. v2: Dropped patch 5 of the original series, where Tony added tunnel offload support. Based on community feedback, the patch needed changes, so giving Tony additional time to work on those changes and not hold up the remaining changes in the series. ==================== Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:20:03 -07:00
DENG Qingfang	13e787ca82	net: dsa: mt7530: fix macro MIRROR_PORT The inner pair of parentheses should be around the variable x Fixes: `37feab6076` ("net: dsa: mt7530: add support for port mirroring") Signed-off-by: DENG Qingfang <dqfext@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:12:54 -07:00
George McCollister	469b390e1b	net: dsa: microchip: use delayed_work instead of timer + work Simplify ksz_common.c by using delayed_work instead of a combination of timer and work. Signed-off-by: George McCollister <george.mccollister@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:10:19 -07:00
David S. Miller	2165fdf4bc	Merge branch 's390-qeth-fixes' Julian Wiedmann says: ==================== s390/qeth: fixes 2020-03-10 This fixes three minor issues: 1) a setup parameter gets cleared unnecessarily when the HW config changes, 2) insufficient error handling when initially filling the RX ring, and 3) a rarely used worker that needs to be cancelled during tear down. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:07:49 -07:00
Julian Wiedmann	0e635c2a87	s390/qeth: cancel RX reclaim work earlier When qeth's napi poll code fails to refill an entirely empty RX ring, it kicks off buffer_reclaim_work to try again later. Make sure that this worker is cancelled when setting the qeth device offline. Otherwise a RX refill action can unexpectedly end up running concurrently to bigger re-configurations (eg. resizing the buffer pool), without any locking. Fixes: `b333293058` ("qeth: add support for af_iucv HiperSockets transport") Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:07:49 -07:00
Julian Wiedmann	1741385280	s390/qeth: handle error when backing RX buffer qeth_init_qdio_queues() fills the RX ring with an initial set of RX buffers. If qeth_init_input_buffer() fails to back one of the RX buffers with memory, we need to bail out and report the error. Fixes: `4a71df5004` ("qeth: new qeth device driver") Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:07:49 -07:00
Julian Wiedmann	240c194849	s390/qeth: don't reset default_out_queue When an OSA device in prio-queue setup is reduced to 1 TX queue due to HW restrictions, we reset its the default_out_queue to 0. In the old code this was needed so that qeth_get_priority_queue() gets the queue selection right. But with proper multiqueue support we already reduced dev->real_num_tx_queues to 1, and so the stack puts all traffic on txq 0 without even calling .ndo_select_queue. Thus we can preserve the user's configuration, and apply it if the OSA device later re-gains support for multiple TX queues. Fixes: `73dc2daf11` ("s390/qeth: add TX multiqueue support for OSA devices") Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:07:49 -07:00
David S. Miller	377bb76444	Merge branch 'flow_offload-follow-ups-to-HW-stats-type-patchset' Jiri Pirko says: ==================== flow_offload: follow-ups to HW stats type patchset This patchset includes couple of patches in reaction to the discussions to the original HW stats patchset. The first patch is a fix, the other two patches are basically cosmetics. ==================== Acked-by: Edward Cree <ecree@solarflare.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:04:19 -07:00
Jiri Pirko	a16fa28984	flow_offload: restrict driver to pass one allowed bit to flow_action_hw_stats_types_check() The intention of this helper was to allow driver to specify one type that it supports, so not only "any" value would pass. So make the API more strict and allow driver to pass only 1 bit that is going to be checked. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:04:19 -07:00
Jiri Pirko	42d5fe5f9c	flow_offload: turn hw_stats_type into dedicated enum Put the values into enum and add an enum to define the bits. Suggested-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:04:19 -07:00
Jiri Pirko	a393daa899	flow_offload: fix allowed types check Change the check to see if the passed allowed type bit is enabled. Fixes: `319a1d1947` ("flow_offload: check for basic action hw stats type") Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:04:19 -07:00
David S. Miller	a2d8bf77a2	Merge branch 'MACSec-bugfixes-related-to-MAC-address-change' Igor Russkikh says: ==================== MACSec bugfixes related to MAC address change We found out that there's an issue in MACSec code when the MAC address is changed. Both s/w and offloaded implementations don't update SCI when the MAC address changes at the moment, but they should do so, because SCI contains MAC in its first 6 octets. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 15:59:32 -07:00
Dmitry Bogdanov	09f4136c5d	net: macsec: invoke mdo_upd_secy callback when mac address changed Notify the offload engine about MAC address change to reconfigure it accordingly. Fixes: `3cf3227a21` ("net: macsec: hardware offloading infrastructure") Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 15:59:32 -07:00
Dmitry Bogdanov	6fc498bc82	net: macsec: update SCI upon MAC address change. SCI should be updated, because it contains MAC in its first 6 octets. Fixes: `c09440f7dc` ("macsec: introduce IEEE 802.1AE driver") Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 15:59:32 -07:00
Juliet Kim	7d7195a026	ibmvnic: Do not process device remove during device reset The ibmvnic driver does not check the device state when the device is removed. If the device is removed while a device reset is being processed, the remove may free structures needed by the reset, causing an oops. Fix this by checking the device state before processing device remove. Signed-off-by: Juliet Kim <julietk@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 15:50:42 -07:00
David S. Miller	79c57bffeb	Merge branch 'enetc-Support-extended-BD-rings-at-runtime' Claudiu Manoil says: ==================== enetc: Support extended BD rings at runtime First two patches are just misc code cleanup. The 3rd patch prepares the Rx BD processing code to be extended to processing both normal and extended BDs. The last one adds extended Rx BD support for timestamping without the need of a static config. Finally, the config option FSL_ENETC_HW_TIMESTAMPING can be dropped. Care was taken not to impact non-timestamping usecases. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 15:48:54 -07:00
Claudiu Manoil	434cebabd3	enetc: Add dynamic allocation of extended Rx BD rings Hardware timestamping support (PTP) on Rx requires extended buffer descriptors, double the size of normal Rx descriptors. On the current controller revision only the timestamping offload requires extended Rx descriptors. Since Rx timestamping can be turned on/off at runtime, make Rx ring allocation configurable at runtime too. As a result, the static config option FSL_ENETC_HW_TIMESTAMPING can be dropped and the extended descriptors can be used only when Rx timestamping gets activated. The extension has the same size as the base descriptor, making the descriptor iterators easy to update for the extended case. Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 15:48:54 -07:00
Claudiu Manoil	714239ac63	enetc: Clean up Rx BD iteration Improve maintainability of the code iterating the Rx buffer descriptors to prepare it to support iterating extended Rx BD descriptors as well. Don't increment by one the h/w descriptor pointers explicitly, provide an iterator that takes care of the h/w details. Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 15:48:54 -07:00
Claudiu Manoil	a784c92ee2	enetc: Clean up of ehtool stats len Refactor the stats len computation code to make it easier to add new stats counters. Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 15:48:54 -07:00
Claudiu Manoil	9ff3dd7b84	enetc: Drop redundant device node check The existence of the DT port node is the first thing checked at probe time, and probing won't reach this point if the node is missing. Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 15:48:54 -07:00
Lukas Wunner	1e09e5818b	pktgen: Allow on loopback device When pktgen is used to measure the performance of dev_queue_xmit() packet handling in the core, it is preferable to not hand down packets to a low-level Ethernet driver as it would distort the measurements. Allow using pktgen on the loopback device, thus constraining measurements to core code. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 15:44:59 -07:00
Jiri Pirko	62751b6808	flow_offload: use flow_action_for_each in flow_action_mixed_hw_stats_types_check() Instead of manually iterating over entries, use flow_action_for_each helper. Move the helper and wrap it to fit to 80 cols on the way. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 15:43:25 -07:00
Karsten Graul	ece0d7bd74	net/smc: cancel event worker during device removal During IB device removal, cancel the event worker before the device structure is freed. Fixes: `a4cf0443c4` ("smc: introduce SMC as an IB-client") Reported-by: syzbot+b297c6825752e7a07272@syzkaller.appspotmail.com Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 15:40:33 -07:00
Hangbin Liu	60380488e4	ipv6/addrconf: call ipv6_mc_up() for non-Ethernet interface Rafał found an issue that for non-Ethernet interface, if we down and up frequently, the memory will be consumed slowly. The reason is we add allnodes/allrouters addressed in multicast list in ipv6_add_dev(). When link down, we call ipv6_mc_down(), store all multicast addresses via mld_add_delrec(). But when link up, we don't call ipv6_mc_up() for non-Ethernet interface to remove the addresses. This makes idev->mc_tomb getting bigger and bigger. The call stack looks like: addrconf_notify(NETDEV_REGISTER) ipv6_add_dev ipv6_dev_mc_inc(ff01::1) ipv6_dev_mc_inc(ff02::1) ipv6_dev_mc_inc(ff02::2) addrconf_notify(NETDEV_UP) addrconf_dev_config /* Alas, we support only Ethernet autoconfiguration. */ return; addrconf_notify(NETDEV_DOWN) addrconf_ifdown ipv6_mc_down igmp6_group_dropped(ff02::2) mld_add_delrec(ff02::2) igmp6_group_dropped(ff02::1) igmp6_group_dropped(ff01::1) After investigating, I can't found a rule to disable multicast on non-Ethernet interface. In RFC2460, the link could be Ethernet, PPP, ATM, tunnels, etc. In IPv4, it doesn't check the dev type when calls ip_mc_up() in inetdev_event(). Even for IPv6, we don't check the dev type and call ipv6_add_dev(), ipv6_dev_mc_inc() after register device. So I think it's OK to fix this memory consumer by calling ipv6_mc_up() for non-Ethernet interface. v2: Also check IFF_MULTICAST flag to make sure the interface supports multicast Reported-by: Rafał Miłecki <zajec5@gmail.com> Tested-by: Rafał Miłecki <zajec5@gmail.com> Fixes: `74235a25c6` ("[IPV6] addrconf: Fix IPv6 on tuntap tunnels") Fixes: `1666d49e1d` ("mld: do not remove mld souce list info when set link down") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 15:37:49 -07:00
Linus Torvalds	f35111a946	Another update for the .clang-format macro list It has been a while since the last time I sent one! -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEPjU5OPd5QIZ9jqqOGXyLc2htIW0FAl5nujkACgkQGXyLc2ht IW3cUBAAtLrfdsFPr/uLeFpHFIPm+X3ZAq/nOQBdrerR4GJfMD/zDN5FjKiXlsfo KWIk/kG0nyqEnRrwX/oh8XqGdx3kgGuniZY9Sjo9XOMggS4pdYZBYo/6LVnJ1Mgo 2AOb9z+5wqOvsOLeIH46M0avOk2V2+hdlClx/ns6WeQnTipqbMzgWAfdnsTWk3i7 ApK4Sc/1yQPKz1Nhr/REcADU38x2fyfHj1r2EvCm2E3ljC/che5LkvCmZHr5Iqud iC/BooxdeCsFfN4mvl17IBsK1iveeRt+ZFW4NsijiMNBYrfUQ74PxEre+iCn1uk3 +KLsBnYVsqkrczyGPHjtZJnonEOjte7BQp076vA2QcTRDS3akAfdm94QsUwHJq39 iJVo8LHJhLPoIgfdhml3RFDLSA5FB7O5feC6gDNgPnq7X9FqNeNvJRJzqYTe3TtE UqXpbbqFf4I+1GOBOmOSOL8eSBfnJVoTOwVJsvcMF1SBfwpyYyQApOXsIQjKcZH+ KGIoUTd6bS6xc4FMx/6/O2Z7aWUaH72X1jADo0uP947BhQGQDcnRmH/cu+IqXceB sHs6a2a3wuJ9iqDb3fT1tEF/k2n9X5y9PPYqRynD1320E9OWihavG89+3m5BrfJd NFSxp8Buw72hjbTJDRv79RtVhZcYdUOAauO26IX+3K5X9ThZfrk= =t4To -----END PGP SIGNATURE----- Merge tag 'clang-format-for-linus-v5.6-rc6' of git://github.com/ojeda/linux Pull clang-format update from Miguel Ojeda: "Another update for the .clang-format macro list It has been a while since the last time I sent one!" * tag 'clang-format-for-linus-v5.6-rc6' of git://github.com/ojeda/linux: clang-format: Update with the latest for_each macro list	2020-03-10 15:36:27 -07:00
Shakeel Butt	d752a49865	net: memcg: late association of sock to memcg If a TCP socket is allocated in IRQ context or cloned from unassociated (i.e. not associated to a memcg) in IRQ context then it will remain unassociated for its whole life. Almost half of the TCPs created on the system are created in IRQ context, so, memory used by such sockets will not be accounted by the memcg. This issue is more widespread in cgroup v1 where network memory accounting is opt-in but it can happen in cgroup v2 if the source socket for the cloning was created in root memcg. To fix the issue, just do the association of the sockets at the accept() time in the process context and then force charge the memory buffer already used and reserved by the socket. Signed-off-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 15:33:05 -07:00
Shakeel Butt	e876ecc67d	cgroup: memcg: net: do not associate sock with unrelated cgroup We are testing network memory accounting in our setup and noticed inconsistent network memory usage and often unrelated cgroups network usage correlates with testing workload. On further inspection, it seems like mem_cgroup_sk_alloc() and cgroup_sk_alloc() are broken in irq context specially for cgroup v1. mem_cgroup_sk_alloc() and cgroup_sk_alloc() can be called in irq context and kind of assumes that this can only happen from sk_clone_lock() and the source sock object has already associated cgroup. However in cgroup v1, where network memory accounting is opt-in, the source sock can be unassociated with any cgroup and the new cloned sock can get associated with unrelated interrupted cgroup. Cgroup v2 can also suffer if the source sock object was created by process in the root cgroup or if sk_alloc() is called in irq context. The fix is to just do nothing in interrupt. WARNING: Please note that about half of the TCP sockets are allocated from the IRQ context, so, memory used by such sockets will not be accouted by the memcg. The stack trace of mem_cgroup_sk_alloc() from IRQ-context: CPU: 70 PID: 12720 Comm: ssh Tainted: 5.6.0-smp-DEV #1 Hardware name: ... Call Trace: <IRQ> dump_stack+0x57/0x75 mem_cgroup_sk_alloc+0xe9/0xf0 sk_clone_lock+0x2a7/0x420 inet_csk_clone_lock+0x1b/0x110 tcp_create_openreq_child+0x23/0x3b0 tcp_v6_syn_recv_sock+0x88/0x730 tcp_check_req+0x429/0x560 tcp_v6_rcv+0x72d/0xa40 ip6_protocol_deliver_rcu+0xc9/0x400 ip6_input+0x44/0xd0 ? ip6_protocol_deliver_rcu+0x400/0x400 ip6_rcv_finish+0x71/0x80 ipv6_rcv+0x5b/0xe0 ? ip6_sublist_rcv+0x2e0/0x2e0 process_backlog+0x108/0x1e0 net_rx_action+0x26b/0x460 __do_softirq+0x104/0x2a6 do_softirq_own_stack+0x2a/0x40 </IRQ> do_softirq.part.19+0x40/0x50 __local_bh_enable_ip+0x51/0x60 ip6_finish_output2+0x23d/0x520 ? ip6table_mangle_hook+0x55/0x160 __ip6_finish_output+0xa1/0x100 ip6_finish_output+0x30/0xd0 ip6_output+0x73/0x120 ? __ip6_finish_output+0x100/0x100 ip6_xmit+0x2e3/0x600 ? ipv6_anycast_cleanup+0x50/0x50 ? inet6_csk_route_socket+0x136/0x1e0 ? skb_free_head+0x1e/0x30 inet6_csk_xmit+0x95/0xf0 __tcp_transmit_skb+0x5b4/0xb20 __tcp_send_ack.part.60+0xa3/0x110 tcp_send_ack+0x1d/0x20 tcp_rcv_state_process+0xe64/0xe80 ? tcp_v6_connect+0x5d1/0x5f0 tcp_v6_do_rcv+0x1b1/0x3f0 ? tcp_v6_do_rcv+0x1b1/0x3f0 __release_sock+0x7f/0xd0 release_sock+0x30/0xa0 __inet_stream_connect+0x1c3/0x3b0 ? prepare_to_wait+0xb0/0xb0 inet_stream_connect+0x3b/0x60 __sys_connect+0x101/0x120 ? __sys_getsockopt+0x11b/0x140 __x64_sys_connect+0x1a/0x20 do_syscall_64+0x51/0x200 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The stack trace of mem_cgroup_sk_alloc() from IRQ-context: Fixes: `2d75807383` ("mm: memcontrol: consolidate cgroup socket tracking") Fixes: `d979a39d72` ("cgroup: duplicate cgroup reference when cloning sockets") Signed-off-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Roman Gushchin <guro@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 15:33:05 -07:00

... 5 6 7 8 9 ...

903445 Commits