linux

Author	SHA1	Message	Date
Jacob Keller	e5f36ad14c	igb: check for Tx timestamp timeouts during watchdog The igb driver has logic to handle only one Tx timestamp at a time, using a state bit lock to avoid multiple requests at once. It may be possible, if incredibly unlikely, that a Tx timestamp event is requested but never completes. Since we use an interrupt scheme to determine when the Tx timestamp occurred we would never clear the state bit in this case. Add an igb_ptp_tx_hang() function similar to the already existing igb_ptp_rx_hang() function. This function runs in the watchdog routine and makes sure we eventually recover from this case instead of permanently disabling Tx timestamps. Note: there is no currently known way to cause this without hacking the driver code to force it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 01:03:17 -07:00
Jacob Keller	c3b8f85ec2	igb: add statistic indicating number of skipped Tx timestamps The igb driver can only handle one Tx timestamp request at a time. This means it is possible for an application timestamp request to be ignored. There is no easy way for an administrator to determine if this occurred. Add a new statistic which tracks this, tx_hwtstamp_skipped. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 01:02:05 -07:00
Jacob Keller	cff5714145	e1000e: add statistic indicating number of skipped Tx timestamps The e1000e driver can only handle one Tx timestamp request at a time. This means it is possible for an application timestamp request to be ignored. There is no easy way for an administrator to determine if this occurred. Add a new statistic which tracks this, tx_hwtstamp_skipped. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 01:01:27 -07:00
Jacob Keller	74344e32fc	igb: avoid permanent lock of *_PTP_TX_IN_PROGRESS The igb driver uses a state bit lock to avoid handling more than one Tx timestamp request at once. This is required because hardware is limited to a single set of registers for Tx timestamps. The state bit lock is not properly cleaned up during igb_xmit_frame_ring() if the transmit fails such as due to DMA or TSO failure. In some hardware this results in blocking timestamps until the service task times out. In other hardware this results in a permanent lock of the timestamp bit because we never receive an interrupt indicating the timestamp occurred, since indeed the packet was never transmitted. Fix this by checking for DMA and TSO errors in igb_xmit_frame_ring() and properly cleaning up after ourselves when these occur. Reported-by: Reported-by: David Mirabito <davidm@metamako.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 00:53:48 -07:00
Jacob Keller	4ccdc013b0	igb: fix race condition with PTP_TX_IN_PROGRESS bits Hardware related to the igb driver has a limitation of only handling one Tx timestamp at a time. Thus, the driver uses a state bit lock to enforce that only one timestamp request is honored at a time. Unfortunately this suffers from a simple race condition. The bit lock is not cleared until after skb_tstamp_tx() is called notifying the stack of a new Tx timestamp. Even a well behaved application which sends only one timestamp request at once and waits for a response might wake up and send a new packet before the bit lock is cleared. This results in needlessly dropping some Tx timestamp requests. We can fix this by unlocking the state bit as soon as we read the Timestamp register, as this is the first point at which it is safe to unlock. To avoid issues with the skb pointer, we'll use a copy of the pointer and set the global variable in the driver structure to NULL first. This ensures that the next timestamp request does not modify our local copy of the skb pointer. This ensures that well behaved applications do not accidentally race with the unlock bit. Obviously an application which sends multiple Tx timestamp requests at once will still only timestamp one packet at a time. Unfortunately there is nothing we can do about this. Reported-by: David Mirabito <davidm@metamako.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 00:53:07 -07:00
Jacob Keller	5012863b73	e1000e: fix race condition around skb_tstamp_tx() The e1000e driver and related hardware has a limitation on Tx PTP packets which requires we limit to timestamping a single packet at once. We do this by verifying that we never request a new Tx timestamp while we still have a tx_hwtstamp_skb pointer. Unfortunately the driver suffers from a race condition around this. The tx_hwtstamp_skb pointer is not set to NULL until after skb_tstamp_tx() is called. This function notifies the stack and applications of a new timestamp. Even a well behaved application that only sends a new request when the first one is finished might be woken up and possibly send a packet before we can free the timestamp in the driver again. The result is that we needlessly ignore some Tx timestamp requests in this corner case. Fix this by assigning the tx_hwtstamp_skb pointer prior to calling skb_tstamp_tx() and use a temporary pointer to hold the timestamped skb until that function finishes. This ensures that the application is not woken up until the driver is ready to begin timestamping a new packet. This ensures that well behaved applications do not accidentally race with condition to skip Tx timestamps. Obviously an application which sends multiple Tx timestamp requests at once will still only timestamp one packet at a time. Unfortunately there is nothing we can do about this. Reported-by: David Mirabito <davidm@metamako.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 00:52:17 -07:00
Arnd Bergmann	000ba1f2eb	igb: mark PM functions as __maybe_unused The new wake function is only used by the suspend/resume handlers that are defined in inside of an #ifdef, which can cause this harmless warning: drivers/net/ethernet/intel/igb/igb_main.c:7988:13: warning: 'igb_deliver_wake_packet' defined but not used [-Wunused-function] Removing the #ifdef, instead using a __maybe_unused annotation simplifies the code and avoids the warning. Fixes: `b90fa87635` ("igb: Enable reading of wake up packet") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 00:51:36 -07:00
Matwey V Kornilov	440aeca4b9	igb: Explicitly select page 0 at initialization The functions igb_read_phy_reg_gs40g/igb_write_phy_reg_gs40g (which were removed in `2a3cdea`) explicitly selected the required page at every phy_reg access. Currently, igb_get_phy_id_82575 relays on the fact that page 0 is already selected. The assumption is not fulfilled for my Lex 3I380CW motherboard with integrated dual i211 based gigabit ethernet. This leads to igb initialization failure and network interfaces are not working: igb: Intel(R) Gigabit Ethernet Network Driver - version 5.4.0-k igb: Copyright (c) 2007-2014 Intel Corporation. igb: probe of 0000:01:00.0 failed with error -2 igb: probe of 0000:02:00.0 failed with error -2 In order to fix it, we explicitly select page 0 before first access to phy registers. See also: https://bugzilla.suse.com/show_bug.cgi?id=1009911 See also: http://www.lex.com.tw/products/pdf/3I380A&3I380CW.pdf Fixes: `2a3cdea` ("igb: Remove GS40G specific defines/functions") Cc: <stable@vger.kernel.org> # 4.5+ Signed-off-by: Matwey V Kornilov <matwey@sai.msu.ru> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 00:49:20 -07:00
Colin Ian King	9d15e5cc8c	mdio: mux: fix an incorrect less than zero error check using a u32 The u32 variable v is being checked to see if an error return is less than zero and this check has no effect because it is unsigned. Fix this by making v and int (this also matches the type of cb->bus_number which is assigned to the value in v). Detected by CoverityScan, CID#1440454 ("Unsigned compared against zero") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-05 17:45:51 -04:00
Icenowy Zheng	2f878491b3	net-next: stmmac: dwmac-sun8i: ensure the EPHY is properly reseted The EPHY may be already enabled by bootloaders which have Ethernet capability (e.g. current U-Boot). Thus it should be reseted properly before doing the enabling sequence in the dwmac-sun8i driver, otherwise the EMAC reset process may fail if no cable is plugged, and then fail the dwmac-sun8i probing. Tested on Orange Pi PC, One and Zero. All the boards fail to have dwmac-sun8i probed with "EMAC reset timeout" without cable plugged before, and with this fix they're now all able to successfully probe the EMAC without cable plugged and then use the connection after a cable is hot-plugged in. Fixes: `9f93ac8d40` ("net-next: stmmac: Add dwmac-sun8i") Signed-off-by: Icenowy Zheng <icenowy@aosc.io> Tested-by: Corentin Labbe <clabbe.montjoie@gmail.com> Acked-by: Corentin Labbe <clabbe.montjoie@gmail.com> Reviewed-by: Corentin Labbe <clabbe.montjoie@gmail.com> Acked-by: is not as formal as Signed-off-by:. It is a record that the acker Reviewed-by: is similar. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-05 11:03:52 -04:00
yuval.shaia@oracle.com	697dae1ee1	net/3com: Make el3_netdev_get_ecmd return void Make return value void since function never returns meaningfull value. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-05 11:00:43 -04:00
yuval.shaia@oracle.com	82c01a84d5	net/{mii, smsc}: Make mii_ethtool_get_link_ksettings and smc_netdev_get_ecmd return void Make return value void since functions never returns meaningfull value. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-05 11:00:42 -04:00
yuval.shaia@oracle.com	c7c6b8715a	net/dec: Make __de_get_link_ksettings return void Make return value void since function never return meaningfull value Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-05 11:00:41 -04:00
Jiri Pirko	8ec1507dc9	net: sched: select cls when cls_act is enabled It really makes no sense to have cls_act enabled without cls. In that case, the cls_act code is dead. So select it. This also fixes an issue recently reported by kbuild robot: [linux-next:master 1326/4151] net/sched/act_api.c:37:18: error: implicit declaration of function 'tcf_chain_get' Reported-by: kbuild test robot <fengguang.wu@intel.com> Fixes: `db50514f9a` ("net: sched: add termination action to allow goto chain") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-05 10:56:36 -04:00
Rosen, Rami	4e2ec43654	genetlink: remove ops_list from genetlink header. commit `d91824c08f` ("genetlink: register family ops as array") removed the ops_list member from both genl_family and genl_ops; while the documentation of genl_family was updated accordingly by this patch, ops_list remained in the documentation of the genl_ops object. This patch fixes it by removing ops_list from genl_ops documentation. Signed-off-by: Rami Rosen <rami.rosen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-05 10:54:55 -04:00
David S. Miller	aae1a2ce14	Merge branch 'mlxsw-Minor-cleanup' Jiri Pirko says: ==================== mlxsw: Minor cleanup Fix small issues I noticed during the refactoring. First patch adds file name comments in the header file to make it clear what goes where. Second patch fixes a typo and third patch simply aligns RIF index allocation with similar allocations in the driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:49:49 -04:00
Ido Schimmel	de5ed99e97	mlxsw: spectrum_router: Align RIF index allocation with existing code The way we usually allocate an index is by letting the allocation function return an error instead of an invalid index. Do the same for RIF index. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:49:49 -04:00
Ido Schimmel	da0abcf93f	mlxsw: Fix typo inside enumeration Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:49:48 -04:00
Ido Schimmel	cb4cc0e0b1	mlxsw: spectrum: Tidy up header file Make it clear where functions are defined and move misplaced declaration to their correct place. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:49:48 -04:00
Yotam Gigi	a4e1ce24f7	mlxsw: spectrum: Rename the firmware file Change the firmware file name to be in "mellanox" directory. This commit is a followup to the linux-firmware commit a4c72696f5f4 ("Mellanox: Add firmware for mlxsw_spectrum") Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:48:03 -04:00
David S. Miller	fc85c910ae	Merge branch 'qed-vf-xdp' Yuval Mintz says: qed*: Support VF XDP attachment ==================== Each driver queue [Rx, Tx, XDP-forwarding] requires an allocated HW/FW connection + configured queue-zone. VF handling by the PF has several limitations that prevented adding the capability to perform XDP at driver-level: - The VF assumes there's 1-to-1 correspondance between the VF queue and the used connection, meaning q<x> is always going to use cid<x>, whereas for its own queues the PF is acquiring a new cid per each new queue. - There's a 1-to-1 correspondate between the VF-queues and the HW queue zones. While this is necessary for Rx-queues [as the queue-zone contains the producer], transmission queues can share the underlaying queue-zone [only shared configuration is coalescing]. But all VF<->PF communication mechanisms assume there's a single identifier that identify a queue [as queue-zone == queue], while sharing queue-zones requires passing additional information. - VFs currently don't try mapping a doorbell bar - there's a small doorbell window in the regview allowing VFs to doorbell up to 16 connections; but this window isn's wide enough for the added XDP forwarding queues. This series is going to add the necessary infrastrucutre to finally let our VFs support XDP assuming both the PF and VF drivers are sufficiently new [Legacy support would be retained both for older VFs and older PFs, but both will be needed for this new support to work]. Basically, the various database driver maintains for its queue-cids would be revised, and queue-cids would be identified using the (queue-zone, unique index) pair. The TLV mechanism would then be extended to allow VFs to communicate that unique-index as well as the already provided queue-zone. Finally, the VFs would try to map their doorbell bar and inform their PF that they're using it. Almost all the changes are in qed, with exception of #3 [which does some cleanup in qede as well] and #11 that actually enables the feature. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:08:32 -04:00
Mintz, Yuval	e7b80dece8	qede: VF XDP support This introduces 2 changes needed for XDP to be supported for VFs: a. On VF-side, publish the NDO based on qed outputs b. On PF-side, request qed to allocate sufficient cids per-VF to allow the child vfs to support it Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:08:31 -04:00
Mintz, Yuval	cbb8a12c08	qed: VF XDP support The final addition on the qed front - - VFs would now require their PFs to provide multiple CIDs - Based on the availability of connections from PF, determine whether XDP is feasible and share it with qede via dev_info. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:08:31 -04:00
Mintz, Yuval	1a850bfc9e	qed: VFs to try utilizing the doorbell bar VFs are currently not mapping their doorbell bar, instead relying on the small doorbell window they have in their limited regview bar. In order to increase the number of possible Tx connections [queues] employeed by VF past 16, we need to start using the doorbell bar if one such is exposed - VF would communicate this fact to PF which would return the size-bar internally configured into chip, according to which the VF would decide whether to actually utilize the doorbell bar. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:08:31 -04:00
Mintz, Yuval	08bc8f15e6	qed: Multiple qzone queues for VFs This adds the infrastructure for supporting VFs that want to open multiple transmission queues on the same queue-zone. At this point, there are no VFs that actually request this functionality, but later patches would remedy that. a. VF and PF would communicate the capability during ACQUIRE; Legacy VFs would continue on behaving as they do today b. PF would communicate number of supported CIDs to the VF and would enforce said limitation c. Whenever VF passes a request for a given queue configuration it would also pass an associated index within said queue-zone Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:08:31 -04:00
Mintz, Yuval	007bc37179	qed: IOV db support multiple queues per qzone Allow the infrastructure a PF maintains for each one of its VFs to support multiple queue-cids on a single queue-zone. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:08:31 -04:00
Mintz, Yuval	3b19f47820	qed: Make VF legacy a bitfield Until now we used to have a single VF legacy compatibility mode, one that affected the place of the Rx producers of those VFs [mostly]. As PF would soon support allocating CIDs for VFs instead of having a static CID<->queue configuration for them, we'll need to have an additional legacy mode since existing VFs would need to continue on using the older mode of operation. Change the infrastrucutre so that the legacy would be able to indicate which of the legacy behaviors is needed for a given VF. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:08:31 -04:00
Mintz, Yuval	bbe3f233ec	qed: Assign a unique per-queue index to queue-cid When a queue-cid is allocated, assign an index inside that's CID's queue-zone. For PFs and VFS, this number is going to be unique and derive from a per-queue-zone bitmap, while for PF's VFs queues the number is currently going to constant; Later, we'd add the capability of a VF to communicate such an index to its PF. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:08:31 -04:00
Mintz, Yuval	3946497aff	qed: Pass vf_params when creating a queue-cid We're going to need additional information for queue-cids that a PF creates for its VFs, so start by refactoring existing logic used for initializing said struct into receiving a structure encapsulating the VF-specific information that needs to be provided. This also introduces QED_QUEUE_CID_SELF - each queue-cid would hold an indication to whether it belongs to the hw-function holding it [whether that's a PF or a VF], or else what's the VF id it belongs to. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:08:30 -04:00
Mintz, Yuval	f604b17d7f	qed*: L2 interface to use the SB structures directly Part of an effort of a cleaner seperation between qed and the protocol drivers, the L2 interface is to use the SB structure for initialization purposes opaquely. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:08:30 -04:00
Mintz, Yuval	0db711bb26	qed: Create L2 queue database First step in allowing a single PF/VF to open multiple queues on the same queue zone is to add per-hwfn database of queue-cids as a two-dimensional array where entry would be according to [queue zone][internal index]. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:08:30 -04:00
Mintz, Yuval	6bea61da17	qed: Add bitmaps for VF CIDs Each PF has a bitmap for its own ranges of CIDs, to allow easy grabbing of an available CID when such is needed. But VFs are not using the same mechanism, instead relying on hard-coded CIDs [ queue-index == cid ]. As an infrastructure step toward increasing number of CIDs of VFs, the PF is going to maintain bitmaps for the VF CIDs as well - the bitmaps would be per-VF and the ranges would be the same [in HW all VFs of a given PF have the same mapping of CIDs, and the HW is capable of distinguishing between those according to the VF index] Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:08:30 -04:00
David S. Miller	a619cc8bed	Merge branch 'skb-sgvec-overflow' Jason A. Donenfeld says: ==================== net: Avoiding stack overflow in skb_to_sgvec The recent bug with macsec and historical one with virtio have indicated that letting skb_to_sgvec trounce all over an sglist without checking the length is probably a bad idea. And it's not necessary either: an sglist already explicitly marks its last item, and the initialization functions are diligent in doing so. Thus there's a clear way of avoiding future overflows. So, this patchset, from a high level, makes skb_to_sgvec return a potential error code, and then adjusts all callers to check for the error code. There are two situations in which skb_to_sgvec might return such an error: 1) When the passed in sglist is too small; and 2) When the passed in skbuff is too deeply nested. So, the first patch in this series handles the issues with skb_to_sgvec directly, and the remaining ones then handle the call sites. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:01:48 -04:00
Jason A. Donenfeld	e2fcad58fd	virtio_net: check return value of skb_to_sgvec always Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:01:48 -04:00
Jason A. Donenfeld	cda7ea6903	macsec: check return value of skb_to_sgvec always Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:01:47 -04:00
Jason A. Donenfeld	89a5ea9966	rxrpc: check return value of skb_to_sgvec always Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:01:47 -04:00
Jason A. Donenfeld	3f29770723	ipsec: check return value of skb_to_sgvec always Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:01:47 -04:00
Jason A. Donenfeld	48a1df6533	skbuff: return -EMSGSIZE in skb_to_sgvec to prevent overflow This is a defense-in-depth measure in response to bugs like `4d6fa57b4d` ("macsec: avoid heap overflow in skb_to_sgvec"). There's not only a potential overflow of sglist items, but also a stack overflow potential, so we fix this by limiting the amount of recursion this function is allowed to do. Not actually providing a bounded base case is a future disaster that we can easily avoid here. As a small matter of house keeping, we take this opportunity to move the documentation comment over the actual function the documentation is for. While this could be implemented by using an explicit stack of skbuffs, when implementing this, the function complexity increased considerably, and I don't think such complexity and bloat is actually worth it. So, instead I built this and tested it on x86, x86_64, ARM, ARM64, and MIPS, and measured the stack usage there. I also reverted the recent MIPS changes that give it a separate IRQ stack, so that I could experience some worst-case situations. I found that limiting it to 24 layers deep yielded a good stack usage with room for safety, as well as being much deeper than any driver actually ever creates. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Cc: David Howells <dhowells@redhat.com> Cc: Sabrina Dubroca <sd@queasysnail.net> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:01:47 -04:00
David S. Miller	a11227dcc3	Merge branch 'bpf-Add-BPF-support-to-all-perf_event' Merge branch 'bpf-Add-BPF-support-to-all-perf_event' Alexei Starovoitov says: ==================== bpf: Add BPF support to all perf_event v3->v4: one more tweak to reject unsupported events at map update time as Peter suggested v2->v3: more refactoring to address Peter's feedback. Now all perf_events are attachable and readable v1->v2: address Peter's feedback. Refactor patch 1 to allow attaching bpf programs to all event types and reading counters from all of them as well patch 2 - more tests patch 3 - address Dave's feedback and document bpf_perf_event_read() and bpf_perf_event_output() properly ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 21:58:24 -04:00
Teng Qin	b7d3ed5be9	bpf: update perf event helper functions documentation This commit updates documentation of the bpf_perf_event_output and bpf_perf_event_read helpers to match their implementation. Signed-off-by: Teng Qin <qinteng@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 21:58:16 -04:00
Teng Qin	41e9a8046c	samples/bpf: add tests for more perf event types $ trace_event tests attaching BPF program to HW_CPU_CYCLES, SW_CPU_CLOCK, HW_CACHE_L1D and other events. It runs 'dd' in the background while bpf program collects user and kernel stack trace on counter overflow. User space expects to see sys_read and sys_write in the kernel stack. $ tracex6 tests reading of various perf counters from BPF program. Both tests were refactored to increase coverage and be more accurate. Signed-off-by: Teng Qin <qinteng@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 21:58:15 -04:00
Alexei Starovoitov	f91840a32d	perf, bpf: Add BPF support to all perf_event types Allow BPF_PROG_TYPE_PERF_EVENT program types to attach to all perf_event types, including HW_CACHE, RAW, and dynamic pmu events. Only tracepoint/kprobe events are treated differently which require BPF_PROG_TYPE_TRACEPOINT/BPF_PROG_TYPE_KPROBE program types accordingly. Also add support for reading all event counters using bpf_perf_event_read() helper. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 21:58:01 -04:00
Sowmini Varadhan	5071034e4a	neigh: Really delete an arp/neigh entry on "ip neigh delete" or "arp -d" The command # arp -s 62.2.0.1 a🅱️c:d:e:f dev eth2 adds an entry like the following (listed by "arp -an") ? (62.2.0.1) at 0a:0b:0c:0d:0e:0f [ether] PERM on eth2 but the symmetric deletion command # arp -i eth2 -d 62.2.0.1 does not remove the PERM entry from the table, and instead leaves behind ? (62.2.0.1) at <incomplete> on eth2 The reason is that there is a refcnt of 1 for the arp_tbl itself (neigh_alloc starts off the entry with a refcnt of 1), thus the neigh_release() call from arp_invalidate() will (at best) just decrement the ref to 1, but will never actually free it from the table. To fix this, we need to do something like neigh_forced_gc: if the refcnt is 1 (i.e., on the table's ref), remove the entry from the table and free it. This patch refactors and shares common code between neigh_forced_gc and the newly added neigh_remove_one. A similar issue exists for IPv6 Neighbor Cache entries, and is fixed in a similar manner by this patch. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Reviewed-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 21:37:18 -04:00
Andrew Lunn	030a89028d	net: phy: smsc: Implement PHY statistics Most of the PHYs supported by the SMSC driver have a counter of symbol errors. This is 16 bit wide and wraps around when it reaches its maximum value. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-By: Woojung Huh <Woojung.Huh@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 21:36:20 -04:00
David S. Miller	386e2e4bd7	Merge branch 'dsa-Fixes-for-mv88e6161' Andrew Lunn says: ==================== dsa: Fixes for mv88e6161 Testing a board with an mv88e6161 turned up two issues. The PHYs were not found, because the wrong method to access them was used. The statistics did not work, because the wrong snapshot method was used ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 21:34:17 -04:00
Andrew Lunn	0ac64c3949	net: dsa: mv88e6xxx: mv88e6161 uses mv88e6320 stats snapshot The mv88e6161 was using the wrong method to perform statistics snapshot. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 21:34:17 -04:00
Andrew Lunn	ec8378bb4d	net: dsa: mv88e6xxx: 6161 uses global 2 for PHY access Access to the internal PHYs of the 6161 and 6123 go through global 2 SMI registers. Fix the ops structure. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 21:34:17 -04:00
David S. Miller	4e86d3cb7b	Merge branch 'dsa-mv88e6xxx-move-registers-macros' Vivien Didelot says: ==================== net: dsa: mv88e6xxx: move registers macros This patchset brings no functional changes. It is the first step of a cleanup renaming the chip header file and moving the Register definitions _as is_ in their proper header files. A following patchset will prefix them with the appropriate model (MV88E6XXX_ or e.g. MV88E6390_) to respect an implicit namespace and easily identify model subtleties in registers layout, as correctly done in the newly added serdes.h header. ==================== Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 20:08:34 -04:00
Vivien Didelot	d23a83f2ae	net: dsa: mv88e6xxx: move the Global 2 macros Move the GLOBAL2_* macros where they belong, in the related global2.h header. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 20:07:41 -04:00
Vivien Didelot	e097097b27	net: dsa: mv88e6xxx: move the Global 1 macros Move the GLOBAL_* macros where they belong, in the related global1.h header. Include it in global2.c which uses GLOBAL_STATUS_IRQ_DEVICE. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 20:07:41 -04:00

1 2 3 4 5 ...

678692 Commits