linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-26 04:42:12 +00:00

Author	SHA1	Message	Date
Linus Torvalds	8e97be2acd	Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS updates from Borislav Petkov: "The latest meager RAS updates: - Enable processing of action-optional MCEs which have the Overflow bit set (Tony Luck) - -Wmissing-prototypes warning fix and a build fix (Valdis Klētnieks)" * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: RAS: Build debugfs.o only when enabled in Kconfig RAS: Fix prototype warnings x86/mce: Don't check for the overflow bit on action optional machine checks	2019-09-16 13:42:25 -07:00
Linus Torvalds	ff881842e1	* EDAC tree has three maintainers and one new designated reviewer now, so that the work can scale better. * New driver for Mellanox' BlueField SoC DDR controller (Shravan Kumar Ramani) * AMD Rome support in amd64_edac (Yazen Ghannam and Isaac Vaughn) * Misc fixes, cleanups and code improvements -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAl17RrMACgkQEsHwGGHe VUpt1g//WHs2FkGvwzHFmNHaUpATWLVKJ3azrNs+q1Av53xGngg/V8qTewwcQ1tx OdgCqHj6PIMAoJbI2IIMIrjdMfM7EFXKgtk1kSut2uBowODkt29Ccj1ncdQ6uZOa OSBIt7fXcsQQoX+0/SbutR3GB0gH6/rJQ+LFZix1g1o0pUfqgPGf1ypZ+gaSc8zr bq+Ke8mXwaw13qy06eXAoOb32sAOfHqHjQUs53ecA+shmhua6AhRwuH+LcQyORoI RdIKLIKb6yw/Vri3PdrokVKZo4OPa8hbdSlC9JPhZ3SfAMlmrDfr7OiVqneKfBJz wRBfVVNNozbIHu5lv7ZxVaDAUh6VnWhhW036HxiMSG6NdjalVb9HB8xpKNbBlDxy L7q0Y5CLEcRQpkmO/S0SK8KA4Vr8w8Vw/rYVRO9ypM6LEz+uoyMIZ5+mJjiCnjYo sClPz5TqqgxCieId0tyAs+IZ1iXItRpEogBTXmvgIrrnWXkhQmo6Jhvd7VGDM6bT 3RwrkWXApzxdLQSv/JqRC7N8mdII0WwDI6AjBUoC7cOeBaMQcHrcD3sLYgjWL1Up IQgSfHdR13S2p4Og1Phni/gSOnruCzCJQI1aCC8jjj92f1z8a/T0rjyMkE/qpZ+X tPOqNGkXfw86LzBDFFIOtGI6v8fdVX+Di/pTQid5hm3YdPMe1D8= =lEnD -----END PGP SIGNATURE----- Merge tag 'edac_for_5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC updates from Borislav Petkov: "The new thing this time around is that we have three maintainers now and a new, old repo. New because it is new for the EDAC tree which is hosted there from now on and old because it is Tony's and mine's old RAS repo which we still use occasionally when the stuff isn't in tip. Summary: - EDAC tree has three maintainers and one new designated reviewer now, so that the work can scale better. - New driver for Mellanox' BlueField SoC DDR controller (Shravan Kumar Ramani) - AMD Rome support in amd64_edac (Yazen Ghannam and Isaac Vaughn) - Misc fixes, cleanups and code improvements" * tag 'edac_for_5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/amd64: Add PCI device IDs for family 17h, model 70h MAINTAINERS: Add Robert as a EDAC reviewer EDAC/mc_sysfs: Make debug messages consistent EDAC/mc_sysfs: Remove pointless gotos EDAC: Prefer 'unsigned int' to bare use of 'unsigned' EDAC/amd64: Support asymmetric dual-rank DIMMs EDAC/amd64: Cache secondary Chip Select registers EDAC/amd64: Decode syndrome before translating address EDAC/amd64: Find Chip Select memory size using Address Mask EDAC/amd64: Initialize DIMM info for systems with more than two channels EDAC/amd64: Recognize DRAM device type ECC capability EDAC/amd64: Support more than two controllers for chip selects handling EDAC/mc: Cleanup _edac_mc_free() code EDAC, pnd2: Fix ioremap() size in dnv_rd_reg() EDAC, mellanox: Add ECC support for BlueField DDR4 EDAC/altera: Use the proper type for the IRQ status bits EDAC/mc: Fix grain_bits calculation edac: altera: Move Stratix10 SDRAM ECC to peripheral MAINTAINERS: update EDAC entry to reflect current tree and maintainers	2019-09-16 13:38:45 -07:00
Linus Torvalds	a7bd4bcf13	tpmdd updates for Linux v5.4 -----BEGIN PGP SIGNATURE----- iJYEABYIAD4WIQRE6pSOnaBC00OEHEIaerohdGur0gUCXW0lVyAcamFya2tvLnNh a2tpbmVuQGxpbnV4LmludGVsLmNvbQAKCRAaerohdGur0sjdAQD3lpIS7YeT37Bu QVdUkRObzkN3gWvv2oZkGXPg72843AD+KH0GL+M9SfBrYO/1StBJascarOHIIqUt /1uqUgL7fAI= =IV4V -----END PGP SIGNATURE----- Merge tag 'tpmdd-next-20190902' of git://git.infradead.org/users/jjs/linux-tpmdd Pull tpm updates from Jarkko Sakkinen: "A new driver for fTPM living inside ARM TEE was added this round. In addition to that, there are three bug fixes and one clean up" * tag 'tpmdd-next-20190902' of git://git.infradead.org/users/jjs/linux-tpmdd: tpm/tpm_ftpm_tee: Document fTPM TEE driver tpm/tpm_ftpm_tee: A driver for firmware TPM running inside TEE tpm: Remove a deprecated comments about implicit sysfs locking tpm_tis_core: Set TPM_CHIP_FLAG_IRQ before probing for interrupts tpm_tis_core: Turn on the TPM before probing IRQ's MAINTAINERS: fix style in KEYS-TRUSTED entry	2019-09-16 13:34:04 -07:00
David S. Miller	990925fad5	Merge branch 'mlxsw-spectrum_buffers-Add-the-ability-to-query-the-CPU-ports-shared-buffer' Ido Schimmel says: ==================== mlxsw: spectrum_buffers: Add the ability to query the CPU port's shared buffer Shalom says: While debugging packet loss towards the CPU, it is useful to be able to query the CPU port's shared buffer quotas and occupancy. Patch #1 prevents changing the CPU port's threshold and binding. Patch #2 registers the CPU port with devlink. Patch #3 adds the ability to query the CPU port's shared buffer quotas and occupancy. v3: Patch #2: * Remove unnecessary wrapping v2: Patch #1: * s/0/MLXSW_PORT_CPU_PORT/ * Assign "mlxsw_sp->ports[MLXSW_PORT_CPU_PORT]" at the end of mlxsw_sp_cpu_port_create() to avoid NULL assignment on error path * Add common functions for mlxsw_core_port_init/fini() Patch #2: * Move "changing CPU port's threshold and binding" check to a separate patch ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 22:07:59 +02:00
Shalom Toledo	a759ab6dac	mlxsw: spectrum_buffers: Add the ability to query the CPU port's shared buffer While debugging packet loss towards the CPU, it is useful to be able to query the CPU port's shared buffer quotas and occupancy. Since the CPU port has no ingress buffers, all the shared buffers ingress information will be cleared. Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 22:07:59 +02:00
Shalom Toledo	28b1987ef5	mlxsw: spectrum: Register CPU port with devlink Register CPU port with devlink. Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 22:07:59 +02:00
Shalom Toledo	9d0aa053ea	mlxsw: spectrum_buffers: Prevent changing CPU port's configuration Next patch is going to register the CPU port with devlink, but only so that the CPU port's shared buffer configuration and occupancy could be queried. Prevent changing CPU port's shared buffer threshold and binding configuration. Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 22:07:59 +02:00
David S. Miller	b63e1a02d7	Merge branch 'net-ena-implement-adaptive-interrupt-moderation-using-dim' Arthur Kiyanovski says: ==================== net: ena: implement adaptive interrupt moderation using dim In this patchset we replace our adaptive interrupt moderation implementation with the dim library implementation. The dim library showed great improvement in throughput, latency and CPU usage in different scenarios on ARM CPUs. This patchset also includes a few bug fixes to the parts of the old implementation of adaptive interrupt moderation that were left. Changes from V1 patchset: Removed stray empty lines from patches 01/11, 09/11. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 22:06:03 +02:00
Arthur Kiyanovski	79226cea4a	net: ena: fix incorrect update of intr_delay_resolution ena_dev->intr_moder_rx/tx_interval save the intervals received from the user after dividing them by ena_dev->intr_delay_resolution. Therefore when intr_delay_resolution changes, the code needs to first mutiply intr_moder_rx/tx_interval by the previous intr_delay_resolution to get the value originally given by the user, and only then divide it by the new intr_delay_resolution. Current code does not first multiply intr_moder_rx/tx_interval by the old intr_delay_resolution. This commit fixes it. Also initialize ena_dev->intr_delay_resolution to be 1. Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 22:06:03 +02:00
Arthur Kiyanovski	0eda847953	net: ena: fix retrieval of nonadaptive interrupt moderation intervals Nonadaptive interrupt moderation intervals are assigned the value set by the user in ethtool -C divided by ena_dev->intr_delay_resolution. Therefore when the user tries to get the nonadaptive interrupt moderation intervals with ethtool -c the code needs to multiply the saved value by ena_dev->intr_delay_resolution. The current code erroneously divides instead of multiplying in ethtool -c. This patch fixes this. Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 22:06:02 +02:00
Arthur Kiyanovski	7b8a28787e	net: ena: fix update of interrupt moderation register Current implementation always updates the interrupt register with the smoothed_interval of the rx_ring. However this should be done only in case of adaptive interrupt moderation. If non-adaptive interrupt moderation is used, the non-adaptive interrupt moderation interval should be used. This commit fixes that. Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 22:06:02 +02:00
Arthur Kiyanovski	3ced8cbdf7	net: ena: remove all old adaptive rx interrupt moderation code from ena_com Remove previous implementation of adaptive rx interrupt moderation from ena_com files. Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 22:06:02 +02:00
Arthur Kiyanovski	64d1fb9dfc	net: ena: remove ena_restore_ethtool_params() and relevant fields Deleted unused 4 fields from struct ena_adapter and their only user ena_restore_ethtool_params(). Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 22:06:02 +02:00
Arthur Kiyanovski	242d81fd3d	net: ena: remove old adaptive interrupt moderation code from ena_netdev 1. Out of the fields {per_napi_bytes, per_napi_packets} in struct ena_ring, only rx_ring->per_napi_packets are used to determine if napi did work for dim. This commit removes all other uses of these fields. 2. Remove ena_ring->moder_tbl_idx, which is not used by dim. 3. Remove all calls to ena_com_destroy_interrupt_moderation(), since all it did was to destroy the interrupt moderation table, which is removed as part of removing old interrupt moderation code. Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 22:06:02 +02:00
Arthur Kiyanovski	57e3a5f24b	net: ena: remove code duplication in ena_com_update_nonadaptive_moderation_interval _*() Remove code duplication in: ena_com_update_nonadaptive_moderation_interval_tx() ena_com_update_nonadaptive_moderation_interval_rx() functions. Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 22:06:02 +02:00
Arthur Kiyanovski	bd21b0cc3a	net: ena: enable the interrupt_moderation in driver_supported_features Add driver_supported_features to host_host info which is a new API used to communicate to the device which features are supported by the driver. Add the interrupt_moderation bit to host_info->driver_supported_features and enable it to signal the device that this driver supports interrupt moderation properly. Reserved bits are for features implemented in the future Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 22:06:02 +02:00
Arthur Kiyanovski	b3db86dc4b	net: ena: reimplement set/get_coalesce() 1. Remove old adaptive interrupt moderation code from set/get_coalesce() 2. Add ena_update_rx_rings_intr_moderation() function for updating nonadaptive interrupt moderation intervals similarly to ena_update_tx_rings_intr_moderation(). 3. Remove checks of multiple unsupported received interrupt coalescing parameters. This makes code cleaner and cancels the need to update it every time a new coalescing parameter is invented. Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 22:06:02 +02:00
Arthur Kiyanovski	282faf61a0	net: ena: switch to dim algorithm for rx adaptive interrupt moderation Use the dim library for the rx adaptive interrupt moderation implementation Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 22:06:02 +02:00
Arthur Kiyanovski	15619e722b	net: ena: add intr_moder_rx_interval to struct ena_com_dev and use it Add intr_moder_rx_interval to struct ena_com_dev and use it as the location where the interrupt moderation rx interval is saved, instead of the interrupt moderation table. This is done as a first step before removing the old interrupt moderation code. Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 22:06:02 +02:00
David S. Miller	1b8da10370	Merge branch 'ethtool-implement-Energy-Detect-Powerdown-support-via-phy-tunable' Alexandru Ardelean says: ==================== ethtool: implement Energy Detect Powerdown support via phy-tunable This changeset proposes a new control for PHY tunable to control Energy Detect Power Down. The `phy_tunable_id` has been named `ETHTOOL_PHY_EDPD` since it looks like this feature is common across other PHYs (like EEE), and defining `ETHTOOL_PHY_ENERGY_DETECT_POWER_DOWN` seems too long. The way EDPD works, is that the RX block is put to a lower power mode, except for link-pulse detection circuits. The TX block is also put to low power mode, but the PHY wakes-up periodically to send link pulses, to avoid lock-ups in case the other side is also in EDPD mode. Currently, there are 2 PHY drivers that look like they could use this new PHY tunable feature: the `adin` && `micrel` PHYs. This series updates only the `adin` PHY driver to support this new feature, as this chip has been tested. A change for `micrel` can be proposed after a discussion of the PHY-tunable API is resolved. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 22:02:53 +02:00
Alexandru Ardelean	65d7be094f	net: phy: adin: implement Energy Detect Powerdown mode via phy-tunable This driver becomes the first user of the kernel's `ETHTOOL_PHY_EDPD` phy-tunable feature. EDPD is also enabled by default on PHY config_init, but can be disabled via the phy-tunable control. When enabling EDPD, it's also a good idea (for the ADIN PHYs) to enable TX periodic pulses, so that in case the other PHY is also on EDPD mode, there is no lock-up situation where both sides are waiting for the other to transmit. Via the phy-tunable control, TX pulses can be disabled if specifying 0 `tx-interval` via ethtool. The ADIN PHY supports only fixed 1 second intervals; they cannot be configured. That is why the acceptable values are 1, ETHTOOL_PHY_EDPD_DFLT_TX_MSECS and ETHTOOL_PHY_EDPD_NO_TX (which disables TX pulses). Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 22:02:45 +02:00
Alexandru Ardelean	9f2f13f4ff	ethtool: implement Energy Detect Powerdown support via phy-tunable The `phy_tunable_id` has been named `ETHTOOL_PHY_EDPD` since it looks like this feature is common across other PHYs (like EEE), and defining `ETHTOOL_PHY_ENERGY_DETECT_POWER_DOWN` seems too long. The way EDPD works, is that the RX block is put to a lower power mode, except for link-pulse detection circuits. The TX block is also put to low power mode, but the PHY wakes-up periodically to send link pulses, to avoid lock-ups in case the other side is also in EDPD mode. Currently, there are 2 PHY drivers that look like they could use this new PHY tunable feature: the `adin` && `micrel` PHYs. The ADIN's datasheet mentions that TX pulses are at intervals of 1 second default each, and they can be disabled. For the Micrel KSZ9031 PHY, the datasheet does not mention whether they can be disabled, but mentions that they can modified. The way this change is structured, is similar to the PHY tunable downshift control: * a `ETHTOOL_PHY_EDPD_DFLT_TX_MSECS` value is exposed to cover a default TX interval; some PHYs could specify a certain value that makes sense * `ETHTOOL_PHY_EDPD_NO_TX` would disable TX when EDPD is enabled * `ETHTOOL_PHY_EDPD_DISABLE` will disable EDPD As noted by the `ETHTOOL_PHY_EDPD_DFLT_TX_MSECS` the interval unit is 1 millisecond, which should cover a reasonable range of intervals: - from 1 millisecond, which does not sound like much of a power-saver - to ~65 seconds which is quite a lot to wait for a link to come up when plugging a cable Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 22:02:45 +02:00
Dongli Zhang	00b368502d	xen-netfront: do not assume sk_buff_head list is empty in error handling When skb_shinfo(skb) is not able to cache extra fragment (that is, skb_shinfo(skb)->nr_frags >= MAX_SKB_FRAGS), xennet_fill_frags() assumes the sk_buff_head list is already empty. As a result, cons is increased only by 1 and returns to error handling path in xennet_poll(). However, if the sk_buff_head list is not empty, queue->rx.rsp_cons may be set incorrectly. That is, queue->rx.rsp_cons would point to the rx ring buffer entries whose queue->rx_skbs[i] and queue->grant_rx_ref[i] are already cleared to NULL. This leads to NULL pointer access in the next iteration to process rx ring buffer entries. Below is how xennet_poll() does error handling. All remaining entries in tmpq are accounted to queue->rx.rsp_cons without assuming how many outstanding skbs are remained in the list. 985 static int xennet_poll(struct napi_struct *napi, int budget) ... ... 1032 if (unlikely(xennet_set_skb_gso(skb, gso))) { 1033 __skb_queue_head(&tmpq, skb); 1034 queue->rx.rsp_cons += skb_queue_len(&tmpq); 1035 goto err; 1036 } It is better to always have the error handling in the same way. Fixes: `ad4f15dc2c` ("xen/netfront: don't bug in case of too many frags") Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 21:46:22 +02:00
Markus Elfring	56a4e37ef1	s390/ctcm: Delete unnecessary checks before the macro call “dev_kfree_skb” The dev_kfree_skb() function performs also input parameter validation. Thus the test around the shown calls is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 21:45:32 +02:00
Sameeh Jubran	a53651ec93	net: ena: don't wake up tx queue when down There is a race condition that can occur when calling ena_down(). The ena_clean_tx_irq() - which is a part of the napi handler - function might wake up the tx queue when the queue is supposed to be down (during recovery or changing the size of the queues for example) This causes the ena_start_xmit() function to trigger and possibly try to access the destroyed queues. The race is illustrated below: Flow A: Flow B(napi handler) ena_down() netif_carrier_off() netif_tx_disable() ena_clean_tx_irq() netif_tx_wake_queue() ena_napi_disable_all() ena_destroy_all_io_queues() After these flows the tx queue is active and ena_start_xmit() accesses the destroyed queue which leads to a kernel panic. fixes: `1738cd3ed3` (net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)) Signed-off-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 21:40:18 +02:00
David S. Miller	f432c2e304	Merge branch 'drop_monitor-Better-sanitize-notified-packets' Ido Schimmel says: ==================== drop_monitor: Better sanitize notified packets When working in 'packet' mode, drop monitor generates a notification with a potentially truncated payload of the dropped packet. The payload is copied from the MAC header, but I forgot to check that the MAC header was set, so do it now. Patch #1 sets the offsets to the various protocol layers in netdevsim, so that it will continue to work after the MAC header check is added to drop monitor in patch #2. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 21:39:27 +02:00
Ido Schimmel	bef1746681	drop_monitor: Better sanitize notified packets When working in 'packet' mode, drop monitor generates a notification with a potentially truncated payload of the dropped packet. The payload is copied from the MAC header, but I forgot to check that the MAC header was set, so do it now. Fixes: `ca30707dee` ("drop_monitor: Add packet alert mode") Fixes: `5e58109b1e` ("drop_monitor: Add support for packet alert mode for hardware drops") Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 21:39:27 +02:00
Ido Schimmel	58a406def4	netdevsim: Set offsets to various protocol layers The driver periodically generates "trapped" UDP packets that it then passes on to devlink. Set the offsets to the various protocol layers. This is a prerequisite to the next patch, where drop monitor is taught to check that the offset to the MAC header was set. Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 21:39:27 +02:00
David S. Miller	db539cae12	Merge branch 'tc-taprio-offload-for-SJA1105-DSA' Vladimir Oltean says: ==================== tc-taprio offload for SJA1105 DSA This is the third attempt to submit the tc-taprio offload model for inclusion in the networking tree. The sja1105 switch driver will provide the first implementation of the offload. Only the bare minimum is added: - The offload model and a DSA pass-through - The hardware implementation - The interaction with the netdev queues in the tagger code - Documentation What has been removed from previous attempts is support for PTP-as-clocksource in sja1105, as well as configuring the traffic class for management traffic. These will be added as soon as the offload model is settled. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 21:32:58 +02:00
Vladimir Oltean	7c95afa42f	docs: net: dsa: sja1105: Add info about the Time-Aware Scheduler While not an exhaustive usage tutorial, this describes the details needed to build more complex scenarios. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 21:32:58 +02:00
Vladimir Oltean	317ab5b86c	net: dsa: sja1105: Configure the Time-Aware Scheduler via tc-taprio offload This qdisc offload is the closest thing to what the SJA1105 supports in hardware for time-based egress shaping. The switch core really is built around SAE AS6802/TTEthernet (a TTTech standard) but can be made to operate similarly to IEEE 802.1Qbv with some constraints: - The gate control list is a global list for all ports. There are 8 execution threads that iterate through this global list in parallel. I don't know why 8, there are only 4 front-panel ports. - Care must be taken by the user to make sure that two execution threads never get to execute a GCL entry simultaneously. I created a O(n^4) checker for this hardware limitation, prior to accepting a taprio offload configuration as valid. - The spec says that if a GCL entry's interval is shorter than the frame length, you shouldn't send it (and end up in head-of-line blocking). Well, this switch does anyway. - The switch has no concept of ADMIN and OPER configurations. Because it's so simple, the TAS settings are loaded through the static config tables interface, so there isn't even place for any discussion about 'graceful switchover between ADMIN and OPER'. You just reset the switch and upload a new OPER config. - The switch accepts multiple time sources for the gate events. Right now I am using the standalone clock source as opposed to PTP. So the base time parameter doesn't really do much. Support for the PTP clock source will be added in a future series. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 21:32:58 +02:00
Vladimir Oltean	5f06c63bd3	net: dsa: sja1105: Advertise the 8 TX queues This is a preparation patch for the tc-taprio offload (and potentially for other future offloads such as tc-mqprio). Instead of looking directly at skb->priority during xmit, let's get the netdev queue and the queue-to-traffic-class mapping, and put the resulting traffic class into the dsa_8021q PCP field. The switch is configured with a 1-to-1 PCP-to-ingress-queue-to-egress-queue mapping (see vlan_pmap in sja1105_main.c), so the effect is that we can inject into a front-panel's egress traffic class through VLAN tagging from Linux, completely transparently. Unfortunately the switch doesn't look at the VLAN PCP in the case of management traffic to/from the CPU (link-local frames at 01-80-C2-xx-xx-xx or 01-1B-19-xx-xx-xx) so we can't alter the transmission queue of this type of traffic on a frame-by-frame basis. It is only selected through the "hostprio" setting which ATM is harcoded in the driver to 7. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 21:32:57 +02:00
Vladimir Oltean	7f1e4ba814	net: dsa: sja1105: Add static config tables for scheduling In order to support tc-taprio offload, the TTEthernet egress scheduling core registers must be made visible through the static interface. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 21:32:57 +02:00
Vladimir Oltean	47d23af292	net: dsa: Pass ndo_setup_tc slave callback to drivers DSA currently handles shared block filters (for the classifier-action qdisc) in the core due to what I believe are simply pragmatic reasons - hiding the complexity from drivers and offerring a simple API for port mirroring. Extend the dsa_slave_setup_tc function by passing all other qdisc offloads to the driver layer, where the driver may choose what it implements and how. DSA is simply a pass-through in this case. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Acked-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 21:32:57 +02:00
Vinicius Costa Gomes	9c66d15646	taprio: Add support for hardware offloading This allows taprio to offload the schedule enforcement to capable network cards, resulting in more precise windows and less CPU usage. The gate mask acts on traffic classes (groups of queues of same priority), as specified in IEEE 802.1Q-2018, and following the existing taprio and mqprio semantics. It is up to the driver to perform conversion between tc and individual netdev queues if for some reason it needs to make that distinction. Full offload is requested from the network interface by specifying "flags 2" in the tc qdisc creation command, which in turn corresponds to the TCA_TAPRIO_ATTR_FLAG_FULL_OFFLOAD bit. The important detail here is the clockid which is implicitly /dev/ptpN for full offload, and hence not configurable. A reference counting API is added to support the use case where Ethernet drivers need to keep the taprio offload structure locally (i.e. they are a multi-port switch driver, and configuring a port depends on the settings of other ports as well). The refcount_t variable is kept in a private structure (__tc_taprio_qopt_offload) and not exposed to drivers. In the future, the private structure might also be expanded with a backpointer to taprio_sched *q, to implement the notification system described in the patch (of when admin became oper, or an error occurred, etc, so the offload can be monitored with 'tc qdisc show'). Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Voon Weifeng <weifeng.voon@intel.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 21:32:57 +02:00
Aurelien Aptel	352f2c9a57	cifs: cifsroot: add more err checking make cifs more verbose about buffer size errors and add some comments Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2019-09-16 11:43:39 -05:00
Steve French	c3498185b7	smb3: add missing worker function for SMB3 change notify SMB3 change notify is important to allow applications to wait on directory change events of different types (e.g. adding and deleting files from others systems). Add worker functions for this. Acked-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2019-09-16 11:43:39 -05:00
Paulo Alcantara (SUSE)	8eecd1c2e5	cifs: Add support for root file systems Introduce a new CONFIG_CIFS_ROOT option to handle root file systems over a SMB share. In order to mount the root file system during the init process, make cifs.ko perform non-blocking socket operations while mounting and accessing it. Cc: Steve French <smfrench@gmail.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Paulo Alcantara (SUSE) <paulo@paulo.ac> Signed-off-by: Steve French <stfrench@microsoft.com>	2019-09-16 11:43:38 -05:00
Aurelien Aptel	0892ba693f	cifs: modefromsid: make room for 4 ACE when mounting with modefromsid, we end up writing 4 ACE in a security descriptor that only has room for 3, thus triggering an out-of-bounds write. fix this by changing the min size of a security descriptor. Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2019-09-16 11:43:38 -05:00
Steve French	2255397c33	smb3: fix potential null dereference in decrypt offload commit a091c5f67c99 ("smb3: allow parallelizing decryption of reads") had a potential null dereference Reported-by: kbuild test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Suggested-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2019-09-16 11:43:38 -05:00
Steve French	96d9f7ed00	smb3: fix unmount hang in open_shroot An earlier patch "CIFS: fix deadlock in cached root handling" did not completely address the deadlock in open_shroot. This patch addresses the deadlock. In testing the recent patch: smb3: improve handling of share deleted (and share recreated) we were able to reproduce the open_shroot deadlock to one of the target servers in unmount in a delete share scenario. Fixes: `7e5a70ad88` ("CIFS: fix deadlock in cached root handling") This is version 2 of this patch. An earlier version of this patch "smb3: fix unmount hang in open_shroot" had a problem found by Dan. Reported-by: kbuild test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Suggested-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> CC: Aurelien Aptel <aaptel@suse.com> CC: Stable <stable@vger.kernel.org>	2019-09-16 11:43:38 -05:00
Steve French	3e7a02d478	smb3: allow disabling requesting leases In some cases to work around server bugs or performance problems it can be helpful to be able to disable requesting SMB2.1/SMB3 leases on a particular mount (not to all servers and all shares we are mounted to). Add new mount parm "nolease" which turns off requesting leases on directory or file opens. Currently the only way to disable leases is globally through a module load parameter. This is more granular. Suggested-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> CC: Stable <stable@vger.kernel.org>	2019-09-16 11:43:38 -05:00
Steve French	7dcc82c2df	smb3: improve handling of share deleted (and share recreated) When a share is deleted, returning EIO is confusing and no useful information is logged. Improve the handling of this case by at least logging a better error for this (and also mapping the error differently to EREMCHG). See e.g. the new messages that would be logged: [55243.639530] server share \\192.168.1.219\scratch deleted [55243.642568] CIFS VFS: \\192.168.1.219\scratch BAD_NETWORK_NAME: \\192.168.1.219\scratch In addition for the case where a share is deleted and then recreated with the same name, have now fixed that so it works. This is sometimes done for example, because the admin had to move a share to a different, bigger local drive when a share is running low on space. Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>	2019-09-16 11:43:38 -05:00
Steve French	1b63f1840e	smb3: display max smb3 requests in flight at any one time Displayed in /proc/fs/cifs/Stats once for each socket we are connected to. This allows us to find out what the maximum number of requests that had been in flight (at any one time). Note that /proc/fs/cifs/Stats can be reset if you want to look for maximum over a small period of time. Sample output (immediately after mount): Resources in use CIFS Session: 1 Share (unique mount targets): 2 SMB Request/Response Buffer: 1 Pool size: 5 SMB Small Req/Resp Buffer: 1 Pool size: 30 Operations (MIDs): 0 0 session 0 share reconnects Total vfs operations: 5 maximum at one time: 2 Max requests in flight: 2 1) \\localhost\scratch SMBs: 18 Bytes read: 0 Bytes written: 0 ... Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>	2019-09-16 11:43:38 -05:00
Steve French	10328c44cc	smb3: only offload decryption of read responses if multiple requests No point in offloading read decryption if no other requests on the wire Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>	2019-09-16 11:43:38 -05:00
Ronnie Sahlberg	496902dc17	cifs: add a helper to find an existing readable handle to a file and convert smb2_query_path_info() to use it. This will eliminate the need for a SMB2_Create when we already have an open handle that can be used. This will also prevent a oplock break in case the other handle holds a lease. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2019-09-16 11:43:38 -05:00
Steve French	563317ec30	smb3: enable offload of decryption of large reads via mount option Disable offload of the decryption of encrypted read responses by default (equivalent to setting this new mount option "esize=0"). Allow setting the minimum encrypted read response size that we will choose to offload to a worker thread - it is now configurable via on a new mount option "esize=" Depending on which encryption mechanism (GCM vs. CCM) and the number of reads that will be issued in parallel and the performance of the network and CPU on the client, it may make sense to enable this since it can provide substantial benefit when multiple large reads are in flight at the same time. Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>	2019-09-16 11:43:38 -05:00
Steve French	35cf94a397	smb3: allow parallelizing decryption of reads decrypting large reads on encrypted shares can be slow (e.g. adding multiple milliseconds per-read on non-GCM capable servers or when mounting with dialects prior to SMB3.1.1) - allow parallelizing of read decryption by launching worker threads. Testing to Samba on localhost showed 25% improvement. Testing to remote server showed very large improvement when doing more than one 'cp' command was called at one time. Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>	2019-09-16 11:43:38 -05:00
Ronnie Sahlberg	3175eb9b57	cifs: add a debug macro that prints \\server\share for errors Where we have a tcon available we can log \\server\share as part of the message. Only do this for the VFS log level. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2019-09-16 11:43:38 -05:00
Steve French	46f17d1768	smb3: fix signing verification of large reads Code cleanup in the 5.1 kernel changed the array passed into signing verification on large reads leading to warning messages being logged when copying files to local systems from remote. SMB signature verification returned error = -5 This changeset fixes verification of SMB3 signatures of large reads. Suggested-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>	2019-09-16 11:43:38 -05:00

1 2 3 4 5 ...

865446 Commits