linux

Author	SHA1	Message	Date
Yunjian Wang	58618ef855	net: nxp: Fix use correct return type for ndo_start_xmit() The method ndo_start_xmit() returns a value of type netdev_tx_t. Fix the ndo function to use the correct type. Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-05 11:17:56 -07:00
Yunjian Wang	ab99b7d2ae	net: altera: Fix use correct return type for ndo_start_xmit() The method ndo_start_xmit() returns a value of type netdev_tx_t. Fix the ndo function to use the correct type. Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-05 11:15:41 -07:00
Yunjian Wang	09f6c44aaa	net: allwinner: Fix use correct return type for ndo_start_xmit() The method ndo_start_xmit() returns a value of type netdev_tx_t. Fix the ndo function to use the correct type. And emac_start_xmit() can leak one skb if 'channel' == 3. Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-05 11:14:40 -07:00
David S. Miller	354d861417	Merge branch 'net-reduce-dynamic-lockdep-keys' Cong Wang says: ==================== net: reduce dynamic lockdep keys syzbot has been complaining about low MAX_LOCKDEP_KEYS for a long time, it is mostly because we register 4 dynamic keys per network device. This patchset reduces the number of dynamic lockdep keys from 4 to 1 per netdev, by reverting to the previous static keys, except for addr_list_lock which still has to be dynamic. The second patch removes a bonding-specific key by the way. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 12:05:56 -07:00
Cong Wang	e7511f560f	bonding: remove useless stats_lock_key After commit `b3e80d44f5` ("bonding: fix lockdep warning in bond_get_stats()") the dynamic key is no longer necessary, as we compute nest level at run-time. So, we can just remove it to save some lockdep key entries. Test commands: ip link add bond0 type bond ip link add bond1 type bond ip link set bond0 master bond1 ip link set bond0 nomaster ip link set bond1 master bond0 Reported-and-tested-by: syzbot+aaa6fa4949cc5d9b7b25@syzkaller.appspotmail.com Cc: Dmitry Vyukov <dvyukov@google.com> Acked-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 12:05:56 -07:00
Cong Wang	1a33e10e4a	net: partially revert dynamic lockdep key changes This patch reverts the following commits: commit `064ff66e2b` "bonding: add missing netdev_update_lockdep_key()" commit `53d374979e` "net: avoid updating qdisc_xmit_lock_key in netdev_update_lockdep_key()" commit `1f26c0d3d2` "net: fix kernel-doc warning in <linux/netdevice.h>" commit `ab92d68fc2` "net: core: add generic lockdep keys" but keeps the addr_list_lock_key because we still lock addr_list_lock nestedly on stack devices, unlike xmit_lock this is safe because we don't take addr_list_lock on any fast path. Reported-and-tested-by: syzbot+aaa6fa4949cc5d9b7b25@syzkaller.appspotmail.com Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 12:05:56 -07:00
David S. Miller	ea84c84290	Merge branch 'net-ethernet-ti-k3-introduce-common-platform-time-sync-driver-cpts' Grygorii Strashko says: ==================== net: ethernet: ti: k3: introduce common platform time sync driver - cpts This series introduced support for significantly upgraded TI A65x/J721E Common platform time sync (CPTS) modules which are part of AM65xx Time Synchronization Architecture [1]. The TI A65x/J721E now contain more than one CPTS instance: - MCU CPSW CPTS (IEEE 1588 compliant) - Main NAVSS CPTS (central) - PCIe CPTS(s) (PTM compliant) - J721E: Main CPSW9g CPTS (IEEE 1588 compliant) which can work as separately as interact to each other through Time Sync Router (TSR) and Compare Event Router (CER). In addition there are also ICSS-G IEP blocks which can perform similar timesync functions, but require FW support. More info also available in TRM [2][3]. Not all above modules are available to the Linux by as of now as some of them are reserved for RTOS/FW purposes. The scope of this submission is TI A65x/J721E CPSW CPTS and Main NAVSS CPTS, and TSR was used for testing purposes. [Diagram: Main Navss CPTS, Time Sync Router (TSR) and MCU CPSW (CPTS, Port) blocks, with signals HW1_TS ... HW8_TS, Genf0 ... Genf8, ESTf and SYNC0 ... SYNC3.] (A) shows possible routing path for MCU CPSW CPTS Genf0 signal as an example. Main features of the new TI A65x/J721E CPTS modules are: - 64-bit timestamp/counter mode support in ns by using add_val - implemented in HW PPM and nudge adjustment. - control of time sync events via interrupt or polling - selection of multiple external reference clock sources - hardware timestamp of ext. inputs events (HWx_TS_PUSH) - periodic generator function outputs (TS_GENFx) - (CPSW only) Ethernet Enhanced Scheduled Traffic Operations (CPTS_ESTFn), which drives TSN schedule - timestamping of all RX packets bypassing CPTS FIFO Patch 1 - DT bindings Patch 2 - the AM65x/J721E driver Patch 3 - enables packet timestamping support in TI AM65x/J721E MCU CPSW driver. Patches 4-7 - DT updates. === PTP Testing: phc2sys -s CLOCK_REALTIME -c eth0 -m -O 0 -u30 phc2sys[627.331]: eth0 rms 409912446712787392 max 1587584079521858304 freq -6665 +/- 35040 delay 832 +/- 27 phc2sys[657.335]: eth0 rms 33 max 66 freq -0 +/- 28 delay 820 +/- 30 phc2sys[687.339]: eth0 rms 37 max 70 freq -1 +/- 32 delay 830 +/- 29 phc2sys[717.343]: eth0 rms 33 max 71 freq -0 +/- 29 delay 828 +/- 23 phc2sys[747.346]: eth0 rms 35 max 75 freq -0 +/- 31 delay 829 +/- 26 phc2sys[777.350]: eth0 rms 37 max 68 freq -1 +/- 32 delay 825 +/- 25 phc2sys[807.354]: eth0 rms 28 max 57 freq -1 +/- 25 delay 824 +/- 21 phc2sys[837.358]: eth0 rms 43 max 81 freq -1 +/- 37 delay 836 +/- 23 phc2sys[867.361]: eth0 rms 33 max 74 freq +0 +/- 29 delay 828 +/- 24 phc2sys[897.365]: eth0 rms 35 max 77 freq -2 +/- 30 delay 824 +/- 25 phc2sys[927.369]: eth0 rms 28 max 50 freq +0 +/- 25 delay 825 +/- 25 ptp4l -P -2 -H -i eth0 -l 6 -m -q -p /dev/ptp1 -f ptp.cfg -s ptp4l[22095.754]: port 1: MASTER to UNCALIBRATED on RS_SLAVE ptp4l[22097.754]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED ptp4l[22159.757]: rms 317 max 1418 freq +79 +/- 186 delay 410 +/- 1 ptp4l[22223.760]: rms 9 max 24 freq +42 +/- 12 delay 409 +/- 1 ptp4l[22287.763]: rms 10 max 28 freq +41 +/- 11 delay 410 +/- 1 ptp4l[22351.767]: rms 10 max 26 freq +34 +/- 12 delay 410 +/- 1 ptp4l[22415.770]: rms 10 max 26 freq +49 +/- 14 delay 410 +/- 1 === Ext. HW_TS and Genf testing: For testing purposes Time Sync Router (TSR) can be modeled in DT as pin controller + timesync_router: timesync_router@A40000 { + compatible = "pinctrl-single"; + reg = <0x0 0xA40000 0x0 0x800>; + #address-cells = <1>; + #size-cells = <0>; + #pinctrl-cells = <1>; + pinctrl-single,register-width = <32>; + pinctrl-single,function-mask = <0x800007ff>; + }; then signals routing can be done in board file, for example: +#define TS_OFFSET(pa, val) (0x4+(pa)*4) (0x80000000 \| val) + +&timesync_router { + pinctrl-names = "default"; + pinctrl-0 = <&mcu_cpts>; + + /* Example of the timesync routing */ + mcu_cpts: mcu_cpts { + pinctrl-single,pins = < + /* [cpts genf1] in13 -> out25 [cpts hw4_push] */ + TS_OFFSET(25, 13) + /* [cpts genf1] in13 -> out0 [main cpts hw1_push] */ + TS_OFFSET(0, 13) + /* [main cpts genf0] in4 -> out1 [main cpts hw2_push] */ + TS_OFFSET(1, 4) + /* [main cpts genf0] in4 -> out24 [cpts hw3_push] */ + TS_OFFSET(24, 4) + >; + }; +}; will create link: cpsw cpts Genf1 -> main cpts hw1_push -> cpsw cpts hw4_push main cpts Genf0 -> main cpts hw2_push -> cpsw cpts hw3_push testptp -d /dev/ptp0 -i 0 -p 1000000000 periodic output request okay testptp -d /dev/ptp0 -i 1 -e 5 external time stamp request okay event index 1 at 22583.000000025 event index 1 at 22584.000000025 event index 1 at 22585.000000025 event index 1 at 22586.000000025 event index 1 at 22587.000000025 testptp -d /dev/ptp1 -i 2 -e 5 external time stamp request okay event index 2 at 1587606764.249304554 event index 2 at 1587606765.249304467 event index 2 at 1587606766.249304380 event index 2 at 1587606767.249304293 event index 2 at 1587606768.249304206 [1] https://www.ti.com/lit/pdf/spracp7 [2] https://www.ti.com/lit/pdf/sprz452 [3] https://www.ti.com/lit/pdf/spruil1 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 12:02:03 -07:00
Grygorii Strashko	461d6d058c	arm64: dts: ti: j721e-main: add main navss cpts node Add DT node for Main NAVSS CPTS module. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 12:02:03 -07:00
Grygorii Strashko	29390928fe	arm64: dts: ti: k3-j721e-mcu: add mcu cpsw cpts node Add DT node for The TI J721E MCU CPSW CPTS which is part of MCU CPSW NUSS. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 12:02:03 -07:00
Grygorii Strashko	b3f7e95f03	arm64: dts: ti: k3-am65-main: add main navss cpts node Add DT node for Main NAVSS CPTS module. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 12:02:03 -07:00
Grygorii Strashko	885a26bae0	arm64: dts: ti: k3-am65-mcu: add cpsw cpts node Add DT node for the TI AM65x SoC Common Platform Time Sync (CPTS). Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 12:02:03 -07:00
Grygorii Strashko	b1f66a5bee	net: ethernet: ti: am65-cpsw-nuss: enable packet timestamping support The MCU CPSW Common Platform Time Sync (CPTS) provides possibility to timestamp TX PTP packets and all RX packets. This enables corresponding support in TI AM65x/J721E MCU CPSW driver. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 12:02:03 -07:00
Grygorii Strashko	f6bd59526c	net: ethernet: ti: introduce am654 common platform time sync driver The CPTS module is used to facilitate host control of time sync operations. Main features of CPTS module are: - selection of multiple external clock sources - control of time sync events via interrupt or polling - 64-bit timestamp mode in ns with HW PPM and nudge adjustment. - hardware timestamp ext. inputs (HWx_TS_PUSH) - timestamp Generator function outputs (TS_GENFx) Depending on integration it enables compliance with the IEEE 1588-2008 standard for a precision clock synchronization protocol, Ethernet Enhanced Scheduled Traffic Operations (CPTS_ESTFn) and PCIe Subsystem Precision Time Measurement (PTM). Introduced driver provides Linux PTP hardware clock for each CPTS device and network packets timestamping where applicable. CPTS PTP hardware clock supports following operations: - Set time - Get time - Shift the clock by a given offset atomically - Adjust clock frequency - Time stamp external events - Periodic output signals Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 12:02:02 -07:00
Grygorii Strashko	6e87ac748e	dt-binding: ti: am65x: document common platform time sync cpts module Document device tree bindings for TI AM654/J721E SoC The Common Platform Time Sync (CPTS) module. The CPTS module is used to facilitate host control of time sync operations. Main features of CPTS module are: - selection of multiple external clock sources - 64-bit timestamp mode in ns with ppm and nudge adjustment. - control of time sync events via interrupt or polling - hardware timestamp of ext. events (HWx_TS_PUSH) - periodic generator function outputs (TS_GENFx) - PPS in combination with timesync router - Depending on integration it enables compliance with the IEEE 1588-2008 standard for a precision clock synchronization protocol, Ethernet Enhanced Scheduled Traffic Operations (CPTS_ESTFn) and PCIe Subsystem Precision Time Measurement (PTM). Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 12:02:02 -07:00
David S. Miller	1248dc00fb	Merge branch 'devlink-kernel-region-snapshot-id-allocation' Jakub Kicinski says: ==================== devlink: kernel region snapshot id allocation currently users have to find a free snapshot id to pass to the kernel when they are requesting a snapshot to be taken. This set extends the kernel so it can allocate the id on its own and send it back to user space in a response. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 11:58:31 -07:00
Jakub Kicinski	aebbd7dfab	docs: devlink: clarify the scope of snapshot id In past discussions Jiri explained snapshot ids are cross-region. Explain this in the docs. v3: new patch Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 11:58:31 -07:00
Jakub Kicinski	043b3e2276	devlink: let kernel allocate region snapshot id Currently users have to choose a free snapshot id before calling DEVLINK_CMD_REGION_NEW. This is potentially racy and inconvenient. Make the DEVLINK_ATTR_REGION_SNAPSHOT_ID optional and try to allocate id automatically. Send a message back to the caller with the snapshot info. Example use: $ devlink region new netdevsim/netdevsim1/dummy netdevsim/netdevsim1/dummy: snapshot 1 $ id=$(devlink -j region new netdevsim/netdevsim1/dummy \| \ jq '.[][][][]') $ devlink region dump netdevsim/netdevsim1/dummy snapshot $id [...] $ devlink region del netdevsim/netdevsim1/dummy snapshot $id v4: - inline the notification code v3: - send the notification only once snapshot creation completed. v2: - don't wrap the line containing extack; - add a few sentences to the docs. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 11:58:31 -07:00
Jakub Kicinski	dd86fec7e0	devlink: factor out building a snapshot notification We'll need to send snapshot info back on the socket which requested a snapshot to be created. Factor out constructing a snapshot description from the broadcast notification code. v3: new patch Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 11:58:31 -07:00
Eric Dumazet	39d010504e	net_sched: sch_fq: add horizon attribute QUIC servers would like to use SO_TXTIME, without having CAP_NET_ADMIN, to efficiently pace UDP packets. As far as sch_fq is concerned, we need to add safety checks, so that a buggy application does not fill the qdisc with packets having delivery time far in the future. This patch adds a configurable horizon (default: 10 seconds), and a configurable policy when a packet is beyond the horizon at enqueue() time: - either drop the packet (default policy) - or cap its delivery time to the horizon. $ tc -s -d qd sh dev eth0 qdisc fq 8022: root refcnt 257 limit 10000p flow_limit 100p buckets 1024 orphan_mask 1023 quantum 10Kb initial_quantum 51160b low_rate_threshold 550Kbit refill_delay 40.0ms timer_slack 10.000us horizon 10.000s Sent 1234215879 bytes 837099 pkt (dropped 21, overlimits 0 requeues 6) backlog 0b 0p requeues 6 flows 1191 (inactive 1177 throttled 0) gc 0 highprio 0 throttled 692 latency 11.480us pkts_too_long 0 alloc_errors 0 horizon_drops 21 horizon_caps 0 v2: fixed an overflow on 32bit kernels in fq_init(), reported by kbuild test robot <lkp@intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 11:56:17 -07:00
Jesper Dangaard Brouer	bf6dba76d2	net: sched: fallback to qdisc noqueue if default qdisc setup fail Currently if the default qdisc setup/init fails, the device ends up with qdisc "noop", which causes all TX packets to get dropped. With the introduction of sysctl net/core/default_qdisc it is possible to change the default qdisc to be more advanced, which opens for the possibility that Qdisc_ops->init() can fail. This patch detect these kind of failures, and choose to fallback to qdisc "noqueue", which is so simple that its init call will not fail. This allows the interface to continue functioning. V2: As this also captures memory failures, which are transient, the device is not kept in IFF_NO_QUEUE state. This allows the net_device to retry to default qdisc assignment. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 11:50:51 -07:00
David S. Miller	09be4c47ab	Merge branch 'net-ipa-I-O-map-SMEM-and-IMEM' Alex Elder says: ==================== net: ipa: I/O map SMEM and IMEM This series adds the definition of two memory regions that must be mapped for IPA to access through an SMMU. It requires the SMMU to be defined in the IPA node in the SoC's Device Tree file. There is no change since version 1 to the content of the code in these patches, however this time the first patch is an update to the binding definition rather than an update to a DTS file. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 11:26:55 -07:00
Alex Elder	a0036bb413	net: ipa: define SMEM memory region for IPA Arrange to use an item from SMEM memory for IPA. SMEM item number 497 is designated to be used by the IPA. Specify the item ID and size of the region in platform configuration data. Allocate and get a pointer to this region from ipa_mem_init(). The memory must be mapped for access through an SMMU. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 11:26:55 -07:00
Alex Elder	3e313c3f5a	net: ipa: define IMEM memory region for IPA Define a region of IMEM memory available for use by IPA in the platform configuration data. Initialize it from ipa_mem_init(). The memory must be mapped for access through an SMMU. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 11:26:55 -07:00
Alex Elder	3128aae8c4	net: ipa: redefine struct ipa_mem_data The ipa_mem_data structure type was never actually used. Instead, the IPA memory regions were defined using the ipa_mem structure. Redefine struct ipa_mem_data so it encapsulates the array of IPA-local memory region descriptors along with the count of entries in that array. Pass just an ipa_mem structure pointer to ipa_mem_init(). Rename the ipa_mem_data[] array ipa_mem_local_data[] to emphasize that the memory regions it defines are IPA-local memory. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 11:26:55 -07:00
Alex Elder	8456c54408	dt-bindings: net: add IPA iommus property The IPA accesses "IMEM" and main system memory through an SMMU, so its DT node requires an iommus property to define range of stream IDs it uses. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 11:26:55 -07:00
David S. Miller	cad5eaf74f	Merge branch 'net-add-helper-eth_hw_addr_crc' Heiner Kallweit says: ==================== net: add helper eth_hw_addr_crc Several drivers use the same code as basis for filter hashes. Therefore let's factor it out to a helper. This way drivers don't have to access struct netdev_hw_addr internals. First user is r8169. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 11:19:58 -07:00
Heiner Kallweit	bc54ac3609	r8169: use new helper eth_hw_addr_crc Use new helper eth_hw_addr_crc to simplify the code. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 11:19:58 -07:00
Heiner Kallweit	b86cd700ed	net: add helper eth_hw_addr_crc Several drivers use the same code as basis for filter hashes. Therefore let's factor it out to a helper. This way drivers don't have to access struct netdev_hw_addr internals. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 11:19:58 -07:00
Michael Walle	e90c9fcedc	net: dsa: felix: allow the device to be disabled If there is no specific configuration of the felix switch in the device tree, but only the default configuration (ie. given by the SoCs dtsi file), the probe fails because no CPU port has been set. On the other hand you cannot set a default CPU port because that depends on the actual board using the switch. [ 2.701300] DSA: tree 0 has no CPU port [ 2.705167] mscc_felix 0000:00:00.5: Failed to register DSA switch: -22 [ 2.711844] mscc_felix: probe of 0000:00:00.5 failed with error -22 Thus let the device tree disable this device entirely, like it is also done with the enetc driver of the same SoC. Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 11:15:55 -07:00
David S. Miller	627642f07b	Merge branch 'net-smc-add-failover-processing' Karsten Graul says: ==================== net/smc: add failover processing This patch series adds the actual SMC-R link failover processing and improved link group termination. There will be one more (very small) series after this which will complete the SMC-R link failover support. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 10:54:40 -07:00
Karsten Graul	649758fff3	net/smc: save SMC-R peer link_uid During SMC-R link establishment the peers exchange the link_uid that is used for debugging purposes. Save the peer link_uid in smc_link so it can be retrieved by the smc_diag netlink interface. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 10:54:39 -07:00
Karsten Graul	45fa8da0bf	net/smc: create improved SMC-R link_uid The link_uid of an SMC-R link is exchanged between SMC peers and its value can be used for debugging purposes. Create a unique link_uid during link initialization and use it in communication with SMC-R peers. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 10:54:39 -07:00
Karsten Graul	a52bcc919b	net/smc: improve termination processing Add helper smcr_lgr_link_deactivate_all() and eliminate duplicate code. In smc_lgr_free(), clear the smc-r links before smc_lgr_free_bufs() is called so buffers are already prepared for free. The usage of the soft parameter in __smc_lgr_terminate() is no longer needed, smc_lgr_free() can be called directly. smc_lgr_terminate_sched() and smc_smcd_terminate() set lgr->freeing to indicate that the link group will be freed soon to avoid unnecessary schedules of the free worker. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 10:54:39 -07:00
Karsten Graul	3e0c40afce	net/smc: add termination reason and handle LLC protocol violation Allow to set the reason code for the link group termination, and set meaningful values before termination processing is triggered. This reason code is sent to the peer in the final delete link message. When the LLC request or response layer receives a message type that was not handled, drop a warning and terminate the link group. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 10:54:39 -07:00
Karsten Graul	ad6c111b8a	net/smc: asymmetric link tagging New connections must not be assigned to asymmetric links. Add asymmetric link tagging using new link variable link_is_asym. The new helpers smcr_lgr_set_type() and smcr_lgr_set_type_asym() are called to set the state of the link group, and tag all links accordingly. smcr_lgr_conn_assign_link() respects the link tagging and will not assign new connections to links tagged as asymmetric link. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 10:54:39 -07:00
Karsten Graul	56bc3b2094	net/smc: assign link to a new connection For new connections, assign a link from the link group, using some simple load balancing. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 10:54:39 -07:00
Karsten Graul	f3811fd7bc	net/smc: send DELETE_LINK, ALL message and wait for send to complete Add smc_llc_send_message_wait() which uses smc_wr_tx_send_wait() to send an LLC message and waits for the message send to complete. smc_llc_send_link_delete_all() calls the new function to send an DELETE_LINK,ALL LLC message. The RFC states that the sender of this type of message needs to wait for the completion event of the message transmission and can terminate the link afterwards. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 10:54:39 -07:00
Karsten Graul	09c61d24f9	net/smc: wait for departure of an IB message Introduce smc_wr_tx_send_wait() to send an IB message and wait for the tx completion event of the message. This makes sure that the message is no longer in-flight when the function returns. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 10:54:39 -07:00
Karsten Graul	b286a0651e	net/smc: handle incoming CDC validation message Call smc_cdc_msg_validate() when a CDC message with the failover validation bit enabled was received. Validate that the sequence number sent with the message is one we already have received. If not, messages were lost and the connection is terminated using a new abort_work. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 10:54:39 -07:00
Karsten Graul	29bd73dba4	net/smc: send failover validation message When a connection is switched to a new link then a link validation message must be sent to the peer over the new link, containing the sequence number of the last CDC message that was sent over the old link. The peer will validate if this sequence number is the same or lower than the number he received, and abort the connection if messages were lost. Add smcr_cdc_msg_send_validation() to send the message validation message and call it when a connection was switched in smc_switch_cursor(). Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 10:54:39 -07:00
Karsten Graul	c6f02ebeea	net/smc: switch connections to alternate link Add smc_switch_conns() to switch all connections from a link that is going down. Find an other link to switch the connections to, and switch each connection to the new link. smc_switch_cursor() updates the cursors of a connection to the state of the last successfully sent CDC message. When there is no link to switch to, terminate the link group. Call smc_switch_conns() when a link is going down. And with the possibility that links of connections can switch adapt CDC and TX functions to detect and handle link switches. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 10:54:39 -07:00
Karsten Graul	f0ec4f1d32	net/smc: save state of last sent CDC message When a link goes down and all connections of this link need to be switched to an other link then the producer cursor and the sequence of the last successfully sent CDC message must be known. Add the two fields to the SMC connection and update it in the tx completion handler. And to allow matching of sequences in error cases reset the seqno to the old value in smc_cdc_msg_send() when the actual send failed. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 10:54:39 -07:00
David S. Miller	fc99584e94	Merge branch 'bnxt_en-Updates-for-net-next' Michael Chan says: ==================== bnxt_en: Updates for net-next. This patchset includes these main changes: 1. Firmware spec. update. 2. Context memory sizing improvements for the hardware TQM block. 3. ethtool chip reset improvements and fixes for correctness. 4. Improve L2 doorbell mapping by mapping only up to the size specified by firmware. This allows the RoCE driver to map the remaining doorbell space for its purpose, such as write-combining. 5. Improve ethtool -S channel statistics by showing only relevant ring counters for non-combined channels. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 10:44:11 -07:00
Rajesh Ravi	125592fbf4	bnxt_en: show only relevant ethtool stats for a TX or RX ring Currently, ethtool -S shows all TX/RX ring counters whether the channel is combined, RX, or TX. The unused counters will always be zero. Improve it by showing only the relevant counters if the channel is RX or TX. If the channel is combined, the counters will be shown exactly the same as before. [ MChan: Lots of cleanups and simplifications on Rajesh's original code] Signed-off-by: Rajesh Ravi <rajesh.ravi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 10:44:11 -07:00
Michael Chan	3316d50905	bnxt_en: Split HW ring statistics strings into RX and TX parts. This will allow the RX and TX ring statistics to be separated if needed. In the next patch, we'll be able to only display RX or TX statistics if the channel is RX only or TX only. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 10:44:11 -07:00
Michael Chan	9d8b5f0552	bnxt_en: Refactor the software ring counters. We currently have 3 software ring counters, rx_l4_csum_errors, rx_buf_errors, and missed_irqs. The 1st two are RX counters and the last one is a common counter. Organize them into 2 structures bnxt_rx_sw_stats and bnxt_cmn_sw_stats. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 10:44:11 -07:00
Michael Chan	098286ff93	bnxt_en: Add doorbell information to bnxt_en_dev struct. The purpose of this is to inform the RDMA driver the size of the doorbell BAR that the L2 driver has mapped and the portion that is mapped uncacheable. The uncacheable portion is shared with the RoCE driver. Any remaining unmapped doorbell BAR can be used by the RDMA driver for its own purpose. Currently, the entire L2 portion is mapped uncacheable. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 10:44:11 -07:00
Michael Chan	8ae2473842	bnxt_en: Add support for L2 doorbell size. Read the L2 doorbell size from the firmware and only map the portion of the doorbell BAR for L2 use. This will leave the remaining doorbell BAR available for the RoCE driver to use. The RoCE driver can map the remaining portion as write-combining to support the push feature. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 10:44:11 -07:00
Michael Chan	e93b30d56f	bnxt_en: Set the db_offset on 57500 chips for the RDMA MSIX entries. The driver provides completion ring or NQ doorbell offset for each MSIX entry requested by the RDMA driver. The NQ offset on 57500 chips is different than legacy chips. Set it correctly based on chip type for correctness. The RDMA driver is ignoring this field for the 57500 chips so it is not causing any problem. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 10:44:11 -07:00
Michael Chan	ebdf73dc59	bnxt_en: Define the doorbell offsets on 57500 chips. Define the 57500 chip doorbell offsets instead of using the magic values in the C file. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-04 10:44:11 -07:00

916933 Commits