linux

Author	SHA1	Message	Date
Rohit Maheshwari	38f7e1c0c4	net/tls: race causes kernel panic BUG: kernel NULL pointer dereference, address: 00000000000000b8 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 80000008b6fef067 P4D 80000008b6fef067 PUD 8b6fe6067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 12 PID: 23871 Comm: kworker/12:80 Kdump: loaded Tainted: G S 5.9.0-rc3+ #1 Hardware name: Supermicro X10SRA-F/X10SRA-F, BIOS 2.1 03/29/2018 Workqueue: events tx_work_handler [tls] RIP: 0010:tx_work_handler+0x1b/0x70 [tls] Code: dc fe ff ff e8 16 d4 a3 f6 66 0f 1f 44 00 00 0f 1f 44 00 00 55 53 48 8b 6f 58 48 8b bd a0 04 00 00 48 85 ff 74 1c 48 8b 47 28 <48> 8b 90 b8 00 00 00 83 e2 02 75 0c f0 48 0f ba b0 b8 00 00 00 00 RSP: 0018:ffffa44ace61fe88 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff91da9e45cc30 RCX: dead000000000122 RDX: 0000000000000001 RSI: ffff91da9e45cc38 RDI: ffff91d95efac200 RBP: ffff91da133fd780 R08: 0000000000000000 R09: 000073746e657665 R10: 8080808080808080 R11: 0000000000000000 R12: ffff91dad7d30700 R13: ffff91dab6561080 R14: 0ffff91dad7d3070 R15: ffff91da9e45cc38 FS: 0000000000000000(0000) GS:ffff91dad7d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000b8 CR3: 0000000906478003 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: process_one_work+0x1a7/0x370 worker_thread+0x30/0x370 ? process_one_work+0x370/0x370 kthread+0x114/0x130 ? kthread_park+0x80/0x80 ret_from_fork+0x22/0x30 tls_sw_release_resources_tx() waits for encrypt_pending, which can have race, so we need similar changes as in commit `0cada33241` here as well. Fixes: `a42055e8d2` ("net/tls: Add support for async encryption of records for performance") Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 20:06:29 -07:00
Wang Qing	0eb11dfe22	net/ethernet/broadcom: fix spelling typo Modify the comment typo: "compliment" -> "complement". Signed-off-by: Wang Qing <wangqing@vivo.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 20:05:48 -07:00
Xiaoliang Yang	4ab810a4e0	net: mscc: ocelot: fix fields offset in SG_CONFIG_REG_3 INIT_IPS and GATE_ENABLE fields have a wrong offset in SG_CONFIG_REG_3. This register is used by stream gate control of PSFP, and it has not been used before, because PSFP is not implemented in ocelot driver. Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 20:00:40 -07:00
Xiaoliang Yang	dba1e4660a	net: dsa: felix: convert TAS link speed based on phylink speed state->speed holds a value of 10, 100, 1000 or 2500, but QSYS_TAG_CONFIG_LINK_SPEED expects a value of 0, 1, 2, 3. So convert the speed to a proper value. Fixes: `de143c0e27` ("net: dsa: felix: Configure Time-Aware Scheduler via taprio offload") Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 20:00:40 -07:00
Luo bin	f68910a805	hinic: fix wrong return value of mac-set cmd It should also be regarded as an error when hw return status=4 for PF's setting mac cmd. Only if PF return status=4 to VF should this cmd be taken special treatment. Fixes: `7dd29ee128` ("hinic: add sriov feature support") Signed-off-by: Luo bin <luobin9@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 20:00:40 -07:00
David S. Miller	a1a35529bd	Merge branch 'mptcp-RM_ADDR-ADD_ADDR-enhancements' Geliang Tang says: ==================== mptcp: RM_ADDR/ADD_ADDR enhancements This series include two enhancements for the MPTCP path management, namely RM_ADDR support and ADD_ADDR echo support, as specified by RFC sections 3.4.1 and 3.4.2. 1 RM_ADDR support include 9 patches (1-3 and 8-13): Patch 1 is the helper for patch 2, these two patches add the RM_ADDR outgoing functions, which are derived from ADD_ADDR's corresponding functions. Patch 3 adds the RM_ADDR incoming logic, when RM_ADDR suboption is received, close the subflow matching the rm_id, and update PM counter. Patch 8 is the main remove routine. When the PM netlink removes an address, we traverse all the existing msk sockets to find the relevant sockets. Then trigger the RM_ADDR signal and remove the subflow which using this local address, this subflow removing functions has been implemented in patch 9. Finally, patches 10-13 are the self-tests for RM_ADDR. 2 ADD_ADDR echo support include 7 patches (4-7 and 14-16). Patch 4 adds the ADD_ADDR echo logic, when the ADD_ADDR suboption has been received, send out the same ADD_ADDR suboption with echo-flag, and no HMAC included. Patches 5 and 6 are the self-tests for ADD_ADDR echo. Patch 7 is a little cleaning up. Patch 14 and 15 are the helpers for patch 16. These three patches add the ADD_ADDR retransmition when no ADD_ADDR echo is received. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:58:34 -07:00
Geliang Tang	00cfd77b90	mptcp: retransmit ADD_ADDR when timeout This patch implemented the retransmition of ADD_ADDR when no ADD_ADDR echo is received. It added a timer with the announced address. When timeout occurs, ADD_ADDR will be retransmitted. Suggested-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Suggested-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:58:34 -07:00
Geliang Tang	08b81d8731	mptcp: add sk_stop_timer_sync helper This patch added a new helper sk_stop_timer_sync, it deactivates a timer like sk_stop_timer, but waits for the handler to finish. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:58:34 -07:00
Geliang Tang	0abd40f823	mptcp: add struct mptcp_pm_add_entry Add a new struct mptcp_pm_add_entry to describe add_addr's entry. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:58:34 -07:00
Geliang Tang	dd72b0fede	selftests: mptcp: add remove addr and subflow test cases This patch added the remove addr and subflow test cases and two new functions. The first function run_remove_tests calls do_transfer with two new arguments, rm_nr_ns1 and rm_nr_ns2, for the numbers of addresses should be removed during the transfer process in namespace 1 and namespace 2. If both these two arguments are 0, we do the join test cases with "mptcp_connect -j" command. Otherwise, do the remove test cases with "mptcp_connect -r" command. The second function chk_rm_nr checks the RM_ADDR related mibs's counters. The output of the test cases looks like this: 11 remove single subflow syn[ ok ] - synack[ ok ] - ack[ ok ] rm [ ok ] - sf [ ok ] 12 remove multiple subflows syn[ ok ] - synack[ ok ] - ack[ ok ] rm [ ok ] - sf [ ok ] 13 remove single address syn[ ok ] - synack[ ok ] - ack[ ok ] add[ ok ] - echo [ ok ] rm [ ok ] - sf [ ok ] 14 remove subflow and signal syn[ ok ] - synack[ ok ] - ack[ ok ] add[ ok ] - echo [ ok ] rm [ ok ] - sf [ ok ] 15 remove subflows and signal syn[ ok ] - synack[ ok ] - ack[ ok ] add[ ok ] - echo [ ok ] rm [ ok ] - sf [ ok ] Suggested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Suggested-by: Paolo Abeni <pabeni@redhat.com> Suggested-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:58:34 -07:00
Geliang Tang	1315332409	selftests: mptcp: add remove cfg in mptcp_connect This patch added a new cfg, named cfg_remove in mptcp_connect. This new cfg_remove is copied from cfg_join. The only difference between them is in the do_rnd_write function. Here we slow down the transfer process of all data to let the RM_ADDR suboption can be sent and received completely. Otherwise the remove address and subflow test cases don't work. Suggested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Suggested-by: Paolo Abeni <pabeni@redhat.com> Suggested-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:58:34 -07:00
Geliang Tang	5c8c164095	mptcp: add mptcp_destroy_common helper This patch added a new helper named mptcp_destroy_common containing the shared code between mptcp_destroy() and mptcp_sock_destruct(). Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:58:34 -07:00
Geliang Tang	7a7e52e38a	mptcp: add RM_ADDR related mibs This patch added two new mibs for RM_ADDR, named MPTCP_MIB_RMADDR and MPTCP_MIB_RMSUBFLOW, when the RM_ADDR suboption is received, increase the first mib counter, when the local subflow is removed, increase the second mib counter. Suggested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Suggested-by: Paolo Abeni <pabeni@redhat.com> Suggested-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:58:34 -07:00
Geliang Tang	0ee4261a36	mptcp: implement mptcp_pm_remove_subflow This patch implemented the local subflow removing function, mptcp_pm_remove_subflow, it simply called mptcp_pm_nl_rm_subflow_received under the PM spin lock. We use mptcp_pm_remove_subflow to remove a local subflow, so change it's argument from remote_id to local_id. We check subflow->local_id in mptcp_pm_nl_rm_subflow_received to remove a subflow. Suggested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Suggested-by: Paolo Abeni <pabeni@redhat.com> Suggested-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:58:34 -07:00
Geliang Tang	b6c0838086	mptcp: remove addr and subflow in PM netlink This patch implements the remove announced addr and subflow logic in PM netlink. When the PM netlink removes an address, we traverse all the existing msk sockets to find the relevant sockets. We add a new list named anno_list in mptcp_pm_data, to record all the announced addrs. In the traversing, we check if it has been recorded. If it has been, we trigger the RM_ADDR signal. We also check if this address is in conn_list. If it is, we remove the subflow which using this local address. Since we call mptcp_pm_free_anno_list in mptcp_destroy, we need to move __mptcp_init_sock before the mptcp_is_enabled check in mptcp_init_sock. Suggested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Suggested-by: Paolo Abeni <pabeni@redhat.com> Suggested-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:58:33 -07:00
Geliang Tang	f58f065aa1	mptcp: add accept_subflow re-check The re-check of pm->accept_subflow with pm->lock held was missing, this patch fixed it. Suggested-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:58:33 -07:00
Geliang Tang	be61316003	selftests: mptcp: add ADD_ADDR mibs check function This patch added the ADD_ADDR related mibs counter check function chk_add_nr(). This function check both ADD_ADDR and ADD_ADDR with echo flag. The output looks like this: 07 unused signal address syn[ ok ] - synack[ ok ] - ack[ ok ] add[ ok ] - echo [ ok ] 08 signal address syn[ ok ] - synack[ ok ] - ack[ ok ] add[ ok ] - echo [ ok ] 09 subflow and signal syn[ ok ] - synack[ ok ] - ack[ ok ] add[ ok ] - echo [ ok ] 10 multiple subflows and signal syn[ ok ] - synack[ ok ] - ack[ ok ] add[ ok ] - echo [ ok ] 11 remove subflow and signal syn[ ok ] - synack[ ok ] - ack[ ok ] add[ ok ] - echo [ ok ] Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:58:33 -07:00
Geliang Tang	a877de0671	mptcp: add ADD_ADDR related mibs This patch added two mibs for ADD_ADDR, MPTCP_MIB_ADDADDR for receiving of the ADD_ADDR suboption with echo-flag=0, and MPTCP_MIB_ECHOADD for receiving the ADD_ADDR suboption with echo-flag=1. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:58:33 -07:00
Geliang Tang	6a6c05a8b0	mptcp: send out ADD_ADDR with echo flag When the ADD_ADDR suboption has been received, we need to send out the same ADD_ADDR suboption with echo-flag=1, and no HMAC. Suggested-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:58:33 -07:00
Geliang Tang	d0876b2284	mptcp: add the incoming RM_ADDR support This patch added the RM_ADDR option parsing logic: We parsed the incoming options to find if the rm_addr option is received, and called mptcp_pm_rm_addr_received to schedule PM work to a new status, named MPTCP_PM_RM_ADDR_RECEIVED. PM work got this status, and called mptcp_pm_nl_rm_addr_received to handle it. In mptcp_pm_nl_rm_addr_received, we closed the subflow matching the rm_id, and updated PM counter. Suggested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Suggested-by: Paolo Abeni <pabeni@redhat.com> Suggested-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:58:33 -07:00
Geliang Tang	5cb104ae55	mptcp: add the outgoing RM_ADDR support This patch added a new signal named rm_addr_signal in PM. On outgoing path, we called mptcp_pm_should_rm_signal to check if rm_addr_signal has been set. If it has been, we sent out the RM_ADDR option. Suggested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:58:33 -07:00
Geliang Tang	f643b8032e	mptcp: rename addr_signal and the related functions This patch renamed addr_signal and the related functions with the explicit word "add". Suggested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:58:33 -07:00
David S. Miller	075c156850	mlx5-updates-2020-09-22 This series includes mlx5 updates 1) Add support for Connection Tracking offload in NIC mode. Supporting CT offload in NIC mode on Mellanox cards is useful for scenarios where the dual port NIC serves as a gateway between 2 networks and forwards traffic between these networks. Since the traffic is not terminated on the host in this case, no use of SRIOV VFs and/or switchdev mode is required. Today Mellanox NIC cards already support offloading of packet forwarding between physical ports without going to the host so combining it with CT offloading allows users to create a gateway with forwarding and CT (Including NAT) offloading capabilities in non-switchdev mode. To support connection tracking in non-Switchdev mode (Single NIC mode), we need to make use of the current Connection tracking infrastructure implemented on top of E-Switch and the mlx5 generic flow table chains APIs, to make it work on non-Eswitch steering domain e.g. NIC RX domain, the following was performed: 1.1) Refactor current flow steering chains infrastructure and updates TC nic mode implementation to use flow table chains. 1.2) Refactor current Connection Tracking (CT) infrastructure to not assume E-switch backend, and make the CT layer agnostic to underlying steering mode (E-Switch/NIC) 1.3) Plumbing to support CT offload in NIC mode. 2) Trivial code cleanups. -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl9rz9cACgkQSD+KveBX +j7K/gf/ZysTFuFuC7MCo7xJO2vxlGGE1r6/ENsqonvUT2tcoZCdK9bZMw1Mx17Z r1nyn0xQ3MwRheXMSpqXngTPpfGM6eNgV9CDfFXm62z6WXMYieen0t/LrM/mxo+2 s74Okp53peyGNpePyseewEUGV7zaR6F6uukkKvr441gvAOF3Fcfaz+dIv7KzxKNS +b78yw0b6mGc4foYLSuJcDQlSwqjeIpdSib8xmETMZwRzCt20GCEBDsBAaKt0wzM 1fTZttY+kuLd/m/q+sh3s/4lN2kOO+dwK5NGf+RWtiOaDWT+J/ogVmI2ywXIwsg7 U63nhjGAr7GPqkaG0Jv3aS7na6pbSA== =sByc -----END PGP SIGNATURE----- Merge tag 'mlx5-updates-2020-09-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2020-09-22 This series includes mlx5 updates 1) Add support for Connection Tracking offload in NIC mode. Supporting CT offload in NIC mode on Mellanox cards is useful for scenarios where the dual port NIC serves as a gateway between 2 networks and forwards traffic between these networks. Since the traffic is not terminated on the host in this case, no use of SRIOV VFs and/or switchdev mode is required. Today Mellanox NIC cards already support offloading of packet forwarding between physical ports without going to the host so combining it with CT offloading allows users to create a gateway with forwarding and CT (Including NAT) offloading capabilities in non-switchdev mode. To support connection tracking in non-Switchdev mode (Single NIC mode), we need to make use of the current Connection tracking infrastructure implemented on top of E-Switch and the mlx5 generic flow table chains APIs, to make it work on non-Eswitch steering domain e.g. NIC RX domain, the following was performed: 1.1) Refactor current flow steering chains infrastructure and updates TC nic mode implementation to use flow table chains. 1.2) Refactor current Connection Tracking (CT) infrastructure to not assume E-switch backend, and make the CT layer agnostic to underlying steering mode (E-Switch/NIC) 1.3) Plumbing to support CT offload in NIC mode. 2) Trivial code cleanups. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:54:40 -07:00
Xie He	ed46cd1d4c	drivers/net/wan/x25_asy: Correct the ndo_open and ndo_stop functions 1. Move the lapb_register/lapb_unregister calls into the ndo_open/ndo_stop functions. This makes the LAPB protocol start/stop when the network interface starts/stops. When the network interface is down, the LAPB protocol shouldn't be running and the LAPB module shoudn't be generating control frames. 2. Move netif_start_queue/netif_stop_queue into the ndo_open/ndo_stop functions. This makes the TX queue start/stop when the network interface starts/stops. (netif_stop_queue was originally in the ndo_stop function. But to make the code look better, I created a new function to use as ndo_stop, and made it call the original ndo_stop function. I moved netif_stop_queue from the original ndo_stop function to the new ndo_stop function.) Cc: Martin Schiller <ms@dev.tdt.de> Signed-off-by: Xie He <xie.he.0141@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:52:58 -07:00
Maciej Żenczykowski	02a1b175b0	net/ipv4: always honour route mtu during forwarding Documentation/networking/ip-sysctl.txt:46 says: ip_forward_use_pmtu - BOOLEAN By default we don't trust protocol path MTUs while forwarding because they could be easily forged and can lead to unwanted fragmentation by the router. You only need to enable this if you have user-space software which tries to discover path mtus by itself and depends on the kernel honoring this information. This is normally not the case. Default: 0 (disabled) Possible values: 0 - disabled 1 - enabled Which makes it pretty clear that setting it to 1 is a potential security/safety/DoS issue, and yet it is entirely reasonable to want forwarded traffic to honour explicitly administrator configured route mtus (instead of defaulting to device mtu). Indeed, I can't think of a single reason why you wouldn't want to. Since you configured a route mtu you probably know better... It is pretty common to have a higher device mtu to allow receiving large (jumbo) frames, while having some routes via that interface (potentially including the default route to the internet) specify a lower mtu. Note that ipv6 forwarding uses device mtu unless the route is locked (in which case it will use the route mtu). This approach is not usable for IPv4 where an 'mtu lock' on a route also has the side effect of disabling TCP path mtu discovery via disabling the IPv4 DF (don't frag) bit on all outgoing frames. I'm not aware of a way to lock a route from an IPv6 RA, so that also potentially seems wrong. Signed-off-by: Maciej Żenczykowski <maze@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Lorenzo Colitti <lorenzo@google.com> Cc: Sunmeet Gill (Sunny) <sgill@quicinc.com> Cc: Vinay Paradkar <vparadka@qti.qualcomm.com> Cc: Tyler Wear <twear@quicinc.com> Cc: David Ahern <dsahern@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:51:16 -07:00
David S. Miller	54ce00ae36	Merge branch 'dpaa2-mac-add-PCS-support-through-the-Lynx-module' Ioana Ciornei says: ==================== dpaa2-mac: add PCS support through the Lynx module This patch set aims to add PCS support in the dpaa2-eth driver by leveraging the Lynx PCS module. The first two patches are some missing pieces: the first one adding support for 10GBASER in Lynx PCS while the second one adds a new function - of_mdio_find_device - which is helpful in retrieving the PCS represented as a mdio_device. The final patch adds the glue logic between phylink and the Lynx PCS module: it retrieves the PCS represented as an mdio_device and registers it to Lynx and phylink. From that point on, any PCS callbacks are treated by Lynx, without dpaa2-eth interaction. Changes in v2: - move put_device() after destroy - 3/3 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:49:36 -07:00
Ioana Ciornei	94ae899b20	dpaa2-mac: add PCS support through the Lynx module Include PCS support in the dpaa2-eth driver by integrating it with the new Lynx PCS module. There is not much to talk about in terms of changes needed in the dpaa2-eth driver since the only steps necessary are to find the MDIO device representing the PCS, register it to the Lynx PCS module and then let phylink know if its existence also. After this, the PCS callbacks will be treated directly by Lynx, without interraction from dpaa2-eth's part. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:49:36 -07:00
Russell King	b5b6775d72	of: add of_mdio_find_device() api Add a helper function which finds the mdio_device structure given a device tree node. This is helpful for finding the PCS device based on a DTS node but managing it as a mdio_device instead of a phy_device. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:49:36 -07:00
Ioana Ciornei	e7e95c9003	net: pcs-lynx: add support for 10GBASER Add support in the Lynx PCS module for the 10GBASE-R mode which is only used to get the link state, since it offers a single fixed speed. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:49:36 -07:00
Vladimir Oltean	e2f9a8fe73	net: mscc: ocelot: always pass skb clone to ocelot_port_add_txtstamp_skb Currently, ocelot switchdev passes the skb directly to the function that enqueues it to the list of skb's awaiting a TX timestamp. Whereas the felix DSA driver first clones the skb, then passes the clone to this queue. This matters because in the case of felix, the common IRQ handler, which is ocelot_get_txtstamp(), currently clones the clone, and frees the original clone. This is useless and can be simplified by using skb_complete_tx_timestamp() instead of skb_tstamp_tx(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:47:56 -07:00
David S. Miller	6d8899962a	Merge branch 'net_sched-fix-a-UAF-in-tcf_action_init' Cong Wang says: ==================== net_sched: fix a UAF in tcf_action_init() This patchset fixes a use-after-free triggered by syzbot. Please find more details in each patch description. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:46:21 -07:00
Cong Wang	0fedc63fad	net_sched: commit action insertions together syzbot is able to trigger a failure case inside the loop in tcf_action_init(), and when this happens we clean up with tcf_action_destroy(). But, as these actions are already inserted into the global IDR, other parallel process could free them before tcf_action_destroy(), then we will trigger a use-after-free. Fix this by deferring the insertions even later, after the loop, and committing all the insertions in a separate loop, so we will never fail in the middle of the insertions any more. One side effect is that the window between alloction and final insertion becomes larger, now it is more likely that the loop in tcf_del_walker() sees the placeholder -EBUSY pointer. So we have to check for error pointer in tcf_del_walker(). Reported-and-tested-by: syzbot+2287853d392e4b42374a@syzkaller.appspotmail.com Fixes: `0190c1d452` ("net: sched: atomically check-allocate action") Cc: Vlad Buslov <vladbu@mellanox.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:46:21 -07:00
Cong Wang	e49d8c22f1	net_sched: defer tcf_idr_insert() in tcf_action_init_1() All TC actions call tcf_idr_insert() for new action at the end of their ->init(), so we can actually move it to a central place in tcf_action_init_1(). And once the action is inserted into the global IDR, other parallel process could free it immediately as its refcnt is still 1, so we can not fail after this, we need to move it after the goto action validation to avoid handling the failure case after insertion. This is found during code review, is not directly triggered by syzbot. And this prepares for the next patch. Cc: Vlad Buslov <vladbu@mellanox.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-24 19:46:21 -07:00
Dave Airlie	ba78755e0c	drm-misc-fixes for v5.9: - Single null pointer deref fix for dma-buf. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAl9seRQACgkQ/lWMcqZw E8PODA/8DUbUMRyJqFtWYFiX92MULZSWdS4Khj1uFRhmLcpNSpLo/P+In88HJeLK bbYuDYAbAAxdce2VZEP/gNu5C2HDctJ4jCV+4fusczpUavthl1hkmlvDHnoJjtiX VYRHnZPyq5Yj+GRAtiHLzHSszUVsc3N18Vby2IG0Qu7MgN49pIgrVWD3/L8Dzdoi e6eVjSTypliCEhKgp3c+nl7ZaAnyqcKnhrisZW7sTuZm32GfG8C1E0DhGzbopPC5 KQyMtQk6YRw7aSIA9lQQ8KoWqzyNEyRhM82gVzjmnWPfjtrhowObEQkLW42QaSUF wcV8k5T/Bw4Kgqrbo7bsrVCzBN+Dwz8gRT2jfXkw7CmRLv9aN5cTmu9eRPlgUbnc NpkS/0S/V/zRv+spNd+ILn3gaJdFEU7+LSxhCJwaDq95JHf5fB+KNhqWWEpun6/y D6Ge2OGjnVWBF4pfumHgv+F39a4rO5KsWapxucWjakj7UBT3uOaO8O/mvkCT8G6g nKm2+A6u8CUIpZUQlE5ICyHwncLBUzLaoTEqCwujNdBWnNkCBreTfjX04i3yVrQg XODboY8EiN+w8rBHexKLBkF+dgmSOxW1hG7a6gt2A42Qn42PJ5LFAnsRbkUSKA2t NDNxDTzpIn/O1i+bTg96Q5hWDKY15Vo9d6tWX4CUOEwCTzq5b2c= =CEdq -----END PGP SIGNATURE----- Merge tag 'drm-misc-fixes-2020-09-24' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes drm-misc-fixes for v5.9: - Single null pointer deref fix for dma-buf. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/4106c21e-f52c-4c05-6cdb-daa743bb8617@linux.intel.com	2020-09-25 11:30:00 +10:00
Dave Airlie	f3231a02aa	Merge tag 'drm-intel-fixes-2020-09-24' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes drm/i915 fixes for v5.9-rc7: - Fix selftest reference to stack data out of scope - Fix GVT null pointer dereference - Backmerge from Linus' master to fix build Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/87zh5fpmha.fsf@intel.com	2020-09-25 11:07:01 +10:00
Dave Airlie	720777c5be	BackMerge commit '98477740630f270aecf648f1d6a9dbc6027d4ff1' into drm-fixes The dax mess had some fallout, and i915 used a later base to fix their CI. Signed-off-by: Dave Airlie <airlied@redhat.com>	2020-09-25 11:06:18 +10:00
Xiao Ni	d3ee2d8415	md/raid10: improve discard request for far layout For far layout, the discard region is not continuous on disks. So it needs far copies r10bio to cover all regions. It needs a way to know all r10bios have finish or not. Similar with raid10_sync_request, only the first r10bio master_bio records the discard bio. Other r10bios master_bio record the first r10bio. The first r10bio can finish after other r10bios finish and then return the discard bio. Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com>	2020-09-24 16:44:45 -07:00
Xiao Ni	bcc90d2804	md/raid10: improve raid10 discard request Now the discard request is split by chunk size. So it takes a long time to finish mkfs on disks which support discard function. This patch improve handling raid10 discard request. It uses the similar way with patch `29efc390b` (md/md0: optimize raid0 discard handling). But it's a little complex than raid0. Because raid10 has different layout. If raid10 is offset layout and the discard request is smaller than stripe size. There are some holes when we submit discard bio to underlayer disks. For example: five disks (disk1 - disk5) D01 D02 D03 D04 D05 D05 D01 D02 D03 D04 D06 D07 D08 D09 D10 D10 D06 D07 D08 D09 The discard bio just wants to discard from D03 to D10. For disk3, there is a hole between D03 and D08. For disk4, there is a hole between D04 and D09. D03 is a chunk, raid10_write_request can handle one chunk perfectly. So the part that is not aligned with stripe size is still handled by raid10_write_request. If reshape is running when discard bio comes and the discard bio spans the reshape position, raid10_write_request is responsible to handle this discard bio. I did a test with this patch set. Without patch: time mkfs.xfs /dev/md0 real4m39.775s user0m0.000s sys0m0.298s With patch: time mkfs.xfs /dev/md0 real0m0.105s user0m0.000s sys0m0.007s nvme3n1 259:1 0 477G 0 disk └─nvme3n1p1 259:10 0 50G 0 part nvme4n1 259:2 0 477G 0 disk └─nvme4n1p1 259:11 0 50G 0 part nvme5n1 259:6 0 477G 0 disk └─nvme5n1p1 259:12 0 50G 0 part nvme2n1 259:9 0 477G 0 disk └─nvme2n1p1 259:15 0 50G 0 part nvme0n1 259:13 0 477G 0 disk └─nvme0n1p1 259:14 0 50G 0 part Reviewed-by: Coly Li <colyli@suse.de> Reviewed-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com>	2020-09-24 16:44:45 -07:00
Xiao Ni	f046f5d0d7	md/raid10: pull codes that wait for blocked dev into one function The following patch will reuse these logics, so pull the same codes into one function. Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com>	2020-09-24 16:44:45 -07:00
Xiao Ni	8650a88901	md/raid10: extend r10bio devs to raid disks Now it allocs r10bio->devs[conf->copies]. Discard bio needs to submit to all member disks and it needs to use r10bio. So extend to r10bio->devs[geo.raid_disks]. Reviewed-by: Coly Li <colyli@suse.de> Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com>	2020-09-24 16:44:45 -07:00
Xiao Ni	2628089b74	md: add md_submit_discard_bio() for submitting discard bio Move these logic from raid0.c to md.c, so that we can also use it in raid10.c. Reviewed-by: Coly Li <colyli@suse.de> Reviewed-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com>	2020-09-24 16:44:45 -07:00
Zhen Lei	e287308b83	md: Simplify code with existing definition RESYNC_SECTORS in raid10.c #define RESYNC_SECTORS (RESYNC_BLOCK_SIZE >> 9) "RESYNC_BLOCK_SIZE/512" is equal to "RESYNC_BLOCK_SIZE >> 9", replace it with RESYNC_SECTORS. Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Song Liu <songliubraving@fb.com>	2020-09-24 16:44:45 -07:00
Yufen Yu	3891258443	md/raid5: reallocate page array after setting new stripe_size When try to resize stripe_size, we also need to free old shared page array and allocate new. Signed-off-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Song Liu <songliubraving@fb.com>	2020-09-24 16:44:45 -07:00
Yufen Yu	f16acaf328	md/raid5: resize stripe_head when reshape array When reshape array, we try to reuse shared pages of old stripe_head, and allocate more for the new one if needed. Signed-off-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Song Liu <songliubraving@fb.com>	2020-09-24 16:44:45 -07:00
Yufen Yu	046169f048	md/raid5: let multiple devices of stripe_head share page In current implementation, grow_buffers() uses alloc_page() to allocate the buffers for each stripe_head, i.e. allocate a page for each dev[i] in stripe_head. After setting stripe_size as a configurable value by writing sysfs entry, it means that we always allocate 64K buffers, but just use 4K of them when stripe_size is 4K in 64KB arm64. To avoid wasting memory, we try to let multiple sh->dev share one real page. That means, multiple sh->dev[i].page will point to the only page with different offset. Example of 64K PAGE_SIZE and 4K stripe_size as following: 64K PAGE_SIZE +---+---+---+---+------------------------------+ \| \| \| \| \| \| \| \| \| \| +-+-+-+-+-+-+-+-+------------------------------+ ^ ^ ^ ^ \| \| \| +----------------------------+ \| \| \| \| \| \| +-------------------+ \| \| \| \| \| \| +----------+ \| \| \| \| \| \| +-+ \| \| \| \| \| \| \| +-----+-----+------+-----+------+-----+------+------+ sh \| offset(0) \| offset(4K) \| offset(8K) \| offset(12K) \| + +-----------+------------+------------+-------------+ +----> dev[0].page dev[1].page dev[2].page dev[3].page A new 'pages' array will be added into stripe_head to record shared page used by this stripe_head. Allocate them when grow_buffers() and free them when shrink_buffers(). After trying to share page, the users of sh->dev[i].page need to take care of the related page offset: page of issued bio and page passed to xor compution functions. But thanks for previous different page offset supported. Here, we just need to set correct dev[i].offset. Signed-off-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Song Liu <songliubraving@fb.com>	2020-09-24 16:44:44 -07:00
Yufen Yu	4f86ff5580	md/raid6: let async recovery function support different page offset For now, asynchronous raid6 recovery calculate functions are require common offset for pages. But, we expect them to support different page offset after introducing stripe shared page. Do that by simplily adding page offset where each page address are referred. Then, replace the old interface with the new ones in raid6 and raid6test. Signed-off-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Song Liu <songliubraving@fb.com>	2020-09-24 16:44:44 -07:00
Yufen Yu	d69454bc9f	md/raid6: let syndrome computor support different page offset For now, syndrome compute functions require common offset in the pages array. However, we expect them to support different offset when try to use shared page in the following. Simplily covert them by adding page offset where each page address are referred. Since the only caller of async_gen_syndrome() and async_syndrome_val() are in raid6, we don't want to reserve the old interface but modify the interface directly. After that, replacing old interfaces with new ones for raid6 and raid6test. Signed-off-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Song Liu <songliubraving@fb.com>	2020-09-24 16:44:44 -07:00
Yufen Yu	a7c224a820	md/raid5: convert to new xor compution interface We try to replace async_xor() and async_xor_val() with the new introduced interface async_xor_offs() and async_xor_val_offs() for raid456. Signed-off-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Song Liu <songliubraving@fb.com>	2020-09-24 16:44:44 -07:00
Yufen Yu	29bcff787a	md/raid5: add new xor function to support different page offset raid5 will call async_xor() and async_xor_val() to compute xor. For now, both of them require the common src/dst page offset. But, we want them to support different src/dst page offset for following shared page. Here, adding two new function async_xor_offs() and async_xor_val_offs() respectively for async_xor() and async_xor_val(). Signed-off-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Song Liu <songliubraving@fb.com>	2020-09-24 16:44:44 -07:00
Yufen Yu	248728dd04	md/raid5: make async_copy_data() to support different page offset ops_run_biofill() and ops_run_biodrain() will call async_copy_data() to copy sh->dev[i].page from or to bio page. For now, it implies the offset of dev[i].page is 0. But we want to support different page offset in the following. Thus, pass page offset to these functions and replace 'page_offset' with 'page_offset + poff'. No functional change. Signed-off-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Song Liu <songliubraving@fb.com>	2020-09-24 16:44:44 -07:00

... 122 123 124 125 126 ...

966273 Commits