Commit Graph

667 Commits

Author SHA1 Message Date
Geliang Tang
104125b82e mptcp: add mib for infinite map sending
This patch adds a new mib named MPTCP_MIB_INFINITEMAPTX, increase it
when a infinite mapping has been sent out.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23 11:51:05 +01:00
Geliang Tang
f8d4bcacff mptcp: infinite mapping receiving
This patch adds the infinite mapping receiving logic. When the infinite
mapping is received, set the map_data_len of the subflow to 0.

In subflow_check_data_avail(), only reset the subflow when the map_data_len
of the subflow is non-zero.

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23 11:51:05 +01:00
Geliang Tang
1e39e5a32a mptcp: infinite mapping sending
This patch adds the infinite mapping sending logic.

Add a new flag send_infinite_map in struct mptcp_subflow_context. Set
it true when a single contiguous subflow is in use and the
allow_infinite_fallback flag is true in mptcp_pm_mp_fail_received().

In mptcp_sendmsg_frag(), if this flag is true, call the new function
mptcp_update_infinite_map() to set the infinite mapping.

Add a new flag infinite_map in struct mptcp_ext, set it true in
mptcp_update_infinite_map(), and check this flag in a new helper
mptcp_check_infinite_map().

In mptcp_update_infinite_map(), set data_len to 0, and clear the
send_infinite_map flag, then do fallback.

In mptcp_established_options(), use the helper mptcp_check_infinite_map()
to let the infinite mapping DSS can be sent out in the fallback mode.

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23 11:51:05 +01:00
Geliang Tang
0530020a7c mptcp: track and update contiguous data status
This patch adds a new member allow_infinite_fallback in mptcp_sock,
which is initialized to 'true' when the connection begins and is set
to 'false' on any retransmit or successful MP_JOIN. Only do infinite
mapping fallback if there is a single subflow AND there have been no
retransmissions AND there have never been any MP_JOINs.

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23 11:51:05 +01:00
Geliang Tang
0348c690ed mptcp: add the fallback check
This patch adds the fallback check in subflow_check_data_avail(). Only
do the fallback when the msk hasn't fallen back yet.

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23 11:51:05 +01:00
Geliang Tang
1761fed256 mptcp: don't send RST for single subflow
When a bad checksum is detected and a single subflow is in use, don't
send RST + MP_FAIL, send data_ack + MP_FAIL instead.

So invoke tcp_send_active_reset() only when mptcp_has_another_subflow()
is true.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23 11:51:05 +01:00
Oliver Hartkopp
ec095263a9 net: remove noblock parameter from recvmsg() entities
The internal recvmsg() functions have two parameters 'flags' and 'noblock'
that were merged inside skb_recv_datagram(). As a follow up patch to commit
f4b41f062c ("net: remove noblock parameter from skb_recv_datagram()")
this patch removes the separate 'noblock' parameter for recvmsg().

Analogue to the referenced patch for skb_recv_datagram() the 'flags' and
'noblock' parameters are unnecessarily split up with e.g.

err = sk->sk_prot->recvmsg(sk, msg, size, flags & MSG_DONTWAIT,
                           flags & ~MSG_DONTWAIT, &addr_len);

or in

err = INDIRECT_CALL_2(sk->sk_prot->recvmsg, tcp_recvmsg, udp_recvmsg,
                      sk, msg, size, flags & MSG_DONTWAIT,
                      flags & ~MSG_DONTWAIT, &addr_len);

instead of simply using only flags all the time and check for MSG_DONTWAIT
where needed (to preserve for the formerly separated no(n)block condition).

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://lore.kernel.org/r/20220411124955.154876-1-socketcan@hartkopp.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-12 15:00:25 +02:00
Florian Westphal
4fa39b701c mptcp: listen diag dump support
makes 'ss -Ml' show mptcp listen sockets.

Iterate over the tcp listen sockets and pick those that have mptcp ulp
info attached.

mptcp_diag_get_info() is modified to prefer msk->first for mptcp sockets
in listen state.  This reports accurate number for recv and send queue
(pending / max connection backlog counters).

Sample output:
ss -Mil
State        Recv-Q Send-Q Local Address:Port  Peer Address:Port
LISTEN       0      20     127.0.0.1:12000     0.0.0.0:*
         subflows_max:2

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-11 11:55:53 +01:00
Florian Westphal
e8887b7161 mptcp: remove locking in mptcp_diag_fill_info
Problem is that listener iteration would call this from atomic context
so this locking is not allowed.

One way is to drop locks before calling the helper, but afaics the lock
isn't really needed, all values are fetched via READ_ONCE().

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-11 11:55:53 +01:00
Florian Westphal
6b9ea5c81e mptcp: diag: switch to context structure
Raw access to cb->arg[] is deprecated, use a context structure.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-11 11:55:53 +01:00
Geliang Tang
c682bf536c mptcp: add pm_nl_pernet helpers
This patch adds two pm_nl_pernet related helpers, named pm_nl_get_pernet()
and pm_nl_get_pernet_from_msk() to get pm_nl_pernet from 'net' or 'msk'.
Use these helpers instead of using net_generic() directly.

Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-11 11:55:53 +01:00
Paolo Abeni
0e203c3247 mptcp: reset the packet scheduler on PRIO change
Similar to the previous patch, for priority changes
requested by the local PM.

Reported-and-suggested-by: Davide Caratti <dcaratti@redhat.com>
Fixes: 067065422f ("mptcp: add the outgoing MP_PRIO support")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-11 11:55:53 +01:00
Paolo Abeni
43f5b111d1 mptcp: reset the packet scheduler on incoming MP_PRIO
When an incoming MP_PRIO option changes the backup
status of any subflow, we need to reset the packet
scheduler status, or the next send could keep using
the previously selected subflow, without taking in account
the new priorities.

Reported-by: Davide Caratti <dcaratti@redhat.com>
Fixes: 40453a5c61 ("mptcp: add the incoming MP_PRIO support")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-11 11:55:53 +01:00
Paolo Abeni
65a569b03c mptcp: optimize release_cb for the common case
The mptcp release callback checks several flags in atomic
context, but only MPTCP_CLEAN_UNA can be up frequently.

Reorganize the code to avoid multiple conditionals in the
most common scenarios.

Additional clarify a related comment.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-11 11:55:53 +01:00
Jakub Kicinski
89695196f0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Merge in overtime fixes, no conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-23 10:53:49 -07:00
Yonglong Li
12a18341b5 mptcp: send ADD_ADDR echo before create subflows
In some corner cases, the peer handing an incoming ADD_ADDR option, can
receive a retransmitted ADD_ADDR for the same address before the subflow
creation completes.

We can avoid the above issue by generating and sending the ADD_ADDR echo
before starting the MPJ subflow connection.

This slightly changes the behaviour of the packetdrill tests as the
ADD_ADDR echo packet is sent earlier.

Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Yonglong Li <liyonglong@chinatelecom.cn>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Link: https://lore.kernel.org/r/20220317221444.426335-1-mathew.j.martineau@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18 14:14:09 -07:00
Yonglong Li
3ef3905aa3 mptcp: Fix crash due to tcp_tsorted_anchor was initialized before release skb
Got crash when doing pressure test of mptcp:

===========================================================================
dst_release: dst:ffffa06ce6e5c058 refcnt:-1
kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
BUG: unable to handle kernel paging request at ffffa06ce6e5c058
PGD 190a01067 P4D 190a01067 PUD 43fffb067 PMD 22e403063 PTE 8000000226e5c063
Oops: 0011 [#1] SMP PTI
CPU: 7 PID: 7823 Comm: kworker/7:0 Kdump: loaded Tainted: G            E
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.2.1 04/01/2014
Call Trace:
 ? skb_release_head_state+0x68/0x100
 ? skb_release_all+0xe/0x30
 ? kfree_skb+0x32/0xa0
 ? mptcp_sendmsg_frag+0x57e/0x750
 ? __mptcp_retrans+0x21b/0x3c0
 ? __switch_to_asm+0x35/0x70
 ? mptcp_worker+0x25e/0x320
 ? process_one_work+0x1a7/0x360
 ? worker_thread+0x30/0x390
 ? create_worker+0x1a0/0x1a0
 ? kthread+0x112/0x130
 ? kthread_flush_work_fn+0x10/0x10
 ? ret_from_fork+0x35/0x40
===========================================================================

In __mptcp_alloc_tx_skb skb was allocated and skb->tcp_tsorted_anchor will
be initialized, in under memory pressure situation sk_wmem_schedule will
return false and then kfree_skb. In this case skb->_skb_refdst is not null
because_skb_refdst and tcp_tsorted_anchor are stored in the same mem, and
kfree_skb will try to release dst and cause crash.

Fixes: f70cad1085 ("mptcp: stop relying on tcp_tx_skb_cache")
Reviewed-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Yonglong Li <liyonglong@chinatelecom.cn>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Link: https://lore.kernel.org/r/20220317220953.426024-1-mathew.j.martineau@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18 14:13:48 -07:00
Geliang Tang
0dc626e5e8 mptcp: add fullmesh flag check for adding address
The fullmesh flag mustn't be used with the signal flag when adding an
address. This patch added the necessary flags check for this case.

Fixes: 73c762c1f0 ("mptcp: set fullmesh flag in pm_netlink")
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-08 22:06:12 -08:00
Paolo Abeni
4cf86ae84c mptcp: strict local address ID selection
The address ID selection for MPJ subflows created in response
to incoming ADD_ADDR option is currently unreliable: it happens
at MPJ socket creation time, when the local address could be
unknown.

Additionally, if the no local endpoint is available for the local
address, a new dummy endpoint is created, confusing the user-land.

This change refactor the code to move the address ID selection inside
the rebuild_header() helper, when the local address eventually
selected by the route lookup is finally known. If the address used
is not mapped by any endpoint - and thus can't be advertised/removed
pick the id 0 instead of allocate a new endpoint.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-08 22:06:12 -08:00
Paolo Abeni
d045b9eb95 mptcp: introduce implicit endpoints
In some edge scenarios, an MPTCP subflows can use a local address
mapped by a "implicit" endpoint created by the in-kernel path manager.

Such endpoints presence can be confusing, as it's creation is hard
to track and will prevent the later endpoint creation from the user-space
using the same address.

Define a new endpoint flag to mark implicit endpoints and allow the
user-space to replace implicit them with user-provided data at endpoint
creation time.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-08 22:06:11 -08:00
Paolo Abeni
6fa0174a7c mptcp: more careful RM_ADDR generation
The in-kernel MPTCP path manager, when processing the MPTCP_PM_CMD_FLUSH_ADDR
command, generates RM_ADDR events for each known local address. While that
is allowed by the RFC, it makes unpredictable the exact number of RM_ADDR
generated when both ends flush the PM addresses.

This change restricts the RM_ADDR generation to previously explicitly
announced addresses, and adjust the expected results in a bunch of related
self-tests.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-08 22:06:11 -08:00
Geliang Tang
ea56dcb43c mptcp: use MPTCP_SUBFLOW_NODATA
Set subflow->data_avail with the enum value MPTCP_SUBFLOW_NODATA, instead
of using 0 directly.

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-08 22:06:11 -08:00
Geliang Tang
0eb4e7ee16 mptcp: add tracepoint in mptcp_sendmsg_frag
The tracepoint in get_mapping_status() only dumped the incoming mpext
fields. This patch added a new tracepoint in mptcp_sendmsg_frag() to dump
the outgoing mpext too.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-08 22:06:10 -08:00
Geliang Tang
e40dd439d6 mptcp: add the mibs for MP_RST
This patch added two more mibs for MP_RST, MPTCP_MIB_MPRSTTX for
the MP_RST sending and MPTCP_MIB_MPRSTRX for the MP_RST receiving.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-04 21:54:30 -08:00
Geliang Tang
1e75629cb9 mptcp: add the mibs for MP_FASTCLOSE
This patch added two more mibs for MP_FASTCLOSE, MPTCP_MIB_MPFASTCLOSETX
for the MP_FASTCLOSE sending and MPTCP_MIB_MPFASTCLOSERX for receiving.

Also added a debug log for MP_FASTCLOSE receiving, printed out the recv_key
of MP_FASTCLOSE in mptcp_parse_option to show that MP_RST is received.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-04 21:54:29 -08:00
Jakub Kicinski
80901bff81 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
net/batman-adv/hard-interface.c
  commit 690bb6fb64 ("batman-adv: Request iflink once in batadv-on-batadv check")
  commit 6ee3c393ee ("batman-adv: Demote batadv-on-batadv skip error message")
https://lore.kernel.org/all/20220302163049.101957-1-sw@simonwunderlich.de/

net/smc/af_smc.c
  commit 4d08b7b57e ("net/smc: Fix cleanup when register ULP fails")
  commit 462791bbfa ("net/smc: add sysctl interface for SMC")
https://lore.kernel.org/all/20220302112209.355def40@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03 11:55:12 -08:00
Mat Martineau
877d11f033 mptcp: Correctly set DATA_FIN timeout when number of retransmits is large
Syzkaller with UBSAN uncovered a scenario where a large number of
DATA_FIN retransmits caused a shift-out-of-bounds in the DATA_FIN
timeout calculation:

================================================================================
UBSAN: shift-out-of-bounds in net/mptcp/protocol.c:470:29
shift exponent 32 is too large for 32-bit type 'unsigned int'
CPU: 1 PID: 13059 Comm: kworker/1:0 Not tainted 5.17.0-rc2-00630-g5fbf21c90c60 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Workqueue: events mptcp_worker
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 ubsan_epilogue+0xb/0x5a lib/ubsan.c:151
 __ubsan_handle_shift_out_of_bounds.cold+0xb2/0x20e lib/ubsan.c:330
 mptcp_set_datafin_timeout net/mptcp/protocol.c:470 [inline]
 __mptcp_retrans.cold+0x72/0x77 net/mptcp/protocol.c:2445
 mptcp_worker+0x58a/0xa70 net/mptcp/protocol.c:2528
 process_one_work+0x9df/0x16d0 kernel/workqueue.c:2307
 worker_thread+0x95/0xe10 kernel/workqueue.c:2454
 kthread+0x2f4/0x3b0 kernel/kthread.c:377
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
 </TASK>
================================================================================

This change limits the maximum timeout by limiting the size of the
shift, which keeps all intermediate values in-bounds.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/259
Fixes: 6477dd39e6 ("mptcp: Retransmit DATA_FIN")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 21:54:54 -08:00
Paolo Abeni
07c2c7a3b6 mptcp: accurate SIOCOUTQ for fallback socket
The MPTCP SIOCOUTQ implementation is not very accurate in
case of fallback: it only measures the data in the MPTCP-level
write queue, but it does not take in account the subflow
write queue utilization. In case of fallback the first can be
empty, while the latter is not.

The above produces sporadic self-tests issues and can foul
legit user-space application.

Fix the issue additionally querying the subflow in case of fallback.

Fixes: 644807e3e4 ("mptcp: add SIOCINQ, OUTQ and OUTQNSD ioctls")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/260
Reported-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 21:54:53 -08:00
Jakub Kicinski
aaa25a2fa7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
tools/testing/selftests/net/mptcp/mptcp_join.sh
  34aa6e3bcc ("selftests: mptcp: add ip mptcp wrappers")

  857898eb4b ("selftests: mptcp: add missing join check")
  6ef84b1517 ("selftests: mptcp: more robust signal race test")
https://lore.kernel.org/all/20220221131842.468893-1-broonie@kernel.org/

drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/act.h
drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/ct.c
  fb7e76ea3f ("net/mlx5e: TC, Skip redundant ct clear actions")
  c63741b426 ("net/mlx5e: Fix MPLSoUDP encap to use MPLS action information")

  09bf979232 ("net/mlx5e: TC, Move pedit_headers_action to parse_attr")
  84ba8062e3 ("net/mlx5e: Test CT and SAMPLE on flow attr")
  efe6f961cd ("net/mlx5e: CT, Don't set flow flag CT for ct clear flow")
  3b49a7edec ("net/mlx5e: TC, Reject rules with multiple CT actions")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 17:54:25 -08:00
Paolo Abeni
f73c119463 mptcp: add mibs counter for ignored incoming options
The MPTCP in kernel path manager has some constraints on incoming
addresses announce processing, so that in edge scenarios it can
end-up dropping (ignoring) some of such announces.

The above is not very limiting in practice since such scenarios are
very uncommon and MPTCP will recover due to ADD_ADDR retransmissions.

This patch adds a few MIB counters to account for such drop events
to allow easier introspection of the critical scenarios.

Fixes: f7efc7771e ("mptcp: drop argument port from mptcp_pm_announce_addr")
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19 12:28:00 +00:00
Paolo Abeni
837cf45df1 mptcp: fix race in incoming ADD_ADDR option processing
If an MPTCP endpoint received multiple consecutive incoming
ADD_ADDR options, mptcp_pm_add_addr_received() can overwrite
the current remote address value after the PM lock is released
in mptcp_pm_nl_add_addr_received() and before such address
is echoed.

Fix the issue caching the remote address value a little earlier
and always using the cached value after releasing the PM lock.

Fixes: f7efc7771e ("mptcp: drop argument port from mptcp_pm_announce_addr")
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19 12:28:00 +00:00
Paolo Abeni
98247bc16a mptcp: fix race in overlapping signal events
After commit a88c9e4969 ("mptcp: do not block subflows
creation on errors"), if a signal address races with a failing
subflow creation, the subflow creation failure control path
can trigger the selection of the next address to be announced
while the current announced is still pending.

The above will cause the unintended suppression of the ADD_ADDR
announce.

Fix the issue skipping the to-be-suppressed announce before it
will mark an endpoint as already used. The relevant announce
will be triggered again when the current one will complete.

Fixes: a88c9e4969 ("mptcp: do not block subflows creation on errors")
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19 12:28:00 +00:00
Florian Westphal
952382c648 mptcp: don't save tcp data_ready and write space callbacks
Assign the helpers directly rather than save/restore in the context
structure.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-16 20:52:06 -08:00
Florian Westphal
51fa7f8ebf mptcp: mark ops structures as ro_after_init
These structures are initialised from the init hooks, so we can't make
them 'const'.  But no writes occur afterwards, so we can use ro_after_init.

Also, remove bogus EXPORT_SYMBOL, the only access comes from ip
stack, not from kernel modules.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-16 20:52:05 -08:00
Paolo Abeni
90d9308821 mptcp: constify a bunch of of helpers
A few pm-related helpers don't touch arguments which lacking
the const modifier, let's constify them.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-16 20:52:05 -08:00
Geliang Tang
af7939f390 mptcp: drop port parameter of mptcp_pm_add_addr_signal
Drop the port parameter of mptcp_pm_add_addr_signal() and reflect it to
avoid passing too many parameters.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-16 20:52:05 -08:00
Geliang Tang
742e2f36c0 mptcp: drop unneeded type casts for hmac
Drop the unneeded type casts to 'unsigned long long' for printing out the
hmac values in add_addr_hmac_valid() and subflow_thmac_valid().

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-16 20:52:04 -08:00
Geliang Tang
0799e21b5a mptcp: drop unused sk in mptcp_get_options
The parameter 'sk' became useless since the code using it was dropped
from mptcp_get_options() in the commit 8d548ea1dd ("mptcp: do not set
unconditionally csum_reqd on incoming opt"). Let's drop it.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-16 20:52:04 -08:00
Geliang Tang
d6ab5ea2a3 mptcp: add SNDTIMEO setsockopt support
Add setsockopt support for SO_SNDTIMEO_OLD and SO_SNDTIMEO_NEW to fix this
error reported by the mptcp bpf selftest:

 (network_helpers.c:64: errno: Operation not supported) Failed to set SO_SNDTIMEO
 test_mptcp:FAIL:115

 All error logs:

 (network_helpers.c:64: errno: Operation not supported) Failed to set SO_SNDTIMEO
 test_mptcp:FAIL:115
 Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-16 20:52:03 -08:00
Jakub Kicinski
5b91c5cc0e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-10 17:29:56 -08:00
Kishen Maloor
029744cd4b mptcp: netlink: process IPv6 addrs in creating listening sockets
This change updates mptcp_pm_nl_create_listen_socket() to create
listening sockets bound to IPv6 addresses (where IPv6 is supported).

Fixes: 1729cf186d ("mptcp: create the listening socket for new port")
Acked-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-09 21:44:35 -08:00
Geliang Tang
09f12c3ab7 mptcp: allow to use port and non-signal in set_flags
It's illegal to use both port and non-signal flags for adding address.
But it's legal to use both of them for setting flags, which always uses
non-signal flags, backup or fullmesh.

This patch moves this non-signal flag with port check from
mptcp_pm_parse_addr() to mptcp_nl_cmd_add_addr(). Do the check only when
adding addresses, not setting flags or deleting addresses.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-04 20:30:23 -08:00
Geliang Tang
73c762c1f0 mptcp: set fullmesh flag in pm_netlink
This patch added the fullmesh flag setting support in pm_netlink.

If the fullmesh flag of the address is changed, remove all the related
subflows, update the fullmesh flag and create subflows again.

Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03 11:44:08 +00:00
Geliang Tang
9ddd1cac6f mptcp: print out reset infos of MP_RST
This patch printed out the reset infos, reset_transient and reset_reason,
of MP_RST in mptcp_parse_option() to show that MP_RST is received.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03 11:44:08 +00:00
Matthieu Baerts
8cca39e251 mptcp: clarify when options can be used
RFC8684 doesn't seem to clearly specify which MPTCP options can be used
together.

Some options are mutually exclusive -- e.g. MP_CAPABLE and MP_JOIN --,
some can be used together -- e.g. DSS + MP_PRIO --, some can but we
prefer not to -- e.g. DSS + ADD_ADDR -- and some have to be used
together at some points -- e.g. MP_FAIL and DSS.

We need to clarify this as a base before allowing other modifications.

For example, does it make sense to send a RM_ADDR with an MPC or MPJ?
This remains open for possible future discussions.

Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03 11:44:08 +00:00
Matthieu Baerts
902c8f8648 mptcp: reduce branching when writing MP_FAIL option
MP_FAIL should be use in very rare cases, either when the TCP RST flag
is set -- with or without an MP_RST -- or with a DSS, see
mptcp_established_options().

Here, we do the same in mptcp_write_options().

Co-developed-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03 11:44:08 +00:00
Geliang Tang
d7889cfa0b mptcp: move the declarations of ssk and subflow
Move the declarations of ssk and subflow in MP_FAIL and MP_PRIO to the
beginning of the function mptcp_write_options().

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03 11:44:08 +00:00
Kees Cook
63ec72bd58 mptcp: Use struct_group() to avoid cross-field memset()
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields.

Use struct_group() to capture the fields to be reset, so that memset()
can be appropriately bounds-checked by the compiler.

Cc: Matthieu Baerts <matthieu.baerts@tessares.net>
Cc: mptcp@lists.linux.dev
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Link: https://lore.kernel.org/r/20220121073935.1154263-1-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-21 19:26:03 -08:00
Geliang Tang
a4c0214fbe mptcp: fix removing ids bitmap setting
In mptcp_pm_nl_rm_addr_or_subflow(), the bit of rm_list->ids[i] in the
id_avail_bitmap should be set, not rm_list->ids[1]. This patch fixed it.

Fixes: 86e39e0448 ("mptcp: keep track of local endpoint still available for each msk")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-20 20:24:01 -08:00
Paolo Abeni
8e9eacad7e mptcp: fix msk traversal in mptcp_nl_cmd_set_flags()
The MPTCP endpoint list is under RCU protection, guarded by the
pernet spinlock. mptcp_nl_cmd_set_flags() traverses the list
without acquiring the spin-lock nor under the RCU critical section.

This change addresses the issue performing the lookup and the endpoint
update under the pernet spinlock.

Fixes: 0f9f696a50 ("mptcp: add set_flags command in PM netlink")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-20 20:24:01 -08:00