Commit Graph

1566 Commits

Author SHA1 Message Date
Tuong Lien
91986ee166 tipc: fix link session and re-establish issues
When a link endpoint is re-created (e.g. after a node reboot or
interface reset), the link session number is varied by random, the peer
endpoint will be synced with this new session number before the link is
re-established.

However, there is a shortcoming in this mechanism that can lead to the
link never re-established or faced with a failure then. It happens when
the peer endpoint is ready in ESTABLISHING state, the 'peer_session' as
well as the 'in_session' flag have been set, but suddenly this link
endpoint leaves. When it comes back with a random session number, there
are two situations possible:

1/ If the random session number is larger than (or equal to) the
previous one, the peer endpoint will be updated with this new session
upon receipt of a RESET_MSG from this endpoint, and the link can be re-
established as normal. Otherwise, all the RESET_MSGs from this endpoint
will be rejected by the peer. In turn, when this link endpoint receives
one ACTIVATE_MSG from the peer, it will move to ESTABLISHED and start
to send STATE_MSGs, but again these messages will be dropped by the
peer due to wrong session.
The peer link endpoint can still become ESTABLISHED after receiving a
traffic message from this endpoint (e.g. a BCAST_PROTOCOL or
NAME_DISTRIBUTOR), but since all the STATE_MSGs are invalid, the link
will be forced down sooner or later!

Even in case the random session number is larger than the previous one,
it can be that the ACTIVATE_MSG from the peer arrives first, and this
link endpoint moves quickly to ESTABLISHED without sending out any
RESET_MSG yet. Consequently, the peer link will not be updated with the
new session number, and the same link failure scenario as above will
happen.

2/ Another situation can be that, the peer link endpoint was reset due
to any reasons in the meantime, its link state was set to RESET from
ESTABLISHING but still in session, i.e. the 'in_session' flag is not
reset...
Now, if the random session number from this endpoint is less than the
previous one, all the RESET_MSGs from this endpoint will be rejected by
the peer. In the other direction, when this link endpoint receives a
RESET_MSG from the peer, it moves to ESTABLISHING and starts to send
ACTIVATE_MSGs, but all these messages will be rejected by the peer too.
As a result, the link cannot be re-established but gets stuck with this
link endpoint in state ESTABLISHING and the peer in RESET!

Solution:

===========

This link endpoint should not go directly to ESTABLISHED when getting
ACTIVATE_MSG from the peer which may belong to the old session if the
link was re-created. To ensure the session to be correct before the
link is re-established, the peer endpoint in ESTABLISHING state will
send back the last session number in ACTIVATE_MSG for a verification at
this endpoint. Then, if needed, a new and more appropriate session
number will be regenerated to force a re-synch first.

In addition, when a link in ESTABLISHING state is reset, its state will
move to RESET according to the link FSM, along with resetting the
'in_session' flag (and the other data) as a normal link reset, it will
also be deleted if requested.

The solution is backward compatible.

Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-11 21:26:20 -08:00
Hoang Le
7384b538d3 tipc: fix skb may be leaky in tipc_link_input
When we free skb at tipc_data_input, we return a 'false' boolean.
Then, skb passed to subcalling tipc_link_input in tipc_link_rcv,

<snip>
1303 int tipc_link_rcv:
...
1354    if (!tipc_data_input(l, skb, l->inputq))
1355        rc |= tipc_link_input(l, skb, l->inputq);
</snip>

Fix it by simple changing to a 'true' boolean when skb is being free-ed.
Then, tipc_link_rcv will bypassed to subcalling tipc_link_input as above
condition.

Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <maloy@donjonn.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-11 18:35:52 -08:00
Zhaolong Zhang
8eab6dac8d tipc: remove dead code in struct tipc_topsrv
max_rcvbuf_size is no longer used since commit "414574a0af36".

Signed-off-by: Zhaolong Zhang <zhangzl2013@126.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-24 22:31:23 -08:00
Gustavo A. R. Silva
f79e3365bc tipc: mark expected switch fall-throughs
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

This patch fixes the following warnings:

net/tipc/link.c:1125:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
net/tipc/socket.c:736:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
net/tipc/socket.c:2418:7: warning: this statement may fall through [-Wimplicit-fallthrough=]

Warning level 3 was used: -Wimplicit-fallthrough=3

This patch is part of the ongoing efforts to enabling
-Wimplicit-fallthrough.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-23 09:06:35 -08:00
David S. Miller
fa7f3a8d56 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Completely minor snmp doc conflict.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-21 14:41:32 -08:00
YueHaibing
d4fb30f6f1 tipc: remove unneeded semicolon in trace.c
Remove unneeded semicolon

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-17 22:04:43 -08:00
Ying Xue
2753ca5d90 tipc: fix uninit-value in tipc_nl_compat_doit
BUG: KMSAN: uninit-value in tipc_nl_compat_doit+0x404/0xa10 net/tipc/netlink_compat.c:335
CPU: 0 PID: 4514 Comm: syz-executor485 Not tainted 4.16.0+ #87
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:17 [inline]
 dump_stack+0x185/0x1d0 lib/dump_stack.c:53
 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:683
 tipc_nl_compat_doit+0x404/0xa10 net/tipc/netlink_compat.c:335
 tipc_nl_compat_recv+0x164b/0x2700 net/tipc/netlink_compat.c:1153
 genl_family_rcv_msg net/netlink/genetlink.c:599 [inline]
 genl_rcv_msg+0x1686/0x1810 net/netlink/genetlink.c:624
 netlink_rcv_skb+0x378/0x600 net/netlink/af_netlink.c:2447
 genl_rcv+0x63/0x80 net/netlink/genetlink.c:635
 netlink_unicast_kernel net/netlink/af_netlink.c:1311 [inline]
 netlink_unicast+0x166b/0x1740 net/netlink/af_netlink.c:1337
 netlink_sendmsg+0x1048/0x1310 net/netlink/af_netlink.c:1900
 sock_sendmsg_nosec net/socket.c:630 [inline]
 sock_sendmsg net/socket.c:640 [inline]
 ___sys_sendmsg+0xec0/0x1310 net/socket.c:2046
 __sys_sendmsg net/socket.c:2080 [inline]
 SYSC_sendmsg+0x2a3/0x3d0 net/socket.c:2091
 SyS_sendmsg+0x54/0x80 net/socket.c:2087
 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x43fda9
RSP: 002b:00007ffd0c184ba8 EFLAGS: 00000213 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043fda9
RDX: 0000000000000000 RSI: 0000000020023000 RDI: 0000000000000003
RBP: 00000000006ca018 R08: 00000000004002c8 R09: 00000000004002c8
R10: 00000000004002c8 R11: 0000000000000213 R12: 00000000004016d0
R13: 0000000000401760 R14: 0000000000000000 R15: 0000000000000000

Uninit was created at:
 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
 kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188
 kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314
 kmsan_slab_alloc+0x11/0x20 mm/kmsan/kmsan.c:321
 slab_post_alloc_hook mm/slab.h:445 [inline]
 slab_alloc_node mm/slub.c:2737 [inline]
 __kmalloc_node_track_caller+0xaed/0x11c0 mm/slub.c:4369
 __kmalloc_reserve net/core/skbuff.c:138 [inline]
 __alloc_skb+0x2cf/0x9f0 net/core/skbuff.c:206
 alloc_skb include/linux/skbuff.h:984 [inline]
 netlink_alloc_large_skb net/netlink/af_netlink.c:1183 [inline]
 netlink_sendmsg+0x9a6/0x1310 net/netlink/af_netlink.c:1875
 sock_sendmsg_nosec net/socket.c:630 [inline]
 sock_sendmsg net/socket.c:640 [inline]
 ___sys_sendmsg+0xec0/0x1310 net/socket.c:2046
 __sys_sendmsg net/socket.c:2080 [inline]
 SYSC_sendmsg+0x2a3/0x3d0 net/socket.c:2091
 SyS_sendmsg+0x54/0x80 net/socket.c:2087
 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x3d/0xa2

In tipc_nl_compat_recv(), when the len variable returned by
nlmsg_attrlen() is 0, the message is still treated as a valid one,
which is obviously unresonable. When len is zero, it means the
message not only doesn't contain any valid TLV payload, but also
TLV header is not included. Under this stituation, tlv_type field
in TLV header is still accessed in tipc_nl_compat_dumpit() or
tipc_nl_compat_doit(), but the field space is obviously illegal.
Of course, it is not initialized.

Reported-by: syzbot+bca0dc46634781f08b38@syzkaller.appspotmail.com
Reported-by: syzbot+6bdb590321a7ae40c1a6@syzkaller.appspotmail.com
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-15 20:29:21 -08:00
Ying Xue
974cb0e3e7 tipc: fix uninit-value in tipc_nl_compat_name_table_dump
syzbot reported:

BUG: KMSAN: uninit-value in __arch_swab32 arch/x86/include/uapi/asm/swab.h:10 [inline]
BUG: KMSAN: uninit-value in __fswab32 include/uapi/linux/swab.h:59 [inline]
BUG: KMSAN: uninit-value in tipc_nl_compat_name_table_dump+0x4a8/0xba0 net/tipc/netlink_compat.c:826
CPU: 0 PID: 6290 Comm: syz-executor848 Not tainted 4.19.0-rc8+ #70
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x306/0x460 lib/dump_stack.c:113
 kmsan_report+0x1a2/0x2e0 mm/kmsan/kmsan.c:917
 __msan_warning+0x7c/0xe0 mm/kmsan/kmsan_instr.c:500
 __arch_swab32 arch/x86/include/uapi/asm/swab.h:10 [inline]
 __fswab32 include/uapi/linux/swab.h:59 [inline]
 tipc_nl_compat_name_table_dump+0x4a8/0xba0 net/tipc/netlink_compat.c:826
 __tipc_nl_compat_dumpit+0x59e/0xdb0 net/tipc/netlink_compat.c:205
 tipc_nl_compat_dumpit+0x63a/0x820 net/tipc/netlink_compat.c:270
 tipc_nl_compat_handle net/tipc/netlink_compat.c:1151 [inline]
 tipc_nl_compat_recv+0x1402/0x2760 net/tipc/netlink_compat.c:1210
 genl_family_rcv_msg net/netlink/genetlink.c:601 [inline]
 genl_rcv_msg+0x185c/0x1a20 net/netlink/genetlink.c:626
 netlink_rcv_skb+0x394/0x640 net/netlink/af_netlink.c:2454
 genl_rcv+0x63/0x80 net/netlink/genetlink.c:637
 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
 netlink_unicast+0x166d/0x1720 net/netlink/af_netlink.c:1343
 netlink_sendmsg+0x1391/0x1420 net/netlink/af_netlink.c:1908
 sock_sendmsg_nosec net/socket.c:621 [inline]
 sock_sendmsg net/socket.c:631 [inline]
 ___sys_sendmsg+0xe47/0x1200 net/socket.c:2116
 __sys_sendmsg net/socket.c:2154 [inline]
 __do_sys_sendmsg net/socket.c:2163 [inline]
 __se_sys_sendmsg+0x307/0x460 net/socket.c:2161
 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
 do_syscall_64+0xbe/0x100 arch/x86/entry/common.c:291
 entry_SYSCALL_64_after_hwframe+0x63/0xe7
RIP: 0033:0x440179
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffecec49318 EFLAGS: 00000213 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440179
RDX: 0000000000000000 RSI: 0000000020000100 RDI: 0000000000000003
RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8
R10: 0000000000000000 R11: 0000000000000213 R12: 0000000000401a00
R13: 0000000000401a90 R14: 0000000000000000 R15: 0000000000000000

Uninit was created at:
 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:255 [inline]
 kmsan_internal_poison_shadow+0xc8/0x1d0 mm/kmsan/kmsan.c:180
 kmsan_kmalloc+0xa4/0x120 mm/kmsan/kmsan_hooks.c:104
 kmsan_slab_alloc+0x10/0x20 mm/kmsan/kmsan_hooks.c:113
 slab_post_alloc_hook mm/slab.h:446 [inline]
 slab_alloc_node mm/slub.c:2727 [inline]
 __kmalloc_node_track_caller+0xb43/0x1400 mm/slub.c:4360
 __kmalloc_reserve net/core/skbuff.c:138 [inline]
 __alloc_skb+0x422/0xe90 net/core/skbuff.c:206
 alloc_skb include/linux/skbuff.h:996 [inline]
 netlink_alloc_large_skb net/netlink/af_netlink.c:1189 [inline]
 netlink_sendmsg+0xcaf/0x1420 net/netlink/af_netlink.c:1883
 sock_sendmsg_nosec net/socket.c:621 [inline]
 sock_sendmsg net/socket.c:631 [inline]
 ___sys_sendmsg+0xe47/0x1200 net/socket.c:2116
 __sys_sendmsg net/socket.c:2154 [inline]
 __do_sys_sendmsg net/socket.c:2163 [inline]
 __se_sys_sendmsg+0x307/0x460 net/socket.c:2161
 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
 do_syscall_64+0xbe/0x100 arch/x86/entry/common.c:291
 entry_SYSCALL_64_after_hwframe+0x63/0xe7

We cannot take for granted the thing that the length of data contained
in TLV is longer than the size of struct tipc_name_table_query in
tipc_nl_compat_name_table_dump().

Reported-by: syzbot+06e771a754829716a327@syzkaller.appspotmail.com
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-15 20:29:20 -08:00
Ying Xue
edf5ff04a4 tipc: fix uninit-value in tipc_nl_compat_link_set
syzbot reports following splat:

BUG: KMSAN: uninit-value in strlen+0x3b/0xa0 lib/string.c:486
CPU: 1 PID: 9306 Comm: syz-executor172 Not tainted 4.20.0-rc7+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x173/0x1d0 lib/dump_stack.c:113
  kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613
  __msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:313
  strlen+0x3b/0xa0 lib/string.c:486
  nla_put_string include/net/netlink.h:1154 [inline]
  __tipc_nl_compat_link_set net/tipc/netlink_compat.c:708 [inline]
  tipc_nl_compat_link_set+0x929/0x1220 net/tipc/netlink_compat.c:744
  __tipc_nl_compat_doit net/tipc/netlink_compat.c:311 [inline]
  tipc_nl_compat_doit+0x3aa/0xaf0 net/tipc/netlink_compat.c:344
  tipc_nl_compat_handle net/tipc/netlink_compat.c:1107 [inline]
  tipc_nl_compat_recv+0x14d7/0x2760 net/tipc/netlink_compat.c:1210
  genl_family_rcv_msg net/netlink/genetlink.c:601 [inline]
  genl_rcv_msg+0x185f/0x1a60 net/netlink/genetlink.c:626
  netlink_rcv_skb+0x444/0x640 net/netlink/af_netlink.c:2477
  genl_rcv+0x63/0x80 net/netlink/genetlink.c:637
  netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
  netlink_unicast+0xf40/0x1020 net/netlink/af_netlink.c:1336
  netlink_sendmsg+0x127f/0x1300 net/netlink/af_netlink.c:1917
  sock_sendmsg_nosec net/socket.c:621 [inline]
  sock_sendmsg net/socket.c:631 [inline]
  ___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2116
  __sys_sendmsg net/socket.c:2154 [inline]
  __do_sys_sendmsg net/socket.c:2163 [inline]
  __se_sys_sendmsg+0x305/0x460 net/socket.c:2161
  __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
  do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
  entry_SYSCALL_64_after_hwframe+0x63/0xe7

The uninitialised access happened in
    nla_put_string(skb, TIPC_NLA_LINK_NAME, lc->name)

This is because lc->name string is not validated before it's used.

Reported-by: syzbot+d78b8a29241a195aefb8@syzkaller.appspotmail.com
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-15 20:29:20 -08:00
Ying Xue
0762216c0a tipc: fix uninit-value in tipc_nl_compat_bearer_enable
syzbot reported:

BUG: KMSAN: uninit-value in strlen+0x3b/0xa0 lib/string.c:484
CPU: 1 PID: 6371 Comm: syz-executor652 Not tainted 4.19.0-rc8+ #70
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x306/0x460 lib/dump_stack.c:113
 kmsan_report+0x1a2/0x2e0 mm/kmsan/kmsan.c:917
 __msan_warning+0x7c/0xe0 mm/kmsan/kmsan_instr.c:500
 strlen+0x3b/0xa0 lib/string.c:484
 nla_put_string include/net/netlink.h:1011 [inline]
 tipc_nl_compat_bearer_enable+0x238/0x7b0 net/tipc/netlink_compat.c:389
 __tipc_nl_compat_doit net/tipc/netlink_compat.c:311 [inline]
 tipc_nl_compat_doit+0x39f/0xae0 net/tipc/netlink_compat.c:344
 tipc_nl_compat_recv+0x147c/0x2760 net/tipc/netlink_compat.c:1107
 genl_family_rcv_msg net/netlink/genetlink.c:601 [inline]
 genl_rcv_msg+0x185c/0x1a20 net/netlink/genetlink.c:626
 netlink_rcv_skb+0x394/0x640 net/netlink/af_netlink.c:2454
 genl_rcv+0x63/0x80 net/netlink/genetlink.c:637
 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
 netlink_unicast+0x166d/0x1720 net/netlink/af_netlink.c:1343
 netlink_sendmsg+0x1391/0x1420 net/netlink/af_netlink.c:1908
 sock_sendmsg_nosec net/socket.c:621 [inline]
 sock_sendmsg net/socket.c:631 [inline]
 ___sys_sendmsg+0xe47/0x1200 net/socket.c:2116
 __sys_sendmsg net/socket.c:2154 [inline]
 __do_sys_sendmsg net/socket.c:2163 [inline]
 __se_sys_sendmsg+0x307/0x460 net/socket.c:2161
 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
 do_syscall_64+0xbe/0x100 arch/x86/entry/common.c:291
 entry_SYSCALL_64_after_hwframe+0x63/0xe7
RIP: 0033:0x440179
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fffef7beee8 EFLAGS: 00000213 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440179
RDX: 0000000000000000 RSI: 0000000020000100 RDI: 0000000000000003
RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8
R10: 0000000000000000 R11: 0000000000000213 R12: 0000000000401a00
R13: 0000000000401a90 R14: 0000000000000000 R15: 0000000000000000

Uninit was created at:
 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:255 [inline]
 kmsan_internal_poison_shadow+0xc8/0x1d0 mm/kmsan/kmsan.c:180
 kmsan_kmalloc+0xa4/0x120 mm/kmsan/kmsan_hooks.c:104
 kmsan_slab_alloc+0x10/0x20 mm/kmsan/kmsan_hooks.c:113
 slab_post_alloc_hook mm/slab.h:446 [inline]
 slab_alloc_node mm/slub.c:2727 [inline]
 __kmalloc_node_track_caller+0xb43/0x1400 mm/slub.c:4360
 __kmalloc_reserve net/core/skbuff.c:138 [inline]
 __alloc_skb+0x422/0xe90 net/core/skbuff.c:206
 alloc_skb include/linux/skbuff.h:996 [inline]
 netlink_alloc_large_skb net/netlink/af_netlink.c:1189 [inline]
 netlink_sendmsg+0xcaf/0x1420 net/netlink/af_netlink.c:1883
 sock_sendmsg_nosec net/socket.c:621 [inline]
 sock_sendmsg net/socket.c:631 [inline]
 ___sys_sendmsg+0xe47/0x1200 net/socket.c:2116
 __sys_sendmsg net/socket.c:2154 [inline]
 __do_sys_sendmsg net/socket.c:2163 [inline]
 __se_sys_sendmsg+0x307/0x460 net/socket.c:2161
 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
 do_syscall_64+0xbe/0x100 arch/x86/entry/common.c:291
 entry_SYSCALL_64_after_hwframe+0x63/0xe7

The root cause is that we don't validate whether bear name is a valid
string in tipc_nl_compat_bearer_enable().

Meanwhile, we also fix the same issue in the following functions:
tipc_nl_compat_bearer_disable()
tipc_nl_compat_link_stat_dump()
tipc_nl_compat_media_set()
tipc_nl_compat_bearer_set()

Reported-by: syzbot+b33d5cae0efd35dbfe77@syzkaller.appspotmail.com
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-15 20:29:20 -08:00
Ying Xue
8b66fee7f8 tipc: fix uninit-value in tipc_nl_compat_link_reset_stats
syzbot reports following splat:

BUG: KMSAN: uninit-value in strlen+0x3b/0xa0 lib/string.c:486
CPU: 1 PID: 11057 Comm: syz-executor0 Not tainted 4.20.0-rc7+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x173/0x1d0 lib/dump_stack.c:113
 kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613
 __msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:295
 strlen+0x3b/0xa0 lib/string.c:486
 nla_put_string include/net/netlink.h:1154 [inline]
 tipc_nl_compat_link_reset_stats+0x1f0/0x360 net/tipc/netlink_compat.c:760
 __tipc_nl_compat_doit net/tipc/netlink_compat.c:311 [inline]
 tipc_nl_compat_doit+0x3aa/0xaf0 net/tipc/netlink_compat.c:344
 tipc_nl_compat_handle net/tipc/netlink_compat.c:1107 [inline]
 tipc_nl_compat_recv+0x14d7/0x2760 net/tipc/netlink_compat.c:1210
 genl_family_rcv_msg net/netlink/genetlink.c:601 [inline]
 genl_rcv_msg+0x185f/0x1a60 net/netlink/genetlink.c:626
 netlink_rcv_skb+0x444/0x640 net/netlink/af_netlink.c:2477
 genl_rcv+0x63/0x80 net/netlink/genetlink.c:637
 netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
 netlink_unicast+0xf40/0x1020 net/netlink/af_netlink.c:1336
 netlink_sendmsg+0x127f/0x1300 net/netlink/af_netlink.c:1917
 sock_sendmsg_nosec net/socket.c:621 [inline]
 sock_sendmsg net/socket.c:631 [inline]
 ___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2116
 __sys_sendmsg net/socket.c:2154 [inline]
 __do_sys_sendmsg net/socket.c:2163 [inline]
 __se_sys_sendmsg+0x305/0x460 net/socket.c:2161
 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
 entry_SYSCALL_64_after_hwframe+0x63/0xe7
RIP: 0033:0x457ec9
Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f2557338c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457ec9
RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000003
RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f25573396d4
R13: 00000000004cb478 R14: 00000000004d86c8 R15: 00000000ffffffff

Uninit was created at:
 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:204 [inline]
 kmsan_internal_poison_shadow+0x92/0x150 mm/kmsan/kmsan.c:158
 kmsan_kmalloc+0xa6/0x130 mm/kmsan/kmsan_hooks.c:176
 kmsan_slab_alloc+0xe/0x10 mm/kmsan/kmsan_hooks.c:185
 slab_post_alloc_hook mm/slab.h:446 [inline]
 slab_alloc_node mm/slub.c:2759 [inline]
 __kmalloc_node_track_caller+0xe18/0x1030 mm/slub.c:4383
 __kmalloc_reserve net/core/skbuff.c:137 [inline]
 __alloc_skb+0x309/0xa20 net/core/skbuff.c:205
 alloc_skb include/linux/skbuff.h:998 [inline]
 netlink_alloc_large_skb net/netlink/af_netlink.c:1182 [inline]
 netlink_sendmsg+0xb82/0x1300 net/netlink/af_netlink.c:1892
 sock_sendmsg_nosec net/socket.c:621 [inline]
 sock_sendmsg net/socket.c:631 [inline]
 ___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2116
 __sys_sendmsg net/socket.c:2154 [inline]
 __do_sys_sendmsg net/socket.c:2163 [inline]
 __se_sys_sendmsg+0x305/0x460 net/socket.c:2161
 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
 entry_SYSCALL_64_after_hwframe+0x63/0xe7

The uninitialised access happened in tipc_nl_compat_link_reset_stats:
    nla_put_string(skb, TIPC_NLA_LINK_NAME, name)

This is because name string is not validated before it's used.

Reported-by: syzbot+e01d94b5a4c266be6e4c@syzkaller.appspotmail.com
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-15 20:29:20 -08:00
Ying Xue
a88289f4dd tipc: fix uninit-value in in tipc_conn_rcv_sub
syzbot reported:

BUG: KMSAN: uninit-value in tipc_conn_rcv_sub+0x184/0x950 net/tipc/topsrv.c:373
CPU: 0 PID: 66 Comm: kworker/u4:4 Not tainted 4.17.0-rc3+ #88
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: tipc_rcv tipc_conn_recv_work
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x185/0x1d0 lib/dump_stack.c:113
 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:683
 tipc_conn_rcv_sub+0x184/0x950 net/tipc/topsrv.c:373
 tipc_conn_rcv_from_sock net/tipc/topsrv.c:409 [inline]
 tipc_conn_recv_work+0x3cd/0x560 net/tipc/topsrv.c:424
 process_one_work+0x12c6/0x1f60 kernel/workqueue.c:2145
 worker_thread+0x113c/0x24f0 kernel/workqueue.c:2279
 kthread+0x539/0x720 kernel/kthread.c:239
 ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:412

Local variable description: ----s.i@tipc_conn_recv_work
Variable was created at:
 tipc_conn_recv_work+0x65/0x560 net/tipc/topsrv.c:419
 process_one_work+0x12c6/0x1f60 kernel/workqueue.c:2145

In tipc_conn_rcv_from_sock(), it always supposes the length of message
received from sock_recvmsg() is not smaller than the size of struct
tipc_subscr. However, this assumption is false. Especially when the
length of received message is shorter than struct tipc_subscr size,
we will end up touching uninitialized fields in tipc_conn_rcv_sub().

Reported-by: syzbot+8951a3065ee7fd6d6e23@syzkaller.appspotmail.com
Reported-by: syzbot+75e6e042c5bbf691fc82@syzkaller.appspotmail.com
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-15 20:29:20 -08:00
Gustavo A. R. Silva
f87d8ad923 tipc: fix memory leak in tipc_nl_compat_publ_dump
There is a memory leak in case genlmsg_put fails.

Fix this by freeing *args* before return.

Addresses-Coverity-ID: 1476406 ("Resource leak")
Fixes: 46273cf7e0 ("tipc: fix a missing check of genlmsg_put")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-07 11:42:08 -05:00
Kangjie Lu
46273cf7e0 tipc: fix a missing check of genlmsg_put
genlmsg_put could fail. The fix inserts a check of its return value, and
if it fails, returns -EMSGSIZE.

Signed-off-by: Kangjie Lu <kjlu@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-27 16:26:09 -08:00
Cong Wang
dc4501ff28 tipc: fix a double free in tipc_enable_bearer()
bearer_disable() already calls kfree_rcu() to free struct tipc_bearer,
we don't need to call kfree() again.

Fixes: cb30a63384 ("tipc: refactor function tipc_enable_bearer()")
Reported-by: syzbot+b981acf1fb240c0c128b@syzkaller.appspotmail.com
Cc: Ying Xue <ying.xue@windriver.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-27 16:16:17 -08:00
David S. Miller
90cadbbf34 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull in bug fixes before respinning my net-next pull
request.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-24 16:19:56 -08:00
Aditya Pakki
89dfd00837 net/netlink_compat: Fix a missing check of nla_parse_nested
In tipc_nl_compat_sk_dump(), if nla_parse_nested() fails, it could return
an error. To be consistent with other invocations of the function call,
on error, the fix passes the return value upstream.

Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-24 14:42:11 -08:00
David S. Miller
2be09de7d6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Lots of conflicts, by happily all cases of overlapping
changes, parallel adds, things of that nature.

Thanks to Stephen Rothwell, Saeed Mahameed, and others
for their guidance in these resolutions.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20 11:53:36 -08:00
Hoang Le
055722716c tipc: fix uninitialized value for broadcast retransmission
When sending broadcast message on high load system, there are a lot of
unnecessary packets restranmission. That issue was caused by missing in
initial criteria for retransmission.

To prevent this happen, just initialize this criteria for retransmission
in next 10 milliseconds.

Fixes: 31c4f4cc32 ("tipc: improve broadcast retransmission algorithm")
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 11:53:10 -08:00
Tuong Lien
cf5f55f7f0 tipc: add trace_events for tipc bearer
The commit adds the new trace_event for TIPC bearer, L2 device event:

trace_tipc_l2_device_event()

Also, it puts the trace at the tipc_l2_device_event() function, then
the device/bearer events and related info can be traced out during
runtime when needed.

Acked-by: Ying Xue <ying.xue@windriver.com>
Tested-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 11:49:25 -08:00
Tuong Lien
eb18a510b5 tipc: add trace_events for tipc node
The commit adds the new trace_events for TIPC node object:

trace_tipc_node_create()
trace_tipc_node_delete()
trace_tipc_node_lost_contact()
trace_tipc_node_timeout()
trace_tipc_node_link_up()
trace_tipc_node_link_down()
trace_tipc_node_reset_links()
trace_tipc_node_fsm_evt()
trace_tipc_node_check_state()

Also, enables the traces for the following cases:
- When a node is created/deleted;
- When a node contact is lost;
- When a node timer is timed out;
- When a node link is up/down;
- When all node links are reset;
- When node state is changed;
- When a skb comes and node state needs to be checked/updated.

Acked-by: Ying Xue <ying.xue@windriver.com>
Tested-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 11:49:24 -08:00
Tuong Lien
01e661ebfb tipc: add trace_events for tipc socket
The commit adds the new trace_events for TIPC socket object:

trace_tipc_sk_create()
trace_tipc_sk_poll()
trace_tipc_sk_sendmsg()
trace_tipc_sk_sendmcast()
trace_tipc_sk_sendstream()
trace_tipc_sk_filter_rcv()
trace_tipc_sk_advance_rx()
trace_tipc_sk_rej_msg()
trace_tipc_sk_drop_msg()
trace_tipc_sk_release()
trace_tipc_sk_shutdown()
trace_tipc_sk_overlimit1()
trace_tipc_sk_overlimit2()

Also, enables the traces for the following cases:
- When user creates a TIPC socket;
- When user calls poll() on TIPC socket;
- When user sends a dgram/mcast/stream message.
- When a message is put into the socket 'sk_receive_queue';
- When a message is released from the socket 'sk_receive_queue';
- When a message is rejected (e.g. due to no port, invalid, etc.);
- When a message is dropped (e.g. due to wrong message type);
- When socket is released;
- When socket is shutdown;
- When socket rcvq's allocation is overlimit (> 90%);
- When socket rcvq + bklq's allocation is overlimit (> 90%);
- When the 'TIPC_ERR_OVERLOAD/2' issue happens;

Note:
a) All the socket traces are designed to be able to trace on a specific
socket by either using the 'event filtering' feature on a known socket
'portid' value or the sysctl file:

/proc/sys/net/tipc/sk_filter

The file determines a 'tuple' for what socket should be traced:

(portid, sock type, name type, name lower, name upper)

where:
+ 'portid' is the socket portid generated at socket creating, can be
found in the trace outputs or the 'tipc socket list' command printouts;
+ 'sock type' is the socket type (1 = SOCK_TREAM, ...);
+ 'name type', 'name lower' and 'name upper' are the service name being
connected to or published by the socket.

Value '0' means 'ANY', the default tuple value is (0, 0, 0, 0, 0) i.e.
the traces happen for every sockets with no filter.

b) The 'tipc_sk_overlimit1/2' event is also a conditional trace_event
which happens when the socket receive queue (and backlog queue) is
about to be overloaded, when the queue allocation is > 90%. Then, when
the trace is enabled, the last skbs leading to the TIPC_ERR_OVERLOAD/2
issue can be traced.

The trace event is designed as an 'upper watermark' notification that
the other traces (e.g. 'tipc_sk_advance_rx' vs 'tipc_sk_filter_rcv') or
actions can be triggerred in the meanwhile to see what is going on with
the socket queue.

In addition, the 'trace_tipc_sk_dump()' is also placed at the
'TIPC_ERR_OVERLOAD/2' case, so the socket and last skb can be dumped
for post-analysis.

Acked-by: Ying Xue <ying.xue@windriver.com>
Tested-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 11:49:24 -08:00
Tuong Lien
26574db0c1 tipc: add trace_events for tipc link
The commit adds the new trace_events for TIPC link object:

trace_tipc_link_timeout()
trace_tipc_link_fsm()
trace_tipc_link_reset()
trace_tipc_link_too_silent()
trace_tipc_link_retrans()
trace_tipc_link_bc_ack()
trace_tipc_link_conges()

And the traces for PROTOCOL messages at building and receiving:

trace_tipc_proto_build()
trace_tipc_proto_rcv()

Note:
a) The 'tipc_link_too_silent' event will only happen when the
'silent_intv_cnt' is about to reach the 'abort_limit' value (and the
event is enabled). The benefit for this kind of event is that we can
get an early indication about TIPC link loss issue due to timeout, then
can do some necessary actions for troubleshooting.

For example: To trigger the 'tipc_proto_rcv' when the 'too_silent'
event occurs:

echo 'enable_event:tipc:tipc_proto_rcv' > \
      events/tipc/tipc_link_too_silent/trigger

And disable it when TIPC link is reset:

echo 'disable_event:tipc:tipc_proto_rcv' > \
      events/tipc/tipc_link_reset/trigger

b) The 'tipc_link_retrans' or 'tipc_link_bc_ack' event is useful to
trace TIPC retransmission issues.

In addition, the commit adds the 'trace_tipc_list/link_dump()' at the
'retransmission failure' case. Then, if the issue occurs, the link
'transmq' along with the link data can be dumped for post-analysis.
These dump events should be enabled by default since it will only take
effect when the failure happens.

The same approach is also applied for the faulty case that the
validation of protocol message is failed.

Acked-by: Ying Xue <ying.xue@windriver.com>
Tested-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 11:49:24 -08:00
Tuong Lien
b4b9771bcb tipc: enable tracepoints in tipc
As for the sake of debugging/tracing, the commit enables tracepoints in
TIPC along with some general trace_events as shown below. It also
defines some 'tipc_*_dump()' functions that allow to dump TIPC object
data whenever needed, that is, for general debug purposes, ie. not just
for the trace_events.

The following trace_events are now available:

- trace_tipc_skb_dump(): allows to trace and dump TIPC msg & skb data,
  e.g. message type, user, droppable, skb truesize, cloned skb, etc.

- trace_tipc_list_dump(): allows to trace and dump any TIPC buffers or
  queues, e.g. TIPC link transmq, socket receive queue, etc.

- trace_tipc_sk_dump(): allows to trace and dump TIPC socket data, e.g.
  sk state, sk type, connection type, rmem_alloc, socket queues, etc.

- trace_tipc_link_dump(): allows to trace and dump TIPC link data, e.g.
  link state, silent_intv_cnt, gap, bc_gap, link queues, etc.

- trace_tipc_node_dump(): allows to trace and dump TIPC node data, e.g.
  node state, active links, capabilities, link entries, etc.

How to use:
Put the trace functions at any places where we want to dump TIPC data
or events.

Note:
a) The dump functions will generate raw data only, that is, to offload
the trace event's processing, it can require a tool or script to parse
the data but this should be simple.

b) The trace_tipc_*_dump() should be reserved for a failure cases only
(e.g. the retransmission failure case) or where we do not expect to
happen too often, then we can consider enabling these events by default
since they will almost not take any effects under normal conditions,
but once the rare condition or failure occurs, we get the dumped data
fully for post-analysis.

For other trace purposes, we can reuse these trace classes as template
but different events.

c) A trace_event is only effective when we enable it. To enable the
TIPC trace_events, echo 1 to 'enable' files in the events/tipc/
directory in the 'debugfs' file system. Normally, they are located at:

/sys/kernel/debug/tracing/events/tipc/

For example:

To enable the tipc_link_dump event:

echo 1 > /sys/kernel/debug/tracing/events/tipc/tipc_link_dump/enable

To enable all the TIPC trace_events:

echo 1 > /sys/kernel/debug/tracing/events/tipc/enable

To collect the trace data:

cat trace

or

cat trace_pipe > /trace.out &

To disable all the TIPC trace_events:

echo 0 > /sys/kernel/debug/tracing/events/tipc/enable

To clear the trace buffer:

echo > trace

d) Like the other trace_events, the feature like 'filter' or 'trigger'
is also usable for the tipc trace_events.
For more details, have a look at:

Documentation/trace/ftrace.txt

MAINTAINERS | add two new files 'trace.h' & 'trace.c' in tipc

Acked-by: Ying Xue <ying.xue@windriver.com>
Tested-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 11:49:24 -08:00
Zhenbo Gao
5679ee784c tipc: handle broadcast NAME_DISTRIBUTOR packet when receiving it
NAME_DISTRIBUTOR messages are transmitted through unicast link on TIPC
2.0, by contrast, the messages are delivered through broadcast link on
TIPC 1.7. But at present, NAME_DISTRIBUTOR messages received by
broadcast link cannot be handled in tipc_rcv() until an unicast message
arrives, which may lead to a significant delay to update name table.

To avoid this delay, we will also deal with broadcast NAME_DISTRIBUTOR
message on broadcast receive path.

Signed-off-by: Zhenbo Gao <zhenbo.gao@windriver.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 21:50:48 -08:00
Cong Wang
3c6306d440 tipc: check group dests after tipc_wait_for_cond()
Similar to commit 143ece654f ("tipc: check tsk->group in tipc_wait_for_cond()")
we have to reload grp->dests too after we re-take the sock lock.
This means we need to move the dsts check after tipc_wait_for_cond()
too.

Fixes: 75da2163db ("tipc: introduce communication groups")
Reported-and-tested-by: syzbot+99f20222fc5018d2b97a@syzkaller.appspotmail.com
Cc: Ying Xue <ying.xue@windriver.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 15:44:23 -08:00
Cong Wang
143ece654f tipc: check tsk->group in tipc_wait_for_cond()
tipc_wait_for_cond() drops socket lock before going to sleep,
but tsk->group could be freed right after that release_sock().
So we have to re-check and reload tsk->group after it wakes up.

After this patch, tipc_wait_for_cond() returns -ERESTARTSYS when
tsk->group is NULL, instead of continuing with the assumption of
a non-NULL tsk->group.

(It looks like 'dsts' should be re-checked and reloaded too, but
it is a different bug.)

Similar for tipc_send_group_unicast() and tipc_send_group_anycast().

Reported-by: syzbot+10a9db47c3a0e13eb31c@syzkaller.appspotmail.com
Fixes: b7d4263551 ("tipc: introduce flow control for group broadcast messages")
Fixes: ee106d7f94 ("tipc: introduce group anycast messaging")
Fixes: 27bd9ec027 ("tipc: introduce group unicast messaging")
Cc: Ying Xue <ying.xue@windriver.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14 15:48:49 -08:00
Cong Wang
fb83ed496b tipc: compare remote and local protocols in tipc_udp_enable()
When TIPC_NLA_UDP_REMOTE is an IPv6 mcast address but
TIPC_NLA_UDP_LOCAL is an IPv4 address, a NULL-ptr deref is triggered
as the UDP tunnel sock is initialized to IPv4 or IPv6 sock merely
based on the protocol in local address.

We should just error out when the remote address and local address
have different protocols.

Reported-by: syzbot+eb4da3a20fad2e52555d@syzkaller.appspotmail.com
Cc: Ying Xue <ying.xue@windriver.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14 13:28:03 -08:00
Cong Wang
acb4a33e98 tipc: fix a double kfree_skb()
tipc_udp_xmit() drops the packet on error, there is no
need to drop it again.

Fixes: ef20cd4dd1 ("tipc: introduce UDP replicast")
Reported-and-tested-by: syzbot+eae585ba2cc2752d3704@syzkaller.appspotmail.com
Cc: Ying Xue <ying.xue@windriver.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14 13:23:44 -08:00
Cong Wang
15ef70e286 tipc: use lock_sock() in tipc_sk_reinit()
lock_sock() must be used in process context to be race-free with
other lock_sock() callers, for example, tipc_release(). Otherwise
using the spinlock directly can't serialize a parallel tipc_release().

As it is blocking, we have to hold the sock refcnt before
rhashtable_walk_stop() and release it after rhashtable_walk_start().

Fixes: 07f6c4bc04 ("tipc: convert tipc reference table to use generic rhashtable")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Ying Xue <ying.xue@windriver.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14 13:16:33 -08:00
Hoang Le
f5d6c3e5a3 tipc: fix node keep alive interval calculation
When setting LINK tolerance, node timer interval will be calculated
base on the LINK with lowest tolerance.

But when calculated, the old node timer interval only updated if current
setting value (tolerance/4) less than old ones regardless of number of
links as well as links' lowest tolerance value.

This caused to two cases missing if tolerance changed as following:
Case 1:
1.1/ There is one link (L1) available in the system
1.2/ Set L1's tolerance from 1500ms => lower (i.e 500ms)
1.3/ Then, fallback to default (1500ms) or higher (i.e 2000ms)

Expected:
    node timer interval is 1500/4=375ms after 1.3

Result:
node timer interval will not being updated after changing tolerance at 1.3
since its value 1500/4=375ms is not less than 500/4=125ms at 1.2.

Case 2:
2.1/ There are two links (L1, L2) available in the system
2.2/ L1 and L2 tolerance value are 2000ms as initial
2.3/ Set L2's tolerance from 2000ms => lower 1500ms
2.4/ Disable link L2 (bring down its bearer)

Expected:
    node timer interval is 2000ms/4=500ms after 2.4

Result:
node timer interval will not being updated after disabling L2 since
its value 2000ms/4=500ms is still not less than 1500/4=375ms at 2.3
although L2 is already not available in the system.

To fix this, we start the node interval calculation by initializing it to
a value larger than any conceivable calculated value. This way, the link
with the lowest tolerance will always determine the calculated value.

Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-05 20:52:31 -08:00
David S. Miller
e561bb29b6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Trivial conflict in net/core/filter.c, a locally computed
'sdif' is now an argument to the function.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-28 22:10:54 -08:00
Jon Maloy
ec835f8912 tipc: fix lockdep warning during node delete
We see the following lockdep warning:

[ 2284.078521] ======================================================
[ 2284.078604] WARNING: possible circular locking dependency detected
[ 2284.078604] 4.19.0+ #42 Tainted: G            E
[ 2284.078604] ------------------------------------------------------
[ 2284.078604] rmmod/254 is trying to acquire lock:
[ 2284.078604] 00000000acd94e28 ((&n->timer)#2){+.-.}, at: del_timer_sync+0x5/0xa0
[ 2284.078604]
[ 2284.078604] but task is already holding lock:
[ 2284.078604] 00000000f997afc0 (&(&tn->node_list_lock)->rlock){+.-.}, at: tipc_node_stop+0xac/0x190 [tipc]
[ 2284.078604]
[ 2284.078604] which lock already depends on the new lock.
[ 2284.078604]
[ 2284.078604]
[ 2284.078604] the existing dependency chain (in reverse order) is:
[ 2284.078604]
[ 2284.078604] -> #1 (&(&tn->node_list_lock)->rlock){+.-.}:
[ 2284.078604]        tipc_node_timeout+0x20a/0x330 [tipc]
[ 2284.078604]        call_timer_fn+0xa1/0x280
[ 2284.078604]        run_timer_softirq+0x1f2/0x4d0
[ 2284.078604]        __do_softirq+0xfc/0x413
[ 2284.078604]        irq_exit+0xb5/0xc0
[ 2284.078604]        smp_apic_timer_interrupt+0xac/0x210
[ 2284.078604]        apic_timer_interrupt+0xf/0x20
[ 2284.078604]        default_idle+0x1c/0x140
[ 2284.078604]        do_idle+0x1bc/0x280
[ 2284.078604]        cpu_startup_entry+0x19/0x20
[ 2284.078604]        start_secondary+0x187/0x1c0
[ 2284.078604]        secondary_startup_64+0xa4/0xb0
[ 2284.078604]
[ 2284.078604] -> #0 ((&n->timer)#2){+.-.}:
[ 2284.078604]        del_timer_sync+0x34/0xa0
[ 2284.078604]        tipc_node_delete+0x1a/0x40 [tipc]
[ 2284.078604]        tipc_node_stop+0xcb/0x190 [tipc]
[ 2284.078604]        tipc_net_stop+0x154/0x170 [tipc]
[ 2284.078604]        tipc_exit_net+0x16/0x30 [tipc]
[ 2284.078604]        ops_exit_list.isra.8+0x36/0x70
[ 2284.078604]        unregister_pernet_operations+0x87/0xd0
[ 2284.078604]        unregister_pernet_subsys+0x1d/0x30
[ 2284.078604]        tipc_exit+0x11/0x6f2 [tipc]
[ 2284.078604]        __x64_sys_delete_module+0x1df/0x240
[ 2284.078604]        do_syscall_64+0x66/0x460
[ 2284.078604]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 2284.078604]
[ 2284.078604] other info that might help us debug this:
[ 2284.078604]
[ 2284.078604]  Possible unsafe locking scenario:
[ 2284.078604]
[ 2284.078604]        CPU0                    CPU1
[ 2284.078604]        ----                    ----
[ 2284.078604]   lock(&(&tn->node_list_lock)->rlock);
[ 2284.078604]                                lock((&n->timer)#2);
[ 2284.078604]                                lock(&(&tn->node_list_lock)->rlock);
[ 2284.078604]   lock((&n->timer)#2);
[ 2284.078604]
[ 2284.078604]  *** DEADLOCK ***
[ 2284.078604]
[ 2284.078604] 3 locks held by rmmod/254:
[ 2284.078604]  #0: 000000003368be9b (pernet_ops_rwsem){+.+.}, at: unregister_pernet_subsys+0x15/0x30
[ 2284.078604]  #1: 0000000046ed9c86 (rtnl_mutex){+.+.}, at: tipc_net_stop+0x144/0x170 [tipc]
[ 2284.078604]  #2: 00000000f997afc0 (&(&tn->node_list_lock)->rlock){+.-.}, at: tipc_node_stop+0xac/0x19
[...}

The reason is that the node timer handler sometimes needs to delete a
node which has been disconnected for too long. To do this, it grabs
the lock 'node_list_lock', which may at the same time be held by the
generic node cleanup function, tipc_node_stop(), during module removal.
Since the latter is calling del_timer_sync() inside the same lock, we
have a potential deadlock.

We fix this letting the timer cleanup function use spin_trylock()
instead of just spin_lock(), and when it fails to grab the lock it
just returns so that the timer handler can terminate its execution.
This is safe to do, since tipc_node_stop() anyway is about to
delete both the timer and the node instance.

Fixes: 6a939f365b ("tipc: Auto removal of peer down node instance")
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-27 16:30:39 -08:00
David S. Miller
f2be6d710d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-11-19 10:55:00 -08:00
Jon Maloy
1c1274a569 tipc: don't assume linear buffer when reading ancillary data
The code for reading ancillary data from a received buffer is assuming
the buffer is linear. To make this assumption true we have to linearize
the buffer before message data is read.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17 22:08:02 -08:00
Jon Maloy
adba75be0d tipc: fix lockdep warning when reinitilaizing sockets
We get the following warning:

[   47.926140] 32-bit node address hash set to 2010a0a
[   47.927202]
[   47.927433] ================================
[   47.928050] WARNING: inconsistent lock state
[   47.928661] 4.19.0+ #37 Tainted: G            E
[   47.929346] --------------------------------
[   47.929954] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[   47.930116] swapper/3/0 [HC0[0]:SC1[3]:HE1:SE0] takes:
[   47.930116] 00000000af8bc31e (&(&ht->lock)->rlock){+.?.}, at: rhashtable_walk_enter+0x36/0xb0
[   47.930116] {SOFTIRQ-ON-W} state was registered at:
[   47.930116]   _raw_spin_lock+0x29/0x60
[   47.930116]   rht_deferred_worker+0x556/0x810
[   47.930116]   process_one_work+0x1f5/0x540
[   47.930116]   worker_thread+0x64/0x3e0
[   47.930116]   kthread+0x112/0x150
[   47.930116]   ret_from_fork+0x3a/0x50
[   47.930116] irq event stamp: 14044
[   47.930116] hardirqs last  enabled at (14044): [<ffffffff9a07fbba>] __local_bh_enable_ip+0x7a/0xf0
[   47.938117] hardirqs last disabled at (14043): [<ffffffff9a07fb81>] __local_bh_enable_ip+0x41/0xf0
[   47.938117] softirqs last  enabled at (14028): [<ffffffff9a0803ee>] irq_enter+0x5e/0x60
[   47.938117] softirqs last disabled at (14029): [<ffffffff9a0804a5>] irq_exit+0xb5/0xc0
[   47.938117]
[   47.938117] other info that might help us debug this:
[   47.938117]  Possible unsafe locking scenario:
[   47.938117]
[   47.938117]        CPU0
[   47.938117]        ----
[   47.938117]   lock(&(&ht->lock)->rlock);
[   47.938117]   <Interrupt>
[   47.938117]     lock(&(&ht->lock)->rlock);
[   47.938117]
[   47.938117]  *** DEADLOCK ***
[   47.938117]
[   47.938117] 2 locks held by swapper/3/0:
[   47.938117]  #0: 0000000062c64f90 ((&d->timer)){+.-.}, at: call_timer_fn+0x5/0x280
[   47.938117]  #1: 00000000ee39619c (&(&d->lock)->rlock){+.-.}, at: tipc_disc_timeout+0xc8/0x540 [tipc]
[   47.938117]
[   47.938117] stack backtrace:
[   47.938117] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G            E     4.19.0+ #37
[   47.938117] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[   47.938117] Call Trace:
[   47.938117]  <IRQ>
[   47.938117]  dump_stack+0x5e/0x8b
[   47.938117]  print_usage_bug+0x1ed/0x1ff
[   47.938117]  mark_lock+0x5b5/0x630
[   47.938117]  __lock_acquire+0x4c0/0x18f0
[   47.938117]  ? lock_acquire+0xa6/0x180
[   47.938117]  lock_acquire+0xa6/0x180
[   47.938117]  ? rhashtable_walk_enter+0x36/0xb0
[   47.938117]  _raw_spin_lock+0x29/0x60
[   47.938117]  ? rhashtable_walk_enter+0x36/0xb0
[   47.938117]  rhashtable_walk_enter+0x36/0xb0
[   47.938117]  tipc_sk_reinit+0xb0/0x410 [tipc]
[   47.938117]  ? mark_held_locks+0x6f/0x90
[   47.938117]  ? __local_bh_enable_ip+0x7a/0xf0
[   47.938117]  ? lockdep_hardirqs_on+0x20/0x1a0
[   47.938117]  tipc_net_finalize+0xbf/0x180 [tipc]
[   47.938117]  tipc_disc_timeout+0x509/0x540 [tipc]
[   47.938117]  ? call_timer_fn+0x5/0x280
[   47.938117]  ? tipc_disc_msg_xmit.isra.19+0xa0/0xa0 [tipc]
[   47.938117]  ? tipc_disc_msg_xmit.isra.19+0xa0/0xa0 [tipc]
[   47.938117]  call_timer_fn+0xa1/0x280
[   47.938117]  ? tipc_disc_msg_xmit.isra.19+0xa0/0xa0 [tipc]
[   47.938117]  run_timer_softirq+0x1f2/0x4d0
[   47.938117]  __do_softirq+0xfc/0x413
[   47.938117]  irq_exit+0xb5/0xc0
[   47.938117]  smp_apic_timer_interrupt+0xac/0x210
[   47.938117]  apic_timer_interrupt+0xf/0x20
[   47.938117]  </IRQ>
[   47.938117] RIP: 0010:default_idle+0x1c/0x140
[   47.938117] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 41 54 55 53 65 8b 2d d8 2b 74 65 0f 1f 44 00 00 e8 c6 2c 8b ff fb f4 <65> 8b 2d c5 2b 74 65 0f 1f 44 00 00 5b 5d 41 5c c3 65 8b 05 b4 2b
[   47.938117] RSP: 0018:ffffaf6ac0207ec8 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
[   47.938117] RAX: ffff8f5b3735e200 RBX: 0000000000000003 RCX: 0000000000000001
[   47.938117] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff8f5b3735e200
[   47.938117] RBP: 0000000000000003 R08: 0000000000000001 R09: 0000000000000000
[   47.938117] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[   47.938117] R13: 0000000000000000 R14: ffff8f5b3735e200 R15: ffff8f5b3735e200
[   47.938117]  ? default_idle+0x1a/0x140
[   47.938117]  do_idle+0x1bc/0x280
[   47.938117]  cpu_startup_entry+0x19/0x20
[   47.938117]  start_secondary+0x187/0x1c0
[   47.938117]  secondary_startup_64+0xa4/0xb0

The reason seems to be that tipc_net_finalize()->tipc_sk_reinit() is
calling the function rhashtable_walk_enter() within a timer interrupt.
We fix this by executing tipc_net_finalize() in work queue context.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17 22:01:31 -08:00
David S. Miller
2b9b7502df Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-11-11 17:57:54 -08:00
Jon Maloy
7ab412d33b tipc: fix link re-establish failure
When a link failure is detected locally, the link is reset, the flag
link->in_session is set to false, and a RESET_MSG with the 'stopping'
bit set is sent to the peer.

The purpose of this bit is to inform the peer that this endpoint just
is going down, and that the peer should handle the reception of this
particular RESET message as a local failure. This forces the peer to
accept another RESET or ACTIVATE message from this endpoint before it
can re-establish the link. This again is necessary to ensure that
link session numbers are properly exchanged before the link comes up
again.

If a failure is detected locally at the same time at the peer endpoint
this will do the same, which is also a correct behavior.

However, when receiving such messages, the endpoints will not
distinguish between 'stopping' RESETs and ordinary ones when it comes
to updating session numbers. Both endpoints will copy the received
session number and set their 'in_session' flags to true at the
reception, while they are still expecting another RESET from the
peer before they can go ahead and re-establish. This is contradictory,
since, after applying the validation check referred to below, the
'in_session' flag will cause rejection of all such messages, and the
link will never come up again.

We now fix this by not only handling received RESET/STOPPING messages
as a local failure, but also by omitting to set a new session number
and the 'in_session' flag in such cases.

Fixes: 7ea817f4e8 ("tipc: check session number before accepting link protocol messages")
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-11 10:03:38 -08:00
LUU Duc Canh
31c4f4cc32 tipc: improve broadcast retransmission algorithm
Currently, the broadcast retransmission algorithm is using the
'prev_retr' field in struct tipc_link to time stamp the latest broadcast
retransmission occasion. This helps to restrict retransmission of
individual broadcast packets to max once per 10 milliseconds, even
though all other criteria for retransmission are met.

We now move this time stamp to the control block of each individual
packet, and remove other limiting criteria. This simplifies the
retransmission algorithm, and eliminates any risk of logical errors
in selecting which packets can be retransmitted.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: LUU Duc Canh <canh.d.luu@dektech.com.au>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-11 09:57:46 -08:00
Linus Torvalds
9931a07d51 Merge branch 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull AFS updates from Al Viro:
 "AFS series, with some iov_iter bits included"

* 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits)
  missing bits of "iov_iter: Separate type from direction and use accessor functions"
  afs: Probe multiple fileservers simultaneously
  afs: Fix callback handling
  afs: Eliminate the address pointer from the address list cursor
  afs: Allow dumping of server cursor on operation failure
  afs: Implement YFS support in the fs client
  afs: Expand data structure fields to support YFS
  afs: Get the target vnode in afs_rmdir() and get a callback on it
  afs: Calc callback expiry in op reply delivery
  afs: Fix FS.FetchStatus delivery from updating wrong vnode
  afs: Implement the YFS cache manager service
  afs: Remove callback details from afs_callback_break struct
  afs: Commit the status on a new file/dir/symlink
  afs: Increase to 64-bit volume ID and 96-bit vnode ID for YFS
  afs: Don't invoke the server to read data beyond EOF
  afs: Add a couple of tracepoints to log I/O errors
  afs: Handle EIO from delivery function
  afs: Fix TTL on VL server and address lists
  afs: Implement VL server rotation
  afs: Improve FS server rotation error handling
  ...
2018-11-01 19:58:52 -07:00
David Howells
aa563d7bca iov_iter: Separate type from direction and use accessor functions
In the iov_iter struct, separate the iterator type from the iterator
direction and use accessor functions to access them in most places.

Convert a bunch of places to use switch-statements to access them rather
then chains of bitwise-AND statements.  This makes it easier to add further
iterator types.  Also, this can be more efficient as to implement a switch
of small contiguous integers, the compiler can use ~50% fewer compare
instructions than it has to use bitwise-and instructions.

Further, cease passing the iterator type into the iterator setup function.
The iterator function can set that itself.  Only the direction is required.

Signed-off-by: David Howells <dhowells@redhat.com>
2018-10-24 00:41:07 +01:00
Karsten Graul
89ab066d42 Revert "net: simplify sock_poll_wait"
This reverts commit dd979b4df8.

This broke tcp_poll for SMC fallback: An AF_SMC socket establishes an
internal TCP socket for the initial handshake with the remote peer.
Whenever the SMC connection can not be established this TCP socket is
used as a fallback. All socket operations on the SMC socket are then
forwarded to the TCP socket. In case of poll, the file->private_data
pointer references the SMC socket because the TCP socket has no file
assigned. This causes tcp_poll to wait on the wrong socket.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-23 10:57:06 -07:00
Jon Maloy
988f3f1603 tipc: eliminate message disordering during binding table update
We have seen the following race scenario:
1) named_distribute() builds a "bulk" message, containing a PUBLISH
   item for a certain publication. This is based on the contents of
   the binding tables's 'cluster_scope' list.
2) tipc_named_withdraw() removes the same publication from the list,
   bulds a WITHDRAW message and distributes it to all cluster nodes.
3) tipc_named_node_up(), which was calling named_distribute(), sends
   out the bulk message built under 1)
4) The WITHDRAW message arrives at the just detected node, finds
   no corresponding publication, and is dropped.
5) The PUBLISH item arrives at the same node, is added to its binding
   table, and remains there forever.

This arrival disordering was earlier taken care of by the backlog queue,
originally added for a different purpose, which was removed in the
commit referred to below, but we now need a different solution.
In this commit, we replace the rcu lock protecting the 'cluster_scope'
list with a regular RW lock which comprises even the sending of the
bulk message. This both guarantees both the list integrity and the
message sending order. We will later add a commit which cleans up
this code further.

Note that this commit needs recently added commit d3092b2efc ("tipc:
fix unsafe rcu locking when accessing publication list") to apply
cleanly.

Fixes: 37922ea4a3 ("tipc: permit overlapping service ranges in name table")
Reported-by: Tuong Lien Tong <tuong.t.lien@dektech.com.au>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-22 19:29:12 -07:00
Guoqing Jiang
29e270fc32 tipc: use destination length for copy string
Got below warning with gcc 8.2 compiler.

net/tipc/topsrv.c: In function ‘tipc_topsrv_start’:
net/tipc/topsrv.c:660:2: warning: ‘strncpy’ specified bound depends on the length of the source argument [-Wstringop-overflow=]
  strncpy(srv->name, name, strlen(name) + 1);
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
net/tipc/topsrv.c:660:27: note: length computed here
  strncpy(srv->name, name, strlen(name) + 1);
                           ^~~~~~~~~~~~
So change it to correct length and use strscpy.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-22 19:25:32 -07:00
David S. Miller
2e2d6f0342 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
net/sched/cls_api.c has overlapping changes to a call to
nlmsg_parse(), one (from 'net') added rtm_tca_policy instead of NULL
to the 5th argument, and another (from 'net-next') added cb->extack
instead of NULL to the 6th argument.

net/ipv4/ipmr_base.c is a case of a bug fix in 'net' being done to
code which moved (to mr_table_dump)) in 'net-next'.  Thanks to David
Ahern for the heads up.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-19 11:03:06 -07:00
Jon Maloy
b06f9d9f1a tipc: fix info leak from kernel tipc_event
We initialize a struct tipc_event allocated on the kernel stack to
zero to avert info leak to user space.

Reported-by: syzbot+057458894bc8cada4dee@syzkaller.appspotmail.com
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-18 16:49:53 -07:00
Tung Nguyen
d3092b2efc tipc: fix unsafe rcu locking when accessing publication list
The binding table's 'cluster_scope' list is rcu protected to handle
races between threads changing the list and those traversing the list at
the same moment. We have now found that the function named_distribute()
uses the regular list_for_each() macro to traverse the said list.
Likewise, the function tipc_named_withdraw() is removing items from the
same list using the regular list_del() call. When these two functions
execute in parallel we see occasional crashes.

This commit fixes this by adding the missing _rcu() suffixes.

Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:33:27 -07:00
Jon Maloy
4af00f4cc1 tipc: initialize broadcast link stale counter correctly
In the commit referred to below we added link tolerance as an additional
criteria for declaring broadcast transmission "stale" and resetting the
unicast links to the affected node.

Unfortunately, this 'improvement' introduced two bugs, which each and
one alone cause only limited problems, but combined lead to seemingly
stochastic unicast link resets, depending on the amount of broadcast
traffic transmitted.

The first issue, a missing initialization of the 'tolerance' field of
the receiver broadcast link, was recently fixed by commit 047491ea33
("tipc: set link tolerance correctly in broadcast link").

Ths second issue, where we omit to reset the 'stale_cnt' field of
the same link after a 'stale' period is over, leads to this counter
accumulating over time, and in the absence of the 'tolerance' criteria
leads to the above described symptoms. This commit adds the missing
initialization.

Fixes: a4dc70d46c ("tipc: extend link reset criteria for stale packet retransmission")
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:03:34 -07:00
Hoang Le
acad76a5f6 tipc: support binding to specific ip address when activating UDP bearer
INADDR_ANY is hard-coded when activating UDP bearer. So, we could not
bind to a specific IP address even with replicast mode using - given
remote ip address instead of using multicast ip address.

In this commit, we fixed it by checking and switch to use appropriate
local ip address.

before:
$netstat -plu
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address
udp        0      0 **0.0.0.0:6118**            0.0.0.0:*

after:
$netstat -plu
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address
udp        0      0 **10.0.0.2:6118**           0.0.0.0:*

Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 21:56:56 -07:00
David S. Miller
d864991b22 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts were easy to resolve using immediate context mostly,
except the cls_u32.c one where I simply too the entire HEAD
chunk.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-12 21:38:46 -07:00
Ying Xue
a1f8dd34e6 tipc: eliminate possible recursive locking detected by LOCKDEP
When booting kernel with LOCKDEP option, below warning info was found:

WARNING: possible recursive locking detected
4.19.0-rc7+ #14 Not tainted
--------------------------------------------
swapper/0/1 is trying to acquire lock:
00000000dcfc0fc8 (&(&list->lock)->rlock#4){+...}, at: spin_lock_bh
include/linux/spinlock.h:334 [inline]
00000000dcfc0fc8 (&(&list->lock)->rlock#4){+...}, at:
tipc_link_reset+0x125/0xdf0 net/tipc/link.c:850

but task is already holding lock:
00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at: spin_lock_bh
include/linux/spinlock.h:334 [inline]
00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
tipc_link_reset+0xfa/0xdf0 net/tipc/link.c:849

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&list->lock)->rlock#4);
  lock(&(&list->lock)->rlock#4);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

2 locks held by swapper/0/1:
 #0: 00000000f7539d34 (pernet_ops_rwsem){+.+.}, at:
register_pernet_subsys+0x19/0x40 net/core/net_namespace.c:1051
 #1: 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
spin_lock_bh include/linux/spinlock.h:334 [inline]
 #1: 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
tipc_link_reset+0xfa/0xdf0 net/tipc/link.c:849

stack backtrace:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc7+ #14
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1af/0x295 lib/dump_stack.c:113
 print_deadlock_bug kernel/locking/lockdep.c:1759 [inline]
 check_deadlock kernel/locking/lockdep.c:1803 [inline]
 validate_chain kernel/locking/lockdep.c:2399 [inline]
 __lock_acquire+0xf1e/0x3c60 kernel/locking/lockdep.c:3411
 lock_acquire+0x1db/0x520 kernel/locking/lockdep.c:3900
 __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
 _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:168
 spin_lock_bh include/linux/spinlock.h:334 [inline]
 tipc_link_reset+0x125/0xdf0 net/tipc/link.c:850
 tipc_link_bc_create+0xb5/0x1f0 net/tipc/link.c:526
 tipc_bcast_init+0x59b/0xab0 net/tipc/bcast.c:521
 tipc_init_net+0x472/0x610 net/tipc/core.c:82
 ops_init+0xf7/0x520 net/core/net_namespace.c:129
 __register_pernet_operations net/core/net_namespace.c:940 [inline]
 register_pernet_operations+0x453/0xac0 net/core/net_namespace.c:1011
 register_pernet_subsys+0x28/0x40 net/core/net_namespace.c:1052
 tipc_init+0x83/0x104 net/tipc/core.c:140
 do_one_initcall+0x109/0x70a init/main.c:885
 do_initcall_level init/main.c:953 [inline]
 do_initcalls init/main.c:961 [inline]
 do_basic_setup init/main.c:979 [inline]
 kernel_init_freeable+0x4bd/0x57f init/main.c:1144
 kernel_init+0x13/0x180 init/main.c:1063
 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:413

The reason why the noise above was complained by LOCKDEP is because we
nested to hold l->wakeupq.lock and l->inputq->lock in tipc_link_reset
function. In fact it's unnecessary to move skb buffer from l->wakeupq
queue to l->inputq queue while holding the two locks at the same time.
Instead, we can move skb buffers in l->wakeupq queue to a temporary
list first and then move the buffers of the temporary list to l->inputq
queue, which is also safe for us.

Fixes: 3f32d0be6c ("tipc: lock wakeup & inputq at tipc_link_reset()")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:23:48 -07:00
Parthasarathy Bhuvaragan
e7eb058238 tipc: queue socket protocol error messages into socket receive buffer
In tipc_sk_filter_rcv(), when we detect protocol messages with error we
call tipc_sk_conn_proto_rcv() and let it reset the connection and notify
the socket by calling sk->sk_state_change().

However, tipc_sk_filter_rcv() may have been called from the function
tipc_backlog_rcv(), in which case the socket lock is held and the socket
already awake. This means that the sk_state_change() call is ignored and
the error notification lost. Now the receive queue will remain empty and
the socket sleeps forever.

In this commit, we convert the protocol message into a connection abort
message and enqueue it into the socket's receive queue. By this addition
to the above state change we cover all conditions.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:56:07 -07:00
Jon Maloy
047491ea33 tipc: set link tolerance correctly in broadcast link
In the patch referred to below we added link tolerance as an additional
criteria for declaring broadcast transmission "stale" and resetting the
affected links.

However, the 'tolerance' field of the broadcast link is never set, and
remains at zero. This renders the whole commit without the intended
improving effect, but luckily also with no negative effect.

In this commit we add the missing initialization.

Fixes: a4dc70d46c ("tipc: extend link reset criteria for stale packet retransmission")
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:56:07 -07:00
David S. Miller
6f41617bf2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Minor conflict in net/core/rtnetlink.c, David Ahern's bug fix in 'net'
overlapped the renaming of a netlink attribute in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-03 21:00:17 -07:00
LUU Duc Canh
d949cfedbc tipc: ignore STATE_MSG on wrong link session
The initial session number when a link is created is based on a random
value, taken from struct tipc_net->random. It is then incremented for
each link reset to avoid mixing protocol messages from different link
sessions.

However, when a bearer is reset all its links are deleted, and will
later be re-created using the same random value as the first time.
This means that if the link never went down between creation and
deletion we will still sometimes have two subsequent sessions with
the same session number. In virtual environments with potentially
long transmission times this has turned out to be a real problem.

We now fix this by randomizing the session number each time a link
is created.

With a session number size of 16 bits this gives a risk of session
collision of 1/64k. To reduce this further, we also introduce a sanity
check on the very first STATE message arriving at a link. If this has
an acknowledge value differing from 0, which is logically impossible,
we ignore the message. The final risk for session collision is hence
reduced to 1/4G, which should be sufficient.

Signed-off-by: LUU Duc Canh <canh.d.luu@dektech.com.au>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:35:30 -07:00
LUU Duc Canh
c140eb166d tipc: fix failover problem
We see the following scenario:
1) Link endpoint B on node 1 discovers that its peer endpoint is gone.
   Since there is a second working link, failover procedure is started.
2) Link endpoint A on node 1 sends a FAILOVER message to peer endpoint
   A on node 2. The node item 1->2 goes to state FAILINGOVER.
3) Linke endpoint A/2 receives the failover, and is supposed to take
   down its parallell link endpoint B/2, while producing a FAILOVER
   message to send back to A/1.
4) However, B/2 has already been deleted, so no FAILOVER message can
   created.
5) Node 1->2 remains in state FAILINGOVER forever, refusing to receive
   any messages that can bring B/1 up again. We are left with a non-
   redundant link between node 1 and 2.

We fix this with letting endpoint A/2 build a dummy FAILOVER message
to send to back to A/1, so that the situation can be resolved.

Signed-off-by: LUU Duc Canh <canh.d.luu@dektech.com.au>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29 11:45:14 -07:00
Tung Nguyen
6787927475 tipc: buffer overflow handling in listener socket
Default socket receive buffer size for a listener socket is 2Mb. For
each arriving empty SYN, the linux kernel allocates a 768 bytes buffer.
This means that a listener socket can serve maximum 2700 simultaneous
empty connection setup requests before it hits a receive buffer
overflow, and much fewer if the SYN is carrying any significant
amount of data.

When this happens the setup request is rejected, and the client
receives an ECONNREFUSED error.

This commit mitigates this problem by letting the client socket try to
retransmit the SYN message multiple times when it sees it rejected with
the code TIPC_ERR_OVERLOAD. Retransmission is done at random intervals
in the range of [100 ms, setup_timeout / 4], as many times as there is
room for within the setup timeout limit.

Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29 11:24:22 -07:00
Jon Maloy
25b9221b95 tipc: add SYN bit to connection setup messages
Messages intended for intitating a connection are currently
indistinguishable from regular datagram messages. The TIPC
protocol specification defines bit 17 in word 0 as a SYN bit
to allow sanity check of such messages in the listening socket,
but this has so far never been implemented.

We do that in this commit.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29 11:24:22 -07:00
Jon Maloy
39fdc9c71f tipc: refactor function tipc_sk_filter_connect()
We refactor the function tipc_sk_filter_connect(), both to make it
more readable and as a preparation for the next commit.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29 11:24:22 -07:00
Jon Maloy
afe8792fec tipc: refactor function tipc_sk_timeout()
We refactor this function as a preparation for the coming commits in
the same series.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29 11:24:22 -07:00
Jon Maloy
5cbdbd1a1f tipc: refactor function tipc_msg_reverse()
The function tipc_msg_reverse() is reversing the header of a message
while reusing the original buffer. We have seen at several occasions
that this may have unfortunate side effects when the buffer to be
reversed is a clone.

In one of the following commits we will again need to reverse cloned
buffers, so this is the right time to permanently eliminate this
problem. In this commit we let the said function always consume the
original buffer and replace it with a new one when applicable.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29 11:24:22 -07:00
Parthasarathy Bhuvaragan
3f32d0be6c tipc: lock wakeup & inputq at tipc_link_reset()
In tipc_link_reset() we copy the wakeup queue to input queue using
skb_queue_splice_init(link->wakeupq, link->inputq).
This is performed without holding any locks. The lists might be
simultaneously be accessed by other cpu threads in tipc_sk_rcv(),
something leading to to random missing packets.

Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25 20:48:56 -07:00
Parthasarathy Bhuvaragan
94b6ddce71 tipc: reset bearer if device carrier not ok
If we detect that under lying carrier detects errors and goes down,
we reset the bearer.

Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25 20:48:56 -07:00
Parthasarathy Bhuvaragan
92ef12b32f tipc: fix flow control accounting for implicit connect
In the case of implicit connect message with data > 1K, the flow
control accounting is incorrect. At this state, the socket does not
know the peer nodes capability and falls back to legacy flow control
by return 1, however the receiver of this message will perform the
new block accounting. This leads to a slack and eventually traffic
disturbance.

In this commit, we perform tipc_node_get_capabilities() at implicit
connect and perform accounting based on the peer's capability.

Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25 10:47:37 -07:00
David S. Miller
aaf9253025 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-09-12 22:22:42 -07:00
Cong Wang
12a78b026f tipc: check return value of __tipc_dump_start()
When __tipc_dump_start() fails with running out of memory,
we have no reason to continue, especially we should avoid
calling tipc_dump_done().

Fixes: 8f5c5fcf35 ("tipc: call start and done ops directly in __tipc_nl_compat_dumpit()")
Reported-and-tested-by: syzbot+3f8324abccfbf8c74a9f@syzkaller.appspotmail.com
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-12 13:15:04 -07:00
David S. Miller
a8305bff68 net: Add and use skb_mark_not_on_list().
An SKB is not on a list if skb->next is NULL.

Codify this convention into a helper function and use it
where we are dequeueing an SKB and need to mark it as such.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-10 10:06:54 -07:00
Cong Wang
8f5c5fcf35 tipc: call start and done ops directly in __tipc_nl_compat_dumpit()
__tipc_nl_compat_dumpit() uses a netlink_callback on stack,
so the only way to align it with other ->dumpit() call path
is calling tipc_dump_start() and tipc_dump_done() directly
inside it. Otherwise ->dumpit() would always get NULL from
cb->args[].

But tipc_dump_start() uses sock_net(cb->skb->sk) to retrieve
net pointer, the cb->skb here doesn't set skb->sk, the net pointer
is saved in msg->net instead, so introduce a helper function
__tipc_dump_start() to pass in msg->net.

Ying pointed out cb->args[0...3] are already used by other
callbacks on this call path, so we can't use cb->args[0] any
more, use cb->args[4] instead.

Fixes: 9a07efa9ae ("tipc: switch to rhashtable iterator")
Reported-and-tested-by: syzbot+e93a2c41f91b8e2c7d9b@syzkaller.appspotmail.com
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-06 21:49:18 -07:00
Cong Wang
0a3b8b2b21 tipc: orphan sock in tipc_release()
Before we unlock the sock in tipc_release(), we have to
detach sk->sk_socket from sk, otherwise a parallel
tipc_sk_fill_sock_diag() could stil read it after we
free this socket.

Fixes: c30b70deb5 ("tipc: implement socket diagnostics for AF_TIPC")
Reported-and-tested-by: syzbot+48804b87c16588ad491d@syzkaller.appspotmail.com
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-05 22:14:00 -07:00
David S. Miller
36302685f5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-09-04 21:33:03 -07:00
Zhenbo Gao
a484ef3442 tipc: correct spelling errors for tipc_topsrv_queue_evt() comments
tipc_conn_queue_evt -> tipc_topsrv_queue_evt
tipc_send_work -> tipc_conn_send_work
tipc_send_to_sock -> tipc_conn_send_to_sock

Signed-off-by: Zhenbo Gao <zhenbo.gao@windriver.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-03 22:03:07 -07:00
Zhenbo Gao
9cc1bf3928 tipc: correct spelling errors for struct tipc_bc_base's comment
Trivial fix for two spelling mistakes.

Signed-off-by: Zhenbo Gao <zhenbo.gao@windriver.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-03 22:03:07 -07:00
Zhenbo Gao
05a6843c92 tipc: correct structure parameter comments for topsrv
Remove the following obsolete parameter comments of tipc_topsrv struct:
  @rcvbuf_cache
  @tipc_conn_new
  @tipc_conn_release
  @tipc_conn_recvmsg
  @imp
  @type

Add the comments for the missing parameters below of tipc_topsrv struct:
  @awork
  @listener

Remove the unused or duplicated parameter comments of tipc_conn struct:
  @outqueue_lock
  @rx_action

Signed-off-by: Zhenbo Gao <zhenbo.gao@windriver.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-03 21:43:42 -07:00
Cong Wang
9a07efa9ae tipc: switch to rhashtable iterator
syzbot reported a use-after-free in tipc_group_fill_sock_diag(),
where tipc_group_fill_sock_diag() still reads tsk->group meanwhile
tipc_group_delete() just deletes it in tipc_release().

tipc_nl_sk_walk() aims to lock this sock when walking each sock
in the hash table to close race conditions with sock changes like
this one, by acquiring tsk->sk.sk_lock.slock spinlock, unfortunately
this doesn't work at all. All non-BH call path should take
lock_sock() instead to make it work.

tipc_nl_sk_walk() brutally iterates with raw rht_for_each_entry_rcu()
where RCU read lock is required, this is the reason why lock_sock()
can't be taken on this path. This could be resolved by switching to
rhashtable iterator API's, where taking a sleepable lock is possible.
Also, the iterator API's are friendly for restartable calls like
diag dump, the last position is remembered behind the scence,
all we need to do here is saving the iterator into cb->args[].

I tested this with parallel tipc diag dump and thousands of tipc
socket creation and release, no crash or memory leak.

Reported-by: syzbot+b9c8f3ab2994b7cd1625@syzkaller.appspotmail.com
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-29 18:04:54 -07:00
Cong Wang
bd583fe304 tipc: fix a missing rhashtable_walk_exit()
rhashtable_walk_exit() must be paired with rhashtable_walk_enter().

Fixes: 40f9f43970 ("tipc: Fix tipc_sk_reinit race conditions")
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-29 17:52:58 -07:00
Haiqing Bai
30935198b7 tipc: fix the big/little endian issue in tipc_dest
In function tipc_dest_push, the 32bit variables 'node' and 'port'
are stored separately in uppper and lower part of 64bit 'value'.
Then this value is assigned to dst->value which is a union like:
union
{
  struct {
    u32 port;
    u32 node;
  };
  u64 value;
}
This works on little-endian machines like x86 but fails on big-endian
machines.

The fix remove the 'value' stack parameter and even the 'value'
member of the union in tipc_dest, assign the 'node' and 'port' member
directly with the input parameter to avoid the endian issue.

Fixes: a80ae5306a ("tipc: improve destination linked list")
Signed-off-by: Zhenbo Gao <zhenbo.gao@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Haiqing Bai <Haiqing.Bai@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-27 15:23:31 -07:00
David S. Miller
a736e07468 Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net
Overlapping changes in RXRPC, changing to ktime_get_seconds() whilst
adding some tracepoints.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09 11:52:36 -07:00
Ying Xue
37436d9c0e tipc: fix an interrupt unsafe locking scenario
Commit 9faa89d4ed ("tipc: make function tipc_net_finalize() thread
safe") tries to make it thread safe to set node address, so it uses
node_list_lock lock to serialize the whole process of setting node
address in tipc_net_finalize(). But it causes the following interrupt
unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  rht_deferred_worker()
  rhashtable_rehash_table()
  lock(&(&ht->lock)->rlock)
			       tipc_nl_compat_doit()
                               tipc_net_finalize()
                               local_irq_disable();
                               lock(&(&tn->node_list_lock)->rlock);
                               tipc_sk_reinit()
                               rhashtable_walk_enter()
                               lock(&(&ht->lock)->rlock);
  <Interrupt>
  tipc_disc_rcv()
  tipc_node_check_dest()
  tipc_node_create()
  lock(&(&tn->node_list_lock)->rlock);

 *** DEADLOCK ***

When rhashtable_rehash_table() holds ht->lock on CPU0, it doesn't
disable BH. So if an interrupt happens after the lock, it can create
an inverse lock ordering between ht->lock and tn->node_list_lock. As
a consequence, deadlock might happen.

The reason causing the inverse lock ordering scenario above is because
the initial purpose of node_list_lock is not designed to do the
serialization of node address setting.

As cmpxchg() can guarantee CAS (compare-and-swap) process is atomic,
we use it to replace node_list_lock to ensure setting node address can
be atomically finished. It turns out the potential deadlock can be
avoided as well.

Fixes: 9faa89d4ed ("tipc: make function tipc_net_finalize() thread safe")
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <maloy@donjonn.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07 13:15:35 -07:00
Colin Ian King
b053fcc4a1 net/tipc: remove redundant variables 'tn' and 'oport'
Variables 'tn' and 'oport'  are being assigned but are never used hence
they are redundant and can be removed.

Cleans up clang warnings:
warning: variable 'oport' set but not used [-Wunused-but-set-variable]
warning: variable 'tn' set but not used [-Wunused-but-set-variable]

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-01 09:46:49 -07:00
Christoph Hellwig
dd979b4df8 net: simplify sock_poll_wait
The wait_address argument is always directly derived from the filp
argument, so remove it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30 09:10:25 -07:00
Jia-Ju Bai
a0732548ba net: tipc: bcast: Replace GFP_ATOMIC with GFP_KERNEL in tipc_bcast_init()
tipc_bcast_init() is never called in atomic context.
It calls kzalloc() with GFP_ATOMIC, which is not necessary.
GFP_ATOMIC can be replaced with GFP_KERNEL.

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-27 13:45:15 -07:00
Jia-Ju Bai
04b9ce48ef net: tipc: name_table: Replace GFP_ATOMIC with GFP_KERNEL in tipc_nametbl_init()
tipc_nametbl_init() is never called in atomic context.
It calls kzalloc() with GFP_ATOMIC, which is not necessary.
GFP_ATOMIC can be replaced with GFP_KERNEL.

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-27 13:45:15 -07:00
YueHaibing
63135ee0a6 tipc: add missing dev_put() on error in tipc_enable_l2_media
when tipc_own_id failed to obtain node identity,dev_put should
be call before return -EINVAL.

Fixes: 682cd3cf94 ("tipc: confgiure and apply UDP bearer MTU on running links")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-26 14:05:11 -07:00
YueHaibing
e064cce130 tipc: make some functions static
Fixes the following sparse warnings:

net/tipc/link.c:376:5: warning: symbol 'link_bc_rcv_gap' was not declared. Should it be static?
net/tipc/link.c:823:6: warning: symbol 'link_prepare_wakeup' was not declared. Should it be static?
net/tipc/link.c:959:6: warning: symbol 'tipc_link_advance_backlog' was not declared. Should it be static?
net/tipc/link.c:1009:5: warning: symbol 'tipc_link_retrans' was not declared. Should it be static?
net/tipc/monitor.c:687:5: warning: symbol '__tipc_nl_add_monitor_peer' was not declared. Should it be static?
net/tipc/group.c:230:20: warning: symbol 'tipc_group_find_member' was not declared. Should it be static?

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 16:23:22 -07:00
David S. Miller
c4c5551df1 Merge ra.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux
All conflicts were trivial overlapping changes, so reasonably
easy to resolve.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-20 21:17:12 -07:00
Jon Maloy
40999f11ce tipc: make link capability update thread safe
The commit referred to below introduced an update of the link
capabilities field that is not safe. Given the recently added
feature to remove idle node and link items after 5 minutes, there
is a small risk that the update will happen at the very moment the
targeted link is being removed. To avoid this we have to perform
the update inside the node item's write lock protection.

Fixes: 9012de5089 ("tipc: add sequence number check for link STATE messages")
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-20 12:36:13 -07:00
YueHaibing
d81d25e66a tipc: remove unused tipc_group_size
After commit eb929a91b2 ("tipc: improve poll() for group member socket"),
it is no longer used.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 13:49:08 -07:00
YueHaibing
c94b1ac732 tipc: remove unused tipc_link_is_active
tipc_link_is_active is no longer used and can be removed.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 13:48:46 -07:00
Jon Maloy
7ea817f4e8 tipc: check session number before accepting link protocol messages
In some virtual environments we observe a significant higher number of
packet reordering and delays than we have been used to traditionally.

This makes it necessary with stricter checks on incoming link protocol
messages' session number, which until now only has been validated for
RESET messages.

Since the other two message types, ACTIVATE and STATE messages also
carry this number, it is easy to extend the validation check to those
messages.

We also introduce a flag indicating if a link has a valid peer session
number or not. This eliminates the mixing of 32- and 16-bit arithmethics
we are currently using to achieve this.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 23:06:14 -07:00
Jon Maloy
9012de5089 tipc: add sequence number check for link STATE messages
Some switch infrastructures produce huge amounts of packet duplicates.
This becomes a problem if those messages are STATE/NACK protocol
messages, causing unnecessary retransmissions of already accepted
packets.

We now introduce a unique sequence number per STATE protocol message
so that duplicates can be identified and ignored. This will also be
useful when tracing such cases, and to avert replay attacks when TIPC
is encrypted.

For compatibility reasons we have to introduce a new capability flag
TIPC_LINK_PROTO_SEQNO to handle this new feature.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 23:06:14 -07:00
Jon Maloy
a4dc70d46c tipc: extend link reset criteria for stale packet retransmission
Currently a link is declared stale and reset if there has been 100
repeated attempts to retransmit the same packet. However, in certain
infrastructures we see that packet (NACK) duplicates and delays may
cause such retransmit attempts to occur at a high rate, so that the
peer doesn't have a reasonable chance to acknowledge the reception
before the 100-limit is hit. This may take much less than the
stipulated link tolerance time, and despite that probe/probe replies
otherwise go through as normal.

We now extend the criteria for link reset to also being time based.
I.e., we don't reset the link until the link tolerance time is passed
AND we have made 100 retransmissions attempts.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-07 21:16:01 +09:00
Jon Maloy
9faa89d4ed tipc: make function tipc_net_finalize() thread safe
The setting of the node address is not thread safe, meaning that
two discoverers may decide to set it simultanously, with a duplicate
entry in the name table as result. We fix that with this commit.

Fixes: 25b0b9c4e8 ("tipc: handle collisions of 32-bit node address hash values")
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-07 19:49:02 +09:00
Jon Maloy
92018c7ca9 tipc: fix correct setting of message type in second discoverer
The duplicate address discovery protocol is not safe against two
discoverers running in parallel. The one executing first after the
trial period is over will set the node address and change its own
message type to DSC_REQ_MSG. The one executing last may find that the
node address is already set, and never change message type, with the
result that its links may never be established.

In this commmit we ensure that the message type always is set correctly
after the trial period is over.

Fixes: 25b0b9c4e8 ("tipc: handle collisions of 32-bit node address hash values")
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-07 19:49:01 +09:00
Jon Maloy
e415577f57 tipc: correct discovery message handling during address trial period
With the duplicate address discovery protocol for tipc nodes addresses
we introduced a one second trial period before a node is allocated a
hash number to use as address.

Unfortunately, we miss to handle the case when a regular LINK REQUEST/
RESPONSE arrives from a cluster node during the trial period. Such
messages are not ignored as they should be, leading to links setup
attempts while the node still has no address.

Fixes: 25b0b9c4e8 ("tipc: handle collisions of 32-bit node address hash values")
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-07 19:49:01 +09:00
Jon Maloy
2a57f18242 tipc: fix wrong return value from function tipc_node_try_addr()
The function for checking if there is an node address conflict is
supposed to return a suggestion for a new address if it finds a
conflict, and zero otherwise. But in case the peer being checked
is previously unknown it does instead return a "suggestion" for
the checked address itself. This results in a DSC_TRIAL_FAIL_MSG
being sent unecessarily to the peer, and sometimes makes the trial
period starting over again.

Fixes: 25b0b9c4e8 ("tipc: handle collisions of 32-bit node address hash values")
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-07 19:49:01 +09:00
Gustavo A. R. Silva
c53e0c787e tipc: mark expected switch fall-throughs
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Warning level 2 was used: -Wimplicit-fallthrough=2

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-05 20:24:01 +09:00
David S. Miller
5cd3da4ba2 Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net
Simple overlapping changes in stmmac driver.

Adjust skb_gro_flush_final_remcsum function signature to make GRO list
changes in net-next, as per Stephen Rothwell's example merge
resolution.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-03 10:29:26 +09:00
GhantaKrishnamurthy MohanKrishna
a1be5a20f1 tipc: extend sock diag for group communication
This commit extends the existing TIPC socket diagnostics framework
for information related to TIPC group communication.

Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: GhantaKrishnamurthy MohanKrishna <mohan.krishna.ghanta.krishnamurthy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-30 21:05:42 +09:00
GhantaKrishnamurthy MohanKrishna
6a939f365b tipc: Auto removal of peer down node instance
A peer node is considered down if there are no
active links (or) lost contact to the node. In current implementation,
a peer node instance is deleted either if

a) TIPC module is removed (or)
b) Application can use a netlink/iproute2 interface to delete a
specific down node.

Thus, a down node instance lives in the system forever, unless the
application explicitly removes it.

We fix this by deleting the nodes which are down for
a specified amount of time (5 minutes).
Existing node supervision timer is used to achieve this.

Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: GhantaKrishnamurthy MohanKrishna <mohan.krishna.ghanta.krishnamurthy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-30 21:05:23 +09:00
Tung Nguyen
759f29b62f tipc: optimize function tipc_node_timeout()
In single-link usage, the function tipc_node_timeout() still iterates
over the whole link array to handle each link. Given that the maximum
number of bearers are 3, there are 2 redundant iterations with lock
grab/release. Since this function is executing very frequently it makes
sense to optimize it.

This commit adds conditional checking to exit from the loop if the
known number of configured links has already been accessed.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-30 20:51:39 +09:00