linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-30 16:11:38 +00:00

Author	SHA1	Message	Date
Eric Dumazet	216808c6ba	tcp: refine rule to allow EPOLLOUT generation under mem pressure At the time commit `ce5ec44099` ("tcp: ensure epoll edge trigger wakeup when write queue is empty") was added to the kernel, we still had a single write queue, combining rtx and write queues. Once we moved the rtx queue into a separate rb-tree, testing if sk_write_queue is empty has been suboptimal. Indeed, if we have packets in the rtx queue, we probably want to delay the EPOLLOUT generation at the time incoming packets will free them, making room, but more importantly avoiding flooding application with EPOLLOUT events. Solution is to use tcp_rtx_and_write_queues_empty() helper. Fixes: `75c119afe1` ("tcp: implement rb-tree based retransmit queue") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jason Baron <jbaron@akamai.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>	2019-12-13 21:58:40 -08:00
Eric Dumazet	ee2aabd3fc	tcp: refine tcp_write_queue_empty() implementation Due to how tcp_sendmsg() is implemented, we can have an empty skb at the tail of the write queue. Most [1] tcp_write_queue_empty() callers want to know if there is anything to send (payload and/or FIN) Instead of checking if the sk_write_queue is empty, we need to test if tp->write_seq == tp->snd_nxt [1] tcp_send_fin() was the only caller that expected to see if an skb was in the write queue, I have changed the code to reuse the tcp_write_queue_tail() result. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>	2019-12-13 21:58:40 -08:00
Eric Dumazet	1f85e6267c	tcp: do not send empty skb from tcp_write_xmit() Backport of commit `fdfc5c8594` ("tcp: remove empty skb from write queue in error cases") in linux-4.14 stable triggered various bugs. One of them has been fixed in commit ba2ddb43f270 ("tcp: Don't dequeue SYN/FIN-segments from write-queue"), but we still have crashes in some occasions. Root-cause is that when tcp_sendmsg() has allocated a fresh skb and could not append a fragment before being blocked in sk_stream_wait_memory(), tcp_write_xmit() might be called and decide to send this fresh and empty skb. Sending an empty packet is not only silly, it might have caused many issues we had in the past with tp->packets_out being out of sync. Fixes: `c65f7f00c5` ("[TCP]: Simplify SKB data portion allocation with NETIF_F_SG.") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Christoph Paasch <cpaasch@apple.com> Acked-by: Neal Cardwell <ncardwell@google.com> Cc: Jason Baron <jbaron@akamai.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>	2019-12-13 21:58:40 -08:00
Eric Dumazet	5c9934b676	6pack,mkiss: fix possible deadlock We got another syzbot report [1] that tells us we must use write_lock_irq()/write_unlock_irq() to avoid possible deadlock. [1] WARNING: inconsistent lock state 5.5.0-rc1-syzkaller #0 Not tainted -------------------------------- inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-R} usage. syz-executor826/9605 [HC1[1]:SC0[0]:HE0:SE1] takes: ffffffff8a128718 (disc_data_lock){+-..}, at: sp_get.isra.0+0x1d/0xf0 drivers/net/ppp/ppp_synctty.c:138 {HARDIRQ-ON-W} state was registered at: lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4485 __raw_write_lock_bh include/linux/rwlock_api_smp.h:203 [inline] _raw_write_lock_bh+0x33/0x50 kernel/locking/spinlock.c:319 sixpack_close+0x1d/0x250 drivers/net/hamradio/6pack.c:657 tty_ldisc_close.isra.0+0x119/0x1a0 drivers/tty/tty_ldisc.c:489 tty_set_ldisc+0x230/0x6b0 drivers/tty/tty_ldisc.c:585 tiocsetd drivers/tty/tty_io.c:2337 [inline] tty_ioctl+0xe8d/0x14f0 drivers/tty/tty_io.c:2597 vfs_ioctl fs/ioctl.c:47 [inline] file_ioctl fs/ioctl.c:545 [inline] do_vfs_ioctl+0x977/0x14e0 fs/ioctl.c:732 ksys_ioctl+0xab/0xd0 fs/ioctl.c:749 __do_sys_ioctl fs/ioctl.c:756 [inline] __se_sys_ioctl fs/ioctl.c:754 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:754 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe irq event stamp: 3946 hardirqs last enabled at (3945): [<ffffffff87c86e43>] __raw_spin_unlock_irq include/linux/spinlock_api_smp.h:168 [inline] hardirqs last enabled at (3945): [<ffffffff87c86e43>] _raw_spin_unlock_irq+0x23/0x80 kernel/locking/spinlock.c:199 hardirqs last disabled at (3946): [<ffffffff8100675f>] trace_hardirqs_off_thunk+0x1a/0x1c arch/x86/entry/thunk_64.S:42 softirqs last enabled at (2658): [<ffffffff86a8b4df>] spin_unlock_bh include/linux/spinlock.h:383 [inline] softirqs last enabled at (2658): [<ffffffff86a8b4df>] clusterip_netdev_event+0x46f/0x670 net/ipv4/netfilter/ipt_CLUSTERIP.c:222 softirqs last disabled at (2656): [<ffffffff86a8b22b>] spin_lock_bh include/linux/spinlock.h:343 [inline] softirqs last disabled at (2656): [<ffffffff86a8b22b>] clusterip_netdev_event+0x1bb/0x670 net/ipv4/netfilter/ipt_CLUSTERIP.c:196 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(disc_data_lock); <Interrupt> lock(disc_data_lock); * DEADLOCK * 5 locks held by syz-executor826/9605: #0: ffff8880a905e198 (&tty->legacy_mutex){+.+.}, at: tty_lock+0xc7/0x130 drivers/tty/tty_mutex.c:19 #1: ffffffff899a56c0 (rcu_read_lock){....}, at: mutex_spin_on_owner+0x0/0x330 kernel/locking/mutex.c:413 #2: ffff8880a496a2b0 (&(&i->lock)->rlock){-.-.}, at: spin_lock include/linux/spinlock.h:338 [inline] #2: ffff8880a496a2b0 (&(&i->lock)->rlock){-.-.}, at: serial8250_interrupt+0x2d/0x1a0 drivers/tty/serial/8250/8250_core.c:116 #3: ffffffff8c104048 (&port_lock_key){-.-.}, at: serial8250_handle_irq.part.0+0x24/0x330 drivers/tty/serial/8250/8250_port.c:1823 #4: ffff8880a905e090 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref+0x22/0x90 drivers/tty/tty_ldisc.c:288 stack backtrace: CPU: 1 PID: 9605 Comm: syz-executor826 Not tainted 5.5.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x197/0x210 lib/dump_stack.c:118 print_usage_bug.cold+0x327/0x378 kernel/locking/lockdep.c:3101 valid_state kernel/locking/lockdep.c:3112 [inline] mark_lock_irq kernel/locking/lockdep.c:3309 [inline] mark_lock+0xbb4/0x1220 kernel/locking/lockdep.c:3666 mark_usage kernel/locking/lockdep.c:3554 [inline] __lock_acquire+0x1e55/0x4a00 kernel/locking/lockdep.c:3909 lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4485 __raw_read_lock include/linux/rwlock_api_smp.h:149 [inline] _raw_read_lock+0x32/0x50 kernel/locking/spinlock.c:223 sp_get.isra.0+0x1d/0xf0 drivers/net/ppp/ppp_synctty.c:138 sixpack_write_wakeup+0x25/0x340 drivers/net/hamradio/6pack.c:402 tty_wakeup+0xe9/0x120 drivers/tty/tty_io.c:536 tty_port_default_wakeup+0x2b/0x40 drivers/tty/tty_port.c:50 tty_port_tty_wakeup+0x57/0x70 drivers/tty/tty_port.c:387 uart_write_wakeup+0x46/0x70 drivers/tty/serial/serial_core.c:104 serial8250_tx_chars+0x495/0xaf0 drivers/tty/serial/8250/8250_port.c:1761 serial8250_handle_irq.part.0+0x2a2/0x330 drivers/tty/serial/8250/8250_port.c:1834 serial8250_handle_irq drivers/tty/serial/8250/8250_port.c:1820 [inline] serial8250_default_handle_irq+0xc0/0x150 drivers/tty/serial/8250/8250_port.c:1850 serial8250_interrupt+0xf1/0x1a0 drivers/tty/serial/8250/8250_core.c:126 __handle_irq_event_percpu+0x15d/0x970 kernel/irq/handle.c:149 handle_irq_event_percpu+0x74/0x160 kernel/irq/handle.c:189 handle_irq_event+0xa7/0x134 kernel/irq/handle.c:206 handle_edge_irq+0x25e/0x8d0 kernel/irq/chip.c:830 generic_handle_irq_desc include/linux/irqdesc.h:156 [inline] do_IRQ+0xde/0x280 arch/x86/kernel/irq.c:250 common_interrupt+0xf/0xf arch/x86/entry/entry_64.S:607 </IRQ> RIP: 0010:cpu_relax arch/x86/include/asm/processor.h:685 [inline] RIP: 0010:mutex_spin_on_owner+0x247/0x330 kernel/locking/mutex.c:579 Code: c3 be 08 00 00 00 4c 89 e7 e8 e5 06 59 00 4c 89 e0 48 c1 e8 03 42 80 3c 38 00 0f 85 e1 00 00 00 49 8b 04 24 a8 01 75 96 f3 90 <e9> 2f fe ff ff 0f 0b e8 0d 19 09 00 84 c0 0f 85 ff fd ff ff 48 c7 RSP: 0018:ffffc90001eafa20 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd7 RAX: 0000000000000000 RBX: ffff88809fd9e0c0 RCX: 1ffffffff13266dd RDX: 0000000000000000 RSI: 0000000000000008 RDI: 0000000000000000 RBP: ffffc90001eafa60 R08: 1ffff11013d22898 R09: ffffed1013d22899 R10: ffffed1013d22898 R11: ffff88809e9144c7 R12: ffff8880a905e138 R13: ffff88809e9144c0 R14: 0000000000000000 R15: dffffc0000000000 mutex_optimistic_spin kernel/locking/mutex.c:673 [inline] __mutex_lock_common kernel/locking/mutex.c:962 [inline] __mutex_lock+0x32b/0x13c0 kernel/locking/mutex.c:1106 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1121 tty_lock+0xc7/0x130 drivers/tty/tty_mutex.c:19 tty_release+0xb5/0xe90 drivers/tty/tty_io.c:1665 __fput+0x2ff/0x890 fs/file_table.c:280 ____fput+0x16/0x20 fs/file_table.c:313 task_work_run+0x145/0x1c0 kernel/task_work.c:113 exit_task_work include/linux/task_work.h:22 [inline] do_exit+0x8e7/0x2ef0 kernel/exit.c:797 do_group_exit+0x135/0x360 kernel/exit.c:895 __do_sys_exit_group kernel/exit.c:906 [inline] __se_sys_exit_group kernel/exit.c:904 [inline] __x64_sys_exit_group+0x44/0x50 kernel/exit.c:904 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x43fef8 Code: Bad RIP value. RSP: 002b:00007ffdb07d2338 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000043fef8 RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000 RBP: 00000000004bf730 R08: 00000000000000e7 R09: ffffffffffffffd0 R10: 00000000004002c8 R11: 0000000000000246 R12: 0000000000000001 R13: 00000000006d1180 R14: 0000000000000000 R15: 0000000000000000 Fixes: `6e4e2f811b` ("6pack,mkiss: fix lock inconsistency") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>	2019-12-13 21:49:29 -08:00
Eric Dumazet	8dbd76e79a	tcp/dccp: fix possible race __inet_lookup_established() Michal Kubecek and Firo Yang did a very nice analysis of crashes happening in __inet_lookup_established(). Since a TCP socket can go from TCP_ESTABLISH to TCP_LISTEN (via a close()/socket()/listen() cycle) without a RCU grace period, I should not have changed listeners linkage in their hash table. They must use the nulls protocol (Documentation/RCU/rculist_nulls.txt), so that a lookup can detect a socket in a hash list was moved in another one. Since we added code in commit `d296ba60d8` ("soreuseport: Resolve merge conflict for v4/v6 ordering fix"), we have to add hlist_nulls_add_tail_rcu() helper. Fixes: `3b24d854cb` ("tcp/dccp: do not touch listener sk_refcnt under synflood") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Michal Kubecek <mkubecek@suse.cz> Reported-by: Firo Yang <firo.yang@suse.com> Reviewed-by: Michal Kubecek <mkubecek@suse.cz> Link: https://lore.kernel.org/netdev/20191120083919.GH27852@unicorn.suse.cz/ Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>	2019-12-13 21:40:49 -08:00
Thomas Falcon	8f9cc1ee29	net/ibmvnic: Fix typo in retry check This conditional is missing a bang, with the intent being to break when the retry count reaches zero. Fixes: `476d96ca9c` ("ibmvnic: Bound waits for device queries") Suggested-by: Juliet Kim <julietk@linux.vnet.ibm.com> Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>	2019-12-13 17:23:10 -08:00
Hangbin Liu	2beb6d2901	ipv6/addrconf: only check invalid header values when NETLINK_F_STRICT_CHK is set In commit `4b1373de73` ("net: ipv6: addr: perform strict checks also for doit handlers") we add strict check for inet6_rtm_getaddr(). But we did the invalid header values check before checking if NETLINK_F_STRICT_CHK is set. This may break backwards compatibility if user already set the ifm->ifa_prefixlen, ifm->ifa_flags, ifm->ifa_scope in their netlink code. I didn't move the nlmsg_len check because I thought it's a valid check. Reported-by: Jianlin Shi <jishi@redhat.com> Fixes: `4b1373de73` ("net: ipv6: addr: perform strict checks also for doit handlers") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>	2019-12-13 17:13:49 -08:00
Arnd Bergmann	03b06e3f83	ptp: clockmatrix: add I2C dependency Without I2C, we get a link failure: drivers/ptp/ptp_clockmatrix.o: In function `idtcm_xfer.isra.3': ptp_clockmatrix.c:(.text+0xcc): undefined reference to `i2c_transfer' drivers/ptp/ptp_clockmatrix.o: In function `idtcm_driver_init': ptp_clockmatrix.c:(.init.text+0x14): undefined reference to `i2c_register_driver' drivers/ptp/ptp_clockmatrix.o: In function `idtcm_driver_exit': ptp_clockmatrix.c:(.exit.text+0x10): undefined reference to `i2c_del_driver' Fixes: `3a6ba7dc77` ("ptp: Add a ptp clock driver for IDT ClockMatrix.") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Vincent Cheng <vincent.cheng.xh@renesas.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>	2019-12-13 15:42:09 -08:00
Jonathan Lemon	6adc4601c2	bnxt: apply computed clamp value for coalece parameter After executing "ethtool -C eth0 rx-usecs-irq 0", the box becomes unresponsive, likely due to interrupt livelock. It appears that a minimum clamp value for the irq timer is computed, but is never applied. Fix by applying the corrected clamp value. Fixes: `74706afa71` ("bnxt_en: Update interrupt coalescing logic.") Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>	2019-12-13 15:32:44 -08:00
Vivien Didelot	692b93af71	mailmap: add entry for myself I no longer work at Savoir-faire Linux but even though MAINTAINERS is up-to-date, some emails are still sent to my old email address. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>	2019-12-13 15:15:33 -08:00
Subash Abhinov Kasiviswanathan	9c3194dd93	MAINTAINERS: Add maintainers for rmnet Add myself and Sean as maintainers for rmnet driver. Signed-off-by: Sean Tranchetti <stranche@codeaurora.org> Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>	2019-12-13 15:14:24 -08:00
Manish Chopra	0af67e49b0	qede: Fix multicast mac configuration Driver doesn't accommodate the configuration for max number of multicast mac addresses, in such particular case it leaves the device with improper/invalid multicast configuration state, causing connectivity issues (in lacp bonding like scenarios). Signed-off-by: Manish Chopra <manishc@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-12 11:08:36 -08:00
Cristian Birsan	20032b6358	net: usb: lan78xx: Fix suspend/resume PHY register access error Lan78xx driver accesses the PHY registers through MDIO bus over USB connection. When performing a suspend/resume, the PHY registers can be accessed before the USB connection is resumed. This will generate an error and will prevent the device to resume correctly. This patch adds the dependency between the MDIO bus and USB device to allow correct handling of suspend/resume. Fixes: `ce85e13ad6` ("lan78xx: Update to use phylib instead of mii_if_info.") Signed-off-by: Cristian Birsan <cristian.birsan@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-12 10:57:12 -08:00
David S. Miller	148709bc27	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Alexei Starovoitov says: ==================== pull-request: bpf 2019-12-11 The following pull-request contains BPF updates for your net tree. We've added 8 non-merge commits during the last 1 day(s) which contain a total of 10 files changed, 126 insertions(+), 18 deletions(-). The main changes are: 1) Make BPF trampoline co-exist with ftrace-based tracers, from Alexei. 2) Fix build in minimal configurations, from Arnd. 3) Fix mips, riscv bpf_tail_call limit, from Paul. 4) Fix bpftool segfault, from Toke. 5) Fix samples/bpf, from Daniel. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-11 20:13:45 -08:00
Daniel T. Lee	fe3300897c	samples: bpf: fix syscall_tp due to unused syscall Currently, open() is called from the user program and it calls the syscall 'sys_openat', not the 'sys_open'. This leads to an error of the program of user side, due to the fact that the counter maps are zero since no function such 'sys_open' is called. This commit adds the kernel bpf program which are attached to the tracepoint 'sys_enter_openat' and 'sys_enter_openat'. Fixes: `1da236b6be` ("bpf: add a test case for syscalls/sys_{enter\|exit}_* tracepoints") Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-12-11 15:28:06 -08:00
Daniel T. Lee	bba1b2a890	samples: bpf: Replace symbol compare of trace_event Previously, when this sample is added, commit `1c47910ef8` ("samples/bpf: add perf_event+bpf example"), a symbol 'sys_read' and 'sys_write' has been used without no prefixes. But currently there are no exact symbols with these under kallsyms and this leads to failure. This commit changes exact compare to substring compare to keep compatible with exact symbol or prefixed symbol. Fixes: `1c47910ef8` ("samples/bpf: add perf_event+bpf example") Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191205080114.19766-2-danieltimlee@gmail.com	2019-12-11 15:27:28 -08:00
Alexei Starovoitov	7f193c2519	selftests/bpf: Test function_graph tracer and bpf trampoline together Add simple test script to execute funciton graph tracer while BPF trampoline attaches and detaches from the functions being graph traced. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191209000114.1876138-4-ast@kernel.org	2019-12-11 15:19:29 -08:00
Alexei Starovoitov	b91e014f07	bpf: Make BPF trampoline use register_ftrace_direct() API Make BPF trampoline attach its generated assembly code to kernel functions via register_ftrace_direct() API. It helps ftrace-based tracers co-exist with BPF trampoline on the same kernel function. It also switches attaching logic from arch specific text_poke to generic ftrace that is available on many architectures. text_poke is still necessary for bpf-to-bpf attach and for bpf_tail_call optimization. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191209000114.1876138-3-ast@kernel.org	2019-12-11 15:18:08 -08:00
Toke Høiland-Jørgensen	5b79bcdf03	bpftool: Don't crash on missing jited insns or ksyms When the kptr_restrict sysctl is set, the kernel can fail to return jited_ksyms or jited_prog_insns, but still have positive values in nr_jited_ksyms and jited_prog_len. This causes bpftool to crash when trying to dump the program because it only checks the len fields not the actual pointers to the instructions and ksyms. Fix this by adding the missing checks. Fixes: `71bb428fe2` ("tools: bpf: add bpftool") Fixes: `f84192ee00` ("tools: bpftool: resolve calls without using imm field") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20191210181412.151226-1-toke@redhat.com	2019-12-11 13:57:26 +01:00
Arnd Bergmann	4c80c7bc58	bpf: Fix build in minimal configurations, again Building with -Werror showed another failure: kernel/bpf/btf.c: In function 'btf_get_prog_ctx_type.isra.31': kernel/bpf/btf.c:3508:63: error: array subscript 0 is above array bounds of 'u8[0]' {aka 'unsigned char[0]'} [-Werror=array-bounds] ctx_type = btf_type_member(conv_struct) + bpf_ctx_convert_map[prog_type] * 2; I don't actually understand why the array is empty, but a similar fix has addressed a related problem, so I suppose we can do the same thing here. Fixes: `ce27709b81` ("bpf: Fix build in minimal configurations") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20191210203553.2941035-1-arnd@arndb.de	2019-12-11 13:57:26 +01:00
Paul Chaignon	e49e6f6db0	bpf, mips: Limit to 33 tail calls All BPF JIT compilers except RISC-V's and MIPS' enforce a 33-tail calls limit at runtime. In addition, a test was recently added, in tailcalls2, to check this limit. This patch updates the tail call limit in MIPS' JIT compiler to allow 33 tail calls. Fixes: `b6bd53f9c4` ("MIPS: Add missing file for eBPF JIT.") Reported-by: Mahshid Khezri <khezri.mahshid@gmail.com> Signed-off-by: Paul Chaignon <paul.chaignon@orange.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/b8eb2caac1c25453c539248e56ca22f74b5316af.1575916815.git.paul.chaignon@gmail.com	2019-12-11 13:57:22 +01:00
Paul Chaignon	96bc4432f5	bpf, riscv: Limit to 33 tail calls All BPF JIT compilers except RISC-V's and MIPS' enforce a 33-tail calls limit at runtime. In addition, a test was recently added, in tailcalls2, to check this limit. This patch updates the tail call limit in RISC-V's JIT compiler to allow 33 tail calls. I tested it using the above selftest on an emulated RISCV64. Fixes: `2353ecc6f9` ("bpf, riscv: add BPF JIT for RV64G") Reported-by: Mahshid Khezri <khezri.mahshid@gmail.com> Signed-off-by: Paul Chaignon <paul.chaignon@orange.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/966fe384383bf23a0ee1efe8d7291c78a3fb832b.1575916815.git.paul.chaignon@gmail.com	2019-12-11 13:57:17 +01:00
Netanel Belgazal	24dee0c747	net: ena: fix napi handler misbehavior when the napi budget is zero In netpoll the napi handler could be called with budget equal to zero. Current ENA napi handler doesn't take that into consideration. The napi handler handles Rx packets in a do-while loop. Currently, the budget check happens only after decrementing the budget, therefore the napi handler, in rare cases, could run over MAX_INT packets. In addition to that, this moves all budget related variables to int calculation and stop mixing u32 to avoid ambiguity Fixes: `1738cd3ed3` ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-10 17:54:55 -08:00
David S. Miller	f1ce0a1557	Merge branch 'tipc-fix-some-issues' Tuong Lien says: ==================== tipc: fix some issues This series consists of some bug-fixes for TIPC. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-10 17:45:04 -08:00
Tuong Lien	31e4ccc99e	tipc: fix use-after-free in tipc_disc_rcv() In the function 'tipc_disc_rcv()', the 'msg_peer_net_hash()' is called to read the header data field but after the message skb has been freed, that might result in a garbage value... This commit fixes it by defining a new local variable to store the data first, just like the other header fields' handling. Fixes: `f73b12812a` ("tipc: improve throughput between nodes in netns") Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-10 17:45:04 -08:00
Tuong Lien	abc9b4e054	tipc: fix retrans failure due to wrong destination When a user message is sent, TIPC will check if the socket has faced a congestion at link layer. If that happens, it will make a sleep to wait for the congestion to disappear. This leaves a gap for other users to take over the socket (e.g. multi threads) since the socket is released as well. Also, in case of connectionless (e.g. SOCK_RDM), user is free to send messages to various destinations (e.g. via 'sendto()'), then the socket's preformatted header has to be updated correspondingly prior to the actual payload message building. Unfortunately, the latter action is done before the first action which causes a condition issue that the destination of a certain message can be modified incorrectly in the middle, leading to wrong destination when that message is built. Consequently, when the message is sent to the link layer, it gets stuck there forever because the peer node will simply reject it. After a number of retransmission attempts, the link is eventually taken down and the retransmission failure is reported. This commit fixes the problem by rearranging the order of actions to prevent the race condition from occurring, so the message building is 'atomic' and its header will not be modified by anyone. Fixes: `365ad353c2` ("tipc: reduce risk of user starvation during link congestion") Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-10 17:45:04 -08:00
Tuong Lien	dca4a17d24	tipc: fix potential hanging after b/rcast changing In commit `c55c8edafa` ("tipc: smooth change between replicast and broadcast"), we allow instant switching between replicast and broadcast by sending a dummy 'SYN' packet on the last used link to synchronize packets on the links. The 'SYN' message is an object of link congestion also, so if that happens, a 'SOCK_WAKEUP' will be scheduled to be sent back to the socket... However, in that commit, we simply use the same socket 'cong_link_cnt' counter for both the 'SYN' & normal payload message sending. Therefore, if both the replicast & broadcast links are congested, the counter will be not updated correctly but overwritten by the latter congestion. Later on, when the 'SOCK_WAKEUP' messages are processed, the counter is reduced one by one and eventually overflowed. Consequently, further activities on the socket will only wait for the false congestion signal to disappear but never been met. Because sending the 'SYN' message is vital for the mechanism, it should be done anyway. This commit fixes the issue by marking the message with an error code e.g. 'TIPC_ERR_NO_PORT', so its sending should not face a link congestion, there is no need to touch the socket 'cong_link_cnt' either. In addition, in the event of any error (e.g. -ENOBUFS), we will purge the entire payload message queue and make a return immediately. Fixes: `c55c8edafa` ("tipc: smooth change between replicast and broadcast") Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-10 17:45:04 -08:00
Tuong Lien	d5162f341e	tipc: fix name table rbtree issues The current rbtree for service ranges in the name table is built based on the 'lower' & 'upper' range values resulting in a flaw in the rbtree searching. Some issues have been observed in case of range overlapping: Case #1: unable to withdraw a name entry: After some name services are bound, all of them are withdrawn by user but one remains in the name table forever. This corrupts the table and that service becomes dummy i.e. no real port. E.g. / {22, 22} / / ---> {10, 50} / \ / \ {10, 30} {20, 60} The node {10, 30} cannot be removed since the rbtree searching stops at the node's ancestor i.e. {10, 50}, so starting from it will never reach the finding node. Case #2: failed to send data in some cases: E.g. Two service ranges: {20, 60}, {10, 50} are bound. The rbtree for this service will be one of the two cases below depending on the order of the bindings: {20, 60} {10, 50} <-- / \ / \ / \ / \ {10, 50} NIL <-- NIL {20, 60} (a) (b) Now, try to send some data to service {30}, there will be two results: (a): Failed, no route to host. (b): Ok. The reason is that the rbtree searching will stop at the pointing node as shown above. Case #3: Same as case #2b above but if the data sending's scope is local and the {10, 50} is published by a peer node, then it will result in 'no route to host' even though the other {20, 60} is for example on the local node which should be able to get the data. The issues are actually due to the way we built the rbtree. This commit fixes it by introducing an additional field to each node - named 'max', which is the largest 'upper' of that node subtree. The 'max' value for each subtrees will be propagated correctly whenever a node is inserted/ removed or the tree is rebalanced by the augmented rbtree callbacks. By this way, we can change the rbtree searching appoarch to solve the issues above. Another benefit from this is that we can now improve the searching for a next range matching e.g. in case of multicast, so get rid of the unneeded looping over all nodes in the tree. Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-10 17:45:04 -08:00
David S. Miller	ac397934b3	Merge branch 'bnxt_en-Error-recovery-fixes' Michael Chan says: ==================== bnxt_en: Error recovery fixes. This patch series contains fixes mostly for the error recovery feature and related areas. Please queue the series for -stable also. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-10 17:37:14 -08:00
Vasundhara Volam	7e334fc800	bnxt_en: Add missing devlink health reporters for VFs. The VF driver also needs to create the health reporters since VFs are also involved in firmware reset and recovery. Modify bnxt_dl_register() and bnxt_dl_unregister() so that they can be called by the VFs to register/unregister devlink. Only the PF will register the devlink parameters. With devlink registered, we can now create the health reporters on the VFs. Fixes: `6763c779c2` ("bnxt_en: Add new FW devlink_health_reporter") Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-10 17:37:14 -08:00
Vasundhara Volam	937f188c1f	bnxt_en: Fix the logic that creates the health reporters. Fix the logic to properly check the fw capabilities and create the devlink health reporters only when needed. The current code creates the reporters unconditionally as long as bp->fw_health is valid, and that's not correct. Call bnxt_dl_fw_reporters_create() directly from the init and reset code path instead of from bnxt_dl_register(). This allows the reporters to be adjusted when capabilities change. The same applies to bnxt_dl_fw_reporters_destroy(). Fixes: `6763c779c2` ("bnxt_en: Add new FW devlink_health_reporter") Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-10 17:37:14 -08:00
Vasundhara Volam	0797c10d2d	bnxt_en: Remove unnecessary NULL checks for fw_health After fixing the allocation of bp->fw_health in the previous patch, the driver will not go through the fw reset and recovery code paths if bp->fw_health allocation fails. So we can now remove the unnecessary NULL checks. Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-10 17:37:14 -08:00
Vasundhara Volam	8280b38e01	bnxt_en: Fix bp->fw_health allocation and free logic. bp->fw_health needs to be allocated for either the firmware initiated reset feature or the driver initiated error recovery feature. The current code is not allocating bp->fw_health for all the necessary cases. This patch corrects the logic to allocate bp->fw_health correctly when needed. If allocation fails, we clear the feature flags. We also add the the missing kfree(bp->fw_health) when the driver is unloaded. If we get an async reset message from the firmware, we also need to make sure that we have a valid bp->fw_health before proceeding. Fixes: `07f83d72d2` ("bnxt_en: Discover firmware error recovery capabilities.") Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-10 17:37:14 -08:00
Vasundhara Volam	c74751f4c3	bnxt_en: Return error if FW returns more data than dump length If any change happened in the configuration of VF in VM while collecting live dump, there could be a race and firmware can return more data than allocated dump length. Fix it by keeping track of the accumulated core dump length copied so far and abort the copy with error code if the next chunk of core dump will exceed the original dump length. Fixes: `6c5657d085` ("bnxt_en: Add support for ethtool get dump.") Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-10 17:37:14 -08:00
Michael Chan	325f85f37e	bnxt_en: Free context memory in the open path if firmware has been reset. This will trigger new context memory to be rediscovered and allocated during the re-probe process after a firmware reset. Without this, the newly reset firmware does not have valid context memory and the driver will eventually fail to allocate some resources. Fixes: `ec5d31e3c1` ("bnxt_en: Handle firmware reset status during IF_UP.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-10 17:37:14 -08:00
Michael Chan	0c722ec0a2	bnxt_en: Fix MSIX request logic for RDMA driver. The logic needs to check both bp->total_irqs and the reserved IRQs in hw_resc->resv_irqs if applicable and see if both are enough to cover the L2 and RDMA requested vectors. The current code is only checking bp->total_irqs and can fail in some code paths, such as the TX timeout code path with the RDMA driver requesting vectors after recovery. In this code path, we have not reserved enough MSIX resources for the RDMA driver yet. Fixes: `75720e6323` ("bnxt_en: Keep track of reserved IRQs.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-10 17:37:14 -08:00
Stephan Gerhold	868afbaca1	NFC: nxp-nci: Fix probing without ACPI devm_acpi_dev_add_driver_gpios() returns -ENXIO if CONFIG_ACPI is disabled (e.g. on device tree platforms). In this case, nxp-nci will silently fail to probe. The other NFC drivers only log a debug message if devm_acpi_dev_add_driver_gpios() fails. Do the same in nxp-nci to fix this problem. Fixes: `ad0acfd69a` ("NFC: nxp-nci: Get rid of code duplication in ->probe()") Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Stephan Gerhold <stephan@gerhold.net> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-09 20:15:17 -08:00
Davide Caratti	991a34593b	tc-testing: unbreak full listing of tdc testcases the following command currently fails: [root@fedora tc-testing]# ./tdc.py -l The following test case IDs are not unique: {'6f5e'} Please correct them before continuing. this happens because there are two tests having the same id: [root@fedora tc-testing]# grep -r 6f5e tc-tests/* tc-tests/actions/pedit.json: "id": "6f5e", tc-tests/filters/basic.json: "id": "6f5e", fix it replacing the latest duplicate id with a brand new one: [root@fedora tc-testing]# sed -i 's/6f5e//1' tc-tests/filters/basic.json [root@fedora tc-testing]# ./tdc.py -i Fixes: `4717b05328` ("tc-testing: Introduced tdc tests for basic filter") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-09 14:44:31 -08:00
Chuhong Yuan	a288f105a0	fjes: fix missed check in fjes_acpi_add fjes_acpi_add() misses a check for platform_device_register_simple(). Add a check to fix it. Fixes: `658d439b22` ("fjes: Introduce FUJITSU Extended Socket Network Device driver") Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-09 14:41:25 -08:00
Mao Wenan	b43d1f9f70	af_packet: set defaule value for tmo There is softlockup when using TPACKET_V3: ... NMI watchdog: BUG: soft lockup - CPU#2 stuck for 60010ms! (__irq_svc) from [<c0558a0c>] (_raw_spin_unlock_irqrestore+0x44/0x54) (_raw_spin_unlock_irqrestore) from [<c027b7e8>] (mod_timer+0x210/0x25c) (mod_timer) from [<c0549c30>] (prb_retire_rx_blk_timer_expired+0x68/0x11c) (prb_retire_rx_blk_timer_expired) from [<c027a7ac>] (call_timer_fn+0x90/0x17c) (call_timer_fn) from [<c027ab6c>] (run_timer_softirq+0x2d4/0x2fc) (run_timer_softirq) from [<c021eaf4>] (__do_softirq+0x218/0x318) (__do_softirq) from [<c021eea0>] (irq_exit+0x88/0xac) (irq_exit) from [<c0240130>] (msa_irq_exit+0x11c/0x1d4) (msa_irq_exit) from [<c0209cf0>] (handle_IPI+0x650/0x7f4) (handle_IPI) from [<c02015bc>] (gic_handle_irq+0x108/0x118) (gic_handle_irq) from [<c0558ee4>] (__irq_usr+0x44/0x5c) ... If __ethtool_get_link_ksettings() is failed in prb_calc_retire_blk_tmo(), msec and tmo will be zero, so tov_in_jiffies is zero and the timer expire for retire_blk_timer is turn to mod_timer(&pkc->retire_blk_timer, jiffies + 0), which will trigger cpu usage of softirq is 100%. Fixes: `f6fb8f100b` ("af-packet: TPACKET_V3 flexible buffer implementation.") Tested-by: Xiao Jiangfeng <xiaojiangfeng@huawei.com> Signed-off-by: Mao Wenan <maowenan@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-09 14:30:19 -08:00
Grygorii Strashko	8a2b22203f	net: ethernet: ti: davinci_cpdma: fix warning "device driver frees DMA memory with different size" The TI CPSW(s) driver produces warning with DMA API debug options enabled: WARNING: CPU: 0 PID: 1033 at kernel/dma/debug.c:1025 check_unmap+0x4a8/0x968 DMA-API: cpsw 48484000.ethernet: device driver frees DMA memory with different size [device address=0x00000000abc6aa02] [map size=64 bytes] [unmap size=42 bytes] CPU: 0 PID: 1033 Comm: ping Not tainted 5.3.0-dirty #41 Hardware name: Generic DRA72X (Flattened Device Tree) [<c0112c60>] (unwind_backtrace) from [<c010d270>] (show_stack+0x10/0x14) [<c010d270>] (show_stack) from [<c09bc564>] (dump_stack+0xd8/0x110) [<c09bc564>] (dump_stack) from [<c013b93c>] (__warn+0xe0/0x10c) [<c013b93c>] (__warn) from [<c013b9ac>] (warn_slowpath_fmt+0x44/0x6c) [<c013b9ac>] (warn_slowpath_fmt) from [<c01e0368>] (check_unmap+0x4a8/0x968) [<c01e0368>] (check_unmap) from [<c01e08a8>] (debug_dma_unmap_page+0x80/0x90) [<c01e08a8>] (debug_dma_unmap_page) from [<c0752414>] (__cpdma_chan_free+0x114/0x16c) [<c0752414>] (__cpdma_chan_free) from [<c07525c4>] (__cpdma_chan_process+0x158/0x17c) [<c07525c4>] (__cpdma_chan_process) from [<c0753690>] (cpdma_chan_process+0x3c/0x5c) [<c0753690>] (cpdma_chan_process) from [<c0758660>] (cpsw_tx_mq_poll+0x48/0x94) [<c0758660>] (cpsw_tx_mq_poll) from [<c0803018>] (net_rx_action+0x108/0x4e4) [<c0803018>] (net_rx_action) from [<c010230c>] (__do_softirq+0xec/0x598) [<c010230c>] (__do_softirq) from [<c0143914>] (do_softirq.part.4+0x68/0x74) [<c0143914>] (do_softirq.part.4) from [<c0143a44>] (__local_bh_enable_ip+0x124/0x17c) [<c0143a44>] (__local_bh_enable_ip) from [<c0871590>] (ip_finish_output2+0x294/0xb7c) [<c0871590>] (ip_finish_output2) from [<c0875440>] (ip_output+0x210/0x364) [<c0875440>] (ip_output) from [<c0875e2c>] (ip_send_skb+0x1c/0xf8) [<c0875e2c>] (ip_send_skb) from [<c08a7fd4>] (raw_sendmsg+0x9a8/0xc74) [<c08a7fd4>] (raw_sendmsg) from [<c07d6b90>] (sock_sendmsg+0x14/0x24) [<c07d6b90>] (sock_sendmsg) from [<c07d8260>] (__sys_sendto+0xbc/0x100) [<c07d8260>] (__sys_sendto) from [<c01011ac>] (__sys_trace_return+0x0/0x14) Exception stack(0xea9a7fa8 to 0xea9a7ff0) ... The reason is that cpdma_chan_submit_si() now stores original buffer length (sw_len) in CPDMA descriptor instead of adjusted buffer length (hw_len) used to map the buffer. Hence, fix an issue by passing correct buffer length in CPDMA descriptor. Cc: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Fixes: `6670acacd5` ("net: ethernet: ti: davinci_cpdma: add dma mapped submit") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Reviewed-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-09 14:26:04 -08:00
David S. Miller	7da538c1e1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Wait for rcu grace period after releasing netns in ctnetlink, from Florian Westphal. 2) Incorrect command type in flowtable offload ndo invocation, from wenxu. 3) Incorrect callback type in flowtable offload flow tuple updates, also from wenxu. 4) Fix compile warning on flowtable offload infrastructure due to possible reference to uninitialized variable, from Nathan Chancellor. 5) Do not inline nf_ct_resolve_clash(), this is called from slow path / stress situations. From Florian Westphal. 6) Missing IPv6 flow selector description in flowtable offload. 7) Missing check for NETDEV_UNREGISTER in nf_tables offload infrastructure, from wenxu. 8) Update NAT selftest to use randomized netns names, from Florian Westphal. 9) Restore nfqueue bridge support, from Marco Oliverio. 10) Compilation warning in SCTP_CHUNKMAP_*() on xt_sctp header. From Phil Sutter. 11) Fix bogus lookup/get match for non-anonymous rbtree sets. 12) Missing netlink validation for NFT_SET_ELEM_INTERVAL_END elements. 13) Missing netlink validation for NFT_DATA_VALUE after nft_data_init(). 14) If rule specifies no actions, offload infrastructure returns EOPNOTSUPP. 15) Module refcount leak in object updates. 16) Missing sanitization for ARP traffic from br_netfilter, from Eric Dumazet. 17) Compilation breakage on big-endian due to incorrect memcpy() size in the flowtable offload infrastructure. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-09 14:03:33 -08:00
Pablo Neira Ayuso	7acd9378dc	netfilter: nf_flow_table_offload: Correct memcpy size for flow_overload_mangle() In function 'memcpy', inlined from 'flow_offload_mangle' at net/netfilter/nf_flow_table_offload.c:112:2, inlined from 'flow_offload_port_dnat' at net/netfilter/nf_flow_table_offload.c:373:2, inlined from 'nf_flow_rule_route_ipv4' at net/netfilter/nf_flow_table_offload.c:424:3: ./include/linux/string.h:376:4: error: call to '__read_overflow2' declared with attribute error: detected read beyond size of object passed as 2nd parameter 376 \| __read_overflow2(); \| ^~~~~~~~~~~~~~~~~~ The original u8* was done in the hope to make this more adaptable but consensus is to keep this like it is in tc pedit. Fixes: `c29f74e0df` ("netfilter: nf_flow_table: hardware offload support") Reported-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-12-09 20:07:59 +01:00
Martin Schiller	f8fc57e8d7	net/x25: add new state X25_STATE_5 This is needed, because if the flag X25_ACCPT_APPRV_FLAG is not set on a socket (manual call confirmation) and the channel is cleared by remote before the manual call confirmation was sent, this situation needs to be handled. Signed-off-by: Martin Schiller <ms@dev.tdt.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-09 10:28:43 -08:00
Ido Schimmel	65cb139862	selftests: forwarding: Delete IPv6 address at the end When creating the second host in h2_create(), two addresses are assigned to the interface, but only one is deleted. When running the test twice in a row the following error is observed: $ ./router_bridge_vlan.sh TEST: ping [ OK ] TEST: ping6 [ OK ] TEST: vlan [ OK ] $ ./router_bridge_vlan.sh RTNETLINK answers: File exists TEST: ping [ OK ] TEST: ping6 [ OK ] TEST: vlan [ OK ] Fix this by deleting the address during cleanup. Fixes: `5b1e7f9ebd` ("selftests: forwarding: Test routed bridge interface") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-09 10:19:45 -08:00
Ido Schimmel	62201c00c4	mlxsw: spectrum_router: Remove unlikely user-triggerable warning In case the driver vetoes the addition of an IPv6 multipath route, the IPv6 stack will emit delete notifications for the sibling routes that were already added to the FIB trie. Since these siblings are not present in hardware, a warning will be generated. Have the driver ignore notifications for routes it does not have. Fixes: `ebee3cad83` ("ipv6: Add IPv6 multipath notifications for add / replace") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-09 10:18:28 -08:00
Xin Long	b6f3320b1d	sctp: fully initialize v4 addr in some functions Syzbot found a crash: BUG: KMSAN: uninit-value in crc32_body lib/crc32.c:112 [inline] BUG: KMSAN: uninit-value in crc32_le_generic lib/crc32.c:179 [inline] BUG: KMSAN: uninit-value in __crc32c_le_base+0x4fa/0xd30 lib/crc32.c:202 Call Trace: crc32_body lib/crc32.c:112 [inline] crc32_le_generic lib/crc32.c:179 [inline] __crc32c_le_base+0x4fa/0xd30 lib/crc32.c:202 chksum_update+0xb2/0x110 crypto/crc32c_generic.c:90 crypto_shash_update+0x4c5/0x530 crypto/shash.c:107 crc32c+0x150/0x220 lib/libcrc32c.c:47 sctp_csum_update+0x89/0xa0 include/net/sctp/checksum.h:36 __skb_checksum+0x1297/0x12a0 net/core/skbuff.c:2640 sctp_compute_cksum include/net/sctp/checksum.h:59 [inline] sctp_packet_pack net/sctp/output.c:528 [inline] sctp_packet_transmit+0x40fb/0x4250 net/sctp/output.c:597 sctp_outq_flush_transports net/sctp/outqueue.c:1146 [inline] sctp_outq_flush+0x1823/0x5d80 net/sctp/outqueue.c:1194 sctp_outq_uncork+0xd0/0xf0 net/sctp/outqueue.c:757 sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1781 [inline] sctp_side_effects net/sctp/sm_sideeffect.c:1184 [inline] sctp_do_sm+0x8fe1/0x9720 net/sctp/sm_sideeffect.c:1155 sctp_primitive_REQUESTHEARTBEAT+0x175/0x1a0 net/sctp/primitive.c:185 sctp_apply_peer_addr_params+0x212/0x1d40 net/sctp/socket.c:2433 sctp_setsockopt_peer_addr_params net/sctp/socket.c:2686 [inline] sctp_setsockopt+0x189bb/0x19090 net/sctp/socket.c:4672 The issue was caused by transport->ipaddr set with uninit addr param, which was passed by: sctp_transport_init net/sctp/transport.c:47 [inline] sctp_transport_new+0x248/0xa00 net/sctp/transport.c:100 sctp_assoc_add_peer+0x5ba/0x2030 net/sctp/associola.c:611 sctp_process_param net/sctp/sm_make_chunk.c:2524 [inline] where 'addr' is set by sctp_v4_from_addr_param(), and it doesn't initialize the padding of addr->v4. Later when calling sctp_make_heartbeat(), hbinfo.daddr(=transport->ipaddr) will become the part of skb, and the issue occurs. This patch is to fix it by initializing the padding of addr->v4 in sctp_v4_from_addr_param(), as well as other functions that do the similar thing, and these functions shouldn't trust that the caller initializes the memory, as Marcelo suggested. Reported-by: syzbot+6dcbfea81cd3d4dd0b02@syzkaller.appspotmail.com Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-09 10:16:39 -08:00
Eric Dumazet	9e99bfefdb	bonding: fix bond_neigh_init() 1) syzbot reported an uninit-value in bond_neigh_setup() [1] bond_neigh_setup() uses a temporary on-stack 'struct neigh_parms parms', but only clears parms.neigh_setup field. A stacked bonding device would then enter bond_neigh_setup() and read garbage from parms->dev. If we get really unlucky and garbage is matching @dev, then we could recurse and eventually crash. Let's make sure the whole structure is cleared to avoid surprises. 2) bond_neigh_setup() can be called while another cpu manipulates the master device, removing or adding a slave. We need at least rcu protection to prevent use-after-free. Note: Prior code does not support a stack of bonding devices, this patch does not attempt to fix this, and leave a comment instead. [1] BUG: KMSAN: uninit-value in bond_neigh_setup+0xa4/0x110 drivers/net/bonding/bond_main.c:3655 CPU: 0 PID: 11256 Comm: syz-executor.0 Not tainted 5.4.0-rc8-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x220 lib/dump_stack.c:118 kmsan_report+0x128/0x220 mm/kmsan/kmsan_report.c:108 __msan_warning+0x57/0xa0 mm/kmsan/kmsan_instr.c:245 bond_neigh_setup+0xa4/0x110 drivers/net/bonding/bond_main.c:3655 bond_neigh_init+0x216/0x4b0 drivers/net/bonding/bond_main.c:3626 ___neigh_create+0x169e/0x2c40 net/core/neighbour.c:613 __neigh_create+0xbd/0xd0 net/core/neighbour.c:674 ip6_finish_output2+0x149a/0x2670 net/ipv6/ip6_output.c:113 __ip6_finish_output+0x83d/0x8f0 net/ipv6/ip6_output.c:142 ip6_finish_output+0x2db/0x420 net/ipv6/ip6_output.c:152 NF_HOOK_COND include/linux/netfilter.h:294 [inline] ip6_output+0x5d3/0x720 net/ipv6/ip6_output.c:175 dst_output include/net/dst.h:436 [inline] NF_HOOK include/linux/netfilter.h:305 [inline] mld_sendpack+0xebd/0x13d0 net/ipv6/mcast.c:1682 mld_send_cr net/ipv6/mcast.c:1978 [inline] mld_ifc_timer_expire+0x116b/0x1680 net/ipv6/mcast.c:2477 call_timer_fn+0x232/0x530 kernel/time/timer.c:1404 expire_timers kernel/time/timer.c:1449 [inline] __run_timers+0xd60/0x1270 kernel/time/timer.c:1773 run_timer_softirq+0x2d/0x50 kernel/time/timer.c:1786 __do_softirq+0x4a1/0x83a kernel/softirq.c:293 invoke_softirq kernel/softirq.c:375 [inline] irq_exit+0x230/0x280 kernel/softirq.c:416 exiting_irq+0xe/0x10 arch/x86/include/asm/apic.h:536 smp_apic_timer_interrupt+0x48/0x70 arch/x86/kernel/apic/apic.c:1138 apic_timer_interrupt+0x2e/0x40 arch/x86/entry/entry_64.S:835 </IRQ> RIP: 0010:kmsan_free_page+0x18d/0x1c0 mm/kmsan/kmsan_shadow.c:439 Code: 4c 89 ff 44 89 f6 e8 82 0d ee ff 65 ff 0d 9f 26 3b 60 65 8b 05 98 26 3b 60 85 c0 75 24 e8 5b f6 35 ff 4c 89 6d d0 ff 75 d0 9d <48> 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 0b 0f 0b 0f 0b 0f RSP: 0018:ffffb328034af818 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 RAX: 0000000000000000 RBX: ffffe2d7471f8360 RCX: 0000000000000000 RDX: ffffffffadea7000 RSI: 0000000000000004 RDI: ffff93496fcda104 RBP: ffffb328034af850 R08: ffff934a47e86d00 R09: ffff93496fc41900 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001 R13: 0000000000000246 R14: 0000000000000000 R15: ffffe2d7472225c0 free_pages_prepare mm/page_alloc.c:1138 [inline] free_pcp_prepare mm/page_alloc.c:1230 [inline] free_unref_page_prepare+0x1d9/0x770 mm/page_alloc.c:3025 free_unref_page mm/page_alloc.c:3074 [inline] free_the_page mm/page_alloc.c:4832 [inline] __free_pages+0x154/0x230 mm/page_alloc.c:4840 __vunmap+0xdac/0xf20 mm/vmalloc.c:2277 __vfree mm/vmalloc.c:2325 [inline] vfree+0x7c/0x170 mm/vmalloc.c:2355 copy_entries_to_user net/ipv6/netfilter/ip6_tables.c:883 [inline] get_entries net/ipv6/netfilter/ip6_tables.c:1041 [inline] do_ip6t_get_ctl+0xfa4/0x1030 net/ipv6/netfilter/ip6_tables.c:1709 nf_sockopt net/netfilter/nf_sockopt.c:104 [inline] nf_getsockopt+0x481/0x4e0 net/netfilter/nf_sockopt.c:122 ipv6_getsockopt+0x264/0x510 net/ipv6/ipv6_sockglue.c:1400 tcp_getsockopt+0x1c6/0x1f0 net/ipv4/tcp.c:3688 sock_common_getsockopt+0x13f/0x180 net/core/sock.c:3110 __sys_getsockopt+0x533/0x7b0 net/socket.c:2129 __do_sys_getsockopt net/socket.c:2144 [inline] __se_sys_getsockopt+0xe1/0x100 net/socket.c:2141 __x64_sys_getsockopt+0x62/0x80 net/socket.c:2141 do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x45d20a Code: b8 34 01 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 8d 8b fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 49 89 ca b8 37 00 00 00 0f 05 <48> 3d 01 f0 ff ff 0f 83 6a 8b fb ff c3 66 0f 1f 84 00 00 00 00 00 RSP: 002b:0000000000a6f618 EFLAGS: 00000212 ORIG_RAX: 0000000000000037 RAX: ffffffffffffffda RBX: 0000000000a6f640 RCX: 000000000045d20a RDX: 0000000000000041 RSI: 0000000000000029 RDI: 0000000000000003 RBP: 0000000000717cc0 R08: 0000000000a6f63c R09: 0000000000004000 R10: 0000000000a6f740 R11: 0000000000000212 R12: 0000000000000003 R13: 0000000000000000 R14: 0000000000000029 R15: 0000000000715b00 Local variable description: ----parms@bond_neigh_init Variable was created at: bond_neigh_init+0x8c/0x4b0 drivers/net/bonding/bond_main.c:3617 bond_neigh_init+0x8c/0x4b0 drivers/net/bonding/bond_main.c:3617 Fixes: `9918d5bf32` ("bonding: modify only neigh_parms owned by us") Fixes: `234bcf8a49` ("net/bonding: correctly proxy slave neigh param setup ndo function") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-09 09:49:07 -08:00
Eric Dumazet	f394722fb0	neighbour: remove neigh_cleanup() method neigh_cleanup() has not been used for seven years, and was a wrong design. Messing with shared pointer in bond_neigh_init() without proper memory barriers would at least trigger syzbot complains eventually. It is time to remove this stuff. Fixes: `b63b70d877` ("IPoIB: Use a private hash table for path lookup in xmit path") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-09 09:48:47 -08:00
David S. Miller	43aad8104b	linux-can-fixes-for-5.5-20191208 -----BEGIN PGP SIGNATURE----- iQFHBAABCgAxFiEEmvEkXzgOfc881GuFWsYho5HknSAFAl3s3dwTHG1rbEBwZW5n dXRyb25peC5kZQAKCRBaxiGjkeSdIKhhB/sE+kGrGECa6DSehxKsSTCvh/5AGw2U AGh+AHgpfCRKOPNBeN0yKCdBVSE40fURsIjNvn5SDZ1cY8tUOVCJAyUU31MfgzkW b+Xiro5iOu6G/UpfZgMCeGV617FZg/65rz7ylwNzzA7zuUcoml6o4A55/QUFNIXx juDzhdKmpmLRQusekzbP9qyCgDGXv8/cPBloI+frzqaDU5bgS6Q3ldHe26lkQzNH bwF6bCGpJYykuNFIrLoFpSuJ+7ttZhLexXCwsl90UfpwmHVvUgQKJuF8JVR1AF3x 5mjHWi3GFyi4aUL4yY4kU3cWWPse3MQLPt5QqMu56dW4UADp1k3FHByo =YMzV -----END PGP SIGNATURE----- Merge tag 'linux-can-fixes-for-5.5-20191208' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2019-12-08 this is a pull request of 13 patches for net/master. The first two patches are by Dan Murphy. He adds himself as a maintainer to the m-can MMIO and tcan SPI driver. The next two patches the j1939 stack. The first one is by Oleksij Rempel and fixes a locking problem found by the syzbot, the second one is by me an fixes a mistake in the documentation. Srinivas Neeli fixes missing RX CAN packets on CANFD2.0 in the xilinx driver. Sean Nyekjaer fixes a possible deadlock in the the flexcan driver after suspend/resume. Joakim Zhang contributes two patches for the flexcan driver that fix problems with the low power enter/exit. The next 4 patches all target the tcan part of the m_can driver. Sean Nyekjaer adds the required delay after reset and fixes the device tree binding example. Dan Murphy's patches make the wake-gpio optional. In the last patch Xiaolong Huang fixes several kernel memory info leaks to the USB device in the kvaser_usb_leaf driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-09 09:27:47 -08:00

1 2 3 4 5 ...

886933 Commits