linux

Author	SHA1	Message	Date
Kenneth Lee	00784da3e6	can: kvaser_usb: kvaser_usb_hydra: Use kzalloc for allocating only one element Use kzalloc(...) rather than kcalloc(1, ...) since because the number of elements we are specifying in this case is 1, kzalloc would accomplish the same thing and we can simplify. Also refactor how we calculate the sizeof() as checkstyle for kzalloc() prefers using the variable we are assigning to versus the type of that variable for calculating the size to allocate. Signed-off-by: Kenneth Lee <klee33@uw.edu> Link: https://lore.kernel.org/all/20220807051656.1991446-1-klee33@uw.edu Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2022-09-05 22:00:42 +02:00
Christophe JAILLET	ddbbed2530	can: rcar_canfd: Use dev_err_probe() to simplify code and better handle -EPROBE_DEFER devm_clk_get() can return -EPROBE_DEFER, so use dev_err_probe() instead of dev_err() in order to be less verbose in the log. This also saves a few LoC. While at it, turn a "goto fail_dev;" at the beginning of the function into a direct return in order to avoid mixing goto and return, which looks spurious. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/all/f5bf0b8f757bd3bc9b391094ece3548cc2f96456.1659858686.git.christophe.jaillet@wanadoo.fr Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2022-09-05 22:00:42 +02:00
Marc Kleine-Budde	d945346db1	can: flexcan: fix typo: FLEXCAN_QUIRK_SUPPPORT_* -> FLEXCAN_QUIRK_SUPPORT_* Fix typo "FLEXCAN_QUIRK_SUPPPORT_*" -> "FLEXCAN_QUIRK_SUPPORT_". Link: https://lore.kernel.org/all/20220811093617.1861938-3-mkl@pengutronix.de Fixes: `f04aefd465` ("can: flexcan: mark RX via mailboxes as supported on MCF5441X") Fixes: `c5c8859104` ("can: flexcan: add more quirks to describe RX path capabilities") Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2022-09-05 22:00:42 +02:00
Marc Kleine-Budde	766108d912	can: rx-offload: can_rx_offload_init_queue(): fix typo Fix typo "rounted" -> "rounded". Link: https://lore.kernel.org/all/20220811093617.1861938-2-mkl@pengutronix.de Fixes: `d254586c34` ("can: rx-offload: Add support for HW fifo based irq offloading") Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2022-09-05 21:54:49 +02:00
Gao Xiang	2f44013e39	erofs: fix pcluster use-after-free on UP platforms During stress testing with CONFIG_SMP disabled, KASAN reports as below: ================================================================== BUG: KASAN: use-after-free in __mutex_lock+0xe5/0xc30 Read of size 8 at addr ffff8881094223f8 by task stress/7789 CPU: 0 PID: 7789 Comm: stress Not tainted 6.0.0-rc1-00002-g0d53d2e882f9 #3 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Call Trace: <TASK> .. __mutex_lock+0xe5/0xc30 .. z_erofs_do_read_page+0x8ce/0x1560 .. z_erofs_readahead+0x31c/0x580 .. Freed by task 7787 kasan_save_stack+0x1e/0x40 kasan_set_track+0x20/0x30 kasan_set_free_info+0x20/0x40 __kasan_slab_free+0x10c/0x190 kmem_cache_free+0xed/0x380 rcu_core+0x3d5/0xc90 __do_softirq+0x12d/0x389 Last potentially related work creation: kasan_save_stack+0x1e/0x40 __kasan_record_aux_stack+0x97/0xb0 call_rcu+0x3d/0x3f0 erofs_shrink_workstation+0x11f/0x210 erofs_shrink_scan+0xdc/0x170 shrink_slab.constprop.0+0x296/0x530 drop_slab+0x1c/0x70 drop_caches_sysctl_handler+0x70/0x80 proc_sys_call_handler+0x20a/0x2f0 vfs_write+0x555/0x6c0 ksys_write+0xbe/0x160 do_syscall_64+0x3b/0x90 The root cause is that erofs_workgroup_unfreeze() doesn't reset to orig_val thus it causes a race that the pcluster reuses unexpectedly before freeing. Since UP platforms are quite rare now, such path becomes unnecessary. Let's drop such specific-designed path directly instead. Fixes: `73f5c66df3` ("staging: erofs: fix `erofs_workgroup_{try_to_freeze, unfreeze}'") Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20220902045710.109530-1-hsiangkao@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-09-05 23:23:30 +08:00
Yue Hu	ea0b7b0d59	erofs: avoid the potentially wrong m_plen for big pcluster Actually, 'compressedlcs' stores compressed block count rather than lcluster count. Therefore, the number of bits for shifting the count should be 'LOG_BLOCK_SIZE' rather than 'lclusterbits' although current lcluster size is 4K. The value of 'm_plen' will be wrong once we enable the non 4K-sized lcluster. Signed-off-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20220812060150.8510-1-huyue2@coolpad.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-09-05 23:22:01 +08:00
Sun Ke	5bd9628b78	erofs: fix error return code in erofs_fscache_{meta_,}read_folio If erofs_fscache_alloc_request fail and then goto out, it will return 0. it should return a negative error code instead of 0. Fixes: `d435d53228` ("erofs: change to use asynchronous io for fscache readpage/readahead") Signed-off-by: Sun Ke <sunke32@huawei.com> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20220815034829.3940803-1-sunke32@huawei.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-09-05 23:21:15 +08:00
Heiner Kallweit	96efd6d014	r8169: remove not needed net_ratelimit() check We're not in a hot path and don't want to miss this message, therefore remove the net_ratelimit() check. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 14:49:59 +01:00
Kees Cook	710d21fdff	netlink: Bounds-check struct nlmsgerr creation In preparation for FORTIFY_SOURCE doing bounds-check on memcpy(), switch from __nlmsg_put to nlmsg_put(), and explain the bounds check for dealing with the memcpy() across a composite flexible array struct. Avoids this future run-time warning: memcpy: detected field-spanning write (size 32) of single field "&errmsg->msg" at net/netlink/af_netlink.c:2447 (size 16) Cc: Jakub Kicinski <kuba@kernel.org> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Jozsef Kadlecsik <kadlec@netfilter.org> Cc: Florian Westphal <fw@strlen.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: syzbot <syzkaller@googlegroups.com> Cc: netfilter-devel@vger.kernel.org Cc: coreteam@netfilter.org Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220901071336.1418572-1-keescook@chromium.org Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 14:45:22 +01:00
David S. Miller	beb432528c	bluetooth pull request for net: - Fix regression preventing ACL packet transmission -----BEGIN PGP SIGNATURE----- iQJNBAABCAA3FiEE7E6oRXp8w05ovYr/9JCA4xAyCykFAmMSfW0ZHGx1aXoudm9u LmRlbnR6QGludGVsLmNvbQAKCRD0kIDjEDILKS1dD/96KqnPsJeA2FwTkagXpFFV cGVncCJswesn3J4V4J8pBgItlDVUVE0jH259VG485Q9ZERlGO9C/xhMzZWjCbZx+ 4eQqCvn11PeGXm1jEDNE0jWM/XJE7qIe8aD22PYvxS9UW9kGhT4tamPtTKELlCOW 4dDrMezRiq/oX2+Ec5eQqGLEUEL5xAIIvMSGBtfi/qB65X6MOey4XNwzaDfXJF9X sRkHHZjc8PrORI8G7D+Q3WfqpPNGSrWE3wU5igaZV26UX4sVr50uPK2QIvS1Vnt3 dsMOeQAElZbcsXlDHwCHzvEqaLnUFZ3pXQhdpb07bOdEPf2WBHMtBzvG1eZ8qQ+S T7q0XnIwe0jiqvTScjNY4lcKh1PI297zq6SEwG0Y56qE8Grk+c+Myx+/6v84r9ea NFiQ88vuoyktGyCfb4MxFMMy5KDbn9sAQ2QC3u9GoHNtEOEV9aKQWY6+ojEs0DeL jMETM8nGu1OhN3gzY3LVRLjnkvuQ/gY1Y0bYy73Q2cl2k4TjufbtfVh2D2KsFmyZ sX2++Ilk4c61As9L6TLHfk/Xow8JfcPe5S29UbcW+pZH4FVCzShW5UaTVIL5ishK VAErmB+o81XMUKctJcBknpoOOBthVROvHTgI8A9mTefhXhn2w1UzPTDwCMP0KDV3 WI3LpZoRZQAADW3ytWST+A== =ugXE -----END PGP SIGNATURE----- Merge tag 'for-net-2022-09-02' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - Fix regression preventing ACL packet transmission ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 14:43:18 +01:00
Daniel Borkmann	274052a2b0	Merge branch 'bpf-allocator' Alexei Starovoitov says: ==================== Introduce any context BPF specific memory allocator. Tracing BPF programs can attach to kprobe and fentry. Hence they run in unknown context where calling plain kmalloc() might not be safe. Front-end kmalloc() with per-cpu cache of free elements. Refill this cache asynchronously from irq_work. Major achievements enabled by bpf_mem_alloc: - Dynamically allocated hash maps used to be 10 times slower than fully preallocated. With bpf_mem_alloc and subsequent optimizations the speed of dynamic maps is equal to full prealloc. - Tracing bpf programs can use dynamically allocated hash maps. Potentially saving lots of memory. Typical hash map is sparsely populated. - Sleepable bpf programs can used dynamically allocated hash maps. Future work: - Expose bpf_mem_alloc as uapi FD to be used in dynptr_alloc, kptr_alloc - Convert lru map to bpf_mem_alloc - Further cleanup htab code. Example: htab_use_raw_lock can be removed. Changelog: v5->v6: - Debugged the reason for selftests/bpf/test_maps ooming in a small VM that BPF CI is using. Added patch 16 that optimizes the usage of rcu_barrier-s between bpf_mem_alloc and hash map. It drastically improved the speed of htab destruction. v4->v5: - Fixed missing migrate_disable in hash tab free path (Daniel) - Replaced impossible "memory leak" with WARN_ON_ONCE (Martin) - Dropped sysctl kernel.bpf_force_dyn_alloc patch (Daniel) - Added Andrii's ack - Added new patch 15 that removes kmem_cache usage from bpf_mem_alloc. It saves memory, speeds up map create/destroy operations while maintains hash map update/delete performance. v3->v4: - fix build issue due to missing local.h on 32-bit arch - add Kumar's ack - proposal for next steps from Delyan: https://lore.kernel.org/bpf/d3f76b27f4e55ec9e400ae8dcaecbb702a4932e8.camel@fb.com/ v2->v3: - Rewrote the free_list algorithm based on discussions with Kumar. Patch 1. - Allowed sleepable bpf progs use dynamically allocated maps. Patches 13 and 14. - Added sysctl to force bpf_mem_alloc in hash map even if pre-alloc is requested to reduce memory consumption. Patch 15. - Fix: zero-fill percpu allocation - Single rcu_barrier at the end instead of each cpu during bpf_mem_alloc destruction v2 thread: https://lore.kernel.org/bpf/20220817210419.95560-1-alexei.starovoitov@gmail.com/ v1->v2: - Moved unsafe direct call_rcu() from hash map into safe place inside bpf_mem_alloc. Patches 7 and 9. - Optimized atomic_inc/dec in hash map with percpu_counter. Patch 6. - Tuned watermarks per allocation size. Patch 8 - Adopted this approach to per-cpu allocation. Patch 10. - Fully converted hash map to bpf_mem_alloc. Patch 11. - Removed tracing prog restriction on map types. Combination of all patches and final patch 12. v1 thread: https://lore.kernel.org/bpf/20220623003230.37497-1-alexei.starovoitov@gmail.com/ LWN article: https://lwn.net/Articles/899274/ ==================== Link: https://lore.kernel.org/r/ Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2022-09-05 15:33:11 +02:00
Alexei Starovoitov	9f2c6e96c6	bpf: Optimize rcu_barrier usage between hash map and bpf_mem_alloc. User space might be creating and destroying a lot of hash maps. Synchronous rcu_barrier-s in a destruction path of hash map delay freeing of hash buckets and other map memory and may cause artificial OOM situation under stress. Optimize rcu_barrier usage between bpf hash map and bpf_mem_alloc: - remove rcu_barrier from hash map, since htab doesn't use call_rcu directly and there are no callback to wait for. - bpf_mem_alloc has call_rcu_in_progress flag that indicates pending callbacks. Use it to avoid barriers in fast path. - When barriers are needed copy bpf_mem_alloc into temp structure and wait for rcu barrier-s in the worker to let the rest of hash map freeing to proceed. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220902211058.60789-17-alexei.starovoitov@gmail.com	2022-09-05 15:33:07 +02:00
Alexei Starovoitov	bfc03c15be	bpf: Remove usage of kmem_cache from bpf_mem_cache. For bpf_mem_cache based hash maps the following stress test: for (i = 1; i <= 512; i <<= 1) for (j = 1; j <= 1 << 18; j <<= 1) fd = bpf_map_create(BPF_MAP_TYPE_HASH, NULL, i, j, 2, 0); creates many kmem_cache-s that are not mergeable in debug kernels and consume unnecessary amount of memory. Turned out bpf_mem_cache's free_list logic does batching well, so usage of kmem_cache for fixes size allocations doesn't bring any performance benefits vs normal kmalloc. Hence get rid of kmem_cache in bpf_mem_cache. That saves memory, speeds up map create/destroy operations, while maintains hash map update/delete performance. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220902211058.60789-16-alexei.starovoitov@gmail.com	2022-09-05 15:33:07 +02:00
Alexei Starovoitov	02cc5aa29e	bpf: Remove prealloc-only restriction for sleepable bpf programs. Since hash map is now converted to bpf_mem_alloc and it's waiting for rcu and rcu_tasks_trace GPs before freeing elements into global memory slabs it's safe to use dynamically allocated hash maps in sleepable bpf programs. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220902211058.60789-15-alexei.starovoitov@gmail.com	2022-09-05 15:33:06 +02:00
Alexei Starovoitov	dccb4a9013	bpf: Prepare bpf_mem_alloc to be used by sleepable bpf programs. Use call_rcu_tasks_trace() to wait for sleepable progs to finish. Then use call_rcu() to wait for normal progs to finish and finally do free_one() on each element when freeing objects into global memory pool. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220902211058.60789-14-alexei.starovoitov@gmail.com	2022-09-05 15:33:06 +02:00
Alexei Starovoitov	96da3f7d48	bpf: Remove tracing program restriction on map types The hash map is now fully converted to bpf_mem_alloc. Its implementation is not allocating synchronously and not calling call_rcu() directly. It's now safe to use non-preallocated hash maps in all types of tracing programs including BPF_PROG_TYPE_PERF_EVENT that runs out of NMI context. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220902211058.60789-13-alexei.starovoitov@gmail.com	2022-09-05 15:33:06 +02:00
Alexei Starovoitov	ee4ed53c5e	bpf: Convert percpu hash map to per-cpu bpf_mem_alloc. Convert dynamic allocations in percpu hash map from alloc_percpu() to bpf_mem_cache_alloc() from per-cpu bpf_mem_alloc. Since bpf_mem_alloc frees objects after RCU gp the call_rcu() is removed. pcpu_init_value() now needs to zero-fill per-cpu allocations, since dynamically allocated map elements are now similar to full prealloc, since alloc_percpu() is not called inline and the elements are reused in the freelist. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220902211058.60789-12-alexei.starovoitov@gmail.com	2022-09-05 15:33:06 +02:00
Alexei Starovoitov	4ab67149f3	bpf: Add percpu allocation support to bpf_mem_alloc. Extend bpf_mem_alloc to cache free list of fixed size per-cpu allocations. Once such cache is created bpf_mem_cache_alloc() will return per-cpu objects. bpf_mem_cache_free() will free them back into global per-cpu pool after observing RCU grace period. per-cpu flavor of bpf_mem_alloc is going to be used by per-cpu hash maps. The free list cache consists of tuples { llist_node, per-cpu pointer } Unlike alloc_percpu() that returns per-cpu pointer the bpf_mem_cache_alloc() returns a pointer to per-cpu pointer and bpf_mem_cache_free() expects to receive it back. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220902211058.60789-11-alexei.starovoitov@gmail.com	2022-09-05 15:33:06 +02:00
Alexei Starovoitov	8d5a8011b3	bpf: Batch call_rcu callbacks instead of SLAB_TYPESAFE_BY_RCU. SLAB_TYPESAFE_BY_RCU makes kmem_caches non mergeable and slows down kmem_cache_destroy. All bpf_mem_cache are safe to share across different maps and programs. Convert SLAB_TYPESAFE_BY_RCU to batched call_rcu. This change solves the memory consumption issue, avoids kmem_cache_destroy latency and keeps bpf hash map performance the same. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220902211058.60789-10-alexei.starovoitov@gmail.com	2022-09-05 15:33:06 +02:00
Alexei Starovoitov	7c266178aa	bpf: Adjust low/high watermarks in bpf_mem_cache The same low/high watermarks for every bucket in bpf_mem_cache consume significant amount of memory. Preallocating 64 elements of 4096 bytes each in the free list is not efficient. Make low/high watermarks and batching value dependent on element size. This change brings significant memory savings. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220902211058.60789-9-alexei.starovoitov@gmail.com	2022-09-05 15:33:06 +02:00
Alexei Starovoitov	0fd7c5d433	bpf: Optimize call_rcu in non-preallocated hash map. Doing call_rcu() million times a second becomes a bottle neck. Convert non-preallocated hash map from call_rcu to SLAB_TYPESAFE_BY_RCU. The rcu critical section is no longer observed for one htab element which makes non-preallocated hash map behave just like preallocated hash map. The map elements are released back to kernel memory after observing rcu critical section. This improves 'map_perf_test 4' performance from 100k events per second to 250k events per second. bpf_mem_alloc + percpu_counter + typesafe_by_rcu provide 10x performance boost to non-preallocated hash map and make it within few % of preallocated map while consuming fraction of memory. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220902211058.60789-8-alexei.starovoitov@gmail.com	2022-09-05 15:33:06 +02:00
Alexei Starovoitov	86fe28f769	bpf: Optimize element count in non-preallocated hash map. The atomic_inc/dec might cause extreme cache line bouncing when multiple cpus access the same bpf map. Based on specified max_entries for the hash map calculate when percpu_counter becomes faster than atomic_t and use it for such maps. For example samples/bpf/map_perf_test is using hash map with max_entries 1000. On a system with 16 cpus the 'map_perf_test 4' shows 14k events per second using atomic_t. On a system with 15 cpus it shows 100k events per second using percpu. map_perf_test is an extreme case where all cpus colliding on atomic_t which causes extreme cache bouncing. Note that the slow path of percpu_counter is 5k events per secound vs 14k for atomic, so the heuristic is necessary. See comment in the code why the heuristic is based on num_online_cpus(). Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220902211058.60789-7-alexei.starovoitov@gmail.com	2022-09-05 15:33:06 +02:00
Alexei Starovoitov	34dd3bad1a	bpf: Relax the requirement to use preallocated hash maps in tracing progs. Since bpf hash map was converted to use bpf_mem_alloc it is safe to use from tracing programs and in RT kernels. But per-cpu hash map is still using dynamic allocation for per-cpu map values, hence keep the warning for this map type. In the future alloc_percpu_gfp can be front-end-ed with bpf_mem_cache and this restriction will be completely lifted. perf_event (NMI) bpf programs have to use preallocated hash maps, because free_htab_elem() is using call_rcu which might crash if re-entered. Sleepable bpf programs have to use preallocated hash maps, because life time of the map elements is not protected by rcu_read_lock/unlock. This restriction can be lifted in the future as well. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220902211058.60789-6-alexei.starovoitov@gmail.com	2022-09-05 15:33:05 +02:00
Alexei Starovoitov	89dc8d0c38	samples/bpf: Reduce syscall overhead in map_perf_test. Make map_perf_test for preallocated and non-preallocated hash map spend more time inside bpf program to focus performance analysis on the speed of update/lookup/delete operations performed by bpf program. It makes 'perf report' of bpf_mem_alloc look like: 11.76% map_perf_test [k] _raw_spin_lock_irqsave 11.26% map_perf_test [k] htab_map_update_elem 9.70% map_perf_test [k] _raw_spin_lock 9.47% map_perf_test [k] htab_map_delete_elem 8.57% map_perf_test [k] memcpy_erms 5.58% map_perf_test [k] alloc_htab_elem 4.09% map_perf_test [k] __htab_map_lookup_elem 3.44% map_perf_test [k] syscall_exit_to_user_mode 3.13% map_perf_test [k] lookup_nulls_elem_raw 3.05% map_perf_test [k] migrate_enable 3.04% map_perf_test [k] memcmp 2.67% map_perf_test [k] unit_free 2.39% map_perf_test [k] lookup_elem_raw Reduce default iteration count as well to make 'map_perf_test' quick enough even on debug kernels. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220902211058.60789-5-alexei.starovoitov@gmail.com	2022-09-05 15:33:05 +02:00
Alexei Starovoitov	37521bffdd	selftests/bpf: Improve test coverage of test_maps Make test_maps more stressful with more parallelism in update/delete/lookup/walk including different value sizes. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220902211058.60789-4-alexei.starovoitov@gmail.com	2022-09-05 15:33:05 +02:00
Alexei Starovoitov	fba1a1c6c9	bpf: Convert hash map to bpf_mem_alloc. Convert bpf hash map to use bpf memory allocator. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220902211058.60789-3-alexei.starovoitov@gmail.com	2022-09-05 15:33:05 +02:00
Alexei Starovoitov	7c8199e24f	bpf: Introduce any context BPF specific memory allocator. Tracing BPF programs can attach to kprobe and fentry. Hence they run in unknown context where calling plain kmalloc() might not be safe. Front-end kmalloc() with minimal per-cpu cache of free elements. Refill this cache asynchronously from irq_work. BPF programs always run with migration disabled. It's safe to allocate from cache of the current cpu with irqs disabled. Free-ing is always done into bucket of the current cpu as well. irq_work trims extra free elements from buckets with kfree and refills them with kmalloc, so global kmalloc logic takes care of freeing objects allocated by one cpu and freed on another. struct bpf_mem_alloc supports two modes: - When size != 0 create kmem_cache and bpf_mem_cache for each cpu. This is typical bpf hash map use case when all elements have equal size. - When size == 0 allocate 11 bpf_mem_cache-s for each cpu, then rely on kmalloc/kfree. Max allocation size is 4096 in this case. This is bpf_dynptr and bpf_kptr use case. bpf_mem_alloc/bpf_mem_free are bpf specific 'wrappers' of kmalloc/kfree. bpf_mem_cache_alloc/bpf_mem_cache_free are 'wrappers' of kmem_cache_alloc/kmem_cache_free. The allocators are NMI-safe from bpf programs only. They are not NMI-safe in general. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220902211058.60789-2-alexei.starovoitov@gmail.com	2022-09-05 15:33:05 +02:00
Sean Anderson	05ad5d4581	net: phy: Add 1000BASE-KX interface mode Add 1000BASE-KX interface mode. This 1G backplane ethernet as described in clause 70. Clause 73 autonegotiation is mandatory, and only full duplex operation is supported. Although at the PMA level this interface mode is identical to 1000BASE-X, it uses a different form of in-band autonegation. This justifies a separate interface mode, since the interface mode (along with the MLO_AN_* autonegotiation mode) sets the type of autonegotiation which will be used on a link. This results in more than just electrical differences between the link modes. With regard to 1000BASE-X, 1000BASE-KX holds a similar position to SGMII: same signaling, but different autonegotiation. PCS drivers (which typically handle in-band autonegotiation) may only support 1000BASE-X, and not 1000BASE-KX. Similarly, the phy mode is used to configure serdes phys with phy_set_mode_ext. Due to the different electrical standards (SFI or XFI vs Clause 70), they will likely want to use different configuration. Adding a phy interface mode for 1000BASE-KX helps simplify configuration in these areas. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 14:30:42 +01:00
David S. Miller	ab526eaa84	Merge branch 'dpaa-cleanups' Sean Anderson says: ==================== net: dpaa: Cleanups in preparation for phylink conversion (part 2) This series contains several cleanup patches for dpaa/fman. While they are intended to prepare for a phylink conversion, they stand on their own. This series was originally submitted as part of [1]. [1] https://lore.kernel.org/netdev/20220715215954.1449214-1-sean.anderson@seco.com Changes in v5: - Reduce line length of tgec_config - Reduce line length of qman_update_cgr_safe - Rebase onto net-next/master Changes in v4: - weer -> were - tricy -> tricky - Use mac_dev for calling change_addr - qman_cgr_create -> qman_create_cgr Changes in v2: - Fix prototype for dtsec_initialization - Fix warning if sizeof(void *) != sizeof(resource_size_t) - Specify type of mac_dev for exception_cb - Add helper for sanity checking cgr ops - Add CGR update function - Adjust queue depth on rate change ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 14:27:40 +01:00
Sean Anderson	ef2a8d5478	net: dpaa: Adjust queue depth on rate change Instead of setting the queue depth once during probe, adjust it on the fly whenever we configure the link. This is a bit unusal, since usually the DPAA driver calls into the FMAN driver, but here we do the opposite. We need to add a netdev to struct mac_device for this, but it will soon live in the phylink config. I haven't tested this extensively, but it doesn't seem to break anything. We could possibly optimize this a bit by keeping track of the last rate, but for now we just update every time. 10GEC probably doesn't need to call into this at all, but I've added it for consistency. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 14:27:39 +01:00
Sean Anderson	914f8b228e	soc: fsl: qbman: Add CGR update function This adds a function to update a CGR with new parameters. qman_create_cgr can almost be used for this (with flags=0), but it's not suitable because it also registers the callback function. The _safe variant was modeled off of qman_cgr_delete_safe. However, we handle multiple arguments and a return value. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 14:27:39 +01:00
Sean Anderson	d0e17a4653	soc: fsl: qbman: Add helper for sanity checking cgr ops This breaks out/combines get_affine_portal and the cgr sanity check in preparation for the next commit. No functional change intended. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 14:27:39 +01:00
Sean Anderson	fca4804f68	net: dpaa: Use mac_dev variable in dpaa_netdev_init There are several references to mac_dev in dpaa_netdev_init. Make things a bit more concise by adding a local variable for it. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 14:27:39 +01:00
Sean Anderson	901bdff2f5	net: fman: Change return type of disable to void When disabling, there is nothing we can do about errors. In fact, the only error which can occur is misuse of the API. Just warn in the mac driver instead. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 14:27:39 +01:00
Sean Anderson	aedbeb4e59	net: fman: Clean up error handling This removes the _return label, since something like err = -EFOO; goto _return; can be replaced by the briefer return -EFOO; Additionally, this skips going to _return_of_node_put when dev_node has already been put (preventing a double put). Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 14:27:39 +01:00
Sean Anderson	5b6acb5540	net: fman: Specify type of mac_dev for exception_cb Instead of using a void pointer for mac_dev, specify its type explicitly. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 14:27:39 +01:00
Sean Anderson	19c788b144	net: fman: Use mac_dev for some params Some params are already present in mac_dev. Use them directly instead of passing them through params. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 14:27:39 +01:00
Sean Anderson	c6b7b1b515	net: fman: Pass params directly to mac init Instead of having the mac init functions call back into the fman core to get their params, just pass them directly to the init functions. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 14:27:39 +01:00
Sean Anderson	262f2b782e	net: fman: Map the base address once We don't need to remap the base address from the resource twice (once in mac_probe() and again in set_fman_mac_params()). We still need the resource to get the end address, but we can use a single function call to get both at once. While we're at it, use platform_get_mem_or_io and devm_request_resource to map the resource. I think this is the more "correct" way to do things here, since we use the pdev resource, instead of creating a new one. It's still a bit tricky, since we need to ensure that the resource is a child of the fman region when it gets requested. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 14:27:39 +01:00
Sean Anderson	45fa34bfaa	net: fman: Remove internal_phy_node from params This member was used to pass the phy node between mac_probe and the mac-specific initialization function. But now that the phy node is gotten in the initialization function, this parameter does not serve a purpose. Remove it, and do the grabbing of the node/grabbing of the phy in the same place. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 14:27:39 +01:00
Sean Anderson	4498862710	net: fman: Inline several functions into initialization There are several small functions which were only necessary because the initialization functions didn't have access to the mac private data. Now that they do, just do things directly. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 14:27:39 +01:00
Sean Anderson	1257c9623d	net: fman: Mark mac methods static These methods are no longer accessed outside of the driver file, so mark them as static. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 14:27:39 +01:00
Sean Anderson	302376feec	net: fman: Move initialization to mac-specific files This moves mac-specific initialization to mac-specific files. This will make it easier to work with individual macs. It will also make it easier to refactor the initialization to simplify the control flow. No functional change intended. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 14:27:38 +01:00
Heiner Kallweit	36f9b47457	r8169: remove useless PCI region size check Let's trust the hardware here and remove this useless check. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 14:26:41 +01:00
Christophe JAILLET	1621e70fc7	stmmac: intel: Simplify intel_eth_pci_remove() There is no point to call pcim_iounmap_regions() in the remove function, this frees a managed resource that would be release by the framework anyway. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 14:19:56 +01:00
Greg Kroah-Hartman	fe2c9c61f6	net: mvpp2: debugfs: fix memory leak when using debugfs_lookup() When calling debugfs_lookup() the result must have dput() called on it, otherwise the memory will leak over time. Fix this up to be much simpler logic and only create the root debugfs directory once when the driver is first accessed. That resolves the memory leak and makes things more obvious as to what the intent is. Cc: Marcin Wojtas <mw@semihalf.com> Cc: Russell King <linux@armlinux.org.uk> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: netdev@vger.kernel.org Cc: stable <stable@kernel.org> Fixes: `21da57a231` ("net: mvpp2: add a debugfs interface for the Header Parser") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 14:18:32 +01:00
David S. Miller	5f3c519347	Merge branch 'lan937x-phy-link-interrupt' Arun Ramadoss says: ==================== net: dsa: microchip: lan937x: enable interrupt for internal phy link detection This patch series enables the internal phy link detection for lan937x using the interrupt method. lan937x acts as the interrupt controller for the internal ports and phy, the irq_domain is registered for the individual ports and in turn for the individual port interrupts. RFC v3 -> Patch v1 - Removed the RFC v3 1/3 from the series - changing exit from reset - Changed the variable name in ksz_port from irq to pirq - Added the check for return value of irq_find_mapping during phy irq registeration. - Moved the clearing of POR_READY_INT from girq_thread_fn to lan937x_reset_switch RFC v2 -> v3 - Used the interrupt controller implementation of phy link Changes in RFC v2 - fixed the compilation issue ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 13:06:40 +01:00
Arun Ramadoss	c9cd961c0d	net: dsa: microchip: lan937x: add interrupt support for port phy link This patch enables the interrupts for internal phy link detection for LAN937x. The interrupt enable bits are active low. There is global interrupt mask for each port. And each port has the individual interrupt mask for TAS. QCI, SGMII, PTP, PHY and ACL. The first level of interrupt domain is registered for global port interrupt and second level of interrupt domain for the individual port interrupts. The phy interrupt is enabled in the lan937x_mdio_register function. Interrupt from which port is raised will be detected based on the interrupt host data. Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 13:06:40 +01:00
Arun Ramadoss	f313936261	net: dsa: microchip: lan937x: clear the POR_READY_INT status bit In the lan937x_reset_switch(), it masks all the switch and port registers. In the Global_Int_status register, POR ready bit is write 1 to clear bit and all other bits are read only. So, this patch clear the por_ready_int status bit by writing 1. Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 13:06:40 +01:00
Arun Ramadoss	f3c165459c	net: dsa: microchip: add reference to ksz_device inside the ksz_port struct ksz_port doesn't have reference to ksz_device as of now. In order to find out from which port interrupt has triggered, we need to pass the struct ksz_port as a host data. When the interrupt is triggered, we can get the port from which interrupt triggered, but to identify it is phy interrupt we have to read status register. The regmap structure for accessing the device register is present in the ksz_device struct. To access the ksz_device from the ksz_port, the reference is added to it with port number as well. Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-05 13:06:39 +01:00

... 6 7 8 9 10 ...

1123790 Commits