linux

Author	SHA1	Message	Date
Linus Torvalds	a2b03e48e9	OpenRISC updates for 5.16 This includes 2 minor cleanups, plus a bug fix for OpenRISC TLB flush code that allows the the SMP kernel to boot again. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE2cRzVK74bBA6Je/xw7McLV5mJ+QFAmGIOh4ACgkQw7McLV5m J+Ts7hAApLDmupgPlJfc9hsIIxiCZcMnP5vyF2gNSkaDVkWkLkM24WuAfcCusEUE VIrI/K6AV8wQUfG9n3sAQ/v5bkKrLCkt3UYsOViidHJsrTeW7CO8v+Nzair3gbNa zKZEs6EyoGrd9l5nMv3D0z/PFWnTCMehRBHYYp6D/chgT+2cpQsVsq9SbN3gV5CR sdFrGOzWSemtWCqVibdsKa4j0IqLVNRAPv1MYC7jECOkRvMkbkXFWZK8Q8ijs0Mv th4nt9SJOSH+OAzBhXoH7bW9BUwfCEwnBYXhsidHVwf1S6+5/yFj24zQAz/1YZ21 ydTS6PHS66oFqa5QITNbfNuqeprrnb+8JavkYvJnWiPfg8yJ2gURZ/4pCHS8iMsB mQpLaQlTsEHODj7Vc4GSTlbzWDgtCSB6v0fOPJP01Bzf+z2wsqG6dLJVrWMl6nT9 0sNvfnU9egne8UhbTJ8fot9SISRjc/sFMXKnMMdcY+7KWdw+Kh4t7/hL5ZbKCZ8H Pg3NNTA37puqTn9qvd0ZA5Ey9L0bSjeQpGa2A+nutE3zfRL9MQFWWrpHPgfJyoZe VOZSK0dtSfYtTsCnno5S2px2XqPpXIf0P5LiXzvtc+x7JjZsP3G4jZxB8ql+ZaQF IeBI4Kvg4QPU2T/TUwt5VIMEL/I3yZ63/0L+d1xnIqaVZVK0RDQ= =c5nj -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://github.com/openrisc/linux Pull OpenRISC updates from Stafford Horne: "This includes two minor cleanups, plus a bug fix for OpenRISC TLB flush code that allows the the SMP kernel to boot again" * tag 'for-linus' of git://github.com/openrisc/linux: openrisc: fix SMP tlb flush NULL pointer dereference openrisc: signal: remove unused DEBUG_SIG macro openrisc: time: don't mark comment as kernel-doc	2021-11-08 09:31:25 -08:00
Linus Torvalds	bbdbeb0048	perf tools changes for v5.16: perf annotate: - Add riscv64 support. - Add fusion logic for AMD microarchs. perf record: - Add an option to control the synthesizing behavior: --synth <no\|all\|task\|mmap\|cgroup> Fine-tune event synthesis: default=all core: - Allow controlling synthesizing PERF_RECORD_ metadata events during record. - perf.data reader prep work for multithreaded processing. - Fix missing exclude_{host,guest} setting in PMUs that don't support it and that were causing the feature detection code to disable it for all events, even the ones in PMUs that support it. - Fix the default use of precise events on AMD, that were always falling back to non-precise because perf_event_attr.exclude_guest=1 was set and IBS does not have filtering capability, refusing precise + exclude_guest. - Add bitfield_swap() to handle branch_stack endian issue. perf script: - Show binary offsets for userspace addresses in callchains. - Support instruction latency via new "ins_lat" selectable field. - Add dlfilter-show-cycles perf inject: - Add vmlinux and ignore-vmlinux arguments, similar to other tools. perf list: - Display PMU prefix for partially supported hybrid cache events. - Display hybrid PMU events with cpu type. perf stat: - Improve metrics documentation of data structures. - Fix memory leaks in the metric code. - Use NAN for missing event IDs. - Don't compute unused events. - Fix memory leak on error path. - Encode and use metric-id as a metric qualifier. - Allow metrics with no events. - Avoid events for an 'if' constant result. - Only add a referenced metric once. - Simplify metric_refs calculation. - Allow modifiers on metrics. perf test: - Add workload test of metric and metric groups. - Workload test of all PMUs. - vmlinux-kallsyms: Ignore hidden symbols. - Add pmu-event test for event described as "config=". - Verify more event members in pmu-events test. - Add endian test for struct branch_flags on the sample-parsing test. - Improve temp file cleanup in several tests. perf daemon: - Address MSAN warnings on send_cmd(). perf kmem: - Improve man page for record options perf srcline: - Use long-running addr2line per DSO, greatly speeding up the 'srcline' sort order. perf symbols: - Ignore $a/$d symbols for ARM modules. - Fix /proc/kcore access on 32 bit systems. Kernel UAPI copies: - Update copy of linux/socket.h with the kernel sources, no change in tooling output. libbpf: - Pull in bpf_program__get_prog_info_linear() from libbpf, too much specific to perf. - Deprecate bpf_map__resize() in favor of bpf_map_set_max_entries() - Install libbpf headers locally when building. - Bump minimum LLVM C++ std to GNU++14. libperf: - Use binary search in perf_cpu_map__idx() as array are sorted. libtracefs: - Enable libtracefs dynamic linking. libtraceevent: - Increase logging when verbose. Arch specific: PowerPC: - Add support to expose instruction and data address registers as part of extended regs. Vendor events: JSON parser: - Support ConfigCode to set the config= in PMUs like: $ cat /sys/bus/event_source/devices/hisi_sccl1_ddrc3/events/act_cmd config=0x5 - Make the JSON parser more conformant when in strict mode. All JSON files: - Fix all remaining invalid JSON files. ARM: - Syntax corrections in Neoverse N1 json. - Categorise the Neoverse V1 counters. - Add new armv8 PMU events. - Revise hip08 uncore events. Hardware tracing: auxtrace: - Add missing Z option to ITRACE_HELP. - Add itrace A option to approximate IPC. - Add itrace d+o option to direct debug log to stdout. Intel PT: - Add support for PERF_RECORD_AUX_OUTPUT_HW_ID - Support itrace A option to approximate IPC - Support itrace d+o option to direct debug log to stdout. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYYg7RwAKCRCyPKLppCJ+ J1KgAQCGC3802gsI38/xwli1SLBNHsZ9DEy2nX5Ikw/f64K+0QEA6DsU0hpRTR0D i8ZFa2zNX748n+pU2WX6E7AjO01hrQo= =sXHg -----END PGP SIGNATURE----- Merge tag 'perf-tools-for-v5.16-2021-11-07-without-bpftool-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools updates from Arnaldo Carvalho de Melo: "perf annotate: - Add riscv64 support. - Add fusion logic for AMD microarchs. perf record: - Add an option to control the synthesizing behavior: --synth <no\|all\|task\|mmap\|cgroup> core: - Allow controlling synthesizing PERF_RECORD_ metadata events during record. - perf.data reader prep work for multithreaded processing. - Fix missing exclude_{host,guest} setting in PMUs that don't support it and that were causing the feature detection code to disable it for all events, even the ones in PMUs that support it. - Fix the default use of precise events on AMD, that were always falling back to non-precise because perf_event_attr.exclude_guest=1 was set and IBS does not have filtering capability, refusing precise + exclude_guest. - Add bitfield_swap() to handle branch_stack endian issue. perf script: - Show binary offsets for userspace addresses in callchains. - Support instruction latency via new "ins_lat" selectable field. - Add dlfilter-show-cycles perf inject: - Add vmlinux and ignore-vmlinux arguments, similar to other tools. perf list: - Display PMU prefix for partially supported hybrid cache events. - Display hybrid PMU events with cpu type. perf stat: - Improve metrics documentation of data structures. - Fix memory leaks in the metric code. - Use NAN for missing event IDs. - Don't compute unused events. - Fix memory leak on error path. - Encode and use metric-id as a metric qualifier. - Allow metrics with no events. - Avoid events for an 'if' constant result. - Only add a referenced metric once. - Simplify metric_refs calculation. - Allow modifiers on metrics. perf test: - Add workload test of metric and metric groups. - Workload test of all PMUs. - vmlinux-kallsyms: Ignore hidden symbols. - Add pmu-event test for event described as "config=". - Verify more event members in pmu-events test. - Add endian test for struct branch_flags on the sample-parsing test. - Improve temp file cleanup in several tests. perf daemon: - Address MSAN warnings on send_cmd(). perf kmem: - Improve man page for record options perf srcline: - Use long-running addr2line per DSO, greatly speeding up the 'srcline' sort order. perf symbols: - Ignore $a/$d symbols for ARM modules. - Fix /proc/kcore access on 32 bit systems. Kernel UAPI copies: - Update copy of linux/socket.h with the kernel sources, no change in tooling output. libbpf: - Pull in bpf_program__get_prog_info_linear() from libbpf, too much specific to perf. - Deprecate bpf_map__resize() in favor of bpf_map_set_max_entries() - Install libbpf headers locally when building. - Bump minimum LLVM C++ std to GNU++14. libperf: - Use binary search in perf_cpu_map__idx() as array are sorted. libtracefs: - Enable libtracefs dynamic linking. libtraceevent: - Increase logging when verbose. Arch specific: * PowerPC: - Add support to expose instruction and data address registers as part of extended regs. Vendor events: * JSON parser: - Support ConfigCode to set the config= in PMUs - Make the JSON parser more conformant when in strict mode. * All JSON files: - Fix all remaining invalid JSON files. * ARM: - Syntax corrections in Neoverse N1 json. - Categorise the Neoverse V1 counters. - Add new armv8 PMU events. - Revise hip08 uncore events. Hardware tracing: * auxtrace: - Add missing Z option to ITRACE_HELP. - Add itrace A option to approximate IPC. - Add itrace d+o option to direct debug log to stdout. * Intel PT: - Add support for PERF_RECORD_AUX_OUTPUT_HW_ID - Support itrace A option to approximate IPC - Support itrace d+o option to direct debug log to stdout" * tag 'perf-tools-for-v5.16-2021-11-07-without-bpftool-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (120 commits) perf build: Install libbpf headers locally when building perf MANIFEST: Add bpftool files to allow building with BUILD_BPF_SKEL=1 perf metric: Fix memory leaks perf parse-event: Add init and exit to parse_event_error perf parse-events: Rename parse_events_error functions perf stat: Fix memory leak on error path perf tools: Use __BYTE_ORDER__ perf inject: Add vmlinux and ignore-vmlinux arguments perf tools: Check vmlinux/kallsyms arguments in all tools perf tools: Refactor out kernel symbol argument sanity checking perf symbols: Ignore $a/$d symbols for ARM modules perf evsel: Don't set exclude_guest by default perf evsel: Fix missing exclude_{host,guest} setting perf bpf: Add missing free to bpf_event__print_bpf_prog_info() perf beauty: Update copy of linux/socket.h with the kernel sources perf clang: Fixes for more recent LLVM/clang tools: Bump minimum LLVM C++ std to GNU++14 perf bpf: Pull in bpf_program__get_prog_info_linear() Revert "perf bench futex: Add support for 32-bit systems with 64-bit time_t" perf test sample-parsing: Add endian test for struct branch_flags ...	2021-11-08 09:25:26 -08:00
Linus Torvalds	1e9ed9360f	Kbuild updates for v5.16 - Remove the global -isystem compiler flag, which was made possible by the introduction of <linux/stdarg.h> - Improve the Kconfig help to print the location in the top menu level - Fix "FORCE prerequisite is missing" build warning for sparc - Add new build targets, tarzst-pkg and perf-tarzst-src-pkg, which generate a zstd-compressed tarball - Prevent gen_init_cpio tool from generating a corrupted cpio when KBUILD_BUILD_TIMESTAMP is set to 2106-02-07 or later - Misc cleanups -----BEGIN PGP SIGNATURE----- iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmGGkysVHG1hc2FoaXJv eUBrZXJuZWwub3JnAAoJED2LAQed4NsGgZkQAIX4i9Tt6pyl/2xGDGkzUqjprfoH QUIo1DoUclLUygoakrrrX3EnZLWrslgPTKjQxdiV6RA6xHfe4cYgNTSq8zM9lsPT lu+B4nEDqoXQ5gyLxMlnjS3FRQTNYIeBZEhSAIiW8TENdLKlKc+NYdoj7th50dO0 SkXRa2dpWHa6t7ZRqHIHMpUWA7gm0w22ZbgQmyUv1CDGO4IHPLqe2b2PMsrzhSZ1 yypP1l6aQVKuP0hN9aytbTRqDxUd0uOzBf00PK5zx23hjdwZ9wmZrFTKDf9fAu/+ nR7gBsa5YoYNQh3UkayZXjR5dClmgsCXZ25OXI7YucQp/8OJ5fadfn1NFpJHsw56 n5cckbHIXgnFUcel5YlkR6qTHjpzdr9vHm90MmiuX99b3oy9czl6pY3qkNfRkllQ v7ME5L1qlw3P3ia1KA+H4zW/LIJ8p5cbKBwaY22m3kY3bTx7PiOfMlep4UVqxXSb 0/OqxSsmYg5LlmwEQ0SSsx45hE0o9nG/cdjkHu1jUOUHxYfpt1T4MTILeGUwmjzd TydJym5MZyXBawu4NVB3QLoKm5Jt2BXtyaWOtq74VSrs77roNCdYuQWJ+1aBf2Pg 0s4CVC2cC7KlxJDImoqswZATGXPMfbiVDcuVSSukYRgBMeCBPUzRhB8YP36BZyD3 9vFYmqSujtUU7nWb =ATFN -----END PGP SIGNATURE----- Merge tag 'kbuild-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Remove the global -isystem compiler flag, which was made possible by the introduction of <linux/stdarg.h> - Improve the Kconfig help to print the location in the top menu level - Fix "FORCE prerequisite is missing" build warning for sparc - Add new build targets, tarzst-pkg and perf-tarzst-src-pkg, which generate a zstd-compressed tarball - Prevent gen_init_cpio tool from generating a corrupted cpio when KBUILD_BUILD_TIMESTAMP is set to 2106-02-07 or later - Misc cleanups * tag 'kbuild-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (28 commits) kbuild: use more subdir- for visiting subdirectories while cleaning sh: remove meaningless archclean line initramfs: Check timestamp to prevent broken cpio archive kbuild: split DEBUG_CFLAGS out to scripts/Makefile.debug gen_init_cpio: add static const qualifiers kbuild: Add make tarzst-pkg build option scripts: update the comments of kallsyms support sparc: Add missing "FORCE" target when using if_changed kconfig: refactor conf_touch_dep() kconfig: refactor conf_write_dep() kconfig: refactor conf_write_autoconf() kconfig: add conf_get_autoheader_name() kconfig: move sym_escape_string_value() to confdata.c kconfig: refactor listnewconfig code kconfig: refactor conf_write_symbol() kconfig: refactor conf_write_heading() kconfig: remove 'const' from the return type of sym_escape_string_value() kconfig: rename a variable in the lexer to a clearer name kconfig: narrow the scope of variables in the lexer kconfig: Create links to main menu items in search ...	2021-11-08 09:15:45 -08:00
Linus Torvalds	67b7e1f241	modules patches for 5.16-rc1 As requested by Jessica I'm stepping in to help with modules maintenance. This is my first pull request to you. I've collected only two patches for modules for the 5.16-rc1 merge window. These patches are from Shuah Khan as she debugged some corner case error with modules. The error messages are improved for elf_validity_check(). While doing this work a corner case fix was spotted on validate_section_offset() due to a possible overflow bug on 64-bit. The impact of this fix is low given this just limits module section headers placed within the 32-bit boundary, and we obviously don't have insane module sizes. Even if a specially crafted module is constructed later checks would invalidate the module right away. I've let this sit through 0-day testing since October 15th with no issues found. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> -----BEGIN PGP SIGNATURE----- iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmGFrvcSHG1jZ3JvZkBr ZXJuZWwub3JnAAoJEM4jHQowkoinhFAP/1BBXuM/vevC1IdZaEU4M8pg07NOpkZt PYJc8CxWKTtEg5hrLJMqOexXGwvAg/nq28IFWvUKh3bGtEghPyrQu6+I4mXsjnjJ t9/AO+BOYU14DJGDAYEuReNsaAcyeRooHLriuUaNvhhaN9q+v+FRyBWNphmA6Tz7 VkCtmCNMFJZlhd9Cu4jOZpJe6CIe9gZ0czYfRshAl/3ZRSQjYaddtbYf1Cs8Vwah by4o2YyvctrRzeOj/Fy+kbqZw2St39nZ5fKYwijRn1ZwHRQo6NQqrlMeS8rI0LgG 1YwWgNWO1FjaPzyIFcAhk2bUF2TxEf5/eVpXn2qXHnmVZ55oBPP/O7Th0/5OK9gD utOMbO1nqBLBXUyX/1dO/UT36XcrqtUP0Y9VgjIvj9n8Y82RGYmBScH/TOU1f7A7 sH56sW9/3YvIOe8AShBHJ7IKqZXU0inIGasFYwKKm2pAOLtajaC9Sr5fqVbuyfNF J2+nXipVzjI0f9SGTqmE41jynFGln6nfd1pgOOiysg9ZqxieINB0J8l0OHe6fZz/ zU4TehXZHE9DApP8D+rVpP0ltwR2YWs2u0zRqHr/0GEWYH00JZu2ymDR13W7izSp KiiveBxhwBpewgV5cyua8TDyeKhn3mEJFNmijlaq4yq1P2oKeWTQRDRZjwUP8EZY s16oV+BW7Kp+ =Evek -----END PGP SIGNATURE----- Merge tag 'modules-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux Pull module updates from Luis Chamberlain: "As requested by Jessica I'm stepping in to help with modules maintenance. This is my first pull request to you. I've collected only two patches for modules for the 5.16-rc1 merge window. These patches are from Shuah Khan as she debugged some corner case error with modules. The error messages are improved for elf_validity_check(). While doing this work a corner case fix was spotted on validate_section_offset() due to a possible overflow bug on 64-bit. The impact of this fix is low given this just limits module section headers placed within the 32-bit boundary, and we obviously don't have insane module sizes. Even if a specially crafted module is constructed later checks would invalidate the module right away. I've let this sit through 0-day testing since October 15th with no issues found" * tag 'modules-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: module: change to print useful messages from elf_validity_check() module: fix validate_section_offset() overflow bug on 64-bit	2021-11-08 09:04:59 -08:00
Arnd Bergmann	f91140e455	soc: ti: fix wkup_m3_rproc_boot_thread return type The wkup_m3_rproc_boot_thread() function uses a nonstandard prototype, which broke after Eric's recent cleanup: drivers/soc/ti/wkup_m3_ipc.c: In function 'wkup_m3_rproc_boot_thread': drivers/soc/ti/wkup_m3_ipc.c:429:16: error: 'return' with a value, in function returning void [-Werror=return-type] 429 \| return 0; \| ^ drivers/soc/ti/wkup_m3_ipc.c:416:13: note: declared here 416 \| static void wkup_m3_rproc_boot_thread(struct wkup_m3_ipc *m3_ipc) \| ^~~~~~~~~~~~~~~~~~~~~~~~~ Change it to the normal prototype as it should have been from the start. Fixes: `111e70490d` ("exit/kthread: Have kernel threads return instead of calling do_exit") Fixes: `cdd5de500b` ("soc: ti: Add wkup_m3_ipc driver") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lkml.kernel.org/r/20211105075119.2327190-1-arnd@kernel.org Acked-by: Santosh Shilimkar <ssantosh@kernel.org> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2021-11-08 11:01:51 -06:00
Arnd Bergmann	501586ea59	xen/balloon: fix unused-variable warning In configurations with CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=n and CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=y, gcc warns about an unused variable: drivers/xen/balloon.c:83:12: error: 'xen_hotplug_unpopulated' defined but not used [-Werror=unused-variable] Since this is always zero when CONFIG_XEN_BALLOON_MEMORY_HOTPLUG is disabled, turn it into a preprocessor constant in that case. Fixes: `121f2faca2` ("xen/balloon: rename alloc/free_xenballooned_pages") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20211108111408.3940366-1-arnd@kernel.org Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2021-11-08 10:40:17 -06:00
Pavel Begunkov	bad119b9a0	io_uring: honour zeroes as io-wq worker limits When we pass in zero as an io-wq worker number limit it shouldn't actually change the limits but return the old value, follow that behaviour with deferred limits setup as well. Cc: stable@kernel.org # 5.15 Reported-by: Beld Zhang <beldzhang@gmail.com> Fixes: `e139a1ec92` ("io_uring: apply max_workers limit to all future users") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1b222a92f7a78a24b042763805e891a4cdd4b544.1636384034.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-11-08 08:39:48 -07:00
Takashi Iwai	43d35ccc36	ALSA: pci: rme: Fix unaligned buffer addresses The recent fix for setting up the DMA buffer type on RME drivers tried to address the non-standard memory managements and changed the DMA buffer information to the standard snd_dma_buffer object that is allocated at the probe time. However, I overlooked that the RME drivers handle the buffer addresses based on 64k alignment, and the previous conversion broke that silently. This patch is an attempt to fix the regression. The snd_dma_buffer objects are copied to the original data with the correction to the aligned accesses, and those are passed to snd_pcm_set_runtime_buffer() helpers instead. The original snd_dma_buffer objects are managed by devres, hence they'll be released automagically. Fixes: `0899a7a230` ("ALSA: pci: rme: Set up buffer type properly") Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20211108145752.30572-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>	2021-11-08 15:58:31 +01:00
Ye Bin	a846a8e6c9	blk-mq: don't free tags if the tag_set is used by other device in queue initialztion We got UAF report on v5.10 as follows: [ 1446.674930] ================================================================== [ 1446.675970] BUG: KASAN: use-after-free in blk_mq_get_driver_tag+0x9a4/0xa90 [ 1446.676902] Read of size 8 at addr ffff8880185afd10 by task kworker/1:2/12348 [ 1446.677851] [ 1446.678073] CPU: 1 PID: 12348 Comm: kworker/1:2 Not tainted 5.10.0-10177-gc9c81b1e346a #2 [ 1446.679168] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 1446.680692] Workqueue: kthrotld blk_throtl_dispatch_work_fn [ 1446.681448] Call Trace: [ 1446.681800] dump_stack+0x9b/0xce [ 1446.682916] print_address_description.constprop.6+0x3e/0x60 [ 1446.685999] kasan_report.cold.9+0x22/0x3a [ 1446.687186] blk_mq_get_driver_tag+0x9a4/0xa90 [ 1446.687785] blk_mq_dispatch_rq_list+0x21a/0x1d40 [ 1446.692576] __blk_mq_do_dispatch_sched+0x394/0x830 [ 1446.695758] __blk_mq_sched_dispatch_requests+0x398/0x4f0 [ 1446.698279] blk_mq_sched_dispatch_requests+0xdf/0x140 [ 1446.698967] __blk_mq_run_hw_queue+0xc0/0x270 [ 1446.699561] __blk_mq_delay_run_hw_queue+0x4cc/0x550 [ 1446.701407] blk_mq_run_hw_queue+0x13b/0x2b0 [ 1446.702593] blk_mq_sched_insert_requests+0x1de/0x390 [ 1446.703309] blk_mq_flush_plug_list+0x4b4/0x760 [ 1446.705408] blk_flush_plug_list+0x2c5/0x480 [ 1446.708471] blk_finish_plug+0x55/0xa0 [ 1446.708980] blk_throtl_dispatch_work_fn+0x23b/0x2e0 [ 1446.711236] process_one_work+0x6d4/0xfe0 [ 1446.711778] worker_thread+0x91/0xc80 [ 1446.713400] kthread+0x32d/0x3f0 [ 1446.714362] ret_from_fork+0x1f/0x30 [ 1446.714846] [ 1446.715062] Allocated by task 1: [ 1446.715509] kasan_save_stack+0x19/0x40 [ 1446.716026] __kasan_kmalloc.constprop.1+0xc1/0xd0 [ 1446.716673] blk_mq_init_tags+0x6d/0x330 [ 1446.717207] blk_mq_alloc_rq_map+0x50/0x1c0 [ 1446.717769] __blk_mq_alloc_map_and_request+0xe5/0x320 [ 1446.718459] blk_mq_alloc_tag_set+0x679/0xdc0 [ 1446.719050] scsi_add_host_with_dma.cold.3+0xa0/0x5db [ 1446.719736] virtscsi_probe+0x7bf/0xbd0 [ 1446.720265] virtio_dev_probe+0x402/0x6c0 [ 1446.720808] really_probe+0x276/0xde0 [ 1446.721320] driver_probe_device+0x267/0x3d0 [ 1446.721892] device_driver_attach+0xfe/0x140 [ 1446.722491] __driver_attach+0x13a/0x2c0 [ 1446.723037] bus_for_each_dev+0x146/0x1c0 [ 1446.723603] bus_add_driver+0x3fc/0x680 [ 1446.724145] driver_register+0x1c0/0x400 [ 1446.724693] init+0xa2/0xe8 [ 1446.725091] do_one_initcall+0x9e/0x310 [ 1446.725626] kernel_init_freeable+0xc56/0xcb9 [ 1446.726231] kernel_init+0x11/0x198 [ 1446.726714] ret_from_fork+0x1f/0x30 [ 1446.727212] [ 1446.727433] Freed by task 26992: [ 1446.727882] kasan_save_stack+0x19/0x40 [ 1446.728420] kasan_set_track+0x1c/0x30 [ 1446.728943] kasan_set_free_info+0x1b/0x30 [ 1446.729517] __kasan_slab_free+0x111/0x160 [ 1446.730084] kfree+0xb8/0x520 [ 1446.730507] blk_mq_free_map_and_requests+0x10b/0x1b0 [ 1446.731206] blk_mq_realloc_hw_ctxs+0x8cb/0x15b0 [ 1446.731844] blk_mq_init_allocated_queue+0x374/0x1380 [ 1446.732540] blk_mq_init_queue_data+0x7f/0xd0 [ 1446.733155] scsi_mq_alloc_queue+0x45/0x170 [ 1446.733730] scsi_alloc_sdev+0x73c/0xb20 [ 1446.734281] scsi_probe_and_add_lun+0x9a6/0x2d90 [ 1446.734916] __scsi_scan_target+0x208/0xc50 [ 1446.735500] scsi_scan_channel.part.3+0x113/0x170 [ 1446.736149] scsi_scan_host_selected+0x25a/0x360 [ 1446.736783] store_scan+0x290/0x2d0 [ 1446.737275] dev_attr_store+0x55/0x80 [ 1446.737782] sysfs_kf_write+0x132/0x190 [ 1446.738313] kernfs_fop_write_iter+0x319/0x4b0 [ 1446.738921] new_sync_write+0x40e/0x5c0 [ 1446.739429] vfs_write+0x519/0x720 [ 1446.739877] ksys_write+0xf8/0x1f0 [ 1446.740332] do_syscall_64+0x2d/0x40 [ 1446.740802] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 1446.741462] [ 1446.741670] The buggy address belongs to the object at ffff8880185afd00 [ 1446.741670] which belongs to the cache kmalloc-256 of size 256 [ 1446.743276] The buggy address is located 16 bytes inside of [ 1446.743276] 256-byte region [ffff8880185afd00, ffff8880185afe00) [ 1446.744765] The buggy address belongs to the page: [ 1446.745416] page:ffffea0000616b00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x185ac [ 1446.746694] head:ffffea0000616b00 order:2 compound_mapcount:0 compound_pincount:0 [ 1446.747719] flags: 0x1fffff80010200(slab\|head) [ 1446.748337] raw: 001fffff80010200 ffffea00006a3208 ffffea000061bf08 ffff88801004f240 [ 1446.749404] raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000 [ 1446.750455] page dumped because: kasan: bad access detected [ 1446.751227] [ 1446.751445] Memory state around the buggy address: [ 1446.752102] ffff8880185afc00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 1446.753090] ffff8880185afc80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 1446.754079] >ffff8880185afd00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 1446.755065] ^ [ 1446.755589] ffff8880185afd80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 1446.756574] ffff8880185afe00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 1446.757566] ================================================================== Flag 'BLK_MQ_F_TAG_QUEUE_SHARED' will be set if the second device on the same host initializes it's queue successfully. However, if the second device failed to allocate memory in blk_mq_alloc_and_init_hctx() from blk_mq_realloc_hw_ctxs() from blk_mq_init_allocated_queue(), __blk_mq_free_map_and_rqs() will be called on error path, and if 'BLK_MQ_TAG_HCTX_SHARED' is not set, 'tag_set->tags' will be freed while it's still used by the first device. To fix this issue we move release newly allocated hardware context from blk_mq_realloc_hw_ctxs to __blk_mq_update_nr_hw_queues. As there is needn't to release hardware context in blk_mq_init_allocated_queue. Fixes: `868f2f0b72` ("blk-mq: dynamic h/w context count") Signed-off-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20211108074019.1058843-1-yebin10@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-11-08 06:23:49 -07:00
Coly Li	2878feaed5	bcache: Revert "bcache: use bvec_virt" This reverts commit `2fd3e5efe7`. The above commit replaces page_address(bv->bv_page) by bvec_virt(bv) to avoid directly access to bv->bv_page, but in situation bv->bv_offset is not zero and page_address(bv->bv_page) is not equal to bvec_virt(bv). In such case a memory corruption may happen because memory in next page is tainted by following line in do_btree_node_write(), memcpy(bvec_virt(bv), addr, PAGE_SIZE); This patch reverts the mentioned commit to avoid the memory corruption. Fixes: `2fd3e5efe7` ("bcache: use bvec_virt") Signed-off-by: Coly Li <colyli@suse.de> Cc: Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org # 5.15 Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211103151041.70516-1-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-11-08 06:23:17 -07:00
Vineeth Vijayan	a4751f157c	s390/cio: check the subchannel validity for dev_busid Check the validity of subchanel before reading other fields in the schib. Fixes: `d3683c0552` ("s390/cio: add dev_busid sysfs entry for each subchannel") CC: <stable@vger.kernel.org> Reported-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/20211105154451.847288-1-vneethv@linux.ibm.com Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2021-11-08 14:17:49 +01:00
Thomas Richter	9d48c7afed	s390/cpumf: cpum_cf PMU displays invalid value after hotplug remove When a CPU is hotplugged while the perf stat -e cycles command is running, a wrong (very large) value is displayed immediately after the CPU removal: Check the values, shouldn't be too high as in time counts unit events 1.001101919 29261846 cycles 2.002454499 17523405 cycles 3.003659292 24361161 cycles 4.004816983 18446744073638406144 cycles 5.005671647 <not counted> cycles ... The CPU hotplug off took place after 3 seconds. The issue is the read of the event count value after 4 seconds when the CPU is not available and the read of the counter returns an error. This is treated as a counter value of zero. This results in a very large value (0 - previous_value). Fix this by detecting the hotplugged off CPU and report 0 instead of a very large number. Cc: stable@vger.kernel.org Fixes: `a029a4eab3` ("s390/cpumf: Allow concurrent access for CPU Measurement Counter Facility") Reported-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2021-11-08 14:17:49 +01:00
Sven Schnelle	213fca9e23	s390/tape: fix timer initialization in tape_std_assign() commit `9c6c273aa4` ("timer: Remove init_timer_on_stack() in favor of timer_setup_on_stack()") changed the timer setup from init_timer_on_stack(() to timer_setup(), but missed to change the mod_timer() call. And while at it, use msecs_to_jiffies() instead of the open coded timeout calculation. Cc: stable@vger.kernel.org Fixes: `9c6c273aa4` ("timer: Remove init_timer_on_stack() in favor of timer_setup_on_stack()") Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2021-11-08 14:17:49 +01:00
Niklas Schnelle	4cdf2f4e24	s390/pci: implement minimal PCI error recovery When the platform detects an error on a PCI function or a service action has been performed it is put in the error state and an error event notification is provided to the OS. Currently we treat all error event notifications the same and simply set pdev->error_state = pci_channel_io_perm_failure requiring user intervention such as use of the recover attribute to get the device usable again. Despite requiring a manual step this also has the disadvantage that the device is completely torn down and recreated resulting in higher level devices such as a block or network device being recreated. In case of a block device this also means that it may need to be removed and added to a software raid even if that could otherwise survive with a temporary degradation. This is of course not ideal more so since an error notification with PEC 0x3A indicates that the platform already performed error recovery successfully or that the error state was caused by a service action that is now finished. At least in this case we can assume that the error state can be reset and the function made usable again. So as not to have the disadvantage of a full tear down and recreation we need to coordinate this recovery with the driver. Thankfully there is already a well defined recovery flow for this described in Documentation/PCI/pci-error-recovery.rst. The implementation of this is somewhat straight forward and simplified by the fact that our recovery flow is defined per PCI function. As a reset we use the newly introduced zpci_hot_reset_device() which also takes the PCI function out of the error state. Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Acked-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2021-11-08 14:17:49 +01:00
Niklas Schnelle	dfd5bb23ad	PCI: Export pci_dev_lock() Commit `e3a9b1212b` ("PCI: Export pci_dev_trylock() and pci_dev_unlock()") already exported pci_dev_trylock()/pci_dev_unlock() however in some circumstances such as during error recovery it makes sense to block waiting to get full access to the device so also export pci_dev_lock(). Link: https://lore.kernel.org/all/20210928181014.GA713179@bhelgaas/ Acked-by: Pierre Morel <pmorel@linux.ibm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2021-11-08 14:17:49 +01:00
Niklas Schnelle	da995d538d	s390/pci: implement reset_slot for hotplug slot This is done by adding a zpci_hot_reset_device() call which does a low level reset of the PCI function without changing its higher level function state. This way it can be used while the zPCI function is bound to a driver and with DMA tables being controlled either through the IOMMU or DMA APIs which is prohibited when using zpci_disable_device() as that drop existing DMA translations. As this reset, unlike a normal FLR, also calls zpci_clear_irq() we need to implement arch_restore_msi_irqs() and make sure we re-enable IRQs for the PCI function if they were previously disabled. Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2021-11-08 14:17:49 +01:00
Niklas Schnelle	4fe2049770	s390/pci: refresh function handle in iomap The function handle of a PCI function is updated when disabling or enabling it as well as when the function's availability changes or it enters the error state. Until now this only occurred either while there is no struct pci_dev associated with the function yet or the function became unavailable. This meant that leaving a stale function handle in the iomap either didn't happen because there was no iomap yet or it lead to errors on PCI access but so would the correct disabled function handle. In the future a CLP Set PCI Function Disable/Enable cycle during PCI device recovery may be done while the device is bound to a driver. In this case we must update the iomap associated with the now-stale function handle to ensure that the resulting zPCI instruction references an accurate function handle. Since the function handle is accessed by the PCI accessor helpers without locking use READ_ONCE()/WRITE_ONCE() to mark this access and prevent compiler optimizations that would move the load/store. With that infrastructure in place let's also properly update the function handle in the existing cases. This makes sure that in the future debugging of a zPCI function access through the handle will show an up to date handle reducing the chance of confusion. Also it makes sure we have one single place where a zPCI function handle is updated after initialization. Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2021-11-08 14:17:49 +01:00
Vivek Kasireddy	d89c0c8322	drm/virtio: Fix NULL dereference error in virtio_gpu_poll When virgl is not enabled, vfpriv pointer would not be allocated. Therefore, check for a valid value before dereferencing. Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de> Cc: Gurchetan Singh <gurchetansingh@chromium.org> Cc: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Tested-by: Christian Zigotzky <chzigotzky@xenosoft.de> Link: http://patchwork.freedesktop.org/patch/msgid/20211104214249.1802789-1-vivek.kasireddy@intel.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>	2021-11-08 13:54:41 +01:00
Takashi Sakamoto	411ac2982c	ALSA: firewire-motu: add support for MOTU Track 16 Mark of the Unicorn designed Track 16 2011 as one of models in third generation of its FireWire series. The model is already discontinued. It consists of below ICs: * Texas Instruments TSB41AB1 * Microchip (SMSC) USB3300 * Xilinx Spartan-3A FPGA, XC3S700A * Texas Instruments TMS320C6722 * Microchip (Atmel) AT91SAM SAM7S512 It supports sampling transfer frequency up to 192.0 kHz. The packet format differs depending on both of current sampling transfer frequency and the type of signal in optical interfaces. The model supports transmission of PCM frames as well as MIDI messages. The model supports command mechanism to configure internal DSP. Hardware meter information is available in the first 2 chunks of each data block of tx packet. This commit adds support for it. $ cd linux-firewire-tools/src $ python crpp < /sys/bus/firewire/devices/fw1/config_rom ROM header and bus information block ----------------------------------------------------------------- 400 04107d95 bus_info_length 4, crc_length 16, crc 32149 404 31333934 bus_name "1394" 408 20ff7000 irmc 0, cmc 0, isc 1, bmc 0, cyc_clk_acc 255, max_rec 7 (256) 40c 0001f200 company_id 0001f2 \| 410 000a83c4 device_id 00000a83c4 \| EUI-64 0001f200000a83c4 root directory ----------------------------------------------------------------- 414 0004ef04 directory_length 4, crc 61188 418 030001f2 vendor 41c 0c0083c0 node capabilities per IEEE 1394 420 d1000002 --> unit directory at 428 424 8d000005 --> eui-64 leaf at 438 unit directory at 428 ----------------------------------------------------------------- 428 00035b04 directory_length 3, crc 23300 42c 120001f2 specifier id 430 13000039 version 434 17102800 model eui-64 leaf at 438 ----------------------------------------------------------------- 438 0002b25f leaf_length 2, crc 45663 43c 0001f200 company_id 0001f2 \| 440 000a83c4 device_id 00000a83c4 \| EUI-64 0001f200000a83c4 Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Link: https://lore.kernel.org/r/20211107110644.23511-1-o-takashi@sakamocchi.jp Signed-off-by: Takashi Iwai <tiwai@suse.de>	2021-11-08 13:49:27 +01:00
YueHaibing	08e873cb70	KVM: arm64: Change the return type of kvm_vcpu_preferred_target() kvm_vcpu_preferred_target() always return 0 because kvm_target_cpu() never returns a negative error code. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211105011500.16280-1-yuehaibing@huawei.com	2021-11-08 10:48:47 +00:00
Randy Dunlap	deacd669e1	KVM: arm64: nvhe: Fix a non-kernel-doc comment Do not use kernel-doc "/" notation when the comment is not in kernel-doc format. Fixes this docs build warning: arch/arm64/kvm/hyp/nvhe/sys_regs.c:478: warning: This comment starts with '/', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * Handler for protected VM restricted exceptions. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: kvmarm@lists.cs.columbia.edu Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211106032529.15057-1-rdunlap@infradead.org	2021-11-08 10:43:35 +00:00
Mark Rutland	8bb084119f	KVM: arm64: Extract ESR_ELx.EC only Since ARMv8.0 the upper 32 bits of ESR_ELx have been RES0, and recently some of the upper bits gained a meaning and can be non-zero. For example, when FEAT_LS64 is implemented, ESR_ELx[36:32] contain ISS2, which for an ST64BV or ST64BV0 can be non-zero. This can be seen in ARM DDI 0487G.b, page D13-3145, section D13.2.37. Generally, we must not rely on RES0 bit remaining zero in future, and when extracting ESR_ELx.EC we must mask out all other bits. All C code uses the ESR_ELx_EC() macro, which masks out the irrelevant bits, and therefore no alterations are required to C code to avoid consuming irrelevant bits. In a couple of places the KVM assembly extracts ESR_ELx.EC using LSR on an X register, and so could in theory consume previously RES0 bits. In both cases this is for comparison with EC values ESR_ELx_EC_HVC32 and ESR_ELx_EC_HVC64, for which the upper bits of ESR_ELx must currently be zero, but this could change in future. This patch adjusts the KVM vectors to use UBFX rather than LSR to extract ESR_ELx.EC, ensuring these are robust to future additions to ESR_ELx. Cc: stable@vger.kernel.org Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211103110545.4613-1-mark.rutland@arm.com	2021-11-08 10:41:12 +00:00
Arnd Bergmann	c7c386fbc2	arm64: pgtable: make __pte_to_phys/__phys_to_pte_val inline functions gcc warns about undefined behavior the vmalloc code when building with CONFIG_ARM64_PA_BITS_52, when the 'idx++' in the argument to __phys_to_pte_val() is evaluated twice: mm/vmalloc.c: In function 'vmap_pfn_apply': mm/vmalloc.c:2800:58: error: operation on 'data->idx' may be undefined [-Werror=sequence-point] 2800 \| pte = pte_mkspecial(pfn_pte(data->pfns[data->idx++], data->prot)); \| ~~~~~~~~~^~ arch/arm64/include/asm/pgtable-types.h:25:37: note: in definition of macro '__pte' 25 \| #define __pte(x) ((pte_t) { (x) } ) \| ^ arch/arm64/include/asm/pgtable.h:80:15: note: in expansion of macro '__phys_to_pte_val' 80 \| __pte(__phys_to_pte_val((phys_addr_t)(pfn) << PAGE_SHIFT) \| pgprot_val(prot)) \| ^~~~~~~~~~~~~~~~~ mm/vmalloc.c:2800:30: note: in expansion of macro 'pfn_pte' 2800 \| pte = pte_mkspecial(pfn_pte(data->pfns[data->idx++], data->prot)); \| ^~~~~~~ I have no idea why this never showed up earlier, but the safest workaround appears to be changing those macros into inline functions so the arguments get evaluated only once. Cc: Matthew Wilcox <willy@infradead.org> Fixes: `75387b9263` ("arm64: handle 52-bit physical addresses in page table entries") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20211105075414.2553155-1-arnd@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2021-11-08 10:05:54 +00:00
Qian Cai	c6975d7cab	arm64: Track no early_pgtable_alloc() for kmemleak After switched page size from 64KB to 4KB on several arm64 servers here, kmemleak starts to run out of early memory pool due to a huge number of those early_pgtable_alloc() calls: kmemleak_alloc_phys() memblock_alloc_range_nid() memblock_phys_alloc_range() early_pgtable_alloc() init_pmd() alloc_init_pud() __create_pgd_mapping() __map_memblock() paging_init() setup_arch() start_kernel() Increased the default value of DEBUG_KMEMLEAK_MEM_POOL_SIZE by 4 times won't be enough for a server with 200GB+ memory. There isn't much interesting to check memory leaks for those early page tables and those early memory mappings should not reference to other memory. Hence, no kmemleak false positives, and we can safely skip tracking those early allocations from kmemleak like we did in the commit `fed84c7852` ("mm/memblock.c: skip kmemleak for kasan_init()") without needing to introduce complications to automatically scale the value depends on the runtime memory size etc. After the patch, the default value of DEBUG_KMEMLEAK_MEM_POOL_SIZE becomes sufficient again. Signed-off-by: Qian Cai <quic_qiancai@quicinc.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Link: https://lore.kernel.org/r/20211105150509.7826-1-quic_qiancai@quicinc.com Signed-off-by: Will Deacon <will@kernel.org>	2021-11-08 10:05:22 +00:00
Peter Collingbourne	aedad3e1c6	arm64: mte: change PR_MTE_TCF_NONE back into an unsigned long This constant was previously an unsigned long, but was changed into an int in commit `433c38f40f` ("arm64: mte: change ASYNC and SYNC TCF settings into bitfields"). This ended up causing spurious unsigned-signed comparison warnings in expressions such as: (x & PR_MTE_TCF_MASK) != PR_MTE_TCF_NONE Therefore, change it back into an unsigned long to silence these warnings. Link: https://linux-review.googlesource.com/id/I07a72310db30227a5b7d789d0b817d78b657c639 Signed-off-by: Peter Collingbourne <pcc@google.com> Link: https://lore.kernel.org/r/20211105230829.2254790-1-pcc@google.com Signed-off-by: Will Deacon <will@kernel.org>	2021-11-08 10:04:33 +00:00
Masahiro Yamada	34688c7691	arm64: vdso: remove -nostdlib compiler flag The -nostdlib option requests the compiler to not use the standard system startup files or libraries when linking. It is effective only when $(CC) is used as a linker driver. Since commit `691efbedc6` ("arm64: vdso: use $(LD) instead of $(CC) to link VDSO"), $(LD) is directly used, hence -nostdlib is unneeded. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20211107161802.323125-1-masahiroy@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2021-11-08 10:02:57 +00:00
Reiji Watanabe	9dc232a8ab	arm64: arm64_ftr_reg->name may not be a human-readable string The id argument of ARM64_FTR_REG_OVERRIDE() is used for two purposes: one as the system register encoding (used for the sys_id field of __ftr_reg_entry), and the other as the register name (stringified and used for the name field of arm64_ftr_reg), which is debug information. The id argument is supposed to be a macro that indicates an encoding of the register (eg. SYS_ID_AA64PFR0_EL1, etc). ARM64_FTR_REG(), which also has the same id argument, uses ARM64_FTR_REG_OVERRIDE() and passes the id to the macro. Since the id argument is completely macro-expanded before it is substituted into a macro body of ARM64_FTR_REG_OVERRIDE(), the stringified id in the body of ARM64_FTR_REG_OVERRIDE is not a human-readable register name, but a string of numeric bitwise operations. Fix this so that human-readable register names are available as debug information. Fixes: `8f266a5d87` ("arm64: cpufeature: Add global feature override facility") Signed-off-by: Reiji Watanabe <reijiw@google.com> Reviewed-by: Oliver Upton <oupton@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211101045421.2215822-1-reijiw@google.com Signed-off-by: Will Deacon <will@kernel.org>	2021-11-08 10:02:36 +00:00
Luís Henriques	c02cb7bdc4	ceph: add a new metric to keep track of remote object copies This patch adds latency and size metrics for remote object copies operations ("copyfrom"). For now, these metrics will be available on the client only, they won't be sent to the MDS. Signed-off-by: Luís Henriques <lhenriques@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-11-08 03:29:52 +01:00
Luís Henriques	aca39d9e86	libceph, ceph: move ceph_osdc_copy_from() into cephfs code This patch moves ceph_osdc_copy_from() function out of libceph code into cephfs. There are no other users for this function, and there is the need (in another patch) to access internal ceph_osd_request struct members. Signed-off-by: Luís Henriques <lhenriques@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-11-08 03:29:52 +01:00
Luís Henriques	17e9fc9fca	ceph: clean-up metrics data structures to reduce code duplication This patch modifies struct ceph_client_metric so that each metric block (read, write and metadata) becomes an element in a array. This allows to also re-write the helper functions that handle these blocks, making them simpler and, above all, reduce the amount of copy&paste every time a new metric is added. Thus, for each of these metrics there will be a new struct ceph_metric entry that'll will contain all the sizes and latencies fields (and a lock). Note however that the metadata metric doesn't really use the size_fields, and thus this metric won't be shown in the debugfs '../metrics/size' file. Signed-off-by: Luís Henriques <lhenriques@suse.de> Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-11-08 03:29:52 +01:00
Luís Henriques	cbed4ff76b	ceph: split 'metric' debugfs file into several files Currently, all the metrics are grouped together in a single file, making it difficult to process this file from scripts. Furthermore, as new metrics are added, processing this file will become even more challenging. This patch turns the 'metric' file into a directory that will contain several files, one for each metric. Signed-off-by: Luís Henriques <lhenriques@suse.de> Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-11-08 03:29:52 +01:00
Xiubo Li	c3d8e0b5de	ceph: return the real size read when it hits EOF Currently, if the sync read handler ends up reading more from the last object in the file than the i_size indicates, then it'll end up returning the wrong length. Ensure that we cap the returned length and pos at the EOF. Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-11-08 03:29:52 +01:00
Jeff Layton	8cfc0c7ed3	ceph: properly handle statfs on multifs setups ceph_statfs currently stuffs the cluster fsid into the f_fsid field. This was fine when we only had a single filesystem per cluster, but now that we have multiples we need to use something that will vary between them. Change ceph_statfs to xor each 32-bit chunk of the fsid (aka cluster id) into the lower bits of the statfs->f_fsid. Change the lower bits to hold the fscid (filesystem ID within the cluster). That should give us a value that is guaranteed to be unique between filesystems within a cluster, and should minimize the chance of collisions between mounts of different clusters. URL: https://tracker.ceph.com/issues/52812 Reported-by: Sachin Prabhu <sprabhu@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-11-08 03:29:52 +01:00
Jeff Layton	631ed4b082	ceph: shut down mount on bad mdsmap or fsmap decode As Greg pointed out, if we get a mangled mdsmap or fsmap, then something has gone very wrong, and we should avoid doing any activity on the filesystem. When this occurs, shut down the mount the same way we would with a forced umount by calling ceph_umount_begin when decoding fails on either map. This causes most operations done against the filesystem to return an error. Any dirty data or caps in the cache will be dropped as well. The effect is not reversible, so the only remedy is to umount. [ idryomov: print fsmap decoding error ] URL: https://tracker.ceph.com/issues/52303 Signed-off-by: Jeff Layton <jlayton@kernel.org> Acked-by: Greg Farnum <gfarnum@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-11-08 03:29:52 +01:00
Xiubo Li	0e24421ac4	ceph: fix mdsmap decode when there are MDS's beyond max_mds If the max_mds is decreased in a cephfs cluster, there is a window of time before the MDSs are removed. If a map goes out during this period, the mdsmap may show the decreased max_mds but still shows those MDSes as in or in the export target list. Ensure that we don't fail the map decode in that case. Cc: stable@vger.kernel.org URL: https://tracker.ceph.com/issues/52436 Fixes: `d517b3983d` ("ceph: reconnect to the export targets on new mdsmaps") Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-11-08 03:29:52 +01:00
Xiubo Li	e90334e89b	ceph: ignore the truncate when size won't change with Fx caps issued If the new size is the same as the current size, the MDS will do nothing but change the mtime/atime. POSIX doesn't mandate that the filesystems must update them in this case, so just ignore it instead. Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-11-08 03:29:52 +01:00
Kotresh HR	e1c9788cb3	ceph: don't rely on error_string to validate blocklisted session. The "error_string" in the metadata of MClientSession is being parsed by kclient to validate whether the session is blocklisted. The "error_string" is for humans and shouldn't be relied on it. Hence added the flag to MClientsession to indicate the session is blocklisted. [ jlayton: minor formatting cleanup ] URL: https://tracker.ceph.com/issues/47450 Signed-off-by: Kotresh HR <khiremat@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-11-08 03:29:51 +01:00
Jeff Layton	25b7351161	ceph: just use ci->i_version for fscache aux info If the i_version regresses, then it's likely that the mtime will do the same in lockstep with it. There's no need to track both here, just use the i_version counter since it's just as good and gets the aux size down to 64 bits. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-11-08 03:29:51 +01:00
Jeff Layton	5d6451b148	ceph: shut down access to inode when async create fails Add proper error handling for when an async create fails. The inode never existed, so any dirty caps or data are now toast. We already d_drop the dentry in that case, but the now-stale inode may still be around. We want to shut down access to these inodes, and ensure that they can't harbor any more dirty data, which can cause problems at umount time. When this occurs, flag such inodes as being SHUTDOWN, and trash any caps and cap flushes that may be in flight for them, and invalidate the pagecache for the inode. Add a new helper that can check whether an inode or an entire mount is now shut down, and call it instead of accessing the mount_state directly in places where we test that now. URL: https://tracker.ceph.com/issues/51279 Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-11-08 03:29:51 +01:00
Jeff Layton	36e6da987e	ceph: refactor remove_session_caps_cb Move remove_capsnaps to caps.c. Move the part of remove_session_caps_cb under i_ceph_lock into a separate function that lives in caps.c. Have remove_session_caps_cb call the new helper after taking the lock. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-11-08 03:29:51 +01:00
Jeff Layton	3c3050267e	ceph: fix auth cap handling logic in remove_session_caps_cb The existing logic relies on ci->i_auth_cap being NULL, but if we end up removing the auth cap early, then we'll do a lot of useless work and lock-taking on the remaining caps. Ensure that we only do the auth cap removal when we're _actually_ removing the auth cap. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-11-08 03:29:51 +01:00
Jeff Layton	c35cac610a	ceph: drop private list from remove_session_caps_cb This function does a lot of list-shuffling with cap flushes, all to avoid possibly freeing a slab allocation under spinlock (which is totally ok). Simplify the code by just detaching and freeing the cap flushes in place. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-11-08 03:29:51 +01:00
Jeff Layton	8006daff5f	ceph: don't use -ESTALE as special return code in try_get_cap_refs In some cases, we may want to return -ESTALE if it ends up that we're dealing with an inode that no longer exists. Switch to using -EUCLEAN as the "special" error return. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-11-08 03:29:51 +01:00
Jeff Layton	6407fbb9c3	ceph: print inode numbers instead of pointer values We have a lot of log messages that print inode pointer values. This is of dubious utility. Switch a random assortment of the ones I've found most useful to use ceph_vinop to print the snap:inum tuple instead. [ idryomov: use . as a separator, break unnecessarily long lines ] Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-11-08 03:29:51 +01:00
Jeff Layton	f7a67b463f	ceph: enable async dirops by default Async dirops have been supported in mainline kernels for quite some time now, and we've recently (as of June) started doing regular testing in teuthology with '-o nowsync'. There were a few issues, but we've sorted those out now. Enable async dirops by default, and change /proc/mounts to show "wsync" when they are disabled rather than "nowsync" when they are enabled. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-11-08 03:29:51 +01:00
Jean Sacren	a341131eb3	libceph: drop ->monmap and err initialization Call to build_initial_monmap() is one stone two birds. Explicitly it initializes err variable. Implicitly it initializes ->monmap via call to kzalloc(). We should only declare err and ->monmap is taken care of by ceph_monc_init() prototype. Signed-off-by: Jean Sacren <sakiwit@gmail.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-11-08 03:29:51 +01:00
Jeff Layton	9c43ff4490	ceph: convert to noop_direct_IO We have our own op, but the WARN_ON is not terribly helpful, and it's otherwise identical to the noop one. Just use that. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-11-08 03:29:51 +01:00
Yue Hu	4c7e42552b	erofs: remove useless cache strategy of DELAYEDALLOC After commit `1825c8d7ce` ("erofs: force inplace I/O under low memory scenario") and TRYALLOC is widely used, DELAYEDALLOC won't be used anymore. Remove related dead code. Also, remove the blank line at the end of zdata.h. Link: https://lore.kernel.org/r/20211106082315.25781-1-huyue2@yulong.com Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Yue Hu <huyue2@yulong.com> Signed-off-by: Gao Xiang <xiang@kernel.org>	2021-11-08 10:02:34 +08:00
Gao Xiang	86432a6dca	erofs: fix unsafe pagevec reuse of hooked pclusters There are pclusters in runtime marked with Z_EROFS_PCLUSTER_TAIL before actual I/O submission. Thus, the decompression chain can be extended if the following pcluster chain hooks such tail pcluster. As the related comment mentioned, if some page is made of a hooked pcluster and another followed pcluster, it can be reused for in-place I/O (since I/O should be submitted anyway): _______________________________________________________________ \| tail (partial) page \| head (partial) page \| \|_____PRIMARY_HOOKED___\|____________PRIMARY_FOLLOWED____________\| However, it's by no means safe to reuse as pagevec since if such PRIMARY_HOOKED pclusters finally move into bypass chain without I/O submission. It's somewhat hard to reproduce with LZ4 and I just found it (general protection fault) by ro_fsstressing a LZMA image for long time. I'm going to actively clean up related code together with multi-page folio adaption in the next few months. Let's address it directly for easier backporting for now. Call trace for reference: z_erofs_decompress_pcluster+0x10a/0x8a0 [erofs] z_erofs_decompress_queue.isra.36+0x3c/0x60 [erofs] z_erofs_runqueue+0x5f3/0x840 [erofs] z_erofs_readahead+0x1e8/0x320 [erofs] read_pages+0x91/0x270 page_cache_ra_unbounded+0x18b/0x240 filemap_get_pages+0x10a/0x5f0 filemap_read+0xa9/0x330 new_sync_read+0x11b/0x1a0 vfs_read+0xf1/0x190 Link: https://lore.kernel.org/r/20211103182006.4040-1-xiang@kernel.org Fixes: `3883a79abd` ("staging: erofs: introduce VLE decompression support") Cc: <stable@vger.kernel.org> # 4.19+ Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2021-11-08 10:02:10 +08:00
Christophe JAILLET	c45231a766	litex_liteeth: Fix a double free in the remove function 'netdev' is a managed resource allocated in the probe using 'devm_alloc_etherdev()'. It must not be freed explicitly in the remove function. Fixes: `ee7da21ac4` ("net: Add driver for LiteX's LiteETH network interface") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-07 21:51:17 +00:00

... 14 15 16 17 18 ...

1058307 Commits