linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-04 18:13:04 +00:00

Author	SHA1	Message	Date
Chen Zhongjin	569bea74c9	i2c: piix4: Fix adapter not be removed in piix4_remove() In piix4_probe(), the piix4 adapter will be registered in: piix4_probe() piix4_add_adapters_sb800() / piix4_add_adapter() i2c_add_adapter() Based on the probed device type, piix4_add_adapters_sb800() or single piix4_add_adapter() will be called. For the former case, piix4_adapter_count is set to the number of adapters, while in the other case it is not set and keeps its default of zero. When piix4 is removed, piix4_remove() removes the adapters added in piix4_probe(), based on the piix4_adapter_count value. Because the count is zero for the single adapter case, the adapter won't be removed, and the resources allocated for the adapter, such as the i2c client and device, are leaked. These resources can still be accessed by i2c or bus and cause problems. An easily reproduced case is that if a new adapter is registered, i2c will get the leaked adapter and try to call smbus_algorithm, which was already freed: Triggered by: rmmod i2c_piix4 && modprobe max31730 BUG: unable to handle page fault for address: ffffffffc053d860 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page Oops: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 3752 Comm: modprobe Tainted: G Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) RIP: 0010:i2c_default_probe (drivers/i2c/i2c-core-base.c:2259) i2c_core RSP: 0018:ffff888107477710 EFLAGS: 00000246 ... <TASK> i2c_detect (drivers/i2c/i2c-core-base.c:2302) i2c_core __process_new_driver (drivers/i2c/i2c-core-base.c:1336) i2c_core bus_for_each_dev (drivers/base/bus.c:301) i2c_for_each_dev (drivers/i2c/i2c-core-base.c:1823) i2c_core i2c_register_driver (drivers/i2c/i2c-core-base.c:1861) i2c_core do_one_initcall (init/main.c:1296) do_init_module (kernel/module/main.c:2455) ... </TASK> ---[ end trace 0000000000000000 ]--- Fix this problem by correctly setting piix4_adapter_count to 1 for the single adapter so it can be removed normally. 
Fixes: `528d53a159` ("i2c: piix4: Fix probing of reserved ports on AMD Family 16h Model 30h") Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com> Reviewed-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: Wolfram Sang <wsa@kernel.org>	2022-11-01 13:09:33 +01:00
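The bookkeeping problem described in the piix4 commit above can be illustrated in isolation. The following is a minimal user-space sketch, not the driver's code: `struct bus_state`, `add_adapters()`, `remove_adapters()` and `MAX_ADAPTERS` are hypothetical names, and `adapter_count` only plays the role that `piix4_adapter_count` plays in the commit message. It shows why a removal loop driven by a count leaks anything that was registered without updating that count.

```c
#include <assert.h>

#define MAX_ADAPTERS 4 /* hypothetical limit for this sketch */

struct adapter {
    int registered;
};

struct bus_state {
    struct adapter adapters[MAX_ADAPTERS];
    int adapter_count; /* plays the role of piix4_adapter_count */
};

/* Register n adapters and record how many were added. */
static void add_adapters(struct bus_state *s, int n)
{
    assert(n >= 0 && n <= MAX_ADAPTERS);
    for (int i = 0; i < n; i++)
        s->adapters[i].registered = 1;
    /* The single-adapter path must record its count as well;
     * otherwise the removal loop below has nothing to iterate over. */
    s->adapter_count = n;
}

/* Remove exactly the adapters the count says were added.
 * Returns the number of adapters removed. */
static int remove_adapters(struct bus_state *s)
{
    int removed = 0;

    while (s->adapter_count > 0) {
        s->adapter_count--;
        s->adapters[s->adapter_count].registered = 0;
        removed++;
    }
    return removed;
}
```

If `add_adapters()` skipped the count update for `n == 1`, `remove_adapters()` would return 0 and leave `adapters[0].registered` set, which corresponds to the leak the commit describes.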
Cristian Marussi	c4a7b9b587	arm64: dts: juno: Add thermal critical trip points When thermal zones are defined, trip point definitions are mandatory. Define a couple of critical trip points for monitoring of existing PMIC and SOC thermal zones. This was lost between txt to yaml conversion and was re-enforced recently via the commit `8c59632423` ("dt-bindings: thermal: Fix missing required property") Cc: Rob Herring <robh+dt@kernel.org> Cc: Krzysztof Kozlowski <krzysztof.kozlowski+dt@linaro.org> Cc: devicetree@vger.kernel.org Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> Fixes: `f7b636a8d8` ("arm64: dts: juno: add thermal zones for scpi sensors") Link: https://lore.kernel.org/r/20221028140833.280091-8-cristian.marussi@arm.com Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>	2022-11-01 12:03:07 +00:00
Cristian Marussi	1eff6929af	firmware: arm_scmi: Fix deferred_tx_wq release on error paths Use devres to allocate the dedicated deferred_tx_wq polling workqueue so as to automatically trigger the proper resource release on error path. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: `5a3b7185c4` ("firmware: arm_scmi: Add atomic mode support to virtio transport") Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> Link: https://lore.kernel.org/r/20221028140833.280091-6-cristian.marussi@arm.com Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>	2022-11-01 11:36:20 +00:00
Cristian Marussi	5ffc1c4cb8	firmware: arm_scmi: Fix devres allocation device in virtio transport SCMI virtio transport device managed allocations must use the main platform device in devres operations instead of the channel devices. Cc: Peter Hilber <peter.hilber@opensynergy.com> Fixes: `46abe13b5e` ("firmware: arm_scmi: Add virtio transport") Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> Link: https://lore.kernel.org/r/20221028140833.280091-5-cristian.marussi@arm.com Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>	2022-11-01 11:35:45 +00:00
Cristian Marussi	be9ba1f7f9	firmware: arm_scmi: Make Rx chan_setup fail on memory errors SCMI Rx channels are optional and they can fail to be setup when not present but anyway channels setup routines must bail-out on memory errors. Make channels setup, and related probing, fail when memory errors are reported on Rx channels. Fixes: `5c8a47a5a9` ("firmware: arm_scmi: Make scmi core independent of the transport type") Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> Link: https://lore.kernel.org/r/20221028140833.280091-4-cristian.marussi@arm.com Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>	2022-11-01 11:35:22 +00:00
Cristian Marussi	59172b212e	firmware: arm_scmi: Make tx_prepare time out eventually SCMI transports based on shared memory, at start of transmissions, have to wait for the shared Tx channel area to be eventually freed by the SCMI platform before accessing the channel. In fact the channel is owned by the SCMI platform until marked as free by the platform itself and, as such, cannot be used by the agent until relinquished. As a consequence a badly misbehaving SCMI platform firmware could lock the channel indefinitely and make the kernel side SCMI stack loop forever waiting for such channel to be freed, possibly hanging the whole boot sequence. Add a timeout to the existing Tx waiting spin-loop so that, when the system ends up in this situation, the SCMI stack can at least bail out, noisily warn the user, and abort the transmission. Reported-by: YaxiongTian <iambestgod@outlook.com> Suggested-by: YaxiongTian <iambestgod@outlook.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Etienne Carriere <etienne.carriere@linaro.org> Cc: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> Link: https://lore.kernel.org/r/20221028140833.280091-3-cristian.marussi@arm.com Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>	2022-11-01 11:33:24 +00:00
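The idea of bounding a wait-for-free spin loop, as the tx_prepare commit above describes, can be sketched as follows. This is an illustration under stated assumptions, not the SCMI code: `struct channel`, `channel_is_free()` and `tx_prepare_bounded()` are hypothetical names, and the sketch uses a poll budget where the kernel would more plausibly use a time-based deadline.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a shared-memory channel: the "platform"
 * keeps it busy for a number of polls before marking it free. */
struct channel {
    int busy_polls_left;
};

static bool channel_is_free(struct channel *c)
{
    if (c->busy_polls_left > 0) {
        c->busy_polls_left--;
        return false;
    }
    return true;
}

/* Wait for the channel to be freed, but give up after max_polls
 * retries instead of spinning forever.
 * Returns 0 when the channel is free, -1 on timeout. */
static int tx_prepare_bounded(struct channel *c, int max_polls)
{
    assert(max_polls >= 0);
    for (int i = 0; i <= max_polls; i++) {
        if (channel_is_free(c))
            return 0;
    }
    return -1; /* caller should warn and abort the transmission */
}
```

With an unbounded loop, a channel that the platform never frees would hang the caller; here the caller gets an error it can report and then abort.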
Cristian Marussi	fd96fbc8fa	firmware: arm_scmi: Suppress the driver's bind attributes Suppress the capability to unbind the core SCMI driver since all the SCMI stack protocol drivers depend on it. Fixes: `aa4f886f38` ("firmware: arm_scmi: add basic driver infrastructure for SCMI") Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> Link: https://lore.kernel.org/r/20221028140833.280091-2-cristian.marussi@arm.com Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>	2022-11-01 11:32:49 +00:00
Cristian Marussi	3f4071cbd2	firmware: arm_scmi: Cleanup the core driver removal callback Platform drivers .remove callbacks are not supposed to fail and report errors. Such errors are indeed ignored by the core platform drivers and the driver unbind process is anyway completed. The SCMI core platform driver as it is now, instead, bails out reporting an error in case of an explicit unbind request. Fix the removal path by adding proper device links between the core SCMI device and the SCMI protocol devices so that a full SCMI stack unbind is triggered when the core driver is removed. The remove process does not bail out anymore on the anomalous conditions triggered by an explicit unbind but the user is still warned. Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> Link: https://lore.kernel.org/r/20221028140833.280091-1-cristian.marussi@arm.com Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>	2022-11-01 11:30:13 +00:00
Jay Fang	a0e215088e	MAINTAINERS: Update HiSilicon LPC BUS Driver maintainer Add Jay Fang as the maintainer of the HiSilicon LPC BUS Driver, replacing John Garry. Signed-off-by: Jay Fang <f.fangjian@huawei.com> Link: https://lore.kernel.org/r/20221028105434.1661264-1-f.fangjian@huawei.com' Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2022-11-01 12:22:02 +01:00
Linus Walleij	cd73adcdba	ARM: dts: ux500: Add trips to battery thermal zones Recent changes to the thermal framework have made the trip points (trips) for thermal zones compulsory, which made the Ux500 DTS files break validation and also stopped probing because of similar changes to the code. Fix this by adding an "outer bounding box": battery thermal zones should not get warmer than 70 degrees, then we will shut down. Fixes: `8c59632423` ("dt-bindings: thermal: Fix missing required property") Fixes: `3fd6d6e2b4` ("thermal/of: Rework the thermal device tree initialization") Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: linux-pm@vger.kernel.org Link: https://lore.kernel.org/r/20221030210854.346662-1-linus.walleij@linaro.org' Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2022-11-01 12:21:17 +01:00
Arnd Bergmann	c872f6ce22	Merge tag 'imx-fixes-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes i.MX fixes for 6.1: - Fix imx93-pd driver to release resources when error occurs in probe. - A series from Ioana Ciornei to add missing clock frequencies for MDIO controllers on LayerScape SoCs, so that the kernel driver can work independently from bootloader. - A series from Li Jun to fix USB power domain setup in i.MX8MM/N device trees. - Fix CPLD_Dn pull configuration for MX8Menlo board to avoid interfering with CPLD power off functionality. - Fix ctrl_sleep_moci GPIO setup for verdin-imx8mp board. - Fix DT schema check warnings on uSDHC clocks for imx8-ss-conn device tree. - Fix up gpcv2 DT bindings to have an optional `power-domains` property. - A couple of i.MX93 device tree fixes on S4MU interrupt and gpio-ranges of GPIO controllers. - Keep PU regulator on for Quad and QuadPlus based imx6dl-yapp4 boards to work around a hardware design flaw in supply voltage distribution. - Fix user push-button GPIO offset on imx6qdl-gw59 boards. * tag 'imx-fixes-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: arm64: dts: ls208xa: specify clock frequencies for the MDIO controllers arm64: dts: ls1088a: specify clock frequencies for the MDIO controllers arm64: dts: lx2160a: specify clock frequencies for the MDIO controllers soc: imx: imx93-pd: Fix the error handling path of imx93_pd_probe() arm64: dts: imx93: correct gpio-ranges arm64: dts: imx93: correct s4mu interrupt names dt-bindings: power: gpcv2: add power-domains property arm64: dts: imx8: correct clock order ARM: dts: imx6dl-yapp4: Do not allow PM to switch PU regulator off on Q/QP ARM: dts: imx6qdl-gw59{10,13}: fix user pushbutton GPIO offset arm64: dts: imx8mn: Correct the usb power domain arm64: dts: imx8mn: remove otg1 power domain dependency on hsio arm64: dts: imx8mm: correct usb power domains arm64: dts: imx8mm: remove otg1/2 power domain dependency on hsio arm64: dts: verdin-imx8mp: fix ctrl_sleep_moci arm64: dts: imx8mm: Enable CPLD_Dn pull down resistor on MX8Menlo Link: https://lore.kernel.org/r/20221101031547.GB125525@dragon Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2022-11-01 12:20:41 +01:00
Pablo Neira Ayuso	26b5934ff4	netfilter: nf_tables: release flow rule object from commit path No need to postpone this to the commit release path, since no packets are walking over this object, this is accessed from control plane only. This helped uncover a UAF triggered by races with the netlink notifier. Fixes: `9dd732e0bd` ("netfilter: nf_tables: memleak flow rule from commit path") Reported-by: syzbot+8f747f62763bc6c32916@syzkaller.appspotmail.com Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2022-11-01 12:19:47 +01:00
Pablo Neira Ayuso	d4bc8271db	netfilter: nf_tables: netlink notifier might race to release objects commit release path is invoked via call_rcu and it runs lockless to release the objects after rcu grace period. The netlink notifier handler might win race to remove objects that the transaction context is still referencing from the commit release path. Call rcu_barrier() to ensure pending rcu callbacks run to completion if the list of transactions to be destroyed is not empty. Fixes: `6001a930ce` ("netfilter: nftables: introduce table ownership") Reported-by: syzbot+8f747f62763bc6c32916@syzkaller.appspotmail.com Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2022-11-01 12:19:46 +01:00
Michael Ellerman	02a771c9a6	powerpc/32: Select ARCH_SPLIT_ARG64 On 32-bit kernels, 64-bit syscall arguments are split into two registers. For that to work with syscall wrappers, the prototype of the syscall must have the argument split so that the wrapper macro properly unpacks the arguments from pt_regs. The fanotify_mark() syscall is one such syscall, which already has a split prototype, guarded behind ARCH_SPLIT_ARG64. So select ARCH_SPLIT_ARG64 to get that prototype and fix fanotify_mark() on 32-bit kernels with syscall wrappers. Note also that fanotify_mark() is the only usage of ARCH_SPLIT_ARG64. Fixes: `7e92e01b72` ("powerpc: Provide syscall wrapper") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221101034852.2340319-1-mpe@ellerman.id.au	2022-11-01 15:27:12 +11:00
Ziyang Xuan	363a5328f4	net: tun: fix bugs for oversize packet when napi frags enabled Recently, we got two syzkaller problems caused by oversized packets when napi frags is enabled. One of the problems occurs because the first seg size of the iov_iter from user space is very big: it is 2147479538, which is bigger than the threshold value for bailing out early in __alloc_pages(). And skb->pfmemalloc is true, so __kmalloc_reserve() would use pfmemalloc reserves without the __GFP_NOWARN flag. Thus we got a warning as follows: ======================================================== WARNING: CPU: 1 PID: 17965 at mm/page_alloc.c:5295 __alloc_pages+0x1308/0x16c4 mm/page_alloc.c:5295 ... Call trace: __alloc_pages+0x1308/0x16c4 mm/page_alloc.c:5295 __alloc_pages_node include/linux/gfp.h:550 [inline] alloc_pages_node include/linux/gfp.h:564 [inline] kmalloc_large_node+0x94/0x350 mm/slub.c:4038 __kmalloc_node_track_caller+0x620/0x8e4 mm/slub.c:4545 __kmalloc_reserve.constprop.0+0x1e4/0x2b0 net/core/skbuff.c:151 pskb_expand_head+0x130/0x8b0 net/core/skbuff.c:1654 __skb_grow include/linux/skbuff.h:2779 [inline] tun_napi_alloc_frags+0x144/0x610 drivers/net/tun.c:1477 tun_get_user+0x31c/0x2010 drivers/net/tun.c:1835 tun_chr_write_iter+0x98/0x100 drivers/net/tun.c:2036 The other problem is caused by odd IPv6 packets that have no NEXTHDR_NONE extension header and a big packet length: it is 2127925, which is bigger than ETH_MAX_MTU(65535). After ipv6_gso_pull_exthdrs() in ipv6_gro_receive(), network_header offset and transport_header offset are all bigger than U16_MAX. That would trigger skb->network_header and skb->transport_header overflow error, because they are all '__u16' type. Eventually, it would affect the value for __skb_push(skb, value), and make it a big value. After __skb_push() in ipv6_gro_receive(), skb->data would be less than skb->head, and an out-of-bounds memory access occurred. 
That would trigger the problem as following: ================================================================== BUG: KASAN: use-after-free in eth_type_trans+0x100/0x260 ... Call trace: dump_backtrace+0xd8/0x130 show_stack+0x1c/0x50 dump_stack_lvl+0x64/0x7c print_address_description.constprop.0+0xbc/0x2e8 print_report+0x100/0x1e4 kasan_report+0x80/0x120 __asan_load8+0x78/0xa0 eth_type_trans+0x100/0x260 napi_gro_frags+0x164/0x550 tun_get_user+0xda4/0x1270 tun_chr_write_iter+0x74/0x130 do_iter_readv_writev+0x130/0x1ec do_iter_write+0xbc/0x1e0 vfs_writev+0x13c/0x26c To fix the problems, restrict the packet size less than (ETH_MAX_MTU - NET_SKB_PAD - NET_IP_ALIGN) which has considered reserved skb space in napi_alloc_skb() because transport_header is an offset from skb->head. Add len check in tun_napi_alloc_frags() simply. Fixes: `90e33d4594` ("tun: enable napi_gro_frags() for TUN/TAP driver") Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20221029094101.1653855-1-william.xuanziyang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-10-31 20:04:55 -07:00
Rick Lindsley	e230d36f7d	ibmvnic: change maintainers for vnic driver Changed maintainers for vnic driver, since Dany has new responsibilities. Also added Nick Child as reviewer. Signed-off-by: Rick Lindsley <ricklind@us.ibm.com> Link: https://lore.kernel.org/r/20221028203509.4070154-1-ricklind@us.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-10-31 19:56:57 -07:00
Al Viro	878eb6e48f	block: blk_add_rq_to_plug(): clear stale 'last' after flush blk_mq_flush_plug_list() empties ->mq_list and request we'd peeked there before that call is gone; in any case, we are not dealing with a mix of requests for different queues now - there's no requests left in the plug. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-31 20:21:38 -06:00
Andreas Schwab	ce883a2ba3	powerpc/32: fix syscall wrappers with 64-bit arguments With the introduction of syscall wrappers all wrappers for syscalls with 64-bit arguments must be handled specially, not only those that have unaligned 64-bit arguments. This left out the fallocate() and sync_file_range2() syscalls. Fixes: `7e92e01b72` ("powerpc: Provide syscall wrapper") Fixes: `e237506238` ("powerpc/32: fix syscall wrappers with 64-bit arguments of unaligned register-pairs") Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/87mt9cxd6g.fsf_-_@igel.home	2022-11-01 10:24:09 +11:00
Andreas Schwab	40ff214328	asm-generic: compat: fix compat_arg_u64() and compat_arg_u64_dual() The macros are defined backwards. This affects the following compat syscalls: - compat_sys_truncate64() - compat_sys_ftruncate64() - compat_sys_fallocate() - compat_sys_sync_file_range() - compat_sys_fadvise64_64() - compat_sys_readahead() - compat_sys_pread64() - compat_sys_pwrite64() Fixes: `43d5de2b67` ("asm-generic: compat: Support BE for long long args in 32-bit ABIs") Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> [mpe: Add list of affected syscalls] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/871qqoyvni.fsf_-_@igel.home	2022-11-01 10:20:11 +11:00
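Both this commit and the powerpc/32 ARCH_SPLIT_ARG64 entries above deal with 64-bit syscall arguments that arrive as two 32-bit halves. The sketch below shows how the order of the halves has to follow endianness when the value is reassembled. It is a hypothetical user-space illustration, not the kernel's macros: `join_u64()` and `arg_u64_from_pair()` are invented names, and the ordering (low half first on little-endian, high half first on big-endian) is an assumption stated for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Combine two 32-bit halves into one 64-bit value. */
static uint64_t join_u64(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* 'first' and 'second' are the two 32-bit argument slots in the order
 * they arrive. Assumption for this sketch: little-endian ABIs pass the
 * low half first, big-endian ABIs pass the high half first. Swapping
 * the two cases silently corrupts every 64-bit argument. */
static uint64_t arg_u64_from_pair(uint32_t first, uint32_t second,
                                  int big_endian)
{
    assert(big_endian == 0 || big_endian == 1);
    return big_endian ? join_u64(first, second)
                      : join_u64(second, first);
}
```

A macro pair "defined backwards", as the commit puts it, corresponds to exchanging the two branches of the conditional, so that an offset or length passed to calls such as `pread64()` or `fallocate()` would have its halves swapped.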
Linus Torvalds	5aaef24b5c	Merge tag 'for-6.1-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "A few more fixes and regression fixes: - fix a corner case when handling tree-mod-log changes in reallocated nodes - fix crash on raid0 filesystems created with <5.4 mkfs.btrfs that could lead to division by zero - add missing super block checksum verification after thawing filesystem - handle one more case in send when dealing with orphan files - fix parameter type mismatch for generation when reading dentry - improved error handling in raid56 code - better struct bio packing after recent cleanups" * tag 'for-6.1-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: don't use btrfs_chunk::sub_stripes from disk btrfs: fix type of parameter generation in btrfs_get_dentry btrfs: send: fix send failure of a subcase of orphan inodes btrfs: make thaw time super block check to also verify checksum btrfs: fix tree mod log mishandling of reallocated nodes btrfs: reorder btrfs_bio for better packing btrfs: raid56: avoid double freeing for rbio if full_stripe_write() failed btrfs: raid56: properly handle the error when unable to find the missing stripe	2022-10-31 12:28:29 -07:00
Linus Torvalds	78a089d033	Merge tag 'lsm-pr-20221031' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull LSM fix from Paul Moore: "A single patch to the capabilities code to fix a potential memory leak in the xattr allocation error handling" * tag 'lsm-pr-20221031' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: capabilities: fix potential memleak on error path from vfs_getxattr_alloc()	2022-10-31 12:09:42 -07:00
Gavin Shan	7a2726ec32	KVM: Check KVM_CAP_DIRTY_LOG_{RING, RING_ACQ_REL} prior to enabling them There are two capabilities related to ring-based dirty page tracking: KVM_CAP_DIRTY_LOG_RING and KVM_CAP_DIRTY_LOG_RING_ACQ_REL. Both are supported by x86. However, arm64 supports KVM_CAP_DIRTY_LOG_RING_ACQ_REL only when the feature is supported on arm64. The userspace doesn't have to enable the advertised capability, meaning KVM_CAP_DIRTY_LOG_RING can be enabled on arm64 by userspace and it's wrong. Fix it by double checking if the capability has been advertised prior to enabling it. It's rejected to enable the capability if it hasn't been advertised. Fixes: `17601bfed9` ("KVM: Add KVM_CAP_DIRTY_LOG_RING_ACQ_REL capability and config option") Reported-by: Sean Christopherson <seanjc@google.com> Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221031003621.164306-4-gshan@redhat.com	2022-10-31 17:22:15 +00:00
Darrick J. Wong	9f187ba0d5	Merge tag 'fix-log-recovery-misuse-6.1_2022-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.1-fixes xfs: fix various problems with log intent item recovery Starting with 6.1-rc1, CONFIG_FORTIFY_SOURCE checks became smart enough to detect memcpy() callers that copy beyond what seems to be the end of a struct. Unfortunately, gcc has a bug wherein it cannot reliably compute the size of a struct containing another struct containing a flex array at the end. This is the case with the xfs log item format structures, which means that -rc1 starts complaining all over the place. Fix these problems by memcpying the struct head and the flex arrays separately. Although it's tempting to use the FLEX_ARRAY macros, the structs involved are part of the ondisk log format. Some day we're going to want to make the ondisk log contents endian-safe, which means that we will have to stop using memcpy entirely. While we're at it, fix some deficiencies in the validation of recovered log intent items -- if the size of the recovery buffer is not even large enough to cover the flex array record count in the head, we should abort the recovery of that item immediately. The last patch of this series changes the EFI/EFD sizeof functions names and behaviors to be consistent with the similarly named sizeof helpers for other log intent items. v2: fix more inadequate log intent done recovery validation and dump corrupt recovered items Signed-off-by: Darrick J. Wong <djwong@kernel.org> * tag 'fix-log-recovery-misuse-6.1_2022-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: dump corrupt recovered log intent items to dmesg consistently xfs: actually abort log recovery on corrupt intent-done log items xfs: refactor all the EFI/EFD log item sizeof logic xfs: fix memcpy fortify errors in EFI log format copying xfs: fix memcpy fortify errors in RUI log format copying xfs: fix memcpy fortify errors in CUI log format copying xfs: fix memcpy fortify errors in BUI log format copying xfs: fix validation in attr log item recovery	2022-10-31 09:15:37 -07:00
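The two ideas in the merge message above (copy the struct head and the flexible array separately, and reject a buffer too small to even hold the record count) can be sketched in user space. This is an illustration only: `struct log_format`, `struct extent`, `log_format_size()`, `alloc_log_format()` and `copy_log_format()` are hypothetical names and do not reflect the xfs on-disk structures or helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct extent {
    uint64_t start;
    uint32_t len;
};

/* Hypothetical log item format: fixed head plus flexible array. */
struct log_format {
    uint16_t type;
    uint32_t nextents;
    struct extent extents[]; /* flexible array member */
};

/* Total size in bytes of a format carrying nextents records. */
static size_t log_format_size(uint32_t nextents)
{
    return offsetof(struct log_format, extents) +
           (size_t)nextents * sizeof(struct extent);
}

static struct log_format *alloc_log_format(uint32_t nextents)
{
    return calloc(1, log_format_size(nextents));
}

/* Copy head and array with two memcpy() calls, validating sizes first.
 * Returns 0 on success, -1 if the source buffer is inconsistent. */
static int copy_log_format(struct log_format *dst, size_t dst_size,
                           const struct log_format *src, size_t src_len)
{
    const size_t head = offsetof(struct log_format, extents);

    assert(dst != NULL && src != NULL);
    /* Too small to contain the record count: abort immediately. */
    if (src_len < head || dst_size < head)
        return -1;
    memcpy(dst, src, head); /* struct head only */

    size_t body = (size_t)dst->nextents * sizeof(struct extent);
    if (src_len != head + body || dst_size < head + body)
        return -1;
    memcpy(dst->extents, src->extents, body); /* flex array only */
    return 0;
}
```

Splitting the copy means each `memcpy()` has a destination whose size is unambiguous, which is the property a fortified `memcpy()` checks; the size validation before the second copy mirrors the "abort the recovery of that item immediately" rule from the message.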
Darrick J. Wong	8b972158af	xfs: rename XFS_REFC_COW_START to _COWFLAG We've been (ab)using XFS_REFC_COW_START as both an integer quantity and a bit flag, even though it's only a bit flag. Rename the variable to reflect its nature and update the cast target since we're not supposed to be comparing it to xfs_agblock_t now. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2022-10-31 08:58:22 -07:00
Darrick J. Wong	c1ccf967bf	xfs: fix uninitialized list head in struct xfs_refcount_recovery We're supposed to initialize the list head of an object before adding it to another list. Fix that, and stop using the kmem_{alloc,free} calls from the Irix days. Fixes: `174edb0e46` ("xfs: store in-progress CoW allocations in the refcount btree") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2022-10-31 08:58:22 -07:00
Darrick J. Wong	f1fdc82078	xfs: fix agblocks check in the cow leftover recovery function As we've seen, refcount records use the upper bit of the rc_startblock field to ensure that all the refcount records are at the right side of the refcount btree. This works because an AG is never allowed to have more than (1U << 31) blocks in it. If we ever encounter a filesystem claiming to have that many blocks, we absolutely do not want reflink touching it at all. However, this test at the start of xfs_refcount_recover_cow_leftovers is slightly incorrect -- it /should/ be checking that agblocks isn't larger than the XFS_MAX_CRC_AG_BLOCKS constant, and it should check that the constant is never large enough to conflict with that CoW flag. Note that the V5 superblock verifier has not historically rejected filesystems where agblocks >= XFS_MAX_CRC_AG_BLOCKS, which is why this ended up in the COW recovery routine. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2022-10-31 08:58:21 -07:00
Darrick J. Wong	f62ac3e0ac	xfs: check record domain when accessing refcount records Now that we've separated the startblock and CoW/shared extent domain in the incore refcount record structure, check the domain whenever we retrieve a record to ensure that it's still in the domain that we want. Depending on the circumstances, a change in domain either means we're done processing or that we've found a corruption and need to fail out. The refcount check in xchk_xref_is_cow_staging is redundant since _get_rec has done that for a long time now, so we can get rid of it. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2022-10-31 08:58:21 -07:00
Darrick J. Wong	68d0f38917	xfs: remove XFS_FIND_RCEXT_SHARED and _COW Now that we have an explicit enum for shared and CoW staging extents, we can get rid of the old FIND_RCEXT flags. Omit a couple of conversions that disappear in the next patches. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2022-10-31 08:58:21 -07:00
Darrick J. Wong	f492135df0	xfs: refactor domain and refcount checking Create a helper function to ensure that CoW staging extent records have a single refcount and that shared extent records have more than 1 refcount. We'll put this to more use in the next patch. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2022-10-31 08:58:21 -07:00
Darrick J. Wong	571423a162	xfs: report refcount domain in tracepoints Now that we've broken out the startblock and shared/cow domain in the incore refcount extent record structure, update the tracepoints to report the domain. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2022-10-31 08:58:21 -07:00
Darrick J. Wong	9a50ee4f8d	xfs: track cow/shared record domains explicitly in xfs_refcount_irec Just prior to committing the reflink code into upstream, the xfs maintainer at the time requested that I find a way to shard the refcount records into two domains -- one for records tracking shared extents, and a second for tracking CoW staging extents. The idea here was to minimize mount time CoW reclamation by pushing all the CoW records to the right edge of the keyspace, and it was accomplished by setting the upper bit in rc_startblock. We don't allow AGs to have more than 2^31 blocks, so the bit was free. Unfortunately, this was a very late addition to the codebase, so most of the refcount record processing code still treats rc_startblock as a u32 and pays no attention to whether or not the upper bit (the cow flag) is set. This weakness is theoretically exploitable, since we're not fully validating the incoming metadata records. Fuzzing demonstrates practical exploits of this weakness. If the cow flag of a node block key record is corrupted, a lookup operation can go to the wrong record block and start returning records from the wrong cow/shared domain. This causes the math to go all wrong (since cow domain is still implicit in the upper bit of rc_startblock) and we can crash the kernel by tricking xfs into jumping into a nonexistent AG and tripping over xfs_perag_get(mp, <nonexistent AG>) returning NULL. To fix this, start tracking the domain as an explicit part of struct xfs_refcount_irec, adjust all refcount functions to check the domain of a returned record, and alter the function definitions to accept them where necessary. Found by fuzzing keys[2].cowflag = add in xfs/464. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2022-10-31 08:58:21 -07:00
Darrick J. Wong	5a8c345ca8	xfs: refactor refcount record usage in xchk_refcountbt_rec Consolidate the open-coded xfs_refcount_irec fields into an actual struct and use the existing _btrec_to_irec to decode the ondisk record. This will reduce code churn in the next patch. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2022-10-31 08:58:21 -07:00
Darrick J. Wong	950f0d50ee	xfs: dump corrupt recovered log intent items to dmesg consistently If log recovery decides that an intent item is corrupt and wants to abort the mount, capture a hexdump of the corrupt log item in the kernel log for further analysis. Some of the log item code already did this, so we're fixing the rest to do it consistently. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2022-10-31 08:58:20 -07:00
Darrick J. Wong	9e7e2436c1	xfs: move _irec structs to xfs_types.h Structure definitions for incore objects do not belong in the ondisk format header. Move them to the incore types header where they belong. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2022-10-31 08:58:20 -07:00
Darrick J. Wong	921ed96b4f	xfs: actually abort log recovery on corrupt intent-done log items If log recovery picks up intent-done log items that are not of the correct size it needs to abort recovery and fail the mount. Debug assertions are not good enough. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2022-10-31 08:58:20 -07:00
Darrick J. Wong	8edbe0cf8b	xfs: check deferred refcount op continuation parameters If we're in the middle of a deferred refcount operation and decide to roll the transaction to avoid overflowing the transaction space, we need to check the new agbno/aglen parameters that we're about to record in the new intent. Specifically, we need to check that the new extent is completely within the filesystem, and that continuation does not put us into a different AG. If the keys of a node block are wrong, the lookup to resume an xfs_refcount_adjust_extents operation can put us into the wrong record block. If this happens, we might not find that we run out of aglen at an exact record boundary, which will cause the loop control to do the wrong thing. The previous patch should take care of that problem, but let's add this extra sanity check to stop corruption problems sooner than later. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2022-10-31 08:58:20 -07:00
Darrick J. Wong	3c5aaaced9	xfs: refactor all the EFI/EFD log item sizeof logic Refactor all the open-coded sizeof logic for EFI/EFD log item and log format structures into common helper functions whose names reflect the struct names. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2022-10-31 08:58:20 -07:00
Darrick J. Wong	b65e08f83b	xfs: create a predicate to verify per-AG extents Create a predicate function to verify that a given agbno/blockcount pair fit entirely within a single allocation group and don't suffer mathematical overflows. Refactor the existing open-coded logic; we're going to add more calls to this function in the next patch. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2022-10-31 08:58:20 -07:00
Darrick J. Wong	03a7485cd7	xfs: fix memcpy fortify errors in EFI log format copying Starting in 6.1, CONFIG_FORTIFY_SOURCE checks the length parameter of memcpy. Since we're already fixing problems with BUI item copying, we should fix it everywhere else. An extra difficulty here is that the ef[id]_extents arrays are declared as single-element arrays. This is not the convention for flex arrays in the modern kernel, and it causes all manner of problems with static checking tools, since they often cannot tell the difference between a single element array and a flex array. So for starters, change those array[1] declarations to array[] declarations to signal that they are proper flex arrays and adjust all the "size-1" expressions to fit the new declaration style. Next, refactor the xfs_efi_copy_format function to handle the copying of the head and the flex array members separately. While we're at it, fix a minor validation deficiency in the recovery function. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2022-10-31 08:58:20 -07:00
Darrick J. Wong	f850995f60	xfs: make sure aglen never goes negative in xfs_refcount_adjust_extents Prior to calling xfs_refcount_adjust_extents, we trimmed agbno/aglen such that the end of the range would not be in the middle of a refcount record. If this is no longer the case, something is seriously wrong with the btree. Bail out with a corruption error. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2022-10-31 08:58:20 -07:00
Darrick J. Wong	b45ca961e9	xfs: fix memcpy fortify errors in RUI log format copying Starting in 6.1, CONFIG_FORTIFY_SOURCE checks the length parameter of memcpy. Since we're already fixing problems with BUI item copying, we should fix it everywhere else. Refactor the xfs_rui_copy_format function to handle the copying of the head and the flex array members separately. While we're at it, fix a minor validation deficiency in the recovery function. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2022-10-31 08:58:19 -07:00
Darrick J. Wong	a38935c03c	xfs: fix memcpy fortify errors in CUI log format copying Starting in 6.1, CONFIG_FORTIFY_SOURCE checks the length parameter of memcpy. Since we're already fixing problems with BUI item copying, we should fix it everywhere else. Refactor the xfs_cui_copy_format function to handle the copying of the head and the flex array members separately. While we're at it, fix a minor validation deficiency in the recovery function. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2022-10-31 08:58:19 -07:00
Darrick J. Wong	a38ebce1da	xfs: fix memcpy fortify errors in BUI log format copying Starting in 6.1, CONFIG_FORTIFY_SOURCE checks the length parameter of memcpy. Unfortunately, it doesn't handle flex arrays correctly: ------------[ cut here ]------------ memcpy: detected field-spanning write (size 48) of single field "dst_bui_fmt" at fs/xfs/xfs_bmap_item.c:628 (size 16) Fix this by refactoring the xfs_bui_copy_format function to handle the copying of the head and the flex array members separately. While we're at it, fix a minor validation deficiency in the recovery function. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2022-10-31 08:58:19 -07:00
Darrick J. Wong	59da7ff49d	xfs: fix validation in attr log item recovery Before we start fixing all the complaints about memcpy'ing log items around, let's fix some inadequate validation in the xattr log item recovery code and get rid of the (now trivial) copy_format function. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2022-10-31 08:58:19 -07:00
Filipe Manana	8184620ae2	btrfs: fix lost file sync on direct IO write with nowait and dsync iocb When doing a direct IO write using an iocb with nowait and dsync set, we end up not syncing the file once the write completes. This is because we tell iomap to not call generic_write_sync(), which would result in calling btrfs_sync_file(), in order to avoid a deadlock since iomap can call it while we are holding the inode's lock and btrfs_sync_file() needs to acquire the inode's lock. The deadlock happens only if the write happens synchronously, when iomap_dio_rw() calls iomap_dio_complete() before it returns. Instead we do the sync ourselves at btrfs_do_write_iter(). For a nowait write however we can end up not doing the sync ourselves at btrfs_do_write_iter() because the write could have been queued, and therefore we get -EIOCBQUEUED returned from iomap in such case. That makes us skip the sync call at btrfs_do_write_iter(), as we don't do it for any error returned from btrfs_direct_write(). We can't simply do the call even if -EIOCBQUEUED is returned, since that would block the task waiting for IO, both for the data since there are bios still in progress as well as potentially blocking when joining a log transaction and when syncing the log (writing log trees, super blocks, etc). So let iomap do the sync call itself and in order to avoid deadlocks for the case of synchronous writes (without nowait), use __iomap_dio_rw() and have ourselves call iomap_dio_complete() after unlocking the inode. A test case will later be sent for fstests, after this is fixed in Linus' tree. 
Fixes: `51bd9563b6` ("btrfs: fix deadlock due to page faults during direct IO reads and writes") Reported-by: Марк Коренберг <socketpair@gmail.com> Link: https://lore.kernel.org/linux-btrfs/CAEmTpZGRKbzc16fWPvxbr6AfFsQoLmz-Lcg-7OgJOZDboJ+SGQ@mail.gmail.com/ CC: stable@vger.kernel.org # 6.0+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-10-31 16:52:56 +01:00
Darrick J. Wong	47ba8cc7b4	xfs: fix incorrect return type for fsdax fault handlers The kernel robot complained about this: >> fs/xfs/xfs_file.c:1266:31: sparse: sparse: incorrect type in return expression (different base types) @@ expected int @@ got restricted vm_fault_t @@ fs/xfs/xfs_file.c:1266:31: sparse: expected int fs/xfs/xfs_file.c:1266:31: sparse: got restricted vm_fault_t fs/xfs/xfs_file.c:1314:21: sparse: sparse: incorrect type in assignment (different base types) @@ expected restricted vm_fault_t [usertype] ret @@ got int @@ fs/xfs/xfs_file.c:1314:21: sparse: expected restricted vm_fault_t [usertype] ret fs/xfs/xfs_file.c:1314:21: sparse: got int Fix the incorrect return type for these two functions. While we're at it, make the !fsdax version return VM_FAULT_SIGBUS because a zero return value will cause some callers to try to lock vmf->page, which we never set here. Fixes: `ea6c49b784` ("xfs: support CoW in fsdax mode") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2022-10-31 08:51:45 -07:00
Christophe JAILLET	063b1f21cc	btrfs: fix a memory allocation failure test in btrfs_submit_direct After allocation 'dip' is tested instead of 'dip->csums'. Fix it. Fixes: `642c5d34da` ("btrfs: allocate the btrfs_dio_private as part of the iomap dio bio") CC: stable@vger.kernel.org # 5.19+ Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-10-31 16:50:15 +01:00
Helge Deller	2b6ae0962b	parisc: Avoid printing the hardware path twice Avoid that the hardware path is shown twice in the kernel log, and clean up the output of the version numbers to show up in the same order as they are listed in the hardware database in the hardware.c file. Additionally, optimize the memory footprint of the hardware database and mark some code as init code. Fixes: `cab56b51ec` ("parisc: Fix device names in /proc/iomem") Signed-off-by: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v4.9+	2022-10-31 15:37:14 +01:00
Chen Jun	943f45b939	blk-mq: Fix kmemleak in blk_mq_init_allocated_queue There is a kmemleak caused by modprobe null_blk.ko unreferenced object 0xffff8881acb1f000 (size 1024): comm "modprobe", pid 836, jiffies 4294971190 (age 27.068s) hex dump (first 32 bytes): 00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00 .....N.......... ff ff ff ff ff ff ff ff 00 53 99 9e ff ff ff ff .........S...... backtrace: [<000000004a10c249>] kmalloc_node_trace+0x22/0x60 [<00000000648f7950>] blk_mq_alloc_and_init_hctx+0x289/0x350 [<00000000af06de0e>] blk_mq_realloc_hw_ctxs+0x2fe/0x3d0 [<00000000e00c1872>] blk_mq_init_allocated_queue+0x48c/0x1440 [<00000000d16b4e68>] __blk_mq_alloc_disk+0xc8/0x1c0 [<00000000d10c98c3>] 0xffffffffc450d69d [<00000000b9299f48>] 0xffffffffc4538392 [<0000000061c39ed6>] do_one_initcall+0xd0/0x4f0 [<00000000b389383b>] do_init_module+0x1a4/0x680 [<0000000087cf3542>] load_module+0x6249/0x7110 [<00000000beba61b8>] __do_sys_finit_module+0x140/0x200 [<00000000fdcfff51>] do_syscall_64+0x35/0x80 [<000000003c0f1f71>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 That is because q->mq_ops is set to NULL before blk_release_queue is called. blk_mq_init_queue_data blk_mq_init_allocated_queue blk_mq_realloc_hw_ctxs for (i = 0; i < set->nr_hw_queues; i++) { old_hctx = xa_load(&q->hctx_table, i); if (!blk_mq_alloc_and_init_hctx(.., i, ..)) [1] if (!old_hctx) break; xa_for_each_start(&q->hctx_table, j, hctx, j) blk_mq_exit_hctx(q, set, hctx, j); [2] if (!q->nr_hw_queues) [3] goto err_hctxs; err_exit: q->mq_ops = NULL; [4] blk_put_queue blk_release_queue if (queue_is_mq(q)) [5] blk_mq_release(q); [1]: blk_mq_alloc_and_init_hctx failed at i != 0. [2]: The hctxs allocated by [1] are moved to q->unused_hctx_list and will be cleaned up in blk_mq_release. [3]: q->nr_hw_queues is 0. [4]: Set q->mq_ops to NULL. [5]: queue_is_mq returns false due to [4]. And blk_mq_release will not be called. The hctxs in q->unused_hctx_list are leaked. To fix it, call blk_release_queue in exception path. 
Fixes: `2f8f1336a4` ("blk-mq: always free hctx after request queue is freed") Signed-off-by: Yuan Can <yuancan@huawei.com> Signed-off-by: Chen Jun <chenjun102@huawei.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20221031031242.94107-1-chenjun102@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-31 08:30:47 -06:00
Ville Syrjälä	12caf46cf4	drm/i915/sdvo: Grab mode_config.mutex during LVDS init to avoid WARNs drm_mode_probed_add() is unhappy about being called w/o mode_config.mutex. Grab it during LVDS fixed mode setup to silence the WARNs. Cc: stable@vger.kernel.org Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/7301 Fixes: `aa2b88074a` ("drm/i915/sdvo: Fix multi function encoder stuff") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026101134.20865-4-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit `a3cd4f4472`) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2022-10-31 14:09:15 +00:00
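
The refcount-domain commits in this listing (9a50ee4f8d, f492135df0, f62ac3e0ac) describe an on-disk record whose upper rc_startblock bit marks a CoW staging extent, and an in-core record that tracks the domain as an explicit field. The following is a minimal userspace sketch of that idea, not the kernel code: every name in it (`REFC_COW_FLAG`, `enum refc_domain`, `struct refc_irec`, and the three helpers) is hypothetical and chosen for illustration. It assumes only what the commit messages state: the flag is the top bit of a 32-bit start block, CoW staging records carry a refcount of exactly 1, and shared records carry a refcount greater than 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, modelled on the commit text rather than fs/xfs. */
#define REFC_COW_FLAG (UINT32_C(1) << 31)   /* upper bit of the on-disk start block */

enum refc_domain {
    REFC_DOMAIN_SHARED = 0,   /* record tracks a shared extent */
    REFC_DOMAIN_COW    = 1    /* record tracks a CoW staging extent */
};

/* In-core record: the domain is explicit, the start block has the flag stripped. */
struct refc_irec {
    uint32_t         startblock;
    uint32_t         blockcount;
    uint32_t         refcount;
    enum refc_domain domain;
};

/* Split an on-disk start block into (domain, block number). */
static void refc_decode(uint32_t disk_startblock, uint32_t blockcount,
                        uint32_t refcount, struct refc_irec *irec)
{
    irec->domain = (disk_startblock & REFC_COW_FLAG)
                       ? REFC_DOMAIN_COW : REFC_DOMAIN_SHARED;
    irec->startblock = disk_startblock & ~REFC_COW_FLAG;
    irec->blockcount = blockcount;
    irec->refcount = refcount;
}

/* Recombine the explicit domain into the on-disk start block encoding. */
static uint32_t refc_encode(const struct refc_irec *irec)
{
    /* The block number must fit below the flag bit. */
    assert(irec->startblock < REFC_COW_FLAG);
    return irec->startblock |
           (irec->domain == REFC_DOMAIN_COW ? REFC_COW_FLAG : 0);
}

/* CoW staging records need refcount == 1; shared records need refcount > 1. */
static bool refc_check_domain(const struct refc_irec *irec)
{
    if (irec->domain == REFC_DOMAIN_COW)
        return irec->refcount == 1;
    return irec->refcount > 1;
}
```

A caller that looks up a record would decode it once, compare `irec.domain` against the domain it asked for, and treat a mismatch either as the end of the range or as corruption, depending on context, which is the behaviour the f62ac3e0ac message describes.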

1137343 Commits