Commit Graph

1043310 Commits

Author SHA1 Message Date
Marco Elver
fa360beac4 kasan: fix Kconfig check of CC_HAS_WORKING_NOSANITIZE_ADDRESS
In the main KASAN config option CC_HAS_WORKING_NOSANITIZE_ADDRESS is
checked for instrumentation-based modes.  However, if
HAVE_ARCH_KASAN_HW_TAGS is true all modes may still be selected.

To fix, also make the software modes depend on
CC_HAS_WORKING_NOSANITIZE_ADDRESS.

Link: https://lkml.kernel.org/r/20210910084240.1215803-1-elver@google.com
Fixes: 6a63a63ff1 ("kasan: introduce CONFIG_KASAN_HW_TAGS")
Signed-off-by: Marco Elver <elver@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Aleksandr Nogikh <nogikh@google.com>
Cc: Taras Madan <tarasmadan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-24 16:13:34 -07:00
Naoya Horiguchi
acfa299a4a mm, hwpoison: add is_free_buddy_page() in HWPoisonHandlable()
Commit fcc00621d8 ("mm/hwpoison: retry with shake_page() for
unhandlable pages") changed the return value of __get_hwpoison_page() to
retry for transiently unhandlable cases.  However, __get_hwpoison_page()
currently fails to properly judge buddy pages as handlable, so hard/soft
offline for buddy pages always fail as "unhandlable page".  This is
totally regrettable.

So let's add is_free_buddy_page() in HWPoisonHandlable(), so that
__get_hwpoison_page() returns different return values between buddy
pages and unhandlable pages as intended.

Link: https://lkml.kernel.org/r/20210909004131.163221-1-naoya.horiguchi@linux.dev
Fixes: fcc00621d8 ("mm/hwpoison: retry with shake_page() for unhandlable pages")
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-24 16:13:34 -07:00
Pavel Begunkov
7df778be2f io_uring: make OP_CLOSE consistent with direct open
From recently open/accept are now able to manipulate fixed file table,
but it's inconsistent that close can't. Close the gap, keep API same as
with open/accept, i.e. via sqe->file_slot.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-24 14:07:54 -06:00
Linus Torvalds
7d42e98182 gpio fixes for v5.15-rc3
- fix a regression in GPIO ACPI on HP ElitePad 1000 G2 where the
   gpio_set_debounce_timeout() now returns a fatal error if the specific
   debounce period is not supported by the driver instead of just emitting
   a warning
 - fix return values of irq_mask/unmask() callbacks in gpio-uniphier
 - fix hwirq calculation in gpio-aspeed-sgpio
 - fix two issues in gpio-rockchip: only make the extended debounce support
   available for v2 and remove a redundant BIT() usage
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAmFMObIACgkQEacuoBRx
 13L8VQ//Vnzk57zjqcqiVj1aBCqMGm/pFnuYFJ1lkh/ny3F7nORBCZgiaNeJa+k2
 3FT2YTV/hrPlmwYzh65pmvx2FikoUxJNlQyh2JuQ06bqeGy5jKGoI1Dij/u5MyXy
 U60OyUcyYBXrCioHgb4ID571KnTu3l/uCd4CxKUaeABcz3EjhRE+Vu8oUIsmegA9
 jj+oihbTivTqsPu2FgTXkF02425rKXD1/5PvB+q68kxTsoIb1VY9Wm8XGuDBSLH8
 OYokxxwHNpUdTVACjp+58xhgbe4giyjH2VSs/Rw3/iIHf6ay/6hzqpMFaATiD/R0
 e3818eBiyYzRp7VfPvgrdXT4R3R/vk6GEmJMP/E8WxSjrx1riDhDlmHbj+w7uHTy
 UhGMpd4QcKHrsbEwjmpX7kMHFqs1dW3/ulQWSpD98apUHX619u6SWF5gWDwUTRxq
 JwQ+zEww2XH2erzUVbN5VGCLD7+xHbqAyWCp+3ZccSDkeq333bks2UhwfMq7+9He
 v4VaJy7GMezEsOUFdJIHVDQh4QMNP5SZedMxBCJn0A+HWJmbkBR09BgX8ZYznJnC
 OtdEeDv94OdUXSSwoyCweH9HHDpV2aGzAXyXdoAOLPk7TsnNKzqFkIOezb2mgOHn
 EgNUpDJpbDduXvxsnfk3j9co50yaGJl2QFukvAYrsRqgMuTps5Q=
 =3LqR
 -----END PGP SIGNATURE-----

Merge tag 'gpio-fixes-for-v5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

 - fix a regression in GPIO ACPI on HP ElitePad 1000 G2 where the
   gpio_set_debounce_timeout() now returns a fatal error if the specific
   debounce period is not supported by the driver instead of just
   emitting a warning

 - fix return values of irq_mask/unmask() callbacks in gpio-uniphier

 - fix hwirq calculation in gpio-aspeed-sgpio

 - fix two issues in gpio-rockchip: only make the extended debounce
   support available for v2 and remove a redundant BIT() usage

* tag 'gpio-fixes-for-v5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio/rockchip: fix get_direction value handling
  gpio/rockchip: extended debounce support is only available on v2
  gpio: gpio-aspeed-sgpio: Fix wrong hwirq in irq handler.
  gpio: uniphier: Fix void functions to remove return value
  gpiolib: acpi: Make set-debounce-timeout failures non fatal
2021-09-24 11:22:55 -07:00
Linus Torvalds
47d7e65d64 Device properties framework fix for 5.15-rc3
Fix software node refcount imbalance on device removal (Laurentiu Tudor).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmFOBoUSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxcgQP/RJGlD5rm2W3LSTi0MpdHm88PWTH/6JL
 XYqGDJ42DOS6+pm7KiCoI+XZ7plXtT9FL3kPwgj4+RSHfZQV7r5DVOFwXUa5lpkG
 IDeU8H/DXil/9VZSioxCN7HSE1mUsMj1eZM4odf9eZBiccLxdn050k7by4ZVt0Dx
 3OnBJ8Tv+4jp27XuuEiOFku2m2QA0myK0Fg7tHQ8D6UoBgfxawVMkFsDrNFOBEsx
 xKmGXml7VFsEbX6+FeirJCRLTGln2EOLbdHcwMbCzaOzgEp9MvuLm96PlLANyohk
 rZoAoasXmjMMgHk2tOC5nkEtxINVKWqM18HQqyzBLtx9fn/YP94HXXC00SzsZAwJ
 NQ8vpwtbHzHI8QydS8lJ9Sp8PLQGcHRrbut5Is1oXF3f8uNViyE1IHmuPh/AW0Mg
 UjQ/cqZ1yWDoYYaJ2Ar2G6uwFyfMmgPhoNOlUmW1niZqEm3iMA1VRd2EbSXiAbpu
 ExLzNh7dljDoVgP6Vzhqlj6oyV4JYZCHIuMjBixmheVb282pqanQaxsk877rTQiR
 KAk9d/V7GluXvNnkVQA1j8k8ONqEx4M2dxgn6tIFMGhT7eQeq/tjlljl7x3C1OW5
 iTQPmXzEr1mLPJ8HwdNLDFDbNZ1HEsLUS+tb6nz+1uwvyvhnJqEX2ewbx5nRUdrY
 XBMVa46lxeh8
 =PFOV
 -----END PGP SIGNATURE-----

Merge tag 'devprop-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull device properties framework fix from Rafael Wysocki:
 "Fix software node refcount imbalance on device removal (Laurentiu
  Tudor)"

* tag 'devprop-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  software node: balance refcount for managed software nodes
2021-09-24 11:20:29 -07:00
Linus Torvalds
ea1f9163ac ACPI fix for 5.15-rc3
Revert a recent commit related to memory management that turned out
 to be problematic (Jia He).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmFOBfkSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRx7bsP/2vzDm4qYnHmIwxaJsY3Eg8GPfzXQYfw
 WSAlGP1FN97QPFyqCivJzviTWJfi0JNlZHRlJik7h/IZhEXdw9Qy8tSZBxukvWOG
 57ZdyOxBf6TbmEWEaWv+gUAEHxQo7jwP5M3v44uAisoRONx5jB/ObugHxOvRft3v
 06yJpPL2X3XQMTbZKXhZzC8FUg1mY+2+XhQ6w3jHgcliVHNMjxs2H23qkUOjAmox
 ItKS+wKiF30GXd3u99dSV3fI7QnErRliI6Aub9ebnBkEu6rWYT0lYoCHCPpEMnuD
 rEnGBobF+LpQmzV9d/kYPQ45FhkHgG8s3up6U5jIXjf3DqEIXZ3U5Wt6R7m+oiwq
 InWm194VJY526cjHUse5ygVpLEIw1cTHE66pM0AbWF3WcUv9rQV8rQcbU5rmLc3y
 fuq17Rgsn7qAOmEbwTnPSL4cvGZsavophntVpRaluS7yrvGZ4yZVWnpp/8M6jXfa
 wrvkOJU8DXAVpVcBFqdnbRtX61NjB5KF1ZzTzQ1gjD+mOAsB59niV9QcHr2xpajR
 L/vzWZtHbDQV3ouzoked+i3HFjEa1tihXPTZhqi8/I0+dVu8xb2KrNn1JILxFLeu
 5pFMBBLS6n9S8wuIv7XCYFjS3OmWUhbiT7N1dxtXmoDhh4dd7usv4OnRHP3CRInq
 5rM7HHDLWfjH
 =F6sF
 -----END PGP SIGNATURE-----

Merge tag 'acpi-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fix from Rafael Wysocki:
 "Revert a recent commit related to memory management that turned out to
  be problematic (Jia He)"

* tag 'acpi-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  Revert "ACPI: Add memory semantics to acpi_os_map_memory()"
2021-09-24 11:17:32 -07:00
Linus Torvalds
1b7eaf5701 arm64 fixes:
- It turns out that the optimised string routines merged in 5.14 are not
   safe with in-kernel MTE (KASAN_HW_TAGS) because of reading beyond the
   end of a string (strcmp, strncmp). Such reading may go across a 16
   byte tag granule and cause a tag check fault. When KASAN_HW_TAGS is
   enabled, use the generic strcmp/strncmp C implementation.
 
 - An errata workaround for ThunderX relied on the CPU capabilities being
   enabled in a specific order. This disappeared with the automatic
   generation of the cpucaps.h file (sorted alphabetically). Fix it by
   checking the current CPU only rather than the system-wide capability.
 
 - Add system_supports_mte() checks on the kernel entry/exit path and
   thread switching to avoid unnecessary barriers and function calls on
   systems where MTE is not supported.
 
 - kselftests: skip arm64 tests if the required features are missing.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmFOC7oACgkQa9axLQDI
 XvHGhBAAkTzrLtyveymoeD5bk0Lh4Co2i0PKqoOl82ltvagjcvUY5lDLfgKV8ac2
 0fZ6E7Jjt6Khq0VAWryRf/RkjxTXmuqDmyb7hZ/t0TNciRXKMfQyE8o8KhsWOvRF
 hD0COUOnJ4CrgQuAJwsW9uagcrix9NucK9WziHB65RACxnD+k7foffCNv1b9Tw9s
 oqAfZ144B+L1Z0WFoqG4uGkngIyG/lcSPJMhDRo45iRWtNUnlR/m/ApWKb+r+X1X
 /Aq2dWklUnrGBB1MdO3WAaRH/lvulH6t4KVhTJw2Gkq5ogy+mYLO15wGYmM+FAWT
 ExS8fy4/G2vX4RAxYRBZF7Xl3zN7F9n3RYKyYfxgGwSy30TnOLYUMoemYvA9Eh27
 O6cjaZacyPVfaB5LUwj6H84NbDbcJKCO775sLuIW8IEMklyZhorS8HG7nYnL6rBk
 jXM5AVPC2j8kXCgawW9u+s/pEQAvLOvM9y7Q0pjj3n2WNeV09bxzkgX5wStSaMYJ
 Ok95zCK1SbKBYdA7yQnFcQkM52XkqpML8vccBHUdxVbtAetuYxoyGKKs/9G6mPEH
 vWSxP4ymvRza62EofmUJzNIEMWwwNUgyNeY/bVJXYf1oc324Wx4T1ewQagrGVQve
 miEgOVEpn+ygHFkEO3mc0wnFfQ3m5CSBvOjeWlTbv7lgekRDyx4=
 =JIq+
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - It turns out that the optimised string routines merged in 5.14 are
   not safe with in-kernel MTE (KASAN_HW_TAGS) because of reading beyond
   the end of a string (strcmp, strncmp). Such reading may go across a
   16 byte tag granule and cause a tag check fault. When KASAN_HW_TAGS
   is enabled, use the generic strcmp/strncmp C implementation.

 - An errata workaround for ThunderX relied on the CPU capabilities
   being enabled in a specific order. This disappeared with the
   automatic generation of the cpucaps.h file (sorted alphabetically).
   Fix it by checking the current CPU only rather than the system-wide
   capability.

 - Add system_supports_mte() checks on the kernel entry/exit path and
   thread switching to avoid unnecessary barriers and function calls on
   systems where MTE is not supported.

 - kselftests: skip arm64 tests if the required features are missing.

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Restore forced disabling of KPTI on ThunderX
  kselftest/arm64: signal: Skip tests if required features are missing
  arm64: Mitigate MTE issues with str{n}cmp()
  arm64: add MTE supported check to thread switching and syscall entry/exit
2021-09-24 11:12:17 -07:00
Linus Torvalds
4c4f0c2bf3 A fix for a potential array out of bounds access from Dan.
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAmFNudoTHGlkcnlvbW92
 QGdtYWlsLmNvbQAKCRBKf944AhHzi93DCACjNadeFDipw3oAkzdAo+wo0MSYgByU
 4eu3XN77yTHphc+xoCU/9cCeSkthfNc2XZDktb22lbAX2QxCgvQsWFgck+i0d245
 9uQ68IycrUza9PGjLL3okZiLGzqsk97ZDt7vqXT51zN6dgEATVJ5YaXIhVIwAIM0
 F6VVcoHruoOPLhPXctQUZusWS+XPzPvU34n2sNodREv9mYeoFmaZJ15wB6UZn2Ps
 qL6Usoq15EH5cbW60XbqVWFDAhWmCN/6HG3b6w3WtD+lVb2NWTFJOAZGxMVvNo46
 1TacgQUVs7iIh+bol3modOT94yVZ4Cvrb0uUMY/oE4AgjVPGxSjHSeMB
 =7D7F
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-5.15-rc3' of git://github.com/ceph/ceph-client

Pull ceph fix from Ilya Dryomov:
 "A fix for a potential array out of bounds access from Dan"

* tag 'ceph-for-5.15-rc3' of git://github.com/ceph/ceph-client:
  ceph: fix off by one bugs in unsafe_request_wait()
2021-09-24 10:28:18 -07:00
Linus Torvalds
e655c81ade \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmFNp2UACgkQnJ2qBz9k
 QNlubwf/Zv5XJccDBGxn0pB7ew1fN4HowTbWdaS0ELDuLZ2KHhZgEbtUu0V2oZ7I
 pkUMO97llPk0KHWWjcooIaBMGbBQ78Hqq3xFWWboxEu5hMhJyN1cR2uJlrELvxp1
 HsKKaREaUl8jHNQyIuREl/SqLaHmW4LWgrVCZKUTBEc4BeRz2E0C4LTymAZEpTXt
 UCRwAU8itK8Z9/Da1xJ6b04/ZMamgoc0a8Rzq3YwCIcUyFLJATB1/XHR6j0c6B6x
 gH/y7m/y1m3BFsMbMwGNMwiyJWcBXy2RB4I3B4HB3U6ifBL3ebFQlPQIu5PzRbiG
 bGaWRx5o8c5mY/eHpFV4kZS30HM25A==
 =Wep3
 -----END PGP SIGNATURE-----

Merge tag 'fixes_for_v5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull misc filesystem fixes from Jan Kara:
 "A for ext2 sleep in atomic context in case of some fs problems and a
  cleanup of an invalidate_lock initialization"

* tag 'fixes_for_v5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  ext2: fix sleeping in atomic bugs on error
  mm: Fully initialize invalidate_lock, amend lock class later
2021-09-24 10:22:35 -07:00
Linus Torvalds
a801695f68 Merge branch 'work.init' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
 "Followups to nodev root stuff from this merge window"

* 'work.init' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  init: don't panic if mount_nodev_root failed
  init/do_mounts.c: Harden split_fs_names() against buffer overflow
2021-09-24 10:18:07 -07:00
Linus Torvalds
e61b2ad3e1 drm fixes for 5.15-rc3
i915:
 - Fix ADL-P memory bandwidth parameters
 - Fix memory corruption due to a double free
 - Fix memory leak in DMC firmware handling
 
 amdgpu:
 - Update MAINTAINERS entry for powerplay
 - Fix empty macros
 - SI DPM fix
 
 amdkfd:
 - SVM fixes
 - DMA mapping fix
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmFNQv8ACgkQDHTzWXnE
 hr6/XQ/6AjAOkNVatqBivAFLKVPKIcUw/cbZ6rjXPA2mbpmx2fd2GTTDNZq9KBLv
 GG+SGYadpOosl6WnF5kLOlZekBH8bpM03ZL/fVM8wlMw0c+eIuDVIZz0O1Dr/b3v
 71rmfdQAjknuHiLhuwFWwQbcODnD1EgEWbIUVTC2NmgYszDtky0Cq9aCc9EuJCVQ
 Yj4Dppocja4CFL3M5IHcFT/GzrKy3xNYvwMsZh5OC/1qVclkK77I+umYKE6cfkdB
 dAfAPdmKyxQf9+l+8jZ601/ZRlZ/uy2dZ5TM9A2KGDNv/ipU+OquN0xq9IJ2zqfc
 q0dMl/I7o1ostXLYLqsDd5/TiW0TeQDIH+uq34zMGXaUbxbI9mxuFkfisVxmFqgS
 sekW9NwJCPTlbjC1FPZ2Ygizlbbk1kSE/B/qOEyGanoT6E3idDdzp5qTRzpru72I
 Ltl9hCJNB4ELGXBl6vXIc1Us/f8GDxZ72o3mnumRMOi7Lu8xGD44jF6xqNrKR/dq
 nbPZQ5Rxi/CpzRlUJhGPotdilcJTkqbw5rwLP2Mmwi1I8KQejI33uk9TVTSUOL9Y
 iSWLRIV5z6wh7c8ptJiDof/IB29QW5D12QiqappXQvYDIny/Fb57hcXFq2b8c9Pb
 +X8Ur9tv+3eACs9JlZCbJe+MUfV76h36m7goa56+AOIcjq7+Qg4=
 =TwzM
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-2021-09-24' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Quiet week this week, just some i915 and amd fixes, just getting ready
  for my all nighter maintainer summit!

  Summary:

  i915:
   - Fix ADL-P memory bandwidth parameters
   - Fix memory corruption due to a double free
   - Fix memory leak in DMC firmware handling

  amdgpu:
   - Update MAINTAINERS entry for powerplay
   - Fix empty macros
   - SI DPM fix

  amdkfd:
   - SVM fixes
   - DMA mapping fix"

* tag 'drm-fixes-2021-09-24' of git://anongit.freedesktop.org/drm/drm:
  drm/amdkfd: fix svm_migrate_fini warning
  drm/amdkfd: handle svm migrate init error
  drm/amd/pm: Update intermediate power state for SI
  drm/amdkfd: fix dma mapping leaking warning
  drm/amdkfd: SVM map to gpus check vma boundary
  MAINTAINERS: fix up entry for AMD Powerplay
  drm/amd/display: fix empty debug macros
  drm/i915: Free all DMC payloads
  drm/i915: Move __i915_gem_free_object to ttm_bo_destroy
  drm/i915: Update memory bandwidth parameters
2021-09-24 10:14:19 -07:00
Ming Lei
f278eb3d81 block: hold ->invalidate_lock in blkdev_fallocate
When running ->fallocate(), blkdev_fallocate() should hold
mapping->invalidate_lock to prevent page cache from being accessed,
otherwise stale data may be read in page cache.

Without this patch, blktests block/009 fails sometimes. With this patch,
block/009 can pass always.

Also as Jan pointed out, no pages can be created in the discarded area
while you are holding the invalidate_lock, so remove the 2nd
truncate_bdev_range().

Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20210923023751.1441091-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-24 11:06:58 -06:00
Zhihao Cheng
5afedf670c blktrace: Fix uaf in blk_trace access after removing by sysfs
There is an use-after-free problem triggered by following process:

      P1(sda)				P2(sdb)
			echo 0 > /sys/block/sdb/trace/enable
			  blk_trace_remove_queue
			    synchronize_rcu
			    blk_trace_free
			      relay_close
rcu_read_lock
__blk_add_trace
  trace_note_tsk
  (Iterate running_trace_list)
			        relay_close_buf
				  relay_destroy_buf
				    kfree(buf)
    trace_note(sdb's bt)
      relay_reserve
        buf->offset <- nullptr deference (use-after-free) !!!
rcu_read_unlock

[  502.714379] BUG: kernel NULL pointer dereference, address:
0000000000000010
[  502.715260] #PF: supervisor read access in kernel mode
[  502.715903] #PF: error_code(0x0000) - not-present page
[  502.716546] PGD 103984067 P4D 103984067 PUD 17592b067 PMD 0
[  502.717252] Oops: 0000 [#1] SMP
[  502.720308] RIP: 0010:trace_note.isra.0+0x86/0x360
[  502.732872] Call Trace:
[  502.733193]  __blk_add_trace.cold+0x137/0x1a3
[  502.733734]  blk_add_trace_rq+0x7b/0xd0
[  502.734207]  blk_add_trace_rq_issue+0x54/0xa0
[  502.734755]  blk_mq_start_request+0xde/0x1b0
[  502.735287]  scsi_queue_rq+0x528/0x1140
...
[  502.742704]  sg_new_write.isra.0+0x16e/0x3e0
[  502.747501]  sg_ioctl+0x466/0x1100

Reproduce method:
  ioctl(/dev/sda, BLKTRACESETUP, blk_user_trace_setup[buf_size=127])
  ioctl(/dev/sda, BLKTRACESTART)
  ioctl(/dev/sdb, BLKTRACESETUP, blk_user_trace_setup[buf_size=127])
  ioctl(/dev/sdb, BLKTRACESTART)

  echo 0 > /sys/block/sdb/trace/enable &
  // Add delay(mdelay/msleep) before kernel enters blk_trace_free()

  ioctl$SG_IO(/dev/sda, SG_IO, ...)
  // Enters trace_note_tsk() after blk_trace_free() returned
  // Use mdelay in rcu region rather than msleep(which may schedule out)

Remove blk_trace from running_list before calling blk_trace_free() by
sysfs if blk_trace is at Blktrace_running state.

Fixes: c71a896154 ("blktrace: add ftrace plugin")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Link: https://lore.kernel.org/r/20210923134921.109194-1-chengzhihao1@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-24 11:06:15 -06:00
Ming Lei
a647a524a4 block: don't call rq_qos_ops->done_bio if the bio isn't tracked
rq_qos framework is only applied on request based driver, so:

1) rq_qos_done_bio() needn't to be called for bio based driver

2) rq_qos_done_bio() needn't to be called for bio which isn't tracked,
such as bios ended from error handling code.

Especially in bio_endio():

1) request queue is referred via bio->bi_bdev->bd_disk->queue, which
may be gone since request queue refcount may not be held in above two
cases

2) q->rq_qos may be freed in blk_cleanup_queue() when calling into
__rq_qos_done_bio()

Fix the potential kernel panic by not calling rq_qos_ops->done_bio if
the bio isn't tracked. This way is safe because both ioc_rqos_done_bio()
and blkcg_iolatency_done_bio() are nop if the bio isn't tracked.

Reported-by: Yu Kuai <yukuai3@huawei.com>
Cc: tj@kernel.org
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20210924110704.1541818-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-24 11:04:39 -06:00
Pavel Begunkov
9f3a2cb228 io_uring: kill extra checks in io_write()
We don't retry short writes and so we would never get to async setup in
io_write() in that case. Thus ret2 > 0 is always false and
iov_iter_advance() is never used. Apparently, the same is found by
Coverity, which complains on the code.

Fixes: cd65869512 ("io_uring: use iov_iter state save/restore helpers")
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/5b33e61034748ef1022766efc0fb8854cfcf749c.1632500058.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-24 10:26:11 -06:00
Jens Axboe
cdb31c29d3 io_uring: don't punt files update to io-wq unconditionally
There's no reason to punt it unconditionally, we just need to ensure that
the submit lock grabbing is conditional.

Fixes: 05f3fb3c53 ("io_uring: avoid ring quiesce for fixed file set unregister and update")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-24 10:24:34 -06:00
Jens Axboe
9990da93d2 io_uring: put provided buffer meta data under memcg accounting
For each provided buffer, we allocate a struct io_buffer to hold the
data associated with it. As a large number of buffers can be provided,
account that data with memcg.

Fixes: ddf0322db7 ("io_uring: add IORING_OP_PROVIDE_BUFFERS")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-24 10:24:34 -06:00
Jens Axboe
8bab4c09f2 io_uring: allow conditional reschedule for intensive iterators
If we have a lot of threads and rings, the tctx list can get quite big.
This is especially true if we keep creating new threads and rings.
Likewise for the provided buffers list. Be nice and insert a conditional
reschedule point while iterating the nodes for deletion.

Link: https://lore.kernel.org/io-uring/00000000000064b6b405ccb41113@google.com/
Reported-by: syzbot+111d2a03f51f5ae73775@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-24 10:24:34 -06:00
Hao Xu
5b7aa38d86 io_uring: fix potential req refcount underflow
For multishot mode, there may be cases like:

iowq                                 original context
io_poll_add
  _arm_poll()
  mask = vfs_poll() is not 0
  if mask
(2)  io_poll_complete()
  compl_unlock
   (interruption happens
    tw queued to original
    context)
                                     io_poll_task_func()
                                     compl_lock
                                 (3) done = io_poll_complete() is true
                                     compl_unlock
                                     put req ref
(1) if (poll->flags & EPOLLONESHOT)
      put req ref

EPOLLONESHOT flag in (1) may be from (2) or (3), so there are multiple
combinations that can cause ref underfow.
Let's address it by:
- check the return value in (2) as done
- change (1) to if (done)
    in this way, we only do ref put in (1) if 'oneshot flag' is from
    (2)
- do poll.done check in io_poll_task_func(), so that we won't put ref
  for the second time.

Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
Link: https://lore.kernel.org/r/20210922101238.7177-4-haoxu@linux.alibaba.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-24 10:24:34 -06:00
Hao Xu
a62682f92e io_uring: fix missing set of EPOLLONESHOT for CQ ring overflow
We should set EPOLLONESHOT if cqring_fill_event() returns false since
io_poll_add() decides to put req or not by it.

Fixes: 5082620fb2 ("io_uring: terminate multishot poll for CQ ring overflow")
Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
Link: https://lore.kernel.org/r/20210922101238.7177-3-haoxu@linux.alibaba.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-24 10:24:34 -06:00
Hao Xu
bd99c71bd1 io_uring: fix race between poll completion and cancel_hash insertion
If poll arming and poll completion runs in parallel, there maybe races.
For instance, run io_poll_add in iowq and io_poll_task_func in original
context, then:

  iowq                                      original context
  io_poll_add
    vfs_poll
     (interruption happens
      tw queued to original
      context)                              io_poll_task_func
                                              generate cqe
                                              del from cancel_hash[]
    if !poll.done
      insert to cancel_hash[]

The entry left in cancel_hash[], similar case for fast poll.
Fix it by set poll.done = true when del from cancel_hash[].

Fixes: 5082620fb2 ("io_uring: terminate multishot poll for CQ ring overflow")
Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
Link: https://lore.kernel.org/r/20210922101238.7177-2-haoxu@linux.alibaba.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-24 10:24:34 -06:00
Jens Axboe
87c1696655 io-wq: ensure we exit if thread group is exiting
Dave reports that a coredumping workload gets stuck in 5.15-rc2, and
identified the culprit in the Fixes line below. The problem is that
relying solely on fatal_signal_pending() to gate whether to exit or not
fails miserably if a process gets eg SIGILL sent. Don't exclusively
rely on fatal signals, also check if the thread group is exiting.

Fixes: 15e20db2e0 ("io-wq: only exit on fatal signals")
Reported-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-24 10:24:34 -06:00
Jens Axboe
5cad875691 nvme fixes for Linux 5.15:
- keep ctrl->namespaces ordered (me)
  - fix incorrect h2cdata pdu offset accounting in nvme-tcp
    (Sagi Grimberg)
  - handled updated hw_queues in nvme-fc more carefully (Daniel Wagner,
    James Smart)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmFNcZkLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYO5LQ//SD7S9poAxqPyHJ9n10hU7pz2+CcylgvKrfN/pJ68
 QEu8qDaMv8RCfyYmpkKvV+lbwa5yaX3e1E03KuGkwWF0ReLmQhMMkz32lWPdD96l
 V7/lorGc5sfOYCJXv9haTxyItegAZzTOp+h8Mdtz9hV0CJeCKQ57j0EQnCsacw4A
 c5nEg1hx7oHKWPTtPkTnmoWRcx1RndWYBpHPJY2nkRjVeLiE3I7mVFts6tDhnNwf
 zWRjY1MzhdBO4//XnXBsXnu37puJ/xhvL0WZH0lqNl6y6NVpdsEVBOFB5nwAihZP
 ThJqR4uU3ijpU/yNSuGl2VbR3klNkwTPpti1itsNyqLSowrW942aYVXFWarMBeQA
 a6hb2zht8vUM97uXtM5hVLvtUBhBHmoH4lJvxKumQ3NWsYJVUT2lQhIkVxFnDfsb
 b+Y1NicwXMrkSEquQfh+PIkCR3S7CGHz1eMB/Yr4w3yWtcNCTaMu0Go9AWR/EXjm
 mXSSWgyb+yWNPXYva3oaRItNpxNH5+EuX1dBAm+6HUT5lDdxF1KUd5g8Zo6qAVyA
 176i/lLVB0rhliNKgNwgyB4ciIgK2ICqArESR/mAMHQPtJKjDG9qShGGkw+wnPYS
 x5sL/AT8OPGvHdo83c+7csgsr0vJwdmrxOLZZ+GmqMSq9WWq1bMTJ7L2ACGwF0EY
 /Y4=
 =hiQa
 -----END PGP SIGNATURE-----

Merge tag 'nvme-5.15-2021-09-24' of git://git.infradead.org/nvme into block-5.15

Pull NVMe fixes from Christoph:

"nvme fixes for Linux 5.15:

 - keep ctrl->namespaces ordered (me)
 - fix incorrect h2cdata pdu offset accounting in nvme-tcp
   (Sagi Grimberg)
 - handled updated hw_queues in nvme-fc more carefully (Daniel Wagner,
   James Smart)"

* tag 'nvme-5.15-2021-09-24' of git://git.infradead.org/nvme:
  nvme: keep ctrl->namespaces ordered
  nvme-tcp: fix incorrect h2cdata pdu offset accounting
  nvme-fc: remove freeze/unfreeze around update_nr_hw_queues
  nvme-fc: avoid race between time out and tear down
  nvme-fc: update hardware queues before using them
2021-09-24 07:15:21 -06:00
Thomas Gleixner
f9bfed3ad5 irqchip fixes for 5.15, take #1
- Work around a bad GIC integration on a Renesas platform, where the
   interconnect cannot deal with byte-sized MMIO accesses
 
 - Cleanup another Renesas driver abusing the comma operator
 
 - Fix a potential GICv4 memory leak on an error path
 
 - Make the type of 'size' consistent with the rest of the code in
   __irq_domain_add()
 
 - Fix a regression in the Armada 370-XP IPI path
 
 - Fix the build for the obviously unloved goldfish-pic
 
 - Some documentation fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmFNk10PHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpD59kP/A4Al80ndT4GhIlj1b+LolpBctOl3OxNpoYm
 uCsf/LjmNjEQ62F3wd0lMe/qgioU+MKssA94/4pp9IkySNSxToHpaD5WwScaGKP4
 twATEs3cdAmrvE8YTiq+bjuX8mJ7toqhwRWjc2ZTlql4l3DbHzMoeywwnULza/A8
 ZGLJZ4SdvBQPUnMtEXJa9jHwtxRd0irinUApO5XpfRMiGAfCaCD2XfOMVmeBX3TP
 OFtpsxSluIURaAhEBsr60saagqftGrCABr8m19zGynutnosbVvDYq4HUIlIYxeRm
 7BWOskyGw1CZ9beylIO7v2Vp5pNx5KR4t/5wL7+tZXhY7VrgPPQjFf1CbJwB8NUz
 p8ad7n9yHJvzc90mzgqZfuAr7GBZt5wFXj1vKw5hDxlTDo4LfaMD+2Qkp2KOESqi
 ejX3vdrVgLCadzgDqpkjBRpsqjjG+1x+rjji4dpaADEUYxUoyX5lYObiImOznTeS
 9NitgJe5aGFOo0y7DOFYNSc4e2ODfGxTwVl4NTwd4NGVJ+CeBYHlow1B8+5NfoKo
 rqMgo6dgyKjfwyN6YxVo6RDvDe+e/xTKk7s1kaVzYVgQeDh5GeMd9SJ0xms3Dbhe
 pjZLsAmnnIOoHWqcvQOOFJPkhqQBpuY8Gbtw0X3JVrj/C/6HoAAS0FyqhYw53dSC
 gVC3Im4R
 =Fn7y
 -----END PGP SIGNATURE-----

Merge tag 'irqchip-fixes-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent

Pull irqchip fixes from Marc Zyngier:

 - Work around a bad GIC integration on a Renesas platform, where the
   interconnect cannot deal with byte-sized MMIO accesses

 - Cleanup another Renesas driver abusing the comma operator

 - Fix a potential GICv4 memory leak on an error path

 - Make the type of 'size' consistent with the rest of the code in
   __irq_domain_add()

 - Fix a regression in the Armada 370-XP IPI path

 - Fix the build for the obviously unloved goldfish-pic

 - Some documentation fixes

Link: https://lore.kernel.org/r/20210924090933.2766857-1-maz@kernel.org
2021-09-24 14:11:04 +02:00
Numfor Mbiziwo-Tiapo
5ba1071f75 x86/insn, tools/x86: Fix undefined behavior due to potential unaligned accesses
Don't perform unaligned loads in __get_next() and __peek_nbyte_next() as
these are forms of undefined behavior:

"A pointer to an object or incomplete type may be converted to a pointer
to a different object or incomplete type. If the resulting pointer
is not correctly aligned for the pointed-to type, the behavior is
undefined."

(from http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf)

These problems were identified using the undefined behavior sanitizer
(ubsan) with the tools version of the code and perf test.

 [ bp: Massage commit message. ]

Signed-off-by: Numfor Mbiziwo-Tiapo <nums@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lkml.kernel.org/r/20210923161843.751834-1-irogers@google.com
2021-09-24 12:37:38 +02:00
Greg Kroah-Hartman
0292dbd7bd USB-serial fixes for 5.15-rc3
Here's a fix for a regression affecting some CP2102 devices and a host
 of new device ids.
 
 Included are also a couple of cleanups of duplicate device ids, which
 are also tagged for stable to keep the tables in sync, and a trivial
 patch to help debugging cp210x issues.
 
 All have been in linux-next with no reported issues. Note however that
 the last last two device-id commits were rebased to fix up a lore link
 in a commit message (as the patch itself never made it to the list).
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQHbPq+cpGvN/peuzMLxc3C7H1lCAUCYU2GxAAKCRALxc3C7H1l
 CLzVAQCUyg9IUDizLyoMx0gWprMuzCCrN/H728XGRCsUcVCOlgD/Q+QRvWpt9zAX
 qZ2upQJvPm1/8F3aYl+YoH4vnsEvJg8=
 =6FW1
 -----END PGP SIGNATURE-----

Merge tag 'usb-serial-5.15-rc3' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus

Johan writes:

USB-serial fixes for 5.15-rc3

Here's a fix for a regression affecting some CP2102 devices and a host
of new device ids.

Included are also a couple of cleanups of duplicate device ids, which
are also tagged for stable to keep the tables in sync, and a trivial
patch to help debugging cp210x issues.

All have been in linux-next with no reported issues. Note however that
the last last two device-id commits were rebased to fix up a lore link
in a commit message (as the patch itself never made it to the list).

* tag 'usb-serial-5.15-rc3' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
  USB: serial: option: add device id for Foxconn T99W265
  USB: serial: cp210x: add ID for GW Instek GDM-834x Digital Multimeter
  USB: serial: cp210x: add part-number debug printk
  USB: serial: cp210x: fix dropped characters with CP2102
  USB: serial: option: remove duplicate USB device ID
  USB: serial: mos7840: remove duplicated 0xac24 device ID
  USB: serial: option: add Telit LN920 compositions
2021-09-24 10:22:53 +02:00
Slark Xiao
9e3eed534f USB: serial: option: add device id for Foxconn T99W265
Adding support for Foxconn device T99W265 for enumeration with
PID 0xe0db.

usb-devices output for 0xe0db
T:  Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 19 Spd=5000 MxCh= 0
D:  Ver= 3.20 Cls=ef(misc ) Sub=02 Prot=01 MxPS= 9 #Cfgs=  1
P:  Vendor=0489 ProdID=e0db Rev=05.04
S:  Manufacturer=Microsoft
S:  Product=Generic Mobile Broadband Adapter
S:  SerialNumber=6c50f452
C:  #Ifs= 5 Cfg#= 1 Atr=a0 MxPwr=896mA
I:  If#=0x0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim
I:  If#=0x1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
I:  If#=0x3 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
I:  If#=0x4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option

if0/1: MBIM, if2:Diag, if3:GNSS, if4: Modem

Signed-off-by: Slark Xiao <slark_xiao@163.com>
Link: https://lore.kernel.org/r/20210917110106.9852-1-slark_xiao@163.com
[ johan: use USB_DEVICE_INTERFACE_CLASS(), amend comment ]
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
2021-09-24 09:32:11 +02:00
Uwe Brandt
3bd18ba7d8 USB: serial: cp210x: add ID for GW Instek GDM-834x Digital Multimeter
Add the USB serial device ID for the GW Instek GDM-834x Digital Multimeter.

Signed-off-by: Uwe Brandt <uwe.brandt@gmail.com>
Link: https://lore.kernel.org/r/YUxFl3YUCPGJZd8Y@hovoldconsulting.com
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
2021-09-24 09:31:10 +02:00
Steve French
9ed38fd4a1 cifs: fix incorrect check for null pointer in header_assemble
Although very unlikely that the tlink pointer would be null in this case,
get_next_mid function can in theory return null (but not an error)
so need to check for null (not for IS_ERR, which can not be returned
here).

Address warning:

        fs/smbfs_client/connect.c:2392 cifs_match_super()
        warn: 'tlink' isn't an ERR_PTR

Pointed out by Dan Carpenter via smatch code analysis tool

CC: stable@vger.kernel.org
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-23 21:12:53 -05:00
Steve French
1db1aa9887 smb3: correct server pointer dereferencing check to be more consistent
Address warning:

    fs/smbfs_client/misc.c:273 header_assemble()
    warn: variable dereferenced before check 'treeCon->ses->server'

Pointed out by Dan Carpenter via smatch code analysis tool

Although the check is likely unneeded, adding it makes the code
more consistent and easier to read, as the same check is
done elsewhere in the function.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-23 21:12:23 -05:00
Dave Airlie
ef88d7a8a5 Merge tag 'drm-intel-fixes-2021-09-23' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
drm/i915 fixes for v5.15-rc3:
- Fix ADL-P memory bandwidth parameters
- Fix memory corruption due to a double free
- Fix memory leak in DMC firmware handling

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87o88jbk3o.fsf@intel.com
2021-09-24 09:39:17 +10:00
Dave Airlie
22a94600e2 Merge tag 'amd-drm-fixes-5.15-2021-09-23' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.15-2021-09-23:

amdgpu:
- Update MAINTAINERS entry for powerplay
- Fix empty macros
- SI DPM fix

amdkfd:
- SVM fixes
- DMA mapping fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210923211330.20725-1-alexander.deucher@amd.com
2021-09-24 08:39:02 +10:00
Linus Torvalds
f9e36107ec for-5.15-rc2-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmFM79wACgkQxWXV+ddt
 WDtdZQ/+K7NNEutg4JEH7n2KiXxwj8P23NwVK66a+XwH6/ZBe9xz5TQpnTJQ+D13
 +3mhthTJG7Wbcv+FlUVbfTSp5q8m2IH7CKox43o4JCZEEGtFfRPHrBGHLlGKMk3P
 ap2TZ3rvo0Sb21rx978HCQY824wdJvhv0SmWSScmvWzlTQEKaJHz1OFJgFxhUsMp
 Cy9y7mtIy+Ei4qJglU88iFNXhNL6YvwqXxDFY5LwN9rlCaV+rLk476aPfIBvvyf8
 4f34FHJOe1w9Jlk3KfydIwWefRBbq2dm0zNqNrMHNjl8zXbvfn8+ETOvf54HbjIz
 GGgKiZlBgNh2Na+p0SLoloEvBUdD5lSUCXis8099oUWZ+MporIwsyy4jAvtAeWR/
 QxBkZyxvTNFlXLamSo6oS58K9BNuxFYO7nLGSXQFEoYvb8/fu18rRt/A/rmNS8TU
 2vxpYacNKbggoULiGDzB74JY7MHdHRcMcAhmfDeG1bvNESPHfyLnpHfWamBVoUO6
 0eQOr78f1UpBlqJAGAGtfBefN1kMDnORyX0npGkGLFrKYiZbMgsxdjkNhiHnsufl
 9gsNVJ6baCeB1d5qS2vpZXeOLw0ln5iYZa5Yqz0eh/yc/9Wlj/YCsKRuAbaPMR1i
 i2ppHo3/na4K6L0EgSi6SU3xaUT+4LLzEEcBlJJuWZEwUTeYwiM=
 =VJFC
 -----END PGP SIGNATURE-----

Merge tag 'for-5.15-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - regression fix for leak of transaction handle after verity rollback
   failure

 - properly reset device last error between mounts

 - improve one error handling case when checksumming bios

 - fixup confusing displayed size of space info free space

* tag 'for-5.15-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: prevent __btrfs_dump_space_info() to underflow its free space
  btrfs: fix mount failure due to past and transient device flush error
  btrfs: fix transaction handle leak after verity rollback failure
  btrfs: replace BUG_ON() in btrfs_csum_one_bio() with proper error handling
2021-09-23 14:39:41 -07:00
Steve French
b06d893ef2 smb3: correct smb3 ACL security descriptor
Address warning:

        fs/smbfs_client/smb2pdu.c:2425 create_sd_buf()
        warn: struct type mismatch 'smb3_acl vs cifs_acl'

Pointed out by Dan Carpenter via smatch code analysis tool

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-23 16:17:07 -05:00
Linus Torvalds
831c9bd3da selinux/stable-5.15 PR 20210923
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmFM3UgUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNEGw/9GUC++utDACeGSsBF3sedd6LgeeRt
 HV7+zuICPNqfgf/nLjniVLDSRp5IyoSQXVqHMmKPihgFYc4Qw+B0d+WHqFecky48
 +icgwu5GZWjTr/hsHj0DTz2BuZyLSHO5yl0xi81AxDfUjXJWzyyoGci9ldTGXvQR
 L9aLmm4vN955QwG4qvTdX6lQW89Wb8pzkWVYVSeitNQn6jSIzCWk5p6S8bhmqt7t
 BzGwsSe17AiRW/NoqO/WrIkqOUnRi1ef7S2hnYaR/lmohRwBZPmfHEUp9WUeW6Is
 K2heeKqJOlHFjFc/+rsw0YIrqN24I+h9tu6x/Rt87I1Np1DLF5OXaTNGWH1jqnAP
 zApAj3XbuN18fo80gDLq3e9NC0heOxadJQGqxQHVR9pOqMfPw5BUwW4w+MWcul0S
 6L13E974syAjCfhabwjTBB4KsdMGFKrk5ZpNlUozAuf4r67CxdNqltAZITl+r75Q
 jxyG5U3w1/Yvq3PoUVk43MEampn9LsrFZJBd2Rzi+z/AqYDlqjPJHtKPi4uy3nFv
 kyVBHY9h0gw9RxoQgCY+nA5SNmyodGKjy48p8WXjhACBHzeoh7pskOoDZHxmtLSO
 QmeAhT7VpUu1geOW7PIEqYytZS4ISPPRpNzWL2evPABwU5n+peWX9UhCFtXdn2at
 9rf8xbPGXZ+L4K8=
 =PkjC
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20210923' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull SELinux/Smack fixes from Paul Moore:
 "Another single-patch pull request for SELinux, as well as Smack.

  This fixes some credential misuse and is explained reasonably well in
  the patch description"

* tag 'selinux-pr-20210923' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux,smack: fix subjective/objective credential use mixups
2021-09-23 14:17:06 -07:00
Steve French
4f22262280 cifs: Clear modified attribute bit from inode flags
Clear CIFS_INO_MODIFIED_ATTR bit from inode flags after
updating mtime and ctime

Signed-off-by: Rohith Surabattula <rohiths@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Cc: stable@vger.kernel.org # 5.13+
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-23 16:16:19 -05:00
Philip Yang
197ae17722 drm/amdkfd: fix svm_migrate_fini warning
Device manager releases device-specific resources when a driver
disconnects from a device, devm_memunmap_pages and
devm_release_mem_region calls in svm_migrate_fini are redundant.

It causes below warning trace after patch "drm/amdgpu: Split
amdgpu_device_fini into early and late", so remove function
svm_migrate_fini.

BUG: https://gitlab.freedesktop.org/drm/amd/-/issues/1718

WARNING: CPU: 1 PID: 3646 at drivers/base/devres.c:795
devm_release_action+0x51/0x60
Call Trace:
    ? memunmap_pages+0x360/0x360
    svm_migrate_fini+0x2d/0x60 [amdgpu]
    kgd2kfd_device_exit+0x23/0xa0 [amdgpu]
    amdgpu_amdkfd_device_fini_sw+0x1d/0x30 [amdgpu]
    amdgpu_device_fini_sw+0x45/0x290 [amdgpu]
    amdgpu_driver_release_kms+0x12/0x30 [amdgpu]
    drm_dev_release+0x20/0x40 [drm]
    release_nodes+0x196/0x1e0
    device_release_driver_internal+0x104/0x1d0
    driver_detach+0x47/0x90
    bus_remove_driver+0x7a/0xd0
    pci_unregister_driver+0x3d/0x90
    amdgpu_exit+0x11/0x20 [amdgpu]

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 17:06:11 -04:00
Philip Yang
7d6687200a drm/amdkfd: handle svm migrate init error
If svm migration init failed to create pgmap for device memory, set
pgmap type to 0 to disable device SVM support capability.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 17:06:11 -04:00
Lijo Lazar
ab39d3cef5 drm/amd/pm: Update intermediate power state for SI
Update the current state as boot state during dpm initialization.
During the subsequent initialization, set_power_state gets called to
transition to the final power state. set_power_state refers to values
from the current state and without current state populated, it could
result in NULL pointer dereference.

For ex: on platforms where PCI speed change is supported through ACPI
ATCS method, the link speed of current state needs to be queried before
deciding on changing to final power state's link speed. The logic to query
ATCS-support was broken on certain platforms. The issue became visible
when broken ATCS-support logic got fixed with commit
f9b7f3703f ("drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)").

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1698

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-09-23 17:06:11 -04:00
Philip Yang
f63251184a drm/amdkfd: fix dma mapping leaking warning
For xnack off, restore work dma unmap previous system memory page, and
dma map the updated system memory page to update GPU mapping, this is
not dma mapping leaking, remove the WARN_ONCE for dma mapping leaking.

prange->dma_addr store the VRAM page pfn after the range migrated to
VRAM, should not dma unmap VRAM page when updating GPU mapping or
remove prange. Add helper svm_is_valid_dma_mapping_addr to check VRAM
page and error cases.

Mask out SVM_RANGE_VRAM_DOMAIN flag in dma_addr before calling amdgpu vm
update to avoid BUG_ON(*addr & 0xFFFF00000000003FULL), and set it again
immediately after. This flag is used to know the type of page later to
dma unmapping system memory page.

Fixes: 1d5dbfe6c0 ("drm/amdkfd: classify and map mixed svm range pages in GPU")
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 16:00:16 -04:00
Philip Yang
7beb26dced drm/amdkfd: SVM map to gpus check vma boundary
SVM range may includes multiple VMAs with different vm_flags, if prange
page index is the last page of the VMA offset + npages, update GPU
mapping to create GPU page table with same VMA access permission.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 16:00:09 -04:00
Alex Deucher
6de0653f77 MAINTAINERS: fix up entry for AMD Powerplay
Fix the path to cover both the older powerplay infrastructure
and the newer SwSMU infrastructure.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 15:58:38 -04:00
Arnd Bergmann
c48977f020 drm/amd/display: fix empty debug macros
Using an empty macro expansion as a conditional expression
produces a W=1 warning:

drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_aux.c: In function 'dce_aux_transfer_with_retries':
drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_aux.c:775:156: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body]
  775 |                                                                 "dce_aux_transfer_with_retries: AUX_RET_SUCCESS: AUX_TRANSACTION_REPLY_I2C_OVER_AUX_DEFER");
      |                                                                                                                                                            ^
drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_aux.c:783:155: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body]
  783 |                                                                 "dce_aux_transfer_with_retries: AUX_RET_SUCCESS: AUX_TRANSACTION_REPLY_I2C_OVER_AUX_NACK");
      |                                                                                                                                                           ^

Expand it to "do { } while (0)" instead to make the expression
more robust and avoid the warning.

Fixes: 56aca23093 ("drm/amd/display: Add AUX I2C tracing.")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 15:58:21 -04:00
David Howells
03ab9cb982 cifs: Deal with some warnings from W=1
Deal with some warnings generated from make W=1:

 (1) Add/remove/fix kerneldoc parameters descriptions.

 (2) Turn cifs' rqst_page_get_length()'s banner comment into a kerneldoc
     comment.  It should probably be prefixed with "cifs_" though.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-23 14:06:17 -05:00
Jia He
12064c1768 Revert "ACPI: Add memory semantics to acpi_os_map_memory()"
This reverts commit 437b38c511.

The memory semantics added in commit 437b38c511 causes SystemMemory
Operation region, whose address range is not described in the EFI memory
map to be mapped as NormalNC memory on arm64 platforms (through
acpi_os_map_memory() in acpi_ex_system_memory_space_handler()).

This triggers the following abort on an ARM64 Ampere eMAG machine,
because presumably the physical address range area backing the Opregion
does not support NormalNC memory attributes driven on the bus.

 Internal error: synchronous external abort: 96000410 [#1] SMP
 Modules linked in:
 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0+ #462
 Hardware name: MiTAC RAPTOR EV-883832-X3-0001/RAPTOR, BIOS 0.14 02/22/2019
 pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[...snip...]
 Call trace:
  acpi_ex_system_memory_space_handler+0x26c/0x2c8
  acpi_ev_address_space_dispatch+0x228/0x2c4
  acpi_ex_access_region+0x114/0x268
  acpi_ex_field_datum_io+0x128/0x1b8
  acpi_ex_extract_from_field+0x14c/0x2ac
  acpi_ex_read_data_from_field+0x190/0x1b8
  acpi_ex_resolve_node_to_value+0x1ec/0x288
  acpi_ex_resolve_to_value+0x250/0x274
  acpi_ds_evaluate_name_path+0xac/0x124
  acpi_ds_exec_end_op+0x90/0x410
  acpi_ps_parse_loop+0x4ac/0x5d8
  acpi_ps_parse_aml+0xe0/0x2c8
  acpi_ps_execute_method+0x19c/0x1ac
  acpi_ns_evaluate+0x1f8/0x26c
  acpi_ns_init_one_device+0x104/0x140
  acpi_ns_walk_namespace+0x158/0x1d0
  acpi_ns_initialize_devices+0x194/0x218
  acpi_initialize_objects+0x48/0x50
  acpi_init+0xe0/0x498

If the Opregion address range is not present in the EFI memory map there
is no way for us to determine the memory attributes to use to map it -
defaulting to NormalNC does not work (and it is not correct on a memory
region that may have read side-effects) and therefore commit
437b38c511 should be reverted, which means reverting back to the
original behavior whereby address ranges that are mapped using
acpi_os_map_memory() default to the safe devicenGnRnE attributes on
ARM64 if the mapped address range is not defined in the EFI memory map.

Fixes: 437b38c511 ("ACPI: Add memory semantics to acpi_os_map_memory()")
Signed-off-by: Jia He <justin.he@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-09-23 20:39:36 +02:00
Linus Torvalds
f10f0481a5 A fix for a bug with restartable sequences and KVM. KVM's handling
of TIF_NOTIFY_RESUME, e.g. for task migration, clears the flag without
 informing rseq and leads to stale data in userspace's rseq struct.
 
 I'm sending this as a separate pull request since it's not code
 that I usually touch.  In particular, patch 2 ("entry: rseq: Call
 rseq_handle_notify_resume() in tracehook_notify_resume()") is just a
 cleanup to try and make future bugs less likely.  If you prefer this to
 be sent via Thomas and only in 5.16, please speak up.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmFLPYgUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNCowf6A9pTuDspCC2IRICfQnHj/Q/Xc7HH
 UmwYo26GctfYaq9AyIAGzcRx3iOEGjb9fZVJ6mzPUGIygio9ZyuiPHogf7lMAb+x
 39ts5uSOp+N+8e0fvX578WFfmG5hQa4Tp9W3T2Y5KsVgK2Nf8F08DckzIgD8cbkN
 NQKTRIi8AYgb20y3NFZjzsPRxF8850QK7xVCI+LBjryyWpEGT5ZsthrYUeexiJPz
 XN+VOYJen5GXVBCar2JbA7EVSrMZbKSy+M3fJ1vuW5dZHySaiu69JXJHop71jTnJ
 5BGue917MfH6RTDzIFFUcg7NmwcuXHpw4dsFeiyExYFNw1uWWQpk0efC1g==
 =/xlE
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-rseq' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull rseq fixes from Paolo Bonzini:
 "A fix for a bug with restartable sequences and KVM.

  KVM's handling of TIF_NOTIFY_RESUME, e.g. for task migration, clears
  the flag without informing rseq and leads to stale data in userspace's
  rseq struct"

* tag 'for-linus-rseq' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: selftests: Remove __NR_userfaultfd syscall fallback
  KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs
  tools: Move x86 syscall number fallbacks to .../uapi/
  entry: rseq: Call rseq_handle_notify_resume() in tracehook_notify_resume()
  KVM: rseq: Update rseq when processing NOTIFY_RESUME on xfer to KVM guest
2021-09-23 11:24:12 -07:00
Linus Torvalds
9bc62afe03 Networking fixes for 5.15-rc3.
Current release - regressions:
 
  - dsa: bcm_sf2: fix array overrun in bcm_sf2_num_active_ports()
 
 Previous releases - regressions:
 
  - introduce a shutdown method to mdio device drivers, and make DSA
    switch drivers compatible with masters disappearing on shutdown;
    preventing infinite reference wait
 
  - fix issues in mdiobus users related to ->shutdown vs ->remove
 
  - virtio-net: fix pages leaking when building skb in big mode
 
  - xen-netback: correct success/error reporting for the SKB-with-fraglist
 
  - dsa: tear down devlink port regions when tearing down the devlink
         port on error
 
  - nexthop: fix division by zero while replacing a resilient group
 
  - hns3: check queue, vf, vlan ids range before using
 
 Previous releases - always broken:
 
  - napi: fix race against netpoll causing NAPI getting stuck
 
  - mlx4_en: ensure link operstate is updated even if link comes up
             before netdev registration
 
  - bnxt_en: fix TX timeout when TX ring size is set to the smallest
 
  - enetc: fix illegal access when reading affinity_hint;
           prevent oops on sysfs access
 
  - mtk_eth_soc: avoid creating duplicate offload entries
 
 Misc:
 
  - core: correct the sock::sk_lock.owned lockdep annotations
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmFMr4MACgkQMUZtbf5S
 Irv6Ag/+Ml4q6/IVO0jBppztZO1RrSalb3YE9JjPQyMchauVcdcADNpYF+Jo/gcH
 4q+/Oikfp6gkQpJTFd0Y9X7UhwA4Jm4wWtEisqc6PJeHOagDZmVUn353WtgpnCNL
 CgYBfa5k3msGudkqgeXIyiP/2sekBevTy+fOptubLZClyBrEwNUUUZBlpT9aI9Sj
 ru1eMYklfcxP60AQgNhqq6ZwJnRELgN75fSR6ypVCGcRnTK4UGL/b6TvnPYn8uYY
 zeNuMZZzYZK5B73tC6rWpteHWZ7VW3Km0WvIKs+ORM8nYchz/EprKZ0HCLPYrWvf
 ib5Wi7HyL7/n9k9NUTCGrQY3tkOWNzXOepjpiBZPqCG9r2hc3JSR7Q2lFwL+gKv/
 sh2y+T2xfp0WFGmG2XiU2MgnkypMSKah1sC/XRE7YLw02vPAnWQxxl/KVNek4j7M
 CH/Tg9ErVKDRLN7KO/kKl3s8I8N4hdctms/YUt9QD5J9Rw/Jqwr/79bq1uLy6d4o
 //ipmCTHex57Nvy80PtgcuKJhoeqGwR/Av6BvBMRZ1SOYs/C6q45skHTlYyiNY3+
 Dyj9+nfrhsyE835GKPe8lqBFZONBXpXw+EUNXeYRiv0Pcd+JKek07bbajSQVSpd8
 8nqQwylpGII0iPGyOc9wKajzh7O5W2odFIdOwtY/5yVjrcFgBd0=
 =VcL3
 -----END PGP SIGNATURE-----

Merge tag 'net-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Current release - regressions:

   - dsa: bcm_sf2: fix array overrun in bcm_sf2_num_active_ports()

  Previous releases - regressions:

   - introduce a shutdown method to mdio device drivers, and make DSA
     switch drivers compatible with masters disappearing on shutdown;
     preventing infinite reference wait

   - fix issues in mdiobus users related to ->shutdown vs ->remove

   - virtio-net: fix pages leaking when building skb in big mode

   - xen-netback: correct success/error reporting for the
     SKB-with-fraglist

   - dsa: tear down devlink port regions when tearing down the devlink
     port on error

   - nexthop: fix division by zero while replacing a resilient group

   - hns3: check queue, vf, vlan ids range before using

  Previous releases - always broken:

   - napi: fix race against netpoll causing NAPI getting stuck

   - mlx4_en: ensure link operstate is updated even if link comes up
     before netdev registration

   - bnxt_en: fix TX timeout when TX ring size is set to the smallest

   - enetc: fix illegal access when reading affinity_hint; prevent oops
     on sysfs access

   - mtk_eth_soc: avoid creating duplicate offload entries

  Misc:

   - core: correct the sock::sk_lock.owned lockdep annotations"

* tag 'net-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (51 commits)
  atlantic: Fix issue in the pm resume flow.
  net/mlx4_en: Don't allow aRFS for encapsulated packets
  net: mscc: ocelot: fix forwarding from BLOCKING ports remaining enabled
  net: ethernet: mtk_eth_soc: avoid creating duplicate offload entries
  nfc: st-nci: Add SPI ID matching DT compatible
  MAINTAINERS: remove Guvenc Gulce as net/smc maintainer
  nexthop: Fix memory leaks in nexthop notification chain listeners
  mptcp: ensure tx skbs always have the MPTCP ext
  qed: rdma - don't wait for resources under hw error recovery flow
  s390/qeth: fix deadlock during failing recovery
  s390/qeth: Fix deadlock in remove_discipline
  s390/qeth: fix NULL deref in qeth_clear_working_pool_list()
  net: dsa: realtek: register the MDIO bus under devres
  net: dsa: don't allocate the slave_mii_bus using devres
  Doc: networking: Fox a typo in ice.rst
  net: dsa: fix dsa_tree_setup error path
  net/smc: fix 'workqueue leaked lock' in smc_conn_abort_work
  net/smc: add missing error check in smc_clc_prfx_set()
  net: hns3: fix a return value error in hclge_get_reset_status()
  net: hns3: check vlan id before using it
  ...
2021-09-23 10:30:31 -07:00
Shakeel Butt
1f828223b7 memcg: flush lruvec stats in the refault
Prior to the commit 7e1c0d6f58 ("memcg: switch lruvec stats to rstat")
and the commit aa48e47e39 ("memcg: infrastructure to flush memcg
stats"), each lruvec memcg stats can be off by (nr_cgroups * nr_cpus *
32) at worst and for unbounded amount of time.  The commit aa48e47e39
moved the lruvec stats to rstat infrastructure and the commit
7e1c0d6f58 bounded the error for all the lruvec stats to (nr_cpus *
32) at worst for at most 2 seconds.  More specifically it decoupled the
number of stats and the number of cgroups from the error rate.

However this reduction in error comes with the cost of triggering the
slowpath of stats update more frequently.  Previously in the slowpath
the kernel adds the stats up the memcg tree.  After aa48e47e39, the
kernel triggers the asyn lruvec stats flush through queue_work().  This
causes regression reports from 0day kernel bot [1] as well as from
phoronix test suite [2].

We tried two options to fix the regression:

 1) Increase the threshold to trigger the slowpath in lruvec stats
    update codepath from 32 to 512.

 2) Remove the slowpath from lruvec stats update codepath and instead
    flush the stats in the page refault codepath. The assumption is that
    the kernel timely flush the stats, so, the update tree would be
    small in the refault codepath to not cause the preformance impact.

Following are the results of will-it-scale/page_fault[1|2|3] benchmark
on four settings i.e.  (1) 5.15-rc1 as baseline (2) 5.15-rc1 with
aa48e47e39 and 7e1c0d6f58 reverted (3) 5.15-rc1 with option-1
(4) 5.15-rc1 with option-2.

  test       (1)      (2)               (3)               (4)
  pg_f1   368563   406277 (10.23%)   399693  (8.44%)   416398 (12.97%)
  pg_f2   338399   372133  (9.96%)   369180  (9.09%)   381024 (12.59%)
  pg_f3   500853   575399 (14.88%)   570388 (13.88%)   576083 (15.02%)

From the above result, it seems like the option-2 not only solves the
regression but also improves the performance for at least these
benchmarks.

Feng Tang (intel) ran the aim7 benchmark with these two options and
confirms that option-1 reduces the regression but option-2 removes the
regression.

Michael Larabel (phoronix) ran multiple benchmarks with these options
and reported the results at [3] and it shows for most benchmarks
option-2 removes the regression introduced by the commit aa48e47e39
("memcg: infrastructure to flush memcg stats").

Based on the experiment results, this patch proposed the option-2 as the
solution to resolve the regression.

Link: https://lore.kernel.org/all/20210726022421.GB21872@xsang-OptiPlex-9020 [1]
Link: https://www.phoronix.com/scan.php?page=article&item=linux515-compile-regress [2]
Link: https://openbenchmarking.org/result/2109226-DEBU-LINUX5104 [3]
Fixes: aa48e47e39 ("memcg: infrastructure to flush memcg stats")
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Tested-by: Michael Larabel <Michael@phoronix.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Hillf Danton <hdanton@sina.com>,
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-23 10:09:13 -07:00
Paul Moore
a3727a8bac selinux,smack: fix subjective/objective credential use mixups
Jann Horn reported a problem with commit eb1231f73c ("selinux:
clarify task subjective and objective credentials") where some LSM
hooks were attempting to access the subjective credentials of a task
other than the current task.  Generally speaking, it is not safe to
access another task's subjective credentials and doing so can cause
a number of problems.

Further, while looking into the problem, I realized that Smack was
suffering from a similar problem brought about by a similar commit
1fb057dcde ("smack: differentiate between subjective and objective
task credentials").

This patch addresses this problem by restoring the use of the task's
objective credentials in those cases where the task is other than the
current executing task.  Not only does this resolve the problem
reported by Jann, it is arguably the correct thing to do in these
cases.

Cc: stable@vger.kernel.org
Fixes: eb1231f73c ("selinux: clarify task subjective and objective credentials")
Fixes: 1fb057dcde ("smack: differentiate between subjective and objective task credentials")
Reported-by: Jann Horn <jannh@google.com>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-09-23 12:30:59 -04:00
Yue Hu
c40dd3ca2a erofs: clear compacted_2b if compacted_4b_initial > totalidx
Currently, the whole indexes will only be compacted 4B if
compacted_4b_initial > totalidx. So, the calculated compacted_2b
is worthless for that case. It may waste CPU resources.

No need to update compacted_4b_initial as mkfs since it's used to
fulfill the alignment of the 1st compacted_2b pack and would handle
the case above.

We also need to clarify compacted_4b_end here. It's used for the
last lclusters which aren't fitted in the previous compacted_2b
packs.

Some messages are from Xiang.

Link: https://lore.kernel.org/r/20210914035915.1190-1-zbestahu@gmail.com
Signed-off-by: Yue Hu <huyue2@yulong.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
[ Gao Xiang: it's enough to use "compacted_4b_initial < totalidx". ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2021-09-23 23:23:04 +08:00