Commit Graph

1075555 Commits

Author SHA1 Message Date
Tom Rix
779596ce6a drm/amdgpu: fix amdgpu_ras_block_late_init error handler
Clang build fails with
amdgpu_ras.c:2416:7: error: variable 'ras_obj' is used uninitialized
  whenever 'if' condition is true
  if (adev->in_suspend || amdgpu_in_reset(adev)) {
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

amdgpu_ras.c:2453:6: note: uninitialized use occurs here
 if (ras_obj->ras_cb)
     ^~~~~~~

There is a logic error in the error handler's labels.
ex/ The sysfs: is the last goto label in the normal code but
is the middle of error handler.  Rework the error handler.

cleanup: is the first error, so it's handler should be last.

interrupt: is the second error, it's handler is next.  interrupt:
handles the failure of amdgpu_ras_interrupt_add_hander() by
calling amdgpu_ras_interrupt_remove_handler().  This is wrong,
remove() assumes the interrupt has been setup, not torn down by
add().  Change the goto label to cleanup.

sysfs is the last error, it's handler should be first.  sysfs:
handles the failure of amdgpu_ras_sysfs_create() by calling
amdgpu_ras_sysfs_remove().  But when the create() fails there
is nothing added so there is nothing to remove.  This error
handler is not needed. Remove the error handler and change
goto label to interrupt.

Fixes: b293e891b0 ("drm/amdgpu: add helper function to do common ras_late_init/fini (v3)")
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17 15:59:05 -05:00
Luben Tuikov
6b5033831f drm/amdgpu: Dynamically initialize IP instance attributes
Dynamically initialize IP instance attributes. This eliminates bugs
stemming from adding new attributes to an IP instance.

Cc: Alex Deucher <Alexander.Deucher@amd.com>
Reported-by: Tom StDenis <tom.stdenis@amd.com>
Fixes: 4d7ba312dd ("drm/amdgpu: Add "harvest" to IP discovery sysfs")
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17 15:59:05 -05:00
Tom St Denis
8f74f68d90 drm/amd/amdgpu: Add APU flag to gca_config debugfs data (v3)
Needed by umr to detect if ip discovered ASIC is an APU or not.

(v2): Remove asic type from packet it's not strictly needed
(v3): Correct comment

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17 15:59:05 -05:00
Mario Limonciello
d01899d3db drm/amd: Use amdgpu_device_should_use_aspm on navi umd pstate switching
The `program_aspm` callback is already guarded for aspm, but the
`enable_aspm` callback doesn't follow the module parameter.

Update it to use the helper `amdgpu_device_should_use_aspm`.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17 15:59:05 -05:00
Mario Limonciello
0ab5d711ec drm/amd: Refactor amdgpu_aspm to be evaluated per device
Evaluating `pcie_aspm_enabled` as part of driver probe has the implication
that if one PCIe bridge with an AMD GPU connected doesn't support ASPM
then none of them do.  This is an invalid assumption as the PCIe core will
configure ASPM for individual PCIe bridges.

Create a new helper function that can be called by individual dGPUs to
react to the `amdgpu_aspm` module parameter without having negative results
for other dGPUs on the PCIe bus.

Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17 15:59:05 -05:00
Luben Tuikov
f0d5409895 drm/amdgpu: Fix ARM compilation warning
Fix this ARM warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:664:35: warning: format '%ld'
expects argument of type 'long int', but argument 4 has type 'size_t' {aka
'unsigned int'} [-Wformat=]

Cc: Alex Deucher <Alexander.Deucher@amd.com>
Cc: kbuild-all@lists.01.org
Cc: linux-kernel@vger.kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Fixes: a6c40b1780 ("drm/amdgpu: Show IP discovery in sysfs")
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17 15:59:05 -05:00
Mario Limonciello
cba07cce39 drm/amd: Check if ASPM is enabled from PCIe subsystem
commit 0064b0ce85 ("drm/amd/pm: enable ASPM by default") enabled ASPM
by default but a variety of hardware configurations it turns out that this
caused a regression.

* PPC64LE hardware does not support ASPM at a hardware level.
  CONFIG_PCIEASPM is often disabled on these architectures.
* Some dGPUs on ALD platforms don't work with ASPM enabled and PCIe subsystem
  disables it

Check with the PCIe subsystem to see that ASPM has been enabled
or not.

Fixes: 0064b0ce85 ("drm/amd/pm: enable ASPM by default")
Link: https://wiki.raptorcs.com/w/images/a/ad/P9_PHB_version1.0_27July2018_pub.pdf
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1723
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1739
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1885
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1907
Tested-by: koba.ko@canonical.com
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17 15:59:05 -05:00
yipechai
418abce203 drm/amdgpu: Remove redundant .ras_late_init initialization in some ras blocks
1. Define amdgpu_ras_block_late_init_default in amdgpu_ras.c as
   .ras_late_init common function, which is called when
   .ras_late_init of ras block isn't initialized.
2. Remove the code of using amdgpu_ras_block_late_init to
   initialize .ras_late_init in ras blocks.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17 15:59:05 -05:00
yipechai
867e24ca49 drm/amdgpu: define amdgpu_ras_late_init to call all ras blocks' .ras_late_init
Define amdgpu_ras_late_init to call all ras blocks' .ras_late_init.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17 15:59:05 -05:00
yipechai
caae42f009 drm/amdgpu: Optimize xxx_ras_late_init function of each ras block
1. Move calling ras block instance members from module internal
   function to the top calling xxx_ras_late_init.
2. Module internal function calls can only use parameter variables
   of xxx_ras_late_init instead of ras block instance members.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17 15:59:05 -05:00
yipechai
20c43547ad drm/amdgpu: Remove redundant calls of ras_late_init in mca ras block
Remove redundant calls of ras_late_init in mca ras block.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17 15:59:05 -05:00
yipechai
068001b711 drm/amdgpu: Remove redundant calls of ras_late_init in mmhub ras block
Remove redundant calls of ras_late_init in mmhub ras block.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17 15:59:04 -05:00
yipechai
72b3588e27 drm/amdgpu: Remove redundant calls of ras_late_init in hdp ras block
Remove redundant calls of ras_late_init in hdp ras block.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17 15:59:04 -05:00
yipechai
4e9b1fa5a2 drm/amdgpu: Modify .ras_late_init function pointer parameter
Modify .ras_late_init function pointer parameter so that
it can remove redundant intermediate calls in some ras blocks.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17 15:59:04 -05:00
Prike Liang
f83e14011e drm/amdgpu/discovery: Add sw DM function for 3.1.6 DCE
Add 3.1.6 DCE IP and assign relevant sw DM function for the new DCE.

Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17 15:59:04 -05:00
Hansen Dsouza
c477eaa6a7 drm/amd/display: Add DCN316 resource and SMU clock manager
Add core DC implementation for DCN316.

Reviewed-by: Charlene Liu <Charlene.Liu@amd.com>
Acked-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Hansen Dsouza <Hansen.Dsouza@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17 15:58:45 -05:00
Leo Li
868f4357ed drm/amd/display: Add DMUB support for DCN316
Initialize DMUB for DCN316. Use same funcs as DCN31 for
DCN316.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17 15:46:04 -05:00
Prike Liang
de7cc1b48f drm/amd/display: configure dc hw resource for DCN 3.1.6
- set DC version
- add construct/destroy dc clock management function
- register dcn interrupt handler

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17 15:45:17 -05:00
Leo Li
f3f6eff85f drm/amd/include: Add MP 13.0.8 register headers
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17 15:45:07 -05:00
Leo Li
64b14a184e drm/amd/include: Add register headers for DCN 3.1.6
Add register headers for the following IPs:
- DCN 3.1.6
- DPCS 4.2.3

Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17 15:44:45 -05:00
Jakub Kicinski
7a2fb91285 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:

====================
pull-request: bpf 2022-02-17

We've added 8 non-merge commits during the last 7 day(s) which contain
a total of 8 files changed, 119 insertions(+), 15 deletions(-).

The main changes are:

1) Add schedule points in map batch ops, from Eric.

2) Fix bpf_msg_push_data with len 0, from Felix.

3) Fix crash due to incorrect copy_map_value, from Kumar.

4) Fix crash due to out of bounds access into reg2btf_ids, from Kumar.

5) Fix a bpf_timer initialization issue with clang, from Yonghong.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Add schedule points in batch ops
  bpf: Fix crash due to out of bounds access into reg2btf_ids.
  selftests: bpf: Check bpf_msg_push_data return value
  bpf: Fix a bpf_timer initialization issue
  bpf: Emit bpf_timer in vmlinux BTF
  selftests/bpf: Add test for bpf_timer overwriting crash
  bpf: Fix crash due to incorrect copy_map_value
  bpf: Do not try bpf_msg_push_data with len 0
====================

Link: https://lore.kernel.org/r/20220217190000.37925-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17 12:01:55 -08:00
Dave Airlie
5666b61019 Merge tag 'drm-intel-fixes-2022-02-17' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- GVT kerneldoc cleanup. (Randy Dunlap)
- GVT Kconfig should depend on X86. (Siva Mullati)
- Prevent out of range access in SWSCI display code. (Jani Nikula)
- Fix mbus join and dbuf slice config lookup. (Ville Syrjälä)
- Fix inverted priority selection in the TTM backend. (Matthew Auld)
- Fix FBC plane end Y offset check. (Ville Syrjälä)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Yg4lA6k8+xp8u3aB@tursulin-mobl2
2022-02-18 05:44:45 +10:00
Dave Airlie
babb1fc323 * drm/cma-helper: Set VM_DONTEXPAND
* drm/atomic: Fix error handling in drm_atomic_set_mode_for_crtc()
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmIOJqQACgkQaA3BHVML
 eiOETAf/VbMUCR9KTb2rtVQ7Rb2/GMDZeOKoo5+jXaXgYeb0vjSChNI6yvp+NclV
 u6gkB7dnWwSO0VKd+Sahr6SPPT1GqhOuvrGAE80DMnjMIqb+7X8NU/xHtmFOc55R
 mgxQV3TbOC3psbMzZ9zKUQNKchctxuUVmTub0v1BQrx9+GXE8pew/owbF3PVD5Ho
 deIOXZCYHk/hEqI93M2GZoUlVxZLfBm8AvwD9U2FoF7QterX7fu9IBfvLJJA+v91
 jIe9ZHtg+WpkluK0fL9EX6S2qD4IzEmboHfFxpMwDDbewJ4D5Q/0fBSLZrEZh8+c
 y6fnoZg8lLtduOoGizaJDR6rg1nZ5w==
 =OpKp
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-fixes-2022-02-17' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

 * drm/cma-helper: Set VM_DONTEXPAND
 * drm/atomic: Fix error handling in drm_atomic_set_mode_for_crtc()

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/Yg4mzQALMX69UmA3@linux-uq9g
2022-02-18 05:40:00 +10:00
Linus Torvalds
8b97cae315 Networking fixes for 5.17-rc5, including fixes from wireless and
netfilter.
 
 Current release - regressions:
 
  - dsa: lantiq_gswip: fix use after free in gswip_remove()
 
  - smc: avoid overwriting the copies of clcsock callback functions
 
 Current release - new code bugs:
 
  - iwlwifi:
    - fix use-after-free when no FW is present
    - mei: fix the pskb_may_pull check in ipv4
    - mei: retry mapping the shared area
    - mvm: don't feed the hardware RFKILL into iwlmei
 
 Previous releases - regressions:
 
  - ipv6: mcast: use rcu-safe version of ipv6_get_lladdr()
 
  - tipc: fix wrong publisher node address in link publications
 
  - iwlwifi: mvm: don't send SAR GEO command for 3160 devices,
    avoid FW assertion
 
  - bgmac: make idm and nicpm resource optional again
 
  - atl1c: fix tx timeout after link flap
 
 Previous releases - always broken:
 
  - vsock: remove vsock from connected table when connect is
    interrupted by a signal
 
  - ping: change destination interface checks to match raw sockets
 
  - crypto: af_alg - get rid of alg_memory_allocated to avoid confusing
    semantics (and null-deref) after SO_RESERVE_MEM was added
 
  - ipv6: make exclusive flowlabel checks per-netns
 
  - bonding: force carrier update when releasing slave
 
  - sched: limit TC_ACT_REPEAT loops
 
  - bridge: multicast: notify switchdev driver whenever MC processing
    gets disabled because of max entries reached
 
  - wifi: brcmfmac: fix crash in brcm_alt_fw_path when WLAN not found
 
  - iwlwifi: fix locking when "HW not ready"
 
  - phy: mediatek: remove PHY mode check on MT7531
 
  - dsa: mv88e6xxx: flush switchdev FDB workqueue before removing VLAN
 
  - dsa: lan9303:
    - fix polarity of reset during probe
    - fix accelerated VLAN handling
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmIOm44ACgkQMUZtbf5S
 IruWWBAAmNJBxVoVUahidwIVKHnKYeHClzDee3B6sYSupRW22Eeuh7Q8fqPMO4J7
 KO9nP/vGibFuhKfjApvS6wPvpYWCuXAoSozfOa+JWNrFg9uVpdyHIhfdXl0WPZqx
 A4+p2vIs1ldV0yOac/7ZGMWg57dzlYUurkld3xFwf7KyOOhV5/PkxkxpGN9eFkv1
 NTFWTaIsVzFMXMjNXkGfGHmt/8mSmZHgsH+tYd+KXsjbs2UpbGM3SyfHBlUf3aA0
 bceT4h07xA6C4rlUCbmalRqwvtcdM15MwlDBtSBXm5fXy0c59XxQOqj/dLhPuAO4
 42sQlO2MhqDrZjR0tOjmuP2cpc7llj1lIZe1Qs3nKiNFHcuOJEHGw1PYCO85jKdn
 xiWquuoe3G5YkQeoOoi+HqmXcP6aBZpUbROvYNjSJhcci4Ck0Qjna5J1rk8IRjb8
 AkDf68dodn8I+W5dx/EnopH/ShPQcqGw1+tH4215UB7b40Ecpc+laqFAHRcgs654
 ONuJVdRC4k3TyES1B9z8vawLcGYWa06fz8Mh/dS3gnLDphe5ZiH2tTrESfBdYixH
 idmuO5C/YDhsVelVuO+B0RT/yziPb3Lr+BTplSfkODCXT6LuOdCYUcHx5nGZ1TYW
 EeZ9hMSaxp2E06llEyD6JQQ+0Q17wnDGjLxtOMk+A8fmNX2F17g=
 =Nrzq
 -----END PGP SIGNATURE-----

Merge tag 'net-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from wireless and netfilter.

  Current release - regressions:

   - dsa: lantiq_gswip: fix use after free in gswip_remove()

   - smc: avoid overwriting the copies of clcsock callback functions

  Current release - new code bugs:

   - iwlwifi:
      - fix use-after-free when no FW is present
      - mei: fix the pskb_may_pull check in ipv4
      - mei: retry mapping the shared area
      - mvm: don't feed the hardware RFKILL into iwlmei

  Previous releases - regressions:

   - ipv6: mcast: use rcu-safe version of ipv6_get_lladdr()

   - tipc: fix wrong publisher node address in link publications

   - iwlwifi: mvm: don't send SAR GEO command for 3160 devices, avoid FW
     assertion

   - bgmac: make idm and nicpm resource optional again

   - atl1c: fix tx timeout after link flap

  Previous releases - always broken:

   - vsock: remove vsock from connected table when connect is
     interrupted by a signal

   - ping: change destination interface checks to match raw sockets

   - crypto: af_alg - get rid of alg_memory_allocated to avoid confusing
     semantics (and null-deref) after SO_RESERVE_MEM was added

   - ipv6: make exclusive flowlabel checks per-netns

   - bonding: force carrier update when releasing slave

   - sched: limit TC_ACT_REPEAT loops

   - bridge: multicast: notify switchdev driver whenever MC processing
     gets disabled because of max entries reached

   - wifi: brcmfmac: fix crash in brcm_alt_fw_path when WLAN not found

   - iwlwifi: fix locking when "HW not ready"

   - phy: mediatek: remove PHY mode check on MT7531

   - dsa: mv88e6xxx: flush switchdev FDB workqueue before removing VLAN

   - dsa: lan9303:
      - fix polarity of reset during probe
      - fix accelerated VLAN handling"

* tag 'net-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (65 commits)
  bonding: force carrier update when releasing slave
  nfp: flower: netdev offload check for ip6gretap
  ipv6: fix data-race in fib6_info_hw_flags_set / fib6_purge_rt
  ipv4: fix data races in fib_alias_hw_flags_set
  net: dsa: lan9303: add VLAN IDs to master device
  net: dsa: lan9303: handle hwaccel VLAN tags
  vsock: remove vsock from connected table when connect is interrupted by a signal
  Revert "net: ethernet: bgmac: Use devm_platform_ioremap_resource_byname"
  ping: fix the dif and sdif check in ping_lookup
  net: usb: cdc_mbim: avoid altsetting toggling for Telit FN990
  net: sched: limit TC_ACT_REPEAT loops
  tipc: fix wrong notification node addresses
  net: dsa: lantiq_gswip: fix use after free in gswip_remove()
  ipv6: per-netns exclusive flowlabel checks
  net: bridge: multicast: notify switchdev driver whenever MC processing gets disabled
  CDC-NCM: avoid overflow in sanity checking
  mctp: fix use after free
  net: mscc: ocelot: fix use-after-free in ocelot_vlan_del()
  bonding: fix data-races around agg_select_timer
  dpaa2-eth: Initialize mutex used in one step timestamping path
  ...
2022-02-17 11:33:59 -08:00
Zhang Changzhong
a6ab75cec1 bonding: force carrier update when releasing slave
In __bond_release_one(), bond_set_carrier() is only called when bond
device has no slave. Therefore, if we remove the up slave from a master
with two slaves and keep the down slave, the master will remain up.

Fix this by moving bond_set_carrier() out of if (!bond_has_slaves(bond))
statement.

Reproducer:
$ insmod bonding.ko mode=0 miimon=100 max_bonds=2
$ ifconfig bond0 up
$ ifenslave bond0 eth0 eth1
$ ifconfig eth0 down
$ ifenslave -d bond0 eth1
$ cat /proc/net/bonding/bond0

Fixes: ff59c4563a ("[PATCH] bonding: support carrier state for master")
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Link: https://lore.kernel.org/r/1645021088-38370-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17 10:55:21 -08:00
Eric Dumazet
75134f16e7 bpf: Add schedule points in batch ops
syzbot reported various soft lockups caused by bpf batch operations.

 INFO: task kworker/1:1:27 blocked for more than 140 seconds.
 INFO: task hung in rcu_barrier

Nothing prevents batch ops to process huge amount of data,
we need to add schedule points in them.

Note that maybe_wait_bpf_programs(map) calls from
generic_map_delete_batch() can be factorized by moving
the call after the loop.

This will be done later in -next tree once we get this fix merged,
unless there is strong opinion doing this optimization sooner.

Fixes: aa2e93b8e5 ("bpf: Add generic support for update and delete batch ops")
Fixes: cb4d03ab49 ("bpf: Add generic support for lookup batch op")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Brian Vazquez <brianvv@google.com>
Link: https://lore.kernel.org/bpf/20220217181902.808742-1-eric.dumazet@gmail.com
2022-02-17 10:48:26 -08:00
Reinette Chatre
e5733d8c89 x86/sgx: Fix missing poison handling in reclaimer
The SGX reclaimer code lacks page poison handling in its main
free path. This can lead to avoidable machine checks if a
poisoned page is freed and reallocated instead of being
isolated.

A troublesome scenario is:
 1. Machine check (#MC) occurs (asynchronous, !MF_ACTION_REQUIRED)
 2. arch_memory_failure() is eventually called
 3. (SGX) page->poison set to 1
 4. Page is reclaimed
 5. Page added to normal free lists by sgx_reclaim_pages()
    ^ This is the bug (poison pages should be isolated on the
    sgx_poison_page_list instead)
 6. Page is reallocated by some innocent enclave, a second (synchronous)
    in-kernel #MC is induced, probably during EADD instruction.
    ^ This is the fallout from the bug

(6) is unfortunate and can be avoided by replacing the open coded
enclave page freeing code in the reclaimer with sgx_free_epc_page()
to obtain support for poison page handling that includes placing the
poisoned page on the correct list.

Fixes: d6d261bded ("x86/sgx: Add new sgx_epc_page flag bit to mark free pages")
Fixes: 992801ae92 ("x86/sgx: Initial poison handling for dirty and free pages")
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Link: https://lkml.kernel.org/r/dcc95eb2aaefb042527ac50d0a50738c7c160dac.1643830353.git.reinette.chatre@intel.com
2022-02-17 10:24:50 -08:00
Luis Chamberlain
a3580ac9b7 fs/file_table: fix adding missing kmemleak_not_leak()
Commit b42bc9a3c5 ("Fix regression due to "fs: move binfmt_misc sysctl
to its own file") fixed a regression, however it failed to add a
kmemleak_not_leak().

Fixes: b42bc9a3c5 ("Fix regression due to "fs: move binfmt_misc sysctl to its own file")
Reported-by: Tong Zhang <ztong0001@gmail.com>
Cc: Tong Zhang <ztong0001@gmail.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-17 10:23:19 -08:00
Linus Torvalds
2dd3a8a139 perf tools fixes for v5.17: 2nd batch
- Fix corrupt inject files when only last branch option is enabled with ARM CoreSight ETM.
 
 - Fix use-after-free for realloc(..., 0) in libsubcmd, found by gcc 12.
 
 - Defer freeing string after possible strlen() on it in the BPF loader, found by gcc 12.
 
 - Avoid early exit in 'perf trace' due SIGCHLD from non-workload processes.
 
 - Fix arm64 perf_event_attr 'perf test's wrt --call-graph initialization.
 
 - Fix libperf 32-bit build for 'perf test' wrt uint64_t printf.
 
 - Fix perf_cpu_map__for_each_cpu macro in libperf, providing access to the CPU iterator.
 
 - Sync linux/perf_event.h UAPI with the kernel sources.
 
 - Update Jiri Olsa's email address in MAINTAINERS.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYg5q1gAKCRCyPKLppCJ+
 JxbAAP9P7iIQuXecNop1ye4eLsWHYhBAJPSJdvUHwqpwGRm8HQEA2r+2tRtBcbei
 3qgOY0od4Xtw1yji1YmTeQ6jmKFuMQM=
 =c5Ce
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-fixes-for-v5.17-2022-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix corrupt inject files when only last branch option is enabled with
   ARM CoreSight ETM

 - Fix use-after-free for realloc(..., 0) in libsubcmd, found by gcc 12

 - Defer freeing string after possible strlen() on it in the BPF loader,
   found by gcc 12

 - Avoid early exit in 'perf trace' due SIGCHLD from non-workload
   processes

 - Fix arm64 perf_event_attr 'perf test's wrt --call-graph
   initialization

 - Fix libperf 32-bit build for 'perf test' wrt uint64_t printf

 - Fix perf_cpu_map__for_each_cpu macro in libperf, providing access to
   the CPU iterator

 - Sync linux/perf_event.h UAPI with the kernel sources

 - Update Jiri Olsa's email address in MAINTAINERS

* tag 'perf-tools-fixes-for-v5.17-2022-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf bpf: Defer freeing string after possible strlen() on it
  perf test: Fix arm64 perf_event_attr tests wrt --call-graph initialization
  libsubcmd: Fix use-after-free for realloc(..., 0)
  libperf: Fix perf_cpu_map__for_each_cpu macro
  perf cs-etm: Fix corrupt inject files when only last branch option is enabled
  perf cs-etm: No-op refactor of synth opt usage
  libperf: Fix 32-bit build for tests uint64_t printf
  tools headers UAPI: Sync linux/perf_event.h with the kernel sources
  perf trace: Avoid early exit due SIGCHLD from non-workload processes
  MAINTAINERS: Update Jiri's email address
2022-02-17 10:06:09 -08:00
Linus Torvalds
edbd6c628e Fixes module decompression when CONFIG_SYSFS=n
The only fix trickled down for v5.17-rc cycle so far is
 the fix for module decompression when CONFIG_SYSFS=n. This
 was reported through 0-day.
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmINZTUSHG1jZ3JvZkBr
 ZXJuZWwub3JnAAoJEM4jHQowkoinrUgP/Ai8ae+IHpscLbmBM/bdbqinKoOUo17M
 0nHTJIMnpf8J7B/MSx64VLBmfkCKEVdYfiQhkGcxAwmTcw7n+1+ekM5kTAhCRJfd
 Ej/4n0Q5cWl7SnCmhO94vBYFmDHn/NgoXhsAAFaZIRtCcyRWzXAb946lBgGUDOHl
 AtybuRIPJpU+6AeXeZyz9WCdYF2/uyz5HFs929SOE1iYazmpshei8bpOepVvH9LK
 Sa72/gVKC3CuVGx+6VwwvF+wL0dBvp8Gufg5/xh7xOhlls5N3AJUhJ7fkg7v1K9+
 gjeBhQrKzfbtd0MFsjnfNhN0zbQ/A+d3f2M6SNGcZK/xbDEnEypmcNyXiy0QKXjw
 QVSS5dXEI8bNoNPM2oqVmmCXliTqD3jY1mS7ynTwI7w0zUN8NL1y2tcF9xjBvk7O
 X7sirOUtiaEIfFkkt7BxHILEAVd/jqYg43g/cq0EMglxeRMJXJo04kY6EvzFSxCm
 bKUUV4d+wCq+JzQmPDIO01UcfKmlTC3a8bexusVz06MhDYW7KdVfNTDxu2/EQaNn
 gU3/v5SvmucV9Oo/norXbDf0jzDTSbmPc0yYbanXEV2CrHx0CEAAGMN5W+iNOWCz
 3YVMON3qu6nJcZuu7YemjKr8v8ZA5M37L9bUDlBZDCoEUqbaJV7wjRRjKqeL2d6F
 1gZ+hV6sqN+1
 =h/ue
 -----END PGP SIGNATURE-----

Merge tag 'modules-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull module fix from Luis Chamberlain:
 "Fixes module decompression when CONFIG_SYSFS=n

  The only fix trickled down for v5.17-rc cycle so far is the fix for
  module decompression when CONFIG_SYSFS=n. This was reported through
  0-day"

* tag 'modules-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
  module: fix building with sysfs disabled
2022-02-17 09:54:00 -08:00
Danie du Toit
7dbcda584e nfp: flower: netdev offload check for ip6gretap
IPv6 GRE tunnels are not being offloaded, this is caused by a missing
netdev offload check. The functionality of IPv6 GRE tunnel offloading
was previously added but this check was not included. Adding the
ip6gretap check allows IPv6 GRE tunnels to be offloaded correctly.

Fixes: f7536ffb09 ("nfp: flower: Allow ipv6gretap interface for offloading")
Signed-off-by: Danie du Toit <danie.dutoit@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20220217124820.40436-1-louis.peens@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17 09:50:45 -08:00
Eric Dumazet
d95d6320ba ipv6: fix data-race in fib6_info_hw_flags_set / fib6_purge_rt
Because fib6_info_hw_flags_set() is called without any synchronization,
all accesses to gi6->offload, fi->trap and fi->offload_failed
need some basic protection like READ_ONCE()/WRITE_ONCE().

BUG: KCSAN: data-race in fib6_info_hw_flags_set / fib6_purge_rt

read to 0xffff8881087d5886 of 1 bytes by task 13953 on cpu 0:
 fib6_drop_pcpu_from net/ipv6/ip6_fib.c:1007 [inline]
 fib6_purge_rt+0x4f/0x580 net/ipv6/ip6_fib.c:1033
 fib6_del_route net/ipv6/ip6_fib.c:1983 [inline]
 fib6_del+0x696/0x890 net/ipv6/ip6_fib.c:2028
 __ip6_del_rt net/ipv6/route.c:3876 [inline]
 ip6_del_rt+0x83/0x140 net/ipv6/route.c:3891
 __ipv6_dev_ac_dec+0x2b5/0x370 net/ipv6/anycast.c:374
 ipv6_dev_ac_dec net/ipv6/anycast.c:387 [inline]
 __ipv6_sock_ac_close+0x141/0x200 net/ipv6/anycast.c:207
 ipv6_sock_ac_close+0x79/0x90 net/ipv6/anycast.c:220
 inet6_release+0x32/0x50 net/ipv6/af_inet6.c:476
 __sock_release net/socket.c:650 [inline]
 sock_close+0x6c/0x150 net/socket.c:1318
 __fput+0x295/0x520 fs/file_table.c:280
 ____fput+0x11/0x20 fs/file_table.c:313
 task_work_run+0x8e/0x110 kernel/task_work.c:164
 tracehook_notify_resume include/linux/tracehook.h:189 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:175 [inline]
 exit_to_user_mode_prepare+0x160/0x190 kernel/entry/common.c:207
 __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline]
 syscall_exit_to_user_mode+0x20/0x40 kernel/entry/common.c:300
 do_syscall_64+0x50/0xd0 arch/x86/entry/common.c:86
 entry_SYSCALL_64_after_hwframe+0x44/0xae

write to 0xffff8881087d5886 of 1 bytes by task 1912 on cpu 1:
 fib6_info_hw_flags_set+0x155/0x3b0 net/ipv6/route.c:6230
 nsim_fib6_rt_hw_flags_set drivers/net/netdevsim/fib.c:668 [inline]
 nsim_fib6_rt_add drivers/net/netdevsim/fib.c:691 [inline]
 nsim_fib6_rt_insert drivers/net/netdevsim/fib.c:756 [inline]
 nsim_fib6_event drivers/net/netdevsim/fib.c:853 [inline]
 nsim_fib_event drivers/net/netdevsim/fib.c:886 [inline]
 nsim_fib_event_work+0x284f/0x2cf0 drivers/net/netdevsim/fib.c:1477
 process_one_work+0x3f6/0x960 kernel/workqueue.c:2307
 worker_thread+0x616/0xa70 kernel/workqueue.c:2454
 kthread+0x2c7/0x2e0 kernel/kthread.c:327
 ret_from_fork+0x1f/0x30

value changed: 0x22 -> 0x2a

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 1912 Comm: kworker/1:3 Not tainted 5.16.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events nsim_fib_event_work

Fixes: 0c5fcf9e24 ("IPv6: Add "offload failed" indication to routes")
Fixes: bb3c4ab93e ("ipv6: Add "offload" and "trap" indications to routes")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Amit Cohen <amcohen@nvidia.com>
Cc: Ido Schimmel <idosch@nvidia.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Link: https://lore.kernel.org/r/20220216173217.3792411-2-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17 09:48:24 -08:00
Eric Dumazet
9fcf986cc4 ipv4: fix data races in fib_alias_hw_flags_set
fib_alias_hw_flags_set() can be used by concurrent threads,
and is only RCU protected.

We need to annotate accesses to following fields of struct fib_alias:

    offload, trap, offload_failed

Because of READ_ONCE()WRITE_ONCE() limitations, make these
field u8.

BUG: KCSAN: data-race in fib_alias_hw_flags_set / fib_alias_hw_flags_set

read to 0xffff888134224a6a of 1 bytes by task 2013 on cpu 1:
 fib_alias_hw_flags_set+0x28a/0x470 net/ipv4/fib_trie.c:1050
 nsim_fib4_rt_hw_flags_set drivers/net/netdevsim/fib.c:350 [inline]
 nsim_fib4_rt_add drivers/net/netdevsim/fib.c:367 [inline]
 nsim_fib4_rt_insert drivers/net/netdevsim/fib.c:429 [inline]
 nsim_fib4_event drivers/net/netdevsim/fib.c:461 [inline]
 nsim_fib_event drivers/net/netdevsim/fib.c:881 [inline]
 nsim_fib_event_work+0x1852/0x2cf0 drivers/net/netdevsim/fib.c:1477
 process_one_work+0x3f6/0x960 kernel/workqueue.c:2307
 process_scheduled_works kernel/workqueue.c:2370 [inline]
 worker_thread+0x7df/0xa70 kernel/workqueue.c:2456
 kthread+0x1bf/0x1e0 kernel/kthread.c:377
 ret_from_fork+0x1f/0x30

write to 0xffff888134224a6a of 1 bytes by task 4872 on cpu 0:
 fib_alias_hw_flags_set+0x2d5/0x470 net/ipv4/fib_trie.c:1054
 nsim_fib4_rt_hw_flags_set drivers/net/netdevsim/fib.c:350 [inline]
 nsim_fib4_rt_add drivers/net/netdevsim/fib.c:367 [inline]
 nsim_fib4_rt_insert drivers/net/netdevsim/fib.c:429 [inline]
 nsim_fib4_event drivers/net/netdevsim/fib.c:461 [inline]
 nsim_fib_event drivers/net/netdevsim/fib.c:881 [inline]
 nsim_fib_event_work+0x1852/0x2cf0 drivers/net/netdevsim/fib.c:1477
 process_one_work+0x3f6/0x960 kernel/workqueue.c:2307
 process_scheduled_works kernel/workqueue.c:2370 [inline]
 worker_thread+0x7df/0xa70 kernel/workqueue.c:2456
 kthread+0x1bf/0x1e0 kernel/kthread.c:377
 ret_from_fork+0x1f/0x30

value changed: 0x00 -> 0x02

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 4872 Comm: kworker/0:0 Not tainted 5.17.0-rc3-syzkaller-00188-g1d41d2e82623-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events nsim_fib_event_work

Fixes: 90b93f1b31 ("ipv4: Add "offload" and "trap" indications to routes")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20220216173217.3792411-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17 09:48:24 -08:00
Mans Rullgard
430065e267 net: dsa: lan9303: add VLAN IDs to master device
If the master device does VLAN filtering, the IDs used by the switch
must be added for any frames to be received.  Do this in the
port_enable() function, and remove them in port_disable().

Fixes: a1292595e0 ("net: dsa: add new DSA switch driver for the SMSC-LAN9303")
Signed-off-by: Mans Rullgard <mans@mansr.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20220216204818.28746-1-mans@mansr.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17 09:32:13 -08:00
Mans Rullgard
017b355bbd net: dsa: lan9303: handle hwaccel VLAN tags
Check for a hwaccel VLAN tag on rx and use it if present.  Otherwise,
use __skb_vlan_pop() like the other tag parsers do.  This fixes the case
where the VLAN tag has already been consumed by the master.

Fixes: a1292595e0 ("net: dsa: add new DSA switch driver for the SMSC-LAN9303")
Signed-off-by: Mans Rullgard <mans@mansr.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20220216124634.23123-1-mans@mansr.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17 09:32:06 -08:00
Linus Torvalds
80d47f5de5 mm: don't try to NUMA-migrate COW pages that have other uses
Oded Gabbay reports that enabling NUMA balancing causes corruption with
his Gaudi accelerator test load:

 "All the details are in the bug, but the bottom line is that somehow,
  this patch causes corruption when the numa balancing feature is
  enabled AND we don't use process affinity AND we use GUP to pin pages
  so our accelerator can DMA to/from system memory.

  Either disabling numa balancing, using process affinity to bind to
  specific numa-node or reverting this patch causes the bug to
  disappear"

and Oded bisected the issue to commit 09854ba94c ("mm: do_wp_page()
simplification").

Now, the NUMA balancing shouldn't actually be changing the writability
of a page, and as such shouldn't matter for COW.  But it appears it
does.  Suspicious.

However, regardless of that, the condition for enabling NUMA faults in
change_pte_range() is nonsensical.  It uses "page_mapcount(page)" to
decide if a COW page should be NUMA-protected or not, and that makes
absolutely no sense.

The number of mappings a page has is irrelevant: not only does GUP get a
reference to a page as in Oded's case, but the other mappings migth be
paged out and the only reference to them would be in the page count.

Since we should never try to NUMA-balance a page that we can't move
anyway due to other references, just fix the code to use 'page_count()'.
Oded confirms that that fixes his issue.

Now, this does imply that something in NUMA balancing ends up changing
page protections (other than the obvious one of making the page
inaccessible to get the NUMA faulting information).  Otherwise the COW
simplification wouldn't matter - since doing the GUP on the page would
make sure it's writable.

The cause of that permission change would be good to figure out too,
since it clearly results in spurious COW events - but fixing the
nonsensical test that just happened to work before is obviously the
CorrectThing(tm) to do regardless.

Fixes: 09854ba94c ("mm: do_wp_page() simplification")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215616
Link: https://lore.kernel.org/all/CAFCwf10eNmwq2wD71xjUhqkvv5+_pJMR1nPug2RqNDcFT4H86Q@mail.gmail.com/
Reported-and-tested-by: Oded Gabbay <oded.gabbay@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-17 08:57:47 -08:00
Seth Forshee
b9208492fc vsock: remove vsock from connected table when connect is interrupted by a signal
vsock_connect() expects that the socket could already be in the
TCP_ESTABLISHED state when the connecting task wakes up with a signal
pending. If this happens the socket will be in the connected table, and
it is not removed when the socket state is reset. In this situation it's
common for the process to retry connect(), and if the connection is
successful the socket will be added to the connected table a second
time, corrupting the list.

Prevent this by calling vsock_remove_connected() if a signal is received
while waiting for a connection. This is harmless if the socket is not in
the connected table, and if it is in the table then removing it will
prevent list corruption from a double add.

Note for backporting: this patch requires d5afa82c97 ("vsock: correct
removal of socket from the list"), which is in all current stable trees
except 4.9.y.

Fixes: d021c34405 ("VSOCK: Introduce VM Sockets")
Signed-off-by: Seth Forshee <sforshee@digitalocean.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20220217141312.2297547-1-sforshee@digitalocean.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17 08:56:02 -08:00
Jonas Gorski
6aba04ee32 Revert "net: ethernet: bgmac: Use devm_platform_ioremap_resource_byname"
This reverts commit 3710e80952.

Since idm_base and nicpm_base are still optional resources not present
on all platforms, this breaks the driver for everything except Northstar
2 (which has both).

The same change was already reverted once with 755f5738ff ("net:
broadcom: fix a mistake about ioremap resource").

So let's do it again.

Fixes: 3710e80952 ("net: ethernet: bgmac: Use devm_platform_ioremap_resource_byname")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
[florian: Added comments to explain the resources are optional]
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220216184634.2032460-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17 08:45:34 -08:00
Maxime Ripard
6764eb690e
drm/vc4: crtc: Fix runtime_pm reference counting
At boot on the BCM2711, if the HDMI controllers are running, the CRTC
driver will disable itself and its associated HDMI controller to work
around a hardware bug that would leave some pixels stuck in a FIFO.

In order to avoid that issue, we need to run some operations in lockstep
between the CRTC and HDMI controller, and we need to make sure the HDMI
controller will be powered properly.

However, since we haven't enabled it through KMS, the runtime_pm state
is off at this point so we need to make sure the device is powered
through pm_runtime_resume_and_get, and once the operations are complete,
we call pm_runtime_put.

However, the HDMI controller will do that itself in its
post_crtc_powerdown, which means we'll end up calling pm_runtime_put for
a single pm_runtime_get, throwing the reference counting off. Let's
remove the pm_runtime_put call in the CRTC code in order to have the
proper counting.

Fixes: bca10db67b ("drm/vc4: crtc: Make sure the HDMI controller is powered when disabling")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220203102003.1114673-1-maxime@cerno.tech
2022-02-17 17:27:59 +01:00
Maxime Ripard
e40945ab7c
drm/vc4: hdmi: Unregister codec device on unbind
On bind we will register the HDMI codec device but we don't unregister
it on unbind, leading to a device leakage. Unregister our device at
unbind.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220127111452.222002-1-maxime@cerno.tech
2022-02-17 17:27:54 +01:00
Mike Marciniszyn
32f57cb1b2 IB/qib: Fix duplicate sysfs directory name
The qib driver load has been failing with the following message:

  sysfs: cannot create duplicate filename '/devices/pci0000:80/0000:80:02.0/0000:81:00.0/infiniband/qib0/ports/1/linkcontrol'

The patch below has two "linkcontrol" names causing the duplication.

Fix by using the correct "diag_counters" name on the second instance.

Fixes: 4a7aaf88c8 ("RDMA/qib: Use attributes for the port sysfs")
Link: https://lore.kernel.org/r/1645106372-23004-1-git-send-email-mike.marciniszyn@cornelisnetworks.com
Cc: <stable@vger.kernel.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-17 11:54:08 -04:00
Jon Lin
80808768e4
spi: rockchip: terminate dma transmission when slave abort
After slave abort, all DMA should be stopped, or it will affect the
next transmission and maybe abort again.

Signed-off-by: Jon Lin <jon.lin@rock-chips.com>
Link: https://lore.kernel.org/r/20220216014028.8123-3-jon.lin@rock-chips.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-02-17 15:33:18 +00:00
Jon Lin
9382df0a98
spi: rockchip: Fix error in getting num-cs property
Get num-cs u32 from dts of_node property rather than u16.

Signed-off-by: Jon Lin <jon.lin@rock-chips.com>
Link: https://lore.kernel.org/r/20220216014028.8123-2-jon.lin@rock-chips.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-02-17 15:33:17 +00:00
Prasad Kumpatla
d04ad245d6
regmap-irq: Update interrupt clear register for proper reset
With the existing logic where clear_ack is true (HW doesn’t support
auto clear for ICR), interrupt clear register reset is not handled
properly. Due to this only the first interrupts get processed properly
and further interrupts are blocked due to not resetting interrupt
clear register.

Example for issue case where Invert_ack is false and clear_ack is true:

    Say Default ISR=0x00 & ICR=0x00 and ISR is triggered with 2
    interrupts making ISR = 0x11.

    Step 1: Say ISR is set 0x11 (store status_buff = ISR). ISR needs to
            be cleared with the help of ICR once the Interrupt is processed.

    Step 2: Write ICR = 0x11 (status_buff), this will clear the ISR to 0x00.

    Step 3: Issue - In the existing code, ICR is written with ICR =
            ~(status_buff) i.e ICR = 0xEE -> This will block all the interrupts
            from raising except for interrupts 0 and 4. So expectation here is to
            reset ICR, which will unblock all the interrupts.

            if (chip->clear_ack) {
                 if (chip->ack_invert && !ret)
                  ........
                 else if (!ret)
                     ret = regmap_write(map, reg,
                            ~data->status_buf[i]);

So writing 0 and 0xff (when ack_invert is true) should have no effect, other
than clearing the ACKs just set.

Fixes: 3a6f0fb7b8 ("regmap: irq: Add support to clear ack registers")
Signed-off-by: Prasad Kumpatla <quic_pkumpatl@quicinc.com>
Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20220217085007.30218-1-quic_pkumpatl@quicinc.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-02-17 15:33:15 +00:00
Eric W. Biederman
0cbae9e24f ucounts: Handle wrapping in is_ucounts_overlimit
While examining is_ucounts_overlimit and reading the various messages
I realized that is_ucounts_overlimit fails to deal with counts that
may have wrapped.

Being wrapped should be a transitory state for counts and they should
never be wrapped for long, but it can happen so handle it.

Cc: stable@vger.kernel.org
Fixes: 21d1c5e386 ("Reimplement RLIMIT_NPROC on top of ucounts")
Link: https://lkml.kernel.org/r/20220216155832.680775-5-ebiederm@xmission.com
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-02-17 09:11:57 -06:00
Eric W. Biederman
c923a8e7ed ucounts: Move RLIMIT_NPROC handling after set_user
During set*id() which cred->ucounts to charge the the current process
to is not known until after set_cred_ucounts.  So move the
RLIMIT_NPROC checking into a new helper flag_nproc_exceeded and call
flag_nproc_exceeded after set_cred_ucounts.

This is very much an arbitrary subset of the places where we currently
change the RLIMIT_NPROC accounting, designed to preserve the existing
logic.

Fixing the existing logic will be the subject of another series of
changes.

Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20220216155832.680775-4-ebiederm@xmission.com
Fixes: 21d1c5e386 ("Reimplement RLIMIT_NPROC on top of ucounts")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-02-17 09:11:26 -06:00
Eric W. Biederman
a55d07294f ucounts: Base set_cred_ucounts changes on the real user
Michal Koutný <mkoutny@suse.com> wrote:
> Tasks are associated to multiple users at once. Historically and as per
> setrlimit(2) RLIMIT_NPROC is enforce based on real user ID.
>
> The commit 21d1c5e386 ("Reimplement RLIMIT_NPROC on top of ucounts")
> made the accounting structure "indexed" by euid and hence potentially
> account tasks differently.
>
> The effective user ID may be different e.g. for setuid programs but
> those are exec'd into already existing task (i.e. below limit), so
> different accounting is moot.
>
> Some special setresuid(2) users may notice the difference, justifying
> this fix.

I looked at cred->ucount and it is only used for rlimit operations
that were previously stored in cred->user.  Making the fact
cred->ucount can refer to a different user from cred->user a bug,
affecting all uses of cred->ulimit not just RLIMIT_NPROC.

Fix set_cred_ucounts to always use the real uid not the effective uid.

Further simplify set_cred_ucounts by noticing that set_cred_ucounts
somehow retained a draft version of the check to see if alloc_ucounts
was needed that checks the new->user and new->user_ns against the
current_real_cred().  Remove that draft version of the check.

All that matters for setting the cred->ucounts are the user_ns and uid
fields in the cred.

Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20220207121800.5079-4-mkoutny@suse.com
Link: https://lkml.kernel.org/r/20220216155832.680775-3-ebiederm@xmission.com
Reported-by: Michal Koutný <mkoutny@suse.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Fixes: 21d1c5e386 ("Reimplement RLIMIT_NPROC on top of ucounts")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-02-17 09:11:02 -06:00
Eric W. Biederman
8f2f9c4d82 ucounts: Enforce RLIMIT_NPROC not RLIMIT_NPROC+1
Michal Koutný <mkoutny@suse.com> wrote:

> It was reported that v5.14 behaves differently when enforcing
> RLIMIT_NPROC limit, namely, it allows one more task than previously.
> This is consequence of the commit 21d1c5e386 ("Reimplement
> RLIMIT_NPROC on top of ucounts") that missed the sharpness of
> equality in the forking path.

This can be fixed either by fixing the test or by moving the increment
to be before the test.  Fix it my moving copy_creds which contains
the increment before is_ucounts_overlimit.

In the case of CLONE_NEWUSER the ucounts in the task_cred changes.
The function is_ucounts_overlimit needs to use the final version of
the ucounts for the new process.  Which means moving the
is_ucounts_overlimit test after copy_creds is necessary.

Both the test in fork and the test in set_user were semantically
changed when the code moved to ucounts.  The change of the test in
fork was bad because it was before the increment.  The test in
set_user was wrong and the change to ucounts fixed it.  So this
fix only restores the old behavior in one lcation not two.

Link: https://lkml.kernel.org/r/20220204181144.24462-1-mkoutny@suse.com
Link: https://lkml.kernel.org/r/20220216155832.680775-2-ebiederm@xmission.com
Cc: stable@vger.kernel.org
Reported-by: Michal Koutný <mkoutny@suse.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Fixes: 21d1c5e386 ("Reimplement RLIMIT_NPROC on top of ucounts")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-02-17 09:10:33 -06:00
Fabrice Gasnier
32fde84362 usb: dwc2: drd: fix soft connect when gadget is unconfigured
When the gadget driver hasn't been (yet) configured, and the cable is
connected to a HOST, the SFTDISCON gets cleared unconditionally, so the
HOST tries to enumerate it.
At the host side, this can result in a stuck USB port or worse. When
getting lucky, some dmesg can be observed at the host side:
 new high-speed USB device number ...
 device descriptor read/64, error -110

Fix it in drd, by checking the enabled flag before calling
dwc2_hsotg_core_connect(). It will be called later, once configured,
by the normal flow:
- udc_bind_to_driver
 - usb_gadget_connect
   - dwc2_hsotg_pullup
     - dwc2_hsotg_core_connect

Fixes: 17f934024e ("usb: dwc2: override PHY input signals with usb role switch support")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com>
Link: https://lore.kernel.org/r/1644999135-13478-1-git-send-email-fabrice.gasnier@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-17 16:10:21 +01:00
Hans de Goede
62e3f0afe2 usb: dwc3: pci: Fix Bay Trail phy GPIO mappings
When the Bay Trail phy GPIO mappings where added cs and reset were swapped,
this did not cause any issues sofar, because sofar they were always driven
high/low at the same time.

Note the new mapping has been verified both in /sys/kernel/debug/gpio
output on Android factory images on multiple devices, as well as in
the schematics for some devices.

Fixes: 5741022cbd ("usb: dwc3: pci: Add GPIO lookup table on platforms without ACPI GPIO resources")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20220213130524.18748-3-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-17 16:10:03 +01:00