- Add an error message to be printed when a power domain marked as
"always on" is not actually on during initialization (Johan Hovold).
- Extend macros used for defining power management callbacks to allow
conditional exporting of noirq and late/early suspend/resume PM
callbacks (Paul Cercueil).
- Update the turbostat utility:
* Add support for two new platforms (Zhang Rui).
* Adjust energy unit for Sapphire Rapids (Zhang Rui).
* Do not dump TRL if turbo is not supported (Artem Bityutskiy).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmNEVSISHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxKqUP/2Etq/OiSpMx7TYfLw9wpR6o+QOMSPuz
fjEByubypGBA387Z7F3siRBhy0tRiiU0h3Ti/cPtxkNVx7lLc+yEJYYvbkIvm9Fg
lhev8ky9m/JB9vq+r4GWnUWW3DntM84a3f/iNoqEMDeUUy0YhsPLfdIcEKN5y/qU
Q1BQ+4XY4PSwY0G315JT+EDxkMK/YcoJyi+1dlSO5iilwz9DJBT2iD8M+18Lz1k4
GWUryTvcANSimx4WzLFvCXGHagA5Yv7czgIMI3eBhTaln7P2G/Jn0LCrbVcHJk0t
dBgoK/FN9akbXzg6697No32soDLxf5uLzsUsYS3Xn/tdfp1b5jTX2o548qKjNBl7
x0ot2hUp713fS/4RVGeRTQcwzy1dMyzp35pMh+B61qKyVZTPcOXC6OcXRR5FmXOP
iZo84snmPRhQPT387HpRwhpv6Pm/M9EIdCpXY1e21V8fpGdwxAbo+sLN4Kh+bkbS
McxC7q5gOKFij8ucyVGUFQPMkkOuRi0kQMNVH8raobVUvsI8bAGau61LTpudyG6p
6kmcaKu4oEpl2cYpgQwdF5fT8NAp3H8cGuzkJRJqHijs0E4Kuqd9bigoPigUQTT2
1D92S/kiUcxJo4pJZ0eDaxB05LDPMeGjnm1DUz5kh8PJE060tjUMlaTG9Gh4HjlH
uQjgblji2NVc
=IsN+
-----END PGP SIGNATURE-----
Merge tag 'pm-6.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"These update the turbostat utility, extend the macros used for
defining device power management callbacks and add a diagnostic
message to the generic power domains code.
Specifics:
- Add an error message to be printed when a power domain marked as
"always on" is not actually on during initialization (Johan
Hovold).
- Extend macros used for defining power management callbacks to allow
conditional exporting of noirq and late/early suspend/resume PM
callbacks (Paul Cercueil).
- Update the turbostat utility:
- Add support for two new platforms (Zhang Rui).
- Adjust energy unit for Sapphire Rapids (Zhang Rui).
- Do not dump TRL if turbo is not supported (Artem Bityutskiy)"
* tag 'pm-6.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
tools/power turbostat: version 2022.10.04
tools/power turbostat: Use standard Energy Unit for SPR Dram RAPL domain
tools/power turbostat: Do not dump TRL if turbo is not supported
tools/power turbostat: Add support for MeteorLake platforms
tools/power turbostat: Add support for RPL-S
PM: Improve EXPORT_*_DEV_PM_OPS macros
PM: domains: log failures to register always-on domains
* Improvements to the CPU topology subsystem, which fix some issues
where RISC-V would report bad topology information.
* The default NR_CPUS has increased to XLEN, and the maximum
configurable value is 512.
* The CD-ROM filesystems have been enabled in the defconfig.
* Support for THP_SWAP has been added for rv64 systems.
There are also a handful of cleanups and fixes throughout the tree.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmNAWgwTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRAuExnzX7sYicSiEACmuB9WuGZmAasKvmPgz7thyLqakg7/
cE4YK0MxgJxkhsXzYSAv1Fn+WUfX7DSzhK4OOM5wEngAYul7QoFdc84MF0DYKO+E
InjdOvVavzUsWYqETNCuMHPRK6xyzvfHCqqBDDxKHx5jUoicCQfFwJyHLw+cvouR
7WSJoFdvOEV01QyN5Qw9bQp7ASx61ZZX1yE6OAPc2/EJlDEA2QSnjBAi4M+n2ZCx
ZsQz+Dp9RfSU8/nIr13oGiL3Zm+kyXwdOS/8PaDqtrkyiGh6+vSeGqZZwRLVITP/
oUxqGEgnn2eFBD1y8vjsQNWMLWoi9Av4746Fxr8CEHX+jX1cp9CCkU2OkkLxaFcv
6XFtXPJIh/UjzVgPmjZxK+ArEX28QOM5IVyBFxsSl0dNtvyVqKpBXCV1RQ+fFHkO
ntHF3ZxibqOn8ZJmziCn0nzWSOqugNTdAhD4dJAbl58RB/IQtQT0OnHpmpXCG3xh
+/JBzy//xkr7u2HMqU69PzwPtWwZrENUV6jl5SHUDUoW8pySng2Pl4pbmTFqgWty
JTfc5EdyWOWyshhoSCtK2//bnVFryl2ntwGr3LIZrZxkiUiOeYjn+C/YedXZIRob
yy2CN+QanW/FXdIa4GMNeGc9sGDApd3/RtP+8L9mV1kWK6OE0EVskkI1UMCGXrIP
5JoE1jLMVhjcKQ==
=LJg6
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-6.1-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- Improvements to the CPU topology subsystem, which fix some issues
where RISC-V would report bad topology information.
- The default NR_CPUS has increased to XLEN, and the maximum
configurable value is 512.
- The CD-ROM filesystems have been enabled in the defconfig.
- Support for THP_SWAP has been added for rv64 systems.
There are also a handful of cleanups and fixes throughout the tree.
* tag 'riscv-for-linus-6.1-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: enable THP_SWAP for RV64
RISC-V: Print SSTC in canonical order
riscv: compat: s/failed/unsupported if compat mode isn't supported
RISC-V: Increase range and default value of NR_CPUS
cpuidle: riscv-sbi: Fix CPU_PM_CPU_IDLE_ENTER_xyz() macro usage
perf: RISC-V: throttle perf events
perf: RISC-V: exclude invalid pmu counters from SBI calls
riscv: enable CD-ROM file systems in defconfig
riscv: topology: fix default topology reporting
arm64: topology: move store_cpu_topology() to shared code
am sending out early due to me travelling next week. There is a
lone mm patch for which Andrew gave an informal ack at
https://lore.kernel.org/linux-mm/20220817102500.440c6d0a3fce296fdf91bea6@linux-foundation.org.
I will send the bulk of ARM work, as well as other
architectures, at the end of next week.
ARM:
* Account stage2 page table allocations in memory stats.
x86:
* Account EPT/NPT arm64 page table allocations in memory stats.
* Tracepoint cleanups/fixes for nested VM-Enter and emulated MSR accesses.
* Drop eVMCS controls filtering for KVM on Hyper-V, all known versions of
Hyper-V now support eVMCS fields associated with features that are
enumerated to the guest.
* Use KVM's sanitized VMCS config as the basis for the values of nested VMX
capabilities MSRs.
* A myriad event/exception fixes and cleanups. Most notably, pending
exceptions morph into VM-Exits earlier, as soon as the exception is
queued, instead of waiting until the next vmentry. This fixed
a longstanding issue where the exceptions would incorrecly become
double-faults instead of triggering a vmexit; the common case of
page-fault vmexits had a special workaround, but now it's fixed
for good.
* A handful of fixes for memory leaks in error paths.
* Cleanups for VMREAD trampoline and VMX's VM-Exit assembly flow.
* Never write to memory from non-sleepable kvm_vcpu_check_block()
* Selftests refinements and cleanups.
* Misc typo cleanups.
Generic:
* remove KVM_REQ_UNHALT
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmM2zwcUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroNpbwf+MlVeOlzE5SBdrJ0TEnLmKUel1lSz
QnZzP5+D65oD0zhCilUZHcg6G4mzZ5SdVVOvrGJvA0eXh25ruLNMF6jbaABkMLk/
FfI1ybN7A82hwJn/aXMI/sUurWv4Jteaad20JC2DytBCnsW8jUqc49gtXHS2QWy4
3uMsFdpdTAg4zdJKgEUfXBmQviweVpjjl3ziRyZZ7yaeo1oP7XZ8LaE1nR2l5m0J
mfjzneNm5QAnueypOh5KhSwIvqf6WHIVm/rIHDJ1HIFbgfOU0dT27nhb1tmPwAcE
+cJnnMUHjZqtCXteHkAxMClyRq0zsEoKk0OGvSOOMoq3Q0DavSXUNANOig==
=/hqX
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"The first batch of KVM patches, mostly covering x86.
ARM:
- Account stage2 page table allocations in memory stats
x86:
- Account EPT/NPT arm64 page table allocations in memory stats
- Tracepoint cleanups/fixes for nested VM-Enter and emulated MSR
accesses
- Drop eVMCS controls filtering for KVM on Hyper-V, all known
versions of Hyper-V now support eVMCS fields associated with
features that are enumerated to the guest
- Use KVM's sanitized VMCS config as the basis for the values of
nested VMX capabilities MSRs
- A myriad event/exception fixes and cleanups. Most notably, pending
exceptions morph into VM-Exits earlier, as soon as the exception is
queued, instead of waiting until the next vmentry. This fixed a
longstanding issue where the exceptions would incorrecly become
double-faults instead of triggering a vmexit; the common case of
page-fault vmexits had a special workaround, but now it's fixed for
good
- A handful of fixes for memory leaks in error paths
- Cleanups for VMREAD trampoline and VMX's VM-Exit assembly flow
- Never write to memory from non-sleepable kvm_vcpu_check_block()
- Selftests refinements and cleanups
- Misc typo cleanups
Generic:
- remove KVM_REQ_UNHALT"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (94 commits)
KVM: remove KVM_REQ_UNHALT
KVM: mips, x86: do not rely on KVM_REQ_UNHALT
KVM: x86: never write to memory from kvm_vcpu_check_block()
KVM: x86: Don't snapshot pending INIT/SIPI prior to checking nested events
KVM: nVMX: Make event request on VMXOFF iff INIT/SIPI is pending
KVM: nVMX: Make an event request if INIT or SIPI is pending on VM-Enter
KVM: SVM: Make an event request if INIT or SIPI is pending when GIF is set
KVM: x86: lapic does not have to process INIT if it is blocked
KVM: x86: Rename kvm_apic_has_events() to make it INIT/SIPI specific
KVM: x86: Rename and expose helper to detect if INIT/SIPI are allowed
KVM: nVMX: Make an event request when pending an MTF nested VM-Exit
KVM: x86: make vendor code check for all nested events
mailmap: Update Oliver's email address
KVM: x86: Allow force_emulation_prefix to be written without a reload
KVM: selftests: Add an x86-only test to verify nested exception queueing
KVM: selftests: Use uapi header to get VMX and SVM exit reasons/codes
KVM: x86: Rename inject_pending_events() to kvm_check_and_inject_events()
KVM: VMX: Update MTF and ICEBP comments to document KVM's subtle behavior
KVM: x86: Treat pending TRIPLE_FAULT requests as pending exceptions
KVM: x86: Morph pending exceptions to pending VM-Exits at queue time
...
Here is the big set of driver core and debug printk changes for 6.1-rc1.
Included in here is:
- dynamic debug updates for the core and the drm subsystem. The
drm changes have all been acked by the relevant maintainers.
- kernfs fixes for syzbot reported problems
- kernfs refactors and updates for cgroup requirements
- magic number cleanups and removals from the kernel tree (they
were not being used and they really did not actually do
anything.)
- other tiny cleanups
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY0BYUA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylozwCdFRlcghaf7XBUyNgRZRwMC+oQI8EAn1G/nEDE
6aFd2er41uK0IGQnSmYO
=OK0k
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the big set of driver core and debug printk changes for
6.1-rc1. Included in here is:
- dynamic debug updates for the core and the drm subsystem. The drm
changes have all been acked by the relevant maintainers
- kernfs fixes for syzbot reported problems
- kernfs refactors and updates for cgroup requirements
- magic number cleanups and removals from the kernel tree (they were
not being used and they really did not actually do anything)
- other tiny cleanups
All of these have been in linux-next for a while with no reported
issues"
* tag 'driver-core-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (74 commits)
docs: filesystems: sysfs: Make text and code for ->show() consistent
Documentation: NBD_REQUEST_MAGIC isn't a magic number
a.out: restore CMAGIC
device property: Add const qualifier to device_get_match_data() parameter
drm_print: add _ddebug descriptor to drm_*dbg prototypes
drm_print: prefer bare printk KERN_DEBUG on generic fn
drm_print: optimize drm_debug_enabled for jump-label
drm-print: add drm_dbg_driver to improve namespace symmetry
drm-print.h: include dyndbg header
drm_print: wrap drm_*_dbg in dyndbg descriptor factory macro
drm_print: interpose drm_*dbg with forwarding macros
drm: POC drm on dyndbg - use in core, 2 helpers, 3 drivers.
drm_print: condense enum drm_debug_category
debugfs: use DEFINE_SHOW_ATTRIBUTE to define debugfs_regset32_fops
driver core: use IS_ERR_OR_NULL() helper in device_create_groups_vargs()
Documentation: ENI155_MAGIC isn't a magic number
Documentation: NBD_REPLY_MAGIC isn't a magic number
nbd: remove define-only NBD_MAGIC, previously magic number
Documentation: FW_HEADER_MAGIC isn't a magic number
Documentation: EEPROM_MAGIC_VALUE isn't a magic number
...
This has been a busy release for regmap with one thing and other,
there's been an especially large interest in MMIO regmaps for some
reason. The bulk of the changes are cleanups but there are several user
visible changes too:
- Support for I/O ports in regmap-mmio.
- Support for accelerated noinc operations in regmap-mmio.
- Support for tracing the register values in bulk operations.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmM6sskACgkQJNaLcl1U
h9Bj1Qf+PLNz4gQq7ki06KI6+Liz6daN5j/bfUGSizx6rns5qOZxt1A/CwDFM5ZR
6aY+tGL4ksYfEUEZHsr7qaQKptWoLmwDVX0oYZqHdBf4Wf7I6iht8WCq68KwDCzz
zQoswzoLmVuRd6aplFifEF3SOqjBrTQO3gkXBteIeA6/i1pwO9wOJdM4ZU54FX+Q
zexv7H/E9uKVonrViBMLPczaPhge4+ILNEDekSUW4AZ0RmUZ6JjW3aOZbwR6ut1x
7bS3ric9xGhW4IQdOZISY+ARhPPworcgdK5GoqBfjWV2vYc0c1iCawvF73Wm/NJC
GMGc5FIBi3a82oCMmSR1dAcci9CRLw==
=TTeQ
-----END PGP SIGNATURE-----
Merge tag 'regmap-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"This has been a busy release for regmap with one thing and other,
there's been an especially large interest in MMIO regmaps for some
reason. The bulk of the changes are cleanups but there are several
user visible changes too:
- Support for I/O ports in regmap-mmio
- Support for accelerated noinc operations in regmap-mmio
- Support for tracing the register values in bulk operations"
* tag 'regmap-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: mmio: replace return 0 with break in switch statement
regmap: spi-avmm: Use swabXX_array() helpers
regmap: mmio: Use swabXX_array() helpers
swab: Add array operations
regmap: trace: Remove unneeded blank lines
regmap: trace: Remove explicit castings
regmap: trace: Remove useless check for NULL for bulk ops
regmap: mmio: Fix rebase error
regmap: check right noinc bounds in debug print
regmap: introduce value tracing for regmap bulk operations
regmap/hexagon: Properly fix the generic IO helpers
regmap: mmio: Support accelerared noinc operations
regmap: Support accelerated noinc operations
regmap: Make use of get_unaligned_be24(), put_unaligned_be24()
regmap: mmio: Fix MMIO accessors to avoid talking to IO port
regmap: mmio: Introduce IO accessors that can talk to IO port
regmap: mmio: Get rid of broken 64-bit IO
regmap: mmio: Remove mmio_relaxed member from context
Always-on PM domains must be on during initialisation or the domain is
currently silently rejected.
Print an error message in case an always-on domain is not on to make it
easier to debug drivers getting this wrong (e.g. by setting an always-on
genpd flag without making sure that the state matches).
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- A new driver for the FSL MU widget that provides platform MSI
- An update for the Realtek RTL irqchip to use a DT binding that
actually describes the hardware
- A handful of DT updates, as well as minor code and spelling fixes
-----BEGIN PGP SIGNATURE-----
iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmM5iLgPHG1hekBrZXJu
ZWwub3JnAAoJECPQ0LrRPXpDvUgP/1XJGuYUsx3yvhe+JQNlO4R/T7DoH4rKhVNI
KU2UPubXzE7e9D5Z91Po9QBF+Y/kyBdIL9Gh9GcK79dQzAJdPy4WxYPX3dWXKuvi
swnmOSeoTJgEXHYan7yViRS+lVk6QOrXUhpL2fiRlG4QOoX0KpMljPhr5+P0LM7Y
QmerbXAuufa1sGz51Cdsptn16hSNke8MMJ54UB0y9gHL0Rp8VuIqQCIno9t93RKM
JXCKpTFUwMbtGX0twREyN6kPCVb9cbXL8NGj1jz+MoUcesCnsaObxmIST/cxTml8
79U+5PVu+HCBE/sVco0MVKBMHw4MAnHZyQl4+8snsyH7NVlOJFQ9VH2+1GykOjJ+
4TCHZVUbHAgpkp1cB85tlANGUSdpn9mAR+nPc70eNTOYoSXgNITstjea0oIMVusA
KavzKzUCGuHeyhVeEpCdreaxfcUhpiSfYXbLIfVzrzYTsbmxLXcqHxwETxAFduW5
kp9owyj1ZpEmqQHTqHM7YgALjOq/u08/IJ55eY50XPy4/eztaOS+quMYWHc6k3TR
M6lqtx+AGBCc62lUibLf6O+tsJjguM/98zLdVhiWThSNsMhjQ8yMYPwlr6VDVCvi
FnsvYLxHi6rhJlEK+1WOMZ3Omub8l9dkrL/jVDPPCkHwkGtHB0nmGQLjpMryN01v
uhtroZZ4
=JTl5
-----END PGP SIGNATURE-----
Merge tag 'irqchip-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core
Pull irqchip updates from Marc Zyngier:
- A new driver for the FSL MU widget that provides platform MSI
- An update for the Realtek RTL irqchip to use a DT binding that
actually describes the hardware
- A handful of DT updates, as well as minor code and spelling fixes
Link: https://lore.kernel.org/r/20221002125554.3902840-1-maz@kernel.org
The memory-notify-based approach aims to handle meory-less nodes, however,
it just adds the complexity of code as pointed by David in thread [1].
The handling of memory-less nodes is introduced by commit 4faf8d950e
("hugetlb: handle memory hot-plug events"). >From its commit message, we
cannot find any necessity of handling this case. So, we can simply
register/unregister sysfs entries in register_node/unregister_node to
simlify the code.
BTW, hotplug callback added because in hugetlb_register_all_nodes() we
register sysfs nodes only for N_MEMORY nodes, seeing commit 9b5e5d0fdc,
which said it was a preparation for handling memory-less nodes via memory
hotplug. Since we want to remove memory hotplug, so make sure we only
register per-node sysfs for online (N_ONLINE) nodes in
hugetlb_register_all_nodes().
https://lore.kernel.org/linux-mm/60933ffc-b850-976c-78a0-0ee6e0ea9ef0@redhat.com/ [1]
Link: https://lkml.kernel.org/r/20220914072603.60293-3-songmuchun@bytedance.com
Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "simplify handling of per-node sysfs creation and removal",
v4.
This patch (of 2):
The following commit offload per-node sysfs creation and removal to a
kworker and did not say why it is needed. And it also said "I don't know
that this is absolutely required". It seems like the author was not sure
as well. Since it only complicates the code, this patch will revert the
changes to simplify the code.
39da08cb07 ("hugetlb: offload per node attribute registrations")
We could use memory hotplug notifier to do per-node sysfs creation and
removal instead of inserting those operations to node registration and
unregistration. Then, it can reduce the code coupling between node.c and
hugetlb.c. Also, it can simplify the code.
Link: https://lkml.kernel.org/r/20220914072603.60293-1-songmuchun@bytedance.com
Link: https://lkml.kernel.org/r/20220914072603.60293-2-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
- Add isupport for Tiger Lake in no-HWP mode to intel_pstate (Doug
Smythies).
- Update the AMD P-state driver (Perry Yuan):
* Fix wrong lowest perf fetch.
* Map desired perf into pstate scope for powersave governor.
* Update pstate frequency transition delay time.
* Fix initial highest_perf value.
* Clean up.
- Move max CPU capacity to sugov_policy in the schedutil cpufreq
governor (Lukasz Luba).
- Add SM6115 to cpufreq-dt blocklist (Adam Skladowski).
- Add support for Tegra239 and minor cleanups (Sumit Gupta, ye xingchen,
and Yang Yingliang).
- Add freq qos for qcom cpufreq driver and minor cleanups (Xuewen Yan,
and Viresh Kumar).
- Minor cleanups around functions called at module_init() (Xiu Jianfeng).
- Use module_init and add module_exit for bmips driver (Zhang Jianhua).
- Add AlderLake-N support to intel_idle (Zhang Rui).
- Replace strlcpy() with unused retval with strscpy() in intel_idle
(Wolfram Sang).
- Remove redundant check from cpuidle_switch_governor() (Yu Liao).
- Replace strlcpy() with unused retval with strscpy() in the powernv
cpuidle driver (Wolfram Sang).
- Drop duplicate word from a comment in the coupled cpuidle driver
(Jason Wang).
- Make rpm_resume() return -EINPROGRESS if RPM_NOWAIT is passed to it
in the flags and the device is about to resume (Rafael Wysocki).
- Add extra debugging statement for multiple active IRQs to system
wakeup handling code (Mario Limonciello).
- Replace strlcpy() with unused retval with strscpy() in the core
system suspend support code (Wolfram Sang).
- Update the intel_rapl power capping driver:
* Use standard Energy Unit for SPR Dram RAPL domain (Zhang Rui).
* Add support for RAPTORLAKE_S (Zhang Rui).
* Fix UBSAN shift-out-of-bounds issue (Chao Qin).
- Handle -EPROBE_DEFER when regulator is not probed on
mtk-ci-devfreq.c (AngeloGioacchino Del Regno).
- Fix message typo and use dev_err_probe() in rockchip-dfi.c
(Christophe JAILLET).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmM7OrYSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxeKAP/jFiZ1lhTGRngiVLMV6a6SSSy5xzzXZZ
b/V0oqsuUvWWo6CzVmfU4QfmKGr55+77NgI9Yh5qN6zJTEJmunuCYwVD80KdxPDJ
8SjMUNCACiVwfryLR1gFJlO+0BN4CWTxvto2gjGxzm0l1UQBACf71wm9MQCP8b7A
gcBNuOtM7o5NLywDB+/528SiF9AXfZKjkwXhJACimak5yQytaCJaqtOWtcG2KqYF
USunmqSB3IIVkAa5LJcwloc8wxHYo5mTPaWGGuSA65hfF42k3vJQ2/b8v8oTVza7
bKzhegErIYtL6B9FjB+P1FyknNOvT7BYr+4RSGLvaPySfjMn1bwz9fM1Epo59Guk
Azz3ExpaPixDh+x7b89W1Gb751FZU/zlWT+h1CNy5sOP/ChfxgCEBHw0mnWJ2Y0u
CPcI/Ch0FNQHG+PdbdGlyfvORHVh7te/t6dOhoEHXBue+1r3VkOo8tRGY9x+2IrX
/JB968u1r0oajF0btGwaDdbbWlyMRTzjrxVl3bwsuz/Kv/0JxsryND2JT0zkKAMZ
qYT29HQxhdE0Duw1chgAK6X+BsgP58Bu6LeM3mVcwnGPZE9QvcFa0GQh7z+H71AW
3yOGNmMVMqQSThBYFC6GDi7O2N1UEsLOMV9+ThTRh6D11nU4uiITM5QVIn8nWZGR
z3IZ52Jg0oeJ
=+3IL
-----END PGP SIGNATURE-----
Merge tag 'pm-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These add support for some new hardware, extend the existing hardware
support, fix some issues and clean up code
Specifics:
- Add isupport for Tiger Lake in no-HWP mode to intel_pstate (Doug
Smythies)
- Update the AMD P-state driver (Perry Yuan):
- Fix wrong lowest perf fetch
- Map desired perf into pstate scope for powersave governor
- Update pstate frequency transition delay time
- Fix initial highest_perf value
- Clean up
- Move max CPU capacity to sugov_policy in the schedutil cpufreq
governor (Lukasz Luba)
- Add SM6115 to cpufreq-dt blocklist (Adam Skladowski)
- Add support for Tegra239 and minor cleanups (Sumit Gupta, ye
xingchen, and Yang Yingliang)
- Add freq qos for qcom cpufreq driver and minor cleanups (Xuewen
Yan, and Viresh Kumar)
- Minor cleanups around functions called at module_init() (Xiu
Jianfeng)
- Use module_init and add module_exit for bmips driver (Zhang
Jianhua)
- Add AlderLake-N support to intel_idle (Zhang Rui)
- Replace strlcpy() with unused retval with strscpy() in intel_idle
(Wolfram Sang)
- Remove redundant check from cpuidle_switch_governor() (Yu Liao)
- Replace strlcpy() with unused retval with strscpy() in the powernv
cpuidle driver (Wolfram Sang)
- Drop duplicate word from a comment in the coupled cpuidle driver
(Jason Wang)
- Make rpm_resume() return -EINPROGRESS if RPM_NOWAIT is passed to it
in the flags and the device is about to resume (Rafael Wysocki)
- Add extra debugging statement for multiple active IRQs to system
wakeup handling code (Mario Limonciello)
- Replace strlcpy() with unused retval with strscpy() in the core
system suspend support code (Wolfram Sang)
- Update the intel_rapl power capping driver:
- Use standard Energy Unit for SPR Dram RAPL domain (Zhang Rui).
- Add support for RAPTORLAKE_S (Zhang Rui).
- Fix UBSAN shift-out-of-bounds issue (Chao Qin)
- Handle -EPROBE_DEFER when regulator is not probed on
mtk-ci-devfreq.c (AngeloGioacchino Del Regno)
- Fix message typo and use dev_err_probe() in rockchip-dfi.c
(Christophe JAILLET)"
* tag 'pm-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (29 commits)
cpufreq: qcom-cpufreq-hw: Add cpufreq qos for LMh
cpufreq: Add __init annotation to module init funcs
cpufreq: tegra194: change tegra239_cpufreq_soc to static
PM / devfreq: rockchip-dfi: Fix an error message
PM / devfreq: mtk-cci: Handle sram regulator probe deferral
powercap: intel_rapl: Use standard Energy Unit for SPR Dram RAPL domain
PM: runtime: Return -EINPROGRESS from rpm_resume() in the RPM_NOWAIT case
intel_idle: Add AlderLake-N support
powercap: intel_rapl: fix UBSAN shift-out-of-bounds issue
cpufreq: tegra194: Add support for Tegra239
cpufreq: qcom-cpufreq-hw: Fix uninitialized throttled_freq warning
cpufreq: intel_pstate: Add Tigerlake support in no-HWP mode
powercap: intel_rapl: Add support for RAPTORLAKE_S
cpufreq: amd-pstate: Fix initial highest_perf value
cpuidle: Remove redundant check in cpuidle_switch_governor()
PM: wakeup: Add extra debugging statement for multiple active IRQs
cpufreq: tegra194: Remove the unneeded result variable
PM: suspend: move from strlcpy() with unused retval to strscpy()
intel_idle: move from strlcpy() with unused retval to strscpy()
cpuidle: powernv: move from strlcpy() with unused retval to strscpy()
...
- Reimplement acpi_get_pci_dev() using the list of physical devices
associated with the given ACPI device object (Rafael Wysocki).
- Rename ACPI device object reference counting functions (Rafael
Wysocki).
- Rearrange ACPI device object initialization code (Rafael Wysocki).
- Drop parent field from struct acpi_device (Rafael Wysocki).
- Extend the the int3472-tps68470 driver to support multiple consumers
of a single TPS68470 along with the requisite framework-level
support (Daniel Scally).
- Filter out non-memory resources in is_memory(), add a helper
function to find all memory type resources of an ACPI device object
and use that function in 3 places (Heikki Krogerus).
- Add IRQ override quirks for Asus Vivobook K3402ZA/K3502ZA and ASUS
model S5402ZA (Tamim Khan, Kellen Renshaw).
- Fix acpi_dev_state_d0() kerneldoc (Sakari Ailus).
- Fix up suspend-to-idle support on ASUS Rembrandt laptops (Mario
Limonciello).
- Clean up ACPI platform devices support code (Andy Shevchenko, John
Garry).
- Clean up ACPI bus management code (Andy Shevchenko, ye xingchen).
- Add support for multiple DMA windows with different offsets to the
ACPI device enumeration code and use it on LoongArch (Jianmin Lv).
- Clean up the ACPI LPSS (Intel SoC) driver (Andy Shevchenko).
- Add a quirk for Dell Inspiron 14 2-in-1 for StorageD3Enable (Mario
Limonciello).
- Drop unused dev_fmt() and redundant 'HMAT' prefix from the HMAT
parsing code (Liu Shixin).
- Make ACPI FPDT parsing code avoid calling acpi_os_map_memory() on
invalid physical addresses (Hans de Goede).
- Silence missing-declarations warning related to Apple device
properties management (Lukas Wunner).
- Disable frequency invariance in the CPPC library if registers used
by cppc_get_perf_ctrs() are accessed via PCC (Jeremy Linton).
- Add ACPI disabled check to acpi_cpc_valid() (Perry Yuan).
- Fix Tx acknowledge in the PCC address space handler (Huisong Li).
- Use wait_for_completion_timeout() for PCC mailbox operations (Huisong
Li).
- Release resources on PCC address space setup failure path (Rafael
Mendonca).
- Remove unneeded result variables from APEI code (ye xingchen).
- Print total number of records found during BERT log parsing (Dmitry
Monakhov).
- Drop support for 3 _OSI strings that should not be necessary any
more and update documentation on custom _OSI strings so that adding
new ones is not encouraged any more (Mario Limonciello).
- Drop unneeded result variable from ec_write() (ye xingchen).
- Remove the leftover struct acpi_ac_bl from the ACPI AC driver (Hanjun
Guo).
- Reorder symbols to get rid of a few forward declarations in the ACPI
fan driver (Uwe Kleine-König).
- Add Toshiba Satellite/Portege Z830 ACPI backlight quirk (Arvid
Norlander).
- Add ARM DMA-330 controller to the supported list in the ACPI AMBA
driver (Vijayenthiran Subramaniam).
- Drop references to non-functional 01.org/linux-acpi web site from
MAINTAINERS and Kconfig help texts (Rafael Wysocki).
- Replace strlcpy() with unused retval with strscpy() in the ACPI
support code (Wolfram Sang).
- Do not initialize ret in main() in the pfrut utility (Shi junming).
- Drop useless ACPI DSDT override documentation (Rafael Wysocki).
- Fix a few typos and wording mistakes in the ACPI device enumeration
documentation (Jean Delvare).
- Introduce acpi_dev_uid_to_integer() to convert a _UID string into an
integer value (Andy Shevchenko).
- Use acpi_dev_uid_to_integer() in several places to unify _UID
handling (Andy Shevchenko).
- Drop unused pnpid32_to_pnpid() declaration from PNP code (Gaosheng
Cui).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmM7OhkSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRx/TkQALQ4TN451dPSj9jcYSNY6qZ/9b4P9Iym
TmRf3wO3+IVZQ8JajeKKRuVKNsW3sC0RcFkJJVmgZkydJBr1Uui2L0ZLzi8axGNy
RlbZm5NyBeFnlP0fA8Gb2iRMXVAUcRIx+RvZCulxxFmgQ8UhoU4wlVZWlEcko4TQ
hGp++lJYcRHR1NbVLSXZhFvzopKLdhGL6vB1Awsjb/I7TVqn23+k4jVRV1DYkIQ7
qgFM+Z7osRVZiVQbaPoOgdykeSa43qXu7Vgs7F/QeJuIiUYx59xDh0/WCJBxnuDM
cHGiaNnvuJghKmCg43X8+joaHEH/jCFyvBVGfiSzRvjz03WOPRs1XztwdEiCi+py
RcZGzrPaXmkCjNeytPRooiifyqm95HT7aMBN/aTvKBXDaGRrfPheXF+i2idl24HM
NrHqMaa0+5qoDGHLUEaf5znlCHfS+3lwq6+lGVrq/UGf6B3cP+9HwOyevEW493JX
4nuv69Y517moR9W3mBU8sAn5mUjshcka7pghRj7QnuoqRqWLbU3lIz8oUDHr84cI
ixpIPvt2KlZ5UjnN9aqu/6k70JkJvy4SrKjnx4iqu03ePmMrRc0Hcpy7+VMlgumD
tgN9aW+YDgy0/Z5QmO1MOvFodVmA5sX6+gnX1neAjuDdIo3LkJptlkO1fCx2jfQu
cgPQk1CtPOos
=xyUK
-----END PGP SIGNATURE-----
Merge tag 'acpi-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"ACPI and PNP updates for 6.1-rc1.
These rearrange the ACPI device object initialization code (to get rid
of a redundant parent pointer from struct acpi_device among other
things), unify the _UID handling, drop support for some _OSI strings
that should not be necessary any more, add new IDs to support more
hardware and some more quirks, fix a few issues and clean up code all
over.
Specifics:
- Reimplement acpi_get_pci_dev() using the list of physical devices
associated with the given ACPI device object (Rafael Wysocki)
- Rename ACPI device object reference counting functions (Rafael
Wysocki)
- Rearrange ACPI device object initialization code (Rafael Wysocki)
- Drop parent field from struct acpi_device (Rafael Wysocki)
- Extend the the int3472-tps68470 driver to support multiple
consumers of a single TPS68470 along with the requisite
framework-level support (Daniel Scally)
- Filter out non-memory resources in is_memory(), add a helper
function to find all memory type resources of an ACPI device object
and use that function in 3 places (Heikki Krogerus)
- Add IRQ override quirks for Asus Vivobook K3402ZA/K3502ZA and ASUS
model S5402ZA (Tamim Khan, Kellen Renshaw)
- Fix acpi_dev_state_d0() kerneldoc (Sakari Ailus)
- Fix up suspend-to-idle support on ASUS Rembrandt laptops (Mario
Limonciello)
- Clean up ACPI platform devices support code (Andy Shevchenko, John
Garry)
- Clean up ACPI bus management code (Andy Shevchenko, ye xingchen)
- Add support for multiple DMA windows with different offsets to the
ACPI device enumeration code and use it on LoongArch (Jianmin Lv)
- Clean up the ACPI LPSS (Intel SoC) driver (Andy Shevchenko)
- Add a quirk for Dell Inspiron 14 2-in-1 for StorageD3Enable (Mario
Limonciello)
- Drop unused dev_fmt() and redundant 'HMAT' prefix from the HMAT
parsing code (Liu Shixin)
- Make ACPI FPDT parsing code avoid calling acpi_os_map_memory() on
invalid physical addresses (Hans de Goede)
- Silence missing-declarations warning related to Apple device
properties management (Lukas Wunner)
- Disable frequency invariance in the CPPC library if registers used
by cppc_get_perf_ctrs() are accessed via PCC (Jeremy Linton)
- Add ACPI disabled check to acpi_cpc_valid() (Perry Yuan)
- Fix Tx acknowledge in the PCC address space handler (Huisong Li)
- Use wait_for_completion_timeout() for PCC mailbox operations
(Huisong Li)
- Release resources on PCC address space setup failure path (Rafael
Mendonca)
- Remove unneeded result variables from APEI code (ye xingchen)
- Print total number of records found during BERT log parsing (Dmitry
Monakhov)
- Drop support for 3 _OSI strings that should not be necessary any
more and update documentation on custom _OSI strings so that adding
new ones is not encouraged any more (Mario Limonciello)
- Drop unneeded result variable from ec_write() (ye xingchen)
- Remove the leftover struct acpi_ac_bl from the ACPI AC driver
(Hanjun Guo)
- Reorder symbols to get rid of a few forward declarations in the
ACPI fan driver (Uwe Kleine-König)
- Add Toshiba Satellite/Portege Z830 ACPI backlight quirk (Arvid
Norlander)
- Add ARM DMA-330 controller to the supported list in the ACPI AMBA
driver (Vijayenthiran Subramaniam)
- Drop references to non-functional 01.org/linux-acpi web site from
MAINTAINERS and Kconfig help texts (Rafael Wysocki)
- Replace strlcpy() with unused retval with strscpy() in the ACPI
support code (Wolfram Sang)
- Do not initialize ret in main() in the pfrut utility (Shi junming)
- Drop useless ACPI DSDT override documentation (Rafael Wysocki)
- Fix a few typos and wording mistakes in the ACPI device enumeration
documentation (Jean Delvare)
- Introduce acpi_dev_uid_to_integer() to convert a _UID string into
an integer value (Andy Shevchenko)
- Use acpi_dev_uid_to_integer() in several places to unify _UID
handling (Andy Shevchenko)
- Drop unused pnpid32_to_pnpid() declaration from PNP code (Gaosheng
Cui)"
* tag 'acpi-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (79 commits)
ACPI: LPSS: Deduplicate skipping device in acpi_lpss_create_device()
ACPI: LPSS: Replace loop with first entry retrieval
ACPI: x86: s2idle: Add another ID to s2idle_dmi_table
ACPI: x86: s2idle: Fix a NULL pointer dereference
MAINTAINERS: Drop records pointing to 01.org/linux-acpi
ACPI: Kconfig: Drop link to https://01.org/linux-acpi
ACPI: docs: Drop useless DSDT override documentation
ACPI: DPTF: Drop stale link from Kconfig help
ACPI: x86: s2idle: Add a quirk for ASUSTeK COMPUTER INC. ROG Flow X13
ACPI: x86: s2idle: Add a quirk for Lenovo Slim 7 Pro 14ARH7
ACPI: x86: s2idle: Add a quirk for ASUS ROG Zephyrus G14
ACPI: x86: s2idle: Add a quirk for ASUS TUF Gaming A17 FA707RE
ACPI: x86: s2idle: Add module parameter to prefer Microsoft GUID
ACPI: x86: s2idle: If a new AMD _HID is missing assume Rembrandt
ACPI: x86: s2idle: Move _HID handling for AMD systems into structures
platform/x86: int3472: Add board data for Surface Go2 IR camera
platform/x86: int3472: Support multiple gpio lookups in board data
platform/x86: int3472: Support multiple clock consumers
ACPI: bus: Add iterator for dependent devices
ACPI: scan: Add acpi_dev_get_next_consumer_dev()
...
Merge cpuidle changes, PM core changes and power capping changes for
6.1-rc1:
- Add AlderLake-N support to intel_idle (Zhang Rui).
- Replace strlcpy() with unused retval with strscpy() in intel_idle
(Wolfram Sang).
- Remove redundant check from cpuidle_switch_governor() (Yu Liao).
- Replace strlcpy() with unused retval with strscpy() in the powernv
cpuidle driver (Wolfram Sang).
- Drop duplicate word from a comment in the coupled cpuidle driver
(Jason Wang).
- Make rpm_resume() return -EINPROGRESS if RPM_NOWAIT is passed to it
in the flags and the device is about to resume (Rafael Wysocki).
- Add extra debugging statement for multiple active IRQs to system
wakeup handling code (Mario Limonciello).
- Replace strlcpy() with unused retval with strscpy() in the core
system suspend support code (Wolfram Sang).
- Update the intel_rapl power capping driver:
* Use standard Energy Unit for SPR Dram RAPL domain (Zhang Rui).
* Add support for RAPTORLAKE_S (Zhang Rui).
* Fix UBSAN shift-out-of-bounds issue (Chao Qin).
* pm-cpuidle:
intel_idle: Add AlderLake-N support
cpuidle: Remove redundant check in cpuidle_switch_governor()
intel_idle: move from strlcpy() with unused retval to strscpy()
cpuidle: powernv: move from strlcpy() with unused retval to strscpy()
cpuidle: coupled: Drop duplicate word from a comment
* pm-core:
PM: runtime: Return -EINPROGRESS from rpm_resume() in the RPM_NOWAIT case
* pm-sleep:
PM: wakeup: Add extra debugging statement for multiple active IRQs
PM: suspend: move from strlcpy() with unused retval to strscpy()
* powercap:
powercap: intel_rapl: Use standard Energy Unit for SPR Dram RAPL domain
powercap: intel_rapl: fix UBSAN shift-out-of-bounds issue
powercap: intel_rapl: Add support for RAPTORLAKE_S
Merge new material related to CPPC, PCC, APEI and OSI strings handling
for 6.1-rc1:
- Disable frequency invariance in the CPPC library if registers used
by cppc_get_perf_ctrs() are accessed via PCC (Jeremy Linton).
- Add ACPI disabled check to acpi_cpc_valid() (Perry Yuan).
- Fix Tx acknowledge in the PCC address space handler (Huisong Li).
- Use wait_for_completion_timeout() for PCC mailbox operations (Huisong
Li).
- Release resources on PCC address space setup failure path (Rafael
Mendonca).
- Remove unneeded result variables from APEI code (ye xingchen).
- Print total number of records found during BERT log parsing (Dmitry
Monakhov).
- Drop support for 3 _OSI strings that should not be necessary any
more and update documentation on custom _OSI strings so that adding
new ones is not encouraged any more (Mario Limonciello).
* acpi-cppc:
ACPI: CPPC: Disable FIE if registers in PCC regions
ACPI: CPPC: Add ACPI disabled check to acpi_cpc_valid()
* acpi-pcc:
ACPI: PCC: Fix Tx acknowledge in the PCC address space handler
ACPI: PCC: replace wait_for_completion()
ACPI: PCC: Release resources on address space setup failure path
* acpi-apei:
ACPI: APEI: Remove unneeded result variables
ACPI: APEI: Add BERT error log footer
* acpi-osi:
ACPI: OSI: Update Documentation on custom _OSI strings
ACPI: OSI: Remove Linux-HPI-Hybrid-Graphics _OSI string
ACPI: OSI: Remove Linux-Lenovo-NV-HDMI-Audio _OSI string
ACPI: OSI: Remove Linux-Dell-Video _OSI string
The prospective callers of rpm_resume() passing RPM_NOWAIT to it may
be confused when it returns 0 without actually resuming the device
which may happen if the device is suspending at the given time and it
will only resume when the suspend in progress has completed. To avoid
that confusion, return -EINPROGRESS from rpm_resume() in that case.
Since none of the current callers passing RPM_NOWAIT to rpm_resume()
check its return value, this change has no functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Add const qualifier to the device_get_match_data() parameter.
Some of the future users may utilize this function without
forcing the type.
All the same, dev_fwnode() may be used with a const qualifier.
Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220922135410.49694-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use IS_ERR_OR_NULL() helper in device_create_groups_vargs() to simplify code
and improve readiblity. No functional change.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20220914140753.3799982-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In following scenario(diagram), when one thread X running dev_coredumpm()
adds devcd device to the framework which sends uevent notification to
userspace and another thread Y reads this uevent and call to
devcd_data_write() which eventually try to delete the queued timer that
is not initialized/queued yet.
So, debug object reports some warning and in the meantime, timer is
initialized and queued from X path. and from Y path, it gets reinitialized
again and timer->entry.pprev=NULL and try_to_grab_pending() stucks.
To fix this, introduce mutex and a boolean flag to serialize the behaviour.
cpu0(X) cpu1(Y)
dev_coredump() uevent sent to user space
device_add() ======================> user space process Y reads the
uevents writes to devcd fd
which results into writes to
devcd_data_write()
mod_delayed_work()
try_to_grab_pending()
del_timer()
debug_assert_init()
INIT_DELAYED_WORK()
schedule_delayed_work()
debug_object_fixup()
timer_fixup_assert_init()
timer_setup()
do_init_timer()
/*
Above call reinitializes
the timer to
timer->entry.pprev=NULL
and this will be checked
later in timer_pending() call.
*/
timer_pending()
!hlist_unhashed_lockless(&timer->entry)
!h->pprev
/*
del_timer() checks h->pprev and finds
it to be NULL due to which
try_to_grab_pending() stucks.
*/
Link: https://lore.kernel.org/lkml/2e1f81e2-428c-f11f-ce92-eb11048cb271@quicinc.com/
Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
Link: https://lore.kernel.org/r/1663073424-13663-1-git-send-email-quic_mojha@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This merges the driver core changes in 6.0-rc7 into driver-core-next as
they are needed here as well for testing.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Variable min_stride is assigned a value that is never read, fix this by
replacing the return 0 with a break statement. This also makes the case
statement consistent with the other cases in the switch statement.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20220922080445.818020-1-colin.i.king@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
This reverts commit 71066545b4.
It causes boot problems on some systems, so revert it for now until it
is worked out.
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Saravana Kannan <saravanak@google.com>
Fixes: 71066545b4 ("driver core: Set fw_devlink.strict=1 by default")
Reported-by: Olof Johansson <olof@lixom.net>
Link: https://lore.kernel.org/r/CAOesGMjQHhTUMBGHQcME4JBkZCof2NEQ4gaM1GWFgH40+LN9AQ@mail.gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmMeQ2keHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGYRMH+gLNHiGirGZlm2GQ
tKaZQUy7MiXuIP0hGDonDIIIAmIVhnjm9MDG8KT4W8AvEd7ukncyYqJfwWeWQPhP
4mZcf6l3Z8Ke+qiaFpXpMPCxTyWcln1ox0EoNx2g9gdPxZntaRuuaTQVljUfTiey
aVPHxve8ip3G7jDoJnuLSxESOqWxkb8v/SshBP1E5bF5BZ+cgZRqq7FNigFqxjbk
wF29K09BVOPjdgkSvY/b0/SnL5KlSdMAv+FrPcJNGivcdIPgf/qJks5cI2HRUo7o
CpKgbcLorCVyD+d+zLonJBwIy3arbmKD8JqYnfdTSIqVOUqHXWUDfeydsH32u1Gu
lPSI2Hw=
=7LTL
-----END PGP SIGNATURE-----
Merge 6.0-rc5 into driver-core-next
We need the driver core and debugfs changes in this branch.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Directly check state of struct memory_block, no need a single function.
Link: https://lkml.kernel.org/r/20220827112043.187028-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Here are some small driver core and debugfs fixes for 6.0-rc5.
Included in here are:
- multiple attempts to get the arch_topology code to work properly on
non-cluster SMT systems. First attempt caused build breakages in
linux-next and 0-day, second try worked.
- debugfs fixes for a long-suffering memory leak. The pattern of
debugfs_remove(debugfs_lookup(...)) turns out to leak dentries, so
add debugfs_lookup_and_remove() to fix this problem. Also fix up
the scheduler debug code that highlighted this problem. Fixes for
other subsystems will be trickling in over the next few months for
this same issue once the debugfs function is merged.
All of these have been in linux-next since Wednesday with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYxuERw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylPqwCgjU6xlN2y/80HH+66k+yyzlxocE8AoLPgnGrA
dJZIGWFXExzO26tvMT52
=zGHA
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are some small driver core and debugfs fixes for 6.0-rc5.
Included in here are:
- multiple attempts to get the arch_topology code to work properly on
non-cluster SMT systems. First attempt caused build breakages in
linux-next and 0-day, second try worked.
- debugfs fixes for a long-suffering memory leak. The pattern of
debugfs_remove(debugfs_lookup(...)) turns out to leak dentries, so
add debugfs_lookup_and_remove() to fix this problem. Also fix up
the scheduler debug code that highlighted this problem. Fixes for
other subsystems will be trickling in over the next few months for
this same issue once the debugfs function is merged.
All of these have been in linux-next since Wednesday with no reported
problems"
* tag 'driver-core-6.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
arch_topology: Make cluster topology span at least SMT CPUs
sched/debug: fix dentry leak in update_sched_domain_debugfs
debugfs: add debugfs_lookup_and_remove()
driver core: fix driver_set_override() issue with empty strings
Revert "arch_topology: Make cluster topology span at least SMT CPUs"
arch_topology: Make cluster topology span at least SMT CPUs
make_class_name has been removed since
commit 39aba963d9 ("driver core: remove CONFIG_SYSFS_DEPRECATED_V2
but keep it for block devices"), so remove it.
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Link: https://lore.kernel.org/r/20220909063337.1146151-1-cuigaosheng1@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A fix for how we handle controller constraints on SPI message sizes,
only impacting systems with SPI controllers with very low limits like
the AMD controller used in the Steam Deck.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmMZ3OEACgkQJNaLcl1U
h9CKOwf+O49lVDWbWr1YSR1hLvUtqV+rZsk5+kHCurTbEsa6eOuajx9hqHtEpSg9
CDHODkBk1TZmyfhS6/qWU5YUVGFpskkxebsJkxvYOCqaTi5UCDCLMRTWTP4yW5Gq
YRVgjFRmPAcnfaJCFB9/1l6NOMez0EfbWwApPPh7PNYp/KiexfzTFYzWbwLLhvmX
piY2tkG1SXrUjszFOZdXeukPU+HdQy40u1V7Akcs9ihwRMiIjeo0iqlE/R8QO5i4
vgevTPKPhsQJWGnZsqJvu9gtG6+Eoac6Kp4nqk4W/OVXpQNBhDHTI94D7X78+5hS
xUwgHjYT4PjR9HmuYMksgjYov4DmFw==
=ZO8J
-----END PGP SIGNATURE-----
Merge tag 'regmap-fix-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fix from Mark Brown:
"A fix for how we handle controller constraints on SPI message sizes,
only impacting systems with SPI controllers with very low limits like
the AMD controller used in the Steam Deck"
* tag 'regmap-fix-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: spi: Reserve space for register address/padding
Currently cpu_clustergroup_mask() will return CPU mask if cluster span more
or the same CPUs as cpu_coregroup_mask(). This will result topology borken
on non-Cluster SMT machines when building with CONFIG_SCHED_CLUSTER=y.
Test with:
qemu-system-aarch64 -enable-kvm -machine virt \
-net none \
-cpu host \
-bios ./QEMU_EFI.fd \
-m 2G \
-smp 48,sockets=2,cores=12,threads=2 \
-kernel $Image \
-initrd $Rootfs \
-nographic
-append "rdinit=init console=ttyAMA0 sched_verbose loglevel=8"
We'll get below error:
[ 3.084568] BUG: arch topology borken
[ 3.084570] the SMT domain not a subset of the CLS domain
Since cluster is a level higher than SMT, fix this by making cluster
spans at least SMT CPUs.
Fixes: bfcc439743 ("arch_topology: Limit span of cpu_clustergroup_mask()")
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Ionela Voinescu <ionela.voinescu@arm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20220905122615.12946-1-yangyicong@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since we have a few helpers to swab elements of a given size in an array
use them instead of open coded variants.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220831212744.56435-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Since we have a few helpers to swab elements of a given size in an array
use them instead of open coded variants.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220831212744.56435-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
There is a few unneeded blank lines in some of event definitions,
remove them in order to make those definitions consistent with
the rest.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220901132336.33234-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
There is no need to have explicit castings to the same type the
variables are of. Remove the explicit castings.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20220901132336.33234-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
If the buffer pointer is NULL we already are in troubles since
regmap bulk API expects caller to provide valid parameters,
it dereferences that without any checks before we call for
traces.
Moreover, the current code will print garbage in the case of
buffer is NULL and length is not 0.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220901132336.33234-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Python likes to send an empty string for some sysfs files, including the
driver_override field. When commit 23d99baf9d ("PCI: Use
driver_set_override() instead of open-coding") moved the PCI core to use
the driver core function instead of hand-rolling their own handler, this
showed up as a regression from some userspace tools, like DPDK.
Fix this up by actually looking at the length of the string first
instead of trusting that userspace got it correct.
Fixes: 23d99baf9d ("PCI: Use driver_set_override() instead of open-coding")
Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: stable <stable@kernel.org>
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Tested-by: Huisong Li <lihuisong@huawei.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220901163734.3583106-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since commit cb1f65c1e1 ("PM: s2idle: ACPI: Fix wakeup interrupts
handling") was introduced the kernel can now handle multiple
simultaneous interrupts during wakeup. Ths uncovered some existing
subtle firmware bugs where multiple IRQs are unintentionally active.
To help with fixing those bugs add an extra message when PM debugging
is enabled that can show the individual IRQs triggered as if a variety
are fired they'll potentially be lost as /sys/power/pm_wakeup_irq only
contains the first one that triggered the wakeup after resume is
complete but all may be needed to demonstrate the whole picture.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215770
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
[ rjw: Added empty line after if () ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This reverts commit 6b66ca0bac as it
breaks the build on some arches as reported by the kernel test robot.
Link: https://lore.kernel.org/r/202209030824.SouwDV5M-lkp@intel.com
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 6b66ca0bac ("arch_topology: Make cluster topology span at least SMT CPUs")
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Ionela Voinescu <ionela.voinescu@arm.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently cpu_clustergroup_mask() will return CPU mask if cluster span more
or the same CPUs as cpu_coregroup_mask(). This will result topology borken
on non-Cluster SMT machines when building with CONFIG_SCHED_CLUSTER=y.
Test with:
qemu-system-aarch64 -enable-kvm -machine virt \
-net none \
-cpu host \
-bios ./QEMU_EFI.fd \
-m 2G \
-smp 48,sockets=2,cores=12,threads=2 \
-kernel $Image \
-initrd $Rootfs \
-nographic \
-append "rdinit=init console=ttyAMA0 sched_verbose loglevel=8"
We'll get below error:
[ 3.084568] BUG: arch topology borken
[ 3.084570] the SMT domain not a subset of the CLS domain
Since cluster is a level higher than SMT, fix this by making cluster
spans at least SMT CPUs.
Fixes: bfcc439743 ("arch_topology: Limit span of cpu_clustergroup_mask()")
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20220825092007.8129-1-yangyicong@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the gfp flag used for the memory allocation already has __GFP_ZERO,
then there is no need to explicitly clear the "struct devres_node". It is
already zeroed.
This saves a few cycles when using devm_zalloc() and co.
In the case of devres_alloc() (which calls __devres_alloc_node()), the
compiler could remove the test and the memset() because it should be able
to see that the __GFP_ZERO flag is set.
So this would make the code both faster and smaller.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/d255bd871484e63cdd628e819f929e2df59afb02.1658352383.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the case of firmware-upload, an instance of struct fw_upload is
allocated in firmware_upload_register(). This data needs to be freed
in fw_dev_release(). Create a new fw_upload_free() function in
sysfs_upload.c to handle the firmware-upload specific memory frees
and incorporate the missing kfree call for the fw_upload structure.
Fixes: 97730bbb24 ("firmware_loader: Add firmware-upload support")
Cc: stable <stable@kernel.org>
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20220831002518.465274-1-russell.h.weight@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the following code within firmware_upload_unregister(), the call to
device_unregister() could result in the dev_release function freeing the
fw_upload_priv structure before it is dereferenced for the call to
module_put(). This bug was found by the kernel test robot using
CONFIG_KASAN while running the firmware selftests.
device_unregister(&fw_sysfs->dev);
module_put(fw_upload_priv->module);
The problem is fixed by copying fw_upload_priv->module to a local variable
for use when calling device_unregister().
Fixes: 97730bbb24 ("firmware_loader: Add firmware-upload support")
Cc: stable <stable@kernel.org>
Reported-by: kernel test robot <oliver.sang@intel.com>
Reviewed-by: Matthew Gerlach <matthew.gerlach@linux.intel.com>
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20220829174557.437047-1-russell.h.weight@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Architectures which do not have cacheinfo such as ARM 32-bit would spit
out the following during boot:
Early cacheinfo failed, ret = -2
Treat -ENOENT specifically to silence this error since it means that the
platform does not support reporting its cache information.
Fixes: 3fcbf1c77d ("arch_topology: Fix cache attributes detection in the CPU hotplug path")
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Michael Walle <michael@walle.cc>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220805230736.1562801-1-f.fainelli@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Both __device_attach_driver() and __driver_attach() check the return
code of the bus_type.match() function to see if the device needs to be
added to the deferred probe list. After adding the device to the list,
the logic attempts to bind the device to the driver anyway, as if the
device had matched with the driver, which is not correct.
If __device_attach_driver() detects that the device in question is not
ready to match with a driver on the bus, then it doesn't make sense for
the device to attempt to bind with the current driver or continue
attempting to match with any of the other drivers on the bus. So, update
the logic in __device_attach_driver() to reflect this.
If __driver_attach() detects that a driver tried to match with a device
that is not ready to match yet, then the driver should not attempt to bind
with the device. However, the driver can still attempt to match and bind
with other devices on the bus, as drivers can be bound to multiple
devices. So, update the logic in __driver_attach() to reflect this.
Fixes: 656b8035b0 ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()")
Cc: stable@vger.kernel.org
Cc: Saravana Kannan <saravanak@google.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Saravana Kannan <saravanak@google.com>
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Link: https://lore.kernel.org/r/20220817184026.3468620-1-isaacmanjarres@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A dangling pointless "ret 0" was left in and some unneeded
whitespace can go too.
Fixes: 81c0386c13 ("regmap: mmio: Support accelerared noinc operations")
Reported-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220831141303.501548-1-linus.walleij@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Make acpi_cpc_valid() check if ACPI is disabled, so that its callers
don't need to check that separately. This will also cause the AMD
pstate driver to refuse to load right away when ACPI is disabled.
Also update the warning message in amd_pstate_init() to mention the
ACPI disabled case for completeness.
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
[ rjw: Subject edits, new changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We keep track of several kernel memory stats (total kernel memory, page
tables, stack, vmalloc, etc) on multiple levels (global, per-node,
per-memcg, etc). These stats give insights to users to how much memory
is used by the kernel and for what purposes.
Currently, memory used by KVM mmu is not accounted in any of those
kernel memory stats. This patch series accounts the memory pages
used by KVM for page tables in those stats in a new
NR_SECONDARY_PAGETABLE stat. This stat can be later extended to account
for other types of secondary pages tables (e.g. iommu page tables).
KVM has a decent number of large allocations that aren't for page
tables, but for most of them, the number/size of those allocations
scales linearly with either the number of vCPUs or the amount of memory
assigned to the VM. KVM's secondary page table allocations do not scale
linearly, especially when nested virtualization is in use.
From a KVM perspective, NR_SECONDARY_PAGETABLE will scale with KVM's
per-VM pages_{4k,2m,1g} stats unless the guest is doing something
bizarre (e.g. accessing only 4kb chunks of 2mb pages so that KVM is
forced to allocate a large number of page tables even though the guest
isn't accessing that much memory). However, someone would need to either
understand how KVM works to make that connection, or know (or be told) to
go look at KVM's stats if they're running VMs to better decipher the stats.
Furthermore, having NR_PAGETABLE side-by-side with NR_SECONDARY_PAGETABLE
is informative. For example, when backing a VM with THP vs. HugeTLB,
NR_SECONDARY_PAGETABLE is roughly the same, but NR_PAGETABLE is an order
of magnitude higher with THP. So having this stat will at the very least
prove to be useful for understanding tradeoffs between VM backing types,
and likely even steer folks towards potential optimizations.
The original discussion with more details about the rationale:
https://lore.kernel.org/all/87ilqoi77b.wl-maz@kernel.org
This stat will be used by subsequent patches to count KVM mmu
memory usage.
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220823004639.2387269-2-yosryahmed@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
We were using the wrong bound in the debug prints: this
needs to be the number of elements, not the number of bytes,
since we're indexing into an element-size typed array.
Fixes: c20cc099b3 ("regmap: Support accelerated noinc operations")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220823135700.265019-1-linus.walleij@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Currently, only one-register io operations support tracepoints with
value logging. For the regmap bulk operations developer can view
hw_start/hw_done tracepoints with starting reg number and registers
count to be reading or writing. This patch injects tracepoints with
dumping registers values in the hex format to regmap bulk reading
and writing.
Signed-off-by: Dmitry Rokosov <ddrokosov@sberdevices.ru>
Link: https://lore.kernel.org/r/20220816181451.5628-1-ddrokosov@sberdevices.ru
Signed-off-by: Mark Brown <broonie@kernel.org>
This reverts commit 9cbffc7a59.
There are a few more issues to fix that have been reported in the thread
for the original series [1]. We'll need to fix those before this will work.
So, revert it for now.
[1] - https://lore.kernel.org/lkml/20220601070707.3946847-1-saravanak@google.com/
Fixes: 9cbffc7a59 ("driver core: Delete driver_deferred_probe_check_state()")
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Peng Fan <peng.fan@nxp.com>
Tested-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220819221616.2107893-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently the max_raw_read and max_raw_write limits in regmap_spi struct
do not take into account the additional size of the transmitted register
address and padding. This may result in exceeding the maximum permitted
SPI message size, which could cause undefined behaviour, e.g. data
corruption.
Fix regmap_get_spi_bus() to properly adjust the above mentioned limits
by reserving space for the register address/padding as set in the regmap
configuration.
Fixes: f231ff38b7 ("regmap: spi: Set regmap max raw r/w from max_transfer_size")
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Reviewed-by: Lucas Tanure <tanureal@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20220818104851.429479-1-cristian.ciocaltea@collabora.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Use the newly added callback for accelerated noinc MMIO
to provide writesb, writesw, writesl, writesq, readsb, readsw,
readsl and readsq.
A special quirk is needed to deal with big endian regmaps: there
are no accelerated operations defined for big endian, so fall
back to calling the big endian operations itereatively for this
case.
The Hexagon architecture turns out to have an incomplete
<asm/io.h>: writesb() is not implemented. Fix this by doing
what other architectures do: include <asm-generic/io.h> into
the <asm/io.h> file.
Cc: Brian Cain <bcain@quicinc.com>
Cc: linux-hexagon@vger.kernel.org
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220816204832.265837-2-linus.walleij@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Several architectures have accelerated operations for MMIO
operations writing to a single register, such as writesb, writesw,
writesl, writesq, readsb, readsw, readsl and readsq but regmap
currently cannot use them because we have no hooks for providing
an accelerated noinc back-end for MMIO.
Solve this by providing reg_[read/write]_noinc callbacks for
the bus abstraction, so that the regmap-mmio bus can use this.
Currently I do not see a need to support this for custom regmaps
so it is only added to the bus.
Callbacks are passed a void * with the array of values and a
count which is the number of items of the byte chunk size for
the specific register width.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220816204832.265837-1-linus.walleij@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Merge series from Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
Currently regmap MMIO doesn't support IO ports, while being inconsistent
in used IO accessors. Fix the latter and extend framework with the
former.
Since we have a proper endianness converters for BE 24-bit data use
them. While at it, format the code using switch-cases as it's done
for the rest of the endianness handlers.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220726151213.71712-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Currently regmap MMIO is inconsistent with IO accessors. I.e.
the Big Endian counterparts are using ioreadXXbe() / iowriteXXbe()
which are not clean implementations of readXXbe().
That said, reimplement current Big Endian MMIO accessors by replacing
ioread()/iowrite() with respective read()/write() and swab() calls.
Note, there are no current in-kernel users that may utilize the
functionality of the IO ports on Big Endian hardware. All drivers
that use regmap MMIO either Little Endian, or they don't map IO
ports in a way that ioreadXX()/iowriteXX() may be utilized.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: William Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/r/20220808203401.35153-5-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Some users may use regmap MMIO for IO ports, and this can be done
by assigning ioreadXX()/iowriteXX() and their Big Endian counterparts
to the regmap context.
Add IO port support with a corresponding flag added.
While doing that, make sure that user won't select relaxed MMIO access
along with IO port because the latter have no relaxed variants.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: William Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/r/20220808203401.35153-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The current implementation, besides having no active users, is broken
by design of regmap. For 64-bit IO we need to supply 64-bit value,
otherwise there is no way to handle upper 32 bits in 64-bit register.
Hence, remove the broken IO accessors for good and wait for real user
that can fix entire regmap API for that.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: William Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/r/20220808203401.35153-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
There is no need to keep mmio_relaxed member in the context, it's
onetime used during generation of the context. Remove it.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: William Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/r/20220808203401.35153-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
arm64's method of defining a default cpu topology requires only minimal
changes to apply to RISC-V also. The current arm64 implementation exits
early in a uniprocessor configuration by reading MPIDR & claiming that
uniprocessor can rely on the default values.
This is appears to be a hangover from prior to '3102bc0e6ac7 ("arm64:
topology: Stop using MPIDR for topology information")', because the
current code just assigns default values for multiprocessor systems.
With the MPIDR references removed, store_cpu_topolgy() can be moved to
the common arch_topology code.
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Here is the set of driver core and kernfs changes for 6.0-rc1.
"biggest" thing in here is some scalability improvements for kernfs for
large systems. Other than that, included in here are:
- arch topology and cache info changes that have been reviewed
and discussed a lot.
- potential error path cleanup fixes
- deferred driver probe cleanups
- firmware loader cleanups and tweaks
- documentation updates
- other small things
All of these have been in the linux-next tree for a while with no
reported problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYuqCnw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ym/JgCcCnaycJY00ZPRQm3LQCyzfJ0HgqoAn2qxGV+K
NKycLeXZSnuvIA87dycE
=/4Jk
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core / kernfs updates from Greg KH:
"Here is the set of driver core and kernfs changes for 6.0-rc1.
The "biggest" thing in here is some scalability improvements for
kernfs for large systems. Other than that, included in here are:
- arch topology and cache info changes that have been reviewed and
discussed a lot.
- potential error path cleanup fixes
- deferred driver probe cleanups
- firmware loader cleanups and tweaks
- documentation updates
- other small things
All of these have been in the linux-next tree for a while with no
reported problems"
* tag 'driver-core-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (63 commits)
docs: embargoed-hardware-issues: fix invalid AMD contact email
firmware_loader: Replace kmap() with kmap_local_page()
sysfs docs: ABI: Fix typo in comment
kobject: fix Kconfig.debug "its" grammar
kernfs: Fix typo 'the the' in comment
docs: driver-api: firmware: add driver firmware guidelines. (v3)
arch_topology: Fix cache attributes detection in the CPU hotplug path
ACPI: PPTT: Leave the table mapped for the runtime usage
cacheinfo: Use atomic allocation for percpu cache attributes
drivers/base: fix userspace break from using bin_attributes for cpumap and cpulist
MAINTAINERS: Change mentions of mpm to olivia
docs: ABI: sysfs-devices-soc: Update Lee Jones' email address
docs: ABI: sysfs-class-pwm: Update Lee Jones' email address
Documentation/process: Add embargoed HW contact for LLVM
Revert "kernfs: Change kernfs_notify_list to llist."
ACPI: Remove the unused find_acpi_cpu_cache_topology()
arch_topology: Warn that topology for nested clusters is not supported
arch_topology: Add support for parsing sockets in /cpu-map
arch_topology: Set cluster identifier in each core/thread from /cpu-map
arch_topology: Limit span of cpu_clustergroup_mask()
...
- Make cpufreq_show_cpus() more straightforward (Viresh Kumar).
- Drop unnecessary CPU hotplug locking from store() used by cpufreq
sysfs attributes (Viresh Kumar).
- Make the ACPI cpufreq driver support the boost control interface on
Zhaoxin/Centaur processors (Tony W Wang-oc).
- Print a warning message on attempts to free an active cpufreq policy
which should never happen (Viresh Kumar).
- Fix grammar in the Kconfig help text for the loongson2 cpufreq
driver (Randy Dunlap).
- Use cpumask_var_t for an on-stack CPU mask in the ondemand cpufreq
governor (Zhao Liu).
- Add trace points for guest_halt_poll_ns grow/shrink to the haltpoll
cpuidle driver (Eiichi Tsukata).
- Modify intel_idle to treat C1 and C1E as independent idle states on
Sapphire Rapids (Artem Bityutskiy).
- Extend support for wakeirq to callback wrappers used during system
suspend and resume (Ulf Hansson).
- Defer waiting for device probe before loading a hibernation image
till the first actual device access to avoid possible deadlocks
reported by syzbot (Tetsuo Handa).
- Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP (Bjorn
Helgaas).
- Add Raptor Lake-P to the list of processors supported by the Intel
RAPL driver (George D Sworo).
- Add Alder Lake-N and Raptor Lake-P to the list of processors for
which Power Limit4 is supported in the Intel RAPL driver (Sumeet
Pawnikar).
- Make pm_genpd_remove() check genpd_debugfs_dir against NULL before
attempting to remove it (Hsin-Yi Wang).
- Change the Energy Model code to represent power in micro-Watts and
adjust its users accordingly (Lukasz Luba).
- Add new devfreq driver for Mediatek CCI (Cache Coherent
Interconnect) (Johnson Wang).
- Convert the Samsung Exynos SoC Bus bindings to DT schema of
exynos-bus.c (Krzysztof Kozlowski).
- Address kernel-doc warnings by adding the description for unused
fucntion parameters in devfreq core (Mauro Carvalho Chehab).
- Use NULL to pass a null pointer rather than zero according to the
function propotype in imx-bus.c (Colin Ian King).
- Print error message instead of error interger value in
tegra30-devfreq.c (Dmitry Osipenko).
- Add checks to prevent setting negative frequency QoS limits for
CPUs (Shivnandan Kumar).
- Update the pm-graph suite of utilities to the latest revision 5.9
including multiple improvements (Todd Brandt).
- Drop pme_interrupt reference from the PCI power management
documentation (Mario Limonciello).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmLoKy8SHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRx3+oQAJNVU+W14EaRPWXQRMuwBC5zk3hb6T9q
JqmMd8coEd+9/4ABAeRAWso1B26rUzB6JyBvw3lGH9OXInpYmvnJEhEPrTpK2h0D
U9HxEARuGJolrDm0X9NAkn7tKKMC9GnvPS9z2s7s+N97VFFWC/QiU+PHB0SypGNb
JxRfbVJZQCuxmNG9UeK+xeHFQ9lM2Z9ZdTxR71G0n7nQPPR+sUvnFufFby3Aogf3
XnBYfia+YNqkUlefxxwB5a0cFwPXOUGsQkIf4d64gZnq1TgZ+71kht1GEF08PDFl
wV8v1rOWuXEae8dozuf5xszp/eVyAqzgB+IShT9APREOO3Wg6I16XdBm8R1TGwCK
JTdZqnm6HVKBNqchEwYViJILX69rrNUT+AwHBWhtKKDNh3qeTuwi/JGTeDVN++en
xf3TNKx3LV31Nq6nWJFzDGLehfZMnAPkhfYohUBI7FNyblpk4mJRVcZ0bYI7UNnS
als77uoipvb5KdFCtdhxYBHd/y867NvXKa1qsAuDxusAsfJHf4SnlMdbgOepBH2y
jJg06CGrMDU3TZ8BL+WpqUYk4irQnAMs/159Txh7A6/dOnTjE7S9NHrENCwmt2og
FrHSLH1eLX6Sa4RSibiGHPC7mNULP2/TOtryf3zFdlIVcjm3NEU3bnfzx7nlJn05
8t6ObMxgMhWT
=XeLV
-----END PGP SIGNATURE-----
Merge tag 'pm-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These are mostly minor improvements all over including new CPU IDs for
the Intel RAPL driver, an Energy Model rework to use micro-Watt as the
power unit, cpufreq fixes and cleanus, cpuidle updates, devfreq
updates, documentation cleanups and a new version of the pm-graph
suite of utilities.
Specifics:
- Make cpufreq_show_cpus() more straightforward (Viresh Kumar).
- Drop unnecessary CPU hotplug locking from store() used by cpufreq
sysfs attributes (Viresh Kumar).
- Make the ACPI cpufreq driver support the boost control interface on
Zhaoxin/Centaur processors (Tony W Wang-oc).
- Print a warning message on attempts to free an active cpufreq
policy which should never happen (Viresh Kumar).
- Fix grammar in the Kconfig help text for the loongson2 cpufreq
driver (Randy Dunlap).
- Use cpumask_var_t for an on-stack CPU mask in the ondemand cpufreq
governor (Zhao Liu).
- Add trace points for guest_halt_poll_ns grow/shrink to the haltpoll
cpuidle driver (Eiichi Tsukata).
- Modify intel_idle to treat C1 and C1E as independent idle states on
Sapphire Rapids (Artem Bityutskiy).
- Extend support for wakeirq to callback wrappers used during system
suspend and resume (Ulf Hansson).
- Defer waiting for device probe before loading a hibernation image
till the first actual device access to avoid possible deadlocks
reported by syzbot (Tetsuo Handa).
- Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP (Bjorn
Helgaas).
- Add Raptor Lake-P to the list of processors supported by the Intel
RAPL driver (George D Sworo).
- Add Alder Lake-N and Raptor Lake-P to the list of processors for
which Power Limit4 is supported in the Intel RAPL driver (Sumeet
Pawnikar).
- Make pm_genpd_remove() check genpd_debugfs_dir against NULL before
attempting to remove it (Hsin-Yi Wang).
- Change the Energy Model code to represent power in micro-Watts and
adjust its users accordingly (Lukasz Luba).
- Add new devfreq driver for Mediatek CCI (Cache Coherent
Interconnect) (Johnson Wang).
- Convert the Samsung Exynos SoC Bus bindings to DT schema of
exynos-bus.c (Krzysztof Kozlowski).
- Address kernel-doc warnings by adding the description for unused
function parameters in devfreq core (Mauro Carvalho Chehab).
- Use NULL to pass a null pointer rather than zero according to the
function propotype in imx-bus.c (Colin Ian King).
- Print error message instead of error interger value in
tegra30-devfreq.c (Dmitry Osipenko).
- Add checks to prevent setting negative frequency QoS limits for
CPUs (Shivnandan Kumar).
- Update the pm-graph suite of utilities to the latest revision 5.9
including multiple improvements (Todd Brandt).
- Drop pme_interrupt reference from the PCI power management
documentation (Mario Limonciello)"
* tag 'pm-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (27 commits)
powercap: RAPL: Add Power Limit4 support for Alder Lake-N and Raptor Lake-P
PM: QoS: Add check to make sure CPU freq is non-negative
PM: hibernate: defer device probing when resuming from hibernation
intel_idle: make SPR C1 and C1E be independent
cpufreq: ondemand: Use cpumask_var_t for on-stack cpu mask
cpufreq: loongson2: fix Kconfig "its" grammar
pm-graph v5.9
cpufreq: Warn users while freeing active policy
cpufreq: scmi: Support the power scale in micro-Watts in SCMI v3.1
firmware: arm_scmi: Get detailed power scale from perf
Documentation: EM: Switch to micro-Watts scale
PM: EM: convert power field to micro-Watts precision and align drivers
PM / devfreq: tegra30: Add error message for devm_devfreq_add_device()
PM / devfreq: imx-bus: use NULL to pass a null pointer rather than zero
PM / devfreq: shut up kernel-doc warnings
dt-bindings: interconnect: samsung,exynos-bus: convert to dtschema
PM / devfreq: mediatek: Introduce MediaTek CCI devfreq driver
dt-bindings: interconnect: Add MediaTek CCI dt-bindings
PM: domains: Ensure genpd_debugfs_dir exists before remove
PM: runtime: Extend support for wakeirq for force_suspend|resume
...
The big thing this release is a big cleanup of the interrupt code from
Aidan MacDonald, plus a few new API updates:
- Rework of the interrupt code, making it much simpler and easier to
extend.
- Support for device specific update bits operations with devices that
otherwise use bitstream interfaces.
- Support for bit operations on fields as well as whole registers.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmLnvh0ACgkQJNaLcl1U
h9DlbQf/RELySinbi6WTIthuigJTfQq7xFIrR3LsEIrLWTuKsb+mBtxPkv0sAUF4
lmTtxmixtCAp3z2xWLXTw99dxDcII49YPmR1TzKZ9vBsK0vkAof6t7BQyhFoICpy
cbGdw4Mqi3qOHHeDH3obNbYhz1IwWL47Q0eASkNyaHrrnxykIyjeJ0TTURoVWJpX
FEzGwvFtH4+5w3yc0aE+WvjzHXgPj/xAsyE835TF9jv8cW9a2/KAYPLo7gW+icaz
Qx4JrmwMBuxzJEuaRvx2dxtasDCZKPFyod1cKS2gTpF4OnNKHocDwC3cSoGDLztr
BRljUV3VfWzOB8DdeAaB5XQM8LJeiw==
=talc
-----END PGP SIGNATURE-----
Merge tag 'regmap-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"The big thing this release is a big cleanup of the interrupt code from
Aidan MacDonald, plus a few new API updates:
- Rework of the interrupt code, making it much simpler and easier to
extend
- Support for device specific update bits operations with devices
that otherwise use bitstream interfaces
- Support for bit operations on fields as well as whole registers"
* tag 'regmap-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: permit to set reg_update_bits with bulk implementation
regmap: add WARN_ONCE when invalid mask is provided to regmap_field_init()
regmap-irq: Fix bug in regmap_irq_get_irq_reg_linear()
regmap: cache: Add extra parameter check in regcache_init
regmap-irq: Deprecate the not_fixed_stride flag
regmap-irq: Add get_irq_reg() callback
regmap-irq: Fix inverted handling of unmask registers
regmap-irq: Deprecate type registers and virtual registers
regmap-irq: Introduce config registers for irq types
regmap-irq: Refactor checks for status bulk read support
regmap-irq: Remove mask_writeonly and regmap_irq_update_bits()
regmap-irq: Remove inappropriate uses of regmap_irq_update_bits()
regmap-irq: Remove an unnecessary restriction on type_in_mask
regmap-irq: Cleanup sizeof(...) use in memory allocation
regmap-irq: Remove unused type_reg_stride field
regmap-irq: Convert bool bitfields to unsigned int
regmap: Don't warn about cache only mode for devices with no cache
regmap: provide regmap_field helpers for simple bit operations
regmap: cache: Fix syntax errors in comments
Merge core device power management changes for v5.20-rc1:
- Extend support for wakeirq to callback wrappers used during system
suspend and resume (Ulf Hansson).
- Defer waiting for device probe before loading a hibernation image
till the first actual device access to avoid possible deadlocks
reported by syzbot (Tetsuo Handa).
- Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP (Bjorn
Helgaas).
- Add Raptor Lake-P to the list of processors supported by the Intel
RAPL driver (George D Sworo).
- Add Alder Lake-N and Raptor Lake-P to the list of processors for
which Power Limit4 is supported in the Intel RAPL driver (Sumeet
Pawnikar).
- Make pm_genpd_remove() check genpd_debugfs_dir against NULL before
attempting to remove it (Hsin-Yi Wang).
- Change the Energy Model code to represent power in micro-Watts and
adjust its users accordingly (Lukasz Luba).
* pm-core:
PM: runtime: Extend support for wakeirq for force_suspend|resume
* pm-sleep:
PM: hibernate: defer device probing when resuming from hibernation
PM: wakeup: Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP
* powercap:
powercap: RAPL: Add Power Limit4 support for Alder Lake-N and Raptor Lake-P
powercap: intel_rapl: Add support for RAPTORLAKE_P
* pm-domains:
PM: domains: Ensure genpd_debugfs_dir exists before remove
* pm-em:
cpufreq: scmi: Support the power scale in micro-Watts in SCMI v3.1
firmware: arm_scmi: Get detailed power scale from perf
Documentation: EM: Switch to micro-Watts scale
PM: EM: convert power field to micro-Watts precision and align drivers
The use of kmap() is being deprecated in favor of kmap_local_page().
Two main problems with kmap(): (1) It comes with an overhead as mapping
space is restricted and protected by a global lock for synchronization and
(2) kmap() also requires global TLB invalidation when the kmap’s pool
wraps and it might block when the mapping space is fully utilized until a
slot becomes available.
kmap_local_page() is preferred over kmap() and kmap_atomic(). Where it
cannot mechanically replace the latters, code refactor should be considered
(special care must be taken if kernel virtual addresses are aliases in
different contexts).
With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
Call kmap_local_page() in firmware_loader wherever kmap() is currently
used. In firmware_rw() use the helpers copy_{from,to}_page() instead of
open coding the local mappings + memcpy().
Successfully tested with "firmware" selftests on a QEMU/KVM 32-bits VM
with 4GB RAM, booting a kernel with HIGHMEM64GB enabled.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Link: https://lore.kernel.org/r/20220714235030.12732-1-fmdefrancesco@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
init_cpu_topology() is called only once at the boot and all the cache
attributes are detected early for all the possible CPUs. However when
the CPUs are hotplugged out, the cacheinfo gets removed. While the
attributes are added back when the CPUs are hotplugged back in as part
of CPU hotplug state machine, it ends up called quite late after the
update_siblings_masks() are called in the secondary_start_kernel()
resulting in wrong llc_sibling_masks.
Move the call to detect_cache_attributes() inside update_siblings_masks()
to ensure the cacheinfo is updated before the LLC sibling masks are
updated. This will fix the incorrect LLC sibling masks generated when
the CPUs are hotplugged out and hotplugged back in again.
Reported-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20220720-arch_topo_fixes-v3-3-43d696288e84@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On couple of architectures like RISC-V and ARM64, we need to detect
cache attribues quite early during the boot when the secondary CPUs
start. So we will call detect_cache_attributes in the atomic context
and since use of normal allocation can sleep, we will end up getting
"sleeping in the atomic context" bug splat.
In order avoid that, move the allocation to use atomic version in
preparation to move the actual detection of cache attributes in the
CPU hotplug path which is atomic.
Cc: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20220720-arch_topo_fixes-v3-1-43d696288e84@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A regmap may still require to set a custom reg_update_bits instead of
relying to the regmap_bus_read/write general function.
Permit to set it in the map if provided by the regmap config.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Link: https://lore.kernel.org/r/20220715201032.19507-1-ansuelsmth@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Using bin_attributes with a 0 size causes fstat and friends to return that
0 size. This breaks userspace code that retrieves the size before reading
the file. Rather than reverting 75bd50fa84 ("drivers/base/node.c: use
bin_attribute to break the size limitation of cpumap ABI") let's put in a
size value at compile time.
For cpulist the maximum size is on the order of
NR_CPUS * (ceil(log10(NR_CPUS)) + 1)/2
which for 8192 is 20480 (8192 * 5)/2. In order to get near that you'd need
a system with every other CPU on one node. For example: (0,2,4,8, ... ).
To simplify the math and support larger NR_CPUS in the future we are using
(NR_CPUS * 7)/2. We also set it to a min of PAGE_SIZE to retain the older
behavior for smaller NR_CPUS.
The cpumap file the size works out to be NR_CPUS/4 + NR_CPUS/32 - 1
(or NR_CPUS * 9/32 - 1) including the ","s.
Add a set of macros for these values to cpumask.h so they can be used in
multiple places. Apply these to the handful of such files in
drivers/base/topology.c as well as node.c.
As an example, on an 80 cpu 4-node system (NR_CPUS == 8192):
before:
-r--r--r--. 1 root root 0 Jul 12 14:08 system/node/node0/cpulist
-r--r--r--. 1 root root 0 Jul 11 17:25 system/node/node0/cpumap
after:
-r--r--r--. 1 root root 28672 Jul 13 11:32 system/node/node0/cpulist
-r--r--r--. 1 root root 4096 Jul 13 11:31 system/node/node0/cpumap
CONFIG_NR_CPUS = 16384
-r--r--r--. 1 root root 57344 Jul 13 14:03 system/node/node0/cpulist
-r--r--r--. 1 root root 4607 Jul 13 14:02 system/node/node0/cpumap
The actual number of cpus doesn't matter for the reported size since they
are based on NR_CPUS.
Fixes: 75bd50fa84 ("drivers/base/node.c: use bin_attribute to break the size limitation of cpumap ABI")
Fixes: bb9ec13d15 ("topology: use bin_attribute to break the size limitation of cpumap ABI")
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Yury Norov <yury.norov@gmail.com>
Cc: stable@vger.kernel.org
Acked-by: Yury Norov <yury.norov@gmail.com> (for include/linux/cpumask.h)
Signed-off-by: Phil Auld <pauld@redhat.com>
Link: https://lore.kernel.org/r/20220715134924.3466194-1-pauld@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Both genpd_debug_add() and genpd_debug_remove() may be called
indirectly by other drivers while genpd_debugfs_dir is not yet
set. For example, drivers can call pm_genpd_init() in probe or
pm_genpd_init() in probe fail/cleanup path:
pm_genpd_init()
--> genpd_debug_add()
pm_genpd_remove()
--> genpd_remove()
--> genpd_debug_remove()
At this time, genpd_debug_init() may not yet be called.
genpd_debug_add() checks that if genpd_debugfs_dir is NULL, it
will return directly. Make sure this is also checked
in pm_genpd_remove(), otherwise components under debugfs root
which has the same name as other components under pm_genpd may
be accidentally removed, since NULL represents debugfs root.
Fixes: 718072ceb2 ("PM: domains: create debugfs nodes when adding power domains")
Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
solved and the nightmare is complete, here's the next one: speculating
after RET instructions and leaking privileged information using the now
pretty much classical covert channels.
It is called RETBleed and the mitigation effort and controlling
functionality has been modelled similar to what already existing
mitigations provide.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmLKqAgACgkQEsHwGGHe
VUoM5w/8CSvwPZ3otkhmu8MrJPtWc7eLDPjYN4qQP+19e+bt094MoozxeeWG2wmp
hkDJAYHT2Oik/qDuEdhFgNYwS7XGgbV3Py3B8syO4//5SD5dkOSG+QqFXvXMdFri
YsVqqNkjJOWk/YL9Ql5RS/xQewsrr0OqEyWWocuI6XAvfWV4kKvlRSd+6oPqtZEO
qYlAHTXElyIrA/gjmxChk1HTt5HZtK3uJLf4twNlUfzw7LYFf3+sw3bdNuiXlyMr
WcLXMwGpS0idURwP3mJa7JRuiVBzb4+kt8mWwWqA02FkKV45FRRRFhFUsy667r00
cdZBaWdy+b7dvXeliO3FN/x1bZwIEUxmaNy1iAClph4Ifh0ySPUkxAr8EIER7YBy
bstDJEaIqgYg8NIaD4oF1UrG0ZbL0ImuxVaFdhG1hopQsh4IwLSTLgmZYDhfn/0i
oSqU0Le+A7QW9s2A2j6qi7BoAbRW+gmBuCgg8f8ECYRkFX1ZF6mkUtnQxYrU7RTq
rJWGW9nhwM9nRxwgntZiTjUUJ2HtyXEgYyCNjLFCbEBfeG5QTg7XSGFhqDbgoymH
85vsmSXYxgTgQ/kTW7Fs26tOqnP2h1OtLJZDL8rg49KijLAnISClEgohYW01CWQf
ZKMHtz3DM0WBiLvSAmfGifScgSrLB5AjtvFHT0hF+5/okEkinVk=
=09fW
-----END PGP SIGNATURE-----
Merge tag 'x86_bugs_retbleed' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 retbleed fixes from Borislav Petkov:
"Just when you thought that all the speculation bugs were addressed and
solved and the nightmare is complete, here's the next one: speculating
after RET instructions and leaking privileged information using the
now pretty much classical covert channels.
It is called RETBleed and the mitigation effort and controlling
functionality has been modelled similar to what already existing
mitigations provide"
* tag 'x86_bugs_retbleed' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
x86/speculation: Disable RRSBA behavior
x86/kexec: Disable RET on kexec
x86/bugs: Do not enable IBPB-on-entry when IBPB is not supported
x86/entry: Move PUSH_AND_CLEAR_REGS() back into error_entry
x86/bugs: Add Cannon lake to RETBleed affected CPU list
x86/retbleed: Add fine grained Kconfig knobs
x86/cpu/amd: Enumerate BTC_NO
x86/common: Stamp out the stepping madness
KVM: VMX: Prevent RSB underflow before vmenter
x86/speculation: Fill RSB on vmexit for IBRS
KVM: VMX: Fix IBRS handling after vmexit
KVM: VMX: Prevent guest RSB poisoning attacks with eIBRS
KVM: VMX: Convert launched argument to flags
KVM: VMX: Flatten __vmx_vcpu_run()
objtool: Re-add UNWIND_HINT_{SAVE_RESTORE}
x86/speculation: Remove x86_spec_ctrl_mask
x86/speculation: Use cached host SPEC_CTRL value for guest entry/exit
x86/speculation: Fix SPEC_CTRL write on SMT state change
x86/speculation: Fix firmware entry SPEC_CTRL handling
x86/speculation: Fix RSB filling with CONFIG_RETPOLINE=n
...
A driver that makes use of pm_runtime_force_suspend|resume() to support
system suspend/resume, currently needs to manage the wakeirq support
itself. To avoid the boilerplate code in the driver's system suspend/resume
callbacks in particular, let's extend pm_runtime_force_suspend|resume() to
deal with the wakeirq.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
These are updates to fix some discrepancies we have in the CPU topology
parsing from the device tree /cpu-map node and the divergence from the
behaviour on a ACPI enabled platform. The expectation is that both DT
and ACPI enabled systems must present consistent view of the CPU topology.
The current assignment of generated cluster count as the physical package
identifier for each CPU is wrong. The device tree bindings for CPU
topology supports sockets to infer the socket or physical package
identifier for a given CPU. It is now being made use of you address the
issue. These updates also assigns the cluster identifier as parsed from
the device tree cluster nodes within /cpu-map without support for
nesting of the clusters as there are no such reported/known platforms.
In order to be on par with ACPI PPTT physical package/socket support,
these updates also include support for socket nodes in /cpu-map.
The only exception is that the last level cache id information can be
inferred from the same ACPI PPTT while we need to parse CPU cache nodes
in the device tree. The cacheinfo changes here is to enable the re-use
of the cacheinfo to detect the cache attributes for all the CPU quite
early even before the scondardaries are booted so that the information
can be used to build the schedular domains especially the last level
cache(LLC).
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEunHlEgbzHrJD3ZPhAEG6vDF+4pgFAmLFak4ACgkQAEG6vDF+
4pjHNxAAyzXpazkWuTjTot9UcX2TE8kIlEB4wXoGr1WJfi2uefVuvo3owvSKLdmr
pBNUf0fSLKShljueXOYcmwxVvSoN3zWSrOh4huUBWv0VPBg5yyNplZChwbhxjmiL
N5FGtSDHoTgPjjMUXnlsa/Y7/RDUhxR+s4ZdGS/vnMHbGr8Gsm5bjc6BNCq9E6cz
xnGDzUOS3+Sin+751De09HIuH5FOoCEpWOj6FGVK3MtEsizHU4ANEKTgFdsE26mG
nmVY1CU3GJmHluvG1JgL4+HmZsW02h2yU0tRSLqcseJCUou8gJ5yr0wYF6wmsHGk
nzGDfV7GzLdQg5rVnWcgzrzibqbBKJvh795e2cW3tV60VjMxlh57L7OWnHAEzQh8
QCF8HeE2CGg6VlYC+oB5JZ4pLdPE69e0+fHzhFy7hqK/B9yOr0olIKXxcm4tR/TV
5Ri7D8bX4Oviq1pQcT+GE/8Of5vX5R9LTiH1V0ld38npVvA55KDnO6WKvcadEucO
tKnHZx2dZYR/sRJ8ABz4hb7+UlLiLpCPFudx+BcXLHn+nWSqXYlu+F/2D2nsGRP+
HpmJHISJ40gD669KvupLg0/TtPdQ1oRmuf9CXUiMkzQZcySIW4wmFvCvbUfMcApl
7OzQ+FtAPb7n821mSV7KJBYJ5xOxNVR7DfUovmXCa/91xfqmE1M=
=UCxa
-----END PGP SIGNATURE-----
Merge tag 'arch-cache-topo-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into driver-core-next
Sudeep writes:
cacheinfo and arch_topology updates for v5.20
These are updates to fix some discrepancies we have in the CPU topology
parsing from the device tree /cpu-map node and the divergence from the
behaviour on a ACPI enabled platform. The expectation is that both DT
and ACPI enabled systems must present consistent view of the CPU topology.
The current assignment of generated cluster count as the physical package
identifier for each CPU is wrong. The device tree bindings for CPU
topology supports sockets to infer the socket or physical package
identifier for a given CPU. It is now being made use of you address the
issue. These updates also assigns the cluster identifier as parsed from
the device tree cluster nodes within /cpu-map without support for
nesting of the clusters as there are no such reported/known platforms.
In order to be on par with ACPI PPTT physical package/socket support,
these updates also include support for socket nodes in /cpu-map.
The only exception is that the last level cache id information can be
inferred from the same ACPI PPTT while we need to parse CPU cache nodes
in the device tree. The cacheinfo changes here is to enable the re-use
of the cacheinfo to detect the cache attributes for all the CPU quite
early even before the scondardaries are booted so that the information
can be used to build the schedular domains especially the last level
cache(LLC).
* tag 'arch-cache-topo-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: (21 commits)
ACPI: Remove the unused find_acpi_cpu_cache_topology()
arch_topology: Warn that topology for nested clusters is not supported
arch_topology: Add support for parsing sockets in /cpu-map
arch_topology: Set cluster identifier in each core/thread from /cpu-map
arch_topology: Limit span of cpu_clustergroup_mask()
arch_topology: Don't set cluster identifier as physical package identifier
arch_topology: Avoid parsing through all the CPUs once a outlier CPU is found
arch_topology: Check for non-negative value rather than -1 for IDs validity
arch_topology: Set thread sibling cpumask only within the cluster
arch_topology: Drop LLC identifier stash from the CPU topology
arm64: topology: Remove redundant setting of llc_id in CPU topology
arch_topology: Use the last level cache information from the cacheinfo
arch_topology: Add support to parse and detect cache attributes
cacheinfo: Align checks in cache_shared_cpu_map_{setup,remove} for readability
cacheinfo: Use cache identifiers to check if the caches are shared if available
cacheinfo: Allow early detection and population of cache attributes
cacheinfo: Add support to check if last level cache(LLC) is valid or shared
cacheinfo: Move cache_leaves_are_shared out of CONFIG_OF
cacheinfo: Add helper to access any cache index for a given CPU
cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node
...
In regmap_field_init() when a invalid mask is provided it still
initializes with any warnings.
An example of this is when the LSB is greater than MSB a mask of zero
is produced.
WARN_ONCE() is not ideal for this but requires less changes to core regmap
code.
Cc: Mark Brown <broonie@kernel.org>
Cc: Nishanth Menon <nm@ti.com>
Signed-off-by: Matt Ranostay <mranostay@ti.com>
Link: https://lore.kernel.org/r/20220708013125.313892-1-mranostay@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Previously the CONFIG_PM_SLEEP and !CONFIG_PM_SLEEP device_init_wakeup()
implementations differed in confusing ways:
- The PM_SLEEP version checked for a NULL device pointer and returned
-EINVAL, while the !PM_SLEEP version did not and would simply
dereference a NULL pointer.
- When called with "false", the !PM_SLEEP version cleared "capable" and
"enable" in the opposite order of the PM_SLEEP version. That was
harmless because for !PM_SLEEP they're simple assignments, but it's
unnecessary confusion.
Use a simplified version of the PM_SLEEP implementation for both cases.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
irq_reg_stride in struct regmap_irq_chip is often 0, but that
actually means to use the default stride of 1. The effective
stride is stored in struct regmap_irq_chip_data->irq_reg_stride
and will get the corrected default value.
The default ->get_irq_reg() callback was using the stride from
the chip definition, which is wrong; fix it to use the effective
stride from the chip data instead.
Link: https://lore.kernel.org/lkml/acaaf77f-3282-8544-dd3c-7915fc1a6a4f@samsung.com/
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220704112847.23844-1-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
We don't support the topology for clusters of CPU clusters while the
DT and ACPI bindings theoritcally support the same. Just warn about the
same so that it is clear to the users of arch_topology that the nested
clusters are not yet supported.
Link: https://lore.kernel.org/r/20220704101605.1318280-21-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Finally let us add support for socket nodes in /cpu-map in the device
tree. Since this may not be present in all the old platforms and even
most of the existing platforms, we need to assume absence of the socket
node indicates that it is a single socket system and handle appropriately.
Also it is likely that most single socket systems skip to as the node
since it is optional.
Link: https://lore.kernel.org/r/20220704101605.1318280-20-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Let us set the cluster identifier as parsed from the device tree
cluster nodes within /cpu-map.
We don't support nesting of clusters yet as there are no real hardware
to support clusters of clusters.
Link: https://lore.kernel.org/r/20220704101605.1318280-19-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Currently the cluster identifier is not set on DT based platforms.
The reset or default value is -1 for all the CPUs. Once we assign the
cluster identifier values correctly, the cluster_sibling mask will be
populated and returned by cpu_clustergroup_mask() to contribute in the
creation of the CLS scheduling domain level, if SCHED_CLUSTER is
enabled.
To avoid topologies that will result in questionable or incorrect
scheduling domains, impose restrictions regarding the span of clusters,
as presented to scheduling domains building code: cluster_sibling should
not span more or the same CPUs as cpu_coregroup_mask().
This is needed in order to obtain a strict separation between the MC and
CLS levels, and maintain the same domains for existing platforms in
the presence of CONFIG_SCHED_CLUSTER, where the new cluster information
is redundant and irrelevant for the scheduler.
While previously the scheduling domain builder code would have removed MC
as redundant and kept CLS if SCHED_CLUSTER was enabled and the
cpu_coregroup_mask() and cpu_clustergroup_mask() spanned the same CPUs,
now CLS will be removed and MC kept.
Link: https://lore.kernel.org/r/20220704101605.1318280-18-sudeep.holla@arm.com
Cc: Darren Hart <darren@os.amperecomputing.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Currently as we parse the CPU topology from /cpu-map node from the
device tree, we assign generated cluster count as the physical package
identifier for each CPU which is wrong.
The device tree bindings for CPU topology supports sockets to infer
the socket or physical package identifier for a given CPU. Since it is
fairly new and not supported on most of the old and existing systems, we
can assume all such systems have single socket/physical package.
Fix the physical package identifier to 0 by removing the assignment of
cluster identifier to the same.
Link: https://lore.kernel.org/r/20220704101605.1318280-17-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
There is no point in looping through all the CPU's physical package
identifier to check if it is valid or not once a CPU which is outside
the topology(i.e. outlier CPU) is found.
Let us just break out of the loop early in such case.
Link: https://lore.kernel.org/r/20220704101605.1318280-16-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Instead of just comparing the cpu topology IDs with -1 to check their
validity, improve that by checking for a valid non-negative value.
Link: https://lore.kernel.org/r/20220704101605.1318280-15-sudeep.holla@arm.com
Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Currently the cluster identifier is not set on the DT based platforms.
The reset or default value is -1 for all the CPUs. Once we assign the
cluster identifier values correctly that may result in getting the thread
siblings wrong as the core identifiers can be same for 2 different CPUs
belonging to 2 different cluster.
So, in order to get the thread sibling cpumasks correct, we need to
update them only if the cores they belong are in the same cluster within
the socket. Let us skip updation of the thread sibling cpumaks if the
cluster identifier doesn't match.
This change won't affect even if the cluster identifiers are not set
currently but will avoid any breakage once we set the same correctly.
Link: https://lore.kernel.org/r/20220704101605.1318280-14-sudeep.holla@arm.com
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Since the cacheinfo LLC information is used directly in arch_topology,
there is no need to parse and store the LLC ID information only for
ACPI systems in the CPU topology.
Remove the redundant LLC ID from the generic CPU arch_topology
information.
Link: https://lore.kernel.org/r/20220704101605.1318280-13-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
The cacheinfo is now initialised early along with the CPU topology
initialisation. Instead of relying on the LLC ID information parsed
separately only with ACPI PPTT elsewhere, migrate to use the similar
information from the cacheinfo.
This is generic for both DT and ACPI systems. The ACPI LLC ID information
parsed separately can now be removed from arch specific code.
Link: https://lore.kernel.org/r/20220704101605.1318280-11-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Currently ACPI populates just the minimum information about the last
level cache from PPTT in order to feed the same to build sched_domains.
Similar support for DT platforms is not present.
In order to enable the same, the entire cache hierarchy information can
be built as part of CPU topoplogy parsing both on ACPI and DT platforms.
Note that this change builds the cacheinfo early even on ACPI systems,
but the current mechanism of building llc_sibling mask remains unchanged.
Link: https://lore.kernel.org/r/20220704101605.1318280-10-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
The checks to skip the CPU itself or no cacheinfo case are implemented
bit differently though the effect is exactly same. Just align the
implementation in both cache_shared_cpu_map_{setup,remove} just for
improved readability. No functional change.
Link: https://lore.kernel.org/r/20220704101605.1318280-9-sudeep.holla@arm.com
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
The cache identifiers is an optional property on most of the platforms.
The presence of one must be indicated by the CACHE_ID valid bit in the
attributes.
We can use the cache identifiers provided by the firmware to check if
any two cpus share the same cache instead of relying on the fw_token
generated and set in the OS.
Link: https://lore.kernel.org/r/20220704101605.1318280-8-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Some architecture/platforms may need to setup cache properties very
early in the boot along with other cpu topologies so that all these
information can be used to build sched_domains which is used by the
scheduler.
Allow detect_cache_attributes to be called quite early during the boot.
Link: https://lore.kernel.org/r/20220704101605.1318280-7-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
It is useful to have helper to check if the given two CPUs share last
level cache. We can do that check by comparing fw_token or by comparing
the cache ID. Currently we check just for fw_token as the cache ID is
optional.
This helper can be used to build the llc_sibling during arch specific
topology parsing and feeding information to the sched_domains. This also
helps to get rid of llc_id in the CPU topology as it is sort of duplicate
information.
Also add helper to check if the llc information in cacheinfo is valid
or not.
Link: https://lore.kernel.org/r/20220704101605.1318280-6-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
cache_leaves_are_shared is already used even with ACPI and PPTT. It
checks if the cache leaves are the shared based on fw_token pointer.
However it is defined conditionally only if CONFIG_OF is enabled which
is wrong.
Move the function cache_leaves_are_shared out of CONFIG_OF and keep it
generic. It also handles the case where both OF and ACPI is not defined.
Link: https://lore.kernel.org/r/20220704101605.1318280-5-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
The cacheinfo for a given CPU at a given index is used at quite a few
places by fetching the base point for index 0 using the helper
per_cpu_cacheinfo(cpu) and offsetting it by the required index.
Instead, add another helper to fetch the required pointer directly and
use it to simplify and improve readability.
Link: https://lore.kernel.org/r/20220704101605.1318280-4-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
The of_cpu_device_node_get takes care of fetching the CPU'd device node
either from cached cpu_dev->of_node if cpu_dev is initialised or uses
of_get_cpu_node to parse and fetch node if cpu_dev isn't available yet.
Just use of_cpu_device_node_get instead of getting the cpu device first
and then using cpu_dev->of_node for two reasons:
1. There is no other use of cpu_dev and can be simplified
2. It enabled the use detect_cache_attributes and hence cache_setup_of_node
much earlier before the CPUs are registered as devices.
Link: https://lore.kernel.org/r/20220704101605.1318280-3-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Because pm_runtime_get_suppliers() bumps up the rpm_active counter
of each device link to a supplier of the given device in addition
to bumping up the supplier's PM-runtime usage counter, a runtime
suspend of the consumer device may case the latter to go down to 0
when pm_runtime_put_suppliers() is running on a remote CPU. If that
happens after pm_runtime_put_suppliers() has released power.lock for
the consumer device, and a runtime resume of that device takes place
immediately after it, before pm_runtime_put() is called for the
supplier, that pm_runtime_put() call may cause the supplier to be
suspended even though the consumer is active.
To prevent that from happening, modify pm_runtime_get_suppliers() to
call pm_runtime_get_sync() for the given device's suppliers without
touching the rpm_active counters of the involved device links
Accordingly, modify pm_runtime_put_suppliers() to call pm_runtime_put()
for the given device's suppliers without looking at the rpm_active
counters of the device links at hand. [This is analogous to what
happened before commit 4c06c4e6cf ("driver core: Fix possible
supplier PM-usage counter imbalance").]
Since pm_runtime_get_suppliers() sets supplier_preactivated for each
device link where the supplier's PM-runtime usage counter has been
incremented and pm_runtime_put_suppliers() calls pm_runtime_put() for
the suppliers whose device links have supplier_preactivated set, the
PM-runtime usage counter is balanced for each supplier and this is
independent of the runtime suspend and resume of the consumer device.
However, in case a device link with DL_FLAG_PM_RUNTIME set is dropped
during the consumer device probe, so pm_runtime_get_suppliers() bumps
up the supplier's PM-runtime usage counter, but it cannot be dropped by
pm_runtime_put_suppliers(), make device_link_release_fn() take care of
that.
Fixes: 4c06c4e6cf ("driver core: Fix possible supplier PM-usage counter imbalance")
Reported-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Cc: 5.1+ <stable@vger.kernel.org> # 5.1+
Instead of passing an extra bool argument to pm_runtime_release_supplier(),
make its callers take care of triggering a runtime-suspend of the
supplier device as needed.
No expected functional impact.
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: 5.1+ <stable@vger.kernel.org> # 5.1+
Merge series from Aidan MacDonald <aidanmacdonald.0x0@gmail.com>:
This series is an attempt at cleaning up the regmap-irq API in order
to simplify things and consolidate existing features, while at the
same time generalizing it to support a wider range of hardware.
There is a new system for IRQ type configuration, some tweaks to
unmask registers so they're more intuitive and useful, and a new
callback for calculating register addresses. There's also a few
minor code cleanups in here.
In v2 I've taken the approach of adding new features and deprecating
existing ones rather than removing them aggressively. Warnings will
be issued for any drivers that use deprecated features, but they'll
otherwise continue to function normally.
One important caveat: not all of these changes are tested beyond
compile testing, since I don't have hardware to exercise all of
the features.
When num_reg_defaults > 0 but reg_defaults is NULL, there will be a
NULL pointer exception.
Current code has no such usage, but as additional hardening, also
check this to prevent any chance of crashing.
Signed-off-by: Schspa Shi <schspa@gmail.com>
Link: https://lore.kernel.org/r/20220629130951.63040-1-schspa@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
This flag is a bit of a hack and the same thing can be accomplished
using a custom ->get_irq_reg() callback. Add a warning to catch any
use of the flag.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-13-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Replace the internal sub_irq_reg() function with a public callback
that drivers can use when they have more complex register layouts.
The default implementation is regmap_irq_get_irq_reg_linear(), used
if the chip doesn't provide its own callback.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-12-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
To me "unmask" suggests that we write 1s to the register when
an interrupt is enabled. This also makes sense because it's the
opposite of what the "mask" register does (write 1s to disable
an interrupt).
But regmap-irq does the opposite: for a disabled interrupt, it
writes 1s to "unmask" and 0s to "mask". This is surprising and
deviates from the usual way mask registers are handled.
Additionally, mask_invert didn't interact with unmask registers
properly -- it caused them to be ignored entirely.
Fix this by making mask and unmask registers orthogonal, using
the following behavior:
* Mask registers are written with 1s for disabled interrupts.
* Unmask registers are written with 1s for enabled interrupts.
This behavior supports both normal or inverted mask registers
and separate set/clear registers via different combinations of
mask_base/unmask_base.
The old unmask register behavior is deprecated. Drivers need to
opt-in to the new behavior by setting mask_unmask_non_inverted.
Warnings are issued if the driver relies on deprecated behavior.
Chips that only set one of mask_base/unmask_base don't have to
use the mask_unmask_non_inverted flag because that use case was
previously not supported.
The mask_invert flag is also deprecated in favor of describing
inverted mask registers as unmask registers.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-11-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Config registers can be used to replace both type and virtual
registers, so mark both features are deprecated and issue a
warning if they're used.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-10-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Config registers provide a more uniform approach to handling irq type
registers. They are essentially an extension of the virtual registers
used by the qcom-pm8008 driver.
Config registers can be represented as a 2D array:
config_base[0] reg0,0 reg0,1 reg0,2 reg0,3
config_base[1] reg1,0 reg1,1 reg1,2 reg1,3
config_base[2] reg2,0 reg2,1 reg2,2 reg2,3
There are 'num_config_bases' base registers, each of which is used to
address 'num_config_regs' registers. The addresses are calculated in
the same way as for other bases. It is assumed that an irq's type is
controlled by one column of registers; that column is identified by
the irq's 'type_reg_offset'.
The set_type_config() callback is responsible for updating the config
register contents. It receives an array of buffers (each represents a
row of registers) and the index of the column to update, along with
the 'struct regmap_irq' description and requested irq type.
Buffered values are written to registers in regmap_irq_sync_unlock().
Note that the entire register contents are overwritten, which is a
minor change in behavior from type registers via 'type_base'.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-9-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
There are several conditions that must be satisfied to support
bulk read of status registers. Move the check into a function
to avoid duplicating it in two places.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-8-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Commit a71411dbf6 ("regmap: irq: add chip option mask_writeonly")
introduced the mask_writeonly option, but it isn't used now and it
appears it's never been used by any in-tree drivers. The motivation
for the option is mentioned in the commit message,
Some irq controllers have writeonly/multipurpose register
layouts. In those cases we read invalid data back. [...]
The option causes mask register updates to use regmap_write_bits()
instead of regmap_update_bits().
However, regmap_write_bits() doesn't solve the reading invalid data
problem. It's still a read-modify-write op like regmap_update_bits().
The difference is that 'update bits' will only write the new value
if it is different from the current value, while 'write bits' will
write the new value unconditionally, even if it's the same as the
current value.
This seems like a bit of a specialized use case and probably isn't
that useful for regmap-irq, so let's just remove the option and go
back to using an 'update bits' op for the mask registers. We can
always add the option back if some driver ends up needing it in the
future.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-7-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
regmap_irq_update_bits() is misnamed and should only be used for
updating mask registers, since it checks the mask_writeonly flag.
However, it was also used for updating wake and type registers.
It's safe to replace these uses with regmap_update_bits() because
there are no users of the mask_writeonly flag.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-6-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Check types_supported instead of checking type_rising/falling_val
when using type_in_mask interrupts. This makes the intent clearer
and allows a type_in_mask irq to support level or edge triggers,
rather than only edge triggers.
Update the documentation and comments to reflect the new behavior.
This shouldn't affect existing drivers, because if they didn't
set types_supported properly the type buffer wouldn't be updated.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-5-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Instead of mentioning unsigned int directly, use a sizeof(...)
involving the buffer we're allocating to ensure the types don't
get out of sync.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-4-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
It appears that no chip ever required a nonzero type_reg_stride
and commit 1066cfbdfa ("regmap-irq: Extend sub-irq to support
non-fixed reg strides") broke support. Just remove the field.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-3-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
When firmware sets the FWNODE_FLAG_BEST_EFFORT flag for a fwnode,
fw_devlink will do a best effort ordering for that device where it'll
only enforce the probe/suspend/resume ordering of that device with
suppliers that have drivers. The driver of that device can then decide
if it wants to defer probe or probe without the suppliers.
This will be useful for avoid probe delays of the console device that
were caused by commit 71066545b4 ("driver core: Set
fw_devlink.strict=1 by default").
Fixes: 71066545b4 ("driver core: Set fw_devlink.strict=1 by default")
Reported-by: Sascha Hauer <sha@pengutronix.de>
Reported-by: Peng Fan <peng.fan@nxp.com>
Tested-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220623080344.783549-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In __driver_attach function, There are also AA deadlock problem,
like the commit b232b02bf3 ("driver core: fix deadlock in
__device_attach").
stack like commit b232b02bf3 ("driver core: fix deadlock in
__device_attach").
list below:
In __driver_attach function, The lock holding logic is as follows:
...
__driver_attach
if (driver_allows_async_probing(drv))
device_lock(dev) // get lock dev
async_schedule_dev(__driver_attach_async_helper, dev); // func
async_schedule_node
async_schedule_node_domain(func)
entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC);
/* when fail or work limit, sync to execute func, but
__driver_attach_async_helper will get lock dev as
will, which will lead to A-A deadlock. */
if (!entry || atomic_read(&entry_count) > MAX_WORK) {
func;
else
queue_work_node(node, system_unbound_wq, &entry->work)
device_unlock(dev)
As above show, when it is allowed to do async probes, because of
out of memory or work limit, async work is not be allowed, to do
sync execute instead. it will lead to A-A deadlock because of
__driver_attach_async_helper getting lock dev.
Reproduce:
and it can be reproduce by make the condition
(if (!entry || atomic_read(&entry_count) > MAX_WORK)) untenable, like
below:
[ 370.785650] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[ 370.787154] task:swapper/0 state:D stack: 0 pid: 1 ppid:
0 flags:0x00004000
[ 370.788865] Call Trace:
[ 370.789374] <TASK>
[ 370.789841] __schedule+0x482/0x1050
[ 370.790613] schedule+0x92/0x1a0
[ 370.791290] schedule_preempt_disabled+0x2c/0x50
[ 370.792256] __mutex_lock.isra.0+0x757/0xec0
[ 370.793158] __mutex_lock_slowpath+0x1f/0x30
[ 370.794079] mutex_lock+0x50/0x60
[ 370.794795] __device_driver_lock+0x2f/0x70
[ 370.795677] ? driver_probe_device+0xd0/0xd0
[ 370.796576] __driver_attach_async_helper+0x1d/0xd0
[ 370.797318] ? driver_probe_device+0xd0/0xd0
[ 370.797957] async_schedule_node_domain+0xa5/0xc0
[ 370.798652] async_schedule_node+0x19/0x30
[ 370.799243] __driver_attach+0x246/0x290
[ 370.799828] ? driver_allows_async_probing+0xa0/0xa0
[ 370.800548] bus_for_each_dev+0x9d/0x130
[ 370.801132] driver_attach+0x22/0x30
[ 370.801666] bus_add_driver+0x290/0x340
[ 370.802246] driver_register+0x88/0x140
[ 370.802817] ? virtio_scsi_init+0x116/0x116
[ 370.803425] scsi_register_driver+0x1a/0x30
[ 370.804057] init_sd+0x184/0x226
[ 370.804533] do_one_initcall+0x71/0x3a0
[ 370.805107] kernel_init_freeable+0x39a/0x43a
[ 370.805759] ? rest_init+0x150/0x150
[ 370.806283] kernel_init+0x26/0x230
[ 370.806799] ret_from_fork+0x1f/0x30
To fix the deadlock, move the async_schedule_dev outside device_lock,
as we can see, in async_schedule_node_domain, the parameter of
queue_work_node is system_unbound_wq, so it can accept concurrent
operations. which will also not change the code logic, and will
not lead to deadlock.
Fixes: ef0ff68351 ("driver core: Probe devices asynchronously instead of the driver")
Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>
Link: https://lore.kernel.org/r/20220622074327.497102-1-zhangwensheng5@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When the devtmpfs fails to mount, a dangling pointer still remains in
global. Specifically, the err variable is passed by a pointer to the
devtmpfsd. When the devtmpfsd exits, it sets the error and completes the
setup_done. In this situation, the thread pointer is not set to null.
After the devtmpfsd exited, the devtmpfs can wakes up the destroyed
devtmpfsd thread by wake_up_process if a device change event comes.
Signed-off-by: Yangxi Xiang <xyangxi5@gmail.com>
Link: https://lore.kernel.org/r/20220627120409.11174-1-xyangxi5@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 77515ebaf0 as it
causes build problems in linux-next. It needs to be reintroduced in a
way that can allow the api to evolve and not require a "flag day" to
catch all users.
Link: https://lore.kernel.org/r/20220623160723.7a44b573@canb.auug.org.au
Cc: Duoming Zhou <duoming@zju.edu.cn>
Cc: Brian Norris <briannorris@chromium.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For devices with no cache it can make sense to use cache only mode as a
mechanism for trapping writes to hardware which is inaccessible but since
no cache is equivalent to cache bypass we force such devices into bypass
mode. This means that our check that bypass and cache only mode aren't both
enabled simultanously is less sensible for devices without a cache so relax
it.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220622171723.1235749-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Fixes for post-5.18 changes:
- fix for a damon boot hang, from SeongJae
- fix for a kfence warning splat, from Jason Donenfeld
- fix for zero-pfn pinning, from Alex Williamson
- fix for fallocate hole punch clearing, from Mike Kravetz
Fixes pre-5.18 material:
- fix for a performance regression, from Marcelo
- fix for a hwpoisining BUG from zhenwei pi
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYri4RgAKCRDdBJ7gKXxA
jmhsAQDCvGqtIUhgkTwid8KBRNbowsg0LXd6k+gUjcxBhH403wEA0r0cxxkDAmgr
QNXn/qZRzQP2ji+pdjH9NBOsd2g2XQA=
=UGJ7
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2022-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull hotfixes from Andrew Morton:
"Minor things, mainly - mailmap updates, MAINTAINERS updates, etc.
Fixes for this merge window:
- fix for a damon boot hang, from SeongJae
- fix for a kfence warning splat, from Jason Donenfeld
- fix for zero-pfn pinning, from Alex Williamson
- fix for fallocate hole punch clearing, from Mike Kravetz
Fixes for previous releases:
- fix for a performance regression, from Marcelo
- fix for a hwpoisining BUG from zhenwei pi"
* tag 'mm-hotfixes-stable-2022-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mailmap: add entry for Christian Marangi
mm/memory-failure: disable unpoison once hw error happens
hugetlbfs: zero partial pages during fallocate hole punch
mm: memcontrol: reference to tools/cgroup/memcg_slabinfo.py
mm: re-allow pinning of zero pfns
mm/kfence: select random number before taking raw lock
MAINTAINERS: add maillist information for LoongArch
MAINTAINERS: update MM tree references
MAINTAINERS: update Abel Vesa's email
MAINTAINERS: add MEMORY HOT(UN)PLUG section and add David as reviewer
MAINTAINERS: add Miaohe Lin as a memory-failure reviewer
mailmap: add alias for jarkko@profian.com
mm/damon/reclaim: schedule 'damon_reclaim_timer' only after 'system_wq' is initialized
kthread: make it clear that kthread_create_on_node() might be terminated by any fatal signal
mm: lru_cache_disable: use synchronize_rcu_expedited
mm/page_isolation.c: fix one kernel-doc comment
Two sets of fixes - one for things that were missed with the support for
custom bulk I/O operations introduced in the last merge window, and
another for some long standing issues with regmap-irq which affect a
fairly small subset of devices.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmK16ucACgkQJNaLcl1U
h9CXmAf/Rpq3pky1EEgc/BAx+Vi6K1PD/5VvUX1Wb0xHYK52yk+XBuyqkpzhTptm
BhY5TIZWriai6UdFnuZtxRYiDy65IBZo1SGtNMlwcnjBWzD1weIWukGmkPzRGr9y
Vdr6PK1SOsP+qVrxrnNsbdEIICev2KDzmVxXa81zsKeyj9Dx4W4ETbpbGDvwyaHk
y+cBucVrb3RSoruoJUL3iV7XeobroLNO1ZcD9Wim1Kx2QFtdgbOEF5oBmqWmfVlu
Au59ug97u3AGVJpeAFwvM6EFlG7ft2wn4gsP92QCJ58zUPLdxHaS/+mWAfUButCk
km/yP4Su9KyRFFecMLwIYa4+UvH9kA==
=vT5N
-----END PGP SIGNATURE-----
Merge tag 'regmap-fix-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fixes from Mark Brown:
"Two sets of fixes - one for things that were missed with the support
for custom bulk I/O operations introduced in the last merge window,
and another for some long standing issues with regmap-irq which affect
a fairly small subset of devices"
* tag 'regmap-fix-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap-irq: Fix offset/index mismatch in read_sub_irq_data()
regmap-irq: Fix a bug in regmap_irq_enable() for type_in_mask chips
regmap: Wire up regmap_config provided bulk write in missed functions
regmap: Make regmap_noinc_read() return -ENOTSUPP if map->read isn't set
regmap: Re-introduce bulk read support check in regmap_bulk_read()
We need to divide the sub-irq status register offset by register
stride to get an index for the status buffer to avoid an out of
bounds write when the register stride is greater than 1.
Fixes: a2d21848d9 ("regmap: regmap-irq: Add main status register support")
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220620200644.1961936-3-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
When enabling a type_in_mask irq, the type_buf contents must be
AND'd with the mask of the IRQ we're enabling to avoid enabling
other IRQs by accident, which can happen if several type_in_mask
irqs share a mask register.
Fixes: bc998a7303 ("regmap: irq: handle HW using separate rising/falling edge interrupts")
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220620200644.1961936-2-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The dev_coredumpv() and dev_coredumpm() could not be used in atomic
context, because they call kvasprintf_const() and kstrdup() with
GFP_KERNEL parameter. The process is shown below:
dev_coredumpv(.., gfp_t gfp)
dev_coredumpm(.., gfp_t gfp)
dev_set_name
kobject_set_name_vargs
kvasprintf_const(GFP_KERNEL, ...); //may sleep
kstrdup(s, GFP_KERNEL); //may sleep
This patch removes gfp_t parameter of dev_coredumpv() and dev_coredumpm()
and changes the gfp_t parameter of kzalloc() in dev_coredumpm() to
GFP_KERNEL in order to show they could not be used in atomic context.
Fixes: 833c95456a ("device coredump: add new device coredump class")
Reviewed-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Link: https://lore.kernel.org/r/df72af3b1862bac7d8e793d1f3931857d3779dfd.1654569290.git.duoming@zju.edu.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are some functions that were missed by commit d77e745613 ("regmap:
Add bulk read/write callbacks into regmap_config") when support to define
bulk read/write callbacks in regmap_config was introduced.
The regmap_bulk_write() and regmap_noinc_write() functions weren't changed
to use the added map->write instead of the map->bus->write handler.
Also, the regmap_can_raw_write() was not modified to take map->write into
account. So will only return true if a bus with a .write callback is set.
Fixes: d77e745613 ("regmap: Add bulk read/write callbacks into regmap_config")
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Link: https://lore.kernel.org/r/20220616073435.1988219-4-javierm@redhat.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Before adding support to define bulk read/write callbacks in regmap_config
by the commit d77e745613 ("regmap: Add bulk read/write callbacks into
regmap_config"), the regmap_noinc_read() function returned an errno early
a map->bus->read callback wasn't set.
But that commit dropped the check and now a call to _regmap_raw_read() is
attempted even when bulk read operations are not supported. That function
checks for map->read anyways but there's no point to continue if the read
can't succeed.
Also is a fragile assumption to make so is better to make it fail earlier.
Fixes: d77e745613 ("regmap: Add bulk read/write callbacks into regmap_config")
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Link: https://lore.kernel.org/r/20220616073435.1988219-3-javierm@redhat.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Support for drivers to define bulk read/write callbacks in regmap_config
was introduced by the commit d77e745613 ("regmap: Add bulk read/write
callbacks into regmap_config"), but this commit wrongly dropped a check
in regmap_bulk_read() to determine whether bulk reads can be done or not.
Before that commit, it was checked if map->bus was set. Now has to check
if a map->read callback has been set.
Fixes: d77e745613 ("regmap: Add bulk read/write callbacks into regmap_config")
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Link: https://lore.kernel.org/r/20220616073435.1988219-2-javierm@redhat.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Pull writeback and ext2 fixes from Jan Kara:
"A fix for writeback bug which prevented machines with kdevtmpfs from
booting and also one small ext2 bugfix in IO error handling"
* tag 'fs_for_v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
init: Initialize noop_backing_dev_info early
ext2: fix fs corruption when trying to remove a non-empty directory with IO error
noop_backing_dev_info is used by superblocks of various
pseudofilesystems such as kdevtmpfs. After commit 10e1407310
("writeback: Fix inode->i_io_list not be protected by inode->i_lock
error") this broke because __mark_inode_dirty() started to access more
fields from noop_backing_dev_info and this led to crashes inside
locked_inode_to_wb_and_lock_list() called from __mark_inode_dirty().
Fix the problem by initializing noop_backing_dev_info before the
filesystems get mounted.
Fixes: 10e1407310 ("writeback: Fix inode->i_io_list not be protected by inode->i_lock error")
Reported-and-tested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reported-and-tested-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Add simple bit operations for setting, clearing and testing specific
bits with regmap_field.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmKpskgACgkQJNaLcl1U
h9BF7Af7Bg1qhqvTJQKvfc1TU7pRnO2Mt90MJTTAFMgiMP/+FZhiaDrKm12SBTPh
w8Gomg8X+EYVpiqtywFoc0mZIUe03g9+8RpuKyK2EOReaguhU2qHQPDo2Dhx6x5p
VnNY9w/rhQsrX8153c6ICZ8YqrxNqCbgCh7eulMeXz74wbxwbrNdZbbCaM9+WCS/
fTWtNUwZ+3JtaSP55UCX/ITNrLct0gPdWqkRpLExv9HdOf1HaBRUTZH69ftQg82S
PHsxEBpPWKbViflSU7V5/UgeBCTS7FjRpwB6OEX69V+VGhRIey9IgZqbSsX9KoXX
RYPKj5DZD9JtPNA0W5Dzs0KyUIlDtA==
=vniQ
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmKqENwACgkQJNaLcl1U
h9D1pQf/cbMWcZeRe6eRxvDueIhyMYXwhahxEHMCL6EOR5SLLARB40klJq5DDPlM
D7SZlPES1oXXQ8zm8rpBahfQf3AAeiUMZ4+bOMMU45xaNTKeOGPuihiP69PQxzp1
RryrB6/AN+qKepHaWsnhOyarVnJ2XUDJ58QBmYwee5UFdxiPPIzjMLC0FVEgLhIc
Ng75dFIur3SknYyoYyd9gsH/VydlaEiYQ9s3GRKtq+D0P/fbsylPp1w5Lup3tXWh
kGYyJroz9doevY5K+TeMsVqSJhES3VJkz9GVxkdwCB6qhrXzGkbdnU7ZF1MZGlnf
bOC6+lE+CxoDWx/apLmdJg6ZmnHxdQ==
=CWeE
-----END PGP SIGNATURE-----
Merge tag 'regmap-field-bit-helpers' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap into regmap-5.20
regmap: Add regmap_field helpers for simple bit operations
Add simple bit operations for setting, clearing and testing specific
bits with regmap_field.
We have set/clear/test operations for regmap, but not for regmap_field yet.
So let's introduce regmap_field helpers too.
In many instances regmap_field_update_bits() is used for simple bit setting
and clearing. In these cases the last argument is redundant and we can
hide it with a static inline function.
This adds three new helpers for simple bit operations: set_bits,
clear_bits and test_bits (the last one defined as a regular function).
Signed-off-by: Li Chen <lchen@ambarella.com>
Link: https://lore.kernel.org/r/180eef422c3.deae9cd960729.8518395646822099769@zohomail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Stale Data.
They are a class of MMIO-related weaknesses which can expose stale data
by propagating it into core fill buffers. Data which can then be leaked
using the usual speculative execution methods.
Mitigations include this set along with microcode updates and are
similar to MDS and TAA vulnerabilities: VERW now clears those buffers
too.
-----BEGIN PGP SIGNATURE-----
iQJGBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmKXMkMTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoWGPD/idalLIhhV5F2+hZIKm0WSnsBxAOh9K
7y8xBxpQQ5FUfW3vm7Pg3ro6VJp7w2CzKoD4lGXzGHriusn3qst3vkza9Ay8xu8g
RDwKe6hI+p+Il9BV9op3f8FiRLP9bcPMMReW/mRyYsOnJe59hVNwRAL8OG40PY4k
hZgg4Psfvfx8bwiye5efjMSe4fXV7BUCkr601+8kVJoiaoszkux9mqP+cnnB5P3H
zW1d1jx7d6eV1Y063h7WgiNqQRYv0bROZP5BJkufIoOHUXDpd65IRF3bDnCIvSEz
KkMYJNXb3qh7EQeHS53NL+gz2EBQt+Tq1VH256qn6i3mcHs85HvC68gVrAkfVHJE
QLJE3MoXWOqw+mhwzCRrEXN9O1lT/PqDWw8I4M/5KtGG/KnJs+bygmfKBbKjIVg4
2yQWfMmOgQsw3GWCRjgEli7aYbDJQjany0K/qZTq54I41gu+TV8YMccaWcXgDKrm
cXFGUfOg4gBm4IRjJ/RJn+mUv6u+/3sLVqsaFTs9aiib1dpBSSUuMGBh548Ft7g2
5VbFVSDaLjB2BdlcG7enlsmtzw0ltNssmqg7jTK/L7XNVnvxwUoXw+zP7RmCLEYt
UV4FHXraMKNt2ZketlomC8ui2hg73ylUp4pPdMXCp7PIXp9sVamRTbpz12h689VJ
/s55bWxHkR6S
=LBxT
-----END PGP SIGNATURE-----
Merge tag 'x86-bugs-2022-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 MMIO stale data fixes from Thomas Gleixner:
"Yet another hw vulnerability with a software mitigation: Processor
MMIO Stale Data.
They are a class of MMIO-related weaknesses which can expose stale
data by propagating it into core fill buffers. Data which can then be
leaked using the usual speculative execution methods.
Mitigations include this set along with microcode updates and are
similar to MDS and TAA vulnerabilities: VERW now clears those buffers
too"
* tag 'x86-bugs-2022-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/speculation/mmio: Print SMT warning
KVM: x86/speculation: Disable Fill buffer clear within guests
x86/speculation/mmio: Reuse SRBDS mitigation for SBDS
x86/speculation/srbds: Update SRBDS mitigation selection
x86/speculation/mmio: Add sysfs reporting for Processor MMIO Stale Data
x86/speculation/mmio: Enable CPU Fill buffer clearing on idle
x86/bugs: Group MDS, TAA & Processor MMIO Stale Data mitigations
x86/speculation/mmio: Add mitigation for Processor MMIO Stale Data
x86/speculation: Add a common function for MD_CLEAR mitigation update
x86/speculation/mmio: Enumerate Processor MMIO Stale Data bug
Documentation: Add documentation for Processor MMIO Stale Data
There are several places in the kernel where this kind of functionality is
being used. Provide a generic helper for such cases.
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220610120219.18988-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The function is no longer used. So delete it.
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220601070707.3946847-10-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that deferred_probe_timeout is non-zero by default, fw_devlink will
never permanently block the probing of devices. It'll try its best to
probe the devices in the right order and then finally let devices probe
even if their suppliers don't have any drivers.
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220601070707.3946847-8-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some devices might need to be probed and bound successfully before the
kernel boot sequence can finish and move on to init/userspace. For
example, a network interface might need to be bound to be able to mount
a NFS rootfs.
With fw_devlink=on by default, some of these devices might be blocked
from probing because they are waiting on a optional supplier that
doesn't have a driver. While fw_devlink will eventually identify such
devices and unblock the probing automatically, it might be too late by
the time it unblocks the probing of devices. For example, the IP4
autoconfig might timeout before fw_devlink unblocks probing of the
network interface.
This function is available to temporarily try and probe all devices that
have a driver even if some of their suppliers haven't been added or
don't have drivers.
The drivers can then decide which of the suppliers are optional vs
mandatory and probe the device if possible. By the time this function
returns, all such "best effort" probes are guaranteed to be completed.
If a device successfully probes in this mode, we delete all fw_devlink
discovered dependencies of that device where the supplier hasn't yet
probed successfully because they have to be optional dependencies.
This also means that some devices that aren't needed for init and could
have waited for their optional supplier to probe (when the supplier's
module is loaded later on) would end up probing prematurely with limited
functionality. So call this function only when boot would fail without
it.
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220601070707.3946847-5-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that fw_devlink=on by default and fw_devlink supports
"power-domains" property, the execution will never get to the point
where driver_deferred_probe_check_state() is called before the supplier
has probed successfully or before deferred probe timeout has expired.
So, delete the call and replace it with -ENODEV.
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220601070707.3946847-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 23cfbc6ec4 ("firmware: Add the support for ZSTD-compressed
firmware files") added support for ZSTD compression, but in the process
also made the previously default XZ compression a config option.
That means that anybody who upgrades their kernel and does a
make oldconfig
to update their configuration, will end up without the XZ compression
that the configuration used to have.
Add the 'default y' to make sure this doesn't happen.
The whole compression question should probably be improved upon, since
it is now possible to "enable" compression in the kernel config but not
enable any actual compression algorithm, which makes it all very
useless. It makes no sense to ask Kconfig questions that enable
situations that are nonsensical like that.
This at least fixes the immediate problem of a kernel update resulting
in a nonbootable machine because of a missed option.
Fixes: 23cfbc6ec4 ("firmware: Add the support for ZSTD-compressed firmware files")
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since we had to effectively reverted
commit 35a672363a ("driver core: Ensure wait_for_device_probe() waits
until the deferred_probe_timeout fires") in an earlier patch, a non-zero
deferred_probe_timeout will break NFS rootfs mounting [1] again. So, set
the default back to zero until we can fix that.
[1] - https://lore.kernel.org/lkml/TYAPR01MB45443DF63B9EF29054F7C41FD8C60@TYAPR01MB4544.jpnprd01.prod.outlook.com/
Fixes: 2b28a1a84a ("driver core: Extend deferred probe timeout on driver registration")
Cc: Mark Brown <broonie@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220526034609.480766-3-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mounting NFS rootfs was timing out when deferred_probe_timeout was
non-zero [1]. This was because ip_auto_config() initcall times out
waiting for the network interfaces to show up when
deferred_probe_timeout was non-zero. While ip_auto_config() calls
wait_for_device_probe() to make sure any currently running deferred
probe work or asynchronous probe finishes, that wasn't sufficient to
account for devices being deferred until deferred_probe_timeout.
Commit 35a672363a ("driver core: Ensure wait_for_device_probe() waits
until the deferred_probe_timeout fires") tried to fix that by making
sure wait_for_device_probe() waits for deferred_probe_timeout to expire
before returning.
However, if wait_for_device_probe() is called from the kernel_init()
context:
- Before deferred_probe_initcall() [2], it causes the boot process to
hang due to a deadlock.
- After deferred_probe_initcall() [3], it blocks kernel_init() from
continuing till deferred_probe_timeout expires and beats the point of
deferred_probe_timeout that's trying to wait for userspace to load
modules.
Neither of this is good. So revert the changes to
wait_for_device_probe().
[1] - https://lore.kernel.org/lkml/TYAPR01MB45443DF63B9EF29054F7C41FD8C60@TYAPR01MB4544.jpnprd01.prod.outlook.com/
[2] - https://lore.kernel.org/lkml/YowHNo4sBjr9ijZr@dev-arch.thelio-3990X/
[3] - https://lore.kernel.org/lkml/Yo3WvGnNk3LvLb7R@linutronix.de/
Fixes: 35a672363a ("driver core: Ensure wait_for_device_probe() waits until the deferred_probe_timeout fires")
Cc: John Stultz <jstultz@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: Basil Eljuse <Basil.Eljuse@arm.com>
Cc: Ferry Toth <fntoth@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Anders Roxell <anders.roxell@linaro.org>
Cc: linux-pm@vger.kernel.org
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: John Stultz <jstultz@google.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220526034609.480766-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Here is the set of driver core changes for 5.19-rc1.
Note, I'm not really happy with this pull request as-is, see below for
details, but overall this is all good for everything but a small set of
systems, which we have a fix for already.
Lots of tiny driver core changes and cleanups happened this cycle,
but the two major things were:
- firmware_loader reorganization and additions including the
ability to have XZ compressed firmware images and the ability
for userspace to initiate the firmware load when it needs to,
instead of being always initiated by the kernel. FPGA devices
specifically want this ability to have their firmware changed
over the lifetime of the system boot, and this allows them to
work without having to come up with yet-another-custom-uapi
interface for loading firmware for them.
- physical location support added to sysfs so that devices that
know this information, can tell userspace where they are
located in a common way. Some ACPI devices already support
this today, and more bus types should support this in the
future.
Smaller changes included:
- driver_override api cleanups and fixes
- error path cleanups and fixes
- get_abi script fixes
- deferred probe timeout changes.
It's that last change that I'm the most worried about. It has been
reported to cause boot problems for a number of systems, and I have a
tested patch series that resolves this issue. But I didn't get it
merged into my tree before 5.18-final came out, so it has not gotten any
linux-next testing.
I'll send the fixup patches (there are 2) as a follow-on series to this
pull request if you want to take them directly, _OR_ I can just revert
the probe timeout changes and they can wait for the next -rc1 merge
cycle. Given that the fixes are tested, and pretty simple, I'm leaning
toward that choice. Sorry this all came at the end of the merge window,
I should have resolved this all 2 weeks ago, that's my fault as it was
in the middle of some travel for me.
All have been tested in linux-next for weeks, with no reported issues
other than the above-mentioned boot time outs.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYpnv/A8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yk/fACgvmenbo5HipqyHnOmTQlT50xQ9EYAn2eTq6ai
GkjLXBGNWOPBa5cU52qf
=yEi/
-----END PGP SIGNATURE-----
Merge tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the set of driver core changes for 5.19-rc1.
Lots of tiny driver core changes and cleanups happened this cycle, but
the two major things are:
- firmware_loader reorganization and additions including the ability
to have XZ compressed firmware images and the ability for userspace
to initiate the firmware load when it needs to, instead of being
always initiated by the kernel. FPGA devices specifically want this
ability to have their firmware changed over the lifetime of the
system boot, and this allows them to work without having to come up
with yet-another-custom-uapi interface for loading firmware for
them.
- physical location support added to sysfs so that devices that know
this information, can tell userspace where they are located in a
common way. Some ACPI devices already support this today, and more
bus types should support this in the future.
Smaller changes include:
- driver_override api cleanups and fixes
- error path cleanups and fixes
- get_abi script fixes
- deferred probe timeout changes.
It's that last change that I'm the most worried about. It has been
reported to cause boot problems for a number of systems, and I have a
tested patch series that resolves this issue. But I didn't get it
merged into my tree before 5.18-final came out, so it has not gotten
any linux-next testing.
I'll send the fixup patches (there are 2) as a follow-on series to this
pull request.
All have been tested in linux-next for weeks, with no reported issues
other than the above-mentioned boot time-outs"
* tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits)
driver core: fix deadlock in __device_attach
kernfs: Separate kernfs_pr_cont_buf and rename_lock.
topology: Remove unused cpu_cluster_mask()
driver core: Extend deferred probe timeout on driver registration
MAINTAINERS: add Russ Weight as a firmware loader maintainer
driver: base: fix UAF when driver_attach failed
test_firmware: fix end of loop test in upload_read_show()
driver core: location: Add "back" as a possible output for panel
driver core: location: Free struct acpi_pld_info *pld
driver core: Add "*" wildcard support to driver_async_probe cmdline param
driver core: location: Check for allocations failure
arch_topology: Trace the update thermal pressure
kernfs: Rename kernfs_put_open_node to kernfs_unlink_open_file.
export: fix string handling of namespace in EXPORT_SYMBOL_NS
rpmsg: use local 'dev' variable
rpmsg: Fix calling device_lock() on non-initialized device
firmware_loader: describe 'module' parameter of firmware_upload_register()
firmware_loader: Move definitions from sysfs_upload.h to sysfs.h
firmware_loader: Fix configs for sysfs split
selftests: firmware: Add firmware upload selftests
...
Here is the "big" set of USB and Thunderbolt driver changes for
5.18-rc1. For the most part it's been a quiet development cycle for the
USB core, but there are the usual "hot spots" of development activity.
Included in here are:
- Thunderbolt driver updates:
- fixes for devices without displayport adapters
- lane bonding support and improvements
- other minor changes based on device testing
- dwc3 gadget driver changes. It seems this driver will never
be finished given that the IP core is showing up in zillions
of new devices and each implementation decides to do something
different with it...
- uvc gadget driver updates as more devices start to use and
rely on this hardware as well
- usb_maxpacket() api changes to remove an unneeded and unused
parameter.
- usb-serial driver device id updates and small cleanups
- typec cleanups and fixes based on device testing
- device tree updates for usb properties
- lots of other small fixes and driver updates.
All of these have been in linux-next for weeks with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYpnZGw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymQhwCeLVANsQjBcL4ys4skl+1In17y28gAn3rEZ7rQ
Yv4uP9zadUqg3Cx0vjgf
=3s5s
-----END PGP SIGNATURE-----
Merge tag 'usb-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt updates from Greg KH:
"Here is the "big" set of USB and Thunderbolt driver changes for
5.18-rc1. For the most part it's been a quiet development cycle for
the USB core, but there are the usual "hot spots" of development
activity.
Included in here are:
- Thunderbolt driver updates:
- fixes for devices without displayport adapters
- lane bonding support and improvements
- other minor changes based on device testing
- dwc3 gadget driver changes.
It seems this driver will never be finished given that the IP core
is showing up in zillions of new devices and each implementation
decides to do something different with it...
- uvc gadget driver updates as more devices start to use and rely on
this hardware as well
- usb_maxpacket() api changes to remove an unneeded and unused
parameter.
- usb-serial driver device id updates and small cleanups
- typec cleanups and fixes based on device testing
- device tree updates for usb properties
- lots of other small fixes and driver updates.
All of these have been in linux-next for weeks with no reported
problems"
* tag 'usb-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (154 commits)
USB: new quirk for Dell Gen 2 devices
usb: dwc3: core: Add error log when core soft reset failed
usb: dwc3: gadget: Move null pinter check to proper place
usb: hub: Simplify error and success path in port_over_current_notify
usb: cdns3: allocate TX FIFO size according to composite EP number
usb: dwc3: Fix ep0 handling when getting reset while doing control transfer
usb: Probe EHCI, OHCI controllers asynchronously
usb: isp1760: Fix out-of-bounds array access
xhci: Don't defer primary roothub registration if there is only one roothub
USB: serial: option: add Quectel BG95 modem
USB: serial: pl2303: fix type detection for odd device
xhci: Allow host runtime PM as default for Intel Alder Lake N xHCI
xhci: Remove quirk for over 10 year old evaluation hardware
xhci: prevent U2 link power state if Intel tier policy prevented U1
xhci: use generic command timer for stop endpoint commands.
usb: host: xhci-plat: omit shared hcd if either root hub has no ports
usb: host: xhci-plat: prepare operation w/o shared hcd
usb: host: xhci-plat: create shared hcd after having added main hcd
xhci: prepare for operation w/o shared hcd
xhci: factor out parts of xhci_gen_setup()
...
Including:
- Intel VT-d driver updates
- Domain force snooping improvement.
- Cleanups, no intentional functional changes.
- ARM SMMU driver updates
- Add new Qualcomm device-tree compatible strings
- Add new Nvidia device-tree compatible string for Tegra234
- Fix UAF in SMMUv3 shared virtual addressing code
- Force identity-mapped domains for users of ye olde SMMU
legacy binding
- Minor cleanups
- Patches to fix a BUG_ON in the vfio_iommu_group_notifier
- Groundwork for upcoming iommufd framework
- Introduction of DMA ownership so that an entire IOMMU group
is either controlled by the kernel or by user-space
- MT8195 and MT8186 support in the Mediatek IOMMU driver
- Patches to make forcing of cache-coherent DMA more coherent
between IOMMU drivers
- Fixes for thunderbolt device DMA protection
- Various smaller fixes and cleanups
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmKWCbUACgkQK/BELZcB
GuPHmRAAuoH9iK/jrC3SgrqpBfH2iRN7ovIX8dFvgbQWX27lhXF4gvj2/nYdIvPK
75j/LmdibuzV3Iez4kjbGKNG1AikwK3dKIH21a84f3ctnoamQyL6nMfCVBFaVD/D
kvPpTHyjbGPNf6KZyWQdkJ5DXD1aoG1DKkBnslH5pTNPqGuNqbcnRTg0YxiJFLBv
5w2B6jL06XRzunh+Sp1Dbj+po8ROjLRCEU+tdrndO8W/Dyp6+ZNNuxL9/3BM9zMj
py0M4piFtGnhmJSdym1eeHm7r1YRjkZw+MN+e8NcrcSihmDutEWo7nRRxA5uVaa+
3O2DNERqCvQUYxfNRUOKwzV8v51GYQHEPhvOe/MLgaEQDmDmlF2dHNGm93eCMdrv
m1cT011oU7pa4qHomwLyTJxSsR7FzJ37igq/WeY++MBhl+frqfzEQPVxF+W7GLb8
QvT/+woCPzLVpJbE7s0FUD4nbPd8c1dAz4+HO1DajxILIOTq1bnPIorSjgXODRjq
yzsiP1rAg0L0PsL7pXn3cPMzNCE//xtOsRsAGmaVv6wBoMLyWVFCU/wjPEdjrSWA
nXpAuCL84uxCEl/KLYMsg9UhjT6ko7CuKdsybIG9zNIiUau43uSqgTen0xCpYt0i
m//O/X3tPyxmoLKRW+XVehGOrBZW+qrQny6hk/Zex+6UJQqVMTA=
=W0hj
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
- Intel VT-d driver updates:
- Domain force snooping improvement.
- Cleanups, no intentional functional changes.
- ARM SMMU driver updates:
- Add new Qualcomm device-tree compatible strings
- Add new Nvidia device-tree compatible string for Tegra234
- Fix UAF in SMMUv3 shared virtual addressing code
- Force identity-mapped domains for users of ye olde SMMU legacy
binding
- Minor cleanups
- Fix a BUG_ON in the vfio_iommu_group_notifier:
- Groundwork for upcoming iommufd framework
- Introduction of DMA ownership so that an entire IOMMU group is
either controlled by the kernel or by user-space
- MT8195 and MT8186 support in the Mediatek IOMMU driver
- Make forcing of cache-coherent DMA more coherent between IOMMU
drivers
- Fixes for thunderbolt device DMA protection
- Various smaller fixes and cleanups
* tag 'iommu-updates-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (88 commits)
iommu/amd: Increase timeout waiting for GA log enablement
iommu/s390: Tolerate repeat attach_dev calls
iommu/vt-d: Remove hard coding PGSNP bit in PASID entries
iommu/vt-d: Remove domain_update_iommu_snooping()
iommu/vt-d: Check domain force_snooping against attached devices
iommu/vt-d: Block force-snoop domain attaching if no SC support
iommu/vt-d: Size Page Request Queue to avoid overflow condition
iommu/vt-d: Fold dmar_insert_one_dev_info() into its caller
iommu/vt-d: Change return type of dmar_insert_one_dev_info()
iommu/vt-d: Remove unneeded validity check on dev
iommu/dma: Explicitly sort PCI DMA windows
iommu/dma: Fix iova map result check bug
iommu/mediatek: Fix NULL pointer dereference when printing dev_name
iommu: iommu_group_claim_dma_owner() must always assign a domain
iommu/arm-smmu: Force identity domains for legacy binding
iommu/arm-smmu: Support Tegra234 SMMU
dt-bindings: arm-smmu: Add compatible for Tegra234 SOC
dt-bindings: arm-smmu: Document nvidia,memory-controller property
iommu/arm-smmu-qcom: Add SC8280XP support
dt-bindings: arm-smmu: Add compatible for Qualcomm SC8280XP
...
- Add driver-core infrastructure for lockdep validation of
device_lock(), and fixup a deadlock report that was previously hidden
behind the 'lockdep no validate' policy.
- Add CXL _OSC support for claiming native control of CXL hotplug and
error handling.
- Disable suspend in the presence of CXL memory unless and until a
protocol is identified for restoring PCI device context from memory
hosted on CXL PCI devices.
- Add support for snooping CXL mailbox commands to protect against
inopportune changes, like set-partition with the 'immediate' flag set.
- Rework how the driver detects legacy CXL 1.1 configurations (CXL DVSEC
/ 'mem_enable') before enabling new CXL 2.0 decode configurations (CXL
HDM Capability).
- Miscellaneous cleanups and fixes from -next exposure.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCYpFUogAKCRDfioYZHlFs
Zz+VAP9o/NkYhbaM2Ne9ImgsdJii96gA8nN7q/q/ZoXjsSx2WQD+NRC5d3ZwZDCa
9YKEkntnvbnAZOCs+ZUuyZBgNh6vsgU=
=p92w
-----END PGP SIGNATURE-----
Merge tag 'cxl-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull cxl updates from Dan Williams:
"Compute Express Link (CXL) updates for this cycle.
The highlight is new driver-core infrastructure and CXL subsystem
changes for allowing lockdep to validate device_lock() usage. Thanks
to PeterZ for setting me straight on the current capabilities of the
lockdep API, and Greg acked it as well.
On the CXL ACPI side this update adds support for CXL _OSC so that
platform firmware knows that it is safe to still grant Linux native
control of PCIe hotplug and error handling in the presence of CXL
devices. A circular dependency problem was discovered between suspend
and CXL memory for cases where the suspend image might be stored in
CXL memory where that image also contains the PCI register state to
restore to re-enable the device. Disable suspend for now until an
architecture is defined to clarify that conflict.
Lastly a collection of reworks, fixes, and cleanups to the CXL
subsystem where support for snooping mailbox commands and properly
handling the "mem_enable" flow are the highlights.
Summary:
- Add driver-core infrastructure for lockdep validation of
device_lock(), and fixup a deadlock report that was previously
hidden behind the 'lockdep no validate' policy.
- Add CXL _OSC support for claiming native control of CXL hotplug and
error handling.
- Disable suspend in the presence of CXL memory unless and until a
protocol is identified for restoring PCI device context from memory
hosted on CXL PCI devices.
- Add support for snooping CXL mailbox commands to protect against
inopportune changes, like set-partition with the 'immediate' flag
set.
- Rework how the driver detects legacy CXL 1.1 configurations (CXL
DVSEC / 'mem_enable') before enabling new CXL 2.0 decode
configurations (CXL HDM Capability).
- Miscellaneous cleanups and fixes from -next exposure"
* tag 'cxl-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (47 commits)
cxl/port: Enable HDM Capability after validating DVSEC Ranges
cxl/port: Reuse 'struct cxl_hdm' context for hdm init
cxl/port: Move endpoint HDM Decoder Capability init to port driver
cxl/pci: Drop @info argument to cxl_hdm_decode_init()
cxl/mem: Merge cxl_dvsec_ranges() and cxl_hdm_decode_init()
cxl/mem: Skip range enumeration if mem_enable clear
cxl/mem: Consolidate CXL DVSEC Range enumeration in the core
cxl/pci: Move cxl_await_media_ready() to the core
cxl/mem: Validate port connectivity before dvsec ranges
cxl/mem: Fix cxl_mem_probe() error exit
cxl/pci: Drop wait_for_valid() from cxl_await_media_ready()
cxl/pci: Consolidate wait_for_media() and wait_for_media_ready()
cxl/mem: Drop mem_enabled check from wait_for_media()
nvdimm: Fix firmware activation deadlock scenarios
device-core: Kill the lockdep_mutex
nvdimm: Drop nd_device_lock()
ACPI: NFIT: Drop nfit_device_lock()
nvdimm: Replace lockdep_mutex with local lock classes
cxl: Drop cxl_device_lock()
cxl/acpi: Add root device lockdep validation
...
file-backed transparent hugepages.
Johannes Weiner has arranged for zswap memory use to be tracked and
managed on a per-cgroup basis.
Munchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for runtime
enablement of the recent huge page vmemmap optimization feature.
Baolin Wang contributes a series to fix some issues around hugetlb
pagetable invalidation.
Zhenwei Pi has fixed some interactions between hwpoisoned pages and
virtualization.
Tong Tiangen has enabled the use of the presently x86-only
page_table_check debugging feature on arm64 and riscv.
David Vernet has done some fixup work on the memcg selftests.
Peter Xu has taught userfaultfd to handle write protection faults against
shmem- and hugetlbfs-backed files.
More DAMON development from SeongJae Park - adding online tuning of the
feature and support for monitoring of fixed virtual address ranges. Also
easier discovery of which monitoring operations are available.
Nadav Amit has done some optimization of TLB flushing during mprotect().
Neil Brown continues to labor away at improving our swap-over-NFS support.
David Hildenbrand has some fixes to anon page COWing versus
get_user_pages().
Peng Liu fixed some errors in the core hugetlb code.
Joao Martins has reduced the amount of memory consumed by device-dax's
compound devmaps.
Some cleanups of the arch-specific pagemap code from Anshuman Khandual.
Muchun Song has found and fixed some errors in the TLB flushing of
transparent hugepages.
Roman Gushchin has done more work on the memcg selftests.
And, of course, many smaller fixes and cleanups. Notably, the customary
million cleanup serieses from Miaohe Lin.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYo52xQAKCRDdBJ7gKXxA
jtJFAQD238KoeI9z5SkPMaeBRYSRQmNll85mxs25KapcEgWgGQD9FAb7DJkqsIVk
PzE+d9hEfirUGdL6cujatwJ6ejYR8Q8=
=nFe6
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
"Almost all of MM here. A few things are still getting finished off,
reviewed, etc.
- Yang Shi has improved the behaviour of khugepaged collapsing of
readonly file-backed transparent hugepages.
- Johannes Weiner has arranged for zswap memory use to be tracked and
managed on a per-cgroup basis.
- Munchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for
runtime enablement of the recent huge page vmemmap optimization
feature.
- Baolin Wang contributes a series to fix some issues around hugetlb
pagetable invalidation.
- Zhenwei Pi has fixed some interactions between hwpoisoned pages and
virtualization.
- Tong Tiangen has enabled the use of the presently x86-only
page_table_check debugging feature on arm64 and riscv.
- David Vernet has done some fixup work on the memcg selftests.
- Peter Xu has taught userfaultfd to handle write protection faults
against shmem- and hugetlbfs-backed files.
- More DAMON development from SeongJae Park - adding online tuning of
the feature and support for monitoring of fixed virtual address
ranges. Also easier discovery of which monitoring operations are
available.
- Nadav Amit has done some optimization of TLB flushing during
mprotect().
- Neil Brown continues to labor away at improving our swap-over-NFS
support.
- David Hildenbrand has some fixes to anon page COWing versus
get_user_pages().
- Peng Liu fixed some errors in the core hugetlb code.
- Joao Martins has reduced the amount of memory consumed by
device-dax's compound devmaps.
- Some cleanups of the arch-specific pagemap code from Anshuman
Khandual.
- Muchun Song has found and fixed some errors in the TLB flushing of
transparent hugepages.
- Roman Gushchin has done more work on the memcg selftests.
... and, of course, many smaller fixes and cleanups. Notably, the
customary million cleanup serieses from Miaohe Lin"
* tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (381 commits)
mm: kfence: use PAGE_ALIGNED helper
selftests: vm: add the "settings" file with timeout variable
selftests: vm: add "test_hmm.sh" to TEST_FILES
selftests: vm: check numa_available() before operating "merge_across_nodes" in ksm_tests
selftests: vm: add migration to the .gitignore
selftests/vm/pkeys: fix typo in comment
ksm: fix typo in comment
selftests: vm: add process_mrelease tests
Revert "mm/vmscan: never demote for memcg reclaim"
mm/kfence: print disabling or re-enabling message
include/trace/events/percpu.h: cleanup for "percpu: improve percpu_alloc_percpu event trace"
include/trace/events/mmflags.h: cleanup for "tracing: incorrect gfp_t conversion"
mm: fix a potential infinite loop in start_isolate_page_range()
MAINTAINERS: add Muchun as co-maintainer for HugeTLB
zram: fix Kconfig dependency warning
mm/shmem: fix shmem folio swapoff hang
cgroup: fix an error handling path in alloc_pagecache_max_30M()
mm: damon: use HPAGE_PMD_SIZE
tracing: incorrect isolate_mote_t cast in mm_vmscan_lru_isolate
nodemask.h: fix compilation error with GCC12
...
- Allow error pointer to be passed to fwnode APIs (Andy Shevchenko).
- Introduce fwnode_for_each_parent_node() (Andy Shevchenko, Douglas
Anderson).
- Advertise fwnode and device property count API calls (Andy
Shevchenko).
- Clean up fwnode_is_ancestor_of() (Andy Shevchenko).
- Convert device_{dma_supported,get_dma_attr} to fwnode (Sakari Ailus).
- Release subnode properties with data nodes (Sakari Ailus).
- Add ->iomap() and ->irq_get() to fwnode operations (Sakari Ailus).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmKL4qwSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxO+AP/ir+1VKydUioBbH9vh8grCF1vAoJhknv
jb0STEBq+SH7+EwWbpj/J9+ldACvIJ0wnrsZ83vbl9k0z5+mrddw3HiQrrCc2PSc
hRfWCi4ihnTlDK2Sctm/suBSNivh8kcGwJUcYOwYaWEVdGoUXrqldWzzRo48DYgo
KnHm4e2V5Gob2u4edbdgtOl4BGcBcOPNDdOe15Ra6pjcp+0DUWP55j51kmhpjLSk
PuAEtbNLPOzOVIVFpANU4Q7/3G3gjKIwPjKwAsWaa2UKYdsmcLhr8gaBo05UBV6g
M6VPq14OHLGBlwxcxrsFgsNy9uBHnxEAHE0NRg7hrA2eHAqiuA7R+itBvzcgx8Wj
HRTI9ZFjZabx82rWkb7iKI/K/ok5igzX3/zrbr7Bf3+7QY0+UEMqpmPVak01xqe7
nFk7x3rHgZOsKPs+0ZN8pG16vEyuPFEQLJV0o25fiyhK2ob8tbAZxy3nVBRm8H/o
yzuD9rYmZKHHwp3Ff4Bn+S8/T6vpYhsQkEz+NjjGMcXZXMBhbH+wsu9aBtVY/Jdp
wVrmkxm/RC3b97oVoisqY05Wv/tgUwybSkPbARZWQjmshuazrvaJAi+N/LWwBuMK
gsBVLTivWthjlCiPwRZlADTKbhlpdwSikWaC0xktsyqaO3JDtJFn305V7L9oWpQz
tShfSJ2GRBOT
=ly4P
-----END PGP SIGNATURE-----
Merge tag 'devprop-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull device properties framework updates from Rafael Wysocki:
"These mostly extend the device property API and make it easier to use
in some cases.
Specifics:
- Allow error pointer to be passed to fwnode APIs (Andy Shevchenko).
- Introduce fwnode_for_each_parent_node() (Andy Shevchenko, Douglas
Anderson).
- Advertise fwnode and device property count API calls (Andy
Shevchenko).
- Clean up fwnode_is_ancestor_of() (Andy Shevchenko).
- Convert device_{dma_supported,get_dma_attr} to fwnode (Sakari
Ailus).
- Release subnode properties with data nodes (Sakari Ailus).
- Add ->iomap() and ->irq_get() to fwnode operations (Sakari Ailus)"
* tag 'devprop-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
device property: Advertise fwnode and device property count API calls
device property: Fix recent breakage of fwnode_get_next_parent_dev()
device property: Drop 'test' prefix in parameters of fwnode_is_ancestor_of()
device property: Introduce fwnode_for_each_parent_node()
device property: Allow error pointer to be passed to fwnode APIs
ACPI: property: Release subnode properties with data nodes
device property: Add irq_get to fwnode operation
device property: Add iomap to fwnode operations
ACPI: property: Move acpi_fwnode_device_get_match_data() up
device property: Convert device_{dma_supported,get_dma_attr} to fwnode
- Add thermal library and thermal tools to encapsulate the netlink
into event based callbacks (Daniel Lezcano, Jiapeng Chong).
- Improve overheat condition handling during suspend-to-idle in the
Intel PCH thermal driver (Zhang Rui).
- Use local ops instead of global ops in devfreq_cooling (Kant Fan).
- Clean up _OSC handling in int340x (Davidlohr Bueso).
- Switch hisi_termal from CONFIG_PM_SLEEP guards to pm_sleep_ptr()
(Hesham Almatary).
- Add new k3 j72xx bangdap driver and the corresponding bindings
(Keerthy).
- Fix missing of_node_put() in the SC iMX driver at probe time
(Miaoqian Lin).
- Fix memory leak in __thermal_cooling_device_register() when
device_register() fails by calling thermal_cooling_device_destroy_sysfs()
(Yang Yingliang).
- Add sc8180x and sc8280xp compatible string in the DT bindings and
lMH support for QCom tsens driver (Bjorn Andersson).
- Fix OTP Calibration Register values conforming to the documentation
on RZ/G2L and bindings documentation for RZ/G2UL (Biju Das).
- Fix type in kerneldoc description for __thermal_bind_params (Corentin
Labbe).
- Fix potential NULL dereference in sr_thermal_probe() on Broadcom
platform (Zheng Yongjun).
- Add change mode ops to the thermal-of sensor (Manaf Meethalavalappu
Pallikunhi).
- Fix non-negative value support by preventing the value to be clamp
to zero (Stefan Wahren).
- Add compatible string and DT bindings for MSM8960 tsens driver
(Dmitry Baryshkov).
- Add hwmon support for K3 driver (Massimiliano Minella).
- Refactor and add multiple generations support for QCom ADC driver
(Jishnu Prakash).
- Use platform_get_irq_optional() to get the interrupt on RCar driver and
document Document RZ/V2L bindings (Lad Prabhakar).
- Remove NULL check after container_of() call from the Intel HFI
thermal driver (Haowen Bai).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmKL3MESHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxgzsQAKj4D0ROHLdqReYeIutS7IU6YQ+ejh3N
krcAZvxQd8B3sVwSHHABPdsHwwfYjP4BwoHHVtha5q+5zFGBq5F3fJHsDeRxN23T
NEOW81HBaLu81N5gNuAYg+NQcdqBR4U4IGKZfxWRyx13OMECXGtJRb/SIW/TuDYk
AQIrqOivVEsVRtn+qp+x0rKHsj6ge+2lU1OyJagr2BDhgLrwzxl6cmNBfali1C3D
sw4SqGwGVjEB0QDDpMbty9BsLEjNRE86kwoPh2u0K9dT9/b/P9pk1lp+pOnltspZ
eOdwr0CtoEHw2x5tuhbO7fmvNIuf5jgSDWsCrP/xYnsozcRiwWsgyeJj4f5soreN
kTJB8S/FH+fd7UuYqdmDIeJpDnkkGt1jIqjkDvfJNL7ffBWIe/meXKF4vBnrdlbd
FMNQwzgc5eug07IFjOE43V1v5Hw1H1leVUwZczOMGNIhqyy0WZyQr7vzwbr16jCO
x0hu3q3Y++5SUAUy9WzsOOrpRHa9JxUwVhZuLG7ajf+6pqPxB8kqs9CJe+sgWKju
T4KqHbljJ2Gnkm9SngT0BdnB+AgparhXf8/8Mj0ZBQvamXw+ylNbYfG7qTZk5Ne/
rUf4YDew6hGDqikyJBe2e0a1lLSoN4zaADC1lLYhC+Zp7m6eK0r3y3mhic0bZe1c
RU/2eeA4kTuZ
=3ltV
-----END PGP SIGNATURE-----
Merge tag 'thermal-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control updates from Rafael Wysocki:
"These add a thermal library and thermal tools to wrap the netlink
interface into event-based callbacks, improve overheat condition
handling during suspend-to-idle on Intel SoCs, add some new hardware
support, fix bugs and clean up code.
Specifics:
- Add thermal library and thermal tools to encapsulate the netlink
into event based callbacks (Daniel Lezcano, Jiapeng Chong).
- Improve overheat condition handling during suspend-to-idle in the
Intel PCH thermal driver (Zhang Rui).
- Use local ops instead of global ops in devfreq_cooling (Kant Fan).
- Clean up _OSC handling in int340x (Davidlohr Bueso).
- Switch hisi_termal from CONFIG_PM_SLEEP guards to pm_sleep_ptr()
(Hesham Almatary).
- Add new k3 j72xx bangdap driver and the corresponding bindings
(Keerthy).
- Fix missing of_node_put() in the SC iMX driver at probe time
(Miaoqian Lin).
- Fix memory leak in __thermal_cooling_device_register()
when device_register() fails by calling
thermal_cooling_device_destroy_sysfs() (Yang Yingliang).
- Add sc8180x and sc8280xp compatible string in the DT bindings and
lMH support for QCom tsens driver (Bjorn Andersson).
- Fix OTP Calibration Register values conforming to the documentation
on RZ/G2L and bindings documentation for RZ/G2UL (Biju Das).
- Fix type in kerneldoc description for __thermal_bind_params
(Corentin Labbe).
- Fix potential NULL dereference in sr_thermal_probe() on Broadcom
platform (Zheng Yongjun).
- Add change mode ops to the thermal-of sensor (Manaf Meethalavalappu
Pallikunhi).
- Fix non-negative value support by preventing the value to be clamp
to zero (Stefan Wahren).
- Add compatible string and DT bindings for MSM8960 tsens driver
(Dmitry Baryshkov).
- Add hwmon support for K3 driver (Massimiliano Minella).
- Refactor and add multiple generations support for QCom ADC driver
(Jishnu Prakash).
- Use platform_get_irq_optional() to get the interrupt on RCar driver
and document Document RZ/V2L bindings (Lad Prabhakar).
- Remove NULL check after container_of() call from the Intel HFI
thermal driver (Haowen Bai)"
* tag 'thermal-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (38 commits)
thermal: intel: pch: improve the cooling delay log
thermal: intel: pch: enhance overheat handling
thermal: intel: pch: move cooling delay to suspend_noirq phase
PM: wakeup: expose pm_wakeup_pending to modules
thermal: k3_j72xx_bandgap: Add the bandgap driver support
dt-bindings: thermal: k3-j72xx: Add VTM bindings documentation
thermal/drivers/imx_sc_thermal: Fix refcount leak in imx_sc_thermal_probe
thermal/core: Fix memory leak in __thermal_cooling_device_register()
dt-bindings: thermal: tsens: Add sc8280xp compatible
dt-bindings: thermal: lmh: Add Qualcomm sc8180x compatible
thermal/drivers/qcom/lmh: Add sc8180x compatible
thermal/drivers/rz2gl: Fix OTP Calibration Register values
dt-bindings: thermal: rzg2l-thermal: Document RZ/G2UL bindings
thermal: thermal_of: fix typo on __thermal_bind_params
tools/thermal: remove unneeded semicolon
tools/lib/thermal: remove unneeded semicolon
thermal/drivers/broadcom: Fix potential NULL dereference in sr_thermal_probe
tools/thermal: Add thermal daemon skeleton
tools/thermal: Add a temperature capture tool
tools/thermal: Add util library
...
- Update the Energy Model support code to allow the Energy Model to be
artificial, which means that the power values may not be on a uniform
scale with other devices providing power information, and update the
cpufreq_cooling and devfreq_cooling thermal drivers to support
artificial Energy Models (Lukasz Luba).
- Make DTPM check the Energy Model type (Lukasz Luba).
- Fix policy counter decrementation in cpufreq if Energy Model is in
use (Pierre Gondois).
- Add CPU-based scaling support to passive devfreq governor (Saravana
Kannan, Chanwoo Choi).
- Update the rk3399_dmc devfreq driver (Brian Norris).
- Export dev_pm_ops instead of suspend() and resume() in the IIO
chemical scd30 driver (Jonathan Cameron).
- Add namespace variants of EXPORT[_GPL]_SIMPLE_DEV_PM_OPS and
PM-runtime counterparts (Jonathan Cameron).
- Move symbol exports in the IIO chemical scd30 driver into the
IIO_SCD30 namespace (Jonathan Cameron).
- Avoid device PM-runtime usage count underflows (Rafael Wysocki).
- Allow dynamic debug to control printing of PM messages (David
Cohen).
- Fix some kernel-doc comments in hibernation code (Yang Li, Haowen
Bai).
- Preserve ACPI-table override during hibernation (Amadeusz Sławiński).
- Improve support for suspend-to-RAM for PSCI OSI mode (Ulf Hansson).
- Make Intel RAPL power capping driver support the RaptorLake and
AlderLake N processors (Zhang Rui, Sumeet Pawnikar).
- Remove redundant store to value after multiply in the RAPL power
capping driver (Colin Ian King).
- Add AlderLake processor support to the intel_idle driver (Zhang Rui).
- Fix regression leading to no genpd governor in the PSCI cpuidle
driver and fix the riscv-sbi cpuidle driver to allow a genpd
governor to be used (Ulf Hansson).
- Fix cpufreq governor clean up code to avoid using kfree() directly
to free kobject-based items (Kevin Hao).
- Prepare cpufreq for powerpc's asm/prom.h cleanup (Christophe Leroy).
- Make intel_pstate notify frequency invariance code when no_turbo is
turned on and off (Chen Yu).
- Add Sapphire Rapids OOB mode support to intel_pstate (Srinivas
Pandruvada).
- Make cpufreq avoid unnecessary frequency updates due to mismatch
between hardware and the frequency table (Viresh Kumar).
- Make remove_cpu_dev_symlink() clear the real_cpus mask to simplify
code (Viresh Kumar).
- Rearrange cpufreq_offline() and cpufreq_remove_dev() to make the
calling convention for some driver callbacks consistent (Rafael
Wysocki).
- Avoid accessing half-initialized cpufreq policies from the show()
and store() sysfs functions (Schspa Shi).
- Rearrange cpufreq_offline() to make the calling convention for some
driver callbacks consistent (Schspa Shi).
- Update CPPC handling in cpufreq (Pierre Gondois).
- Extend dev_pm_domain_detach() doc (Krzysztof Kozlowski).
- Move genpd's time-accounting to ktime_get_mono_fast_ns() (Ulf
Hansson).
- Improve the way genpd deals with its governors (Ulf Hansson).
- Update the turbostat utility to version 2022.04.16 (Len Brown,
Dan Merillat, Sumeet Pawnikar, Zephaniah E. Loss-Cutler-Hull, Chen
Yu).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmKL3hsSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxW4oP/RzMh6dclWXs3J/gUCKTqRepq6cb80tq
Q2r9xRRHwy6ZH/PVddGDHmhQ7d3NAv13s4srA9kznZognF3hzuxnGau226ilDqHh
qxVSBRjWY9ijxRBvkcCaa6HZm4Chb91pUX0CLpdYSl9BTgIdk66HZYaMsKhHU/di
j7KKHPdKyyQkssWnMjGEyuaF+UebiEgISCF3+X0eb6c1m7GHXpgLJVxNy0pKkUdK
j+n6+ms12OlVLtg1eIl0J5824w/rkK3ZdqfEXJSq++mNMqSj/KCI3yWpzsLKp9AB
xxhox/tPgJVyON8Vtbb2IkWkiQUKeSrAGIUYXWmnwIZYLPSGD7BPzr82Cxr7S/ez
imMB+1Qd3SsOQ9EdI9rGYgNsEF2vOs1xjMehSdUdmTz148IzBOBt4YyQeb/mfXqH
nh9eVuFCzqH1lAayYt6iP1+V5gQn9as/+rR91k4k4A6OKXomuQUGORLeHfuKMfNH
eBZ72tdXqiq6z+ag3lY3pBAMSm11epCOa3VR6QNaC7hrlY3AZP+o3tIUL6W813b+
V3l1gWApGHZE1hiDM95dll/dIt9IZpTRd3dlqF/YnFW7fPDrz71EGvhrZpO7vdO0
/G6eJcCDjqJVcbCE8Y77I6/AXjpVQ7PRPeNx6aW7jPcQhpVIgcsF2BGjk9anjXDs
3yHJs9R/HMmA
=Hewm
-----END PGP SIGNATURE-----
Merge tag 'pm-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These add support for 'artificial' Energy Models in which power
numbers for different entities may be in different scales, add support
for some new hardware, fix bugs and clean up code in multiple places.
Specifics:
- Update the Energy Model support code to allow the Energy Model to
be artificial, which means that the power values may not be on a
uniform scale with other devices providing power information, and
update the cpufreq_cooling and devfreq_cooling thermal drivers to
support artificial Energy Models (Lukasz Luba).
- Make DTPM check the Energy Model type (Lukasz Luba).
- Fix policy counter decrementation in cpufreq if Energy Model is in
use (Pierre Gondois).
- Add CPU-based scaling support to passive devfreq governor (Saravana
Kannan, Chanwoo Choi).
- Update the rk3399_dmc devfreq driver (Brian Norris).
- Export dev_pm_ops instead of suspend() and resume() in the IIO
chemical scd30 driver (Jonathan Cameron).
- Add namespace variants of EXPORT[_GPL]_SIMPLE_DEV_PM_OPS and
PM-runtime counterparts (Jonathan Cameron).
- Move symbol exports in the IIO chemical scd30 driver into the
IIO_SCD30 namespace (Jonathan Cameron).
- Avoid device PM-runtime usage count underflows (Rafael Wysocki).
- Allow dynamic debug to control printing of PM messages (David
Cohen).
- Fix some kernel-doc comments in hibernation code (Yang Li, Haowen
Bai).
- Preserve ACPI-table override during hibernation (Amadeusz
Sławiński).
- Improve support for suspend-to-RAM for PSCI OSI mode (Ulf Hansson).
- Make Intel RAPL power capping driver support the RaptorLake and
AlderLake N processors (Zhang Rui, Sumeet Pawnikar).
- Remove redundant store to value after multiply in the RAPL power
capping driver (Colin Ian King).
- Add AlderLake processor support to the intel_idle driver (Zhang
Rui).
- Fix regression leading to no genpd governor in the PSCI cpuidle
driver and fix the riscv-sbi cpuidle driver to allow a genpd
governor to be used (Ulf Hansson).
- Fix cpufreq governor clean up code to avoid using kfree() directly
to free kobject-based items (Kevin Hao).
- Prepare cpufreq for powerpc's asm/prom.h cleanup (Christophe
Leroy).
- Make intel_pstate notify frequency invariance code when no_turbo is
turned on and off (Chen Yu).
- Add Sapphire Rapids OOB mode support to intel_pstate (Srinivas
Pandruvada).
- Make cpufreq avoid unnecessary frequency updates due to mismatch
between hardware and the frequency table (Viresh Kumar).
- Make remove_cpu_dev_symlink() clear the real_cpus mask to simplify
code (Viresh Kumar).
- Rearrange cpufreq_offline() and cpufreq_remove_dev() to make the
calling convention for some driver callbacks consistent (Rafael
Wysocki).
- Avoid accessing half-initialized cpufreq policies from the show()
and store() sysfs functions (Schspa Shi).
- Rearrange cpufreq_offline() to make the calling convention for some
driver callbacks consistent (Schspa Shi).
- Update CPPC handling in cpufreq (Pierre Gondois).
- Extend dev_pm_domain_detach() doc (Krzysztof Kozlowski).
- Move genpd's time-accounting to ktime_get_mono_fast_ns() (Ulf
Hansson).
- Improve the way genpd deals with its governors (Ulf Hansson).
- Update the turbostat utility to version 2022.04.16 (Len Brown, Dan
Merillat, Sumeet Pawnikar, Zephaniah E. Loss-Cutler-Hull, Chen Yu)"
* tag 'pm-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (94 commits)
PM: domains: Trust domain-idle-states from DT to be correct by genpd
PM: domains: Measure power-on/off latencies in genpd based on a governor
PM: domains: Allocate governor data dynamically based on a genpd governor
PM: domains: Clean up some code in pm_genpd_init() and genpd_remove()
PM: domains: Fix initialization of genpd's next_wakeup
PM: domains: Fixup QoS latency measurements for IRQ safe devices in genpd
PM: domains: Measure suspend/resume latencies in genpd based on governor
PM: domains: Move the next_wakeup variable into the struct gpd_timing_data
PM: domains: Allocate gpd_timing_data dynamically based on governor
PM: domains: Skip another warning in irq_safe_dev_in_sleep_domain()
PM: domains: Rename irq_safe_dev_in_no_sleep_domain() in genpd
PM: domains: Don't check PM_QOS_FLAG_NO_POWER_OFF in genpd
PM: domains: Drop redundant code for genpd always-on governor
PM: domains: Add GENPD_FLAG_RPM_ALWAYS_ON for the always-on governor
powercap: intel_rapl: remove redundant store to value after multiply
cpufreq: CPPC: Enable dvfs_possible_from_any_cpu
cpufreq: CPPC: Enable fast_switch
ACPI: CPPC: Assume no transition latency if no PCCT
ACPI: bus: Set CPPC _OSC bits for all and when CPPC_LIB is supported
ACPI: CPPC: Check _OSC for flexible address space
...