post_init_entity_util_avg() init task util_avg according to the cpu util_avg
at the time of fork, which will decay when switched_to_fair() some time later,
we'd better to not set them at all in the case of !fair task.
Suggested-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220818124805.601-10-zhouchengming@bytedance.com
When wake_up_new_task(), we use post_init_entity_util_avg() to init
util_avg/runnable_avg based on cpu's util_avg at that time, and
attach task sched_avg to cfs_rq.
Since enqueue_task_fair() -> enqueue_entity() -> update_load_avg()
loop will do attach, we can move this work to update_load_avg().
wake_up_new_task(p)
post_init_entity_util_avg(p)
attach_entity_cfs_rq() --> (1)
activate_task(rq, p)
enqueue_task() := enqueue_task_fair()
enqueue_entity() loop
update_load_avg(cfs_rq, se, UPDATE_TG | DO_ATTACH)
if (!se->avg.last_update_time && (flags & DO_ATTACH))
attach_entity_load_avg() --> (2)
This patch move attach from (1) to (2), update related comments too.
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220818124805.601-9-zhouchengming@bytedance.com
commit 7dc603c902 ("sched/fair: Fix PELT integrity for new tasks")
introduce a TASK_NEW state and an unnessary limitation that would fail
when changing cgroup of new forked task.
Because at that time, we can't handle task_change_group_fair() for new
forked fair task which hasn't been woken up by wake_up_new_task(),
which will cause detach on an unattached task sched_avg problem.
This patch delete this unnessary limitation by adding check before do
detach or attach in task_change_group_fair().
So cpu_cgrp_subsys.can_attach() has nothing to do for fair tasks,
only define it in #ifdef CONFIG_RT_GROUP_SCHED.
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220818124805.601-8-zhouchengming@bytedance.com
commit 7dc603c902 ("sched/fair: Fix PELT integrity for new tasks")
fixed two load tracking problems for new task, including detach on
unattached new task problem.
There still left another detach on unattached task problem for the task
which has been woken up by try_to_wake_up() and waiting for actually
being woken up by sched_ttwu_pending().
try_to_wake_up(p)
cpu = select_task_rq(p)
if (task_cpu(p) != cpu)
set_task_cpu(p, cpu)
migrate_task_rq_fair()
remove_entity_load_avg() --> unattached
se->avg.last_update_time = 0;
__set_task_cpu()
ttwu_queue(p, cpu)
ttwu_queue_wakelist()
__ttwu_queue_wakelist()
task_change_group_fair()
detach_task_cfs_rq()
detach_entity_cfs_rq()
detach_entity_load_avg() --> detach on unattached task
set_task_rq()
attach_task_cfs_rq()
attach_entity_cfs_rq()
attach_entity_load_avg()
The reason of this problem is similar, we should check in detach_entity_cfs_rq()
that se->avg.last_update_time != 0, before do detach_entity_load_avg().
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220818124805.601-7-zhouchengming@bytedance.com
When we are migrating task out of the CPU, we can combine detach and
propagation into dequeue_entity() to save the detach_entity_cfs_rq()
in migrate_task_rq_fair().
This optimization is like combining DO_ATTACH in the enqueue_entity()
when migrating task to the CPU. So we don't have to traverse the CFS tree
extra time to do the detach_entity_cfs_rq() -> propagate_entity_cfs_rq(),
which wouldn't be called anymore with this patch's change.
detach_task()
deactivate_task()
dequeue_task_fair()
for_each_sched_entity(se)
dequeue_entity()
update_load_avg() /* (1) */
detach_entity_load_avg()
set_task_cpu()
migrate_task_rq_fair()
detach_entity_cfs_rq() /* (2) */
update_load_avg();
detach_entity_load_avg();
propagate_entity_cfs_rq();
for_each_sched_entity()
update_load_avg()
This patch save the detach_entity_cfs_rq() called in (2) by doing
the detach_entity_load_avg() for a CPU migrating task inside (1)
(the task being the first se in the loop)
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220818124805.601-6-zhouchengming@bytedance.com
When reading the sched_avg related code, I found the comments in
enqueue/dequeue_entity() are not updated with the current code.
We don't add/subtract entity's runnable_avg from cfs_rq->runnable_avg
during enqueue/dequeue_entity(), those are done only for attach/detach.
This patch updates the comments to reflect the current code working.
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220818124805.601-5-zhouchengming@bytedance.com
set_task_rq() -> set_task_rq_fair() will try to synchronize the blocked
task's sched_avg when migrate, which is not needed for already detached
task.
task_change_group_fair() will detached the task sched_avg from prev cfs_rq
first, so reset sched_avg last_update_time before set_task_rq() to avoid that.
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220818124805.601-4-zhouchengming@bytedance.com
We use cpu_cgrp_subsys->fork() to set task group for the new fair task
in cgroup_post_fork().
Since commit b1e8206582 ("sched: Fix yet more sched_fork() races")
has already set_task_rq() for the new fair task in sched_cgroup_fork(),
so cpu_cgrp_subsys->fork() can be removed.
cgroup_can_fork() --> pin parent's sched_task_group
sched_cgroup_fork()
__set_task_cpu()
set_task_rq()
cgroup_post_fork()
ss->fork() := cpu_cgroup_fork()
sched_change_group(..., TASK_SET_GROUP)
task_set_group_fair()
set_task_rq() --> can be removed
After this patch's change, task_change_group_fair() only need to
care about task cgroup migration, make the code much simplier.
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lore.kernel.org/r/20220818124805.601-3-zhouchengming@bytedance.com
Previously we only maintain task se depth in task_move_group_fair(),
if a !fair task change task group, its se depth will not be updated,
so commit eb7a59b2c8 ("sched/fair: Reset se-depth when task switched to FAIR")
fix the problem by updating se depth in switched_to_fair() too.
Then commit daa59407b5 ("sched/fair: Unify switched_{from,to}_fair()
and task_move_group_fair()") unified these two functions, moved se.depth
setting to attach_task_cfs_rq(), which further into attach_entity_cfs_rq()
with commit df217913e7 ("sched/fair: Factorize attach/detach entity").
This patch move task se depth maintenance from attach_entity_cfs_rq()
to set_task_rq(), which will be called when CPU/cgroup change, so its
depth will always be correct.
This patch is preparation for the next patch.
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220818124805.601-2-zhouchengming@bytedance.com
The load_balance_mask and select_rq_mask percpu variables are only used in
kernel/sched/fair.c.
Make them static and move their allocation into init_sched_fair_class().
Replace kzalloc_node() with zalloc_cpumask_var_node() to get rid of the
CONFIG_CPUMASK_OFFSTACK #ifdef and to align with per-cpu cpumask
allocation for RT (local_cpu_mask in init_sched_rt_class()) and DL
class (local_cpu_mask_dl in init_sched_dl_class()).
[ mingo: Tidied up changelog & touched up the code. ]
Signed-off-by: Bing Huang <huangbing@kylinos.cn>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220722213609.3901-1-huangbing775@126.com
After commit 7a82e5f52a ("sched/fair: Merge for each idle cpu loop of ILB"),
_nohz_idle_balance()'s 'idle' parameter is not used anymore, so we can remove it.
Signed-off-by: Hao Jia <jiahao.os@bytedance.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220803130223.70419-1-jiahao.os@bytedance.com
Currently, the values of some fields are printed right-aligned, causing
the field value to be next to the next field name rather than next to its
own field name. So print each field value left-aligned, to make it more
readable.
Before:
stack: 0 pid: 307 ppid: 2 flags:0x00000008
After:
stack:0 pid:308 ppid:2 flags:0x0000000a
This also makes them print in the same style as the other two fields:
task:demo0 state:R running task
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Valentin Schneider <vschneid@redhat.com>
Link: https://lore.kernel.org/r/20220727060819.1085-1-thunder.leizhen@huawei.com
Save a multiplication in dl_task_fits_capacity() by using already
maintained per-sched_dl_entity (i.e. per-task) `dl_runtime/dl_deadline`
(dl_density).
cap_scale(dl_deadline, cap) >= dl_runtime
dl_deadline * cap >> SCHED_CAPACITY_SHIFT >= dl_runtime
cap >= dl_runtime << SCHED_CAPACITY_SHIFT / dl_deadline
cap >= (dl_runtime << BW_SHIFT / dl_deadline) >>
BW_SHIFT - SCHED_CAPACITY_SHIFT
cap >= dl_density >> BW_SHIFT - SCHED_CAPACITY_SHIFT
__sched_setscheduler()->__checkparam_dl() ensures that the 2 corner
cases (if conditions) `runtime == RUNTIME_INF (-1)` and `period == 0`
of to_ratio(deadline, runtime) are not met when setting dl_density in
__sched_setscheduler()-> __setscheduler_params()->__setparam_dl().
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20220729111305.1275158-4-dietmar.eggemann@arm.com
dl_cpuset_cpumask_can_shrink() is used to validate whether there is
still enough CPU capacity for DL tasks in the reduced cpuset.
Currently it still operates on `# remaining CPUs in the cpuset` (1).
Change this to use the already capacity-aware DL admission control
__dl_overflow() for the `cpumask can shrink` test.
dl_b->bw = sched_rt_period << BW_SHIFT / sched_rt_period
dl_b->bw * (1) >= currently allocated bandwidth in root_domain (rd)
Replace (1) w/ `\Sum CPU capacity in rd >> SCHED_CAPACITY_SHIFT`
Adapt __dl_bw_capacity() to take a cpumask instead of a CPU number
argument so that `rd->span` and `cpumask of the reduced cpuset` can
be used here.
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20220729111305.1275158-3-dietmar.eggemann@arm.com
Create an inline helper for conditional code to be only executed on
asymmetric CPU capacity systems. This makes these (currently ~10 and
future) conditions a lot more readable.
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20220729111305.1275158-2-dietmar.eggemann@arm.com
core:
- Fix a few inconsistencies between UP and SMP vs. interrupt affinities
- Small updates and cleanups all over the place
drivers:
- New driver for the LoongArch interrupt controller
- New driver for the Renesas RZ/G2L interrupt controller
- Hotpath optimization for SiFive PLIC
- Workaround for broken PLIC edge triggered interrupts
- Simall cleanups and improvements as usual
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmLn5agTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoV2HD/4u0+09Fd8Awt1Knnb4CInmwFihZ/bu
EiS1Air+MEJ/fyFb5sT/Dn8YdUWYA6a3ifpLMGBwrKCcb5pMwPEtI8uqjSmtgsN/
2Z4o3N5v6EgM25CtrHNBrXK0E9Rz5Py49gm5p3K7+h4g63z9JwrM4G0Bvr8+krLS
EV9IZU6kVmGC6gnG/MspkArsLk1rCM0PU0SJ2lEPsWd1fjhVSDfunvy/qnnzXRzz
wjrcAf+a2Kgb1TMnpL6tx9d2Xx8KrKfODZTdOmPHrdv58F0EbJzapJnAVkYZDPtR
LE2kQc2Qhdawx0kgNNNhvu9P6oZtpnK9N7KAhDQdw17sgrRygINjAMSEe2RykYL1
lK+lJOIzfyd2JkEuC/8w1ZezL88S0EaTNawrkxjJ8L3fa7WDbwilCC+1w95QydCv
sQB137OaLKgWetcRsht9PLWFb4ujkWzxoPf2cPPsm81EzCicNtBuNPLReBTcNrWJ
X2VPpbaqRK8t8bnkXRqhahbq7f8c86feoICHfA4c7T4eZUp/Oq6T8aNvf6WPgjae
I2/FO6kxZj3CQqFkhJGhiZRtGZdx6HLCsL84A+2Ktsra+D8+/qecZCnkHYtz0Vo6
aFuGg+Wj+zuc2QfdaWwG8Dh5dijbxgHGHhzbh9znsWzytN9gfoBxuvBejf65i6sC
In63mEkv35ttfA==
=OnhF
-----END PGP SIGNATURE-----
Merge tag 'irq-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"Updates for interrupt core and drivers:
Core:
- Fix a few inconsistencies between UP and SMP vs interrupt
affinities
- Small updates and cleanups all over the place
New drivers:
- LoongArch interrupt controller
- Renesas RZ/G2L interrupt controller
Updates:
- Hotpath optimization for SiFive PLIC
- Workaround for broken PLIC edge triggered interrupts
- Simall cleanups and improvements as usual"
* tag 'irq-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits)
irqchip/mmp: Declare init functions in common header file
irqchip/mips-gic: Check the return value of ioremap() in gic_of_init()
genirq: Use for_each_action_of_desc in actions_show()
irqchip / ACPI: Introduce ACPI_IRQ_MODEL_LPIC for LoongArch
irqchip: Add LoongArch CPU interrupt controller support
irqchip: Add Loongson Extended I/O interrupt controller support
irqchip/loongson-liointc: Add ACPI init support
irqchip/loongson-pch-msi: Add ACPI init support
irqchip/loongson-pch-pic: Add ACPI init support
irqchip: Add Loongson PCH LPC controller support
LoongArch: Prepare to support multiple pch-pic and pch-msi irqdomain
LoongArch: Use ACPI_GENERIC_GSI for gsi handling
genirq/generic_chip: Export irq_unmap_generic_chip
ACPI: irq: Allow acpi_gsi_to_irq() to have an arch-specific fallback
APCI: irq: Add support for multiple GSI domains
LoongArch: Provisionally add ACPICA data structures
irqdomain: Use hwirq_max instead of revmap_size for NOMAP domains
irqdomain: Report irq number for NOMAP domains
irqchip/gic-v3: Fix comment typo
dt-bindings: interrupt-controller: renesas,rzg2l-irqc: Document RZ/V2L SoC
...
core:
- Make wait_event_hrtimeout() ware of RT/DL tasks
drivers:
- New driver for the R-Car Gen4 timer
- New driver for the Tegra186 timer
- New driver for the Mediatek MT6795 CPUXGPT timer
- Rework suspend/resume handling in timer drivers so it
takes inactive clocks into account.
- The usual device tree compatible add ons
- Small fixed and cleanups all over the place
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmLn5vUTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoV5HEACj2xGuGCnMBkOddhwAhSh8dsMP0+Sb
gRoC6urvlAxDvehblZJ7BD1o9yZqHEguoPeJLRgPVzSDGADWppzj3N0lGtJnELcQ
XQ93SQYBxpy3eJP7AzkUxdy/pvleIAfmje6VlbKnzNEtM/csXS+kJuL9HSgOFDBr
NWVtTtkQCgFmkI6bRoQqq3fU0mYkHnK+KwBrXoX3QJntwEZ7WM0YySE+U1RbeYuh
bo6K2x+gllJXD7wF4X2quQ7PBbiRLw7/2aEeO10mcSPn1sl/efMg3i4a4cmck0ce
zrjQWC1s0hv4PRJRbEKN9wiwrgYyyYaUeWPWDO+6iF69e2i5qu6wRHJM8tyeEm9R
olEt8o6Go0Yw4KZfOeVgK87vsIV50GDD6b/D2ingY5OjiJT6glwmO6yMQ9USa3B6
BUfSLt1LG/AMJ2kWkOKN4AckaUaSh/IZ150ZjFA7hEy1RW9+XCsf0q0IMgEh43xL
D6W+IHdtOZnVMHztxT9H0hzmUnCrj7vcsyrxagEo9rMeiFhCK/5OCh9HENZYsrNg
zElnG++WwP5i9pOzC3Sl5mj/4nYVj5LNpmt4XZGp+7A3+UAfeFvgq7It0b74M6rW
SfyC6D9X4uL0j2VVLRxV0o9Kqnqecc0S50NrPOkXTVPn2pYjRfqVBp0gj5T3knBO
0Vvrwy8Owc1iAA==
=kPNA
-----END PGP SIGNATURE-----
Merge tag 'timers-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
"Timers, timekeeping and related drivers update:
Core:
- Make wait_event_hrtimeout() aware of RT/DL tasks
New drivers:
- R-Car Gen4 timer
- Tegra186 timer
- Mediatek MT6795 CPUXGPT timer
Updates:
- Rework suspend/resume handling in timer drivers so it
takes inactive clocks into account.
- The usual device tree compatible add ons
- Small fixed and cleanups all over the place"
* tag 'timers-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
wait: Fix __wait_event_hrtimeout for RT/DL tasks
clocksource/drivers/sun5i: Remove unnecessary (void*) conversions
dt-bindings: timer: allwinner,sun4i-a10-timer: Add D1 compatible
dt-bindings: timer: ingenic,tcu: use absolute path to other schema
clocksource/drivers/sun4i: Remove unnecessary (void*) conversions
dt-bindings: timer: renesas,cmt: Fix R-Car Gen4 fall-out
clocksource/drivers/tegra186: Put Kconfig option 'tristate' to 'bool'
clocksource/drivers/timer-ti-dm: Make driver selection bool for TI K3
clocksource/drivers/timer-ti-dm: Add compatible for am6 SoCs
clocksource/drivers/timer-ti-dm: Make timer selectable for ARCH_K3
clocksource/drivers/timer-ti-dm: Move inline functions to driver for am6
clocksource/drivers/sh_cmt: Add R-Car Gen4 support
dt-bindings: timer: renesas,cmt: R-Car V3U is R-Car Gen4
dt-bindings: timer: renesas,cmt: Add r8a779f0 and generic Gen4 CMT support
clocksource/drivers/timer-microchip-pit64b: Fix compilation warnings
clocksource/drivers/timer-microchip-pit64b: Use mchp_pit64b_{suspend, resume}
clocksource/drivers/timer-microchip-pit64b: Remove suspend/resume ops for ce
thermal/drivers/rcar_gen3_thermal: Add r8a779f0 support
clocksource/drivers/timer-mediatek: Implement CPUXGPT timers
dt-bindings: timer: mediatek: Add CPUX System Timer and MT6795 compatible
...
- Fix Intel Alder Lake PEBS memory access latency & data source profiling info bugs.
- Use Intel large-PEBS hardware feature in more circumstances, to reduce
PMI overhead & reduce sampling data.
- Extend the lost-sample profiling output with the PERF_FORMAT_LOST ABI variant,
which tells tooling the exact number of samples lost.
- Add new IBS register bits definitions.
- AMD uncore events: Add PerfMonV2 DF (Data Fabric) enhancements.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmLn5MARHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1jAWA/+N48UX35dD0u3k5S2zdYJRHzQkdbivVGc
dOuCB3XTJYneaSI5byQkI4Xo8LUuMbF4q2Zi3/+XhTaqn2zYPP65D6ACL5hU9Shh
F95TnLWbedIaxSJmjMCsWDlwBob8WgtLhokWvyq+ks66BqaDoBKHRtn+2fi0rwZb
MbuN0199Gx/EicWEOeUGBSxoeKbjSix0BApqy+CuXC0DC3+3iwIPk4dbNfHXpHYs
nqxjQKhJnoxdlgjiOY3UuYhdCZl1cuQFIu2Ce1N2nXCAgR2FeQD7ZqtcaA2TnsAO
9BwRfLljavzHhOoz0zALR42kF+eOcnH5K9pIBx7py9Hjdmdsx88fUCovWK34MdG5
KTuqiMWNLIUvoP9WBjl7wUtl2+vcjr9XwgCdneOO+zoNsk44qSRyer1RpEP6D9UM
e9HvdXBVRzhnIhK9NYugeLJ+3nxvFL+OLvc3ZfUrtm04UzeetCBxMlvMv3y021V7
0fInZjhzh4Dz2tJgNlG7AKXkXlsHlyj6/BH9uKc9wVokK+94g5mbspxW8R4gKPr2
l06pYB7ttSpp26sq9vl5ASHO0rniiYAPsQcr7Ko3y72mmp6kfIe/HzYNhCEvgYe2
6JJ8F9kPgRuKr0CwGvUzxFwBC7PJR80zUtZkRCIpV+rgxQcNmK5YXp/KQFIjQqkI
rJfEaDOshl0=
=DqaA
-----END PGP SIGNATURE-----
Merge tag 'perf-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf events updates from Ingo Molnar:
- Fix Intel Alder Lake PEBS memory access latency & data source
profiling info bugs.
- Use Intel large-PEBS hardware feature in more circumstances, to
reduce PMI overhead & reduce sampling data.
- Extend the lost-sample profiling output with the PERF_FORMAT_LOST ABI
variant, which tells tooling the exact number of samples lost.
- Add new IBS register bits definitions.
- AMD uncore events: Add PerfMonV2 DF (Data Fabric) enhancements.
* tag 'perf-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/ibs: Add new IBS register bits into header
perf/x86/intel: Fix PEBS data source encoding for ADL
perf/x86/intel: Fix PEBS memory access info encoding for ADL
perf/core: Add a new read format to get a number of lost samples
perf/x86/amd/uncore: Add PerfMonV2 RDPMC assignments
perf/x86/amd/uncore: Add PerfMonV2 DF event format
perf/x86/amd/uncore: Detect available DF counters
perf/x86/amd/uncore: Use attr_update for format attributes
perf/x86/amd/uncore: Use dynamic events array
x86/events/intel/ds: Enable large PEBS for PERF_SAMPLE_WEIGHT_TYPE
- lockdep: Fix a handful of the more complex lockdep_init_map_*() primitives
that can lose the lock_type & cause false reports. No such mishap was
observed in the wild.
- jump_label improvements: simplify the cross-arch support of
initial NOP patching by making it arch-specific code (used on MIPS only),
and remove the s390 initial NOP patching that was superfluous.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmLn3jERHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1hzeg/7BTC90XeMANhTiL23iiH7dOYZwqdFeB12
VBqdaPaGC8i+mJzVAdGyPFwCFDww6Ak6P33PcHkemuIO5+DhWis8hfw5krHEOO1k
AyVSMOZuWJ8/g6ZenjgNFozQ8C+3NqURrpdqN55d7jhMazPWbsNLLqUgvSSqo6DY
Ah2O+EKrDfGNCxT6/YaTAmUryctotxafSyFDQxv3RKPfCoIIVv9b3WApYqTOqFIu
VYTPr+aAcMsU20hPMWQI4kbQaoCxFqr3bZiZtAiS/IEunqi+PlLuWjrnCUpLwVTC
+jOCkNJHt682FPKTWelUnCnkOg9KhHRujRst5mi1+2tWAOEvKltxfe05UpsZYC3b
jhzddREMwBt3iYsRn65LxxsN4AMK/C/41zjejHjZpf+Q5kwDsc6Ag3L5VifRFURS
KRwAy9ejoVYwnL7CaVHM2zZtOk4YNxPeXmiwoMJmOufpdmD1LoYbNUbpSDf+goIZ
yPJpxFI5UN8gi8IRo3DMe4K2nqcFBC3wFn8tNSAu+44gqDwGJAJL6MsLpkLSZkk8
3QN9O11UCRTJDkURjoEWPgRRuIu9HZ4GKNhiblDy6gNM/jDE/m5OG4OYfiMhojgc
KlMhsPzypSpeApL55lvZ+AzxH8mtwuUGwm8lnIdZ2kIse1iMwapxdWXWq9wQr8eW
jLWHgyZ6rcg=
=4B89
-----END PGP SIGNATURE-----
Merge tag 'locking-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
"This was a fairly quiet cycle for the locking subsystem:
- lockdep: Fix a handful of the more complex lockdep_init_map_*()
primitives that can lose the lock_type & cause false reports. No
such mishap was observed in the wild.
- jump_label improvements: simplify the cross-arch support of initial
NOP patching by making it arch-specific code (used on MIPS only),
and remove the s390 initial NOP patching that was superfluous"
* tag 'locking-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/lockdep: Fix lockdep_init_map_*() confusion
jump_label: make initial NOP patching the special case
jump_label: mips: move module NOP patching into arch code
jump_label: s390: avoid pointless initial NOP patching
Load-balancing improvements:
============================
- Improve NUMA balancing on AMD Zen systems for affine workloads.
- Improve the handling of reduced-capacity CPUs in load-balancing.
- Energy Model improvements: fix & refine all the energy fairness metrics (PELT),
and remove the conservative threshold requiring 6% energy savings to
migrate a task. Doing this improves power efficiency for most workloads,
and also increases the reliability of energy-efficiency scheduling.
- Optimize/tweak select_idle_cpu() to spend (much) less time searching
for an idle CPU on overloaded systems. There's reports of several
milliseconds spent there on large systems with large workloads ...
[ Since the search logic changed, there might be behavioral side effects. ]
- Improve NUMA imbalance behavior. On certain systems
with spare capacity, initial placement of tasks is non-deterministic,
and such an artificial placement imbalance can persist for a long time,
hurting (and sometimes helping) performance.
The fix is to make fork-time task placement consistent with runtime
NUMA balancing placement.
Note that some performance regressions were reported against this,
caused by workloads that are not memory bandwith limited, which benefit
from the artificial locality of the placement bug(s). Mel Gorman's
conclusion, with which we concur, was that consistency is better than
random workload benefits from non-deterministic bugs:
"Given there is no crystal ball and it's a tradeoff, I think it's
better to be consistent and use similar logic at both fork time
and runtime even if it doesn't have universal benefit."
- Improve core scheduling by fixing a bug in sched_core_update_cookie() that
caused unnecessary forced idling.
- Improve wakeup-balancing by allowing same-LLC wakeup of idle CPUs for newly
woken tasks.
- Fix a newidle balancing bug that introduced unnecessary wakeup latencies.
ABI improvements/fixes:
=======================
- Do not check capabilities and do not issue capability check denial messages
when a scheduler syscall doesn't require privileges. (Such as increasing niceness.)
- Add forced-idle accounting to cgroups too.
- Fix/improve the RSEQ ABI to not just silently accept unknown flags.
(No existing tooling is known to have learned to rely on the previous behavior.)
- Depreciate the (unused) RSEQ_CS_FLAG_NO_RESTART_ON_* flags.
Optimizations:
==============
- Optimize & simplify leaf_cfs_rq_list()
- Micro-optimize set_nr_{and_not,if}_polling() via try_cmpxchg().
Misc fixes & cleanups:
======================
- Fix the RSEQ self-tests on RISC-V and Glibc 2.35 systems.
- Fix a full-NOHZ bug that can in some cases result in the tick not being
re-enabled when the last SCHED_RT task is gone from a runqueue but there's
still SCHED_OTHER tasks around.
- Various PREEMPT_RT related fixes.
- Misc cleanups & smaller fixes.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmLn2ywRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1iNfxAAhPJMwM4tYCpIM6PhmxKiHl6kkiT2tt42
HhEmiJVLjczLybWaWwmGA2dSFkv1f4+hG7nqdZTm9QYn0Pqat2UTSRcwoKQc+gpB
x85Hwt2IUmnUman52fRl5r1miH9LTdCI6agWaFLQae5ds1XmOugFo52t2ahax+Gn
dB8LxS2fa/GrKj229EhkJSPWAK4Y94asoTProwpKLuKEeXhDkqUNrOWbKhz+wEnA
pVZySpA9uEOdNLVSr1s0VB6mZoh5/z6yQefj5YSNntsG71XWo9jxKCIm5buVdk2U
wjdn6UzoTThOy/5Ygm64eYRexMHG71UamF1JYUdmvDeUJZ5fhG6RD0FECUQNVcJB
Msu2fce6u1AV0giZGYtiooLGSawB/+e6MoDkjTl8guFHi/peve9CezKX1ZgDWPfE
eGn+EbYkUS9RMafXCKuEUBAC1UUqAavGN9sGGN1ufyR4za6ogZplOqAFKtTRTGnT
/Ne3fHTtvv73DLGW9ohO5vSS2Rp7zhAhB6FunhibhxCWlt7W6hA4Ze2vU9hf78Yn
SJDLAJjOEilLaKUkRG/d9uM3FjKJM1tqxuT76+sUbM0MNxdyiKcviQlP1b8oq5Um
xE1KNZUevnr/WXqOTGDKHH/HNPFgwxbwavMiP7dNFn8h/hEk4t9dkf5siDmVHtn4
nzDVOob1LgE=
=xr2b
-----END PGP SIGNATURE-----
Merge tag 'sched-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
"Load-balancing improvements:
- Improve NUMA balancing on AMD Zen systems for affine workloads.
- Improve the handling of reduced-capacity CPUs in load-balancing.
- Energy Model improvements: fix & refine all the energy fairness
metrics (PELT), and remove the conservative threshold requiring 6%
energy savings to migrate a task. Doing this improves power
efficiency for most workloads, and also increases the reliability
of energy-efficiency scheduling.
- Optimize/tweak select_idle_cpu() to spend (much) less time
searching for an idle CPU on overloaded systems. There's reports of
several milliseconds spent there on large systems with large
workloads ...
[ Since the search logic changed, there might be behavioral side
effects. ]
- Improve NUMA imbalance behavior. On certain systems with spare
capacity, initial placement of tasks is non-deterministic, and such
an artificial placement imbalance can persist for a long time,
hurting (and sometimes helping) performance.
The fix is to make fork-time task placement consistent with runtime
NUMA balancing placement.
Note that some performance regressions were reported against this,
caused by workloads that are not memory bandwith limited, which
benefit from the artificial locality of the placement bug(s). Mel
Gorman's conclusion, with which we concur, was that consistency is
better than random workload benefits from non-deterministic bugs:
"Given there is no crystal ball and it's a tradeoff, I think
it's better to be consistent and use similar logic at both fork
time and runtime even if it doesn't have universal benefit."
- Improve core scheduling by fixing a bug in
sched_core_update_cookie() that caused unnecessary forced idling.
- Improve wakeup-balancing by allowing same-LLC wakeup of idle CPUs
for newly woken tasks.
- Fix a newidle balancing bug that introduced unnecessary wakeup
latencies.
ABI improvements/fixes:
- Do not check capabilities and do not issue capability check denial
messages when a scheduler syscall doesn't require privileges. (Such
as increasing niceness.)
- Add forced-idle accounting to cgroups too.
- Fix/improve the RSEQ ABI to not just silently accept unknown flags.
(No existing tooling is known to have learned to rely on the
previous behavior.)
- Depreciate the (unused) RSEQ_CS_FLAG_NO_RESTART_ON_* flags.
Optimizations:
- Optimize & simplify leaf_cfs_rq_list()
- Micro-optimize set_nr_{and_not,if}_polling() via try_cmpxchg().
Misc fixes & cleanups:
- Fix the RSEQ self-tests on RISC-V and Glibc 2.35 systems.
- Fix a full-NOHZ bug that can in some cases result in the tick not
being re-enabled when the last SCHED_RT task is gone from a
runqueue but there's still SCHED_OTHER tasks around.
- Various PREEMPT_RT related fixes.
- Misc cleanups & smaller fixes"
* tag 'sched-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
rseq: Kill process when unknown flags are encountered in ABI structures
rseq: Deprecate RSEQ_CS_FLAG_NO_RESTART_ON_* flags
sched/core: Fix the bug that task won't enqueue into core tree when update cookie
nohz/full, sched/rt: Fix missed tick-reenabling bug in dequeue_task_rt()
sched/core: Always flush pending blk_plug
sched/fair: fix case with reduced capacity CPU
sched/core: Use try_cmpxchg in set_nr_{and_not,if}_polling
sched/core: add forced idle accounting for cgroups
sched/fair: Remove the energy margin in feec()
sched/fair: Remove task_util from effective utilization in feec()
sched/fair: Use the same cpumask per-PD throughout find_energy_efficient_cpu()
sched/fair: Rename select_idle_mask to select_rq_mask
sched, drivers: Remove max param from effective_cpu_util()/sched_cpu_util()
sched/fair: Decay task PELT values during wakeup migration
sched/fair: Provide u64 read for 32-bits arch helper
sched/fair: Introduce SIS_UTIL to search idle CPU based on sum of util_avg
sched: only perform capability check on privileged operation
sched: Remove unused function group_first_cpu()
sched/fair: Remove redundant word " *"
selftests/rseq: check if libc rseq support is registered
...
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEjUuTAak14xi+SF7M4CHKc/GJqRAFAmLnqTQACgkQ4CHKc/GJ
qRBnBwgAohP0MXszRnhGEKKTmLBtsEyPrV0OBEIlz3MnDYBYfnDLd5JdSMMA+1jp
sT80QWYKPMr10WKWX5vPjhIYRIfgWchEYND/93DnJYC6Fdap/D0hDd6tIQEKnxpN
YeGZHck6orj9L2HfazJo7qpt//Th5mM8WRTN9OIiFdKPYOvlm7DT51wukVLnK9fA
WoWrx3CsyIh6unvAC6AMOVFt7ZJOfD6muMQsGmkcpp1sJLeM1Ofoe8l+h5oSrFZQ
CrdV4XXrprVi7JhqvSX4alRnF5vmOAVKVXhBLZ3A/3uTou2Bhic6n68chyb/x2RE
FhwmsXS+v7jsOI0PV4gNzwT+sp+01w==
=y2kQ
-----END PGP SIGNATURE-----
Merge tag 'slab-for-5.20_or_6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab updates from Vlastimil Babka:
- An addition of 'accounted' flag to slab allocation tracepoints to
indicate memcg_kmem accounting, by Vasily
- An optimization of memcg handling in freeing paths, by Muchun
- Various smaller fixes and cleanups
* tag 'slab-for-5.20_or_6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
mm/slab_common: move generic bulk alloc/free functions to SLOB
mm/sl[au]b: use own bulk free function when bulk alloc failed
mm: slab: optimize memcg_slab_free_hook()
mm/tracing: add 'accounted' entry into output of allocation tracepoints
tools/vm/slabinfo: Handle files in debugfs
mm/slub: Simplify __kmem_cache_alias()
mm, slab: fix bad alignments
- Remove unused generic cpuidle support (replaced by PSCI version)
- Fix documentation describing the kernel virtual address space
- Handling of some new CPU errata in Arm implementations
- Rework of our exception table code in preparation for handling
machine checks (i.e. RAS errors) more gracefully
- Switch over to the generic implementation of ioremap()
- Fix lockdep tracking in NMI context
- Instrument our memory barrier macros for KCSAN
- Rework of the kPTI G->nG page-table repainting so that the MMU remains
enabled and the boot time is no longer slowed to a crawl for systems
which require the late remapping
- Enable support for direct swapping of 2MiB transparent huge-pages on
systems without MTE
- Fix handling of MTE tags with allocating new pages with HW KASAN
- Expose the SMIDR register to userspace via sysfs
- Continued rework of the stack unwinder, particularly improving the
behaviour under KASAN
- More repainting of our system register definitions to match the
architectural terminology
- Improvements to the layout of the vDSO objects
- Support for allocating additional bits of HWCAP2 and exposing
FEAT_EBF16 to userspace on CPUs that support it
- Considerable rework and optimisation of our early boot code to reduce
the need for cache maintenance and avoid jumping in and out of the
kernel when handling relocation under KASLR
- Support for disabling SVE and SME support on the kernel command-line
- Support for the Hisilicon HNS3 PMU
- Miscellanous cleanups, trivial updates and minor fixes
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmLeccUQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNCysB/4ml92RJLhVwRAofbtFfVgVz3JLTSsvob9x
Z7FhNDxfM/G32wKtOHU9tHkGJ+PMVWOPajukzxkMhxmilfTyHBbiisNWVRjKQxj4
wrd07DNXPIv3bi8SWzS1y2y8ZqujZWjNJlX8SUCzEoxCVtuNKwrh96kU1jUjrkFZ
kBo4E4wBWK/qW29nClGSCgIHRQNJaB/jvITlQhkqIb0pwNf3sAUzW7QoF1iTZWhs
UswcLh/zC4q79k9poegdCt8chV5OBDLtLPnMxkyQFvsLYRp3qhyCSQQY/BxvO5JS
jT9QR6d+1ewET9BFhqHlIIuOTYBCk3xn/PR9AucUl+ZBQd2tO4B1
=LVH0
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"Highlights include a major rework of our kPTI page-table rewriting
code (which makes it both more maintainable and considerably faster in
the cases where it is required) as well as significant changes to our
early boot code to reduce the need for data cache maintenance and
greatly simplify the KASLR relocation dance.
Summary:
- Remove unused generic cpuidle support (replaced by PSCI version)
- Fix documentation describing the kernel virtual address space
- Handling of some new CPU errata in Arm implementations
- Rework of our exception table code in preparation for handling
machine checks (i.e. RAS errors) more gracefully
- Switch over to the generic implementation of ioremap()
- Fix lockdep tracking in NMI context
- Instrument our memory barrier macros for KCSAN
- Rework of the kPTI G->nG page-table repainting so that the MMU
remains enabled and the boot time is no longer slowed to a crawl
for systems which require the late remapping
- Enable support for direct swapping of 2MiB transparent huge-pages
on systems without MTE
- Fix handling of MTE tags with allocating new pages with HW KASAN
- Expose the SMIDR register to userspace via sysfs
- Continued rework of the stack unwinder, particularly improving the
behaviour under KASAN
- More repainting of our system register definitions to match the
architectural terminology
- Improvements to the layout of the vDSO objects
- Support for allocating additional bits of HWCAP2 and exposing
FEAT_EBF16 to userspace on CPUs that support it
- Considerable rework and optimisation of our early boot code to
reduce the need for cache maintenance and avoid jumping in and out
of the kernel when handling relocation under KASLR
- Support for disabling SVE and SME support on the kernel
command-line
- Support for the Hisilicon HNS3 PMU
- Miscellanous cleanups, trivial updates and minor fixes"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (136 commits)
arm64: Delay initialisation of cpuinfo_arm64::reg_{zcr,smcr}
arm64: fix KASAN_INLINE
arm64/hwcap: Support FEAT_EBF16
arm64/cpufeature: Store elf_hwcaps as a bitmap rather than unsigned long
arm64/hwcap: Document allocation of upper bits of AT_HWCAP
arm64: enable THP_SWAP for arm64
arm64/mm: use GENMASK_ULL for TTBR_BADDR_MASK_52
arm64: errata: Remove AES hwcap for COMPAT tasks
arm64: numa: Don't check node against MAX_NUMNODES
drivers/perf: arm_spe: Fix consistency of SYS_PMSCR_EL1.CX
perf: RISC-V: Add of_node_put() when breaking out of for_each_of_cpu_node()
docs: perf: Include hns3-pmu.rst in toctree to fix 'htmldocs' WARNING
arm64: kasan: Revert "arm64: mte: reset the page tag in page->flags"
mm: kasan: Skip page unpoisoning only if __GFP_SKIP_KASAN_UNPOISON
mm: kasan: Skip unpoisoning of user pages
mm: kasan: Ensure the tags are visible before the tag in page->flags
drivers/perf: hisi: add driver for HNS3 PMU
drivers/perf: hisi: Add description for HNS3 PMU driver
drivers/perf: riscv_pmu_sbi: perf format
perf/arm-cci: Use the bitmap API to allocate bitmaps
...
- Use RNG seed from bootinfo block on virt platform,
- Defconfig updates,
- Minor fixes and improvements.
-----BEGIN PGP SIGNATURE-----
iIsEABYIADMWIQQ9qaHoIs/1I4cXmEiKwlD9ZEnxcAUCYufI8xUcZ2VlcnRAbGlu
dXgtbTY4ay5vcmcACgkQisJQ/WRJ8XBovgD8CGRi7FpKl4IzIxGV+ru3Zo/b1eG+
ngN/Dodin420UP0A/2l5jXJKAaoylZTZ3FNKYJEttIJA+pvPkkIjB7Jz64UE
=DW6X
-----END PGP SIGNATURE-----
Merge tag 'm68k-for-v5.20-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
Pull m68k updates from Geert Uytterhoeven:
- Use RNG seed from bootinfo block on virt platform
- defconfig updates
- Minor fixes and improvements
* tag 'm68k-for-v5.20-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: defconfig: Update defconfigs for v5.19-rc1
m68k: Add common forward declaration for show_registers()
m68k: mac: Remove forward declaration for mac_nmi_handler()
m68k: virt: Fix missing platform_device_unregister() on error in virt_platform_init()
m68k: virt: Use RNG seed from bootinfo block
m68k: bitops: Change __fls to return and accept unsigned long
m68k: Kconfig.machine: Add endif comment
m68k: Kconfig.debug: Replace single quotes
m68k: Kconfig.cpu: Fix indentation and add endif comments
m68k: q40: Align '*' in comments
m68k: sun3: Use __func__ to get function's name in an output message
m68k: mac: Fix typos in comments
m68k: virt: Kconfig minor fixes
loader
- Add the ability to pass the IMA measurement of kernel and bootloader
to the kexec-ed kernel
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmLn770ACgkQEsHwGGHe
VUoOyA//R7ljAspkzqE+kY02GOXCvVo+Ix/WFbpeUMouSb71vxjyqJED6lMrWKvM
HPzXwuQ5C1bXIbvWW424l66q9O48Iu3FvnURGc05ngBvgnyLxw+IdfWREr3rhVtR
ZKdaMHCzj1RsxCRYXie4NIyW86D1Bd4V4W7KFG/u26LSo9VL2oY1JXd0vxXrh0e6
F4pwJsS+5TrgaFPwfSLm66HWlM2oxmqBVD/Fi8Pmzq7/ewb3KSgIWralOjew5X13
f4ob9GVLojM9yVPLSww0p2CRitlxypO5pv3rsrcwo77UhikflFk4Ruc4IeMd4792
ZszDCyWWCzFHZDizo2tni4IbcKtOx1lL389sYj/ZVsAYarGzeRRNYpN5TE6cSFXK
6hqurMMTDrmeczScBK3uQ4BFkMzWYGCYWy6JNrTmD43Onb5fe2usWIbpz+oFB0Kd
26Oa85lAKUhOUTnU1yM5aeRYBYiouyD80BRKgve5pcN00BXwO0OOny5sijFt3hvC
266k2g/+zY6wNawnEesNfLFkUvR09416xEbe5W3l64vlCGsjt9doB4vPKLkHBXq4
YilUVFFT3/djTvfLy50L2ta9oNdYXK7ECfGj0t2UCcnj0IrO4E0Cm0BlPN8r/a6L
gwE9I4txaYZmT8VRBG2kiyUljUSqZUj1UFHevMuCS09dzLonJN4=
=s9Om
-----END PGP SIGNATURE-----
Merge tag 'x86_kdump_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 kdump updates from Borislav Petkov:
- Add the ability to pass early an RNG seed to the kernel from the boot
loader
- Add the ability to pass the IMA measurement of kernel and bootloader
to the kexec-ed kernel
* tag 'x86_kdump_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/setup: Use rng seeds from setup_data
x86/kexec: Carry forward IMA measurement log on kexec
- Other Kbuild improvements and fixes
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmLnvZYACgkQEsHwGGHe
VUqCFA//T2b0IkFHk63x4Ln+kDwEhB+NFjeuwhEd9XTktEgT0tAU6pZcUruSAHMJ
MeQFNYGbqbqTNztRDvc9WSbJVQWxxTLJYbXi2HbJpk7cOBpGwQQW1dBeLH0W0eTt
YyAZN/bhpjQK7+/rTyFZlUxqjr8VgBWCrmK7OFV65kZxZW3K7Qjg54NotAHLwU7/
oywR+REyiU4soKPUJPtIA42kNXkxzrQDjMu42/ywdQHBS5PMCUp16rnvP8uw1ERN
9aO+yGZ907Xl6xQXTxnbo2ehlhbu25xK8IZMbgEXE66iSeEtWb27H3xGMktYBETn
LsG4OkiqQ0z7lYHCcWPduAlP1ZrHAOEjD6sJhDOtbGffHRyJG4AHHngQkxfeWxPR
evRsPJssEUui65fW30cB+zY0glnGXpsfb+Ma9i8/02B2DNq5N4ERv8q4rfrLJX9Y
l9hMGw8FqI57ALoDiXJWCTGHZbEvEHiLOUQh9lLEKQmf5hTqb4Trulv+uhlc/JVS
vrfUzfhVdpWU6XY0ba0jXdbkYBJV8xERG9dsBSYqvjGaXJES6TXILiZ7CeMHoXhN
srBPCM3QZUGp/gzhua2s7SctQfyMmx6CrCg39sMligBM2wl1o3vJML5y1h/RwN7g
LTVZTObvMPYDU6dn1APOZUh77DlW7JWg//DcsB0BrlfZCySRkfE=
=SMFp
-----END PGP SIGNATURE-----
Merge tag 'x86_build_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 build updates from Borislav Petkov:
- Fix stack protector builds when cross compiling with Clang
- Other Kbuild improvements and fixes
* tag 'x86_build_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/purgatory: Omit use of bin2c
x86/purgatory: Hard-code obj-y in Makefile
x86/build: Remove unused OBJECT_FILES_NON_STANDARD_test_nx.o
x86/Kconfig: Fix CONFIG_CC_HAS_SANE_STACKPROTECTOR when cross compiling with clang
pr_warn_once() change broke that
- Simplify {JMP,CALL}_NOSPEC and let the objtool retpoline patching
infra take care of them instead of having unreadable alternative macros
there
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmLnvG8ACgkQEsHwGGHe
VUpbPw//WUoBMnR9B9xpwuk6EBA0rVbQqCt0QYy/h27QuD9aIC+lYduL8CId1AbS
J+bqDvpqFQRqc9idAvgkjspJNMnbOSqzAx1AbT4gBCH33hPYrB/6F7XasXSDZn0M
OVJDyvhOhror2I6YuFc1uwjMPBZj8+fgv823io/RKqvJfj/WUoIBK1oxUlzm4rNs
LkrxgKpvs3QznJ0RNnZUP1kKnezzC1RtxIdz8QD3rirpuZITF8HD04jIUyK7gohh
XP5Vgt+9BzRCHyn43XT7UEobz93WASDn0YOlDCdOuMxhJONYpnvk8dYJkss2BReG
oCGZNiUHKNALd/FWUr5EeWoa84XQwRwm2Y+jjdH3al7LIaIbERGXbTIHsWH6JtVk
+EiXqj+B6d2muZpG/ka0PSgPvbeTeipeugNDzXGCLuWHkjLRwJRV/afUmzLuT7lJ
zzrejZeXACtWmTLXXt5EPAPWsibapiuA6/4ucyr+jJj7CCY8pmrOGcxqHKlq7MHU
9Lk50F6Y9jbEe7mRPmR67dj/P9QAFYewjVDLEgOdyganrNNGfRkddJVbckYYZTVF
YSUsoZBAc5E3Fzo/yQkG3+6K0d13Syuf3QRhGxe2JvYlWt2mktaFsRtcTGSCd9KM
AJw8tV0FDezxmdKQ8DNkeLG+tAt7m22dRJkhbSUilYQsNR2Cwxg=
=70Gc
-----END PGP SIGNATURE-----
Merge tag 'x86_core_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 core updates from Borislav Petkov:
- Have invalid MSR accesses warnings appear only once after a
pr_warn_once() change broke that
- Simplify {JMP,CALL}_NOSPEC and let the objtool retpoline patching
infra take care of them instead of having unreadable alternative
macros there
* tag 'x86_core_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/extable: Fix ex_handler_msr() print condition
x86,nospec: Simplify {JMP,CALL}_NOSPEC
- Free the pmem platform device on the registration error path
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmLnuOIACgkQEsHwGGHe
VUrY0Q/+JPlcOVTbneHaUazeG7i0EYugNLslPa42FzcM8B0NMnK241XjP8fubZML
cHh76LWWu7MjUdkpiXjiQoBwp9fJHF1amkUNJ2JBJ3Vcz8phvfnCeaiotMeKIrDZ
XsG7+UuQcUAL5C5C9UOJYWHAA32IlcuvaKA0xV76TF3qMo6vmI62AYdc8ddcQj7k
qvtdzwVwoJ1RAJ1jD54c5nHZHu1yuPAx9unYsGnE7v6X8geU9aZ6EsNsLO/vAmt7
0qmIuUYmR12QL3PwfAlOqIev/yIOjcac/sA3zEDOwOjZt4fS77d8GLG3mnj9x4GY
9v9n0d2ZJd8iDb6g27mzZTqzqTTcm60TckiW+asdlc+tWi+bLH6btpdmZ5RZD9M1
GgsbEXsXpnG33rMucLiGKWPseFhKty7jWH51ux0KWygHajFSRqZHyPL8LNSmohZJ
X45wvRKB5VuTbITHZ+Ix7mTna34IFYu1TiLT+Bd0oGFS4WUfee5vb7JDpIOXfV3W
0Fi7xM7JCJQnvj1K5ernx0W37t35AfWahPKLPMtZu0EFodXxyHDfGG0yWeV9THD4
zkrZTYudZbtUNLO1vep/ikPnFCJ/EhntVNhk4J2MvSZ2xdUMbyFpd8h3kVR+UL0r
aprdPGwhWEDmjycO41sp+ifC/t1e5QBxTilp5WzUzoVEH05rIUk=
=IY+U
-----END PGP SIGNATURE-----
Merge tag 'x86_misc_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 updates from Borislav Petkov:
- Add a bunch of PCI IDs for new AMD CPUs and use them in k10temp
- Free the pmem platform device on the registration error path
* tag 'x86_misc_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
hwmon: (k10temp): Add support for new family 17h and 19h models
x86/amd_nb: Add AMD PCI IDs for SMN communication
x86/pmem: Fix platform-device leak in error path
- Respect idle=nomwait when supplied on the kernel cmdline
- Two small cleanups
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmLntx0ACgkQEsHwGGHe
VUqlRxAAkULobsk6Dx3wrQcYlpA8Mt/ctttTQXWiIQwhK1j7uP0zlGWBqImr5Wsk
T04g1s29azulnPs3PydCF2QlLqSyF4v2PyyUwnpKfTP6CPM+MLtz98Gm6Xcbkt+s
f28ISYgNP+15tskWdNqB5XIVGkuyBdNne9TiFwtnVrJYF47FSwqEWRyqMH+bIOGT
wSZUCfjcw7PtKwfIAmYq4beS2+wbY9bsfVyIz+H0ks2EVFQdjYWb/kH9PgUYEQFe
VEOBsPvTHDOJt0QXEXSJjmoSRUS77Wduw56Y3L2T4jWdXXQFWJ79rqNYDBvXGAdh
Y8BKM5IYFZpzrmfw2RB6jbDY/JWO5PPFvHTXogQf9+wttSerZEffVQdOeTwjT8VD
wc9/ZnNkT7915033VI90V+hdFkwarq8FXuFH8TkzcxP9DQNYG8CRTZBceq0UWBl0
5RpIDwNX9JxGrR+frJi0D24qxz//wLe56UqW9hLp73NP8QtEYEW1nb1q30Q2eM3N
iQblgmh63qQ/dy6JV1GFb3aePiWMUNQwcTrj1pd8YDfNlp4IsFsSswnsdAZWtr1A
l9qewHkBZbbzyTQkBjExUsaIdiaMywFwnUmcQNL+fHqznZIvMhJC/oCJeS0Pe/RH
alTUrYsk6Y87HFpxoXpd85a9+20m8yrA64uY8cSQguGZ9i5Lm8g=
=jkpj
-----END PGP SIGNATURE-----
Merge tag 'x86_cpu_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpu updates from Borislav Petkov:
- Remove the vendor check when selecting MWAIT as the default idle
state
- Respect idle=nomwait when supplied on the kernel cmdline
- Two small cleanups
* tag 'x86_cpu_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu: Use MSR_IA32_MISC_ENABLE constants
x86: Fix comment for X86_FEATURE_ZEN
x86: Remove vendor checks from prefer_mwait_c1_over_halt
x86: Handle idle=nomwait cmdline properly for x86_idle
be able to enter deeper low-power state
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmLnsksACgkQEsHwGGHe
VUpOOw//WAfkouWFd7kmACSiWtkgEQfXgImhhM7tw5Zzks+aEMtL2RrKqFYzkFg5
hJK+lMI8QDkBFU/bgI/nAZfFiAS7iBMPY4T2Uw4+jZCPLr3TmUheJ2Pe1CxlIzQC
MfjXQm/j5uTZcB2jEORjPT5dVE3p6k1KpSbvf5ZKCc9YTwdylv3VeYcfv5WEkihR
61bWU+T7Yse4A3Bx32ewabLmk7lwOcdS1vbfsqdvkpI1vE1gI8CThgTuNAt8JWij
27GIxiF2BQkyw3d/IPt3wGIPOgVowISXWdtMgpCr17Mw1m+44vXG9cjSuAKfqAUY
wNXrBzirdqzJgN85WVJEFIoJasFJicrz/oNLYbcHQa8+AruRu6in22cSkPYPvVGc
iNgSlQOZdoY9Vl6izEV4OawCccYnKjskEW7nEVIqfENrwRPYWB/IAnGxkla7q3Ch
q+T8dyOAWToumuPK13c5VoX0nd02bfwSJACYRxN+M22zq8s7+Jv1fNtQeAGLnmD1
jG3HR0wJWBOVVyira7AbFI7Mx667HayslIesftEGU33FfY0gZTcwZ7jsZ9GTSyOi
AgHN3PvHyJYQ648T8JzbyuNJe3dyDKf81OLaPHP6+nV9Dy3aCrERTML0jo8xWv2N
rDA61BV/q+hdQS3vzmLRVPzLLZksGRNCS2ZzIbkR4dGxLQAAB2M=
=w/wH
-----END PGP SIGNATURE-----
Merge tag 'x86_fpu_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fpu update from Borislav Petkov:
- Add machinery to initialize AMX register state in order for
AMX-capable CPUs to be able to enter deeper low-power state
* tag 'x86_fpu_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
intel_idle: Add a new flag to initialize the AMX state
x86/fpu: Add a helper to prepare AMX state for low-power CPU idle
- Update pkeys documentation
- Avoid reading contended mm's TLB generation var if not absolutely
necessary along with fixing a case where arch_tlbbatch_flush() doesn't
adhere to the generation scheme and thus violates the conditions for the
above avoidance.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmLnmpYACgkQEsHwGGHe
VUrINQ/9FGnQya6mTJitM3Ohdzu1lOrHm5+XAxCO3SVzPPQlx0mRZmszzDOIZpG/
9iCEDhSi+kLdkTwIXk8Nmm1imNT2MSqswjQYr8KDtl69/j12W8Y0Pb5C5tnQnUyi
FXPiVVCAk0iegNg+QvarQa8Ou6tGWDqFMLzdrq9XNokdBmFq7FCDsOjdwd8So3IY
95755wDtCxgBXc2TVr08qSpD0Q/VlHKqb5shtzuoBe9a0YLEaRmWne9UzTOx5U6c
//qk8lmy9ohL8dmN7SgcRITzfpU8ue+/J4oZ+GV9mc/UTW5Ah2WNX+3BFnmCqZrK
gr7G5pukuuJxFj8yGzGbGIM28OHKYIE+So2Q5pA6Vrqst/oyDJS+pcoxyhAYGYCQ
hDjp4yu5AUnsPky6h6VHaR8Er5Nvo7YwhdSazcGD+HC7smwbnVEzI5H7MUgcJ05F
1CkAQSy2TVZe0hhilOu8dcHN23+2ISF8BzxKbn4qtZOsJTN6/U4MYFWl6VPh8P80
vjZcIJYZ4i6Gz03m7ITk2bHwfOD8f/7UkbZEggO/GYm1BgmxaMB0IogoIkSUG9vN
CLGZomRMfBcVVS1DTWJsUzRLbNx3x3pL41NrlxPbC/rTmvts5eJAvcDcffPfRGzx
tCqcASRdV7tQBgMT5MLjmIY8cM1aphdGSdlKVD7QHZ11bJVFZE4=
=aD0S
-----END PGP SIGNATURE-----
Merge tag 'x86_mm_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm updates from Borislav Petkov:
- Rename a PKRU macro to make more sense when reading the code
- Update pkeys documentation
- Avoid reading contended mm's TLB generation var if not absolutely
necessary along with fixing a case where arch_tlbbatch_flush()
doesn't adhere to the generation scheme and thus violates the
conditions for the above avoidance.
* tag 'x86_mm_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm/tlb: Ignore f->new_tlb_gen when zero
x86/pkeys: Clarify PKRU_AD_KEY macro
Documentation/protection-keys: Clean up documentation for User Space pkeys
x86/mm/tlb: Avoid reading mm_tlb_gen when possible
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmLnlkQACgkQEsHwGGHe
VUp9ZA//VO6OYECA3ff6x1KBqB6dbC/W2dc54CPy2MSw5xwUlpPuD0ipbRMM7UEn
zhnMNSwXgtqJdtdJZfvT9XwvrBjigAJJNjR/E3SA7bJ9jk10MEWFY/0YHtshV9vp
y75c028m3yIxf+97Cku59k2Y69u1LWgM1mpFYVgJJCwkprliY9JMyvhboSXO63v9
SR9m6jnDOJV1dIdmp7SgQvKEy2gRkz/kli5yzgNgw9Q0t5yodudaq9nEVKh0tWqP
EQ6TzyORVGmqFZ5Jxti3W6NqUardrCeWwmodC1KwWm7vAcfaJET9ADTj3sAYPG4m
m26i0fdjixnYAu27adiG7txtVgoZ3JkkNMLkDa30S2Dau1zhmGxdwAFrzLP2P84T
UGQqsZ7TNkhtLp2Jlb5pfOAdj5q8mI4BDT6KLIXgJYRm0kLkV7W069mcYCKFD2S4
BCHmDNG1ZVFQUVLG16gdZe4mRlf8mJ8WLCkDbXEAOGHCCB944xLuspgh2L88VrhR
s8EP8mPn9rXGM4drAsao6+gseS4Q1gJ0E2gFh8YTOb0fFGvlTbcvqWCQSVC6P65z
xRv1MWKg/VZcPo3Xpe40CELMcobOGxmxXaKQRJp6KSY7SdvidSb41DGv6LKffmPJ
xckm/13Aip3zVXeBE8hrjulphOAKIfPkujbW0jSTOepTnVBc3Hc=
=ps1Q
-----END PGP SIGNATURE-----
Merge tag 'x86_cleanups_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanup from Borislav Petkov:
- A single CONFIG_ symbol correction in a comment
* tag 'x86_cleanups_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Refer to the intended config STRICT_DEVMEM in a comment
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmLnjdMACgkQEsHwGGHe
VUoNfw//W/eJCIdTZ4bYku0KVRvA2tP8xXqsevBaLGhi0yh4knoMI+b7pMUnUEYX
SlV2dAF0m85ICB7dN52TB6Bn0eyt1nGj9AHmgyiZ345R2IH+bvC5qig88JOR91gd
5o2HE+CICjXVvItOwwt+FMm8GrykZ2FrciAo92CTTt5TIcZyrkUXWJKwn9c1YNKd
bZFPOmAnrLUcMlweqeoZBTCVxu+yFm/CIYEs3eXISVitCEJ1JRVqxygJicycBwmw
kN1U7glF66ptJ5l1bas5ScsgKeDUbyFFiwKXrBMJI+T/FWU6YxYQW868+5E0/8g3
uhoKpDh4hECH36DdCO/DdEcpt2sBrPskx/3f1gY+LzX/uxWNB8+1996AQlOWyJSQ
W12hZED4HpyamJr6Z5BiVjSmCKhFG8kLk09D0dB35MBIsneBpFVbm4PHmnGm2X1e
0Cm92qMeIRj4unjGEK8rybJV1uy0b6mNzUgqdyXMzRagqespwi0/4rwNTn5uU9uW
gk5gsd7oV0HmbWKw83fHxE9MWj/L4t+9fW8UnVAYJMjehXhJohIUMK+B/dLQk61I
F0mX7XQDmrKgPOyBURGM36vkWqlgUPKISl2BlC/b7qgDOUnEDZmIdnv7Fnrplwt1
Ktwzsk7eTigi9iC4lpZ8mVs+m1ZXUlQnFlibXi2HB8fZe/4pWn4=
=e088
-----END PGP SIGNATURE-----
Merge tag 'x86_vmware_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 vmware cleanup from Borislav Petkov:
- A single statement simplification by using the BIT() macro
* tag 'x86_vmware_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vmware: Use BIT() macro for shifting
when injecting errors on AMD platforms. In some cases, the platform
could prohibit those.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmLni2gACgkQEsHwGGHe
VUq1uQ//SeO7mVATL+gtwbh3NGBUsLhYJeZkNOGaIxbiKSxEUiCuHwdUmIZukLIL
dTOAY60Wa9O7wuO9g1p2oeAK8SQO3ZyoIbKX5KZxy+eiCw0lgVyRv12l9qatj/bt
KL+ImDGkoUYp1GMrZP7Lp1B9vVc4lm73qkHSRseNrnjv8EKJbty62Ed6bhgjU+CN
jw+mbTHYGIO8M7XSPvzQhDmIBUSy1N6XVIUcBD2IqWoQCEgecW6woPUHvkoWlI/B
OwQ8KJjM5oRre/AqNN8t7COP5erYY1Qi3xX1+1QnFYlxx8/Z5w4V09X00MDN7NpG
1sJZPIctJ5lcEv6kSG+mI4D2TpmiMWDlWL1ifyZjY/p4Fu7bXEvtCpGTFGlsTWzN
kdiLEjjhA9D+ag2Ah52FBBgL3FpfJxrjDPoL8fYsVkxpzETiwXugqHr7MUh5HeHE
rQldU3aUdXvH94ilQn5Mx9bVwvVMY/egwCXMKQnz/Xzt+V4NnXPYs4didcPNsnDB
QlPpeiCkDmFsqdVQB+GDFq/bh9TeIHh6I+3zY+Esvi2y1m1IjzGbwwqjZgqhpmf3
9dVH7+bucn1muekA7uQL6R34AaPR6cST5QEEM2Lzp/77XnuQ35uvXLH80gHUT4BZ
a3UUiVXRELT5+xjx57efnnJj56NVuGsdTreC2QSA11fIPW91L84=
=Qz6G
-----END PGP SIGNATURE-----
Merge tag 'ras_core_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS update from Borislav Petkov:
"A single RAS change:
- Probe whether hardware error injection (direct MSR writes) is
possible when injecting errors on AMD platforms. In some cases, the
platform could prohibit those"
* tag 'ras_core_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce: Check whether writes to MCA_STATUS are getting ignored
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCYufiMwAKCRCRxhvAZXjc
os2iAQDr3tK9e2EUZDZ3Vgu3tvmTLKiU7W7f4U/ZAjJE5snBOwD+OqK8r1RdvXf8
TatkVFFNZYlINDN6JrS5yGSKBm1+RwE=
=8eZE
-----END PGP SIGNATURE-----
Merge tag 'fs.idmapped.overlay.acl.v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull acl updates from Christian Brauner:
"Last cycle we introduced support for mounting overlayfs on top of
idmapped mounts. While looking into additional testing we realized
that posix acls don't really work correctly with stacking filesystems
on top of idmapped layers.
We already knew what the fix were but it would require work that is
more suitable for the merge window so we turned off posix acls for
v5.19 for overlayfs on top of idmapped layers with Miklos routing my
patch upstream in 72a8e05d4f ("Merge tag 'ovl-fixes-5.19-rc7' [..]").
This contains the work to support posix acls for overlayfs on top of
idmapped layers. Since the posix acl fixes should use the new
vfs{g,u}id_t work the associated branch has been merged in. (We sent a
pull request for this earlier.)
We've also pulled in Miklos pull request containing my patch to turn
of posix acls on top of idmapped layers. This allowed us to avoid
rebasing the branch which we didn't like because we were already at
rc7 by then. Merging it in allows this branch to first fix posix acls
and then to cleanly revert the temporary fix it brought in by commit
4a47c6385b ("ovl: turn of SB_POSIXACL with idmapped layers
temporarily").
The last patch in this series adds Seth Forshee as a co-maintainer for
idmapped mounts. Seth has been integral to all of this work and is
also the main architect behind the filesystem idmapping work which
ultimately made filesystems such as FUSE and overlayfs available in
containers. He continues to be active in both development and review.
I'm very happy he decided to help and he has my full trust. This
increases the bus factor which is always great for work like this. I'm
honestly very excited about this because I think in general we don't
do great in the bringing on new maintainers department"
For more explanations of the ACL issues, see
https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org/
* tag 'fs.idmapped.overlay.acl.v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
Add Seth Forshee as co-maintainer for idmapped mounts
Revert "ovl: turn of SB_POSIXACL with idmapped layers temporarily"
ovl: handle idmappings in ovl_get_acl()
acl: make posix_acl_clone() available to overlayfs
acl: port to vfs{g,u}id_t
acl: move idmapped mount fixup into vfs_{g,s}etxattr()
mnt_idmapping: add vfs[g,u]id_into_k[g,u]id()
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCYufP6AAKCRCRxhvAZXjc
omzRAQCGJ11r7T0C7t1kTdQiFSs5XN9ksFa86Hfj3dHEBIj+LQEA+bZ2/LLpElDz
zPekgXkFQqdMr+FUL8sk94dzHT0GAgk=
=BcK/
-----END PGP SIGNATURE-----
Merge tag 'fs.idmapped.vfsuid.v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull fs idmapping updates from Christian Brauner:
"This introduces the new vfs{g,u}id_t types we agreed on. Similar to
k{g,u}id_t the new types are just simple wrapper structs around
regular {g,u}id_t types.
They allow to establish a type safety boundary in the VFS for idmapped
mounts preventing confusion betwen {g,u}ids mapped into an idmapped
mount and {g,u}ids mapped into the caller's or the filesystem's
idmapping.
An initial set of helpers is introduced that allows to operate on
vfs{g,u}id_t types. We will remove all references to non-type safe
idmapped mounts helpers in the very near future. The patches do
already exist.
This converts the core attribute changing codepaths which become
significantly easier to reason about because of this change.
Just a few highlights here as the patches give detailed overviews of
what is happening in the commit messages:
- The kernel internal struct iattr contains type safe vfs{g,u}id_t
values clearly communicating that these values have to take a given
mount's idmapping into account.
- The ownership values placed in struct iattr to change ownership are
identical for idmapped and non-idmapped mounts going forward. This
also allows to simplify stacking filesystems such as overlayfs that
change attributes In other words, they always represent the values.
- Instead of open coding checks for whether ownership changes have
been requested and an actual update of the inode is required we now
have small static inline wrappers that abstract this logic away
removing a lot of code duplication from individual filesystems that
all open-coded the same checks"
* tag 'fs.idmapped.vfsuid.v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
mnt_idmapping: align kernel doc and parameter order
mnt_idmapping: use new helpers in mapped_fs{g,u}id()
fs: port HAS_UNMAPPED_ID() to vfs{g,u}id_t
mnt_idmapping: return false when comparing two invalid ids
attr: fix kernel doc
attr: port attribute changes to new types
security: pass down mount idmapping to setattr hook
quota: port quota helpers mount ids
fs: port to iattr ownership update helpers
fs: introduce tiny iattr ownership update helpers
fs: use mount types in iattr
fs: add two type safe mapping helpers
mnt_idmapping: add vfs{g,u}id_t
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEES8DXskRxsqGE6vXTAA5oQRlWghUFAmLnshQTHGpsYXl0b25A
a2VybmVsLm9yZwAKCRAADmhBGVaCFa1ZEACWzjP9gDRO+b5HuovRofO5gCfi0LNK
jQAnUQmFBbV28MuRBr8lzjZFsn52C3nEz/unHpl2NXrg1dErdXmTZIUYkZIoESQl
0hyA2lhdm/pvqfj5t9xwt9lK9xts7G+Q1Q2JsT53QlpGd7q9VOq0CFrFTuIe+HmZ
qw9Sy/3rfP/rPALv1OzIlGDdBuslfPuijJJZq0wYx4WupA6vlGGSZXn+LxF2dHW9
Ex/Z+n6o5mzEuPedopBBsCvdMTO2/sVmz33puqM0KBb/gmL47i15o1XXdg1O0cbL
7LxIDOfaIm6gFsznUwrJV54WrL8zISQd/BhXbQOrbE8kmnNii1kfIyJHYx55Sa4X
y6TmqVbYERXIwCFquO78Uywt8UgjRjuxG8SRe0AmqsxvIn/IxTjqMn5yaMURCTxA
uyOmXHxLss3Jf2LNfd6nnrK5qKpOnPOBAn8I/4UY+eNdJGqesLyKoVPZ9O6K1dr3
+jZJ8Ju4TVs7L3fljq6pHvbhAWivM3JEZmYrv+y8QKSRZBV0XqHagwDGHUaaLe9H
6eHgU5yxCb+fj8EXbwxzKnJkhHXJikd4bbPOaJC+QZEKPCJJMo/pyXmDkCVwhJ73
pjO4W0w6TGmCHinlVX6dkyYrCvWYjWglHyO5BnTY2F0Ub87/59KmepZz4dh81hi+
ZdOIvHoF6uca3A==
=wDtt
-----END PGP SIGNATURE-----
Merge tag 'filelock-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux
Pull file locking updates from Jeff Layton:
"Just a couple of flock() patches from Kuniyuki Iwashima.
The main change is that this moves a file_lock allocation from the
slab to the stack"
* tag 'filelock-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
fs/lock: Rearrange ops in flock syscall.
fs/lock: Don't allocate file_lock in flock_make_lock().
- Add Yue Hu and Jeffle Xu as reviewers;
- Add the missing wake_up when updating lzma streams;
- Avoid consecutive detection for Highmem memory;
- Prepare for multi-reference pclusters and get rid of PG_error;
- Fix ctx->pos update for NFS export;
- minor cleanups.
-----BEGIN PGP SIGNATURE-----
iIcEABYIAC8WIQThPAmQN9sSA0DVxtI5NzHcH7XmBAUCYub3chEceGlhbmdAa2Vy
bmVsLm9yZwAKCRA5NzHcH7XmBMMSAP9P7kMPLuc0RP9AjoiQXKNAfWqIbGnbkI5C
ACUUu5tZEgD/T7HhkDYIs/wAZzYB7qTkpepkY/XzuwnlodhaSTwnPQ8=
=/vU1
-----END PGP SIGNATURE-----
Merge tag 'erofs-for-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs updates from Gao Xiang:
"First of all, we'd like to add Yue Hu and Jeffle Xu as two new
reviewers. Thank them for spending time working on EROFS!
There is no major feature outstanding in this cycle, mainly a patchset
I worked on to prepare for rolling hash deduplication and folios for
compressed data as the next big features. It kills the unneeded
PG_error flag dependency as well.
Apart from that, there are bugfixes and cleanups as always. Details
are listed below:
- Add Yue Hu and Jeffle Xu as reviewers
- Add the missing wake_up when updating lzma streams
- Avoid consecutive detection for Highmem memory
- Prepare for multi-reference pclusters and get rid of PG_error
- Fix ctx->pos update for NFS export
- minor cleanups"
* tag 'erofs-for-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: (23 commits)
erofs: update ctx->pos for every emitted dirent
erofs: get rid of the leftover PAGE_SIZE in dir.c
erofs: get rid of erofs_prepare_dio() helper
erofs: introduce multi-reference pclusters (fully-referenced)
erofs: record the longest decompressed size in this round
erofs: introduce z_erofs_do_decompressed_bvec()
erofs: try to leave (de)compressed_pages on stack if possible
erofs: introduce struct z_erofs_decompress_backend
erofs: get rid of `z_pagemap_global'
erofs: clean up `enum z_erofs_collectmode'
erofs: get rid of `enum z_erofs_page_type'
erofs: rework online page handling
erofs: switch compressed_pages[] to bufvec
erofs: introduce `z_erofs_parse_in_bvecs'
erofs: drop the old pagevec approach
erofs: introduce bufvec to store decompressed buffers
erofs: introduce `z_erofs_parse_out_bvecs()'
erofs: clean up z_erofs_collector_begin()
erofs: get rid of unneeded `inode', `map' and `sb'
erofs: avoid consecutive detection for Highmem memory
...
Pull fsnotify updates from Jan Kara:
- support for FAN_MARK_IGNORE which untangles some of the not well
defined corner cases with fanotify ignore masks
- small cleanups
* tag 'fsnotify_for_v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
fsnotify: Fix comment typo
fanotify: introduce FAN_MARK_IGNORE
fanotify: cleanups for fanotify_mark() input validations
fanotify: prepare for setting event flags in ignore mask
fs: inotify: Fix typo in inotify comment
Pull ext2 and reiserfs updates from Jan Kara:
"A fix for ext2 handling of a corrupted fs image and cleanups in ext2
and reiserfs"
* tag 'fs_for_v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
ext2: Add more validity checks for inode counts
fs/reiserfs/inode: remove dead code in _get_block_create_0()
fs/ext2: replace ternary operator with min_t()
Changes in this set of commits:
. Delay the cleanup of interrupted posix lock requests until the
user space result arrives. Previously, the immediate cleanup
would lead to extraneous warnings when the result arrived.
. Tracepoint improvements, e.g. adding the lock resource name.
. Delay the completion of lockspace creation until one full recovery
cycle has completed. This allows more error cases to be returned to
the caller.
. Remove warnings from the locking layer about delayed network replies.
The recently added midcomms warnings are much more useful.
. Begin the process of deprecating two unused lock-timeout-related
features. These features now require enabling via a Kconfig option,
and enabling them triggers deprecation warnings. We expect to
remove the code in v6.2.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJi5+RGAAoJEDgbc8f8gGmq1RwP/2xZaVKiTPYcQ0GfcmefCnnG
8WxpMNv4ZPkjKVv7csBA8mcQyNuQqA4yLb3P+jEgkWDOKesJQNeTvfrittXCyfhG
C7uvbUe3OCg9m+dIrzNKBu+2WtFu6tKa3aSlpPUF3Bhhe8IhwRmAkyd/Ky0VCGr9
5jQWvy8D1p2pNoFsGKqhkfolqovmeTxgYtGxd/eHtiApo6tNwzbgcQAZw4vquCjk
FSPO7s5HyINik0nQQ9b8MCjywmF6HG6UZjcd/qYHTUmcBZkgpegCKZRnYwQklnBD
6BYj6X+w7WxgVsHgYBAtgd8oLRN5CtCmPljvnPTCjvgx6N9FTl8RJV8rwMqZ9C8U
9+w7WosLxQFSyRm7KxHmKaatkOa3Baqg7cPXSwaZnsA3vBpitHWKs9cyDKwA0j3/
sUWZFw+3VSuf7AJkSA848tC8Xs8G6YXvZgzvxzNEvtTJgO3X7sXB2lavZDyI0S26
nwgXgs/Dt6QcOoQKGv8WgRSOMrFxtq/gX+f3gwPCHvM3panttPevXwKKQW2UtVOn
u/BF3Oe9bGhf+J0o58Zp3gjtfDIz+c3yPkxeQqAc3pC/o1Lw7AMV2WxlxULoBLsv
aErKwT0UemrQYRZnBmlGPaV4H1KyXzwC/fA1N8YAObJ/Ohe6x7oCKioWWMA4ggiD
A4mOIY95o24rm++lNUkD
=Dnn4
-----END PGP SIGNATURE-----
Merge tag 'dlm-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
Pull dlm updates from David Teigland:
- Delay the cleanup of interrupted posix lock requests until the user
space result arrives. Previously, the immediate cleanup would lead to
extraneous warnings when the result arrived.
- Tracepoint improvements, e.g. adding the lock resource name.
- Delay the completion of lockspace creation until one full recovery
cycle has completed. This allows more error cases to be returned to
the caller.
- Remove warnings from the locking layer about delayed network replies.
The recently added midcomms warnings are much more useful.
- Begin the process of deprecating two unused lock-timeout-related
features. These features now require enabling via a Kconfig option,
and enabling them triggers deprecation warnings. We expect to remove
the code in v6.2.
* tag 'dlm-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
fs: dlm: move kref_put assert for lkb structs
fs: dlm: don't use deprecated timeout features by default
fs: dlm: add deprecation Kconfig and warnings for timeouts
fs: dlm: remove timeout from dlm_user_adopt_orphan
fs: dlm: remove waiter warnings
fs: dlm: fix grammar in lowcomms output
fs: dlm: add comment about lkb IFL flags
fs: dlm: handle recovery result outside of ls_recover
fs: dlm: make new_lockspace() wait until recovery completes
fs: dlm: call dlm_lsop_recover_prep once
fs: dlm: update comments about recovery and membership handling
fs: dlm: add resource name to tracepoints
fs: dlm: remove additional dereference of lksb
fs: dlm: change ast and bast trace order
fs: dlm: change posix lock sigint handling
fs: dlm: use dlm_plock_info for do_unlock_close
fs: dlm: change plock interrupted message to debug again
fs: dlm: add pid to debug log
fs: dlm: plock use list_first_entry
The unhold_lkb() function decrements the lock's kref, and
asserts that the ref count was not the final one. Use the
kref_put release function (which should not be called) to
call the assert, rather than doing the assert based on the
kref_put return value. Using kill_lkb() as the release
function doesn't make sense if we only want to assert.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
This patch will disable use of deprecated timeout features if
CONFIG_DLM_DEPRECATED_API is not set. The deprecated features
will be removed in upcoming kernel release v6.2.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
This patch adds a CONFIG_DLM_DEPRECATED_API Kconfig option
that must be enabled to use two timeout-related features
that we intend to remove in kernel v6.2. Warnings are
printed if either is enabled and used. Neither has ever
been used as far as we know.
. The DLM_LSFL_TIMEWARN lockspace creation flag will be
removed, along with the associated configfs entry for
setting the timeout. Setting the flag and configfs file
would cause dlm to track how long locks were waiting
for reply messages. After a timeout, a kernel message
would be logged, and a netlink message would be sent
to userspace. Recently, midcomms messages have been
added that produce much better logging about actual
problems with messages. No use has ever been found
for the netlink messages.
. The userspace libdlm API has allowed the DLM_LKF_TIMEOUT
flag with a timeout value to be set in lock requests.
The lock request would be cancelled after the timeout.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
rseq_abi()->flags and rseq_abi()->rseq_cs->flags 29 upper bits are
currently unused.
The current behavior when those bits are set is to ignore them. This is
not an ideal behavior, because when future features will start using
those flags, if user-space fails to correctly validate that the kernel
indeed supports those flags (e.g. with a new sys_rseq flags bit) before
using them, it may incorrectly assume that the kernel will handle those
flags way when in fact those will be silently ignored on older kernels.
Validating that unused flags bits are cleared will allow a smoother
transition when those flags will start to be used by allowing
applications to fail early, and obviously, when they attempt to use the
new flags on an older kernel that does not support them.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20220622194617.1155957-2-mathieu.desnoyers@efficios.com
- Check the IBPB feature flag before enabling IBPB in firmware calls
because cloud vendors' fantasy when it comes to creating guest
configurations is unlimited
- Unexport sev_es_ghcb_hv_call() before 5.19 releases now that HyperV
doesn't need it anymore
- Remove dead CONFIG_* items
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmLmVtEACgkQEsHwGGHe
VUoPnBAApfqJMYSnevjBqhiO7W/8s1GDkbvzZD/qHwQKIiTSNZWmB1QGaBJLmPWr
6UvsFq3ElxFkg7rovHKYV197cHZlldWNt6BC2mDUESAHZb8HMw38e0IUcxbOJHZq
DnLVxcek3VkDG8THGSoY+NX3lvcvTx+w5C7o2SZnjBxhBYMBEXWP14UvoVAWV+HT
/vEcHi3jkYiNwyTtQFdszIxF5u5qMo2qV24hiTZDYFHBBsEGTRxVRgo4kHBQlQ/t
3AxrW01Ut4zunqKlXG0wXncF1aSgfsb7XplR9bqfWz9eQzFHkZ0DqqfoCXQZRQZo
nYQQT/A/hY2rm/HFBZ329hDm6fnu+u/8FzaBGm3DUp9UWGLqxFcCqH+QtKmpJXhr
wTK/7mB2Baw0lhc110LhDLLFydI8smQwfPf8B9IzR3Ij7j9OYqO8+NFwNR+tMk+J
VWl5aFafzVEQcf7gBGVsu/sRkxc05VtEohOV25J9VHDzlaBCMCvCpoGKfwntpp0h
9xaWUNE9/P1ggbRcxUHVmdnDnoNn087hqUBOO7GOX/cnFvADMjL3h0GqvZinj/wI
8BbpTxAU8i5qodJcsnnzxtzekxzKk6KhcHo/sMULyVSAeDnTfaPIkyfE3b6Pxiam
U1QFTWPqV9371u26dnF0bYsg+UEJasuuth8noybVwej+MJvapts=
=fEYI
-----END PGP SIGNATURE-----
Merge tag 'x86_urgent_for_v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Update the 'mitigations=' kernel param documentation
- Check the IBPB feature flag before enabling IBPB in firmware calls
because cloud vendors' fantasy when it comes to creating guest
configurations is unlimited
- Unexport sev_es_ghcb_hv_call() before 5.19 releases now that HyperV
doesn't need it anymore
- Remove dead CONFIG_* items
* tag 'x86_urgent_for_v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
docs/kernel-parameters: Update descriptions for "mitigations=" param with retbleed
x86/bugs: Do not enable IBPB at firmware entry when IBPB is not available
Revert "x86/sev: Expose sev_es_ghcb_hv_call() for use by HyperV"
x86/configs: Update configs in x86_debug.config
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmLmUPkACgkQEsHwGGHe
VUqgow/+Oj8acqImjR1OGW0MGW5F4OBRxPlWYGRBem0PwtysKSOUEuLKFGrfUPP8
9/o/WDK7sKm0A0Ph4++zyuxQVUdww1kWR1BaOzBBJZMhB3dYk511JW2EZc7TPQg8
qnBWOh1WGztaIATImo1JtN7GVlz6mWEq5i7CkyYWOfqqgMMfzS5N548KtFs37k1F
GPwR2fntThsgYlL7+5ekHVBabx3Lf5CvpUkct484LtIrvO9xvBr+R5fzxdkd/j7s
xGVFpt0sMEGjnOatLP+Q41E6n4Vugzjk9FdxOAYLcSl8NPGj/7HUtXB0oLcU7jSn
eFxr2vurueVxpueNieBKJNiSicFsgx+QNsEtERtzLfyosgKtDkWtl5cP6k7qzqVm
9KGAWc5tiQJ5DcIoxf+pKBEXBnf6EKFS7PrknYFTbWPFnbun0nw4OnFLufUgeg9c
qB6afbWUOwKLWYIcJZadmnvmE2ZhaPAv1KPvqeE7E8ln5ERbg2UKY4qV37bvyJFg
N+gVv+acSip4KtGswGUBKFriJ/vvN1dh/PiBqqJC3AHwlz+CxYsOVgpk9tkhlaQ9
1HsQ51hyN/pb688J9SshqZf2BH3qS6Kz4eLa1eXGPEywsRBJfg4lufncn1JbrCg8
CzkUfVPbS31LahMDs5U3IWGSiYSUsy1JDRLZ2zns9ZEMaaZWPKQ=
=SBw2
-----END PGP SIGNATURE-----
Merge tag 'locking_urgent_for_v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fix from Borislav Petkov:
- Avoid rwsem lockups in certain situations when handling the handoff
bit
* tag 'locking_urgent_for_v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/rwsem: Allow slowpath writer to ignore handoff bit if not set by first waiter