71de496cc4
36784 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
|
345daff2e9 |
ucounts: Fix race condition between alloc_ucounts and put_ucounts
The race happens because put_ucounts() doesn't use spinlock and
get_ucounts is not under spinlock:
CPU0 CPU1
---- ----
alloc_ucounts() put_ucounts()
spin_lock_irq(&ucounts_lock);
ucounts = find_ucounts(ns, uid, hashent);
atomic_dec_and_test(&ucounts->count))
spin_unlock_irq(&ucounts_lock);
spin_lock_irqsave(&ucounts_lock, flags);
hlist_del_init(&ucounts->node);
spin_unlock_irqrestore(&ucounts_lock, flags);
kfree(ucounts);
ucounts = get_ucounts(ucounts);
==================================================================
BUG: KASAN: use-after-free in instrument_atomic_read_write include/linux/instrumented.h:101 [inline]
BUG: KASAN: use-after-free in atomic_add_negative include/asm-generic/atomic-instrumented.h:556 [inline]
BUG: KASAN: use-after-free in get_ucounts kernel/ucount.c:152 [inline]
BUG: KASAN: use-after-free in get_ucounts kernel/ucount.c:150 [inline]
BUG: KASAN: use-after-free in alloc_ucounts+0x19b/0x5b0 kernel/ucount.c:188
Write of size 4 at addr ffff88802821e41c by task syz-executor.4/16785
CPU: 1 PID: 16785 Comm: syz-executor.4 Not tainted 5.14.0-rc1-next-20210712-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:105
print_address_description.constprop.0.cold+0x6c/0x309 mm/kasan/report.c:233
__kasan_report mm/kasan/report.c:419 [inline]
kasan_report.cold+0x83/0xdf mm/kasan/report.c:436
check_region_inline mm/kasan/generic.c:183 [inline]
kasan_check_range+0x13d/0x180 mm/kasan/generic.c:189
instrument_atomic_read_write include/linux/instrumented.h:101 [inline]
atomic_add_negative include/asm-generic/atomic-instrumented.h:556 [inline]
get_ucounts kernel/ucount.c:152 [inline]
get_ucounts kernel/ucount.c:150 [inline]
alloc_ucounts+0x19b/0x5b0 kernel/ucount.c:188
set_cred_ucounts+0x171/0x3a0 kernel/cred.c:684
__sys_setuid+0x285/0x400 kernel/sys.c:623
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x4665d9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fde54097188 EFLAGS: 00000246 ORIG_RAX: 0000000000000069
RAX: ffffffffffffffda RBX: 000000000056bf80 RCX: 00000000004665d9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000000ff
RBP: 00000000004bfcb9 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056bf80
R13: 00007ffc8655740f R14: 00007fde54097300 R15: 0000000000022000
Allocated by task 16784:
kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38
kasan_set_track mm/kasan/common.c:46 [inline]
set_alloc_info mm/kasan/common.c:434 [inline]
____kasan_kmalloc mm/kasan/common.c:513 [inline]
____kasan_kmalloc mm/kasan/common.c:472 [inline]
__kasan_kmalloc+0x9b/0xd0 mm/kasan/common.c:522
kmalloc include/linux/slab.h:591 [inline]
kzalloc include/linux/slab.h:721 [inline]
alloc_ucounts+0x23d/0x5b0 kernel/ucount.c:169
set_cred_ucounts+0x171/0x3a0 kernel/cred.c:684
__sys_setuid+0x285/0x400 kernel/sys.c:623
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
Freed by task 16785:
kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38
kasan_set_track+0x1c/0x30 mm/kasan/common.c:46
kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:360
____kasan_slab_free mm/kasan/common.c:366 [inline]
____kasan_slab_free mm/kasan/common.c:328 [inline]
__kasan_slab_free+0xfb/0x130 mm/kasan/common.c:374
kasan_slab_free include/linux/kasan.h:229 [inline]
slab_free_hook mm/slub.c:1650 [inline]
slab_free_freelist_hook+0xdf/0x240 mm/slub.c:1675
slab_free mm/slub.c:3235 [inline]
kfree+0xeb/0x650 mm/slub.c:4295
put_ucounts kernel/ucount.c:200 [inline]
put_ucounts+0x117/0x150 kernel/ucount.c:192
put_cred_rcu+0x27a/0x520 kernel/cred.c:124
rcu_do_batch kernel/rcu/tree.c:2550 [inline]
rcu_core+0x7ab/0x1380 kernel/rcu/tree.c:2785
__do_softirq+0x29b/0x9c2 kernel/softirq.c:558
Last potentially related work creation:
kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38
kasan_record_aux_stack+0xe5/0x110 mm/kasan/generic.c:348
insert_work+0x48/0x370 kernel/workqueue.c:1332
__queue_work+0x5c1/0xed0 kernel/workqueue.c:1498
queue_work_on+0xee/0x110 kernel/workqueue.c:1525
queue_work include/linux/workqueue.h:507 [inline]
call_usermodehelper_exec+0x1f0/0x4c0 kernel/umh.c:435
kobject_uevent_env+0xf8f/0x1650 lib/kobject_uevent.c:618
netdev_queue_add_kobject net/core/net-sysfs.c:1621 [inline]
netdev_queue_update_kobjects+0x374/0x450 net/core/net-sysfs.c:1655
register_queue_kobjects net/core/net-sysfs.c:1716 [inline]
netdev_register_kobject+0x35a/0x430 net/core/net-sysfs.c:1959
register_netdevice+0xd33/0x1500 net/core/dev.c:10331
nsim_init_netdevsim drivers/net/netdevsim/netdev.c:317 [inline]
nsim_create+0x381/0x4d0 drivers/net/netdevsim/netdev.c:364
__nsim_dev_port_add+0x32e/0x830 drivers/net/netdevsim/dev.c:1295
nsim_dev_port_add_all+0x53/0x150 drivers/net/netdevsim/dev.c:1355
nsim_dev_probe+0xcb5/0x1190 drivers/net/netdevsim/dev.c:1496
call_driver_probe drivers/base/dd.c:517 [inline]
really_probe+0x23c/0xcd0 drivers/base/dd.c:595
__driver_probe_device+0x338/0x4d0 drivers/base/dd.c:747
driver_probe_device+0x4c/0x1a0 drivers/base/dd.c:777
__device_attach_driver+0x20b/0x2f0 drivers/base/dd.c:894
bus_for_each_drv+0x15f/0x1e0 drivers/base/bus.c:427
__device_attach+0x228/0x4a0 drivers/base/dd.c:965
bus_probe_device+0x1e4/0x290 drivers/base/bus.c:487
device_add+0xc2f/0x2180 drivers/base/core.c:3356
nsim_bus_dev_new drivers/net/netdevsim/bus.c:431 [inline]
new_device_store+0x436/0x710 drivers/net/netdevsim/bus.c:298
bus_attr_store+0x72/0xa0 drivers/base/bus.c:122
sysfs_kf_write+0x110/0x160 fs/sysfs/file.c:139
kernfs_fop_write_iter+0x342/0x500 fs/kernfs/file.c:296
call_write_iter include/linux/fs.h:2152 [inline]
new_sync_write+0x426/0x650 fs/read_write.c:518
vfs_write+0x75a/0xa40 fs/read_write.c:605
ksys_write+0x12d/0x250 fs/read_write.c:658
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
Second to last potentially related work creation:
kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38
kasan_record_aux_stack+0xe5/0x110 mm/kasan/generic.c:348
insert_work+0x48/0x370 kernel/workqueue.c:1332
__queue_work+0x5c1/0xed0 kernel/workqueue.c:1498
queue_work_on+0xee/0x110 kernel/workqueue.c:1525
queue_work include/linux/workqueue.h:507 [inline]
call_usermodehelper_exec+0x1f0/0x4c0 kernel/umh.c:435
kobject_uevent_env+0xf8f/0x1650 lib/kobject_uevent.c:618
kobject_synth_uevent+0x701/0x850 lib/kobject_uevent.c:208
uevent_store+0x20/0x50 drivers/base/core.c:2371
dev_attr_store+0x50/0x80 drivers/base/core.c:2072
sysfs_kf_write+0x110/0x160 fs/sysfs/file.c:139
kernfs_fop_write_iter+0x342/0x500 fs/kernfs/file.c:296
call_write_iter include/linux/fs.h:2152 [inline]
new_sync_write+0x426/0x650 fs/read_write.c:518
vfs_write+0x75a/0xa40 fs/read_write.c:605
ksys_write+0x12d/0x250 fs/read_write.c:658
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
The buggy address belongs to the object at ffff88802821e400
which belongs to the cache kmalloc-192 of size 192
The buggy address is located 28 bytes inside of
192-byte region [ffff88802821e400, ffff88802821e4c0)
The buggy address belongs to the page:
page:ffffea0000a08780 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x2821e
flags: 0xfff00000000200(slab|node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000000200 dead000000000100 dead000000000122 ffff888010841a00
raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x12cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY), pid 1, ts 12874702440, free_ts 12637793385
prep_new_page mm/page_alloc.c:2433 [inline]
get_page_from_freelist+0xa72/0x2f80 mm/page_alloc.c:4166
__alloc_pages+0x1b2/0x500 mm/page_alloc.c:5374
alloc_page_interleave+0x1e/0x200 mm/mempolicy.c:2119
alloc_pages+0x238/0x2a0 mm/mempolicy.c:2242
alloc_slab_page mm/slub.c:1713 [inline]
allocate_slab+0x32b/0x4c0 mm/slub.c:1853
new_slab mm/slub.c:1916 [inline]
new_slab_objects mm/slub.c:2662 [inline]
___slab_alloc+0x4ba/0x820 mm/slub.c:2825
__slab_alloc.constprop.0+0xa7/0xf0 mm/slub.c:2865
slab_alloc_node mm/slub.c:2947 [inline]
slab_alloc mm/slub.c:2989 [inline]
__kmalloc+0x312/0x330 mm/slub.c:4133
kmalloc include/linux/slab.h:596 [inline]
kzalloc include/linux/slab.h:721 [inline]
__register_sysctl_table+0x112/0x1090 fs/proc/proc_sysctl.c:1318
rds_tcp_init_net+0x1db/0x4f0 net/rds/tcp.c:551
ops_init+0xaf/0x470 net/core/net_namespace.c:140
__register_pernet_operations net/core/net_namespace.c:1137 [inline]
register_pernet_operations+0x35a/0x850 net/core/net_namespace.c:1214
register_pernet_device+0x26/0x70 net/core/net_namespace.c:1301
rds_tcp_init+0x77/0xe0 net/rds/tcp.c:717
do_one_initcall+0x103/0x650 init/main.c:1285
do_initcall_level init/main.c:1360 [inline]
do_initcalls init/main.c:1376 [inline]
do_basic_setup init/main.c:1396 [inline]
kernel_init_freeable+0x6b8/0x741 init/main.c:1598
page last free stack trace:
reset_page_owner include/linux/page_owner.h:24 [inline]
free_pages_prepare mm/page_alloc.c:1343 [inline]
free_pcp_prepare+0x312/0x7d0 mm/page_alloc.c:1394
free_unref_page_prepare mm/page_alloc.c:3329 [inline]
free_unref_page+0x19/0x690 mm/page_alloc.c:3408
__vunmap+0x783/0xb70 mm/vmalloc.c:2587
free_work+0x58/0x70 mm/vmalloc.c:82
process_one_work+0x98d/0x1630 kernel/workqueue.c:2276
worker_thread+0x658/0x11f0 kernel/workqueue.c:2422
kthread+0x3e5/0x4d0 kernel/kthread.c:319
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
Memory state around the buggy address:
ffff88802821e300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff88802821e380: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc
>ffff88802821e400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88802821e480: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
ffff88802821e500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================
- The race fix has two parts.
* Changing the code to guarantee that ucounts->count is only decremented
when ucounts_lock is held. This guarantees that find_ucounts
will never find a structure with a zero reference count.
* Changing alloc_ucounts to increment ucounts->count while
ucounts_lock is held. This guarantees the reference count on the
found data structure will not be decremented to zero (and the data
structure freed) before the reference count is incremented.
-- Eric Biederman
Reported-by: syzbot+01985d7909f9468f013c@syzkaller.appspotmail.com
Reported-by: syzbot+59dd63761094a80ad06d@syzkaller.appspotmail.com
Reported-by: syzbot+6cd79f45bb8fa1c9eeae@syzkaller.appspotmail.com
Reported-by: syzbot+b6e65bd125a05f803d6b@syzkaller.appspotmail.com
Fixes:
|
||
|
c3df5fb57f |
cgroup: rstat: fix A-A deadlock on 32bit around u64_stats_sync
|
||
|
51bbe7ebac |
Merge branch 'for-5.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fix from Tejun Heo: "Fix leak of filesystem context root which is triggered by LTP. Not too likely to be a problem in non-testing environments" * 'for-5.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup1: fix leaked context root causing sporadic NULL deref in LTP |
||
|
82d712f6d1 |
Merge branch 'for-5.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fix from Tejun Heo: "Fix a use-after-free in allocation failure handling path" * 'for-5.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: fix UAF in pwq_unbound_release_workfn() |
||
|
bb7262b295 |
timers: Move clearing of base::timer_running under base:: Lock
syzbot reported KCSAN data races vs. timer_base::timer_running being set to
NULL without holding base::lock in expire_timers().
This looks innocent and most reads are clearly not problematic, but
Frederic identified an issue which is:
int data = 0;
void timer_func(struct timer_list *t)
{
data = 1;
}
CPU 0 CPU 1
------------------------------ --------------------------
base = lock_timer_base(timer, &flags); raw_spin_unlock(&base->lock);
if (base->running_timer != timer) call_timer_fn(timer, fn, baseclk);
ret = detach_if_pending(timer, base, true); base->running_timer = NULL;
raw_spin_unlock_irqrestore(&base->lock, flags); raw_spin_lock(&base->lock);
x = data;
If the timer has previously executed on CPU 1 and then CPU 0 can observe
base->running_timer == NULL and returns, assuming the timer has completed,
but it's not guaranteed on all architectures. The comment for
del_timer_sync() makes that guarantee. Moving the assignment under
base->lock prevents this.
For non-RT kernel it's performance wise completely irrelevant whether the
store happens before or after taking the lock. For an RT kernel moving the
store under the lock requires an extra unlock/lock pair in the case that
there is a waiter for the timer, but that's not the end of the world.
Reported-by: syzbot+aa7c2385d46c5eba0b89@syzkaller.appspotmail.com
Reported-by: syzbot+abea4558531bae1ba9fe@syzkaller.appspotmail.com
Fixes:
|
||
|
a1833a5403 |
smpboot: fix duplicate and misplaced inlining directive
gcc doesn't care, but clang quite reasonably pointed out that the recent
commit
|
||
|
12e9bd168c |
A samll set of timer related fixes:
- Plug a race between rearm and process tick in the posix CPU timers code - Make the optimization to avoid recalculation of the next timer interrupt work correctly when there are no timers pending. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmD9LLUTHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYocqMD/9ciaE5J4gbsamavrPfMaQNo3c3mom1 3xoskHFKiz9fnYL/f3yNQ2OjXl+lxV0VLSwjJnc1TfhQM5g+X7uI4P/sCZzHtEmP F3jTCUX99DSlTEB4Fc3ssr/hZPQab9nXearek9eAEpLKQDe1U1q2p314Mr4dA+Kj awJUuOlkLt+NwRCajuZK6Ur0D1Zte56C/Nl3PeppgJY1U5tLCKTE8ZmRbdLo11zM BEMFL95on1a/wKzTkuYqhfSQn4JWZo4lLP9jVHueF2nbPbOGdC9VwvegRMtRKG/k 0I8n7mcXvzi9pDP08o96rzjdZ9KBpG2hLkL0PgCDjrXwlOBH7tDkOBajJ+x5AgAf 71Zi/XOfjaHbkm37uLTTeerG2pKilds7ukjnQ3VS3t/XfPw6Nr6r9ig5AyBxpnjk 8Shw8kfEOApLEnmYwKAXHCewjxppp9pDsR4J7sYIfiYoDZRfF56Xyr/pKa8cpZge 9ByK8Pul4J9RhTOgIgJNMBb78gmFxsKi5CMt6kj8O89omc4pIUVpEK0z3kWmbjrd m1mtcO2kS/ry+7TgAjkxHcrcm/QX+y/2SrvvLoqVLAJQfIrffsiGGehaIXS8rAKg jlCd3s0NET6yULyQwI7qUdS+ZgtYSQKfIkU38VVcXaplUOABWcCwNXRh0rk6+gBs +HK8UxQnAYRGrw== =Q6oh -----END PGP SIGNATURE----- Merge tag 'timers-urgent-2021-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "A small set of timer related fixes: - Plug a race between rearm and process tick in the posix CPU timers code - Make the optimization to avoid recalculation of the next timer interrupt work correctly when there are no timers pending" * tag 'timers-urgent-2021-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timers: Fix get_next_timer_interrupt() with no timers pending posix-cpu-timers: Fix rearm racing against process tick |
||
|
9041a4d2ee |
A single update for the boot code to prevent aggressive un-inlining which
causes a section mismatch. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmD9KsYTHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYof1HEACfpcjQbezuNjuoSYQNpAz9jz/W63iH FXTBAiC8IVzloRgNjlbVAVGgJlYFgbgkke2ulNwsRIwwVhhyDYDPio88U498LdB5 bQdw/jsylHY+5ipvGCxf2uvMI8xIk6Gj3bbgIsVxBqpCTaJOsFE8E69UUVZjxPMB kpHQ2GzCLe0FxXC7reFoIIPhY9U1cYXNHUso4w2um6enr10uEwMouXOGkhNAboOl bmi97wWd/wgfSJNUGrsGpK+f4yTAhfdXvPv79t0Dzbg7KqTxFdU2dmrHQYinuw96 YTZLW32o5JJzW/AjPcDTTixXStbwtrAS6GPqoAL65n2rvwit64SYfhPjJAie3+Ly eHzNUyZX6+zczfe7zj8nXiutMc/KI5aRcwf1S2PCMefi1IoJuVOdbe9Pgj/iFtCt GRGSZ+gob7R4Onvs2Yaw8iS32KjetmGTN7V9BuPTIgPgZ9UjQvB9pnDWcICJnT9U 6HLvKBhVBfDVOSe2YDNEIMFVJdacb0EKegyLXU0S/E0CUPitXM9TivazRrSprEiJ nZxH5sM+9rFtpHWA6lxKYmRP5e0BoMCZ6kerHmOMNa6dW4ha6D3J1KUJPYgJ4YH0 wQ9dF4ppHgBGsBtkZ7YVwXHkNnKJnr8GN8Xcf6CUFh1Yu61QMDCDyGR8cEVFc8Ai TtZDCYJLBBA5bA== =t8VJ -----END PGP SIGNATURE----- Merge tag 'core-urgent-2021-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core fix from Thomas Gleixner: "A single update for the boot code to prevent aggressive un-inlining which causes a section mismatch" * tag 'core-urgent-2021-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: smpboot: Mark idle_init() as __always_inlined to work around aggressive compiler un-inlining |
||
|
04ca88d056 |
dma-mapping fix for Lonux 5.14
- handle vmalloc addresses in dma_common_{mmap,get_sgtable} (Roman Skakun) -----BEGIN PGP SIGNATURE----- iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmD8/l8LHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYNBzhAAuj1XSAfWr+me5bnSX2uxDJuD4aW7fPpmRo2VxePT uvAL5IY76urO68vnQgPsL5h1fJEsPpaUbztb4FtLb/xEKy22IOtWq5xV4igSzqMd 1c1FBCsaNXrVUtEfpZinQW5kXi82X0L1AI+52gnN847D/ZH8crPKkBajMUk2bQ+v tqR9OxQoHfRmRB3ACWSgo0RifA0131lnWCxtc/lhbZV4XvajKzZn95mx0M3kG20K AtTEhIDYJNkPB4rbHn0HbkG6MG8tnNHAoOhPrfySUlZnrb/WM6el/9FlBf1Hks/u g8MZuZM7ChRLLDGkQtRUDiFwCmQm7TPkj9N9OQ6pyr0sO6P0o2L0Ligv4jiaX2Qg O84/VIskJSY/tix3aLCOrqRIAK8JK1H6sShraDBpLRki0SJo7wtrml1SIYCmG2kV WsYgVqYWvXP+Z0imck9UFmA/LEgbMk3y3k8op8wxSUe3fRRuAHYW14Nsko7nLUoI NKyXw1/kcvqHfo2Bwg4BQFpjddLKGIDcBGulY8+1WyxdmckkKyWV7HL7TG+wIc1n TJc7LedsG4fEUdUuPpjzQNGT6IQQWlwn40bVnFH1hU5n9dQeiv9s73+rAENnWlCB SFtRbDd9i1l2D9qpyEQehLvEchKYOi2AeplDPO5AWum5vo9/LmuoiUgXLxHb1UO9 Qsg= =i9u8 -----END PGP SIGNATURE----- Merge tag 'dma-mapping-5.14-1' of git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping fix from Christoph Hellwig: - handle vmalloc addresses in dma_common_{mmap,get_sgtable} (Roman Skakun) * tag 'dma-mapping-5.14-1' of git://git.infradead.org/users/hch/dma-mapping: dma-mapping: handle vmalloc addresses in dma_common_{mmap,get_sgtable} |
||
|
05daae0fb0 |
Tracing fixes for 5.14-rc2:
- Fix deadloop in ring buffer because of using stale "read" variable - Fix synthetic event use of field_pos as boolean and not an index - Fixed histogram special var "cpu" overriding event fields called "cpu" - Cleaned up error prone logic in alloc_synth_event() - Removed call to synchronize_rcu_tasks_rude() when not needed - Removed redundant initialization of a local variable "ret" - Fixed kernel crash when updating tracepoint callbacks of different priorities. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYPrGTxQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qusoAQDZkMo/vBFZgNGcL8GNCFpOF9HcV7QI JtBU+UG5GY2LagD/SEFEoG1o9UwKnIaBwr7qxGvrPgg8jKWtK/HEVFU94wk= =EVfM -----END PGP SIGNATURE----- Merge tag 'trace-v5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: - Fix deadloop in ring buffer because of using stale "read" variable - Fix synthetic event use of field_pos as boolean and not an index - Fixed histogram special var "cpu" overriding event fields called "cpu" - Cleaned up error prone logic in alloc_synth_event() - Removed call to synchronize_rcu_tasks_rude() when not needed - Removed redundant initialization of a local variable "ret" - Fixed kernel crash when updating tracepoint callbacks of different priorities. * tag 'trace-v5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracepoints: Update static_call before tp_funcs when adding a tracepoint ftrace: Remove redundant initialization of variable ret ftrace: Avoid synchronize_rcu_tasks_rude() call when not necessary tracing: Clean up alloc_synth_event() tracing/histogram: Rename "cpu" to "common_cpu" tracing: Synthetic event field_pos is an index not a boolean tracing: Fix bug in rb_per_cpu_empty() that might cause deadloop. |
||
|
352384d5c8 |
tracepoints: Update static_call before tp_funcs when adding a tracepoint
Because of the significant overhead that retpolines pose on indirect
calls, the tracepoint code was updated to use the new "static_calls" that
can modify the running code to directly call a function instead of using
an indirect caller, and this function can be changed at runtime.
In the tracepoint code that calls all the registered callbacks that are
attached to a tracepoint, the following is done:
it_func_ptr = rcu_dereference_raw((&__tracepoint_##name)->funcs);
if (it_func_ptr) {
__data = (it_func_ptr)->data;
static_call(tp_func_##name)(__data, args);
}
If there's just a single callback, the static_call is updated to just call
that callback directly. Once another handler is added, then the static
caller is updated to call the iterator, that simply loops over all the
funcs in the array and calls each of the callbacks like the old method
using indirect calling.
The issue was discovered with a race between updating the funcs array and
updating the static_call. The funcs array was updated first and then the
static_call was updated. This is not an issue as long as the first element
in the old array is the same as the first element in the new array. But
that assumption is incorrect, because callbacks also have a priority
field, and if there's a callback added that has a higher priority than the
callback on the old array, then it will become the first callback in the
new array. This means that it is possible to call the old callback with
the new callback data element, which can cause a kernel panic.
static_call = callback1()
funcs[] = {callback1,data1};
callback2 has higher priority than callback1
CPU 1 CPU 2
----- -----
new_funcs = {callback2,data2},
{callback1,data1}
rcu_assign_pointer(tp->funcs, new_funcs);
/*
* Now tp->funcs has the new array
* but the static_call still calls callback1
*/
it_func_ptr = tp->funcs [ new_funcs ]
data = it_func_ptr->data [ data2 ]
static_call(callback1, data);
/* Now callback1 is called with
* callback2's data */
[ KERNEL PANIC ]
update_static_call(iterator);
To prevent this from happening, always switch the static_call to the
iterator before assigning the tp->funcs to the new array. The iterator will
always properly match the callback with its data.
To trigger this bug:
In one terminal:
while :; do hackbench 50; done
In another terminal
echo 1 > /sys/kernel/tracing/events/sched/sched_waking/enable
while :; do
echo 1 > /sys/kernel/tracing/set_event_pid;
sleep 0.5
echo 0 > /sys/kernel/tracing/set_event_pid;
sleep 0.5
done
And it doesn't take long to crash. This is because the set_event_pid adds
a callback to the sched_waking tracepoint with a high priority, which will
be called before the sched_waking trace event callback is called.
Note, the removal to a single callback updates the array first, before
changing the static_call to single callback, which is the proper order as
the first element in the array is the same as what the static_call is
being changed to.
Link: https://lore.kernel.org/io-uring/4ebea8f0-58c9-e571-fd30-0ce4f6f09c70@samba.org/
Cc: stable@vger.kernel.org
Fixes:
|
||
|
3b1a8f457f |
ftrace: Remove redundant initialization of variable ret
The variable ret is being initialized with a value that is never read, it is being updated later on. The assignment is redundant and can be removed. Link: https://lkml.kernel.org/r/20210721120915.122278-1-colin.king@canonical.com Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> |
||
|
68e83498cb |
ftrace: Avoid synchronize_rcu_tasks_rude() call when not necessary
synchronize_rcu_tasks_rude() triggers IPIs and forces rescheduling on all CPUs. It is a costly operation and, when targeting nohz_full CPUs, very disrupting (hence the name). So avoid calling it when 'old_hash' doesn't need to be freed. Link: https://lkml.kernel.org/r/20210721114726.1545103-1-nsaenzju@redhat.com Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> |
||
|
9528c19507 |
tracing: Clean up alloc_synth_event()
alloc_synth_event() currently has the following code to initialize the event fields and dynamic_fields: for (i = 0, j = 0; i < n_fields; i++) { event->fields[i] = fields[i]; if (fields[i]->is_dynamic) { event->dynamic_fields[j] = fields[i]; event->dynamic_fields[j]->field_pos = i; event->dynamic_fields[j++] = fields[i]; event->n_dynamic_fields++; } } 1) It would make more sense to have all fields keep track of their field_pos. 2) event->dynmaic_fields[j] is assigned twice for no reason. 3) We can move updating event->n_dynamic_fields outside the loop, and just assign it to j. This combination makes the code much cleaner. Link: https://lkml.kernel.org/r/20210721195341.29bb0f77@oasis.local.home Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> |
||
|
1e3bac71c5 |
tracing/histogram: Rename "cpu" to "common_cpu"
Currently the histogram logic allows the user to write "cpu" in as an
event field, and it will record the CPU that the event happened on.
The problem with this is that there's a lot of events that have "cpu"
as a real field, and using "cpu" as the CPU it ran on, makes it
impossible to run histograms on the "cpu" field of events.
For example, if I want to have a histogram on the count of the
workqueue_queue_work event on its cpu field, running:
># echo 'hist:keys=cpu' > events/workqueue/workqueue_queue_work/trigger
Gives a misleading and wrong result.
Change the command to "common_cpu" as no event should have "common_*"
fields as that's a reserved name for fields used by all events. And
this makes sense here as common_cpu would be a field used by all events.
Now we can even do:
># echo 'hist:keys=common_cpu,cpu if cpu < 100' > events/workqueue/workqueue_queue_work/trigger
># cat events/workqueue/workqueue_queue_work/hist
# event histogram
#
# trigger info: hist:keys=common_cpu,cpu:vals=hitcount:sort=hitcount:size=2048 if cpu < 100 [active]
#
{ common_cpu: 0, cpu: 2 } hitcount: 1
{ common_cpu: 0, cpu: 4 } hitcount: 1
{ common_cpu: 7, cpu: 7 } hitcount: 1
{ common_cpu: 0, cpu: 7 } hitcount: 1
{ common_cpu: 0, cpu: 1 } hitcount: 1
{ common_cpu: 0, cpu: 6 } hitcount: 2
{ common_cpu: 0, cpu: 5 } hitcount: 2
{ common_cpu: 1, cpu: 1 } hitcount: 4
{ common_cpu: 6, cpu: 6 } hitcount: 4
{ common_cpu: 5, cpu: 5 } hitcount: 14
{ common_cpu: 4, cpu: 4 } hitcount: 26
{ common_cpu: 0, cpu: 0 } hitcount: 39
{ common_cpu: 2, cpu: 2 } hitcount: 184
Now for backward compatibility, I added a trick. If "cpu" is used, and
the field is not found, it will fall back to "common_cpu" and work as
it did before. This way, it will still work for old programs that use
"cpu" to get the actual CPU, but if the event has a "cpu" as a field, it
will get that event's "cpu" field, which is probably what it wants
anyway.
I updated the tracefs/README to include documentation about both the
common_timestamp and the common_cpu. This way, if that text is present in
the README, then an application can know that common_cpu is supported over
just plain "cpu".
Link: https://lkml.kernel.org/r/20210721110053.26b4f641@oasis.local.home
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@vger.kernel.org
Fixes:
|
||
|
3b13911a2f |
tracing: Synthetic event field_pos is an index not a boolean
Performing the following:
># echo 'wakeup_lat s32 pid; u64 delta; char wake_comm[]' > synthetic_events
># echo 'hist:keys=pid:__arg__1=common_timestamp.usecs' > events/sched/sched_waking/trigger
># echo 'hist:keys=next_pid:pid=next_pid,delta=common_timestamp.usecs-$__arg__1:onmatch(sched.sched_waking).trace(wakeup_lat,$pid,$delta,prev_comm)'\
> events/sched/sched_switch/trigger
># echo 1 > events/synthetic/enable
Crashed the kernel:
BUG: kernel NULL pointer dereference, address: 000000000000001b
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP
CPU: 7 PID: 0 Comm: swapper/7 Not tainted 5.13.0-rc5-test+ #104
Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
RIP: 0010:strlen+0x0/0x20
Code: f6 82 80 2b 0b bc 20 74 11 0f b6 50 01 48 83 c0 01 f6 82 80 2b 0b bc
20 75 ef c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 <80> 3f 00 74 10
48 89 f8 48 83 c0 01 80 38 9 f8 c3 31
RSP: 0018:ffffaa75000d79d0 EFLAGS: 00010046
RAX: 0000000000000002 RBX: ffff9cdb55575270 RCX: 0000000000000000
RDX: ffff9cdb58c7a320 RSI: ffffaa75000d7b40 RDI: 000000000000001b
RBP: ffffaa75000d7b40 R08: ffff9cdb40a4f010 R09: ffffaa75000d7ab8
R10: ffff9cdb4398c700 R11: 0000000000000008 R12: ffff9cdb58c7a320
R13: ffff9cdb55575270 R14: ffff9cdb58c7a000 R15: 0000000000000018
FS: 0000000000000000(0000) GS:ffff9cdb5aa00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000001b CR3: 00000000c0612006 CR4: 00000000001706e0
Call Trace:
trace_event_raw_event_synth+0x90/0x1d0
action_trace+0x5b/0x70
event_hist_trigger+0x4bd/0x4e0
? cpumask_next_and+0x20/0x30
? update_sd_lb_stats.constprop.0+0xf6/0x840
? __lock_acquire.constprop.0+0x125/0x550
? find_held_lock+0x32/0x90
? sched_clock_cpu+0xe/0xd0
? lock_release+0x155/0x440
? update_load_avg+0x8c/0x6f0
? enqueue_entity+0x18a/0x920
? __rb_reserve_next+0xe5/0x460
? ring_buffer_lock_reserve+0x12a/0x3f0
event_triggers_call+0x52/0xe0
trace_event_buffer_commit+0x1ae/0x240
trace_event_raw_event_sched_switch+0x114/0x170
__traceiter_sched_switch+0x39/0x50
__schedule+0x431/0xb00
schedule_idle+0x28/0x40
do_idle+0x198/0x2e0
cpu_startup_entry+0x19/0x20
secondary_startup_64_no_verify+0xc2/0xcb
The reason is that the dynamic events array keeps track of the field
position of the fields array, via the field_pos variable in the
synth_field structure. Unfortunately, that field is a boolean for some
reason, which means any field_pos greater than 1 will be a bug (in this
case it was 2).
Link: https://lkml.kernel.org/r/20210721191008.638bce34@oasis.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@vger.kernel.org
Fixes:
|
||
|
4784dc99c7 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller: 1) Fix type of bind option flag in af_xdp, from Baruch Siach. 2) Fix use after free in bpf_xdp_link_release(), from Xuan Zhao. 3) PM refcnt imbakance in r8152, from Takashi Iwai. 4) Sign extension ug in liquidio, from Colin Ian King. 5) Mising range check in s390 bpf jit, from Colin Ian King. 6) Uninit value in caif_seqpkt_sendmsg(), from Ziyong Xuan. 7) Fix skb page recycling race, from Ilias Apalodimas. 8) Fix memory leak in tcindex_partial_destroy_work, from Pave Skripkin. 9) netrom timer sk refcnt issues, from Nguyen Dinh Phi. 10) Fix data races aroun tcp's tfo_active_disable_stamp, from Eric Dumazet. 11) act_skbmod should only operate on ethernet packets, from Peilin Ye. 12) Fix slab out-of-bpunds in fib6_nh_flush_exceptions(),, from Psolo Abeni. 13) Fix sparx5 dependencies, from Yajun Deng. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (74 commits) dpaa2-switch: seed the buffer pool after allocating the swp net: sched: cls_api: Fix the the wrong parameter net: sparx5: fix unmet dependencies warning net: dsa: tag_ksz: dont let the hardware process the layer 4 checksum net: dsa: ensure linearized SKBs in case of tail taggers ravb: Remove extra TAB ravb: Fix a typo in comment net: dsa: sja1105: make VID 4095 a bridge VLAN too tcp: disable TFO blackhole logic by default sctp: do not update transport pathmtu if SPP_PMTUD_ENABLE is not set net: ixp46x: fix ptp build failure ibmvnic: Remove the proper scrq flush selftests: net: add ESP-in-UDP PMTU test udp: check encap socket in __udp_lib_err sctp: update active_key for asoc when old key is being replaced r8169: Avoid duplicate sysfs entry creation error ixgbe: Fix packet corruption due to missing DMA sync Revert "qed: fix possible unpaired spin_{un}lock_bh in _qed_mcp_cmd_and_union()" ipv6: fix another slab-out-of-bounds in fib6_nh_flush_exceptions fsl/fman: Add fibre support ... |
||
|
67f0d6d988 |
tracing: Fix bug in rb_per_cpu_empty() that might cause deadloop.
The "rb_per_cpu_empty()" misinterpret the condition (as not-empty) when "head_page" and "commit_page" of "struct ring_buffer_per_cpu" points to the same buffer page, whose "buffer_data_page" is empty and "read" field is non-zero. An error scenario could be constructed as followed (kernel perspective): 1. All pages in the buffer has been accessed by reader(s) so that all of them will have non-zero "read" field. 2. Read and clear all buffer pages so that "rb_num_of_entries()" will return 0 rendering there's no more data to read. It is also required that the "read_page", "commit_page" and "tail_page" points to the same page, while "head_page" is the next page of them. 3. Invoke "ring_buffer_lock_reserve()" with large enough "length" so that it shot pass the end of current tail buffer page. Now the "head_page", "commit_page" and "tail_page" points to the same page. 4. Discard current event with "ring_buffer_discard_commit()", so that "head_page", "commit_page" and "tail_page" points to a page whose buffer data page is now empty. When the error scenario has been constructed, "tracing_read_pipe" will be trapped inside a deadloop: "trace_empty()" returns 0 since "rb_per_cpu_empty()" returns 0 when it hits the CPU containing such constructed ring buffer. Then "trace_find_next_entry_inc()" always return NULL since "rb_num_of_entries()" reports there's no more entry to read. Finally "trace_seq_to_user()" returns "-EBUSY" spanking "tracing_read_pipe" back to the start of the "waitagain" loop. I've also written a proof-of-concept script to construct the scenario and trigger the bug automatically, you can use it to trace and validate my reasoning above: https://github.com/aegistudio/RingBufferDetonator.git Tests has been carried out on linux kernel 5.14-rc2 ( |
||
|
b42b0bddcb |
workqueue: fix UAF in pwq_unbound_release_workfn()
I got a UAF report when doing fuzz test:
[ 152.880091][ T8030] ==================================================================
[ 152.881240][ T8030] BUG: KASAN: use-after-free in pwq_unbound_release_workfn+0x50/0x190
[ 152.882442][ T8030] Read of size 4 at addr ffff88810d31bd00 by task kworker/3:2/8030
[ 152.883578][ T8030]
[ 152.883932][ T8030] CPU: 3 PID: 8030 Comm: kworker/3:2 Not tainted 5.13.0+ #249
[ 152.885014][ T8030] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[ 152.886442][ T8030] Workqueue: events pwq_unbound_release_workfn
[ 152.887358][ T8030] Call Trace:
[ 152.887837][ T8030] dump_stack_lvl+0x75/0x9b
[ 152.888525][ T8030] ? pwq_unbound_release_workfn+0x50/0x190
[ 152.889371][ T8030] print_address_description.constprop.10+0x48/0x70
[ 152.890326][ T8030] ? pwq_unbound_release_workfn+0x50/0x190
[ 152.891163][ T8030] ? pwq_unbound_release_workfn+0x50/0x190
[ 152.891999][ T8030] kasan_report.cold.15+0x82/0xdb
[ 152.892740][ T8030] ? pwq_unbound_release_workfn+0x50/0x190
[ 152.893594][ T8030] __asan_load4+0x69/0x90
[ 152.894243][ T8030] pwq_unbound_release_workfn+0x50/0x190
[ 152.895057][ T8030] process_one_work+0x47b/0x890
[ 152.895778][ T8030] worker_thread+0x5c/0x790
[ 152.896439][ T8030] ? process_one_work+0x890/0x890
[ 152.897163][ T8030] kthread+0x223/0x250
[ 152.897747][ T8030] ? set_kthread_struct+0xb0/0xb0
[ 152.898471][ T8030] ret_from_fork+0x1f/0x30
[ 152.899114][ T8030]
[ 152.899446][ T8030] Allocated by task 8884:
[ 152.900084][ T8030] kasan_save_stack+0x21/0x50
[ 152.900769][ T8030] __kasan_kmalloc+0x88/0xb0
[ 152.901416][ T8030] __kmalloc+0x29c/0x460
[ 152.902014][ T8030] alloc_workqueue+0x111/0x8e0
[ 152.902690][ T8030] __btrfs_alloc_workqueue+0x11e/0x2a0
[ 152.903459][ T8030] btrfs_alloc_workqueue+0x6d/0x1d0
[ 152.904198][ T8030] scrub_workers_get+0x1e8/0x490
[ 152.904929][ T8030] btrfs_scrub_dev+0x1b9/0x9c0
[ 152.905599][ T8030] btrfs_ioctl+0x122c/0x4e50
[ 152.906247][ T8030] __x64_sys_ioctl+0x137/0x190
[ 152.906916][ T8030] do_syscall_64+0x34/0xb0
[ 152.907535][ T8030] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 152.908365][ T8030]
[ 152.908688][ T8030] Freed by task 8884:
[ 152.909243][ T8030] kasan_save_stack+0x21/0x50
[ 152.909893][ T8030] kasan_set_track+0x20/0x30
[ 152.910541][ T8030] kasan_set_free_info+0x24/0x40
[ 152.911265][ T8030] __kasan_slab_free+0xf7/0x140
[ 152.911964][ T8030] kfree+0x9e/0x3d0
[ 152.912501][ T8030] alloc_workqueue+0x7d7/0x8e0
[ 152.913182][ T8030] __btrfs_alloc_workqueue+0x11e/0x2a0
[ 152.913949][ T8030] btrfs_alloc_workqueue+0x6d/0x1d0
[ 152.914703][ T8030] scrub_workers_get+0x1e8/0x490
[ 152.915402][ T8030] btrfs_scrub_dev+0x1b9/0x9c0
[ 152.916077][ T8030] btrfs_ioctl+0x122c/0x4e50
[ 152.916729][ T8030] __x64_sys_ioctl+0x137/0x190
[ 152.917414][ T8030] do_syscall_64+0x34/0xb0
[ 152.918034][ T8030] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 152.918872][ T8030]
[ 152.919203][ T8030] The buggy address belongs to the object at ffff88810d31bc00
[ 152.919203][ T8030] which belongs to the cache kmalloc-512 of size 512
[ 152.921155][ T8030] The buggy address is located 256 bytes inside of
[ 152.921155][ T8030] 512-byte region [ffff88810d31bc00, ffff88810d31be00)
[ 152.922993][ T8030] The buggy address belongs to the page:
[ 152.923800][ T8030] page:ffffea000434c600 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10d318
[ 152.925249][ T8030] head:ffffea000434c600 order:2 compound_mapcount:0 compound_pincount:0
[ 152.926399][ T8030] flags: 0x57ff00000010200(slab|head|node=1|zone=2|lastcpupid=0x7ff)
[ 152.927515][ T8030] raw: 057ff00000010200 dead000000000100 dead000000000122 ffff888009c42c80
[ 152.928716][ T8030] raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
[ 152.929890][ T8030] page dumped because: kasan: bad access detected
[ 152.930759][ T8030]
[ 152.931076][ T8030] Memory state around the buggy address:
[ 152.931851][ T8030] ffff88810d31bc00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 152.932967][ T8030] ffff88810d31bc80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 152.934068][ T8030] >ffff88810d31bd00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 152.935189][ T8030] ^
[ 152.935763][ T8030] ffff88810d31bd80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 152.936847][ T8030] ffff88810d31be00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 152.937940][ T8030] ==================================================================
If apply_wqattrs_prepare() fails in alloc_workqueue(), it will call put_pwq()
which invoke a work queue to call pwq_unbound_release_workfn() and use the 'wq'.
The 'wq' allocated in alloc_workqueue() will be freed in error path when
apply_wqattrs_prepare() fails. So it will lead a UAF.
CPU0 CPU1
alloc_workqueue()
alloc_and_link_pwqs()
apply_wqattrs_prepare() fails
apply_wqattrs_cleanup()
schedule_work(&pwq->unbound_release_work)
kfree(wq)
worker_thread()
pwq_unbound_release_workfn() <- trigger uaf here
If apply_wqattrs_prepare() fails, the new pwq are not linked, it doesn't
hold any reference to the 'wq', 'wq' is invalid to access in the worker,
so add check pwq if linked to fix this.
Fixes:
|
||
|
1e7107c5ef |
cgroup1: fix leaked context root causing sporadic NULL deref in LTP
Richard reported sporadic (roughly one in 10 or so) null dereferences and other strange behaviour for a set of automated LTP tests. Things like: BUG: kernel NULL pointer dereference, address: 0000000000000008 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 0 PID: 1516 Comm: umount Not tainted 5.10.0-yocto-standard #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014 RIP: 0010:kernfs_sop_show_path+0x1b/0x60 ...or these others: RIP: 0010:do_mkdirat+0x6a/0xf0 RIP: 0010:d_alloc_parallel+0x98/0x510 RIP: 0010:do_readlinkat+0x86/0x120 There were other less common instances of some kind of a general scribble but the common theme was mount and cgroup and a dubious dentry triggering the NULL dereference. I was only able to reproduce it under qemu by replicating Richard's setup as closely as possible - I never did get it to happen on bare metal, even while keeping everything else the same. In commit |
||
|
ff5a6a3550 |
Merge branch 'timers/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into timers/urgent
Pull dyntick fixes from Frederic Weisbecker: - Fix a rearm race in the posix cpu timer code - Handle get_next_timer_interrupt() correctly when no timers are pending Link: https://lore.kernel.org/r/20210715104218.81276-1-frederic@kernel.org |
||
|
3fdacf402b |
tracing: Fix the histogram logic from possibly crashing the kernel
Working on the histogram code, I found that if you dereference a char pointer in a trace event that happens to point to user space, it can crash the kernel, as it does no checks of that pointer. I have code coming that will do this better, so just remove this ability to treat character pointers in trace events as stings in the histogram. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYPH9FRQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qsyhAQDKiQzVJtjfsNbIWliDQOaUwJMO9tNl Qu5TUDmPbAA4fwD+MgYsnITPL+o/YcKQ+aMdj/wLLMKfIjhNkFY8wqdLvwg= =CN97 -----END PGP SIGNATURE----- Merge tag 'trace-v5.14-5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix from Steven Rostedt: "Fix the histogram logic from possibly crashing the kernel Working on the histogram code, I found that if you dereference a char pointer in a trace event that happens to point to user space, it can crash the kernel, as it does no checks of that pointer. I have code coming that will do this better, so just remove this ability to treat character pointers in trace events as stings in the histogram" * tag 'trace-v5.14-5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Do not reference char * as a string in histograms |
||
|
6e442d0662 |
Merge branch 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU fixes from Paul McKenney: - fix regressions induced by a merge-window change in scheduler semantics, which means that smp_processor_id() can no longer be used in kthreads using simple affinity to bind themselves to a specific CPU. - fix a bug in Tasks Trace RCU that was thought to be strictly theoretical. However, production workloads have started hitting this, so these fixes need to be merged sooner rather than later. - fix a minor printk()-format-mismatch issue introduced during the merge window. * 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: rcu: Fix pr_info() formats and values in show_rcu_gp_kthreads() rcu-tasks: Don't delete holdouts within trc_wait_for_one_reader() rcu-tasks: Don't delete holdouts within trc_inspect_reader() refscale: Avoid false-positive warnings in ref_scale_reader() scftorture: Avoid false-positive warnings in scftorture_invoker() |
||
|
b068fc04de |
perf: Refactor permissions check into perf_check_permission()
Refactor the permission check in perf_event_open() into a helper perf_check_permission(). This makes the permission check logic more readable (because we no longer have a negated disjunction). Add a comment mentioning the ptrace check also checks the uid. No functional change intended. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Link: https://lore.kernel.org/r/20210705084453.2151729-2-elver@google.com |
||
|
9d7a6c95f6 |
perf: Fix required permissions if sigtrap is requested
If perf_event_open() is called with another task as target and
perf_event_attr::sigtrap is set, and the target task's user does not
match the calling user, also require the CAP_KILL capability or
PTRACE_MODE_ATTACH permissions.
Otherwise, with the CAP_PERFMON capability alone it would be possible
for a user to send SIGTRAP signals via perf events to another user's
tasks. This could potentially result in those tasks being terminated if
they cannot handle SIGTRAP signals.
Note: The check complements the existing capability check, but is not
supposed to supersede the ptrace_may_access() check. At a high level we
now have:
capable of CAP_PERFMON and (CAP_KILL if sigtrap)
OR
ptrace_may_access(...) // also checks for same thread-group and uid
Fixes:
|
||
|
e042aa532c |
bpf: Fix pointer arithmetic mask tightening under state pruning
In |
||
|
59089a189e |
bpf: Remove superfluous aux sanitation on subprog rejection
Follow-up to
|
||
|
40ac971eab |
dma-mapping: handle vmalloc addresses in dma_common_{mmap,get_sgtable}
xen-swiotlb can use vmalloc backed addresses for dma coherent allocations
and uses the common helpers. Properly handle them to unbreak Xen on
ARM platforms.
Fixes:
|
||
|
20192d9c9f |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Andrii Nakryiko says: ==================== pull-request: bpf 2021-07-15 The following pull-request contains BPF updates for your *net* tree. We've added 9 non-merge commits during the last 5 day(s) which contain a total of 9 files changed, 37 insertions(+), 15 deletions(-). The main changes are: 1) Fix NULL pointer dereference in BPF_TEST_RUN for BPF_XDP_DEVMAP and BPF_XDP_CPUMAP programs, from Xuan Zhuo. 2) Fix use-after-free of net_device in XDP bpf_link, from Xuan Zhuo. 3) Follow-up fix to subprog poke descriptor use-after-free problem, from Daniel Borkmann and John Fastabend. 4) Fix out-of-range array access in s390 BPF JIT backend, from Colin Ian King. 5) Fix memory leak in BPF sockmap, from John Fastabend. 6) Fix for sockmap to prevent proc stats reporting bug, from John Fastabend and Jakub Sitnicki. 7) Fix NULL pointer dereference in bpftool, from Tobias Klauser. 8) AF_XDP documentation fixes, from Baruch Siach. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
704adfb5a9 |
tracing: Do not reference char * as a string in histograms
The histogram logic was allowing events with char * pointers to be used as normal strings. But it was easy to crash the kernel with: # echo 'hist:keys=filename' > events/syscalls/sys_enter_openat/trigger And open some files, and boom! BUG: unable to handle page fault for address: 00007f2ced0c3280 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 1173fa067 P4D 1173fa067 PUD 1171b6067 PMD 1171dd067 PTE 0 Oops: 0000 [#1] PREEMPT SMP CPU: 6 PID: 1810 Comm: cat Not tainted 5.13.0-rc5-test+ #61 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016 RIP: 0010:strlen+0x0/0x20 Code: f6 82 80 2a 0b a9 20 74 11 0f b6 50 01 48 83 c0 01 f6 82 80 2a 0b a9 20 75 ef c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 <80> 3f 00 74 10 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 RSP: 0018:ffffbdbf81567b50 EFLAGS: 00010246 RAX: 0000000000000003 RBX: ffff93815cdb3800 RCX: ffff9382401a22d0 RDX: 0000000000000100 RSI: 0000000000000000 RDI: 00007f2ced0c3280 RBP: 0000000000000100 R08: ffff9382409ff074 R09: ffffbdbf81567c98 R10: ffff9382409ff074 R11: 0000000000000000 R12: ffff9382409ff074 R13: 0000000000000001 R14: ffff93815a744f00 R15: 00007f2ced0c3280 FS: 00007f2ced0f8580(0000) GS:ffff93825a800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f2ced0c3280 CR3: 0000000107069005 CR4: 00000000001706e0 Call Trace: event_hist_trigger+0x463/0x5f0 ? find_held_lock+0x32/0x90 ? sched_clock_cpu+0xe/0xd0 ? lock_release+0x155/0x440 ? kernel_init_free_pages+0x6d/0x90 ? preempt_count_sub+0x9b/0xd0 ? kernel_init_free_pages+0x6d/0x90 ? get_page_from_freelist+0x12c4/0x1680 ? __rb_reserve_next+0xe5/0x460 ? ring_buffer_lock_reserve+0x12a/0x3f0 event_triggers_call+0x52/0xe0 ftrace_syscall_enter+0x264/0x2c0 syscall_trace_enter.constprop.0+0x1ee/0x210 do_syscall_64+0x1c/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae Where it triggered a fault on strlen(key) where key was the filename. The reason is that filename is a char * to user space, and the histogram code just blindly dereferenced it, with obvious bad results. I originally tried to use strncpy_from_user/kernel_nofault() but found that there's other places that its dereferenced and not worth the effort. Just do not allow "char *" to act like strings. Link: https://lkml.kernel.org/r/20210715000206.025df9d2@rorschach.local.home Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Tzvetomir Stoyanov <tz.stoyanov@gmail.com> Cc: stable@vger.kernel.org Acked-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Tom Zanussi <zanussi@kernel.org> Fixes: |
||
|
e9338abf0e |
fallthrough fixes for Clang for 5.14-rc2
Hi Linus, Please, pull the following patches that fix many fall-through warnings when building with Clang and -Wimplicit-fallthrough. This pull-request also contains the patch for Makefile that enables -Wimplicit-fallthrough for Clang, globally. It's also important to notice that since we have adopted the use of the pseudo-keyword macro fallthrough; we also want to avoid having more /* fall through */ comments being introduced. Notice that contrary to GCC, Clang doesn't recognize any comments as implicit fall-through markings when the -Wimplicit-fallthrough option is enabled. So, in order to avoid having more comments being introduced, we have to use the option -Wimplicit-fallthrough=5 for GCC, which similar to Clang, will cause a warning in case a code comment is intended to be used as a fall-through marking. The patch for Makefile also enforces this. We had almost 4,000 of these issues for Clang in the beginning, and there might be a couple more out there when building some architectures with certain configurations. However, with the recent fixes I think we are in good shape and it is now possible to enable -Wimplicit-fallthrough for Clang. :) Thanks! -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEkmRahXBSurMIg1YvRwW0y0cG2zEFAmDvPV0ACgkQRwW0y0cG 2zGJ1xAAsbSN8I+ESxFCKi4UwQC71988isb0rHGL6rQPBbEYD2bbI+82aGu6WH8z 2ZVQZa5x3tUoIsqR2GMK5Npn1dFvuNWGZkYzlgjAp+FgzaULCLUc4NW9DEWU/Kbd UBz8ilXL7tUAlDalUYjXAVVTxMnRTZ+BDCbEmIdRZ5X/pMlD2xHeMGG4kh8/cHjJ +mqFwCFPZoyLJzPBnm37OEfIr8JxrLZAo1gfmz5sxfmBdYDY3hjV/BVgP8h0bkY9 ITaq0LMEAPMXvU+4DP3DxuqzRr9COYMcddmbaWwsXPd7UAuoXwGLvqx7JE0QqY3g J80w4/hXqEa4o4/fUAsfjYEQoNTEnhl0iGskBJD2xy5Lj17/m4aOSL44ibr2Ag67 yfJPWi9tooYELEGNobPVHPuqk4ts6amKTOKqHWMBEcTEOIGzaGWPEjRoyaMivfZ3 G3ZbEPEYERcJRizm63UbciLeslGsxb6tdYQEy5VI8Wa7caz9Ci9HT+jjS7Vm2fuq 1zg+vIgwuSxvwxrSV/PDTRcQm9EM/MoRL0D4jbr7gtYD6KNwH8Vo6M21CoaB9gH3 FgMiCT4zy9KiONjVDJKkjtzuk+9OwpI5AAg0/fAQ4Ak9TzUN3Xfg7HpytETCIJ0Z Ii3Jqwe+3IRTNq94ZTIiX/0hT7i5GsPxUfPzI2vwttCJn9ctUcM= =XVNN -----END PGP SIGNATURE----- Merge tag 'Wimplicit-fallthrough-clang-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux Pull fallthrough fixes from Gustavo Silva: "This fixes many fall-through warnings when building with Clang and -Wimplicit-fallthrough, and also enables -Wimplicit-fallthrough for Clang, globally. It's also important to notice that since we have adopted the use of the pseudo-keyword macro fallthrough, we also want to avoid having more /* fall through */ comments being introduced. Contrary to GCC, Clang doesn't recognize any comments as implicit fall-through markings when the -Wimplicit-fallthrough option is enabled. So, in order to avoid having more comments being introduced, we use the option -Wimplicit-fallthrough=5 for GCC, which similar to Clang, will cause a warning in case a code comment is intended to be used as a fall-through marking. The patch for Makefile also enforces this. We had almost 4,000 of these issues for Clang in the beginning, and there might be a couple more out there when building some architectures with certain configurations. However, with the recent fixes I think we are in good shape and it is now possible to enable the warning for Clang" * tag 'Wimplicit-fallthrough-clang-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: (27 commits) Makefile: Enable -Wimplicit-fallthrough for Clang powerpc/smp: Fix fall-through warning for Clang dmaengine: mpc512x: Fix fall-through warning for Clang usb: gadget: fsl_qe_udc: Fix fall-through warning for Clang powerpc/powernv: Fix fall-through warning for Clang MIPS: Fix unreachable code issue MIPS: Fix fall-through warnings for Clang ASoC: Mediatek: MT8183: Fix fall-through warning for Clang power: supply: Fix fall-through warnings for Clang dmaengine: ti: k3-udma: Fix fall-through warning for Clang s390: Fix fall-through warnings for Clang dmaengine: ipu: Fix fall-through warning for Clang iommu/arm-smmu-v3: Fix fall-through warning for Clang mmc: jz4740: Fix fall-through warning for Clang PCI: Fix fall-through warning for Clang scsi: libsas: Fix fall-through warning for Clang video: fbdev: Fix fall-through warning for Clang math-emu: Fix fall-through warning cpufreq: Fix fall-through warning for Clang drm/msm: Fix fall-through warning in msm_gem_new_impl() ... |
||
|
aebacb7f6c |
timers: Fix get_next_timer_interrupt() with no timers pending
|
||
|
1a3402d93c |
posix-cpu-timers: Fix rearm racing against process tick
Since the process wide cputime counter is started locklessly from
posix_cpu_timer_rearm(), it can be concurrently stopped by operations
on other timers from the same thread group, such as in the following
unlucky scenario:
CPU 0 CPU 1
----- -----
timer_settime(TIMER B)
posix_cpu_timer_rearm(TIMER A)
cpu_clock_sample_group()
(pct->timers_active already true)
handle_posix_cpu_timers()
check_process_timers()
stop_process_timers()
pct->timers_active = false
arm_timer(TIMER A)
tick -> run_posix_cpu_timers()
// sees !pct->timers_active, ignore
// our TIMER A
Fix this with simply locking process wide cputime counting start and
timer arm in the same block.
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Fixes:
|
||
|
8096acd744 |
Networking fixes for 5.14-rc2, including fixes from bpf and netfilter.
Current release - regressions: - sock: fix parameter order in sock_setsockopt() Current release - new code bugs: - netfilter: nft_last: - fix incorrect arithmetic when restoring last used - honor NFTA_LAST_SET on restoration Previous releases - regressions: - udp: properly flush normal packet at GRO time - sfc: ensure correct number of XDP queues; don't allow enabling the feature if there isn't sufficient resources to Tx from any CPU - dsa: sja1105: fix address learning getting disabled on the CPU port - mptcp: addresses a rmem accounting issue that could keep packets in subflow receive buffers longer than necessary, delaying MPTCP-level ACKs - ip_tunnel: fix mtu calculation for ETHER tunnel devices - do not reuse skbs allocated from skbuff_fclone_cache in the napi skb cache, we'd try to return them to the wrong slab cache - tcp: consistently disable header prediction for mptcp Previous releases - always broken: - bpf: fix subprog poke descriptor tracking use-after-free - ipv6: - allocate enough headroom in ip6_finish_output2() in case iptables TEE is used - tcp: drop silly ICMPv6 packet too big messages to avoid expensive and pointless lookups (which may serve as a DDOS vector) - make sure fwmark is copied in SYNACK packets - fix 'disable_policy' for forwarded packets (align with IPv4) - netfilter: conntrack: do not renew entry stuck in tcp SYN_SENT state - netfilter: conntrack: do not mark RST in the reply direction coming after SYN packet for an out-of-sync entry - mptcp: cleanly handle error conditions with MP_JOIN and syncookies - mptcp: fix double free when rejecting a join due to port mismatch - validate lwtstate->data before returning from skb_tunnel_info() - tcp: call sk_wmem_schedule before sk_mem_charge in zerocopy path - mt76: mt7921: continue to probe driver when fw already downloaded - bonding: fix multiple issues with offloading IPsec to (thru?) bond - stmmac: ptp: fix issues around Qbv support and setting time back - bcmgenet: always clear wake-up based on energy detection Misc: - sctp: move 198 addresses from unusable to private scope - ptp: support virtual clocks and timestamping - openvswitch: optimize operation for key comparison -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmDu3mMACgkQMUZtbf5S Irsjxg//UwcPJMYFmXV+fGkEsWYe1Kf29FcUDEeANFtbltfAcIfZ0GoTbSDRnrVb HcYAKcm4XRx5bWWdQrQsQq/yiLbnS/rSLc7VRB+uRHWRKl3eYcaUB2rnCXsxrjGw wQJgOmztDCJS4BIky24iQpF/8lg7p/Gj2Ih532gh93XiYo612FrEJKkYb2/OQfYX GkbnZ0kL2Y1SV+bhy6aT5azvhHKM4/3eA4fHeJ2p8e2gOZ5ni0vpX0xEzdzKOCd0 vwR/Wu3h/+2QuFYVcSsVguuM++JXACG8MAS/Tof78dtNM4a3kQxzqeh5Bv6IkfTu rokENLq4pjNRy+nBAOeQZj8Jd0K0kkf/PN9WMdGQtplMoFhjjV25R6PeRrV9wwPo peozIz2MuQo7Kfof1D+44h2foyLfdC28/Z0CvRbDpr5EHOfYynvBbrnhzIGdQp6V xgftKTOdgz2Djgg8HiblZund1FA44OYerddVAASrIsnSFnIz1VLVQIsfV+GLBwwc FawrIZ6WfIjzRSrDGOvDsbAQI47T/1jbaPJeK6XgjWkQmjEd6UtRWRZLYCxemQEw 4HP3sWC96BOehuD8ylipVE1oFqrxCiOB/fZxezXqjo8dSX3NLdak4cCHTHoW5SuZ eEAxQRaBliKd+P7hoy9cZ57CAu3zUa8kijfM5QRlCAHF+zSxaPs= =QFnb -----END PGP SIGNATURE----- Merge tag 'net-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski. "Including fixes from bpf and netfilter. Current release - regressions: - sock: fix parameter order in sock_setsockopt() Current release - new code bugs: - netfilter: nft_last: - fix incorrect arithmetic when restoring last used - honor NFTA_LAST_SET on restoration Previous releases - regressions: - udp: properly flush normal packet at GRO time - sfc: ensure correct number of XDP queues; don't allow enabling the feature if there isn't sufficient resources to Tx from any CPU - dsa: sja1105: fix address learning getting disabled on the CPU port - mptcp: addresses a rmem accounting issue that could keep packets in subflow receive buffers longer than necessary, delaying MPTCP-level ACKs - ip_tunnel: fix mtu calculation for ETHER tunnel devices - do not reuse skbs allocated from skbuff_fclone_cache in the napi skb cache, we'd try to return them to the wrong slab cache - tcp: consistently disable header prediction for mptcp Previous releases - always broken: - bpf: fix subprog poke descriptor tracking use-after-free - ipv6: - allocate enough headroom in ip6_finish_output2() in case iptables TEE is used - tcp: drop silly ICMPv6 packet too big messages to avoid expensive and pointless lookups (which may serve as a DDOS vector) - make sure fwmark is copied in SYNACK packets - fix 'disable_policy' for forwarded packets (align with IPv4) - netfilter: conntrack: - do not renew entry stuck in tcp SYN_SENT state - do not mark RST in the reply direction coming after SYN packet for an out-of-sync entry - mptcp: cleanly handle error conditions with MP_JOIN and syncookies - mptcp: fix double free when rejecting a join due to port mismatch - validate lwtstate->data before returning from skb_tunnel_info() - tcp: call sk_wmem_schedule before sk_mem_charge in zerocopy path - mt76: mt7921: continue to probe driver when fw already downloaded - bonding: fix multiple issues with offloading IPsec to (thru?) bond - stmmac: ptp: fix issues around Qbv support and setting time back - bcmgenet: always clear wake-up based on energy detection Misc: - sctp: move 198 addresses from unusable to private scope - ptp: support virtual clocks and timestamping - openvswitch: optimize operation for key comparison" * tag 'net-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (158 commits) net: dsa: properly check for the bridge_leave methods in dsa_switch_bridge_leave() sfc: add logs explaining XDP_TX/REDIRECT is not available sfc: ensure correct number of XDP queues sfc: fix lack of XDP TX queues - error XDP TX failed (-22) net: fddi: fix UAF in fza_probe net: dsa: sja1105: fix address learning getting disabled on the CPU port net: ocelot: fix switchdev objects synced for wrong netdev with LAG offload net: Use nlmsg_unicast() instead of netlink_unicast() octeontx2-pf: Fix uninitialized boolean variable pps ipv6: allocate enough headroom in ip6_finish_output2() net: hdlc: rename 'mod_init' & 'mod_exit' functions to be module-specific net: bridge: multicast: fix MRD advertisement router port marking race net: bridge: multicast: fix PIM hello router port marking race net: phy: marvell10g: fix differentiation of 88X3310 from 88X3340 dsa: fix for_each_child.cocci warnings virtio_net: check virtqueue_add_sgs() return value mptcp: properly account bulk freed memory selftests: mptcp: fix case multiple subflows limited by server mptcp: avoid processing packet if a subflow reset mptcp: fix syncookie process if mptcp can not_accept new subflow ... |
||
|
d1d488d813 |
fs: add vfs_parse_fs_param_source() helper
Add a simple helper that filesystems can use in their parameter parser to parse the "source" parameter. A few places open-coded this function and that already caused a bug in the cgroup v1 parser that we fixed. Let's make it harder to get this wrong by introducing a helper which performs all necessary checks. Link: https://syzkaller.appspot.com/bug?id=6312526aba5beae046fdae8f00399f87aab48b12 Cc: Christoph Hellwig <hch@lst.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
3b0462726e |
cgroup: verify that source is a string
The following sequence can be used to trigger a UAF:
int fscontext_fd = fsopen("cgroup");
int fd_null = open("/dev/null, O_RDONLY);
int fsconfig(fscontext_fd, FSCONFIG_SET_FD, "source", fd_null);
close_range(3, ~0U, 0);
The cgroup v1 specific fs parser expects a string for the "source"
parameter. However, it is perfectly legitimate to e.g. specify a file
descriptor for the "source" parameter. The fs parser doesn't know what
a filesystem allows there. So it's a bug to assume that "source" is
always of type fs_value_is_string when it can reasonably also be
fs_value_is_file.
This assumption in the cgroup code causes a UAF because struct
fs_parameter uses a union for the actual value. Access to that union is
guarded by the param->type member. Since the cgroup paramter parser
didn't check param->type but unconditionally moved param->string into
fc->source a close on the fscontext_fd would trigger a UAF during
put_fs_context() which frees fc->source thereby freeing the file stashed
in param->file causing a UAF during a close of the fd_null.
Fix this by verifying that param->type is actually a string and report
an error if not.
In follow up patches I'll add a new generic helper that can be used here
and by other filesystems instead of this error-prone copy-pasta fix.
But fixing it in here first makes backporting a it to stable a lot
easier.
Fixes:
|
||
|
5dd0a6b858 |
bpf: Fix tail_call_reachable rejection for interpreter when jit failed
During testing of |
||
|
e9ba16e68c |
smpboot: Mark idle_init() as __always_inlined to work around aggressive compiler un-inlining
While this function is a static inline, and is only used once in local scope, certain Kconfig variations may cause it to be compiled as a standalone function: 89231bf0 <idle_init>: 89231bf0: 83 05 60 d9 45 89 01 addl $0x1,0x8945d960 89231bf7: 55 push %ebp Resulting in this build failure: WARNING: modpost: vmlinux.o(.text.unlikely+0x7fd5): Section mismatch in reference from the function idle_init() to the function .init.text:fork_idle() The function idle_init() references the function __init fork_idle(). This is often because idle_init lacks a __init annotation or the annotation of fork_idle is wrong. ERROR: modpost: Section mismatches detected. Certain USBSAN options x86-32 builds with CONFIG_CC_OPTIMIZE_FOR_SIZE=y seem to be causing this. So mark idle_init() as __always_inline to work around this compiler bug/feature. Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
1adee589cd |
kernel: debug: Fix unreachable code in gdb_serial_stub()
Fix the following warning: kernel/debug/gdbstub.c:1049:4: warning: fallthrough annotation in unreachable code [-Wimplicit-fallthrough] fallthrough; ^ include/linux/compiler_attributes.h:210:41: note: expanded from macro 'fallthrough' # define fallthrough __attribute__((__fallthrough__) by placing the fallthrough; statement inside ifdeffery. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> |
||
|
98f7fdced2 |
Two fixes:
- Fix a MIPS IRQ handling RCU bug - Remove a DocBook annotation for a parameter that doesn't exist anymore Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmDq9IcRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1gOnA//U6VcTbivf1QG05Omof+378iFZWWSdDHP /iF12DrtcalqCH73o4iuiEg3W8PdISqvLh5QI9G1WvKNXOmX/RRoFPpd1W24uM/D DM9m3hEjjEX3AmRZoQ7yrA2Jzd435DKQOHjzClV/5ykKiyBw533twHm7L9sI6WCl fsP5yiCMXUP2wS2Go/wWgRjrIIb5iQezXGGxeKhP4zW/lam86ntPcdm8I6EuRD7r bT4TXmJkjk98FL+A6QSk8cP+ZlzauHkNa0HYiSz1pvOnwPDYs8IXacCEeOwog3Wr hH0IXbX2NXJIUPDQc/BvWzOMxRNBgMZ1ixnj8B2kaDfIM5YfaGAV3/GuogFCUSns 1u9E+LkLohvE61w7m49PjnYIlWSkuQFrMlJL+69O2wiqmbFX9aQa/T9i7Mq7bmsv Q5qTe935QRVcraUPQ5h0uJnKmBL8xteHpQSSrF5ppsC8cxEDX/CC/GsqAn0xG15Q VZ91ocXr58zCXDpLbesULY7LbGyB6c1Kt55WYxhp/IOtRhEpVzD5Y9ukXYdckIpq 25TMFsPyTUTaS3r8nD8G0QeW3CLVj8Lzi/r0P3oa8GvgyqxlX7jzXI8aGPRb7QIi JtX6iY2nH0DZfp600VMJNUS67yMDuDXJ2Me5WY3XkvpYuFtPGRm41HOs3dEV+Ay1 nxBMxUC/6Ug= =xghF -----END PGP SIGNATURE----- Merge tag 'irq-urgent-2021-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Ingo Molnar: "Two fixes: - Fix a MIPS IRQ handling RCU bug - Remove a DocBook annotation for a parameter that doesn't exist anymore" * tag 'irq-urgent-2021-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/mips: Fix RCU violation when using irqdomain lookup on interrupt entry genirq/irqdesc: Drop excess kernel-doc entry @lookup |
||
|
877029d921 |
Three fixes:
- Fix load tracking bug/inconsistency - Fix a sporadic CFS bandwidth constraints enforcement bug - Fix a uclamp utilization tracking bug for newly woken tasks Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmDq8scRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1i1nQ/+PsOimY0+2kW2EUp/a1a8h1ZHfyG1RjDo Kl/CCUDdEP0OuWC/cdnvYgo3hNPTVCpZvsIoR/t7CFinuR/ubfsFtLZ5juOzjk5v 2RjmSyOr0DbhUcO3GFKo+ABlrmiiJ9de89+oal+W/t/Zks/xqVHQ+f7HqSGvSd1f Lhem4U9UakbO3yjT/+VwSvNZgP8trtPQ6rrKw+yrwrxfjSo4D0Y+/3u/HCaYthIW 5i1+uBEXGnZaU7QhDfqqzbGcAKLRA+i2vBmNfbOyUeCcyKTsKlLwX9L1DlgNPLRP XvxVJWJcxwTsLbwVG1F3TvWw93iSLi34jPO//2ZnNppEhA4fjxmLSYV3uIsm8PUY /YmDdZ6fTW7ZIO/nhfcf3nS8Sp0UlfHXL9dV3mn2EzeMLGOKZY6vgAKdvEd+Fj+y J+VB01MgmVzGvFr9o1/ez3vWyk03CLDQuYMUo0yVcqAi4OLaArAz5vxXR/cF4PsB r69CCdSinMj2finaR39Eq0431Tpv71NDDfyqjVJEOk88Weszu6IACIOJCvpy0ZLQ LOA5kl2I8/mYEevnXgg9NPX8XO2iUFS1cVVNsRHUe4zqQZPPoBD6Oppb+kmfUQUe gABCZK217nkqFH4GdC9RCtRdnb4HO+6H15cLlDHjilECgGOPeJ8CPaK3pRvzv0g3 N2m4KcFI2j0= =FBcX -----END PGP SIGNATURE----- Merge tag 'sched-urgent-2021-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Three fixes: - Fix load tracking bug/inconsistency - Fix a sporadic CFS bandwidth constraints enforcement bug - Fix a uclamp utilization tracking bug for newly woken tasks" * tag 'sched-urgent-2021-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/uclamp: Ignore max aggregation if rq is idle sched/fair: Fix CFS bandwidth hrtimer expiry type sched/fair: Sync load_sum with load_avg after dequeue |
||
|
301c8b1d7c |
Locking fixes:
- Fix a Sparc crash - Fix a number of objtool warnings - Fix /proc/lockdep output on certain configs - Restore a kprobes fail-safe Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmDq8DMRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1jZcA//aYhW8gm3rjtXeRme6H5vLF3fehxw9xoC g6RTAHStHd9xJyctsFYR7Fx7o1l2G05jf5tv4MWoAYMtnjz6OKfPQu7b8eTD3Z+3 n0AAfsrrVaK4f8AgGZ+bj4kw/BCJL0Xx8HyRXjDWODVZVY+yUEo2c5vsw02inQeW 3AQ1m4ZhQBYvl7r4pD0oi6BrL0ruvC0NN5kRYuh1Ib4I8GtF1h9ACPFICxsV6Glx 4SKqzsvaQbV+9EiiLpKqEpi/EJqMmAE5sr4EUnQxWsMeuOKavzETck1ZxWTO5iIh gXI2yTuLS6++yBPCQer/8eePsP3bAiQeNJ+71xpfFdmwx9osA7DFe3aV3f5Ug+Bq f4yswcw1Y1jZhvNp3AV9kE+h2mrSUEWGKAj9LCIV6VqNfOeKKrAyrxSfLRYiB1Ek M9+ML+lN3M2c4n5P7qxx1ZUOZ1It19Nx6HNEeTPkfKhlI+57hpmvPvKIjqZQRdAD oE9exVRssFxDQLIHWoshoDQ7JVR7fsqn7I6ExejnAIpl6veFAAQ457gOHmFyi+jo aLeCTAie0hA18TrMqWtp/ftnpTTJvRJKtHPQXIYmqEkp8S85ryd7Co/9sMRHDS8e XhQRFPSfp4MHqucmoyUIlbRkv16f/0RsC0gv10U0T/WUkjQGMBL5/dvZLpJILtDm DOmYxoe0UP8= =WvwL -----END PGP SIGNATURE----- Merge tag 'locking-urgent-2021-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: - Fix a Sparc crash - Fix a number of objtool warnings - Fix /proc/lockdep output on certain configs - Restore a kprobes fail-safe * tag 'locking-urgent-2021-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/atomic: sparc: Fix arch_cmpxchg64_local() kprobe/static_call: Restore missing static_call_text_reserved() static_call: Fix static_call_text_reserved() vs __init jump_label: Fix jump_label_text_reserved() vs __init locking/lockdep: Fix meaningless /proc/lockdep output of lock classes on !CONFIG_PROVE_LOCKING |
||
|
81361b837a |
Kbuild updates for v5.14
- Increase the -falign-functions alignment for the debug option. - Remove ugly libelf checks from the top Makefile. - Make the silent build (-s) more silent. - Re-compile the kernel if KBUILD_BUILD_TIMESTAMP is specified. - Various script cleanups -----BEGIN PGP SIGNATURE----- iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmDon90VHG1hc2FoaXJv eUBrZXJuZWwub3JnAAoJED2LAQed4NsGWFUP/RGNwlGD/YV1xg0ZmM0/ynBzzOy2 3dcr3etJZpipQDeqnHy3jt0esgMVlbkTdrHvP+2hpNaeXFwjF1fDHjhur9m8ZkVD efOA6nugOnNwhy2G3BvtCJv+Vhb+KZ0nNLB27z3Bl0LGP6LJdMRNAxFBJMv4k3aR F3sABugwCpnT2/YtuprxRl2/3/CyLur5NjY24FD+ugON3JIWfl6ETbHeFmxr1JE4 mE+zaN5AwYuSuH9LpdRy85XVCcW/FFqP/DwOFllVvCCCNvvS0KWYSNHWfEsKdR75 hmAAaS/rpi2eaL0vp88sNhAtYnhMSf+uFu0fyfYeWZuJqMt4Xz5xZKAzDsifCdif aQ6UEPDjiKABh9gpX26BMd2CXzkGR+L4qZ7iBPfO586Iy7opajrFX9kIj5U7ZtCl wsPat/9+18xpVJOTe0sss3idId7Ft4cRoW5FQMEAW2EWJ9fXAG1yDxEREj1V5gFx sMXtpmCoQag968qjfARvP08s3MB1P4Ij6tXcioGqHuEWeJLxOMK/KWyafQUg611d 0kSWNO0OMo+odBj6j/vM+MIIaPhgwtZnPgw2q4uHGMcemzQxaEvGW+G/5a5qEpTv SKm8W24wXplNot4tuTGWq5/jANRJcMvVsyC48DYT81OZEOWrIc0kDV4v4qZToTxW 97jn1NKa2H6L0J1V =Za8V -----END PGP SIGNATURE----- Merge tag 'kbuild-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Increase the -falign-functions alignment for the debug option. - Remove ugly libelf checks from the top Makefile. - Make the silent build (-s) more silent. - Re-compile the kernel if KBUILD_BUILD_TIMESTAMP is specified. - Various script cleanups * tag 'kbuild-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (27 commits) scripts: add generic syscallnr.sh scripts: check duplicated syscall number in syscall table sparc: syscalls: use pattern rules to generate syscall headers parisc: syscalls: use pattern rules to generate syscall headers nds32: add arch/nds32/boot/.gitignore kbuild: mkcompile_h: consider timestamp if KBUILD_BUILD_TIMESTAMP is set kbuild: modpost: Explicitly warn about unprototyped symbols kbuild: remove trailing slashes from $(KBUILD_EXTMOD) kconfig.h: explain IS_MODULE(), IS_ENABLED() kconfig: constify long_opts scripts/setlocalversion: simplify the short version part scripts/setlocalversion: factor out 12-chars hash construction scripts/setlocalversion: add more comments to -dirty flag detection scripts/setlocalversion: remove workaround for old make-kpkg scripts/setlocalversion: remove mercurial, svn and git-svn supports kbuild: clean up ${quiet} checks in shell scripts kbuild: sink stdout from cmd for silent build init: use $(call cmd,) for generating include/generated/compile.h kbuild: merge scripts/mkmakefile to top Makefile sh: move core-y in arch/sh/Makefile to arch/sh/Kbuild ... |
||
|
5a7f7fc5dd |
Tracing fix for histograms and a clean up in ftrace
- Fixed a bug that broke the .sym-offset modifier and added a test to make sure nothing breaks it again. - Replace a list_del/list_add() with a list_move() -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYOhMlxQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6quYRAQCbciaL9gQwk6lS4Rn3NkC/i1xsxv0A q56XDPK3qyuHWQD/d6gcJ5n+Bl0OoNaDyG4UqnSGucZctYQeY4LQMzd4PQA= =wPyY -----END PGP SIGNATURE----- Merge tag 'trace-v5.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix and cleanup from Steven Rostedt: "Tracing fix for histograms and a clean up in ftrace: - Fixed a bug that broke the .sym-offset modifier and added a test to make sure nothing breaks it again. - Replace a list_del/list_add() with a list_move()" * tag 'trace-v5.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ftrace: Use list_move instead of list_del/list_add tracing/selftests: Add tests to test histogram sym and sym-offset modifiers tracing/histograms: Fix parsing of "sym-offset" modifier |
||
|
bd9c350603 |
Merge branch 'akpm' (patches from Andrew)
Pull yet more updates from Andrew Morton: "54 patches. Subsystems affected by this patch series: lib, mm (slub, secretmem, cleanups, init, pagemap, and mremap), and debug" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (54 commits) powerpc/mm: enable HAVE_MOVE_PMD support powerpc/book3s64/mm: update flush_tlb_range to flush page walk cache mm/mremap: allow arch runtime override mm/mremap: hold the rmap lock in write mode when moving page table entries. mm/mremap: use pmd/pud_poplulate to update page table entries mm/mremap: don't enable optimized PUD move if page table levels is 2 mm/mremap: convert huge PUD move to separate helper selftest/mremap_test: avoid crash with static build selftest/mremap_test: update the test to handle pagesize other than 4K mm: rename p4d_page_vaddr to p4d_pgtable and make it return pud_t * mm: rename pud_page_vaddr to pud_pgtable and make it return pmd_t * kdump: use vmlinux_build_id to simplify buildid: fix kernel-doc notation buildid: mark some arguments const scripts/decode_stacktrace.sh: indicate 'auto' can be used for base path scripts/decode_stacktrace.sh: silence stderr messages from addr2line/nm scripts/decode_stacktrace.sh: support debuginfod x86/dumpstack: use %pSb/%pBb for backtrace printing arm64: stacktrace: use %pSb for backtrace printing module: add printk formats to add module build ID to stacktraces ... |
||
|
4840048356 |
irqchip fixes for 5.14, take #1
- Fix a MIPS bug where irqdomain loopkups could occur in a context where RCU is not allowed - Fix a documentation bug for handle_domain_irq -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmDoFZwPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpDkxgP/0IlcPoBfamLbzpjyrMm0359AIllKP7Vgpb6 IaDnoiWTCe1B6rFoP59j357iGCTvB04+oJ17DsB51R9RKaM2sdhi2dKxbEHRd+se v1EyU5gmVB7JJP/1lPnjYdPGnXz73jjEq5MRI6o4yLG2xnQ+ZcdHifuP/tD7glGZ UmJkAWrq53IbBCja9HbtbAQanuDtDu6xgYIweaCdf2oKzNS9jQVvjqFokMd0AcIA juY8xTFNzHoA/8AF8eBUE5TfLVcG8j3P31ffw4gVvzEkman77AP5DZ8qkSvi7plH wOdjlBrsGTfoti4kfAIvsfi6zBHyXJEaW0Vd38uaA+cXYI1QNiZ9qqzQPvjuK9gs VFORcWe2pDiI1q4mg1pz7dGBy0FoEswe4uVnr6xm+vn98KeAn/gS5Dc/K1JpmfN+ iOJt9H7hvDrUC/KnntzuRY82I8y7gDyyjGDJHhFlWgZBeONPhpOiE/d6xbxGtQm8 SpVBD9QZnMBxDY2eVNQp31SCdrwLCLczNeQrJHP6Oh9LLmjQq3VD4M3OpBLpEA8e 8WIiH9vJfXI5emU1wQRkscyA8tT2mAo3Kb192fpC5nDwTMSxbfk1l94oLGUldcV5 liQ78yd/12mJBMS5MJPsA39g1ww2vv5m6Bq0JDwDu33l1qKrlRJqveeS8UjDSXrD s3cVrxFO =eTw3 -----END PGP SIGNATURE----- Merge tag 'irqchip-fixes-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent Pull irqchip fixes from Marc Zyngier: - Fix a MIPS bug where irqdomain loopkups could occur in a context where RCU is not allowed - Fix a documentation bug for handle_domain_irq |
||
|
f263a81451 |
bpf: Track subprog poke descriptors correctly and fix use-after-free
Subprograms are calling map_poke_track(), but on program release there is no hook to call map_poke_untrack(). However, on program release, the aux memory (and poke descriptor table) is freed even though we still have a reference to it in the element list of the map aux data. When we run map_poke_run(), we then end up accessing free'd memory, triggering KASAN in prog_array_map_poke_run(): [...] [ 402.824689] BUG: KASAN: use-after-free in prog_array_map_poke_run+0xc2/0x34e [ 402.824698] Read of size 4 at addr ffff8881905a7940 by task hubble-fgs/4337 [ 402.824705] CPU: 1 PID: 4337 Comm: hubble-fgs Tainted: G I 5.12.0+ #399 [ 402.824715] Call Trace: [ 402.824719] dump_stack+0x93/0xc2 [ 402.824727] print_address_description.constprop.0+0x1a/0x140 [ 402.824736] ? prog_array_map_poke_run+0xc2/0x34e [ 402.824740] ? prog_array_map_poke_run+0xc2/0x34e [ 402.824744] kasan_report.cold+0x7c/0xd8 [ 402.824752] ? prog_array_map_poke_run+0xc2/0x34e [ 402.824757] prog_array_map_poke_run+0xc2/0x34e [ 402.824765] bpf_fd_array_map_update_elem+0x124/0x1a0 [...] The elements concerned are walked as follows: for (i = 0; i < elem->aux->size_poke_tab; i++) { poke = &elem->aux->poke_tab[i]; [...] The access to size_poke_tab is a 4 byte read, verified by checking offsets in the KASAN dump: [ 402.825004] The buggy address belongs to the object at ffff8881905a7800 which belongs to the cache kmalloc-1k of size 1024 [ 402.825008] The buggy address is located 320 bytes inside of 1024-byte region [ffff8881905a7800, ffff8881905a7c00) The pahole output of bpf_prog_aux: struct bpf_prog_aux { [...] /* --- cacheline 5 boundary (320 bytes) --- */ u32 size_poke_tab; /* 320 4 */ [...] In general, subprograms do not necessarily manage their own data structures. For example, BTF func_info and linfo are just pointers to the main program structure. This allows reference counting and cleanup to be done on the latter which simplifies their management a bit. The aux->poke_tab struct, however, did not follow this logic. The initial proposed fix for this use-after-free bug further embedded poke data tracking into the subprogram with proper reference counting. However, Daniel and Alexei questioned why we were treating these objects special; I agree, its unnecessary. The fix here removes the per subprogram poke table allocation and map tracking and instead simply points the aux->poke_tab pointer at the main programs poke table. This way, map tracking is simplified to the main program and we do not need to manage them per subprogram. This also means, bpf_prog_free_deferred(), which unwinds the program reference counting and kfrees objects, needs to ensure that we don't try to double free the poke_tab when free'ing the subprog structures. This is easily solved by NULL'ing the poke_tab pointer. The second detail is to ensure that per subprogram JIT logic only does fixups on poke_tab[] entries it owns. To do this, we add a pointer in the poke structure to point at the subprogram value so JITs can easily check while walking the poke_tab structure if the current entry belongs to the current program. The aux pointer is stable and therefore suitable for such comparison. On the jit_subprogs() error path, we omit cleaning up the poke->aux field because these are only ever referenced from the JIT side, but on error we will never make it to the JIT, so its fine to leave them dangling. Removing these pointers would complicate the error path for no reason. However, we do need to untrack all poke descriptors from the main program as otherwise they could race with the freeing of JIT memory from the subprograms. Lastly, |
||
|
44e8a5e912 |
kdump: use vmlinux_build_id to simplify
We can use the vmlinux_build_id array here now instead of open coding it. This mostly consolidates code. Link: https://lkml.kernel.org/r/20210511003845.2429846-14-swboyd@chromium.org Signed-off-by: Stephen Boyd <swboyd@chromium.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jessica Yu <jeyu@kernel.org> Cc: Evan Green <evgreen@chromium.org> Cc: Hsin-Yi Wang <hsinyi@chromium.org> Cc: Dave Young <dyoung@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Matthew Wilcox <willy@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Sasha Levin <sashal@kernel.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
9294523e37 |
module: add printk formats to add module build ID to stacktraces
Let's make kernel stacktraces easier to identify by including the build ID[1] of a module if the stacktrace is printing a symbol from a module. This makes it simpler for developers to locate a kernel module's full debuginfo for a particular stacktrace. Combined with scripts/decode_stracktrace.sh, a developer can download the matching debuginfo from a debuginfod[2] server and find the exact file and line number for the functions plus offsets in a stacktrace that match the module. This is especially useful for pstore crash debugging where the kernel crashes are recorded in something like console-ramoops and the recovery kernel/modules are different or the debuginfo doesn't exist on the device due to space concerns (the debuginfo can be too large for space limited devices). Originally, I put this on the %pS format, but that was quickly rejected given that %pS is used in other places such as ftrace where build IDs aren't meaningful. There was some discussions on the list to put every module build ID into the "Modules linked in:" section of the stacktrace message but that quickly becomes very hard to read once you have more than three or four modules linked in. It also provides too much information when we don't expect each module to be traversed in a stacktrace. Having the build ID for modules that aren't important just makes things messy. Splitting it to multiple lines for each module quickly explodes the number of lines printed in an oops too, possibly wrapping the warning off the console. And finally, trying to stash away each module used in a callstack to provide the ID of each symbol printed is cumbersome and would require changes to each architecture to stash away modules and return their build IDs once unwinding has completed. Instead, we opt for the simpler approach of introducing new printk formats '%pS[R]b' for "pointer symbolic backtrace with module build ID" and '%pBb' for "pointer backtrace with module build ID" and then updating the few places in the architecture layer where the stacktrace is printed to use this new format. Before: Call trace: lkdtm_WARNING+0x28/0x30 [lkdtm] direct_entry+0x16c/0x1b4 [lkdtm] full_proxy_write+0x74/0xa4 vfs_write+0xec/0x2e8 After: Call trace: lkdtm_WARNING+0x28/0x30 [lkdtm 6c2215028606bda50de823490723dc4bc5bf46f9] direct_entry+0x16c/0x1b4 [lkdtm 6c2215028606bda50de823490723dc4bc5bf46f9] full_proxy_write+0x74/0xa4 vfs_write+0xec/0x2e8 [akpm@linux-foundation.org: fix build with CONFIG_MODULES=n, tweak code layout] [rdunlap@infradead.org: fix build when CONFIG_MODULES is not set] Link: https://lkml.kernel.org/r/20210513171510.20328-1-rdunlap@infradead.org [akpm@linux-foundation.org: make kallsyms_lookup_buildid() static] [cuibixuan@huawei.com: fix build error when CONFIG_SYSFS is disabled] Link: https://lkml.kernel.org/r/20210525105049.34804-1-cuibixuan@huawei.com Link: https://lkml.kernel.org/r/20210511003845.2429846-6-swboyd@chromium.org Link: https://fedoraproject.org/wiki/Releases/FeatureBuildId [1] Link: https://sourceware.org/elfutils/Debuginfod.html [2] Signed-off-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Bixuan Cui <cuibixuan@huawei.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jessica Yu <jeyu@kernel.org> Cc: Evan Green <evgreen@chromium.org> Cc: Hsin-Yi Wang <hsinyi@chromium.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Matthew Wilcox <willy@infradead.org> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Young <dyoung@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Sasha Levin <sashal@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
9a436f8ff6 |
PM: hibernate: disable when there are active secretmem users
It is unsafe to allow saving of secretmem areas to the hibernation snapshot as they would be visible after the resume and this essentially will defeat the purpose of secret memory mappings. Prevent hibernation whenever there are active secret memory users. Link: https://lkml.kernel.org/r/20210518072034.31572-6-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christopher Lameter <cl@linux.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Elena Reshetova <elena.reshetova@intel.com> Cc: Hagen Paul Pfeifer <hagen@jauu.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Bottomley <jejb@linux.ibm.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Palmer Dabbelt <palmerdabbelt@google.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Roman Gushchin <guro@fb.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tycho Andersen <tycho@tycho.ws> Cc: Will Deacon <will@kernel.org> Cc: kernel test robot <lkp@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
1507f51255 |
mm: introduce memfd_secret system call to create "secret" memory areas
Introduce "memfd_secret" system call with the ability to create memory
areas visible only in the context of the owning process and not mapped not
only to other processes but in the kernel page tables as well.
The secretmem feature is off by default and the user must explicitly
enable it at the boot time.
Once secretmem is enabled, the user will be able to create a file
descriptor using the memfd_secret() system call. The memory areas created
by mmap() calls from this file descriptor will be unmapped from the kernel
direct map and they will be only mapped in the page table of the processes
that have access to the file descriptor.
Secretmem is designed to provide the following protections:
* Enhanced protection (in conjunction with all the other in-kernel
attack prevention systems) against ROP attacks. Seceretmem makes
"simple" ROP insufficient to perform exfiltration, which increases the
required complexity of the attack. Along with other protections like
the kernel stack size limit and address space layout randomization which
make finding gadgets is really hard, absence of any in-kernel primitive
for accessing secret memory means the one gadget ROP attack can't work.
Since the only way to access secret memory is to reconstruct the missing
mapping entry, the attacker has to recover the physical page and insert
a PTE pointing to it in the kernel and then retrieve the contents. That
takes at least three gadgets which is a level of difficulty beyond most
standard attacks.
* Prevent cross-process secret userspace memory exposures. Once the
secret memory is allocated, the user can't accidentally pass it into the
kernel to be transmitted somewhere. The secreremem pages cannot be
accessed via the direct map and they are disallowed in GUP.
* Harden against exploited kernel flaws. In order to access secretmem,
a kernel-side attack would need to either walk the page tables and
create new ones, or spawn a new privileged uiserspace process to perform
secrets exfiltration using ptrace.
The file descriptor based memory has several advantages over the
"traditional" mm interfaces, such as mlock(), mprotect(), madvise(). File
descriptor approach allows explicit and controlled sharing of the memory
areas, it allows to seal the operations. Besides, file descriptor based
memory paves the way for VMMs to remove the secret memory range from the
userspace hipervisor process, for instance QEMU. Andy Lutomirski says:
"Getting fd-backed memory into a guest will take some possibly major
work in the kernel, but getting vma-backed memory into a guest without
mapping it in the host user address space seems much, much worse."
memfd_secret() is made a dedicated system call rather than an extension to
memfd_create() because it's purpose is to allow the user to create more
secure memory mappings rather than to simply allow file based access to
the memory. Nowadays a new system call cost is negligible while it is way
simpler for userspace to deal with a clear-cut system calls than with a
multiplexer or an overloaded syscall. Moreover, the initial
implementation of memfd_secret() is completely distinct from
memfd_create() so there is no much sense in overloading memfd_create() to
begin with. If there will be a need for code sharing between these
implementation it can be easily achieved without a need to adjust user
visible APIs.
The secret memory remains accessible in the process context using uaccess
primitives, but it is not exposed to the kernel otherwise; secret memory
areas are removed from the direct map and functions in the
follow_page()/get_user_page() family will refuse to return a page that
belongs to the secret memory area.
Once there will be a use case that will require exposing secretmem to the
kernel it will be an opt-in request in the system call flags so that user
would have to decide what data can be exposed to the kernel.
Removing of the pages from the direct map may cause its fragmentation on
architectures that use large pages to map the physical memory which
affects the system performance. However, the original Kconfig text for
CONFIG_DIRECT_GBPAGES said that gigabyte pages in the direct map "... can
improve the kernel's performance a tiny bit ..." (commit
|
||
|
f3791f4df5 |
Fix UCOUNT_RLIMIT_SIGPENDING counter leak
We must properly handle an errors when we increase the rlimit counter
and the ucounts reference counter. We have to this with RCU protection
to prevent possible use-after-free that could occur due to concurrent
put_cred_rcu().
The following reproducer triggers the problem:
$ cat testcase.sh
case "${STEP:-0}" in
0)
ulimit -Si 1
ulimit -Hi 1
STEP=1 unshare -rU "$0"
killall sleep
;;
1)
for i in 1 2 3 4 5; do unshare -rU sleep 5 & done
;;
esac
with the KASAN report being along the lines of
BUG: KASAN: use-after-free in put_ucounts+0x17/0xa0
Write of size 4 at addr ffff8880045f031c by task swapper/2/0
CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.13.0+ #19
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-alt4 04/01/2014
Call Trace:
<IRQ>
put_ucounts+0x17/0xa0
put_cred_rcu+0xd5/0x190
rcu_core+0x3bf/0xcb0
__do_softirq+0xe3/0x341
irq_exit_rcu+0xbe/0xe0
sysvec_apic_timer_interrupt+0x6a/0x90
</IRQ>
asm_sysvec_apic_timer_interrupt+0x12/0x20
default_idle_call+0x53/0x130
do_idle+0x311/0x3c0
cpu_startup_entry+0x14/0x20
secondary_startup_64_no_verify+0xc2/0xcb
Allocated by task 127:
kasan_save_stack+0x1b/0x40
__kasan_kmalloc+0x7c/0x90
alloc_ucounts+0x169/0x2b0
set_cred_ucounts+0xbb/0x170
ksys_unshare+0x24c/0x4e0
__x64_sys_unshare+0x16/0x20
do_syscall_64+0x37/0x70
entry_SYSCALL_64_after_hwframe+0x44/0xae
Freed by task 0:
kasan_save_stack+0x1b/0x40
kasan_set_track+0x1c/0x30
kasan_set_free_info+0x20/0x30
__kasan_slab_free+0xeb/0x120
kfree+0xaa/0x460
put_cred_rcu+0xd5/0x190
rcu_core+0x3bf/0xcb0
__do_softirq+0xe3/0x341
The buggy address belongs to the object at ffff8880045f0300
which belongs to the cache kmalloc-192 of size 192
The buggy address is located 28 bytes inside of
192-byte region [ffff8880045f0300, ffff8880045f03c0)
The buggy address belongs to the page:
page:000000008de0a388 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff8880045f0000 pfn:0x45f0
flags: 0x100000000000200(slab|node=0|zone=1)
raw: 0100000000000200 ffffea00000f4640 0000000a0000000a ffff888001042a00
raw: ffff8880045f0000 000000008010000d 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff8880045f0200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880045f0280: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>ffff8880045f0300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff8880045f0380: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
ffff8880045f0400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
Disabling lock debugging due to kernel taint
Fixes:
|
||
|
3ecda64475 |
ftrace: Use list_move instead of list_del/list_add
Using list_move() instead of list_del() + list_add(). Link: https://lkml.kernel.org/r/20210608031108.2820996-1-libaokun1@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> |
||
|
aef4226f91 |
More power management updates for 5.14-rc1
- Drop the ->stop_cpu() (not really useful) and ->resolve_freq() (unused) cpufreq driver callbacks and modify the users of the former accordingly (Viresh Kumar, Rafael Wysocki). - Add frequency invariance support to the ACPI CPPC cpufreq driver again along with the related fixes and cleanups (Viresh Kumar). - Update the Meditak, qcom and SCMI ARM cpufreq drivers (Fabien Parent, Seiya Wang, Sibi Sankar, Christophe JAILLET). - Rename black/white-lists in the DT cpufreq driver (Viresh Kumar). - Add generic performance domains support to the dvfs DT bindings (Sudeep Holla). - Refine locking in the generic power domains (genpd) support code to avoid lock dependency issues (Stephen Boyd). - Update the MSM and qcom ARM cpuidle drivers (Bartosz Dudziak). - Simplify the PM core debug code by using ktime_us_delta() to compute time interval lengths (Mark-PK Tsai). -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmDl+NcSHHJqd0Byand5 c29ja2kubmV0AAoJEILEb/54YlRxyggQALTT/4fFZYlAG/2jpu0wl/3OeR5EPS/O /KrfU7NfXOlNDiuZzVVrMsu1YhkAMedvhHl+9bvveYJBgNuMz7Y1WSGY4PAbW9j+ htnvPd8UdpNODAhT3MDV9sdpXglkWAjivCAM7nr57nl3vwrRgAhBEBkA0/OKiij7 dlTSRy+Doq0AI3ceOHlMFHGnL+/PBx3YJcOuKBwK9gHrbCQxoBQJ5QJnm2Sa8NEy QGoQWWvSI6mDntsAYL4g7mbUdOqGSN04KdXnf0nA9Ywe/Lb/Rp/Il7zCfc13iBSV Nv1ala2NFtp/W+nQan8hxXOgbET+ybbduG65bB33McTuaZ3VIV5sSxxAfEWV+DiM PGEGwDJ/vZqoJiiTKmYij07/sIUXe1Q2YsxTvDh0tqzFmO78reDfXlLdfrrOjRsd Ybw3dkI/XQH7/sTawCxn/TdFRj2hx7wg7yDqkLJXnA037BN+EBFqSckyABoBPpmA TwcacjyotwnfwOpoxbxmjCX1VwIIZ2Mk7Q3h+v9Ej5aNIuyDfR0DzybqCLqIan/4 vETtz+OCO5H1BT6g/ctIi13e7MvWzZXudNWTRTS8ZVXzuzo1hO3okHWuXiQbKGOA Dh2sjkJBBgr9WPFjkF5mgZZp8SM0D4S5SSAEglTKSwxcaMcqXVlgEN8beyMa5FuD y/UeopwhL66a =Un7e -----END PGP SIGNATURE----- Merge tag 'pm-5.14-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: "These include cpufreq core simplifications and fixes, cpufreq driver updates, cpuidle driver update, a generic power domains (genpd) locking fix and a debug-related simplification of the PM core. Specifics: - Drop the ->stop_cpu() (not really useful) and ->resolve_freq() (unused) cpufreq driver callbacks and modify the users of the former accordingly (Viresh Kumar, Rafael Wysocki). - Add frequency invariance support to the ACPI CPPC cpufreq driver again along with the related fixes and cleanups (Viresh Kumar). - Update the Meditak, qcom and SCMI ARM cpufreq drivers (Fabien Parent, Seiya Wang, Sibi Sankar, Christophe JAILLET). - Rename black/white-lists in the DT cpufreq driver (Viresh Kumar). - Add generic performance domains support to the dvfs DT bindings (Sudeep Holla). - Refine locking in the generic power domains (genpd) support code to avoid lock dependency issues (Stephen Boyd). - Update the MSM and qcom ARM cpuidle drivers (Bartosz Dudziak). - Simplify the PM core debug code by using ktime_us_delta() to compute time interval lengths (Mark-PK Tsai)" * tag 'pm-5.14-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (21 commits) PM: domains: Shrink locking area of the gpd_list_lock PM: sleep: Use ktime_us_delta() in initcall_debug_report() cpufreq: CPPC: Add support for frequency invariance arch_topology: Avoid use-after-free for scale_freq_data cpufreq: CPPC: Pass structure instance by reference cpufreq: CPPC: Fix potential memleak in cppc_cpufreq_cpu_init cpufreq: Remove ->resolve_freq() cpufreq: Reuse cpufreq_driver_resolve_freq() in __cpufreq_driver_target() cpufreq: Remove the ->stop_cpu() driver callback cpufreq: powernv: Migrate to ->exit() callback instead of ->stop_cpu() cpufreq: CPPC: Migrate to ->exit() callback instead of ->stop_cpu() cpufreq: intel_pstate: Combine ->stop_cpu() and ->offline() cpuidle: qcom: Add SPM register data for MSM8226 dt-bindings: arm: msm: Add SAW2 for MSM8226 dt-bindings: cpufreq: update cpu type and clock name for MT8173 SoC clk: mediatek: remove deprecated CLK_INFRA_CA57SEL for MT8173 SoC cpufreq: dt: Rename black/white-lists cpufreq: scmi: Fix an error message cpufreq: mediatek: add support for mt8365 dt-bindings: dvfs: Add support for generic performance domains ... |
||
|
a931dd33d3 |
Modules updates for v5.14
Summary of modules changes for the 5.14 merge window: - Fix incorrect logic in module_kallsyms_on_each_symbol() - Fix for a Coccinelle warning Signed-off-by: Jessica Yu <jeyu@kernel.org> -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEVrp26glSWYuDNrCUwEV+OM47wXIFAmDltvoQHGpleXVAa2Vy bmVsLm9yZwAKCRDARX44zjvBcu74D/9+HDmNlEd92ongCVhod9+s4p/g7pArBPdW YrvjYM+icH0qpE58etIQOibvFL04WxelDqyrCewEhABwB5hbad+owShMl+R43C8P IRCYYN2sG7vHVPMiUgeXlS5OiBZeFOb+Mf9Ccnu5MuI/BHpsTX9uhG+rbHQg+lXb eOWu0QmmCXgtZy0NVeCNtdOJV/PwgXC3ZHE2zgWukf1QWtwm1+0VciZwui162eyN TL/XLSH5pNZNr2MSxi5b2rdwrUVdxb0enE4L4cwCdALipooVEsrxuVvugcYPBHrC 38b7cJBfdYpVuS2iK9tSqSMdRkqX2LKZrfiGFCAxn2TA55Jc6WJhbzL0nEPGyGX1 fRR9tGAuWe1ihQHRIajLMtmdfA/HHJMmuPx8c0JDIAeTj3SdDch/gEhcKcXj6kP2 eXRoQS3RD7RSz6jKKRTFxJsn8lYJhSYN8IGwfjQzPwruLRynDQaCpnN+7wfGsdCf rKb7NxdsbNdQKjkaewu2zb/Qk7U7TOVjJhAv2gN0u/RbGSJ3i+MJTLfk3d/LLJcD ncLaVh2zRHIdKNQj/dtmIeXiSuJ23NjNQszOiaI8xZ7YKexhJCimAWPP/8f4dCuB 2tL/Xkg1GVy9+WZQXkApyOPzvtXXE9dDfQR+HiNTQaTlZpw5SxYg0ZPBBL22yv1J xDD9Hfl1jw== =/zzN -----END PGP SIGNATURE----- Merge tag 'modules-for-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux Pull module updates from Jessica Yu: - Fix incorrect logic in module_kallsyms_on_each_symbol() - Fix for a Coccinelle warning * tag 'modules-for-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux: module: correctly exit module_kallsyms_on_each_symbol when fn() != 0 kernel/module: Use BUG_ON instead of if condition followed by BUG |
||
|
26c5637310 |
tracing/histograms: Fix parsing of "sym-offset" modifier
With the addition of simple mathematical operations (plus and minus), the
parsing of the "sym-offset" modifier broke, as it took the '-' part of the
"sym-offset" as a minus, and tried to break it up into a mathematical
operation of "field.sym - offset", in which case it failed to parse
(unless the event had a field called "offset").
Both .sym and .sym-offset modifiers should not be entered into
mathematical calculations anyway. If ".sym-offset" is found in the
modifier, then simply make it not an operation that can be calculated on.
Link: https://lkml.kernel.org/r/20210707110821.188ae255@oasis.local.home
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: stable@vger.kernel.org
Fixes:
|
||
|
2a2ed5618a |
rcu: Fix pr_info() formats and values in show_rcu_gp_kthreads()
This commit changes from "%lx" to "%x" and from "0x1ffffL" to "0x1ffff" to match the change in type between the old field ->state (unsigned long) and the new field ->__state (unsigned int). Signed-off-by: Paul E. McKenney <paulmck@kernel.org> |
||
|
a9ab9cce93 |
rcu-tasks: Don't delete holdouts within trc_wait_for_one_reader()
Invoking trc_del_holdout() from within trc_wait_for_one_reader() is only a performance optimization because the RCU Tasks Trace grace-period kthread will eventually do this within check_all_holdout_tasks_trace(). But it is not a particularly important performance optimization because it only applies to the grace-period kthread, of which there is but one. This commit therefore removes this invocation of trc_del_holdout() in favor of the one in check_all_holdout_tasks_trace() in the grace-period kthread. Reported-by: "Xu, Yanfei" <yanfei.xu@windriver.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> |
||
|
1d10bf55d8 |
rcu-tasks: Don't delete holdouts within trc_inspect_reader()
As Yanfei pointed out, although invoking trc_del_holdout() is safe from the viewpoint of the integrity of the holdout list itself, the put_task_struct() invoked by trc_del_holdout() can result in use-after-free errors due to later accesses to this task_struct structure by the RCU Tasks Trace grace-period kthread. This commit therefore removes this call to trc_del_holdout() from trc_inspect_reader() in favor of the grace-period thread's existing call to trc_del_holdout(), thus eliminating that particular class of use-after-free errors. Reported-by: "Xu, Yanfei" <yanfei.xu@windriver.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> |
||
|
05bc276cf2 |
refscale: Avoid false-positive warnings in ref_scale_reader()
If the call to set_cpus_allowed_ptr() in ref_scale_reader()
fails, a later WARN_ONCE() complains. But with the advent of
|
||
|
22b6d14992 |
scftorture: Avoid false-positive warnings in scftorture_invoker()
If the call to set_cpus_allowed_ptr() in scftorture_invoker()
fails, a later WARN_ONCE() complains. But with the advent of
|
||
|
df8ba5f160 |
kgdb patches for 5.14
This was a extremely quiet cycle for kgdb. This PR consists of two patches that between them address spelling errors and a switch fallthrough warning. Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEELzVBU1D3lWq6cKzwfOMlXTn3iKEFAmDkLFQACgkQfOMlXTn3 iKEXqA/8CjjiWc7Tr9GrXuSKf2QjcPxUEQwzEVCqalk/X7bQwdJg69nKVulW4Ixm db4fANz90w9E8CCMNJlc/Wfev43kcQprn6cO1AkBUyhtDn6F/ql8E/oUWzniGJiU ko+2Hkd9Vfr6lYsDeMNPMe4bjYYUcfBIkJivPA6auX2iyJq7WJgUAh9iea2VoaNB MVjy8F7uUA/TUK4Y/4YiCK4lBILT0KnUvr50n67ZxhMlE3nzFfI0DtIx+7D1vcor w6COVZSkDeyuj+gYq7BFrnOaXPqmEnwesva8VI3UHBOY8+bXdzAld0MrzpyktXTh HPTRIR1KPdqxr4P+wOJc/D8HRIQZ2oKXFXaziUvy6rSxyfBWpJZ1ZCy/uUbPhmyx d620xwHa9R4g0VeA6NGCHhLv7T7vCItBap+bTVAzFJ1LfBYWF4Y3NUsnWxkFNBjM URv9lfXSD6wXt8k/CtkyYUCilT9RsTrQUwJio3yd+kmrbUm3iPfDMWWHW78Jdp0v JIR3/47YNCh53O37QzRGgiB4/cVBzhk5hI/pRnkfRFttUqIjeaSYXvT4mmTUox+f euZESQbwsi8XmvqNSIkDuNjru6JmrfEiiAotjBY6lZKhRP0dhldWPGNT5x1XCh8i p1Dx42So4BNaPJSf3eTiFWGgtYKqu/gqkns4D4ejtiQUe60pzC0= =dnV5 -----END PGP SIGNATURE----- Merge tag 'kgdb-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux Pull kgdb updates from Daniel Thompson: "This was a extremely quiet cycle for kgdb. This consists of two patches that between them address spelling errors and a switch fallthrough warning" * tag 'kgdb-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux: kgdb: Fix fall-through warning for Clang kgdb: Fix spelling mistakes |
||
|
fa68bd09fc |
kprobe/static_call: Restore missing static_call_text_reserved()
Restore two hunks from commit: |
||
|
2bee6d16e4 |
static_call: Fix static_call_text_reserved() vs __init
It turns out that static_call_text_reserved() was reporting __init
text as being reserved past the time when the __init text was freed
and re-used.
This is mostly harmless and will at worst result in refusing a kprobe.
Fixes:
|
||
|
9e667624c2 |
jump_label: Fix jump_label_text_reserved() vs __init
It turns out that jump_label_text_reserved() was reporting __init text as being reserved past the time when the __init text was freed and re-used. For a long time, this resulted in, at worst, not being able to kprobe text that happened to land at the re-used address. However a recent commit |
||
|
4840ce2267 |
locking/lockdep: Fix meaningless /proc/lockdep output of lock classes on !CONFIG_PROVE_LOCKING
When enabling CONFIG_LOCK_STAT=y, then CONFIG_LOCKDEP=y is forcedly enabled, but CONFIG_PROVE_LOCKING is disabled. We can get output from /proc/lockdep, which currently includes usages of lock classes. But the usages are meaningless, see the output below: / # cat /proc/lockdep all lock classes: ffffffff9af63350 ....: cgroup_mutex ffffffff9af54eb8 ....: (console_sem).lock ffffffff9af54e60 ....: console_lock ffffffff9ae74c38 ....: console_owner_lock ffffffff9ae74c80 ....: console_owner ffffffff9ae66e60 ....: cpu_hotplug_lock Only one usage context for each lock, this is because each usage is only changed in mark_lock() that is in the CONFIG_PROVE_LOCKING=y section, however in the test situation, it's not. The fix is to move the usages reading and seq_print from the !CONFIG_PROVE_LOCKING section to its defined section. Also, locks_after list of lock_class is empty when !CONFIG_PROVE_LOCKING, so do the same thing as what have done for usages of lock classes. With this patch with !CONFIG_PROVE_LOCKING we can get the results below: / # cat /proc/lockdep all lock classes: ffffffff85163290: cgroup_mutex ffffffff85154dd8: (console_sem).lock ffffffff85154d80: console_lock ffffffff85074b58: console_owner_lock ffffffff85074ba0: console_owner ffffffff85066d60: cpu_hotplug_lock ... a class key and the relevant class name each line. Signed-off-by: Xiongwei Song <sxwjean@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Waiman Long <longman@redhat.com> Link: https://lore.kernel.org/r/20210629135916.308210-1-sxwjean@me.com |
||
|
28e92f9903 |
Merge branch 'core-rcu-2021.07.04' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU updates from Paul McKenney: - Bitmap parsing support for "all" as an alias for all bits - Documentation updates - Miscellaneous fixes, including some that overlap into mm and lockdep - kvfree_rcu() updates - mem_dump_obj() updates, with acks from one of the slab-allocator maintainers - RCU NOCB CPU updates, including limited deoffloading - SRCU updates - Tasks-RCU updates - Torture-test updates * 'core-rcu-2021.07.04' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (78 commits) tasks-rcu: Make show_rcu_tasks_gp_kthreads() be static inline rcu-tasks: Make ksoftirqd provide RCU Tasks quiescent states rcu: Add missing __releases() annotation rcu: Remove obsolete rcu_read_unlock() deadlock commentary rcu: Improve comments describing RCU read-side critical sections rcu: Create an unrcu_pointer() to remove __rcu from a pointer srcu: Early test SRCU polling start rcu: Fix various typos in comments rcu/nocb: Unify timers rcu/nocb: Prepare for fine-grained deferred wakeup rcu/nocb: Only cancel nocb timer if not polling rcu/nocb: Delete bypass_timer upon nocb_gp wakeup rcu/nocb: Cancel nocb_timer upon nocb_gp wakeup rcu/nocb: Allow de-offloading rdp leader rcu/nocb: Directly call __wake_nocb_gp() from bypass timer rcu: Don't penalize priority boosting when there is nothing to boost rcu: Point to documentation of ordering guarantees rcu: Make rcu_gp_cleanup() be noinline for tracing rcu: Restrict RCU_STRICT_GRACE_PERIOD to at most four CPUs rcu: Make show_rcu_gp_kthreads() dump rcu_node structures blocking GP ... |
||
|
b97efd5e98 |
Merge branch 'kcsan.2021.05.18a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull KCSAN updates from Paul McKenney. * 'kcsan.2021.05.18a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: kcsan: Use URL link for pointing access-marking.txt kcsan: Document "value changed" line kcsan: Report observed value changes kcsan: Remove kcsan_report_type kcsan: Remove reporting indirection kcsan: Refactor access_info initialization kcsan: Fold panic() call into print_report() kcsan: Refactor passing watchpoint/other_info kcsan: Distinguish kcsan_report() calls kcsan: Simplify value change detection kcsan: Add pointer to access-marking.txt to data_race() bullet |
||
|
58ec9059b3 |
Merge branch 'work.namei' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs name lookup updates from Al Viro: "Small namei.c patch series, mostly to simplify the rules for nameidata state. It's actually from the previous cycle - but I didn't post it for review in time... Changes visible outside of fs/namei.c: file_open_root() calling conventions change, some freed bits in LOOKUP_... space" * 'work.namei' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: namei: make sure nd->depth is always valid teach set_nameidata() to handle setting the root as well take LOOKUP_{ROOT,ROOT_GRABBED,JUMPED} out of LOOKUP_... space switch file_open_root() to struct path |
||
|
757fa80f4e |
Tracing updates for 5.14:
- Added option for per CPU threads to the hwlat tracer - Have hwlat tracer handle hotplug CPUs - New tracer: osnoise, that detects latency caused by interrupts, softirqs and scheduling of other tasks. - Added timerlat tracer that creates a thread and measures in detail what sources of latency it has for wake ups. - Removed the "success" field of the sched_wakeup trace event. This has been hardcoded as "1" since 2015, no tooling should be looking at it now. If one exists, we can revert this commit, fix that tool and try to remove it again in the future. - tgid mapping fixed to handle more than PID_MAX_DEFAULT pids/tgids. - New boot command line option "tp_printk_stop", as tp_printk causes trace events to write to console. When user space starts, this can easily live lock the system. Having a boot option to stop just after boot up is useful to prevent that from happening. - Have ftrace_dump_on_oops boot command line option take numbers that match the numbers shown in /proc/sys/kernel/ftrace_dump_on_oops. - Bootconfig clean ups, fixes and enhancements. - New ktest script that tests bootconfig options. - Add tracepoint_probe_register_may_exist() to register a tracepoint without triggering a WARN*() if it already exists. BPF has a path from user space that can do this. All other paths are considered a bug. - Small clean ups and fixes -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYN8YPhQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qhxLAP9Mo5hHv7Hg6W7Ddv77rThm+qclsMR/ yW0P+eJpMm4+xAD8Cq03oE1DimPK+9WZBKU5rSqAkqG6CjgDRw6NlIszzQQ= =WEPR -----END PGP SIGNATURE----- Merge tag 'trace-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: - Added option for per CPU threads to the hwlat tracer - Have hwlat tracer handle hotplug CPUs - New tracer: osnoise, that detects latency caused by interrupts, softirqs and scheduling of other tasks. - Added timerlat tracer that creates a thread and measures in detail what sources of latency it has for wake ups. - Removed the "success" field of the sched_wakeup trace event. This has been hardcoded as "1" since 2015, no tooling should be looking at it now. If one exists, we can revert this commit, fix that tool and try to remove it again in the future. - tgid mapping fixed to handle more than PID_MAX_DEFAULT pids/tgids. - New boot command line option "tp_printk_stop", as tp_printk causes trace events to write to console. When user space starts, this can easily live lock the system. Having a boot option to stop just after boot up is useful to prevent that from happening. - Have ftrace_dump_on_oops boot command line option take numbers that match the numbers shown in /proc/sys/kernel/ftrace_dump_on_oops. - Bootconfig clean ups, fixes and enhancements. - New ktest script that tests bootconfig options. - Add tracepoint_probe_register_may_exist() to register a tracepoint without triggering a WARN*() if it already exists. BPF has a path from user space that can do this. All other paths are considered a bug. - Small clean ups and fixes * tag 'trace-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (49 commits) tracing: Resize tgid_map to pid_max, not PID_MAX_DEFAULT tracing: Simplify & fix saved_tgids logic treewide: Add missing semicolons to __assign_str uses tracing: Change variable type as bool for clean-up trace/timerlat: Fix indentation on timerlat_main() trace/osnoise: Make 'noise' variable s64 in run_osnoise() tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing tracing: Fix spelling in osnoise tracer "interferences" -> "interference" Documentation: Fix a typo on trace/osnoise-tracer trace/osnoise: Fix return value on osnoise_init_hotplug_support trace/osnoise: Make interval u64 on osnoise_main trace/osnoise: Fix 'no previous prototype' warnings tracing: Have osnoise_main() add a quiescent state for task rcu seq_buf: Make trace_seq_putmem_hex() support data longer than 8 seq_buf: Fix overflow in seq_buf_putmem_hex() trace/osnoise: Support hotplug operations trace/hwlat: Support hotplug operations trace/hwlat: Protect kdata->kthread with get/put_online_cpus trace: Add timerlat tracer trace: Add osnoise tracer ... |
||
|
bd31b9efbf |
SCSI misc on 20210702
This series consists of the usual driver updates (ufs, ibmvfc, megaraid_sas, lpfc, elx, mpi3mr, qedi, iscsi, storvsc, mpt3sas) with elx and mpi3mr being new drivers. The major core change is a rework to drop the status byte handling macros and the old bit shifted definitions and the rest of the updates are minor fixes. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYN7I6iYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishXpRAQCkngYZ 35yQrqOxgOk2pfrysE95tHrV1MfJm2U49NFTwAEAuZutEvBUTfBF+sbcJ06r6q7i H0hkJN/Io7enFs5v3WA= =zwIa -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This series consists of the usual driver updates (ufs, ibmvfc, megaraid_sas, lpfc, elx, mpi3mr, qedi, iscsi, storvsc, mpt3sas) with elx and mpi3mr being new drivers. The major core change is a rework to drop the status byte handling macros and the old bit shifted definitions and the rest of the updates are minor fixes" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (287 commits) scsi: aha1740: Avoid over-read of sense buffer scsi: arcmsr: Avoid over-read of sense buffer scsi: ips: Avoid over-read of sense buffer scsi: ufs: ufs-mediatek: Add missing of_node_put() in ufs_mtk_probe() scsi: elx: libefc: Fix IRQ restore in efc_domain_dispatch_frame() scsi: elx: libefc: Fix less than zero comparison of a unsigned int scsi: elx: efct: Fix pointer error checking in debugfs init scsi: elx: efct: Fix is_originator return code type scsi: elx: efct: Fix link error for _bad_cmpxchg scsi: elx: efct: Eliminate unnecessary boolean check in efct_hw_command_cancel() scsi: elx: efct: Do not use id uninitialized in efct_lio_setup_session() scsi: elx: efct: Fix error handling in efct_hw_init() scsi: elx: efct: Remove redundant initialization of variable lun scsi: elx: efct: Fix spelling mistake "Unexected" -> "Unexpected" scsi: lpfc: Fix build error in lpfc_scsi.c scsi: target: iscsi: Remove redundant continue statement scsi: qla4xxx: Remove redundant continue statement scsi: ppa: Switch to use module_parport_driver() scsi: imm: Switch to use module_parport_driver() scsi: mpt3sas: Fix error return value in _scsih_expander_add() ... |
||
|
e72b069609 |
dma-mapping updates for Linux 5.14
- a trivivial whitespace fix (Zhen Lei) - report -EEXIST errors in add_dma_entry (Hamza Mahfooz) -----BEGIN PGP SIGNATURE----- iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmDfOS4LHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYP8sBAAq1qOMdxAsOtfdl4MrUP3/LrxL9FSN6ThMeaZy3QV jUgcIEZ8To8KAXoEAu0LzIDhI6xGfAcwzavNviCPzoyT4OPzTOZQvHK4AOotruu0 qQ/Xa9iSputWNA/eV8eMLRYKRZJv0MB0su4mqn9h/ruxRENm/iBKHj30Q1JrCERQ bhbEm1Bnw6BZqjQrhn4hEKq7UVVTUeVOITDXmj4TiYQJPLsr2RFgZa1UzOx1xbvu x1HzzRNeBRjri2yIn562eIuIPno+d6NtLDPlCXD46dqd6VUclV0hYy6irooZY0uY 0gnXo35NKB1DG45he6TIm/LTKMjUh7glGLeNY6Id1ZcZJWOTf3zCSs3Z/BiX0aXK ituMdk56FjFK2aXExioUJ3heZ2lrn2WHNMeZ9wNcZ38UPJNo/dfFXXfYbsGp+TGa XsS6QbPW2CmaVz6nktM6ra0VaUNWF/dHY2IpAatMnPfGJbrvmeYLoRQZSvkGwB64 k95Vu+Q8BAydb9cjLqs5EIZLAPGPsVT6bWZWYUPkc74I7Ke8c03r6K81IwAky9mS 5EtitaDFQPOlUFwI79RLgg4I+xPEMw3u/mmYLP6JH4WhPjHWk7SeOP1p6EBLCQHE 7OJLVJxmsrG7VLK+YW7N9q7pSokij0ghpNAeYv7CtxA+tiEPuEYx39Td7+btiP6f 0KQ= =QIFu -----END PGP SIGNATURE----- Merge tag 'dma-mapping-5.14' of git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping updates from Christoph Hellwig: - a trivivial whitespace fix (Zhen Lei) - report -EEXIST errors in add_dma_entry (Hamza Mahfooz) * tag 'dma-mapping-5.14' of git://git.infradead.org/users/hch/dma-mapping: dma-debug: report -EEXIST errors in add_dma_entry dma-mapping: remove a trailing space |
||
|
a48ad6e7a3 |
linux-kselftest-kunit-fixes-5.14-rc1
This KUnit update for Linux 5.14-rc1 consists of fixes and features: -- add support for skipped tests -- introduce kunit_kmalloc_array/kunit_kcalloc() helpers -- add gnu_printf specifiers -- add kunit_shutdown -- add unit test for filtering suites by names -- convert lib/test_list_sort.c to use KUnit -- code organization moving default config to tools/testing/kunit -- refactor of internal parser input handling -- cleanups and updates to documentation -- code cleanup related to casts -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmDfNHgACgkQCwJExA0N QxzU5g/+ORbcbE5jhumo82xYUCGTScgIqPllNO000Vk9xydMqwx6tpY5puV0ig2z 5X7pxKEQTnype58yzY5mS1p336ryIx3q+AwEownxTW3YurXo2naQ59ZPjEvlAV+E h+DMYLzEjrxzcyJATw02uC9YLUZ3w6FPJfiXViIv93YYrtcnM0u6JpwG0yfBlI4v 4KKB2Xu4K7T90C9/ADFYFKX3mjXQl5fQwvIdtA7wS90Cgq52LKp2mvg1XEiZE5d+ 0dkTZ4Zo8TxxHt665o7vfnUjQQNmh45iGlW65wONxfAPb8BPoneGjVKWQTnN0Hor W+93ZPbMuFMSSKJuoHY9U7sP5VySKvaiIYaGdi6prnZZu0zUabKnLZ6FOy7kEdfs v09ulCBTVLslixVgNcp/kD9T+G/SXwCF5YBMAiMDQ0GNfUqlFtBkEA3gd44KwMI0 KwCcOgUSiaCkqyzOz/VeQsu/nhA5jdMO0KjiAs7Z3e7r7O/qKFs/ll7hZgDNCWSC q8eIrcBkSL0EGgXR1iZ4AtGm8op6KKd4ACBM8NdtTyoGFl1npZOgZnHoIsy35G9K 9mhc7eXSoaDGqy9dONL1Tc8Neg7qLTXQNp2radqsnAAgNPUrJuC7+8YC+DdIsjBH W7OyMjpfbwPws5rP4CS+JdwL+nQprKXZvFIhWGYhkDK44MbOngw= =5QAv -----END PGP SIGNATURE----- Merge tag 'linux-kselftest-kunit-fixes-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull KUnit update from Shuah Khan: "Fixes and features: - add support for skipped tests - introduce kunit_kmalloc_array/kunit_kcalloc() helpers - add gnu_printf specifiers - add kunit_shutdown - add unit test for filtering suites by names - convert lib/test_list_sort.c to use KUnit - code organization moving default config to tools/testing/kunit - refactor of internal parser input handling - cleanups and updates to documentation - code cleanup related to casts" * tag 'linux-kselftest-kunit-fixes-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (29 commits) kunit: add unit test for filtering suites by names kasan: test: make use of kunit_skip() kunit: test: Add example tests which are always skipped kunit: tool: Support skipped tests in kunit_tool kunit: Support skipped tests thunderbolt: test: Reinstate a few casts of bitfields kunit: tool: internal refactor of parser input handling lib/test: convert lib/test_list_sort.c to use KUnit kunit: introduce kunit_kmalloc_array/kunit_kcalloc() helpers kunit: Remove the unused all_tests.config kunit: Move default config from arch/um -> tools/testing/kunit kunit: arch/um/configs: Enable KUNIT_ALL_TESTS by default kunit: Add gnu_printf specifiers lib/cmdline_kunit: Remove a cast which are no-longer required kernel/sysctl-test: Remove some casts which are no-longer required thunderbolt: test: Remove some casts which are no longer required mmc: sdhci-of-aspeed: Remove some unnecessary casts from KUnit tests iio: Remove a cast in iio-test-format which is no longer required device property: Remove some casts in property-entry-test Documentation: kunit: Clean up some string casts in examples ... |
||
|
019b3fd94b |
powerpc updates for 5.14
- A big series refactoring parts of our KVM code, and converting some to C. - Support for ARCH_HAS_SET_MEMORY, and ARCH_HAS_STRICT_MODULE_RWX on some CPUs. - Support for the Microwatt soft-core. - Optimisations to our interrupt return path on 64-bit. - Support for userspace access to the NX GZIP accelerator on PowerVM on Power10. - Enable KUAP and KUEP by default on 32-bit Book3S CPUs. - Other smaller features, fixes & cleanups. Thanks to: Andy Shevchenko, Aneesh Kumar K.V, Arnd Bergmann, Athira Rajeev, Baokun Li, Benjamin Herrenschmidt, Bharata B Rao, Christophe Leroy, Daniel Axtens, Daniel Henrique Barboza, Finn Thain, Geoff Levand, Haren Myneni, Jason Wang, Jiapeng Chong, Joel Stanley, Jordan Niethe, Kajol Jain, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Nick Desaulniers, Paul Mackerras, Russell Currey, Sathvika Vasireddy, Shaokun Zhang, Stephen Rothwell, Sudeep Holla, Suraj Jitindar Singh, Tom Rix, Vaibhav Jain, YueHaibing, Zhang Jianhua, Zhen Lei. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmDfFS4THG1wZUBlbGxl cm1hbi5pZC5hdQAKCRBR6+o8yOGlgFxHEAC88NJ+Gz87LiTQFt6QjhziBaJUd+sY uqADPRROr4P50O8PjYZbMi2qbXzOlLkZO4wJWX7jpZ1F9KmbPNqY2shD8h4ahyge F/uqzBW1FXBJfnDEKdU2MzalkeTP+dwxLZyouUamjDCGNLFjOV4x/Fft5otOdXjO k9uO6yoGyOkWYzjC+Y/irNPlIDDByB/+bD92Cb52Y2mXMDDEnx4JzbtkeJW+8udT Sjn3bWzeL+dz5GehjMKwK4+SptNiyQGOgM8FwtnKUMvgzxv04DqCGjr9YC12L2Z7 VoFZc4GzVgtf8DZg4fJ3KG5aG2nH3Tui7jc9lUckdrxixDAZw5wSG7CQ39gFb/5+ 7A4fEJk4Z3h5llibwxAZrC7wV8ZDDXn8oRFzRcOJjfxYaD+ohOOyWHIebwkdiXYx nfYI7sBcScDLXeBvHtDra2GJpbFSpVL3S/QNhhi1vKVNrFSyAgbAybcVL2xPLZ6+ 8Mh7A8xt+hf2bo9AXuYJDo9mwXWfg1093d0kT+AslcRhZioBk18c2AiZLIz0FzuL Ua/e5FPb99x9LSdcZHvaAXBoHT2iTgDyCyDa3gkIesyuRX6ggHoFcVQuvdDcbJ9d H8LK+Tahy1Y+E5b6KdtU8mDEGE+QG+CWLnwQ6YSCaL/MYgaFzNa32Jdj1fmztSBC cttP43kHZ7ljTw== =zo4d -----END PGP SIGNATURE----- Merge tag 'powerpc-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - A big series refactoring parts of our KVM code, and converting some to C. - Support for ARCH_HAS_SET_MEMORY, and ARCH_HAS_STRICT_MODULE_RWX on some CPUs. - Support for the Microwatt soft-core. - Optimisations to our interrupt return path on 64-bit. - Support for userspace access to the NX GZIP accelerator on PowerVM on Power10. - Enable KUAP and KUEP by default on 32-bit Book3S CPUs. - Other smaller features, fixes & cleanups. Thanks to: Andy Shevchenko, Aneesh Kumar K.V, Arnd Bergmann, Athira Rajeev, Baokun Li, Benjamin Herrenschmidt, Bharata B Rao, Christophe Leroy, Daniel Axtens, Daniel Henrique Barboza, Finn Thain, Geoff Levand, Haren Myneni, Jason Wang, Jiapeng Chong, Joel Stanley, Jordan Niethe, Kajol Jain, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Nick Desaulniers, Paul Mackerras, Russell Currey, Sathvika Vasireddy, Shaokun Zhang, Stephen Rothwell, Sudeep Holla, Suraj Jitindar Singh, Tom Rix, Vaibhav Jain, YueHaibing, Zhang Jianhua, and Zhen Lei. * tag 'powerpc-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (218 commits) powerpc: Only build restart_table.c for 64s powerpc/64s: move ret_from_fork etc above __end_soft_masked powerpc/64s/interrupt: clean up interrupt return labels powerpc/64/interrupt: add missing kprobe annotations on interrupt exit symbols powerpc/64: enable MSR[EE] in irq replay pt_regs powerpc/64s/interrupt: preserve regs->softe for NMI interrupts powerpc/64s: add a table of implicit soft-masked addresses powerpc/64e: remove implicit soft-masking and interrupt exit restart logic powerpc/64e: fix CONFIG_RELOCATABLE build warnings powerpc/64s: fix hash page fault interrupt handler powerpc/4xx: Fix setup_kuep() on SMP powerpc/32s: Fix setup_{kuap/kuep}() on SMP powerpc/interrupt: Use names in check_return_regs_valid() powerpc/interrupt: Also use exit_must_hard_disable() on PPC32 powerpc/sysfs: Replace sizeof(arr)/sizeof(arr[0]) with ARRAY_SIZE powerpc/ptrace: Refactor regs_set_return_{msr/ip} powerpc/ptrace: Move set_return_regs_changed() before regs_set_return_{msr/ip} powerpc/stacktrace: Fix spurious "stale" traces in raise_backtrace_ipi() powerpc/pseries/vas: Include irqdomain.h powerpc: mark local variables around longjmp as volatile ... |
||
|
71bd934101 |
Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton: "190 patches. Subsystems affected by this patch series: mm (hugetlb, userfaultfd, vmscan, kconfig, proc, z3fold, zbud, ras, mempolicy, memblock, migration, thp, nommu, kconfig, madvise, memory-hotplug, zswap, zsmalloc, zram, cleanups, kfence, and hmm), procfs, sysctl, misc, core-kernel, lib, lz4, checkpatch, init, kprobes, nilfs2, hfs, signals, exec, kcov, selftests, compress/decompress, and ipc" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (190 commits) ipc/util.c: use binary search for max_idx ipc/sem.c: use READ_ONCE()/WRITE_ONCE() for use_global_lock ipc: use kmalloc for msg_queue and shmid_kernel ipc sem: use kvmalloc for sem_undo allocation lib/decompressors: remove set but not used variabled 'level' selftests/vm/pkeys: exercise x86 XSAVE init state selftests/vm/pkeys: refill shadow register after implicit kernel write selftests/vm/pkeys: handle negative sys_pkey_alloc() return code selftests/vm/pkeys: fix alloc_random_pkey() to make it really, really random kcov: add __no_sanitize_coverage to fix noinstr for all architectures exec: remove checks in __register_bimfmt() x86: signal: don't do sas_ss_reset() until we are certain that sigframe won't be abandoned hfsplus: report create_date to kstat.btime hfsplus: remove unnecessary oom message nilfs2: remove redundant continue statement in a while-loop kprobes: remove duplicated strong free_insn_page in x86 and s390 init: print out unknown kernel parameters checkpatch: do not complain about positive return values starting with EPOLL checkpatch: improve the indented label test checkpatch: scripts/spdxcheck.py now requires python3 ... |
||
|
3e1493f463 |
sched/uclamp: Ignore max aggregation if rq is idle
When a task wakes up on an idle rq, uclamp_rq_util_with() would max
aggregate with rq value. But since there is no task enqueued yet, the
values are stale based on the last task that was running. When the new
task actually wakes up and enqueued, then the rq uclamp values should
reflect that of the newly woken up task effective uclamp values.
This is a problem particularly for uclamp_max because it default to
1024. If a task p with uclamp_max = 512 wakes up, then max aggregation
would ignore the capping that should apply when this task is enqueued,
which is wrong.
Fix that by ignoring max aggregation if the rq is idle since in that
case the effective uclamp value of the rq will be the ones of the task
that will wake up.
Fixes:
|
||
|
72d0ad7cb5 |
sched/fair: Fix CFS bandwidth hrtimer expiry type
The time remaining until expiry of the refresh_timer can be negative. Casting the type to an unsigned 64-bit value will cause integer underflow, making the runtime_refresh_within return false instead of true. These situations are rare, but they do happen. This does not cause user-facing issues or errors; other than possibly unthrottling cfs_rq's using runtime from the previous period(s), making the CFS bandwidth enforcement less strict in those (special) situations. Signed-off-by: Odin Ugedal <odin@uged.al> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Ben Segall <bsegall@google.com> Link: https://lore.kernel.org/r/20210629121452.18429-1-odin@uged.al |
||
|
ceb6ba45dc |
sched/fair: Sync load_sum with load_avg after dequeue
commit |
||
|
3dbdb38e28 |
Merge branch 'for-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo: - cgroup.kill is added which implements atomic killing of the whole subtree. Down the line, this should be able to replace the multiple userland implementations of "keep killing till empty". - PSI can now be turned off at boot time to avoid overhead for configurations which don't care about PSI. * 'for-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: make per-cgroup pressure stall tracking configurable cgroup: Fix kernel-doc cgroup: inline cgroup_task_freeze() tests/cgroup: test cgroup.kill tests/cgroup: move cg_wait_for(), cg_prepare_for_wait() tests/cgroup: use cgroup.kill in cg_killall() docs/cgroup: add entry for cgroup.kill cgroup: introduce cgroup.kill |
||
|
911a2997a5 |
\n
-----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmDcl7AACgkQnJ2qBz9k QNnsBQf+LBAPsfykQ/f8EdHErO1lfbVTmwf2g/JzTkjrIVZTZ6Ic47aCIiFxgHU2 Js9ufaPxpsbbopzpn2PAoCUzxNsZDqgXtnC03MOUAqoSFbAvgLHz2sQwjqeYJUGQ P6n7VipEA/qBVpQI5zeCUhHYcahoNrRjSLzaFnE2Z8CrQYQ6Ry9gVEhduvu2OTru 62cWlAWlTJfx/FcR1Y0F/ZznnNSKMiAHcEe3F6Beztplg2ooq+z6FclJYrkmnxMq SXSOsqTCdi1/oFx36NpvLkykrIS9I7N/iqCnKwbm6X+nyZZKyAwYZhWVqkbozPPu +u1Ppq8o0IuWwEA6/UAmxgAO3m/Gkw== =tn0h -----END PGP SIGNATURE----- Merge tag 'fs_for_v5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull misc fs updates from Jan Kara: "The new quotactl_fd() syscall (remake of quotactl_path() syscall that got introduced & disabled in 5.13 cycle), and couple of udf, reiserfs, isofs, and writeback fixes and cleanups" * tag 'fs_for_v5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: writeback: fix obtain a reference to a freeing memcg css quota: remove unnecessary oom message isofs: remove redundant continue statement quota: Wire up quotactl_fd syscall quota: Change quotactl_path() systcall to an fd-based one reiserfs: Remove unneed check in reiserfs_write_full_page() udf: Fix NULL pointer dereference in udf_symlink function reiserfs: add check for invalid 1st journal block |
||
|
4030a6e6a6 |
tracing: Resize tgid_map to pid_max, not PID_MAX_DEFAULT
Currently tgid_map is sized at PID_MAX_DEFAULT entries, which means that
on systems where pid_max is configured higher than PID_MAX_DEFAULT the
ftrace record-tgid option doesn't work so well. Any tasks with PIDs
higher than PID_MAX_DEFAULT are simply not recorded in tgid_map, and
don't show up in the saved_tgids file.
In particular since systemd v243 & above configure pid_max to its
highest possible 1<<22 value by default on 64 bit systems this renders
the record-tgids option of little use.
Increase the size of tgid_map to the configured pid_max instead,
allowing it to cover the full range of PIDs up to the maximum value of
PID_MAX_LIMIT if the system is configured that way.
On 64 bit systems with pid_max == PID_MAX_LIMIT this will increase the
size of tgid_map from 256KiB to 16MiB. Whilst this 64x increase in
memory overhead sounds significant 64 bit systems are presumably best
placed to accommodate it, and since tgid_map is only allocated when the
record-tgid option is actually used presumably the user would rather it
spends sufficient memory to actually record the tgids they expect.
The size of tgid_map could also increase for CONFIG_BASE_SMALL=y
configurations, but these seem unlikely to be systems upon which people
are both configuring a large pid_max and running ftrace with record-tgid
anyway.
Of note is that we only allocate tgid_map once, the first time that the
record-tgid option is enabled. Therefore its size is only set once, to
the value of pid_max at the time the record-tgid option is first
enabled. If a user increases pid_max after that point, the saved_tgids
file will not contain entries for any tasks with pids beyond the earlier
value of pid_max.
Link: https://lkml.kernel.org/r/20210701172407.889626-2-paulburton@google.com
Fixes:
|
||
|
97c885d585 |
x86: signal: don't do sas_ss_reset() until we are certain that sigframe won't be abandoned
Currently we handle SS_AUTODISARM as soon as we have stored the altstack settings into sigframe - that's the point when we have set the things up for eventual sigreturn to restore the old settings. And if we manage to set the sigframe up (we are not done with that yet), everything's fine. However, in case of failure we end up with sigframe-to-be abandoned and SIGSEGV force-delivered. And in that case we end up with inconsistent rules - late failures have altstack reset, early ones do not. It's trivial to get consistent behaviour - just handle SS_AUTODISARM once we have set the sigframe up and are committed to entering the handler, i.e. in signal_delivered(). Link: https://lore.kernel.org/lkml/20200404170604.GN23230@ZenIV.linux.org.uk/ Link: https://github.com/ClangBuiltLinux/linux/issues/876 Link: https://lkml.kernel.org/r/20210422230846.1756380-1-ndesaulniers@google.com Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
66ce75144d |
kprobes: remove duplicated strong free_insn_page in x86 and s390
free_insn_page() in x86 and s390 is same with the common weak function in kernel/kprobes.c. Plus, the comment "Recover page to RW mode before releasing it" in x86 seems insensible to be there since resetting mapping is done by common code in vfree() of module_memfree(). So drop these two duplicated strong functions and related comment, then mark the common one in kernel/kprobes.c strong. Link: https://lkml.kernel.org/r/20210608065736.32656-1-song.bao.hua@hisilicon.com Signed-off-by: Barry Song <song.bao.hua@hisilicon.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Qi Liu <liuqi115@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
f39650de68 |
kernel.h: split out panic and oops helpers
kernel.h is being used as a dump for all kinds of stuff for a long time. Here is the attempt to start cleaning it up by splitting out panic and oops helpers. There are several purposes of doing this: - dropping dependency in bug.h - dropping a loop by moving out panic_notifier.h - unload kernel.h from something which has its own domain At the same time convert users tree-wide to use new headers, although for the time being include new header back to kernel.h to avoid twisted indirected includes for existing users. [akpm@linux-foundation.org: thread_info.h needs limits.h] [andriy.shevchenko@linux.intel.com: ia64 fix] Link: https://lkml.kernel.org/r/20210520130557.55277-1-andriy.shevchenko@linux.intel.com Link: https://lkml.kernel.org/r/20210511074137.33666-1-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Co-developed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Corey Minyard <cminyard@mvista.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Sebastian Reichel <sre@kernel.org> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Acked-by: Stephen Boyd <sboyd@kernel.org> Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Acked-by: Helge Deller <deller@gmx.de> # parisc Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
9a52c5f3c8 |
sysctl: remove redundant assignment to first
Variable first is set to '0', but this value is never read as it is not used later on, hence it is a redundant assignment and can be removed. Clean up the following clang-analyzer warning: kernel/sysctl.c:1562:4: warning: Value stored to 'first' is never read [clang-analyzer-deadcode.DeadStores]. Link: https://lkml.kernel.org/r/1620469990-22182-1-git-send-email-jiapeng.chong@linux.alibaba.com Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Reported-by: Abaci Robot <abaci@linux.alibaba.com> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
0fc4dcc13f |
bpf, devmap: Convert remaining READ_ONCE() to rcu_dereference_check()
There were a couple of READ_ONCE()-invocations left-over by the devmap
RCU conversion. Convert these to rcu_dereference_check() as well to avoid
complaints from sparse.
Fixes:
|
||
|
1eb5dde674 |
cpufreq: CPPC: Add support for frequency invariance
The Frequency Invariance Engine (FIE) is providing a frequency scaling correction factor that helps achieve more accurate load-tracking. Normally, this scaling factor can be obtained directly with the help of the cpufreq drivers as they know the exact frequency the hardware is running at. But that isn't the case for CPPC cpufreq driver. Another way of obtaining that is using the arch specific counter support, which is already present in kernel, but that hardware is optional for platforms. This patch updates the CPPC driver to register itself with the topology core to provide its own implementation (cppc_scale_freq_tick()) of topology_scale_freq_tick() which gets called by the scheduler on every tick. Note that the arch specific counters have higher priority than CPPC counters, if available, though the CPPC driver doesn't need to have any special handling for that. On an invocation of cppc_scale_freq_tick(), we schedule an irq work (since we reach here from hard-irq context), which then schedules a normal work item and cppc_scale_freq_workfn() updates the per_cpu arch_freq_scale variable based on the counter updates since the last tick. To allow platforms to disable this CPPC counter-based frequency invariance support, this is all done under CONFIG_ACPI_CPPC_CPUFREQ_FIE, which is enabled by default. This also exports sched_setattr_nocheck() as the CPPC driver can be built as a module. Cc: linux-acpi@vger.kernel.org Tested-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com> Tested-by: Qian Cai <quic_qiancai@quicinc.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> |
||
|
dbe69e4337 |
Networking changes for 5.14.
Core: - BPF: - add syscall program type and libbpf support for generating instructions and bindings for in-kernel BPF loaders (BPF loaders for BPF), this is a stepping stone for signed BPF programs - infrastructure to migrate TCP child sockets from one listener to another in the same reuseport group/map to improve flexibility of service hand-off/restart - add broadcast support to XDP redirect - allow bypass of the lockless qdisc to improving performance (for pktgen: +23% with one thread, +44% with 2 threads) - add a simpler version of "DO_ONCE()" which does not require jump labels, intended for slow-path usage - virtio/vsock: introduce SOCK_SEQPACKET support - add getsocketopt to retrieve netns cookie - ip: treat lowest address of a IPv4 subnet as ordinary unicast address allowing reclaiming of precious IPv4 addresses - ipv6: use prandom_u32() for ID generation - ip: add support for more flexible field selection for hashing across multi-path routes (w/ offload to mlxsw) - icmp: add support for extended RFC 8335 PROBE (ping) - seg6: add support for SRv6 End.DT46 behavior - mptcp: - DSS checksum support (RFC 8684) to detect middlebox meddling - support Connection-time 'C' flag - time stamping support - sctp: packetization Layer Path MTU Discovery (RFC 8899) - xfrm: speed up state addition with seq set - WiFi: - hidden AP discovery on 6 GHz and other HE 6 GHz improvements - aggregation handling improvements for some drivers - minstrel improvements for no-ack frames - deferred rate control for TXQs to improve reaction times - switch from round robin to virtual time-based airtime scheduler - add trace points: - tcp checksum errors - openvswitch - action execution, upcalls - socket errors via sk_error_report Device APIs: - devlink: add rate API for hierarchical control of max egress rate of virtual devices (VFs, SFs etc.) - don't require RCU read lock to be held around BPF hooks in NAPI context - page_pool: generic buffer recycling New hardware/drivers: - mobile: - iosm: PCIe Driver for Intel M.2 Modem - support for Qualcomm MSM8998 (ipa) - WiFi: Qualcomm QCN9074 and WCN6855 PCI devices - sparx5: Microchip SparX-5 family of Enterprise Ethernet switches - Mellanox BlueField Gigabit Ethernet (control NIC of the DPU) - NXP SJA1110 Automotive Ethernet 10-port switch - Qualcomm QCA8327 switch support (qca8k) - Mikrotik 10/25G NIC (atl1c) Driver changes: - ACPI support for some MDIO, MAC and PHY devices from Marvell and NXP (our first foray into MAC/PHY description via ACPI) - HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx - Mellanox/Nvidia NIC (mlx5) - NIC VF offload of L2 bridging - support IRQ distribution to Sub-functions - Marvell (prestera): - add flower and match all - devlink trap - link aggregation - Netronome (nfp): connection tracking offload - Intel 1GE (igc): add AF_XDP support - Marvell DPU (octeontx2): ingress ratelimit offload - Google vNIC (gve): new ring/descriptor format support - Qualcomm mobile (rmnet & ipa): inline checksum offload support - MediaTek WiFi (mt76) - mt7915 MSI support - mt7915 Tx status reporting - mt7915 thermal sensors support - mt7921 decapsulation offload - mt7921 enable runtime pm and deep sleep - Realtek WiFi (rtw88) - beacon filter support - Tx antenna path diversity support - firmware crash information via devcoredump - Qualcomm 60GHz WiFi (wcn36xx) - Wake-on-WLAN support with magic packets and GTK rekeying - Micrel PHY (ksz886x/ksz8081): add cable test support Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmDb+fUACgkQMUZtbf5S Irs2Jg//aqN0Q8CgIvYCVhPxQw1tY7pTAbgyqgBZ01vwjyvtIOgJiWzSfFEU84mX M8fcpFX5eTKrOyJ9S6UFfQ/JG114n3hjAxFFT4Hxk2gC1Tg0vHuFQTDHcUl28bUE mTm61e1YpdorILnv2k5JVQ/wu0vs5QKDrjcYcrcPnh+j93wvnPOgAfDBV95nZzjS OTt4q2fR8GzLcSYWWsclMbDNkzyTG50RW/0Yd6aGjr5QGvXfrMeXfUJNz533PMf/ w5lNyjRKv+x9mdTZJzU0+msNUrZgUdRz7W8Ey8lD3hJZRE+D6/uU7FtsE8Mi3+uc HWxeZUyzA3YF1MfVl/eesbxyPT7S/OkLzk4O5B35FbqP0YltaP+bOjq1/nM3ce1/ io9Dx9pIl/2JANUgRCAtLi8Z2dkvRoqTaBxZ/nPudCCljFwDwl6joTMJ7Ow22i5Y 5aIkcXFmZq4LbJDiHvbTlqT7yiuaEvu2UK/23bSIg/K3nF4eAmkY9Y1EgiMf60OF 78Ttw0wk2tUegwaS5MZnCniKBKDyl9gM2F6rbZ/IxQRR2LTXFc1B6gC+ynUxgXfh Ub8O++6qGYGYZ0XvQH4pzco79p3qQWBTK5beIp2eu6BOAjBVIXq4AibUfoQLACsu hX7jMPYd0kc3WFgUnKgQP8EnjFSwbf4XiaE7fIXvWBY8hzCw2h4= =LvtX -----END PGP SIGNATURE----- Merge tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core: - BPF: - add syscall program type and libbpf support for generating instructions and bindings for in-kernel BPF loaders (BPF loaders for BPF), this is a stepping stone for signed BPF programs - infrastructure to migrate TCP child sockets from one listener to another in the same reuseport group/map to improve flexibility of service hand-off/restart - add broadcast support to XDP redirect - allow bypass of the lockless qdisc to improving performance (for pktgen: +23% with one thread, +44% with 2 threads) - add a simpler version of "DO_ONCE()" which does not require jump labels, intended for slow-path usage - virtio/vsock: introduce SOCK_SEQPACKET support - add getsocketopt to retrieve netns cookie - ip: treat lowest address of a IPv4 subnet as ordinary unicast address allowing reclaiming of precious IPv4 addresses - ipv6: use prandom_u32() for ID generation - ip: add support for more flexible field selection for hashing across multi-path routes (w/ offload to mlxsw) - icmp: add support for extended RFC 8335 PROBE (ping) - seg6: add support for SRv6 End.DT46 behavior - mptcp: - DSS checksum support (RFC 8684) to detect middlebox meddling - support Connection-time 'C' flag - time stamping support - sctp: packetization Layer Path MTU Discovery (RFC 8899) - xfrm: speed up state addition with seq set - WiFi: - hidden AP discovery on 6 GHz and other HE 6 GHz improvements - aggregation handling improvements for some drivers - minstrel improvements for no-ack frames - deferred rate control for TXQs to improve reaction times - switch from round robin to virtual time-based airtime scheduler - add trace points: - tcp checksum errors - openvswitch - action execution, upcalls - socket errors via sk_error_report Device APIs: - devlink: add rate API for hierarchical control of max egress rate of virtual devices (VFs, SFs etc.) - don't require RCU read lock to be held around BPF hooks in NAPI context - page_pool: generic buffer recycling New hardware/drivers: - mobile: - iosm: PCIe Driver for Intel M.2 Modem - support for Qualcomm MSM8998 (ipa) - WiFi: Qualcomm QCN9074 and WCN6855 PCI devices - sparx5: Microchip SparX-5 family of Enterprise Ethernet switches - Mellanox BlueField Gigabit Ethernet (control NIC of the DPU) - NXP SJA1110 Automotive Ethernet 10-port switch - Qualcomm QCA8327 switch support (qca8k) - Mikrotik 10/25G NIC (atl1c) Driver changes: - ACPI support for some MDIO, MAC and PHY devices from Marvell and NXP (our first foray into MAC/PHY description via ACPI) - HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx - Mellanox/Nvidia NIC (mlx5) - NIC VF offload of L2 bridging - support IRQ distribution to Sub-functions - Marvell (prestera): - add flower and match all - devlink trap - link aggregation - Netronome (nfp): connection tracking offload - Intel 1GE (igc): add AF_XDP support - Marvell DPU (octeontx2): ingress ratelimit offload - Google vNIC (gve): new ring/descriptor format support - Qualcomm mobile (rmnet & ipa): inline checksum offload support - MediaTek WiFi (mt76) - mt7915 MSI support - mt7915 Tx status reporting - mt7915 thermal sensors support - mt7921 decapsulation offload - mt7921 enable runtime pm and deep sleep - Realtek WiFi (rtw88) - beacon filter support - Tx antenna path diversity support - firmware crash information via devcoredump - Qualcomm WiFi (wcn36xx) - Wake-on-WLAN support with magic packets and GTK rekeying - Micrel PHY (ksz886x/ksz8081): add cable test support" * tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2168 commits) tcp: change ICSK_CA_PRIV_SIZE definition tcp_yeah: check struct yeah size at compile time gve: DQO: Fix off by one in gve_rx_dqo() stmmac: intel: set PCI_D3hot in suspend stmmac: intel: Enable PHY WOL option in EHL net: stmmac: option to enable PHY WOL with PMT enabled net: say "local" instead of "static" addresses in ndo_dflt_fdb_{add,del} net: use netdev_info in ndo_dflt_fdb_{add,del} ptp: Set lookup cookie when creating a PTP PPS source. net: sock: add trace for socket errors net: sock: introduce sk_error_report net: dsa: replay the local bridge FDB entries pointing to the bridge dev too net: dsa: ensure during dsa_fdb_offload_notify that dev_hold and dev_put are on the same dev net: dsa: include fdb entries pointing to bridge in the host fdb list net: dsa: include bridge addresses which are local in the host fdb list net: dsa: sync static FDB entries on foreign interfaces to hardware net: dsa: install the host MDB and FDB entries in the master's RX filter net: dsa: reference count the FDB addresses at the cross-chip notifier level net: dsa: introduce a separate cross-chip notifier type for host FDBs net: dsa: reference count the MDB entries at the cross-chip notifier level ... |
||
|
a6eaf3850c |
- Fix a small inconsistency (bug) in load tracking, caught by a
new warning that several people reported. - Flip CONFIG_SCHED_CORE to default-disabled, and update the Kconfig help text. Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmDcSeURHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1g0ixAApwxCWCyNWKEKzg/I3FsR3RYfErpIUgnf gwQRtXNKZBOt/q8yWYw0VXG2ob17NgUdiC3p9IaxPrmCmB8tuojcsuPssHj5UktQ GWFoJdUwqQrM68Pl0centbN3xeKPcp7tHXK0jfmpbzolJxSFnx4/bQVPdhjlmWTM CoSTx4QCJoHYsNhZXj7aHQczKUMKBZ2/SD74+c/2Ft3jWkFmwdynzMdySKJfEuVX u/6PiFQDK0/5Qsic3a6pWmXZHcA0Q6HUwKKAaJcqI1OcAQyPuzwxY5qwbzvq5fRM ZhaJ7T5rJFQ8u6KndV5wv4+Xqgv6teAZBaPVP93cbnfeyVzWaiYL6/SC+xJnNDT+ fNpE41FINHq/NfUa6sId84lIpcQvAZzy3tpxS3VI/WXqDymFsG9BuoX9V2fm1uQP O3oa+B/PkP+VVezDS8qxZ2xixpcsRW97UVFtaj2P0QFOeb+qrOhPuSoCjNREkFmu CFWW+qnbiVaxQ3rtaXaHWhEtAhOMPbE4frYtbS4pjEMgv0K2E/mvDivHwrVX23II WkEQxUbzdp/APppBlhtFpcxKNTnNsV5Uk0VPhTu1bEOpKAC7wA+dmyPEMG0MLBUc 88DizVnofzKTax8PZnuabWpITcS+TS1qzBRmn1W1SajEh66iJltCSafWZ3JxytDO meqBYVd0ubQ= =Jgwc -----END PGP SIGNATURE----- Merge tag 'sched-urgent-2021-06-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: - Fix a small inconsistency (bug) in load tracking, caught by a new warning that several people reported. - Flip CONFIG_SCHED_CORE to default-disabled, and update the Kconfig help text. * tag 'sched-urgent-2021-06-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/core: Disable CONFIG_SCHED_CORE by default sched/fair: Ensure _sum and _avg values stay consistent |
||
|
290fe0fa6f |
audit/stable-5.14 PR 20210629
-----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmDbjZ4UHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXMMMRAAgwwYgJ8pqac21co0rLCEMLdnSKvu ueaqCa0F/c+UJ9VmHZ0eDr4FHFgCxKamUP1+/Zn4U53tc483mX/Kz6blGd91hz1K YnYJGK/tlOmhOJtaZOsp0saRrYn/62kO5tyzp+mhNKje5ON4ICA5jR3Wz+B+RDSM GXH8EDuxKVkAuqpTZHm79a8LL2G9uDre2uWWHehDxLNAOuLxK/3DnObHRdEi4ZcV 9MSFljLnSocFdLOPRL2R05LQ2ce18NHkTW+FMzPbTezLP4lm62lEsAYOQ1vvUm0A MoncR4bFGMPYbM5WQmwiDDBBVoo6+HB5Q0AK6fbZwaIJYNFCCc0ytKKyfSyyIaRK 8qSaGeXpUU4nHeygeHvhz7uOVFJfG+WXWRM2WIOwUWU0KKbvCSJ3UDM3cuJHIQol jv2yuYIZQcz/tXTsOn59ivjsxXclsb3d1Z6FmligaoAoASCerCgeGkhDC4CwHYg5 qZHxE/rtiZxzgZ15cghxITOcGzOVEeYqbo9FzD5aXh66MGzOkrtYwJ+CCVTSx27O uco6PKelVI2EJ9v4KeQtIMOFDU/ZyJrde/q2otfqQoSaR12Jarpi0qL/5ossuR2B bEqNN6sX22++QfvLZOAJ8EmYjzkmDfs4q4LYQxMpQfw27OTcK17rXcnI//mmRbUr SrCuVW3mF8FzxJc= =1bAZ -----END PGP SIGNATURE----- Merge tag 'audit-pr-20210629' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit updates from Paul Moore: "Another merge window, another small audit pull request. Four patches in total: one is cosmetic, one removes an unnecessary initialization, one renames some enum values to prevent name collisions, and one converts list_del()/list_add() to list_move(). None of these are earth shattering and all pass the audit-testsuite tests while merging cleanly on top of your tree from earlier today" * tag 'audit-pr-20210629' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: remove unnecessary 'ret' initialization audit: remove trailing spaces and tabs audit: Use list_move instead of list_del/list_add audit: Rename enum audit_state constants to avoid AUDIT_DISABLED redefinition audit: add blank line after variable declarations |
||
|
44b6ed4cfa |
Clang feature updates for v5.14-rc1
- Add CC_HAS_NO_PROFILE_FN_ATTR in preparation for PGO support in the face of the noinstr attribute, paving the way for PGO and fixing GCOV. (Nick Desaulniers) - x86_64 LTO coverage is expanded to 32-bit x86. (Nathan Chancellor) - Small fixes to CFI. (Mark Rutland, Nathan Chancellor) -----BEGIN PGP SIGNATURE----- iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmDbiFYWHGtlZXNjb29r QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJtd7D/9O7KE4M1O38TumCK9e6djPETb6 CHF5dpxnV5w1ZWgBysy8+nZ0ORWAm05rgF65K4ROBUhdrygEElIIkI88a/F9pDyE 99E0WTgQi4x4pFFJHF1Sj2G6YoCqrvFpZ45fMd8xk3y/sykhKO4k2A2ux1cHH1zh yYkzASDdukpr/xfcu1JCSFyjRU3Yk9aRzpg0PtrcMSDDuCYqg+oL91rxtkdXc6wS FbVSkUiFQq+RZk9h6DaiVDen/rPvo4rqgQYbdVM8s94gMaHA4MiMiQE6cKkClfdp zacqqh9Cjaeyievz6jkVSqFtmO7e231E6kAWg/ebqVjs6WIcS3NVEfGGjCEaCuMq qKy/m30YzpJ0jLbbQ9L/Cm3xu5ZqfSaQBQmBjNcBMkeMQN8o/P6qt6UASZfBXXCs ++MUpNQEJqxCyZdwu/6qlzfKUiGo5AJo7RRes5/shqTXQLLBni4j7vtkSYZsfPYr b1nHk6TnyY7PjcMekG/IWU89pMchEDswGxSGlrqoop1kT3zumzJeZdPAB8sdNjI8 aBb120qLIC8n9ybZZsNliNtK4IHerBOxDDJB40EEbtBCPowZDEUt/z/DQrKjbOv4 viOulu1D8f/MDXVBx2aTXGpMo/jQf7bKRITtpzt1eFWSTZzqCqWLfGRq2myjz0t5 f2x1rpJLC2oV4KNCYw== =IhVh -----END PGP SIGNATURE----- Merge tag 'clang-features-v5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull clang feature updates from Kees Cook: - Add CC_HAS_NO_PROFILE_FN_ATTR in preparation for PGO support in the face of the noinstr attribute, paving the way for PGO and fixing GCOV. (Nick Desaulniers) - x86_64 LTO coverage is expanded to 32-bit x86. (Nathan Chancellor) - Small fixes to CFI. (Mark Rutland, Nathan Chancellor) * tag 'clang-features-v5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: qemu_fw_cfg: Make fw_cfg_rev_attr a proper kobj_attribute Kconfig: Introduce ARCH_WANTS_NO_INSTR and CC_HAS_NO_PROFILE_FN_ATTR compiler_attributes.h: cleanups for GCC 4.9+ compiler_attributes.h: define __no_profile, add to noinstr x86, lto: Enable Clang LTO for 32-bit as well CFI: Move function_nocfi() into compiler.h MAINTAINERS: Add Clang CFI section |
||
|
df668a5fe4 |
for-5.14/block-2021-06-29
-----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmDbXAwQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpr0HEADDJaSgjpnWQwH1RVLNagJa9KnktxZYsEs+ as3QmDdpKRG3rEC9bdE7FLe/xq3WBaO5j1hTQ9P6IguqLyS1Df72DtTlKyaCrZoe zv9eIlY4lZUfksE2nzWmlN9uG0FBVXeEQpHCLSNbUZeK1zvV6+NNhQqw2kc0sEqu hReUFeMUbsMcu/w5T3XMVJNsTMCql9wta2H0q5hONQyJQSrIwa1D+sUdE5I8fO4j bnoYX9yxHX26EztX1UJiGRgoq5Trz7LY7hAfljKSkewpFwiHE2vBdq2L0C2RKsIV tTs2DjMCMQyPNeA7WAG8HlR4aPG+7+/fuBP1KJHkykjWXglWN7OqISuBv6rrBgQs gNRnZ4qmb1CzD6aLEBk59nHt6po6eMxXIW856YktKy8rKcrgK29qP44Z+oomkPKo ZjQ0wqN5CvpObM/dIKxl9bAJ4zQDHBt49d5nTTQLfWl/mgevu6ZNWD/hONyCQmFy zKKqQ/wkxWHutOsjC5/MKNb3ZRNH9tt9X+HfULO2DU6IqqifYw/ex4z4MVsBopJC 7pPfd81kgC73TgXe1AaCwHqNWsrqYCuTK0ew1CtGudlS3lucMwtap4GBiCgg5gbu M8pEgwO4OcCLHyRUc8zdfqI7HumbprbFmojPkwGSEe0ofVD74lMhzbUj5jvTYY2B t8D2XcgyOA== =lhon -----END PGP SIGNATURE----- Merge tag 'for-5.14/block-2021-06-29' of git://git.kernel.dk/linux-block Pull core block updates from Jens Axboe: - disk events cleanup (Christoph) - gendisk and request queue allocation simplifications (Christoph) - bdev_disk_changed cleanups (Christoph) - IO priority improvements (Bart) - Chained bio completion trace fix (Edward) - blk-wbt fixes (Jan) - blk-wbt enable/disable fix (Zhang) - Scheduler dispatch improvements (Jan, Ming) - Shared tagset scheduler improvements (John) - BFQ updates (Paolo, Luca, Pietro) - BFQ lock inversion fix (Jan) - Documentation improvements (Kir) - CLONE_IO block cgroup fix (Tejun) - Remove of ancient and deprecated block dump feature (zhangyi) - Discard merge fix (Ming) - Misc fixes or followup fixes (Colin, Damien, Dan, Long, Max, Thomas, Yang) * tag 'for-5.14/block-2021-06-29' of git://git.kernel.dk/linux-block: (129 commits) block: fix discard request merge block/mq-deadline: Remove a WARN_ON_ONCE() call blk-mq: update hctx->dispatch_busy in case of real scheduler blk: Fix lock inversion between ioc lock and bfqd lock bfq: Remove merged request already in bfq_requests_merged() block: pass a gendisk to bdev_disk_changed block: move bdev_disk_changed block: add the events* attributes to disk_attrs block: move the disk events code to a separate file block: fix trace completion for chained bio block/partitions/msdos: Fix typo inidicator -> indicator block, bfq: reset waker pointer with shared queues block, bfq: check waker only for queues with no in-flight I/O block, bfq: avoid delayed merge of async queues block, bfq: boost throughput by extending queue-merging times block, bfq: consider also creation time in delayed stable merge block, bfq: fix delayed stable merge check block, bfq: let also stably merged queues enjoy weight raising blk-wbt: make sure throttle is enabled properly blk-wbt: introduce a new disable state to prevent false positive by rwb_enabled() ... |
||
|
b81b3e959a |
tracing: Simplify & fix saved_tgids logic
The tgid_map array records a mapping from pid to tgid, where the index
of an entry within the array is the pid & the value stored at that index
is the tgid.
The saved_tgids_next() function iterates over pointers into the tgid_map
array & dereferences the pointers which results in the tgid, but then it
passes that dereferenced value to trace_find_tgid() which treats it as a
pid & does a further lookup within the tgid_map array. It seems likely
that the intent here was to skip over entries in tgid_map for which the
recorded tgid is zero, but instead we end up skipping over entries for
which the thread group leader hasn't yet had its own tgid recorded in
tgid_map.
A minimal fix would be to remove the call to trace_find_tgid, turning:
if (trace_find_tgid(*ptr))
into:
if (*ptr)
..but it seems like this logic can be much simpler if we simply let
seq_read() iterate over the whole tgid_map array & filter out empty
entries by returning SEQ_SKIP from saved_tgids_show(). Here we take that
approach, removing the incorrect logic here entirely.
Link: https://lkml.kernel.org/r/20210630003406.4013668-1-paulburton@google.com
Fixes:
|
||
|
bfbf8d157a |
tracing: Change variable type as bool for clean-up
The wakeup_rt wakeup_dl, tracing_dl is only set to 0, 1. So changing type of wakeup_rt wakeup_dl, tracing_dl as bool makes relevant routine be more readable. Link: https://lkml.kernel.org/r/20210629140548.GA1627@raspberrypi Signed-off-by: Austin Kim <austin.kim@lge.com> [ Removed unneeded initialization of static bool tracing_dl ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> |
||
|
a22a5cb81e |
Merge branch 'sched/core' into sched/urgent, to pick up fix
Pick up a fix for a warning that several people reported. Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
65090f30ab |
Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton: "191 patches. Subsystems affected by this patch series: kthread, ia64, scripts, ntfs, squashfs, ocfs2, kernel/watchdog, and mm (gup, pagealloc, slab, slub, kmemleak, dax, debug, pagecache, gup, swap, memcg, pagemap, mprotect, bootmem, dma, tracing, vmalloc, kasan, initialization, pagealloc, and memory-failure)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (191 commits) mm,hwpoison: make get_hwpoison_page() call get_any_page() mm,hwpoison: send SIGBUS with error virutal address mm/page_alloc: split pcp->high across all online CPUs for cpuless nodes mm/page_alloc: allow high-order pages to be stored on the per-cpu lists mm: replace CONFIG_FLAT_NODE_MEM_MAP with CONFIG_FLATMEM mm: replace CONFIG_NEED_MULTIPLE_NODES with CONFIG_NUMA docs: remove description of DISCONTIGMEM arch, mm: remove stale mentions of DISCONIGMEM mm: remove CONFIG_DISCONTIGMEM m68k: remove support for DISCONTIGMEM arc: remove support for DISCONTIGMEM arc: update comment about HIGHMEM implementation alpha: remove DISCONTIGMEM and NUMA mm/page_alloc: move free_the_page mm/page_alloc: fix counting of managed_pages mm/page_alloc: improve memmap_pages dbg msg mm: drop SECTION_SHIFT in code comments mm/page_alloc: introduce vm.percpu_pagelist_high_fraction mm/page_alloc: limit the number of pages on PCP lists when reclaim is active mm/page_alloc: scale the number of pages that are batch freed ... |
||
|
b6df00789e |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Trivial conflict in net/netfilter/nf_tables_api.c. Duplicate fix in tools/testing/selftests/net/devlink_port_split.py - take the net-next version. skmsg, and L4 bpf - keep the bpf code but remove the flags and err params. Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
6a82f42a2e |
trace/timerlat: Fix indentation on timerlat_main()
Dan Carpenter reported that: The patch |
||
|
19c3eaa722 |
trace/osnoise: Make 'noise' variable s64 in run_osnoise()
Dab Carpenter reported that: The patch |
||
|
3563f55ce6 |
Power management updates for 5.14-rc1
- Make intel_pstate support hybrid processors using abstract performance units in the HWP interface (Rafael Wysocki). - Add Icelake servers and Cometlake support in no-HWP mode to intel_pstate (Giovanni Gherdovich). - Make cpufreq_online() error path be consistent with the CPU device removal path in cpufreq (Rafael Wysocki). - Clean up 3 cpufreq drivers and the statistics code (Hailong Liu, Randy Dunlap, Shaokun Zhang). - Make intel_idle use special idle state parameters for C6 when package C-states are disabled (Chen Yu). - Rework the TEO (timer events oriented) cpuidle governor to address some theoretical shortcomings in it (Rafael Wysocki). - Drop unneeded semicolon from the TEO governor (Wan Jiabing). - Modify the runtime PM framework to accept unassigned suspend and resume callback pointers (Ulf Hansson). - Improve pm_runtime_get_sync() documentation (Krzysztof Kozlowski). - Improve device performance states support in the generic power domains (genpd) framework (Ulf Hansson). - Fix some documentation issues in genpd (Yang Yingliang). - Make the operating performance points (OPP) framework use the required-opps DT property in use cases that are not related to genpd (Hsin-Yi Wang). - Make lazy_link_required_opp_table() use list_del_init instead of list_del/INIT_LIST_HEAD (Yang Yingliang). - Simplify wake IRQs handling in the core system-wide sleep support code and clean up some coding style inconsistencies in it (Tian Tao, Zhen Lei). - Add cooling support to the tegra30 devfreq driver and improve its DT bindings (Dmitry Osipenko). - Fix some assorted issues in the devfreq core and drivers (Chanwoo Choi, Dong Aisheng, YueHaibing). -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmDbaFwSHHJqd0Byand5 c29ja2kubmV0AAoJEILEb/54YlRxR/gP/inFFjdwWpq3r6XD5P4XeVvum/MEjqQs rDUDHXgEEZmWGL9CpQ0PxhXyvc7SSqtNLakECTXxrOccfMbo0NKQtjCt8eMO1XMu nrAcp68hr/Rg2TaenRC3j0+oGXPzOw5/Mg5Y9subRymdI3o5HyoJNjBkU+LlkdGs HC7k8zWPqKQaEoFgjOpYPuXKf2bvEm2jIh4dzmtCRWXBUOxDgN0z6Kckhs59xrU+ +CLP/W4XMDrWSYdd2zjPV2IBNsqfePFchZ+t2CMQwYycI+KJr2s8tLbAFnQXfxLz WRqxpZKvMUPthKsK2/vgnCQkQKhGq39NmlHdqRdJm8uivCPx1Q2uuHRYF9782M+o cWuO60VvtUax0RIk1prP2l6JBBU/3Hvln7uf4cBnIeh/3QZKKygIgnNI1YwaqXSq zP4EWY00kKNmKwRUZAkDR9ZavXHiHvtoytT44XU/NxS+YXh6nMLC34CeuDUQaqni JvniXQyZCIWecZhwbOpW+FmAXMBCyvqXarDM0Zw4coWoyFLN7Y8ow9C5T5EWcgQ+ pKyGhS698HiePPJrwDOtOqzfPqxcgXEWUqwmxTeD8MRDSMlamzPnRJh+wxlrIdv9 2c8SAOSD7xvRlQQcsOpEVcKVkjsWDy7tvK6/O2CtoBsUpZOXtKIKJhr7ixnLN3Ej XHX/voFVDIao =mgLK -----END PGP SIGNATURE----- Merge tag 'pm-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These add hybrid processors support to the intel_pstate driver and make it work with more processor models when HWP is disabled, make the intel_idle driver use special C6 idle state paremeters when package C-states are disabled, add cooling support to the tegra30 devfreq driver, rework the TEO (timer events oriented) cpuidle governor, extend the OPP (operating performance points) framework to use the required-opps DT property in more cases, fix some issues and clean up a number of assorted pieces of code. Specifics: - Make intel_pstate support hybrid processors using abstract performance units in the HWP interface (Rafael Wysocki). - Add Icelake servers and Cometlake support in no-HWP mode to intel_pstate (Giovanni Gherdovich). - Make cpufreq_online() error path be consistent with the CPU device removal path in cpufreq (Rafael Wysocki). - Clean up 3 cpufreq drivers and the statistics code (Hailong Liu, Randy Dunlap, Shaokun Zhang). - Make intel_idle use special idle state parameters for C6 when package C-states are disabled (Chen Yu). - Rework the TEO (timer events oriented) cpuidle governor to address some theoretical shortcomings in it (Rafael Wysocki). - Drop unneeded semicolon from the TEO governor (Wan Jiabing). - Modify the runtime PM framework to accept unassigned suspend and resume callback pointers (Ulf Hansson). - Improve pm_runtime_get_sync() documentation (Krzysztof Kozlowski). - Improve device performance states support in the generic power domains (genpd) framework (Ulf Hansson). - Fix some documentation issues in genpd (Yang Yingliang). - Make the operating performance points (OPP) framework use the required-opps DT property in use cases that are not related to genpd (Hsin-Yi Wang). - Make lazy_link_required_opp_table() use list_del_init instead of list_del/INIT_LIST_HEAD (Yang Yingliang). - Simplify wake IRQs handling in the core system-wide sleep support code and clean up some coding style inconsistencies in it (Tian Tao, Zhen Lei). - Add cooling support to the tegra30 devfreq driver and improve its DT bindings (Dmitry Osipenko). - Fix some assorted issues in the devfreq core and drivers (Chanwoo Choi, Dong Aisheng, YueHaibing)" * tag 'pm-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (39 commits) PM / devfreq: passive: Fix get_target_freq when not using required-opp cpufreq: Make cpufreq_online() call driver->offline() on errors opp: Allow required-opps to be used for non genpd use cases cpuidle: teo: remove unneeded semicolon in teo_select() dt-bindings: devfreq: tegra30-actmon: Add cooling-cells dt-bindings: devfreq: tegra30-actmon: Convert to schema PM / devfreq: userspace: Use DEVICE_ATTR_RW macro PM: runtime: Clarify documentation when callbacks are unassigned PM: runtime: Allow unassigned ->runtime_suspend|resume callbacks PM: runtime: Improve path in rpm_idle() when no callback PM: hibernate: remove leading spaces before tabs PM: sleep: remove trailing spaces and tabs PM: domains: Drop/restore performance state votes for devices at runtime PM PM: domains: Return early if perf state is already set for the device PM: domains: Split code in dev_pm_genpd_set_performance_state() cpuidle: teo: Use kerneldoc documentation in admin-guide cpuidle: teo: Rework most recent idle duration values treatment cpuidle: teo: Change the main idle state selection logic cpuidle: teo: Cosmetic modification of teo_select() cpuidle: teo: Cosmetic modifications of teo_update() ... |