158405e888
7247 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
|
8cb1ae19bf |
x86/fpu updates:
- Cleanup of extable fixup handling to be more robust, which in turn allows to make the FPU exception fixups more robust as well. - Change the return code for signal frame related failures from explicit error codes to a boolean fail/success as that's all what the calling code evaluates. - A large refactoring of the FPU code to prepare for adding AMX support: - Distangle the public header maze and remove especially the misnomed kitchen sink internal.h which is despite it's name included all over the place. - Add a proper abstraction for the register buffer storage (struct fpstate) which allows to dynamically size the buffer at runtime by flipping the pointer to the buffer container from the default container which is embedded in task_struct::tread::fpu to a dynamically allocated container with a larger register buffer. - Convert the code over to the new fpstate mechanism. - Consolidate the KVM FPU handling by moving the FPU related code into the FPU core which removes the number of exports and avoids adding even more export when AMX has to be supported in KVM. This also removes duplicated code which was of course unnecessary different and incomplete in the KVM copy. - Simplify the KVM FPU buffer handling by utilizing the new fpstate container and just switching the buffer pointer from the user space buffer to the KVM guest buffer when entering vcpu_run() and flipping it back when leaving the function. This cuts the memory requirements of a vCPU for FPU buffers in half and avoids pointless memory copy operations. This also solves the so far unresolved problem of adding AMX support because the current FPU buffer handling of KVM inflicted a circular dependency between adding AMX support to the core and to KVM. With the new scheme of switching fpstate AMX support can be added to the core code without affecting KVM. - Replace various variables with proper data structures so the extra information required for adding dynamically enabled FPU features (AMX) can be added in one place - Add AMX (Advanved Matrix eXtensions) support (finally): AMX is a large XSTATE component which is going to be available with Saphire Rapids XEON CPUs. The feature comes with an extra MSR (MSR_XFD) which allows to trap the (first) use of an AMX related instruction, which has two benefits: 1) It allows the kernel to control access to the feature 2) It allows the kernel to dynamically allocate the large register state buffer instead of burdening every task with the the extra 8K or larger state storage. It would have been great to gain this kind of control already with AVX512. The support comes with the following infrastructure components: 1) arch_prctl() to - read the supported features (equivalent to XGETBV(0)) - read the permitted features for a task - request permission for a dynamically enabled feature Permission is granted per process, inherited on fork() and cleared on exec(). The permission policy of the kernel is restricted to sigaltstack size validation, but the syscall obviously allows further restrictions via seccomp etc. 2) A stronger sigaltstack size validation for sys_sigaltstack(2) which takes granted permissions and the potentially resulting larger signal frame into account. This mechanism can also be used to enforce factual sigaltstack validation independent of dynamic features to help with finding potential victims of the 2K sigaltstack size constant which is broken since AVX512 support was added. 3) Exception handling for #NM traps to catch first use of a extended feature via a new cause MSR. If the exception was caused by the use of such a feature, the handler checks permission for that feature. If permission has not been granted, the handler sends a SIGILL like the #UD handler would do if the feature would have been disabled in XCR0. If permission has been granted, then a new fpstate which fits the larger buffer requirement is allocated. In the unlikely case that this allocation fails, the handler sends SIGSEGV to the task. That's not elegant, but unavoidable as the other discussed options of preallocation or full per task permissions come with their own set of horrors for kernel and/or userspace. So this is the lesser of the evils and SIGSEGV caused by unexpected memory allocation failures is not a fundamentally new concept either. When allocation succeeds, the fpstate properties are filled in to reflect the extended feature set and the resulting sizes, the fpu::fpstate pointer is updated accordingly and the trap is disarmed for this task permanently. 4) Enumeration and size calculations 5) Trap switching via MSR_XFD The XFD (eXtended Feature Disable) MSR is context switched with the same life time rules as the FPU register state itself. The mechanism is keyed off with a static key which is default disabled so !AMX equipped CPUs have zero overhead. On AMX enabled CPUs the overhead is limited by comparing the tasks XFD value with a per CPU shadow variable to avoid redundant MSR writes. In case of switching from a AMX using task to a non AMX using task or vice versa, the extra MSR write is obviously inevitable. All other places which need to be aware of the variable feature sets and resulting variable sizes are not affected at all because they retrieve the information (feature set, sizes) unconditonally from the fpstate properties. 6) Enable the new AMX states Note, this is relatively new code despite the fact that AMX support is in the works for more than a year now. The big refactoring of the FPU code, which allowed to do a proper integration has been started exactly 3 weeks ago. Refactoring of the existing FPU code and of the original AMX patches took a week and has been subject to extensive review and testing. The only fallout which has not been caught in review and testing right away was restricted to AMX enabled systems, which is completely irrelevant for anyone outside Intel and their early access program. There might be dragons lurking as usual, but so far the fine grained refactoring has held up and eventual yet undetected fallout is bisectable and should be easily addressable before the 5.16 release. Famous last words... Many thanks to Chang Bae and Dave Hansen for working hard on this and also to the various test teams at Intel who reserved extra capacity to follow the rapid development of this closely which provides the confidence level required to offer this rather large update for inclusion into 5.16-rc1. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/NkITHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYodDkEADH4+/nN/QoSUHIuuha5Zptj3g2b16a /3TxT9fhwPen/kzMGsUk70s3iWJMA+I5dCfkSZexJ2hfhcRe9cBzZIa1HCawKwf3 YCISTsO/M+LpeORuZ+TpfFLJKnxNr1SEOl+EYffGhq0AkCjifb9Cnr0JZuoMUzGU jpfJZ2bj28ri5lG812DtzSMBM9E3SAwgJv+GNjmZbxZKb9mAfhbAMdBUXHirX7Ej jmx6koQjYOKwYIW8w1BrdC270lUKQUyJTbQgdRkN9Mh/HnKyFixQ18JqGlgaV2cT EtYePUfTEdaHdAhUINLIlEug1MfOslHU+HyGsdywnoChNB4GHPQuePC5Tz60VeFN RbQ9aKcBUu8r95rjlnKtAtBijNMA4bjGwllVxNwJ/ZoA9RPv1SbDZ07RX3qTaLVY YhVQl8+shD33/W24jUTJv1kMMexpHXIlv0gyfMryzpwI7uzzmGHRPAokJdbYKctC dyMPfdE90rxTiMUdL/1IQGhnh3awjbyfArzUhHyQ++HyUyzCFh0slsO0CD18vUy8 FofhCugGBhjuKw3XwLNQ+KsWURz5qHctSzBc3qMOSyqFHbAJCVRANkhsFvWJo2qL 75+Z7OTRebtsyOUZIdq26r4roSxHrps3dupWTtN70HWx2NhQG1nLEw986QYiQu1T hcKvDmehQLrUvg== =x3WL -----END PGP SIGNATURE----- Merge tag 'x86-fpu-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fpu updates from Thomas Gleixner: - Cleanup of extable fixup handling to be more robust, which in turn allows to make the FPU exception fixups more robust as well. - Change the return code for signal frame related failures from explicit error codes to a boolean fail/success as that's all what the calling code evaluates. - A large refactoring of the FPU code to prepare for adding AMX support: - Distangle the public header maze and remove especially the misnomed kitchen sink internal.h which is despite it's name included all over the place. - Add a proper abstraction for the register buffer storage (struct fpstate) which allows to dynamically size the buffer at runtime by flipping the pointer to the buffer container from the default container which is embedded in task_struct::tread::fpu to a dynamically allocated container with a larger register buffer. - Convert the code over to the new fpstate mechanism. - Consolidate the KVM FPU handling by moving the FPU related code into the FPU core which removes the number of exports and avoids adding even more export when AMX has to be supported in KVM. This also removes duplicated code which was of course unnecessary different and incomplete in the KVM copy. - Simplify the KVM FPU buffer handling by utilizing the new fpstate container and just switching the buffer pointer from the user space buffer to the KVM guest buffer when entering vcpu_run() and flipping it back when leaving the function. This cuts the memory requirements of a vCPU for FPU buffers in half and avoids pointless memory copy operations. This also solves the so far unresolved problem of adding AMX support because the current FPU buffer handling of KVM inflicted a circular dependency between adding AMX support to the core and to KVM. With the new scheme of switching fpstate AMX support can be added to the core code without affecting KVM. - Replace various variables with proper data structures so the extra information required for adding dynamically enabled FPU features (AMX) can be added in one place - Add AMX (Advanced Matrix eXtensions) support (finally): AMX is a large XSTATE component which is going to be available with Saphire Rapids XEON CPUs. The feature comes with an extra MSR (MSR_XFD) which allows to trap the (first) use of an AMX related instruction, which has two benefits: 1) It allows the kernel to control access to the feature 2) It allows the kernel to dynamically allocate the large register state buffer instead of burdening every task with the the extra 8K or larger state storage. It would have been great to gain this kind of control already with AVX512. The support comes with the following infrastructure components: 1) arch_prctl() to - read the supported features (equivalent to XGETBV(0)) - read the permitted features for a task - request permission for a dynamically enabled feature Permission is granted per process, inherited on fork() and cleared on exec(). The permission policy of the kernel is restricted to sigaltstack size validation, but the syscall obviously allows further restrictions via seccomp etc. 2) A stronger sigaltstack size validation for sys_sigaltstack(2) which takes granted permissions and the potentially resulting larger signal frame into account. This mechanism can also be used to enforce factual sigaltstack validation independent of dynamic features to help with finding potential victims of the 2K sigaltstack size constant which is broken since AVX512 support was added. 3) Exception handling for #NM traps to catch first use of a extended feature via a new cause MSR. If the exception was caused by the use of such a feature, the handler checks permission for that feature. If permission has not been granted, the handler sends a SIGILL like the #UD handler would do if the feature would have been disabled in XCR0. If permission has been granted, then a new fpstate which fits the larger buffer requirement is allocated. In the unlikely case that this allocation fails, the handler sends SIGSEGV to the task. That's not elegant, but unavoidable as the other discussed options of preallocation or full per task permissions come with their own set of horrors for kernel and/or userspace. So this is the lesser of the evils and SIGSEGV caused by unexpected memory allocation failures is not a fundamentally new concept either. When allocation succeeds, the fpstate properties are filled in to reflect the extended feature set and the resulting sizes, the fpu::fpstate pointer is updated accordingly and the trap is disarmed for this task permanently. 4) Enumeration and size calculations 5) Trap switching via MSR_XFD The XFD (eXtended Feature Disable) MSR is context switched with the same life time rules as the FPU register state itself. The mechanism is keyed off with a static key which is default disabled so !AMX equipped CPUs have zero overhead. On AMX enabled CPUs the overhead is limited by comparing the tasks XFD value with a per CPU shadow variable to avoid redundant MSR writes. In case of switching from a AMX using task to a non AMX using task or vice versa, the extra MSR write is obviously inevitable. All other places which need to be aware of the variable feature sets and resulting variable sizes are not affected at all because they retrieve the information (feature set, sizes) unconditonally from the fpstate properties. 6) Enable the new AMX states Note, this is relatively new code despite the fact that AMX support is in the works for more than a year now. The big refactoring of the FPU code, which allowed to do a proper integration has been started exactly 3 weeks ago. Refactoring of the existing FPU code and of the original AMX patches took a week and has been subject to extensive review and testing. The only fallout which has not been caught in review and testing right away was restricted to AMX enabled systems, which is completely irrelevant for anyone outside Intel and their early access program. There might be dragons lurking as usual, but so far the fine grained refactoring has held up and eventual yet undetected fallout is bisectable and should be easily addressable before the 5.16 release. Famous last words... Many thanks to Chang Bae and Dave Hansen for working hard on this and also to the various test teams at Intel who reserved extra capacity to follow the rapid development of this closely which provides the confidence level required to offer this rather large update for inclusion into 5.16-rc1 * tag 'x86-fpu-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (110 commits) Documentation/x86: Add documentation for using dynamic XSTATE features x86/fpu: Include vmalloc.h for vzalloc() selftests/x86/amx: Add context switch test selftests/x86/amx: Add test cases for AMX state management x86/fpu/amx: Enable the AMX feature in 64-bit mode x86/fpu: Add XFD handling for dynamic states x86/fpu: Calculate the default sizes independently x86/fpu/amx: Define AMX state components and have it used for boot-time checks x86/fpu/xstate: Prepare XSAVE feature table for gaps in state component numbers x86/fpu/xstate: Add fpstate_realloc()/free() x86/fpu/xstate: Add XFD #NM handler x86/fpu: Update XFD state where required x86/fpu: Add sanity checks for XFD x86/fpu: Add XFD state to fpstate x86/msr-index: Add MSRs for XFD x86/cpufeatures: Add eXtended Feature Disabling (XFD) feature bit x86/fpu: Reset permission and fpstate on exec() x86/fpu: Prepare fpu_clone() for dynamically enabled features x86/fpu/signal: Prepare for variable sigframe length x86/signal: Use fpu::__state_user_size for sigalt stack validation ... |
||
|
9a7e0a90a4 |
Scheduler updates:
- Revert the printk format based wchan() symbol resolution as it can leak the raw value in case that the symbol is not resolvable. - Make wchan() more robust and work with all kind of unwinders by enforcing that the task stays blocked while unwinding is in progress. - Prevent sched_fork() from accessing an invalid sched_task_group - Improve asymmetric packing logic - Extend scheduler statistics to RT and DL scheduling classes and add statistics for bandwith burst to the SCHED_FAIR class. - Properly account SCHED_IDLE entities - Prevent a potential deadlock when initial priority is assigned to a newly created kthread. A recent change to plug a race between cpuset and __sched_setscheduler() introduced a new lock dependency which is now triggered. Break the lock dependency chain by moving the priority assignment to the thread function. - Fix the idle time reporting in /proc/uptime for NOHZ enabled systems. - Improve idle balancing in general and especially for NOHZ enabled systems. - Provide proper interfaces for live patching so it does not have to fiddle with scheduler internals. - Add cluster aware scheduling support. - A small set of tweaks for RT (irqwork, wait_task_inactive(), various scheduler options and delaying mmdrop) - The usual small tweaks and improvements all over the place -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/OUkTHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoR/5D/9ikdGNpKg9osNqJ3GjAmxsK6kVkB29 iFe2k8pIpWDToWQf/wQRGih4Yj3Cl49QSnZcPIibh2/12EB1qrrW6iSPJkInz8Ec /1LS5/Vewn2OyoxyXZjdvGC5gTXEodSbIazASvX7nvdMeI4gsAsL5etzrMJirT/t aymqvr7zovvywrwMTQJrGjUMo9l4ewE8tafMNNhRu1BHU1U4ojM9yvThyRAAcmp7 3Xy49A+Yq3IgrvYI4u8FMK5Zh08KaxSFjiLhePGm/bF+wSfYmWop2TP1jY05W2Uo ti8hfbJMUoFRYuMxAiEldkItnc0wV4M9PtWZZ/x+B71bs65Y4Zjt9cW+rxJv2+m1 vzV31EsQwGnOti072dzWN4c/cZqngVXAjaNtErvDwJUr+Tw1ayv9KUvuodMQqZY6 mu68bFUO2kV9EMe1CBOv51Uy1RGHyLj3rlNqrkw+Xp5ISE9Ad2vhUEiRp5bQx5Ci V/XFhGZkGUluh0vccrdFlNYZwhj8cZEzkOPCnPSeZ+bq8SyZE6xuHH/lTP1CJCOy s800rW1huM+kgV+zRN8adDkGXibAk9N3RtVGnQXmuEy8gB9LZmQg+JeM2wsc9B+6 i0gdqZnsjNAfoK+BBAG4holxptSL8/eOJsFH8ZNIoxQ+iqooyPx9tFX7yXnRTBQj d2qWG7UvoseT+g== =fgtS -----END PGP SIGNATURE----- Merge tag 'sched-core-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Thomas Gleixner: - Revert the printk format based wchan() symbol resolution as it can leak the raw value in case that the symbol is not resolvable. - Make wchan() more robust and work with all kind of unwinders by enforcing that the task stays blocked while unwinding is in progress. - Prevent sched_fork() from accessing an invalid sched_task_group - Improve asymmetric packing logic - Extend scheduler statistics to RT and DL scheduling classes and add statistics for bandwith burst to the SCHED_FAIR class. - Properly account SCHED_IDLE entities - Prevent a potential deadlock when initial priority is assigned to a newly created kthread. A recent change to plug a race between cpuset and __sched_setscheduler() introduced a new lock dependency which is now triggered. Break the lock dependency chain by moving the priority assignment to the thread function. - Fix the idle time reporting in /proc/uptime for NOHZ enabled systems. - Improve idle balancing in general and especially for NOHZ enabled systems. - Provide proper interfaces for live patching so it does not have to fiddle with scheduler internals. - Add cluster aware scheduling support. - A small set of tweaks for RT (irqwork, wait_task_inactive(), various scheduler options and delaying mmdrop) - The usual small tweaks and improvements all over the place * tag 'sched-core-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (69 commits) sched/fair: Cleanup newidle_balance sched/fair: Remove sysctl_sched_migration_cost condition sched/fair: Wait before decaying max_newidle_lb_cost sched/fair: Skip update_blocked_averages if we are defering load balance sched/fair: Account update_blocked_averages in newidle_balance cost x86: Fix __get_wchan() for !STACKTRACE sched,x86: Fix L2 cache mask sched/core: Remove rq_relock() sched: Improve wake_up_all_idle_cpus() take #2 irq_work: Also rcuwait for !IRQ_WORK_HARD_IRQ on PREEMPT_RT irq_work: Handle some irq_work in a per-CPU thread on PREEMPT_RT irq_work: Allow irq_work_sync() to sleep if irq_work() no IRQ support. sched/rt: Annotate the RT balancing logic irqwork as IRQ_WORK_HARD_IRQ sched: Add cluster scheduler level for x86 sched: Add cluster scheduler level in core and related Kconfig for ARM64 topology: Represent clusters of CPUs within a die sched: Disable -Wunused-but-set-variable sched: Add wrapper for get_wchan() to keep task blocked x86: Fix get_wchan() to support the ORC unwinder proc: Use task_is_running() for wchan in /proc/$pid/stat ... |
||
|
595b28fb0c |
Locking updates:
- Move futex code into kernel/futex/ and split up the kitchen sink into seperate files to make integration of sys_futex_waitv() simpler. - Add a new sys_futex_waitv() syscall which allows to wait on multiple futexes. The main use case is emulating Windows' WaitForMultipleObjects which allows Wine to improve the performance of Windows Games. Also native Linux games can benefit from this interface as this is a common wait pattern for this kind of applications. - Add context to ww_mutex_trylock() to provide a path for i915 to rework their eviction code step by step without making lockdep upset until the final steps of rework are completed. It's also useful for regulator and TTM to avoid dropping locks in the non contended path. - Lockdep and might_sleep() cleanups and improvements - A few improvements for the RT substitutions. - The usual small improvements and cleanups. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/FTITHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoVNZD/9vIm3Bu1Coz8tbNXz58AiCYq9Y/vp5 mzFgSzz+VJTkW5Vh8jo5Uel4rCKZyt+rL276EoaRPzYl8KFtWDbpK3qd3PrXKqTX At49JO4ttAMJUHIBQ6vblEkykmfEd9YPU1uSWk5roJ+s7Jmr5VWnu0FEWHP00As5 tWOca/TM0ei9kof26V2fl5aecTGII4i4Zsvy+LPsXtI+TnmP0gSBcGAS/5UnZTtJ vQRWTR3ojoYvh5iTmNqbaURYoQLe2j8yscn1DSW1CABWVmP12eDWs+N7jRP4b5S9 73xOv5P7vpva41wxrK2ir5iNkpsLE97VL2JOHTW8nm7orblfiuxHLTCkTjEdd2pO h8blI2IBizEB3JYn2BMkOAaZQOSjN8hd6Ye/b2B4AMEGWeXEoEv6eVy/orYKCluQ XDqGn47Vce/SYmo5vfTB8VMt6nANx8PKvOP3IvjHInYEQBgiT6QrlUw3RRkXBp5s clQkjYYwjAMVIXowcCrdhoKjMROzi6STShVwHwGL8MaZXqr8Vl6BUO9ckU0pY+4C F000Hzwxi8lGEQ9k+P+BnYOEzH5osCty8lloKiQ/7ciX6T+CZHGJPGK/iY4YL8P5 C3CJWMsHCqST7DodNFJmdfZt99UfIMmEhshMDduU9AAH0tHCn8vOu0U6WvCtpyBp BvHj68zteAtlYg== =RZ4x -----END PGP SIGNATURE----- Merge tag 'locking-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Thomas Gleixner: - Move futex code into kernel/futex/ and split up the kitchen sink into seperate files to make integration of sys_futex_waitv() simpler. - Add a new sys_futex_waitv() syscall which allows to wait on multiple futexes. The main use case is emulating Windows' WaitForMultipleObjects which allows Wine to improve the performance of Windows Games. Also native Linux games can benefit from this interface as this is a common wait pattern for this kind of applications. - Add context to ww_mutex_trylock() to provide a path for i915 to rework their eviction code step by step without making lockdep upset until the final steps of rework are completed. It's also useful for regulator and TTM to avoid dropping locks in the non contended path. - Lockdep and might_sleep() cleanups and improvements - A few improvements for the RT substitutions. - The usual small improvements and cleanups. * tag 'locking-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits) locking: Remove spin_lock_flags() etc locking/rwsem: Fix comments about reader optimistic lock stealing conditions locking: Remove rcu_read_{,un}lock() for preempt_{dis,en}able() locking/rwsem: Disable preemption for spinning region docs: futex: Fix kernel-doc references futex: Fix PREEMPT_RT build futex2: Documentation: Document sys_futex_waitv() uAPI selftests: futex: Test sys_futex_waitv() wouldblock selftests: futex: Test sys_futex_waitv() timeout selftests: futex: Add sys_futex_waitv() test futex,arm: Wire up sys_futex_waitv() futex,x86: Wire up sys_futex_waitv() futex: Implement sys_futex_waitv() futex: Simplify double_lock_hb() futex: Split out wait/wake futex: Split out requeue futex: Rename mark_wake_futex() futex: Rename: match_futex() futex: Rename: hb_waiter_{inc,dec,pending}() futex: Split out PI futex ... |
||
|
9c7516d669 |
tools/testing/selftests/vm/split_huge_page_test.c: fix application of sizeof to pointer
The coccinelle check report: ./tools/testing/selftests/vm/split_huge_page_test.c:344:36-42: ERROR: application of sizeof to pointer Use "strlen" to fix it. Link: https://lkml.kernel.org/r/20211012030116.184027-1-davidcomponentone@gmail.com Signed-off-by: David Yang <davidcomponentone@gmail.com> Reported-by: Zeal Robot <zealci@zte.com.cn> Cc: Zi Yan <ziy@nvidia.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
101c669d16 |
selftests/x86/amx: Add context switch test
XSAVE state is thread-local. The kernel switches between thread state at context switch time. Generally, running a selftest for a while will naturally expose it to some context switching and and will test the XSAVE code. Instead of just hoping that the tests get context-switched at random times, force context-switches on purpose. Spawn off a few userspace threads and force context-switches between them. Ensure that the kernel correctly context switches each thread's unique AMX state. [ dhansen: bunches of cleanups ] Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211026122525.6EFD5758@davehans-spike.ostc.intel.com |
||
|
6a3e0651b4 |
selftests/x86/amx: Add test cases for AMX state management
AMX TILEDATA is a very large XSAVE feature. It could have caused nasty XSAVE buffer space waste in two places: * Signal stacks * Kernel task_struct->fpu buffers To avoid this waste, neither of these buffers have AMX state by default. The non-default features are called "dynamic" features. There is an arch_prctl(ARCH_REQ_XCOMP_PERM) which allows a task to declare that it wants to use AMX or other "dynamic" XSAVE features. This arch_prctl() ensures that sufficient sigaltstack space is available before it will succeed. It also expands the task_struct buffer. Functions of this test: * Test arch_prctl(ARCH_REQ_XCOMP_PERM). Ensure that it checks for proper sigaltstack sizing and that the sizing is enforced for future sigaltstack calls. * Ensure that ARCH_REQ_XCOMP_PERM is inherited across fork() * Ensure that TILEDATA use before the prctl() is fatal * Ensure that TILEDATA is cleared across fork() Note: Generally, compiler support is needed to do something with AMX. Instead, directly load AMX state from userspace with a plain XSAVE. Do not depend on the compiler. [ dhansen: bunches of cleanups ] Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211026122524.7BEDAA95@davehans-spike.ostc.intel.com |
||
|
440ffcdd9d |
Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says: ==================== pull-request: bpf 2021-10-26 We've added 12 non-merge commits during the last 7 day(s) which contain a total of 23 files changed, 118 insertions(+), 98 deletions(-). The main changes are: 1) Fix potential race window in BPF tail call compatibility check, from Toke Høiland-Jørgensen. 2) Fix memory leak in cgroup fs due to missing cgroup_bpf_offline(), from Quanyang Wang. 3) Fix file descriptor reference counting in generic_map_update_batch(), from Xu Kuohai. 4) Fix bpf_jit_limit knob to the max supported limit by the arch's JIT, from Lorenz Bauer. 5) Fix BPF sockmap ->poll callbacks for UDP and AF_UNIX sockets, from Cong Wang and Yucong Sun. 6) Fix BPF sockmap concurrency issue in TCP on non-blocking sendmsg calls, from Liu Jian. 7) Fix build failure of INODE_STORAGE and TASK_STORAGE maps on !CONFIG_NET, from Tejun Heo. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Fix potential race in tail call compatibility check bpf: Move BPF_MAP_TYPE for INODE_STORAGE and TASK_STORAGE outside of CONFIG_NET selftests/bpf: Use recv_timeout() instead of retries net: Implement ->sock_is_readable() for UDP and AF_UNIX skmsg: Extract and reuse sk_msg_is_readable() net: Rename ->stream_memory_read to ->sock_is_readable tcp_bpf: Fix one concurrency problem in the tcp_bpf_send_verdict function cgroup: Fix memory leak caused by missing cgroup_bpf_offline bpf: Fix error usage of map_fd and fdget() in generic_map_update_batch() bpf: Prevent increasing bpf_jit_limit above max bpf: Define bpf_jit_alloc_exec_limit for arm64 JIT bpf: Define bpf_jit_alloc_exec_limit for riscv JIT ==================== Link: https://lore.kernel.org/r/20211026201920.11296-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
67b821502d |
selftests/bpf: Use recv_timeout() instead of retries
We use non-blocking sockets in those tests, retrying for EAGAIN is ugly because there is no upper bound for the packet arrival time, at least in theory. After we fix poll() on sockmap sockets, now we can switch to select()+recv(). Signed-off-by: Yucong Sun <sunyucong@gmail.com> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211008203306.37525-5-xiyou.wangcong@gmail.com |
||
|
1f83b835a3 |
fcnal-test: kill hanging ping/nettest binaries on cleanup
On my box I see a bunch of ping/nettest processes hanging around after fcntal-test.sh is done. Clean those up before netns deletion. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20211021140247.29691-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
6c2c712767 |
Networking fixes for 5.15-rc7, including fixes from netfilter, and can.
Current release - regressions: - revert "vrf: reset skb conntrack connection on VRF rcv", there are valid uses for previous behavior - can: m_can: fix iomap_read_fifo() and iomap_write_fifo() Current release - new code bugs: - mlx5: e-switch, return correct error code on group creation failure Previous releases - regressions: - sctp: fix transport encap_port update in sctp_vtag_verify - stmmac: fix E2E delay mechanism (in PTP timestamping) Previous releases - always broken: - netfilter: ip6t_rt: fix out-of-bounds read of ipv6_rt_hdr - netfilter: xt_IDLETIMER: fix out-of-bound read caused by lack of init - netfilter: ipvs: make global sysctl read-only in non-init netns - tcp: md5: fix selection between vrf and non-vrf keys - ipv6: count rx stats on the orig netdev when forwarding - bridge: mcast: use multicast_membership_interval for IGMPv3 - can: - j1939: fix UAF for rx_kref of j1939_priv abort sessions on receiving bad messages - isotp: fix TX buffer concurrent access in isotp_sendmsg() fix return error on FC timeout on TX path - ice: fix re-init of RDMA Tx queues and crash if RDMA was not inited - hns3: schedule the polling again when allocation fails, prevent stalls - drivers: add missing of_node_put() when aborting for_each_available_child_of_node() - ptp: fix possible memory leak and UAF in ptp_clock_register() - e1000e: fix packet loss in burst mode on Tiger Lake and later - mlx5e: ipsec: fix more checksum offload issues Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmFxgHwACgkQMUZtbf5S IrvFgw//T73aR3B2Xvz5/1rtglfmtcUqFQsyGDXGD5HnfbAbsRcz8vcQ/mTsExl7 +mJY/ZuQefsD7UQDyg3GNhbgf1+pEjHC81ryeNsfET7+JxgYLD3NEYSBYUqIFZUo gStAStGBG+ClQUaqlkGFyyf6GrqwpmxZKRr6F9fUsufQ14m9tvcT/QPcrXL4q7qX Fz644yUe/IvKnuJDHJVZsc8UXR9NTPCyCNJT9kVewwPMIMEc/xMOg5QONLZT0TlC Zk4XJIqlBBEQWrN/QwrGXm82aO+3gQZyD5K9AvpczgcBjOr6FJOmN6zkQrqNNWaC 2wPAfWi7DALPtOZR6lCxoeWfLRfdn1ZOn5x2z5xrtAXCV2FTaMg8in9TzJ57qmcb /l43QzcNGSj1ytyny8pqgdsX2MSqs0O5VSG4egMtz7TeU/rs7uAx2IVHbPT8CHop PvhVHeUeu9lGu+FUK8piQbb5aVpbA9qlOj/rXNrHDIxdA9McQgVs+tljNG4X5KtX L7BR84wNg98HtIINVx6RjYz9lOpG1qBuw5RCiqiAaN1RBY7lYAhMaAE6U3azjgC+ AIz/MacNuAz/oTuutQB6/0WZDDJhy4WEy3TrDLlpQNz6yIrpKFN+ftyF6DuVUSMH PmtZ4E/DLooQL5KwuoDdYDH1gSMlggBejeGHTFJ+RUMuvRePZQ8= =Hwqr -----END PGP SIGNATURE----- Merge tag 'net-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from netfilter, and can. We'll have one more fix for a socket accounting regression, it's still getting polished. Otherwise things look fine. Current release - regressions: - revert "vrf: reset skb conntrack connection on VRF rcv", there are valid uses for previous behavior - can: m_can: fix iomap_read_fifo() and iomap_write_fifo() Current release - new code bugs: - mlx5: e-switch, return correct error code on group creation failure Previous releases - regressions: - sctp: fix transport encap_port update in sctp_vtag_verify - stmmac: fix E2E delay mechanism (in PTP timestamping) Previous releases - always broken: - netfilter: ip6t_rt: fix out-of-bounds read of ipv6_rt_hdr - netfilter: xt_IDLETIMER: fix out-of-bound read caused by lack of init - netfilter: ipvs: make global sysctl read-only in non-init netns - tcp: md5: fix selection between vrf and non-vrf keys - ipv6: count rx stats on the orig netdev when forwarding - bridge: mcast: use multicast_membership_interval for IGMPv3 - can: - j1939: fix UAF for rx_kref of j1939_priv abort sessions on receiving bad messages - isotp: fix TX buffer concurrent access in isotp_sendmsg() fix return error on FC timeout on TX path - ice: fix re-init of RDMA Tx queues and crash if RDMA was not inited - hns3: schedule the polling again when allocation fails, prevent stalls - drivers: add missing of_node_put() when aborting for_each_available_child_of_node() - ptp: fix possible memory leak and UAF in ptp_clock_register() - e1000e: fix packet loss in burst mode on Tiger Lake and later - mlx5e: ipsec: fix more checksum offload issues" * tag 'net-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (75 commits) usbnet: sanity check for maxpacket net: enetc: make sure all traffic classes can send large frames net: enetc: fix ethtool counter name for PM0_TERR ptp: free 'vclock_index' in ptp_clock_release() sfc: Don't use netif_info before net_device setup sfc: Export fibre-specific supported link modes net/mlx5e: IPsec: Fix work queue entry ethernet segment checksum flags net/mlx5e: IPsec: Fix a misuse of the software parser's fields net/mlx5e: Fix vlan data lost during suspend flow net/mlx5: E-switch, Return correct error code on group creation failure net/mlx5: Lag, change multipath and bonding to be mutually exclusive ice: Add missing E810 device ids igc: Update I226_K device ID e1000e: Fix packet loss on Tiger Lake and later e1000e: Separate TGP board type from SPT ptp: Fix possible memory leak in ptp_clock_register() net: stmmac: Fix E2E delay mechanism nfc: st95hf: Make spi remove() callback return zero net: hns3: disable sriov before unload hclge layer net: hns3: fix vf reset workqueue cannot exit ... |
||
|
1439caa1d9 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter fixes for net: 1) Crash due to missing initialization of timer data in xt_IDLETIMER, from Juhee Kang. 2) NF_CONNTRACK_SECMARK should be bool in Kconfig, from Vegard Nossum. 3) Skip netdev events on netns removal, from Florian Westphal. 4) Add testcase to show port shadowing via UDP, also from Florian. 5) Remove pr_debug() code in ip6t_rt, this fixes a crash due to unsafe access to non-linear skbuff, from Xin Long. 6) Make net/ipv4/vs/debug_level read-only from non-init netns, from Antoine Tenart. 7) Remove bogus invocation to bash in selftests/netfilter/nft_flowtable.sh also from Florian. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
8913970c19 |
mm/userfaultfd: selftests: fix memory corruption with thp enabled
In RHEL's gating selftests we've encountered memory corruption in the uffd event test even with upstream kernel: # ./userfaultfd anon 128 4 nr_pages: 32768, nr_pages_per_cpu: 32768 bounces: 3, mode: rnd racing read, userfaults: 6240 missing (6240) 14729 wp (14729) bounces: 2, mode: racing read, userfaults: 1444 missing (1444) 28877 wp (28877) bounces: 1, mode: rnd read, userfaults: 6055 missing (6055) 14699 wp (14699) bounces: 0, mode: read, userfaults: 82 missing (82) 25196 wp (25196) testing uffd-wp with pagemap (pgsize=4096): done testing uffd-wp with pagemap (pgsize=2097152): done testing events (fork, remap, remove): ERROR: nr 32427 memory corruption 0 1 (errno=0, line=963) ERROR: faulting process failed (errno=0, line=1117) It can be easily reproduced when global thp enabled, which is the default for RHEL. It's also known as a side effect of commit |
||
|
d49fe5e815 |
selftests/tls: add SM4 algorithm dependency for tls selftests
Kernel TLS test has added SM4 GCM/CCM algorithm support, but SM4 algorithm is not compiled by default, this patch add SM4 config dependency. Reported-by: Hangbin Liu <liuhangbin@gmail.com> Reported-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
368a978cc5 |
Tracing fixes for 5.15:
- Fix defined but not use warning/error for osnoise function - Fix memory leak in event probe - Fix memblock leak in bootconfig - Fix the API of event probes to be like kprobes - Added test to check removal of event probe API - Fix recordmcount.pl for nds32 failed build -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYWpB6BQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qnu/AQD1eYekS43uCDyzzpvjsz0tZ6tzVH8z ainpgtcAd11q4AD8CHLvhBsEyo99Yna2Mvir6nCkafm2Y2IVGvVbnDofnAA= =yvDo -----END PGP SIGNATURE----- Merge tag 'trace-v5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Tracing fixes for 5.15: - Fix defined but not use warning/error for osnoise function - Fix memory leak in event probe - Fix memblock leak in bootconfig - Fix the API of event probes to be like kprobes - Added test to check removal of event probe API - Fix recordmcount.pl for nds32 failed build * tag 'trace-v5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: nds32/ftrace: Fix Error: invalid operands (*UND* and *UND* sections) for `^' selftests/ftrace: Update test for more eprobe removal process tracing: Fix event probe removal from dynamic events tracing: Fix missing * in comment block bootconfig: init: Fix memblock leak in xbc_make_cmdline() tracing: Fix memory leak in eprobe_register() tracing: Fix missing osnoise tracer on max_latency |
||
|
0857d6f8c7 |
ipv6: When forwarding count rx stats on the orig netdev
Commit |
||
|
64e4017778 |
selftests: net/fcnal: Test --{force,no}-bind-key-ifindex
Test that applications binding listening sockets to VRFs without specifying TCP_MD5SIG_FLAG_IFINDEX will work as expected. This would be broken if __tcp_md5_do_lookup always made a strict comparison on l3index. See this email: https://lore.kernel.org/netdev/209548b5-27d2-2059-f2e9-2148f5a0291b@gmail.com/ Applications using tcp_l3mdev_accept=1 and a single global socket (not bound to any interface) also should have a way to specify keys that are only for the default VRF, this is done by --force-bind-key-ifindex without otherwise binding to a device. Signed-off-by: Leonard Crestez <cdleonard@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
78a9cf6143 |
selftests: nettest: Add --{force,no}-bind-key-ifindex
These options allow explicit control over the TCP_MD5SIG_FLAG_IFINDEX flag instead of always setting it based on binding to an interface. Do this by converting to getopt_long because nettest has too many single-character flags already and getopt_long is widely used in selftests. Signed-off-by: Leonard Crestez <cdleonard@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
3e6ed7703d |
selftests: netfilter: remove stray bash debug line
This should not be there.
Fixes:
|
||
|
0282b0f012 |
selftests/ftrace: Update test for more eprobe removal process
The removal of eprobes was broken and missed in testing. Add various ways to remove eprobes that are considered acceptable to the testing process to catch when/if they break again. Link: https://lkml.kernel.org/r/20211013205533.836644549@goodmis.org Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> |
||
|
7b1700e009 |
selftests: net: modify IOAM tests for undef bits
The output behavior for undefined bits is now directly tested inside the bash script. Trying to set an undefined bit should be refused. The input behavior for undefined bits has been removed due to the fact that we would need another sender allowed to set undefined bits. Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
465f15a6d1 |
selftests: nft_nat: add udp hole punch test case
Add a test case that demonstrates port shadowing via UDP. ns2 sends packet to ns1, from source port used by a udp service on the router, ns0. Then, ns1 sends packet to ns0:service, but that ends up getting forwarded to ns2. Also add three test cases that demonstrate mitigations: 1. disable use of $port as source from 'unstrusted' origin 2. make the service untracked. This prevents masquerade entries from having any effects. 3. add forced PAT via 'random' mode to translate the "wrong" sport into an acceptable range. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> |
||
|
9d57f7c797 |
selftests: futex: Test sys_futex_waitv() wouldblock
Test if futex_waitv() returns -EWOULDBLOCK correctly when the expected value is different from the actual value for a waiter. Signed-off-by: André Almeida <andrealmeid@collabora.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210923171111.300673-22-andrealmeid@collabora.com |
||
|
02e56ccbae |
selftests: futex: Test sys_futex_waitv() timeout
Test if the futex_waitv timeout is working as expected, using the supported clockid options. Signed-off-by: André Almeida <andrealmeid@collabora.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210923171111.300673-21-andrealmeid@collabora.com |
||
|
5e59c1d1c7 |
selftests: futex: Add sys_futex_waitv() test
Create a new file to test the waitv mechanism. Test both private and shared futexes. Wake the last futex in the array, and check if the return value from futex_waitv() is the right index. Signed-off-by: André Almeida <andrealmeid@collabora.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210923171111.300673-20-andrealmeid@collabora.com |
||
|
1c36432b27 |
kselftests/sched: cleanup the child processes
Previously, 'make -C sched run_tests' will block forever when it occurs something wrong where the *selftests framework* is waiting for its child processes to exit. [root@iaas-rpma sched]# ./cs_prctl_test ## Create a thread/process/process group hiearchy Not a core sched system tid=74985, / tgid=74985 / pgid=74985: ffffffffffffffff Not a core sched system tid=74986, / tgid=74986 / pgid=74985: ffffffffffffffff Not a core sched system tid=74988, / tgid=74986 / pgid=74985: ffffffffffffffff Not a core sched system tid=74989, / tgid=74986 / pgid=74985: ffffffffffffffff Not a core sched system tid=74990, / tgid=74986 / pgid=74985: ffffffffffffffff Not a core sched system tid=74987, / tgid=74987 / pgid=74985: ffffffffffffffff Not a core sched system tid=74991, / tgid=74987 / pgid=74985: ffffffffffffffff Not a core sched system tid=74992, / tgid=74987 / pgid=74985: ffffffffffffffff Not a core sched system tid=74993, / tgid=74987 / pgid=74985: ffffffffffffffff Not a core sched system (268) FAILED: get_cs_cookie(0) == 0 ## Set a cookie on entire process group -1 = prctl(62, 1, 0, 2, 0) core_sched create failed -- PGID: Invalid argument (cs_prctl_test.c:272) - [root@iaas-rpma sched]# ps PID TTY TIME CMD 4605 pts/2 00:00:00 bash 74986 pts/2 00:00:00 cs_prctl_test 74987 pts/2 00:00:00 cs_prctl_test 74999 pts/2 00:00:00 ps Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Chris Hyser <chris.hyser@oracle.com> Link: https://lore.kernel.org/r/20210902024333.75983-1-lizhijian@cn.fujitsu.com |
||
|
f6274b06e3 |
linux-kselftest-fixes-5.15-rc5
This Kselftest fixes update for Linux 5.15-rc5 consists of a fix to implicit declaration warns in drivers/dma-buf test. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmFbaykACgkQCwJExA0N QxwXRhAAwGntF/dFNcKspMKTepdmKLFHr5Mopd1+OMbVLgjQAtXZs+wCoR0tm6RK EZf5D7y8IpGedFnbjWJ1t18XCRJtBW7HladC6otTmR0+1SEFdHR1n4H/pqrmSagV 5rYqwh1BgTb5Eb/9GPWKlZ0zagXGujR5UVdFPqCEMWpF6be9J989NdKGzpD4Qbkq eux3x5iaz5dsVf7iTGhzy+iSxKqIqSw5LCtSrE3PoTRbM5PS+K4I7ImafysnsYK9 kZ8zkmEzixziwoWUEJLRqFveS7cY/5l7Nd2kD9YH1WE2xBYxYQZFdj4HVWdlQePd 99UthTSfcyO6C2fplorlKkoNKuX1tlGCJyZbNjgQSHPYuVrLoau136yoSk5F+ZPg OUFsIs52d09e7vN70nh7UD9rRqihFRWwgI4EQt9nUP1mAmYm8y7bXKSFDF5hp3ay 4VbvYl50QEkllqU0sDcEuFAodGLz72T2PZNWP5DQxg5q70Hni/CWqFm+EizztRCi mv749twroD61b8Jziu8Mqhp5BqlO1tY8uu9z4GuRC4UIMhdARY2J4mD+8wm03JRO 4S28o+mH4StrrhyjngrCM9k8YYwa/n+XDrswiNJIpd/PmxVXD1KUP6z80sQE6Tcb o34yFVeABzdovryC/8VwDC9n3RDYIN6SM5XqOxvX5jckdtfwCbI= =t85S -----END PGP SIGNATURE----- Merge tag 'linux-kselftest-fixes-5.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kselftest fixes from Shuah Khan: "A fix to implicit declaration warns in drivers/dma-buf test" * tag 'linux-kselftest-fixes-5.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests: drivers/dma-buf: Fix implicit declaration warns |
||
|
b2626f1e32 |
Small x86 fixes.
-----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmFXQUoUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroMglgf/egh3zb9/+BUQWe0xWfhcINNzpsVk PJtiBmJc3nQLbZbTSLp63rouy1lNgR0s2DiMwP7G1u39OwW8W3LHMrBUSqF1F01+ gntb4GGiRTiTPJI64K4z6ytORd3tuRarHq8TUIa2zvki9ZW5Obgkm1i1RsNMOo+s AOA7whhpS8e/a5fBbtbS9bTZb30PKTZmbW4oMjvO9Sw4Eb76IauqPSEtRPSuCAc7 r7z62RTlm10Qk0JR3tW1iXMxTJHZk+tYPJ8pclUAWVX5bZqWa/9k8R0Z5i/miFiZ glW/y3R4+aUwIQV2v7V3Jx9MOKDhZxniMtnqZG/Hp9NVDtWIz37V/U37vw== =zQQ1 -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull more kvm fixes from Paolo Bonzini: "Small x86 fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: selftests: Ensure all migrations are performed when test is affined KVM: x86: Swap order of CPUID entry "index" vs. "significant flag" checks ptp: Fix ptp_kvm_getcrosststamp issue for x86 ptp_kvm x86/kvmclock: Move this_cpu_pvti into kvmclock.h selftests: KVM: Don't clobber XMM register when read KVM: VMX: Fix a TSX_CTRL_CPUID_CLEAR field mask issue |
||
|
4de593fb96 |
Networking fixes for 5.15-rc4, including fixes from mac80211, netfilter
and bpf. Current release - regressions: - bpf, cgroup: assign cgroup in cgroup_sk_alloc when called from interrupt - mdio: revert mechanical patches which broke handling of optional resources - dev_addr_list: prevent address duplication Previous releases - regressions: - sctp: break out if skb_header_pointer returns NULL in sctp_rcv_ootb (NULL deref) - Revert "mac80211: do not use low data rates for data frames with no ack flag", fixing broadcast transmissions - mac80211: fix use-after-free in CCMP/GCMP RX - netfilter: include zone id in tuple hash again, minimize collisions - netfilter: nf_tables: unlink table before deleting it (race -> UAF) - netfilter: log: work around missing softdep backend module - mptcp: don't return sockets in foreign netns - sched: flower: protect fl_walk() with rcu (race -> UAF) - ixgbe: fix NULL pointer dereference in ixgbe_xdp_setup - smsc95xx: fix stalled rx after link change - enetc: fix the incorrect clearing of IF_MODE bits - ipv4: fix rtnexthop len when RTA_FLOW is present - dsa: mv88e6xxx: 6161: use correct MAX MTU config method for this SKU - e100: fix length calculation & buffer overrun in ethtool::get_regs Previous releases - always broken: - mac80211: fix using stale frag_tail skb pointer in A-MSDU tx - mac80211: drop frames from invalid MAC address in ad-hoc mode - af_unix: fix races in sk_peer_pid and sk_peer_cred accesses (race -> UAF) - bpf, x86: Fix bpf mapping of atomic fetch implementation - bpf: handle return value of BPF_PROG_TYPE_STRUCT_OPS prog - netfilter: ip6_tables: zero-initialize fragment offset - mhi: fix error path in mhi_net_newlink - af_unix: return errno instead of NULL in unix_create1() when over the fs.file-max limit Misc: - bpf: exempt CAP_BPF from checks against bpf_jit_limit - netfilter: conntrack: make max chain length random, prevent guessing buckets by attackers - netfilter: nf_nat_masquerade: make async masq_inet6_event handling generic, defer conntrack walk to work queue (prevent hogging RTNL lock) Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmFV5KYACgkQMUZtbf5S Irs3cRAAqNsgaQXSVOXcPKsndeDDKHIv1Ktes8aOP9tgEXZw5rpsbct7g9Yxc0os Oolyt6HThjWr3u1/e3HHJO9I5Klr/J8eQReoRZnKW+6TYZflmmzfuf8u1nx6SLP/ tliz5y8wKbp8BqqNuTMdRpm+R1QQcNkXTeruUoR1PgREcY4J7bC2BRrqeZhBGHSR Z5yPOIietFN3nITxNwbe4AYJXlesMc6QCWhBtXjMPGQ4Zc4/sjDNfqi7eHJi2H2y kW2dHeXG86gnlgFllOBBWP85ptxynyxoNQJuhrxgC9T+/FpSVST7cwKbtmkwDI3M 5WGmeE6B3yfF8iOQuR8fbKQmsnLgQlYhjpbbhgN0GxzkyI7RpGYOFroX0Pht4IVZ mwprDOtvoLs4UeDjULRMB0JZfRN75PCtVlhfUkhhJxXGCCmnhGYaxG/pE+6OQWlr +n8RXYYMoOzPaHIYTS9NGSGqT0r32IUy/W5Yfv3rEzSeehy2/fxzGr2fOyBGs+q7 xrnqpsOnM8cODDwGMy3TclCI4Dd72WoHNCHPhA/bk/ZMjHpBd4CSEZPm8IROY3Ja g1t68cncgL8fB7TSD9WLFgYu67Lg5j0gC/BHOzUQDQMX5IbhOq/fj1xRy5Lc6SYp mqW1f7LdnixBe4W61VjDAYq5jJRqrwEWedx+rvV/ONLvr77KULk= =rSti -----END PGP SIGNATURE----- Merge tag 'net-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes, including fixes from mac80211, netfilter and bpf. Current release - regressions: - bpf, cgroup: assign cgroup in cgroup_sk_alloc when called from interrupt - mdio: revert mechanical patches which broke handling of optional resources - dev_addr_list: prevent address duplication Previous releases - regressions: - sctp: break out if skb_header_pointer returns NULL in sctp_rcv_ootb (NULL deref) - Revert "mac80211: do not use low data rates for data frames with no ack flag", fixing broadcast transmissions - mac80211: fix use-after-free in CCMP/GCMP RX - netfilter: include zone id in tuple hash again, minimize collisions - netfilter: nf_tables: unlink table before deleting it (race -> UAF) - netfilter: log: work around missing softdep backend module - mptcp: don't return sockets in foreign netns - sched: flower: protect fl_walk() with rcu (race -> UAF) - ixgbe: fix NULL pointer dereference in ixgbe_xdp_setup - smsc95xx: fix stalled rx after link change - enetc: fix the incorrect clearing of IF_MODE bits - ipv4: fix rtnexthop len when RTA_FLOW is present - dsa: mv88e6xxx: 6161: use correct MAX MTU config method for this SKU - e100: fix length calculation & buffer overrun in ethtool::get_regs Previous releases - always broken: - mac80211: fix using stale frag_tail skb pointer in A-MSDU tx - mac80211: drop frames from invalid MAC address in ad-hoc mode - af_unix: fix races in sk_peer_pid and sk_peer_cred accesses (race -> UAF) - bpf, x86: Fix bpf mapping of atomic fetch implementation - bpf: handle return value of BPF_PROG_TYPE_STRUCT_OPS prog - netfilter: ip6_tables: zero-initialize fragment offset - mhi: fix error path in mhi_net_newlink - af_unix: return errno instead of NULL in unix_create1() when over the fs.file-max limit Misc: - bpf: exempt CAP_BPF from checks against bpf_jit_limit - netfilter: conntrack: make max chain length random, prevent guessing buckets by attackers - netfilter: nf_nat_masquerade: make async masq_inet6_event handling generic, defer conntrack walk to work queue (prevent hogging RTNL lock)" * tag 'net-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (77 commits) af_unix: fix races in sk_peer_pid and sk_peer_cred accesses net: stmmac: fix EEE init issue when paired with EEE capable PHYs net: dev_addr_list: handle first address in __hw_addr_add_ex net: sched: flower: protect fl_walk() with rcu net: introduce and use lock_sock_fast_nested() net: phy: bcm7xxx: Fixed indirect MMD operations net: hns3: disable firmware compatible features when uninstall PF net: hns3: fix always enable rx vlan filter problem after selftest net: hns3: PF enable promisc for VF when mac table is overflow net: hns3: fix show wrong state when add existing uc mac address net: hns3: fix mixed flag HCLGE_FLAG_MQPRIO_ENABLE and HCLGE_FLAG_DCB_ENABLE net: hns3: don't rollback when destroy mqprio fail net: hns3: remove tc enable checking net: hns3: do not allow call hns3_nic_net_open repeatedly ixgbe: Fix NULL pointer dereference in ixgbe_xdp_setup net: bridge: mcast: Associate the seqcount with its protecting lock. net: mdio-ipq4019: Fix the error for an optional regs resource net: hns3: fix hclge_dbg_dump_tm_pg() stack usage net: mdio: mscc-miim: Fix the mdio controller af_unix: Return errno instead of NULL in unix_create1(). ... |
||
|
7b0035eaa7 |
KVM: selftests: Ensure all migrations are performed when test is affined
Rework the CPU selection in the migration worker to ensure the specified
number of migrations are performed when the test iteslf is affined to a
subset of CPUs. The existing logic skips iterations if the target CPU is
not in the original set of possible CPUs, which causes the test to fail
if too many iterations are skipped.
==== Test Assertion Failure ====
rseq_test.c:228: i > (NR_TASK_MIGRATIONS / 2)
pid=10127 tid=10127 errno=4 - Interrupted system call
1 0x00000000004018e5: main at rseq_test.c:227
2 0x00007fcc8fc66bf6: ?? ??:0
3 0x0000000000401959: _start at ??:?
Only performed 4 KVM_RUNs, task stalled too much?
Calculate the min/max possible CPUs as a cheap "best effort" to avoid
high runtimes when the test is affined to a small percentage of CPUs.
Alternatively, a list or xarray of the possible CPUs could be used, but
even in a horrendously inefficient setup, such optimizations are not
needed because the runtime is completely dominated by the cost of
migrating the task, and the absolute runtime is well under a minute in
even truly absurd setups, e.g. running on a subset of vCPUs in a VM that
is heavily overcommited (16 vCPUs per pCPU).
Fixes:
|
||
|
e02c16b9cd |
selftests: KVM: Don't clobber XMM register when read
There is no need to clobber a register that is only being read from. Oops. Drop the XMM register from the clobbers list. Signed-off-by: Oliver Upton <oupton@google.com> Message-Id: <20210927223621.50178-1-oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
||
|
4ccb9f03fe |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says: ==================== pull-request: bpf 2021-09-28 The following pull-request contains BPF updates for your *net* tree. We've added 10 non-merge commits during the last 14 day(s) which contain a total of 11 files changed, 139 insertions(+), 53 deletions(-). The main changes are: 1) Fix MIPS JIT jump code emission for too large offsets, from Piotr Krysiuk. 2) Fix x86 JIT atomic/fetch emission when dst reg maps to rax, from Johan Almbladh. 3) Fix cgroup_sk_alloc corner case when called from interrupt, from Daniel Borkmann. 4) Fix segfault in libbpf's linker for objects without BTF, from Kumar Kartikeya Dwivedi. 5) Fix bpf_jit_charge_modmem for applications with CAP_BPF, from Lorenz Bauer. 6) Fix return value handling for struct_ops BPF programs, from Hou Tao. 7) Various fixes to BPF selftests, from Jiri Benc. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> , |
||
|
79e2c30666 |
selftests, bpf: test_lwt_ip_encap: Really disable rp_filter
It's not enough to set net.ipv4.conf.all.rp_filter=0, that does not override
a greater rp_filter value on the individual interfaces. We also need to set
net.ipv4.conf.default.rp_filter=0 before creating the interfaces. That way,
they'll also get their own rp_filter value of zero.
Fixes:
|
||
|
d888eaac4f |
selftests, bpf: Fix makefile dependencies on libbpf
When building bpf selftest with make -j, I'm randomly getting build failures such as this one: In file included from progs/bpf_flow.c:19: [...]/tools/testing/selftests/bpf/tools/include/bpf/bpf_helpers.h:11:10: fatal error: 'bpf_helper_defs.h' file not found #include "bpf_helper_defs.h" ^~~~~~~~~~~~~~~~~~~ The file that fails the build varies between runs but it's always in the progs/ subdir. The reason is a missing make dependency on libbpf for the .o files in progs/. There was a dependency before commit |
||
|
9cccec2bf3 |
x86:
- missing TLB flush - nested virtualization fixes for SMM (secure boot on nested hypervisor) and other nested SVM fixes - syscall fuzzing fixes - live migration fix for AMD SEV - mirror VMs now work for SEV-ES too - fixes for reset - possible out-of-bounds access in IOAPIC emulation - fix enlightened VMCS on Windows 2022 ARM: - Add missing FORCE target when building the EL2 object - Fix a PMU probe regression on some platforms Generic: - KCSAN fixes selftests: - random fixes, mostly for clang compilation -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmFN0EwUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroNqaQf/Vx7ePFTqwWpo+8wKapnc6JN9SLjC hM4jipxfc1WyQWcfCt8ZuPhCnhF7o8mG/mrqTm+JB+oGqIsydHW19DiUT8ekv09F dQ+XYSiR4B547wUH5XLQc4xG9imwYlXGEOHqrE7eJvGH3LOqVFX2fLRBnFefZbO8 GKhRJrGXwG3/JSAP6A0c22iVU+pLbfV9gpKwrAj0V7o8nzT2b3Wmh74WBNb47BzE a4+AwKpWO4rqJGOwdYwy67pdFHh1YmrlZ59cFZc7fzlXE+o0D0bitaJyioZALpOl 4mRGdzoYkNB++ZjDzVFnAClCYQV/oNxCNGFaFF2mh/gzXG1TLmN7B8zGDg== =7oVh -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "A bit late... I got sidetracked by back-from-vacation routines and conferences. But most of these patches are already a few weeks old and things look more calm on the mailing list than what this pull request would suggest. x86: - missing TLB flush - nested virtualization fixes for SMM (secure boot on nested hypervisor) and other nested SVM fixes - syscall fuzzing fixes - live migration fix for AMD SEV - mirror VMs now work for SEV-ES too - fixes for reset - possible out-of-bounds access in IOAPIC emulation - fix enlightened VMCS on Windows 2022 ARM: - Add missing FORCE target when building the EL2 object - Fix a PMU probe regression on some platforms Generic: - KCSAN fixes selftests: - random fixes, mostly for clang compilation" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (43 commits) selftests: KVM: Explicitly use movq to read xmm registers selftests: KVM: Call ucall_init when setting up in rseq_test KVM: Remove tlbs_dirty KVM: X86: Synchronize the shadow pagetable before link it KVM: X86: Fix missed remote tlb flush in rmap_write_protect() KVM: x86: nSVM: don't copy virt_ext from vmcb12 KVM: x86: nSVM: test eax for 4K alignment for GP errata workaround KVM: x86: selftests: test simultaneous uses of V_IRQ from L1 and L0 KVM: x86: nSVM: restore int_vector in svm_clear_vintr kvm: x86: Add AMD PMU MSRs to msrs_to_save_all[] KVM: x86: nVMX: re-evaluate emulation_required on nested VM exit KVM: x86: nVMX: don't fail nested VM entry on invalid guest state if !from_vmentry KVM: x86: VMX: synthesize invalid VM exit when emulating invalid guest state KVM: x86: nSVM: refactor svm_leave_smm and smm_enter_smm KVM: x86: SVM: call KVM_REQ_GET_NESTED_STATE_PAGES on exit from SMM mode KVM: x86: reset pdptrs_from_userspace when exiting smm KVM: x86: nSVM: restore the L1 host state prior to resuming nested guest on SMM exit KVM: nVMX: Filter out all unsupported controls when eVMCS was activated KVM: KVM: Use cpumask_available() to check for NULL cpumask when kicking vCPUs KVM: Clean up benign vcpu->cpu data races when kicking vCPUs ... |
||
|
2f96028708 |
selftests: drivers/dma-buf: Fix implicit declaration warns
udmabuf has the following implicit declaration warns: udmabuf.c:30:10: warning: implicit declaration of function 'open'; udmabuf.c:42:8: warning: implicit declaration of function 'fcntl' These are caused due to not including fcntl.h and including just linux/fcntl.h. Fix it to include fcntl.h which will bring in the linux/fcntl.h. In addition, define __EXPORTED_HEADERS__ to bring in F_ADD_SEALS and F_SEAL_SHRINK defines and fix the following error that show up when just fcntl.h is included. udmabuf.c:45:21: error: 'F_ADD_SEALS' undeclared 45 | ret = fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK); | ^~~~~~~~~~~ udmabuf.c:45:34: error: 'F_SEAL_SHRINK' undeclared 45 | ret = fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK); | ^~~~~~~~~~~~~ Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> |
||
|
90316e6ea0 |
linux-kselftest-fixes-5.15-rc3
This Kselftest fixes update for Linux 5.15-rc3 consists of: - fix to Kselftest common framework header install to run before other targets for it work correctly in parallel build case. - fixes to kvm test to not ignore fscanf() returns which could result in inconsistent test behavior and failures. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmFOL14ACgkQCwJExA0N Qxyz9hAA6LKcNpv7BGXWpTNB2z/p9KcbSWdA3eejvb+3kiovd2AyFPvf1U1fT5kA VlJWk4d3cbhPkCrXmkpsI2vBuSb+u5HcwoEy0WH4T4q2aDjoEOnyn3xr2ktnBdXl oeWqPusSRKTJn7gCCZN+H6s5JaUbGZxTWsL2n0v5Pzb6sW4hikc8nXpd+1IeoTKq 0xg7FM6NcaWRzKb6D97wFpQEo6mnC9Zifv6TalxQn71d//n9MXGW2600ZSDDfBRG XfolWqVUHGI2Lyy0mIT788fmntA7xOka3Tajzk4WrfmcgczABJFRJr8ZG6iZ+J0j dT+KaTqEZHcL1L9Pusf2VehJZTUwKMenc2NmOZ+n6pK+PUL/KZNobcXoFW01I9jI z4UYeH9DUnTiaTP8b7OJ1RB7H5XU3CncuO2gELkera212XckdmiTldpox0ywwyh9 x+X6mk4lzbDSFMPrSPJrVTasvLMEZv6A05I8Td8Gw7mgeYIxupyk0/k5jgjMlZGN 6CHyQu4iGre3oM72snuthomtzqwArF3bhWAm7ooZYG4qLQ6z6naDdvcpWYmszIV0 RuC74FZCHZWGXXojVJWPquf47C79QPFQyQcwXxfAaSMY3XezVsI7c/QaXLoT1VeA BZbsUoAPvLa8mLRiFU2jZmLbx57rxej5zM+UvKJjI1JdCL9+cUA= =Q9zx -----END PGP SIGNATURE----- Merge tag 'linux-kselftest-fixes-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kselftest fixes from Shuah Khan: - fix to Kselftest common framework header install to run before other targets for it work correctly in parallel build case. - fixes to kvm test to not ignore fscanf() returns which could result in inconsistent test behavior and failures. * tag 'linux-kselftest-fixes-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests: kvm: fix get_run_delay() ignoring fscanf() return warn selftests: kvm: move get_run_delay() into lib/test_util selftests:kvm: fix get_trans_hugepagesz() ignoring fscanf() return warn selftests:kvm: fix get_warnings_count() ignoring fscanf() return warn selftests: be sure to make khdr before other targets |
||
|
7fe7f3182a |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net 1) ipset limits the max allocatable memory via kvmalloc() to MAX_INT, from Jozsef Kadlecsik. 2) Check ip_vs_conn_tab_bits value to be in the range specified in Kconfig, from Andrea Claudi. 3) Initialize fragment offset in ip6tables, from Jeremy Sowden. 4) Make conntrack hash chain length random, from Florian Westphal. 5) Add zone ID to conntrack and NAT hashtuple again, also from Florian. 6) Add selftests for bidirectional zone support and colliding tuples, from Florian Westphal. 7) Unlink table before synchronize_rcu when cleaning tables with owner, from Florian. 8) ipset limits the max allocatable memory via kvmalloc() to MAX_INT. 9) Release conntrack entries via workqueue in masquerade, from Florian. 10) Fix bogus net_init in iptables raw table definition, also from Florian. 11) Work around missing softdep in log extensions, from Florian Westphal. 12) Serialize hash resizes and cleanups with mutex, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf: netfilter: conntrack: serialize hash resizes and cleanups netfilter: log: work around missing softdep backend module netfilter: iptable_raw: drop bogus net_init annotation netfilter: nf_nat_masquerade: defer conntrack walk to work queue netfilter: nf_nat_masquerade: make async masq_inet6_event handling generic netfilter: nf_tables: Fix oversized kvmalloc() calls netfilter: nf_tables: unlink table before deleting it selftests: netfilter: add zone stress test with colliding tuples selftests: netfilter: add selftest for directional zone support netfilter: nat: include zone id in nat table hash again netfilter: conntrack: include zone id in tuple hash again netfilter: conntrack: make max chain length random netfilter: ip6_tables: zero-initialize fragment offset ipvs: check that ip_vs_conn_tab_bits is between 8 and 20 netfilter: ipset: Fix oversized kvmalloc() calls ==================== Link: https://lore.kernel.org/r/20210924221113.348767-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
1b7eaf5701 |
arm64 fixes:
- It turns out that the optimised string routines merged in 5.14 are not safe with in-kernel MTE (KASAN_HW_TAGS) because of reading beyond the end of a string (strcmp, strncmp). Such reading may go across a 16 byte tag granule and cause a tag check fault. When KASAN_HW_TAGS is enabled, use the generic strcmp/strncmp C implementation. - An errata workaround for ThunderX relied on the CPU capabilities being enabled in a specific order. This disappeared with the automatic generation of the cpucaps.h file (sorted alphabetically). Fix it by checking the current CPU only rather than the system-wide capability. - Add system_supports_mte() checks on the kernel entry/exit path and thread switching to avoid unnecessary barriers and function calls on systems where MTE is not supported. - kselftests: skip arm64 tests if the required features are missing. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmFOC7oACgkQa9axLQDI XvHGhBAAkTzrLtyveymoeD5bk0Lh4Co2i0PKqoOl82ltvagjcvUY5lDLfgKV8ac2 0fZ6E7Jjt6Khq0VAWryRf/RkjxTXmuqDmyb7hZ/t0TNciRXKMfQyE8o8KhsWOvRF hD0COUOnJ4CrgQuAJwsW9uagcrix9NucK9WziHB65RACxnD+k7foffCNv1b9Tw9s oqAfZ144B+L1Z0WFoqG4uGkngIyG/lcSPJMhDRo45iRWtNUnlR/m/ApWKb+r+X1X /Aq2dWklUnrGBB1MdO3WAaRH/lvulH6t4KVhTJw2Gkq5ogy+mYLO15wGYmM+FAWT ExS8fy4/G2vX4RAxYRBZF7Xl3zN7F9n3RYKyYfxgGwSy30TnOLYUMoemYvA9Eh27 O6cjaZacyPVfaB5LUwj6H84NbDbcJKCO775sLuIW8IEMklyZhorS8HG7nYnL6rBk jXM5AVPC2j8kXCgawW9u+s/pEQAvLOvM9y7Q0pjj3n2WNeV09bxzkgX5wStSaMYJ Ok95zCK1SbKBYdA7yQnFcQkM52XkqpML8vccBHUdxVbtAetuYxoyGKKs/9G6mPEH vWSxP4ymvRza62EofmUJzNIEMWwwNUgyNeY/bVJXYf1oc324Wx4T1ewQagrGVQve miEgOVEpn+ygHFkEO3mc0wnFfQ3m5CSBvOjeWlTbv7lgekRDyx4= =JIq+ -----END PGP SIGNATURE----- Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - It turns out that the optimised string routines merged in 5.14 are not safe with in-kernel MTE (KASAN_HW_TAGS) because of reading beyond the end of a string (strcmp, strncmp). Such reading may go across a 16 byte tag granule and cause a tag check fault. When KASAN_HW_TAGS is enabled, use the generic strcmp/strncmp C implementation. - An errata workaround for ThunderX relied on the CPU capabilities being enabled in a specific order. This disappeared with the automatic generation of the cpucaps.h file (sorted alphabetically). Fix it by checking the current CPU only rather than the system-wide capability. - Add system_supports_mte() checks on the kernel entry/exit path and thread switching to avoid unnecessary barriers and function calls on systems where MTE is not supported. - kselftests: skip arm64 tests if the required features are missing. * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: Restore forced disabling of KPTI on ThunderX kselftest/arm64: signal: Skip tests if required features are missing arm64: Mitigate MTE issues with str{n}cmp() arm64: add MTE supported check to thread switching and syscall entry/exit |
||
|
386ca9d7fd |
selftests: KVM: Explicitly use movq to read xmm registers
Compiling the KVM selftests with clang emits the following warning:
>> include/x86_64/processor.h:297:25: error: variable 'xmm0' is uninitialized when used here [-Werror,-Wuninitialized]
>> return (unsigned long)xmm0;
where xmm0 is accessed via an uninitialized register variable.
Indeed, this is a misuse of register variables, which really should only
be used for specifying register constraints on variables passed to
inline assembly. Rather than attempting to read xmm registers via
register variables, just explicitly perform the movq from the desired
xmm register.
Fixes:
|
||
|
fbf094ce52 |
selftests: KVM: Call ucall_init when setting up in rseq_test
While x86 does not require any additional setup to use the ucall
infrastructure, arm64 needs to set up the MMIO address used to signal a
ucall to userspace. rseq_test does not initialize the MMIO address,
resulting in the test spinning indefinitely.
Fix the issue by calling ucall_init() during setup.
Fixes:
|
||
|
f10f0481a5 |
A fix for a bug with restartable sequences and KVM. KVM's handling
of TIF_NOTIFY_RESUME, e.g. for task migration, clears the flag without informing rseq and leads to stale data in userspace's rseq struct. I'm sending this as a separate pull request since it's not code that I usually touch. In particular, patch 2 ("entry: rseq: Call rseq_handle_notify_resume() in tracehook_notify_resume()") is just a cleanup to try and make future bugs less likely. If you prefer this to be sent via Thomas and only in 5.16, please speak up. -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmFLPYgUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroNCowf6A9pTuDspCC2IRICfQnHj/Q/Xc7HH UmwYo26GctfYaq9AyIAGzcRx3iOEGjb9fZVJ6mzPUGIygio9ZyuiPHogf7lMAb+x 39ts5uSOp+N+8e0fvX578WFfmG5hQa4Tp9W3T2Y5KsVgK2Nf8F08DckzIgD8cbkN NQKTRIi8AYgb20y3NFZjzsPRxF8850QK7xVCI+LBjryyWpEGT5ZsthrYUeexiJPz XN+VOYJen5GXVBCar2JbA7EVSrMZbKSy+M3fJ1vuW5dZHySaiu69JXJHop71jTnJ 5BGue917MfH6RTDzIFFUcg7NmwcuXHpw4dsFeiyExYFNw1uWWQpk0efC1g== =/xlE -----END PGP SIGNATURE----- Merge tag 'for-linus-rseq' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull rseq fixes from Paolo Bonzini: "A fix for a bug with restartable sequences and KVM. KVM's handling of TIF_NOTIFY_RESUME, e.g. for task migration, clears the flag without informing rseq and leads to stale data in userspace's rseq struct" * tag 'for-linus-rseq' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: selftests: Remove __NR_userfaultfd syscall fallback KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs tools: Move x86 syscall number fallbacks to .../uapi/ entry: rseq: Call rseq_handle_notify_resume() in tracehook_notify_resume() KVM: rseq: Update rseq when processing NOTIFY_RESUME on xfer to KVM guest |
||
|
9bc62afe03 |
Networking fixes for 5.15-rc3.
Current release - regressions: - dsa: bcm_sf2: fix array overrun in bcm_sf2_num_active_ports() Previous releases - regressions: - introduce a shutdown method to mdio device drivers, and make DSA switch drivers compatible with masters disappearing on shutdown; preventing infinite reference wait - fix issues in mdiobus users related to ->shutdown vs ->remove - virtio-net: fix pages leaking when building skb in big mode - xen-netback: correct success/error reporting for the SKB-with-fraglist - dsa: tear down devlink port regions when tearing down the devlink port on error - nexthop: fix division by zero while replacing a resilient group - hns3: check queue, vf, vlan ids range before using Previous releases - always broken: - napi: fix race against netpoll causing NAPI getting stuck - mlx4_en: ensure link operstate is updated even if link comes up before netdev registration - bnxt_en: fix TX timeout when TX ring size is set to the smallest - enetc: fix illegal access when reading affinity_hint; prevent oops on sysfs access - mtk_eth_soc: avoid creating duplicate offload entries Misc: - core: correct the sock::sk_lock.owned lockdep annotations Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmFMr4MACgkQMUZtbf5S Irv6Ag/+Ml4q6/IVO0jBppztZO1RrSalb3YE9JjPQyMchauVcdcADNpYF+Jo/gcH 4q+/Oikfp6gkQpJTFd0Y9X7UhwA4Jm4wWtEisqc6PJeHOagDZmVUn353WtgpnCNL CgYBfa5k3msGudkqgeXIyiP/2sekBevTy+fOptubLZClyBrEwNUUUZBlpT9aI9Sj ru1eMYklfcxP60AQgNhqq6ZwJnRELgN75fSR6ypVCGcRnTK4UGL/b6TvnPYn8uYY zeNuMZZzYZK5B73tC6rWpteHWZ7VW3Km0WvIKs+ORM8nYchz/EprKZ0HCLPYrWvf ib5Wi7HyL7/n9k9NUTCGrQY3tkOWNzXOepjpiBZPqCG9r2hc3JSR7Q2lFwL+gKv/ sh2y+T2xfp0WFGmG2XiU2MgnkypMSKah1sC/XRE7YLw02vPAnWQxxl/KVNek4j7M CH/Tg9ErVKDRLN7KO/kKl3s8I8N4hdctms/YUt9QD5J9Rw/Jqwr/79bq1uLy6d4o //ipmCTHex57Nvy80PtgcuKJhoeqGwR/Av6BvBMRZ1SOYs/C6q45skHTlYyiNY3+ Dyj9+nfrhsyE835GKPe8lqBFZONBXpXw+EUNXeYRiv0Pcd+JKek07bbajSQVSpd8 8nqQwylpGII0iPGyOc9wKajzh7O5W2odFIdOwtY/5yVjrcFgBd0= =VcL3 -----END PGP SIGNATURE----- Merge tag 'net-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Current release - regressions: - dsa: bcm_sf2: fix array overrun in bcm_sf2_num_active_ports() Previous releases - regressions: - introduce a shutdown method to mdio device drivers, and make DSA switch drivers compatible with masters disappearing on shutdown; preventing infinite reference wait - fix issues in mdiobus users related to ->shutdown vs ->remove - virtio-net: fix pages leaking when building skb in big mode - xen-netback: correct success/error reporting for the SKB-with-fraglist - dsa: tear down devlink port regions when tearing down the devlink port on error - nexthop: fix division by zero while replacing a resilient group - hns3: check queue, vf, vlan ids range before using Previous releases - always broken: - napi: fix race against netpoll causing NAPI getting stuck - mlx4_en: ensure link operstate is updated even if link comes up before netdev registration - bnxt_en: fix TX timeout when TX ring size is set to the smallest - enetc: fix illegal access when reading affinity_hint; prevent oops on sysfs access - mtk_eth_soc: avoid creating duplicate offload entries Misc: - core: correct the sock::sk_lock.owned lockdep annotations" * tag 'net-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (51 commits) atlantic: Fix issue in the pm resume flow. net/mlx4_en: Don't allow aRFS for encapsulated packets net: mscc: ocelot: fix forwarding from BLOCKING ports remaining enabled net: ethernet: mtk_eth_soc: avoid creating duplicate offload entries nfc: st-nci: Add SPI ID matching DT compatible MAINTAINERS: remove Guvenc Gulce as net/smc maintainer nexthop: Fix memory leaks in nexthop notification chain listeners mptcp: ensure tx skbs always have the MPTCP ext qed: rdma - don't wait for resources under hw error recovery flow s390/qeth: fix deadlock during failing recovery s390/qeth: Fix deadlock in remove_discipline s390/qeth: fix NULL deref in qeth_clear_working_pool_list() net: dsa: realtek: register the MDIO bus under devres net: dsa: don't allocate the slave_mii_bus using devres Doc: networking: Fox a typo in ice.rst net: dsa: fix dsa_tree_setup error path net/smc: fix 'workqueue leaked lock' in smc_conn_abort_work net/smc: add missing error check in smc_clc_prfx_set() net: hns3: fix a return value error in hclge_get_reset_status() net: hns3: check vlan id before using it ... |
||
|
1ad32105d7 |
KVM: x86: selftests: test simultaneous uses of V_IRQ from L1 and L0
Test that if: * L1 disables virtual interrupt masking, and INTR intercept. * L1 setups a virtual interrupt to be injected to L2 and enters L2 with interrupts disabled, thus the virtual interrupt is pending. * Now an external interrupt arrives in L1 and since L1 doesn't intercept it, it should be delivered to L2 when it enables interrupts. to do this L0 (abuses) V_IRQ to setup an interrupt window, and returns to L2. * L2 enables interrupts. This should trigger the interrupt window, injection of the external interrupt and delivery of the virtual interrupt that can now be done. * Test that now L2 gets those interrupts. This is the test that demonstrates the issue that was fixed in the previous patch. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210914154825.104886-3-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
||
|
7c236b816e |
KVM: selftests: Create a separate dirty bitmap per slot
The calculation to get the per-slot dirty bitmap was incorrect leading
to a buffer overrun. Fix it by splitting out the dirty bitmap into a
separate bitmap per slot.
Fixes:
|
||
|
9f2fc5554a |
KVM: selftests: Refactor help message for -s backing_src
All selftests that support the backing_src option were printing their
own description of the flag and then calling backing_src_help() to dump
the list of available backing sources. Consolidate the flag printing in
backing_src_help() to align indentation, reduce duplicated strings, and
improve consistency across tests.
Note: Passing "-s" to backing_src_help is unnecessary since every test
uses the same flag. However I decided to keep it for code readability
at the call sites.
While here this opportunistically fixes the incorrectly interleaved
printing -x help message and list of backing source types in
dirty_log_perf_test.
Fixes:
|
||
|
a1e638da1b |
KVM: selftests: Change backing_src flag to -s in demand_paging_test
Every other KVM selftest uses -s for the backing_src, so switch demand_paging_test to match. Reviewed-by: Ben Gardon <bgardon@google.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20210917173657.44011-2-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
||
|
01f91acb55 |
selftests: KVM: Align SMCCC call with the spec in steal_time
The SMC64 calling convention passes a function identifier in w0 and its parameters in x1-x17. Given this, there are two deviations in the SMC64 call performed by the steal_time test: the function identifier is assigned to a 64 bit register and the parameter is only 32 bits wide. Align the call with the SMCCC by using a 32 bit register to handle the function identifier and increasing the parameter width to 64 bits. Suggested-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Oliver Upton <oupton@google.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Message-Id: <20210921171121.2148982-3-oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
||
|
90b54129e8 |
selftests: KVM: Fix check for !POLLIN in demand_paging_test
The logical not operator applies only to the left hand side of a bitwise
operator. As such, the check for POLLIN not being set in revents wrong.
Fix it by adding parentheses around the bitwise expression.
Fixes:
|
||
|
61e52f1630 |
KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs
Add a test to verify an rseq's CPU ID is updated correctly if the task is
migrated while the kernel is handling KVM_RUN. This is a regression test
for a bug introduced by commit
|
||
|
0e3dbf765f |
kselftest/arm64: signal: Skip tests if required features are missing
During initialization of a signal testcase, features declared as required
are properly checked against the running system but no action is then taken
to effectively skip such a testcase.
Fix core signals test logic to abort initialization and report such a
testcase as skipped to the KSelfTest framework.
Fixes:
|