mirror of
https://github.com/torvalds/linux.git
synced 2024-11-22 12:11:40 +00:00
beb8cee06f
10008 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Heiko Carstens
|
beb8cee06f |
s390/alternatives: Remove alternative facility list
The alternative and the normal facility list are always identical. Remove the alternative facility list, which allows to simplify the alternatives code. Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Tested-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Heiko Carstens
|
47837a5c74 |
s390/nospec: Push down alternative handling
The nospec implementation is deeply integrated into the alternatives code: only for nospec an alternative facility list is implemented and used by the alternative code, while it is modified by nospec specific needs. Push down the nospec alternative handling into the nospec by introducing a new alternative type and a specific nospec callback to decide if alternatives should be applied. Also introduce a new global nobp variable which together with facility 82 can be used to decide if nobp is enabled or not. Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Tested-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Sven Schnelle
|
7f9d85998f |
s390/alternatives: Allow early alternative patching in decompressor
Add the required code to patch alternatives early in the decompressor. This is required for the upcoming lowcore relocation changes, where alternatives for facility 193 need to get patched before lowcore alternatives. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Co-developed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Heiko Carstens
|
b3e0c5f734 |
s390/alternatives: Rework to allow for callbacks
Rework alternatives to allow for callbacks. With this every alternative entry has additional data encoded: - When (aka context) an alternative is supposed to be applied - The type of an alternative, which allows for type specific handling and callbacks - Extra type specific payload (patch information), which can be passed to callbacks in order to decide if an alternative should be applied or not With this only the "late" context is implemented, which means there is no change to the previous behaviour. All code is just converted to the more generic new infrastructure. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Tested-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Heiko Carstens
|
030f7951c5 |
s390/uaccess: Make s390_kernel_write() usable for decompressor
To avoid lots of ifdefs in C code make s390_kernel_write() usable for the decompressor: simply use memcpy() for this case since there is no write protection enabled that early. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Tested-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Heiko Carstens
|
ace76fac94 |
s390/alternatives: Move text sync functions
Move all text sync functions from alternative.c to processor.c. This way there is only minimal code left in alternative.c left, which is a prerequisite to use the C file within boot code as well. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Tested-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Heiko Carstens
|
c77f7354c4 |
s390/alternatives: Merge both alternative header files
The two alternative header files must stay in sync. This is easier to achieve within one header file. Therefore merge both of them and have only one file, like most other architectures. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Tested-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Heiko Carstens
|
9be999a612 |
s390/alternatives: Use consistent naming
The alternative code is using the words facility and feature for the same. Rename facility to more generic feature everywhere to have consistent naming. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Tested-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Sven Schnelle
|
035248a784 |
s390/alternatives: Remove noaltinstr option
The current Kernel doesn't boot without alternative patching on z16 machines. To avoid such bugs in the future, remove the option disable alternative patching. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Sven Schnelle
|
d3604ffba1 |
s390: Move CIF flags to struct pcpu
To allow testing flags for offline CPUs, move the CIF flags to struct pcpu. To avoid having to calculate the array index for each access, add a pointer to the pcpu member for the current cpu to lowcore. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Sven Schnelle
|
90fc5ac282 |
s390/smp: Switch pcpu_devices to percpu
In preparation of moving the CIF flags from lowcore to pcpu_devices, convert the pcpu_devices array to use the percpu infrastructure. This is required because using the pcpu_devices array as it is would introduce a performance penalty due to the fact that CPU flags for multiple CPUs would end up in the same cacheline. Note that a pointer to the pcpu struct of the IPL CPU is still required. This is because a restart interrupt can be triggered on an offline CPU. s390 stores the percpu offset in lowcore, but offline CPUs have no lowcore area allocated. So percpu data cannot be used from an offline CPU and we need to get the pcpu pointer for the IPL cpu from somewhere else. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Sven Schnelle
|
a795eeaf85 |
s390/smp: Handle restart interrupt on ipl cpu
The current smp code allows to trigger a restart interrupt on CPUs offline in linux. To allow using the percpu infrastructure instead of the pcpu_devices array, switch to the ipl cpu which is always online before calling do_restart(). Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Alexander Gordeev
|
b798b685b4 |
s390/boot: Do not assume the decompressor range is reserved
When allocating a random memory range for .amode31 sections the minimal randomization address is 0. That does not lead to a possible overlap with the decompressor image (which also starts from 0) since by that time the image range is already reserved. Do not assume the decompressor range is reserved and always provide the minimal randomization address for .amode31 sections beyond the decompressor. That is a prerequisite for moving the lowcore memory address from NULL elsewhere. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Thomas Richter
|
e6ce1f12d7 |
s390/cpum_cf: Fix endless loop in CF_DIAG event stop
Event CF_DIAG reads out complete counter sets using stcctm
instruction. This is done at event start time when the process
starts execution and at event stop time when the process is
removed from the CPU. During removal the difference of each
counter in the counter sets is calculated and saved as raw data
in the ring buffer. This works fine unless the number of counters
in a counter set is zero. This may happen for the extended counter
set. This set is machine specific and the size of the counter
set can be zero even when extended counter set is authorized for
read access.
This case is not handled. cfdiag_diffctr() checks authorization
of the extended counter set. If true the functions assumes
the extended counter set has been saved in a data buffer. However
this is not the case, cfdiag_getctrset() does not save a counter
set with counter set size of zero. This mismatch causes an endless
loop in the counter set readout during event stop handling.
The calculation of the difference of the counters in each counter
now verifies the size of the counter set is non-zero. A counter set
with size zero is skipped.
Fixes:
|
||
Ilya Leoshkevich
|
19af288706 |
s390/ptdump: Add KMSAN page markers
Add KMSAN vmalloc metadata areas to /sys/kernel/debug/kernel_page_tables. Example output: 0x000003a95fff9000-0x000003a960000000 28K PTE I ---[ vmalloc Area End ]--- ---[ Kmsan vmalloc Shadow Start ]--- 0x000003a960000000-0x000003a960010000 64K PTE RW NX [...] 0x000003d3dfff9000-0x000003d3e0000000 28K PTE I ---[ Kmsan vmalloc Shadow End ]--- ---[ Kmsan vmalloc Origins Start ]--- 0x000003d3e0000000-0x000003d3e0010000 64K PTE RW NX [...] 0x000003fe5fff9000-0x000003fe60000000 28K PTE I ---[ Kmsan vmalloc Origins End ]--- ---[ Kmsan Modules Shadow Start ]--- 0x000003fe60000000-0x000003fe60001000 4K PTE RW NX [...] 0x000003fe60100000-0x000003fee0000000 2047M PMD I ---[ Kmsan Modules Shadow End ]--- ---[ Kmsan Modules Origins Start ]--- 0x000003fee0000000-0x000003fee0001000 4K PTE RW NX [...] 0x000003fee0100000-0x000003ff60000000 2047M PMD I ---[ Kmsan Modules Origins End ]--- ---[ Modules Area Start ]--- 0x000003ff60000000-0x000003ff60001000 4K PTE RO X Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/20240723124441.120044-3-iii@linux.ibm.com Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Ilya Leoshkevich
|
ec25f99cc8 |
s390/kmsan: Fix merge conflict with get_lowcore() introduction
Resolve the conflict between commit |
||
Vasily Gorbik
|
e188e5d5ff |
s390/setup: Fix __pa/__va for modules under non-GPL licenses
The struct vm_layout contains fields used in __pa/__va calculations. Such
fundamental things have to be exported with EXPORT_SYMBOL to avoid
breakages of out-of-tree modules under non-GPL licenses.
Fixes:
|
||
Gerd Bayer
|
ab42fcb511 |
s390/pci: Allow allocation of more than 1 MSI interrupt
On a PCI adapter that provides up to 8 MSI interrupt sources the s390
implementation of PCI interrupts rejected to accommodate them, although
the underlying hardware is able to support that.
For MSI-X it is sufficient to allocate a single irq_desc per msi_desc,
but for MSI multiple irq descriptors are attached to and controlled by
a single msi descriptor. Add the appropriate loops to maintain multiple
irq descriptors and tie/untie them to/from the appropriate AIBV bit, if
a device driver allocates more than 1 MSI interrupt.
Common PCI code passes on requests to allocate a number of interrupt
vectors based on the device drivers' demand and the PCI functions'
capabilities. However, the root-complex of s390 systems support just a
limited number of interrupt vectors per PCI function.
Produce a kernel log message to inform about any architecture-specific
capping that might be done.
With this change, we had a PCI adapter successfully raising
interrupts to its device driver via all 8 sources.
Fixes:
|
||
Gerd Bayer
|
5fd11b96b4 |
s390/pci: Refactor arch_setup_msi_irqs()
Factor out adapter interrupt allocation from arch_setup_msi_irqs() in preparation for enabling registration of multiple MSIs. Code movement only, no change of functionality intended. Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Heiko Carstens
|
57e29f4c89 |
s390: Add runtime constant support
Implement the runtime constant infrastructure for s390, allowing the dcache d_hash() function to be generated using as a constant for hash table address followed by shift by a constant of the hash index. This is the s390 variant of commit |
||
Linus Torvalds
|
fbc90c042c |
- 875fa64577da ("mm/hugetlb_vmemmap: fix race with speculative PFN
walkers") is known to cause a performance regression (https://lore.kernel.org/all/3acefad9-96e5-4681-8014-827d6be71c7a@linux.ibm.com/T/#mfa809800a7862fb5bdf834c6f71a3a5113eb83ff). Yu has a fix which I'll send along later via the hotfixes branch. - In the series "mm: Avoid possible overflows in dirty throttling" Jan Kara addresses a couple of issues in the writeback throttling code. These fixes are also targetted at -stable kernels. - Ryusuke Konishi's series "nilfs2: fix potential issues related to reserved inodes" does that. This should actually be in the mm-nonmm-stable tree, along with the many other nilfs2 patches. My bad. - More folio conversions from Kefeng Wang in the series "mm: convert to folio_alloc_mpol()" - Kemeng Shi has sent some cleanups to the writeback code in the series "Add helper functions to remove repeated code and improve readability of cgroup writeback" - Kairui Song has made the swap code a little smaller and a little faster in the series "mm/swap: clean up and optimize swap cache index". - In the series "mm/memory: cleanly support zeropage in vm_insert_page*(), vm_map_pages*() and vmf_insert_mixed()" David Hildenbrand has reworked the rather sketchy handling of the use of the zeropage in MAP_SHARED mappings. I don't see any runtime effects here - more a cleanup/understandability/maintainablity thing. - Dev Jain has improved selftests/mm/va_high_addr_switch.c's handling of higher addresses, for aarch64. The (poorly named) series is "Restructure va_high_addr_switch". - The core TLB handling code gets some cleanups and possible slight optimizations in Bang Li's series "Add update_mmu_tlb_range() to simplify code". - Jane Chu has improved the handling of our fake-an-unrecoverable-memory-error testing feature MADV_HWPOISON in the series "Enhance soft hwpoison handling and injection". - Jeff Johnson has sent a billion patches everywhere to add MODULE_DESCRIPTION() to everything. Some landed in this pull. - In the series "mm: cleanup MIGRATE_SYNC_NO_COPY mode", Kefeng Wang has simplified migration's use of hardware-offload memory copying. - Yosry Ahmed performs more folio API conversions in his series "mm: zswap: trivial folio conversions". - In the series "large folios swap-in: handle refault cases first", Chuanhua Han inches us forward in the handling of large pages in the swap code. This is a cleanup and optimization, working toward the end objective of full support of large folio swapin/out. - In the series "mm,swap: cleanup VMA based swap readahead window calculation", Huang Ying has contributed some cleanups and a possible fixlet to his VMA based swap readahead code. - In the series "add mTHP support for anonymous shmem" Baolin Wang has taught anonymous shmem mappings to use multisize THP. By default this is a no-op - users must opt in vis sysfs controls. Dramatic improvements in pagefault latency are realized. - David Hildenbrand has some cleanups to our remaining use of page_mapcount() in the series "fs/proc: move page_mapcount() to fs/proc/internal.h". - David also has some highmem accounting cleanups in the series "mm/highmem: don't track highmem pages manually". - Build-time fixes and cleanups from John Hubbard in the series "cleanups, fixes, and progress towards avoiding "make headers"". - Cleanups and consolidation of the core pagemap handling from Barry Song in the series "mm: introduce pmd|pte_needs_soft_dirty_wp helpers and utilize them". - Lance Yang's series "Reclaim lazyfree THP without splitting" has reduced the latency of the reclaim of pmd-mapped THPs under fairly common circumstances. A 10x speedup is seen in a microbenchmark. It does this by punting to aother CPU but I guess that's a win unless all CPUs are pegged. - hugetlb_cgroup cleanups from Xiu Jianfeng in the series "mm/hugetlb_cgroup: rework on cftypes". - Miaohe Lin's series "Some cleanups for memory-failure" does just that thing. - Is anyone reading this stuff? If so, email me! - Someone other than SeongJae has developed a DAMON feature in Honggyu Kim's series "DAMON based tiered memory management for CXL memory". This adds DAMON features which may be used to help determine the efficiency of our placement of CXL/PCIe attached DRAM. - DAMON user API centralization and simplificatio work in SeongJae Park's series "mm/damon: introduce DAMON parameters online commit function". - In the series "mm: page_type, zsmalloc and page_mapcount_reset()" David Hildenbrand does some maintenance work on zsmalloc - partially modernizing its use of pageframe fields. - Kefeng Wang provides more folio conversions in the series "mm: remove page_maybe_dma_pinned() and page_mkclean()". - More cleanup from David Hildenbrand, this time in the series "mm/memory_hotplug: use PageOffline() instead of PageReserved() for !ZONE_DEVICE". It "enlightens memory hotplug more about PageOffline() pages" and permits the removal of some virtio-mem hacks. - Barry Song's series "mm: clarify folio_add_new_anon_rmap() and __folio_add_anon_rmap()" is a cleanup to the anon folio handling in preparation for mTHP (multisize THP) swapin. - Kefeng Wang's series "mm: improve clear and copy user folio" implements more folio conversions, this time in the area of large folio userspace copying. - The series "Docs/mm/damon/maintaier-profile: document a mailing tool and community meetup series" tells people how to get better involved with other DAMON developers. From SeongJae Park. - A large series ("kmsan: Enable on s390") from Ilya Leoshkevich does that. - David Hildenbrand sends along more cleanups, this time against the migration code. The series is "mm/migrate: move NUMA hinting fault folio isolation + checks under PTL". - Jan Kara has found quite a lot of strangenesses and minor errors in the readahead code. He addresses this in the series "mm: Fix various readahead quirks". - SeongJae Park's series "selftests/damon: test DAMOS tried regions and {min,max}_nr_regions" adds features and addresses errors in DAMON's self testing code. - Gavin Shan has found a userspace-triggerable WARN in the pagecache code. The series "mm/filemap: Limit page cache size to that supported by xarray" addresses this. The series is marked cc:stable. - Chengming Zhou's series "mm/ksm: cmp_and_merge_page() optimizations and cleanup" cleans up and slightly optimizes KSM. - Roman Gushchin has separated the memcg-v1 and memcg-v2 code - lots of code motion. The series (which also makes the memcg-v1 code Kconfigurable) are "mm: memcg: separate legacy cgroup v1 code and put under config option" and "mm: memcg: put cgroup v1-specific memcg data under CONFIG_MEMCG_V1" - Dan Schatzberg's series "Add swappiness argument to memory.reclaim" adds an additional feature to this cgroup-v2 control file. - The series "Userspace controls soft-offline pages" from Jiaqi Yan permits userspace to stop the kernel's automatic treatment of excessive correctable memory errors. In order to permit userspace to monitor and handle this situation. - Kefeng Wang's series "mm: migrate: support poison recover from migrate folio" teaches the kernel to appropriately handle migration from poisoned source folios rather than simply panicing. - SeongJae Park's series "Docs/damon: minor fixups and improvements" does those things. - In the series "mm/zsmalloc: change back to per-size_class lock" Chengming Zhou improves zsmalloc's scalability and memory utilization. - Vivek Kasireddy's series "mm/gup: Introduce memfd_pin_folios() for pinning memfd folios" makes the GUP code use FOLL_PIN rather than bare refcount increments. So these paes can first be moved aside if they reside in the movable zone or a CMA block. - Andrii Nakryiko has added a binary ioctl()-based API to /proc/pid/maps for much faster reading of vma information. The series is "query VMAs from /proc/<pid>/maps". - In the series "mm: introduce per-order mTHP split counters" Lance Yang improves the kernel's presentation of developer information related to multisize THP splitting. - Michael Ellerman has developed the series "Reimplement huge pages without hugepd on powerpc (8xx, e500, book3s/64)". This permits userspace to use all available huge page sizes. - In the series "revert unconditional slab and page allocator fault injection calls" Vlastimil Babka removes a performance-affecting and not very useful feature from slab fault injection. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZp2C+QAKCRDdBJ7gKXxA joTkAQDvjqOoFStqk4GU3OXMYB7WCU/ZQMFG0iuu1EEwTVDZ4QEA8CnG7seek1R3 xEoo+vw0sWWeLV3qzsxnCA1BJ8cTJA8= =z0Lf -----END PGP SIGNATURE----- Merge tag 'mm-stable-2024-07-21-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - In the series "mm: Avoid possible overflows in dirty throttling" Jan Kara addresses a couple of issues in the writeback throttling code. These fixes are also targetted at -stable kernels. - Ryusuke Konishi's series "nilfs2: fix potential issues related to reserved inodes" does that. This should actually be in the mm-nonmm-stable tree, along with the many other nilfs2 patches. My bad. - More folio conversions from Kefeng Wang in the series "mm: convert to folio_alloc_mpol()" - Kemeng Shi has sent some cleanups to the writeback code in the series "Add helper functions to remove repeated code and improve readability of cgroup writeback" - Kairui Song has made the swap code a little smaller and a little faster in the series "mm/swap: clean up and optimize swap cache index". - In the series "mm/memory: cleanly support zeropage in vm_insert_page*(), vm_map_pages*() and vmf_insert_mixed()" David Hildenbrand has reworked the rather sketchy handling of the use of the zeropage in MAP_SHARED mappings. I don't see any runtime effects here - more a cleanup/understandability/maintainablity thing. - Dev Jain has improved selftests/mm/va_high_addr_switch.c's handling of higher addresses, for aarch64. The (poorly named) series is "Restructure va_high_addr_switch". - The core TLB handling code gets some cleanups and possible slight optimizations in Bang Li's series "Add update_mmu_tlb_range() to simplify code". - Jane Chu has improved the handling of our fake-an-unrecoverable-memory-error testing feature MADV_HWPOISON in the series "Enhance soft hwpoison handling and injection". - Jeff Johnson has sent a billion patches everywhere to add MODULE_DESCRIPTION() to everything. Some landed in this pull. - In the series "mm: cleanup MIGRATE_SYNC_NO_COPY mode", Kefeng Wang has simplified migration's use of hardware-offload memory copying. - Yosry Ahmed performs more folio API conversions in his series "mm: zswap: trivial folio conversions". - In the series "large folios swap-in: handle refault cases first", Chuanhua Han inches us forward in the handling of large pages in the swap code. This is a cleanup and optimization, working toward the end objective of full support of large folio swapin/out. - In the series "mm,swap: cleanup VMA based swap readahead window calculation", Huang Ying has contributed some cleanups and a possible fixlet to his VMA based swap readahead code. - In the series "add mTHP support for anonymous shmem" Baolin Wang has taught anonymous shmem mappings to use multisize THP. By default this is a no-op - users must opt in vis sysfs controls. Dramatic improvements in pagefault latency are realized. - David Hildenbrand has some cleanups to our remaining use of page_mapcount() in the series "fs/proc: move page_mapcount() to fs/proc/internal.h". - David also has some highmem accounting cleanups in the series "mm/highmem: don't track highmem pages manually". - Build-time fixes and cleanups from John Hubbard in the series "cleanups, fixes, and progress towards avoiding "make headers"". - Cleanups and consolidation of the core pagemap handling from Barry Song in the series "mm: introduce pmd|pte_needs_soft_dirty_wp helpers and utilize them". - Lance Yang's series "Reclaim lazyfree THP without splitting" has reduced the latency of the reclaim of pmd-mapped THPs under fairly common circumstances. A 10x speedup is seen in a microbenchmark. It does this by punting to aother CPU but I guess that's a win unless all CPUs are pegged. - hugetlb_cgroup cleanups from Xiu Jianfeng in the series "mm/hugetlb_cgroup: rework on cftypes". - Miaohe Lin's series "Some cleanups for memory-failure" does just that thing. - Someone other than SeongJae has developed a DAMON feature in Honggyu Kim's series "DAMON based tiered memory management for CXL memory". This adds DAMON features which may be used to help determine the efficiency of our placement of CXL/PCIe attached DRAM. - DAMON user API centralization and simplificatio work in SeongJae Park's series "mm/damon: introduce DAMON parameters online commit function". - In the series "mm: page_type, zsmalloc and page_mapcount_reset()" David Hildenbrand does some maintenance work on zsmalloc - partially modernizing its use of pageframe fields. - Kefeng Wang provides more folio conversions in the series "mm: remove page_maybe_dma_pinned() and page_mkclean()". - More cleanup from David Hildenbrand, this time in the series "mm/memory_hotplug: use PageOffline() instead of PageReserved() for !ZONE_DEVICE". It "enlightens memory hotplug more about PageOffline() pages" and permits the removal of some virtio-mem hacks. - Barry Song's series "mm: clarify folio_add_new_anon_rmap() and __folio_add_anon_rmap()" is a cleanup to the anon folio handling in preparation for mTHP (multisize THP) swapin. - Kefeng Wang's series "mm: improve clear and copy user folio" implements more folio conversions, this time in the area of large folio userspace copying. - The series "Docs/mm/damon/maintaier-profile: document a mailing tool and community meetup series" tells people how to get better involved with other DAMON developers. From SeongJae Park. - A large series ("kmsan: Enable on s390") from Ilya Leoshkevich does that. - David Hildenbrand sends along more cleanups, this time against the migration code. The series is "mm/migrate: move NUMA hinting fault folio isolation + checks under PTL". - Jan Kara has found quite a lot of strangenesses and minor errors in the readahead code. He addresses this in the series "mm: Fix various readahead quirks". - SeongJae Park's series "selftests/damon: test DAMOS tried regions and {min,max}_nr_regions" adds features and addresses errors in DAMON's self testing code. - Gavin Shan has found a userspace-triggerable WARN in the pagecache code. The series "mm/filemap: Limit page cache size to that supported by xarray" addresses this. The series is marked cc:stable. - Chengming Zhou's series "mm/ksm: cmp_and_merge_page() optimizations and cleanup" cleans up and slightly optimizes KSM. - Roman Gushchin has separated the memcg-v1 and memcg-v2 code - lots of code motion. The series (which also makes the memcg-v1 code Kconfigurable) are "mm: memcg: separate legacy cgroup v1 code and put under config option" and "mm: memcg: put cgroup v1-specific memcg data under CONFIG_MEMCG_V1" - Dan Schatzberg's series "Add swappiness argument to memory.reclaim" adds an additional feature to this cgroup-v2 control file. - The series "Userspace controls soft-offline pages" from Jiaqi Yan permits userspace to stop the kernel's automatic treatment of excessive correctable memory errors. In order to permit userspace to monitor and handle this situation. - Kefeng Wang's series "mm: migrate: support poison recover from migrate folio" teaches the kernel to appropriately handle migration from poisoned source folios rather than simply panicing. - SeongJae Park's series "Docs/damon: minor fixups and improvements" does those things. - In the series "mm/zsmalloc: change back to per-size_class lock" Chengming Zhou improves zsmalloc's scalability and memory utilization. - Vivek Kasireddy's series "mm/gup: Introduce memfd_pin_folios() for pinning memfd folios" makes the GUP code use FOLL_PIN rather than bare refcount increments. So these paes can first be moved aside if they reside in the movable zone or a CMA block. - Andrii Nakryiko has added a binary ioctl()-based API to /proc/pid/maps for much faster reading of vma information. The series is "query VMAs from /proc/<pid>/maps". - In the series "mm: introduce per-order mTHP split counters" Lance Yang improves the kernel's presentation of developer information related to multisize THP splitting. - Michael Ellerman has developed the series "Reimplement huge pages without hugepd on powerpc (8xx, e500, book3s/64)". This permits userspace to use all available huge page sizes. - In the series "revert unconditional slab and page allocator fault injection calls" Vlastimil Babka removes a performance-affecting and not very useful feature from slab fault injection. * tag 'mm-stable-2024-07-21-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (411 commits) mm/mglru: fix ineffective protection calculation mm/zswap: fix a white space issue mm/hugetlb: fix kernel NULL pointer dereference when migrating hugetlb folio mm/hugetlb: fix possible recursive locking detected warning mm/gup: clear the LRU flag of a page before adding to LRU batch mm/numa_balancing: teach mpol_to_str about the balancing mode mm: memcg1: convert charge move flags to unsigned long long alloc_tag: fix page_ext_get/page_ext_put sequence during page splitting lib: reuse page_ext_data() to obtain codetag_ref lib: add missing newline character in the warning message mm/mglru: fix overshooting shrinker memory mm/mglru: fix div-by-zero in vmpressure_calc_level() mm/kmemleak: replace strncpy() with strscpy() mm, page_alloc: put should_fail_alloc_page() back behing CONFIG_FAIL_PAGE_ALLOC mm, slab: put should_failslab() back behind CONFIG_SHOULD_FAILSLAB mm: ignore data-race in __swap_writepage hugetlbfs: ensure generic_hugetlb_get_unmapped_area() returns higher address than mmap_min_addr mm: shmem: rename mTHP shmem counters mm: swap_state: use folio_alloc_mpol() in __read_swap_cache_async() mm/migrate: putback split folios when numa hint migration fails ... |
||
Linus Torvalds
|
2c9b351240 |
ARM:
* Initial infrastructure for shadow stage-2 MMUs, as part of nested virtualization enablement * Support for userspace changes to the guest CTR_EL0 value, enabling (in part) migration of VMs between heterogenous hardware * Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of the protocol * FPSIMD/SVE support for nested, including merged trap configuration and exception routing * New command-line parameter to control the WFx trap behavior under KVM * Introduce kCFI hardening in the EL2 hypervisor * Fixes + cleanups for handling presence/absence of FEAT_TCRX * Miscellaneous fixes + documentation updates LoongArch: * Add paravirt steal time support. * Add support for KVM_DIRTY_LOG_INITIALLY_SET. * Add perf kvm-stat support for loongarch. RISC-V: * Redirect AMO load/store access fault traps to guest * perf kvm stat support * Use guest files for IMSIC virtualization, when available ONE_REG support for the Zimop, Zcmop, Zca, Zcf, Zcd, Zcb and Zawrs ISA extensions is coming through the RISC-V tree. s390: * Assortment of tiny fixes which are not time critical x86: * Fixes for Xen emulation. * Add a global struct to consolidate tracking of host values, e.g. EFER * Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the effective APIC bus frequency, because TDX. * Print the name of the APICv/AVIC inhibits in the relevant tracepoint. * Clean up KVM's handling of vendor specific emulation to consistently act on "compatible with Intel/AMD", versus checking for a specific vendor. * Drop MTRR virtualization, and instead always honor guest PAT on CPUs that support self-snoop. * Update to the newfangled Intel CPU FMS infrastructure. * Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as it reads '0' and writes from userspace are ignored. * Misc cleanups x86 - MMU: * Small cleanups, renames and refactoring extracted from the upcoming Intel TDX support. * Don't allocate kvm_mmu_page.shadowed_translation for shadow pages that can't hold leafs SPTEs. * Unconditionally drop mmu_lock when allocating TDP MMU page tables for eager page splitting, to avoid stalling vCPUs when splitting huge pages. * Bug the VM instead of simply warning if KVM tries to split a SPTE that is non-present or not-huge. KVM is guaranteed to end up in a broken state because the callers fully expect a valid SPTE, it's all but dangerous to let more MMU changes happen afterwards. x86 - AMD: * Make per-CPU save_area allocations NUMA-aware. * Force sev_es_host_save_area() to be inlined to avoid calling into an instrumentable function from noinstr code. * Base support for running SEV-SNP guests. API-wise, this includes a new KVM_X86_SNP_VM type, encrypting/measure the initial image into guest memory, and finalizing it before launching it. Internally, there are some gmem/mmu hooks needed to prepare gmem-allocated pages before mapping them into guest private memory ranges. This includes basic support for attestation guest requests, enough to say that KVM supports the GHCB 2.0 specification. There is no support yet for loading into the firmware those signing keys to be used for attestation requests, and therefore no need yet for the host to provide certificate data for those keys. To support fetching certificate data from userspace, a new KVM exit type will be needed to handle fetching the certificate from userspace. An attempt to define a new KVM_EXIT_COCO/KVM_EXIT_COCO_REQ_CERTS exit type to handle this was introduced in v1 of this patchset, but is still being discussed by community, so for now this patchset only implements a stub version of SNP Extended Guest Requests that does not provide certificate data. x86 - Intel: * Remove an unnecessary EPT TLB flush when enabling hardware. * Fix a series of bugs that cause KVM to fail to detect nested pending posted interrupts as valid wake eents for a vCPU executing HLT in L2 (with HLT-exiting disable by L1). * KVM: x86: Suppress MMIO that is triggered during task switch emulation Explicitly suppress userspace emulated MMIO exits that are triggered when emulating a task switch as KVM doesn't support userspace MMIO during complex (multi-step) emulation. Silently ignoring the exit request can result in the WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to userspace for some other reason prior to purging mmio_needed. See commit |
||
Linus Torvalds
|
1c7d0c3af5 |
s390 updates for 6.11 merge window
- Remove restrictions on PAI NNPA and crypto counters, enabling concurrent per-task and system-wide sampling and counting events - Switch to GENERIC_CPU_DEVICES by setting up the CPU present mask in the architecture code and letting the generic code handle CPU bring-up - Add support for the diag204 busy indication facility to prevent undesirable blocking during hypervisor logical CPU utilization queries. Implement results caching - Improve the handling of Store Data SCLP events by suppressing unnecessary warning, preventing buffer release in I/O during failures, and adding timeout handling for Store Data requests to address potential firmware issues - Provide optimized __arch_hweight*() implementations - Remove the unnecessary CPU KOBJ_CHANGE uevents generated during topology updates, as they are unused and also not present on other architectures - Cleanup atomic_ops, optimize __atomic_set() for small values and __atomic_cmpxchg_bool() for compilers supporting flag output constraint - Couple of cleanups for KVM: - Move and improve KVM struct definitions for DAT tables from gaccess.c to a new header - Pass the asce as parameter to sie64a() - Make the crdte() and cspg() page table handling wrappers return a boolean to indicate success, like the other existing "compare and swap" wrappers - Add documentation for HWCAP flags - Switch to obtaining total RAM pages from memblock instead of totalram_pages() during mm init, to ensure correct calculation of zero page size, when defer_init is enabled - Refactor lowcore access and switch to using the get_lowcore() function instead of the S390_lowcore macro - Cleanups for PG_arch_1 and folio handling in UV and hugetlb code - Add missing MODULE_DESCRIPTION() macros - Fix VM_FAULT_HWPOISON handling in do_exception() -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAmaYGegACgkQjYWKoQLX FBjCwwf/aRYoLIXCa9/nHGWFiUjZm6xBgVwZh55bXjfNG9TI2J9UZSsYlOFGUJKl gvD2Ym+LqAejK8R4EUHkfD6ftaKMQuIxNDoedxhwuSpfOQ2mZ5teu0MxTh8QcUAx 4Y2w5XEeCuqE3SuoZ4SJa58K4rGl4cFpPsKNa8ofdzH1ZLFNe8Wqzis4kh0htqLb FtPj6nsgfzQ5kg14rVkGxCa4CqoFxonXgsA6nH6xZLbxKUInyq8uV44UBQ+aJq5v dsdzZ5XuAJHN2FpBuuOYQYZYw3XIy/kka7o4EjffORi5SGCRMWO4Zt0P6HXaNkh6 xV8EEO8myeo7rV8dnrk1V4yGjGJmfA== =3IGY -----END PGP SIGNATURE----- Merge tag 's390-6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Remove restrictions on PAI NNPA and crypto counters, enabling concurrent per-task and system-wide sampling and counting events - Switch to GENERIC_CPU_DEVICES by setting up the CPU present mask in the architecture code and letting the generic code handle CPU bring-up - Add support for the diag204 busy indication facility to prevent undesirable blocking during hypervisor logical CPU utilization queries. Implement results caching - Improve the handling of Store Data SCLP events by suppressing unnecessary warning, preventing buffer release in I/O during failures, and adding timeout handling for Store Data requests to address potential firmware issues - Provide optimized __arch_hweight*() implementations - Remove the unnecessary CPU KOBJ_CHANGE uevents generated during topology updates, as they are unused and also not present on other architectures - Cleanup atomic_ops, optimize __atomic_set() for small values and __atomic_cmpxchg_bool() for compilers supporting flag output constraint - Couple of cleanups for KVM: - Move and improve KVM struct definitions for DAT tables from gaccess.c to a new header - Pass the asce as parameter to sie64a() - Make the crdte() and cspg() page table handling wrappers return a boolean to indicate success, like the other existing "compare and swap" wrappers - Add documentation for HWCAP flags - Switch to obtaining total RAM pages from memblock instead of totalram_pages() during mm init, to ensure correct calculation of zero page size, when defer_init is enabled - Refactor lowcore access and switch to using the get_lowcore() function instead of the S390_lowcore macro - Cleanups for PG_arch_1 and folio handling in UV and hugetlb code - Add missing MODULE_DESCRIPTION() macros - Fix VM_FAULT_HWPOISON handling in do_exception() * tag 's390-6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (54 commits) s390/mm: Fix VM_FAULT_HWPOISON handling in do_exception() s390/kvm: Move bitfields for dat tables s390/entry: Pass the asce as parameter to sie64a() s390/sthyi: Use cached data when diag is busy s390/sthyi: Move diag operations s390/hypfs_diag: Diag204 busy loop s390/diag: Add busy-indication-facility requirements s390/diag: Diag204 add busy return errno s390/diag: Return errno's from diag204 s390/sclp: Diag204 busy indication facility detection s390/atomic_ops: Make use of flag output constraint s390/atomic_ops: Improve __atomic_set() for small values s390/atomic_ops: Use symbolic names s390/smp: Switch to GENERIC_CPU_DEVICES s390/hwcaps: Add documentation for HWCAP flags s390/pgtable: Make crdte() and cspg() return a value s390/topology: Remove CPU KOBJ_CHANGE uevents s390/sclp: Add timeout to Store Data requests s390/sclp: Prevent release of buffer in I/O s390/sclp: Suppress unnecessary Store Data warning ... |
||
Linus Torvalds
|
70045bfc4c |
ftrace: Rewrite of function graph tracer
Up until now, the function graph tracer could only have a single user attached to it. If another user tried to attach to the function graph tracer while one was already attached, it would fail. Allowing function graph tracer to have more than one user has been asked for since 2009, but it required a rewrite to the logic to pull it off so it never happened. Until now! There's three systems that trace the return of a function. That is kretprobes, function graph tracer, and BPF. kretprobes and function graph tracing both do it similarly. The difference is that kretprobes uses a shadow stack per callback and function graph tracer creates a shadow stack for all tasks. The function graph tracer method makes it possible to trace the return of all functions. As kretprobes now needs that feature too, allowing it to use function graph tracer was needed. BPF also wants to trace the return of many probes and its method doesn't scale either. Having it use function graph tracer would improve that. By allowing function graph tracer to have multiple users allows both kretprobes and BPF to use function graph tracer in these cases. This will allow kretprobes code to be removed in the future as it's version will no longer be needed. Note, function graph tracer is only limited to 16 simultaneous users, due to shadow stack size and allocated slots. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZpbWlxQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qgtvAP9jxmgEiEhz4Bpe1vRKVSMYK6ozXHTT 7MFKRMeQqQ8zeAEA2sD5Zrt9l7zKzg0DFpaDLgc3/yh14afIDxzTlIvkmQ8= =umuf -----END PGP SIGNATURE----- Merge tag 'ftrace-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull ftrace updates from Steven Rostedt: "Rewrite of function graph tracer to allow multiple users Up until now, the function graph tracer could only have a single user attached to it. If another user tried to attach to the function graph tracer while one was already attached, it would fail. Allowing function graph tracer to have more than one user has been asked for since 2009, but it required a rewrite to the logic to pull it off so it never happened. Until now! There's three systems that trace the return of a function. That is kretprobes, function graph tracer, and BPF. kretprobes and function graph tracing both do it similarly. The difference is that kretprobes uses a shadow stack per callback and function graph tracer creates a shadow stack for all tasks. The function graph tracer method makes it possible to trace the return of all functions. As kretprobes now needs that feature too, allowing it to use function graph tracer was needed. BPF also wants to trace the return of many probes and its method doesn't scale either. Having it use function graph tracer would improve that. By allowing function graph tracer to have multiple users allows both kretprobes and BPF to use function graph tracer in these cases. This will allow kretprobes code to be removed in the future as it's version will no longer be needed. Note, function graph tracer is only limited to 16 simultaneous users, due to shadow stack size and allocated slots" * tag 'ftrace-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (49 commits) fgraph: Use str_plural() in test_graph_storage_single() function_graph: Add READ_ONCE() when accessing fgraph_array[] ftrace: Add missing kerneldoc parameters to unregister_ftrace_direct() function_graph: Everyone uses HAVE_FUNCTION_GRAPH_RET_ADDR_PTR, remove it function_graph: Fix up ftrace_graph_ret_addr() function_graph: Make fgraph_update_pid_func() a stub for !DYNAMIC_FTRACE function_graph: Rename BYTE_NUMBER to CHAR_NUMBER in selftests fgraph: Remove some unused functions ftrace: Hide one more entry in stack trace when ftrace_pid is enabled function_graph: Do not update pid func if CONFIG_DYNAMIC_FTRACE not enabled function_graph: Make fgraph_do_direct static key static ftrace: Fix prototypes for ftrace_startup/shutdown_subops() ftrace: Assign RCU list variable with rcu_assign_ptr() ftrace: Assign ftrace_list_end to ftrace_ops_list type cast to RCU ftrace: Declare function_trace_op in header to quiet sparse warning ftrace: Add comments to ftrace_hash_move() and friends ftrace: Convert "inc" parameter to bool in ftrace_hash_rec_update_modify() ftrace: Add comments to ftrace_hash_rec_disable/enable() ftrace: Remove "filter_hash" parameter from __ftrace_hash_rec_update() ftrace: Rename dup_hash() and comment it ... |
||
Gerald Schaefer
|
df39038cd8 |
s390/mm: Fix VM_FAULT_HWPOISON handling in do_exception()
There is no support for HWPOISON, MEMORY_FAILURE, or ARCH_HAS_COPY_MC on s390. Therefore we do not expect to see VM_FAULT_HWPOISON in do_exception(). However, since commit |
||
Linus Torvalds
|
51835949dd |
Networking changes for 6.11. Not much excitement - a handful of large
patchsets (devmem among them) did not make it in time. Core & protocols ---------------- - Use local_lock in addition to local_bh_disable() to protect per-CPU resources in networking, a step closer for local_bh_disable() not to act as a big lock on PREEMPT_RT. - Use flex array for netdevice priv area, ensure its cache alignment. - Add a sysctl knob to allow user to specify a default rto_min at socket init time. Bit of a big hammer but multiple companies were independently carrying such patch downstream so clearly it's useful. - Support scheduling transmission of packets based on CLOCK_TAI. - Un-pin TCP TIMEWAIT timer to avoid it firing on CPUs later cordoned off using cpusets. - Support multiple L2TPv3 UDP tunnels using the same 5-tuple address. - Allow configuration of multipath hash seed, to both allow synchronizing hashing of two routers, and preventing partial accidental sync. - Improve TCP compliance with RFC 9293 for simultaneous connect(). - Support sending NAT keepalives in IPsec ESP in UDP states. Userspace IKE daemon had to do this before, but the kernel can better keep track of it. - Support sending supervision HSR frames with MAC addresses stored in ProxyNodeTable when RedBox (i.e. HSR-SAN) is enabled. - Introduce IPPROTO_SMC for selecting SMC when socket is created. - Allow UDP GSO transmit from devices with no checksum offload. - openvswitch: add packet sampling via psample, separating the sampled traffic from "upcall" packets sent to user space for forwarding. - nf_tables: shrink memory consumption for transaction objects. Things we sprinkled into general kernel code -------------------------------------------- - Power Sequencing subsystem (used by Qualcomm Bluetooth driver for QCA6390). - Add IRQ information in sysfs for auxiliary bus. - Introduce guard definition for local_lock. - Add aligned flavor of __cacheline_group_{begin, end}() markings for grouping fields in structures. BPF --- - Notify user space (via epoll) when a struct_ops object is getting detached/unregistered. - Add new kfuncs for a generic, open-coded bits iterator. - Enable BPF programs to declare arrays of kptr, bpf_rb_root, and bpf_list_head. - Support resilient split BTF which cuts down on duplication and makes BTF as compact as possible WRT BTF from modules. - Add support for dumping kfunc prototypes from BTF which enables both detecting as well as dumping compilable prototypes for kfuncs. - riscv64 BPF JIT improvements in particular to add 12-argument support for BPF trampolines and to utilize bpf_prog_pack for the latter. - Add the capability to offload the netfilter flowtable in XDP layer through kfuncs. Driver API ---------- - Allow users to configure IRQ tresholds between which automatic IRQ moderation can choose. - Expand Power Sourcing (PoE) status with power, class and failure reason. Support setting power limits. - Track additional RSS contexts in the core, make sure configuration changes don't break them. - Support IPsec crypto offload for IPv6 ESP and IPv4 UDP-encapsulated ESP data paths. - Support updating firmware on SFP modules. Tests and tooling ----------------- - mptcp: use net/lib.sh to manage netns. - TCP-AO and TCP-MD5: replace debug prints used by tests with tracepoints. - openvswitch: make test self-contained (don't depend on OvS CLI tools). Drivers ------- - Ethernet high-speed NICs: - Broadcom (bnxt): - increase the max total outstanding PTP TX packets to 4 - add timestamping statistics support - implement netdev_queue_mgmt_ops - support new RSS context API - Intel (100G, ice, idpf): - implement FEC statistics and dumping signal quality indicators - support E825C products (with 56Gbps PHYs) - nVidia/Mellanox: - support HW-GRO - mlx4/mlx5: support per-queue statistics via netlink - obey the max number of EQs setting in sub-functions - AMD/Solarflare: - support new RSS context API - AMD/Pensando: - ionic: rework fix for doorbell miss to lower overhead and skip it on new HW - Wangxun: - txgbe: support Flow Director perfect filters - Ethernet NICs consumer, embedded and virtual: - Add driver for Tehuti Networks TN40xx chips - Add driver for Meta's internal NIC chips - Add driver for Ethernet MAC on Airoha EN7581 SoCs - Add driver for Renesas Ethernet-TSN devices - Google cloud vNIC: - flow steering support - Microsoft vNIC: - support page sizes other than 4KB on ARM64 - vmware vNIC: - support latency measurement (update to version 9) - VirtIO net: - support for Byte Queue Limits - support configuring thresholds for automatic IRQ moderation - support for AF_XDP Rx zero-copy - Synopsys (stmmac): - support for STM32MP13 SoC - let platforms select the right PCS implementation - TI: - icssg-prueth: add multicast filtering support - icssg-prueth: enable PTP timestamping and PPS - Renesas: - ravb: improve Rx performance 30-400% by using page pool, theaded NAPI and timer-based IRQ coalescing - ravb: add MII support for R-Car V4M - Cadence (macb): - macb: add ARP support to Wake-On-LAN - Cortina: - use phylib for RX and TX pause configuration - Ethernet switches: - nVidia/Mellanox: - support configuration of multipath hash seed - report more accurate max MTU - use page_pool to improve Rx performance - MediaTek: - mt7530: add support for bridge port isolation - Qualcomm: - qca8k: add support for bridge port isolation - Microchip: - lan9371/2: add 100BaseTX PHY support - NXP: - vsc73xx: implement VLAN operations - Ethernet PHYs: - aquantia: enable support for aqr115c - aquantia: add support for PHY LEDs - realtek: add support for rtl8224 2.5Gbps PHY - xpcs: add memory-mapped device support - add BroadR-Reach link mode and support in Broadcom's PHY driver - CAN: - add document for ISO 15765-2 protocol support - mcp251xfd: workaround for erratum DS80000789E, use timestamps to catch when device returns incorrect FIFO status - WiFi: - mac80211/cfg80211: - parse Transmit Power Envelope (TPE) data in mac80211 instead of in drivers - improvements for 6 GHz regulatory flexibility - multi-link improvements - support multiple radios per wiphy - remove DEAUTH_NEED_MGD_TX_PREP flag - Intel (iwlwifi): - bump FW API to 91 for BZ/SC devices - report 64-bit radiotap timestamp - enable P2P low latency by default - handle Transmit Power Envelope (TPE) advertised by AP - remove support for older FW for new devices - fast resume (keeping the device configured) - mvm: re-enable Multi-Link Operation (MLO) - aggregation (A-MSDU) optimizations - MediaTek (mt76): - mt7925 Multi-Link Operation (MLO) support - Qualcomm (ath10k): - LED support for various chipsets - Qualcomm (ath12k): - remove unsupported Tx monitor handling - support channel 2 in 6 GHz band - support Spatial Multiplexing Power Save (SMPS) in 6 GHz band - supprt multiple BSSID (MBSSID) and Enhanced Multi-BSSID Advertisements (EMA) - support dynamic VLAN - add panic handler for resetting the firmware state - DebugFS support for datapath statistics - WCN7850: support for Wake on WLAN - Microchip (wilc1000): - read MAC address during probe to make it visible to user space - suspend/resume improvements - TI (wl18xx): - support newer firmware versions - RealTek (rtw89): - preparation for RTL8852BE-VT support - Wake on WLAN support for WiFi 6 chips - 36-bit PCI DMA support - RealTek (rtlwifi): - RTL8192DU support - Broadcom (brcmfmac): - Management Frame Protection support (to enable WPA3) - Bluetooth: - qualcomm: use the power sequencer for QCA6390 - btusb: mediatek: add ISO data transmission functions - hci_bcm4377: add BCM4388 support - btintel: add support for BlazarU core - btintel: add support for Whale Peak2 - btnxpuart: add support for AW693 A1 chipset - btnxpuart: add support for IW615 chipset - btusb: add Realtek RTL8852BE support ID 0x13d3:0x3591 Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmaWjBwACgkQMUZtbf5S IrvuSRAAkJuEzTRqgURBCe4eNEQde6mJJig7l2CKHwCbFiHZpRkFHf8qKbcGWbL6 uLW33SWnKtJVDhxVKWHLq635XW7BAa80YhqGw21GDi+mIEhWXZglHj3xbXNxsMfE 4eg/kG4BkfYWFmHaXOwVWV/mr7nXf6j7WmXNeXEi32ufE1j0OL+YlQenKnMj8yP2 j9JmYa2Chwppng1SblHmcjmGkdNVwFhStKeCG+2K7v06wdDH/QYBlbgUv9gw/cxp NlW//wgiaeX40U4O3kDwt9C+LDoh+0VrDDeVdQ+IsScLtY3PhAzEoKolFYTq2HSr I1JpoaHNnyNsJq3DZrACQ5WlH4yDn6C2EUB6dxNnFaI9F1ZPsi+7MTl6Sei1AklD TuQTj/lxOACBwW2Q77NU72uoxiIUauesGPHcnrAFuoCIEhZF0mso7k59BvrXhsOP QwcLbQdc1YHNkqv/Vc7NBY+ruMsYB+5Ubbhhj2p27dp/CWFIwxI29fze4dn2uhO6 ejHN3mbqwPdSzg12YJtM6Iq61Cnwo2eVSvhTxl+ZVSZtI4nu2arzR+y7QTYmNrXP 6tkgVN9UsWeLl2xJ8wyyqL5mcvNHP2rPXWZ2X56iTaa26m+UlleeQ7YRaYtQAAr0 Ec/vlDMX64SwHhd+qwE99DXGQf2g+KklHKSLsnajJUVrWFTlRI0= =opz8 -----END PGP SIGNATURE----- Merge tag 'net-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Not much excitement - a handful of large patchsets (devmem among them) did not make it in time. Core & protocols: - Use local_lock in addition to local_bh_disable() to protect per-CPU resources in networking, a step closer for local_bh_disable() not to act as a big lock on PREEMPT_RT - Use flex array for netdevice priv area, ensure its cache alignment - Add a sysctl knob to allow user to specify a default rto_min at socket init time. Bit of a big hammer but multiple companies were independently carrying such patch downstream so clearly it's useful - Support scheduling transmission of packets based on CLOCK_TAI - Un-pin TCP TIMEWAIT timer to avoid it firing on CPUs later cordoned off using cpusets - Support multiple L2TPv3 UDP tunnels using the same 5-tuple address - Allow configuration of multipath hash seed, to both allow synchronizing hashing of two routers, and preventing partial accidental sync - Improve TCP compliance with RFC 9293 for simultaneous connect() - Support sending NAT keepalives in IPsec ESP in UDP states. Userspace IKE daemon had to do this before, but the kernel can better keep track of it - Support sending supervision HSR frames with MAC addresses stored in ProxyNodeTable when RedBox (i.e. HSR-SAN) is enabled - Introduce IPPROTO_SMC for selecting SMC when socket is created - Allow UDP GSO transmit from devices with no checksum offload - openvswitch: add packet sampling via psample, separating the sampled traffic from "upcall" packets sent to user space for forwarding - nf_tables: shrink memory consumption for transaction objects Things we sprinkled into general kernel code: - Power Sequencing subsystem (used by Qualcomm Bluetooth driver for QCA6390) [ Already merged separately - Linus ] - Add IRQ information in sysfs for auxiliary bus - Introduce guard definition for local_lock - Add aligned flavor of __cacheline_group_{begin, end}() markings for grouping fields in structures BPF: - Notify user space (via epoll) when a struct_ops object is getting detached/unregistered - Add new kfuncs for a generic, open-coded bits iterator - Enable BPF programs to declare arrays of kptr, bpf_rb_root, and bpf_list_head - Support resilient split BTF which cuts down on duplication and makes BTF as compact as possible WRT BTF from modules - Add support for dumping kfunc prototypes from BTF which enables both detecting as well as dumping compilable prototypes for kfuncs - riscv64 BPF JIT improvements in particular to add 12-argument support for BPF trampolines and to utilize bpf_prog_pack for the latter - Add the capability to offload the netfilter flowtable in XDP layer through kfuncs Driver API: - Allow users to configure IRQ tresholds between which automatic IRQ moderation can choose - Expand Power Sourcing (PoE) status with power, class and failure reason. Support setting power limits - Track additional RSS contexts in the core, make sure configuration changes don't break them - Support IPsec crypto offload for IPv6 ESP and IPv4 UDP-encapsulated ESP data paths - Support updating firmware on SFP modules Tests and tooling: - mptcp: use net/lib.sh to manage netns - TCP-AO and TCP-MD5: replace debug prints used by tests with tracepoints - openvswitch: make test self-contained (don't depend on OvS CLI tools) Drivers: - Ethernet high-speed NICs: - Broadcom (bnxt): - increase the max total outstanding PTP TX packets to 4 - add timestamping statistics support - implement netdev_queue_mgmt_ops - support new RSS context API - Intel (100G, ice, idpf): - implement FEC statistics and dumping signal quality indicators - support E825C products (with 56Gbps PHYs) - nVidia/Mellanox: - support HW-GRO - mlx4/mlx5: support per-queue statistics via netlink - obey the max number of EQs setting in sub-functions - AMD/Solarflare: - support new RSS context API - AMD/Pensando: - ionic: rework fix for doorbell miss to lower overhead and skip it on new HW - Wangxun: - txgbe: support Flow Director perfect filters - Ethernet NICs consumer, embedded and virtual: - Add driver for Tehuti Networks TN40xx chips - Add driver for Meta's internal NIC chips - Add driver for Ethernet MAC on Airoha EN7581 SoCs - Add driver for Renesas Ethernet-TSN devices - Google cloud vNIC: - flow steering support - Microsoft vNIC: - support page sizes other than 4KB on ARM64 - vmware vNIC: - support latency measurement (update to version 9) - VirtIO net: - support for Byte Queue Limits - support configuring thresholds for automatic IRQ moderation - support for AF_XDP Rx zero-copy - Synopsys (stmmac): - support for STM32MP13 SoC - let platforms select the right PCS implementation - TI: - icssg-prueth: add multicast filtering support - icssg-prueth: enable PTP timestamping and PPS - Renesas: - ravb: improve Rx performance 30-400% by using page pool, theaded NAPI and timer-based IRQ coalescing - ravb: add MII support for R-Car V4M - Cadence (macb): - macb: add ARP support to Wake-On-LAN - Cortina: - use phylib for RX and TX pause configuration - Ethernet switches: - nVidia/Mellanox: - support configuration of multipath hash seed - report more accurate max MTU - use page_pool to improve Rx performance - MediaTek: - mt7530: add support for bridge port isolation - Qualcomm: - qca8k: add support for bridge port isolation - Microchip: - lan9371/2: add 100BaseTX PHY support - NXP: - vsc73xx: implement VLAN operations - Ethernet PHYs: - aquantia: enable support for aqr115c - aquantia: add support for PHY LEDs - realtek: add support for rtl8224 2.5Gbps PHY - xpcs: add memory-mapped device support - add BroadR-Reach link mode and support in Broadcom's PHY driver - CAN: - add document for ISO 15765-2 protocol support - mcp251xfd: workaround for erratum DS80000789E, use timestamps to catch when device returns incorrect FIFO status - WiFi: - mac80211/cfg80211: - parse Transmit Power Envelope (TPE) data in mac80211 instead of in drivers - improvements for 6 GHz regulatory flexibility - multi-link improvements - support multiple radios per wiphy - remove DEAUTH_NEED_MGD_TX_PREP flag - Intel (iwlwifi): - bump FW API to 91 for BZ/SC devices - report 64-bit radiotap timestamp - enable P2P low latency by default - handle Transmit Power Envelope (TPE) advertised by AP - remove support for older FW for new devices - fast resume (keeping the device configured) - mvm: re-enable Multi-Link Operation (MLO) - aggregation (A-MSDU) optimizations - MediaTek (mt76): - mt7925 Multi-Link Operation (MLO) support - Qualcomm (ath10k): - LED support for various chipsets - Qualcomm (ath12k): - remove unsupported Tx monitor handling - support channel 2 in 6 GHz band - support Spatial Multiplexing Power Save (SMPS) in 6 GHz band - supprt multiple BSSID (MBSSID) and Enhanced Multi-BSSID Advertisements (EMA) - support dynamic VLAN - add panic handler for resetting the firmware state - DebugFS support for datapath statistics - WCN7850: support for Wake on WLAN - Microchip (wilc1000): - read MAC address during probe to make it visible to user space - suspend/resume improvements - TI (wl18xx): - support newer firmware versions - RealTek (rtw89): - preparation for RTL8852BE-VT support - Wake on WLAN support for WiFi 6 chips - 36-bit PCI DMA support - RealTek (rtlwifi): - RTL8192DU support - Broadcom (brcmfmac): - Management Frame Protection support (to enable WPA3) - Bluetooth: - qualcomm: use the power sequencer for QCA6390 - btusb: mediatek: add ISO data transmission functions - hci_bcm4377: add BCM4388 support - btintel: add support for BlazarU core - btintel: add support for Whale Peak2 - btnxpuart: add support for AW693 A1 chipset - btnxpuart: add support for IW615 chipset - btusb: add Realtek RTL8852BE support ID 0x13d3:0x3591" * tag 'net-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1589 commits) eth: fbnic: Fix spelling mistake "tiggerring" -> "triggering" tcp: Replace strncpy() with strscpy() wifi: ath12k: fix build vs old compiler tcp: Don't access uninit tcp_rsk(req)->ao_keyid in tcp_create_openreq_child(). eth: fbnic: Write the TCAM tables used for RSS control and Rx to host eth: fbnic: Add L2 address programming eth: fbnic: Add basic Rx handling eth: fbnic: Add basic Tx handling eth: fbnic: Add link detection eth: fbnic: Add initial messaging to notify FW of our presence eth: fbnic: Implement Rx queue alloc/start/stop/free eth: fbnic: Implement Tx queue alloc/start/stop/free eth: fbnic: Allocate a netdevice and napi vectors with queues eth: fbnic: Add FW communication mechanism eth: fbnic: Add message parsing for FW messages eth: fbnic: Add register init to set PCIe/Ethernet device config eth: fbnic: Allocate core device specific structures and devlink interface eth: fbnic: Add scaffolding for Meta's NIC driver PCI: Add Meta Platforms vendor ID net/sched: cls_flower: propagate tca[TCA_OPTIONS] to NL_REQ_ATTR_CHECK ... |
||
Linus Torvalds
|
d80f2996b8 |
asm-generic updates for 6.11
Most of this is part of my ongoing work to clean up the system call tables. In this bit, all of the newer architectures are converted to use the machine readable syscall.tbl format instead in place of complex macros in include/uapi/asm-generic/unistd.h. This follows an earlier series that fixed various API mismatches and in turn is used as the base for planned simplifications. The other two patches are dead code removal and a warning fix. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmaVB1cACgkQYKtH/8kJ UicMqxAAnYKOxfjoMIhYYK6bl126wg/vIcDcjIR9cNWH21Nhn3qxn11ZXau3S7xv 3l/HreEhyEQr4gC2a70IlXyHUadYOlrk+83OURrunWk1oKPmZlMKcfPVbtp8GL7x PUNXQfwM1XZLveKwufY24hoZdwKC+Y/5WLc1t0ReznJuAqgeO2rM9W5dnV5bAfCp he3F5hFcr196Dz3/GJjJIWrY+cbwfmZWsNtj1vFTL5/r/LuCu8HTkqhsGj8tE5BJ NGVEEXbp5eaVTCIGqJWhnuZcsnKN9kM51M7CtdwWf8OTckUVuJap5OsDVKQkWkGl bLPbd2jhDltph0sah51hAIvv4WdkThW76u9FRW7KR3fo7ra67eF7l5j7wc1lE2JB GwLJ1X56Bxe1GhvvNTlDmb7DrnlP/DMPuRv3Z6xyH6l8iZ2pMGlnAxuw6Bs1s6Y5 WSs36ZpnS0ctgjfx37ZITsZSvbKFPpQFJP4siwS8aRNv/NFALNNdFyOCY5lNzspZ 0dxwjn6/7UpHE4MKh6/hvCg2QwupXXBTRytibw+75/rOsR+EYlmtuONtyq2sLUHe ktJ5pg+8XuZm27+wLffuluzmY7sv2F8OU4cTYeM60Ynmc6pRzwUY6/VhG52S1/mU Ua4VgYIpzOtlLrYmz5QTWIZpdSFSVbIc/3pLriD6hn4Mvg+BwdA= =XOhL -----END PGP SIGNATURE----- Merge tag 'asm-generic-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic updates from Arnd Bergmann: "Most of this is part of my ongoing work to clean up the system call tables. In this bit, all of the newer architectures are converted to use the machine readable syscall.tbl format instead in place of complex macros in include/uapi/asm-generic/unistd.h. This follows an earlier series that fixed various API mismatches and in turn is used as the base for planned simplifications. The other two patches are dead code removal and a warning fix" * tag 'asm-generic-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: vmlinux.lds.h: catch .bss..L* sections into BSS") fixmap: Remove unused set_fixmap_offset_io() riscv: convert to generic syscall table openrisc: convert to generic syscall table nios2: convert to generic syscall table loongarch: convert to generic syscall table hexagon: use new system call table csky: convert to generic syscall table arm64: rework compat syscall macros arm64: generate 64-bit syscall.tbl arm64: convert unistd_32.h to syscall.tbl format arc: convert to generic syscall table clone3: drop __ARCH_WANT_SYS_CLONE3 macro kbuild: add syscall table generation to scripts/Makefile.asm-headers kbuild: verify asm-generic header list loongarch: avoid generating extra header files um: don't generate asm/bpf_perf_event.h csky: drop asm/gpio.h wrapper syscalls: add generic scripts/syscall.tbl |
||
Paolo Bonzini
|
86014c1e20 |
KVM generic changes for 6.11
- Enable halt poll shrinking by default, as Intel found it to be a clear win. - Setup empty IRQ routing when creating a VM to avoid having to synchronize SRCU when creating a split IRQCHIP on x86. - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag that arch code can use for hooking both sched_in() and sched_out(). - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid truncating a bogus value from userspace, e.g. to help userspace detect bugs. - Mark a vCPU as preempted if and only if it's scheduled out while in the KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest memory when retrieving guest state during live migration blackout. - A few minor cleanups -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmaRuOYACgkQOlYIJqCj N/1UnQ/8CI5Qfr+/0gzYgtWmtEMczGG+rMNpzD3XVqPjJjXcMcBiQnplnzUVLhha vlPdYVK7vgmEt003XGzV55mik46LHL+DX/v4hI3HEdblfyCeNLW3fKEWVRB44qJe o+YUQwSK42SORUp9oXuQINxhA//U9EnI7CQxlJ8w8wenv5IJKfIGr01DefmfGPAV PKm9t6WLcNqvhZMEyy/zmzM3KVPCJL0NcwI97x6sHxFpQYIDtL0E/VexA4AFqMoT QK7cSDC/2US41Zvem/r/GzM/ucdF6vb9suzZYBohwhxtVhwJe2CDeYQZvtNKJ1U7 GOHPaKL6nBWdZCm/yyWbbX2nstY1lHqxhN3JD0X8wqU5rNcwm2b8Vfyav0Ehc7H+ jVbDTshOx4YJmIgajoKjgM050rdBK59TdfVL+l+AAV5q/TlHocalYtvkEBdGmIDg 2td9UHSime6sp20vQfczUEz4bgrQsh4l2Fa/qU2jFwLievnBw0AvEaMximkSGMJe b8XfjmdTjlOesWAejANKtQolfrq14+1wYw0zZZ8PA+uNVpKdoovmcqSOcaDC9bT8 GO/NFUvoG+lkcvJcIlo1SSl81SmGLosijwxWfGvFAqsgpR3/3l3dYp0QtztoCNJO d3+HnjgYn5o5FwufuTD3eUOXH4AFjG108DH0o25XrIkb2Kymy0o= =BalU -----END PGP SIGNATURE----- Merge tag 'kvm-x86-generic-6.11' of https://github.com/kvm-x86/linux into HEAD KVM generic changes for 6.11 - Enable halt poll shrinking by default, as Intel found it to be a clear win. - Setup empty IRQ routing when creating a VM to avoid having to synchronize SRCU when creating a split IRQCHIP on x86. - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag that arch code can use for hooking both sched_in() and sched_out(). - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid truncating a bogus value from userspace, e.g. to help userspace detect bugs. - Mark a vCPU as preempted if and only if it's scheduled out while in the KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest memory when retrieving guest state during live migration blackout. - A few minor cleanups |
||
Christophe Leroy
|
e6c0c03245 |
mm: provide mm_struct and address to huge_ptep_get()
On powerpc 8xx huge_ptep_get() will need to know whether the given ptep is a PTE entry or a PMD entry. This cannot be known with the PMD entry itself because there is no easy way to know it from the content of the entry. So huge_ptep_get() will need to know either the size of the page or get the pmd. In order to be consistent with huge_ptep_get_and_clear(), give mm and address to huge_ptep_get(). Link: https://lkml.kernel.org/r/cc00c70dd384298796a4e1b25d6c4eb306d3af85.1719928057.git.christophe.leroy@csgroup.eu Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
Paolo Bonzini
|
c8b8b8190a |
LoongArch KVM changes for v6.11
1. Add ParaVirt steal time support. 2. Add some VM migration enhancement. 3. Add perf kvm-stat support for loongarch. -----BEGIN PGP SIGNATURE----- iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmaOS6UWHGNoZW5odWFj YWlAa2VybmVsLm9yZwAKCRAChivD8uImehejD/9pACGe3h3krXLcFVWXOFIu5Hpc 5kQLP0lSPJ/o5Xs8t/oPLrnDX70z90wXI1LOmltc7h32MSwFa2l8COQh+sN5eJBQ PNyt7u7bMipp0yJS4Gl3LQQ5vklcGOSpQc/gbeXnVx8J/tz+Mo9YGGLIXVRXRM6W Ri8D2VVFiwzQQYeTpPo1u1Ob8C6mA4KOppwvhscMTM3vj4NMbsinBzRnR0lG0Tdw meFhxDPly1Ksxsbnj9UGO6UnEY0A2SLONs6MiO4y4DtoqoDlw/lbqFJuYo4vvbx1 pxtjyirD/PX/wjslQFWUOuU0hMfAodera+JupZ5BZWfcG8FltA4DQfDsm/U9RjK/ 7gGNnr8Xk2/tp6+4AVV+HU2iTgRvq+mXCL72zSy2Y4r7ElBAANDfk4n+Zn/PWisn U9wwV8Ue7tVB15BRpRsg77NzBidiCFEe/6flWYiX2y24ke71gwDJBGUy8hMdKt6t 4Cq8atsU0MvDAzfYMsK9JjskJp4UFq6wb1tXbbuADM4TDhnzlK6s6h3vM+pFlh/f my7fDH8/2qsCWhBDM4pmsJskVp+I1GOk/80RjTQISwx7iHktJWvxNYTaisK2fvD5 Qs1IUWfNFbDX0Lr0QpN6j6X4rZkghR4R6XoFkd4nkicwi+UHVn3oK9GSqv24QJn9 7+Ev3dfRTUYLd6mC4Q== =DpIK -----END PGP SIGNATURE----- Merge tag 'loongarch-kvm-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD LoongArch KVM changes for v6.11 1. Add ParaVirt steal time support. 2. Add some VM migration enhancement. 3. Add perf kvm-stat support for loongarch. |
||
Jakub Kicinski
|
7c8267275d |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR. Conflicts: net/sched/act_ct.c |
||
Claudio Imbrenda
|
275d05ce06 |
s390/kvm: Move bitfields for dat tables
Move and improve the struct definitions for DAT tables from gaccess.c to a new header. Once in a separate header, the structs become available everywhere. One possible usecase is to merge them in the s390 pte_t and p?d_t definitions, which is left as an exercise for the reader. Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Nico Boehr <nrb@linux.ibm.com> Link: https://lore.kernel.org/r/20240703155900.103783-3-imbrenda@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Claudio Imbrenda
|
723ac2d6ba |
s390/entry: Pass the asce as parameter to sie64a()
Pass the guest ASCE explicitly as parameter, instead of having sie64a() take it from lowcore. This removes hidden state from lowcore, and makes things look cleaner. Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Nico Boehr <nrb@linux.ibm.com> Link: https://lore.kernel.org/r/20240703155900.103783-2-imbrenda@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Mete Durlu
|
b7a5e5dfbd |
s390/sthyi: Use cached data when diag is busy
When sthyi is being emulated, data from diag204 is used. If diag204 returns busy, previously cached sthyi info block is returned to the caller and cache expiry is set to expired. Acked-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Tobias Huschle <huschle@linux.ibm.com> Signed-off-by: Tobias Huschle <huschle@linux.ibm.com> Signed-off-by: Mete Durlu <meted@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Mete Durlu
|
6fdf72c9a9 |
s390/sthyi: Move diag operations
Move diag204 related operations to their own functions for better error handling and better readability. Acked-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Tobias Huschle <huschle@linux.ibm.com> Signed-off-by: Mete Durlu <meted@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Mete Durlu
|
f449395421 |
s390/hypfs_diag: Diag204 busy loop
When diag204 busy-indiciation facility is installed and diag204 is returning busy, hypfs diag204 handler now does an interruptable busy wait until diag204 is no longer busy. If there is a signal pending, call would be restarted with -ERESTARTSYSCALL, except for fatal signals. Acked-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Tobias Huschle <huschle@linux.ibm.com> Signed-off-by: Mete Durlu <meted@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Mete Durlu
|
97999f8c62 |
s390/diag: Add busy-indication-facility requirements
To verify if busy indication facility is installed or not sclp bits has to be checked. Add a function that checks sclp to improve readability. Add busy-indication-request bit mask for diag204 subcodes. Acked-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Tobias Huschle <huschle@linux.ibm.com> Signed-off-by: Mete Durlu <meted@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Mete Durlu
|
df7e714d6d |
s390/diag: Diag204 add busy return errno
When diag204-busy-indication facility is installed, diag204 can return '8' which means device is busy and no operation is done. Add check for return codes of diag204 call. Return error codes according to diag204 return codes. Acked-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Tobias Huschle <huschle@linux.ibm.com> Signed-off-by: Mete Durlu <meted@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Mete Durlu
|
bb9be93acb |
s390/diag: Return errno's from diag204
Return different errno's from diag204 to allow users to handle them accordingly. Instead of returning -1 regardless of the failing condition, return -EINVAL on invalid memory address and -EOPNOTSUPP when diag instruction fails. Acked-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Tobias Huschle <huschle@linux.ibm.com> Signed-off-by: Mete Durlu <meted@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Mete Durlu
|
7455a33179 |
s390/sclp: Diag204 busy indication facility detection
Detect diag204 busy indication facility. Acked-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Tobias Huschle <huschle@linux.ibm.com> Signed-off-by: Mete Durlu <meted@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Heiko Carstens
|
279a0164e0 |
s390/atomic_ops: Make use of flag output constraint
With gcc 14.1.0 support for flag output constraint was added for s390. Use this for __atomic_cmpxchg_bool(). This allows for slightly better code, since the compiler can generate code depending on the condition code which is the result of an inline assembly. The size of the kernel image is reduced by ~12kb. Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Heiko Carstens
|
ee19370c92 |
s390/atomic_ops: Improve __atomic_set() for small values
Use mvhi/mvghi for small constant values within the __atomic_set() inline assemblies. This avoids loading the specified value into a register. The size of the kernel image is reduced by ~1.2kb. Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Heiko Carstens
|
f2ed8367bf |
s390/atomic_ops: Use symbolic names
Consistently use symbolic names in all atomic ops inline assemblies. Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Sven Schnelle
|
4a39f12e75 |
s390/smp: Switch to GENERIC_CPU_DEVICES
Instead of setting up non-boot CPUs early in architecture code, only setup the cpu present mask and let the generic code handle cpu bringup. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> |
||
Arnd Bergmann
|
505d66d1ab |
clone3: drop __ARCH_WANT_SYS_CLONE3 macro
When clone3() was introduced, it was not obvious how each architecture deals with setting up the stack and keeping the register contents in a fork()-like system call, so this was left for the architecture maintainers to implement, with __ARCH_WANT_SYS_CLONE3 defined by those that already implement it. Five years later, we still have a few architectures left that are missing clone3(), and the macro keeps getting in the way as it's fundamentally different from all the other __ARCH_WANT_SYS_* macros that are meant to provide backwards-compatibility with applications using older syscalls that are no longer provided by default. Address this by reversing the polarity of the macro, adding an __ARCH_BROKEN_SYS_CLONE3 macro to all architectures that don't already provide the syscall, and remove __ARCH_WANT_SYS_CLONE3 from all the other ones. Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> |
||
Paolo Abeni
|
7b769adc26 |
bpf-next-for-netdev
-----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZoxN0AAKCRDbK58LschI g0c5AQDa3ZV9gfbN42y1zSDoM1uOgO60fb+ydxyOYh8l3+OiQQD/fLfpTY3gBFSY 9yi/pZhw/QdNzQskHNIBrHFGtJbMxgs= =p1Zz -----END PGP SIGNATURE----- Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2024-07-08 The following pull-request contains BPF updates for your *net-next* tree. We've added 102 non-merge commits during the last 28 day(s) which contain a total of 127 files changed, 4606 insertions(+), 980 deletions(-). The main changes are: 1) Support resilient split BTF which cuts down on duplication and makes BTF as compact as possible wrt BTF from modules, from Alan Maguire & Eduard Zingerman. 2) Add support for dumping kfunc prototypes from BTF which enables both detecting as well as dumping compilable prototypes for kfuncs, from Daniel Xu. 3) Batch of s390x BPF JIT improvements to add support for BPF arena and to implement support for BPF exceptions, from Ilya Leoshkevich. 4) Batch of riscv64 BPF JIT improvements in particular to add 12-argument support for BPF trampolines and to utilize bpf_prog_pack for the latter, from Pu Lehui. 5) Extend BPF test infrastructure to add a CHECKSUM_COMPLETE validation option for skbs and add coverage along with it, from Vadim Fedorenko. 6) Inline bpf_get_current_task/_btf() helpers in the arm64 BPF JIT which gives a small 1% performance improvement in micro-benchmarks, from Puranjay Mohan. 7) Extend the BPF verifier to track the delta between linked registers in order to better deal with recent LLVM code optimizations, from Alexei Starovoitov. 8) Fix bpf_wq_set_callback_impl() kfunc signature where the third argument should have been a pointer to the map value, from Benjamin Tissoires. 9) Extend BPF selftests to add regular expression support for test output matching and adjust some of the selftest when compiled under gcc, from Cupertino Miranda. 10) Simplify task_file_seq_get_next() and remove an unnecessary loop which always iterates exactly once anyway, from Dan Carpenter. 11) Add the capability to offload the netfilter flowtable in XDP layer through kfuncs, from Florian Westphal & Lorenzo Bianconi. 12) Various cleanups in networking helpers in BPF selftests to shave off a few lines of open-coded functions on client/server handling, from Geliang Tang. 13) Properly propagate prog->aux->tail_call_reachable out of BPF verifier, so that x86 JIT does not need to implement detection, from Leon Hwang. 14) Fix BPF verifier to add a missing check_func_arg_reg_off() to prevent an out-of-bounds memory access for dynpointers, from Matt Bobrowski. 15) Fix bpf_session_cookie() kfunc to return __u64 instead of long pointer as it might lead to problems on 32-bit archs, from Jiri Olsa. 16) Enhance traffic validation and dynamic batch size support in xsk selftests, from Tushar Vyavahare. bpf-next-for-netdev * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (102 commits) selftests/bpf: DENYLIST.aarch64: Remove fexit_sleep selftests/bpf: amend for wrong bpf_wq_set_callback_impl signature bpf: helpers: fix bpf_wq_set_callback_impl signature libbpf: Add NULL checks to bpf_object__{prev_map,next_map} selftests/bpf: Remove exceptions tests from DENYLIST.s390x s390/bpf: Implement exceptions s390/bpf: Change seen_reg to a mask bpf: Remove unnecessary loop in task_file_seq_get_next() riscv, bpf: Optimize stack usage of trampoline bpf, devmap: Add .map_alloc_check selftests/bpf: Remove arena tests from DENYLIST.s390x selftests/bpf: Add UAF tests for arena atomics selftests/bpf: Introduce __arena_global s390/bpf: Support arena atomics s390/bpf: Enable arena s390/bpf: Support address space cast instruction s390/bpf: Support BPF_PROBE_MEM32 s390/bpf: Land on the next JITed instruction after exception s390/bpf: Introduce pre- and post- probe functions s390/bpf: Get rid of get_probe_mem_regno() ... ==================== Link: https://patch.msgid.link/20240708221438.10974-1-daniel@iogearbox.net Signed-off-by: Paolo Abeni <pabeni@redhat.com> |
||
Heiko Carstens
|
b5efb63acf |
s390/mm: Add NULL pointer check to crst_table_free() base_crst_free()
crst_table_free() used to work with NULL pointers before the conversion
to ptdescs. Since crst_table_free() can be called with a NULL pointer
(error handling in crst_table_upgrade() add an explicit check.
Also add the same check to base_crst_free() for consistency reasons.
In real life this should not happen, since order two GFP_KERNEL
allocations will not fail, unless FAIL_PAGE_ALLOC is enabled and used.
Reported-by: Yunseong Kim <yskelg@gmail.com>
Fixes:
|
||
Ilya Leoshkevich
|
fa7bd4b000 |
s390/bpf: Implement exceptions
Implement the following three pieces required from the JIT: - A "top-level" BPF prog (exception_boundary) must save all non-volatile registers, and not only the ones that it clobbers. - A "handler" BPF prog (exception_cb) must switch stack to that of exception_boundary, and restore the registers that exception_boundary saved. - arch_bpf_stack_walk() must unwind the stack and provide the results in a way that satisfies both bpf_throw() and exception_cb. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240703005047.40915-3-iii@linux.ibm.com |
||
Ilya Leoshkevich
|
7ba4f43e16 |
s390/bpf: Change seen_reg to a mask
Using a mask instead of an array saves a small amount of memory and allows marking multiple registers as seen with a simple "or". Another positive side-effect is that it speeds up verification with jitterbug. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240703005047.40915-2-iii@linux.ibm.com |
||
Linus Torvalds
|
75aa87ca48 |
s390: fix support for z16 systems.
-----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmaHstsUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroPg2Qf+IUHv0lUOiPtbanMpwAWuXVLUFUOs QxhzcKKmi/Det9gkgYA6SQEtr0nR22FIE50+lRFex9gCLeaFIizVCKtvm9vi5pKe IpJ3sNAO88O/ehxurdBJ4ueADybruXyyKeQfpPKBoPiUAJpr85BYIpRbXwTyJTok eCFQS39jvTN2YMyGrD7zdKBOcf3V3Da/m1v4f5bAx8nRYQNPLFX1GFbJG8gzQ8wN DnYlDiKBzu8BOnBkdQh53Oz2e6a5gU3EHo9ZGITOr6Fmu4tGuXBkZ50pwaIdXD52 ZSZdMfbYJr7cWg2H6z3rqfIgO32uCGzHX7ZR8s3JbckVvQyaO4zL14ZW7w== =szz2 -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fix from Paolo Bonzini: - s390: fix support for z16 systems * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: s390: fix LPSWEY handling |