Patch series "kasan: switch tag-based modes to stack ring from per-object
metadata", v3.
This series makes the tag-based KASAN modes use a ring buffer for storing
stack depot handles for alloc/free stack traces for slab objects instead
of per-object metadata. This ring buffer is referred to as the stack
ring.
On each alloc/free of a slab object, the tagged address of the object and
the current stack trace are recorded in the stack ring.
On each bug report, if the accessed address belongs to a slab object, the
stack ring is scanned for matching entries. The newest entries are used
to print the alloc/free stack traces in the report: one entry for alloc
and one for free.
The advantages of this approach over storing stack trace handles in
per-object metadata with the tag-based KASAN modes:
- Allows to find relevant stack traces for use-after-free bugs without
using quarantine for freed memory. (Currently, if the object was
reallocated multiple times, the report contains the latest alloc/free
stack traces, not necessarily the ones relevant to the buggy allocation.)
- Allows to better identify and mark use-after-free bugs, effectively
making the CONFIG_KASAN_TAGS_IDENTIFY functionality always-on.
- Has fixed memory overhead.
The disadvantage:
- If the affected object was allocated/freed long before the bug happened
and the stack trace events were purged from the stack ring, the report
will have no stack traces.
Discussion
==========
The proposed implementation of the stack ring uses a single ring buffer
for the whole kernel. This might lead to contention due to atomic
accesses to the ring buffer index on multicore systems.
At this point, it is unknown whether the performance impact from this
contention would be significant compared to the slowdown introduced by
collecting stack traces due to the planned changes to the latter part, see
the section below.
For now, the proposed implementation is deemed to be good enough, but this
might need to be revisited once the stack collection becomes faster.
A considered alternative is to keep a separate ring buffer for each CPU
and then iterate over all of them when printing a bug report. This
approach requires somehow figuring out which of the stack rings has the
freshest stack traces for an object if multiple stack rings have them.
Further plans
=============
This series is a part of an effort to make KASAN stack trace collection
suitable for production. This requires stack trace collection to be fast
and memory-bounded.
The planned steps are:
1. Speed up stack trace collection (potentially, by using SCS;
patches on-hold until steps #2 and #3 are completed).
2. Keep stack trace handles in the stack ring (this series).
3. Add a memory-bounded mode to stack depot or provide an alternative
memory-bounded stack storage.
4. Potentially, implement stack trace collection sampling to minimize
the performance impact.
This patch (of 34):
__kasan_metadata_size() calculates the size of the redzone for objects in
a slab cache.
When accounting for presence of kasan_free_meta in the redzone, this
function only compares free_meta_offset with 0. But free_meta_offset
could also be equal to KASAN_NO_FREE_META, which indicates that
kasan_free_meta is not present at all.
Add a comparison with KASAN_NO_FREE_META into __kasan_metadata_size().
Link: https://lkml.kernel.org/r/cover.1662411799.git.andreyknvl@google.com
Link: https://lkml.kernel.org/r/c7b316d30d90e5947eb8280f4dc78856a49298cf.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
It is unnecessary to get the number of the running kdamond to judge
whether kdamonds are busy. Here we can use the
damon_sysfs_kdamond_running() helper and return -EBUSY directly when
finding a running kdamond. Meanwhile, merging with the judgement that a
kdamond has current sysfs command callback request to make the code more
clear.
Link: https://lkml.kernel.org/r/1662302166-13216-1-git-send-email-kaixuxia@tencent.com
Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
With all callers now passing in a folio, rename the function and convert
all callers. Removes a couple of calls to compound_head() and a reference
to page->mapping.
Link: https://lkml.kernel.org/r/20220902194653.1739778-55-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Removes a lot of calls to compound_head(). Also remove a VM_BUG_ON that
can never trigger as the PageAnon bit is the bottom bit of page->mapping.
Link: https://lkml.kernel.org/r/20220902194653.1739778-51-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
All callers have now been converted to folio_free_swap() and we can remove
this wrapper.
Link: https://lkml.kernel.org/r/20220902194653.1739778-49-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
All callers now have a folio, so convert the function to take a folio.
Saves a couple of calls to compound_head().
Link: https://lkml.kernel.org/r/20220902194653.1739778-48-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Also convert should_try_to_free_swap() to use a folio. This removes a few
calls to compound_head().
Link: https://lkml.kernel.org/r/20220902194653.1739778-47-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Removes many calls to compound_head(). Does not remove the assumption
that a folio may not be larger than a PMD.
Link: https://lkml.kernel.org/r/20220902194653.1739778-43-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
All callers have now been converted to swap_cache_get_folio(), so we can
remove this wrapper.
Link: https://lkml.kernel.org/r/20220902194653.1739778-39-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Delay fetching the precise page from the folio until we're in unuse_pte().
Saves many calls to compound_head().
Link: https://lkml.kernel.org/r/20220902194653.1739778-37-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
With all callers removed, remove this wrapper function. The flags are now
mysteriously called SGP, but I think we can live with that.
Link: https://lkml.kernel.org/r/20220902194653.1739778-34-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
shmem_getpage() is being replaced by shmem_get_folio() so use a folio
throughout this function. Saves several calls to compound_head().
Link: https://lkml.kernel.org/r/20220902194653.1739778-33-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
shmem_getpage() is being removed, so call its replacement and find the
precise page ourselves.
Link: https://lkml.kernel.org/r/20220902194653.1739778-32-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Symlinks will never use a large folio, but using the folio API removes a
lot of unnecessary folio->page->folio conversions.
Link: https://lkml.kernel.org/r/20220902194653.1739778-31-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
While symlinks will always be < PAGE_SIZE, using the folio APIs gets rid
of unnecessary calls to compound_head().
Link: https://lkml.kernel.org/r/20220902194653.1739778-30-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Call shmem_get_folio() and use the folio APIs instead of the page APIs.
Saves several calls to compound_head() and removes assumptions about the
size of a large folio.
Link: https://lkml.kernel.org/r/20220902194653.1739778-29-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
With no remaining callers of shmem_getpage_gfp(), add shmem_get_folio()
and reimplement shmem_getpage() as a call to shmem_get_folio().
Link: https://lkml.kernel.org/r/20220902194653.1739778-25-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Convert shmem_swapin() to return a folio and use swap_cache_get_folio(),
removing all uses of struct page in this function.
Link: https://lkml.kernel.org/r/20220902194653.1739778-21-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Convert lookup_swap_cache() into swap_cache_get_folio() and add a
lookup_swap_cache() wrapper around it.
[akpm@linux-foundation.org: add CONFIG_SWAP=n stub for swap_cache_get_folio()]
Link: https://lkml.kernel.org/r/20220902194653.1739778-20-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Assert that this is a single-page folio as there are several assumptions
in here that it's exactly PAGE_SIZE bytes large. Saves several calls to
compound_head() and removes the last caller of shmem_alloc_page().
Link: https://lkml.kernel.org/r/20220902194653.1739778-18-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
All callers now have a folio, so pass it in here and remove an unnecessary
call to page_folio().
Link: https://lkml.kernel.org/r/20220902194653.1739778-17-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The 'swapcache' variable is used to track whether the page is from the
swapcache or not. It can do this equally well by being the folio of the
page rather than the page itself, and this saves a number of calls to
compound_head().
Link: https://lkml.kernel.org/r/20220902194653.1739778-16-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
With all callers using folios, we can convert add_to_swap_cache() to take
a folio and use it throughout.
Link: https://lkml.kernel.org/r/20220902194653.1739778-13-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add kernel-doc for folio_free_swap() and make it return bool. Add a
try_to_free_swap() compatibility wrapper.
Link: https://lkml.kernel.org/r/20220902194653.1739778-11-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>