Merge branch 'master' into next
Merge master back into next, this allows us to resolve some conflicts in arch/powerpc/Kconfig, and also re-sort the symbols under config PPC so that they are in alphabetical order again.
@@ -4,7 +4,7 @@

1 char Memory devices
1 = /dev/mem Physical memory access
2 = /dev/kmem Kernel virtual memory access
2 = /dev/kmem OBSOLETE - replaced by /proc/kcore
3 = /dev/null Null device
4 = /dev/port I/O port access
5 = /dev/zero Null byte source

@@ -17,17 +17,18 @@ module.
gpio_mockup_ranges

This parameter takes an argument in the form of an array of integer
pairs. Each pair defines the base GPIO number (if any) and the number
of lines exposed by the chip. If the base GPIO is -1, the gpiolib
will assign it automatically.
pairs. Each pair defines the base GPIO number (non-negative integer)
and the first number after the last of this chip. If the base GPIO
is -1, the gpiolib will assign it automatically. while the following
parameter is the number of lines exposed by the chip.

Example: gpio_mockup_ranges=-1,8,-1,16,405,4
Example: gpio_mockup_ranges=-1,8,-1,16,405,409

The line above creates three chips. The first one will expose 8 lines,
the second 16 and the third 4. The base GPIO for the third chip is set
to 405 while for two first chips it will be assigned automatically.

gpio_named_lines
gpio_mockup_named_lines

This parameter doesn't take any arguments. It lets the driver know that
GPIO lines exposed by it should be named.

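As a rough illustration of the ``gpio_mockup_ranges`` format described in the hunk above, the following Python sketch splits the value into pairs and reports how many lines each chip would expose. It assumes the reading in which the second value of a pair is a line count when the base is negative and an exclusive end otherwise, which matches the "8, 16 and 4 lines" explanation for ``-1,8,-1,16,405,409``. The function name ``parse_mockup_ranges`` is hypothetical and not part of the driver.

```python
def parse_mockup_ranges(value):
    """Parse a gpio_mockup_ranges string into (base, nlines) tuples.

    Assumption: for a negative base the second number is the line count;
    otherwise it is the first GPIO number after the last line of the chip.
    """
    numbers = [int(token) for token in value.split(",")]
    if len(numbers) % 2 != 0:
        raise ValueError("expected an even number of integers")

    chips = []
    for base, second in zip(numbers[0::2], numbers[1::2]):
        # Negative base: gpiolib picks the base, 'second' is a count.
        nlines = second if base < 0 else second - base
        if nlines <= 0:
            raise ValueError(f"pair ({base}, {second}) exposes no lines")
        chips.append((base, nlines))
    return chips


if __name__ == "__main__":
    print(parse_mockup_ranges("-1,8,-1,16,405,409"))
```

Under the older wording of the documentation (``405,4``), the second value would always be a count, so the subtraction would have to be dropped.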
@@ -1469,6 +1469,12 @@
Don't use this when you are not running on the
android emulator

gpio-mockup.gpio_mockup_ranges
[HW] Sets the ranges of gpiochip of for this device.
Format: <start1>,<end1>,<start2>,<end2>...
gpio-mockup.gpio_mockup_named_lines
[HW] Let the driver know GPIO lines should be named.

gpt [EFI] Forces disk with valid GPT signature but
invalid Protective MBR to be treated as GPT. If the
primary GPT is corrupted, it enables the backup/alternate
@@ -1492,10 +1498,6 @@
Format: <unsigned int> such that (rxsize & ~0x1fffc0) == 0.
Default: 1024

gpio-mockup.gpio_mockup_ranges
[HW] Sets the ranges of gpiochip of for this device.
Format: <start1>,<end1>,<start2>,<end2>...

hardlockup_all_cpu_backtrace=
[KNL] Should the hard-lockup detector generate
backtraces on all cpus.
@@ -1833,6 +1835,18 @@
initcall functions. Useful for debugging built-in
modules and initcalls.

initramfs_async= [KNL]
Format: <bool>
Default: 1
This parameter controls whether the initramfs
image is unpacked asynchronously, concurrently
with devices being probed and
initialized. This should normally just work,
but as a debugging aid, one can get the
historical behaviour of the initramfs
unpacking being completed before device_ and
late_ initcalls.

initrd= [BOOT] Specify the location of the initial ramdisk

initrdmem= [KNL] Specify a physical address and size from which to
@@ -2802,7 +2816,24 @@
seconds. Use this parameter to check at some
other rate. 0 disables periodic checking.

memtest= [KNL,X86,ARM,PPC] Enable memtest
memory_hotplug.memmap_on_memory
[KNL,X86,ARM] Boolean flag to enable this feature.
Format: {on | off (default)}
When enabled, runtime hotplugged memory will
allocate its internal metadata (struct pages)
from the hotadded memory which will allow to
hotadd a lot of memory without requiring
additional memory to do so.
This feature is disabled by default because it
has some implication on large (e.g. GB)
allocations in some configurations (e.g. small
memory blocks).
The state of the flag can be read in
/sys/module/memory_hotplug/parameters/memmap_on_memory.
Note that even when enabled, there are a few cases where
the feature is not effective.

memtest= [KNL,X86,ARM,PPC,RISCV] Enable memtest
Format: <integer>
default : 0 <disable>
Specifies the number of memtest passes to be

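The ``memory_hotplug.memmap_on_memory`` entry above says the flag's state can be read back from sysfs. A minimal Python sketch of such a check follows. The path is the one quoted in the hunk; the accepted spellings of the value (``Y``/``N`` and a few alternatives) are an assumption about how a boolean module parameter is rendered, and the helper name ``memmap_on_memory_enabled`` is hypothetical.

```python
from pathlib import Path

# Path quoted in the kernel-parameters text above.
PARAM_PATH = Path("/sys/module/memory_hotplug/parameters/memmap_on_memory")


def memmap_on_memory_enabled(path=PARAM_PATH):
    """Return True or False for the flag, or None if it cannot be determined."""
    try:
        raw = Path(path).read_text().strip()
    except OSError:
        # File missing or unreadable, e.g. the kernel lacks the parameter.
        return None
    # Assumption: the value is rendered as one of these spellings.
    if raw in ("Y", "y", "1", "on"):
        return True
    if raw in ("N", "n", "0", "off"):
        return False
    return None


if __name__ == "__main__":
    print(memmap_on_memory_enabled())
```

As the text notes, an enabled flag does not guarantee the feature takes effect for every hotplugged range, so this check only reports the requested setting.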
@@ -357,6 +357,15 @@ creates ZONE_MOVABLE as following.
Unfortunately, there is no information to show which memory block belongs
to ZONE_MOVABLE. This is TBD.

.. note::
Techniques that rely on long-term pinnings of memory (especially, RDMA and
vfio) are fundamentally problematic with ZONE_MOVABLE and, therefore, memory
hot remove. Pinned pages cannot reside on ZONE_MOVABLE, to guarantee that
memory can still get hot removed - be aware that pinning can fail even if
there is plenty of free memory in ZONE_MOVABLE. In addition, using
ZONE_MOVABLE might make page pinning more expensive, because pages have to be
migrated off that zone first.

.. _memory_hotplug_how_to_offline_memory:

How to offline memory

@@ -63,36 +63,36 @@ the generic ioctl available.

The ``uffdio_api.features`` bitmask returned by the ``UFFDIO_API`` ioctl
defines what memory types are supported by the ``userfaultfd`` and what
events, except page fault notifications, may be generated.
events, except page fault notifications, may be generated:

If the kernel supports registering ``userfaultfd`` ranges on hugetlbfs
virtual memory areas, ``UFFD_FEATURE_MISSING_HUGETLBFS`` will be set in
``uffdio_api.features``. Similarly, ``UFFD_FEATURE_MISSING_SHMEM`` will be
set if the kernel supports registering ``userfaultfd`` ranges on shared
memory (covering all shmem APIs, i.e. tmpfs, ``IPCSHM``, ``/dev/zero``,
``MAP_SHARED``, ``memfd_create``, etc).
- The ``UFFD_FEATURE_EVENT_*`` flags indicate that various other events
other than page faults are supported. These events are described in more
detail below in the `Non-cooperative userfaultfd`_ section.

The userland application that wants to use ``userfaultfd`` with hugetlbfs
or shared memory need to set the corresponding flag in
``uffdio_api.features`` to enable those features.
- ``UFFD_FEATURE_MISSING_HUGETLBFS`` and ``UFFD_FEATURE_MISSING_SHMEM``
indicate that the kernel supports ``UFFDIO_REGISTER_MODE_MISSING``
registrations for hugetlbfs and shared memory (covering all shmem APIs,
i.e. tmpfs, ``IPCSHM``, ``/dev/zero``, ``MAP_SHARED``, ``memfd_create``,
etc) virtual memory areas, respectively.

If the userland desires to receive notifications for events other than
page faults, it has to verify that ``uffdio_api.features`` has appropriate
``UFFD_FEATURE_EVENT_*`` bits set. These events are described in more
detail below in `Non-cooperative userfaultfd`_ section.
- ``UFFD_FEATURE_MINOR_HUGETLBFS`` indicates that the kernel supports
``UFFDIO_REGISTER_MODE_MINOR`` registration for hugetlbfs virtual memory
areas.

Once the ``userfaultfd`` has been enabled the ``UFFDIO_REGISTER`` ioctl should
be invoked (if present in the returned ``uffdio_api.ioctls`` bitmask) to
register a memory range in the ``userfaultfd`` by setting the
The userland application should set the feature flags it intends to use
when invoking the ``UFFDIO_API`` ioctl, to request that those features be
enabled if supported.

Once the ``userfaultfd`` API has been enabled the ``UFFDIO_REGISTER``
ioctl should be invoked (if present in the returned ``uffdio_api.ioctls``
bitmask) to register a memory range in the ``userfaultfd`` by setting the
uffdio_register structure accordingly. The ``uffdio_register.mode``
bitmask will specify to the kernel which kind of faults to track for
the range (``UFFDIO_REGISTER_MODE_MISSING`` would track missing
pages). The ``UFFDIO_REGISTER`` ioctl will return the
the range. The ``UFFDIO_REGISTER`` ioctl will return the
``uffdio_register.ioctls`` bitmask of ioctls that are suitable to resolve
userfaults on the range registered. Not all ioctls will necessarily be
supported for all memory types depending on the underlying virtual
memory backend (anonymous memory vs tmpfs vs real filebacked
mappings).
supported for all memory types (e.g. anonymous memory vs. shmem vs.
hugetlbfs), or all types of intercepted faults.

Userland can use the ``uffdio_register.ioctls`` to manage the virtual
address space in the background (to add or potentially also remove
@@ -100,21 +100,46 @@ memory from the ``userfaultfd`` registered range). This means a userfault
could be triggering just before userland maps in the background the
user-faulted page.

The primary ioctl to resolve userfaults is ``UFFDIO_COPY``. That
atomically copies a page into the userfault registered range and wakes
up the blocked userfaults
(unless ``uffdio_copy.mode & UFFDIO_COPY_MODE_DONTWAKE`` is set).
Other ioctl works similarly to ``UFFDIO_COPY``. They're atomic as in
guaranteeing that nothing can see an half copied page since it'll
keep userfaulting until the copy has finished.
Resolving Userfaults
--------------------

There are three basic ways to resolve userfaults:

- ``UFFDIO_COPY`` atomically copies some existing page contents from
userspace.

- ``UFFDIO_ZEROPAGE`` atomically zeros the new page.

- ``UFFDIO_CONTINUE`` maps an existing, previously-populated page.

These operations are atomic in the sense that they guarantee nothing can
see a half-populated page, since readers will keep userfaulting until the
operation has finished.

By default, these wake up userfaults blocked on the range in question.
They support a ``UFFDIO_*_MODE_DONTWAKE`` ``mode`` flag, which indicates
that waking will be done separately at some later time.

Which ioctl to choose depends on the kind of page fault, and what we'd
like to do to resolve it:

- For ``UFFDIO_REGISTER_MODE_MISSING`` faults, the fault needs to be
resolved by either providing a new page (``UFFDIO_COPY``), or mapping
the zero page (``UFFDIO_ZEROPAGE``). By default, the kernel would map
the zero page for a missing fault. With userfaultfd, userspace can
decide what content to provide before the faulting thread continues.

- For ``UFFDIO_REGISTER_MODE_MINOR`` faults, there is an existing page (in
the page cache). Userspace has the option of modifying the page's
contents before resolving the fault. Once the contents are correct
(modified or not), userspace asks the kernel to map the page and let the
faulting thread continue with ``UFFDIO_CONTINUE``.

Notes:

- If you requested ``UFFDIO_REGISTER_MODE_MISSING`` when registering then
you must provide some kind of page in your thread after reading from
the uffd. You must provide either ``UFFDIO_COPY`` or ``UFFDIO_ZEROPAGE``.
The normal behavior of the OS automatically providing a zero page on
an anonymous mmaping is not in place.
- You can tell which kind of fault occurred by examining
``pagefault.flags`` within the ``uffd_msg``, checking for the
``UFFD_PAGEFAULT_FLAG_*`` flags.

- None of the page-delivering ioctls default to the range that you
registered with. You must fill in all fields for the appropriate
@@ -122,9 +147,9 @@ Notes:

- You get the address of the access that triggered the missing page
event out of a struct uffd_msg that you read in the thread from the
uffd. You can supply as many pages as you want with ``UFFDIO_COPY`` or
``UFFDIO_ZEROPAGE``. Keep in mind that unless you used DONTWAKE then
the first of any of those IOCTLs wakes up the faulting thread.
uffd. You can supply as many pages as you want with these IOCTLs.
Keep in mind that unless you used DONTWAKE then the first of any of
those IOCTLs wakes up the faulting thread.

- Be sure to test for all errors including
(``pollfd[0].revents & POLLERR``). This can happen, e.g. when ranges

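The userfaultfd hunks above describe how the choice of resolving ioctl follows from the kind of fault reported in ``pagefault.flags``. The Python sketch below models only that decision and issues no ioctls. The ``UFFD_PAGEFAULT_FLAG_*`` bit values are quoted from memory of ``<linux/userfaultfd.h>`` and should be treated as assumptions to verify against the header; ``choose_resolution`` is a hypothetical helper name.

```python
# Bit values as recalled from <linux/userfaultfd.h>; verify before relying on them.
UFFD_PAGEFAULT_FLAG_WRITE = 1 << 0
UFFD_PAGEFAULT_FLAG_WP = 1 << 1
UFFD_PAGEFAULT_FLAG_MINOR = 1 << 2


def choose_resolution(pagefault_flags, have_contents):
    """Return the name of the ioctl that would resolve the fault.

    pagefault_flags: the flags field of the pagefault message.
    have_contents:   True if userspace has page data to supply.
    """
    if pagefault_flags & UFFD_PAGEFAULT_FLAG_WP:
        # Write-protect faults are not covered by the text above.
        raise ValueError("write-protect faults are outside this sketch")
    if pagefault_flags & UFFD_PAGEFAULT_FLAG_MINOR:
        # A page already exists in the page cache; just map it.
        return "UFFDIO_CONTINUE"
    # Otherwise treat it as a missing fault: supply data or a zero page.
    return "UFFDIO_COPY" if have_contents else "UFFDIO_ZEROPAGE"


if __name__ == "__main__":
    print(choose_resolution(UFFD_PAGEFAULT_FLAG_MINOR, have_contents=False))
    print(choose_resolution(0, have_contents=True))
```

A real handler would follow this decision with the matching ioctl on the userfaultfd file descriptor, filling in every field of the request structure, since none of them default to the registered range.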
@@ -24,7 +24,8 @@ longterm series? One still supported? Then search the `LKML
you don't find any, install `the latest release from that series
<https://kernel.org/>`_. If it still shows the issue, report it to the stable
mailing list (stable@vger.kernel.org) and CC the regressions list
(regressions@lists.linux.dev).
(regressions@lists.linux.dev); ideally also CC the maintainer and the mailing
list for the subsystem in question.

In all other cases try your best guess which kernel part might be causing the
issue. Check the :ref:`MAINTAINERS <maintainers>` file for how its developers
@@ -48,8 +49,9 @@ before the issue occurs.
If you are facing multiple issues with the Linux kernel at once, report each
separately. While writing your report, include all information relevant to the
issue, like the kernel and the distro used. In case of a regression, CC the
regressions mailing list (regressions@lists.linux.dev) to your report; also try
to include the commit-id of the change causing it, which a bisection can find.
regressions mailing list (regressions@lists.linux.dev) to your report. Also try
to pin-point the culprit with a bisection; if you succeed, include its
commit-id and CC everyone in the sign-off-by chain.

Once the report is out, answer any questions that come up and help where you
can. That includes keeping the ball rolling by occasionally retesting with newer
@@ -198,10 +200,11 @@ report them:

* Send a short problem report to the Linux stable mailing list
(stable@vger.kernel.org) and CC the Linux regressions mailing list
(regressions@lists.linux.dev). Roughly describe the issue and ideally
explain how to reproduce it. Mention the first version that shows the
problem and the last version that's working fine. Then wait for further
instructions.
(regressions@lists.linux.dev); if you suspect the cause in a particular
subsystem, CC its maintainer and its mailing list. Roughly describe the
issue and ideally explain how to reproduce it. Mention the first version
that shows the problem and the last version that's working fine. Then
wait for further instructions.

The reference section below explains each of these steps in more detail.

@@ -768,7 +771,9 @@ regular internet search engine and add something like
the results to the archives at that URL.

It's also wise to check the internet, LKML and maybe bugzilla.kernel.org again
at this point.
at this point. If your report needs to be filed in a bug tracker, you may want
to check the mailing list archives for the subsystem as well, as someone might
have reported it only there.

For details how to search and what to do if you find matching reports see
"Search for existing reports, first run" above.
@@ -1249,9 +1254,10 @@ and the oldest where the issue occurs (say 5.8-rc1).

When sending the report by mail, CC the Linux regressions mailing list
(regressions@lists.linux.dev). In case the report needs to be filed to some web
tracker, proceed to do so; once filed, forward the report by mail to the
regressions list. Make sure to inline the forwarded report, hence do not attach
it. Also add a short note at the top where you mention the URL to the ticket.
tracker, proceed to do so. Once filed, forward the report by mail to the
regressions list; CC the maintainer and the mailing list for the subsystem in
question. Make sure to inline the forwarded report, hence do not attach it.
Also add a short note at the top where you mention the URL to the ticket.

When mailing or forwarding the report, in case of a successful bisection add the
author of the culprit to the recipients; also CC everyone in the signed-off-by
@@ -1536,17 +1542,20 @@ Report the regression

*Send a short problem report to the Linux stable mailing list
(stable@vger.kernel.org) and CC the Linux regressions mailing list
(regressions@lists.linux.dev). Roughly describe the issue and ideally
explain how to reproduce it. Mention the first version that shows the
problem and the last version that's working fine. Then wait for further
instructions.*
(regressions@lists.linux.dev); if you suspect the cause in a particular
subsystem, CC its maintainer and its mailing list. Roughly describe the
issue and ideally explain how to reproduce it. Mention the first version
that shows the problem and the last version that's working fine. Then
wait for further instructions.*

When reporting a regression that happens within a stable or longterm kernel
line (say when updating from 5.10.4 to 5.10.5) a brief report is enough for
the start to get the issue reported quickly. Hence a rough description is all
it takes.
the start to get the issue reported quickly. Hence a rough description to the
stable and regressions mailing list is all it takes; but in case you suspect
the cause in a particular subsystem, CC its maintainers and its mailing list
as well, because that will speed things up.

But note, it helps developers a great deal if you can specify the exact version
And note, it helps developers a great deal if you can specify the exact version
that introduced the problem. Hence if possible within a reasonable time frame,
try to find that version using vanilla kernels. Lets assume something broke when
your distributor released a update from Linux kernel 5.10.5 to 5.10.8. Then as
@@ -1563,7 +1572,9 @@ pinpoint the exact change that causes the issue (which then can easily get
reverted to fix the issue quickly). Hence consider to do a proper bisection
right away if time permits. See the section 'Special care for regressions' and
the document 'Documentation/admin-guide/bug-bisect.rst' for details how to
perform one.
perform one. In case of a successful bisection add the author of the culprit to
the recipients; also CC everyone in the signed-off-by chain, which you find at
the end of its commit message.


Reference for "Reporting issues only occurring in older kernel version lines"