mirror of
https://github.com/torvalds/linux.git
synced 2024-11-24 21:21:41 +00:00
Merge tag 'docs-6.3' of git://git.lwn.net/linux

Pull documentation updates from Jonathan Corbet:
 "It has been a moderately calm cycle for documentation; the significant
  changes include:

   - Some significant additions to the memory-management documentation

   - Some improvements to navigation in the HTML-rendered docs

   - More Spanish and Chinese translations

  ...and the usual set of typo fixes and such"

* tag 'docs-6.3' of git://git.lwn.net/linux: (68 commits)
  Documentation/watchdog/hpwdt: Fix Format
  Documentation/watchdog/hpwdt: Fix Reference
  Documentation: core-api: padata: correct spelling
  docs/mm: Physical Memory: correct spelling in reference to CONFIG_PAGE_EXTENSION
  docs: Use HTML comments for the kernel-toc SPDX line
  docs: Add more information to the HTML sidebar
  Documentation: KVM: Update AMD memory encryption link
  printk: Document that CONFIG_BOOT_PRINTK_DELAY required for boot_delay=
  Documentation: userspace-api: correct spelling
  Documentation: sparc: correct spelling
  Documentation: driver-api: correct spelling
  Documentation: admin-guide: correct spelling
  docs: add workload-tracing document to admin-guide
  docs/admin-guide/mm: remove useless markup
  docs/mm: remove useless markup
  docs/mm: Physical Memory: remove useless markup
  docs/sp_SP: Add process magic-number translation
  docs: ftrace: always use canonical ftrace path
  Doc/damon: fix the data path error
  dma-buf: Add "dma-buf" to title of documentation
  ...
commit
70756b49be
@@ -1,6 +1,9 @@
+if COMPILE_TEST
+
+menu "Documentation"
+
 config WARN_MISSING_DOCUMENTS
 bool "Warn if there's a missing documentation file"
-depends on COMPILE_TEST
 help
 It is not uncommon that a document gets renamed.
 This option makes the Kernel to check for missing dependencies,
@@ -11,7 +14,6 @@ config WARN_MISSING_DOCUMENTS
 
 config WARN_ABI_ERRORS
 bool "Warn if there are errors at ABI files"
-depends on COMPILE_TEST
 help
 The files under Documentation/ABI should follow what's
 described at Documentation/ABI/README. Yet, as they're manually
@@ -20,3 +22,7 @@ config WARN_ABI_ERRORS
 scripts/get_abi.pl. Add a check to verify them.
 
 If unsure, select 'N'.
+
+endmenu
+
+endif

@@ -1,8 +1,8 @@
 .. SPDX-License-Identifier: GPL-2.0
 
-=======================
-Linux PCI Bus Subsystem
-=======================
+=================
+PCI Bus Subsystem
+=================
 
 .. toctree::
 :maxdepth: 2

@@ -69,7 +69,7 @@ The accelerator devices will be exposed to the user space with the dedicated
 
 - device char files - /dev/accel/accel*
 - sysfs - /sys/class/accel/accel*/
-- debugfs - /sys/kernel/debug/accel/accel*/
+- debugfs - /sys/kernel/debug/accel/*/
 
 Getting Started
 ===============

@@ -204,7 +204,7 @@ For example::
 This should present your unmodified backing device data in /dev/loop0
 
 If your cache is in writethrough mode, then you can safely discard the
-cache device without loosing data.
+cache device without losing data.
 
 
 E) Wiping a cache device

@@ -106,7 +106,7 @@ Proportional weight policy files
 see Documentation/block/bfq-iosched.rst.
 
 blkio.bfq.weight_device
-Specifes per cgroup per device weights, overriding the default group
+Specifies per cgroup per device weights, overriding the default group
 weight. For more details, see Documentation/block/bfq-iosched.rst.
 
 Following is the format::

@@ -624,7 +624,7 @@ and is an example of this type.
 Limits
 ------
 
-A child can only consume upto the configured amount of the resource.
+A child can only consume up to the configured amount of the resource.
 Limits can be over-committed - the sum of the limits of children can
 exceed the amount of resource available to the parent.
 
@@ -642,11 +642,11 @@ on an IO device and is an example of this type.
 Protections
 -----------
 
-A cgroup is protected upto the configured amount of the resource
+A cgroup is protected up to the configured amount of the resource
 as long as the usages of all its ancestors are under their
 protected levels. Protections can be hard guarantees or best effort
 soft boundaries. Protections can also be over-committed in which case
-only upto the amount available to the parent is protected among
+only up to the amount available to the parent is protected among
 children.
 
 Protections are in the range [0, max] and defaults to 0, which is
@@ -1079,7 +1079,7 @@ All time durations are in microseconds.
 
 $MAX $PERIOD
 
-which indicates that the group may consume upto $MAX in each
+which indicates that the group may consume up to $MAX in each
 $PERIOD duration. "max" for $MAX indicates no limit. If only
 one number is written, $MAX is updated.
 
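The "$MAX $PERIOD" format described in this hunk can be illustrated with a small model. This is only a sketch of the write semantics as the text states them ("max" means no limit; a single number updates $MAX only); the function names and the use of `None` for "max" are assumptions made for illustration, and the kernel's own validation is not modelled.

```python
from typing import Optional, Tuple

# (max_us, period_us); max_us is None when the file holds "max" (no limit).
State = Tuple[Optional[int], int]


def apply_write(current: State, written: str) -> State:
    """Return the state after writing '$MAX' or '$MAX $PERIOD' (microseconds)."""
    _, cur_period = current
    fields = written.split()
    if not 1 <= len(fields) <= 2:
        raise ValueError("expected '$MAX' or '$MAX $PERIOD'")
    new_max = None if fields[0] == "max" else int(fields[0])
    # If only one number is written, only $MAX is updated.
    new_period = int(fields[1]) if len(fields) == 2 else cur_period
    return new_max, new_period


def cpu_fraction(state: State) -> Optional[float]:
    """Share of each period the group may consume; None means unlimited."""
    max_us, period_us = state
    return None if max_us is None else max_us / period_us
```

For example, starting from `(None, 100000)` and writing `"50000"` leaves the period untouched and limits the group to half of each period.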
@@ -2289,7 +2289,7 @@ Cpuset Interface Files
 For a valid partition root with the sibling cpu exclusivity
 rule enabled, changes made to "cpuset.cpus" that violate the
 exclusivity rule will invalidate the partition as well as its
-sibiling partitions with conflicting cpuset.cpus values. So
+sibling partitions with conflicting cpuset.cpus values. So
 care must be taking in changing "cpuset.cpus".
 
 A valid non-root parent partition may distribute out all its CPUs

@@ -399,7 +399,7 @@ A partial list of the supported mount options follows:
 sep
 if first mount option (after the -o), overrides
 the comma as the separator between the mount
-parms. e.g.::
+parameters. e.g.::
 
 -o user=myname,password=mypassword,domain=mydom
 
@@ -765,7 +765,7 @@ cifsFYI If set to non-zero value, additional debug information
 Some debugging statements are not compiled into the
 cifs kernel unless CONFIG_CIFS_DEBUG2 is enabled in the
 kernel configuration. cifsFYI may be set to one or
-nore of the following flags (7 sets them all)::
+more of the following flags (7 sets them all)::
 
 +-----------------------------------------------+------+
 | log cifs informational messages | 0x01 |

@@ -70,7 +70,7 @@ the entries (each hotspot block covers a larger area than a single
 cache block).
 
 All this means smq uses ~25bytes per cache block. Still a lot of
-memory, but a substantial improvement nontheless.
+memory, but a substantial improvement nonetheless.
 
 Level balancing
 ^^^^^^^^^^^^^^^

@@ -31,7 +31,7 @@ Mandatory parameters:
 
 Optional parameter:
 
-<underyling sectors>:
+<underlying sectors>:
 Number of sectors defining the logical block size of <dev path>.
 2^N supported, e.g. 8 = emulate 8 sectors of 512 bytes = 4KiB.
 If not provided, the logical block size of <dev path> will be used.

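The arithmetic behind the `<underlying sectors>` parameter in this hunk (a power-of-two count of 512-byte sectors) can be spelled out as a short sketch. The function name is hypothetical and the check only mirrors the "2^N supported" wording; any further limits the target itself enforces are not modelled.

```python
SECTOR_BYTES = 512  # sector size used in the hunk's own example


def emulated_block_size(sectors: int) -> int:
    """Return the emulated logical block size in bytes for a 2^N sector count."""
    # A positive integer is a power of two iff it has exactly one bit set.
    if sectors <= 0 or sectors & (sectors - 1):
        raise ValueError("sector count must be a power of two")
    return sectors * SECTOR_BYTES
```

With the value from the text, `emulated_block_size(8)` gives 4096 bytes, i.e. 4KiB.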
@@ -46,7 +46,7 @@ just like conventional zones.
 The zones of the device(s) are separated into 2 types:
 
 1) Metadata zones: these are conventional zones used to store metadata.
-Metadata zones are not reported as useable capacity to the user.
+Metadata zones are not reported as usable capacity to the user.
 
 2) Data zones: all remaining zones, the vast majority of which will be
 sequential zones used exclusively to store user data. The conventional

@@ -35,7 +35,7 @@ An example of undoing an existing dm-stripe
 
 This small bash script will setup 4 loop devices and use the existing
 striped target to combine the 4 devices into one. It then will use
-the unstriped target ontop of the striped device to access the
+the unstriped target on top of the striped device to access the
 individual backing loop devices. We write data to the newly exposed
 unstriped devices and verify the data written matches the correct
 underlying device on the striped array::
@@ -110,8 +110,8 @@ to get a 92% reduction in read latency using this device mapper target.
 Example dmsetup usage
 =====================
 
-unstriped ontop of Intel NVMe device that has 2 cores
------------------------------------------------------
+unstriped on top of Intel NVMe device that has 2 cores
+------------------------------------------------------
 
 ::
 
@@ -124,8 +124,8 @@ respectively::
 /dev/mapper/nvmset0
 /dev/mapper/nvmset1
 
-unstriped ontop of striped with 4 drives using 128K chunk size
---------------------------------------------------------------
+unstriped on top of striped with 4 drives using 128K chunk size
+---------------------------------------------------------------
 
 ::
 

@@ -330,7 +330,7 @@ Examples
 
 // boot-args example, with newlines and comments for readability
 Kernel command line: ...
-// see whats going on in dyndbg=value processing
+// see what's going on in dyndbg=value processing
 dynamic_debug.verbose=3
 // enable pr_debugs in the btrfs module (can be builtin or loadable)
 btrfs.dyndbg="+p"

@@ -123,7 +123,7 @@ Each simulated GPIO chip creates a separate sysfs group under its device
 directory for each exposed line
 (e.g. ``/sys/devices/platform/gpio-sim.X/gpiochipY/``). The name of each group
 is of the form: ``'sim_gpioX'`` where X is the offset of the line. Inside each
-group there are two attibutes:
+group there are two attributes:
 
 ``pull`` - allows to read and set the current simulated pull setting for
 every line, when writing the value must be one of: ``'pull-up'``,

@@ -64,8 +64,8 @@ architecture section: :ref:`Documentation/x86/mds.rst <mds>`.
 Attack scenarios
 ----------------
 
-Attacks against the MDS vulnerabilities can be mounted from malicious non
-priviledged user space applications running on hosts or guest. Malicious
+Attacks against the MDS vulnerabilities can be mounted from malicious non-
+privileged user space applications running on hosts or guest. Malicious
 guest OSes can obviously mount attacks as well.
 
 Contrary to other speculation based vulnerabilities the MDS vulnerability

@@ -56,6 +56,17 @@ ABI will be found here.
 
 sysfs-rules
 
+This is the beginning of a section with information of interest to
+application developers and system integrators doing analysis of the
+Linux kernel for safety critical applications. Documents supporting
+analysis of kernel interactions with applications, and key kernel
+subsystems expectations will be found here.
+
+.. toctree::
+:maxdepth: 1
+
+workload-tracing
+
 The rest of this manual consists of various unordered guides on how to
 configure specific aspects of kernel behavior to your liking.
 

@@ -378,18 +378,16 @@
 autoconf= [IPV6]
 See Documentation/networking/ipv6.rst.
 
-show_lapic= [APIC,X86] Advanced Programmable Interrupt Controller
-Limit apic dumping. The parameter defines the maximal
-number of local apics being dumped. Also it is possible
-to set it to "all" by meaning -- no limit here.
-Format: { 1 (default) | 2 | ... | all }.
-The parameter valid if only apic=debug or
-apic=verbose is specified.
-Example: apic=debug show_lapic=all
-
 apm= [APM] Advanced Power Management
 See header of arch/x86/kernel/apm_32.c.
 
+apparmor= [APPARMOR] Disable or enable AppArmor at boot time
+Format: { "0" | "1" }
+See security/apparmor/Kconfig help text
+0 -- disable.
+1 -- enable.
+Default value is set via kernel config option.
+
 arcrimi= [HW,NET] ARCnet - "RIM I" (entirely mem-mapped) cards
 Format: <io>,<irq>,<nodeID>
 
@@ -480,8 +478,10 @@
 See Documentation/block/cmdline-partition.rst
 
 boot_delay= Milliseconds to delay each printk during boot.
-Values larger than 10 seconds (10000) are changed to
-no delay (0).
+Only works if CONFIG_BOOT_PRINTK_DELAY is enabled,
+and you may also have to specify "lpj=". Boot_delay
+values larger than 10 seconds (10000) are assumed
+erroneous and ignored.
 Format: integer
 
 bootconfig [KNL]
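The rule the updated `boot_delay=` text describes can be modelled in a few lines. This is a sketch of the documented behaviour only, not the kernel's code: the function name is hypothetical, "ignored" is assumed to mean that no delay is applied (as the removed wording put it), and the "lpj=" interaction is left out.

```python
MAX_BOOT_DELAY_MS = 10_000  # 10 seconds, the limit quoted in the hunk


def effective_boot_delay(value: str, config_enabled: bool) -> int:
    """Return the per-printk delay in milliseconds that would take effect."""
    if not config_enabled:
        # Without CONFIG_BOOT_PRINTK_DELAY the parameter has no effect.
        return 0
    delay_ms = int(value)  # documented format: integer
    if delay_ms > MAX_BOOT_DELAY_MS:
        # Values above 10 seconds are assumed erroneous and ignored.
        return 0
    return delay_ms
```

Under these assumptions, `boot_delay=500` delays each printk by 500 ms when the config option is enabled, while `boot_delay=20000` has no effect.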
@@ -673,7 +673,7 @@
 Sets the size of kernel per-numa memory area for
 contiguous memory allocations. A value of 0 disables
 per-numa CMA altogether. And If this option is not
-specificed, the default value is 0.
+specified, the default value is 0.
 With per-numa CMA enabled, DMA users on node nid will
 first try to allocate buffer from the pernuma area
 which is located in node nid, if the allocation fails,
@@ -945,7 +945,7 @@
 driver code when a CPU writes to (or reads from) a
 random memory location. Note that there exists a class
 of memory corruptions problems caused by buggy H/W or
-F/W or by drivers badly programing DMA (basically when
+F/W or by drivers badly programming DMA (basically when
 memory is written at bus level and the CPU MMU is
 bypassed) which are not detectable by
 CONFIG_DEBUG_PAGEALLOC, hence this option will not help
@@ -1046,26 +1046,12 @@
 can be useful when debugging issues that require an SLB
 miss to occur.
 
-stress_slb [PPC]
-Limits the number of kernel SLB entries, and flushes
-them frequently to increase the rate of SLB faults
-on kernel addresses.
-
-stress_hpt [PPC]
-Limits the number of kernel HPT entries in the hash
-page table to increase the rate of hash page table
-faults on kernel addresses.
-
 disable= [IPV6]
 See Documentation/networking/ipv6.rst.
 
 disable_radix [PPC]
 Disable RADIX MMU mode on POWER9
 
-radix_hcall_invalidate=on [PPC/PSERIES]
-Disable RADIX GTSE feature and use hcall for TLB
-invalidate.
-
 disable_tlbie [PPC]
 Disable TLBIE instruction. Currently does not work
 with KVM, with HASH MMU, or with coherent accelerators.
@@ -1167,16 +1153,6 @@
 Documentation/admin-guide/dynamic-debug-howto.rst
 for details.
 
-nopku [X86] Disable Memory Protection Keys CPU feature found
-in some Intel CPUs.
-
-<module>.async_probe[=<bool>] [KNL]
-If no <bool> value is specified or if the value
-specified is not a valid <bool>, enable asynchronous
-probe on this module. Otherwise, enable/disable
-asynchronous probe on this module as indicated by the
-<bool> value. See also: module.async_probe
-
 early_ioremap_debug [KNL]
 Enable debug messages in early_ioremap support. This
 is useful for tracking down temporary early mappings
@@ -1753,7 +1729,7 @@
 boot-time allocation of gigantic hugepages is skipped.
 
 hugetlb_free_vmemmap=
-[KNL] Reguires CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
+[KNL] Requires CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
 enabled.
 Control if HugeTLB Vmemmap Optimization (HVO) is enabled.
 Allows heavy hugetlb users to free up some more
@@ -1792,12 +1768,6 @@
 which allow the hypervisor to 'idle' the
 guest on lock contention.
 
-keep_bootcon [KNL]
-Do not unregister boot console at start. This is only
-useful for debugging when something happens in the window
-between unregistering the boot console and initializing
-the real console.
-
 i2c_bus= [HW] Override the default board specific I2C bus speed
 or register an additional I2C bus that is not
 registered from board initialization code.
@@ -2367,17 +2337,18 @@
 js= [HW,JOY] Analog joystick
 See Documentation/input/joydev/joystick.rst.
 
-nokaslr [KNL]
-When CONFIG_RANDOMIZE_BASE is set, this disables
-kernel and module base offset ASLR (Address Space
-Layout Randomization).
-
 kasan_multi_shot
 [KNL] Enforce KASAN (Kernel Address Sanitizer) to print
 report on every invalid memory access. Without this
 parameter KASAN will print report only for the first
 invalid access.
 
+keep_bootcon [KNL]
+Do not unregister boot console at start. This is only
+useful for debugging when something happens in the window
+between unregistering the boot console and initializing
+the real console.
+
 keepinitrd [HW,ARM]
 
 kernelcore= [KNL,X86,IA-64,PPC]
@@ -3326,6 +3297,13 @@
 For details see:
 Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst
 
+<module>.async_probe[=<bool>] [KNL]
+If no <bool> value is specified or if the value
+specified is not a valid <bool>, enable asynchronous
+probe on this module. Otherwise, enable/disable
+asynchronous probe on this module as indicated by the
+<bool> value. See also: module.async_probe
+
 module.async_probe=<bool>
 [KNL] When set to true, modules will use async probing
 by default. To enable/disable async probing for a
@@ -3709,7 +3687,7 @@
 implementation; requires CONFIG_GENERIC_IDLE_POLL_SETUP
 to be effective. This is useful on platforms where the
 sleep(SH) or wfi(ARM,ARM64) instructions do not work
-correctly or when doing power measurements to evalute
+correctly or when doing power measurements to evaluate
 the impact of the sleep instructions. This is also
 useful when using JTAG debugger.
 
@@ -3780,6 +3758,11 @@
 
 nojitter [IA-64] Disables jitter checking for ITC timers.
 
+nokaslr [KNL]
+When CONFIG_RANDOMIZE_BASE is set, this disables
+kernel and module base offset ASLR (Address Space
+Layout Randomization).
+
 no-kvmclock [X86,KVM] Disable paravirtualized KVM clock driver
 
 no-kvmapf [X86,KVM] Disable paravirtualized asynchronous page
@@ -3825,6 +3808,19 @@
 
 nopcid [X86-64] Disable the PCID cpu feature.
 
+nopku [X86] Disable Memory Protection Keys CPU feature found
+in some Intel CPUs.
+
+nopv= [X86,XEN,KVM,HYPER_V,VMWARE]
+Disables the PV optimizations forcing the guest to run
+as generic guest with no PV drivers. Currently support
+XEN HVM, KVM, HYPER_V and VMWARE guest.
+
+nopvspin [X86,XEN,KVM]
+Disables the qspinlock slow path using PV optimizations
+which allow the hypervisor to 'idle' the guest on lock
+contention.
+
 norandmaps Don't use address space randomization. Equivalent to
 echo 0 > /proc/sys/kernel/randomize_va_space
 
@@ -4592,6 +4588,10 @@
 
 r128= [HW,DRM]
 
+radix_hcall_invalidate=on [PPC/PSERIES]
+Disable RADIX GTSE feature and use hcall for TLB
+invalidate.
+
 raid= [HW,RAID]
 See Documentation/admin-guide/md.rst.
 
@@ -5584,13 +5584,6 @@
 1 -- enable.
 Default value is 1.
 
-apparmor= [APPARMOR] Disable or enable AppArmor at boot time
-Format: { "0" | "1" }
-See security/apparmor/Kconfig help text
-0 -- disable.
-1 -- enable.
-Default value is set via kernel config option.
-
 serialnumber [BUGS=X86-32]
 
 sev=option[,option...] [X86-64] See Documentation/x86/x86_64/boot-options.rst
@@ -5598,6 +5591,15 @@
 shapers= [NET]
 Maximal number of shapers.
 
+show_lapic= [APIC,X86] Advanced Programmable Interrupt Controller
+Limit apic dumping. The parameter defines the maximal
+number of local apics being dumped. Also it is possible
+to set it to "all" by meaning -- no limit here.
+Format: { 1 (default) | 2 | ... | all }.
+The parameter valid if only apic=debug or
+apic=verbose is specified.
+Example: apic=debug show_lapic=all
+
 simeth= [IA-64]
 simscsi=
 
@@ -6037,6 +6039,16 @@
 be used to filter out binaries which have
 not yet been made aware of AT_MINSIGSTKSZ.
 
+stress_hpt [PPC]
+Limits the number of kernel HPT entries in the hash
+page table to increase the rate of hash page table
+faults on kernel addresses.
+
+stress_slb [PPC]
+Limits the number of kernel SLB entries, and flushes
+them frequently to increase the rate of SLB faults
+on kernel addresses.
+
 sunrpc.min_resvport=
 sunrpc.max_resvport=
 [NFS,SUNRPC]
@@ -6290,7 +6302,7 @@
 that can be enabled or disabled just as if you were
 to echo the option name into
 
-/sys/kernel/debug/tracing/trace_options
+/sys/kernel/tracing/trace_options
 
 For example, to enable stacktrace option (to dump the
 stack trace of each event), add to the command line:
@@ -6323,7 +6335,7 @@
 [FTRACE] enable this option to disable tracing when a
 warning is hit. This turns off "tracing_on". Tracing can
 be enabled again by echoing '1' into the "tracing_on"
-file located in /sys/kernel/debug/tracing/
+file located in /sys/kernel/tracing/
 
 This option is useful, as it disables the trace before
 the WARNING dump is called, which prevents the trace to
@ -6778,11 +6790,11 @@
|
|||||||
functions are at fixed addresses, they make nice
|
functions are at fixed addresses, they make nice
|
||||||
targets for exploits that can control RIP.
|
targets for exploits that can control RIP.
|
||||||
|
|
||||||
emulate [default] Vsyscalls turn into traps and are
|
emulate Vsyscalls turn into traps and are emulated
|
||||||
emulated reasonably safely. The vsyscall
|
reasonably safely. The vsyscall page is
|
||||||
page is readable.
|
readable.
|
||||||
|
|
||||||
xonly Vsyscalls turn into traps and are
|
xonly [default] Vsyscalls turn into traps and are
|
||||||
emulated reasonably safely. The vsyscall
|
emulated reasonably safely. The vsyscall
|
||||||
page is not readable.
|
page is not readable.
|
||||||
|
|
||||||
@ -6979,16 +6991,6 @@
|
|||||||
fairer and the number of possible event channels is
|
fairer and the number of possible event channels is
|
||||||
much higher. Default is on (use fifo events).
|
much higher. Default is on (use fifo events).
|
||||||
|
|
||||||
nopv= [X86,XEN,KVM,HYPER_V,VMWARE]
|
|
||||||
Disables the PV optimizations forcing the guest to run
|
|
||||||
as generic guest with no PV drivers. Currently support
|
|
||||||
XEN HVM, KVM, HYPER_V and VMWARE guest.
|
|
||||||
|
|
||||||
nopvspin [X86,XEN,KVM]
|
|
||||||
Disables the qspinlock slow path using PV optimizations
|
|
||||||
which allow the hypervisor to 'idle' the guest on lock
|
|
||||||
contention.
|
|
||||||
|
|
||||||
xirc2ps_cs= [NET,PCMCIA]
|
xirc2ps_cs= [NET,PCMCIA]
|
||||||
Format:
|
Format:
|
||||||
<irq>,<irq_mask>,<io>,<full_duplex>,<do_sound>,<lockup_hack>[,<irq2>[,<irq3>[,<irq4>]]]
|
<irq>,<irq_mask>,<io>,<full_duplex>,<do_sound>,<lockup_hack>[,<irq2>[,<irq3>[,<irq4>]]]
|
||||||
|
@ -25,7 +25,7 @@ References
|
|||||||
|
|
||||||
- In order to locate kernel-generated OS jitter on CPU N:
|
- In order to locate kernel-generated OS jitter on CPU N:
|
||||||
|
|
||||||
cd /sys/kernel/debug/tracing
|
cd /sys/kernel/tracing
|
||||||
echo 1 > max_graph_depth # Increase the "1" for more detail
|
echo 1 > max_graph_depth # Increase the "1" for more detail
|
||||||
echo function_graph > current_tracer
|
echo function_graph > current_tracer
|
||||||
# run workload
|
# run workload
|
||||||
|
@ -1488,7 +1488,7 @@ Example of command to set keyboard language is mentioned below::
|
|||||||
Text corresponding to keyboard layout to be set in sysfs are: be(Belgian),
|
Text corresponding to keyboard layout to be set in sysfs are: be(Belgian),
|
||||||
cz(Czech), da(Danish), de(German), en(English), es(Spain), et(Estonian),
|
cz(Czech), da(Danish), de(German), en(English), es(Spain), et(Estonian),
|
||||||
fr(French), fr-ch(French(Switzerland)), hu(Hungarian), it(Italy), jp (Japan),
|
fr(French), fr-ch(French(Switzerland)), hu(Hungarian), it(Italy), jp (Japan),
|
||||||
nl(Dutch), nn(Norway), pl(Polish), pt(portugese), sl(Slovenian), sv(Sweden),
|
nl(Dutch), nn(Norway), pl(Polish), pt(portuguese), sl(Slovenian), sv(Sweden),
|
||||||
tr(Turkey)
|
tr(Turkey)
|
||||||
|
|
||||||
WWAN Antenna type
|
WWAN Antenna type
|
||||||
|
@ -317,7 +317,7 @@ All md devices contain:
|
|||||||
suspended (not supported yet)
|
suspended (not supported yet)
|
||||||
All IO requests will block. The array can be reconfigured.
|
All IO requests will block. The array can be reconfigured.
|
||||||
|
|
||||||
Writing this, if accepted, will block until array is quiessent
|
Writing this, if accepted, will block until array is quiescent
|
||||||
|
|
||||||
readonly
|
readonly
|
||||||
no resync can happen. no superblocks get written.
|
no resync can happen. no superblocks get written.
|
||||||
|
@ -909,7 +909,7 @@ DE hat diverse Treiber fuer diese Modelle (Stand 09/2002):
|
|||||||
- TVPhone98 (Bt878)
|
- TVPhone98 (Bt878)
|
||||||
- AVerTV und TVCapture98 w/VCR (Bt 878)
|
- AVerTV und TVCapture98 w/VCR (Bt 878)
|
||||||
- AVerTVStudio und TVPhone98 w/VCR (Bt878)
|
- AVerTVStudio und TVPhone98 w/VCR (Bt878)
|
||||||
- AVerTV GO Serie (Kein SVideo Input)
|
- AVerTV GO Series (Kein SVideo Input)
|
||||||
- AVerTV98 (BT-878 chip)
|
- AVerTV98 (BT-878 chip)
|
||||||
- AVerTV98 mit Fernbedienung (BT-878 chip)
|
- AVerTV98 mit Fernbedienung (BT-878 chip)
|
||||||
- AVerTV/FM98 (BT-878 chip)
|
- AVerTV/FM98 (BT-878 chip)
|
||||||
|
@ -137,7 +137,7 @@ The ``LIRC user interface`` option adds enhanced functionality when using the
|
|||||||
from remote controllers.
|
from remote controllers.
|
||||||
|
|
||||||
The ``Support for eBPF programs attached to lirc devices`` option allows
|
The ``Support for eBPF programs attached to lirc devices`` option allows
|
||||||
the usage of special programs (called eBPF) that would allow aplications
|
the usage of special programs (called eBPF) that would allow applications
|
||||||
to add extra remote controller decoding functionality to the Linux Kernel.
|
to add extra remote controller decoding functionality to the Linux Kernel.
|
||||||
|
|
||||||
The ``Remote controller decoders`` option allows selecting the
|
The ``Remote controller decoders`` option allows selecting the
|
||||||
|
@ -142,7 +142,7 @@ The drivers exposes following files:
|
|||||||
indicator
|
indicator
|
||||||
0x18 lassi Signed Low side adjacent Channel
|
0x18 lassi Signed Low side adjacent Channel
|
||||||
Strength indicator
|
Strength indicator
|
||||||
0x19 hassi ditto fpr High side
|
0x19 hassi ditto for High side
|
||||||
0x20 mult Multipath indicator
|
0x20 mult Multipath indicator
|
||||||
0x21 dev Frequency deviation
|
0x21 dev Frequency deviation
|
||||||
0x24 assi Adjacent channel SSI
|
0x24 assi Adjacent channel SSI
|
||||||
|
@ -580,7 +580,7 @@ Metadata Capture
|
|||||||
----------------
|
----------------
|
||||||
|
|
||||||
The Metadata capture generates UVC format metadata. The PTS and SCR are
|
The Metadata capture generates UVC format metadata. The PTS and SCR are
|
||||||
transmitted based on the values set in vivid contols.
|
transmitted based on the values set in vivid controls.
|
||||||
|
|
||||||
The Metadata device will only work for the Webcam input, it will give
|
The Metadata device will only work for the Webcam input, it will give
|
||||||
back an error for all other inputs.
|
back an error for all other inputs.
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _mm_concepts:
|
|
||||||
|
|
||||||
=================
|
=================
|
||||||
Concepts overview
|
Concepts overview
|
||||||
=================
|
=================
|
||||||
@ -86,16 +84,15 @@ memory with the huge pages. The first one is `HugeTLB filesystem`, or
|
|||||||
hugetlbfs. It is a pseudo filesystem that uses RAM as its backing
|
hugetlbfs. It is a pseudo filesystem that uses RAM as its backing
|
||||||
store. For the files created in this filesystem the data resides in
|
store. For the files created in this filesystem the data resides in
|
||||||
the memory and mapped using huge pages. The hugetlbfs is described at
|
the memory and mapped using huge pages. The hugetlbfs is described at
|
||||||
:ref:`Documentation/admin-guide/mm/hugetlbpage.rst <hugetlbpage>`.
|
Documentation/admin-guide/mm/hugetlbpage.rst.
|
||||||
|
|
||||||
Another, more recent, mechanism that enables use of the huge pages is
|
Another, more recent, mechanism that enables use of the huge pages is
|
||||||
called `Transparent HugePages`, or THP. Unlike the hugetlbfs that
|
called `Transparent HugePages`, or THP. Unlike the hugetlbfs that
|
||||||
requires users and/or system administrators to configure what parts of
|
requires users and/or system administrators to configure what parts of
|
||||||
the system memory should and can be mapped by the huge pages, THP
|
the system memory should and can be mapped by the huge pages, THP
|
||||||
manages such mappings transparently to the user and hence the
|
manages such mappings transparently to the user and hence the
|
||||||
name. See
|
name. See Documentation/admin-guide/mm/transhuge.rst for more details
|
||||||
:ref:`Documentation/admin-guide/mm/transhuge.rst <admin_guide_transhuge>`
|
about THP.
|
||||||
for more details about THP.
|
|
||||||
|
|
||||||
Zones
|
Zones
|
||||||
=====
|
=====
|
||||||
@ -125,8 +122,8 @@ processor. Each bank is referred to as a `node` and for each node Linux
|
|||||||
constructs an independent memory management subsystem. A node has its
|
constructs an independent memory management subsystem. A node has its
|
||||||
own set of zones, lists of free and used pages and various statistics
|
own set of zones, lists of free and used pages and various statistics
|
||||||
counters. You can find more details about NUMA in
|
counters. You can find more details about NUMA in
|
||||||
:ref:`Documentation/mm/numa.rst <numa>` and in
|
Documentation/mm/numa.rst and in
|
||||||
:ref:`Documentation/admin-guide/mm/numa_memory_policy.rst <numa_memory_policy>`.
|
Documentation/admin-guide/mm/numa_memory_policy.rst.
|
||||||
|
|
||||||
Page cache
|
Page cache
|
||||||
==========
|
==========
|
||||||
|
@ -54,7 +54,7 @@ that is built with ``CONFIG_DAMON_LRU_SORT=y``.
|
|||||||
To let sysadmins enable or disable it and tune for the given system,
|
To let sysadmins enable or disable it and tune for the given system,
|
||||||
DAMON_LRU_SORT utilizes module parameters. That is, you can put
|
DAMON_LRU_SORT utilizes module parameters. That is, you can put
|
||||||
``damon_lru_sort.<parameter>=<value>`` on the kernel boot command line or write
|
``damon_lru_sort.<parameter>=<value>`` on the kernel boot command line or write
|
||||||
proper values to ``/sys/modules/damon_lru_sort/parameters/<parameter>`` files.
|
proper values to ``/sys/module/damon_lru_sort/parameters/<parameter>`` files.
|
||||||
|
|
||||||
Below are the description of each parameter.
|
Below are the description of each parameter.
|
||||||
|
|
||||||
@ -283,7 +283,7 @@ doesn't make progress and therefore the free memory rate becomes lower than
|
|||||||
20%, it asks DAMON_LRU_SORT to do nothing again, so that we can fall back to
|
20%, it asks DAMON_LRU_SORT to do nothing again, so that we can fall back to
|
||||||
the LRU-list based page granularity reclamation. ::
|
the LRU-list based page granularity reclamation. ::
|
||||||
|
|
||||||
# cd /sys/modules/damon_lru_sort/parameters
|
# cd /sys/module/damon_lru_sort/parameters
|
||||||
# echo 500 > hot_thres_access_freq
|
# echo 500 > hot_thres_access_freq
|
||||||
# echo 120000000 > cold_min_age
|
# echo 120000000 > cold_min_age
|
||||||
# echo 10 > quota_ms
|
# echo 10 > quota_ms
|
||||||
|
@ -46,7 +46,7 @@ that is built with ``CONFIG_DAMON_RECLAIM=y``.
|
|||||||
To let sysadmins enable or disable it and tune for the given system,
|
To let sysadmins enable or disable it and tune for the given system,
|
||||||
DAMON_RECLAIM utilizes module parameters. That is, you can put
|
DAMON_RECLAIM utilizes module parameters. That is, you can put
|
||||||
``damon_reclaim.<parameter>=<value>`` on the kernel boot command line or write
|
``damon_reclaim.<parameter>=<value>`` on the kernel boot command line or write
|
||||||
proper values to ``/sys/modules/damon_reclaim/parameters/<parameter>`` files.
|
proper values to ``/sys/module/damon_reclaim/parameters/<parameter>`` files.
|
||||||
|
|
||||||
Below are the description of each parameter.
|
Below are the description of each parameter.
|
||||||
|
|
||||||
@ -251,7 +251,7 @@ therefore the free memory rate becomes lower than 20%, it asks DAMON_RECLAIM to
|
|||||||
do nothing again, so that we can fall back to the LRU-list based page
|
do nothing again, so that we can fall back to the LRU-list based page
|
||||||
granularity reclamation. ::
|
granularity reclamation. ::
|
||||||
|
|
||||||
# cd /sys/modules/damon_reclaim/parameters
|
# cd /sys/module/damon_reclaim/parameters
|
||||||
# echo 30000000 > min_age
|
# echo 30000000 > min_age
|
||||||
# echo $((1 * 1024 * 1024 * 1024)) > quota_sz
|
# echo $((1 * 1024 * 1024 * 1024)) > quota_sz
|
||||||
# echo 1000 > quota_reset_interval_ms
|
# echo 1000 > quota_reset_interval_ms
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _hugetlbpage:
|
|
||||||
|
|
||||||
=============
|
=============
|
||||||
HugeTLB Pages
|
HugeTLB Pages
|
||||||
=============
|
=============
|
||||||
@ -86,7 +84,7 @@ by increasing or decreasing the value of ``nr_hugepages``.
|
|||||||
|
|
||||||
Note: When the feature of freeing unused vmemmap pages associated with each
|
Note: When the feature of freeing unused vmemmap pages associated with each
|
||||||
hugetlb page is enabled, we can fail to free the huge pages triggered by
|
hugetlb page is enabled, we can fail to free the huge pages triggered by
|
||||||
the user when ths system is under memory pressure. Please try again later.
|
the user when the system is under memory pressure. Please try again later.
|
||||||
|
|
||||||
Pages that are used as huge pages are reserved inside the kernel and cannot
|
Pages that are used as huge pages are reserved inside the kernel and cannot
|
||||||
be used for other purposes. Huge pages cannot be swapped out under
|
be used for other purposes. Huge pages cannot be swapped out under
|
||||||
@ -313,7 +311,7 @@ memory policy mode--bind, preferred, local or interleave--may be used. The
|
|||||||
resulting effect on persistent huge page allocation is as follows:
|
resulting effect on persistent huge page allocation is as follows:
|
||||||
|
|
||||||
#. Regardless of mempolicy mode [see
|
#. Regardless of mempolicy mode [see
|
||||||
:ref:`Documentation/admin-guide/mm/numa_memory_policy.rst <numa_memory_policy>`],
|
Documentation/admin-guide/mm/numa_memory_policy.rst],
|
||||||
persistent huge pages will be distributed across the node or nodes
|
persistent huge pages will be distributed across the node or nodes
|
||||||
specified in the mempolicy as if "interleave" had been specified.
|
specified in the mempolicy as if "interleave" had been specified.
|
||||||
However, if a node in the policy does not contain sufficient contiguous
|
However, if a node in the policy does not contain sufficient contiguous
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _idle_page_tracking:
|
|
||||||
|
|
||||||
==================
|
==================
|
||||||
Idle Page Tracking
|
Idle Page Tracking
|
||||||
==================
|
==================
|
||||||
@ -70,9 +68,8 @@ If the tool is run initially with the appropriate option, it will mark all the
|
|||||||
queried pages as idle. Subsequent runs of the tool can then show which pages have
|
queried pages as idle. Subsequent runs of the tool can then show which pages have
|
||||||
their idle flag cleared in the interim.
|
their idle flag cleared in the interim.
|
||||||
|
|
||||||
See :ref:`Documentation/admin-guide/mm/pagemap.rst <pagemap>` for more
|
See Documentation/admin-guide/mm/pagemap.rst for more information about
|
||||||
information about ``/proc/pid/pagemap``, ``/proc/kpageflags``, and
|
``/proc/pid/pagemap``, ``/proc/kpageflags``, and ``/proc/kpagecgroup``.
|
||||||
``/proc/kpagecgroup``.
|
|
||||||
|
|
||||||
.. _impl_details:
|
.. _impl_details:
|
||||||
|
|
||||||
|
@ -16,8 +16,7 @@ are described in Documentation/admin-guide/sysctl/vm.rst and in `man 5 proc`_.
|
|||||||
.. _man 5 proc: http://man7.org/linux/man-pages/man5/proc.5.html
|
.. _man 5 proc: http://man7.org/linux/man-pages/man5/proc.5.html
|
||||||
|
|
||||||
Linux memory management has its own jargon and if you are not yet
|
Linux memory management has its own jargon and if you are not yet
|
||||||
familiar with it, consider reading
|
familiar with it, consider reading Documentation/admin-guide/mm/concepts.rst.
|
||||||
:ref:`Documentation/admin-guide/mm/concepts.rst <mm_concepts>`.
|
|
||||||
|
|
||||||
Here we document in detail how to interact with various mechanisms in
|
Here we document in detail how to interact with various mechanisms in
|
||||||
the Linux memory management.
|
the Linux memory management.
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _admin_guide_ksm:
|
|
||||||
|
|
||||||
=======================
|
=======================
|
||||||
Kernel Samepage Merging
|
Kernel Samepage Merging
|
||||||
=======================
|
=======================
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _admin_guide_memory_hotplug:
|
|
||||||
|
|
||||||
==================
|
==================
|
||||||
Memory Hot(Un)Plug
|
Memory Hot(Un)Plug
|
||||||
==================
|
==================
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _numa_memory_policy:
|
|
||||||
|
|
||||||
==================
|
==================
|
||||||
NUMA Memory Policy
|
NUMA Memory Policy
|
||||||
==================
|
==================
|
||||||
@ -246,7 +244,7 @@ MPOL_INTERLEAVED
|
|||||||
interleaved system default policy works in this mode.
|
interleaved system default policy works in this mode.
|
||||||
|
|
||||||
MPOL_PREFERRED_MANY
|
MPOL_PREFERRED_MANY
|
||||||
This mode specifices that the allocation should be preferrably
|
This mode specifies that the allocation should be preferably
|
||||||
satisfied from the nodemask specified in the policy. If there is
|
satisfied from the nodemask specified in the policy. If there is
|
||||||
a memory pressure on all nodes in the nodemask, the allocation
|
a memory pressure on all nodes in the nodemask, the allocation
|
||||||
can fall back to all existing numa nodes. This is effectively
|
can fall back to all existing numa nodes. This is effectively
|
||||||
@ -360,7 +358,7 @@ and NUMA nodes. "Usage" here means one of the following:
|
|||||||
2) examination of the policy to determine the policy mode and associated node
|
2) examination of the policy to determine the policy mode and associated node
|
||||||
or node lists, if any, for page allocation. This is considered a "hot
|
or node lists, if any, for page allocation. This is considered a "hot
|
||||||
path". Note that for MPOL_BIND, the "usage" extends across the entire
|
path". Note that for MPOL_BIND, the "usage" extends across the entire
|
||||||
allocation process, which may sleep during page reclaimation, because the
|
allocation process, which may sleep during page reclamation, because the
|
||||||
BIND policy nodemask is used, by reference, to filter ineligible nodes.
|
BIND policy nodemask is used, by reference, to filter ineligible nodes.
|
||||||
|
|
||||||
We can avoid taking an extra reference during the usages listed above as
|
We can avoid taking an extra reference during the usages listed above as
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _numaperf:
|
|
||||||
|
|
||||||
=============
|
=============
|
||||||
NUMA Locality
|
NUMA Locality
|
||||||
=============
|
=============
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _pagemap:
|
|
||||||
|
|
||||||
=============================
|
=============================
|
||||||
Examining Process Page Tables
|
Examining Process Page Tables
|
||||||
=============================
|
=============================
|
||||||
@ -19,10 +17,10 @@ There are four components to pagemap:
|
|||||||
* Bits 0-4 swap type if swapped
|
* Bits 0-4 swap type if swapped
|
||||||
* Bits 5-54 swap offset if swapped
|
* Bits 5-54 swap offset if swapped
|
||||||
* Bit 55 pte is soft-dirty (see
|
* Bit 55 pte is soft-dirty (see
|
||||||
:ref:`Documentation/admin-guide/mm/soft-dirty.rst <soft_dirty>`)
|
Documentation/admin-guide/mm/soft-dirty.rst)
|
||||||
* Bit 56 page exclusively mapped (since 4.2)
|
* Bit 56 page exclusively mapped (since 4.2)
|
||||||
* Bit 57 pte is uffd-wp write-protected (since 5.13) (see
|
* Bit 57 pte is uffd-wp write-protected (since 5.13) (see
|
||||||
:ref:`Documentation/admin-guide/mm/userfaultfd.rst <userfaultfd>`)
|
Documentation/admin-guide/mm/userfaultfd.rst)
|
||||||
* Bits 58-60 zero
|
* Bits 58-60 zero
|
||||||
* Bit 61 page is file-page or shared-anon (since 3.5)
|
* Bit 61 page is file-page or shared-anon (since 3.5)
|
||||||
* Bit 62 page swapped
|
* Bit 62 page swapped
|
||||||
@ -105,8 +103,7 @@ Short descriptions to the page flags
|
|||||||
A compound page with order N consists of 2^N physically contiguous pages.
|
A compound page with order N consists of 2^N physically contiguous pages.
|
||||||
A compound page with order 2 takes the form of "HTTT", where H donates its
|
A compound page with order 2 takes the form of "HTTT", where H donates its
|
||||||
head page and T donates its tail page(s). The major consumers of compound
|
head page and T donates its tail page(s). The major consumers of compound
|
||||||
pages are hugeTLB pages
|
pages are hugeTLB pages (Documentation/admin-guide/mm/hugetlbpage.rst),
|
||||||
(:ref:`Documentation/admin-guide/mm/hugetlbpage.rst <hugetlbpage>`),
|
|
||||||
the SLUB etc. memory allocators and various device drivers.
|
the SLUB etc. memory allocators and various device drivers.
|
||||||
However in this interface, only huge/giga pages are made visible
|
However in this interface, only huge/giga pages are made visible
|
||||||
to end users.
|
to end users.
|
||||||
@ -128,7 +125,7 @@ Short descriptions to the page flags
|
|||||||
Zero page for pfn_zero or huge_zero page.
|
Zero page for pfn_zero or huge_zero page.
|
||||||
25 - IDLE
|
25 - IDLE
|
||||||
The page has not been accessed since it was marked idle (see
|
The page has not been accessed since it was marked idle (see
|
||||||
:ref:`Documentation/admin-guide/mm/idle_page_tracking.rst <idle_page_tracking>`).
|
Documentation/admin-guide/mm/idle_page_tracking.rst).
|
||||||
Note that this flag may be stale in case the page was accessed via
|
Note that this flag may be stale in case the page was accessed via
|
||||||
a PTE. To make sure the flag is up-to-date one has to read
|
a PTE. To make sure the flag is up-to-date one has to read
|
||||||
``/sys/kernel/mm/page_idle/bitmap`` first.
|
``/sys/kernel/mm/page_idle/bitmap`` first.
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _shrinker_debugfs:
|
|
||||||
|
|
||||||
==========================
|
==========================
|
||||||
Shrinker Debugfs Interface
|
Shrinker Debugfs Interface
|
||||||
==========================
|
==========================
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _soft_dirty:
|
|
||||||
|
|
||||||
===============
|
===============
|
||||||
Soft-Dirty PTEs
|
Soft-Dirty PTEs
|
||||||
===============
|
===============
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _swap_numa:
|
|
||||||
|
|
||||||
===========================================
|
===========================================
|
||||||
Automatically bind swap device to numa node
|
Automatically bind swap device to numa node
|
||||||
===========================================
|
===========================================
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _admin_guide_transhuge:
|
|
||||||
|
|
||||||
============================
|
============================
|
||||||
Transparent Hugepage Support
|
Transparent Hugepage Support
|
||||||
============================
|
============================
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _userfaultfd:
|
|
||||||
|
|
||||||
===========
|
===========
|
||||||
Userfaultfd
|
Userfaultfd
|
||||||
===========
|
===========
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _zswap:
|
|
||||||
|
|
||||||
=====
|
=====
|
||||||
zswap
|
zswap
|
||||||
=====
|
=====
|
||||||
|
@ -53,7 +53,7 @@ two events have same value of bits 0~15 of config, that means they are
|
|||||||
event pair. And the bit 16 of config indicates getting counter 0 or
|
event pair. And the bit 16 of config indicates getting counter 0 or
|
||||||
counter 1 of hardware event.
|
counter 1 of hardware event.
|
||||||
|
|
||||||
After getting two values of event pair in usersapce, the formula of
|
After getting two values of event pair in userspace, the formula of
|
||||||
computation to calculate real performance data is:::
|
computation to calculate real performance data is:::
|
||||||
|
|
||||||
counter 0 / counter 1
|
counter 0 / counter 1
|
||||||
|
@ -473,7 +473,7 @@ Unit Tests for amd-pstate
|
|||||||
|
|
||||||
* We can introduce more functional or performance tests to align the result together, it will benefit power and performance scale optimization.
|
* We can introduce more functional or performance tests to align the result together, it will benefit power and performance scale optimization.
|
||||||
|
|
||||||
1. Test case decriptions
|
1. Test case descriptions
|
||||||
|
|
||||||
1). Basic tests
|
1). Basic tests
|
||||||
|
|
||||||
|
@ -712,7 +712,7 @@ it works in the `active mode <Active Mode_>`_.
|
|||||||
The following sequence of shell commands can be used to enable them and see
|
The following sequence of shell commands can be used to enable them and see
|
||||||
their output (if the kernel is generally configured to support event tracing)::
|
their output (if the kernel is generally configured to support event tracing)::
|
||||||
|
|
||||||
# cd /sys/kernel/debug/tracing/
|
# cd /sys/kernel/tracing/
|
||||||
# echo 1 > events/power/pstate_sample/enable
|
# echo 1 > events/power/pstate_sample/enable
|
||||||
# echo 1 > events/power/cpu_frequency/enable
|
# echo 1 > events/power/cpu_frequency/enable
|
||||||
# cat trace
|
# cat trace
|
||||||
@ -732,7 +732,7 @@ The ``ftrace`` interface can be used for low-level diagnostics of
|
|||||||
P-state is called, the ``ftrace`` filter can be set to
|
P-state is called, the ``ftrace`` filter can be set to
|
||||||
:c:func:`intel_pstate_set_pstate`::
|
:c:func:`intel_pstate_set_pstate`::
|
||||||
|
|
||||||
# cd /sys/kernel/debug/tracing/
|
# cd /sys/kernel/tracing/
|
||||||
# cat available_filter_functions | grep -i pstate
|
# cat available_filter_functions | grep -i pstate
|
||||||
intel_pstate_set_pstate
|
intel_pstate_set_pstate
|
||||||
intel_pstate_cpu_init
|
intel_pstate_cpu_init
|
||||||
|
@ -1105,8 +1105,8 @@ speakup load
|
|||||||
Alternatively, you can add the above line to your file
|
Alternatively, you can add the above line to your file
|
||||||
~/.bashrc or ~/.bash_profile.
|
~/.bashrc or ~/.bash_profile.
|
||||||
|
|
||||||
If your system administrator ran himself the script, all the users will be able
|
If your system administrator himself ran the script, all the users will be able
|
||||||
to change from English to the language choosed by root and do directly
|
to change from English to the language chosen by root and do directly
|
||||||
speakupconf load (or add this to the ~/.bashrc or
|
speakupconf load (or add this to the ~/.bashrc or
|
||||||
~/.bash_profile file). If there are several languages to handle, the
|
~/.bash_profile file). If there are several languages to handle, the
|
||||||
administrator (or every user) will have to run the first steps until speakupconf
|
administrator (or every user) will have to run the first steps until speakupconf
|
||||||
|
@ -356,7 +356,7 @@ The lowmem_reserve_ratio is an array. You can see them by reading this file::
|
|||||||
|
|
||||||
But, these values are not used directly. The kernel calculates # of protection
|
But, these values are not used directly. The kernel calculates # of protection
|
||||||
pages for each zones from them. These are shown as array of protection pages
|
pages for each zones from them. These are shown as array of protection pages
|
||||||
in /proc/zoneinfo like followings. (This is an example of x86-64 box).
|
in /proc/zoneinfo like the following. (This is an example of x86-64 box).
|
||||||
Each zone has an array of protection pages like this::
|
Each zone has an array of protection pages like this::
|
||||||
|
|
||||||
Node 0, zone DMA
|
Node 0, zone DMA
|
||||||
@ -433,7 +433,7 @@ a 2bit error in a memory module) is detected in the background by hardware
|
|||||||
that cannot be handled by the kernel. In some cases (like the page
|
that cannot be handled by the kernel. In some cases (like the page
|
||||||
still having a valid copy on disk) the kernel will handle the failure
|
still having a valid copy on disk) the kernel will handle the failure
|
||||||
transparently without affecting any applications. But if there is
|
transparently without affecting any applications. But if there is
|
||||||
no other uptodate copy of the data it will kill to prevent any data
|
no other up-to-date copy of the data it will kill to prevent any data
|
||||||
corruptions from propagating.
|
corruptions from propagating.
|
||||||
|
|
||||||
1: Kill all processes that have the corrupted and not reloadable page mapped
|
1: Kill all processes that have the corrupted and not reloadable page mapped
|
||||||
|
@ -138,7 +138,7 @@ Command Function
|
|||||||
``v`` Forcefully restores framebuffer console
|
``v`` Forcefully restores framebuffer console
|
||||||
``v`` Causes ETM buffer dump [ARM-specific]
|
``v`` Causes ETM buffer dump [ARM-specific]
|
||||||
|
|
||||||
``w`` Dumps tasks that are in uninterruptable (blocked) state.
|
``w`` Dumps tasks that are in uninterruptible (blocked) state.
|
||||||
|
|
||||||
``x`` Used by xmon interface on ppc/powerpc platforms.
|
``x`` Used by xmon interface on ppc/powerpc platforms.
|
||||||
Show global PMU Registers on sparc64.
|
Show global PMU Registers on sparc64.
|
||||||
|
@ -87,7 +87,7 @@ migrated, unless the CPU is taken offline. In this case, threads
|
|||||||
belong to the offlined CPUs will be terminated immediately.
|
belong to the offlined CPUs will be terminated immediately.
|
||||||
|
|
||||||
Running as SCHED_FIFO and relatively high priority, also allows such
|
Running as SCHED_FIFO and relatively high priority, also allows such
|
||||||
scheme to work for both preemptable and non-preemptable kernels.
|
scheme to work for both preemptible and non-preemptible kernels.
|
||||||
Alignment of idle time around jiffies ensures scalability for HZ
|
Alignment of idle time around jiffies ensures scalability for HZ
|
||||||
values. This effect can be better visualized using a Perf timechart.
|
values. This effect can be better visualized using a Perf timechart.
|
||||||
The following diagram shows the behavior of kernel thread
|
The following diagram shows the behavior of kernel thread
|
||||||
|
606
Documentation/admin-guide/workload-tracing.rst
Normal file
@ -0,0 +1,606 @@
|
|||||||
|
.. SPDX-License-Identifier: (GPL-2.0+ OR CC-BY-4.0)
|
||||||
|
|
||||||
|
======================================================
|
||||||
|
Discovering Linux kernel subsystems used by a workload
|
||||||
|
======================================================
|
||||||
|
|
||||||
|
:Authors: - Shuah Khan <skhan@linuxfoundation.org>
|
||||||
|
- Shefali Sharma <sshefali021@gmail.com>
|
||||||
|
:maintained-by: Shuah Khan <skhan@linuxfoundation.org>
|
||||||
|
|
||||||
|
Key Points
|
||||||
|
==========
|
||||||
|
|
||||||
|
* Understanding system resources necessary to build and run a workload
|
||||||
|
is important.
|
||||||
|
* Linux tracing and strace can be used to discover the system resources
|
||||||
|
in use by a workload. The completeness of the system usage information
|
||||||
|
depends on the completeness of coverage of a workload.
|
||||||
|
* Performance and security of the operating system can be analyzed with
|
||||||
|
the help of tools such as:
|
||||||
|
`perf <https://man7.org/linux/man-pages/man1/perf.1.html>`_,
|
||||||
|
`stress-ng <https://www.mankier.com/1/stress-ng>`_,
|
||||||
|
`paxtest <https://github.com/opntr/paxtest-freebsd>`_.
|
||||||
|
* Once we discover and understand the workload needs, we can focus on them
|
||||||
|
to avoid regressions and use that understanding to evaluate safety considerations.
|
||||||
|
|
||||||
|
Methodology
===========

`strace <https://man7.org/linux/man-pages/man1/strace.1.html>`_ is a
diagnostic, instructional, and debugging tool and can be used to discover
the system resources in use by a workload. Once we discover and understand
the workload needs, we can focus on them to avoid regressions and use them
to evaluate safety considerations. We use the strace tool to trace workloads.

Tracing with strace tells us which system calls the workload invokes; it
does not tell us all the system calls the workload could invoke. In
addition, this tracing method tells us just the code paths within
these system calls that are invoked. As an example, if a workload opens a
file and reads from it successfully, then the success path is the one that
is traced. Any error paths in that system call will not be traced. If a
run provides full coverage of a workload, then the method
outlined here will trace and find all possible code paths. The completeness
of the system usage information depends on the completeness of coverage of a
workload.

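To make the success-path versus error-path point concrete, here is a minimal
sketch of a hypothetical workload script. The file names and the
``exercise_open`` and ``run_workload`` helpers are illustrative assumptions and
are not part of the original study. The script opens one file that exists and
one that does not, so a trace of it would contain both a successful and a
failing open.

```python
import errno
import os
import tempfile


def exercise_open(path):
    """Try to open and read a file; return ("ok", data) or ("error", errno name)."""
    try:
        with open(path, "rb") as handle:
            return "ok", handle.read()
    except OSError as exc:
        # errno.errorcode maps the numeric errno to a name such as "ENOENT".
        return "error", errno.errorcode.get(exc.errno, str(exc.errno))


def run_workload():
    """Exercise one success path and one error path of opening a file."""
    with tempfile.TemporaryDirectory() as tmpdir:
        present = os.path.join(tmpdir, "present.txt")
        with open(present, "wb") as handle:
            handle.write(b"hello")
        missing = os.path.join(tmpdir, "missing.txt")  # never created
        return exercise_open(present), exercise_open(missing)


if __name__ == "__main__":
    print(run_workload())
```

If this were saved under a name such as ``workload.py`` (a hypothetical name),
it could be traced with ``strace -c python3 workload.py``. Note that the Python
interpreter itself makes many system calls, so the trace would include much
more than the two opens shown here.
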
The goal is tracing a workload on a system running a default kernel without
requiring custom kernel installs.

How do we gather fine-grained system information?
=================================================

The strace tool can be used to trace system calls made by a process and
signals it receives. System calls are the fundamental interface between an
application and the operating system kernel. They enable a program to
request services from the kernel. For instance, the open() system call in
Linux is used to provide access to a file in the file system. strace enables
us to track all the system calls made by an application. It lists all the
system calls made by a process along with their arguments and return values.

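As a rough illustration of what a per-call strace listing looks like and how
it can be post-processed, the sketch below parses a few lines in the usual
``name(args) = result`` form. The sample lines are hand-written for
illustration and are not captured output from the workloads in this document;
``parse_strace_line`` and ``count_calls`` are hypothetical helper names.

```python
import re
from collections import Counter

# Matches "name(args) = result", optionally followed by an errno name.
LINE_RE = re.compile(
    r"^(?P<name>\w+)\((?P<args>.*)\)\s+=\s+"
    r"(?P<ret>0x[0-9a-f]+|-?\d+|\?)"
    r"(?:\s+(?P<errno>E[A-Z0-9]+))?"
)

# Hand-written sample in strace's usual output style (illustrative only).
SAMPLE = """\
openat(AT_FDCWD, "/etc/hosts", O_RDONLY|O_CLOEXEC) = 3
read(3, "127.0.0.1 localhost\\n", 4096) = 20
close(3) = 0
openat(AT_FDCWD, "/nonexistent", O_RDONLY) = -1 ENOENT (No such file or directory)
+++ exited with 0 +++
"""


def parse_strace_line(line):
    """Return (syscall, result, errno or None), or None for non-syscall lines."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return match.group("name"), match.group("ret"), match.group("errno")


def count_calls(text):
    """Count how many times each system call appears in the listing."""
    parsed = (parse_strace_line(line) for line in text.splitlines())
    return Counter(entry[0] for entry in parsed if entry is not None)


if __name__ == "__main__":
    print(count_calls(SAMPLE))
```
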
You can generate profiling data by combining the strace and perf record
tools to record the events and information associated with a process. This
provides insight into the process. The "perf annotate" tool generates the
statistics of each instruction of the program. This document goes over the
details of how to gather fine-grained information on a workload's usage of
system resources.

We used strace to trace the perf, stress-ng, and paxtest workloads to
illustrate our methodology to discover resources used by a workload. This
process can be applied to trace other workloads.

Getting the system ready for tracing
====================================

Before we can get started we will show you how to get your system ready.
We assume that you have a Linux distribution running on a physical system
or a virtual machine. Most distributions will include the strace command.
Let’s install the other tools needed to build the Linux kernel, which
aren’t usually included. Please note that the following works on Debian
based distributions. You might have to find equivalent packages on other
Linux distributions.

Install tools to build the Linux kernel and the tools in the kernel
repository. scripts/ver_linux is a good way to check if your system already
has the necessary tools::

  sudo apt-get install build-essential flex bison yacc
  sudo apt install libelf-dev systemtap-sdt-dev libaudit-dev libslang2-dev libperl-dev libdw-dev

cscope is a good tool to browse kernel sources. Let's install it now::

  sudo apt-get install cscope

Install stress-ng and paxtest::

  sudo apt-get install stress-ng
  sudo apt-get install paxtest

Workload overview
=================

As mentioned earlier, we used strace to trace perf bench, stress-ng and
paxtest workloads to show how to analyze a workload and identify Linux
subsystems used by these workloads. Let's start with an overview of these
three workloads to get a better understanding of what they do and how to
use them.

perf bench (all) workload
-------------------------

The perf bench command contains multiple multi-threaded microkernel
benchmarks for executing different subsystems in the Linux kernel and
system calls. This allows us to easily measure the impact of changes,
which can help mitigate performance regressions. It also acts as a common
benchmarking framework, enabling developers to easily create test cases,
integrate transparently, and use performance-rich tooling subsystems.

Stress-ng netdev stressor workload
----------------------------------

stress-ng is used for performing stress testing on the kernel. It allows
you to exercise various physical subsystems of the computer, as well as
interfaces of the OS kernel, using "stressors". They are available for
CPU, CPU cache, devices, I/O, interrupts, file system, memory, network,
operating system, pipelines, schedulers, and virtual machines. Please refer
to the `stress-ng man-page <https://www.mankier.com/1/stress-ng>`_ to
find the description of all the available stressors. The netdev stressor
starts a specified number (N) of workers that exercise various netdevice
ioctl commands across all the available network devices.

paxtest kiddie workload
-----------------------

paxtest is a program that tests buffer overflows in the kernel. It tests
kernel enforcements over memory usage. Generally, execution in some memory
segments makes buffer overflows possible. It runs a set of programs that
attempt to subvert memory usage. It is used as a regression test suite for
PaX, but might be useful to test other memory protection patches for the
kernel. We used paxtest kiddie mode which looks for simple vulnerabilities.

What is strace and how do we use it?
====================================

As mentioned earlier, strace is a useful diagnostic, instructional, and
debugging tool that can be used to discover the system resources in use
by a workload. It can be used:

* To see how a process interacts with the kernel.
* To see why a process is failing or hanging.
* For reverse engineering a process.
* To find the files on which a program depends.
* For analyzing the performance of an application.
* For troubleshooting various problems related to the operating system.

In addition, strace can generate run-time statistics on times, calls, and
errors for each system call and report a summary when the program exits,
suppressing the regular output. This attempts to show system time (CPU time
spent running in the kernel) independent of wall clock time. We plan to use
these features to get information on workload system usage.

The strace command supports basic, verbose, and stats modes. When run in
verbose mode, strace gives more detailed information about the system calls
invoked by a process.

Running strace -c generates a report of the percentage of time spent in each
system call, the total time in seconds, the microseconds per call, the total
number of calls, the count of each system call that has failed with an error
and the type of system call made.

* Usage: strace <command we want to trace>
* Verbose mode usage: strace -v <command>
* Gather statistics: strace -c <command>

We used the “-c” option to gather fine-grained run-time statistics for the
three workloads we have chosen for this analysis.

* perf
* stress-ng
* paxtest

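The summary printed by ``strace -c`` is a plain-text table, so it is easy to
post-process. The following is a minimal sketch of such a parser. The sample
summary is hand-written to mimic the column layout described above (it is not
measured data), and ``parse_strace_summary`` is a hypothetical helper name.

```python
# Hand-written sample in the column layout of "strace -c" (illustrative only).
SAMPLE_SUMMARY = """\
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 45.00    0.000900          10        90           read
 30.00    0.000600          20        30         2 openat
 25.00    0.000500          25        20           mmap
------ ----------- ----------- --------- --------- ----------------
100.00    0.002000                   140         2 total
"""


def parse_strace_summary(text):
    """Map each syscall name to its call count and error count."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header, the dashed rulers and the "total" row.
        if len(fields) < 5 or fields[0].startswith(("%", "-")):
            continue
        if fields[-1] == "total":
            continue
        # The "errors" column is left empty when a syscall never failed,
        # so a data row has either five or six whitespace-separated fields.
        errors = int(fields[4]) if len(fields) == 6 else 0
        rows[fields[-1]] = {"calls": int(fields[3]), "errors": errors}
    return rows


if __name__ == "__main__":
    for name, stats in parse_strace_summary(SAMPLE_SUMMARY).items():
        print(name, stats["calls"], stats["errors"])
```

To feed real data into a parser like this, the summary can be written to a
file with the ``-o`` option, for example ``strace -c -o summary.txt <command>``.
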
What is cscope and how do we use it?
====================================

Now let’s look at `cscope <https://cscope.sourceforge.net/>`_, a command
line tool for browsing C, C++ or Java code-bases. We can use it to find
all the references to a symbol, global definitions, functions called by a
function, functions calling a function, text strings, regular expression
patterns, files including a file.

We can use cscope to find which system call belongs to which subsystem.
This way we can find the kernel subsystems used by a process when it is
executed.

Let’s check out the latest Linux repository and build the cscope database::

  git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git linux
  cd linux
  cscope -R -p10  # builds cscope.out database before starting browse session
  cscope -d -p10  # starts browse session on cscope.out database

Note: Run "cscope -R -p10" to build the database and "cscope -d -p10" to
enter the browsing session. cscope by default uses the cscope.out database.
To get out of this mode press ctrl+d. The -p option is used to specify the
number of file path components to display. -p10 is optimal for browsing
kernel sources.

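The lookup that cscope is used for here can also be approximated with a plain
text search, which is sometimes handy for scripting. The sketch below is not
cscope and is not the authors' method: it simply scans ``.c`` files for the
``SYSCALL_DEFINEn()`` macros with which the kernel defines system calls, and
guesses a subsystem from the top-level directory. The directory-to-subsystem
mapping and all helper names are simplifying assumptions made for this
example.

```python
import os
import re
import tempfile

# Matches SYSCALL_DEFINE0(name ... SYSCALL_DEFINE6(name, but not the
# COMPAT_SYSCALL_DEFINE variants.
SYSCALL_RE = re.compile(r"\bSYSCALL_DEFINE\d\((\w+)")

# Hypothetical, simplified mapping from top-level directory to a label.
SUBSYSTEM_BY_DIR = {
    "fs": "Filesystem",
    "mm": "Memory Mgmt.",
    "net": "Network",
}


def find_syscall_definitions(source_root):
    """Map syscall name to the relative path of the file that defines it."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(source_root):
        for filename in filenames:
            if not filename.endswith(".c"):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for name in SYSCALL_RE.findall(handle.read()):
                    found.setdefault(name, os.path.relpath(path, source_root))
    return found


def subsystem_for(relative_path):
    """Guess a subsystem label from the first path component."""
    top = relative_path.split(os.sep)[0]
    return SUBSYSTEM_BY_DIR.get(top, "Unknown")


def demo():
    """Run the scan on a tiny fake source tree built in a temp directory."""
    with tempfile.TemporaryDirectory() as root:
        samples = {
            ("fs", "open.c"): "SYSCALL_DEFINE4(openat, int, dfd, ...)\n",
            ("mm", "mmap.c"): "SYSCALL_DEFINE1(brk, unsigned long, brk)\n",
        }
        for (subdir, filename), text in samples.items():
            os.makedirs(os.path.join(root, subdir), exist_ok=True)
            with open(os.path.join(root, subdir, filename), "w") as handle:
                handle.write(text)
        found = find_syscall_definitions(root)
        return {name: subsystem_for(path) for name, path in found.items()}


if __name__ == "__main__":
    print(demo())
```

On a real kernel tree, ``find_syscall_definitions("linux")`` would take the
path of the checkout created above. The results need manual review, since
many system calls live in directories that this simple mapping does not cover.
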
What is perf and how do we use it?
==================================

Perf is an analysis tool based on Linux 2.6+ systems, which abstracts the
CPU hardware difference in performance measurement in Linux, and provides
a simple command line interface. Perf is based on the perf_events interface
exported by the kernel. It is very useful for profiling the system and
finding performance bottlenecks in an application.

If you haven't already checked out the Linux mainline repository, you can do
so and then build the kernel and the perf tool::

  git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git linux
  cd linux
  make -j3 all
  cd tools/perf
  make

Note: The perf command can be built without building the kernel in the
repository and can be run on older kernels. However, matching the kernel
and perf revisions gives more accurate information on the subsystem usage.

We used the "perf stat" and "perf bench" options. For detailed information
on the perf tool, run "perf -h".

perf stat
---------
The perf stat command generates a report of various hardware and software
events. It does so with the help of hardware counter registers found in
modern CPUs that keep the count of these activities. "perf stat cal" shows
stats for the cal command.

perf bench
----------
The perf bench command contains multiple multi-threaded microkernel
benchmarks for executing different subsystems in the Linux kernel and
system calls. This allows us to easily measure the impact of changes,
which can help mitigate performance regressions. It also acts as a common
benchmarking framework, enabling developers to easily create test cases,
integrate transparently, and use performance-rich tooling.

The "perf bench all" command runs the following benchmarks:

* sched/messaging
* sched/pipe
* syscall/basic
* mem/memcpy
* mem/memset

What is stress-ng and how do we use it?
=======================================

As mentioned earlier, stress-ng is used for performing stress testing on
the kernel. It allows you to exercise various physical subsystems of the
computer, as well as interfaces of the OS kernel, using stressors. They
are available for CPU, CPU cache, devices, I/O, interrupts, file system,
memory, network, operating system, pipelines, schedulers, and virtual
machines.

The netdev stressor starts N workers that exercise various netdevice ioctl
commands across all the available network devices. The following ioctls are
exercised:

* SIOCGIFCONF, SIOCGIFINDEX, SIOCGIFNAME, SIOCGIFFLAGS
* SIOCGIFADDR, SIOCGIFNETMASK, SIOCGIFMETRIC, SIOCGIFMTU
* SIOCGIFHWADDR, SIOCGIFMAP, SIOCGIFTXQLEN

The following command runs the stressor::

  stress-ng --netdev 1 -t 60 --metrics

We can use the perf record command to record the events and information
associated with a process. This command records the profiling data in the
perf.data file in the same directory.

Using the following commands you can record the events associated with the
netdev stressor, view the report generated from perf.data, and annotate it
to view the statistics of each instruction of the program::

  perf record stress-ng --netdev 1 -t 60 --metrics
  perf report
  perf annotate

What is paxtest and how do we use it?
=====================================

paxtest is a program that tests buffer overflows in the kernel. It tests
kernel enforcements over memory usage. Generally, execution in some memory
segments makes buffer overflows possible. It runs a set of programs that
attempt to subvert memory usage. It is used as a regression test suite for
PaX, and will be useful to test other memory protection patches for the
kernel.

paxtest provides kiddie and blackhat modes. The paxtest kiddie mode runs
in normal mode, whereas the blackhat mode tries to get around the protection
of the kernel testing for vulnerabilities. We focus on the kiddie mode here
and combine the "paxtest kiddie" run with "perf record" to collect CPU stack
traces for the paxtest kiddie run to see which function is calling other
functions in the performance profile. Then the "dwarf" (DWARF's Call Frame
Information) mode can be used to unwind the stack.

The following commands can be used to record the run and view the resulting
report in call-graph format::

  perf record --call-graph dwarf paxtest kiddie
  perf report --stdio

Tracing workloads
=================

Now that we understand the workloads, let's start tracing them.

Tracing perf bench all workload
-------------------------------

Run the following command to trace the perf bench all workload::

  strace -c perf bench all

**System Calls made by the workload**

The table below shows the system calls invoked by the workload, the number
of times each system call is invoked, and the corresponding Linux subsystem.

+-------------------+-----------+-----------------+-------------------------+
| System Call       | # calls   | Linux Subsystem | System Call (API)       |
+===================+===========+=================+=========================+
| getppid           | 10000001  | Process Mgmt.   | sys_getppid()           |
+-------------------+-----------+-----------------+-------------------------+
| clone             | 1077      | Process Mgmt.   | sys_clone()             |
+-------------------+-----------+-----------------+-------------------------+
| prctl             | 23        | Process Mgmt.   | sys_prctl()             |
+-------------------+-----------+-----------------+-------------------------+
| prlimit64         | 7         | Process Mgmt.   | sys_prlimit64()         |
+-------------------+-----------+-----------------+-------------------------+
| getpid            | 10        | Process Mgmt.   | sys_getpid()            |
+-------------------+-----------+-----------------+-------------------------+
| uname             | 3         | Process Mgmt.   | sys_uname()             |
+-------------------+-----------+-----------------+-------------------------+
| sysinfo           | 1         | Process Mgmt.   | sys_sysinfo()           |
+-------------------+-----------+-----------------+-------------------------+
| getuid            | 1         | Process Mgmt.   | sys_getuid()            |
+-------------------+-----------+-----------------+-------------------------+
| getgid            | 1         | Process Mgmt.   | sys_getgid()            |
+-------------------+-----------+-----------------+-------------------------+
| geteuid           | 1         | Process Mgmt.   | sys_geteuid()           |
+-------------------+-----------+-----------------+-------------------------+
| getegid           | 1         | Process Mgmt.   | sys_getegid()           |
+-------------------+-----------+-----------------+-------------------------+
| close             | 49951     | Filesystem      | sys_close()             |
+-------------------+-----------+-----------------+-------------------------+
| pipe              | 604       | Filesystem      | sys_pipe()              |
+-------------------+-----------+-----------------+-------------------------+
| openat            | 48560     | Filesystem      | sys_openat()            |
+-------------------+-----------+-----------------+-------------------------+
| fstat             | 8338      | Filesystem      | sys_fstat()             |
+-------------------+-----------+-----------------+-------------------------+
| stat              | 1573      | Filesystem      | sys_stat()              |
+-------------------+-----------+-----------------+-------------------------+
| pread64           | 9646      | Filesystem      | sys_pread64()           |
+-------------------+-----------+-----------------+-------------------------+
| getdents64        | 1873      | Filesystem      | sys_getdents64()        |
+-------------------+-----------+-----------------+-------------------------+
| access            | 3         | Filesystem      | sys_access()            |
+-------------------+-----------+-----------------+-------------------------+
| lstat             | 1880      | Filesystem      | sys_lstat()             |
+-------------------+-----------+-----------------+-------------------------+
| lseek             | 6         | Filesystem      | sys_lseek()             |
+-------------------+-----------+-----------------+-------------------------+
| ioctl             | 3         | Filesystem      | sys_ioctl()             |
+-------------------+-----------+-----------------+-------------------------+
| dup2              | 1         | Filesystem      | sys_dup2()              |
+-------------------+-----------+-----------------+-------------------------+
| execve            | 2         | Filesystem      | sys_execve()            |
+-------------------+-----------+-----------------+-------------------------+
| fcntl             | 8779      | Filesystem      | sys_fcntl()             |
+-------------------+-----------+-----------------+-------------------------+
| statfs            | 1         | Filesystem      | sys_statfs()            |
+-------------------+-----------+-----------------+-------------------------+
| epoll_create      | 2         | Filesystem      | sys_epoll_create()      |
+-------------------+-----------+-----------------+-------------------------+
| epoll_ctl         | 64        | Filesystem      | sys_epoll_ctl()         |
+-------------------+-----------+-----------------+-------------------------+
| newfstatat        | 8318      | Filesystem      | sys_newfstatat()        |
+-------------------+-----------+-----------------+-------------------------+
| eventfd2          | 192       | Filesystem      | sys_eventfd2()          |
+-------------------+-----------+-----------------+-------------------------+
| mmap              | 243       | Memory Mgmt.    | sys_mmap()              |
+-------------------+-----------+-----------------+-------------------------+
| mprotect          | 32        | Memory Mgmt.    | sys_mprotect()          |
+-------------------+-----------+-----------------+-------------------------+
| brk               | 21        | Memory Mgmt.    | sys_brk()               |
+-------------------+-----------+-----------------+-------------------------+
| munmap            | 128       | Memory Mgmt.    | sys_munmap()            |
+-------------------+-----------+-----------------+-------------------------+
| set_mempolicy     | 156       | Memory Mgmt.    | sys_set_mempolicy()     |
+-------------------+-----------+-----------------+-------------------------+
| set_tid_address   | 1         | Process Mgmt.   | sys_set_tid_address()   |
+-------------------+-----------+-----------------+-------------------------+
| set_robust_list   | 1         | Futex           | sys_set_robust_list()   |
+-------------------+-----------+-----------------+-------------------------+
| futex             | 341       | Futex           | sys_futex()             |
+-------------------+-----------+-----------------+-------------------------+
| sched_getaffinity | 79        | Scheduler       | sys_sched_getaffinity() |
+-------------------+-----------+-----------------+-------------------------+
| sched_setaffinity | 223       | Scheduler       | sys_sched_setaffinity() |
+-------------------+-----------+-----------------+-------------------------+
| socketpair        | 202       | Network         | sys_socketpair()        |
+-------------------+-----------+-----------------+-------------------------+
| rt_sigprocmask    | 21        | Signal          | sys_rt_sigprocmask()    |
+-------------------+-----------+-----------------+-------------------------+
| rt_sigaction      | 36        | Signal          | sys_rt_sigaction()      |
+-------------------+-----------+-----------------+-------------------------+
| rt_sigreturn      | 2         | Signal          | sys_rt_sigreturn()      |
+-------------------+-----------+-----------------+-------------------------+
| wait4             | 889       | Time            | sys_wait4()             |
+-------------------+-----------+-----------------+-------------------------+
| clock_nanosleep   | 37        | Time            | sys_clock_nanosleep()   |
+-------------------+-----------+-----------------+-------------------------+
| capget            | 4         | Capability      | sys_capget()            |
+-------------------+-----------+-----------------+-------------------------+

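A table like this can be summarized per subsystem with a few lines of code.
The sketch below parses grid-table rows and totals the call counts for each
subsystem. It uses only six rows taken from the perf bench table as sample
input, so the totals are for that subset and not for the whole workload;
``parse_grid_rows`` and ``subsystem_totals`` are hypothetical helper names.

```python
from collections import defaultdict

# Six rows taken from the perf bench table (a subset, for illustration).
SAMPLE_TABLE = """\
+-------------------+-----------+-----------------+-------------------------+
| System Call       | # calls   | Linux Subsystem | System Call (API)       |
+===================+===========+=================+=========================+
| clone             | 1077      | Process Mgmt.   | sys_clone()             |
+-------------------+-----------+-----------------+-------------------------+
| prctl             | 23        | Process Mgmt.   | sys_prctl()             |
+-------------------+-----------+-----------------+-------------------------+
| close             | 49951     | Filesystem      | sys_close()             |
+-------------------+-----------+-----------------+-------------------------+
| pipe              | 604       | Filesystem      | sys_pipe()              |
+-------------------+-----------+-----------------+-------------------------+
| mmap              | 243       | Memory Mgmt.    | sys_mmap()              |
+-------------------+-----------+-----------------+-------------------------+
| futex             | 341       | Futex           | sys_futex()             |
+-------------------+-----------+-----------------+-------------------------+
"""


def parse_grid_rows(table_text):
    """Return (syscall, calls, subsystem) tuples from grid-table data rows."""
    rows = []
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # border lines start with "+"
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 4 or not cells[1].isdigit():
            continue  # header row: its second cell is not a number
        rows.append((cells[0], int(cells[1]), cells[2]))
    return rows


def subsystem_totals(rows):
    """Sum the call counts for each subsystem."""
    totals = defaultdict(int)
    for _name, calls, subsystem in rows:
        totals[subsystem] += calls
    return dict(totals)


if __name__ == "__main__":
    print(subsystem_totals(parse_grid_rows(SAMPLE_TABLE)))
```
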
Tracing stress-ng netdev stressor workload
------------------------------------------

Run the following command to trace stress-ng netdev stressor workload::

  strace -c stress-ng --netdev 1 -t 60 --metrics

**System Calls made by the workload**

The below table shows the system calls invoked by the workload, number of
times each system call is invoked, and the corresponding Linux subsystem.

+-------------------+-----------+-----------------+-------------------------+
| System Call       | # calls   | Linux Subsystem | System Call (API)       |
+===================+===========+=================+=========================+
| openat            | 74        | Filesystem      | sys_openat()            |
+-------------------+-----------+-----------------+-------------------------+
| close             | 75        | Filesystem      | sys_close()             |
+-------------------+-----------+-----------------+-------------------------+
| read              | 58        | Filesystem      | sys_read()              |
+-------------------+-----------+-----------------+-------------------------+
| fstat             | 20        | Filesystem      | sys_fstat()             |
+-------------------+-----------+-----------------+-------------------------+
| flock             | 10        | Filesystem      | sys_flock()             |
+-------------------+-----------+-----------------+-------------------------+
| write             | 7         | Filesystem      | sys_write()             |
+-------------------+-----------+-----------------+-------------------------+
| getdents64        | 8         | Filesystem      | sys_getdents64()        |
+-------------------+-----------+-----------------+-------------------------+
| pread64           | 8         | Filesystem      | sys_pread64()           |
+-------------------+-----------+-----------------+-------------------------+
| lseek             | 1         | Filesystem      | sys_lseek()             |
+-------------------+-----------+-----------------+-------------------------+
| access            | 2         | Filesystem      | sys_access()            |
+-------------------+-----------+-----------------+-------------------------+
| getcwd            | 1         | Filesystem      | sys_getcwd()            |
+-------------------+-----------+-----------------+-------------------------+
| execve            | 1         | Filesystem      | sys_execve()            |
+-------------------+-----------+-----------------+-------------------------+
| mmap              | 61        | Memory Mgmt.    | sys_mmap()              |
+-------------------+-----------+-----------------+-------------------------+
| munmap            | 3         | Memory Mgmt.    | sys_munmap()            |
+-------------------+-----------+-----------------+-------------------------+
| mprotect          | 20        | Memory Mgmt.    | sys_mprotect()          |
+-------------------+-----------+-----------------+-------------------------+
| mlock             | 2         | Memory Mgmt.    | sys_mlock()             |
+-------------------+-----------+-----------------+-------------------------+
| brk               | 3         | Memory Mgmt.    | sys_brk()               |
+-------------------+-----------+-----------------+-------------------------+
| rt_sigaction      | 21        | Signal          | sys_rt_sigaction()      |
+-------------------+-----------+-----------------+-------------------------+
| rt_sigprocmask    | 1         | Signal          | sys_rt_sigprocmask()    |
+-------------------+-----------+-----------------+-------------------------+
| sigaltstack       | 1         | Signal          | sys_sigaltstack()       |
+-------------------+-----------+-----------------+-------------------------+
| rt_sigreturn      | 1         | Signal          | sys_rt_sigreturn()      |
+-------------------+-----------+-----------------+-------------------------+
| getpid            | 8         | Process Mgmt.   | sys_getpid()            |
+-------------------+-----------+-----------------+-------------------------+
| prlimit64         | 5         | Process Mgmt.   | sys_prlimit64()         |
+-------------------+-----------+-----------------+-------------------------+
| arch_prctl        | 2         | Process Mgmt.   | sys_arch_prctl()        |
+-------------------+-----------+-----------------+-------------------------+
| sysinfo           | 2         | Process Mgmt.   | sys_sysinfo()           |
+-------------------+-----------+-----------------+-------------------------+
| getuid            | 2         | Process Mgmt.   | sys_getuid()            |
+-------------------+-----------+-----------------+-------------------------+
| uname             | 1         | Process Mgmt.   | sys_uname()             |
+-------------------+-----------+-----------------+-------------------------+
| setpgid           | 1         | Process Mgmt.   | sys_setpgid()           |
+-------------------+-----------+-----------------+-------------------------+
| getrusage         | 1         | Process Mgmt.   | sys_getrusage()         |
+-------------------+-----------+-----------------+-------------------------+
| geteuid           | 1         | Process Mgmt.   | sys_geteuid()           |
+-------------------+-----------+-----------------+-------------------------+
| getppid           | 1         | Process Mgmt.   | sys_getppid()           |
+-------------------+-----------+-----------------+-------------------------+
| sendto            | 3         | Network         | sys_sendto()            |
+-------------------+-----------+-----------------+-------------------------+
| connect           | 1         | Network         | sys_connect()           |
+-------------------+-----------+-----------------+-------------------------+
| socket            | 1         | Network         | sys_socket()            |
+-------------------+-----------+-----------------+-------------------------+
| clone             | 1         | Process Mgmt.   | sys_clone()             |
+-------------------+-----------+-----------------+-------------------------+
| set_tid_address   | 1         | Process Mgmt.   | sys_set_tid_address()   |
+-------------------+-----------+-----------------+-------------------------+
| wait4             | 2         | Time            | sys_wait4()             |
+-------------------+-----------+-----------------+-------------------------+
| alarm             | 1         | Time            | sys_alarm()             |
+-------------------+-----------+-----------------+-------------------------+
| set_robust_list   | 1         | Futex           | sys_set_robust_list()   |
+-------------------+-----------+-----------------+-------------------------+

Tracing paxtest kiddie workload
-------------------------------

Run the following command to trace paxtest kiddie workload::

  strace -c paxtest kiddie

**System Calls made by the workload**

The below table shows the system calls invoked by the workload, number of
times each system call is invoked, and the corresponding Linux subsystem.

+-------------------+-----------+-----------------+----------------------+
| System Call       | # calls   | Linux Subsystem | System Call (API)    |
+===================+===========+=================+======================+
| read              | 3         | Filesystem      | sys_read()           |
+-------------------+-----------+-----------------+----------------------+
| write             | 11        | Filesystem      | sys_write()          |
+-------------------+-----------+-----------------+----------------------+
| close             | 41        | Filesystem      | sys_close()          |
+-------------------+-----------+-----------------+----------------------+
| stat              | 24        | Filesystem      | sys_stat()           |
+-------------------+-----------+-----------------+----------------------+
| fstat             | 2         | Filesystem      | sys_fstat()          |
+-------------------+-----------+-----------------+----------------------+
| pread64           | 6         | Filesystem      | sys_pread64()        |
+-------------------+-----------+-----------------+----------------------+
| access            | 1         | Filesystem      | sys_access()         |
+-------------------+-----------+-----------------+----------------------+
| pipe              | 1         | Filesystem      | sys_pipe()           |
+-------------------+-----------+-----------------+----------------------+
| dup2              | 24        | Filesystem      | sys_dup2()           |
+-------------------+-----------+-----------------+----------------------+
| execve            | 1         | Filesystem      | sys_execve()         |
+-------------------+-----------+-----------------+----------------------+
| fcntl             | 26        | Filesystem      | sys_fcntl()          |
+-------------------+-----------+-----------------+----------------------+
| openat            | 14        | Filesystem      | sys_openat()         |
+-------------------+-----------+-----------------+----------------------+
| rt_sigaction      | 7         | Signal          | sys_rt_sigaction()   |
+-------------------+-----------+-----------------+----------------------+
| rt_sigreturn      | 38        | Signal          | sys_rt_sigreturn()   |
+-------------------+-----------+-----------------+----------------------+
| clone             | 38        | Process Mgmt.   | sys_clone()          |
+-------------------+-----------+-----------------+----------------------+
| wait4             | 44        | Time            | sys_wait4()          |
+-------------------+-----------+-----------------+----------------------+
| mmap              | 7         | Memory Mgmt.    | sys_mmap()           |
+-------------------+-----------+-----------------+----------------------+
| mprotect          | 3         | Memory Mgmt.    | sys_mprotect()       |
+-------------------+-----------+-----------------+----------------------+
| munmap            | 1         | Memory Mgmt.    | sys_munmap()         |
+-------------------+-----------+-----------------+----------------------+
| brk               | 3         | Memory Mgmt.    | sys_brk()            |
+-------------------+-----------+-----------------+----------------------+
| getpid            | 1         | Process Mgmt.   | sys_getpid()         |
+-------------------+-----------+-----------------+----------------------+
| getuid            | 1         | Process Mgmt.   | sys_getuid()         |
+-------------------+-----------+-----------------+----------------------+
| getgid            | 1         | Process Mgmt.   | sys_getgid()         |
+-------------------+-----------+-----------------+----------------------+
| geteuid           | 2         | Process Mgmt.   | sys_geteuid()        |
+-------------------+-----------+-----------------+----------------------+
| getegid           | 1         | Process Mgmt.   | sys_getegid()        |
+-------------------+-----------+-----------------+----------------------+
| getppid           | 1         | Process Mgmt.   | sys_getppid()        |
+-------------------+-----------+-----------------+----------------------+
|
||||||
|
| arch_prctl | 2 | Process Mgmt. | sys_arch_prctl() |
|
||||||
|
+-------------------+-----------+-----------------+----------------------+
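
The per-call counts in the table can also be rolled up by subsystem to see
where the workload spends most of its system calls. The following is a minimal
sketch, not part of the original document: the ``SYSCALLS`` list and the
``calls_per_subsystem()`` helper are hypothetical names introduced here for
illustration, and the rows are copied by hand from the table rather than parsed
from strace output.

```python
from collections import Counter

# (system call, number of calls, subsystem) rows copied from the table.
# SYSCALLS is a hypothetical name used only for this illustration.
SYSCALLS = [
    ("read", 3, "Filesystem"),
    ("write", 11, "Filesystem"),
    ("close", 41, "Filesystem"),
    ("stat", 24, "Filesystem"),
    ("fstat", 2, "Filesystem"),
    ("pread64", 6, "Filesystem"),
    ("access", 1, "Filesystem"),
    ("pipe", 1, "Filesystem"),
    ("dup2", 24, "Filesystem"),
    ("execve", 1, "Filesystem"),
    ("fcntl", 26, "Filesystem"),
    ("openat", 14, "Filesystem"),
    ("rt_sigaction", 7, "Signal"),
    ("rt_sigreturn", 38, "Signal"),
    ("clone", 38, "Process Mgmt."),
    ("wait4", 44, "Time"),
    ("mmap", 7, "Memory Mgmt."),
    ("mprotect", 3, "Memory Mgmt."),
    ("munmap", 1, "Memory Mgmt."),
    ("brk", 3, "Memory Mgmt."),
    ("getpid", 1, "Process Mgmt."),
    ("getuid", 1, "Process Mgmt."),
    ("getgid", 1, "Process Mgmt."),
    ("geteuid", 2, "Process Mgmt."),
    ("getegid", 1, "Process Mgmt."),
    ("getppid", 1, "Process Mgmt."),
    ("arch_prctl", 2, "Process Mgmt."),
]


def calls_per_subsystem(rows):
    """Sum the call counts of all system calls that share a subsystem."""
    totals = Counter()
    for _name, calls, subsystem in rows:
        totals[subsystem] += calls
    return totals


if __name__ == "__main__":
    # Print subsystems from most to least frequently used.
    for subsystem, calls in calls_per_subsystem(SYSCALLS).most_common():
        print(f"{subsystem:<15} {calls}")
```

Grouping by subsystem keeps the subsystem labels exactly as the table gives
them, so the roll-up is only as accurate as that classification.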

Conclusion
==========

This document is intended to be used as a guide on how to gather fine-grained
information on the resources in use by workloads using strace.

References
==========

* `Discovery Linux Kernel Subsystems used by OpenAPS <https://elisa.tech/blog/2022/02/02/discovery-linux-kernel-subsystems-used-by-openaps>`_
* `ELISA-White-Papers-Discovering Linux kernel subsystems used by a workload <https://github.com/elisa-tech/ELISA-White-Papers/blob/master/Processes/Discovering_Linux_kernel_subsystems_used_by_a_workload.md>`_
* `strace <https://man7.org/linux/man-pages/man1/strace.1.html>`_
* `perf <https://man7.org/linux/man-pages/man1/perf.1.html>`_
* `paxtest README <https://github.com/opntr/paxtest-freebsd/blob/hardenedbsd/0.9.14-hbsd/README>`_
* `stress-ng <https://www.mankier.com/1/stress-ng>`_
* `Monitoring and managing system status and performance <https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/monitoring_and_managing_system_status_and_performance/index>`_
|
@ -156,7 +156,7 @@ else:
|
|||||||
math_renderer = 'mathjax'
|
math_renderer = 'mathjax'
|
||||||
|
|
||||||
# Add any paths that contain templates here, relative to this directory.
|
# Add any paths that contain templates here, relative to this directory.
|
||||||
templates_path = ['_templates']
|
templates_path = ['sphinx/templates']
|
||||||
|
|
||||||
# The suffix(es) of source filenames.
|
# The suffix(es) of source filenames.
|
||||||
# You can specify multiple suffix as a list of string:
|
# You can specify multiple suffix as a list of string:
|
||||||
@ -331,6 +331,7 @@ if html_theme == 'alabaster':
|
|||||||
'description': get_cline_version(),
|
'description': get_cline_version(),
|
||||||
'page_width': '65em',
|
'page_width': '65em',
|
||||||
'sidebar_width': '15em',
|
'sidebar_width': '15em',
|
||||||
|
'fixed_sidebar': 'true',
|
||||||
'font_size': 'inherit',
|
'font_size': 'inherit',
|
||||||
'font_family': 'serif',
|
'font_family': 'serif',
|
||||||
}
|
}
|
||||||
@ -348,7 +349,7 @@ html_use_smartypants = False
|
|||||||
|
|
||||||
# Custom sidebar templates, maps document names to template names.
|
# Custom sidebar templates, maps document names to template names.
|
||||||
# Note that the RTD theme ignores this
|
# Note that the RTD theme ignores this
|
||||||
html_sidebars = { '**': ['searchbox.html', 'localtoc.html', 'sourcelink.html']}
|
html_sidebars = { '**': ['searchbox.html', 'kernel-toc.html', 'sourcelink.html']}
|
||||||
|
|
||||||
# about.html is available for alabaster theme. Add it at the front.
|
# about.html is available for alabaster theme. Add it at the front.
|
||||||
if html_theme == 'alabaster':
|
if html_theme == 'alabaster':
|
||||||
|
@ -42,7 +42,7 @@ padata_shells associated with it, each allowing a separate series of jobs.
|
|||||||
Modifying cpumasks
|
Modifying cpumasks
|
||||||
------------------
|
------------------
|
||||||
|
|
||||||
The CPUs used to run jobs can be changed in two ways, programatically with
|
The CPUs used to run jobs can be changed in two ways, programmatically with
|
||||||
padata_set_cpumask() or via sysfs. The former is defined::
|
padata_set_cpumask() or via sysfs. The former is defined::
|
||||||
|
|
||||||
int padata_set_cpumask(struct padata_instance *pinst, int cpumask_type,
|
int padata_set_cpumask(struct padata_instance *pinst, int cpumask_type,
|
||||||
|
@ -370,8 +370,8 @@ of possible problems:
|
|||||||
|
|
||||||
The first one can be tracked using tracing: ::
|
The first one can be tracked using tracing: ::
|
||||||
|
|
||||||
$ echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event
|
$ echo workqueue:workqueue_queue_work > /sys/kernel/tracing/set_event
|
||||||
$ cat /sys/kernel/debug/tracing/trace_pipe > out.txt
|
$ cat /sys/kernel/tracing/trace_pipe > out.txt
|
||||||
(wait a few secs)
|
(wait a few secs)
|
||||||
^C
|
^C
|
||||||
|
|
||||||
|
@ -1,8 +1,8 @@
|
|||||||
.. SPDX-License-Identifier: GPL-2.0
|
.. SPDX-License-Identifier: GPL-2.0
|
||||||
|
|
||||||
==============================================================================
|
========================================================================
|
||||||
Linux CPUFreq - CPU frequency and voltage scaling code in the Linux(TM) kernel
|
CPUFreq - CPU frequency and voltage scaling code in the Linux(TM) kernel
|
||||||
==============================================================================
|
========================================================================
|
||||||
|
|
||||||
Author: Dominik Brodowski <linux@brodo.de>
|
Author: Dominik Brodowski <linux@brodo.de>
|
||||||
|
|
||||||
|
@ -1,6 +1,6 @@
|
|||||||
=======================
|
==========
|
||||||
Linux Kernel Crypto API
|
Crypto API
|
||||||
=======================
|
==========
|
||||||
|
|
||||||
:Author: Stephan Mueller
|
:Author: Stephan Mueller
|
||||||
:Author: Marek Vasut
|
:Author: Marek Vasut
|
||||||
|
@ -219,7 +219,7 @@ instance::
|
|||||||
cat cocci.err
|
cat cocci.err
|
||||||
|
|
||||||
You can use SPFLAGS to add debugging flags; for instance you may want to
|
You can use SPFLAGS to add debugging flags; for instance you may want to
|
||||||
add both --profile --show-trying to SPFLAGS when debugging. For example
|
add both ``--profile --show-trying`` to SPFLAGS when debugging. For example
|
||||||
you may want to use::
|
you may want to use::
|
||||||
|
|
||||||
rm -f err.log
|
rm -f err.log
|
||||||
@ -248,7 +248,7 @@ variables for .cocciconfig is as follows:
|
|||||||
|
|
||||||
- Your current user's home directory is processed first
|
- Your current user's home directory is processed first
|
||||||
- Your directory from which spatch is called is processed next
|
- Your directory from which spatch is called is processed next
|
||||||
- The directory provided with the --dir option is processed last, if used
|
- The directory provided with the ``--dir`` option is processed last, if used
|
||||||
|
|
||||||
Since coccicheck runs through make, it naturally runs from the kernel
|
Since coccicheck runs through make, it naturally runs from the kernel
|
||||||
proper dir; as such the second rule above would be implied for picking up a
|
proper dir; as such the second rule above would be implied for picking up a
|
||||||
@ -265,8 +265,8 @@ The kernel coccicheck script has::
|
|||||||
fi
|
fi
|
||||||
|
|
||||||
KBUILD_EXTMOD is set when an explicit target with M= is used. For both cases
|
KBUILD_EXTMOD is set when an explicit target with M= is used. For both cases
|
||||||
the spatch --dir argument is used, as such third rule applies when whether M=
|
the spatch ``--dir`` argument is used, as such third rule applies when whether
|
||||||
is used or not, and when M= is used the target directory can have its own
|
M= is used or not, and when M= is used the target directory can have its own
|
||||||
.cocciconfig file. When M= is not passed as an argument to coccicheck the
|
.cocciconfig file. When M= is not passed as an argument to coccicheck the
|
||||||
target directory is the same as the directory from where spatch was called.
|
target directory is the same as the directory from where spatch was called.
|
||||||
|
|
||||||
|
@ -39,6 +39,10 @@ Setup
|
|||||||
this mode. In this case, you should build the kernel with
|
this mode. In this case, you should build the kernel with
|
||||||
CONFIG_RANDOMIZE_BASE disabled if the architecture supports KASLR.
|
CONFIG_RANDOMIZE_BASE disabled if the architecture supports KASLR.
|
||||||
|
|
||||||
|
- Build the gdb scripts (required on kernels v5.1 and above)::
|
||||||
|
|
||||||
|
make scripts_gdb
|
||||||
|
|
||||||
- Enable the gdb stub of QEMU/KVM, either
|
- Enable the gdb stub of QEMU/KVM, either
|
||||||
|
|
||||||
- at VM startup time by appending "-s" to the QEMU command line
|
- at VM startup time by appending "-s" to the QEMU command line
|
||||||
|
@ -1,5 +1,5 @@
|
|||||||
Buffer Sharing and Synchronization
|
Buffer Sharing and Synchronization (dma-buf)
|
||||||
==================================
|
============================================
|
||||||
|
|
||||||
The dma-buf subsystem provides the framework for sharing buffers for
|
The dma-buf subsystem provides the framework for sharing buffers for
|
||||||
hardware (DMA) access across multiple device drivers and subsystems, and
|
hardware (DMA) access across multiple device drivers and subsystems, and
|
||||||
@ -264,7 +264,7 @@ through memory management dependencies which userspace is unaware of, which
|
|||||||
randomly hangs workloads until the timeout kicks in. Workloads, which from
|
randomly hangs workloads until the timeout kicks in. Workloads, which from
|
||||||
userspace's perspective, do not contain a deadlock. In such a mixed fencing
|
userspace's perspective, do not contain a deadlock. In such a mixed fencing
|
||||||
architecture there is no single entity with knowledge of all dependencies.
|
architecture there is no single entity with knowledge of all dependencies.
|
||||||
Thefore preventing such deadlocks from within the kernel is not possible.
|
Therefore preventing such deadlocks from within the kernel is not possible.
|
||||||
|
|
||||||
The only solution to avoid dependencies loops is by not allowing indefinite
|
The only solution to avoid dependencies loops is by not allowing indefinite
|
||||||
fences in the kernel. This means:
|
fences in the kernel. This means:
|
||||||
|
@ -175,7 +175,7 @@ The details of these operations are:
|
|||||||
driver can ask for the pointer, maximum size and the currently used size of
|
driver can ask for the pointer, maximum size and the currently used size of
|
||||||
the metadata and can directly update or read it.
|
the metadata and can directly update or read it.
|
||||||
|
|
||||||
Becasue the DMA driver manages the memory area containing the metadata,
|
Because the DMA driver manages the memory area containing the metadata,
|
||||||
clients must make sure that they do not try to access or get the pointer
|
clients must make sure that they do not try to access or get the pointer
|
||||||
after their transfer completion callback has run for the descriptor.
|
after their transfer completion callback has run for the descriptor.
|
||||||
If no completion callback has been defined for the transfer, then the
|
If no completion callback has been defined for the transfer, then the
|
||||||
|
@ -89,7 +89,7 @@ The following command returns the state of the test. ::
|
|||||||
|
|
||||||
% cat /sys/module/dmatest/parameters/run
|
% cat /sys/module/dmatest/parameters/run
|
||||||
|
|
||||||
To wait for test completion userpace can poll 'run' until it is false, or use
|
To wait for test completion userspace can poll 'run' until it is false, or use
|
||||||
the wait parameter. Specifying 'wait=1' when loading the module causes module
|
the wait parameter. Specifying 'wait=1' when loading the module causes module
|
||||||
initialization to pause until a test run has completed, while reading
|
initialization to pause until a test run has completed, while reading
|
||||||
/sys/module/dmatest/parameters/wait waits for any running test to complete
|
/sys/module/dmatest/parameters/wait waits for any running test to complete
|
||||||
|
@ -4,7 +4,7 @@ High Speed Synchronous Serial Interface (HSI)
|
|||||||
Introduction
|
Introduction
|
||||||
---------------
|
---------------
|
||||||
|
|
||||||
High Speed Syncronous Interface (HSI) is a fullduplex, low latency protocol,
|
High Speed Synchronous Interface (HSI) is a full duplex, low latency protocol,
|
||||||
that is optimized for die-level interconnect between an Application Processor
|
that is optimized for die-level interconnect between an Application Processor
|
||||||
and a Baseband chipset. It has been specified by the MIPI alliance in 2003 and
|
and a Baseband chipset. It has been specified by the MIPI alliance in 2003 and
|
||||||
implemented by multiple vendors since then.
|
implemented by multiple vendors since then.
|
||||||
@ -52,7 +52,7 @@ hsi-char Device
|
|||||||
------------------
|
------------------
|
||||||
|
|
||||||
Each port automatically registers a generic client driver called hsi_char,
|
Each port automatically registers a generic client driver called hsi_char,
|
||||||
which provides a charecter device for userspace representing the HSI port.
|
which provides a character device for userspace representing the HSI port.
|
||||||
It can be used to communicate via HSI from userspace. Userspace may
|
It can be used to communicate via HSI from userspace. Userspace may
|
||||||
configure the hsi_char device using the following ioctl commands:
|
configure the hsi_char device using the following ioctl commands:
|
||||||
|
|
||||||
|
@ -1,6 +1,8 @@
|
|||||||
========================================
|
.. SPDX-License-Identifier: GPL-2.0
|
||||||
The Linux driver implementer's API guide
|
|
||||||
========================================
|
==============================
|
||||||
|
Driver implementer's API guide
|
||||||
|
==============================
|
||||||
|
|
||||||
The kernel offers a wide variety of interfaces to support the development
|
The kernel offers a wide variety of interfaces to support the development
|
||||||
of device drivers. This document is an only somewhat organized collection
|
of device drivers. This document is an only somewhat organized collection
|
||||||
|
@ -44,7 +44,7 @@ This _wc variant returns a write-combining map to the page and may only be
|
|||||||
used with mappings created by io_mapping_create_wc()
|
used with mappings created by io_mapping_create_wc()
|
||||||
|
|
||||||
Temporary mappings are only valid in the context of the caller. The mapping
|
Temporary mappings are only valid in the context of the caller. The mapping
|
||||||
is not guaranteed to be globaly visible.
|
is not guaranteed to be globally visible.
|
||||||
|
|
||||||
io_mapping_map_local_wc() has a side effect on X86 32bit as it disables
|
io_mapping_map_local_wc() has a side effect on X86 32bit as it disables
|
||||||
migration to make the mapping code work. No caller can rely on this side
|
migration to make the mapping code work. No caller can rely on this side
|
||||||
@ -78,7 +78,7 @@ variant, although this may be significantly slower::
|
|||||||
unsigned long offset)
|
unsigned long offset)
|
||||||
|
|
||||||
This works like io_mapping_map_atomic/local_wc() except it has no side
|
This works like io_mapping_map_atomic/local_wc() except it has no side
|
||||||
effects and the pointer is globaly visible.
|
effects and the pointer is globally visible.
|
||||||
|
|
||||||
The mappings are released with::
|
The mappings are released with::
|
||||||
|
|
||||||
|
@ -65,7 +65,7 @@ There are three groups of locks for managing the device:
|
|||||||
2.3 new-device management
|
2.3 new-device management
|
||||||
-------------------------
|
-------------------------
|
||||||
|
|
||||||
A single lock: "no-new-dev" is used to co-ordinate the addition of
|
A single lock: "no-new-dev" is used to coordinate the addition of
|
||||||
new devices - this must be synchronized across the array.
|
new devices - this must be synchronized across the array.
|
||||||
Normally all nodes hold a concurrent-read lock on this device.
|
Normally all nodes hold a concurrent-read lock on this device.
|
||||||
|
|
||||||
|
@ -81,7 +81,7 @@ The write-through and write-back cache use the same disk format. The cache disk
|
|||||||
is organized as a simple write log. The log consists of 'meta data' and 'data'
|
is organized as a simple write log. The log consists of 'meta data' and 'data'
|
||||||
pairs. The meta data describes the data. It also includes checksum and sequence
|
pairs. The meta data describes the data. It also includes checksum and sequence
|
||||||
ID for recovery identification. Data can be IO data and parity data. Data is
|
ID for recovery identification. Data can be IO data and parity data. Data is
|
||||||
checksumed too. The checksum is stored in the meta data ahead of the data. The
|
checksummed too. The checksum is stored in the meta data ahead of the data. The
|
||||||
checksum is an optimization because MD can write meta and data freely without
|
checksum is an optimization because MD can write meta and data freely without
|
||||||
worry about the order. MD superblock has a field pointed to the valid meta data
|
worry about the order. MD superblock has a field pointed to the valid meta data
|
||||||
of log head.
|
of log head.
|
||||||
|
@ -28,7 +28,7 @@ Currently, it consists of:
|
|||||||
takes parameters at initialization that will dictate how the simulation
|
takes parameters at initialization that will dictate how the simulation
|
||||||
behaves.
|
behaves.
|
||||||
|
|
||||||
- Code reponsible for encoding a valid MPEG Transport Stream, which is then
|
- Code responsible for encoding a valid MPEG Transport Stream, which is then
|
||||||
passed to the bridge driver. This fake stream contains some hardcoded content.
|
passed to the bridge driver. This fake stream contains some hardcoded content.
|
||||||
For now, we have a single, audio-only channel containing a single MPEG
|
For now, we have a single, audio-only channel containing a single MPEG
|
||||||
Elementary Stream, which in turn contains a SMPTE 302m encoded sine-wave.
|
Elementary Stream, which in turn contains a SMPTE 302m encoded sine-wave.
|
||||||
|
@ -24,7 +24,7 @@ unless this is fixed in the HW platform.
|
|||||||
|
|
||||||
The demux kABI only controls front-ends regarding to their connections with
|
The demux kABI only controls front-ends regarding to their connections with
|
||||||
demuxes; the kABI used to set the other front-end parameters, such as
|
demuxes; the kABI used to set the other front-end parameters, such as
|
||||||
tuning, are devined via the Digital TV Frontend kABI.
|
tuning, are defined via the Digital TV Frontend kABI.
|
||||||
|
|
||||||
The functions that implement the abstract interface demux should be defined
|
The functions that implement the abstract interface demux should be defined
|
||||||
static or module private and registered to the Demux core for external
|
static or module private and registered to the Demux core for external
|
||||||
|
@ -321,7 +321,7 @@ response to video node operations. This hides the complexity of the underlying
|
|||||||
hardware from applications. For complex devices, finer-grained control of the
|
hardware from applications. For complex devices, finer-grained control of the
|
||||||
device than what the video nodes offer may be required. In those cases, bridge
|
device than what the video nodes offer may be required. In those cases, bridge
|
||||||
drivers that implement :ref:`the media controller API <media_controller>` may
|
drivers that implement :ref:`the media controller API <media_controller>` may
|
||||||
opt for making the subdevice operations directly accessible from userpace.
|
opt for making the subdevice operations directly accessible from userspace.
|
||||||
|
|
||||||
Device nodes named ``v4l-subdev``\ *X* can be created in ``/dev`` to access
|
Device nodes named ``v4l-subdev``\ *X* can be created in ``/dev`` to access
|
||||||
sub-devices directly. If a sub-device supports direct userspace configuration
|
sub-devices directly. If a sub-device supports direct userspace configuration
|
||||||
@ -574,7 +574,7 @@ issues with subdevice drivers that let the V4L2 core manage the active state,
|
|||||||
as they expect to receive the appropriate state as a parameter. To help the
|
as they expect to receive the appropriate state as a parameter. To help the
|
||||||
conversion of subdevice drivers to a managed active state without having to
|
conversion of subdevice drivers to a managed active state without having to
|
||||||
convert all callers at the same time, an additional wrapper layer has been
|
convert all callers at the same time, an additional wrapper layer has been
|
||||||
added to v4l2_subdev_call(), which handles the NULL case by geting and locking
|
added to v4l2_subdev_call(), which handles the NULL case by getting and locking
|
||||||
the callee's active state with :c:func:`v4l2_subdev_lock_and_get_active_state()`,
|
the callee's active state with :c:func:`v4l2_subdev_lock_and_get_active_state()`,
|
||||||
and unlocking the state after the call.
|
and unlocking the state after the call.
|
||||||
|
|
||||||
|
@ -3,7 +3,7 @@
|
|||||||
MEI NFC
|
MEI NFC
|
||||||
-------
|
-------
|
||||||
|
|
||||||
Some Intel 8 and 9 Serieses chipsets supports NFC devices connected behind
|
Some Intel 8 and 9 Series chipsets support NFC devices connected behind
|
||||||
the Intel Management Engine controller.
|
the Intel Management Engine controller.
|
||||||
MEI client bus exposes the NFC chips as NFC phy devices and enables
|
MEI client bus exposes the NFC chips as NFC phy devices and enables
|
||||||
binding with Microread and NXP PN544 NFC device driver from the Linux NFC
|
binding with Microread and NXP PN544 NFC device driver from the Linux NFC
|
||||||
|
@ -150,7 +150,7 @@ LLC
|
|||||||
|
|
||||||
Communication between the CPU and the chip often requires some link layer
|
Communication between the CPU and the chip often requires some link layer
|
||||||
protocol. Those are isolated as modules managed by the HCI layer. There are
|
protocol. Those are isolated as modules managed by the HCI layer. There are
|
||||||
currently two modules : nop (raw transfert) and shdlc.
|
currently two modules : nop (raw transfer) and shdlc.
|
||||||
A new llc must implement the following functions::
|
A new llc must implement the following functions::
|
||||||
|
|
||||||
struct nfc_llc_ops {
|
struct nfc_llc_ops {
|
||||||
|
@ -82,7 +82,7 @@ LABEL:
|
|||||||
Metadata stored on a DIMM device that partitions and identifies
|
Metadata stored on a DIMM device that partitions and identifies
|
||||||
(persistently names) capacity allocated to different PMEM namespaces. It
|
(persistently names) capacity allocated to different PMEM namespaces. It
|
||||||
also indicates whether an address abstraction like a BTT is applied to
|
also indicates whether an address abstraction like a BTT is applied to
|
||||||
the namepsace. Note that traditional partition tables, GPT/MBR, are
|
the namespace. Note that traditional partition tables, GPT/MBR, are
|
||||||
layered on top of a PMEM namespace, or an address abstraction like BTT
|
layered on top of a PMEM namespace, or an address abstraction like BTT
|
||||||
if present, but partition support is deprecated going forward.
|
if present, but partition support is deprecated going forward.
|
||||||
|
|
||||||
|
@ -83,7 +83,7 @@ passed in.
|
|||||||
6. Freeze
|
6. Freeze
|
||||||
---------
|
---------
|
||||||
The freeze operation does not require any keys. The security config can be
|
The freeze operation does not require any keys. The security config can be
|
||||||
frozen by a user with root privelege.
|
frozen by a user with root privilege.
|
||||||
|
|
||||||
7. Disable
|
7. Disable
|
||||||
----------
|
----------
|
||||||
|
@ -836,7 +836,7 @@ hardware and shall be put into different subsystems:
|
|||||||
|
|
||||||
Depending on the exact HW register design, some functions exposed by the
|
Depending on the exact HW register design, some functions exposed by the
|
||||||
GPIO subsystem may call into the pinctrl subsystem in order to
|
GPIO subsystem may call into the pinctrl subsystem in order to
|
||||||
co-ordinate register settings across HW modules. In particular, this may
|
coordinate register settings across HW modules. In particular, this may
|
||||||
be needed for HW with separate GPIO and pin controller HW modules, where
|
be needed for HW with separate GPIO and pin controller HW modules, where
|
||||||
e.g. GPIO direction is determined by a register in the pin controller HW
|
e.g. GPIO direction is determined by a register in the pin controller HW
|
||||||
module rather than the GPIO HW module.
|
module rather than the GPIO HW module.
|
||||||
|
@ -20,7 +20,7 @@ Overview of the ``pldmfw`` library
|
|||||||
|
|
||||||
The ``pldmfw`` library is intended to be used by device drivers for
|
The ``pldmfw`` library is intended to be used by device drivers for
|
||||||
implementing device flash update based on firmware files following the PLDM
|
implementing device flash update based on firmware files following the PLDM
|
||||||
firwmare file format.
|
firmware file format.
|
||||||
|
|
||||||
It is implemented using an ops table that allows device drivers to provide
|
It is implemented using an ops table that allows device drivers to provide
|
||||||
the underlying device specific functionality.
|
the underlying device specific functionality.
|
||||||
|
@ -24,7 +24,7 @@ console support.
|
|||||||
Console Support
|
Console Support
|
||||||
---------------
|
---------------
|
||||||
|
|
||||||
The serial core provides a few helper functions. This includes identifing
|
The serial core provides a few helper functions. This includes identifying
|
||||||
the correct port structure (via uart_get_console()) and decoding command line
|
the correct port structure (via uart_get_console()) and decoding command line
|
||||||
arguments (uart_parse_options()).
|
arguments (uart_parse_options()).
|
||||||
|
|
||||||
|
@ -77,7 +77,7 @@ after the frame structure and before the payload. The payload is followed by
|
|||||||
its own CRC (over all payload bytes). If the payload is not present (i.e.
|
its own CRC (over all payload bytes). If the payload is not present (i.e.
|
||||||
the frame has ``LEN=0``), the CRC of the payload is still present and will
|
the frame has ``LEN=0``), the CRC of the payload is still present and will
|
||||||
evaluate to ``0xffff``. The |LEN| field does not include any of the CRCs, it
|
evaluate to ``0xffff``. The |LEN| field does not include any of the CRCs, it
|
||||||
equals the number of bytes inbetween the CRC of the frame and the CRC of the
|
equals the number of bytes between the CRC of the frame and the CRC of the
|
||||||
payload.
|
payload.
|
||||||
|
|
||||||
Additionally, the following fixed two-byte sequences are used:
|
Additionally, the following fixed two-byte sequences are used:
|
||||||
|
@ -18,7 +18,7 @@ controller which can be configured in one of 4 ways:
|
|||||||
4. Hub configuration
|
4. Hub configuration
|
||||||
|
|
||||||
Linux currently supports several versions of this controller. In all
|
Linux currently supports several versions of this controller. In all
|
||||||
likelyhood, the version in your SoC is already supported. At the time
|
likelihood, the version in your SoC is already supported. At the time
|
||||||
of this writing, known tested versions range from 2.02a to 3.10a. As a
|
of this writing, known tested versions range from 2.02a to 3.10a. As a
|
||||||
rule of thumb, anything above 2.02a should work reliably well.
|
rule of thumb, anything above 2.02a should work reliably well.
|
||||||
|
|
||||||
|
@ -48,7 +48,7 @@ kernel boot parameter::
|
|||||||
"earlyprintk=xdbc"
|
"earlyprintk=xdbc"
|
||||||
|
|
||||||
If there are multiple xHCI controllers in your system, you can
|
If there are multiple xHCI controllers in your system, you can
|
||||||
append a host contoller index to this kernel parameter. This
|
append a host controller index to this kernel parameter. This
|
||||||
index starts from 0.
|
index starts from 0.
|
||||||
|
|
||||||
Current design doesn't support DbC runtime suspend/resume. As
|
Current design doesn't support DbC runtime suspend/resume. As
|
||||||
|
@ -1284,6 +1284,7 @@ support this. Table 1-9 lists the files and their meaning.
|
|||||||
rt_cache Routing cache
|
rt_cache Routing cache
|
||||||
snmp SNMP data
|
snmp SNMP data
|
||||||
sockstat Socket statistics
|
sockstat Socket statistics
|
||||||
|
softnet_stat Per-CPU incoming packets queues statistics of online CPUs
|
||||||
tcp TCP sockets
|
tcp TCP sockets
|
||||||
udp UDP sockets
|
udp UDP sockets
|
||||||
unix UNIX domain sockets
|
unix UNIX domain sockets
|
||||||
|
@ -1,6 +1,6 @@
|
|||||||
==================================
|
============================
|
||||||
Linux GPU Driver Developer's Guide
|
GPU Driver Developer's Guide
|
||||||
==================================
|
============================
|
||||||
|
|
||||||
.. toctree::
|
.. toctree::
|
||||||
|
|
||||||
|
@ -344,8 +344,8 @@ Documentation/ABI/testing/sysfs-bus-iio for IIO ABIs to user space.
|
|||||||
|
|
||||||
To debug ISH, event tracing mechanism is used. To enable debug logs::
|
To debug ISH, event tracing mechanism is used. To enable debug logs::
|
||||||
|
|
||||||
echo 1 > /sys/kernel/debug/tracing/events/intel_ish/enable
|
echo 1 > /sys/kernel/tracing/events/intel_ish/enable
|
||||||
cat /sys/kernel/debug/tracing/trace
|
cat /sys/kernel/tracing/trace
|
||||||
|
|
||||||
3.8 ISH IIO sysfs Example on Lenovo thinkpad Yoga 260
|
3.8 ISH IIO sysfs Example on Lenovo thinkpad Yoga 260
|
||||||
-----------------------------------------------------
|
-----------------------------------------------------
|
||||||
|
@ -1,8 +1,8 @@
|
|||||||
.. SPDX-License-Identifier: GPL-2.0
|
.. SPDX-License-Identifier: GPL-2.0
|
||||||
|
|
||||||
=========================
|
===================
|
||||||
Linux Hardware Monitoring
|
Hardware Monitoring
|
||||||
=========================
|
===================
|
||||||
|
|
||||||
.. toctree::
|
.. toctree::
|
||||||
:maxdepth: 1
|
:maxdepth: 1
|
||||||
|
@ -1,6 +1,6 @@
|
|||||||
=============================
|
===================
|
||||||
The Linux Input Documentation
|
Input Documentation
|
||||||
=============================
|
===================
|
||||||
|
|
||||||
Contents:
|
Contents:
|
||||||
|
|
||||||
|
@ -26,3 +26,4 @@ LEDs
|
|||||||
leds-lp55xx
|
leds-lp55xx
|
||||||
leds-mlxcpld
|
leds-mlxcpld
|
||||||
leds-sc27xx
|
leds-sc27xx
|
||||||
|
leds-qcom-lpg
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _active_mm:
|
|
||||||
|
|
||||||
=========
|
=========
|
||||||
Active MM
|
Active MM
|
||||||
=========
|
=========
|
||||||
|
@ -1,7 +1,5 @@
|
|||||||
.. SPDX-License-Identifier: GPL-2.0
|
.. SPDX-License-Identifier: GPL-2.0
|
||||||
|
|
||||||
.. _arch_page_table_helpers:
|
|
||||||
|
|
||||||
===============================
|
===============================
|
||||||
Architecture Page Table Helpers
|
Architecture Page Table Helpers
|
||||||
===============================
|
===============================
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _balance:
|
|
||||||
|
|
||||||
================
|
================
|
||||||
Memory Balancing
|
Memory Balancing
|
||||||
================
|
================
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _free_page_reporting:
|
|
||||||
|
|
||||||
=====================
|
=====================
|
||||||
Free Page Reporting
|
Free Page Reporting
|
||||||
=====================
|
=====================
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _frontswap:
|
|
||||||
|
|
||||||
=========
|
=========
|
||||||
Frontswap
|
Frontswap
|
||||||
=========
|
=========
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _highmem:
|
|
||||||
|
|
||||||
====================
|
====================
|
||||||
High Memory Handling
|
High Memory Handling
|
||||||
====================
|
====================
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _hmm:
|
|
||||||
|
|
||||||
=====================================
|
=====================================
|
||||||
Heterogeneous Memory Management (HMM)
|
Heterogeneous Memory Management (HMM)
|
||||||
=====================================
|
=====================================
|
||||||
@ -304,7 +302,7 @@ devm_memunmap_pages(), and devm_release_mem_region() when the resources can
|
|||||||
be tied to a ``struct device``.
|
be tied to a ``struct device``.
|
||||||
|
|
||||||
The overall migration steps are similar to migrating NUMA pages within system
|
The overall migration steps are similar to migrating NUMA pages within system
|
||||||
memory (see :ref:`Page migration <page_migration>`) but the steps are split
|
memory (see Documentation/mm/page_migration.rst) but the steps are split
|
||||||
between device driver specific code and shared common code:
|
between device driver specific code and shared common code:
|
||||||
|
|
||||||
1. ``mmap_read_lock()``
|
1. ``mmap_read_lock()``
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _hugetlbfs_reserve:
|
|
||||||
|
|
||||||
=====================
|
=====================
|
||||||
Hugetlbfs Reservation
|
Hugetlbfs Reservation
|
||||||
=====================
|
=====================
|
||||||
@ -7,7 +5,7 @@ Hugetlbfs Reservation
|
|||||||
Overview
|
Overview
|
||||||
========
|
========
|
||||||
|
|
||||||
Huge pages as described at :ref:`hugetlbpage` are typically
|
Huge pages as described at Documentation/mm/hugetlbpage.rst are typically
|
||||||
preallocated for application use. These huge pages are instantiated in a
|
preallocated for application use. These huge pages are instantiated in a
|
||||||
task's address space at page fault time if the VMA indicates huge pages are
|
task's address space at page fault time if the VMA indicates huge pages are
|
||||||
to be used. If no huge page exists at page fault time, the task is sent
|
to be used. If no huge page exists at page fault time, the task is sent
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. hwpoison:
|
|
||||||
|
|
||||||
========
|
========
|
||||||
hwpoison
|
hwpoison
|
||||||
========
|
========
|
||||||
|
@ -1,6 +1,6 @@
|
|||||||
=====================================
|
===============================
|
||||||
Linux Memory Management Documentation
|
Memory Management Documentation
|
||||||
=====================================
|
===============================
|
||||||
|
|
||||||
Memory Management Guide
|
Memory Management Guide
|
||||||
=======================
|
=======================
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _ksm:
|
|
||||||
|
|
||||||
=======================
|
=======================
|
||||||
Kernel Samepage Merging
|
Kernel Samepage Merging
|
||||||
=======================
|
=======================
|
||||||
@ -8,7 +6,7 @@ KSM is a memory-saving de-duplication feature, enabled by CONFIG_KSM=y,
|
|||||||
added to the Linux kernel in 2.6.32. See ``mm/ksm.c`` for its implementation,
|
added to the Linux kernel in 2.6.32. See ``mm/ksm.c`` for its implementation,
|
||||||
and http://lwn.net/Articles/306704/ and https://lwn.net/Articles/330589/
|
and http://lwn.net/Articles/306704/ and https://lwn.net/Articles/330589/
|
||||||
|
|
||||||
The userspace interface of KSM is described in :ref:`Documentation/admin-guide/mm/ksm.rst <admin_guide_ksm>`
|
The userspace interface of KSM is described in Documentation/admin-guide/mm/ksm.rst
|
||||||
|
|
||||||
Design
|
Design
|
||||||
======
|
======
|
||||||
|
@ -1,7 +1,5 @@
|
|||||||
.. SPDX-License-Identifier: GPL-2.0
|
.. SPDX-License-Identifier: GPL-2.0
|
||||||
|
|
||||||
.. _physical_memory_model:
|
|
||||||
|
|
||||||
=====================
|
=====================
|
||||||
Physical Memory Model
|
Physical Memory Model
|
||||||
=====================
|
=====================
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _mmu_notifier:
|
|
||||||
|
|
||||||
When do you need to notify inside page table lock ?
|
When do you need to notify inside page table lock ?
|
||||||
===================================================
|
===================================================
|
||||||
|
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _numa:
|
|
||||||
|
|
||||||
Started Nov 1999 by Kanoj Sarcar <kanoj@sgi.com>
|
Started Nov 1999 by Kanoj Sarcar <kanoj@sgi.com>
|
||||||
|
|
||||||
=============
|
=============
|
||||||
@ -64,7 +62,7 @@ In addition, for some architectures, again x86 is an example, Linux supports
|
|||||||
the emulation of additional nodes. For NUMA emulation, linux will carve up
|
the emulation of additional nodes. For NUMA emulation, linux will carve up
|
||||||
the existing nodes--or the system memory for non-NUMA platforms--into multiple
|
the existing nodes--or the system memory for non-NUMA platforms--into multiple
|
||||||
nodes. Each emulated node will manage a fraction of the underlying cells'
|
nodes. Each emulated node will manage a fraction of the underlying cells'
|
||||||
physical memory. NUMA emluation is useful for testing NUMA kernel and
|
physical memory. NUMA emulation is useful for testing NUMA kernel and
|
||||||
application features on non-NUMA platforms, and as a sort of memory resource
|
application features on non-NUMA platforms, and as a sort of memory resource
|
||||||
management mechanism when used together with cpusets.
|
management mechanism when used together with cpusets.
|
||||||
[see Documentation/admin-guide/cgroup-v1/cpusets.rst]
|
[see Documentation/admin-guide/cgroup-v1/cpusets.rst]
|
||||||
@ -110,7 +108,7 @@ to improve NUMA locality using various CPU affinity command line interfaces,
|
|||||||
such as taskset(1) and numactl(1), and program interfaces such as
|
such as taskset(1) and numactl(1), and program interfaces such as
|
||||||
sched_setaffinity(2). Further, one can modify the kernel's default local
|
sched_setaffinity(2). Further, one can modify the kernel's default local
|
||||||
allocation behavior using Linux NUMA memory policy. [see
|
allocation behavior using Linux NUMA memory policy. [see
|
||||||
:ref:`Documentation/admin-guide/mm/numa_memory_policy.rst <numa_memory_policy>`].
|
Documentation/admin-guide/mm/numa_memory_policy.rst].
|
||||||
|
|
||||||
System administrators can restrict the CPUs and nodes' memories that a non-
|
System administrators can restrict the CPUs and nodes' memories that a non-
|
||||||
privileged user can specify in the scheduling or NUMA commands and functions
|
privileged user can specify in the scheduling or NUMA commands and functions
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _page_frags:
|
|
||||||
|
|
||||||
==============
|
==============
|
||||||
Page fragments
|
Page fragments
|
||||||
==============
|
==============
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _page_migration:
|
|
||||||
|
|
||||||
==============
|
==============
|
||||||
Page migration
|
Page migration
|
||||||
==============
|
==============
|
||||||
@ -9,8 +7,8 @@ nodes in a NUMA system while the process is running. This means that the
|
|||||||
virtual addresses that the process sees do not change. However, the
|
virtual addresses that the process sees do not change. However, the
|
||||||
system rearranges the physical location of those pages.
|
system rearranges the physical location of those pages.
|
||||||
|
|
||||||
Also see :ref:`Heterogeneous Memory Management (HMM) <hmm>`
|
Also see Documentation/mm/hmm.rst for migrating pages to or from device
|
||||||
for migrating pages to or from device private memory.
|
private memory.
|
||||||
|
|
||||||
The main intent of page migration is to reduce the latency of memory accesses
|
The main intent of page migration is to reduce the latency of memory accesses
|
||||||
by moving pages near to the processor where the process accessing that memory
|
by moving pages near to the processor where the process accessing that memory
|
||||||
|
@ -1,5 +1,3 @@
|
|||||||
.. _page_owner:
|
|
||||||
|
|
||||||
==================================================
|
==================================================
|
||||||
page owner: Tracking about who allocated each page
|
page owner: Tracking about who allocated each page
|
||||||
==================================================
|
==================================================
|
||||||
@ -52,7 +50,7 @@ pages are investigated and marked as allocated in initialization phase.
|
|||||||
Although it doesn't mean that they have the right owner information,
|
Although it doesn't mean that they have the right owner information,
|
||||||
at least, we can tell whether the page is allocated or not,
|
at least, we can tell whether the page is allocated or not,
|
||||||
more accurately. On 2GB memory x86-64 VM box, 13343 early allocated pages
|
more accurately. On 2GB memory x86-64 VM box, 13343 early allocated pages
|
||||||
are catched and marked, although they are mostly allocated from struct
|
are caught and marked, although they are mostly allocated from struct
|
||||||
page extension feature. Anyway, after that, no page is left in
|
page extension feature. Anyway, after that, no page is left in
|
||||||
un-tracking state.
|
un-tracking state.
|
||||||
|
|
||||||
@ -178,7 +176,7 @@ STANDARD FORMAT SPECIFIERS
|
|||||||
at alloc_ts timestamp of the page when it was allocated
|
at alloc_ts timestamp of the page when it was allocated
|
||||||
ator allocator memory allocator for pages
|
ator allocator memory allocator for pages
|
||||||
|
|
||||||
For --curl option:
|
For --cull option:
|
||||||
|
|
||||||
KEY LONG DESCRIPTION
|
KEY LONG DESCRIPTION
|
||||||
p pid process ID
|
p pid process ID
|
||||||
|
@ -1,7 +1,5 @@
|
|||||||
.. SPDX-License-Identifier: GPL-2.0
|
.. SPDX-License-Identifier: GPL-2.0
|
||||||
|
|
||||||
.. _page_table_check:
|
|
||||||
|
|
||||||
================
|
================
|
||||||
Page Table Check
|
Page Table Check
|
||||||
================
|
================
|
||||||
|
Some files were not shown because too many files have changed in this diff.