linux/Documentation/core-api
Tejun Heo 636b927eba workqueue: Make unbound workqueues to use per-cpu pool_workqueues
A pwq (pool_workqueue) represents an association between a workqueue and a
worker_pool. When a work item is queued, the workqueue selects the pwq to
use, which in turn determines the pool, and queues the work item to the pool
through the pwq. pwq is also what implements the maximum concurrency limit -
@max_active.

As a per-cpu workqueue should be assocaited with a different worker_pool on
each CPU, it always had per-cpu pwq's that are accessed through wq->cpu_pwq.
However, unbound workqueues were sharing a pwq within each NUMA node by
default. The sharing has several downsides:

* Because @max_active is per-pwq, the meaning of @max_active changes
  depending on the machine configuration and whether workqueue NUMA locality
  support is enabled.

* Makes per-cpu and unbound code deviate.

* Gets in the way of making workqueue CPU locality awareness more flexible.

This patch makes unbound workqueues use per-cpu pwq's the same way per-cpu
workqueues do by making the following changes:

* wq->numa_pwq_tbl[] is removed and unbound workqueues now use wq->cpu_pwq
  just like per-cpu workqueues. wq->cpu_pwq is now RCU protected for unbound
  workqueues.

* numa_pwq_tbl_install() is renamed to install_unbound_pwq() and installs
  the specified pwq to the target CPU's wq->cpu_pwq.

* apply_wqattrs_prepare() now always allocates a separate pwq for each CPU
  unless the workqueue is ordered. If ordered, all CPUs use wq->dfl_pwq.
  This makes the return value of wq_calc_node_cpumask() unnecessary. It now
  returns void.

* @max_active now means the same thing for both per-cpu and unbound
  workqueues. WQ_UNBOUND_MAX_ACTIVE now equals WQ_MAX_ACTIVE and
  documentation is updated accordingly. WQ_UNBOUND_MAX_ACTIVE is no longer
  used in workqueue implementation and will be removed later.

* All unbound pwq operations which used to be per-numa-node are now per-cpu.

For most unbound workqueue users, this shouldn't cause noticeable changes.
Work item issue and completion will be a small bit faster, flush_workqueue()
would become a bit more expensive, and the total concurrency limit would
likely become higher. All @max_active==1 use cases are currently being
audited for conversion into alloc_ordered_workqueue() and they shouldn't be
affected once the audit and conversion is complete.

One area where the behavior change may be more noticeable is
workqueue_congested() as the reported congestion state is now per CPU
instead of NUMA node. There are only two users of this interface -
drivers/infiniband/hw/hfi1 and net/smc. Maintainers of both subsystems are
cc'd. Inputs on the behavior change would be very much appreciated.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Karsten Graul <kgraul@linux.ibm.com>
Cc: Wenjia Zhang <wenjia@linux.ibm.com>
Cc: Jan Karcher <jaka@linux.ibm.com>
2023-08-07 15:57:23 -10:00
..
irq Documentation: irqdomain: Fix typo of "at least once" 2022-08-18 11:11:52 -06:00
wrappers docs: put atomic*.txt and memory-barriers.txt into the core-api book 2022-09-29 12:55:06 -06:00
asm-annotations.rst docs: move x86 documentation into Documentation/arch/ 2023-03-30 12:58:51 -06:00
assoc_array.rst Documentation: Use "while" instead of "whilst" 2018-11-20 09:30:43 -07:00
boot-time-mm.rst
cachetlb.rst mm: Add flush_dcache_folio() 2021-10-18 07:49:36 -04:00
circular-buffers.rst doc: Remove ".vnet" from paulmck email addresses 2019-05-28 09:02:57 -07:00
cpu_hotplug.rst x86/topology: Remove CPU0 hotplug option 2023-05-15 13:44:49 +02:00
debug-objects.rst
debugging-via-ohci1394.rst docs: debugging-via-ohci1394.txt: add it to the core-api book 2020-05-15 11:59:17 -06:00
dma-api-howto.rst dma-api-howto: typo fix 2023-04-10 16:46:11 -06:00
dma-api.rst docs/mm: Physical Memory: remove useless markup 2023-02-02 10:18:04 -07:00
dma-attributes.rst Reinstate some of "swiotlb: rework "fix info leak with DMA_FROM_DEVICE"" 2022-03-28 11:37:05 -07:00
dma-isa-lpc.rst docs: core-api: avoid using ReST :doc:foo markup 2021-06-17 13:24:37 -06:00
entry.rst Documentation: core-api: entry: Add comments about nesting 2022-01-27 11:32:40 -07:00
errseq.rst
genalloc.rst lib/genalloc.c: rename addr_in_gen_pool to gen_pool_has_addr 2019-12-04 19:44:13 -08:00
generic-radix-tree.rst generic radix trees 2019-03-12 10:04:02 -07:00
genericirq.rst docs: genericirq.rst: don't document chip.c functions twice 2020-10-15 07:49:41 +02:00
gfp_mask-from-fs-io.rst
idr.rst IDR: Note that the IDR API is deprecated 2022-07-10 21:17:30 -04:00
index.rst docs: add more netlink docs (incl. spec docs) 2023-01-24 10:58:11 +01:00
kernel-api.rst It's been a relatively calm cycle in docsland. We do have: 2023-06-27 11:33:47 -07:00
kobject.rst kobject documentation: remove default_attrs information 2022-01-07 11:23:37 +01:00
kref.rst docs: move the kref doc into the core-api book 2020-05-15 12:02:19 -06:00
librs.rst
local_ops.rst timers: Update the documentation to reflect on the new timer_shutdown() API 2022-11-24 15:09:12 +01:00
maple_tree.rst Maple Tree: add new data structure 2022-09-26 19:46:13 -07:00
memory-allocation.rst mm/slab: document kfree() as allowed for kmem_cache_alloc() objects 2023-03-29 10:35:41 +02:00
memory-hotplug.rst mm/memory_hotplug: remove HIGHMEM leftovers 2021-11-06 13:30:42 -07:00
mm-api.rst mm/page_alloc: remove obsolete gfpflags_normal_context() 2022-10-03 14:03:30 -07:00
netlink.rst docs: add more netlink docs (incl. spec docs) 2023-01-24 10:58:11 +01:00
packing.rst Documentation: core-api: packing: correct spelling 2023-02-15 21:40:54 -08:00
padata.rst Documentation: core-api: padata: correct spelling 2023-02-16 16:58:01 -07:00
pin_user_pages.rst mm: Don't pin ZERO_PAGE in pin_user_pages() 2023-05-31 09:48:15 -06:00
printk-basics.rst printk: Move the printk() kerneldoc comment to its new home 2021-07-26 12:36:44 +02:00
printk-formats.rst mm, printk: introduce new format %pGt for page_type 2023-03-28 16:20:09 -07:00
printk-index.rst printk/index: Printk index feature documentation 2022-04-13 14:25:31 +02:00
protection-keys.rst Documentation/protection-keys: Clean up documentation for User Space pkeys 2022-06-07 16:06:22 -07:00
rbtree.rst docs: rbtree.rst: Fix a typo 2021-03-25 11:38:51 -06:00
refcount-vs-atomic.rst docs: remove :c:func: from refcount-vs-atomic.rst 2019-10-07 09:08:56 -06:00
symbol-namespaces.rst doc: module: update file references 2022-07-01 14:50:01 -07:00
this_cpu_ops.rst arch: Remove cmpxchg_double 2023-06-05 09:36:39 +02:00
timekeeping.rst timekeeping: Introduce fast accessor to clock tai 2022-04-14 16:19:30 +02:00
tracepoint.rst
unaligned-memory-access.rst docs: move other kAPI documents to core-api 2020-06-26 11:33:42 -06:00
watch_queue.rst Documentation: move watch_queue to core-api 2022-04-22 09:47:25 -06:00
workqueue.rst workqueue: Make unbound workqueues to use per-cpu pool_workqueues 2023-08-07 15:57:23 -10:00
xarray.rst XArray: Document the locking requirement for the xa_state 2022-02-03 15:56:50 -05:00