docs: scheduler: convert docs to ReST and rename to *.rst

In order to prepare to add them to the Kernel API book, convert the
files to ReST format.

The conversion consists of:

- adding blank lines and indentation in order to identify paragraphs;
- fixing table markup;
- adding some list markup;
- marking literal blocks;
- adjusting title markup.

In the new index.rst, add :orphan: while it is not yet linked from the
main index.rst file, in order to avoid build warnings.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2024-11-22 04:02:20 +00:00 · 2019-06-12 14:53:03 -03:00 · 2019-06-12 14:53:03 -03:00 · d6a3b24762
commit d6a3b24762
parent d223884089
16 changed files with 324 additions and 226 deletions
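
Most hunks below apply the "mark literal blocks" step: an introductory line ending in ":" that is followed by a blank line and an indented block gets a second colon, so that ReST renders the block verbatim. The commit message does not say how the conversion was performed, so the following is only a rough sketch of that one rule. The helper name `mark_literal_blocks` is hypothetical, and the heuristic is deliberately naive: it would also rewrite an intro line that precedes an indented list, so any output needs manual review.

```python
def mark_literal_blocks(text: str) -> str:
    """Turn 'intro:' + blank line + indented line into 'intro::'.

    Illustrative heuristic only, not the procedure used for this commit.
    """
    lines = text.split("\n")
    out = list(lines)
    for i in range(len(lines) - 2):
        intro = lines[i].rstrip()
        gap = lines[i + 1]
        body = lines[i + 2]
        # Only single-colon intros qualify; '::' is already a literal marker.
        single_colon = intro.endswith(":") and not intro.endswith("::")
        # Treat a tab or four spaces as "indented block follows".
        indented = body.startswith(("\t", "    "))
        if single_colon and gap.strip() == "" and indented:
            out[i] = intro + ":"
    return "\n".join(out)


before = "A good example is:\n\n\twait_for_completion(&early_console_added);\n"
after = mark_literal_blocks(before)
```

Applied to the snippet from completion.rst, `after` carries the same `::` marker that the corresponding hunk introduces; running the helper a second time leaves the text unchanged.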
--- a/Documentation/ABI/testing/sysfs-kernel-uids
+++ b/Documentation/ABI/testing/sysfs-kernel-uids
@ -11,4 +11,4 @@ Description:
 		example would be, if User A has shares = 1024 and user
 		B has shares = 2048, User B will get twice the CPU
 		bandwidth user A will. For more details refer
-		Documentation/scheduler/sched-design-CFS.txt
+		Documentation/scheduler/sched-design-CFS.rst
--- a/Documentation/scheduler/completion.rst
+++ b/Documentation/scheduler/completion.rst
@ -1,3 +1,4 @@
+================================================
 Completions - "wait for completion" barrier APIs
 ================================================

@ -46,7 +47,7 @@ it has to wait for it.

 To use completions you need to #include <linux/completion.h> and
 create a static or dynamic variable of type 'struct completion',
-which has only two fields:
+which has only two fields::

 	struct completion {
 		unsigned int done;
@ -57,7 +58,7 @@ This provides the ->wait waitqueue to place tasks on for waiting (if any), and
 the ->done completion flag for indicating whether it's completed or not.

 Completions should be named to refer to the event that is being synchronized on.
-A good example is:
+A good example is::

 	wait_for_completion(&early_console_added);

@ -81,7 +82,7 @@ have taken place, even if these wait functions return prematurely due to a timeo
 or a signal triggering.

 Initializing of dynamically allocated completion objects is done via a call to
-init_completion():
+init_completion()::

 	init_completion(&dynamic_object->done);

@ -100,7 +101,8 @@ but be aware of other races.

 For static declaration and initialization, macros are available.

-For static (or global) declarations in file scope you can use DECLARE_COMPLETION():
+For static (or global) declarations in file scope you can use
+DECLARE_COMPLETION()::

 	static DECLARE_COMPLETION(setup_done);
 	DECLARE_COMPLETION(setup_done);
@ -111,7 +113,7 @@ initialized to 'not done' and doesn't require an init_completion() call.
 When a completion is declared as a local variable within a function,
 then the initialization should always use DECLARE_COMPLETION_ONSTACK()
 explicitly, not just to make lockdep happy, but also to make it clear
-that limited scope had been considered and is intentional:
+that limited scope had been considered and is intentional::

 	DECLARE_COMPLETION_ONSTACK(setup_done)

@ -140,11 +142,11 @@ Waiting for completions:
 ------------------------

 For a thread to wait for some concurrent activity to finish, it
-calls wait_for_completion() on the initialized completion structure:
+calls wait_for_completion() on the initialized completion structure::

 	void wait_for_completion(struct completion *done)

-A typical usage scenario is:
+A typical usage scenario is::

 	CPU#1					CPU#2

@ -192,17 +194,17 @@ A common problem that occurs is to have unclean assignment of return types,
 so take care to assign return-values to variables of the proper type.

 Checking for the specific meaning of return values also has been found
-to be quite inaccurate, e.g. constructs like:
+to be quite inaccurate, e.g. constructs like::

 	if (!wait_for_completion_interruptible_timeout(...))

 ... would execute the same code path for successful completion and for the
-interrupted case - which is probably not what you want.
+interrupted case - which is probably not what you want::

 	int wait_for_completion_interruptible(struct completion *done)

 This function marks the task TASK_INTERRUPTIBLE while it is waiting.
-If a signal was received while waiting it will return -ERESTARTSYS; 0 otherwise.
+If a signal was received while waiting it will return -ERESTARTSYS; 0 otherwise::

 	unsigned long wait_for_completion_timeout(struct completion *done, unsigned long timeout)

@ -214,7 +216,7 @@ Timeouts are preferably calculated with msecs_to_jiffies() or usecs_to_jiffies()
 to make the code largely HZ-invariant.

 If the returned timeout value is deliberately ignored a comment should probably explain
-why (e.g. see drivers/mfd/wm8350-core.c wm8350_read_auxadc()).
+why (e.g. see drivers/mfd/wm8350-core.c wm8350_read_auxadc())::

 	long wait_for_completion_interruptible_timeout(struct completion *done, unsigned long timeout)

@ -225,14 +227,14 @@ jiffies if completion occurred.

 Further variants include _killable which uses TASK_KILLABLE as the
 designated tasks state and will return -ERESTARTSYS if it is interrupted,
-or 0 if completion was achieved.  There is a _timeout variant as well:
+or 0 if completion was achieved.  There is a _timeout variant as well::

 	long wait_for_completion_killable(struct completion *done)
 	long wait_for_completion_killable_timeout(struct completion *done, unsigned long timeout)

 The _io variants wait_for_completion_io() behave the same as the non-_io
 variants, except for accounting waiting time as 'waiting on IO', which has
-an impact on how the task is accounted in scheduling/IO stats:
+an impact on how the task is accounted in scheduling/IO stats::

 	void wait_for_completion_io(struct completion *done)
 	unsigned long wait_for_completion_io_timeout(struct completion *done, unsigned long timeout)
@ -243,11 +245,11 @@ Signaling completions:

 A thread that wants to signal that the conditions for continuation have been
 achieved calls complete() to signal exactly one of the waiters that it can
-continue:
+continue::

 	void complete(struct completion *done)

-... or calls complete_all() to signal all current and future waiters:
+... or calls complete_all() to signal all current and future waiters::

 	void complete_all(struct completion *done)

@ -268,7 +270,7 @@ probably are a design bug.

 Signaling completion from IRQ context is fine as it will appropriately
 lock with spin_lock_irqsave()/spin_unlock_irqrestore() and it will never
-sleep. 
+sleep.


 try_wait_for_completion()/completion_done():
@ -276,14 +278,14 @@ try_wait_for_completion()/completion_done():

 The try_wait_for_completion() function will not put the thread on the wait
 queue but rather returns false if it would need to enqueue (block) the thread,
-else it consumes one posted completion and returns true.
+else it consumes one posted completion and returns true::

 	bool try_wait_for_completion(struct completion *done)

 Finally, to check the state of a completion without changing it in any way,
 call completion_done(), which returns false if there are no posted
 completions that were not yet consumed by waiters (implying that there are
-waiters) and true otherwise;
+waiters) and true otherwise::

 	bool completion_done(struct completion *done)
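
The completion.rst hunks above describe a structure with a ->done counter and a ->wait waitqueue, plus complete(), try_wait_for_completion() and completion_done(). As a reading aid, here is a user-space analogy in Python. It is not kernel code: the class `Completion` is hypothetical, a `threading.Condition` stands in for the waitqueue, the timeout is in seconds rather than jiffies, and complete_all() and the signal-handling variants are left out.

```python
import threading


class Completion:
    """User-space analogy of a completion; an assumption-laden sketch."""

    def __init__(self):
        self.done = 0                       # posted, not yet consumed
        self._cond = threading.Condition()  # stands in for ->wait

    def complete(self):
        # Post one completion and wake at most one waiter.
        with self._cond:
            self.done += 1
            self._cond.notify()

    def wait_for_completion(self, timeout=None):
        # Block until a completion is posted, then consume it.
        # Returns False only if the optional timeout (seconds) expired.
        with self._cond:
            if not self._cond.wait_for(lambda: self.done > 0, timeout):
                return False
            self.done -= 1
            return True

    def try_wait_for_completion(self):
        # Never blocks: False if the caller would have had to wait.
        with self._cond:
            if self.done == 0:
                return False
            self.done -= 1
            return True

    def completion_done(self):
        # Inspect the state without consuming anything.
        with self._cond:
            return self.done > 0


setup_done = Completion()
worker = threading.Thread(target=setup_done.complete)
worker.start()
finished = setup_done.wait_for_completion(timeout=5)
worker.join()
```

The variable is named after the event being waited for, following the naming advice in the document; `finished` is True once the worker thread has posted its completion.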

--- a/Documentation/scheduler/index.rst
+++ b/Documentation/scheduler/index.rst
@ -0,0 +1,29 @@
+:orphan:
+
+===============
+Linux Scheduler
+===============
+
+.. toctree::
+    :maxdepth: 1
+
+
+    completion
+    sched-arch
+    sched-bwc
+    sched-deadline
+    sched-design-CFS
+    sched-domains
+    sched-energy
+    sched-nice-design
+    sched-rt-group
+    sched-stats
+
+    text_files
+
+.. only::  subproject and html
+
+   Indices
+   =======
+
+   * :ref:`genindex`
--- a/Documentation/scheduler/sched-arch.rst
+++ b/Documentation/scheduler/sched-arch.rst
@ -1,4 +1,6 @@
-	CPU Scheduler implementation hints for architecture specific code
+=================================================================
+CPU Scheduler implementation hints for architecture specific code
+=================================================================

 	Nick Piggin, 2005

@ -35,9 +37,10 @@ Your cpu_idle routines need to obey the following rules:
 4. The only time interrupts need to be disabled when checking
   need_resched is if we are about to sleep the processor until
   the next interrupt (this doesn't provide any protection of
-   need_resched, it prevents losing an interrupt).
+   need_resched, it prevents losing an interrupt):
+
+	4a. Common problem with this type of sleep appears to be::

-	4a. Common problem with this type of sleep appears to be:
 	        local_irq_disable();
 	        if (!need_resched()) {
 	                local_irq_enable();
@ -51,10 +54,10 @@ Your cpu_idle routines need to obey the following rules:
   although it may be reasonable to do some background work or enter
   a low CPU priority.

-   	5a. If TIF_POLLING_NRFLAG is set, and we do decide to enter
-	    an interrupt sleep, it needs to be cleared then a memory
-	    barrier issued (followed by a test of need_resched with
-	    interrupts disabled, as explained in 3).
+      - 5a. If TIF_POLLING_NRFLAG is set, and we do decide to enter
+	an interrupt sleep, it needs to be cleared then a memory
+	barrier issued (followed by a test of need_resched with
+	interrupts disabled, as explained in 3).

 arch/x86/kernel/process.c has examples of both polling and
 sleeping idle functions.
@ -71,4 +74,3 @@ sh64 - Is sleeping racy vs interrupts? (See #4a)

 sparc - IRQs on at this point(?), change local_irq_save to _disable.
      - TODO: needs secondary CPUs to disable preempt (See #1)
-
--- a/Documentation/scheduler/sched-bwc.rst
+++ b/Documentation/scheduler/sched-bwc.rst
@ -1,8 +1,9 @@
+=====================
 CFS Bandwidth Control
 =====================

 [ This document only discusses CPU bandwidth control for SCHED_NORMAL.
-  The SCHED_RT case is covered in Documentation/scheduler/sched-rt-group.txt ]
+  The SCHED_RT case is covered in Documentation/scheduler/sched-rt-group.rst ]

 CFS bandwidth control is a CONFIG_FAIR_GROUP_SCHED extension which allows the
 specification of the maximum CPU bandwidth available to a group or hierarchy.
@ -27,7 +28,8 @@ cpu.cfs_quota_us: the total available run-time within a period (in microseconds)
 cpu.cfs_period_us: the length of a period (in microseconds)
 cpu.stat: exports throttling statistics [explained further below]

-The default values are:
+The default values are::
+
 	cpu.cfs_period_us=100ms
 	cpu.cfs_quota=-1

@ -55,7 +57,8 @@ For efficiency run-time is transferred between the global pool and CPU local
 on large systems.  The amount transferred each time such an update is required
 is described as the "slice".

-This is tunable via procfs:
+This is tunable via procfs::
+
 	/proc/sys/kernel/sched_cfs_bandwidth_slice_us (default=5ms)

 Larger slice values will reduce transfer overheads, while smaller values allow
@ -66,6 +69,7 @@ Statistics
 A group's bandwidth statistics are exported via 3 fields in cpu.stat.

 cpu.stat:
+
 - nr_periods: Number of enforcement intervals that have elapsed.
 - nr_throttled: Number of times the group has been throttled/limited.
 - throttled_time: The total time duration (in nanoseconds) for which entities
@ -78,12 +82,15 @@ Hierarchical considerations
 The interface enforces that an individual entity's bandwidth is always
 attainable, that is: max(c_i) <= C. However, over-subscription in the
 aggregate case is explicitly allowed to enable work-conserving semantics
-within a hierarchy.
+within a hierarchy:
+
  e.g. \Sum (c_i) may exceed C
+
 [ Where C is the parent's bandwidth, and c_i its children ]


 There are two ways in which a group may become throttled:
+
 	a. it fully consumes its own quota within a period
 	b. a parent's quota is fully consumed within its period

@ -92,7 +99,7 @@ be allowed to until the parent's runtime is refreshed.

 Examples
 --------
-1. Limit a group to 1 CPU worth of runtime.
+1. Limit a group to 1 CPU worth of runtime::

 	If period is 250ms and quota is also 250ms, the group will get
 	1 CPU worth of runtime every 250ms.
@ -100,10 +107,10 @@ Examples
 	# echo 250000 > cpu.cfs_quota_us /* quota = 250ms */
 	# echo 250000 > cpu.cfs_period_us /* period = 250ms */

-2. Limit a group to 2 CPUs worth of runtime on a multi-CPU machine.
+2. Limit a group to 2 CPUs worth of runtime on a multi-CPU machine

-	With 500ms period and 1000ms quota, the group can get 2 CPUs worth of
-	runtime every 500ms.
+   With 500ms period and 1000ms quota, the group can get 2 CPUs worth of
+   runtime every 500ms::

 	# echo 1000000 > cpu.cfs_quota_us /* quota = 1000ms */
 	# echo 500000 > cpu.cfs_period_us /* period = 500ms */
@ -112,11 +119,10 @@ Examples

 3. Limit a group to 20% of 1 CPU.

-	With 50ms period, 10ms quota will be equivalent to 20% of 1 CPU.
+   With 50ms period, 10ms quota will be equivalent to 20% of 1 CPU::

 	# echo 10000 > cpu.cfs_quota_us /* quota = 10ms */
 	# echo 50000 > cpu.cfs_period_us /* period = 50ms */

-	By using a small period here we are ensuring a consistent latency
-	response at the expense of burst capacity.
-
+   By using a small period here we are ensuring a consistent latency
+   response at the expense of burst capacity.
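
The three examples in the sched-bwc.rst hunks above all follow from one ratio: the number of CPUs' worth of runtime a group gets is its quota divided by its period. A small sketch of that arithmetic, assuming both values are given in microseconds as in cpu.cfs_quota_us and cpu.cfs_period_us; the function name `cpus_worth` is hypothetical.

```python
def cpus_worth(quota_us: int, period_us: int) -> float:
    """CPU-equivalents of runtime available per period."""
    if period_us <= 0:
        raise ValueError("period must be positive")
    if quota_us < 0:
        # A quota of -1 (the default shown above) means no limit.
        return float("inf")
    return quota_us / period_us


one_cpu = cpus_worth(250_000, 250_000)     # example 1: 250ms / 250ms
two_cpus = cpus_worth(1_000_000, 500_000)  # example 2: 1000ms / 500ms
fifth_cpu = cpus_worth(10_000, 50_000)     # example 3: 10ms / 50ms
```

The ratio alone does not capture the period's effect on latency: as the last example notes, a shorter period with the same ratio gives a more consistent latency response at the expense of burst capacity.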
--- a/Documentation/scheduler/sched-deadline.rst
+++ b/Documentation/scheduler/sched-deadline.rst
@ -1,29 +1,29 @@
-			  Deadline Task Scheduling
-			  ------------------------
+========================
+Deadline Task Scheduling
+========================

-CONTENTS
-========
+.. CONTENTS

- 0. WARNING
- 1. Overview
- 2. Scheduling algorithm
-   2.1 Main algorithm
-   2.2 Bandwidth reclaiming
- 3. Scheduling Real-Time Tasks
-   3.1 Definitions
-   3.2 Schedulability Analysis for Uniprocessor Systems
-   3.3 Schedulability Analysis for Multiprocessor Systems
-   3.4 Relationship with SCHED_DEADLINE Parameters
- 4. Bandwidth management
-   4.1 System-wide settings
-   4.2 Task interface
-   4.3 Default behavior
-   4.4 Behavior of sched_yield()
- 5. Tasks CPU affinity
-   5.1 SCHED_DEADLINE and cpusets HOWTO
- 6. Future plans
- A. Test suite
- B. Minimal main()
+    0. WARNING
+    1. Overview
+    2. Scheduling algorithm
+      2.1 Main algorithm
+      2.2 Bandwidth reclaiming
+    3. Scheduling Real-Time Tasks
+      3.1 Definitions
+      3.2 Schedulability Analysis for Uniprocessor Systems
+      3.3 Schedulability Analysis for Multiprocessor Systems
+      3.4 Relationship with SCHED_DEADLINE Parameters
+    4. Bandwidth management
+      4.1 System-wide settings
+      4.2 Task interface
+      4.3 Default behavior
+      4.4 Behavior of sched_yield()
+    5. Tasks CPU affinity
+      5.1 SCHED_DEADLINE and cpusets HOWTO
+    6. Future plans
+    A. Test suite
+    B. Minimal main()


 0. WARNING
@ -44,7 +44,7 @@ CONTENTS


 2. Scheduling algorithm
-==================
+=======================

 2.1 Main algorithm
 ------------------
@ -80,7 +80,7 @@ CONTENTS
    a "remaining runtime". These two parameters are initially set to 0;

  - When a SCHED_DEADLINE task wakes up (becomes ready for execution),
-    the scheduler checks if
+    the scheduler checks if::

                 remaining runtime                  runtime
        ----------------------------------    >    ---------
@ -97,7 +97,7 @@ CONTENTS
    left unchanged;

  - When a SCHED_DEADLINE task executes for an amount of time t, its
-    remaining runtime is decreased as
+    remaining runtime is decreased as::

         remaining runtime = remaining runtime - t

@ -112,7 +112,7 @@ CONTENTS

  - When the current time is equal to the replenishment time of a
    throttled task, the scheduling deadline and the remaining runtime are
-    updated as
+    updated as::

         scheduling deadline = scheduling deadline + period
         remaining runtime = remaining runtime + runtime
@ -129,7 +129,7 @@ CONTENTS
 Reclamation of Unused Bandwidth) algorithm [15, 16, 17] and it is enabled
 when flag SCHED_FLAG_RECLAIM is set.

- The following diagram illustrates the state names for tasks handled by GRUB:
+ The following diagram illustrates the state names for tasks handled by GRUB::

                             ------------
                 (d)        |   Active   |
@ -168,7 +168,7 @@ CONTENTS
      breaking the real-time guarantees.

      The 0-lag time for a task entering the ActiveNonContending state is
-      computed as
+      computed as::

                        (runtime * dl_period)
             deadline - ---------------------
@ -183,7 +183,7 @@ CONTENTS
      the task's utilization must be removed from the previous runqueue's active
      utilization and must be added to the new runqueue's active utilization.
      In order to avoid races between a task waking up on a runqueue while the
-       "inactive timer" is running on a different CPU, the "dl_non_contending"
+      "inactive timer" is running on a different CPU, the "dl_non_contending"
      flag is used to indicate that a task is not on a runqueue but is active
      (so, the flag is set when the task blocks and is cleared when the
      "inactive timer" fires or when the task  wakes up).
@ -222,36 +222,36 @@ CONTENTS


 Let's now see a trivial example of two deadline tasks with runtime equal
- to 4 and period equal to 8 (i.e., bandwidth equal to 0.5):
+ to 4 and period equal to 8 (i.e., bandwidth equal to 0.5)::

-     A            Task T1
-     |
-     |                               |
-     |                               |
-     |--------                       |----
-     |       |                       V
-     |---|---|---|---|---|---|---|---|--------->t
-     0   1   2   3   4   5   6   7   8
+         A            Task T1
+         |
+         |                               |
+         |                               |
+         |--------                       |----
+         |       |                       V
+         |---|---|---|---|---|---|---|---|--------->t
+         0   1   2   3   4   5   6   7   8


-     A            Task T2
-     |
-     |                               |
-     |                               |
-     |       ------------------------|
-     |       |                       V
-     |---|---|---|---|---|---|---|---|--------->t
-     0   1   2   3   4   5   6   7   8
+         A            Task T2
+         |
+         |                               |
+         |                               |
+         |       ------------------------|
+         |       |                       V
+         |---|---|---|---|---|---|---|---|--------->t
+         0   1   2   3   4   5   6   7   8


-     A            running_bw
-     |
-   1 -----------------               ------
-     |               |               |
-  0.5-               -----------------
-     |                               |
-     |---|---|---|---|---|---|---|---|--------->t
-     0   1   2   3   4   5   6   7   8
+         A            running_bw
+         |
+       1 -----------------               ------
+         |               |               |
+      0.5-               -----------------
+         |                               |
+         |---|---|---|---|---|---|---|---|--------->t
+         0   1   2   3   4   5   6   7   8


  - Time t = 0:
@ -284,7 +284,7 @@ CONTENTS


 2.3 Energy-aware scheduling
------------------------
+---------------------------

 When cpufreq's schedutil governor is selected, SCHED_DEADLINE implements the
 GRUB-PA [19] algorithm, reducing the CPU operating frequency to the minimum
@ -299,15 +299,20 @@ CONTENTS
 3. Scheduling Real-Time Tasks
 =============================

- * BIG FAT WARNING ******************************************************
- *
- * This section contains a (not-thorough) summary on classical deadline
- * scheduling theory, and how it applies to SCHED_DEADLINE.
- * The reader can "safely" skip to Section 4 if only interested in seeing
- * how the scheduling policy can be used. Anyway, we strongly recommend
- * to come back here and continue reading (once the urge for testing is
- * satisfied :P) to be sure of fully understanding all technical details.
- ************************************************************************
+
+
+ ..  BIG FAT WARNING ******************************************************
+
+ .. warning::
+
+   This section contains a (not-thorough) summary on classical deadline
+   scheduling theory, and how it applies to SCHED_DEADLINE.
+   The reader can "safely" skip to Section 4 if only interested in seeing
+   how the scheduling policy can be used. Anyway, we strongly recommend
+   to come back here and continue reading (once the urge for testing is
+   satisfied :P) to be sure of fully understanding all technical details.
+
+ .. ************************************************************************

 There are no limitations on what kind of task can exploit this new
 scheduling discipline, even if it must be said that it is particularly
@ -329,6 +334,7 @@ CONTENTS
 sporadic with minimum inter-arrival time P is r_{j+1} >= r_j + P. Finally,
 d_j = r_j + D, where D is the task's relative deadline.
 Summing up, a real-time task can be described as
+
 	Task = (WCET, D, P)

 The utilization of a real-time task is defined as the ratio between its
@ -352,13 +358,15 @@ CONTENTS
 between the finishing time of a job and its absolute deadline).
 More precisely, it can be proven that using a global EDF scheduler the
 maximum tardiness of each task is smaller or equal than
+
 	((M − 1) · WCET_max − WCET_min)/(M − (M − 2) · U_max) + WCET_max
+
 where WCET_max = max{WCET_i} is the maximum WCET, WCET_min=min{WCET_i}
 is the minimum WCET, and U_max = max{WCET_i/P_i} is the maximum
 utilization[12].

 3.2 Schedulability Analysis for Uniprocessor Systems
------------------------
+----------------------------------------------------

 If M=1 (uniprocessor system), or in case of partitioned scheduling (each
 real-time task is statically assigned to one and only one CPU), it is
@ -370,7 +378,9 @@ CONTENTS
 a task as WCET_i/min{D_i,P_i}, and EDF is able to respect all the deadlines
 of all the tasks running on a CPU if the sum of the densities of the tasks
 running on such a CPU is smaller or equal than 1:
+
 	sum(WCET_i / min{D_i, P_i}) <= 1
+
 It is important to notice that this condition is only sufficient, and not
 necessary: there are task sets that are schedulable, but do not respect the
 condition. For example, consider the task set {Task_1,Task_2} composed by
@ -379,7 +389,9 @@ CONTENTS
 (Task_1 is scheduled as soon as it is released, and finishes just in time
 to respect its deadline; Task_2 is scheduled immediately after Task_1, hence
 its response time cannot be larger than 50ms + 10ms = 60ms) even if
+
 	50 / min{50,100} + 10 / min{100, 100} = 50 / 50 + 10 / 100 = 1.1
+
 Of course it is possible to test the exact schedulability of tasks with
 D_i != P_i (checking a condition that is both sufficient and necessary),
 but this cannot be done by comparing the total utilization or density with
@ -399,7 +411,7 @@ CONTENTS
 4 Linux uses an admission test based on the tasks' utilizations.

 3.3 Schedulability Analysis for Multiprocessor Systems
------------------------
+------------------------------------------------------

 On multiprocessor systems with global EDF scheduling (non partitioned
 systems), a sufficient test for schedulability can not be based on the
@ -428,7 +440,9 @@ CONTENTS
 between total utilization (or density) and a fixed constant. If all tasks
 have D_i = P_i, a sufficient schedulability condition can be expressed in
 a simple way:
+
 	sum(WCET_i / P_i) <= M - (M - 1) · U_max
+
 where U_max = max{WCET_i / P_i}[10]. Notice that for U_max = 1,
 M - (M - 1) · U_max becomes M - M + 1 = 1 and this schedulability condition
 just confirms the Dhall's effect. A more complete survey of the literature
@ -447,7 +461,7 @@ CONTENTS
 the tasks are limited.

 3.4 Relationship with SCHED_DEADLINE Parameters
------------------------
+-----------------------------------------------

 Finally, it is important to understand the relationship between the
 SCHED_DEADLINE scheduling parameters described in Section 2 (runtime,
@ -473,6 +487,7 @@ CONTENTS
 this task, as it is not possible to respect its temporal constraints.

 References:
+
  1 - C. L. Liu and J. W. Layland. Scheduling algorithms for multiprogram-
      ming in a hard-real-time environment. Journal of the Association for
      Computing Machinery, 20(1), 1973.
@ -550,7 +565,7 @@ CONTENTS
 The interface used to control the CPU bandwidth that can be allocated
 to -deadline tasks is similar to the one already used for -rt
 tasks with real-time group scheduling (a.k.a. RT-throttling - see
- Documentation/scheduler/sched-rt-group.txt), and is based on readable/
+ Documentation/scheduler/sched-rt-group.rst), and is based on readable/
 writable control files located in procfs (for system wide settings).
 Notice that per-group settings (controlled through cgroupfs) are still not
 defined for -deadline tasks, because more discussion is needed in order to
@ -596,11 +611,13 @@ CONTENTS
 Specifying a periodic/sporadic task that executes for a given amount of
 runtime at each instance, and that is scheduled according to the urgency of
 its own timing constraints needs, in general, a way of declaring:
+
  - a (maximum/typical) instance execution time,
  - a minimum interval between consecutive instances,
  - a time constraint by which each instance must be completed.

 Therefore:
+
  * a new struct sched_attr, containing all the necessary fields is
    provided;
  * the new scheduling related syscalls that manipulate it, i.e.,
@ -658,21 +675,21 @@ CONTENTS
 ------------------------------------

 An example of a simple configuration (pin a -deadline task to CPU0)
- follows (rt-app is used to create a -deadline task).
+ follows (rt-app is used to create a -deadline task)::

- mkdir /dev/cpuset
- mount -t cgroup -o cpuset cpuset /dev/cpuset
- cd /dev/cpuset
- mkdir cpu0
- echo 0 > cpu0/cpuset.cpus
- echo 0 > cpu0/cpuset.mems
- echo 1 > cpuset.cpu_exclusive
- echo 0 > cpuset.sched_load_balance
- echo 1 > cpu0/cpuset.cpu_exclusive
- echo 1 > cpu0/cpuset.mem_exclusive
- echo $$ > cpu0/tasks
- rt-app -t 100000:10000:d:0 -D5 (it is now actually superfluous to specify
- task affinity)
+   mkdir /dev/cpuset
+   mount -t cgroup -o cpuset cpuset /dev/cpuset
+   cd /dev/cpuset
+   mkdir cpu0
+   echo 0 > cpu0/cpuset.cpus
+   echo 0 > cpu0/cpuset.mems
+   echo 1 > cpuset.cpu_exclusive
+   echo 0 > cpuset.sched_load_balance
+   echo 1 > cpu0/cpuset.cpu_exclusive
+   echo 1 > cpu0/cpuset.mem_exclusive
+   echo $$ > cpu0/tasks
+   rt-app -t 100000:10000:d:0 -D5 # it is now actually superfluous to specify
+				  # task affinity

 6. Future plans
 ===============
@ -711,7 +728,7 @@ Appendix A. Test suite
 rt-app is available at: https://github.com/scheduler-tools/rt-app.

 Thread parameters can be specified from the command line, with something like
- this:
+ this::

  # rt-app -t 100000:10000:d -t 150000:20000:f:10 -D5

@ -721,27 +738,27 @@ Appendix A. Test suite
 of 5 seconds.

 More interestingly, configurations can be described with a json file that
- can be passed as input to rt-app with something like this:
+ can be passed as input to rt-app with something like this::

  # rt-app my_config.json

 The parameters that can be specified with the second method are a superset
 of the command line options. Please refer to rt-app documentation for more
- details (<rt-app-sources>/doc/*.json).
+ details (`<rt-app-sources>/doc/*.json`).

 The second testing application is a modification of schedtool, called
 schedtool-dl, which can be used to setup SCHED_DEADLINE parameters for a
 certain pid/application. schedtool-dl is available at:
 https://github.com/scheduler-tools/schedtool-dl.git.

- The usage is straightforward:
+ The usage is straightforward::

  # schedtool -E -t 10000000:100000000 -e ./my_cpuhog_app

 With this, my_cpuhog_app is put to run inside a SCHED_DEADLINE reservation
 of 10ms every 100ms (note that parameters are expressed in microseconds).
 You can also use schedtool to create a reservation for an already running
- application, given that you know its pid:
+ application, given that you know its pid::

  # schedtool -E -t 10000000:100000000 my_app_pid

@ -750,43 +767,43 @@ Appendix B. Minimal main()

 We provide in what follows a simple (ugly) self-contained code snippet
 showing how SCHED_DEADLINE reservations can be created by a real-time
- application developer.
+ application developer::

- #define _GNU_SOURCE
- #include <unistd.h>
- #include <stdio.h>
- #include <stdlib.h>
- #include <string.h>
- #include <time.h>
- #include <linux/unistd.h>
- #include <linux/kernel.h>
- #include <linux/types.h>
- #include <sys/syscall.h>
- #include <pthread.h>
+   #define _GNU_SOURCE
+   #include <unistd.h>
+   #include <stdio.h>
+   #include <stdlib.h>
+   #include <string.h>
+   #include <time.h>
+   #include <linux/unistd.h>
+   #include <linux/kernel.h>
+   #include <linux/types.h>
+   #include <sys/syscall.h>
+   #include <pthread.h>

- #define gettid() syscall(__NR_gettid)
+   #define gettid() syscall(__NR_gettid)

- #define SCHED_DEADLINE	6
+   #define SCHED_DEADLINE	6

- /* XXX use the proper syscall numbers */
- #ifdef __x86_64__
- #define __NR_sched_setattr		314
- #define __NR_sched_getattr		315
- #endif
+   /* XXX use the proper syscall numbers */
+   #ifdef __x86_64__
+   #define __NR_sched_setattr		314
+   #define __NR_sched_getattr		315
+   #endif

- #ifdef __i386__
- #define __NR_sched_setattr		351
- #define __NR_sched_getattr		352
- #endif
+   #ifdef __i386__
+   #define __NR_sched_setattr		351
+   #define __NR_sched_getattr		352
+   #endif

- #ifdef __arm__
- #define __NR_sched_setattr		380
- #define __NR_sched_getattr		381
- #endif
+   #ifdef __arm__
+   #define __NR_sched_setattr		380
+   #define __NR_sched_getattr		381
+   #endif

- static volatile int done;
+   static volatile int done;

- struct sched_attr {
+   struct sched_attr {
 	__u32 size;

 	__u32 sched_policy;
@ -802,25 +819,25 @@ Appendix B. Minimal main()
 	__u64 sched_runtime;
 	__u64 sched_deadline;
 	__u64 sched_period;
- };
+   };

- int sched_setattr(pid_t pid,
+   int sched_setattr(pid_t pid,
 		  const struct sched_attr *attr,
 		  unsigned int flags)
- {
+   {
 	return syscall(__NR_sched_setattr, pid, attr, flags);
- }
+   }

- int sched_getattr(pid_t pid,
+   int sched_getattr(pid_t pid,
 		  struct sched_attr *attr,
 		  unsigned int size,
 		  unsigned int flags)
- {
+   {
 	return syscall(__NR_sched_getattr, pid, attr, size, flags);
- }
+   }

- void *run_deadline(void *data)
- {
+   void *run_deadline(void *data)
+   {
 	struct sched_attr attr;
 	int x = 0;
 	int ret;
@ -851,10 +868,10 @@ Appendix B. Minimal main()

 	printf("deadline thread dies [%ld]\n", gettid());
 	return NULL;
- }
+   }

- int main (int argc, char **argv)
- {
+   int main (int argc, char **argv)
+   {
 	pthread_t thread;

 	printf("main thread [%ld]\n", gettid());
@ -868,4 +885,4 @@ Appendix B. Minimal main()

 	printf("main dies [%ld]\n", gettid());
 	return 0;
- }
+   }
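
Sections 3.2 and 3.3 of sched-deadline.rst, touched by the hunks above, state two sufficient schedulability tests: the density test sum(WCET_i / min{D_i, P_i}) <= 1 for a single CPU, and sum(WCET_i / P_i) <= M - (M - 1) · U_max for global EDF when every task has D_i = P_i. The sketch below evaluates both. Tasks are tuples in the document's (WCET, D, P) order; the function names are hypothetical, and the second task set is an invented illustration rather than one taken from the document.

```python
def density_sum(tasks):
    """Sum of WCET_i / min(D_i, P_i) over (WCET, D, P) tuples."""
    return sum(wcet / min(d, p) for wcet, d, p in tasks)


def passes_density_test(tasks):
    # Sufficient, not necessary: a failing set may still be schedulable.
    return density_sum(tasks) <= 1.0


def passes_global_edf_test(tasks, m):
    # Only meaningful when D_i == P_i for every task.
    utils = [wcet / p for wcet, _d, p in tasks]
    return sum(utils) <= m - (m - 1) * max(utils)


# The counterexample from Section 3.2, in milliseconds.
doc_set = [(50, 50, 100), (10, 100, 100)]
doc_density = density_sum(doc_set)  # 50/50 + 10/100

# Hypothetical set with D = P on M = 2 CPUs: U_max = 1 shrinks the bound to 1.
heavy_set = [(100, 100, 100), (10, 100, 100)]
heavy_ok = passes_global_edf_test(heavy_set, m=2)
```

`doc_density` reproduces the 1.1 computed in the text, so the set fails the density test even though the document argues it is schedulable. `heavy_ok` is False, which matches the remark that for U_max = 1 the bound degenerates to 1.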
--- a/Documentation/scheduler/sched-design-CFS.rst
+++ b/Documentation/scheduler/sched-design-CFS.rst
@ -1,9 +1,10 @@
-                      =============
-                      CFS Scheduler
-                      =============
+=============
+CFS Scheduler
+=============


 1.  OVERVIEW
+============

 CFS stands for "Completely Fair Scheduler," and is the new "desktop" process
 scheduler implemented by Ingo Molnar and merged in Linux 2.6.23.  It is the
@ -27,6 +28,7 @@ is its actual runtime normalized to the total number of running tasks.


 2.  FEW IMPLEMENTATION DETAILS
+==============================

 In CFS the virtual runtime is expressed and tracked via the per-task
 p->se.vruntime (nanosec-unit) value.  This way, it's possible to accurately
@ -49,6 +51,7 @@ algorithm variants to recognize sleepers.


 3.  THE RBTREE
+==============

 CFS's design is quite radical: it does not use the old data structures for the
 runqueues, but it uses a time-ordered rbtree to build a "timeline" of future
@ -84,6 +87,7 @@ picked and the current task is preempted.


 4.  SOME FEATURES OF CFS
+========================

 CFS uses nanosecond granularity accounting and does not rely on any jiffies or
 other HZ detail.  Thus the CFS scheduler has no notion of "timeslices" in the
@ -113,6 +117,7 @@ result.


 5. Scheduling policies
+======================

 CFS implements three scheduling policies:

@ -137,6 +142,7 @@ SCHED_IDLE.


 6.  SCHEDULING CLASSES
+======================

 The new CFS scheduler has been designed in such a way to introduce "Scheduling
 Classes," an extensible hierarchy of scheduler modules.  These modules
@ -197,6 +203,7 @@ This is the (partial) list of the hooks:


 7.  GROUP SCHEDULER EXTENSIONS TO CFS
+=====================================

 Normally, the scheduler operates on individual tasks and strives to provide
 fair CPU time to each task.  Sometimes, it may be desirable to group tasks and
@ -219,7 +226,7 @@ SCHED_BATCH) tasks.

 When CONFIG_FAIR_GROUP_SCHED is defined, a "cpu.shares" file is created for each
 group created using the pseudo filesystem.  See example steps below to create
-task groups and modify their CPU share using the "cgroups" pseudo filesystem.
+task groups and modify their CPU share using the "cgroups" pseudo filesystem::

 	# mount -t tmpfs cgroup_root /sys/fs/cgroup
 	# mkdir /sys/fs/cgroup/cpu
--- a/Documentation/scheduler/sched-domains.rst
+++ b/Documentation/scheduler/sched-domains.rst
@ -1,3 +1,7 @@
+=================
+Scheduler Domains
+=================
+
 Each CPU has a "base" scheduling domain (struct sched_domain). The domain
 hierarchy is built from these base domains via the ->parent pointer. ->parent
 MUST be NULL terminated, and domain structures should be per-CPU as they are
@ -46,7 +50,9 @@ CPU's runqueue and the newly found busiest one and starts moving tasks from it
 to our runqueue. The exact number of tasks amounts to an imbalance previously
 computed while iterating over this sched domain's groups.

-*** Implementing sched domains ***
+Implementing sched domains
+==========================
+
 The "base" domain will "span" the first level of the hierarchy. In the case
 of SMT, you'll span all siblings of the physical CPU, with each group being
 a single virtual CPU.
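
The NULL-terminated ->parent chain described at the top of this file can be walked from a CPU's base domain upwards. The sketch below uses a stripped-down, hypothetical struct domain_sketch holding only a parent pointer and a CPU count; it is not the kernel's struct sched_domain, and domain_depth() is an illustrative name. The check that a parent spans at least as many CPUs as its child is a simplification of the superset rule for domain spans.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sched_domain: only what the walk needs. */
struct domain_sketch {
	struct domain_sketch *parent;	/* NULL at the top of the hierarchy */
	unsigned int span_weight;	/* number of CPUs this level spans */
};

/* Count the levels from a base domain up to the NULL-terminated top. */
static unsigned int domain_depth(const struct domain_sketch *base)
{
	unsigned int depth = 0;
	const struct domain_sketch *sd;

	for (sd = base; sd != NULL; sd = sd->parent) {
		/* A parent is expected to span at least as many CPUs as its child. */
		if (sd->parent != NULL)
			assert(sd->parent->span_weight >= sd->span_weight);
		depth++;
	}
	return depth;
}
```

For an SMT system with a two-sibling base domain under a four-CPU physical domain, the walk visits two levels before reaching NULL.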
--- a/Documentation/scheduler/sched-energy.rst
+++ b/Documentation/scheduler/sched-energy.rst
@ -1,6 +1,6 @@
-			   =======================
-			   Energy Aware Scheduling
-			   =======================
+=======================
+Energy Aware Scheduling
+=======================

 1. Introduction
 ---------------
@ -12,7 +12,7 @@ with a minimal impact on throughput. This document aims at providing an
 introduction on how EAS works, what are the main design decisions behind it, and
 details what is needed to get it to run.

-Before going any further, please note that at the time of writing:
+Before going any further, please note that at the time of writing::

   /!\ EAS does not support platforms with symmetric CPU topologies /!\

@ -33,13 +33,13 @@ To make it clear from the start:
 - power = energy/time = [joule/second] = [watt]

 The goal of EAS is to minimize energy, while still getting the job done. That
-is, we want to maximize:
+is, we want to maximize::

 	performance [inst/s]
 	--------------------
 	    power [W]

-which is equivalent to minimizing:
+which is equivalent to minimizing::

 	energy [J]
 	-----------
@ -97,7 +97,7 @@ domains can contain duplicate elements.

 Example 1.
    Let us consider a platform with 12 CPUs, split in 3 performance domains
-    (pd0, pd4 and pd8), organized as follows:
+    (pd0, pd4 and pd8), organized as follows::

 	          CPUs:   0 1 2 3 4 5 6 7 8 9 10 11
 	          PDs:   |--pd0--|--pd4--|---pd8---|
@ -108,6 +108,7 @@ Example 1.
    containing 6 CPUs. The two root domains are denoted rd1 and rd2 in the
    above figure. Since pd4 intersects with both rd1 and rd2, it will be
    present in the linked list '->pd' attached to each of them:
+
       * rd1->pd: pd0 -> pd4
       * rd2->pd: pd4 -> pd8

@ -159,9 +160,9 @@ Example 2.
    Each performance domain has three Operating Performance Points (OPPs).
    The CPU capacity and power cost associated with each OPP is listed in
    the Energy Model table. The util_avg of P is shown on the figures
-    below as 'PP'.
+    below as 'PP'::

-    CPU util.
+     CPU util.
      1024                 - - - - - - -              Energy Model
                                               +-----------+-------------+
                                               |  Little   |     Big     |
@ -188,8 +189,7 @@ Example 2.
    (which is coherent with the behaviour of the schedutil CPUFreq
    governor, see Section 6. for more details on this topic).

-    Case 1. P is migrated to CPU1
-    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    **Case 1. P is migrated to CPU1**::

      1024                 - - - - - - -

@ -207,8 +207,7 @@ Example 2.
            CPU0   CPU1     CPU2   CPU3


-    Case 2. P is migrated to CPU3
-    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    **Case 2. P is migrated to CPU3**::

      1024                 - - - - - - -

@ -226,8 +225,7 @@ Example 2.
            CPU0   CPU1     CPU2   CPU3


-    Case 3. P stays on prev_cpu / CPU 0
-    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    **Case 3. P stays on prev_cpu / CPU 0**::

      1024                 - - - - - - -

@ -324,7 +322,9 @@ hardware properties and on other features of the kernel being enabled. This
 section lists these dependencies and provides hints as to how they can be met.


-  6.1 - Asymmetric CPU topology
+6.1 - Asymmetric CPU topology
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+

 As mentioned in the introduction, EAS is only supported on platforms with
 asymmetric CPU topologies for now. This requirement is checked at run-time by
@ -347,7 +347,8 @@ significant savings on SMP platforms have been observed yet. This restriction
 could be amended in the future if proven otherwise.


-  6.2 - Energy Model presence
+6.2 - Energy Model presence
+^^^^^^^^^^^^^^^^^^^^^^^^^^^

 EAS uses the EM of a platform to estimate the impact of scheduling decisions on
 energy. So, your platform must provide power cost tables to the EM framework in
@ -358,7 +359,8 @@ Please also note that the scheduling domains need to be re-built after the
 EM has been registered in order to start EAS.


-  6.3 - Energy Model complexity
+6.3 - Energy Model complexity
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 The task wake-up path is very latency-sensitive. When the EM of a platform is
 too complex (too many CPUs, too many performance domains, too many performance
@ -388,7 +390,8 @@ two possible options:
       hence enabling it to cope with larger EMs in reasonable time.


-  6.4 - Schedutil governor
+6.4 - Schedutil governor
+^^^^^^^^^^^^^^^^^^^^^^^^

 EAS tries to predict at which OPP the CPUs will be running in the near future
 in order to estimate their energy consumption. To do so, it is assumed that OPPs
@ -405,7 +408,8 @@ frequency requests and energy predictions.
 Using EAS with any other governor than schedutil is not supported.


-  6.5 Scale-invariant utilization signals
+6.5 Scale-invariant utilization signals
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 In order to make accurate prediction across CPUs and for all performance
 states, EAS needs frequency-invariant and CPU-invariant PELT signals. These can
@ -416,7 +420,8 @@ Using EAS on a platform that doesn't implement these two callbacks is not
 supported.


-  6.6 Multithreading (SMT)
+6.6 Multithreading (SMT)
+^^^^^^^^^^^^^^^^^^^^^^^^

 EAS in its current form is SMT unaware and is not able to leverage
 multithreaded hardware to save energy. EAS considers threads as independent
--- a/Documentation/scheduler/sched-nice-design.rst
+++ b/Documentation/scheduler/sched-nice-design.rst
@ -1,3 +1,7 @@
+=====================
+Scheduler Nice Design
+=====================
+
 This document explains the thinking about the revamped and streamlined
 nice-levels implementation in the new Linux scheduler.

@ -14,7 +18,7 @@ much stronger than they were before in 2.4 (and people were happy about
 that change), and we also intentionally calibrated the linear timeslice
 rule so that nice +19 level would be _exactly_ 1 jiffy. To better
 understand it, the timeslice graph went like this (cheesy ASCII art
-alert!):
+alert!)::


                   A
--- a/Documentation/scheduler/sched-rt-group.rst
+++ b/Documentation/scheduler/sched-rt-group.rst
@ -1,18 +1,18 @@
-				Real-Time group scheduling
-				--------------------------
+==========================
+Real-Time group scheduling
+==========================

-CONTENTS
-========
+.. CONTENTS

-0. WARNING
-1. Overview
-  1.1 The problem
-  1.2 The solution
-2. The interface
-  2.1 System-wide settings
-  2.2 Default behaviour
-  2.3 Basis for grouping tasks
-3. Future plans
+   0. WARNING
+   1. Overview
+     1.1 The problem
+     1.2 The solution
+   2. The interface
+     2.1 System-wide settings
+     2.2 Default behaviour
+     2.3 Basis for grouping tasks
+   3. Future plans


 0. WARNING
@ -159,9 +159,11 @@ Consider two sibling groups A and B; both have 50% bandwidth, but A's
 period is twice the length of B's.

 * group A: period=100000us, runtime=50000us
+
 	- this runs for 0.05s once every 0.1s

 * group B: period= 50000us, runtime=25000us
+
 	- this runs for 0.025s twice every 0.1s (or once every 0.05 sec).

 This means that currently a while (1) loop in A will run for the full period of
--- a/Documentation/scheduler/sched-stats.rst
+++ b/Documentation/scheduler/sched-stats.rst
@ -1,3 +1,7 @@
+====================
+Scheduler Statistics
+====================
+
 Version 15 of schedstats dropped counters for some sched_yield:
 yld_exp_empty, yld_act_empty and yld_both_empty. Otherwise, it is
 identical to version 14.
@ -35,19 +39,23 @@ CPU statistics
 cpu<N> 1 2 3 4 5 6 7 8 9

 First field is a sched_yield() statistic:
+
     1) # of times sched_yield() was called

 Next three are schedule() statistics:
+
     2) This field is a legacy array expiration count field used in the O(1)
 	scheduler. We kept it for ABI compatibility, but it is always set to zero.
     3) # of times schedule() was called
     4) # of times schedule() left the processor idle

 Next two are try_to_wake_up() statistics:
+
     5) # of times try_to_wake_up() was called
     6) # of times try_to_wake_up() was called to wake up the local cpu

 Next three are statistics describing scheduling latency:
+
     7) sum of all time spent running by tasks on this processor (in jiffies)
     8) sum of all time spent waiting to run by tasks on this processor (in
        jiffies)
@ -67,24 +75,23 @@ The first field is a bit mask indicating what cpus this domain operates over.
 The next 24 are a variety of load_balance() statistics grouped into types
 of idleness (idle, busy, and newly idle):

-     1) # of times in this domain load_balance() was called when the
+    1)  # of times in this domain load_balance() was called when the
        cpu was idle
-     2) # of times in this domain load_balance() checked but found
+    2)  # of times in this domain load_balance() checked but found
        the load did not require balancing when the cpu was idle
-     3) # of times in this domain load_balance() tried to move one or
+    3)  # of times in this domain load_balance() tried to move one or
        more tasks and failed, when the cpu was idle
-     4) sum of imbalances discovered (if any) with each call to
+    4)  sum of imbalances discovered (if any) with each call to
        load_balance() in this domain when the cpu was idle
-     5) # of times in this domain pull_task() was called when the cpu
+    5)  # of times in this domain pull_task() was called when the cpu
        was idle
-     6) # of times in this domain pull_task() was called even though
+    6)  # of times in this domain pull_task() was called even though
        the target task was cache-hot when idle
-     7) # of times in this domain load_balance() was called but did
+    7)  # of times in this domain load_balance() was called but did
        not find a busier queue while the cpu was idle
-     8) # of times in this domain a busier queue was found while the
+    8)  # of times in this domain a busier queue was found while the
        cpu was idle but no busier group was found
-
-     9) # of times in this domain load_balance() was called when the
+    9)  # of times in this domain load_balance() was called when the
        cpu was busy
    10) # of times in this domain load_balance() checked but found the
        load did not require balancing when busy
@ -117,21 +124,25 @@ of idleness (idle, busy, and newly idle):
        was just becoming idle but no busier group was found

   Next three are active_load_balance() statistics:
+
    25) # of times active_load_balance() was called
    26) # of times active_load_balance() tried to move a task and failed
    27) # of times active_load_balance() successfully moved a task

   Next three are sched_balance_exec() statistics:
+
    28) sbe_cnt is not used
    29) sbe_balanced is not used
    30) sbe_pushed is not used

   Next three are sched_balance_fork() statistics:
+
    31) sbf_cnt is not used
    32) sbf_balanced is not used
    33) sbf_pushed is not used

   Next three are try_to_wake_up() statistics:
+
    34) # of times in this domain try_to_wake_up() awoke a task that
        last ran on a different cpu in this domain
    35) # of times in this domain try_to_wake_up() moved a task to the
@ -139,10 +150,11 @@ of idleness (idle, busy, and newly idle):
    36) # of times in this domain try_to_wake_up() started passive balancing

 /proc/<pid>/schedstat
----------------
+---------------------
 schedstats also adds a new /proc/<pid>/schedstat file to include some of
 the same information on a per-process level.  There are three fields in
 this file correlating for that process to:
+
     1) time spent on the cpu
     2) time spent waiting on a runqueue
     3) # of timeslices run on this cpu
@ -151,4 +163,5 @@ A program could be easily written to make use of these extra fields to
 report on how well a particular process or set of processes is faring
 under the scheduler's policies.  A simple version of such a program is
 available at
+
    http://eaglet.rain.com/rick/linux/schedstat/v12/latency.c
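
As a small illustration of the three per-process fields listed above, the sketch below parses one line in the /proc/<pid>/schedstat format. The struct and function names are hypothetical, and reading the file itself is left out so that the parsing step can be shown in isolation. It is not the latency.c program referenced above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for the three fields of /proc/<pid>/schedstat. */
struct pid_schedstat_sketch {
	unsigned long long run_time;	/* 1) time spent on the cpu */
	unsigned long long wait_time;	/* 2) time spent waiting on a runqueue */
	unsigned long long timeslices;	/* 3) # of timeslices run on this cpu */
};

/* Parse one schedstat line; returns 0 on success, -1 on malformed input. */
static int parse_pid_schedstat(const char *line, struct pid_schedstat_sketch *out)
{
	assert(line != NULL && out != NULL);

	if (sscanf(line, "%llu %llu %llu", &out->run_time, &out->wait_time,
		   &out->timeslices) != 3)
		return -1;
	return 0;
}
```

A caller would typically read the single line of the file with fgets() and pass the buffer to parse_pid_schedstat(); sampling it twice and subtracting gives the run and wait time accumulated over the interval.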
--- a/Documentation/scheduler/text_files.rst
+++ b/Documentation/scheduler/text_files.rst
@ -0,0 +1,5 @@
+Scheduler pelt c program
+------------------------
+
+.. literalinclude:: sched-pelt.c
+    :language: c
--- a/Documentation/vm/numa.rst
+++ b/Documentation/vm/numa.rst
@ -99,7 +99,7 @@ Local allocation will tend to keep subsequent access to the allocated memory
 as long as the task on whose behalf the kernel allocated some memory does not
 later migrate away from that memory.  The Linux scheduler is aware of the
 NUMA topology of the platform--embodied in the "scheduling domains" data
-structures [see Documentation/scheduler/sched-domains.txt]--and the scheduler
+structures [see Documentation/scheduler/sched-domains.rst]--and the scheduler
 attempts to minimize task migration to distant scheduling domains.  However,
 the scheduler does not take a task's NUMA footprint into account directly.
 Thus, under sufficient imbalance, tasks can migrate between nodes, remote
--- a/init/Kconfig
+++ b/init/Kconfig
@ -734,7 +734,7 @@ menuconfig CGROUPS
 	  use with process control subsystems such as Cpusets, CFS, memory
 	  controls or device isolation.
 	  See
-		- Documentation/scheduler/sched-design-CFS.txt	(CFS)
+		- Documentation/scheduler/sched-design-CFS.rst	(CFS)
 		- Documentation/cgroup-v1/ (features for grouping, isolation
 					  and resource control)

@ -835,7 +835,7 @@ config CFS_BANDWIDTH
 	  tasks running within the fair group scheduler.  Groups with no limit
 	  set are considered to be unconstrained and will run with no
 	  restriction.
-	  See Documentation/scheduler/sched-bwc.txt for more information.
+	  See Documentation/scheduler/sched-bwc.rst for more information.

 config RT_GROUP_SCHED
 	bool "Group scheduling for SCHED_RR/FIFO"
@ -846,7 +846,7 @@ config RT_GROUP_SCHED
 	  to task groups. If enabled, it will also make it impossible to
 	  schedule realtime tasks for non-root users until you allocate
 	  realtime bandwidth for them.
-	  See Documentation/scheduler/sched-rt-group.txt for more information.
+	  See Documentation/scheduler/sched-rt-group.rst for more information.

 endif #CGROUP_SCHED

--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@ -726,7 +726,7 @@ static void replenish_dl_entity(struct sched_dl_entity *dl_se,
 * refill the runtime and set the deadline a period in the future,
 * because keeping the current (absolute) deadline of the task would
 * result in breaking guarantees promised to other tasks (refer to
- * Documentation/scheduler/sched-deadline.txt for more information).
+ * Documentation/scheduler/sched-deadline.rst for more information).
 *
 * This function returns true if:
 *