mirror of
https://github.com/torvalds/linux.git
synced 2024-11-21 19:41:42 +00:00
srcu: Clarify comments on memory barrier "E"
There is an smp_mb() named "E" in srcu_flip() immediately before the
increment (flip) of the srcu_struct structure's ->srcu_idx.

The purpose of E is to order the preceding scan's read of lock counters
against the flipping of the ->srcu_idx, in order to prevent new readers
from continuing to use the old ->srcu_idx value, which might needlessly
extend the grace period.

However, this ordering is already enforced because of the control
dependency between the preceding scan and the ->srcu_idx flip.
This control dependency exists because atomic_long_read() is used
to scan the counts, because WRITE_ONCE() is used to flip ->srcu_idx,
and because ->srcu_idx is not flipped until the ->srcu_lock_count[] and
->srcu_unlock_count[] counts match.  And such a match cannot happen when
there is an in-flight reader that started before the flip (observation
courtesy Mathieu Desnoyers).

The litmus test below (courtesy of Frederic Weisbecker, with changes
for ctrldep by Boqun and Joel) shows this:

C srcu
(*
 * bad condition: P0's first scan (SCAN1) saw P1's idx=0 LOCK count inc, though P1 saw flip.
 *
 * So basically, the ->po ordering on both P0 and P1 is enforced via ->ppo
 * (control deps) on both sides, and both P0 and P1 are interconnected by ->rf
 * relations. Combining the ->ppo with ->rf, a cycle is impossible.
 *)

{}

// updater
P0(int *IDX, int *LOCK0, int *UNLOCK0, int *LOCK1, int *UNLOCK1)
{
	int lock1;
	int unlock1;
	int lock0;
	int unlock0;

	// SCAN1
	unlock1 = READ_ONCE(*UNLOCK1);
	smp_mb(); // A
	lock1 = READ_ONCE(*LOCK1);

	// FLIP
	if (lock1 == unlock1) {   // Control dep
		smp_mb(); // E    // Remove E and still passes.
		WRITE_ONCE(*IDX, 1);
		smp_mb(); // D

		// SCAN2
		unlock0 = READ_ONCE(*UNLOCK0);
		smp_mb(); // A
		lock0 = READ_ONCE(*LOCK0);
	}
}

// reader
P1(int *IDX, int *LOCK0, int *UNLOCK0, int *LOCK1, int *UNLOCK1)
{
	int tmp;
	int idx1;
	int idx2;

	// 1st reader
	idx1 = READ_ONCE(*IDX);
	if (idx1 == 0) {         // Control dep
		tmp = READ_ONCE(*LOCK0);
		WRITE_ONCE(*LOCK0, tmp + 1);
		smp_mb(); /* B and C */
		tmp = READ_ONCE(*UNLOCK0);
		WRITE_ONCE(*UNLOCK0, tmp + 1);
	} else {
		tmp = READ_ONCE(*LOCK1);
		WRITE_ONCE(*LOCK1, tmp + 1);
		smp_mb(); /* B and C */
		tmp = READ_ONCE(*UNLOCK1);
		WRITE_ONCE(*UNLOCK1, tmp + 1);
	}
}

exists (0:lock1=1 /\ 1:idx1=1)

More complicated litmus tests with multiple SRCU readers also show that
memory barrier E is not needed.

This commit therefore clarifies the comment on memory barrier E.

Why not also remove that redundant smp_mb()?

Because control dependencies are quite fragile due to their not being
recognized by most compilers and tools.  Control dependencies therefore
exact an ongoing maintenance burden, and such a burden cannot be justified
in this slowpath.  Therefore, that smp_mb() stays until such time as
its overhead becomes a measurable problem in a real workload running on
a real production system, or until such time as compilers start paying
attention to this sort of control dependency.

Co-developed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Co-developed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Co-developed-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
parent e5ad8b68f8
commit 754aa6427e
@@ -1085,16 +1085,36 @@ static bool try_check_zero(struct srcu_struct *ssp, int idx, int trycount)
 static void srcu_flip(struct srcu_struct *ssp)
 {
 	/*
-	 * Ensure that if this updater saw a given reader's increment
-	 * from __srcu_read_lock(), that reader was using an old value
-	 * of ->srcu_idx.  Also ensure that if a given reader sees the
-	 * new value of ->srcu_idx, this updater's earlier scans cannot
-	 * have seen that reader's increments (which is OK, because this
-	 * grace period need not wait on that reader).
+	 * Because the flip of ->srcu_idx is executed only if the
+	 * preceding call to srcu_readers_active_idx_check() found that
+	 * the ->srcu_unlock_count[] and ->srcu_lock_count[] sums matched
+	 * and because that summing uses atomic_long_read(), there is
+	 * ordering due to a control dependency between that summing and
+	 * the WRITE_ONCE() in this call to srcu_flip().  This ordering
+	 * ensures that if this updater saw a given reader's increment from
+	 * __srcu_read_lock(), that reader was using a value of ->srcu_idx
+	 * from before the previous call to srcu_flip(), which should be
+	 * quite rare.  This ordering thus helps forward progress because
+	 * the grace period could otherwise be delayed by additional
+	 * calls to __srcu_read_lock() using that old (soon to be new)
+	 * value of ->srcu_idx.
+	 *
+	 * This sum-equality check and ordering also ensures that if
+	 * a given call to __srcu_read_lock() uses the new value of
+	 * ->srcu_idx, this updater's earlier scans cannot have seen
+	 * that reader's increments, which is all to the good, because
+	 * this grace period need not wait on that reader.  After all,
+	 * if those earlier scans had seen that reader, there would have
+	 * been a sum mismatch and this code would not be reached.
+	 *
+	 * This means that the following smp_mb() is redundant, but
+	 * it stays until either (1) Compilers learn about this sort of
+	 * control dependency or (2) Some production workload running on
+	 * a production system is unduly delayed by this slowpath smp_mb().
 	 */
 	smp_mb(); /* E */  /* Pairs with B and C. */
 
-	WRITE_ONCE(ssp->srcu_idx, ssp->srcu_idx + 1);
+	WRITE_ONCE(ssp->srcu_idx, ssp->srcu_idx + 1); // Flip the counter.
 
 	/*
 	 * Ensure that if the updater misses an __srcu_read_unlock()
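The scan-then-flip pattern described in the commit message can be sketched in user space with C11 atomics. This is only an illustration under stated assumptions, not the kernel implementation: the `toy_srcu` structure and every `toy_*` function are hypothetical names, single counters stand in for the kernel's per-CPU sums, and the scan and flip are folded into one function, whereas in the kernel the scan happens in the caller before srcu_flip() runs. The C11 memory model does not treat a relaxed load feeding a branch that guards a relaxed store as ordered, so in this sketch the fence marked E is what provides the ordering. That loosely mirrors the commit's point that control dependencies are not recognized by most compilers and tools.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the srcu_struct fields discussed above. */
struct toy_srcu {
	atomic_long lock_count[2];	/* stands in for ->srcu_lock_count[] sums */
	atomic_long unlock_count[2];	/* stands in for ->srcu_unlock_count[] sums */
	atomic_long idx;		/* stands in for ->srcu_idx */
};

static void toy_init(struct toy_srcu *sp)
{
	for (int i = 0; i < 2; i++) {
		atomic_init(&sp->lock_count[i], 0);
		atomic_init(&sp->unlock_count[i], 0);
	}
	atomic_init(&sp->idx, 0);
}

/* Reader entry: sample the index, bump that index's lock count, full fence. */
static int toy_read_lock(struct toy_srcu *sp)
{
	int idx = (int)(atomic_load_explicit(&sp->idx, memory_order_relaxed) & 1);

	atomic_fetch_add_explicit(&sp->lock_count[idx], 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* plays the role of B */
	return idx;
}

/* Reader exit: full fence, then bump the unlock count for the same index. */
static void toy_read_unlock(struct toy_srcu *sp, int idx)
{
	assert(idx == 0 || idx == 1);
	atomic_thread_fence(memory_order_seq_cst);	/* plays the role of C */
	atomic_fetch_add_explicit(&sp->unlock_count[idx], 1, memory_order_relaxed);
}

/* Scan one index: unlock count first, full fence, then lock count. */
static bool toy_scan_idle(struct toy_srcu *sp, int idx)
{
	long unlocks = atomic_load_explicit(&sp->unlock_count[idx],
					    memory_order_relaxed);

	atomic_thread_fence(memory_order_seq_cst);	/* plays the role of A */
	return atomic_load_explicit(&sp->lock_count[idx],
				    memory_order_relaxed) == unlocks;
}

/* Flip the index only if the scan of the inactive index found matching counts. */
static bool toy_try_flip(struct toy_srcu *sp)
{
	long cur = atomic_load_explicit(&sp->idx, memory_order_relaxed);
	int inactive = (int)((cur & 1) ^ 1);

	if (!toy_scan_idle(sp, inactive))	/* the branch carries the control dependency */
		return false;
	atomic_thread_fence(memory_order_seq_cst);	/* plays the role of E */
	atomic_store_explicit(&sp->idx, cur + 1, memory_order_relaxed);
	return true;
}
```

Called from a single thread, toy_try_flip() returns false while a reader that entered on the inactive index has not yet called toy_read_unlock(), and succeeds once the counts match again. The sketch only exercises the counting logic; it makes no claim about which concurrent outcomes are forbidden, which is what the litmus test in the commit message addresses.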