Merge tag 'arm64-mmiowb' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull mmiowb removal from Will Deacon: "Remove Mysterious Macro Intended to Obscure Weird Behaviours (mmiowb()) Remove mmiowb() from the kernel memory barrier API and instead, for architectures that need it, hide the barrier inside spin_unlock() when MMIO has been performed inside the critical section. The only relatively recent changes have been addressing review comments on the documentation, which is in a much better shape thanks to the efforts of Ben and Ingo. I was initially planning to split this into two pull requests so that you could run the coccinelle script yourself, however it's been plain sailing in linux-next so I've just included the whole lot here to keep things simple" * tag 'arm64-mmiowb' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (23 commits) docs/memory-barriers.txt: Update I/O section to be clearer about CPU vs thread docs/memory-barriers.txt: Fix style, spacing and grammar in I/O section arch: Remove dummy mmiowb() definitions from arch code net/ethernet/silan/sc92031: Remove stale comment about mmiowb() i40iw: Redefine i40iw_mmiowb() to do nothing scsi/qla1280: Remove stale comment about mmiowb() drivers: Remove explicit invocations of mmiowb() drivers: Remove useless trailing comments from mmiowb() invocations Documentation: Kill all references to mmiowb() riscv/mmiowb: Hook up mmwiob() implementation to asm-generic code powerpc/mmiowb: Hook up mmwiob() implementation to asm-generic code ia64/mmiowb: Add unconditional mmiowb() to arch_spin_unlock() mips/mmiowb: Add unconditional mmiowb() to arch_spin_unlock() sh/mmiowb: Add unconditional mmiowb() to arch_spin_unlock() m68k/io: Remove useless definition of mmiowb() nds32/io: Remove useless definition of mmiowb() x86/io: Remove useless definition of mmiowb() arm64/io: Remove useless definition of mmiowb() ARM/io: Remove useless definition of mmiowb() mmiowb: Hook up mmiowb helpers to spinlocks and generic I/O accessors ...
This commit is contained in:
@@ -103,51 +103,6 @@ continuing execution::
|
||||
ha->flags.ints_enabled = 0;
|
||||
}
|
||||
|
||||
In addition to write posting, on some large multiprocessing systems
|
||||
(e.g. SGI Challenge, Origin and Altix machines) posted writes won't be
|
||||
strongly ordered coming from different CPUs. Thus it's important to
|
||||
properly protect parts of your driver that do memory-mapped writes with
|
||||
locks and use the :c:func:`mmiowb()` to make sure they arrive in the
|
||||
order intended. Issuing a regular readX() will also ensure write ordering,
|
||||
but should only be used when the
|
||||
driver has to be sure that the write has actually arrived at the device
|
||||
(not that it's simply ordered with respect to other writes), since a
|
||||
full readX() is a relatively expensive operation.
|
||||
|
||||
Generally, one should use :c:func:`mmiowb()` prior to releasing a spinlock
|
||||
that protects regions using :c:func:`writeb()` or similar functions that
|
||||
aren't surrounded by readb() calls, which will ensure ordering
|
||||
and flushing. The following pseudocode illustrates what might occur if
|
||||
write ordering isn't guaranteed via :c:func:`mmiowb()` or one of the
|
||||
readX() functions::
|
||||
|
||||
CPU A: spin_lock_irqsave(&dev_lock, flags)
|
||||
CPU A: ...
|
||||
CPU A: writel(newval, ring_ptr);
|
||||
CPU A: spin_unlock_irqrestore(&dev_lock, flags)
|
||||
...
|
||||
CPU B: spin_lock_irqsave(&dev_lock, flags)
|
||||
CPU B: writel(newval2, ring_ptr);
|
||||
CPU B: ...
|
||||
CPU B: spin_unlock_irqrestore(&dev_lock, flags)
|
||||
|
||||
In the case above, newval2 could be written to ring_ptr before newval.
|
||||
Fixing it is easy though::
|
||||
|
||||
CPU A: spin_lock_irqsave(&dev_lock, flags)
|
||||
CPU A: ...
|
||||
CPU A: writel(newval, ring_ptr);
|
||||
CPU A: mmiowb(); /* ensure no other writes beat us to the device */
|
||||
CPU A: spin_unlock_irqrestore(&dev_lock, flags)
|
||||
...
|
||||
CPU B: spin_lock_irqsave(&dev_lock, flags)
|
||||
CPU B: writel(newval2, ring_ptr);
|
||||
CPU B: ...
|
||||
CPU B: mmiowb();
|
||||
CPU B: spin_unlock_irqrestore(&dev_lock, flags)
|
||||
|
||||
See tg3.c for a real world example of how to use :c:func:`mmiowb()`
|
||||
|
||||
PCI ordering rules also guarantee that PIO read responses arrive after any
|
||||
outstanding DMA writes from that bus, since for some devices the result of
|
||||
a readb() call may signal to the driver that a DMA transaction is
|
||||
|
||||
@@ -132,10 +132,6 @@ precludes passing these pages to userspace.
|
||||
P2P memory is also technically IO memory but should never have any side
|
||||
effects behind it. Thus, the order of loads and stores should not be important
|
||||
and ioreadX(), iowriteX() and friends should not be necessary.
|
||||
However, as the memory is not cache coherent, if access ever needs to
|
||||
be protected by a spinlock then :c:func:`mmiowb()` must be used before
|
||||
unlocking the lock. (See ACQUIRES VS I/O ACCESSES in
|
||||
Documentation/memory-barriers.txt)
|
||||
|
||||
|
||||
P2P DMA Support Library
|
||||
|
||||
Reference in New Issue
Block a user