linux/arch
Ross Zwisler d9dc64f30a x86/asm: Add support for the CLWB instruction
Add support for the new CLWB (cache line write back)
instruction.  This instruction was announced in the document
"Intel Architecture Instruction Set Extensions Programming
Reference" with reference number 319433-022.

  https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf

The CLWB instruction is used to write back the contents of
dirtied cache lines to memory without evicting the cache lines
from the processor's cache hierarchy.  This should be used in
favor of clflushopt or clflush in cases where you require the
cache line to be written to memory but plan to access the data
again in the near future.

One of the main use cases for this is with persistent memory
where CLWB can be used with PCOMMIT to ensure that data has been
accepted to memory and is durable on the DIMM.

This function shows how to properly use CLWB/CLFLUSHOPT/CLFLUSH
and PCOMMIT with appropriate fencing:

void flush_and_commit_buffer(void *vaddr, unsigned int size)
{
	void *vend = vaddr + size - 1;

	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
		clwb(vaddr);

	/* Flush any possible final partial cacheline */
	clwb(vend);

	/*
	 * Use SFENCE to order CLWB/CLFLUSHOPT/CLFLUSH cache flushes.
	 * (MFENCE via mb() also works)
	 */
	wmb();

	/* PCOMMIT and the required SFENCE for ordering */
	pcommit_sfence();
}

After this function completes the data pointed to by vaddr is
has been accepted to memory and will be durable if the vaddr
points to persistent memory.

Regarding the details of how the alternatives assembly is set
up, we need one additional byte at the beginning of the CLFLUSH
so that we can flip it into a CLFLUSHOPT by changing that byte
into a 0x66 prefix.  Two options are to either insert a 1 byte
ASM_NOP1, or to add a 1 byte NOP_DS_PREFIX.  Both have no
functional effect with the plain CLFLUSH, but I've been told
that executing a CLFLUSH + prefix should be faster than
executing a CLFLUSH + NOP.

We had to hard code the assembly for CLWB because, lacking the
ability to assemble the CLWB instruction itself, the next
closest thing is to have an xsaveopt instruction with a 0x66
prefix.  Unfortunately XSAVEOPT itself is also relatively new,
and isn't included by all the GCC versions that the kernel needs
to support.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Acked-by: Borislav Petkov <bp@suse.de>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1422377631-8986-3-git-send-email-ross.zwisler@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-03 06:56:38 +02:00
..
alpha asm-generic: uaccess.h cleanup 2015-02-18 10:02:24 -08:00
arc ARC: Fix thread_saved_pc() 2015-02-27 10:59:34 +05:30
arm Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm 2015-03-21 10:03:22 -07:00
arm64 arm64 fixes: 2015-03-21 10:24:10 -07:00
avr32 asm-generic: uaccess.h cleanup 2015-02-18 10:02:24 -08:00
blackfin Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2015-02-21 12:59:04 -08:00
c6x arch/c6x/include/asm/pgtable.h: define dummy pgprot_writecombine for !MMU 2015-03-12 18:46:08 -07:00
cris CRIS changes for 3.20 2015-02-15 18:02:02 -08:00
frv mm: add missing __PAGETABLE_{PUD,PMD}_FOLDED defines 2015-02-28 09:57:51 -08:00
hexagon all arches, signal: move restart_block to struct task_struct 2015-02-12 18:54:12 -08:00
ia64 asm-generic: uaccess.h cleanup 2015-02-18 10:02:24 -08:00
m32r mm: add missing __PAGETABLE_{PUD,PMD}_FOLDED defines 2015-02-28 09:57:51 -08:00
m68k mm: add missing __PAGETABLE_{PUD,PMD}_FOLDED defines 2015-02-28 09:57:51 -08:00
metag metag: Fix KSTK_EIP() and KSTK_ESP() macros 2015-02-24 12:54:21 +00:00
microblaze microblaze: Fix syscall error recovery for invalid syscall IDs 2015-03-04 15:12:27 +01:00
mips KVM: MIPS: Enable after disabling interrupt 2015-03-02 19:18:12 -03:00
mn10300 mm: add missing __PAGETABLE_{PUD,PMD}_FOLDED defines 2015-02-28 09:57:51 -08:00
nios2 nios2: mm: do not invoke OOM killer on kernel fault OOM 2015-03-16 15:35:25 +08:00
openrisc asm-generic: uaccess.h cleanup 2015-02-18 10:02:24 -08:00
parisc mm: add missing __PAGETABLE_{PUD,PMD}_FOLDED defines 2015-02-28 09:57:51 -08:00
powerpc powerpc/iommu: Remove IOMMU device references via bus notifier 2015-03-04 13:19:33 +11:00
s390 kvm: move advertising of KVM_CAP_IRQFD to common code 2015-03-10 21:18:59 -03:00
score all arches, signal: move restart_block to struct task_struct 2015-02-12 18:54:12 -08:00
sh asm-generic: uaccess.h cleanup 2015-02-18 10:02:24 -08:00
sparc sparc: Fix /proc/kcore 2015-03-18 19:15:28 -07:00
tile tile: use %*pb[l] to print bitmaps including cpumasks and nodemasks 2015-02-13 21:21:37 -08:00
um all arches, signal: move restart_block to struct task_struct 2015-02-12 18:54:12 -08:00
unicore32 mm: vmalloc: pass additional vm_flags to __vmalloc_node_range() 2015-02-13 21:21:42 -08:00
x86 x86/asm: Add support for the CLWB instruction 2015-04-03 06:56:38 +02:00
xtensa asm-generic: uaccess.h cleanup 2015-02-18 10:02:24 -08:00
.gitignore
Kconfig