linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-03 09:31:26 +00:00

Ross Zwisler d9dc64f30a x86/asm: Add support for the CLWB instruction

Add support for the new CLWB (cache line write back) instruction. This instruction was announced in the document "Intel Architecture Instruction Set Extensions Programming Reference" with reference number 319433-022.

  https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf

The CLWB instruction is used to write back the contents of dirtied cache lines to memory without evicting the cache lines from the processor's cache hierarchy. This should be used in favor of clflushopt or clflush in cases where you require the cache line to be written to memory but plan to access the data again in the near future.

One of the main use cases for this is with persistent memory where CLWB can be used with PCOMMIT to ensure that data has been accepted to memory and is durable on the DIMM.

This function shows how to properly use CLWB/CLFLUSHOPT/CLFLUSH and PCOMMIT with appropriate fencing:

	void flush_and_commit_buffer(void *vaddr, unsigned int size)
	{
		void *vend = vaddr + size - 1;

		for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
			clwb(vaddr);

		/* Flush any possible final partial cacheline */
		clwb(vend);

		/*
		 * Use SFENCE to order CLWB/CLFLUSHOPT/CLFLUSH cache flushes.
		 * (MFENCE via mb() also works)
		 */
		wmb();

		/* PCOMMIT and the required SFENCE for ordering */
		pcommit_sfence();
	}

After this function completes, the data pointed to by vaddr has been accepted to memory and will be durable if vaddr points to persistent memory.

Regarding the details of how the alternatives assembly is set up, we need one additional byte at the beginning of the CLFLUSH so that we can flip it into a CLFLUSHOPT by changing that byte into a 0x66 prefix. Two options are to either insert a 1 byte ASM_NOP1, or to add a 1 byte NOP_DS_PREFIX. Both have no functional effect with the plain CLFLUSH, but I've been told that executing a CLFLUSH + prefix should be faster than executing a CLFLUSH + NOP.

We had to hard code the assembly for CLWB because, lacking the ability to assemble the CLWB instruction itself, the next closest thing is to have an xsaveopt instruction with a 0x66 prefix. Unfortunately XSAVEOPT itself is also relatively new, and isn't included by all the GCC versions that the kernel needs to support.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Acked-by: Borislav Petkov <bp@suse.de>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1422377631-8986-3-git-send-email-ross.zwisler@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

2015-04-03 06:56:38 +02:00
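
The prefix trick described in the commit message can be illustrated at the byte level. The sketch below is not the kernel's alternatives code; it is a standalone user-space model, and the names encode_0fae, classify, and the MODRM_* constants are hypothetical, introduced only for illustration. It assumes the documented encodings: CLFLUSH is 0F AE /7, CLFLUSHOPT is 66 0F AE /7, XSAVEOPT is 0F AE /6, and CLWB is 66 0F AE /6, each with a memory operand. The 0x3e value is the DS segment prefix used as NOP_DS_PREFIX.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define INSN_LEN 4

/* One-byte prefixes that can occupy the first (patchable) slot. */
enum { PREFIX_DS = 0x3e, PREFIX_66 = 0x66 };

/*
 * ModRM bytes with mod=00 and rm=000, i.e. a plain (%rax) memory operand.
 * Only the reg field differs: /6 versus /7. (Hypothetical names.)
 */
enum { MODRM_REG6_MEM = 0x30, MODRM_REG7_MEM = 0x38 };

/* Build "<first byte> 0F AE <modrm>" into out. */
static void encode_0fae(uint8_t out[INSN_LEN], uint8_t first, uint8_t modrm)
{
	const uint8_t bytes[INSN_LEN] = { first, 0x0f, 0xae, modrm };

	assert(out != NULL);
	memcpy(out, bytes, INSN_LEN);
}

/*
 * Name the instruction a 4-byte sequence represents. Only the memory
 * forms of 0F AE /6 and /7 are modelled; anything else is "unknown".
 */
static const char *classify(const uint8_t insn[INSN_LEN])
{
	unsigned int mod = insn[3] >> 6;
	unsigned int reg = (insn[3] >> 3) & 7;
	int has_66 = (insn[0] == PREFIX_66);

	if (insn[1] != 0x0f || insn[2] != 0xae)
		return "unknown";
	if (mod == 3)	/* register forms encode other instructions */
		return "unknown";
	if (reg == 7)
		return has_66 ? "clflushopt" : "clflush";
	if (reg == 6)
		return has_66 ? "clwb" : "xsaveopt";
	return "unknown";
}
```

Encoding with PREFIX_DS and MODRM_REG7_MEM classifies as clflush; overwriting only the first byte with PREFIX_66 turns the same sequence into clflushopt, which is the single-byte flip the commit relies on. Likewise, the /6 form classifies as xsaveopt without the 0x66 prefix and as clwb with it, which is why the commit describes CLWB as an xsaveopt with a 0x66 prefix.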
..
alpha	asm-generic: uaccess.h cleanup	2015-02-18 10:02:24 -08:00
arc	ARC: Fix thread_saved_pc()	2015-02-27 10:59:34 +05:30
arm	Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm	2015-03-21 10:03:22 -07:00
arm64	arm64 fixes:	2015-03-21 10:24:10 -07:00
avr32	asm-generic: uaccess.h cleanup	2015-02-18 10:02:24 -08:00
blackfin	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input	2015-02-21 12:59:04 -08:00
c6x	arch/c6x/include/asm/pgtable.h: define dummy pgprot_writecombine for !MMU	2015-03-12 18:46:08 -07:00
cris	CRIS changes for 3.20	2015-02-15 18:02:02 -08:00
frv	mm: add missing __PAGETABLE_{PUD,PMD}_FOLDED defines	2015-02-28 09:57:51 -08:00
hexagon	all arches, signal: move restart_block to struct task_struct	2015-02-12 18:54:12 -08:00
ia64	asm-generic: uaccess.h cleanup	2015-02-18 10:02:24 -08:00
m32r	mm: add missing __PAGETABLE_{PUD,PMD}_FOLDED defines	2015-02-28 09:57:51 -08:00
m68k	mm: add missing __PAGETABLE_{PUD,PMD}_FOLDED defines	2015-02-28 09:57:51 -08:00
metag	metag: Fix KSTK_EIP() and KSTK_ESP() macros	2015-02-24 12:54:21 +00:00
microblaze	microblaze: Fix syscall error recovery for invalid syscall IDs	2015-03-04 15:12:27 +01:00
mips	KVM: MIPS: Enable after disabling interrupt	2015-03-02 19:18:12 -03:00
mn10300	mm: add missing __PAGETABLE_{PUD,PMD}_FOLDED defines	2015-02-28 09:57:51 -08:00
nios2	nios2: mm: do not invoke OOM killer on kernel fault OOM	2015-03-16 15:35:25 +08:00
openrisc	asm-generic: uaccess.h cleanup	2015-02-18 10:02:24 -08:00
parisc	mm: add missing __PAGETABLE_{PUD,PMD}_FOLDED defines	2015-02-28 09:57:51 -08:00
powerpc	powerpc/iommu: Remove IOMMU device references via bus notifier	2015-03-04 13:19:33 +11:00
s390	kvm: move advertising of KVM_CAP_IRQFD to common code	2015-03-10 21:18:59 -03:00
score	all arches, signal: move restart_block to struct task_struct	2015-02-12 18:54:12 -08:00
sh	asm-generic: uaccess.h cleanup	2015-02-18 10:02:24 -08:00
sparc	sparc: Fix /proc/kcore	2015-03-18 19:15:28 -07:00
tile	tile: use %*pb[l] to print bitmaps including cpumasks and nodemasks	2015-02-13 21:21:37 -08:00
um	all arches, signal: move restart_block to struct task_struct	2015-02-12 18:54:12 -08:00
unicore32	mm: vmalloc: pass additional vm_flags to __vmalloc_node_range()	2015-02-13 21:21:42 -08:00
x86	x86/asm: Add support for the CLWB instruction	2015-04-03 06:56:38 +02:00
xtensa	asm-generic: uaccess.h cleanup	2015-02-18 10:02:24 -08:00
.gitignore
Kconfig