linux/arch/x86
Peter Zijlstra 119251855f x86/retpoline: Simplify retpolines
Due to:

  c9c324dc22 ("objtool: Support stack layout changes in alternatives")

it is now possible to simplify the retpolines.

Currently our retpolines consist of 2 symbols:

 - __x86_indirect_thunk_\reg: the compiler target
 - __x86_retpoline_\reg:  the actual retpoline.

Both are consecutive in code and aligned such that for any one register
they both live in the same cacheline:

  0000000000000000 <__x86_indirect_thunk_rax>:
   0:   ff e0                   jmpq   *%rax
   2:   90                      nop
   3:   90                      nop
   4:   90                      nop

  0000000000000005 <__x86_retpoline_rax>:
   5:   e8 07 00 00 00          callq  11 <__x86_retpoline_rax+0xc>
   a:   f3 90                   pause
   c:   0f ae e8                lfence
   f:   eb f9                   jmp    a <__x86_retpoline_rax+0x5>
  11:   48 89 04 24             mov    %rax,(%rsp)
  15:   c3                      retq
  16:   66 2e 0f 1f 84 00 00 00 00 00   nopw   %cs:0x0(%rax,%rax,1)

The thunk is an alternative_2, where one option is a JMP to the
retpoline. This was done so that objtool didn't need to deal with
alternatives with stack ops. But that problem has been solved, so now
it is possible to fold the entire retpoline into the alternative to
simplify and consolidate unused bytes:

  0000000000000000 <__x86_indirect_thunk_rax>:
   0:   ff e0                   jmpq   *%rax
   2:   90                      nop
   3:   90                      nop
   4:   90                      nop
   5:   90                      nop
   6:   90                      nop
   7:   90                      nop
   8:   90                      nop
   9:   90                      nop
   a:   90                      nop
   b:   90                      nop
   c:   90                      nop
   d:   90                      nop
   e:   90                      nop
   f:   90                      nop
  10:   90                      nop
  11:   66 66 2e 0f 1f 84 00 00 00 00 00        data16 nopw %cs:0x0(%rax,%rax,1)
  1c:   0f 1f 40 00             nopl   0x0(%rax)

Notice that since the longest alternative sequence is now:

   0:   e8 07 00 00 00          callq  c <.altinstr_replacement+0xc>
   5:   f3 90                   pause
   7:   0f ae e8                lfence
   a:   eb f9                   jmp    5 <.altinstr_replacement+0x5>
   c:   48 89 04 24             mov    %rax,(%rsp)
  10:   c3                      retq

17 bytes, we have 15 bytes NOP at the end of our 32 byte slot. (IOW, if
we can shrink the retpoline by 1 byte we can pack it more densely).

 [ bp: Massage commit message. ]

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210326151259.506071949@infradead.org
2021-04-02 12:42:04 +02:00
..
boot x86/boot/compressed/sev-es: Convert to insn_decode() 2021-03-15 11:18:35 +01:00
configs module: remove EXPORT_UNUSED_SYMBOL* 2021-02-08 12:28:07 +01:00
crypto crypto: aesni - release FPU during skcipher walk API calls 2021-01-22 14:58:04 +11:00
entry Merge 'x86/alternatives' 2021-03-31 18:04:19 +02:00
events Linux 5.12-rc5 2021-04-02 12:33:16 +02:00
hyperv iommu/hyperv: setup an IO-APIC IRQ remapping domain for root partition 2021-02-11 08:47:07 +00:00
ia32 x86/ia32_signal: Propagate __user annotation properly 2020-12-11 19:44:31 +01:00
include x86/retpoline: Simplify retpolines 2021-04-02 12:42:04 +02:00
kernel x86/alternatives: Optimize optimize_nops() 2021-04-02 12:41:17 +02:00
kvm KVM: X86: Fix missing local pCPU when executing wbinvd on all dirty pCPUs 2021-03-18 13:55:34 -04:00
lib x86/retpoline: Simplify retpolines 2021-04-02 12:42:04 +02:00
math-emu
mm Linux 5.12-rc5 2021-04-02 12:33:16 +02:00
net Merge branch 'x86/cpu' into WIP.x86/core, to merge the NOP changes & resolve a semantic conflict 2021-04-02 12:36:30 +02:00
pci Simple Firmware Interface (SFI) support removal for v5.12-rc1 2021-02-24 10:35:29 -08:00
platform Linux 5.12-rc5 2021-04-02 12:33:16 +02:00
power x86/stackprotector/32: Make the canary into a regular percpu variable 2021-03-08 13:19:05 +01:00
purgatory crypto: sha - split sha.h into sha1.h and sha2.h 2020-11-20 14:45:33 +11:00
ras
realmode
tools x86/tools/insn_sanity: Convert to insn_decode() 2021-03-15 12:21:11 +01:00
um um: remove process stub VMA 2021-02-12 21:37:38 +01:00
video
xen Linux 5.12-rc5 2021-04-02 12:33:16 +02:00
.gitignore
Kbuild
Kconfig Merge 'x86/alternatives' 2021-03-31 18:04:19 +02:00
Kconfig.assembler
Kconfig.cpu
Kconfig.debug
Makefile Linux 5.12-rc5 2021-04-02 12:33:16 +02:00
Makefile_32.cpu
Makefile.um