linux/arch
Peter Zijlstra 119251855f x86/retpoline: Simplify retpolines
Due to:

  c9c324dc22 ("objtool: Support stack layout changes in alternatives")

it is now possible to simplify the retpolines.

Currently our retpolines consist of 2 symbols:

 - __x86_indirect_thunk_\reg: the compiler target
 - __x86_retpoline_\reg: the actual retpoline

Both are consecutive in code and aligned such that for any one register
they both live in the same cacheline:

  0000000000000000 <__x86_indirect_thunk_rax>:
   0:   ff e0                   jmpq   *%rax
   2:   90                      nop
   3:   90                      nop
   4:   90                      nop

  0000000000000005 <__x86_retpoline_rax>:
   5:   e8 07 00 00 00          callq  11 <__x86_retpoline_rax+0xc>
   a:   f3 90                   pause
   c:   0f ae e8                lfence
   f:   eb f9                   jmp    a <__x86_retpoline_rax+0x5>
  11:   48 89 04 24             mov    %rax,(%rsp)
  15:   c3                      retq
  16:   66 2e 0f 1f 84 00 00 00 00 00   nopw   %cs:0x0(%rax,%rax,1)
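The branch targets in the listing above can be cross-checked by hand: for a relative CALL (e8 rel32) or short JMP (eb rel8), the target is the address of the next instruction plus the signed displacement. A minimal sketch of that arithmetic follows; the helper name rel_branch_target is hypothetical and only these two opcodes are handled.

```python
import struct


def rel_branch_target(addr: int, insn: bytes) -> int:
    """Return the target of a relative CALL (e8 rel32) or short JMP (eb rel8).

    addr is the offset of the instruction itself; the displacement is
    applied relative to the end of the instruction.
    """
    opcode = insn[0]
    if opcode == 0xE8:  # call rel32: 1 opcode byte + 4 displacement bytes
        (disp,) = struct.unpack("<i", insn[1:5])
        length = 5
    elif opcode == 0xEB:  # jmp rel8: 1 opcode byte + 1 displacement byte
        (disp,) = struct.unpack("<b", insn[1:2])
        length = 2
    else:
        raise ValueError("only e8 (call rel32) and eb (jmp rel8) are handled")
    return addr + length + disp


# Bytes taken from the __x86_retpoline_rax listing.
call_target = rel_branch_target(0x5, bytes.fromhex("e807000000"))
jmp_target = rel_branch_target(0xF, bytes.fromhex("ebf9"))
print(hex(call_target), hex(jmp_target))
```

The CALL skips over the pause/lfence loop to the mov, and the JMP loops back to the pause, which matches the symbol+offset annotations objdump prints.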

The thunk is an alternative_2, where one option is a JMP to the
retpoline. This was done so that objtool didn't need to deal with
alternatives with stack ops. But that problem has been solved, so now
it is possible to fold the entire retpoline into the alternative to
simplify and consolidate unused bytes:

  0000000000000000 <__x86_indirect_thunk_rax>:
   0:   ff e0                   jmpq   *%rax
   2:   90                      nop
   3:   90                      nop
   4:   90                      nop
   5:   90                      nop
   6:   90                      nop
   7:   90                      nop
   8:   90                      nop
   9:   90                      nop
   a:   90                      nop
   b:   90                      nop
   c:   90                      nop
   d:   90                      nop
   e:   90                      nop
   f:   90                      nop
  10:   90                      nop
  11:   66 66 2e 0f 1f 84 00 00 00 00 00        data16 nopw %cs:0x0(%rax,%rax,1)
  1c:   0f 1f 40 00             nopl   0x0(%rax)
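As a sanity check on the listing above, the encoded bytes can be summed to confirm the thunk fills its slot exactly. This is a rough sketch, assuming objdump-style lines of the form "offset: bytes mnemonic"; the helpers insn_bytes and listing_size are hypothetical, and the parser would miscount a mnemonic that happens to look like a two-digit hex byte.

```python
import re

HEX_BYTE = re.compile(r"[0-9a-f]{2}")


def insn_bytes(line: str) -> int:
    """Count the encoded bytes on one objdump-style line."""
    # Everything after the first colon is "bytes... mnemonic operands".
    _, _, rest = line.partition(":")
    count = 0
    for token in rest.split():
        if not HEX_BYTE.fullmatch(token):
            break  # first non-byte token is the mnemonic
        count += 1
    return count


def listing_size(lines) -> int:
    """Total number of encoded bytes across all lines of a listing."""
    return sum(insn_bytes(line) for line in lines)


# The folded __x86_indirect_thunk_rax: jmpq, 15 single-byte nops, two long nops.
thunk = (
    ["0: ff e0 jmpq *%rax"]
    + ["%x: 90 nop" % off for off in range(0x2, 0x11)]
    + [
        "11: 66 66 2e 0f 1f 84 00 00 00 00 00 data16 nopw %cs:0x0(%rax,%rax,1)",
        "1c: 0f 1f 40 00 nopl 0x0(%rax)",
    ]
)
print(listing_size(thunk))
```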

Notice that since the longest alternative sequence is now:

   0:   e8 07 00 00 00          callq  c <.altinstr_replacement+0xc>
   5:   f3 90                   pause
   7:   0f ae e8                lfence
   a:   eb f9                   jmp    5 <.altinstr_replacement+0x5>
   c:   48 89 04 24             mov    %rax,(%rsp)
  10:   c3                      retq

17 bytes, we have 15 bytes of NOP padding at the end of our 32-byte
slot. (IOW, if we can shrink the retpoline by 1 byte, we can pack it
more densely.)
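The padding arithmetic can be sketched as follows. The assumption here, not stated in the commit message, is that each thunk occupies the smallest power-of-two slot that holds the longest alternative; slot_size and padding are hypothetical helper names.

```python
def slot_size(nbytes: int) -> int:
    """Smallest power-of-two slot that holds nbytes (assumed alignment rule)."""
    size = 1
    while size < nbytes:
        size *= 2
    return size


def padding(nbytes: int) -> int:
    """Bytes left unused in the slot after the instruction sequence."""
    return slot_size(nbytes) - nbytes


# Instruction lengths of the retpoline sequence: call, pause, lfence, jmp, mov, ret.
retpoline_len = sum([5, 2, 3, 2, 4, 1])

print(retpoline_len, slot_size(retpoline_len), padding(retpoline_len))
print(slot_size(retpoline_len - 1), padding(retpoline_len - 1))
```

Under that assumption, a 17-byte sequence lands in a 32-byte slot with 15 bytes of padding, while a 16-byte sequence would fit a 16-byte slot with none.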

 [ bp: Massage commit message. ]

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210326151259.506071949@infradead.org
2021-04-02 12:42:04 +02:00
..
alpha io_uring-worker.v3-2021-02-25 2021-02-27 08:29:02 -08:00
arc arch: setup PF_IO_WORKER threads like PF_KTHREAD 2021-02-21 17:25:22 -07:00
arm Linux 5.12-rc5 2021-04-02 12:33:16 +02:00
arm64 Linux 5.12-rc5 2021-04-02 12:33:16 +02:00
csky ftrace: Fix spelling mistake "disabed" -> "disabled" 2021-03-16 21:19:40 -07:00
h8300 arch: setup PF_IO_WORKER threads like PF_KTHREAD 2021-02-21 17:25:22 -07:00
hexagon io_uring-worker.v3-2021-02-25 2021-02-27 08:29:02 -08:00
ia64 ia64: fix format strings for err_inject 2021-03-25 09:22:55 -07:00
m68k m68k: Fix virt_addr_valid() W=1 compiler warnings 2021-03-06 14:15:07 +01:00
microblaze io_uring-worker.v3-2021-02-25 2021-02-27 08:29:02 -08:00
mips MIPS: vmlinux.lds.S: Fix appended dtb not properly aligned 2021-03-16 22:53:08 +01:00
nds32 io_uring-worker.v3-2021-02-25 2021-02-27 08:29:02 -08:00
nios2 io_uring-worker.v3-2021-02-25 2021-02-27 08:29:02 -08:00
openrisc io_uring-worker.v3-2021-02-25 2021-02-27 08:29:02 -08:00
parisc arch/parisc/kernel: remove duplicate include in ptrace 2021-03-04 09:12:29 +01:00
powerpc powerpc: Force inlining of cpu_has_feature() to avoid build failure 2021-03-14 20:32:24 +11:00
riscv riscv: Correct SPARSEMEM configuration 2021-03-16 22:15:21 -07:00
s390 s390/pci: fix leak of PCI device structure 2021-03-15 19:10:56 +01:00
sh io_uring-worker.v3-2021-02-25 2021-02-27 08:29:02 -08:00
sparc Merge git://git.kernel.org:/pub/scm/linux/kernel/git/davem/sparc 2021-03-09 17:08:41 -08:00
um io_uring-worker.v3-2021-02-25 2021-02-27 08:29:02 -08:00
x86 x86/retpoline: Simplify retpolines 2021-04-02 12:42:04 +02:00
xtensa io_uring-worker.v3-2021-02-25 2021-02-27 08:29:02 -08:00
.gitignore
Kconfig kbuild: remove LLVM=1 test from HAS_LTO_CLANG 2021-03-11 14:52:55 +09:00