In order to avoid calling str*cmp() on symbol names, over and over, do
them all once upfront and store the result.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120309.658539311@infradead.org
The section structure already contains sh_size, so just remove the extra
'len' member that requires extra mirroring and potential confusion.
Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20210822225037.54620-3-joe.lawrence@redhat.com
Cc: Andy Lavr <andy.lavr@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Commit e31694e0a7 ("objtool: Don't make .altinstructions writable")
aligned objtool-created and kernel-created .altinstructions section
flags, but there remains a minor discrepency in their use of a section
entry size: objtool sets one while the kernel build does not.
While sh_entsize of sizeof(struct alt_instr) seems intuitive, this small
deviation can cause failures with external tooling (kpatch-build).
Fix this by creating new .altinstructions sections with sh_entsize of 0
and then later updating sec->sh_size as alternatives are added to the
section. An added benefit is avoiding the data descriptor and buffer
created by elf_create_section(), but previously unused by
elf_add_alternative().
Fixes: 9bc0bb5072 ("objtool/x86: Rewrite retpoline thunk calls")
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20210822225037.54620-2-joe.lawrence@redhat.com
Cc: Andy Lavr <andy.lavr@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Converting a special section's relocation reference to a symbol is
straightforward. No need for objtool to complain that it doesn't know
how to handle it. Just handle it.
This fixes the following warning:
arch/x86/kvm/emulate.o: warning: objtool: __ex_table+0x4: don't know how to handle reloc symbol type: kvm_fastop_exception
Fixes: 24ff652573 ("objtool: Teach get_alt_entry() about more relocation types")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/feadbc3dfb3440d973580fad8d3db873cbfe1694.1633367242.git.jpoimboe@redhat.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: x86@kernel.org
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: linux-kernel@vger.kernel.org
The objtool warning that the kvm instruction emulation code triggered
wasn't very useful:
arch/x86/kvm/emulate.o: warning: objtool: __ex_table+0x4: don't know how to handle reloc symbol type: kvm_fastop_exception
in that it helpfully tells you which symbol name it had trouble figuring
out the relocation for, but it doesn't actually say what the unknown
symbol type was that triggered it all.
In this case it was because of missing type information (type 0, aka
STT_NOTYPE), but on the whole it really should just have printed that
out as part of the message.
Because if this warning triggers, that's very much the first thing you
want to know - why did reloc2sec_off() return failure for that symbol?
So rather than just saying you can't handle some type of symbol without
saying what the type _was_, just print out the type number too.
Fixes: 24ff652573 ("objtool: Teach get_alt_entry() about more relocation types")
Link: https://lore.kernel.org/lkml/CAHk-=wiZwq-0LknKhXN4M+T8jbxn_2i9mcKpO+OaBSSq_Eh7tg@mail.gmail.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Occasionally objtool encounters symbol (as opposed to section)
relocations in .altinstructions. Typically they are the alternatives
written by elf_add_alternative() as encountered on a noinstr
validation run on vmlinux after having already ran objtool on the
individual .o files.
Basically this is the counterpart of commit 44f6a7c075 ("objtool:
Fix seg fault with Clang non-section symbols"), because when these new
assemblers (binutils now also does this) strip the section symbols,
elf_add_reloc_to_insn() is forced to emit symbol based relocations.
As such, teach get_alt_entry() about different relocation types.
Fixes: 9bc0bb5072 ("objtool/x86: Rewrite retpoline thunk calls")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/YVWUvknIEVNkPvnP@hirez.programming.kicks-ass.net
Normally objtool will now follow indirect calls; there is no need.
However, this becomes a problem with noinstr validation; if there's an
indirect call from noinstr code, we very much need to know it is to
another noinstr function. Luckily there aren't many indirect calls in
entry code with the obvious exception of paravirt. As such, noinstr
validation didn't work with paravirt kernels.
In order to track pv_ops[] call targets, objtool reads the static
pv_ops[] tables as well as direct assignments to the pv_ops[] array,
provided the compiler makes them a single instruction like:
bf87: 48 c7 05 00 00 00 00 00 00 00 00 movq $0x0,0x0(%rip)
bf92 <xen_init_spinlocks+0x5f>
bf8a: R_X86_64_PC32 pv_ops+0x268
There are, as of yet, no warnings for when this goes wrong :/
Using the functions found with the above means, all pv_ops[] calls are
now subject to noinstr validation.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210624095149.118815755@infradead.org
Turns out the compilers also generate tail calls to __sanitize_cov*(),
make sure to also patch those out in noinstr code.
Fixes: 0f1441b44e ("objtool: Fix noinstr vs KCOV")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Marco Elver <elver@google.com>
Link: https://lore.kernel.org/r/20210624095147.818783799@infradead.org
Andi reported that objtool on vmlinux.o consumes more memory than his
system has, leading to horrific performance.
This is in part because we keep a struct instruction for every
instruction in the file in-memory. Shrink struct instruction by
removing the CFI state (which includes full register state) from it
and demand allocating it.
Given most instructions don't actually change CFI state, there's lots
of repetition there, so add a hash table to find previous CFI
instances.
Reduces memory consumption (and runtime) for processing an
x86_64-allyesconfig:
pre: 4:40.84 real, 143.99 user, 44.18 sys, 30624988 mem
post: 2:14.61 real, 108.58 user, 25.04 sys, 16396184 mem
Suggested-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210624095147.756759107@infradead.org
The asm_cpu_bringup_and_idle() function is required to push the return
value on the stack in order to make ORC happy, but the only reason
objtool doesn't complain is because of a happy accident.
The thing is that asm_cpu_bringup_and_idle() doesn't return, so
validate_branch() never terminates and falls through to the next
function, which in the normal case is the hypercall_page. And that, as
it happens, is 4095 NOPs and a RET.
Make asm_cpu_bringup_and_idle() terminate on it's own, by making the
function it calls as a dead-end. This way we no longer rely on what
code happens to come after.
Fixes: c3881eb58d ("x86/xen: Make the secondary CPU idle tasks reliable")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lore.kernel.org/r/20210624095147.693801717@infradead.org
kernel tooling such as kpatch-build.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmDZYv4RHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1ipeBAAhJPS/kCQ17Y5zGyMB0/6yfCWIifODoS7
9J+6/mqKHPDdV07yzPtOXuTTmpKV4OHPi8Yj8kaXs5L5fOmQ1uAwITwZNF5hU0a5
CiFIsubUCJmglf9b6L9EH5pBEQ72Cq4u8zIhJ9LmZ4t625AHJAm2ikZgascc4U67
RvVoGr5sYTo0YEsc1IDM1wUtnUhXBNjS1VwkXNnCFFTXYHju47MeY1sPHq2hvkzO
iJGC9A+hxfM1eQt9/qC/2L/6F/XECN61gcR9Get8TkWeEGHmPG+FthmPLd4oO9Ho
03J4JfMbmXumWosAeilYBNUkfii/M5Em78Wpv/cB94iSt67rq7Eb+8gm4D5svmfN
l+utsPY/HYB+uWV0hy2cV/ORRiwcJnon54dEWL6912YkKz+OIb3DK/7l9ex5lW+D
r3o8NP0s6S+RgUkOFxz5VaYK1giu6fiaFysWdKeflvwlvY/64owMepQ1QfPBbeB7
3DTzvuYZ4Cb1x/vR6WBbFqGcuJKZ1CsZIBLCblveUs+G0wlu147K5E1qlXg/Wvq7
5Vzznc4fmRng8np5hxAw8ieLkatWg7szyryUV/4H2Ubs/jWGcH628ZYbapaCb7EM
Eson65xzbVfhnz16z8sN13XIF1lGe8sb0+qiFSclEfyDUnZDuhwMn6d9Ubqxrg5J
uTULEzmY/rI=
=MvPd
-----END PGP SIGNATURE-----
mergetag object d33b9035e1
type commit
tag objtool-core-2021-06-28
tagger Ingo Molnar <mingo@kernel.org> 1624859477 +0200
The biggest change in this cycle is the new code to handle
and rewrite variable sized jump labels - which results in
slightly tighter code generation in hot paths, through the
use of short(er) NOPs.
Also a number of cleanups and fixes, and a change to the
generic include/linux/compiler.h to handle a s390 GCC quirk.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmDZZGcRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1goYg/7BxUIJXP0F5wbrMbAvJIDRgR/j3TA+ztk
uNU1yabBGluMxCqJ87HadJ+A5d010G+GRUn/birVr7w1UuwWv8HOda78dnyG7tme
xm78/1FlOnstuOTQxhK6rjbb2cp+QOmdsAQkq1TF4SOxArBQiwtjiOvytHjb5yNx
7LrlbtuZ7Dtc0qd2evkG4ma4QkGoDhBS1dRogrItc27ZLuFIQoNnEd2K2QNMgczw
a/Jx8fgNmdoJSq+vkBn9TnS/cJYUW/PAlPNtO3ac8yE857aDIVnjXFRzveAP/nTh
rwFD6aCGnJAqyqP7A8ElNjySos5O+ebYApxe7rEx0TNLbrc55qSP9lpdIO+vgytV
Xzy4O7z6o+lailQ4EoF8Qf+rlPeue0kLF23SsNbZY1uT0vjX1Uv70xgKbkuyPygp
GNXAy6dOXK0AfaZYL/Wa50yVnJnkYDjes/hHr+HEam5Oad566pqIyQNP8yWSPqaf
KHkL//1pb5C2RKwot4IYv/ftHfZB5QftoFq6bhGBc1GXUd/FiqivvGHPW/6g7rxi
ZIrXs+Fqm/5KP9mssNONfyz5XEvbcUTD1CbeqX9eyVbiYZbLp1oWSgtogiRW9ya+
HR7t0Dt/UFzFWbilb6EZff/Hdr1NZBZLdrfpvVDoMf5tR9J0BIOyjddTu89g/FIO
KcfJ5yyjJBU=
=+HAB
-----END PGP SIGNATURE-----
Merge tags 'objtool-urgent-2021-06-28' and 'objtool-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool fix and updates from Ingo Molnar:
"An ELF format fix for a section flags mismatch bug that breaks kernel
tooling such as kpatch-build.
The biggest change in this cycle is the new code to handle and rewrite
variable sized jump labels - which results in slightly tighter code
generation in hot paths, through the use of short(er) NOPs.
Also a number of cleanups and fixes, and a change to the generic
include/linux/compiler.h to handle a s390 GCC quirk"
* tag 'objtool-urgent-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Don't make .altinstructions writable
* tag 'objtool-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Improve reloc hash size guestimate
instrumentation.h: Avoid using inline asm operand modifiers
compiler.h: Avoid using inline asm operand modifiers
kbuild: Fix objtool dependency for 'OBJECT_FILES_NON_STANDARD_<obj> := n'
objtool: Reflow handle_jump_alt()
jump_label/x86: Remove unused JUMP_LABEL_NOP_SIZE
jump_label, x86: Allow short NOPs
objtool: Provide stats for jump_labels
objtool: Rewrite jump_label instructions
objtool: Decode jump_entry::key addend
jump_label, x86: Emit short JMP
jump_label: Free jump_entry::key bit1 for build use
jump_label, x86: Add variable length patching support
jump_label, x86: Introduce jump_entry_size()
jump_label, x86: Improve error when we fail expected text
jump_label, x86: Factor out the __jump_table generation
jump_label, x86: Strip ASM jump_label support
x86, objtool: Dont exclude arch/x86/realmode/
objtool: Rewrite hashtable sizing
When objtool creates the .altinstructions section, it sets the SHF_WRITE
flag to make the section writable -- unless the section had already been
previously created by the kernel. The mismatch between kernel-created
and objtool-created section flags can cause failures with external
tooling (kpatch-build). And the section doesn't need to be writable
anyway.
Make the section flags consistent with the kernel's.
Fixes: 9bc0bb5072 ("objtool/x86: Rewrite retpoline thunk calls")
Reported-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/6c284ae89717889ea136f9f0064d914cd8329d31.1624462939.git.jpoimboe@redhat.com
Nathan reported that LLVM ThinLTO builds have a performance regression
with commit 25cf0d8aa2 ("objtool: Rewrite hashtable sizing"). Sami
was quick to note that this is due to their use of -ffunction-sections.
As a result the .text section is small and basing the number of relocs
off of that no longer works. Instead have read_sections() compute the
sum of all SHF_EXECINSTR sections and use that.
Fixes: 25cf0d8aa2 ("objtool: Rewrite hashtable sizing")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Debugged-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lkml.kernel.org/r/YMJpGLuGNsGtA5JJ@hirez.programming.kicks-ass.net
It turns out that the compilers generate conditional branches to the
retpoline thunks like:
5d5: 0f 85 00 00 00 00 jne 5db <cpuidle_reflect+0x22>
5d7: R_X86_64_PLT32 __x86_indirect_thunk_r11-0x4
while the rewrite can only handle JMP/CALL to the thunks. The result
is the alternative wrecking the code. Make sure to skip writing the
alternatives for conditional branches.
Fixes: 9bc0bb5072 ("objtool/x86: Rewrite retpoline thunk calls")
Reported-by: Lukasz Majczak <lma@semihalf.com>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
When an ELF object uses extended symbol section indexes (IOW it has a
.symtab_shndx section), these must be kept in sync with the regular
symbol table (.symtab).
So for every new symbol we emit, make sure to also emit a
.symtab_shndx value to keep the arrays of equal size.
Note: since we're writing an UNDEF symbol, most GElf_Sym fields will
be 0 and we can repurpose one (st_size) to host the 0 for the xshndx
value.
Fixes: 2f2f7e47f0 ("objtool: Add elf_create_undef_symbol()")
Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Suggested-by: Fangrui Song <maskray@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lkml.kernel.org/r/YL3q1qFO9QIRL/BA@hirez.programming.kicks-ass.net
Miroslav figured the code flow in handle_jump_alt() was sub-optimal
with that goto. Reflow the code to make it clearer.
Reported-by: Miroslav Benes <mbenes@suse.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/YJ00lgslY+IpA/rL@hirez.programming.kicks-ass.net
Currently x86 kernel cross-compiled on big endian system fails at boot with:
kernel BUG at arch/x86/kernel/alternative.c:258!
Corresponding bug condition look like the following:
BUG_ON(feature >= (NCAPINTS + NBUGINTS) * 32);
Fix that by converting alternative feature/cpuid to target endianness.
Fixes: 9bc0bb5072 ("objtool/x86: Rewrite retpoline thunk calls")
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/patch-2.thread-6c9df9.git-6c9df9a8098d.your-ad-here.call-01620841104-ext-2554@work.hours
Currently x86 cross-compilation fails on big endian system with:
x86_64-cross-ld: init/main.o: invalid string offset 488112128 >= 6229 for section `.strtab'
Mark new ELF data in elf_create_undef_symbol() as symbol, so that libelf
does endianness handling correctly.
Fixes: 2f2f7e47f0 ("objtool: Add elf_create_undef_symbol()")
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/patch-1.thread-6c9df9.git-d39264656387.your-ad-here.call-01620841104-ext-2554@work.hours
When a jump_entry::key has bit1 set, rewrite the instruction to be a
NOP. This allows the compiler/assembler to emit JMP (and thus decide
on which encoding to use).
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210506194158.091028792@infradead.org
Teach objtool about the the low bits in the struct static_key pointer.
That is, the low two bits of @key in:
struct jump_entry {
s32 code;
s32 target;
long key;
}
as found in the __jump_table section. Since @key has a relocation to
the variable (to be resolved by the linker), the low two bits will be
reflected in the relocation's addend.
As such, find the reloc and store the addend, such that we can access
these bits.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210506194158.028024143@infradead.org
Currently objtool has 5 hashtables and sizes them 16 or 20 bits
depending on the --vmlinux argument.
However, a single side doesn't really work well for the 5 tables,
which among them, cover 3 different uses. Also, while vmlinux is
larger, there is still a very wide difference between a defconfig and
allyesconfig build, which again isn't optimally covered by a single
size.
Another aspect is the cost of elf_hash_init(), which for large tables
dominates the runtime for small input files. It turns out that all it
does it assign NULL, something that is required when using malloc().
However, when we allocate memory using mmap(), we're guaranteed to get
zero filled pages.
Therefore, rewrite the whole thing to:
1) use more dynamic sized tables, depending on the input file,
2) avoid the need for elf_hash_init() entirely by using mmap().
This speeds up a regular kernel build (100s to 98s for
x86_64-defconfig), and potentially dramatically speeds up vmlinux
processing.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210506194157.452881700@infradead.org
- Standardize the crypto asm code so that it looks like compiler-generated
code to objtool - so that it can understand it. This enables unwinding
from crypto asm code - and also fixes the last known remaining objtool
warnings for LTO and more.
- x86 decoder fixes: clean up and fix the decoder, and also extend it a bit
- Misc fixes and cleanups
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmCJEOQRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1jN0A//dZIR9GPW1cHkPD3Na+lxb+RWgCVnFUgw
meLEOum389zWr7S8YmcFpKWLy94f3l24i/e7ufKn6/RMdaQuT6pZUa6teiNPqDKN
Qq1v6EX/LT49Q1zh/zCCnKdlmF1t7wDyA1/+HBLnB4YfYEtteEt+p2Apyv4xIHOl
xWqaTMFcVR/El9FXSyqRWRR4zuqY0Uatz0fmfo5jmi2xq460k53fQlTLA/0w5Jw0
V3omyA3AYMUW6YlW5TGUINOhyDeAJm4PWl3siSUnSd6t8A/TVs5zpZX15BtseCle
0FRp2SbxOoVkiyo3N3XmkfYYns9+4wK7cr9qja9U9MsSBZJZwaBm2LO/t2WFrAhq
5dkOsoPmpIsjutsQnIhQgtVT9I/A4/u5m5Zi3trlXsBS0XAt/q+2GPfEngFmgb3q
nae4rhGUsQ3NTGBiqNuMHQF4yeEvQZ8DCf3ytTz7DjBeiQ9nAtwzbUUGQjYl2mj1
ZPOnl7Xmq/Nyw+AmdpffFPiEUJxqEg9HWjDo7DQATXb3Hw2VJ3cU8jwPRqDDlO10
OB81vysXNGTmhOngHXexxncpmU9gDOIC1imZZpw5lNx4W9Qn20AlGaGAIbqzlfx0
p5VuhkIWCySe1bOZx03xuk7Gq7GBIPPy/a2m204Ftipetlo1HBYwT3KB/wVpHmh7
CSjWgdiW3+k=
=poAZ
-----END PGP SIGNATURE-----
Merge tag 'objtool-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool updates from Ingo Molnar:
- Standardize the crypto asm code so that it looks like compiler-
generated code to objtool - so that it can understand it. This
enables unwinding from crypto asm code - and also fixes the last
known remaining objtool warnings for LTO and more.
- x86 decoder fixes: clean up and fix the decoder, and also extend it a
bit
- Misc fixes and cleanups
* tag 'objtool-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
x86/crypto: Enable objtool in crypto code
x86/crypto/sha512-ssse3: Standardize stack alignment prologue
x86/crypto/sha512-avx2: Standardize stack alignment prologue
x86/crypto/sha512-avx: Standardize stack alignment prologue
x86/crypto/sha256-avx2: Standardize stack alignment prologue
x86/crypto/sha1_avx2: Standardize stack alignment prologue
x86/crypto/sha_ni: Standardize stack alignment prologue
x86/crypto/crc32c-pcl-intel: Standardize jump table
x86/crypto/camellia-aesni-avx2: Unconditionally allocate stack buffer
x86/crypto/aesni-intel_avx: Standardize stack alignment prologue
x86/crypto/aesni-intel_avx: Fix register usage comments
x86/crypto/aesni-intel_avx: Remove unused macros
objtool: Support asm jump tables
objtool: Parse options from OBJTOOL_ARGS
objtool: Collate parse_options() users
objtool: Add --backup
objtool,x86: More ModRM sugar
objtool,x86: Rewrite ADD/SUB/AND
objtool,x86: Support %riz encodings
objtool,x86: Simplify register decode
...
Objtool detection of asm jump tables would normally just work, except
for the fact that asm retpolines use alternatives. Objtool thinks the
alternative code path (a jump to the retpoline) is a sibling call.
Don't treat alternative indirect branches as sibling calls when the
original instruction has a jump table.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Link: https://lore.kernel.org/r/460cf4dc675d64e1124146562cabd2c05aa322e8.1614182415.git.jpoimboe@redhat.com
When the compiler emits: "CALL __x86_indirect_thunk_\reg" for an
indirect call, have objtool rewrite it to:
ALTERNATIVE "call __x86_indirect_thunk_\reg",
"call *%reg", ALT_NOT(X86_FEATURE_RETPOLINE)
Additionally, in order to not emit endless identical
.altinst_replacement chunks, use a global symbol for them, see
__x86_indirect_alt_*.
This also avoids objtool from having to do code generation.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151300.320177914@infradead.org
When the .altinstr_replacement is a retpoline, skip the alternative.
We already special case retpolines anyway.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151300.259429287@infradead.org
Track the reloc of instructions in the new instruction->reloc field
to avoid having to look them up again later.
( Technically x86 instructions can have two relocations, but not jumps
and calls, for which we're using this. )
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151300.195441549@infradead.org
Provide infrastructure for architectures to rewrite/augment compiler
generated retpoline calls. Similar to what we do for static_call()s,
keep track of the instructions that are retpoline calls.
Use the same list_head, since a retpoline call cannot also be a
static_call.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151300.130805730@infradead.org
Allow objtool to create undefined symbols; this allows creating
relocations to symbols not currently in the symbol table.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151300.064743095@infradead.org
Create a common helper to append strings to a strtab.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.941474004@infradead.org
We have 4 instances of adding a relocation. Create a common helper
to avoid growing even more.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.817438847@infradead.org
Instead of manually calling elf_rebuild_reloc_section() on sections
we've called elf_add_reloc() on, have elf_write() DTRT.
This makes it easier to add random relocations in places without
carefully tracking when we're done and need to flush what section.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.754213408@infradead.org
Currently, objtool generates tail call entries in add_jump_destination()
but waits until validate_branch() to generate the regular call entries.
Move these to add_call_destination() for consistency.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.691529901@infradead.org
The __x86_indirect_ naming is obviously not generic. Shorten to allow
matching some additional magic names later.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.630296706@infradead.org
Just like JMP handling, convert a direct CALL to a retpoline thunk
into a retpoline safe indirect CALL.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.567568238@infradead.org
Due to:
c9c324dc22 ("objtool: Support stack layout changes in alternatives")
it is now possible to simplify the retpolines.
Currently our retpolines consist of 2 symbols:
- __x86_indirect_thunk_\reg: the compiler target
- __x86_retpoline_\reg: the actual retpoline.
Both are consecutive in code and aligned such that for any one register
they both live in the same cacheline:
0000000000000000 <__x86_indirect_thunk_rax>:
0: ff e0 jmpq *%rax
2: 90 nop
3: 90 nop
4: 90 nop
0000000000000005 <__x86_retpoline_rax>:
5: e8 07 00 00 00 callq 11 <__x86_retpoline_rax+0xc>
a: f3 90 pause
c: 0f ae e8 lfence
f: eb f9 jmp a <__x86_retpoline_rax+0x5>
11: 48 89 04 24 mov %rax,(%rsp)
15: c3 retq
16: 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:0x0(%rax,%rax,1)
The thunk is an alternative_2, where one option is a JMP to the
retpoline. This was done so that objtool didn't need to deal with
alternatives with stack ops. But that problem has been solved, so now
it is possible to fold the entire retpoline into the alternative to
simplify and consolidate unused bytes:
0000000000000000 <__x86_indirect_thunk_rax>:
0: ff e0 jmpq *%rax
2: 90 nop
3: 90 nop
4: 90 nop
5: 90 nop
6: 90 nop
7: 90 nop
8: 90 nop
9: 90 nop
a: 90 nop
b: 90 nop
c: 90 nop
d: 90 nop
e: 90 nop
f: 90 nop
10: 90 nop
11: 66 66 2e 0f 1f 84 00 00 00 00 00 data16 nopw %cs:0x0(%rax,%rax,1)
1c: 0f 1f 40 00 nopl 0x0(%rax)
Notice that since the longest alternative sequence is now:
0: e8 07 00 00 00 callq c <.altinstr_replacement+0xc>
5: f3 90 pause
7: 0f ae e8 lfence
a: eb f9 jmp 5 <.altinstr_replacement+0x5>
c: 48 89 04 24 mov %rax,(%rsp)
10: c3 retq
17 bytes, we have 15 bytes NOP at the end of our 32 byte slot. (IOW, if
we can shrink the retpoline by 1 byte we can pack it more densely).
[ bp: Massage commit message. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210326151259.506071949@infradead.org
Currently, optimize_nops() scans to see if the alternative starts with
NOPs. However, the emit pattern is:
141: \oldinstr
142: .skip (len-(142b-141b)), 0x90
That is, when 'oldinstr' is short, the tail is padded with NOPs. This case
never gets optimized.
Rewrite optimize_nops() to replace any trailing string of NOPs inside
the alternative to larger NOPs. Also run it irrespective of patching,
replacing NOPs in both the original and replaced code.
A direct consequence is that 'padlen' becomes superfluous, so remove it.
[ bp:
- Adjust commit message
- remove a stale comment about needing to pad
- add a comment in optimize_nops()
- exit early if the NOP verif. loop catches a mismatch - function
should not not add NOPs in that case
- fix the "optimized NOPs" offsets output ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210326151259.442992235@infradead.org
Since the kernel will rely on a single canonical set of NOPs, make sure
objtool uses the exact same ones.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210312115749.136357911@infradead.org
Add an explicit __ignore_sync_check__ marker which will be used to mark
lines which are supposed to be ignored by file synchronization check
scripts, its advantage being that it explicitly denotes such lines in
the code.
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lkml.kernel.org/r/20210304174237.31945-4-bp@alien8.de
Commit ab234a260b ("x86/pv: Rework arch_local_irq_restore() to not
use popf") replaced "push %reg; popf" with something like: "test
$0x200, %reg; jz 1f; sti; 1:", which breaks the pushf/popf symmetry
that commit ea24213d80 ("objtool: Add UACCESS validation") relies
on.
The result is:
drivers/gpu/drm/amd/amdgpu/si.o: warning: objtool: si_common_hw_init()+0xf36: PUSHF stack exhausted
Meanwhile, commit c9c324dc22 ("objtool: Support stack layout changes
in alternatives") makes that we can actually use stack-ops in
alternatives, which means we can revert 1ff865e343 ("x86,smap: Fix
smap_{save,restore}() alternatives").
That in turn means we can limit the PUSHF/POPF handling of
ea24213d80 to those instructions that are in alternatives.
Fixes: ab234a260b ("x86/pv: Rework arch_local_irq_restore() to not use popf")
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/YEY4rIbQYa5fnnEp@hirez.programming.kicks-ass.net
Teach objtool to parse options from the OBJTOOL_ARGS environment
variable.
This enables things like:
$ OBJTOOL_ARGS="--backup" make O=defconfig-build/ kernel/ponies.o
to obtain both defconfig-build/kernel/ponies.o{,.orig} and easily
inspect what objtool actually did.
Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20210226110004.252553847@infradead.org
Ensure there's a single place that parses check_options, in
preparation for extending where to get options from.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20210226110004.193108106@infradead.org
Teach objtool to write backups files, such that it becomes easier to
see what objtool did to the object file.
Backup files will be ${name}.orig.
Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/YD4obT3aoXPWl7Ax@hirez.programming.kicks-ass.net