To prepare for inlining do_softirq_own_stack() replace
__ARCH_HAS_DO_SOFTIRQ with a Kconfig switch and select it in the affected
architectures.
This allows in the next step to move the function prototype and the inline
stub into a seperate asm-generic header file which is required to avoid
include recursion.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210210002513.181713427@linutronix.de
Now that all invocations of irq_exit_rcu() happen on the irq stack, turn on
CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK which causes the core code to invoke
__do_softirq() directly without going through do_softirq_own_stack().
That means do_softirq_own_stack() is only invoked from task context which
means it can't be on the irq stack. Remove the conditional from
run_softirq_on_irqstack_cond() and rename the function accordingly.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210210002513.068033456@linutronix.de
Use the new inline stack switching and remove the old ASM indirect call
implementation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210210002512.972714001@linutronix.de
To avoid yet another macro implementation reuse the existing
run_sysvec_on_irqstack_cond() and move the set_irq_regs() handling into the
called function. Makes the code even simpler.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210210002512.869753106@linutronix.de
Convert device interrupts to inline stack switching by replacing the
existing macro implementation with the new inline version. Tweak the
function signature of the actual handler function to have the vector
argument as u32. That allows the inline macro to avoid extra intermediates
and lets the compiler be smarter about the whole thing.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210210002512.769728139@linutronix.de
To inline the stack switching and to prepare for enabling
CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK provide a macro template for system
vectors and device interrupts and convert the system vectors over to it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210210002512.676197354@linutronix.de
The effort to make the ASM entry code slim and unified moved the irq stack
switching out of the low level ASM code so that the whole return from
interrupt work and state handling can be done in C and the ASM code just
handles the low level details of entry and exit.
This ended up being a suboptimal implementation for various reasons
(including tooling). The main pain points are:
- The indirect call which is expensive thanks to retpoline
- The inability to stay on the irq stack for softirq processing on return
from interrupt
- The fact that the stack switching code ends up being an easy to target
exploit gadget.
Prepare for inlining the stack switching logic into the C entry points by
providing a ASM macro which contains the guts of the switching mechanism:
1) Store RSP at the top of the irq stack
2) Switch RSP to the irq stack
3) Invoke code
4) Pop the original RSP back
Document the unholy asm() logic while at it to reduce the amount of head
scratching required a half year from now.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210210002512.578371068@linutronix.de
sysvec_spurious_apic_interrupt() calls into the handling body of
__spurious_interrupt() which is not obvious as that function is declared
inside the DEFINE_IDTENTRY_IRQ(spurious_interrupt) macro.
As __spurious_interrupt() is currently always inlined this ends up with two
copies of the same code for no reason.
Split the handling function out and invoke it from both entry points.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210210002512.469379641@linutronix.de
The per CPU hardirq_stack_ptr contains the pointer to the irq stack in the
form that it is ready to be assigned to [ER]SP so that the first push ends
up on the top entry of the stack.
But the stack switching on 64 bit has the following rules:
1) Store the current stack pointer (RSP) in the top most stack entry
to allow the unwinder to link back to the previous stack
2) Set RSP to the top most stack entry
3) Invoke functions on the irq stack
4) Pop RSP from the top most stack entry (stored in #1) so it's back
to the original stack.
That requires all stack switching code to decrement the stored pointer by 8
in order to be able to store the current RSP and then set RSP to that
location. That's a pointless exercise.
Do the -8 adjustment right when storing the pointer and make the data type
a void pointer to avoid confusion vs. the struct irq_stack data type which
is on 64bit only used to declare the backing store. Move the definition
next to the inuse flag so they likely end up in the same cache
line. Sticking them into a struct to enforce it is a seperate change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210210002512.354260928@linutronix.de
The recursion protection for hard interrupt stacks is an unsigned int per
CPU variable initialized to -1 named __irq_count.
The irq stack switching is only done when the variable is -1, which creates
worse code than just checking for 0. When the stack switching happens it
uses this_cpu_add/sub(1), but there is no reason to do so. It simply can
use straight writes. This is a historical leftover from the low level ASM
code which used inc and jz to make a decision.
Rename it to hardirq_stack_inuse, make it a bool and use plain stores.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210210002512.228830141@linutronix.de
Embracing a callout into instrumentation_begin() / instrumentation_begin()
does not really make sense. Make the latter instrumentation_end().
Fixes: 2f6474e463 ("x86/entry: Switch XEN/PV hypercall entry to IDTENTRY")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210210002512.106502464@linutronix.de
Natively support the stack swizzle pattern:
mov %rsp, (%[tos])
mov %[tos], %rsp
...
pop %rsp
It uses the vals[] array to link the first two stack-ops, and detect
the SP to SP_INDIRECT swizzle.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Where we already decode: mov %rsp, %reg, also decode mov %rsp, (%reg).
Nothing should match for this new stack-op.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Currently REG_SP_INDIRECT is unused but means (%rsp + offset),
change it to mean (%rsp) + offset.
The reason is that we're going to swizzle stack in the middle of a C
function with non-trivial stack footprint. This means that when the
unwinder finds the ToS, it needs to dereference it (%rsp) and then add
the offset to the next frame, resulting in: (%rsp) + offset
This is somewhat unfortunate, since REG_BP_INDIRECT is used (by DRAP)
and thus needs to retain the current (%rbp + offset).
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
The OBJECT_FILES_NON_STANDARD annotation is used to tell objtool to
ignore a file. File-level ignores won't work when validating vmlinux.o.
Instead, convert restore_image() and core_restore_code() to be ELF
functions. Their code is conventional enough for objtool to be able to
understand them.
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/974f8ceb5385e470f72e93974c70ab5c894bb0dc.1611263462.git.jpoimboe@redhat.com
Because restore_registers() is page-aligned, the assembler inexplicably
adds an unreachable jump from after the end of the previous function to
the beginning of restore_registers().
That confuses objtool, understandably. It also creates significant text
fragmentation. As a result, most of the object file is wasted text
(nops).
Move restore_registers() to the beginning of the file to both prevent
the text fragmentation and avoid the dead jump instruction.
$ size /tmp/hibernate_asm_64.before.o /tmp/hibernate_asm_64.after.o
text data bss dec hex filename
4415 0 0 4415 113f /tmp/hibernate_asm_64.before.o
524 0 0 524 20c /tmp/hibernate_asm_64.after.o
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/8c7f634201d26453d73fe55032cbbdc05d004387.1611263462.git.jpoimboe@redhat.com
The OBJECT_FILES_NON_STANDARD annotation is used to tell objtool to
ignore a file. File-level ignores won't work when validating vmlinux.o.
Instead, tell objtool to ignore do_suspend_lowlevel() directly with the
STACK_FRAME_NON_STANDARD annotation.
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <len.brown@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/269eda576c53bc9ecc8167c211989111013a67aa.1611263462.git.jpoimboe@redhat.com
With objtool vmlinux.o validation of return_to_handler(), now that
objtool has visibility inside the retpoline, jumping from EMPTY state to
a proper function state results in a stack state mismatch.
return_to_handler() is actually quite normal despite the underlying
magic. Just annotate it as a normal function.
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/14f48e623f61dbdcd84cf27a56ed8ccae73199ef.1611263462.git.jpoimboe@redhat.com
The Xen hypercall page is filled with zeros, causing objtool to fall
through all the empty hypercall functions until it reaches a real
function, resulting in a stack state mismatch.
The build-time contents of the hypercall page don't matter because the
page gets rewritten by the hypervisor. Make it more palatable to
objtool by making each hypervisor function a true empty function, with
nops and a return.
Cc: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/0883bde1d7a1fb3b6a4c952bc0200e873752f609.1611263462.git.jpoimboe@redhat.com
The OBJECT_FILES_NON_STANDARD annotation is used to tell objtool to
ignore a file. File-level ignores won't work when validating vmlinux.o.
Tweak the ELF metadata and unwind hints to allow objtool to follow the
code.
Cc: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/8b042a09c69e8645f3b133ef6653ba28f896807d.1611263462.git.jpoimboe@redhat.com
The ORC metadata generated for UNWIND_HINT_FUNC isn't actually very
func-like. With certain usages it can cause stack state mismatches
because it doesn't set the return address (CFI_RA).
Also, users of UNWIND_HINT_RET_OFFSET no longer need to set a custom
return stack offset. Instead they just need to specify a func-like
situation, so the current ret_offset code is hacky for no good reason.
Solve both problems by simplifying the RET_OFFSET handling and
converting it into a more useful UNWIND_HINT_FUNC.
If we end up needing the old 'ret_offset' functionality again in the
future, we should be able to support it pretty easily with the addition
of a custom 'sp_offset' in UNWIND_HINT_FUNC.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/db9d1f5d79dddfbb3725ef6d8ec3477ad199948d.1611263462.git.jpoimboe@redhat.com
There's an inconsistency in how sibling calls are detected in
non-function asm code, depending on the scope of the object. If the
target code is external to the object, objtool considers it a sibling
call. If the target code is internal but not a function, objtool
*doesn't* consider it a sibling call.
This can cause some inconsistencies between per-object and vmlinux.o
validation.
Instead, assume only ELF functions can do sibling calls. This generally
matches existing reality, and makes sibling call validation consistent
between vmlinux.o and per-object.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/0e9ab6f3628cc7bf3bde7aa6762d54d7df19ad78.1611263461.git.jpoimboe@redhat.com
Objtool converts direct retpoline jumps to type INSN_JUMP_DYNAMIC, since
that's what they are semantically.
That conversion doesn't work in vmlinux.o validation because the
indirect thunk function is present in the object, so the intra-object
jump check succeeds before the retpoline jump check gets a chance.
Rearrange the checks: check for a retpoline jump before checking for an
intra-object jump.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/4302893513770dde68ddc22a9d6a2a04aca491dd.1611263461.git.jpoimboe@redhat.com
With my version of GCC 9.3.1 the ".cold" subfunctions no longer have a
numbered suffix, so the trailing period is no longer there.
Presumably this doesn't yet trigger a user-visible bug since most of the
subfunction detection logic is duplicated. I only found it when
testing vmlinux.o validation.
Fixes: 54262aa283 ("objtool: Fix sibling call detection")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/ca0b5a57f08a2fbb48538dd915cc253b5edabb40.1611263461.git.jpoimboe@redhat.com
The JMP_NOSPEC macro branches to __x86_retpoline_*() rather than the
__x86_indirect_thunk_*() wrappers used by C code. Detect jumps to
__x86_retpoline_*() as retpoline dynamic jumps.
Presumably this doesn't trigger a user-visible bug. I only found it
when testing vmlinux.o validation.
Fixes: 39b735332c ("objtool: Detect jumps to retpoline thunks")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/31f5833e2e4f01e3d755889ac77e3661e906c09f.1611263461.git.jpoimboe@redhat.com
Arnd found a randconfig that produces the warning:
arch/x86/entry/thunk_64.o: warning: objtool: missing symbol for insn at
offset 0x3e
when building with LLVM_IAS=1 (Clang's integrated assembler). Josh
notes:
With the LLVM assembler not generating section symbols, objtool has no
way to reference this code when it generates ORC unwinder entries,
because this code is outside of any ELF function.
The limitation now being imposed by objtool is that all code must be
contained in an ELF symbol. And .L symbols don't create such symbols.
So basically, you can use an .L symbol *inside* a function or a code
segment, you just can't use the .L symbol to contain the code using a
SYM_*_START/END annotation pair.
Fangrui notes that this optimization is helpful for reducing image size
when compiling with -ffunction-sections and -fdata-sections. I have
observed on the order of tens of thousands of symbols for the kernel
images built with those flags.
A patch has been authored against GNU binutils to match this behavior
of not generating unused section symbols ([1]), so this will
also become a problem for users of GNU binutils once they upgrade to 2.36.
Omit the .L prefix on a label so that the assembler will emit an entry
into the symbol table for the label, with STB_LOCAL binding. This
enables objtool to generate proper unwind info here with LLVM_IAS=1 or
GNU binutils 2.36+.
[ bp: Massage commit message. ]
Reported-by: Arnd Bergmann <arnd@arndb.de>
Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Suggested-by: Borislav Petkov <bp@alien8.de>
Suggested-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20210112194625.4181814-1-ndesaulniers@google.com
Link: https://github.com/ClangBuiltLinux/linux/issues/1209
Link: https://reviews.llvm.org/D93783
Link: https://sourceware.org/binutils/docs/as/Symbol-Names.html
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1408485ce69f844dcd7ded093a8 [1]
The ORC unwinder showed a warning [1] which revealed the stack layout
didn't match what was expected. The problem was that paravirt patching
had replaced "CALL *pv_ops.irq.save_fl" with "PUSHF;POP". That changed
the stack layout between the PUSHF and the POP, so unwinding from an
interrupt which occurred between those two instructions would fail.
Part of the agreed upon solution was to rework the custom paravirt
patching code to use alternatives instead, since objtool already knows
how to read alternatives (and converging runtime patching infrastructure
is always a good thing anyway). But the main problem still remains,
which is that runtime patching can change the stack layout.
Making stack layout changes in alternatives was disallowed with commit
7117f16bf4 ("objtool: Fix ORC vs alternatives"), but now that paravirt
is going to be doing it, it needs to be supported.
One way to do so would be to modify the ORC table when the code gets
patched. But ORC is simple -- a good thing! -- and it's best to leave
it alone.
Instead, support stack layout changes by "flattening" all possible stack
states (CFI) from parallel alternative code streams into a single set of
linear states. The only necessary limitation is that CFI conflicts are
disallowed at all possible instruction boundaries.
For example, this scenario is allowed:
Alt1 Alt2 Alt3
0x00 CALL *pv_ops.save_fl CALL xen_save_fl PUSHF
0x01 POP %RAX
0x02 NOP
...
0x05 NOP
...
0x07 <insn>
The unwind information for offset-0x00 is identical for all 3
alternatives. Similarly offset-0x05 and higher also are identical (and
the same as 0x00). However offset-0x01 has deviating CFI, but that is
only relevant for Alt3, neither of the other alternative instruction
streams will ever hit that offset.
This scenario is NOT allowed:
Alt1 Alt2
0x00 CALL *pv_ops.save_fl PUSHF
0x01 NOP6
...
0x07 NOP POP %RAX
The problem here is that offset-0x7, which is an instruction boundary in
both possible instruction patch streams, has two conflicting stack
layouts.
[ The above examples were stolen from Peter Zijlstra. ]
The new flattened CFI array is used both for the detection of conflicts
(like the second example above) and the generation of linear ORC
entries.
BTW, another benefit of these changes is that, thanks to some related
cleanups (new fake nops and alt_group struct) objtool can finally be rid
of fake jumps, which were a constant source of headaches.
[1] https://lkml.kernel.org/r/20201111170536.arx2zbn4ngvjoov7@treble
Cc: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Create a new struct associated with each group of alternatives
instructions. This will help with the removal of fake jumps, and more
importantly with adding support for stack layout changes in
alternatives.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Decouple ORC entries from instructions. This simplifies the
control/data flow, and is going to make it easier to support alternative
instructions which change the stack layout.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Running instruction decoder posttest on an s390 host with an x86 target
with allyesconfig shows errors. Instructions used in a couple of kernel
objects could not be correctly decoded on big endian system.
insn_decoder_test: warning: objdump says 6 bytes, but insn_get_length() says 5
insn_decoder_test: warning: Found an x86 instruction decoder bug, please report this.
insn_decoder_test: warning: ffffffff831eb4e1: 62 d1 fd 48 7f 04 24 vmovdqa64 %zmm0,(%r12)
insn_decoder_test: warning: objdump says 7 bytes, but insn_get_length() says 6
insn_decoder_test: warning: Found an x86 instruction decoder bug, please report this.
insn_decoder_test: warning: ffffffff831eb4e8: 62 51 fd 48 7f 44 24 01 vmovdqa64 %zmm8,0x40(%r12)
insn_decoder_test: warning: objdump says 8 bytes, but insn_get_length() says 6
This is because in a few places instruction field bytes are set directly
with further usage of "value". To address that introduce and use a
insn_set_byte() helper, which correctly updates "value" on big endian
systems.
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Currently objtool headers are being included either by their base name
or included via ../ from a parent directory. In case of a base name usage:
#include "warn.h"
#include "arch_elf.h"
it does not make it apparent from which directory the file comes from.
To make it slightly better, and actually to avoid name clashes some arch
specific files have "arch_" suffix. And files from an arch folder have
to revert to including via ../ e.g:
#include "../../elf.h"
With additional architectures support and the code base growth there is
a need for clearer headers naming scheme for multiple reasons:
1. to make it instantly obvious where these files come from (objtool
itself / objtool arch|generic folders / some other external files),
2. to avoid name clashes of objtool arch specific headers, potential
obtool arch generic headers and the system header files (there is
/usr/include/elf.h already),
3. to avoid ../ includes and improve code readability.
4. to give a warm fuzzy feeling to developers who are mostly kernel
developers and are accustomed to linux kernel headers arranging
scheme.
Doesn't this make it instantly obvious where are these files come from?
#include <objtool/warn.h>
#include <arch/elf.h>
And doesn't it look nicer to avoid ugly ../ includes? Which also
guarantees this is elf.h from the objtool and not /usr/include/elf.h.
#include <objtool/elf.h>
This patch defines and implements new objtool headers arranging
scheme. Which is:
- all generic headers go to include/objtool (similar to include/linux)
- all arch headers go to arch/$(SRCARCH)/include/arch (to get arch
prefix). This is similar to linux arch specific "asm/*" headers but we
are not abusing "asm" name and calling it what it is. This also helps
to prevent name clashes (arch is not used in system headers or kernel
exports).
To bring objtool to this state the following things are done:
1. current top level tools/objtool/ headers are moved into
include/objtool/ subdirectory,
2. arch specific headers, currently only arch/x86/include/ are moved into
arch/x86/include/arch/ and were stripped of "arch_" suffix,
3. new -I$(srctree)/tools/objtool/include include path to make
includes like <objtool/warn.h> possible,
4. rewriting file includes,
5. make git not to ignore include/objtool/ subdirectory.
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Correct objtool orc generation endianness problems to enable fully
functional x86 cross-compiles on big endian hardware.
Introduce bswap_if_needed() macro, which does a byte swap if target
endianness doesn't match the host, i.e. cross-compilation for little
endian on big endian and vice versa. The macro is used for conversion
of multi-byte values which are read from / about to be written to a
target native endianness ELF file.
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Relocations generated in elf_rebuild_rel[a]_reloc_section() are broken
if objtool is built and run on a big endian system.
The following errors pop up during x86 cross-compilation:
x86_64-9.1.0-ld: fs/efivarfs/inode.o: bad reloc symbol index (0x2000000 >= 0x22) for offset 0 in section `.orc_unwind_ip'
x86_64-9.1.0-ld: final link failed: bad value
Convert those functions to use gelf_update_rel[a](), similar to what
elf_write_reloc() does.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Co-developed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
The x86 instruction decoder code is shared across the kernel source and
the tools. Currently objtool seems to be the only tool from build tools
needed which breaks x86 cross-compilation on big endian systems. Make
the x86 instruction decoder build host endianness agnostic to support
x86 cross-compilation and enable objtool to implement endianness
awareness for big endian architectures support.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Co-developed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Currently the x86 instruction decoder is used from:
- the kernel itself,
- from tools like objtool and perf,
- within x86 tools, i.e. instruction decoder selftests.
The first two cases are similar, because tools headers try to mimic
kernel headers.
Instruction decoder selftests include some of the kernel headers
directly, including uapi headers. This works until headers dependencies
are kept to a minimum and tools are not cross-compiled. Since the goal
of the x86 instruction decoder selftests is not to verify uapi headers,
move it to using tools headers, like is already done for vdso2c tool,
mkpiggy and other tools in arch/x86/boot/.
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Architectures without PUSH/POP instructions will always access the stack
though memory operations (SRC/DEST_INDIRECT). Make those operations have
the same effect on the CFA as PUSH/POP, with no stack pointer
modification.
Signed-off-by: Julien Thierry <jthierry@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
On arm64, the compiler can set the frame pointer either
with a move operation or with and add operation like:
add (SP + constant), BP
For a simple move operation, the CFA base is changed from SP to BP.
Handle also changing the CFA base when the frame pointer is set with
an addition instruction.
Signed-off-by: Julien Thierry <jthierry@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
A valid stack frame should contain both the return address and the
previous frame pointer value.
On x86, the return value is placed on the stack by the calling
instructions. On other architectures, the callee needs to explicitly
save the return address on the stack.
Add the necessary checks to verify a function properly sets up all the
elements of the stack frame.
Signed-off-by: Julien Thierry <jthierry@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
- Search for <ncurses.h> in the default header path of HOSTCC
- Tweak the option order to be kind to old BSD awk
- Remove 'kvmconfig' and 'xenconfig' shorthands
- Fix documentation
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAl/7YsgVHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsGMywQAKXnzI0nOsJWCsRNhYaOvgR/xmEl
Tu/E+fN6aL/BGtIfkrdJzYRCa5UWONYMQuACvDhXarR1/Stkag9XyWY+rRTHK6In
rYGb2NmjKUsJ4mtZtpPY7lyMgizZJX63Ba8m+Np8U3dpHuCk2OncOJFszJAiPqoN
RbJiUNTT8F0/W827kvDcbdWc8iiXN75mqblFZQBSG71kiv/l/3Jsz/FE5980pGfv
XfZbokOKBGpEAnzHCt4wfodlTQxsezzbgkrzquVhosJYidXcl3VMnoJMfPFFJwIv
pxgIyrdUxgyuM3iqwymiXxJtv+wVUC9lTEgYp4KgIy0kj2ACe59u6wAcjSDoHWEu
pImhidGu0dZTrIltT4IE2WoqsFqtjkMCAM1DvLlgPS8GROL0IK0l9AnXAyCUoSYE
RMhYO3K9G6z0PGvNGJcxPttih8na/hLjvUwbFWUGlltadmYqSLA16MO4c3NOgqMR
RbLJ5K6c/tKe9ktW8f+FVJlcRnEebEbG9hzQLG5W/umEesxUQHFVe44nmCEJkyVB
tQdCXCqiBOcSoJC+pYceuGG/vUhX+Nmm+zMyI5pTwAvBL7Yx4Ef4HOIMlbKCLT6z
HsC/CtfIb1IEfNbKq+9nkHZRnO8lNAORJmyFzTwDYDQJWvNTKvDviJo5V7i9QfEm
0WVOr9qCIlOWJvGe
=T0vB
-----END PGP SIGNATURE-----
Merge tag 'kbuild-fixes-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Search for <ncurses.h> in the default header path of HOSTCC
- Tweak the option order to be kind to old BSD awk
- Remove 'kvmconfig' and 'xenconfig' shorthands
- Fix documentation
* tag 'kbuild-fixes-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
Documentation: kbuild: Fix section reference
kconfig: remove 'kvmconfig' and 'xenconfig' shorthands
lib/raid6: Let $(UNROLL) rules work with macOS userland
kconfig: Support building mconf with vendor sysroot ncurses
kconfig: config script: add a little user help
MAINTAINERS: adjust GCC PLUGINS after gcc-plugin.sh removal
This is two driver fixes (megaraid_sas and hisi_sas). The megaraid
one is a revert of a previous revert of a cpu hotplug fix which
exposed a bug in the block layer which has been fixed in this merge
window and the hisi_sas performance enhancement comes from switching
to interrupt managed completion queues, which depended on the addition
of devm_platform_get_irqs_affinity() which is now upstream via the irq
tree in the last merge window.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCX/tF/SYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishTYGAQDzi9Zi
iJ0uut1ex6io0q/X+DU7vwYFSfx9wILi8k8jFwD8C9O4pPfdvZyfPqLx/ol90lxW
fEIZK6p7MozgoSw7s9k=
=jp38
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is two driver fixes (megaraid_sas and hisi_sas).
The megaraid one is a revert of a previous revert of a cpu hotplug fix
which exposed a bug in the block layer which has been fixed in this
merge window.
The hisi_sas performance enhancement comes from switching to interrupt
managed completion queues, which depended on the addition of
devm_platform_get_irqs_affinity() which is now upstream via the irq
tree in the last merge window"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: hisi_sas: Expose HW queues for v2 hw
Revert "Revert "scsi: megaraid_sas: Added support for shared host tagset for cpuhotplug""