FRED is a replacement for IDT event delivery on x86 and addresses most of
the technical nightmares which IDT exposes:
1) Exception cause registers like CR2 need to be manually preserved in
nested exception scenarios.
2) Hardware interrupt stack switching is suboptimal for nested exceptions
as the interrupt stack mechanism rewinds the stack on each entry which
requires a massive effort in the low level entry of #NMI code to handle
this.
3) No hardware distinction between entry from kernel or from user which
makes establishing kernel context more complex than it needs to be
especially for unconditionally nestable exceptions like NMI.
4) NMI nesting caused by IRET unconditionally reenabling NMIs, which is a
problem when the perf NMI takes a fault when collecting a stack trace.
5) Partial restore of ESP when returning to a 16-bit segment
6) Limitation of the vector space which can cause vector exhaustion on
large systems.
7) Inability to differentiate NMI sources
FRED addresses these shortcomings by:
1) An extended exception stack frame which the CPU uses to save exception
cause registers. This ensures that the meta information for each
exception is preserved on stack and avoids the extra complexity of
preserving it in software.
2) Hardware interrupt stack switching is non-rewinding if a nested
exception uses the currently interrupt stack.
3) The entry points for kernel and user context are separate and GS BASE
handling which is required to establish kernel context for per CPU
variable access is done in hardware.
4) NMIs are now nesting protected. They are only reenabled on the return
from NMI.
5) FRED guarantees full restore of ESP
6) FRED does not put a limitation on the vector space by design because it
uses a central entry points for kernel and user space and the CPUstores
the entry type (exception, trap, interrupt, syscall) on the entry stack
along with the vector number. The entry code has to demultiplex this
information, but this removes the vector space restriction.
The first hardware implementations will still have the current
restricted vector space because lifting this limitation requires
further changes to the local APIC.
7) FRED stores the vector number and meta information on stack which
allows having more than one NMI vector in future hardware when the
required local APIC changes are in place.
The series implements the initial FRED support by:
- Reworking the existing entry and IDT handling infrastructure to
accomodate for the alternative entry mechanism.
- Expanding the stack frame to accomodate for the extra 16 bytes FRED
requires to store context and meta information
- Providing FRED specific C entry points for events which have information
pushed to the extended stack frame, e.g. #PF and #DB.
- Providing FRED specific C entry points for #NMI and #MCE
- Implementing the FRED specific ASM entry points and the C code to
demultiplex the events
- Providing detection and initialization mechanisms and the necessary
tweaks in context switching, GS BASE handling etc.
The FRED integration aims for maximum code reuse vs. the existing IDT
implementation to the extent possible and the deviation in hot paths like
context switching are handled with alternatives to minimalize the
impact. The low level entry and exit paths are seperate due to the extended
stack frame and the hardware based GS BASE swichting and therefore have no
impact on IDT based systems.
It has been extensively tested on existing systems and on the FRED
simulation and as of now there are know outstanding problems.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmXuKPgTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoWyUEACevJMHU+Ot9zqBPizSWxByM1uunHbp
bjQXhaFeskd3mt7k7HU6GsPRSmC3q4lliP1Y9ypfbU0DvYSI2h/PhMWizjhmot2y
nIvFpl51r/NsI+JHx1oXcFetz0eGHEqBui/4YQ/swgOCMymYgfqgHhazXTdldV3g
KpH9/8W3AeGvw79uzXFH9tjBzTkbvywpam3v0LYNDJWTCuDkilyo8PjhsgRZD4x3
V9f1nLD7nSHZW8XLoktdJJ38bKwI2Lhao91NQ0ErwopekA4/9WphZEKsDpidUSXJ
sn1O148oQ8X92IO2OaQje8XC5pLGr5GqQBGPWzRH56P/Vd3+WOwBxaFoU6Drxc5s
tIe23ZjkVcpA8EEG7BQBZV1Un/NX7XaCCnMniOt0RauXw+1NaslX7t/tnUAh5F1V
TWCH4D0I0oJ0qJ7kNliGn2BP3agYXOVg81xVEUjT6KfHcYU4ImUrwi+BkeNXuXtL
Ch5ADnbYAcUjWLFnAmEmaRtfmfNGY5T7PeGFHW2RRkaOJ88v5g14Voo6gPJaDUPn
wMQ0nLq1xN4xZWF6ZgfRqAhArvh20k38ZujRku5vXEqnhOugQ76TF2UYiFEwOXbQ
8jcM+yEBLGgBz7tGMwmIAml6kfxaFF1KPpdrtcPxNkGlbE6KTSuIolLx2YGUvlSU
6/O8nwZy49ckmQ==
=Ib7w
-----END PGP SIGNATURE-----
Merge tag 'x86-fred-2024-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 FRED support from Thomas Gleixner:
"Support for x86 Fast Return and Event Delivery (FRED).
FRED is a replacement for IDT event delivery on x86 and addresses most
of the technical nightmares which IDT exposes:
1) Exception cause registers like CR2 need to be manually preserved
in nested exception scenarios.
2) Hardware interrupt stack switching is suboptimal for nested
exceptions as the interrupt stack mechanism rewinds the stack on
each entry which requires a massive effort in the low level entry
of #NMI code to handle this.
3) No hardware distinction between entry from kernel or from user
which makes establishing kernel context more complex than it needs
to be especially for unconditionally nestable exceptions like NMI.
4) NMI nesting caused by IRET unconditionally reenabling NMIs, which
is a problem when the perf NMI takes a fault when collecting a
stack trace.
5) Partial restore of ESP when returning to a 16-bit segment
6) Limitation of the vector space which can cause vector exhaustion
on large systems.
7) Inability to differentiate NMI sources
FRED addresses these shortcomings by:
1) An extended exception stack frame which the CPU uses to save
exception cause registers. This ensures that the meta information
for each exception is preserved on stack and avoids the extra
complexity of preserving it in software.
2) Hardware interrupt stack switching is non-rewinding if a nested
exception uses the currently interrupt stack.
3) The entry points for kernel and user context are separate and GS
BASE handling which is required to establish kernel context for
per CPU variable access is done in hardware.
4) NMIs are now nesting protected. They are only reenabled on the
return from NMI.
5) FRED guarantees full restore of ESP
6) FRED does not put a limitation on the vector space by design
because it uses a central entry points for kernel and user space
and the CPUstores the entry type (exception, trap, interrupt,
syscall) on the entry stack along with the vector number. The
entry code has to demultiplex this information, but this removes
the vector space restriction.
The first hardware implementations will still have the current
restricted vector space because lifting this limitation requires
further changes to the local APIC.
7) FRED stores the vector number and meta information on stack which
allows having more than one NMI vector in future hardware when the
required local APIC changes are in place.
The series implements the initial FRED support by:
- Reworking the existing entry and IDT handling infrastructure to
accomodate for the alternative entry mechanism.
- Expanding the stack frame to accomodate for the extra 16 bytes FRED
requires to store context and meta information
- Providing FRED specific C entry points for events which have
information pushed to the extended stack frame, e.g. #PF and #DB.
- Providing FRED specific C entry points for #NMI and #MCE
- Implementing the FRED specific ASM entry points and the C code to
demultiplex the events
- Providing detection and initialization mechanisms and the necessary
tweaks in context switching, GS BASE handling etc.
The FRED integration aims for maximum code reuse vs the existing IDT
implementation to the extent possible and the deviation in hot paths
like context switching are handled with alternatives to minimalize the
impact. The low level entry and exit paths are seperate due to the
extended stack frame and the hardware based GS BASE swichting and
therefore have no impact on IDT based systems.
It has been extensively tested on existing systems and on the FRED
simulation and as of now there are no outstanding problems"
* tag 'x86-fred-2024-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
x86/fred: Fix init_task thread stack pointer initialization
MAINTAINERS: Add a maintainer entry for FRED
x86/fred: Fix a build warning with allmodconfig due to 'inline' failing to inline properly
x86/fred: Invoke FRED initialization code to enable FRED
x86/fred: Add FRED initialization functions
x86/syscall: Split IDT syscall setup code into idt_syscall_init()
KVM: VMX: Call fred_entry_from_kvm() for IRQ/NMI handling
x86/entry: Add fred_entry_from_kvm() for VMX to handle IRQ/NMI
x86/entry/calling: Allow PUSH_AND_CLEAR_REGS being used beyond actual entry code
x86/fred: Fixup fault on ERETU by jumping to fred_entrypoint_user
x86/fred: Let ret_from_fork_asm() jmp to asm_fred_exit_user when FRED is enabled
x86/traps: Add sysvec_install() to install a system interrupt handler
x86/fred: FRED entry/exit and dispatch code
x86/fred: Add a machine check entry stub for FRED
x86/fred: Add a NMI entry stub for FRED
x86/fred: Add a debug fault entry stub for FRED
x86/idtentry: Incorporate definitions/declarations of the FRED entries
x86/fred: Make exc_page_fault() work for FRED
x86/fred: Allow single-step trap and NMI when starting a new task
x86/fred: No ESPFIX needed when FRED is enabled
...
No point in checking again as this was already done by the caller.
Signed-off-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240222111636.2214523-3-nik.borisov@suse.com
It's pointless checking if a particular part of an instruction is
decoded before calling the routine responsible for decoding it as this
check is duplicated in the routines itself. Streamline the code by
removing the superfluous checks. No functional difference.
Signed-off-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240222111636.2214523-2-nik.borisov@suse.com
ERETU returns from an event handler while making a transition to ring 3,
and ERETS returns from an event handler while staying in ring 0.
Add instruction opcodes used by ERET[US] to the x86 opcode map; opcode
numbers are per FRED spec v5.0.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Shan Kang <shan.kang@intel.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20231205105030.8698-10-xin3.li@intel.com
This is to get the changes from:
94ea9c0521 ("x86/headers: Replace #include <asm/export.h> with #include <linux/export.h>")
10f4c9b9a3 ("x86/asm: Fix build of UML with KASAN")
That addresses these perf tools build warning:
Warning: Kernel ABI header differences:
diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Link: https://lore.kernel.org/lkml/ZbkIKpKdNqOFdMwJ@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
WRMSRNS is an instruction that behaves exactly like WRMSR, with
the only difference being that it is not a serializing instruction
by default. Under certain conditions, WRMSRNS may replace WRMSR to
improve performance.
Add its CPU feature bit, opcode to the x86 opcode map, and an
always inline API __wrmsrns() to embed WRMSRNS into the code.
Signed-off-by: Xin Li <xin3.li@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Shan Kang <shan.kang@intel.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20231205105030.8698-2-xin3.li@intel.com
This is to get the changes from:
68674f94ff ("x86: don't use REP_GOOD or ERMS for small memory copies")
20f3337d35 ("x86: don't use REP_GOOD or ERMS for small memory clearing")
This also make the 'perf bench mem' files stop referring to the erms
versions that gone away with the above patches.
That addresses these perf tools build warning:
Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
Warning: Kernel ABI header at 'tools/arch/x86/lib/memset_64.S' differs from latest version at 'arch/x86/lib/memset_64.S'
diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
- Add Adrian Hunter to MAINTAINERS as a perf tools reviewer.
- Sync various tools/ copies of kernel headers with the kernel sources, this
time trying to avoid first merging with upstream to then update but instead
copy from upstream so that a merge is avoided and the end result after merging
this pull request is the one expected, tools/perf/check-headers.sh (mostly)
happy, less warnings while building tools/perf/.
- Fix counting when initial delay configured by setting
perf_attr.enable_on_exec when starting workloads from the perf command line.
- Don't avoid emitting a PERF_RECORD_MMAP2 in 'perf inject --buildid-all' when
that record comes with a build-id, otherwise we end up not being able to
resolve symbols.
- Don't use comma as the CSV output separator the "stat+csv_output" test, as
comma can appear on some tests as a modifier for an event, use @ instead,
ditto for the JSON linter test.
- The offcpu test was looking for some bits being set on
task_struct->prev_state without masking other bits not important for this
specific 'perf test', fix it.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCZApKjQAKCRCyPKLppCJ+
JzdfAQDRnwDCxhb4cvx7lVR32L1XMIFW6qLWRBJWoxC2SJi6lgD/SoQgKswkxrJv
XnBP7jEaIsh3M3ak82MxLKbjSAEvnwk=
=jup7
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-fixes-for-v6.3-1-2023-03-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Add Adrian Hunter to MAINTAINERS as a perf tools reviewer
- Sync various tools/ copies of kernel headers with the kernel sources,
this time trying to avoid first merging with upstream to then update
but instead copy from upstream so that a merge is avoided and the end
result after merging this pull request is the one expected,
tools/perf/check-headers.sh (mostly) happy, less warnings while
building tools/perf/
- Fix counting when initial delay configured by setting
perf_attr.enable_on_exec when starting workloads from the perf
command line
- Don't avoid emitting a PERF_RECORD_MMAP2 in 'perf inject
--buildid-all' when that record comes with a build-id, otherwise we
end up not being able to resolve symbols
- Don't use comma as the CSV output separator the "stat+csv_output"
test, as comma can appear on some tests as a modifier for an event,
use @ instead, ditto for the JSON linter test
- The offcpu test was looking for some bits being set on
task_struct->prev_state without masking other bits not important for
this specific 'perf test', fix it
* tag 'perf-tools-fixes-for-v6.3-1-2023-03-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf tools: Add Adrian Hunter to MAINTAINERS as a reviewer
tools headers UAPI: Sync linux/perf_event.h with the kernel sources
tools headers x86 cpufeatures: Sync with the kernel sources
tools include UAPI: Sync linux/vhost.h with the kernel sources
tools arch x86: Sync the msr-index.h copy with the kernel sources
tools headers kvm: Sync uapi/{asm/linux} kvm.h headers with the kernel sources
tools include UAPI: Synchronize linux/fcntl.h with the kernel sources
tools headers: Synchronize {linux,vdso}/bits.h with the kernel sources
tools headers UAPI: Sync linux/prctl.h with the kernel sources
tools headers: Update the copy of x86's mem{cpy,set}_64.S used in 'perf bench'
perf stat: Fix counting when initial delay configured
tools headers svm: Sync svm headers with the kernel sources
perf test: Avoid counting commas in json linter
perf tests stat+csv_output: Switch CSV separator to @
perf inject: Fix --buildid-all not to eat up MMAP2
tools arch x86: Sync the msr-index.h copy with the kernel sources
perf test: Fix offcpu test prev_state check
We also continue with SYM_TYPED_FUNC_START() in util/include/linux/linkage.h
and with an exception in tools/perf/check_headers.sh's diff check to ignore
the include cfi_types.h line when checking if the kernel original files drifted
from the copies we carry.
This is to get the changes from:
69d4c0d321 ("entry, kasan, x86: Disallow overriding mem*() functions")
That addresses these perf tools build warning:
Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
Warning: Kernel ABI header at 'tools/arch/x86/lib/memset_64.S' differs from latest version at 'arch/x86/lib/memset_64.S'
diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/ZAH%2FjsioJXGIOrkf@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add the instruction opcode used by LKGS to x86-opcode-map.
Opcode number is per public FRED draft spec v3.0.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230112072032.35626-3-xin3.li@intel.com
We also need to add SYM_TYPED_FUNC_START() to util/include/linux/linkage.h
and update tools/perf/check_headers.sh to ignore the include cfi_types.h
line when checking if the kernel original files drifted from the copies
we carry.
This is to get the changes from:
ccace936ee ("x86: Add types to indirectly called assembly functions")
Addressing these tools/perf build warnings:
Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Sami Tolvanen <samitolvanen@google.com>
Link: https://lore.kernel.org/lkml/Y1f3VRIec9EBgX6F@kernel.org/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
AMX, other misc insns.
- Update VMware-specific MAINTAINERS entries
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmI4URIACgkQEsHwGGHe
VUob3A/9GFyqt9bBKrSaq9Rt1UVkq6dQhG3kO7dW5d0YDvy8JmR9is4rNDV9GGx6
A1OAue/gDlZFIz/829oS1qwjB7GZ4Rfb0gRo33bytDLLmd0BRXW7ioZ54jBRnWvy
8dZ2WruMmazK6uJxoHvtOA+Pt3ukb074CZZ1SfW344clWK6FJZeptyRclWaT1Py2
QOIJOxMraCdNAay/1ZvOdIqqdIPx5+JyzbHIYOWUFzwT4y+Q8kFNbigrJnqxe5Ij
aqRjzMIvt6MeLwbq9CfLsPFA3gaSzYeOkuXQPcqRgd5LU5ZyXBLStUrGEv1fsMvd
9Kh7VFycZPS7MKzxoEcbuJTTOR4cBsINOlbo9iWr7UD5pm5h7c3vc+nCyia+U+Xo
5XRpf8nitt4a3r1f6HxwXJS0OlBkS4CqexE2OejY4yhWRlxhMcIvRyquU+Z0J4Bp
mgDJuXSzfJfFcBzp4jjOBxGPNEjXXOdy/qc/1jR97eMmTKrk3gk/74NWUx9hw4oN
5RGeC+khAD13TL0yVQfKBe5HuLK5tHppAzXAnT2xi6qUn+VJjLxNWgg3iV9tbShM
4q5vJp3BmvNOY8HQv1R3IDFfN0IAL09Q9v6EzEroNuVUhEOzBdH7JSzWkvBBveZb
FVgD3I+wNBE1nQD3cP/6DGbRe1JG3ULDF95WJshB8gNJwavlZGs=
=f7VZ
-----END PGP SIGNATURE-----
Merge tag 'x86_misc_for_v5.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 updates from Borislav Petkov:
- Add support for a couple new insn sets to the insn decoder:
AVX512-FP16, AMX, other misc insns.
- Update VMware-specific MAINTAINERS entries
* tag 'x86_misc_for_v5.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
MAINTAINERS: Mark VMware mailing list entries as email aliases
MAINTAINERS: Add Zack as maintainer of vmmouse driver
MAINTAINERS: Update maintainers for paravirt ops and VMware hypervisor interface
x86/insn: Add AVX512-FP16 instructions to the x86 instruction decoder
perf/tests: Add AVX512-FP16 instructions to x86 instruction decoder test
x86/insn: Add misc instructions to x86 instruction decoder
perf/tests: Add misc instructions to the x86 instruction decoder test
x86/insn: Add AMX instructions to the x86 instruction decoder
perf/tests: Add AMX instructions to x86 instruction decoder test
Now that we have SYM_FUNC_ALIAS() and SYM_FUNC_ALIAS_WEAK(), use those
to simplify the definition of function aliases across arch/x86.
For clarity, where there are multiple annotations such as
EXPORT_SYMBOL(), I've tried to keep annotations grouped by symbol. For
example, where a function has a name and an alias which are both
exported, this is organised as:
SYM_FUNC_START(func)
... asm insns ...
SYM_FUNC_END(func)
EXPORT_SYMBOL(func)
SYM_FUNC_ALIAS(alias, func)
EXPORT_SYMBOL(alias)
Where there are only aliases and no exports or other annotations, I have
not bothered with line spacing, e.g.
SYM_FUNC_START(func)
... asm insns ...
SYM_FUNC_END(func)
SYM_FUNC_ALIAS(alias, func)
The tools/perf/ copies of memset_64.S and memset_64.S are updated
likewise to avoid the build system complaining these are mismatched:
| Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
| diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
| Warning: Kernel ABI header at 'tools/arch/x86/lib/memset_64.S' differs from latest version at 'arch/x86/lib/memset_64.S'
| diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Mark Brown <broonie@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220216162229.1076788-4-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
The x86 instruction decoder is used for both kernel instructions and
user space instructions (e.g. uprobes, perf tools Intel PT), so it is
good to update it with new instructions.
Add AVX512-FP16 instructions to x86 instruction decoder.
Note the EVEX map field is extended by 1 bit, and most instructions are in
map 5 and map 6.
Reference:
Intel AVX512-FP16 Architecture Specification
June 2021
Revision 1.0
Document Number: 347407-001US
Example using perf tools' x86 instruction decoder test:
$ perf test -v "x86 instruction decoder" |& grep vfcmaddcph | head -2
Decoded ok: 62 f6 6f 48 56 cb vfcmaddcph %zmm3,%zmm2,%zmm1
Decoded ok: 62 f6 6f 48 56 8c c8 78 56 34 12 vfcmaddcph 0x12345678(%eax,%ecx,8),%zmm2,%zmm1
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20211202095029.2165714-7-adrian.hunter@intel.com
x86 instruction decoder is used for both kernel instructions and user space
instructions (e.g. uprobes, perf tools Intel PT), so it is good to update
it with new instructions.
Add instructions to x86 instruction decoder:
User Interrupt
clui
senduipi
stui
testui
uiret
Prediction history reset
hreset
Serialize instruction execution
serialize
TSX suspend load address tracking
xresldtrk
xsusldtrk
Reference:
Intel Architecture Instruction Set Extensions and Future Features
Programming Reference
May 2021
Document Number: 319433-044
Example using perf tools' x86 instruction decoder test:
$ perf test -v "x86 instruction decoder" |& grep -i hreset
Decoded ok: f3 0f 3a f0 c0 00 hreset $0x0
Decoded ok: f3 0f 3a f0 c0 00 hreset $0x0
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20211202095029.2165714-5-adrian.hunter@intel.com
To bring in the change made in this cset:
f94909ceb1 ("x86: Prepare asm files for straight-line-speculation")
It silences these perf tools build warnings, no change in the tools:
Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
Warning: Kernel ABI header at 'tools/arch/x86/lib/memset_64.S' differs from latest version at 'arch/x86/lib/memset_64.S'
diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S
The code generated was checked before and after using 'objdump -d /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o',
no changes.
Cc: Borislav Petkov <bp@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use get_unaligned() instead of memcpy() to access potentially unaligned
memory, which, when accessed through a pointer, leads to undefined
behavior. get_unaligned() describes much better what is happening there
anyway even if memcpy() does the job.
In addition, since perf tool builds with -Werror, it would fire with:
util/intel-pt-decoder/../../../arch/x86/lib/insn.c: In function '__insn_get_emulate_prefix':
tools/include/../include/asm-generic/unaligned.h:10:15: error: packed attribute is unnecessary [-Werror=packed]
10 | const struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr); \
because -Werror=packed would complain if the packed attribute would have
no effect on the layout of the structure.
In this case, that is intentional so disable the warning only for that
compilation unit.
That part is Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
No functional changes.
Fixes: 5ba1071f75 ("x86/insn, tools/x86: Fix undefined behavior due to potential unaligned accesses")
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://lkml.kernel.org/r/YVSsIkj9Z29TyUjE@zn.tnic
Don't perform unaligned loads in __get_next() and __peek_nbyte_next() as
these are forms of undefined behavior:
"A pointer to an object or incomplete type may be converted to a pointer
to a different object or incomplete type. If the resulting pointer
is not correctly aligned for the pointed-to type, the behavior is
undefined."
(from http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf)
These problems were identified using the undefined behavior sanitizer
(ubsan) with the tools version of the code and perf test.
[ bp: Massage commit message. ]
Signed-off-by: Numfor Mbiziwo-Tiapo <nums@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lkml.kernel.org/r/20210923161843.751834-1-irogers@google.com
To bring in the change made in this cset:
5e21a3ecad ("x86/alternative: Merge include files")
This just silences these perf tools build warnings, no change in the tools:
Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
Warning: Kernel ABI header at 'tools/arch/x86/lib/memset_64.S' differs from latest version at 'arch/x86/lib/memset_64.S'
diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S
Cc: Borislav Petkov <bp@suse.de>
Cc: Juergen Gross <jgross@suse.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Building perf on ppc causes:
In file included from util/intel-pt-decoder/intel-pt-insn-decoder.c:15:
util/intel-pt-decoder/../../../arch/x86/lib/insn.c:14:10: fatal error: asm/inat.h: No such file or directory
14 | #include <asm/inat.h> /*__ignore_sync_check__ */
| ^~~~~~~~~~~~
Restore the relative include paths so that the compiler can find the
headers.
Fixes: 93281c4a96 ("x86/insn: Add an insn_decode() API")
Reported-by: Ian Rogers <irogers@google.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Ian Rogers <irogers@google.com>
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://lkml.kernel.org/r/20210317150858.02b1bbc8@canb.auug.org.au
Users of the instruction decoder should use this to decode instruction
bytes. For that, have insn*() helpers return an int value to denote
success/failure. When there's an error fetching the next insn byte and
the insn falls short, return -ENODATA to denote that.
While at it, make insn_get_opcode() more stricter as to whether what has
seen so far is a valid insn and if not.
Copy linux/kconfig.h for the tools-version of the decoder so that it can
use IS_ENABLED().
Also, cast the INSN_MODE_KERN dummy define value to (enum insn_mode)
for tools use of the decoder because perf tool builds with -Werror and
errors out with -Werror=sign-compare otherwise.
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lkml.kernel.org/r/20210304174237.31945-5-bp@alien8.de
Add an explicit __ignore_sync_check__ marker which will be used to mark
lines which are supposed to be ignored by file synchronization check
scripts, its advantage being that it explicitly denotes such lines in
the code.
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lkml.kernel.org/r/20210304174237.31945-4-bp@alien8.de
Running instruction decoder posttest on an s390 host with an x86 target
with allyesconfig shows errors. Instructions used in a couple of kernel
objects could not be correctly decoded on big endian system.
insn_decoder_test: warning: objdump says 6 bytes, but insn_get_length() says 5
insn_decoder_test: warning: Found an x86 instruction decoder bug, please report this.
insn_decoder_test: warning: ffffffff831eb4e1: 62 d1 fd 48 7f 04 24 vmovdqa64 %zmm0,(%r12)
insn_decoder_test: warning: objdump says 7 bytes, but insn_get_length() says 6
insn_decoder_test: warning: Found an x86 instruction decoder bug, please report this.
insn_decoder_test: warning: ffffffff831eb4e8: 62 51 fd 48 7f 44 24 01 vmovdqa64 %zmm8,0x40(%r12)
insn_decoder_test: warning: objdump says 8 bytes, but insn_get_length() says 6
This is because in a few places instruction field bytes are set directly
with further usage of "value". To address that introduce and use a
insn_set_byte() helper, which correctly updates "value" on big endian
systems.
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
The x86 instruction decoder code is shared across the kernel source and
the tools. Currently objtool seems to be the only tool from build tools
needed which breaks x86 cross-compilation on big endian systems. Make
the x86 instruction decoder build host endianness agnostic to support
x86 cross-compilation and enable objtool to implement endianness
awareness for big endian architectures support.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Co-developed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
To bring in the change made in this cset:
4d6ffa27b8 ("x86/lib: Change .weak to SYM_FUNC_START_WEAK for arch/x86/lib/mem*_64.S")
6dcc5627f6 ("x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_*")
I needed to define SYM_FUNC_START_LOCAL() as SYM_L_GLOBAL as
mem{cpy,set}_{orig,erms} are used by 'perf bench'.
This silences these perf tools build warnings:
Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
Warning: Kernel ABI header at 'tools/arch/x86/lib/memset_64.S' differs from latest version at 'arch/x86/lib/memset_64.S'
diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Fangrui Song <maskray@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In reaction to a proposal to introduce a memcpy_mcsafe_fast()
implementation Linus points out that memcpy_mcsafe() is poorly named
relative to communicating the scope of the interface. Specifically what
addresses are valid to pass as source, destination, and what faults /
exceptions are handled.
Of particular concern is that even though x86 might be able to handle
the semantics of copy_mc_to_user() with its common copy_user_generic()
implementation other archs likely need / want an explicit path for this
case:
On Fri, May 1, 2020 at 11:28 AM Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> On Thu, Apr 30, 2020 at 6:21 PM Dan Williams <dan.j.williams@intel.com> wrote:
> >
> > However now I see that copy_user_generic() works for the wrong reason.
> > It works because the exception on the source address due to poison
> > looks no different than a write fault on the user address to the
> > caller, it's still just a short copy. So it makes copy_to_user() work
> > for the wrong reason relative to the name.
>
> Right.
>
> And it won't work that way on other architectures. On x86, we have a
> generic function that can take faults on either side, and we use it
> for both cases (and for the "in_user" case too), but that's an
> artifact of the architecture oddity.
>
> In fact, it's probably wrong even on x86 - because it can hide bugs -
> but writing those things is painful enough that everybody prefers
> having just one function.
Replace a single top-level memcpy_mcsafe() with either
copy_mc_to_user(), or copy_mc_to_kernel().
Introduce an x86 copy_mc_fragile() name as the rename for the
low-level x86 implementation formerly named memcpy_mcsafe(). It is used
as the slow / careful backend that is supplanted by a fast
copy_mc_generic() in a follow-on patch.
One side-effect of this reorganization is that separating copy_mc_64.S
to its own file means that perf no longer needs to track dependencies
for its memcpy_64.S benchmarks.
[ bp: Massage a bit. ]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: <stable@vger.kernel.org>
Link: http://lore.kernel.org/r/CAHk-=wjSqtXAqfUJxFtWNwmguFASTgB0dz1dT3V-78Quiezqbg@mail.gmail.com
Link: https://lkml.kernel.org/r/160195561680.2163339.11574962055305783722.stgit@dwillia2-desk3.amr.corp.intel.com
To bring in the change made in this cset:
e3a9e681ad ("x86/entry: Fixup bad_iret vs noinstr")
This doesn't cause any functional changes to tooling, just a rebuild.
Addresses this perf build warning:
Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add the following CET instructions to the opcode map:
INCSSP:
Increment Shadow Stack pointer (SSP).
RDSSP:
Read SSP into a GPR.
SAVEPREVSSP:
Use "previous ssp" token at top of current Shadow Stack (SHSTK) to
create a "restore token" on the previous (outgoing) SHSTK.
RSTORSSP:
Restore from a "restore token" to SSP.
WRSS:
Write to kernel-mode SHSTK (kernel-mode instruction).
WRUSS:
Write to user-mode SHSTK (kernel-mode instruction).
SETSSBSY:
Verify the "supervisor token" pointed by MSR_IA32_PL0_SSP, set the
token busy, and set then Shadow Stack pointer(SSP) to the value of
MSR_IA32_PL0_SSP.
CLRSSBSY:
Verify the "supervisor token" and clear its busy bit.
ENDBR64/ENDBR32:
Mark a valid 64/32 bit control transfer endpoint.
Detailed information of CET instructions can be found in Intel Software
Developer's Manual.
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lkml.kernel.org/r/20200204171425.28073-2-yu-cheng.yu@intel.com
Add TEST opcode to Group3-2 reg=001b as same as Group3-1 does.
Commit
12a78d43de ("x86/decoder: Add new TEST instruction pattern")
added a TEST opcode assignment to f6 XX/001/XXX (Group 3-1), but did
not add f7 XX/001/XXX (Group 3-2).
Actually, this TEST opcode variant (ModRM.reg /1) is not described in
the Intel SDM Vol2 but in AMD64 Architecture Programmer's Manual Vol.3,
Appendix A.2 Table A-6. ModRM.reg Extensions for the Primary Opcode Map.
Without this fix, Randy found a warning by insn_decoder_test related
to this issue as below.
HOSTCC arch/x86/tools/insn_decoder_test
HOSTCC arch/x86/tools/insn_sanity
TEST posttest
arch/x86/tools/insn_decoder_test: warning: Found an x86 instruction decoder bug, please report this.
arch/x86/tools/insn_decoder_test: warning: ffffffff81000bf1: f7 0b 00 01 08 00 testl $0x80100,(%rbx)
arch/x86/tools/insn_decoder_test: warning: objdump says 6 bytes, but insn_get_length() says 2
arch/x86/tools/insn_decoder_test: warning: Decoded and checked 11913894 instructions with 1 failures
TEST posttest
arch/x86/tools/insn_sanity: Success: decoded and checked 1000000 random instructions with 0 errors (seed:0x871ce29c)
To fix this error, add the TEST opcode according to AMD64 APM Vol.3.
[ bp: Massage commit message. ]
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lkml.kernel.org/r/157966631413.9580.10311036595431878351.stgit@devnote2
And update linux/linkage.h, which requires in turn that we make these
files switch from ENTRY()/ENDPROC() to SYM_FUNC_START()/SYM_FUNC_END():
tools/perf/arch/arm64/tests/regs_load.S
tools/perf/arch/arm/tests/regs_load.S
tools/perf/arch/powerpc/tests/regs_load.S
tools/perf/arch/x86/tests/regs_load.S
We also need to switch SYM_FUNC_START_LOCAL() to SYM_FUNC_START() for
the functions used directly by 'perf bench', and update
tools/perf/check_headers.sh to ignore those changes when checking if the
kernel original files drifted from the copies we carry.
This is to get the changes from:
6dcc5627f6 ("x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_*")
ef1e03152c ("x86/asm: Make some functions local")
e9b9d020c4 ("x86/asm: Annotate aliases")
And address these tools/perf build warnings:
Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
Warning: Kernel ABI header at 'tools/arch/x86/lib/memset_64.S' differs from latest version at 'arch/x86/lib/memset_64.S'
diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-tay3l8x8k11p7y3qcpqh9qh5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Decode Xen and KVM's emulate-prefix signature by x86 insn decoder.
It is called "prefix" but actually not x86 instruction prefix, so
this adds insn.emulate_prefix_size field instead of reusing
insn.prefixes.
If x86 decoder finds a special sequence of instructions of
XEN_EMULATE_PREFIX and 'ud2a; .ascii "kvm"', it just counts the
length, set insn.emulate_prefix_size and fold it with the next
instruction. In other words, the signature and the next instruction
is treated as a single instruction.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: x86@kernel.org
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: xen-devel@lists.xenproject.org
Cc: Randy Dunlap <rdunlap@infradead.org>
Link: https://lkml.kernel.org/r/156777564986.25081.4964537658500952557.stgit@devnote2
Now that there's a common version of the decoder for all tools, use it
instead of the local copy.
Also use perf's check-headers.sh script to diff the decoder files to
make sure they remain in sync with the kernel version. Objtool has a
similar check.
Committer notes:
Had to keep this all pointing explicitely to x86 headers/files, i.e.
instead of asm/isnn.h we had to use ../include/asm/insn.h when the files
were in differemt dirs, or just replace "<asm/foo.h>" with "foo.h".
This way we continue to be able to process perf.data files with Intel PT
traces in distros other than x86.
Also fixed up the awk script paths to use $(srcdir)/tools/arch instead
or relative directories so that we keep detached tarballs (make help |
grep perf) working.
For now the include lines in these headers are being ignored so as not
to flag false reports of kernel/tools out of sync.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/8a37e615d2880f039505d693d1e068a009358a2b.1567118001.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The kernel tree has three identical copies of the x86 instruction
decoder. Two of them are in the tools subdir.
The tools subdir is supposed to be completely standalone and separate
from the kernel. So having at least one copy of the kernel decoder in
the tools subdir is unavoidable. However, we don't need *two* of them.
Move objtool's copy of the decoder to a shared location, so that perf
will also be able to use it.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/55b486b88f6bcd0c9a2a04b34f964860c8390ca8.1567118001.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add SPDX license identifiers to all files which:
- Have no license information of any form
- Have EXPORT_.*_SYMBOL_GPL inside which was used in the
initial scan/conversion to ignore the file
These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:
GPL-2.0-only
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To bring in the change made in this cset:
b69656fa7e ("x86/uaccess: Fix up the fixup")
Silencing this perf build warning:
Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
No changes in the tooling using this, that was just to ease some objtool
return checking.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/n/tip-j0mxgqkuibhw5qid9saaspdu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To bring in the change made in this cset:
Fixes: a7bea83089 ("x86/asm/64: Use 32-bit XOR to zero registers")
CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
LD /tmp/build/perf/bench/perf-in.o
LD /tmp/build/perf/perf-in.o
LINK /tmp/build/perf/perf
Silencing this perf build warning:
Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-sad22dudoz71qr3tsnlqtkia@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To cope with the changes in:
12c89130a5 ("x86/asm/memcpy_mcsafe: Add write-protection-fault handling")
60622d6822 ("x86/asm/memcpy_mcsafe: Return bytes remaining")
bd131544aa ("x86/asm/memcpy_mcsafe: Add labels for __memcpy_mcsafe() write fault handling")
da7bc9c57e ("x86/asm/memcpy_mcsafe: Remove loop unrolling")
This needed introducing a file with a copy of the mcsafe_handle_tail()
function, that is used in the new memcpy_64.S file, as well as a dummy
mcsafe_test.h header.
Testing it:
$ nm ~/bin/perf | grep mcsafe
0000000000484130 T mcsafe_handle_tail
0000000000484300 T __memcpy_mcsafe
$
$ perf bench mem memcpy
# Running 'mem/memcpy' benchmark:
# function 'default' (Default memcpy() provided by glibc)
# Copying 1MB bytes ...
44.389205 GB/sec
# function 'x86-64-unrolled' (unrolled memcpy() in arch/x86/lib/memcpy_64.S)
# Copying 1MB bytes ...
22.710756 GB/sec
# function 'x86-64-movsq' (movsq-based memcpy() in arch/x86/lib/memcpy_64.S)
# Copying 1MB bytes ...
42.459239 GB/sec
# function 'x86-64-movsb' (movsb-based memcpy() in arch/x86/lib/memcpy_64.S)
# Copying 1MB bytes ...
42.459239 GB/sec
$
This silences this perf tools build warning:
Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mika Penttilä <mika.penttila@nextfour.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-igdpciheradk3gb3qqal52d0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
After the SPDX license tags were added a number of tooling headers got out of
sync with their kernel variants, generating lots of build warnings.
Sync them:
- tools/arch/x86/include/asm/disabled-features.h,
tools/arch/x86/include/asm/required-features.h,
tools/include/linux/hash.h:
Remove the SPDX tag where the kernel version does not have it.
- tools/include/asm-generic/bitops/__fls.h,
tools/include/asm-generic/bitops/arch_hweight.h,
tools/include/asm-generic/bitops/const_hweight.h,
tools/include/asm-generic/bitops/fls.h,
tools/include/asm-generic/bitops/fls64.h,
tools/include/uapi/asm-generic/ioctls.h,
tools/include/uapi/asm-generic/mman-common.h,
tools/include/uapi/sound/asound.h,
tools/include/uapi/linux/kvm.h,
tools/include/uapi/linux/perf_event.h,
tools/include/uapi/linux/sched.h,
tools/include/uapi/linux/vhost.h,
tools/include/uapi/sound/asound.h:
Add the SPDX tag of the respective kernel header.
- tools/include/uapi/linux/bpf_common.h,
tools/include/uapi/linux/fcntl.h,
tools/include/uapi/linux/hw_breakpoint.h,
tools/include/uapi/linux/mman.h,
tools/include/uapi/linux/stat.h,
Change the tag to the kernel header version:
-/* SPDX-License-Identifier: GPL-2.0 */
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
Also sync other header details:
- include/uapi/sound/asound.h:
Fix pointless end of line whitespace noise the header grew in this cycle.
- tools/arch/x86/lib/memcpy_64.S:
Sync the code and add tools/include/asm/export.h with dummy wrappers
to support building the kernel side code in a tooling header environment.
- tools/include/uapi/asm-generic/mman.h,
tools/include/uapi/linux/bpf.h:
Sync other details that don't impact tooling's use of the ABIs.
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 9a6fb28a35 ("x86/mce: Improve memcpy_mcsafe()") renames
memcpy_mcsafe() to memcpy_mcsafe_unrolled(), making
tools/arch/x86/lib/memcpy_64.S drift from the its kernel counterpart,
triggering this warning in the perf build:
Warning: tools/arch/x86/lib/memcpy_64.S differs from kernel
Sync that copy to acknowledge that, no changes to 'perf bench' are
needed, as this function is not used there.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-xfwc1raw8obyrctxerwt1bbb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We can't access kernel files directly from tools/, so copy the required
bits, and make sure that we detect when the original files, in the
kernel, gets modified.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-z7e76274ch5j4nugv048qacb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>