Commit Graph

7698 Commits

Author SHA1 Message Date
Linus Torvalds
f0b13428c9 Merge branch 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/cache updates from Thomas Gleixner:
 "A set of patches which add support for L2 cache partitioning to the
  Intel RDT facility"

* 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/intel_rdt: Add command line parameter to control L2_CDP
  x86/intel_rdt: Enable L2 CDP in MSR IA32_L2_QOS_CFG
  x86/intel_rdt: Add two new resources for L2 Code and Data Prioritization (CDP)
  x86/intel_rdt: Enumerate L2 Code and Data Prioritization (CDP) feature
  x86/intel_rdt: Add L2CDP support in documentation
  x86/intel_rdt: Update documentation
2018-01-29 17:48:22 -08:00
Linus Torvalds
1a9a126b50 ACPI updates for v4.16-rc1
- Update the ACPICA kernel code to upstream revision 20171215 including:
    * Support for ACPI 6.0A changes in the NFIT table (Bob Moore).
    * Local 64-bit divide in string conversions (Bob Moore).
    * Fix for a regression in acpi_evaluate_object_type() (Bob Moore).
    * Fixes for memory leaks during package object resolution (Bob Moore).
    * Deployment of safe version of strncpy() (Bob Moore).
    * Debug and messaging updates (Bob Moore).
    * Support for PDTT, SDEV, TPM2 tables in iASL and tools (Bob Moore).
    * Null pointer dereference avoidance in Op and cleanups (Colin Ian King).
    * Fix for memory leak from building prefixed pathname (Erik Schmauss).
    * Coding style fixes, disassembler and compiler updates (Hanjun Guo,
      Erik Schmauss).
    * Additional PPTT flags from ACPI 6.2 (Jeremy Linton).
    * Fix for an off-by-one error in acpi_get_timer_duration() (Jung-uk Kim).
    * Infinite loop detection timeout and utilities cleanups (Lv Zheng).
    * Windows 10 version 1607 and 1703 OSI strings (Mario Limonciello).
 
  - Update ACPICA information in MAINTAINERS to reflect the current
    status of ACPICA maintenance and rename a local variable in one
    function to match the corresponding upstream code (Rafael Wysocki).
 
  - Clean up ACPI-related initialization on x86 (Andy Shevchenko).
 
  - Add support for Intel Merrifield to the ACPI GPIO code (Andy
    Shevchenko).
 
  - Clean up ACPI PMIC drivers (Andy Shevchenko, Arvind Yadav).
 
  - Fix the ACPI Generic Event Device (GED) driver to free IRQs on
    shutdown and clean up the PCI IRQ Link driver (Sinan Kaya).
 
  - Make the GHES code call into the AER driver on all errors and
    clean up the ACPI APEI code (Colin Ian King, Tyler Baicar).
 
  - Make the IA64 ACPI NUMA code parse all SRAT entries (Ganapatrao
    Kulkarni).
 
  - Add a lid switch blacklist to the ACPI button driver and make it
    print extra debug messages on lid events (Hans de Goede).
 
  - Add quirks for Asus GL502VSK and UX305LA to the ACPI battery
    driver and clean it up somewhat (Bjørn Mork, Kai-Heng Feng).
 
  - Add device link for CHT SD card dependency on I2C to the ACPI
    LPSS (Intel SoCs) driver and make it avoid creating platform
    device objects for devices without MMIO resources (Adrian Hunter,
    Hans de Goede).
 
  - Fix the ACPI GPE mask kernel command line parameter handling
    (Prarit Bhargava).
 
  - Fix the handling of (incorrectly exposed) backlight interfaces
    without LCD (Hans de Goede).
 
  - Fix the usage of debugfs_create_*() in the ACPI EC driver (Geert
    Uytterhoeven).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJaY/BrAAoJEILEb/54YlRxR10P/1dVxfhLiGBrwKzA1urr71Vg
 LH6ZdlIlihyu9a1PZHjfO72IuZCMSkSnoJUPJPFK6FNA0hIDqsP+hC8gcknCxnAU
 i6r2ZzQesOzzjGblpASvdDg0GkYe9r6sHpUQ0xW/hnijamforflGveW1bagbnFuI
 gvT6m6+lMJwBd0NrWhQiTJmTuSTwgJBXDA+HhlDnGd6ziVfHPaCxon4L9GQfVhsb
 jbOI/kBjnEKoN1dBbEAcSpgzklVUXUj4x2NHUMCyvKOJyKG/F7Ycbghux9t3C+ej
 1T0XJAU7K3hkmstWkWwylqVZt3UW47xiJKe6K2Z5p3CaJx0cnI18C+g7x/IcRGiA
 +J/Uco+xeMa8yqYV96j+AJexpUDu7fYo6B4nRZ/K+MjWifboeSLKn8PHLKhqYn6k
 sV3s0dUf8SJK5pTu+IkAgzDzsw/uJAI8Rylmig9ea12/nIt6EH3Kero31hi3lkoN
 Y2rdi9MIqFIj2tX42047Y/q2UEFkMWGO3q8fLkXRvWPwnwStHDDFVj/kd19CWcTy
 B1kNxNQQS/Q9u0uoW4rIHW6ipEU2sqyt/tvVQnmJlVP0HuO++uIO7h3mrBDxhSJS
 zQF4qtusbMHC85BHrozmGFECN8Cex8DYSUTO/yCoBvMlMxJ7UlSt25TBN+SzmnOV
 H1LhVRcFh1488lXzuM+u
 =/eCQ
 -----END PGP SIGNATURE-----

Merge tag 'acpi-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI updates from Rafael Wysocki:
 "The majority of this is an update of the ACPICA kernel code to
  upstream revision 20171215 with a cosmetic change and a maintainers
  information update on top of it.

  The rest is mostly some minor fixes and cleanups in the ACPI drivers
  and cleanups to initialization on x86.

  Specifics:

   - Update the ACPICA kernel code to upstream revision 20171215 including:
      * Support for ACPI 6.0A changes in the NFIT table (Bob Moore)
      * Local 64-bit divide in string conversions (Bob Moore)
      * Fix for a regression in acpi_evaluate_object_type() (Bob Moore)
      * Fixes for memory leaks during package object resolution (Bob
        Moore)
      * Deployment of safe version of strncpy() (Bob Moore)
      * Debug and messaging updates (Bob Moore)
      * Support for PDTT, SDEV, TPM2 tables in iASL and tools (Bob
        Moore)
      * Null pointer dereference avoidance in Op and cleanups (Colin Ian
        King)
      * Fix for memory leak from building prefixed pathname (Erik
        Schmauss)
      * Coding style fixes, disassembler and compiler updates (Hanjun
        Guo, Erik Schmauss)
      * Additional PPTT flags from ACPI 6.2 (Jeremy Linton)
      * Fix for an off-by-one error in acpi_get_timer_duration()
        (Jung-uk Kim)
      * Infinite loop detection timeout and utilities cleanups (Lv
        Zheng)
      * Windows 10 version 1607 and 1703 OSI strings (Mario
        Limonciello)

   - Update ACPICA information in MAINTAINERS to reflect the current
     status of ACPICA maintenance and rename a local variable in one
     function to match the corresponding upstream code (Rafael Wysocki)

   - Clean up ACPI-related initialization on x86 (Andy Shevchenko)

   - Add support for Intel Merrifield to the ACPI GPIO code (Andy
     Shevchenko)

   - Clean up ACPI PMIC drivers (Andy Shevchenko, Arvind Yadav)

   - Fix the ACPI Generic Event Device (GED) driver to free IRQs on
     shutdown and clean up the PCI IRQ Link driver (Sinan Kaya)

   - Make the GHES code call into the AER driver on all errors and clean
     up the ACPI APEI code (Colin Ian King, Tyler Baicar)

   - Make the IA64 ACPI NUMA code parse all SRAT entries (Ganapatrao
     Kulkarni)

   - Add a lid switch blacklist to the ACPI button driver and make it
     print extra debug messages on lid events (Hans de Goede)

   - Add quirks for Asus GL502VSK and UX305LA to the ACPI battery driver
     and clean it up somewhat (Bjørn Mork, Kai-Heng Feng)

   - Add device link for CHT SD card dependency on I2C to the ACPI LPSS
     (Intel SoCs) driver and make it avoid creating platform device
     objects for devices without MMIO resources (Adrian Hunter, Hans de
     Goede)

   - Fix the ACPI GPE mask kernel command line parameter handling
     (Prarit Bhargava)

   - Fix the handling of (incorrectly exposed) backlight interfaces
     without LCD (Hans de Goede)

   - Fix the usage of debugfs_create_*() in the ACPI EC driver (Geert
     Uytterhoeven)"

* tag 'acpi-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (62 commits)
  ACPI/PCI: pci_link: reduce verbosity when IRQ is enabled
  ACPI / LPSS: Do not instiate platform_dev for devs without MMIO resources
  ACPI / PMIC: Convert to use builtin_platform_driver() macro
  ACPI / x86: boot: Propagate error code in acpi_gsi_to_irq()
  ACPICA: Update version to 20171215
  ACPICA: trivial style fix, no functional change
  ACPICA: Fix a couple memory leaks during package object resolution
  ACPICA: Recognize the Windows 10 version 1607 and 1703 OSI strings
  ACPICA: DT compiler: prevent error if optional field at the end of table is not present
  ACPICA: Rename a global variable, no functional change
  ACPICA: Create and deploy safe version of strncpy
  ACPICA: Cleanup the global variables and update comments
  ACPICA: Debugger: fix slight indentation issue
  ACPICA: Fix a regression in the acpi_evaluate_object_type() interface
  ACPICA: Update for a few debug output statements
  ACPICA: Debug output, no functional change
  ACPI: EC: Fix debugfs_create_*() usage
  ACPI / video: Default lcd_only to true on Win8-ready and newer machines
  ACPI / x86: boot: Don't setup SCI on HW-reduced platforms
  ACPI / x86: boot: Use INVALID_ACPI_IRQ instead of 0 for acpi_sci_override_gsi
  ...
2018-01-29 10:17:53 -08:00
Linus Torvalds
49f9c3552c init_task out-of-lining
-----BEGIN PGP SIGNATURE-----
 
 iQIVAwUAWl80tvSw1s6N8H32AQJq8A//ViRN5fExrd678Eh2Bz1ytrJYMUfYY3Hv
 QTH5TH9zFyLFyWLB1Iwe13sdLVTTM88O0qcDb54Lx9fWUqeMZyYvBhLtWPc00lTU
 0m3EyYR87MFWaEV+VxaVWgWaWkMDkd39KubDitcS+YIBDszTuMpYodhPUsgLt7lr
 pePX7eurXKdQPTh4NUOjGA2NaZot3tga76J6D8NKruGYUstQCGxpP1ryiFfACnwf
 NLWNO8ZBMtlDwX1mHYOOMFMaBzFzXorPm7jY4HJDf3mUM84xI3ach6CuH9RTSzfq
 A+qB1U3QILPVFo2HtqOHui4bFjRwqOf6uIrI/KcnioJ37w1O+KFcMJeDnX2I211q
 f2lXehJLQA7kPmxQw8T3//HDRaLXc0Qxt7IPZRFinrlkcN4oh3DD5euMfCFBSoZG
 PTbjxlgMfzJPoZtqAcy0rV5L54a/F4h915OQPJCKLwujIsXD2nT993vNmGDyq4zh
 BzNMxSXJC8p+jYvQpNhWyyxwDBBT/YsVQo/ACwg4eJnD3blVTAioRT9ZZcAcsY0F
 0z1eWW5RiknzIaXQWvjfK0gYKpO+aMSu9+gipHfMbU3yXG+sPj/H6zAHYzqX3uQZ
 jb5Iujjnu49W/YD+RiMenuu59lNXUnLSeRnlV7dw0qxGK1FzGo24+ZzKFhJhKvzG
 tdfUsev1Mc8=
 =jhWg
 -----END PGP SIGNATURE-----

Merge tag 'init_task-20180117' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull init_task initializer cleanups from David Howells:
 "It doesn't seem useful to have the init_task in a header file rather
  than in a normal source file. We could consolidate init_task handling
  instead and expand out various macros.

  Here's a series of patches that consolidate init_task handling:

   (1) Make THREAD_SIZE available to vmlinux.lds for cris, hexagon and
       openrisc.

   (2) Alter the INIT_TASK_DATA linker script macro to set
       init_thread_union and init_stack rather than defining these in C.

       Insert init_task and init_thread_into into the init_stack area in
       the linker script as appropriate to the configuration, with
       different section markers so that they end up correctly ordered.

       We can then get merge ia64's init_task.c into the main one.

       We then have a bunch of single-use INIT_*() macros that seem only
       to be macros because they used to be used per-arch. We can then
       expand these in place of the user and get rid of a few lines and
       a lot of backslashes.

   (3) Expand INIT_TASK() in place.

   (4) Expand in place various small INIT_*() macros that are defined
       conditionally. Expand them and surround them by #if[n]def/#endif
       in the .c file as it takes fewer lines.

   (5) Expand INIT_SIGNALS() and INIT_SIGHAND() in place.

   (6) Expand INIT_STRUCT_PID in place.

  These macros can then be discarded"

* tag 'init_task-20180117' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  Expand INIT_STRUCT_PID and remove
  Expand the INIT_SIGNALS and INIT_SIGHAND macros and remove
  Expand various INIT_* macros and remove
  Expand INIT_TASK() in init/init_task.c and remove
  Construct init thread stack in the linker script rather than by union
  openrisc: Make THREAD_SIZE available to vmlinux.lds
  hexagon: Make THREAD_SIZE available to vmlinux.lds
  cris: Make THREAD_SIZE available to vmlinux.lds
2018-01-29 09:08:34 -08:00
Linus Torvalds
24b1cccf92 Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 retpoline fixlet from Thomas Gleixner:
 "Remove the ESP/RSP thunks for retpoline as they cannot ever work.

  Get rid of them before they show up in a release"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/retpoline: Remove the esp/rsp thunk
2018-01-28 12:24:36 -08:00
Waiman Long
1df37383a8 x86/retpoline: Remove the esp/rsp thunk
It doesn't make sense to have an indirect call thunk with esp/rsp as
retpoline code won't work correctly with the stack pointer register.
Removing it will help compiler writers to catch error in case such
a thunk call is emitted incorrectly.

Fixes: 76b043848f ("x86/retpoline: Add initial retpoline support")
Suggested-by: Jeff Law <law@redhat.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Kees Cook <keescook@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1516658974-27852-1-git-send-email-longman@redhat.com
2018-01-24 12:31:55 +01:00
Linus Torvalds
5515114211 Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 pti fixes from Thomas Gleixner:
 "A small set of fixes for the meltdown/spectre mitigations:

   - Make kprobes aware of retpolines to prevent probes in the retpoline
     thunks.

   - Make the machine check exception speculation protected. MCE used to
     issue an indirect call directly from the ASM entry code. Convert
     that to a direct call into a C-function and issue the indirect call
     from there so the compiler can add the retpoline protection,

   - Make the vmexit_fill_RSB() assembly less stupid

   - Fix a typo in the PTI documentation"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/retpoline: Optimize inline assembler for vmexit_fill_RSB
  x86/pti: Document fix wrong index
  kprobes/x86: Disable optimizing on the function jumps to indirect thunk
  kprobes/x86: Blacklist indirect thunk functions for kprobes
  retpoline: Introduce start/end markers of indirect thunk
  x86/mce: Make machine check speculation protected
2018-01-21 10:48:35 -08:00
Andi Kleen
3f7d875566 x86/retpoline: Optimize inline assembler for vmexit_fill_RSB
The generated assembler for the C fill RSB inline asm operations has
several issues:

- The C code sets up the loop register, which is then immediately
  overwritten in __FILL_RETURN_BUFFER with the same value again.

- The C code also passes in the iteration count in another register, which
  is not used at all.

Remove these two unnecessary operations. Just rely on the single constant
passed to the macro for the iterations.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Cc: dave.hansen@intel.com
Cc: gregkh@linuxfoundation.org
Cc: torvalds@linux-foundation.org
Cc: arjan@linux.intel.com
Link: https://lkml.kernel.org/r/20180117225328.15414-1-andi@firstfloor.org
2018-01-19 16:31:30 +01:00
Masami Hiramatsu
736e80a421 retpoline: Introduce start/end markers of indirect thunk
Introduce start/end markers of __x86_indirect_thunk_* functions.
To make it easy, consolidate .text.__x86.indirect_thunk.* sections
to one .text.__x86.indirect_thunk section and put it in the
end of kernel text section and adds __indirect_thunk_start/end
so that other subsystem (e.g. kprobes) can identify it.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/151629206178.10241.6828804696410044771.stgit@devbox
2018-01-19 16:31:28 +01:00
Thomas Gleixner
6f41c34d69 x86/mce: Make machine check speculation protected
The machine check idtentry uses an indirect branch directly from the low
level code. This evades the speculation protection.

Replace it by a direct call into C code and issue the indirect call there
so the compiler can apply the proper speculation protection.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by:Borislav Petkov <bp@alien8.de>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Niced-by: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801181626290.1847@nanos
2018-01-19 16:31:28 +01:00
Fenghua Yu
a511e79353 x86/intel_rdt: Enumerate L2 Code and Data Prioritization (CDP) feature
L2 Code and Data Prioritization (CDP) is enumerated in
CPUID(EAX=0x10, ECX=0x2):ECX.bit2

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Cc: "Tony Luck" <tony.luck@intel.com>
Cc: Vikas" <vikas.shivappa@intel.com>
Cc: Sai Praneeth" <sai.praneeth.prakhya@intel.com>
Cc: Reinette" <reinette.chatre@intel.com>
Link: https://lkml.kernel.org/r/1513810644-78015-4-git-send-email-fenghua.yu@intel.com
2018-01-18 09:33:30 +01:00
Rafael J. Wysocki
0c81e26e86 Merge branches 'acpi-x86', 'acpi-apei' and 'acpi-ec'
* acpi-x86:
  ACPI / x86: boot: Propagate error code in acpi_gsi_to_irq()
  ACPI / x86: boot: Don't setup SCI on HW-reduced platforms
  ACPI / x86: boot: Use INVALID_ACPI_IRQ instead of 0 for acpi_sci_override_gsi
  ACPI / x86: boot: Get rid of ACPI_INVALID_GSI
  ACPI / x86: boot: Swap variables in condition in acpi_register_gsi_ioapic()

* acpi-apei:
  ACPI / APEI: remove redundant variables len and node_len
  ACPI: APEI: call into AER handling regardless of severity
  ACPI: APEI: handle PCIe AER errors in separate function

* acpi-ec:
  ACPI: EC: Fix debugfs_create_*() usage
2018-01-18 03:01:55 +01:00
Linus Torvalds
1d966eb4d6 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Misc fixes:

   - A rather involved set of memory hardware encryption fixes to
     support the early loading of microcode files via the initrd. These
     are larger than what we normally take at such a late -rc stage, but
     there are two mitigating factors: 1) much of the changes are
     limited to the SME code itself 2) being able to early load
     microcode has increased importance in the post-Meltdown/Spectre
     era.

   - An IRQ vector allocator fix

   - An Intel RDT driver use-after-free fix

   - An APIC driver bug fix/revert to make certain older systems boot
     again

   - A pkeys ABI fix

   - TSC calibration fixes

   - A kdump fix"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/apic/vector: Fix off by one in error path
  x86/intel_rdt/cqm: Prevent use after free
  x86/mm: Encrypt the initrd earlier for BSP microcode update
  x86/mm: Prepare sme_encrypt_kernel() for PAGE aligned encryption
  x86/mm: Centralize PMD flags in sme_encrypt_kernel()
  x86/mm: Use a struct to reduce parameters for SME PGD mapping
  x86/mm: Clean up register saving in the __enc_copy() assembly code
  x86/idt: Mark IDT tables __initconst
  Revert "x86/apic: Remove init_bsp_APIC()"
  x86/mm/pkeys: Fix fill_sig_info_pkey
  x86/tsc: Print tsc_khz, when it differs from cpu_khz
  x86/tsc: Fix erroneous TSC rate on Skylake Xeon
  x86/tsc: Future-proof native_calibrate_tsc()
  kdump: Write the correct address of mem_section into vmcoreinfo
2018-01-17 12:30:06 -08:00
Linus Torvalds
88dc7fca18 Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 pti bits and fixes from Thomas Gleixner:
 "This last update contains:

   - An objtool fix to prevent a segfault with the gold linker by
     changing the invocation order. That's not just for gold, it's a
     general robustness improvement.

   - An improved error message for objtool which spares tearing hairs.

   - Make KASAN fail loudly if there is not enough memory instead of
     oopsing at some random place later

   - RSB fill on context switch to prevent RSB underflow and speculation
     through other units.

   - Make the retpoline/RSB functionality work reliably for both Intel
     and AMD

   - Add retpoline to the module version magic so mismatch can be
     detected

   - A small (non-fix) update for cpufeatures which prevents cpu feature
     clashing for the upcoming extra mitigation bits to ease
     backporting"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  module: Add retpoline tag to VERMAGIC
  x86/cpufeature: Move processor tracing out of scattered features
  objtool: Improve error message for bad file argument
  objtool: Fix seg fault with gold linker
  x86/retpoline: Add LFENCE to the retpoline/RSB filling RSB macros
  x86/retpoline: Fill RSB on context switch for affected CPUs
  x86/kasan: Panic if there is not enough memory to boot
2018-01-17 11:54:56 -08:00
Paolo Bonzini
4fdec2034b x86/cpufeature: Move processor tracing out of scattered features
Processor tracing is already enumerated in word 9 (CPUID[7,0].EBX),
so do not duplicate it in the scattered features word.

Besides being more tidy, this will be useful for KVM when it presents
processor tracing to the guests.  KVM selects host features that are
supported by both the host kernel (depending on command line options,
CPU errata, or whatever) and KVM.  Whenever a full feature word exists,
KVM's code is written in the expectation that the CPUID bit number
matches the X86_FEATURE_* bit number, but this is not the case for
X86_FEATURE_INTEL_PT.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luwei Kang <luwei.kang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm@vger.kernel.org
Link: http://lkml.kernel.org/r/1516117345-34561-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-17 07:38:39 +01:00
Tom Lendacky
107cd25321 x86/mm: Encrypt the initrd earlier for BSP microcode update
Currently the BSP microcode update code examines the initrd very early
in the boot process.  If SME is active, the initrd is treated as being
encrypted but it has not been encrypted (in place) yet.  Update the
early boot code that encrypts the kernel to also encrypt the initrd so
that early BSP microcode updates work.

Tested-by: Gabriel Craciunescu <nix.or.die@gmail.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180110192634.6026.10452.stgit@tlendack-t1.amdoffice.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-16 01:50:59 +01:00
Tom Lendacky
28d437d550 x86/retpoline: Add LFENCE to the retpoline/RSB filling RSB macros
The PAUSE instruction is currently used in the retpoline and RSB filling
macros as a speculation trap.  The use of PAUSE was originally suggested
because it showed a very, very small difference in the amount of
cycles/time used to execute the retpoline as compared to LFENCE.  On AMD,
the PAUSE instruction is not a serializing instruction, so the pause/jmp
loop will use excess power as it is speculated over waiting for return
to mispredict to the correct target.

The RSB filling macro is applicable to AMD, and, if software is unable to
verify that LFENCE is serializing on AMD (possible when running under a
hypervisor), the generic retpoline support will be used and, so, is also
applicable to AMD.  Keep the current usage of PAUSE for Intel, but add an
LFENCE instruction to the speculation trap for AMD.

The same sequence has been adopted by GCC for the GCC generated retpolines.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Kees Cook <keescook@google.com>
Link: https://lkml.kernel.org/r/20180113232730.31060.36287.stgit@tlendack-t1.amdoffice.net
2018-01-15 00:32:55 +01:00
David Woodhouse
c995efd5a7 x86/retpoline: Fill RSB on context switch for affected CPUs
On context switch from a shallow call stack to a deeper one, as the CPU
does 'ret' up the deeper side it may encounter RSB entries (predictions for
where the 'ret' goes to) which were populated in userspace.

This is problematic if neither SMEP nor KPTI (the latter of which marks
userspace pages as NX for the kernel) are active, as malicious code in
userspace may then be executed speculatively.

Overwrite the CPU's return prediction stack with calls which are predicted
to return to an infinite loop, to "capture" speculation if this
happens. This is required both for retpoline, and also in conjunction with
IBRS for !SMEP && !KPTI.

On Skylake+ the problem is slightly different, and an *underflow* of the
RSB may cause errant branch predictions to occur. So there it's not so much
overwrite, as *filling* the RSB to attempt to prevent it getting
empty. This is only a partial solution for Skylake+ since there are many
other conditions which may result in the RSB becoming empty. The full
solution on Skylake+ is to use IBRS, which will prevent the problem even
when the RSB becomes empty. With IBRS, the RSB-stuffing will not be
required on context switch.

[ tglx: Added missing vendor check and slighty massaged comments and
  	changelog ]

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515779365-9032-1-git-send-email-dwmw@amazon.co.uk
2018-01-15 00:32:44 +01:00
Linus Torvalds
40548c6b6c Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 pti updates from Thomas Gleixner:
 "This contains:

   - a PTI bugfix to avoid setting reserved CR3 bits when PCID is
     disabled. This seems to cause issues on a virtual machine at least
     and is incorrect according to the AMD manual.

   - a PTI bugfix which disables the perf BTS facility if PTI is
     enabled. The BTS AUX buffer is not globally visible and causes the
     CPU to fault when the mapping disappears on switching CR3 to user
     space. A full fix which restores BTS on PTI is non trivial and will
     be worked on.

   - PTI bugfixes for EFI and trusted boot which make sure that the user
     space visible page table entries have the NX bit cleared

   - removal of dead code in the PTI pagetable setup functions

   - add PTI documentation

   - add a selftest for vsyscall to verify that the kernel actually
     implements what it advertises.

   - a sysfs interface to expose vulnerability and mitigation
     information so there is a coherent way for users to retrieve the
     status.

   - the initial spectre_v2 mitigations, aka retpoline:

      + The necessary ASM thunk and compiler support

      + The ASM variants of retpoline and the conversion of affected ASM
        code

      + Make LFENCE serializing on AMD so it can be used as speculation
        trap

      + The RSB fill after vmexit

   - initial objtool support for retpoline

  As I said in the status mail this is the most of the set of patches
  which should go into 4.15 except two straight forward patches still on
  hold:

   - the retpoline add on of LFENCE which waits for ACKs

   - the RSB fill after context switch

  Both should be ready to go early next week and with that we'll have
  covered the major holes of spectre_v2 and go back to normality"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits)
  x86,perf: Disable intel_bts when PTI
  security/Kconfig: Correct the Documentation reference for PTI
  x86/pti: Fix !PCID and sanitize defines
  selftests/x86: Add test_vsyscall
  x86/retpoline: Fill return stack buffer on vmexit
  x86/retpoline/irq32: Convert assembler indirect jumps
  x86/retpoline/checksum32: Convert assembler indirect jumps
  x86/retpoline/xen: Convert Xen hypercall indirect jumps
  x86/retpoline/hyperv: Convert assembler indirect jumps
  x86/retpoline/ftrace: Convert ftrace assembler indirect jumps
  x86/retpoline/entry: Convert entry assembler indirect jumps
  x86/retpoline/crypto: Convert crypto assembler indirect jumps
  x86/spectre: Add boot time option to select Spectre v2 mitigation
  x86/retpoline: Add initial retpoline support
  objtool: Allow alternatives to be ignored
  objtool: Detect jumps to retpoline thunks
  x86/pti: Make unpoison of pgd for trusted boot work for real
  x86/alternatives: Fix optimize_nops() checking
  sysfs/cpu: Fix typos in vulnerability documentation
  x86/cpu/AMD: Use LFENCE_RDTSC in preference to MFENCE_RDTSC
  ...
2018-01-14 09:51:25 -08:00
Ville Syrjälä
fc90ccfd28 Revert "x86/apic: Remove init_bsp_APIC()"
This reverts commit b371ae0d4a. It causes
boot hangs on old P3/P4 systems when the local APIC is enforced in UP mode.

Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
Cc: yinghai@kernel.org
Cc: bhe@redhat.com
Link: https://lkml.kernel.org/r/20171128145350.21560-1-ville.syrjala@linux.intel.com
2018-01-14 12:14:51 +01:00
Thomas Gleixner
f10ee3dcc9 x86/pti: Fix !PCID and sanitize defines
The switch to the user space page tables in the low level ASM code sets
unconditionally bit 12 and bit 11 of CR3. Bit 12 is switching the base
address of the page directory to the user part, bit 11 is switching the
PCID to the PCID associated with the user page tables.

This fails on a machine which lacks PCID support because bit 11 is set in
CR3. Bit 11 is reserved when PCID is inactive.

While the Intel SDM claims that the reserved bits are ignored when PCID is
disabled, the AMD APM states that they should be cleared.

This went unnoticed as the AMD APM was not checked when the code was
developed and reviewed and test systems with Intel CPUs never failed to
boot. The report is against a Centos 6 host where the guest fails to boot,
so it's not yet clear whether this is a virt issue or can happen on real
hardware too, but thats irrelevant as the AMD APM clearly ask for clearing
the reserved bits.

Make sure that on non PCID machines bit 11 is not set by the page table
switching code.

Andy suggested to rename the related bits and masks so they are clearly
describing what they should be used for, which is done as well for clarity.

That split could have been done with alternatives but the macro hell is
horrible and ugly. This can be done on top if someone cares to remove the
extra orq. For now it's a straight forward fix.

Fixes: 6fd166aae7 ("x86/mm: Use/Fix PCID to optimize user/kernel switches")
Reported-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable <stable@vger.kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Willy Tarreau <w@1wt.eu>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801140009150.2371@nanos
2018-01-14 10:45:53 +01:00
Linus Torvalds
8e66791a80 pci-v4.15-fixes-2
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJaWRZvAAoJEFmIoMA60/r8JqoP/22/st7YsJjk9kJE0DCIUSjv
 yP0iyAfyPyfnhqgDtLfpb41Q4+sjR7C2xKUW8tfqUKXR4Gb+7zUXEYKb+qcco0T5
 NJj5VWS5MnIGJHdHMqoqzswIsNSe1SDccsxAwSzY3CvmG9Mkg+BHmBAzEZBmsDcD
 6S1AtLrvUOcEUyBrgfBYpi8cQFsnrFsaG7seY5EqkuTcjKvbebBQKawzarYOppqQ
 j8QIQ19f2B9q4rGV98HabtJZqb+ll5S1swBbEz6P6MJ6gy1IhADdfLlhTtpH0gXH
 Xb9gpcyA7rrrxPzVo85gFgyFR3ATE96aTURrDqSjsumGwer+UtqIH/KJjcA12vMF
 ObZRVHPRO0F4l9mbOJV7F0o7QgOEwmKcdHjhTh9jlOjV3XgPCTEGJJ4ihpw3cdjH
 bVeaoloPgAT6wTtkWK4mI8RYgwZUYQQxKFzc/0pK4BpNghoX8wigZvoH+ey+HQ1u
 1KC8797zDUBquRBKZc9c8hFK+s7KkFj7FsKLAZs8k6MVPYHDjpF1CjqzCecVMKim
 tHRhlH/l+NTnKCh9D5HfmstPAtB/dojXE1dF+BI/I651FFpZVmvDNoMPq8/kOdM1
 Mj9SjCMmYYMnkNxLHUJO4j6mWEMyj/YJsZRBK+qpIa6F/sRYy2/P1tA2bT5UbgWr
 zAhHadF6dyri20PEjZuL
 =Qi1b
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.15-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:
 "Fix AMD boot regression due to 64-bit window conflicting with system
  memory (Christian König)"

* tag 'pci-v4.15-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  x86/PCI: Move and shrink AMD 64-bit window to avoid conflict
  x86/PCI: Add "pci=big_root_window" option for AMD 64-bit windows
2018-01-13 13:14:54 -08:00
David Woodhouse
117cc7a908 x86/retpoline: Fill return stack buffer on vmexit
In accordance with the Intel and AMD documentation, we need to overwrite
all entries in the RSB on exiting a guest, to prevent malicious branch
target predictions from affecting the host kernel. This is needed both
for retpoline and for IBRS.

[ak: numbers again for the RSB stuffing labels]

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515755487-8524-1-git-send-email-dwmw@amazon.co.uk
2018-01-12 12:33:37 +01:00
David Woodhouse
ea08816d5b x86/retpoline/xen: Convert Xen hypercall indirect jumps
Convert indirect call in Xen hypercall to use non-speculative sequence,
when CONFIG_RETPOLINE is enabled.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-10-git-send-email-dwmw@amazon.co.uk
2018-01-12 00:14:31 +01:00
David Woodhouse
e70e5892b2 x86/retpoline/hyperv: Convert assembler indirect jumps
Convert all indirect jumps in hyperv inline asm code to use non-speculative
sequences when CONFIG_RETPOLINE is enabled.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-9-git-send-email-dwmw@amazon.co.uk
2018-01-12 00:14:30 +01:00
David Woodhouse
da28512156 x86/spectre: Add boot time option to select Spectre v2 mitigation
Add a spectre_v2= option to select the mitigation used for the indirect
branch speculation vulnerability.

Currently, the only option available is retpoline, in its various forms.
This will be expanded to cover the new IBRS/IBPB microcode features.

The RETPOLINE_AMD feature relies on a serializing LFENCE for speculation
control. For AMD hardware, only set RETPOLINE_AMD if LFENCE is a
serializing instruction, which is indicated by the LFENCE_RDTSC feature.

[ tglx: Folded back the LFENCE/AMD fixes and reworked it so IBRS
  	integration becomes simple ]

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-5-git-send-email-dwmw@amazon.co.uk
2018-01-12 00:14:29 +01:00
David Woodhouse
76b043848f x86/retpoline: Add initial retpoline support
Enable the use of -mindirect-branch=thunk-extern in newer GCC, and provide
the corresponding thunks. Provide assembler macros for invoking the thunks
in the same way that GCC does, from native and inline assembler.

This adds X86_FEATURE_RETPOLINE and sets it by default on all CPUs. In
some circumstances, IBRS microcode features may be used instead, and the
retpoline can be disabled.

On AMD CPUs if lfence is serialising, the retpoline can be dramatically
simplified to a simple "lfence; jmp *\reg". A future patch, after it has
been verified that lfence really is serialising in all circumstances, can
enable this by setting the X86_FEATURE_RETPOLINE_AMD feature bit in addition
to X86_FEATURE_RETPOLINE.

Do not align the retpoline in the altinstr section, because there is no
guarantee that it stays aligned when it's copied over the oldinstr during
alternative patching.

[ Andi Kleen: Rename the macros, add CONFIG_RETPOLINE option, export thunks]
[ tglx: Put actual function CALL/JMP in front of the macros, convert to
  	symbolic labels ]
[ dwmw2: Convert back to numeric labels, merge objtool fixes ]

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-4-git-send-email-dwmw@amazon.co.uk
2018-01-12 00:14:28 +01:00
=?UTF-8?q?Christian=20K=C3=B6nig?=
f32ab75471 x86/PCI: Add "pci=big_root_window" option for AMD 64-bit windows
Only try to enable a 64-bit window on AMD CPUs when "pci=big_root_window"
is specified.

This taints the kernel because the new 64-bit window uses address space we
don't know anything about, and it may contain unreported devices or memory
that would conflict with the window.

The pci_amd_enable_64bit_bar() quirk that enables the window is specific to
AMD CPUs.  The generic solution would be to have the firmware enable the
window and describe it in the host bridge's _CRS method, or at least
describe it in the _PRS method so the OS would have the option of enabling
it.

Signed-off-by: Christian König <christian.koenig@amd.com>
[bhelgaas: changelog, extend doc, mention taint in dmesg]
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
2018-01-11 11:22:39 -06:00
David Howells
0500871f21 Construct init thread stack in the linker script rather than by union
Construct the init thread stack in the linker script rather than doing it
by means of a union so that ia64's init_task.c can be got rid of.

The following symbols are then made available from INIT_TASK_DATA() linker
script macro:

	init_thread_union
	init_stack

INIT_TASK_DATA() also expands the region to THREAD_SIZE to accommodate the
size of the init stack.  init_thread_union is given its own section so that
it can be placed into the stack space in the right order.  I'm assuming
that the ia64 ordering is correct and that the task_struct is first and the
thread_info second.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Tested-by: Will Deacon <will.deacon@arm.com> (arm64)
Tested-by: Palmer Dabbelt <palmer@sifive.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
2018-01-09 23:21:02 +00:00
Tom Lendacky
9c6a73c758 x86/cpu/AMD: Use LFENCE_RDTSC in preference to MFENCE_RDTSC
With LFENCE now a serializing instruction, use LFENCE_RDTSC in preference
to MFENCE_RDTSC.  However, since the kernel could be running under a
hypervisor that does not support writing that MSR, read the MSR back and
verify that the bit has been set successfully.  If the MSR can be read
and the bit is set, then set the LFENCE_RDTSC feature, otherwise set the
MFENCE_RDTSC feature.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/20180108220932.12580.52458.stgit@tlendack-t1.amdoffice.net
2018-01-09 01:43:11 +01:00
Tom Lendacky
e4d0e84e49 x86/cpu/AMD: Make LFENCE a serializing instruction
To aid in speculation control, make LFENCE a serializing instruction
since it has less overhead than MFENCE.  This is done by setting bit 1
of MSR 0xc0011029 (DE_CFG).  Some families that support LFENCE do not
have this MSR.  For these families, the LFENCE instruction is already
serializing.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/20180108220921.12580.71694.stgit@tlendack-t1.amdoffice.net
2018-01-09 01:43:10 +01:00
David Woodhouse
99c6fa2511 x86/cpufeatures: Add X86_BUG_SPECTRE_V[12]
Add the bug bits for spectre v1/2 and force them unconditionally for all
cpus.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1515239374-23361-2-git-send-email-dwmw@amazon.co.uk
2018-01-06 21:57:19 +01:00
Linus Torvalds
abb7099dbc Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull  more x86 pti fixes from Thomas Gleixner:
 "Another small stash of fixes for fallout from the PTI work:

   - Fix the modules vs. KASAN breakage which was caused by making
     MODULES_END depend of the fixmap size. That was done when the cpu
     entry area moved into the fixmap, but now that we have a separate
     map space for that this is causing more issues than it solves.

   - Use the proper cache flush methods for the debugstore buffers as
     they are mapped/unmapped during runtime and not statically mapped
     at boot time like the rest of the cpu entry area.

   - Make the map layout of the cpu_entry_area consistent for 4 and 5
     level paging and fix the KASLR vaddr_end wreckage.

   - Use PER_CPU_EXPORT for per cpu variable and while at it unbreak
     nvidia gfx drivers by dropping the GPL export. The subject line of
     the commit tells it the other way around, but I noticed that too
     late.

   - Fix the ASM alternative macros so they can be used in the middle of
     an inline asm block.

   - Rename the BUG_CPU_INSECURE flag to BUG_CPU_MELTDOWN so the attack
     vector is properly identified. The Spectre mitigations will come
     with their own bug bits later"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/pti: Rename BUG_CPU_INSECURE to BUG_CPU_MELTDOWN
  x86/alternatives: Add missing '\n' at end of ALTERNATIVE inline asm
  x86/tlb: Drop the _GPL from the cpu_tlbstate export
  x86/events/intel/ds: Use the proper cache flush method for mapping ds buffers
  x86/kaslr: Fix the vaddr_end mess
  x86/mm: Map cpu_entry_area at the same place on 4/5 level
  x86/mm: Set MODULES_END to 0xffffffffff000000
2018-01-05 12:23:57 -08:00
Thomas Gleixner
de791821c2 x86/pti: Rename BUG_CPU_INSECURE to BUG_CPU_MELTDOWN
Use the name associated with the particular attack which needs page table
isolation for mitigation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
Cc: Jiri Koshina <jikos@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Lutomirski  <luto@amacapital.net>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Turner <pjt@google.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Greg KH <gregkh@linux-foundation.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801051525300.1724@nanos
2018-01-05 15:34:43 +01:00
David Woodhouse
b9e705ef7c x86/alternatives: Add missing '\n' at end of ALTERNATIVE inline asm
Where an ALTERNATIVE is used in the middle of an inline asm block, this
would otherwise lead to the following instruction being appended directly
to the trailing ".popsection", and a failed compile.

Fixes: 9cebed423c ("x86, alternative: Use .pushsection/.popsection")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: ak@linux.intel.com
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Turner <pjt@google.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180104143710.8961-8-dwmw@amazon.co.uk
2018-01-05 14:01:15 +01:00
Thomas Gleixner
1dddd25125 x86/kaslr: Fix the vaddr_end mess
vaddr_end for KASLR is only documented in the KASLR code itself and is
adjusted depending on config options. So it's not surprising that a change
of the memory layout causes KASLR to have the wrong vaddr_end. This can map
arbitrary stuff into other areas causing hard to understand problems.

Remove the whole ifdef magic and define the start of the cpu_entry_area to
be the end of the KASLR vaddr range.

Add documentation to that effect.

Fixes: 92a0f81d89 ("x86/cpu_entry_area: Move it out of the fixmap")
Reported-by: Benjamin Gilbert <benjamin.gilbert@coreos.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Benjamin Gilbert <benjamin.gilbert@coreos.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable <stable@vger.kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Garnier <thgarnie@google.com>,
Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801041320360.1771@nanos
2018-01-05 00:39:57 +01:00
Thomas Gleixner
f207890481 x86/mm: Map cpu_entry_area at the same place on 4/5 level
There is no reason for 4 and 5 level pagetables to have a different
layout. It just makes determining vaddr_end for KASLR harder than
necessary.

Fixes: 92a0f81d89 ("x86/cpu_entry_area: Move it out of the fixmap")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Gilbert <benjamin.gilbert@coreos.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable <stable@vger.kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Garnier <thgarnie@google.com>,
Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801041320360.1771@nanos
2018-01-04 23:04:57 +01:00
Andrey Ryabinin
f5a40711fa x86/mm: Set MODULES_END to 0xffffffffff000000
Since f06bdd4001 ("x86/mm: Adapt MODULES_END based on fixmap section size")
kasan_mem_to_shadow(MODULES_END) could be not aligned to a page boundary.

So passing page unaligned address to kasan_populate_zero_shadow() have two
possible effects:

1) It may leave one page hole in supposed to be populated area. After commit
  21506525fb ("x86/kasan/64: Teach KASAN about the cpu_entry_area") that
  hole happens to be in the shadow covering fixmap area and leads to crash:

 BUG: unable to handle kernel paging request at fffffbffffe8ee04
 RIP: 0010:check_memory_region+0x5c/0x190

 Call Trace:
  <NMI>
  memcpy+0x1f/0x50
  ghes_copy_tofrom_phys+0xab/0x180
  ghes_read_estatus+0xfb/0x280
  ghes_notify_nmi+0x2b2/0x410
  nmi_handle+0x115/0x2c0
  default_do_nmi+0x57/0x110
  do_nmi+0xf8/0x150
  end_repeat_nmi+0x1a/0x1e

Note, the crash likely disappeared after commit 92a0f81d89, which
changed kasan_populate_zero_shadow() call the way it was before
commit 21506525fb.

2) Attempt to load module near MODULES_END will fail, because
   __vmalloc_node_range() called from kasan_module_alloc() will hit the
   WARN_ON(!pte_none(*pte)) in the vmap_pte_range() and bail out with error.

To fix this we need to make kasan_mem_to_shadow(MODULES_END) page aligned
which means that MODULES_END should be 8*PAGE_SIZE aligned.

The whole point of commit f06bdd4001 was to move MODULES_END down if
NR_CPUS is big, so the cpu_entry_area takes a lot of space.
But since 92a0f81d89 ("x86/cpu_entry_area: Move it out of the fixmap")
the cpu_entry_area is no longer in fixmap, so we could just set
MODULES_END to a fixed 8*PAGE_SIZE aligned address.

Fixes: f06bdd4001 ("x86/mm: Adapt MODULES_END based on fixmap section size")
Reported-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Garnier <thgarnie@google.com>
Link: https://lkml.kernel.org/r/20171228160620.23818-1-aryabinin@virtuozzo.com
2018-01-04 23:04:57 +01:00
Linus Torvalds
00a5ae218d Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 page table isolation fixes from Thomas Gleixner:
 "A couple of urgent fixes for PTI:

   - Fix a PTE mismatch between user and kernel visible mapping of the
     cpu entry area (differs vs. the GLB bit) and causes a TLB mismatch
     MCE on older AMD K8 machines

   - Fix the misplaced CR3 switch in the SYSCALL compat entry code which
     causes access to unmapped kernel memory resulting in double faults.

   - Fix the section mismatch of the cpu_tss_rw percpu storage caused by
     using a different mechanism for declaration and definition.

   - Two fixes for dumpstack which help to decode entry stack issues
     better

   - Enable PTI by default in Kconfig. We should have done that earlier,
     but it slipped through the cracks.

   - Exclude AMD from the PTI enforcement. Not necessarily a fix, but if
     AMD is so confident that they are not affected, then we should not
     burden users with the overhead"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/process: Define cpu_tss_rw in same section as declaration
  x86/pti: Switch to kernel CR3 at early in entry_SYSCALL_compat()
  x86/dumpstack: Print registers for first stack frame
  x86/dumpstack: Fix partial register dumps
  x86/pti: Make sure the user/kernel PTEs match
  x86/cpu, x86/pti: Do not enable PTI on AMD processors
  x86/pti: Enable PTI by default
2018-01-03 16:41:07 -08:00
Josh Poimboeuf
a9cdbe72c4 x86/dumpstack: Fix partial register dumps
The show_regs_safe() logic is wrong.  When there's an iret stack frame,
it prints the entire pt_regs -- most of which is random stack data --
instead of just the five registers at the end.

show_regs_safe() is also poorly named: the on_stack() checks aren't for
safety.  Rename the function to show_regs_if_on_stack() and add a
comment to explain why the checks are needed.

These issues were introduced with the "partial register dump" feature of
the following commit:

  b02fcf9ba1 ("x86/unwinder: Handle stack overflows more gracefully")

That patch had gone through a few iterations of development, and the
above issues were artifacts from a previous iteration of the patch where
'regs' pointed directly to the iret frame rather than to the (partially
empty) pt_regs.

Tested-by: Alexander Tsoy <alexander@tsoy.me>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toralf Förster <toralf.foerster@gmx.de>
Cc: stable@vger.kernel.org
Fixes: b02fcf9ba1 ("x86/unwinder: Handle stack overflows more gracefully")
Link: http://lkml.kernel.org/r/5b05b8b344f59db2d3d50dbdeba92d60f2304c54.1514736742.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-03 16:14:46 +01:00
Linus Torvalds
f39d7d78b7 Merge branch 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "A couple of fixlets for x86:

   - Fix the ESPFIX double fault handling for 5-level pagetables

   - Fix the commandline parsing for 'apic=' on 32bit systems and update
     documentation

   - Make zombie stack traces reliable

   - Fix kexec with stack canary

   - Fix the delivery mode for APICs which was missed when the x86
     vector management was converted to single target delivery. Caused a
     regression due to the broken hardware which ignores affinity
     settings in lowest prio delivery mode.

   - Unbreak modules when AMD memory encryption is enabled

   - Remove an unused parameter of prepare_switch_to"

* 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/apic: Switch all APICs to Fixed delivery mode
  x86/apic: Update the 'apic=' description of setting APIC driver
  x86/apic: Avoid wrong warning when parsing 'apic=' in X86-32 case
  x86-32: Fix kexec with stack canary (CONFIG_CC_STACKPROTECTOR)
  x86: Remove unused parameter of prepare_switch_to
  x86/stacktrace: Make zombie stack traces reliable
  x86/mm: Unbreak modules that use the DMA API
  x86/build: Make isoimage work on Debian
  x86/espfix/64: Fix espfix double-fault handling on 5-level systems
2017-12-31 13:13:56 -08:00
Linus Torvalds
52c90f2d32 Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 page table isolation fixes from Thomas Gleixner:
 "Four patches addressing the PTI fallout as discussed and debugged
  yesterday:

   - Remove stale and pointless TLB flush invocations from the hotplug
     code

   - Remove stale preempt_disable/enable from __native_flush_tlb()

   - Plug the memory leak in the write_ldt() error path"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/ldt: Make LDT pgtable free conditional
  x86/ldt: Plug memory leak in error path
  x86/mm: Remove preempt_disable/enable() from __native_flush_tlb()
  x86/smpboot: Remove stale TLB flush invocations
2017-12-31 13:03:05 -08:00
Linus Torvalds
e7c632fc47 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:

 - plug a memory leak in the intel pmu init code

 - clang fixes

 - tooling fix to avoid including kernel headers

 - a fix for jvmti to generate correct debug information for inlined
   code

 - replace backtick with a regular shell function

 - fix the build in hardened environments

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Plug memory leak in intel_pmu_init()
  x86/asm: Allow again using asm.h when building for the 'bpf' clang target
  tools arch s390: Do not include header files from the kernel sources
  perf jvmti: Generate correct debug information for inlined code
  perf tools: Fix up build in hardened environments
  perf tools: Use shell function for perl cflags retrieval
2017-12-31 11:47:24 -08:00
Linus Torvalds
88fa025d30 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
 "A rather large update after the kaisered maintainer finally found time
  to handle regression reports.

   - The larger part addresses a regression caused by the x86 vector
     management rework.

     The reservation based model does not work reliably for MSI
     interrupts, if they cannot be masked (yes, yet another hw
     engineering trainwreck). The reason is that the reservation mode
     assigns a dummy vector when the interrupt is allocated and switches
     to a real vector when the interrupt is requested.

     If the MSI entry cannot be masked then the initialization might
     raise an interrupt before the interrupt is requested, which ends up
     as spurious interrupt and causes device malfunction and worse. The
     fix is to exclude MSI interrupts which do not support masking from
     reservation mode and assign a real vector right away.

   - Extend the extra lockdep class setup for nested interrupts with a
     class for the recently added irq_desc::request_mutex so lockdep can
     differeniate and does not emit false positive warnings.

   - A ratelimit guard for the bad irq printout so in case a bad irq
     comes back immediately the system does not drown in dmesg spam"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/msi, x86/vector: Prevent reservation mode for non maskable MSI
  genirq/irqdomain: Rename early argument of irq_domain_activate_irq()
  x86/vector: Use IRQD_CAN_RESERVE flag
  genirq: Introduce IRQD_CAN_RESERVE flag
  genirq/msi: Handle reactivation only on success
  gpio: brcmstb: Make really use of the new lockdep class
  genirq: Guard handle_bad_irq log messages
  kernel/irq: Extend lockdep class for request mutex
2017-12-31 11:23:11 -08:00
Thomas Gleixner
decab0888e x86/mm: Remove preempt_disable/enable() from __native_flush_tlb()
The preempt_disable/enable() pair in __native_flush_tlb() was added in
commit:

  5cf0791da5 ("x86/mm: Disable preemption during CR3 read+write")

... to protect the UP variant of flush_tlb_mm_range().

That preempt_disable/enable() pair should have been added to the UP variant
of flush_tlb_mm_range() instead.

The UP variant was removed with commit:

  ce4a4e565f ("x86/mm: Remove the UP asm/tlbflush.h code, always use the (formerly) SMP code")

... but the preempt_disable/enable() pair stayed around.

The latest change to __native_flush_tlb() in commit:

  6fd166aae7 ("x86/mm: Use/Fix PCID to optimize user/kernel switches")

... added an access to a per CPU variable outside the preempt disabled
regions, which makes no sense at all. __native_flush_tlb() must always
be called with at least preemption disabled.

Remove the preempt_disable/enable() pair and add a WARN_ON_ONCE() to catch
bad callers independent of the smp_processor_id() debugging.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linuxfoundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20171230211829.679325424@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-31 12:12:51 +01:00
Linus Torvalds
5aa90a8458 Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 page table isolation updates from Thomas Gleixner:
 "This is the final set of enabling page table isolation on x86:

   - Infrastructure patches for handling the extra page tables.

   - Patches which map the various bits and pieces which are required to
     get in and out of user space into the user space visible page
     tables.

   - The required changes to have CR3 switching in the entry/exit code.

   - Optimizations for the CR3 switching along with documentation how
     the ASID/PCID mechanism works.

   - Updates to dump pagetables to cover the user space page tables for
     W+X scans and extra debugfs files to analyze both the kernel and
     the user space visible page tables

  The whole functionality is compile time controlled via a config switch
  and can be turned on/off on the command line as well"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
  x86/ldt: Make the LDT mapping RO
  x86/mm/dump_pagetables: Allow dumping current pagetables
  x86/mm/dump_pagetables: Check user space page table for WX pages
  x86/mm/dump_pagetables: Add page table directory to the debugfs VFS hierarchy
  x86/mm/pti: Add Kconfig
  x86/dumpstack: Indicate in Oops whether PTI is configured and enabled
  x86/mm: Clarify the whole ASID/kernel PCID/user PCID naming
  x86/mm: Use INVPCID for __native_flush_tlb_single()
  x86/mm: Optimize RESTORE_CR3
  x86/mm: Use/Fix PCID to optimize user/kernel switches
  x86/mm: Abstract switching CR3
  x86/mm: Allow flushing for future ASID switches
  x86/pti: Map the vsyscall page if needed
  x86/pti: Put the LDT in its own PGD if PTI is on
  x86/mm/64: Make a full PGD-entry size hole in the memory map
  x86/events/intel/ds: Map debug buffers in cpu_entry_area
  x86/cpu_entry_area: Add debugstore entries to cpu_entry_area
  x86/mm/pti: Map ESPFIX into user space
  x86/mm/pti: Share entry text PMD
  x86/entry: Align entry text section to PMD boundary
  ...
2017-12-29 17:02:49 -08:00
Thomas Gleixner
702cb0a028 genirq/irqdomain: Rename early argument of irq_domain_activate_irq()
The 'early' argument of irq_domain_activate_irq() is actually used to
denote reservation mode. To avoid confusion, rename it before abuse
happens.

No functional change.

Fixes: 7249164346 ("genirq/irqdomain: Update irq_domain_ops.activate() signature")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexandru Chirvasitu <achirvasub@gmail.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Mikael Pettersson <mikpelinux@gmail.com>
Cc: Josh Poulson <jopoulso@microsoft.com>
Cc: Mihai Costache <v-micos@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-pci@vger.kernel.org
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Simon Xiao <sixiao@microsoft.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Jork Loeser <Jork.Loeser@microsoft.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: devel@linuxdriverproject.org
Cc: KY Srinivasan <kys@microsoft.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Sakari Ailus <sakari.ailus@intel.com>,
Cc: linux-media@vger.kernel.org
2017-12-29 21:13:04 +01:00
Andy Shevchenko
4565c4f605 ACPI / x86: boot: Use INVALID_ACPI_IRQ instead of 0 for acpi_sci_override_gsi
0 is valid hardware interrupt which might be in some cases overridden.
Due to this, switch to INVALID_ACPI_IRQ to mark SCI override not set.

While here, change the type of the variable from int to u32 to match
the GSI type used in the rest of the code.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-12-28 12:36:46 +01:00
rodrigosiqueira
7ac139eaa6 x86: Remove unused parameter of prepare_switch_to
Commit e37e43a497 ("x86/mm/64: Enable vmapped stacks
(CONFIG_HAVE_ARCH_VMAP_STACK=y)") added prepare_switch_to with one extra
parameter which is not used by the function, remove it.

Signed-off-by: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-janitors@vger.kernel.org
Link: https://lkml.kernel.org/r/20171215131533.hp6kqebw45o7uvsb@smtp.gmail.com
2017-12-27 20:37:41 +01:00
Thomas Gleixner
9f5cb6b32d x86/ldt: Make the LDT mapping RO
Now that the LDT mapping is in a known area when PAGE_TABLE_ISOLATION is
enabled its a primary target for attacks, if a user space interface fails
to validate a write address correctly. That can never happen, right?

The SDM states:

    If the segment descriptors in the GDT or an LDT are placed in ROM, the
    processor can enter an indefinite loop if software or the processor
    attempts to update (write to) the ROM-based segment descriptors. To
    prevent this problem, set the accessed bits for all segment descriptors
    placed in a ROM. Also, remove operating-system or executive code that
    attempts to modify segment descriptors located in ROM.

So its a valid approach to set the ACCESS bit when setting up the LDT entry
and to map the table RO. Fixup the selftest so it can handle that new mode.

Remove the manual ACCESS bit setter in set_tls_desc() as this is now
pointless. Folded the patch from Peter Ziljstra.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-23 21:13:01 +01:00
Thomas Gleixner
a4b51ef655 x86/mm/dump_pagetables: Allow dumping current pagetables
Add two debugfs files which allow to dump the pagetable of the current
task.

current_kernel dumps the regular page table. This is the page table which
is normally shared between kernel and user space. If kernel page table
isolation is enabled this is the kernel space mapping.

If kernel page table isolation is enabled the second file, current_user,
dumps the user space page table.

These files allow to verify the resulting page tables for page table
isolation, but even in the normal case its useful to be able to inspect
user space page tables of current for debugging purposes.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Cc: linux-mm@kvack.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-23 21:13:01 +01:00