This is loader specific code which can load bzImage and set it up for
64bit entry. This does not take care of 32bit entry or real mode entry.
32bit mode entry can be implemented if somebody needs it.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Greg Kroah-Hartman <greg@kroah.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: WANG Chao <chaowang@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The core mm code will provide a default gate area based on
FIXADDR_USER_START and FIXADDR_USER_END if
!defined(__HAVE_ARCH_GATE_AREA) && defined(AT_SYSINFO_EHDR).
This default is only useful for ia64. arm64, ppc, s390, sh, tile, 64-bit
UML, and x86_32 have their own code just to disable it. arm, 32-bit UML,
and x86_64 have gate areas, but they have their own implementations.
This gets rid of the default and moves the code into ia64.
This should save some code on architectures without a gate area: it's now
possible to inline the gate_area functions in the default case.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Nathan Lynch <nathan_lynch@mentor.com>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [in principle]
Acked-by: Richard Weinberger <richard@nod.at> [for um]
Acked-by: Will Deacon <will.deacon@arm.com> [for arm64]
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Nathan Lynch <Nathan_Lynch@mentor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rather than have architectures #define ARCH_HAS_SG_CHAIN in an
architecture specific scatterlist.h, make it a proper Kconfig option and
use that instead. At same time, remove the header files are are now
mostly useless and just include asm-generic/scatterlist.h.
[sfr@canb.auug.org.au: powerpc files now need asm/dma.h]
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de> [x86]
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [powerpc]
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- ACPICA update to upstream version 20140724. That includes
ACPI 5.1 material (support for the _CCA and _DSD predefined names,
changes related to the DMAR and PCCT tables and ARM support among
other things) and cleanups related to using ACPICA's header files.
A major part of it is related to acpidump and the core code used
by that utility. Changes from Bob Moore, David E Box, Lv Zheng,
Sascha Wildner, Tomasz Nowicki, Hanjun Guo.
- Radix trees for memory bitmaps used by the hibernation core from
Joerg Roedel.
- Support for waking up the system from suspend-to-idle (also known
as the "freeze" sleep state) using ACPI-based PCI wakeup signaling
(Rafael J Wysocki).
- Fixes for issues related to ACPI button events (Rafael J Wysocki).
- New device ID for an ACPI-enumerated device included into the
Wildcat Point PCH from Jie Yang.
- ACPI video updates related to backlight handling from Hans de Goede
and Linus Torvalds.
- Preliminary changes needed to support ACPI on ARM from Hanjun Guo
and Graeme Gregory.
- ACPI PNP core cleanups from Arjun Sreedharan and Zhang Rui.
- Cleanups related to ACPI_COMPANION() and ACPI_HANDLE() macros
(Rafael J Wysocki).
- ACPI-based device hotplug cleanups from Wei Yongjun and
Rafael J Wysocki.
- Cleanups and improvements related to system suspend from
Lan Tianyu, Randy Dunlap and Rafael J Wysocki.
- ACPI battery cleanup from Wei Yongjun.
- cpufreq core fixes from Viresh Kumar.
- Elimination of a deadband effect from the cpufreq ondemand
governor and intel_pstate driver cleanups from Stratos Karafotis.
- 350MHz CPU support for the powernow-k6 cpufreq driver from
Mikulas Patocka.
- Fix for the imx6 cpufreq driver from Anson Huang.
- cpuidle core and governor cleanups from Daniel Lezcano,
Sandeep Tripathy and Mohammad Merajul Islam Molla.
- Build fix for the big_little cpuidle driver from Sachin Kamat.
- Configuration fix for the Operation Performance Points (OPP)
framework from Mark Brown.
- APM cleanup from Jean Delvare.
- cpupower utility fixes and cleanups from Peter Senna Tschudin,
Andrey Utkin, Himangi Saraogi, Rickard Strandqvist, Thomas Renninger.
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJT4nhtAAoJEILEb/54YlRxtZEP/2rtVQFSFdAW8l0Xm1SeSsl4
EnZpSNT1TFn+NdG23vSIot5Jzdz1/dLfeoJEbXpoVt4DPC9/PK4HPlv5FEDQYfh5
srftvvGcAva969sXzSBRNUeR+M8Yd2RdoYCfmqTEUjzf8GJLL4jC0VAIwMtsQklt
EbiQX8JaHQS7RIql7MDg1N2vaTo+zxkf39Kkcl56usmO/uATP7cAPjFreF/xQ3d8
OyBhz1cOXIhPw7bd9Dv9AgpJzA8WFpktDYEgy2sluBWMv+mLYjdZRCFkfpIRzmea
pt+hJDeAy8ZL6/bjWCzz2x6wG7uJdDLblreI28sgnJx/VHR3Co6u4H1BqUBj18ct
CHV6zQ55WFmx9/uJqBtwFy333HS2ysJziC5ucwmg8QjkvAn4RK8S0qHMfRvSSaHj
F9ejnHGxyrc3zzfsngUf/VXIp67FReaavyKX3LYxjHjMPZDMw2xCtCWEpUs52l2o
fAbkv8YFBbUalIv0RtELH5XnKQ2ggMP8UgvT74KyfXU6LaliH8lEV20FFjMgwrPI
sMr2xk04eS8mNRNAXL8OMMwvh6DY/Qsmb7BVg58RIw6CdHeFJl834yztzcf7+j56
4oUmA16QYBCFA3udGQ3Tb07mi8XTfrMdTOGA0koQG9tjswKXuLUXUk9WAXZe4vml
ItRpZKE86BCs3mLJMYre
=ZODv
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management updates from Rafael Wysocki:
"Again, ACPICA leads the pack (47 commits), followed by cpufreq (18
commits) and system suspend/hibernation (9 commits).
From the new code perspective, the ACPICA update brings ACPI 5.1 to
the table, including a new device configuration object called _DSD
(Device Specific Data) that will hopefully help us to operate device
properties like Device Trees do (at least to some extent) and changes
related to supporting ACPI on ARM.
Apart from that we have hibernation changes making it use radix trees
to store memory bitmaps which should speed up some operations carried
out by it quite significantly. We also have some power management
changes related to suspend-to-idle (the "freeze" sleep state) support
and more preliminary changes needed to support ACPI on ARM (outside of
ACPICA).
The rest is fixes and cleanups pretty much everywhere.
Specifics:
- ACPICA update to upstream version 20140724. That includes ACPI 5.1
material (support for the _CCA and _DSD predefined names, changes
related to the DMAR and PCCT tables and ARM support among other
things) and cleanups related to using ACPICA's header files. A
major part of it is related to acpidump and the core code used by
that utility. Changes from Bob Moore, David E Box, Lv Zheng,
Sascha Wildner, Tomasz Nowicki, Hanjun Guo.
- Radix trees for memory bitmaps used by the hibernation core from
Joerg Roedel.
- Support for waking up the system from suspend-to-idle (also known
as the "freeze" sleep state) using ACPI-based PCI wakeup signaling
(Rafael J Wysocki).
- Fixes for issues related to ACPI button events (Rafael J Wysocki).
- New device ID for an ACPI-enumerated device included into the
Wildcat Point PCH from Jie Yang.
- ACPI video updates related to backlight handling from Hans de Goede
and Linus Torvalds.
- Preliminary changes needed to support ACPI on ARM from Hanjun Guo
and Graeme Gregory.
- ACPI PNP core cleanups from Arjun Sreedharan and Zhang Rui.
- Cleanups related to ACPI_COMPANION() and ACPI_HANDLE() macros
(Rafael J Wysocki).
- ACPI-based device hotplug cleanups from Wei Yongjun and Rafael J
Wysocki.
- Cleanups and improvements related to system suspend from Lan
Tianyu, Randy Dunlap and Rafael J Wysocki.
- ACPI battery cleanup from Wei Yongjun.
- cpufreq core fixes from Viresh Kumar.
- Elimination of a deadband effect from the cpufreq ondemand governor
and intel_pstate driver cleanups from Stratos Karafotis.
- 350MHz CPU support for the powernow-k6 cpufreq driver from Mikulas
Patocka.
- Fix for the imx6 cpufreq driver from Anson Huang.
- cpuidle core and governor cleanups from Daniel Lezcano, Sandeep
Tripathy and Mohammad Merajul Islam Molla.
- Build fix for the big_little cpuidle driver from Sachin Kamat.
- Configuration fix for the Operation Performance Points (OPP)
framework from Mark Brown.
- APM cleanup from Jean Delvare.
- cpupower utility fixes and cleanups from Peter Senna Tschudin,
Andrey Utkin, Himangi Saraogi, Rickard Strandqvist, Thomas
Renninger"
* tag 'pm+acpi-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (118 commits)
ACPI / LPSS: add LPSS device for Wildcat Point PCH
ACPI / PNP: Replace faulty is_hex_digit() by isxdigit()
ACPICA: Update version to 20140724.
ACPICA: ACPI 5.1: Update for PCCT table changes.
ACPICA/ARM: ACPI 5.1: Update for GTDT table changes.
ACPICA/ARM: ACPI 5.1: Update for MADT changes.
ACPICA/ARM: ACPI 5.1: Update for FADT changes.
ACPICA: ACPI 5.1: Support for the _CCA predifined name.
ACPICA: ACPI 5.1: New notify value for System Affinity Update.
ACPICA: ACPI 5.1: Support for the _DSD predefined name.
ACPICA: Debug object: Add current value of Timer() to debug line prefix.
ACPICA: acpihelp: Add UUID support, restructure some existing files.
ACPICA: Utilities: Fix local printf issue.
ACPICA: Tables: Update for DMAR table changes.
ACPICA: Remove some extraneous printf arguments.
ACPICA: Update for comments/formatting. No functional changes.
ACPICA: Disassembler: Add support for the ToUUID opererator (macro).
ACPICA: Remove a redundant cast to acpi_size for ACPI_OFFSET() macro.
ACPICA: Work around an ancient GCC bug.
ACPI / processor: Make it possible to get local x2apic id via _MAT
...
There've been many updates in ASoC side at this time, especially the
framework enhancement for multiple CODECs on a single DAI and more
componentization works. The only major change in ALSA core is the
addition of timestamp type in sw_params field. This should behave in
backward compatible way. Other than that, there are lots of small
changes and new drivers in wide range, including a large code cut in
HD-audio driver for deprecated static quirks. Some highlights are
below:
ALSA Core:
- Add the new timestamp type field to sw_params to choose
MONOTONIC_RAW type
HD-audio:
- Continued conversion to standard printk macros, generic code
cleanups
- Removal of obsoleted static quirk codes for Conexant and C-Media
codecs
- Fixups for HP Envy TS, Dell XPS 15, HP and Dell mute/mic LED,
Gigabyte BXBT-2807 mobo
- Intel Braswell support
ASoC:
- Support for multiple CODECs attached to a single DAI, enabling
systems with for example multiple DAC/speaker drivers on a single
link, contributed by Benoit Cousson based on work from Misael Lopez
Cruz
- Support for byte controls larger than 256 bytes based on the use of
TLVs contributed by Omair Mohammed Abdullah
- More componentisation work from Lars-Peter Clausen
- The remainder of the conversions of CODEC drivers to params_width()
by Mark Brown
- Drivers for Cirrus Logic CS4265, Freescale i.MX ASRC blocks, Realtek
RT286 and RT5670, Rockchip RK3xxx I2S controllers and Texas
Instruments TAS2552
- Lots of updates and fixes, especially to the DaVinci, Intel,
Freescale, Realtek, and rcar drivers
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJT4fj0AAoJEGwxgFQ9KSmkXQ0QAIiRmVg40aiJoEdOLGgzNZtq
r/nXj69AuB6JSy0hKbFyyijjCcRpyCCGvjDYlogjT75M3c35Npz/m85oZHx2tajD
SB5OA+QxO4EQ3C0GjITIRHJROm4MM8/rnbnNYTsWnEGRkobTFTl0rHbSkA85RGFt
0zZqqs1R0s/nO9PMQ+5PA5x9xVFiZs2COeCK0CFA9s2ACf/hbxJBRIqYpIFWOo78
9L41jBOFuC/hIb4qwjgmsCWbKe1KQysTAf+Wty0CKipJ6VhfCbPn1Qn1zXGeUOxc
mj4eZ6LpJTrVMr/UN02c5vgPOiaBrQ7fWZo3dVHLlIjC6cEI1tUvNYAin7CMEzx8
DUsvo9p30OheA+ijc9wKaYFY6YmmJZRtpnnMd39i0oPG+bhvoV7vjXjJSB1sLJt1
o82xLpVL4Th8H+DMDVwA7UIBvvZGZBusw1qsNGfcOPrmExi4ScGhA0gSOO6W2y1z
VQLRbiXB/HtJGxeqWL6RqJOcLBOlJNmsk4UZMOSCu2OZrWd5I8MuRrNWeHDqhX1H
+VDEJVhFmM21vMpnobzEPxWsMgTVIAVf3Thh+WgaPxL4Krh0vkpZsgZk16VVmy/o
OJJF3n41FND4n9zSjOe4MkuL8UCOUpKCaBdqj9K1s6UKwOEKuDNslyT/zqutRWK5
x1uApU5y+E4iQT/b7cmA
=RL72
-----END PGP SIGNATURE-----
Merge tag 'sound-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound updates from Takashi Iwai:
"There've been many updates in ASoC side at this time, especially the
framework enhancement for multiple CODECs on a single DAI and more
componentization works.
The only major change in ALSA core is the addition of timestamp type
in sw_params field. This should behave in backward compatible way.
Other than that, there are lots of small changes and new drivers in
wide range, including a large code cut in HD-audio driver for
deprecated static quirks. Some highlights are below:
ALSA Core:
- Add the new timestamp type field to sw_params to choose
MONOTONIC_RAW type
HD-audio:
- Continued conversion to standard printk macros, generic code
cleanups
- Removal of obsoleted static quirk codes for Conexant and C-Media
codecs
- Fixups for HP Envy TS, Dell XPS 15, HP and Dell mute/mic LED,
Gigabyte BXBT-2807 mobo
- Intel Braswell support
ASoC:
- Support for multiple CODECs attached to a single DAI, enabling
systems with for example multiple DAC/speaker drivers on a single
link, contributed by Benoit Cousson based on work from Misael Lopez
Cruz
- Support for byte controls larger than 256 bytes based on the use of
TLVs contributed by Omair Mohammed Abdullah
- More componentisation work from Lars-Peter Clausen
- The remainder of the conversions of CODEC drivers to params_width()
by Mark Brown
- Drivers for Cirrus Logic CS4265, Freescale i.MX ASRC blocks,
Realtek RT286 and RT5670, Rockchip RK3xxx I2S controllers and Texas
Instruments TAS2552
- Lots of updates and fixes, especially to the DaVinci, Intel,
Freescale, Realtek, and rcar drivers"
* tag 'sound-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (402 commits)
ALSA: usb-audio: Whitespace cleanups for sound/usb/midi.*
ALSA: usb-audio: Respond to suspend and resume callbacks for MIDI input
sound/oss/pss: Remove typedefs pss_mixerdata and pss_confdata
sound/oss/opl3: Remove typedef opl_devinfo
ALSA: fireworks: fix specifiers in format strings for propper output
ASoC: imx-audmux: Use uintptr_t for port numbers
ASoC: davinci: Enable menuconfig entry for McASP
ASoC: fsl_asrc: Don't access members of config before checking it
ASoC: fsl_sarc_dma: Check pair before using it
ASoC: adau1977: Fix truncation warning on 64 bit architectures
ALSA: virtuoso: add Xonar Essence STX II support
ALSA: riptide: fix %d confusingly prefixed with 0x in format strings
ALSA: fireworks: fix %d confusingly prefixed with 0x in format strings
ALSA: hda - add codec ID for Braswell display audio codec
ALSA: hda - add PCI IDs for Intel Braswell
ALSA: usb-audio: Adjust Gamecom 780 volume level
ALSA: usb-audio: improve dmesg source grepability
ASoC: rt5670: Fix duplicate const warnings
ASoC: rt5670: Staticise non-exported symbols
ASoC: Intel: update stream only on stream IPC msgs
...
Pull x86 vdso updates from Ingo Molnar:
"Further simplifications and improvements to the VDSO code, by Andy
Lutomirski"
* 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86_64/vsyscall: Fix warn_bad_vsyscall log output
x86/vdso: Set VM_MAYREAD for the vvar vma
x86, vdso: Get rid of the fake section mechanism
x86, vdso: Move the vvar area before the vdso text
Pull x86 UV TLB update from Ingo Molnar:
"UV TLB shootdown logic updates for version of the UV architecture"
* 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/uv: Update the UV3 TLB shootdown logic
Pull x86 platform updates from Ingo Molnar:
"The main changes in this cycle are:
- Intel SOC driver updates, by Aubrey Li.
- TS5500 platform updates, by Vivien Didelot"
* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/pmc_atom: Silence shift wrapping warnings in pmc_sleep_tmr_show()
x86/pmc_atom: Expose PMC device state and platform sleep state
x86/pmc_atom: Eisable a few S0ix wake up events for S0ix residency
x86/platform: New Intel Atom SOC power management controller driver
x86/platform/ts5500: Add support for TS-5400 boards
x86/platform/ts5500: Add a 'name' sysfs attribute
x86/platform/ts5500: Use the DEVICE_ATTR_RO() macro
Pull x86 mm changes from Ingo Molnar:
"The main change in this cycle is the rework of the TLB range flushing
code, to simplify, fix and consolidate the code. By Dave Hansen"
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Set TLB flush tunable to sane value (33)
x86/mm: New tunable for single vs full TLB flush
x86/mm: Add tracepoints for TLB flushes
x86/mm: Unify remote INVLPG code
x86/mm: Fix missed global TLB flush stat
x86/mm: Rip out complicated, out-of-date, buggy TLB flushing
x86/mm: Clean up the TLB flushing code
x86/smep: Be more informative when signalling an SMEP fault
Pull EFI changes from Ingo Molnar:
"Main changes in this cycle are:
- arm64 efi stub fixes, preservation of FP/SIMD registers across
firmware calls, and conversion of the EFI stub code into a static
library - Ard Biesheuvel
- Xen EFI support - Daniel Kiper
- Support for autoloading the efivars driver - Lee, Chun-Yi
- Use the PE/COFF headers in the x86 EFI boot stub to request that
the stub be loaded with CONFIG_PHYSICAL_ALIGN alignment - Michael
Brown
- Consolidate all the x86 EFI quirks into one file - Saurabh Tangri
- Additional error logging in x86 EFI boot stub - Ulf Winkelvos
- Support loading initrd above 4G in EFI boot stub - Yinghai Lu
- EFI reboot patches for ACPI hardware reduced platforms"
* 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
efi/arm64: Handle missing virtual mapping for UEFI System Table
arch/x86/xen: Silence compiler warnings
xen: Silence compiler warnings
x86/efi: Request desired alignment via the PE/COFF headers
x86/efi: Add better error logging to EFI boot stub
efi: Autoload efivars
efi: Update stale locking comment for struct efivars
arch/x86: Remove efi_set_rtc_mmss()
arch/x86: Replace plain strings with constants
xen: Put EFI machinery in place
xen: Define EFI related stuff
arch/x86: Remove redundant set_bit(EFI_MEMMAP) call
arch/x86: Remove redundant set_bit(EFI_SYSTEM_TABLES) call
efi: Introduce EFI_PARAVIRT flag
arch/x86: Do not access EFI memory map if it is not available
efi: Use early_mem*() instead of early_io*()
arch/ia64: Define early_memunmap()
x86/reboot: Add EFI reboot quirk for ACPI Hardware Reduced flag
efi/reboot: Allow powering off machines using EFI
efi/reboot: Add generic wrapper around EfiResetSystem()
...
Pull x86 cpufeature updates from Ingo Molnar:
"The main changes in this cycle were:
- Continued cleanups of CPU bugs mis-marked as 'missing features', by
Borislav Petkov.
- Detect the xsaves/xrstors feature and releated cleanup, by Fenghua
Yu"
* 'x86-cpufeature-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, cpu: Kill cpu_has_mp
x86, amd: Cleanup init_amd
x86/cpufeature: Add bug flags to /proc/cpuinfo
x86, cpufeature: Convert more "features" to bugs
x86/xsaves: Detect xsaves/xrstors feature
x86/cpufeature.h: Reformat x86 feature macros
Pull x86 build/cleanup/debug updates from Ingo Molnar:
"Robustify the build process with a quirk to avoid GCC reordering
related bugs.
Two code cleanups.
Simplify entry_64.S CFI annotations, by Jan Beulich"
* 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, build: Change code16gcc.h from a C header to an assembly header
* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Simplify __HAVE_ARCH_CMPXCHG tests
x86/tsc: Get rid of custom DIV_ROUND() macro
* 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/debug: Drop several unnecessary CFI annotations
Pull perf changes from Ingo Molnar:
"Kernel side changes:
- Consolidate the PMU interrupt-disabled code amongst architectures
(Vince Weaver)
- misc fixes
Tooling changes (new features, user visible changes):
- Add support for pagefault tracing in 'trace', please see multiple
examples in the changeset messages (Stanislav Fomichev).
- Add pagefault statistics in 'trace' (Stanislav Fomichev)
- Add header for columns in 'top' and 'report' TUI browsers (Jiri
Olsa)
- Add pagefault statistics in 'trace' (Stanislav Fomichev)
- Add IO mode into timechart command (Stanislav Fomichev)
- Fallback to syscalls:* when raw_syscalls:* is not available in the
perl and python perf scripts. (Daniel Bristot de Oliveira)
- Add --repeat global option to 'perf bench' to be used in benchmarks
such as the existing 'futex' one, that was modified to use it
instead of a local option. (Davidlohr Bueso)
- Fix fd -> pathname resolution in 'trace', be it using /proc or a
vfs_getname probe point. (Arnaldo Carvalho de Melo)
- Add suggestion of how to set perf_event_paranoid sysctl, to help
non-root users trying tools like 'trace' to get a working
environment. (Arnaldo Carvalho de Melo)
- Updates from trace-cmd for traceevent plugin_kvm plus args cleanup
(Steven Rostedt, Jan Kiszka)
- Support S/390 in 'perf kvm stat' (Alexander Yarygin)
Tooling infrastructure changes:
- Allow reserving a row for header purposes in the hists browser
(Arnaldo Carvalho de Melo)
- Various fixes and prep work related to supporting Intel PT (Adrian
Hunter)
- Introduce multiple debug variables control (Jiri Olsa)
- Add callchain and additional sample information for python scripts
(Joseph Schuchart)
- More prep work to support Intel PT: (Adrian Hunter)
- Polishing 'script' BTS output
- 'inject' can specify --kallsym
- VDSO is per machine, not a global var
- Expose data addr lookup functions previously private to 'script'
- Large mmap fixes in events processing
- Include standard stringify macros in power pc code (Sukadev
Bhattiprolu)
Tooling cleanups:
- Convert open coded equivalents to asprintf() (Andy Shevchenko)
- Remove needless reassignments in 'trace' (Arnaldo Carvalho de Melo)
- Cache the is_exit syscall test in 'trace) (Arnaldo Carvalho de
Melo)
- No need to reimplement err() in 'perf bench sched-messaging', drop
barf(). (Davidlohr Bueso).
- Remove ev_name argument from perf_evsel__hists_browse, can be
obtained from the other parameters. (Jiri Olsa)
Tooling fixes:
- Fix memory leak in the 'sched-messaging' perf bench test.
(Davidlohr Bueso)
- The -o and -n 'perf bench mem' options are mutually exclusive, emit
error when both are specified. (Davidlohr Bueso)
- Fix scrollbar refresh row index in the ui browser, problem exposed
now that headers will be added and will be allowed to be switched
on/off. (Jiri Olsa)
- Handle the num array type in python properly (Sebastian Andrzej
Siewior)
- Fix wrong condition for allocation failure (Jiri Olsa)
- Adjust callchain based on DWARF debug info on powerpc (Sukadev
Bhattiprolu)
- Fix a risk for doing free on uninitialized pointer in traceevent
lib (Rickard Strandqvist)
- Update attr test with PERF_FLAG_FD_CLOEXEC flag (Jiri Olsa)
- Enable close-on-exec flag on perf file descriptor (Yann Droneaud)
- Fix build on gcc 4.4.7 (Arnaldo Carvalho de Melo)
- Event ordering fixes (Jiri Olsa)"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (123 commits)
Revert "perf tools: Fix jump label always changing during tracing"
perf tools: Fix perf usage string leftover
perf: Check permission only for parent tracepoint event
perf record: Store PERF_RECORD_FINISHED_ROUND only for nonempty rounds
perf record: Always force PERF_RECORD_FINISHED_ROUND event
perf inject: Add --kallsyms parameter
perf tools: Expose 'addr' functions so they can be reused
perf session: Fix accounting of ordered samples queue
perf powerpc: Include util/util.h and remove stringify macros
perf tools: Fix build on gcc 4.4.7
perf tools: Add thread parameter to vdso__dso_findnew()
perf tools: Add dso__type()
perf tools: Separate the VDSO map name from the VDSO dso name
perf tools: Add vdso__new()
perf machine: Fix the lifetime of the VDSO temporary file
perf tools: Group VDSO global variables into a structure
perf session: Add ability to skip 4GiB or more
perf session: Add ability to 'skip' a non-piped event stream
perf tools: Pass machine to vdso__dso_findnew()
perf tools: Add dso__data_size()
...
Pull locking updates from Ingo Molnar:
"The main changes in this cycle are:
- big rtmutex and futex cleanup and robustification from Thomas
Gleixner
- mutex optimizations and refinements from Jason Low
- arch_mutex_cpu_relax() removal and related cleanups
- smaller lockdep tweaks"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
arch, locking: Ciao arch_mutex_cpu_relax()
locking/lockdep: Only ask for /proc/lock_stat output when available
locking/mutexes: Optimize mutex trylock slowpath
locking/mutexes: Try to acquire mutex only if it is unlocked
locking/mutexes: Delete the MUTEX_SHOW_NO_WAITER macro
locking/mutexes: Correct documentation on mutex optimistic spinning
rtmutex: Make the rtmutex tester depend on BROKEN
futex: Simplify futex_lock_pi_atomic() and make it more robust
futex: Split out the first waiter attachment from lookup_pi_state()
futex: Split out the waiter check from lookup_pi_state()
futex: Use futex_top_waiter() in lookup_pi_state()
futex: Make unlock_pi more robust
rtmutex: Avoid pointless requeueing in the deadlock detection chain walk
rtmutex: Cleanup deadlock detector debug logic
rtmutex: Confine deadlock logic to futex
rtmutex: Simplify remove_waiter()
rtmutex: Document pi chain walk
rtmutex: Clarify the boost/deboost part
rtmutex: No need to keep task ref for lock owner check
rtmutex: Simplify and document try_to_take_rtmutex()
...
few days.
MIPS and s390 have little going on this release; just bugfixes, some
small, some larger.
The highlights for x86 are nested VMX improvements (Jan Kiszka), optimizations
for old processor (up to Nehalem, by me and Bandan Das), and a lot of x86
emulator bugfixes (Nadav Amit).
Stephen Rothwell reported a trivial conflict with the tracing branch.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJT300XAAoJEBvWZb6bTYby3V8QAJz+XyajnhJ8wH55Vxczz22L
i2gtUGmBLhEXsBcaVKO4BBfek88lLzg0SGLjfW5wCMQmKtxVlrwTCXNkBoPGjapd
NwHtWkMKym44PDhRovn7zkSumkxC43uFIBR/ebrhP6Bvhh9s+MnkQUxfw9ILB+YV
EeKyEG8sSgxFCciuHbp3mIXpDcO6r/ldy6I7009OdyhLoMY+Kvmk7kRe9wtAivdg
CGJi60QvGOn2RGRPOCEtF6UWr8Ae8fe1t84o0hkXPv/j3jtabzAatXKJa4dYNbIs
7Mp4NQpxaGV6rq3WCYVeZRxGs+UReGDAS3Il4Z8C9eTOTooSfxdVr8acpM8PY6I8
UmLT6ECLGycc4ELXrETtR+QLmiXACyJqyVxz4aiLV3kWSWfamKD3hBeQK9NizNcE
VoPDl+PyISvR1tW4KstBuzfUWAEXi+gO78cqqFr/VW6cl7HKpA1DFQaPfGkYKDae
2CPwcLwI5/M6RtSgkyXTkEqNZLc2BjldqSeM1lmWjhZVW56X2iqePUL46Vab3Yvt
U+sELtwEE560NLN3hbaHUsLR1tcUix5w8vTzcXPxgoHQBszHCcAZTWd1XHulr64F
rp/cangqtkPKcu5j1mNhQs38oLjHI1MUsbQrqFoD4tmHjQ75iXHRFzYGoIVKXyHG
AnGbQzJzBcdAANhm3LW0
=UXxV
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM changes from Paolo Bonzini:
"These are the x86, MIPS and s390 changes; PPC and ARM will come in a
few days.
MIPS and s390 have little going on this release; just bugfixes, some
small, some larger.
The highlights for x86 are nested VMX improvements (Jan Kiszka),
optimizations for old processor (up to Nehalem, by me and Bandan Das),
and a lot of x86 emulator bugfixes (Nadav Amit).
Stephen Rothwell reported a trivial conflict with the tracing branch"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (104 commits)
x86/kvm: Resolve shadow warnings in macro expansion
KVM: s390: rework broken SIGP STOP interrupt handling
KVM: x86: always exit on EOIs for interrupts listed in the IOAPIC redir table
KVM: vmx: remove duplicate vmx_mpx_supported() prototype
KVM: s390: Fix memory leak on busy SIGP stop
x86/kvm: Resolve shadow warning from min macro
kvm: Resolve missing-field-initializers warnings
Replace NR_VMX_MSR with its definition
KVM: x86: Assertions to check no overrun in MSR lists
KVM: x86: set rflags.rf during fault injection
KVM: x86: Setting rflags.rf during rep-string emulation
KVM: x86: DR6/7.RTM cannot be written
KVM: nVMX: clean up nested_release_vmcs12 and code around it
KVM: nVMX: fix lifetime issues for vmcs02
KVM: x86: Defining missing x86 vectors
KVM: x86: emulator injects #DB when RFLAGS.RF is set
KVM: x86: Cleanup of rflags.rf cleaning
KVM: x86: Clear rflags.rf on emulated instructions
KVM: x86: popf emulation should not change RF
KVM: x86: Clearing rflags.rf upon skipped emulated instruction
...
to the ftrace function callback infrastructure. It's introducing a
way to allow different functions to call directly different trampolines
instead of all calling the same "mcount" one.
The only user of this for now is the function graph tracer, which always
had a different trampoline, but the function tracer trampoline was called
and did basically nothing, and then the function graph tracer trampoline
was called. The difference now, is that the function graph tracer
trampoline can be called directly if a function is only being traced by
the function graph trampoline. If function tracing is also happening on
the same function, the old way is still done.
The accounting for this takes up more memory when function graph tracing
is activated, as it needs to keep track of which functions it uses.
I have a new way that wont take as much memory, but it's not ready yet
for this merge window, and will have to wait for the next one.
Another big change was the removal of the ftrace_start/stop() calls that
were used by the suspend/resume code that stopped function tracing when
entering into suspend and resume paths. The stop of ftrace was done
because there was some function that would crash the system if one called
smp_processor_id()! The stop/start was a big hammer to solve the issue
at the time, which was when ftrace was first introduced into Linux.
Now ftrace has better infrastructure to debug such issues, and I found
the problem function and labeled it with "notrace" and function tracing
can now safely be activated all the way down into the guts of suspend
and resume.
Other changes include clean ups of uprobe code.
Clean up of the trace_seq() code.
And other various small fixes and clean ups to ftrace and tracing.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJT35zXAAoJEKQekfcNnQGuOz0H/38zqM0nLFhrgvz3EPk2UOjn
xqpX8qyb2V7TJZL+IqeXU2a5cQZl5ba0D4WtBGpxbTae3CJYiuQ87iKUNFoH0om5
FDpn80igb368k8V3qRdRsziKVCCf0XBd/NkHJXc0ZkfXGyzB2Ga4bBxALxp2gj9y
bnO+vKo6+tWYKG4hyQb4P3LRXUrK8/LWEsPr39cH2QH1Rdj69Lx9CgrCdUVJmwcb
Bj8hEiLXL/RYCFNn79A3wNTUvW0rG/AOIf4SLqXtasSRZ0ToaU0ZyDnrNv+0Ol47
rX8tSk+LfXchL9hpIvjCf1vlAYq3pO02favteR/jip3lx/dTjEDE4RJ9qtJzZ4Q=
=fwQY
-----END PGP SIGNATURE-----
Merge tag 'trace-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"This pull request has a lot of work done. The main thing is the
changes to the ftrace function callback infrastructure. It's
introducing a way to allow different functions to call directly
different trampolines instead of all calling the same "mcount" one.
The only user of this for now is the function graph tracer, which
always had a different trampoline, but the function tracer trampoline
was called and did basically nothing, and then the function graph
tracer trampoline was called. The difference now, is that the
function graph tracer trampoline can be called directly if a function
is only being traced by the function graph trampoline. If function
tracing is also happening on the same function, the old way is still
done.
The accounting for this takes up more memory when function graph
tracing is activated, as it needs to keep track of which functions it
uses. I have a new way that wont take as much memory, but it's not
ready yet for this merge window, and will have to wait for the next
one.
Another big change was the removal of the ftrace_start/stop() calls
that were used by the suspend/resume code that stopped function
tracing when entering into suspend and resume paths. The stop of
ftrace was done because there was some function that would crash the
system if one called smp_processor_id()! The stop/start was a big
hammer to solve the issue at the time, which was when ftrace was first
introduced into Linux. Now ftrace has better infrastructure to debug
such issues, and I found the problem function and labeled it with
"notrace" and function tracing can now safely be activated all the way
down into the guts of suspend and resume
Other changes include clean ups of uprobe code, clean up of the
trace_seq() code, and other various small fixes and clean ups to
ftrace and tracing"
* tag 'trace-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (57 commits)
ftrace: Add warning if tramp hash does not match nr_trampolines
ftrace: Fix trampoline hash update check on rec->flags
ring-buffer: Use rb_page_size() instead of open coded head_page size
ftrace: Rename ftrace_ops field from trampolines to nr_trampolines
tracing: Convert local function_graph functions to static
ftrace: Do not copy old hash when resetting
tracing: let user specify tracing_thresh after selecting function_graph
ring-buffer: Always run per-cpu ring buffer resize with schedule_work_on()
tracing: Remove function_trace_stop and HAVE_FUNCTION_TRACE_MCOUNT_TEST
s390/ftrace: remove check of obsolete variable function_trace_stop
arm64, ftrace: Remove check of obsolete variable function_trace_stop
Blackfin: ftrace: Remove check of obsolete variable function_trace_stop
metag: ftrace: Remove check of obsolete variable function_trace_stop
microblaze: ftrace: Remove check of obsolete variable function_trace_stop
MIPS: ftrace: Remove check of obsolete variable function_trace_stop
parisc: ftrace: Remove check of obsolete variable function_trace_stop
sh: ftrace: Remove check of obsolete variable function_trace_stop
sparc64,ftrace: Remove check of obsolete variable function_trace_stop
tile: ftrace: Remove check of obsolete variable function_trace_stop
ftrace: x86: Remove check of obsolete variable function_trace_stop
...
Pull percpu updates from Tejun Heo:
- Major reorganization of percpu header files which I think makes
things a lot more readable and logical than before.
- percpu-refcount is updated so that it requires explicit destruction
and can be reinitialized if necessary. This was pulled into the
block tree to replace the custom percpu refcnting implemented in
blk-mq.
- In the process, percpu and percpu-refcount got cleaned up a bit
* 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (21 commits)
percpu-refcount: implement percpu_ref_reinit() and percpu_ref_is_zero()
percpu-refcount: require percpu_ref to be exited explicitly
percpu-refcount: use unsigned long for pcpu_count pointer
percpu-refcount: add helpers for ->percpu_count accesses
percpu-refcount: one bit is enough for REF_STATUS
percpu-refcount, aio: use percpu_ref_cancel_init() in ioctx_alloc()
workqueue: stronger test in process_one_work()
workqueue: clear POOL_DISASSOCIATED in rebind_workers()
percpu: Use ALIGN macro instead of hand coding alignment calculation
percpu: invoke __verify_pcpu_ptr() from the generic part of accessors and operations
percpu: preffity percpu header files
percpu: use raw_cpu_*() to define __this_cpu_*()
percpu: reorder macros in percpu header files
percpu: move {raw|this}_cpu_*() definitions to include/linux/percpu-defs.h
percpu: move generic {raw|this}_cpu_*_N() definitions to include/asm-generic/percpu.h
percpu: only allow sized arch overrides for {raw|this}_cpu_*() ops
percpu: reorganize include/linux/percpu-defs.h
percpu: move accessors from include/linux/percpu.h to percpu-defs.h
percpu: include/asm-generic/percpu.h should contain only arch-overridable parts
percpu: introduce arch_raw_cpu_ptr()
...
We don't have any good way to figure out what kinds of flushes
are being attempted. Right now, we can try to use the vm
counters, but those only tell us what we actually did with the
hardware (one-by-one vs full) and don't tell us what was actually
_requested_.
This allows us to select out "interesting" TLB flushes that we
might want to optimize (like the ranged ones) and ignore the ones
that we have very little control over (the ones at context
switch).
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: http://lkml.kernel.org/r/20140731154059.4C96CBA5@viggo.jf.intel.com
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
I think the flush_tlb_mm_range() code that tries to tune the
flush sizes based on the CPU needs to get ripped out for
several reasons:
1. It is obviously buggy. It uses mm->total_vm to judge the
task's footprint in the TLB. It should certainly be using
some measure of RSS, *NOT* ->total_vm since only resident
memory can populate the TLB.
2. Haswell, and several other CPUs are missing from the
intel_tlb_flushall_shift_set() function. Thus, it has been
demonstrated to bitrot quickly in practice.
3. It is plain wrong in my vm:
[ 0.037444] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0
[ 0.037444] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0
[ 0.037444] tlb_flushall_shift: 6
Which leads to it to never use invlpg.
4. The assumptions about TLB refill costs are wrong:
http://lkml.kernel.org/r/1337782555-8088-3-git-send-email-alex.shi@intel.com
(more on this in later patches)
5. I can not reproduce the original data: https://lkml.org/lkml/2012/5/17/59
I believe the sample times were too short. Running the
benchmark in a loop yields times that vary quite a bit.
Note that this leaves us with a static ceiling of 1 page. This
is a conservative, dumb setting, and will be revised in a later
patch.
This also removes the code which attempts to predict whether we
are flushing data or instructions. We expect instruction flushes
to be relatively rare and not worth tuning for explicitly.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: http://lkml.kernel.org/r/20140731154055.ABC88E89@viggo.jf.intel.com
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
This moves the espfix64 logic into native_iret. To make this work,
it gets rid of the native patch for INTERRUPT_RETURN:
INTERRUPT_RETURN on native kernels is now 'jmp native_iret'.
This changes the 16-bit SS behavior on Xen from OOPSing to leaking
some bits of the Xen hypervisor's RSP (I think).
[ hpa: this is a nonzero cost on native, but probably not enough to
measure. Xen needs to fix this in their own code, probably doing
something equivalent to espfix64. ]
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/7b8f1d8ef6597cb16ae004a43c56980a7de3cf94.1406129132.git.luto@amacapital.net
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@vger.kernel.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJT1VYNAAoJEHm+PkMAQRiGQJwIAKSYp1Uqz5O/e5r0V1TlZKT4
1B4Njopl57PwSrJQWcGEuH2yHyM896vfPO4L6BJIOfyWzh8kwpQqclDt6uhXoF/v
OsO1zb/7/j+n/pDZsePqP9AyIgErsHEBgUbhecDqzjN++ITPcZjQ6TIMPglZaumN
jFAdAZuAaEwqAk8jqN2wlm689Fh9MuUEarHXbXLCqu5RgLrWhFGhp/cTWY62aqnZ
XfEeQ9KtpRZmlR/IYjerbb1eRH7ZdJsZ88WngLX9dj/JdNxHWBkWQBXGAusXk5Fk
y6LsIV3TjyBdrRKJ1Ifyg/2EIXHNBs8HxTFGXpjtp2HPuMLDxZOWOWikb9URtNg=
=Fjf4
-----END PGP SIGNATURE-----
Merge tag 'v3.16-rc7' into perf/core, to merge in the latest fixes before applying new changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add the following interfaces to exposes PMC device state and sleep
state residency via debugfs:
/sys/kernel/debugfs/pmc_atom/dev_state
/sys/kernel/debugfs/pmc_atom/sleep_state
Signed-off-by: Aubrey Li <aubrey.li@linux.intel.com>
Link: http://lkml.kernel.org/r/53B0FF59.8000600@linux.intel.com
Signed-off-by: Kasagar, Srinidhi <srinidhi.kasagar@intel.com>
Reviewed-by: Rudramuni, Vishwesh M <vishwesh.m.rudramuni@intel.com>
Reviewed-by: Joe Perches <joe@perches.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Disable PMC S0IX_WAKE_EN events coming from LPC block(unused) and
also from GPIO_SUS ored dedicated IRQs (must be disabled as per PMC
programming rule), GPIOSCORE ored dedicated IRQs (must be disabled
as per PMC programming rule), GPIO_SUS shared IRQ (not necessary
since the IOAPIC_DS wake event will still work), GPIO_SCORE shared
IRQ (not necessary since the IOAPIC_DS wake event will still work).
Signed-off-by: Aubrey Li <aubrey.li@linux.intel.com>
Link: http://lkml.kernel.org/r/53B0FF22.5080403@linux.intel.com
Signed-off-by: Olivier Leveque <olivier.leveque@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
The Power Management Controller (PMC) controls many of the power
management features present in the Atom SoC. This driver provides
a native power off function via PMC PCI IO port.
On some ACPI hardware-reduced platforms(e.g. ASUS-T100), ACPI sleep
registers are not valid so that (*pm_power_off)() is not hooked by
acpi_power_off(). The power off function in this driver is installed
only when pm_power_off is NULL.
Signed-off-by: Aubrey Li <aubrey.li@linux.intel.com>
Link: http://lkml.kernel.org/r/53B0FEEA.3010805@linux.intel.com
Signed-off-by: Lejun Zhu <lejun.zhu@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
The forthcoming patch will make <acpi/acpi.h> to be visible to all kernel
source code. Thus for the architectures that do not support ACPI and
haven't implemented <asm/acenv.h>, we need to make it excluded.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Haswell and newer Intel CPUs have support for RTM, and in that case DR6.RTM is
not fixed to 1 and DR7.RTM is not fixed to zero. That is not the case in the
current KVM implementation. This bug is apparent only if the MOV-DR instruction
is emulated or the host also debugs the guest.
This patch is a partial fix which enables DR6.RTM and DR7.RTM to be cleared and
set respectively. It also sets DR6.RTM upon every debug exception. Obviously,
it is not a complete fix, as debugging of RTM is still unsupported.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now ARM64 support is being added to ACPI so architecture specific
values can not be used in core ACPI code.
Following on the patch "ACPI / processor: Check if LAPIC is present
during initialization" which uses acpi_lapic in acpi_processor.c,
on ARM64 platform, GIC is used instead of local APIC, so acpi_lapic
is not a suitable value for ARM64.
What is actually important at this point is if there is/are CPU
entry/entries (Local APIC/SAPIC, GICC) in MADT, so introduce
acpi_has_cpu_in_madt() to be arch specific and generic.
Signed-off-by: Graeme Gregory <graeme.gregory@linaro.org>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
It appears that the BayTrail-T class of hardware requires EFI in order
to powerdown and reboot and no other reliable method exists.
This quirk is generally applicable to all hardware that has the ACPI
Hardware Reduced bit set, since usually ACPI would be the preferred
method.
Cc: Len Brown <len.brown@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
The function graph trampoline is called from the function trampoline
and both do a save and restore of registers. The save of registers
done by the function trampoline when only the function graph tracer
is running is a waste of CPU cycles.
As the function graph tracer trampoline in x86 is dependent from
the function trampoline, we can call it directly when a function
is only being traced by the function graph trampoline.
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The arch_mutex_cpu_relax() function, introduced by 34b133f, is
hacky and ugly. It was added a few years ago to address the fact
that common cpu_relax() calls include yielding on s390, and thus
impact the optimistic spinning functionality of mutexes. Nowadays
we use this function well beyond mutexes: rwsem, qrwlock, mcs and
lockref. Since the macro that defines the call is in the mutex header,
any users must include mutex.h and the naming is misleading as well.
This patch (i) renames the call to cpu_relax_lowlatency ("relax, but
only if you can do it with very low latency") and (ii) defines it in
each arch's asm/processor.h local header, just like for regular cpu_relax
functions. On all archs, except s390, cpu_relax_lowlatency is simply cpu_relax,
and thus we can take it out of mutex.h. While this can seem redundant,
I believe it is a good choice as it allows us to move out arch specific
logic from generic locking primitives and enables future(?) archs to
transparently define it, similarly to System Z.
Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Bharat Bhushan <r65777@freescale.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David Howells <dhowells@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Cc: Dominik Dingel <dingel@linux.vnet.ibm.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Joe Perches <joe@perches.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Joseph Myers <joseph@codesourcery.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Nicolas Pitre <nico@linaro.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Qais Yousef <qais.yousef@imgtec.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Rafael Wysocki <rafael.j.wysocki@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Stratos Karafotis <stratosk@semaphore.gr>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vasily Kulikov <segoon@openwall.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Cc: Waiman Long <Waiman.Long@hp.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Wolfram Sang <wsa@the-dreams.de>
Cc: adi-buildroot-devel@lists.sourceforge.net
Cc: linux390@de.ibm.com
Cc: linux-alpha@vger.kernel.org
Cc: linux-am33-list@redhat.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-c6x-dev@linux-c6x.org
Cc: linux-cris-kernel@axis.com
Cc: linux-hexagon@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: linux@lists.openrisc.net
Cc: linux-m32r-ja@ml.linux-m32r.org
Cc: linux-m32r@ml.linux-m32r.org
Cc: linux-m68k@lists.linux-m68k.org
Cc: linux-metag@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: linux-xtensa@linux-xtensa.org
Cc: sparclinux@vger.kernel.org
Link: http://lkml.kernel.org/r/1404079773.2619.4.camel@buesod1.americas.hpqcorp.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently perf-kvm uses string literals for kvm event names, but it
works only for x86, because other architectures may have other names for
those events.
To reduce dependence on architecture, we add <asm/kvm_perf.h> file with
defines for:
- kvm_entry and kvm_exit events,
- exit reason field name in kvm_exit event,
- length of exit reasons strings,
- vcpu_id field name in kvm trace events,
and replace literals in perf-kvm.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by David Ahern <dsahern@gmail.com>
Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1404397747-20939-2-git-send-email-yarygin@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Dump the flags which denote we have detected and/or have applied bug
workarounds to the CPU we're executing on, in a similar manner to the
feature flags.
The advantage is that those are not accumulating over time like the CPU
features.
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1403609105-8332-2-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Both the 32-bit and 64-bit cmpxchg.h header define __HAVE_ARCH_CMPXCHG
and there's ifdeffery which checks it. But since both bitness define it,
we can just as well move it up to the main cmpxchg header and simpify a
bit of code in doing that.
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20140711104338.GB17083@pd.tnic
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Putting the vvar area after the vdso text is rather complicated: it
only works of the total length of the vdso text mapping is known at
vdso link time, and the linker doesn't allow symbol addresses to
depend on the sizes of non-allocatable data after the PT_LOAD
segment.
Moving the vvar area before the vdso text will allow is to safely
map non-allocatable data after the vdso text, which is a nice
simplification.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/156c78c0d93144ff1055a66493783b9e56813983.1405040914.git.luto@amacapital.net
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
rip_relative is only set if decode_modrm runs, and if you have ModRM
you will also have a memopp. We can then access memopp unconditionally.
Note that rip_relative cannot be hoisted up to decode_modrm, or you
break "mov $0, xyz(%rip)".
Also, move typecast on "out of range value" of mem.ea to decode_modrm.
Together, all these optimizations save about 50 cycles on each emulated
instructions (4-6%).
Signed-off-by: Bandan Das <bsd@redhat.com>
[Fix immediate operands with rip-relative addressing. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
x86_decode_insn already sets a default for seg_override,
so remove it from the zeroed area. Also replace set/get functions
with direct access to the field.
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A lot of initializations are unnecessary as they get set to
appropriate values before actually being used. Optimize
placement of fields in x86_emulate_ctxt
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Core emulator functions all belong in emulator.c,
x86 should have no knowledge of emulator internals
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We can just blindly move all 16 bytes of ctxt->src's value to ctxt->dst.
write_register_operand will take care of writing only the lower bytes.
Avoiding a call to memcpy (the compiler optimizes it out) gains about
200 cycles on kvm-unit-tests for register-to-register moves, and makes
them about as fast as arithmetic instructions.
We could perhaps get a larger speedup by moving all instructions _except_
moves out of x86_emulate_insn, removing opcode_len, and replacing the
switch statement with an inlined em_mov.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For the next patch we will need to know the full state of the
interrupt shadow; we will then set KVM_REQ_EVENT when one bit
is cleared.
However, right now get_interrupt_shadow only returns the one
corresponding to the emulated instruction, or an unconditional
0 if the emulated instruction does not have an interrupt shadow.
This is confusing and does not allow us to check for cleared
bits as mentioned above.
Clean the callback up, and modify toggle_interruptibility to
match the comment above the call. As a small result, the
call to set_interrupt_shadow will be skipped in the common
case where int_shadow == 0 && mask == 0.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit b4aa016305 ("efifb: Implement vga_default_device() (v2)") added
efifb vga_default_device() so EFI systems that do not load shadow VBIOS or
setup VGA get proper value for boot_vga PCI sysfs attribute on the
corresponding PCI device.
Xorg doesn't detect devices when boot_vga=0, e.g., on some EFI systems such
as MacBookAir2,1. Xorg detects the GPU and finds the DRI device but then
bails out with "no devices detected".
Note: When vga_default_device() is set boot_vga PCI sysfs attribute
reflects its state. When unset this attribute is 1 whenever
IORESOURCE_ROM_SHADOW flag is set.
With introduction of sysfb/simplefb/simpledrm efifb is getting obsolete
while having native drivers for the GPU also makes selecting sysfb/efifb
optional.
Remove the efifb implementation of vga_default_device() and initialize
vgaarb's vga_default_device() with the PCI GPU that matches boot
screen_info in pci_fixup_video().
[bhelgaas: remove unused "dev" in efifb_setup()]
Fixes: b4aa016305 ("efifb: Implement vga_default_device() (v2)")
Tested-by: Anibal Francisco Martinez Cortina <linuxkid.zeuz@gmail.com>
Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Matthew Garrett <matthew.garrett@nebula.com>
CC: stable@vger.kernel.org # v3.5+
I've observed kvmclock being marked as unstable on a modern
single-socket system with a stable TSC and qemu-1.6.2 or qemu-2.0.0.
The culprit was failure in TSC matching because of overflow of
kvm_arch::nr_vcpus_matched_tsc in case there were multiple TSC writes
in a single synchronization cycle.
Turns out that qemu does multiple TSC writes during init, below is the
evidence of that (qemu-2.0.0):
The first one:
0xffffffffa08ff2b4 : vmx_write_tsc_offset+0xa4/0xb0 [kvm_intel]
0xffffffffa04c9c05 : kvm_write_tsc+0x1a5/0x360 [kvm]
0xffffffffa04cfd6b : kvm_arch_vcpu_postcreate+0x4b/0x80 [kvm]
0xffffffffa04b8188 : kvm_vm_ioctl+0x418/0x750 [kvm]
The second one:
0xffffffffa08ff2b4 : vmx_write_tsc_offset+0xa4/0xb0 [kvm_intel]
0xffffffffa04c9c05 : kvm_write_tsc+0x1a5/0x360 [kvm]
0xffffffffa090610d : vmx_set_msr+0x29d/0x350 [kvm_intel]
0xffffffffa04be83b : do_set_msr+0x3b/0x60 [kvm]
0xffffffffa04c10a8 : msr_io+0xc8/0x160 [kvm]
0xffffffffa04caeb6 : kvm_arch_vcpu_ioctl+0xc86/0x1060 [kvm]
0xffffffffa04b6797 : kvm_vcpu_ioctl+0xc7/0x5a0 [kvm]
#0 kvm_vcpu_ioctl at /build/buildd/qemu-2.0.0+dfsg/kvm-all.c:1780
#1 kvm_put_msrs at /build/buildd/qemu-2.0.0+dfsg/target-i386/kvm.c:1270
#2 kvm_arch_put_registers at /build/buildd/qemu-2.0.0+dfsg/target-i386/kvm.c:1909
#3 kvm_cpu_synchronize_post_init at /build/buildd/qemu-2.0.0+dfsg/kvm-all.c:1641
#4 cpu_synchronize_post_init at /build/buildd/qemu-2.0.0+dfsg/include/sysemu/kvm.h:330
#5 cpu_synchronize_all_post_init () at /build/buildd/qemu-2.0.0+dfsg/cpus.c:521
#6 main at /build/buildd/qemu-2.0.0+dfsg/vl.c:4390
The third one:
0xffffffffa08ff2b4 : vmx_write_tsc_offset+0xa4/0xb0 [kvm_intel]
0xffffffffa04c9c05 : kvm_write_tsc+0x1a5/0x360 [kvm]
0xffffffffa090610d : vmx_set_msr+0x29d/0x350 [kvm_intel]
0xffffffffa04be83b : do_set_msr+0x3b/0x60 [kvm]
0xffffffffa04c10a8 : msr_io+0xc8/0x160 [kvm]
0xffffffffa04caeb6 : kvm_arch_vcpu_ioctl+0xc86/0x1060 [kvm]
0xffffffffa04b6797 : kvm_vcpu_ioctl+0xc7/0x5a0 [kvm]
#0 kvm_vcpu_ioctl at /build/buildd/qemu-2.0.0+dfsg/kvm-all.c:1780
#1 kvm_put_msrs at /build/buildd/qemu-2.0.0+dfsg/target-i386/kvm.c:1270
#2 kvm_arch_put_registers at /build/buildd/qemu-2.0.0+dfsg/target-i386/kvm.c:1909
#3 kvm_cpu_synchronize_post_reset at /build/buildd/qemu-2.0.0+dfsg/kvm-all.c:1635
#4 cpu_synchronize_post_reset at /build/buildd/qemu-2.0.0+dfsg/include/sysemu/kvm.h:323
#5 cpu_synchronize_all_post_reset () at /build/buildd/qemu-2.0.0+dfsg/cpus.c:512
#6 main at /build/buildd/qemu-2.0.0+dfsg/vl.c:4482
The fix is to count each vCPU only once when matched, so that
nr_vcpus_matched_tsc holds the size of the matched set. This is
achieved by reusing generation counters. Every vCPU with
this_tsc_generation == cur_tsc_generation is in the matched set. The
match set is cleared by setting cur_tsc_generation to a value which no
other vCPU is set to (by incrementing it).
I needed to bump up the counter size form u8 to u64 to ensure it never
overflows. Otherwise in cases TSC is not written the same number of
times on each vCPU the counter could overflow and incorrectly indicate
some vCPUs as being in the matched set. This scenario seems unlikely
but I'm not sure if it can be disregarded.
Signed-off-by: Tomasz Grabiec <tgrabiec@cloudius-systems.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Obtaining the port number from DX is bogus as a) there are immediate
port accesses and b) user space may have changed the register content
while processing the PIO access. Forward the correct value from the
instruction emulator instead.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>