linux

Author	SHA1	Message	Date
Linus Torvalds	7cbb39d4d4	Merge tag 'kvm-3.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "PPC and ARM do not have much going on this time. Most of the cool stuff, instead, is in s390 and (after a few releases) x86. ARM has some caching fixes and PPC has transactional memory support in guests. MIPS has some fixes, with more probably coming in 3.16 as QEMU will soon get support for MIPS KVM. For x86 there are optimizations for debug registers, which trigger on some Windows games, and other important fixes for Windows guests. We now expose to the guest Broadwell instruction set extensions and also Intel MPX. There's also a fix/workaround for OS X guests, nested virtualization features (preemption timer), and a couple kvmclock refinements. For s390, the main news is asynchronous page faults, together with improvements to IRQs (floating irqs and adapter irqs) that speed up virtio devices" * tag 'kvm-3.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (96 commits) KVM: PPC: Book3S HV: Save/restore host PMU registers that are new in POWER8 KVM: PPC: Book3S HV: Fix decrementer timeouts with non-zero TB offset KVM: PPC: Book3S HV: Don't use kvm_memslots() in real mode KVM: PPC: Book3S HV: Return ENODEV error rather than EIO KVM: PPC: Book3S: Trim top 4 bits of physical address in RTAS code KVM: PPC: Book3S HV: Add get/set_one_reg for new TM state KVM: PPC: Book3S HV: Add transactional memory support KVM: Specify byte order for KVM_EXIT_MMIO KVM: vmx: fix MPX detection KVM: PPC: Book3S HV: Fix KVM hang with CONFIG_KVM_XICS=n KVM: PPC: Book3S: Introduce hypervisor call H_GET_TCE KVM: PPC: Book3S HV: Fix incorrect userspace exit on ioeventfd write KVM: s390: clear local interrupts at cpu initial reset KVM: s390: Fix possible memory leak in SIGP functions KVM: s390: fix calculation of idle_mask array size KVM: s390: randomize sca address KVM: ioapic: reinject pending interrupts on KVM_SET_IRQCHIP KVM: Bump KVM_MAX_IRQ_ROUTES for s390 KVM: s390: irq routing for adapter interrupts. KVM: s390: adapter interrupt sources ...	2014-04-02 14:50:10 -07:00
Linus Torvalds	467cbd207a	Merge branch 'x86-nuke-platforms-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 old platform removal from Peter Anvin: "This patchset removes support for several completely obsolete platforms, where the maintainers either have completely vanished or acked the removal. For some of them it is questionable if there even exists functional specimens of the hardware" Geert Uytterhoeven apparently thought this was a April Fool's pull request ;) * 'x86-nuke-platforms-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, platforms: Remove NUMAQ x86, platforms: Remove SGI Visual Workstation x86, apic: Remove support for IBM Summit/EXA chipset x86, apic: Remove support for ia32-based Unisys ES7000	2014-04-02 13:15:58 -07:00
Linus Torvalds	c6f21243ce	Merge branch 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 vdso changes from Peter Anvin: "This is the revamp of the 32-bit vdso and the associated cleanups. This adds timekeeping support to the 32-bit vdso that we already have in the 64-bit vdso. Although 32-bit x86 is legacy, it is likely to remain in the embedded space for a very long time to come. This removes the traditional COMPAT_VDSO support; the configuration variable is reused for simply removing the 32-bit vdso, which will produce correct results but obviously suffer a performance penalty. Only one beta version of glibc was affected, but that version was unfortunately included in one OpenSUSE release. This is not the end of the vdso cleanups. Stefani and Andy have agreed to continue work for the next kernel cycle; in fact Andy has already produced another set of cleanups that came too late for this cycle. An incidental, but arguably important, change is that this ensures that unused space in the VVAR page is properly zeroed. It wasn't before, and would contain whatever garbage was left in memory by BIOS or the bootloader. Since the VVAR page is accessible to user space this had the potential of information leaks" * 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) x86, vdso: Fix the symbol versions on the 32-bit vDSO x86, vdso, build: Don't rebuild 32-bit vdsos on every make x86, vdso: Actually discard the .discard sections x86, vdso: Fix size of get_unmapped_area() x86, vdso: Finish removing VDSO32_PRELINK x86, vdso: Move more vdso definitions into vdso.h x86: Load the 32-bit vdso in place, just like the 64-bit vdsos x86, vdso32: handle 32 bit vDSO larger one page x86, vdso32: Disable stack protector, adjust optimizations x86, vdso: Zero-pad the VVAR page x86, vdso: Add 32 bit VDSO time support for 64 bit kernel x86, vdso: Add 32 bit VDSO time support for 32 bit kernel x86, vdso: Patch alternatives in the 32-bit VDSO x86, vdso: Introduce VVAR marco for vdso32 x86, vdso: Cleanup __vdso_gettimeofday() x86, vdso: Replace VVAR(vsyscall_gtod_data) by gtod macro x86, vdso: __vdso_clock_gettime() cleanup x86, vdso: Revamp vclock_gettime.c mm: Add new func _install_special_mapping() to mmap.c x86, vdso: Make vsyscall_gtod_data handling x86 generic ...	2014-04-02 12:26:43 -07:00
Steven Rostedt (Red Hat)	63c95654d8	x86: Fix dumpstack_64 irq stack handling Commit `2223f6f6ee` "x86: Clean up dumpstack_64.c code" changed the irq_stack processing a little from what it was before. The irq_stack_end variable needed to be cleared after its first use. By setting irq_stack to the per cpu irq_stack and passing that to analyze_stack(), and then clearing it after it is processed, we can get back the original behavior. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-04-02 11:46:50 -07:00
Steven Rostedt (Red Hat)	1aabc5990d	x86: Fix dumpstack_64 to keep state of "used" variable in loop Commit `2223f6f6ee` "x86: Clean up dumpstack_64.c code" moved the used variable to a local within the loop, but the in_exception_stack() depended on being non-volatile with the ability to change it. By always re-initializing the "used" variable to zero, it would cause the in_exception_stack() to return the same thing each time, and cause the dump_stack loop to go into an infinite loop. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-04-02 11:46:50 -07:00
K. Y. Srinivasan	f704a7d7f1	x86/platform/hyperv: Handle VMBUS driver being a module Hyper-V VMBUS driver can be a module; handle this case correctly. Please apply. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Cc: olaf@aepfle.de Cc: apw@canonical.com Cc: jasowang@redhat.com Link: http://lkml.kernel.org/r/1396421502-23222-1-git-send-email-kys@microsoft.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-04-02 09:49:47 +02:00
Ingo Molnar	b8764fe6d0	Merge branch 'linus' into x86/urgent Pick up Linus's latest, to fix a bug. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-04-02 09:48:56 +02:00
Vince Weaver	e69af4657e	perf/x86: Enable DRAM RAPL support on Intel Haswell It turns out all Haswell processors (including the Desktop variant) support RAPL DRAM readings in addition to package, pp0, and pp1. I've confirmed RAPL DRAM readings on my model 60 Haswell desktop. See the 4th-gen-core-family-desktop-vol-2-datasheet.pdf available from the Intel website for confirmation. Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Stephane Eranian <eranian@gmail.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1404020045290.17889@vincent-weaver-1.um.maine.edu Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-04-02 07:16:27 +02:00
Linus Torvalds	158e0d3621	Driver core / sysfs patches for 3.15-rc1 Here's the big driver core / sysfs update for 3.15-rc1. Lots of kernfs updates to make it useful for other subsystems, and a few other tiny driver core patches. All have been in linux-next for a while. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iEYEABECAAYFAlM7A0wACgkQMUfUDdst+ynJNACfZlY+KNKIhNFt1OOW8rQfSZzy 1PYAnjYuOoly01JlPrpJD5b4TdxaAq71 =GVUg -----END PGP SIGNATURE----- Merge tag 'driver-core-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core and sysfs updates from Greg KH: "Here's the big driver core / sysfs update for 3.15-rc1. Lots of kernfs updates to make it useful for other subsystems, and a few other tiny driver core patches. All have been in linux-next for a while" * tag 'driver-core-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (42 commits) Revert "sysfs, driver-core: remove unused {sysfs\|device}_schedule_callback_owner()" kernfs: cache atomic_write_len in kernfs_open_file numa: fix NULL pointer access and memory leak in unregister_one_node() Revert "driver core: synchronize device shutdown" kernfs: fix off by one error. kernfs: remove duplicate dir.c at the top dir x86: align x86 arch with generic CPU modalias handling cpu: add generic support for CPU feature based module autoloading sysfs: create bin_attributes under the requested group driver core: unexport static function create_syslog_header firmware: use power efficient workqueue for unloading and aborting fw load firmware: give a protection when map page failed firmware: google memconsole driver fixes firmware: fix google/gsmi duplicate efivars_sysfs_init() drivers/base: delete non-required instances of include <linux/init.h> kernfs: fix kernfs_node_from_dentry() ACPI / platform: drop redundant ACPI_HANDLE check kernfs: fix hash calculation in kernfs_rename_ns() kernfs: add CONFIG_KERNFS sysfs, kobject: add sysfs wrapper for kernfs_enable_ns() ...	2014-04-01 16:28:19 -07:00
Linus Torvalds	62ff577fa2	A bunch of EDAC updates all over the place: * Support for new AMD models, along with more graceful fallback for unsupported hw. * Bunch of fixes from SUSE accumulated from bug reports * Misc other fixes and cleanups -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJTMUueAAoJEBLB8Bhh3lVKbGIP/iXBanEUUXbg027hvNch8tbJ Yzsk3VkhmZ6hWXU4XNkvFFbSfCfHuHYQSEAxro6yc2t+6aVhMBrKIcCBNejNYt37 louswtfhC9A4geX206bzfOFNxOVsWzX2vxhhQFbqybZe6BnmcnCSKn//v+kF0/5E 6CrwQyKQjl2vGewPU4pBlrbnRNduPRs6bg6lWgzNxbKrhD8Ce/7NQRVd5v698pNm NTqfS37J90o1DP3I6yY30trtprPjjfnFANxb2DPaf09Z/X+aI1ONNe4s/URBd+qg yYIWAphbKopXPZOS0wDU0ikb7dGWH1lcQdVE69GxnCEvz9hz2ISg3plguwbYLPzK CmI4RGU0D8n5Irxu8bUID4SLT6fSAqU+cgJCdu6dqUncB7khcwG8dWJxMYNGjlBm TD9aCRRs+/b07KUrBeL1tufgd9gGPKWDeecTw55VMofIFkYvwIAz8TFdj3ql4Uz/ AsYJkAqvoO4dTz1ai+PJ06EWSijUT+KLkyZbybS5+PiP4uRAWY8pkCKbvTJQb1xN 0SAwm0tqT7mQKc/oIsvpZ/TLjwYxRhAeVMvrkElkNRPSsrt70rDVQiPgzQh9GTqV d5GmYI65Tk2Jrl7dVN+Pk+NDdS/y52SPk1AMDsfoBP6dndftdF6/gtqnijmbLgMY /yH+hIAKJXJUUKO4pELe =vafk -----END PGP SIGNATURE----- Merge tag 'edac_for_3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp Pull EDAC updates from Borislav Petkov: "A bunch of EDAC updates all over the place: - Support for new AMD models, along with more graceful fallback for unsupported hw. - Bunch of fixes from SUSE accumulated from bug reports - Misc other fixes and cleanups" * tag 'edac_for_3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: amd64_edac: Add support for newer F16h models i7core_edac: Drop unused variable i82875p_edac: Drop redundant call to pci_get_device() amd8111_edac: Fix leaks in probe error paths e752x_edac: Drop pvt->bridge_ck MCE, AMD: Fix decoding module loading on unsupported hw i5100_edac: Remove an unneeded condition in i5100_init_csrows() sb_edac: Degrade log level for device registration amd64_edac: Fix logic to determine channel for F15 M30h processors edac/85xx: Remove deprecated IRQF_DISABLED i3200_edac: Add a missing pci_disable_device() on the exit path i5400_edac: Disable device when unloading module e752x_edac: Simplify call to pci_get_device()	2014-04-01 13:54:00 -07:00
Linus Torvalds	4dedde7c7a	ACPI and power management updates for 3.15-rc1 - Device PM QoS support for latency tolerance constraints on systems with hardware interfaces allowing such constraints to be specified. That is necessary to prevent hardware-driven power management from becoming overly aggressive on some systems and to prevent power management features leading to excessive latencies from being used in some cases. - Consolidation of the handling of ACPI hotplug notifications for device objects. This causes all device hotplug notifications to go through the root notify handler (that was executed for all of them anyway before) that propagates them to individual subsystems, if necessary, by executing callbacks provided by those subsystems (those callbacks are associated with struct acpi_device objects during device enumeration). As a result, the code in question becomes both smaller in size and more straightforward and all of those changes should not affect users. - ACPICA update, including fixes related to the handling of _PRT in cases when it is broken and the addition of "Windows 2013" to the list of supported "features" for _OSI (which is necessary to support systems that work incorrectly or don't even boot without it). Changes from Bob Moore and Lv Zheng. - Consolidation of ACPI _OST handling from Jiang Liu. - ACPI battery and AC fixes allowing unusual system configurations to be handled by that code from Alexander Mezin. - New device IDs for the ACPI LPSS driver from Chiau Ee Chew. - ACPI fan and thermal optimizations related to system suspend and resume from Aaron Lu. - Cleanups related to ACPI video from Jean Delvare. - Assorted ACPI fixes and cleanups from Al Stone, Hanjun Guo, Lan Tianyu, Paul Bolle, Tomasz Nowicki. - Intel RAPL (Running Average Power Limits) driver cleanups from Jacob Pan. - intel_pstate fixes and cleanups from Dirk Brandewie. - cpufreq fixes related to system suspend/resume handling from Viresh Kumar. - cpufreq core fixes and cleanups from Viresh Kumar, Stratos Karafotis, Saravana Kannan, Rashika Kheria, Joe Perches. - cpufreq drivers updates from Viresh Kumar, Zhuoyu Zhang, Rob Herring. - cpuidle fixes related to the menu governor from Tuukka Tikkanen. - cpuidle fix related to coupled CPUs handling from Paul Burton. - Asynchronous execution of all device suspend and resume callbacks, except for ->prepare and ->complete, during system suspend and resume from Chuansheng Liu. - Delayed resuming of runtime-suspended devices during system suspend for the PCI bus type and ACPI PM domain. - New set of PM helper routines to allow device runtime PM callbacks to be used during system suspend and resume more easily from Ulf Hansson. - Assorted fixes and cleanups in the PM core from Geert Uytterhoeven, Prabhakar Lad, Philipp Zabel, Rashika Kheria, Sebastian Capella. - devfreq fix from Saravana Kannan. / -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJTLgB1AAoJEILEb/54YlRxfs4P/35fIu9h8ClNWUPXqi3nlGIt yMyumKvF1VdsOKLbjTtFq6B3UOlhqDijYTCQd7Xt7X8ONTk/ND9ec2t/5xGkSdUI q46fa0qZXeqUn0Kt2t+kl6tgVQOkDj94aNlEh+7Ya3Uu6WYDDfmZtOBOFAMk6D8l ND4rHJpX+eUsRLBrcxaUxxdD8AW5guGcPKyeyzsXv1bY1BZnpLFrZ3PhuI5dn2CL L/zmk3A+wG6+ZlQxnwDdrKa3E6uhRSIDeF0vI4Byspa1wi5zXknJG2J7MoQ9JEE9 VQpBXlqach5wgXqJ8PAqAeaB6Ie26/F7PYG8r446zKw/5UUtdNUx+0dkjQ7Mz8Tu ajuVxfwrrPhZeQqmVBxlH5Gg7Ez2KBKEfDxTdRnzI7FoA7PE5XDcg3kO64bhj8LJ yugnV/ToU9wMztZnPC7CoGPwUgxMJvr9LwmxS4aeKcVUBES05eg0vS3lwdZMgqkV iO0QkWTmhZ952qZCqZxbh0JqaaX8Wgx2kpX2tf1G2GJqLMZco289bLh6njNT+8CH EzdQKYYyn6G6+Qg2M0f/6So3qU17x9XtE4ZBWQdGDpqYOGZhjZAOs/VnB1Ysw/K3 cDBzswlJd0CyyUps9B+qbf49OpbWVwl5kKeuHUuPxugEVryhpSp9AuG+tNil74Sj JuGTGR4fyFjDBX5cvAPm =ywR6 -----END PGP SIGNATURE----- Merge tag 'pm+acpi-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management updates from Rafael Wysocki: "The majority of this material spent some time in linux-next, some of it even several weeks. There are a few relatively fresh commits in it, but they are mostly fixes and simple cleanups. ACPI took the lead this time, both in terms of the number of commits and the number of modified lines of code, cpufreq follows and there are a few changes in the PM core and in cpuidle too. A new feature that already got some LWN.net's attention is the device PM QoS extension allowing latency tolerance requirements to be propagated from leaf devices to their ancestors with hardware interfaces for specifying latency tolerance. That should help systems with hardware-driven power management to avoid going too far with it in cases when there are latency tolerance constraints. There also are some significant changes in the ACPI core related to the way in which hotplug notifications are handled. They affect PCI hotplug (ACPIPHP) and the ACPI dock station code too. The bottom line is that all those notification now go through the root notify handler and are propagated to the interested subsystems by means of callbacks instead of having to install a notify handler for each device object that we can potentially get hotplug notifications for. In addition to that ACPICA will now advertise "Windows 2013" compatibility for _OSI, because some systems out there don't work correctly if that is not done (some of them don't even boot). On the system suspend side of things, all of the device suspend and resume callbacks, except for ->prepare() and ->complete(), are now going to be executed asynchronously as that turns out to speed up system suspend and resume on some platforms quite significantly and we have a few more optimizations in that area. Apart from that, there are some new device IDs and fixes and cleanups all over. In particular, the system suspend and resume handling by cpufreq should be improved and the cpuidle menu governor should be a bit more robust now. Specifics: - Device PM QoS support for latency tolerance constraints on systems with hardware interfaces allowing such constraints to be specified. That is necessary to prevent hardware-driven power management from becoming overly aggressive on some systems and to prevent power management features leading to excessive latencies from being used in some cases. - Consolidation of the handling of ACPI hotplug notifications for device objects. This causes all device hotplug notifications to go through the root notify handler (that was executed for all of them anyway before) that propagates them to individual subsystems, if necessary, by executing callbacks provided by those subsystems (those callbacks are associated with struct acpi_device objects during device enumeration). As a result, the code in question becomes both smaller in size and more straightforward and all of those changes should not affect users. - ACPICA update, including fixes related to the handling of _PRT in cases when it is broken and the addition of "Windows 2013" to the list of supported "features" for _OSI (which is necessary to support systems that work incorrectly or don't even boot without it). Changes from Bob Moore and Lv Zheng. - Consolidation of ACPI _OST handling from Jiang Liu. - ACPI battery and AC fixes allowing unusual system configurations to be handled by that code from Alexander Mezin. - New device IDs for the ACPI LPSS driver from Chiau Ee Chew. - ACPI fan and thermal optimizations related to system suspend and resume from Aaron Lu. - Cleanups related to ACPI video from Jean Delvare. - Assorted ACPI fixes and cleanups from Al Stone, Hanjun Guo, Lan Tianyu, Paul Bolle, Tomasz Nowicki. - Intel RAPL (Running Average Power Limits) driver cleanups from Jacob Pan. - intel_pstate fixes and cleanups from Dirk Brandewie. - cpufreq fixes related to system suspend/resume handling from Viresh Kumar. - cpufreq core fixes and cleanups from Viresh Kumar, Stratos Karafotis, Saravana Kannan, Rashika Kheria, Joe Perches. - cpufreq drivers updates from Viresh Kumar, Zhuoyu Zhang, Rob Herring. - cpuidle fixes related to the menu governor from Tuukka Tikkanen. - cpuidle fix related to coupled CPUs handling from Paul Burton. - Asynchronous execution of all device suspend and resume callbacks, except for ->prepare and ->complete, during system suspend and resume from Chuansheng Liu. - Delayed resuming of runtime-suspended devices during system suspend for the PCI bus type and ACPI PM domain. - New set of PM helper routines to allow device runtime PM callbacks to be used during system suspend and resume more easily from Ulf Hansson. - Assorted fixes and cleanups in the PM core from Geert Uytterhoeven, Prabhakar Lad, Philipp Zabel, Rashika Kheria, Sebastian Capella. - devfreq fix from Saravana Kannan" * tag 'pm+acpi-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (162 commits) PM / devfreq: Rewrite devfreq_update_status() to fix multiple bugs PM / sleep: Correct whitespace errors in <linux/pm.h> intel_pstate: Set core to min P state during core offline cpufreq: Add stop CPU callback to cpufreq_driver interface cpufreq: Remove unnecessary braces cpufreq: Fix checkpatch errors and warnings cpufreq: powerpc: add cpufreq transition latency for FSL e500mc SoCs MAINTAINERS: Reorder maintainer addresses for PM and ACPI PM / Runtime: Update runtime_idle() documentation for return value meaning video / output: Drop display output class support fujitsu-laptop: Drop unneeded include acer-wmi: Stop selecting VIDEO_OUTPUT_CONTROL ACPI / gpu / drm: Stop selecting VIDEO_OUTPUT_CONTROL ACPI / video: fix ACPI_VIDEO dependencies cpufreq: remove unused notifier: CPUFREQ_{SUSPENDCHANGE\|RESUMECHANGE} cpufreq: Do not allow ->setpolicy drivers to provide ->target cpufreq: arm_big_little: set 'physical_cluster' for each CPU cpufreq: arm_big_little: make vexpress driver depend on bL core driver ACPI / button: Add ACPI Button event via netlink routine ACPI: Remove duplicate definitions of PREFIX ...	2014-04-01 12:48:54 -07:00
Linus Torvalds	683b6c6f82	Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq code updates from Thomas Gleixner: "The irq department proudly presents: - Another tree wide sweep of irq infrastructure abuse. Clear winner of the trainwreck engineering contest was: #include "../../../kernel/irq/settings.h" - Tree wide update of irq_set_affinity() callbacks which miss a cpu online check when picking a single cpu out of the affinity mask. - Tree wide consolidation of interrupt statistics. - Updates to the threaded interrupt infrastructure to allow explicit wakeup of the interrupt thread and a variant of synchronize_irq() which synchronizes only the hard interrupt handler. Both are needed to replace the homebrewn thread handling in the mmc/sdhci code. - New irq chip callbacks to allow proper support for GPIO based irqs. The GPIO based interrupts need to request/release GPIO resources from request/free_irq. - A few new ARM interrupt chips. No revolutionary new hardware, just differently wreckaged variations of the scheme. - Small improvments, cleanups and updates all over the place" I was hoping that that trainwreck engineering contest was a April Fools' joke. But no. * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (68 commits) irqchip: sun7i/sun6i: Disable NMI before registering the handler ARM: sun7i/sun6i: dts: Fix IRQ number for sun6i NMI controller ARM: sun7i/sun6i: irqchip: Update the documentation ARM: sun7i/sun6i: dts: Add NMI irqchip support ARM: sun7i/sun6i: irqchip: Add irqchip driver for NMI controller genirq: Export symbol no_action() arm: omap: Fix typo in ams-delta-fiq.c m68k: atari: Fix the last kernel_stat.h fallout irqchip: sun4i: Simplify sun4i_irq_ack irqchip: sun4i: Use handle_fasteoi_irq for all interrupts genirq: procfs: Make smp_affinity values go+r softirq: Add linux/irq.h to make it compile again m68k: amiga: Add linux/irq.h to make it compile again irqchip: sun4i: Don't ack IRQs > 0, fix acking of IRQ 0 irqchip: sun4i: Fix a comment about mask register initialization irqchip: sun4i: Fix irq 0 not working genirq: Add a new IRQCHIP_EOI_THREADED flag genirq: Document IRQCHIP_ONESHOT_SAFE flag ARM: sunxi: dt: Convert to the new irq controller compatibles irqchip: sunxi: Change compatibles ...	2014-04-01 11:22:57 -07:00
Linus Torvalds	1ead658124	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer changes from Thomas Gleixner: "This assorted collection provides: - A new timer based timer broadcast feature for systems which do not provide a global accessible timer device. That allows those systems to put CPUs into deep idle states where the per cpu timer device stops. - A few NOHZ_FULL related improvements to the timer wheel - The usual updates to timer devices found in ARM SoCs - Small improvements and updates all over the place" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits) tick: Remove code duplication in tick_handle_periodic() tick: Fix spelling mistake in tick_handle_periodic() x86: hpet: Use proper destructor for delayed work workqueue: Provide destroy_delayed_work_on_stack() clocksource: CMT, MTU2, TMU and STI should depend on GENERIC_CLOCKEVENTS timer: Remove code redundancy while calling get_nohz_timer_target() hrtimer: Rearrange comments in the order struct members are declared timer: Use variable head instead of &work_list in __run_timers() clocksource: exynos_mct: silence a static checker warning arm: zynq: Add support for cpufreq arm: zynq: Don't use arm_global_timer with cpufreq clocksource/cadence_ttc: Overhaul clocksource frequency adjustment clocksource/cadence_ttc: Call clockevents_update_freq() with IRQs enabled clocksource: Add Kconfig entries for CMT, MTU2, TMU and STI sh: Remove Kconfig entries for TMU, CMT and MTU2 ARM: shmobile: Remove CMT, TMU and STI Kconfig entries clocksource: armada-370-xp: Use atomic access for shared registers clocksource: orion: Use atomic access for shared registers clocksource: timer-keystone: Delete unnecessary variable clocksource: timer-keystone: introduce clocksource driver for Keystone ...	2014-04-01 11:00:07 -07:00
Linus Torvalds	b6d739e958	Merge branch 'x86-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 iommu quirk fix from Thomas Gleixner: "A quirk for the iommu quirk to include silicon which was assumed not to be out in the wild. This time with the correct logic applied" * 'x86-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Adjust irq remapping quirk for older revisions of 5500/5520 chipsets	2014-04-01 10:59:08 -07:00
Linus Torvalds	99f7b025bf	Merge branch 'x86-threadinfo-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 threadinfo changes from Ingo Molnar: "The main change here is the consolidation/unification of 32 and 64 bit thread_info handling methods, from Steve Rostedt" * 'x86-threadinfo-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, threadinfo: Redo "x86: Use inline assembler to get sp" x86: Clean up dumpstack_64.c code x86: Keep thread_info on thread stack in x86_32 x86: Prepare removal of previous_esp from i386 thread_info structure x86: Nuke GET_THREAD_INFO_WITH_ESP() macro for i386 x86: Nuke the supervisor_stack field in i386 thread_info	2014-04-01 10:17:18 -07:00
Linus Torvalds	b9b16a7922	Merge branch 'x86-cpufeature-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpufeature update from Ingo Molnar: "Two refinements to clflushopt support" * 'x86-cpufeature-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, cpufeature: If we disable CLFLUSH, we should disable CLFLUSHOPT x86, cpufeature: Rename X86_FEATURE_CLFLSH to X86_FEATURE_CLFLUSH	2014-04-01 10:11:21 -07:00
Linus Torvalds	4b2ce8f15f	Merge branch 'x86-reboot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 reboot changes from Ingo Molnar: "Refine the reboot logic around the CF9 and EFI reboot methods, to make it more robust. The expectation is for no working system to break, and for a couple of reboot-force systems to start rebooting automatically again" * 'x86-reboot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, reboot: Only use CF9_COND automatically, not CF9 x86, reboot: Add EFI and CF9 reboot methods into the default list	2014-04-01 10:10:59 -07:00
Ingo Molnar	b8c89c6a0d	Fix the code to tell when a CMCI storm ends by actually looking at the machine check banks when we poll while interrupts are disabled. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQIcBAABAgAGBQJTOZzGAAoJEKurIx+X31iBD6IP/2zd94hyif6gl8qGEqRUO0Ts 3Bo+P8DhFXH7VY5qvN9dP4raOfQjb8B2pFCShq6c4nr1lv6AKb4Gehf9EcQnXF20 z10vZuKGyJfyck0dKr3xE35V34DAARNoCJCWoSwutbqEaCQT/7J/cymkgdqUgHo9 f7XIeMy51dAVw7IeYleNLlVdDet1s5BuCPVdJOIqirGA9qfdDUUVuYmBB/mrbneI uftl27zd+KM4tcuNkVxrsRbZ8q/IOu1mhtb09aDpeKLLRfl282XsLr1j+52lbM8t RPieFO56jGlZn5015aWNTXARqhdKQ8wnUzttEJCGkl+p7bqNp3pQmpbNUD2ZpJF8 euR89OaxaOmU8Y1anfw/pA1G7WumkTG9WjOHRQFzSERPq1GF+YLlkaSwVTkgT9Ny 6ubvGr9WYPTxzNeDqUeu4HjCKOS1soWak7zIoXbu2Yu59scCcITE/cKw3y76BZ2O 2gKkBmBxlaZt8DBmPR2W4A+Uhb7lhJWvrYdCwpnrMA0nDCKOEDs5g5MVr5xMAEJ4 Zy1X6hwOwvmOmJp/DdHrsmfsRhCRV/itjb0vpfEF+MzWYbECD8Zyslb6YP5E5ZoV eWoKglC/W25907YJZiX2yzsCi+GNjLO/0V5ZJHwCxrWPSUnSJMeXjgbslIEQyOlC 0Gkj2DUIrgsGJDhwTmAJ =ZqzJ -----END PGP SIGNATURE----- Merge tag 'please-pull-cmci-storm' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/urgent Pull RAS/CMCI storm code fix from Tony Luck: "Fix the code to tell when a CMCI storm ends by actually looking at the machine check banks when we poll while interrupts are disabled." Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-04-01 15:13:16 +02:00
Maciej W. Rozycki	023de4a09f	x86/apic: Reinstate error IRQ Pentium erratum 3AP workaround A change introduced with commit `60283df7ac` ("x86/apic: Read Error Status Register correctly") removed a read from the APIC ESR register made before writing to same required to retrieve the correct error status on Pentium systems affected by the 3AP erratum[1]: "3AP. Writes to Error Register Clears Register PROBLEM: The APIC Error register is intended to only be read. If there is a write to this register the data in the APIC Error register will be cleared and lost. IMPLICATION: There is a possibility of clearing the Error register status since the write to the register is not specifically blocked. WORKAROUND: Writes should not occur to the Pentium processor APIC Error register. STATUS: For the steppings affected see the Summary Table of Changes at the beginning of this section." The steppings affected are actually: B1, B3 and B5. To avoid this information loss this change avoids the write to ESR on all Pentium systems where it is actually never needed; in Pentium processor documentation ESR was noted read-only and the write only required for future architectural compatibility[2]. The approach taken is the same as in lapic_setup_esr(). References: [1] "Pentium Processor Family Developer's Manual", Intel Corporation, 1997, order number 241428-005, Appendix A "Errata and S-Specs for the Pentium Processor Family", p. A-92, [2] "Pentium Processor Family Developer's Manual, Volume 3: Architecture and Programming Manual", Intel Corporation, 1995, order number 241430-004, Section 19.3.3. "Error Handling In APIC", p. 19-33. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Cc: Richard Weinberger <richard@nod.at> Link: http://lkml.kernel.org/r/alpine.LFD.2.11.1404011300010.27402@eddie.linux-mips.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-04-01 14:59:43 +02:00
Dimitri Sivanich	5f40f7d938	x86/UV: Set n_lshift based on GAM_GR_CONFIG MMR for UV3 The value of n_lshift for UV is currently set based on the socket m_val. For UV3, set the n_lshift value based on the GAM_GR_CONFIG MMR. This will allow bios to control the n_lshift value independent of the socket m_val. Then n_lshift can be assigned a fixed value across a multi-partition system, allowing for a fixed common global physical address format that is independent of socket m_val. Cleanup unneeded macros. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> Link: http://lkml.kernel.org/r/20140331143700.GB29916@sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-04-01 12:10:44 +02:00
Linus Torvalds	176ab02d49	Merge branch 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 LTO changes from Peter Anvin: "More infrastructure work in preparation for link-time optimization (LTO). Most of these changes is to make sure symbols accessed from assembly code are properly marked as visible so the linker doesn't remove them. My understanding is that the changes to support LTO are still not upstream in binutils, but are on the way there. This patchset should conclude the x86-specific changes, and remaining patches to actually enable LTO will be fed through the Kbuild tree (other than keeping up with changes to the x86 code base, of course), although not necessarily in this merge window" * 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits) Kbuild, lto: Handle basic LTO in modpost Kbuild, lto: Disable LTO for asm-offsets.c Kbuild, lto: Add a gcc-ld script to let run gcc as ld Kbuild, lto: add ld-version and ld-ifversion macros Kbuild, lto: Drop .number postfixes in modpost Kbuild, lto, workaround: Don't warn for initcall_reference in modpost lto: Disable LTO for sys_ni lto: Handle LTO common symbols in module loader lto, workaround: Add workaround for initcall reordering lto: Make asmlinkage __visible x86, lto: Disable LTO for the x86 VDSO initconst, x86: Fix initconst mistake in ts5500 code initconst: Fix initconst mistake in dcdbas asmlinkage: Make trace_hardirqs_on/off_caller visible asmlinkage, x86: Fix 32bit memcpy for LTO asmlinkage Make __stack_chk_failed and memcmp visible asmlinkage: Mark rwsem functions that can be called from assembler asmlinkage asmlinkage: Make main_extable_sort_needed visible asmlinkage, mutex: Mark __visible asmlinkage: Make trace_hardirq visible ...	2014-03-31 14:13:25 -07:00
Neil Horman	6f8a1b335f	x86: Adjust irq remapping quirk for older revisions of 5500/5520 chipsets Commit `03bbcb2e7e` (iommu/vt-d: add quirk for broken interrupt remapping on 55XX chipsets) properly disables irq remapping on the 5500/5520 chipsets that don't correctly perform that feature. However, when I wrote it, I followed the errata sheet linked in that commit too closely, and explicitly tied the activation of the quirk to revision 0x13 of the chip, under the assumption that earlier revisions were not in the field. Recently a system was reported to be suffering from this remap bug and the quirk hadn't triggered, because the revision id register read at a lower value that 0x13, so the quirk test failed improperly. Given this, it seems only prudent to adjust this quirk so that any revision less than 0x13 has the quirk asserted. [ tglx: Removed the 0x12 comparison of pci id 3405 as this is covered by the <= 0x13 check already ] Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1394649873-14913-1-git-send-email-nhorman@tuxdriver.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-03-31 22:07:01 +02:00
Linus Torvalds	e06df6a7ea	Merge branch 'x86-kaslr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 kaslr update from Ingo Molnar: "This adds kernel module load address randomization" * 'x86-kaslr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, kaslr: fix module lock ordering problem x86, kaslr: randomize module base load address	2014-03-31 12:34:49 -07:00
Linus Torvalds	c0fc3cbac0	Merge branch 'x86-hyperv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 hyperv change from Ingo Molnar: "Skip the timer_irq_works() check on hyperv systems" * 'x86-hyperv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, hyperv: Bypass the timer_irq_works() check	2014-03-31 12:28:38 -07:00
Linus Torvalds	7cc3afdf43	Merge branch 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 EFI changes from Ingo Molnar: "The main changes: - Add debug code to the dump EFI pagetable - Borislav Petkov - Make 1:1 runtime mapping robust when booting on machines with lots of memory - Borislav Petkov - Move the EFI facilities bits out of 'x86_efi_facility' and into efi.flags which is the standard architecture independent place to keep EFI state, by Matt Fleming. - Add 'EFI mixed mode' support: this allows 64-bit kernels to be booted from 32-bit firmware. This needs a bootloader that supports the 'EFI handover protocol'. By Matt Fleming" * 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits) x86, efi: Abstract x86 efi_early calls x86/efi: Restore 'attr' argument to query_variable_info() x86/efi: Rip out phys_efi_get_time() x86/efi: Preserve segment registers in mixed mode x86/boot: Fix non-EFI build x86, tools: Fix up compiler warnings x86/efi: Re-disable interrupts after calling firmware services x86/boot: Don't overwrite cr4 when enabling PAE x86/efi: Wire up CONFIG_EFI_MIXED x86/efi: Add mixed runtime services support x86/efi: Firmware agnostic handover entry points x86/efi: Split the boot stub into 32/64 code paths x86/efi: Add early thunk code to go from 64-bit to 32-bit x86/efi: Build our own EFI services pointer table efi: Add separate 32-bit/64-bit definitions x86/efi: Delete dead code when checking for non-native x86/mm/pageattr: Always dump the right page table in an oops x86, tools: Consolidate #ifdef code x86/boot: Cleanup header.S by removing some #ifdefs efi: Use NULL instead of 0 for pointer ...	2014-03-31 12:26:05 -07:00
Linus Torvalds	ad8946fbf9	Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 debug cleanup from Ingo Molnar: "A single trivial cleanup" * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: i386: Remove unneeded test of 'task' in dump_trace() (again)	2014-03-31 12:25:28 -07:00
Linus Torvalds	918d80a136	Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpu handling changes from Ingo Molnar: "Bigger changes: - Intel CPU hardware-enablement: new vector instructions support (AVX-512), by Fenghua Yu. - Support the clflushopt instruction and use it in appropriate places. clflushopt is similar to clflush but with more relaxed ordering, by Ross Zwisler. - MSR accessor cleanups, by Borislav Petkov. - 'forcepae' boot flag for those who have way too much time to spend on way too old Pentium-M systems and want to live way too dangerously, by Chris Bainbridge" * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, cpu: Add forcepae parameter for booting PAE kernels on PAE-disabled Pentium M Rename TAINT_UNSAFE_SMP to TAINT_CPU_OUT_OF_SPEC x86, intel: Make MSR_IA32_MISC_ENABLE bit constants systematic x86, Intel: Convert to the new bit access MSR accessors x86, AMD: Convert to the new bit access MSR accessors x86: Add another set of MSR accessor functions x86: Use clflushopt in drm_clflush_virt_range x86: Use clflushopt in drm_clflush_page x86: Use clflushopt in clflush_cache_range x86: Add support for the clflushopt instruction x86, AVX-512: Enable AVX-512 States Context Switch x86, AVX-512: AVX-512 Feature Detection	2014-03-31 12:00:45 -07:00
Linus Torvalds	26a5c0dfbc	Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: "Various smaller cleanups" * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, pageattr: Correct WBINVD spelling in comment x86, crash: Unify ifdef x86, boot: Correct max ramdisk size name	2014-03-31 12:00:10 -07:00
Linus Torvalds	6ed7705167	Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 apic changes from Ingo Molnar: "An xAPIC CPU hotplug race fix, plus cleanups and minor fixes" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Plug racy xAPIC access of CPU hotplug code x86/apic: Always define nox2apic and define it as initdata x86/apic: Remove unused function prototypes x86/apic: Switch wait_for_init_deassert() to a bool flag x86/apic: Only use default_wait_for_init_deassert()	2014-03-31 11:58:45 -07:00
Linus Torvalds	3a88fe3b74	Merge branch 'x86-acpi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 acpi numa fix from Ingo Molnar: "A single NUMA CPU hotplug fix" * 'x86-acpi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, acpi: Fix bug in associating hot-added CPUs with corresponding NUMA node	2014-03-31 11:58:08 -07:00
Linus Torvalds	971eae7c99	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler changes from Ingo Molnar: "Bigger changes: - sched/idle restructuring: they are WIP preparation for deeper integration between the scheduler and idle state selection, by Nicolas Pitre. - add NUMA scheduling pseudo-interleaving, by Rik van Riel. - optimize cgroup context switches, by Peter Zijlstra. - RT scheduling enhancements, by Thomas Gleixner. The rest is smaller changes, non-urgnt fixes and cleanups" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (68 commits) sched: Clean up the task_hot() function sched: Remove double calculation in fix_small_imbalance() sched: Fix broken setscheduler() sparc64, sched: Remove unused sparc64_multi_core sched: Remove unused mc_capable() and smt_capable() sched/numa: Move task_numa_free() to __put_task_struct() sched/fair: Fix endless loop in idle_balance() sched/core: Fix endless loop in pick_next_task() sched/fair: Push down check for high priority class task into idle_balance() sched/rt: Fix picking RT and DL tasks from empty queue trace: Replace hardcoding of 19 with MAX_NICE sched: Guarantee task priority in pick_next_task() sched/idle: Remove stale old file sched: Put rq's sched_avg under CONFIG_FAIR_GROUP_SCHED cpuidle/arm64: Remove redundant cpuidle_idle_call() cpuidle/powernv: Remove redundant cpuidle_idle_call() sched, nohz: Exclude isolated cores from load balancing sched: Fix select_task_rq_fair() description comments workqueue: Replace hardcoding of -20 and 19 with MIN_NICE and MAX_NICE sys: Replace hardcoding of -20 and 19 with MIN_NICE and MAX_NICE ...	2014-03-31 11:21:19 -07:00
Linus Torvalds	8c292f1174	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf changes from Ingo Molnar: "Main changes: Kernel side changes: - Add SNB/IVB/HSW client uncore memory controller support (Stephane Eranian) - Fix various x86/P4 PMU driver bugs (Don Zickus) Tooling, user visible changes: - Add several futex 'perf bench' microbenchmarks (Davidlohr Bueso) - Speed up thread map generation (Don Zickus) - Introduce 'perf kvm --list-cmds' command line option for use by scripts (Ramkumar Ramachandra) - Print the evsel name in the annotate stdio output, prep to fix support outputting annotation for multiple events, not just for the first one (Arnaldo Carvalho de Melo) - Allow setting preferred callchain method in .perfconfig (Jiri Olsa) - Show in what binaries/modules 'perf probe's are set (Masami Hiramatsu) - Support distro-style debuginfo for uprobe in 'perf probe' (Masami Hiramatsu) Tooling, internal changes and fixes: - Use tid in mmap/mmap2 events to find maps (Don Zickus) - Record the reason for filtering an address_location (Namhyung Kim) - Apply all filters to an addr_location (Namhyung Kim) - Merge al->filtered with hist_entry->filtered in report/hists (Namhyung Kim) - Fix memory leak when synthesizing thread records (Namhyung Kim) - Use ui__has_annotation() in 'report' (Namhyung Kim) - hists browser refactorings to reuse code accross UIs (Namhyung Kim) - Add support for the new DWARF unwinder library in elfutils (Jiri Olsa) - Fix build race in the generation of bison files (Jiri Olsa) - Further streamline the feature detection display, trimming it a bit to show just the libraries detected, using VF=1 gets a more verbose output, showing the less interesting feature checks as well (Jiri Olsa). - Check compatible symtab type before loading dso (Namhyung Kim) - Check return value of filename__read_debuglink() (Stephane Eranian) - Move some hashing and fs related code from tools/perf/util/ to tools/lib/ so that it can be used by more tools/ living utilities (Borislav Petkov) - Prepare DWARF unwinding code for using an elfutils alternative unwinding library (Jiri Olsa) - Fix DWARF unwind max_stack processing (Jiri Olsa) - Add dwarf unwind 'perf test' entry (Jiri Olsa) - 'perf probe' improvements including memory leak fixes, sharing the intlist class with other tools, uprobes/kprobes code sharing and use of ref_reloc_sym (Masami Hiramatsu) - Shorten sample symbol resolving by adding cpumode to struct addr_location (Arnaldo Carvalho de Melo) - Fix synthesizing mmaps for threads (Don Zickus) - Fix invalid output on event group stdio report (Namhyung Kim) - Fixup header alignment in 'perf sched latency' output (Ramkumar Ramachandra) - Fix off-by-one error in 'perf timechart record' argv handling (Ramkumar Ramachandra) Tooling, cleanups: - Remove unused thread__find_map function (Jiri Olsa) - Remove unused simple_strtoul() function (Ramkumar Ramachandra) Tooling, documentation updates: - Update function names in debug messages (Ramkumar Ramachandra) - Update some code references in design.txt (Ramkumar Ramachandra) - Clarify load-latency information in the 'perf mem' docs (Andi Kleen) - Clarify x86 register naming in 'perf probe' docs (Andi Kleen)" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (96 commits) perf tools: Remove unused simple_strtoul() function perf tools: Update some code references in design.txt perf evsel: Update function names in debug messages perf tools: Remove thread__find_map function perf annotate: Print the evsel name in the stdio output perf report: Use ui__has_annotation() perf tools: Fix memory leak when synthesizing thread records perf tools: Use tid in mmap/mmap2 events to find maps perf report: Merge al->filtered with hist_entry->filtered perf symbols: Apply all filters to an addr_location perf symbols: Record the reason for filtering an address_location perf sched: Fixup header alignment in 'latency' output perf timechart: Fix off-by-one error in 'record' argv handling perf machine: Factor machine__find_thread to take tid argument perf tools: Speed up thread map generation perf kvm: introduce --list-cmds for use by scripts perf ui hists: Pass evsel to hpp->header/width functions explicitly perf symbols: Introduce thread__find_cpumode_addr_location perf session: Change header.misc dump from decimal to hex perf ui/tui: Reuse generic __hpp__fmt() code ...	2014-03-31 11:13:25 -07:00
Chen, Gong	27f6c573e0	x86, CMCI: Add proper detection of end of CMCI storms When CMCI storm persists for a long time(at least beyond predefined threshold. It's 30 seconds for now), we can watch CMCI storm is detected immediately after it subsides. ... Dec 10 22:04:29 kernel: CMCI storm detected: switching to poll mode Dec 10 22:04:59 kernel: CMCI storm subsided: switching to interrupt mode Dec 10 22:04:59 kernel: CMCI storm detected: switching to poll mode Dec 10 22:05:29 kernel: CMCI storm subsided: switching to interrupt mode ... The problem is that our logic that determines that the storm has ended is incorrect. We announce the end, re-enable interrupts and realize that the storm is still going on, so we switch back to polling mode. Rinse, repeat. When a storm happens we disable signaling of errors via CMCI and begin polling machine check banks instead. If we find any logged errors, then we need to set a per-cpu flag so that our per-cpu tests that check whether the storm is ongoing will see that errors are still being logged independently of whether mce_notify_irq() says that the error has been fully processed. cmci_clear() is not the right tool to disable a bank. It disables the interrupt for the bank as desired, but it also clears the bit for this bank in "mce_banks_owned" so we will skip the bank when polling (so we fail to see that the storm continues because we stop looking). New cmci_storm_disable_banks() just disables the interrupt while allowing polling to continue. Reported-by: William Dauchy <wdauchy@gmail.com> Signed-off-by: Chen, Gong <gong.chen@linux.intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2014-03-28 13:40:16 -07:00
Jason Wang	ca3ba2a2f4	x86, hyperv: Bypass the timer_irq_works() check This patch bypass the timer_irq_works() check for hyperv guest since: - It was guaranteed to work. - timer_irq_works() may fail sometime due to the lpj calibration were inaccurate in a hyperv guest or a buggy host. In the future, we should get the tsc frequency from hypervisor and use preset lpj instead. [ hpa: I would prefer to not defer things to "the future" in the future... ] Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: <stable@vger.kernel.org> Acked-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Link: http://lkml.kernel.org/r/1393558229-14755-1-git-send-email-jasowang@redhat.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-03-27 11:02:45 -07:00
Thomas Gleixner	b712c8dae0	x86: hpet: Use proper destructor for delayed work destroy_timer_on_stack() is hardly the right thing for a delayed work. We leak a tracking object for the work itself when DEBUG_OBJECTS is enabled. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/20140323141940.034005322@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-03-25 17:34:01 +01:00
Kees Cook	9dd721c6db	x86, kaslr: fix module lock ordering problem There was a potential lock ordering problem with the module kASLR patch ("x86, kaslr: randomize module base load address"). This patch removes the usage of the module_mutex and creates a new mutex to protect the module base address offset value. Chain exists of: text_mutex --> kprobe_insn_slots.mutex --> module_mutex [ 0.515561] Possible unsafe locking scenario: [ 0.515561] [ 0.515561] CPU0 CPU1 [ 0.515561] ---- ---- [ 0.515561] lock(module_mutex); [ 0.515561] lock(kprobe_insn_slots.mutex); [ 0.515561] lock(module_mutex); [ 0.515561] lock(text_mutex); [ 0.515561] [ 0.515561] * DEADLOCK * Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Andy Honig <ahonig@google.com> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-03-24 10:18:26 -07:00
Rafael J. Wysocki	884411d9a5	Merge branch 'acpi-processor' * acpi-processor: ACPI: Move BAD_MADT_ENTRY() to linux/acpi.h ACPI / processor: Make it possible to get APIC ID via GIC ACPI / processor: Build idle_boot_override on x86 and ia64 ACPI / processor: Use ACPI_PROCESSOR_DEVICE_HID instead of "ACPI0007" ACPI / processor: Fix acpi_processor_eval_pdc() return value type	2014-03-21 16:53:28 +01:00
Chris Bainbridge	69f2366c94	x86, cpu: Add forcepae parameter for booting PAE kernels on PAE-disabled Pentium M Many Pentium M systems disable PAE but may have a functionally usable PAE implementation. This adds the "forcepae" parameter which bypasses the boot check for PAE, and sets the CPU as being PAE capable. Using this parameter will taint the kernel with TAINT_CPU_OUT_OF_SPEC. Signed-off-by: Chris Bainbridge <chris.bainbridge@gmail.com> Link: http://lkml.kernel.org/r/20140307114040.GA4997@localhost Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2014-03-20 16:31:54 -07:00
Dave Jones	8c90487cdc	Rename TAINT_UNSAFE_SMP to TAINT_CPU_OUT_OF_SPEC Rename TAINT_UNSAFE_SMP to TAINT_CPU_OUT_OF_SPEC, so we can repurpose the flag to encompass a wider range of pushing the CPU beyond its warrany. Signed-off-by: Dave Jones <davej@fedoraproject.org> Link: http://lkml.kernel.org/r/20140226154949.GA770@redhat.com Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2014-03-20 16:28:09 -07:00
Srivatsa S. Bhat	9014ad2a50	x86, hpet: Fix CPU hotplug callback registration Subsystems that want to register CPU hotplug callbacks, as well as perform initialization for the CPUs that are already online, often do it as shown below: get_online_cpus(); for_each_online_cpu(cpu) init_cpu(cpu); register_cpu_notifier(&foobar_cpu_notifier); put_online_cpus(); This is wrong, since it is prone to ABBA deadlocks involving the cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently with CPU hotplug operations). Instead, the correct and race-free way of performing the callback registration is: cpu_notifier_register_begin(); for_each_online_cpu(cpu) init_cpu(cpu); /* Note the use of the double underscored version of the API */ __register_cpu_notifier(&foobar_cpu_notifier); cpu_notifier_register_done(); Fix the hpet code in x86 by using this latter form of callback registration. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-20 13:43:43 +01:00
Srivatsa S. Bhat	a8c17c2951	x86, amd, uncore: Fix CPU hotplug callback registration Subsystems that want to register CPU hotplug callbacks, as well as perform initialization for the CPUs that are already online, often do it as shown below: get_online_cpus(); for_each_online_cpu(cpu) init_cpu(cpu); register_cpu_notifier(&foobar_cpu_notifier); put_online_cpus(); This is wrong, since it is prone to ABBA deadlocks involving the cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently with CPU hotplug operations). Instead, the correct and race-free way of performing the callback registration is: cpu_notifier_register_begin(); for_each_online_cpu(cpu) init_cpu(cpu); /* Note the use of the double underscored version of the API */ __register_cpu_notifier(&foobar_cpu_notifier); cpu_notifier_register_done(); Fix the amd-uncore code in x86 by using this latter form of callback registration. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-20 13:43:43 +01:00
Srivatsa S. Bhat	fd537e56f6	x86, intel, rapl: Fix CPU hotplug callback registration Subsystems that want to register CPU hotplug callbacks, as well as perform initialization for the CPUs that are already online, often do it as shown below: get_online_cpus(); for_each_online_cpu(cpu) init_cpu(cpu); register_cpu_notifier(&foobar_cpu_notifier); put_online_cpus(); This is wrong, since it is prone to ABBA deadlocks involving the cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently with CPU hotplug operations). Instead, the correct and race-free way of performing the callback registration is: cpu_notifier_register_begin(); for_each_online_cpu(cpu) init_cpu(cpu); /* Note the use of the double underscored version of the API */ __register_cpu_notifier(&foobar_cpu_notifier); cpu_notifier_register_done(); Fix the intel rapl code in x86 by using this latter form of callback registration. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-20 13:43:43 +01:00
Srivatsa S. Bhat	8c60ea1464	x86, intel, cacheinfo: Fix CPU hotplug callback registration Subsystems that want to register CPU hotplug callbacks, as well as perform initialization for the CPUs that are already online, often do it as shown below: get_online_cpus(); for_each_online_cpu(cpu) init_cpu(cpu); register_cpu_notifier(&foobar_cpu_notifier); put_online_cpus(); This is wrong, since it is prone to ABBA deadlocks involving the cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently with CPU hotplug operations). Instead, the correct and race-free way of performing the callback registration is: cpu_notifier_register_begin(); for_each_online_cpu(cpu) init_cpu(cpu); /* Note the use of the double underscored version of the API */ __register_cpu_notifier(&foobar_cpu_notifier); cpu_notifier_register_done(); Fix the intel cacheinfo code in x86 by using this latter form of callback registration. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-20 13:43:43 +01:00
Srivatsa S. Bhat	047868ce29	x86, amd, ibs: Fix CPU hotplug callback registration Subsystems that want to register CPU hotplug callbacks, as well as perform initialization for the CPUs that are already online, often do it as shown below: get_online_cpus(); for_each_online_cpu(cpu) init_cpu(cpu); register_cpu_notifier(&foobar_cpu_notifier); put_online_cpus(); This is wrong, since it is prone to ABBA deadlocks involving the cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently with CPU hotplug operations). Instead, the correct and race-free way of performing the callback registration is: cpu_notifier_register_begin(); for_each_online_cpu(cpu) init_cpu(cpu); /* Note the use of the double underscored version of the API */ __register_cpu_notifier(&foobar_cpu_notifier); cpu_notifier_register_done(); Fix the amd-ibs code in x86 by using this latter form of callback registration. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-20 13:43:43 +01:00
Srivatsa S. Bhat	7b7139d4ab	x86, therm_throt.c: Remove unused therm_cpu_lock After fixing the CPU hotplug callback registration code, the callbacks invoked for each online CPU, during the initialization phase in thermal_throttle_init_device(), can no longer race with the actual CPU hotplug notifier callbacks (in thermal_throttle_cpu_callback). Hence the therm_cpu_lock is unnecessary now. Remove it. Cc: Tony Luck <tony.luck@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Suggested-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-20 13:43:43 +01:00
Srivatsa S. Bhat	4e6192bbec	x86, therm_throt.c: Fix CPU hotplug callback registration Subsystems that want to register CPU hotplug callbacks, as well as perform initialization for the CPUs that are already online, often do it as shown below: get_online_cpus(); for_each_online_cpu(cpu) init_cpu(cpu); register_cpu_notifier(&foobar_cpu_notifier); put_online_cpus(); This is wrong, since it is prone to ABBA deadlocks involving the cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently with CPU hotplug operations). Instead, the correct and race-free way of performing the callback registration is: cpu_notifier_register_begin(); for_each_online_cpu(cpu) init_cpu(cpu); /* Note the use of the double underscored version of the API */ __register_cpu_notifier(&foobar_cpu_notifier); cpu_notifier_register_done(); Fix the thermal throttle code in x86 by using this latter form of callback registration. Cc: Tony Luck <tony.luck@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-20 13:43:42 +01:00
Srivatsa S. Bhat	82a8f131aa	x86, mce: Fix CPU hotplug callback registration Subsystems that want to register CPU hotplug callbacks, as well as perform initialization for the CPUs that are already online, often do it as shown below: get_online_cpus(); for_each_online_cpu(cpu) init_cpu(cpu); register_cpu_notifier(&foobar_cpu_notifier); put_online_cpus(); This is wrong, since it is prone to ABBA deadlocks involving the cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently with CPU hotplug operations). Instead, the correct and race-free way of performing the callback registration is: cpu_notifier_register_begin(); for_each_online_cpu(cpu) init_cpu(cpu); /* Note the use of the double underscored version of the API */ __register_cpu_notifier(&foobar_cpu_notifier); cpu_notifier_register_done(); Fix the mce code in x86 by using this latter form of callback registration. Cc: Tony Luck <tony.luck@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-20 13:43:42 +01:00
Srivatsa S. Bhat	2c666adacc	x86, intel, uncore: Fix CPU hotplug callback registration Subsystems that want to register CPU hotplug callbacks, as well as perform initialization for the CPUs that are already online, often do it as shown below: get_online_cpus(); for_each_online_cpu(cpu) init_cpu(cpu); register_cpu_notifier(&foobar_cpu_notifier); put_online_cpus(); This is wrong, since it is prone to ABBA deadlocks involving the cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently with CPU hotplug operations). Instead, the correct and race-free way of performing the callback registration is: cpu_notifier_register_begin(); for_each_online_cpu(cpu) init_cpu(cpu); /* Note the use of the double underscored version of the API */ __register_cpu_notifier(&foobar_cpu_notifier); cpu_notifier_register_done(); Fix the uncore code in intel-x86 by using this latter form of callback registration. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-20 13:43:42 +01:00
Srivatsa S. Bhat	42112a0f5d	x86, vsyscall: Fix CPU hotplug callback registration Subsystems that want to register CPU hotplug callbacks, as well as perform initialization for the CPUs that are already online, often do it as shown below: get_online_cpus(); for_each_online_cpu(cpu) init_cpu(cpu); register_cpu_notifier(&foobar_cpu_notifier); put_online_cpus(); This is wrong, since it is prone to ABBA deadlocks involving the cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently with CPU hotplug operations). Instead, the correct and race-free way of performing the callback registration is: cpu_notifier_register_begin(); for_each_online_cpu(cpu) init_cpu(cpu); /* Note the use of the double underscored version of the API */ __register_cpu_notifier(&foobar_cpu_notifier); cpu_notifier_register_done(); Fix the vsyscall code in x86 by using this latter form of callback registration. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-20 13:43:42 +01:00
Srivatsa S. Bhat	4b660b384d	x86, cpuid: Fix CPU hotplug callback registration Subsystems that want to register CPU hotplug callbacks, as well as perform initialization for the CPUs that are already online, often do it as shown below: get_online_cpus(); for_each_online_cpu(cpu) init_cpu(cpu); register_cpu_notifier(&foobar_cpu_notifier); put_online_cpus(); This is wrong, since it is prone to ABBA deadlocks involving the cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently with CPU hotplug operations). Instead, the correct and race-free way of performing the callback registration is: cpu_notifier_register_begin(); for_each_online_cpu(cpu) init_cpu(cpu); /* Note the use of the double underscored version of the API */ __register_cpu_notifier(&foobar_cpu_notifier); cpu_notifier_register_done(); Fix the cpuid code in x86 by using this latter form of callback registration. Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-20 13:43:42 +01:00
Srivatsa S. Bhat	de82a01bef	x86, msr: Fix CPU hotplug callback registration Subsystems that want to register CPU hotplug callbacks, as well as perform initialization for the CPUs that are already online, often do it as shown below: get_online_cpus(); for_each_online_cpu(cpu) init_cpu(cpu); register_cpu_notifier(&foobar_cpu_notifier); put_online_cpus(); This is wrong, since it is prone to ABBA deadlocks involving the cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently with CPU hotplug operations). Instead, the correct and race-free way of performing the callback registration is: cpu_notifier_register_begin(); for_each_online_cpu(cpu) init_cpu(cpu); /* Note the use of the double underscored version of the API */ __register_cpu_notifier(&foobar_cpu_notifier); cpu_notifier_register_done(); Fix the msr code in x86 by using this latter form of callback registration. Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-20 13:43:42 +01:00
Rafael J. Wysocki	5a2d853ffc	Merge branch 'pm-cpufreq' * pm-cpufreq: (30 commits) intel_pstate: Set core to min P state during core offline cpufreq: Add stop CPU callback to cpufreq_driver interface cpufreq: Remove unnecessary braces cpufreq: Fix checkpatch errors and warnings cpufreq: powerpc: add cpufreq transition latency for FSL e500mc SoCs cpufreq: remove unused notifier: CPUFREQ_{SUSPENDCHANGE\|RESUMECHANGE} cpufreq: Do not allow ->setpolicy drivers to provide ->target cpufreq: arm_big_little: set 'physical_cluster' for each CPU cpufreq: arm_big_little: make vexpress driver depend on bL core driver cpufreq: SPEAr: Instantiate as platform_driver cpufreq: Remove unnecessary variable/parameter 'frozen' cpufreq: Remove cpufreq_generic_exit() cpufreq: add 'freq_table' in struct cpufreq_policy cpufreq: Reformat printk() statements cpufreq: Tegra: Use cpufreq_generic_suspend() cpufreq: s5pv210: Use cpufreq_generic_suspend() cpufreq: exynos: Use cpufreq_generic_suspend() cpufreq: Implement cpufreq_generic_suspend() cpufreq: suspend governors on system suspend/hibernate cpufreq: move call to __find_governor() to cpufreq_init_policy() ...	2014-03-20 13:26:12 +01:00
Linus Torvalds	7c1cfacca2	PCI updates for v3.14: Resource management - Revert "Insert GART region into resource map" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJTKdCYAAoJEFmIoMA60/r88lMP/3Njn9KD69FNU+Hphs5ExbQG OtqudAWiJBPAabsvXAJNHh/osCJJcPgewzKGOX9dboEP3eL/bCibT2dKgXIjxRav WULAlINj6x1h3K+/+VlBVsW0W/DcnK7zIQ0O7Qjqh8f0E2v7+nB5/tOOQ6gQeqxH kp7jzBxdTUfVeZtfwY80ccHM2zI/nqm6/p5PobMt3G8vsdamxNGDjHDlOlqARgNs /nP26/XusS9Uw0MI7wwuALgLNLNMbLWMmKBfz1n7Gw3wUvWvQeK+ib67gyTgoVT7 VVDuFn8bg3hMqVwlCrBU8SnS83/dQlSl0fze0gL4UMPH+AGxHnTPCeirOkEhvtDY krVuiber7PAkj5dz0ApRR9nawHSh20Y7WJxLDYrpILo8C3e4pXcGRVsm4d8e3GuP zYwD/6OH6SgMdozc456Jb1qSaXDfLfVffYKcjLS+5bPstP+QIFSUdJdhRrsOKZGn VuIvqXioKeBYe4QqJYGt7cMWdRXTqkrYXOV5N1qPOqBloc83sRdQWtSqHivISma9 rIeyBCAA30SYtr9FF0n/up8Mw9AhNukfcKuouWUuDV/FeMRbMbblvw8Vgnq8eXY9 fLqUMOTPnE0nz5hScHPfTolQ6ATghh/sk6EOkD4gvvfS7+Ul5oiduE+rBOb2eY2Q M6/H1MT1OiNKcRROt0q6 =yDUb -----END PGP SIGNATURE----- Merge tag 'pci-v3.14-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI resource management fix from Bjorn Helgaas: "This is a fix for an AGP regression exposed by `e501b3d87f` ("agp: Support 64-bit APBASE"), which we merged in v3.14-rc1. We've warned about the conflict between the GART and PCI resources and cleared out the PCI resource for a long time, but after `e501b3d87f`, we still use that cleared-out PCI resource. I think the GART resource is incorrect, so this patch removes it" * tag 'pci-v3.14-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: Revert "[PATCH] Insert GART region into resource map"	2014-03-19 16:15:54 -07:00
Viresh Kumar	0b443ead71	cpufreq: remove unused notifier: CPUFREQ_{SUSPENDCHANGE\|RESUMECHANGE} Two cpufreq notifiers CPUFREQ_RESUMECHANGE and CPUFREQ_SUSPENDCHANGE have not been used for some time, so remove them to clean up code a bit. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> [rjw: Changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-19 14:10:24 +01:00
Bjorn Helgaas	707d4eefbd	Revert "[PATCH] Insert GART region into resource map" This reverts commit `56dd669a13`, which makes the GART visible in /proc/iomem. This fixes a regression: `e501b3d87f` ("agp: Support 64-bit APBASE") exposed an existing problem with a conflict between the GART region and a PCI BAR region. The GART addresses are bus addresses, not CPU addresses, and therefore should not be inserted in iomem_resource. On many machines, the GART region is addressable by the CPU as well as by an AGP master, but CPU addressability is not required by the spec. On some of these machines, the GART is mapped by a PCI BAR, and in that case, the PCI core automatically inserts it into iomem_resource, just as it does for all BARs. Inserting it here means we'll have a conflict if the PCI core later tries to claim the GART region, so let's drop the insertion here. The conflict indirectly causes X failures, as reported by Jouni in the bugzilla below. We detected the conflict even before `e501b3d87f`, but after it the AGP code (fix_northbridge()) uses the PCI resource (which is zeroed because of the conflict) instead of reading the BAR again. Conflicts: arch/x86_64/kernel/aperture.c Fixes: `e501b3d87f` agp: Support 64-bit APBASE Link: https://bugzilla.kernel.org/show_bug.cgi?id=72201 Reported-and-tested-by: Jouni Mettälä <jtmettala@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2014-03-18 14:26:12 -06:00
Andy Lutomirski	309944be29	x86, vdso: Zero-pad the VVAR page By coincidence, the VVAR page is at the end of an ELF segment. As a result, if it ends up being a partial page, the kernel loader will leave garbage behind at the end of the vvar page. Zero-pad it to a full page to fix this issue. This has probably been broken since the VVAR page was introduced. On QEMU, if you dump the run-time contents of the VVAR page, you can find entertaining strings from seabios left behind. It's remotely possible that this is a security bug -- conceivably there's some BIOS out there that leaves something sensitive in the few K of memory that is exposed to userspace. Signed-off-by: Stefani Seibold <stefani@seibold.net> Link: http://lkml.kernel.org/r/1395094933-14252-12-git-send-email-stefani@seibold.net Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-03-18 12:52:44 -07:00
Stefani Seibold	7c03156f34	x86, vdso: Add 32 bit VDSO time support for 64 bit kernel This patch add the VDSO time support for the IA32 Emulation Layer. Due the nature of the kernel headers and the LP64 compiler where the size of a long and a pointer differs against a 32 bit compiler, there is some type hacking necessary for optimal performance. The vsyscall_gtod_data struture must be a rearranged to serve 32- and 64-bit code access at the same time: - The seqcount_t was replaced by an unsigned, this makes the vsyscall_gtod_data intedepend of kernel configuration and internal functions. - All kernel internal structures are replaced by fix size elements which works for 32- and 64-bit access - The inner struct clock was removed to pack the whole struct. The "unsigned seq" would be handled by functions derivated from seqcount_t. Signed-off-by: Stefani Seibold <stefani@seibold.net> Link: http://lkml.kernel.org/r/1395094933-14252-11-git-send-email-stefani@seibold.net Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-03-18 12:52:41 -07:00
Stefani Seibold	d2312e3379	x86, vdso: Make vsyscall_gtod_data handling x86 generic This patch move the vsyscall_gtod_data handling out of vsyscall_64.c into an additonal file vsyscall_gtod.c to make the functionality available for x86 32 bit kernel. It also adds a new vsyscall_32.c which setup the VVAR page. Reviewed-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Stefani Seibold <stefani@seibold.net> Link: http://lkml.kernel.org/r/1395094933-14252-2-git-send-email-stefani@seibold.net Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-03-18 12:51:52 -07:00
Linus Torvalds	b44eeb4d47	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Misc smaller fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86: Fix leak in uncore_type_init failure paths perf machine: Use map as success in ip__resolve_ams perf symbols: Fix crash in elf_section_by_name perf trace: Decode architecture-specific signal numbers	2014-03-16 10:41:21 -07:00
Linus Torvalds	a4ecdf82f8	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin: "Two x86 fixes: Suresh's eager FPU fix, and a fix to the NUMA quirk for AMD northbridges. This only includes Suresh's fix patch, not the "mostly a cleanup" patch which had __init issues" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/amd/numa: Fix northbridge quirk to assign correct NUMA node x86, fpu: Check tsk_used_math() in kernel_fpu_end() for eager FPU	2014-03-14 18:07:51 -07:00
Daniel J Blueman	847d7970de	x86/amd/numa: Fix northbridge quirk to assign correct NUMA node For systems with multiple servers and routed fabric, all northbridges get assigned to the first server. Fix this by also using the node reported from the PCI bus. For single-fabric systems, the northbriges are on PCI bus 0 by definition, which are on NUMA node 0 by definition, so this is invarient on most systems. Tested on fam10h and fam15h single and multi-fabric systems and candidate for stable. Signed-off-by: Daniel J Blueman <daniel@numascale.com> Acked-by: Steffen Persvold <sp@numascale.com> Acked-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/1394710981-3596-1-git-send-email-daniel@numascale.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-03-14 11:05:36 +01:00
Stephane Eranian	81827ed8d8	perf/x86/uncore: Fix missing end markers for SNB/IVB/HSW IMC PMU This patch fixes a bug with the SNB/IVB/HSW uncore mmeory controller support. The PCI Ids tables for the memory controller were missing end markers. That could cause random crashes on boot during or after PCI device registration. Signed-off-by: Stephane Erainan <eranian@google.com> Cc: peterz@infradead.org Cc: zheng.z.yan@intel.com Cc: bp@alien8.de Cc: ak@linux.intel.com Link: http://lkml.kernel.org/r/20140313120436.GA14236@quad Signed-off-by: Ingo Molnar <mingo@kernel.org> --	2014-03-14 09:25:25 +01:00
H. Peter Anvin	0b131be8d4	x86, intel: Make MSR_IA32_MISC_ENABLE bit constants systematic Replace somewhat arbitrary constants for bits in MSR_IA32_MISC_ENABLE with verbose but systematic ones. Add _BIT defines for all the rest of them, too. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-03-13 15:55:46 -07:00
Borislav Petkov	c0a639ad0b	x86, Intel: Convert to the new bit access MSR accessors ... and save some lines of code. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1394384725-10796-4-git-send-email-bp@alien8.de Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-03-13 15:35:09 -07:00
Borislav Petkov	8f86a7373a	x86, AMD: Convert to the new bit access MSR accessors ... and save us a bunch of code. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1394384725-10796-3-git-send-email-bp@alien8.de Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-03-13 15:35:03 -07:00
Borislav Petkov	5314feebab	x86, crash: Unify ifdef Merge two back-to-back CONFIG_X86_32 ifdefs into one. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1394633584-5509-3-git-send-email-bp@alien8.de Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-03-13 15:32:44 -07:00
Thomas Gleixner	ffb12cf002	Merge branch 'irq/for-gpio' into irq/core Merge the request/release callbacks which are in a separate branch for consumption by the gpio folks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-03-12 16:01:07 +01:00
Stephane Eranian	4191c29f05	perf/x86/uncore: Fix compilation warning in snb_uncore_imc_init_box() This patch fixes a compilation problem (unused variable) with the new SNB/IVB/HSW uncore IMC code. [ In -v2 we simplify the fix as suggested by Peter Zjilstra. ] Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140311235329.GA28624@quad Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-03-12 10:49:13 +01:00
Suresh Siddha	731bd6a93a	x86, fpu: Check tsk_used_math() in kernel_fpu_end() for eager FPU For non-eager fpu mode, thread's fpu state is allocated during the first fpu usage (in the context of device not available exception). This (math_state_restore()) can be a blocking call and hence we enable interrupts (which were originally disabled when the exception happened), allocate memory and disable interrupts etc. But the eager-fpu mode, call's the same math_state_restore() from kernel_fpu_end(). The assumption being that tsk_used_math() is always set for the eager-fpu mode and thus avoid the code path of enabling interrupts, allocating fpu state using blocking call and disable interrupts etc. But the below issue was noticed by Maarten Baert, Nate Eldredge and few others: If a user process dumps core on an ecrypt fs while aesni-intel is loaded, we get a BUG() in __find_get_block() complaining that it was called with interrupts disabled; then all further accesses to our ecrypt fs hang and we have to reboot. The aesni-intel code (encrypting the core file that we are writing) needs the FPU and quite properly wraps its code in kernel_fpu_{begin,end}(), the latter of which calls math_state_restore(). So after kernel_fpu_end(), interrupts may be disabled, which nobody seems to expect, and they stay that way until we eventually get to __find_get_block() which barfs. For eager fpu, most the time, tsk_used_math() is true. At few instances during thread exit, signal return handling etc, tsk_used_math() might be false. In kernel_fpu_end(), for eager-fpu, call math_state_restore() only if tsk_used_math() is set. Otherwise, don't bother. Kernel code path which cleared tsk_used_math() knows what needs to be done with the fpu state. Reported-by: Maarten Baert <maarten-baert@hotmail.com> Reported-by: Nate Eldredge <nate@thatsmathematics.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Suresh Siddha <sbsiddha@gmail.com> Link: http://lkml.kernel.org/r/1391410583.3801.6.camel@europa Cc: George Spelvin <linux@horizon.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-03-11 12:32:52 -07:00
Dave Jones	09df7c4c80	x86: Remove CONFIG_X86_OOSTORE This was an optimization that made memcpy type benchmarks a little faster on ancient (Circa 1998) IDT Winchip CPUs. In real-life workloads, it wasn't even noticable, and I doubt anyone is running benchmarks on 16 year old silicon any more. Given this code has likely seen very little use over the last decade, let's just remove it. Signed-off-by: Dave Jones <davej@fedoraproject.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-03-11 10:16:18 -07:00
Jan Kiszka	ea7bdc65bc	x86/apic: Plug racy xAPIC access of CPU hotplug code apic_icr_write() and its users in smpboot.c were apparently written under the assumption that this code would only run during early boot. But nowadays we also execute it when onlining a CPU later on while the system is fully running. That will make wakeup_cpu_via_init_nmi and, thus, also native_apic_icr_write run in plain process context. If we migrate the caller to a different CPU at the wrong time or interrupt it and write to ICR/ICR2 to send unrelated IPIs, we can end up sending INIT, SIPI or NMIs to wrong CPUs. Fix this by disabling interrupts during the write to the ICR halves and disable preemption around waiting for ICR availability and using it. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Tested-By: Igor Mammedov <imammedo@redhat.com> Link: http://lkml.kernel.org/r/52E6AFFE.3030004@siemens.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-03-11 12:03:31 +01:00
Steven Rostedt	7743a536be	i386: Remove unneeded test of 'task' in dump_trace() (again) Commit `028a690a1e` "i386: Remove unneeded test of 'task' in dump_trace()" correctly removed the unneeded 'task != NULL' check because it would be set to current if it was NULL. Commit `2bc5f927d4` "i386: split out dumpstack code from traps_32.c" moved the code from traps_32.c to its own file dump_stack.c for preparation of the i386 / x86_64 merge. Commit `8a541665b9` "dumpstack: x86: various small unification steps" worked to make i386 and x86_64 dump_stack logic similar. But this actually reverted the correct change from `028a690a1e`. Commit `d0caf29250` "x86/dumpstack: Remove unneeded check in dump_trace()" removed the unneeded "task != NULL" check for x86_64 but left that same unneeded check for i386, that was added because x86_64 had it! This chain of events ironically had i386 add back the unneeded task != NULL check because x86_64 did it, and then the fix for x86_64 was fixed by Dan. And even more ironically, it was Dan's smatch bot that told me that a change to dump_stack_32 I made may be wrong if current can be NULL (it can't), as there was a check for it by assigning task to current, and then checking if task is NULL. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Alexander van Heukelum <heukelum@fastmail.fm> Cc: Jesper Juhl <jesper.juhl@gmail.com> Link: http://lkml.kernel.org/r/20140307105242.79a0befd@gandalf.local.home Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-03-11 12:02:31 +01:00
Dave Jones	b7b4839d93	perf/x86: Fix leak in uncore_type_init failure paths The error path of uncore_type_init() frees up any allocations that were made along the way, but it relies upon type->pmus being set, which only happens if the function succeeds. As type->pmus remains null in this case, the call to uncore_type_exit will do nothing. Moving the assignment earlier will allow us to actually free those allocations should something go awry. Signed-off-by: Dave Jones <davej@fedoraproject.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140306172028.GA552@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-03-11 11:59:34 +01:00
Dongsheng Yang	ef11dadb83	perf/x86/uncore: Add __init for uncore_cpumask_init() Commit: `411cf180fa` perf/x86/uncore: fix initialization of cpumask introduced the function uncore_cpumask_init(), which is only called in __init intel_uncore_init(). But it is not marked with __init, which produces the following warning: WARNING: vmlinux.o(.text+0x2464a): Section mismatch in reference from the function uncore_cpumask_init() to the function .init.text:uncore_cpu_setup() The function uncore_cpumask_init() references the function __init uncore_cpu_setup(). This is often because uncore_cpumask_init lacks a __init annotation or the annotation of uncore_cpu_setup is wrong. This patch marks uncore_cpumask_init() with __init. Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com> Acked-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: http://lkml.kernel.org/r/1394013516-4964-1-git-send-email-yangds.fnst@cn.fujitsu.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-03-11 11:57:56 +01:00
Ingo Molnar	0066f3b93e	Merge branch 'perf/urgent' into perf/core Merge the latest fixes. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-03-11 11:53:50 +01:00
Ingo Molnar	a02ed5e3e0	Merge branch 'sched/urgent' into sched/core Pick up fixes before queueing up new changes. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-03-11 11:34:27 +01:00
Mathias Krause	6cce16f99d	x86, threadinfo: Redo "x86: Use inline assembler to get sp" This patch restores the changes of commit `dff38e3e93` "x86: Use inline assembler instead of global register variable to get sp". They got lost in commit `198d208df4` "x86: Keep thread_info on thread stack in x86_32" while moving the code to arch/x86/kernel/irq_32.c. Quoting Andi from commit `dff38e3e93`: """ LTO in gcc 4.6/47. has trouble with global register variables. They were used to read the stack pointer. Use a simple inline assembler statement with a mov instead. This also helps LLVM/clang, which does not support global register variables. """ Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Andi Kleen <ak@linux.intel.com> Signed-off-by: Mathias Krause <minipli@googlemail.com> Link: http://lkml.kernel.org/r/1394178752-18047-1-git-send-email-minipli@googlemail.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-03-10 17:32:01 -07:00
Linus Torvalds	b01d4e6893	x86: fix compile error due to X86_TRAP_NMI use in asm files It's an enum, not a #define, you can't use it in asm files. Introduced in commit `5fa10196bd` ("x86: Ignore NMIs that come in during early boot"), and sadly I didn't compile-test things like I should have before pushing out. My weak excuse is that the x86 tree generally doesn't introduce stupid things like this (and the ARM pull afterwards doesn't cause me to do a compile-test either, since I don't cross-compile). Cc: Don Zickus <dzickus@redhat.com> Cc: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-03-07 18:58:40 -08:00
H. Peter Anvin	5fa10196bd	x86: Ignore NMIs that come in during early boot Don Zickus reports: A customer generated an external NMI using their iLO to test kdump worked. Unfortunately, the machine hung. Disabling the nmi_watchdog made things work. I speculated the external NMI fired, caused the machine to panic (as expected) and the perf NMI from the watchdog came in and was latched. My guess was this somehow caused the hang. ---- It appears that the latched NMI stays latched until the early page table generation on 64 bits, which causes exceptions to happen which end in IRET, which re-enable NMI. Therefore, ignore NMIs that come in during early execution, until we have proper exception handling. Reported-and-tested-by: Don Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/r/1394221143-29713-1-git-send-email-dzickus@redhat.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org> # v3.5+, older with some backport effort	2014-03-07 15:08:14 -08:00
Petr Mladek	7f11f5ecf4	ftrace/x86: BUG when ftrace recovery fails Ftrace modifies function calls using Int3 breakpoints on x86. The breakpoints are handled only when the patching is in progress. If something goes wrong, there is a recovery code that removes the breakpoints. If this fails, the system might get silently rebooted when a remaining break is not handled or an invalid instruction is proceed. We should BUG() when the breakpoint could not be removed. Otherwise, the system silently crashes when the function finishes the Int3 handler is disabled. Note that we need to modify remove_breakpoint() to return non-zero value only when there is an error. The return value was ignored before, so it does not cause any troubles. Link: http://lkml.kernel.org/r/1393258342-29978-4-git-send-email-pmladek@suse.cz Signed-off-by: Petr Mladek <pmladek@suse.cz> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-03-07 10:06:16 -05:00
Jiri Slaby	3a36cb11ca	ftrace: Do not pass data to ftrace_dyn_arch_init As the data parameter is not really used by any ftrace_dyn_arch_init, remove that from ftrace_dyn_arch_init. This also removes the addr local variable from ftrace_init which is now unused. Note the documentation was imprecise as it did not suggest to set (*data) to 0. Link: http://lkml.kernel.org/r/1393268401-24379-4-git-send-email-jslaby@suse.cz Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-arch@vger.kernel.org Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-03-07 10:06:14 -05:00
Jiri Slaby	af64a7cb09	ftrace: Pass retval through return in ftrace_dyn_arch_init() No architecture uses the "data" parameter in ftrace_dyn_arch_init() in any way, it just sets the value to 0. And this is used as a return value in the caller -- ftrace_init, which just checks the retval against zero. Note there is also "return 0" in every ftrace_dyn_arch_init. So it is enough to check the retval and remove all the indirect sets of data on all archs. Link: http://lkml.kernel.org/r/1393268401-24379-3-git-send-email-jslaby@suse.cz Cc: linux-arch@vger.kernel.org Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-03-07 10:06:13 -05:00
Steven Rostedt (Red Hat)	92550405c4	ftrace/x86: Have ftrace_write() return -EPERM and clean up callers Having ftrace_write() return -EPERM on failure, as that's what the callers return, then we can clean up the code a bit. That is, instead of: if (ftrace_write(...)) return -EPERM; return 0; or if (ftrace_write(...)) { ret = -EPERM; goto_out; } We can instead have: return ftrace_write(...); or ret = ftrace_write(...); if (ret) goto out; Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-03-07 10:05:50 -05:00
Steven Rostedt	2223f6f6ee	x86: Clean up dumpstack_64.c code The dump_trace() function in dumpstack_64.c is hard to follow. The test for exception stack is processed differently than the test for irq stack, and the normal stack is outside completely. By restructuring this code to have all the stacks determined by a single function that returns an enum of the following: STACK_IS_NORMAL STACK_IS_EXCEPTION STACK_IS_IRQ STACK_IS_UNKNOWN and has the logic of each within a switch statement. This should make the code much easier to read and understand. Link: http://lkml.kernel.org/r/20110806012354.684598995@goodmis.org Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Brian Gerst <brgerst@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20140206144322.086050042@goodmis.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-03-06 16:56:55 -08:00
Steven Rostedt	198d208df4	x86: Keep thread_info on thread stack in x86_32 x86_64 uses a per_cpu variable kernel_stack to always point to the thread stack of current. This is where the thread_info is stored and is accessed from this location even when the irq or exception stack is in use. This removes the complexity of having to maintain the thread info on the stack when interrupts are running and having to copy the preempt_count and other fields to the interrupt stack. x86_32 uses the old method of copying the thread_info from the thread stack to the exception stack just before executing the exception. Having the two different requires #ifdefs and also the x86_32 way is a bit of a pain to maintain. By converting x86_32 to the same method of x86_64, we can remove #ifdefs, clean up the x86_32 code a little, and remove the overhead of the copy. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Brian Gerst <brgerst@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20110806012354.263834829@goodmis.org Link: http://lkml.kernel.org/r/20140206144321.852942014@goodmis.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-03-06 16:56:55 -08:00
Steven Rostedt	0788aa6a23	x86: Prepare removal of previous_esp from i386 thread_info structure The i386 thread_info contains a previous_esp field that is used to daisy chain the different stacks for dump_stack() (ie. irq, softirq, thread stacks). The goal is to eventual make i386 handling of thread_info the same as x86_64, which means that the thread_info will not be in the stack but as a per_cpu variable. We will no longer depend on thread_info being able to daisy chain different stacks as it will only exist in one location (the thread stack). By moving previous_esp to the end of thread_info and referencing it as an offset instead of using a thread_info field, this becomes a stepping stone to moving the thread_info. The offset to get to the previous stack is rather ugly in this patch, but this is only temporary and the prev_esp will be changed in the next commit. This commit is more for sanity checks of the change. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Robert Richter <rric@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20110806012353.891757693@goodmis.org Link: http://lkml.kernel.org/r/20140206144321.608754481@goodmis.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-03-06 16:56:54 -08:00
H. Peter Anvin	fb3bd7b19b	x86, reboot: Only use CF9_COND automatically, not CF9 Only CF9_COND is appropriate for inclusion in the default chain, not CF9; the latter will poke that register unconditionally, whereas CF9_COND will at least look for PCI configuration method #1 or #2 first (a weak check, but better than nothing.) CF9 should be used for explicit system configuration (command line or DMI) only. Cc: Aubrey Li <aubrey.li@intel.com> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Link: http://lkml.kernel.org/r/53130A46.1010801@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-03-05 15:41:15 -08:00
Li, Aubrey	a4f1987e4c	x86, reboot: Add EFI and CF9 reboot methods into the default list Reboot is the last service linux OS provides to the end user. We are supposed to make this function more robust than today. This patch adds all of the known reboot methods into the default attempt list. The machines requiring reboot=efi or reboot=p or reboot=bios get a chance to reboot automatically now. If there is a new reboot method emerged, we are supposed to add it to the default list as well, instead of adding the endless dmidecode entry. If one method required is in the default list in this patch but the machine reboot still hangs, that means some methods ahead of the required method cause the system hangs, then reboot the machine by passing reboot= arguments and submit the reboot dmidecode table quirk. We are supposed to remove the reboot dmidecode table from the kernel, but to be safe, we keep it. This patch prevents us from adding more. If you happened to have a machine listed in the reboot dmidecode table and this patch makes reboot work on your machine, please submit a patch to remove the quirk. The default reboot order with this patch is now: ACPI > KBD > ACPI > KBD > EFI > CF9_COND > BIOS Because BIOS and TRIPLE are mutually exclusive (either will either work or hang the machine) that method is not included. [ hpa: as with any changes to the reboot order, this patch will have to be monitored carefully for regressions. ] Signed-off-by: Aubrey Li <aubrey.li@intel.com> Acked-by: Matthew Garrett <mjg59@srcf.ucam.org> Link: http://lkml.kernel.org/r/53130A46.1010801@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-03-05 15:27:07 -08:00
Matt Fleming	4fd69331ad	Merge remote-tracking branch 'tip/x86/urgent' into efi-for-mingo Conflicts: arch/x86/include/asm/efi.h	2014-03-05 17:31:41 +00:00
Thomas Gleixner	76d388cd72	x86: hyperv: Fixup the (brain) damage caused by the irq cleanup Compiling last minute changes without setting the proper config options is not really clever. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-03-05 13:42:14 +01:00
H. Peter Anvin	3c0b566334	* Disable the new EFI 1:1 virtual mapping for SGI UV because using it causes a crash during boot - Borislav Petkov -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJTFmV4AAoJEC84WcCNIz1VeSgP/1gykrBiH3Vr4H4c32la/arZ ktXRAT+RHdiebEXopt6+A1Pv6iyRYz3fBB1cKKb7q8fhDmVVmefGVkyO4qg4NbgL Ic1dP6uPgiFidcYML9/c6+UIovbwDD7f5wwJEjXxoZKg7b7P0TIykd8z8YPQ/6A6 fIY5Z2L9A8eVt4k1m6Dg2PUJ2B+XKOLa5BbL7gXB6u9avgAVAoLM9oQStss+V9Pv JQu+BZxeEwuxgi0LOxk1sFWXaAoRtgVDNH6nPK93CRvO2H83voWFf+OcW9hQWsF5 tLam6aMkvjuM5Dv1IA69PBAWrE/jxcMvjEdJyrGMWRKWtH/iTN4ez4EoFiO38o4R IgzXh9L5xbP3o+g3rntIu4h3/5yde9TRM18mER0lLTdNFZ8QzmLt4L0TptntRoBv bTXffILACq0uQU6T10P+EwseT472HphMeswaWVDkRxkN2hTCipFqQX0ekb0qbw3O yQeRyz1/t7DeA77iAGG96SfeziMdr44d6u6zDQPrTAJV+H1ZZ3XNIlJDi01CAg3b PDT6nHb/9V0tPZQebntZnczVRP+5EBdn5RbJaAMldEwVgjdjlFNY5slsF01KeCSF 6Cx6UyAI9XKuZMoayC3QDHKKWi+BTOwbKcVAdtZ1c9AMLt9pyxsDfdoOAV4zBiQa PsVs9t/l7TZoL1MQHlEJ =YIab -----END PGP SIGNATURE----- Merge tag 'efi-urgent' into x86/urgent * Disable the new EFI 1:1 virtual mapping for SGI UV because using it causes a crash during boot - Borislav Petkov Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-03-04 15:50:06 -08:00
Borislav Petkov	a5d90c923b	x86/efi: Quirk out SGI UV Alex reported hitting the following BUG after the EFI 1:1 virtual mapping work was merged, kernel BUG at arch/x86/mm/init_64.c:351! invalid opcode: 0000 [#1] SMP Call Trace: [<ffffffff818aa71d>] init_extra_mapping_uc+0x13/0x15 [<ffffffff818a5e20>] uv_system_init+0x22b/0x124b [<ffffffff8108b886>] ? clockevents_register_device+0x138/0x13d [<ffffffff81028dbb>] ? setup_APIC_timer+0xc5/0xc7 [<ffffffff8108b620>] ? clockevent_delta2ns+0xb/0xd [<ffffffff818a3a92>] ? setup_boot_APIC_clock+0x4a8/0x4b7 [<ffffffff8153d955>] ? printk+0x72/0x74 [<ffffffff818a1757>] native_smp_prepare_cpus+0x389/0x3d6 [<ffffffff818957bc>] kernel_init_freeable+0xb7/0x1fb [<ffffffff81535530>] ? rest_init+0x74/0x74 [<ffffffff81535539>] kernel_init+0x9/0xff [<ffffffff81541dfc>] ret_from_fork+0x7c/0xb0 [<ffffffff81535530>] ? rest_init+0x74/0x74 Getting this thing to work with the new mapping scheme would need more work, so automatically switch to the old memmap layout for SGI UV. Acked-by: Russ Anderson <rja@sgi.com> Cc: Alex Thorlton <athorlton@sgi.com Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Matt Fleming <matt.fleming@intel.com>	2014-03-04 23:43:33 +00:00
Thomas Gleixner	13b5be56d1	x86: hyperv: Fix brown paperbag typos reported by Fenguangs build robot Reported-by: fengguang.wu@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: linuxdrivers <devel@linuxdriverproject.org> Cc: x86 <x86@kernel.org>	2014-03-04 23:53:33 +01:00
Thomas Gleixner	3c433679ab	x86: hyperv: Make it build with CONFIG_HYPERV=m again Commit `1aec16967` (x86: Hyperv: Cleanup the irq mess) removed the ability to build the hyperv stuff as a module. Bring it back. Reported-by: fengguang.wu@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: linuxdrivers <devel@linuxdriverproject.org> Cc: x86 <x86@kernel.org>	2014-03-04 23:41:44 +01:00
Michael Opdenacker	d20d2efbf2	x86: Remove deprecated IRQF_DISABLED This patch removes the IRQF_DISABLED flag from x86 architecture code. It's a NOOP since 2.6.35 and it will be removed one day. Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com> Cc: venki@google.com Link: http://lkml.kernel.org/r/1393965305-17248-1-git-send-email-michael.opdenacker@free-electrons.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-03-04 21:47:51 +01:00
Thomas Gleixner	1aec169673	x86: Hyperv: Cleanup the irq mess The vmbus/hyperv interrupt handling is another complete trainwreck and probably the worst of all currently in tree. If CONFIG_HYPERV=y then the interrupt delivery to the vmbus happens via the direct HYPERVISOR_CALLBACK_VECTOR. So far so good, but: The driver requests first a normal device interrupt. The only reason to do so is to increment the interrupt stats of that device interrupt. For no reason it also installs a private flow handler. We have proper accounting mechanisms for direct vectors, but of course it's too much effort to add that 5 lines of code. Aside of that the alloc_intr_gate() is not protected against reallocation which makes module reload impossible. Solution to the problem is simple to rip out the whole mess and implement it correctly. First of all move all that code to arch/x86/kernel/cpu/mshyperv.c and merily install the HYPERVISOR_CALLBACK_VECTOR with proper reallocation protection and use the proper direct vector accounting mechanism. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: K. Y. Srinivasan <kys@microsoft.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: linuxdrivers <devel@linuxdriverproject.org> Cc: x86 <x86@kernel.org> Link: http://lkml.kernel.org/r/20140223212739.028307673@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-03-04 17:37:54 +01:00
Thomas Gleixner	929320e4b4	x86: Add proper vector accounting for HYPERVISOR_CALLBACK_VECTOR HyperV abuses a device interrupt to account for the HYPERVISOR_CALLBACK_VECTOR. Provide proper accounting as we have for the other vectors as well. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: x86 <x86@kernel.org> Link: http://lkml.kernel.org/r/20140223212738.681855582@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-03-04 17:37:54 +01:00
Matt Fleming	3e90959921	efi: Move facility flags to struct efi As we grow support for more EFI architectures they're going to want the ability to query which EFI features are available on the running system. Instead of storing this information in an architecture-specific place, stick it in the global 'struct efi', which is already the central location for EFI state. While we're at it, let's change the return value of efi_enabled() to be bool and replace all references to 'facility' with 'feature', which is the usual word used to describe the attributes of the running system. Signed-off-by: Matt Fleming <matt.fleming@intel.com>	2014-03-04 16:16:16 +00:00
Paolo Bonzini	1c2af4968e	Merge tag 'kvm-for-3.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into kvm-next	2014-03-04 15:58:00 +01:00
Petr Mladek	12729f14d8	ftrace/x86: One more missing sync after fixup of function modification failure If a failure occurs while modifying ftrace function, it bails out and will remove the tracepoints to be back to what the code originally was. There is missing the final sync run across the CPUs after the fix up is done and before the ftrace int3 handler flag is reset. Here's the description of the problem: CPU0 CPU1 ---- ---- remove_breakpoint(); modifying_ftrace_code = 0; [still sees breakpoint] <takes trap> [sees modifying_ftrace_code as zero] [no breakpoint handler] [goto failed case] [trap exception - kernel breakpoint, no handler] BUG() Link: http://lkml.kernel.org/r/1393258342-29978-2-git-send-email-pmladek@suse.cz Fixes: `8a4d0a687a` "ftrace: Use breakpoint method to update ftrace caller" Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Petr Mladek <pmladek@suse.cz> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-03-03 21:23:07 -05:00
Steven Rostedt (Red Hat)	c932c6b7c9	ftrace/x86: Run a sync after fixup on failure If a failure occurs while enabling a trace, it bails out and will remove the tracepoints to be back to what the code originally was. But the fix up had some bugs in it. By injecting a failure in the code, the fix up ran to completion, but shortly afterward the system rebooted. There was two bugs here. The first was that there was no final sync run across the CPUs after the fix up was done, and before the ftrace int3 handler flag was reset. That means that other CPUs could still see the breakpoint and trigger on it long after the flag was cleared, and the int3 handler would think it was a spurious interrupt. Worse yet, the int3 handler could hit other breakpoints because the ftrace int3 handler flag would have prevented the int3 handler from going further. Here's a description of the issue: CPU0 CPU1 ---- ---- remove_breakpoint(); modifying_ftrace_code = 0; [still sees breakpoint] <takes trap> [sees modifying_ftrace_code as zero] [no breakpoint handler] [goto failed case] [trap exception - kernel breakpoint, no handler] BUG() The second bug was that the removal of the breakpoints required the "within()" logic updates instead of accessing the ip address directly. As the kernel text is mapped read-only when CONFIG_DEBUG_RODATA is set, and the removal of the breakpoint is a modification of the kernel text. The ftrace_write() includes the "within()" logic, where as, the probe_kernel_write() does not. This prevented the breakpoint from being removed at all. Link: http://lkml.kernel.org/r/1392650573-3390-1-git-send-email-pmladek@suse.cz Reported-by: Petr Mladek <pmladek@suse.cz> Tested-by: Petr Mladek <pmladek@suse.cz> Acked-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-03-03 21:23:06 -05:00
Greg Kroah-Hartman	13df797743	Merge 3.14-rc5 into driver-core-next We want the fixes in here.	2014-03-02 20:09:08 -08:00
Linus Torvalds	3154da34be	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Misc fixes, most of them on the tooling side" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf tools: Fix strict alias issue for find_first_bit perf tools: fix BFD detection on opensuse perf: Fix hotplug splat perf/x86: Fix event scheduling perf symbols: Destroy unused symsrcs perf annotate: Check availability of annotate when processing samples	2014-03-02 11:37:07 -06:00
Aravind Gopalakrishnan	85a8885bd0	amd64_edac: Add support for newer F16h models Extend ECC decoding support for F16h M30h. Tested on F16h M30h with ECC turned on using mce_amd_inj module and the patch works fine. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Link: http://lkml.kernel.org/r/1392913726-16961-1-git-send-email-Aravind.Gopalakrishnan@amd.com Tested-by: Arindam Nath <Arindam.Nath@amd.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Borislav Petkov <bp@suse.de>	2014-02-27 18:03:16 +01:00
H. Peter Anvin	da4aaa7d86	x86, cpufeature: If we disable CLFLUSH, we should disable CLFLUSHOPT If we explicitly disable the use of CLFLUSH, we should disable the use of CLFLUSHOPT as well. Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/n/tip-jtdv7btppr4jgzxm3sxx1e74@git.kernel.org	2014-02-27 08:36:31 -08:00
H. Peter Anvin	840d2830e6	x86, cpufeature: Rename X86_FEATURE_CLFLSH to X86_FEATURE_CLFLUSH We call this "clflush" in /proc/cpuinfo, and have cpu_has_clflush()... let's be consistent and just call it that. Cc: Gleb Natapov <gleb@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Alan Cox <alan@linux.intel.com> Link: http://lkml.kernel.org/n/tip-mlytfzjkvuf739okyn40p8a5@git.kernel.org	2014-02-27 08:31:30 -08:00
H. Peter Anvin	b5660ba76b	x86, platforms: Remove NUMAQ The NUMAQ support seems to be unmaintained, remove it. Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: David Rientjes <rientjes@google.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/n/530CFD6C.7040705@zytor.com	2014-02-27 08:07:39 -08:00
H. Peter Anvin	c5f9ee3d66	x86, platforms: Remove SGI Visual Workstation The SGI Visual Workstation seems to be dead; remove support so we don't have to continue maintaining it. Cc: Andrey Panin <pazke@donpac.ru> Cc: Michael Reed <mdr@sgi.com> Link: http://lkml.kernel.org/r/530CFD6C.7040705@zytor.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-02-27 08:07:39 -08:00
Peter Zijlstra	c347a2f179	perf/x86: Add a few more comments Add a few comments on the ->add(), ->del() and ->*_txn() implementation. Requested-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-he3819318c245j7t5e1e22tr@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-27 12:43:25 +01:00
Ingo Molnar	ff5a7088f0	Merge branch 'perf/urgent' into perf/core Merge the latest fixes before queueing up new changes. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-27 12:41:17 +01:00
Peter Zijlstra	26e61e8939	perf/x86: Fix event scheduling Vince "Super Tester" Weaver reported a new round of syscall fuzzing (Trinity) failures, with perf WARN_ON()s triggering. He also provided traces of the failures. This is I think the relevant bit: > pec_1076_warn-2804 [000] d... 147.926153: x86_pmu_disable: x86_pmu_disable > pec_1076_warn-2804 [000] d... 147.926153: x86_pmu_state: Events: { > pec_1076_warn-2804 [000] d... 147.926156: x86_pmu_state: 0: state: .R config: ffffffffffffffff ( (null)) > pec_1076_warn-2804 [000] d... 147.926158: x86_pmu_state: 33: state: AR config: 0 (ffff88011ac99800) > pec_1076_warn-2804 [000] d... 147.926159: x86_pmu_state: } > pec_1076_warn-2804 [000] d... 147.926160: x86_pmu_state: n_events: 1, n_added: 0, n_txn: 1 > pec_1076_warn-2804 [000] d... 147.926161: x86_pmu_state: Assignment: { > pec_1076_warn-2804 [000] d... 147.926162: x86_pmu_state: 0->33 tag: 1 config: 0 (ffff88011ac99800) > pec_1076_warn-2804 [000] d... 147.926163: x86_pmu_state: } > pec_1076_warn-2804 [000] d... 147.926166: collect_events: Adding event: 1 (ffff880119ec8800) So we add the insn:p event (fd[23]). At this point we should have: n_events = 2, n_added = 1, n_txn = 1 > pec_1076_warn-2804 [000] d... 147.926170: collect_events: Adding event: 0 (ffff8800c9e01800) > pec_1076_warn-2804 [000] d... 147.926172: collect_events: Adding event: 4 (ffff8800cbab2c00) We try and add the {BP,cycles,br_insn} group (fd[3], fd[4], fd[15]). These events are 0:cycles and 4:br_insn, the BP event isn't x86_pmu so that's not visible. group_sched_in() pmu->start_txn() /* nop - BP pmu */ event_sched_in() event->pmu->add() So here we should end up with: 0: n_events = 3, n_added = 2, n_txn = 2 4: n_events = 4, n_added = 3, n_txn = 3 But seeing the below state on x86_pmu_enable(), the must have failed, because the 0 and 4 events aren't there anymore. Looking at group_sched_in(), since the BP is the leader, its event_sched_in() must have succeeded, for otherwise we would not have seen the sibling adds. But since neither 0 or 4 are in the below state; their event_sched_in() must have failed; but I don't see why, the complete state: 0,0,1:p,4 fits perfectly fine on a core2. However, since we try and schedule 4 it means the 0 event must have succeeded! Therefore the 4 event must have failed, its failure will have put group_sched_in() into the fail path, which will call: event_sched_out() event->pmu->del() on 0 and the BP event. Now x86_pmu_del() will reduce n_events; but it will not reduce n_added; giving what we see below: n_event = 2, n_added = 2, n_txn = 2 > pec_1076_warn-2804 [000] d... 147.926177: x86_pmu_enable: x86_pmu_enable > pec_1076_warn-2804 [000] d... 147.926177: x86_pmu_state: Events: { > pec_1076_warn-2804 [000] d... 147.926179: x86_pmu_state: 0: state: .R config: ffffffffffffffff ( (null)) > pec_1076_warn-2804 [000] d... 147.926181: x86_pmu_state: 33: state: AR config: 0 (ffff88011ac99800) > pec_1076_warn-2804 [000] d... 147.926182: x86_pmu_state: } > pec_1076_warn-2804 [000] d... 147.926184: x86_pmu_state: n_events: 2, n_added: 2, n_txn: 2 > pec_1076_warn-2804 [000] d... 147.926184: x86_pmu_state: Assignment: { > pec_1076_warn-2804 [000] d... 147.926186: x86_pmu_state: 0->33 tag: 1 config: 0 (ffff88011ac99800) > pec_1076_warn-2804 [000] d... 147.926188: x86_pmu_state: 1->0 tag: 1 config: 1 (ffff880119ec8800) > pec_1076_warn-2804 [000] d... 147.926188: x86_pmu_state: } > pec_1076_warn-2804 [000] d... 147.926190: x86_pmu_enable: S0: hwc->idx: 33, hwc->last_cpu: 0, hwc->last_tag: 1 hwc->state: 0 So the problem is that x86_pmu_del(), when called from a group_sched_in() that fails (for whatever reason), and without x86_pmu TXN support (because the leader is !x86_pmu), will corrupt the n_added state. Reported-and-Tested-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Stephane Eranian <eranian@google.com> Cc: Dave Jones <davej@redhat.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20140221150312.GF3104@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-27 12:38:02 +01:00
Rafael J. Wysocki	38953d3945	Merge back earlier 'acpi-processor' material.	2014-02-27 00:22:42 +01:00
Kees Cook	e2b32e6785	x86, kaslr: randomize module base load address Randomize the load address of modules in the kernel to make kASLR effective for modules. Modules can only be loaded within a particular range of virtual address space. This patch adds 10 bits of entropy to the load address by adding 1-1024 * PAGE_SIZE to the beginning range where modules are loaded. The single base offset was chosen because randomizing each module load ends up wasting/fragmenting memory too much. Prior approaches to minimizing fragmentation while doing randomization tend to result in worse entropy than just doing a single base address offset. Example kASLR boot without this change, with a single module loaded: ---[ Modules ]--- 0xffffffffc0000000-0xffffffffc0001000 4K ro GLB x pte 0xffffffffc0001000-0xffffffffc0002000 4K ro GLB NX pte 0xffffffffc0002000-0xffffffffc0004000 8K RW GLB NX pte 0xffffffffc0004000-0xffffffffc0200000 2032K pte 0xffffffffc0200000-0xffffffffff000000 1006M pmd ---[ End Modules ]--- Example kASLR boot after this change, same module loaded: ---[ Modules ]--- 0xffffffffc0000000-0xffffffffc0200000 2M pmd 0xffffffffc0200000-0xffffffffc03bf000 1788K pte 0xffffffffc03bf000-0xffffffffc03c0000 4K ro GLB x pte 0xffffffffc03c0000-0xffffffffc03c1000 4K ro GLB NX pte 0xffffffffc03c1000-0xffffffffc03c3000 8K RW GLB NX pte 0xffffffffc03c3000-0xffffffffc0400000 244K pte 0xffffffffc0400000-0xffffffffff000000 1004M pmd ---[ End Modules ]--- Signed-off-by: Andy Honig <ahonig@google.com> Link: http://lkml.kernel.org/r/20140226005916.GA27083@www.outflux.net Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-02-25 17:07:26 -08:00
Eugene Surovegin	b6085a8657	x86, kaslr: export offset in VMCOREINFO ELF notes Include kASLR offset in VMCOREINFO ELF notes to assist in debugging. [ hpa: pushing this for v3.14 to avoid having a kernel version with kASLR where we can't debug output. ] Signed-off-by: Eugene Surovegin <surovegin@google.com> Link: http://lkml.kernel.org/r/20140123173120.GA25474@www.outflux.net Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-02-25 16:57:47 -08:00
Linus Torvalds	208937fdcf	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: - a bugfix which prevents a divide by 0 panic when the newly introduced try_msr_calibrate_tsc() fails - enablement of the Baytrail platform to utilize the newfangled msr based calibration * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: tsc: Add missing Baytrail frequency to the table x86, tsc: Fallback to normal calibration if fast MSR calibration fails	2014-02-23 14:15:46 -08:00
Linus Torvalds	9b3e7c9b9a	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Misc fixlets from all around the place" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/uncore: Fix IVT/SNB-EP uncore CBOX NID filter table perf/x86: Correctly use FEATURE_PDCM perf, nmi: Fix unknown NMI warning perf trace: Fix ioctl 'request' beautifier build problems on !(i386 \|\| x86_64) arches perf trace: Add fallback definition of EFD_SEMAPHORE perf list: Fix checking for supported events on older kernels perf tools: Handle PERF_RECORD_HEADER_EVENT_TYPE properly perf probe: Do not add offset twice to uprobe address perf/x86: Fix Userspace RDPMC switch perf/x86/intel/p6: Add userspace RDPMC quirk for PPro	2014-02-22 12:11:54 -08:00
Fernando Luis Vázquez Cao	0d75de4a65	kvm: remove redundant registration of BSP's hv_clock area These days hv_clock allocation is memblock based (i.e. the percpu allocator is not involved), which means that the physical address of each of the per-cpu hv_clock areas is guaranteed to remain unchanged through all its lifetime and we do not need to update its location after CPU bring-up. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-02-22 15:53:32 +01:00
Stephane Eranian	337397f3af	perf/x86/uncore: Fix IVT/SNB-EP uncore CBOX NID filter table This patch updates the CBOX PMU filters mapping tables for SNB-EP and IVT (model 45 and 62 respectively). The NID umask always comes in addition to another umask. When set, the NID filter is applied. The current mapping tables were missing some code/umask combinations to account for the NID umask. This patch fixes that. Cc: mingo@elte.hu Cc: ak@linux.intel.com Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140219131018.GA24475@quad Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-02-21 22:09:01 +01:00
Peter Zijlstra	c9b08884c9	perf/x86: Correctly use FEATURE_PDCM The current code simply assumes Intel Arch PerfMon v2+ to have the IA32_PERF_CAPABILITIES MSR; the SDM specifies that we should check CPUID[1].ECX[15] (aka, FEATURE_PDCM) instead. This was found by KVM which implements v2+ but didn't provide the capabilities MSR. Change the code to DTRT; KVM will also implement the MSR and return 0. Cc: pbonzini@redhat.com Reported-by: "Michael S. Tsirkin" <mst@redhat.com> Suggested-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140203132903.GI8874@twins.programming.kicks-ass.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-02-21 22:09:01 +01:00
Markus Metzger	a3ef2229c9	perf, nmi: Fix unknown NMI warning When using BTS on Core i7-4*, I get the below kernel warning. $ perf record -c 1 -e branches:u ls Message from syslogd@labpc1501 at Nov 11 15:49:25 ... kernel:[ 438.317893] Uhhuh. NMI received for unknown reason 31 on CPU 2. Message from syslogd@labpc1501 at Nov 11 15:49:25 ... kernel:[ 438.317920] Do you have a strange power saving mode enabled? Message from syslogd@labpc1501 at Nov 11 15:49:25 ... kernel:[ 438.317945] Dazed and confused, but trying to continue Make intel_pmu_handle_irq() take the full exit path when returning early. Cc: eranian@google.com Cc: peterz@infradead.org Cc: mingo@kernel.org Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1392425048-5309-1-git-send-email-andi@firstfloor.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-02-21 22:09:01 +01:00
Stephane Eranian	e9d9768824	perf/x86/uncore: use MiB unit for events for SNB/IVB/HSW IMC This patch makes perf use Mebibytes to display the counts of uncore_imc/data_reads/ and uncore_imc/data_writes. 1MiB = 1024*1024 bytes. Cc: mingo@elte.hu Cc: acme@redhat.com Cc: ak@linux.intel.com Cc: zheng.z.yan@intel.com Cc: peterz@infradead.org Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1392132015-14521-9-git-send-email-eranian@google.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-02-21 21:49:08 +01:00
Stephane Eranian	ced2efb099	perf/x86/uncore: add hrtimer to SNB uncore IMC PMU This patch is needed because that PMU uses 32-bit free running counters with no interrupt capabilities. On SNB/IVB/HSW, we used 20GB/s theoretical peak to calculate the hrtimer timeout necessary to avoid missing an overflow. That delay is set to 5s to be on the cautious side. The SNB IMC uses free running counters, which are handled via pseudo fixed counters. The SNB IMC PMU implementation supports an arbitrary number of events, because the counters are read-only. Therefore it is not possible to track active counters. Instead we put active events on a linked list which is then used by the hrtimer handler to update the SW counts. Cc: mingo@elte.hu Cc: acme@redhat.com Cc: ak@linux.intel.com Cc: zheng.z.yan@intel.com Cc: peterz@infradead.org Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1392132015-14521-8-git-send-email-eranian@google.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-02-21 21:49:08 +01:00
Stephane Eranian	b9e1ab6d4c	perf/x86/uncore: add SNB/IVB/HSW client uncore memory controller support This patch adds a new uncore PMU for Intel SNB/IVB/HSW client CPUs. It adds the Integrated Memory Controller (IMC) PMU. This new PMU provides a set of events to measure memory bandwidth utilization. The IMC on those processor is PCI-space based. This patch exposes a new uncore PMU on those processor: uncore_imc Two new events are defined: - name: data_reads - code: 0x1 - unit: 64 bytes - number of full cacheline read requests to the IMC - name: data_writes - code: 0x2 - unit: 64 bytes - number of full cacheline write requests to the IMC Documentation available at: http://software.intel.com/en-us/articles/monitoring-integrated-memory-controller-requests-in-the-2nd-3rd-and-4th-generation-intel Cc: mingo@elte.hu Cc: acme@redhat.com Cc: ak@linux.intel.com Cc: zheng.z.yan@intel.com Cc: peterz@infradead.org Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1392132015-14521-7-git-send-email-eranian@google.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-02-21 21:49:08 +01:00
Stephane Eranian	001e413f7e	perf/x86/uncore: move uncore_event_to_box() and uncore_pmu_to_box() Move a couple of functions around to avoid forward declarations when we add code later on. Cc: mingo@elte.hu Cc: acme@redhat.com Cc: ak@linux.intel.com Cc: zheng.z.yan@intel.com Cc: peterz@infradead.org Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1392132015-14521-6-git-send-email-eranian@google.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-02-21 21:49:07 +01:00
Stephane Eranian	79859cce5a	perf/x86/uncore: make hrtimer timeout configurable per box This patch makes the hrtimer timeout configurable per PMU box. Not all counters have necessarily the same width and rate, thus the default timeout of 60s may need to be adjusted. This patch adds box->hrtimer_duration. It is set to default when the box is allocated. It can be overriden when the box is initialized. Cc: mingo@elte.hu Cc: acme@redhat.com Cc: ak@linux.intel.com Cc: zheng.z.yan@intel.com Cc: peterz@infradead.org Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1392132015-14521-5-git-send-email-eranian@google.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-02-21 21:49:07 +01:00
Stephane Eranian	d64b25b6a0	perf/x86/uncore: add ability to customize pmu callbacks This patch enables custom struct pmu callbacks per uncore PMU types. This feature may be used to simplify counter setup for certain uncore PMUs which have free running counters for instance. It becomes possible to bypass the event scheduling phase of the configuration. Cc: mingo@elte.hu Cc: acme@redhat.com Cc: ak@linux.intel.com Cc: zheng.z.yan@intel.com Cc: peterz@infradead.org Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1392132015-14521-3-git-send-email-eranian@google.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-02-21 21:49:07 +01:00
Stephane Eranian	411cf180fa	perf/x86/uncore: fix initialization of cpumask On certain processors, the uncore PMU boxes may only be msr-bsed or PCI-based. But in both cases, the cpumask, suggesting on which CPUs to monitor to get full coverage of the particular PMU, must be created. However with the current code base, the cpumask was only created on processor which had at least one MSR-based uncore PMU. This patch removes that restriction and ensures the cpumask is created even when there is no msr-based PMU. For instance, on SNB client where only a PCI-based memory controller PMU is supported. Cc: mingo@elte.hu Cc: acme@redhat.com Cc: ak@linux.intel.com Cc: zheng.z.yan@intel.com Cc: peterz@infradead.org Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1392132015-14521-2-git-send-email-eranian@google.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-02-21 21:49:07 +01:00
Thomas Gleixner	d97a860c4f	Merge branch 'linus' into sched/core Reason: Bring bakc upstream modification to resolve conflicts Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-02-21 21:37:09 +01:00
Jiang Liu	896dc50640	x86, acpi: Fix bug in associating hot-added CPUs with corresponding NUMA node Current ACPI cpu hotplug driver fails to associate hot-added CPUs with corresponding NUMA node when doing socket online. The code path to associate CPU with NUMA node is as below: acpi_processor_add() ->acpi_processor_get_info() ->acpi_processor_hotadd_init() ->acpi_map_lsapic() ->_acpi_map_lsapic() ->acpi_map_cpu2node() cpu_subsys_online() ->try_online_node() ->node_set_online() When doing socket online, a new NUMA node is introduced in addition to hot-added CPU and memory device. And the new NUMA node is marked as online when onlining hot-added CPUs through sysfs interface /sys/devices/system/cpu/cpuxx/online. On the other hand, acpi_map_cpu2node() will only build the CPU to node map if corresponding NUMA node is already online, so it always fails to associate hot-added CPUs with corresponding NUMA node because the NUMA node is still in offline state. For the fix, we could safely remove the "node_online(node)" check in function acpi_map_cpu2node() because it's only called for hot-added CPUs by acpi_processor_hotadd_init(). Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Link: http://lkml.kernel.org/r/1390185115-26850-1-git-send-email-jiang.liu@linux.intel.com Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-02-20 19:01:22 -08:00
Linus Torvalds	d49649615d	Merge branch 'fixes-for-v3.14' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping Pull DMA-mapping fixes from Marek Szyprowski: "This contains fixes for incorrect atomic test in dma-mapping subsystem for ARM and x86 architecture" * 'fixes-for-v3.14' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping: x86: dma-mapping: fix GFP_ATOMIC macro usage ARM: dma-mapping: fix GFP_ATOMIC macro usage	2014-02-20 11:58:56 -08:00
Len Brown	2194324d8b	ACPI idle: permit sparse C-state sub-state numbers Linux uses CPUID.MWAIT.EDX to validate the C-states reported by ACPI, silently discarding states which are not supported by the HW. This test is too restrictive, as some HW now uses sparse sub-state numbering, so the sub-state number may be higher than the number of sub-states... Also, rather than silently ignoring an invalid state, we should complain about a firmware bug. In practice... Bay Trail systems originally supported C6-no-shrink as MWAIT sub-state 0x58, and in CPUID.MWAIT.EDX 0x03000000 indicated that there were 3 MWAIT-C6 sub-states. So acpi_idle would discard that C-state because 8 >= 3. Upon discovering this issue, the ucode was updated so that C6-no-shrink was also exported as 0x51, and the BIOS was updated to match. However, systems shipped with 0x58, will never get a BIOS update, and this patch allows Linux to see C6-no-shrink on early Bay Trail. Signed-off-by: Len Brown <len.brown@intel.com>	2014-02-19 17:33:18 -05:00
Mika Westerberg	3e11e818bf	x86: tsc: Add missing Baytrail frequency to the table Intel Baytrail is based on Silvermont core so MSR_FSB_FREQ[2:0] == 0 means that the CPU reference clock runs at 83.3MHz. Add this missing frequency to the table. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Bin Gao <bin.gao@linux.intel.com> Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk> Cc: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/r/1392810750-18660-2-git-send-email-mika.westerberg@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-02-19 17:12:24 +01:00
Thomas Gleixner	5f0e030930	x86, tsc: Fallback to normal calibration if fast MSR calibration fails If we cannot calibrate TSC via MSR based calibration try_msr_calibrate_tsc() stores zero to fast_calibrate and returns that to the caller. This value gets then propagated further to clockevents code resulting division by zero oops like the one below: divide error: 0000 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 3.13.0+ #47 task: ffff880075508000 ti: ffff880075506000 task.ti: ffff880075506000 RIP: 0010:[<ffffffff810aec14>] [<ffffffff810aec14>] clockevents_config.part.3+0x24/0xa0 RSP: 0000:ffff880075507e58 EFLAGS: 00010246 RAX: ffffffffffffffff RBX: ffff880079c0cd80 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffffffffff RBP: ffff880075507e70 R08: 0000000000000001 R09: 00000000000000be R10: 00000000000000bd R11: 0000000000000003 R12: 000000000000b008 R13: 0000000000000008 R14: 000000000000b010 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff880079c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: ffff880079fff000 CR3: 0000000001c0b000 CR4: 00000000001006f0 Stack: ffff880079c0cd80 000000000000b008 0000000000000008 ffff880075507e88 ffffffff810aecb0 ffff880079c0cd80 ffff880075507e98 ffffffff81030168 ffff880075507ed8 ffffffff81d1104f 00000000000000c3 0000000000000000 Call Trace: [<ffffffff810aecb0>] clockevents_config_and_register+0x20/0x30 [<ffffffff81030168>] setup_APIC_timer+0xc8/0xd0 [<ffffffff81d1104f>] setup_boot_APIC_clock+0x4cc/0x4d8 [<ffffffff81d0f5de>] native_smp_prepare_cpus+0x3dd/0x3f0 [<ffffffff81d02ee9>] kernel_init_freeable+0xc3/0x205 [<ffffffff8177c910>] ? rest_init+0x90/0x90 [<ffffffff8177c91e>] kernel_init+0xe/0x120 [<ffffffff8178deec>] ret_from_fork+0x7c/0xb0 [<ffffffff8177c910>] ? rest_init+0x90/0x90 Prevent this from happening by: 1) Modifying try_msr_calibrate_tsc() to return calibration value or zero if it fails. 2) Check this return value in native_calibrate_tsc() and in case of zero fallback to use normal non-MSR based calibration. [mw: Added subject and changelog] Reported-and-tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Bin Gao <bin.gao@linux.intel.com> Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk> Cc: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/r/1392810750-18660-1-git-send-email-mika.westerberg@linux.intel.com Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-02-19 17:12:24 +01:00
Hanjun Guo	328281b1cd	ACPI: Move BAD_MADT_ENTRY() to linux/acpi.h BAD_MADT_ENTRY() is arch independent and will be used for all architectures which parse MADT, so move it to linux/acpi.h to reduce code duplication. Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-02-19 00:56:07 +01:00
Ard Biesheuvel	2b9c1f0327	x86: align x86 arch with generic CPU modalias handling The x86 CPU feature modalias handling existed before it was reimplemented generically. This patch aligns the x86 handling so that it (a) reuses some more code that is now generic; (b) uses the generic format for the modalias module metadata entry, i.e., it now uses 'cpu:type:x86,venVVVVfamFFFFmodMMMM:feature:,XXXX,YYYY' instead of the 'x86cpu:vendor:VVVV👪FFFF:model:MMMM:feature:,XXXX,YYYY' that was used before. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-02-18 12:45:38 -08:00
Linus Torvalds	9bd01b9bbd	Two fixes in the tracing utility. The first is a fix for the way the ring buffer stores timestamps. After a restructure of the code was done, the ring buffer timestamp logic missed the fact that the first event on a sub buffer is to have a zero delta, as the full timestamp is stored on the sub buffer itself. But because the delta was not cleared to zero, the timestamp for that event will be calculated as the real timestamp + the delta from the last timestamp. This can skew the timestamps of the events and have them say they happened when they didn't really happen. That's bad. The second fix is for modifying the function graph caller site. When the stop machine was removed from updating the function tracing code, it missed updating the function graph call site location. It is still modified as if it is being done via stop machine. But it's not. This can lead to a GPF and kernel crash if the function graph call site happens to lie between cache lines and one CPU is executing it while another CPU is doing the update. It would be a very hard condition to hit, but the result is sever enough to have it fixed ASAP. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.15 (GNU/Linux) iQEcBAABAgAGBQJS/2leAAoJEKQekfcNnQGu7nYH/AltUO19AgM2sFLOLM7Q0dp4 Lg7vE8CLKtFq0fjtv/ri//fJ56Lr+/WNHiiD06aIrgnMVBbWynS0m0RO+9bhFl8/ rELiUpXTTruqljmlT2T5lPxk+ZKgtLbxK8hNywU99eLgkTwyaOwrSUol30E8pw41 UwtKg4OAn1LbjQ8/sddVynGlFDNRdqFiGTIDvhHqI6F6/QlaEX81EeZbLThDU4D/ l86fMuIdw5pb+efa29Rr0s7O4Xol7SJgnSMVgd0OYADRFmp4sg+MKxuJAUjPsHk7 9vvbylOb4w5H6lo5h7kUee3w7kG+FjYVoEx+Sqq9936+KlwtN0kbiNvl0DkrXnY= =kUmM -----END PGP SIGNATURE----- Merge tag 'trace-fixes-v3.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull twi tracing fixes from Steven Rostedt: "Two urgent fixes in the tracing utility. The first is a fix for the way the ring buffer stores timestamps. After a restructure of the code was done, the ring buffer timestamp logic missed the fact that the first event on a sub buffer is to have a zero delta, as the full timestamp is stored on the sub buffer itself. But because the delta was not cleared to zero, the timestamp for that event will be calculated as the real timestamp + the delta from the last timestamp. This can skew the timestamps of the events and have them say they happened when they didn't really happen. That's bad. The second fix is for modifying the function graph caller site. When the stop machine was removed from updating the function tracing code, it missed updating the function graph call site location. It is still modified as if it is being done via stop machine. But it's not. This can lead to a GPF and kernel crash if the function graph call site happens to lie between cache lines and one CPU is executing it while another CPU is doing the update. It would be a very hard condition to hit, but the result is severe enough to have it fixed ASAP" * tag 'trace-fixes-v3.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ftrace/x86: Use breakpoints for converting function graph caller ring-buffer: Fix first commit on sub-buffer having non-zero delta	2014-02-15 15:03:34 -08:00
Andi Kleen	40747ffa5a	asmlinkage: Make jiffies visible Jiffies is referenced by the linker script, so it has to be visible. Handled both the generic and the x86 version. Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1391845930-28580-3-git-send-email-ak@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-02-13 18:12:09 -08:00
H. Peter Anvin	03bbd596ac	x86, smap: Don't enable SMAP if CONFIG_X86_SMAP is disabled If SMAP support is not compiled into the kernel, don't enable SMAP in CR4 -- in fact, we should clear it, because the kernel doesn't contain the proper STAC/CLAC instructions for SMAP support. Found by Fengguang Wu's test system. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Link: http://lkml.kernel.org/r/20140213124550.GA30497@localhost Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org> # v3.7+	2014-02-13 07:50:25 -08:00
David Rientjes	7cf6c94591	x86, apic: Remove support for IBM Summit/EXA chipset There should no longer be any IBM x440 systems or those using the Summit/EXA chipset out in the wild, so remove support for it. We've done our due diligence in reaching out to any contact information listed for this chipset and no indication was given that it should be kept around. Signed-off-by: David Rientjes <rientjes@google.com>	2014-02-11 18:11:13 -08:00
David Rientjes	58f5d2d448	x86, apic: Remove support for ia32-based Unisys ES7000 There should no longer be any ia32-based Unisys ES7000 systems out in the wild, so remove support for it. We've done our due diligence in reaching out to any contact information listed for this system and no indication was given that it should be kept around. Signed-off-by: David Rientjes <rientjes@google.com>	2014-02-11 17:47:48 -08:00
Steven Rostedt (Red Hat)	87fbb2ac60	ftrace/x86: Use breakpoints for converting function graph caller When the conversion was made to remove stop machine and use the breakpoint logic instead, the modification of the function graph caller is still done directly as though it was being done under stop machine. As it is not converted via stop machine anymore, there is a possibility that the code could be layed across cache lines and if another CPU is accessing that function graph call when it is being updated, it could cause a General Protection Fault. Convert the update of the function graph caller to use the breakpoint method as well. Cc: H. Peter Anvin <hpa@zytor.com> Cc: stable@vger.kernel.org # 3.5+ Fixes: `08d636b6d4` "ftrace/x86: Have arch x86_64 use breakpoints instead of stop machine" Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-02-11 20:19:44 -05:00
Nicolas Pitre	16f8b05abe	sched/idle, x86: Remove redundant cpuidle_idle_call() The core idle loop now takes care of it. Signed-off-by: Nicolas Pitre <nico@linaro.org> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-sh@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: Russell King <linux@arm.linux.org.uk> Cc: linaro-kernel@lists.linaro.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-3ioazimg4j5iq6kdefks04i8@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-11 09:58:28 +01:00
Marek Szyprowski	c091c71ad2	x86: dma-mapping: fix GFP_ATOMIC macro usage GFP_ATOMIC is not a single gfp flag, but a macro which expands to the other flags, where meaningful is the LACK of __GFP_WAIT flag. To check if caller wants to perform an atomic allocation, the code must test for a lack of the __GFP_WAIT flag. This patch fixes the issue introduced in v3.5-rc1. CC: stable@vger.kernel.org Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>	2014-02-11 09:40:15 +01:00
David Rientjes	dc9788f40a	x86/apic: Always define nox2apic and define it as initdata The "nox2apic" variable can be defined as __initdata since it is only used for bootstrap. It can now unconditionally be defined since it will later be freed. At the same time, it is also better off as a bool. Signed-off-by: David Rientjes <rientjes@google.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1402042354380.7839@chino.kir.corp.google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-09 15:15:11 +01:00
David Rientjes	465822cfc8	x86/apic: Switch wait_for_init_deassert() to a bool flag Now that there is only a single wait_for_init_deassert() function, just convert the member of struct apic to a bool to determine whether we need to wait for init_deassert to become non-zero. There are no more callers of default_wait_for_init_deassert(), so fold it into the caller. Signed-off-by: David Rientjes <rientjes@google.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1402042354010.7839@chino.kir.corp.google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-09 15:15:08 +01:00
David Rientjes	d3c63ae1e2	x86/apic: Only use default_wait_for_init_deassert() es7000_wait_for_init_deassert() is functionally equivalent to default_wait_for_init_deassert(), so remove the duplicate code and use only a single function. Signed-off-by: David Rientjes <rientjes@google.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1402042353030.7839@chino.kir.corp.google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-09 15:15:07 +01:00
Ville Syrjälä	c71ef7b3c3	x86/gpu: Print the Intel graphics stolen memory range Print an informative message when reserving the graphics stolen memory region in the early quirk. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Link: http://lkml.kernel.org/r/1391628540-23072-4-git-send-email-ville.syrjala@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-09 15:11:31 +01:00
Ville Syrjälä	a4dff76924	x86/gpu: Add Intel graphics stolen memory quirk for gen2 platforms There isn't an explicit stolen memory base register on gen2. Some old comment in the i915 code suggests we should get it via max_low_pfn_mapped, but that's clearly a bad idea on my MGM. The e820 map in said machine looks like this: BIOS-e820: [mem 0x0000000000000000-0x000000000009f7ff] usable BIOS-e820: [mem 0x000000000009f800-0x000000000009ffff] reserved BIOS-e820: [mem 0x00000000000ce000-0x00000000000cffff] reserved BIOS-e820: [mem 0x00000000000dc000-0x00000000000fffff] reserved BIOS-e820: [mem 0x0000000000100000-0x000000001f6effff] usable BIOS-e820: [mem 0x000000001f6f0000-0x000000001f6f7fff] ACPI data BIOS-e820: [mem 0x000000001f6f8000-0x000000001f6fffff] ACPI NVS BIOS-e820: [mem 0x000000001f700000-0x000000001fffffff] reserved BIOS-e820: [mem 0x00000000fec10000-0x00000000fec1ffff] reserved BIOS-e820: [mem 0x00000000ffb00000-0x00000000ffbfffff] reserved BIOS-e820: [mem 0x00000000fff00000-0x00000000ffffffff] reserved That makes max_low_pfn_mapped = 1f6f0000, so assuming our stolen memory would start there would place it on top of some ACPI memory regions. So not a good idea as already stated. The 9MB region after the ACPI regions at 0x1f700000 however looks promising given that the macine reports the stolen memory size to be 8MB. Looking at the PGTBL_CTL register, the GTT entries are at offset 0x1fee00000, and given that the GTT entries occupy 128KB, it looks like the stolen memory could start at 0x1f700000 and the GTT entries would occupy the last 128KB of the stolen memory. After some more digging through chipset documentation, I've determined the BIOS first allocates space for something called TSEG (something to do with SMM) from the top of memory, and then it allocates the graphics stolen memory below that. Accordind to the chipset documentation TSEG has a fixed size of 1MB on 855. So that explains the top 1MB in the e820 region. And it also confirms that the GTT entries are in fact at the end of the the stolen memory region. Derive the stolen memory base address on gen2 the same as the BIOS does (TOM-TSEG_SIZE-stolen_size). There are a few differences between the registers on various gen2 chipsets, so a few different codepaths are required. 865G is again bit more special since it seems to support enough memory to hit 4GB address space issues. This means the PCI allocations will also affect the location of the stolen memory. Fortunately there appears to be the TOUD register which may give us the correct answer directly. But the chipset docs are a bit unclear, so I'm not 100% sure that the graphics stolen memory is always the last thing the BIOS steals. Someone would need to verify it on a real system. I tested this on the my 830 and 855 machines, and so far everything looks peachy. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Link: http://lkml.kernel.org/r/1391628540-23072-3-git-send-email-ville.syrjala@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-09 15:11:30 +01:00
Ville Syrjälä	52ca70454e	x86/gpu: Add vfunc for Intel graphics stolen memory base address For gen2 devices we're going to need another way to determine the stolen memory base address. Make that into a vfunc as well. Also drop the bogus inline keyword from gen8_stolen_size(). Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Link: http://lkml.kernel.org/r/1391628540-23072-2-git-send-email-ville.syrjala@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-09 15:11:30 +01:00
Don Zickus	90ed5b0fa5	perf/x86/p4: Block PMIs on init to prevent a stream of unkown NMIs A bunch of unknown NMIs have popped up on a Pentium4 recently when booting into a kdump kernel. This was exposed because the watchdog timer went from 60 seconds down to 10 seconds (increasing the ability to reproduce this problem). What is happening is on boot up of the second kernel (the kdump one), the previous nmi_watchdogs were enabled on thread 0 and thread 1. The second kernel only initializes one cpu but the perf counter on thread 1 still counts. Normally in a kdump scenario, the other cpus are blocking in an NMI loop, but more importantly their local apics have the performance counters disabled (iow LVTPC is masked). So any counters that fire are masked and never get through to the second kernel. However, on a P4 the local apic is shared by both threads and thread1's PMI (despite being configured to only interrupt thread1) will generate an NMI on thread0. Because thread0 knows nothing about this NMI, it is seen as an unknown NMI. This would be fine because it is a kdump kernel, strange things happen what is the big deal about a single unknown NMI. Unfortunately, the P4 comes with another quirk: clearing the overflow bit to prevent a stream of NMIs. This is the problem. The kdump kernel can not execute because of the endless NMIs that happen. To solve this, I instrumented the p4 perf init code, to walk all the counters and zero them out (just like a normal reset would). Now when the counters go off, they do not generate anything and no unknown NMIs are seen. I tested this on a P4 we have in our lab. After two or three crashes, I could normally reproduce the problem. Now after 10 crashes, everything continues to boot correctly. Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140120154115.GZ25953@redhat.com [ Fixed a stylistic detail. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-09 13:20:35 +01:00
Don Zickus	13beacee81	perf/x86/p4: Fix counter corruption when using lots of perf groups On a P4 box stressing perf with: ./perf record -o perf.data ./perf stat -v ./perf bench all it was noticed that a slew of unknown NMIs would pop out rather quickly. Painfully debugging this ancient platform, led me to notice cross cpu counter corruption. The P4 machine is special in that it has 18 counters, half are used for cpu0 and the other half is for cpu1 (or all 18 if hyperthreading is disabled). But the splitting of the counters has to be actively managed by the software. In this particular bug, one of the cpu0 specific counters was being used by cpu1 and caused all sorts of random unknown nmis. I am not entirely sure on the corruption path, but what happens is: o perf schedules a group with p4_pmu_schedule_events() o inside p4_pmu_schedule_events(), it notices an hwc pointer is being reused but for a different cpu, so it 'swaps' the config bits and returns the updated 'assign' array with a _new_ index. o perf schedules another group with p4_pmu_schedule_events() o inside p4_pmu_schedule_events(), it notices an hwc pointer is being reused (the same one as above) but for the _same_ cpu [BUG!!], so it updates the 'assign' array to use the _old_ (wrong cpu) index because the _new_ index is in an earlier part of the 'assign' array (and hasn't been committed yet). o perf commits the transaction using the wrong index and corrupts the other cpu The [BUG!!] is because the 'hwc->config' is updated but not the 'hwc->idx'. So the check for 'p4_should_swap_ts()' is correct the first time around but incorrect the second time around (because hwc->config was updated in between). I think the spirit of perf was to not modify anything until all the transactions had a chance to 'test' if they would succeed, and if so, commit atomically. However, P4 breaks this spirit by touching the hwc->config element. So my fix is to continue the un-perf like breakage, by assigning hwc->idx to -1 on swap to tell follow up group scheduling to find a new index. Of course if the transaction fails rolling this back will be difficult, but that is not different than how the current code works. :-) And I wasn't sure how much effort to cleanup the code I should do for a platform that is almost 10 years old by now. Hence the lazy fix. Signed-off-by: Don Zickus <dzickus@redhat.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1391024270-19469-1-git-send-email-dzickus@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-09 13:17:23 +01:00
Peter Zijlstra	e90c785352	x86/nmi: Push duration printk() to irq context Calling printk() from NMI context is bad (TM), so move it to IRQ context. In doing so we slightly change (probably wreck) the debugfs nmi_longest_ns thingy, in that it doesn't update to reflect the longest, nor does writing to it reset the count. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Don Zickus <dzickus@redhat.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Link: http://lkml.kernel.org/n/tip-rdw0au56a5ymis1u8p48c12d@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-09 13:17:22 +01:00
Ingo Molnar	3c3d7cb1db	Merge branch 'linus' into perf/core Refresh the branch to a v3.14-rc base before queueing up new devel patches. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-09 13:13:45 +01:00
Steven Rostedt	569d6557ab	x86: Use preempt_disable_notrace() in cycles_2_ns() When debug preempt is enabled, preempt_disable() can be traced by function and function graph tracing. There's a place in the function graph tracer that calls trace_clock() which eventually calls cycles_2_ns() outside of the recursion protection. When cycles_2_ns() calls preempt_disable() it gets traced and the graph tracer will go into a recursive loop causing a crash or worse, a triple fault. Simple fix is to use preempt_disable_notrace() in cycles_2_ns, which makes sense because the preempt_disable() tracing may use that code too, and it tracing it, even with recursion protection is rather pointless. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140204141315.2a968a72@gandalf.local.home Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-09 13:09:08 +01:00
Peter Zijlstra	0e9f2204cf	perf/x86: Fix Userspace RDPMC switch The current code forgets to change the CR4 state on the current CPU. Use on_each_cpu() instead of smp_call_function(). Reported-by: Mark Davies <junk@eslaf.co.uk> Suggested-by: Mark Davies <junk@eslaf.co.uk> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: fweisbec@gmail.com Link: http://lkml.kernel.org/n/tip-69efsat90ibhnd577zy3z9gh@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-09 13:08:25 +01:00
Peter Zijlstra	e97df76377	perf/x86/intel/p6: Add userspace RDPMC quirk for PPro PPro machines can die hard when PCE gets enabled due to a CPU erratum. The safe way it so disable it by default and keep it disabled. See erratum 26 in: http://download.intel.com/design/archives/processors/pro/docs/24268935.pdf Reported-and-Tested-by: Mark Davies <junk@eslaf.co.uk> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vince@deater.net> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140206170815.GW2936@laptop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-09 13:08:24 +01:00
H. Peter Anvin	a3b072cd18	* Avoid WARN_ON() when mapping BGRT on Baytrail (EFI 32-bit). -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJS9P2pAAoJEC84WcCNIz1VABsP/j0fwiWdaoEn/9p9EUQcBcMn CILuiKI8GoBR+0nH7vm/jSgUKfUxzM6T0+qjoZPVkO/u/qup+JCT5JxoywUyEbfb +wPNIhBgKSwJdWXPd4ZvObA9jknrP6r5sJTsixpESVpVWmcC8egPgkyBFILjokKS BCJqqruHZcXfWaDQTocYErl3T217J7RfnCG1o7yP96g4jteX8EdvIkPjEf2mM83I GXacHLy8IhGhdyPKSXcRqZ0ivhZ2WX1iFQYIY19FhmzBs364WulyXve+oV+l/4h4 Kx7ks4Ob5AlUIizchGNwVz7058F9o/v7m0CezTgi4Q/RrZi34samxbjk/95/Xc60 JSvWkekm3/jOODubad5zj+ZnJmG83/ZlCUuTaqsE47ftbaedSUHBN9QSpm8iHEex n3d4J3AdP/3amcP33kZ5MRALDYIFKb4ZxtDkADqDcXhS56COivGAdZe5hnyCpb/9 RPUDXTOlxfJQK/y2Atcdb400JjJ/Yr9Kew81LRIt0UMZU3dKSh05UZ+a7Ym0yCkt 3k0NNkgsFCZbYTO/Z3aPDcprwU5Lq9UrwjB17U2ev/qK+qRYDzCzSR0XGPrLMRv7 C5Bnov6uCn/0ZG/NlAx8UXK9wdWDsLhp1QkBz+daX3sGwRAS+OiKBv6+l8dqsdOc 1L8PMkTX2rgtELiv4PJ/ =hLCC -----END PGP SIGNATURE----- Merge tag 'efi-urgent' into x86/urgent * Avoid WARN_ON() when mapping BGRT on Baytrail (EFI 32-bit). Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-02-07 11:27:30 -08:00
Borislav Petkov	75a1ba5b2c	x86, microcode, AMD: Unify valid container checks For additional coverage, BorisO and friends unknowlingly did swap AMD microcode with Intel microcode blobs in order to see what happens. What did happen on 32-bit was [ 5.722656] BUG: unable to handle kernel paging request at be3a6008 [ 5.722693] IP: [<c106d6b4>] load_microcode_amd+0x24/0x3f0 [ 5.722716] pdpt = 0000000000000000 pde = 0000000000000000 because there was a valid initrd there but without valid microcode in it and the container check happened after the relocated ramdisk handling on 32-bit, which was clearly wrong. While at it, take care of the ramdisk relocation on both 32- and 64-bit as it is done on both. Also, comment what we're doing because this code is a bit tricky. Reported-and-tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1391460104-7261-1-git-send-email-bp@alien8.de Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2014-02-06 11:11:19 -08:00
Linus Torvalds	ab5318788c	Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core debug changes from Ingo Molnar: "This contains mostly kernel debugging related updates: - make hung_task detection more configurable to distros - add final bits for x86 UV NMI debugging, with related KGDB changes - update the mailing-list of MAINTAINERS entries I'm involved with" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: hung_task: Display every hung task warning sysctl: Add neg_one as a standard constraint x86/uv/nmi, kgdb/kdb: Fix UV NMI handler when KDB not configured x86/uv/nmi: Fix Sparse warnings kgdb/kdb: Fix no KDB config problem MAINTAINERS: Restore "L: linux-kernel@vger.kernel.org" entries	2014-01-31 08:59:46 -08:00
Linus Torvalds	e2a0f813e0	Second batch of KVM updates. Some minor x86 fixes, two s390 guest features that need some handling in the host, and all the PPC changes. The PPC changes include support for little-endian guests and enablement for new POWER8 features. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJS6UF5AAoJEBvWZb6bTYby55kP/AgTJnyu7avN653/2aSHkjkx KgYSMYhZPIFoY5LyZuNetXaoXFRvCykux1VYSZ6V6s35h2PZ+hdJNbHGjFYKPGTq FQ92xQVNuWCAPxmFCjDNuDV/0BauG5y08/Orh/jpjz+GAfH43LruUQGbtXUuyJ8u vf+yTHniU5gguqsAmodqjHUgbf+GoPJ1j7hmRoWwt8IWm7Ns3v/IK4l0p6G0h26a RjE6aK+Tm208Yr5hD/dRAqeTbBNt3c4xub+QPsKoiEMaZBSuAOiux7D3Kx+If1gp WsmqEQxoymihVtkZhUFO9ONLJepvmG2QwJVVyMSUW9iqxX9rraXsvVyVMwcQAhog JuOAYxBftH07xu6Fs4eym5KvCFghM+EaJvxxt+kgnvdD4htK1+eK5trntc2zygSi /qGiIrkqjXpkskW8kujLayF0eAU3CrZvFWveEPBfFgYiOGX/2wzJCtSm/bt9Jo0M v60qgNFK3LNqAyeEfnm9VtlwGr6ZgsAB6DHNPX4fM5s2IBjL+qloXk/e/+aVKkW0 I3yeRdy/ExhLAab6w81JtMeR7G3YS0UNuAEVvcoxzNb5wIBY8qnpfUzTKyMxQR94 64EVpxWEYO1s55eCCyMujWrSvc+YAwhJcWHGKgC4K7mxxLD3FVyQXX6YZvgRozMX HjQju+DToj9CskyrFlRL =yd0Z -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull more KVM updates from Paolo Bonzini: "Second batch of KVM updates. Some minor x86 fixes, two s390 guest features that need some handling in the host, and all the PPC changes. The PPC changes include support for little-endian guests and enablement for new POWER8 features" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (45 commits) x86, kvm: correctly access the KVM_CPUID_FEATURES leaf at 0x40000101 x86, kvm: cache the base of the KVM cpuid leaves kvm: x86: move KVM_CAP_HYPERV_TIME outside #ifdef KVM: PPC: Book3S PR: Cope with doorbell interrupts KVM: PPC: Book3S HV: Add software abort codes for transactional memory KVM: PPC: Book3S HV: Add new state for transactional memory powerpc/Kconfig: Make TM select VSX and VMX KVM: PPC: Book3S HV: Basic little-endian guest support KVM: PPC: Book3S HV: Add support for DABRX register on POWER7 KVM: PPC: Book3S HV: Prepare for host using hypervisor doorbells KVM: PPC: Book3S HV: Handle new LPCR bits on POWER8 KVM: PPC: Book3S HV: Handle guest using doorbells for IPIs KVM: PPC: Book3S HV: Consolidate code that checks reason for wake from nap KVM: PPC: Book3S HV: Implement architecture compatibility modes for POWER8 KVM: PPC: Book3S HV: Add handler for HV facility unavailable KVM: PPC: Book3S HV: Flush the correct number of TLB sets on POWER8 KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs KVM: PPC: Book3S HV: Align physical and virtual CPU thread numbers KVM: PPC: Book3S HV: Don't set DABR on POWER8 kvm/ppc: IRQ disabling cleanup ...	2014-01-31 08:37:32 -08:00
Linus Torvalds	12f2bbd609	Merge branch 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asmlinkage (LTO) changes from Peter Anvin: "This patchset adds more infrastructure for link time optimization (LTO). This patchset was pulled into my tree late because of a miscommunication (part of the patchset was picked up by other maintainers). However, the patchset is strictly build-related and seems to be okay in testing" * 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, asmlinkage, xen: Fix type of NMI x86, asmlinkage, xen, kvm: Make {xen,kvm}_lock_spinning global and visible x86: Use inline assembler instead of global register variable to get sp x86, asmlinkage, paravirt: Make paravirt thunks global x86, asmlinkage, paravirt: Don't rely on local assembler labels x86, asmlinkage, lguest: Fix C functions used by inline assembler	2014-01-30 18:15:32 -08:00
Prarit Bhargava	39424e89d6	x86, cpu hotplug: Fix stack frame warning in check_irq_vectors_for_cpu_disable() Further discussion here: http://marc.info/?l=linux-kernel&m=139073901101034&w=2 kbuild, 0day kernel build service, outputs the warning: arch/x86/kernel/irq.c:333:1: warning: the frame size of 2056 bytes is larger than 2048 bytes [-Wframe-larger-than=] because check_irq_vectors_for_cpu_disable() allocates two cpumasks on the stack. Fix this by moving the two cpumasks to a global file context. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Tested-by: David Rientjes <rientjes@google.com> Signed-off-by: Prarit Bhargava <prarit@redhat.com> Link: http://lkml.kernel.org/r/1390915331-27375-1-git-send-email-prarit@redhat.com Cc: Andi Kleen <ak@linux.intel.com> Cc: Michel Lespinasse <walken@google.com> Cc: Seiji Aguchi <seiji.aguchi@hds.com> Cc: Yang Zhang <yang.z.zhang@Intel.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Janet Morgan <janet.morgan@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Ruiv Wang <ruiv.wang@gmail.com> Cc: Gong Chen <gong.chen@linux.intel.com> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-01-30 16:40:13 -08:00
Andi Kleen	dd41f818e5	x86, asmlinkage, xen, kvm: Make {xen,kvm}_lock_spinning global and visible These functions are called from inline assembler stubs, thus need to be global and visible. Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1382458079-24450-7-git-send-email-andi@firstfloor.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-01-29 22:17:18 -08:00
Andi Kleen	a2e7f0e3a4	x86, asmlinkage, paravirt: Make paravirt thunks global The paravirt thunks use a hack of using a static reference to a static function to reference that function from the top level statement. This assumes that gcc always generates static function names in a specific format, which is not necessarily true. Simply make these functions global and asmlinkage or __visible. This way the static __used variables are not needed and everything works. Functions with arguments are __visible to keep the register calling convention on 32bit. Changed in paravirt and in all users (Xen and vsmp) v2: Use __visible for functions with arguments Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Ido Yariv <ido@wizery.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1382458079-24450-5-git-send-email-andi@firstfloor.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-01-29 22:17:17 -08:00
Paolo Bonzini	77f01bdfa5	x86, kvm: correctly access the KVM_CPUID_FEATURES leaf at 0x40000101 When Hyper-V hypervisor leaves are present, KVM must relocate its own leaves at 0x40000100, because Windows does not look for Hyper-V leaves at indices other than 0x40000000. In this case, the KVM features are at 0x40000101, but the old code would always look at 0x40000001. Fix by using kvm_cpuid_base(). This also requires making the function non-inline, since kvm_cpuid_base() is static. Fixes: `1085ba7f55` Cc: stable@vger.kernel.org Cc: mtosatti@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-01-29 18:11:55 +01:00
Paolo Bonzini	1c300a4077	x86, kvm: cache the base of the KVM cpuid leaves It is unnecessary to go through hypervisor_cpuid_base every time a leaf is found (which will be every time a feature is requested after the next patch). Fixes: `1085ba7f55` Cc: stable@vger.kernel.org Cc: mtosatti@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-01-29 18:11:54 +01:00
Yinghai Lu	4ce7a8697c	x86: revert wrong memblock current limit setting Dave reported big numa system booting is broken. It turns out that commit `5b6e529521` ("x86: memblock: set current limit to max low memory address") sets the limit to low wrongly. max_low_pfn_mapped is different from max_pfn_mapped. max_low_pfn_mapped is always under 4G. That will memblock_alloc_nid all go under 4G. Revert `5b6e529521` to fix a no-boot regression which was triggered by `457ff1de2d` ("lib/swiotlb.c: use memblock apis for early memory allocations"). Signed-off-by: Yinghai Lu <yinghai@kernel.org> Reported-by: Dave Hansen <dave.hansen@intel.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-27 21:02:38 -08:00
Linus Torvalds	f6d13daadd	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "A couple of regression fixes mostly hitting virtualized setups, but also some bare metal systems" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/x86/tsc: Initialize multiplier to 0 sched/clock: Fixup early initialization sched/preempt/x86: Fix voluntary preempt for x86 Revert "sched: Fix sleep time double accounting in enqueue entity"	2014-01-25 11:11:31 -08:00
Ingo Molnar	2b45e0f9f3	Merge branch 'linus' into x86/urgent Merge in the x86 changes to apply a fix. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-01-25 09:16:14 +01:00
Mel Gorman	b9a3b4c976	mm, x86: Revisit tlb_flushall_shift tuning for page flushes except on IvyBridge There was a large ebizzy performance regression that was bisected to commit `611ae8e3` (x86/tlb: enable tlb flush range support for x86). The problem was related to the tlb_flushall_shift tuning for IvyBridge which was altered. The problem is that it is not clear if the tuning values for each CPU family is correct as the methodology used to tune the values is unclear. This patch uses a conservative tlb_flushall_shift value for all CPU families except IvyBridge so the decision can be revisited if any regression is found as a result of this change. IvyBridge is an exception as testing with one methodology determined that the value of 2 is acceptable. Details are in the changelog for the patch "x86: mm: Change tlb_flushall_shift for IvyBridge". One important aspect of this to watch out for is Xen. The original commit log mentioned large performance gains on Xen. It's possible Xen is more sensitive to this value if it flushes small ranges of pages more frequently than workloads on bare metal typically do. Signed-off-by: Mel Gorman <mgorman@suse.de> Tested-by: Davidlohr Bueso <davidlohr@hp.com> Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alex Shi <alex.shi@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-dyzMww3fqugnhbhgo6Gxmtkw@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-01-25 09:10:44 +01:00
Mel Gorman	f98b7a772a	x86: mm: change tlb_flushall_shift for IvyBridge There was a large performance regression that was bisected to commit `611ae8e3` ("x86/tlb: enable tlb flush range support for x86"). This patch simply changes the default balance point between a local and global flush for IvyBridge. In the interest of allowing the tests to be reproduced, this patch was tested using mmtests 0.15 with the following configurations configs/config-global-dhp__tlbflush-performance configs/config-global-dhp__scheduler-performance configs/config-global-dhp__network-performance Results are from two machines Ivybridge 4 threads: Intel(R) Core(TM) i3-3240 CPU @ 3.40GHz Ivybridge 8 threads: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz Page fault microbenchmark showed nothing interesting. Ebizzy was configured to run multiple iterations and threads. Thread counts ranged from 1 to NR_CPUS2. For each thread count, it ran 100 iterations and each iteration lasted 10 seconds. Ivybridge 4 threads 3.13.0-rc7 3.13.0-rc7 vanilla altshift-v3 Mean 1 6395.44 ( 0.00%) 6789.09 ( 6.16%) Mean 2 7012.85 ( 0.00%) 8052.16 ( 14.82%) Mean 3 6403.04 ( 0.00%) 6973.74 ( 8.91%) Mean 4 6135.32 ( 0.00%) 6582.33 ( 7.29%) Mean 5 6095.69 ( 0.00%) 6526.68 ( 7.07%) Mean 6 6114.33 ( 0.00%) 6416.64 ( 4.94%) Mean 7 6085.10 ( 0.00%) 6448.51 ( 5.97%) Mean 8 6120.62 ( 0.00%) 6462.97 ( 5.59%) Ivybridge 8 threads 3.13.0-rc7 3.13.0-rc7 vanilla altshift-v3 Mean 1 7336.65 ( 0.00%) 7787.02 ( 6.14%) Mean 2 8218.41 ( 0.00%) 9484.13 ( 15.40%) Mean 3 7973.62 ( 0.00%) 8922.01 ( 11.89%) Mean 4 7798.33 ( 0.00%) 8567.03 ( 9.86%) Mean 5 7158.72 ( 0.00%) 8214.23 ( 14.74%) Mean 6 6852.27 ( 0.00%) 7952.45 ( 16.06%) Mean 7 6774.65 ( 0.00%) 7536.35 ( 11.24%) Mean 8 6510.50 ( 0.00%) 6894.05 ( 5.89%) Mean 12 6182.90 ( 0.00%) 6661.29 ( 7.74%) Mean 16 6100.09 ( 0.00%) 6608.69 ( 8.34%) Ebizzy hits the worst case scenario for TLB range flushing every time and it shows for these Ivybridge CPUs at least that the default choice is a poor on. The patch addresses the problem. Next was a tlbflush microbenchmark written by Alex Shi at http://marc.info/?l=linux-kernel&m=133727348217113 . It measures access costs while the TLB is being flushed. The expectation is that if there are always full TLB flushes that the benchmark would suffer and it benefits from range flushing There are 320 iterations of the test per thread count. The number of entries is randomly selected with a min of 1 and max of 512. To ensure a reasonably even spread of entries, the full range is broken up into 8 sections and a random number selected within that section. iteration 1, random number between 0-64 iteration 2, random number between 64-128 etc This is still a very weak methodology. When you do not know what are typical ranges, random is a reasonable choice but it can be easily argued that the opimisation was for smaller ranges and an even spread is not representative of any workload that matters. To improve this, we'd need to know the probability distribution of TLB flush range sizes for a set of workloads that are considered "common", build a synthetic trace and feed that into this benchmark. Even that is not perfect because it would not account for the time between flushes but there are limits of what can be reasonably done and still be doing something useful. If a representative synthetic trace is provided then this benchmark could be revisited and the shift values retuned. Ivybridge 4 threads 3.13.0-rc7 3.13.0-rc7 vanilla altshift-v3 Mean 1 10.50 ( 0.00%) 10.50 ( 0.03%) Mean 2 17.59 ( 0.00%) 17.18 ( 2.34%) Mean 3 22.98 ( 0.00%) 21.74 ( 5.41%) Mean 5 47.13 ( 0.00%) 46.23 ( 1.92%) Mean 8 43.30 ( 0.00%) 42.56 ( 1.72%) Ivybridge 8 threads 3.13.0-rc7 3.13.0-rc7 vanilla altshift-v3 Mean 1 9.45 ( 0.00%) 9.36 ( 0.93%) Mean 2 9.37 ( 0.00%) 9.70 ( -3.54%) Mean 3 9.36 ( 0.00%) 9.29 ( 0.70%) Mean 5 14.49 ( 0.00%) 15.04 ( -3.75%) Mean 8 41.08 ( 0.00%) 38.73 ( 5.71%) Mean 13 32.04 ( 0.00%) 31.24 ( 2.49%) Mean 16 40.05 ( 0.00%) 39.04 ( 2.51%) For both CPUs, average access time is reduced which is good as this is the benchmark that was used to tune the shift values in the first place albeit it is now known how* the benchmark was used. The scheduler benchmarks were somewhat inconclusive. They showed gains and losses and makes me reconsider how stable those benchmarks really are or if something else might be interfering with the test results recently. Network benchmarks were inconclusive. Almost all results were flat except for netperf-udp tests on the 4 thread machine. These results were unstable and showed large variations between reboots. It is unknown if this is a recent problems but I've noticed before that netperf-udp results tend to vary. Based on these results, changing the default for Ivybridge seems like a logical choice. Signed-off-by: Mel Gorman <mgorman@suse.de> Tested-by: Davidlohr Bueso <davidlohr@hp.com> Reviewed-by: Alex Shi <alex.shi@linaro.org> Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-cqnadffh1tiqrshthRj3Esge@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-01-25 09:10:43 +01:00
Mel Gorman	ec65993443	mm, x86: Account for TLB flushes only when debugging Bisection between 3.11 and 3.12 fingered commit `9824cf97` ("mm: vmstats: tlb flush counters") to cause overhead problems. The counters are undeniably useful but how often do we really need to debug TLB flush related issues? It does not justify taking the penalty everywhere so make it a debugging option. Signed-off-by: Mel Gorman <mgorman@suse.de> Tested-by: Davidlohr Bueso <davidlohr@hp.com> Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Hugh Dickins <hughd@google.com> Cc: Alex Shi <alex.shi@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-XzxjntugxuwpxXhcrxqqh53b@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-01-25 09:10:41 +01:00
Mike Travis	74c93f9d39	x86/uv/nmi: Fix Sparse warnings Make uv_register_nmi_notifier() and uv_handle_nmi_ping() static to address sparse warnings. Fix problem where uv_nmi_kexec_failed is unused when CONFIG_KEXEC is not defined. Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Hedi Berriche <hedi@sgi.com> Cc: Russ Anderson <rja@sgi.com> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Link: http://lkml.kernel.org/r/20140114162551.480872353@asylum.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-01-25 08:55:10 +01:00
Dan Carpenter	2993ae3305	x86/AMD/NB: Fix amd_set_subcaches() parameter type This is under CAP_SYS_ADMIN, but Smatch complains that mask comes from the user and the test for "mask > 0xf" can underflow. The fix is simple: amd_set_subcaches() should hand down not an 'int' but an 'unsigned long' like it was originally indended to do. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Borislav Petkov <bp@suse.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Daniel J Blueman <daniel@numascale-asia.com> Link: http://lkml.kernel.org/r/20140121072209.GA22095@elgon.mountain Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-01-25 08:50:09 +01:00
Aravind Gopalakrishnan	fb53a1ab88	x86/quirks: Add workaround for AMD F16h Erratum792 The workaround for this Erratum is included in AGESA. But BIOSes spun only after Jan2014 will have the fix (atleast server versions of the chip). The erratum affects both embedded and server platforms and since we cannot say with certainity that ALL BIOSes on systems out in the field will have the fix, we should probably insulate ourselves in case BIOS does not do the right thing or someone is using old BIOSes. Refer to Revision Guide for AMD F16h models 00h-0fh, document 51810 Rev. 3.04, November2013 for details on the Erratum. Tested the patch on Fam16h server platform and it works fine. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Cc: <hmh@hmh.eng.br> Cc: <Kim.Naru@amd.com> Cc: <Suravee.Suthikulpanit@amd.com> Cc: <bp@suse.de> Cc: <sherry.hurwitz@amd.com> Link: http://lkml.kernel.org/r/1390515212-1824-1-git-send-email-Aravind.Gopalakrishnan@amd.com [ Minor edits. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-01-25 08:44:19 +01:00
Linus Torvalds	09da8dfa98	ACPI and power management updates for 3.14-rc1 - ACPI core changes to make it create a struct acpi_device object for every device represented in the ACPI tables during all namespace scans regardless of the current status of that device. In accordance with this, ACPI hotplug operations will not delete those objects, unless the underlying ACPI tables go away. - On top of the above, new sysfs attribute for ACPI device objects allowing user space to check device status by triggering the execution of _STA for its ACPI object. From Srinivas Pandruvada. - ACPI core hotplug changes reducing code duplication, integrating the PCI root hotplug with the core and reworking container hotplug. - ACPI core simplifications making it use ACPI_COMPANION() in the code "glueing" ACPI device objects to "physical" devices. - ACPICA update to upstream version 20131218. This adds support for the DBG2 and PCCT tables to ACPICA, fixes some bugs and improves debug facilities. From Bob Moore, Lv Zheng and Betty Dall. - Init code change to carry out the early ACPI initialization earlier. That should allow us to use ACPI during the timekeeping initialization and possibly to simplify the EFI initialization too. From Chun-Yi Lee. - Clenups of the inclusions of ACPI headers in many places all over from Lv Zheng and Rashika Kheria (work in progress). - New helper for ACPI _DSM execution and rework of the code in drivers that uses _DSM to execute it via the new helper. From Jiang Liu. - New Win8 OSI blacklist entries from Takashi Iwai. - Assorted ACPI fixes and cleanups from Al Stone, Emil Goode, Hanjun Guo, Lan Tianyu, Masanari Iida, Oliver Neukum, Prarit Bhargava, Rashika Kheria, Tang Chen, Zhang Rui. - intel_pstate driver updates, including proper Baytrail support, from Dirk Brandewie and intel_pstate documentation from Ramkumar Ramachandra. - Generic CPU boost ("turbo") support for cpufreq from Lukasz Majewski. - powernow-k6 cpufreq driver fixes from Mikulas Patocka. - cpufreq core fixes and cleanups from Viresh Kumar, Jane Li, Mark Brown. - Assorted cpufreq drivers fixes and cleanups from Anson Huang, John Tobias, Paul Bolle, Paul Walmsley, Sachin Kamat, Shawn Guo, Viresh Kumar. - cpuidle cleanups from Bartlomiej Zolnierkiewicz. - Support for hibernation APM events from Bin Shi. - Hibernation fix to avoid bringing up nonboot CPUs with ACPI EC disabled during thaw transitions from Bjørn Mork. - PM core fixes and cleanups from Ben Dooks, Leonardo Potenza, Ulf Hansson. - PNP subsystem fixes and cleanups from Dmitry Torokhov, Levente Kurusa, Rashika Kheria. - New tool for profiling system suspend from Todd E Brandt and a cpupower tool cleanup from One Thousand Gnomes. / -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJS3a1eAAoJEILEb/54YlRxnTgP/iGawvgjKWm6Qqp7WSIvd5gQ zZ6q75C6Pc/W2fq1+OzVGnpCF8WYFy+nFDAXOvUHjIXuoxSwFcuW5l4aMckgl/0a TXEWe9MJrCHHRfDApfFacCJ44U02bjJAD5vTyL/hKA+IHeinq4WCSojryYC+8jU0 cBrUIV0aNH8r5JR2WJNAyv/U29rXsDUOu0I4qTqZ4YaZT6AignMjtLXn1e9AH1Pn DPZphTIo/HMnb+kgBOjt4snMk+ahVO9eCOxh/hH8ecnWExw9WynXoU5Nsna0tSZs ssyHC7BYexD3oYsG8D52cFUpp4FCsJ0nFQNa2kw0LY+0FBNay43LySisKYHZPXEs 2WpESDv+/t7yhtnrvM+TtA7aBheKm2XMWGFSu/aERLE17jIidOkXKH5Y7ryYLNf/ uyRKxNS0NcZWZ0G+/wuY02jQYNkfYz3k/nTr8BAUItRBjdporGIRNEnR9gPzgCUC uQhjXWMPulqubr8xbyefPWHTEzU2nvbXwTUWGjrBxSy8zkyy5arfqizUj+VG6afT NsboANoMHa9b+xdzigSFdA3nbVK6xBjtU6Ywntk9TIpODKF5NgfARx0H+oSH+Zrj 32bMzgZtHw/lAbYsnQ9OnTY6AEWQYt6NMuVbTiLXrMHhM3nWwfg/XoN4nZqs6jPo IYvE6WhQZU6L6fptGHFC =dRf6 -----END PGP SIGNATURE----- Merge tag 'pm+acpi-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management updates from Rafael Wysocki: "As far as the number of commits goes, the top spot belongs to ACPI this time with cpufreq in the second position and a handful of PM core, PNP and cpuidle updates. They are fixes and cleanups mostly, as usual, with a couple of new features in the mix. The most visible change is probably that we will create struct acpi_device objects (visible in sysfs) for all devices represented in the ACPI tables regardless of their status and there will be a new sysfs attribute under those objects allowing user space to check that status via _STA. Consequently, ACPI device eject or generally hot-removal will not delete those objects, unless the table containing the corresponding namespace nodes is unloaded, which is extremely rare. Also ACPI container hotplug will be handled quite a bit differently and cpufreq will support CPU boost ("turbo") generically and not only in the acpi-cpufreq driver. Specifics: - ACPI core changes to make it create a struct acpi_device object for every device represented in the ACPI tables during all namespace scans regardless of the current status of that device. In accordance with this, ACPI hotplug operations will not delete those objects, unless the underlying ACPI tables go away. - On top of the above, new sysfs attribute for ACPI device objects allowing user space to check device status by triggering the execution of _STA for its ACPI object. From Srinivas Pandruvada. - ACPI core hotplug changes reducing code duplication, integrating the PCI root hotplug with the core and reworking container hotplug. - ACPI core simplifications making it use ACPI_COMPANION() in the code "glueing" ACPI device objects to "physical" devices. - ACPICA update to upstream version 20131218. This adds support for the DBG2 and PCCT tables to ACPICA, fixes some bugs and improves debug facilities. From Bob Moore, Lv Zheng and Betty Dall. - Init code change to carry out the early ACPI initialization earlier. That should allow us to use ACPI during the timekeeping initialization and possibly to simplify the EFI initialization too. From Chun-Yi Lee. - Clenups of the inclusions of ACPI headers in many places all over from Lv Zheng and Rashika Kheria (work in progress). - New helper for ACPI _DSM execution and rework of the code in drivers that uses _DSM to execute it via the new helper. From Jiang Liu. - New Win8 OSI blacklist entries from Takashi Iwai. - Assorted ACPI fixes and cleanups from Al Stone, Emil Goode, Hanjun Guo, Lan Tianyu, Masanari Iida, Oliver Neukum, Prarit Bhargava, Rashika Kheria, Tang Chen, Zhang Rui. - intel_pstate driver updates, including proper Baytrail support, from Dirk Brandewie and intel_pstate documentation from Ramkumar Ramachandra. - Generic CPU boost ("turbo") support for cpufreq from Lukasz Majewski. - powernow-k6 cpufreq driver fixes from Mikulas Patocka. - cpufreq core fixes and cleanups from Viresh Kumar, Jane Li, Mark Brown. - Assorted cpufreq drivers fixes and cleanups from Anson Huang, John Tobias, Paul Bolle, Paul Walmsley, Sachin Kamat, Shawn Guo, Viresh Kumar. - cpuidle cleanups from Bartlomiej Zolnierkiewicz. - Support for hibernation APM events from Bin Shi. - Hibernation fix to avoid bringing up nonboot CPUs with ACPI EC disabled during thaw transitions from Bjørn Mork. - PM core fixes and cleanups from Ben Dooks, Leonardo Potenza, Ulf Hansson. - PNP subsystem fixes and cleanups from Dmitry Torokhov, Levente Kurusa, Rashika Kheria. - New tool for profiling system suspend from Todd E Brandt and a cpupower tool cleanup from One Thousand Gnomes" * tag 'pm+acpi-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (153 commits) thermal: exynos: boost: Automatic enable/disable of BOOST feature (at Exynos4412) cpufreq: exynos4x12: Change L0 driver data to CPUFREQ_BOOST_FREQ Documentation: cpufreq / boost: Update BOOST documentation cpufreq: exynos: Extend Exynos cpufreq driver to support boost cpufreq / boost: Kconfig: Support for software-managed BOOST acpi-cpufreq: Adjust the code to use the common boost attribute cpufreq: Add boost frequency support in core intel_pstate: Add trace point to report internal state. cpufreq: introduce cpufreq_generic_get() routine ARM: SA1100: Create dummy clk_get_rate() to avoid build failures cpufreq: stats: create sysfs entries when cpufreq_stats is a module cpufreq: stats: free table and remove sysfs entry in a single routine cpufreq: stats: remove hotplug notifiers cpufreq: stats: handle cpufreq_unregister_driver() and suspend/resume properly cpufreq: speedstep: remove unused speedstep_get_state platform: introduce OF style 'modalias' support for platform bus PM / tools: new tool for suspend/resume performance optimization ACPI: fix module autoloading for ACPI enumerated devices ACPI: add module autoloading support for ACPI enumerated devices ACPI: fix create_modalias() return value handling ...	2014-01-24 15:51:02 -08:00
Peter Zijlstra	5e3c1afd45	sched/x86/tsc: Initialize multiplier to 0 Since we keep the clock value linearly continuous on frequency change, make sure the initial multiplier is 0, such that our initial value is 0. Without this we compute the initial value at whatever the TSC has managed to reach since power-on. Reported-and-Tested-by: Markus Trippelsdorf <markus@trippelsdorf.de> Fixes: `20d1c86a57` ("sched/clock, x86: Rewrite cyc2ns() to avoid the need to disable IRQs") Cc: lenb@kernel.org Cc: rjw@rjwysocki.net Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com> Cc: rui.zhang@intel.com Cc: jacob.jun.pan@linux.intel.com Cc: Mike Galbraith <bitbucket@online.de> Cc: hpa@zytor.com Cc: paulmck@linux.vnet.ibm.com Cc: John Stultz <john.stultz@linaro.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: dyoung@redhat.com Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140123094804.GP30183@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-01-23 14:48:36 +01:00
Linus Torvalds	e1ba84597c	PCI changes for the v3.14 merge window: Resource management - Change pci_bus_region addresses to dma_addr_t (Bjorn Helgaas) - Support 64-bit AGP BARs (Bjorn Helgaas, Yinghai Lu) - Add pci_bus_address() to get bus address of a BAR (Bjorn Helgaas) - Use pci_resource_start() for CPU address of AGP BARs (Bjorn Helgaas) - Enforce bus address limits in resource allocation (Yinghai Lu) - Allocate 64-bit BARs above 4G when possible (Yinghai Lu) - Convert pcibios_resource_to_bus() to take pci_bus, not pci_dev (Yinghai Lu) PCI device hotplug - Major rescan/remove locking update (Rafael J. Wysocki) - Make ioapic builtin only (not modular) (Yinghai Lu) - Fix release/free issues (Yinghai Lu) - Clean up pciehp (Bjorn Helgaas) - Announce pciehp slot info during enumeration (Bjorn Helgaas) MSI - Add pci_msi_vec_count(), pci_msix_vec_count() (Alexander Gordeev) - Add pci_enable_msi_range(), pci_enable_msix_range() (Alexander Gordeev) - Deprecate "tri-state" interfaces: fail/success/fail+info (Alexander Gordeev) - Export MSI mode using attributes, not kobjects (Greg Kroah-Hartman) - Drop "irq" param from _restore_msi_irqs() (DuanZhenzhong) SR-IOV - Clear NumVFs when disabling SR-IOV in sriov_init() (ethan.zhao) Virtualization - Add support for save/restore of extended capabilities (Alex Williamson) - Add Virtual Channel to save/restore support (Alex Williamson) - Never treat a VF as a multifunction device (Alex Williamson) - Add pci_try_reset_function(), et al (Alex Williamson) AER - Ignore non-PCIe error sources (Betty Dall) - Support ACPI HEST error sources for domains other than 0 (Betty Dall) - Consolidate HEST error source parsers (Bjorn Helgaas) - Add a TLP header print helper (Borislav Petkov) Freescale i.MX6 - Remove unnecessary code (Fabio Estevam) - Make reset-gpio optional (Marek Vasut) - Report "link up" only after link training completes (Marek Vasut) - Start link in Gen1 before negotiating for Gen2 mode (Marek Vasut) - Fix PCIe startup code (Richard Zhu) Marvell MVEBU - Remove duplicate of_clk_get_by_name() call (Andrew Lunn) - Drop writes to bridge Secondary Status register (Jason Gunthorpe) - Obey bridge PCI_COMMAND_MEM and PCI_COMMAND_IO bits (Jason Gunthorpe) - Support a bridge with no IO port window (Jason Gunthorpe) - Use max_t() instead of max(resource_size_t,) (Jingoo Han) - Remove redundant of_match_ptr (Sachin Kamat) - Call pci_ioremap_io() at startup instead of dynamically (Thomas Petazzoni) NVIDIA Tegra - Disable Gen2 for Tegra20 and Tegra30 (Eric Brower) Renesas R-Car - Add runtime PM support (Valentine Barshak) - Fix rcar_pci_probe() return value check (Wei Yongjun) Synopsys DesignWare - Fix crash in dw_msi_teardown_irq() (Bjørn Erik Nilsen) - Remove redundant call to pci_write_config_word() (Bjørn Erik Nilsen) - Fix missing MSI IRQs (Harro Haan) - Add dw_pcie prefix before cfg_read/write (Pratyush Anand) - Fix I/O transfers by using CPU (not realio) address (Pratyush Anand) - Whitespace cleanup (Jingoo Han) EISA - Call put_device() if device_register() fails (Levente Kurusa) - Revert EISA initialization breakage ((Bjorn Helgaas) Miscellaneous - Remove unused code, including PCIe 3.0 interfaces (Stephen Hemminger) - Prevent bus conflicts while checking for bridge apertures (Bjorn Helgaas) - Stop clearing bridge Secondary Status when setting up I/O aperture (Bjorn Helgaas) - Use dev_is_pci() to identify PCI devices (Yijing Wang) - Deprecate DEFINE_PCI_DEVICE_TABLE (Joe Perches) - Update documentation 00-INDEX (Erik Ekman) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJS3ujEAAoJEFmIoMA60/r8A4EQAK9AZSUSVNWvlKdC1PrBfT3w 7fVILx5A4KWsOU8eoFwCPQLrgvUtMltg16yN2tbCjqpKEdrVc36biMO9bwhnXSyZ KopHKMWnn0sza/z2H8mcGy+0azGdWcIjcErX/a8WeS6zyWBjm+yzckrHNVpPu4Ca SpCBhfgBMjKyIZyLtP6juFSH34S2DfQex4oUSyPC+gjqPy5wW/xw/kBxZfOXl+yU P9pQT+geMIc31pETMdG9wd/TT+47YAui4ieSggoVxfVrphCXv6S8mOMCMuQc2bAy MHy9uFm1jbvKZZIYrzJ+9HFiiU/6MNiOO3Ygua52xuSp1Zrcjwi2CLD9/QBXbDVs pTKU5JIO7q43llkQUpIXTwBvEApSZRhuqzXegsMAYIg4AWmbfm/2fXkfWlQThYMp J48blAJZ4t0vhMr9usgwbtdBe8F5euExOxpwH0QMCMABbuu8/B3TLm39+LTcIbsw Efgm3N9iUTyiV5fe9Rr62nflhyqXjTevPl4wbZZe4OOdm0MXZY+/BzuNJhg3wyY8 QKz2J3FB6OR7BCLHCp80l50s5+Ih4F5kmOXwFKjT1D1MFRaNaPDmp9BY6TitU6hg zj55gP4c8x6n3alakbf972Yhgs/4oi1va8cZL+pCYWb8nPO5ldaMiT7QBBLUreQV BtDtC7u/AFWJ5e73+jVO =La1R -----END PGP SIGNATURE----- Merge tag 'pci-v3.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "PCI changes for the v3.14 merge window: Resource management - Change pci_bus_region addresses to dma_addr_t (Bjorn Helgaas) - Support 64-bit AGP BARs (Bjorn Helgaas, Yinghai Lu) - Add pci_bus_address() to get bus address of a BAR (Bjorn Helgaas) - Use pci_resource_start() for CPU address of AGP BARs (Bjorn Helgaas) - Enforce bus address limits in resource allocation (Yinghai Lu) - Allocate 64-bit BARs above 4G when possible (Yinghai Lu) - Convert pcibios_resource_to_bus() to take pci_bus, not pci_dev (Yinghai Lu) PCI device hotplug - Major rescan/remove locking update (Rafael J. Wysocki) - Make ioapic builtin only (not modular) (Yinghai Lu) - Fix release/free issues (Yinghai Lu) - Clean up pciehp (Bjorn Helgaas) - Announce pciehp slot info during enumeration (Bjorn Helgaas) MSI - Add pci_msi_vec_count(), pci_msix_vec_count() (Alexander Gordeev) - Add pci_enable_msi_range(), pci_enable_msix_range() (Alexander Gordeev) - Deprecate "tri-state" interfaces: fail/success/fail+info (Alexander Gordeev) - Export MSI mode using attributes, not kobjects (Greg Kroah-Hartman) - Drop "irq" param from _restore_msi_irqs() (DuanZhenzhong) SR-IOV - Clear NumVFs when disabling SR-IOV in sriov_init() (ethan.zhao) Virtualization - Add support for save/restore of extended capabilities (Alex Williamson) - Add Virtual Channel to save/restore support (Alex Williamson) - Never treat a VF as a multifunction device (Alex Williamson) - Add pci_try_reset_function(), et al (Alex Williamson) AER - Ignore non-PCIe error sources (Betty Dall) - Support ACPI HEST error sources for domains other than 0 (Betty Dall) - Consolidate HEST error source parsers (Bjorn Helgaas) - Add a TLP header print helper (Borislav Petkov) Freescale i.MX6 - Remove unnecessary code (Fabio Estevam) - Make reset-gpio optional (Marek Vasut) - Report "link up" only after link training completes (Marek Vasut) - Start link in Gen1 before negotiating for Gen2 mode (Marek Vasut) - Fix PCIe startup code (Richard Zhu) Marvell MVEBU - Remove duplicate of_clk_get_by_name() call (Andrew Lunn) - Drop writes to bridge Secondary Status register (Jason Gunthorpe) - Obey bridge PCI_COMMAND_MEM and PCI_COMMAND_IO bits (Jason Gunthorpe) - Support a bridge with no IO port window (Jason Gunthorpe) - Use max_t() instead of max(resource_size_t,) (Jingoo Han) - Remove redundant of_match_ptr (Sachin Kamat) - Call pci_ioremap_io() at startup instead of dynamically (Thomas Petazzoni) NVIDIA Tegra - Disable Gen2 for Tegra20 and Tegra30 (Eric Brower) Renesas R-Car - Add runtime PM support (Valentine Barshak) - Fix rcar_pci_probe() return value check (Wei Yongjun) Synopsys DesignWare - Fix crash in dw_msi_teardown_irq() (Bjørn Erik Nilsen) - Remove redundant call to pci_write_config_word() (Bjørn Erik Nilsen) - Fix missing MSI IRQs (Harro Haan) - Add dw_pcie prefix before cfg_read/write (Pratyush Anand) - Fix I/O transfers by using CPU (not realio) address (Pratyush Anand) - Whitespace cleanup (Jingoo Han) EISA - Call put_device() if device_register() fails (Levente Kurusa) - Revert EISA initialization breakage ((Bjorn Helgaas) Miscellaneous - Remove unused code, including PCIe 3.0 interfaces (Stephen Hemminger) - Prevent bus conflicts while checking for bridge apertures (Bjorn Helgaas) - Stop clearing bridge Secondary Status when setting up I/O aperture (Bjorn Helgaas) - Use dev_is_pci() to identify PCI devices (Yijing Wang) - Deprecate DEFINE_PCI_DEVICE_TABLE (Joe Perches) - Update documentation 00-INDEX (Erik Ekman)" * tag 'pci-v3.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (119 commits) Revert "EISA: Initialize device before its resources" Revert "EISA: Log device resources in dmesg" vfio-pci: Use pci "try" reset interface PCI: Check parent kobject in pci_destroy_dev() xen/pcifront: Use global PCI rescan-remove locking powerpc/eeh: Use global PCI rescan-remove locking PCI: Fix pci_check_and_unmask_intx() comment typos PCI: Add pci_try_reset_function(), pci_try_reset_slot(), pci_try_reset_bus() MPT / PCI: Use pci_stop_and_remove_bus_device_locked() platform / x86: Use global PCI rescan-remove locking PCI: hotplug: Use global PCI rescan-remove locking pcmcia: Use global PCI rescan-remove locking ACPI / hotplug / PCI: Use global PCI rescan-remove locking ACPI / PCI: Use global PCI rescan-remove locking in PCI root hotplug PCI: Add global pci_lock_rescan_remove() PCI: Cleanup pci.h whitespace PCI: Reorder so actual code comes before stubs PCI/AER: Support ACPI HEST AER error sources for PCI domains other than 0 ACPICA: Add helper macros to extract bus/segment numbers from HEST table. PCI: Make local functions static ...	2014-01-22 16:39:28 -08:00
Grygorii Strashko	9a28f9dc8d	x86/mm: memblock: switch to use NUMA_NO_NODE Update X86 code to use NUMA_NO_NODE instead of MAX_NUMNODES while calling memblock APIs, because memblock API will be changed to use NUMA_NO_NODE and will produce warning during boot otherwise. See: https://lkml.org/lkml/2013/12/9/898 Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Cc: Santosh Shilimkar <santosh.shilimkar@ti.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Tejun Heo <tj@kernel.org> Cc: Yinghai Lu <yinghai@kernel.org> Acked-by: David Rientjes <rientjes@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-21 16:19:47 -08:00
Santosh Shilimkar	5b6e529521	x86: memblock: set current limit to max low memory address The memblock current limit value is used to limit early boot memory allocations below max low memory address by default, as the kernel can access only to the low memory. Hence, set memblock current limit value to the max mapped low memory address instead of max mapped memory address. Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Cc: Tejun Heo <tj@kernel.org> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Paul Walmsley <paul@pwsan.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Russell King <linux@arm.linux.org.uk> Cc: Tony Lindgren <tony@atomide.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-21 16:19:46 -08:00
Linus Torvalds	d3bad75a6d	Driver core / sysfs patches for 3.14-rc1 Here's the big driver core and sysfs patch set for 3.14-rc1. There's a lot of work here moving sysfs logic out into a "kernfs" to allow other subsystems to also have a virtual filesystem with the same attributes of sysfs (handle device disconnect, dynamic creation / removal as needed / unneeded, etc. This is primarily being done for the cgroups filesystem, but the goal is to also move debugfs to it when it is ready, solving all of the known issues in that filesystem as well. The code isn't completed yet, but all should be stable now (there is a big section that was reverted due to problems found when testing.) There's also some other smaller fixes, and a driver core addition that allows for a "collection" of objects, that the DRM people will be using soon (it's in this tree to make merges after -rc1 easier.) All of this has been in linux-next with no reported issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iEYEABECAAYFAlLdh0cACgkQMUfUDdst+ylv4QCfeDKDgLo4LsaBIIrFSxLoH/c7 UUsAoMPRwA0h8wy+BQcJAg4H4J4maKj3 =0pc0 -----END PGP SIGNATURE----- Merge tag 'driver-core-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core / sysfs patches from Greg KH: "Here's the big driver core and sysfs patch set for 3.14-rc1. There's a lot of work here moving sysfs logic out into a "kernfs" to allow other subsystems to also have a virtual filesystem with the same attributes of sysfs (handle device disconnect, dynamic creation / removal as needed / unneeded, etc) This is primarily being done for the cgroups filesystem, but the goal is to also move debugfs to it when it is ready, solving all of the known issues in that filesystem as well. The code isn't completed yet, but all should be stable now (there is a big section that was reverted due to problems found when testing) There's also some other smaller fixes, and a driver core addition that allows for a "collection" of objects, that the DRM people will be using soon (it's in this tree to make merges after -rc1 easier) All of this has been in linux-next with no reported issues" * tag 'driver-core-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (113 commits) kernfs: associate a new kernfs_node with its parent on creation kernfs: add struct dentry declaration in kernfs.h kernfs: fix get_active failure handling in kernfs_seq_() Revert "kernfs: fix get_active failure handling in kernfs_seq_()" Revert "kernfs: replace kernfs_node->u.completion with kernfs_root->deactivate_waitq" Revert "kernfs: remove KERNFS_ACTIVE_REF and add kernfs_lockdep()" Revert "kernfs: remove KERNFS_REMOVED" Revert "kernfs: restructure removal path to fix possible premature return" Revert "kernfs: invoke kernfs_unmap_bin_file() directly from __kernfs_remove()" Revert "kernfs: remove kernfs_addrm_cxt" Revert "kernfs: make kernfs_get_active() block if the node is deactivated but not removed" Revert "kernfs: implement kernfs_{de\|re}activate[_self]()" Revert "kernfs, sysfs, driver-core: implement kernfs_remove_self() and its wrappers" Revert "pci: use device_remove_file_self() instead of device_schedule_callback()" Revert "scsi: use device_remove_file_self() instead of device_schedule_callback()" Revert "s390: use device_remove_file_self() instead of device_schedule_callback()" Revert "sysfs, driver-core: remove unused {sysfs\|device}_schedule_callback_owner()" Revert "kernfs: remove unnecessary NULL check in __kernfs_remove()" kernfs: remove unnecessary NULL check in __kernfs_remove() drivers/base: provide an infrastructure for componentised subsystems ...	2014-01-20 15:49:44 -08:00
Linus Torvalds	c9cdd9a6ae	Merge branch 'x86/mpx' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpufeature and mpx updates from Peter Anvin: "This includes the basic infrastructure for MPX (Memory Protection Extensions) support, but does not include MPX support itself. It is, however, a prerequisite for KVM support for MPX, which I believe will be pushed later this merge window by the KVM team. This includes moving the functionality in futex_atomic_cmpxchg_inatomic() into a new function in uaccess.h so it can be reused - this will be used by the final MPX patches. The actual MPX functionality (map management and so on) will be pushed in a future merge window, when ready" * 'x86/mpx' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/intel/mpx: Remove unused LWP structure x86, mpx: Add MPX related opcodes to the x86 opcode map x86: replace futex_atomic_cmpxchg_inatomic() with user_atomic_cmpxchg_inatomic x86: add user_atomic_cmpxchg_inatomic at uaccess.h x86, xsave: Support eager-only xsave features, add MPX support x86, cpufeature: Define the Intel MPX feature flag	2014-01-20 14:46:32 -08:00
Linus Torvalds	f4bcd8ccdd	Merge branch 'x86-kaslr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 kernel address space randomization support from Peter Anvin: "This enables kernel address space randomization for x86" * 'x86-kaslr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, kaslr: Clarify RANDOMIZE_BASE_MAX_OFFSET x86, kaslr: Remove unused including <linux/version.h> x86, kaslr: Use char array to gain sizeof sanity x86, kaslr: Add a circular multiply for better bit diffusion x86, kaslr: Mix entropy sources together as needed x86/relocs: Add percpu fixup for GNU ld 2.23 x86, boot: Rename get_flags() and check_flags() to *_cpuflags() x86, kaslr: Raise the maximum virtual address to -1 GiB on x86_64 x86, kaslr: Report kernel offset on panic x86, kaslr: Select random position from e820 maps x86, kaslr: Provide randomness functions x86, kaslr: Return location from decompress_kernel x86, boot: Move CPU flags out of cpucheck x86, relocs: Add more per-cpu gold special cases	2014-01-20 14:45:50 -08:00
Linus Torvalds	7fe67a1180	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull leftover x86 fixes from Ingo Molnar: "Two leftover fixes that did not make it into v3.13" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Add check for number of available vectors before CPU down x86, cpu, amd: Add workaround for family 16h, erratum 793	2014-01-20 12:11:41 -08:00
Linus Torvalds	fab5669d55	Merge branch 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 RAS changes from Ingo Molnar: - SCI reporting for other error types not only correctable ones - GHES cleanups - Add the functionality to override error reporting agents as some machines are sporting a new extended error logging capability which, if done properly in the BIOS, makes a corresponding EDAC module redundant - PCIe AER tracepoint severity levels fix - Error path correction for the mce device init - MCE timer fix - Add more flexibility to the error injection (EINJ) debugfs interface * 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, mce: Fix mce_start_timer semantics ACPI, APEI, GHES: Cleanup ghes memory error handling ACPI, APEI: Cleanup alignment-aware accesses ACPI, APEI, GHES: Do not report only correctable errors with SCI ACPI, APEI, EINJ: Changes to the ACPI/APEI/EINJ debugfs interface ACPI, eMCA: Combine eMCA/EDAC event reporting priority EDAC, sb_edac: Modify H/W event reporting policy EDAC: Add an edac_report parameter to EDAC PCI, AER: Fix severity usage in aer trace event x86, mce: Call put_device on device_register failure	2014-01-20 12:10:27 -08:00
Linus Torvalds	74e8ee8262	Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull Intel SoC changes from Ingo Molnar: "Improved Intel SoC platform support" * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, tsc, apic: Unbreak static (MSR) calibration when CONFIG_X86_LOCAL_APIC=n x86, tsc: Add static (MSR) TSC calibration on Intel Atom SoCs arch: x86: New MailBox support driver for Intel SOC's	2014-01-20 12:09:31 -08:00
Linus Torvalds	2bb2c5e235	Merge branch 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 microcode loader updates from Ingo Molnar: "There are two main changes in this tree: - AMD microcode early loading fixes - some microcode loader source files reorganization" * 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, microcode: Move to a proper location x86, microcode, AMD: Fix early ucode loading x86, microcode: Share native MSR accessing variants x86, ramdisk: Export relocated ramdisk VA	2014-01-20 12:07:54 -08:00
Linus Torvalds	972d5e7e5b	Merge branch 'x86-efi-kexec-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 EFI changes from Ingo Molnar: "This consists of two main parts: - New static EFI runtime services virtual mapping layout which is groundwork for kexec support on EFI (Borislav Petkov) - EFI kexec support itself (Dave Young)" * 'x86-efi-kexec-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) x86/efi: parse_efi_setup() build fix x86: ksysfs.c build fix x86/efi: Delete superfluous global variables x86: Reserve setup_data ranges late after parsing memmap cmdline x86: Export x86 boot_params to sysfs x86: Add xloadflags bit for EFI runtime support on kexec x86/efi: Pass necessary EFI data for kexec via setup_data efi: Export EFI runtime memory mapping to sysfs efi: Export more EFI table variables to sysfs x86/efi: Cleanup efi_enter_virtual_mode() function x86/efi: Fix off-by-one bug in EFI Boot Services reservation x86/efi: Add a wrapper function efi_map_region_fixed() x86/efi: Remove unused variables in __map_region() x86/efi: Check krealloc return value x86/efi: Runtime services virtual mapping x86/mm/cpa: Map in an arbitrary pgd x86/mm/pageattr: Add last levels of error path x86/mm/pageattr: Add a PUD error unwinding path x86/mm/pageattr: Add a PTE pagetable populating function x86/mm/pageattr: Add a PMD pagetable populating function ...	2014-01-20 12:05:30 -08:00
Linus Torvalds	5d4863e4cc	Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 TLB detection update from Ingo Molnar: "A single change that extends our TLB cache size detection+reporting code" * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, cpu: Detect more TLB configuration	2014-01-20 12:04:45 -08:00
Linus Torvalds	2a0fede97f	Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: "Misc cleanups" * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, cpu, amd: Fix a shadowed variable situation um, x86: Fix vDSO build x86: Delete non-required instances of include <linux/init.h> x86, realmode: Pointer walk cleanups, pull out invariant use of __pa() x86/traps: Clean up error exception handler definitions	2014-01-20 12:03:57 -08:00
Linus Torvalds	1a7dbbcc8c	Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/apic changes from Ingo Molnar: "Two main changes: - improve local APIC Error Status Register reporting robustness - add the 'disable_cpu_apicid=x' boot parameter for kexec booting" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, apic: Make disabled_cpu_apicid static read_mostly, fix typos x86, apic, kexec: Add disable_cpu_apicid kernel parameter x86/apic: Read Error Status Register correctly	2014-01-20 11:50:01 -08:00
Linus Torvalds	a0fa1dd3cd	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler changes from Ingo Molnar: - Add the initial implementation of SCHED_DEADLINE support: a real-time scheduling policy where tasks that meet their deadlines and periodically execute their instances in less than their runtime quota see real-time scheduling and won't miss any of their deadlines. Tasks that go over their quota get delayed (Available to privileged users for now) - Clean up and fix preempt_enable_no_resched() abuse all around the tree - Do sched_clock() performance optimizations on x86 and elsewhere - Fix and improve auto-NUMA balancing - Fix and clean up the idle loop - Apply various cleanups and fixes * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits) sched: Fix __sched_setscheduler() nice test sched: Move SCHED_RESET_ON_FORK into attr::sched_flags sched: Fix up attr::sched_priority warning sched: Fix up scheduler syscall LTP fails sched: Preserve the nice level over sched_setscheduler() and sched_setparam() calls sched/core: Fix htmldocs warnings sched/deadline: No need to check p if dl_se is valid sched/deadline: Remove unused variables sched/deadline: Fix sparse static warnings m68k: Fix build warning in mac_via.h sched, thermal: Clean up preempt_enable_no_resched() abuse sched, net: Fixup busy_loop_us_clock() sched, net: Clean up preempt_enable_no_resched() abuse sched/preempt: Fix up missed PREEMPT_NEED_RESCHED folding sched/preempt, locking: Rework local_bh_{dis,en}able() sched/clock, x86: Avoid a runtime condition in native_sched_clock() sched/clock: Fix up clear_sched_clock_stable() sched/clock, x86: Use a static_key for sched_clock_stable sched/clock: Remove local_irq_disable() from the clocks sched/clock, x86: Rewrite cyc2ns() to avoid the need to disable IRQs ...	2014-01-20 10:42:08 -08:00
Linus Torvalds	9326657abe	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "Kernel side changes: - Add Intel RAPL energy counter support (Stephane Eranian) - Clean up uprobes (Oleg Nesterov) - Optimize ring-buffer writes (Peter Zijlstra) Tooling side changes, user visible: - 'perf diff': - Add column colouring improvements (Ramkumar Ramachandra) - 'perf kvm': - Add guest related improvements, including allowing to specify a directory with guest specific /proc information (Dongsheng Yang) - Add shell completion support (Ramkumar Ramachandra) - Add '-v' option (Dongsheng Yang) - Support --guestmount (Dongsheng Yang) - 'perf probe': - Support showing source code, asking for variables to be collected at probe time and other 'perf probe' operations that use DWARF information. This supports only binaries with debugging information at this time, detached debuginfo (aka debuginfo packages) support should come in later patches (Masami Hiramatsu) - 'perf record': - Rename --no-delay option to --no-buffering, better reflecting its purpose and freeing up '--delay' to take the place of '--initial-delay', so that 'record' and 'stat' are consistent (Arnaldo Carvalho de Melo) - Default the -t/--thread option to no inheritance (Adrian Hunter) - Make per-cpu mmaps the default (Adrian Hunter) - 'perf report': - Improve callchain processing performance (Frederic Weisbecker) - Retain bfd reference to lookup source line numbers, greatly optimizing, among other use cases, 'perf report -s srcline' (Adrian Hunter) - Improve callchain processing performance even more (Namhyung Kim) - Add a perf.data file header window in the 'perf report' TUI, associated with the 'i' hotkey, providing a counterpart to the --header option in the stdio UI (Namhyung Kim) - 'perf script': - Add an option in 'perf script' to print the source line number (Adrian Hunter) - Add --header/--header-only options to 'script' and 'report', the default is not tho show the header info, but as this has been the default for some time, leave a single line explaining how to obtain that information (Jiri Olsa) - Add options to show comm, fork, exit and mmap PERF_RECORD_ events (Namhyung Kim) - Print callchains and symbols if they exist (David Ahern) - 'perf timechart' - Add backtrace support to CPU info - Print pid along the name - Add support for CPU topology - Add new option --highlight'ing threads, be it by name or, if a numeric value is provided, that run more than given duration (Stanislav Fomichev) - 'perf top': - Make 'perf top -g' refer to callchains, for consistency with other tools (David Ahern) - 'perf trace': - Handle old kernels where the "raw_syscalls" tracepoints were called plain "syscalls" (David Ahern) - Remove thread summary coloring, by Pekka Enberg. - Honour -m option in 'trace', the tool was offering the option to set the mmap size, but wasn't using it when doing the actual mmap on the events file descriptors (Jiri Olsa) - generic: - Backport libtraceevent plugin support (trace-cmd repository, with plugins for jbd2, hrtimer, kmem, kvm, mac80211, sched_switch, function, xen, scsi, cfg80211 (Jiri Olsa) - Print session information only if --stdio is given (Namhyung Kim) Tooling side changes, developer visible (plumbing): - Improve 'perf probe' exit path, release resources (Masami Hiramatsu) - Improve libtraceevent plugins exit path, allowing the registering of an unregister handler to be called at exit time (Namhyung Kim) - Add an alias to the build test makefile (make -C tools/perf build-test) (Namhyung Kim) - Get rid of die() and friends (good riddance!) in libtraceevent (Namhyung Kim) - Fix cross build problems related to pkgconfig and CROSS_COMPILE not being propagated to the feature tests, leading to features being tested in the host and then being enabled on the target (Mark Rutland) - Improve forked workload error reporting by sending the errno in the signal data queueing integer field, using sigqueue and by doing the signal setup in the evlist methods, removing open coded equivalents in various tools (Arnaldo Carvalho de Melo) - Do more auto exit cleanup chores in the 'evlist' destructor, so that the tools don't have to all do that sequence (Arnaldo Carvalho de Melo) - Pack 'struct perf_session_env' and 'struct trace' (Arnaldo Carvalho de Melo) - Add test for building detached source tarballs (Arnaldo Carvalho de Melo) - Move some header files (tools/perf/ to tools/include/ to make them available to other tools/ dwelling codebases (Namhyung Kim) - Move logic to warn about kptr_restrict'ed kernels to separate function in 'report' (Arnaldo Carvalho de Melo) - Move hist browser selection code to separate function (Arnaldo Carvalho de Melo) - Move histogram entries collapsing to separate function (Arnaldo Carvalho de Melo) - Introduce evlist__for_each() & friends (Arnaldo Carvalho de Melo) - Automate setup of FEATURE_CHECK_(C\|LD)FLAGS-all variables (Jiri Olsa) - Move arch setup into seprate Makefile (Jiri Olsa) - Make libtraceevent install target quieter (Jiri Olsa) - Make tests/make output more compact (Jiri Olsa) - Ignore generated files in feature-checks (Chunwei Chen) - Introduce pevent_filter_strerror() in libtraceevent, similar in purpose to libc's strerror() function (Namhyung Kim) - Use perf_data_file methods to write output file in 'record' and 'inject' (Jiri Olsa) - Use pr_() functions where applicable in 'report' (Namhyumg Kim) - Add 'machine' 'addr_location' struct to have full picture (machine, thread, map, symbol, addr) for a (partially) resolved address, reducing function signatures (Arnaldo Carvalho de Melo) - Reduce code duplication in the histogram entry creation/insertion (Arnaldo Carvalho de Melo) - Auto allocate annotation histogram data structures (Arnaldo Carvalho de Melo) - No need to test against NULL before calling free, also set freed memory in struct pointers to NULL, to help fixing use after free bugs (Arnaldo Carvalho de Melo) - Rename some struct DSO binary_type related members and methods, to clarify its purpose and need for differentiation (symtab_type, ie one is about the files .text, CFI, etc, i.e. its binary contents, and the other is about where the symbol table came from (Arnaldo Carvalho de Melo) - Convert to new topic libraries, starting with an API one (sysfs, debugfs, etc), renaming liblk in the process (Borislav Petkov) - Get rid of some more panic() like error handling in libtraceevent. (Namhyung Kim) - Get rid of panic() like calls in libtraceevent (Namyung Kim) - Start carving out symbol parsing routines (perf, just moving routines to topic files in tools/lib/symbol/, tools that want to use it need to integrate it directly, ie no tools/lib/symbol/Makefile is provided (Arnaldo Carvalho de Melo) - Assorted refactoring patches, moving code around and adding utility evlist methods that will be used in the IPT patchset (Adrian Hunter) - Assorted mmap_pages handling fixes (Adrian Hunter) - Several man pages typo fixes (Dongsheng Yang) - Get rid of several die() calls in libtraceevent (Namhyung Kim) - Use basename() in a more robust way, to avoid problems related to different system library implementations for that function (Stephane Eranian) - Remove open coded management of short_name_allocated member (Adrian Hunter) - Several cleanups in the "dso" methods, constifying some parameters and renaming some fields to clarify its purpose (Arnaldo Carvalho de Melo) - Add per-feature check flags, fixing libunwind related build problems on some architectures (Jean Pihet) - Do not disable source line lookup just because of one failure. (Adrian Hunter) - Several 'perf kvm' man page corrections (Dongsheng Yang) - Correct the message in feature-libnuma checking, swowing the right devel package names for various distros (Dongsheng Yang) - Polish 'readn()' function and introduce its counterpart, 'writen()' (Jiri Olsa) - Start moving timechart state from global variables to a 'perf_tool' derived 'timechart' struct (Arnaldo Carvalho de Melo) ... and lots of fixes and improvements I forgot to list" 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (282 commits) perf tools: Remove unnecessary callchain cursor state restore on unmatch perf callchain: Spare double comparison of callchain first entry perf tools: Do proper comm override error handling perf symbols: Export elf_section_by_name and reuse perf probe: Release all dynamically allocated parameters perf probe: Release allocated probe_trace_event if failed perf tools: Add 'build-test' make target tools lib traceevent: Unregister handler when xen plugin is unloaded tools lib traceevent: Unregister handler when scsi plugin is unloaded tools lib traceevent: Unregister handler when jbd2 plugin is is unloaded tools lib traceevent: Unregister handler when cfg80211 plugin is unloaded tools lib traceevent: Unregister handler when mac80211 plugin is unloaded tools lib traceevent: Unregister handler when sched_switch plugin is unloaded tools lib traceevent: Unregister handler when kvm plugin is unloaded tools lib traceevent: Unregister handler when kmem plugin is unloaded tools lib traceevent: Unregister handler when hrtimer plugin is unloaded tools lib traceevent: Unregister handler when function plugin is unloaded tools lib traceevent: Add pevent_unregister_print_function() tools lib traceevent: Add pevent_unregister_event_handler() tools lib traceevent: fix pointer-integer size mismatch ...	2014-01-20 10:28:30 -08:00
Linus Torvalds	2cc3f16cad	Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull IRQ changes from Ingo Molnar: "The only change in this cycle is a CPU hotplug related spurious warning fix" * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/irq: Fix kbuild warning in smp_irq_move_cleanup_interrupt() x86/irq: Fix do_IRQ() interrupt warning for cpu hotplug retriggered irqs	2014-01-20 10:27:52 -08:00
H. Peter Anvin	ca1e631c3a	x86, tsc, apic: Unbreak static (MSR) calibration when CONFIG_X86_LOCAL_APIC=n If we aren't going to use the local APIC anyway, we obviously don't care about its timer frequency. Link: http://lkml.kernel.org/r/tip-rgm7xmg7k6qnjlw3ynkcjsmh@git.kernel.org Reported-by: Fengguang Wu <fengguang.wu@intel.com> Cc: Bin Gao <bin.gao@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-01-16 13:00:21 -08:00
Ingo Molnar	860fc2f264	Merge branch 'perf/urgent' into perf/core Pick up the latest fixes, refresh the development tree. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-01-16 09:33:30 +01:00
Robert Richter	bee09ed91c	perf/x86/amd/ibs: Fix waking up from S3 for AMD family 10h On AMD family 10h we see following error messages while waking up from S3 for all non-boot CPUs leading to a failed IBS initialization: Enabling non-boot CPUs ... smpboot: Booting Node 0 Processor 1 APIC 0x1 [Firmware Bug]: cpu 1, try to use APIC500 (LVT offset 0) for vector 0x400, but the register is already in use for vector 0xf9 on another cpu perf: IBS APIC setup failed on cpu #1 process: Switch to broadcast mode on CPU1 CPU1 is up ... ACPI: Waking up from system sleep state S3 Reason for this is that during suspend the LVT offset for the IBS vector gets lost and needs to be reinialized while resuming. The offset is read from the IBSCTL msr. On family 10h the offset needs to be 1 as offset 0 is used for the MCE threshold interrupt, but firmware assings it for IBS to 0 too. The kernel needs to reprogram the vector. The msr is a readonly node msr, but a new value can be written via pci config space access. The reinitialization is implemented for family 10h in setup_ibs_ctl() which is forced during IBS setup. This patch fixes IBS setup after waking up from S3 by adding resume/supend hooks for the boot cpu which does the offset reinitialization. Marking it as stable to let distros pick up this fix. Signed-off-by: Robert Richter <rric@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> v3.2.. Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1389797849-5565-1-git-send-email-rric.net@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-01-16 09:19:50 +01:00
Bin Gao	7da7c15613	x86, tsc: Add static (MSR) TSC calibration on Intel Atom SoCs On SoCs that have the calibration MSRs available, either there is no PIT, HPET or PMTIMER to calibrate against, or the PIT/HPET/PMTIMER is driven from the same clock as the TSC, so calibration is redundant and just slows down the boot. TSC rate is caculated by this formula: <maximum core-clock to bus-clock ratio> * <maximum resolved frequency> The ratio and the resolved frequency ID can be obtained from MSR. See Intel 64 and IA-32 System Programming Guid section 16.12 and 30.11.5 for details. Signed-off-by: Bin Gao <bin.gao@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/n/tip-rgm7xmg7k6qnjlw3ynkcjsmh@git.kernel.org	2014-01-15 22:28:48 -08:00
Prarit Bhargava	da6139e49c	x86: Add check for number of available vectors before CPU down Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64791 When a cpu is downed on a system, the irqs on the cpu are assigned to other cpus. It is possible, however, that when a cpu is downed there aren't enough free vectors on the remaining cpus to account for the vectors from the cpu that is being downed. This results in an interesting "overflow" condition where irqs are "assigned" to a CPU but are not handled. For example, when downing cpus on a 1-64 logical processor system: <snip> [ 232.021745] smpboot: CPU 61 is now offline [ 238.480275] smpboot: CPU 62 is now offline [ 245.991080] ------------[ cut here ]------------ [ 245.996270] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:264 dev_watchdog+0x246/0x250() [ 246.005688] NETDEV WATCHDOG: p786p1 (ixgbe): transmit queue 0 timed out [ 246.013070] Modules linked in: lockd sunrpc iTCO_wdt iTCO_vendor_support sb_edac ixgbe microcode e1000e pcspkr joydev edac_core lpc_ich ioatdma ptp mdio mfd_core i2c_i801 dca pps_core i2c_core wmi acpi_cpufreq isci libsas scsi_transport_sas [ 246.037633] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.0+ #14 [ 246.044451] Hardware name: Intel Corporation S4600LH ........../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013 [ 246.057371] 0000000000000009 ffff88081fa03d40 ffffffff8164fbf6 ffff88081fa0ee48 [ 246.065728] ffff88081fa03d90 ffff88081fa03d80 ffffffff81054ecc ffff88081fa13040 [ 246.074073] 0000000000000000 ffff88200cce0000 0000000000000040 0000000000000000 [ 246.082430] Call Trace: [ 246.085174] <IRQ> [<ffffffff8164fbf6>] dump_stack+0x46/0x58 [ 246.091633] [<ffffffff81054ecc>] warn_slowpath_common+0x8c/0xc0 [ 246.098352] [<ffffffff81054fb6>] warn_slowpath_fmt+0x46/0x50 [ 246.104786] [<ffffffff815710d6>] dev_watchdog+0x246/0x250 [ 246.110923] [<ffffffff81570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80 [ 246.119097] [<ffffffff8106092a>] call_timer_fn+0x3a/0x110 [ 246.125224] [<ffffffff8106280f>] ? update_process_times+0x6f/0x80 [ 246.132137] [<ffffffff81570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80 [ 246.140308] [<ffffffff81061db0>] run_timer_softirq+0x1f0/0x2a0 [ 246.146933] [<ffffffff81059a80>] __do_softirq+0xe0/0x220 [ 246.152976] [<ffffffff8165fedc>] call_softirq+0x1c/0x30 [ 246.158920] [<ffffffff810045f5>] do_softirq+0x55/0x90 [ 246.164670] [<ffffffff81059d35>] irq_exit+0xa5/0xb0 [ 246.170227] [<ffffffff8166062a>] smp_apic_timer_interrupt+0x4a/0x60 [ 246.177324] [<ffffffff8165f40a>] apic_timer_interrupt+0x6a/0x70 [ 246.184041] <EOI> [<ffffffff81505a1b>] ? cpuidle_enter_state+0x5b/0xe0 [ 246.191559] [<ffffffff81505a17>] ? cpuidle_enter_state+0x57/0xe0 [ 246.198374] [<ffffffff81505b5d>] cpuidle_idle_call+0xbd/0x200 [ 246.204900] [<ffffffff8100b7ae>] arch_cpu_idle+0xe/0x30 [ 246.210846] [<ffffffff810a47b0>] cpu_startup_entry+0xd0/0x250 [ 246.217371] [<ffffffff81646b47>] rest_init+0x77/0x80 [ 246.223028] [<ffffffff81d09e8e>] start_kernel+0x3ee/0x3fb [ 246.229165] [<ffffffff81d0989f>] ? repair_env_string+0x5e/0x5e [ 246.235787] [<ffffffff81d095a5>] x86_64_start_reservations+0x2a/0x2c [ 246.242990] [<ffffffff81d0969f>] x86_64_start_kernel+0xf8/0xfc [ 246.249610] ---[ end trace fb74fdef54d79039 ]--- [ 246.254807] ixgbe 0000:c2:00.0 p786p1: initiating reset due to tx timeout [ 246.262489] ixgbe 0000:c2:00.0 p786p1: Reset adapter Last login: Mon Nov 11 08:35:14 from 10.18.17.119 [root@(none) ~]# [ 246.792676] ixgbe 0000:c2:00.0 p786p1: detected SFP+: 5 [ 249.231598] ixgbe 0000:c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX [ 246.792676] ixgbe 0000:c2:00.0 p786p1: detected SFP+: 5 [ 249.231598] ixgbe 0000:c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX (last lines keep repeating. ixgbe driver is dead until module reload.) If the downed cpu has more vectors than are free on the remaining cpus on the system, it is possible that some vectors are "orphaned" even though they are assigned to a cpu. In this case, since the ixgbe driver had a watchdog, the watchdog fired and notified that something was wrong. This patch adds a function, check_vectors(), to compare the number of vectors on the CPU going down and compares it to the number of vectors available on the system. If there aren't enough vectors for the CPU to go down, an error is returned and propogated back to userspace. v2: Do not need to look at percpu irqs v3: Need to check affinity to prevent counting of MSIs in IOAPIC Lowest Priority Mode v4: Additional changes suggested by Gong Chen. v5/v6/v7/v8: Updated comment text Signed-off-by: Prarit Bhargava <prarit@redhat.com> Link: http://lkml.kernel.org/r/1389613861-3853-1-git-send-email-prarit@redhat.com Reviewed-by: Gong Chen <gong.chen@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michel Lespinasse <walken@google.com> Cc: Seiji Aguchi <seiji.aguchi@hds.com> Cc: Yang Zhang <yang.z.zhang@Intel.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Janet Morgan <janet.morgan@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Ruiv Wang <ruiv.wang@gmail.com> Cc: Gong Chen <gong.chen@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org>	2014-01-15 22:24:02 -08:00
H. Peter Anvin	5b4d1dbc24	x86, apic: Make disabled_cpu_apicid static read_mostly, fix typos Make disabled_cpu_apicid static and read_mostly, and fix a couple of typos. Reported-by: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/20140115182511.GA22737@gmail.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>	2014-01-15 13:02:08 -08:00
HATAYAMA Daisuke	151e0c7de6	x86, apic, kexec: Add disable_cpu_apicid kernel parameter Add disable_cpu_apicid kernel parameter. To use this kernel parameter, specify an initial APIC ID of the corresponding CPU you want to disable. This is mostly used for the kdump 2nd kernel to disable BSP to wake up multiple CPUs without causing system reset or hang due to sending INIT from AP to BSP. Kdump users first figure out initial APIC ID of the BSP, CPU0 in the 1st kernel, for example from /proc/cpuinfo and then set up this kernel parameter for the 2nd kernel using the obtained APIC ID. However, doing this procedure at each boot time manually is awkward, which should be automatically done by user-land service scripts, for example, kexec-tools on fedora/RHEL distributions. This design is more flexible than disabling BSP in kernel boot time automatically in that in kernel boot time we have no choice but referring to ACPI/MP table to obtain initial APIC ID for BSP, meaning that the method is not applicable to the systems without such BIOS tables. One assumption behind this design is that users get initial APIC ID of the BSP in still healthy state and so BSP is uniquely kept in CPU0. Thus, through the kernel parameter, only one initial APIC ID can be specified. In a comparison with disabled_cpu_apicid, we use read_apic_id(), not boot_cpu_physical_apicid, because on some platforms, the variable is modified to the apicid reported as BSP through MP table and this function is executed with the temporarily modified boot_cpu_physical_apicid. As a result, disabled_cpu_apicid kernel parameter doesn't work well for apicids of APs. Fixing the wrong handling of boot_cpu_physical_apicid requires some reviews and tests beyond some platforms and it could take some time. The fix here is a kind of workaround to focus on the main topic of this patch. Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Link: http://lkml.kernel.org/r/20140115064458.1545.38775.stgit@localhost6.localdomain6 Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2014-01-15 09:19:20 -08:00
Borislav Petkov	d139336700	x86, cpu, amd: Fix a shadowed variable situation Having u32 and struct cpuinfo_x86 * by the same name is not very smart, although it was ok in this case due to the limited scope of u32 c and it being used only once in there. Fix this. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1389786735-16751-1-git-send-email-bp@alien8.de Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2014-01-15 04:21:45 -08:00
Borislav Petkov	3b56496865	x86, cpu, amd: Add workaround for family 16h, erratum 793 This adds the workaround for erratum 793 as a precaution in case not every BIOS implements it. This addresses CVE-2013-6885. Erratum text: [Revision Guide for AMD Family 16h Models 00h-0Fh Processors, document 51810 Rev. 3.04 November 2013] 793 Specific Combination of Writes to Write Combined Memory Types and Locked Instructions May Cause Core Hang Description Under a highly specific and detailed set of internal timing conditions, a locked instruction may trigger a timing sequence whereby the write to a write combined memory type is not flushed, causing the locked instruction to stall indefinitely. Potential Effect on System Processor core hang. Suggested Workaround BIOS should set MSR C001_1020[15] = 1b. Fix Planned No fix planned [ hpa: updated description, fixed typo in MSR name ] Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/20140114230711.GS29865@pd.tnic Tested-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-01-14 16:39:07 -08:00
Richard Weinberger	60283df7ac	x86/apic: Read Error Status Register correctly Currently we do a read, a dummy write and a final read to fetch the error code. The value from the final read is taken. This is not the recommended way and leads to corrupted/lost ESR values. Intel(c) 64 and IA-32 Architectures Software Developer's Manual, Combined Volumes 1, 2ABC, 3ABC, Section 10.5.3 states: Before attempt to read from the ESR, software should first write to it. (The value written does not affect the values read subsequently; only zero may be written in x2APIC mode.) This write clears any previously logged errors and updates the ESR with any errors detected since the last write to the ESR. This write also rearms the APIC error interrupt triggering mechanism. This patch removes the first read such that we are conform with the manual. On my (very old) Pentium MMX SMP system this patch fixes the issue that APIC errors: a) are not always reported and b) are reported with false error numbers. Signed-off-by: Richard Weinberger <richard@nod.at> Cc: seiji.aguchi@hds.com Cc: rientjes@google.com Cc: konrad.wilk@oracle.com Cc: bp@alien8.de Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1389685487-20872-1-git-send-email-richard@nod.at Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-01-14 14:05:36 +01:00
Borislav Petkov	bad5fa631f	x86, microcode: Move to a proper location We've grown a bunch of microcode loader files all prefixed with "microcode_". They should be under cpu/ because this is strictly CPU-related functionality so do that and drop the prefix since they're in their own directory now which gives that prefix. :) While at it, drop MICROCODE_INTEL_LIB config item and stash the functionality under CONFIG_MICROCODE_INTEL as it was its only user. Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>	2014-01-13 20:00:12 +01:00
Borislav Petkov	5335ba5cf4	x86, microcode, AMD: Fix early ucode loading The original idea to use the microcode cache for the APs doesn't pan out because we do memory allocation there very early and with IRQs disabled and we don't want to involve GFP_ATOMIC allocations. Not if it can be helped. Thus, extend the caching of the BSP patch approach to the APs and iterate over the ucode in the initrd instead of using the cache. We still save the relevant patches to it but later, right before we jettison the initrd. While at it, fix early ucode loading on 32-bit too. Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>	2014-01-13 19:59:38 +01:00
Borislav Petkov	e1b43e3f13	x86, microcode: Share native MSR accessing variants We want to use those in AMD's early loading path too. Also, add a native_wrmsrl variant. Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>	2014-01-13 19:57:27 +01:00
Borislav Petkov	5aa3d718f2	x86, ramdisk: Export relocated ramdisk VA The ramdisk can possibly get relocated if the whole image is not mapped. And since we're going over it in the microcode loader and fishing out the relevant microcode patches, we want access it at its new location. Thus, export it. Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>	2014-01-13 19:56:25 +01:00
Ingo Molnar	c9c8986847	Merge branch 'x86/idle' into sched/core Merge these x86 specific bits - we are going to add generic bits as well. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-01-13 17:37:05 +01:00
Peter Zijlstra	10b033d434	sched/clock, x86: Avoid a runtime condition in native_sched_clock() Use a static_key to avoid touching tsc_disabled and a runtime condition in native_sched_clock() -- less cachelines touched is always better. MAINLINE PRE POST sched_clock_stable: 1 1 1 (cold) sched_clock: 329841 215295 213039 (cold) local_clock: 301773 220773 216084 (warm) sched_clock: 38375 25659 25231 (warm) local_clock: 100371 27242 27601 (warm) rdtsc: 27340 24208 24203 sched_clock_stable: 0 0 0 (cold) sched_clock: 382634 237019 240055 (cold) local_clock: 396890 294819 299942 (warm) sched_clock: 38194 25609 25276 (warm) local_clock: 143452 71232 73232 (warm) rdtsc: 27345 24243 24244 Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-hrz87bo37qke25bty6pnfy4b@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-01-13 15:13:17 +01:00
Peter Zijlstra	35af99e646	sched/clock, x86: Use a static_key for sched_clock_stable In order to avoid the runtime condition and variable load turn sched_clock_stable into a static_key. Also provide a shorter implementation of local_clock() and cpu_clock(int) when sched_clock_stable==1. MAINLINE PRE POST sched_clock_stable: 1 1 1 (cold) sched_clock: 329841 221876 215295 (cold) local_clock: 301773 234692 220773 (warm) sched_clock: 38375 25602 25659 (warm) local_clock: 100371 33265 27242 (warm) rdtsc: 27340 24214 24208 sched_clock_stable: 0 0 0 (cold) sched_clock: 382634 235941 237019 (cold) local_clock: 396890 297017 294819 (warm) sched_clock: 38194 25233 25609 (warm) local_clock: 143452 71234 71232 (warm) rdtsc: 27345 24245 24243 Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-eummbdechzz37mwmpags1gjr@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-01-13 15:13:13 +01:00
Peter Zijlstra	20d1c86a57	sched/clock, x86: Rewrite cyc2ns() to avoid the need to disable IRQs Use a ring-buffer like multi-version object structure which allows always having a coherent object; we use this to avoid having to disable IRQs while reading sched_clock() and avoids a problem when getting an NMI while changing the cyc2ns data. MAINLINE PRE POST sched_clock_stable: 1 1 1 (cold) sched_clock: 329841 331312 257223 (cold) local_clock: 301773 310296 309889 (warm) sched_clock: 38375 38247 25280 (warm) local_clock: 100371 102713 85268 (warm) rdtsc: 27340 27289 24247 sched_clock_stable: 0 0 0 (cold) sched_clock: 382634 372706 301224 (cold) local_clock: 396890 399275 399870 (warm) sched_clock: 38194 38124 25630 (warm) local_clock: 143452 148698 129629 (warm) rdtsc: 27345 27365 24307 Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-s567in1e5ekq2nlyhn8f987r@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-01-13 15:13:06 +01:00
Prarit Bhargava	c7a730fa46	x86/irq: Fix kbuild warning in smp_irq_move_cleanup_interrupt() Fengguang Wu's 0day kernel build service reported the following build warning: arch/x86/kernel/apic/io_apic.c:2211 smp_irq_move_cleanup_interrupt() warn: always true condition '(irq <= -1) => (0-u32max <= (-1))' because irq is defined as an unsigned int instead of an int. Fix this trivial error by redefining irq as a signed int. The remaining consumers of the int are okay. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Cc: Joerg Roedel <joro@8bytes.org> Cc: Fengguang Wu <fengguang.wu@intel.com> Link: http://lkml.kernel.org/r/1389620420-7110-1-git-send-email-prarit@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-01-13 15:08:37 +01:00
Peter Zijlstra	57c67da274	sched/clock, x86: Move some cyc2ns() code around There are no __cycles_2_ns() users outside of arch/x86/kernel/tsc.c, so move it there. There are no cycles_2_ns() users. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-01lslnavfgo3kmbo4532zlcj@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-01-13 13:47:39 +01:00
Rafael J. Wysocki	3e7cc142c1	Merge branch 'acpica' * acpica: (21 commits) ACPICA: Update version to 20131218. ACPICA: Utilities: Cleanup declarations of the acpi_gbl_debug_file global. ACPICA: Linuxize: Cleanup spaces after special macro invocations. ACPICA: Interpreter: Add additional debug info for an error case. ACPICA: Update ACPI example code to make it an actual working program. ACPICA: Add an error message if the Debugger fails initialization. ACPICA: Conditionally define a local variable that is used for debug only. ACPICA: Parser: Updates/fixes for debug output. ACPICA: Enhance ACPI warning for memory/IO address conflicts. ACPICA: Update several debug statements - no functional change. ACPICA: Improve exception handling for GPE block installation. ACPICA: Add helper macros to extract bus/segment numbers from HEST table. ACPICA: Tables: Add full support for the PCCT table, update table definition. ACPICA: Tables: Add full support for the DBG2 table. ACPICA: Add option to favor 32-bit FADT addresses. ACPICA: Cleanup the option of forcing the use of the RSDT. ACPICA: Back port and refine validation of the XSDT root table. ACPICA: Linux Header: Remove unused OSL prototypes. ACPICA: Remove unused ACPI_FREE_BUFFER macro. No functional change. ACPICA: Disassembler: Improve pathname support for emitted External() statements. ...	2014-01-12 23:45:43 +01:00
Rafael J. Wysocki	98feb7cc61	Merge branch 'acpi-cleanup' * acpi-cleanup: (22 commits) ACPI / tables: Return proper error codes from acpi_table_parse() and fix comment. ACPI / tables: Check if id is NULL in acpi_table_parse() ACPI / proc: Include appropriate header file in proc.c ACPI / EC: Remove unused functions and add prototype declaration in internal.h ACPI / dock: Include appropriate header file in dock.c ACPI / PCI: Include appropriate header file in pci_link.c ACPI / PCI: Include appropriate header file in pci_slot.c ACPI / EC: Mark the function acpi_ec_add_debugfs() as static in ec_sys.c ACPI / NVS: Include appropriate header file in nvs.c ACPI / OSL: Mark the function acpi_table_checksum() as static ACPI / processor: initialize a variable to silence compiler warning ACPI / processor: use ACPI_COMPANION() to get ACPI device ACPI: correct minor typos ACPI / sleep: Drop redundant acpi_disabled check ACPI / dock: Drop redundant acpi_disabled check ACPI / table: Replace '1' with specific error return values ACPI: remove trailing whitespace ACPI / IBFT: Fix incorrect <acpi/acpi.h> inclusion in iSCSI boot firmware module ACPI / i915: Fix incorrect <acpi/acpi.h> inclusions via <linux/acpi_io.h> SFI / ACPI: Fix warnings reported during builds with W=1 ... Conflicts: drivers/acpi/nvs.c drivers/hwmon/asus_atk0110.c	2014-01-12 23:44:09 +01:00
Ingo Molnar	b769e014f3	SCI reporting for other error types not only correctable ones + APEI GHES cleanups + mce timer fix -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.15 (GNU/Linux) iQIcBAABAgAGBQJS0qbXAAoJEBLB8Bhh3lVK0G4P/1fjGcBzCiC4qp0vUzhdvu1d CXVTtF20fdGP+hgwVU/1DLuIeUhyBLFM57rPgz0FxfvPo/bTYq92Lw9sElQpiVad trIRpw3bemjDY/8E91vR94SqLKTddoJifldK5ZTUpRJN9up06MwLky4IurdL2ixE QLaI20ZQIDjxe+pmh4wHtyV5qPdtkiXY2ICcdTXwAW7RdiG7pXXd5gLljL4oFUMF bp0QzM074szkDvhBgZg6KTAy1zfNzZVM9hUBOAesi1tfXqkT7pCEbUkYKZ6QFNVV haQNt+RtkVvYcFcZK+9TkRM32KHSy1MVuicv2W6eNvRDtauGFtSEwBBuvyq3Imjb PTY6Vd5g/hR/+o968ieYUYdFua4xaza2wqC3mwL6WuNoWGzzb/f4dk9+wU/x0Khu th1NMSwy7VfNqi8pbTTy13LG1yjqqorrMLAEAfsNAMjZPIPsBREFYu8J5kko4ird 1aEsaENUdkzgoDEUxwO7vQEfAL3RBOC8kr4SrBGd5jQtbN4SYRyP0bYVIsQL3Psz fo0H+9sVLwkSxcx7MM0bboBcv0NYazvTS5aIiivSdwMq0uxDsRWRmUCr70AlJDp9 h7EfyovynCpfOcAOIz212A9bRzBcfW46KnQm6xkcgVdKKhEbc5EWhx6Nz3kCMNQZ Or1B6ZtUe7Acdg2kRTRL =+eFf -----END PGP SIGNATURE----- Merge tag 'ras_for_3.14_p2' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/ras Pull RAS updates from Borislav Petkov: " SCI reporting for other error types not only correctable ones + APEI GHES cleanups + mce timer fix " Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-01-12 17:56:54 +01:00
Ingo Molnar	da4540757d	Linux 3.13-rc8 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJS0miqAAoJEHm+PkMAQRiGbfgIAJSWEfo8ludknhPcHJabBtxu 75SQAKJlL3sBVnxEc58Rtt8gsKYQIrm4IY5Slunklsn04RxuDUIQMgFoAYR5gQwz +Myqkw/HOqDe5VStGxtLYpWnfglxVwGDCd7ISfL9AOVy5adMWBxh4Tv+qqQc7aIZ eF7dy+DD+C6Q3Z5OoV8s0FZDxse29vOf17Nki7+7t8WMqyegYwjoOqNeqocGKsPi eHLrJgTl4T6jB4l9LKKC154DSKjKOTSwZMWgwK8mToyNLT/ufCiKgXloIjEvZZcY VVKUtncdHiTf+iqVojgpGBzOEeB5DM83iiapFeDiJg8C9yBzvT8lBtA9aPb5Wgw= =lEeV -----END PGP SIGNATURE----- Merge tag 'v3.13-rc8' into x86/ras, to pick up fixes. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-01-12 17:56:29 +01:00
Borislav Petkov	4f75d84127	x86, mce: Fix mce_start_timer semantics So mce_start_timer() has a 'cpu' argument which is supposed to mean to start a timer on that cpu. However, the code currently starts a timer on the current cpu the function runs on and causes the sanity-check in mce_timer_fn to fire: WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/mcheck/mce.c:1286 mce_timer_fn because it is running on the wrong cpu. This was triggered by Prarit Bhargava <prarit@redhat.com> by offlining all the cpus in succession. Then, we were fiddling with the CMCI storm settings when starting the timer whereas there's no need for that - if there's storm happening on this newly restarted cpu, we're going to be in normal CMCI mode initially and then when the CMCI interrupt starts firing, we're going to go to the polling mode with the timer real soon. Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Prarit Bhargava <prarit@redhat.com> Cc: Tony Luck <tony.luck@intel.com> Reviewed-by: Chen, Gong <gong.chen@linux.intel.com> Link: http://lkml.kernel.org/r/1387722156-5511-1-git-send-email-prarit@redhat.com	2014-01-12 15:22:25 +01:00
Prarit Bhargava	9345005f4e	x86/irq: Fix do_IRQ() interrupt warning for cpu hotplug retriggered irqs During heavy CPU-hotplug operations the following spurious kernel warnings can trigger: do_IRQ: No ... irq handler for vector (irq -1) [ See: https://bugzilla.kernel.org/show_bug.cgi?id=64831 ] When downing a cpu it is possible that there are unhandled irqs left in the APIC IRR register. The following code path shows how the problem can occur: 1. CPU 5 is to go down. 2. cpu_disable() on CPU 5 executes with interrupt flag cleared by local_irq_save() via stop_machine(). 3. IRQ 12 asserts on CPU 5, setting IRR but not ISR because interrupt flag is cleared (CPU unabled to handle the irq) 4. IRQs are migrated off of CPU 5, and the vectors' irqs are set to -1. 5. stop_machine() finishes cpu_disable() 6. cpu_die() for CPU 5 executes in normal context. 7. CPU 5 attempts to handle IRQ 12 because the IRR is set for IRQ 12. The code attempts to find the vector's IRQ and cannot because it has been set to -1. 8. do_IRQ() warning displays warning about CPU 5 IRQ 12. I added a debug printk to output which CPU & vector was retriggered and discovered that that we are getting bogus events. I see a 100% correlation between this debug printk in fixup_irqs() and the do_IRQ() warning. This patchset resolves this by adding definitions for VECTOR_UNDEFINED(-1) and VECTOR_RETRIGGERED(-2) and modifying the code to use them. Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=64831 Signed-off-by: Prarit Bhargava <prarit@redhat.com> Reviewed-by: Rui Wang <rui.y.wang@intel.com> Cc: Michel Lespinasse <walken@google.com> Cc: Seiji Aguchi <seiji.aguchi@hds.com> Cc: Yang Zhang <yang.z.zhang@Intel.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: janet.morgan@Intel.com Cc: tony.luck@Intel.com Cc: ruiv.wang@gmail.com Link: http://lkml.kernel.org/r/1388938252-16627-1-git-send-email-prarit@redhat.com [ Cleaned up the code a bit. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-01-12 13:13:02 +01:00
Stephane Eranian	f228c5b882	perf/x86/intel: Add Intel RAPL PP1 energy counter support This patch adds support for the Intel RAPL energy counter PP1 (Power Plane 1). On client processors, it usually corresponds to the energy consumption of the builtin graphic card. That is why the sysfs event is called energy-gpu. New event: - name: power/energy-gpu/ - code: event=0x4 - unit: 2^-32 Joules On processors without graphics, this should count 0. The patch only enables this event on client processors. Reviewed-by: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com> Signed-off-by: Stephane Eranian <eranian@google.com> Cc: ak@linux.intel.com Cc: acme@redhat.com Cc: jolsa@redhat.com Cc: zheng.z.yan@intel.com Cc: bp@alien8.de Cc: vincent.weaver@maine.edu Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1389176153-3128-3-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-01-12 10:16:08 +01:00
Steven Rostedt	1739f09e33	ftrace/x86: Load ftrace_ops in parameter not the variable holding it Function tracing callbacks expect to have the ftrace_ops that registered it passed to them, not the address of the variable that holds the ftrace_ops that registered it. Use a mov instead of a lea to store the ftrace_ops into the parameter of the function tracing callback. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Link: http://lkml.kernel.org/r/20131113152004.459787f9@gandalf.local.home Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org> # v3.8+	2014-01-09 13:24:29 -08:00
David E. Box	4618441536	arch: x86: New MailBox support driver for Intel SOC's Current Intel SOC cores use a MailBox Interface (MBI) to provide access to configuration registers on devices (called units) connected to the system fabric. This is a support driver that implements access to this interface on those platforms that can enumerate the device using PCI. Initial support is for BayTrail, for which port definitons are provided. This is a requirement for implementing platform specific features (e.g. RAPL driver requires this to perform platform specific power management using the registers in PUNIT). Dependant modules should select IOSF_MBI in their respective Kconfig configuraiton. Serialized access is handled by all exported routines with spinlocks. The API includes 3 functions for access to unit registers: int iosf_mbi_read(u8 port, u8 opcode, u32 offset, u32 *mdr) int iosf_mbi_write(u8 port, u8 opcode, u32 offset, u32 mdr) int iosf_mbi_modify(u8 port, u8 opcode, u32 offset, u32 mdr, u32 mask) port: indicating the unit being accessed opcode: the read or write port specific opcode offset: the register offset within the port mdr: the register data to be read, written, or modified mask: bit locations in mdr to change Returns nonzero on error Note: GPU code handles access to the GFX unit. Therefore access to that unit with this driver is disallowed to avoid conflicts. Signed-off-by: David E. Box <david.e.box@linux.intel.com> Link: http://lkml.kernel.org/r/1389216471-734-1-git-send-email-david.e.box@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Matthew Garrett <mjg59@srcf.ucam.org>	2014-01-08 14:36:29 -08:00
Lv Zheng	fab4610583	ACPICA: Cleanup the option of forcing the use of the RSDT. This change adds a runtime option that will force ACPICA to use the RSDT instead of the XSDT. Although the ACPI spec requires that an XSDT be used instead of the RSDT, the XSDT has been found to be corrupt or ill-formed on some machines. This option is already in the Linux kernel. When it is back ported to ACPICA, code is re-written to follow ACPICA coding style. This patch is the generation of the integration. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-01-08 15:31:36 +01:00
Paul Gortmaker	663b55b9b3	x86: Delete non-required instances of include <linux/init.h> None of these files are actually using any __init type directives and hence don't need to include <linux/init.h>. Most are just a left over from __devinit and __cpuinit removal, or simply due to code getting copied from one driver to the next. [ hpa: undid incorrect removal from arch/x86/kernel/head_32.S ] Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Link: http://lkml.kernel.org/r/1389054026-12947-1-git-send-email-paul.gortmaker@windriver.com Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2014-01-06 21:25:18 -08:00
Ingo Molnar	ef0b8b9a52	Linux 3.13-rc7 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJSyJVbAAoJEHm+PkMAQRiGa28H/0m7GpZSpT8mvBthITxzqWCq JRkSPS4KTurAWlA5CJMJePyCM30DgN90s06bYUen9sTecZUwnL+qSV5OqAmg2r+0 PrfwtXtGZR6/Y12XlZ/3oFxVfUxjmgJyDAS76TIH1IvIum52nvJmLrR+6AyVphIX DkgBOuapdA7lia+U+ZM1cRkeHxUOKTUEw9v611VgoN3LYZyzyRb6d0rB7JtZN1RV dnXRi27enaPhwxelsCnORioRjsByMwD40CERxfLHmr5CGhmvCehBjO6bJ+KAdp14 52bfwWcNdbFMzUobcR7qlfS3Hy3AYJci+P6JzeeZ+kWEdv/eh5/1lvNuXtBJRlc= =iwzJ -----END PGP SIGNATURE----- Merge tag 'v3.13-rc7' into x86/efi-kexec to resolve conflicts Conflicts: arch/x86/platform/efi/efi.c drivers/firmware/efi/Kconfig Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-01-05 12:34:29 +01:00
Kirill A. Shutemov	dd360393f4	x86, cpu: Detect more TLB configuration The Intel Software Developer’s Manual covers few more TLB configurations exposed as CPUID 2 descriptors: 61H Instruction TLB: 4 KByte pages, fully associative, 48 entries 63H Data TLB: 1 GByte pages, 4-way set associative, 4 entries 76H Instruction TLB: 2M/4M pages, fully associative, 8 entries B5H Instruction TLB: 4KByte pages, 8-way set associative, 64 entries B6H Instruction TLB: 4KByte pages, 8-way set associative, 128 entries C1H Shared 2nd-Level TLB: 4 KByte/2MByte pages, 8-way associative, 1024 entries C2H DTLB DTLB: 2 MByte/$MByte pages, 4-way associative, 16 entries Let's detect them as well. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Link: http://lkml.kernel.org/r/1387801018-14499-1-git-send-email-kirill.shutemov@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-01-03 14:35:42 -08:00
Dave Young	41a34cec2e	x86: ksysfs.c build fix kbuild test robot report below error for randconfig: arch/x86/kernel/ksysfs.c: In function 'get_setup_data_paddr': arch/x86/kernel/ksysfs.c:81:3: error: implicit declaration of function 'ioremap_cache' [-Werror=implicit-function-declaration] arch/x86/kernel/ksysfs.c:86:3: error: implicit declaration of function 'iounmap' [-Werror=implicit-function-declaration] Fix it by including <asm/io.h> in ksysfs.c Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>	2014-01-03 14:37:13 +00:00
Linus Torvalds	8cf126d927	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin: "There is a small EFI fix and a big power regression fix in this batch. My queue also had a fix for downing a CPU when there are insufficient number of IRQ vectors available, but I'm holding that one for now due to recent bug reports" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/efi: Don't select EFI from certain special ACPI drivers x86 idle: Repair large-server 50-watt idle-power regression	2013-12-29 13:35:04 -08:00
Dave Young	77ea8c9489	x86: Reserve setup_data ranges late after parsing memmap cmdline Currently e820_reserve_setup_data() is called before parsing early params, it works in normal case. But for memmap=exactmap, the final memory ranges are created after parsing memmap= cmdline params, so the previous e820_reserve_setup_data() has no effect. For example, setup_data ranges will still be marked as normal system ram, thus when later sysfs driver ioremap them kernel will warn about mapping normal ram. This patch fix it by moving the e820_reserve_setup_data() callback after parsing early params so they can be set as reserved ranges and later ioremap will be fine with it. Signed-off-by: Dave Young <dyoung@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Tested-by: Toshi Kani <toshi.kani@hp.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>	2013-12-29 13:09:07 +00:00
Dave Young	5039e316dd	x86: Export x86 boot_params to sysfs kexec-tools use boot_params for getting the 1st kernel hardware_subarch, the kexec kernel EFI runtime support also needs to read the old efi_info from boot_params. Currently it exists in debugfs which is not a good place for such infomation. Per HPA, we should avoid "sploit debugfs". In this patch /sys/kernel/boot_params are exported, also the setup_data is exported as a subdirectory. kexec-tools is using debugfs for hardware_subarch for a long time now so we're not removing it yet. Structure is like below: /sys/kernel/boot_params \|__ data /* boot_params in binary/ \|__ setup_data \| \|__ 0 / the first setup_data node / \| \| \|__ data / setup_data node 0 in binary/ \| \| \|__ type / setup_data type of setup_data node 0, hex string / [snip] \|__ version / boot protocal version (in hex, "0x" prefixed)*/ Signed-off-by: Dave Young <dyoung@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Tested-by: Toshi Kani <toshi.kani@hp.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>	2013-12-29 13:09:07 +00:00
Dave Young	1fec053369	x86/efi: Pass necessary EFI data for kexec via setup_data Add a new setup_data type SETUP_EFI for kexec use. Passing the saved fw_vendor, runtime, config tables and EFI runtime mappings. When entering virtual mode, directly mapping the EFI runtime regions which we passed in previously. And skip the step to call SetVirtualAddressMap(). Specially for HP z420 workstation we need save the smbios physical address. The kernel boot sequence proceeds in the following order. Step 2 requires efi.smbios to be the physical address. However, I found that on HP z420 EFI system table has a virtual address of SMBIOS in step 1. Hence, we need set it back to the physical address with the smbios in efi_setup_data. (When it is still the physical address, it simply sets the same value.) 1. efi_init() - Set efi.smbios from EFI system table 2. dmi_scan_machine() - Temporary map efi.smbios to access SMBIOS table 3. efi_enter_virtual_mode() - Map EFI ranges Tested on ovmf+qemu, lenovo thinkpad, a dell laptop and an HP z420 workstation. Signed-off-by: Dave Young <dyoung@redhat.com> Tested-by: Toshi Kani <toshi.kani@hp.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>	2013-12-29 13:09:05 +00:00
Greg Kroah-Hartman	5bd2010fbe	Merge 3.13-rc5 into staging-next We want these fixes here to handle some merge issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-12-24 09:43:21 -08:00
Chen, Gong	addccbb264	ACPI, APEI, GHES: Do not report only correctable errors with SCI Currently SCI is employed to handle corrected errors - memory corrected errors, more specifically but in fact SCI still can be used to handle any errors, e.g. uncorrected or even fatal ones if enabled by the BIOS. Enable logging for those kinds of errors too. Signed-off-by: Chen, Gong <gong.chen@linux.intel.com> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/1385363701-12387-1-git-send-email-gong.chen@linux.intel.com [ Boris: massage commit message, rename function arg. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2013-12-21 13:31:06 +01:00
H. Peter Anvin	7d590cca7c	x86, idle: Add memory barriers around clflush in mwait_play_dead() For consistency with mwait_idle_with_hints(). Not sure they help, but they really won't hurt... Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Len Brown <len.brown@intel.com> Link: http://lkml.kernel.org/r/CA%2B55aFzGxcML7j8CEvQPYzh0W81uVoAAVmGctMOUZ7CZ1yYd2A@mail.gmail.com	2013-12-19 12:30:03 -08:00
Peter Zijlstra	1682425539	x86, acpi, idle: Restructure the mwait idle routines People seem to delight in writing wrong and broken mwait idle routines; collapse the lot. This leaves mwait_play_dead() the sole remaining user of __mwait() and new __mwait() users are probably doing it wrong. Also remove __sti_mwait() as its unused. Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Jacob Jun Pan <jacob.jun.pan@linux.intel.com> Cc: Mike Galbraith <bitbucket@online.de> Cc: Len Brown <lenb@kernel.org> Cc: Rui Zhang <rui.zhang@intel.com> Acked-by: Rafael Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131212141654.616820819@infradead.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-12-19 11:54:44 -08:00
Len Brown	40e2d7f9b5	x86 idle: Repair large-server 50-watt idle-power regression Linux 3.10 changed the timing of how thread_info->flags is touched: x86: Use generic idle loop (`7d1a941731`) This caused Intel NHM-EX and WSM-EX servers to experience a large number of immediate MONITOR/MWAIT break wakeups, which caused cpuidle to demote from deep C-states to shallow C-states, which caused these platforms to experience a significant increase in idle power. Note that this issue was already present before the commit above, however, it wasn't seen often enough to be noticed in power measurements. Here we extend an errata workaround from the Core2 EX "Dunnington" to extend to NHM-EX and WSM-EX, to prevent these immediate returns from MWAIT, reducing idle power on these platforms. While only acpi_idle ran on Dunnington, intel_idle may also run on these two newer systems. As of today, there are no other models that are known to need this tweak. Link: http://lkml.kernel.org/r/CAJvTdK=%2BaNN66mYpCGgbHGCHhYQAKx-vB0kJSWjVpsNb_hOAtQ@mail.gmail.com Signed-off-by: Len Brown <len.brown@intel.com> Link: http://lkml.kernel.org/r/baff264285f6e585df757d58b17788feabc68918.1387403066.git.len.brown@intel.com Cc: <stable@vger.kernel.org> # 3.12.x, 3.11.x, 3.10.x Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-12-19 11:47:39 -08:00
Linus Torvalds	1070d5ac19	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fix from Ingo Molnar: "An x86/intel event constraint fix" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86: Fix constraint table end marker bug	2013-12-17 12:35:05 -08:00
Stephane Eranian	7fd565e275	perf/x86: enable Haswell Celeron RAPL support Enable RAPL support for Haswell Celeron (model 69). Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: ak@linux.intel.com Cc: acme@redhat.com Cc: jolsa@redhat.com Cc: zheng.z.yan@intel.com Cc: bp@alien8.de Cc: vincent.weaver@maine.edu Cc: maria.n.dimakopoulou@gmail.com Cc: peterz@infradead.org Link: http://lkml.kernel.org/r/1387225224-27799-2-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-12-17 15:21:32 +01:00
Ingo Molnar	fe361cfcf4	Linux 3.13-rc4 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.15 (GNU/Linux) iQEcBAABAgAGBQJSrhGrAAoJEHm+PkMAQRiGsNoH/jIK3CsQ2lbW7yRLXmfgtbzz i2Kep6D4SDvmaLpLYOVC8xNYTiE8jtTbSXHomwP5wMZ63MQDhBfnEWsEWqeZ9+D9 3Q46p0QWuoBgYu2VGkoxTfygkT6hhSpwWIi3SeImbY4fg57OHiUil/+YGhORM4Qc K4549OCTY3sIrgmWL77gzqjRUo+pQ4C73NKqZ3+5nlOmYBZC1yugk8mFwEpQkwhK 4NRNU760Fo+XIht/bINqRiPMddzC15p0mxvJy3cDW8bZa1tFSS9SB7AQUULBbcHL +2dFlFOEb5SV1sNiNPrJ0W+h2qUh2e7kPB0F8epaBppgbwVdyQoC2u4uuLV2ZN0= =lI2r -----END PGP SIGNATURE----- Merge tag 'v3.13-rc4' into perf/core Merge Linux 3.13-rc4, to refresh this branch with the latest fixes. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-12-16 14:51:32 +01:00
Ingo Molnar	014952270e	* Add the functionality to override error reporting agents as some machines are sporting a new extended error logging capability which, if done properly in the BIOS, makes a corresponding EDAC module redundant, from Gong Chen. * PCIe AER tracepoint severity levels fix, from Rui Wang. * Error path correction for the mce device init, from Levente Kurusa. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.15 (GNU/Linux) iQIcBAABAgAGBQJSrCysAAoJEBLB8Bhh3lVK1ikP/0hKY1Kk4tjbSta9A9Z8LdQG 9F5JzEny47DpTrLaKij7MqAlbYFO8sSm7Zw0CEztTF7Ou/H37GAuxhMlB8ECMGOm Dzu53X1rySTna9mB+1gyXXd+pJypp/oe18/o16rw1QKjI9o2Kfgwfj7lKvytR549 kDM1dhxEImQIS5cpJPkOPbcpVlSqYN7BnK9/Qx3h0W70httT/8qrr9xVtVL7wjOT auTA0R5/TkV06FtxyfHUNULEWTSP+2yNP/iJbusR6f4Jk1j0XmyCFr0BYOkPA1UO 9+wC9+2R+r7rJw8MBfMzNmPrRzDJHdaiHPwYqse05yewRHfRHe5cgZWJYbL8Qv0u 2WOX+fY12EfDYlihcOYtlupRzhGfGKRsaRpSuG1zX87ctDxAfNZencv4hnaJvfqG Xk6ggIX6tHKEivO2gmaPsmhoKveh0zcozUs+wgh/tvV5QB6ioFCjzHfSEsix5+BH ryyg1ri7IZnh92g3UuSUpE0OCbAquMfI7XIJo+kFs0u79dZTL/kD3wVu6oYazwdy yTrvIq7Bq5cMWnnni5w7dIU09ef2uvDgyHyAS6+RiqaQxhYFsW8/yx2zJrIloWRs 7txz6t3CVmWFiejIg2gw6KyjaG6pXRBkDkI1XU6T+bKLb31ojx2+i9UKIIUeRZTB iisWAOI6ZSdt4eAkgeaI =r//I -----END PGP SIGNATURE----- Merge tag 'ras_for_3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp into x86/ras Pull RAS updates from Borislav Petkov: * Add the functionality to override error reporting agents as some machines are sporting a new extended error logging capability which, if done properly in the BIOS, makes a corresponding EDAC module redundant, from Gong Chen. * PCIe AER tracepoint severity levels fix, from Rui Wang. * Error path correction for the mce device init, from Levente Kurusa. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-12-16 14:33:17 +01:00
Bjorn Helgaas	1d72e71d45	Merge branch 'pci/yijing-dev_is_pci' into next * pci/yijing-dev_is_pci: alpha/PCI: Use dev_is_pci() to identify PCI devices arm/PCI: Use dev_is_pci() to identify PCI devices arm/PCI: Use dev_is_pci() to identify PCI devices parisc/PCI: Use dev_is_pci() to identify PCI devices sparc/PCI: Use dev_is_pci() to identify PCI devices ia64/PCI: Use dev_is_pci() to identify PCI devices x86/PCI: Use dev_is_pci() to identify PCI devices PCI: Use dev_is_pci() to identify PCI devices	2013-12-13 11:01:27 -07:00
DuanZhenzhong	ac8344c4c0	PCI: Drop "irq" param from _restore_msi_irqs() Change x86_msi.restore_msi_irqs(struct pci_dev dev, int irq) to x86_msi.restore_msi_irqs(struct pci_dev *dev). restore_msi_irqs() restores multiple MSI-X IRQs, so param 'int irq' is unneeded. This makes code more consistent between vm and bare metal. Dom0 MSI-X restore code can also be optimized as XEN only has a hypercall to restore all MSI-X vectors at one time. Tested-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-12-13 08:44:30 -07:00
Ingo Molnar	d8af4ce490	x86/traps: Clean up error exception handler definitions So I was reading the exception handler generation code and got a real headache looking at the unstructured mess that our DO_ERROR*() generation code is today. Make it more readable. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: http://lkml.kernel.org/n/tip-kuabysiykvUJpgus35lhnhvs@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-12-12 14:46:42 +01:00
Tejun Heo	13ccb93f41	Merge branch 'driver-core-linus' into driver-core-next `a8b1474442` ("sysfs: give different locking key to regular and bin files") in driver-core-linus modifies sysfs_open_file() so that it gives out different locking classes to sysfs_open_files depending on whether the file is bin or not. Due to the massive kernfs reorganization in driver-core-next, this naturally causes merge conflict in fs/sysfs/file.c. Due to the way things are split between kernfs and sysfs in driver-core-next, the same fix can't easily be applied to driver-core-next. This merge simply ignores the offending commit. A following patch will implement a separate fix for the issue. Signed-off-by: Tejun Heo <tj@kernel.org>	2013-12-10 08:44:37 -05:00
Yijing Wang	894d334378	x86/PCI: Use dev_is_pci() to identify PCI devices Use dev_is_pci() instead of checking bus type directly. Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2013-12-09 16:50:12 -07:00
Takashi Iwai	75da02b29f	microcode: Use request_firmware_direct() Use the new helper, request_firmware_direct(), for avoiding the lengthy timeout of non-existing firmware loads. Especially the Intel microcode driver suffers from this problem because each CPU triggers the f/w loading, thus it ends up taking (literally) hours with many cores. Tested-by: Prarit Bhargava <prarit@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-12-08 18:22:32 -08:00
Qiaowei Ren	e7d820a5e5	x86, xsave: Support eager-only xsave features, add MPX support Some features, like Intel MPX, work only if the kernel uses eagerfpu model. So we should force eagerfpu on unless the user has explicitly disabled it. Add definitions for Intel MPX and add it to the supported list. [ hpa: renamed XSTATE_FLEXIBLE to XSTATE_LAZY and added comments ] Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> Link: http://lkml.kernel.org/r/9E0BE1322F2F2246BD820DA9FC397ADE014A6115@SHSMSX102.ccr.corp.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-12-06 17:17:42 -08:00
Lv Zheng	8b48463f89	ACPI: Clean up inclusions of ACPI header files Replace direct inclusions of <acpi/acpi.h>, <acpi/acpi_bus.h> and <acpi/acpi_drivers.h>, which are incorrect, with <linux/acpi.h> inclusions and remove some inclusions of those files that aren't necessary. First of all, <acpi/acpi.h>, <acpi/acpi_bus.h> and <acpi/acpi_drivers.h> should not be included directly from any files that are built for CONFIG_ACPI unset, because that generally leads to build warnings about undefined symbols in !CONFIG_ACPI builds. For CONFIG_ACPI set, <linux/acpi.h> includes those files and for CONFIG_ACPI unset it provides stub ACPI symbols to be used in that case. Second, there are ordering dependencies between those files that always have to be met. Namely, it is required that <acpi/acpi_bus.h> be included prior to <acpi/acpi_drivers.h> so that the acpi_pci_root declarations the latter depends on are always there. And <acpi/acpi.h> which provides basic ACPICA type declarations should always be included prior to any other ACPI headers in CONFIG_ACPI builds. That also is taken care of including <linux/acpi.h> as appropriate. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Tony Luck <tony.luck@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> (drivers/pci stuff) Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> (Xen stuff) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-12-07 01:03:14 +01:00
Maria Dimakopoulou	cf30d52e2d	perf/x86: Fix constraint table end marker bug The EVENT_CONSTRAINT_END() macro defines the end marker as a constraint with a weight of zero. This was all fine until we blacklisted the corrupting memory events on Intel IvyBridge. These events are blacklisted by using a counter bitmask of zero. Thus, they also get a constraint weight of zero. The iteration macro: for_each_constraint tests the weight==0. Therefore, it was stopping at the first blacklisted event, i.e., 0xd0. The corrupting events were therefore considered as unconstrained and were scheduled on any of the generic counters. This patch fixes the end marker to have a weight of -1. With this, the blacklisted events get an empty constraint and cannot be scheduled which is what we want for now. Signed-off-by: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com> Reviewed-by: Stephane Eranian <eranian@google.com> Cc: peterz@infradead.org Cc: ak@linux.intel.com Cc: jolsa@redhat.com Cc: zheng.z.yan@intel.com Link: http://lkml.kernel.org/r/20131204232437.GA10689@starlight Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-12-05 10:02:30 +01:00

... 3 4 5 6 7 ...

10017 Commits